From: Sven Schnelle
To: Mark Rutland
Cc: linux-s390@vger.kernel.org, Kees Cook, "Paul E. McKenney", hca@linux.ibm.com, LKML, Steven Rostedt, Ingo Molnar, Sachin Sant, Russell King, Andrew Morton, Yinan Liu, linuxppc-dev@lists.ozlabs.org, Ard Biesheuvel, linux-arm-kernel@lists.infradead.org
Subject: Re: ftrace hangs waiting for rcu
Date: Fri, 28 Jan 2022 17:08:48 +0100
References: <20220127114249.03b1b52b@gandalf.local.home>
In-Reply-To: (Mark Rutland's message of "Fri, 28 Jan 2022 15:42:48 +0000")

Hi Mark,

Mark Rutland writes:

> On arm64 I bisected this down to:
>
> 7a30871b6a27de1a ("rcu-tasks: Introduce ->percpu_enqueue_shift for dynamic queue selection")
>
> Which was going wrong because ilog2() rounds down, and so the shift was wrong
> for any nr_cpus that was not a power-of-two. Paul had already fixed that in
> rcu-next, and just sent a pull request to Linus:
>
> https://lore.kernel.org/lkml/20220128143251.GA2398275@paulmck-ThinkPad-P17-Gen-1/
>
> With that applied, I no longer see these hangs.
>
> Does your s390 test machine have a non-power-of-two nr_cpus, and does that fix
> the issue for you?

We noticed the PR from Paul and are currently testing the fix. So far
it's looking good. The configuration where we have seen the hang is a
bit unusual:

- 16 physical CPUs on the KVM host
- 248 logical CPUs inside KVM
- debug kernel on both the host and the KVM guest

So things are likely a bit slow in the KVM guest. Interestingly, the
number of CPUs is even. But maybe RCU sees an odd number of CPUs and
gets confused before all CPUs are brought up. I have to read the code
and test to see whether that could be possible.

Thanks for investigating!
Sven
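
For reference, the rounding problem Mark describes can be reproduced with a few lines of user-space C. This is only a sketch of the arithmetic, not the kernel code: the helper names ilog2_floor(), ilog2_ceil() and queue_index() are hypothetical, and it assumes that the shift is meant to map every CPU number to a single queue at index 0.

```c
#include <assert.h>

/* Floor of log2(n) for n > 0; rounds down, like the kernel's ilog2(). */
unsigned int ilog2_floor(unsigned int n)
{
	unsigned int r = 0;

	assert(n != 0);
	while (n >>= 1)
		r++;
	return r;
}

/* Ceiling of log2(n) for n > 0: smallest s such that (1u << s) >= n. */
unsigned int ilog2_ceil(unsigned int n)
{
	unsigned int s = ilog2_floor(n);

	return ((1u << s) < n) ? s + 1 : s;
}

/* Hypothetical lookup: shift the CPU number down to a queue index. */
unsigned int queue_index(unsigned int cpu, unsigned int shift)
{
	return cpu >> shift;
}
```

With 248 CPUs, ilog2_floor(248) is 7, so queue_index() returns 1 for CPUs 128 through 247 instead of 0. The rounded-up shift, ilog2_ceil(248) = 8, maps all 248 CPUs to index 0. Note that 248 is even but not a power of two, which is the property that matters here. For a power of two such as 256, both helpers return 8 and the mapping is correct either way.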