Date: Thu, 21 Mar 2019 15:09:17 -0700
From: "Paul E. McKenney"
To: "He, Bo"
Cc: "gregkh@linuxfoundation.org", "Zhang, Jun", "Bai, Jie A", "Xiao, Jin", "stable@vger.kernel.org"
Subject: Re: "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" apply to 3.18-stable tree
Reply-To: paulmck@linux.ibm.com
Message-Id: <20190321220917.GF4102@linux.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: stable@vger.kernel.org

On Thu, Mar 21, 2019 at 04:03:04PM +0000, He, Bo wrote:
> The rcu_gp_kthread_wake() function is invoked when it might be necessary
> to wake the RCU grace-period kthread. Because self-wakeups are normally
> a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
> this kthread, it naturally refuses to do the wakeup.
>
> Unfortunately, natural though it might be, this heuristic fails when
> rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
> that interrupted the grace-period kthread just after the final check of
> the wait-event condition but just before the schedule() call. In this
> case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
> is within the RCU grace-period kthread's context. Failing to provide
> this wakeup can result in grace periods failing to start, which in turn
> results in out-of-memory conditions.
>
> This race window is quite narrow, but it actually did happen during real
> testing. It would of course need to be fixed even if it was strictly
> theoretical in nature.
>
> [ backport for 3.18 commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903
> upstream. ]
>
> Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
> Reported-by: "He, Bo"
> Co-developed-by: "Zhang, Jun"
> Co-developed-by: "He, Bo"
> Co-developed-by: "xiao, jin"
> Co-developed-by: Bai, Jie A
> Signed-off: "Zhang, Jun"
> Signed-off: "He, Bo"
> Signed-off: "xiao, jin"
> Signed-off: Bai, Jie A
> Signed-off-by: "Zhang, Jun"
> [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
> !in_serving_softirq() to avoid redundant wakeups and to also handle the
> interrupt-handler scenario as well as the softirq-handler scenario that
> actually occurred in testing. ]

They all look good, thank you!

I subjected all of the others to light rcutorture testing, which they
passed. This v3.18 patch hung, however. Trying it again with stock
v3.18 got the same hang, so I believe we can exonerate the patch and give
it a good firm "maybe" on 3.18. Worth paying special attention to further
test results from 3.18.x, though!

							Thanx, Paul

> Signed-off-by: Paul E. McKenney
> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
> ---
>  kernel/rcu/tree.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9815447d22e0..f9fb34e1aa71 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1399,15 +1399,23 @@ static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
>  }
>  
>  /*
> - * Awaken the grace-period kthread for the specified flavor of RCU.
> - * Don't do a self-awaken, and don't bother awakening when there is
> - * nothing for the grace-period kthread to do (as in several CPUs
> - * raced to awaken, and we lost), and finally don't try to awaken
> - * a kthread that has not yet been created.
> + * Awaken the grace-period kthread. Don't do a self-awaken (unless in
> + * an interrupt or softirq handler), and don't bother awakening when there
> + * is nothing for the grace-period kthread to do (as in several CPUs raced
> + * to awaken, and we lost), and finally don't try to awaken a kthread that
> + * has not yet been created. If all those checks are passed, track some
> + * debug information and awaken.
> + *
> + * So why do the self-wakeup when in an interrupt or softirq handler
> + * in the grace-period kthread's context? Because the kthread might have
> + * been interrupted just as it was going to sleep, and just after the final
> + * pre-sleep check of the awaken condition. In this case, a wakeup really
> + * is required, and is therefore supplied.
>   */
>  static void rcu_gp_kthread_wake(struct rcu_state *rsp)
>  {
> -	if (current == rsp->gp_kthread ||
> +	if ((current == rsp->gp_kthread &&
> +	     !in_interrupt() && !in_serving_softirq()) ||
>  	    !ACCESS_ONCE(rsp->gp_flags) ||
>  	    !rsp->gp_kthread)
>  		return;
> --
> 2.20.1
>
>
>
>
> -----Original Message-----
> From: gregkh@linuxfoundation.org
> Sent: Thursday, March 21, 2019 1:43 AM
> To: Zhang, Jun; He, Bo; Bai, Jie A; Xiao, Jin; paulmck@linux.ibm.com
> Cc: stable@vger.kernel.org
> Subject: FAILED: patch "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" failed to apply to 3.18-stable tree
>
>
> The patch below does not apply to the 3.18-stable tree.
> If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to .
>
> thanks,
>
> greg k-h
>
> ------------------ original commit in Linus's tree ------------------
>
> >From 1d1f898df6586c5ea9aeaf349f13089c6fa37903 Mon Sep 17 00:00:00 2001
> From: "Zhang, Jun"
> Date: Tue, 18 Dec 2018 06:55:01 -0800
> Subject: [PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
>
> The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread. Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup.
>
> Unfortunately, natural though it might be, this heuristic fails when
> rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call. In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context. Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions.
>
> This race window is quite narrow, but it actually did happen during real testing. It would of course need to be fixed even if it was strictly theoretical in nature.
>
> This patch does not Cc stable because it does not apply cleanly to earlier kernel versions.
>
> Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
> Reported-by: "He, Bo"
> Co-developed-by: "Zhang, Jun"
> Co-developed-by: "He, Bo"
> Co-developed-by: "xiao, jin"
> Co-developed-by: Bai, Jie A
> Signed-off: "Zhang, Jun"
> Signed-off: "He, Bo"
> Signed-off: "xiao, jin"
> Signed-off: Bai, Jie A
> Signed-off-by: "Zhang, Jun"
> [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
> !in_serving_softirq() to avoid redundant wakeups and to also handle the
> interrupt-handler scenario as well as the softirq-handler scenario that
> actually occurred in testing. ]
> Signed-off-by: Paul E. McKenney
> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9ceb93f848cd..21775eebb8f0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1593,15 +1593,23 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
>  }
>  
>  /*
> - * Awaken the grace-period kthread. Don't do a self-awaken, and don't
> - * bother awakening when there is nothing for the grace-period kthread
> - * to do (as in several CPUs raced to awaken, and we lost), and finally
> - * don't try to awaken a kthread that has not yet been created. If
> - * all those checks are passed, track some debug information and awaken.
> + * Awaken the grace-period kthread. Don't do a self-awaken (unless in
> + * an interrupt or softirq handler), and don't bother awakening when there
> + * is nothing for the grace-period kthread to do (as in several CPUs raced
> + * to awaken, and we lost), and finally don't try to awaken a kthread that
> + * has not yet been created. If all those checks are passed, track some
> + * debug information and awaken.
> + *
> + * So why do the self-wakeup when in an interrupt or softirq handler
> + * in the grace-period kthread's context? Because the kthread might have
> + * been interrupted just as it was going to sleep, and just after the final
> + * pre-sleep check of the awaken condition. In this case, a wakeup really
> + * is required, and is therefore supplied.
>   */
>  static void rcu_gp_kthread_wake(void)
>  {
> -	if (current == rcu_state.gp_kthread ||
> +	if ((current == rcu_state.gp_kthread &&
> +	     !in_interrupt() && !in_serving_softirq()) ||
>  	    !READ_ONCE(rcu_state.gp_flags) ||
>  	    !rcu_state.gp_kthread)
>  		return;
>
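
For readers comparing the two hunks above, here is a minimal user-space
sketch of the early-return condition before and after the patch. It is
not kernel code: struct wake_ctx and both helper functions are
hypothetical names invented for this illustration, and the boolean
fields merely stand in for current == gp_kthread, in_interrupt(),
in_serving_softirq(), gp_flags and the gp_kthread pointer. It only
models the predicate, not the sleep/wakeup machinery itself.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the check looks at (assumption). */
struct wake_ctx {
	bool is_gp_kthread;       /* stands in for current == gp_kthread */
	bool in_interrupt;        /* stands in for in_interrupt() */
	bool in_serving_softirq;  /* stands in for in_serving_softirq() */
	unsigned int gp_flags;    /* stands in for the gp_flags word */
	bool kthread_exists;      /* stands in for gp_kthread != NULL */
};

/*
 * Pre-patch rule: any call made in the grace-period kthread's context
 * is treated as a self-wakeup and skipped.
 */
static bool old_should_wake(const struct wake_ctx *c)
{
	if (c->is_gp_kthread || !c->gp_flags || !c->kthread_exists)
		return false;
	return true;
}

/*
 * Patched rule: a call in the kthread's context is skipped only when it
 * does not come from an interrupt or softirq handler, because such a
 * handler may have interrupted the kthread between its final check of
 * the wait condition and its call to schedule().
 */
static bool new_should_wake(const struct wake_ctx *c)
{
	if ((c->is_gp_kthread &&
	     !c->in_interrupt && !c->in_serving_softirq) ||
	    !c->gp_flags || !c->kthread_exists)
		return false;
	return true;
}
```

With is_gp_kthread and in_interrupt both true and work pending in
gp_flags, old_should_wake() returns false (the lost-wakeup case
described in the commit log) while new_should_wake() returns true. A
plain self-call from process context is still skipped by both.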