stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" apply to 3.18-stable tree
@ 2019-03-21 16:03 He, Bo
  2019-03-21 22:09 ` Paul E. McKenney
  0 siblings, 1 reply; 2+ messages in thread
From: He, Bo @ 2019-03-21 16:03 UTC (permalink / raw)
  To: gregkh@linuxfoundation.org, Zhang, Jun, Bai, Jie A, Xiao, Jin,
	paulmck@linux.ibm.com
  Cc: stable@vger.kernel.org

The rcu_gp_kthread_wake() function is invoked when it might be necessary
to wake the RCU grace-period kthread.  Because self-wakeups are normally
a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
this kthread, it naturally refuses to do the wakeup.

Unfortunately, natural though it might be, this heuristic fails when
rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
that interrupted the grace-period kthread just after the final check of
the wait-event condition but just before the schedule() call.  In this
case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
is within the RCU grace-period kthread's context.  Failing to provide
this wakeup can result in grace periods failing to start, which in turn
results in out-of-memory conditions.

This race window is quite narrow, but it actually did happen during real
testing.  It would of course need to be fixed even if it was strictly
theoretical in nature.

[ backport for 3.18 commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903
upstream. ]

Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
Reported-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
Co-developed-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off: "He, Bo" <bo.he@intel.com>
Signed-off: "xiao, jin" <jin.xiao@intel.com>
Signed-off: Bai, Jie A <jie.a.bai@intel.com>
Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
[ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
  !in_serving_softirq() to avoid redundant wakeups and to also handle the
  interrupt-handler scenario as well as the softirq-handler scenario that
  actually occurred in testing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
---
 kernel/rcu/tree.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9815447d22e0..f9fb34e1aa71 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1399,15 +1399,23 @@ static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
 }
 
 /*
- * Awaken the grace-period kthread for the specified flavor of RCU.
- * Don't do a self-awaken, and don't bother awakening when there is
- * nothing for the grace-period kthread to do (as in several CPUs
- * raced to awaken, and we lost), and finally don't try to awaken
- * a kthread that has not yet been created.
+ * Awaken the grace-period kthread.  Don't do a self-awaken (unless in
+ * an interrupt or softirq handler), and don't bother awakening when there
+ * is nothing for the grace-period kthread to do (as in several CPUs raced
+ * to awaken, and we lost), and finally don't try to awaken a kthread that
+ * has not yet been created.  If all those checks are passed, track some
+ * debug information and awaken.
+ *
+ * So why do the self-wakeup when in an interrupt or softirq handler
+ * in the grace-period kthread's context?  Because the kthread might have
+ * been interrupted just as it was going to sleep, and just after the final
+ * pre-sleep check of the awaken condition.  In this case, a wakeup really
+ * is required, and is therefore supplied.
  */
 static void rcu_gp_kthread_wake(struct rcu_state *rsp)
 {
-	if (current == rsp->gp_kthread ||
+	if ((current == rsp->gp_kthread &&
+	     !in_interrupt() && !in_serving_softirq()) ||
 	    !ACCESS_ONCE(rsp->gp_flags) ||
 	    !rsp->gp_kthread)
 		return;
-- 
2.20.1




-----Original Message-----
From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> 
Sent: Thursday, March 21, 2019 1:43 AM
To: Zhang, Jun <jun.zhang@intel.com>; He, Bo <bo.he@intel.com>; Bai, Jie A <jie.a.bai@intel.com>; Xiao, Jin <jin.xiao@intel.com>; paulmck@linux.ibm.com
Cc: stable@vger.kernel.org
Subject: FAILED: patch "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" failed to apply to 3.18-stable tree


The patch below does not apply to the 3.18-stable tree.
If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable@vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 1d1f898df6586c5ea9aeaf349f13089c6fa37903 Mon Sep 17 00:00:00 2001
From: "Zhang, Jun" <jun.zhang@intel.com>
Date: Tue, 18 Dec 2018 06:55:01 -0800
Subject: [PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt

The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread.  Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup.

Unfortunately, natural though it might be, this heuristic fails when
rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call.  In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context.  Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions.

This race window is quite narrow, but it actually did happen during real testing.  It would of course need to be fixed even if it was strictly theoretical in nature.

This patch does not Cc stable because it does not apply cleanly to earlier kernel versions.

Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
Reported-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
Co-developed-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off: "He, Bo" <bo.he@intel.com>
Signed-off: "xiao, jin" <jin.xiao@intel.com>
Signed-off: Bai, Jie A <jie.a.bai@intel.com>
Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com> [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
  !in_serving_softirq() to avoid redundant wakeups and to also handle the
  interrupt-handler scenario as well as the softirq-handler scenario that
  actually occurred in testing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 9ceb93f848cd..21775eebb8f0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1593,15 +1593,23 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)  }
 
 /*
- * Awaken the grace-period kthread.  Don't do a self-awaken, and don't
- * bother awakening when there is nothing for the grace-period kthread
- * to do (as in several CPUs raced to awaken, and we lost), and finally
- * don't try to awaken a kthread that has not yet been created.  If
- * all those checks are passed, track some debug information and awaken.
+ * Awaken the grace-period kthread.  Don't do a self-awaken (unless in
+ * an interrupt or softirq handler), and don't bother awakening when 
+ there
+ * is nothing for the grace-period kthread to do (as in several CPUs 
+ raced
+ * to awaken, and we lost), and finally don't try to awaken a kthread 
+ that
+ * has not yet been created.  If all those checks are passed, track 
+ some
+ * debug information and awaken.
+ *
+ * So why do the self-wakeup when in an interrupt or softirq handler
+ * in the grace-period kthread's context?  Because the kthread might 
+ have
+ * been interrupted just as it was going to sleep, and just after the 
+ final
+ * pre-sleep check of the awaken condition.  In this case, a wakeup 
+ really
+ * is required, and is therefore supplied.
  */
 static void rcu_gp_kthread_wake(void)
 {
-	if (current == rcu_state.gp_kthread ||
+	if ((current == rcu_state.gp_kthread &&
+	     !in_interrupt() && !in_serving_softirq()) ||
 	    !READ_ONCE(rcu_state.gp_flags) ||
 	    !rcu_state.gp_kthread)
 		return;


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" apply to 3.18-stable tree
  2019-03-21 16:03 "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" apply to 3.18-stable tree He, Bo
@ 2019-03-21 22:09 ` Paul E. McKenney
  0 siblings, 0 replies; 2+ messages in thread
From: Paul E. McKenney @ 2019-03-21 22:09 UTC (permalink / raw)
  To: He, Bo
  Cc: gregkh@linuxfoundation.org, Zhang, Jun, Bai, Jie A, Xiao, Jin,
	stable@vger.kernel.org

On Thu, Mar 21, 2019 at 04:03:04PM +0000, He, Bo wrote:
> The rcu_gp_kthread_wake() function is invoked when it might be necessary
> to wake the RCU grace-period kthread.  Because self-wakeups are normally
> a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
> this kthread, it naturally refuses to do the wakeup.
> 
> Unfortunately, natural though it might be, this heuristic fails when
> rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
> that interrupted the grace-period kthread just after the final check of
> the wait-event condition but just before the schedule() call.  In this
> case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
> is within the RCU grace-period kthread's context.  Failing to provide
> this wakeup can result in grace periods failing to start, which in turn
> results in out-of-memory conditions.
> 
> This race window is quite narrow, but it actually did happen during real
> testing.  It would of course need to be fixed even if it was strictly
> theoretical in nature.
> 
> [ backport for 3.18 commit 1d1f898df6586c5ea9aeaf349f13089c6fa37903
> upstream. ]
> 
> Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
> Reported-by: "He, Bo" <bo.he@intel.com>
> Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
> Co-developed-by: "He, Bo" <bo.he@intel.com>
> Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
> Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
> Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
> Signed-off: "He, Bo" <bo.he@intel.com>
> Signed-off: "xiao, jin" <jin.xiao@intel.com>
> Signed-off: Bai, Jie A <jie.a.bai@intel.com>
> Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
> [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
>   !in_serving_softirq() to avoid redundant wakeups and to also handle the
>   interrupt-handler scenario as well as the softirq-handler scenario that
>   actually occurred in testing. ]

They all look good, thank you!

I subjected all of the others to light rcutorture testing, which they
passed.  This v3.18 patch hung, however.  Trying it again with stock
v3.18 got the same hang, so I believe we can exonerate the patch and
give it a good firm "maybe" on 3.18.

Worth paying special attention to further test results from 3.18.x, though!

							Thanx, Paul

> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
> ---
>  kernel/rcu/tree.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9815447d22e0..f9fb34e1aa71 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1399,15 +1399,23 @@ static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp)
>  }
>  
>  /*
> - * Awaken the grace-period kthread for the specified flavor of RCU.
> - * Don't do a self-awaken, and don't bother awakening when there is
> - * nothing for the grace-period kthread to do (as in several CPUs
> - * raced to awaken, and we lost), and finally don't try to awaken
> - * a kthread that has not yet been created.
> + * Awaken the grace-period kthread.  Don't do a self-awaken (unless in
> + * an interrupt or softirq handler), and don't bother awakening when there
> + * is nothing for the grace-period kthread to do (as in several CPUs raced
> + * to awaken, and we lost), and finally don't try to awaken a kthread that
> + * has not yet been created.  If all those checks are passed, track some
> + * debug information and awaken.
> + *
> + * So why do the self-wakeup when in an interrupt or softirq handler
> + * in the grace-period kthread's context?  Because the kthread might have
> + * been interrupted just as it was going to sleep, and just after the final
> + * pre-sleep check of the awaken condition.  In this case, a wakeup really
> + * is required, and is therefore supplied.
>   */
>  static void rcu_gp_kthread_wake(struct rcu_state *rsp)
>  {
> -	if (current == rsp->gp_kthread ||
> +	if ((current == rsp->gp_kthread &&
> +	     !in_interrupt() && !in_serving_softirq()) ||
>  	    !ACCESS_ONCE(rsp->gp_flags) ||
>  	    !rsp->gp_kthread)
>  		return;
> -- 
> 2.20.1
> 
> 
> 
> 
> -----Original Message-----
> From: gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> 
> Sent: Thursday, March 21, 2019 1:43 AM
> To: Zhang, Jun <jun.zhang@intel.com>; He, Bo <bo.he@intel.com>; Bai, Jie A <jie.a.bai@intel.com>; Xiao, Jin <jin.xiao@intel.com>; paulmck@linux.ibm.com
> Cc: stable@vger.kernel.org
> Subject: FAILED: patch "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" failed to apply to 3.18-stable tree
> 
> 
> The patch below does not apply to the 3.18-stable tree.
> If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable@vger.kernel.org>.
> 
> thanks,
> 
> greg k-h
> 
> ------------------ original commit in Linus's tree ------------------
> 
> >From 1d1f898df6586c5ea9aeaf349f13089c6fa37903 Mon Sep 17 00:00:00 2001
> From: "Zhang, Jun" <jun.zhang@intel.com>
> Date: Tue, 18 Dec 2018 06:55:01 -0800
> Subject: [PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
> 
> The rcu_gp_kthread_wake() function is invoked when it might be necessary to wake the RCU grace-period kthread.  Because self-wakeups are normally a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from this kthread, it naturally refuses to do the wakeup.
> 
> Unfortunately, natural though it might be, this heuristic fails when
> rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler that interrupted the grace-period kthread just after the final check of the wait-event condition but just before the schedule() call.  In this case, a wakeup is required, even though the call to rcu_gp_kthread_wake() is within the RCU grace-period kthread's context.  Failing to provide this wakeup can result in grace periods failing to start, which in turn results in out-of-memory conditions.
> 
> This race window is quite narrow, but it actually did happen during real testing.  It would of course need to be fixed even if it was strictly theoretical in nature.
> 
> This patch does not Cc stable because it does not apply cleanly to earlier kernel versions.
> 
> Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
> Reported-by: "He, Bo" <bo.he@intel.com>
> Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
> Co-developed-by: "He, Bo" <bo.he@intel.com>
> Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
> Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
> Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
> Signed-off: "He, Bo" <bo.he@intel.com>
> Signed-off: "xiao, jin" <jin.xiao@intel.com>
> Signed-off: Bai, Jie A <jie.a.bai@intel.com>
> Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com> [ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
>   !in_serving_softirq() to avoid redundant wakeups and to also handle the
>   interrupt-handler scenario as well as the softirq-handler scenario that
>   actually occurred in testing. ]
> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 9ceb93f848cd..21775eebb8f0 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1593,15 +1593,23 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)  }
>  
>  /*
> - * Awaken the grace-period kthread.  Don't do a self-awaken, and don't
> - * bother awakening when there is nothing for the grace-period kthread
> - * to do (as in several CPUs raced to awaken, and we lost), and finally
> - * don't try to awaken a kthread that has not yet been created.  If
> - * all those checks are passed, track some debug information and awaken.
> + * Awaken the grace-period kthread.  Don't do a self-awaken (unless in
> + * an interrupt or softirq handler), and don't bother awakening when 
> + there
> + * is nothing for the grace-period kthread to do (as in several CPUs 
> + raced
> + * to awaken, and we lost), and finally don't try to awaken a kthread 
> + that
> + * has not yet been created.  If all those checks are passed, track 
> + some
> + * debug information and awaken.
> + *
> + * So why do the self-wakeup when in an interrupt or softirq handler
> + * in the grace-period kthread's context?  Because the kthread might 
> + have
> + * been interrupted just as it was going to sleep, and just after the 
> + final
> + * pre-sleep check of the awaken condition.  In this case, a wakeup 
> + really
> + * is required, and is therefore supplied.
>   */
>  static void rcu_gp_kthread_wake(void)
>  {
> -	if (current == rcu_state.gp_kthread ||
> +	if ((current == rcu_state.gp_kthread &&
> +	     !in_interrupt() && !in_serving_softirq()) ||
>  	    !READ_ONCE(rcu_state.gp_flags) ||
>  	    !rcu_state.gp_kthread)
>  		return;
> 


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2019-03-21 22:08 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-21 16:03 "[PATCH] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt" apply to 3.18-stable tree He, Bo
2019-03-21 22:09 ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).