linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "Zhang, Jun" <jun.zhang@intel.com>
Cc: "He, Bo" <bo.he@intel.com>, Steven Rostedt <rostedt@goodmis.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"josh@joshtriplett.org" <josh@joshtriplett.org>,
	"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
	"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
	"Xiao, Jin" <jin.xiao@intel.com>,
	"Zhang, Yanmin" <yanmin.zhang@intel.com>,
	"Bai, Jie A" <jie.a.bai@intel.com>,
	"Sun, Yi J" <yi.j.sun@intel.com>
Subject: Re: rcu_preempt caused oom
Date: Wed, 12 Dec 2018 18:42:34 -0800	[thread overview]
Message-ID: <20181213024234.GF4170@linux.ibm.com> (raw)
In-Reply-To: <88DC34334CA3444C85D647DBFA962C2735AD5F77@SHSMSX104.ccr.corp.intel.com>

On Thu, Dec 13, 2018 at 02:11:35AM +0000, Zhang, Jun wrote:
> Hello, Paul
> 
> I think the next patch is better.
> Because ULONG_CMP_GE could cause double write, which has risk that write back old value.
> Please help review.
> I don't test it. If you agree, we will test it.

Just to make sure that I understand, you are worried about something like
the following, correct?

o	__note_gp_changes() compares rnp->gp_seq_needed and rdp->gp_seq_needed
	and finds them equal.

o	At just this time something like rcu_start_this_gp() assigns a new
	(larger) value to rdp->gp_seq_needed.

o	Then __note_gp_changes() overwrites rdp->gp_seq_needed with the
	old value.

This cannot happen because __note_gp_changes() runs with interrupts
disabled on the CPU corresponding to the rcu_data structure referenced
by the rdp pointer.  So there is no way for rcu_start_this_gp() to be
invoked on the same CPU during this "if" statement.

Of course, there could be bugs.  For example:

o	__note_gp_changes() might be called on a different CPU than that
	corresponding to rdp.  You can check this with something like:

	WARN_ON_ONCE(rdp->cpu != smp_processor_id());

o	The same things could happen with rcu_start_this_gp(), and the
	above WARN_ON_ONCE() would work there as well.

o	rcutree_prepare_cpu() is a special case, but is irrelevant unless
	you are doing CPU-hotplug operations.  (It can run on a CPU other
	than rdp->cpu, but only at times when rdp->cpu is offline.)

o	Interrupts might not really be disabled.

That said, your patch could reduce overhead slightly, given that the
two values will be equal much of the time.  So it might be worth testing
just for that reason.

So why not just test it anyway?  If it makes the bug go away, I will be
surprised, but it would not be the first surprise for me.  ;-)

							Thanx, Paul

> Thanks!
> 
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 0b760c1..c00f34e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1849,7 +1849,7 @@ static bool __note_gp_changes(struct rcu_state *rsp, struct rcu_node *rnp,
>                 zero_cpu_stall_ticks(rdp);
>         }
>         rdp->gp_seq = rnp->gp_seq;  /* Remember new grace-period state. */
> -       if (ULONG_CMP_GE(rnp->gp_seq_needed, rdp->gp_seq_needed) || rdp->gpwrap)
> +       if (ULONG_CMP_LT(rdp->gp_seq_needed, rnp->gp_seq_needed) || rdp->gpwrap)
>                 rdp->gp_seq_needed = rnp->gp_seq_needed;
>         WRITE_ONCE(rdp->gpwrap, false);
>         rcu_gpnum_ovf(rnp, rdp);
> 
> 
> -----Original Message-----
> From: Paul E. McKenney [mailto:paulmck@linux.ibm.com] 
> Sent: Thursday, December 13, 2018 08:12
> To: He, Bo <bo.he@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>; linux-kernel@vger.kernel.org; josh@joshtriplett.org; mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun <jun.zhang@intel.com>; Xiao, Jin <jin.xiao@intel.com>; Zhang, Yanmin <yanmin.zhang@intel.com>; Bai, Jie A <jie.a.bai@intel.com>; Sun, Yi J <yi.j.sun@intel.com>
> Subject: Re: rcu_preempt caused oom
> 
> On Wed, Dec 12, 2018 at 11:13:22PM +0000, He, Bo wrote:
> > I don't see the rcutree.sysrq_rcu parameter in v4.19 kernel, I also checked the latest kernel and the latest tag v4.20-rc6, not see the sysrq_rcu.
> > Please correct me if I have something wrong.
> 
> That would be because I sent you the wrong patch, apologies!  :-/
> 
> Please instead see the one below, which does add sysrq_rcu.
> 
> 							Thanx, Paul
> 
> > -----Original Message-----
> > From: Paul E. McKenney <paulmck@linux.ibm.com>
> > Sent: Thursday, December 13, 2018 5:03 AM
> > To: He, Bo <bo.he@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>; 
> > linux-kernel@vger.kernel.org; josh@joshtriplett.org; 
> > mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun 
> > <jun.zhang@intel.com>; Xiao, Jin <jin.xiao@intel.com>; Zhang, Yanmin 
> > <yanmin.zhang@intel.com>; Bai, Jie A <jie.a.bai@intel.com>
> > Subject: Re: rcu_preempt caused oom
> > 
> > On Wed, Dec 12, 2018 at 07:42:24AM -0800, Paul E. McKenney wrote:
> > > On Wed, Dec 12, 2018 at 01:21:33PM +0000, He, Bo wrote:
> > > > we reproduce on two boards, but I still not see the show_rcu_gp_kthreads() dump logs, it seems the patch can't catch the scenario.
> > > > I double confirmed the CONFIG_PROVE_RCU=y is enabled in the config as it's extracted from the /proc/config.gz.
> > > 
> > > Strange.
> > > 
> > > Are the systems responsive to sysrq keys once failure occurs?  If 
> > > so, I will provide you a sysrq-R or some such to dump out the RCU state.
> > 
> > Or, as it turns out, sysrq-y if booting with rcutree.sysrq_rcu=1 using the patch below.  Only lightly tested.
> 
> ------------------------------------------------------------------------
> 
> commit 04b6245c8458e8725f4169e62912c1fadfdf8141
> Author: Paul E. McKenney <paulmck@linux.ibm.com>
> Date:   Wed Dec 12 16:10:09 2018 -0800
> 
>     rcu: Add sysrq rcu_node-dump capability
>     
>     Backported from v4.21/v5.0
>     
>     Life is hard if RCU manages to get stuck without triggering RCU CPU
>     stall warnings or triggering the rcu_check_gp_start_stall() checks
>     for failing to start a grace period.  This commit therefore adds a
>     boot-time-selectable sysrq key (commandeering "y") that allows manually
>     dumping Tree RCU state.  The new rcutree.sysrq_rcu kernel boot parameter
>     must be set for this sysrq to be available.
>     
>     Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 0b760c1369f7..e9392a9d6291 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -61,6 +61,7 @@
>  #include <linux/trace_events.h>
>  #include <linux/suspend.h>
>  #include <linux/ftrace.h>
> +#include <linux/sysrq.h>
>  
>  #include "tree.h"
>  #include "rcu.h"
> @@ -128,6 +129,9 @@ int num_rcu_lvl[] = NUM_RCU_LVL_INIT;  int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
>  /* panic() on RCU Stall sysctl. */
>  int sysctl_panic_on_rcu_stall __read_mostly;
> +/* Commandeer a sysrq key to dump RCU's tree. */ static bool sysrq_rcu; 
> +module_param(sysrq_rcu, bool, 0444);
>  
>  /*
>   * The rcu_scheduler_active variable is initialized to the value @@ -662,6 +666,27 @@ void show_rcu_gp_kthreads(void)  }  EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
>  
> +/* Dump grace-period-request information due to commandeered sysrq. */ 
> +static void sysrq_show_rcu(int key) {
> +	show_rcu_gp_kthreads();
> +}
> +
> +static struct sysrq_key_op sysrq_rcudump_op = {
> +	.handler = sysrq_show_rcu,
> +	.help_msg = "show-rcu(y)",
> +	.action_msg = "Show RCU tree",
> +	.enable_mask = SYSRQ_ENABLE_DUMP,
> +};
> +
> +static int __init rcu_sysrq_init(void)
> +{
> +	if (sysrq_rcu)
> +		return register_sysrq_key('y', &sysrq_rcudump_op);
> +	return 0;
> +}
> +early_initcall(rcu_sysrq_init);
> +
>  /*
>   * Send along grace-period-related data for rcutorture diagnostics.
>   */
> 


  reply	other threads:[~2018-12-13  2:42 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-29  8:49 rcu_preempt caused oom He, Bo
2018-11-29 13:06 ` Paul E. McKenney
2018-11-29 14:27   ` Paul E. McKenney
2018-11-30  8:03     ` He, Bo
2018-11-30 14:43       ` Paul E. McKenney
2018-11-30 15:16         ` Steven Rostedt
2018-11-30 15:18           ` He, Bo
2018-11-30 16:49             ` Paul E. McKenney
2018-12-03  7:44               ` He, Bo
2018-12-03 13:56                 ` Paul E. McKenney
2018-12-04  7:50                   ` He, Bo
2018-12-04 19:49                     ` Paul E. McKenney
2018-12-05  8:42                       ` He, Bo
2018-12-05 17:44                         ` Paul E. McKenney
     [not found]                           ` <CD6925E8781EFD4D8E11882D20FC406D52A16C46@SHSMSX104.ccr.corp.intel.com>
2018-12-06 17:38                             ` Paul E. McKenney
     [not found]                               ` <CD6925E8781EFD4D8E11882D20FC406D52A180C5@SHSMSX104.ccr.corp.intel.com>
2018-12-07 14:11                                 ` Paul E. McKenney
2018-12-09 19:56                                   ` Paul E. McKenney
2018-12-10  6:56                                     ` He, Bo
2018-12-11  0:38                                       ` Paul E. McKenney
2018-12-11  4:46                                         ` Paul E. McKenney
2018-12-11  5:29                                           ` He, Bo
2018-12-12  1:37                                           ` He, Bo
2018-12-12  2:24                                             ` Paul E. McKenney
     [not found]                                               ` <CD6925E8781EFD4D8E11882D20FC406D52A192C3@SHSMSX104.ccr.corp.intel.com>
2018-12-12 15:42                                                 ` Paul E. McKenney
2018-12-12 21:03                                                   ` Paul E. McKenney
2018-12-12 23:13                                                     ` He, Bo
2018-12-13  0:12                                                       ` Paul E. McKenney
2018-12-13  2:11                                                         ` Zhang, Jun
2018-12-13  2:42                                                           ` Paul E. McKenney [this message]
     [not found]                                                             ` <88DC34334CA3444C85D647DBFA962C2735AD5F9E@SHSMSX104.ccr.corp.intel.com>
2018-12-13  4:40                                                               ` Paul E. McKenney
     [not found]                                                                 ` <CD6925E8781EFD4D8E11882D20FC406D52A197EC@SHSMSX104.ccr.corp.intel.com>
2018-12-13 18:11                                                                   ` Paul E. McKenney
2018-12-14  1:30                                                                     ` He, Bo
2018-12-14  2:15                                                                       ` Paul E. McKenney
2018-12-14  2:40                                                                         ` He, Bo
2018-12-14  5:10                                                                           ` Paul E. McKenney
2018-12-14  5:38                                                                             ` Paul E. McKenney
2018-12-17  3:15                                                                               ` He, Bo
2018-12-17  4:26                                                                                 ` Paul E. McKenney
     [not found]                                                                                   ` <CD6925E8781EFD4D8E11882D20FC406D52A1A634@SHSMSX104.ccr.corp.intel.com>
2018-12-18  2:46                                                                                     ` Zhang, Jun
2018-12-18  3:12                                                                                       ` He, Bo
2018-12-18  5:34                                                                                       ` Paul E. McKenney
2019-02-13  8:31                                                           ` [tip:core/rcu] rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes() tip-bot for Zhang, Jun
2019-02-13  8:30 ` [tip:core/rcu] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt tip-bot for Zhang, Jun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181213024234.GF4170@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=bo.he@intel.com \
    --cc=jiangshanlai@gmail.com \
    --cc=jie.a.bai@intel.com \
    --cc=jin.xiao@intel.com \
    --cc=josh@joshtriplett.org \
    --cc=jun.zhang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=rostedt@goodmis.org \
    --cc=yanmin.zhang@intel.com \
    --cc=yi.j.sun@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).