From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "He, Bo" <bo.he@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"josh@joshtriplett.org" <josh@joshtriplett.org>,
	"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
	"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
	"Zhang, Jun" <jun.zhang@intel.com>,
	"Xiao, Jin" <jin.xiao@intel.com>,
	"Zhang, Yanmin" <yanmin.zhang@intel.com>,
	"Bai, Jie A" <jie.a.bai@intel.com>,
	"Sun, Yi J" <yi.j.sun@intel.com>
Subject: Re: rcu_preempt caused oom
Date: Wed, 12 Dec 2018 16:12:14 -0800
Message-ID: <20181213001214.GE4170@linux.ibm.com>
In-Reply-To: <CD6925E8781EFD4D8E11882D20FC406D52A193C3@SHSMSX104.ccr.corp.intel.com>

On Wed, Dec 12, 2018 at 11:13:22PM +0000, He, Bo wrote:
> I don't see the rcutree.sysrq_rcu parameter in the v4.19 kernel. I also checked the latest kernel and the latest tag, v4.20-rc6, and did not see sysrq_rcu there either.
> Please correct me if I have something wrong.

That would be because I sent you the wrong patch, apologies!  :-/

Please instead see the one below, which does add sysrq_rcu.

							Thanx, Paul

> -----Original Message-----
> From: Paul E. McKenney <paulmck@linux.ibm.com> 
> Sent: Thursday, December 13, 2018 5:03 AM
> To: He, Bo <bo.he@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>; linux-kernel@vger.kernel.org; josh@joshtriplett.org; mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun <jun.zhang@intel.com>; Xiao, Jin <jin.xiao@intel.com>; Zhang, Yanmin <yanmin.zhang@intel.com>; Bai, Jie A <jie.a.bai@intel.com>
> Subject: Re: rcu_preempt caused oom
> 
> On Wed, Dec 12, 2018 at 07:42:24AM -0800, Paul E. McKenney wrote:
> > On Wed, Dec 12, 2018 at 01:21:33PM +0000, He, Bo wrote:
> > > We reproduced the problem on two boards, but I still do not see the show_rcu_gp_kthreads() dump logs, so it seems the patch can't catch this scenario.
> > > I double-checked that CONFIG_PROVE_RCU=y is enabled in the config, which was extracted from /proc/config.gz.
> > 
> > Strange.
> > 
> > Are the systems responsive to sysrq keys once failure occurs?  If so, 
> > I will provide you a sysrq-R or some such to dump out the RCU state.
> 
> Or, as it turns out, sysrq-y if booting with rcutree.sysrq_rcu=1 using the patch below.  Only lightly tested.

------------------------------------------------------------------------

commit 04b6245c8458e8725f4169e62912c1fadfdf8141
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date:   Wed Dec 12 16:10:09 2018 -0800

    rcu: Add sysrq rcu_node-dump capability
    
    Backported from v4.21/v5.0
    
    Life is hard if RCU manages to get stuck without triggering RCU CPU
    stall warnings or triggering the rcu_check_gp_start_stall() checks
    for failing to start a grace period.  This commit therefore adds a
    boot-time-selectable sysrq key (commandeering "y") that allows manually
    dumping Tree RCU state.  The new rcutree.sysrq_rcu kernel boot parameter
    must be set for this sysrq to be available.
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0b760c1369f7..e9392a9d6291 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -61,6 +61,7 @@
 #include <linux/trace_events.h>
 #include <linux/suspend.h>
 #include <linux/ftrace.h>
+#include <linux/sysrq.h>
 
 #include "tree.h"
 #include "rcu.h"
@@ -128,6 +129,9 @@ int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
 int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
 /* panic() on RCU Stall sysctl. */
 int sysctl_panic_on_rcu_stall __read_mostly;
+/* Commandeer a sysrq key to dump RCU's tree. */
+static bool sysrq_rcu;
+module_param(sysrq_rcu, bool, 0444);
 
 /*
  * The rcu_scheduler_active variable is initialized to the value
@@ -662,6 +666,27 @@ void show_rcu_gp_kthreads(void)
 }
 EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
 
+/* Dump grace-period-request information due to commandeered sysrq. */
+static void sysrq_show_rcu(int key)
+{
+	show_rcu_gp_kthreads();
+}
+
+static struct sysrq_key_op sysrq_rcudump_op = {
+	.handler = sysrq_show_rcu,
+	.help_msg = "show-rcu(y)",
+	.action_msg = "Show RCU tree",
+	.enable_mask = SYSRQ_ENABLE_DUMP,
+};
+
+static int __init rcu_sysrq_init(void)
+{
+	if (sysrq_rcu)
+		return register_sysrq_key('y', &sysrq_rcudump_op);
+	return 0;
+}
+early_initcall(rcu_sysrq_init);
+
 /*
  * Send along grace-period-related data for rcutorture diagnostics.
  */
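
For systems without a usable keyboard or serial break, the same dump can
in principle be requested from user space. What follows is only a sketch
and is not part of the patch above: it assumes the kernel was booted with
rcutree.sysrq_rcu=1, that /proc/sysrq-trigger is present, and that the
caller is root. The helper name write_sysrq_key() is hypothetical. The
output of show_rcu_gp_kthreads() should then appear in the kernel log.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one sysrq command character to the
 * trigger file at "path" (normally /proc/sysrq-trigger).
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_sysrq_key(const char *path, char key)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;
	int err = 0;

	if (fd < 0)
		return -errno;

	/* The trigger file acts on the first character written. */
	n = write(fd, &key, 1);
	if (n < 0)
		err = -errno;
	else if (n != 1)
		err = -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

A caller would invoke write_sysrq_key("/proc/sysrq-trigger", 'y') and
then read the kernel log, for example with dmesg.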

