From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: "He, Bo" <bo.he@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"josh@joshtriplett.org" <josh@joshtriplett.org>,
"mathieu.desnoyers@efficios.com" <mathieu.desnoyers@efficios.com>,
"jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
"Zhang, Jun" <jun.zhang@intel.com>,
"Xiao, Jin" <jin.xiao@intel.com>,
"Zhang, Yanmin" <yanmin.zhang@intel.com>,
"Bai, Jie A" <jie.a.bai@intel.com>,
"Sun, Yi J" <yi.j.sun@intel.com>
Subject: Re: rcu_preempt caused oom
Date: Wed, 12 Dec 2018 16:12:14 -0800 [thread overview]
Message-ID: <20181213001214.GE4170@linux.ibm.com> (raw)
In-Reply-To: <CD6925E8781EFD4D8E11882D20FC406D52A193C3@SHSMSX104.ccr.corp.intel.com>
On Wed, Dec 12, 2018 at 11:13:22PM +0000, He, Bo wrote:
> I don't see the rcutree.sysrq_rcu parameter in v4.19 kernel, I also checked the latest kernel and the latest tag v4.20-rc6, not see the sysrq_rcu.
> Please correct me if I have something wrong.
That would be because I sent you the wrong patch, apologies! :-/
Please instead see the one below, which does add sysrq_rcu.
Thanx, Paul
> -----Original Message-----
> From: Paul E. McKenney <paulmck@linux.ibm.com>
> Sent: Thursday, December 13, 2018 5:03 AM
> To: He, Bo <bo.he@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>; linux-kernel@vger.kernel.org; josh@joshtriplett.org; mathieu.desnoyers@efficios.com; jiangshanlai@gmail.com; Zhang, Jun <jun.zhang@intel.com>; Xiao, Jin <jin.xiao@intel.com>; Zhang, Yanmin <yanmin.zhang@intel.com>; Bai, Jie A <jie.a.bai@intel.com>
> Subject: Re: rcu_preempt caused oom
>
> On Wed, Dec 12, 2018 at 07:42:24AM -0800, Paul E. McKenney wrote:
> > On Wed, Dec 12, 2018 at 01:21:33PM +0000, He, Bo wrote:
> > > we reproduce on two boards, but I still not see the show_rcu_gp_kthreads() dump logs, it seems the patch can't catch the scenario.
> > > I double confirmed the CONFIG_PROVE_RCU=y is enabled in the config as it's extracted from the /proc/config.gz.
> >
> > Strange.
> >
> > Are the systems responsive to sysrq keys once failure occurs? If so,
> > I will provide you a sysrq-R or some such to dump out the RCU state.
>
> Or, as it turns out, sysrq-y if booting with rcutree.sysrq_rcu=1 using the patch below. Only lightly tested.
------------------------------------------------------------------------
commit 04b6245c8458e8725f4169e62912c1fadfdf8141
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date: Wed Dec 12 16:10:09 2018 -0800
rcu: Add sysrq rcu_node-dump capability
Backported from v4.21/v5.0
Life is hard if RCU manages to get stuck without triggering RCU CPU
stall warnings or triggering the rcu_check_gp_start_stall() checks
for failing to start a grace period. This commit therefore adds a
boot-time-selectable sysrq key (commandeering "y") that allows manually
dumping Tree RCU state. The new rcutree.sysrq_rcu kernel boot parameter
must be set for this sysrq to be available.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0b760c1369f7..e9392a9d6291 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -61,6 +61,7 @@
#include <linux/trace_events.h>
#include <linux/suspend.h>
#include <linux/ftrace.h>
+#include <linux/sysrq.h>
#include "tree.h"
#include "rcu.h"
@@ -128,6 +129,9 @@ int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
/* panic() on RCU Stall sysctl. */
int sysctl_panic_on_rcu_stall __read_mostly;
+/* Commandeer a sysrq key to dump RCU's tree. */
+static bool sysrq_rcu;
+module_param(sysrq_rcu, bool, 0444);
/*
* The rcu_scheduler_active variable is initialized to the value
@@ -662,6 +666,27 @@ void show_rcu_gp_kthreads(void)
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
+/* Dump grace-period-request information due to commandeered sysrq. */
+static void sysrq_show_rcu(int key)
+{
+ show_rcu_gp_kthreads();
+}
+
+static struct sysrq_key_op sysrq_rcudump_op = {
+ .handler = sysrq_show_rcu,
+ .help_msg = "show-rcu(y)",
+ .action_msg = "Show RCU tree",
+ .enable_mask = SYSRQ_ENABLE_DUMP,
+};
+
+static int __init rcu_sysrq_init(void)
+{
+ if (sysrq_rcu)
+ return register_sysrq_key('y', &sysrq_rcudump_op);
+ return 0;
+}
+early_initcall(rcu_sysrq_init);
+
/*
* Send along grace-period-related data for rcutorture diagnostics.
*/
next prev parent reply other threads:[~2018-12-13 0:33 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-29 8:49 rcu_preempt caused oom He, Bo
2018-11-29 13:06 ` Paul E. McKenney
2018-11-29 14:27 ` Paul E. McKenney
2018-11-30 8:03 ` He, Bo
2018-11-30 14:43 ` Paul E. McKenney
2018-11-30 15:16 ` Steven Rostedt
2018-11-30 15:18 ` He, Bo
2018-11-30 16:49 ` Paul E. McKenney
2018-12-03 7:44 ` He, Bo
2018-12-03 13:56 ` Paul E. McKenney
2018-12-04 7:50 ` He, Bo
2018-12-04 19:49 ` Paul E. McKenney
2018-12-05 8:42 ` He, Bo
2018-12-05 17:44 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A16C46@SHSMSX104.ccr.corp.intel.com>
2018-12-06 17:38 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A180C5@SHSMSX104.ccr.corp.intel.com>
2018-12-07 14:11 ` Paul E. McKenney
2018-12-09 19:56 ` Paul E. McKenney
2018-12-10 6:56 ` He, Bo
2018-12-11 0:38 ` Paul E. McKenney
2018-12-11 4:46 ` Paul E. McKenney
2018-12-11 5:29 ` He, Bo
2018-12-12 1:37 ` He, Bo
2018-12-12 2:24 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A192C3@SHSMSX104.ccr.corp.intel.com>
2018-12-12 15:42 ` Paul E. McKenney
2018-12-12 21:03 ` Paul E. McKenney
2018-12-12 23:13 ` He, Bo
2018-12-13 0:12 ` Paul E. McKenney [this message]
2018-12-13 2:11 ` Zhang, Jun
2018-12-13 2:42 ` Paul E. McKenney
[not found] ` <88DC34334CA3444C85D647DBFA962C2735AD5F9E@SHSMSX104.ccr.corp.intel.com>
2018-12-13 4:40 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A197EC@SHSMSX104.ccr.corp.intel.com>
2018-12-13 18:11 ` Paul E. McKenney
2018-12-14 1:30 ` He, Bo
2018-12-14 2:15 ` Paul E. McKenney
2018-12-14 2:40 ` He, Bo
2018-12-14 5:10 ` Paul E. McKenney
2018-12-14 5:38 ` Paul E. McKenney
2018-12-17 3:15 ` He, Bo
2018-12-17 4:26 ` Paul E. McKenney
[not found] ` <CD6925E8781EFD4D8E11882D20FC406D52A1A634@SHSMSX104.ccr.corp.intel.com>
2018-12-18 2:46 ` Zhang, Jun
2018-12-18 3:12 ` He, Bo
2018-12-18 5:34 ` Paul E. McKenney
2019-02-13 8:31 ` [tip:core/rcu] rcu: Prevent needless ->gp_seq_needed update in __note_gp_changes() tip-bot for Zhang, Jun
2019-02-13 8:30 ` [tip:core/rcu] rcu: Do RCU GP kthread self-wakeup from softirq and interrupt tip-bot for Zhang, Jun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181213001214.GE4170@linux.ibm.com \
--to=paulmck@linux.ibm.com \
--cc=bo.he@intel.com \
--cc=jiangshanlai@gmail.com \
--cc=jie.a.bai@intel.com \
--cc=jin.xiao@intel.com \
--cc=josh@joshtriplett.org \
--cc=jun.zhang@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=rostedt@goodmis.org \
--cc=yanmin.zhang@intel.com \
--cc=yi.j.sun@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.