From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, rostedt@goodmis.org,
Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
patches@linaro.org, "Paul E. McKenney" <paul.mckenney@linaro.org>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: [PATCH RFC tip/core/rcu 32/41] rcu: Update stall-warning documentation
Date: Wed, 1 Feb 2012 11:41:50 -0800
Message-ID: <1328125319-5205-32-git-send-email-paulmck@linux.vnet.ibm.com>
In-Reply-To: <1328125319-5205-1-git-send-email-paulmck@linux.vnet.ibm.com>
From: "Paul E. McKenney" <paul.mckenney@linaro.org>
Add documentation of CONFIG_RCU_CPU_STALL_VERBOSE, CONFIG_RCU_CPU_STALL_INFO,
and RCU_STALL_DELAY_DELTA. Describe multiple stall-warning messages from
a single stall, and the timing of the subsequent messages. Add headings.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
Documentation/RCU/stallwarn.txt | 85 ++++++++++++++++++++++++++++++++++++---
1 files changed, 79 insertions(+), 6 deletions(-)
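
A rough sketch of the warning timing that the patch below documents, for
readers who want to sanity-check when messages should appear. It assumes
only what the text states: the first warning comes after the stall timeout,
CONFIG_PROVE_RCU adds five seconds (RCU_STALL_DELAY_DELTA), and the gap
between the first and second warnings is about three times the delay to the
first. The function name and the use of whole seconds are illustrative
assumptions, not kernel code.

```python
# Illustrative only: approximate times (in seconds since the start of the
# stall) of the first and second RCU CPU stall warnings.

RCU_STALL_DELAY_DELTA_S = 5  # extra seconds allowed under CONFIG_PROVE_RCU


def stall_warning_times(timeout_s, prove_rcu=False):
    """Return (first, second) warning times for a stall that lasts long enough."""
    # Delay from the beginning of the stall to the first warning.
    first = timeout_s + (RCU_STALL_DELAY_DELTA_S if prove_rcu else 0)
    # The first-to-second interval is about three times that delay.
    second = first + 3 * first
    return first, second
```

For example, stall_warning_times(10) gives (10, 40). Later warnings are not
modeled, because the text only describes the first-to-second interval.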
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index 083d88c..54f354c 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -14,12 +14,36 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
issues an RCU CPU stall warning. This time period is normally
ten seconds.
-RCU_SECONDS_TILL_STALL_RECHECK
+ This configuration parameter may be changed at runtime via
+ /sys/module/rcutree/parameters/rcu_cpu_stall_timeout. However,
+ this parameter is checked only at the beginning of a cycle.
+ So if you are 30 seconds into a 70-second stall, setting this
+ sysfs parameter to (say) five will shorten the timeout for the
+ -next- stall, or for the following warning for the current stall
+ (assuming the stall lasts long enough). It will not affect the
+ timing of the next warning for the current stall.
- This macro defines the period of time that RCU will wait after
- issuing a stall warning until it issues another stall warning
- for the same stall. This time period is normally set to three
- times the check interval plus thirty seconds.
+ Stall-warning messages may be enabled and disabled completely via
+ /sys/module/rcutree/parameters/rcu_cpu_stall_suppress.
+
+CONFIG_RCU_CPU_STALL_VERBOSE
+
+ This kernel configuration parameter causes the stall warning to
+ also dump the stacks of any tasks that are blocking the current
+ RCU-preempt grace period.
+
+CONFIG_RCU_CPU_STALL_INFO
+
+ This kernel configuration parameter causes the stall warning to
+ print out additional per-CPU diagnostic information, including
+ information on scheduling-clock ticks and RCU's idle-CPU tracking.
+
+RCU_STALL_DELAY_DELTA
+
+ Although the lockdep facility is extremely useful, it does add
+ some overhead. Therefore, under CONFIG_PROVE_RCU, the
+ RCU_STALL_DELAY_DELTA macro allows five extra seconds before
+ giving an RCU CPU stall warning message.
RCU_STALL_RAT_DELAY
@@ -64,6 +88,54 @@ INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffi
This is rare, but does happen from time to time in real life.
+If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
+more information is printed with the stall-warning message, for example:
+
+ INFO: rcu_preempt detected stall on CPU
+ 0: (63959 ticks this GP) idle=241/3fffffffffffffff/0
+ (t=65000 jiffies)
+
+In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
+printed:
+
+ INFO: rcu_preempt detected stall on CPU
+ 0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
+ (t=65000 jiffies)
+
+The "(64628 ticks this GP)" indicates that this CPU has taken more
+than 64,000 scheduling-clock interrupts during the current stalled
+grace period. If the CPU was not yet aware of the current grace
+period (for example, if it was offline), then this part of the message
+indicates how many grace periods behind the CPU is.
+
+The "idle=" portion of the message prints the dyntick-idle state.
+The hex number before the first "/" is the low-order 12 bits of the
+dynticks counter, which will have an even-numbered value if the CPU is
+in dyntick-idle mode and an odd-numbered value otherwise. The hex
+number between the two "/"s is the value of the nesting, which will
+be a small positive number if in the idle loop and a very large positive
+number (as shown above) otherwise.
+
+For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
+CPU is not in the process of trying to force itself into dyntick-idle
+state, the "." indicates that the CPU has not given up forcing RCU
+into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
+indicates that the CPU has not recently forced RCU into dyntick-idle
+mode (it would otherwise indicate the number of microseconds remaining
+in this forced state).
+
+
+Multiple Warnings From One Stall
+
+If a stall lasts long enough, multiple stall-warning messages will be
+printed for it. The second and subsequent messages are printed at
+longer intervals, so that the time between (say) the first and second
+messages will be about three times the interval between the beginning
+of the stall and the first message.
+
+
+What Causes RCU CPU Stall Warnings?
+
So your kernel printed an RCU CPU stall warning. The next question is
"What caused it?" The following problems can result in RCU CPU stall
warnings:
@@ -128,4 +200,5 @@ is occurring, which will usually be in the function nearest the top of
that portion of the stack which remains the same from trace to trace.
If you can reliably trigger the stall, ftrace can be quite helpful.
-RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.
+RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE
+and with RCU's event tracing.
--
1.7.8
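
The "idle=" decoding rules described in the patch above can be sketched as a
small parser for log triage. This is a minimal illustration, not part of the
kernel: the function name, the returned dictionary keys, and the regular
expression are assumptions. It interprets only the two fields the
documentation explains (the low-order 12 bits of the dynticks counter and
the nesting value) and ignores the third field, whose meaning the text does
not give.

```python
import re

# Matches the "idle=<hex>/<hex>/<hex>" portion of a stall-warning line.
_IDLE_RE = re.compile(r"idle=([0-9a-f]+)/([0-9a-f]+)/([0-9a-f]+)")


def decode_idle(line):
    """Decode the dyntick-idle fields of one per-CPU stall-warning line.

    Returns None if the line carries no "idle=" field.
    """
    match = _IDLE_RE.search(line)
    if match is None:
        return None
    dynticks = int(match.group(1), 16)  # low-order 12 bits of the counter
    nesting = int(match.group(2), 16)   # small if in the idle loop, else huge
    return {
        "dynticks_low12": dynticks,
        # Even value: CPU is in dyntick-idle mode; odd value: it is not.
        "in_dyntick_idle": dynticks % 2 == 0,
        "nesting": nesting,
    }
```

Fed the sample line "0: (63959 ticks this GP) idle=241/3fffffffffffffff/0",
the sketch reports an odd dynticks value (0x241) and a very large nesting
value, meaning the CPU is neither in dyntick-idle mode nor in the idle loop.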