From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Damien Wyart" <damien.wyart@free.fr>,
"Peter Zijlstra" <a.p.zijlstra@chello.nl>,
"Mike Galbraith" <efault@gmx.de>,
"Frédéric Weisbecker" <fweisbec@gmail.com>,
"Rafael J. Wysocki" <rjw@sisk.pl>,
"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
"Kernel Testers List" <kernel-testers@vger.kernel.org>
Subject: Re: [Bug #12650] Strange load average and ksoftirqd behavior with 2.6.29-rc2-git1
Date: Mon, 16 Feb 2009 14:39:44 -0800
Message-ID: <20090216223944.GF6785@linux.vnet.ibm.com>
In-Reply-To: <20090216200923.GA28938@elte.hu>
On Mon, Feb 16, 2009 at 09:09:23PM +0100, Ingo Molnar wrote:
>
> * Paul E. McKenney <paulmck@linux.vnet.ibm.com> wrote:
>
> > Here the calls to rcu_process_callbacks() are only 75
> > microseconds apart, so that this function is consuming more
> > than 10% of a CPU. The strange thing is that I don't see a
> > raise_softirq() in between, though perhaps it gets inlined or
> > something that makes it invisible to ftrace.
>
> look at the latest trace please, that has even the most inline
> raise-softirq method instrumented, so all the raising is
> visible.
Ah, my apologies! This time looking at:
http://damien.wyart.free.fr/ksoftirqd_pb/trace_tip_2009.02.16_ksoftirqd_pb_abstime_proc.txt.gz
799.521187 | 1) <idle>-0 | | rcu_check_callbacks() {
799.521371 | 1) <idle>-0 | | rcu_check_callbacks() {
799.521555 | 1) <idle>-0 | | rcu_check_callbacks() {
799.521738 | 1) <idle>-0 | | rcu_check_callbacks() {
799.521934 | 1) <idle>-0 | | rcu_check_callbacks() {
799.522068 | 1) ksoftir-2324 | | rcu_check_callbacks() {
799.522208 | 1) <idle>-0 | | rcu_check_callbacks() {
799.522392 | 1) <idle>-0 | | rcu_check_callbacks() {
799.522575 | 1) <idle>-0 | | rcu_check_callbacks() {
799.522759 | 1) <idle>-0 | | rcu_check_callbacks() {
799.522956 | 1) <idle>-0 | | rcu_check_callbacks() {
799.523074 | 1) ksoftir-2324 | | rcu_check_callbacks() {
799.523214 | 1) <idle>-0 | | rcu_check_callbacks() {
799.523397 | 1) <idle>-0 | | rcu_check_callbacks() {
799.523579 | 1) <idle>-0 | | rcu_check_callbacks() {
799.523762 | 1) <idle>-0 | | rcu_check_callbacks() {
799.523960 | 1) <idle>-0 | | rcu_check_callbacks() {
799.524079 | 1) ksoftir-2324 | | rcu_check_callbacks() {
799.524220 | 1) <idle>-0 | | rcu_check_callbacks() {
799.524403 | 1) <idle>-0 | | rcu_check_callbacks() {
799.524587 | 1) <idle>-0 | | rcu_check_callbacks() {
799.524770 | 1) <idle>-0 | | rcu_check_callbacks() {
[ . . . ]
Yikes!!!
Why is rcu_check_callbacks() being invoked so often? It should be called
but once per jiffy, and here it is called no less than 22 times in about
3.5 milliseconds, meaning one call every 160 microseconds or so.
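The arithmetic behind that estimate can be checked with a small sketch. The helper name `mean_gap_us` is hypothetical, and the only inputs are the first and last timestamps from the excerpt above plus the count of 22 calls. Dividing the span by the 21 gaps between calls gives roughly 171 microseconds, and dividing by the 22 calls gives roughly 163, so either way the rate is far above once per jiffy for any common HZ setting.

```c
#include <assert.h>

/*
 * Mean gap, in microseconds, between consecutive events, given the
 * first and last timestamps (in seconds) and the number of events.
 * n events are separated by n - 1 gaps.
 */
static double mean_gap_us(double first_s, double last_s, unsigned int n)
{
	assert(n >= 2);
	return (last_s - first_s) * 1e6 / (double)(n - 1);
}
```

For the excerpt above, the call would be `mean_gap_us(799.521187, 799.524770, 22)`.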
Hmmm...
Looks like we never return from:
799.521142 | 1) <idle>-0 | | tick_nohz_stop_sched_tick() {
Perhaps we are taking an interrupt immediately after the
local_irq_restore()? And at 799.521209 deciding to exit nohz mode.
And then deciding to go back into nohz mode at 799.521326, 117
microseconds later, after which we re-invoke rcu_check_callbacks(),
which again raises RCU's softirq.
And the reason we are invoking rcu_check_callbacks() so often appears
to be in arch/x86/kernel/process_32.c cpu_idle() near line 107,
which explains my failure to reproduce on a 64-bit system:
void cpu_idle(void)
{
	int cpu = smp_processor_id();
	current_thread_info()->status |= TS_POLLING;
	/* endless idle loop with no priority at all */
	while (1) {
		tick_nohz_stop_sched_tick(1);
		while (!need_resched()) {
			check_pgt_cache();
			rmb();
			if (rcu_pending(cpu))
				rcu_check_callbacks(cpu, 0);
			if (cpu_is_offline(cpu))
				play_dead();
			local_irq_disable();
			__get_cpu_var(irq_stat).idle_timestamp = jiffies;
			/* Don't trace irqs off for idle */
			stop_critical_timings();
			pm_idle();
			start_critical_timings();
		}
		tick_nohz_restart_sched_tick();
		preempt_enable_no_resched();
		schedule();
		preempt_disable();
	}
}
If we go in and out of nohz mode quickly, we will invoke rcu_pending()
each time. I would expect rcu_pending() to return 0 most of the time,
but that apparently isn't the case with treercu...
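A toy userspace model of that inner loop may make the cost concrete. This is only a sketch: every name in it is hypothetical, the function pointer stands in for rcu_pending(), and the counter stands in for calls to rcu_check_callbacks(). Each wakeup from idle runs the check once, so if the check keeps returning nonzero, every wakeup also pays for the callback check.

```c
/* Hypothetical counters; nothing here is a kernel structure. */
struct idle_model_stats {
	unsigned long pending_checks;	/* times the rcu_pending() stand-in ran */
	unsigned long callback_checks;	/* times the rcu_check_callbacks() stand-in ran */
};

/* Stand-ins for rcu_pending() returning nonzero or zero every time. */
static int always_pending(void) { return 1; }
static int never_pending(void) { return 0; }

/* Model one pass of the inner idle loop per wakeup. */
static void model_idle_loop(struct idle_model_stats *st, unsigned long wakeups,
			    int (*pending)(void))
{
	unsigned long i;

	for (i = 0; i < wakeups; i++) {
		st->pending_checks++;
		if (pending())
			st->callback_checks++;
		/* The real loop would sleep in pm_idle() here until the next interrupt. */
	}
}
```

With `always_pending`, the number of callback checks equals the number of wakeups. With `never_pending`, it stays at zero, which is the behavior expected most of the time.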
What is the easiest way for me to trace the return path from
__rcu_pending()? Make each return path call an empty function
located off where the compiler cannot see it (and so cannot inline
it), I guess... Diagnostic patch along these lines below. Frederic,
Damien, could you please give it a go? (And of course please let me
know if something else is needed.)
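The same idea can be sketched in userspace, as an illustration only. The names `demo_state`, `demo_pending`, and the `mark_*` functions are hypothetical. Instead of placing the markers in a separate file, this sketch swaps in the GCC/Clang `noinline` attribute plus an empty asm statement so that the calls survive optimization and stay visible to a function tracer or a debugger breakpoint.

```c
/* Keep each marker as a real, separately callable function (GCC/Clang). */
#define TRACE_MARKER __attribute__((noinline))

/* Empty markers: the empty asm keeps the calls from being optimized away. */
TRACE_MARKER void mark_qs_pending(void) { __asm__ volatile(""); }
TRACE_MARKER void mark_callbacks_ready(void) { __asm__ volatile(""); }

/* Hypothetical stand-in for the per-CPU state the real check inspects. */
struct demo_state {
	int qs_pending;
	int callbacks_ready;
};

/* Each early return first calls its own marker, so a trace shows which one fired. */
int demo_pending(const struct demo_state *s)
{
	if (s->qs_pending) {
		mark_qs_pending();
		return 1;
	}
	if (s->callbacks_ready) {
		mark_callbacks_ready();
		return 1;
	}
	return 0;	/* nothing to do */
}
```

The markers change nothing about the return value. Their only purpose is to leave a distinct function call on each return path.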
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
rcupdate.c | 23 +++++++++++++++++++++++
rcutree.c | 31 +++++++++++++++++++++++++------
2 files changed, 48 insertions(+), 6 deletions(-)
diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index d92a76a..42bbf03 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -175,3 +175,26 @@ void __init rcu_init(void)
 	__rcu_init();
 }
+void __rcu_pending_qs_pending(void)
+{
+}
+
+void __rcu_pending_callbacks_ready(void)
+{
+}
+
+void __rcu_pending_needs_gp(void)
+{
+}
+
+void __rcu_pending_new_completed(void)
+{
+}
+
+void __rcu_pending_new_gp(void)
+{
+}
+
+void __rcu_pending_fqs(void)
+{
+}
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index b2fd602..e2d72c3 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1234,6 +1234,13 @@ void call_rcu_bh(struct rcu_head *head, void (*func)(struct rcu_head *rcu))
 }
 EXPORT_SYMBOL_GPL(call_rcu_bh);
+extern void __rcu_pending_qs_pending(void);
+extern void __rcu_pending_callbacks_ready(void);
+extern void __rcu_pending_needs_gp(void);
+extern void __rcu_pending_new_completed(void);
+extern void __rcu_pending_new_gp(void);
+extern void __rcu_pending_fqs(void);
+
 /*
  * Check to see if there is any immediate RCU-related work to be done
  * by the current CPU, for the specified type of RCU, returning 1 if so.
@@ -1249,30 +1256,42 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 	check_cpu_stall(rsp, rdp);
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
-	if (rdp->qs_pending)
+	if (rdp->qs_pending) {
+		__rcu_pending_qs_pending();
 		return 1;
+	}
 	/* Does this CPU have callbacks ready to invoke? */
-	if (cpu_has_callbacks_ready_to_invoke(rdp))
+	if (cpu_has_callbacks_ready_to_invoke(rdp)) {
+		__rcu_pending_callbacks_ready();
 		return 1;
+	}
 	/* Has RCU gone idle with this CPU needing another grace period? */
-	if (cpu_needs_another_gp(rsp, rdp))
+	if (cpu_needs_another_gp(rsp, rdp)) {
+		__rcu_pending_needs_gp();
 		return 1;
+	}
 	/* Has another RCU grace period completed? */
-	if (ACCESS_ONCE(rsp->completed) != rdp->completed) /* outside of lock */
+	if (ACCESS_ONCE(rsp->completed) != rdp->completed) /* outside of lock */ {
+		__rcu_pending_new_completed();
 		return 1;
+	}
 	/* Has a new RCU grace period started? */
-	if (ACCESS_ONCE(rsp->gpnum) != rdp->gpnum) /* outside of lock */
+	if (ACCESS_ONCE(rsp->gpnum) != rdp->gpnum) /* outside of lock */ {
+		__rcu_pending_new_gp();
 		return 1;
+	}
 	/* Has an RCU GP gone long enough to send resched IPIs &c? */
 	if (ACCESS_ONCE(rsp->completed) != ACCESS_ONCE(rsp->gpnum) &&
 	    ((long)(ACCESS_ONCE(rsp->jiffies_force_qs) - jiffies) < 0 ||
-	     (rdp->n_rcu_pending_force_qs - rdp->n_rcu_pending) < 0))
+	     (rdp->n_rcu_pending_force_qs - rdp->n_rcu_pending) < 0)) {
+		__rcu_pending_fqs();
 		return 1;
+	}
 	/* nothing to do */
 	return 0;