From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Josh Triplett <josht@linux.vnet.ibm.com>,
linux-kernel@vger.kernel.org, laijs@cn.fujitsu.com,
dipankar@in.ibm.com, akpm@linux-foundation.org,
mathieu.desnoyers@polymtl.ca, dvhltc@us.ibm.com, niv@us.ibm.com,
tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
hugh.dickins@tiscali.co.uk, benh@kernel.crashing.org
Subject: Re: [PATCH -tip/core/rcu 1/6] Cleanups and fixes for RCU in face of heavy CPU-hotplug stress
Date: Tue, 18 Aug 2009 13:07:01 -0700
Message-ID: <20090818200701.GG6766@linux.vnet.ibm.com>
In-Reply-To: <20090818152643.GA5549@elte.hu>

On Tue, Aug 18, 2009 at 05:26:43PM +0200, Ingo Molnar wrote:
>
> FYI, i've started triggering hangs in -tip testing recently, during
> CPU hotplug tests:
>
> [ 57.632003] eth0: no IPv6 routers present
> [ 103.564010] kmemleak: 29 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> [ 200.380003] Hangcheck: hangcheck value past margin!
> [ 248.192003] INFO: task S99local:2974 blocked for more than 120 seconds.
> [ 248.194532] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 248.202330] S99local D 0000000c 6256 2974 2687 0x00000000
> [ 248.208929] 9c7ebe90 00000086 6b67ef8b 0000000c 9f25a610 81a69869 00000001 820b6990
> [ 248.216123] 820b6990 820b6990 9c6e4c20 9c6e4eb4 82c78990 00000000 6b993559 0000000c
> [ 248.220616] 9c7ebe90 8105f22a 9c6e4eb4 9c6e4c20 00000001 9c7ebe98 9c7ebeb4 81a65cb3
> [ 248.229990] Call Trace:
> [ 248.234049] [<81a69869>] ? _spin_unlock_irqrestore+0x22/0x37
> [ 248.239769] [<8105f22a>] ? prepare_to_wait+0x48/0x4e
> [ 248.244796] [<81a65cb3>] rcu_barrier_cpu_hotplug+0xaa/0xc9
> [ 248.250343] [<8105f029>] ? autoremove_wake_function+0x0/0x38
> [ 248.256063] [<81062cf2>] notifier_call_chain+0x49/0x71
> [ 248.261263] [<81062da0>] raw_notifier_call_chain+0x11/0x13
> [ 248.266809] [<81a0b475>] _cpu_down+0x272/0x288
> [ 248.271316] [<81a0b4d5>] cpu_down+0x4a/0xa2
> [ 248.275563] [<81a0c48a>] store_online+0x2a/0x5e
> [ 248.280156] [<81a0c460>] ? store_online+0x0/0x5e
> [ 248.284836] [<814ddc35>] sysdev_store+0x20/0x28
> [ 248.289429] [<8112e403>] sysfs_write_file+0xb8/0xe3
> [ 248.294369] [<8112e34b>] ? sysfs_write_file+0x0/0xe3
> [ 248.299396] [<810e4c8f>] vfs_write+0x91/0x120
> [ 248.303817] [<810e4dc1>] sys_write+0x40/0x65
> [ 248.308150] [<81002d73>] sysenter_do_call+0x12/0x28
>
> config and bootlog attached. I'd suspect one of these patches:
>
> 684ca5c: rcu: Fix typo in rcu_irq_exit() comment header
> b612ba8: rcu: Make rcupreempt_trace.c look at offline CPUs
> 8064d54: rcu: Make preemptable RCU scan all CPUs when summing RCU counters
> 2e59755: rcu: Simplify RCU CPU-hotplug notification
> 799e64f: cpu hotplug: Introduce cpu_notifier() to handle !HOTPLUG_CPU case
> 2756962: rcu: Split hierarchical RCU initialization into boot-time and CPU-online piece
>
> Any ideas?

Gah... I thought I had fixed that one!!! I was seeing a deadlock
where rcu_barrier_cpu_hotplug() would register the three RCU callbacks,
then wait for them. But in some situations, it would wait for them in
a state such that the grace period could not complete. I convinced
myself that moving the wait back from CPU_DEAD to CPU_POST_DEAD solved
the problem.

I am going to take a more bullet-proof approach, switching from the
wait_for_completion() form to wait_event(), which will allow me to wait
for the previous hotplug operation's callbacks at the beginning of the
subsequent hotplug operation.

I reserve the right to insert a short delay in the CPU-hotplug path
outside of any locks, but would imagine that people would prefer that
I avoid that sort of thing, at least until we have bulk CPU-hotplug
operations.

							Thanx, Paul
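
The deferred-wait idea in the message above can be illustrated with a small
userspace sketch. This is not the kernel patch and makes no claim about how
the eventual fix was written: a pthread condition variable stands in for
wait_event(), detached threads stand in for the three RCU callbacks, and
every name here (hotplug_operation, wait_for_previous_callbacks, pending,
completed) is hypothetical. The point it shows is only the ordering: each
operation waits at its start for the callbacks posted by the previous
operation, then posts its own callbacks and returns without waiting.

```c
#include <assert.h>
#include <pthread.h>

#define NR_CALLBACKS 3	/* the message mentions three RCU callbacks */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;
static int pending;	/* callbacks still outstanding from the last operation */
static int completed;	/* total callbacks that have run so far */

/* Hypothetical stand-in for an RCU callback: runs later, reports completion. */
static void *callback(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&lock);
	completed++;
	if (--pending == 0)
		pthread_cond_broadcast(&done);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* wait_event()-style wait: sleep until the condition "pending == 0" holds. */
static void wait_for_previous_callbacks(void)
{
	pthread_mutex_lock(&lock);
	while (pending != 0)
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);
}

/*
 * One simulated hotplug operation. Returns the number of callbacks that
 * had completed when the operation started.
 */
static int hotplug_operation(void)
{
	pthread_t tid;
	int seen, i, rc;

	/* Wait first, for the callbacks of the previous operation. */
	wait_for_previous_callbacks();

	pthread_mutex_lock(&lock);
	assert(pending == 0);
	seen = completed;
	pending = NR_CALLBACKS;
	pthread_mutex_unlock(&lock);

	/* Post this operation's callbacks, but do not wait for them here. */
	for (i = 0; i < NR_CALLBACKS; i++) {
		rc = pthread_create(&tid, NULL, callback, NULL);
		assert(rc == 0);
		pthread_detach(tid);
	}
	return seen;
}
```

Calling hotplug_operation() three times in a row returns 0, 3, and 6: the
second and third calls cannot proceed until the callbacks posted by the call
before them have all run, while no call ever blocks on its own callbacks.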

Thread overview: 38+ messages
2009-08-15 16:51 [PATCH -tip/core/rcu 1/6] Cleanups and fixes for RCU in face of heavy CPU-hotplug stress Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 1/6] Split hierarchical RCU initialization into boot-time and CPU-online pieces Paul E. McKenney
2009-08-15 17:07 ` [tip:core/rcu] rcu: " tip-bot for Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 2/6] Introduce cpu_notifier() to handle !HOTPLUG_CPU case Paul E. McKenney
2009-08-15 17:07 ` [tip:core/rcu] cpu hotplug: " tip-bot for Paul E. McKenney
2009-08-17 17:21 ` [PATCH -tip/core/rcu 2/6] " Josh Triplett
2009-08-17 18:28 ` Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 3/6] Simplify RCU CPU-hotplug notification Paul E. McKenney
2009-08-15 17:07 ` [tip:core/rcu] rcu: " tip-bot for Paul E. McKenney
2009-08-20 4:02 ` [PATCH -tip/core/rcu 3/6] " Lai Jiangshan
2009-08-20 4:21 ` Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 4/6] Make preemptable RCU scan all CPUs when summing RCU counters Paul E. McKenney
2009-08-15 17:07 ` [tip:core/rcu] rcu: " tip-bot for Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 5/6] Make rcupreempt_trace.c look at offline CPUs Paul E. McKenney
2009-08-15 17:07 ` [tip:core/rcu] rcu: " tip-bot for Paul E. McKenney
2009-08-15 16:53 ` [PATCH -tip/core/rcu 6/6] Fix typo in rcu_irq_exit() comment header Paul E. McKenney
2009-08-15 17:00 ` Ingo Molnar
2009-08-15 17:10 ` Paul E. McKenney
2009-08-15 17:11 ` Ingo Molnar
2009-08-15 17:08 ` [tip:core/rcu] rcu: " tip-bot for Josh Triplett
2009-08-17 18:24 ` [PATCH -tip/core/rcu 1/6] Cleanups and fixes for RCU in face of heavy CPU-hotplug stress Josh Triplett
2009-08-17 19:20 ` Paul E. McKenney
2009-08-18 15:26 ` Ingo Molnar
2009-08-18 20:07 ` Paul E. McKenney [this message]
2009-08-19 6:06 ` Paul E. McKenney
2009-08-19 11:59 ` Ingo Molnar
2009-08-19 12:09 ` [tip:core/rcu] rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation tip-bot for Paul E. McKenney
2009-08-19 15:24 ` [PATCH -tip/core/rcu 1/6] Cleanups and fixes for RCU in face of heavy CPU-hotplug stress Mathieu Desnoyers
2009-08-19 16:38 ` Paul E. McKenney
2009-08-19 18:10 ` Mathieu Desnoyers
2009-08-19 18:31 ` Paul E. McKenney
2009-08-20 14:03 ` Mathieu Desnoyers
2009-08-21 14:17 ` Ingo Molnar
2009-08-21 14:29 ` Steven Rostedt
2009-08-21 14:44 ` Ingo Molnar
2009-08-21 15:00 ` Mathieu Desnoyers
2009-08-21 15:37 ` Paul E. McKenney
2009-08-21 14:58 ` Mathieu Desnoyers