From: Manfred Spraul <manfred@colorfullife.com>
To: paulmck@linux.vnet.ibm.com
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>,
linux-kernel@vger.kernel.org, cl@linux-foundation.org,
mingo@elte.hu, akpm@linux-foundation.org, dipankar@in.ibm.com,
josht@linux.vnet.ibm.com, schamp@sgi.com, niv@us.ibm.com,
dvhltc@us.ibm.com, ego@in.ibm.com, rostedt@goodmis.org,
peterz@infradead.org
Subject: Re: [PATCH, RFC, tip/core/rcu] v3 scalable classic RCU implementation
Date: Sun, 31 Aug 2008 12:58:12 +0200 [thread overview]
Message-ID: <48BA7944.8070402@colorfullife.com> (raw)
In-Reply-To: <20080830143438.GF7107@linux.vnet.ibm.com>
Paul E. McKenney wrote:
>
>> Perhaps it's possible to rely on CPU_DYING, but I haven't figured out yet
>> how to handle read-side critical sections in CPU_DYING handlers.
>> Interrupts after CPU_DYING could be handled by rcu_irq_enter(),
>> rcu_irq_exit() [yes, they exist on x86: the arch code enables the local
>> interrupts in order to process the currently queued interrupts]
>>
>
> My feeling is that CPU online/offline will be quite rare, so it should
> be OK to clean up after the races in force_quiescent_state(), which in
> this version is called every three ticks in a given grace period.
>
If you add failing cpu offline calls, then the problem appears to be
unsolvable:
If I get it right, the offlining process looks like this:
* one cpu in the system makes the CPU_DOWN_PREPARE notifier call. These
calls can sleep (e.g. slab sleeps on semaphores). The cpu that goes
offline is still alive, still doing arbitrary work. cpu_quiet calls on
behalf of the cpu would be wrong.
* stop_machine: all cpus schedule to a special kernel thread [1], only
the dying cpu runs.
* The cpu that goes offline calls the CPU_DYING notifiers.
* __cpu_disable(): The cpu that goes offline check if it's possible to
offline the cpu. At least on i386, this can fail.
On success:
* at least on i386: the cpu that goes offline handles outstanding
interrupts. I'm not sure, perhaps even softirqs are handled.
* the cpus stopps handling interrupts.
* stop machine leaves, the remaining cpus continue their work.
* The CPU_DEAD notifiers are called. They can sleep.
On failure:
* all cpus continue their work. call_rcu, synchronize_rcu(), ...
* some time later: the CPU_DOWN_FAILED callbacks are called.
Is that description correct?
Then:
- treating a cpu as always quiet after the rcu notifer was called with
CPU_OFFLINE_PREPARE is wrong: the target cpu still runs normal code:
user space, kernel space, interrupts, whatever. The target cpu still
accepts interrupst, thus treating it as "normal" should work.
__cpu_disable() success:
- after CPU_DYING, a cpu is either in an interrupt or outside read-side
critical sections. Parallel synchronize_rcu() calls are impossible until
the cpu is dead. call_rcu() is probably possible.
- The CPU_DEAD notifiers are called. a synchronize_rcu() call before the
rcu notifier is called is possible.
__cpu_disable() failure:
- CPU_DYING is called, but the cpu remains fully alive. The system comes
fully alive again.
- some time later, CPU_DEAD is called.
With the current CPU_DYING callback, it's impossible to be both
deadlock-free and race-free with the given conditions. If
__cpu_disable() succeeds, then the cpu must be treated as gone and
always idle. If __cpu_disable() fails, then the cpu must be treated as
fully there. Doing both things at the same time is impossible. Waiting
until CPU_DOWN_FAILED or CPU_DEAD is called is impossible, too: Either
synchronize_rcu() in a CPU_DEAD notifier [called before the rcu
notifier] would deadlock or read-side critical sections on the
not-killed cpu would race.
What about moving the CPU_DYING notifier calls behind the
__cpu_disable() call?
Any other solutions?
Btw, as far as I can see, rcupreempt would deadlock if a CPU_DEAD
notifier uses synchronize_rcu().
Probably noone will ever succeed in triggering the deadlock:
- cpu goes offline.
- the other cpus in the system are restarted.
- one cpu does the CPU_DEAD notifier calls.
- before the rcu notifier is called with CPU_DEAD:
- one CPU_DEAD notifier sleeps.
- while CPU_DEAD is sleeping: on the same cpu: kmem_cache_destroy is
called. get_online_cpus immediately succeeds.
- kmem_cache_destroy acquires the cache_chain_mutex.
- kmem_cache_destroy does synchronize_rcu(), it sleeps.
- CPU_DEAD processing continues, the slab CPU_DEAD tries to acquire the
cache_chain_mutex. it sleeps, too.
--> deadlock, because the already dead cpu will never signal itself as
quiet. Thus synchronize_rcu() will never succeed, thus the slab CPU_DEAD
notifier will never return, thus rcu_offline_cpu() is never called.
--
Manfred
[1] open question: with rcu_preempt, is it possible that these cpus
could be inside read side critical sections?
next prev parent reply other threads:[~2008-08-31 10:58 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-21 23:43 [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation Paul E. McKenney
2008-08-22 4:37 ` Ingo Molnar
2008-08-22 13:47 ` Paul E. McKenney
2008-08-22 17:22 ` Paul E. McKenney
2008-08-22 18:16 ` Josh Triplett
2008-08-23 16:07 ` Ingo Molnar
2008-08-24 2:44 ` Paul E. McKenney
2008-08-22 23:29 ` Josh Triplett
2008-08-23 1:53 ` Paul E. McKenney
2008-08-25 22:02 ` Josh Triplett
2008-08-26 16:05 ` Paul E. McKenney
2008-08-27 0:38 ` Josh Triplett
2008-08-27 18:34 ` Paul E. McKenney
2008-08-27 20:23 ` Josh Triplett
2008-08-27 20:41 ` Paul E. McKenney
2008-08-25 10:34 ` Peter Zijlstra
2008-08-25 15:16 ` Paul E. McKenney
2008-08-25 15:26 ` Peter Zijlstra
2008-08-27 18:28 ` Paul E. McKenney
2008-08-24 8:08 ` Manfred Spraul
2008-08-24 16:32 ` Paul E. McKenney
2008-08-24 18:25 ` Manfred Spraul
2008-08-24 21:19 ` Paul E. McKenney
2008-08-25 0:07 ` [PATCH, RFC, tip/core/rcu] v2 " Paul E. McKenney
2008-08-30 0:49 ` [PATCH, RFC, tip/core/rcu] v3 " Paul E. McKenney
2008-08-30 9:33 ` Peter Zijlstra
2008-08-30 14:10 ` Paul E. McKenney
2008-08-30 15:40 ` Peter Zijlstra
2008-08-30 19:38 ` Paul E. McKenney
2008-09-02 13:26 ` Mathieu Desnoyers
2008-09-02 13:41 ` Peter Zijlstra
2008-09-02 14:55 ` Paul E. McKenney
2008-08-30 9:58 ` Lai Jiangshan
2008-08-30 13:32 ` Manfred Spraul
2008-08-30 14:34 ` Paul E. McKenney
2008-08-31 10:58 ` Manfred Spraul [this message]
2008-08-31 17:20 ` Paul E. McKenney
2008-08-31 17:45 ` Manfred Spraul
2008-08-31 17:55 ` Paul E. McKenney
2008-08-31 18:18 ` Manfred Spraul
2008-08-31 19:23 ` Paul E. McKenney
2008-08-30 14:29 ` Paul E. McKenney
2008-09-01 9:38 ` Andi Kleen
2008-09-02 1:05 ` Paul E. McKenney
2008-09-02 6:18 ` Andi Kleen
2008-09-05 15:29 ` [PATCH, RFC] v4 " Paul E. McKenney
2008-09-05 19:33 ` Andrew Morton
2008-09-05 23:04 ` Paul E. McKenney
2008-09-05 23:52 ` Andrew Morton
2008-09-06 4:16 ` Paul E. McKenney
2008-09-06 16:37 ` Manfred Spraul
2008-09-07 17:25 ` Paul E. McKenney
2008-09-07 10:18 ` [RFC, PATCH] Add a CPU_STARTING notifier (was: Re: [PATCH, RFC] v4 scalable classic RCU implementation) Manfred Spraul
2008-09-07 11:07 ` Andi Kleen
2008-09-07 19:46 ` Paul E. McKenney
2008-09-15 16:02 ` [PATCH, RFC] v4 scalable classic RCU implementation Paul E. McKenney
2008-09-16 16:52 ` Manfred Spraul
2008-09-16 17:30 ` Paul E. McKenney
2008-09-16 17:48 ` Manfred Spraul
2008-09-16 18:22 ` Paul E. McKenney
2008-09-21 11:09 ` Manfred Spraul
2008-09-21 21:14 ` Paul E. McKenney
2008-09-23 23:53 ` [PATCH, RFC] v6 " Paul E. McKenney
2008-09-25 7:26 ` Ingo Molnar
2008-09-25 14:05 ` Paul E. McKenney
2008-09-25 7:29 ` Ingo Molnar
2008-09-25 14:18 ` Paul E. McKenney
2008-10-10 16:09 ` [PATCH, RFC] v7 " Paul E. McKenney
2008-10-12 15:52 ` Manfred Spraul
2008-10-12 22:46 ` Paul E. McKenney
2008-10-13 18:03 ` Manfred Spraul
2008-10-15 1:11 ` Paul E. McKenney
2008-10-15 8:13 ` Manfred Spraul
2008-10-15 15:26 ` Paul E. McKenney
2008-10-22 18:41 ` Manfred Spraul
2008-10-22 21:02 ` Paul E. McKenney
2008-10-22 21:24 ` Manfred Spraul
2008-10-27 16:45 ` Paul E. McKenney
2008-10-27 19:48 ` Manfred Spraul
2008-10-27 23:52 ` Paul E. McKenney
2008-10-28 5:30 ` Manfred Spraul
2008-10-28 15:17 ` Paul E. McKenney
2008-10-28 17:21 ` Manfred Spraul
2008-10-28 17:35 ` Paul E. McKenney
2008-10-17 8:34 ` Gautham R Shenoy
2008-10-17 15:35 ` Gautham R Shenoy
2008-10-17 15:46 ` Paul E. McKenney
2008-10-17 15:43 ` Paul E. McKenney
2008-12-08 18:42 ` Paul E. McKenney
2008-11-02 20:10 ` Manfred Spraul
2008-11-03 20:33 ` Paul E. McKenney
2008-11-05 19:48 ` Manfred Spraul
2008-11-05 21:27 ` Paul E. McKenney
2008-11-15 23:20 ` [PATCH, RFC] v8 " Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48BA7944.8070402@colorfullife.com \
--to=manfred@colorfullife.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=dipankar@in.ibm.com \
--cc=dvhltc@us.ibm.com \
--cc=ego@in.ibm.com \
--cc=josht@linux.vnet.ibm.com \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=niv@us.ibm.com \
--cc=paulmck@linux.vnet.ibm.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=schamp@sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.