* [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
@ 2008-08-31 17:58 Manfred Spraul
2008-08-31 19:17 ` Paul E. McKenney
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Manfred Spraul @ 2008-08-31 17:58 UTC (permalink / raw)
To: linux-kernel; +Cc: paulmck, Ingo Molnar, akpm
When a cpu is taken offline, the CPU_DYING notifiers are called on the
dying cpu. According to <linux/notifier.h>, the cpu should be "not
running any task, not handling interrupts, soon dead".

For the current implementation, this is not true:
- __cpu_disable() can fail. If it fails, then the cpu will remain alive
  and happy.
- At least on x86, __cpu_disable() briefly enables the local interrupts
  to handle any outstanding interrupts.

What about moving CPU_DYING down a few lines, behind the __cpu_disable()
line?
There are only two CPU_DYING handlers in the kernel right now: one in
kvm, one in the scheduler. Both should work with the patch applied
[and: I'm not sure if either one handles a failing __cpu_disable()].

The patch survives simply offlining a cpu. kvm is untested due to lack
of a test setup.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
---
kernel/cpu.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e202a68..5b7c88f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -199,13 +199,14 @@ static int __ref take_cpu_down(void *_param)
 	struct take_cpu_down_param *param = _param;
 	int err;
 
-	raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
-				param->hcpu);
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();
 	if (err < 0)
 		return err;
 
+	raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
+				param->hcpu);
+
 	/* Force idle task to run as soon as we yield: it should
 	   immediately notice cpu is offline and die quickly. */
 	sched_idle_next();
--
1.5.5.1
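
[Editor's sketch, not part of the patch] The effect of the reordering can be
modelled in user space. Everything below is hypothetical: struct model_cpu and
the model_* functions are stand-ins invented for illustration, not the
kernel's __cpu_disable() or notifier chain. The sketch only shows that with
the old ordering a failing disable leaves a CPU that is still online but has
already been told it is dying, while the patched ordering never notifies in
that case.

```c
#include <assert.h>

/* User-space model only; these are not the kernel's functions. */
struct model_cpu {
	int online;         /* 1 while the CPU can still take interrupts */
	int dying_notified; /* 1 once the modelled CPU_DYING handlers ran */
	int disable_fails;  /* 1 to make the modelled __cpu_disable() fail */
};

/* Stand-in for __cpu_disable(): may fail and leave the CPU alive. */
static int model_cpu_disable(struct model_cpu *cpu)
{
	if (cpu->disable_fails)
		return -1;
	cpu->online = 0;
	return 0;
}

/* Stand-in for running the CPU_DYING notifier chain. */
static void model_notify_dying(struct model_cpu *cpu)
{
	cpu->dying_notified = 1;
}

/* Old ordering: notify first, then try to disable the CPU. */
int model_take_cpu_down_old(struct model_cpu *cpu)
{
	model_notify_dying(cpu);
	return model_cpu_disable(cpu);
}

/* Patched ordering: notify only after the disable has succeeded. */
int model_take_cpu_down_new(struct model_cpu *cpu)
{
	int err = model_cpu_disable(cpu);

	if (err < 0)
		return err;
	model_notify_dying(cpu);
	return 0;
}
```

Under these assumptions, a handler run by model_take_cpu_down_new() can rely
on the CPU already being offline, which is the guarantee the header comment
describes.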
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-08-31 17:58 [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers Manfred Spraul
@ 2008-08-31 19:17 ` Paul E. McKenney
2008-08-31 19:23 ` Paul E. McKenney
2008-09-06 16:49 ` Ingo Molnar
2008-09-13 6:36 ` Avi Kivity
2 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2008-08-31 19:17 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, Ingo Molnar, akpm
On Sun, Aug 31, 2008 at 07:58:49PM +0200, Manfred Spraul wrote:
> When a cpu is taken offline, the CPU_DYING notifiers are called on the
> dying cpu. According to <linux/notifier.h>, the cpu should be "not
> running any task, not handling interrupts, soon dead".
>
> For the current implementation, this is not true:
> - __cpu_disable can fail. If it fails, then the cpu will remain alive
> and happy.
> - At least on x86, __cpu_disable() briefly enables the local interrupts
> to handle any outstanding interrupts.
>
> What about moving CPU_DYING down a few lines, behind the __cpu_disable()
> line?
> There are only two CPU_DYING handlers in the kernel right now: one in
> kvm, one in the scheduler. Both should work with the patch applied
> [and: I'm not sure if either one handles a failing __cpu_disable()]
>
> The patch survives simple offlining a cpu. kvm untested due to lack
> of a test setup.
Several architectures re-enable interrupts in __cpu_disable() or in
functions called from __cpu_disable(), which happens after CPU_DYING,
if I understand correctly. :-(
Thanx, Paul
> Signed-Off-By: Manfred Spraul <manfred@colorfullife.com>
> ---
> kernel/cpu.c | 5 +++--
> 1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index e202a68..5b7c88f 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -199,13 +199,14 @@ static int __ref take_cpu_down(void *_param)
> struct take_cpu_down_param *param = _param;
> int err;
>
> - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> - param->hcpu);
> /* Ensure this CPU doesn't handle any more interrupts. */
> err = __cpu_disable();
> if (err < 0)
> return err;
>
> + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> + param->hcpu);
> +
> /* Force idle task to run as soon as we yield: it should
> immediately notice cpu is offline and die quickly. */
> sched_idle_next();
> --
> 1.5.5.1
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-08-31 19:17 ` Paul E. McKenney
@ 2008-08-31 19:23 ` Paul E. McKenney
0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2008-08-31 19:23 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, Ingo Molnar, akpm
On Sun, Aug 31, 2008 at 12:17:21PM -0700, Paul E. McKenney wrote:
> On Sun, Aug 31, 2008 at 07:58:49PM +0200, Manfred Spraul wrote:
> > When a cpu is taken offline, the CPU_DYING notifiers are called on the
> > dying cpu. According to <linux/notifier.h>, the cpu should be "not
> > running any task, not handling interrupts, soon dead".
> >
> > For the current implementation, this is not true:
> > - __cpu_disable can fail. If it fails, then the cpu will remain alive
> > and happy.
> > - At least on x86, __cpu_disable() briefly enables the local interrupts
> > to handle any outstanding interrupts.
> >
> > What about moving CPU_DYING down a few lines, behind the __cpu_disable()
> > line?
> > There are only two CPU_DYING handlers in the kernel right now: one in
> > kvm, one in the scheduler. Both should work with the patch applied
> > [and: I'm not sure if either one handles a failing __cpu_disable()]
> >
> > The patch survives simple offlining a cpu. kvm untested due to lack
> > of a test setup.
>
> Several architectures re-enable interrupts in __cpu_disable() or in
> functions called from __cpu_disable(), which happens after CPU_DYING,
> if I understand correctly. :-(
Never mind -- you are moving CPU_DYING after __cpu_disable(). :-/
Thanx, Paul
> > Signed-Off-By: Manfred Spraul <manfred@colorfullife.com>
> > ---
> > kernel/cpu.c | 5 +++--
> > 1 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index e202a68..5b7c88f 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -199,13 +199,14 @@ static int __ref take_cpu_down(void *_param)
> > struct take_cpu_down_param *param = _param;
> > int err;
> >
> > - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> > - param->hcpu);
> > /* Ensure this CPU doesn't handle any more interrupts. */
> > err = __cpu_disable();
> > if (err < 0)
> > return err;
> >
> > + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> > + param->hcpu);
> > +
> > /* Force idle task to run as soon as we yield: it should
> > immediately notice cpu is offline and die quickly. */
> > sched_idle_next();
> > --
> > 1.5.5.1
> >
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-08-31 17:58 [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers Manfred Spraul
2008-08-31 19:17 ` Paul E. McKenney
@ 2008-09-06 16:49 ` Ingo Molnar
2008-09-06 17:08 ` Manfred Spraul
2008-09-13 6:36 ` Avi Kivity
2 siblings, 1 reply; 7+ messages in thread
From: Ingo Molnar @ 2008-09-06 16:49 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, paulmck, akpm
* Manfred Spraul <manfred@colorfullife.com> wrote:
> - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> - param->hcpu);
> /* Ensure this CPU doesn't handle any more interrupts. */
> err = __cpu_disable();
> if (err < 0)
> return err;
>
> + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
> + param->hcpu);
Hm, doesn't this break things like CPU cross-calls done in CPU_DYING
callbacks?
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-09-06 16:49 ` Ingo Molnar
@ 2008-09-06 17:08 ` Manfred Spraul
2008-09-06 17:13 ` Ingo Molnar
0 siblings, 1 reply; 7+ messages in thread
From: Manfred Spraul @ 2008-09-06 17:08 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, paulmck, akpm
Ingo Molnar wrote:
> * Manfred Spraul <manfred@colorfullife.com> wrote:
>
>> - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
>> - param->hcpu);
>> /* Ensure this CPU doesn't handle any more interrupts. */
>> err = __cpu_disable();
>> if (err < 0)
>> return err;
>>
>> + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
>> + param->hcpu);
>>
>
> Hm, doesn't this break things like CPU cross-calls done in CPU_DYING
> callbacks?
>
We are within stop_machine(). No other cpu is running. As far as I can
see, no cross-calls are possible.
Which scenario are you thinking of?
--
Manfred
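
[Editor's sketch, not part of the thread] The stop_machine() argument above
can be pictured with a small user-space model. All names here
(MODEL_NR_CPUS, struct model_machine, model_stop_machine(),
model_cross_call()) are hypothetical and exist only for this illustration.
The assumption being modelled is the one stated in the message: while
stop_machine() is in effect, only the CPU running the callback makes
progress, so a cross-call aimed at any other CPU would not be answered.

```c
#include <assert.h>

#define MODEL_NR_CPUS 4

/* User-space model only; not the kernel's data structures. */
struct model_machine {
	int running[MODEL_NR_CPUS]; /* 1 if the CPU can service a cross-call */
};

/* Model of stop_machine(): every CPU except `self` stops servicing work. */
void model_stop_machine(struct model_machine *m, int self)
{
	int i;

	for (i = 0; i < MODEL_NR_CPUS; i++)
		m->running[i] = (i == self);
}

/*
 * Model of a synchronous cross-call: returns 0 if the target would
 * answer, -1 if the target is invalid or held and would never respond.
 */
int model_cross_call(const struct model_machine *m, int target)
{
	if (target < 0 || target >= MODEL_NR_CPUS)
		return -1;
	return m->running[target] ? 0 : -1;
}
```

In this model a cross-call issued before model_stop_machine() succeeds, while
one issued afterwards to any CPU other than the caller does not, which is why
a CPU_DYING handler has no use for cross-CPU calls.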
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-09-06 17:08 ` Manfred Spraul
@ 2008-09-06 17:13 ` Ingo Molnar
0 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2008-09-06 17:13 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, paulmck, akpm
* Manfred Spraul <manfred@colorfullife.com> wrote:
> Ingo Molnar wrote:
>> * Manfred Spraul <manfred@colorfullife.com> wrote:
>>
>>> - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
>>> - param->hcpu);
>>> /* Ensure this CPU doesn't handle any more interrupts. */
>>> err = __cpu_disable();
>>> if (err < 0)
>>> return err;
>>> + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod,
>>> + param->hcpu);
>>>
>>
>> Hm, doesn't this break things like CPU cross-calls done in CPU_DYING
>> callbacks?
>>
>
> We are within stop_machine(). No other cpu is running. As far as I can
> see, no cross-calls are possible.
Ah, ok - my bad. I was confusing it with the much more common
CPU_DOWN_PREPARE type of callbacks, which do use various cross-CPU APIs.
Applied to tip/sched/devel, thanks Manfred!
Ingo
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers
2008-08-31 17:58 [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers Manfred Spraul
2008-08-31 19:17 ` Paul E. McKenney
2008-09-06 16:49 ` Ingo Molnar
@ 2008-09-13 6:36 ` Avi Kivity
2 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2008-09-13 6:36 UTC (permalink / raw)
To: Manfred Spraul; +Cc: linux-kernel, paulmck, Ingo Molnar, akpm
Manfred Spraul wrote:
> When a cpu is taken offline, the CPU_DYING notifiers are called on the
> dying cpu. According to <linux/notifier.h>, the cpu should be "not
> running any task, not handling interrupts, soon dead".
>
> For the current implementation, this is not true:
> - __cpu_disable can fail. If it fails, then the cpu will remain alive
> and happy.
> - At least on x86, __cpu_disable() briefly enables the local interrupts
> to handle any outstanding interrupts.
>
> What about moving CPU_DYING down a few lines, behind the __cpu_disable()
> line?
> There are only two CPU_DYING handlers in the kernel right now: one in
> kvm, one in the scheduler. Both should work with the patch applied
> [and: I'm not sure if either one handles a failing __cpu_disable()]
>
> The patch survives simple offlining a cpu. kvm untested due to lack
> of a test setup.
>
>
kvm should work with this patch.
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
^ permalink raw reply [flat|nested] 7+ messages in thread