* [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
From: Paul E. McKenney @ 2015-03-03 17:42 UTC (permalink / raw)
To: linux-kernel
Cc: mingo, laijs, dipankar, akpm, mathieu.desnoyers, josh, tglx,
peterz, rostedt, dhowells, edumazet, dvhart, fweisbec, oleg,
bobby.prani, Paul E. McKenney, James Hogan, linux-metag
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
This commit replaces the open-coded CPU-offline notification with new
common code. This change avoids calling scheduler code using RCU from
an offline CPU that RCU is ignoring. This commit is compatible with
the existing code in not checking for timeout during a prior offline
for a given CPU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <linux-metag@vger.kernel.org>
---
arch/metag/kernel/smp.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index f006d2276f40..ac3a199e33e7 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -261,7 +261,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
}
#ifdef CONFIG_HOTPLUG_CPU
-static DECLARE_COMPLETION(cpu_killed);
/*
* __cpu_disable runs on the processor to be shutdown.
@@ -299,7 +298,7 @@ int __cpu_disable(void)
*/
void __cpu_die(unsigned int cpu)
{
- if (!wait_for_completion_timeout(&cpu_killed, msecs_to_jiffies(1)))
+ if (!cpu_wait_death(cpu, 1))
pr_err("CPU%u: unable to kill\n", cpu);
}
@@ -314,7 +313,7 @@ void cpu_die(void)
local_irq_disable();
idle_task_exit();
- complete(&cpu_killed);
+ (void)cpu_report_death();
asm ("XOR TXENABLE, D0Re0,D0Re0\n");
}
--
1.8.1.5
* Re: [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
From: James Hogan @ 2015-03-10 15:30 UTC (permalink / raw)
To: Paul E. McKenney, linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: mingo-DgEjT+Ai2ygdnm+yROfE0A, laijs-BthXqXjhjHXQFUHtdCDX3A,
dipankar-xthvdsQ13ZrQT0dZR+AlfA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w,
josh-iaAMLnmF4UmaiuxdJuQwMA, tglx-hfZtesqFncYOwBW4kG4KsQ,
peterz-wEGCiKHe2LqWVfeAwA7xHQ, rostedt-nx8X9YLhiw1AfugRpC6u6w,
dhowells-H+wXaHxf7aLQT0dZR+AlfA, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
dvhart-VuQAYsv1563Yd54FQh9/CA, fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
oleg-H+wXaHxf7aLQT0dZR+AlfA, bobby.prani-Re5JQEeQqe8AvxtiuMwx3w,
linux-metag-u79uwXL29TY76Z2rM5mHXA
Hi Paul,
On 03/03/15 17:42, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>
> This commit replaces the open-coded CPU-offline notification with new
> common code. This change avoids calling scheduler code using RCU from
> an offline CPU that RCU is ignoring. This commit is compatible with
> the existing code in not checking for timeout during a prior offline
> for a given CPU.
>
> Signed-off-by: Paul E. McKenney <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> Cc: James Hogan <james.hogan-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
> Cc: <linux-metag-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
I gave this a try via linux-next, but unfortunately it causes the
following warning every time a CPU goes down:
META213-Thread0 DSP [LogF] CPU1: unable to kill
If I add printks, I see that the state on entry to both cpu_wait_death
and cpu_report_death is already CPU_POST_DEAD, suggesting that it hasn't
changed from its initial value.
Should arches other than x86 now be calling cpu_set_state_online()? The
patchlet below seems to resolve it for Meta (not sure if that is the
best place in the startup sequence to do it, perhaps it doesn't matter).
diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
index ac3a199e33e7..430e379ec71f 100644
--- a/arch/metag/kernel/smp.c
+++ b/arch/metag/kernel/smp.c
@@ -383,6 +383,7 @@ asmlinkage void secondary_start_kernel(void)
* OK, now it's safe to let the boot CPU continue
*/
set_cpu_online(cpu, true);
+ cpu_set_state_online(cpu);
complete(&cpu_running);
/*
Looking at the comment before cpu_set_state_online:
> /*
> * Mark the specified CPU online.
> *
> * Note that it is permissible to omit this call entirely, as is
> * done in architectures that do no CPU-hotplug error checking.
> */
Which suggests it wasn't wrong to omit it before your patches came
along.
Cheers
James
> ---
> arch/metag/kernel/smp.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> index f006d2276f40..ac3a199e33e7 100644
> --- a/arch/metag/kernel/smp.c
> +++ b/arch/metag/kernel/smp.c
> @@ -261,7 +261,6 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> }
>
> #ifdef CONFIG_HOTPLUG_CPU
> -static DECLARE_COMPLETION(cpu_killed);
>
> /*
> * __cpu_disable runs on the processor to be shutdown.
> @@ -299,7 +298,7 @@ int __cpu_disable(void)
> */
> void __cpu_die(unsigned int cpu)
> {
> - if (!wait_for_completion_timeout(&cpu_killed, msecs_to_jiffies(1)))
> + if (!cpu_wait_death(cpu, 1))
> pr_err("CPU%u: unable to kill\n", cpu);
> }
>
> @@ -314,7 +313,7 @@ void cpu_die(void)
> local_irq_disable();
> idle_task_exit();
>
> - complete(&cpu_killed);
> + (void)cpu_report_death();
>
> asm ("XOR TXENABLE, D0Re0,D0Re0\n");
> }
>
* Re: [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
From: Paul E. McKenney @ 2015-03-10 16:59 UTC (permalink / raw)
To: James Hogan
Cc: linux-kernel, mingo, laijs, dipankar, akpm, mathieu.desnoyers,
josh, tglx, peterz, rostedt, dhowells, edumazet, dvhart, fweisbec,
oleg, bobby.prani, linux-metag
On Tue, Mar 10, 2015 at 03:30:42PM +0000, James Hogan wrote:
> Hi Paul,
>
> On 03/03/15 17:42, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> >
> > This commit replaces the open-coded CPU-offline notification with new
> > common code. This change avoids calling scheduler code using RCU from
> > an offline CPU that RCU is ignoring. This commit is compatible with
> > the existing code in not checking for timeout during a prior offline
> > for a given CPU.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: James Hogan <james.hogan@imgtec.com>
> > Cc: <linux-metag@vger.kernel.org>
>
> I gave this a try via linux-next, but unfortunately it causes the
> following warning every time a CPU goes down:
> META213-Thread0 DSP [LogF] CPU1: unable to kill
That is certainly not what I had in mind, thank you for finding this!
> If I add printks, I see that the state on entry to both cpu_wait_death
> and cpu_report_death is already CPU_POST_DEAD, suggesting that it hasn't
> changed from its initial value.
>
> Should arches other than x86 now be calling cpu_set_state_online()? The
> patchlet below seems to resolve it for Meta (not sure if that is the
> best place in the startup sequence to do it, perhaps it doesn't matter).
>
> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> index ac3a199e33e7..430e379ec71f 100644
> --- a/arch/metag/kernel/smp.c
> +++ b/arch/metag/kernel/smp.c
> @@ -383,6 +383,7 @@ asmlinkage void secondary_start_kernel(void)
> * OK, now it's safe to let the boot CPU continue
> */
> set_cpu_online(cpu, true);
> + cpu_set_state_online(cpu);
> complete(&cpu_running);
>
> /*
>
> Looking at the comment before cpu_set_state_online:
> > /*
> > * Mark the specified CPU online.
> > *
> > * Note that it is permissible to omit this call entirely, as is
> > * done in architectures that do no CPU-hotplug error checking.
> > */
>
> Which suggests it wasn't wrong to omit it before your patches came
> along.
And that suggestion is quite correct. The idea was indeed to accommodate
architectures that do not do error checking.
Does the following patch (on top of current -next) remove the need for
your addition of cpu_set_state_online() above?
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 18688e0b0422..80400e019c86 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -460,7 +460,7 @@ bool cpu_report_death(void)
do {
oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu));
- if (oldstate == CPU_ONLINE)
+ if (oldstate == CPU_ONLINE || CPU_POST_DEAD)
newstate = CPU_DEAD;
else
newstate = CPU_DEAD_FROZEN;
* Re: [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
From: James Hogan @ 2015-03-11 11:03 UTC (permalink / raw)
To: paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, mingo-DgEjT+Ai2ygdnm+yROfE0A,
laijs-BthXqXjhjHXQFUHtdCDX3A, dipankar-xthvdsQ13ZrQT0dZR+AlfA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w,
josh-iaAMLnmF4UmaiuxdJuQwMA, tglx-hfZtesqFncYOwBW4kG4KsQ,
peterz-wEGCiKHe2LqWVfeAwA7xHQ, rostedt-nx8X9YLhiw1AfugRpC6u6w,
dhowells-H+wXaHxf7aLQT0dZR+AlfA, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
dvhart-VuQAYsv1563Yd54FQh9/CA, fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
oleg-H+wXaHxf7aLQT0dZR+AlfA, bobby.prani-Re5JQEeQqe8AvxtiuMwx3w,
linux-metag-u79uwXL29TY76Z2rM5mHXA
On 10/03/15 16:59, Paul E. McKenney wrote:
> On Tue, Mar 10, 2015 at 03:30:42PM +0000, James Hogan wrote:
>> Hi Paul,
>>
>> On 03/03/15 17:42, Paul E. McKenney wrote:
>>> From: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>>>
>>> This commit replaces the open-coded CPU-offline notification with new
>>> common code. This change avoids calling scheduler code using RCU from
>>> an offline CPU that RCU is ignoring. This commit is compatible with
>>> the existing code in not checking for timeout during a prior offline
>>> for a given CPU.
>>>
>>> Signed-off-by: Paul E. McKenney <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>>> Cc: James Hogan <james.hogan-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
>>> Cc: <linux-metag-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
>>
>> I gave this a try via linux-next, but unfortunately it causes the
>> following warning every time a CPU goes down:
>> META213-Thread0 DSP [LogF] CPU1: unable to kill
>
> That is certainly not what I had in mind, thank you for finding this!
>
>> If I add printks, I see that the state on entry to both cpu_wait_death
>> and cpu_report_death is already CPU_POST_DEAD, suggesting that it hasn't
>> changed from its initial value.
>>
>> Should arches other than x86 now be calling cpu_set_state_online()? The
>> patchlet below seems to resolve it for Meta (not sure if that is the
>> best place in the startup sequence to do it, perhaps it doesn't matter).
>>
>> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
>> index ac3a199e33e7..430e379ec71f 100644
>> --- a/arch/metag/kernel/smp.c
>> +++ b/arch/metag/kernel/smp.c
>> @@ -383,6 +383,7 @@ asmlinkage void secondary_start_kernel(void)
>> * OK, now it's safe to let the boot CPU continue
>> */
>> set_cpu_online(cpu, true);
>> + cpu_set_state_online(cpu);
>> complete(&cpu_running);
>>
>> /*
>>
>> Looking at the comment before cpu_set_state_online:
>>> /*
>>> * Mark the specified CPU online.
>>> *
>>> * Note that it is permissible to omit this call entirely, as is
>>> * done in architectures that do no CPU-hotplug error checking.
>>> */
>>
>> Which suggests it wasn't wrong to omit it before your patches came
>> along.
>
> And that suggestion is quite correct. The idea was indeed to accommodate
> architectures that do not do error checking.
>
> Does the following patch (on top of current -next) remove the need for
> your addition of cpu_set_state_online() above?
Don't forget the "oldstate == ", otherwise it'll work for the wrong
reason :-/
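To spell out the "wrong reason": in C, `oldstate == CPU_ONLINE || CPU_POST_DEAD` parses as `(oldstate == CPU_ONLINE) || (CPU_POST_DEAD != 0)`, and since CPU_POST_DEAD is a nonzero constant the test passes for every state, CPU_BROKEN included. A minimal user-space sketch of the difference follows. The numeric values are placeholders chosen for illustration (only "nonzero" matters), and the two helper names are hypothetical, not kernel functions.

```c
#include <stdbool.h>

/* Placeholder values for illustration; only their being nonzero matters. */
#define CPU_ONLINE     0x0002
#define CPU_POST_DEAD  0x0009
#define CPU_BROKEN     0x000C

/* Hypothetical helper: the condition exactly as posted.
 * The right-hand operand is a nonzero constant, so this is always true. */
bool check_as_posted(int oldstate)
{
	return oldstate == CPU_ONLINE || CPU_POST_DEAD;
}

/* Hypothetical helper: the condition as intended, with oldstate
 * compared against each of the two acceptable states. */
bool check_as_intended(int oldstate)
{
	return oldstate == CPU_ONLINE || oldstate == CPU_POST_DEAD;
}
```

With the posted form, a CPU whose earlier offline timed out (CPU_BROKEN) would still be reported as CPU_DEAD, so the warning goes away without the timeout ever being noticed.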
Checking for CPU_POST_DEAD does seem to fix the immediate problem,
however this still leaves open the possibility of a single timeout
propagating to all further offlines after CPU_DEAD_FROZEN gets set. I've
confirmed that by adding a delay loop only on the second
cpu_report_death() call, and sure enough the 2nd and further offlines
all fail even though the CPU stops immediately after the 2nd one.
If this check is primarily so that CPU_DEAD_FROZEN is set if
cpu_wait_death timed out, would it be better to instead check explicitly
for CPU_BROKEN?
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 18688e0b0422..c697f73d82d6 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -460,7 +460,7 @@ bool cpu_report_death(void)
do {
oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu));
- if (oldstate == CPU_ONLINE)
+ if (oldstate != CPU_BROKEN)
newstate = CPU_DEAD;
else
newstate = CPU_DEAD_FROZEN;
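The propagation described above can be reproduced in a small single-threaded model. This is only a sketch under stated assumptions: the state values are placeholders, the `model_*` and `ok_*` names are hypothetical, the waiter's timeout is reduced to "the waiter looks at the state before the report arrives", and nothing resets the state between offlines (as on an architecture that never calls cpu_set_state_online()).

```c
#include <stdbool.h>

/* Placeholder state values, for illustration only. */
#define ST_ONLINE       2
#define ST_DEAD         7
#define ST_POST_DEAD    9
#define ST_BROKEN       12
#define ST_DEAD_FROZEN  23

/* Two candidate tests for "no earlier timeout is pending". */
bool ok_online_or_post_dead(int s) { return s == ST_ONLINE || s == ST_POST_DEAD; }
bool ok_not_broken(int s)          { return s != ST_BROKEN; }

/* Outgoing CPU: models the state update done by cpu_report_death(). */
void model_report(int *state, bool (*ok)(int))
{
	*state = ok(*state) ? ST_DEAD : ST_DEAD_FROZEN;
}

/* Surviving CPU: models the final state update done by cpu_wait_death().
 * Returns true if the outgoing CPU was seen to have died. */
bool model_wait(int *state)
{
	if (*state == ST_DEAD) {
		*state = ST_POST_DEAD;
		return true;
	}
	*state = ST_BROKEN;
	return false;
}

/* Offline the same CPU n times. On offline number `late` (1-based, 0 for
 * never) the report arrives only after the waiter has given up.
 * Returns how many offlines the waiter counted as successful. */
int model_offlines(int n, int late, bool (*ok)(int))
{
	int state = ST_POST_DEAD;	/* initial value seen with the printks */
	int good = 0;

	for (int i = 1; i <= n; i++) {
		if (i == late) {
			model_wait(&state);		/* waiter times out first */
			model_report(&state, ok);	/* report arrives late */
		} else {
			model_report(&state, ok);
			good += model_wait(&state);
		}
	}
	return good;
}
```

In this model, `model_offlines(4, 2, ok_online_or_post_dead)` counts only the first offline as successful, because the state never leaves the CPU_DEAD_FROZEN/CPU_BROKEN pair after the late report, whereas `model_offlines(4, 2, ok_not_broken)` loses only the delayed one.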
Cheers
James
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index 18688e0b0422..80400e019c86 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -460,7 +460,7 @@ bool cpu_report_death(void)
>
> do {
> oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu));
> - if (oldstate == CPU_ONLINE)
> + if (oldstate == CPU_ONLINE || CPU_POST_DEAD)
> newstate = CPU_DEAD;
> else
> newstate = CPU_DEAD_FROZEN;
>
* Re: [PATCH tip/core/rcu 04/20] metag: Use common outgoing-CPU-notification code
From: Paul E. McKenney @ 2015-03-11 18:58 UTC (permalink / raw)
To: James Hogan
Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA, mingo-DgEjT+Ai2ygdnm+yROfE0A,
laijs-BthXqXjhjHXQFUHtdCDX3A, dipankar-xthvdsQ13ZrQT0dZR+AlfA,
akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
mathieu.desnoyers-vg+e7yoeK/dWk0Htik3J/w,
josh-iaAMLnmF4UmaiuxdJuQwMA, tglx-hfZtesqFncYOwBW4kG4KsQ,
peterz-wEGCiKHe2LqWVfeAwA7xHQ, rostedt-nx8X9YLhiw1AfugRpC6u6w,
dhowells-H+wXaHxf7aLQT0dZR+AlfA, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
dvhart-VuQAYsv1563Yd54FQh9/CA, fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
oleg-H+wXaHxf7aLQT0dZR+AlfA, bobby.prani-Re5JQEeQqe8AvxtiuMwx3w,
linux-metag-u79uwXL29TY76Z2rM5mHXA
On Wed, Mar 11, 2015 at 11:03:18AM +0000, James Hogan wrote:
> On 10/03/15 16:59, Paul E. McKenney wrote:
> > On Tue, Mar 10, 2015 at 03:30:42PM +0000, James Hogan wrote:
> >> Hi Paul,
> >>
> >> On 03/03/15 17:42, Paul E. McKenney wrote:
> >>> From: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> >>>
> >>> This commit replaces the open-coded CPU-offline notification with new
> >>> common code. This change avoids calling scheduler code using RCU from
> >>> an offline CPU that RCU is ignoring. This commit is compatible with
> >>> the existing code in not checking for timeout during a prior offline
> >>> for a given CPU.
> >>>
> >>> Signed-off-by: Paul E. McKenney <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> >>> Cc: James Hogan <james.hogan-1AXoQHu6uovQT0dZR+AlfA@public.gmane.org>
> >>> Cc: <linux-metag-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> >>
> >> I gave this a try via linux-next, but unfortunately it causes the
> >> following warning every time a CPU goes down:
> >> META213-Thread0 DSP [LogF] CPU1: unable to kill
> >
> > That is certainly not what I had in mind, thank you for finding this!
> >
> >> If I add printks, I see that the state on entry to both cpu_wait_death
> >> and cpu_report_death is already CPU_POST_DEAD, suggesting that it hasn't
> >> changed from its initial value.
> >>
> >> Should arches other than x86 now be calling cpu_set_state_online()? The
> >> patchlet below seems to resolve it for Meta (not sure if that is the
> >> best place in the startup sequence to do it, perhaps it doesn't matter).
> >>
> >> diff --git a/arch/metag/kernel/smp.c b/arch/metag/kernel/smp.c
> >> index ac3a199e33e7..430e379ec71f 100644
> >> --- a/arch/metag/kernel/smp.c
> >> +++ b/arch/metag/kernel/smp.c
> >> @@ -383,6 +383,7 @@ asmlinkage void secondary_start_kernel(void)
> >> * OK, now it's safe to let the boot CPU continue
> >> */
> >> set_cpu_online(cpu, true);
> >> + cpu_set_state_online(cpu);
> >> complete(&cpu_running);
> >>
> >> /*
> >>
> >> Looking at the comment before cpu_set_state_online:
> >>> /*
> >>> * Mark the specified CPU online.
> >>> *
> >>> * Note that it is permissible to omit this call entirely, as is
> >>> * done in architectures that do no CPU-hotplug error checking.
> >>> */
> >>
> >> Which suggests it wasn't wrong to omit it before your patches came
> >> along.
> >
> > And that suggestion is quite correct. The idea was indeed to accommodate
> > architectures that do not do error checking.
> >
> > Does the following patch (on top of current -next) remove the need for
> > your addition of cpu_set_state_online() above?
>
> Don't forget the "oldstate == ", otherwise it'll work for the wrong
> reason :-/
I clearly wasn't doing well yesterday, was I? :-/
> Checking for CPU_POST_DEAD does seem to fix the immediate problem,
> however this still leaves open the possibility of a single timeout
> propagating to all further offlines after CPU_DEAD_FROZEN gets set. I've
> confirmed that by adding a delay loop only on the second
> cpu_report_death() call, and sure enough the 2nd and further offlines
> all fail even though the CPU stops immediately after the 2nd one.
>
> If this check is primarily so that CPU_DEAD_FROZEN is set if
> cpu_wait_death timed out, would it be better to instead check explicitly
> for CPU_BROKEN?
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index 18688e0b0422..c697f73d82d6 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -460,7 +460,7 @@ bool cpu_report_death(void)
>
> do {
> oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu));
> - if (oldstate == CPU_ONLINE)
> + if (oldstate != CPU_BROKEN)
> newstate = CPU_DEAD;
> else
> newstate = CPU_DEAD_FROZEN;
This does look much better! I will incorporate this with attribution.
The idea is to support two use cases. The first use case provides full
checking, and the second provides minimal checking.
Full checking is used by architectures that require that one of the
surviving CPUs do something to help the offlined CPU go offline,
Xen being one example. In this case, the architecture invokes
cpu_check_up_prepare(), which returns an error code if the CPU did not
go offline properly. The architecture can choose to return an error or
to provide the offlining help at that point. The CPU being onlined then
calls cpu_set_state_online(). When the CPU goes offline, it invokes
cpu_report_death(), which can race with the timing out of one of the
surviving CPUs invoking cpu_wait_death(). If cpu_wait_death() times
out first, or if cpu_report_death() is never called, state is set so
that the next call to cpu_check_up_prepare() can react accordingly.
Minimal checking is what metag does. The cpu_check_up_prepare() and
cpu_set_state_online() functions are never called, just cpu_report_death()
and cpu_wait_death().
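The two use cases can be sketched as a user-space model built on C11 atomics. It is a simplification under stated assumptions, not the code in kernel/smpboot.c: the state values are placeholders, every `model_*` function is a hypothetical stand-in for the kernel function of similar name, the waiter does no actual sleeping (only its final state update is modeled), and the error return is collapsed to -1.

```c
#include <stdbool.h>
#include <stdatomic.h>

/* Placeholder state values, for illustration only. */
#define ST_ONLINE       2
#define ST_UP_PREPARE   3
#define ST_DEAD         7
#define ST_POST_DEAD    9
#define ST_BROKEN       12
#define ST_DEAD_FROZEN  23

/* Full checking only: refuse the online if the previous offline of this
 * CPU did not complete cleanly. Returns 0 on success, -1 otherwise. */
int model_check_up_prepare(atomic_int *state)
{
	if (atomic_load(state) != ST_POST_DEAD)
		return -1;
	atomic_store(state, ST_UP_PREPARE);
	return 0;
}

/* Full checking only: called by the CPU that has just come online. */
void model_set_state_online(atomic_int *state)
{
	atomic_store(state, ST_ONLINE);
}

/* Outgoing CPU, both use cases. The compare-and-swap loop handles the
 * race with a waiter that is marking the CPU broken at the same time. */
bool model_report_death(atomic_int *state)
{
	int old = atomic_load(state);
	int next;

	do {
		next = (old != ST_BROKEN) ? ST_DEAD : ST_DEAD_FROZEN;
	} while (!atomic_compare_exchange_weak(state, &old, next));
	return next == ST_DEAD;
}

/* Surviving CPU, both use cases: the state update performed once the
 * wait is over. Returns false if the outgoing CPU has not reported in. */
bool model_wait_death(atomic_int *state)
{
	int old = atomic_load(state);

	for (;;) {
		if (old == ST_DEAD) {
			atomic_store(state, ST_POST_DEAD);
			return true;
		}
		/* On failure, old is refreshed and the check is repeated. */
		if (atomic_compare_exchange_weak(state, &old, ST_BROKEN))
			return false;
	}
}
```

A full-checking architecture would call all four in order for each online/offline cycle, so that a timeout shows up as an error from the next `model_check_up_prepare()`. A minimal-checking architecture such as metag would call only `model_report_death()` and `model_wait_death()`, starting from the initial CPU_POST_DEAD state.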
And yes, this time I drew state diagrams. Which I should have done in
the first place.
Thanx, Paul
> Cheers
> James
>
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> > index 18688e0b0422..80400e019c86 100644
> > --- a/kernel/smpboot.c
> > +++ b/kernel/smpboot.c
> > @@ -460,7 +460,7 @@ bool cpu_report_death(void)
> >
> > do {
> > oldstate = atomic_read(&per_cpu(cpu_hotplug_state, cpu));
> > - if (oldstate == CPU_ONLINE)
> > + if (oldstate == CPU_ONLINE || CPU_POST_DEAD)
> > newstate = CPU_DEAD;
> > else
> > newstate = CPU_DEAD_FROZEN;
> >
>