PATCH: Fix race in cpu_down (hotplug cpu)

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* PATCH: Fix race in cpu_down (hotplug cpu)
@ 2005-09-19  3:15 Nigel Cunningham
  0 siblings, 0 replies; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  3:15 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, Zwane Mwaikambo; +Cc: Linux Kernel Mailing List

Hi.

There is a race condition in taking down a cpu (kernel/cpu.c::cpu_down).
A cpu can already be idling when we clear its online flag, and we do not
force the idle task to reschedule. This results in __cpu_die timing out.
A simple fix is to force the idle task on the cpu going to reschedule.

Without the patch below, Suspend2 get into a deadlock at resume time
when this issue occurs. I could not complete 20 cycles without seeing
the issue. With the patch below, I have completed 75 cycles on the trot
without problems.

Please apply.

Signed-off-by: Nigel Cunningham <ncunningham@cyclades.com>

diff -ruNp 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 9910-hotplug-cpu-race.patch-new/kernel/cpu.c
--- 9910-hotplug-cpu-race.patch-old/kernel/cpu.c	2005-08-29 10:29:58.000000000 +1000
+++ 9910-hotplug-cpu-race.patch-new/kernel/cpu.c	2005-09-19 12:15:08.000000000 +1000
@@ -126,6 +126,9 @@ int cpu_down(unsigned int cpu)
 	while (!idle_cpu(cpu))
 		yield();

+	/* CPU may have idled before we set its offline flag. */
+	set_tsk_need_resched(idle_task(cpu));
+
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* PATCH: Fix race in cpu_down (hotplug cpu)
@ 2005-09-19  3:28 Nigel Cunningham
  2005-09-19  4:22 ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  3:28 UTC (permalink / raw)
  To: Andrew Morton, Linus Torvalds, Zwane Mwaikambo; +Cc: Linux Kernel Mailing List

Hmm... managed to miss a word at the end of the first para and thus not
make sense. Let's try again.

----------

Hi.

There is a race condition in taking down a cpu (kernel/cpu.c::cpu_down).
A cpu can already be idling when we clear its online flag, and we do not
force the idle task to reschedule. This results in __cpu_die timing out.
A simple fix is to force the idle task on the cpu going down to reschedule.

Without the patch below, Suspend2 get into a deadlock at resume time
when this issue occurs. I could not complete 20 cycles without seeing
the issue. With the patch below, I have completed 75 cycles on the trot
without problems.

Please apply.

Signed-off-by: Nigel Cunningham <ncunningham@cyclades.com>

diff -ruNp 9910-hotplug-cpu-race.patch-old/kernel/cpu.c 9910-hotplug-cpu-race.patch-new/kernel/cpu.c
--- 9910-hotplug-cpu-race.patch-old/kernel/cpu.c	2005-08-29 10:29:58.000000000 +1000
+++ 9910-hotplug-cpu-race.patch-new/kernel/cpu.c	2005-09-19 12:15:08.000000000 +1000
@@ -126,6 +126,9 @@ int cpu_down(unsigned int cpu)
 	while (!idle_cpu(cpu))
 		yield();

+	/* CPU may have idled before we set its offline flag. */
+	set_tsk_need_resched(idle_task(cpu));
+
 	/* This actually kills the CPU. */
 	__cpu_die(cpu);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  3:28 Nigel Cunningham
@ 2005-09-19  4:22 ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  4:22 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell

On Mon, Sep 19, 2005 at 01:28:38PM +1000, Nigel Cunningham wrote:
> There is a race condition in taking down a cpu (kernel/cpu.c::cpu_down).
> A cpu can already be idling when we clear its online flag, and we do not
> force the idle task to reschedule. This results in __cpu_die timing out.

"when we clear its online flag" - This happens in take_cpu_down in the
context of stopmachine thread. take_cpu_down also ensures that idle 
thread runs when it returns (sched_idle_next). So when idle thread runs,
it should notice that it is offline and invoke play_dead.  So I don't 
understand why __cpu_die should time out.


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: PATCH: Fix race in cpu_down (hotplug cpu)
@ 2005-09-19  4:48 Li, Shaohua
  2005-09-19  5:10 ` Srivatsa Vaddagiri
  2005-09-19  5:23 ` Nigel Cunningham
  0 siblings, 2 replies; 26+ messages in thread
From: Li, Shaohua @ 2005-09-19  4:48 UTC (permalink / raw)
  To: vatsa, Nigel Cunningham
  Cc: Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell

Hi,
>
>On Mon, Sep 19, 2005 at 01:28:38PM +1000, Nigel Cunningham wrote:
>> There is a race condition in taking down a cpu
(kernel/cpu.c::cpu_down).
>> A cpu can already be idling when we clear its online flag, and we do
not
>> force the idle task to reschedule. This results in __cpu_die timing
out.
>
>"when we clear its online flag" - This happens in take_cpu_down in the
>context of stopmachine thread. take_cpu_down also ensures that idle
>thread runs when it returns (sched_idle_next). So when idle thread
runs,
>it should notice that it is offline and invoke play_dead.  So I don't
>understand why __cpu_die should time out.
I guess Nigel's point is cpu_idle is preempted before take_cpu_down. If
the preempt occurs after the cpu_is_offline check, when the cpu (after
sched_idle_next) goes into idle again, nobody can wake it up. Nigel,
isn't it?

Thanks,
Shaohua

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  4:48 Li, Shaohua
@ 2005-09-19  5:10 ` Srivatsa Vaddagiri
  2005-09-19  5:31   ` Shaohua Li
  2005-09-19  5:23 ` Nigel Cunningham
  1 sibling, 1 reply; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  5:10 UTC (permalink / raw)
  To: Li, Shaohua
  Cc: Nigel Cunningham, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 12:48:38PM +0800, Li, Shaohua wrote:
> I guess Nigel's point is cpu_idle is preempted before take_cpu_down. If
> the preempt occurs after the cpu_is_offline check, when the cpu (after

How can that happen? Idle task is running at max priority (MAX_RT_PRIO-1)
and with SCHED_FIFO policy at this point. If that is indeed happening,
then we need to modify sched_idle_next not to allow that.

> sched_idle_next) goes into idle again, nobody can wake it up. Nigel,
> isn't it?


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  5:10 ` Srivatsa Vaddagiri
@ 2005-09-19  5:31   ` Shaohua Li
  2005-09-19  5:57     ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 26+ messages in thread
From: Shaohua Li @ 2005-09-19  5:31 UTC (permalink / raw)
  To: vatsa
  Cc: Nigel Cunningham, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 10:40 +0530, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 12:48:38PM +0800, Li, Shaohua wrote:
> > I guess Nigel's point is cpu_idle is preempted before take_cpu_down. If
> > the preempt occurs after the cpu_is_offline check, when the cpu (after
> 
> How can that happen? Idle task is running at max priority (MAX_RT_PRIO-1)
> and with SCHED_FIFO policy at this point. If that is indeed happening,
> then we need to modify sched_idle_next not to allow that.
A CPU is idle and then is preempted and starting offline CPU. After
calling stop_machine_run, the CPU goes into idle and it will resume last
idle loop. If the CPU is broken at specific point, then the CPU will
continue executing previous idle and have no chance to call play_dead.
Am I missing anything? Nigel's patch seems can fix the situation for
mwait_idle and poll_idle but can't fix for default_idle in i386 to me.

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  5:31   ` Shaohua Li
@ 2005-09-19  5:57     ` Srivatsa Vaddagiri
  2005-09-19  6:11       ` Nigel Cunningham
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  5:57 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Nigel Cunningham, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 01:31:27PM +0800, Shaohua Li wrote:
> A CPU is idle and then is preempted and starting offline CPU. After
> calling stop_machine_run, the CPU goes into idle and it will resume last
> idle loop. If the CPU is broken at specific point, then the CPU will
> continue executing previous idle and have no chance to call play_dead.

Ok, that makes sense. Nigel, could you confirm which idle routine you are 
using?

> Am I missing anything? Nigel's patch seems can fix the situation for
> mwait_idle and poll_idle but can't fix for default_idle in i386 to me.

I would say the right fix here is for poll_idle and mwait_idle (& similar
other idle routines) to monitor 'cpu_offline' flag in addition to need_resched 
flag, rather than what Nigel has suggested. 

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  5:57     ` Srivatsa Vaddagiri
@ 2005-09-19  6:11       ` Nigel Cunningham
  2005-09-19  6:23         ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  6:11 UTC (permalink / raw)
  To: vatsa
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

Hi.

On Mon, 2005-09-19 at 15:57, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 01:31:27PM +0800, Shaohua Li wrote:
> > A CPU is idle and then is preempted and starting offline CPU. After
> > calling stop_machine_run, the CPU goes into idle and it will resume last
> > idle loop. If the CPU is broken at specific point, then the CPU will
> > continue executing previous idle and have no chance to call play_dead.
> 
> Ok, that makes sense. Nigel, could you confirm which idle routine you are 
> using?

>From dmesg:

monitor/mwait feature present.
using mwait in idle threads.

> > Am I missing anything? Nigel's patch seems can fix the situation for
> > mwait_idle and poll_idle but can't fix for default_idle in i386 to me.
> 
> I would say the right fix here is for poll_idle and mwait_idle (& similar
> other idle routines) to monitor 'cpu_offline' flag in addition to need_resched 
> flag, rather than what Nigel has suggested. 

Ok, but what about default_idle?

Regards,

Nigel


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:11       ` Nigel Cunningham
@ 2005-09-19  6:23         ` Srivatsa Vaddagiri
  2005-09-19  6:29           ` Nick Piggin
                             ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  6:23 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 04:11:11PM +1000, Nigel Cunningham wrote:
> > Ok, that makes sense. Nigel, could you confirm which idle routine you are 
> > using?
> 
> >From dmesg:
> 
> monitor/mwait feature present.
> using mwait in idle threads.

Ok, that may explain why __cpu_die is timing out for you! Can you try a 
(untested, I am afraid) patch on these lines: 

--- process.c.org	2005-09-19 11:44:57.000000000 +0530
+++ process.c	2005-09-19 11:48:28.000000000 +0530
@@ -245,16 +245,18 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
  */
 static void mwait_idle(void)
 {
+	int cpu = raw_smp_processor_id();
+
 	local_irq_enable();
 
 	if (!need_resched()) {
 		set_thread_flag(TIF_POLLING_NRFLAG);
 		do {
 			__monitor((void *)&current_thread_info()->flags, 0, 0);
-			if (need_resched())
+			if (need_resched() || cpu_is_offline(cpu))
 				break;
 			__mwait(0, 0);
-		} while (!need_resched());
+		} while (!need_resched() || !cpu_is_offline(cpu));
 		clear_thread_flag(TIF_POLLING_NRFLAG);
 	}
 }


Other idle routines will need similar fix.

> Ok, but what about default_idle?

default_idle should be fine as it is. IOW it should not cause __cpu_die to 
timeout.

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:23         ` Srivatsa Vaddagiri
@ 2005-09-19  6:29           ` Nick Piggin
  2005-09-19  7:00             ` Srivatsa Vaddagiri
  2005-09-19  6:31           ` Nigel Cunningham
  2005-09-19  6:37           ` Shaohua Li
  2 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2005-09-19  6:29 UTC (permalink / raw)
  To: vatsa
  Cc: Nigel Cunningham, Li Shaohua, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 11:53 +0530, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 04:11:11PM +1000, Nigel Cunningham wrote:
> > > Ok, that makes sense. Nigel, could you confirm which idle routine you are 
> > > using?
> > 
> > >From dmesg:
> > 
> > monitor/mwait feature present.
> > using mwait in idle threads.
> 
> Ok, that may explain why __cpu_die is timing out for you! Can you try a 
> (untested, I am afraid) patch on these lines: 
> 
> --- process.c.org	2005-09-19 11:44:57.000000000 +0530
> +++ process.c	2005-09-19 11:48:28.000000000 +0530
> @@ -245,16 +245,18 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
>   */
>  static void mwait_idle(void)
>  {
> +	int cpu = raw_smp_processor_id();
> +
>  	local_irq_enable();
>  
>  	if (!need_resched()) {
>  		set_thread_flag(TIF_POLLING_NRFLAG);
>  		do {
>  			__monitor((void *)&current_thread_info()->flags, 0, 0);
> -			if (need_resched())
> +			if (need_resched() || cpu_is_offline(cpu))
>  				break;
>  			__mwait(0, 0);
> -		} while (!need_resched());
> +		} while (!need_resched() || !cpu_is_offline(cpu));
>  		clear_thread_flag(TIF_POLLING_NRFLAG);
>  	}
>  }
> 
> 
> Other idle routines will need similar fix.
> 

It seems to me that it would be much nicer to just go with Nigel's
first fix. That is, rather than clutter up all idle routines with
this.

Nick

-- 
SUSE Labs, Novell Inc.



Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:29           ` Nick Piggin
@ 2005-09-19  7:00             ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  7:00 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Nigel Cunningham, Li Shaohua, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 04:29:39PM +1000, Nick Piggin wrote:
> It seems to me that it would be much nicer to just go with Nigel's
> first fix. That is, rather than clutter up all idle routines with
> this.

I felt it to be more of a hack to get things working (do we really
want the idle thread to be rescheduled at that point?). Plus the fact
that it can cause the idle thread to call schedule() before the cpu_is_offline 
check made me uncomfortable (maybe it is just fine).

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:23         ` Srivatsa Vaddagiri
  2005-09-19  6:29           ` Nick Piggin
@ 2005-09-19  6:31           ` Nigel Cunningham
  2005-09-19  7:09             ` Srivatsa Vaddagiri
  2005-09-19  6:37           ` Shaohua Li
  2 siblings, 1 reply; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  6:31 UTC (permalink / raw)
  To: vatsa
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

Hi.

On Mon, 2005-09-19 at 16:23, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 04:11:11PM +1000, Nigel Cunningham wrote:
> > > Ok, that makes sense. Nigel, could you confirm which idle routine you are 
> > > using?
> > 
> > >From dmesg:
> > 
> > monitor/mwait feature present.
> > using mwait in idle threads.
> 
> Ok, that may explain why __cpu_die is timing out for you! Can you try a 
> (untested, I am afraid) patch on these lines: 

Will do. Given my (understandable I guess) difficulty in reproducing it
reliably, shall I add a printk in there so I can see when it would have
otherwise failed to drop out?

> --- process.c.org	2005-09-19 11:44:57.000000000 +0530
> +++ process.c	2005-09-19 11:48:28.000000000 +0530
> @@ -245,16 +245,18 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
>   */
>  static void mwait_idle(void)
>  {
> +	int cpu = raw_smp_processor_id();
> +
>  	local_irq_enable();
>  
>  	if (!need_resched()) {
>  		set_thread_flag(TIF_POLLING_NRFLAG);
>  		do {
>  			__monitor((void *)&current_thread_info()->flags, 0, 0);
> -			if (need_resched())
> +			if (need_resched() || cpu_is_offline(cpu))
>  				break;
>  			__mwait(0, 0);
> -		} while (!need_resched());
> +		} while (!need_resched() || !cpu_is_offline(cpu));

Shouldn't this be !need_resched() && !cpu_is_offline(cpu)?

Regards,

Nigel

>  		clear_thread_flag(TIF_POLLING_NRFLAG);
>  	}
>  }
> 
> 
> Other idle routines will need similar fix.
> 
> > Ok, but what about default_idle?
> 
> default_idle should be fine as it is. IOW it should not cause __cpu_die to 
> timeout.
-- 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:31           ` Nigel Cunningham
@ 2005-09-19  7:09             ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  7:09 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 04:31:36PM +1000, Nigel Cunningham wrote:
> > -		} while (!need_resched());
> > +		} while (!need_resched() || !cpu_is_offline(cpu));
> 
> Shouldn't this be !need_resched() && !cpu_is_offline(cpu)?

Yes!


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:23         ` Srivatsa Vaddagiri
  2005-09-19  6:29           ` Nick Piggin
  2005-09-19  6:31           ` Nigel Cunningham
@ 2005-09-19  6:37           ` Shaohua Li
  2005-09-19  6:36             ` Nick Piggin
  2005-09-19  7:07             ` Srivatsa Vaddagiri
  2 siblings, 2 replies; 26+ messages in thread
From: Shaohua Li @ 2005-09-19  6:37 UTC (permalink / raw)
  To: vatsa
  Cc: Nigel Cunningham, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 11:53 +0530, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 04:11:11PM +1000, Nigel Cunningham wrote:
> > > Ok, that makes sense. Nigel, could you confirm which idle routine you are 
> > > using?
> > 
> > >From dmesg:
> > 
> > monitor/mwait feature present.
> > using mwait in idle threads.
> 
> Ok, that may explain why __cpu_die is timing out for you! Can you try a 
> (untested, I am afraid) patch on these lines: 
> 
> --- process.c.org	2005-09-19 11:44:57.000000000 +0530
> +++ process.c	2005-09-19 11:48:28.000000000 +0530
> @@ -245,16 +245,18 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
>   */
>  static void mwait_idle(void)
>  {
> +	int cpu = raw_smp_processor_id();
> +
>  	local_irq_enable();
>  
>  	if (!need_resched()) {
>  		set_thread_flag(TIF_POLLING_NRFLAG);
>  		do {
>  			__monitor((void *)&current_thread_info()->flags, 0, 0);
> -			if (need_resched())
> +			if (need_resched() || cpu_is_offline(cpu))
>  				break;
if the breakpoint is here, you will still have trouble.

>  			__mwait(0, 0);
> -		} while (!need_resched());
> +		} while (!need_resched() || !cpu_is_offline(cpu));
>  		clear_thread_flag(TIF_POLLING_NRFLAG);
>  	}
>  }
> 
> 
> Other idle routines will need similar fix.
> 
> > Ok, but what about default_idle?
> 
> default_idle should be fine as it is. IOW it should not cause __cpu_die to 
> timeout.
Why default_idle should be fine? it can be preempted before the
'local_irq_disable' check. Even with Nigel's patch, there is a very
small window at safe_halt (after 'sti' but before 'hlt').

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:37           ` Shaohua Li
@ 2005-09-19  6:36             ` Nick Piggin
  2005-09-19  6:54               ` Nigel Cunningham
  2005-09-19  7:12               ` Shaohua Li
  2005-09-19  7:07             ` Srivatsa Vaddagiri
  1 sibling, 2 replies; 26+ messages in thread
From: Nick Piggin @ 2005-09-19  6:36 UTC (permalink / raw)
  To: Shaohua Li
  Cc: vatsa, Nigel Cunningham, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 14:37 +0800, Shaohua Li wrote:
> On Mon, 2005-09-19 at 11:53 +0530, Srivatsa Vaddagiri wrote:

> > default_idle should be fine as it is. IOW it should not cause __cpu_die to 
> > timeout.
> Why default_idle should be fine? it can be preempted before the
> 'local_irq_disable' check. Even with Nigel's patch, there is a very
> small window at safe_halt (after 'sti' but before 'hlt').
> 

Ah, actually I have a patch which makes all CPU idle threads
run with preempt disabled and only enable preempt when scheduling.
Would that help?

Nick

-- 
SUSE Labs, Novell Inc.



Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:36             ` Nick Piggin
@ 2005-09-19  6:54               ` Nigel Cunningham
  2005-09-19  7:12               ` Shaohua Li
  1 sibling, 0 replies; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  6:54 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Li Shaohua, vatsa, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	lkml, Rusty Russell, Ingo Molnar

Hi.

On Mon, 2005-09-19 at 16:36, Nick Piggin wrote:
> On Mon, 2005-09-19 at 14:37 +0800, Shaohua Li wrote:
> > On Mon, 2005-09-19 at 11:53 +0530, Srivatsa Vaddagiri wrote:
> 
> > > default_idle should be fine as it is. IOW it should not cause __cpu_die to 
> > > timeout.
> > Why default_idle should be fine? it can be preempted before the
> > 'local_irq_disable' check. Even with Nigel's patch, there is a very
> > small window at safe_halt (after 'sti' but before 'hlt').
> > 
> 
> Ah, actually I have a patch which makes all CPU idle threads
> run with preempt disabled and only enable preempt when scheduling.
> Would that help?

Sounds to me like it should, but I'm not claiming any expertise in this
area. Going off your other posts, you're thinking of this plus my
original patch?

Regards,

Nigel 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:36             ` Nick Piggin
  2005-09-19  6:54               ` Nigel Cunningham
@ 2005-09-19  7:12               ` Shaohua Li
  2005-09-19  7:22                 ` Nick Piggin
  1 sibling, 1 reply; 26+ messages in thread
From: Shaohua Li @ 2005-09-19  7:12 UTC (permalink / raw)
  To: Nick Piggin
  Cc: vatsa, Nigel Cunningham, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 16:36 +1000, Nick Piggin wrote:
> On Mon, 2005-09-19 at 14:37 +0800, Shaohua Li wrote:
> > On Mon, 2005-09-19 at 11:53 +0530, Srivatsa Vaddagiri wrote:
> 
> > > default_idle should be fine as it is. IOW it should not cause __cpu_die to 
> > > timeout.
> > Why default_idle should be fine? it can be preempted before the
> > 'local_irq_disable' check. Even with Nigel's patch, there is a very
> > small window at safe_halt (after 'sti' but before 'hlt').
> > 
> 
> Ah, actually I have a patch which makes all CPU idle threads
> run with preempt disabled and only enable preempt when scheduling.
> Would that help?
It should solve the issue to me. Should we take care of the latency?
acpi_processor_idle might execute for a long time.

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  7:12               ` Shaohua Li
@ 2005-09-19  7:22                 ` Nick Piggin
  2005-09-19  7:28                   ` Ingo Molnar
  0 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2005-09-19  7:22 UTC (permalink / raw)
  To: Shaohua Li
  Cc: vatsa, Nigel Cunningham, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell, Ingo Molnar

On Mon, 2005-09-19 at 15:12 +0800, Shaohua Li wrote:
> On Mon, 2005-09-19 at 16:36 +1000, Nick Piggin wrote:

> > Ah, actually I have a patch which makes all CPU idle threads
> > run with preempt disabled and only enable preempt when scheduling.
> > Would that help?
> It should solve the issue to me. Should we take care of the latency?
> acpi_processor_idle might execute for a long time.
> 

Oh really? I think yes, the latency should be taken care of because
we want to be able to provide good latency even for !preempt kernels.
If a solution can be found for acpi_processor_idle, that would be
ideal.

IMO it always felt kind of hackish to run the idle threads with
preempt on.

Thanks,
Nick

-- 
SUSE Labs, Novell Inc.



Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  7:22                 ` Nick Piggin
@ 2005-09-19  7:28                   ` Ingo Molnar
  2005-09-19  7:37                     ` Nick Piggin
  0 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2005-09-19  7:28 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Shaohua Li, vatsa, Nigel Cunningham, Andrew Morton,
	Linus Torvalds, Zwane Mwaikambo, lkml, Rusty Russell


* Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> On Mon, 2005-09-19 at 15:12 +0800, Shaohua Li wrote:
> > On Mon, 2005-09-19 at 16:36 +1000, Nick Piggin wrote:
> 
> > > Ah, actually I have a patch which makes all CPU idle threads
> > > run with preempt disabled and only enable preempt when scheduling.
> > > Would that help?
> > It should solve the issue to me. Should we take care of the latency?
> > acpi_processor_idle might execute for a long time.
> > 
> 
> Oh really? I think yes, the latency should be taken care of because we 
> want to be able to provide good latency even for !preempt kernels. If 
> a solution can be found for acpi_processor_idle, that would be ideal.

the ACPI idle code runs with irqs disabled anyway, so there's no issue 
here. If something takes long there, we can do little about it. (but in 
practice ACPI sleep latencies are pretty ok - the only latencies i found 
in the past were due to need_resched bugs in the ACPI idle routine)

> IMO it always felt kind of hackish to run the idle threads with 
> preempt on.

Yes, idle threads can have preemption disabled. There's not any big 
difference in terms of latencies, the execution paths are all very 
short.

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  7:28                   ` Ingo Molnar
@ 2005-09-19  7:37                     ` Nick Piggin
  2005-09-19 22:55                       ` Nigel Cunningham
  0 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2005-09-19  7:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Shaohua Li, vatsa, Nigel Cunningham, Andrew Morton,
	Linus Torvalds, Zwane Mwaikambo, lkml, Rusty Russell

On Mon, 2005-09-19 at 09:28 +0200, Ingo Molnar wrote:
> * Nick Piggin <nickpiggin@yahoo.com.au> wrote:

> > Oh really? I think yes, the latency should be taken care of because we 
> > want to be able to provide good latency even for !preempt kernels. If 
> > a solution can be found for acpi_processor_idle, that would be ideal.
> 
> the ACPI idle code runs with irqs disabled anyway, so there's no issue 
> here. If something takes long there, we can do little about it. (but in 
> practice ACPI sleep latencies are pretty ok - the only latencies i found 
> in the past were due to need_resched bugs in the ACPI idle routine)
> 

Ah, in that case I agree: we have nothing to worry about by merging
such a patch then.

> > IMO it always felt kind of hackish to run the idle threads with 
> > preempt on.
> 
> Yes, idle threads can have preemption disabled. There's not any big 
> difference in terms of latencies, the execution paths are all very 
> short.
> 

Thanks for the confirmation Ingo. This is part of my "cleanup resched
and cpu_idle" patch FYI. It should already be in -mm, but has some
trivial EM64T bug in it that Andrew hits but I can't reproduce.

I'll dust it off and send it out, hopefully someone will be able to
reproduce the problem!

-- 
SUSE Labs, Novell Inc.



Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  7:37                     ` Nick Piggin
@ 2005-09-19 22:55                       ` Nigel Cunningham
  2005-09-20  4:41                         ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19 22:55 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Li Shaohua, vatsa, Andrew Morton, Linus Torvalds,
	Zwane Mwaikambo, lkml, Rusty Russell

Hi.

On Mon, 2005-09-19 at 17:37, Nick Piggin wrote:
> Thanks for the confirmation Ingo. This is part of my "cleanup resched
> and cpu_idle" patch FYI. It should already be in -mm, but has some
> trivial EM64T bug in it that Andrew hits but I can't reproduce.
> 
> I'll dust it off and send it out, hopefully someone will be able to
> reproduce the problem!

I got on the same page eventually :). When you have it ready, I'll be
happy to try it. Apart from trying another 75 suspends (which I'm happy
to do), I'm not really sure how we can be totally sure that the patch
fixes it. Do you have any thoughts in this regard?

Regards,

Nigel



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19 22:55                       ` Nigel Cunningham
@ 2005-09-20  4:41                         ` Srivatsa Vaddagiri
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-20  4:41 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Nick Piggin, Ingo Molnar, Li Shaohua, Andrew Morton,
	Linus Torvalds, Zwane Mwaikambo, lkml, Rusty Russell

On Tue, Sep 20, 2005 at 08:55:47AM +1000, Nigel Cunningham wrote:
> I got on the same page eventually :). When you have it ready, I'll be
> happy to try it. Apart from trying another 75 suspends (which I'm happy
> to do), I'm not really sure how we can be totally sure that the patch
> fixes it. Do you have any thoughts in this regard?

Since this seems to be related to CPU down event, you could run a
tight loop like this overnight (for all CPUs in the system):

	while :
	do
		echo 0 > online
		echo 1 > online
	done


This, maybe in conjunction with some other test like LTP or make -jN bzImage,
is usually enough to capture all CPU-hotplug related problems.

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  6:37           ` Shaohua Li
  2005-09-19  6:36             ` Nick Piggin
@ 2005-09-19  7:07             ` Srivatsa Vaddagiri
  1 sibling, 0 replies; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  7:07 UTC (permalink / raw)
  To: Shaohua Li
  Cc: Nigel Cunningham, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell, Ingo Molnar

On Mon, Sep 19, 2005 at 02:37:09PM +0800, Shaohua Li wrote:
> > -			if (need_resched())
> > +			if (need_resched() || cpu_is_offline(cpu))
> >  				break;
> if the breakpoint is here, you will still have trouble.

[snip]

> Why default_idle should be fine? it can be preempted before the
> 'local_irq_disable' check. 
> Even with Nigel's patch, there is a very
> small window at safe_halt (after 'sti' but before 'hlt').

Good point. Sounds like the patch that Nick has for disabling premption
while it is idle may be a cure for these problems.

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  4:48 Li, Shaohua
  2005-09-19  5:10 ` Srivatsa Vaddagiri
@ 2005-09-19  5:23 ` Nigel Cunningham
  2005-09-19  5:28   ` Srivatsa Vaddagiri
  1 sibling, 1 reply; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  5:23 UTC (permalink / raw)
  To: Li Shaohua
  Cc: vatsa, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell

Hi.

On Mon, 2005-09-19 at 14:48, Li, Shaohua wrote:
> Hi,
> >
> >On Mon, Sep 19, 2005 at 01:28:38PM +1000, Nigel Cunningham wrote:
> >> There is a race condition in taking down a cpu
> (kernel/cpu.c::cpu_down).
> >> A cpu can already be idling when we clear its online flag, and we do
> not
> >> force the idle task to reschedule. This results in __cpu_die timing
> out.
> >
> >"when we clear its online flag" - This happens in take_cpu_down in the
> >context of stopmachine thread. take_cpu_down also ensures that idle
> >thread runs when it returns (sched_idle_next). So when idle thread
> runs,
> >it should notice that it is offline and invoke play_dead.  So I don't
> >understand why __cpu_die should time out.
> I guess Nigel's point is cpu_idle is preempted before take_cpu_down. If
> the preempt occurs after the cpu_is_offline check, when the cpu (after
> sched_idle_next) goes into idle again, nobody can wake it up. Nigel,
> isn't it?

Maybe I'm just an ignoramus, but I was thinking (without being a
scheduler expert at all) that if the idle thread was already running,
trying to set it up to run next might possibly have zero effect. I've
added a bit of debugging code to try and see in better detail what's
happening.

Regards,

Nigel

> Thanks,
> Shaohua
-- 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  5:23 ` Nigel Cunningham
@ 2005-09-19  5:28   ` Srivatsa Vaddagiri
  2005-09-19  5:35     ` Nigel Cunningham
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa Vaddagiri @ 2005-09-19  5:28 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell

On Mon, Sep 19, 2005 at 03:23:01PM +1000, Nigel Cunningham wrote:
> Maybe I'm just an ignoramus, but I was thinking (without being a
> scheduler expert at all) that if the idle thread was already running,
> trying to set it up to run next might possibly have zero effect. I've

sched_idle_next actually adds the idle task to its runqueue (normally
it is not present in a runqueue while running) and as well changes its 
priority/policy. So it does have *some* effect!

> added a bit of debugging code to try and see in better detail what's
> happening.

Could you elaborate (with some stack traces maybe) on the deadlock you are 
seeing during resume? Maybe that can throw some light.

-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: PATCH: Fix race in cpu_down (hotplug cpu)
  2005-09-19  5:28   ` Srivatsa Vaddagiri
@ 2005-09-19  5:35     ` Nigel Cunningham
  0 siblings, 0 replies; 26+ messages in thread
From: Nigel Cunningham @ 2005-09-19  5:35 UTC (permalink / raw)
  To: vatsa
  Cc: Li Shaohua, Andrew Morton, Linus Torvalds, Zwane Mwaikambo,
	Linux Kernel Mailing List, Rusty Russell

On Mon, 2005-09-19 at 15:28, Srivatsa Vaddagiri wrote:
> On Mon, Sep 19, 2005 at 03:23:01PM +1000, Nigel Cunningham wrote:
> > Maybe I'm just an ignoramus, but I was thinking (without being a
> > scheduler expert at all) that if the idle thread was already running,
> > trying to set it up to run next might possibly have zero effect. I've
> 
> sched_idle_next actually adds the idle task to its runqueue (normally
> it is not present in a runqueue while running) and as well changes its 
> priority/policy. So it does have *some* effect!

Ok.

> > added a bit of debugging code to try and see in better detail what's
> > happening.
> 
> Could you elaborate (with some stack traces maybe) on the deadlock you are 
> seeing during resume? Maybe that can throw some light.

I'll try. I'm having trouble reproducing it now (yes, having reversed my
patch!).

Regards,

Nigel
-- 



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2005-09-20  4:42 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-19  3:15 PATCH: Fix race in cpu_down (hotplug cpu) Nigel Cunningham
  -- strict thread matches above, loose matches on Subject: below --
2005-09-19  3:28 Nigel Cunningham
2005-09-19  4:22 ` Srivatsa Vaddagiri
2005-09-19  4:48 Li, Shaohua
2005-09-19  5:10 ` Srivatsa Vaddagiri
2005-09-19  5:31   ` Shaohua Li
2005-09-19  5:57     ` Srivatsa Vaddagiri
2005-09-19  6:11       ` Nigel Cunningham
2005-09-19  6:23         ` Srivatsa Vaddagiri
2005-09-19  6:29           ` Nick Piggin
2005-09-19  7:00             ` Srivatsa Vaddagiri
2005-09-19  6:31           ` Nigel Cunningham
2005-09-19  7:09             ` Srivatsa Vaddagiri
2005-09-19  6:37           ` Shaohua Li
2005-09-19  6:36             ` Nick Piggin
2005-09-19  6:54               ` Nigel Cunningham
2005-09-19  7:12               ` Shaohua Li
2005-09-19  7:22                 ` Nick Piggin
2005-09-19  7:28                   ` Ingo Molnar
2005-09-19  7:37                     ` Nick Piggin
2005-09-19 22:55                       ` Nigel Cunningham
2005-09-20  4:41                         ` Srivatsa Vaddagiri
2005-09-19  7:07             ` Srivatsa Vaddagiri
2005-09-19  5:23 ` Nigel Cunningham
2005-09-19  5:28   ` Srivatsa Vaddagiri
2005-09-19  5:35     ` Nigel Cunningham

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox