sched domains bringup race?

public inbox for linux-kernel@vger.kernel.org

* sched domains bringup race?
@ 2004-07-16  2:13 Dave Hansen
  2004-07-16  3:03 ` Nick Piggin
  2004-07-18 20:45 ` Keshavamurthy Anil S
  0 siblings, 2 replies; 10+ messages in thread
From: Dave Hansen @ 2004-07-16  2:13 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Matthew C. Dobson [imap], Linux Kernel Mailing List

I keep getting oopses for the non-boot CPU in find_busiest_group(). 
This occurs the first time that the CPU goes idle.  Those groups are set
up in sched_init_smp(), which is called after smp_init():

static int init(void * unused)
{
	...
	fixup_cpu_present_map();
	smp_init();
	sched_init_smp();

But, the idle threads for the secondary CPUs are initialized in
smp_init().  So, what happens when a CPU tries to schedule (using sched
domains) before sched_init_smp() completes?  I think it goes boom! :)

Anyway, I was thinking that we should just hold the runqueue lock on the
non-boot CPUs until the sched domain init code is done.  Does that sound
feasible?

-- Dave


* Re: sched domains bringup race?
  2004-07-16  2:13 sched domains bringup race? Dave Hansen
@ 2004-07-16  3:03 ` Nick Piggin
  2004-07-16  3:30   ` Dave Hansen
  2004-07-18 20:45 ` Keshavamurthy Anil S
  1 sibling, 1 reply; 10+ messages in thread
From: Nick Piggin @ 2004-07-16  3:03 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Matthew C. Dobson [imap], Linux Kernel Mailing List

Dave Hansen wrote:
> I keep getting oopses for the non-boot CPU in find_busiest_group(). 
> This occurs the first time that the CPU goes idle.  Those groups are set
> up in sched_init_smp(), which is called after smp_init():
> 
> static int init(void * unused)
> {
> 	...
>         fixup_cpu_present_map();
>         smp_init();
>         sched_init_smp();
> 
> But, the idle threads for the secondary CPUs are initialized in
> smp_init().  So, what happens when a CPU tries to schedule (using sched
> domains) before sched_init_smp() completes?  I think it goes boom! :)
> 

It shouldn't because sched_init sets up dummy domains for
all runqueues.

Obviously something is going wrong somewhere though.
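
To illustrate the dummy-domain idea, here is a minimal userspace sketch.
It is not the kernel's code: every demo_* name and the simplified types
are assumptions made up for the example. The point is only that a
runqueue whose domain spans just its own CPU gives the balancer nothing
to pull from, so an early balancing attempt finds no busier CPU instead
of chasing an unset pointer.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

/* Hypothetical, heavily simplified stand-ins for the scheduler's types. */
struct demo_domain {
	struct demo_domain *parent;	/* next level up, NULL at the top */
	unsigned long span;		/* bitmask of CPUs this domain covers */
};

struct demo_rq {
	struct demo_domain *sd;		/* lowest-level domain for this CPU */
	unsigned int nr_running;
};

static struct demo_domain demo_dummy[DEMO_NR_CPUS];
static struct demo_rq demo_rqs[DEMO_NR_CPUS];

/* Early init: every runqueue gets a one-CPU domain, never a NULL pointer. */
static void demo_sched_init(void)
{
	for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		demo_dummy[cpu].parent = NULL;
		demo_dummy[cpu].span = 1UL << cpu;
		demo_rqs[cpu].sd = &demo_dummy[cpu];
		demo_rqs[cpu].nr_running = 0;
	}
}

/*
 * Return the busiest other CPU found in this_cpu's domains, or -1 if
 * there is none.  With a dummy domain the span holds only this_cpu,
 * so the result is always -1.
 */
static int demo_find_busiest_cpu(int this_cpu)
{
	int busiest = -1;
	unsigned int max_load;

	assert(this_cpu >= 0 && this_cpu < DEMO_NR_CPUS);
	max_load = demo_rqs[this_cpu].nr_running;

	for (struct demo_domain *sd = demo_rqs[this_cpu].sd; sd; sd = sd->parent) {
		for (int cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
			if (cpu == this_cpu || !(sd->span & (1UL << cpu)))
				continue;
			if (demo_rqs[cpu].nr_running > max_load) {
				max_load = demo_rqs[cpu].nr_running;
				busiest = cpu;
			}
		}
	}
	return busiest;
}
```

If the oops really is in the balancing path, a model like this suggests
looking for a runqueue that never received its dummy domain, or for a
domain that is visible while only partly filled in.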


* Re: sched domains bringup race?
  2004-07-16  3:03 ` Nick Piggin
@ 2004-07-16  3:30   ` Dave Hansen
  2004-07-16  3:46     ` Nick Piggin
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Hansen @ 2004-07-16  3:30 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Matthew C. Dobson [imap], Linux Kernel Mailing List

On Thu, 2004-07-15 at 20:03, Nick Piggin wrote:
> It shouldn't because sched_init sets up dummy domains for
> all runqueues.
> 
> Obviously something is going wrong somewhere though.

Hmmm, but there still might be some concurrency problems, right?  There
isn't any locking while the setup is being done, so are all of the
intermediate initialization states valid?  Or, could one of the CPUs be
catching the init code in the middle of an operation?

-- Dave


* Re: sched domains bringup race?
  2004-07-16  3:30   ` Dave Hansen
@ 2004-07-16  3:46     ` Nick Piggin
  2004-07-16  5:27       ` Nick Piggin
  0 siblings, 1 reply; 10+ messages in thread
From: Nick Piggin @ 2004-07-16  3:46 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Matthew C. Dobson [imap], Linux Kernel Mailing List

Dave Hansen wrote:
> On Thu, 2004-07-15 at 20:03, Nick Piggin wrote:
> 
>>It shouldn't because sched_init sets up dummy domains for
>>all runqueues.
>>
>>Obviously something is going wrong somewhere though.
> 
> 
> Hmmm, but there still might be some concurrency problems, right?  There
> isn't any locking while the setup is being done, so are all of the
> intermediate initialization states valid?  Or, could one of the CPUs be
> catching the init code in the middle of an operation?
> 

cpu_attach_domain is supposed to be able to do the switchover
without any races.
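
One general way to get a race-free switchover is to build the new
structure completely and then publish it with a single pointer store.
The sketch below shows that pattern in userspace C11. It is an
assumption for illustration only, not a description of how
cpu_attach_domain is actually implemented, and the pub_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical simplified types; not the kernel's definitions. */
struct pub_domain {
	struct pub_domain *parent;
	unsigned long span;
};

struct pub_rq {
	_Atomic(struct pub_domain *) sd;
};

/* Writer: finish building the new domain first, then publish it in one store. */
static void pub_attach_domain(struct pub_rq *rq, struct pub_domain *new_sd,
			      unsigned long span)
{
	new_sd->parent = NULL;
	new_sd->span = span;
	/* Release ordering makes the fields above visible before the pointer. */
	atomic_store_explicit(&rq->sd, new_sd, memory_order_release);
}

/* Reader: take one snapshot of the pointer and use only that snapshot. */
static unsigned long pub_current_span(struct pub_rq *rq)
{
	struct pub_domain *sd =
		atomic_load_explicit(&rq->sd, memory_order_acquire);

	assert(sd != NULL);	/* early init is assumed to install a dummy */
	return sd->span;
}
```

A reader then sees either the old domain or the new one, never a
half-initialized mix. The old domain still must not be freed or reused
while some reader may be walking it, which this sketch does not address.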


* Re: sched domains bringup race?
  2004-07-16  3:46     ` Nick Piggin
@ 2004-07-16  5:27       ` Nick Piggin
  0 siblings, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-07-16  5:27 UTC (permalink / raw)
  Cc: Dave Hansen, Matthew C. Dobson [imap], Linux Kernel Mailing List

Nick Piggin wrote:
> Dave Hansen wrote:
> 
>> On Thu, 2004-07-15 at 20:03, Nick Piggin wrote:
>>
>>> It shouldn't because sched_init sets up dummy domains for
>>> all runqueues.
>>>
>>> Obviously something is going wrong somewhere though.
>>
>>
>>
>> Hmmm, but there still might be some concurrency problems, right?  There
>> isn't any locking while the setup is being done, so are all of the
>> intermediate initialization states valid?  Or, could one of the CPUs be
>> catching the init code in the middle of an operation?
>>
> 
> cpu_attach_domain is supposed to be able to do the switchover
> without any races.

Although the sched_domain_debug is definitely racy. It should really
be locking each runqueue before traversing its domains. Just undef
SCHED_DOMAIN_DEBUG to be sure...
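
As a rough picture of "lock each runqueue before traversing its
domains", here is a userspace sketch. The dbg_* names and the toy
spinlock are hypothetical stand-ins, and it assumes that whoever
changes the domain pointer takes the same lock; without that, locking
only the reader would not help.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical simplified types; not the kernel's definitions. */
struct dbg_domain {
	struct dbg_domain *parent;
	unsigned long span;
};

struct dbg_rq {
	atomic_flag lock;		/* toy spinlock standing in for the rq lock */
	struct dbg_domain *sd;
};

static void dbg_rq_lock(struct dbg_rq *rq)
{
	while (atomic_flag_test_and_set_explicit(&rq->lock, memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void dbg_rq_unlock(struct dbg_rq *rq)
{
	atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/* Count the domain levels of one runqueue while holding its lock. */
static int dbg_count_levels(struct dbg_rq *rq)
{
	int levels = 0;

	assert(rq != NULL);
	dbg_rq_lock(rq);
	for (struct dbg_domain *sd = rq->sd; sd; sd = sd->parent)
		levels++;
	dbg_rq_unlock(rq);
	return levels;
}
```

A debug walker built this way would call the locked traversal once per
runqueue rather than holding every lock at the same time.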


* Re: sched domains bringup race?
  2004-07-16  2:13 sched domains bringup race? Dave Hansen
  2004-07-16  3:03 ` Nick Piggin
@ 2004-07-18 20:45 ` Keshavamurthy Anil S
  2004-07-19  7:31   ` Nick Piggin
  1 sibling, 1 reply; 10+ messages in thread
From: Keshavamurthy Anil S @ 2004-07-18 20:45 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Nick Piggin, Matthew C. Dobson [imap], Linux Kernel Mailing List

On Thu, Jul 15, 2004 at 07:13:46PM -0700, Dave Hansen wrote:
> I keep getting oopses for the non-boot CPU in find_busiest_group(). 
> This occurs the first time that the CPU goes idle.  Those groups are set
> up in sched_init_smp(), which is called after smp_init():
> 
> static int init(void * unused)
> {
> 	...
>         fixup_cpu_present_map();
>         smp_init();
>         sched_init_smp();
> 
> But, the idle threads for the secondary CPUs are initialized in
> smp_init().  So, what happens when a CPU tries to schedule (using sched
> domains) before sched_init_smp() completes?  I think it goes boom! :)
> 
> Anyway, I was thinking that we should just hold the runqueue lock on the
> non-boot CPUs until the sched domain init code is done.  Does that sound
> feasible?

Even on my system, which has an Intel 865 chipset (a P4 with HT enabled),
I see a bug check somewhere in scheduler_tick() during boot.
However, if I move sched_init_smp() after do_basic_setup(), the
kernel boots without any problem. Any clue here?

static int init(void * unused)
{
	...
	fixup_cpu_present_map();
	smp_init();
	populate_rootfs();
	do_basic_setup();

	sched_init_smp();

-Anil


* Re: sched domains bringup race?
  2004-07-18 20:45 ` Keshavamurthy Anil S
@ 2004-07-19  7:31   ` Nick Piggin
  2004-07-22 21:55     ` Nathan Lynch
  0 siblings, 1 reply; 10+ messages in thread
From: Nick Piggin @ 2004-07-19  7:31 UTC (permalink / raw)
  To: Keshavamurthy Anil S
  Cc: Dave Hansen, Matthew C. Dobson [imap], Linux Kernel Mailing List

Keshavamurthy Anil S wrote:
> On Thu, Jul 15, 2004 at 07:13:46PM -0700, Dave Hansen wrote:
> 
>>I keep getting oopses for the non-boot CPU in find_busiest_group(). 
>>This occurs the first time that the CPU goes idle.  Those groups are set
>>up in sched_init_smp(), which is called after smp_init():
>>
>>static int init(void * unused)
>>{
>>	...
>>        fixup_cpu_present_map();
>>        smp_init();
>>        sched_init_smp();
>>
>>But, the idle threads for the secondary CPUs are initialized in
>>smp_init().  So, what happens when a CPU tries to schedule (using sched
>>domains) before sched_init_smp() completes?  I think it goes boom! :)
>>
>>Anyway, I was thinking that we should just hold the runqueue lock on the
>>non-boot CPUs until the sched domain init code is done.  Does that sound
>>feasible?
> 
> 
> Even on my system which is Intel 865 chipset (P4 with HT enabled system) 
> I see a bug check somewhere in the scheduler_tick during boot.
> However if I move the sched_init_smp() after do_basic_setup() the
> kernel boots without any problem. Any clue here?
> 
>  static int init(void * unused)
>  {
>  	...
>          fixup_cpu_present_map();
>          smp_init();
> 	 populate_rootfs();
> 	 do_basic_setup();
>  
>          sched_init_smp();

There shouldn't be any problem doing that if we have to, but obviously we
need to know why. Is it possible that cpu_sibling_map, or one of the
CPU masks isn't set up correctly at the time of the call?


* Re: sched domains bringup race?
  2004-07-19  7:31   ` Nick Piggin
@ 2004-07-22 21:55     ` Nathan Lynch
  2004-07-22 23:23       ` Keshavamurthy Anil S
  2004-07-23  0:33       ` Nick Piggin
  0 siblings, 2 replies; 10+ messages in thread
From: Nathan Lynch @ 2004-07-22 21:55 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Keshavamurthy Anil S, Dave Hansen, Matthew C. Dobson [imap],
	Linux Kernel Mailing List

On Mon, 2004-07-19 at 02:31, Nick Piggin wrote:
> Keshavamurthy Anil S wrote:
> > Even on my system which is Intel 865 chipset (P4 with HT enabled system) 
> > I see a bug check somewhere in the scheduler_tick during boot.
> > However if I move the sched_init_smp() after do_basic_setup() the
> > kernel boots without any problem. Any clue here?
> 
> There shouldn't be any problem doing that if we have to, obviously we
> need to know why. Is it possible that cpu_sibling_map, or one of the
> CPU masks isn't set up correctly at the time of the call?

In 2.6.8-rc1-mm1 at least, backing this patch out fixed it for me on
ppc64:

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc1/2.6.8-rc1-mm1/broken-out/detect-too-early-schedule-attempts.patch

Code with statements of the form:

if (system_state == SYSTEM_BOOTING)
	/* do something boot-specific */
else
	/* do something assuming system_state == SYSTEM_RUNNING */

is broken by this change.  Parts of the cpu bringup code in arch/ppc64
do this (and thus need to be fixed if the above change is kept). 
Chances are there is similar code in some x86 setups.

Nathan
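
To make the failure mode concrete, here is a small sketch. It assumes
the change adds an extra state value between "booting" and "running";
the thread does not spell out what the patch adds, so
DEMO_BOOTING_SCHED_OK and the other demo_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical states; DEMO_BOOTING_SCHED_OK stands in for a newly added one. */
enum demo_state {
	DEMO_BOOTING,
	DEMO_BOOTING_SCHED_OK,
	DEMO_RUNNING,
};

enum demo_path { DEMO_BOOT_PATH, DEMO_RUNTIME_PATH };

/* Fragile: everything that is not DEMO_BOOTING is treated as fully running. */
static enum demo_path demo_pick_fragile(enum demo_state s)
{
	if (s == DEMO_BOOTING)
		return DEMO_BOOT_PATH;
	else
		return DEMO_RUNTIME_PATH;
}

/* Explicit: each state is listed, so a new one must be classified on purpose. */
static enum demo_path demo_pick_explicit(enum demo_state s)
{
	switch (s) {
	case DEMO_RUNNING:
		return DEMO_RUNTIME_PATH;
	case DEMO_BOOTING:
	case DEMO_BOOTING_SCHED_OK:
		return DEMO_BOOT_PATH;
	}
	assert(!"unknown state");
	return DEMO_BOOT_PATH;
}
```

With the extra state, the fragile version silently takes the runtime
path while the system is still booting, whereas the explicit version
keeps the boot path until the state really is "running".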



* Re: sched domains bringup race?
  2004-07-22 21:55     ` Nathan Lynch
@ 2004-07-22 23:23       ` Keshavamurthy Anil S
  2004-07-23  0:33       ` Nick Piggin
  1 sibling, 0 replies; 10+ messages in thread
From: Keshavamurthy Anil S @ 2004-07-22 23:23 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Nick Piggin, Keshavamurthy Anil S, Dave Hansen,
	Matthew C. Dobson [imap], Linux Kernel Mailing List

On Thu, Jul 22, 2004 at 04:55:40PM -0500, Nathan Lynch wrote:
> On Mon, 2004-07-19 at 02:31, Nick Piggin wrote:
> > Keshavamurthy Anil S wrote:
> > > Even on my system which is Intel 865 chipset (P4 with HT enabled system) 
> > > I see a bug check somewhere in the scheduler_tick during boot.
> > > However if I move the sched_init_smp() after do_basic_setup() the
> > > kernel boots without any problem. Any clue here?

  This was happening even without CONFIG_SCHED_SMT and was later found to be
an ACPI bug. Sorry for the confusion.

> > 
> > There shouldn't be any problem doing that if we have to, obviously we
> > need to know why. Is it possible that cpu_sibling_map, or one of the
> > CPU masks isn't set up correctly at the time of the call?
> 
> In 2.6.8-rc1-mm1 at least, backing this patch out fixed it for me on
> ppc64:
> 
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc1/2.6.8-rc1-mm1/broken-out/detect-too-early-schedule-attempts.patch
> 
> Code with statements of the form:
> 
> if (system_state == SYSTEM_BOOTING)
> 	/* do something boot-specific */
> else
> 	/* do something assuming system_state == SYSTEM_RUNNING */
> 
> is broken by this change.  Parts of the cpu bringup code in arch/ppc64
> do this (and thus need to be fixed if the above change is kept). 
> Chances are there is similar code in some x86 setups.
> 
> Nathan
> 


* Re: sched domains bringup race?
  2004-07-22 21:55     ` Nathan Lynch
  2004-07-22 23:23       ` Keshavamurthy Anil S
@ 2004-07-23  0:33       ` Nick Piggin
  1 sibling, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2004-07-23  0:33 UTC (permalink / raw)
  To: Nathan Lynch
  Cc: Keshavamurthy Anil S, Dave Hansen, Matthew C. Dobson [imap],
	Linux Kernel Mailing List, Andrew Morton

Nathan Lynch wrote:

>On Mon, 2004-07-19 at 02:31, Nick Piggin wrote:
>
>>Keshavamurthy Anil S wrote:
>>
>>>Even on my system which is Intel 865 chipset (P4 with HT enabled system) 
>>>I see a bug check somewhere in the scheduler_tick during boot.
>>>However if I move the sched_init_smp() after do_basic_setup() the
>>>kernel boots without any problem. Any clue here?
>>>
>>There shouldn't be any problem doing that if we have to, obviously we
>>need to know why. Is it possible that cpu_sibling_map, or one of the
>>CPU masks isn't set up correctly at the time of the call?
>>
>
>In 2.6.8-rc1-mm1 at least, backing this patch out fixed it for me on
>ppc64:
>
>http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc1/2.6.8-rc1-mm1/broken-out/detect-too-early-schedule-attempts.patch
>
>Code with statements of the form:
>
>if (system_state == SYSTEM_BOOTING)
>	/* do something boot-specific */
>else
>	/* do something assuming system_state == SYSTEM_RUNNING */
>
>is broken by this change.  Parts of the cpu bringup code in arch/ppc64
>do this (and thus need to be fixed if the above change is kept). 
>Chances are there is similar code in some x86 setups.
>
>

That patch can be dropped AFAIKS.

sched-clean-init-idle.patch introduces a better check.



