Oops in find_busiest_group(): 2.6.8-rc1-mm1

public inbox for linux-kernel@vger.kernel.org

* Oops in find_busiest_group(): 2.6.8-rc1-mm1
@ 2004-07-15  6:04 Dave Hansen
  2004-07-15  6:15 ` Nick Piggin
  2004-07-29  6:42 ` Paul Jackson
  0 siblings, 2 replies; 9+ messages in thread
From: Dave Hansen @ 2004-07-15  6:04 UTC
  To: PPC64 External List; +Cc: Linux Kernel Mailing List, Nick Piggin

Looks like 'find_busiest_group()::this' is null:

cpu 0x1: Vector: 380 (Data SLB Access) at [c0000002ffe0b420]
    pc: c000000000046644: .find_busiest_group+0x24c/0x470
    lr: c00000000004681c: .find_busiest_group+0x424/0x470
    sp: c0000002ffe0b6a0
   msr: 8000000000001032
   dar: 10
  current = 0xc0000002fff70da0
  paca    = 0xc00000000033c900
    pid   = 0, comm = swapper
...
1:mon> r
R00 = 0000000000000080   R16 = 0000000000000080
R01 = c0000002ffe0b6a0   R17 = 0000000000000080
R02 = c0000000004a5470   R18 = 0000000000000080
R03 = 0000000000000046   R19 = c00000000adfb408
R04 = c00000000050dd27   R20 = 0000000000000001
R05 = c00000000052dd50   R21 = 0000000000000000
R06 = c0000000003b7828   R22 = 0000000000000000
R07 = fffffffffffe0cb8   R23 = c0000002ffe0b710
R08 = c00000000050d180   R24 = c0000000004a2008
R09 = c00000000050d918   R25 = c000000000330c38
R10 = 0000000000000000   R26 = c000000000330c38
R11 = 0000000000000001   R27 = 0000000000000001
R12 = 0000000000000010   R28 = c00000000050d198
R13 = c00000000033c900   R29 = 0000000000000080
R14 = 0000000000000000   R30 = c0000000003c29e8
R15 = c000000000330c38   R31 = c0000002ffe0b6a0
pc  = c000000000046644 .find_busiest_group+0x24c/0x470
lr  = c00000000004681c .find_busiest_group+0x424/0x470
msr = 8000000000001032   cr  = 88428428
ctr = c0000000001527a8   xer = 0000000000000000   trap =      380

I put a little printk in:

        /* How much load to actually move to equalise the imbalance */
        if (!busiest || !this)
                printk("%s() busiest: %p this: %p\n", __func__, busiest, this);
        *imbalance = (*imbalance * min(busiest->cpu_power, this->cpu_power))
                                / SCHED_LOAD_SCALE;

And sure enough, this showed up on the console:

find_busiest_group() busiest: c00000000050d180 this: 0000000000000000
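
The dar value of 10 in the dump is what one would expect from reading a
member at offset 0x10 through a NULL struct pointer. A minimal sketch,
assuming a hypothetical three-field layout (struct fake_group, its field
widths, and member_address() are illustrative assumptions, not taken from
the kernel source):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout for illustration only; the real struct sched_group
 * in 2.6.8-rc1-mm1 may differ. Fixed-width fields keep the offsets the
 * same on every platform.
 */
struct fake_group {
	uint64_t next;       /* stands in for the 'next' pointer */
	uint64_t cpumask;    /* stands in for a 64-bit cpumask */
	uint64_t cpu_power;
};

/* Address that a load of the member at 'offset' would touch for an
 * object located at 'base'. */
static uint64_t member_address(uint64_t base, size_t offset)
{
	assert(offset < sizeof(struct fake_group));
	return base + (uint64_t)offset;
}
```

Under that assumed layout, member_address(0, offsetof(struct fake_group,
cpu_power)) evaluates to 0x10, so a NULL 'this' would fault at address 10.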

This code also looks funny to begin with:

>                 if (local_group) {
>                         this_load = avg_load;
>                         this = group;
>                         goto nextgroup;
>                 } else if (avg_load > max_load) {
>                         max_load = avg_load;
>                         busiest = group;
>                 }
> nextgroup:
>                 group = group->next;
>         } while (group != sd->groups);

Why bother with the 'goto nextgroup;'?  Shouldn't the first if block
just fall through to the target of the goto anyway?
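
A minimal sketch of why the goto looks redundant, assuming simplified
stand-in types (struct group, struct scan_result and both scan functions
are hypothetical, not the kernel's). The two loops compute the same
result, and both leave the local-group pointer NULL if no group is ever
flagged as local:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler group on a circular list. */
struct group {
	unsigned long avg_load;
	int is_local;            /* plays the role of local_group */
	struct group *next;
};

struct scan_result {
	struct group *this_grp;  /* stays NULL if no group is local */
	struct group *busiest;
	unsigned long this_load;
	unsigned long max_load;
};

/* Loop shaped like the quoted code, goto included. */
static struct scan_result scan_with_goto(struct group *head)
{
	struct scan_result r = { NULL, NULL, 0, 0 };
	struct group *g = head;

	assert(head != NULL);
	do {
		if (g->is_local) {
			r.this_load = g->avg_load;
			r.this_grp = g;
			goto nextgroup;
		} else if (g->avg_load > r.max_load) {
			r.max_load = g->avg_load;
			r.busiest = g;
		}
nextgroup:
		g = g->next;
	} while (g != head);
	return r;
}

/* Same loop with the goto dropped; the if/else already falls through. */
static struct scan_result scan_without_goto(struct group *head)
{
	struct scan_result r = { NULL, NULL, 0, 0 };
	struct group *g = head;

	assert(head != NULL);
	do {
		if (g->is_local) {
			r.this_load = g->avg_load;
			r.this_grp = g;
		} else if (g->avg_load > r.max_load) {
			r.max_load = g->avg_load;
			r.busiest = g;
		}
		g = g->next;
	} while (g != head);
	return r;
}
```
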

-- Dave



* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-15  6:04 Oops in find_busiest_group(): 2.6.8-rc1-mm1 Dave Hansen
@ 2004-07-15  6:15 ` Nick Piggin
  2004-07-29  6:42 ` Paul Jackson
  1 sibling, 0 replies; 9+ messages in thread
From: Nick Piggin @ 2004-07-15  6:15 UTC
  To: Dave Hansen; +Cc: PPC64 External List, Linux Kernel Mailing List, Nick Piggin

Dave Hansen wrote:
> Looks like 'find_busiest_group()::this' is null:
> 
> cpu 0x1: Vector: 380 (Data SLB Access) at [c0000002ffe0b420]
>     pc: c000000000046644: .find_busiest_group+0x24c/0x470
>     lr: c00000000004681c: .find_busiest_group+0x424/0x470
>     sp: c0000002ffe0b6a0
>    msr: 8000000000001032
>    dar: 10
>   current = 0xc0000002fff70da0
>   paca    = 0xc00000000033c900
>     pid   = 0, comm = swapper
> ...
> 1:mon> r
> R00 = 0000000000000080   R16 = 0000000000000080
> R01 = c0000002ffe0b6a0   R17 = 0000000000000080
> R02 = c0000000004a5470   R18 = 0000000000000080
> R03 = 0000000000000046   R19 = c00000000adfb408
> R04 = c00000000050dd27   R20 = 0000000000000001
> R05 = c00000000052dd50   R21 = 0000000000000000
> R06 = c0000000003b7828   R22 = 0000000000000000
> R07 = fffffffffffe0cb8   R23 = c0000002ffe0b710
> R08 = c00000000050d180   R24 = c0000000004a2008
> R09 = c00000000050d918   R25 = c000000000330c38
> R10 = 0000000000000000   R26 = c000000000330c38
> R11 = 0000000000000001   R27 = 0000000000000001
> R12 = 0000000000000010   R28 = c00000000050d198
> R13 = c00000000033c900   R29 = 0000000000000080
> R14 = 0000000000000000   R30 = c0000000003c29e8
> R15 = c000000000330c38   R31 = c0000002ffe0b6a0
> pc  = c000000000046644 .find_busiest_group+0x24c/0x470
> lr  = c00000000004681c .find_busiest_group+0x424/0x470
> msr = 8000000000001032   cr  = 88428428
> ctr = c0000000001527a8   xer = 0000000000000000   trap =      380
> 
> I put a little printk in:
> 
>         /* How much load to actually move to equalise the imbalance */
>         if (!busiest || !this)
>                 printk("%s() busiest: %p this: %p\n", __func__, busiest, this);
>         *imbalance = (*imbalance * min(busiest->cpu_power, this->cpu_power))
>                                 / SCHED_LOAD_SCALE;
> 
> And sure enough, this showed up on the console:
> 
> find_busiest_group() busiest: c00000000050d180 this: 0000000000000000
> 

OK, it is overdue for a bit of an audit anyway, so I'll see if I can
see what is going wrong.

> This code also looks funny to begin with:
> 
> 
>>                if (local_group) {
>>                        this_load = avg_load;
>>                        this = group;
>>                        goto nextgroup;
>>                } else if (avg_load > max_load) {
>>                        max_load = avg_load;
>>                        busiest = group;
>>                }
>>nextgroup:
>>                group = group->next;
>>        } while (group != sd->groups);
> 
> 
> Why bother with the 'goto nextgroup;'?  Shouldn't the first if block
> just fall through to the target of the goto anyway?
> 

It was there for a good reason once... obviously no point to it now.
Thanks for the report.


* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-15  6:04 Oops in find_busiest_group(): 2.6.8-rc1-mm1 Dave Hansen
  2004-07-15  6:15 ` Nick Piggin
@ 2004-07-29  6:42 ` Paul Jackson
  2004-07-29  8:33   ` Nick Piggin
  1 sibling, 1 reply; 9+ messages in thread
From: Paul Jackson @ 2004-07-29  6:42 UTC
  To: Dave Hansen; +Cc: linuxppc64-dev, linux-kernel, piggin

I just hit what might be the same oops.

I had not upgraded my working kernel for a month, and just now, when I
upgraded to 2.6.8-rc2-mm1, running sn2_defconfig on a small SN2 system,
it fails to boot every time, ending with an Oops that starts out with:

======================================================
Freeing unused kernel memory: 320kB freed
Unable to handle kernel NULL pointer dereference (address 0000000000000008)
swapper[0]: Oops 8813272891392 [1]
Modules linked in:

Pid: 0, CPU 0, comm:              swapper
psr : 0000101008022018 ifs : 8000000000000e20 ip  : [<a0000001000bd710>]    Not tainted
ip is at find_busiest_group+0xb0/0x640
======================================================

I added a conditional printk_ratelimit'ed print at the top of
find_busiest_group() whenever group is NULL, just before the first
dereference of group in the line:

	local_group = cpu_isset(this_cpu, group->cpumask);

That print fires about 20,480 times per 5-second suppression window.

But it boots, if I also add code to break out of the "do { ... } while
(group != sd->groups)" loop, whenever group goes NULL.
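
A minimal sketch of that kind of guard, assuming a simplified group type
(struct group and walk_groups_guarded() are hypothetical names, not the
kernel's). The walk expects a circular list and bails out instead of
dereferencing NULL when the ring turns out to be broken:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a group on a list that should be circular. */
struct group {
	struct group *next;
};

/*
 * Walk the list starting at head. Returns the number of groups visited
 * and sets *broken to 1 if a NULL link is hit before getting back to head.
 */
static int walk_groups_guarded(struct group *head, int *broken)
{
	struct group *g = head;
	int visited = 0;

	assert(broken != NULL);
	*broken = 0;
	do {
		if (g == NULL) {
			/* Ring is not closed: stop rather than dereference NULL. */
			*broken = 1;
			break;
		}
		visited++;
		g = g->next;
	} while (g != head);
	return visited;
}
```

A guard like this only hides the symptom; a properly built ring never
yields a NULL next pointer.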

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29  6:42 ` Paul Jackson
@ 2004-07-29  8:33   ` Nick Piggin
  2004-07-29  9:29     ` Paul Jackson
  0 siblings, 1 reply; 9+ messages in thread
From: Nick Piggin @ 2004-07-29  8:33 UTC
  To: Paul Jackson; +Cc: Dave Hansen, linuxppc64-dev, linux-kernel



Paul Jackson wrote:

>I just hit what might be the same oops.
>
>I had not upgraded my working kernel for a month, and just now, when I
>upgraded to 2.6.8-rc2-mm1, running sn2_defconfig on a small SN2 system,
>it fails to boot every time, ending with an Oops that starts out with:
>
>======================================================
>Freeing unused kernel memory: 320kB freed
>Unable to handle kernel NULL pointer dereference (address 0000000000000008)
>swapper[0]: Oops 8813272891392 [1]
>Modules linked in:
>
>Pid: 0, CPU 0, comm:              swapper
>psr : 0000101008022018 ifs : 8000000000000e20 ip  : [<a0000001000bd710>]    Not tainted
>ip is at find_busiest_group+0xb0/0x640
>======================================================
>
>I added a conditional printk_ratelimit'ed print at the top of
>find_busiest_group() whenever group is NULL, just before the first
>dereference of group in the line:
>
>	local_group = cpu_isset(this_cpu, group->cpumask);
>
>That print fires about 20,480 times per 5-second suppression window.
>
>But it boots, if I also add code to break out of the "do { ... } while
>(group != sd->groups)" loop, whenever group goes NULL.
>
>

OK, I still can't work out why this is happening. Can you try with
2.6.8-rc2-mm1? Does it happen continually after the system has booted?
If it happens in 2.6.8-rc2-mm1, comment out the call to cpu_attach_domain
in kernel/sched.c (so you'll only be using the dummy boot-up domain).
Does that fix it?



* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29  8:33   ` Nick Piggin
@ 2004-07-29  9:29     ` Paul Jackson
  2004-07-29 10:36       ` Nick Piggin
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Jackson @ 2004-07-29  9:29 UTC
  To: Nick Piggin; +Cc: haveblue, linuxppc64-dev, linux-kernel

Nick writes:
>  Can you try with 2.6.8-rc2-mm1?

This _is_ with 2.6.8-rc2-mm1.

> Does it happen continually after the system has booted?

Yes - nonstop - 4 times per millisecond, for at least as
long as the machine has been up (I'm rebooting every few
minutes, for other reasons ...).

> comment out the call to cpu_attach_domain ... Does that fix it?

Yes - that fixes it.  My ratelimited printks on NULL group cease.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29  9:29     ` Paul Jackson
@ 2004-07-29 10:36       ` Nick Piggin
  2004-07-29 12:22         ` Dave Hansen
  2004-07-29 15:35         ` Dimitri Sivanich
  0 siblings, 2 replies; 9+ messages in thread
From: Nick Piggin @ 2004-07-29 10:36 UTC
  To: Paul Jackson; +Cc: haveblue, linuxppc64-dev, linux-kernel, Jesse Barnes

Paul Jackson wrote:
> Nick writes:
> 
>> Can you try with 2.6.8-rc2-mm1?
> 
> 
> This _is_ with 2.6.8-rc2-mm1.
> 
> 
>>Does it happen continually after the system has booted?
> 
> 
> Yes - nonstop - 4 times per millisecond, for at least as
> long as the machine has been up (I'm rebooting every few
> minutes, for other reasons ...).
> 
> 
>>comment out the call to cpu_attach_domain ... Does that fix it?
> 
> 
> Yes - that fixes it.  My ratelimited printks on NULL group cease.
> 

Hmm, nothing else seems to be oopsing. Maybe it is the ia64
domain setup code that Jesse did? The domains/groups must
not have been built properly somewhere.


* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29 10:36       ` Nick Piggin
@ 2004-07-29 12:22         ` Dave Hansen
  2004-07-29 15:35         ` Dimitri Sivanich
  1 sibling, 0 replies; 9+ messages in thread
From: Dave Hansen @ 2004-07-29 12:22 UTC
  To: Nick Piggin
  Cc: Paul Jackson, PPC64 External List, Linux Kernel Mailing List,
	Jesse Barnes

On Thu, 2004-07-29 at 03:36, Nick Piggin wrote:
> Hmm, nothing else seems to be oopsing. Maybe it is the ia64
> domain setup code that Jesse did? The domains/groups must
> not have been built properly somewhere.

Does backing this patch out help?  It did on ppc64.

http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc1/2.6.8-rc1-mm1/broken-out/detect-too-early-schedule-attempts.patch

-- Dave



* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29 10:36       ` Nick Piggin
  2004-07-29 12:22         ` Dave Hansen
@ 2004-07-29 15:35         ` Dimitri Sivanich
  2004-07-29 15:49           ` Jesse Barnes
  1 sibling, 1 reply; 9+ messages in thread
From: Dimitri Sivanich @ 2004-07-29 15:35 UTC
  To: Nick Piggin
  Cc: Paul Jackson, haveblue, linuxppc64-dev, linux-kernel,
	Jesse Barnes

On Thu, Jul 29, 2004 at 08:36:57PM +1000, Nick Piggin wrote:
> Hmm, nothing else seems to be oopsing. Maybe it is the ia64
> domain setup code that Jesse did? The domains/groups must
> not have been built properly somewhere.

Here's a patch to 2.6.8-rc2-mm1 that allows things to work:

--- sched.c.old 2004-07-29 10:11:00.000000000 -0500
+++ sched.c     2004-07-29 10:27:58.000000000 -0500
@@ -3770,8 +3770,6 @@ __init static void arch_init_sched_domai
                cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
 
 #ifdef CONFIG_NUMA
-               if (i != first_cpu(sd->groups->cpumask))
-                       continue;
                sd = &per_cpu(node_domains, i);
                group = cpu_to_node_group(i);
                *sd = SD_NODE_INIT;



* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
  2004-07-29 15:35         ` Dimitri Sivanich
@ 2004-07-29 15:49           ` Jesse Barnes
  0 siblings, 0 replies; 9+ messages in thread
From: Jesse Barnes @ 2004-07-29 15:49 UTC
  To: Dimitri Sivanich
  Cc: Nick Piggin, Paul Jackson, haveblue, linuxppc64-dev, linux-kernel,
	Jesse Barnes

On Thursday, July 29, 2004 8:35 am, Dimitri Sivanich wrote:
> Here's a patch to 2.6.8-rc2-mm1 that allows things to work:
>
> --- sched.c.old 2004-07-29 10:11:00.000000000 -0500
> +++ sched.c     2004-07-29 10:27:58.000000000 -0500
> @@ -3770,8 +3770,6 @@ __init static void arch_init_sched_domai
>                 cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
>
>  #ifdef CONFIG_NUMA
> -               if (i != first_cpu(sd->groups->cpumask))
> -                       continue;
>                 sd = &per_cpu(node_domains, i);
>                 group = cpu_to_node_group(i);
>                 *sd = SD_NODE_INIT;

Yep, this was a merge error.  I posted it as the first reply (f1rst p0st!) to 
Andrew's 2.6.8-rc2-mm1 announcement.  Sorry for the trouble, my last patch 
didn't include it, but there was some confusion since there were several 
fixes to the scheduler code posted to Nick's 'consolidate sched domains' 
thread.

Jesse

