* Oops in find_busiest_group(): 2.6.8-rc1-mm1
@ 2004-07-15 6:04 Dave Hansen
2004-07-15 6:15 ` Nick Piggin
2004-07-29 6:42 ` Paul Jackson
0 siblings, 2 replies; 9+ messages in thread
From: Dave Hansen @ 2004-07-15 6:04 UTC
To: PPC64 External List; +Cc: Linux Kernel Mailing List, Nick Piggin
Looks like 'find_busiest_group()::this' is null:
cpu 0x1: Vector: 380 (Data SLB Access) at [c0000002ffe0b420]
pc: c000000000046644: .find_busiest_group+0x24c/0x470
lr: c00000000004681c: .find_busiest_group+0x424/0x470
sp: c0000002ffe0b6a0
msr: 8000000000001032
dar: 10
current = 0xc0000002fff70da0
paca = 0xc00000000033c900
pid = 0, comm = swapper
...
1:mon> r
R00 = 0000000000000080 R16 = 0000000000000080
R01 = c0000002ffe0b6a0 R17 = 0000000000000080
R02 = c0000000004a5470 R18 = 0000000000000080
R03 = 0000000000000046 R19 = c00000000adfb408
R04 = c00000000050dd27 R20 = 0000000000000001
R05 = c00000000052dd50 R21 = 0000000000000000
R06 = c0000000003b7828 R22 = 0000000000000000
R07 = fffffffffffe0cb8 R23 = c0000002ffe0b710
R08 = c00000000050d180 R24 = c0000000004a2008
R09 = c00000000050d918 R25 = c000000000330c38
R10 = 0000000000000000 R26 = c000000000330c38
R11 = 0000000000000001 R27 = 0000000000000001
R12 = 0000000000000010 R28 = c00000000050d198
R13 = c00000000033c900 R29 = 0000000000000080
R14 = 0000000000000000 R30 = c0000000003c29e8
R15 = c000000000330c38 R31 = c0000002ffe0b6a0
pc = c000000000046644 .find_busiest_group+0x24c/0x470
lr = c00000000004681c .find_busiest_group+0x424/0x470
msr = 8000000000001032 cr = 88428428
ctr = c0000000001527a8 xer = 0000000000000000 trap = 380
I put a little printk in:
	/* How much load to actually move to equalise the imbalance */
	if (!busiest || !this)
		printk("%s() busiest: %p this: %p\n", __func__, busiest, this);
	*imbalance = (*imbalance * min(busiest->cpu_power, this->cpu_power))
				/ SCHED_LOAD_SCALE;
And sure enough, this showed up on the console:
find_busiest_group() busiest: c00000000050d180 this: 0000000000000000
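For readers following the trap data: if the dar value is read as hex, a fault address of 0x10 is what a load of this->cpu_power through a NULL 'this' would produce when cpu_power sits 16 bytes into the group structure. The user-space sketch below illustrates that under stated assumptions: mock_sched_group, its field order, the one-word cpumask and MOCK_SCHED_LOAD_SCALE are hypothetical stand-ins, not the kernel's definitions. It also shows the imbalance scaling with a NULL check, which would only mask the symptom and is not a fix for the underlying problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's group structure. The field
 * order and the single-word cpumask are assumptions for illustration. */
struct mock_sched_group {
	struct mock_sched_group *next;	/* circular list link */
	unsigned long cpumask;		/* assumes NR_CPUS <= BITS_PER_LONG */
	unsigned long cpu_power;
};

#define MOCK_SCHED_LOAD_SCALE 128UL	/* assumed value, illustration only */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Scale *imbalance as in the quoted snippet, but refuse to dereference a
 * missing group. Returns 0 on success, -1 if either group is NULL (in
 * which case *imbalance is left untouched). */
static int scale_imbalance(unsigned long *imbalance,
			   const struct mock_sched_group *busiest,
			   const struct mock_sched_group *this_group)
{
	assert(imbalance != NULL);
	if (!busiest || !this_group)
		return -1;
	*imbalance = (*imbalance *
		      min_ul(busiest->cpu_power, this_group->cpu_power))
			/ MOCK_SCHED_LOAD_SCALE;
	return 0;
}

/* Address that reading ->cpu_power through a NULL pointer would touch. */
static size_t null_cpu_power_fault_address(void)
{
	return offsetof(struct mock_sched_group, cpu_power);
}
```

With this layout on an LP64 machine, null_cpu_power_fault_address() is the pointer size plus one long, i.e. 16 bytes, which lines up with the reported dar. The parameter is named this_group rather than 'this' only to keep the sketch valid as C++ too.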
This code also looks funny to begin with:
> 		if (local_group) {
> 			this_load = avg_load;
> 			this = group;
> 			goto nextgroup;
> 		} else if (avg_load > max_load) {
> 			max_load = avg_load;
> 			busiest = group;
> 		}
> nextgroup:
> 		group = group->next;
> 	} while (group != sd->groups);
Why bother with the 'goto nextgroup;'? Shouldn't the first if block
just fall through to the target of the goto anyway?
-- Dave
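A small user-space sketch of the point about the goto, assuming a hypothetical mock_group ring in place of the scheduler's groups: with nothing between the if/else chain and the label, the loop with the goto and the loop without it should pick the same local and busiest groups. All names here are illustrative, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified group: a ring node with a precomputed load. */
struct mock_group {
	struct mock_group *next;
	int is_local;			/* stands in for local_group */
	unsigned long avg_load;
};

struct pick {
	const struct mock_group *this_group;
	const struct mock_group *busiest;
};

/* Loop shaped like the quoted code, including the goto. */
static struct pick pick_with_goto(const struct mock_group *head)
{
	struct pick p = { NULL, NULL };
	unsigned long max_load = 0;
	const struct mock_group *group = head;

	assert(head != NULL);
	do {
		if (group->is_local) {
			p.this_group = group;
			goto nextgroup;
		} else if (group->avg_load > max_load) {
			max_load = group->avg_load;
			p.busiest = group;
		}
nextgroup:
		group = group->next;
	} while (group != head);
	return p;
}

/* Same loop with the goto removed: the if block simply falls through. */
static struct pick pick_without_goto(const struct mock_group *head)
{
	struct pick p = { NULL, NULL };
	unsigned long max_load = 0;
	const struct mock_group *group = head;

	assert(head != NULL);
	do {
		if (group->is_local) {
			p.this_group = group;
		} else if (group->avg_load > max_load) {
			max_load = group->avg_load;
			p.busiest = group;
		}
		group = group->next;
	} while (group != head);
	return p;
}
```

Note that the local group is never a candidate for busiest in either version, even when its load is the highest, because the else-if is skipped whenever is_local is set.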
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Nick Piggin @ 2004-07-15 6:15 UTC
To: Dave Hansen; +Cc: PPC64 External List, Linux Kernel Mailing List, Nick Piggin
Dave Hansen wrote:
> Looks like 'find_busiest_group()::this' is null:
>
> cpu 0x1: Vector: 380 (Data SLB Access) at [c0000002ffe0b420]
> pc: c000000000046644: .find_busiest_group+0x24c/0x470
> lr: c00000000004681c: .find_busiest_group+0x424/0x470
> sp: c0000002ffe0b6a0
> msr: 8000000000001032
> dar: 10
> current = 0xc0000002fff70da0
> paca = 0xc00000000033c900
> pid = 0, comm = swapper
> ...
> 1:mon> r
> R00 = 0000000000000080 R16 = 0000000000000080
> R01 = c0000002ffe0b6a0 R17 = 0000000000000080
> R02 = c0000000004a5470 R18 = 0000000000000080
> R03 = 0000000000000046 R19 = c00000000adfb408
> R04 = c00000000050dd27 R20 = 0000000000000001
> R05 = c00000000052dd50 R21 = 0000000000000000
> R06 = c0000000003b7828 R22 = 0000000000000000
> R07 = fffffffffffe0cb8 R23 = c0000002ffe0b710
> R08 = c00000000050d180 R24 = c0000000004a2008
> R09 = c00000000050d918 R25 = c000000000330c38
> R10 = 0000000000000000 R26 = c000000000330c38
> R11 = 0000000000000001 R27 = 0000000000000001
> R12 = 0000000000000010 R28 = c00000000050d198
> R13 = c00000000033c900 R29 = 0000000000000080
> R14 = 0000000000000000 R30 = c0000000003c29e8
> R15 = c000000000330c38 R31 = c0000002ffe0b6a0
> pc = c000000000046644 .find_busiest_group+0x24c/0x470
> lr = c00000000004681c .find_busiest_group+0x424/0x470
> msr = 8000000000001032 cr = 88428428
> ctr = c0000000001527a8 xer = 0000000000000000 trap = 380
>
> I put a little printk in:
>
> /* How much load to actually move to equalise the imbalance */
> if (!busiest || !this)
> printk("%s() busiest: %p this: %p\n", __func__, busiest, this);
> *imbalance = (*imbalance * min(busiest->cpu_power, this->cpu_power))
> / SCHED_LOAD_SCALE;
>
> And sure enough, this showed up on the console:
>
> find_busiest_group() busiest: c00000000050d180 this: 0000000000000000
>
OK, it is overdue for a bit of an audit anyway, so I'll see if I can
see what is going wrong.
> This code also looks funny to begin with:
>
>
>> if (local_group) {
>> this_load = avg_load;
>> this = group;
>> goto nextgroup;
>> } else if (avg_load > max_load) {
>> max_load = avg_load;
>> busiest = group;
>> }
>>nextgroup:
>> group = group->next;
>> } while (group != sd->groups);
>
>
> Why bother with the 'goto nextgroup;'? Shouldn't the first if block
> just fall through to the target of the goto anyway?
>
It was there for a good reason once... obviously no point to it now.
Thanks for the report.
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Paul Jackson @ 2004-07-29 6:42 UTC
To: Dave Hansen; +Cc: linuxppc64-dev, linux-kernel, piggin
I just hit what might be the same oops.
I had not upgraded my working kernel for a month, and just now, when I
upgraded to 2.6.8-rc2-mm1, running sn2_defconfig on a small SN2 system,
it fails to boot every time, ending with an Oops that starts out with:
======================================================
Freeing unused kernel memory: 320kB freed
Unable to handle kernel NULL pointer dereference (address 0000000000000008)
swapper[0]: Oops 8813272891392 [1]
Modules linked in:
Pid: 0, CPU 0, comm: swapper
psr : 0000101008022018 ifs : 8000000000000e20 ip : [<a0000001000bd710>] Not tainted
ip is at find_busiest_group+0xb0/0x640
======================================================
I added a conditional printk_ratelimit'ed print at the top of
find_busiest_group() whenever group is NULL, just before the first
dereference of group in the line:
local_group = cpu_isset(this_cpu, group->cpumask);
That print fires about 20,480 times each 5 second suppression window.
But it boots, if I also add code to break out of the "do { ... } while
(group != sd->groups)" loop, whenever group goes NULL.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
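The workaround described in this message can be sketched in user space as below. mock_group and sum_ring_load() are hypothetical names; the point is only the shape of the loop. A do { ... } while (group != head) walk over a ring cannot terminate cleanly once a next pointer (or the head itself) is NULL, so the guard bails out instead of dereferencing it. As in the report, this hides a badly built list rather than repairing it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified group: a ring node carrying a load value. */
struct mock_group {
	struct mock_group *next;
	unsigned long load;
};

/* Sum the load of every group reachable from head.
 * Returns 0 if the ring was walked completely, -1 if a NULL group was
 * met (including a NULL head); *total then holds the partial sum. */
static int sum_ring_load(const struct mock_group *head, unsigned long *total)
{
	const struct mock_group *group = head;

	assert(total != NULL);
	*total = 0;
	do {
		if (!group)		/* ring is broken: stop, do not oops */
			return -1;
		*total += group->load;
		group = group->next;
	} while (group != head);
	return 0;
}
```

A caller can treat the -1 return as the place to emit a rate-limited warning, which is roughly the role the printk plays in the report.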
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Nick Piggin @ 2004-07-29 8:33 UTC
To: Paul Jackson; +Cc: Dave Hansen, linuxppc64-dev, linux-kernel
Paul Jackson wrote:
>I just hit what might be the same oops.
>
>I had not upgraded my working kernel for a month, and just now, when I
>upgraded to 2.6.8-rc2-mm1, running sn2_defconfig on a small SN2 system,
>it fails to boot every time, ending with an Oops that starts out with:
>
>======================================================
>Freeing unused kernel memory: 320kB freed
>Unable to handle kernel NULL pointer dereference (address 0000000000000008)
>swapper[0]: Oops 8813272891392 [1]
>Modules linked in:
>
>Pid: 0, CPU 0, comm: swapper
>psr : 0000101008022018 ifs : 8000000000000e20 ip : [<a0000001000bd710>] Not tainted
>ip is at find_busiest_group+0xb0/0x640
>======================================================
>
>I added a conditional printk_ratelimit'ed print at the top of
>find_busiest_group() whenever group is NULL, just before the first
>dereference of group in the line:
>
> local_group = cpu_isset(this_cpu, group->cpumask);
>
>That print fires about 20,480 times each 5 second suppression window.
>
>But it boots, if I also add code to break out of the "do { ... } while
>(group != sd->groups)" loop, whenever group goes NULL.
>
>
OK, I still can't work out why this is happening. Can you try with
2.6.8-rc2-mm1? Does it happen continually after the system has booted?
If it happens in 2.6.8-rc2-mm1, comment out the call to cpu_attach_domain
in kernel/sched.c (so you'll only be using the dummy boot-up domain).
Does that fix it?
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Paul Jackson @ 2004-07-29 9:29 UTC
To: Nick Piggin; +Cc: haveblue, linuxppc64-dev, linux-kernel
Nick writes:
> Can you try with 2.6.8-rc2-mm1?
This _is_ with 2.6.8-rc2-mm1.
> Does it happen continually after the system has booted?
Yes - nonstop - 4 times per millisecond, for at least as
long as the machine has been up (I'm rebooting every few
minutes, for other reasons ...).
> comment out the call to cpu_attach_domain ... Does that fix it?
Yes - that fixes it. My ratelimited printks on NULL group cease.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.650.933.1373
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Nick Piggin @ 2004-07-29 10:36 UTC
To: Paul Jackson; +Cc: haveblue, linuxppc64-dev, linux-kernel, Jesse Barnes
Paul Jackson wrote:
> Nick writes:
>
>> Can you try with 2.6.8-rc2-mm1?
>
>
> This _is_ with 2.6.8-rc2-mm1.
>
>
>>Does it happen continually after the system has booted?
>
>
> Yes - nonstop - 4 times per millisecond, for at least as
> long as the machine has been up (I'm rebooting every few
> minutes, for other reasons ...).
>
>
>>comment out the call to cpu_attach_domain ... Does that fix it?
>
>
> Yes - that fixes it. My ratelimited printks on NULL group cease.
>
Hmm, nothing else seems to be oopsing. Maybe it is the ia64
domain setup code that Jesse did? The domains/groups must
not have been built properly somewhere.
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Dave Hansen @ 2004-07-29 12:22 UTC
To: Nick Piggin
Cc: Paul Jackson, PPC64 External List, Linux Kernel Mailing List,
Jesse Barnes
On Thu, 2004-07-29 at 03:36, Nick Piggin wrote:
> Hmm, nothing else seems to be oopsing. Maybe it is the ia64
> domain setup code that Jesse did? The domains/groups must
> not have been built properly somewhere.
Does backing this patch out help? It did on ppc64.
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.8-rc1/2.6.8-rc1-mm1/broken-out/detect-too-early-schedule-attempts.patch
-- Dave
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Dimitri Sivanich @ 2004-07-29 15:35 UTC
To: Nick Piggin
Cc: Paul Jackson, haveblue, linuxppc64-dev, linux-kernel,
Jesse Barnes
On Thu, Jul 29, 2004 at 08:36:57PM +1000, Nick Piggin wrote:
> Hmm, nothing else seems to be oopsing. Maybe it is the ia64
> domain setup code that Jesse did? The domains/groups must
> not have been built properly somewhere.
Here's a patch to 2.6.8-rc2-mm1 that allows things to work:
--- sched.c.old	2004-07-29 10:11:00.000000000 -0500
+++ sched.c	2004-07-29 10:27:58.000000000 -0500
@@ -3770,8 +3770,6 @@ __init static void arch_init_sched_domai
 		cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
 
 #ifdef CONFIG_NUMA
-		if (i != first_cpu(sd->groups->cpumask))
-			continue;
 		sd = &per_cpu(node_domains, i);
 		group = cpu_to_node_group(i);
 		*sd = SD_NODE_INIT;
* Re: Oops in find_busiest_group(): 2.6.8-rc1-mm1
From: Jesse Barnes @ 2004-07-29 15:49 UTC
To: Dimitri Sivanich
Cc: Nick Piggin, Paul Jackson, haveblue, linuxppc64-dev, linux-kernel,
Jesse Barnes
On Thursday, July 29, 2004 8:35 am, Dimitri Sivanich wrote:
> Here's a patch to 2.6.8-rc2-mm1 that allows things to work:
>
> --- sched.c.old 2004-07-29 10:11:00.000000000 -0500
> +++ sched.c 2004-07-29 10:27:58.000000000 -0500
> @@ -3770,8 +3770,6 @@ __init static void arch_init_sched_domai
> cpumask_t nodemask = node_to_cpumask(cpu_to_node(i));
>
> #ifdef CONFIG_NUMA
> - if (i != first_cpu(sd->groups->cpumask))
> - continue;
> sd = &per_cpu(node_domains, i);
> group = cpu_to_node_group(i);
> *sd = SD_NODE_INIT;
Yep, this was a merge error. I posted it as the first reply (f1rst p0st!) to
Andrew's 2.6.8-rc2-mm1 announcement. Sorry for the trouble, my last patch
didn't include it, but there was some confusion since there were several
fixes to the scheduler code posted to Nick's 'consolidate sched domains'
thread.
Jesse
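The thread does not spell out how the stray check produced NULL groups, so the following is only a toy illustration of the general hazard, not a reconstruction of arch_init_sched_domains(). It assumes a per-CPU setup loop where a leftover 'continue' skips initialisation for every CPU except the first one in a group; mock_domain, mock_group and init_domains() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_CPUS 4

/* Hypothetical, simplified stand-ins for a group and a per-CPU domain. */
struct mock_group {
	int first_cpu;			/* lowest CPU in the group's mask */
};

struct mock_domain {
	struct mock_group *groups;	/* NULL until the setup loop runs */
};

/* Point every CPU's domain at node_group. When skip_non_first is set,
 * mimic a leftover early 'continue' that skips all but the group's
 * first CPU. Returns how many domains are left with groups == NULL. */
static int init_domains(struct mock_domain doms[MOCK_NR_CPUS],
			struct mock_group *node_group, int skip_non_first)
{
	int cpu, uninitialised = 0;

	assert(doms != NULL && node_group != NULL);
	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		doms[cpu].groups = NULL;

	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++) {
		if (skip_non_first && cpu != node_group->first_cpu)
			continue;	/* the kind of line the patch removes */
		doms[cpu].groups = node_group;
	}

	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++)
		if (!doms[cpu].groups)
			uninitialised++;
	return uninitialised;
}
```

In this toy model, any balancing code that later starts from doms[cpu].groups on one of the skipped CPUs begins with a NULL group. That would be consistent with a NULL 'group' at the top of find_busiest_group(), but the link to the real code is an assumption.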