public inbox for linux-kernel@vger.kernel.org
* 2.6.8-rc1-mm1 "Badness in schedule" on ppc32
@ 2004-07-15  0:00 Mikael Pettersson
From: Mikael Pettersson @ 2004-07-15  0:00 UTC
  To: akpm; +Cc: jhf, linux-kernel, trini

On 2004-07-14 22:01:50, Tom Rini wrote:
>On Fri, Jul 09, 2004 at 02:11:03PM -0700, Andrew Morton wrote:
>
>> 
>> jhf@rivenstone.net (Joseph Fannin) wrote:
>> > 
>> > On Thu, Jul 08, 2004 at 11:50:25PM -0700, Andrew Morton wrote:
>> > > 
>> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm7/
>> > 
>> > > +detect-too-early-schedule-attempts.patch
>> > > 
>> > > Catch attempts to call the scheduler before it is ready to go.
>> > 
>> > With this patch, my Powermac (ppc32) spews 711 (I think)
>> > warning messages during bootup.
>> 
>> hm, OK.  It could be that the debug patch is a bit too aggressive, or that
>> ppc got lucky and happens to always be in state TASK_RUNNING when these
>> calls to schedule() occur.
>> 
>> Maybe this task incorrectly has _TIF_NEED_RESCHED set?
>> 
>> Anyway, ppc guys: please take a look at the results from
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.7/2.6.7-mm7/broken-out/detect-too-early-schedule-attempts.patch
>> and check that the kernel really should be calling schedule() at this time
>> and place, let us know?
>
>Now that kallsyms data is OK, I took a quick look.. and all of this
>comes from generic code, at least on the machine I tried.  So if the
>code shouldn't be calling schedule() then, it's a more generic problem..
>
>... or I'm not following.

On my ppc32 (G3 PowerMac) 2.6.8-rc1-mm1 throws a large number of
"Badness in schedule" during boot. Below are the ones I managed
to capture: they contain both generic traces, and traces involving
Mac-only drivers.

Some of the traces involve the PDC202XX_NEW driver; I'll move that
card into an x86 PC tomorrow to see if the traces reappear or not;
if they don't then it does look like a PPC32-specific problem.

The kernel .config is SMP=n, PREEMPT=n, no debugging nonsense :-)

/Mikael

 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e0bc] wait_for_completion+0x7c/0x118
 [c00e10b0] adb_request+0x170/0x238
 [c00e1334] do_adb_reset_bus+0x1bc/0x530
 [c00e1778] adb_probe_task+0x54/0xb8
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbc8] do_probe+0x68/0x2c8
 [c00ec214] probe_hwif+0x3a0/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbb8] do_probe+0x58/0x2c8
 [c00ec4a0] probe_hwif+0x62c/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbc8] do_probe+0x68/0x2c8
 [c00ec4a0] probe_hwif+0x62c/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c0038ef8] pdflush+0xc4/0x1f4
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e0bc] wait_for_completion+0x7c/0x118
 [c00e10b0] adb_request+0x170/0x238
 [c00e1334] do_adb_reset_bus+0x1bc/0x530
 [c00e1778] adb_probe_task+0x54/0xb8
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbb8] do_probe+0x58/0x2c8
 [c00ec214] probe_hwif+0x3a0/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbc8] do_probe+0x68/0x2c8
 [c00ec214] probe_hwif+0x3a0/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebe20] do_probe+0x2c0/0x2c8
 [c00ec214] probe_hwif+0x3a0/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbb8] do_probe+0x58/0x2c8
 [c00ec4a0] probe_hwif+0x62c/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e0bc] wait_for_completion+0x7c/0x118
 [c00e10b0] adb_request+0x170/0x238
 [c00e1334] do_adb_reset_bus+0x1bc/0x530
 [c00e1778] adb_probe_task+0x54/0xb8
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebbc8] do_probe+0x68/0x2c8
 [c00ec4a0] probe_hwif+0x62c/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c0021400] msleep+0x38/0x54
 [c00ebe20] do_probe+0x2c0/0x2c8
 [c00ec4a0] probe_hwif+0x62c/0x6b8
 [c00ed030] probe_hwif_init+0x18/0x88
 [c00efd30] ide_setup_pci_device+0x70/0x88
 [c0206bd0] init_setup_pdcnew+0x10/0x20
 [c0206d40] pdc202new_init_one+0x30/0x44
 [c0207788] ide_scan_pcidev+0x80/0xc4
 [c02077f8] ide_scan_pcibus+0x2c/0x11c
 [c02076e0] ide_init+0x68/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c00f32b0] pmac_ide_setup_device+0x11c/0x664
 [c0207ac0] pmac_ide_macio_attach+0x11c/0x27c
 [c00ddd04] macio_device_probe+0x78/0xa4
 [c00cbd74] bus_match+0x50/0x9c
 [c00cbef4] driver_attach+0x74/0xdc
 [c00cc30c] bus_add_driver+0xac/0x160
 [c00cc928] driver_register+0x30/0x40
 [c00de730] macio_register_driver+0x4c/0x68
 [c0207e74] pmac_ide_probe+0x38/0x54
 [c02076e4] ide_init+0x6c/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e638] schedule_timeout+0x80/0xe0
 [c00f32e0] pmac_ide_setup_device+0x14c/0x664
 [c0207ac0] pmac_ide_macio_attach+0x11c/0x27c
 [c00ddd04] macio_device_probe+0x78/0xa4
 [c00cbd74] bus_match+0x50/0x9c
 [c00cbef4] driver_attach+0x74/0xdc
 [c00cc30c] bus_add_driver+0xac/0x160
 [c00cc928] driver_register+0x30/0x40
 [c00de730] macio_register_driver+0x4c/0x68
 [c0207e74] pmac_ide_probe+0x38/0x54
 [c02076e4] ide_init+0x6c/0x90
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c017e0bc] wait_for_completion+0x7c/0x118
 [c00e10b0] adb_request+0x170/0x238
 [c00e1334] do_adb_reset_bus+0x1bc/0x530
 [c00e1778] adb_probe_task+0x54/0xb8
 [c0008268] kernel_thread+0x44/0x60
Badness in schedule at kernel/sched.c:2153
Call trace:
 [c0005d74] check_bug_trap+0x98/0xdc
 [c0005f0c] ProgramCheckException+0x154/0x220
 [c00055a0] ret_from_except_full+0x0/0x4c
 [c017da40] schedule+0x24/0x5fc
 [c00283a4] worker_thread+0x214/0x218
 [c002ca7c] kthread+0xec/0x128
 [c0008268] kernel_thread+0x44/0x60


* Re: 2.6.8-rc1-mm1 "Badness in schedule" on ppc32
@ 2004-07-15 19:08 Mikael Pettersson
  2004-07-15 20:27 ` Tom Rini
From: Mikael Pettersson @ 2004-07-15 19:08 UTC
  To: akpm; +Cc: jhf, linux-kernel, trini

On Thu, 15 Jul 2004 02:00:01 +0200 (MEST), Mikael Pettersson wrote:
>On my ppc32 (G3 PowerMac) 2.6.8-rc1-mm1 throws a large number of
>"Badness in schedule" during boot. Below are the ones I managed
>to capture: they contain both generic traces, and traces involving
>Mac-only drivers.
>
>Some of the traces involve the PDC202XX_NEW driver; I'll move that
>card into an x86 PC tomorrow to see if the traces reappear or not;
>if they don't then it does look like a PPC32-specific problem.

Tried that now but I've been unable to trigger any
"Badness in schedule" messages on the x86 box.
Looks like PPC32 has a problem in -mm.

/Mikael


* Re: 2.6.8-rc1-mm1 "Badness in schedule" on ppc32
  2004-07-15 19:08 Mikael Pettersson
@ 2004-07-15 20:27 ` Tom Rini
From: Tom Rini @ 2004-07-15 20:27 UTC
  To: Mikael Pettersson; +Cc: akpm, jhf, linux-kernel

On Thu, Jul 15, 2004 at 09:08:27PM +0200, Mikael Pettersson wrote:

> On Thu, 15 Jul 2004 02:00:01 +0200 (MEST), Mikael Pettersson wrote:
> >On my ppc32 (G3 PowerMac) 2.6.8-rc1-mm1 throws a large number of
> >"Badness in schedule" during boot. Below are the ones I managed
> >to capture: they contain both generic traces, and traces involving
> >Mac-only drivers.
> >
> >Some of the traces involve the PDC202XX_NEW driver; I'll move that
> >card into an x86 PC tomorrow to see if the traces reappear or not;
> >if they don't then it does look like a PPC32-specific problem.
> 
> Tried that now but I've been unable to trigger any
> "Badness in schedule" messages on the x86 box.
> Looks like PPC32 has a problem in -mm.

On x86, could you force the PDC202XX_NEW to dump_stack in the function
in question?  Perhaps there's a calling order issue on ppc.  Thanks.

-- 
Tom Rini
http://gate.crashing.org/~trini/


* Re: 2.6.8-rc1-mm1 "Badness in schedule" on ppc32
@ 2004-07-16 13:38 Mikael Pettersson
  2004-07-16 13:48 ` Nick Piggin
From: Mikael Pettersson @ 2004-07-16 13:38 UTC
  To: trini; +Cc: akpm, jhf, linux-kernel

On Thu, 15 Jul 2004 13:27:05 -0700, Tom Rini wrote:
>On Thu, Jul 15, 2004 at 09:08:27PM +0200, Mikael Pettersson wrote:
>
>> On Thu, 15 Jul 2004 02:00:01 +0200 (MEST), Mikael Pettersson wrote:
>> >On my ppc32 (G3 PowerMac) 2.6.8-rc1-mm1 throws a large number of
>> >"Badness in schedule" during boot. Below are the ones I managed
>> >to capture: they contain both generic traces, and traces involving
>> >Mac-only drivers.
>> >
>> >Some of the traces involve the PDC202XX_NEW driver; I'll move that
>> >card into an x86 PC tomorrow to see if the traces reappear or not;
>> >if they don't then it does look like a PPC32-specific problem.
>> 
>> Tried that now but I've been unable to trigger any
>> "Badness in schedule" messages on the x86 box.
>> Looks like PPC32 has a problem in -mm.
>
>On x86, could you force the PDC202XX_NEW to dump_stack in the function
>in question?  Perhaps there's a calling order issue on ppc.  Thanks.

I hacked pdc202xx_init_one() to dump_stack(), and upped ppc's
log buffer size to capture all badness messages. The ppc boot
log is a bit large, so I put both the ppc and x86 logs in
<http://www.csd.uu.se/~mikpe/linux/2.6.8-rc1-mm1-scheduler-badness/>.

All badness calls appear to emanate from sleeps/waits in init code
called from init/main.c:init(), which itself runs in a kernel thread.
It seems extremely fishy that the kernel considers the scheduler
off-limits even though threads have been created and started.

The init thread is itself created in init/main.c:rest_init():
>static void noinline rest_init(void)
>{
>	kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND);
>	numa_default_policy();
>	system_state = SYSTEM_BOOTING_SCHEDULER_OK;
>	unlock_kernel();
>	cpu_idle();
>}
system_state is changed only after the init thread is created.
Unless kernel_thread guarantees some execution ordering between
parent and child, I don't see how this could be race-free.

But I also don't see why ppc and x86 behave so differently here.

/Mikael


* Re: 2.6.8-rc1-mm1 "Badness in schedule" on ppc32
  2004-07-16 13:38 2.6.8-rc1-mm1 "Badness in schedule" on ppc32 Mikael Pettersson
@ 2004-07-16 13:48 ` Nick Piggin
From: Nick Piggin @ 2004-07-16 13:48 UTC
  To: Mikael Pettersson; +Cc: trini, akpm, jhf, linux-kernel

Mikael Pettersson wrote:
> On Thu, 15 Jul 2004 13:27:05 -0700, Tom Rini wrote:

[ much needed cutting ]

>>On x86, could you force the PDC202XX_NEW to dump_stack in the function
>>in question?  Perhaps there's a calling order issue on ppc.  Thanks.
> 
> 
> I hacked pdc202xx_init_one() to dump_stack(), and upped ppc's
> log buffer size to capture all badness messages. The ppc boot
> log is a bit large, so I put both the ppc and x86 logs in
> <http://www.csd.uu.se/~mikpe/linux/2.6.8-rc1-mm1-scheduler-badness/>.
> 
> All badness calls appear to emanate from sleeps/waits in init code
> called from init/main.c:init(), which itself runs in a kernel thread.
> It seems extremely fishy that the kernel considers the scheduler
> off-limits even though threads have been created and started.
> 
> The init thread is itself created in init/main.c:rest_init():
> 
>>static void noinline rest_init(void)
>>{
>>	kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND);
>>	numa_default_policy();
>>	system_state = SYSTEM_BOOTING_SCHEDULER_OK;
>>	unlock_kernel();
>>	cpu_idle();
>>} 
> 
> system_state is changed only after the init thread is created.
> Unless kernel_thread guarantees some execution ordering between
> parent and child, I don't see how this could be race-free.
> 
> But I also don't see why ppc and x86 behave so differently here.
> 

You must have missed my mail to the linuxppc list.

sched-clean-init-idle (which is in -mm) has the following hunk to
schedule() which should catch all unsafe calls to it, I think.

+    /*
+     * The idle thread is not allowed to schedule!
+     * Remove this check after it has been exercised a bit.
+     */
+    if (unlikely(current == rq->idle) && current->state != TASK_RUNNING) {
+        printk(KERN_ERR "bad: scheduling from the idle thread!\n");
+        dump_stack();
+    }
+

So the system_state patch can be dropped.


