* 3.7-rc8 boot failure due to recent workqueue-related patch
@ 2012-12-08 20:35 Mel Gorman
2012-12-09 0:50 ` Linus Torvalds
0 siblings, 1 reply; 3+ messages in thread
From: Mel Gorman @ 2012-12-08 20:35 UTC (permalink / raw)
To: Tejun Heo
Cc: Linus Torvalds, Neela Syam Kolli, James E.J. Bottomley,
linux-scsi, linux-kernel
Commit 8852aac2 (workqueue: mod_delayed_work_on() shouldn't queue timer on
0 delay) is causing the following boot failure for me. Found by bisection
but no further analysis unfortunately until I get back home properly.
I've added the relevant maintainers for megasas and SCSI in case it's
somehow specific to that driver.
[ 10.959999] ------------[ cut here ]------------
[ 10.964751] WARNING: at kernel/workqueue.c:1363 __queue_delayed_work+0x170/0x180()
[ 10.972496] Hardware name: PowerEdge R810
[ 10.976618] Modules linked in: i7core_edac(+) pcspkr joydev ses(+) enclosure lpc_ich(+) ata_piix(+) mfd_core edac_core sg serio_raw dcdbas wmi gf128mul acpi_power_meter button microcode autofs4 processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh_emc scsi_dh ata_generic megaraid_sas pata_atiixp
[ 11.006274] Pid: 0, comm: swapper/4 Not tainted 3.7.0-rc8-numacore-20121207 #1
[ 11.013671] Call Trace:
[ 11.016236] <IRQ> [<ffffffff8104752f>] warn_slowpath_common+0x7f/0xc0
[ 11.023102] [<ffffffff8104758a>] warn_slowpath_null+0x1a/0x20
[ 11.029053] [<ffffffff81066430>] __queue_delayed_work+0x170/0x180
[ 11.035357] [<ffffffff8108aee6>] ? trigger_load_balance+0x1b6/0x260
[ 11.037184] ata_piix 0000:00:1f.2: setting latency timer to 64
[ 11.037814] scsi1 : ata_piix
[ 11.038015] scsi2 : ata_piix
[ 11.038106] ata1: SATA max UDMA/133 cmd 0xece8 ctl 0xecf8 bmdma 0xecc0 irq 23
[ 11.038112] ata2: SATA max UDMA/133 cmd 0xecf0 ctl 0xecfc bmdma 0xecc8 irq 23
[ 11.068248] [<ffffffff8106652d>] queue_delayed_work_on+0x4d/0x60
[ 11.074459] [<ffffffff8106657c>] queue_delayed_work+0x1c/0x20
[ 11.080409] [<ffffffff8106659b>] schedule_delayed_work+0x1b/0x20
[ 11.086630] [<ffffffffa000cb3e>] megasas_complete_cmd+0x54e/0x6f0 [megaraid_sas]
[ 11.094292] [<ffffffff8106f419>] ? enqueue_hrtimer+0x29/0xc0
[ 11.100195] [<ffffffffa000cd6f>] megasas_complete_cmd_dpc+0x8f/0xf0 [megaraid_sas]
[ 11.108063] [<ffffffff8105027a>] tasklet_action+0x6a/0xe0
[ 11.113667] [<ffffffff8104fd80>] __do_softirq+0xd0/0x260
[ 11.119188] [<ffffffff8159da5c>] call_softirq+0x1c/0x30
[ 11.124626] [<ffffffff81004715>] do_softirq+0x75/0xb0
[ 11.129882] [<ffffffff81050095>] irq_exit+0xb5/0xc0
[ 11.134968] [<ffffffff8159e153>] do_IRQ+0x63/0xe0
[ 11.139880] [<ffffffff8159522d>] common_interrupt+0x6d/0x6d
[ 11.145653] <EOI> [<ffffffff8133017d>] ? intel_idle+0xed/0x150
[ 11.151922] [<ffffffff8133015e>] ? intel_idle+0xce/0x150
[ 11.157442] [<ffffffff8145e389>] cpuidle_enter+0x19/0x20
[ 11.163017] [<ffffffff8145ea57>] cpuidle_idle_call+0xa7/0x340
[ 11.168972] [<ffffffff8100bf0a>] cpu_idle+0x7a/0xf0
[ 11.174060] [<ffffffff81581995>] start_secondary+0x209/0x20b
[ 11.179945] ---[ end trace 61c662a962be9ab7 ]---
[ 11.184683] ------------[ cut here ]------------
[ 11.189412] kernel BUG at kernel/workqueue.c:1364!
[ 11.194316] invalid opcode: 0000 [#1] PREEMPT SMP
[ 11.199400] Modules linked in: i7core_edac pcspkr joydev ses(+) enclosure lpc_ich ata_piix mfd_core edac_core sg serio_raw dcdbas wmi gf128mul acpi_power_meter button microcode autofs4 processor thermal_sys scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh_emc scsi_dh ata_generic megaraid_sas pata_atiixp
[ 11.228314] CPU 4
[ 11.230206] Pid: 0, comm: swapper/4 Tainted: G W 3.7.0-rc8-numacore-20121207 #1 Dell Inc. PowerEdge R810/0TT6JF
[ 11.241923] RIP: 0010:[<ffffffff810663f5>] [<ffffffff810663f5>] __queue_delayed_work+0x135/0x180
[ 11.251035] RSP: 0018:ffff88047f843db0 EFLAGS: 00010086
[ 11.256463] RAX: 0000000000000000 RBX: ffff88046d0bc9c0 RCX: 0000000000000ead
[ 11.263740] RDX: 0000000000002365 RSI: 0000000000000046 RDI: 0000000000000009
[ 11.270987] RBP: ffff88047f843de0 R08: 0000000000000000 R09: 00000000000004e0
[ 11.278235] R10: ffff8800000bc740 R11: 0000000000000960 R12: 0000000000000200
[ 11.285486] R13: 0000000000000000 R14: ffff88086f852380 R15: 0000000000000297
[ 11.292788] FS: 0000000000000000(0000) GS:ffff88047f840000(0000) knlGS:0000000000000000
[ 11.301055] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 11.306915] CR2: 00007f9b7fe65000 CR3: 0000000001a0c000 CR4: 00000000000007e0
[ 11.314163] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 11.321412] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 11.328685] Process swapper/4 (pid: 0, threadinfo ffff88046f60e000, task ffff88046f60c4c0)
[ 11.337117] Stack:
[ 11.339244] ffffffff8108aee6 0000000000000086 ffff88086f4636c0 ffff88086f463c30
[ 11.347091] ffff88086f463c38 0000000000000297 ffff88047f843e00 ffffffff8106652d
[ 11.354937] 0000000000000004 ffff88086f89e1c0 ffff88047f843e10 ffffffff8106657c
[ 11.362763] Call Trace:
[ 11.365318] <IRQ>
[ 11.367293] [<ffffffff8108aee6>] ? trigger_load_balance+0x1b6/0x260
[ 11.374150] [<ffffffff8106652d>] queue_delayed_work_on+0x4d/0x60
[ 11.380362] [<ffffffff8106657c>] queue_delayed_work+0x1c/0x20
[ 11.386313] [<ffffffff8106659b>] schedule_delayed_work+0x1b/0x20
[ 11.392565] [<ffffffffa000cb3e>] megasas_complete_cmd+0x54e/0x6f0 [megaraid_sas]
[ 11.400231] [<ffffffff8106f419>] ? enqueue_hrtimer+0x29/0xc0
[ 11.406105] [<ffffffffa000cd6f>] megasas_complete_cmd_dpc+0x8f/0xf0 [megaraid_sas]
[ 11.413957] [<ffffffff8105027a>] tasklet_action+0x6a/0xe0
[ 11.419566] [<ffffffff8104fd80>] __do_softirq+0xd0/0x260
[ 11.425084] [<ffffffff8159da5c>] call_softirq+0x1c/0x30
[ 11.430516] [<ffffffff81004715>] do_softirq+0x75/0xb0
[ 11.435771] [<ffffffff81050095>] irq_exit+0xb5/0xc0
[ 11.440851] [<ffffffff8159e153>] do_IRQ+0x63/0xe0
[ 11.445759] [<ffffffff8159522d>] common_interrupt+0x6d/0x6d
[ 11.451526] <EOI>
[ 11.453501] [<ffffffff8133017d>] ? intel_idle+0xed/0x150
[ 11.459417] [<ffffffff8133015e>] ? intel_idle+0xce/0x150
[ 11.464932] [<ffffffff8145e389>] cpuidle_enter+0x19/0x20
[ 11.470462] [<ffffffff8145ea57>] cpuidle_idle_call+0xa7/0x340
[ 11.476459] [<ffffffff8100bf0a>] cpu_idle+0x7a/0xf0
[ 11.481541] [<ffffffff81581995>] start_secondary+0x209/0x20b
[ 11.487455] Code: 85 74 ff ff ff 65 8b 3c 25 c4 b0 00 00 e9 67 ff ff ff 0f 1f 40 00 48 3b 52 48 0f 85 11 ff ff ff 66 0f 1f 44 00 00 e9 13 ff ff ff <0f> 0b 0f 0b 44 89 e6 4c 89 ff e8 5c 1c ff ff e9 7c ff ff ff e8
[ 11.510825] RIP [<ffffffff810663f5>] __queue_delayed_work+0x135/0x180
[ 11.517532] RSP <ffff88047f843db0>
[ 11.521138] ---[ end trace 61c662a962be9ab8 ]---
[ 11.527550] Kernel panic - not syncing: Fatal exception in interrupt
--
Mel Gorman
SUSE Labs
* Re: 3.7-rc8 boot failure due to recent workqueue-related patch
2012-12-08 20:35 3.7-rc8 boot failure due to recent workqueue-related patch Mel Gorman
@ 2012-12-09 0:50 ` Linus Torvalds
2012-12-09 11:11 ` Mel Gorman
0 siblings, 1 reply; 3+ messages in thread
From: Linus Torvalds @ 2012-12-09 0:50 UTC (permalink / raw)
To: Mel Gorman
Cc: Tejun Heo, Xiaotian Feng, Neela Syam Kolli, James E.J. Bottomley,
linux-scsi, Linux Kernel Mailing List
On Sat, 8 Dec 2012, Mel Gorman wrote:
>
> Commit 8852aac2 (workqueue: mod_delayed_work_on() shouldn't queue timer on
> 0 delay) is causing the following boot failure for me. Found by bisection
> but no further analysis unfortunately until I get back home properly.
> I've added the relevant maintainers for megasas and SCSI in case it's
> somehow specific to that driver.
This should be fixed by commit c1d390d8e612. The Megaraid SAS driver was
doing insane things, using a work-struct for delayed work, and casting
pointers around. It had happened to work for all the wrong reasons before,
the mod_delayed_work_on() changes just exposed how crazy that crap was.
So please test current -git. If it's still broken, we'll need to know, but
I'm assuming this is the already-known-and-fixed bug.
Linus
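
To illustrate the kind of mismatch described above, here is a minimal userspace sketch in C. It is not the driver's code and not the kernel's definitions: the struct layouts are simplified stand-ins, and `fixed_instance`, `event_work`, `mock_timer_fn`, `cast_overrun_bytes`, `timer_looks_initialised` and `instance_from_work` are hypothetical names chosen for the example. It shows why a pointer to a plain `struct work_struct` cannot safely be cast to `struct delayed_work *` (the larger type carries a timer that was never allocated or initialised), and what the consistent pattern looks like when the member is declared as a delayed work item from the start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (layouts are assumptions). */
struct work_struct {
    void (*func)(struct work_struct *work);
};

struct timer_list {
    unsigned long expires;
    void (*function)(unsigned long data);
    unsigned long data;
};

struct delayed_work {
    struct work_struct work;  /* embedded work item */
    struct timer_list timer;  /* extra state a plain work_struct lacks */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical timer callback standing in for the kernel's own. */
void mock_timer_fn(unsigned long data) { (void)data; }

/*
 * Bytes a queueing function would touch past the end of a plain
 * work_struct if its address were cast to struct delayed_work *.
 */
size_t cast_overrun_bytes(void)
{
    return sizeof(struct delayed_work) - sizeof(struct work_struct);
}

/* Initialise both halves: the work item and its timer. */
void init_delayed_work(struct delayed_work *dwork,
                       void (*func)(struct work_struct *))
{
    assert(dwork != NULL);
    dwork->work.func = func;
    dwork->timer.expires = 0;
    dwork->timer.function = mock_timer_fn;
    dwork->timer.data = (unsigned long)dwork;
}

/* The sort of sanity check a queueing function can make on the timer. */
bool timer_looks_initialised(const struct delayed_work *dwork)
{
    return dwork->timer.function == mock_timer_fn &&
           dwork->timer.data == (unsigned long)dwork;
}

/* Hypothetical driver state that declares the member with the right type. */
struct fixed_instance {
    struct delayed_work event_work;
    unsigned long unrelated_state;
};

/* Recover the owning structure from the work pointer, with no casts. */
struct fixed_instance *instance_from_work(struct work_struct *work)
{
    struct delayed_work *dwork =
        container_of(work, struct delayed_work, work);
    return container_of(dwork, struct fixed_instance, event_work);
}
```

In this sketch, a structure that embedded only `struct work_struct` and cast its address would leave the "timer" fields aliasing whatever memory happens to follow, so a check like `timer_looks_initialised()` would pass or fail by accident. Declaring the member as `struct delayed_work` and initialising it makes the check hold by construction.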
* Re: 3.7-rc8 boot failure due to recent workqueue-related patch
2012-12-09 0:50 ` Linus Torvalds
@ 2012-12-09 11:11 ` Mel Gorman
0 siblings, 0 replies; 3+ messages in thread
From: Mel Gorman @ 2012-12-09 11:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: Tejun Heo, Xiaotian Feng, Neela Syam Kolli, James E.J. Bottomley,
linux-scsi, Linux Kernel Mailing List
On Sat, Dec 08, 2012 at 04:50:21PM -0800, Linus Torvalds wrote:
>
>
> On Sat, 8 Dec 2012, Mel Gorman wrote:
> >
> > Commit 8852aac2 (workqueue: mod_delayed_work_on() shouldn't queue timer on
> > 0 delay) is causing the following boot failure for me. Found by bisection
> > but no further analysis unfortunately until I get back home properly.
> > I've added the relevant maintainers for megasas and SCSI in case it's
> > somehow specific to that driver.
>
> This should be fixed by commit c1d390d8e612. The Megaraid SAS driver was
> doing insane things, using a work-struct for delayed work, and casting
> pointers around. It had happened to work for all the wrong reasons before,
> the mod_delayed_work_on() changes just exposed how crazy that crap was.
>
Correct, it is fixed by commit c1d390d8e612. Sorry for the noise.
--
Mel Gorman
SUSE Labs