* Soft Lockup Debugging
@ 2010-02-25 15:20 Aric D. Blumer
2010-02-25 21:43 ` Russell King - ARM Linux
0 siblings, 1 reply; 3+ messages in thread
From: Aric D. Blumer @ 2010-02-25 15:20 UTC (permalink / raw)
To: linux-arm-kernel
I was wondering if someone could give me some pointers on how to debug
the soft lockup shown in the attachment.
System: Kernel 2.6.29.6, PXA320, AR6002 wifi using the SDIO stack.
The lockup only occurs when the CPU_FREQ governor is set to powersave
(reducing CPU frequency from 806 MHz to 104 MHz) and the wifi interface
is sending traffic. I suspect that the main schedule() function is
getting stuck on the runqueue spinlock. If that's the case, how do I
figure out who is holding the lock? If that's not the case, can anyone
enlighten me on what exactly is hanging from this trace?
Thanks.
Attachment: soft_lockup.txt
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20100225/28cc81e3/attachment-0001.txt>
* Soft Lockup Debugging
From: Russell King - ARM Linux @ 2010-02-25 21:43 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Feb 25, 2010 at 10:20:17AM -0500, Aric D. Blumer wrote:
> [ 450.102093] Exception stack(0xcfd79b70 to 0xcfd79bb8)
> [ 450.107170] 9b60: 00000029 cf80aac0 c043b7f4 40000013
> [ 450.115604] 9b80: cf80aac0 00000029 00000029 c040d730 cfd78000 c0440900 0000000a cfd79bd4
> [ 450.124064] 9ba0: cfd79bd8 cfd79bb8 c007cc8c c007b1ac 40000013 ffffffff
> [ 450.132506] r6:04000000 r5:cfd79ba4 r4:ffffffff
> [ 450.137222] [<c007b180>] (handle_IRQ_event+0x0/0x84) from [<c007cc8c>] (handle_level_irq+0x78/0xf0)
> [ 450.146419] r7:c040d730 r6:cfd79cb0 r5:00000029 r4:c0416648
> [ 450.152195] [<c007cc14>] (handle_level_irq+0x0/0xf0) from [<c0030060>] (__exception_text_start+0x60/0x74)
> [ 450.161910] r5:c0447f20 r4:00000029
> [ 450.165559] [<c0030000>] (__exception_text_start+0x0/0x74) from [<c0030a70>] (__irq_svc+0x30/0xc0)
> [ 450.174667] Exception stack(0xcfd79c10 to 0xcfd79c58)
> [ 450.179745] 9c00: 00000014 cf8bce00 c043b7a0 40000013
> [ 450.188188] 9c20: cf8bce00 00000014 00000014 c040d730 cfd78000 c0440900 0000000a cfd79c74
> [ 450.196649] 9c40: cfd79c78 cfd79c58 c007cc8c c007b1ac 40000013 ffffffff
> [ 450.205105] r6:00000200 r5:cfd79c44 r4:ffffffff
> [ 450.209815] [<c007b180>] (handle_IRQ_event+0x0/0x84) from [<c007cc8c>] (handle_level_irq+0x78/0xf0)
> [ 450.219011] r7:c040d730 r6:cfd79d60 r5:00000014 r4:c041615c
> [ 450.224797] [<c007cc14>] (handle_level_irq+0x0/0xf0) from [<c0030060>] (__exception_text_start+0x60/0x74)
> [ 450.234512] r5:c0447f20 r4:00000014
> [ 450.238159] [<c0030000>] (__exception_text_start+0x0/0x74) from [<c0030a70>] (__irq_svc+0x30/0xc0)
Looks to me like a stuck IRQ problem - but no idea which IRQ. You'll
need to disassemble handle_IRQ_event to find out which register the
interrupt number ends up in when PC=0xc007b1ac, and then read it
from the exception stack. The register order there is:
r0-r12, sp (r13), lr (r14), pc (r15), cpsr, -1
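
[Editorial sketch, not part of the original message.] The register order above can be applied mechanically to the quoted dump. The Python below is a minimal sketch under stated assumptions: it takes the first "Exception stack" block exactly as quoted in this thread, strips the printk timestamps, and pairs the 18 words with register names. The function name parse_exception_stack and the label "orig_r0" for the trailing -1 word are illustrative choices, not names used in the thread. Note that the first row is labelled 9b60 but holds only the four words starting at 0xcfd79b70, so the sketch ignores the row labels and relies on word order alone. It does not say which register holds the IRQ number at that PC; that still needs the disassembly described above.

```python
import re

# Register order of the saved frame, as given in the message above.
# "orig_r0" is an illustrative label for the final word (shown as -1).
REG_NAMES = [f"r{i}" for i in range(13)] + ["sp", "lr", "pc", "cpsr", "orig_r0"]

# First exception stack block, copied from the quoted trace.
DUMP = """\
[ 450.102093] Exception stack(0xcfd79b70 to 0xcfd79bb8)
[ 450.107170] 9b60: 00000029 cf80aac0 c043b7f4 40000013
[ 450.115604] 9b80: cf80aac0 00000029 00000029 c040d730 cfd78000 c0440900 0000000a cfd79bd4
[ 450.124064] 9ba0: cfd79bd8 cfd79bb8 c007cc8c c007b1ac 40000013 ffffffff
"""


def parse_exception_stack(text):
    """Map the words of one exception stack dump to register names."""
    words = []
    for line in text.splitlines():
        # Drop the printk timestamp, e.g. "[ 450.102093] ".
        line = re.sub(r"^\[\s*\d+\.\d+\]\s*", "", line)
        # Data rows look like "9b80: <word> <word> ...".
        match = re.match(r"^[0-9a-f]{4}:\s+(.*)$", line)
        if match:
            words.extend(int(w, 16) for w in match.group(1).split())
    if len(words) != len(REG_NAMES):
        raise ValueError(f"expected {len(REG_NAMES)} words, got {len(words)}")
    return dict(zip(REG_NAMES, words))


if __name__ == "__main__":
    for name, value in parse_exception_stack(DUMP).items():
        print(f"{name:8s} 0x{value:08x}")
```

For this block the decoded pc is 0xc007b1ac and lr is 0xc007cc8c, which agrees with the handle_level_irq+0x78 return address in the backtrace. Feeding the second block (0xcfd79c10 to 0xcfd79c58) through the same function gives the same pc and lr with different values in r0 to r2.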
* Soft Lockup Debugging
From: Aric D. Blumer @ 2010-02-26 13:12 UTC (permalink / raw)
To: linux-arm-kernel
On 02/25/2010 04:43 PM, Russell King - ARM Linux wrote:
> On Thu, Feb 25, 2010 at 10:20:17AM -0500, Aric D. Blumer wrote:
>
>> [ 450.102093] Exception stack(0xcfd79b70 to 0xcfd79bb8)
>> [ 450.107170] 9b60: 00000029 cf80aac0 c043b7f4 40000013
>> [ 450.115604] 9b80: cf80aac0 00000029 00000029 c040d730 cfd78000 c0440900 0000000a cfd79bd4
>> [ 450.124064] 9ba0: cfd79bd8 cfd79bb8 c007cc8c c007b1ac 40000013 ffffffff
>> [ 450.132506] r6:04000000 r5:cfd79ba4 r4:ffffffff
>> [ 450.137222] [<c007b180>] (handle_IRQ_event+0x0/0x84) from [<c007cc8c>] (handle_level_irq+0x78/0xf0)
>> [ 450.146419] r7:c040d730 r6:cfd79cb0 r5:00000029 r4:c0416648
>> [ 450.152195] [<c007cc14>] (handle_level_irq+0x0/0xf0) from [<c0030060>] (__exception_text_start+0x60/0x74)
>> [ 450.161910] r5:c0447f20 r4:00000029
>> [ 450.165559] [<c0030000>] (__exception_text_start+0x0/0x74) from [<c0030a70>] (__irq_svc+0x30/0xc0)
>> [ 450.174667] Exception stack(0xcfd79c10 to 0xcfd79c58)
>> [ 450.179745] 9c00: 00000014 cf8bce00 c043b7a0 40000013
>> [ 450.188188] 9c20: cf8bce00 00000014 00000014 c040d730 cfd78000 c0440900 0000000a cfd79c74
>> [ 450.196649] 9c40: cfd79c78 cfd79c58 c007cc8c c007b1ac 40000013 ffffffff
>> [ 450.205105] r6:00000200 r5:cfd79c44 r4:ffffffff
>> [ 450.209815] [<c007b180>] (handle_IRQ_event+0x0/0x84) from [<c007cc8c>] (handle_level_irq+0x78/0xf0)
>> [ 450.219011] r7:c040d730 r6:cfd79d60 r5:00000014 r4:c041615c
>> [ 450.224797] [<c007cc14>] (handle_level_irq+0x0/0xf0) from [<c0030060>] (__exception_text_start+0x60/0x74)
>> [ 450.234512] r5:c0447f20 r4:00000014
>> [ 450.238159] [<c0030000>] (__exception_text_start+0x0/0x74) from [<c0030a70>] (__irq_svc+0x30/0xc0)
>>
> Looks to me like a stuck IRQ problem - but no idea which IRQ. You'll
> need to disassemble handle_IRQ_event to find out which register the
> interrupt number ends up in when PC=0xc007b1ac, and then read it
> from the exception stack. The register order there is:
>
> r0-r12, sp (r13), lr (r14), pc (r15), cpsr, -1
>
I see. So when interpreting soft lockup backtraces, the most important
context is probably the one that the soft lockup detection code
interrupts, and in this case it is the IRQ handlers. Makes perfect
sense in hindsight.
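
[Editorial sketch, not part of the original message.] One way to see which context was interrupted is to decode the saved CPSR, the second-to-last word of each exception stack block (40000013 in both quoted blocks). The Python below is a minimal sketch using the standard ARM CPSR layout: mode in bits 4:0, T in bit 5, F in bit 6, I in bit 7, and the N, Z, C, V flags in bits 31 to 28. The function name decode_cpsr and the dictionary keys are illustrative.

```python
# Processor modes encoded in CPSR bits 4:0 (classic ARM architecture).
MODES = {
    0x10: "USR",
    0x11: "FIQ",
    0x12: "IRQ",
    0x13: "SVC",
    0x17: "ABT",
    0x1B: "UND",
    0x1F: "SYS",
}


def decode_cpsr(cpsr):
    """Split a saved CPSR value into mode, mask bits, and condition flags."""
    flag_bits = (("N", 31), ("Z", 30), ("C", 29), ("V", 28))
    return {
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
        "irqs_masked": bool(cpsr & 0x80),  # I bit
        "fiqs_masked": bool(cpsr & 0x40),  # F bit
        "thumb": bool(cpsr & 0x20),        # T bit
        "flags": "".join(n for n, bit in flag_bits if (cpsr >> bit) & 1),
    }


if __name__ == "__main__":
    # Saved CPSR from both exception stack blocks in the quoted trace.
    print(decode_cpsr(0x40000013))
```

For 0x40000013 this reports SVC mode with the I bit clear, so each interrupt in the trace was taken while kernel code was running with IRQs enabled. That fits the reading above, where the interrupted context is itself an IRQ handler.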
You were right. The issue is a stuck IRQ in the pxamci.c code. I've
got a fix, and it relates to this earlier discussion:
http://lists.arm.linux.org.uk/lurker/message/20090615.210855.a3f32cf1.en.html
Thanks for the help!