public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PANIC][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings
@ 2003-03-29 20:45 Shawn Starr
  2003-03-29 21:15 ` Robert Love
  2003-03-31  2:52 ` [PANIC][2.5.66bk3+] " Zwane Mwaikambo
  0 siblings, 2 replies; 9+ messages in thread
From: Shawn Starr @ 2003-03-29 20:45 UTC (permalink / raw)
  To: linux-kernel

Panic from 2.5.66-bk3 w/ ksymoops dump:

In both panics below c012e9b4 does not exist as a kernel symbol in
System.map:
=======================================================

Unable to handle kernel paging request at virtual address 6b6b6b6f
 printing eip:
c012e9b4
*pde = 00000000
Oops: 0002 [#1]
CPU:    0
EIP:    0060:[<c012e9b4>]    Not tainted
EFLAGS: 00010012
EIP is at run_timer_softirq+0xe4/0x3f0
eax: 6b6b6b6b   ebx: 6b6b6b6b   ecx: c2e7e150   edx: 6b6b6b6b
esi: 6b6b6b6b   edi: c114a000   ebp: c0419860   esp: c114bf0c
ds: 007b   es: 007b   ss: 0068
Process init (pid: 1, threadinfo=c114a000 task=c114e000)
Stack: c041a8b0 c011282e c114bf94 c114bf24 c114e5d4 c114bfc4 00000011
c114a000
       000000e7 00000092 00000001 c04c9c48 fffffffd 00000046 c012963a
c04c9c48
       c114a000 c114a000 00000000 c04183a0 c010cd75 00000000 c114bf94
c04183a0
Call Trace:
 [<c011282e>] timer_interrupt+0x19e/0x3f0
 [<c012963a>] do_softirq+0x9a/0xa0
 [<c010cd75>] do_IRQ+0x235/0x370
 [<c017a557>] sys_stat64+0x37/0x40
 [<c010ac18>] common_interrupt+0x18/0x20
 [<c010a2bb>] restore_all+0x1/0xe

Code: 89 50 04 89 02 c7 41 30 00 00 00 00 81 3d 60 98 41 c0 3c 4b
 kernel/timer.c:258: spin_lock(kernel/timer.c:c0419860) already locked by
kernel/timer.c/398
Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing


ksymoops dump:

Code;  00000000 Before first symbol
00000000 <_EIP>:
Code;  00000000 Before first symbol
   0:   89 50 04                  mov    %edx,0x4(%eax)
Code;  00000003 Before first symbol
   3:   89 02                     mov    %eax,(%edx)
Code;  00000005 Before first symbol
   5:   c7 41 30 00 00 00 00      movl   $0x0,0x30(%ecx)
Code;  0000000c Before first symbol
   c:   81 3d 60 98 41 c0 3c      cmpl   $0x4b3c,0xc0419860
Code;  00000013 Before first symbol
  13:   4b 00 00


We know this is poisioned ('6b') EIP c012e9b4 is not present in System.map.
The machine was on for several hours 8+


----------------------------------------------------------------------------
--------------------------------------

Panic #2 (older) from 2.5.65:

Unable to handle kernel paging request at virtual address 6b6b6b6f
 printing eip:
c012e920
*pde = 00000000
Oops: 0002
CPU:    0
EIP:    0060:[<c012e920>]    Not tainted
EFLAGS: 00010016
EIP is at run_timer_softirq+0xd0/0x3f0
eax: 6b6b6b6b   ebx: c7de5150   ecx: 000000ee   edx: 6b6b6b6b
esi: 6b6b6b6b   edi: 6b6b6b6b   ebp: c0418bc0   esp: c117fe80
ds: 007b   es: 007b   ss: 0068
Process init (pid: 1, threadinfo=c117e000 task=c117c000)
Stack: c01127ae c117ff04 c117ff44 c0130249 c128423c c7fc2ac0 c117e000
c117e000
       c117e000 00000001 c04c5c48 fffffffd 00000046 c012960a c04c5c48
c117e000
       c117e000 00000000 c0417700 c010cd05 00000000 c117ff04 c0417700
fffffffe
Call Trace:
 [<c01127ae>] timer_interrupt+0x19e/0x3f0
 [<c0130249>] __dequeue_signal+0xc9/0x180
 [<c012960a>] do_softirq+0x9a/0xa0
 [<c010cd05>] do_IRQ+0x235/0x370
 [<c0180e37>] link_path_walk+0x247/0xdc0
 [<c010abf8>] common_interrupt+0x18/0x20
 [<c014007b>] sys_timer_delete+0x1db/0x210
 [<c014d756>] fprob+0x26/0x40
 [<c014d7ab>] check_poison_obj+0x3b/0x1b0
 [<c0181e6c>] __user_walk+0x5c/0x60
 [<c014f71c>] kmem_cache_alloc+0x12c/0x170
 [<c017f9c1>] getname+0x31/0xd0
 [<c017f9c1>] getname+0x31/0xd0
 [<c016b83b>] sys_open+0x1b/0x90
 [<c010a28b>] syscall_call+0x7/0xb

Code: 89 50 04 89 02 c7 43 30 00 00 00 00 81 3d c0 8b 41 c0 3c 4b
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
 kernel/timer.c:251: spin_lock(kernel/timer.c:c0418bc0) already locked by
kernel/timer.
c/389

ksymoops zwane debugged showed garbage, also poisioned ('6b').

I can say none of these irq mishaps have happened in 2.4.xx so some driver
or resource is trying to reuse a timer that doesn't exist anymore (if thats
the case).

Shawn.


^ permalink raw reply	[flat|nested] 9+ messages in thread
* Re: [PANIC][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings
@ 2003-03-29 22:10 Manfred Spraul
  0 siblings, 0 replies; 9+ messages in thread
From: Manfred Spraul @ 2003-03-29 22:10 UTC (permalink / raw)
  To: Robert Love; +Cc: linux-kernel, Shawn Starr

Robert wrote:

>> Code: 89 50 04 89 02 c7 41 30 00 00 00 00 81 3d 60 98 41 c0 3c 4b
>>  kernel/timer.c:258: spin_lock(kernel/timer.c:c0419860) already locked by
>> kernel/timer.c/398
>> Kernel panic: Aiee, killing interrupt handler!
>> In interrupt handler - not syncing
>
>This is not a panic, just an oops.  And it was just a debugging check
>from spin lock debugging, but unfortunately you were in an interrupt
>handler so the machine went bye bye.
>
>It is probably a simple double-lock deadlock, detected by spin lock
>debugging.  Knowing the EIP would help... but timer_interrupt() is a
>good first guess.
>  
>
No, this is wrong. spinlock debugging never forces an oops, it just 
complains with printk and tries to continue.

What happened is that someone registered a timer, and then kfreed the 
timer while it was still active. Then the call from run_timers() caused 
a crash, which corrupted the spinlock state, which provoked a spinlock 
debugging message.

Shawn: If you want to debug this, then you should try to print the "last 
user" field of the slab object that contains the timer. Add a test into 
run_timers that checks if timer->function is < 0xC0000000.

Something like

    kmem_cache_t *c = GET_PAGE_CACHE(virt_to_page(timer));
    struct slab *slabp = GET_PAGE_SLAB(virt_to_page(timer));
    void * obj = slabp->s_mem+c->objsize*((timer-slabp->s_mem)/c->objsize);
    unsigned long last_user = *(unsigned 
long*)(obj+c->objsize-BYTE_PER_WORD);

finds address of the last caller of kfree() or kmem_cache_free() on the 
slab object. It only works if slab debugging is enabled. Just print 
last_user, and look it up in System.map. Or use print_symbol (see 
mm/slab.c for an example).

If you need help I can write a patch.
--
    Manfred



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2003-03-31  2:45 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-03-29 20:45 [PANIC][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings Shawn Starr
2003-03-29 21:15 ` Robert Love
2003-03-29 21:42   ` [OOPS][2.5.66bk3+] " Shawn Starr
2003-03-29 22:11     ` Robert Love
2003-03-29 22:28     ` Andrew Morton
2003-03-29 22:33       ` Shawn Starr
2003-03-29 22:34       ` Shawn Starr
2003-03-31  2:52 ` [PANIC][2.5.66bk3+] " Zwane Mwaikambo
  -- strict thread matches above, loose matches on Subject: below --
2003-03-29 22:10 Manfred Spraul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox