* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
@ 2003-03-30 16:10 Shawn Starr
2003-03-30 17:12 ` Petr Vandrovec
2003-03-30 21:02 ` Roland Dreier
0 siblings, 2 replies; 8+ messages in thread
From: Shawn Starr @ 2003-03-30 16:10 UTC (permalink / raw)
To: Shawn Starr, Andrew Morton; +Cc: rml, linux-kernel
Function found was: delayed_work_timer_fn (kernel/workqueue.c)
free of pending timer at c7411150
function=c0138ba0
Call Trace:
[<c014d4b4>] timer_hunt+0x84/0x90
[<c0138ba0>] delayed_work_timer_fn+0x0/0x170
[<c014fb08>] kfree+0x1c8/0x320
[<c024f8a6>] release_dev+0x696/0x840
[<c024f8a6>] release_dev+0x696/0x840
[<c01566eb>] zap_pmd_range+0x4b/0x70
[<c015675b>] unmap_page_range+0x4b/0x80
[<c0250054>] tty_release+0x94/0x1b0
[<c016db2c>] __fput+0xac/0x100
[<c024ffc0>] tty_release+0x0/0x1b0
[<c016db7b>] __fput+0xfb/0x100
[<c016ba6c>] filp_close+0x15c/0x230
[<c0125ebc>] put_files_struct+0x6c/0xe0
[<c0127300>] do_exit+0x400/0xaa0
[<c0255487>] read_chan+0x327/0x1010
[<c0127bdb>] do_group_exit+0x1cb/0x210
[<c01303e5>] dequeue_signal+0x35/0xa0
[<c013319e>] get_signal_to_deliver+0x40e/0x920
[<c010a08d>] do_signal+0xdd/0x110
[<c024e22d>] tty_read+0x24d/0x2d0
[<c0256170>] write_chan+0x0/0x240
[<c016c641>] vfs_read+0xe1/0x1c0
[<c010cd75>] do_IRQ+0x235/0x370
[<c010a119>] do_notify_resume+0x59/0x5c
[<c010a2f6>] work_notifysig+0x13/0x15
---- oops --
Unable to handle kernel paging request at virtual address 6b6b6b6f
printing eip:
c012e9c7
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c012e9c7>] Not tainted
EFLAGS: 00010002
EIP is at run_timer_softirq+0xe7/0x410
eax: c7411150 ebx: c7411154 ecx: 6b6b6b6b edx: 6b6b6b6b
esi: 6b6b6b6b edi: 6b6b6b6b ebp: c7630000 esp: c7631c68
ds: 007b es: 007b ss: 0068
Process agetty (pid: 488, threadinfo=c7630000 task=c7626cc0)
Stack: 00000046 c010ac18 00000002 00000002 00000000 c04c9c40 fffffffd
c7630000
00000040 0000007b 00000001 c04c9c48 fffffffd 00000046 c012964a
c04c9c48
c7630000 c7630000 00000000 c04185e0 c010cd75 00000000 c7631cf0
c04185e0
Call Trace:
[<c010ac18>] common_interrupt+0x18/0x20
[<c012964a>] do_softirq+0x9a/0xa0
[<c010cd75>] do_IRQ+0x235/0x370
[<c010ac18>] common_interrupt+0x18/0x20
[<c014007b>] sys_timer_delete+0xdb/0x210
[<c014fb2c>] kfree+0x1ec/0x320
[<c024f8a6>] release_dev+0x696/0x840
[<c024f8a6>] release_dev+0x696/0x840
[<c01566eb>] zap_pmd_range+0x4b/0x70
[<c015675b>] unmap_page_range+0x4b/0x80
[<c0250054>] tty_release+0x94/0x1b0
[<c016db2c>] __fput+0xac/0x100
[<c024ffc0>] tty_release+0x0/0x1b0
[<c016db7b>] __fput+0xfb/0x100
[<c016ba6c>] filp_close+0x15c/0x230
[<c0125ebc>] put_files_struct+0x6c/0xe0
[<c0127300>] do_exit+0x400/0xaa0
[<c0255487>] read_chan+0x327/0x1010
[<c0127bdb>] do_group_exit+0x1cb/0x210
[<c01303e5>] dequeue_signal+0x35/0xa0
[<c013319e>] get_signal_to_deliver+0x40e/0x920
[<c010a08d>] do_signal+0xdd/0x110
[<c024e22d>] tty_read+0x24d/0x2d0
[<c0256170>] write_chan+0x0/0x240
[<c016c641>] vfs_read+0xe1/0x1c0
[<c010cd75>] do_IRQ+0x235/0x370
[<c010a119>] do_notify_resume+0x59/0x5c
[<c010a2f6>] work_notifysig+0x13/0x15
Code: 89 4a 04 89 11 c7 40 30 00 00 00 00 81 3d a0 9a 41 c0 3c 4b
<0>Kernel panic: Aiee, killing interrupt handler!
kernel/timer.c:258: spin_lock(kernel/timer.c:c0419aa0) already locked by
kernel/timer.c/398
In interrupt handler - not syncing
ksymoops dump:
Code; 00000000 Before first symbol
00000000 <_EIP>:
Code; 00000000 Before first symbol
0: 89 4a 04 mov %ecx,0x4(%edx)
Code; 00000003 Before first symbol
3: 89 11 mov %edx,(%ecx)
Code; 00000005 Before first symbol
5: c7 40 30 00 00 00 00 movl $0x0,0x30(%eax)
Code; 0000000c Before first symbol
c: 81 3d a0 9a 41 c0 3c cmpl $0x4b3c,0xc0419aa0
Code; 00000013 Before first symbol
13: 4b 00 00
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 16:10 [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer Shawn Starr
@ 2003-03-30 17:12 ` Petr Vandrovec
2003-03-30 21:02 ` Roland Dreier
1 sibling, 0 replies; 8+ messages in thread
From: Petr Vandrovec @ 2003-03-30 17:12 UTC (permalink / raw)
To: Shawn Starr; +Cc: Andrew Morton, rml, linux-kernel
On Sun, Mar 30, 2003 at 11:10:48AM -0500, Shawn Starr wrote:
> Function found was: delayed_work_timer_fn (kernel/workqueue.c)
>
> free of pending timer at c7411150
> function=c0138ba0
> Call Trace:
> [<c014d4b4>] timer_hunt+0x84/0x90
> [<c0138ba0>] delayed_work_timer_fn+0x0/0x170
> [<c014fb08>] kfree+0x1c8/0x320
> [<c024f8a6>] release_dev+0x696/0x840
> [<c024f8a6>] release_dev+0x696/0x840
drivers/char/tty_io.c schedules flush_to_ldisc work. This
work can potentially reschedule itself, making call to
flush_scheduled_work() at the end of release_dev()
useless :-(
Unfortunately I have no idea what is that code trying to do -
we already did tty->ldisc.close() and replaced
tty->ldisc with something else when we call flush_scheduled_work(),
so I doubt that work we are waiting for can do anything
useful even if TTY_DONT_FLIP is not set...
Petr Vandrovec
vandrove@vc.cvut.cz
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 16:10 [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer Shawn Starr
2003-03-30 17:12 ` Petr Vandrovec
@ 2003-03-30 21:02 ` Roland Dreier
2003-03-30 21:28 ` Shawn Starr
1 sibling, 1 reply; 8+ messages in thread
From: Roland Dreier @ 2003-03-30 21:02 UTC (permalink / raw)
To: Shawn Starr; +Cc: Andrew Morton, rml, linux-kernel
Shawn> Function found was: delayed_work_timer_fn
Shawn> (kernel/workqueue.c)
It looks to me like something is calling schedule_delayed_work()
(which calls queue_delayed_work(), which starts a timer) and then
freeing the work_struct before it's executed.
Here's a list of places that use schedule_delayed_work() where the
work_struct might be kmalloc()ed. Are you using any of these drivers?
(Obviously you're using tty_io, so that bears some looking at)
drivers/char/cyclades.c
drivers/char/mxser.c
drivers/char/tty_io.c
drivers/isdn/i4l/isdn_tty.c
drivers/message/fusion/mptlan.c
drivers/net/hamradio/baycom_epp.c
drivers/net/plip.c
drivers/scsi/imm.c
drivers/scsi/ppa.c
If tty_io.c is the problem, then maybe something like the patch below
will find the culprit.
- Roland
===== drivers/char/tty_io.c 1.68 vs edited =====
--- 1.68/drivers/char/tty_io.c Thu Mar 27 21:15:44 2003
+++ edited/drivers/char/tty_io.c Sun Mar 30 12:51:00 2003
@@ -169,6 +169,10 @@
static inline void free_tty_struct(struct tty_struct *tty)
{
+ if (timer_pending(&tty->flip.work.timer)) {
+ printk(KERN_WARNING "freeing tty with pending flip work timer from [<%p>]\n",
+ __builtin_return_address(0));
+ }
kfree(tty);
}
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 21:02 ` Roland Dreier
@ 2003-03-30 21:28 ` Shawn Starr
2003-03-30 23:17 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Shawn Starr @ 2003-03-30 21:28 UTC (permalink / raw)
To: Roland Dreier; +Cc: Andrew Morton, rml, linux-kernel
drivers/char/tty_io.c - Only
I bet it's this function, there's only a kfree, not destruction of any
timers.
Added this rebuilt kernel waiting :-)
Shawn.
----- Original Message -----
From: "Roland Dreier" <roland@topspin.com>
To: "Shawn Starr" <spstarr@sh0n.net>
Cc: "Andrew Morton" <akpm@digeo.com>; <rml@tech9.net>;
<linux-kernel@vger.kernel.org>
Sent: Sunday, March 30, 2003 4:02 PM
Subject: Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New
OOPS w/ timer
> Shawn> Function found was: delayed_work_timer_fn
> Shawn> (kernel/workqueue.c)
>
> It looks to me like something is calling schedule_delayed_work()
> (which calls queue_delayed_work(), which starts a timer) and then
> freeing the work_struct before it's executed.
>
> Here's a list of places that use schedule_delayed_work() where the
> work_struct might be kmalloc()ed. Are you using any of these drivers?
> (Obviously you're using tty_io, so that bears some looking at)
>
> drivers/char/cyclades.c
> drivers/char/mxser.c
> drivers/char/tty_io.c
> drivers/isdn/i4l/isdn_tty.c
> drivers/message/fusion/mptlan.c
> drivers/net/hamradio/baycom_epp.c
> drivers/net/plip.c
> drivers/scsi/imm.c
> drivers/scsi/ppa.c
>
> If tty_io.c is the problem, then maybe something like the patch below
> will find the culprit.
>
> - Roland
>
> ===== drivers/char/tty_io.c 1.68 vs edited =====
> --- 1.68/drivers/char/tty_io.c Thu Mar 27 21:15:44 2003
> +++ edited/drivers/char/tty_io.c Sun Mar 30 12:51:00 2003
> @@ -169,6 +169,10 @@
>
> static inline void free_tty_struct(struct tty_struct *tty)
> {
> + if (timer_pending(&tty->flip.work.timer)) {
> + printk(KERN_WARNING "freeing tty with pending flip work timer from
[<%p>]\n",
> + __builtin_return_address(0));
> + }
> kfree(tty);
> }
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 21:28 ` Shawn Starr
@ 2003-03-30 23:17 ` Andrew Morton
2003-03-31 0:00 ` Roland Dreier
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Andrew Morton @ 2003-03-30 23:17 UTC (permalink / raw)
To: Shawn Starr; +Cc: roland, rml, linux-kernel
"Shawn Starr" <spstarr@sh0n.net> wrote:
>
> drivers/char/tty_io.c - Only
>
> I bet it's this function, there's only a kfree, not destruction of any
> timers.
>
This is fairly foul.
--- 25/drivers/char/tty_io.c~a 2003-03-30 15:12:37.000000000 -0800
+++ 25-akpm/drivers/char/tty_io.c 2003-03-30 15:16:59.000000000 -0800
@@ -1288,6 +1288,8 @@ static void release_dev(struct file * fi
/*
* Make sure that the tty's task queue isn't activated.
*/
+ clear_bit(TTY_DONT_FLIP, &tty->flags);
+ del_timer_sync(&tty->flip.work.timer);
flush_scheduled_work();
/*
_
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 23:17 ` Andrew Morton
@ 2003-03-31 0:00 ` Roland Dreier
2003-03-31 14:08 ` Shawn Starr
2003-04-06 0:35 ` [OOPS][2.5.66bk9+] " Shawn Starr
2 siblings, 0 replies; 8+ messages in thread
From: Roland Dreier @ 2003-03-31 0:00 UTC (permalink / raw)
To: Andrew Morton; +Cc: Shawn Starr, rml, linux-kernel
> --- 25/drivers/char/tty_io.c~a 2003-03-30 15:12:37.000000000 -0800
> +++ 25-akpm/drivers/char/tty_io.c 2003-03-30 15:16:59.000000000 -0800
> @@ -1288,6 +1288,8 @@ static void release_dev(struct file * fi
> /*
> * Make sure that the tty's task queue isn't activated.
> */
> + clear_bit(TTY_DONT_FLIP, &tty->flags);
> + del_timer_sync(&tty->flip.work.timer);
> flush_scheduled_work();
I'm confused by this for two reasons:
First, from looking at workqueue.c (especially the comment in
queue_delayed_work() that says "Increase nr_queued so that the flush
function knows that there's something pending."), it seems like
flush_scheduled_work() should wait until even delayed work is done.
Given that, I don't think the del_timer_sync() should be there --
wouldn't flush_scheduled_work() block forever, since nr_queued can
never reach 0 now?
(I guess I'm assuming the real race is that tty_io.c calls
schedule_delayed_work() between flush_scheduled_work() and
release_mem() in release_dev())
Second, I don't see how it's _ever_ safe to call
flush_scheduled_work(). The comment in workqueue.c before
flush_workqueue() says "NOTE: if work is being added to the queue
constantly by some other context then this function might block
indefinitely." But flush_scheduled_work() is flushing the keventd_wq,
which other code will definitely add work to. If we're unlucky,
flush_scheduled_work() could block forever. Am I just being paranoid?
- Roland
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 23:17 ` Andrew Morton
2003-03-31 0:00 ` Roland Dreier
@ 2003-03-31 14:08 ` Shawn Starr
2003-04-06 0:35 ` [OOPS][2.5.66bk9+] " Shawn Starr
2 siblings, 0 replies; 8+ messages in thread
From: Shawn Starr @ 2003-03-31 14:08 UTC (permalink / raw)
To: Andrew Morton; +Cc: roland, rml, linux-kernel
I have applied this to my current tree, testing it out.
On Sun, 30 Mar 2003, Andrew Morton wrote:
> "Shawn Starr" <spstarr@sh0n.net> wrote:
> >
> > drivers/char/tty_io.c - Only
> >
> > I bet it's this function, there's only a kfree, not destruction of any
> > timers.
> >
>
> This is fairly foul.
>
> --- 25/drivers/char/tty_io.c~a 2003-03-30 15:12:37.000000000 -0800
> +++ 25-akpm/drivers/char/tty_io.c 2003-03-30 15:16:59.000000000 -0800
> @@ -1288,6 +1288,8 @@ static void release_dev(struct file * fi
> /*
> * Make sure that the tty's task queue isn't activated.
> */
> + clear_bit(TTY_DONT_FLIP, &tty->flags);
> + del_timer_sync(&tty->flip.work.timer);
> flush_scheduled_work();
>
> /*
>
> _
>
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [OOPS][2.5.66bk9+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer
2003-03-30 23:17 ` Andrew Morton
2003-03-31 0:00 ` Roland Dreier
2003-03-31 14:08 ` Shawn Starr
@ 2003-04-06 0:35 ` Shawn Starr
2 siblings, 0 replies; 8+ messages in thread
From: Shawn Starr @ 2003-04-06 0:35 UTC (permalink / raw)
To: 'Andrew Morton'; +Cc: roland, rml, linux-kernel
Andrew, since the new timer changes made, Things are looking solid so far.
We might want to begin adding things into BK-current no?
Sshd hasn't hung, ttys haven't hung either.
Shawn.
-----Original Message-----
From: Andrew Morton [mailto:akpm@digeo.com]
Sent: Sunday, March 30, 2003 6:18 PM
To: Shawn Starr
Cc: roland@topspin.com; rml@tech9.net; linux-kernel@vger.kernel.org
Subject: Re: [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New
OOPS w/ timer
"Shawn Starr" <spstarr@sh0n.net> wrote:
>
> drivers/char/tty_io.c - Only
>
> I bet it's this function, there's only a kfree, not destruction of any
> timers.
>
This is fairly foul.
--- 25/drivers/char/tty_io.c~a 2003-03-30 15:12:37.000000000 -0800
+++ 25-akpm/drivers/char/tty_io.c 2003-03-30 15:16:59.000000000 -0800
@@ -1288,6 +1288,8 @@ static void release_dev(struct file * fi
/*
* Make sure that the tty's task queue isn't activated.
*/
+ clear_bit(TTY_DONT_FLIP, &tty->flags);
+ del_timer_sync(&tty->flip.work.timer);
flush_scheduled_work();
/*
_
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2003-04-06 0:23 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-03-30 16:10 [OOPS][2.5.66bk3+] run_timer_softirq - IRQ Mishandlings - New OOPS w/ timer Shawn Starr
2003-03-30 17:12 ` Petr Vandrovec
2003-03-30 21:02 ` Roland Dreier
2003-03-30 21:28 ` Shawn Starr
2003-03-30 23:17 ` Andrew Morton
2003-03-31 0:00 ` Roland Dreier
2003-03-31 14:08 ` Shawn Starr
2003-04-06 0:35 ` [OOPS][2.5.66bk9+] " Shawn Starr
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox