* kernel panic removing devices from a teql queuing discipline
@ 2007-10-29 18:00 Chuck Ebbert
2007-10-30 8:33 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Chuck Ebbert @ 2007-10-29 18:00 UTC
To: Netdev
https://bugzilla.redhat.com/show_bug.cgi?id=219488
Still happening in 2.6.22.9:
BUG: unable to handle kernel paging request at virtual address 66696674
printing eip:
d098d4de
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file: /class/net/lo/ifindex
Modules linked in: sch_teql netconsole autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 dm_multipath video sbs i2c_ec button battery asus_acpi ac parport_pc lp parport floppy i2c_piix4 pcspkr i2c_core pcnet32 mii serio_raw ide_cd cdrom dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU: 0
EIP: 0060:[<d098d4de>] Not tainted VLI
EFLAGS: 00010202 (2.6.18-1.2849.fc6 #1)
EIP is at teql_master_xmit+0xdc/0x3aa [sch_teql]
eax: c06a82c0 ebx: cde25c80 ecx: 00000000 edx: c06ad680
esi: c12f8e00 edi: cfc73800 ebp: 66696670 esp: ca6c8bb8
ds: 007b es: 007b ss: 0068
Process ping (pid: 2275, ti=ca6c8000 task=cc7bd400 task.ti=ca6c8000)
Stack: cde25174 00000000 000004cc ca6f0800 c12f8e00 000004cc cc7f5280 ca6f0c00
cc7f5280 cc7f5280 00000000 00000000 00000000 c06a82c0 00000000 ca6f0800
c12f8e00 c12f8e00 c0823e08 c05b9606 ca6c8c20 00000000 c12f8e00 ca6f0800
Call Trace:
[<c05b9606>] dev_hard_start_xmit+0x1b9/0x218
[<c05c72e1>] __qdisc_run+0xde/0x19b
[<c05baeea>] dev_queue_xmit+0x147/0x265
[<c05d8a0c>] ip_output+0x1df/0x20b
[<c05d63bd>] ip_push_pending_frames+0x301/0x3c3
[<c05ef0a6>] raw_sendmsg+0x62e/0x6f0
[<c05f5913>] inet_sendmsg+0x3b/0x45
[<c05af6a6>] sock_sendmsg+0xd0/0xeb
[<c05afec9>] sys_sendmsg+0x192/0x1f7
[<c05b1427>] sys_socketcall+0x240/0x261
[<c0404013>] syscall_call+0x7/0xb
The panic is in __teql_resolve (which has been inlined into teql_master_xmit) in
net/sched/sch_teql.c at this line:
if (n && n->tbl == mn->tbl &&
Specifically the dereference of n->tbl is faulting as n is not valid.
And the address looks like part of an ASCII string... "fift"
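To make the ASCII observation concrete, here is a minimal sketch that prints the bytes of a 32-bit value in the order they sit in memory on a little-endian i386. The helper name le_bytes_as_text is made up for this illustration and is not part of the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not kernel code).
 * Writes the four bytes of a 32-bit value into out[0..3] in the order
 * they are stored in memory on a little-endian CPU (lowest byte first)
 * and NUL-terminates the result. Non-printable bytes become '.'.
 */
void le_bytes_as_text(uint32_t value, char out[5])
{
	for (int i = 0; i < 4; i++) {
		unsigned char byte = (value >> (8 * i)) & 0xff;

		out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
	}
	out[4] = '\0';
}
```

Called with the faulting address 0x66696674 this gives "tfif". Called with the ebp value from the register dump, 0x66696670, it gives "pfif". If the faulting access really is at ebp + 4, that would mean the bogus pointer was loaded from memory holding text that starts like "pfif", possibly a qdisc name, though that part is a guess.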
* Re: kernel panic removing devices from a teql queuing discipline
From: David Miller @ 2007-10-30 8:33 UTC
To: cebbert; +Cc: netdev
From: Chuck Ebbert <cebbert@redhat.com>
Date: Mon, 29 Oct 2007 14:00:01 -0400
> The panic is in __teql_resolve (which has been inlined into teql_master_xmit) in
> net/sched/sch_teql.c at this line:
>
> if (n && n->tbl == mn->tbl &&
>
> Specifically the dereference of n->tbl is faulting as n is not valid.
>
> And the address looks like part of an ASCII string... "fift"
I studied sch_teql.c a bit and I suspect that the slave list
management in teql_destroy() and teql_qdisc_init() might be
suspect.
If someone can take a closer look at this, I'd appreciate it.
* Re: kernel panic removing devices from a teql queuing discipline
From: Evgeniy Polyakov @ 2007-11-05 20:08 UTC
To: David Miller; +Cc: cebbert, netdev
On Tue, Oct 30, 2007 at 01:33:41AM -0700, David Miller (davem@davemloft.net) wrote:
> > The panic is in __teql_resolve (which has been inlined into teql_master_xmit) in
> > net/sched/sch_teql.c at this line:
> >
> > if (n && n->tbl == mn->tbl &&
> >
> > Specifically the dereference of n->tbl is faulting as n is not valid.
n is never valid (null), mn is garbage.
> > And the address looks like part of an ASCII string... "fift"
>
> I studied sch_teql.c a bit and I suspect that the slave list
> management in teql_destroy() and teql_qdisc_init() might be
> suspect.
teql_reset() is called from deactivate and the qdisc is already set to noop,
but a subsequent teql_xmit does not know about it and dereferences the private
data as if it belonged to a teql qdisc, and thus oopses. I will fix it tomorrow
if you do not catch it first :)
--
Evgeniy Polyakov
* Re: kernel panic removing devices from a teql queuing discipline
From: Evgeniy Polyakov @ 2007-11-06 10:48 UTC
To: David Miller; +Cc: cebbert, netdev
On Mon, Nov 05, 2007 at 11:08:00PM +0300, Evgeniy Polyakov (johnpol@2ka.mipt.ru) wrote:
> On Tue, Oct 30, 2007 at 01:33:41AM -0700, David Miller (davem@davemloft.net) wrote:
> > > The panic is in __teql_resolve (which has been inlined into teql_master_xmit) in
> > > net/sched/sch_teql.c at this line:
> > >
> > > if (n && n->tbl == mn->tbl &&
> > >
> > > Specifically the dereference of n->tbl is faulting as n is not valid.
>
> n is never valid (null), mn is garbage.
My fault, of course you are right, n is invalid because it is
dereferenced from qdisc, which was changed. That was too late in Moscow
for conclusions...
> > > And the address looks like part of an ASCII string... "fift"
> >
> > I studied sch_teql.c a bit and I suspect that the slave list
> > management in teql_destroy() and teql_qdisc_init() might be
> > suspect.
>
> teql_reset() is called from deactivate and the qdisc is already set to noop,
> but a subsequent teql_xmit does not know about it and dereferences the private
> data as if it belonged to a teql qdisc, and thus oopses. I will fix it tomorrow
> if you do not catch it first :)
It looks like I am.
Tested, works, fixed.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
diff --git a/net/sched/sch_teql.c b/net/sched/sch_teql.c
index f05ad9a..e0a44b9 100644
--- a/net/sched/sch_teql.c
+++ b/net/sched/sch_teql.c
@@ -263,6 +276,9 @@ __teql_resolve(struct sk_buff *skb, struct sk_buff *skb_res, struct net_device *
 static __inline__ int
 teql_resolve(struct sk_buff *skb, struct sk_buff *skb_res, struct net_device *dev)
 {
+	if (dev->qdisc == &noop_qdisc)
+		return -ENODEV;
+
 	if (dev->hard_header == NULL ||
 	    skb->dst == NULL ||
 	    skb->dst->neighbour == NULL)
--
Evgeniy Polyakov
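The idea behind the guard can be sketched in userspace with simplified stand-in types. Everything prefixed with model_ below is made up for this illustration and is not the real kernel definition. It only shows the shape of the check: once the device's qdisc has been switched to the noop qdisc, its private data must not be interpreted as teql data.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions, not real). */
struct model_qdisc {
	const char *id;	/* qdisc name, e.g. "teql0" or "noop" */
	void *priv;	/* private data, only meaningful to the owning qdisc type */
};

struct model_net_device {
	struct model_qdisc *qdisc;	/* currently active qdisc */
};

/* Single shared placeholder qdisc, compared by address like noop_qdisc. */
struct model_qdisc model_noop_qdisc = { "noop", NULL };

/* Models deactivation: the device is pointed at the noop qdisc. */
void model_deactivate(struct model_net_device *dev)
{
	dev->qdisc = &model_noop_qdisc;
}

/* Mirrors the shape of the check added to teql_resolve() in the patch. */
int model_teql_resolve(const struct model_net_device *dev)
{
	if (dev->qdisc == &model_noop_qdisc)
		return -ENODEV;

	/* Only past this point would it be safe to use dev->qdisc->priv. */
	return 0;
}
```

In this model, model_teql_resolve() returns 0 while a teql-like qdisc is attached and -ENODEV after model_deactivate() has run, which is the behaviour the patch aims for on the transmit path.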
* Re: kernel panic removing devices from a teql queuing discipline
From: David Miller @ 2007-11-06 11:08 UTC
To: johnpol; +Cc: cebbert, netdev
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Date: Tue, 6 Nov 2007 13:48:55 +0300
> Tested, works, fixed.
>
> Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Applied, thanks a lot Evgeniy!
I'll queue this up for -stable too.