* Re: PROBLEM: kernel lockup while changing TC rules
[not found] <20080501145239.GA20284@atlantis.mitranet.cz>
@ 2008-05-01 21:33 ` David Miller
2008-05-03 0:26 ` David Miller
1 sibling, 0 replies; 10+ messages in thread
From: David Miller @ 2008-05-01 21:33 UTC (permalink / raw)
To: yanek; +Cc: linux-net, netdev

From: Jan 'yanek' Bortl <yanek@ya.bofh.cz>
Date: Thu, 1 May 2008 16:52:39 +0200

CC:'ing netdev@vger.kernel.org, where such reports belong.

> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>
> Network configuration:
>
> +--------+     vlan20     +---------+ vlan80
> | laptop |----------------| router  |-----
> +--------+                +---------+
> laptop has address 192.168.243.10
>
> Reproduce HOWTO:
>
> 1. run http://ya.bofh.cz/archive/kernel-2.6-htbcrash/init.sh
> 2. start packet generator on laptop (any traffic targeted to router's vlan80)
> 3. run http://ya.bofh.cz/archive/kernel-2.6-htbcrash/crash.sh
>
> Then the machine locks up and after a while prints these messages:
>
> [ 395.137697] BUG: soft lockup - CPU#0 stuck for 61s! [ksoftirqd/0:4]
> ...
> [ 395.139871] Call Trace:
> [ 395.139940] <IRQ> [<ffffffff8035ea7c>] ? rb_insert_color+0xbc/0xf0
> [ 395.140012] [<ffffffffa01f9cf6>] ? :sch_htb:htb_add_to_wait_tree+0xa6/0xc0
> [ 395.140061] [<ffffffffa01fb46f>] ? :sch_htb:htb_dequeue+0x47f/0x7f0
> [ 395.140111] [<ffffffffa01f9ec2>] ? :sch_htb:htb_activate_prios+0x122/0x140
> [ 395.140160] [<ffffffff804682f6>] ? __qdisc_run+0x216/0x240
> [ 395.140207] [<ffffffff804584e3>] ? dev_queue_xmit+0x2c3/0x390
> [ 395.140253] [<ffffffff8047db07>] ? ip_finish_output+0x117/0x2a0
> [ 395.140300] [<ffffffff8047dfe0>] ? ip_output+0x70/0xb0
> [ 395.140344] [<ffffffff8047ac68>] ? ip_forward_finish+0x38/0x50
>
> and
>
> [ 527.210416] BUG: soft lockup - CPU#1 stuck for 61s! [tc:2848]
> ...
> [ 527.212644] Call Trace:
> [ 527.212715] [<ffffffff803617ff>] ? __delay+0xf/0x30
> [ 527.212759] [<ffffffff80365a4c>] ? _raw_spin_lock+0x10c/0x180
> [ 527.212805] [<ffffffff804d90a6>] ? _spin_lock_bh+0x56/0x70
> [ 527.212849] [<ffffffff80467a3f>] ? qdisc_lock_tree+0x1f/0x30
> [ 527.212895] [<ffffffffa0228bb4>] ? :sch_sfq:sfq_init+0xf4/0x240
> [ 527.212942] [<ffffffff804693e4>] ? qdisc_create+0x154/0x250
> [ 527.212987] [<ffffffff804710d3>] ? nla_parse+0x33/0xf0
> [ 527.213031] [<ffffffff80469fd0>] ? tc_modify_qdisc+0x90/0x420
> [ 527.213079] [<ffffffff804603d9>] ? rtnetlink_rcv_msg+0x1e9/0x230
> [ 527.213125] [<ffffffff804601f0>] ? rtnetlink_rcv_msg+0x0/0x230
>
> (full output here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/netconsole.txt)
>
> kernel's config here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/config.txt
> dmesg after boot here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/dmesg.txt
>
>
> --
> Jan 'yanek' Bortl <yanek [at] ya.bofh. cz>
> http://ya.bofh.cz/ | jab: yanek [at] mitranet. cz
> -----------------------------------------------------------------
> "Maybe one day you will learn that your way is not the only way."
> Opher [StarGate: The Nox]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-net" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
[not found] <20080501145239.GA20284@atlantis.mitranet.cz>
2008-05-01 21:33 ` PROBLEM: kernel lockup while changing TC rules David Miller
@ 2008-05-03 0:26 ` David Miller
2008-05-03 6:16 ` Stephen Hemminger
1 sibling, 1 reply; 10+ messages in thread
From: David Miller @ 2008-05-03 0:26 UTC (permalink / raw)
To: yanek; +Cc: netdev

From: Jan 'yanek' Bortl <yanek@ya.bofh.cz>
Date: Thu, 1 May 2008 17:49:14 +0200

> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:

Thanks for this report, I'll try to figure it out.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 0:26 ` David Miller
@ 2008-05-03 6:16 ` Stephen Hemminger
0 siblings, 0 replies; 10+ messages in thread
From: Stephen Hemminger @ 2008-05-03 6:16 UTC (permalink / raw)
To: David Miller; +Cc: yanek, netdev

On Fri, 02 May 2008 17:26:00 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Jan 'yanek' Bortl <yanek@ya.bofh.cz>
> Date: Thu, 1 May 2008 17:49:14 +0200
>
> > I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>
> Thanks for this report, I'll try to figure it out.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

This problem isn't new (I think it is even buried in kernel bugzilla).
^ permalink raw reply [flat|nested] 10+ messages in thread
* PROBLEM: kernel lockup while changing TC rules
@ 2008-05-01 15:49 Jan 'yanek' Bortl
2008-05-03 7:16 ` Jarek Poplawski
0 siblings, 1 reply; 10+ messages in thread
From: Jan 'yanek' Bortl @ 2008-05-01 15:49 UTC (permalink / raw)
To: netdev
Hi,
I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
Network configuration:
+--------+     vlan20     +---------+ vlan80
| laptop |----------------| router  |-----
+--------+                +---------+
laptop has address 192.168.243.10
Reproduce HOWTO:
1. run http://ya.bofh.cz/archive/kernel-2.6-htbcrash/init.sh
2. start packet generator on laptop (any traffic targeted to router's vlan80)
3. run http://ya.bofh.cz/archive/kernel-2.6-htbcrash/crash.sh
Then the machine locks up and after a while prints these messages:
[ 395.137697] BUG: soft lockup - CPU#0 stuck for 61s! [ksoftirqd/0:4]
...
[ 395.139871] Call Trace:
[ 395.139940] <IRQ> [<ffffffff8035ea7c>] ? rb_insert_color+0xbc/0xf0
[ 395.140012] [<ffffffffa01f9cf6>] ? :sch_htb:htb_add_to_wait_tree+0xa6/0xc0
[ 395.140061] [<ffffffffa01fb46f>] ? :sch_htb:htb_dequeue+0x47f/0x7f0
[ 395.140111] [<ffffffffa01f9ec2>] ? :sch_htb:htb_activate_prios+0x122/0x140
[ 395.140160] [<ffffffff804682f6>] ? __qdisc_run+0x216/0x240
[ 395.140207] [<ffffffff804584e3>] ? dev_queue_xmit+0x2c3/0x390
[ 395.140253] [<ffffffff8047db07>] ? ip_finish_output+0x117/0x2a0
[ 395.140300] [<ffffffff8047dfe0>] ? ip_output+0x70/0xb0
[ 395.140344] [<ffffffff8047ac68>] ? ip_forward_finish+0x38/0x50
and
[ 527.210416] BUG: soft lockup - CPU#1 stuck for 61s! [tc:2848]
...
[ 527.212644] Call Trace:
[ 527.212715] [<ffffffff803617ff>] ? __delay+0xf/0x30
[ 527.212759] [<ffffffff80365a4c>] ? _raw_spin_lock+0x10c/0x180
[ 527.212805] [<ffffffff804d90a6>] ? _spin_lock_bh+0x56/0x70
[ 527.212849] [<ffffffff80467a3f>] ? qdisc_lock_tree+0x1f/0x30
[ 527.212895] [<ffffffffa0228bb4>] ? :sch_sfq:sfq_init+0xf4/0x240
[ 527.212942] [<ffffffff804693e4>] ? qdisc_create+0x154/0x250
[ 527.212987] [<ffffffff804710d3>] ? nla_parse+0x33/0xf0
[ 527.213031] [<ffffffff80469fd0>] ? tc_modify_qdisc+0x90/0x420
[ 527.213079] [<ffffffff804603d9>] ? rtnetlink_rcv_msg+0x1e9/0x230
[ 527.213125] [<ffffffff804601f0>] ? rtnetlink_rcv_msg+0x0/0x230
(full output here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/netconsole.txt)
kernel's config here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/config.txt
dmesg after boot here: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/dmesg.txt
--
Jan 'yanek' Bortl <yanek [at] ya.bofh. cz>
http://ya.bofh.cz/ | jab: yanek [at] mitranet. cz
-----------------------------------------------------------------
"Maybe one day you will learn that your way is not the only way."
Opher [StarGate: The Nox]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-01 15:49 Jan 'yanek' Bortl
@ 2008-05-03 7:16 ` Jarek Poplawski
2008-05-03 9:43 ` Jan 'yanek' Bortl
0 siblings, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-05-03 7:16 UTC (permalink / raw)
To: Jan 'yanek' Bortl; +Cc: netdev

Jan 'yanek' Bortl wrote, On 05/01/2008 05:49 PM:
> Hi,

Hi,

>
> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:

Does this mean the bug triggers in 2.6.25 and later, but not in 2.6.24?
If so, is it possible to reproduce this e.g. with 2.6.25-rc6?

>
> Network configuration:
>
> +--------+     vlan20     +---------+ vlan80
> | laptop |----------------| router  |-----
> +--------+                +---------+
> laptop has address 192.168.243.10
>
> Reproduce HOWTO:

Very nice description and logs, but alas I'm not able to test it, so
a few questions:

- there are quite a lot of networking modules loaded like bonding or
  ifb: are there some other scripts (especially with virtual devices)?
- could you send vlan and routing rules on this router?
- does it always break with the same traces?

Thanks,
Jarek P.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 7:16 ` Jarek Poplawski
@ 2008-05-03 9:43 ` Jan 'yanek' Bortl
2008-05-03 12:11 ` Jarek Poplawski
2008-05-03 16:39 ` Jarek Poplawski
0 siblings, 2 replies; 10+ messages in thread
From: Jan 'yanek' Bortl @ 2008-05-03 9:43 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> Jan 'yanek' Bortl wrote, On 05/01/2008 05:49 PM:
>
>> Hi,
>
> Hi,
>> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>
> Does this mean the bug triggers in 2.6.25 and later, but not in 2.6.24?
> If so, is it possible to reproduce this e.g. with 2.6.25-rc6?

I first discovered this on 2.6.22-6~bpo40+1 (Debian's backports), but
I'm not sure if that was the same thing (it was long ago).

Now I tested on a test machine with these kernels:
2.6.24, 2.6.24.6
(http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/netconsole-2.6.24.6-slon.txt),
2.6.25

I can test anything you want.

> Very nice description and logs, but alas I'm not able to test it, so
> a few questions:
>
> - there are quite a lot of networking modules loaded like bonding or
> ifb: are there some other scripts (especially with virtual devices)?

I kicked them out now (ifb, vlan, bonding). The problem persists.

+--------+      eth1      +---------+ eth2
| laptop |----------------| router  |-----
+--------+                +---------+
laptop has address 192.168.243.10

> - could you send vlan and routing rules on this router?

http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/ipa-2.6.25-slon-00000-ge4c576b.txt
http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/ipro-2.6.25-slon-00000-ge4c576b.txt

> - does it always break with the same traces?

Yes. Another run (without those modules):
http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/netconsole-2.6.25-slon-00000-ge4c576b.txt

--
Jan 'yanek' Bortl <yanek [at] ya.bofh. cz>
http://ya.bofh.cz/ | jab: yanek [at] mitranet. cz
-----------------------------------------------------------------
"Maybe one day you will learn that your way is not the only way."
Opher [StarGate: The Nox]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 9:43 ` Jan 'yanek' Bortl
@ 2008-05-03 12:11 ` Jarek Poplawski
2008-05-03 12:42 ` Jan 'yanek' Bortl
2008-05-03 16:39 ` Jarek Poplawski
1 sibling, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-05-03 12:11 UTC (permalink / raw)
To: Jan 'yanek' Bortl; +Cc: netdev

On Sat, May 03, 2008 at 11:43:46AM +0200, Jan 'yanek' Bortl wrote:
> Jarek Poplawski wrote:
>> Jan 'yanek' Bortl wrote, On 05/01/2008 05:49 PM:
>>
>>> Hi,
>>
>> Hi,
>>> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>>
>> Does this mean the bug triggers in 2.6.25 and later, but not in 2.6.24?
>> If so, is it possible to reproduce this e.g. with 2.6.25-rc6?
>
> I first discovered this on 2.6.22-6~bpo40+1 (Debian's backports), but
> I'm not sure if that was the same thing (it was long ago).
>
> Now I tested on a test machine with these kernels:
> 2.6.24, 2.6.24.6
> (http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/netconsole-2.6.24.6-slon.txt),
> 2.6.25
>
> I can test anything you want.

Great! I really appreciate it! You're very helpful in catching this rare
bug. Alas, there is still nothing obvious at least to me, so I need
more time for any idea...

BTW, one little doubt: are you really sure vanilla 2.6.24 (without .6
etc.) gives the same? (There were some changes backported from 2.6.25
which I'd like to exclude.)

Regards,
Jarek P.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 12:11 ` Jarek Poplawski
@ 2008-05-03 12:42 ` Jan 'yanek' Bortl
0 siblings, 0 replies; 10+ messages in thread
From: Jan 'yanek' Bortl @ 2008-05-03 12:42 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> On Sat, May 03, 2008 at 11:43:46AM +0200, Jan 'yanek' Bortl wrote:
>> Jarek Poplawski wrote:
>>> Jan 'yanek' Bortl wrote, On 05/01/2008 05:49 PM:
>>>
>>>> Hi,
>>> Hi,
>>>> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>>> Does this mean the bug triggers in 2.6.25 and later, but not in 2.6.24?
>>> If so, is it possible to reproduce this e.g. with 2.6.25-rc6?
>> I first discovered this on 2.6.22-6~bpo40+1 (Debian's backports), but
>> I'm not sure if that was the same thing (it was long ago).
>>
>> Now I tested on a test machine with these kernels:
>> 2.6.24, 2.6.24.6
>> (http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/netconsole-2.6.24.6-slon.txt),
>> 2.6.25
>>
>> I can test anything you want.
>
> Great! I really appreciate it! You're very helpful in catching this rare
> bug. Alas, there is still nothing obvious at least to me, so I need
> more time for any idea...
>
> BTW, one little doubt: are you really sure vanilla 2.6.24 (without .6
> etc.) gives the same? (There were some changes backported from 2.6.25
> which I'd like to exclude.)

Yes.
http://ya.bofh.cz/archive/kernel-2.6-htbcrash/3/netconsole.txt
(config: http://ya.bofh.cz/archive/kernel-2.6-htbcrash/3/config-2.6.24-slon2)

--
Jan 'yanek' Bortl <yanek [at] ya.bofh. cz>
http://ya.bofh.cz/ | jab: yanek [at] mitranet. cz
-----------------------------------------------------------------
"Maybe one day you will learn that your way is not the only way."
Opher [StarGate: The Nox]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 9:43 ` Jan 'yanek' Bortl
2008-05-03 12:11 ` Jarek Poplawski
@ 2008-05-03 16:39 ` Jarek Poplawski
2008-05-03 17:26 ` Jan 'yanek' Bortl
1 sibling, 1 reply; 10+ messages in thread
From: Jarek Poplawski @ 2008-05-03 16:39 UTC (permalink / raw)
To: Jan 'yanek' Bortl; +Cc: netdev

On Sat, May 03, 2008 at 11:43:46AM +0200, Jan 'yanek' Bortl wrote:
...
>>> I have found this problem with today's git (2.6.25-00000-ge4c576b) and 2.6.25:
>>
>> Does this mean the bug triggers in 2.6.25 and later, but not in 2.6.24?
>> If so, is it possible to reproduce this e.g. with 2.6.25-rc6?
>
> I first discovered this on 2.6.22-6~bpo40+1 (Debian's backports), but
> I'm not sure if that was the same thing (it was long ago).
>
> Now I tested on a test machine with these kernels:
> 2.6.24, 2.6.24.6
> (http://ya.bofh.cz/archive/kernel-2.6-htbcrash/2/netconsole-2.6.24.6-slon.txt),
> 2.6.25
>
> I can test anything you want.

Here is a suspect #1. (BTW, this place reminds me of something...)

Thanks,
Jarek P.

---

 net/sched/sch_htb.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 66148cc..5bc1ed4 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1197,12 +1197,16 @@ static inline int htb_parent_last_child(struct htb_class *cl)
 	return 1;
 }
 
-static void htb_parent_to_leaf(struct htb_class *cl, struct Qdisc *new_q)
+static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
+			       struct Qdisc *new_q)
 {
 	struct htb_class *parent = cl->parent;
 
 	BUG_TRAP(!cl->level && cl->un.leaf.q && !cl->prio_activity);
 
+	if (parent->cmode != HTB_CAN_SEND)
+		htb_safe_rb_erase(&parent->pq_node, q->wait_pq + parent->level);
+
 	parent->level = 0;
 	memset(&parent->un.inner, 0, sizeof(parent->un.inner));
 	INIT_LIST_HEAD(&parent->un.leaf.drop_list);
@@ -1300,7 +1304,7 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg)
 		htb_deactivate(q, cl);
 
 	if (last_child)
-		htb_parent_to_leaf(cl, new_q);
+		htb_parent_to_leaf(q, cl, new_q);
 
 	if (--cl->refcnt == 0)
 		htb_destroy_class(sch, cl);
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: PROBLEM: kernel lockup while changing TC rules
2008-05-03 16:39 ` Jarek Poplawski
@ 2008-05-03 17:26 ` Jan 'yanek' Bortl
0 siblings, 0 replies; 10+ messages in thread
From: Jan 'yanek' Bortl @ 2008-05-03 17:26 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev

Jarek Poplawski wrote:
> ...
>
> Here is a suspect #1. (BTW, this place reminds me of something...)

Great! Seems to solve my problem. I'll do some tests. Thank you!

>
> Thanks,
> Jarek P.
>
> ---
>
> net/sched/sch_htb.c | 8 ++++++--
> 1 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
> index 66148cc..5bc1ed4 100644
> --- a/net/sched/sch_htb.c
> +++ b/net/sched/sch_htb.c
> @@ -1197,12 +1197,16 @@ static inline int htb_parent_last_child(struct htb_class *cl)
> return 1;
> }
>
> -static void htb_parent_to_leaf(struct htb_class *cl, struct Qdisc *new_q)
> +static void htb_parent_to_leaf(struct htb_sched *q, struct htb_class *cl,
> + struct Qdisc *new_q)
> {
> struct htb_class *parent = cl->parent;
>
> BUG_TRAP(!cl->level && cl->un.leaf.q && !cl->prio_activity);
>
> + if (parent->cmode != HTB_CAN_SEND)
> + htb_safe_rb_erase(&parent->pq_node, q->wait_pq + parent->level);
> +
> parent->level = 0;
> memset(&parent->un.inner, 0, sizeof(parent->un.inner));
> INIT_LIST_HEAD(&parent->un.leaf.drop_list);
> @@ -1300,7 +1304,7 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg)
> htb_deactivate(q, cl);
>
> if (last_child)
> - htb_parent_to_leaf(cl, new_q);
> + htb_parent_to_leaf(q, cl, new_q);
>
> if (--cl->refcnt == 0)
> htb_destroy_class(sch, cl);

--
Jan 'yanek' Bortl <yanek [at] ya.bofh. cz>
http://ya.bofh.cz/ | jab: yanek [at] mitranet. cz
-----------------------------------------------------------------
"Maybe one day you will learn that your way is not the only way."
Opher [StarGate: The Nox]
^ permalink raw reply [flat|nested] 10+ messages in thread