* Networkl problems with lastest kernel....
@ 2008-07-21 16:18 Sean MacLennan
2008-07-21 16:31 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Sean MacLennan @ 2008-07-21 16:18 UTC
To: linuxppc-dev
I just did a git pull of Linus' kernel. It seems to be mainly network
changes... and I get the following oops. Anybody else seeing this?
I really don't have time to look at the problem right now, maybe
tonight.
Cheers,
Sean
------------[ cut here ]------------
Kernel BUG at c01ba650 [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
Warp
Modules linked in:
NIP: c01ba650 LR: c015e240 CTR: c0011f84
REGS: cf821d60 TRAP: 0700 Not tainted (2.6.26-pika)
MSR: 00029000 <EE,ME> CR: 44000082 XER: 0000005f
TASK = cf81f900[1] 'swapper' THREAD: cf820000
GPR00: 00000000 cf821e10 cf81f900 c02fd5a8 80000000 c0010684 00000000 ffffffff
GPR08: c02fd5a8 00000001 00000000 00000001 24000028 00000000 00000004 c02a0000
GPR16: 00400684 00800000 c02e0000 c0270000 c02e0000 c02e0000 c03206c0 00000001
GPR24: c02e0000 cf984000 cf984438 cf984380 00029000 00000000 cf9843d0 cf984000
NIP [c01ba650] __netif_schedule+0x28/0x84
LR [c015e240] emac_open+0x3d8/0x470
Call Trace:
[cf821e10] [cf984000] 0xcf984000 (unreliable)
[cf821e30] [c015e240] emac_open+0x3d8/0x470
[cf821e60] [c01bc2b4] dev_open+0xa8/0x118
[cf821e80] [c01bc1b4] dev_change_flags+0x168/0x1c0
[cf821ea0] [c02d3e48] ip_auto_config+0x19c/0xecc
[cf821f60] [c02ba83c] kernel_init+0x84/0x274
[cf821ff0] [c000c518] kernel_thread+0x48/0x64
Instruction dump:
4e800020 4bfffe48 7c0802a6 3d60c030 9421ffe0 396bd5a8 90010024 bfa10014
7c681b78 7c6b5a78 200b0000 7d605914 <0f0b0000> 39200002 38030024 7d600028
---[ end trace be4338b61948e802 ]---
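The instruction dump can be cross-checked against the register dump. The sketch below is not from the thread; it decodes PowerPC D-form instruction words assuming the standard field layout (6-bit primary opcode, two 5-bit register fields, 16-bit immediate). The helper names `decode_dform` and `lis_addi_address` are made up for illustration. Fed the marked word `<0f0b0000>`, it yields primary opcode 3 (`twi`) with TO = 24 and RA = 11, that is, a trap taken when r11 is non-zero, which fits TRAP: 0700 and a BUG-style check. The `lis`/`addi` pair `3d60c030`/`396bd5a8` earlier in the dump builds the constant c02fd5a8, the same value held in GPR03. Reading that constant as the address the function argument was compared against is an inference, not something the oops states.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC D-form instruction word (illustrative helper type). */
struct dform {
    unsigned opcode;  /* bits 0-5: primary opcode */
    unsigned rt;      /* bits 6-10: target register, or the TO mask for twi */
    unsigned ra;      /* bits 11-15: source register */
    uint16_t imm;     /* bits 16-31: raw 16-bit immediate */
};

/* Split a 32-bit instruction word into its D-form fields. */
static struct dform decode_dform(uint32_t w)
{
    struct dform d;
    d.opcode = (w >> 26) & 0x3f;
    d.rt = (w >> 21) & 0x1f;
    d.ra = (w >> 16) & 0x1f;
    d.imm = (uint16_t)(w & 0xffff);
    return d;
}

/* Sign-extend a 16-bit immediate to 32 bits using unsigned arithmetic only. */
static uint32_t sign_extend16(uint16_t imm)
{
    return ((uint32_t)imm ^ 0x8000u) - 0x8000u;
}

/* Constant built by a "lis rX,hi" followed by "addi rX,rX,lo". */
static uint32_t lis_addi_address(uint32_t lis, uint32_t addi)
{
    struct dform hi = decode_dform(lis);
    struct dform lo = decode_dform(addi);
    return ((uint32_t)hi.imm << 16) + sign_extend16(lo.imm);
}
```

Calling `decode_dform(0x0f0b0000u)` and `lis_addi_address(0x3d60c030u, 0x396bd5a8u)` reproduces the values quoted above.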
* Re: Networkl problems with lastest kernel....
2008-07-21 16:18 Networkl problems with lastest kernel Sean MacLennan
@ 2008-07-21 16:31 ` David Miller
2008-07-21 17:05 ` Sean MacLennan
0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2008-07-21 16:31 UTC
To: smaclennan; +Cc: linuxppc-dev
From: Sean MacLennan <smaclennan@pikatech.com>
Date: Mon, 21 Jul 2008 12:18:29 -0400
> I just did a git pull of Linus' kernel. It seems to be mainly network
> changes... and I get the following oops. Anybody else seeing this?
>
> I really don't have time to look at the problem right now, maybe
> tonight.
If I had a penny for every driver with broken TX queue handling...
Please try this patch, thanks:
diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
index 2e720f2..4e01d29 100644
--- a/drivers/net/ibm_newemac/core.c
+++ b/drivers/net/ibm_newemac/core.c
@@ -1157,6 +1157,7 @@ static int emac_open(struct net_device *ndev)
 	mal_enable_rx_channel(dev->mal, dev->mal_rx_chan);
 	emac_tx_enable(dev);
 	emac_rx_enable(dev);
+	netif_start_queue(dev);
 	emac_netif_start(dev);
 	mutex_unlock(&dev->link_lock);
* Re: Networkl problems with lastest kernel....
2008-07-21 16:31 ` David Miller
@ 2008-07-21 17:05 ` Sean MacLennan
2008-07-21 17:16 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Sean MacLennan @ 2008-07-21 17:05 UTC
To: David Miller; +Cc: linuxppc-dev
On Mon, 21 Jul 2008 09:31:10 -0700 (PDT)
"David Miller" <davem@davemloft.net> wrote:
> If I had a penny for every driver with broken TX queue handling...
>
> Please try this patch, thanks:
>
> diff --git a/drivers/net/ibm_newemac/core.c b/drivers/net/ibm_newemac/core.c
> index 2e720f2..4e01d29 100644
> --- a/drivers/net/ibm_newemac/core.c
> +++ b/drivers/net/ibm_newemac/core.c
> @@ -1157,6 +1157,7 @@ static int emac_open(struct net_device *ndev)
> mal_enable_rx_channel(dev->mal, dev->mal_rx_chan);
> emac_tx_enable(dev);
> emac_rx_enable(dev);
> + netif_start_queue(dev);
> emac_netif_start(dev);
I had to change the dev to ndev: dev is a struct emac_instance and
ndev is the struct net_device.
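The type mismatch can be illustrated with a small userspace model. This is a sketch only: the two structures are hypothetical stand-ins reduced to one field each, `netif_start_queue` here is a model that borrows the kernel helper's name, and `emac_open_sketch` is an invented function. The back-pointer from the private data to its net_device is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
struct net_device {
    bool tx_stopped;            /* models the "TX queue stopped" state */
};

struct emac_instance {
    struct net_device *ndev;    /* assumed back-pointer to the owning device */
};

/* Userspace model of the helper: it operates on a net_device. */
static void netif_start_queue(struct net_device *ndev)
{
    ndev->tx_stopped = false;
}

/* Models the tail of emac_open(): ndev is the argument, dev the driver data. */
static int emac_open_sketch(struct net_device *ndev, struct emac_instance *dev)
{
    if (dev == NULL || dev->ndev != ndev)
        return -1;
    /* Passing dev here would not type-check: it is not a net_device. */
    netif_start_queue(ndev);
    return 0;
}
```

With a stopped device and matching private data, `emac_open_sketch` returns 0 and leaves `tx_stopped` cleared.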
It still crashes, but in a different way. I think the problem is deeper
than I thought. The kernel has been OOPSing on reboot in nfs_remount
or thereabouts for a few days. I thought the problem was in a debug
driver I was using, so I ignored it for now.
But it also happens without the debug driver, so I think I have
corruption somewhere in the kernel.
But I have attached the new OOPS anyway.
Cheers,
Sean
------------[ cut here ]------------
Kernel BUG at c01ba66c [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
Warp
Modules linked in:
NIP: c01ba66c LR: c015da58 CTR: 00000000
REGS: cf839e90 TRAP: 0700 Not tainted (2.6.26-pika)
MSR: 00029000 <EE,ME> CR: 44000042 XER: 0000005f
TASK = cf81e880[5] 'events/0' THREAD: cf838000
GPR00: 00000000 cf839f40 cf81e880 c02fd5a8 cf97856a 00000002 00000002 ffffffff
GPR08: c02fd5a8 00000001 00000000 00000001 24000048 00000000 0ffac000 007fff9c
GPR16: 00400684 00800000 007fff00 0ffa93c4 00000002 c02e95f8 c02f0000 c02e95f8
GPR24: c02f0000 00000000 c0030000 cf984404 cf9843d0 cf984000 cf984380 cf984000
NIP [c01ba66c] __netif_schedule+0x28/0x84
LR [c015da58] emac_link_timer+0x704/0x754
Call Trace:
[cf839f40] [c015c9f4] __emac_set_multicast_list+0x5c/0xb0 (unreliable)
[cf839f60] [c015da58] emac_link_timer+0x704/0x754
[cf839f80] [c002db54] run_workqueue+0x9c/0x138
[cf839fa0] [c002df54] worker_thread+0x50/0xb4
[cf839fd0] [c0031424] kthread+0x84/0x8c
[cf839ff0] [c000c518] kernel_thread+0x48/0x64
Instruction dump:
4e800020 4bfffe48 7c0802a6 3d60c030 9421ffe0 396bd5a8 90010024 bfa10014
7c681b78 7c6b5a78 200b0000 7d605914 <0f0b0000> 39200002 38030024 7d600028
---[ end trace 3e8d5079b3c922db ]---
* Re: Networkl problems with lastest kernel....
2008-07-21 17:05 ` Sean MacLennan
@ 2008-07-21 17:16 ` David Miller
2008-07-21 17:37 ` Sean MacLennan
2008-07-21 23:59 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 6+ messages in thread
From: David Miller @ 2008-07-21 17:16 UTC
To: smaclennan; +Cc: linuxppc-dev
From: Sean MacLennan <smaclennan@pikatech.com>
Date: Mon, 21 Jul 2008 13:05:36 -0400
> But I have attached the new OOPS anyway.
The same problem is still there, this driver will
unfortunately require quite a bit more surgery.
You can instead add the following patch, it will
warn instead of BUG on you, and try to continue.
From 867d79fb9a4d5929ad8335c896fcfe11c3b2ef14 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 21 Jul 2008 09:54:18 -0700
Subject: [PATCH] net: In __netif_schedule() use WARN_ON instead of BUG_ON
Signed-off-by: David S. Miller <davem@davemloft.net>
---
net/core/dev.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 7e2d527..cbc34c0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1327,7 +1327,8 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 void __netif_schedule(struct Qdisc *q)
 {
-	BUG_ON(q == &noop_qdisc);
+	if (WARN_ON_ONCE(q == &noop_qdisc))
+		return;
 	if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state)) {
 		struct softnet_data *sd;
--
1.5.6.4.433.g09651
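The behavioural change in this patch is that a fatal assertion becomes a one-time warning followed by an early return. A userspace sketch of that control flow follows; it assumes nothing about the kernel beyond what the diff shows. `warn_on_once`, `netif_schedule_sketch`, the `scheduled` flag and the `warnings_printed` counter are hypothetical stand-ins for WARN_ON_ONCE, __netif_schedule, the __QDISC_STATE_SCHED bit and the kernel log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct Qdisc. */
struct Qdisc {
    bool scheduled;             /* models the __QDISC_STATE_SCHED bit */
};

static struct Qdisc noop_qdisc; /* placeholder qdisc that must never be scheduled */
static int warnings_printed;    /* counts how many warnings were actually emitted */

/* Simplified WARN_ON_ONCE: report only the first hit, always return cond. */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = true;
        warnings_printed++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Returns true if q was newly marked as scheduled. */
static bool netif_schedule_sketch(struct Qdisc *q)
{
    static bool warned;

    /* Patched behaviour: warn once and bail out instead of halting. */
    if (warn_on_once(q == &noop_qdisc, &warned, "schedule on noop_qdisc"))
        return false;

    if (q->scheduled)
        return false;           /* already queued, nothing to do */
    q->scheduled = true;
    return true;
}
```

Scheduling `noop_qdisc` repeatedly prints a single warning and leaves it unscheduled, while an ordinary qdisc is scheduled exactly once.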
* Re: Networkl problems with lastest kernel....
2008-07-21 17:16 ` David Miller
@ 2008-07-21 17:37 ` Sean MacLennan
2008-07-21 23:59 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 6+ messages in thread
From: Sean MacLennan @ 2008-07-21 17:37 UTC
To: David Miller; +Cc: linuxppc-dev
On Mon, 21 Jul 2008 10:16:50 -0700 (PDT)
"David Miller" <davem@davemloft.net> wrote:
> The same problem is still there, this driver will
> unfortunately require quite a bit more surgery.
>
> You can instead add the following patch, it will
> warn instead of BUG on you, and try to continue.
Ok, that lets me boot. Thanks.
Cheers,
Sean
* Re: Networkl problems with lastest kernel....
2008-07-21 17:16 ` David Miller
2008-07-21 17:37 ` Sean MacLennan
@ 2008-07-21 23:59 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2008-07-21 23:59 UTC
To: David Miller; +Cc: linuxppc-dev, smaclennan
On Mon, 2008-07-21 at 10:16 -0700, David Miller wrote:
> From: Sean MacLennan <smaclennan@pikatech.com>
> Date: Mon, 21 Jul 2008 13:05:36 -0400
>
> > But I have attached the new OOPS anyway.
>
> The same problem is still there, this driver will
> unfortunately require quite a bit more surgery.
>
> You can instead add the following patch, it will
> warn instead of BUG on you, and try to continue.
Argh, EMAC ! I suppose I need to go have a look & fix it :-)
EMAC does some strange things such as sharing one NAPI instance for
multiple devices. Dunno if that's related to the problem. I need to dig
a bit.
Cheers,
Ben.
> From 867d79fb9a4d5929ad8335c896fcfe11c3b2ef14 Mon Sep 17 00:00:00 2001
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 21 Jul 2008 09:54:18 -0700
> Subject: [PATCH] net: In __netif_schedule() use WARN_ON instead of BUG_ON
>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> net/core/dev.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 7e2d527..cbc34c0 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1327,7 +1327,8 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
>
> void __netif_schedule(struct Qdisc *q)
> {
> - BUG_ON(q == &noop_qdisc);
> + if (WARN_ON_ONCE(q == &noop_qdisc))
> + return;
>
> if (!test_and_set_bit(__QDISC_STATE_SCHED, &q->state)) {
> struct softnet_data *sd;