From: Stephen Hemminger <shemminger@osdl.org>
To: linux-kernel@vger.kernel.org
Subject: [RFT] mv643xxx_eth_start_xmit oops
Date: Fri, 10 Nov 2006 11:54:44 -0800 [thread overview]
Message-ID: <20061110115444.07f58e40@freekitty> (raw)
In-Reply-To: 20061110191745.GA13783@codepoet.org
On Fri, 10 Nov 2006 12:17:45 -0700
Erik Andersen <andersen@codepoet.org> wrote:
> I have a Pegasos2 powerpc system acting as my home server. With
> 2.6.16.x it was 100% stable and I had months of uptime, rebooting
> only to periodically apply security updates to the kernel.
>
> With 2.6.17 and 2.6.18, after an uptime of no more than 2 days,
> and usually much less, I get a kernel panic, with nothing in the
> log. I finally caught it in the act, and took a picture.
> http://codepoet.org/oops.jpg
>
> A quick transcription of the Oops in the screenshot follows:
> --------------------------------------------
>
The code int mv643xx_eth_start_xmit is not safe on SMP it was checking for space outside of lock.
Does the following (untested) fix it?
0. Fix race where space check is outside of lock.
1. Eliminate bogus BUG_ON()'s
2. Use proper transmit routine return values
3. Cleanup potential kernel log overrun if hit with unaligned frags
4. Compare with actual space needed rather than worst case
5. Use correct return code for case of linearize() failure.
---
drivers/net/mv643xx_eth.c | 30 ++++++++++++++----------------
1 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
index 9997081..4052bfe 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -1191,25 +1191,23 @@ static int mv643xx_eth_start_xmit(struct
struct net_device_stats *stats = &mp->stats;
unsigned long flags;
- BUG_ON(netif_queue_stopped(dev));
- BUG_ON(skb == NULL);
-
- if (mp->tx_ring_size - mp->tx_desc_count < MAX_DESCS_PER_SKB) {
- printk(KERN_ERR "%s: transmit with queue full\n", dev->name);
- netif_stop_queue(dev);
- return 1;
- }
-
- if (has_tiny_unaligned_frags(skb)) {
- if (__skb_linearize(skb)) {
- stats->tx_dropped++;
+ if (has_tiny_unaligned_frags(skb) && __skb_linearize(skb)) {
+ stats->tx_dropped++;
+ if (net_ratelimit())
printk(KERN_DEBUG "%s: failed to linearize tiny "
- "unaligned fragment\n", dev->name);
- return 1;
- }
+ "unaligned fragment\n", dev->name);
+ return NETDEV_TX_OK;
}
spin_lock_irqsave(&mp->lock, flags);
+ if (mp->tx_ring_size - mp->tx_desc_count < skb_shinfo(skb)->nr_frags + 1) {
+ if (!netif_queue_stopped(dev)) {
+ printk(KERN_ERR "%s: transmit with queue full\n", dev->name);
+ netif_stop_queue(dev);
+ }
+ spin_unlock_irqrestore(&mp->lock, flags);
+ return NETDEV_TX_BUSY;;
+ }
eth_tx_submit_descs_for_skb(mp, skb);
stats->tx_bytes = skb->len;
@@ -1221,7 +1219,7 @@ static int mv643xx_eth_start_xmit(struct
spin_unlock_irqrestore(&mp->lock, flags);
- return 0; /* success */
+ return NETDEV_TX_OK;
}
/*
--
1.4.1
--
Stephen Hemminger <shemminger@osdl.org>
WARNING: multiple messages have this Message-ID (diff)
From: Stephen Hemminger <shemminger@osdl.org>
To: Erik Andersen <andersen@codepoet.org>,
Dale Farnsworth <dale@farnsworth.org>,
Manish Lachwani <mlachwani@mvista.com>
Cc: netdev@vger.kernel.org
Subject: [RFT] mv643xxx_eth_start_xmit oops
Date: Fri, 10 Nov 2006 11:54:44 -0800 [thread overview]
Message-ID: <20061110115444.07f58e40@freekitty> (raw)
In-Reply-To: <20061110191745.GA13783@codepoet.org>
On Fri, 10 Nov 2006 12:17:45 -0700
Erik Andersen <andersen@codepoet.org> wrote:
> I have a Pegasos2 powerpc system acting as my home server. With
> 2.6.16.x it was 100% stable and I had months of uptime, rebooting
> only to periodically apply security updates to the kernel.
>
> With 2.6.17 and 2.6.18, after an uptime of no more than 2 days,
> and usually much less, I get a kernel panic, with nothing in the
> log. I finally caught it in the act, and took a picture.
> http://codepoet.org/oops.jpg
>
> A quick transcription of the Oops in the screenshot follows:
> --------------------------------------------
>
The code int mv643xx_eth_start_xmit is not safe on SMP it was checking for space outside of lock.
Does the following (untested) fix it?
0. Fix race where space check is outside of lock.
1. Eliminate bogus BUG_ON()'s
2. Use proper transmit routine return values
3. Cleanup potential kernel log overrun if hit with unaligned frags
4. Compare with actual space needed rather than worst case
5. Use correct return code for case of linearize() failure.
---
drivers/net/mv643xx_eth.c | 30 ++++++++++++++----------------
1 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
index 9997081..4052bfe 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -1191,25 +1191,23 @@ static int mv643xx_eth_start_xmit(struct
struct net_device_stats *stats = &mp->stats;
unsigned long flags;
- BUG_ON(netif_queue_stopped(dev));
- BUG_ON(skb == NULL);
-
- if (mp->tx_ring_size - mp->tx_desc_count < MAX_DESCS_PER_SKB) {
- printk(KERN_ERR "%s: transmit with queue full\n", dev->name);
- netif_stop_queue(dev);
- return 1;
- }
-
- if (has_tiny_unaligned_frags(skb)) {
- if (__skb_linearize(skb)) {
- stats->tx_dropped++;
+ if (has_tiny_unaligned_frags(skb) && __skb_linearize(skb)) {
+ stats->tx_dropped++;
+ if (net_ratelimit())
printk(KERN_DEBUG "%s: failed to linearize tiny "
- "unaligned fragment\n", dev->name);
- return 1;
- }
+ "unaligned fragment\n", dev->name);
+ return NETDEV_TX_OK;
}
spin_lock_irqsave(&mp->lock, flags);
+ if (mp->tx_ring_size - mp->tx_desc_count < skb_shinfo(skb)->nr_frags + 1) {
+ if (!netif_queue_stopped(dev)) {
+ printk(KERN_ERR "%s: transmit with queue full\n", dev->name);
+ netif_stop_queue(dev);
+ }
+ spin_unlock_irqrestore(&mp->lock, flags);
+ return NETDEV_TX_BUSY;;
+ }
eth_tx_submit_descs_for_skb(mp, skb);
stats->tx_bytes = skb->len;
@@ -1221,7 +1219,7 @@ static int mv643xx_eth_start_xmit(struct
spin_unlock_irqrestore(&mp->lock, flags);
- return 0; /* success */
+ return NETDEV_TX_OK;
}
/*
--
1.4.1
--
Stephen Hemminger <shemminger@osdl.org>
next prev parent reply other threads:[~2006-11-10 19:55 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-11-10 19:17 Repeatable Oops, mv643xx_eth in 2.6.18.x Erik Andersen
2006-11-10 19:54 ` Stephen Hemminger [this message]
2006-11-10 19:54 ` [RFT] mv643xxx_eth_start_xmit oops Stephen Hemminger
2006-11-10 20:30 ` Francois Romieu
2006-11-10 20:53 ` Dale Farnsworth
2006-11-10 21:03 ` Francois Romieu
2006-11-10 21:07 ` Stephen Hemminger
2006-11-10 21:30 ` Francois Romieu
2006-11-10 21:35 ` Stephen Hemminger
2006-11-10 22:03 ` [NET] Update documentation of skb_linearize Francois Romieu
2006-11-10 22:53 ` David Miller
2006-11-10 23:34 ` Francois Romieu
2006-11-10 20:47 ` [RFT] mv643xxx_eth_start_xmit oops Erik Andersen
2006-11-11 23:44 ` Erik Andersen
2006-11-10 20:29 ` Repeatable Oops, mv643xx_eth in 2.6.18.x Olaf Hering
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20061110115444.07f58e40@freekitty \
--to=shemminger@osdl.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.