netdev.vger.kernel.org archive mirror
* [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
@ 2010-02-19 15:13 Atsushi Nemoto
  2010-02-19 15:31 ` Atsushi Nemoto
  2010-02-23  9:25 ` David Miller
  0 siblings, 2 replies; 4+ messages in thread
From: Atsushi Nemoto @ 2010-02-19 15:13 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Ralf Roesch

netif_wake_queue() is already called correctly (i.e. only on the
!txfull condition) from the txdone routine, so the unconditional call
to netif_wake_queue() here is wrong.  It can cause the start_xmit
routine to be called in the txfull state and trigger the BUG_ON.

This bug does not happen when NAPI is disabled, because after txdone
there must be at least one free tx slot.  With NAPI this is no longer
true, and the BUG_ON can hit under heavy load.

NAPI was enabled in this driver in 2.6.33-rc1, so this is a
regression from the 2.6.32 kernel.

Reported-by: Ralf Roesch <ralf.roesch@rw-gmbh.de>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
---
 drivers/net/tc35815.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tc35815.c b/drivers/net/tc35815.c
index 75a669d..d71c197 100644
--- a/drivers/net/tc35815.c
+++ b/drivers/net/tc35815.c
@@ -1437,7 +1437,6 @@ static int tc35815_do_interrupt(struct net_device *dev, u32 status, int limit)
 		/* Transmit complete. */
 		lp->lstats.tx_ints++;
 		tc35815_txdone(dev);
-		netif_wake_queue(dev);
 		if (ret < 0)
 			ret = 0;
 	}
-- 
1.5.6.5
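
The failure mode described in the commit message can be sketched with a small user-space toy model. This is only an illustration under stated assumptions, not driver code: every name here (toy_dev, toy_start_xmit, toy_txdone, toy_tx_interrupt, toy_run, TX_RING_SIZE) is hypothetical, the queue state is a plain flag standing in for netif_stop_queue()/netif_wake_queue(), and the BUG_ON is modelled as a counter. It assumes a tx-complete event after which the ring is still full, which is the situation the commit message says can occur with NAPI.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 4	/* hypothetical ring size */

struct toy_dev {
	int used;		/* tx slots currently in use */
	bool queue_stopped;	/* stands in for the netif queue state */
	int bug_hits;		/* times xmit ran on a full ring */
};

static bool toy_tx_full(const struct toy_dev *d)
{
	return d->used == TX_RING_SIZE;
}

/* Models start_xmit: the stack calls it only while the queue is awake. */
static void toy_start_xmit(struct toy_dev *d)
{
	if (toy_tx_full(d)) {
		d->bug_hits++;	/* the real driver would hit BUG_ON here */
		return;
	}
	d->used++;
	if (toy_tx_full(d))
		d->queue_stopped = true;
}

/* Models txdone: reclaim slots, wake the queue only if there is room. */
static void toy_txdone(struct toy_dev *d, int completed)
{
	assert(completed >= 0 && completed <= d->used);
	d->used -= completed;
	if (d->queue_stopped && !toy_tx_full(d))
		d->queue_stopped = false;
}

/* Models the "Transmit complete" branch of the interrupt handler. */
static void toy_tx_interrupt(struct toy_dev *d, int completed,
			     bool unconditional_wake)
{
	toy_txdone(d, completed);
	if (unconditional_wake)
		d->queue_stopped = false;	/* the call the patch removes */
}

/* Fill the ring, handle a tx event that frees nothing, then let the
 * stack transmit again if the queue is awake.  Returns bug_hits. */
static int toy_run(bool unconditional_wake)
{
	struct toy_dev d = { 0, false, 0 };

	while (!d.queue_stopped)
		toy_start_xmit(&d);
	toy_tx_interrupt(&d, 0, unconditional_wake);
	if (!d.queue_stopped)
		toy_start_xmit(&d);
	return d.bug_hits;
}
```

In this model toy_run(true) reaches the full-ring transmit path once, while toy_run(false) never does, because the conditional wake inside toy_txdone leaves the queue stopped until a slot is actually free.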



* Re: [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
  2010-02-19 15:13 [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON Atsushi Nemoto
@ 2010-02-19 15:31 ` Atsushi Nemoto
  2010-02-23  9:26   ` David Miller
  2010-02-23  9:25 ` David Miller
  1 sibling, 1 reply; 4+ messages in thread
From: Atsushi Nemoto @ 2010-02-19 15:31 UTC (permalink / raw)
  To: netdev; +Cc: davem, ralf.roesch

On Sat, 20 Feb 2010 00:13:58 +0900, Atsushi Nemoto <anemo@mba.ocn.ne.jp> wrote:
> @@ -1437,7 +1437,6 @@ static int tc35815_do_interrupt(struct net_device *dev, u32 status, int limit)
>  		/* Transmit complete. */
>  		lp->lstats.tx_ints++;
>  		tc35815_txdone(dev);
> -		netif_wake_queue(dev);
>  		if (ret < 0)
>  			ret = 0;
>  	}

BTW, it looks like isa-skeleton.c has the same problem.

void net_tx(struct net_device *dev)
{
...
	if (netif_queue_stopped(dev) && ! tx_full(dev))
		netif_wake_queue(dev);
...
}

static irqreturn_t net_interrupt(int irq, void *dev_id)
{
...
	if (status & TX_INTR) {
		/* Transmit complete. */
		net_tx(dev);
		np->stats.tx_packets++;
		netif_wake_queue(dev);
	}
...
}


I suppose this is the distant origin of the bug :)

---
Atsushi Nemoto


* Re: [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
  2010-02-19 15:13 [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON Atsushi Nemoto
  2010-02-19 15:31 ` Atsushi Nemoto
@ 2010-02-23  9:25 ` David Miller
  1 sibling, 0 replies; 4+ messages in thread
From: David Miller @ 2010-02-23  9:25 UTC (permalink / raw)
  To: anemo; +Cc: netdev, ralf.roesch

From: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Date: Sat, 20 Feb 2010 00:13:58 +0900

> netif_wake_queue() is already called correctly (i.e. only on the
> !txfull condition) from the txdone routine, so the unconditional call
> to netif_wake_queue() here is wrong.  It can cause the start_xmit
> routine to be called in the txfull state and trigger the BUG_ON.
> 
> This bug does not happen when NAPI is disabled, because after txdone
> there must be at least one free tx slot.  With NAPI this is no longer
> true, and the BUG_ON can hit under heavy load.
> 
> NAPI was enabled in this driver in 2.6.33-rc1, so this is a
> regression from the 2.6.32 kernel.
> 
> Reported-by: Ralf Roesch <ralf.roesch@rw-gmbh.de>
> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>

Patch applied, thanks.


* Re: [PATCH] tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
  2010-02-19 15:31 ` Atsushi Nemoto
@ 2010-02-23  9:26   ` David Miller
  0 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2010-02-23  9:26 UTC (permalink / raw)
  To: anemo; +Cc: netdev, ralf.roesch

From: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Date: Sat, 20 Feb 2010 00:31:22 +0900 (JST)

> On Sat, 20 Feb 2010 00:13:58 +0900, Atsushi Nemoto <anemo@mba.ocn.ne.jp> wrote:
>> @@ -1437,7 +1437,6 @@ static int tc35815_do_interrupt(struct net_device *dev, u32 status, int limit)
>>  		/* Transmit complete. */
>>  		lp->lstats.tx_ints++;
>>  		tc35815_txdone(dev);
>> -		netif_wake_queue(dev);
>>  		if (ret < 0)
>>  			ret = 0;
>>  	}
> 
> BTW, it looks like isa-skeleton.c has the same problem.
 ...
> I suppose this is the distant origin of the bug :)

Please send a patch for this.

Thank you.
