* [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang
@ 2008-01-15 5:25 Frans Pop
2008-01-15 5:53 ` David Miller
0 siblings, 1 reply; 27+ messages in thread
From: Frans Pop @ 2008-01-15 5:25 UTC (permalink / raw)
To: netdev; +Cc: linux-kernel
After compiling v2.6.24-rc7-163-g1a1b285 (x86_64) yesterday I suddenly see this error
repeatedly:
kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
kernel: Tx Queue <0>
kernel: TDH <a>
kernel: TDT <a>
kernel: next_to_use <a>
kernel: next_to_clean <ff>
kernel: buffer_info[next_to_clean]
kernel: time_stamp <10002738a>
kernel: next_to_watch <ff>
kernel: jiffies <1000275b4>
kernel: next_to_watch.status <1>
My previous kernel was v2.6.24-rc7 and with that this error did not occur. I
have also never seen it with earlier kernels.
The values for "TX Queue" and "next_to_watch.status" are constant, the
others vary.
My NIC is:
01:00.0 Ethernet controller [0200]: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
01:00.0 0200: 8086:108c (rev 03)
Subsystem: 8086:3096
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 1273
Region 0: Memory at 90200000 (32-bit, non-prefetchable) [size=128K]
Region 1: Memory at 90100000 (32-bit, non-prefetchable) [size=1M]
Region 2: I/O ports at 1000 [size=32]
Capabilities: [c8] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
Address: 00000000fee0300c Data: 41a9
Capabilities: [e0] Express Endpoint IRQ 0
Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
Device: Latency L0s <512ns, L1 <64us
Device: AtnBtn- AtnInd- PwrInd-
Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
Link: Supported Speed 2.5Gb/s, Width x1, ASPM unknown, Port 0
Link: Latency L0s <128ns, L1 <64us
Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch-
Link: Speed 2.5Gb/s, Width x1
The system is an Intel D945GCZ main board with
Intel(R) Pentium(R) D CPU 3.20GHz (dual core) processor.
Cheers,
FJP
^ permalink raw reply [flat|nested] 27+ messages in thread* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 5:25 [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang Frans Pop @ 2008-01-15 5:53 ` David Miller 2008-01-15 6:17 ` Frans Pop ` (2 more replies) 0 siblings, 3 replies; 27+ messages in thread From: David Miller @ 2008-01-15 5:53 UTC (permalink / raw) To: elendil; +Cc: netdev, linux-kernel From: Frans Pop <elendil@planet.nl> Date: Tue, 15 Jan 2008 06:25:10 +0100 > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Does this make the problem go away? (Note this isn't the final correct patch we should apply. There is no reason why this revert back to the older ->poll() logic here should have any effect on the TX hang triggering...) diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 13d57b0..cada32c 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget) { struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); struct net_device *poll_dev = adapter->netdev; - int work_done = 0; + int tx_work = 0, work_done = 0; /* Must NOT use netdev_priv macro here. */ adapter = poll_dev->priv; @@ -3929,8 +3929,8 @@ e1000_clean(struct napi_struct *napi, int budget) * simultaneously. A failure obtaining the lock means * tx_ring[0] is currently being cleaned anyway. */ if (spin_trylock(&adapter->tx_queue_lock)) { - e1000_clean_tx_irq(adapter, - &adapter->tx_ring[0]); + tx_work = e1000_clean_tx_irq(adapter, + &adapter->tx_ring[0]); spin_unlock(&adapter->tx_queue_lock); } @@ -3938,7 +3938,7 @@ e1000_clean(struct napi_struct *napi, int budget) &work_done, budget); /* If budget not fully consumed, exit the polling mode */ - if (work_done < budget) { + if (!tx_work && (work_done < budget)) { if (likely(adapter->itr_setting & 3)) e1000_set_itr(adapter); netif_rx_complete(poll_dev, napi); ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 5:53 ` David Miller @ 2008-01-15 6:17 ` Frans Pop 2008-01-15 14:04 ` Frans Pop 2008-01-16 9:02 ` Badalian Vyacheslav 2 siblings, 0 replies; 27+ messages in thread From: Frans Pop @ 2008-01-15 6:17 UTC (permalink / raw) To: David Miller; +Cc: netdev, linux-kernel Wow. That's fast! :-) On Tuesday 15 January 2008, David Miller wrote: > From: Frans Pop <elendil@planet.nl> > > > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > > Does this make the problem go away? I'm compiling a kernel with the patch now. Will let you know the result. May take a while as I don't know how to trigger the bug, so I'll just have to let it run for some time. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 5:53 ` David Miller 2008-01-15 6:17 ` Frans Pop @ 2008-01-15 14:04 ` Frans Pop 2008-01-15 16:04 ` slavon 2008-01-16 9:02 ` Badalian Vyacheslav 2 siblings, 1 reply; 27+ messages in thread From: Frans Pop @ 2008-01-15 14:04 UTC (permalink / raw) To: David Miller; +Cc: netdev, linux-kernel On Tuesday 15 January 2008, David Miller wrote: > From: Frans Pop <elendil@planet.nl> > > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang > > Does this make the problem go away? Yes, it very much looks like that solves it. I ran with the patch for 6 hours or so without any errors. I then switched back to an unpatched kernel and they reappeared immediately. > (Note this isn't the final correct patch we should apply. There > is no reason why this revert back to the older ->poll() logic > here should have any effect on the TX hang triggering...) s/no reason/no obvious reason/ ? ;-) Cheers, FJP ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 14:04 ` Frans Pop @ 2008-01-15 16:04 ` slavon 2008-01-15 21:53 ` Brandeburg, Jesse 0 siblings, 1 reply; 27+ messages in thread From: slavon @ 2008-01-15 16:04 UTC (permalink / raw) To: Frans Pop; +Cc: David Miller, netdev, linux-kernel Quoting Frans Pop <elendil@planet.nl>: > On Tuesday 15 January 2008, David Miller wrote: >> From: Frans Pop <elendil@planet.nl> >> > kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang >> >> Does this make the problem go away? > > Yes, it very much looks like that solves it. > I ran with the patch for 6 hours or so without any errors. I then switched > back to an unpatched kernel and they reappeared immediately. > >> (Note this isn't the final correct patch we should apply. There >> is no reason why this revert back to the older ->poll() logic >> here should have any effect on the TX hang triggering...) > > s/no reason/no obvious reason/ ? ;-) > > Cheers, > FJP > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Hello. I also try your patch (apply to 2.6.24-rc7-git2) I catch this message in dmesg [ 1771.796954] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang [ 1771.796957] Tx Queue <0> [ 1771.796958] TDH <54> [ 1771.796959] TDT <54> [ 1771.796960] next_to_use <54> [ 1771.796961] next_to_clean <a9> [ 1771.796962] buffer_info[next_to_clean] [ 1771.796963] time_stamp <14d72e> [ 1771.796964] next_to_watch <a9> [ 1771.796965] jiffies <14ddd3> [ 1771.796966] next_to_watch.status <1> Thanks. ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. ^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 16:04 ` slavon @ 2008-01-15 21:53 ` Brandeburg, Jesse 2008-01-16 5:02 ` David Miller 0 siblings, 1 reply; 27+ messages in thread From: Brandeburg, Jesse @ 2008-01-15 21:53 UTC (permalink / raw) To: slavon, Frans Pop; +Cc: David Miller, netdev, linux-kernel slavon@bigtelecom.ru wrote: > Quoting Frans Pop <elendil@planet.nl>: >>> (Note this isn't the final correct patch we should apply. There is >>> no reason why this revert back to the older ->poll() logic here >>> should have any effect on the TX hang triggering...) >> >> s/no reason/no obvious reason/ ? ;-) The tx code has an "early exit" that tries to limit the amount of tx packets handled in a single poll loop and requires napi or interrupt rescheduling based on the return value from e1000_clean_tx_irq. see this code in e1000_clean_tx_irq 4005 #ifdef CONFIG_E1000_NAPI 4006 #define E1000_TX_WEIGHT 64 4007 > > /* weight of a sort for tx, to avoid endless transmit cleanup */ 4008 > > if (count++ == E1000_TX_WEIGHT) break; 4009 #endif I think that is probably related. For a test you could apply the original patch, and remove this "break" just by commenting out line 4008. This would guarantee all tx work is cleaned at every e1000_clean Jesse ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 21:53 ` Brandeburg, Jesse @ 2008-01-16 5:02 ` David Miller 2008-01-16 8:56 ` Frans Pop 2008-01-17 7:09 ` Brandeburg, Jesse 0 siblings, 2 replies; 27+ messages in thread From: David Miller @ 2008-01-16 5:02 UTC (permalink / raw) To: jesse.brandeburg; +Cc: slavon, elendil, netdev, linux-kernel From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> Date: Tue, 15 Jan 2008 13:53:43 -0800 > The tx code has an "early exit" that tries to limit the amount of tx > packets handled in a single poll loop and requires napi or interrupt > rescheduling based on the return value from e1000_clean_tx_irq. That explains everything, thanks Jesse. Ok, here is the patch I'll propose to fix this. The goal is to make it as simple as possible without regressing the thing we were trying to fix. Something more sophisticated can be done later. Three of the 5 Intel drivers had the TX breakout logic. e1000, e1000e, and ixgbe. e100 and ixgb did not, so they don't have any problems we need to fix here. What the fix does is behave as if the budget was fully consumed if *_clean_tx_irq() returns true. The only valid way to return from ->poll() without copleting the NAPI poll is by returning work_done == budget. That signals to the caller that the NAPI instance has not been descheduled and therefore the caller fully owns the NAPI context. This does mean that for these drivers any time TX work is done, we'll loop at least one extra time in the ->poll() loop of net_rx_work() but that is historically what these drivers have caused to happen for years. For 2.6.25 or similar I would suggest investigating courses of action to bring closure and consistency to this: 1) Determine whether the loop breakout is actually necessary. Jesse explained to me that they had seen a case where a thread on one cpu feeding the TX ring could keep a thread on another cpu constantly running the *_clean_tx_irq() code in a loop. I find this hard to believe since even the slowest CPU should be able to free up TX entries faster than they can be transmitted on gigabit links :-) 2) If the investigation in #1 deems the breakout logic is necessary, then consistently amongst all the 5 drivers a policy should be implemented which is integrated with the NAPI budgetting logic. For example, the simplest thing to do is to pass the budget and the "work_done" thing down into *_clean_tx_irq() and break out if it is exceeded. As a further refinement we can say that TX work is about 1/4 the expense of RX work and adjust the budget checking logic to match that. [NET]: Fix TX timeout regression in Intel drivers. This fixes a regression added by changeset 53e52c729cc169db82a6105fac7a166e10c2ec36 ("[NET]: Make ->poll() breakout consistent in Intel ethernet drivers.") As pointed out by Jesse Brandeburg, for three of the drivers edited above there is breakout logic in the *_clean_tx_irq() code to prevent running TX reclaim forever. If this occurs, we have to elide NAPI poll completion or else those TX events will never be serviced. Signed-off-by: David S. Miller <davem@davemloft.net> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 13d57b0..0c9a6f7 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget) { struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); struct net_device *poll_dev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* Must NOT use netdev_priv macro here. */ adapter = poll_dev->priv; @@ -3929,14 +3929,17 @@ e1000_clean(struct napi_struct *napi, int budget) * simultaneously. A failure obtaining the lock means * tx_ring[0] is currently being cleaned anyway. */ if (spin_trylock(&adapter->tx_queue_lock)) { - e1000_clean_tx_irq(adapter, - &adapter->tx_ring[0]); + tx_cleaned = e1000_clean_tx_irq(adapter, + &adapter->tx_ring[0]); spin_unlock(&adapter->tx_queue_lock); } adapter->clean_rx(adapter, &adapter->rx_ring[0], &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { if (likely(adapter->itr_setting & 3)) diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index 4a6fc74..2ab3bfb 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -1384,7 +1384,7 @@ static int e1000_clean(struct napi_struct *napi, int budget) { struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); struct net_device *poll_dev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* Must NOT use netdev_priv macro here. */ adapter = poll_dev->priv; @@ -1394,12 +1394,15 @@ static int e1000_clean(struct napi_struct *napi, int budget) * simultaneously. A failure obtaining the lock means * tx_ring is currently being cleaned anyway. */ if (spin_trylock(&adapter->tx_queue_lock)) { - e1000_clean_tx_irq(adapter); + tx_cleaned = e1000_clean_tx_irq(adapter); spin_unlock(&adapter->tx_queue_lock); } adapter->clean_rx(adapter, &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { if (adapter->itr_setting & 3) diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c index a564916..de3f45e 100644 --- a/drivers/net/ixgbe/ixgbe_main.c +++ b/drivers/net/ixgbe/ixgbe_main.c @@ -1468,13 +1468,16 @@ static int ixgbe_clean(struct napi_struct *napi, int budget) struct ixgbe_adapter *adapter = container_of(napi, struct ixgbe_adapter, napi); struct net_device *netdev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* In non-MSIX case, there is no multi-Tx/Rx queue */ - ixgbe_clean_tx_irq(adapter, adapter->tx_ring); + tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring); ixgbe_clean_rx_irq(adapter, &adapter->rx_ring[0], &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { netif_rx_complete(netdev, napi); ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 5:02 ` David Miller @ 2008-01-16 8:56 ` Frans Pop 2008-01-16 10:29 ` David Miller 2008-01-17 7:09 ` Brandeburg, Jesse 1 sibling, 1 reply; 27+ messages in thread From: Frans Pop @ 2008-01-16 8:56 UTC (permalink / raw) To: David Miller; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel On Wednesday 16 January 2008, David Miller wrote: > Ok, here is the patch I'll propose to fix this. The goal is to make > it as simple as possible without regressing the thing we were trying > to fix. Looks good to me. Tested with -rc8. Cheers, FJP ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 8:56 ` Frans Pop @ 2008-01-16 10:29 ` David Miller 2008-01-16 17:07 ` Robert Olsson 0 siblings, 1 reply; 27+ messages in thread From: David Miller @ 2008-01-16 10:29 UTC (permalink / raw) To: elendil; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel From: Frans Pop <elendil@planet.nl> Date: Wed, 16 Jan 2008 09:56:08 +0100 > On Wednesday 16 January 2008, David Miller wrote: > > Ok, here is the patch I'll propose to fix this. The goal is to make > > it as simple as possible without regressing the thing we were trying > > to fix. > > Looks good to me. Tested with -rc8. Thanks for testing. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 10:29 ` David Miller @ 2008-01-16 17:07 ` Robert Olsson 2008-01-18 12:11 ` David Miller 0 siblings, 1 reply; 27+ messages in thread From: Robert Olsson @ 2008-01-16 17:07 UTC (permalink / raw) To: David Miller; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel David Miller writes: > > On Wednesday 16 January 2008, David Miller wrote: > > > Ok, here is the patch I'll propose to fix this. The goal is to make > > > it as simple as possible without regressing the thing we were trying > > > to fix. > > > > Looks good to me. Tested with -rc8. > > Thanks for testing. Yes that code looks nice. I'm using the patch but I've noticed another phenomena with the current e1000 driver. There is a race when taking a device down at high traffic loads. I've tracked and instrumented and it seems like occasionly irq_sem can get bump up so interrupts can't be enabled again. eth0 e1000_irq_enable sem = 1 <- High netload eth0 e1000_irq_enable sem = 1 eth0 e1000_irq_enable sem = 1 eth0 e1000_irq_enable sem = 1 eth0 e1000_irq_enable sem = 1 eth0 e1000_irq_enable sem = 1 eth0 e1000_irq_enable sem = 1 <- ifconfig eth0 down eth0 e1000_irq_disable sem = 2 **e1000_open <- ifconfig eth0 up eth0 e1000_irq_disable sem = 3 Dead. irq's can't be enabled e1000_irq_enable miss eth0 e1000_irq_enable sem = 2 e1000_irq_enable miss eth0 e1000_irq_enable sem = 1 ADDRCONF(NETDEV_UP): eth0: link is not ready Cheers --ro static void e1000_irq_disable(struct e1000_adapter *adapter) { atomic_inc(&adapter->irq_sem); E1000_WRITE_REG(&adapter->hw, IMC, ~0); E1000_WRITE_FLUSH(&adapter->hw); synchronize_irq(adapter->pdev->irq); if(adapter->netdev->ifindex == 3) printk("%s e1000_irq_disable sem = %d\n", adapter->netdev->name, atomic_read(&adapter->irq_sem)); } static void e1000_irq_enable(struct e1000_adapter *adapter) { if (likely(atomic_dec_and_test(&adapter->irq_sem))) { E1000_WRITE_REG(&adapter->hw, IMS, IMS_ENABLE_MASK); E1000_WRITE_FLUSH(&adapter->hw); } else printk("e1000_irq_enable miss\n"); if(adapter->netdev->ifindex == 3) printk("%s e1000_irq_enable sem = %d\n", adapter->netdev->name, atomic_read(&adapter->irq_sem)); } ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 17:07 ` Robert Olsson @ 2008-01-18 12:11 ` David Miller 2008-01-18 13:00 ` Robert Olsson 2008-01-21 13:27 ` Robert Olsson 0 siblings, 2 replies; 27+ messages in thread From: David Miller @ 2008-01-18 12:11 UTC (permalink / raw) To: Robert.Olsson; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel From: Robert Olsson <Robert.Olsson@data.slu.se> Date: Wed, 16 Jan 2008 18:07:38 +0100 > > eth0 e1000_irq_enable sem = 1 <- High netload > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 > eth0 e1000_irq_enable sem = 1 <- ifconfig eth0 down > eth0 e1000_irq_disable sem = 2 > > **e1000_open <- ifconfig eth0 up > eth0 e1000_irq_disable sem = 3 Dead. irq's can't be enabled > e1000_irq_enable miss > eth0 e1000_irq_enable sem = 2 > e1000_irq_enable miss > eth0 e1000_irq_enable sem = 1 > ADDRCONF(NETDEV_UP): eth0: link is not ready Yes, this semaphore thing is highly problematic. In the most crucial areas where network driver consistency matters the most for ease of understanding and debugging, the Intel drivers choose to be different :-( The way the napi_disable() logic breaks out from high packet load in net_rx_action() is it simply returns even leaving interrupts disabled when a pending napi_disable() is pending. This is what trips up the semaphore logic. Robert, give this patch a try. In the long term this semaphore should be completely eliminated, there is no justification for it. Signed-off-by: David S. Miller <davem@davemloft.net> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 0c9a6f7..76c0fa6 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -632,6 +632,7 @@ e1000_down(struct e1000_adapter *adapter) #ifdef CONFIG_E1000_NAPI napi_disable(&adapter->napi); + atomic_set(&adapter->irq_sem, 0); #endif e1000_irq_disable(adapter); diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index 2ab3bfb..9cc5a6b 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -2183,6 +2183,7 @@ void e1000e_down(struct e1000_adapter *adapter) msleep(10); napi_disable(&adapter->napi); + atomic_set(&adapter->irq_sem, 0); e1000_irq_disable(adapter); del_timer_sync(&adapter->watchdog_timer); diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c index d2fb88d..4f63839 100644 --- a/drivers/net/ixgb/ixgb_main.c +++ b/drivers/net/ixgb/ixgb_main.c @@ -296,6 +296,11 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) { struct net_device *netdev = adapter->netdev; +#ifdef CONFIG_IXGB_NAPI + napi_disable(&adapter->napi); + atomic_set(&adapter->irq_sem, 0); +#endif + ixgb_irq_disable(adapter); free_irq(adapter->pdev->irq, netdev); @@ -304,9 +309,7 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) if(kill_watchdog) del_timer_sync(&adapter->watchdog_timer); -#ifdef CONFIG_IXGB_NAPI - napi_disable(&adapter->napi); -#endif + adapter->link_speed = 0; adapter->link_duplex = 0; netif_carrier_off(netdev); diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c index de3f45e..a4265bc 100644 --- a/drivers/net/ixgbe/ixgbe_main.c +++ b/drivers/net/ixgbe/ixgbe_main.c @@ -1409,9 +1409,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter) IXGBE_WRITE_FLUSH(&adapter->hw); msleep(10); + napi_disable(&adapter->napi); + atomic_set(&adapter->irq_sem, 0); + ixgbe_irq_disable(adapter); - napi_disable(&adapter->napi); del_timer_sync(&adapter->watchdog_timer); netif_carrier_off(netdev); ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-18 12:11 ` David Miller @ 2008-01-18 13:00 ` Robert Olsson 2008-01-18 13:37 ` David Miller 2008-01-21 13:27 ` Robert Olsson 1 sibling, 1 reply; 27+ messages in thread From: Robert Olsson @ 2008-01-18 13:00 UTC (permalink / raw) To: David Miller Cc: Robert.Olsson, elendil, jesse.brandeburg, slavon, netdev, linux-kernel David Miller writes: > > eth0 e1000_irq_enable sem = 1 <- ifconfig eth0 down > > eth0 e1000_irq_disable sem = 2 > > > > **e1000_open <- ifconfig eth0 up > > eth0 e1000_irq_disable sem = 3 Dead. irq's can't be enabled > > e1000_irq_enable miss > > eth0 e1000_irq_enable sem = 2 > > e1000_irq_enable miss > > eth0 e1000_irq_enable sem = 1 > > ADDRCONF(NETDEV_UP): eth0: link is not ready > > Yes, this semaphore thing is highly problematic. In the most crucial > areas where network driver consistency matters the most for ease of > understanding and debugging, the Intel drivers choose to be different I don't understand the idea with semaphore for enabling/disabling irq's either the overall logic must safer/better without it. > The way the napi_disable() logic breaks out from high packet load in > net_rx_action() is it simply returns even leaving interrupts disabled > when a pending napi_disable() is pending. > > This is what trips up the semaphore logic. > > Robert, give this patch a try. > > In the long term this semaphore should be completely eliminated, > there is no justification for it. It's on the testing list... Cheers --ro > > Signed-off-by: David S. Miller <davem@davemloft.net> > > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c > index 0c9a6f7..76c0fa6 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -632,6 +632,7 @@ e1000_down(struct e1000_adapter *adapter) > > #ifdef CONFIG_E1000_NAPI > napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > #endif > e1000_irq_disable(adapter); > > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c > index 2ab3bfb..9cc5a6b 100644 > --- a/drivers/net/e1000e/netdev.c > +++ b/drivers/net/e1000e/netdev.c > @@ -2183,6 +2183,7 @@ void e1000e_down(struct e1000_adapter *adapter) > msleep(10); > > napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > e1000_irq_disable(adapter); > > del_timer_sync(&adapter->watchdog_timer); > diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c > index d2fb88d..4f63839 100644 > --- a/drivers/net/ixgb/ixgb_main.c > +++ b/drivers/net/ixgb/ixgb_main.c > @@ -296,6 +296,11 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) > { > struct net_device *netdev = adapter->netdev; > > +#ifdef CONFIG_IXGB_NAPI > + napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > +#endif > + > ixgb_irq_disable(adapter); > free_irq(adapter->pdev->irq, netdev); > > @@ -304,9 +309,7 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) > > if(kill_watchdog) > del_timer_sync(&adapter->watchdog_timer); > -#ifdef CONFIG_IXGB_NAPI > - napi_disable(&adapter->napi); > -#endif > + > adapter->link_speed = 0; > adapter->link_duplex = 0; > netif_carrier_off(netdev); > diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c > index de3f45e..a4265bc 100644 > --- a/drivers/net/ixgbe/ixgbe_main.c > +++ b/drivers/net/ixgbe/ixgbe_main.c > @@ -1409,9 +1409,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter) > IXGBE_WRITE_FLUSH(&adapter->hw); > msleep(10); > > + napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > + > ixgbe_irq_disable(adapter); > > - napi_disable(&adapter->napi); > del_timer_sync(&adapter->watchdog_timer); > > netif_carrier_off(netdev); > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-18 13:00 ` Robert Olsson @ 2008-01-18 13:37 ` David Miller 2008-01-20 9:20 ` Brandeburg, Jesse 0 siblings, 1 reply; 27+ messages in thread From: David Miller @ 2008-01-18 13:37 UTC (permalink / raw) To: Robert.Olsson; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel From: Robert Olsson <Robert.Olsson@data.slu.se> Date: Fri, 18 Jan 2008 14:00:57 +0100 > I don't understand the idea with semaphore for enabling/disabling > irq's either the overall logic must safer/better without it. They must have had code paths where they didn't know if IRQs were enabled or not already, so they tried to create something which approximates the: local_irq_save(flags); local_irq_restore(flags); constructs we have for CPU interrupts, so they could go: e1000_irq_disable(); /* ... */ e1000_irq_enable(); and this would work even if the caller was running with e1000 interrupts disabled already. Or, something like that... it is indeed confusing. Anyways, yes it's totally bogus and should be removed. ^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-18 13:37 ` David Miller @ 2008-01-20 9:20 ` Brandeburg, Jesse 2008-01-20 9:28 ` Andrey Rahmatullin 0 siblings, 1 reply; 27+ messages in thread From: Brandeburg, Jesse @ 2008-01-20 9:20 UTC (permalink / raw) To: David Miller, Robert.Olsson; +Cc: elendil, slavon, netdev, linux-kernel David Miller wrote: > From: Robert Olsson <Robert.Olsson@data.slu.se> > Date: Fri, 18 Jan 2008 14:00:57 +0100 > >> I don't understand the idea with semaphore for enabling/disabling >> irq's either the overall logic must safer/better without it. > > They must have had code paths where they didn't know if IRQs were > enabled or not already, so they tried to create something which > approximates the: > > local_irq_save(flags); > local_irq_restore(flags); > > constructs we have for CPU interrupts, so they could go: > > e1000_irq_disable(); > /* ... */ > e1000_irq_enable(); > > and this would work even if the caller was running > with e1000 interrupts disabled already. > > Or, something like that... it is indeed confusing. > > Anyways, yes it's totally bogus and should be removed. I agree, bogus, and in fact I've already removed it from our development version of ixgbe. Right now I wanted to report I can't remove e1000 at all on 2.6.24-rc8+git I continually get the kernel: unregister_netdevice: waiting for eth2 to become free. Usage count = 1 Where 2.6.24-rc5 e1000 is okay still. Seems like maybe we are still missing a netif_rx_complete or a napi_disable somewhere. I don't think this problem has anything to do with the irq_sem right now. Something is still badly broken. I am just using the interface regularly (no heavy load) and I can't unload the module. Jesse ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-20 9:20 ` Brandeburg, Jesse @ 2008-01-20 9:28 ` Andrey Rahmatullin 0 siblings, 0 replies; 27+ messages in thread From: Andrey Rahmatullin @ 2008-01-20 9:28 UTC (permalink / raw) To: Brandeburg, Jesse Cc: David Miller, Robert.Olsson, elendil, slavon, netdev, linux-kernel [-- Attachment #1: Type: text/plain, Size: 263 bytes --] On Sun, Jan 20, 2008 at 01:20:11AM -0800, Brandeburg, Jesse wrote: > I continually get the > kernel: unregister_netdevice: waiting for eth2 to become free. Usage > count = 1 http://bugzilla.kernel.org/show_bug.cgi?id=9778 -- WBR, wRAR (ALT Linux Team) [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-18 12:11 ` David Miller 2008-01-18 13:00 ` Robert Olsson @ 2008-01-21 13:27 ` Robert Olsson 2008-01-21 13:29 ` David Miller 1 sibling, 1 reply; 27+ messages in thread From: Robert Olsson @ 2008-01-21 13:27 UTC (permalink / raw) To: David Miller Cc: Robert.Olsson, elendil, jesse.brandeburg, slavon, netdev, linux-kernel David Miller writes: > Yes, this semaphore thing is highly problematic. In the most crucial > areas where network driver consistency matters the most for ease of > understanding and debugging, the Intel drivers choose to be different > :-( > > The way the napi_disable() logic breaks out from high packet load in > net_rx_action() is it simply returns even leaving interrupts disabled > when a pending napi_disable() is pending. > > This is what trips up the semaphore logic. > > Robert, give this patch a try. Yes it works. e1000 tested for ~3 hours with high very high load and interface up/down every 5:th sec. Without the patch the irq's gets disabled within a couple of seconds A resolute way of handling the semaphores. :) Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Cheers --ro > In the long term this semaphore should be completely eliminated, > there is no justification for it. > > Signed-off-by: David S. Miller <davem@davemloft.net> > > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c > index 0c9a6f7..76c0fa6 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -632,6 +632,7 @@ e1000_down(struct e1000_adapter *adapter) > > #ifdef CONFIG_E1000_NAPI > napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > #endif > e1000_irq_disable(adapter); > > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c > index 2ab3bfb..9cc5a6b 100644 > --- a/drivers/net/e1000e/netdev.c > +++ b/drivers/net/e1000e/netdev.c > @@ -2183,6 +2183,7 @@ void e1000e_down(struct e1000_adapter *adapter) > msleep(10); > > napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > e1000_irq_disable(adapter); > > del_timer_sync(&adapter->watchdog_timer); > diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c > index d2fb88d..4f63839 100644 > --- a/drivers/net/ixgb/ixgb_main.c > +++ b/drivers/net/ixgb/ixgb_main.c > @@ -296,6 +296,11 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) > { > struct net_device *netdev = adapter->netdev; > > +#ifdef CONFIG_IXGB_NAPI > + napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > +#endif > + > ixgb_irq_disable(adapter); > free_irq(adapter->pdev->irq, netdev); > > @@ -304,9 +309,7 @@ ixgb_down(struct ixgb_adapter *adapter, boolean_t kill_watchdog) > > if(kill_watchdog) > del_timer_sync(&adapter->watchdog_timer); > -#ifdef CONFIG_IXGB_NAPI > - napi_disable(&adapter->napi); > -#endif > + > adapter->link_speed = 0; > adapter->link_duplex = 0; > netif_carrier_off(netdev); > diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c > index de3f45e..a4265bc 100644 > --- a/drivers/net/ixgbe/ixgbe_main.c > +++ b/drivers/net/ixgbe/ixgbe_main.c > @@ -1409,9 +1409,11 @@ void ixgbe_down(struct ixgbe_adapter *adapter) > IXGBE_WRITE_FLUSH(&adapter->hw); > msleep(10); > > + napi_disable(&adapter->napi); > + atomic_set(&adapter->irq_sem, 0); > + > ixgbe_irq_disable(adapter); > > - napi_disable(&adapter->napi); > del_timer_sync(&adapter->watchdog_timer); > > netif_carrier_off(netdev); ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-21 13:27 ` Robert Olsson @ 2008-01-21 13:29 ` David Miller 0 siblings, 0 replies; 27+ messages in thread From: David Miller @ 2008-01-21 13:29 UTC (permalink / raw) To: Robert.Olsson; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel From: Robert Olsson <Robert.Olsson@data.slu.se> Date: Mon, 21 Jan 2008 14:27:13 +0100 > Yes it works. e1000 tested for ~3 hours with high very high load and > interface up/down every 5:th sec. Without the patch the irq's gets > disabled within a couple of seconds > > A resolute way of handling the semaphores. :) > > Signed-off-by: Robert Olsson <robert.olsson@its.uu.se> Thanks for testing Robert. I sent off that fix to Linus an hour or so ago, hopefully he will pick it up some time today. ^ permalink raw reply [flat|nested] 27+ messages in thread
* RE: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 5:02 ` David Miller 2008-01-16 8:56 ` Frans Pop @ 2008-01-17 7:09 ` Brandeburg, Jesse 2008-01-17 7:20 ` David Miller 1 sibling, 1 reply; 27+ messages in thread From: Brandeburg, Jesse @ 2008-01-17 7:09 UTC (permalink / raw) To: David Miller; +Cc: slavon, elendil, netdev, linux-kernel David Miller wrote: > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> > Date: Tue, 15 Jan 2008 13:53:43 -0800 > >> The tx code has an "early exit" that tries to limit the amount of tx >> packets handled in a single poll loop and requires napi or interrupt >> rescheduling based on the return value from e1000_clean_tx_irq. > > That explains everything, thanks Jesse. > > Ok, here is the patch I'll propose to fix this. The goal is to make > it as simple as possible without regressing the thing we were trying > to fix. We spent Wednesday trying to reproduce (without the patch) these issues without much luck, and have applied the patch cleanly and will continue testing it. Given the simplicity of the changes, and the community testing, I'll give my ack and we will continue testing. I think we should fix Robert's (unrelated, but in this thread) reported issue before 2.6.24 final if we can, and I'll look at that tonight and tomorrow. Thanks for your work on this Dave, Jesse Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-17 7:09 ` Brandeburg, Jesse @ 2008-01-17 7:20 ` David Miller 2008-01-17 7:51 ` Frans Pop 0 siblings, 1 reply; 27+ messages in thread From: David Miller @ 2008-01-17 7:20 UTC (permalink / raw) To: jesse.brandeburg; +Cc: slavon, elendil, netdev, linux-kernel From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> Date: Wed, 16 Jan 2008 23:09:47 -0800 > We spent Wednesday trying to reproduce (without the patch) these issues > without much luck, and have applied the patch cleanly and will continue > testing it. Given the simplicity of the changes, and the community > testing, I'll give my ack and we will continue testing. You need a slow CPU, and you need to make sure you do actually trigger the TX limiting code there. I bet your cpus are fast enough that it simply never triggers. :-) > Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Thanks for reviewing Jesse. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-17 7:20 ` David Miller @ 2008-01-17 7:51 ` Frans Pop 2008-01-17 8:00 ` David Miller 0 siblings, 1 reply; 27+ messages in thread From: Frans Pop @ 2008-01-17 7:51 UTC (permalink / raw) To: David Miller; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel On Thursday 17 January 2008, David Miller wrote: > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> > > > We spent Wednesday trying to reproduce (without the patch) these issues > > without much luck, and have applied the patch cleanly and will continue > > testing it. Given the simplicity of the changes, and the community > > testing, I'll give my ack and we will continue testing. > > You need a slow CPU, and you need to make sure you do actually > trigger the TX limiting code there. Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days? ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-17 7:51 ` Frans Pop @ 2008-01-17 8:00 ` David Miller 2008-01-17 9:40 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 27+ messages in thread From: David Miller @ 2008-01-17 8:00 UTC (permalink / raw) To: elendil; +Cc: jesse.brandeburg, slavon, netdev, linux-kernel From: Frans Pop <elendil@planet.nl> Date: Thu, 17 Jan 2008 08:51:55 +0100 > On Thursday 17 January 2008, David Miller wrote: > > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> > > > > > We spent Wednesday trying to reproduce (without the patch) these issues > > > without much luck, and have applied the patch cleanly and will continue > > > testing it. Given the simplicity of the changes, and the community > > > testing, I'll give my ack and we will continue testing. > > > > You need a slow CPU, and you need to make sure you do actually > > trigger the TX limiting code there. > > Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days? No of course :-) I guess it therefore depends upon the load as well. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-17 8:00 ` David Miller @ 2008-01-17 9:40 ` Arnaldo Carvalho de Melo 2008-01-17 9:45 ` David Miller 0 siblings, 1 reply; 27+ messages in thread From: Arnaldo Carvalho de Melo @ 2008-01-17 9:40 UTC (permalink / raw) To: David Miller; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel Em Thu, Jan 17, 2008 at 12:00:02AM -0800, David Miller escreveu: > From: Frans Pop <elendil@planet.nl> > Date: Thu, 17 Jan 2008 08:51:55 +0100 > > > On Thursday 17 January 2008, David Miller wrote: > > > From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com> > > > > > > > We spent Wednesday trying to reproduce (without the patch) these issues > > > > without much luck, and have applied the patch cleanly and will continue > > > > testing it. Given the simplicity of the changes, and the community > > > > testing, I'll give my ack and we will continue testing. > > > > > > You need a slow CPU, and you need to make sure you do actually > > > trigger the TX limiting code there. > > > > Hmmm. Is a dual core Pentium D 3.20GHz considered slow these days? > > No of course :-) I guess it therefore depends upon the load > as well. I saw it just once, yesterday: [root@doppio ~]# uname -r 2.6.24-rc5 e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Tx Queue <0> TDH <58> TDT <8f> next_to_use <8f> next_to_clean <55> buffer_info[next_to_clean] time_stamp <105e973a9> next_to_watch <56> jiffies <105e97992> next_to_watch.status <1> [root@doppio ~]# on a lenovo T60W, core2duo machine (2GHz), when using it to stress test another machine, I was using netperf TCP_STREAM ranging from 1 to 8 streams + a ping -f using various packet sizes. I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again to reproduce. I also applied David's patch while trying some RT experiments on another, 8 way machine used as a server, but on this machine I didn't experience the Tx Unit Hang message with or without the patch. - Arnaldo ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-17 9:40 ` Arnaldo Carvalho de Melo @ 2008-01-17 9:45 ` David Miller 0 siblings, 0 replies; 27+ messages in thread From: David Miller @ 2008-01-17 9:45 UTC (permalink / raw) To: acme; +Cc: elendil, jesse.brandeburg, slavon, netdev, linux-kernel From: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Thu, 17 Jan 2008 07:40:07 -0200 > I'll update this machine today to 2.6.24-rc8-git + net-2.6 and try again > to reproduce. Thanks for the datapoints and testing. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-15 5:53 ` David Miller 2008-01-15 6:17 ` Frans Pop 2008-01-15 14:04 ` Frans Pop @ 2008-01-16 9:02 ` Badalian Vyacheslav 2008-01-16 12:25 ` David Miller 2008-01-16 12:28 ` David Miller 2 siblings, 2 replies; 27+ messages in thread From: Badalian Vyacheslav @ 2008-01-16 9:02 UTC (permalink / raw) To: David Miller; +Cc: elendil, netdev, linux-kernel applied to 2.6.24-rc7-git2 Have messages Also have regression after apply patch. System may do above 800mbs traffic before patch. After its "exit polling mode?" (4 CPU, 1 cpu get 100% si (process ksoftirqd/0), 3 CPU is IDLE) After patch system was go to "exit polling mode" at above 600mbs. Thanks. > From: Frans Pop <elendil@planet.nl> > Date: Tue, 15 Jan 2008 06:25:10 +0100 > > >> kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang >> > > Does this make the problem go away? > > (Note this isn't the final correct patch we should apply. There > is no reason why this revert back to the older ->poll() logic > here should have any effect on the TX hang triggering...) > > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c > index 13d57b0..cada32c 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget) > { > struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); > struct net_device *poll_dev = adapter->netdev; > - int work_done = 0; > + int tx_work = 0, work_done = 0; > > /* Must NOT use netdev_priv macro here. */ > adapter = poll_dev->priv; > @@ -3929,8 +3929,8 @@ e1000_clean(struct napi_struct *napi, int budget) > * simultaneously. A failure obtaining the lock means > * tx_ring[0] is currently being cleaned anyway. */ > if (spin_trylock(&adapter->tx_queue_lock)) { > - e1000_clean_tx_irq(adapter, > - &adapter->tx_ring[0]); > + tx_work = e1000_clean_tx_irq(adapter, > + &adapter->tx_ring[0]); > spin_unlock(&adapter->tx_queue_lock); > } > > @@ -3938,7 +3938,7 @@ e1000_clean(struct napi_struct *napi, int budget) > &work_done, budget); > > /* If budget not fully consumed, exit the polling mode */ > - if (work_done < budget) { > + if (!tx_work && (work_done < budget)) { > if (likely(adapter->itr_setting & 3)) > e1000_set_itr(adapter); > netif_rx_complete(poll_dev, napi); > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 9:02 ` Badalian Vyacheslav @ 2008-01-16 12:25 ` David Miller 2008-01-16 12:28 ` David Miller 1 sibling, 0 replies; 27+ messages in thread From: David Miller @ 2008-01-16 12:25 UTC (permalink / raw) To: slavon; +Cc: elendil, netdev, linux-kernel From: Badalian Vyacheslav <slavon@bigtelecom.ru> Date: Wed, 16 Jan 2008 12:02:28 +0300 > applied to 2.6.24-rc7-git2 > Have messages > Also have regression after apply patch. > System may do above 800mbs traffic before patch. After its "exit polling > mode?" (4 CPU, 1 cpu get 100% si (process ksoftirqd/0), 3 CPU is IDLE) > After patch system was go to "exit polling mode" at above 600mbs. What do you mean by 'system was go to "exit polling mode"'? Please be more clear about your situation, in particular provide every detail about what happens so that we can properly debug this. THanks. ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 9:02 ` Badalian Vyacheslav 2008-01-16 12:25 ` David Miller @ 2008-01-16 12:28 ` David Miller 2008-01-21 6:54 ` Badalian Vyacheslav 1 sibling, 1 reply; 27+ messages in thread From: David Miller @ 2008-01-16 12:28 UTC (permalink / raw) To: slavon; +Cc: elendil, netdev, linux-kernel From: Badalian Vyacheslav <slavon@bigtelecom.ru> Date: Wed, 16 Jan 2008 12:02:28 +0300 > Also have regression after apply patch. BTW, if you are using the e1000e driver then this initial patch will not work. My more recent patch posting for this problem, will. I include it again below for you: [NET]: Fix TX timeout regression in Intel drivers. This fixes a regression added by changeset 53e52c729cc169db82a6105fac7a166e10c2ec36 ("[NET]: Make ->poll() breakout consistent in Intel ethernet drivers.") As pointed out by Jesse Brandeburg, for three of the drivers edited above there is breakout logic in the *_clean_tx_irq() code to prevent running TX reclaim forever. If this occurs, we have to elide NAPI poll completion or else those TX events will never be serviced. Signed-off-by: David S. Miller <davem@davemloft.net> diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 13d57b0..0c9a6f7 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget) { struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); struct net_device *poll_dev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* Must NOT use netdev_priv macro here. */ adapter = poll_dev->priv; @@ -3929,14 +3929,17 @@ e1000_clean(struct napi_struct *napi, int budget) * simultaneously. A failure obtaining the lock means * tx_ring[0] is currently being cleaned anyway. */ if (spin_trylock(&adapter->tx_queue_lock)) { - e1000_clean_tx_irq(adapter, - &adapter->tx_ring[0]); + tx_cleaned = e1000_clean_tx_irq(adapter, + &adapter->tx_ring[0]); spin_unlock(&adapter->tx_queue_lock); } adapter->clean_rx(adapter, &adapter->rx_ring[0], &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { if (likely(adapter->itr_setting & 3)) diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index 4a6fc74..2ab3bfb 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -1384,7 +1384,7 @@ static int e1000_clean(struct napi_struct *napi, int budget) { struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); struct net_device *poll_dev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* Must NOT use netdev_priv macro here. */ adapter = poll_dev->priv; @@ -1394,12 +1394,15 @@ static int e1000_clean(struct napi_struct *napi, int budget) * simultaneously. A failure obtaining the lock means * tx_ring is currently being cleaned anyway. */ if (spin_trylock(&adapter->tx_queue_lock)) { - e1000_clean_tx_irq(adapter); + tx_cleaned = e1000_clean_tx_irq(adapter); spin_unlock(&adapter->tx_queue_lock); } adapter->clean_rx(adapter, &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { if (adapter->itr_setting & 3) diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c index a564916..de3f45e 100644 --- a/drivers/net/ixgbe/ixgbe_main.c +++ b/drivers/net/ixgbe/ixgbe_main.c @@ -1468,13 +1468,16 @@ static int ixgbe_clean(struct napi_struct *napi, int budget) struct ixgbe_adapter *adapter = container_of(napi, struct ixgbe_adapter, napi); struct net_device *netdev = adapter->netdev; - int work_done = 0; + int tx_cleaned = 0, work_done = 0; /* In non-MSIX case, there is no multi-Tx/Rx queue */ - ixgbe_clean_tx_irq(adapter, adapter->tx_ring); + tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring); ixgbe_clean_rx_irq(adapter, &adapter->rx_ring[0], &work_done, budget); + if (tx_cleaned) + work_done = budget; + /* If budget not fully consumed, exit the polling mode */ if (work_done < budget) { netif_rx_complete(netdev, napi); ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang 2008-01-16 12:28 ` David Miller @ 2008-01-21 6:54 ` Badalian Vyacheslav 0 siblings, 0 replies; 27+ messages in thread From: Badalian Vyacheslav @ 2008-01-21 6:54 UTC (permalink / raw) To: David Miller; +Cc: elendil, netdev, linux-kernel Hello. Its work, thanks for resend it! Sorry, i understand that patch 53e52c729cc169db82a6105fac7a166e10c2ec36 ("[NET]: Make ->poll() breakout consistent in Intel ethernet drivers.") have regression and rollback it, i not see your patch. Sorry again. Thanks! > From: Badalian Vyacheslav <slavon@bigtelecom.ru> > Date: Wed, 16 Jan 2008 12:02:28 +0300 > > >> Also have regression after apply patch. >> > > BTW, if you are using the e1000e driver then this initial > patch will not work. > > My more recent patch posting for this problem, will. > > I include it again below for you: > > [NET]: Fix TX timeout regression in Intel drivers. > > This fixes a regression added by changeset > 53e52c729cc169db82a6105fac7a166e10c2ec36 ("[NET]: Make ->poll() > breakout consistent in Intel ethernet drivers.") > > As pointed out by Jesse Brandeburg, for three of the drivers edited > above there is breakout logic in the *_clean_tx_irq() code to prevent > running TX reclaim forever. If this occurs, we have to elide NAPI > poll completion or else those TX events will never be serviced. > > Signed-off-by: David S. Miller <davem@davemloft.net> > > diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c > index 13d57b0..0c9a6f7 100644 > --- a/drivers/net/e1000/e1000_main.c > +++ b/drivers/net/e1000/e1000_main.c > @@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget) > { > struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); > struct net_device *poll_dev = adapter->netdev; > - int work_done = 0; > + int tx_cleaned = 0, work_done = 0; > > /* Must NOT use netdev_priv macro here. */ > adapter = poll_dev->priv; > @@ -3929,14 +3929,17 @@ e1000_clean(struct napi_struct *napi, int budget) > * simultaneously. A failure obtaining the lock means > * tx_ring[0] is currently being cleaned anyway. */ > if (spin_trylock(&adapter->tx_queue_lock)) { > - e1000_clean_tx_irq(adapter, > - &adapter->tx_ring[0]); > + tx_cleaned = e1000_clean_tx_irq(adapter, > + &adapter->tx_ring[0]); > spin_unlock(&adapter->tx_queue_lock); > } > > adapter->clean_rx(adapter, &adapter->rx_ring[0], > &work_done, budget); > > + if (tx_cleaned) > + work_done = budget; > + > /* If budget not fully consumed, exit the polling mode */ > if (work_done < budget) { > if (likely(adapter->itr_setting & 3)) > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c > index 4a6fc74..2ab3bfb 100644 > --- a/drivers/net/e1000e/netdev.c > +++ b/drivers/net/e1000e/netdev.c > @@ -1384,7 +1384,7 @@ static int e1000_clean(struct napi_struct *napi, int budget) > { > struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi); > struct net_device *poll_dev = adapter->netdev; > - int work_done = 0; > + int tx_cleaned = 0, work_done = 0; > > /* Must NOT use netdev_priv macro here. */ > adapter = poll_dev->priv; > @@ -1394,12 +1394,15 @@ static int e1000_clean(struct napi_struct *napi, int budget) > * simultaneously. A failure obtaining the lock means > * tx_ring is currently being cleaned anyway. */ > if (spin_trylock(&adapter->tx_queue_lock)) { > - e1000_clean_tx_irq(adapter); > + tx_cleaned = e1000_clean_tx_irq(adapter); > spin_unlock(&adapter->tx_queue_lock); > } > > adapter->clean_rx(adapter, &work_done, budget); > > + if (tx_cleaned) > + work_done = budget; > + > /* If budget not fully consumed, exit the polling mode */ > if (work_done < budget) { > if (adapter->itr_setting & 3) > diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c > index a564916..de3f45e 100644 > --- a/drivers/net/ixgbe/ixgbe_main.c > +++ b/drivers/net/ixgbe/ixgbe_main.c > @@ -1468,13 +1468,16 @@ static int ixgbe_clean(struct napi_struct *napi, int budget) > struct ixgbe_adapter *adapter = container_of(napi, > struct ixgbe_adapter, napi); > struct net_device *netdev = adapter->netdev; > - int work_done = 0; > + int tx_cleaned = 0, work_done = 0; > > /* In non-MSIX case, there is no multi-Tx/Rx queue */ > - ixgbe_clean_tx_irq(adapter, adapter->tx_ring); > + tx_cleaned = ixgbe_clean_tx_irq(adapter, adapter->tx_ring); > ixgbe_clean_rx_irq(adapter, &adapter->rx_ring[0], &work_done, > budget); > > + if (tx_cleaned) > + work_done = budget; > + > /* If budget not fully consumed, exit the polling mode */ > if (work_done < budget) { > netif_rx_complete(netdev, napi); > > ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2008-01-21 13:29 UTC | newest] Thread overview: 27+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-01-15 5:25 [REGRESSION] 2.6.24-rc7: e1000: Detected Tx Unit Hang Frans Pop 2008-01-15 5:53 ` David Miller 2008-01-15 6:17 ` Frans Pop 2008-01-15 14:04 ` Frans Pop 2008-01-15 16:04 ` slavon 2008-01-15 21:53 ` Brandeburg, Jesse 2008-01-16 5:02 ` David Miller 2008-01-16 8:56 ` Frans Pop 2008-01-16 10:29 ` David Miller 2008-01-16 17:07 ` Robert Olsson 2008-01-18 12:11 ` David Miller 2008-01-18 13:00 ` Robert Olsson 2008-01-18 13:37 ` David Miller 2008-01-20 9:20 ` Brandeburg, Jesse 2008-01-20 9:28 ` Andrey Rahmatullin 2008-01-21 13:27 ` Robert Olsson 2008-01-21 13:29 ` David Miller 2008-01-17 7:09 ` Brandeburg, Jesse 2008-01-17 7:20 ` David Miller 2008-01-17 7:51 ` Frans Pop 2008-01-17 8:00 ` David Miller 2008-01-17 9:40 ` Arnaldo Carvalho de Melo 2008-01-17 9:45 ` David Miller 2008-01-16 9:02 ` Badalian Vyacheslav 2008-01-16 12:25 ` David Miller 2008-01-16 12:28 ` David Miller 2008-01-21 6:54 ` Badalian Vyacheslav
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).