public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* e1000 Detected Tx Unit Hang
@ 2006-09-02  4:38 Paul Aviles
  2006-09-02  5:45 ` Auke Kok
  0 siblings, 1 reply; 2+ messages in thread
From: Paul Aviles @ 2006-09-02  4:38 UTC (permalink / raw)
  To: linux-kernel

I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" using 
stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on centos 4.3.

The server is a Tyan GS10 and is connected to a Netgear GS724T Gig switch. I 
can easily reproduce the problem by trying to do a large ftp transfer to the 
server. It does not happen if the server is connected to a dummy 100 Mb 
switch, only when is connected to the Gig switch.
I have also tried the options line below disabling tso, tx and rx in the 
modprobe.conf without any luck.

options e1000 XsumRX=0 Speed=1000 Duplex=2 InterruptThrottleRate=0 
FlowControl=3 RxDescriptors=4096 TxDescriptors=4096 RxIntDelay=0 
TxIntDelay=0

in /var/log/kernel I get the following...

Sep  1 23:53:01 www kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
Sep  1 23:53:01 www kernel:   Tx Queue             <0>
Sep  1 23:53:01 www kernel:   TDH                  <4c4>
Sep  1 23:53:01 www kernel:   TDT                  <4c9>
Sep  1 23:53:01 www kernel:   next_to_use          <4c9>
Sep  1 23:53:01 www kernel:   next_to_clean        <4c4>
Sep  1 23:53:01 www kernel: buffer_info[next_to_clean]
Sep  1 23:53:01 www kernel:   time_stamp           <ffff9c60>
Sep  1 23:53:01 www kernel:   next_to_watch        <4c4>
Sep  1 23:53:01 www kernel:   jiffies              <ffff9d96>
Sep  1 23:53:01 www kernel:   next_to_watch.status <0>
.
repeats the same as above a few times....
.
Sep  1 23:53:10 www kernel: NETDEV WATCHDOG: eth0: transmit timed out
Sep  1 23:53:13 www kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 
1000 Mbps Full Duplex

then the server locks up, no response from the keyboard at all and must be 
forced down with a power kill.

Here is my driver info,

driver: e1000
version: 7.0.33-k2-NAPI
firmware-version: N/A
bus-info: 0000:02:01.0

What else could I check?

Regards,

Paul Aviles 



-- 
VGER BF report: U 0.5

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: e1000 Detected Tx Unit Hang
  2006-09-02  4:38 e1000 Detected Tx Unit Hang Paul Aviles
@ 2006-09-02  5:45 ` Auke Kok
  0 siblings, 0 replies; 2+ messages in thread
From: Auke Kok @ 2006-09-02  5:45 UTC (permalink / raw)
  To: Paul Aviles; +Cc: linux-kernel, NetDev

Paul Aviles wrote:
> I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" 
> using stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on centos 4.3.
> 
> The server is a Tyan GS10 and is connected to a Netgear GS724T Gig 
> switch. I can easily reproduce the problem by trying to do a large ftp 
> transfer to the server. It does not happen if the server is connected to 
> a dummy 100 Mb switch, only when is connected to the Gig switch.
> I have also tried the options line below disabling tso, tx and rx in the 
> modprobe.conf without any luck.
> 
> options e1000 XsumRX=0 Speed=1000 Duplex=2 InterruptThrottleRate=0 
> FlowControl=3 RxDescriptors=4096 TxDescriptors=4096 RxIntDelay=0 
> TxIntDelay=0
> 
> in /var/log/kernel I get the following...
> 
> Sep  1 23:53:01 www kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx 
> Unit Hang
> Sep  1 23:53:01 www kernel:   Tx Queue             <0>
> Sep  1 23:53:01 www kernel:   TDH                  <4c4>
> Sep  1 23:53:01 www kernel:   TDT                  <4c9>
> Sep  1 23:53:01 www kernel:   next_to_use          <4c9>
> Sep  1 23:53:01 www kernel:   next_to_clean        <4c4>
> Sep  1 23:53:01 www kernel: buffer_info[next_to_clean]
> Sep  1 23:53:01 www kernel:   time_stamp           <ffff9c60>
> Sep  1 23:53:01 www kernel:   next_to_watch        <4c4>
> Sep  1 23:53:01 www kernel:   jiffies              <ffff9d96>
> Sep  1 23:53:01 www kernel:   next_to_watch.status <0>
> .
> repeats the same as above a few times....
> .
> Sep  1 23:53:10 www kernel: NETDEV WATCHDOG: eth0: transmit timed out
> Sep  1 23:53:13 www kernel: e1000: eth0: e1000_watchdog_task: NIC Link 
> is Up 1000 Mbps Full Duplex
> 
> then the server locks up, no response from the keyboard at all and must 
> be forced down with a power kill.
> 
> Here is my driver info,
> 
> driver: e1000
> version: 7.0.33-k2-NAPI
> firmware-version: N/A
> bus-info: 0000:02:01.0
> 
> What else could I check?

[adding netdev to cc, this is a NET issue]

This is a known issue and there are several discussions and bugs filed on this. 
  Please read this one where most is documented, and also the netdev

http://sourceforge.net/tracker/index.php?func=detail&aid=1463045&group_id=42302&atid=447449

more links and information available on http://e1000.sf.net/

Your debugging information might be needed and helpful, so please take the 
trouble of digging in the previous bugreports and reporting anything that might 
be relevant there.

The full lockup is certainly not good, but should not necessarily be related to 
the tx hang (or the cause of that). It is likely that interrupt sharing might 
be a problem here; what kind of e1000 nic is this? lspci -vv?

Cheers,

Auke

-- 
VGER BF report: H 0.00334085

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2006-09-02  5:46 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-09-02  4:38 e1000 Detected Tx Unit Hang Paul Aviles
2006-09-02  5:45 ` Auke Kok

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox