From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Paul Aviles"
Subject: e1000 Detected Tx Unit Hang
Date: Sat, 2 Sep 2006 10:39:44 -0400
Message-ID: <002c01c6ce9d$a1cf9100$3224050a@avilespaxp>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original
Content-Transfer-Encoding: 7bit
Return-path:
Received: from dsl-7-36.cofs.net ([68.142.7.36]:42299 "EHLO www.palei.com") by vger.kernel.org with ESMTP id S1750837AbWIBOjs (ORCPT ); Sat, 2 Sep 2006 10:39:48 -0400
Received: from avilespaxp ([10.5.36.50]) by www.palei.com (8.13.1/8.13.1) with SMTP id k82EdkQm005129 for ; Sat, 2 Sep 2006 10:39:46 -0400
To:
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

I am getting "e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang" using stock 2.6.17.11, 2.6.17.5 or 2.6.17.4 kernels on CentOS 4.3. The server is a Tyan GS12 (82541GI/PI and 82547GI) and is connected to a Netgear GS724T Gigabit switch.

I can easily reproduce the problem by doing a large ftp transfer to the server. It does not happen if the server is connected to a dummy 100 Mb switch, only when it is connected to the Gigabit switch.

I have also tried the options line below disabling tso, tx and rx in modprobe.conf without any luck.

options e1000 XsumRX=0 Speed=1000 Duplex=2 InterruptThrottleRate=0 FlowControl=3 RxDescriptors=4096 TxDescriptors=4096 RxIntDelay=0 TxIntDelay=0

In /var/log/kernel I get the following...

Sep 1 23:53:01 www kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Sep 1 23:53:01 www kernel: Tx Queue <0>
Sep 1 23:53:01 www kernel: TDH <4c4>
Sep 1 23:53:01 www kernel: TDT <4c9>
Sep 1 23:53:01 www kernel: next_to_use <4c9>
Sep 1 23:53:01 www kernel: next_to_clean <4c4>
Sep 1 23:53:01 www kernel: buffer_info[next_to_clean]
Sep 1 23:53:01 www kernel: time_stamp
Sep 1 23:53:01 www kernel: next_to_watch <4c4>
Sep 1 23:53:01 www kernel: jiffies
Sep 1 23:53:01 www kernel: next_to_watch.status <0>
.
repeats the same as above a few times....
.
Sep 1 23:53:10 www kernel: NETDEV WATCHDOG: eth0: transmit timed out
Sep 1 23:53:13 www kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex

Then the server locks up, with no response from the keyboard at all, and must be forced down with a power kill.

The suggested tips on how to deal with this issue are not working, so if I can help troubleshoot this, let me know.

Here is my system info:

driver: e1000
version: 7.0.33-k2-NAPI
firmware-version: N/A
bus-info: 0000:02:01.0

lspci -vv output below..

00:00.0 Host bridge: Intel Corporation 82875P/E7210 Memory Controller Hub (rev 02)
    Subsystem: Intel Corporation 82875P/E7210 Memory Controller Hub
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR-

00:01.0 PCI bridge: Intel Corporation 82875P Processor to AGP Controller (rev 02) (prog-if 00 [Normal decode])
    Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-

00:03.0 PCI bridge: Intel Corporation 82875P/E7210 Processor to PCI to CSA Bridge (rev 02) (prog-if 00 [Normal decode])
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-

00:1d.0 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
    Subsystem: Intel Corporation: Unknown device 24c0
    Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B-

00:1f.0 ISA bridge: Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge (rev 02)
    Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR-
    Region 1: I/O ports at
    Region 2: I/O ports at
    Region 3: I/O ports at
    Region 4: I/O ports at 14a0 [size=16]
    Region 5: Memory at 88100000 (32-bit, non-prefetchable) [size=1K]

00:1f.3 SMBus: Intel Corporation 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
    Subsystem: Intel Corporation: Unknown device 24c0
    Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-