From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paweł Staszewski
Subject: Re: eth1: Detected Hardware Unit Hang
Date: Mon, 29 Mar 2010 19:29:56 +0200
Message-ID: <4BB0E394.2060908@itcare.pl>
References: <4BB0C853.2080607@itcare.pl> <8DD2590731AB5D4C9DBF71A877482A9061BB3254@orsmsx509.amr.corp.intel.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------000902050909050802080500"
Cc: Linux Network Development list, "e1000-devel@lists.sourceforge.net"
To: "Allan, Bruce W"
Return-path:
Received: from smtp.iq.pl ([86.111.241.19]:54315 "EHLO smtp.iq.pl" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1753049Ab0C2RaG (ORCPT );
	Mon, 29 Mar 2010 13:30:06 -0400
In-Reply-To: <8DD2590731AB5D4C9DBF71A877482A9061BB3254@orsmsx509.amr.corp.intel.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

This is a multi-part message in MIME format.
--------------000902050909050802080500
Content-Type: text/plain; charset=ISO-8859-2; format=flowed
Content-Transfer-Encoding: 7bit

The lspci -vvv and ethtool -S output is in the attached files.

Network traffic at the time I collected this information:
eth1: RX: 157.22 Mb/s TX: 379.27 Mb/s

ethtool -i eth1
driver: e1000e
version: 1.0.2-k2
firmware-version: 0.5-7
bus-info: 0000:05:00.0

This is an Intel Corporation 82573L Gigabit Ethernet Controller.

This server also has another gigabit interface, an
Intel Corporation 82573E Gigabit Ethernet Controller;
this interface carries twice as much traffic as eth0 (82573L).

ethtool -i eth0
driver: e1000e
version: 1.0.2-k2
firmware-version: 0.15-5
bus-info: 0000:04:00.0

Also, this server ran for 4 months without problems on the 2.6.29.1 kernel.

The e1000e driver I use is the one from the kernel (the standard built-in
e1000e driver). I have not tried other drivers.
This is a production server, so I can't run too many tests.

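For reading the hang reports quoted below: the gap between time_stamp and jiffies gives a rough idea of how long the descriptor at next_to_clean has been pending. A minimal Python sketch, assuming both fields are hexadecimal jiffies counts as the dump suggests. The helper name stall_jiffies is hypothetical, and converting the result to seconds would need the kernel's HZ setting, which is not stated in this thread.

```python
def stall_jiffies(time_stamp_hex: str, jiffies_hex: str) -> int:
    """Jiffies elapsed between queuing the buffer and printing the report.

    Assumption: both values are hexadecimal jiffies counts, copied from
    the time_stamp<...> and jiffies<...> fields of the dmesg dump.
    """
    return int(jiffies_hex, 16) - int(time_stamp_hex, 16)


# time_stamp is identical in all three quoted reports.
time_stamp = "33bae15"
# jiffies values from the three quoted reports, in order.
reported_jiffies = ["33bafaf", "33bb1a3", "33bb397"]

stalls = [stall_jiffies(time_stamp, j) for j in reported_jiffies]
print(stalls)
```

For the three reports quoted below this gives 410, 910 and 1410 jiffies. The time_stamp does not change and the gap grows by 500 jiffies per report, which suggests the same transmit buffer stays stuck between reports.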
On 2010-03-29 18:41, Allan, Bruce W wrote:
> [adding e1000-devel]
>
> Please provide more information:
> * what NIC/LOM is this on (preferably send full output from lspci -vvv)
> * what type of networking workload is running at the time the hang occurred
> * a dump of the NIC/LOM statistics might also help (ethtool -S eth1)
>
> Have you tried the latest standalone e1000e driver on e1000.sf.net? Does it reproduce the issue?
>
> If we cannot reproduce the hang in-house, would you be able/willing to run a debug driver to gather more information?
>
> Thanks,
> Bruce.
>
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Pawel Staszewski
> Sent: Monday, March 29, 2010 8:34 AM
> To: Linux Network Development list
> Subject: eth1: Detected Hardware Unit Hang
>
> After update to kernel from 2.6.29.1 to 2.6.33.1 i have this info in dmesg:
>
> 0000:05:00.0: eth1: Detected Hardware Unit Hang:
>   TDH<1e>
>   TDT
>   next_to_use
>   next_to_clean<1d>
> buffer_info[next_to_clean]:
>   time_stamp<33bae15>
>   next_to_watch<20>
>   jiffies<33bafaf>
>   next_to_watch.status<0>
> MAC Status<80080783>
> PHY Status<796d>
> PHY 1000BASE-T Status<3800>
> PHY Extended Status<3000>
> PCI Status<10>
> 0000:05:00.0: eth1: Detected Hardware Unit Hang:
>   TDH<1e>
>   TDT
>   next_to_use
>   next_to_clean<1d>
> buffer_info[next_to_clean]:
>   time_stamp<33bae15>
>   next_to_watch<20>
>   jiffies<33bb1a3>
>   next_to_watch.status<0>
> MAC Status<80080783>
> PHY Status<796d>
> PHY 1000BASE-T Status<3800>
> PHY Extended Status<3000>
> PCI Status<10>
> 0000:05:00.0: eth1: Detected Hardware Unit Hang:
>   TDH<1e>
>   TDT
>   next_to_use
>   next_to_clean<1d>
> buffer_info[next_to_clean]:
>   time_stamp<33bae15>
>   next_to_watch<20>
>   jiffies<33bb397>
>   next_to_watch.status<0>
> MAC Status<80080783>
> PHY Status<796d>
> PHY 1000BASE-T Status<3800>
> PHY Extended Status<3000>
> PCI Status<10>
> ------------[ cut here ]------------
> WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x118/0x19c()
> Hardware name: X7DCT
> NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.33.1 #2
> Call Trace:
>  [] ? warn_slowpath_common+0x52/0x71
>  [] ? warn_slowpath_common+0x5e/0x71
>  [] ? warn_slowpath_fmt+0x26/0x2a
>  [] ? dev_watchdog+0x118/0x19c
>  [] ? __wake_up+0x29/0x39
>  [] ? insert_work+0x40/0x44
>  [] ? dev_watchdog+0x0/0x19c
>  [] ? run_timer_softirq+0x11a/0x173
>  [] ? __do_softirq+0x74/0xdf
>  [] ? do_softirq+0x23/0x27
>  [] ? irq_exit+0x26/0x58
>  [] ? smp_apic_timer_interrupt+0x6c/0x76
>  [] ? apic_timer_interrupt+0x2a/0x30
>  [] ? mwait_idle+0x49/0x4e
>  [] ? cpu_idle+0x41/0x5a
> ---[ end trace bcca9926a046332c ]---
>
>
> With kernel 2.6.29.1 all was ok.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>

--------------000902050909050802080500
Content-Type: text/plain; name="lspci.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="lspci.txt"

00:00.0 Host bridge: Intel Corporation 5100 Chipset Memory Controller Hub (rev 90)
	Subsystem: Super Micro Computer Inc Device de80
	Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: feeff00c  Data: 4181
	Capabilities: [6c] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal-
			Unsupported- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #2, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
			ClockPM- Surprise+ LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surpise-
			Slot # 2e, PowerLimit 25.000000; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn+ PwrFlt+ MRL+ PresDet+ CmdCplt- HPIrq+ LinkChg-
			Control: AttnInd Off, PwrInd On, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL+ CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Kernel driver in use: pcieport

00:04.0 PCI bridge: Intel Corporation 5100 Chipset PCI Express x16 Port 4-7 (rev 90) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: feeff00c  Data: 4189
	Capabilities: [6c] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
			TransPend-
		LnkCap:	Port #4, Speed 2.5GT/s, Width x16, ASPM L0s, Latency L0 unlimited, L1 unlimited
			ClockPM- Surprise+ LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn+ PwrCtrl+ MRL+ AttnInd+ PwrInd+ HotPlug+ Surpise-
			Slot # 30, PowerLimit 25.000000; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn+ PwrFlt+ MRL+ PresDet+ CmdCplt- HPIrq+ LinkChg-
			Control: AttnInd Off, PwrInd Off, Power+ Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL+ CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Kernel driver in use: pcieport

00:08.0 System peripheral: Intel Corporation 5100 Chipset DMA Engine (rev 90)
	Subsystem: Super Micro Computer Inc Device de80
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset+ FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #1, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled+ Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0,
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surpise+
			Slot # 0, PowerLimit 0.000000; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
			Changed: MRL- PresDet- LinkState-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: feeff00c  Data: 4191
	Capabilities: [90] Subsystem: Super Micro Computer Inc Device de80
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 5 (rev 02) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #5, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd-
			PwrInd- HotPlug+ Surpise+
			Slot # 5, PowerLimit 10.000000; Interlock- NoCompl-
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: feeff00c  Data: 4199
	Capabilities: [90] Subsystem: Super Micro Computer Inc Device de80
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.5 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express Port 6 (rev 02) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v1) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #6, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <256ns, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surpise+
			Slot # 6, PowerLimit 10.000000; Interlock- NoCompl-
		SltCtl:	Enable:
			AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet+ LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: feeff00c  Data: 41a1
	Capabilities: [90] Subsystem: Super Micro Computer Inc Device de80
	Capabilities: [a0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1d.0 USB Controller: Intel Corporation 82801I (ICH9 Family) USB UHCI Controller #1 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Super Micro Computer Inc Device de80
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [50] Subsystem: Super Micro Computer Inc Device de80

00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller (rev 02)
	Subsystem: Super Micro Computer Inc Device de80
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR-

00:1f.2 IDE interface: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 4 port SATA IDE Controller (rev 02) (prog-if 8a [Master SecP PriP])
	Subsystem: Super Micro Computer Inc Device de80
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR-
TAbort- SERR-