From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Heiko Wundram"
Subject: AW: [pv_ops] e1000e: "Detected Tx Unit Hang"
Date: Fri, 21 May 2010 01:21:29 +0200
Message-ID: <00a301caf873$2d76d350$886479f0$@org>
References: <4BF5AD97.4000907@access.denied> <4BF5B547.60700@goop.org> <4BF5BE8A.9060509@access.denied> <4BF5BF3E.2010708@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <4BF5BF3E.2010708@goop.org>
Content-Language: de
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: 'Jeremy Fitzhardinge', xen-devel@lists.xensource.com
Cc: 'Stefan Kuhne'
List-Id: xen-devel@lists.xenproject.org

I'm pretty sure the problem you're seeing is related to broken firmware
on the specific chipset used for this Intel network card, not to the
Xen/pv_ops kernel. I've had the same problems under high load with the
"semi-old" Supermicro boxes I administer.

There's an Intel utility to patch the respective firmware issue (i.e., the
network controller EEPROM), but it's not available online anymore (at least
the last time I looked for it, I couldn't find it on the Intel site, where
it was prominently featured when I first looked for it).

I'll try to get access to it from the last machine that I applied this
patch to, but I'll only be able to do this some time during the (European)
day tomorrow.

--- Heiko.

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com
[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Jeremy Fitzhardinge
Sent: Friday, 21 May 2010 01:01
To: xen-devel@lists.xensource.com
Cc: Stefan Kuhne
Subject: Re: [Xen-devel] [pv_ops] e1000e: "Detected Tx Unit Hang"

On 05/20/2010 03:58 PM, Stefan Kuhne wrote:
> On 21.05.2010 00:18, Jeremy Fitzhardinge wrote:
>
> Hello Jeremy,
>
>> e1000e works fine for me. However, I did have problems with my Ibex
>> Peak-based system and the integrated ethernet devices; they would drop
>> off the PCIe bus (lspci -vx would show all 0xff for the config space),
>> which turned out to be some problem with ALPM (PCIe active link power
>> management). Could this be what you're seeing?
>>
> my "lspci -vx" output:
>
> 02:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet
> Controller (Copper)
>     Subsystem: FIRST INTERNATIONAL Computer Inc Unknown device 4720
>     Flags: bus master, fast devsel, latency 0, IRQ 409
>     Memory at d0000000 (32-bit, non-prefetchable) [size=128K]
>     I/O ports at 2000 [size=32]
>     Capabilities: [c8] Power Management version 2
>     Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+
> Queue=0/0 Enable+
>     Capabilities: [e0] Express Endpoint IRQ 0
>     Capabilities: [100] Advanced Error Reporting
>     Capabilities: [140] Device Serial Number c6-a9-09-ff-ff-0b-14-00
> 00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
> 10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
> 30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
>
> and the complete dmesg output:
> [ 9620.997466] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9620.997469] TDH
> [ 9620.997471] TDT <1f>
> [ 9620.997473] next_to_use <1f>
> [ 9620.997475] next_to_clean
> [ 9620.997477] buffer_info[next_to_clean]:
> [ 9620.997479] time_stamp <8e2ec3>
> [ 9620.997481] next_to_watch
> [ 9620.997483] jiffies <8e3a25>
> [ 9620.997485] next_to_watch.status <0>
> [ 9622.997490] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9622.997496] TDH
> [ 9622.997500] TDT <1f>
> [ 9622.997503] next_to_use <1f>
> [ 9622.997507] next_to_clean
> [ 9622.997511] buffer_info[next_to_clean]:
> [ 9622.997515] time_stamp <8e2ec3>
> [ 9622.997519] next_to_watch
> [ 9622.997522] jiffies <8e41f5>
> [ 9622.997526] next_to_watch.status <0>
> [ 9624.997536] 0000:02:00.0: peth0: Detected Tx Unit Hang:
> [ 9624.997541] TDH
> [ 9624.997545] TDT <1f>
> [ 9624.997549] next_to_use <1f>
> [ 9624.997553] next_to_clean
> [ 9624.997557] buffer_info[next_to_clean]:
> [ 9624.997561] time_stamp <8e2ec3>
> [ 9624.997565] next_to_watch
> [ 9624.997568] jiffies <8e49c5>
> [ 9624.997572] next_to_watch.status <0>
> [ 9626.065848] eth0: port 1(peth0) entering disabled state
> [ 9629.910292] e1000e: peth0 NIC Link is Up 1000 Mbps Full Duplex, Flow
> Control: None
> [ 9629.910854] eth0: port 1(peth0) entering forwarding state
>

OK, definitely different problem. Does it happen immediately, or after
a while? Under load?

Can you provide the full boot output, and cat /proc/interrupts?

Thanks,
    J
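
For reference, the config-space check mentioned in the quoted thread (the hex dump from "lspci -vx" reading back as all 0xff when a device has dropped off the bus) can be sketched in a few lines of Python. This is only an illustration and not part of the original messages: the function names are made up for this example, and the input is assumed to be the hex-dump lines with any "> " quoting already stripped.

```python
import re

# Hex-dump lines copied from the "lspci -vx" output quoted in this thread
# (the first 64 bytes of PCI config space for device 02:00.0).
DUMP = """\
00: 86 80 8c 10 07 05 10 00 00 00 00 02 10 00 00 00
10: 00 00 00 d0 00 00 00 00 01 20 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 09 15 20 47
30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00
"""

# An offset ("00:", "10:", ...) followed by 1 to 16 space-separated hex bytes.
LINE_RE = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2}){1,16})$")


def parse_config_space(text):
    """Collect the bytes from lspci -x style hex-dump lines; skip other lines."""
    data = bytearray()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip().lower())
        if match:
            data.extend(int(byte, 16) for byte in match.group(2).split())
    return bytes(data)


def dropped_off_bus(cfg):
    """True if every byte reads 0xff, the symptom described in the quoted mail."""
    return len(cfg) > 0 and all(byte == 0xFF for byte in cfg)


def vendor_device(cfg):
    """Vendor and device ID: little-endian 16-bit values at offsets 0 and 2."""
    return (int.from_bytes(cfg[0:2], "little"),
            int.from_bytes(cfg[2:4], "little"))


cfg = parse_config_space(DUMP)
vendor, device = vendor_device(cfg)
print(f"vendor={vendor:#06x} device={device:#06x} all_ff={dropped_off_bus(cfg)}")
```

Run against the dump above, this prints vendor=0x8086 device=0x108c all_ff=False. The vendor ID 0x8086 is Intel, and the config space is clearly still readable, which agrees with the conclusion in the thread that this is not the device-dropped-off-the-bus problem.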