* New CPUs, now get: NETDEV WATCHDOG: eth0: transmit timed out
@ 2010-10-11 20:36 Christopher S. Aker
From: Christopher S. Aker @ 2010-10-11 20:36 UTC
  To: xen devel

At Linode we recently altered our server build spec from Intel L5520 to 
L5630.  Nothing else has changed as far as we can tell, however with 
this new build we're experiencing a new problem -- permanent loss of 
networking after some time (measured in days) on these new machines, with:

NETDEV WATCHDOG: eth0: transmit timed out <-- kiss of network death
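To see how often this fires and on which interface, saved kernel log text can be scanned for the message. The following is only a minimal sketch: the function name `watchdog_timeouts` is hypothetical, and the pattern assumes the exact message format quoted above (other kernels may word it differently).

```python
import re

# Matches the message exactly as quoted above; other kernel versions
# may format this line differently (assumption).
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+): transmit timed out"
)


def watchdog_timeouts(log_text):
    """Count watchdog transmit timeouts per interface in kernel log text."""
    counts = {}
    for match in WATCHDOG_RE.finditer(log_text):
        iface = match.group("iface")
        counts[iface] = counts.get(iface, 0) + 1
    return counts
```

It could be fed the output of `dmesg` or the contents of a saved log file, passed in as a single string.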

The link remains active on the switch, yet the NIC stops receiving any 
interrupts.  No amount of prodding wakes it back up...

Some data points:

2.6.18.8 @ 931 contains an older igb driver
2.6.18.8 @ 1038 contains newest igb driver (as of last week)

2.6.18.8 @ 931 works perfectly on all our equipment prior to L5630
2.6.18.8 @ 1038 times out on everything

Motherboard BIOS version is the same.
Upgrading BIOS on affected boxes has no effect.

A year or two back (after 931), I had to build a newer 2.6.18.8 for 
whatever reason and decided to include the newest igb drivers at that 
time.  I eventually had to roll this back because the NICs started 
timing out.  However, even our "good" build is timing out on the new 
spec machine.

These machines don't appear to present the problem when on bare metal.

dmesg:

http://theshore.net/~caker/xen/BUGS/nic-timeout/

What we're trying:

1) On an affected machine, we're swapping the L5630 back out for the L5520.
2) Moving from Xen 3.4.1 to Xen 3.4.4-rc1-pre
3) Xen 3.4.4-rc1-pre along with 2.6.32.23-g41a85de5 dom0

This certainly appears to be some strange incompatibility between Xen, 
dom0, and/or the NIC driver.  No more interrupts being delivered is suspicious.
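
One way to confirm that interrupt delivery has stopped is to compare two snapshots of /proc/interrupts in dom0 and check whether the counters for the NIC advanced. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual /proc/interrupts layout (IRQ label, per-CPU counts, then controller and device names). Note that an idle interface also shows no growth, so this is only meaningful while traffic is being attempted.

```python
def read_interrupts(path="/proc/interrupts"):
    """Return the current contents of /proc/interrupts as text."""
    with open(path) as handle:
        return handle.read()


def irq_counts(interrupts_text, device):
    """Map IRQ label -> interrupt count summed over CPUs for one device.

    A line matches when one of its name fields equals the device
    (e.g. "eth0") or starts with "<device>-" (e.g. "eth0-TxRx-0").
    """
    counts = {}
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header row listing the CPUs
        fields = rest.split()
        names = [field.rstrip(",") for field in fields]
        if not any(n == device or n.startswith(device + "-") for n in names):
            continue
        total = 0
        for field in fields:
            if not field.isdigit():
                break  # per-CPU counts end at the first non-numeric field
            total += int(field)
        counts[label.strip()] = total
    return counts


def stalled_irqs(before, after):
    """List IRQ labels whose count did not increase between two snapshots."""
    return sorted(
        irq for irq, count in after.items()
        if irq in before and count <= before[irq]
    )
```

In use, one would call `irq_counts(read_interrupts(), "eth0")` twice, a few seconds apart while pinging through the interface, and pass both results to `stalled_irqs`.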

I'd be grateful for any insight!

Thanks,
-Chris
