From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kok, Auke"
Subject: Re: [BUG REPORT, 2.6.22] e1000: detected tx unit hang
Date: Mon, 21 Apr 2008 15:06:31 -0700
Message-ID: <480D0FE7.4010609@intel.com>
References: <1071646044.20080421225227@3d-io.com> <480D0ADB.9000308@intel.com> <304359202.20080421235657@3d-io.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org
To: speedy
Return-path:
Received: from mga11.intel.com ([192.55.52.93]:30709 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753933AbYDUWHz (ORCPT ); Mon, 21 Apr 2008 18:07:55 -0400
In-Reply-To: <304359202.20080421235657@3d-io.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

speedy wrote:
> Hello Auke,
>
> Monday, April 21, 2008, 11:44:59 PM, you wrote:
>
> KA> [dropped lkml from the Cc]
>
> KA> speedy wrote:
>>> Hello Linux crew,
>>>
>>> I've just switched the
>>>
>>> Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 01)
>>>
>>> netword card to an NForce 2 based motherboard and after a day
>>> of work it got stuck with "detected tx unit hang" messages
>>> showing in the console.
>>>
>>> The card worked flawlessly under load in a different computer
>>> for two years now, under the same/similar Ubuntu operating system.
>>>
>>> /var/log/messages: http://87.230.23.147/messages.txt
>>> /proc/interrupts: http://87.230.23.147/proc_interrupts.txt
>>> lspci -vv: http://87.230.23.147/lspcivv.txt
>>>
>>>
>>> If more info is needed, let me know.
>
>
> KA> basically it's inserted into a new motherboard?
>
> Yup.
>
> I've changed the PCI slot in which the card is inserted (just out of
> hunch) and rebooted the server. I'll let you know if the problem
> happens again.
>
> KA> what was the old motherboard?
>
> QDI Legend KinetiZ 7B
>
> http://www.qdigrp.com/qdisite/eng/products/K7B.htm
>
> (had uptimes of 200+ days :)
>
> KA> can you check the BIOS and disable things like "PCI Write combining" or
> KA> "Writeback" or any option looking similar to that?
>
> I'm curious to see how often does the problem happen. I'll try such
> measures if it reproduces itself.
>
> KA> It appears you hit an issue that is exposed by these adapters on some AMD/NVIDIA
> KA> chipset-based motherboards. This issue is known and we are investigating this and
> KA> have been for a long time. The root cause is still yet unknown however.
>
> Does it also happen with newer AMD/NVIDIA motherboards? :(

yes, that's what the reports are. it appears to be related to a bridge chip which is
common on both older and newer motherboards.

> KA> For some people disabling TSO helps to relieve the situation. You could give that
> KA> a try.
>
> TSO? What is that and how to disable it? :)

TCP Segmentation offload - the hardware will split up the payload into MTU-size
fragments itself instead of doing it in the kernel.

ethtool -K ethX tso off

Auke
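To make the TSO description above a bit more concrete, here is a rough
sketch of the splitting step only. It is not driver or hardware code. The
names (segment_payload, MTU, MSS) and the 20-byte IPv4 and TCP header sizes
(no options) are assumptions made for illustration. Real segmentation also
copies the headers onto every segment and fixes up sequence numbers and
checksums, which this leaves out.

```python
from typing import List

# Assumed values for illustration: standard Ethernet MTU, and IPv4/TCP
# headers without options (20 bytes each).
MTU = 1500
IP_HEADER = 20
TCP_HEADER = 20
MSS = MTU - IP_HEADER - TCP_HEADER  # payload bytes that fit in one packet


def segment_payload(payload: bytes, mss: int) -> List[bytes]:
    """Split one large payload into chunks of at most `mss` bytes.

    With TSO on, the NIC performs this split; with TSO off, the kernel
    does it before handing packets to the driver.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


if __name__ == "__main__":
    segments = segment_payload(bytes(4000), MSS)
    print([len(s) for s in segments])  # [1460, 1460, 1080]
```

With "tso off" the same split still happens, just in the kernel rather than
on the adapter, so the packets on the wire look the same either way.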