netdev.vger.kernel.org archive mirror
* [BUG REPORT, 2.6.22] e1000: detected tx unit hang
@ 2008-04-21 20:52 speedy
  2008-04-21 21:44 ` Kok, Auke
  0 siblings, 1 reply; 4+ messages in thread
From: speedy @ 2008-04-21 20:52 UTC
  To: linux-kernel; +Cc: netdev

Hello Linux crew,

        I've just switched the

        Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 01)

        network card to an NForce 2 based motherboard and after a day
        of work it got stuck with "detected tx unit hang" messages
        showing in the console.

        The card worked flawlessly under load in a different computer
        for two years now, under the same/similar Ubuntu operating system.

        Unfortunately, the EEPROM fix from

        http://e1000.sourceforge.net/doku.php?id=known_issues&DokuWiki=9502f399bc8cae1528c5e85d2bc423f6

        is not working/applicable:

        root@backupserver:~# ./fixeep-82573-dspd.sh eth1
        No appropriate hardware found for this fixup
        root@backupserver:~#

        /var/log/messages:  http://87.230.23.147/messages.txt
        /proc/interrupts: http://87.230.23.147/proc_interrupts.txt
        lspci -vv: http://87.230.23.147/lspcivv.txt


        If more info is needed, let me know.

        Thanks!

        ps. I don't follow the lists so please keep me in CC:

-- 
Best regards,
 speedy                          mailto:speedy@3d-io.com



* Re: [BUG REPORT, 2.6.22] e1000: detected tx unit hang
  2008-04-21 20:52 [BUG REPORT, 2.6.22] e1000: detected tx unit hang speedy
@ 2008-04-21 21:44 ` Kok, Auke
  2008-04-21 21:56   ` Re[2]: " speedy
  0 siblings, 1 reply; 4+ messages in thread
From: Kok, Auke @ 2008-04-21 21:44 UTC
  To: speedy; +Cc: netdev

[dropped lkml from the Cc]

speedy wrote:
> Hello Linux crew,
> 
>         I've just switched the
> 
>         Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 01)
> 
>         network card to an NForce 2 based motherboard and after a day
>         of work it got stuck with "detected tx unit hang" messages
>         showing in the console.
> 
>         The card worked flawlessly under load in a different computer
>         for two years now, under the same/similar Ubuntu operating system.
> 
>         Unfortunately, the EEPROM fix from
> 
>         http://e1000.sourceforge.net/doku.php?id=known_issues&DokuWiki=9502f399bc8cae1528c5e85d2bc423f6
> 
>         is not working/applicable:
> 
>         root@backupserver:~# ./fixeep-82573-dspd.sh eth1
>         No appropriate hardware found for this fixup

correct, that fix is only for very specific adapters which are based on a totally
different chipset than the one you have.

>         root@backupserver:~#
> 
>         /var/log/messages:  http://87.230.23.147/messages.txt
>         /proc/interrupts: http://87.230.23.147/proc_interrupts.txt
>         lspci -vv: http://87.230.23.147/lspcivv.txt
> 
> 
>         If more info is needed, let me know.


basically it's inserted into a new motherboard?

what was the old motherboard?

can you check the BIOS and disable things like "PCI Write combining" or
"Writeback" or any option looking similar to that?

It appears you hit an issue that these adapters expose on some AMD/NVIDIA
chipset-based motherboards. The issue is known and we have been investigating it
for a long time. The root cause is still unknown, however.

For some people disabling TSO helps to relieve the situation. You could give that
a try.

Cheers,

Auke



* Re[2]: [BUG REPORT, 2.6.22] e1000: detected tx unit hang
  2008-04-21 21:44 ` Kok, Auke
@ 2008-04-21 21:56   ` speedy
  2008-04-21 22:06     ` Kok, Auke
  0 siblings, 1 reply; 4+ messages in thread
From: speedy @ 2008-04-21 21:56 UTC
  To: Kok, Auke; +Cc: netdev

Hello Auke,

Monday, April 21, 2008, 11:44:59 PM, you wrote:

KA> [dropped lkml from the Cc]

KA> speedy wrote:
>> Hello Linux crew,
>> 
>>         I've just switched the
>> 
>>         Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 01)
>> 
>>         network card to an NForce 2 based motherboard and after a day
>>         of work it got stuck with "detected tx unit hang" messages
>>         showing in the console.
>> 
>>         The card worked flawlessly under load in a different computer
>>         for two years now, under the same/similar Ubuntu operating system.
>> 
>>         /var/log/messages:  http://87.230.23.147/messages.txt
>>         /proc/interrupts: http://87.230.23.147/proc_interrupts.txt
>>         lspci -vv: http://87.230.23.147/lspcivv.txt
>> 
>> 
>>         If more info is needed, let me know.


KA> basically it's inserted into a new motherboard?

Yup.

I've changed the PCI slot in which the card is inserted (just on a
hunch) and rebooted the server. I'll let you know if the problem
happens again.

KA> what was the old motherboard?

QDI Legend KinetiZ 7B

http://www.qdigrp.com/qdisite/eng/products/K7B.htm

(had uptimes of 200+ days :)

KA> can you check the BIOS and disable things like "PCI Write combining" or
KA> "Writeback" or any option looking similar to that?

I'm curious to see how often the problem happens. I'll try such
measures if it recurs.

KA> It appears you hit an issue that these adapters expose on some AMD/NVIDIA
KA> chipset-based motherboards. The issue is known and we have been investigating it
KA> for a long time. The root cause is still unknown, however.

Does it also happen with newer AMD/NVIDIA motherboards? :(

KA> For some people disabling TSO helps to relieve the situation. You could give that
KA> a try.

TSO? What is that, and how do I disable it? :)

KA> Cheers,

KA> Auke

Thanks!


-- 
Best regards,
 speedy                            mailto:speedy@3d-io.com



* Re: [BUG REPORT, 2.6.22] e1000: detected tx unit hang
  2008-04-21 21:56   ` Re[2]: " speedy
@ 2008-04-21 22:06     ` Kok, Auke
  0 siblings, 0 replies; 4+ messages in thread
From: Kok, Auke @ 2008-04-21 22:06 UTC
  To: speedy; +Cc: netdev

speedy wrote:
> Hello Auke,
> 
> Monday, April 21, 2008, 11:44:59 PM, you wrote:
> 
> KA> [dropped lkml from the Cc]
> 
> KA> speedy wrote:
>>> Hello Linux crew,
>>>
>>>         I've just switched the
>>>
>>>         Ethernet controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev 01)
>>>
>>>         network card to an NForce 2 based motherboard and after a day
>>>         of work it got stuck with "detected tx unit hang" messages
>>>         showing in the console.
>>>
>>>         The card worked flawlessly under load in a different computer
>>>         for two years now, under the same/similar Ubuntu operating system.
>>>
>>>         /var/log/messages:  http://87.230.23.147/messages.txt
>>>         /proc/interrupts: http://87.230.23.147/proc_interrupts.txt
>>>         lspci -vv: http://87.230.23.147/lspcivv.txt
>>>
>>>
>>>         If more info is needed, let me know.
> 
> 
> KA> basically it's inserted into a new motherboard?
> 
> Yup.
> 
> I've changed the PCI slot in which the card is inserted (just on a
> hunch) and rebooted the server. I'll let you know if the problem
> happens again.
> 
> KA> what was the old motherboard?
> 
> QDI Legend KinetiZ 7B
> 
> http://www.qdigrp.com/qdisite/eng/products/K7B.htm
> 
> (had uptimes of 200+ days :)
> 
> KA> can you check the BIOS and disable things like "PCI Write combining" or
> KA> "Writeback" or any option looking similar to that?
> 
> I'm curious to see how often the problem happens. I'll try such
> measures if it recurs.
> 
> KA> It appears you hit an issue that these adapters expose on some AMD/NVIDIA
> KA> chipset-based motherboards. The issue is known and we have been investigating it
> KA> for a long time. The root cause is still unknown, however.
> 
> Does it also happen with newer AMD/NVIDIA motherboards? :(

yes, that's what the reports are.

it appears to be related to a bridge chip which is common on both older and newer
motherboards.

> KA> For some people disabling TSO helps to relieve the situation. You could give that
> KA> a try.
> 
> TSO? What is that, and how do I disable it? :)

TCP Segmentation Offload: the hardware splits the payload into MTU-sized
segments itself, instead of the kernel doing it.

ethtool -K ethX tso off
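
A minimal sketch of checking the TSO state before and after switching it off.
The helper names tso_state and disable_tso are hypothetical, and the sketch
assumes that "ethtool -k" prints a line such as "tcp-segmentation-offload: on"
(some older ethtool versions print "tcp segmentation offload: on" instead, so
both spellings are matched):

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# TSO state ("on" or "off"). Accepts hyphenated and space-separated labels.
tso_state() {
    awk -F': *' '/^[ \t]*tcp[- ]segmentation[- ]offload:/ {
        split($2, a, " ")   # drop trailing notes such as "[fixed]"
        print a[1]
        exit
    }'
}

# Hypothetical helper: show the TSO state, disable TSO, show the state again.
# Needs root and a real interface name, e.g. eth1 from the report above.
disable_tso() {
    iface="$1"
    echo "TSO before: $(ethtool -k "$iface" | tso_state)"
    ethtool -K "$iface" tso off
    echo "TSO after:  $(ethtool -k "$iface" | tso_state)"
}

# Usage (as root):
#   disable_tso eth1
```

Note that a setting made with "ethtool -K" does not survive a reboot or a
driver reload, so it would have to be reapplied from a boot script if it turns
out to help.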


Auke
