NETDEV WATCHDOG: eth0: transmit timed out

* NETDEV WATCHDOG: eth0: transmit timed out
@ 2000-12-28 10:43 David Ford
  0 siblings, 0 replies; 7+ messages in thread
From: David Ford @ 2000-12-28 10:43 UTC (permalink / raw)
  To: LKML

Same old story, bugger still does it.  Have to set the link down/up to
get it running again.  I had to reset two systems tonight, one up for
~60 days, one up for two days.  Both have this card.  Unrelated traffic.

This is kernel 2.4.0-test13-pre4

00:12.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
        Subsystem: Unknown device 1385:f004
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 set
        Interrupt: pin A routed to IRQ 11
        Region 0: I/O ports at f800
        Region 1: Memory at fdfffc00 (32-bit, non-prefetchable)

-d


* Re: NETDEV WATCHDOG: eth0: transmit timed out
@ 2000-12-28 11:26 Manfred
  2000-12-28 11:36 ` David Ford
  2000-12-29  0:30 ` idalton
  0 siblings, 2 replies; 7+ messages in thread
From: Manfred @ 2000-12-28 11:26 UTC (permalink / raw)
  To: david, linux-kernel

David wrote:
>
> Same old story, bugger still does it. Have to set the link down/up to 
> get it running again. 
>
> 00:12.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 
> 20) 
>

I missed your earlier mails, could you resend the details? 
I'm interested in the output from

	tulip-diag -m -a -f

before and after a link failure.

I'm aware that the tulip driver doesn't handle cable disconnects and
reconnects with MII pnic cards. I have a patch for that problem, but it
affects _all_ MII tulip cards, and thus it won't be included soon. If
tulip-diag says "10mbps-serial", then you have run into that bug.
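
The check described above can be sketched as a small script. This is only an illustration: the function name is made up here, and the only thing taken from this thread is the "10mbps-serial" marker string; the exact wording of the surrounding tulip-diag line is an assumption.

```python
def hit_pnic_mii_bug(diag_output: str) -> bool:
    """Return True if saved tulip-diag output mentions the 10mbps-serial port.

    The marker string comes from the message above; everything else about
    the output format is assumed, so this only does a substring search.
    """
    return "10mbps-serial" in diag_output.lower()


if __name__ == "__main__":
    # Hypothetical "after" line; only the marker string is from the thread.
    after = " Port selection is 10mbps-serial, half-duplex."
    print(hit_pnic_mii_bug(after))
```

Running it over the "before" and "after" captures would show whether the port selection changed across a link failure.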

--
	Manfred
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: NETDEV WATCHDOG: eth0: transmit timed out
  2000-12-28 11:26 Manfred
@ 2000-12-28 11:36 ` David Ford
  2000-12-29  0:30 ` idalton
  1 sibling, 0 replies; 7+ messages in thread
From: David Ford @ 2000-12-28 11:36 UTC (permalink / raw)
  To: Manfred; +Cc: linux-kernel

Manfred wrote:

> David wrote:
> >
> > Same old story, bugger still does it. Have to set the link down/up to
> > get it running again.
> >
> > 00:12.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev
> > 20)
> >
>
> I missed your earlier mails, could you resend the details?
> I'm interested in the output from
>
>         tulip-diag -m -a -f
>
> before and after a link failure.
>
> I'm aware that the tulip driver doesn't handle cable disconnects and
> reconnects with MII pnic cards. I have a patch for that problem, but it
> affects _all_ MII tulip cards, and thus it won't be included soon. If
> tulip-diag says "10mbps-serial", then you have run into that bug.
>
> --
>         Manfred

Here's the before; I'll send the after when it happens.

# ./tulip-diag -m -a -f
tulip-diag.c:v2.04 9/26/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Lite-On 82c168 PNIC adapter at 0xf800.
Lite-On 82c168 PNIC chip registers at 0xf800:
  00008000 01ff0000 00450008 0118f000 0118f200 02660010 814c2202 0001ebef
  00000000 00000000 0118f2d0 01e3a88c 00000020 00000000 00000000 10000001
  00000000 00000000 f0041385 000000bf 609641e1 0118f110 00c99010 0001e978
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 Port selection is MII, full-duplex.
 Transmit started, Receive started, full-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Idle'.
  The transmit threshold is 72.
 MII PHY found at address 1, status 0x782d.
 MII PHY #1 transceiver registers:
   3000 782d 0040 6212 01e1 41e1 0003 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   5000 032b 0002 0046 0000 01cd 0100 0000
   003f f53e 0f00 ff00 002f 4000 80a0 000b.
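
For readers following the dump: the PHY status word 0x782d matches the second word of the transceiver register dump, which is the basic status register (register 1) in the standard MII layout. Below is a minimal sketch of decoding it, assuming the standard IEEE 802.3 bit assignments. The names BMSR_BITS and decode_bmsr are invented for this illustration and are not part of tulip-diag.

```python
# Bit masks for MII register 1 (basic status), per the standard MII layout.
BMSR_BITS = {
    0x8000: "100BASE-T4 capable",
    0x4000: "100BASE-X full-duplex capable",
    0x2000: "100BASE-X half-duplex capable",
    0x1000: "10BASE-T full-duplex capable",
    0x0800: "10BASE-T half-duplex capable",
    0x0020: "autonegotiation complete",
    0x0010: "remote fault",
    0x0008: "autonegotiation capable",
    0x0004: "link up",
    0x0002: "jabber detected",
    0x0001: "extended register set",
}


def decode_bmsr(value: int) -> list[str]:
    """Return the names of all status bits set in a BMSR value."""
    return [name for mask, name in BMSR_BITS.items() if value & mask]


if __name__ == "__main__":
    for flag in decode_bmsr(0x782D):
        print(flag)
```

For 0x782d this reports link up and autonegotiation complete, with no remote fault or jabber, which is consistent with this being the "before" capture. The link bit latches low in the standard, so a single read after a glitch may need to be repeated.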

This particular machine is on a crossover cable at 100 Mbps full duplex
to a PCMCIA tulip card, which works fine.

The other machine I reset tonight was on a crossover to a Cisco 3640,
iirc.

-d



* Re: NETDEV WATCHDOG: eth0: transmit timed out
  2000-12-28 11:26 Manfred
  2000-12-28 11:36 ` David Ford
@ 2000-12-29  0:30 ` idalton
  1 sibling, 0 replies; 7+ messages in thread
From: idalton @ 2000-12-29  0:30 UTC (permalink / raw)
  To: Manfred; +Cc: david, linux-kernel

On Thu, Dec 28, 2000 at 12:26:06PM +0100, Manfred wrote:
> David wrote:
> >
> > Same old story, bugger still does it. Have to set the link down/up to 
> > get it running again. 
> >
> > 00:12.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 
> > 20) 
> >
> 
> I missed your earlier mails, could you resend the details? 
> I'm interested in the output from
> 
> 	tulip-diag -m -a -f
> 
> before and after a link failure.
> 
> 
> I'm aware that the tulip driver doesn't handle cable disconnects and
> reconnects with MII pnic cards. I have a patch for that problem, but it
> affects _all_ MII tulip cards, and thus it won't be included soon. If
> tulip-diag says "10mbps-serial", then you have run into that bug.

I have the same transmit timeout problem, but with a D-Link VIA Rhine
board. I'm running -test10, and it seems to happen under high
(interrupt?) load with both heavy disk and network
activity. Interestingly, it appears to happen more often when the other
end of the network activity is a 10BaseT link. I'm using a Netgear
dual-speed hub.

Do you think these might be related?

* Re: NETDEV WATCHDOG: eth0: transmit timed out
@ 2001-01-13  2:24 Darryl Miles
  0 siblings, 0 replies; 7+ messages in thread
From: Darryl Miles @ 2001-01-13  2:24 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org

I am getting complete lockups of the NIC; taking the interface down and
up doesn't restore it.  rmmod/insmod of ne2k-pci and 8390 doesn't restore
it either.  A reboot does.

The machine with this card isn't normally highly loaded on the network,
but under heavy load it will lock up completely (fairly reliably, I
suspect).  I have also had this problem with 2.4.0-test11; I had traced
it to ei_tx_intr() insofar as it was calling the
"ei_local->stat.collisions += 16;" line.  This is 8390.c:635 in 2.4.0.

The log below shows the time I had reloaded the modules trying to bring
it back to life.

Jan 13 01:46:24 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:46:24 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=951.
Jan 13 01:46:26 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:46:26 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=100.
Jan 13 01:47:14 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:47:14 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=106.
Jan 13 01:47:15 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:47:15 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=26.
Jan 13 01:47:17 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:47:17 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=105.
Jan 13 01:47:24 thehostname kernel: RPC: sendmsg returned error 101
Jan 13 01:47:24 thehostname kernel: nfs: RPC call returned error 101
Jan 13 01:47:24 thehostname kernel: nfs_statfs: statfs error = 101
Jan 13 01:47:37 thehostname kernel: ne2k-pci.c:v1.02 10/19/2000 D. Becker/P. Gortmaker
Jan 13 01:47:37 thehostname kernel:  http://www.scyld.com/network/ne2k-pci.html
Jan 13 01:47:37 thehostname kernel: eth0: RealTek RTL-8029 found at 0xe800, IRQ 19, 48:54:E8:21:15:56.
Jan 13 01:47:47 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:47:47 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=111.
Jan 13 01:47:58 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:47:58 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=1031.
Jan 13 01:48:00 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:48:00 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x1, ISR=0x3, t=107.
Jan 13 01:48:04 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:48:04 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=106.
Jan 13 01:48:08 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:48:08 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=306.
Jan 13 01:48:10 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:48:10 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=105.
Jan 13 01:48:24 thehostname kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 13 01:48:24 thehostname kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x2, t=72.
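
A rough sketch for pulling the TSR/ISR/t values out of a log like the one above and naming the pending ISR bits. The bit names follow the usual 8390 interrupt status layout (receive, transmit, errors, overrun, counters, remote DMA complete, reset), which is assumed here rather than taken from this message. The function and table names are invented for the example. The regular expression tolerates the line wrapping that mail clients introduce.

```python
import re

# 8390 interrupt status register bits (assumed standard layout).
ISR_BITS = {
    0x01: "RX",
    0x02: "TX",
    0x04: "RX_ERR",
    0x08: "TX_ERR",
    0x10: "OVERRUN",
    0x20: "COUNTERS",
    0x40: "RDC",
    0x80: "RESET",
}

# \s+ between fields lets a wrapped log line still match.
TIMEOUT_RE = re.compile(
    r"Tx timed out, lost interrupt\?\s+"
    r"TSR=0x([0-9a-fA-F]+),\s+ISR=0x([0-9a-fA-F]+),\s+t=(\d+)\."
)


def parse_timeouts(log_text: str) -> list[tuple[int, int, int, list[str]]]:
    """Return (tsr, isr, t, pending_isr_flags) for every timeout report."""
    events = []
    for tsr, isr, ticks in TIMEOUT_RE.findall(log_text):
        isr_value = int(isr, 16)
        flags = [name for mask, name in ISR_BITS.items() if isr_value & mask]
        events.append((int(tsr, 16), isr_value, int(ticks), flags))
    return events


if __name__ == "__main__":
    sample = (
        "kernel: eth0: Tx timed out, lost interrupt?\n"
        "TSR=0x3, ISR=0x3, t=951.\n"
    )
    for event in parse_timeouts(sample):
        print(event)
```

In most of the reports above ISR=0x3, meaning both the receive and transmit bits are pending when the watchdog fires, which fits the driver's own "lost interrupt?" guess.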

$ uname -r
2.4.0

lsmod bits:

ne2k-pci                4448   1  (autoclean)
8390                    6544   0  (autoclean) [ne2k-pci]

/proc/pci:
  Bus  0, device  11, function  0:
    Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8029(AS) (rev 0).
      IRQ 19.
      I/O at 0xe800 [0xe81f].

-- 
Darryl Miles

* NETDEV WATCHDOG: eth0: transmit timed out
@ 2001-03-01 16:59 Caleb Epstein
  2001-03-01 23:01 ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Caleb Epstein @ 2001-03-01 16:59 UTC (permalink / raw)
  To: linux-kernel


	I am seeing the following error after my machine has been up
	for a while.  My eth0 is connected to a switched, local
	subnet.  There is not a lot of traffic on the interface, maybe
	a few 100 Mbytes or so.  Taking the interface down and then up
	again fixes the problem (until it happens again :)

	Here is the relevant section from my kernel log

Mar  1 10:48:44 tela kernel: NETDEV WATCHDOG: eth0: transmit timed out
Mar  1 10:48:44 tela kernel: eth0: transmit timed out, tx_status 00 status e000.
Mar  1 10:48:44 tela kernel:   diagnostics: net 0ec0 media 4810 dma 00000021.
Mar  1 10:48:44 tela kernel:   Flags; bus-master 1, full 1; dirty 87959(7) current 87975(7).
Mar  1 10:48:44 tela kernel:   Transmit list 01252270 vs. c1252270.
Mar  1 10:48:44 tela kernel:   0: @c1252200  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   1: @c1252210  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   2: @c1252220  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   3: @c1252230  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   4: @c1252240  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   5: @c1252250  length 8000002a status 8000002a
Mar  1 10:48:44 tela kernel:   6: @c1252260  length 8000002a status 8000002a
Mar  1 10:48:44 tela kernel:   7: @c1252270  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   8: @c1252280  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   9: @c1252290  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   10: @c12522a0  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   11: @c12522b0  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   12: @c12522c0  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   13: @c12522d0  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel:   14: @c12522e0  length 800000f7 status 000000f7
Mar  1 10:48:44 tela kernel:   15: @c12522f0  length 8000010c status 0000010c
Mar  1 10:48:44 tela kernel: eth0: Resetting the Tx ring pointer.
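
	A quick way to read the "Flags" line in the dump above: the
	difference between the "current" and "dirty" counters is the
	number of transmit descriptors still outstanding.  The sketch
	below assumes a 16-entry ring, inferred only from the dump
	listing descriptors 0 through 15; the function name and the
	ring_size default are made up for this illustration.

```python
import re

FLAGS_RE = re.compile(r"dirty (\d+)\((\d+)\) current (\d+)\((\d+)\)")


def ring_state(flags_line: str, ring_size: int = 16) -> dict:
    """Summarise Tx ring occupancy from a 'Flags; ...' dump line.

    ring_size=16 is an assumption based on the 16 descriptors in the dump.
    """
    match = FLAGS_RE.search(flags_line)
    if match is None:
        raise ValueError("no dirty/current counters found")
    dirty, dirty_slot, current, current_slot = map(int, match.groups())
    outstanding = current - dirty
    return {
        "outstanding": outstanding,
        "full": outstanding >= ring_size,
        # The number in parentheses should be the counter modulo ring size.
        "slots_consistent": (
            dirty % ring_size == dirty_slot
            and current % ring_size == current_slot
        ),
    }


if __name__ == "__main__":
    line = "Flags; bus-master 1, full 1; dirty 87959(7) current 87975(7)."
    print(ring_state(line))
```

	With the values logged here, 87975 - 87959 = 16, so under
	that assumption every slot is queued and nothing is being
	reclaimed, which agrees with "full 1" on the same line.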

	Then a similar dump repeats until the interface is recycled.
	It appears that the interface was not functioning for some
	hours before the message was generated, and it was my attempt
	to ping a host on the local subnet that caused the NETDEV
	WATCHDOG error to be generated (i.e. the card locked up, but
	the kernel didn't notice until I tried to send something on
	the wire).

	The card is:

	eth0: 3Com PCI 3c900 Boomerang 10Mbps Combo at 0x1400,
	00:60:08:bd:ab:0e, IRQ 9

	I am running kernel 2.4.2, and have seen this error in 2.4.1
	as well; not sure about 2.4.0.  I do not recall ever
	encountering this error with the 2.2.x kernels; my network
	topology has changed since then, but my hardware has not.  I
	know of at least one other person who gets this same error
	with a 3Com PCI 3c905B Cyclone 100baseTx card.  The system is
	a P2-300, 128 MB RAM, running various versions of Linux very
	happily for 3 years.

	FWIW, IRQ 9 is shared with the bttv module, though the network
	lockup doesn't seem to be related to my use of that module.  I
	was using xawtv last night while the interface was still active
	and functioning.  The lockup happened this morning.

	Sorry for the long-winded post.  Is this a known bug?
	Anything I can do to help track it down and squash it if so?

-- 
cae at bklyn dot org | Caleb Epstein | bklyn . org | Brooklyn Dust Bunny Mfg.


* Re: NETDEV WATCHDOG: eth0: transmit timed out
  2001-03-01 16:59 Caleb Epstein
@ 2001-03-01 23:01 ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2001-03-01 23:01 UTC (permalink / raw)
  To: Caleb Epstein; +Cc: linux-kernel

Caleb Epstein wrote:
> 
>         I am seeing the following error after my machine has been up
>         for a while.  My eth0 is connected to a switched, local
>         subnet.  There is not a lot of traffic on the interface, maybe
>         a few 100 Mbytes or so.  Taking the interface down and then up
>         again fixes the problem (until it happens again :)
> 
>         Here is the relevant section from my kernel log
> 
> Mar  1 10:48:44 tela kernel: NETDEV WATCHDOG: eth0: transmit timed out

My guess would be that the driver has decided there's no
link beat on the 10baseT interface and has flopped over
to using 10base2.  A fix for this exists in 2.4.2-ac5+,
in the zerocopy patch and in

	http://www.uow.edu.au/~andrewm/linux/3c59x.c-2.4.2-pre4.gz

but not in 2.4.2.

You'll need to use

	options 3c59x options=0

in /etc/modules.conf to pin the driver down to using a 
particular physical interface - disable autoselection.

So could you please upgrade the driver?  If problems
remain, please send me a report, as described in the
final section of Documentation/networking/vortex.txt.

Thanks.
