mass "tulip_stop_rxtx() failed", network stops

* mass "tulip_stop_rxtx() failed", network stops
@ 2005-08-23  9:11 Tomasz Chmielewski
  2005-08-23  9:37 ` jerome lacoste
  0 siblings, 1 reply; 4+ messages in thread
From: Tomasz Chmielewski @ 2005-08-23  9:11 UTC (permalink / raw)
  To: linux-kernel

We are running almost 20 Fujitsu-Siemens Scenic machines with the 2.6.8.1
kernel, each equipped with an onboard card that uses the tulip module:

02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast 
Ethernet 10/100 (rev 11)

No problem with those.

We are running four more machines like that; the only difference is the
kernel they are running (2.6.11.4).

On some of them, there are serious problems with the network, usually
when traffic is heavier than usual (e.g., a big software deployment to
several workstations, a remote backup, etc.).

The syslog is then full of entries like these:

Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit 
timed out
Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed

and they keep filling the logs for hours; the network stops working, and
someone has to restart the network or the machine itself.
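
For anyone who wants to quantify how often these two messages occur, here is a minimal Python sketch. It assumes the traditional syslog line layout shown above, with each entry on a single line (the excerpt above was wrapped by the mail client). The function name count_tulip_errors and the per-hour bucketing are illustrative choices, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed layout: "Aug 21 04:04:30 HOST kernel: message" (classic syslog).
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+kernel:\s+(?P<msg>.*)$"
)

# Substrings taken from the log excerpt in this message.
PATTERNS = {
    "watchdog": "NETDEV WATCHDOG",
    "stop_rxtx": "tulip_stop_rxtx() failed",
}


def count_tulip_errors(lines):
    """Count matching kernel messages per (hour bucket, message kind)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # not a kernel line in the assumed layout
        bucket = f"{m.group('month')} {m.group('day')} {m.group('hour')}:00"
        for label, needle in PATTERNS.items():
            if needle in m.group("msg"):
                counts[(bucket, label)] += 1
    return counts


if __name__ == "__main__":
    # The two entries quoted above, rejoined onto single lines.
    sample = [
        "Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out",
        "Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed",
    ]
    for (bucket, label), n in sorted(count_tulip_errors(sample).items()):
        print(bucket, label, n)
```

Feeding it an open log file instead of the sample list works the same way, since the function only iterates over lines. Comparing the hourly counts against known backup or deployment windows would show whether the failures really track traffic peaks.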

It doesn't always happen under heavy traffic: sometimes you can saturate the
100 Mbit link and do lots of disk reads, and nothing bad happens for hours.

I saw some posts on this issue ("2.6.10-rc3: tulip-driver:
tulip_stop_rxtx() failed"), but they didn't seem similar to my problem.
I also looked through the changelogs of kernels newer than 2.6.10, but
found no description of this problem there either.

Any help is appreciated, because rebooting machines that are 500 km away
and not responding is no fun :)

-- 
Tomek
http://wpkg.org


* Re: mass "tulip_stop_rxtx() failed", network stops
  2005-08-23  9:11 mass "tulip_stop_rxtx() failed", network stops Tomasz Chmielewski
@ 2005-08-23  9:37 ` jerome lacoste
  2005-08-23  9:46   ` Tomasz Chmielewski
  2005-08-23 10:11   ` Tomasz Chmielewski
  0 siblings, 2 replies; 4+ messages in thread
From: jerome lacoste @ 2005-08-23  9:37 UTC (permalink / raw)
  To: Tomasz Chmielewski; +Cc: linux-kernel

On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:
> We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
> kernel, equipped with a onboard card that uses a tulip module:
> 
> 02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
> Ethernet 10/100 (rev 11)
> 
> No problem with those.
> 
> 
> We are running four more machines like that, the only difference is the
> kernel they are running (2.6.11.4).
> 
> On some of them, there are serious problems with a network, and they
> usually happen when the traffic is bigger than usual (i.e., some big
> software deployment to several workstations, remote backup, etc.).
> 
> The syslog is then full of entries like that:
> 
> Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
> timed out
> Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed

I am seeing thousands of tulip_stop_rxtx() failed messages as well
with 2.6.11. No regular network failure though.

See http://kerneltrap.org/mailarchive/1/message/110291/flat

Cheers,

Jerome


* Re: mass "tulip_stop_rxtx() failed", network stops
  2005-08-23  9:37 ` jerome lacoste
@ 2005-08-23  9:46   ` Tomasz Chmielewski
  2005-08-23 10:11   ` Tomasz Chmielewski
  1 sibling, 0 replies; 4+ messages in thread
From: Tomasz Chmielewski @ 2005-08-23  9:46 UTC (permalink / raw)
  Cc: linux-kernel

jerome lacoste wrote:
> On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:
> 
>>We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
>>kernel, equipped with a onboard card that uses a tulip module:
>>
>>02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
>>Ethernet 10/100 (rev 11)
>>
>>No problem with those.
>>
>>
>>We are running four more machines like that, the only difference is the
>>kernel they are running (2.6.11.4).
>>
>>On some of them, there are serious problems with a network, and they
>>usually happen when the traffic is bigger than usual (i.e., some big
>>software deployment to several workstations, remote backup, etc.).
>>
>>The syslog is then full of entries like that:
>>
>>Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
>>timed out
>>Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
> 
> 
> I am seeing thousands of tulip_stop_rxtx() failed messages as well
> with 2.6.11. No regular network failure though.
> 
> See http://kerneltrap.org/mailarchive/1/message/110291/flat

Lucky you.
Really no network problems, no increased ping response times?
For me, lots of pings are lost, and when this "tulip_stop_rxtx() failed"
happens, the ping round-trip time can be as long as 14 seconds on
a 100 Mbit LAN.



-- 
Tomek
http://wpkg.org


* Re: mass "tulip_stop_rxtx() failed", network stops
  2005-08-23  9:37 ` jerome lacoste
  2005-08-23  9:46   ` Tomasz Chmielewski
@ 2005-08-23 10:11   ` Tomasz Chmielewski
  1 sibling, 0 replies; 4+ messages in thread
From: Tomasz Chmielewski @ 2005-08-23 10:11 UTC (permalink / raw)
  To: jerome lacoste, linux-kernel

jerome lacoste wrote:
> On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:

(...)

>>We are running four more machines like that, the only difference is the
>>kernel they are running (2.6.11.4).
>>
>>On some of them, there are serious problems with a network, and they
>>usually happen when the traffic is bigger than usual (i.e., some big
>>software deployment to several workstations, remote backup, etc.).
>>
>>The syslog is then full of entries like that:
>>
>>Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
>>timed out
>>Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
> 
> 
> I am seeing thousands of tulip_stop_rxtx() failed messages as well
> with 2.6.11. No regular network failure though.
> 
> See http://kerneltrap.org/mailarchive/1/message/110291/flat

This may be related to the following patch, introduced in 2.6.10
(see ChangeLog-2.6.10).
It would explain why I have no problems on ~20 machines with the 2.6.8.1
kernel but do see this issue on the machines with the 2.6.11.5 kernel.



[PATCH] tulip: make tulip_stop_rxtx() wait for DMA to fully stop

From: "John W. Linville" <linville@.........com>

tulip_stop_rxtx() doesn't wait for DMA to fully stop like the function
call name implies.

This was submitted through my employer -- I am not the original author
of this patch.  However, I passed it by Jeff Garzik and he expressed
interest in having it upstream.
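
The patch body is not quoted in this thread, so the following is only a userspace model of the general technique its title suggests: request a stop, poll a status value a bounded number of times, and report failure if the hardware never goes idle. Every name here (stop_and_wait, FakeDevice, BUSY, max_polls) is hypothetical and chosen for illustration; none of it is taken from the tulip driver.

```python
BUSY = 0x1  # hypothetical "DMA still running" status bit


def stop_and_wait(read_status, busy_mask, max_polls, delay=lambda: None):
    """Poll read_status() until no bit in busy_mask is set.

    Returns True if the device went idle within max_polls reads,
    False if the poll budget ran out first.
    """
    for _ in range(max_polls):
        if (read_status() & busy_mask) == 0:
            return True
        delay()  # a real driver would pause briefly between reads
    return False


class FakeDevice:
    """Stand-in for hardware that stays busy for a fixed number of reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads

    def read_status(self):
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return BUSY
        return 0


if __name__ == "__main__":
    for busy_reads in (3, 100):
        ok = stop_and_wait(FakeDevice(busy_reads).read_status, BUSY, max_polls=10)
        print(busy_reads, "stopped" if ok else "stop failed")
```

If the driver works along these lines, a "failed" message would mean the poll budget ran out while the chip was still busy. That would be consistent with the message appearing only on kernels from 2.6.10 onward, but it is an assumption rather than something confirmed in this thread.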


-- 
Tomek
http://wpkg.org

