* mass "tulip_stop_rxtx() failed", network stops
From: Tomasz Chmielewski @ 2005-08-23 9:11 UTC
To: linux-kernel
We are running almost 20 Fujitsu-Siemens Scenic machines with the 2.6.8.1
kernel, equipped with an onboard card that uses the tulip module:
02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
Ethernet 10/100 (rev 11)
No problem with those.
We are running four more machines like that, the only difference is the
kernel they are running (2.6.11.4).
On some of them, there are serious problems with the network, and they
usually happen when the traffic is heavier than usual (e.g., a big
software deployment to several workstations, a remote backup, etc.).
The syslog is then full of entries like these:
Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
and they keep filling the logs for hours; the network stops working, and
someone has to restart the network or the machine itself.
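To see how the failures cluster over time, something like the following could be run over the syslog. This is only a rough sketch: the function name count_tulip_failures and the regular expression are made up for illustration, and it assumes the classic "Mon DD HH:MM:SS host kernel: message" syslog layout with each entry on a single line.

```python
import re
from collections import Counter

# Assumed layout of a classic syslog line:
#   "Mon DD HH:MM:SS host kernel: message"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+kernel:\s+(?P<msg>.*)$"
)


def count_tulip_failures(lines):
    """Count 'tulip_stop_rxtx() failed' messages per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and "tulip_stop_rxtx() failed" in m.group("msg"):
            key = (m.group("month"), int(m.group("day")), int(m.group("hour")))
            counts[key] += 1
    return counts


# The two entries quoted in this message, each on one line.
sample = [
    "Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit timed out",
    "Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed",
]

for (month, day, hour), n in sorted(count_tulip_failures(sample).items()):
    print(f"{month} {day:2d} {hour:02d}h: {n} failure(s)")
```

Fed with a real log file (for example open("/var/log/syslog"), the path being an assumption about the distribution), the per-hour counts can then be compared against the times of deployments or backups.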
It doesn't always happen under heavy traffic - sometimes you can saturate the
100 Mbit link and do lots of reads from the disk, and nothing bad
happens for hours.
I saw some posts on this issue ("2.6.10-rc3: tulip-driver:
tulip_stop_rxtx() failed"), but they did not seem similar
to my problem; I looked through the changelogs of kernels newer than 2.6.10,
but found no description of this problem there, either.
Any help appreciated, because rebooting machines which are 500 km away
and are not responding is no fun :)
--
Tomek
http://wpkg.org
* Re: mass "tulip_stop_rxtx() failed", network stops
From: jerome lacoste @ 2005-08-23 9:37 UTC
To: Tomasz Chmielewski; +Cc: linux-kernel
On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:
> We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
> kernel, equipped with a onboard card that uses a tulip module:
>
> 02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
> Ethernet 10/100 (rev 11)
>
> No problem with those.
>
>
> We are running four more machines like that, the only difference is the
> kernel they are running (2.6.11.4).
>
> On some of them, there are serious problems with a network, and they
> usually happen when the traffic is bigger than usual (i.e., some big
> software deployment to several workstations, remote backup, etc.).
>
> The syslog is then full of entries like that:
>
> Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
> timed out
> Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
I am seeing thousands of tulip_stop_rxtx() failed messages as well
with 2.6.11. No regular network failure though.
See http://kerneltrap.org/mailarchive/1/message/110291/flat
Cheers,
Jerome
* Re: mass "tulip_stop_rxtx() failed", network stops
From: Tomasz Chmielewski @ 2005-08-23 9:46 UTC
Cc: linux-kernel
jerome lacoste wrote:
> On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:
>
>>We are running almost 20 Fujitsu-Siemens Scenic machines, 2.6.8.1
>>kernel, equipped with a onboard card that uses a tulip module:
>>
>>02:01.0 Ethernet controller: Linksys NC100 Network Everywhere Fast
>>Ethernet 10/100 (rev 11)
>>
>>No problem with those.
>>
>>
>>We are running four more machines like that, the only difference is the
>>kernel they are running (2.6.11.4).
>>
>>On some of them, there are serious problems with a network, and they
>>usually happen when the traffic is bigger than usual (i.e., some big
>>software deployment to several workstations, remote backup, etc.).
>>
>>The syslog is then full of entries like that:
>>
>>Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
>>timed out
>>Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
>
>
> I am seeing thousands of tulip_stop_rxtx() failed messages as well
> with 2.6.11. No regular network failure though.
>
> See http://kerneltrap.org/mailarchive/1/message/110291/flat
Lucky you.
Really no network problems, no increased ping response times?
For me lots of pings are lost, and when this "tulip_stop_rxtx() failed"
happens, the ping round-trip time can be as high as 14 seconds on
a 100 Mbit LAN.
--
Tomek
http://wpkg.org
* Re: mass "tulip_stop_rxtx() failed", network stops
From: Tomasz Chmielewski @ 2005-08-23 10:11 UTC
To: jerome lacoste, linux-kernel
jerome lacoste wrote:
> On 8/23/05, Tomasz Chmielewski <mangoo@mch.one.pl> wrote:
(...)
>>We are running four more machines like that, the only difference is the
>>kernel they are running (2.6.11.4).
>>
>>On some of them, there are serious problems with a network, and they
>>usually happen when the traffic is bigger than usual (i.e., some big
>>software deployment to several workstations, remote backup, etc.).
>>
>>The syslog is then full of entries like that:
>>
>>Aug 21 04:04:30 SERVER-B-HS kernel: NETDEV WATCHDOG: eth0: transmit
>>timed out
>>Aug 21 04:04:30 SERVER-B-HS kernel: 0000:00:06.0: tulip_stop_rxtx() failed
>
>
> I am seeing thousands of tulip_stop_rxtx() failed messages as well
> with 2.6.11. No regular network failure though.
>
> See http://kerneltrap.org/mailarchive/1/message/110291/flat
This may have something to do with the patch below, introduced in 2.6.10
(see the ChangeLog-2.6.10).
It would explain why I had no problems on ~20 machines with the 2.6.8.1
kernel, but do have this issue on the machines with the 2.6.11.4 kernel.
[PATCH] tulip: make tulip_stop_rxtx() wait for DMA to fully stop
From: "John W. Linville" <linville@.........com>
tulip_stop_rxtx() doesn't wait for DMA to fully stop like the function
call name implies.
This was submitted through my employer -- I am not the original author
of this patch. However, I passed it by Jeff Garzik and he expressed
interest in having it upstream.
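For readers unfamiliar with what "wait for DMA to fully stop" usually looks like, here is a rough user-space sketch of a bounded polling loop. It is not the driver code and not the patch itself: the names stop_rxtx and FakeNic, the poll budget of 130 and the busy mask 0x3 are all made up for illustration. The idea is only that the stop routine asks the hardware to stop, polls a status word a limited number of times, and reports failure if the engines are still busy when the budget runs out, which is presumably where a "tulip_stop_rxtx() failed" message would come from.

```python
def stop_rxtx(read_status, max_polls=130, busy_mask=0x3):
    """Poll a status source until the busy bits clear or the budget runs out.

    read_status: callable returning the current (simulated) status word.
    max_polls and busy_mask are illustrative values, not taken from the driver.
    Returns True if the engines stopped in time, False on timeout.
    """
    for _ in range(max_polls):
        if (read_status() & busy_mask) == 0:
            return True
    return False


class FakeNic:
    """Hypothetical device that reports 'busy' for a fixed number of reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads

    def read_status(self):
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return 0x3  # both simulated engines still running
        return 0x0      # both simulated engines stopped


healthy = FakeNic(busy_reads=5)      # stops after a few polls
stuck = FakeNic(busy_reads=10**9)    # never stops within the budget

for name, nic in (("healthy", healthy), ("stuck", stuck)):
    if stop_rxtx(nic.read_status):
        print(f"{name}: stopped")
    else:
        print(f"{name}: stop_rxtx() failed")
```

If the real loop works along these lines, a card that keeps its transmit or receive engine busy longer than the budget allows would log the failure on every stop attempt, which would fit the flood of messages seen after a transmit timeout.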
--
Tomek
http://wpkg.org