netdev.vger.kernel.org archive mirror
From: David Greaves <david@dgreaves.com>
To: Jens Laas <jens.laas@data.slu.se>
Cc: Stephen Hemminger <shemminger@osdl.org>,
	netdev@oss.sgi.com, ganesh.venkatesan@intel.com
Subject: Re: 2.6.6 e1000 NETDEV WATCHDOG: eth0: transmit timed out+ delay scheduler
Date: Fri, 18 Jun 2004 10:08:36 +0100
Message-ID: <40D2B114.5020201@dgreaves.com>
In-Reply-To: <Pine.LNX.4.60.0406180953240.1089@jlaas2.data.slu.se>

Stephen, I applied your delay scheduler patch and some results appear below.

Jens Laas wrote:

> (04.06.16 at 11:59) David Greaves wrote the following to Stephen Hemminger:
>
> We have seen the same symptoms. (2.6.x + e1000)
>
> Our system is an SMP system. That might be what's triggering the problem.
> Is your system UP or SMP ?

UP

> (Next reboot we will test running on only one CPU).
>
> We have tried with and without NAPI, both exhibit the same problem.

Me too

> We have tried different versions of e1000 without luck.

Me too, 3 cards.
(Did I mention I have 2 machines with very similar specs (AMD/VIA KT600)?
The other one works. Actually, to be accurate, it hasn't yet failed,
but it hasn't yet run at full speed either, and it has a higher CPU speed.)

> We have tried with 100Mb and gigabit switches.

I'm now running two e1000's back to back over a piece of cat5...

>
> Make sure that flowcontrol is disabled on your switch (if it has it 
> implemented).

...so it's not that smart anymore ;)

>>
>> module parameters.
>
>
> I believe following is recommended by driver developers:
> TxDescriptors=256 RxDescriptors=256 FlowControl=0 XsumRX=0

Yes, I'm running with module defaults unless otherwise stated but I've 
tried that combo (to no effect)

I'm speaking with Ganesh Venkatesan at Intel about it. Ganesh, you went
off list - do you want to include Jens or maybe go back on-list?

A simple failure case for me is : 'ping -s 1500 '
This doesn't cause the timeout but doesn't succeed either.

ping -f with standard packet size succeeds (slow rate though) and 
doesn't timeout.

Using 8139 100Mb/s card:
272384 packets transmitted, 272383 packets received, 0% packet loss
round-trip min/avg/max = 0.1/0.1/4.0 ms
real    0m32.179s

Using Pro/1000:
60992 packets transmitted, 60991 packets received, 0% packet loss
round-trip min/avg/max = 0.0/0.5/8.4 ms
real    0m38.257s
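
For comparison, the two flood-ping runs above can be turned into rough packet rates. This is only a sketch: it assumes the "real" figure from time(1) covers the whole ping -f run, and the function and variable names are made up for illustration.

```python
def packets_per_second(transmitted, wall_seconds):
    """Average send rate over the whole run (assumes 'real' spans the full ping -f)."""
    return transmitted / wall_seconds

# Figures copied from the two runs above.
rtl8139 = packets_per_second(272384, 32.179)
e1000 = packets_per_second(60992, 38.257)

print(f"8139: {rtl8139:.0f} pps, Pro/1000: {e1000:.0f} pps, "
      f"ratio {rtl8139 / e1000:.1f}x")
```

On these numbers the 100Mb/s card sustains roughly five times the flood-ping rate of the Pro/1000, which is what "slow rate" refers to.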

any ping with -s >1500 results in 100% packet loss.

============
From here on down it's 2.6.7 with Stephen's recent delay scheduler patch

This changed the behaviour.

Now ping -s 1500 works
but after that it gets lossy
root@ash:~ # ping -s3000 10.0.1.1
PING 10.0.1.1 (10.0.1.1): 3000 data bytes
3008 bytes from 10.0.1.1: icmp_seq=1 ttl=64 time=0.5 ms
3008 bytes from 10.0.1.1: icmp_seq=11 ttl=64 time=0.5 ms
3008 bytes from 10.0.1.1: icmp_seq=12 ttl=64 time=0.4 ms
3008 bytes from 10.0.1.1: icmp_seq=13 ttl=64 time=0.9 ms
3008 bytes from 10.0.1.1: icmp_seq=15 ttl=64 time=0.4 ms
3008 bytes from 10.0.1.1: icmp_seq=16 ttl=64 time=0.3 ms

and now I'm seeing ping generate:
Jun 18 09:41:57 ash kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jun 18 09:41:59 ash kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex

ping -f now works for packet sizes up to -s 2952 (2 packets at MTU 1500)

ping -f -s 2953 results in:
PING 10.0.1.1 (10.0.1.1): 2953 data bytes
..............................ping: sendto: No buffer space available
ping: wrote 10.0.1.1 2961 chars, ret=-1
.ping: sendto: No buffer space available
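
The 2952/2953 boundary lines up with IPv4 fragmentation arithmetic: 2952 data bytes plus the 8-byte ICMP header is exactly two 1480-byte fragment payloads at MTU 1500, and one more byte needs a third fragment. A minimal sketch of that calculation, assuming a 20-byte IPv4 header with no options (the function name is hypothetical):

```python
import math

def fragment_count(icmp_data_bytes, mtu=1500, ip_header=20, icmp_header=8):
    """Number of IPv4 fragments needed for one ICMP echo with the given -s size.

    Assumes an IPv4 header without options. Fragment payloads other than the
    last must be a multiple of 8 bytes, hence the rounding down.
    """
    per_fragment = (mtu - ip_header) // 8 * 8   # 1480 at MTU 1500
    total = icmp_data_bytes + icmp_header       # IP payload to be split
    return math.ceil(total / per_fragment)

for size in (1472, 1500, 2952, 2953, 3000):
    print(size, fragment_count(size))
```

By the same arithmetic, -s 1500 already needs two fragments, so every failing size reported here is one that has to be fragmented.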

NB: with the patch, between the same machines via an alternate pair of NICs:
root@ash:~ # ping -f -s29550 haze
PING haze.dgreaves.com (10.0.0.88): 29550 data bytes
.
--- haze.dgreaves.com ping statistics ---
10592 packets transmitted, 10591 packets received, 0% packet loss
round-trip min/avg/max = 5.4/5.5/83.5 ms

Increasing Transmit Descriptors to 4096 avoids the "No buffer space
available" error with packet sizes up to -s65468 (still 100% failure though)

I'm not sure that adds much now so I'll leave it until I get some more 
suggestions.

HTH

David
