netdev.vger.kernel.org archive mirror
* Re: e1000 driver (NETDEV WATCHDOG + page allocation failure)
From: David Greaves @ 2004-11-15 16:54 UTC (permalink / raw)
  To: Venkatesan, Ganesh; +Cc: netdev

Hi Ganesh

Apologies for not responding sooner - you know how it is.

I've recently had a chance to update the BIOS on my motherboard as you 
suggested and it has made a considerable difference.
However I am still seeing some issues and wouldn't consider the system 
useable yet :(

Given that version 2.6.9 came out, I thought it would be worth upgrading 
to grab the new patches I saw go through, so I'm now running 2.6.9.

Light usage with a standard 1500 MTU now works most of the time (i.e. 
ping -f works fine, as do normal ssh usage, nfs etc.).
I can't test jumbo packets, as mtu 9000 causes immediate page allocation 
failures on my other (otherwise stable) box:
ifconfig: page allocation failure. order:3, mode:0x20
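
For readers decoding that failure line: order:3 means the allocator was 
asked for 2^3 physically contiguous pages. The sketch below works through 
the arithmetic, assuming 4 KiB pages (usual on 32-bit x86, but not stated 
in this report). The helper names are made up for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on 32-bit x86


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of a block of 2**order contiguous pages."""
    return (1 << order) * page_size


def min_order_for(nbytes, page_size=PAGE_SIZE):
    """Smallest allocation order whose block holds nbytes."""
    order = 0
    while order_to_bytes(order, page_size) < nbytes:
        order += 1
    return order


order3_bytes = order_to_bytes(3)        # the size behind "order:3"
order_for_1514 = min_order_for(1514)    # a full frame at MTU 1500
order_for_9000 = min_order_for(9000)    # a bare jumbo payload at MTU 9000

print(order3_bytes, order_for_1514, order_for_9000)
```

Under these assumptions an order-3 request is 32 KiB, while a bare 
9000-byte frame would fit in an order-2 block (16 KiB). The buffer 
actually requested must therefore have been larger than 16 KiB, presumably 
through rounding and per-buffer overhead, though the message itself does 
not say so.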

Even on mtu=1500 I do however have problems with sustained throughput. 
The reason I got the cards was to make video-editing over the network 
quicker so this is a real problem.

At the moment I'm using rsync to transfer a few hundred Gb of data.
If I use ssh as the remote shell then the high cpu usage bottlenecks the 
transfer to 10Mb/s.
If I use --rsh=rsh to ensure that there's minimal cpu usage, the 
throughput goes up to ~21Mb/s (the remote server is capable of ~40Mb/s 
raw filesystem I/O but it still has a slow cpu).

At this throughput my workstation's e1000 appears to begin to fail.

Some tests:
# ping -f cuf
PING cuf (10.0.1.3): 56 data bytes
..
--- cuf ping statistics ---
1483792 packets transmitted, 1483791 packets received, 0% packet loss
round-trip min/avg/max = 0.0/0.6/3074.8 ms
so that works fine (whereas pre-BIOS update I used to have problems)

a more realistic activity:
rsync --rsh="rsh" --progress -a /scratch/* cu:/huge/myth/
1524072448 81% 22.84MB/s 0:00:14
but stalls (every 10-15 seconds) down to
95715328 5% 2.77MB/s 0:01:04

When the stall happens, the e1000_tx_timeout_task() log (extract below) 
is produced.
Normally (with Rx/TxDescriptors=256) the start of the log is 'lost' by 
syslogd, so the version below is with Rx/TxDescriptors=80.

I've played with a few variables and found this set gave me better 
behaviour (a stall every minute or so rather than every few seconds):
# modprobe e1000 InterruptThrottleRate=600 FlowControl=3 TxDescriptors=80 RxDescriptors=80


David

Venkatesan, Ganesh wrote:

>David:
>
>Could you check the BIOS version on your system? We were able to
>reproduce some of your performance issues on a machine with BIOS version
>1.03. Upgrading to version 1.10 resolved all issues. The machine we used
>is:
>Athlon 1800 with an Aopen AK77-KT600N motherboard.
>
>Please let us know what you find.
>
>Thanks,
>Ganesh.
>  
>
Nov 14 18:05:05 ash kernel: NETDEV WATCHDOG: eth0: transmit timed out after 5000 jiffies
Nov 14 18:05:05 ash kernel: eth0: transmit timeout from queuing
Nov 14 18:05:05 ash kernel: eth0: state=0x7 transmit ring size=4096 count=80 to_use=6 to_clean=10
Nov 14 18:05:05 ash kernel: 0: skb=00000000 dma=0 length=1514 time=+20656 watch=1
Nov 14 18:05:05 ash kernel: 1: skb=dffbf420 dma=747090014 length=1514 time=+9308 watch=2
Nov 14 18:05:05 ash kernel: 2: skb=00000000 dma=0 length=1514 time=+20656 watch=3
Nov 14 18:05:05 ash kernel: 3: skb=eef81420 dma=370819166 length=1514 time=+9308 watch=4
Nov 14 18:05:05 ash kernel: 4: skb=00000000 dma=0 length=1514 time=+20656 watch=5
Nov 14 18:05:05 ash kernel: 5: skb=dffbf6a0 dma=410486878 length=1514 time=+9308 watch=6
Nov 14 18:05:05 ash kernel: 6: skb=00000000 dma=0 length=1514 time=+20656 watch=7
Nov 14 18:05:05 ash kernel: 7: skb=00000000 dma=0 length=1514 time=+9313 watch=8
Nov 14 18:05:05 ash kernel: 8: skb=00000000 dma=0 length=1514 time=+20656 watch=9
Nov 14 18:05:05 ash kernel: 9: skb=00000000 dma=0 length=1514 time=+9313 watch=10
Nov 14 18:05:05 ash kernel: 10: skb=00000000 dma=0 length=1514 time=+20656 watch=11
Nov 14 18:05:05 ash kernel: 11: skb=dffbf9c0 dma=1015185502 length=1514 time=+9313 watch=12
Nov 14 18:05:05 ash kernel: 12: skb=00000000 dma=0 length=1514 time=+20656 watch=13
Nov 14 18:05:05 ash kernel: 13: skb=b94a42e0 dma=678299742 length=1514 time=+9313 watch=14
Nov 14 18:05:05 ash kernel: 14: skb=00000000 dma=0 length=1514 time=+20656 watch=15
Nov 14 18:05:05 ash kernel: 15: skb=dffbf740 dma=244232286 length=1514 time=+9313 watch=16
Nov 14 18:05:05 ash kernel: 16: skb=00000000 dma=0 length=1514 time=+20656 watch=17
Nov 14 18:05:05 ash kernel: 17: skb=dffbf880 dma=244234334 length=1514 time=+9313 watch=18
Nov 14 18:05:05 ash kernel: 18: skb=00000000 dma=0 length=982 time=+20654 watch=19
Nov 14 18:05:05 ash kernel: 19: skb=e054f560 dma=543977566 length=1514 time=+9313 watch=19
Nov 14 18:05:05 ash kernel: 20: skb=00000000 dma=0 length=1514 time=+20669 watch=21
Nov 14 18:05:05 ash kernel: 21: skb=efd22ce0 dma=543979614 length=1514 time=+9313 watch=22
Nov 14 18:05:05 ash kernel: 22: skb=00000000 dma=0 length=1514 time=+20669 watch=23
Nov 14 18:05:05 ash kernel: 23: skb=eef81740 dma=621445214 length=1514 time=+9313 watch=24
Nov 14 18:05:05 ash kernel: 24: skb=00000000 dma=0 length=1514 time=+20669 watch=25
Nov 14 18:05:05 ash kernel: 25: skb=dffbfd80 dma=621447262 length=1514 time=+9313 watch=26
Nov 14 18:05:05 ash kernel: 26: skb=00000000 dma=0 length=1514 time=+20669 watch=27
Nov 14 18:05:05 ash kernel: 27: skb=b94a4920 dma=96651358 length=1514 time=+9313 watch=28
Nov 14 18:05:05 ash kernel: 28: skb=00000000 dma=0 length=1514 time=+20669 watch=29
Nov 14 18:05:05 ash kernel: 29: skb=ef3d27e0 dma=212396126 length=1514 time=+9312 watch=30
Nov 14 18:05:05 ash kernel: 30: skb=00000000 dma=0 length=1514 time=+20669 watch=31
Nov 14 18:05:05 ash kernel: 31: skb=e054f420 dma=212394078 length=1514 time=+9312 watch=32
Nov 14 18:05:05 ash kernel: 32: skb=00000000 dma=0 length=1514 time=+20669 watch=33
Nov 14 18:05:05 ash kernel: 33: skb=ef3d26a0 dma=471775326 length=1514 time=+9312 watch=34
Nov 14 18:05:05 ash kernel: 34: skb=00000000 dma=0 length=1514 time=+20669 watch=35
Nov 14 18:05:05 ash kernel: 35: skb=b88a0880 dma=471773278 length=1514 time=+9312 watch=36
Nov 14 18:05:05 ash kernel: 36: skb=00000000 dma=0 length=1514 time=+20669 watch=37
Nov 14 18:05:05 ash kernel: 37: skb=c9e26380 dma=301906014 length=1514 time=+9312 watch=38
Nov 14 18:05:05 ash kernel: 38: skb=00000000 dma=0 length=1514 time=+20668 watch=39
Nov 14 18:05:05 ash kernel: 39: skb=efd227e0 dma=301903966 length=1514 time=+9312 watch=40
Nov 14 18:05:05 ash kernel: 40: skb=00000000 dma=0 length=1514 time=+20668 watch=41
Nov 14 18:05:05 ash kernel: 41: skb=c95a2240 dma=292812894 length=1514 time=+9312 watch=42
Nov 14 18:05:05 ash kernel: 42: skb=00000000 dma=0 length=1514 time=+20668 watch=43
Nov 14 18:05:05 ash kernel: 43: skb=eef81100 dma=292810846 length=1514 time=+9312 watch=44
Nov 14 18:05:05 ash kernel: 44: skb=00000000 dma=0 length=1514 time=+20668 watch=45
Nov 14 18:05:05 ash kernel: 45: skb=c9e26f60 dma=412209246 length=1514 time=+9312 watch=46
Nov 14 18:05:05 ash kernel: 46: skb=00000000 dma=0 length=1514 time=+20668 watch=47
Nov 14 18:05:05 ash kernel: 47: skb=b88a07e0 dma=410490974 length=1514 time=+9312 watch=48
Nov 14 18:05:05 ash kernel: 48: skb=00000000 dma=0 length=1514 time=+20668 watch=49
Nov 14 18:05:05 ash kernel: 49: skb=efd22420 dma=471769182 length=1514 time=+9312 watch=50
Nov 14 18:05:05 ash kernel: 50: skb=00000000 dma=0 length=1394 time=+20668 watch=51
Nov 14 18:05:05 ash kernel: 51: skb=e054fec0 dma=471771230 length=994 time=+9312 watch=52
Nov 14 18:05:05 ash kernel: 52: skb=00000000 dma=0 length=78 time=+20656 watch=53
Nov 14 18:05:05 ash kernel: 53: skb=c9e26560 dma=139356254 length=1514 time=+9312 watch=54
Nov 14 18:05:05 ash kernel: 54: skb=00000000 dma=0 length=1514 time=+20656 watch=55
Nov 14 18:05:05 ash kernel: 55: skb=eef817e0 dma=410484830 length=1514 time=+9312 watch=56
Nov 14 18:05:05 ash kernel: 56: skb=00000000 dma=0 length=1514 time=+20656 watch=57
Nov 14 18:05:05 ash kernel: 57: skb=e054f240 dma=410488926 length=1514 time=+9312 watch=58
Nov 14 18:05:05 ash kernel: 58: skb=00000000 dma=0 length=1514 time=+20656 watch=59
Nov 14 18:05:05 ash kernel: 59: skb=ef3d2b00 dma=543973470 length=1514 time=+9310 watch=60
Nov 14 18:05:05 ash kernel: 60: skb=00000000 dma=0 length=1514 time=+20656 watch=61
Nov 14 18:05:05 ash kernel: 61: skb=dffbfec0 dma=747087966 length=1514 time=+9310 watch=62
Nov 14 18:05:05 ash kernel: 62: skb=00000000 dma=0 length=1514 time=+20656 watch=63
Nov 14 18:05:05 ash kernel: 63: skb=ef3d2560 dma=747085918 length=1514 time=+9310 watch=64
Nov 14 18:05:05 ash kernel: 64: skb=00000000 dma=0 length=1514 time=+20656 watch=65
Nov 14 18:05:05 ash kernel: 65: skb=b94a4100 dma=1013049438 length=1514 time=+9310 watch=66
Nov 14 18:05:05 ash kernel: 66: skb=00000000 dma=0 length=1514 time=+20656 watch=67
Nov 14 18:05:05 ash kernel: 67: skb=c95a21a0 dma=323840094 length=1514 time=+9310 watch=68
Nov 14 18:05:05 ash kernel: 68: skb=00000000 dma=0 length=1514 time=+20656 watch=69
Nov 14 18:05:05 ash kernel: 69: skb=e054f2e0 dma=572936286 length=1514 time=+9310 watch=70
Nov 14 18:05:05 ash kernel: 70: skb=00000000 dma=0 length=1514 time=+20656 watch=71
Nov 14 18:05:05 ash kernel: 71: skb=b94a4060 dma=895561822 length=1514 time=+9310 watch=72
Nov 14 18:05:05 ash kernel: 72: skb=00000000 dma=0 length=1514 time=+20656 watch=73
Nov 14 18:05:05 ash kernel: 73: skb=b94a4880 dma=96649310 length=1514 time=+9310 watch=74
Nov 14 18:05:05 ash kernel: 74: skb=00000000 dma=0 length=1514 time=+20656 watch=75
Nov 14 18:05:05 ash kernel: 75: skb=b88a0380 dma=512022622 length=1514 time=+9310 watch=76
Nov 14 18:05:05 ash kernel: 76: skb=00000000 dma=0 length=1514 time=+20656 watch=77
Nov 14 18:05:05 ash kernel: 77: skb=b94a4380 dma=1013047390 length=1514 time=+9310 watch=78
Nov 14 18:05:05 ash kernel: 78: skb=00000000 dma=0 length=1514 time=+20656 watch=79
Nov 14 18:05:05 ash kernel: 79: skb=b88a0100 dma=1053198430 length=1514 time=+9310 watch=0
Nov 14 18:05:08 ash kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
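
A ring dump like the one above is easier to reason about once it is 
summarised. The following is a minimal sketch of a parser for this debug 
output, not part of the driver. The field meanings are assumptions: 
"watch" is taken to be the index of the next descriptor to watch, and the 
function and variable names are invented for illustration. The regular 
expressions tolerate mail-wrapped lines and a missing space before 
"watch=".

```python
import re

# One ring entry: index, skb pointer (hex), dma address, length, age, watch.
ENTRY = re.compile(
    r"kernel:\s+(\d+):\s+skb=([0-9a-f]+)\s+dma=(\d+)\s+length=(\d+)"
    r"\s+time=\+(\d+)\s*watch=(\d+)"
)
# Ring header fields printed just before the entries.
HEADER = re.compile(r"count=(\d+)\s+to_use=(\d+)\s+to_clean=(\d+)")


def summarize(log_text):
    """Summarise a tx ring dump: ring header, entry counts, odd entries."""
    entries = [
        {"index": int(i), "skb": int(skb, 16), "dma": int(dma),
         "length": int(length), "age": int(age), "watch": int(watch)}
        for i, skb, dma, length, age, watch in ENTRY.findall(log_text)
    ]
    header = HEADER.search(log_text)
    ring = tuple(int(x) for x in header.groups()) if header else None
    # Assumption: "watch" normally points at the next slot, modulo count.
    odd = [e["index"] for e in entries
           if ring and e["watch"] != (e["index"] + 1) % ring[0]]
    return {
        "ring": ring,                       # (count, to_use, to_clean)
        "entries": len(entries),
        "with_skb": sum(1 for e in entries if e["skb"]),
        "odd_watch": odd,
    }


# A few lines copied from the extract, still wrapped as the mailer left them.
sample = (
    "Nov 14 18:05:05 ash kernel: eth0: state=0x7 transmit ring size=4096 \n"
    "count=80 to_use=6 to_clean=10\n"
    "Nov 14 18:05:05 ash kernel: 0: skb=00000000 dma=0 length=1514 \n"
    "time=+20656 watch=1\n"
    "Nov 14 18:05:05 ash kernel: 13: skb=b94a42e0 dma=678299742 length=1514 \n"
    "time=+9313watch=14\n"
    "Nov 14 18:05:05 ash kernel: 19: skb=e054f560 dma=543977566 length=1514 \n"
    "time=+9313watch=19\n"
)

summary = summarize(sample)
print(summary)
```

Fed the whole extract, the same function reports how many descriptors 
still hold an skb and which entries break the "watch = index + 1" 
pattern. Entry 19, which logs watch=19, is one such case.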
