* Re: e1000 driver (NETDEV WATCHDOG + page allocation failure)
[not found] <468F3FDA28AA87429AD807992E22D07E02C6625A@orsmsx408>
@ 2004-11-15 16:54 ` David Greaves
From: David Greaves @ 2004-11-15 16:54 UTC
To: Venkatesan, Ganesh; +Cc: netdev
Hi Ganesh
Apologies for not responding sooner - you know how it is.
I've recently had a chance to update the BIOS on my motherboard as you
suggested and it has made a considerable difference.
However I am still seeing some issues and wouldn't consider the system
useable yet :(
Given that version 2.6.9 came out, I thought it would be worth upgrading
to grab the new patches I saw go through, so now I'm running 2.6.9.
Light usage with a standard 1500 MTU now works most of the time (i.e.
ping -f works fine, as do normal ssh usage, nfs etc.)
I can't test jumbo packets as mtu 9000 causes immediate page allocation
failures:
ifconfig: page allocation failure. order:3, mode:0x20
on my other (otherwise stable) box.
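For reference, the numbers in that failure message can be unpacked with a little arithmetic. The sketch below assumes 4 KiB pages (the usual i386 value, not stated in this report) and the 2.6.9-era flag value __GFP_HIGH = 0x20, which is what GFP_ATOMIC expanded to; both are assumptions rather than something the message itself confirms.

```python
# Sketch: decode "page allocation failure. order:3, mode:0x20".
# Assumptions (not stated in the report): 4 KiB pages, and the 2.6.9-era
# flag value __GFP_HIGH = 0x20 (the expansion of GFP_ATOMIC).
PAGE_SIZE = 4096   # assumed i386 page size in bytes
GFP_HIGH = 0x20    # assumed value of __GFP_HIGH in 2.6.9


def decode_alloc_failure(order, mode):
    """Return (contiguous pages, bytes, is_atomic) for a failed allocation."""
    pages = 1 << order  # an order-N request asks for 2**N contiguous pages
    return pages, pages * PAGE_SIZE, mode == GFP_HIGH


pages, size, atomic = decode_alloc_failure(order=3, mode=0x20)
print(f"{pages} contiguous pages = {size} bytes, atomic={atomic}")
```

Under those assumptions the failing request is for eight physically contiguous pages in atomic context, where the allocator cannot sleep to reclaim memory. That would be consistent with a large jumbo-frame buffer being hard to satisfy on a fragmented system, though the message alone does not prove it.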
Even on mtu=1500 I do however have problems with sustained throughput.
The reason I got the cards was to make video-editing over the network
quicker so this is a real problem.
At the moment I'm using rsync to transfer a few hundred GB of data.
If I use ssh as the remote shell then the high CPU load bottlenecks the
transfer at 10MB/s.
If I use --rsh=rsh to keep CPU usage minimal, the throughput goes up
to ~21MB/s (the remote server is capable of ~40MB/s raw filesystem I/O
but it still has a slow CPU).
At this throughput my workstation's e1000 appears to begin to fail.
Some tests:
# ping -f cuf
PING cuf (10.0.1.3): 56 data bytes
..
--- cuf ping statistics ---
1483792 packets transmitted, 1483791 packets received, 0% packet loss
round-trip min/avg/max = 0.0/0.6/3074.8 ms
so that works fine (whereas pre-BIOS update I used to have problems)
a more realistic activity:
rsync --rsh="rsh" --progress -a /scratch/* cu:/huge/myth/
1524072448 81% 22.84MB/s 0:00:14
but stalls (every 10-15 seconds) down to
95715328 5% 2.77MB/s 0:01:04
When the stall happens, the e1000_tx_timeout_task() log (extract below)
is produced.
Normally (with Rx/TxDescriptors=256) the start of the log is 'lost' by
syslogd, so the version below is with Tx/RxDescriptors=80.
I've played with a few parameters and found that this set gave me better
behaviour, with a stall every minute or so rather than every few seconds:
# modprobe e1000 InterruptThrottleRate=600 FlowControl=3 TxDescriptors=80 RxDescriptors=80
David
Venkatesan, Ganesh wrote:
>David:
>
>Could you check the BIOS version on your system? We were able to
>reproduce some of your performance issues on a machine with BIOS version
>1.03. Upgrading to version 1.10 resolved all issues. The machine we used
>is:
>Athlon 1800 with an Aopen AK77-KT600N motherboard.
>
>Please let us know what you find.
>
>Thanks,
>Ganesh.
>
>
Nov 14 18:05:05 ash kernel: NETDEV WATCHDOG: eth0: transmit timed out after 5000 jiffies
Nov 14 18:05:05 ash kernel: eth0: transmit timeout from queuing
Nov 14 18:05:05 ash kernel: eth0: state=0x7 transmit ring size=4096 count=80 to_use=6 to_clean=10
Nov 14 18:05:05 ash kernel: 0: skb=00000000 dma=0 length=1514 time=+20656 watch=1
Nov 14 18:05:05 ash kernel: 1: skb=dffbf420 dma=747090014 length=1514 time=+9308 watch=2
Nov 14 18:05:05 ash kernel: 2: skb=00000000 dma=0 length=1514 time=+20656 watch=3
Nov 14 18:05:05 ash kernel: 3: skb=eef81420 dma=370819166 length=1514 time=+9308 watch=4
Nov 14 18:05:05 ash kernel: 4: skb=00000000 dma=0 length=1514 time=+20656 watch=5
Nov 14 18:05:05 ash kernel: 5: skb=dffbf6a0 dma=410486878 length=1514 time=+9308 watch=6
Nov 14 18:05:05 ash kernel: 6: skb=00000000 dma=0 length=1514 time=+20656 watch=7
Nov 14 18:05:05 ash kernel: 7: skb=00000000 dma=0 length=1514 time=+9313 watch=8
Nov 14 18:05:05 ash kernel: 8: skb=00000000 dma=0 length=1514 time=+20656 watch=9
Nov 14 18:05:05 ash kernel: 9: skb=00000000 dma=0 length=1514 time=+9313 watch=10
Nov 14 18:05:05 ash kernel: 10: skb=00000000 dma=0 length=1514 time=+20656 watch=11
Nov 14 18:05:05 ash kernel: 11: skb=dffbf9c0 dma=1015185502 length=1514 time=+9313 watch=12
Nov 14 18:05:05 ash kernel: 12: skb=00000000 dma=0 length=1514 time=+20656 watch=13
Nov 14 18:05:05 ash kernel: 13: skb=b94a42e0 dma=678299742 length=1514 time=+9313 watch=14
Nov 14 18:05:05 ash kernel: 14: skb=00000000 dma=0 length=1514 time=+20656 watch=15
Nov 14 18:05:05 ash kernel: 15: skb=dffbf740 dma=244232286 length=1514 time=+9313 watch=16
Nov 14 18:05:05 ash kernel: 16: skb=00000000 dma=0 length=1514 time=+20656 watch=17
Nov 14 18:05:05 ash kernel: 17: skb=dffbf880 dma=244234334 length=1514 time=+9313 watch=18
Nov 14 18:05:05 ash kernel: 18: skb=00000000 dma=0 length=982 time=+20654 watch=19
Nov 14 18:05:05 ash kernel: 19: skb=e054f560 dma=543977566 length=1514 time=+9313 watch=19
Nov 14 18:05:05 ash kernel: 20: skb=00000000 dma=0 length=1514 time=+20669 watch=21
Nov 14 18:05:05 ash kernel: 21: skb=efd22ce0 dma=543979614 length=1514 time=+9313 watch=22
Nov 14 18:05:05 ash kernel: 22: skb=00000000 dma=0 length=1514 time=+20669 watch=23
Nov 14 18:05:05 ash kernel: 23: skb=eef81740 dma=621445214 length=1514 time=+9313 watch=24
Nov 14 18:05:05 ash kernel: 24: skb=00000000 dma=0 length=1514 time=+20669 watch=25
Nov 14 18:05:05 ash kernel: 25: skb=dffbfd80 dma=621447262 length=1514 time=+9313 watch=26
Nov 14 18:05:05 ash kernel: 26: skb=00000000 dma=0 length=1514 time=+20669 watch=27
Nov 14 18:05:05 ash kernel: 27: skb=b94a4920 dma=96651358 length=1514 time=+9313 watch=28
Nov 14 18:05:05 ash kernel: 28: skb=00000000 dma=0 length=1514 time=+20669 watch=29
Nov 14 18:05:05 ash kernel: 29: skb=ef3d27e0 dma=212396126 length=1514 time=+9312 watch=30
Nov 14 18:05:05 ash kernel: 30: skb=00000000 dma=0 length=1514 time=+20669 watch=31
Nov 14 18:05:05 ash kernel: 31: skb=e054f420 dma=212394078 length=1514 time=+9312 watch=32
Nov 14 18:05:05 ash kernel: 32: skb=00000000 dma=0 length=1514 time=+20669 watch=33
Nov 14 18:05:05 ash kernel: 33: skb=ef3d26a0 dma=471775326 length=1514 time=+9312 watch=34
Nov 14 18:05:05 ash kernel: 34: skb=00000000 dma=0 length=1514 time=+20669 watch=35
Nov 14 18:05:05 ash kernel: 35: skb=b88a0880 dma=471773278 length=1514 time=+9312 watch=36
Nov 14 18:05:05 ash kernel: 36: skb=00000000 dma=0 length=1514 time=+20669 watch=37
Nov 14 18:05:05 ash kernel: 37: skb=c9e26380 dma=301906014 length=1514 time=+9312 watch=38
Nov 14 18:05:05 ash kernel: 38: skb=00000000 dma=0 length=1514 time=+20668 watch=39
Nov 14 18:05:05 ash kernel: 39: skb=efd227e0 dma=301903966 length=1514 time=+9312 watch=40
Nov 14 18:05:05 ash kernel: 40: skb=00000000 dma=0 length=1514 time=+20668 watch=41
Nov 14 18:05:05 ash kernel: 41: skb=c95a2240 dma=292812894 length=1514 time=+9312 watch=42
Nov 14 18:05:05 ash kernel: 42: skb=00000000 dma=0 length=1514 time=+20668 watch=43
Nov 14 18:05:05 ash kernel: 43: skb=eef81100 dma=292810846 length=1514 time=+9312 watch=44
Nov 14 18:05:05 ash kernel: 44: skb=00000000 dma=0 length=1514 time=+20668 watch=45
Nov 14 18:05:05 ash kernel: 45: skb=c9e26f60 dma=412209246 length=1514 time=+9312 watch=46
Nov 14 18:05:05 ash kernel: 46: skb=00000000 dma=0 length=1514 time=+20668 watch=47
Nov 14 18:05:05 ash kernel: 47: skb=b88a07e0 dma=410490974 length=1514 time=+9312 watch=48
Nov 14 18:05:05 ash kernel: 48: skb=00000000 dma=0 length=1514 time=+20668 watch=49
Nov 14 18:05:05 ash kernel: 49: skb=efd22420 dma=471769182 length=1514 time=+9312 watch=50
Nov 14 18:05:05 ash kernel: 50: skb=00000000 dma=0 length=1394 time=+20668 watch=51
Nov 14 18:05:05 ash kernel: 51: skb=e054fec0 dma=471771230 length=994 time=+9312 watch=52
Nov 14 18:05:05 ash kernel: 52: skb=00000000 dma=0 length=78 time=+20656 watch=53
Nov 14 18:05:05 ash kernel: 53: skb=c9e26560 dma=139356254 length=1514 time=+9312 watch=54
Nov 14 18:05:05 ash kernel: 54: skb=00000000 dma=0 length=1514 time=+20656 watch=55
Nov 14 18:05:05 ash kernel: 55: skb=eef817e0 dma=410484830 length=1514 time=+9312 watch=56
Nov 14 18:05:05 ash kernel: 56: skb=00000000 dma=0 length=1514 time=+20656 watch=57
Nov 14 18:05:05 ash kernel: 57: skb=e054f240 dma=410488926 length=1514 time=+9312 watch=58
Nov 14 18:05:05 ash kernel: 58: skb=00000000 dma=0 length=1514 time=+20656 watch=59
Nov 14 18:05:05 ash kernel: 59: skb=ef3d2b00 dma=543973470 length=1514 time=+9310 watch=60
Nov 14 18:05:05 ash kernel: 60: skb=00000000 dma=0 length=1514 time=+20656 watch=61
Nov 14 18:05:05 ash kernel: 61: skb=dffbfec0 dma=747087966 length=1514 time=+9310 watch=62
Nov 14 18:05:05 ash kernel: 62: skb=00000000 dma=0 length=1514 time=+20656 watch=63
Nov 14 18:05:05 ash kernel: 63: skb=ef3d2560 dma=747085918 length=1514 time=+9310 watch=64
Nov 14 18:05:05 ash kernel: 64: skb=00000000 dma=0 length=1514 time=+20656 watch=65
Nov 14 18:05:05 ash kernel: 65: skb=b94a4100 dma=1013049438 length=1514 time=+9310 watch=66
Nov 14 18:05:05 ash kernel: 66: skb=00000000 dma=0 length=1514 time=+20656 watch=67
Nov 14 18:05:05 ash kernel: 67: skb=c95a21a0 dma=323840094 length=1514 time=+9310 watch=68
Nov 14 18:05:05 ash kernel: 68: skb=00000000 dma=0 length=1514 time=+20656 watch=69
Nov 14 18:05:05 ash kernel: 69: skb=e054f2e0 dma=572936286 length=1514 time=+9310 watch=70
Nov 14 18:05:05 ash kernel: 70: skb=00000000 dma=0 length=1514 time=+20656 watch=71
Nov 14 18:05:05 ash kernel: 71: skb=b94a4060 dma=895561822 length=1514 time=+9310 watch=72
Nov 14 18:05:05 ash kernel: 72: skb=00000000 dma=0 length=1514 time=+20656 watch=73
Nov 14 18:05:05 ash kernel: 73: skb=b94a4880 dma=96649310 length=1514 time=+9310 watch=74
Nov 14 18:05:05 ash kernel: 74: skb=00000000 dma=0 length=1514 time=+20656 watch=75
Nov 14 18:05:05 ash kernel: 75: skb=b88a0380 dma=512022622 length=1514 time=+9310 watch=76
Nov 14 18:05:05 ash kernel: 76: skb=00000000 dma=0 length=1514 time=+20656 watch=77
Nov 14 18:05:05 ash kernel: 77: skb=b94a4380 dma=1013047390 length=1514 time=+9310 watch=78
Nov 14 18:05:05 ash kernel: 78: skb=00000000 dma=0 length=1514 time=+20656 watch=79
Nov 14 18:05:05 ash kernel: 79: skb=b88a0100 dma=1053198430 length=1514 time=+9310 watch=0
Nov 14 18:05:08 ash kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
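
The ring header line in the log above gives enough to estimate how full the transmit ring was when the watchdog fired. This is only a rough sketch: it assumes that to_use and to_clean correspond to the driver's next_to_use and next_to_clean indices, that one descriptor is always kept unused, and that HZ=1000 (so 5000 jiffies is about 5 seconds). None of these is confirmed by the log itself.

```python
import re

# Header line from the log above, re-joined after mail wrapping.
HEADER = "eth0: state=0x7 transmit ring size=4096 count=80 to_use=6 to_clean=10"
HZ = 1000  # assumed kernel tick rate; not stated in the log


def ring_occupancy(line):
    """Return (in_flight, unused) descriptor counts from a ring header line."""
    # int(v, 0) accepts both decimal ("80") and hex ("0x7") field values.
    pairs = re.findall(r"(\w+)=(0x[0-9a-fA-F]+|\d+)", line)
    f = {key: int(value, 0) for key, value in pairs}
    # Descriptors queued by the driver but not yet reclaimed (ring wraps).
    in_flight = (f["to_use"] - f["to_clean"]) % f["count"]
    # Assumes one slot is always left empty so full and empty are distinguishable.
    unused = f["count"] - in_flight - 1
    return in_flight, unused


in_flight, unused = ring_occupancy(HEADER)
print(f"{in_flight} descriptors awaiting cleanup, {unused} free")
print(f"watchdog timeout: {5000 / HZ:.0f} s")
```

If those assumptions hold, nearly the whole 80-entry ring was outstanding at the time of the timeout, which would suggest that descriptors had stopped being reclaimed rather than that the sender had simply run out of data.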