Re: 2.6.6 e1000 ifconfig: page allocation failure


From: David Greaves <david@dgreaves.com>
To: "Venkatesan, Ganesh" <ganesh.venkatesan@intel.com>
Cc: Jens Laas <jens.laas@data.slu.se>,
	Stephen Hemminger <shemminger@osdl.org>,
	netdev@oss.sgi.com
Subject: Re: 2.6.6 e1000 ifconfig: page allocation failure
Date: Fri, 18 Jun 2004 17:59:37 +0100
Message-ID: <40D31F79.3000903@dgreaves.com>
In-Reply-To: <468F3FDA28AA87429AD807992E22D07E01767AF6@orsmsx408>

On the 2.6.6 server machine:
  ifconfig eth0 mtu 9000
gives an oops in the USB code?

Unable to handle kernel paging request at virtual address 92a8292a
 printing eip:
d1163305
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<d1163305>]    Not tainted
EFLAGS: 00010286   (2.6.6)
EIP is at usb_buffer_free+0x15/0x50 [usbcore]
eax: cea2ec00   ebx: c13665e8   ecx: 00000001   edx: 92a8290a
esi: c13665ec   edi: cf0439dc   ebp: cf58eef4   esp: c3535f44
ds: 007b   es: 007b   ss: 0068
Process usb (pid: 2744, threadinfo=c3534000 task=cf245370)
Stack: cba80d00 c13665e8 c13665ec cf0439dc d106e3a6 cea2ec00 00002000 cf636000
       0f636000 c13665e8 d106e4a9 c13665e8 cf122980 cffe0280 c01470d3 cf0439dc
       cf122980 cf122980 00000000 cf27f200 c3534000 c0145a19 cf122980 cf27f200
Call Trace:
 [<d106e3a6>] usblp_cleanup+0x46/0xb0 [usblp]
 [<d106e4a9>] usblp_release+0x59/0x60 [usblp]
 [<c01470d3>] __fput+0xe3/0x100
 [<c0145a19>] filp_close+0x59/0x90
 [<c0145aa0>] sys_close+0x50/0x60
 [<c0103f0b>] syscall_call+0x7/0xb

Code: 8b 4a 20 85 c9 74 07 8b 41 18 85 c0 75 04 83 c4 10 c3 8b 44
 <6>usb 1-1: new full speed USB device using address 3
drivers/usb/class/usblp.c: usblp0: USB Bidirectional printer dev 3 if 0 alt 0 proto 2 vid 0x04B8 pid 0x0005
ifconfig: page allocation failure. order:3, mode:0x20
Call Trace:
 [<c013136f>] __alloc_pages+0x2af/0x2f0
 [<c01313d5>] __get_free_pages+0x25/0x40
 [<c01342e7>] cache_grow+0x87/0x230
 [<c01345c9>] cache_alloc_refill+0x139/0x200
 [<c0134960>] __kmalloc+0x70/0x80
 [<c02c1869>] alloc_skb+0x49/0xe0
 [<d110f262>] e1000_alloc_rx_buffers+0x62/0x100 [e1000]
 [<d110c045>] e1000_up+0x45/0xb0 [e1000]
 [<d110e4fc>] e1000_change_mtu+0x7c/0xd0 [e1000]
 [<c02c6e49>] dev_set_mtu+0x79/0x90
 [<c02c7429>] dev_ioctl+0x1e9/0x270
 [<c030032e>] inet_ioctl+0x8e/0xa0
 [<c02be895>] sock_ioctl+0xb5/0x250
 [<c015655d>] sys_ioctl+0xad/0x210
 [<c01129d0>] do_page_fault+0x0/0x4ff
 [<c0103f0b>] syscall_call+0x7/0xb

MemTotal:       256440 kB
MemFree:          2576 kB
Buffers:         18276 kB
Cached:         202048 kB
SwapCached:          0 kB
Active:         112492 kB
Inactive:       115324 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       256440 kB
LowFree:          2576 kB
SwapTotal:      522100 kB
SwapFree:       522100 kB
Dirty:               8 kB
Writeback:           0 kB
Mapped:          14856 kB
Slab:            16920 kB
Committed_AS:    20272 kB
PageTables:        368 kB
VmallocTotal:   770040 kB
VmallocUsed:     10656 kB
VmallocChunk:   759264 kB



I have seen a similar failure on the stable box after it has been running for a while.
I ran:
ifconfig eth1 mtu 9000
on the good machine and it gave me this:

Jun 18 16:33:08 haze kernel: printk: 1 messages suppressed.
Jun 18 16:33:08 haze kernel: ifconfig: page allocation failure. order:3, mode:0x20
Jun 18 16:33:08 haze kernel:  [__alloc_pages+728/848] __alloc_pages+0x2d8/0x350
Jun 18 16:33:08 haze kernel:  [__get_free_pages+37/64] __get_free_pages+0x25/0x40
Jun 18 16:33:08 haze kernel:  [kmem_getpages+32/176] kmem_getpages+0x20/0xb0
Jun 18 16:33:08 haze kernel:  [cache_grow+166/512] cache_grow+0xa6/0x200
Jun 18 16:33:08 haze kernel:  [cache_alloc_refill+342/544] cache_alloc_refill+0x156/0x220
Jun 18 16:33:08 haze kernel:  [__kmalloc+116/128] __kmalloc+0x74/0x80
Jun 18 16:33:08 haze kernel:  [alloc_skb+71/224] alloc_skb+0x47/0xe0
Jun 18 16:33:08 haze kernel:  [pg0+945227150/1069572096] e1000_alloc_rx_buffers+0x5e/0x100 [e1000]
Jun 18 16:33:08 haze kernel:  [pg0+945213509/1069572096] e1000_up+0x45/0xb0 [e1000]
Jun 18 16:33:08 haze kernel:  [pg0+945223248/1069572096] e1000_change_mtu+0x80/0x110 [e1000]
Jun 18 16:33:08 haze kernel:  [dev_set_mtu+121/144] dev_set_mtu+0x79/0x90
Jun 18 16:33:08 haze kernel:  [dev_ioctl+501/640] dev_ioctl+0x1f5/0x280
Jun 18 16:33:08 haze kernel:  [inet_ioctl+142/160] inet_ioctl+0x8e/0xa0
Jun 18 16:33:08 haze kernel:  [sock_ioctl+233/656] sock_ioctl+0xe9/0x290
Jun 18 16:33:08 haze kernel:  [sys_ioctl+239/608] sys_ioctl+0xef/0x260
Jun 18 16:33:08 haze kernel:  [do_page_fault+0/1242] do_page_fault+0x0/0x4da
Jun 18 16:33:08 haze kernel:  [syscall_call+7/11] syscall_call+0x7/0xb

At that point it had:
root@haze:~ # cat /proc/meminfo
MemTotal:      1036868 kB
MemFree:          7564 kB
Buffers:         30720 kB
Cached:         756496 kB
SwapCached:          0 kB
Active:         553348 kB
Inactive:       362700 kB
HighTotal:      131056 kB
HighFree:          252 kB
LowTotal:       905812 kB
LowFree:          7312 kB
SwapTotal:           0 kB
SwapFree:            0 kB
Dirty:               0 kB
Writeback:           0 kB
Mapped:         179532 kB
Slab:           105264 kB
Committed_AS:   298092 kB
PageTables:       1504 kB
VmallocTotal:   114680 kB
VmallocUsed:      2112 kB
VmallocChunk:   112376 kB
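
To compare the two boxes, the figures above can be pulled out of the dump programmatically. The following is a minimal Python sketch, not part of the original report: parse_meminfo and SAMPLE are hypothetical names, the sample values are copied from the haze output above, and the 4 kB page size is an assumption (usual on i386, but not stated in the dump).

```python
def parse_meminfo(text):
    """Return {field_name: value_in_kB} from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:   <number> kB" style lines.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


# A few lines copied from the haze dump above.
SAMPLE = """\
MemTotal:      1036868 kB
MemFree:          7564 kB
Cached:         756496 kB
LowFree:          7312 kB
SwapTotal:           0 kB
"""

PAGE_KB = 4  # assumption: 4 kB pages

info = parse_meminfo(SAMPLE)
low_free_pages = info["LowFree"] // PAGE_KB
print("free low-memory pages:", low_free_pages)
print("swap configured:", info["SwapTotal"] > 0)
```

Note that a count of free pages says nothing about whether any of them are physically contiguous, which is what a higher-order allocation needs.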

I could reproduce this by setting the MTU back to 1500 and then to 9000 again.
It turned out the distro had never run mkswap on the swap partition, so I
added swap and the problem went away.
If I swapoff, then every time I set the MTU to 9000 I get the page
allocation failure.
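
For reference, "order:3" in the failure message means the allocator was asked for 2^3 physically contiguous pages in one piece. The arithmetic can be sketched in Python as below. This is only an illustration: the 4096-byte page size is an assumption (typical for i386), order_to_bytes and smallest_order are hypothetical helper names, and nothing here models how the e1000 driver actually sizes its receive buffers. The "mode:0x20" value appears to correspond to an atomic (non-sleeping) allocation in kernels of this era, but treat that reading as an assumption too.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as is usual on i386


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one allocation of the given page order."""
    return (1 << order) * page_size


def smallest_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order whose block can hold nbytes."""
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


# The failing request in the logs above was order 3.
print(order_to_bytes(3))        # bytes in one order-3 block
print(order_to_bytes(3) // PAGE_SIZE, "contiguous pages")
```

So each failed request needed eight contiguous pages (32 KiB under the page-size assumption), which can be unavailable on a fragmented machine even when MemFree is well above that figure.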

I don't think this should happen, but I'm not sure: *must* I have swap?
Also, I did this while the interface was up (it let me).

David


Venkatesan, Ganesh wrote:

>Jens/David:
>
>Did not mean to get off the list. For some reason, my subscription to
>netdev is not working (even after re-subscribing). So, I grabbed your
>message off of the archive.
>
>I am trying to recreate your failure scenario in our lab. In the
>meantime, please send me any new information you have on this issue.
>
>Thanks,
>ganesh 
> 
>-------------------------------------------------
>Ganesh Venkatesan
>Network/Storage Division, Hillsboro, OR
>
>-----Original Message-----
>From: David Greaves [mailto:david@dgreaves.com] 
>Sent: Friday, June 18, 2004 5:52 AM
>To: Jens Laas
>Cc: Stephen Hemminger; netdev@oss.sgi.com; Venkatesan, Ganesh
>Subject: Re: 2.6.6 e1000 NETDEV WATCHDOG: eth0: transmit timed out+
>delay scheduler
>
>New info:
>I booted into XP and the card works there - so it doesn't look like a 
>simple hardware incompatibility.
>[I've got no real way to test the performance but cygwin's wget against 
>apache1.3 on the linux box returns about 25M/s initially and then 15M/s 
>sustained for 500Mb]
>
>Jens Laas wrote:
>
>  
>
>>>I'm speaking with Ganesh Venkatesan at intel about it. Ganesh you 
>>>went off list - do you want to include Jens or maybe go back on-list?
>>>      
>>>
>>If others run into this problem I'm sure they'll appreciate it if it's
>>on the list.
>>Since we have no idea what causes this (AFAIK), it may be a more
>>general problem than the device driver.
>>    
>>
>
>I tend to agree - but I wasn't sure if this was the place and I'll do as
>I'm told ;)
>
>  
>
>>>A simple failure case for me is: 'ping -s 1500 '
>>>This doesn't cause the timeout but doesn't succeed either.
>>>
>>>ping -f with standard packet size succeeds (slow rate though) and 
>>>doesn't timeout.
>>>      
>>>
>>
>>I don't see the ping problems at all, unless you try to ping when the
>>interface has "hung"?
>>    
>>
>
><sigh> thought that might be helpful.
>Ping with -s and -f seems to allow me to trigger errors and it seems a
>lot more debug-able than scp or nfs :)
>No, all tests are run when it's been reset and is 'clean'.
>
>  
>
>>>============
>>>From here on down it's 2.6.7 with Stephen's recent delay scheduler
>>>patch.
>>>This changed the behaviour.
>>>      
>>>
>>
>>This is strange unless you are actually using the delay scheduler?
>>The default is sch_generic (that is, pfifo), which does not exhibit the
>>problems corrected by the patch.
>>    
>>
>
>I'll go back and double check in case I cocked up...
>(I noticed the e1000 module was rebuilt, but you're right, that's incidental.)
>
>I've rebuilt the kernel and modules with and without the patch and rebooted a
>few times, and I can't reproduce that effect - sorry for the red herring.
>So the results I reported are still reproducible after reverting Stephen's
>patch.
>
>  
>
>>>10592 packets transmitted, 10591 packets received, 0% packet loss
>>>round-trip min/avg/max = 5.4/5.5/83.5 ms
>>>
>>>Increasing Transmit Descriptors to 4096 avoids the No buffer space 
>>>available with packet sizes up to -s65468 (still 100% failure though)
>>>      
>>>
>>Increasing nr of buffers is not a way to fix the problem.
>>    
>>
>
>agreed - however in my ignorance of the deep behaviour I'm reporting
>things that affect behaviour in ways I don't expect.
>I expected it to take longer to run out of buffers - that didn't happen :)
>
>(Anyway, on retesting I find that this was wrong - I suspect the 
>interface was down and I didn't notice)
>
>  
>
>>I had hoped to hear something about this from Scott..
>>    
>>
>
>I'm happy to hear from anyone - I don't have *that* long until my RMA 
>option expires and I don't fancy keeping them as ornaments!
>
>David
>
>
>
>
>  
>
