* E1000 - page allocation failure - saga continues :(
@ 2005-04-14 21:48 Lukas Hejtmanek
2005-04-18 12:10 ` Yann Dupont
0 siblings, 1 reply; 18+ messages in thread
From: Lukas Hejtmanek @ 2005-04-14 21:48 UTC (permalink / raw)
To: linux-kernel
Hello,
Today I tried the 2.6.11.7 kernel, hoping that the allocation failures would
disappear. Unfortunately, they did not.
The default min_free_kbytes is 3200 kB.
Here is the stack trace:
swapper: page allocation failure. order:0, mode:0x20
[<c0139783>] __alloc_pages+0x2b3/0x420
[<c013c4f1>] kmem_getpages+0x31/0xa0
[<c013d22e>] cache_grow+0xae/0x160
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c013d45b>] cache_alloc_refill+0x17b/0x230
[<c013d7d8>] __kmalloc+0x88/0xa0
[<c0327ce7>] alloc_skb+0x47/0xf0
[<c02bff97>] e1000_alloc_rx_buffers+0x57/0x100
[<c02bfc3f>] e1000_clean_rx_irq+0x1bf/0x4c0
[<c02bf99e>] e1000_clean_tx_irq+0x14e/0x230
[<c02bf79d>] e1000_clean+0x4d/0x100
[<c032e6e1>] net_rx_action+0x81/0x110
[<c011d5aa>] __do_softirq+0xba/0xd0
[<c011d5ed>] do_softirq+0x2d/0x30
[<c011d6b9>] irq_exit+0x39/0x40
[<c01030dc>] apic_timer_interrupt+0x1c/0x24
[<c0100513>] default_idle+0x23/0x30
[<c01005bf>] cpu_idle+0x5f/0x70
[<c04dc988>] start_kernel+0x158/0x180
[<c04dc390>] unknown_bootoption+0x0/0x1e0
swapper: page allocation failure. order:0, mode:0x20
[<c0139783>] __alloc_pages+0x2b3/0x420
[<c013c4f1>] kmem_getpages+0x31/0xa0
[<c013d22e>] cache_grow+0xae/0x160
[<c02b80d8>] as_next_request+0x38/0x50
[<c013d45b>] cache_alloc_refill+0x17b/0x230
[<c02df727>] scsi_put_command+0x77/0xb0
[<c013d7d8>] __kmalloc+0x88/0xa0
[<c0327ce7>] alloc_skb+0x47/0xf0
[<c02bff97>] e1000_alloc_rx_buffers+0x57/0x100
[<c02bfc3f>] e1000_clean_rx_irq+0x1bf/0x4c0
[<c02bf99e>] e1000_clean_tx_irq+0x14e/0x230
[<c02bf79d>] e1000_clean+0x4d/0x100
[<c032e6e1>] net_rx_action+0x81/0x110
[<c011d5aa>] __do_softirq+0xba/0xd0
[<c011d5ed>] do_softirq+0x2d/0x30
[<c011d6b9>] irq_exit+0x39/0x40
[<c01030dc>] apic_timer_interrupt+0x1c/0x24
[<c0100513>] default_idle+0x23/0x30
[<c01005bf>] cpu_idle+0x5f/0x70
[<c04dc988>] start_kernel+0x158/0x180
[<c04dc390>] unknown_bootoption+0x0/0x1e0
swapper: page allocation failure. order:1, mode:0x20
[<c0139783>] __alloc_pages+0x2b3/0x420
[<c013c4f1>] kmem_getpages+0x31/0xa0
[<c013d077>] alloc_slabmgmt+0x57/0x70
[<c013d22e>] cache_grow+0xae/0x160
[<c013d45b>] cache_alloc_refill+0x17b/0x230
[<c013d7d8>] __kmalloc+0x88/0xa0
[<c0327ce7>] alloc_skb+0x47/0xf0
[<c0354541>] tcp_collapse+0xf1/0x390
[<c0354904>] tcp_prune_queue+0x94/0x1e0
[<c0353cd4>] tcp_data_queue+0x3e4/0xb60
[<c0352b73>] tcp_ack+0x4b3/0x590
[<c03552df>] tcp_rcv_established+0x22f/0x8d0
[<c010cbf9>] mark_offset_tsc+0x1d9/0x2d0
[<c035e46b>] tcp_v4_do_rcv+0x12b/0x130
[<c035eb61>] tcp_v4_rcv+0x6f1/0x950
[<c03848f5>] ip_nat_fn+0x75/0x1d0
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c03429ff>] ip_local_deliver_finish+0xcf/0x220
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c03392a1>] nf_hook_slow+0xf1/0x130
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c0342410>] ip_local_deliver+0x280/0x2b0
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342d49>] ip_rcv_finish+0x1f9/0x280
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c03392a1>] nf_hook_slow+0xf1/0x130
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c034284e>] ip_rcv+0x40e/0x4f0
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c032e4b8>] netif_receive_skb+0x148/0x1d0
[<c02bfbdc>] e1000_clean_rx_irq+0x15c/0x4c0
[<c02bf99e>] e1000_clean_tx_irq+0x14e/0x230
[<c02bf79d>] e1000_clean+0x4d/0x100
[<c032e6e1>] net_rx_action+0x81/0x110
[<c011d5aa>] __do_softirq+0xba/0xd0
[<c011d5ed>] do_softirq+0x2d/0x30
[<c011d6b9>] irq_exit+0x39/0x40
[<c0104b0e>] do_IRQ+0x1e/0x30
[<c010304e>] common_interrupt+0x1a/0x20
[<c0100513>] default_idle+0x23/0x30
[<c01005bf>] cpu_idle+0x5f/0x70
[<c04dc988>] start_kernel+0x158/0x180
[<c04dc390>] unknown_bootoption+0x0/0x1e0
swapper: page allocation failure. order:1, mode:0x20
[<c0139783>] __alloc_pages+0x2b3/0x420
[<c013c4f1>] kmem_getpages+0x31/0xa0
[<c013d077>] alloc_slabmgmt+0x57/0x70
[<c013d22e>] cache_grow+0xae/0x160
[<c013d45b>] cache_alloc_refill+0x17b/0x230
[<c013d7d8>] __kmalloc+0x88/0xa0
[<c0327ce7>] alloc_skb+0x47/0xf0
[<c0354541>] tcp_collapse+0xf1/0x390
[<c0354904>] tcp_prune_queue+0x94/0x1e0
[<c0353cd4>] tcp_data_queue+0x3e4/0xb60
[<c0352b73>] tcp_ack+0x4b3/0x590
[<c03552df>] tcp_rcv_established+0x22f/0x8d0
[<c035e46b>] tcp_v4_do_rcv+0x12b/0x130
[<c035eb61>] tcp_v4_rcv+0x6f1/0x950
[<c03848f5>] ip_nat_fn+0x75/0x1d0
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c03429ff>] ip_local_deliver_finish+0xcf/0x220
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c03392a1>] nf_hook_slow+0xf1/0x130
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c0342410>] ip_local_deliver+0x280/0x2b0
[<c0342930>] ip_local_deliver_finish+0x0/0x220
[<c0342d49>] ip_rcv_finish+0x1f9/0x280
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c03392a1>] nf_hook_slow+0xf1/0x130
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c034284e>] ip_rcv+0x40e/0x4f0
[<c0342b50>] ip_rcv_finish+0x0/0x280
[<c032e4b8>] netif_receive_skb+0x148/0x1d0
[<c02bfbdc>] e1000_clean_rx_irq+0x15c/0x4c0
[<c02bf99e>] e1000_clean_tx_irq+0x14e/0x230
[<c02bf79d>] e1000_clean+0x4d/0x100
[<c032e6e1>] net_rx_action+0x81/0x110
[<c011d5aa>] __do_softirq+0xba/0xd0
[<c011d5ed>] do_softirq+0x2d/0x30
[<c011d6b9>] irq_exit+0x39/0x40
[<c0104b0e>] do_IRQ+0x1e/0x30
[<c010304e>] common_interrupt+0x1a/0x20
[<c0100513>] default_idle+0x23/0x30
[<c01005bf>] cpu_idle+0x5f/0x70
[<c04dc988>] start_kernel+0x158/0x180
[<c04dc390>] unknown_bootoption+0x0/0x1e0
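For reading the "page allocation failure" lines above, a small decoding helper may be useful. This is only a sketch and not part of the original report: it assumes i386 with 4 KiB pages and the GFP flag values from 2.6.11-era include/linux/gfp.h (where 0x20 is __GFP_HIGH, i.e. GFP_ATOMIC); check both against your own tree. The function and table names are made up for illustration.

```python
import re

PAGE_SIZE = 4096  # bytes; assumption: i386 with 4 KiB pages

# Assumed flag bits for a 2.6.11-era kernel; later kernels renumbered these.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}

FAILURE_RE = re.compile(
    r"^(?P<task>\S+): page allocation failure\. "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)


def decode_failure(line):
    """Decode one failure line; return None if the line does not match."""
    m = FAILURE_RE.match(line.strip())
    if m is None:
        return None
    order = int(m.group("order"))
    mode = int(m.group("mode"), 16)
    return {
        "task": m.group("task"),
        "order": order,
        # An order-N request asks for 2**N physically contiguous pages.
        "bytes": PAGE_SIZE << order,
        "flags": [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit],
        # Without __GFP_WAIT the caller cannot sleep waiting for reclaim.
        "can_sleep": bool(mode & 0x10),
    }
```

Under those assumptions, mode:0x20 decodes to an atomic request that cannot wait for reclaim, which fits an allocation made from the network softirq path shown in the traces.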
--
Lukáš Hejtmánek
* Re: E1000 - page allocation failure - saga continues :(
2005-04-14 21:48 E1000 - page allocation failure - saga continues :( Lukas Hejtmanek
@ 2005-04-18 12:10 ` Yann Dupont
2005-04-18 12:22 ` Lukas Hejtmanek
0 siblings, 1 reply; 18+ messages in thread
From: Yann Dupont @ 2005-04-18 12:10 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: linux-kernel
Lukas Hejtmanek wrote:
>Hello,
>
>today I tried 2.6.11.7 kernel with hoping that allocation failures disappear.
>Unfortunately they did not.
>
>Default min_free_kb is 3200kB.
>
>Here is stack trace:
>
>swapper: page allocation failure. order:0, mode:0x20
> [<c0139783>] __alloc_pages+0x2b3/0x420
> [<c013c4f1>] kmem_getpages+0x31/0xa0
> [<c013d22e>] cache_grow+0xae/0x160
> [<c0342b50>] ip_rcv_finish+0x0/0x280
> [<c013d45b>] cache_alloc_refill+0x17b/0x230
> [<c013d7d8>] __kmalloc+0x88/0xa0
> [<c0327ce7>] alloc_skb+0x47/0xf0
> [<c02bff97>] e1000_alloc_rx_buffers+0x57/0x100
> [<c02bfc3f>] e1000_clean_rx_irq+0x1bf/0x4c0
>
>
I have those problems too. The (temporary?) fix is to raise
min_free_kbytes to a higher value:
echo 65535 > /proc/sys/vm/min_free_kbytes
Maybe such a high value is totally silly, but at least I no longer get
those messages.
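One rough way to judge whether such a value is "silly" is to compare it with the machine's total RAM. The sketch below is illustrative only: the helper names are invented, and the sample figures in the comments are hypothetical rather than measurements from the boxes discussed in this thread.

```python
def parse_mem_total_kb(meminfo_text):
    """Extract MemTotal (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def reserve_fraction(min_free_kb, mem_total_kb):
    """Fraction of RAM that the kernel will try to keep free."""
    return min_free_kb / mem_total_kb


def read_text(path):
    """Read a small /proc file as text."""
    with open(path) as f:
        return f.read()


# On a live system (reading needs no special privileges):
#   total = parse_mem_total_kb(read_text("/proc/meminfo"))
#   reserve = int(read_text("/proc/sys/vm/min_free_kbytes"))
#   print(f"{reserve_fraction(reserve, total):.1%} of RAM reserved")
#
# Hypothetical figures: a 65536 kB reserve on a machine with 262144 kB
# of RAM would be a quarter of all memory.
```

A large fraction means less memory for the page cache, so the setting trades file-cache capacity for headroom on atomic allocations.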
Sincerely yours,
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
2005-04-18 12:10 ` Yann Dupont
@ 2005-04-18 12:22 ` Lukas Hejtmanek
2005-04-18 12:24 ` Yann Dupont
2005-04-19 7:23 ` Yann Dupont
0 siblings, 2 replies; 18+ messages in thread
From: Lukas Hejtmanek @ 2005-04-18 12:22 UTC (permalink / raw)
To: Yann Dupont; +Cc: linux-kernel
On Mon, Apr 18, 2005 at 02:10:31PM +0200, Yann Dupont wrote:
> I have those problems too. The (temporary ?) fix is to raise the
> min_free_kb to an higher value.
> echo 65535 > /proc/sys/vm/min_free_kbytes
>
> Maybe such an high value is totally silly, but at least I don't have
> those messages.
I know that kernel 2.6.6-bk4 works. So were there changes to the memory manager
since 2.6.6? If so, it looks like there are some bugs.
On the other hand, the Ethernet driver should not allocate much memory but rather
drop packets.
Btw, are you using any TCP tweaks? E.g. I have the default TCP window size set to 1 MB.
--
Lukáš Hejtmánek
* Re: E1000 - page allocation failure - saga continues :(
2005-04-18 12:22 ` Lukas Hejtmanek
@ 2005-04-18 12:24 ` Yann Dupont
2005-04-18 12:34 ` Lukas Hejtmanek
2005-04-19 7:23 ` Yann Dupont
1 sibling, 1 reply; 18+ messages in thread
From: Yann Dupont @ 2005-04-18 12:24 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: linux-kernel
Lukas Hejtmanek wrote:
>On Mon, Apr 18, 2005 at 02:10:31PM +0200, Yann Dupont wrote:
>
>
>>I have those problems too. The (temporary ?) fix is to raise the
>>min_free_kb to an higher value.
>>echo 65535 > /proc/sys/vm/min_free_kbytes
>>
>>Maybe such an high value is totally silly, but at least I don't have
>>those messages.
>>
>>
>
>I know that kernel 2.6.6-bk4 works. So were there some memory manager changes
>since 2.6.6? If so it looks like there are some bugs.
>On the other hand, ethernet driver should not allocate much memory but rather
>drop packets.
>
>Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
>
>
>
No tweaking at all. No jumbo frames.
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
2005-04-18 12:24 ` Yann Dupont
@ 2005-04-18 12:34 ` Lukas Hejtmanek
2005-04-18 12:39 ` Yann Dupont
0 siblings, 1 reply; 18+ messages in thread
From: Lukas Hejtmanek @ 2005-04-18 12:34 UTC (permalink / raw)
To: Yann Dupont; +Cc: linux-kernel
On Mon, Apr 18, 2005 at 02:24:47PM +0200, Yann Dupont wrote:
> >I know that kernel 2.6.6-bk4 works. So were there some memory manager changes
> >since 2.6.6? If so it looks like there are some bugs.
> >On the other hand, ethernet driver should not allocate much memory but rather
> >drop packets.
> >
> >Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
> >
> No tweaking at all. No jumbo frames.
There were suggestions that it is XFS related. Are you using XFS on that box?
I'm able to reproduce this error deterministically:
on an XFS partition, store a file from the network using multiple threads. If the
file is bigger than total memory, it fails once the major part of memory is used
for the file cache.
--
Lukáš Hejtmánek
* Re: E1000 - page allocation failure - saga continues :(
2005-04-18 12:34 ` Lukas Hejtmanek
@ 2005-04-18 12:39 ` Yann Dupont
0 siblings, 0 replies; 18+ messages in thread
From: Yann Dupont @ 2005-04-18 12:39 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: linux-kernel
Lukas Hejtmanek wrote:
>On Mon, Apr 18, 2005 at 02:24:47PM +0200, Yann Dupont wrote:
>
>
>>>I know that kernel 2.6.6-bk4 works. So were there some memory manager changes
>>>since 2.6.6? If so it looks like there are some bugs.
>>>On the other hand, ethernet driver should not allocate much memory but rather
>>>drop packets.
>>>
>>>Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
>>>
>>>
>>>
>>No tweaking at all. No jumbo frames.
>>
>>
>
>There were assumptions that it is XFS related. Are you using XFS on that box?
>
>I'm able to deterministically produce this error:
>on XFS partition store a file from network using multiple threads. If file size
>is bigger then total memory, then it fails after major part of memory is used
>for a file cache.
>
>
>
Ah yes, this is the case.
XFS all over ...
The server is quite heavily stressed: we have a bunch of servers
rsyncing to a big SAN volume, formatted with XFS, that's right.
(And, if that matters, XFS is on top of an EVMS volume (on top of an LVM2
region)...)
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
2005-04-18 12:22 ` Lukas Hejtmanek
2005-04-18 12:24 ` Yann Dupont
@ 2005-04-19 7:23 ` Yann Dupont
2005-04-19 8:03 ` Nick Piggin
2005-04-19 8:04 ` Lukas Hejtmanek
1 sibling, 2 replies; 18+ messages in thread
From: Yann Dupont @ 2005-04-19 7:23 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: linux-kernel
Lukas Hejtmanek wrote:
>On Mon, Apr 18, 2005 at 02:10:31PM +0200, Yann Dupont wrote:
>
>
>>I have those problems too. The (temporary ?) fix is to raise the
>>min_free_kb to an higher value.
>>echo 65535 > /proc/sys/vm/min_free_kbytes
>>
>>Maybe such an high value is totally silly, but at least I don't have
>>those messages.
>>
>>
>
>I know that kernel 2.6.6-bk4 works. So were there some memory manager changes
>since 2.6.6? If so it looks like there are some bugs.
>On the other hand, ethernet driver should not allocate much memory but rather
>drop packets.
>
>Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
>
>
>
Do you have NAPI turned on? I tried with it off on e1000 and ...
surprise!
I haven't had any messages for 12 hours now (usually I get them in less than 1 hour).
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 7:23 ` Yann Dupont
@ 2005-04-19 8:03 ` Nick Piggin
2005-04-19 8:15 ` Yann Dupont
2005-04-19 8:04 ` Lukas Hejtmanek
1 sibling, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2005-04-19 8:03 UTC (permalink / raw)
To: Yann Dupont; +Cc: Lukas Hejtmanek, lkml
On Tue, 2005-04-19 at 09:23 +0200, Yann Dupont wrote:
> Lukas Hejtmanek wrote:
> >Btw, are you using some TCP tweaks? E.g. I have default TCP window size 1MB.
> >
> >
> >
> Do you have turned NAPI on ??? I tried without it off on e1000 and ...
> surprise !
> Don't have any messages since 12H now (usually I got those in less than 1H)
>
Possibly kswapd might be unable to get enough CPU to free memory.
--
SUSE Labs, Novell Inc.
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 7:23 ` Yann Dupont
2005-04-19 8:03 ` Nick Piggin
@ 2005-04-19 8:04 ` Lukas Hejtmanek
2005-04-20 3:42 ` Nuno Silva
2005-04-20 7:12 ` Yann Dupont
1 sibling, 2 replies; 18+ messages in thread
From: Lukas Hejtmanek @ 2005-04-19 8:04 UTC (permalink / raw)
To: Yann Dupont; +Cc: linux-kernel
On Tue, Apr 19, 2005 at 09:23:46AM +0200, Yann Dupont wrote:
> Do you have turned NAPI on ??? I tried without it off on e1000 and ...
> surprise !
> Don't have any messages since 12H now (usually I got those in less than 1H)
I have NAPI on. I tried turning it off, but my test still failed: I can see
allocation failures again.
--
Lukáš Hejtmánek
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 8:03 ` Nick Piggin
@ 2005-04-19 8:15 ` Yann Dupont
2005-04-19 8:24 ` Nick Piggin
2005-04-19 9:34 ` Lukas Hejtmanek
0 siblings, 2 replies; 18+ messages in thread
From: Yann Dupont @ 2005-04-19 8:15 UTC (permalink / raw)
To: Nick Piggin; +Cc: Lukas Hejtmanek, lkml
Nick Piggin wrote:
>
>>Do you have turned NAPI on ??? I tried without it off on e1000 and ...
>>surprise !
>>Don't have any messages since 12H now (usually I got those in less than 1H)
>>
>>
>>
>
>Possibly kswapd might be unable to get enough CPU to free memory.
>
>
>
OK, so what you're saying is that turning NAPI off just slows things down
enough to avoid hitting this problem, right?
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 8:15 ` Yann Dupont
@ 2005-04-19 8:24 ` Nick Piggin
2005-04-19 9:34 ` Lukas Hejtmanek
1 sibling, 0 replies; 18+ messages in thread
From: Nick Piggin @ 2005-04-19 8:24 UTC (permalink / raw)
To: Yann Dupont; +Cc: Lukas Hejtmanek, lkml
On Tue, 2005-04-19 at 10:15 +0200, Yann Dupont wrote:
> Nick Piggin wrote:
>
> >
> >>Do you have turned NAPI on ??? I tried without it off on e1000 and ...
> >>surprise !
> >>Don't have any messages since 12H now (usually I got those in less than 1H)
> >>
> >>
> >>
> >
> >Possibly kswapd might be unable to get enough CPU to free memory.
> >
> >
> >
> Ok, so what you're saying is that turning NAPI off is just slowing down
> things enough to not be hit by
> this problem , right ?
>
Perhaps, yes. Or that NAPI is using more CPU than non-NAPI
(which I understand can happen in some corner cases).
If you have a multiprocessor (or even hyperthreading), I
think you could test this by binding kswapd to one CPU and
putting the NIC interrupts on the other, then testing with and
without NAPI.
That is, presuming you can reproduce the problem on your
multiprocessor system in the first place.
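The experiment described above could be set up roughly as follows. This is a sketch under stated assumptions, not a tested recipe: the function names are invented, the IRQ number has to be looked up in /proc/interrupts for the actual NIC, root privileges are required, and some kernels may refuse to change the affinity of a kernel thread.

```python
import os


def cpu_mask_hex(cpu):
    """Hex bitmask for a single CPU, as written to /proc/irq/<n>/smp_affinity."""
    return format(1 << cpu, "x")


def find_kswapd_pids(proc_root="/proc"):
    """Return PIDs whose command name in <pid>/stat starts with 'kswapd'."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                text = f.read()
            # The command name is the parenthesised second field.
            name = text[text.index("(") + 1:text.rindex(")")]
        except (OSError, ValueError):
            continue  # process exited, or unexpected format
        if name.startswith("kswapd"):
            pids.append(int(entry))
    return sorted(pids)


def pin_kswapd_and_irq(kswapd_cpu, irq, irq_cpu, proc_root="/proc"):
    """Bind every kswapd thread to one CPU and one IRQ to another (needs root)."""
    for pid in find_kswapd_pids(proc_root):
        os.sched_setaffinity(pid, {kswapd_cpu})
    path = os.path.join(proc_root, "irq", str(irq), "smp_affinity")
    with open(path, "w") as f:
        f.write(cpu_mask_hex(irq_cpu))
```

For example, `pin_kswapd_and_irq(0, 18, 1)` would keep kswapd on CPU 0 and route a hypothetical IRQ 18 to CPU 1; the test would then be run once with NAPI and once without.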
--
SUSE Labs, Novell Inc.
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 8:15 ` Yann Dupont
2005-04-19 8:24 ` Nick Piggin
@ 2005-04-19 9:34 ` Lukas Hejtmanek
1 sibling, 0 replies; 18+ messages in thread
From: Lukas Hejtmanek @ 2005-04-19 9:34 UTC (permalink / raw)
To: Yann Dupont; +Cc: Nick Piggin, lkml
On Tue, Apr 19, 2005 at 10:15:27AM +0200, Yann Dupont wrote:
> >Possibly kswapd might be unable to get enough CPU to free memory.
I do not see why the NIC does not simply drop packets instead of running out of
memory.
I know that renicing kswapd helps. But I still do not see why the 2.6.6 kernel works.
--
Lukáš Hejtmánek
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 8:04 ` Lukas Hejtmanek
@ 2005-04-20 3:42 ` Nuno Silva
2005-04-20 7:12 ` Yann Dupont
1 sibling, 0 replies; 18+ messages in thread
From: Nuno Silva @ 2005-04-20 3:42 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: Yann Dupont, linux-kernel
Lukas Hejtmanek wrote:
> On Tue, Apr 19, 2005 at 09:23:46AM +0200, Yann Dupont wrote:
>
>>Do you have turned NAPI on ??? I tried without it off on e1000 and ...
>>surprise !
>>Don't have any messages since 12H now (usually I got those in less than 1H)
>
>
> I have NAPI on. I tried to turn it off but my test failed, I can see allocation
> failure again.
>
Not sure if this was already suggested, but here it is anyway:
echo "vm.min_free_kbytes=16384" >> /etc/sysctl.conf
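Appending with >> adds a new line on every run, so repeating the command leaves duplicate entries in the file. The sketch below shows one way to keep a single active entry; the function name is invented for illustration, and it only transforms text, so writing the result back to /etc/sysctl.conf (as root) is left to the caller.

```python
def set_sysctl_line(conf_text, key, value):
    """Return conf_text with exactly one active 'key=value' line."""
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith("#") or stripped.startswith(";")
        name = stripped.split("=", 1)[0].strip() if "=" in stripped else None
        if not is_comment and name == key:
            if not replaced:
                # Replace the first active entry in place.
                out.append(new_line)
                replaced = True
            # Any later duplicates of the same key are dropped.
            continue
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

The file is only read at boot or when `sysctl -p` is run, so the new value does not take effect merely by editing it.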
Regards,
Nuno Silva
* Re: E1000 - page allocation failure - saga continues :(
2005-04-19 8:04 ` Lukas Hejtmanek
2005-04-20 3:42 ` Nuno Silva
@ 2005-04-20 7:12 ` Yann Dupont
1 sibling, 0 replies; 18+ messages in thread
From: Yann Dupont @ 2005-04-20 7:12 UTC (permalink / raw)
To: Lukas Hejtmanek; +Cc: linux-kernel, Nick Piggin
Lukas Hejtmanek wrote:
>On Tue, Apr 19, 2005 at 09:23:46AM +0200, Yann Dupont wrote:
>
>
>>Do you have turned NAPI on ??? I tried without it off on e1000 and ...
>>surprise !
>>Don't have any messages since 12H now (usually I got those in less than 1H)
>>
>>
>
>I have NAPI on. I tried to turn it off but my test failed, I can see allocation
>failure again.
>
>
>
Well, forgive me :)
I have turned NAPI back on and my box is still happy 19 hours later...
So it's obviously not NAPI.
The problem is that between the two builds of the kernel (2.6.11.7 with the
kswapd messages, and the one that works well), I changed some more options.
Not exactly the best way to track bugs :(
Anyway, I'll try to catch THE option that makes the kernel not so happy
under heavy stress. Stay tuned,
--
Yann Dupont, Cri de l'université de Nantes
Tel: 02.51.12.53.91 - Fax: 02.51.12.58.60 - Yann.Dupont@univ-nantes.fr
* Re: E1000 - page allocation failure - saga continues :(
@ 2005-05-10 8:06 linuxkernel2.20.sandos
2005-05-10 8:19 ` Nick Piggin
0 siblings, 1 reply; 18+ messages in thread
From: linuxkernel2.20.sandos @ 2005-05-10 8:06 UTC (permalink / raw)
To: linux-kernel
>Anyway i'll try to catch THE option that make the kernel not so happy
>under heavy stress. Stay tuned
How did this turn out? Any luck? I'm seeing the same problem with my
e1000. I have now enabled rx/tx flow control, reniced kswapd, and
changed vm.min_free_kbytes to 65536, and the problem went away.
It would be nice to have a "cleaner" solution, though.
---
John Bäckstrand
* Re: E1000 - page allocation failure - saga continues :(
2005-05-10 8:06 E1000 - page allocation failure - saga continues :( linuxkernel2.20.sandos
@ 2005-05-10 8:19 ` Nick Piggin
2005-05-10 9:32 ` E1000 - page allocation failure - saga continues :( linuxkernel2.20.sandos@spamgourmet.com
0 siblings, 1 reply; 18+ messages in thread
From: Nick Piggin @ 2005-05-10 8:19 UTC (permalink / raw)
To: linuxkernel2.20.sandos; +Cc: linux-kernel
linuxkernel2.20.sandos@spamgourmet.com wrote:
> >Anyway i'll try to catch THE option that make the kernel not so happy
> >under heavy stress. Stay tuned
>
> How did this turn out? Any luck? Im seeing this same problem with my
> e1000, now I did enable rx/tx flow control, I reniced kswapd and I
> changed vm.min_free_kbytes to 65536, and the problem went away.
>
> It would be nice with a "cleaner" solution though.
>
What kernel are you using?
Are you doing a lot of block IO as well?
--
SUSE Labs, Novell Inc.
* Re: E1000 - page allocation failure - saga continues :(
2005-05-10 8:19 ` Nick Piggin
@ 2005-05-10 9:32 ` linuxkernel2.20.sandos@spamgourmet.com
2005-05-10 9:44 ` Nick Piggin
0 siblings, 1 reply; 18+ messages in thread
From: linuxkernel2.20.sandos@spamgourmet.com @ 2005-05-10 9:32 UTC (permalink / raw)
To: linux-kernel; +Cc: +linuxkernel2+sandos+f66671bddc.linux-kernel#vger.kernel.org
Nick Piggin - nickpiggin@yahoo.com.au wrote:
> linuxkernel2.20.sandos@spamgourmet.com wrote:
>
>> >Anyway i'll try to catch THE option that make the kernel not so happy
>> >under heavy stress. Stay tuned
>>
>> How did this turn out? Any luck? Im seeing this same problem with my
>> e1000, now I did enable rx/tx flow control, I reniced kswapd and I
>> changed vm.min_free_kbytes to 65536, and the problem went away.
>>
>> It would be nice with a "cleaner" solution though.
>>
>
> What kernel are you using?
> Are you doing a lot of block IO as well?
I am using 2.6.11.8.
Yes, the server is a fileserver for both the internet (~10 Mbit) and
the internal network (1 Gbit e1000). The hardware is pretty old, so it is
fairly heavily loaded, and it has 256 MB of RAM.
---
John Bäckstrand
* Re: E1000 - page allocation failure - saga continues :(
2005-05-10 9:32 ` E1000 - page allocation failure - saga continues :( linuxkernel2.20.sandos@spamgourmet.com
@ 2005-05-10 9:44 ` Nick Piggin
0 siblings, 0 replies; 18+ messages in thread
From: Nick Piggin @ 2005-05-10 9:44 UTC (permalink / raw)
To: linuxkernel2.20.sandos@spamgourmet.com; +Cc: linux-kernel
linuxkernel2.20.sandos@spamgourmet.com wrote:
> Nick Piggin - nickpiggin@yahoo.com.au wrote:
>
>> linuxkernel2.20.sandos@spamgourmet.com wrote:
>>> It would be nice with a "cleaner" solution though.
>>>
>>
>> What kernel are you using?
>> Are you doing a lot of block IO as well?
>
>
> I am using 2.6.11.8.
>
> Yes, the server is a fileserver for both the internet (~10Mbit) and
> internally (1Gbit e1000). Hardware is pretty old so is pretty heavily
> loaded and with 256MB RAM.
>
OK, well there are some patches in 2.6.12 that should make
things slightly better, and then some more patches in -mm
(not sure if they'll make it for 2.6.12) that should make
things slightly better again.
Basically they work towards reducing the memory allocation
"priority" for block IO requests, in relation to networking
and other atomic allocation requirements.
If you can't test the latest -mm, or 2.6.12-rc4, then wait
for 2.6.12 and 2.6.13 and check back on the problem.
Thanks,
Nick
--
SUSE Labs, Novell Inc.