* Page allocation failures in guest
@ 2009-07-13 9:51 Pierre Ossman
2009-07-13 14:59 ` Minchan Kim
0 siblings, 1 reply; 17+ messages in thread
From: Pierre Ossman @ 2009-07-13 9:51 UTC (permalink / raw)
To: avi, kvm; +Cc: LKML, linux-mm
[-- Attachment #1: Type: text/plain, Size: 6060 bytes --]
I upgraded my Fedora 10 host to 2.6.29 a few days ago and since then
one of the guests keeps getting page allocation failures after a few
hours. I've upgraded the kernel in the guest from 2.6.27 to 2.6.29
without any change. There are also a few other guests running on the
machine that aren't having any issues.
The only noticeable thing that dies for me is the network. The machine
still logs properly and I can attach to the local console and reboot it.
This is what I see in dmesg/logs:
Jul 12 23:04:54 loki kernel: sshd: page allocation failure. order:0, mode:0x4020
Jul 12 23:04:54 loki kernel: Pid: 1682, comm: sshd Not tainted 2.6.29.5-84.fc10.x86_64 #1
Jul 12 23:04:54 loki kernel: Call Trace:
Jul 12 23:04:54 loki kernel: <IRQ> [<ffffffff810a1896>] __alloc_pages_internal+0x42f/0x451
Jul 12 23:04:54 loki kernel: [<ffffffff810c52f8>] alloc_pages_current+0xb9/0xc2
Jul 12 23:04:54 loki kernel: [<ffffffff810c926c>] alloc_slab_page+0x19/0x69
Jul 12 23:04:54 loki kernel: [<ffffffff810c931f>] new_slab+0x63/0x1cb
Jul 12 23:04:54 loki kernel: [<ffffffff810c99fd>] __slab_alloc+0x23d/0x3ac
Jul 12 23:04:54 loki kernel: [<ffffffff812d49f2>] ? __netdev_alloc_skb+0x31/0x4d
Jul 12 23:04:54 loki kernel: [<ffffffff810cac1b>] __kmalloc_node_track_caller+0xbb/0x11f
Jul 12 23:04:54 loki kernel: [<ffffffff812d49f2>] ? __netdev_alloc_skb+0x31/0x4d
Jul 12 23:04:54 loki kernel: [<ffffffff812d3dfc>] __alloc_skb+0x6f/0x130
Jul 12 23:04:54 loki kernel: [<ffffffff812d49f2>] __netdev_alloc_skb+0x31/0x4d
Jul 12 23:04:54 loki kernel: [<ffffffffa002e668>] try_fill_recv_maxbufs+0x5a/0x20d [virtio_net]
Jul 12 23:04:54 loki kernel: [<ffffffffa002e83d>] try_fill_recv+0x22/0x17e [virtio_net]
Jul 12 23:04:54 loki kernel: [<ffffffff812d9c74>] ? netif_receive_skb+0x40a/0x42f
Jul 12 23:04:54 loki kernel: [<ffffffffa002f4b9>] virtnet_poll+0x57f/0x5ee [virtio_net]
Jul 12 23:04:54 loki kernel: [<ffffffff81374b45>] ? _spin_lock_irq+0x21/0x26
Jul 12 23:04:54 loki kernel: [<ffffffff812d8372>] net_rx_action+0xb3/0x1af
Jul 12 23:04:54 loki kernel: [<ffffffff8104d9f0>] __do_softirq+0x94/0x150
Jul 12 23:04:54 loki kernel: [<ffffffff8101274c>] call_softirq+0x1c/0x30
Jul 12 23:04:54 loki kernel: <EOI> [<ffffffff81013869>] do_softirq+0x4d/0xb4
Jul 12 23:04:54 loki kernel: [<ffffffff812cf149>] ? release_sock+0xb0/0xbb
Jul 12 23:04:54 loki kernel: [<ffffffff8104d86f>] _local_bh_enable_ip+0xc5/0xe5
Jul 12 23:04:54 loki kernel: [<ffffffff8104d898>] local_bh_enable_ip+0x9/0xb
Jul 12 23:04:54 loki kernel: [<ffffffff81374954>] _spin_unlock_bh+0x13/0x15
Jul 12 23:04:54 loki kernel: [<ffffffff812cf149>] release_sock+0xb0/0xbb
Jul 12 23:04:54 loki kernel: [<ffffffff812d2f38>] ? __kfree_skb+0x82/0x86
Jul 12 23:04:54 loki kernel: [<ffffffff8130f088>] tcp_recvmsg+0x974/0xa99
Jul 12 23:04:54 loki kernel: [<ffffffff812ce566>] sock_common_recvmsg+0x32/0x47
Jul 12 23:04:54 loki kernel: [<ffffffff812cc5a1>] __sock_recvmsg+0x6d/0x7a
Jul 12 23:04:54 loki kernel: [<ffffffff812cc69c>] sock_aio_read+0xee/0xfe
Jul 12 23:04:54 loki kernel: [<ffffffff810d1ecb>] do_sync_read+0xe7/0x12d
Jul 12 23:04:54 loki kernel: [<ffffffff811867ba>] ? rb_erase+0x278/0x2a0
Jul 12 23:04:54 loki kernel: [<ffffffff8105bdc8>] ? autoremove_wake_function+0x0/0x38
Jul 12 23:04:54 loki kernel: [<ffffffff81374845>] ? _spin_lock+0x9/0xc
Jul 12 23:04:54 loki kernel: [<ffffffff811502e8>] ? security_file_permission+0x11/0x13
Jul 12 23:04:54 loki kernel: [<ffffffff810d2884>] vfs_read+0xbb/0x102
Jul 12 23:04:54 loki kernel: [<ffffffff810d298f>] sys_read+0x47/0x6e
Jul 12 23:04:54 loki kernel: [<ffffffff8101133a>] system_call_fastpath+0x16/0x1b
Jul 12 23:04:54 loki kernel: Mem-Info:
Jul 12 23:04:54 loki kernel: Node 0 DMA per-cpu:
Jul 12 23:04:54 loki kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jul 12 23:04:54 loki kernel: Node 0 DMA32 per-cpu:
Jul 12 23:04:54 loki kernel: CPU 0: hi: 186, btch: 31 usd: 119
Jul 12 23:04:54 loki kernel: Active_anon:14065 active_file:87384 inactive_anon:37480
Jul 12 23:04:54 loki kernel: inactive_file:95821 unevictable:4 dirty:8 writeback:0 unstable:0
Jul 12 23:04:54 loki kernel: free:1344 slab:7113 mapped:4283 pagetables:5656 bounce:0
Jul 12 23:04:54 loki kernel: Node 0 DMA free:3988kB min:24kB low:28kB high:36kB active_anon:0kB inactive_anon:0kB active_file:3532kB inactive_file:1032kB unevictable:0kB present:6840kB pages_scanned:0 all_unreclaimable? no
Jul 12 23:04:54 loki kernel: lowmem_reserve[]: 0 994 994 994
Jul 12 23:04:54 loki kernel: Node 0 DMA32 free:1388kB min:4020kB low:5024kB high:6028kB active_anon:56260kB inactive_anon:149920kB active_file:346004kB inactive_file:382252kB unevictable:16kB present:1018016kB pages_scanned:96 all_unreclaimable? no
Jul 12 23:04:54 loki kernel: lowmem_reserve[]: 0 0 0 0
Jul 12 23:04:54 loki kernel: Node 0 DMA: 1*4kB 0*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3988kB
Jul 12 23:04:54 loki kernel: Node 0 DMA32: 4*4kB 77*8kB 3*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1384kB
Jul 12 23:04:54 loki kernel: 183936 total pagecache pages
Jul 12 23:04:54 loki kernel: 0 pages in swap cache
Jul 12 23:04:54 loki kernel: Swap cache stats: add 0, delete 0, find 0/0
Jul 12 23:04:54 loki kernel: Free swap = 1015800kB
Jul 12 23:04:54 loki kernel: Total swap = 1015800kB
Jul 12 23:04:54 loki kernel: 262128 pages RAM
Jul 12 23:04:54 loki kernel: 8339 pages reserved
Jul 12 23:04:54 loki kernel: 34783 pages shared
Jul 12 23:04:54 loki kernel: 245277 pages non-shared
It doesn't look like it's out of memory to me, so I'm not sure what is
going on.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
TigerVNC, core developer http://www.tigervnc.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-07-13  9:51 Page allocation failures in guest Pierre Ossman
@ 2009-07-13 14:59 ` Minchan Kim
  2009-08-11  6:32   ` Pierre Ossman
  0 siblings, 1 reply; 17+ messages in thread

From: Minchan Kim @ 2009-07-13 14:59 UTC (permalink / raw)
To: Pierre Ossman
Cc: avi, kvm, LKML, linux-mm, Wu Fengguang, KOSAKI Motohiro, Rik van Riel

On Mon, Jul 13, 2009 at 6:51 PM, Pierre Ossman <drzeus-list@drzeus.cx> wrote:
> I upgraded my Fedora 10 host to 2.6.29 a few days ago and since then
> one of the guests keeps getting page allocation failures after a few
> hours.
> [...]
> Jul 12 23:04:54 loki kernel: sshd: page allocation failure. order:0, mode:0x4020

GFP_ATOMIC. We don't have many options for reclaiming.

> [...]
> Jul 12 23:04:54 loki kernel: Node 0 DMA free:3988kB min:24kB low:28kB high:36kB active_anon:0kB inactive_anon:0kB active_file:3532kB inactive_file:1032kB unevictable:0kB present:6840kB pages_scanned:0 all_unreclaimable? no

I don't know why present is bigger than free + [in]active anon.
Who knows?

There are 258 pages in inactive file. Unfortunately, it seems we
don't have any discardable pages. The reclaimer can't sync dirty
pages to reclaim them either. That's because we are allocating with
GFP_ATOMIC, as I mentioned.

> [...]
> Jul 12 23:04:54 loki kernel: Node 0 DMA32 free:1388kB min:4020kB low:5024kB high:6028kB active_anon:56260kB inactive_anon:149920kB active_file:346004kB inactive_file:382252kB unevictable:16kB present:1018016kB pages_scanned:96 all_unreclaimable? no

free: 1388kB, min: 4020kB. In addition, the allocation has __GFP_HIGH,
so the calculation in zone_watermark_ok() is:

  1388 < (4020 / 2)

So it fails zone_watermark_ok(). AFAIU, it's effectively an OOM problem.

> [...]

--
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: Page allocation failures in guest
  2009-07-13 14:59 ` Minchan Kim
@ 2009-08-11  6:32   ` Pierre Ossman
  2009-08-11  6:52     ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread

From: Pierre Ossman @ 2009-08-11 6:32 UTC (permalink / raw)
To: Minchan Kim
Cc: avi, kvm, LKML, linux-mm, Wu Fengguang, KOSAKI Motohiro, Rik van Riel

[-- Attachment #1: Type: text/plain, Size: 1988 bytes --]

On Mon, 13 Jul 2009 23:59:52 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> There are 258 pages in inactive file. Unfortunately, it seems we
> don't have any discardable pages. The reclaimer can't sync dirty
> pages to reclaim them either. That's because we are allocating with
> GFP_ATOMIC, as I mentioned.

Any ideas here? Is the virtio net driver so GFP_ATOMIC-happy that it
drains all those pages? And why is this triggered by a kernel upgrade
in the host?

Avi?

> free: 1388kB, min: 4020kB. [...]
> So it fails zone_watermark_ok(). AFAIU, it's effectively an OOM problem.

It doesn't get out of it though, or at least the virtio net driver
wedges itself.

Rgds
--
-- Pierre Ossman

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-08-11  6:32 ` Pierre Ossman
@ 2009-08-11  6:52   ` Avi Kivity
  2009-08-12  3:19     ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread

From: Avi Kivity @ 2009-08-11 6:52 UTC (permalink / raw)
To: Rusty Russell
Cc: Pierre Ossman, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

On 08/11/2009 09:32 AM, Pierre Ossman wrote:
> On Mon, 13 Jul 2009 23:59:52 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>> [...]
>
> Any ideas here? Is the virtio net driver so GFP_ATOMIC-happy that it
> drains all those pages? And why is this triggered by a kernel upgrade
> in the host?
>
> Avi?

Rusty?

> It doesn't get out of it though, or at least the virtio net driver
> wedges itself.

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
* Re: Page allocation failures in guest
  2009-08-11  6:52 ` Avi Kivity
@ 2009-08-12  3:19   ` Rusty Russell
  2009-08-12  5:31     ` Rusty Russell
  2009-08-12  6:19     ` Pierre Ossman
  0 siblings, 2 replies; 17+ messages in thread

From: Rusty Russell @ 2009-08-12 3:19 UTC (permalink / raw)
To: Avi Kivity
Cc: Pierre Ossman, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

On Tue, 11 Aug 2009 04:22:53 pm Avi Kivity wrote:
> On 08/11/2009 09:32 AM, Pierre Ossman wrote:
> > Any ideas here? Is the virtio net driver so GFP_ATOMIC-happy that it
> > drains all those pages? And why is this triggered by a kernel upgrade
> > in the host?
> >
> > Avi?
>
> Rusty?

It's kind of the nature of networking devices :( I'd say your host now
offers GSO features, so the guest allocates big packets.

> > It doesn't get out of it though, or at least the virtio net driver
> > wedges itself.

There's a fixme to retry when this happens, but this is the first report
I've received. I'll check it out.

Thanks,
Rusty.
* Re: Page allocation failures in guest
  2009-08-12  3:19 ` Rusty Russell
@ 2009-08-12  5:31   ` Rusty Russell
  2009-08-12  5:41     ` Avi Kivity
  2009-08-13 20:25     ` Pierre Ossman
  2009-08-12  6:19   ` Pierre Ossman
  1 sibling, 2 replies; 17+ messages in thread

From: Rusty Russell @ 2009-08-12 5:31 UTC (permalink / raw)
To: Avi Kivity
Cc: Pierre Ossman, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

On Wed, 12 Aug 2009 12:49:51 pm Rusty Russell wrote:
> On Tue, 11 Aug 2009 04:22:53 pm Avi Kivity wrote:
> > On 08/11/2009 09:32 AM, Pierre Ossman wrote:
> > > It doesn't get out of it though, or at least the virtio net driver
> > > wedges itself.
>
> There's a fixme to retry when this happens, but this is the first report
> I've received. I'll check it out.

Subject: virtio: net refill on out-of-memory

If we run out of memory, use keventd to fill the buffer.  There's a
report of this happening: "Page allocation failures in guest",
Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -71,6 +71,9 @@ struct virtnet_info
 	struct sk_buff_head recv;
 	struct sk_buff_head send;
 
+	/* Work struct for refilling if we run low on memory. */
+	struct work_struct refill;
+
 	/* Chain pages by the private ptr. */
 	struct page *pages;
 };
@@ -274,19 +277,22 @@ drop:
 	dev_kfree_skb(skb);
 }
 
-static void try_fill_recv_maxbufs(struct virtnet_info *vi)
+static bool try_fill_recv_maxbufs(struct virtnet_info *vi, gfp_t gfp)
 {
 	struct sk_buff *skb;
 	struct scatterlist sg[2+MAX_SKB_FRAGS];
 	int num, err, i;
+	bool oom = false;
 
 	sg_init_table(sg, 2+MAX_SKB_FRAGS);
 	for (;;) {
 		struct virtio_net_hdr *hdr;
 
 		skb = netdev_alloc_skb(vi->dev, MAX_PACKET_LEN + NET_IP_ALIGN);
-		if (unlikely(!skb))
+		if (unlikely(!skb)) {
+			oom = true;
 			break;
+		}
 
 		skb_reserve(skb, NET_IP_ALIGN);
 		skb_put(skb, MAX_PACKET_LEN);
@@ -297,7 +303,7 @@ static void try_fill_recv_maxbufs(struct
 		if (vi->big_packets) {
 			for (i = 0; i < MAX_SKB_FRAGS; i++) {
 				skb_frag_t *f = &skb_shinfo(skb)->frags[i];
-				f->page = get_a_page(vi, GFP_ATOMIC);
+				f->page = get_a_page(vi, gfp);
 				if (!f->page)
 					break;
@@ -326,31 +332,35 @@ static void try_fill_recv_maxbufs(struct
 	if (unlikely(vi->num > vi->max))
 		vi->max = vi->num;
 	vi->rvq->vq_ops->kick(vi->rvq);
+	return !oom;
 }
 
-static void try_fill_recv(struct virtnet_info *vi)
+/* Returns false if we couldn't fill entirely (OOM). */
+static bool try_fill_recv(struct virtnet_info *vi, gfp_t gfp)
 {
 	struct sk_buff *skb;
 	struct scatterlist sg[1];
 	int err;
+	bool oom = false;
 
-	if (!vi->mergeable_rx_bufs) {
-		try_fill_recv_maxbufs(vi);
-		return;
-	}
+	if (!vi->mergeable_rx_bufs)
+		return try_fill_recv_maxbufs(vi, gfp);
 
 	for (;;) {
 		skb_frag_t *f;
 
 		skb = netdev_alloc_skb(vi->dev, GOOD_COPY_LEN + NET_IP_ALIGN);
-		if (unlikely(!skb))
+		if (unlikely(!skb)) {
+			oom = true;
 			break;
+		}
 
 		skb_reserve(skb, NET_IP_ALIGN);
 
 		f = &skb_shinfo(skb)->frags[0];
-		f->page = get_a_page(vi, GFP_ATOMIC);
+		f->page = get_a_page(vi, gfp);
 		if (!f->page) {
+			oom = true;
 			kfree_skb(skb);
 			break;
 		}
@@ -374,6 +384,7 @@ static void try_fill_recv(struct virtnet
 	if (unlikely(vi->num > vi->max))
 		vi->max = vi->num;
 	vi->rvq->vq_ops->kick(vi->rvq);
+	return !oom;
 }
 
 static void skb_recv_done(struct virtqueue *rvq)
@@ -386,6 +397,26 @@ static void skb_recv_done(struct virtque
 	}
 }
 
+static void refill_work(struct work_struct *work)
+{
+	struct virtnet_info *vi;
+	bool still_empty;
+
+	vi = container_of(work, struct virtnet_info, refill);
+	napi_disable(&vi->napi);
+	try_fill_recv(vi, GFP_KERNEL);
+	still_empty = (vi->num == 0);
+	napi_enable(&vi->napi);
+
+	/* In theory, this can happen: if we don't get any buffers in
+	 * we will *never* try to fill again.  Sleeping in keventd is
+	 * bad, but that is worse. */
+	if (still_empty) {
+		msleep(100);
+		schedule_work(&vi->refill);
+	}
+}
+
 static int virtnet_poll(struct napi_struct *napi, int budget)
 {
 	struct virtnet_info *vi = container_of(napi, struct virtnet_info, napi);
@@ -401,10 +432,10 @@ again:
 		received++;
 	}
 
-	/* FIXME: If we oom and completely run out of inbufs, we need
-	 * to start a timer trying to fill more. */
-	if (vi->num < vi->max / 2)
-		try_fill_recv(vi);
+	if (vi->num < vi->max / 2) {
+		if (!try_fill_recv(vi, GFP_ATOMIC))
+			schedule_work(&vi->refill);
+	}
 
 	/* Out of packets? */
 	if (received < budget) {
@@ -894,6 +925,7 @@ static int virtnet_probe(struct virtio_d
 	vi->vdev = vdev;
 	vdev->priv = vi;
 	vi->pages = NULL;
+	INIT_WORK(&vi->refill, refill_work);
 
 	/* If they give us a callback when all buffers are done, we don't need
	 * the timer. */
@@ -942,7 +974,7 @@ static int virtnet_probe(struct virtio_d
 	}
 
 	/* Last of all, set up some receive buffers. */
-	try_fill_recv(vi);
+	try_fill_recv(vi, GFP_KERNEL);
 
 	/* If we didn't even get one input buffer, we're useless. */
 	if (vi->num == 0) {
@@ -959,6 +991,7 @@ static int virtnet_probe(struct virtio_d
 unregister:
 	unregister_netdev(dev);
+	cancel_work_sync(&vi->refill);
 free_vqs:
 	vdev->config->del_vqs(vdev);
 free:
@@ -987,6 +1020,7 @@ static void virtnet_remove(struct virtio
 	BUG_ON(vi->num != 0);
 
 	unregister_netdev(vi->dev);
+	cancel_work_sync(&vi->refill);
 
 	vdev->config->del_vqs(vi->vdev);
* Re: Page allocation failures in guest
  2009-08-12  5:31 ` Rusty Russell
@ 2009-08-12  5:41   ` Avi Kivity
  2009-08-12  6:56     ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread

From: Avi Kivity @ 2009-08-12 5:41 UTC (permalink / raw)
To: Rusty Russell
Cc: Pierre Ossman, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

On 08/12/2009 08:31 AM, Rusty Russell wrote:
> +static void refill_work(struct work_struct *work)
> +{
> +	struct virtnet_info *vi;
> +	bool still_empty;
> +
> +	vi = container_of(work, struct virtnet_info, refill);
> +	napi_disable(&vi->napi);
> +	try_fill_recv(vi, GFP_KERNEL);
> +	still_empty = (vi->num == 0);
> +	napi_enable(&vi->napi);
> +
> +	/* In theory, this can happen: if we don't get any buffers in
> +	 * we will *never* try to fill again.  Sleeping in keventd is
> +	 * bad, but that is worse. */
> +	if (still_empty) {
> +		msleep(100);
> +		schedule_work(&vi->refill);
> +	}
> +}

schedule_delayed_work()?

--
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.
* Re: Page allocation failures in guest
  2009-08-12  5:41 ` Avi Kivity
@ 2009-08-12  6:56   ` Rusty Russell
  0 siblings, 0 replies; 17+ messages in thread

From: Rusty Russell @ 2009-08-12 6:56 UTC (permalink / raw)
To: Avi Kivity
Cc: Pierre Ossman, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

On Wed, 12 Aug 2009 03:11:21 pm Avi Kivity wrote:
> > +	/* In theory, this can happen: if we don't get any buffers in
> > +	 * we will *never* try to fill again.  Sleeping in keventd is
> > +	 * bad, but that is worse. */
> > +	if (still_empty) {
> > +		msleep(100);
> > +		schedule_work(&vi->refill);
> > +	}
>
> schedule_delayed_work()?

Hmm, might as well, although this is v. unlikely to happen.

Thanks,
Rusty.

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -72,7 +72,7 @@ struct virtnet_info
 	struct sk_buff_head send;
 
 	/* Work struct for refilling if we run low on memory. */
-	struct work_struct refill;
+	struct delayed_work refill;
 
 	/* Chain pages by the private ptr. */
 	struct page *pages;
@@ -402,19 +402,16 @@ static void refill_work(struct work_stru
 	struct virtnet_info *vi;
 	bool still_empty;
 
-	vi = container_of(work, struct virtnet_info, refill);
+	vi = container_of(work, struct virtnet_info, refill.work);
 	napi_disable(&vi->napi);
 	try_fill_recv(vi, GFP_KERNEL);
 	still_empty = (vi->num == 0);
 	napi_enable(&vi->napi);
 
 	/* In theory, this can happen: if we don't get any buffers in
-	 * we will *never* try to fill again.  Sleeping in keventd is
-	 * bad, but that is worse. */
-	if (still_empty) {
-		msleep(100);
-		schedule_work(&vi->refill);
-	}
+	 * we will *never* try to fill again. */
+	if (still_empty)
+		schedule_delayed_work(&vi->refill, HZ/2);
 }
 
 static int virtnet_poll(struct napi_struct *napi, int budget)
@@ -434,7 +431,7 @@ again:
 
 	if (vi->num < vi->max / 2) {
 		if (!try_fill_recv(vi, GFP_ATOMIC))
-			schedule_work(&vi->refill);
+			schedule_delayed_work(&vi->refill, 0);
 	}
 
 	/* Out of packets? */
@@ -925,7 +922,7 @@ static int virtnet_probe(struct virtio_d
 	vi->vdev = vdev;
 	vdev->priv = vi;
 	vi->pages = NULL;
-	INIT_WORK(&vi->refill, refill_work);
+	INIT_DELAYED_WORK(&vi->refill, refill_work);
 
 	/* If they give us a callback when all buffers are done, we don't need
	 * the timer. */
@@ -991,7 +988,7 @@ static int virtnet_probe(struct virtio_d
 unregister:
 	unregister_netdev(dev);
-	cancel_work_sync(&vi->refill);
+	cancel_delayed_work_sync(&vi->refill);
 free_vqs:
 	vdev->config->del_vqs(vdev);
 free:
@@ -1020,7 +1017,7 @@ static void virtnet_remove(struct virtio
 	BUG_ON(vi->num != 0);
 
 	unregister_netdev(vi->dev);
-	cancel_work_sync(&vi->refill);
+	cancel_delayed_work_sync(&vi->refill);
 
 	vdev->config->del_vqs(vi->vdev);
* Re: Page allocation failures in guest
  2009-08-12  5:31 ` Rusty Russell
  2009-08-12  5:41   ` Avi Kivity
@ 2009-08-13 20:25   ` Pierre Ossman
  2009-08-26  2:17     ` Rusty Russell
  1 sibling, 1 reply; 17+ messages in thread

From: Pierre Ossman @ 2009-08-13 20:25 UTC (permalink / raw)
To: Rusty Russell
Cc: Avi Kivity, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

[-- Attachment #1: Type: text/plain, Size: 1044 bytes --]

On Wed, 12 Aug 2009 15:01:52 +0930
Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Wed, 12 Aug 2009 12:49:51 pm Rusty Russell wrote:
> > On Tue, 11 Aug 2009 04:22:53 pm Avi Kivity wrote:
> > > On 08/11/2009 09:32 AM, Pierre Ossman wrote:
> > > > It doesn't get out of it though, or at least the virtio net driver
> > > > wedges itself.
> >
> > There's a fixme to retry when this happens, but this is the first report
> > I've received. I'll check it out.
>
> Subject: virtio: net refill on out-of-memory
>
> If we run out of memory, use keventd to fill the buffer.  There's a
> report of this happening: "Page allocation failures in guest",
> Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu>
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Patch applied. Now we wait. :)

--
-- Pierre Ossman

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-08-13 20:25 ` Pierre Ossman
@ 2009-08-26  2:17   ` Rusty Russell
  2009-08-26  4:55     ` Pierre Ossman
  0 siblings, 1 reply; 17+ messages in thread

From: Rusty Russell @ 2009-08-26 2:17 UTC (permalink / raw)
To: Pierre Ossman
Cc: Avi Kivity, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

On Fri, 14 Aug 2009 05:55:48 am Pierre Ossman wrote:
> On Wed, 12 Aug 2009 15:01:52 +0930
> Rusty Russell <rusty@rustcorp.com.au> wrote:
> > Subject: virtio: net refill on out-of-memory
...
> Patch applied. Now we wait. :)

Any results?

Thanks,
Rusty.
* Re: Page allocation failures in guest
  2009-08-26  2:17 ` Rusty Russell
@ 2009-08-26  4:55   ` Pierre Ossman
  2009-08-26 12:18     ` Rusty Russell
  0 siblings, 1 reply; 17+ messages in thread
From: Pierre Ossman @ 2009-08-26 4:55 UTC (permalink / raw)
To: Rusty Russell
Cc: Avi Kivity, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

[-- Attachment #1: Type: text/plain, Size: 892 bytes --]

On Wed, 26 Aug 2009 11:47:17 +0930
Rusty Russell <rusty@rustcorp.com.au> wrote:

> On Fri, 14 Aug 2009 05:55:48 am Pierre Ossman wrote:
> > On Wed, 12 Aug 2009 15:01:52 +0930
> > Rusty Russell <rusty@rustcorp.com.au> wrote:
> > > Subject: virtio: net refill on out-of-memory
> ...
> > Patch applied. Now we wait. :)
>
> Any results?
>

It's been up for 12 days, so I'd say it works. But there is nothing in
dmesg, which suggests I haven't triggered the condition yet.

I wonder if there might be something broken with Fedora's kernel. :/
(I am running the same upstream version, and their config, for this
test, but not all of their patches)

Rgds

-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-08-26  4:55 ` Pierre Ossman
@ 2009-08-26 12:18   ` Rusty Russell
  2009-08-26 19:22     ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Rusty Russell @ 2009-08-26 12:18 UTC (permalink / raw)
To: Pierre Ossman
Cc: Avi Kivity, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel, netdev

On Wed, 26 Aug 2009 02:25:01 pm Pierre Ossman wrote:
> On Wed, 26 Aug 2009 11:47:17 +0930
> Rusty Russell <rusty@rustcorp.com.au> wrote:
>
> > On Fri, 14 Aug 2009 05:55:48 am Pierre Ossman wrote:
> > > On Wed, 12 Aug 2009 15:01:52 +0930
> > > Rusty Russell <rusty@rustcorp.com.au> wrote:
> > > > Subject: virtio: net refill on out-of-memory
> > ...
> > > Patch applied. Now we wait. :)
> >
> > Any results?
>
> It's been up for 12 days, so I'd say it works. But there is nothing in
> dmesg, which suggests I haven't triggered the condition yet.

No, that's totally expected. I wouldn't expect a GFP_ATOMIC order 0
alloc failure to be noted, and the patch doesn't add any printks.

Dave, can you push this to Linus ASAP?

Thanks,
Rusty.

Subject: virtio: net refill on out-of-memory

If we run out of memory, use keventd to fill the buffer. There's a
report of this happening: "Page allocation failures in guest",
Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu>

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
---
 drivers/net/virtio_net.c |   61 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 15 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -71,6 +71,9 @@ struct virtnet_info
 	struct sk_buff_head recv;
 	struct sk_buff_head send;
 
+	/* Work struct for refilling if we run low on memory. */
+	struct delayed_work refill;
+
 	/* Chain pages by the private ptr. */
 	struct page *pages;
 };
@@ -274,19 +277,22 @@ drop:
 	dev_kfree_skb(skb);
 }
 
-static void try_fill_recv_maxbufs(struct virtnet_info *vi)
+static bool try_fill_recv_maxbufs(struct virtnet_info *vi, gfp_t gfp)
 {
 	struct sk_buff *skb;
 	struct scatterlist sg[2+MAX_SKB_FRAGS];
 	int num, err, i;
+	bool oom = false;
 
 	sg_init_table(sg, 2+MAX_SKB_FRAGS);
 	for (;;) {
 		struct virtio_net_hdr *hdr;
 
 		skb = netdev_alloc_skb(vi->dev, MAX_PACKET_LEN + NET_IP_ALIGN);
-		if (unlikely(!skb))
+		if (unlikely(!skb)) {
+			oom = true;
 			break;
+		}
 
 		skb_reserve(skb, NET_IP_ALIGN);
 		skb_put(skb, MAX_PACKET_LEN);
@@ -297,7 +303,7 @@ static void try_fill_recv_maxbufs(struct
 		if (vi->big_packets) {
 			for (i = 0; i < MAX_SKB_FRAGS; i++) {
 				skb_frag_t *f = &skb_shinfo(skb)->frags[i];
-				f->page = get_a_page(vi, GFP_ATOMIC);
+				f->page = get_a_page(vi, gfp);
 				if (!f->page)
 					break;
 
@@ -326,31 +332,35 @@ static void try_fill_recv_maxbufs(struct
 	if (unlikely(vi->num > vi->max))
 		vi->max = vi->num;
 	vi->rvq->vq_ops->kick(vi->rvq);
+	return !oom;
 }
 
-static void try_fill_recv(struct virtnet_info *vi)
+/* Returns false if we couldn't fill entirely (OOM). */
+static bool try_fill_recv(struct virtnet_info *vi, gfp_t gfp)
 {
 	struct sk_buff *skb;
 	struct scatterlist sg[1];
 	int err;
+	bool oom = false;
 
-	if (!vi->mergeable_rx_bufs) {
-		try_fill_recv_maxbufs(vi);
-		return;
-	}
+	if (!vi->mergeable_rx_bufs)
+		return try_fill_recv_maxbufs(vi, gfp);
 
 	for (;;) {
 		skb_frag_t *f;
 
 		skb = netdev_alloc_skb(vi->dev, GOOD_COPY_LEN + NET_IP_ALIGN);
-		if (unlikely(!skb))
+		if (unlikely(!skb)) {
+			oom = true;
 			break;
+		}
 
 		skb_reserve(skb, NET_IP_ALIGN);
 
 		f = &skb_shinfo(skb)->frags[0];
-		f->page = get_a_page(vi, GFP_ATOMIC);
+		f->page = get_a_page(vi, gfp);
 		if (!f->page) {
+			oom = true;
 			kfree_skb(skb);
 			break;
 		}
@@ -374,6 +384,7 @@ static void try_fill_recv(struct virtnet
 	if (unlikely(vi->num > vi->max))
 		vi->max = vi->num;
 	vi->rvq->vq_ops->kick(vi->rvq);
+	return !oom;
 }
 
 static void skb_recv_done(struct virtqueue *rvq)
@@ -386,6 +397,23 @@ static void skb_recv_done(struct virtque
 	}
 }
 
+static void refill_work(struct work_struct *work)
+{
+	struct virtnet_info *vi;
+	bool still_empty;
+
+	vi = container_of(work, struct virtnet_info, refill.work);
+	napi_disable(&vi->napi);
+	try_fill_recv(vi, GFP_KERNEL);
+	still_empty = (vi->num == 0);
+	napi_enable(&vi->napi);
+
+	/* In theory, this can happen: if we don't get any buffers in
+	 * we will *never* try to fill again. */
+	if (still_empty)
+		schedule_delayed_work(&vi->refill, HZ/2);
+}
+
 static int virtnet_poll(struct napi_struct *napi, int budget)
 {
 	struct virtnet_info *vi = container_of(napi, struct virtnet_info, napi);
@@ -401,10 +429,10 @@ again:
 		received++;
 	}
 
-	/* FIXME: If we oom and completely run out of inbufs, we need
-	 * to start a timer trying to fill more. */
-	if (vi->num < vi->max / 2)
-		try_fill_recv(vi);
+	if (vi->num < vi->max / 2) {
+		if (!try_fill_recv(vi, GFP_ATOMIC))
+			schedule_delayed_work(&vi->refill, 0);
+	}
 
 	/* Out of packets? */
 	if (received < budget) {
@@ -894,6 +922,7 @@ static int virtnet_probe(struct virtio_d
 	vi->vdev = vdev;
 	vdev->priv = vi;
 	vi->pages = NULL;
+	INIT_DELAYED_WORK(&vi->refill, refill_work);
 
 	/* If they give us a callback when all buffers are done, we don't need
 	 * the timer. */
@@ -942,7 +971,7 @@ static int virtnet_probe(struct virtio_d
 	}
 
 	/* Last of all, set up some receive buffers. */
-	try_fill_recv(vi);
+	try_fill_recv(vi, GFP_KERNEL);
 
 	/* If we didn't even get one input buffer, we're useless. */
 	if (vi->num == 0) {
@@ -959,6 +988,7 @@ static int virtnet_probe(struct virtio_d
 
 unregister:
 	unregister_netdev(dev);
+	cancel_delayed_work_sync(&vi->refill);
 free_vqs:
 	vdev->config->del_vqs(vdev);
 free:
@@ -987,6 +1017,7 @@ static void virtnet_remove(struct virtio
 	BUG_ON(vi->num != 0);
 
 	unregister_netdev(vi->dev);
+	cancel_delayed_work_sync(&vi->refill);
 
 	vdev->config->del_vqs(vi->vdev);
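The pattern in the patch above — a non-sleeping fast path that falls back to a worker which retries with a blocking allocation and keeps rescheduling itself while the ring is empty — can be sketched in plain userspace C. Everything below (`struct ring`, `alloc_buf`, `fail_budget`, `RING_SIZE`) is a hypothetical stand-in for the driver's real structures, not virtio code; a real atomic allocation can fail for many reasons, modeled here by a simple failure budget.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8

/* Hypothetical stand-in for the driver's receive state (vi->num / vi->max). */
struct ring {
	int num;	/* buffers currently posted */
	int max;	/* high-water mark, as in the driver */
};

/* Stand-in allocator: the non-sleeping path (think GFP_ATOMIC) fails while
 * fail_budget is positive; the sleeping path (think GFP_KERNEL) always
 * succeeds in this sketch. */
static int fail_budget;

static bool alloc_buf(bool can_sleep)
{
	if (!can_sleep && fail_budget > 0) {
		fail_budget--;
		return false;
	}
	return true;
}

/* Mirrors the patched try_fill_recv(): top the ring up, and return false
 * if an allocation failed so the caller knows to schedule a retry. */
static bool try_fill(struct ring *r, bool can_sleep)
{
	bool oom = false;

	while (r->num < RING_SIZE) {
		if (!alloc_buf(can_sleep)) {
			oom = true;
			break;
		}
		r->num++;
	}
	if (r->num > r->max)
		r->max = r->num;
	return !oom;
}

/* Mirrors refill_work(): retry with a blocking allocation. The real work
 * item requeues itself with schedule_delayed_work(&vi->refill, HZ/2)
 * while the ring is still empty; that requeue is modeled as a loop here. */
static void refill_work(struct ring *r)
{
	while (!try_fill(r, true) || r->num == 0)
		;	/* would be a delayed requeue in the driver */
}
```

Note that the real `refill_work()` brackets the refill with `napi_disable()`/`napi_enable()` so the worker cannot race the poll loop over the same ring — a synchronization concern this single-threaded sketch does not need.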
* Re: Page allocation failures in guest
  2009-08-26 12:18 ` Rusty Russell
@ 2009-08-26 19:22   ` David Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2009-08-26 19:22 UTC (permalink / raw)
To: rusty
Cc: drzeus-list, avi, minchan.kim, kvm, linux-kernel, linux-mm,
	fengguang.wu, kosaki.motohiro, riel, netdev

From: Rusty Russell <rusty@rustcorp.com.au>
Date: Wed, 26 Aug 2009 21:48:58 +0930

> Dave, can you push this to Linus ASAP?

Ok.

> Subject: virtio: net refill on out-of-memory
>
> If we run out of memory, use keventd to fill the buffer. There's a
> report of this happening: "Page allocation failures in guest",
> Message-ID: <20090713115158.0a4892b0@mjolnir.ossman.eu>
>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Applied, thanks.
* Re: Page allocation failures in guest
  2009-08-12  3:19 ` Rusty Russell
  2009-08-12  5:31   ` Rusty Russell
@ 2009-08-12  6:19 ` Pierre Ossman
  2009-08-12  7:43   ` Avi Kivity
  1 sibling, 1 reply; 17+ messages in thread
From: Pierre Ossman @ 2009-08-12 6:19 UTC (permalink / raw)
To: Rusty Russell
Cc: Avi Kivity, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

[-- Attachment #1: Type: text/plain, Size: 786 bytes --]

On Wed, 12 Aug 2009 12:49:51 +0930
Rusty Russell <rusty@rustcorp.com.au> wrote:

> > It's kind of the nature of networking devices :(
>
> I'd say your host now offers GSO features, so the guest allocates big
> packets.
>
> > > It doesn't get out of it though, or at least the virtio net driver
> > > wedges itself.
>
> There's a fixme to retry when this happens, but this is the first report
> I've received. I'll check it out.
>

Will it still trigger the OOM killer with this patch, or will things
behave slightly more gracefully?

Rgds

-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-08-12  6:19 ` Pierre Ossman
@ 2009-08-12  7:43   ` Avi Kivity
  2009-08-12  8:22     ` Pierre Ossman
  0 siblings, 1 reply; 17+ messages in thread
From: Avi Kivity @ 2009-08-12 7:43 UTC (permalink / raw)
To: Pierre Ossman
Cc: Rusty Russell, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

On 08/12/2009 09:19 AM, Pierre Ossman wrote:
> Will it still trigger the OOM killer with this patch, or will things
> behave slightly more gracefully?
>

I don't think you mentioned the OOM killer in your original report? Did
it trigger?

-- 
error compiling committee.c: too many arguments to function
* Re: Page allocation failures in guest
  2009-08-12  7:43 ` Avi Kivity
@ 2009-08-12  8:22   ` Pierre Ossman
  2009-08-12  8:35     ` Avi Kivity
  0 siblings, 1 reply; 17+ messages in thread
From: Pierre Ossman @ 2009-08-12 8:22 UTC (permalink / raw)
To: Avi Kivity
Cc: Rusty Russell, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

[-- Attachment #1: Type: text/plain, Size: 725 bytes --]

On Wed, 12 Aug 2009 10:43:46 +0300
Avi Kivity <avi@redhat.com> wrote:

> On 08/12/2009 09:19 AM, Pierre Ossman wrote:
> > Will it still trigger the OOM killer with this patch, or will things
> > behave slightly more gracefully?
> >
>
> I don't think you mentioned the OOM killer in your original report? Did
> it trigger?
>

I might have things backwards here, but I thought the OOM killer started
doing its dirty business once you got that memory allocation failure
dump.

Rgds

-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
* Re: Page allocation failures in guest
  2009-08-12  8:22 ` Pierre Ossman
@ 2009-08-12  8:35   ` Avi Kivity
  0 siblings, 0 replies; 17+ messages in thread
From: Avi Kivity @ 2009-08-12 8:35 UTC (permalink / raw)
To: Pierre Ossman
Cc: Rusty Russell, Minchan Kim, kvm, LKML, linux-mm, Wu Fengguang,
	KOSAKI Motohiro, Rik van Riel

On 08/12/2009 11:22 AM, Pierre Ossman wrote:
>>> Will it still trigger the OOM killer with this patch, or will things
>>> behave slightly more gracefully?
>>
>> I don't think you mentioned the OOM killer in your original report? Did
>> it trigger?
>
> I might have things backwards here, but I thought the OOM killer started
> doing its dirty business once you got that memory allocation failure
> dump.
>

I don't think the oom killer should trigger on GFP_ATOMIC failures, but
don't know for sure. If you don't have a trace saying it picked a task
to kill, it probably didn't.

-- 
error compiling committee.c: too many arguments to function
end of thread, other threads: [~2009-08-26 19:22 UTC | newest]

Thread overview: 17+ messages:
2009-07-13  9:51 Page allocation failures in guest Pierre Ossman
2009-07-13 14:59 ` Minchan Kim
2009-08-11  6:32 ` Pierre Ossman
2009-08-11  6:52 ` Avi Kivity
2009-08-12  3:19 ` Rusty Russell
2009-08-12  5:31 ` Rusty Russell
2009-08-12  5:41 ` Avi Kivity
2009-08-12  6:56 ` Rusty Russell
2009-08-13 20:25 ` Pierre Ossman
2009-08-26  2:17 ` Rusty Russell
2009-08-26  4:55 ` Pierre Ossman
2009-08-26 12:18 ` Rusty Russell
2009-08-26 19:22 ` David Miller
2009-08-12  6:19 ` Pierre Ossman
2009-08-12  7:43 ` Avi Kivity
2009-08-12  8:22 ` Pierre Ossman
2009-08-12  8:35 ` Avi Kivity