From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: order 1 page allocation failures
Date: Thu, 29 Sep 2011 14:26:02 -0300
Message-ID: <20110929172602.GB7202@amt.cnet>
References: <201109271610.20408.thomas@fjellstrom.ca>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Thomas Fjellstrom <thomas@fjellstrom.ca>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:31399 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756890Ab1I2Rmp (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 29 Sep 2011 13:42:45 -0400
Content-Disposition: inline
In-Reply-To: <201109271610.20408.thomas@fjellstrom.ca>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Tue, Sep 27, 2011 at 04:10:20PM -0600, Thomas Fjellstrom wrote:
> Hi,
> 
> I've been having some issues with KVM recently where one or more vms will cause page allocation failure messages, usually with the backtrace including networking functions, example follows:
> 
> [362409.429944] kvm: page allocation failure: order:1, mode:0x20
> [362409.429957] Pid: 3453, comm: kvm Not tainted 3.0.0-1-amd64 #1
> [362409.429965] Call Trace:
> [362409.429970]  <IRQ>  [<ffffffff810b9c90>] ? warn_alloc_failed+0x108/0x11b
> [362409.429998]  [<ffffffff810bcd78>] ? __alloc_pages_nodemask+0x6e6/0x75c
> [362409.430012]  [<ffffffff810ec0c0>] ? kmem_getpages+0x55/0xf0
> [362409.430022]  [<ffffffff810ec87a>] ? fallback_alloc+0x129/0x1c1
> [362409.430035]  [<ffffffff8100e28d>] ? paravirt_read_tsc+0x5/0x8
> [362409.430045]  [<ffffffff810ed10e>] ? kmem_cache_alloc+0x73/0xf0
> [362409.430057]  [<ffffffff812707a2>] ? sk_prot_alloc+0x2b/0x128
> [362409.430067]  [<ffffffff81270965>] ? sk_clone+0x14/0x2bd
> [362409.430077]  [<ffffffff812ade7d>] ? inet_csk_clone+0x10/0x91
> [362409.430088]  [<ffffffff812c1aae>] ? tcp_create_openreq_child+0x21/0x41a
> [362409.430099]  [<ffffffff812bf98a>] ? tcp_v4_syn_recv_sock+0x33/0x208
> [362409.430110]  [<ffffffff812c2441>] ? tcp_check_req+0x1ff/0x2dd
> [362409.430122]  [<ffffffff812adc06>] ? inet_csk_search_req+0x35/0xa7
> [362409.430132]  [<ffffffff812bf4f1>] ? tcp_v4_do_rcv+0x206/0x32c
> [362409.430144]  [<ffffffff812c15d4>] ? tcp_v4_rcv+0x419/0x66c
> [362409.430154]  [<ffffffff8100e74a>] ? native_sched_clock+0x28/0x30
> [362409.430173]  [<ffffffff812a5a0c>] ? ip_local_deliver_finish+0x14b/0x1bb
> [362409.430186]  [<ffffffff8127cc8f>] ? __netif_receive_skb+0x3d7/0x40b
> [362409.430197]  [<ffffffff8127d74b>] ? netif_receive_skb+0x52/0x58
> [362409.430220]  [<ffffffffa04b5af6>] ? br_nf_pre_routing_finish+0x1d4/0x1e1 [bridge]
> [362409.430241]  [<ffffffffa04b5111>] ? NF_HOOK_THRESH+0x3b/0x55 [bridge]
> [362409.430260]  [<ffffffffa04b60ed>] ? br_nf_pre_routing+0x3be/0x3cb [bridge]
> [362409.430272]  [<ffffffff8129fb69>] ? nf_iterate+0x41/0x77
> [362409.430288]  [<ffffffffa04b13a3>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
> [362409.430305]  [<ffffffffa04b13a3>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
> [362409.430315]  [<ffffffff8129fc12>] ? nf_hook_slow+0x73/0x111
> [362409.430330]  [<ffffffffa04b13a3>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
> [362409.430342]  [<ffffffff8103f0a4>] ? try_to_wake_up+0x199/0x199
> [362409.430358]  [<ffffffffa04b13a3>] ? NF_HOOK.clone.4+0x56/0x56 [bridge]
> [362409.430375]  [<ffffffffa04b1389>] ? NF_HOOK.clone.4+0x3c/0x56 [bridge]
> [362409.430392]  [<ffffffffa04b1745>] ? br_handle_frame+0x1af/0x1c6 [bridge]
> [362409.430408]  [<ffffffffa04b1596>] ? br_handle_frame_finish+0x1f3/0x1f3 [bridge]
> [362409.430420]  [<ffffffff8127cb7c>] ? __netif_receive_skb+0x2c4/0x40b
> [362409.430432]  [<ffffffff8127cd3b>] ? process_backlog+0x78/0x157
> [362409.430443]  [<ffffffff8127dd68>] ? net_rx_action+0xa4/0x1b2
> [362409.430454]  [<ffffffff81038189>] ? test_tsk_need_resched+0xe/0x17
> [362409.430465]  [<ffffffff8104bdd4>] ? __do_softirq+0xb9/0x178
> [362409.430476]  [<ffffffff8133cf1c>] ? call_softirq+0x1c/0x30
> 
> The server has 8G of ram, and usually never uses more than about 4G (sitting at 3.4G right now).

This is a guest problem; please report it to the netfilter/lkml lists
(if it's not already a known issue with that particular kernel version).
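
When reporting, it may help to spell out what the first line of the quoted
trace says. The sketch below decodes "order:1, mode:0x20" under two
assumptions that are not stated in the report: 4 KiB pages (the x86-64
default) and the low GFP flag bit values from include/linux/gfp.h as they
stood around Linux 3.0. The function name decode_alloc_failure is made up
for this illustration.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the x86-64 default

# Assumption: low GFP bits as defined around Linux 3.0; other kernel
# versions may number these flags differently.
GFP_BITS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}

def decode_alloc_failure(line):
    """Parse a 'page allocation failure' log line (hypothetical helper)."""
    m = re.search(
        r"page allocation failure: order:(\d+), mode:(0x[0-9a-fA-F]+)", line
    )
    if m is None:
        return None
    order = int(m.group(1))
    mode = int(m.group(2), 16)
    return {
        "order": order,
        "pages": 1 << order,                 # 2**order contiguous pages
        "bytes": (1 << order) * PAGE_SIZE,
        "mode": mode,
        "flags": [n for b, n in sorted(GFP_BITS.items()) if mode & b],
        "can_sleep": bool(mode & 0x10),      # __GFP_WAIT set?
    }

line = "[362409.429944] kvm: page allocation failure: order:1, mode:0x20"
info = decode_alloc_failure(line)
print(info)
```

Under those assumptions the request was for two physically contiguous pages
(8 KiB) with only __GFP_HIGH set, i.e. an atomic allocation that cannot
sleep or reclaim. Such a request can fail on a fragmented system even when
plenty of memory is free in total, which would be consistent with the
3.4G-of-8G usage mentioned above.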