From mboxrd@z Thu Jan  1 00:00:00 1970
From: Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: Linux x86 guest panics in skb_copy_bits
Date: Wed, 6 May 2009 21:16:27 -0300
Message-ID: <20090507001627.GA21693@amt.cnet>
References: <ce02da610905031140w76c44964sac142c1965621e98@mail.gmail.com> <20090504224006.GB10616@amt.cnet> <ce02da610905051529w23a72a54m33218d998a36e5d0@mail.gmail.com> <ce02da610905061635y1bb94abemdb5d52bbe0ac905@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org
To: Justin Dossey <jbd@justindossey.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx2.redhat.com ([66.187.237.31]:54818 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750861AbZEGE1c (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 7 May 2009 00:27:32 -0400
Content-Disposition: inline
In-Reply-To: <ce02da610905061635y1bb94abemdb5d52bbe0ac905@mail.gmail.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Justin,

On Wed, May 06, 2009 at 04:35:24PM -0700, Justin Dossey wrote:
> On Tue, May 5, 2009 at 3:29 PM, Justin Dossey <jbd@justindossey.com> wrote:
> > On Mon, May 4, 2009 at 3:40 PM, Marcelo Tosatti <mtosatti@redhat.com> wrote:
> >> Justin,
> >>
> >> On Sun, May 03, 2009 at 11:40:47AM -0700, Justin Dossey wrote:
> > [snip]
> >>
> >> Seems to be an issue with paravirt mmu. Do you happen to have
> >> CONFIG_DEBUG_PAGEALLOC turned on your guests?
> >
> > I don't, as my VMs are in production use.  To find the source of this
> > issue, I can turn it on though.
> >
> > While I'm at it, are there any other kernel features I should enable?
> >
>
> I went ahead and recompiled with CONFIG_DEBUG_PAGEALLOC enabled.
> Here's the panic (77 seconds after boot!)
>
> [   76.911884] BUG: unable to handle kernel paging request at f4d17000
> [   76.915076] IP: [<c02856a9>] __slab_alloc+0x217/0x42f
> [   76.917161] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
> [   76.919309] last sysfs file: /sys/kernel/uevent_seqnum
> [   76.920015] Modules linked in:
> [   76.920015]
> [   76.920015] Pid: 4632, comm: ruby18 Not tainted (2.6.28-gentoo-r4 #2)
> [   76.920015] EIP: 0060:[<c02856a9>] EFLAGS: 00210086 CPU: 0
> [   76.920015] EIP is at __slab_alloc+0x217/0x42f
> [   76.920015] EAX: c0761564 EBX: c1abb740 ECX: c0761564 EDX: 00000000
> [   76.920015] ESI: f4d17800 EDI: f4d17000 EBP: f72e9b8c ESP: f72e9b6c
> [   76.920015]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> [   76.920015] Process ruby18 (pid: 4632, ti=f72e8000 task=f721d8e0
> task.ti=f72e8000)
> [   76.920015] Stack:
> [   76.920015]  ffffffff 00000020 c0761564 00000000 f4d10000 00000000
> 00200282 c0761564
> [   76.920015]  f72e9bb4 c02867a5 c04b6782 c2a202a8 c04b6782 00000020
> 00000800 f70583c0
> [   76.920015]  00000600 f70583c0 f72e9bd4 c04b5948 00000000 00000020
> c0760f58 00000020
> [   76.920015] Call Trace:
> [   76.920015]  [<c02867a5>] ? __kmalloc_track_caller+0x89/0xda
> [   76.920015]  [<c04b6782>] ? __netdev_alloc_skb+0x17/0x34
> [   76.920015]  [<c04b6782>] ? __netdev_alloc_skb+0x17/0x34
> [   76.920015]  [<c04b5948>] ? __alloc_skb+0x4f/0xfb
> [   76.920015]  [<c04b6782>] ? __netdev_alloc_skb+0x17/0x34
> [   76.920015]  [<c0452583>] ? try_fill_recv+0x30/0x177
> [   76.920015]  [<c04b37a1>] ? sock_def_readable+0x5e/0x63
> [   76.920015]  [<c0453237>] ? virtnet_poll+0x25c/0x309
> [   76.920015]  [<c04bc32e>] ? net_rx_action+0xbd/0x1ea
> [   76.920015]  [<c022b2ed>] ? __do_softirq+0x83/0x12e
> [   76.920015]  [<c022b3e0>] ? do_softirq+0x48/0x57
> [   76.920015]  [<c022b6fa>] ? irq_exit+0x38/0x6d
> [   76.920015]  [<c0205868>] ? do_IRQ+0x96/0xae
> [   76.920015]  [<c020471b>] ? common_interrupt+0x23/0x28
> [   76.920015]  [<c0274379>] ? copy_page_range+0x25c/0x51e
> [   76.920015]  [<c0225731>] ? dup_mm+0x22a/0x30c
> [   76.920015]  [<c0226141>] ? copy_process+0x906/0x1026
> [   76.920015]  [<c02269a0>] ? do_fork+0xd6/0x21f
> [   76.920015]  [<c03ae3e0>] ? copy_to_user+0x2a/0x36
> [   76.920015]  [<c0202236>] ? sys_clone+0x25/0x2a
> [   76.920015]  [<c0203bd2>] ? syscall_call+0x7/0xb
> [   76.920015] Code: c1 e9 02 f3 ab f6 c2 02 74 02 66 ab f6 c2 01 74
> 01 aa 8b 7d f0 89 fe eb 19 8b 45 e8 89 f9 89 da e8 4d ee ff ff 8b 4d
> e8 03 79 0c <89> 37 89 f7 03 71 04 8b 55 e8 0f b7 43 0a 0f af 42 04 03
> 45f0

OK, so your original report contains a different bug. You can disable
CONFIG_KVM_GUEST in the meantime as a workaround, until the bug is
fixed.