public inbox for kvm@vger.kernel.org
* Kernel oops in host caused by mmaping RAM
@ 2011-04-12 19:41 Sasha Levin
  2011-04-12 21:09 ` Jan Kiszka
  2011-04-12 23:27 ` [PATCH] KVM: VMX: Ensure that vmx_create_vcpu always returns proper error Jan Kiszka
  0 siblings, 2 replies; 6+ messages in thread
From: Sasha Levin @ 2011-04-12 19:41 UTC
  To: kvm

Hello,

I've tried using mmap to map the RAM of a guest instead of
posix_memalign which is used both in the kvm tool and qemu.

Doing so caused a kernel Oops, which happens every time I run the code
and was confirmed both on 2.6.38 and the latest git build of 2.6.39.

[32109.368018] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[32109.368018] IP: [<ffffffff810033b0>] kvm_vm_ioctl+0xbc/0x33a
[32109.368018] PGD d7202067 PUD 6a838067 PMD 0
[32109.368018] Oops: 0002 [#1] PREEMPT SMP
[32109.368018] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/block/sda/uevent
[32109.368018] CPU 0
[32109.368018] Modules linked in:
[32109.368018]
[32109.368018] Pid: 20829, comm: kvm Not tainted 2.6.38-gentoo-r1 #4 System manufacturer System Product Name/P5GC-MX/1333
[32109.368018] RIP: 0010:[<ffffffff810033b0>]  [<ffffffff810033b0>] kvm_vm_ioctl+0xbc/0x33a
[32109.368018] RSP: 0018:ffff880037013e28  EFLAGS: 00010207
[32109.368018] RAX: 0000000000000000 RBX: ffff880037158000 RCX: 0000000000000000
[32109.368018] RDX: 0000000000000000 RSI: ffff880037013d78 RDI: 0000000000000206
[32109.368018] RBP: ffff880037013ea8 R08: ffff880000098e00 R09: 0000000000000004
[32109.368018] R10: 0000000000000000 R11: ffff880037013ca8 R12: 0000000000000000
[32109.368018] R13: 000000000000ae41 R14: 0000000000000000 R15: 0000000000000000
[32109.368018] FS:  00007f83f7cd9700(0000) GS:ffff8800d7c00000(0000) knlGS:0000000000000000
[32109.368018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[32109.368018] CR2: 0000000000000008 CR3: 00000000d062e000 CR4: 00000000000026e0
[32109.368018] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[32109.368018] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[32109.368018] Process kvm (pid: 20829, threadinfo ffff880037012000, task ffff88008525d340)
[32109.368018] Stack:
[32109.368018]  ffff880037013e48 ffffffff8150d59c ffff88000e9b9308 ffff88000e9b9308
[32109.368018]  ffff880037013ec8 ffffffff81079e25 ffff8800d7c0e388 ffff88000e9b9308
[32109.368018]  0000000000000001 000000000000e380 ffff880037013e98 ffffffff8105b04e
[32109.368018] Call Trace:
[32109.368018]  [<ffffffff8150d59c>] ? _raw_spin_unlock_irqrestore+0x3c/0x49
[32109.368018]  [<ffffffff81079e25>] ? __hrtimer_start_range_ns+0x2b4/0x2c6
[32109.368018]  [<ffffffff8105b04e>] ? get_parent_ip+0x11/0x41
[32109.368018]  [<ffffffff810f1ba9>] do_vfs_ioctl+0x3f1/0x440
[32109.368018]  [<ffffffff8150d59c>] ? _raw_spin_unlock_irqrestore+0x3c/0x49
[32109.368018]  [<ffffffff8107605c>] ? sys_timer_settime+0x254/0x2a4
[32109.368018]  [<ffffffff810f1c49>] sys_ioctl+0x51/0x74
[32109.368018]  [<ffffffff81027a52>] system_call_fastpath+0x16/0x1b
[32109.368018] Code: 40 40 0f 85 70 02 00 00 e9 13 02 00 00 44 89 e6 45 89 e6 48 89 df e8 19 6e 00 00 49 89 c4 49 81 fc 00 f0 ff ff 0f 87 6f 02 00 00 <49> c7 44 24 08 00 00 00 00 49 c7 44 24 10 00 00 00 00 49 c7 44
[32109.368018] RIP  [<ffffffff810033b0>] kvm_vm_ioctl+0xbc/0x33a
[32109.368018]  RSP <ffff880037013e28>
[32109.368018] CR2: 0000000000000008
[32109.368018] [drm] force priority to high
[32109.385714] ---[ end trace 0fc207e73803c472 ]---



--
Sasha.


* Re: Kernel oops in host caused by mmaping RAM
  2011-04-12 19:41 Kernel oops in host caused by mmaping RAM Sasha Levin
@ 2011-04-12 21:09 ` Jan Kiszka
  2011-04-13 12:50   ` Pekka Enberg
  2011-04-12 23:27 ` [PATCH] KVM: VMX: Ensure that vmx_create_vcpu always returns proper error Jan Kiszka
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2011-04-12 21:09 UTC
  To: Sasha Levin; +Cc: kvm


On 2011-04-12 21:41, Sasha Levin wrote:
> Hello,
> 
> I've tried using mmap to map the RAM of a guest instead of
> posix_memalign which is used both in the kvm tool and qemu.
> 
> Doing so caused a kernel Oops, which happens every time I run the code
> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
> 

Can you share the test case that triggers it? That's easier than
guessing what you did precisely.

Thanks,
Jan




* [PATCH] KVM: VMX: Ensure that vmx_create_vcpu always returns proper error
  2011-04-12 19:41 Kernel oops in host caused by mmaping RAM Sasha Levin
  2011-04-12 21:09 ` Jan Kiszka
@ 2011-04-12 23:27 ` Jan Kiszka
  2011-04-16 14:50   ` Marcelo Tosatti
  1 sibling, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2011-04-12 23:27 UTC
  To: Sasha Levin; +Cc: kvm, Avi Kivity, Marcelo Tosatti

On 2011-04-12 21:41, Sasha Levin wrote:
> Hello,
> 
> I've tried using mmap to map the RAM of a guest instead of
> posix_memalign which is used both in the kvm tool and qemu.
> 
> Doing so caused a kernel Oops, which happens every time I run the code
> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
> 
> [...]
> 

Patch below fixes the oops for me.

It looks like the problem was that your guest memory setup caused a
conflict with the kernel's desire to map the APIC access page. So
alloc_apic_access_page failed, but that error was not properly reported
back, causing the NULL pointer dereferencing.
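
To illustrate the failure mode, here is a minimal userspace sketch of the
ERR_PTR convention. It is not the kernel code: the three helpers are
re-implemented here, modeled on include/linux/err.h, and struct vcpu,
create_vcpu_buggy() and create_vcpu_fixed() are hypothetical names invented
for the example. The point it shows is that ERR_PTR(0) is simply NULL, and
NULL does not satisfy IS_ERR(), so a caller that only checks IS_ERR() goes
on to use the pointer. That is consistent with the reported oops, where R12
is 0 and CR2 is 0x8 (a store to offset 8 through a NULL pointer).

```c
/* Userspace sketch of the kernel's ERR_PTR convention; the helpers below are
 * re-implemented for illustration and are not the kernel's own definitions. */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for the top MAX_ERRNO addresses, i.e. encoded -1..-4095. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct vcpu {			/* hypothetical stand-in, not struct kvm_vcpu */
	int id;
	int mode;
};

/* Buggy pattern: a late failure jumps to cleanup while err is still 0. */
struct vcpu *create_vcpu_buggy(int late_step_fails)
{
	int err = 0;
	struct vcpu *v = calloc(1, sizeof(*v));

	if (!v)
		return ERR_PTR(-ENOMEM);
	if (late_step_fails)
		goto free_vcpu;		/* err was never set */
	return v;

free_vcpu:
	free(v);
	return ERR_PTR(err);		/* ERR_PTR(0) is NULL */
}

/* Fixed pattern: record the failing step's error code before bailing out. */
struct vcpu *create_vcpu_fixed(int late_step_fails)
{
	int err;
	struct vcpu *v = calloc(1, sizeof(*v));

	if (!v)
		return ERR_PTR(-ENOMEM);
	err = late_step_fails ? -ENOMEM : 0;
	if (err)
		goto free_vcpu;
	return v;

free_vcpu:
	free(v);
	return ERR_PTR(err);
}
```

With these definitions, create_vcpu_buggy(1) returns NULL, which passes an
IS_ERR() check, while create_vcpu_fixed(1) returns a pointer for which
IS_ERR() is true and PTR_ERR() yields -ENOMEM.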

Thanks for reporting,
Jan

-----8<------

From: Jan Kiszka <jan.kiszka@siemens.com>

In case certain allocations fail, vmx_create_vcpu may return 0 as error
instead of a negative value encoded via ERR_PTR. This causes a NULL
pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 arch/x86/kvm/vmx.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index aabe333..af52069 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4251,8 +4251,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 		goto free_vcpu;
 
 	vmx->guest_msrs = kmalloc(PAGE_SIZE, GFP_KERNEL);
+	err = -ENOMEM;
 	if (!vmx->guest_msrs) {
-		err = -ENOMEM;
 		goto uninit_vcpu;
 	}
 
@@ -4271,7 +4271,8 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_vmcs;
 	if (vm_need_virtualize_apic_accesses(kvm))
-		if (alloc_apic_access_page(kvm) != 0)
+		err = alloc_apic_access_page(kvm);
+		if (err)
 			goto free_vmcs;
 
 	if (enable_ept) {
-- 
1.7.1


* Re: Kernel oops in host caused by mmaping RAM
  2011-04-12 21:09 ` Jan Kiszka
@ 2011-04-13 12:50   ` Pekka Enberg
  2011-04-13 12:58     ` Sasha Levin
  0 siblings, 1 reply; 6+ messages in thread
From: Pekka Enberg @ 2011-04-13 12:50 UTC
  To: Jan Kiszka; +Cc: Sasha Levin, kvm

On Wed, Apr 13, 2011 at 12:09 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
> On 2011-04-12 21:41, Sasha Levin wrote:
>> Hello,
>>
>> I've tried using mmap to map the RAM of a guest instead of
>> posix_memalign which is used both in the kvm tool and qemu.
>>
>> Doing so caused a kernel Oops, which happens every time I run the code
>> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
>>
>
> Can you share the test case that triggers it? That's easier than
> guessing what you did precisely.

It's the native Linux kvm tool patched to use mmap() instead of
posix_memalign(). Sasha, maybe you should post your patch so other
people can try to reproduce the problem?


* Re: Kernel oops in host caused by mmaping RAM
  2011-04-13 12:50   ` Pekka Enberg
@ 2011-04-13 12:58     ` Sasha Levin
  0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2011-04-13 12:58 UTC
  To: Pekka Enberg; +Cc: Jan Kiszka, kvm

On Wed, Apr 13, 2011 at 3:50 PM, Pekka Enberg <penberg@kernel.org> wrote:
> On Wed, Apr 13, 2011 at 12:09 AM, Jan Kiszka <jan.kiszka@web.de> wrote:
>> On 2011-04-12 21:41, Sasha Levin wrote:
>>> Hello,
>>>
>>> I've tried using mmap to map the RAM of a guest instead of
>>> posix_memalign which is used both in the kvm tool and qemu.
>>>
>>> Doing so caused a kernel Oops, which happens every time I run the code
>>> and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
>>>
>>
>> Can you share the test case that triggers it? That's easier than
>> guessing what you did precisely.
>
> It's the native Linux kvm tool patched to use mmap() instead of
> posix_memalign(). Sasha, maybe you should post your patch so other
> people can try to reproduce the problem?
>

I provided Jan with a patch to the kvm tool yesterday; Jan has
reproduced the oops and sent a patch to kernel-side KVM to fix it.
Here's the patch for the Linux kvm tool that triggered the oops.

diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 08ff63c..bac2a5e 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -158,7 +158,6 @@ struct kvm *kvm__init(const char *kvm_dev, unsigned long ram_size)
 	struct kvm_userspace_memory_region mem;
 	struct kvm_pit_config pit_config = { .flags = 0, };
 	struct kvm *self;
-	long page_size;
 	int ret;

 	if (!kvm__cpu_supports_vm())
@@ -199,8 +198,8 @@ struct kvm *kvm__init(const char *kvm_dev, unsigned long ram_size)

 	self->ram_size		= ram_size;

-	page_size	= sysconf(_SC_PAGESIZE);
-	if (posix_memalign(&self->ram_start, page_size, self->ram_size) != 0)
+	self->ram_start = mmap(NULL, self->ram_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, -1, 0);
+	if (self->ram_start == MAP_FAILED)
 		die("out of memory");

 	mem = (struct kvm_userspace_memory_region) {


--
Sasha.
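
For readers who want to try the allocation change outside the kvm tool, the
following is a minimal standalone sketch of the same mmap() call. The helper
names guest_ram_alloc() and guest_ram_free() are hypothetical and do not
exist in the kvm tool; the flags are the ones used in the patch above. Two
details it illustrates: the value to compare against MAP_FAILED is the
pointer returned by mmap(), and mmap() already returns page-aligned memory,
so the explicit alignment that posix_memalign() provided is not needed.

```c
/* Standalone sketch: anonymous, lazily backed mapping for guest RAM.
 * guest_ram_alloc()/guest_ram_free() are hypothetical helper names. */
#define _GNU_SOURCE		/* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Returns a zero-filled, page-aligned region, or NULL on failure. */
void *guest_ram_alloc(size_t size)
{
	void *ram = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, -1, 0);

	/* mmap() signals failure with MAP_FAILED, not NULL. */
	if (ram == MAP_FAILED)
		return NULL;
	return ram;
}

/* Releases a region obtained from guest_ram_alloc(); 0 on success. */
int guest_ram_free(void *ram, size_t size)
{
	return munmap(ram, size);
}
```

A caller would pass the returned pointer as the userspace address of the
memory region it registers with KVM, and treat a NULL return as an
out-of-memory condition.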


* Re: [PATCH] KVM: VMX: Ensure that vmx_create_vcpu always returns proper error
  2011-04-12 23:27 ` [PATCH] KVM: VMX: Ensure that vmx_create_vcpu always returns proper error Jan Kiszka
@ 2011-04-16 14:50   ` Marcelo Tosatti
  0 siblings, 0 replies; 6+ messages in thread
From: Marcelo Tosatti @ 2011-04-16 14:50 UTC
  To: Jan Kiszka; +Cc: Sasha Levin, kvm, Avi Kivity

On Wed, Apr 13, 2011 at 01:27:55AM +0200, Jan Kiszka wrote:
> On 2011-04-12 21:41, Sasha Levin wrote:
> > Hello,
> > 
> > I've tried using mmap to map the RAM of a guest instead of
> > posix_memalign which is used both in the kvm tool and qemu.
> > 
> > Doing so caused a kernel Oops, which happens every time I run the code
> > and was confirmed both on 2.6.38 and the latest git build of 2.6.39.
> > 
> > [...]
> > 
> 
> Patch below fixes the oops for me.
> 
> It looks like the problem was that your guest memory setup caused a
> conflict with the kernel's desire to map the APIC access page. So
> alloc_apic_access_page failed, but that error was not properly reported
> back, causing the NULL pointer dereferencing.
> 
> Thanks for reporting,
> Jan
> 
> -----8<------
> 
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> In case certain allocations fail, vmx_create_vcpu may return 0 as error
> instead of a negative value encoded via ERR_PTR. This causes a NULL
> pointer dereferencing later on in kvm_vm_ioctl_vcpu_create.
> 
> Reported-by: Sasha Levin <levinsasha928@gmail.com>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Applied, thanks.


