* NULL pointer dereference in kernel code, ignored parameters in libkvm
@ 2009-05-23 22:20 Gabe Black
  2009-05-24 11:59 ` Avi Kivity
  0 siblings, 1 reply; 4+ messages in thread
From: Gabe Black @ 2009-05-23 22:20 UTC
  To: kvm; +Cc: nathan binkert, Steve Reinhardt

    Hi. I'm a developer on the M5 simulator (m5sim.org) working on a CPU
model which uses kvm as its execution engine. I ran into a kernel "BUG"
where a NULL pointer is being dereferenced in gfn_to_rmap.

    What's happening on the kernel side is that gfn_to_rmap is calling
gfn_to_memslot. That function looks for the gfn in the memory slots,
fails to find it, and returns a NULL pointer. gfn_to_rmap then tries to
dereference it, and the kernel kills itself. I believe the original
source of the call to gfn_to_memslot was mmu_alloc_roots (in 2.6.28.9,
it may have moved) which tries to get the page pointed to by CR3 using
kvm_mmu_get_page. That part may not be correct, so here's the log output
from the kernel.

May 15 18:54:46 fajita BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
May 15 18:54:46 fajita IP: [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita PGD 136051067 PUD 1299fd067 PMD 0
May 15 18:54:46 fajita Oops: 0000 [#1] SMP
May 15 18:54:46 fajita last sysfs file: /sys/power/state
May 15 18:54:46 fajita CPU 0
May 15 18:54:46 fajita Modules linked in: snd_hda_intel nvidia(P) snd_pcm snd_timer snd iwlagn snd_page_alloc
May 15 18:54:46 fajita Pid: 7325, comm: m5.opt Tainted: P           2.6.28.9 #2
May 15 18:54:46 fajita RIP: 0010:[<ffffffff802127b3>]  [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita RSP: 0018:ffff880129963cf8  EFLAGS: 00010246
May 15 18:54:46 fajita RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
May 15 18:54:46 fajita RDX: 0000000000000000 RSI: 0000000000000070 RDI: ffff8801268d8000
May 15 18:54:46 fajita RBP: 0000000000000070 R08: 000000000000000a R09: 0000000000000000
May 15 18:54:46 fajita R10: 000000000000008b R11: 0000000000000002 R12: 0000000000000070
May 15 18:54:46 fajita R13: 0000000000000000 R14: 000000000000ae80 R15: 0000000000000070
May 15 18:54:46 fajita FS:  0000000041e1d950(0063) GS:ffffffff80ab2040(0000) knlGS:0000000000000000
May 15 18:54:46 fajita CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 15 18:54:46 fajita CR2: 0000000000000000 CR3: 0000000129909000 CR4: 00000000000026e0
May 15 18:54:46 fajita DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 15 18:54:46 fajita DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 15 18:54:46 fajita Process m5.opt (pid: 7325, threadinfo ffff880129962000, task ffff88013a1eacd0)
May 15 18:54:46 fajita Stack:
May 15 18:54:46 fajita ffff88013aba6800 ffff8801299727b0 ffff8801268d8000 ffffffff80213abe
May 15 18:54:46 fajita 00000000000080d0 ffff8801299727b0 ffff88012f040590 00000000000e0044
May 15 18:54:46 fajita ffff880129972040 ffffffff80213eeb ffff88013b282380 0000000000000246
May 15 18:54:46 fajita Call Trace:
May 15 18:54:46 fajita [<ffffffff80213abe>] ? rmap_write_protect+0x25/0x123
May 15 18:54:46 fajita [<ffffffff80213eeb>] ? kvm_mmu_get_page+0x2cb/0x320
May 15 18:54:46 fajita [<ffffffff80214f51>] ? kvm_mmu_load+0x80/0x1b1
May 15 18:54:46 fajita [<ffffffff806db286>] ? __down_read+0x12/0x93
May 15 18:54:46 fajita [<ffffffff8020fc9c>] ? kvm_arch_vcpu_ioctl_run+0x1ce/0x621
May 15 18:54:46 fajita [<ffffffff8020b590>] ? kvm_vcpu_ioctl+0xf2/0x448
May 15 18:54:46 fajita [<ffffffff80287a8d>] ? handle_mm_fault+0x367/0x6dd
May 15 18:54:46 fajita [<ffffffff802ae03e>] ? vfs_ioctl+0x21/0x6b
May 15 18:54:46 fajita [<ffffffff802ae402>] ? do_vfs_ioctl+0x37a/0x3c1
May 15 18:54:46 fajita [<ffffffff806dd616>] ? do_page_fault+0x444/0x806
May 15 18:54:46 fajita [<ffffffff80407353>] ? __up_write+0x21/0x10e
May 15 18:54:46 fajita [<ffffffff802ae485>] ? sys_ioctl+0x3c/0x5c
May 15 18:54:46 fajita [<ffffffff802234db>] ? system_call_fastpath+0x16/0x1b
May 15 18:54:46 fajita Code: 26 21 80 48 89 f3 e8 33 ff ff ff 48 89 df 5b e9 c0 fe ff ff 55 48 89 f5 53 89 d3 48 83 ec 08 e8 60 78 ff ff 85 db 48 89 c1 75 11 <48> 2b 28 48 8d 14 ed 00 00 00 00 48 03 50 18 eb 19 48 8b 00 48
May 15 18:54:46 fajita RIP  [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita RSP <ffff880129963cf8>
May 15 18:54:46 fajita CR2: 0000000000000000
May 15 18:54:46 fajita ---[ end trace 61dc41d5d0f7fc5f ]---



I looked in your git repository and this bug seems to be present in your
most recent code.

The second problem was the fact that CR3 didn't point to any memory even
though it had a valid value (0x7000). This was because our code relied
on kvm_create to set up physical memory, and while it takes parameters
for it and passes them around, it never actually seems to do anything
with them. This also seems to be the case in your most recent code.

The series of events leading to the BUG were then the following:

1. Our code calls kvm_create to create the vm and create its physical
memory, only the first of which happens.
2. Our code tries to start a CPU in that VM from a point where paging is
turned on and CR3 has a value that points into the physical memory that
doesn't exist.
3. The kernel code tries to get at the reverse mapping for the guest
page frame number.
4. Code below that tries to find the "slot" for that address, fails to
do so, but continues anyway, causing the kernel to dereference a NULL
pointer.
5. Kablooey.
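
Steps 3 and 4 can be illustrated with a small user-space model. This is
only a sketch under simplifying assumptions, not the kernel's code: the
struct layout and the names find_memslot and find_rmap are hypothetical
stand-ins for gfn_to_memslot and gfn_to_rmap, and the NULL check shown
is one possible defensive pattern rather than the fix that was actually
applied upstream.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Simplified stand-in for a KVM memory slot (hypothetical layout). */
struct memslot {
    gfn_t base_gfn;        /* first guest frame number covered by the slot */
    unsigned long npages;  /* number of pages in the slot */
    unsigned long *rmap;   /* one reverse-map entry per page */
};

struct vm {
    struct memslot *slots;
    size_t nslots;
};

/* Return the slot that contains gfn, or NULL when no slot covers it. */
struct memslot *find_memslot(struct vm *vm, gfn_t gfn)
{
    for (size_t i = 0; i < vm->nslots; i++) {
        struct memslot *s = &vm->slots[i];
        if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
            return s;
    }
    return NULL;
}

/*
 * Return the reverse-map entry for gfn, or NULL when the frame is not
 * backed by any slot. Without the NULL check, the subtraction below
 * would read s->base_gfn through a NULL pointer, which is the kind of
 * access the oops reports.
 */
unsigned long *find_rmap(struct vm *vm, gfn_t gfn)
{
    struct memslot *s = find_memslot(vm, gfn);
    if (s == NULL)
        return NULL;
    return &s->rmap[gfn - s->base_gfn];
}
```

With a single slot covering frames 0x100 to 0x103, a lookup for gfn 0x7
returns NULL from both functions instead of faulting, and the caller
can then decide how to report the unbacked frame.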


I am a full time employee of VMware, and while I work on M5 on my own
time, that places certain limits on what I can do to help fix these
bugs. While I probably can't implement anything, I should be able to
provide more information about what we're doing with M5 or about the
crash if that would help.

Gabe Black


* Re: NULL pointer dereference in kernel code, ignored parameters in libkvm
  2009-05-23 22:20 NULL pointer dereference in kernel code, ignored parameters in libkvm Gabe Black
@ 2009-05-24 11:59 ` Avi Kivity
  2009-05-24 20:26   ` Gabe Black
  0 siblings, 1 reply; 4+ messages in thread
From: Avi Kivity @ 2009-05-24 11:59 UTC
  To: Gabe Black; +Cc: kvm, nathan binkert, Steve Reinhardt

Gabe Black wrote:
>     Hi. I'm a developer on the M5 simulator (m5sim.org) working on a CPU
> model which uses kvm as its execution engine. 

Neat stuff.  You're using kvm to run non-x86 code on x86?

> I ran into a kernel "BUG"
> where a NULL pointer is being dereferenced in gfn_to_rmap.
>
>     What's happening on the kernel side is that gfn_to_rmap is calling
> gfn_to_memslot. That function looks for the gfn in the memory slots,
> fails to find it, and returns a NULL pointer. gfn_to_rmap then tries to
> dereference it, and the kernel kills itself. I believe the original
> source of the call to gfn_to_memslot was mmu_alloc_roots (in 2.6.28.9,
> it may have moved) which tries to get the page pointed to by CR3 using
> kvm_mmu_get_page. That part may not be correct, so here's the log output
> from the kernel.
>   

This was fixed by 89da4ff17 ("KVM: x86: check for cr3 validity in 
mmu_alloc_roots").  Did the code base you were testing contain that?

> The second problem was the fact that CR3 didn't point to any memory even
> though it had a valid value (0x7000). This was because our code relied
> on kvm_create to set up physical memory, and while it takes parameters
> for it and passes them around, it never actually seems to do anything
> with them. This also seems to be the case in your most recent code.
>
>   

You should set up the memory independently using the memory slot APIs, 
then load CR3.  kvm_create() has bitrotted a bit.
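
As a rough illustration of that ordering, the sketch below registers a
block of host memory as a guest memory slot through the raw
KVM_SET_USER_MEMORY_REGION ioctl, and provides a check that an address
such as the CR3 value falls inside the registered range. It assumes the
ioctl interface from <linux/kvm.h> rather than the libkvm wrappers; the
helper names add_memory_slot and gpa_is_backed are hypothetical, and
vm_fd is assumed to come from KVM_CREATE_VM on an open /dev/kvm.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Register `size` bytes of host memory at `host` as guest-physical
 * memory starting at `gpa`. Returns 0 on success and -1 on failure,
 * in which case errno is set by ioctl().
 */
int add_memory_slot(int vm_fd, uint32_t slot, uint64_t gpa,
                    void *host, uint64_t size)
{
    struct kvm_userspace_memory_region region;

    memset(&region, 0, sizeof(region));
    region.slot = slot;
    region.guest_phys_addr = gpa;
    region.memory_size = size;
    region.userspace_addr = (uint64_t)(uintptr_t)host;

    return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0 ? -1 : 0;
}

/* Nonzero when addr lies inside [gpa_base, gpa_base + size). */
int gpa_is_backed(uint64_t gpa_base, uint64_t size, uint64_t addr)
{
    return addr >= gpa_base && addr - gpa_base < size;
}
```

A caller would typically mmap() page-aligned anonymous memory, pass it
to add_memory_slot, confirm with gpa_is_backed that the intended CR3
value (0x7000 in the report) is covered, and only then load the special
registers into the vcpu.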

> I am a full time employee of VMware, and while I work on M5 on my own
> time, that places certain limits on what I can do to help fix these
> bugs. While I probably can't implement anything, I should be able to
> provide more information about what we're doing with M5 or about the
> crash if that would help.
>   

I appreciate the reports.  Please test latest kvm.git and let us know if 
the problems persist.

It would also be interesting to hear how you use kvm.

-- 
error compiling committee.c: too many arguments to function



* Re: NULL pointer dereference in kernel code, ignored parameters in libkvm
  2009-05-24 11:59 ` Avi Kivity
@ 2009-05-24 20:26   ` Gabe Black
  2009-05-25 11:55     ` Avi Kivity
  0 siblings, 1 reply; 4+ messages in thread
From: Gabe Black @ 2009-05-24 20:26 UTC
  To: Avi Kivity; +Cc: kvm, nathan binkert, Steve Reinhardt

Avi Kivity wrote:
> Gabe Black wrote:
>>     Hi. I'm a developer on the M5 simulator (m5sim.org) working on a CPU
>> model which uses kvm as its execution engine. 
>
> Neat stuff.  You're using kvm to run non-x86 code on x86?

Right now we're trying to run x86 on x86 but using our device models
instead of QEMU's. It would be interesting to use it for other ISAs, but
as far as I know we haven't considered that.

>
>> I ran into a kernel "BUG"
>> where a NULL pointer is being dereferenced in gfn_to_rmap.
>>
>>     What's happening on the kernel side is that gfn_to_rmap is calling
>> gfn_to_memslot. That function looks for the gfn in the memory slots,
>> fails to find it, and returns a NULL pointer. gfn_to_rmap then tries to
>> dereference it, and the kernel kills itself. I believe the original
>> source of the call to gfn_to_memslot was mmu_alloc_roots (in 2.6.28.9,
>> it may have moved) which tries to get the page pointed to by CR3 using
>> kvm_mmu_get_page. That part may not be correct, so here's the log output
>> from the kernel.
>>   
>
> This was fixed by 89da4ff17 ("KVM: x86: check for cr3 validity in
> mmu_alloc_roots").  Did the code base you were testing contain that?

It was the 2.6.28.9 kernel, and looking at the patch and lxr it appears not.

>
>> The second problem was the fact that CR3 didn't point to any memory even
>> though it had a valid value (0x7000). This was because our code relied
>> on kvm_create to set up physical memory, and while it takes parameters
>> for it and passes them around, it never actually seems to do anything
>> with them. This also seems to be the case in your most recent code.
>>
>>   
>
> You should set up the memory independently using the memory slot APIs,
> then load CR3.  kvm_create() has bitrotted a bit.

Will do.

>
>> I am a full time employee of VMware, and while I work on M5 on my own
>> time, that places certain limits on what I can do to help fix these
>> bugs. While I probably can't implement anything, I should be able to
>> provide more information about what we're doing with M5 or about the
>> crash if that would help.
>>   
>
> I appreciate the reports.  Please test latest kvm.git and let us know
> if the problems persist.

Will do.

>
> It would also be interesting to hear how you use kvm.
>

Ultimately, we'd like to use KVM for at least two things. The first is
as a way to fast forward simulations to the portion of interest before
switching into something slower that can collect interesting statistics
and accurately simulate performance. Our immediate goal on our way to
that is to get a CPU based around KVM to boot Linux while hooked into
our device models. We're very early in the process, but one challenge
I'm anticipating is being able to pull the local APIC out of the virtual
CPU and into M5 so that we can coordinate IPIs, etc., ourselves.

The other thing we'd like to do is to use KVM as a golden model to
verify our correctness against. To do that, we'll probably need to make
significant progress on the above, and then also find a mechanism to
make each CPU advance in very incremental and deterministic ways. Our
thought on that so far has been to set the TF bit in the guest and use
the #DB to exit back to the host. We expect there will be some gotchas
with this, like the hold-off on mov to %ss, but if there would be any
show-stopper problems please let us know.
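
The flag manipulation involved is small; a minimal sketch follows. TF
is bit 8 of RFLAGS on x86. The helper names are hypothetical, and the
sketch assumes the register value would be read and written back with
the KVM_GET_REGS and KVM_SET_REGS vcpu ioctls, which is not shown here.
It also does not address hiding the flag from the guest or the
mov-to-%ss case.

```c
#include <stdint.h>

/* x86 trap flag: bit 8 of RFLAGS. */
#define RFLAGS_TF (UINT64_C(1) << 8)

/* Return rflags with the trap flag set, requesting a #DB after one instruction. */
uint64_t rflags_set_tf(uint64_t rflags)
{
    return rflags | RFLAGS_TF;
}

/* Return rflags with the trap flag cleared, e.g. before showing state to the guest. */
uint64_t rflags_clear_tf(uint64_t rflags)
{
    return rflags & ~RFLAGS_TF;
}

/* Nonzero when the trap flag is set. */
int rflags_has_tf(uint64_t rflags)
{
    return (rflags & RFLAGS_TF) != 0;
}
```

Keeping the set and clear operations separate would let the host
remember whether the guest itself had TF set before stepping, which
matters if the guest ever single-steps its own code.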

Gabe


* Re: NULL pointer dereference in kernel code, ignored parameters in libkvm
  2009-05-24 20:26   ` Gabe Black
@ 2009-05-25 11:55     ` Avi Kivity
  0 siblings, 0 replies; 4+ messages in thread
From: Avi Kivity @ 2009-05-25 11:55 UTC
  To: Gabe Black; +Cc: kvm, nathan binkert, Steve Reinhardt

Gabe Black wrote:
>> It would also be interesting to hear how you use kvm.
>>
>>     
>
> Ultimately, we'd like to use KVM for at least two things. The first is
> as a way to fast forward simulations to the portion of interest before
> switching into something slower that can collect interesting statistics
> and accurately simulate performance. Our immediate goal on our way to
> that is to get a CPU based around KVM to boot Linux while hooked into
> our device models. We're very early in the process, but one challenge
> I'm anticipating is being able to pull the local APIC out of the virtual
> CPU and into M5 so that we can coordinate IPIs, etc., ourselves.
>   

Since we support live migration, it should be doable.  Tricky though.

> The other thing we'd like to do is to use KVM as a golden model to
> verify our correctness against. To do that, we'll probably need to make
> significant progress on the above, and then also find a mechanism to
> make each CPU advance in very incremental and deterministic ways. Our
> thought on that so far has been to set the TF bit in the guest and use
> the #DB to exit back to the host. We expect there will be some gotchas
> with this, like the hold-off on mov to %ss, but if there would be any
> show-stopper problems please let us know.
>   

It should work as long as the guest doesn't debug itself, I think.  May 
have problems with interrupts or NMIs.

-- 
error compiling committee.c: too many arguments to function


