* Shadow page table questions
@ 2010-03-10 4:57 Marek Olszewski
2010-03-10 9:47 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-10 4:57 UTC (permalink / raw)
To: kvm
Hello,
I was wondering if someone could point me to some documentation that
explains the basic non-nested-paging shadow page table
algorithm/strategy used by KVM. I understand that KVM caches shadow
page tables across context switches and that there is a reverse mapping
and page protection to help zap shadow page tables when the guest page
tables change. However, I'm not entirely sure how the actual caching is
done. At first I assumed that KVM would change the host CR3 on every
guest context switch such that it would point to a cached shadow page
table for the currently running guest user thread, however, as far as I
can tell, the host CR3 does not change so I'm a little lost. If indeed
it doesn't change the CR3, how does KVM solve the problem that arises
when two processes in the guest OS share the same guest logical addresses?
I'm also interested in figuring out what KVM does when running with
multiple virtual CPUs. Looking at the code, I can see that each VCPU
has its own root pointer to a shadow page table graph, but I have yet to
figure out if this graph has nodes shared between VCPUs, or whether
they are all private.
Any help would be greatly appreciated. Thanks!
Marek
* Re: Shadow page table questions
2010-03-10 4:57 Shadow page table questions Marek Olszewski
@ 2010-03-10 9:47 ` Avi Kivity
2010-03-11 0:06 ` Marek Olszewski
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-10 9:47 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm
On 03/10/2010 06:57 AM, Marek Olszewski wrote:
> Hello,
>
> I was wondering if someone could point me to some documentation that
> explains the basic non-nested-paging shadow page table
> algorithm/strategy used by KVM. I understand that KVM caches shadow
> page tables across context switches and that there is a reverse
> mapping and page protection to help zap shadow page tables when the
> guest page tables change. However, I'm not entirely sure how the
> actual caching is done. At first I assumed that KVM would change the
> host CR3 on every guest context switch such that it would point to a
> cached shadow page table for the currently running guest user thread,
> however, as far as I can tell, the host CR3 does not change so I'm a
> little lost. If indeed it doesn't change the CR3, how does KVM solve
> the problem that arises when two processes in the guest OS share the
> same guest logical addresses?
The host cr3 does change, though not by using the 'mov cr3' instruction
(that would cause the host to immediately switch to the guest address
space, which would be bad).
See the calls to kvm_x86_ops->set_cr3().
>
> I'm also interested in figuring out what KVM does when running with
> multiple virtual CPUs. Looking at the code, I can see that each VCPU
> has its own root pointer to a shadow page table graph, but I have yet
> to figure out if this graph has nodes shared between VCPUs, or
> whether they are all private.
Everything is shared. If the guest is running with identical cr3s, kvm
will load identical cr3s in guest mode.
An exception is when we use 32-bit pae mode. In that case, the guest
cr3s will be different (but guest PDPTRs will be identical). Instead of
dealing with the pae cr3, we deal with the four PDPTRs.
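The sharing described above can be pictured with a small toy model. This is only a sketch in Python, not kernel code: `ShadowMMU`, `VCpu`, `load_cr3` and `guest_writes_cr3` are hypothetical names, and the integer "root id" merely stands in for the per-VCPU root pointer. It shows one per-VM cache of shadow roots keyed by guest cr3, so two VCPUs running with the same guest cr3 end up on the same shadow root.

```python
# Toy model, not kernel code: every class and method name here is hypothetical.
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ShadowMMU:
    """Per-VM cache of shadow roots, shared by all VCPUs of that VM."""
    roots: dict = field(default_factory=dict)        # guest cr3 -> shadow root id
    _next_id: count = field(default_factory=count)   # hands out fresh root ids

    def load_cr3(self, guest_cr3: int) -> int:
        # Reuse the cached shadow root for this guest cr3, or create a new one.
        if guest_cr3 not in self.roots:
            self.roots[guest_cr3] = next(self._next_id)
        return self.roots[guest_cr3]


@dataclass
class VCpu:
    mmu: ShadowMMU
    root: int = -1   # stands in for the per-VCPU root pointer

    def guest_writes_cr3(self, guest_cr3: int) -> None:
        # A guest context switch selects the shadow root for the new guest cr3.
        self.root = self.mmu.load_cr3(guest_cr3)


mmu = ShadowMMU()
vcpu0, vcpu1 = VCpu(mmu), VCpu(mmu)

vcpu0.guest_writes_cr3(0x1000)
vcpu1.guest_writes_cr3(0x1000)          # same guest cr3 on both VCPUs
same_root = vcpu0.root == vcpu1.root    # both point at the shared shadow root

vcpu1.guest_writes_cr3(0x2000)          # a different guest address space
different_root = vcpu0.root != vcpu1.root
```

The model ignores the 32-bit PAE case mentioned above, where the four PDPTRs rather than cr3 would be the natural cache key.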
--
error compiling committee.c: too many arguments to function
* Re: Shadow page table questions
2010-03-10 9:47 ` Avi Kivity
@ 2010-03-11 0:06 ` Marek Olszewski
2010-03-11 6:39 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-11 0:06 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
Thanks for the response. I've looked through the code some more and
think I have figured it out now. I finally see that the root_hpa
variable gets switched before entering the guest in mmu_alloc_roots, to
correspond with the new cr3. Thanks again.
Perhaps you can help me with one more question. I was hoping to try out
a certain change for a research project. I would like to "privatize"
kvm_mmu_pages and their sptes for each guest thread running in certain
designated guest processes. The goal is to give each thread its own
shadow page table graphs that map the same guest logical addresses to
guest physical addresses (with some changes to be introduced later).
Are there any assumptions that KVM makes that will break if I do
something like this? I understand that I will have to add some code
throughout the mmu to make sure that these structures are synchronized
when a guest thread makes a change, but I'm wondering if there is
anything else. Does the reverse mapping data structure you have assume
that there is only one shadow page per guest page?
Thanks!
Marek
* Re: Shadow page table questions
2010-03-11 0:06 ` Marek Olszewski
@ 2010-03-11 6:39 ` Avi Kivity
2010-03-11 16:14 ` Marek Olszewski
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-11 6:39 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm
On 03/11/2010 02:06 AM, Marek Olszewski wrote:
> Thanks for the response. I've looked through the code some more and
> think I have figured it out now. I finally see that the root_hpa
> variable gets switched before entering the guest in mmu_alloc_roots,
> to correspond with the new cr3. Thanks again.
>
> Perhaps you can help me with one more question. I was hoping to try
> out a certain change for a research project. I would like to
> "privatize" kvm_mmu_pages and their sptes for each guest thread
> running in certain designated guest processes. The goal is to give
> each thread its own shadow page table graphs that map the same guest
> logical addresses to guest physical addresses (with some changes to be
> introduced later). Are there any assumptions that KVM makes that
> will break if I do something like this? I understand that I will have
> to add some code throughout the mmu to make sure that these structures
> are synchronized when a guest thread makes a change, but I'm wondering
> if there is anything else. Does the reverse mapping data structure
> you have assume that there is only one shadow page per guest page?
It doesn't, and there are often multiple shadow pages per guest page,
distinguished by their sp->role field.
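As a rough illustration of "multiple shadow pages per guest page", here is a toy Python model in which the cache key is the pair (gfn, role) rather than gfn alone. It is a sketch under assumptions: `Role`, `ShadowPageCache`, `get_page` and `pages_for_gfn` are hypothetical names, and the role is reduced to two fields (`level` and `direct`) although the real sp->role carries more.

```python
# Toy model, not kernel code: names and the reduced Role fields are assumptions.
from typing import NamedTuple


class Role(NamedTuple):
    level: int     # paging level at which the shadow page is used
    direct: bool   # True if it maps a linear range instead of a guest table


class ShadowPage:
    def __init__(self, gfn: int, role: Role):
        self.gfn = gfn
        self.role = role


class ShadowPageCache:
    def __init__(self):
        self.pages = {}   # (gfn, role) -> ShadowPage

    def get_page(self, gfn: int, role: Role) -> ShadowPage:
        # The lookup key includes the role, so one gfn can own several pages.
        key = (gfn, role)
        if key not in self.pages:
            self.pages[key] = ShadowPage(gfn, role)
        return self.pages[key]

    def pages_for_gfn(self, gfn: int) -> list:
        # Every shadow page that shadows this guest frame, whatever its role.
        return [sp for (g, _), sp in self.pages.items() if g == gfn]


cache = ShadowPageCache()
as_level2 = cache.get_page(0x42, Role(level=2, direct=False))
as_level1 = cache.get_page(0x42, Role(level=1, direct=False))
again = cache.get_page(0x42, Role(level=2, direct=False))
```

Here `as_level2` and `as_level1` are distinct objects for the same gfn, while `again` is the cached `as_level2`.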
--
error compiling committee.c: too many arguments to function
* Re: Shadow page table questions
2010-03-11 6:39 ` Avi Kivity
@ 2010-03-11 16:14 ` Marek Olszewski
2010-03-13 8:51 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-11 16:14 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
> It doesn't, and there are often multiple shadow pages per guest page,
> distinguished by their sp->role field.
Oh, great! Does this mean that there is already a mechanism for
synchronizing all shadow pages that shadow the same guest page when that
guest page changes?
Marek
* Re: Shadow page table questions
2010-03-11 16:14 ` Marek Olszewski
@ 2010-03-13 8:51 ` Avi Kivity
2010-03-18 23:50 ` KVM Page Fault Question Marek Olszewski
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-13 8:51 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm
On 03/11/2010 06:14 PM, Marek Olszewski wrote:
>> It doesn't, and there are often multiple shadow pages per guest page,
>> distinguished by their sp->role field.
> Oh, great! Does this mean that there is already a mechanism for
> synchronizing all shadow pages that shadow the same guest page when that
> guest page changes?
Yes, kvm_mmu_pte_write().
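The general idea of such a synchronization step can be sketched as follows. This is a simplified Python model and not the body of kvm_mmu_pte_write(): the function `pte_write`, the `ShadowPage` class, and the choice to simply clear the affected entry are all assumptions made for illustration, and 8-byte guest PTEs are assumed.

```python
# Toy model, not kernel code: a guest write to a shadowed page table page
# invalidates the matching entry in every shadow page that shadows that page.
PAGE_SHIFT = 12
PTE_SIZE = 8            # assumption: 64-bit guest page table entries
ENTRIES_PER_PAGE = 512


class ShadowPage:
    def __init__(self, gfn: int, role: str):
        self.gfn = gfn
        self.role = role
        self.sptes = [0] * ENTRIES_PER_PAGE


def pte_write(shadow_pages: list, gpa: int) -> list:
    """Clear the shadow entry derived from the guest PTE at guest-physical gpa."""
    gfn = gpa >> PAGE_SHIFT
    index = (gpa & ((1 << PAGE_SHIFT) - 1)) // PTE_SIZE
    touched = []
    for sp in shadow_pages:
        if sp.gfn == gfn:          # all roles shadowing this gfn are visited
            sp.sptes[index] = 0    # entry is rebuilt on the next fault
            touched.append(sp)
    return touched


# Two shadow pages for gfn 0x42 (different roles) and one for an unrelated gfn.
a = ShadowPage(0x42, "role-a")
b = ShadowPage(0x42, "role-b")
c = ShadowPage(0x43, "role-a")
for sp in (a, b, c):
    sp.sptes[3] = 0xDEAD

touched = pte_write([a, b, c], (0x42 << PAGE_SHIFT) + 3 * PTE_SIZE)
```

After the call, entry 3 is cleared in both `a` and `b`, while `c` is untouched.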
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* KVM Page Fault Question
2010-03-13 8:51 ` Avi Kivity
@ 2010-03-18 23:50 ` Marek Olszewski
2010-03-19 8:39 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-18 23:50 UTC (permalink / raw)
To: kvm
When using VMX without EPT, is it ever possible for a guest to receive a
page fault without it first appearing (and being reinjected) in KVM?
I'm seeing some strange behavior where accesses to mprotected (but not
yet accessed) memory cause a fault in the guest OS that I cannot see
KVM intercepting.
Thanks!
Marek
* Re: KVM Page Fault Question
2010-03-18 23:50 ` KVM Page Fault Question Marek Olszewski
@ 2010-03-19 8:39 ` Avi Kivity
2010-04-02 4:41 ` Marek Olszewski
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-19 8:39 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm
On 03/19/2010 01:50 AM, Marek Olszewski wrote:
> When using VMX without EPT, is it ever possible for a guest to receive
> a page fault without it first appearing (and being reinjected) in KVM?
Yes. On Intel hosts only, and controlled by bypass_guest_pf.
> I'm seeing some strange behavior where accesses to mprotected (but not
> yet accessed) memory cause a fault in the guest OS that I cannot
> see KVM intercepting.
>
Look for 'shadow_trap_nonpresent_pte' (which will trap into kvm) and
'shadow_notrap_nonpresent_pte' (which will not) in the code.
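One way to see why some guest page faults never reach the hypervisor is the VMX page-fault filter: when the exception bitmap intercepts #PF, a fault causes a VM exit only if (error code & mask) == match. The Python sketch below models that rule. The function name and the specific mask/match values (exit only on faults whose error code has the present bit set) are assumptions chosen to illustrate the bypass behaviour, not values quoted from the code.

```python
# Toy model of the VMX page-fault exit filter; the mask/match values used
# below are an assumption for illustration.
PFERR_PRESENT = 1 << 0   # error-code bit 0: the faulting entry was present
PFERR_WRITE = 1 << 1     # error-code bit 1: the access was a write


def page_fault_causes_vm_exit(pfec: int, mask: int, match: int,
                              intercept_pf: bool = True) -> bool:
    """With #PF intercepted, exit iff (pfec & mask) == match; else inverted."""
    matches = (pfec & mask) == match
    return matches if intercept_pf else not matches


# Assumed bypass setup: only faults with the present bit set exit to the host.
MASK = MATCH = PFERR_PRESENT

not_present_fault = 0                           # shadow entry is truly absent
protection_fault = PFERR_PRESENT | PFERR_WRITE  # e.g. write to a read-only entry

goes_straight_to_guest = not page_fault_causes_vm_exit(not_present_fault, MASK, MATCH)
seen_by_host = page_fault_causes_vm_exit(protection_fault, MASK, MATCH)
# With mask == match == 0 every page fault matches and exits to the host.
no_bypass = page_fault_causes_vm_exit(not_present_fault, 0, 0)
```

Under these assumptions, a non-present entry that should still trap has to be encoded so that the fault it produces carries the present bit, which is the distinction the two identifiers above draw.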
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: KVM Page Fault Question
2010-03-19 8:39 ` Avi Kivity
@ 2010-04-02 4:41 ` Marek Olszewski
2010-04-02 6:39 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-04-02 4:41 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
When a guest OS writes to a shadowed (and therefore page protected)
guest page table, does the resulting page fault get handled in
paging_tmpl.h:xxx_page_fault or does it call some rmap related code
directly? Also, what does the "direct" mmu page role mean?
Thanks!
Marek
* Re: KVM Page Fault Question
2010-04-02 4:41 ` Marek Olszewski
@ 2010-04-02 6:39 ` Avi Kivity
[not found] ` <4BB614BC.9080608@csail.mit.edu>
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-04-02 6:39 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm
On 04/02/2010 07:41 AM, Marek Olszewski wrote:
> When a guest OS writes to a shadowed (and therefore page protected)
> guest page table, does the resulting page fault get handled in
> paging_tmpl.h:xxx_page_fault or does it call some rmap related code
> directly?
Page faults are dispatched to the page_fault callback.
> Also, what does the "direct" mmu page role mean?
>
It means that the page maps the linear range
(gfn << 12)..((gfn + (1 << (level * 9))) << 12)
instead of shadowing a guest page table at gfn.
Useful for real mode, large pages, and tdp.
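To make the range formula concrete, here is a small Python sketch that evaluates it for a couple of levels. `direct_range` is a hypothetical helper name; the arithmetic is the formula quoted above, with 4 KiB pages (shift of 12) and 512 entries per table (9 bits per level).

```python
# Worked arithmetic for the linear range covered by a "direct" shadow page.
PAGE_SHIFT = 12      # 4 KiB pages
BITS_PER_LEVEL = 9   # 512 entries per page table


def direct_range(gfn: int, level: int) -> tuple:
    """Return (start, end) of the guest-physical range mapped at this level."""
    start = gfn << PAGE_SHIFT
    end = (gfn + (1 << (level * BITS_PER_LEVEL))) << PAGE_SHIFT
    return start, end


start1, end1 = direct_range(0x100, 1)   # level 1: 512 pages of 4 KiB
start2, end2 = direct_range(0x0, 2)     # level 2: 512 * 512 pages of 4 KiB

size1 = end1 - start1   # 2 MiB
size2 = end2 - start2   # 1 GiB
```

So a level-1 direct page starting at gfn 0x100 covers 0x100000..0x300000, which is the size of one large page.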
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: KVM Page Fault Question
[not found] ` <4BB614BC.9080608@csail.mit.edu>
@ 2010-04-04 16:59 ` Avi Kivity
2010-04-22 5:26 ` Marek Olszewski
0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-04-04 16:59 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm-devel
(re-adding list)
On 04/02/2010 07:01 PM, Marek Olszewski wrote:
> Thanks for the fast response.
>
> I'm trying to find the code that on a write to a guest page table
> entry, will iterate over all shadow page table entries that map that
> guest entry to update them. Can you point me to that code? I can't
> seem to find it myself :(
See kvm_mmu_pte_write().
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
* Re: KVM Page Fault Question
2010-04-04 16:59 ` Avi Kivity
@ 2010-04-22 5:26 ` Marek Olszewski
2010-04-22 6:52 ` Avi Kivity
0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-04-22 5:26 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm-devel
Under VMX without EPT, I am not seeing any VM exits due to task
switches. Is there a way to enable these? I'm looking to intercept the
guest whenever it does an iret.
Thanks!
Marek
* Re: KVM Page Fault Question
2010-04-22 5:26 ` Marek Olszewski
@ 2010-04-22 6:52 ` Avi Kivity
[not found] ` <4BD0DFBE.1090103@csail.mit.edu>
2010-05-20 2:24 ` Shadow MMU state preserved across kvm_mmu_zap_all? Marek Olszewski
0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-04-22 6:52 UTC (permalink / raw)
To: Marek Olszewski; +Cc: kvm-devel
On 04/22/2010 08:26 AM, Marek Olszewski wrote:
> Under VMX without EPT, I am not seeing any VM exits due to task
> switches. Is there a way to enable these? I'm looking to intercept
> the guest whenever it does an iret.
See EXIT_REASON_TASK_SWITCH. However, that won't fire on any iret, only
irets that generate task switches. You can ask for exits on irets by
setting CPU_BASED_VIRTUAL_NMI_PENDING and GUEST_INTR_STATE_NMI, and
looking for EXIT_REASON_NMI_WINDOW.
--
error compiling committee.c: too many arguments to function
* Re: KVM Page Fault Question
[not found] ` <4BD0DFBE.1090103@csail.mit.edu>
@ 2010-04-26 5:42 ` Marek Olszewski
0 siblings, 0 replies; 15+ messages in thread
From: Marek Olszewski @ 2010-04-26 5:42 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
Avi,
>
> I guess I only really care about intercepting ring 0 -> ring 3
> transitions in the guest. Is there an easier way of intercepting these?
Never mind about this. I figured out a solution to my problem that
didn't need to intercept these transitions.
Unfortunately, now I have a new problem. I'm getting a segfault in
gfn_to_rmap caused by gfn_to_memslot returning NULL. Would someone mind
explaining this code to me? I don't really understand what it is doing.
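For orientation, the failure mode described here can be reproduced in a toy Python model. This is a sketch under assumptions, not the kernel functions themselves: the names are borrowed from the message for readability, `MemSlot` is a hypothetical structure, and the model assumes a memory slot is a contiguous range of guest frames with one reverse-map head per frame. A gfn that lies in no slot yields no slot at all, and indexing a reverse map through that result is what would fault.

```python
# Toy model, not kernel code: shows why a gfn outside every memory slot
# leaves the reverse-map lookup with nothing to index.
from typing import NamedTuple, Optional


class MemSlot(NamedTuple):
    base_gfn: int   # first guest frame number covered by the slot
    npages: int     # number of frames in the slot
    rmap: list      # one reverse-map head per frame in the slot


def gfn_to_memslot(slots: list, gfn: int) -> Optional[MemSlot]:
    for slot in slots:
        if slot.base_gfn <= gfn < slot.base_gfn + slot.npages:
            return slot
    return None   # gfn is not backed by any slot


def gfn_to_rmap(slots: list, gfn: int) -> list:
    slot = gfn_to_memslot(slots, gfn)
    if slot is None:
        # The toy model raises here instead of dereferencing a missing slot.
        raise LookupError(f"gfn {gfn:#x} is not covered by any memory slot")
    return slot.rmap[gfn - slot.base_gfn]


slots = [MemSlot(base_gfn=0x0, npages=4, rmap=[[] for _ in range(4)])]
inside = gfn_to_rmap(slots, 0x2)          # valid: gfn lies inside the slot
outside = gfn_to_memslot(slots, 0x1000)   # no slot covers this gfn
```

In this model, a lookup for a gfn that was computed incorrectly (for example from a stale or mis-shifted address) is the typical way to end up with no slot.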
Also, does the current code assume that any guest page in any level can
be shadowed more than once, or are only certain levels allowed to be
shadowed multiple times?
Thank you!
Marek
* Shadow MMU state preserved across kvm_mmu_zap_all?
2010-04-22 6:52 ` Avi Kivity
[not found] ` <4BD0DFBE.1090103@csail.mit.edu>
@ 2010-05-20 2:24 ` Marek Olszewski
1 sibling, 0 replies; 15+ messages in thread
From: Marek Olszewski @ 2010-05-20 2:24 UTC (permalink / raw)
To: kvm-devel; +Cc: Avi Kivity
Hello,
I'm trying to track down a bug I'm observing in a branched version of
kvm I'm using for research. I'm hoping someone might be able to point
me in the right direction as I haven't had any luck with it on my
own. Here are the details:
I have made some changes to kvm that enable guest user applications to
use duplicate shadow pages to do interesting things (essentially I
duplicate the shadow page table tree for a process multiple times, once
for each thread). During my tests, my guest application enables this
new feature, completes correctly, and then disables it. Unfortunately,
after the test application completes, random programs begin segfaulting
for unknown reasons. This is despite the fact that my changes to KVM no
longer get executed (verified with kgdb). At first I thought that I
corrupted the shadow page tables somehow; however, calling
kvm_mmu_zap_all does not solve the problem. Thus, I figured I corrupted
the guest OS somehow, however, the problem persists even if I reboot the
guest OS.
So my question is this: Are there any other data structures that survive
both a call to kvm_mmu_zap_all and a guest reboot?
Thanks!
Marek