kvm.vger.kernel.org archive mirror
* Shadow page table questions
@ 2010-03-10  4:57 Marek Olszewski
  2010-03-10  9:47 ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-10  4:57 UTC (permalink / raw)
  To: kvm

Hello,

I was wondering if someone could point me to some documentation that 
explains the basic non-nested-paging shadow page table 
algorithm/strategy used by KVM.  I understand that KVM caches shadow 
page tables across context switches and that there is a reverse mapping 
and page protection to help zap shadow page tables when the guest page 
tables change.  However, I'm not entirely sure how the actual caching is 
done.  At first I assumed that KVM would change the host CR3 on every 
guest context switch such that it would point to a cached shadow page 
table for the currently running guest user thread, however, as far as I 
can tell, the host CR3 does not change so I'm a little lost.  If indeed 
it doesn't change the CR3, how does KVM solve the problem that arises 
when two processes in the guest OS share the same guest logical addresses?

I'm also interested in figuring out what KVM does when running with 
multiple virtual CPUs.  Looking at the code, I can see that each VCPU 
has its own root pointer to a shadow page table graph, but I have yet to 
figure out if this graph has nodes shared between VCPUs, or whether 
they are all private.

Any help would be greatly appreciated.  Thanks!

Marek


* Re: Shadow page table questions
  2010-03-10  4:57 Shadow page table questions Marek Olszewski
@ 2010-03-10  9:47 ` Avi Kivity
  2010-03-11  0:06   ` Marek Olszewski
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-10  9:47 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/10/2010 06:57 AM, Marek Olszewski wrote:
> Hello,
>
> I was wondering if someone could point me to some documentation that 
> explains the basic non-nested-paging shadow page table 
> algorithm/strategy used by KVM.  I understand that KVM caches shadow 
> page tables across context switches and that there is a reverse 
> mapping and page protection to help zap shadow page tables when the 
> guest page tables change.  However, I'm not entirely sure how the 
> actual caching is done.  At first I assumed that KVM would change the 
> host CR3 on every guest context switch such that it would point to a 
> cached shadow page table for the currently running guest user thread, 
> however, as far as I can tell, the host CR3 does not change so I'm a 
> little lost.  If indeed it doesn't change the CR3, how does KVM solve 
> the problem that arises when two processes in the guest OS share the 
> same guest logical addresses?

The host cr3 does change, though not by using the 'mov cr3' instruction 
(that would cause the host to immediately switch to the guest address 
space, which would be bad).

See the calls to kvm_x86_ops->set_cr3().
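
As a rough illustration of the caching idea discussed above, the sketch below models a cache of shadow roots keyed by the guest's cr3 value. When the guest switches address spaces, the hypervisor looks up (or creates) the matching shadow root and records it as the value the hardware will load on guest entry, rather than executing 'mov cr3' on the host. All names here (toy_mmu, toy_root, toy_set_guest_cr3, hw_guest_cr3) are hypothetical and the layout is not KVM's; in KVM the shadow pages are shared per VM rather than held in a small fixed array.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ROOTS 4

/* One cached shadow root per guest cr3 value (toy layout, not KVM's). */
struct toy_root {
    uint64_t guest_cr3; /* guest-physical address of the guest's top-level table */
    uint64_t root_hpa;  /* host-physical address of the matching shadow root */
    int valid;
};

struct toy_mmu {
    struct toy_root cache[TOY_MAX_ROOTS];
    size_t next_victim;     /* round-robin eviction cursor */
    uint64_t next_free_hpa; /* stand-in for a page allocator */
    uint64_t hw_guest_cr3;  /* what the CPU would load into cr3 on guest entry */
};

static void toy_mmu_init(struct toy_mmu *m)
{
    size_t i;

    for (i = 0; i < TOY_MAX_ROOTS; i++)
        m->cache[i].valid = 0;
    m->next_victim = 0;
    m->next_free_hpa = 0x100000;
    m->hw_guest_cr3 = 0;
}

/* Called when the guest writes cr3: reuse a cached shadow root or build one. */
static uint64_t toy_set_guest_cr3(struct toy_mmu *m, uint64_t guest_cr3)
{
    struct toy_root *r;
    size_t i;

    for (i = 0; i < TOY_MAX_ROOTS; i++) {
        r = &m->cache[i];
        if (r->valid && r->guest_cr3 == guest_cr3) {
            m->hw_guest_cr3 = r->root_hpa; /* cache hit: no rebuild needed */
            return r->root_hpa;
        }
    }

    /* Miss: recycle a slot and hand out a fresh (empty) shadow root. */
    r = &m->cache[m->next_victim];
    m->next_victim = (m->next_victim + 1) % TOY_MAX_ROOTS;
    r->guest_cr3 = guest_cr3;
    r->root_hpa = m->next_free_hpa;
    r->valid = 1;
    m->next_free_hpa += 0x1000;

    m->hw_guest_cr3 = r->root_hpa;
    return r->root_hpa;
}
```

Two guest processes that use the same virtual addresses have different guest cr3 values, so they end up with different shadow roots, and switching back to an earlier process finds its root still cached.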

>
> I'm also interested in figuring out what KVM does when running with 
> multiple virtual CPUs.  Looking at the code, I can see that each VCPU 
> has its own root pointer to a shadow page table graph, but I have yet 
> to figure out if this graph has nodes shared between VCPUs, or 
> whether they are all private.

Everything is shared.  If the guest is running with identical cr3s, kvm 
will load identical cr3s in guest mode.

An exception is when we use 32-bit pae mode.  In that case, the guest 
cr3s will be different (but guest PDPTRs will be identical).  Instead of 
dealing with the pae cr3, we deal with the four PDPTRs.

-- 
error compiling committee.c: too many arguments to function



* Re: Shadow page table questions
  2010-03-10  9:47 ` Avi Kivity
@ 2010-03-11  0:06   ` Marek Olszewski
  2010-03-11  6:39     ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-11  0:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Thanks for the response.  I've looked through the code some more and 
think I have figured it out now.   I finally see that the root_hpa 
variable gets switched before entering the guest in mmu_alloc_roots, to 
correspond with the new cr3.  Thanks again.

Perhaps you can help me with one more question.  I was hoping to try out 
a certain change for a research project.   I would like to "privatize" 
kvm_mmu_pages and their sptes for each guest thread running in certain 
designated guest processes.  The goal is to give each thread its own 
shadow page table graphs that map the same guest logical addresses to 
guest physical addresses (with some changes to be introduced later).   
Are there any assumptions that KVM makes that will break if I do 
something like this?  I understand that I will have to add some code 
throughout the mmu to make sure that these structures are synchronized 
when a guest thread makes a change, but I'm wondering if there is 
anything else.  Does the reverse mapping data structure you have assume 
that there is only one shadow page per guest page?

Thanks!

Marek




* Re: Shadow page table questions
  2010-03-11  0:06   ` Marek Olszewski
@ 2010-03-11  6:39     ` Avi Kivity
  2010-03-11 16:14       ` Marek Olszewski
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-11  6:39 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/11/2010 02:06 AM, Marek Olszewski wrote:
> Thanks for the response.  I've looked through the code some more and 
> think I have figured it out now.   I finally see that the root_hpa 
> variable gets switched before entering the guest in mmu_alloc_roots, 
> to correspond with the new cr3.  Thanks again.
>
> Perhaps you can help me with one more question.  I was hoping to try 
> out a certain change for a research project.   I would like to 
> "privatize" kvm_mmu_pages and their sptes for each guest thread 
> running in certain designated guest processes.  The goal is to give 
> each thread its own shadow page table graphs that map the same guest 
> logical addresses to guest physical addresses (with some changes to be 
> introduced later).   Are there any assumptions that KVM makes that 
> will break if I do something like this?  I understand that I will have 
> to add some code throughout the mmu to make sure that these structures 
> are synchronized when a guest thread makes a change, but I'm wondering 
> if there is anything else.  Does the reverse mapping data structure 
> you have assume that there is only one shadow page per guest page?

It doesn't, and there are often multiple shadow pages per guest page, 
distinguished by their sp->role field.
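
A minimal sketch of what "distinguished by role" can look like: shadow pages are looked up by the pair (gfn, role), so the same guest frame can back several shadow pages at once. The structure and helper names (toy_sp, toy_make_role, toy_get_page) and the bit layout of the role word are assumptions made for illustration, not KVM's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy shadow page descriptor: identity is the pair (gfn, role). */
struct toy_sp {
    uint64_t gfn;  /* guest frame being shadowed */
    uint32_t role; /* packed role bits, see toy_make_role() */
    int in_use;
};

/* Pack a paging level and a "direct" flag into one comparable word. */
static uint32_t toy_make_role(unsigned level, unsigned direct)
{
    return (level & 0xfu) | ((direct & 1u) << 4);
}

/*
 * Return the shadow page for (gfn, role), allocating one from the pool
 * on a miss.  Returns NULL when the pool is exhausted.
 */
static struct toy_sp *toy_get_page(struct toy_sp *pool, size_t n,
                                   uint64_t gfn, uint32_t role)
{
    struct toy_sp *free_slot = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (pool[i].in_use) {
            /* Both fields must match: same gfn with another role is a miss. */
            if (pool[i].gfn == gfn && pool[i].role == role)
                return &pool[i];
        } else if (!free_slot) {
            free_slot = &pool[i];
        }
    }

    if (free_slot) {
        free_slot->gfn = gfn;
        free_slot->role = role;
        free_slot->in_use = 1;
    }
    return free_slot;
}
```

Asking for gfn 0x42 at level 1 and again at level 2 yields two distinct shadow pages, while repeating the level 1 request returns the first one.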

-- 
error compiling committee.c: too many arguments to function



* Re: Shadow page table questions
  2010-03-11  6:39     ` Avi Kivity
@ 2010-03-11 16:14       ` Marek Olszewski
  2010-03-13  8:51         ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-11 16:14 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

> It doesn't, and there are often multiple shadow pages per guest page, 
> distinguished by their sp->role field. 
Oh, great!  Does this mean that there is already a mechanism for 
synchronizing all shadow pages shadowing the same guest page when that 
guest page changes?

Marek





* Re: Shadow page table questions
  2010-03-11 16:14       ` Marek Olszewski
@ 2010-03-13  8:51         ` Avi Kivity
  2010-03-18 23:50           ` KVM Page Fault Question Marek Olszewski
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-13  8:51 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/11/2010 06:14 PM, Marek Olszewski wrote:
>> It doesn't, and there are often multiple shadow pages per guest page, 
>> distinguished by their sp->role field. 
> Oh, great!  Does this mean that there is already a mechanism for 
> synchronizing all shadow pages shadowing the same guest page when that 
> guest page changes?

Yes, kvm_mmu_pte_write().
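
The general shape of such a write path can be sketched as follows: when the guest writes entry `index` of the page table at `gfn`, every shadow page that shadows that gfn (whatever its role) has the corresponding shadow entry dropped so that it is rebuilt on the next fault. This is a simplified model under my own assumptions; toy_shadow_page and toy_pte_write are hypothetical names, and the real kvm_mmu_pte_write() does considerably more.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTES 4 /* entries per shadow page in this toy model */

struct toy_shadow_page {
    uint64_t gfn;            /* guest page table frame being shadowed */
    uint32_t role;           /* distinguishes multiple shadows of one gfn */
    uint64_t spte[TOY_PTES]; /* shadow entries, 0 means not present */
};

/*
 * Guest wrote entry 'index' of the page table at 'gfn'.  Zap the matching
 * entry in every shadow page for that gfn and report how many were touched.
 */
static size_t toy_pte_write(struct toy_shadow_page *pages, size_t n,
                            uint64_t gfn, size_t index)
{
    size_t touched = 0;
    size_t i;

    assert(index < TOY_PTES);

    for (i = 0; i < n; i++) {
        if (pages[i].gfn != gfn)
            continue;            /* shadows some other guest page */
        pages[i].spte[index] = 0; /* drop the stale translation */
        touched++;
    }
    return touched;
}
```

With two shadow pages for gfn 0x42 and one for another gfn, a write to entry 2 of gfn 0x42 clears entry 2 in both matching pages and leaves everything else alone.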

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* KVM Page Fault Question
  2010-03-13  8:51         ` Avi Kivity
@ 2010-03-18 23:50           ` Marek Olszewski
  2010-03-19  8:39             ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-03-18 23:50 UTC (permalink / raw)
  To: kvm

When using VMX without EPT, is it ever possible for a guest to receive a 
page fault without it first appearing (and being reinjected) in KVM?  
I'm seeing some strange behavior where accesses to mprotected (but not yet 
accessed) memory cause a fault in the guest OS that I cannot see 
KVM intercepting.

Thanks!

Marek


* Re: KVM Page Fault Question
  2010-03-18 23:50           ` KVM Page Fault Question Marek Olszewski
@ 2010-03-19  8:39             ` Avi Kivity
  2010-04-02  4:41               ` Marek Olszewski
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-03-19  8:39 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/19/2010 01:50 AM, Marek Olszewski wrote:
> When using VMX without EPT, is it ever possible for a guest to receive 
> a page fault without it first appearing (and being reinjected) in KVM? 

Yes.  On Intel hosts only, and controlled by bypass_guest_pf.

> I'm seeing some strange behavior where accesses to mprotected (but not yet 
> accessed) memory cause a fault in the guest OS that I cannot 
> see KVM intercepting.
>

Look for 'shadow_trap_nonpresent_pte' (which will trap into kvm) and 
'shadow_notrap_nonpresent_pte' (which will not) in the code.
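
For background on why this is Intel-only: VMX lets the hypervisor filter page-fault exits by error code. The processor compares (PFEC & PFEC_MASK) with PFEC_MATCH; on equality it follows bit 14 of the exception bitmap, otherwise it does the opposite. The sketch below encodes just that rule. The function name is hypothetical, and the remark that follows about how KVM programs the fields when bypass_guest_pf is enabled is my assumption, not something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PFERR_PRESENT (1u << 0) /* P bit of the page-fault error code */

/*
 * VMX page-fault filtering rule: if (pfec & mask) == match, bit 14 of the
 * exception bitmap decides; otherwise the meaning of that bit is inverted.
 */
static bool toy_pf_causes_vmexit(uint32_t pfec, uint32_t mask,
                                 uint32_t match, bool bitmap_bit14)
{
    bool equal = (pfec & mask) == match;

    return equal ? bitmap_bit14 : !bitmap_bit14;
}
```

Assuming mask and match are both set to the present bit with bit 14 set, a fault on a truly not-present shadow entry (P = 0) goes straight to the guest, while a fault whose error code has P = 1 exits to the hypervisor. That would fit the trap/notrap distinction above if the "trap" value is encoded so that it faults with P = 1.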

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: KVM Page Fault Question
  2010-03-19  8:39             ` Avi Kivity
@ 2010-04-02  4:41               ` Marek Olszewski
  2010-04-02  6:39                 ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-04-02  4:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

When a guest OS writes to a shadowed (and therefore page protected) 
guest page table, does the resulting page fault get handled in 
paging_tmpl.h:xxx_page_fault or does it call some rmap related code 
directly?  Also, what does the "direct" mmu page role mean?

Thanks!

Marek




* Re: KVM Page Fault Question
  2010-04-02  4:41               ` Marek Olszewski
@ 2010-04-02  6:39                 ` Avi Kivity
       [not found]                   ` <4BB614BC.9080608@csail.mit.edu>
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-04-02  6:39 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 04/02/2010 07:41 AM, Marek Olszewski wrote:
> When a guest OS writes to a shadowed (and therefore page protected) 
> guest page table, does the resulting page fault get handled in 
> paging_tmpl.h:xxx_page_fault or does it call some rmap related code 
> directly? 

Page faults are dispatched to the page_fault callback.

> Also, what does the "direct" mmu page role mean?
>

It means that the page maps the linear range
(gfn << 12) .. ((gfn + (1 << (level * 9))) << 12)
instead of shadowing a guest page table at gfn.  Useful for real mode, 
large pages, and tdp.
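
A small worked version of the range formula, assuming 4 KiB pages (shift of 12) and 9 index bits per paging level as in the expression above. The type and function names (toy_linear_range, toy_direct_range) are made up for the example.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12 /* 4 KiB pages */
#define TOY_LEVEL_BITS 9  /* 512 entries per table */

struct toy_linear_range {
    uint64_t start; /* first byte covered */
    uint64_t end;   /* one past the last byte covered */
};

/* Range mapped by a direct shadow page at 'gfn' for the given level. */
static struct toy_linear_range toy_direct_range(uint64_t gfn, unsigned level)
{
    struct toy_linear_range r;
    uint64_t npages = 1ull << (level * TOY_LEVEL_BITS);

    r.start = gfn << TOY_PAGE_SHIFT;
    r.end = (gfn + npages) << TOY_PAGE_SHIFT;
    return r;
}
```

For gfn 0x100 at level 1 this gives 0x100000 .. 0x300000, a 2 MiB span (512 pages); at level 2 starting from gfn 0 the span is 1 GiB.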

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: KVM Page Fault Question
       [not found]                   ` <4BB614BC.9080608@csail.mit.edu>
@ 2010-04-04 16:59                     ` Avi Kivity
  2010-04-22  5:26                       ` Marek Olszewski
  0 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2010-04-04 16:59 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm-devel

(re-adding list)


On 04/02/2010 07:01 PM, Marek Olszewski wrote:
> Thanks for the fast response.
>
> I'm trying to find the code that on a write to a guest page table 
> entry, will iterate over all shadow page table entries that map that 
> guest entry to update them.  Can you point me to that code?  I can't 
> seem to find it myself :(

See kvm_mmu_pte_write().

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.



* Re: KVM Page Fault Question
  2010-04-04 16:59                     ` Avi Kivity
@ 2010-04-22  5:26                       ` Marek Olszewski
  2010-04-22  6:52                         ` Avi Kivity
  0 siblings, 1 reply; 15+ messages in thread
From: Marek Olszewski @ 2010-04-22  5:26 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel

Under VMX without EPT, I am not seeing any VM exits due to task 
switches.  Is there a way to enable these?  I'm looking to intercept the 
guest whenever it does an iret.

Thanks!

Marek




* Re: KVM Page Fault Question
  2010-04-22  5:26                       ` Marek Olszewski
@ 2010-04-22  6:52                         ` Avi Kivity
       [not found]                           ` <4BD0DFBE.1090103@csail.mit.edu>
  2010-05-20  2:24                           ` Shadow MMU state preserved across kvm_mmu_zap_all? Marek Olszewski
  0 siblings, 2 replies; 15+ messages in thread
From: Avi Kivity @ 2010-04-22  6:52 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm-devel

On 04/22/2010 08:26 AM, Marek Olszewski wrote:
> Under VMX without EPT, I am not seeing any VM exits due to task 
> switches.  Is there a way to enable these?  I'm looking to intercept 
> the guest whenever it does an iret.

See EXIT_REASON_TASK_SWITCH.  However, that won't fire on any iret, only 
irets that generate task switches.  You can ask for exits on irets by 
setting CPU_BASED_VIRTUAL_NMI_PENDING and GUEST_INTR_STATE_NMI, and 
looking for EXIT_REASON_NMI_WINDOW.

-- 
error compiling committee.c: too many arguments to function



* Re: KVM Page Fault Question
       [not found]                           ` <4BD0DFBE.1090103@csail.mit.edu>
@ 2010-04-26  5:42                             ` Marek Olszewski
  0 siblings, 0 replies; 15+ messages in thread
From: Marek Olszewski @ 2010-04-26  5:42 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi,
>
> I guess I only really care about intercepting ring 0 -> ring 3 
> transitions in the guest.  Is there an easier way of intercepting these?
Never mind about this.  I figured out a solution to my problem that 
didn't need to intercept these transitions.

Unfortunately, now I have a new problem.  I'm getting a segfault in 
gfn_to_rmap caused by gfn_to_memslot returning NULL.  Would someone mind 
explaining this code to me?  I don't really understand what it is doing.
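
For readers following along, the general idea behind these two helpers can be sketched like this: guest memory is described by a set of slots, each covering a contiguous run of guest frame numbers, and the reverse-map entry for a gfn lives in a per-slot array indexed by the offset into the slot. A gfn that falls in no slot yields NULL, which is the case that has to be handled before indexing. The toy_ names and the exact field layout are assumptions; this is not a copy of the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy memory slot: 'npages' guest frames starting at 'base_gfn'. */
struct toy_memslot {
    uint64_t base_gfn;
    uint64_t npages;
    unsigned long *rmap; /* one reverse-map head per page in the slot */
};

/* Find the slot containing 'gfn', or NULL if no slot covers it. */
static struct toy_memslot *toy_gfn_to_memslot(struct toy_memslot *slots,
                                              size_t n, uint64_t gfn)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (gfn >= slots[i].base_gfn &&
            gfn < slots[i].base_gfn + slots[i].npages)
            return &slots[i];
    }
    return NULL;
}

/* Reverse-map head for 'gfn'; NULL when the gfn is not backed by a slot. */
static unsigned long *toy_gfn_to_rmap(struct toy_memslot *slots, size_t n,
                                      uint64_t gfn)
{
    struct toy_memslot *slot = toy_gfn_to_memslot(slots, n, gfn);

    if (!slot)
        return NULL; /* without this check the index below would crash */
    return &slot->rmap[gfn - slot->base_gfn];
}
```

In this model a NULL slot simply means the gfn passed in does not correspond to guest RAM, so a crash at that point usually indicates a bogus gfn reaching the rmap code.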

Also, does the current code assume that any guest page in any level can 
be shadowed more than once, or are only certain levels allowed to be 
shadowed multiple times?

Thank you!

Marek



* Shadow MMU state preserved across kvm_mmu_zap_all?
  2010-04-22  6:52                         ` Avi Kivity
       [not found]                           ` <4BD0DFBE.1090103@csail.mit.edu>
@ 2010-05-20  2:24                           ` Marek Olszewski
  1 sibling, 0 replies; 15+ messages in thread
From: Marek Olszewski @ 2010-05-20  2:24 UTC (permalink / raw)
  To: kvm-devel; +Cc: Avi Kivity

Hello,

I'm trying to track down a bug I'm observing in a branched version of 
kvm I'm using for research.  I'm hoping someone might be able to point 
me in the right direction, as I haven't had any luck with it on my 
own.  Here are the details:

I have made some changes to kvm that enable guest user applications to 
use duplicate shadow pages to do interesting things (essentially I 
duplicate the shadow page table tree for a process multiple times, once 
for each thread).  During my tests, my guest application enables this 
new feature, completes correctly, and then disables it.  Unfortunately, 
after the test application completes, random programs begin segfaulting 
for unknown reasons.  This is despite the fact that my changes to KVM no 
longer get executed (verified with kgdb).  At first I thought that I had 
corrupted the shadow page tables somehow; however, calling 
kvm_mmu_zap_all does not solve the problem.  I then figured I had corrupted 
the guest OS somehow, but the problem persists even if I reboot the 
guest OS.  

So my question is this: Are there any other data structures that survive 
both a call to kvm_mmu_zap_all and a guest reboot?

Thanks!

Marek

