* Shadow page table questions
@ 2006-04-19 10:35 Arne Mejlholm
  0 siblings, 0 replies; 7+ messages in thread
From: Arne Mejlholm @ 2006-04-19 10:35 UTC (permalink / raw)
  To: xen-devel

Dear All,

I'm trying to understand some of the details of the shadow page table
implementation. In particular I'm interested in the inner workings of
translated mode.

Now as I understand it, the principle behind Shadow Page Tables (SPTs)
is to provide a level of indirection between virtual and machine
addresses. In Xen this is implemented by a P2M table for each domain (at
least in translated mode this address space is reused). Each entry in
this table is indexed by a "physical" address and points to a given
machine frame number (mfn). As far as I can tell from the code, this
structure is walked much like a normal page table. In translated mode,
however, the guest VM cannot make use of the two-level lookup (first
from virtual to physical and then from physical to machine addresses),
so a hardware-specific version of the two mappings (the SPT) is
produced and given to the MMU instead of the ordinary page table.
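
To make the composition concrete, here is a minimal sketch, assuming
single-level tables modelled as dictionaries of page numbers. The names
build_shadow, guest_pt and p2m are hypothetical and do not come from the
Xen source; real tables are multi-level and carry flag bits.

```python
# Toy model: each "table" is a dict from page number to page number.
# build_shadow, guest_pt and p2m are illustrative names, not Xen identifiers.

def build_shadow(guest_pt, p2m):
    """Compose virtual->physical (guest_pt) with physical->machine (p2m)."""
    shadow = {}
    for vpn, pfn in guest_pt.items():
        mfn = p2m.get(pfn)
        if mfn is not None:        # frames without a P2M entry stay not-present
            shadow[vpn] = mfn
    return shadow

guest_pt = {0x10: 0x2, 0x11: 0x3, 0x12: 0x9}   # maintained by the guest
p2m = {0x2: 0x7A, 0x3: 0x15}                   # maintained by the hypervisor

shadow = build_shadow(guest_pt, p2m)           # what the MMU would walk
print({hex(v): hex(m) for v, m in shadow.items()})
```

The MMU then only ever sees the virtual-to-machine result, which is why
the shadow has to be rebuilt or patched whenever either input changes.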

My first question is: have the status bits of each entry in the P2M
table changed semantics (compared to a normal l1 entry), or do they
signify the same flags? If the read/write flag is not set (read only),
will this be propagated to the SPT seen by the MMU?

Normally, producing an SPT each time the cr3 register is updated (as
required on every context switch) is considered to be an overhead. One
solution to this is to make use of a cache, where references to old
SPTs are kept. This in turn incurs overhead in terms of memory
usage and of tracking whether a cached SPT is still valid.
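
The kind of cache I have in mind could look roughly like the sketch
below. It is a toy model under my own assumptions (a dictionary keyed by
the guest's cr3 value, a hypothetical ShadowCache class, and a caller
that invalidates explicitly), not a description of what Xen does.

```python
class ShadowCache:
    """Hypothetical cache of shadow tables, keyed by guest cr3 value."""

    def __init__(self, build):
        self.build = build     # callback that constructs a shadow table
        self.tables = {}       # guest cr3 -> cached shadow table
        self.builds = 0        # counts how often we paid the build cost

    def switch_to(self, guest_cr3):
        # Reuse a cached shadow if one exists, otherwise build and remember it.
        if guest_cr3 not in self.tables:
            self.tables[guest_cr3] = self.build(guest_cr3)
            self.builds += 1
        return self.tables[guest_cr3]

    def invalidate(self, guest_cr3):
        # Called when the guest's page tables under this cr3 have changed.
        self.tables.pop(guest_cr3, None)


cache = ShadowCache(lambda cr3: {"shadow_of": cr3})
first = cache.switch_to(0x1000)
cache.switch_to(0x2000)
again = cache.switch_to(0x1000)   # served from the cache, no rebuild
```

The memory cost is the dictionary itself, and the validity-tracking cost
is whatever it takes to know when invalidate() must be called.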

My second question is thus: does the implementation make use of an SPT
cache, or are SPTs produced on demand? If a cache is used, can anyone
give me pointers to where I can find this structure?

The implementation also makes use of snapshots. How do these fit into
the big picture?

Finally, some of the terminology used for the implementation seems a
bit unintuitive to me. Throughout the source, there is the concept of
an hl2 table. As far as I can tell, an l2 table is the PGD in Linux
kernel terminology and an l1 table is the table pointed to by an entry
in the l2 table (PT in Linux). My first guess was that the hl2 table
was the table actually pointed to by the cr3 register, but I cannot
seem to confirm this. What does the h in hl2 stand for, and what is the
table used for?

Thank you,
Arne Mejlholm


* Shadow page table questions
@ 2010-03-10  4:57 Marek Olszewski
  2010-03-10  9:47 ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marek Olszewski @ 2010-03-10  4:57 UTC (permalink / raw)
  To: kvm

Hello,

I was wondering if someone could point me to some documentation that 
explains the basic non-nested-paging shadow page table 
algorithm/strategy used by KVM.  I understand that KVM caches shadow 
page tables across context switches and that there is a reverse mapping 
and page protection to help zap shadow page tables when the guest page 
tables change.  However, I'm not entirely sure how the actual caching is
done.  At first I assumed that KVM would change the host CR3 on every
guest context switch such that it would point to a cached shadow page
table for the currently running guest user thread.  However, as far as I
can tell, the host CR3 does not change, so I'm a little lost.  If indeed
it doesn't change the CR3, how does KVM solve the problem that arises
when two processes in the guest OS share the same guest logical addresses?

I'm also interested in figuring out what KVM does when running with 
multiple virtual CPUs.  Looking at the code, I can see that each VCPU 
has its own root pointer to a shadow page table graph, but I have yet to 
figure out if this graph has nodes shared between VCPUs, or whether
they are all private.

Any help would be greatly appreciated.  Thanks!

Marek


* Re: Shadow page table questions
  2010-03-10  4:57 Marek Olszewski
@ 2010-03-10  9:47 ` Avi Kivity
  2010-03-11  0:06   ` Marek Olszewski
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2010-03-10  9:47 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/10/2010 06:57 AM, Marek Olszewski wrote:
> Hello,
>
> I was wondering if someone could point me to some documentation that 
> explains the basic non-nested-paging shadow page table 
> algorithm/strategy used by KVM.  I understand that KVM caches shadow 
> page tables across context switches and that there is a reverse 
> mapping and page protection to help zap shadow page tables when the 
> guest page tables change.  However, I'm not entirely sure how the 
> actual caching is done.  At first I assumed that KVM would change the 
> host CR3 on every guest context switch such that it would point to a 
> cached shadow page table for the currently running guest user thread, 
> however, as far as I can tell, the host CR3 does not change so I'm a 
> little lost.  If indeed it doesn't change the CR3, how does KVM solve 
> the problem that arises when two processes in the guest OS share the 
> same guest logical addresses?

The host cr3 does change, though not by using the 'mov cr3' instruction 
(that would cause the host to immediately switch to the guest address 
space, which would be bad).

See the calls to kvm_x86_ops->set_cr3().

>
> I'm also interested in figuring out what KVM does when running with 
> multiple virtual CPUs.  Looking at the code, I can see that each VCPU 
> has its own root pointer to a shadow page table graph, but I have yet 
> to figure out if this graph has nodes shared between VCPUs, or
> whether they are all private.

Everything is shared.  If the guest is running with identical cr3s, kvm 
will load identical cr3s in guest mode.

An exception is when we use 32-bit pae mode.  In that case, the guest 
cr3s will be different (but guest PDPTRs will be identical).  Instead of 
dealing with the pae cr3, we deal with the four PDPTRs.
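
A toy model of the sharing, assuming a per-VM table of roots and a
per-VCPU pointer into it (the VM and VCPU classes and their fields are
made up for illustration and are not kvm structures; the pae case is
left out):

```python
class VM:
    def __init__(self):
        self.roots = {}    # guest cr3 -> shadow root, shared by all VCPUs

    def root_for(self, guest_cr3):
        # Create the root on first use; later callers get the same object.
        if guest_cr3 not in self.roots:
            self.roots[guest_cr3] = {"guest_cr3": guest_cr3, "entries": {}}
        return self.roots[guest_cr3]


class VCPU:
    def __init__(self, vm):
        self.vm = vm
        self.root = None   # per-VCPU pointer into the shared graph

    def load_cr3(self, guest_cr3):
        self.root = self.vm.root_for(guest_cr3)


vm = VM()
vcpu0, vcpu1 = VCPU(vm), VCPU(vm)
vcpu0.load_cr3(0x3000)
vcpu1.load_cr3(0x3000)
print(vcpu0.root is vcpu1.root)   # prints True: same root, not a copy
```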

-- 
error compiling committee.c: too many arguments to function



* Re: Shadow page table questions
  2010-03-10  9:47 ` Avi Kivity
@ 2010-03-11  0:06   ` Marek Olszewski
  2010-03-11  6:39     ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marek Olszewski @ 2010-03-11  0:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Thanks for the response.  I've looked through the code some more and 
think I have figured it out now.   I finally see that the root_hpa 
variable gets switched before entering the guest in mmu_alloc_roots, to 
correspond with the new cr3.  Thanks again.

Perhaps you can help me with one more question.  I was hoping to try out 
a certain change for a research project.   I would like to "privatize" 
kvm_mmu_pages and their sptes for each guest thread running in certain
designated guest processes.  The goal is to give each thread its own 
shadow page table graphs that map the same guest logical addresses to 
guest physical addresses (with some changes to be introduced later).   
Are there any assumptions that KVM makes that will break if I do 
something like this?  I understand that I will have to add some code 
throughout the mmu to make sure that these structures are synchronized 
when a guest thread makes a change, but I'm wondering if there is 
anything else.  Does the reverse mapping data structure you have assume 
that there is only one shadow page per guest page?

Thanks!

Marek





* Re: Shadow page table questions
  2010-03-11  0:06   ` Marek Olszewski
@ 2010-03-11  6:39     ` Avi Kivity
  2010-03-11 16:14       ` Marek Olszewski
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2010-03-11  6:39 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/11/2010 02:06 AM, Marek Olszewski wrote:
> Thanks for the response.  I've looked through the code some more and 
> think I have figured it out now.   I finally see that the root_hpa 
> variable gets switched before entering the guest in mmu_alloc_roots, 
> to correspond with the new cr3.  Thanks again.
>
> Perhaps you can help me with one more question.  I was hoping to try 
> out a certain change for a research project.   I would like to 
> "privatize" kvm_mmu_pages and their sptes for each guest thread
> running in certain designated guest processes.  The goal is to give 
> each thread its own shadow page table graphs that map the same guest 
> logical addresses to guest physical addresses (with some changes to be 
> introduced later).   Are there any assumptions that KVM makes that 
> will break if I do something like this?  I understand that I will have 
> to add some code throughout the mmu to make sure that these structures 
> are synchronized when a guest thread makes a change, but I'm wondering 
> if there is anything else.  Does the reverse mapping data structure 
> you have assume that there is only one shadow page per guest page?

It doesn't, and there are often multiple shadow pages per guest page, 
distinguished by their sp->role field.
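
As a rough sketch of that idea, assuming a lookup keyed by the guest
frame number together with the role (the Role tuple below is a
hypothetical two-field stand-in, not the real layout of sp->role, and
get_shadow_page is a made-up helper):

```python
from collections import namedtuple

# Hypothetical, much-reduced stand-in for a shadow page role.
Role = namedtuple("Role", ["level", "access"])

shadow_pages = {}   # (gfn, role) -> shadow page contents


def get_shadow_page(gfn, role):
    # The same guest page seen under a different role gets its own shadow.
    return shadow_pages.setdefault((gfn, role), {})


rw = get_shadow_page(0x42, Role(level=1, access="rw"))
ro = get_shadow_page(0x42, Role(level=1, access="ro"))
rw_again = get_shadow_page(0x42, Role(level=1, access="rw"))
```

Here guest page 0x42 ends up with two shadow pages, and a repeated
lookup with an equal role returns the existing one.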

-- 
error compiling committee.c: too many arguments to function



* Re: Shadow page table questions
  2010-03-11  6:39     ` Avi Kivity
@ 2010-03-11 16:14       ` Marek Olszewski
  2010-03-13  8:51         ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marek Olszewski @ 2010-03-11 16:14 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

> It doesn't, and there are often multiple shadow pages per guest page, 
> distinguished by their sp->role field. 
Oh, great!  Does this mean that there is already a mechanism for
synchronizing all shadow pages shadowing the same guest page when that
guest page changes?

Marek





* Re: Shadow page table questions
  2010-03-11 16:14       ` Marek Olszewski
@ 2010-03-13  8:51         ` Avi Kivity
  0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2010-03-13  8:51 UTC (permalink / raw)
  To: Marek Olszewski; +Cc: kvm

On 03/11/2010 06:14 PM, Marek Olszewski wrote:
>> It doesn't, and there are often multiple shadow pages per guest page, 
>> distinguished by their sp->role field. 
> Oh, great!  Does this mean that there is already a mechanism for
> synchronizing all shadow pages shadowing the same guest page when that
> guest page changes?

Yes, kvm_mmu_pte_write().
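
As a rough illustration of what such synchronization involves (this is
not a rendering of kvm_mmu_pte_write() itself), the sketch below assumes
an index from guest frame number to every shadow page shadowing it, and
assumes that a guest write simply drops the matching shadow entries so
they are rebuilt on the next fault. All names are hypothetical.

```python
from collections import defaultdict

shadows_by_gfn = defaultdict(list)   # gfn -> all shadow pages shadowing it


def add_shadow(gfn, entries):
    page = dict(entries)             # index within the page -> shadow entry
    shadows_by_gfn[gfn].append(page)
    return page


def guest_pte_write(gfn, index):
    """Drop entry `index` from every shadow of guest page `gfn`."""
    zapped = 0
    for page in shadows_by_gfn[gfn]:
        if page.pop(index, None) is not None:
            zapped += 1
    return zapped


s1 = add_shadow(0x42, {0: 0x7A, 1: 0x15})
s2 = add_shadow(0x42, {0: 0x7A})
count = guest_pte_write(0x42, 0)     # guest rewrote entry 0 of page 0x42
```

After the write, neither shadow holds a stale entry 0, while the
untouched entry 1 in the first shadow is left alone.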

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

