From: Avi Kivity <avi@qumranet.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KVM list <kvm@vger.kernel.org>
Subject: Re: [patch 07/13] KVM: MMU: mode specific sync_page
Date: Sun, 07 Sep 2008 12:52:21 +0300
Message-ID: <48C3A455.5080100@qumranet.com>
In-Reply-To: <20080906192431.043506161@localhost.localdomain>

Marcelo Tosatti wrote:
> Examine guest pagetable and bring the shadow back in sync. At the moment
> sync_page is simplistic and only cares about shadow present entries
> whose gfn remains unchanged.
>
> It might be worthwhile to prepopulate the shadow in advance.
>
> FIXME: the RW->RO transition needs a local TLB flush.
>
>   

Yes!

>  static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>  					     gfn_t gfn,
>  					     gva_t gaddr,
> @@ -1536,6 +1581,7 @@ static int nonpaging_init_context(struct
>  	context->gva_to_gpa = nonpaging_gva_to_gpa;
>  	context->free = nonpaging_free;
>  	context->prefetch_page = nonpaging_prefetch_page;
> +	context->sync_page = nonpaging_sync_page;
>  	context->root_level = 0;
>  	context->shadow_root_level = PT32E_ROOT_LEVEL;
>  	context->root_hpa = INVALID_PAGE;
> @@ -1583,6 +1629,7 @@ static int paging64_init_context_common(
>  	context->page_fault = paging64_page_fault;
>  	context->gva_to_gpa = paging64_gva_to_gpa;
>  	context->prefetch_page = paging64_prefetch_page;
> +	context->sync_page = paging64_sync_page;
>  	context->free = paging_free;
>  	context->root_level = level;
>  	context->shadow_root_level = level;
> @@ -1604,6 +1651,7 @@ static int paging32_init_context(struct 
>  	context->gva_to_gpa = paging32_gva_to_gpa;
>  	context->free = paging_free;
>  	context->prefetch_page = paging32_prefetch_page;
> +	context->sync_page = paging32_sync_page;
>  	context->root_level = PT32_ROOT_LEVEL;
>  	context->shadow_root_level = PT32E_ROOT_LEVEL;
>  	context->root_hpa = INVALID_PAGE;
> @@ -1623,6 +1671,7 @@ static int init_kvm_tdp_mmu(struct kvm_v
>  	context->page_fault = tdp_page_fault;
>  	context->free = nonpaging_free;
>  	context->prefetch_page = nonpaging_prefetch_page;
> +	context->sync_page = nonpaging_sync_page;
>  	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
>  	context->root_hpa = INVALID_PAGE;
>   

Not sure this is right.

What if vcpu0 is in mode X while vcpu1 is in mode Y?  vcpu0 writes to 
some pagetable, causing both the mode X and mode Y shadows to become 
unsynced, so on the next resync (whether by vcpu0 or vcpu1) we need to 
sync both modes.

Same problem with kvm_mmu_pte_write(), which right now hacks around it.

Maybe we need a ->ops member.

>  
>  
> +static int FNAME(sync_page)(struct kvm_vcpu *vcpu,
> +			    struct kvm_mmu_page *sp)
> +{
> +	int i, nr_present = 0;
> +	struct page *pt_page;
> +	pt_element_t *pt;
> +	void *gpte_kaddr;
> +
> +	pt_page = gfn_to_page_atomic(vcpu->kvm, sp->gfn);
> +	if (is_error_page(pt_page)) {
> +		kvm_release_page_clean(pt_page);
> +		return -EFAULT;
> +	}
> +
> +	gpte_kaddr = pt = kmap_atomic(pt_page, KM_USER0);
> +
> +	if (PTTYPE == 32)
> +		pt += sp->role.quadrant << PT64_LEVEL_BITS;
>   

Only works for level 1 pages (which is okay).

> +
> +	for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> +		if (is_shadow_present_pte(sp->spt[i])) {
>   

Helper function needed for contents of inner loop.

> +			struct page *page;
> +			u64 spte;
> +			unsigned pte_access;
> +
> +			if (!is_present_pte(*pt)) {
> +				rmap_remove(vcpu->kvm, &sp->spt[i]);
> +				sp->spt[i] = shadow_notrap_nonpresent_pte;
> +				pt++;
> +				continue;
> +			}
>   

Are we missing a TLB flush?  Or will the caller take care of it?

> +
> +			pte_access = sp->role.access & FNAME(gpte_access)(vcpu, *pt);
> +			/* user */
> +			if (pte_access & ACC_USER_MASK)
> +				spte |= shadow_user_mask;
>   

There are some special cases involving cr0.wp=0 and the user mask, so 
spte.u does not correlate exactly with gpte.u.

> +			/* guest->shadow accessed sync */
> +			if (!(*pt & PT_ACCESSED_MASK))
> +				spte &= ~PT_ACCESSED_MASK;
>   

spte shouldn't be accessible at all if the gpte is not accessed, so we 
can set gpte.a on the next access (similar to the spte not being 
writable when the gpte is not dirty).

> +			/* shadow->guest accessed sync */
> +			if (spte & PT_ACCESSED_MASK)
> +				set_bit(PT_ACCESSED_SHIFT, (unsigned long *)pt);
>   

Host accessed and guest accessed are very different.  We shouldn't set 
guest accessed unless we're sure the guest will access the page very soon.


> +			set_shadow_pte(&sp->spt[i], spte);
>   

What if permissions are reduced?

You can use PT_* instead of shadow_*, as this will never be called when 
EPT is active.

I'm worried about the duplication with kvm_mmu_set_pte().  Perhaps that 
can be refactored instead to be the inner loop.

-- 
error compiling committee.c: too many arguments to function

