From: Avi Kivity <avi@qumranet.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org
Subject: Re: [patch 09/13] KVM: MMU: out of sync shadow core
Date: Sun, 07 Sep 2008 14:01:42 +0300
Message-ID: <48C3B496.20905@qumranet.com>
In-Reply-To: <20080906192431.211131067@localhost.localdomain>

Marcelo Tosatti wrote:
> Allow global and single-root, single-role-per-gfn leaf shadowed
> pagetables to be unsynced.
>
> Global unsync pages are saved into a per-vm array, synced on cr4/cr0 writes.
>
>   

Why not a list?

> Non-global unsync pages are linked off their root shadow page, synced 
> on cr3/cr4/cr0 writes.
>
> Some of this logic is simplistic and could be smarter (page_multimapped and
> the full root sync on higher level pagetable sharing).
>
> Also unsyncing of non-leaf nodes might be interesting (but more complicated).
>
>
>  
> +static struct kvm_mmu_page *kvm_mmu_lookup_page_root(struct kvm_vcpu *vcpu,
> +						     gfn_t gfn)
> +{
> +	unsigned index;
> +	struct hlist_head *bucket;
> +	struct kvm_mmu_page *sp;
> +	struct hlist_node *node;
> +	struct kvm *kvm = vcpu->kvm;
> +	int level = vcpu->arch.mmu.root_level;
> +	if (!is_long_mode(vcpu) && is_pae(vcpu))
> +		level--;
> +
> +	pgprintk("%s: looking for gfn %lx\n", __func__, gfn);
> +	index = kvm_page_table_hashfn(gfn);
> +	bucket = &kvm->arch.mmu_page_hash[index];
> +	hlist_for_each_entry(sp, node, bucket, hash_link)
> +		if (sp->gfn == gfn && !sp->role.metaphysical
> +		    && !sp->role.invalid && sp->role.level == level) {
> +			pgprintk("%s: found role %x\n",
> +				 __func__, sp->role.word);
> +			return sp;
> +		}
> +	return NULL;
> +}
>   

I'm worried about the complexity this (and the rest) introduces.

A possible alternative is:

- for non-leaf pages, including roots, add an 'unsync_children' flag.
- when marking a page unsync, set the flag recursively on all parents
- when switching cr3, recursively descend to locate unsynced leaves, 
clearing flags along the way
- to speed this up, put a bitmap with 1 bit per pte in the pages (512 
bits = 64 bytes)
- the bitmap can be externally allocated to save space, or not

This means we no longer have to worry about multiple roots, when a page 
acquires another root while it is unsynced, etc.
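
A minimal userspace sketch of the scheme listed above, to make the
marking and descent concrete. Everything here is hypothetical and
simplified: struct shadow_page, struct parent_link, MAX_PARENTS and the
helper names are illustrative stand-ins, not the real struct
kvm_mmu_page or any existing KVM function, and "syncing" a leaf is
reduced to clearing its flag. The bitmap is embedded in the page rather
than externally allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_PAGE 512
#define BITMAP_WORDS  (PTES_PER_PAGE / 64)	/* 512 bits = 64 bytes */
#define MAX_PARENTS   4				/* arbitrary limit for this sketch */

struct shadow_page;

struct parent_link {
	struct shadow_page *sp;	/* parent shadow page */
	unsigned idx;		/* pte slot in the parent that points at us */
};

/* Hypothetical stand-in for a shadow page. */
struct shadow_page {
	bool unsync;					/* leaf is out of sync */
	uint64_t unsync_children[BITMAP_WORDS];		/* 1 bit per pte */
	struct shadow_page *child[PTES_PER_PAGE];	/* all NULL for leaves */
	struct parent_link parent[MAX_PARENTS];
	unsigned nr_parents;
};

static void link_child(struct shadow_page *parent, unsigned idx,
		       struct shadow_page *child)
{
	assert(idx < PTES_PER_PAGE && child->nr_parents < MAX_PARENTS);
	parent->child[idx] = child;
	child->parent[child->nr_parents].sp = parent;
	child->parent[child->nr_parents].idx = idx;
	child->nr_parents++;
}

/* Set the bit for sp in every parent, then recurse towards the roots. */
static void mark_parents_unsync(struct shadow_page *sp)
{
	for (unsigned i = 0; i < sp->nr_parents; i++) {
		struct shadow_page *p = sp->parent[i].sp;
		unsigned idx = sp->parent[i].idx;
		uint64_t bit = 1ULL << (idx % 64);

		if (p->unsync_children[idx / 64] & bit)
			continue;	/* marked earlier, ancestors along with it */
		p->unsync_children[idx / 64] |= bit;
		mark_parents_unsync(p);
	}
}

static void mark_unsync(struct shadow_page *leaf)
{
	leaf->unsync = true;
	mark_parents_unsync(leaf);
}

/*
 * Descend from sp following set bits only, clearing them on the way.
 * Returns the number of leaves that were brought back in sync.
 */
static unsigned sync_children(struct shadow_page *sp)
{
	unsigned synced = 0;

	for (unsigned w = 0; w < BITMAP_WORDS; w++) {
		uint64_t bits = sp->unsync_children[w];

		sp->unsync_children[w] = 0;
		for (unsigned b = 0; bits; b++, bits >>= 1) {
			struct shadow_page *c;

			if (!(bits & 1))
				continue;
			c = sp->child[w * 64 + b];
			if (!c)
				continue;
			if (c->unsync) {
				c->unsync = false;	/* real code would re-read guest ptes */
				synced++;
			} else {
				synced += sync_children(c);
			}
		}
	}
	return synced;
}
```

A page reachable from two roots gets marked in both. Whichever root is
loaded first syncs the leaf; the other root later finds only a stale
bit and nothing left to do, so no per-page root tracking is needed.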


> @@ -963,8 +1112,24 @@ static struct kvm_mmu_page *kvm_mmu_get_
>  		 gfn, role.word);
>  	index = kvm_page_table_hashfn(gfn);
>  	bucket = &vcpu->kvm->arch.mmu_page_hash[index];
> -	hlist_for_each_entry(sp, node, bucket, hash_link)
> -		if (sp->gfn == gfn && sp->role.word == role.word) {
> +	hlist_for_each_entry_safe(sp, node, tmp, bucket, hash_link)
> +		if (sp->gfn == gfn) {
> +			/*
> + 			 * If a pagetable becomes referenced by more than one
> + 			 * root, or has multiple roles, unsync it and disable
> + 			 * oos. For higher level pgtables the entire tree
> + 			 * has to be synced.
> + 			 */
> +			if (sp->root_gfn != root_gfn) {
> +				kvm_set_pg_inuse(sp);
>   

What does inuse mean exactly?

> +				if (set_shared_mmu_page(vcpu, sp))
> +					tmp = bucket->first;
> +				kvm_clear_pg_inuse(sp);
>   

Cleared here?

> +				unsyncable = 0;
> +			}
> +			if (sp->role.word != role.word)
> +				continue;
> +
>  			mmu_page_add_parent_pte(vcpu, sp, parent_pte);
>  			pgprintk("%s: found\n", __func__);
>  			return sp;
> --- kvm.orig/include/asm-x86/kvm_host.h
> +++ kvm/include/asm-x86/kvm_host.h
> @@ -179,6 +179,10 @@ union kvm_mmu_page_role {
>  struct kvm_mmu_page {
>  	struct list_head link;
>  	struct hlist_node hash_link;
> +	/* FIXME: one list_head is enough */
> +	struct list_head unsync_pages;
> +	struct list_head oos_link;
>   

That's okay, we may allow OOS roots one day.

> +	gfn_t root_gfn; /* root this pagetable belongs to, -1 if multimapped */
>  
>  	/*
>  	 * The following two entries are used to key the shadow page in the
> @@ -362,6 +366,8 @@ struct kvm_arch{
>  	unsigned int n_requested_mmu_pages;
>  	unsigned int n_alloc_mmu_pages;
>  	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
> +	struct kvm_mmu_page *oos_global_pages[7];
> +	unsigned oos_global_idx;
>   

What does the index mean?  An lru pointer?

I became a little unsynced myself reading the patch.  It's very complex.

We need to change something to keep it maintainable.  Either a better 
data structure, or disallowing a parent to be zapped while any of its 
children are alive.

What happens when sp->global changes its value while the page is oos?

Can we even detect a nonglobal->global change?  Maybe instead of a flag, 
add a counter for active ptes and global ptes, and a page is global if 
the counters match.  However most likely it doesn't matter at all.
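
A small sketch of the counter idea, under stated assumptions: struct
pte_counts, pte_update() and page_is_global() are hypothetical names,
the two masks are defined locally from the x86 pte layout (present is
bit 0, global is bit 8), and treating a page with no active ptes as
non-global is a choice made here, not something the counters dictate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PT_PRESENT_MASK (1ULL << 0)	/* x86 pte bit 0 */
#define PT_GLOBAL_MASK  (1ULL << 8)	/* x86 pte bit 8 */

/* Hypothetical per-shadow-page counters replacing a single 'global' flag. */
struct pte_counts {
	unsigned active;	/* present ptes in the page */
	unsigned global;	/* present ptes that also have the G bit */
};

/* Account for one pte slot changing from old_pte to new_pte. */
static void pte_update(struct pte_counts *c, uint64_t old_pte, uint64_t new_pte)
{
	if (old_pte & PT_PRESENT_MASK) {
		assert(c->active > 0);
		c->active--;
		if (old_pte & PT_GLOBAL_MASK) {
			assert(c->global > 0);
			c->global--;
		}
	}
	if (new_pte & PT_PRESENT_MASK) {
		c->active++;
		if (new_pte & PT_GLOBAL_MASK)
			c->global++;
	}
}

/* The page is global when every active pte is global (assumption: empty is not). */
static bool page_is_global(const struct pte_counts *c)
{
	return c->active != 0 && c->active == c->global;
}
```

With this, a nonglobal->global change shows up as the counters becoming
equal after a pte_update(), and the opposite change as them diverging.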

-- 
error compiling committee.c: too many arguments to function

