Re: [patch 2/5] KVM: MMU: allow pinning spte translations (TDP-only)


From: Avi Kivity <avi.kivity@gmail.com>
To: mtosatti@redhat.com, kvm@vger.kernel.org, ak@linux.intel.com
Cc: pbonzini@redhat.com, xiaoguangrong@linux.vnet.ibm.com, gleb@kernel.org
Subject: Re: [patch 2/5] KVM: MMU: allow pinning spte translations (TDP-only)
Date: Thu, 19 Jun 2014 11:01:06 +0300
Message-ID: <53A298C2.4040005@gmail.com>
In-Reply-To: <20140618231521.569025131@amt.cnet>


On 06/19/2014 02:12 AM, mtosatti@redhat.com wrote:
> Allow vcpus to pin spte translations by:
>
> 1) Creating a per-vcpu list of pinned ranges.
> 2) On mmu reload request:
> 	- Fault ranges.
> 	- Mark sptes with a pinned bit.
> 	- Mark shadow pages as pinned.
>
> 3) Then modify the following actions:
> 	- Page age => skip spte flush.
> 	- MMU notifiers => force mmu reload request (which kicks cpu out of
> 				guest mode).
> 	- GET_DIRTY_LOG => force mmu reload request.
> 	- SLAB shrinker => skip shadow page deletion.
>
> TDP-only.
>
>   
> +int kvm_mmu_register_pinned_range(struct kvm_vcpu *vcpu,
> +				  gfn_t base_gfn, unsigned long npages)
> +{
> +	struct kvm_pinned_page_range *p;
> +
> +	mutex_lock(&vcpu->arch.pinned_mmu_mutex);
> +	list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
> +		if (p->base_gfn == base_gfn && p->npages == npages) {
> +			mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
> +			return -EEXIST;
> +		}
> +	}
> +	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
> +
> +	if (vcpu->arch.nr_pinned_ranges >=
> +	    KVM_MAX_PER_VCPU_PINNED_RANGE)
> +		return -ENOSPC;
> +
> +	p = kzalloc(sizeof(struct kvm_pinned_page_range), GFP_KERNEL);
> +	if (!p)
> +		return -ENOMEM;
> +
> +	vcpu->arch.nr_pinned_ranges++;
> +
> +	trace_kvm_mmu_register_pinned_range(vcpu->vcpu_id, base_gfn, npages);
> +
> +	INIT_LIST_HEAD(&p->link);
> +	p->base_gfn = base_gfn;
> +	p->npages = npages;
> +	mutex_lock(&vcpu->arch.pinned_mmu_mutex);
> +	list_add(&p->link, &vcpu->arch.pinned_mmu_pages);
> +	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
> +	kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
> +
> +	return 0;
> +}
> +

What happens if ranges overlap (within a vcpu, cross-vcpu)? Or if a 
range overflows and wraps around 0?  Or if it does not refer to RAM?
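
To make the overlap and wrap-around cases concrete, here is a minimal userspace sketch of the checks I have in mind for a single vcpu. It is not code from the patch: struct pinned_range, range_wraps(), ranges_overlap() and check_new_range() are hypothetical names, and the cross-vcpu and "is it RAM" questions are not covered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Hypothetical stand-in for the patch's kvm_pinned_page_range. */
struct pinned_range {
	gfn_t base_gfn;
	unsigned long npages;
};

/* True if the range is empty or its last gfn would wrap past the top. */
static bool range_wraps(gfn_t base, unsigned long npages)
{
	return npages == 0 || (gfn_t)(npages - 1) > UINT64_MAX - base;
}

/* Both ranges must already have passed range_wraps(). */
static bool ranges_overlap(const struct pinned_range *a,
			   const struct pinned_range *b)
{
	gfn_t a_last = a->base_gfn + (a->npages - 1);
	gfn_t b_last = b->base_gfn + (b->npages - 1);

	return a->base_gfn <= b_last && b->base_gfn <= a_last;
}

/* Returns 0 if the new range is acceptable, a negative errno otherwise. */
static int check_new_range(const struct pinned_range *existing, size_t n,
			   gfn_t base, unsigned long npages)
{
	struct pinned_range new_range = { base, npages };
	size_t i;

	assert(existing != NULL || n == 0);

	if (range_wraps(base, npages))
		return -EINVAL;

	for (i = 0; i < n; i++) {
		if (ranges_overlap(&existing[i], &new_range))
			return -EEXIST;
	}
	return 0;
}
```

Note that this rejects any overlap, whereas the patch only rejects an exact (base_gfn, npages) duplicate.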

Looks like you're limiting the number of ranges, but not the number of 
pages, so a guest can lock all of its memory.
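
A rough sketch of what a page-count limit could look like, as standalone C rather than kernel code. MAX_PINNED_PAGES, struct pin_budget and pin_budget_charge() are made-up names and the limit value is arbitrary; the patch has no such counter.

```c
#include <errno.h>

/* Hypothetical per-vcpu cap on pinned pages; the value is arbitrary. */
#define MAX_PINNED_PAGES 32UL

struct pin_budget {
	unsigned long nr_pinned_pages;	/* always <= MAX_PINNED_PAGES */
};

/*
 * Account npages against the budget. The comparison is written as a
 * subtraction on the limit side so that a huge npages cannot overflow
 * the running total.
 */
static int pin_budget_charge(struct pin_budget *b, unsigned long npages)
{
	if (npages > MAX_PINNED_PAGES - b->nr_pinned_pages)
		return -ENOSPC;

	b->nr_pinned_pages += npages;
	return 0;
}
```

The charge would have to happen at registration time, next to the existing nr_pinned_ranges check.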

> +
> +/*
> + * Pin KVM MMU page translations. This guarantees, for valid
> + * addresses registered by kvm_mmu_register_pinned_range (valid address
> + * meaning address which posses sufficient information for fault to
> + * be resolved), valid translations exist while in guest mode and
> + * therefore no VM-exits due to faults will occur.
> + *
> + * Failure to instantiate pages will abort guest entry.
> + *
> + * Page frames should be pinned with get_page in advance.
> + *
> + * Pinning is not guaranteed while executing as L2 guest.

Does this undermine security?

> + *
> + */
> +
> +static void kvm_mmu_pin_pages(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_pinned_page_range *p;
> +
> +	if (is_guest_mode(vcpu))
> +		return;
> +
> +	if (!vcpu->arch.mmu.direct_map)
> +		return;
> +
> +	ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
> +
> +	mutex_lock(&vcpu->arch.pinned_mmu_mutex);

Is the mutex actually needed? It seems it's only taken in vcpu context, 
so the vcpu mutex should be sufficient.

> +	list_for_each_entry(p, &vcpu->arch.pinned_mmu_pages, link) {
> +		gfn_t gfn_offset;
> +
> +		for (gfn_offset = 0; gfn_offset < p->npages; gfn_offset++) {
> +			gfn_t gfn = p->base_gfn + gfn_offset;
> +			int r;
> +			bool pinned = false;
> +
> +			r = vcpu->arch.mmu.page_fault(vcpu, gfn << PAGE_SHIFT,
> +						     PFERR_WRITE_MASK, false,
> +						     true, &pinned);
> +			/* MMU notifier sequence window: retry */
> +			if (!r && !pinned)
> +				kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
> +			if (r) {
> +				kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> +				break;
> +			}
> +
> +		}
> +	}
> +	mutex_unlock(&vcpu->arch.pinned_mmu_mutex);
> +}
> +
>   int kvm_mmu_load(struct kvm_vcpu *vcpu)
>   {
>   	int r;
> @@ -3916,6 +4101,7 @@
>   		goto out;
>   	/* set_cr3() should ensure TLB has been flushed */
>   	vcpu->arch.mmu.set_cr3(vcpu, vcpu->arch.mmu.root_hpa);
> +	kvm_mmu_pin_pages(vcpu);
>   out:
>   	return r;
>   }
>

I don't see where you unpin pages, so even if you limit the number of 
pinned pages, a guest can pin all of memory by iterating over all of 
memory and pinning it a chunk at a time.
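
For illustration, a register/unregister pair in which unregistering returns pages to a per-vcpu budget might look roughly like the userspace sketch below. All names (struct pin_state, pin_register(), pin_unregister()) and both limits are hypothetical. A real unregister path would presumably also have to clear the pinned bit on the sptes and drop the page references, which this does not model.

```c
#include <errno.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define MAX_RANGES		4	/* hypothetical limits */
#define MAX_PINNED_PAGES	32UL

struct pinned_range {
	gfn_t base_gfn;
	unsigned long npages;		/* 0 marks a free slot */
};

struct pin_state {
	struct pinned_range ranges[MAX_RANGES];
	unsigned long nr_pinned_pages;	/* always <= MAX_PINNED_PAGES */
};

static int pin_register(struct pin_state *s, gfn_t base, unsigned long npages)
{
	int i;

	if (npages == 0)
		return -EINVAL;
	if (npages > MAX_PINNED_PAGES - s->nr_pinned_pages)
		return -ENOSPC;

	for (i = 0; i < MAX_RANGES; i++) {
		if (s->ranges[i].npages == 0) {
			s->ranges[i].base_gfn = base;
			s->ranges[i].npages = npages;
			s->nr_pinned_pages += npages;
			return 0;
		}
	}
	return -ENOSPC;	/* no free range slot */
}

/* Drop a previously registered range and return its pages to the budget. */
static int pin_unregister(struct pin_state *s, gfn_t base, unsigned long npages)
{
	int i;

	if (npages == 0)
		return -EINVAL;

	for (i = 0; i < MAX_RANGES; i++) {
		if (s->ranges[i].base_gfn == base &&
		    s->ranges[i].npages == npages) {
			s->nr_pinned_pages -= npages;
			s->ranges[i].npages = 0;
			return 0;
		}
	}
	return -ENOENT;
}
```

With both halves in place, walking over memory a chunk at a time only works if earlier chunks are unregistered first, so the total stays bounded.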

You might try something similar to guest MTRR handling.
