linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Avi Kivity <avi@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH 11/13] KVM: MMU: fast path of handling guest page fault
Date: Sun, 01 Apr 2012 19:23:53 +0300	[thread overview]
Message-ID: <4F788119.6090401@redhat.com> (raw)
In-Reply-To: <4F742AE8.9020201@linux.vnet.ibm.com>

On 03/29/2012 11:27 AM, Xiao Guangrong wrote:
> If the the present bit of page fault error code is set, it indicates
> the shadow page is populated on all levels, it means what we do is only
> modify the access bit which can be done out of mmu-lock
>
> The tricks in this patch is avoiding the race between fast page fault
> path and write-protect path, write-protect path is a read-check-modify
> path:
> read spte, check W bit, then clear W bit. What we do is populating a
> identification in spte, if write-protect meets it, it modify the spte
> even if the spte is readonly. See the comment in the code to get more
> information
>
> +
> +static bool page_fault_can_be_fast(struct kvm_vcpu *vcpu, gfn_t gfn,
> +				   u32 error_code)
> +{
> +	unsigned long *rmap;
> +	bool write = error_code & PFERR_WRITE_MASK;
> +
> +	/*
> +	 * #PF can be fast only if the shadow page table is present, that
> +	 * means we just need change the access bits (e.g: R/W, U/S...)
> +	 * which can be done out of mmu-lock.
> +	 */
> +	if (!(error_code & PFERR_PRESENT_MASK))
> +		return false;
> +
> +	if (unlikely(vcpu->vcpu_id > max_vcpu_spte()))
> +		return false;
> +
> +	rmap = gfn_to_rmap(vcpu->kvm, gfn, PT_PAGE_TABLE_LEVEL);

What prevents sp->gfns from being freed while this is going on?  Did I
miss the patch that makes mmu pages freed by rcu?  Also need barriers in
kvm_mmu_get_page() to make sure sp->gfns is made visible before the page
is hashed in.

+ smp_rmb()

> +
> +	/* Quickly check the page can be writable. */
> +	if (write && (ACCESS_ONCE(*rmap) & PTE_LIST_WRITE_PROTECT))
> +		return false;
> +
> +	return true;
> +}
> +
> +typedef bool (*fast_pf_fetch_spte)(struct kvm_vcpu *vcpu, u64 *sptep,
> +				   u64 *new_spte, gfn_t gfn, u32 expect_access,
> +				   u64 spte);
> +
> +static bool
> +fast_pf_fetch_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 *new_spte,
> +			  gfn_t gfn, u32 expect_access, u64 spte)
> +{
> +	struct kvm_mmu_page *sp = page_header(__pa(sptep));
> +

Why is this stable?  Another cpu could have removed it.

>
> +	WARN_ON(!sp->role.direct);
> +
> +	if (kvm_mmu_page_get_gfn(sp, sptep - sp->spt) != gfn)
> +		return false;

Simpler to check if the page is sync; if it is, gfn has not changed.

(but the check for a sync page is itself unstable).

>
> +
> +	set_spte(vcpu, new_spte, sp->role.access,
> +		 expect_access & ACC_USER_MASK, expect_access & ACC_WRITE_MASK,
> +		 sp->role.level, gfn, spte_to_pfn(spte), false, false,
> +		 spte & SPTE_HOST_WRITEABLE, true);
> +

What if the conditions have changed meanwhile?  Should we set the spte
atomically via set_spte?  For example an mmu notifier clearing the spte
in parallel.

What if another code path has read *spte, and did not re-read it because
it holds the mmu lock and thinks only the guest can access it?

>
> +	return true;
> +}
> +
> +static bool
> +fast_page_fault_fix_spte(struct kvm_vcpu *vcpu, u64 *sptep, u64 spte,
> +			 gfn_t gfn, u32 expect_access,
> +			 fast_pf_fetch_spte fn)
> +{
> +	u64 new_spte = 0ull;
> +	int vcpu_id = vcpu->vcpu_id;
> +
> +	spte = mark_vcpu_id_spte(sptep, spte, vcpu_id);

What's the point of reading spte?  It can change any minute, so the
value is stale and we can't use it.  We have to read and lock it
atomically, no?

Your patches read the spte, then set the identity.  In between some
other path can change the spte.

-- 
error compiling committee.c: too many arguments to function


  parent reply	other threads:[~2012-04-01 16:24 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-29  9:20 [PATCH 00/13] KVM: MMU: fast page fault Xiao Guangrong
2012-03-29  9:20 ` [PATCH 01/13] KVM: MMU: properly assert spte on rmap_next path Xiao Guangrong
2012-03-29  9:21 ` [PATCH 02/13] KVM: MMU: abstract spte write-protect Xiao Guangrong
2012-03-29 11:11   ` Avi Kivity
2012-03-29 11:51     ` Xiao Guangrong
2012-03-29  9:22 ` [PATCH 03/13] KVM: MMU: split FNAME(prefetch_invalid_gpte) Xiao Guangrong
2012-03-29 13:00   ` Avi Kivity
2012-03-30  3:51     ` Xiao Guangrong
2012-03-29  9:22 ` [PATCH 04/13] KVM: MMU: introduce FNAME(get_sp_gpa) Xiao Guangrong
2012-03-29 13:07   ` Avi Kivity
2012-03-30  5:01     ` Xiao Guangrong
2012-04-01 12:42       ` Avi Kivity
2012-03-29  9:23 ` [PATCH 05/13] KVM: MMU: reset shadow_mmio_mask Xiao Guangrong
2012-03-29 13:10   ` Avi Kivity
2012-03-29 15:28     ` Avi Kivity
2012-03-29 16:24       ` Avi Kivity
2012-03-29  9:23 ` [PATCH 06/13] KVM: VMX: export PFEC.P bit on ept Xiao Guangrong
2012-03-29  9:24 ` [PATCH 07/13] KVM: MMU: store more bits in rmap Xiao Guangrong
2012-03-29  9:25 ` [PATCH 08/13] KVM: MMU: fask check whether page is writable Xiao Guangrong
2012-03-29 15:49   ` Avi Kivity
2012-03-30  5:10     ` Xiao Guangrong
2012-04-01 15:52   ` Avi Kivity
2012-04-05 17:54     ` Xiao Guangrong
2012-04-12 23:08       ` Marcelo Tosatti
2012-04-13 10:26         ` Xiao Guangrong
2012-03-29  9:25 ` [PATCH 09/13] KVM: MMU: get expected spte out of mmu-lock Xiao Guangrong
2012-04-01 15:53   ` Avi Kivity
2012-04-05 18:25     ` Xiao Guangrong
2012-04-09 12:28       ` Avi Kivity
2012-04-09 13:16         ` Takuya Yoshikawa
2012-04-09 13:21           ` Avi Kivity
2012-03-29  9:26 ` [PATCH 10/13] KVM: MMU: store vcpu id in spte to notify page write-protect path Xiao Guangrong
2012-03-29  9:27 ` [PATCH 11/13] KVM: MMU: fast path of handling guest page fault Xiao Guangrong
2012-03-31 12:24   ` Xiao Guangrong
2012-04-01 16:23   ` Avi Kivity [this message]
2012-04-03 13:04     ` Avi Kivity
2012-04-05 19:39     ` Xiao Guangrong
2012-03-29  9:27 ` [PATCH 12/13] KVM: MMU: trace fast " Xiao Guangrong
2012-03-29  9:28 ` [PATCH 13/13] KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint Xiao Guangrong
2012-03-29 10:18 ` [PATCH 00/13] KVM: MMU: fast page fault Avi Kivity
2012-03-29 11:40   ` Xiao Guangrong
2012-03-29 12:57     ` Avi Kivity
2012-03-30  9:18       ` Xiao Guangrong
2012-03-31 13:12         ` Xiao Guangrong
2012-04-01 12:58         ` Avi Kivity
2012-04-05 21:57           ` Xiao Guangrong
2012-04-06  5:24             ` Xiao Guangrong
2012-04-09 13:20               ` Avi Kivity
2012-04-09 13:59                 ` Xiao Guangrong
2012-04-09 13:12 ` Avi Kivity
2012-04-09 13:55   ` Xiao Guangrong
2012-04-09 14:01     ` Xiao Guangrong
2012-04-09 14:25     ` Avi Kivity
2012-04-09 17:58   ` Marcelo Tosatti
2012-04-09 18:13     ` Xiao Guangrong
2012-04-09 19:31       ` Marcelo Tosatti
2012-04-09 18:26     ` Xiao Guangrong
2012-04-09 19:46       ` Marcelo Tosatti
2012-04-10  3:06         ` Xiao Guangrong
2012-04-10 10:04         ` Avi Kivity
2012-04-11  1:47           ` Marcelo Tosatti
2012-04-11  9:15             ` Avi Kivity
2012-04-10 10:39         ` Avi Kivity
2012-04-10 11:40           ` Takuya Yoshikawa
2012-04-10 11:58             ` Xiao Guangrong
2012-04-11 12:15               ` Takuya Yoshikawa
2012-04-11 12:38                 ` Xiao Guangrong
2012-04-11 14:14                   ` Takuya Yoshikawa
2012-04-11 14:21                     ` Avi Kivity
2012-04-11 22:26                       ` Takuya Yoshikawa
2012-04-13 14:25                     ` Takuya Yoshikawa
2012-04-15  9:32                       ` Avi Kivity
2012-04-16 15:49                         ` Takuya Yoshikawa
2012-04-16 16:02                           ` Avi Kivity
2012-04-17  6:26                           ` Xiao Guangrong
2012-04-17  7:51                             ` Avi Kivity
2012-04-17 12:37                               ` Takuya Yoshikawa
2012-04-17 12:41                                 ` Avi Kivity
2012-04-17 14:54                                   ` Takuya Yoshikawa
2012-04-17 14:56                                     ` Avi Kivity
2012-04-18 13:42                                       ` Takuya Yoshikawa
2012-04-17  6:16                         ` Xiao Guangrong
2012-04-10 10:10       ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F788119.6090401@redhat.com \
    --to=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).