From: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH v4 06/10] KVM: MMU: fast path of handling guest page fault
Date: Fri, 27 Apr 2012 13:53:09 +0800 [thread overview]
Message-ID: <4F9A3445.2060305@linux.vnet.ibm.com> (raw)
In-Reply-To: <20120426234535.GA5057@amt.cnet>
On 04/27/2012 07:45 AM, Marcelo Tosatti wrote:
>> +static bool
>> +fast_pf_fix_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
>> + u64 *sptep, u64 spte)
>> +{
>> + gfn_t gfn;
>> +
>> + spin_lock(&vcpu->kvm->mmu_lock);
>> +
>> + /* The spte has been changed. */
>> + if (*sptep != spte)
>> + goto exit;
>> +
>> + gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
>> +
>> + *sptep = spte | PT_WRITABLE_MASK;
>> + mark_page_dirty(vcpu->kvm, gfn);
>> +
>> +exit:
>> + spin_unlock(&vcpu->kvm->mmu_lock);
>
> There was a misunderstanding.
Sorry, Marcelo! It is still not clear for me now. :(
> The suggestion is to change codepaths that
> today assume that a side effect of holding mmu_lock is that sptes are
> not updated or read before attempting to introduce any lockless spte
> updates.
>
> example)
>
> u64 mask, old_spte = *sptep;
>
> if (!spte_has_volatile_bits(old_spte) || (new_spte & mask) == mask)
> __update_clear_spte_fast(sptep, new_spte);
> else
> old_spte = __update_clear_spte_slow(sptep, new_spte);
>
> The local old_spte copy might be stale by the
> time "spte_has_volatile_bits(old_spte)" reads the writable bit.
>
> example)
>
>
> VCPU0 VCPU1
> set_spte
> mmu_spte_update decides fast write
> mov newspte to sptep
> (non-locked write instruction)
> newspte in store buffer
>
> lockless spte read path reads stale value
>
> spin_unlock(mmu_lock)
>
> Depending on the what stale value CPU1 read and what decision it took
> this can be a problem, say the new bits (and we do not want to verify
> every single case). The spte write from inside mmu_lock should always be
> atomic now?
>
> So these details must be exposed to the reader, they are hidden now
> (note mmu_spte_update is already a maze, its getting too deep).
>
Actually, in this patch, all the spte update is under mmu-lock, and we
lockless-ly read spte , but the spte will be verified again after holding
mmu-lock.
+ spin_lock(&vcpu->kvm->mmu_lock);
+
+ /* The spte has been changed. */
+ if (*sptep != spte)
+ goto exit;
+
+ gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
+
+ *sptep = spte | PT_WRITABLE_MASK;
+ mark_page_dirty(vcpu->kvm, gfn);
+
+exit:
+ spin_unlock(&vcpu->kvm->mmu_lock);
Is not the same as both read/update spte under mmu-lock?
Hmm, this is what you want?
+static bool
+fast_pf_fix_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
+ u64 *sptep, u64 spte)
+{
+ gfn_t gfn;
+
+ gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
+ mmu_spte_update(sptep, spte | PT_WRITABLE_MASK);
+ mark_page_dirty(vcpu->kvm, gfn);
+
+ return true;
+}
+
+/*
+ * Return value:
+ * - true: let the vcpu to access on the same address again.
+ * - false: let the real page fault path to fix it.
+ */
+static bool fast_page_fault(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
+ int level, u32 error_code)
+{
+ struct kvm_shadow_walk_iterator iterator;
+ struct kvm_mmu_page *sp;
+ bool ret = false;
+ u64 spte = 0ull;
+
+ if (!page_fault_can_be_fast(vcpu, gfn, error_code))
+ return false;
+
+ spin_lock(&vcpu->kvm->mmu_lock);
+
+ for_each_shadow_entry(vcpu, gva, iterator) {
+ spte = *iterator.sptep;
+
+ if (!is_shadow_present_pte(spte) || iterator.level < level)
+ break;
+ }
+
+ /*
+ * If the mapping has been changed, let the vcpu fault on the
+ * same address again.
+ */
+ if (!is_rmap_spte(spte)) {
+ ret = true;
+ goto exit;
+ }
+
+ if (!is_last_spte(spte, level))
+ goto exit;
+
+ /*
+ * Check if it is a spurious fault caused by TLB lazily flushed.
+ *
+ * Need not check the access of upper level table entries since
+ * they are always ACC_ALL.
+ */
+ if (is_writable_pte(spte)) {
+ ret = true;
+ goto exit;
+ }
+
+ /*
+ * Currently, to simplify the code, only the spte write-protected
+ * by dirty-log can be fast fixed.
+ */
+ if (!spte_wp_by_dirty_log(spte))
+ goto exit;
+
+ sp = page_header(__pa(iterator.sptep));
+
+ ret = fast_pf_fix_spte(vcpu, sp, iterator.sptep, spte);
+
+exit:
+ spin_unlock(&vcpu->kvm->mmu_lock);
+
+ return ret;
+}
next prev parent reply other threads:[~2012-04-27 5:53 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-04-25 4:00 [PATCH v4 00/10] KVM: MMU: fast page fault Xiao Guangrong
2012-04-25 4:01 ` [PATCH v4 01/10] KVM: MMU: return bool in __rmap_write_protect Xiao Guangrong
2012-04-25 4:01 ` [PATCH v4 02/10] KVM: MMU: abstract spte write-protect Xiao Guangrong
2012-04-25 4:02 ` [PATCH v4 03/10] KVM: VMX: export PFEC.P bit on ept Xiao Guangrong
2012-04-25 4:02 ` [PATCH v4 04/10] KVM: MMU: introduce SPTE_MMU_WRITEABLE bit Xiao Guangrong
2012-04-25 4:03 ` [PATCH v4 05/10] KVM: MMU: introduce SPTE_WRITE_PROTECT bit Xiao Guangrong
2012-04-25 4:03 ` [PATCH v4 06/10] KVM: MMU: fast path of handling guest page fault Xiao Guangrong
2012-04-26 23:45 ` Marcelo Tosatti
2012-04-27 5:53 ` Xiao Guangrong [this message]
2012-04-27 14:52 ` Marcelo Tosatti
2012-04-28 6:10 ` Xiao Guangrong
2012-05-01 1:34 ` Marcelo Tosatti
2012-05-02 5:28 ` Xiao Guangrong
2012-05-02 21:07 ` Marcelo Tosatti
2012-05-03 11:26 ` Xiao Guangrong
2012-05-05 14:08 ` Marcelo Tosatti
2012-05-06 9:36 ` Avi Kivity
2012-05-07 6:52 ` Xiao Guangrong
2012-04-29 8:50 ` Takuya Yoshikawa
2012-05-01 2:31 ` Marcelo Tosatti
2012-05-02 5:39 ` Xiao Guangrong
2012-05-02 21:10 ` Marcelo Tosatti
2012-05-03 12:09 ` Xiao Guangrong
2012-05-03 12:13 ` Avi Kivity
2012-05-03 0:15 ` Takuya Yoshikawa
2012-05-03 12:23 ` Xiao Guangrong
2012-05-03 12:40 ` Takuya Yoshikawa
2012-04-25 4:04 ` [PATCH v4 07/10] KVM: MMU: lockless update spte on fast page fault path Xiao Guangrong
2012-04-25 4:04 ` [PATCH v4 08/10] KVM: MMU: trace fast page fault Xiao Guangrong
2012-04-25 4:05 ` [PATCH v4 09/10] KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint Xiao Guangrong
2012-04-25 4:06 ` [PATCH v4 10/10] KVM: MMU: document mmu-lock and fast page fault Xiao Guangrong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F9A3445.2060305@linux.vnet.ibm.com \
--to=xiaoguangrong@linux.vnet.ibm.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).