All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range()
Date: Thu, 10 Oct 2024 09:14:41 -0700	[thread overview]
Message-ID: <Zwf9cSjhlp5clpTm@google.com> (raw)
In-Reply-To: <Zwd75Nc8+8pIWUGm@yzhao56-desk.sh.intel.com>

On Thu, Oct 10, 2024, Yan Zhao wrote:
> On Wed, Oct 09, 2024 at 12:23:44PM -0700, Sean Christopherson wrote:
> > Add a lockdep assertion in kvm_unmap_gfn_range() to ensure that either
> > mmu_invalidate_in_progress is elevated, or that the range is being zapped
> > due to memslot removal (loosely detected by slots_lock being held).
> > Zapping SPTEs without mmu_invalidate_{in_progress,seq} protection is unsafe
> > as KVM's page fault path snapshots state before acquiring mmu_lock, and
> > thus can create SPTEs with stale information if vCPUs aren't forced to
> > retry faults (due to seeing an in-progress or past MMU invalidation).
> > 
> > Memslot removal is a special case, as the memslot is retrieved outside of
> > mmu_invalidate_seq, i.e. doesn't use the "standard" protections, and
> > instead relies on SRCU synchronization to ensure any in-flight page faults
> > are fully resolved before zapping SPTEs.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 09494d01c38e..c6716fd3666f 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -1556,6 +1556,16 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
> >  {
> >  	bool flush = false;
> >  
> > +	/*
> > +	 * To prevent races with vCPUs faulting in a gfn using stale data,
> > +	 * zapping a gfn range must be protected by mmu_invalidate_in_progress
> > +	 * (and mmu_invalidate_seq).  The only exception is memslot deletion,
> > +	 * in which case SRCU synchronization ensures SPTEs a zapped after all
> > +	 * vCPUs have unlocked SRCU and are guaranteed to see the invalid slot.
> > +	 */
> > +	lockdep_assert_once(kvm->mmu_invalidate_in_progress ||
> > +			    lockdep_is_held(&kvm->slots_lock));
> > +
> Is the detection of slots_lock too loose?

Yes, but I can't think of an easy way to tighten it.  My original thought was to
require range->slot to be invalid, but KVM (correctly) passes in the old, valid
memslot to kvm_arch_flush_shadow_memslot().

The goal with the assert is to detect as many bugs as possible, without adding
too much complexity, and also to document the rules for using kvm_unmap_gfn_range().

Actually, we can tighten the check, by verifying that the slot being unmapped is
valid, but that the slot that KVM sees is invalid.  I'm not sure I love it though,
as it's absurdly specific.

(untested)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c6716fd3666f..12b87b746b59 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1552,6 +1552,17 @@ static bool __kvm_rmap_zap_gfn_range(struct kvm *kvm,
                                 start, end - 1, can_yield, true, flush);
 }
 
+static kvm_memslot_is_being_invalidated(const struct kvm_memory_slot *old)
+{
+       const struct kvm_memory_slot *new;
+
+       if (old->flags & KVM_MEMSLOT_INVALID)
+               return false;
+
+       new = id_to_memslot(__kvm_memslots(kvm, old->as_id), old->id);
+       return new && new->flags & KVM_MEMSLOT_INVALID;
+}
+
 bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
        bool flush = false;
@@ -1564,7 +1575,8 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
         * vCPUs have unlocked SRCU and are guaranteed to see the invalid slot.
         */
        lockdep_assert_once(kvm->mmu_invalidate_in_progress ||
-                           lockdep_is_held(&kvm->slots_lock));
+                           (lockdep_is_held(&kvm->slots_lock) &&
+                            kvm_memslot_is_being_invalidated(range->slot));
 
        if (kvm_memslots_have_rmaps(kvm))
                flush = __kvm_rmap_zap_gfn_range(kvm, range->slot,


> If a caller just holds slots_lock without calling
> "synchronize_srcu_expedited(&kvm->srcu)" as that in kvm_swap_active_memslots()
> to ensure the old slot is retired, stale data may still be encountered. 
> 
> >  	if (kvm_memslots_have_rmaps(kvm))
> >  		flush = __kvm_rmap_zap_gfn_range(kvm, range->slot,
> >  						 range->start, range->end,
> > -- 
> > 2.47.0.rc1.288.g06298d1525-goog
> > 

  reply	other threads:[~2024-10-10 16:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-09 19:23 [PATCH 0/3] KVM: x86/mmu: Don't zap "direct" non-leaf SPTEs on memslot removal Sean Christopherson
2024-10-09 19:23 ` [PATCH 1/3] KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot Sean Christopherson
2024-10-10  7:59   ` Yan Zhao
2024-10-09 19:23 ` [PATCH 2/3] KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range() Sean Christopherson
2024-10-10  7:01   ` Yan Zhao
2024-10-10 16:14     ` Sean Christopherson [this message]
2024-10-11  5:37       ` Yan Zhao
2024-10-11 21:22         ` Sean Christopherson
2024-10-09 19:23 ` [PATCH 3/3] KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zwf9cSjhlp5clpTm@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.