Re: [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using host_pfn_mapping_level()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using host_pfn_mapping_level()
Date: Mon, 18 Jul 2022 20:48:55 +0000	[thread overview]
Message-ID: <YtXHN9rrj6+SRa1Z@google.com> (raw)
In-Reply-To: <YtWPSILmAp/0m5eC@google.com>

On Mon, Jul 18, 2022, Sean Christopherson wrote:
> On Sat, Jul 16, 2022, Mingwei Zhang wrote:
> > On Fri, Jul 15, 2022, Sean Christopherson wrote:
> > > Add a comment to document how host_pfn_mapping_level() can be used safely,
> > > as the line between safe and dangerous is quite thin.  E.g. if KVM were
> > > to ever support in-place promotion to create huge pages, consuming the
> > > level is safe if the caller holds mmu_lock and checks that there's an
> > > existing _leaf_ SPTE, but unsafe if the caller only checks that there's a
> > > non-leaf SPTE.
> > > 
> > > Opportunistically tweak the existing comments to explicitly document why
> > > KVM needs to use READ_ONCE().
> > > 
> > > No functional change intended.
> > > 
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 42 +++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 35 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index bebff1d5acd4..d5b644f3e003 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -2919,6 +2919,31 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
> > >  	__direct_pte_prefetch(vcpu, sp, sptep);
> > >  }
> > >  
> > > +/*
> > > + * Lookup the mapping level for @gfn in the current mm.
> > > + *
> > > + * WARNING!  Use of host_pfn_mapping_level() requires the caller and the end
> > > + * consumer to be tied into KVM's handlers for MMU notifier events!
> > Since calling this function won't cause kernel crash now, I guess we can
> > remove the warning sign here, but keep the remaining statement since it
> > is necessary.
> 
> Calling this function won't _directly_ crash the kernel, but improper usage can
> most definitely crash the host kernel, or even worse, silently corrupt host and
> or guest data.  E.g. if KVM were to race with an mmu_notifier event and incorrectly
> map a stale huge page into the guest.
> 
> So yes, the function itself is robust, but usage is still very subtle and delicate.

Understood. So we basically create another "gup_fast_only()" within KVM
and we worry that may confuse other developers so we add the warning
sign.
> 
> > > + *
> > > + * There are several ways to safely use this helper:
> > > + *
> > > + * - Check mmu_notifier_retry_hva() after grabbing the mapping level, before
> > > + *   consuming it.  In this case, mmu_lock doesn't need to be held during the
> > > + *   lookup, but it does need to be held while checking the MMU notifier.
> > 
> > but it does need to be held while checking the MMU notifier and
> > consuming the result.
> 
> I didn't want to include "consuming the result" because arguably the result is
> being consumed while running the guest, and obviously KVM doesn't hold mmu_lock
> while running the guest (though I fully acknowledge the above effectively uses
> "consume" in the sense of shoving the result into SPTEs).  
> 
> > > + *
> > > + * - Hold mmu_lock AND ensure there is no in-progress MMU notifier invalidation
> > > + *   event for the hva.  This can be done by explicit checking the MMU notifier
> 
> s/explicit/explicitly
> 
> > > + *   or by ensuring that KVM already has a valid mapping that covers the hva.
> > 
> > Yes, more specifically, "mmu notifier sequence counter".
> 
> Heh, depends on what the reader interprets as "sequence counter".  If the reader
> interprets that as the literal sequence counter, mmu_notifier_seq, then this phrasing
> is incorrect as mmu_notifier_seq isn't bumped until the invalidation completes,
> i.e. it guards against _past_ invalidations, not in-progress validations.
> 
> My preference is to intentionally not be precise in describing how to check for an
> in-progress invalidation, e.g. so that this comment doesn't need to be updated if
> the details change, and to also to try and force developers to do more than copy
> and paste if they want to use this helper.

Hmm, I was going to say that I strongly disagree about the intentional
unclearness. But then I find that MMU notifier implementation does
require more than just the counter but also the range, so yeah, talking
too much may fall into the weeds. But in general, I think mmu notifier
deserves better documentation in both concept and implementation in KVM.

next prev parent reply	other threads:[~2022-07-18 20:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-15 23:21 [PATCH 0/4] Huge page related cleanups Sean Christopherson
2022-07-15 23:21 ` [PATCH 1/4] KVM: x86/mmu: Don't require refcounted "struct page" to create huge SPTEs Sean Christopherson
2022-07-16  5:53   ` Mingwei Zhang
2022-07-15 23:21 ` [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using host_pfn_mapping_level() Sean Christopherson
2022-07-16 18:51   ` Mingwei Zhang
2022-07-18 16:50     ` Sean Christopherson
2022-07-18 20:48       ` Mingwei Zhang [this message]
2022-07-15 23:21 ` [PATCH 3/4] KVM: x86/mmu: Don't bottom out on leafs when zapping collapsible SPTEs Sean Christopherson
2022-07-15 23:21 ` [PATCH 4/4] KVM: selftests: Add an option to run vCPUs while disabling dirty logging Sean Christopherson
2022-07-19 18:02 ` [PATCH 0/4] Huge page related cleanups Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YtXHN9rrj6+SRa1Z@google.com \
    --to=mizhang@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.