Re: [PATCH v2 1/3] KVM: x86: assign two bits to track SPTE kinds

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Ben Gardon <bgardon@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Junaid Shahid <junaids@google.com>
Subject: Re: [PATCH v2 1/3] KVM: x86: assign two bits to track SPTE kinds
Date: Wed, 11 Dec 2019 11:13:27 -0800	[thread overview]
Message-ID: <20191211191327.GI5044@linux.intel.com> (raw)
In-Reply-To: <CANgfPd8G194y1Bo-6HR-jP8wh4DvdAsaijue_pnhetjduyzn4A@mail.gmail.com>

On Wed, Dec 11, 2019 at 10:39:22AM -0800, Ben Gardon wrote:
> Has anyone tested this patch on a long-running machine? It looks like
> the SPTE_MMIO_MASK overlaps with the bits used to track MMIO
> generation number, which makes me think that on a long running VM, a
> high enough generation number would overwrite the SPTE_SPECIAL_MASK
> region and cause the MMIO SPTE to be misinterpreted. It seems like
> setting bits 52 and 53 would also cause an incorrect generation number
> to be read from the PTE.

Hmm, the MMIO SPTE won't be misintrepreted as MMIO SPTEs will always have
bits 53:52=11b, i.e. bits 10:9 of the generation number effectively get
ignored.  It does mean that check_mmio_spte() could theoretically consume
stale MMIO entries due those bits being ignored, but only if a SPTE lived
across 512 memslot updates, which is impossible given that KVM currently
zaps all SPTEs on a memslot update (fast zapping can let a stale SPTE live
for while the memslot update is in progress, but not longer than that).

There is/was the pass-through issue that cropped up when I tried to stop
zapping all SPTEs on a memslot update, but I doubt this would explain that
problem.

Assuming we haven't missed something, the easiest fix would be to reduce
the MMIO generation by one bit and use bits 62:54 for the MMIO generation.

> On Fri, Sep 27, 2019 at 4:16 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > Currently, we are overloading SPTE_SPECIAL_MASK to mean both
> > "A/D bits unavailable" and MMIO, where the difference between the
> > two is determined by mio_mask and mmio_value.
> >
> > However, the next patch will need two bits to distinguish
> > availability of A/D bits from write protection.  So, while at
> > it give MMIO its own bit pattern, and move the two bits from
> > bit 62 to bits 52..53 since Intel is allocating EPT page table
> > bits from the top.
> >
> > Reviewed-by: Junaid Shahid <junaids@google.com>
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  7 -------
> >  arch/x86/kvm/mmu.c              | 28 ++++++++++++++++++----------
> >  2 files changed, 18 insertions(+), 17 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index 23edf56cf577..50eb430b0ad8 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -219,13 +219,6 @@ enum {
> >                                  PFERR_WRITE_MASK |             \
> >                                  PFERR_PRESENT_MASK)
> >
> > -/*
> > - * The mask used to denote special SPTEs, which can be either MMIO SPTEs or
> > - * Access Tracking SPTEs. We use bit 62 instead of bit 63 to avoid conflicting
> > - * with the SVE bit in EPT PTEs.
> > - */
> > -#define SPTE_SPECIAL_MASK (1ULL << 62)
> > -
> >  /* apic attention bits */
> >  #define KVM_APIC_CHECK_VAPIC   0
> >  /*
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index 5269aa057dfa..bac8d228d82b 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -83,7 +83,16 @@ enum {
> >  #define PTE_PREFETCH_NUM               8
> >
> >  #define PT_FIRST_AVAIL_BITS_SHIFT 10
> > -#define PT64_SECOND_AVAIL_BITS_SHIFT 52
> > +#define PT64_SECOND_AVAIL_BITS_SHIFT 54
> > +
> > +/*
> > + * The mask used to denote special SPTEs, which can be either MMIO SPTEs or
> > + * Access Tracking SPTEs.
> > + */
> > +#define SPTE_SPECIAL_MASK (3ULL << 52)
> > +#define SPTE_AD_ENABLED_MASK (0ULL << 52)
> > +#define SPTE_AD_DISABLED_MASK (1ULL << 52)
> > +#define SPTE_MMIO_MASK (3ULL << 52)

next prev parent reply	other threads:[~2019-12-11 19:13 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-27 11:15 [PATCH v2 0/3] KVM: MMU: fix nested guest live migration with PML Paolo Bonzini
2019-09-27 11:15 ` [PATCH v2 1/3] KVM: x86: assign two bits to track SPTE kinds Paolo Bonzini
2019-12-11 18:39   ` Ben Gardon
2019-12-11 19:13     ` Sean Christopherson [this message]
2019-12-11 23:28       ` Paolo Bonzini
2019-12-12  0:29         ` Sean Christopherson
2019-12-12  0:33           ` Paolo Bonzini
2019-09-27 11:15 ` [PATCH v2 2/3] KVM: x86: fix nested guest live migration with PML Paolo Bonzini
2019-09-27 20:24   ` Junaid Shahid
2019-09-27 11:15 ` [PATCH v2 3/3] selftests: kvm: add test for dirty logging inside nested guests Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191211191327.GI5044@linux.intel.com \
    --to=sean.j.christopherson@intel.com \
    --cc=bgardon@google.com \
    --cc=junaids@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.