From: Sean Christopherson <seanjc@google.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Jim Mattson <jmattson@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM: nVMX: Always use TLB_FLUSH_GUEST for nested VM-Enter/VM-Exit
Date: Thu, 16 Jan 2025 14:35:18 -0800 [thread overview]
Message-ID: <Z4mJpu3MvBeL4d1Q@google.com> (raw)
In-Reply-To: <CAJD7tkZQQUqh1GG5RpfYFT4-jK-CV7H+z9p2rTudLsrBe3WgbA@mail.gmail.com>
On Thu, Jan 16, 2025, Yosry Ahmed wrote:
> On Thu, Jan 16, 2025 at 9:11 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Thu, Jan 16, 2025, Yosry Ahmed wrote:
> > > On Wed, Jan 15, 2025 at 9:27 PM Jim Mattson <jmattson@google.com> wrote:
> > > > On Wed, Jan 15, 2025 at 7:50 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> > > > > Use KVM_REQ_TLB_FLUSH_GUEST in this case in
> > > > > nested_vmx_transition_tlb_flush() for consistency. This arguably makes
> > > > > more sense conceptually too -- L1 and L2 cannot share the TLB tag for
> > > > > guest-physical translations, so only flushing linear and combined
> > > > > translations (i.e. guest-generated translations) is needed.
> >
> > No, using KVM_REQ_TLB_FLUSH_CURRENT is correct. From *L1's* perspective, VPID
> > is enabled, and so VM-Entry/VM-Exit are NOT architecturally guaranteed to flush
> > TLBs, and thus KVM is not required to FLUSH_GUEST.
> >
> > E.g. if KVM is using shadow paging (no EPT whatsoever), and L1 has modified the
> > PTEs used to map L2 but has not yet flushed TLBs for L2's VPID, then KVM is allowed
> > to retain its old, "stale" SPTEs that map L2 because architecturally they aren't
> > guaranteed to be visible to L2.
> >
> > But because L1 and L2 share TLB entries *in hardware*, KVM needs to ensure the
> > hardware TLBs are flushed. Without EPT, KVM will use different CR3s for L1 and
> > L2, but Intel's ASID tag doesn't include the CR3 address, only the PCID, which
> > KVM always pulls from guest CR3, i.e. could be the same for L1 and L2.
> >
> > Specifically, the synchronization of shadow roots in kvm_vcpu_flush_tlb_guest()
> > is not required in this scenario.
>
> Aha, I was examining vmx_flush_tlb_guest() not
> kvm_vcpu_flush_tlb_guest(), so I missed the synchronization. Yeah I
> think it's possible that we end up unnecessarily synchronizing the
> shadow page tables (or dropping them) in this case.
>
> Do you think it's worth expanding the comment in
> nested_vmx_transition_tlb_flush()?
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 2ed454186e59c..43d34e413d016 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1239,6 +1239,11 @@ static void
> nested_vmx_transition_tlb_flush(struct kvm_vcpu *vcpu,
> * does not have a unique TLB tag (ASID), i.e. EPT is disabled and
> * KVM was unable to allocate a VPID for L2, flush the current context
> * as the effective ASID is common to both L1 and L2.
> + *
> + * Note that even though TLB_FLUSH_GUEST would be correct because we
> + * only need to flush linear mappings, it would unnecessarily
> + * synchronize the MMU even though a TLB flush is not architecturally
> + * required from L1's perspective.
I'm open to calling out that there's no flush from L1's perspective, but this
is inaccurate. Using TLB_FLUSH_GUEST is simply not correct. Will it cause
functional problems? No. But neither would blasting kvm_flush_remote_tlbs(),
and I think most people would consider flushing all TLBs on all vCPUs to be a
bug.
How about:
* Note, only the hardware TLB entries need to be flushed, as VPID is
* fully enabled from L1's perspective, i.e. there's no architectural
* TLB flush from L1's perspective.
> */
> if (!nested_has_guest_tlb_tag(vcpu))
> kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, vcpu);
next prev parent reply other threads:[~2025-01-16 22:35 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-16 3:50 [PATCH] KVM: nVMX: Always use TLB_FLUSH_GUEST for nested VM-Enter/VM-Exit Yosry Ahmed
2025-01-16 5:27 ` Jim Mattson
2025-01-16 15:25 ` Yosry Ahmed
2025-01-16 17:11 ` Sean Christopherson
2025-01-16 18:24 ` Yosry Ahmed
2025-01-16 22:35 ` Sean Christopherson [this message]
2025-01-16 22:43 ` Yosry Ahmed
2025-01-17 0:34 ` Sean Christopherson
2025-01-17 0:53 ` Yosry Ahmed
2025-01-17 18:01 ` Sean Christopherson
2025-01-17 18:20 ` Yosry Ahmed
2025-01-18 0:03 ` Sean Christopherson
2025-01-22 19:15 ` Yosry Ahmed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z4mJpu3MvBeL4d1Q@google.com \
--to=seanjc@google.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.