public inbox for linux-kernel@vger.kernel.org
From: Sean Christopherson <seanjc@google.com>
To: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>,
	Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 01/13] KVM: nSVM: Track the ASID per-VMCB
Date: Mon, 3 Mar 2025 10:53:34 -0800
Message-ID: <Z8X6rtIwlTtu5rHx@google.com>
In-Reply-To: <CALMp9eSGRLMj-a_ZrzzeLx_jgAea13-to=ZPHu3F+trQq28YjA@mail.gmail.com>

On Mon, Mar 03, 2025, Jim Mattson wrote:
> On Fri, Feb 28, 2025 at 4:03 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > +Jim, for his input on VPIDs.
> >
> > On Wed, Feb 05, 2025, Yosry Ahmed wrote:
> > > The ASID is currently tracked per-vCPU, because the same ASID is used by
> > > L1 and L2. That ASID is flushed on every transition between L1 and L2.
> > >
> > > Track the ASID separately for each VMCB (similar to the
> > > asid_generation), giving L2 a separate ASID. This is in preparation for
> > > doing fine-grained TLB flushes on nested transitions instead of
> > > unconditional full flushes.
> >
> > After having some time to think about this, rather than track ASIDs per VMCB, I
> > think we should converge on a single approach for nVMX (VPID) and nSVM (ASID).
> >
> > Per **VM**, one VPID/ASID for L1, and one VPID/ASID for L2.
> 
> When using EPT on VMX, there is probably no advantage to using one
> VPID per VM. The physical ASID is determined by <EPTRTA, VPID, PCID>.
> Two different VMs are not going to share an EPTRTA, so they already
> have different ASIDs, even if they have the same VPID.

For posterity, while the SDM says this:

  Linear mappings may be created. They are derived from the paging structures
  referenced (directly or indirectly) by the current value of CR3 and are associated
  with the current VPID and the current PCID.

it explicitly disallows creating or using linear mappings when EPT is enabled:

  No linear mappings are created while EPT is in use.

  no linear mappings are used while EPT is in use.

I think it's still worth assigning a unique VPID though, e.g. it would provide
some amount of defense in depth.  I.e. two different VMs *shouldn't* share an
EPTRTA :-)
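
To make the defense-in-depth point concrete, here is a toy Python model of a
TLB whose entries are tagged with the <EPTRTA, VPID, PCID> triple Jim describes.
Everything in it (Tag, ToyTlb, the addresses) is made up for illustration; it is
not KVM code and not a description of how any real CPU implements its TLB.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    """Tag on a cached mapping, following the <EPTRTA, VPID, PCID> triple."""
    eptrta: int  # EPT root table address
    vpid: int
    pcid: int


class ToyTlb:
    """Hypothetical TLB: a lookup hits only if the full tag and address match."""

    def __init__(self):
        # (Tag, guest virtual address) -> host physical address
        self._entries = {}

    def fill(self, tag: Tag, gva: int, hpa: int) -> None:
        self._entries[(tag, gva)] = hpa

    def lookup(self, tag: Tag, gva: int) -> Optional[int]:
        return self._entries.get((tag, gva))


tlb = ToyTlb()
GVA = 0x4000

# Normal case: same VPID, different EPT roots, so there is no cross-VM hit.
vm_a = Tag(eptrta=0x1000, vpid=1, pcid=0)
vm_b = Tag(eptrta=0x2000, vpid=1, pcid=0)
tlb.fill(vm_a, GVA, 0xAAAA000)
normal_hit = tlb.lookup(vm_b, GVA)

# Bug case: VM B somehow runs with VM A's EPT root *and* the same VPID.
shared_vpid_hit = tlb.lookup(Tag(eptrta=0x1000, vpid=1, pcid=0), GVA)

# Same bug, but VM B has its own VPID: VM A's entry is not visible to it.
unique_vpid_hit = tlb.lookup(Tag(eptrta=0x1000, vpid=2, pcid=0), GVA)
```

In this model the shared-VPID lookup hits VM A's entry, while the unique-VPID
lookup misses, which is the only thing the unique VPID buys.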

> > For SVM, the dynamic ASID crud is a holdover from KVM's support for CPUs that
> > don't support FLUSHBYASID, i.e. needed to purge the entire TLB in order to flush
> > guest mappings.  FLUSHBYASID was added in 2010, and AFAIK has been supported by
> > all AMD CPUs since.
> >
> > KVM already mostly keeps the same ASID, except for when a vCPU is migrated, in
> > which case KVM assigns a new ASID.  I suspect that following VMX's lead and
> > simply doing a TLB flush in this situation would be an improvement for modern
> > CPUs, as it would flush the entries that need to be flushed, and not pollute the
> > TLBs with stale, unused entries.
> >
> > Using a static per-VM ASID would also allow using broadcast invalidations[*],
> > would simplify the SVM code base, and I think/hope would allow us to move much
> > of the TLB flushing logic, e.g. for task migration, to common code.
> >
> > For VPIDs, maybe it's because it's Friday afternoon, but for the life of me I
> > can't think of any reason why KVM needs to assign VPIDs per vCPU.  Especially
> > since KVM is ridiculously conservative and flushes _all_ EPT/VPID contexts when
> > running a different vCPU on a pCPU (which I suspect we can trim down?).
> >
> > Am I forgetting something?
> 
> TDX? IIRC, TDX requires a unique VPID for each vCPU in a VM.

Ha!  Nope, the TDX module actually does what I'm suggesting, and uses a per-VM
VPID.  So if I'm forgetting some TLB edge case, TDX is already hosed.

FWIW, the hypervisor, i.e. KVM, has no control over the VPID used by the TDX
module.  Intel incorporated SEAM mode into the ASID tag to prevent TLB collisions
between the hypervisor and the TDX module, and that also conveniently provides
separation between VPIDs for non-TDX VMs and TDX VMs (and now I'm curious if TDX
enabling does the "right" thing and skips VPID allocation).

TDX's scheme would also match what I'm proposing almost exactly.  TDX "composes"
the VPID using the HKID (guaranteed unique per VM) and then a "VM identifier",
which at a glance differentiates L1 from L2.
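
As a rough sketch of what "composing" a per-VM VPID could look like, the Python
below packs a per-VM key ID and a small VM identifier into a 16-bit VPID.  The
bit layout, VM_ID_BITS, and compose_vpid() are assumptions for illustration
only; I have not copied the TDX module's actual encoding.

```python
VPID_BITS = 16   # VPIDs are 16 bits wide
VM_ID_BITS = 2   # hypothetical width of the L1-vs-L2 "VM identifier"


def compose_vpid(hkid: int, vm_id: int) -> int:
    """Pack a per-VM key ID and a VM identifier into one VPID.

    Hypothetical layout: HKID in the upper bits, VM identifier in the low
    bits.  VPID 0 is rejected because it is not usable for a guest.
    """
    if not 0 <= vm_id < (1 << VM_ID_BITS):
        raise ValueError("vm_id does not fit in VM_ID_BITS")
    if hkid < 0:
        raise ValueError("hkid must be non-negative")
    vpid = (hkid << VM_ID_BITS) | vm_id
    if vpid == 0 or vpid >= (1 << VPID_BITS):
        raise ValueError("composed VPID is out of range")
    return vpid


# Every vCPU of a given VM would use the same two values:
l1_vpid = compose_vpid(hkid=5, vm_id=0)        # L1 context of VM with HKID 5
l2_vpid = compose_vpid(hkid=5, vm_id=1)        # L2 context of the same VM
other_vm_vpid = compose_vpid(hkid=6, vm_id=0)  # L1 context of a different VM
```

The point is only that uniqueness falls out of the inputs: a unique per-VM ID
plus an L1/L2 discriminator gives one VPID for L1 and one for L2, per VM, with
no per-vCPU allocation.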

> > [*] https://lore.kernel.org/all/Z8HdBg3wj8M7a4ts@google.com
