* [PATCH] KVM: x86: Expose ARCH_CAP_FB_CLEAR when invulnerable to MDS
@ 2025-04-01 4:49 Jon Kohler
2025-04-02 13:36 ` Sean Christopherson
From: Jon Kohler @ 2025-04-01 4:49 UTC
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, kvm,
linux-kernel
Cc: Jon Kohler, Emanuele Giuseppe Esposito, Pawan Gupta

Expose FB_CLEAR in arch_capabilities for certain MDS-invulnerable cases
to support live migration from older hardware (e.g., Cascade Lake, Ice
Lake) to newer hardware (e.g., Sapphire Rapids or higher). This ensures
compatibility when user space has previously configured vCPUs to see
FB_CLEAR (ARCH_CAPABILITIES Bit 17).

Newer hardware sets the following bits but does not set FB_CLEAR, which
can prevent user space from configuring a matching setup:
    ARCH_CAP_MDS_NO
    ARCH_CAP_TAA_NO
    ARCH_CAP_PSDP_NO
    ARCH_CAP_FBSDP_NO
    ARCH_CAP_SBDR_SSDP_NO

This change has minimal impact, as these bit combinations already mark
the host as MMIO immune (via arch_cap_mmio_immune()) and set
disable_fb_clear in vmx_update_fb_clear_dis(), resulting in no
additional overhead.

Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 arch/x86/kvm/x86.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c841817a914a..2a4337aa78cd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1641,6 +1641,20 @@ static u64 kvm_get_arch_capabilities(void)
 	if (!boot_cpu_has_bug(X86_BUG_GDS) || gds_ucode_mitigated())
 		data |= ARCH_CAP_GDS_NO;
 
+	/*
+	 * User space might set FB_CLEAR when starting a vCPU on a system
+	 * that does not enumerate FB_CLEAR but is also invulnerable to
+	 * other various MDS related bugs. To allow live migration from
+	 * hosts that do implement FB_CLEAR, leave it enabled.
+	 */
+	if ((data & ARCH_CAP_MDS_NO) &&
+	    (data & ARCH_CAP_TAA_NO) &&
+	    (data & ARCH_CAP_PSDP_NO) &&
+	    (data & ARCH_CAP_FBSDP_NO) &&
+	    (data & ARCH_CAP_SBDR_SSDP_NO)) {
+		data |= ARCH_CAP_FB_CLEAR;
+	}
+
 	return data;
 }

--
2.43.0
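
Editorial sketch: the condition in the hunk above can be tried outside the kernel as a small pure function. This is only an illustration, not the patch itself. The bit positions are assumed to match the kernel's arch/x86/include/asm/msr-index.h (the thread itself only confirms that FB_CLEAR is bit 17), and the names FB_CLEAR_IMMUNE_MASK and expose_fb_clear_if_immune() are hypothetical.

```c
#include <stdint.h>

/* IA32_ARCH_CAPABILITIES bits; positions assumed from msr-index.h. */
#define ARCH_CAP_MDS_NO        (1ULL << 5)
#define ARCH_CAP_TAA_NO        (1ULL << 8)
#define ARCH_CAP_SBDR_SSDP_NO  (1ULL << 13)
#define ARCH_CAP_FBSDP_NO      (1ULL << 14)
#define ARCH_CAP_PSDP_NO       (1ULL << 15)
#define ARCH_CAP_FB_CLEAR      (1ULL << 17)

/* Hypothetical name: the five "not affected" bits the patch tests. */
#define FB_CLEAR_IMMUNE_MASK \
	(ARCH_CAP_MDS_NO | ARCH_CAP_TAA_NO | ARCH_CAP_PSDP_NO | \
	 ARCH_CAP_FBSDP_NO | ARCH_CAP_SBDR_SSDP_NO)

/*
 * Hypothetical helper: return the value with FB_CLEAR added only when
 * all five immunity bits are already set; otherwise return it unchanged.
 */
static uint64_t expose_fb_clear_if_immune(uint64_t data)
{
	if ((data & FB_CLEAR_IMMUNE_MASK) == FB_CLEAR_IMMUNE_MASK)
		data |= ARCH_CAP_FB_CLEAR;

	return data;
}
```

Comparing against a single mask is equivalent to the five chained tests in the patch; the mask form is used here only to keep the sketch short.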

* Re: [PATCH] KVM: x86: Expose ARCH_CAP_FB_CLEAR when invulnerable to MDS
From: Sean Christopherson @ 2025-04-02 13:36 UTC
To: Jon Kohler
Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, kvm, linux-kernel,
	Emanuele Giuseppe Esposito, Pawan Gupta

On Mon, Mar 31, 2025, Jon Kohler wrote:
> Expose FB_CLEAR in arch_capabilities for certain MDS-invulnerable cases
> to support live migration from older hardware (e.g., Cascade Lake, Ice
> Lake) to newer hardware (e.g., Sapphire Rapids or higher). This ensures
> compatibility when user space has previously configured vCPUs to see
> FB_CLEAR (ARCH_CAPABILITIES Bit 17).
>
> Newer hardware sets the following bits but does not set FB_CLEAR, which
> can prevent user space from configuring a matching setup:

I looked at this again right after PUCK, and KVM does NOT actually prevent
userspace from matching the original, pre-SPR configuration. KVM effectively
treats ARCH_CAPABILITIES like a CPUID leaf, and lets userspace shove in any
value. I.e. userspace can still migrate+stuff FB_CLEAR irrespective of hardware
support, and thus there is no need for KVM to lie to userspace.

So in effect, this is a userspace problem where it's being too aggressive in its
sanity checks.

FWIW, even if KVM did reject unsupported ARCH_CAPABILITIES bits, I would still
say this is userspace's problem to solve. E.g. by using MSR filtering to
intercept and emulate RDMSR(ARCH_CAPABILITIES) in userspace.

* Re: [PATCH] KVM: x86: Expose ARCH_CAP_FB_CLEAR when invulnerable to MDS
From: Jon Kohler @ 2025-04-02 13:46 UTC
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86@kernel.org, H. Peter Anvin, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, Emanuele Giuseppe Esposito,
	Pawan Gupta

> On Apr 2, 2025, at 9:36 AM, Sean Christopherson <seanjc@google.com> wrote:
>
> I looked at this again right after PUCK, and KVM does NOT actually prevent
> userspace from matching the original, pre-SPR configuration. KVM effectively
> treats ARCH_CAPABILITIES like a CPUID leaf, and lets userspace shove in any
> value. I.e. userspace can still migrate+stuff FB_CLEAR irrespective of hardware
> support, and thus there is no need for KVM to lie to userspace.
>
> So in effect, this is a userspace problem where it's being too aggressive in its
> sanity checks.
>
> FWIW, even if KVM did reject unsupported ARCH_CAPABILITIES bits, I would still
> say this is userspace's problem to solve. E.g. by using MSR filtering to
> intercept and emulate RDMSR(ARCH_CAPABILITIES) in userspace.

Thanks, Sean, I appreciate it.

I’ll see what sort of trouble I can get in on the user space side of the
house with qemu to see if there is a clean way to special case this.

Cheers,
Jon
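
Editorial sketch: one possible shape for the user-space special case discussed in this thread is to relax the "guest bits must be a subset of host bits" sanity check for FB_CLEAR alone, and only when the destination host reports all five immunity bits. This is not QEMU code and not something proposed in the thread. The function name guest_arch_caps_ok(), the IMMUNE_MASK macro, and the policy itself are assumptions made for illustration; bit positions are assumed from the kernel's msr-index.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_ARCH_CAPABILITIES bits; positions assumed from msr-index.h. */
#define ARCH_CAP_MDS_NO        (1ULL << 5)
#define ARCH_CAP_TAA_NO        (1ULL << 8)
#define ARCH_CAP_SBDR_SSDP_NO  (1ULL << 13)
#define ARCH_CAP_FBSDP_NO      (1ULL << 14)
#define ARCH_CAP_PSDP_NO       (1ULL << 15)
#define ARCH_CAP_FB_CLEAR      (1ULL << 17)

/* Hypothetical name for the five "not affected" bits. */
#define IMMUNE_MASK \
	(ARCH_CAP_MDS_NO | ARCH_CAP_TAA_NO | ARCH_CAP_PSDP_NO | \
	 ARCH_CAP_FBSDP_NO | ARCH_CAP_SBDR_SSDP_NO)

/*
 * Hypothetical user-space policy check.
 * guest: ARCH_CAPABILITIES value the guest saw on the source host.
 * host:  value the destination host reports as supported.
 * Returns true if the guest value may be kept on the destination.
 */
static bool guest_arch_caps_ok(uint64_t guest, uint64_t host)
{
	/* Bits the guest has seen that the destination does not report. */
	uint64_t missing = guest & ~host;

	/* Tolerate a missing FB_CLEAR only on a fully immune destination. */
	if ((host & IMMUNE_MASK) == IMMUNE_MASK)
		missing &= ~ARCH_CAP_FB_CLEAR;

	return missing == 0;
}
```

If such a check passes, user space would then write the unchanged guest value into the vCPU, which per Sean's reply KVM accepts regardless of hardware support.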