* [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
This is a combination of Brendan's work to unify the L1TF L1D flushing
mitigation, and Pawan's work to bring some sanity to the mitigations that
clear CPU buffers, with a bunch of glue code and some polishing from me.
The "v4" is relative to the L1TF series. I smushed the two series together
as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
handling the series separately would have been a lot of work+churn for no
real benefit.
TL;DR:
- Unify L1TF flushing under per-CPU variable
- Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
- Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
VERW, tweak the low-level macros in nospec-branch.h to define the instruction
sequence, and then wrap it with __stringify() as needed.
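Concretely, the idea (this is essentially what patch 2 does) is to define the
raw sequence once and stringify it at the use site:

  #define CLEAR_CPU_BUFFERS_SEQ	verw x86_verw_sel(%rip)
  #define __CLEAR_CPU_BUFFERS	__stringify(CLEAR_CPU_BUFFERS_SEQ)

  #define CLEAR_CPU_BUFFERS \
	ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF

so that the non-stringified CLEAR_CPU_BUFFERS_SEQ can also be embedded in
larger __stringify()'d blobs handed to ALTERNATIVE_2.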
The non-VMX code is lightly tested (but there's far less chance for breakage
there). For the VMX code, I verified it does what I want (which may or may
not be correct :-D) by hacking the code to force/clear various mitigations, and
using ud2 to confirm the right path got selected.
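The ud2 trick was along these lines (hypothetical debug-only hack, not part of
the series): drop a ud2 into the arm that's expected to be patched in, and see
whether the #UD actually fires, e.g.

	/* Debug hack: #UD if and only if the MMIO-only arm is selected. */
	ALTERNATIVE_2 "", \
		__stringify(ud2), X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
		"", X86_FEATURE_CLEAR_CPU_BUF_VM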
v4:
- Drop the patch to fall back to handling the MMIO mitigation if
vmx_l1d_flush() doesn't flush, and instead use Pawan's approach of
decoupling the two entirely.
- Replace the static branch with X86_FEATURE_CLEAR_CPU_BUF_MMIO so that
it can be referenced in ALTERNATIVE macros.
- Decouple X86_FEATURE_CLEAR_CPU_BUF_VM from X86_FEATURE_CLEAR_CPU_BUF_MMIO
(though they still interact and can both be set)
v3:
- https://lore.kernel.org/all/20251016200417.97003-1-seanjc@google.com
- [Pawan's series] https://lore.kernel.org/all/20251029-verw-vm-v1-0-babf9b961519@linux.intel.com
- Put the "raw" variant in KVM, dress it up with KVM's "request" terminology,
and add a comment explaining why _KVM_ knows its usage doesn't need to
disable preemption.
- Add the prep patches.
v2:
- https://lore.kernel.org/all/20251015-b4-l1tf-percpu-v2-1-6d7a8d3d40e9@google.com
- Moved the bit back to irq_stat
- Fixed DEBUG_PREEMPT issues by adding a _raw variant
v1: https://lore.kernel.org/r/20251013-b4-l1tf-percpu-v1-1-d65c5366ea1a@google.com
Brendan Jackman (1):
KVM: x86: Unify L1TF flushing under per-CPU variable
Pawan Gupta (1):
x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
Sean Christopherson (6):
x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data
mitigation
KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via
ALTERNATIVES_2
x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as
SVM_CLEAR_CPU_BUFFERS
KVM: VMX: Bundle all L1 data cache flush mitigation code together
KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/hardirq.h | 4 +-
arch/x86/include/asm/kvm_host.h | 3 -
arch/x86/include/asm/nospec-branch.h | 24 +--
arch/x86/kernel/cpu/bugs.c | 18 +-
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/mmu/spte.c | 2 +-
arch/x86/kvm/svm/vmenter.S | 6 +-
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/vmenter.S | 14 +-
arch/x86/kvm/vmx/vmx.c | 235 ++++++++++++++-------------
arch/x86/kvm/x86.c | 6 +-
arch/x86/kvm/x86.h | 14 ++
13 files changed, 178 insertions(+), 153 deletions(-)
base-commit: 4cc167c50eb19d44ac7e204938724e685e3d8057
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
TSA mitigation:
d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. On Intel,
CLEAR_CPU_BUFFERS is currently used for guests as well, even though it has a
much broader scope (it also covers kernel->user transitions).
Make the mitigations on Intel consistent with TSA. This will help handle
guest-only mitigations better in the future.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
[sean: make CLEAR_CPU_BUF_VM mutually exclusive with the MMIO mitigation]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kernel/cpu/bugs.c | 9 +++++++--
arch/x86/kvm/vmx/vmenter.S | 2 +-
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6a526ae1fe99..723666a1357e 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -194,7 +194,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
/*
* Controls CPU Fill buffer clear before VMenter. This is a subset of
- * X86_FEATURE_CLEAR_CPU_BUF, and should only be enabled when KVM-only
+ * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
* mitigation is required.
*/
DEFINE_STATIC_KEY_FALSE(cpu_buf_vm_clear);
@@ -536,6 +536,7 @@ static void __init mds_apply_mitigation(void)
if (mds_mitigation == MDS_MITIGATION_FULL ||
mds_mitigation == MDS_MITIGATION_VMWERV) {
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
+ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) &&
(mds_nosmt || smt_mitigations == SMT_MITIGATIONS_ON))
cpu_smt_disable(false);
@@ -647,6 +648,7 @@ static void __init taa_apply_mitigation(void)
* present on host, enable the mitigation for UCODE_NEEDED as well.
*/
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
+ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
if (taa_nosmt || smt_mitigations == SMT_MITIGATIONS_ON)
cpu_smt_disable(false);
@@ -748,6 +750,7 @@ static void __init mmio_apply_mitigation(void)
*/
if (verw_clear_cpu_buf_mitigation_selected) {
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
+ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
static_branch_disable(&cpu_buf_vm_clear);
} else {
static_branch_enable(&cpu_buf_vm_clear);
@@ -839,8 +842,10 @@ static void __init rfds_update_mitigation(void)
static void __init rfds_apply_mitigation(void)
{
- if (rfds_mitigation == RFDS_MITIGATION_VERW)
+ if (rfds_mitigation == RFDS_MITIGATION_VERW) {
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
+ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
+ }
}
static __init int rfds_parse_cmdline(char *str)
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index bc255d709d8a..1f99a98a16a2 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -161,7 +161,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
mov VCPU_RAX(%_ASM_AX), %_ASM_AX
/* Clobbers EFLAGS.ZF */
- CLEAR_CPU_BUFFERS
+ VM_CLEAR_CPU_BUFFERS
/* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */
jnc .Lvmlaunch
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
and "clear if guest can access host MMIO" in a single statement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/nospec-branch.h | 21 ++++++++++-----------
1 file changed, 10 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 08ed5a2e46a5..923ae21cbef1 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -308,24 +308,23 @@
* CFLAGS.ZF.
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
-.macro __CLEAR_CPU_BUFFERS feature
#ifdef CONFIG_X86_64
- ALTERNATIVE "", "verw x86_verw_sel(%rip)", \feature
+#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
#else
- /*
- * In 32bit mode, the memory operand must be a %cs reference. The data
- * segments may not be usable (vm86 mode), and the stack segment may not
- * be flat (ESPFIX32).
- */
- ALTERNATIVE "", "verw %cs:x86_verw_sel", \feature
+/*
+ * In 32bit mode, the memory operand must be a %cs reference. The data segments
+ * may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
+ */
+#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
#endif
-.endm
+
+#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
#define CLEAR_CPU_BUFFERS \
- __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
+ ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
#define VM_CLEAR_CPU_BUFFERS \
- __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF_VM
+ ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Convert the MMIO Stale Data mitigation flag from a static branch into an
X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/nospec-branch.h | 2 --
arch/x86/kernel/cpu/bugs.c | 11 +----------
arch/x86/kvm/mmu/spte.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 4 ++--
5 files changed, 5 insertions(+), 15 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 7129eb44adad..d1d7b5ec6425 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -501,6 +501,7 @@
#define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
#define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
#define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
+#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO*/
/*
* BUG word(s)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 923ae21cbef1..b29df45b1edb 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -579,8 +579,6 @@ DECLARE_STATIC_KEY_FALSE(cpu_buf_idle_clear);
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
-DECLARE_STATIC_KEY_FALSE(cpu_buf_vm_clear);
-
extern u16 x86_verw_sel;
#include <asm/segment.h>
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 723666a1357e..9acf6343b0ac 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -192,14 +192,6 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
*/
DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
-/*
- * Controls CPU Fill buffer clear before VMenter. This is a subset of
- * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
- * mitigation is required.
- */
-DEFINE_STATIC_KEY_FALSE(cpu_buf_vm_clear);
-EXPORT_SYMBOL_GPL(cpu_buf_vm_clear);
-
#undef pr_fmt
#define pr_fmt(fmt) "mitigations: " fmt
@@ -751,9 +743,8 @@ static void __init mmio_apply_mitigation(void)
if (verw_clear_cpu_buf_mitigation_selected) {
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
- static_branch_disable(&cpu_buf_vm_clear);
} else {
- static_branch_enable(&cpu_buf_vm_clear);
+ setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_MMIO);
}
/*
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 37647afde7d3..c43dd153d868 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -292,7 +292,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
mark_page_dirty_in_slot(vcpu->kvm, slot, gfn);
}
- if (static_branch_unlikely(&cpu_buf_vm_clear) &&
+ if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
!kvm_vcpu_can_access_host_mmio(vcpu) &&
kvm_is_mmio_pfn(pfn, &is_host_mmio))
kvm_track_host_mmio_mapping(vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1021d3b65ea0..68cde725d1c7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -903,7 +903,7 @@ unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx)
if (!msr_write_intercepted(vmx, MSR_IA32_SPEC_CTRL))
flags |= VMX_RUN_SAVE_SPEC_CTRL;
- if (static_branch_unlikely(&cpu_buf_vm_clear) &&
+ if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
@@ -7351,7 +7351,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
*/
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
- else if (static_branch_unlikely(&cpu_buf_vm_clear) &&
+ else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
(flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
x86_clear_cpu_buffers();
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
immediately prior to VM-Enter, i.e. in the same location that KVM emits a
VERW for unconditional (at runtime) clearing. Co-locating the code and
using a single ALTERNATIVE_2 makes it more obvious how VMX mitigates the
various vulnerabilities.
Deliberately order the alternatives as:
0. Do nothing
1. Clear if vCPU can access MMIO
2. Clear always
since the last alternative wins in ALTERNATIVE_2(), i.e. so that KVM will
honor the strictest mitigation (always clear CPU buffers) if multiple
mitigations are selected. E.g. even if the kernel chooses to mitigate
MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
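Roughly, the resulting VM-Enter behavior is (C-equivalent sketch, with
clear_cpu_buffers() standing in for the VERW sequence):

	if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
		clear_cpu_buffers();	/* always clear */
	else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
		 (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
		clear_cpu_buffers();	/* only for MMIO-capable vCPUs */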
Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
mitigation is enabled but no other "clear CPU buffers" mitigation is.
For that specific scenario, KVM would skip clearing CPU buffers for the
MMIO mitigation even though the kernel requested a clear on every VM-Enter.
Note #2, the flaw goes back to the introduction of the MDS mitigation. The
MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
Move VERW closer to VMentry for MDS mitigation"), but previous kernels
that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
unlikely the flaw is meaningfully exploitable even on older kernels).
Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmenter.S | 14 +++++++++++++-
arch/x86/kvm/vmx/vmx.c | 13 -------------
2 files changed, 13 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 1f99a98a16a2..61a809790a58 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -71,6 +71,7 @@
* @regs: unsigned long * (to guest registers)
* @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
* VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
+ * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
*
* Returns:
* 0 on VM-Exit, 1 on VM-Fail
@@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
/* Load @regs to RAX. */
mov (%_ASM_SP), %_ASM_AX
+ /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
+ ALTERNATIVE_2 "", \
+ __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
+ X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
+ "", X86_FEATURE_CLEAR_CPU_BUF_VM
+
/* Check if vmlaunch or vmresume is needed */
bt $VMX_RUN_VMRESUME_SHIFT, %ebx
@@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
mov VCPU_RAX(%_ASM_AX), %_ASM_AX
/* Clobbers EFLAGS.ZF */
- VM_CLEAR_CPU_BUFFERS
+ ALTERNATIVE_2 "", \
+ __stringify(jz .Lskip_clear_cpu_buffers; \
+ CLEAR_CPU_BUFFERS_SEQ; \
+ .Lskip_clear_cpu_buffers:), \
+ X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
+ __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
/* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */
jnc .Lvmlaunch
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 68cde725d1c7..5af2338c7cb8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7339,21 +7339,8 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
guest_state_enter_irqoff();
- /*
- * L1D Flush includes CPU buffer clear to mitigate MDS, but VERW
- * mitigation for MDS is done late in VMentry and is still
- * executed in spite of L1D Flush. This is because an extra VERW
- * should not matter much after the big hammer L1D Flush.
- *
- * cpu_buf_vm_clear is used when system is not vulnerable to MDS/TAA,
- * and is affected by MMIO Stale Data. In such cases mitigation in only
- * needed against an MMIO capable guest.
- */
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
- else if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
- (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
- x86_clear_cpu_buffers();
vmx_disable_fb_clear(vmx);
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Now that VMX encodes its own sequence for clearing CPU buffers, move
VM_CLEAR_CPU_BUFFERS into SVM to minimize the chances of KVM botching a
mitigation in the future, e.g. using VM_CLEAR_CPU_BUFFERS instead of
checking multiple mitigation flags.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/nospec-branch.h | 3 ---
arch/x86/kvm/svm/vmenter.S | 6 ++++--
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index b29df45b1edb..88fe40d6949a 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,9 +323,6 @@
#define CLEAR_CPU_BUFFERS \
ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
-#define VM_CLEAR_CPU_BUFFERS \
- ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
-
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP
diff --git a/arch/x86/kvm/svm/vmenter.S b/arch/x86/kvm/svm/vmenter.S
index 235c4af6b692..da5f481cb17e 100644
--- a/arch/x86/kvm/svm/vmenter.S
+++ b/arch/x86/kvm/svm/vmenter.S
@@ -92,6 +92,8 @@
jmp 901b
.endm
+#define SVM_CLEAR_CPU_BUFFERS \
+ ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
/**
* __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
@@ -170,7 +172,7 @@ SYM_FUNC_START(__svm_vcpu_run)
mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Clobbers EFLAGS.ZF */
- VM_CLEAR_CPU_BUFFERS
+ SVM_CLEAR_CPU_BUFFERS
/* Enter guest mode */
3: vmrun %_ASM_AX
@@ -339,7 +341,7 @@ SYM_FUNC_START(__svm_sev_es_vcpu_run)
mov KVM_VMCB_pa(%rax), %rax
/* Clobbers EFLAGS.ZF */
- VM_CLEAR_CPU_BUFFERS
+ SVM_CLEAR_CPU_BUFFERS
/* Enter guest mode */
1: vmrun %rax
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 6/8] KVM: VMX: Bundle all L1 data cache flush mitigation code together
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Move vmx_l1d_flush(), vmx_cleanup_l1d_flush(), and the vmentry_l1d_flush
param code up in vmx.c so that all of the L1 data cache flushing code is
bundled together. This will allow conditioning the mitigation code on
CONFIG_CPU_MITIGATIONS=y with minimal #ifdefs.
No functional change intended.
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/vmx.c | 174 ++++++++++++++++++++---------------------
1 file changed, 87 insertions(+), 87 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5af2338c7cb8..55962146fc34 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -302,6 +302,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
return 0;
}
+static void vmx_cleanup_l1d_flush(void)
+{
+ if (vmx_l1d_flush_pages) {
+ free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
+ vmx_l1d_flush_pages = NULL;
+ }
+ /* Restore state so sysfs ignores VMX */
+ l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+
static int vmentry_l1d_flush_parse(const char *s)
{
unsigned int i;
@@ -352,6 +362,83 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
return sysfs_emit(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
}
+/*
+ * Software based L1D cache flush which is used when microcode providing
+ * the cache control MSR is not loaded.
+ *
+ * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
+ * flush it is required to read in 64 KiB because the replacement algorithm
+ * is not exactly LRU. This could be sized at runtime via topology
+ * information but as all relevant affected CPUs have 32KiB L1D cache size
+ * there is no point in doing so.
+ */
+static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+ int size = PAGE_SIZE << L1D_CACHE_ORDER;
+
+ /*
+ * This code is only executed when the flush mode is 'cond' or
+ * 'always'
+ */
+ if (static_branch_likely(&vmx_l1d_flush_cond)) {
+ bool flush_l1d;
+
+ /*
+ * Clear the per-vcpu flush bit, it gets set again if the vCPU
+ * is reloaded, i.e. if the vCPU is scheduled out or if KVM
+ * exits to userspace, or if KVM reaches one of the unsafe
+ * VMEXIT handlers, e.g. if KVM calls into the emulator.
+ */
+ flush_l1d = vcpu->arch.l1tf_flush_l1d;
+ vcpu->arch.l1tf_flush_l1d = false;
+
+ /*
+ * Clear the per-cpu flush bit, it gets set again from
+ * the interrupt handlers.
+ */
+ flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+ kvm_clear_cpu_l1tf_flush_l1d();
+
+ if (!flush_l1d)
+ return;
+ }
+
+ vcpu->stat.l1d_flush++;
+
+ if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
+ native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+ return;
+ }
+
+ asm volatile(
+ /* First ensure the pages are in the TLB */
+ "xorl %%eax, %%eax\n"
+ ".Lpopulate_tlb:\n\t"
+ "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+ "addl $4096, %%eax\n\t"
+ "cmpl %%eax, %[size]\n\t"
+ "jne .Lpopulate_tlb\n\t"
+ "xorl %%eax, %%eax\n\t"
+ "cpuid\n\t"
+ /* Now fill the cache */
+ "xorl %%eax, %%eax\n"
+ ".Lfill_cache:\n"
+ "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+ "addl $64, %%eax\n\t"
+ "cmpl %%eax, %[size]\n\t"
+ "jne .Lfill_cache\n\t"
+ "lfence\n"
+ :: [flush_pages] "r" (vmx_l1d_flush_pages),
+ [size] "r" (size)
+ : "eax", "ebx", "ecx", "edx");
+}
+
+static const struct kernel_param_ops vmentry_l1d_flush_ops = {
+ .set = vmentry_l1d_flush_set,
+ .get = vmentry_l1d_flush_get,
+};
+module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
+
static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
{
u64 msr;
@@ -404,12 +491,6 @@ static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
vmx->disable_fb_clear = false;
}
-static const struct kernel_param_ops vmentry_l1d_flush_ops = {
- .set = vmentry_l1d_flush_set,
- .get = vmentry_l1d_flush_get,
-};
-module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
-
static u32 vmx_segment_access_rights(struct kvm_segment *var);
void vmx_vmexit(void);
@@ -6672,77 +6753,6 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
return ret;
}
-/*
- * Software based L1D cache flush which is used when microcode providing
- * the cache control MSR is not loaded.
- *
- * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
- * flush it is required to read in 64 KiB because the replacement algorithm
- * is not exactly LRU. This could be sized at runtime via topology
- * information but as all relevant affected CPUs have 32KiB L1D cache size
- * there is no point in doing so.
- */
-static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
-{
- int size = PAGE_SIZE << L1D_CACHE_ORDER;
-
- /*
- * This code is only executed when the flush mode is 'cond' or
- * 'always'
- */
- if (static_branch_likely(&vmx_l1d_flush_cond)) {
- bool flush_l1d;
-
- /*
- * Clear the per-vcpu flush bit, it gets set again if the vCPU
- * is reloaded, i.e. if the vCPU is scheduled out or if KVM
- * exits to userspace, or if KVM reaches one of the unsafe
- * VMEXIT handlers, e.g. if KVM calls into the emulator.
- */
- flush_l1d = vcpu->arch.l1tf_flush_l1d;
- vcpu->arch.l1tf_flush_l1d = false;
-
- /*
- * Clear the per-cpu flush bit, it gets set again from
- * the interrupt handlers.
- */
- flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
- kvm_clear_cpu_l1tf_flush_l1d();
-
- if (!flush_l1d)
- return;
- }
-
- vcpu->stat.l1d_flush++;
-
- if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
- native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
- return;
- }
-
- asm volatile(
- /* First ensure the pages are in the TLB */
- "xorl %%eax, %%eax\n"
- ".Lpopulate_tlb:\n\t"
- "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
- "addl $4096, %%eax\n\t"
- "cmpl %%eax, %[size]\n\t"
- "jne .Lpopulate_tlb\n\t"
- "xorl %%eax, %%eax\n\t"
- "cpuid\n\t"
- /* Now fill the cache */
- "xorl %%eax, %%eax\n"
- ".Lfill_cache:\n"
- "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
- "addl $64, %%eax\n\t"
- "cmpl %%eax, %[size]\n\t"
- "jne .Lfill_cache\n\t"
- "lfence\n"
- :: [flush_pages] "r" (vmx_l1d_flush_pages),
- [size] "r" (size)
- : "eax", "ebx", "ecx", "edx");
-}
-
void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
{
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
@@ -8677,16 +8687,6 @@ __init int vmx_hardware_setup(void)
return r;
}
-static void vmx_cleanup_l1d_flush(void)
-{
- if (vmx_l1d_flush_pages) {
- free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
- vmx_l1d_flush_pages = NULL;
- }
- /* Restore state so sysfs ignores VMX */
- l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
-}
-
void vmx_exit(void)
{
allow_smaller_maxphyaddr = false;
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 7/8] KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Disable support for flushing the L1 data cache to mitigate L1TF if CPU
mitigations are disabled for the entire kernel. KVM's mitigation of L1TF
is in no way special enough to justify ignoring CONFIG_CPU_MITIGATIONS=n.
Deliberately use CPU_MITIGATIONS instead of the more precise
MITIGATION_L1TF, as MITIGATION_L1TF only controls the default behavior,
i.e. CONFIG_MITIGATION_L1TF=n doesn't completely disable L1TF mitigations
in the kernel.
Keep the vmentry_l1d_flush module param to avoid breaking existing setups,
and leverage the .set path to alert the user to the fact that
vmentry_l1d_flush will be ignored. Don't bother validating the incoming
value; if an admin misconfigures vmentry_l1d_flush, the fact that the bad
configuration won't be detected when running with CONFIG_CPU_MITIGATIONS=n
is likely the least of their worries.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/hardirq.h | 4 +--
arch/x86/kvm/vmx/vmx.c | 56 ++++++++++++++++++++++++++--------
2 files changed, 46 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index f00c09ffe6a9..6b6d472baa0b 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -5,7 +5,7 @@
#include <linux/threads.h>
typedef struct {
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
u8 kvm_cpu_l1tf_flush_l1d;
#endif
unsigned int __nmi_count; /* arch dependent */
@@ -68,7 +68,7 @@ extern u64 arch_irq_stat(void);
DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
#define local_softirq_pending_ref __softirq_pending
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
/*
* This function is called from noinstr interrupt contexts
* and must be inlined to not get instrumentation.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 55962146fc34..1b5540105e4b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -203,6 +203,7 @@ module_param(pt_mode, int, S_IRUGO);
struct x86_pmu_lbr __ro_after_init vmx_lbr_caps;
+#ifdef CONFIG_CPU_MITIGATIONS
static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
static DEFINE_MUTEX(vmx_l1d_flush_mutex);
@@ -225,7 +226,7 @@ static const struct {
#define L1D_CACHE_ORDER 4
static void *vmx_l1d_flush_pages;
-static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
+static int __vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
{
struct page *page;
unsigned int i;
@@ -302,6 +303,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
return 0;
}
+static int vmx_setup_l1d_flush(void)
+{
+ /*
+ * Hand the parameter mitigation value in which was stored in the pre
+ * module init parser. If no parameter was given, it will contain
+ * 'auto' which will be turned into the default 'cond' mitigation mode.
+ */
+ return __vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+}
+
static void vmx_cleanup_l1d_flush(void)
{
if (vmx_l1d_flush_pages) {
@@ -349,7 +360,7 @@ static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
}
mutex_lock(&vmx_l1d_flush_mutex);
- ret = vmx_setup_l1d_flush(l1tf);
+ ret = __vmx_setup_l1d_flush(l1tf);
mutex_unlock(&vmx_l1d_flush_mutex);
return ret;
}
@@ -376,6 +387,9 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
{
int size = PAGE_SIZE << L1D_CACHE_ORDER;
+ if (!static_branch_unlikely(&vmx_l1d_should_flush))
+ return;
+
/*
* This code is only executed when the flush mode is 'cond' or
* 'always'
@@ -433,6 +447,31 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
: "eax", "ebx", "ecx", "edx");
}
+#else /* CONFIG_CPU_MITIGATIONS*/
+static int vmx_setup_l1d_flush(void)
+{
+ l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NEVER;
+ return 0;
+}
+static void vmx_cleanup_l1d_flush(void)
+{
+ l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+static __always_inline void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+
+}
+static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
+{
+ pr_warn_once("Kernel compiled without mitigations, ignoring vmentry_l1d_flush\n");
+ return 0;
+}
+static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
+{
+ return sysfs_emit(s, "never\n");
+}
+#endif
+
static const struct kernel_param_ops vmentry_l1d_flush_ops = {
.set = vmentry_l1d_flush_set,
.get = vmentry_l1d_flush_get,
@@ -7349,8 +7388,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
guest_state_enter_irqoff();
- if (static_branch_unlikely(&vmx_l1d_should_flush))
- vmx_l1d_flush(vcpu);
+ vmx_l1d_flush(vcpu);
vmx_disable_fb_clear(vmx);
@@ -8722,14 +8760,8 @@ int __init vmx_init(void)
if (r)
return r;
- /*
- * Must be called after common x86 init so enable_ept is properly set
- * up. Hand the parameter mitigation value in which was stored in
- * the pre module init parser. If no parameter was given, it will
- * contain 'auto' which will be turned into the default 'cond'
- * mitigation mode.
- */
- r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+ /* Must be called after common x86 init so enable_ept is setup. */
+ r = vmx_setup_l1d_flush();
if (r)
goto err_l1d_flush;
--
2.51.1.930.gacf6e81ea2-goog
* [PATCH v4 8/8] KVM: x86: Unify L1TF flushing under per-CPU variable
From: Sean Christopherson @ 2025-10-31 0:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
From: Brendan Jackman <jackmanb@google.com>
Currently, the need to flush L1D for L1TF is tracked by two bits: one
per-CPU and one per-vCPU.
The per-vCPU bit is always set when the vCPU shows up on a core, so
there is no interesting state that's truly per-vCPU. Indeed, this is a
requirement, since L1D is a part of the physical CPU.
So simplify this by combining the two bits.
The vCPU bit was being written from preemption-enabled regions. To play
nice with those cases, wrap all calls from KVM and use a raw write so that
request a flush with preemption enabled doesn't trigger what would
effectively be DEBUG_PREEMPT false positives. Preemption doesn't need to
be disabled, as kvm_arch_vcpu_load() will mark the new CPU as needing a
flush if the vCPU task is migrated, or if userspace runs the vCPU on a
different task.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
[sean: put raw write in KVM instead of in a hardirq.h variant]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/include/asm/kvm_host.h | 3 ---
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 20 +++++---------------
arch/x86/kvm/x86.c | 6 +++---
arch/x86/kvm/x86.h | 14 ++++++++++++++
6 files changed, 24 insertions(+), 23 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..fcdc65ab13d8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1055,9 +1055,6 @@ struct kvm_vcpu_arch {
/* be preempted when it's in kernel-mode(cpl=0) */
bool preempted_in_kernel;
- /* Flush the L1 Data cache for L1TF mitigation on VMENTER */
- bool l1tf_flush_l1d;
-
/* Host CPU on which VM-entry was most recently attempted */
int last_vmentry_cpu;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18d69d48bc55..4e016582adc7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4859,7 +4859,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
*/
BUILD_BUG_ON(lower_32_bits(PFERR_SYNTHETIC_MASK));
- vcpu->arch.l1tf_flush_l1d = true;
+ kvm_request_l1tf_flush_l1d();
if (!flags) {
trace_kvm_page_fault(vcpu, fault_address, error_code);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b0cd745518b4..6f2f969d19f9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3828,7 +3828,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
goto vmentry_failed;
/* Hide L1D cache contents from the nested guest. */
- vcpu->arch.l1tf_flush_l1d = true;
+ kvm_request_l1tf_flush_l1d();
/*
* Must happen outside of nested_vmx_enter_non_root_mode() as it will
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1b5540105e4b..f87af1836ea1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -395,26 +395,16 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
* 'always'
*/
if (static_branch_likely(&vmx_l1d_flush_cond)) {
- bool flush_l1d;
-
/*
- * Clear the per-vcpu flush bit, it gets set again if the vCPU
+ * Clear the per-cpu flush bit, it gets set again if the vCPU
* is reloaded, i.e. if the vCPU is scheduled out or if KVM
* exits to userspace, or if KVM reaches one of the unsafe
- * VMEXIT handlers, e.g. if KVM calls into the emulator.
+ * VMEXIT handlers, e.g. if KVM calls into the emulator,
+ * or from the interrupt handlers.
*/
- flush_l1d = vcpu->arch.l1tf_flush_l1d;
- vcpu->arch.l1tf_flush_l1d = false;
-
- /*
- * Clear the per-cpu flush bit, it gets set again from
- * the interrupt handlers.
- */
- flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+ if (!kvm_get_cpu_l1tf_flush_l1d())
+ return;
kvm_clear_cpu_l1tf_flush_l1d();
-
- if (!flush_l1d)
- return;
}
vcpu->stat.l1d_flush++;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..851f078cd5ca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5189,7 +5189,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- vcpu->arch.l1tf_flush_l1d = true;
+ kvm_request_l1tf_flush_l1d();
if (vcpu->scheduled_out && pmu->version && pmu->event_count) {
pmu->need_cleanup = true;
@@ -7999,7 +7999,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
unsigned int bytes, struct x86_exception *exception)
{
/* kvm_write_guest_virt_system can pull in tons of pages. */
- vcpu->arch.l1tf_flush_l1d = true;
+ kvm_request_l1tf_flush_l1d();
return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
PFERR_WRITE_MASK, exception);
@@ -9395,7 +9395,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
return handle_emulation_failure(vcpu, emulation_type);
}
- vcpu->arch.l1tf_flush_l1d = true;
+ kvm_request_l1tf_flush_l1d();
if (!(emulation_type & EMULTYPE_NO_DECODE)) {
kvm_clear_exception_queue(vcpu);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index f3dc77f006f9..cd67ccbb747f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -420,6 +420,20 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
return !(kvm->arch.disabled_quirks & quirk);
}
+static __always_inline void kvm_request_l1tf_flush_l1d(void)
+{
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
+ /*
+ * Use a raw write to set the per-CPU flag, as KVM will ensure a flush
+ * even if preemption is currently enabled.. If the current vCPU task
+ * is migrated to a different CPU (or userspace runs the vCPU on a
+ * different task) before the next VM-Entry, then kvm_arch_vcpu_load()
+ * will request a flush on the new CPU.
+ */
+ raw_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 1);
+#endif
+}
+
void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
u64 get_kvmclock_ns(struct kvm *kvm);
--
2.51.1.930.gacf6e81ea2-goog
* Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups
From: Brendan Jackman @ 2025-10-31 11:22 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> This is a combination of Brendan's work to unify the L1TF L1D flushing
> mitigation, and Pawan's work to bring some sanity to the mitigations that
> clear CPU buffers, with a bunch of glue code and some polishing from me.
>
> The "v4" is relative to the L1TF series. I smushed the two series together
> as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
> for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
> handling the series separately would have been a lot of work+churn for no
> real benefit.
>
> TL;DR:
>
> - Unify L1TF flushing under per-CPU variable
> - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
> - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
>
> To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
> VERW, tweak the low lever macros in nospec-branch.h to define the instruction
> sequence, and then wrap it with __stringify() as needed.
>
> The non-VMX code is lightly tested (but there's far less chance for breakage
> there). For the VMX code, I verified it does what I want (which may or may
> not be correct :-D) by hacking the code to force/clear various mitigations, and
> using ud2 to confirm the right path got selected.
FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
by the mitigation. It's a bit of a long-winded way to achieve that and I
guess L1TF is anyway the easy case here, but I couldn't resist promoting
it.
(I just received a Skylake machine from eBay; once that's set up I'll be
able to double-check on there that things still work.)
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
From: Brendan Jackman @ 2025-10-31 11:30 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
Rewording my comments from:
https://lore.kernel.org/all/20251029-verw-vm-v1-1-babf9b961519@linux.intel.com/
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>
> TSA mitigation:
>
> d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
>
> introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently on Intel
> CLEAR_CPU_BUFFERS is being used for guests which has a much broader scope
> (kernel->user also).
>
> Make mitigations on Intel consistent with TSA. This would help handling the
> guest-only mitigations better in future.
>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> [sean: make CLEAR_CPU_BUF_VM mutually exclusive with the MMIO mitigation]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
I think this is a clear improvement. Now that X86_FEATURE_CLEAR_CPU_BUF
has a clear scope, can we also update the comment on its definition in
cpufeatures.h? I.e. say that it's specifically about exit to user.
This also seems like a good moment to update the comment on
verw_clear_cpu_buf_mitigation_selected to mention the _VM flag too.
Also, where we set vmx->disable_fb_clear in vmx_update_fb_clear_dis(),
it still refers to X86_FEATURE_CLEAR_CPU_BUF, is that wrong?
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
From: Brendan Jackman @ 2025-10-31 11:37 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
> buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
> and "clear if guest can access host MMIO" in a single statement.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/nospec-branch.h | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 08ed5a2e46a5..923ae21cbef1 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -308,24 +308,23 @@
> * CFLAGS.ZF.
> * Note: Only the memory operand variant of VERW clears the CPU buffers.
> */
> -.macro __CLEAR_CPU_BUFFERS feature
> #ifdef CONFIG_X86_64
> - ALTERNATIVE "", "verw x86_verw_sel(%rip)", \feature
> +#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
> #else
> - /*
> - * In 32bit mode, the memory operand must be a %cs reference. The data
> - * segments may not be usable (vm86 mode), and the stack segment may not
> - * be flat (ESPFIX32).
> - */
> - ALTERNATIVE "", "verw %cs:x86_verw_sel", \feature
> +/*
> + * In 32bit mode, the memory operand must be a %cs reference. The data segments
> + * may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
> + */
> +#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
> #endif
> -.endm
> +
> +#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
Maybe CLEAR_CPU_BUFFERS_SEQ should just be defined as a string in the
first place? But meh, that's a very bikeshed comment. I can see the
aesthetic appeal of the separate __stringify().
Reviewed-by: Brendan Jackman <jackmanb@google.com>
>
> #define CLEAR_CPU_BUFFERS \
> - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
>
> #define VM_CLEAR_CPU_BUFFERS \
> - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF_VM
> + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> #ifdef CONFIG_X86_64
> .macro CLEAR_BRANCH_HISTORY
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
From: Brendan Jackman @ 2025-10-31 11:44 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> Convert the MMIO Stale Data mitigation flag from a static branch into an
> X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/nospec-branch.h | 2 --
> arch/x86/kernel/cpu/bugs.c | 11 +----------
> arch/x86/kvm/mmu/spte.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 4 ++--
> 5 files changed, 5 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 7129eb44adad..d1d7b5ec6425 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -501,6 +501,7 @@
> #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO*/
>
> /*
> * BUG word(s)
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 923ae21cbef1..b29df45b1edb 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -579,8 +579,6 @@ DECLARE_STATIC_KEY_FALSE(cpu_buf_idle_clear);
>
> DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
>
> -DECLARE_STATIC_KEY_FALSE(cpu_buf_vm_clear);
> -
> extern u16 x86_verw_sel;
>
> #include <asm/segment.h>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 723666a1357e..9acf6343b0ac 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -192,14 +192,6 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
> */
> DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
>
> -/*
> - * Controls CPU Fill buffer clear before VMenter. This is a subset of
> - * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
> - * mitigation is required.
> - */
This comment wasn't super clear IMO but now that we're losing it, maybe
we can replace it with a WARN_ON() at the end of
cpu_apply_mitigations() or something (maybe it belongs in VMX code)? To
make it more obvious that X86_FEATURE_CLEAR_CPU_BUF_VM and
X86_FEATURE_CLEAR_CPU_BUF_MMIO are mutually exclusive.
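E.g. something like (hypothetical placement):

	/* The VM-wide and MMIO-only VERW mitigations are mutually exclusive. */
	WARN_ON(cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM) &&
		cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO));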
Other than the continued bikeshedding,
Reviewed-by: Brendan Jackman <jackmanb@google.com>
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
From: Brendan Jackman @ 2025-10-31 12:32 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
> immediately prior to VM-Enter, i.e. in the same location that KVM emits a
> VERW for unconditional (at runtime) clearing. Co-locating the code and
> using a single ALTERNATIVES_2 makes it more obvious how VMX mitigates the
> various vulnerabilities.
>
> Deliberately order the alternatives as:
>
> 0. Do nothing
> 1. Clear if vCPU can access MMIO
> 2. Clear always
>
> since the last alternative wins in ALTERNATIVES_2(), i.e. so that KVM will
> honor the strictest mitigation (always clear CPU buffers) if multiple
> mitigations are selected. E.g. even if the kernel chooses to mitigate
> MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
> may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
>
> Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
> a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
> L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
> mitigation but not any other "clear CPU buffers" mitigation is enabled.
> For that specific scenario, KVM would skip clearing CPU buffers for the
> MMIO mitigation even though the kernel requested a clear on every VM-Enter.
>
> Note #2, the flaw goes back to the introduction of the MDS mitigation. The
> MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
> Move VERW closer to VMentry for MDS mitigation"), but previous kernels
> that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
> unlikely the flaw is meaningfully exploitable even older kernels).
>
> Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
> Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/vmx/vmenter.S | 14 +++++++++++++-
> arch/x86/kvm/vmx/vmx.c | 13 -------------
> 2 files changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index 1f99a98a16a2..61a809790a58 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -71,6 +71,7 @@
> * @regs: unsigned long * (to guest registers)
> * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> *
> * Returns:
> * 0 on VM-Exit, 1 on VM-Fail
> @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> /* Load @regs to RAX. */
> mov (%_ASM_SP), %_ASM_AX
>
> + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> + ALTERNATIVE_2 "", \
> + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> + "", X86_FEATURE_CLEAR_CPU_BUF_VM
Ah, so this ALTERNATIVE_2 (instead of just an ALTERNATIVE that checks
CLEAR_CPU_BUF_MMIO) is really about avoiding the flags needing to be
mutually exclusive? I.e. this is
if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
!cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx
... right? This is a good idea, but I think it warrants a comment to
capture the intent; without the commit message in short-term memory,
I'd have struggled with this code, I think.
> /* Check if vmlaunch or vmresume is needed */
> bt $VMX_RUN_VMRESUME_SHIFT, %ebx
>
> @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> mov VCPU_RAX(%_ASM_AX), %_ASM_AX
>
> /* Clobbers EFLAGS.ZF */
> - VM_CLEAR_CPU_BUFFERS
> + ALTERNATIVE_2 "", \
> + __stringify(jz .Lskip_clear_cpu_buffers; \
Maybe I'm just an asm noob (I was very impressed by this trick of
using CF and ZF together like this!), but I think it's helpful to
have a comment like the jnc has below, as Pawan had in his version,
to really make the test->jz dependency obvious, since the two
instructions are quite far apart.
> + CLEAR_CPU_BUFFERS_SEQ; \
> + .Lskip_clear_cpu_buffers:), \
> + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
Sorry I'm really nitpicking but I think it's justified for asm
readability...
It's a bit unfortunate that one branch says
CLEAR_CPU_BUFFERS_SEQ and the other says __CLEAR_CPU_BUFFERS. With the
current code I think it would be more readable to just have
__stringify(CLEAR_CPU_BUFFERS_SEQ) in the CLEAR_CPU_BUF_VM case, then
you don't have to mentally expand the macro to see how the two branches
actually differ.
Anyway the actual idea here LGTM, thanks.
* Re: [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS
2025-10-31 0:30 ` [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS Sean Christopherson
@ 2025-10-31 12:34 ` Brendan Jackman
2025-11-13 15:03 ` Borislav Petkov
1 sibling, 0 replies; 57+ messages in thread
From: Brendan Jackman @ 2025-10-31 12:34 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> Now that VMX encodes its own sequence for clearing CPU buffers, move
> VM_CLEAR_CPU_BUFFERS into SVM to minimize the chances of KVM botching a
> mitigation in the future, e.g. using VM_CLEAR_CPU_BUFFERS instead of
> checking multiple mitigation flags.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
* Re: [PATCH v4 7/8] KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n
2025-10-31 0:30 ` [PATCH v4 7/8] KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n Sean Christopherson
@ 2025-10-31 12:37 ` Brendan Jackman
0 siblings, 0 replies; 57+ messages in thread
From: Brendan Jackman @ 2025-10-31 12:37 UTC (permalink / raw)
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> Disable support for flushing the L1 data cache to mitigate L1TF if CPU
> mitigations are disabled for the entire kernel. KVM's mitigation of L1TF
> is in no way special enough to justify ignoring CONFIG_CPU_MITIGATIONS=n.
>
> Deliberately use CPU_MITIGATIONS instead of the more precise
> MITIGATION_L1TF, as MITIGATION_L1TF only controls the default behavior,
> i.e. CONFIG_MITIGATION_L1TF=n doesn't completely disable L1TF mitigations
> in the kernel.
>
> Keep the vmentry_l1d_flush module param to avoid breaking existing setups,
> and leverage the .set path to alert the user to the fact that
> vmentry_l1d_flush will be ignored. Don't bother validating the incoming
> value; if an admin misconfigures vmentry_l1d_flush, the fact that the bad
> configuration won't be detected when running with CONFIG_CPU_MITIGATIONS=n
> is likely the least of their worries.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
(Git says no changed lines)
* Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups
2025-10-31 11:22 ` [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups Brendan Jackman
@ 2025-10-31 17:36 ` Sean Christopherson
2025-11-04 10:58 ` Brendan Jackman
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-10-31 17:36 UTC (permalink / raw)
To: Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > This is a combination of Brendan's work to unify the L1TF L1D flushing
> > mitigation, and Pawan's work to bring some sanity to the mitigations that
> > clear CPU buffers, with a bunch of glue code and some polishing from me.
> >
> > The "v4" is relative to the L1TF series. I smushed the two series together
> > as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
> > for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
> > handling the series separately would have been a lot of work+churn for no
> > real benefit.
> >
> > TL;DR:
> >
> > - Unify L1TF flushing under per-CPU variable
> > - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
> > - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
> >
> > To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
> > VERW, tweak the low level macros in nospec-branch.h to define the instruction
> > sequence, and then wrap it with __stringify() as needed.
> >
> > The non-VMX code is lightly tested (but there's far less chance for breakage
> > there). For the VMX code, I verified it does what I want (which may or may
> > not be correct :-D) by hacking the code to force/clear various mitigations, and
> > using ud2 to confirm the right path got selected.
>
> FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
> by the mitigation. It's a bit of a long-winded way to achieve that and I
> guess L1TF is anyway the easy case here, but I couldn't resist promoting
> it.
Yeah, it's on my radar, but it'll be a while before I have the bandwidth to dig
through something that involved (though I _am_ excited to have a way to actually
test mitigations).
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
2025-10-31 11:37 ` Brendan Jackman
@ 2025-10-31 17:43 ` Sean Christopherson
0 siblings, 0 replies; 57+ messages in thread
From: Sean Christopherson @ 2025-10-31 17:43 UTC (permalink / raw)
To: Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
> > buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
> > and "clear if guest can access host MMIO" in a single statement.
> >
> > No functional change intended.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > arch/x86/include/asm/nospec-branch.h | 21 ++++++++++-----------
> > 1 file changed, 10 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index 08ed5a2e46a5..923ae21cbef1 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -308,24 +308,23 @@
> > * CFLAGS.ZF.
> > * Note: Only the memory operand variant of VERW clears the CPU buffers.
> > */
> > -.macro __CLEAR_CPU_BUFFERS feature
> > #ifdef CONFIG_X86_64
> > - ALTERNATIVE "", "verw x86_verw_sel(%rip)", \feature
> > +#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
> > #else
> > - /*
> > - * In 32bit mode, the memory operand must be a %cs reference. The data
> > - * segments may not be usable (vm86 mode), and the stack segment may not
> > - * be flat (ESPFIX32).
> > - */
> > - ALTERNATIVE "", "verw %cs:x86_verw_sel", \feature
> > +/*
> > + * In 32bit mode, the memory operand must be a %cs reference. The data segments
> > + * may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
> > + */
> > +#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
> > #endif
> > -.endm
> > +
> > +#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
>
> Maybe CLEAR_CPU_BUFFERS_SEQ should just be defined as a string in the
> first place?
Heh, I tried that, and AFAICT it simply can't work with the way ALTERNATIVE and
friends are implemented, as each parameter needs to be a single unbroken string.
E.g. this
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 61a809790a58..ffa6bc2345e3 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -63,6 +63,8 @@
RET
.endm
+#define CLEAR_CPU_BUFFERS_SEQ_STRING "verw x86_verw_sel(%rip)"
+
.section .noinstr.text, "ax"
/**
@@ -169,9 +171,9 @@ SYM_FUNC_START(__vmx_vcpu_run)
/* Clobbers EFLAGS.ZF */
ALTERNATIVE_2 "", \
- __stringify(jz .Lskip_clear_cpu_buffers; \
- CLEAR_CPU_BUFFERS_SEQ; \
- .Lskip_clear_cpu_buffers:), \
+ "jz .Lskip_clear_cpu_buffers; " \
+ CLEAR_CPU_BUFFERS_SEQ_STRING; \
+ ".Lskip_clear_cpu_buffers:", \
X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
__CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
yields wonderfully helpful error messages like so:
arch/x86/kvm/vmx/vmenter.S: Assembler messages:
arch/x86/kvm/vmx/vmenter.S:173: Error: too many positional arguments
If there's a magic incantation to get things to work, it's unknown to me.
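FWIW, my (not fully verified) understanding of why: adjacent string literals
are concatenated by the C compiler proper, not by the preprocessor, so for a
.S file gas ends up being handed something like

	ALTERNATIVE_2 "", "jz .Lskip; " "verw x86_verw_sel(%rip)" "; .Lskip:", ...

and parses each adjacent literal as a separate positional macro argument,
hence the "too many positional arguments" error.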
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 12:32 ` Brendan Jackman
@ 2025-10-31 21:44 ` Sean Christopherson
2025-11-03 10:51 ` Brendan Jackman
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-10-31 21:44 UTC (permalink / raw)
To: Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
> > immediately prior to VM-Enter, i.e. in the same location that KVM emits a
> > VERW for unconditional (at runtime) clearing. Co-locating the code and
> > using a single ALTERNATIVES_2 makes it more obvious how VMX mitigates the
> > various vulnerabilities.
> >
> > Deliberately order the alternatives as:
> >
> > 0. Do nothing
> > 1. Clear if vCPU can access MMIO
> > 2. Clear always
> >
> > since the last alternative wins in ALTERNATIVES_2(), i.e. so that KVM will
> > honor the strictest mitigation (always clear CPU buffers) if multiple
> > mitigations are selected. E.g. even if the kernel chooses to mitigate
> > MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
> > may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
> >
> > Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
> > a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
> > L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
> > mitigation but not any other "clear CPU buffers" mitigation is enabled.
> > For that specific scenario, KVM would skip clearing CPU buffers for the
> > MMIO mitigation even though the kernel requested a clear on every VM-Enter.
> >
> > Note #2, the flaw goes back to the introduction of the MDS mitigation. The
> > MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
> > Move VERW closer to VMentry for MDS mitigation"), but previous kernels
> > that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
> > unlikely the flaw is meaningfully exploitable even on older kernels).
> >
> > Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
> > Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > arch/x86/kvm/vmx/vmenter.S | 14 +++++++++++++-
> > arch/x86/kvm/vmx/vmx.c | 13 -------------
> > 2 files changed, 13 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > index 1f99a98a16a2..61a809790a58 100644
> > --- a/arch/x86/kvm/vmx/vmenter.S
> > +++ b/arch/x86/kvm/vmx/vmenter.S
> > @@ -71,6 +71,7 @@
> > * @regs: unsigned long * (to guest registers)
> > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> > *
> > * Returns:
> > * 0 on VM-Exit, 1 on VM-Fail
> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > /* Load @regs to RAX. */
> > mov (%_ASM_SP), %_ASM_AX
> >
> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> > + ALTERNATIVE_2 "", \
> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Ah, so this ALTERNATIVE_2 (instead of just an ALTERNATIVE that checks
> CLEAR_CPU_BUF_MMIO) is really about avoiding the flags needing to be
> mutually exclusive?
Yeah, more or less. More specifically, I want to keep the X vs. Y logic in one
place (well, two if you count both ALTERNATIVE_2 flows), so that in general,
from KVM's perspective, the mitigations are handled as independent things. E.g.
even if CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO are mutually exclusive within
the kernel (and it's not clear to me that that's 100% guaranteed), I want to
limit how much of KVM assumes they are exclusive. Partly to avoid "oops, we
forgot to mitigate that thing you care about", partly so that reading code like
the setting of VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO doesn't require understanding
the relationship between CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO.
> if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> !cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
> test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx
>
> ... right? This is a good idea, but I think it warrants a comment to
> capture the intent; without the commit message in short-term memory,
> I'd have struggled with this code, I think.
>
> > /* Check if vmlaunch or vmresume is needed */
> > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
> >
> > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> >
> > /* Clobbers EFLAGS.ZF */
> > - VM_CLEAR_CPU_BUFFERS
> > + ALTERNATIVE_2 "", \
> > + __stringify(jz .Lskip_clear_cpu_buffers; \
>
> Maybe I'm just an asm noob
Nah, all of this is definitely playing on hard mode. I'm just thankful we don't
have to deal with the horrors of KVM doing all of this in inline asm. :-D
> I was very impressed by this trick of using CF and ZF together like this!)
> but I think it's helpful to have a comment like the jnc has below, as
> Pawan had in his version, to really make the test->jz dependency obvious,
> since the two instructions are quite far apart.
>
>
> > + CLEAR_CPU_BUFFERS_SEQ; \
> > + .Lskip_clear_cpu_buffers:), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Sorry I'm really nitpicking but I think it's justified for asm
> readability...
>
> It's a bit unfortunate that one branch says
> CLEAR_CPU_BUFFERS_SEQ and the other says __CLEAR_CPU_BUFFERS. With the
> current code I think it would be more readable to jut have
> __stringify(CLEAR_CPU_BUFFERS_SEQ) in the CLEAR_CPU_BUF_VM case, then
> you don't have to mentally expand the macro to see how the two branches
> actually differ.
No preference here (assuming I understand what you're asking).
Is this better?
/*
* Note, this sequence consumes *and* clobbers EFLAGS.ZF. The MMIO
* mitigation uses ZF to track whether or not the vCPU has access to
* host MMIO (see above), and VERW (the instruction microcode hijacks
* to clear CPU buffers) writes ZF.
*/
ALTERNATIVE_2 "", \
__stringify(jz .Lskip_clear_cpu_buffers; \
CLEAR_CPU_BUFFERS_SEQ; \
.Lskip_clear_cpu_buffers:), \
X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
__stringify(CLEAR_CPU_BUFFERS_SEQ), X86_FEATURE_CLEAR_CPU_BUF_VM
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 11:44 ` Brendan Jackman
@ 2025-10-31 21:47 ` Sean Christopherson
2025-11-03 10:49 ` Brendan Jackman
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-10-31 21:47 UTC (permalink / raw)
To: Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > index 723666a1357e..9acf6343b0ac 100644
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
> > @@ -192,14 +192,6 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
> > */
> > DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
> >
> > -/*
> > - * Controls CPU Fill buffer clear before VMenter. This is a subset of
> > - * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
> > - * mitigation is required.
> > - */
>
> This comment wasn't super clear IMO, but now that we're losing it, maybe
> we can replace it with a WARN_ON() at the end of
> cpu_apply_mitigations() or something (maybe it belongs in VMX code), to
> make it more obvious that X86_FEATURE_CLEAR_CPU_BUF_VM and
> X86_FEATURE_CLEAR_CPU_BUF_MMIO are mutually exclusive.
No objection from me if we want strong guarantees that CLEAR_CPU_BUF_VM and
CLEAR_CPU_BUF_MMIO are mutually exclusive. Though I do think the KVM side of
things (and the kernel in general) should be paranoid and not lean _too_ hard
on such assumptions.
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 0:30 ` [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation Sean Christopherson
2025-10-31 11:44 ` Brendan Jackman
@ 2025-10-31 22:28 ` Pawan Gupta
2025-10-31 22:37 ` Sean Christopherson
2025-11-12 14:46 ` Borislav Petkov
2 siblings, 1 reply; 57+ messages in thread
From: Pawan Gupta @ 2025-10-31 22:28 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:35PM -0700, Sean Christopherson wrote:
> Convert the MMIO Stale Data mitigation flag from a static branch into an
> X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/nospec-branch.h | 2 --
> arch/x86/kernel/cpu/bugs.c | 11 +----------
> arch/x86/kvm/mmu/spte.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 4 ++--
> 5 files changed, 5 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 7129eb44adad..d1d7b5ec6425 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -501,6 +501,7 @@
> #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO */
Some bikeshedding from my side too:
s/iff/if/
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 22:28 ` Pawan Gupta
@ 2025-10-31 22:37 ` Sean Christopherson
2025-10-31 22:50 ` Pawan Gupta
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-10-31 22:37 UTC (permalink / raw)
To: Pawan Gupta
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Fri, Oct 31, 2025, Pawan Gupta wrote:
> On Thu, Oct 30, 2025 at 05:30:35PM -0700, Sean Christopherson wrote:
> > Convert the MMIO Stale Data mitigation flag from a static branch into an
> > X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
> >
> > No functional change intended.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > arch/x86/include/asm/cpufeatures.h | 1 +
> > arch/x86/include/asm/nospec-branch.h | 2 --
> > arch/x86/kernel/cpu/bugs.c | 11 +----------
> > arch/x86/kvm/mmu/spte.c | 2 +-
> > arch/x86/kvm/vmx/vmx.c | 4 ++--
> > 5 files changed, 5 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index 7129eb44adad..d1d7b5ec6425 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -501,6 +501,7 @@
> > #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> > #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> > #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> > +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO */
>
> Some bikeshedding from my side too:
> s/iff/if/
Heh, that's actually intentional. "iff" is shorthand for "if and only if". But
this isn't the first time my use of "iff" has confused people, so I've no objection
to switching to "if".
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 22:37 ` Sean Christopherson
@ 2025-10-31 22:50 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-10-31 22:50 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Fri, Oct 31, 2025 at 03:37:34PM -0700, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Pawan Gupta wrote:
> > On Thu, Oct 30, 2025 at 05:30:35PM -0700, Sean Christopherson wrote:
> > > Convert the MMIO Stale Data mitigation flag from a static branch into an
> > > X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
> > >
> > > No functional change intended.
> > >
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > > arch/x86/include/asm/cpufeatures.h | 1 +
> > > arch/x86/include/asm/nospec-branch.h | 2 --
> > > arch/x86/kernel/cpu/bugs.c | 11 +----------
> > > arch/x86/kvm/mmu/spte.c | 2 +-
> > > arch/x86/kvm/vmx/vmx.c | 4 ++--
> > > 5 files changed, 5 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > > index 7129eb44adad..d1d7b5ec6425 100644
> > > --- a/arch/x86/include/asm/cpufeatures.h
> > > +++ b/arch/x86/include/asm/cpufeatures.h
> > > @@ -501,6 +501,7 @@
> > > #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> > > #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> > > #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> > > +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO */
> >
> > Some bikeshedding from my side too:
> > s/iff/if/
>
> Heh, that's actually intentional. "iff" is shorthand for "if and only if". But
> this isn't the first time my use of "iff" has confused people, so I've no objection
> to switching to "if".
I did a quick search; there are ~500 instances of "iff" in the
kernel. So it's a common abbreviation that I learnt today. It is fine to
keep it as is.
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 0:30 ` [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2 Sean Christopherson
2025-10-31 12:32 ` Brendan Jackman
@ 2025-10-31 23:55 ` Pawan Gupta
2025-11-01 3:41 ` Pawan Gupta
2025-11-03 9:17 ` Peter Zijlstra
2025-11-03 17:46 ` Pawan Gupta
2025-11-12 16:41 ` Borislav Petkov
3 siblings, 2 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-10-31 23:55 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
...
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index 1f99a98a16a2..61a809790a58 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -71,6 +71,7 @@
> * @regs: unsigned long * (to guest registers)
> * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> *
> * Returns:
> * 0 on VM-Exit, 1 on VM-Fail
> @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> /* Load @regs to RAX. */
> mov (%_ASM_SP), %_ASM_AX
>
> + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> + ALTERNATIVE_2 "", \
> + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> + "", X86_FEATURE_CLEAR_CPU_BUF_VM
> +
> /* Check if vmlaunch or vmresume is needed */
> bt $VMX_RUN_VMRESUME_SHIFT, %ebx
>
> @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> mov VCPU_RAX(%_ASM_AX), %_ASM_AX
>
> /* Clobbers EFLAGS.ZF */
> - VM_CLEAR_CPU_BUFFERS
> + ALTERNATIVE_2 "", \
> + __stringify(jz .Lskip_clear_cpu_buffers; \
> + CLEAR_CPU_BUFFERS_SEQ; \
> + .Lskip_clear_cpu_buffers:), \
> + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
Another way to write this could be:
ALTERNATIVE_2 "jmp .Lskip_clear_cpu_buffers", \
"jz .Lskip_clear_cpu_buffers", X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
"", X86_FEATURE_CLEAR_CPU_BUF_VM
CLEAR_CPU_BUFFERS_SEQ
.Lskip_clear_cpu_buffers:
With this, jmp;verw would show up in the disassembly on unaffected CPUs; I
don't know how big a problem that is. OTOH, I find this easier to understand.
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-10-31 11:30 ` Brendan Jackman
@ 2025-11-01 1:46 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-01 1:46 UTC (permalink / raw)
To: Brendan Jackman
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, kvm,
linux-kernel
On Fri, Oct 31, 2025 at 11:30:54AM +0000, Brendan Jackman wrote:
> Rewording my comments from:
> https://lore.kernel.org/all/20251029-verw-vm-v1-1-babf9b961519@linux.intel.com/
>
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> >
> > TSA mitigation:
> >
> > d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
> >
> > introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently, on Intel,
> > CLEAR_CPU_BUFFERS is used for guests, which has a much broader scope
> > (kernel->user also).
> >
> > Make mitigations on Intel consistent with TSA. This would help handle the
> > guest-only mitigations better in the future.
> >
> > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > [sean: make CLEAR_CPU_BUF_VM mutually exclusive with the MMIO mitigation]
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
>
> I think this is a clear improvement. Now that X86_FEATURE_CLEAR_CPU_BUF
> has a clear scope, can we also update the comment on its definition in
> cpufeatures.h? I.e. say that it's specifically about exit to user.
Does this suffice?
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 08ed5a2e46a5..e842f27a1108 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -321,6 +321,7 @@
#endif
.endm
+/* Primarily used in exit-to-userspace path */
#define CLEAR_CPU_BUFFERS \
__CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> This also seems like a good moment to update the comment on
> verw_clear_cpu_buf_mitigation_selected to mention the _VM flag too.
As we have 3 different flags, referring to them with X86_FEATURE_CLEAR_CPU_*
should be okay?
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 723666a1357e..51dec95a9af5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -490,7 +490,7 @@ static enum rfds_mitigations rfds_mitigation __ro_after_init =
/*
* Set if any of MDS/TAA/MMIO/RFDS are going to enable VERW clearing
- * through X86_FEATURE_CLEAR_CPU_BUF on kernel and guest entry.
+ * through X86_FEATURE_CLEAR_CPU_* on kernel and guest entry.
*/
static bool verw_clear_cpu_buf_mitigation_selected __ro_after_init;
> Also, where we set vmx->disable_fb_clear in vmx_update_fb_clear_dis(),
> it still refers to X86_FEATURE_CLEAR_CPU_BUF, is that wrong?
It looks correct to me. The only reason X86_FEATURE_CLEAR_CPU_BUF is used
in vmx_update_fb_clear_dis() is to check if the host has enabled its
exit-to-userspace mitigation for some reason, and allow the guest to also use
VERW.
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 23:55 ` Pawan Gupta
@ 2025-11-01 3:41 ` Pawan Gupta
2025-11-03 9:17 ` Peter Zijlstra
1 sibling, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-01 3:41 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Fri, Oct 31, 2025 at 04:55:37PM -0700, Pawan Gupta wrote:
> On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> ...
> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > index 1f99a98a16a2..61a809790a58 100644
> > --- a/arch/x86/kvm/vmx/vmenter.S
> > +++ b/arch/x86/kvm/vmx/vmenter.S
> > @@ -71,6 +71,7 @@
> > * @regs: unsigned long * (to guest registers)
> > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> > *
> > * Returns:
> > * 0 on VM-Exit, 1 on VM-Fail
> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > /* Load @regs to RAX. */
> > mov (%_ASM_SP), %_ASM_AX
> >
> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> > + ALTERNATIVE_2 "", \
> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
> > +
> > /* Check if vmlaunch or vmresume is needed */
> > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
> >
> > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> >
> > /* Clobbers EFLAGS.ZF */
> > - VM_CLEAR_CPU_BUFFERS
> > + ALTERNATIVE_2 "", \
> > + __stringify(jz .Lskip_clear_cpu_buffers; \
> > + CLEAR_CPU_BUFFERS_SEQ; \
> > + .Lskip_clear_cpu_buffers:), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Another way to write this could be:
>
> ALTERNATIVE_2 "jmp .Lskip_clear_cpu_buffers", \
> "jz .Lskip_clear_cpu_buffers", X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> "", X86_FEATURE_CLEAR_CPU_BUF_VM
>
> CLEAR_CPU_BUFFERS_SEQ
> .Lskip_clear_cpu_buffers:
>
> With this, jmp;verw would show up in the disassembly on unaffected CPUs; I
> don't know how big a problem that is. OTOH, I find this easier to understand.
As far as execution is concerned, it basically boils down to 9 NOPs:
54: 48 8b 00 mov (%rax),%rax
---
57: 90 nop
58: 90 nop
59: 90 nop
5a: 90 nop
5b: 90 nop
5c: 90 nop
5d: 90 nop
5e: 90 nop
5f: 90 nop
---
60: 73 08 jae
versus 1 near jump:
54: 48 8b 00 mov (%rax),%rax
---
57: eb 0b jmp ffffffff81fa1064
59: 90 nop
5a: 90 nop
5b: 90 nop
5c: 90 nop
5d: 0f 00 2d dc ef 05 ff verw -0xfa1024(%rip)
---
64: 73 08 jae
I can't tell which one is better.
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
2025-10-31 0:30 ` [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition Sean Christopherson
2025-10-31 11:37 ` Brendan Jackman
@ 2025-11-01 4:13 ` Pawan Gupta
2025-11-03 17:00 ` Sean Christopherson
1 sibling, 1 reply; 57+ messages in thread
From: Pawan Gupta @ 2025-11-01 4:13 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:34PM -0700, Sean Christopherson wrote:
> Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
> buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
> and "clear if guest can access host MMIO" in a single statement.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/nospec-branch.h | 21 ++++++++++-----------
> 1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 08ed5a2e46a5..923ae21cbef1 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -308,24 +308,23 @@
> * CFLAGS.ZF.
> * Note: Only the memory operand variant of VERW clears the CPU buffers.
> */
> -.macro __CLEAR_CPU_BUFFERS feature
> #ifdef CONFIG_X86_64
> - ALTERNATIVE "", "verw x86_verw_sel(%rip)", \feature
> +#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
> #else
> - /*
> - * In 32bit mode, the memory operand must be a %cs reference. The data
> - * segments may not be usable (vm86 mode), and the stack segment may not
> - * be flat (ESPFIX32).
> - */
> - ALTERNATIVE "", "verw %cs:x86_verw_sel", \feature
> +/*
> + * In 32bit mode, the memory operand must be a %cs reference. The data segments
> + * may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
> + */
> +#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
> #endif
> -.endm
> +
> +#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
>
> #define CLEAR_CPU_BUFFERS \
> - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
>
> #define VM_CLEAR_CPU_BUFFERS \
> - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF_VM
> + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
Sorry, nitpicking: we have too many "CLEAR_CPU_BUF"s in these macros; can we
avoid adding CLEAR_CPU_BUFFERS_SEQ?
Or better yet, can we name the actual instruction define VERW_SEQ, so as
to differentiate it from the ALTERNATIVE defines?
---
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4cf347732ec1..16b957382224 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -309,23 +309,21 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
#ifdef CONFIG_X86_64
-#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
+#define VERW_SEQ verw x86_verw_sel(%rip)
#else
/*
* In 32bit mode, the memory operand must be a %cs reference. The data segments
* may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
*/
-#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
+#define VERW_SEQ verw %cs:x86_verw_sel
#endif
-#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
-
/* Primarily used in exit-to-userspace path */
#define CLEAR_CPU_BUFFERS \
- ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
+ ALTERNATIVE "", __stringify(VERW_SEQ), X86_FEATURE_CLEAR_CPU_BUF
#define VM_CLEAR_CPU_BUFFERS \
- ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
+ ALTERNATIVE "", __stringify(VERW_SEQ), X86_FEATURE_CLEAR_CPU_BUF_VM
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 23:55 ` Pawan Gupta
2025-11-01 3:41 ` Pawan Gupta
@ 2025-11-03 9:17 ` Peter Zijlstra
2025-11-03 17:37 ` Pawan Gupta
1 sibling, 1 reply; 57+ messages in thread
From: Peter Zijlstra @ 2025-11-03 9:17 UTC (permalink / raw)
To: Pawan Gupta
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Fri, Oct 31, 2025 at 04:55:24PM -0700, Pawan Gupta wrote:
> On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> ...
> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > index 1f99a98a16a2..61a809790a58 100644
> > --- a/arch/x86/kvm/vmx/vmenter.S
> > +++ b/arch/x86/kvm/vmx/vmenter.S
> > @@ -71,6 +71,7 @@
> > * @regs: unsigned long * (to guest registers)
> > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> > *
> > * Returns:
> > * 0 on VM-Exit, 1 on VM-Fail
> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > /* Load @regs to RAX. */
> > mov (%_ASM_SP), %_ASM_AX
> >
> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> > + ALTERNATIVE_2 "", \
> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
> > +
> > /* Check if vmlaunch or vmresume is needed */
> > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
> >
> > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> >
> > /* Clobbers EFLAGS.ZF */
> > - VM_CLEAR_CPU_BUFFERS
> > + ALTERNATIVE_2 "", \
> > + __stringify(jz .Lskip_clear_cpu_buffers; \
> > + CLEAR_CPU_BUFFERS_SEQ; \
> > + .Lskip_clear_cpu_buffers:), \
> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Another way to write this could be:
>
> ALTERNATIVE_2 "jmp .Lskip_clear_cpu_buffers", \
> "jz .Lskip_clear_cpu_buffers", X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> "", X86_FEATURE_CLEAR_CPU_BUF_VM
>
> CLEAR_CPU_BUFFERS_SEQ
> .Lskip_clear_cpu_buffers:
>
> > With this, jmp;verw would show up in the disassembly on unaffected CPUs; I
> > don't know how big a problem that is. OTOH, I find this easier to understand.
Generating larger code just to keep disassembly 'simple' seems wrong.
Also, see this:
https://lkml.kernel.org/r/194ad779-f41f-46a5-9973-e886f483b60a@oracle.com
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 21:47 ` Sean Christopherson
@ 2025-11-03 10:49 ` Brendan Jackman
0 siblings, 0 replies; 57+ messages in thread
From: Brendan Jackman @ 2025-11-03 10:49 UTC (permalink / raw)
To: Sean Christopherson, Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri Oct 31, 2025 at 9:47 PM UTC, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Brendan Jackman wrote:
>> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
>> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
>> > index 723666a1357e..9acf6343b0ac 100644
>> > --- a/arch/x86/kernel/cpu/bugs.c
>> > +++ b/arch/x86/kernel/cpu/bugs.c
>> > @@ -192,14 +192,6 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
>> > */
>> > DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
>> >
>> > -/*
>> > - * Controls CPU Fill buffer clear before VMenter. This is a subset of
>> > - * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
>> > - * mitigation is required.
>> > - */
>>
>> This comment wasn't super clear IMO, but now that we're losing it, maybe
>> we can replace it with a WARN_ON() at the end of
>> cpu_apply_mitigations() or something (maybe it belongs in VMX code), to
>> make it more obvious that X86_FEATURE_CLEAR_CPU_BUF_VM and
>> X86_FEATURE_CLEAR_CPU_BUF_MMIO are mutually exclusive.
>
> No objection from me if we want strong guarantees that CLEAR_CPU_BUF_VM and
> CLEAR_CPU_BUF_MMIO are mutually exclusive. Though I do think the KVM side of
> things (and the kernel in general) should be paranoid and not lean _too_ hard
> on such assumptions.
Ah, after finishing the review, I realised these are _not_ actually
mutually exclusive in terms of the implementation. So asserting here
that they are mutually exclusive would just be confusing, rather than
helpful, IMO.
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 21:44 ` Sean Christopherson
@ 2025-11-03 10:51 ` Brendan Jackman
0 siblings, 0 replies; 57+ messages in thread
From: Brendan Jackman @ 2025-11-03 10:51 UTC (permalink / raw)
To: Sean Christopherson, Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri Oct 31, 2025 at 9:44 PM UTC, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Brendan Jackman wrote:
>> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
>> > Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
>> > immediately prior to VM-Enter, i.e. in the same location that KVM emits a
>> > VERW for unconditional (at runtime) clearing. Co-locating the code and
>> > using a single ALTERNATIVES_2 makes it more obvious how VMX mitigates the
>> > various vulnerabilities.
>> >
>> > Deliberately order the alternatives as:
>> >
>> > 0. Do nothing
>> > 1. Clear if vCPU can access MMIO
>> > 2. Clear always
>> >
>> > since the last alternative wins in ALTERNATIVES_2(), i.e. so that KVM will
>> > honor the strictest mitigation (always clear CPU buffers) if multiple
>> > mitigations are selected. E.g. even if the kernel chooses to mitigate
>> > MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
>> > may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
>> >
>> > Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
>> > a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
>> > L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
>> > mitigation but not any other "clear CPU buffers" mitigation is enabled.
>> > For that specific scenario, KVM would skip clearing CPU buffers for the
>> > MMIO mitigation even though the kernel requested a clear on every VM-Enter.
>> >
>> > Note #2, the flaw goes back to the introduction of the MDS mitigation. The
>> > MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
>> > Move VERW closer to VMentry for MDS mitigation"), but previous kernels
>> > that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
>> > unlikely the flaw is meaningfully exploitable even on older kernels).
>> >
>> > Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
>> > Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>> > Signed-off-by: Sean Christopherson <seanjc@google.com>
>> > ---
>> > arch/x86/kvm/vmx/vmenter.S | 14 +++++++++++++-
>> > arch/x86/kvm/vmx/vmx.c | 13 -------------
>> > 2 files changed, 13 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
>> > index 1f99a98a16a2..61a809790a58 100644
>> > --- a/arch/x86/kvm/vmx/vmenter.S
>> > +++ b/arch/x86/kvm/vmx/vmenter.S
>> > @@ -71,6 +71,7 @@
>> > * @regs: unsigned long * (to guest registers)
>> > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
>> > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
>> > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
>> > *
>> > * Returns:
>> > * 0 on VM-Exit, 1 on VM-Fail
>> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
>> > /* Load @regs to RAX. */
>> > mov (%_ASM_SP), %_ASM_AX
>> >
>> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
>> > + ALTERNATIVE_2 "", \
>> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
>> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
>> > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
>>
>> Ah, so this ALTERNATIVE_2 (instead of just an ALTERNATIVE that checks
>> CLEAR_CPU_BUF_MMIO) is really about avoiding the flags needing to be
>> mutually exclusive?
>
> Yeah, more or less. More specifically, I want to keep the X vs. Y logic in one
> place (well, two if you count both ALTERNATIVE_2 flows), so that in general,
> from KVM's perspective, the mitigations are handled as independent things. E.g.
> even if CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO are mutually exclusive within
> the kernel (and it's not clear to me that that's 100% guaranteed), I want to
> limit how much of KVM assumes they are exclusive. Partly to avoid "oops, we
> forgot to mitigate that thing you care about", partly so that reading code like
> the setting of VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO doesn't require understanding
> the relationship between CLEAR_CPU_BUF_VM and CLEAR_CPU_BUF_MMIO.
Yeah, this makes sense; if we can avoid creating any unnecessary
and awkward-to-enforce invariants, that seems like a win.
>> if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
>> !cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_VM))
>> test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx
>>
>> ... right? This is a good idea, but I think it warrants a comment to
>> capture the intent; without the commit message in short-term memory,
>> I'd have struggled with this code, I think.
>>
>> > /* Check if vmlaunch or vmresume is needed */
>> > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
>> >
>> > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
>> > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
>> >
>> > /* Clobbers EFLAGS.ZF */
>> > - VM_CLEAR_CPU_BUFFERS
>> > + ALTERNATIVE_2 "", \
>> > + __stringify(jz .Lskip_clear_cpu_buffers; \
>>
>> Maybe I'm just an asm noob
>
> Nah, all of this is definitely playing on hard mode. I'm just thankful we don't
> have to deal with the horrors of KVM doing all of this in inline asm. :-D
>
>> I was very impressed by this trick of using CF and ZF together like this!)
>> but I think it's helpful to have a comment like the jnc has below, as
>> Pawan had in his version, to really make the test->jz dependency obvious,
>> since the two instructions are quite far apart.
>>
>>
>> > + CLEAR_CPU_BUFFERS_SEQ; \
>> > + .Lskip_clear_cpu_buffers:), \
>> > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
>> > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>>
>> Sorry I'm really nitpicking but I think it's justified for asm
>> readability...
>>
>> It's a bit unfortunate that one branch says
>> CLEAR_CPU_BUFFERS_SEQ and the other says __CLEAR_CPU_BUFFERS. With the
>> current code I think it would be more readable to just have
>> __stringify(CLEAR_CPU_BUFFERS_SEQ) in the CLEAR_CPU_BUF_VM case, then
>> you don't have to mentally expand the macro to see how the two branches
>> actually differ.
>
> No preference here (assuming I understand what you're asking).
>
> Is this better?
>
> /*
> * Note, this sequence consumes *and* clobbers EFLAGS.ZF. The MMIO
> * mitigation uses ZF to track whether or not the vCPU has access to
> * host MMIO (see above), and VERW (the instruction microcode hijacks
> * to clear CPU buffers) writes ZF.
> */
> ALTERNATIVE_2 "", \
> __stringify(jz .Lskip_clear_cpu_buffers; \
> CLEAR_CPU_BUFFERS_SEQ; \
> .Lskip_clear_cpu_buffers:), \
> X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> __stringify(CLEAR_CPU_BUFFERS_SEQ), X86_FEATURE_CLEAR_CPU_BUF_VM
Yep that looks good to me.
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
2025-11-01 4:13 ` Pawan Gupta
@ 2025-11-03 17:00 ` Sean Christopherson
2025-11-03 17:40 ` Pawan Gupta
2025-11-12 12:15 ` Borislav Petkov
0 siblings, 2 replies; 57+ messages in thread
From: Sean Christopherson @ 2025-11-03 17:00 UTC (permalink / raw)
To: Pawan Gupta
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Fri, Oct 31, 2025, Pawan Gupta wrote:
> On Thu, Oct 30, 2025 at 05:30:34PM -0700, Sean Christopherson wrote:
> > Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
> > buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
> > and "clear if guest can access host MMIO" in a single statement.
> >
> > No functional change intended.
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > arch/x86/include/asm/nospec-branch.h | 21 ++++++++++-----------
> > 1 file changed, 10 insertions(+), 11 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index 08ed5a2e46a5..923ae21cbef1 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -308,24 +308,23 @@
> > * CFLAGS.ZF.
> > * Note: Only the memory operand variant of VERW clears the CPU buffers.
> > */
> > -.macro __CLEAR_CPU_BUFFERS feature
> > #ifdef CONFIG_X86_64
> > - ALTERNATIVE "", "verw x86_verw_sel(%rip)", \feature
> > +#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)
> > #else
> > - /*
> > - * In 32bit mode, the memory operand must be a %cs reference. The data
> > - * segments may not be usable (vm86 mode), and the stack segment may not
> > - * be flat (ESPFIX32).
> > - */
> > - ALTERNATIVE "", "verw %cs:x86_verw_sel", \feature
> > +/*
> > + * In 32bit mode, the memory operand must be a %cs reference. The data segments
> > + * may not be usable (vm86 mode), and the stack segment may not be flat (ESPFIX32).
> > + */
> > +#define CLEAR_CPU_BUFFERS_SEQ verw %cs:x86_verw_sel
> > #endif
> > -.endm
> > +
> > +#define __CLEAR_CPU_BUFFERS __stringify(CLEAR_CPU_BUFFERS_SEQ)
> >
> > #define CLEAR_CPU_BUFFERS \
> > - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> > + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
> >
> > #define VM_CLEAR_CPU_BUFFERS \
> > - __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF_VM
> > + ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
>
> Sorry, nitpicking: we have too many "CLEAR_CPU_BUF"s in these macros; can we
> avoid adding CLEAR_CPU_BUFFERS_SEQ?
AFAICT, there's no sane way to avoid defining a macro for the raw instruction. :-/
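E.g. for the same sequence to be usable both standalone and spliced into a
larger block, it has to exist as a bare (unquoted) token #define that
__stringify() can expand; a sketch mirroring the macros above, illustration
only:

	#define CLEAR_CPU_BUFFERS_SEQ verw x86_verw_sel(%rip)

	/* Standalone use: */
	ALTERNATIVE "", __stringify(CLEAR_CPU_BUFFERS_SEQ), X86_FEATURE_CLEAR_CPU_BUF

	/* Composed into a larger alternative: */
	__stringify(jz .Lskip_clear_cpu_buffers; \
		    CLEAR_CPU_BUFFERS_SEQ; \
		    .Lskip_clear_cpu_buffers:)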
> Or better yet, can we name the actual instruction define VERW_SEQ,
Works for me.
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-03 9:17 ` Peter Zijlstra
@ 2025-11-03 17:37 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-03 17:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Borislav Petkov, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Mon, Nov 03, 2025 at 10:17:49AM +0100, Peter Zijlstra wrote:
> On Fri, Oct 31, 2025 at 04:55:24PM -0700, Pawan Gupta wrote:
> > On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> > ...
> > > diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> > > index 1f99a98a16a2..61a809790a58 100644
> > > --- a/arch/x86/kvm/vmx/vmenter.S
> > > +++ b/arch/x86/kvm/vmx/vmenter.S
> > > @@ -71,6 +71,7 @@
> > > * @regs: unsigned long * (to guest registers)
> > > * @flags: VMX_RUN_VMRESUME: use VMRESUME instead of VMLAUNCH
> > > * VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
> > > + * VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO: vCPU can access host MMIO
> > > *
> > > * Returns:
> > > * 0 on VM-Exit, 1 on VM-Fail
> > > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > > /* Load @regs to RAX. */
> > > mov (%_ASM_SP), %_ASM_AX
> > >
> > > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> > > + ALTERNATIVE_2 "", \
> > > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> > > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > > + "", X86_FEATURE_CLEAR_CPU_BUF_VM
> > > +
> > > /* Check if vmlaunch or vmresume is needed */
> > > bt $VMX_RUN_VMRESUME_SHIFT, %ebx
> > >
> > > @@ -161,7 +168,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > > mov VCPU_RAX(%_ASM_AX), %_ASM_AX
> > >
> > > /* Clobbers EFLAGS.ZF */
> > > - VM_CLEAR_CPU_BUFFERS
> > > + ALTERNATIVE_2 "", \
> > > + __stringify(jz .Lskip_clear_cpu_buffers; \
> > > + CLEAR_CPU_BUFFERS_SEQ; \
> > > + .Lskip_clear_cpu_buffers:), \
> > > + X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > > + __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF_VM
> >
> > Another way to write this could be:
> >
> > ALTERNATIVE_2 "jmp .Lskip_clear_cpu_buffers", \
> > "jz .Lskip_clear_cpu_buffers", X86_FEATURE_CLEAR_CPU_BUF_MMIO, \
> > "", X86_FEATURE_CLEAR_CPU_BUF_VM
> >
> > CLEAR_CPU_BUFFERS_SEQ
> > .Lskip_clear_cpu_buffers:
> >
> > With this, jmp;verw would show up in the disassembly on unaffected CPUs; I
> > don't know how big a problem that is. OTOH, I find this easier to understand.
>
> Generating larger code just to keep disassembly 'simple' seems wrong.
> Also, see this:
>
> https://lkml.kernel.org/r/194ad779-f41f-46a5-9973-e886f483b60a@oracle.com
Ok thanks, that settles it.
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
2025-11-03 17:00 ` Sean Christopherson
@ 2025-11-03 17:40 ` Pawan Gupta
2025-11-12 12:15 ` Borislav Petkov
1 sibling, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-03 17:40 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Mon, Nov 03, 2025 at 09:00:48AM -0800, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Pawan Gupta wrote:
> > On Thu, Oct 30, 2025 at 05:30:34PM -0700, Sean Christopherson wrote:
> > > Decouple the use of ALTERNATIVE from the encoding of VERW to clear CPU
> > > buffers so that KVM can use ALTERNATIVE_2 to handle "always clear buffers"
> > > and "clear if guest can access host MMIO" in a single statement.
> > >
> > > No functional change intended.
> > >
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 0:30 ` [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2 Sean Christopherson
2025-10-31 12:32 ` Brendan Jackman
2025-10-31 23:55 ` Pawan Gupta
@ 2025-11-03 17:46 ` Pawan Gupta
2025-11-12 16:41 ` Borislav Petkov
3 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-03 17:46 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> Rework the handling of the MMIO Stale Data mitigation to clear CPU buffers
> immediately prior to VM-Enter, i.e. in the same location that KVM emits a
> VERW for unconditional (at runtime) clearing. Co-locating the code and
> using a single ALTERNATIVES_2 makes it more obvious how VMX mitigates the
> various vulnerabilities.
>
> Deliberately order the alternatives as:
>
> 0. Do nothing
> 1. Clear if vCPU can access MMIO
> 2. Clear always
>
> since the last alternative wins in ALTERNATIVES_2(), i.e. so that KVM will
> honor the strictest mitigation (always clear CPU buffers) if multiple
> mitigations are selected. E.g. even if the kernel chooses to mitigate
> MMIO Stale Data via X86_FEATURE_CLEAR_CPU_BUF_MMIO, some other mitigation
> may enable X86_FEATURE_CLEAR_CPU_BUF_VM, and that other thing needs to win.
>
> Note, decoupling the MMIO mitigation from the L1TF mitigation also fixes
> a mostly-benign flaw where KVM wouldn't do any clearing/flushing if the
> L1TF mitigation is configured to conditionally flush the L1D, and the MMIO
> mitigation but not any other "clear CPU buffers" mitigation is enabled.
> For that specific scenario, KVM would skip clearing CPU buffers for the
> MMIO mitigation even though the kernel requested a clear on every VM-Enter.
>
> Note #2, the flaw goes back to the introduction of the MDS mitigation. For
> MDS, the flaw was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
> Move VERW closer to VMentry for MDS mitigation"), but previous kernels
> that flush CPU buffers in vmx_vcpu_enter_exit() are affected (though it's
> unlikely the flaw is meaningfully exploitable even on older kernels).
>
> Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
> Suggested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-10-31 0:30 ` [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well Sean Christopherson
2025-10-31 11:30 ` Brendan Jackman
@ 2025-11-03 18:18 ` Pawan Gupta
2025-11-07 19:05 ` Borislav Petkov
2025-11-07 18:59 ` Borislav Petkov
2 siblings, 1 reply; 57+ messages in thread
From: Pawan Gupta @ 2025-11-03 18:18 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:33PM -0700, Sean Christopherson wrote:
> From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>
> TSA mitigation:
>
> d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
>
> introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently on Intel,
> CLEAR_CPU_BUFFERS is being used for guests, which has a much broader scope
> (kernel->user also).
>
> Make mitigations on Intel consistent with TSA. This will help in handling the
> guest-only mitigations better in the future.
>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> [sean: make CLEAR_CPU_BUF_VM mutually exclusive with the MMIO mitigation]
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kernel/cpu/bugs.c | 9 +++++++--
> arch/x86/kvm/vmx/vmenter.S | 2 +-
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 6a526ae1fe99..723666a1357e 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -194,7 +194,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
>
> /*
> * Controls CPU Fill buffer clear before VMenter. This is a subset of
> - * X86_FEATURE_CLEAR_CPU_BUF, and should only be enabled when KVM-only
> + * X86_FEATURE_CLEAR_CPU_BUF_VM, and should only be enabled when KVM-only
> * mitigation is required.
> */
> DEFINE_STATIC_KEY_FALSE(cpu_buf_vm_clear);
> @@ -536,6 +536,7 @@ static void __init mds_apply_mitigation(void)
> if (mds_mitigation == MDS_MITIGATION_FULL ||
> mds_mitigation == MDS_MITIGATION_VMWERV) {
> setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
> + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
> if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) &&
> (mds_nosmt || smt_mitigations == SMT_MITIGATIONS_ON))
> cpu_smt_disable(false);
> @@ -647,6 +648,7 @@ static void __init taa_apply_mitigation(void)
> * present on host, enable the mitigation for UCODE_NEEDED as well.
> */
> setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
> + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
>
> if (taa_nosmt || smt_mitigations == SMT_MITIGATIONS_ON)
> cpu_smt_disable(false);
> @@ -748,6 +750,7 @@ static void __init mmio_apply_mitigation(void)
> */
> if (verw_clear_cpu_buf_mitigation_selected) {
> setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
> + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
> static_branch_disable(&cpu_buf_vm_clear);
> } else {
> static_branch_enable(&cpu_buf_vm_clear);
> @@ -839,8 +842,10 @@ static void __init rfds_update_mitigation(void)
>
> static void __init rfds_apply_mitigation(void)
> {
> - if (rfds_mitigation == RFDS_MITIGATION_VERW)
> + if (rfds_mitigation == RFDS_MITIGATION_VERW) {
> setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF);
> + setup_force_cpu_cap(X86_FEATURE_CLEAR_CPU_BUF_VM);
> + }
> }
>
> static __init int rfds_parse_cmdline(char *str)
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index bc255d709d8a..1f99a98a16a2 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -161,7 +161,7 @@ SYM_FUNC_START(__vmx_vcpu_run)
> mov VCPU_RAX(%_ASM_AX), %_ASM_AX
>
> /* Clobbers EFLAGS.ZF */
> - CLEAR_CPU_BUFFERS
> + VM_CLEAR_CPU_BUFFERS
>
> /* Check EFLAGS.CF from the VMX_RUN_VMRESUME bit test above. */
> jnc .Lvmlaunch
> --
Sean, based on Brendan's feedback, below are the fixes to the comments on
top of this patch:
---
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 08ed5a2e46a5..2be9be782013 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -321,9 +321,11 @@
#endif
.endm
+/* Primarily used in exit-to-userspace path */
#define CLEAR_CPU_BUFFERS \
__CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
+/* For use in KVM */
#define VM_CLEAR_CPU_BUFFERS \
__CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF_VM
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 723666a1357e..49d5797a2a42 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -490,7 +490,7 @@ static enum rfds_mitigations rfds_mitigation __ro_after_init =
/*
* Set if any of MDS/TAA/MMIO/RFDS are going to enable VERW clearing
- * through X86_FEATURE_CLEAR_CPU_BUF on kernel and guest entry.
+ * at userspace *and* guest entry.
*/
static bool verw_clear_cpu_buf_mitigation_selected __ro_after_init;
* Re: [PATCH v4 6/8] KVM: VMX: Bundle all L1 data cache flush mitigation code together
2025-10-31 0:30 ` [PATCH v4 6/8] KVM: VMX: Bundle all L1 data cache flush mitigation code together Sean Christopherson
@ 2025-11-03 18:26 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-03 18:26 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:38PM -0700, Sean Christopherson wrote:
> Move vmx_l1d_flush(), vmx_cleanup_l1d_flush(), and the vmentry_l1d_flush
> param code up in vmx.c so that all of the L1 data cache flushing code is
> bundled together. This will allow conditioning the mitigation code on
> CONFIG_CPU_MITIGATIONS=y with minimal #ifdefs.
>
> No functional change intended.
>
> Reviewed-by: Brendan Jackman <jackmanb@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
* Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups
2025-10-31 17:36 ` Sean Christopherson
@ 2025-11-04 10:58 ` Brendan Jackman
0 siblings, 0 replies; 57+ messages in thread
From: Brendan Jackman @ 2025-11-04 10:58 UTC (permalink / raw)
To: Sean Christopherson, Brendan Jackman
Cc: Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Pawan Gupta
On Fri Oct 31, 2025 at 5:36 PM UTC, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Brendan Jackman wrote:
>> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
>> > This is a combination of Brendan's work to unify the L1TF L1D flushing
>> > mitigation, and Pawan's work to bring some sanity to the mitigations that
>> > clear CPU buffers, with a bunch of glue code and some polishing from me.
>> >
>> > The "v4" is relative to the L1TF series. I smushed the two series together
>> > as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
>> > for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
>> > handling the series separately would have been a lot of work+churn for no
>> > real benefit.
>> >
>> > TL;DR:
>> >
>> > - Unify L1TF flushing under per-CPU variable
>> > - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
>> > - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
>> >
>> > To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
>> > VERW, tweak the low lever macros in nospec-branch.h to define the instruction
>> > sequence, and then wrap it with __stringify() as needed.
>> >
>> > The non-VMX code is lightly tested (but there's far less chance for breakage
>> > there). For the VMX code, I verified it does what I want (which may or may
>> > not be correct :-D) by hacking the code to force/clear various mitigations, and
>> > using ud2 to confirm the right path got selected.
>>
>> FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
>> by the mitigation. It's a bit of a long-winded way to achieve that and I
>> guess L1TF is anyway the easy case here, but I couldn't resist promoting
>> it.
Oops, for posterity, the missing [0] was:
[0]: https://lore.kernel.org/all/20251013-l1tf-test-v1-0-583fb664836d@google.com/
> Yeah, it's on my radar, but it'll be a while before I have the bandwidth to dig
> through something that involved (though I _am_ excited to have a way to actually
> test mitigations).
Also, I just realised I never mentioned anywhere: this is just the first
part; we also have an extension to the L1TF exploit to make it attack
via SMT. And we also have tests that exploit SRSO. Those will come later
though; I think there's no point in burning everyone out trying to get
everything in at once.
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-10-31 0:30 ` [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well Sean Christopherson
2025-10-31 11:30 ` Brendan Jackman
2025-11-03 18:18 ` Pawan Gupta
@ 2025-11-07 18:59 ` Borislav Petkov
2025-11-12 18:02 ` Pawan Gupta
2 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-07 18:59 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:33PM -0700, Sean Christopherson wrote:
> From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
>
> TSA mitigation:
>
> d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
>
> introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently on Intel,
> CLEAR_CPU_BUFFERS is being used for guests, which has a much broader scope
> (kernel->user also).
>
> Make mitigations on Intel consistent with TSA. This will help in handling the
"consistent" as in "use the VM-specific buffer clearing variant in VMX too"?
In any case:
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-03 18:18 ` Pawan Gupta
@ 2025-11-07 19:05 ` Borislav Petkov
2025-11-11 22:03 ` Sean Christopherson
2025-11-12 18:17 ` Pawan Gupta
0 siblings, 2 replies; 57+ messages in thread
From: Borislav Petkov @ 2025-11-07 19:05 UTC (permalink / raw)
To: Pawan Gupta
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Peter Zijlstra, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Mon, Nov 03, 2025 at 10:18:40AM -0800, Pawan Gupta wrote:
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 08ed5a2e46a5..2be9be782013 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -321,9 +321,11 @@
> #endif
> .endm
>
> +/* Primarily used in exit-to-userspace path */
What does "primarily" mean here?
$ git grep -w CLEAR_CPU_BUFFERS
says *only* the kernel->user vector.
> #define CLEAR_CPU_BUFFERS \
> __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
>
> +/* For use in KVM */
That's why the "VM_" prefix is there.
The comments in arch/x86/include/asm/cpufeatures.h actually already explain
that, you could make them more explicit but let's not sprinkle comments
willy-nilly.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-07 19:05 ` Borislav Petkov
@ 2025-11-11 22:03 ` Sean Christopherson
2025-11-12 10:23 ` Borislav Petkov
2025-11-12 18:17 ` Pawan Gupta
1 sibling, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-11-11 22:03 UTC (permalink / raw)
To: Borislav Petkov
Cc: Pawan Gupta, Paolo Bonzini, Thomas Gleixner, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Fri, Nov 07, 2025, Borislav Petkov wrote:
> On Mon, Nov 03, 2025 at 10:18:40AM -0800, Pawan Gupta wrote:
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index 08ed5a2e46a5..2be9be782013 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -321,9 +321,11 @@
> > #endif
> > .endm
> >
> > +/* Primarily used in exit-to-userspace path */
>
> What does "primarily" mean here?
>
> $ git grep -w CLEAR_CPU_BUFFERS
>
> says *only* the kernel->user vector.
How about:
/* If necessary, emit VERW on exit-to-userspace to clear CPU buffers. */
#define CLEAR_CPU_BUFFERS \
ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
>
> > #define CLEAR_CPU_BUFFERS \
> > __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> >
> > +/* For use in KVM */
>
> That's why the "VM_" prefix is there.
Ya, and this goes away (moved into SVM) by the end of the series.
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-11 22:03 ` Sean Christopherson
@ 2025-11-12 10:23 ` Borislav Petkov
2025-11-12 18:19 ` Pawan Gupta
0 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-12 10:23 UTC (permalink / raw)
To: Sean Christopherson
Cc: Pawan Gupta, Paolo Bonzini, Thomas Gleixner, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Tue, Nov 11, 2025 at 02:03:40PM -0800, Sean Christopherson wrote:
> How about:
>
> /* If necessary, emit VERW on exit-to-userspace to clear CPU buffers. */
> #define CLEAR_CPU_BUFFERS \
> ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
By the "If necessary" you mean whether X86_FEATURE_CLEAR_CPU_BUF is set or
not, I presume...
I was just wondering whether this macro is going to be used somewhere else
*except* on the kernel->user vector.
> Ya, and this goes away (moved into SVM) by the end of the series.
Aha, lemme look at the rest too then.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 2/8] x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
2025-11-03 17:00 ` Sean Christopherson
2025-11-03 17:40 ` Pawan Gupta
@ 2025-11-12 12:15 ` Borislav Petkov
1 sibling, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2025-11-12 12:15 UTC (permalink / raw)
To: Sean Christopherson
Cc: Pawan Gupta, Paolo Bonzini, Thomas Gleixner, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Mon, Nov 03, 2025 at 09:00:48AM -0800, Sean Christopherson wrote:
> > Or better yet, can we name the actual instruction define to VERW_SEQ,
>
> Works for me.
Just call it VERW. If a separate x86-insn-like macro wants to appear, we can
disambiguate then.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-10-31 0:30 ` [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation Sean Christopherson
2025-10-31 11:44 ` Brendan Jackman
2025-10-31 22:28 ` Pawan Gupta
@ 2025-11-12 14:46 ` Borislav Petkov
2025-11-12 18:24 ` Pawan Gupta
2 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-12 14:46 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:35PM -0700, Sean Christopherson wrote:
> Subject: Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
I'm guessing that "xxx" would turn into the proper name after we're done
bikeshedding.
> Convert the MMIO Stale Data mitigation flag from a static branch into an
> X86_FEATURE_xxx so that it can be used via ALTERNATIVE_2 in KVM.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/nospec-branch.h | 2 --
> arch/x86/kernel/cpu/bugs.c | 11 +----------
> arch/x86/kvm/mmu/spte.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 4 ++--
> 5 files changed, 5 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 7129eb44adad..d1d7b5ec6425 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -501,6 +501,7 @@
> #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO*/
^^^^^^^
Yes, you can break the line and format it properly. :-)
Also, this should be called then
X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO
as it is a VM-thing too.
Also, in my tree pile I have for bit 17
#define X86_FEATURE_SGX_EUPDATESVN (21*32+17) /* Support for ENCLS[EUPDATESVN] instruction */
I see you have X86_FEATURE_X2AVIC_EXT there so we need to pay attention during
the merge window.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-10-31 0:30 ` [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2 Sean Christopherson
` (2 preceding siblings ...)
2025-11-03 17:46 ` Pawan Gupta
@ 2025-11-12 16:41 ` Borislav Petkov
2025-11-12 17:15 ` Sean Christopherson
3 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-12 16:41 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> /* Load @regs to RAX. */
> mov (%_ASM_SP), %_ASM_AX
>
> + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
Oh wow. Alternatives interdependence. What can go wrong. :)
> + ALTERNATIVE_2 "", \
> + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
So this VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO bit gets set here:
if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
So how static and/or dynamic is this?
IOW, can you stick this into a simple variable which is unconditionally
updated, and use it in the X86_FEATURE_CLEAR_CPU_BUF_MMIO case while
otherwise it simply remains unused?
Because then you get rid of that yuckiness.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-12 16:41 ` Borislav Petkov
@ 2025-11-12 17:15 ` Sean Christopherson
2025-11-12 18:38 ` Borislav Petkov
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-11-12 17:15 UTC (permalink / raw)
To: Borislav Petkov
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Wed, Nov 12, 2025, Borislav Petkov wrote:
> On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > /* Load @regs to RAX. */
> > mov (%_ASM_SP), %_ASM_AX
> >
> > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
>
> Oh wow. Alternatives interdependence. What can go wrong. :)
Nothing, it's perfect. :-D
> > + ALTERNATIVE_2 "", \
> > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
>
> So this VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO bit gets set here:
>
> if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
> flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
>
> So how static and/or dynamic is this?
kvm_vcpu_can_access_host_mmio() is very dynamic. It can be different between
vCPUs in a VM, and can even change on back-to-back runs of the same vCPU.
>
> IOW, can you stick this into a simple variable which is unconditionally
> updated, and use it in the X86_FEATURE_CLEAR_CPU_BUF_MMIO case while
> otherwise it simply remains unused?
Can you elaborate? I don't think I follow what you're suggesting.
>
> Because then you get rid of that yuckiness.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-07 18:59 ` Borislav Petkov
@ 2025-11-12 18:02 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-12 18:02 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Peter Zijlstra, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Fri, Nov 07, 2025 at 07:59:41PM +0100, Borislav Petkov wrote:
> On Thu, Oct 30, 2025 at 05:30:33PM -0700, Sean Christopherson wrote:
> > From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> >
> > TSA mitigation:
> >
> > d8010d4ba43e ("x86/bugs: Add a Transient Scheduler Attacks mitigation")
> >
> > introduced VM_CLEAR_CPU_BUFFERS for guests on AMD CPUs. Currently on Intel,
> > CLEAR_CPU_BUFFERS is being used for guests, which has a much broader scope
> > (kernel->user also).
> >
> > Make mitigations on Intel consistent with TSA. This will help in handling the
>
> "consistent" as in "use the VM-specific buffer clearing variant in VMX too"?
That's correct.
> In any case:
>
> Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Thanks.
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-07 19:05 ` Borislav Petkov
2025-11-11 22:03 ` Sean Christopherson
@ 2025-11-12 18:17 ` Pawan Gupta
1 sibling, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-12 18:17 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Peter Zijlstra, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Fri, Nov 07, 2025 at 08:05:34PM +0100, Borislav Petkov wrote:
> On Mon, Nov 03, 2025 at 10:18:40AM -0800, Pawan Gupta wrote:
> > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > index 08ed5a2e46a5..2be9be782013 100644
> > --- a/arch/x86/include/asm/nospec-branch.h
> > +++ b/arch/x86/include/asm/nospec-branch.h
> > @@ -321,9 +321,11 @@
> > #endif
> > .endm
> >
> > +/* Primarily used in exit-to-userspace path */
>
> What does "primarily" mean here?
> $ git grep -w CLEAR_CPU_BUFFERS
>
> says *only* the kernel->user vector.
By the end of this series, yes. As of this patch, it is still used in VMX too:
arch/x86/kvm/vmx/vmenter.S:164: CLEAR_CPU_BUFFERS
"Primarily" can be dropped by the patch that replaces its use in SVM/VMX.
> > #define CLEAR_CPU_BUFFERS \
> > __CLEAR_CPU_BUFFERS X86_FEATURE_CLEAR_CPU_BUF
> >
> > +/* For use in KVM */
>
> That's why the "VM_" prefix is there.
>
> The comments in arch/x86/include/asm/cpufeatures.h actually already explain
> that, you could make them more explicit but let's not sprinkle comments
> willy-nilly.
As Sean pointed out, this goes away in later patches.
* Re: [PATCH v4 1/8] x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well
2025-11-12 10:23 ` Borislav Petkov
@ 2025-11-12 18:19 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-12 18:19 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Peter Zijlstra, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Wed, Nov 12, 2025 at 11:23:36AM +0100, Borislav Petkov wrote:
> On Tue, Nov 11, 2025 at 02:03:40PM -0800, Sean Christopherson wrote:
> > How about:
> >
> > /* If necessary, emit VERW on exit-to-userspace to clear CPU buffers. */
> > #define CLEAR_CPU_BUFFERS \
> > ALTERNATIVE "", __CLEAR_CPU_BUFFERS, X86_FEATURE_CLEAR_CPU_BUF
>
> By the "If necessary" you mean whether X86_FEATURE_CLEAR_CPU_BUF is set or
> not, I presume...
>
> I was just wondering whether this macro is going to be used somewhere else
> *except* on the kernel->user vector.
I believe the intent is to use it only at the kernel->user transition.
* Re: [PATCH v4 3/8] x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data mitigation
2025-11-12 14:46 ` Borislav Petkov
@ 2025-11-12 18:24 ` Pawan Gupta
0 siblings, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-12 18:24 UTC (permalink / raw)
To: Borislav Petkov
Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner,
Peter Zijlstra, Josh Poimboeuf, kvm, linux-kernel,
Brendan Jackman
On Wed, Nov 12, 2025 at 03:46:55PM +0100, Borislav Petkov wrote:
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index 7129eb44adad..d1d7b5ec6425 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -501,6 +501,7 @@
> > #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
> > #define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
> > #define X86_FEATURE_X2AVIC_EXT (21*32+17) /* AMD SVM x2AVIC support for 4k vCPUs */
> > +#define X86_FEATURE_CLEAR_CPU_BUF_MMIO (21*32+18) /* Clear CPU buffers using VERW before VMRUN, iff the vCPU can access host MMIO*/
> ^^^^^^^
>
> Yes, you can break the line and format it properly. :-)
>
> Also, this should be called then
>
> X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO
>
> as it is a VM-thing too.
+1. This is a VM-only flag.
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-12 17:15 ` Sean Christopherson
@ 2025-11-12 18:38 ` Borislav Petkov
2025-11-12 20:30 ` Sean Christopherson
0 siblings, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-12 18:38 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Wed, Nov 12, 2025 at 09:15:00AM -0800, Sean Christopherson wrote:
> On Wed, Nov 12, 2025, Borislav Petkov wrote:
> > On Thu, Oct 30, 2025 at 05:30:36PM -0700, Sean Christopherson wrote:
> > > @@ -137,6 +138,12 @@ SYM_FUNC_START(__vmx_vcpu_run)
> > > /* Load @regs to RAX. */
> > > mov (%_ASM_SP), %_ASM_AX
> > >
> > > + /* Stash "clear for MMIO" in EFLAGS.ZF (used below). */
> >
> > Oh wow. Alternatives interdependence. What can go wrong. :)
>
> Nothing, it's perfect. :-D
Yeah. :-P
>
> > > + ALTERNATIVE_2 "", \
> > > + __stringify(test $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, %ebx), \
> >
> > So this VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO bit gets set here:
> >
> > if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> > kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
> > flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
> >
> > So how static and/or dynamic is this?
>
> kvm_vcpu_can_access_host_mmio() is very dynamic. It can be different between
> vCPUs in a VM, and can even change on back-to-back runs of the same vCPU.
Hmm, strange. Because looking at those things there:
root->has_mapped_host_mmio and vcpu->kvm->arch.has_mapped_host_mmio
they both read like something that a guest would set up once and that's it.
But what do I know...
> > IOW, can you stick this into a simple variable which is unconditionally
> > updated and you can use it in X86_FEATURE_CLEAR_CPU_BUF_MMIO case and
> > otherwise it simply remains unused?
>
> Can you elaborate? I don't think I follow what you're suggesting.
So I was thinking you could set a per-guest variable in
C - vmx_per_guest_clear_per_mmio or so - and then test it in asm:
testb $1,vmx_per_guest_clear_per_mmio(%rip)
jz .Lskip_clear_cpu_buffers;
CLEAR_CPU_BUFFERS_SEQ;
.Lskip_clear_cpu_buffers:
gcc -O3 suggests also
cmpb $0x0,vmx_per_guest_clear_per_mmio(%rip)
which is the same insn size...
The idea is to get rid of this first asm stashing things and it'll be a bit
more robust, I'd say.
And you don't rely on registers...
and when I say that, I now realize this is 32-bit too and you don't want to
touch regs - that's why you're stashing it - and there's no rip-relative on
32-bit...
I dunno - it might get hairy but I would still opt for a different solution
instead of this fragile stashing in ZF. You could do a function which pushes
and pops a scratch register where you put the value, i.e., you could do
push %reg
mov var, %reg
test or cmp ...
...
jz skip...
skip:
pop %reg
It is still all together in one place instead of spreading it around like
that.
Oh well.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-12 18:38 ` Borislav Petkov
@ 2025-11-12 20:30 ` Sean Christopherson
2025-11-12 23:01 ` Pawan Gupta
2025-11-13 14:20 ` Borislav Petkov
0 siblings, 2 replies; 57+ messages in thread
From: Sean Christopherson @ 2025-11-12 20:30 UTC (permalink / raw)
To: Borislav Petkov
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Wed, Nov 12, 2025, Borislav Petkov wrote:
> On Wed, Nov 12, 2025 at 09:15:00AM -0800, Sean Christopherson wrote:
> > On Wed, Nov 12, 2025, Borislav Petkov wrote:
> > > So this VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO bit gets set here:
> > >
> > > if (cpu_feature_enabled(X86_FEATURE_CLEAR_CPU_BUF_MMIO) &&
> > > kvm_vcpu_can_access_host_mmio(&vmx->vcpu))
> > > flags |= VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO;
> > >
> > > So how static and/or dynamic is this?
> >
> > kvm_vcpu_can_access_host_mmio() is very dynamic. It can be different between
> > vCPUs in a VM, and can even change on back-to-back runs of the same vCPU.
>
> Hmm, strange. Because looking at those things there:
>
> root->has_mapped_host_mmio and vcpu->kvm->arch.has_mapped_host_mmio
>
> they both read like something that a guest would set up once and that's it.
> But what do I know...
They're set based on what memory is mapped into the KVM-controlled page tables,
e.g. into the EPT/NPT tables, that will be used by the vCPU for that VM-Enter.
root->has_mapped_host_mmio is per page table. vcpu->kvm->arch.has_mapped_host_mmio
exists because of nastiness related to shadow paging; for all intents and purposes,
I would just mentally ignore that one.
> > > IOW, can you stick this into a simple variable which is unconditionally
> > > updated, and use it in the X86_FEATURE_CLEAR_CPU_BUF_MMIO case while
> > > otherwise it simply remains unused?
> >
> > Can you elaborate? I don't think I follow what you're suggesting.
>
> So I was thinking you could set a per-guest variable in
> C - vmx_per_guest_clear_per_mmio or so - and then test it in asm:
>
> testb $1,vmx_per_guest_clear_per_mmio(%rip)
> jz .Lskip_clear_cpu_buffers;
> CLEAR_CPU_BUFFERS_SEQ;
>
> .Lskip_clear_cpu_buffers:
>
> gcc -O3 suggests also
>
> cmpb $0x0,vmx_per_guest_clear_per_mmio(%rip)
>
> which is the same insn size...
>
> The idea is to get rid of this first asm stashing things and it'll be a bit
> more robust, I'd say.
VMX "needs" to abuse RFLAGS no matter what, because RFLAGS is the only register
that's available at the time of VMLAUNCH/VMRESUME. On Intel, only RSP and
RFLAGS are context switched via the VMCS; all other GPRs need to be context
switched by software. Which is why I didn't balk at Pawan's idea to use RFLAGS.ZF
to track whether or not a VERW for MMIO is needed.
Hmm, actually, @flags is already on the stack because it's needed at VM-Exit.
Using EBX was a holdover from the conversion from inline asm to "proper" asm,
e.g. from commit 77df549559db ("KVM: VMX: Pass @launched to the vCPU-run asm via
standard ABI regs").
Oooh, and if we stop using bt+RFLAGS.CF, then we drop the annoying SHIFT definitions
in arch/x86/kvm/vmx/run_flags.h.
Very lightly tested at this point, but I think this can all be simplified to
/*
* Note, ALTERNATIVE_2 works in reverse order. If CLEAR_CPU_BUF_VM is
* enabled, do VERW unconditionally. If CPU_BUF_VM_MMIO is enabled,
* check @flags to see if the vCPU has access to host MMIO, and do VERW
* if so. Else, do nothing (no mitigations needed/enabled).
*/
ALTERNATIVE_2 "", \
__stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
jz .Lskip_clear_cpu_buffers; \
VERW; \
.Lskip_clear_cpu_buffers:), \
X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO, \
__stringify(VERW), X86_FEATURE_CLEAR_CPU_BUF_VM
/* Check if vmlaunch or vmresume is needed */
testl $VMX_RUN_VMRESUME, WORD_SIZE(%_ASM_SP)
jz .Lvmlaunch
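For the record, since the "reverse order" bit trips people up, here's a toy
sketch of how the precedence shakes out (the insns and features are made up
for illustration; this is just my reading of ALTERNATIVE_2):

	ALTERNATIVE_2 "default_insns", \
		      "insns_a", X86_FEATURE_A, \
		      "insns_b", X86_FEATURE_B

	/*
	 * Neither feature set    => default_insns
	 * Only X86_FEATURE_A set => insns_a
	 * X86_FEATURE_B set      => insns_b, regardless of X86_FEATURE_A,
	 *                           i.e. the last alternative wins.
	 */

which is exactly why CLEAR_CPU_BUF_VM, the strictest mitigation, goes last.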
> And you don't rely on registers...
>
> and when I say that, I now realize this is 32-bit too and you don't want to
> touch regs - that's why you're stashing it - and there's no rip-relative on
> 32-bit...
>
> I dunno - it might get hairy but I would still opt for a different solution
> instead of this fragile stashing in ZF. You could do a function which pushes
> and pops a scratch register where you put the value, i.e., you could do
>
> push %reg
> mov var, %reg
> test or cmp ...
> ...
> jz skip...
> skip:
> pop %reg
>
> It is still all together in one place instead of spreading it around like
> that.
FWIW, all GPRs except RSP are off limits. But as above, getting at @flags via
RSP is trivial.
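E.g. going from the pushes at the top of __vmx_vcpu_run (from memory, so
treat the exact offsets as a sketch rather than gospel):

	push %_ASM_ARG1	/* @vmx,   at 2*WORD_SIZE(%_ASM_SP) after the pushes */
	push %_ASM_ARG3	/* @flags, at   WORD_SIZE(%_ASM_SP) after the pushes */
	push %_ASM_ARG2	/* @regs,  at            (%_ASM_SP) after the pushes */

i.e. "testl $VMX_RUN_xxx, WORD_SIZE(%_ASM_SP)" pokes directly at @flags
without burning a GPR.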
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-12 20:30 ` Sean Christopherson
@ 2025-11-12 23:01 ` Pawan Gupta
2025-11-13 14:20 ` Borislav Petkov
1 sibling, 0 replies; 57+ messages in thread
From: Pawan Gupta @ 2025-11-12 23:01 UTC (permalink / raw)
To: Sean Christopherson
Cc: Borislav Petkov, Paolo Bonzini, Thomas Gleixner, Peter Zijlstra,
Josh Poimboeuf, kvm, linux-kernel, Brendan Jackman
On Wed, Nov 12, 2025 at 12:30:36PM -0800, Sean Christopherson wrote:
> VMX "needs" to abuse RFLAGS no matter what, because RFLAGS is the only register
> that's available at the time of VMLAUNCH/VMRESUME. On Intel, only RSP and
> RFLAGS are context switched via the VMCS; all other GPRs need to be context
> switched by software. Which is why I didn't balk at Pawan's idea to use RFLAGS.ZF
> to track whether or not a VERW for MMIO is needed.
>
> Hmm, actually, @flags is already on the stack because it's needed at VM-Exit.
> Using EBX was a holdover from the conversion from inline asm to "proper" asm,
> e.g. from commit 77df549559db ("KVM: VMX: Pass @launched to the vCPU-run asm via
> standard ABI regs").
>
> Oooh, and if we stop using bt+RFLAGS.CF, then we drop the annoying SHIFT definitions
> in arch/x86/kvm/vmx/run_flags.h.
>
> Very lightly tested at this point, but I think this can all be simplified to
>
> /*
> * Note, ALTERNATIVE_2 works in reverse order. If CLEAR_CPU_BUF_VM is
> * enabled, do VERW unconditionally. If CPU_BUF_VM_MMIO is enabled,
> * check @flags to see if the vCPU has access to host MMIO, and do VERW
> * if so. Else, do nothing (no mitigations needed/enabled).
> */
> ALTERNATIVE_2 "", \
> __stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
WORD_SIZE(%_ASM_SP) is still a bit fragile, but this is definitely an
improvement.
> jz .Lskip_clear_cpu_buffers; \
> VERW; \
> .Lskip_clear_cpu_buffers:), \
> X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO, \
> __stringify(VERW), X86_FEATURE_CLEAR_CPU_BUF_VM
>
> /* Check if vmlaunch or vmresume is needed */
> testl $VMX_RUN_VMRESUME, WORD_SIZE(%_ASM_SP)
> jz .Lvmlaunch
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-12 20:30 ` Sean Christopherson
2025-11-12 23:01 ` Pawan Gupta
@ 2025-11-13 14:20 ` Borislav Petkov
2025-11-13 22:01 ` Sean Christopherson
1 sibling, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-13 14:20 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Wed, Nov 12, 2025 at 12:30:36PM -0800, Sean Christopherson wrote:
> They're set based on what memory is mapped into the KVM-controlled page tables,
> e.g. into the EPT/NPT tables, that will be used by the vCPU for that VM-Enter.
> root->has_mapped_host_mmio is per page table. vcpu->kvm->arch.has_mapped_host_mmio
> exists because of nastiness related to shadow paging; for all intents and purposes,
> I would just mentally ignore that one.
And you say they're very dynamic because the page table will ofc very likely
change before each VM-Enter. Or rather, whenever the fact that the guest has
mapped host MMIO ranges changes. Oh well, I guess that's dynamic enough...
> Very lightly tested at this point, but I think this can all be simplified to
>
> /*
> * Note, ALTERNATIVE_2 works in reverse order. If CLEAR_CPU_BUF_VM is
> * enabled, do VERW unconditionally. If CPU_BUF_VM_MMIO is enabled,
> * check @flags to see if the vCPU has access to host MMIO, and do VERW
> * if so. Else, do nothing (no mitigations needed/enabled).
> */
> ALTERNATIVE_2 "", \
> __stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
> jz .Lskip_clear_cpu_buffers; \
> VERW; \
> .Lskip_clear_cpu_buffers:), \
And since that label is local to this statement only, you can simply
call it "1" and reduce clutter even more.
> X86_FEATURE_CLEAR_CPU_BUF_VM_MMIO, \
> __stringify(VERW), X86_FEATURE_CLEAR_CPU_BUF_VM
>
> /* Check if vmlaunch or vmresume is needed */
> testl $VMX_RUN_VMRESUME, WORD_SIZE(%_ASM_SP)
> jz .Lvmlaunch
Yap, that's as nice as it gets. Looks much more straight-forward and
contained.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS
2025-10-31 0:30 ` [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS Sean Christopherson
2025-10-31 12:34 ` Brendan Jackman
@ 2025-11-13 15:03 ` Borislav Petkov
2025-11-13 15:37 ` Sean Christopherson
1 sibling, 1 reply; 57+ messages in thread
From: Borislav Petkov @ 2025-11-13 15:03 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Oct 30, 2025 at 05:30:37PM -0700, Sean Christopherson wrote:
> Now that VMX encodes its own sequency for clearing CPU buffers, move
Now that VMX encodes its own sequency for clearing CPU buffers, move
Unknown word [sequency] in commit message.
Suggestions: ['sequence',
Please introduce a spellchecker into your patch creation workflow. :)
> VM_CLEAR_CPU_BUFFERS into SVM to minimize the chances of KVM botching a
> mitigation in the future, e.g. using VM_CLEAR_CPU_BUFFERS instead of
> checking multiple mitigation flags.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/include/asm/nospec-branch.h | 3 ---
> arch/x86/kvm/svm/vmenter.S | 6 ++++--
...
> +#define SVM_CLEAR_CPU_BUFFERS \
I need to remember to grep for "CLEAR_CPU_BUF" in the future in order to catch
them all...
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS
2025-11-13 15:03 ` Borislav Petkov
@ 2025-11-13 15:37 ` Sean Christopherson
2025-11-13 16:19 ` Borislav Petkov
0 siblings, 1 reply; 57+ messages in thread
From: Sean Christopherson @ 2025-11-13 15:37 UTC (permalink / raw)
To: Borislav Petkov
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Nov 13, 2025, Borislav Petkov wrote:
> On Thu, Oct 30, 2025 at 05:30:37PM -0700, Sean Christopherson wrote:
> > Now that VMX encodes its own sequency for clearing CPU buffers, move
>
> Now that VMX encodes its own sequency for clearing CPU buffers, move
> Unknown word [sequency] in commit message.
> Suggestions: ['sequence',
>
> Please introduce a spellchecker into your patch creation workflow. :)
I use codespell, but it's obviously imperfect. Do you use something fancier?
* Re: [PATCH v4 5/8] x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as SVM_CLEAR_CPU_BUFFERS
2025-11-13 15:37 ` Sean Christopherson
@ 2025-11-13 16:19 ` Borislav Petkov
0 siblings, 0 replies; 57+ messages in thread
From: Borislav Petkov @ 2025-11-13 16:19 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Nov 13, 2025 at 07:37:52AM -0800, Sean Christopherson wrote:
> On Thu, Nov 13, 2025, Borislav Petkov wrote:
> > On Thu, Oct 30, 2025 at 05:30:37PM -0700, Sean Christopherson wrote:
> > > Now that VMX encodes its own sequency for clearing CPU buffers, move
> >
> > Now that VMX encodes its own sequency for clearing CPU buffers, move
> > Unknown word [sequency] in commit message.
> > Suggestions: ['sequence',
> >
> > Please introduce a spellchecker into your patch creation workflow. :)
>
> I use codespell, but it's obviously imperfect. Do you use something fancier?
Fancy? No.
Homegrown and thus moldable as time provides? Yeah:
https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/tree/.tip/bin/vp.py?h=vp&id=880f7f0393ae7d10643aeab32234086ee253687a#n815
That's my patch checker.
I also have enabled spellchecking in vim when I write the commit message.
But meh, typos will slip from time to time regardless...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v4 4/8] KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via ALTERNATIVES_2
2025-11-13 14:20 ` Borislav Petkov
@ 2025-11-13 22:01 ` Sean Christopherson
0 siblings, 0 replies; 57+ messages in thread
From: Sean Christopherson @ 2025-11-13 22:01 UTC (permalink / raw)
To: Borislav Petkov
Cc: Paolo Bonzini, Thomas Gleixner, Peter Zijlstra, Josh Poimboeuf,
kvm, linux-kernel, Pawan Gupta, Brendan Jackman
On Thu, Nov 13, 2025, Borislav Petkov wrote:
> On Wed, Nov 12, 2025 at 12:30:36PM -0800, Sean Christopherson wrote:
> > They're set based on what memory is mapped into the KVM-controlled page tables,
> > e.g. into the EPT/NPT tables, that will be used by the vCPU for that VM-Enter.
> > root->has_mapped_host_mmio is per page table. vcpu->kvm->arch.has_mapped_host_mmio
> > exists because of nastiness related to shadow paging; for all intents and purposes,
> > I would just mentally ignore that one.
>
> And you say they're very dynamic because the page table will ofc very likely
> change before each VM-Enter. Or rather, whenever the fact that the guest has
> mapped host MMIO ranges changes. Oh well, I guess that's dynamic enough...
In practice, the flag will be quite static for a given vCPU. The issue is that
it _could_ be extremely volatile depending on VMM and/or guest behavior, and so
I don't want to try and optimize for any particular behavior/pattern, because
KVM effectively doesn't have any control over whether or not the vCPU can access
MMIO.
> > Very lightly tested at this point, but I think this can all be simplified to
> >
> > /*
> > * Note, ALTERNATIVE_2 works in reverse order. If CLEAR_CPU_BUF_VM is
> > * enabled, do VERW unconditionally. If CPU_BUF_VM_MMIO is enabled,
> > * check @flags to see if the vCPU has access to host MMIO, and do VERW
> > * if so. Else, do nothing (no mitigations needed/enabled).
> > */
> > ALTERNATIVE_2 "", \
> > __stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
> > jz .Lskip_clear_cpu_buffers; \
> > VERW; \
> > .Lskip_clear_cpu_buffers:), \
>
> And since that label is local to this statement only, you can simply
> call it "1" and reduce clutter even more.
Eh, sort of. In the past, this code used "simple" numeric labels, and it became
nearly impossible to maintain. This is quite contained code and so isn't likely
to cause maintenance problems, but unless someone feels *really* strongly about
numeric labels, I'll keep a named label to match the rest of the code.
Though with it just being VERW, I can shorten it a wee bit and make it more
precise at the same time:
__stringify(testl $VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO, WORD_SIZE(%_ASM_SP); \
jz .Lskip_mmio_verw; \
VERW; \
.Lskip_mmio_verw:), \