* [PATCH v10 01/12] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
@ 2026-04-14 7:05 ` Pawan Gupta
2026-04-14 18:05 ` Pawan Gupta
2026-04-14 7:05 ` [PATCH v10 02/12] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
` (10 subsequent siblings)
11 siblings, 1 reply; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:05 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Currently, the BHB clearing sequence is followed by an LFENCE to prevent
transient execution of subsequent indirect branches before the BHB is fully
cleared. However, the LFENCE barrier is unnecessary in certain cases, e.g.
when the kernel is using the BHI_DIS_S mitigation and BHB clearing is only
needed for userspace. In such cases, the LFENCE is redundant because the
ring transition provides the necessary serialization.
Below is a quick recap of BHI mitigation options:
On Alder Lake and newer
BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
performance overhead.
Long loop: Alternatively, a longer version of the BHB clearing sequence
can be used to mitigate BHI. It can also be used to mitigate the BHI
variant of VMSCAPE. This is not yet implemented in Linux.
On older CPUs
Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
effective on older CPUs as well, but should be avoided because of
unnecessary overhead.
On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
branch history may still influence indirect branches in userspace. This
also means the big hammer IBPB could be replaced with a cheaper option that
clears the BHB at exit-to-userspace after a VMexit.
In preparation for adding support for the BHB sequence (without LFENCE) on
newer CPUs, move the LFENCE to the caller side, after clear_bhb_loop() is
executed, and let callers decide whether they need the LFENCE. This adds a
few extra bytes to the call sites, but it obviates the need for multiple
variants of clear_bhb_loop().
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 5 ++++-
arch/x86/include/asm/nospec-branch.h | 4 ++--
arch/x86/net/bpf_jit_comp.c | 2 ++
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 42447b1e1dff..3a180a36ca0e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1528,6 +1528,9 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* refactored in the future if needed. The .skips are for safety, to ensure
* that all RETs are in the second half of a cacheline to mitigate Indirect
* Target Selection, rather than taking the slowpath via its_return_thunk.
+ *
+ * Note, callers should use a speculation barrier like LFENCE immediately after
+ * a call to this function to ensure BHB is cleared before indirect branches.
*/
SYM_FUNC_START(clear_bhb_loop)
ANNOTATE_NOENDBR
@@ -1562,7 +1565,7 @@ SYM_FUNC_START(clear_bhb_loop)
sub $1, %ecx
jnz 1b
.Lret2: RET
-5: lfence
+5:
pop %rbp
RET
SYM_FUNC_END(clear_bhb_loop)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4f4b5e8a1574..70b377fcbc1c 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -331,11 +331,11 @@
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
- ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP
+ ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_LOOP
.endm
.macro CLEAR_BRANCH_HISTORY_VMEXIT
- ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_VMEXIT
+ ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
.endm
#else
#define CLEAR_BRANCH_HISTORY
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e9b78040d703..63d6c9fa5e80 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1624,6 +1624,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
if (emit_call(&prog, func, ip))
return -EINVAL;
+ /* Don't speculate past this until BHB is cleared */
+ EMIT_LFENCE();
EMIT1(0x59); /* pop rcx */
EMIT1(0x58); /* pop rax */
}
--
2.34.1
^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v10 01/12] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
2026-04-14 7:05 ` [PATCH v10 01/12] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
@ 2026-04-14 18:05 ` Pawan Gupta
0 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 18:05 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Apr 14, 2026 at 12:05:28AM -0700, Pawan Gupta wrote:
> Currently, the BHB clearing sequence is followed by an LFENCE to prevent
> transient execution of subsequent indirect branches before the BHB is
> fully cleared. However, the LFENCE barrier is unnecessary in certain
> cases, e.g. when the kernel is using the BHI_DIS_S mitigation and BHB
> clearing is only needed for userspace. In such cases, the LFENCE is
> redundant because the ring transition provides the necessary serialization.
>
> Below is a quick recap of BHI mitigation options:
>
> On Alder Lake and newer
>
> BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
> performance overhead.
>
> Long loop: Alternatively, a longer version of the BHB clearing sequence
> can be used to mitigate BHI. It can also be used to mitigate the BHI
> variant of VMSCAPE. This is not yet implemented in Linux.
>
> On older CPUs
>
> Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
> effective on older CPUs as well, but should be avoided because of
> unnecessary overhead.
>
> On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
> guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
> branch history may still influence indirect branches in userspace. This
> also means the big hammer IBPB could be replaced with a cheaper option that
> clears the BHB at exit-to-userspace after a VMexit.
>
> In preparation for adding support for the BHB sequence (without LFENCE) on
> newer CPUs, move the LFENCE to the caller side, after clear_bhb_loop() is
> executed, and let callers decide whether they need the LFENCE. This adds a
> few extra bytes to the call sites, but it obviates the need for multiple
> variants of clear_bhb_loop().
>
> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Tested-by: Jon Kohler <jon@nutanix.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
Sorry this is missing Boris's Ack, I will fix.
> Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
* [PATCH v10 02/12] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
2026-04-14 7:05 ` [PATCH v10 01/12] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
@ 2026-04-14 7:05 ` Pawan Gupta
2026-04-14 7:06 ` [PATCH v10 03/12] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence() Pawan Gupta
` (9 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:05 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
As a mitigation for BHI, clear_bhb_loop() executes branches that overwrite
the Branch History Buffer (BHB). On Alder Lake and newer parts this
sequence is not sufficient because it doesn't clear enough entries. This
was not an issue because these CPUs use the BHI_DIS_S hardware mitigation
in the kernel.
Now, with the BHI variant of VMSCAPE, branch history must also be isolated
between guests and userspace. Since BHI_DIS_S only protects the kernel, the
newer CPUs also use IBPB.
A cheaper alternative to the current IBPB mitigation is clear_bhb_loop().
But it currently does not clear enough BHB entries to be effective on newer
CPUs with larger BHB. At boot, dynamically set the loop count of
clear_bhb_loop() such that it is effective on newer CPUs too.
Introduce global loop counts, initializing them with appropriate values
based on the hardware feature X86_FEATURE_BHI_CTRL.
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 8 +++++---
arch/x86/include/asm/nospec-branch.h | 2 ++
arch/x86/kernel/cpu/bugs.c | 13 +++++++++++++
3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3a180a36ca0e..bbd4b1c7ec04 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1536,7 +1536,9 @@ SYM_FUNC_START(clear_bhb_loop)
ANNOTATE_NOENDBR
push %rbp
mov %rsp, %rbp
- movl $5, %ecx
+
+ movzbl bhb_seq_outer_loop(%rip), %ecx
+
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
jmp 5f
@@ -1556,8 +1558,8 @@ SYM_FUNC_START(clear_bhb_loop)
* This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
* but some Clang versions (e.g. 18) don't like this.
*/
- .skip 32 - 18, 0xcc
-2: movl $5, %eax
+ .skip 32 - 20, 0xcc
+2: movzbl bhb_seq_inner_loop(%rip), %eax
3: jmp 4f
nop
4: sub $1, %eax
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 70b377fcbc1c..87b83ae7c97f 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -548,6 +548,8 @@ DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
extern void update_spec_ctrl_cond(u64 val);
extern u64 spec_ctrl_current(void);
+extern u8 bhb_seq_inner_loop, bhb_seq_outer_loop;
+
/*
* With retpoline, we must use IBRS to restrict branch prediction
* before calling into firmware.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 83f51cab0b1e..2cb4a96247d8 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2047,6 +2047,10 @@ enum bhi_mitigations {
static enum bhi_mitigations bhi_mitigation __ro_after_init =
IS_ENABLED(CONFIG_MITIGATION_SPECTRE_BHI) ? BHI_MITIGATION_AUTO : BHI_MITIGATION_OFF;
+/* Default to short BHB sequence values */
+u8 bhb_seq_outer_loop __ro_after_init = 5;
+u8 bhb_seq_inner_loop __ro_after_init = 5;
+
static int __init spectre_bhi_parse_cmdline(char *str)
{
if (!str)
@@ -3242,6 +3246,15 @@ void __init cpu_select_mitigations(void)
x86_spec_ctrl_base &= ~SPEC_CTRL_MITIGATIONS_MASK;
}
+ /*
+ * Switch to long BHB clear sequence on newer CPUs (with BHI_CTRL
+ * support), see Intel's BHI guidance.
+ */
+ if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL)) {
+ bhb_seq_outer_loop = 12;
+ bhb_seq_inner_loop = 7;
+ }
+
x86_arch_cap_msr = x86_read_arch_cap_msr();
cpu_print_attack_vectors();
--
2.34.1
* [PATCH v10 03/12] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
2026-04-14 7:05 ` [PATCH v10 01/12] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
2026-04-14 7:05 ` [PATCH v10 02/12] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
@ 2026-04-14 7:06 ` Pawan Gupta
2026-04-14 7:06 ` [PATCH v10 04/12] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user Pawan Gupta
` (8 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:06 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Rename clear_bhb_loop() to clear_bhb_loop_nofence() to reflect the recent
change that moved the LFENCE to the caller side.
Suggested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 8 ++++----
arch/x86/include/asm/nospec-branch.h | 6 +++---
arch/x86/net/bpf_jit_comp.c | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index bbd4b1c7ec04..1f56d086d312 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1532,7 +1532,7 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* Note, callers should use a speculation barrier like LFENCE immediately after
* a call to this function to ensure BHB is cleared before indirect branches.
*/
-SYM_FUNC_START(clear_bhb_loop)
+SYM_FUNC_START(clear_bhb_loop_nofence)
ANNOTATE_NOENDBR
push %rbp
mov %rsp, %rbp
@@ -1570,6 +1570,6 @@ SYM_FUNC_START(clear_bhb_loop)
5:
pop %rbp
RET
-SYM_FUNC_END(clear_bhb_loop)
-EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop)
-STACK_FRAME_NON_STANDARD(clear_bhb_loop)
+SYM_FUNC_END(clear_bhb_loop_nofence)
+EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop_nofence)
+STACK_FRAME_NON_STANDARD(clear_bhb_loop_nofence)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 87b83ae7c97f..157eb69c7f0f 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -331,11 +331,11 @@
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
- ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_LOOP
+ ALTERNATIVE "", "call clear_bhb_loop_nofence; lfence", X86_FEATURE_CLEAR_BHB_LOOP
.endm
.macro CLEAR_BRANCH_HISTORY_VMEXIT
- ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
+ ALTERNATIVE "", "call clear_bhb_loop_nofence; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
.endm
#else
#define CLEAR_BRANCH_HISTORY
@@ -389,7 +389,7 @@ extern void entry_untrain_ret(void);
extern void write_ibpb(void);
#ifdef CONFIG_X86_64
-extern void clear_bhb_loop(void);
+extern void clear_bhb_loop_nofence(void);
#endif
extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 63d6c9fa5e80..f40e88f87273 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1619,7 +1619,7 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
EMIT1(0x51); /* push rcx */
ip += 2;
- func = (u8 *)clear_bhb_loop;
+ func = (u8 *)clear_bhb_loop_nofence;
ip += x86_call_depth_emit_accounting(&prog, func, ip);
if (emit_call(&prog, func, ip))
--
2.34.1
* [PATCH v10 04/12] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (2 preceding siblings ...)
2026-04-14 7:06 ` [PATCH v10 03/12] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence() Pawan Gupta
@ 2026-04-14 7:06 ` Pawan Gupta
2026-04-14 7:06 ` [PATCH v10 05/12] x86/vmscape: Move mitigation selection to a switch() Pawan Gupta
` (7 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:06 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
With the upcoming changes, x86_ibpb_exit_to_user will also be used when the
BHB clearing sequence is used. Rename it to cover both cases.
No functional change.
Suggested-by: Sean Christopherson <seanjc@google.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/include/asm/entry-common.h | 6 +++---
arch/x86/include/asm/nospec-branch.h | 2 +-
arch/x86/kernel/cpu/bugs.c | 4 ++--
arch/x86/kvm/x86.c | 2 +-
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index ce3eb6d5fdf9..c45858db16c9 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -94,11 +94,11 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
*/
choose_random_kstack_offset(rdtsc());
- /* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
+ /* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
- this_cpu_read(x86_ibpb_exit_to_user)) {
+ this_cpu_read(x86_predictor_flush_exit_to_user)) {
indirect_branch_prediction_barrier();
- this_cpu_write(x86_ibpb_exit_to_user, false);
+ this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 157eb69c7f0f..0381db59c39d 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -533,7 +533,7 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
: "memory");
}
-DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
+DECLARE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
static inline void indirect_branch_prediction_barrier(void)
{
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 2cb4a96247d8..002bf4adccc3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -65,8 +65,8 @@ EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
* be needed to before running userspace. That IBPB will flush the branch
* predictor content.
*/
-DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
-EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);
+DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fd1c4a36b593..45d7cfedc507 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11464,7 +11464,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* may migrate to.
*/
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
- this_cpu_write(x86_ibpb_exit_to_user, true);
+ this_cpu_write(x86_predictor_flush_exit_to_user, true);
/*
* Consume any pending interrupts, including the possible source of
--
2.34.1
* [PATCH v10 05/12] x86/vmscape: Move mitigation selection to a switch()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (3 preceding siblings ...)
2026-04-14 7:06 ` [PATCH v10 04/12] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user Pawan Gupta
@ 2026-04-14 7:06 ` Pawan Gupta
2026-04-14 7:06 ` [PATCH v10 06/12] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier() Pawan Gupta
` (6 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:06 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
This ensures that all mitigation modes are explicitly handled, while
keeping the mitigation selection for each mode together. This also prepares
for adding a BHB-clearing mitigation mode for VMSCAPE.
Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/kernel/cpu/bugs.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 002bf4adccc3..636280c612f0 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3088,17 +3088,33 @@ early_param("vmscape", vmscape_parse_cmdline);
static void __init vmscape_select_mitigation(void)
{
- if (!boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
- !boot_cpu_has(X86_FEATURE_IBPB)) {
+ if (!boot_cpu_has_bug(X86_BUG_VMSCAPE)) {
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
return;
}
- if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) {
- if (should_mitigate_vuln(X86_BUG_VMSCAPE))
+ if ((vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) &&
+ !should_mitigate_vuln(X86_BUG_VMSCAPE))
+ vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+
+ switch (vmscape_mitigation) {
+ case VMSCAPE_MITIGATION_NONE:
+ break;
+
+ case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+ if (!boot_cpu_has(X86_FEATURE_IBPB))
+ vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+ break;
+
+ case VMSCAPE_MITIGATION_AUTO:
+ if (boot_cpu_has(X86_FEATURE_IBPB))
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
else
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+ break;
+
+ default:
+ break;
}
}
--
2.34.1
* [PATCH v10 06/12] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (4 preceding siblings ...)
2026-04-14 7:06 ` [PATCH v10 05/12] x86/vmscape: Move mitigation selection to a switch() Pawan Gupta
@ 2026-04-14 7:06 ` Pawan Gupta
2026-04-14 7:07 ` [PATCH v10 07/12] static_call: Add EXPORT_STATIC_CALL_FOR_MODULES() Pawan Gupta
` (5 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:06 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
indirect_branch_prediction_barrier() is a wrapper around write_ibpb() that
also checks whether the CPU supports IBPB. For VMSCAPE, the call to
indirect_branch_prediction_barrier() is only reachable when the CPU
supports IBPB.
Simply call write_ibpb() directly to avoid unnecessary alternative
patching.
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/include/asm/entry-common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index c45858db16c9..78b143673ca7 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -97,7 +97,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
/* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
this_cpu_read(x86_predictor_flush_exit_to_user)) {
- indirect_branch_prediction_barrier();
+ write_ibpb();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
--
2.34.1
* [PATCH v10 07/12] static_call: Add EXPORT_STATIC_CALL_FOR_MODULES()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (5 preceding siblings ...)
2026-04-14 7:06 ` [PATCH v10 06/12] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier() Pawan Gupta
@ 2026-04-14 7:07 ` Pawan Gupta
2026-04-14 7:07 ` [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM() Pawan Gupta
` (4 subsequent siblings)
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:07 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
EXPORT_STATIC_CALL_TRAMP() exists to hide the static key from all modules.
But there is no static_call equivalent of EXPORT_SYMBOL_FOR_MODULES() to
restrict symbol visibility to only certain modules.
Add EXPORT_STATIC_CALL_FOR_MODULES(name, mods) that wraps both the key and
the trampoline with EXPORT_SYMBOL_FOR_MODULES(), allowing only a limited
set of modules to see and update the static key.
The immediate user is KVM, in the following commit.
checkpatch reported the below warnings with this change, which I believe
don't apply in this case:
include/linux/static_call.h:219: WARNING: Non-declarative macros with multiple statements should be enclosed in a do - while loop
include/linux/static_call.h:220: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
include/linux/static_call.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 78a77a4ae0ea..b610afd1ed55 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -216,6 +216,9 @@ extern long __static_call_return0(void);
#define EXPORT_STATIC_CALL_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods) \
+ EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods); \
+ EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_TRAMP(name), mods)
/* Leave the key unexported, so modules can't change static call targets: */
#define EXPORT_STATIC_CALL_TRAMP(name) \
@@ -276,6 +279,9 @@ extern long __static_call_return0(void);
#define EXPORT_STATIC_CALL_GPL(name) \
EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name)); \
EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods) \
+ EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods); \
+ EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_TRAMP(name), mods)
/* Leave the key unexported, so modules can't change static call targets: */
#define EXPORT_STATIC_CALL_TRAMP(name) \
@@ -346,6 +352,8 @@ static inline int static_call_text_reserved(void *start, void *end)
#define EXPORT_STATIC_CALL(name) EXPORT_SYMBOL(STATIC_CALL_KEY(name))
#define EXPORT_STATIC_CALL_GPL(name) EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods) \
+ EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods)
#endif /* CONFIG_HAVE_STATIC_CALL */
--
2.34.1
* [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM()
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (6 preceding siblings ...)
2026-04-14 7:07 ` [PATCH v10 07/12] static_call: Add EXPORT_STATIC_CALL_FOR_MODULES() Pawan Gupta
@ 2026-04-14 7:07 ` Pawan Gupta
2026-04-16 22:44 ` Sean Christopherson
2026-04-14 7:07 ` [PATCH v10 09/12] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
` (3 subsequent siblings)
11 siblings, 1 reply; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:07 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
EXPORT_SYMBOL_FOR_KVM() exists to export symbols to KVM modules. Static
calls need the same treatment when the core kernel defines a static_call
that KVM needs access to (e.g. from a VM-exit path).
Define EXPORT_STATIC_CALL_FOR_KVM() as the static_call analogue of
EXPORT_SYMBOL_FOR_KVM(). The same three-way logic applies:
- KVM_SUB_MODULES defined: export to "kvm," plus all sub-modules
- KVM=m, no sub-modules: export to "kvm" only
- KVM built-in: no export needed (noop)
As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override the
definition (e.g. to suppress the export when kvm.ko itself will not be
built despite CONFIG_KVM=m). Add the x86 no-op override in
arch/x86/include/asm/kvm_types.h for that case.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/include/asm/kvm_types.h | 1 +
include/linux/kvm_types.h | 13 ++++++++++++-
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_types.h
index d7c704ed1be9..bceeaed2940e 100644
--- a/arch/x86/include/asm/kvm_types.h
+++ b/arch/x86/include/asm/kvm_types.h
@@ -15,6 +15,7 @@
* at least one vendor module is enabled.
*/
#define EXPORT_SYMBOL_FOR_KVM(symbol)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
#endif
#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index a568d8e6f4e8..c81f4fdba625 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -13,6 +13,8 @@
EXPORT_SYMBOL_FOR_MODULES(symbol, __stringify(KVM_SUB_MODULES))
#define EXPORT_SYMBOL_FOR_KVM(symbol) \
EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) \
+ EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
#else
#define EXPORT_SYMBOL_FOR_KVM_INTERNAL(symbol)
/*
@@ -27,7 +29,16 @@
#define EXPORT_SYMBOL_FOR_KVM(symbol)
#endif /* IS_MODULE(CONFIG_KVM) */
#endif /* EXPORT_SYMBOL_FOR_KVM */
-#endif
+
+#ifndef EXPORT_STATIC_CALL_FOR_KVM
+#if IS_MODULE(CONFIG_KVM)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm")
+#else
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
+#endif /* IS_MODULE(CONFIG_KVM) */
+#endif /* EXPORT_STATIC_CALL_FOR_KVM */
+
+#endif /* KVM_SUB_MODULES */
#ifndef __ASSEMBLER__
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread

* Re: [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM()
2026-04-14 7:07 ` [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM() Pawan Gupta
@ 2026-04-16 22:44 ` Sean Christopherson
2026-04-16 23:12 ` Pawan Gupta
0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2026-04-16 22:44 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Apr 14, 2026, Pawan Gupta wrote:
> EXPORT_SYMBOL_FOR_KVM() exists to export symbols to KVM modules. Static
> calls need the same treatment when the core kernel defines a static_call
> that KVM needs access to (e.g. from a VM-exit path).
>
> Define EXPORT_STATIC_CALL_FOR_KVM() as the static_call analogue of
> EXPORT_SYMBOL_FOR_KVM(). The same three-way logic applies:
>
> - KVM_SUB_MODULES defined: export to "kvm," plus all sub-modules
> - KVM=m, no sub-modules: export to "kvm" only
> - KVM built-in: no export needed (noop)
>
> As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override the
> definition (e.g. to suppress the export when kvm.ko itself will not be
> built despite CONFIG_KVM=m). Add the x86 no-op override in
> arch/x86/include/asm/kvm_types.h for that case.
>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
> arch/x86/include/asm/kvm_types.h | 1 +
> include/linux/kvm_types.h | 13 ++++++++++++-
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_types.h
> index d7c704ed1be9..bceeaed2940e 100644
> --- a/arch/x86/include/asm/kvm_types.h
> +++ b/arch/x86/include/asm/kvm_types.h
> @@ -15,6 +15,7 @@
> * at least one vendor module is enabled.
> */
> #define EXPORT_SYMBOL_FOR_KVM(symbol)
> +#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
> #endif
>
> #define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index a568d8e6f4e8..c81f4fdba625 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -13,6 +13,8 @@
> EXPORT_SYMBOL_FOR_MODULES(symbol, __stringify(KVM_SUB_MODULES))
> #define EXPORT_SYMBOL_FOR_KVM(symbol) \
> EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
> +#define EXPORT_STATIC_CALL_FOR_KVM(symbol) \
> + EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
> #else
> #define EXPORT_SYMBOL_FOR_KVM_INTERNAL(symbol)
> /*
> @@ -27,7 +29,16 @@
> #define EXPORT_SYMBOL_FOR_KVM(symbol)
> #endif /* IS_MODULE(CONFIG_KVM) */
> #endif /* EXPORT_SYMBOL_FOR_KVM */
> -#endif
> +
> +#ifndef EXPORT_STATIC_CALL_FOR_KVM
> +#if IS_MODULE(CONFIG_KVM)
> +#define EXPORT_STATIC_CALL_FOR_KVM(symbol) EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm")
> +#else
> +#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
> +#endif /* IS_MODULE(CONFIG_KVM) */
> +#endif /* EXPORT_STATIC_CALL_FOR_KVM */
> +
> +#endif /* KVM_SUB_MODULES */
I think I'd prefer to require EXPORT_SYMBOL_FOR_KVM and EXPORT_STATIC_CALL_FOR_KVM
to come as a pair from arch code. I can't think of a scenario where arch code
should override one but not the other. The end result is slightly less ugly :-)
---
arch/x86/include/asm/kvm_types.h | 1 +
include/linux/kvm_types.h | 10 +++++++++-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_types.h
index d7c704ed1be9..bceeaed2940e 100644
--- a/arch/x86/include/asm/kvm_types.h
+++ b/arch/x86/include/asm/kvm_types.h
@@ -15,6 +15,7 @@
* at least one vendor module is enabled.
*/
#define EXPORT_SYMBOL_FOR_KVM(symbol)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
#endif
#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index a568d8e6f4e8..be602d3f287e 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -13,6 +13,8 @@
EXPORT_SYMBOL_FOR_MODULES(symbol, __stringify(KVM_SUB_MODULES))
#define EXPORT_SYMBOL_FOR_KVM(symbol) \
EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) \
+ EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
#else
#define EXPORT_SYMBOL_FOR_KVM_INTERNAL(symbol)
/*
@@ -23,11 +25,17 @@
#ifndef EXPORT_SYMBOL_FOR_KVM
#if IS_MODULE(CONFIG_KVM)
#define EXPORT_SYMBOL_FOR_KVM(symbol) EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm")
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm")
#else
#define EXPORT_SYMBOL_FOR_KVM(symbol)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
#endif /* IS_MODULE(CONFIG_KVM) */
+#else
+#ifndef EXPORT_STATIC_CALL_FOR_KVM
+#error Must #define EXPORT_STATIC_CALL_FOR_KVM if #defining EXPORT_SYMBOL_FOR_KVM
+#endif
#endif /* EXPORT_SYMBOL_FOR_KVM */
-#endif
+#endif /* KVM_SUB_MODULES */
#ifndef __ASSEMBLER__
base-commit: 56b7ace84970ff647b095849e80bc36c094760aa
--
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM()
2026-04-16 22:44 ` Sean Christopherson
@ 2026-04-16 23:12 ` Pawan Gupta
0 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-16 23:12 UTC (permalink / raw)
To: Sean Christopherson
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Thu, Apr 16, 2026 at 03:44:49PM -0700, Sean Christopherson wrote:
> I think I'd prefer to require EXPORT_SYMBOL_FOR_KVM and EXPORT_STATIC_CALL_FOR_KVM
> to come as a pair from arch code. I can't think of a scenario where arch code
> should override one but not the other.
Right, will change as you suggested.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v10 09/12] x86/vmscape: Use static_call() for predictor flush
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (7 preceding siblings ...)
2026-04-14 7:07 ` [PATCH v10 08/12] kvm: Define EXPORT_STATIC_CALL_FOR_KVM() Pawan Gupta
@ 2026-04-14 7:07 ` Pawan Gupta
2026-04-16 22:45 ` Sean Christopherson
2026-04-14 7:07 ` [PATCH v10 10/12] x86/vmscape: Deploy BHB clearing mitigation Pawan Gupta
` (2 subsequent siblings)
11 siblings, 1 reply; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:07 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Adding more mitigation options at exit-to-userspace for VMSCAPE would
usually require a series of checks to decide which mitigation to use. In
this case, the mitigation is performed by calling a function that is
selected at boot. So, adding more feature flags and multiple checks can be
avoided by using a static_call() to invoke the mitigating function.
Replace the flag-based mitigation selector with a static_call(). This also
frees the existing X86_FEATURE_IBPB_EXIT_TO_USER.
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 7 +++----
arch/x86/include/asm/nospec-branch.h | 3 +++
arch/x86/kernel/cpu/bugs.c | 9 ++++++++-
arch/x86/kvm/x86.c | 2 +-
6 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e2df1b147184..5b8def9ddb98 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2720,6 +2720,7 @@ config MITIGATION_TSA
config MITIGATION_VMSCAPE
bool "Mitigate VMSCAPE"
depends on KVM
+ depends on HAVE_STATIC_CALL
default y
help
Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dbe104df339b..b4d529dd6d30 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -503,7 +503,7 @@
#define X86_FEATURE_TSA_SQ_NO (21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
#define X86_FEATURE_TSA_L1_NO (21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
#define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
-#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
+/* Free */
#define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
#define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
#define X86_FEATURE_SGX_EUPDATESVN (21*32+17) /* Support for ENCLS[EUPDATESVN] instruction */
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 78b143673ca7..783e7cb50cae 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -4,6 +4,7 @@
#include <linux/randomize_kstack.h>
#include <linux/user-return-notifier.h>
+#include <linux/static_call_types.h>
#include <asm/nospec-branch.h>
#include <asm/io_bitmap.h>
@@ -94,10 +95,8 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
*/
choose_random_kstack_offset(rdtsc());
- /* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
- if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
- this_cpu_read(x86_predictor_flush_exit_to_user)) {
- write_ibpb();
+ if (unlikely(this_cpu_read(x86_predictor_flush_exit_to_user))) {
+ static_call_cond(vmscape_predictor_flush)();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 0381db59c39d..066fd8095200 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -542,6 +542,9 @@ static inline void indirect_branch_prediction_barrier(void)
:: "rax", "rcx", "rdx", "memory");
}
+#include <linux/static_call_types.h>
+DECLARE_STATIC_CALL(vmscape_predictor_flush, write_ibpb);
+
/* The Intel SPEC CTRL MSR base value cache */
extern u64 x86_spec_ctrl_base;
DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 636280c612f0..bfc0e41697f6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -144,6 +144,13 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
*/
DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+/*
+ * Controls how vmscape is mitigated e.g. via IBPB or BHB-clear
+ * sequence. This defaults to no mitigation.
+ */
+DEFINE_STATIC_CALL_NULL(vmscape_predictor_flush, write_ibpb);
+EXPORT_STATIC_CALL_FOR_KVM(vmscape_predictor_flush);
+
#undef pr_fmt
#define pr_fmt(fmt) "mitigations: " fmt
@@ -3133,7 +3140,7 @@ static void __init vmscape_update_mitigation(void)
static void __init vmscape_apply_mitigation(void)
{
if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
- setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
+ static_call_update(vmscape_predictor_flush, write_ibpb);
}
#undef pr_fmt
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45d7cfedc507..5582056b2fa1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11463,7 +11463,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* set for the CPU that actually ran the guest, and not the CPU that it
* may migrate to.
*/
- if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
+ if (static_call_query(vmscape_predictor_flush))
this_cpu_write(x86_predictor_flush_exit_to_user, true);
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread

* Re: [PATCH v10 09/12] x86/vmscape: Use static_call() for predictor flush
2026-04-14 7:07 ` [PATCH v10 09/12] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
@ 2026-04-16 22:45 ` Sean Christopherson
0 siblings, 0 replies; 17+ messages in thread
From: Sean Christopherson @ 2026-04-16 22:45 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Borislav Petkov, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Apr 14, 2026, Pawan Gupta wrote:
> Adding more mitigation options at exit-to-userspace for VMSCAPE would
> usually require a series of checks to decide which mitigation to use. In
> this case, the mitigation is performed by calling a function that is
> selected at boot. So, adding more feature flags and multiple checks can be
> avoided by using a static_call() to invoke the mitigating function.
>
> Replace the flag-based mitigation selector with a static_call(). This also
> frees the existing X86_FEATURE_IBPB_EXIT_TO_USER.
>
> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Tested-by: Jon Kohler <jon@nutanix.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
For the KVM change,
Acked-by: Sean Christopherson <seanjc@google.com>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 45d7cfedc507..5582056b2fa1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11463,7 +11463,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> * set for the CPU that actually ran the guest, and not the CPU that it
> * may migrate to.
> */
> - if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
> + if (static_call_query(vmscape_predictor_flush))
> this_cpu_write(x86_predictor_flush_exit_to_user, true);
>
> /*
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v10 10/12] x86/vmscape: Deploy BHB clearing mitigation
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (8 preceding siblings ...)
2026-04-14 7:07 ` [PATCH v10 09/12] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
@ 2026-04-14 7:07 ` Pawan Gupta
2026-04-14 7:08 ` [PATCH v10 11/12] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force Pawan Gupta
2026-04-14 7:08 ` [PATCH v10 12/12] x86/vmscape: Add cmdline vmscape=on to override attack vector controls Pawan Gupta
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:07 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
IBPB mitigation for VMSCAPE is overkill on CPUs that are only affected
by the BHI variant of VMSCAPE. On such CPUs, eIBRS already provides
indirect branch isolation between guest and host userspace. However, branch
history from the guest may still influence the indirect branches in host
userspace.
To mitigate the BHI aspect, use the BHB clearing sequence. Since IBPB is
no longer the only mitigation for VMSCAPE, update the documentation to
reflect that =auto can select either the IBPB or the BHB clearing
mitigation based on the CPU.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
Documentation/admin-guide/hw-vuln/vmscape.rst | 11 ++++++++-
Documentation/admin-guide/kernel-parameters.txt | 4 +++-
arch/x86/include/asm/entry-common.h | 4 ++++
arch/x86/include/asm/nospec-branch.h | 2 ++
arch/x86/kernel/cpu/bugs.c | 30 +++++++++++++++++++------
5 files changed, 42 insertions(+), 9 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index d9b9a2b6c114..7c40cf70ad7a 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -86,6 +86,10 @@ The possible values in this file are:
run a potentially malicious guest and issues an IBPB before the first
exit to userspace after VM-exit.
+ * 'Mitigation: Clear BHB before exit to userspace':
+
+ As above, conditional BHB clearing mitigation is enabled.
+
* 'Mitigation: IBPB on VMEXIT':
IBPB is issued on every VM-exit. This occurs when other mitigations like
@@ -102,9 +106,14 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
* ``vmscape=ibpb``:
- Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
+ Enable conditional IBPB mitigation.
* ``vmscape=force``:
Force vulnerability detection and mitigation even on processors that are
not known to be affected.
+
+ * ``vmscape=auto``:
+
+ Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
+ (default when CONFIG_MITIGATION_VMSCAPE=y)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 03a550630644..3853c7109419 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8378,9 +8378,11 @@ Kernel parameters
off - disable the mitigation
ibpb - use Indirect Branch Prediction Barrier
- (IBPB) mitigation (default)
+ (IBPB) mitigation
force - force vulnerability detection even on
unaffected processors
+ auto - (default) use IBPB or BHB clear
+ mitigation based on CPU
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 783e7cb50cae..13db31472f3a 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -96,6 +96,10 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
choose_random_kstack_offset(rdtsc());
if (unlikely(this_cpu_read(x86_predictor_flush_exit_to_user))) {
+ /*
+ * Since the mitigation is for userspace, an explicit
+ * speculation barrier is not required after flush.
+ */
static_call_cond(vmscape_predictor_flush)();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 066fd8095200..38478383139b 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -390,6 +390,8 @@ extern void write_ibpb(void);
#ifdef CONFIG_X86_64
extern void clear_bhb_loop_nofence(void);
+#else
+static inline void clear_bhb_loop_nofence(void) {}
#endif
extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bfc0e41697f6..1082ed1fb2e6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -61,9 +61,8 @@ DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
/*
- * Set when the CPU has run a potentially malicious guest. An IBPB will
- * be needed to before running userspace. That IBPB will flush the branch
- * predictor content.
+ * Set when the CPU has run a potentially malicious guest. Indicates that a
+ * branch predictor flush is needed before running userspace.
*/
DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
@@ -3061,13 +3060,15 @@ enum vmscape_mitigations {
VMSCAPE_MITIGATION_AUTO,
VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
+ VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
};
static const char * const vmscape_strings[] = {
- [VMSCAPE_MITIGATION_NONE] = "Vulnerable",
+ [VMSCAPE_MITIGATION_NONE] = "Vulnerable",
/* [VMSCAPE_MITIGATION_AUTO] */
- [VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
- [VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
+ [VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
+ [VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
+ [VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER] = "Mitigation: Clear BHB before exit to userspace",
};
static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
@@ -3085,6 +3086,8 @@ static int __init vmscape_parse_cmdline(char *str)
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+ } else if (!strcmp(str, "auto")) {
+ vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
pr_err("Ignoring unknown vmscape=%s option.\n", str);
}
@@ -3114,7 +3117,17 @@ static void __init vmscape_select_mitigation(void)
break;
case VMSCAPE_MITIGATION_AUTO:
- if (boot_cpu_has(X86_FEATURE_IBPB))
+ /*
+ * CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
+ * BHB clear sequence. These CPUs are only vulnerable to the BHI
+ * variant of the VMSCAPE attack, and thus they do not require a
+ * full predictor flush.
+ *
+ * Note, in 32-bit mode BHB clear sequence is not supported.
+ */
+ if (boot_cpu_has(X86_FEATURE_BHI_CTRL) && IS_ENABLED(CONFIG_X86_64))
+ vmscape_mitigation = VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER;
+ else if (boot_cpu_has(X86_FEATURE_IBPB))
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
else
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
@@ -3141,6 +3154,8 @@ static void __init vmscape_apply_mitigation(void)
{
if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
static_call_update(vmscape_predictor_flush, write_ibpb);
+ else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
+ static_call_update(vmscape_predictor_flush, clear_bhb_loop_nofence);
}
#undef pr_fmt
@@ -3232,6 +3247,7 @@ void cpu_bugs_smt_update(void)
break;
case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+ case VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER:
/*
* Hypervisors can be attacked across-threads, warn for SMT when
* STIBP is not already enabled system-wide.
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread

* [PATCH v10 11/12] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (9 preceding siblings ...)
2026-04-14 7:07 ` [PATCH v10 10/12] x86/vmscape: Deploy BHB clearing mitigation Pawan Gupta
@ 2026-04-14 7:08 ` Pawan Gupta
2026-04-14 7:08 ` [PATCH v10 12/12] x86/vmscape: Add cmdline vmscape=on to override attack vector controls Pawan Gupta
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:08 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
The vmscape=force option currently defaults to the AUTO mitigation. This
lets attack-vector controls override the vmscape mitigation, preventing
the user from actually forcing the VMSCAPE mitigation.
When the vmscape mitigation is forced, allow it to be deployed irrespective
of attack vectors. Introduce VMSCAPE_MITIGATION_ON, which wins over
attack-vector controls.
Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/kernel/cpu/bugs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 1082ed1fb2e6..fbdb137720c4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3058,6 +3058,7 @@ static void __init srso_apply_mitigation(void)
enum vmscape_mitigations {
VMSCAPE_MITIGATION_NONE,
VMSCAPE_MITIGATION_AUTO,
+ VMSCAPE_MITIGATION_ON,
VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
@@ -3066,6 +3067,7 @@ enum vmscape_mitigations {
static const char * const vmscape_strings[] = {
[VMSCAPE_MITIGATION_NONE] = "Vulnerable",
/* [VMSCAPE_MITIGATION_AUTO] */
+ /* [VMSCAPE_MITIGATION_ON] */
[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
[VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER] = "Mitigation: Clear BHB before exit to userspace",
@@ -3085,7 +3087,7 @@ static int __init vmscape_parse_cmdline(char *str)
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
- vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+ vmscape_mitigation = VMSCAPE_MITIGATION_ON;
} else if (!strcmp(str, "auto")) {
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
@@ -3117,6 +3119,7 @@ static void __init vmscape_select_mitigation(void)
break;
case VMSCAPE_MITIGATION_AUTO:
+ case VMSCAPE_MITIGATION_ON:
/*
* CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
* BHB clear sequence. These CPUs are only vulnerable to the BHI
@@ -3244,6 +3247,7 @@ void cpu_bugs_smt_update(void)
switch (vmscape_mitigation) {
case VMSCAPE_MITIGATION_NONE:
case VMSCAPE_MITIGATION_AUTO:
+ case VMSCAPE_MITIGATION_ON:
break;
case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread

* [PATCH v10 12/12] x86/vmscape: Add cmdline vmscape=on to override attack vector controls
2026-04-14 7:05 [PATCH v10 00/12] VMSCAPE optimization for BHI variant Pawan Gupta
` (10 preceding siblings ...)
2026-04-14 7:08 ` [PATCH v10 11/12] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force Pawan Gupta
@ 2026-04-14 7:08 ` Pawan Gupta
11 siblings, 0 replies; 17+ messages in thread
From: Pawan Gupta @ 2026-04-14 7:08 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
In general, individual mitigation knobs override the attack-vector
controls. For VMSCAPE, =ibpb exists, but there is no option to select the
BHB clearing mitigation. The =force option would select BHB clearing when
supported, but with the side-effect of also forcing the bug, hence
deploying the mitigation on unaffected parts too.
Add a new cmdline option vmscape=on to enable the mitigation based on the
VMSCAPE variant the CPU is affected by.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
Documentation/admin-guide/hw-vuln/vmscape.rst | 4 ++++
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/bugs.c | 2 ++
3 files changed, 8 insertions(+)
diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index 7c40cf70ad7a..2558a5c3d956 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -117,3 +117,7 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
(default when CONFIG_MITIGATION_VMSCAPE=y)
+
+ * ``vmscape=on``:
+
+ Same as ``auto``, except that it overrides attack vector controls.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3853c7109419..98204d464477 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8383,6 +8383,8 @@ Kernel parameters
unaffected processors
auto - (default) use IBPB or BHB clear
mitigation based on CPU
+ on - same as "auto", but override attack
+ vector control
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index fbdb137720c4..4e0b77fb21dd 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3088,6 +3088,8 @@ static int __init vmscape_parse_cmdline(char *str)
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
vmscape_mitigation = VMSCAPE_MITIGATION_ON;
+ } else if (!strcmp(str, "on")) {
+ vmscape_mitigation = VMSCAPE_MITIGATION_ON;
} else if (!strcmp(str, "auto")) {
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 17+ messages in thread