* [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
@ 2026-03-24 18:16 ` Pawan Gupta
2026-03-24 20:22 ` Borislav Petkov
2026-03-24 18:16 ` [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
` (8 subsequent siblings)
9 siblings, 1 reply; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:16 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Currently, BHB clearing sequence is followed by an LFENCE to prevent
transient execution of subsequent indirect branches prematurely. However,
LFENCE barrier could be unnecessary in certain cases. For example, when
kernel is using BHI_DIS_S mitigation, and BHB clearing is only needed for
userspace. In such cases, LFENCE is redundant because ring transitions
would provide the necessary serialization.
Below is a quick recap of BHI mitigation options:
On Alder Lake and newer
- BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
performance overhead.
- Long loop: Alternatively, longer version of BHB clearing sequence
can be used to mitigate BHI. It can also be used to mitigate
BHI variant of VMSCAPE. This is not yet implemented in
Linux.
On older CPUs
- Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
effective on older CPUs as well, but should be avoided
because of unnecessary overhead.
On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
branch history may still influence indirect branches in userspace. This
also means the big hammer IBPB could be replaced with a cheaper option that
clears the BHB at exit-to-userspace after a VMexit.
In preparation for adding the support for BHB sequence (without LFENCE) on
newer CPUs, move the LFENCE to the caller side after clear_bhb_loop() is
executed. Allow callers to decide whether they need the LFENCE or
not. This adds a few extra bytes to the call sites, but it obviates
the need for multiple variants of clear_bhb_loop().
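An illustrative sketch (pseudocode, not the exact kernel macros or final call sites) of the pattern this enables:

```asm
/* Sketch only: each call site decides whether the barrier is needed. */

/* Kernel entry / VMexit, where indirect branches may follow soon: */
	call	clear_bhb_loop
	lfence				/* serialize before indirect branches */

/* Exit-to-userspace, where the ring transition itself serializes: */
	call	clear_bhb_loop
	/* no LFENCE needed */
```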
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 5 ++++-
arch/x86/include/asm/nospec-branch.h | 4 ++--
arch/x86/net/bpf_jit_comp.c | 2 ++
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 42447b1e1dff..3a180a36ca0e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1528,6 +1528,9 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* refactored in the future if needed. The .skips are for safety, to ensure
* that all RETs are in the second half of a cacheline to mitigate Indirect
* Target Selection, rather than taking the slowpath via its_return_thunk.
+ *
+ * Note, callers should use a speculation barrier like LFENCE immediately after
+ * a call to this function to ensure BHB is cleared before indirect branches.
*/
SYM_FUNC_START(clear_bhb_loop)
ANNOTATE_NOENDBR
@@ -1562,7 +1565,7 @@ SYM_FUNC_START(clear_bhb_loop)
sub $1, %ecx
jnz 1b
.Lret2: RET
-5: lfence
+5:
pop %rbp
RET
SYM_FUNC_END(clear_bhb_loop)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 4f4b5e8a1574..70b377fcbc1c 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -331,11 +331,11 @@
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
- ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_LOOP
+ ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_LOOP
.endm
.macro CLEAR_BRANCH_HISTORY_VMEXIT
- ALTERNATIVE "", "call clear_bhb_loop", X86_FEATURE_CLEAR_BHB_VMEXIT
+ ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
.endm
#else
#define CLEAR_BRANCH_HISTORY
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e9b78040d703..63d6c9fa5e80 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1624,6 +1624,8 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
if (emit_call(&prog, func, ip))
return -EINVAL;
+ /* Don't speculate past this until BHB is cleared */
+ EMIT_LFENCE();
EMIT1(0x59); /* pop rcx */
EMIT1(0x58); /* pop rax */
}
--
2.34.1
* Re: [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()

2026-03-24 18:16 ` [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
@ 2026-03-24 20:22 ` Borislav Petkov
2026-03-24 21:30 ` Pawan Gupta
0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2026-03-24 20:22 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Mar 24, 2026 at 11:16:36AM -0700, Pawan Gupta wrote:
> Currently, BHB clearing sequence is followed by an LFENCE to prevent
> transient execution of subsequent indirect branches prematurely. However,
> LFENCE barrier could be unnecessary in certain cases. For example, when
> kernel is using BHI_DIS_S mitigation, and BHB clearing is only needed for
> userspace. In such cases, LFENCE is redundant because ring transitions
> would provide the necessary serialization.
>
> Below is a quick recap of BHI mitigation options:
>
> On Alder Lake and newer
>
> - BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
> performance overhead.
> - Long loop: Alternatively, longer version of BHB clearing sequence
> can be used to mitigate BHI. It can also be used to mitigate
> BHI variant of VMSCAPE. This is not yet implemented in
> Linux.
>
> On older CPUs
>
> - Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
> effective on older CPUs as well, but should be avoided
> because of unnecessary overhead.
>
> On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
> guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
> branch history may still influence indirect branches in userspace. This
> also means the big hammer IBPB could be replaced with a cheaper option that
> clears the BHB at exit-to-userspace after a VMexit.
>
> In preparation for adding the support for BHB sequence (without LFENCE) on
> newer CPUs, move the LFENCE to the caller side after clear_bhb_loop() is
> executed. Allow callers to decide whether they need the LFENCE or
> not. This adds a few extra bytes to the call sites, but it obviates
> the need for multiple variants of clear_bhb_loop().
Claude, please add proper articles where they're missing in the above text:
"Currently, the BHB clearing sequence is followed by an LFENCE to prevent
transient execution of subsequent indirect branches prematurely. However, the
LFENCE barrier could be unnecessary in certain cases. For example, when the
kernel is using the BHI_DIS_S mitigation, and BHB clearing is only needed for
userspace. In such cases, the LFENCE is redundant because ring transitions
would provide the necessary serialization.
Below is a quick recap of BHI mitigation options:
On Alder Lake and newer
BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
performance overhead.
Long loop: Alternatively, a longer version of the BHB clearing sequence
can be used to mitigate BHI. It can also be used to mitigate the BHI
variant of VMSCAPE. This is not yet implemented in Linux.
On older CPUs
Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
effective on older CPUs as well, but should be avoided because of
unnecessary overhead.
On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
branch history may still influence indirect branches in userspace. This also
means the big hammer IBPB could be replaced with a cheaper option that clears
the BHB at exit-to-userspace after a VMexit.
In preparation for adding the support for the BHB sequence (without LFENCE) on
newer CPUs, move the LFENCE to the caller side after clear_bhb_loop() is
executed. Allow callers to decide whether they need the LFENCE or not. This
adds a few extra bytes to the call sites, but it obviates the need for
multiple variants of clear_bhb_loop()."
Reads proper to me. Use it for your next revision pls.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
2026-03-24 20:22 ` Borislav Petkov
@ 2026-03-24 21:30 ` Pawan Gupta
0 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 21:30 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Mar 24, 2026 at 09:22:51PM +0100, Borislav Petkov wrote:
> On Tue, Mar 24, 2026 at 11:16:36AM -0700, Pawan Gupta wrote:
> > Currently, BHB clearing sequence is followed by an LFENCE to prevent
> > transient execution of subsequent indirect branches prematurely. However,
> > LFENCE barrier could be unnecessary in certain cases. For example, when
> > kernel is using BHI_DIS_S mitigation, and BHB clearing is only needed for
> > userspace. In such cases, LFENCE is redundant because ring transitions
> > would provide the necessary serialization.
> >
> > Below is a quick recap of BHI mitigation options:
> >
> > On Alder Lake and newer
> >
> > - BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
> > performance overhead.
> > - Long loop: Alternatively, longer version of BHB clearing sequence
> > can be used to mitigate BHI. It can also be used to mitigate
> > BHI variant of VMSCAPE. This is not yet implemented in
> > Linux.
> >
> > On older CPUs
> >
> > - Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
> > effective on older CPUs as well, but should be avoided
> > because of unnecessary overhead.
> >
> > On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
> > guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
> > branch history may still influence indirect branches in userspace. This
> > also means the big hammer IBPB could be replaced with a cheaper option that
> > clears the BHB at exit-to-userspace after a VMexit.
> >
> > In preparation for adding the support for BHB sequence (without LFENCE) on
> > newer CPUs, move the LFENCE to the caller side after clear_bhb_loop() is
> > executed. Allow callers to decide whether they need the LFENCE or
> > not. This adds a few extra bytes to the call sites, but it obviates
> > the need for multiple variants of clear_bhb_loop().
>
> Claude, please add proper articles where they're missing in the above text:
>
> "Currently, the BHB clearing sequence is followed by an LFENCE to prevent
> transient execution of subsequent indirect branches prematurely. However, the
> LFENCE barrier could be unnecessary in certain cases. For example, when the
> kernel is using the BHI_DIS_S mitigation, and BHB clearing is only needed for
> userspace. In such cases, the LFENCE is redundant because ring transitions
> would provide the necessary serialization.
>
> Below is a quick recap of BHI mitigation options:
>
> On Alder Lake and newer
>
> BHI_DIS_S: Hardware control to mitigate BHI in ring0. This has low
> performance overhead.
>
> Long loop: Alternatively, a longer version of the BHB clearing sequence
> can be used to mitigate BHI. It can also be used to mitigate the BHI
> variant of VMSCAPE. This is not yet implemented in Linux.
>
> On older CPUs
>
> Short loop: Clears BHB at kernel entry and VMexit. The "Long loop" is
> effective on older CPUs as well, but should be avoided because of
> unnecessary overhead.
>
> On Alder Lake and newer CPUs, eIBRS isolates the indirect targets between
> guest and host. But when affected by the BHI variant of VMSCAPE, a guest's
> branch history may still influence indirect branches in userspace. This also
> means the big hammer IBPB could be replaced with a cheaper option that clears
> the BHB at exit-to-userspace after a VMexit.
>
> In preparation for adding the support for the BHB sequence (without LFENCE) on
> newer CPUs, move the LFENCE to the caller side after clear_bhb_loop() is
> executed. Allow callers to decide whether they need the LFENCE or not. This
> adds a few extra bytes to the call sites, but it obviates the need for
> multiple variants of clear_bhb_loop()."
>
> Reads proper to me. Use it for your next revision pls.
Sure, will use this.
* [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
2026-03-24 18:16 ` [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
@ 2026-03-24 18:16 ` Pawan Gupta
2026-03-24 20:59 ` Borislav Petkov
2026-03-25 17:50 ` Jim Mattson
2026-03-24 18:17 ` [PATCH v8 03/10] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence() Pawan Gupta
` (7 subsequent siblings)
9 siblings, 2 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:16 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
As a mitigation for BHI, clear_bhb_loop() executes branches that overwrite
the Branch History Buffer (BHB). On Alder Lake and newer parts this
sequence is not sufficient because it doesn't clear enough entries. This
was not an issue because these CPUs have a hardware control (BHI_DIS_S)
that mitigates BHI in the kernel.
The BHI variant of VMSCAPE requires isolating branch history between guests
and userspace. Note that there is no equivalent hardware control for
userspace. To effectively isolate branch history on newer CPUs,
clear_bhb_loop() should execute a sufficient number of branches to clear the
larger BHB.
Dynamically set the loop count of clear_bhb_loop() such that it is
effective on newer CPUs too. Use the hardware control enumeration
X86_FEATURE_BHI_CTRL to select the appropriate loop count.
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
arch/x86/net/bpf_jit_comp.c | 7 -------
2 files changed, 16 insertions(+), 12 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3a180a36ca0e..8128e00ca73f 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
SYM_FUNC_START(clear_bhb_loop)
ANNOTATE_NOENDBR
push %rbp
+ /* BPF caller may require %rax to be preserved */
+ push %rax
mov %rsp, %rbp
- movl $5, %ecx
+
+ /*
+ * Between the long and short version of BHB clear sequence, just the
+ * loop count differs based on BHI_CTRL, see Intel's BHI guidance.
+ */
+ ALTERNATIVE "movb $5, %al", \
+ "movb $12, %al", X86_FEATURE_BHI_CTRL
+
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
jmp 5f
@@ -1556,16 +1565,18 @@ SYM_FUNC_START(clear_bhb_loop)
* This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
* but some Clang versions (e.g. 18) don't like this.
*/
- .skip 32 - 18, 0xcc
-2: movl $5, %eax
+ .skip 32 - 14, 0xcc
+2: ALTERNATIVE "movb $5, %ah", \
+ "movb $7, %ah", X86_FEATURE_BHI_CTRL
3: jmp 4f
nop
-4: sub $1, %eax
+4: sub $1, %ah
jnz 3b
- sub $1, %ecx
+ sub $1, %al
jnz 1b
.Lret2: RET
5:
+ pop %rax
pop %rbp
RET
SYM_FUNC_END(clear_bhb_loop)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 63d6c9fa5e80..e2cceabb23e8 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1614,11 +1614,6 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
u8 *func;
if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) {
- /* The clearing sequence clobbers eax and ecx. */
- EMIT1(0x50); /* push rax */
- EMIT1(0x51); /* push rcx */
- ip += 2;
-
func = (u8 *)clear_bhb_loop;
ip += x86_call_depth_emit_accounting(&prog, func, ip);
@@ -1626,8 +1621,6 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
return -EINVAL;
/* Don't speculate past this until BHB is cleared */
EMIT_LFENCE();
- EMIT1(0x59); /* pop rcx */
- EMIT1(0x58); /* pop rax */
}
/* Insert IBHF instruction */
if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) &&
--
2.34.1
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-24 18:16 ` [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
@ 2026-03-24 20:59 ` Borislav Petkov
2026-03-24 22:13 ` Pawan Gupta
2026-03-25 17:50 ` Jim Mattson
1 sibling, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2026-03-24 20:59 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Mar 24, 2026 at 11:16:51AM -0700, Pawan Gupta wrote:
> As a mitigation for BHI, clear_bhb_loop() executes branches that overwrite
> the Branch History Buffer (BHB). On Alder Lake and newer parts this
> sequence is not sufficient because it doesn't clear enough entries. This
> was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> that mitigates BHI in the kernel.
>
> The BHI variant of VMSCAPE requires isolating branch history between guests
> and userspace. Note that there is no equivalent hardware control for
> userspace. To effectively isolate branch history on newer CPUs,
> clear_bhb_loop() should execute a sufficient number of branches to clear the
> larger BHB.
>
> Dynamically set the loop count of clear_bhb_loop() such that it is
> effective on newer CPUs too. Use the hardware control enumeration
> X86_FEATURE_BHI_CTRL to select the appropriate loop count.
>
> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
> arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> arch/x86/net/bpf_jit_comp.c | 7 -------
> 2 files changed, 16 insertions(+), 12 deletions(-)
Ok, pls tell me why this below doesn't work?
The additional indirection makes even the BHB loop code simpler.
(I didn't pay too much attention to the labels, 2: is probably weird there).
---
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3a180a36ca0e..95c7ed9afbbe 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1532,11 +1532,13 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* Note, callers should use a speculation barrier like LFENCE immediately after
* a call to this function to ensure BHB is cleared before indirect branches.
*/
-SYM_FUNC_START(clear_bhb_loop)
+SYM_FUNC_START(__clear_bhb_loop)
ANNOTATE_NOENDBR
push %rbp
+ /* BPF caller may require %rax to be preserved */
+ push %rax
mov %rsp, %rbp
- movl $5, %ecx
+
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
jmp 5f
@@ -1557,17 +1559,17 @@ SYM_FUNC_START(clear_bhb_loop)
* but some Clang versions (e.g. 18) don't like this.
*/
.skip 32 - 18, 0xcc
-2: movl $5, %eax
+2:
3: jmp 4f
nop
-4: sub $1, %eax
+4: sub $1, %rsi
jnz 3b
- sub $1, %ecx
+ sub $1, %rdi
jnz 1b
.Lret2: RET
5:
+ pop %rax
pop %rbp
RET
-SYM_FUNC_END(clear_bhb_loop)
-EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop)
-STACK_FRAME_NON_STANDARD(clear_bhb_loop)
+SYM_FUNC_END(__clear_bhb_loop)
+STACK_FRAME_NON_STANDARD(__clear_bhb_loop)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 70b377fcbc1c..a9f406941e11 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -390,6 +390,7 @@ extern void write_ibpb(void);
#ifdef CONFIG_X86_64
extern void clear_bhb_loop(void);
+extern void __clear_bhb_loop(unsigned int a, unsigned int b);
#endif
extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 83f51cab0b1e..c41b0548cf2a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3735,3 +3735,11 @@ void __warn_thunk(void)
{
WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
}
+
+void clear_bhb_loop(void)
+{
+ if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL))
+ __clear_bhb_loop(12, 7);
+ else
+ __clear_bhb_loop(5, 5);
+}
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 63d6c9fa5e80..e2cceabb23e8 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1614,11 +1614,6 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
u8 *func;
if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) {
- /* The clearing sequence clobbers eax and ecx. */
- EMIT1(0x50); /* push rax */
- EMIT1(0x51); /* push rcx */
- ip += 2;
-
func = (u8 *)clear_bhb_loop;
ip += x86_call_depth_emit_accounting(&prog, func, ip);
@@ -1626,8 +1621,6 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
return -EINVAL;
/* Don't speculate past this until BHB is cleared */
EMIT_LFENCE();
- EMIT1(0x59); /* pop rcx */
- EMIT1(0x58); /* pop rax */
}
/* Insert IBHF instruction */
if ((cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP) &&
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-24 20:59 ` Borislav Petkov
@ 2026-03-24 22:13 ` Pawan Gupta
2026-03-25 20:37 ` Borislav Petkov
0 siblings, 1 reply; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 22:13 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Mar 24, 2026 at 09:59:30PM +0100, Borislav Petkov wrote:
> On Tue, Mar 24, 2026 at 11:16:51AM -0700, Pawan Gupta wrote:
> > As a mitigation for BHI, clear_bhb_loop() executes branches that overwrites
> > the Branch History Buffer (BHB). On Alder Lake and newer parts this
> > sequence is not sufficient because it doesn't clear enough entries. This
> > was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> > that mitigates BHI in kernel.
> >
> > BHI variant of VMSCAPE requires isolating branch history between guests and
> > userspace. Note that there is no equivalent hardware control for userspace.
> > To effectively isolate branch history on newer CPUs, clear_bhb_loop()
> > should execute sufficient number of branches to clear a larger BHB.
> >
> > Dynamically set the loop count of clear_bhb_loop() such that it is
> > effective on newer CPUs too. Use the hardware control enumeration
> > X86_FEATURE_BHI_CTRL to select the appropriate loop count.
> >
> > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > ---
> > arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> > arch/x86/net/bpf_jit_comp.c | 7 -------
> > 2 files changed, 16 insertions(+), 12 deletions(-)
>
> Ok, pls tell me why this below doesn't work?
>
> The additional indirection makes even the BHB loop code simpler.
>
> (I didn't pay too much attention to the labels, 2: is probably weird there).
>
> ---
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 3a180a36ca0e..95c7ed9afbbe 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1532,11 +1532,13 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> * Note, callers should use a speculation barrier like LFENCE immediately after
> * a call to this function to ensure BHB is cleared before indirect branches.
> */
> -SYM_FUNC_START(clear_bhb_loop)
> +SYM_FUNC_START(__clear_bhb_loop)
> ANNOTATE_NOENDBR
> push %rbp
> + /* BPF caller may require %rax to be preserved */
> + push %rax
> mov %rsp, %rbp
> - movl $5, %ecx
> +
> ANNOTATE_INTRA_FUNCTION_CALL
> call 1f
> jmp 5f
> @@ -1557,17 +1559,17 @@ SYM_FUNC_START(clear_bhb_loop)
> * but some Clang versions (e.g. 18) don't like this.
> */
> .skip 32 - 18, 0xcc
> -2: movl $5, %eax
> +2:
> 3: jmp 4f
> nop
> -4: sub $1, %eax
> +4: sub $1, %rsi
%rsi needs to be loaded again with $inner_loop_count once per outer loop
iteration. We probably need another register to hold that.
> jnz 3b
> - sub $1, %ecx
> + sub $1, %rdi
> jnz 1b
> .Lret2: RET
> 5:
> + pop %rax
> pop %rbp
> RET
> -SYM_FUNC_END(clear_bhb_loop)
> -EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop)
> -STACK_FRAME_NON_STANDARD(clear_bhb_loop)
> +SYM_FUNC_END(__clear_bhb_loop)
> +STACK_FRAME_NON_STANDARD(__clear_bhb_loop)
> diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> index 70b377fcbc1c..a9f406941e11 100644
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -390,6 +390,7 @@ extern void write_ibpb(void);
>
> #ifdef CONFIG_X86_64
> extern void clear_bhb_loop(void);
> +extern void __clear_bhb_loop(unsigned int a, unsigned int b);
> #endif
>
> extern void (*x86_return_thunk)(void);
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 83f51cab0b1e..c41b0548cf2a 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -3735,3 +3735,11 @@ void __warn_thunk(void)
> {
> WARN_ONCE(1, "Unpatched return thunk in use. This should not happen!\n");
> }
> +
> +void clear_bhb_loop(void)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL))
> + __clear_bhb_loop(12, 7);
> + else
> + __clear_bhb_loop(5, 5);
> +}
This is cleaner. A few things to consider: CLEAR_BRANCH_HISTORY, which
calls clear_bhb_loop(), would be calling into C code very early during
kernel entry. The code generated here may vary based on the compiler. Any
indirect branch here would be a security risk. This needs to be noinstr so
that it can't be hijacked by probes or ftrace.
At kernel entry, calling into C before mitigations are applied is risky.
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 63d6c9fa5e80..e2cceabb23e8 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1614,11 +1614,6 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
> u8 *func;
>
> if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) {
> - /* The clearing sequence clobbers eax and ecx. */
> - EMIT1(0x50); /* push rax */
> - EMIT1(0x51); /* push rcx */
> - ip += 2;
> -
> func = (u8 *)clear_bhb_loop;
Although the call to clear_bhb_loop() will be inserted at the end of the
BPF program before it returns, I am not sure it is safe to assume that
trashing registers in the path clear_bhb_loop() -> __clear_bhb_loop() is
okay, especially when we don't know what code the compiler generated for
clear_bhb_loop(). BPF experts would know better?
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-24 22:13 ` Pawan Gupta
@ 2026-03-25 20:37 ` Borislav Petkov
2026-03-25 22:40 ` David Laight
2026-03-26 8:39 ` Pawan Gupta
0 siblings, 2 replies; 31+ messages in thread
From: Borislav Petkov @ 2026-03-25 20:37 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Tue, Mar 24, 2026 at 03:13:08PM -0700, Pawan Gupta wrote:
> This is cleaner. A few things to consider: CLEAR_BRANCH_HISTORY, which
> calls clear_bhb_loop(), would be calling into C code very early during
> kernel entry. The code generated here may vary based on the compiler. Any
> indirect branch here would be a security risk. This needs to be noinstr so
> that it can't be hijacked by probes or ftrace.
>
> At kernel entry, calling into C before mitigations are applied is risky.
You can write the above function in asm if you prefer - should still be
easier.
> Although the call to clear_bhb_loop() will be inserted at the end of the
> BPF program before it returns, I am not sure it is safe to assume that
> trashing registers in the path clear_bhb_loop() -> __clear_bhb_loop() is
> okay, especially when we don't know what code the compiler generated for
> clear_bhb_loop(). BPF experts would know better?
The compiler would preserve the regs. If you write it in asm and you adhere to
the C ABI, you could preserve them too. Shouldn't be too many.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-25 20:37 ` Borislav Petkov
@ 2026-03-25 22:40 ` David Laight
2026-03-26 8:39 ` Pawan Gupta
1 sibling, 0 replies; 31+ messages in thread
From: David Laight @ 2026-03-25 22:40 UTC (permalink / raw)
To: Borislav Petkov
Cc: Pawan Gupta, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
Andy Lutomirski, Thomas Gleixner, Ingo Molnar, David Ahern,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, Stanislav Fomichev, Hao Luo, Paolo Bonzini,
Jonathan Corbet, linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf,
netdev, linux-doc
On Wed, 25 Mar 2026 21:37:59 +0100
Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Mar 24, 2026 at 03:13:08PM -0700, Pawan Gupta wrote:
...
> > Although call to clear_bhb_loop() will be inserted at the end of the BPF
> > program before it returns, I am not sure if it is safe to assume that
> > trashing registers in the path clear_bhb_loop() -> __clear_bhb_loop() is
> > okay? Especially, when we don't know what code compiler generated for
> > clear_bhb_loop(). BPF experts would know better?
>
> The compiler would preserve the regs. If you write it in asm and you adhere to
> the C ABI, you could preserve them too. Shouldn't be too many.
The BPF code that calls it doesn't use the C ABI - it just puts
a call instruction in the code it generates.
Hence all registers must be preserved.
David
>
> Thx.
>
>
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-25 20:37 ` Borislav Petkov
2026-03-25 22:40 ` David Laight
@ 2026-03-26 8:39 ` Pawan Gupta
2026-03-26 9:15 ` David Laight
2026-03-26 10:01 ` Borislav Petkov
1 sibling, 2 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-26 8:39 UTC (permalink / raw)
To: Borislav Petkov
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Wed, Mar 25, 2026 at 09:37:59PM +0100, Borislav Petkov wrote:
> On Tue, Mar 24, 2026 at 03:13:08PM -0700, Pawan Gupta wrote:
> > This is cleaner. A few things to consider are, CLEAR_BRANCH_HISTORY that
> > calls clear_bhb_loop() would be calling into C code very early during the
> > kernel entry. The code generated here may vary based on the compiler. Any
> > indirect branch here would be security risk. This needs to be noinstr so
> > that it can't be hijacked by probes and ftraces.
> >
> > At kernel entry, calling into C before mitigations are applied is risky.
>
> You can write the above function in asm if you prefer - should still be
> easier.
I believe the equivalent for cpu_feature_enabled() in asm is the
ALTERNATIVE. Please let me know if I am missing something.
Regarding your intent to move the loop count selection out of the BHB
sequence, below is what I could come up with. It is not as pretty as the C
version, but it tries to achieve something similar:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ecae3cef9d8c..54c65b0a3f65 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1494,6 +1494,20 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
SYM_CODE_END(rewind_stack_and_make_dead)
.popsection
+/*
+ * Between the long and short version of BHB clear sequence, just the
+ * loop count differs based on BHI_CTRL, see Intel's BHI guidance.
+ */
+#define BHB_SHORT_LOOP_OUTER 5
+#define BHB_SHORT_LOOP_INNER 5
+
+#define BHB_LONG_LOOP_OUTER 12
+#define BHB_LONG_LOOP_INNER 7
+
+#define BHB_MOVB(type, reg) \
+ ALTERNATIVE __stringify(movb $BHB_SHORT_LOOP_##type, reg), \
+ __stringify(movb $BHB_LONG_LOOP_##type, reg), X86_FEATURE_BHI_CTRL
+
/*
* This sequence executes branches in order to remove user branch information
* from the branch history tracker in the Branch Predictor, therefore removing
@@ -1540,12 +1554,7 @@ SYM_FUNC_START(clear_bhb_loop_nofence)
/* BPF caller may require all registers to be preserved */
push %rax
- /*
- * Between the long and short version of BHB clear sequence, just the
- * loop count differs based on BHI_CTRL, see Intel's BHI guidance.
- */
- ALTERNATIVE "movb $5, %al", \
- "movb $12, %al", X86_FEATURE_BHI_CTRL
+ BHB_MOVB(OUTER, %al)
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
@@ -1567,8 +1576,7 @@ SYM_FUNC_START(clear_bhb_loop_nofence)
* but some Clang versions (e.g. 18) don't like this.
*/
.skip 32 - 14, 0xcc
-2: ALTERNATIVE "movb $5, %ah", \
- "movb $7, %ah", X86_FEATURE_BHI_CTRL
+2: BHB_MOVB(INNER, %ah)
3: jmp 4f
nop
4: sub $1, %ah
Below is what the disassembly looks like:
clear_bhb_loop_nofence:
...
call 1f
jmp 5f
// BHB_MOVB(OUTER, %al)
mov $0x5,%al
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-26 8:39 ` Pawan Gupta
@ 2026-03-26 9:15 ` David Laight
2026-03-26 10:01 ` Borislav Petkov
1 sibling, 0 replies; 31+ messages in thread
From: David Laight @ 2026-03-26 9:15 UTC (permalink / raw)
To: Pawan Gupta
Cc: Borislav Petkov, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
Andy Lutomirski, Thomas Gleixner, Ingo Molnar, David Ahern,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, Stanislav Fomichev, Hao Luo, Paolo Bonzini,
Jonathan Corbet, linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf,
netdev, linux-doc
On Thu, 26 Mar 2026 01:39:34 -0700
Pawan Gupta <pawan.kumar.gupta@linux.intel.com> wrote:
> On Wed, Mar 25, 2026 at 09:37:59PM +0100, Borislav Petkov wrote:
> > On Tue, Mar 24, 2026 at 03:13:08PM -0700, Pawan Gupta wrote:
> > > This is cleaner. A few things to consider are, CLEAR_BRANCH_HISTORY that
> > > calls clear_bhb_loop() would be calling into C code very early during the
> > > kernel entry. The code generated here may vary based on the compiler. Any
> > > indirect branch here would be security risk. This needs to be noinstr so
> > > that it can't be hijacked by probes and ftraces.
> > >
> > > At kernel entry, calling into C before mitigations are applied is risky.
> >
> > You can write the above function in asm if you prefer - should still be
> > easier.
>
> I believe the equivalent for cpu_feature_enabled() in asm is the
> ALTERNATIVE. Please let me know if I am missing something.
>
> Regarding your intent to move the loop count selection out of the BHB
> sequence, below is what I could come up. It is not as pretty as the C
> version, but it is trying to achieve something similar:
I think that fails on being harder to read and longer.
So no real benefit.
I believe this code has to be asm because it is required to execute
specific instructions in a specific order - you can't trust the C
compiler to do that for you.
David
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index ecae3cef9d8c..54c65b0a3f65 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1494,6 +1494,20 @@ SYM_CODE_START_NOALIGN(rewind_stack_and_make_dead)
> SYM_CODE_END(rewind_stack_and_make_dead)
> .popsection
>
> +/*
> + * Between the long and short version of BHB clear sequence, just the
> + * loop count differs based on BHI_CTRL, see Intel's BHI guidance.
> + */
> +#define BHB_SHORT_LOOP_OUTER 5
> +#define BHB_SHORT_LOOP_INNER 5
> +
> +#define BHB_LONG_LOOP_OUTER 12
> +#define BHB_LONG_LOOP_INNER 7
> +
> +#define BHB_MOVB(type, reg) \
> + ALTERNATIVE __stringify(movb $BHB_SHORT_LOOP_##type, reg), \
> + __stringify(movb $BHB_LONG_LOOP_##type, reg), X86_FEATURE_BHI_CTRL
> +
> /*
> * This sequence executes branches in order to remove user branch information
> * from the branch history tracker in the Branch Predictor, therefore removing
> @@ -1540,12 +1554,7 @@ SYM_FUNC_START(clear_bhb_loop_nofence)
> /* BPF caller may require all registers to be preserved */
> push %rax
>
> - /*
> - * Between the long and short version of BHB clear sequence, just the
> - * loop count differs based on BHI_CTRL, see Intel's BHI guidance.
> - */
> - ALTERNATIVE "movb $5, %al", \
> - "movb $12, %al", X86_FEATURE_BHI_CTRL
> + BHB_MOVB(OUTER, %al)
>
> ANNOTATE_INTRA_FUNCTION_CALL
> call 1f
> @@ -1567,8 +1576,7 @@ SYM_FUNC_START(clear_bhb_loop_nofence)
> * but some Clang versions (e.g. 18) don't like this.
> */
> .skip 32 - 14, 0xcc
> -2: ALTERNATIVE "movb $5, %ah", \
> - "movb $7, %ah", X86_FEATURE_BHI_CTRL
> +2: BHB_MOVB(INNER, %ah)
> 3: jmp 4f
> nop
> 4: sub $1, %ah
>
>
> Below is how the disassembly looks like:
>
> clear_bhb_loop_nofence:
> ...
> call 1f
> jmp 5f
> // BHB_MOVB(OUTER, %al)
> mov $0x5,%al
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-26 8:39 ` Pawan Gupta
2026-03-26 9:15 ` David Laight
@ 2026-03-26 10:01 ` Borislav Petkov
2026-03-26 10:45 ` David Laight
1 sibling, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2026-03-26 10:01 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Dave Hansen, Peter Zijlstra,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, KP Singh,
Jiri Olsa, David S. Miller, David Laight, Andy Lutomirski,
Thomas Gleixner, Ingo Molnar, David Ahern, Martin KaFai Lau,
Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
Stanislav Fomichev, Hao Luo, Paolo Bonzini, Jonathan Corbet,
linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
On Thu, Mar 26, 2026 at 01:39:34AM -0700, Pawan Gupta wrote:
> I believe the equivalent for cpu_feature_enabled() in asm is the
> ALTERNATIVE. Please let me know if I am missing something.
Yes, you are.
The point is that you don't want to stick those alternative calls inside some
magic bhb_loop function but hand them in from the outside, as function
arguments.
Basically what I did.
Then you were worried about this being C code and it had to be noinstr... So
that outer function can be rewritten in asm, I think, and still keep it well
separate.
I'll try to rewrite it once I get a free minute, and see how it looks.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-26 10:01 ` Borislav Petkov
@ 2026-03-26 10:45 ` David Laight
2026-03-26 20:29 ` Pawan Gupta
0 siblings, 1 reply; 31+ messages in thread
From: David Laight @ 2026-03-26 10:45 UTC (permalink / raw)
To: Borislav Petkov
Cc: Pawan Gupta, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
Andy Lutomirski, Thomas Gleixner, Ingo Molnar, David Ahern,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, Stanislav Fomichev, Hao Luo, Paolo Bonzini,
Jonathan Corbet, linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf,
netdev, linux-doc
On Thu, 26 Mar 2026 11:01:20 +0100
Borislav Petkov <bp@alien8.de> wrote:
> On Thu, Mar 26, 2026 at 01:39:34AM -0700, Pawan Gupta wrote:
> > I believe the equivalent for cpu_feature_enabled() in asm is the
> > ALTERNATIVE. Please let me know if I am missing something.
>
> Yes, you are.
>
> The point is that you don't want to stick those alternative calls inside some
> magic bhb_loop function but hand them in from the outside, as function
> arguments.
>
> Basically what I did.
>
> Then you were worried about this being C code and it had to be noinstr... So
> that outer function can be rewritten in asm, I think, and still keep it well
> separate.
>
> I'll try to rewrite it once I get a free minute, and see how it looks.
>
I think someone tried getting C code to write the values to global data
and getting the asm to read them.
That got discounted because it split things between two largely unrelated files.
I think the BPF code would need significant refactoring to call a C function.
David
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-26 10:45 ` David Laight
@ 2026-03-26 20:29 ` Pawan Gupta
0 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-26 20:29 UTC (permalink / raw)
To: David Laight
Cc: Borislav Petkov, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
Andy Lutomirski, Thomas Gleixner, Ingo Molnar, David Ahern,
Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, Stanislav Fomichev, Hao Luo, Paolo Bonzini,
Jonathan Corbet, linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf,
netdev, linux-doc
On Thu, Mar 26, 2026 at 10:45:57AM +0000, David Laight wrote:
> On Thu, 26 Mar 2026 11:01:20 +0100
> Borislav Petkov <bp@alien8.de> wrote:
>
> > On Thu, Mar 26, 2026 at 01:39:34AM -0700, Pawan Gupta wrote:
> > > I believe the equivalent for cpu_feature_enabled() in asm is the
> > > ALTERNATIVE. Please let me know if I am missing something.
> >
> > Yes, you are.
> >
> > The point is that you don't want to stick those alternative calls inside some
> > magic bhb_loop function but hand them in from the outside, as function
> > arguments.
> >
> > Basically what I did.
> >
> > Then you were worried about this being C code and it had to be noinstr... So
> > that outer function can be rewritten in asm, I think, and still keep it well
> > separate.
> >
> > I'll try to rewrite it once I get a free minute, and see how it looks.
> >
>
> I think someone tried getting C code to write the values to global data
> and getting the asm to read them.
> That got discounted because it spilt things between two largely unrelated files.
The implementation with global variables wasn't that bad, let me revive it.
It does tie the sequence to the BHI mitigation, which is not ideal
(because VMSCAPE also uses it), but it does seem a cleaner option.
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2095,6 +2095,11 @@ static void __init bhi_select_mitigation(void)
static void __init bhi_update_mitigation(void)
{
+ if (!cpu_feature_enabled(X86_FEATURE_BHI_CTRL)) {
+ bhi_seq_outer_loop = 5;
+ bhi_seq_inner_loop = 5;
+ }
+
I believe this can be moved to somewhere common to all mitigations.
> I think the BPF code would need significant refactoring to call a C function.
Ya, true. Will use globals and keep clear_bhb_loop() in asm.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-24 18:16 ` [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
2026-03-24 20:59 ` Borislav Petkov
@ 2026-03-25 17:50 ` Jim Mattson
2026-03-25 18:44 ` Pawan Gupta
2026-03-25 19:41 ` David Laight
1 sibling, 2 replies; 31+ messages in thread
From: Jim Mattson @ 2026-03-25 17:50 UTC (permalink / raw)
To: Pawan Gupta
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet, linux-kernel, kvm, Asit Mallick,
Tao Zhang, bpf, netdev, linux-doc
On Tue, Mar 24, 2026 at 11:19 AM Pawan Gupta
<pawan.kumar.gupta@linux.intel.com> wrote:
>
> As a mitigation for BHI, clear_bhb_loop() executes branches that overwrites
> the Branch History Buffer (BHB). On Alder Lake and newer parts this
> sequence is not sufficient because it doesn't clear enough entries. This
> was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> that mitigates BHI in kernel.
>
> BHI variant of VMSCAPE requires isolating branch history between guests and
> userspace. Note that there is no equivalent hardware control for userspace.
> To effectively isolate branch history on newer CPUs, clear_bhb_loop()
> should execute sufficient number of branches to clear a larger BHB.
>
> Dynamically set the loop count of clear_bhb_loop() such that it is
> effective on newer CPUs too. Use the hardware control enumeration
> X86_FEATURE_BHI_CTRL to select the appropriate loop count.
>
> Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> ---
> arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> arch/x86/net/bpf_jit_comp.c | 7 -------
> 2 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 3a180a36ca0e..8128e00ca73f 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> SYM_FUNC_START(clear_bhb_loop)
> ANNOTATE_NOENDBR
> push %rbp
> + /* BPF caller may require %rax to be preserved */
> + push %rax
Shouldn't the "push %rax" come after "mov %rsp, %rbp"?
> mov %rsp, %rbp
> - movl $5, %ecx
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-25 17:50 ` Jim Mattson
@ 2026-03-25 18:44 ` Pawan Gupta
2026-03-25 19:41 ` David Laight
1 sibling, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-25 18:44 UTC (permalink / raw)
To: Jim Mattson
Cc: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet, linux-kernel, kvm, Asit Mallick,
Tao Zhang, bpf, netdev, linux-doc
On Wed, Mar 25, 2026 at 10:50:58AM -0700, Jim Mattson wrote:
> On Tue, Mar 24, 2026 at 11:19 AM Pawan Gupta
> <pawan.kumar.gupta@linux.intel.com> wrote:
> >
> > As a mitigation for BHI, clear_bhb_loop() executes branches that overwrites
> > the Branch History Buffer (BHB). On Alder Lake and newer parts this
> > sequence is not sufficient because it doesn't clear enough entries. This
> > was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> > that mitigates BHI in kernel.
> >
> > BHI variant of VMSCAPE requires isolating branch history between guests and
> > userspace. Note that there is no equivalent hardware control for userspace.
> > To effectively isolate branch history on newer CPUs, clear_bhb_loop()
> > should execute sufficient number of branches to clear a larger BHB.
> >
> > Dynamically set the loop count of clear_bhb_loop() such that it is
> > effective on newer CPUs too. Use the hardware control enumeration
> > X86_FEATURE_BHI_CTRL to select the appropriate loop count.
> >
> > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > ---
> > arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> > arch/x86/net/bpf_jit_comp.c | 7 -------
> > 2 files changed, 16 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 3a180a36ca0e..8128e00ca73f 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> > SYM_FUNC_START(clear_bhb_loop)
> > ANNOTATE_NOENDBR
> > push %rbp
> > + /* BPF caller may require %rax to be preserved */
> > + push %rax
>
> Shouldn't the "push %rax" come after "mov %rsp, %rbp"?
Right, thanks for catching that.
> > mov %rsp, %rbp
> > - movl $5, %ecx
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-25 17:50 ` Jim Mattson
2026-03-25 18:44 ` Pawan Gupta
@ 2026-03-25 19:41 ` David Laight
2026-03-25 22:29 ` Pawan Gupta
1 sibling, 1 reply; 31+ messages in thread
From: David Laight @ 2026-03-25 19:41 UTC (permalink / raw)
To: Jim Mattson
Cc: Pawan Gupta, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson,
Borislav Petkov, Dave Hansen, Peter Zijlstra, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, KP Singh, Jiri Olsa,
David S. Miller, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet, linux-kernel, kvm, Asit Mallick,
Tao Zhang, bpf, netdev, linux-doc
On Wed, 25 Mar 2026 10:50:58 -0700
Jim Mattson <jmattson@google.com> wrote:
> On Tue, Mar 24, 2026 at 11:19 AM Pawan Gupta
> <pawan.kumar.gupta@linux.intel.com> wrote:
> >
> > As a mitigation for BHI, clear_bhb_loop() executes branches that overwrites
> > the Branch History Buffer (BHB). On Alder Lake and newer parts this
> > sequence is not sufficient because it doesn't clear enough entries. This
> > was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> > that mitigates BHI in kernel.
> >
> > BHI variant of VMSCAPE requires isolating branch history between guests and
> > userspace. Note that there is no equivalent hardware control for userspace.
> > To effectively isolate branch history on newer CPUs, clear_bhb_loop()
> > should execute sufficient number of branches to clear a larger BHB.
> >
> > Dynamically set the loop count of clear_bhb_loop() such that it is
> > effective on newer CPUs too. Use the hardware control enumeration
> > X86_FEATURE_BHI_CTRL to select the appropriate loop count.
> >
> > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > ---
> > arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> > arch/x86/net/bpf_jit_comp.c | 7 -------
> > 2 files changed, 16 insertions(+), 12 deletions(-)
> >
> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > index 3a180a36ca0e..8128e00ca73f 100644
> > --- a/arch/x86/entry/entry_64.S
> > +++ b/arch/x86/entry/entry_64.S
> > @@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> > SYM_FUNC_START(clear_bhb_loop)
> > ANNOTATE_NOENDBR
> > push %rbp
> > + /* BPF caller may require %rax to be preserved */
Since you need a new version, change that to 'all registers preserved'.
> > + push %rax
>
> Shouldn't the "push %rax" come after "mov %rsp, %rbp"?
Or delete the stack frame :-)
It is only there for the stack trace-back code.
David
>
> > mov %rsp, %rbp
> > - movl $5, %ecx
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
2026-03-25 19:41 ` David Laight
@ 2026-03-25 22:29 ` Pawan Gupta
0 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-25 22:29 UTC (permalink / raw)
To: David Laight
Cc: Jim Mattson, x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin,
Josh Poimboeuf, David Kaplan, Sean Christopherson,
Borislav Petkov, Dave Hansen, Peter Zijlstra, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, KP Singh, Jiri Olsa,
David S. Miller, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet, linux-kernel, kvm, Asit Mallick,
Tao Zhang, bpf, netdev, linux-doc
On Wed, Mar 25, 2026 at 07:41:46PM +0000, David Laight wrote:
> On Wed, 25 Mar 2026 10:50:58 -0700
> Jim Mattson <jmattson@google.com> wrote:
>
> > On Tue, Mar 24, 2026 at 11:19 AM Pawan Gupta
> > <pawan.kumar.gupta@linux.intel.com> wrote:
> > >
> > > As a mitigation for BHI, clear_bhb_loop() executes branches that overwrites
> > > the Branch History Buffer (BHB). On Alder Lake and newer parts this
> > > sequence is not sufficient because it doesn't clear enough entries. This
> > > was not an issue because these CPUs have a hardware control (BHI_DIS_S)
> > > that mitigates BHI in kernel.
> > >
> > > BHI variant of VMSCAPE requires isolating branch history between guests and
> > > userspace. Note that there is no equivalent hardware control for userspace.
> > > To effectively isolate branch history on newer CPUs, clear_bhb_loop()
> > > should execute sufficient number of branches to clear a larger BHB.
> > >
> > > Dynamically set the loop count of clear_bhb_loop() such that it is
> > > effective on newer CPUs too. Use the hardware control enumeration
> > > X86_FEATURE_BHI_CTRL to select the appropriate loop count.
> > >
> > > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> > > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > > ---
> > > arch/x86/entry/entry_64.S | 21 ++++++++++++++++-----
> > > arch/x86/net/bpf_jit_comp.c | 7 -------
> > > 2 files changed, 16 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > > index 3a180a36ca0e..8128e00ca73f 100644
> > > --- a/arch/x86/entry/entry_64.S
> > > +++ b/arch/x86/entry/entry_64.S
> > > @@ -1535,8 +1535,17 @@ SYM_CODE_END(rewind_stack_and_make_dead)
> > > SYM_FUNC_START(clear_bhb_loop)
> > > ANNOTATE_NOENDBR
> > > push %rbp
> > > + /* BPF caller may require %rax to be preserved */
>
> Since you need a new version change that to 'all registers preserved'.
Ya, thats more accurate.
> > > + push %rax
> >
> > Shouldn't the "push %rax" come after "mov %rsp, %rbp"?
>
> Or delete the stack frame :-)
> It is only there for the stack trace-back code.
Hmm, let's keep the stack frame; it might help debugging.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v8 03/10] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence()
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
2026-03-24 18:16 ` [PATCH v8 01/10] x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop() Pawan Gupta
2026-03-24 18:16 ` [PATCH v8 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs Pawan Gupta
@ 2026-03-24 18:17 ` Pawan Gupta
2026-03-24 18:17 ` [PATCH v8 04/10] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user Pawan Gupta
` (6 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:17 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Rename clear_bhb_loop() to clear_bhb_loop_nofence() to reflect the recent
change that moved the LFENCE to the caller side.
Suggested-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 8 ++++----
arch/x86/include/asm/nospec-branch.h | 6 +++---
arch/x86/net/bpf_jit_comp.c | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 8128e00ca73f..e9b81b95fcc8 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1532,7 +1532,7 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* Note, callers should use a speculation barrier like LFENCE immediately after
* a call to this function to ensure BHB is cleared before indirect branches.
*/
-SYM_FUNC_START(clear_bhb_loop)
+SYM_FUNC_START(clear_bhb_loop_nofence)
ANNOTATE_NOENDBR
push %rbp
/* BPF caller may require %rax to be preserved */
@@ -1579,6 +1579,6 @@ SYM_FUNC_START(clear_bhb_loop)
pop %rax
pop %rbp
RET
-SYM_FUNC_END(clear_bhb_loop)
-EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop)
-STACK_FRAME_NON_STANDARD(clear_bhb_loop)
+SYM_FUNC_END(clear_bhb_loop_nofence)
+EXPORT_SYMBOL_FOR_KVM(clear_bhb_loop_nofence)
+STACK_FRAME_NON_STANDARD(clear_bhb_loop_nofence)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 70b377fcbc1c..0f5e6ed6c9c2 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -331,11 +331,11 @@
#ifdef CONFIG_X86_64
.macro CLEAR_BRANCH_HISTORY
- ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_LOOP
+ ALTERNATIVE "", "call clear_bhb_loop_nofence; lfence", X86_FEATURE_CLEAR_BHB_LOOP
.endm
.macro CLEAR_BRANCH_HISTORY_VMEXIT
- ALTERNATIVE "", "call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
+ ALTERNATIVE "", "call clear_bhb_loop_nofence; lfence", X86_FEATURE_CLEAR_BHB_VMEXIT
.endm
#else
#define CLEAR_BRANCH_HISTORY
@@ -389,7 +389,7 @@ extern void entry_untrain_ret(void);
extern void write_ibpb(void);
#ifdef CONFIG_X86_64
-extern void clear_bhb_loop(void);
+extern void clear_bhb_loop_nofence(void);
#endif
extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index e2cceabb23e8..b57e9ab51c5d 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1614,7 +1614,7 @@ static int emit_spectre_bhb_barrier(u8 **pprog, u8 *ip,
u8 *func;
if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_LOOP)) {
- func = (u8 *)clear_bhb_loop;
+ func = (u8 *)clear_bhb_loop_nofence;
ip += x86_call_depth_emit_accounting(&prog, func, ip);
if (emit_call(&prog, func, ip))
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v8 04/10] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (2 preceding siblings ...)
2026-03-24 18:17 ` [PATCH v8 03/10] x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence() Pawan Gupta
@ 2026-03-24 18:17 ` Pawan Gupta
2026-03-24 18:17 ` [PATCH v8 05/10] x86/vmscape: Move mitigation selection to a switch() Pawan Gupta
` (5 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:17 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
With the upcoming changes, x86_ibpb_exit_to_user will also be used when the
BHB clearing sequence is used. Rename it to cover both cases.
No functional change.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/include/asm/entry-common.h | 6 +++---
arch/x86/include/asm/nospec-branch.h | 2 +-
arch/x86/kernel/cpu/bugs.c | 4 ++--
arch/x86/kvm/x86.c | 2 +-
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index ce3eb6d5fdf9..c45858db16c9 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -94,11 +94,11 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
*/
choose_random_kstack_offset(rdtsc());
- /* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
+ /* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
- this_cpu_read(x86_ibpb_exit_to_user)) {
+ this_cpu_read(x86_predictor_flush_exit_to_user)) {
indirect_branch_prediction_barrier();
- this_cpu_write(x86_ibpb_exit_to_user, false);
+ this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 0f5e6ed6c9c2..0a55b1c64741 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -533,7 +533,7 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
: "memory");
}
-DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
+DECLARE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
static inline void indirect_branch_prediction_barrier(void)
{
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 83f51cab0b1e..47c020b80371 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -65,8 +65,8 @@ EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
* be needed to before running userspace. That IBPB will flush the branch
* predictor content.
*/
-DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
-EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);
+DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fd1c4a36b593..45d7cfedc507 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11464,7 +11464,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* may migrate to.
*/
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
- this_cpu_write(x86_ibpb_exit_to_user, true);
+ this_cpu_write(x86_predictor_flush_exit_to_user, true);
/*
* Consume any pending interrupts, including the possible source of
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread* [PATCH v8 05/10] x86/vmscape: Move mitigation selection to a switch()
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (3 preceding siblings ...)
2026-03-24 18:17 ` [PATCH v8 04/10] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user Pawan Gupta
@ 2026-03-24 18:17 ` Pawan Gupta
2026-03-24 18:17 ` [PATCH v8 06/10] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier() Pawan Gupta
` (4 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:17 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
This ensures that all mitigation modes are explicitly handled, while
keeping the mitigation selection for each mode together. This also prepares
for adding a BHB-clearing mitigation mode for VMSCAPE.
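The selection flow the patch below introduces can be modeled in userspace. This is a simplified sketch, not the kernel code: the names are hypothetical, and plain booleans stand in for boot_cpu_has_bug(), boot_cpu_has() and should_mitigate_vuln().

```c
#include <assert.h>
#include <stdbool.h>

enum vmscape_mit { MIT_NONE, MIT_AUTO, MIT_IBPB_EXIT_TO_USER };

/* Model of vmscape_select_mitigation() after the switch() rework */
static enum vmscape_mit
select_mitigation(enum vmscape_mit requested, bool has_bug, bool has_ibpb,
		  bool should_mitigate)
{
	enum vmscape_mit mit = requested;

	if (!has_bug)
		return MIT_NONE;

	/* Attack-vector controls can downgrade AUTO to NONE */
	if (mit == MIT_AUTO && !should_mitigate)
		mit = MIT_NONE;

	switch (mit) {
	case MIT_NONE:
		break;
	case MIT_IBPB_EXIT_TO_USER:
		if (!has_ibpb)	/* requested mode unsupported: fall back */
			mit = MIT_NONE;
		break;
	case MIT_AUTO:
		mit = has_ibpb ? MIT_IBPB_EXIT_TO_USER : MIT_NONE;
		break;
	}
	return mit;
}
```

Each mode resolves in exactly one case label, which is what makes adding a new mode (BHB clear, in a later patch) a local change.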
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/kernel/cpu/bugs.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47c020b80371..68e2df3e3bf5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3084,17 +3084,33 @@ early_param("vmscape", vmscape_parse_cmdline);
static void __init vmscape_select_mitigation(void)
{
- if (!boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
- !boot_cpu_has(X86_FEATURE_IBPB)) {
+ if (!boot_cpu_has_bug(X86_BUG_VMSCAPE)) {
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
return;
}
- if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) {
- if (should_mitigate_vuln(X86_BUG_VMSCAPE))
+ if ((vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) &&
+ !should_mitigate_vuln(X86_BUG_VMSCAPE))
+ vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+
+ switch (vmscape_mitigation) {
+ case VMSCAPE_MITIGATION_NONE:
+ break;
+
+ case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+ if (!boot_cpu_has(X86_FEATURE_IBPB))
+ vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+ break;
+
+ case VMSCAPE_MITIGATION_AUTO:
+ if (boot_cpu_has(X86_FEATURE_IBPB))
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
else
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+ break;
+
+ default:
+ break;
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread* [PATCH v8 06/10] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier()
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (4 preceding siblings ...)
2026-03-24 18:17 ` [PATCH v8 05/10] x86/vmscape: Move mitigation selection to a switch() Pawan Gupta
@ 2026-03-24 18:17 ` Pawan Gupta
2026-03-24 18:18 ` [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
` (3 subsequent siblings)
9 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:17 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
indirect_branch_prediction_barrier() is a wrapper around write_ibpb() that
also checks whether the CPU supports IBPB. For VMSCAPE, the call to
indirect_branch_prediction_barrier() is only reachable when the CPU
supports IBPB.
Simply call write_ibpb() directly to avoid unnecessary alternative
patching.
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/include/asm/entry-common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index c45858db16c9..78b143673ca7 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -97,7 +97,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
/* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
this_cpu_read(x86_predictor_flush_exit_to_user)) {
- indirect_branch_prediction_barrier();
+ write_ibpb();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread* [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (5 preceding siblings ...)
2026-03-24 18:17 ` [PATCH v8 06/10] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier() Pawan Gupta
@ 2026-03-24 18:18 ` Pawan Gupta
2026-03-24 19:09 ` bot+bpf-ci
2026-03-24 18:18 ` [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation Pawan Gupta
` (2 subsequent siblings)
9 siblings, 1 reply; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:18 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
Adding more mitigation options at exit-to-userspace for VMSCAPE would
usually require a series of checks to decide which mitigation to use. In
this case, the mitigation is done by calling a function that is decided at
boot, so adding more feature flags and runtime checks can be avoided by
using a static_call() to the mitigating function.
Replace the flag-based mitigation selector with a static_call(). This also
frees the existing X86_FEATURE_IBPB_EXIT_TO_USER.
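The static_call() pattern described above can be approximated in userspace with a plain function pointer. This is only a model with hypothetical names: in the kernel, static_call_cond() on a NULL target compiles to a no-op rather than a runtime NULL check, and static_call_update() patches the call site at boot.

```c
#include <assert.h>
#include <stdbool.h>

static int flush_count;
static void write_ibpb_model(void) { flush_count++; }

/* NULL target: mitigation off, models DEFINE_STATIC_CALL_NULL() */
static void (*vmscape_predictor_flush)(void);

static void static_call_update_model(void (*fn)(void))
{
	vmscape_predictor_flush = fn;
}

static bool pending_flush;	/* x86_predictor_flush_exit_to_user */

/* Models the exit-to-user path in arch_exit_to_user_mode_prepare() */
static void exit_to_user_model(void)
{
	if (pending_flush) {
		if (vmscape_predictor_flush)	/* static_call_cond() */
			vmscape_predictor_flush();
		pending_flush = false;
	}
}
```

With no target installed the exit path does nothing, so the separate X86_FEATURE_IBPB_EXIT_TO_USER gate becomes redundant.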
Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/include/asm/entry-common.h | 7 +++----
arch/x86/include/asm/nospec-branch.h | 3 +++
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/bugs.c | 14 +++++++++++++-
arch/x86/kvm/x86.c | 2 +-
7 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e2df1b147184..5b8def9ddb98 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2720,6 +2720,7 @@ config MITIGATION_TSA
config MITIGATION_VMSCAPE
bool "Mitigate VMSCAPE"
depends on KVM
+ depends on HAVE_STATIC_CALL
default y
help
Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dbe104df339b..b4d529dd6d30 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -503,7 +503,7 @@
#define X86_FEATURE_TSA_SQ_NO (21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
#define X86_FEATURE_TSA_L1_NO (21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
#define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
-#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
+/* Free */
#define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
#define X86_FEATURE_MSR_IMM (21*32+16) /* MSR immediate form instructions */
#define X86_FEATURE_SGX_EUPDATESVN (21*32+17) /* Support for ENCLS[EUPDATESVN] instruction */
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 78b143673ca7..783e7cb50cae 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -4,6 +4,7 @@
#include <linux/randomize_kstack.h>
#include <linux/user-return-notifier.h>
+#include <linux/static_call_types.h>
#include <asm/nospec-branch.h>
#include <asm/io_bitmap.h>
@@ -94,10 +95,8 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
*/
choose_random_kstack_offset(rdtsc());
- /* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
- if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
- this_cpu_read(x86_predictor_flush_exit_to_user)) {
- write_ibpb();
+ if (unlikely(this_cpu_read(x86_predictor_flush_exit_to_user))) {
+ static_call_cond(vmscape_predictor_flush)();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 0a55b1c64741..e45e49f1e0c9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -542,6 +542,9 @@ static inline void indirect_branch_prediction_barrier(void)
:: "rax", "rcx", "rdx", "memory");
}
+#include <linux/static_call_types.h>
+DECLARE_STATIC_CALL(vmscape_predictor_flush, write_ibpb);
+
/* The Intel SPEC CTRL MSR base value cache */
extern u64 x86_spec_ctrl_base;
DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a24c7805acdb..20ab4dd588c6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -753,6 +753,7 @@ enum mds_mitigations {
};
extern bool gds_ucode_mitigated(void);
+extern bool vmscape_mitigation_enabled(void);
/*
* Make previous memory operations globally visible before
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 68e2df3e3bf5..a7dee7ec6ea3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -144,6 +144,12 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
*/
DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+/*
+ * Controls how vmscape is mitigated e.g. via IBPB or BHB-clear
+ * sequence. This defaults to no mitigation.
+ */
+DEFINE_STATIC_CALL_NULL(vmscape_predictor_flush, write_ibpb);
+
#undef pr_fmt
#define pr_fmt(fmt) "mitigations: " fmt
@@ -3129,8 +3135,14 @@ static void __init vmscape_update_mitigation(void)
static void __init vmscape_apply_mitigation(void)
{
if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
- setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
+ static_call_update(vmscape_predictor_flush, write_ibpb);
+}
+
+bool vmscape_mitigation_enabled(void)
+{
+ return !!static_call_query(vmscape_predictor_flush);
}
+EXPORT_SYMBOL_FOR_KVM(vmscape_mitigation_enabled);
#undef pr_fmt
#define pr_fmt(fmt) fmt
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 45d7cfedc507..e204482e64f3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11463,7 +11463,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* set for the CPU that actually ran the guest, and not the CPU that it
* may migrate to.
*/
- if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
+ if (vmscape_mitigation_enabled())
this_cpu_write(x86_predictor_flush_exit_to_user, true);
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush
2026-03-24 18:18 ` [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
@ 2026-03-24 19:09 ` bot+bpf-ci
2026-03-24 19:51 ` Pawan Gupta
0 siblings, 1 reply; 31+ messages in thread
From: bot+bpf-ci @ 2026-03-24 19:09 UTC (permalink / raw)
To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, sdf, haoluo, pbonzini, corbet
Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1028 bytes --]
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index dbe104df339b8..b4d529dd6d305 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -503,7 +503,7 @@
> #define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
> -#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
> +/* Free */
> #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
Should the corresponding entry in
tools/arch/x86/include/asm/cpufeatures.h also be updated to match?
It still has:
#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14)
tools/perf/check-headers.sh checks this file for sync with the main
header.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/23506000396
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush
2026-03-24 19:09 ` bot+bpf-ci
@ 2026-03-24 19:51 ` Pawan Gupta
0 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 19:51 UTC (permalink / raw)
To: bot+bpf-ci
Cc: x86, jon, nik.borisov, hpa, jpoimboe, david.kaplan, seanjc, bp,
dave.hansen, peterz, ast, daniel, andrii, kpsingh, jolsa, davem,
david.laight.linux, luto, tglx, mingo, dsahern, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, sdf, haoluo,
pbonzini, corbet, linux-kernel, kvm, asit.k.mallick, tao1.zhang,
bpf, netdev, linux-doc, martin.lau, clm, ihor.solodrai
On Tue, Mar 24, 2026 at 07:09:31PM +0000, bot+bpf-ci@kernel.org wrote:
> > diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> > index dbe104df339b8..b4d529dd6d305 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -503,7 +503,7 @@
> > #define X86_FEATURE_CLEAR_CPU_BUF_VM (21*32+13) /* Clear CPU buffers using VERW before VMRUN */
> > -#define X86_FEATURE_IBPB_EXIT_TO_USER (21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
> > +/* Free */
> > #define X86_FEATURE_ABMC (21*32+15) /* Assignable Bandwidth Monitoring Counters */
>
> Should the corresponding entry in
> tools/arch/x86/include/asm/cpufeatures.h also be updated to match?
No, because:
"So its important not to touch the copies in tools/ when doing changes in
the original kernel headers, that will be done later, when
check-headers.sh inform about the change to the perf tools hackers."
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/include/uapi/README
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (6 preceding siblings ...)
2026-03-24 18:18 ` [PATCH v8 07/10] x86/vmscape: Use static_call() for predictor flush Pawan Gupta
@ 2026-03-24 18:18 ` Pawan Gupta
2026-03-24 19:09 ` bot+bpf-ci
2026-03-24 18:18 ` [PATCH v8 09/10] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force Pawan Gupta
2026-03-24 18:19 ` [PATCH v8 10/10] x86/vmscape: Add cmdline vmscape=on to override attack vector controls Pawan Gupta
9 siblings, 1 reply; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:18 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
The IBPB mitigation for VMSCAPE is overkill on CPUs that are only affected
by the BHI variant of VMSCAPE. On such CPUs, eIBRS already provides
indirect branch isolation between guest and host userspace. However, branch
history from the guest may still influence the indirect branches in host
userspace.
To mitigate the BHI aspect, use the BHB clearing sequence. Since IBPB is no
longer the only mitigation for VMSCAPE, update the documentation to reflect
that =auto can select either the IBPB or the BHB clearing mitigation based
on the CPU.
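The =auto choice between the two mitigations can be sketched as a small decision function. This is a hedged model, not the kernel code: the booleans stand in for the X86_FEATURE_BHI_CTRL and X86_FEATURE_IBPB checks and IS_ENABLED(CONFIG_X86_64).

```c
#include <assert.h>
#include <stdbool.h>

enum auto_mit { PICK_NONE, PICK_IBPB, PICK_BHB_CLEAR };

/* Model of the VMSCAPE_MITIGATION_AUTO case in the patch below */
static enum auto_mit pick_auto(bool has_bhi_ctrl, bool is_64bit,
			       bool has_ibpb)
{
	/*
	 * CPUs with BHI_CTRL are only affected by the BHI variant, so
	 * the cheaper BHB clearing sequence suffices. The sequence is
	 * only implemented for 64-bit kernels.
	 */
	if (has_bhi_ctrl && is_64bit)
		return PICK_BHB_CLEAR;
	if (has_ibpb)
		return PICK_IBPB;
	return PICK_NONE;
}
```

Note how a 32-bit kernel on a BHI_CTRL CPU still falls back to the full IBPB rather than going unmitigated.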
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
Documentation/admin-guide/hw-vuln/vmscape.rst | 11 ++++++++-
Documentation/admin-guide/kernel-parameters.txt | 4 +++-
arch/x86/include/asm/nospec-branch.h | 2 ++
arch/x86/kernel/cpu/bugs.c | 30 +++++++++++++++++++------
4 files changed, 38 insertions(+), 9 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index d9b9a2b6c114..7c40cf70ad7a 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -86,6 +86,10 @@ The possible values in this file are:
run a potentially malicious guest and issues an IBPB before the first
exit to userspace after VM-exit.
+ * 'Mitigation: Clear BHB before exit to userspace':
+
+ As above, conditional BHB clearing mitigation is enabled.
+
* 'Mitigation: IBPB on VMEXIT':
IBPB is issued on every VM-exit. This occurs when other mitigations like
@@ -102,9 +106,14 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
* ``vmscape=ibpb``:
- Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
+ Enable conditional IBPB mitigation.
* ``vmscape=force``:
Force vulnerability detection and mitigation even on processors that are
not known to be affected.
+
+ * ``vmscape=auto``:
+
+ Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
+ (default when CONFIG_MITIGATION_VMSCAPE=y)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 03a550630644..3853c7109419 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8378,9 +8378,11 @@ Kernel parameters
off - disable the mitigation
ibpb - use Indirect Branch Prediction Barrier
- (IBPB) mitigation (default)
+ (IBPB) mitigation
force - force vulnerability detection even on
unaffected processors
+ auto - (default) use IBPB or BHB clear
+ mitigation based on CPU
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index e45e49f1e0c9..7be812a73326 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -390,6 +390,8 @@ extern void write_ibpb(void);
#ifdef CONFIG_X86_64
extern void clear_bhb_loop_nofence(void);
+#else
+static inline void clear_bhb_loop_nofence(void) {}
#endif
extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index a7dee7ec6ea3..8cacd9474fdf 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -61,9 +61,8 @@ DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
/*
- * Set when the CPU has run a potentially malicious guest. An IBPB will
- * be needed to before running userspace. That IBPB will flush the branch
- * predictor content.
+ * Set when the CPU has run a potentially malicious guest. Indicates that a
+ * branch predictor flush is needed before running userspace.
*/
DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
@@ -3056,13 +3055,15 @@ enum vmscape_mitigations {
VMSCAPE_MITIGATION_AUTO,
VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
+ VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
};
static const char * const vmscape_strings[] = {
- [VMSCAPE_MITIGATION_NONE] = "Vulnerable",
+ [VMSCAPE_MITIGATION_NONE] = "Vulnerable",
/* [VMSCAPE_MITIGATION_AUTO] */
- [VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
- [VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
+ [VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
+ [VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
+ [VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER] = "Mitigation: Clear BHB before exit to userspace",
};
static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
@@ -3080,6 +3081,8 @@ static int __init vmscape_parse_cmdline(char *str)
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+ } else if (!strcmp(str, "auto")) {
+ vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
pr_err("Ignoring unknown vmscape=%s option.\n", str);
}
@@ -3109,7 +3112,17 @@ static void __init vmscape_select_mitigation(void)
break;
case VMSCAPE_MITIGATION_AUTO:
- if (boot_cpu_has(X86_FEATURE_IBPB))
+ /*
+ * CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
+ * BHB clear sequence. These CPUs are only vulnerable to the BHI
+ * variant of the VMSCAPE attack, and thus they do not require a
+ * full predictor flush.
+ *
+ * Note, in 32-bit mode BHB clear sequence is not supported.
+ */
+ if (boot_cpu_has(X86_FEATURE_BHI_CTRL) && IS_ENABLED(CONFIG_X86_64))
+ vmscape_mitigation = VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER;
+ else if (boot_cpu_has(X86_FEATURE_IBPB))
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
else
vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
@@ -3136,6 +3149,8 @@ static void __init vmscape_apply_mitigation(void)
{
if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
static_call_update(vmscape_predictor_flush, write_ibpb);
+ else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
+ static_call_update(vmscape_predictor_flush, clear_bhb_loop_nofence);
}
bool vmscape_mitigation_enabled(void)
@@ -3233,6 +3248,7 @@ void cpu_bugs_smt_update(void)
break;
case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+ case VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER:
/*
* Hypervisors can be attacked across-threads, warn for SMT when
* STIBP is not already enabled system-wide.
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation
2026-03-24 18:18 ` [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation Pawan Gupta
@ 2026-03-24 19:09 ` bot+bpf-ci
2026-03-24 19:46 ` Pawan Gupta
0 siblings, 1 reply; 31+ messages in thread
From: bot+bpf-ci @ 2026-03-24 19:09 UTC (permalink / raw)
To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, sdf, haoluo, pbonzini, corbet
Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 1836 bytes --]
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index a7dee7ec6ea39..8cacd9474fdf0 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
[ ... ]
> @@ -3136,6 +3149,8 @@ static void __init vmscape_apply_mitigation(void)
> {
> if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
> static_call_update(vmscape_predictor_flush, write_ibpb);
> + else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
> + static_call_update(vmscape_predictor_flush, clear_bhb_loop_nofence);
> }
When vmscape_predictor_flush is set to clear_bhb_loop_nofence, the
call site in arch_exit_to_user_mode_prepare() becomes:
static_call_cond(vmscape_predictor_flush)();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
The comment above clear_bhb_loop_nofence() in entry_64.S says:
Note, callers should use a speculation barrier like LFENCE
immediately after a call to this function to ensure BHB is
cleared before indirect branches.
Every other callsite follows this requirement, e.g. in
CLEAR_BRANCH_HISTORY and CLEAR_BRANCH_HISTORY_VMEXIT:
call clear_bhb_loop_nofence; lfence
The previous static_call target, write_ibpb(), uses wrmsr which is
serializing and inherently acts as a speculation barrier. But
clear_bhb_loop_nofence() is just a loop that fills the BHB -- it is
not serializing.
Without an lfence after the static_call, is there a speculative
window where indirect branches could execute using stale
(guest-controlled) branch history before the BHB clear takes
architectural effect?
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/23506000396
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation
2026-03-24 19:09 ` bot+bpf-ci
@ 2026-03-24 19:46 ` Pawan Gupta
0 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 19:46 UTC (permalink / raw)
To: bot+bpf-ci
Cc: x86, jon, nik.borisov, hpa, jpoimboe, david.kaplan, seanjc, bp,
dave.hansen, peterz, ast, daniel, andrii, kpsingh, jolsa, davem,
david.laight.linux, luto, tglx, mingo, dsahern, martin.lau,
eddyz87, song, yonghong.song, john.fastabend, sdf, haoluo,
pbonzini, corbet, linux-kernel, kvm, asit.k.mallick, tao1.zhang,
bpf, netdev, linux-doc, martin.lau, clm, ihor.solodrai
On Tue, Mar 24, 2026 at 07:09:33PM +0000, bot+bpf-ci@kernel.org wrote:
> > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > index a7dee7ec6ea39..8cacd9474fdf0 100644
> > --- a/arch/x86/kernel/cpu/bugs.c
> > +++ b/arch/x86/kernel/cpu/bugs.c
>
> [ ... ]
>
> > @@ -3136,6 +3149,8 @@ static void __init vmscape_apply_mitigation(void)
> > {
> > if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
> > static_call_update(vmscape_predictor_flush, write_ibpb);
> > + else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
> > + static_call_update(vmscape_predictor_flush, clear_bhb_loop_nofence);
> > }
>
> When vmscape_predictor_flush is set to clear_bhb_loop_nofence, the
> call site in arch_exit_to_user_mode_prepare() becomes:
>
> static_call_cond(vmscape_predictor_flush)();
> this_cpu_write(x86_predictor_flush_exit_to_user, false);
>
> The comment above clear_bhb_loop_nofence() in entry_64.S says:
>
> Note, callers should use a speculation barrier like LFENCE
> immediately after a call to this function to ensure BHB is
> cleared before indirect branches.
>
> Every other callsite follows this requirement, e.g. in
> CLEAR_BRANCH_HISTORY and CLEAR_BRANCH_HISTORY_VMEXIT:
>
> call clear_bhb_loop_nofence; lfence
>
> The previous static_call target, write_ibpb(), uses wrmsr which is
> serializing and inherently acts as a speculation barrier. But
> clear_bhb_loop_nofence() is just a loop that fills the BHB -- it is
> not serializing.
>
> Without an lfence after the static_call, is there a speculative
> window where indirect branches could execute using stale
> (guest-controlled) branch history before the BHB clear takes
> architectural effect?
VMSCAPE mitigation is for userspace, LFENCE is not required at exit-to-user
because ring transitions are serializing. Will add a comment.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v8 09/10] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (7 preceding siblings ...)
2026-03-24 18:18 ` [PATCH v8 08/10] x86/vmscape: Deploy BHB clearing mitigation Pawan Gupta
@ 2026-03-24 18:18 ` Pawan Gupta
2026-03-24 18:19 ` [PATCH v8 10/10] x86/vmscape: Add cmdline vmscape=on to override attack vector controls Pawan Gupta
9 siblings, 0 replies; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:18 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
The vmscape=force option currently defaults to the AUTO mitigation. This
lets attack-vector controls override the VMSCAPE mitigation, preventing the
user from actually forcing it.
When the VMSCAPE mitigation is forced, allow it to be deployed irrespective
of attack vectors. Introduce VMSCAPE_MITIGATION_ON, which wins over
attack-vector controls.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/kernel/cpu/bugs.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8cacd9474fdf..ba714f600249 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3053,6 +3053,7 @@ static void __init srso_apply_mitigation(void)
enum vmscape_mitigations {
VMSCAPE_MITIGATION_NONE,
VMSCAPE_MITIGATION_AUTO,
+ VMSCAPE_MITIGATION_ON,
VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
@@ -3061,6 +3062,7 @@ enum vmscape_mitigations {
static const char * const vmscape_strings[] = {
[VMSCAPE_MITIGATION_NONE] = "Vulnerable",
/* [VMSCAPE_MITIGATION_AUTO] */
+ /* [VMSCAPE_MITIGATION_ON] */
[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER] = "Mitigation: IBPB before exit to userspace",
[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT] = "Mitigation: IBPB on VMEXIT",
[VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER] = "Mitigation: Clear BHB before exit to userspace",
@@ -3080,7 +3082,7 @@ static int __init vmscape_parse_cmdline(char *str)
vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
- vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+ vmscape_mitigation = VMSCAPE_MITIGATION_ON;
} else if (!strcmp(str, "auto")) {
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
@@ -3112,6 +3114,7 @@ static void __init vmscape_select_mitigation(void)
break;
case VMSCAPE_MITIGATION_AUTO:
+ case VMSCAPE_MITIGATION_ON:
/*
* CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
* BHB clear sequence. These CPUs are only vulnerable to the BHI
@@ -3245,6 +3248,7 @@ void cpu_bugs_smt_update(void)
switch (vmscape_mitigation) {
case VMSCAPE_MITIGATION_NONE:
case VMSCAPE_MITIGATION_AUTO:
+ case VMSCAPE_MITIGATION_ON:
break;
case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v8 10/10] x86/vmscape: Add cmdline vmscape=on to override attack vector controls
2026-03-24 18:16 [PATCH v8 00/10] VMSCAPE optimization for BHI variant Pawan Gupta
` (8 preceding siblings ...)
2026-03-24 18:18 ` [PATCH v8 09/10] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force Pawan Gupta
@ 2026-03-24 18:19 ` Pawan Gupta
2026-03-24 19:09 ` bot+bpf-ci
9 siblings, 1 reply; 31+ messages in thread
From: Pawan Gupta @ 2026-03-24 18:19 UTC (permalink / raw)
To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
Paolo Bonzini, Jonathan Corbet
Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
linux-doc
In general, individual mitigation knobs override the attack-vector
controls. For VMSCAPE, =ibpb exists, but there is no option to select the
BHB clearing mitigation. The =force option would select BHB clearing when
supported, but with the side effect of also forcing the bug, hence
deploying the mitigation on unaffected parts too.
Add a new cmdline option vmscape=on to enable the mitigation based on the
VMSCAPE variant the CPU is affected by.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
Documentation/admin-guide/hw-vuln/vmscape.rst | 4 ++++
Documentation/admin-guide/kernel-parameters.txt | 2 ++
arch/x86/kernel/cpu/bugs.c | 2 ++
3 files changed, 8 insertions(+)
diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index 7c40cf70ad7a..a15d1bc91cce 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -117,3 +117,7 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
(default when CONFIG_MITIGATION_VMSCAPE=y)
+
+ * ``vmscape=on``:
+
+ Same as `auto`, except that it overrides attack vector controls.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3853c7109419..98204d464477 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8383,6 +8383,8 @@ Kernel parameters
unaffected processors
auto - (default) use IBPB or BHB clear
mitigation based on CPU
+ on - same as "auto", but override attack
+ vector control
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ba714f600249..84bf89ca278b 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3083,6 +3083,8 @@ static int __init vmscape_parse_cmdline(char *str)
} else if (!strcmp(str, "force")) {
setup_force_cpu_bug(X86_BUG_VMSCAPE);
vmscape_mitigation = VMSCAPE_MITIGATION_ON;
+ } else if (!strcmp(str, "on")) {
+ vmscape_mitigation = VMSCAPE_MITIGATION_ON;
} else if (!strcmp(str, "auto")) {
vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
} else {
--
2.34.1
^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v8 10/10] x86/vmscape: Add cmdline vmscape=on to override attack vector controls
2026-03-24 18:19 ` [PATCH v8 10/10] x86/vmscape: Add cmdline vmscape=on to override attack vector controls Pawan Gupta
@ 2026-03-24 19:09 ` bot+bpf-ci
0 siblings, 0 replies; 31+ messages in thread
From: bot+bpf-ci @ 2026-03-24 19:09 UTC (permalink / raw)
To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
john.fastabend, sdf, haoluo, pbonzini, corbet
Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
[-- Attachment #1: Type: text/plain, Size: 892 bytes --]
> diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
> --- a/Documentation/admin-guide/hw-vuln/vmscape.rst
> +++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
> @@ -117,3 +117,7 @@
>
> Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
> (default when CONFIG_MITIGATION_VMSCAPE=y)
> +
> + * ``vmscape=on``:
> +
> + Same as `auto`, except that it overrides attack vector controls.
Should `auto` use double backticks here (``auto``) to match the rest of
the file? All other inline code references in this document use RST
literal markup with double backticks.
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/23506000396
^ permalink raw reply [flat|nested] 31+ messages in thread