From: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
To: Jim Mattson <jmattson@google.com>
Cc: x86@kernel.org, Jon Kohler <jon@nutanix.com>,
Nikolay Borisov <nik.borisov@suse.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Josh Poimboeuf <jpoimboe@kernel.org>,
David Kaplan <david.kaplan@amd.com>,
Sean Christopherson <seanjc@google.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
KP Singh <kpsingh@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
David Laight <david.laight.linux@gmail.com>,
Andy Lutomirski <luto@kernel.org>,
Thomas Gleixner <tglx@kernel.org>, Ingo Molnar <mingo@redhat.com>,
David Ahern <dsahern@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Asit Mallick <asit.k.mallick@intel.com>,
Tao Zhang <tao1.zhang@intel.com>,
bpf@vger.kernel.org, netdev@vger.kernel.org,
linux-doc@vger.kernel.org, chao.gao@intel.com
Subject: Re: [PATCH v9 02/10] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
Date: Fri, 3 Apr 2026 16:16:08 -0700
Message-ID: <20260403231608.zopnhnypdclzqlx7@desk>
In-Reply-To: <CALMp9eSXfJvR=PHtttbqm3q3nH436T1eH4YdpVqxQeP-cxEPsA@mail.gmail.com>
On Fri, Apr 03, 2026 at 02:59:33PM -0700, Jim Mattson wrote:
> On Fri, Apr 3, 2026 at 2:34 PM Pawan Gupta
> <pawan.kumar.gupta@linux.intel.com> wrote:
> >
> > On Fri, Apr 03, 2026 at 01:19:17PM -0700, Jim Mattson wrote:
> > > On Fri, Apr 3, 2026 at 11:52 AM Pawan Gupta
> > > <pawan.kumar.gupta@linux.intel.com> wrote:
> > > >
> > > > On Fri, Apr 03, 2026 at 11:10:08AM -0700, Jim Mattson wrote:
> > > > > On Thu, Apr 2, 2026 at 5:32 PM Pawan Gupta
> > > > > <pawan.kumar.gupta@linux.intel.com> wrote:
> > > > > >
> > > > > > As a mitigation for BHI, clear_bhb_loop() executes branches that overwrite
> > > > > > the Branch History Buffer (BHB). On Alder Lake and newer parts this
> > > > > > sequence is not sufficient because it doesn't clear enough entries. This
> > > > > > was not an issue because these CPUs use the BHI_DIS_S hardware mitigation
> > > > > > in the kernel.
> > > > > >
> > > > > > Now with VMSCAPE (BHI variant) it is also required to isolate branch
> > > > > > history between guests and userspace. Since BHI_DIS_S only protects the
> > > > > > kernel, the newer CPUs also use IBPB.
> > > > > >
> > > > > > A cheaper alternative to the current IBPB mitigation is clear_bhb_loop().
> > > > > > But it currently does not clear enough BHB entries to be effective on newer
> > > > > > CPUs with larger BHB. At boot, dynamically set the loop count of
> > > > > > clear_bhb_loop() such that it is effective on newer CPUs too. Use the
> > > > > > X86_FEATURE_BHI_CTRL feature flag to select the appropriate loop count.
> > > > > >
> > > > > > Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
> > > > > > Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > > > > > ---
> > > > > > arch/x86/entry/entry_64.S | 8 +++++---
> > > > > > arch/x86/include/asm/nospec-branch.h | 2 ++
> > > > > > arch/x86/kernel/cpu/bugs.c | 13 +++++++++++++
> > > > > > 3 files changed, 20 insertions(+), 3 deletions(-)
> > > > > >
> > > > > > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> > > > > > index 3a180a36ca0e..bbd4b1c7ec04 100644
> > > > > > --- a/arch/x86/entry/entry_64.S
> > > > > > +++ b/arch/x86/entry/entry_64.S
> > > > > > @@ -1536,7 +1536,9 @@ SYM_FUNC_START(clear_bhb_loop)
> > > > > > ANNOTATE_NOENDBR
> > > > > > push %rbp
> > > > > > mov %rsp, %rbp
> > > > > > - movl $5, %ecx
> > > > > > +
> > > > > > + movzbl bhb_seq_outer_loop(%rip), %ecx
> > > > > > +
> > > > > > ANNOTATE_INTRA_FUNCTION_CALL
> > > > > > call 1f
> > > > > > jmp 5f
> > > > > > @@ -1556,8 +1558,8 @@ SYM_FUNC_START(clear_bhb_loop)
> > > > > > * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
> > > > > > * but some Clang versions (e.g. 18) don't like this.
> > > > > > */
> > > > > > - .skip 32 - 18, 0xcc
> > > > > > -2: movl $5, %eax
> > > > > > + .skip 32 - 20, 0xcc
> > > > > > +2: movzbl bhb_seq_inner_loop(%rip), %eax
> > > > > > 3: jmp 4f
> > > > > > nop
> > > > > > 4: sub $1, %eax
> > > > > > diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
> > > > > > index 70b377fcbc1c..87b83ae7c97f 100644
> > > > > > --- a/arch/x86/include/asm/nospec-branch.h
> > > > > > +++ b/arch/x86/include/asm/nospec-branch.h
> > > > > > @@ -548,6 +548,8 @@ DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
> > > > > > extern void update_spec_ctrl_cond(u64 val);
> > > > > > extern u64 spec_ctrl_current(void);
> > > > > >
> > > > > > +extern u8 bhb_seq_inner_loop, bhb_seq_outer_loop;
> > > > > > +
> > > > > > /*
> > > > > > * With retpoline, we must use IBRS to restrict branch prediction
> > > > > > * before calling into firmware.
> > > > > > diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> > > > > > index 83f51cab0b1e..2cb4a96247d8 100644
> > > > > > --- a/arch/x86/kernel/cpu/bugs.c
> > > > > > +++ b/arch/x86/kernel/cpu/bugs.c
> > > > > > @@ -2047,6 +2047,10 @@ enum bhi_mitigations {
> > > > > > static enum bhi_mitigations bhi_mitigation __ro_after_init =
> > > > > > IS_ENABLED(CONFIG_MITIGATION_SPECTRE_BHI) ? BHI_MITIGATION_AUTO : BHI_MITIGATION_OFF;
> > > > > >
> > > > > > +/* Default to short BHB sequence values */
> > > > > > +u8 bhb_seq_outer_loop __ro_after_init = 5;
> > > > > > +u8 bhb_seq_inner_loop __ro_after_init = 5;
> > > > > > +
> > > > > > static int __init spectre_bhi_parse_cmdline(char *str)
> > > > > > {
> > > > > > if (!str)
> > > > > > @@ -3242,6 +3246,15 @@ void __init cpu_select_mitigations(void)
> > > > > > x86_spec_ctrl_base &= ~SPEC_CTRL_MITIGATIONS_MASK;
> > > > > > }
> > > > > >
> > > > > > + /*
> > > > > > + * Switch to long BHB clear sequence on newer CPUs (with BHI_CTRL
> > > > > > + * support), see Intel's BHI guidance.
> > > > > > + */
> > > > > > + if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL)) {
> > > > > > + bhb_seq_outer_loop = 12;
> > > > > > + bhb_seq_inner_loop = 7;
> > > > > > + }
> > > > > > +
> > > > >
> > > > > How does this work for VMs in a heterogeneous migration pool that
> > > > > spans the Alder Lake boundary? They can't advertise BHI_CTRL, because
> > > > > it isn't available on all hosts in the migration pool, but they need
> > > > > the long sequence when running on Alder Lake or newer.
> > > >
> > > > As we discussed elsewhere, support for migration pool is much more
> > > > involved. It should be dealt in a separate QEMU/KVM focused series.
> > > >
> > > > A quickfix could be adding support for spectre_bhi=long that guests in a
> > > > migration pool can use?
> > >
> > > The simplest solution is to add "|
> > > cpu_feature_enabled(X86_FEATURE_HYPERVISOR)" to the condition above.
> > > If that is unacceptable for the performance of pre-Alder Lake
> >
> > Yes, that would be unnecessary overhead.
> >
> > > migration pools, you could define a CPUID or MSR bit that says
> > > explicitly, "long BHB flush sequence needed," rather than trying to
> > > intuit that property from the presence of BHI_CTRL. Like
> > > IA32_ARCH_CAPABILITIES.SKIP_L1DFL_VMENTRY, the bit would only be set
> > > by a hypervisor.
> >
> > I will think about this more.
> >
> > > I am still skeptical of the need for MSR_VIRTUAL_ENUMERATION and
> > > friends, unless there is a major guest OS out there that relies on
> > > them.
> >
> > If we forget about MSR_VIRTUAL_ENUMERATION for a moment, userspace VMM is
> > in the best position to decide whether a guest needs
> > virtual.SPEC_CTRL[BHI_DIS_S]. Via a KVM interface userspace VMM can get
> > BHI_DIS_S for the guests that are in migration pool?
>
> That is not possible today, since KVM does not implement Intel's
> IA32_SPEC_CTRL virtualization, and cedes the hardware IA32_SPEC_CTRL
> to the guest after the first non-zero write to the guest's MSR.
Yes, KVM doesn't support it yet. But adding that support would give the
userspace VMM more control, which helps in this case and probably in many
others in the future.

I will check with Chao whether he can prepare the next version of the
virtual SPEC_CTRL series (leaving out the virtual mitigation MSRs).
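
For readers following the thread, the loop-count selection being debated can
be modeled in plain userspace C. This is only a sketch under stated
assumptions, not kernel code: struct bhb_seq, bhb_seq_select(), and the
long_for_guests knob are hypothetical names introduced here. has_bhi_ctrl
stands in for cpu_feature_enabled(X86_FEATURE_BHI_CTRL) from the quoted patch,
and running_as_guest for the cpu_feature_enabled(X86_FEATURE_HYPERVISOR) check
suggested in the review. The counts (5/5 and 12/7) are the ones in the quoted
diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container; the quoted patch uses two separate u8 globals. */
struct bhb_seq {
	uint8_t outer;	/* models bhb_seq_outer_loop */
	uint8_t inner;	/* models bhb_seq_inner_loop */
};

/*
 * Pick the BHB clearing loop counts.
 *
 * has_bhi_ctrl:     models cpu_feature_enabled(X86_FEATURE_BHI_CTRL)
 * running_as_guest: models cpu_feature_enabled(X86_FEATURE_HYPERVISOR)
 * long_for_guests:  hypothetical policy knob, true means "any guest gets the
 *                   long sequence", as proposed in the review
 */
static struct bhb_seq bhb_seq_select(bool has_bhi_ctrl, bool running_as_guest,
				     bool long_for_guests)
{
	/* Default to the short sequence, as in the quoted patch. */
	struct bhb_seq seq = { .outer = 5, .inner = 5 };

	if (has_bhi_ctrl || (long_for_guests && running_as_guest)) {
		seq.outer = 12;
		seq.inner = 7;
	}

	return seq;
}

/* Total inner-loop iterations for a given pair of counts. */
static unsigned int bhb_seq_inner_iterations(struct bhb_seq seq)
{
	return (unsigned int)seq.outer * seq.inner;
}
```

With these counts the short sequence runs 25 inner-loop iterations and the
long one 84, which is the extra cost a pre-Alder Lake migration pool would pay
if every guest were switched to the long sequence.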