Linux Documentation

Linux Documentation
 help / color / mirror / Atom feed

* [PATCH] kernel-doc: xforms: support __SYSFS_FUNCTION_ALTERNATIVE()
From: Randy Dunlap @ 2026-06-23 19:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Randy Dunlap, Thomas Weißschuh, Rafael J. Wysocki,
	Danilo Krummrich, driver-core, Greg Kroah-Hartman,
	Jonathan Corbet, Shuah Khan, linux-doc, Mauro Carvalho Chehab,
	stable

Add support for __SYSFS_FUNCTION_ALTERNATIVE() to create a union of its
members (as though CONFIG_CFI is unset).

Fixes these docs build warnings:

WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show)(struct device *dev, struct device_attribute *attr, char *buf)
WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*show' not described in 'device_attribute'
WARNING: include/linux/device.h:117 Invalid param: __SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store)(struct device *dev, struct device_attribute *attr, const char *buf, size_t count)
WARNING: include/linux/device.h:117 struct member '__SYSFS_FUNCTION_ALTERNATIVE( ssize_t (*store' not described in 'device_attribute'

Fixes: 434506b86a6c ("driver core: Allow the constification of device attributes")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
---
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: driver-core@lists.linux.dev
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: stable@vger.kernel.org

 tools/lib/python/kdoc/xforms_lists.py |    1 +
 1 file changed, 1 insertion(+)

--- linux-next-20260619.orig/tools/lib/python/kdoc/xforms_lists.py
+++ linux-next-20260619/tools/lib/python/kdoc/xforms_lists.py
@@ -49,6 +49,7 @@ class CTransforms:
         (CMatch("DEFINE_DMA_UNMAP_ADDR"), r"dma_addr_t \1"),
         (CMatch("DEFINE_DMA_UNMAP_LEN"), r"__u32 \1"),
         (CMatch("VIRTIO_DECLARE_FEATURES"), r"union { u64 \1; u64 \1_array[VIRTIO_FEATURES_U64S]; }"),
+        (CMatch("__SYSFS_FUNCTION_ALTERNATIVE"), r"union { \1+ }"),
         (CMatch("__attribute__"), ""),
 
         #

^ permalink raw reply

* Re: [PATCH 1/2] dt-bindings: hwmon: chipcap2: Add label property
From: Guenter Roeck @ 2026-06-23 18:58 UTC (permalink / raw)
  To: Flaviu Nistor, Javier Carrasco
  Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet,
	Shuah Khan, linux-hwmon, linux-kernel, devicetree, linux-doc
In-Reply-To: <20260623181625.5697-1-flaviu.nistor@gmail.com>

On 6/23/26 11:16, Flaviu Nistor wrote:
> On Mon Jun 22, 2026 at 7:29 PM CEST, Javier Carrasco wrote:
>> On Mon Jun 22, 2026 at 2:21 PM CEST, Flaviu Nistor wrote:
>>> Add support for an optional label property similar to other hwmon devices
>>> This allows, in case of boards with multiple CHIPCAP2 sensors, to assign
>>> distinct names to each instance.
>>>
>>> Signed-off-by: Flaviu Nistor <flaviu.nistor@gmail.com>
>>> ---
>>>   .../devicetree/bindings/hwmon/amphenol,chipcap2.yaml         | 5 +++++
>>>   1 file changed, 5 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.ya=
>> ml b/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>>> index 17351fdbefce..f00b5a4b14dd 100644
>>> --- a/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>>> +++ b/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>>> @@ -33,6 +33,10 @@ properties:
>>>     reg:
>>>       maxItems: 1
>>>
>>> +  label:
>>> +    description:
>>> +      A descriptive name for this channel, like "ambient" or "psu".
>>> +
>>>     interrupts:
>>>       items:
>>>         - description: measurement ready indicator
>>> @@ -72,6 +76,7 @@ examples:
>>>                            <5 IRQ_TYPE_EDGE_RISING>,
>>>                            <6 IRQ_TYPE_EDGE_RISING>;
>>>               interrupt-names =3D "ready", "low", "high";
>>> +            label =3D "somelabel";
>>>               vdd-supply =3D <&reg_vdd>;
>>>           };
>>       };
>>
>> Hello Falviu, thank you for your patch.
>>
> 
> Hello Javier, thanks for your reply.
> 
>> Should we not add a reference to hwmon-common.yaml (with
>> unevelautedProperties instead of additionalProperties), as label is
>> defined there? I believe that Krzysztof Kozlowski did something similar
>> for the shunt-resistor-micro-ohms property. Could we follow suit here?
>>
> 
> This is a good question and I am happy you asked. I also thought a lot
> about this and the reason I decided to go for this approach is that by using
> $ref: hwmon-common.yaml#, I would have to change additionalProperties: false
> to unevaluatedProperties: false, which will evaluate in case it is used, also
> shunt-resistor-micro-ohms property which does not apply to this sensor. At
> least this is my understanding, but of course I can be wrong (I see lm75 binding
> also uses $ref: hwmon-common.yaml# but shunt-resistor-micro-ohms does not apply).
> 

Where does the idea come from that shunt-resistor-micro-ohms would be mandatory ?
That would make hwmon-common.yaml unusable for most chips.

Guenter

>> I am also not a big fan of a name like "somelabel", and a more
>> meaningful name from a "real" example would look better. I know that
>> some examples have already used "somelabel" as an example, but others
>> have used more meaningful names too.
>>
> 
> I will have to send a v2 since for the label property description I used
> "channel" instead of "sensor" (detected by Sashiko AI review), so I can
> use in the example section a more meaningful name like "Room" if no other
> suggestion.
> 
>> Best regards,
>> Javier Carrasco
> 
> Best regards,
> Flaviu Nistor
> 


^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Document the deprecation of AF_ALG
From: Linus Torvalds @ 2026-06-23 18:56 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Bastien Nocera, linux-crypto, Herbert Xu, Marcel Holtmann,
	Luiz Augusto von Dentz, linux-doc, linux-api, linux-kernel,
	netdev, linux-bluetooth, ell
In-Reply-To: <20260623164932.GA1793@sol>

On Tue, 23 Jun 2026 at 09:51, Eric Biggers <ebiggers@kernel.org> wrote:
>
> We're aware of that and are taking it into account in the allowlist:

Note that if we can  just unconditionally make it depend on
CAP_NET_ADMIN, that would be good - independently of any allowlist.

Because if iwd and abluetoothd are the main two users, and both of
those already require CAP_NET_ADMIN anyway...

                Linus

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Luiz Augusto von Dentz @ 2026-06-23 18:51 UTC (permalink / raw)
  To: Eric Biggers
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <20260623180502.GC1850517@google.com>

Hi Eric,

On Tue, Jun 23, 2026 at 2:05 PM Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Tue, Jun 23, 2026 at 09:52:08AM -0700, Eric Biggers wrote:
> > On Tue, Jun 23, 2026 at 11:04:14AM -0400, Luiz Augusto von Dentz wrote:
> > > > +===  ==================================================================
> > > > +0    AF_ALG is unrestricted.
> > > > +
> > > > +1    AF_ALG is supported with a limited list of algorithms. The list
> > > > +     is designed for compatibility with known users such as iwd and
> > > > +     bluez that haven't yet been fixed to use userspace crypto code.
> > >
> > > Is the expectation that we go shopping for userspace crypto here?
> >
> > Yes, same as what 99% of userspace already does.  Probably you'll just
> > want to link to OpenSSL, but it could be something else if you want.
> >
> > - Eric
>
> By the way you do know that bluez already has a local implementation of
> ECDH, right?  See src/shared/ecc.c.

It's never been audited; it's only used for hardware emulation, I
didn't even remember we had that thingy. What we really use is
src/shared/crypto.c, and I'm not looking forward to having it changed.
With something like Zephyr, changing crypto libraries every so often
just because one didn't fit on a platform wasn't a great experience,
and that is a much bigger project. In the end it seems they are using
a forked mbedtls:

https://github.com/zephyrproject-rtos/mbedtls

I'm quite sure whatever choice we make will be the wrong choice for
someone. Then someone will have the brilliant idea to add some sort of
backend support to let everyone plug in their preferred crypto
library, possibly adding even more code to audit.

> - Eric

-- 
Luiz Augusto von Dentz

^ permalink raw reply

* Re: [PATCH v2] usbcore: Add quirk for 255-bytes initial config read
From: Randy Dunlap @ 2026-06-23 18:35 UTC (permalink / raw)
  To: Nikhil Solanke, linux-usb
  Cc: gregkh, linux-kernel, stern, michal.pecio, stable, corbet, skhan,
	linux-doc
In-Reply-To: <20260623161035.5792-1-nikhilsolanke5@gmail.com>



On 6/23/26 9:10 AM, Nikhil Solanke wrote:
> Certain third-party USB game controllers exposing (or spoofing) an Xbox
> 360-compatible interface (VID:PID 045e:028e) fail to enumerate under Linux.
> The device disconnects from the bus without responding to the initial
> GET_DESCRIPTOR(CONFIGURATION) request, and the kernel logs 'unable to read
> config index 0 descriptor/start: -71'.
> 
> The device then falls back to a secondary Android HID mode (with a
> different VID:PID), losing XInput functionality including rumble support.
> The failure reproduces across multiple machines, host controller types, and
> kernel versions including current mainline and LTS. The device enumerates
> correctly and remains in XInput mode under Windows. Notably, the device
> enumerates correctly in Android mode when the same 9-byte request
> is issued for that mode's configuration descriptor, confirming the firmware
> bug is specific to the XInput mode.
> 
> usbmon traces from Linux and Wireshark/USBPcap traces from Windows are
> identical up to the point of failure, with no visible protocol-level
> difference explaining the divergence. The root cause was identified when
> Michal Pecio discovered via a QEMU bus-level capture that Windows does not
> use wLength=9 for the initial config descriptor request; it uses
> wLength=255. Alan Stern subsequently confirmed this with a bus
> analyzer on a different USB 2.0 device, and Michal verified the behavior
> goes back to Windows 95 OSR2.1.
> 
> So, add a new quirk flag USB_QUIRK_CONFIG_SIZE which causes
> usb_get_configuration() to issue a 255 byte sized configuration request
> instead of USB_DT_CONFIG_SIZE (9) for the initial
> GET_DESCRIPTOR(CONFIGURATION) request, mimicking long-standing Windows
> behavior.
> 
> Suggested-by: Alan Stern <stern@rowland.harvard.edu>
> Suggested-by: Michal Pecio <michal.pecio@gmail.com>
> Closes: https://lore.kernel.org/linux-usb/CAFgddh+JWdT4LLwMc5qjM8q_pBu-fRo2qADR5ovAKoGHWMQrRw@mail.gmail.com/
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: stable@vger.kernel.org
> 
> Signed-off-by: Nikhil Solanke <nikhilsolanke5@gmail.com>
> ---
> Changes in v2:
> - Add Documentation
> - Naming changes
> - Refactored to have a better flow with existing code.
> 
>  .../admin-guide/kernel-parameters.txt         |  9 +++
>  drivers/usb/core/config.c                     | 61 ++++++++++++++-----
>  drivers/usb/core/hub.c                        |  6 +-
>  drivers/usb/core/quirks.c                     |  4 ++
>  include/linux/usb/quirks.h                    |  3 +
>  5 files changed, 67 insertions(+), 16 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 97007f4f69d4..af4bf0ef2c7b 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -8158,6 +8158,15 @@ Kernel parameters
>  				q = USB_QUIRK_FORCE_ONE_CONFIG (Device
>  					claims zero configurations,
>  					forcing to 1);
> +                r = USB_QUIRK_WINDOWS_CONFIG_REQ_SIZE (Device
> +                    fails during initialization when asked for
> +                    9-bytes configuration desciptor request. Ask

		                             descriptor

> +                    for 255-bytes request instead to mirror
> +                    Windows' behavior. This quirk is originally
> +                    meant to fix some quirky gamepads that refuse
> +                    to connect in their XInput mode. But it can also
> +                    potentially fix issues with other USB devices
> +                    that work on Windows but not on Linux)

add ending '.'

For all lines added here, use tabs instead of spaces for indentation.


>  			Example: quirks=0781:5580:bk,0a5c:5834:gij
>  
>  	usbhid.mousepoll=


-- 
~Randy


^ permalink raw reply

* Re: [PATCH v7 03/10] tracing/probes: Support dumping fetcharg program for debugging dynamic events
From: Julian Braha @ 2026-06-23 18:31 UTC (permalink / raw)
  To: Masami Hiramatsu (Google), Steven Rostedt, Mathieu Desnoyers
  Cc: Jonathan Corbet, Shuah Khan, linux-kernel, linux-trace-kernel,
	linux-doc, linux-kselftest
In-Reply-To: <178217907822.643090.14693478306190628970.stgit@devnote2>

Hi Masami,

On 6/23/26 02:44, Masami Hiramatsu (Google) wrote:

> +config PROBE_EVENTS_DUMP_FETCHARG
> +	depends on PROBE_EVENTS
> +	bool "Dump of dynamic probe event fetch-arguments"
> +	default n

Sorry, kconfig nitpick: could you match the style used by the rest of
the config options in this file? E.g. the type and prompt come first in
the list of attributes?

- Julian Braha

^ permalink raw reply

* Re: [PATCH v4 1/5] mm/zswap: Extend shrink_memcg() writeback capability
From: Yosry Ahmed @ 2026-06-23 18:17 UTC (permalink / raw)
  To: Hao Jia
  Cc: akpm, tj, hannes, shakeel.butt, mhocko, mkoutny, nphamcs,
	chengming.zhou, muchun.song, roman.gushchin, linux-mm,
	linux-kernel, linux-doc, Hao Jia
In-Reply-To: <d0f05c35-457a-4b2c-6faa-7a83d4bdec01@gmail.com>

> My initial thought was that if cold memory is evenly distributed across
> nodes and we are doing a large writeback, it would be better to balance
> the zswap entry writeback across all nodes rather than just draining
> node 0 first. However, since we currently lack a proper metric to
> represent hot/cold memory (such as age-based tracking), doing this
> probably doesn't make much sense right now.

Yeah let's start simple and go from there.

>
> So, perhaps we want something like this? Please correct me if I'm wrong.
>
> static long shrink_memcg(struct mem_cgroup *memcg,
>         unsigned long nr_to_scan)
> {
>    struct zswap_shrink_walk_arg walk_arg = {
>      .bytes_written = 0,
>      .encountered_page_in_swapcache = false,
>    };
>    unsigned long nr_remaining = nr_to_scan;
>    bool memcg_list_is_empty = true;
>    int nid;
>
>    if (!mem_cgroup_zswap_writeback_enabled(memcg))
>      return -ENOENT;
>
>    if (memcg && !mem_cgroup_online(memcg))
>      return -ENOENT;
>
>    for_each_node_state(nid, N_NORMAL_MEMORY) {
>      unsigned long nr_to_walk;
>
>      /*
>       * Cap the per-node scan by the current LRU length. A referenced
>       * entry is only rotated to the tail (second chance) and may be
>       * revisited within a single walk; without this cap those rotated
>       * entries could drain the shared scan budget on one node.
>       */

The comment here is a bit misleading. It's not just about draining one
node. One call to shrink_memcg() should only scan entries once. The
caller can then choose to scan the memcg again, or scan a different
one. In this case, the caller should iterate all memcgs first before
retrying memcgs again and reclaiming rotated entries.

>      nr_to_walk = min(nr_remaining,
>           list_lru_count_one(&zswap_list_lru, nid, memcg));
>      if (!nr_to_walk)
>        continue;
>      memcg_list_is_empty = false;
>
>      nr_remaining -= nr_to_walk;
>      list_lru_walk_one(&zswap_list_lru, nid, memcg,
>            &shrink_memcg_cb, &walk_arg, &nr_to_walk);
>      /* Return the unused share of the budget to the pool. */
>      nr_remaining += nr_to_walk;
>
>      /* Bail out once the whole scan budget has been spent. */

The comment is unnecessary.

>      if (!nr_remaining)
>        break;
>
>      cond_resched();

Did you observe a problem here or did you just add this due to an
abundance of caution?

>    }
>
>    if (memcg_list_is_empty)

Do we need memcg_list_is_empty? Can we just check if nr_remaining
matches nr_to_scan?

>      return -ENOENT;
>
>    return walk_arg.bytes_written;
> }
>
> Thanks,
> Hao

^ permalink raw reply

* Re:[PATCH 1/2] dt-bindings: hwmon: chipcap2: Add label property
From: Flaviu Nistor @ 2026-06-23 18:16 UTC (permalink / raw)
  To: Javier Carrasco
  Cc: Guenter Roeck, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Jonathan Corbet, Shuah Khan, Flaviu Nistor, linux-hwmon,
	linux-kernel, devicetree, linux-doc
In-Reply-To: <DJFPYCV2FXW7.1BFG9DURPZRCC@gmail.com>

On Mon Jun 22, 2026 at 7:29 PM CEST, Javier Carrasco wrote:
>On Mon Jun 22, 2026 at 2:21 PM CEST, Flaviu Nistor wrote:
>> Add support for an optional label property similar to other hwmon devices
>> This allows, in case of boards with multiple CHIPCAP2 sensors, to assign
>> distinct names to each instance.
>>
>> Signed-off-by: Flaviu Nistor <flaviu.nistor@gmail.com>
>> ---
>>  .../devicetree/bindings/hwmon/amphenol,chipcap2.yaml         | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.ya=
>ml b/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>> index 17351fdbefce..f00b5a4b14dd 100644
>> --- a/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>> +++ b/Documentation/devicetree/bindings/hwmon/amphenol,chipcap2.yaml
>> @@ -33,6 +33,10 @@ properties:
>>    reg:
>>      maxItems: 1
>>
>> +  label:
>> +    description:
>> +      A descriptive name for this channel, like "ambient" or "psu".
>> +
>>    interrupts:
>>      items:
>>        - description: measurement ready indicator
>> @@ -72,6 +76,7 @@ examples:
>>                           <5 IRQ_TYPE_EDGE_RISING>,
>>                           <6 IRQ_TYPE_EDGE_RISING>;
>>              interrupt-names =3D "ready", "low", "high";
>> +            label =3D "somelabel";
>>              vdd-supply =3D <&reg_vdd>;
>>          };
>      };
>
>Hello Falviu, thank you for your patch.
>

Hello Javier, thanks for your reply.

>Should we not add a reference to hwmon-common.yaml (with
>unevelautedProperties instead of additionalProperties), as label is
>defined there? I believe that Krzysztof Kozlowski did something similar
>for the shunt-resistor-micro-ohms property. Could we follow suit here?
>

This is a good question and I am happy you asked. I also thought a lot
about this and the reason I decided to go for this approach is that by using
$ref: hwmon-common.yaml#, I would have to change additionalProperties: false
to unevaluatedProperties: false, which will evaluate in case it is used, also
shunt-resistor-micro-ohms property which does not apply to this sensor. At
least this is my understanding, but of course I can be wrong (I see lm75 binding
also uses $ref: hwmon-common.yaml# but shunt-resistor-micro-ohms does not apply). 

>I am also not a big fan of a name like "somelabel", and a more
>meaningful name from a "real" example would look better. I know that
>some examples have already used "somelabel" as an example, but others
>have used more meaningful names too.
>

I will have to send a v2 since for the label property description I used
"channel" instead of "sensor" (detected by Sashiko AI review), so I can
use in the example section a more meaningful name like "Room" if no other
suggestion.

>Best regards,
>Javier Carrasco

Best regards,
Flaviu Nistor

^ permalink raw reply

* Re: [PATCH v12 11/12] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
From: bot+bpf-ci @ 2026-06-23 18:13 UTC (permalink / raw)
  To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
	david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
	andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
	mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, sdf, haoluo, pbonzini, corbet, jbaron, aliceryhl,
	rostedt, ardb, skhan
  Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
	linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260622-vmscape-bhb-v12-11-76cbda0ae3e5@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 1456 bytes --]

> x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
>
> vmscape=force option currently defaults to AUTO mitigation. This lets
> attack-vector controls to override the vmscape mitigation. Preventing the
> user from being able to force VMSCAPE mitigation.
>
> When vmscape mitigation is forced, allow it be deployed irrespective of
> attack vectors. Introduce VMSCAPE_MITIGATION_ON that wins over
> attack-vector controls.
>
> Tested-by: Jon Kohler <jon@nutanix.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

This appears to fix a behavioral bug where vmscape=force mapped to
VMSCAPE_MITIGATION_AUTO, which is gated by should_mitigate_vuln(), so
attack-vector controls could override a user-forced mitigation.

Should this carry a Fixes: tag?

The force -> VMSCAPE_MITIGATION_AUTO mapping that allowed the override
predates this series. The earlier refactor 68d94c61c09a ("x86/vmscape:
Move mitigation selection to a switch()") only moved the logic into a
switch() and did not introduce the conflict, so it does not look like the
right Fixes: target. A precise commit to point at could not be confirmed
from this tree.

---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/28046221590

^ permalink raw reply

* Re: [PATCH v12 02/12] x86/bhi: Make clear_bhb_loop() effective on newer CPUs
From: bot+bpf-ci @ 2026-06-23 18:13 UTC (permalink / raw)
  To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
	david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
	andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
	mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, sdf, haoluo, pbonzini, corbet, jbaron, aliceryhl,
	rostedt, ardb, skhan
  Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
	linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260622-vmscape-bhb-v12-2-76cbda0ae3e5@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 3513 bytes --]

> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 83f51cab0b1e..2cb4a96247d8 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c

[ ... ]

> @@ -3242,6 +3246,15 @@ void __init cpu_select_mitigations(void)
>  		x86_spec_ctrl_base &= ~SPEC_CTRL_MITIGATIONS_MASK;
>  	}
>
> +	/*
> +	 * Switch to long BHB clear sequence on newer CPUs (with BHI_CTRL
> +	 * support), see Intel's BHI guidance.
> +	 */
> +	if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL)) {
> +		bhb_seq_outer_loop = 12;
> +		bhb_seq_inner_loop = 7;
> +	}

Should this also consider the case of a guest running under a hypervisor
that does not expose BHI_CTRL?

When BHI_CTRL is absent the loop counts keep their short defaults (5/5)
and the guest declares itself BHI-mitigated.  In a heterogeneous
migration pool a guest may present as Ice Lake (no BHI_CTRL) while
actually running on Alder Lake or newer hardware, where the short
sequence is not sufficient.

A default-safe approach would be to also select the long sequence when
running under a hypervisor without BHI_CTRL, for example:

	if (cpu_feature_enabled(X86_FEATURE_BHI_CTRL) ||
	    cpu_feature_enabled(X86_FEATURE_HYPERVISOR)) {
		bhb_seq_outer_loop = 12;
		bhb_seq_inner_loop = 7;
	}

There is precedent for assuming the worst under a hypervisor, such as
vulnerable_to_its(), cpu_set_bug_bits() and check_null_seg_clears_base().

This came up during review and the resolution was to let the series move
forward and revisit the hypervisor case afterwards.  Is that still the
intent, or should the hypervisor fallback land with this patch?


For the missing Fixes: tag, quoting the full commit message:

commit a2a0812335ee19ae1df81f88e736da5210748c2b
Author: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

>     x86/bhi: Make clear_bhb_loop() effective on newer CPUs
>
>     As a mitigation for BHI, clear_bhb_loop() executes branches that overwrite
>     the Branch History Buffer (BHB). On Alder Lake and newer parts this
>     sequence is not sufficient because it doesn't clear enough entries. This
>     was not an issue because these CPUs use the BHI_DIS_S hardware mitigation
>     in the kernel.
>     Now with VMSCAPE (BHI variant) it is also required to isolate branch
>     history between guests and userspace. Since BHI_DIS_S only protects the
>     kernel, the newer CPUs also use IBPB.
>     A cheaper alternative to the current IBPB mitigation is clear_bhb_loop().
>     But it currently does not clear enough BHB entries to be effective on newer
>     CPUs with larger BHB. At boot, dynamically set the loop count of
>     clear_bhb_loop() such that it is effective on newer CPUs too.
>     Introduce global loop counts, initializing them with appropriate value
>     based on the hardware feature X86_FEATURE_BHI_CTRL.
>     Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
>     Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

The short BHB clear sequence with the hardcoded loop count of 5 was
added by clear_bhb_loop() in entry_64.S, which is the code this patch
makes configurable.  Should this carry:

  Fixes: 7390db8aea0d ("x86/bhi: Add support for clearing branch history at syscall entry")


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/28046221590

^ permalink raw reply

* Re: [PATCH v12 08/12] KVM: Define EXPORT_STATIC_CALL_FOR_KVM()
From: bot+bpf-ci @ 2026-06-23 18:13 UTC (permalink / raw)
  To: pawan.kumar.gupta, x86, jon, nik.borisov, hpa, jpoimboe,
	david.kaplan, seanjc, bp, dave.hansen, peterz, ast, daniel,
	andrii, kpsingh, jolsa, davem, david.laight.linux, luto, tglx,
	mingo, dsahern, martin.lau, eddyz87, song, yonghong.song,
	john.fastabend, sdf, haoluo, pbonzini, corbet, jbaron, aliceryhl,
	rostedt, ardb, skhan
  Cc: linux-kernel, kvm, asit.k.mallick, tao1.zhang, bpf, netdev,
	linux-doc, ast, andrii, daniel, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260622-vmscape-bhb-v12-8-76cbda0ae3e5@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 2597 bytes --]

> EXPORT_SYMBOL_FOR_KVM() exists to export symbols to KVM modules. Static
> calls need the same treatment when the core kernel defines a static_call
> that KVM needs access to (e.g. from a VM-exit path).
>
> Define EXPORT_STATIC_CALL_FOR_KVM() as the static_call analogue of
> EXPORT_SYMBOL_FOR_KVM(). The same three-way logic applies:
>
>   - KVM_SUB_MODULES defined: export to "kvm," plus all sub-modules
>   - KVM=m, no sub-modules: export to "kvm" only
>   - KVM built-in: no export needed (noop)
>
>   As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override both
>   macros (e.g. to suppress the export when kvm.ko itself will not be
>   built despite CONFIG_KVM=m). Add the x86 no-op overrides in
>   arch/x86/include/asm/kvm_types.h for that case. To keep the pair in
>   sync, EXPORT_STATIC_CALL_FOR_KVM() is defined inside the
>   EXPORT_SYMBOL_FOR_KVM #ifndef block; an arch that defines
>   EXPORT_SYMBOL_FOR_KVM must also define EXPORT_STATIC_CALL_FOR_KVM or the
>   build will fail with a compile-time error.
>
> As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override
> EXPORT_STATIC_CALL_FOR_KVM definition (e.g. to suppress the export when
> kvm.ko itself will not be built despite CONFIG_KVM=m). Add the x86 no-op
> override in arch/x86/include/asm/kvm_types.h for that case.
>
> Architectures must also define EXPORT_STATIC_CALL_FOR_KVM when they define
> EXPORT_SYMBOL_FOR_KVM.

This isn't a bug, but the arch-override rationale appears twice in nearly
identical form.  The third paragraph (starting "As with
EXPORT_SYMBOL_FOR_KVM(), allow architectures to override both macros") and
the fourth paragraph (starting "As with EXPORT_SYMBOL_FOR_KVM(), allow
architectures to override EXPORT_STATIC_CALL_FOR_KVM definition") both
explain suppressing the export when kvm.ko will not be built and both
mention adding the x86 no-op override.

The requirement that an arch defining EXPORT_SYMBOL_FOR_KVM must also
define EXPORT_STATIC_CALL_FOR_KVM also shows up twice: once inside the
third paragraph ("an arch that defines EXPORT_SYMBOL_FOR_KVM must also
define EXPORT_STATIC_CALL_FOR_KVM or the build will fail") and again as the
standalone final sentence ("Architectures must also define
EXPORT_STATIC_CALL_FOR_KVM when they define EXPORT_SYMBOL_FOR_KVM").

Could the duplicate paragraphs be folded into a single statement?


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/28046221590

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Eric Biggers @ 2026-06-23 18:05 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Demi Marie Obenour, Andy Lutomirski
In-Reply-To: <20260623165208.GB1793@sol>

On Tue, Jun 23, 2026 at 09:52:08AM -0700, Eric Biggers wrote:
> On Tue, Jun 23, 2026 at 11:04:14AM -0400, Luiz Augusto von Dentz wrote:
> > > +===  ==================================================================
> > > +0    AF_ALG is unrestricted.
> > > +
> > > +1    AF_ALG is supported with a limited list of algorithms. The list
> > > +     is designed for compatibility with known users such as iwd and
> > > +     bluez that haven't yet been fixed to use userspace crypto code.
> > 
> > Is the expectation that we go shopping for userspace crypto here?
> 
> Yes, same as what 99% of userspace already does.  Probably you'll just
> want to link to OpenSSL, but it could be something else if you want.
> 
> - Eric

By the way you do know that bluez already has a local implementation of
ECDH, right?  See src/shared/ecc.c.

- Eric

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Eric Biggers @ 2026-06-23 17:58 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: linux-crypto, Herbert Xu, linux-kernel, linux-doc,
	linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Andy Lutomirski
In-Reply-To: <34e20a64-17d3-4994-89f1-6084cd4fae36@gmail.com>

On Tue, Jun 23, 2026 at 01:24:01PM -0400, Demi Marie Obenour wrote:
> I think there is room for something in-between the allowlist provided
> here and "no restrictions".  For instance, I think it makes sense
> to have a mode that allows modern¸ widely-used algorithms (AES-GCM,
> ChaCha20-Poly1305, SHA-3, HMAC, etc) to all users.
> 
> This makes it less likely someone turns off all restrictions.

Would it be worth the complexity when AF_ALG as a whole is deprecated?

> XFRM allows providing an arbitrary algorithm name, and it appears to
> be accessible in unprivileged user namespaces.  That also needs an
> allowlist.

Doesn't XFRM have a fixed list already?  The algorithm comes in as a
string, but it has to be in the list.

> > +	{ "hmac(sha512)", true }, /* iwd, sha512hmac */
> 
> Should this entry have privileged = false?  sha512hmac doesn't
> need privileges.

It's for the FIPS 140-3 integrity check use case, which runs as root.

> > diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
> > index 4dfe7899f8fa..bd522915d56d 100644
> > --- a/crypto/algif_rng.c
> > +++ b/crypto/algif_rng.c
> > @@ -48,10 +48,14 @@
> >  
> >  MODULE_LICENSE("GPL");
> >  MODULE_AUTHOR("Stephan Mueller <smueller@chronox.de>");
> >  MODULE_DESCRIPTION("User-space interface for random number generators");
> >  
> > +static const struct af_alg_allowlist_entry rng_allowlist[] = {
> > +	{},
> > +};
> 
> Can this whole file be deleted?  You wrote that it isn't actually used.

That's not quite what I said.  There's still one known use case: CAVP
testing for FIPS 140-3 certifications.  But that only accesses the drbg
and jitterentropy RNGs and can run as root.

We could consider deleting it anyway and require that the FIPS people
use custom kernel modules.  But that is out of the scope of this patch.

> > +static const struct af_alg_allowlist_entry skcipher_allowlist[] = {
> > +	{ "adiantum(xchacha12,aes)", false }, /* cryptsetup */
> > +	{ "adiantum(xchacha20,aes)", false }, /* cryptsetup */
> > +	{ "cbc(aes)", true }, /* iwd */
> > +	{ "cbc(des)", true }, /* iwd */
> > +	{ "cbc(des3_ede)", true }, /* iwd */
> > +	{ "ctr(aes)", true }, /* iwd */
> > +	{ "ecb(aes)", true }, /* iwd, bluez */
> > +	{ "ecb(des)", true }, /* iwd */
> > +	{ "hctr2(aes)", false }, /* cryptsetup */
> > +	{ "xts(aes)", false }, /* cryptsetup benchmark */
> > +	{},
> > +};
> 
> Do the cryptsetup ones really need to be accessible to unprivileged users?

If we don't allow them, people running 'cryptsetup benchmark' will
notice the change.  Formatting could also be affected.

Requiring sudo in these cases might be feasible, but we need to strike
the right balance.  Maybe Milan has an opinion.

- Eric

^ permalink raw reply

* Re: [PATCH] crypto: af_alg - Add af_alg_restrict sysctl, defaulting to 1
From: Eric Biggers @ 2026-06-23 17:45 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: Luiz Augusto von Dentz, linux-crypto, Herbert Xu, linux-kernel,
	linux-doc, linux-bluetooth, iwd, linux-hardening, Milan Broz,
	Andy Lutomirski
In-Reply-To: <01b73988-b35c-4ce8-8463-4589fe18b0b0@gmail.com>

On Tue, Jun 23, 2026 at 01:29:20PM -0400, Demi Marie Obenour wrote:
> On 6/23/26 12:52, Eric Biggers wrote:
> > On Tue, Jun 23, 2026 at 11:04:14AM -0400, Luiz Augusto von Dentz wrote:
> >>> +===  ==================================================================
> >>> +0    AF_ALG is unrestricted.
> >>> +
> >>> +1    AF_ALG is supported with a limited list of algorithms. The list
> >>> +     is designed for compatibility with known users such as iwd and
> >>> +     bluez that haven't yet been fixed to use userspace crypto code.
> >>
> >> Is the expectation that we go shopping for userspace crypto here?
> > 
> > Yes, same as what 99% of userspace already does.  Probably you'll just
> > want to link to OpenSSL, but it could be something else if you want.
> 
> Hard disagree on OpenSSL.  It's not a good library.
> 
> See <https://cryptography.io/en/latest/statements/state-of-openssl/>.
> 
> Distributions should ship AWS-LC and either rebuild reverse
> dependencies when needed, or work with upstream to catch ABI breaks.

I don't like OpenSSL either, but it's the de facto standard on most
distros.  While perhaps distros should make that switch, there's no need
to wait for that to move away from AF_ALG.

- Eric

^ permalink raw reply

* Re: [PATCH v3 1/8] hwmon/misc: amd-sbi: Move core sbtsi support from hwmon to misc
From: Julian Braha @ 2026-06-23 17:45 UTC (permalink / raw)
  To: Akshay Gupta, linux-doc, linux-kernel, linux-hwmon
  Cc: corbet, skhan, linux, arnd, gregkh, NaveenKrishna.Chatradhi,
	Anand.Umarji, Prathima.Lk
In-Reply-To: <20260622135821.2190260-2-Akshay.Gupta@amd.com>

Hi Akshay,

On 6/22/26 14:58, Akshay Gupta wrote:

> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> index e4c4f2b09732..8f204cf49b6e 100644
> --- a/drivers/hwmon/Kconfig
> +++ b/drivers/hwmon/Kconfig
> @@ -1963,7 +1963,7 @@ config SENSORS_SL28CPLD
>  
>  config SENSORS_SBTSI
>  	tristate "Emulated SB-TSI temperature sensor"
> -	depends on I2C
> +	select AMD_SBTSI
>  	help
>  	  If you say yes here you get support for emulated temperature
>  	  sensors on AMD SoCs with SB-TSI interface connected to a BMC device.

> diff --git a/drivers/misc/amd-sbi/Kconfig b/drivers/misc/amd-sbi/Kconfig
> index 30e7fad7356c..512251690e0e 100644
> --- a/drivers/misc/amd-sbi/Kconfig
> +++ b/drivers/misc/amd-sbi/Kconfig
> @@ -20,3 +20,16 @@ config AMD_SBRMI_HWMON
>  	  This provides support for RMI device hardware monitoring. If enabled,
>  	  a hardware monitoring device will be created for each socket in
>  	  the system.
> +
> +config AMD_SBTSI
> +	tristate "AMD side band TSI support"
> +	depends on I2C
> +	depends on ARM || ARM64 || COMPILE_TEST
> +	select AUXILIARY_BUS
> +	help
> +	  Enables support for the AMD SB-TSI (Side Band Temperature Sensor
> +	  Interface) driver, which provides access to emulated CPU temperature
> +	  sensors on AMD SoCs via an I2C connected BMC device.
> +
> +	  This driver can also be built as a module. If so, the module will
> +	  be called sbtsi.

Your kconfig changes introduce an unmet dependency bug. When I enable
SENSORS_SBTSI without enabling COMPILE_TEST on x86, I get this:

WARNING: unmet direct dependencies detected for AMD_SBTSI
  Depends on [n]: I2C [=y] && (ARM || ARM64 || COMPILE_TEST [=n])
  Selected by [y]:
  - SENSORS_SBTSI [=y] && HWMON [=y]

- Julian Braha


^ permalink raw reply

* Re: [PATCH 2/3] dt-bindings: hwmon: pmbus: Support for onsemi's FD5121
From: Conor Dooley @ 2026-06-23 17:36 UTC (permalink / raw)
  To: Selvamani.Rajagopal
  Cc: Guenter Roeck, Jonathan Corbet, Shuah Khan, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, linux-hwmon, linux-doc,
	linux-kernel, devicetree
In-Reply-To: <20260622-support-fd5121-from-onsemi-v1-2-b31767689c65@onsemi.com>

[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]

On Mon, Jun 22, 2026 at 10:55:38PM -0700, Selvamani Rajagopal via B4 Relay wrote:
> From: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
> 
> Add devicetree schema for onsemi FD5121, FD5123, and
> FD5125 dual rail, multi-phase digital controllers.
> 
> Signed-off-by: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
> ---
>  .../bindings/hwmon/pmbus/onnn,fd5121.yaml          | 41 ++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml b/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
> new file mode 100644
> index 000000000000..b0453b0634f0
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/hwmon/pmbus/onnn,fd5121.yaml
> @@ -0,0 +1,41 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/hwmon/pmbus/onnn,fd5121.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: onsemi's multi-phase digital controllers

Can someone explain to me what a "digital controller" actually is?
Seems very generi and that a word may have been left out, were it not
for the fact that this wording is used several times in the patch.

> +
> +maintainers:
> +  - Selvamani Rajagopal <selvamani.rajagopal@onsemi.com>
> +
> +description:
> +  onsemi multi-phase digital controllers with PMBus.
> +
> +properties:
> +  compatible:
> +    enum:
> +      - onnn,fd5121
> +      - onnn,fd5123
> +      - onnn,fd5125

Your /OF/ match data in your driver suggests that you intended to permit
fallback compatibles here?

|+#ifdef CONFIG_OF
|+static const struct of_device_id fd5121_of_match[] = {
|+       { .compatible = "onnn,fd5121" },
|+       { }
|+};
|+MODULE_DEVICE_TABLE(of, fd5121_of_match);
|+#endif

pw-bot: changes-requested

Cheers,
Conor.

> +
> +  reg:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +  - reg
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    i2c {
> +      #address-cells = <1>;
> +      #size-cells = <0>;
> +
> +      fd5121@50 {
> +        compatible = "onnn,fd5121";
> +        reg = <0x50>;
> +      };
> +    };
> 
> -- 
> 2.43.0
> 
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* [PATCH v12 12/12] x86/vmscape: Add cmdline vmscape=on to override attack vector controls
From: Pawan Gupta @ 2026-06-23 17:35 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

In general, individual mitigation knobs override the attack vector
controls. For VMSCAPE, =ibpb exists but nothing to select BHB clearing
mitigation. The =force option would select BHB clearing when supported, but
with a side-effect of also forcing the bug, hence deploying the mitigation
on unaffected parts too.

Add a new cmdline option vmscape=on to enable the mitigation based on the
VMSCAPE variant the CPU is affected by.

Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 Documentation/admin-guide/hw-vuln/vmscape.rst   | 4 ++++
 Documentation/admin-guide/kernel-parameters.txt | 2 ++
 arch/x86/kernel/cpu/bugs.c                      | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index 7c40cf70ad7a..2558a5c3d956 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -117,3 +117,7 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
 
    Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
    (default when CONFIG_MITIGATION_VMSCAPE=y)
+
+ * ``vmscape=on``:
+
+   Same as ``auto``, except that it overrides attack vector controls.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 38594df8859f..1320baf9264c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8342,6 +8342,8 @@ Kernel parameters
 					  unaffected processors
 			auto		- (default) use IBPB or BHB clear
 					  mitigation based on CPU
+			on		- same as "auto", but override attack
+					  vector control
 
 	vsyscall=	[X86-64,EARLY]
 			Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index fbdb137720c4..4e0b77fb21dd 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3088,6 +3088,8 @@ static int __init vmscape_parse_cmdline(char *str)
 	} else if (!strcmp(str, "force")) {
 		setup_force_cpu_bug(X86_BUG_VMSCAPE);
 		vmscape_mitigation = VMSCAPE_MITIGATION_ON;
+	} else if (!strcmp(str, "on")) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_ON;
 	} else if (!strcmp(str, "auto")) {
 		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
 	} else {

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 11/12] x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
From: Pawan Gupta @ 2026-06-23 17:35 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

vmscape=force option currently defaults to AUTO mitigation. This lets
attack-vector controls to override the vmscape mitigation. Preventing the
user from being able to force VMSCAPE mitigation.

When vmscape mitigation is forced, allow it be deployed irrespective of
attack vectors. Introduce VMSCAPE_MITIGATION_ON that wins over
attack-vector controls.

Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 1082ed1fb2e6..fbdb137720c4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3058,6 +3058,7 @@ static void __init srso_apply_mitigation(void)
 enum vmscape_mitigations {
 	VMSCAPE_MITIGATION_NONE,
 	VMSCAPE_MITIGATION_AUTO,
+	VMSCAPE_MITIGATION_ON,
 	VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
 	VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
 	VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
@@ -3066,6 +3067,7 @@ enum vmscape_mitigations {
 static const char * const vmscape_strings[] = {
 	[VMSCAPE_MITIGATION_NONE]			= "Vulnerable",
 	/* [VMSCAPE_MITIGATION_AUTO] */
+	/* [VMSCAPE_MITIGATION_ON] */
 	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]		= "Mitigation: IBPB before exit to userspace",
 	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]		= "Mitigation: IBPB on VMEXIT",
 	[VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER]	= "Mitigation: Clear BHB before exit to userspace",
@@ -3085,7 +3087,7 @@ static int __init vmscape_parse_cmdline(char *str)
 		vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
 	} else if (!strcmp(str, "force")) {
 		setup_force_cpu_bug(X86_BUG_VMSCAPE);
-		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+		vmscape_mitigation = VMSCAPE_MITIGATION_ON;
 	} else if (!strcmp(str, "auto")) {
 		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
 	} else {
@@ -3117,6 +3119,7 @@ static void __init vmscape_select_mitigation(void)
 		break;
 
 	case VMSCAPE_MITIGATION_AUTO:
+	case VMSCAPE_MITIGATION_ON:
 		/*
 		 * CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
 		 * BHB clear sequence. These CPUs are only vulnerable to the BHI
@@ -3244,6 +3247,7 @@ void cpu_bugs_smt_update(void)
 	switch (vmscape_mitigation) {
 	case VMSCAPE_MITIGATION_NONE:
 	case VMSCAPE_MITIGATION_AUTO:
+	case VMSCAPE_MITIGATION_ON:
 		break;
 	case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
 	case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 10/12] x86/vmscape: Deploy BHB clearing mitigation
From: Pawan Gupta @ 2026-06-23 17:35 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

IBPB mitigation for VMSCAPE is an overkill on CPUs that are only affected
by the BHI variant of VMSCAPE. On such CPUs, eIBRS already provides
indirect branch isolation between guest and host userspace. However, branch
history from guest may also influence the indirect branches in host
userspace.

To mitigate the BHI aspect, use the BHB clearing sequence. Since now, IBPB
is not the only mitigation for VMSCAPE, update the documentation to reflect
that =auto could select either IBPB or BHB clear mitigation based on the
CPU.

Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 Documentation/admin-guide/hw-vuln/vmscape.rst   | 11 ++++++++-
 Documentation/admin-guide/kernel-parameters.txt |  4 +++-
 arch/x86/include/asm/entry-common.h             |  4 ++++
 arch/x86/include/asm/nospec-branch.h            |  2 ++
 arch/x86/kernel/cpu/bugs.c                      | 30 +++++++++++++++++++------
 5 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index d9b9a2b6c114..7c40cf70ad7a 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -86,6 +86,10 @@ The possible values in this file are:
    run a potentially malicious guest and issues an IBPB before the first
    exit to userspace after VM-exit.
 
+ * 'Mitigation: Clear BHB before exit to userspace':
+
+   As above, conditional BHB clearing mitigation is enabled.
+
  * 'Mitigation: IBPB on VMEXIT':
 
    IBPB is issued on every VM-exit. This occurs when other mitigations like
@@ -102,9 +106,14 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
 
  * ``vmscape=ibpb``:
 
-   Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
+   Enable conditional IBPB mitigation.
 
  * ``vmscape=force``:
 
    Force vulnerability detection and mitigation even on processors that are
    not known to be affected.
+
+ * ``vmscape=auto``:
+
+   Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
+   (default when CONFIG_MITIGATION_VMSCAPE=y)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 97007f4f69d4..38594df8859f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8337,9 +8337,11 @@ Kernel parameters
 
 			off		- disable the mitigation
 			ibpb		- use Indirect Branch Prediction Barrier
-					  (IBPB) mitigation (default)
+					  (IBPB) mitigation
 			force		- force vulnerability detection even on
 					  unaffected processors
+			auto		- (default) use IBPB or BHB clear
+					  mitigation based on CPU
 
 	vsyscall=	[X86-64,EARLY]
 			Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index d5b390c76f00..befa14a19817 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -84,6 +84,10 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 #endif
 
 	if (unlikely(this_cpu_read(x86_predictor_flush_exit_to_user))) {
+		/*
+		 * Since the mitigation is for userspace, an explicit
+		 * speculation barrier is not required after flush.
+		 */
 		static_call_cond(vmscape_predictor_flush)();
 		this_cpu_write(x86_predictor_flush_exit_to_user, false);
 	}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 066fd8095200..38478383139b 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -390,6 +390,8 @@ extern void write_ibpb(void);
 
 #ifdef CONFIG_X86_64
 extern void clear_bhb_loop_nofence(void);
+#else
+static inline void clear_bhb_loop_nofence(void) {}
 #endif
 
 extern void (*x86_return_thunk)(void);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bfc0e41697f6..1082ed1fb2e6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -61,9 +61,8 @@ DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
 EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
 
 /*
- * Set when the CPU has run a potentially malicious guest. An IBPB will
- * be needed to before running userspace. That IBPB will flush the branch
- * predictor content.
+ * Set when the CPU has run a potentially malicious guest. Indicates that a
+ * branch predictor flush is needed before running userspace.
  */
 DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
 EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
@@ -3061,13 +3060,15 @@ enum vmscape_mitigations {
 	VMSCAPE_MITIGATION_AUTO,
 	VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
 	VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
+	VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
 };
 
 static const char * const vmscape_strings[] = {
-	[VMSCAPE_MITIGATION_NONE]		= "Vulnerable",
+	[VMSCAPE_MITIGATION_NONE]			= "Vulnerable",
 	/* [VMSCAPE_MITIGATION_AUTO] */
-	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]	= "Mitigation: IBPB before exit to userspace",
-	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT",
+	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]		= "Mitigation: IBPB before exit to userspace",
+	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]		= "Mitigation: IBPB on VMEXIT",
+	[VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER]	= "Mitigation: Clear BHB before exit to userspace",
 };
 
 static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
@@ -3085,6 +3086,8 @@ static int __init vmscape_parse_cmdline(char *str)
 	} else if (!strcmp(str, "force")) {
 		setup_force_cpu_bug(X86_BUG_VMSCAPE);
 		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+	} else if (!strcmp(str, "auto")) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
 	} else {
 		pr_err("Ignoring unknown vmscape=%s option.\n", str);
 	}
@@ -3114,7 +3117,17 @@ static void __init vmscape_select_mitigation(void)
 		break;
 
 	case VMSCAPE_MITIGATION_AUTO:
-		if (boot_cpu_has(X86_FEATURE_IBPB))
+		/*
+		 * CPUs with BHI_CTRL(ADL and newer) can avoid the IBPB and use
+		 * BHB clear sequence. These CPUs are only vulnerable to the BHI
+		 * variant of the VMSCAPE attack, and thus they do not require a
+		 * full predictor flush.
+		 *
+		 * Note, in 32-bit mode BHB clear sequence is not supported.
+		 */
+		if (boot_cpu_has(X86_FEATURE_BHI_CTRL) && IS_ENABLED(CONFIG_X86_64))
+			vmscape_mitigation = VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER;
+		else if (boot_cpu_has(X86_FEATURE_IBPB))
 			vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
 		else
 			vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
@@ -3141,6 +3154,8 @@ static void __init vmscape_apply_mitigation(void)
 {
 	if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
 		static_call_update(vmscape_predictor_flush, write_ibpb);
+	else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
+		static_call_update(vmscape_predictor_flush, clear_bhb_loop_nofence);
 }
 
 #undef pr_fmt
@@ -3232,6 +3247,7 @@ void cpu_bugs_smt_update(void)
 		break;
 	case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
 	case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+	case VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER:
 		/*
 		 * Hypervisors can be attacked across-threads, warn for SMT when
 		 * STIBP is not already enabled system-wide.

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 09/12] x86/vmscape: Use static_call() for predictor flush
From: Pawan Gupta @ 2026-06-23 17:34 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

Adding more mitigation options at exit-to-userspace for VMSCAPE would
usually require a series of checks to decide which mitigation to use. In
this case, the mitigation is done by calling a function, which is decided
at boot. So, adding more feature flags and multiple checks can be avoided
by using static_call() to the mitigating function.

Replace the flag-based mitigation selector with a static_call(). This also
frees the existing X86_FEATURE_IBPB_EXIT_TO_USER.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/Kconfig                     | 1 +
 arch/x86/include/asm/cpufeatures.h   | 2 +-
 arch/x86/include/asm/entry-common.h  | 7 +++----
 arch/x86/include/asm/nospec-branch.h | 3 +++
 arch/x86/kernel/cpu/bugs.c           | 9 ++++++++-
 arch/x86/kvm/x86.c                   | 2 +-
 6 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3f7cb01d69d..c5965bcf14c1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2717,6 +2717,7 @@ config MITIGATION_TSA
 config MITIGATION_VMSCAPE
 	bool "Mitigate VMSCAPE"
 	depends on KVM
+	depends on HAVE_STATIC_CALL
 	default y
 	help
 	  Enable mitigation for VMSCAPE attacks. VMSCAPE is a hardware security
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1d506e5d6f46..09f956b72637 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -504,7 +504,7 @@
 #define X86_FEATURE_TSA_SQ_NO		(21*32+11) /* AMD CPU not vulnerable to TSA-SQ */
 #define X86_FEATURE_TSA_L1_NO		(21*32+12) /* AMD CPU not vulnerable to TSA-L1 */
 #define X86_FEATURE_CLEAR_CPU_BUF_VM	(21*32+13) /* Clear CPU buffers using VERW before VMRUN */
-#define X86_FEATURE_IBPB_EXIT_TO_USER	(21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
+/* Free */
 #define X86_FEATURE_ABMC		(21*32+15) /* Assignable Bandwidth Monitoring Counters */
 #define X86_FEATURE_MSR_IMM		(21*32+16) /* MSR immediate form instructions */
 #define X86_FEATURE_SGX_EUPDATESVN	(21*32+17) /* Support for ENCLS[EUPDATESVN] instruction */
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 3be6d4c356ed..d5b390c76f00 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -4,6 +4,7 @@
 
 #include <linux/randomize_kstack.h>
 #include <linux/user-return-notifier.h>
+#include <linux/static_call_types.h>
 
 #include <asm/nospec-branch.h>
 #include <asm/io_bitmap.h>
@@ -82,10 +83,8 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 	current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED);
 #endif
 
-	/* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
-	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
-	    this_cpu_read(x86_predictor_flush_exit_to_user)) {
-		write_ibpb();
+	if (unlikely(this_cpu_read(x86_predictor_flush_exit_to_user))) {
+		static_call_cond(vmscape_predictor_flush)();
 		this_cpu_write(x86_predictor_flush_exit_to_user, false);
 	}
 }
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 0381db59c39d..066fd8095200 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -542,6 +542,9 @@ static inline void indirect_branch_prediction_barrier(void)
 			    :: "rax", "rcx", "rdx", "memory");
 }
 
+#include <linux/static_call_types.h>
+DECLARE_STATIC_CALL(vmscape_predictor_flush, write_ibpb);
+
 /* The Intel SPEC CTRL MSR base value cache */
 extern u64 x86_spec_ctrl_base;
 DECLARE_PER_CPU(u64, x86_spec_ctrl_current);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 636280c612f0..bfc0e41697f6 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -144,6 +144,13 @@ EXPORT_SYMBOL_GPL(cpu_buf_idle_clear);
  */
 DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
 
+/*
+ * Controls how vmscape is mitigated e.g. via IBPB or BHB-clear
+ * sequence. This defaults to no mitigation.
+ */
+DEFINE_STATIC_CALL_NULL(vmscape_predictor_flush, write_ibpb);
+EXPORT_STATIC_CALL_FOR_KVM(vmscape_predictor_flush);
+
 #undef pr_fmt
 #define pr_fmt(fmt)	"mitigations: " fmt
 
@@ -3133,7 +3140,7 @@ static void __init vmscape_update_mitigation(void)
 static void __init vmscape_apply_mitigation(void)
 {
 	if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
-		setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
+		static_call_update(vmscape_predictor_flush, write_ibpb);
 }
 
 #undef pr_fmt
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 721ff7667dc0..fcd61c47653f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11556,7 +11556,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	 * set for the CPU that actually ran the guest, and not the CPU that it
 	 * may migrate to.
 	 */
-	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
+	if (static_call_query(vmscape_predictor_flush))
 		this_cpu_write(x86_predictor_flush_exit_to_user, true);
 
 	/*

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 08/12] KVM: Define EXPORT_STATIC_CALL_FOR_KVM()
From: Pawan Gupta @ 2026-06-23 17:34 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

EXPORT_SYMBOL_FOR_KVM() exists to export symbols to KVM modules. Static
calls need the same treatment when the core kernel defines a static_call
that KVM needs access to (e.g. from a VM-exit path).

Define EXPORT_STATIC_CALL_FOR_KVM() as the static_call analogue of
EXPORT_SYMBOL_FOR_KVM(). The same three-way logic applies:

  - KVM_SUB_MODULES defined: export to "kvm," plus all sub-modules
  - KVM=m, no sub-modules: export to "kvm" only
  - KVM built-in: no export needed (noop)

  As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override both
  macros (e.g. to suppress the export when kvm.ko itself will not be
  built despite CONFIG_KVM=m). Add the x86 no-op overrides in
  arch/x86/include/asm/kvm_types.h for that case. To keep the pair in
  sync, EXPORT_STATIC_CALL_FOR_KVM() is defined inside the
  EXPORT_SYMBOL_FOR_KVM #ifndef block; an arch that defines
  EXPORT_SYMBOL_FOR_KVM must also define EXPORT_STATIC_CALL_FOR_KVM or the
  build will fail with a compile-time error.

As with EXPORT_SYMBOL_FOR_KVM(), allow architectures to override
EXPORT_STATIC_CALL_FOR_KVM definition (e.g. to suppress the export when
kvm.ko itself will not be built despite CONFIG_KVM=m). Add the x86 no-op
override in arch/x86/include/asm/kvm_types.h for that case.

Architectures must also define EXPORT_STATIC_CALL_FOR_KVM when they define
EXPORT_SYMBOL_FOR_KVM.

Suggested-by: Sean Christopherson <seanjc@google.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/include/asm/kvm_types.h |  1 +
 include/linux/kvm_types.h        | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_types.h
index d7c704ed1be9..bceeaed2940e 100644
--- a/arch/x86/include/asm/kvm_types.h
+++ b/arch/x86/include/asm/kvm_types.h
@@ -15,6 +15,7 @@
  * at least one vendor module is enabled.
  */
 #define EXPORT_SYMBOL_FOR_KVM(symbol)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
 #endif
 
 #define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index a568d8e6f4e8..be602d3f287e 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -13,6 +13,8 @@
 	EXPORT_SYMBOL_FOR_MODULES(symbol, __stringify(KVM_SUB_MODULES))
 #define EXPORT_SYMBOL_FOR_KVM(symbol) \
 	EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) \
+	EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm," __stringify(KVM_SUB_MODULES))
 #else
 #define EXPORT_SYMBOL_FOR_KVM_INTERNAL(symbol)
 /*
@@ -23,11 +25,17 @@
 #ifndef EXPORT_SYMBOL_FOR_KVM
 #if IS_MODULE(CONFIG_KVM)
 #define EXPORT_SYMBOL_FOR_KVM(symbol) EXPORT_SYMBOL_FOR_MODULES(symbol, "kvm")
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol) EXPORT_STATIC_CALL_FOR_MODULES(symbol, "kvm")
 #else
 #define EXPORT_SYMBOL_FOR_KVM(symbol)
+#define EXPORT_STATIC_CALL_FOR_KVM(symbol)
 #endif /* IS_MODULE(CONFIG_KVM) */
-#endif /* EXPORT_SYMBOL_FOR_KVM */
+#else
+#ifndef EXPORT_STATIC_CALL_FOR_KVM
+#error Must #define EXPORT_STATIC_CALL_FOR_KVM if #defining EXPORT_SYMBOL_FOR_KVM
 #endif
+#endif /* EXPORT_SYMBOL_FOR_KVM */
+#endif /* KVM_SUB_MODULES */
 
 #ifndef __ASSEMBLER__
 

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 07/12] static_call: Define EXPORT_STATIC_CALL_FOR_MODULES()
From: Pawan Gupta @ 2026-06-23 17:34 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

There is EXPORT_STATIC_CALL_TRAMP() that hides the static key from all
modules. But there is no equivalent of EXPORT_SYMBOL_FOR_MODULES() to
restrict symbol visibility to only certain modules.

Add EXPORT_STATIC_CALL_FOR_MODULES(name, mods) that wraps both the key and
the trampoline with EXPORT_SYMBOL_FOR_MODULES(), allowing only a limited
set of modules to see and update the static key.

The immediate user is KVM, in the following commit.

checkpatch reported below warnings with this change that I believe don't
apply in this case:

  include/linux/static_call.h:219: WARNING: Non-declarative macros with multiple statements should be enclosed in a do - while loop
  include/linux/static_call.h:220: WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 include/linux/static_call.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 78a77a4ae0ea..b610afd1ed55 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -216,6 +216,9 @@ extern long __static_call_return0(void);
 #define EXPORT_STATIC_CALL_GPL(name)					\
 	EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name));			\
 	EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods)			\
+	EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods);		\
+	EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_TRAMP(name), mods)
 
 /* Leave the key unexported, so modules can't change static call targets: */
 #define EXPORT_STATIC_CALL_TRAMP(name)					\
@@ -276,6 +279,9 @@ extern long __static_call_return0(void);
 #define EXPORT_STATIC_CALL_GPL(name)					\
 	EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name));			\
 	EXPORT_SYMBOL_GPL(STATIC_CALL_TRAMP(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods)			\
+	EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods);		\
+	EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_TRAMP(name), mods)
 
 /* Leave the key unexported, so modules can't change static call targets: */
 #define EXPORT_STATIC_CALL_TRAMP(name)					\
@@ -346,6 +352,8 @@ static inline int static_call_text_reserved(void *start, void *end)
 
 #define EXPORT_STATIC_CALL(name)	EXPORT_SYMBOL(STATIC_CALL_KEY(name))
 #define EXPORT_STATIC_CALL_GPL(name)	EXPORT_SYMBOL_GPL(STATIC_CALL_KEY(name))
+#define EXPORT_STATIC_CALL_FOR_MODULES(name, mods)			\
+	EXPORT_SYMBOL_FOR_MODULES(STATIC_CALL_KEY(name), mods)
 
 #endif /* CONFIG_HAVE_STATIC_CALL */
 

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 06/12] x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier()
From: Pawan Gupta @ 2026-06-23 17:34 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

indirect_branch_prediction_barrier() is a wrapper to write_ibpb(), which
also checks if the CPU supports IBPB. For VMSCAPE, call to
indirect_branch_prediction_barrier() is only possible when CPU supports
IBPB.

Simply call write_ibpb() directly to avoid unnecessary alternative
patching.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/include/asm/entry-common.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index e2b985929083..3be6d4c356ed 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -85,7 +85,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 	/* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
 	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
 	    this_cpu_read(x86_predictor_flush_exit_to_user)) {
-		indirect_branch_prediction_barrier();
+		write_ibpb();
 		this_cpu_write(x86_predictor_flush_exit_to_user, false);
 	}
 }

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 05/12] x86/vmscape: Move mitigation selection to a switch()
From: Pawan Gupta @ 2026-06-23 17:33 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

This ensures that all mitigation modes are explicitly handled, while
keeping the mitigation selection for each mode together. This also prepares
for adding BHB-clearing mitigation mode for VMSCAPE.

Tested-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 002bf4adccc3..636280c612f0 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -3088,17 +3088,33 @@ early_param("vmscape", vmscape_parse_cmdline);
 
 static void __init vmscape_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
-	    !boot_cpu_has(X86_FEATURE_IBPB)) {
+	if (!boot_cpu_has_bug(X86_BUG_VMSCAPE)) {
 		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
 		return;
 	}
 
-	if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) {
-		if (should_mitigate_vuln(X86_BUG_VMSCAPE))
+	if ((vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) &&
+	    !should_mitigate_vuln(X86_BUG_VMSCAPE))
+		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+
+	switch (vmscape_mitigation) {
+	case VMSCAPE_MITIGATION_NONE:
+		break;
+
+	case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+		if (!boot_cpu_has(X86_FEATURE_IBPB))
+			vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+		break;
+
+	case VMSCAPE_MITIGATION_AUTO:
+		if (boot_cpu_has(X86_FEATURE_IBPB))
 			vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
 		else
 			vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+		break;
+
+	default:
+		break;
 	}
 }
 

-- 
2.34.1



^ permalink raw reply related

* [PATCH v12 04/12] x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user
From: Pawan Gupta @ 2026-06-23 17:33 UTC (permalink / raw)
  To: x86, Jon Kohler, Nikolay Borisov, H. Peter Anvin, Josh Poimboeuf,
	David Kaplan, Sean Christopherson, Borislav Petkov, Dave Hansen,
	Peter Zijlstra, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, KP Singh, Jiri Olsa, David S. Miller,
	David Laight, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	David Ahern, Martin KaFai Lau, Eduard Zingerman, Song Liu,
	Yonghong Song, John Fastabend, Stanislav Fomichev, Hao Luo,
	Paolo Bonzini, Jonathan Corbet, Jason Baron, Alice Ryhl,
	Steven Rostedt, Ard Biesheuvel, Shuah Khan
  Cc: linux-kernel, kvm, Asit Mallick, Tao Zhang, bpf, netdev,
	linux-doc
In-Reply-To: <20260622-vmscape-bhb-v12-0-76cbda0ae3e5@linux.intel.com>

With the upcoming changes x86_ibpb_exit_to_user will also be used when BHB
clearing sequence is used. Rename it cover both the cases.

No functional change.

Suggested-by: Sean Christopherson <seanjc@google.com>
Tested-by: Jon Kohler <jon@nutanix.com>
Acked-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
 arch/x86/include/asm/entry-common.h  | 6 +++---
 arch/x86/include/asm/nospec-branch.h | 2 +-
 arch/x86/kernel/cpu/bugs.c           | 4 ++--
 arch/x86/kvm/x86.c                   | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index eca24b5e07f4..e2b985929083 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -82,11 +82,11 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 	current_thread_info()->status &= ~(TS_COMPAT | TS_I386_REGS_POKED);
 #endif
 
-	/* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
+	/* Avoid unnecessary reads of 'x86_predictor_flush_exit_to_user' */
 	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
-	    this_cpu_read(x86_ibpb_exit_to_user)) {
+	    this_cpu_read(x86_predictor_flush_exit_to_user)) {
 		indirect_branch_prediction_barrier();
-		this_cpu_write(x86_ibpb_exit_to_user, false);
+		this_cpu_write(x86_predictor_flush_exit_to_user, false);
 	}
 }
 #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 157eb69c7f0f..0381db59c39d 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -533,7 +533,7 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
 		: "memory");
 }
 
-DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
+DECLARE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
 
 static inline void indirect_branch_prediction_barrier(void)
 {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 2cb4a96247d8..002bf4adccc3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -65,8 +65,8 @@ EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
  * be needed to before running userspace. That IBPB will flush the branch
  * predictor content.
  */
-DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
-EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);
+DEFINE_PER_CPU(bool, x86_predictor_flush_exit_to_user);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_predictor_flush_exit_to_user);
 
 u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0550359ed798..721ff7667dc0 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11557,7 +11557,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	 * may migrate to.
 	 */
 	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
-		this_cpu_write(x86_ibpb_exit_to_user, true);
+		this_cpu_write(x86_predictor_flush_exit_to_user, true);
 
 	/*
 	 * Consume any pending interrupts, including the possible source of

-- 
2.34.1



^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox