Linux Documentation
* Re: [PATCH] docs: update rust-analyzer command
From: Miguel Ojeda @ 2026-05-13 10:29 UTC (permalink / raw)
  To: Onur Özkan, Tamir Duberstein, Jesung Yang
  Cc: rust-for-linux, linux-doc, linux-kernel, ojeda, boqun, gary,
	bjorn3_gh, lossin, a.hindborg, aliceryhl, tmgross, dakr, corbet,
	skhan, alexs, si.yanteng, dzm91
In-Reply-To: <20260513092017.265269-1-work@onurozkan.dev>

On Wed, May 13, 2026 at 11:20 AM Onur Özkan <work@onurozkan.dev> wrote:
>
> diff --git a/Documentation/rust/quick-start.rst b/Documentation/rust/quick-start.rst
> index a6ec3fa94d33..df5b54b51deb 100644
> --- a/Documentation/rust/quick-start.rst
> +++ b/Documentation/rust/quick-start.rst
> @@ -314,7 +314,7 @@ definition, and other features.
>  ``rust-analyzer`` needs a configuration file, ``rust-project.json``, which
>  can be generated by the ``rust-analyzer`` Make target::
>
> -       make LLVM=1 rust-analyzer
> +       make LLVM=1 prepare rust-analyzer

Perhaps we should add a brief sentence after this code block
explaining why `prepare` is there, e.g. adapted from the commit
message:

    For the best experience, it is recommended to make ``prepare``
    together with the ``rust-analyzer`` target so that all generated
    files (e.g. proc macros) are available.

Cc'ing Tamir and Jesung.

Cheers,
Miguel


* Re: [PATCH v12 02/11] lib: kstrtox: add kstrtoudec64() and kstrtodec64()
From: David Laight @ 2026-05-13 10:09 UTC (permalink / raw)
  To: Rodrigo Alencar
  Cc: Andy Shevchenko, Andy Shevchenko, Jonathan Cameron,
	Rodrigo Alencar via B4 Relay, rodrigo.alencar, linux-kernel,
	linux-iio, devicetree, linux-doc, David Lechner, Andy Shevchenko,
	Lars-Peter Clausen, Michael Hennerich, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet, Andrew Morton,
	Petr Mladek, Steven Rostedt, Rasmus Villemoes, Sergey Senozhatsky,
	Shuah Khan
In-Reply-To: <sqt72hd4xdu6rj3zecvcpo3bbfsxlh7u4bi76enbq64hpgjm3t@vksuk4cuo76x>

On Wed, 13 May 2026 08:14:28 +0100
Rodrigo Alencar <455.rodrigo.alencar@gmail.com> wrote:

> On 26/05/12 11:16PM, Andy Shevchenko wrote:
> > On Tue, May 12, 2026 at 08:39:21PM +0100, Rodrigo Alencar wrote:  
...
> > Oh, I only now realised that this is a sliding window for a single 64-bit signed value!
> > I was under the impression that you wanted an implementation that covers a 128-bit signed value
> > (with 64 + 64)...  
> 
> So that was the initial approach with strntoull() with integer and fractional parts
> combined in iio core. At that time I realized that we ended up combining them anyways
> with:
> 
> 	val64 = (u64)val * MICRO + val2
> 
> so why not have val64 already! And all this made me realise that once leading 0s are ok,
> scale can be even bigger, e.g.
> 
> scale = 20
> 	max = 0.09223372036854775807, min = -0.09223372036854775808
> scale = 21
> 	max = 0.009223372036854775807, min = -0.009223372036854775808
> 
> It might be a sliding window of 19 digits, but here we trade range for scale, precision
> is still fixed at 64-bit.

I wouldn't worry about that case unless it 'falls out in the wash'.

> I have a new idea to make things simpler; actually,
> it would go back to what David pointed out in the past.

:-)

-- David

> Let me put this together...
> 
> > > I am not representing -0.9999999999999999999 as is. The desired scale will have this
> > > truncated. It may be -0.9999 or -0.999999 or -0.9. And this is practical for a
> > > reasonable scale value... for pico and femto precision you still get a decent range.  
> > 
> > -- 
> > With Best Regards,
> > Andy Shevchenko
> > 
> >   
> 



* [PATCH v2 5/5] KVM: PPC: Document KVM_PPC_GET_COMPAT_CAPS ioctl
From: Amit Machhiwal @ 2026-05-13 10:07 UTC (permalink / raw)
  To: linuxppc-dev, Madhavan Srinivasan
  Cc: Amit Machhiwal, Vaibhav Jain, Paolo Bonzini, Jonathan Corbet,
	Shuah Khan, kvm, linux-kernel, linux-doc
In-Reply-To: <20260513100755.83215-1-amachhiw@linux.ibm.com>

Add documentation for the KVM_PPC_GET_COMPAT_CAPS ioctl to the KVM API
documentation.

The ioctl exposes host processor compatibility modes supported for
nested KVM guests on PowerPC systems.

Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
---
 Documentation/virt/kvm/api.rst | 35 ++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 52bbbb553ce1..1b533f674a09 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6555,6 +6555,41 @@ KVM_S390_KEYOP_SSKE
 
 .. _kvm_run:
 
+4.145 KVM_PPC_GET_COMPAT_CAPS
+-----------------------------
+:Capability: KVM_CAP_PPC_COMPAT_CAPS
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_compat_caps (out)
+:Returns:
+	0 on successful completion,
+	-EFAULT if ``struct kvm_ppc_compat_caps`` cannot be written
+
+IBM POWER system server-based processors provide a compatibility mode feature
+where an Nth generation processor can operate in modes consistent with earlier
+generations such as (N-1) and (N-2).
+
+This ioctl provides userspace with information about the CPU compatibility modes
+supported by the current host processor for booting nested KVM guests on
+PowerNV (KVM nested APIv1) and PowerVM (KVM nested APIv2) platforms.
+
+::
+
+  struct kvm_ppc_compat_caps {
+	__u64	flags;			/* Reserved for future use */
+	__u64	compat_capabilities;	/* Capabilities supported by the host */
+  };
+
+The ``compat_capabilities`` bit field describes the processor compatibility
+modes supported by the host. For example, the following bits indicate support
+for specific processor modes.
+
+::
+
+  H_GUEST_CAP_POWER9  (bit 1): KVM guests can run in Power9 processor mode
+  H_GUEST_CAP_POWER10 (bit 2): KVM guests can run in Power10 processor mode
+  H_GUEST_CAP_POWER11 (bit 3): KVM guests can run in Power11 processor mode
+
 5. The kvm_run structure
 ========================
 
-- 
2.50.1 (Apple Git-155)



* [PATCH] Documentation: iio: make ADXL Y-axis calibbias description consistent
From: Stepan Ionichev @ 2026-05-13 10:07 UTC (permalink / raw)
  To: jic23
  Cc: dlechner, nuno.sa, andy, corbet, skhan, linux-iio, linux-doc,
	linux-kernel, sozdayvek

The Y-axis calibbias rows in adxl345.rst, adxl313.rst and adxl380.rst
use a different wording than the matching X-axis and Z-axis rows in
the same tables: the X/Z rows say "Calibration offset for the
X/Z-axis accelerometer channel." while the Y row says "Y-axis (or
y-axis) acceleration offset correction".

Make the Y-axis row match the other two so each driver's sysfs
table has consistent capitalisation and wording.

Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
---
 Documentation/iio/adxl313.rst | 2 +-
 Documentation/iio/adxl345.rst | 2 +-
 Documentation/iio/adxl380.rst | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/iio/adxl313.rst b/Documentation/iio/adxl313.rst
index 966e72c01..b18b54e47 100644
--- a/Documentation/iio/adxl313.rst
+++ b/Documentation/iio/adxl313.rst
@@ -38,7 +38,7 @@ specific device folder path ``/sys/bus/iio/devices/iio:deviceX``.
 +---------------------------------------------------+----------------------------------------------------------+
 | in_accel_x_raw                                    | Raw X-axis accelerometer channel value.                  |
 +---------------------------------------------------+----------------------------------------------------------+
-| in_accel_y_calibbias                              | y-axis acceleration offset correction                    |
+| in_accel_y_calibbias                              | Calibration offset for the Y-axis accelerometer channel. |
 +---------------------------------------------------+----------------------------------------------------------+
 | in_accel_y_raw                                    | Raw Y-axis accelerometer channel value.                  |
 +---------------------------------------------------+----------------------------------------------------------+
diff --git a/Documentation/iio/adxl345.rst b/Documentation/iio/adxl345.rst
index 978f746a8..0aa33a852 100644
--- a/Documentation/iio/adxl345.rst
+++ b/Documentation/iio/adxl345.rst
@@ -47,7 +47,7 @@ specific device folder path ``/sys/bus/iio/devices/iio:deviceX``.
 +-------------------------------------------+----------------------------------------------------------+
 | in_accel_x_raw                            | Raw X-axis accelerometer channel value.                  |
 +-------------------------------------------+----------------------------------------------------------+
-| in_accel_y_calibbias                      | Y-axis acceleration offset correction                    |
+| in_accel_y_calibbias                      | Calibration offset for the Y-axis accelerometer channel. |
 +-------------------------------------------+----------------------------------------------------------+
 | in_accel_y_raw                            | Raw Y-axis accelerometer channel value.                  |
 +-------------------------------------------+----------------------------------------------------------+
diff --git a/Documentation/iio/adxl380.rst b/Documentation/iio/adxl380.rst
index 61cafa2f9..654d4c0e8 100644
--- a/Documentation/iio/adxl380.rst
+++ b/Documentation/iio/adxl380.rst
@@ -51,7 +51,7 @@ specific device folder path ``/sys/bus/iio/devices/iio:deviceX``.
 +---------------------------------------------------+----------------------------------------------------------+
 | in_accel_x_raw                                    | Raw X-axis accelerometer channel value.                  |
 +---------------------------------------------------+----------------------------------------------------------+
-| in_accel_y_calibbias                              | y-axis acceleration offset correction                    |
+| in_accel_y_calibbias                              | Calibration offset for the Y-axis accelerometer channel. |
 +---------------------------------------------------+----------------------------------------------------------+
 | in_accel_y_raw                                    | Raw Y-axis accelerometer channel value.                  |
 +---------------------------------------------------+----------------------------------------------------------+
-- 
2.43.0



* [PATCH v2 0/5] KVM: PPC: Handle CPU compatibility mode for nested guests
From: Amit Machhiwal @ 2026-05-13 10:07 UTC (permalink / raw)
  To: linuxppc-dev, Madhavan Srinivasan
  Cc: Amit Machhiwal, Vaibhav Jain, Paolo Bonzini, Nicholas Piggin,
	Michael Ellerman, Christophe Leroy (CS GROUP), Jonathan Corbet,
	Shuah Khan, kvm, linux-kernel, linux-doc

On POWER systems, newer processor generations can operate in compatibility
modes corresponding to earlier generations (e.g., a Power11 system running
in Power10 compatibility mode). In such cases, the effective CPU level
exposed to guests differs from the physical processor generation.

This creates a problem for nested virtualization. When booting a nested KVM
guest (L2) inside a host KVM guest (L1) running in a compatibility mode,
userspace (e.g., QEMU) may derive the CPU model from the raw hardware PVR
and attempt to configure the nested guest accordingly. However, the L1
partition is constrained by the compatibility level negotiated with the
hypervisor (L0), and requests exceeding that level are rejected, leading to
guest boot failures such as:

  KVM-NESTEDv2: couldn't set guest wide elements

This series addresses the issue in two steps:

1. Detect and reject invalid compatibility requests early in KVM to avoid
   late failures.

2. Provide a mechanism for userspace to query the effective CPU
   compatibility modes supported by the host, so it can select an
   appropriate CPU model for nested guests.

To achieve this, the series introduces a new KVM capability and ioctl
(KVM_CAP_PPC_COMPAT_CAPS / KVM_PPC_GET_COMPAT_CAPS) that expose the
compatibility modes supported by the host.

The implementation supports both:

  - PowerVM (nested API v2), where compatibility information is obtained
    via the H_GUEST_GET_CAPABILITIES hypercall.
  - PowerNV (nested API v1), where compatibility is derived from the device
    tree ("cpu-version") representing the effective processor compatibility
    level.

This allows userspace (e.g., QEMU) to select a CPU model consistent with
the host compatibility mode, avoiding mismatches and enabling successful
nested guest boot.
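
As a rough illustration of how userspace might consume the capability
mask, here is a minimal sketch in plain C. It does not use the kernel
headers: struct compat_caps, CAP_BIT() and newest_compat_mode() are
hypothetical names that only mirror the layout documented in patch 5/5,
and the MSB-0 ("IBM") bit numbering is an assumption that should be
checked against the H_GUEST_CAP_* definitions in the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local mirror of the struct documented in patch 5/5 (hypothetical name,
 * not taken from a kernel header). A real caller would have it filled in
 * by the KVM_PPC_GET_COMPAT_CAPS vm ioctl.
 */
struct compat_caps {
	uint64_t flags;			/* reserved for future use */
	uint64_t compat_capabilities;	/* modes supported by the host */
};

/* ASSUMPTION: "bit n" uses MSB-0 numbering, i.e. bit 0 is the top bit. */
#define CAP_BIT(n)	(UINT64_C(1) << (63 - (n)))
#define CAP_POWER9	CAP_BIT(1)
#define CAP_POWER10	CAP_BIT(2)
#define CAP_POWER11	CAP_BIT(3)

/*
 * Return the newest processor generation (9, 10 or 11) that the host
 * reports as usable for nested guests, or 0 if none is reported.
 */
static int newest_compat_mode(const struct compat_caps *caps)
{
	assert(caps);

	if (caps->compat_capabilities & CAP_POWER11)
		return 11;
	if (caps->compat_capabilities & CAP_POWER10)
		return 10;
	if (caps->compat_capabilities & CAP_POWER9)
		return 9;
	return 0;
}
```

With such a helper, a VMM could cap the CPU model it requests for the
L2 guest at the returned generation instead of deriving it from the raw
hardware PVR.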

Changes in v2:
  - Squashed patches 2 and 3 from v1 (capability introduction and ioctl
    wiring) into a single patch for better logical grouping
  - Changed kvm_ppc_compat_caps.flags from __u32 to __u64 for consistency
    and future extensibility
  - Addressed other review comments
  - Improved commit messages with clearer explanations of the changes

Patch summary:
  [1/5] Validate arch_compat against host compatibility mode
  [2/5] Introduce KVM_CAP_PPC_COMPAT_CAPS and wire up ioctl
  [3/5] Implement capability retrieval for PowerVM (API v2)
  [4/5] Add PowerNV support (API v1)
  [5/5] Document the new ioctl

Tested on:
  - Power11 pSeries LPAR in Power10 compatibility mode (nested API v2)
  - Power10 PowerNV system (and QEMU TCG PowerNV 11) with nested
    virtualization (API v1) with various combinations of KVM L1/L2 guests
    in various supported compatibility modes.

With this series, nested guests boot successfully in configurations where
they previously failed due to compatibility mismatches.

Related QEMU series:
  A corresponding QEMU series adds support for querying and using these
  compatibility capabilities when configuring nested KVM guests:
  https://lore.kernel.org/all/20260502140021.69712-1-amachhiw@linux.ibm.com/

v1: https://lore.kernel.org/linuxppc-dev/20260430054906.94431-1-amachhiw@linux.ibm.com/

Amit Machhiwal (5):
  KVM: PPC: Book3S HV: Validate arch_compat against host compatibility
    mode
  KVM: PPC: Introduce KVM_CAP_PPC_COMPAT_CAPS and wire up ioctl
  KVM: PPC: Book3S HV: Implement compat CPU capability retrieval for KVM
    on PowerVM
  KVM: PPC: Book3S HV: Add support for compat CPU capabilities for KVM
    on PowerNV
  KVM: PPC: Document KVM_PPC_GET_COMPAT_CAPS ioctl

 Documentation/virt/kvm/api.rst      | 35 ++++++++++++++++
 arch/powerpc/include/asm/kvm_ppc.h  |  1 +
 arch/powerpc/include/uapi/asm/kvm.h |  6 +++
 arch/powerpc/kvm/book3s_hv.c        | 63 +++++++++++++++++++++++++++++
 arch/powerpc/kvm/powerpc.c          | 21 ++++++++++
 include/uapi/linux/kvm.h            |  4 ++
 6 files changed, 130 insertions(+)


base-commit: 1d5dcaa3bd65f2e8c9baa14a393d3a2dc5db7524
-- 
2.50.1 (Apple Git-155)



* [PATCH] Documentation: iio: fix typo in triggered-buffers example
From: Stepan Ionichev @ 2026-05-13 10:06 UTC (permalink / raw)
  To: corbet
  Cc: jic23, dlechner, nuno.sa, andy, skhan, gregkh, hcazarim,
	linux-doc, linux-iio, linux-kernel, sozdayvek

The function call example in triggered-buffers.rst uses "polfunc"
(single 'l') while the function is defined as "pollfunc" (double
'l') on line 24 and referenced as "pollfunc" further down on
line 56. Fix the misspelling so the example is consistent.

Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
---
 Documentation/driver-api/iio/triggered-buffers.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/driver-api/iio/triggered-buffers.rst b/Documentation/driver-api/iio/triggered-buffers.rst
index 417555dbb..23b82357e 100644
--- a/Documentation/driver-api/iio/triggered-buffers.rst
+++ b/Documentation/driver-api/iio/triggered-buffers.rst
@@ -43,7 +43,7 @@ A typical triggered buffer setup looks like this::
     }
 
     /* setup triggered buffer, usually in probe function */
-    iio_triggered_buffer_setup(indio_dev, sensor_iio_polfunc,
+    iio_triggered_buffer_setup(indio_dev, sensor_iio_pollfunc,
                                sensor_trigger_handler,
                                sensor_buffer_setup_ops);
 
-- 
2.43.0



* [PATCH 3/3] Documentation: f2fs: document encrypted inline data
From: LiaoYuanhong-vivo @ 2026-05-13 10:04 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, Jonathan Corbet, Shuah Khan,
	open list:F2FS FILE SYSTEM, open list, open list:DOCUMENTATION
  Cc: Liao Yuanhong
In-Reply-To: <20260513100431.299904-1-liaoyuanhong@vivo.com>

From: Liao Yuanhong <liaoyuanhong@vivo.com>

Document the F2FS encrypted_inline_data feature, including the on-disk
feature requirement, the CONFIG_F2FS_FS_ENCRYPTED_INLINE_DATA dependency,
how inline payloads are encrypted and decrypted, and the truncate
behavior.

Also list encrypted_inline_data in the supported F2FS feature sysfs
documentation.

Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>

---
 Documentation/ABI/testing/sysfs-fs-f2fs |  5 +++--
 Documentation/filesystems/f2fs.rst      | 27 +++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index 27d5e88facbe..dad483fb2fc1 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -258,7 +258,8 @@ Description:	Expand /sys/fs/f2fs/<disk>/features to meet sysfs rule.
 		encryption, block_zoned (aka blkzoned), extra_attr,
 		project_quota (aka projquota), inode_checksum,
 		flexible_inline_xattr, quota_ino, inode_crtime, lost_found,
-		verity, sb_checksum, casefold, readonly, compression.
+		verity, sb_checksum, casefold, readonly, compression,
+		encrypted_inline_data.
 		Note that, pin_file is moved into /sys/fs/f2fs/features/.
 
 What:		/sys/fs/f2fs/features/
@@ -271,7 +272,7 @@ Description:	Shows all enabled kernel features.
 		inode_crtime, lost_found, verity, sb_checksum,
 		casefold, readonly, compression, test_dummy_encryption_v2,
 		atomic_write, pin_file, encrypted_casefold, linear_lookup,
-		fserror.
+		fserror, encrypted_inline_data.
 
 What:		/sys/fs/f2fs/<disk>/inject_rate
 Date:		May 2016
diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst
index 5bc37a1c4e51..1f3e02352dd1 100644
--- a/Documentation/filesystems/f2fs.rst
+++ b/Documentation/filesystems/f2fs.rst
@@ -420,6 +420,33 @@ lookup_mode=%s		 Control the directory lookup behavior for casefolded
 			     ================== ========================================
 ======================== ============================================================
 
+Encrypted inline data
+=====================
+
+F2FS normally disables inline data for encrypted regular files, since inline
+data is stored inside the inode block and does not pass through the regular
+block I/O path.  When a filesystem is formatted with the encrypted_inline_data
+feature, encrypted regular files may keep small file contents in the inode
+block.  The inline payload is encrypted with fscrypt contents-key semantics
+before it is written to the inode, and it is decrypted back to page-cache
+plaintext when it is read.
+
+This feature requires the encrypt feature on disk and kernel support for
+CONFIG_F2FS_FS_ENCRYPTED_INLINE_DATA.  It is intended to be used together with
+the inline_data mount option.  When the normal encrypted file contents path uses
+blk-crypto, fscrypt also prepares a software contents-key transform for the
+filesystem-managed inline payload.
+
+Encrypted inline data is stored in fscrypt contents-aligned units.  Therefore,
+the maximum plaintext size that can stay inline may be slightly smaller than the
+ordinary inline data capacity.  If an encrypted inline-data file is truncated
+from a non-zero offset, F2FS first converts the inline payload to normal data
+blocks and then applies the truncate operation.
+
+Recovery copies inline payloads as on-disk bytes.  Encryption and decryption are
+performed only when moving data between the inode inline area and page-cache
+plaintext.
+
 Debugfs Entries
 ===============
 
-- 
2.34.1


* [PATCH 0/3] f2fs: support encrypted inline data
From: LiaoYuanhong-vivo @ 2026-05-13 10:04 UTC (permalink / raw)
  To: Jaegeuk Kim, Chao Yu, Jonathan Corbet, Shuah Khan, Eric Biggers,
	Theodore Y. Ts'o, open list:F2FS FILE SYSTEM, open list,
	open list:DOCUMENTATION,
	open list:FSCRYPT: FILE SYSTEM LEVEL ENCRYPTION SUPPORT
  Cc: Liao Yuanhong

From: Liao Yuanhong <liaoyuanhong@vivo.com>

F2FS currently avoids inline data for encrypted regular files.  This is
because inline data is stored in the inode block, outside the regular
bio-based data path where fscrypt and blk-crypto normally operate.
As a result, devices that enable blk-crypto for encrypted file contents
cannot use F2FS inline data for encrypted regular files, which wastes
space for small files.

This series adds support for keeping small encrypted regular-file
contents as inline data.  The f2fs side defines a new on-disk feature,
encrypted_inline_data, under which inline payloads of encrypted regular
files are interpreted as ciphertext.  The payload is encrypted before
being stored in the inode block and decrypted back into page-cache
plaintext on read.

The fscrypt side prepares a software contents-key transform even when
normal file contents use blk-crypto, so filesystems can encrypt
filesystem-managed data regions that do not go through bio submission.
The new fscrypt helper operates on fscrypt data units and leaves the
filesystem responsible for deciding which filesystem-managed byte ranges
need this treatment.

The software crypto operation is limited to the inline payload.  Since
these files are small enough to remain inline, the expected read/write
performance difference between hardware and software crypto is small,
while the space saving from keeping the data inline is significant.

The feature is guarded by CONFIG_F2FS_FS_ENCRYPTED_INLINE_DATA and by the
F2FS encrypted_inline_data on-disk feature bit.  Filesystems with this
feature set are rejected if the kernel lacks the config option.

Hardware-wrapped keys are not supported by this initial version. I would
like to discuss whether this feature should remain disabled for
hardware-wrapped keys, or whether there is an acceptable way to support the
combination in the future.

The f2fs-tools support for formatting filesystems with this feature will be
submitted separately.

Basic testing passed.  Encrypted small files can be kept as inline data,
and read/write verification succeeded.

Liao Yuanhong (3):
  fscrypt: prepare software keys for filesystem-managed data units
  f2fs: support encrypted inline data
  Documentation: f2fs: document encrypted inline data

 Documentation/ABI/testing/sysfs-fs-f2fs |   5 +-
 Documentation/filesystems/f2fs.rst      |  27 ++++++
 fs/crypto/crypto.c                      |  63 +++++++++++++
 fs/crypto/fscrypt_private.h             |   3 +-
 fs/crypto/keysetup.c                    |  59 +++++++++---
 fs/f2fs/Kconfig                         |  14 +++
 fs/f2fs/data.c                          |   8 +-
 fs/f2fs/f2fs.h                          |  37 +++++++-
 fs/f2fs/file.c                          |  24 ++++-
 fs/f2fs/inline.c                        | 119 +++++++++++++++++++++---
 fs/f2fs/super.c                         |  12 +++
 fs/f2fs/sysfs.c                         |   8 ++
 include/linux/fscrypt.h                 |  28 ++++++
 13 files changed, 370 insertions(+), 37 deletions(-)

-- 
2.34.1


* Re: [PATCH v12 02/11] lib: kstrtox: add kstrtoudec64() and kstrtodec64()
From: Rodrigo Alencar @ 2026-05-13  9:41 UTC (permalink / raw)
  To: rodrigo.alencar, linux-kernel, linux-iio, devicetree, linux-doc
  Cc: Jonathan Cameron, David Lechner, Andy Shevchenko,
	Lars-Peter Clausen, Michael Hennerich, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet, Andrew Morton,
	Petr Mladek, Steven Rostedt, Andy Shevchenko, Rasmus Villemoes,
	Sergey Senozhatsky, Shuah Khan
In-Reply-To: <20260510-adf41513-iio-driver-v12-2-34af2ed2779f@analog.com>

On 26/05/10 01:42PM, Rodrigo Alencar via B4 Relay wrote:
> From: Rodrigo Alencar <rodrigo.alencar@analog.com>
> 
> Add helpers that parse decimal numbers into a 64-bit number, i.e., decimal
> point numbers with a pre-defined scale are parsed into a 64-bit value (fixed
> precision). After the decimal point, digits beyond the specified scale
> are ignored.

Hi Andy,

I am starting over here, the other conversation is getting hard to follow.
This is my new proposal...

...

> +static int _kstrtoudec64(const char *s, unsigned int scale, u64 *res)
> +{
> +	u64 _res = 0, _frac = 0;
> +	unsigned int rv;
> +
> +	if (scale > 19) /* log10(2^64) = 19.26 */
> +		return -EINVAL;
> +
> +	if (*s != '.') {
> +		rv = _parse_integer(s, 10, &_res);
> +		if (rv & KSTRTOX_OVERFLOW)
> +			return -ERANGE;
> +		if (rv == 0)
> +			return -EINVAL;
> +		s += rv;
> +	}
> +
> +	if (*s == '.' && scale) {
> +		s++; /* skip decimal point */
> +		rv = _parse_integer_limit(s, 10, &_frac, scale);
> +		if (rv & KSTRTOX_OVERFLOW)
> +			return -ERANGE;
> +		if (rv == 0)
> +			return -EINVAL;
> +		s += rv;
> +		if (rv < scale)
> +			_frac *= int_pow(10, scale - rv);
> +		while (isdigit(*s)) /* truncate */
> +			s++;
> +	}
> +
> +	if (*s == '\n')
> +		s++;
> +	if (*s)
> +		return -EINVAL;
> +
> +	if (check_mul_overflow(_res, int_pow(10, scale), &_res) ||
> +	    check_add_overflow(_res, _frac, &_res))
> +		return -ERANGE;
> +
> +	*res = _res;
> +	return 0;
> +}

This function now becomes:

	static int _kstrtoudec64(const char *s, unsigned int scale, u64 *res)
	{
		u64 _res = 0;
		unsigned int rv_int, rv_frac;

		rv_int = _parse_integer(s, 10, &_res);
		if (rv_int & KSTRTOX_OVERFLOW)
			return -ERANGE;
		s += rv_int;

		if (*s == '.')
			s++; /* skip decimal point */

		rv_frac = _parse_integer_limit_init(s, 10, _res, &_res, scale);
		if (rv_frac & KSTRTOX_OVERFLOW)
			return -ERANGE;
		s += rv_frac;

		if (!rv_int && !rv_frac && !isdigit(*s))
			return -EINVAL; /* no digits at all */

		while (isdigit(*s)) /* truncate digits */
			s++;

		if (*s == '\n')
			s++;
		if (*s)
			return -EINVAL;

		if (_res && (scale > (19 + rv_frac) || /* log10(2^64) = 19.26 */
		    check_mul_overflow(_res, int_pow(10, scale - rv_frac), &_res)))
			return -ERANGE;

		*res = _res;
		return 0;
	}

The new piece here is _parse_integer_limit_init(), a local modified
helper that accepts an init value, so _parse_integer_limit() becomes:

	unsigned int _parse_integer_limit(const char *s, unsigned int base,
					  unsigned long long *p, size_t max_chars)
	{
		return _parse_integer_limit_init(s, base, 0, p, max_chars);
	}

i.e. it passes init = 0 to the new helper:

	static unsigned int _parse_integer_limit_init(const char *s, unsigned int base,
						      unsigned long long init,
						      unsigned long long *p,
						      size_t max_chars)
	{
		unsigned long long res;
		unsigned int rv;

		res = init;
		/* ...
		 * the rest is the same implementation as _parse_integer_limit()
		 * ...
		 */
		return rv;
	}

That allows accumulating the final value in the same variable, which makes
things simpler and reduces the number of overflow checks.

The scale can now be a bigger value, e.g. 0.00000000000000000000000000000000423
can be parsed with scale = 35, resulting in 423.

The truncation loop is still there... I think this implementation is better,
but I am not sure what input limit you would consider acceptable for
truncating non-zero digits, now that the scale can be larger than 19.
As long as the output fits into a u64 variable, the parser still works.

I am also adding new test cases for that!
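
For illustration, the behaviour described above can be approximated in
userspace. This is only a sketch under stated assumptions, not the kernel
code: parse_udec64() and push_digit() are hypothetical names, the
GCC/Clang __builtin_*_overflow() builtins stand in for
check_mul_overflow()/check_add_overflow(), and plain digit loops stand in
for _parse_integer() and _parse_integer_limit_init().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>

/* Append one decimal digit to *acc; returns non-zero on 64-bit overflow. */
static int push_digit(uint64_t *acc, char c)
{
	return __builtin_mul_overflow(*acc, 10, acc) ||
	       __builtin_add_overflow(*acc, (uint64_t)(c - '0'), acc);
}

/* Hypothetical userspace stand-in for the proposed _kstrtoudec64(). */
static int parse_udec64(const char *s, unsigned int scale, uint64_t *res)
{
	uint64_t acc = 0;
	unsigned int int_digits = 0, frac_digits = 0, dropped = 0;

	assert(s && res);

	/* Integer part. */
	for (; isdigit((unsigned char)*s); s++, int_digits++)
		if (push_digit(&acc, *s))
			return -ERANGE;

	if (*s == '.')
		s++; /* skip decimal point */

	/* Fractional part: same accumulator, at most 'scale' digits. */
	for (; frac_digits < scale && isdigit((unsigned char)*s);
	     s++, frac_digits++)
		if (push_digit(&acc, *s))
			return -ERANGE;

	/* Digits beyond the scale are truncated. */
	for (; isdigit((unsigned char)*s); s++)
		dropped++;

	if (!int_digits && !frac_digits && !dropped)
		return -EINVAL; /* no digits at all */

	if (*s == '\n')
		s++;
	if (*s)
		return -EINVAL;

	/* Scale up by 10 for every fractional digit that was not given. */
	for (unsigned int i = frac_digits; acc && i < scale; i++)
		if (__builtin_mul_overflow(acc, 10, &acc))
			return -ERANGE;

	*res = acc;
	return 0;
}
```

In this sketch "1.5" with scale = 3 gives 1500, and "1.239" with scale = 2
gives 123 because the trailing 9 is truncated. A scale above 19 only
succeeds when enough leading zeros keep the result within a u64.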

-- 
Kind regards,

Rodrigo Alencar


* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: David Woodhouse @ 2026-05-13  9:24 UTC (permalink / raw)
  To: Marc Zyngier, Paolo Bonzini
  Cc: Jonathan Corbet, Shuah Khan, kvm, linux-doc, linux-kernel,
	Sean Christopherson, Jim Mattson, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
	Raghavendra Rao Ananta, Eric Auger, Kees Cook, Arnd Bergmann,
	Nathan Chancellor, linux-arm-kernel, kvmarm, linux-kselftest
In-Reply-To: <86h5obya2r.wl-maz@kernel.org>


On Wed, 2026-05-13 at 09:42 +0100, Marc Zyngier wrote:
> On Mon, 11 May 2026 17:56:15 +0100,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> > On 5/11/26 18:38, David Woodhouse wrote:
> > > Not *everything* is in CPUID; one recent exception that comes to mind
> > > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the
> > > existing behaviour of older kernels — even when that behaviour doesn't
> > > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would
> > > *advertise* the feature, but not actually *implement* it. Nevertheless,
> > > that remains the default behaviour of future kernels unless userspace
> > > explicitly opts in to fully enable (or disable) the feature.
> > > 
> > > But this documentation update isn't even asking for that compatible-by-
> > > default behaviour, even though that is the right thing to do. It's only
> > > asking that it be *possible* to reinstate the old behaviour, for
> > > userspace that *knows* about the change and explicitly wants to go back
> > > to the old way to remain compatible.
> > 
> > Yep, these are the "quirks"---if it's too early for Arm to commit to
> > that, I guess it's fine.
> 
> Compatible by default means nothing, because userspace needs to
> discover the combined capabilities of the host and KVM. This is not a
> "CPU model" architecture.
> 
> If userspace is not a total joke, it will read all the ID registers,
> and configure what it wants to see, assuming it is a feature that can
> be configured (not everything can, because the architecture itself is
> not fully backward compatible).
> 
> Yes, this is buggy at times, because the combinatorial explosion of
> CPU capabilities and supported features makes it pretty hard to test
> (and really nobody actually does). But overall, it works, and QEMU is
> growing an infrastructure to manage it in a "user friendly" way.

Yes, that is precisely what I'm asking for. I'm prepared to deal with
the fact that KVM/Arm64 is not a stable and mature platform like x86
is, and that userspace has to find all the random changes from one
version to the next, and explicitly pin things down to be compatible.

All I'm asking for is that KVM makes it *possible* to pin things down
to the behaviour of previously released Linux/KVM kernels.

> But really, this isn't what David is asking. He's demanding "bug for
> bug" compatibility. For that, we have two possible cases:

No, I am not asking you to meet that bar. I merely observed that x86
does and that it would be nice. But we are a *long* way from that.

> - this is a behaviour that, while undesirable, is allowed by the
>   architecture: fine, we preserve the behaviour and add another way to
>   expose the one we really want. It is ugly, but we manage.
> 
> - this is a behaviour that is not allowed by the architecture: we fix
>   it for good. We do that on every release. Some minor, some much more
>   visible. And there is no way we will add this sort of "bring the
>   bugs back" type of behaviours. Especially when it is really obvious
>   that no SW can make any reasonable use of the defect. We allow
>   userspace to keep behaving as before, but the guest will not see a
>   non-compliant behaviour.
> 
> That being said, there is a way out of that: convince people in charge
> of the architecture that the non-compliant KVM behaviour is actually
> valuable, and deserves to be tolerated. This has happened before (VHE
> only and NV2 only, just to name two recent changes).
> 
> Other terrible hacks (such as GICv3's GICD_TYPER.num_LPIs which KVM
> doesn't support) were added at the request of cloud vendors that David
> might be familiar with, so it isn't like it is a brand new process.
> 
> And once it is in the architecture, it becomes a behaviour that is
> allowed to be exposed to a guest, for better or worse.

Marc, this is complete nonsense and you should know better.

Once a behaviour is present in a released version of Linux/KVM, we
can't just declare it "wrong" and unilaterally impose a change in
guest-visible behaviour on *running* guests as a side-effect of a
kernel upgrade.

The criterion for *KVM* to remain compatible is "once it has been in a
released version of the kernel". Not "once it is in the architecture".

> These are the rules we have followed since we started KVM/arm, and I
> intend to stick to them.

Then KVM/arm is falling far short of the standards we expect of KVM and
of Linux in general.

Please do better.




^ permalink raw reply

* [PATCH] docs: update rust-analyzer command
From: Onur Özkan @ 2026-05-13  9:19 UTC (permalink / raw)
  To: rust-for-linux, linux-doc, linux-kernel
  Cc: ojeda, boqun, gary, bjorn3_gh, lossin, a.hindborg, aliceryhl,
	tmgross, dakr, corbet, skhan, alexs, si.yanteng, dzm91,
	Onur Özkan

On a fresh checkout, generating rust-project.json alone is not enough
for rust-analyzer to work reliably. The issue only becomes apparent
later when the LSP fails on a proc macro or binding types/functions.

Recommend running prepare together with the rust-analyzer target so the
generated files expected by rust-analyzer are available from the start.

Link: https://rust-for-linux.zulipchat.com/#narrow/channel/597064-rust-analyzer
Signed-off-by: Onur Özkan <work@onurozkan.dev>
---
 Documentation/rust/quick-start.rst                    | 2 +-
 Documentation/translations/zh_CN/rust/quick-start.rst | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/rust/quick-start.rst b/Documentation/rust/quick-start.rst
index a6ec3fa94d33..df5b54b51deb 100644
--- a/Documentation/rust/quick-start.rst
+++ b/Documentation/rust/quick-start.rst
@@ -314,7 +314,7 @@ definition, and other features.
 ``rust-analyzer`` needs a configuration file, ``rust-project.json``, which
 can be generated by the ``rust-analyzer`` Make target::
 
-	make LLVM=1 rust-analyzer
+	make LLVM=1 prepare rust-analyzer
 
 
 Configuration
diff --git a/Documentation/translations/zh_CN/rust/quick-start.rst b/Documentation/translations/zh_CN/rust/quick-start.rst
index 5f0ece6411f5..3f7efd3a63ad 100644
--- a/Documentation/translations/zh_CN/rust/quick-start.rst
+++ b/Documentation/translations/zh_CN/rust/quick-start.rst
@@ -291,7 +291,7 @@ rust-analyzer
 ``rust-analyzer`` 需要一个配置文件, ``rust-project.json``, 它可以由 ``rust-analyzer``
 Make 目标生成::
 
-       make LLVM=1 rust-analyzer
+       make LLVM=1 prepare rust-analyzer
 
 
 配置
-- 
2.51.2


^ permalink raw reply related

* Re: [PATCH v7 10/20] KVM: arm64: Context swap Partitioned PMU guest registers
From: Oliver Upton @ 2026-05-13  9:18 UTC (permalink / raw)
  To: Colton Lewis
  Cc: kvm, Alexandru Elisei, Paolo Bonzini, Jonathan Corbet,
	Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
	Oliver Upton, Mingwei Zhang, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Mark Rutland, Shuah Khan, Ganapatrao Kulkarni,
	James Clark, linux-doc, linux-kernel, linux-arm-kernel, kvmarm,
	linux-perf-users, linux-kselftest
In-Reply-To: <20260504211813.1804997-11-coltonlewis@google.com>

On Mon, May 04, 2026 at 09:18:03PM +0000, Colton Lewis wrote:
> +
> +/**
> + * kvm_pmu_host_counter_mask() - Compute bitmask of host-reserved counters
> + * @pmu: Pointer to arm_pmu struct
> + *
> + * Compute the bitmask that selects the host-reserved counters in the
> + * {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers. These are the counters
> + * in HPMN..N
> + *
> + * Return: Bitmask
> + */
> +u64 kvm_pmu_host_counter_mask(struct arm_pmu *pmu)
> +{
> +	u8 nr_counters = *host_data_ptr(nr_event_counters);
> +
> +	if (kvm_pmu_is_partitioned(pmu))
> +		return GENMASK(nr_counters - 1, pmu->max_guest_counters);
> +
> +	return ARMV8_PMU_CNT_MASK_ALL;
> +}
> +
> +/**
> + * kvm_pmu_guest_counter_mask() - Compute bitmask of guest-reserved counters
> + * @pmu: Pointer to arm_pmu struct
> + *
> + * Compute the bitmask that selects the guest-reserved counters in the
> + * {PMCNTEN,PMINTEN,PMOVS}{SET,CLR} registers. These are the counters
> + * in 0..HPMN and the cycle and instruction counters.
> + *
> + * Return: Bitmask
> + */
> +u64 kvm_pmu_guest_counter_mask(struct arm_pmu *pmu)
> +{
> +	if (kvm_pmu_is_partitioned(pmu))
> +		return ARMV8_PMU_CNT_MASK_C | GENMASK(pmu->max_guest_counters - 1, 0);
> +
> +	return 0;
> +}
> +
> +/**
> + * kvm_pmu_load() - Load untrapped PMU registers
> + * @vcpu: Pointer to struct kvm_vcpu
> + *
> + * Load all untrapped PMU registers from the VCPU into the PCPU. Mask
> + * to only bits belonging to guest-reserved counters and leave
> + * host-reserved counters alone in bitmask registers.
> + */
> +void kvm_pmu_load(struct kvm_vcpu *vcpu)
> +{
> +	struct arm_pmu *pmu;
> +	unsigned long guest_counters;
> +	u64 mask;
> +	u8 i;
> +	u64 val;
> +
> +	/*
> +	 * If we aren't guest-owned then we know the guest isn't using
> +	 * the PMU anyway, so no need to bother with the swap.
> +	 */
> +	if (!kvm_vcpu_pmu_is_partitioned(vcpu))
> +		return;
> +
> +	preempt_disable();
> +
> +	pmu = vcpu->kvm->arch.arm_pmu;
> +	guest_counters = kvm_pmu_guest_counter_mask(pmu);
> +
> +	for_each_set_bit(i, &guest_counters, ARMPMU_MAX_HWEVENTS) {
> +		val = __vcpu_sys_reg(vcpu, PMEVCNTR0_EL0 + i);
> +
> +		if (i == ARMV8_PMU_CYCLE_IDX) {
> +			write_sysreg(val, pmccntr_el0);
> +		} else {
> +			write_sysreg(i, pmselr_el0);
> +			write_sysreg(val, pmxevcntr_el0);

This is wrong, you would need an intervening ISB. It'd be better to
avoid the ISB altogether and just use {read,write}_pmevcntrn().

Thanks,
Oliver
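
For readers following the quoted kernel-doc, the two mask helpers can be
worked through in user space. This is only a sketch under stated
assumptions: the SKETCH_* macros and sketch_* functions are hypothetical
stand-ins for GENMASK(), ARMV8_PMU_CNT_MASK_C and the quoted helpers, the
cycle counter is assumed to sit at bit 31, and only the partitioned case
is modelled. It mirrors the quoted code, which ORs in the cycle counter
bit only.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel's BIT()/GENMASK() helpers. */
#define SKETCH_BIT(n)        (UINT64_C(1) << (n))
#define SKETCH_GENMASK(h, l) \
	(((~UINT64_C(0)) >> (63 - (h))) & ((~UINT64_C(0)) << (l)))

/* Assumption: the cycle counter is bit 31 in the bitmask registers. */
#define SKETCH_CYCLE_IDX 31

/* Event counters max_guest_counters..nr_counters-1 stay with the host. */
static uint64_t sketch_host_mask(unsigned int nr_counters,
				 unsigned int max_guest_counters)
{
	assert(max_guest_counters < nr_counters && nr_counters <= 31);
	return SKETCH_GENMASK(nr_counters - 1, max_guest_counters);
}

/* Event counters 0..max_guest_counters-1 plus the cycle counter. */
static uint64_t sketch_guest_mask(unsigned int max_guest_counters)
{
	assert(max_guest_counters >= 1 && max_guest_counters <= 31);
	return SKETCH_BIT(SKETCH_CYCLE_IDX) |
	       SKETCH_GENMASK(max_guest_counters - 1, 0);
}
```

With six event counters and four of them given to the guest, the host
mask covers bits 4 and 5 and the guest mask covers bits 0 to 3 plus bit
31, so the two masks do not overlap.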

^ permalink raw reply

* [PATCH v11 5/5] platform/chrome: cros_ec_chardev: Consume cros_ec_device via revocable
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney
In-Reply-To: <20260513091043.6766-1-tzungbi@kernel.org>

The cros_ec_chardev driver provides a character device interface to the
ChromeOS EC.  A file handle to this device can remain open in userspace
even if the underlying EC device is removed.

This creates a classic use-after-free vulnerability.  Any file operation
(ioctl, release, etc.) on the open handle after the EC device has gone
would access a stale pointer, leading to a system crash.

To prevent this, use the revocable mechanism and convert cros_ec_chardev
into a resource consumer of cros_ec_device.
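
The consumer pattern described above can be modelled in user space as
follows. This is a minimal single-threaded sketch and not the kernel's
revocable API: every sketch_* name is hypothetical, and the real
mechanism must also synchronize revocation against concurrent accessors,
which this model leaves out. It only shows the idea that the handle
outlives the resource and that access fails cleanly after revocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handle: refcounted separately from the resource. */
struct sketch_revocable {
	void *res;          /* NULL once the provider has revoked it */
	unsigned int refs;  /* lifetime of the handle itself */
};

static struct sketch_revocable *sketch_alloc(void *res)
{
	struct sketch_revocable *rev = malloc(sizeof(*rev));

	if (!rev)
		return NULL;
	rev->res = res;
	rev->refs = 1;      /* the provider's reference */
	return rev;
}

static void sketch_get(struct sketch_revocable *rev)
{
	rev->refs++;
}

static void sketch_put(struct sketch_revocable *rev)
{
	if (--rev->refs == 0)
		free(rev);
}

/* Provider side: the resource is going away. */
static void sketch_revoke(struct sketch_revocable *rev)
{
	rev->res = NULL;
}

/* Consumer side: returns NULL if the resource was revoked. */
static void *sketch_try_access(struct sketch_revocable *rev)
{
	return rev->res;
}

/* Stand-in for the EC device and one file operation using it. */
struct sketch_ec {
	int version;
};

static int sketch_get_version(struct sketch_revocable *rev)
{
	struct sketch_ec *ec = sketch_try_access(rev);

	if (!ec)
		return -ENODEV;
	return ec->version;
}
```

In this model an open file holds its own reference on the handle, so a
file operation issued after the device is gone sees NULL and returns
-ENODEV rather than dereferencing a stale pointer.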

---
v11:
- No changes.

v10: https://lore.kernel.org/all/20260508105448.31799-10-tzungbi@kernel.org
- No changes.

v9: https://lore.kernel.org/all/20260427135841.96266-10-tzungbi@kernel.org
- New to the series.
- Change revocable API usages accordingly.

v4 - v8:
- Doesn't exist.

v3: https://lore.kernel.org/all/20250912081718.3827390-6-tzungbi@kernel.org
- Use specific labels for different cleanup in cros_ec_chardev_open().

v2: https://lore.kernel.org/all/20250820081645.847919-6-tzungbi@kernel.org
- Rename "ref_proxy" -> "revocable".
- Fix a sparse warning by removing the redundant __rcu annotation.

v1: https://lore.kernel.org/all/20250814091020.1302888-4-tzungbi@kernel.org

Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
 drivers/platform/chrome/cros_ec_chardev.c | 80 +++++++++++++++++------
 1 file changed, 61 insertions(+), 19 deletions(-)

diff --git a/drivers/platform/chrome/cros_ec_chardev.c b/drivers/platform/chrome/cros_ec_chardev.c
index 002be3352100..c597dc92d519 100644
--- a/drivers/platform/chrome/cros_ec_chardev.c
+++ b/drivers/platform/chrome/cros_ec_chardev.c
@@ -22,6 +22,7 @@
 #include <linux/platform_data/cros_ec_proto.h>
 #include <linux/platform_device.h>
 #include <linux/poll.h>
+#include <linux/revocable.h>
 #include <linux/slab.h>
 #include <linux/types.h>
 #include <linux/uaccess.h>
@@ -32,7 +33,7 @@
 #define CROS_MAX_EVENT_LEN	PAGE_SIZE
 
 struct chardev_priv {
-	struct cros_ec_device *ec_dev;
+	struct revocable *rev;
 	struct notifier_block notifier;
 	wait_queue_head_t wait_event;
 	unsigned long event_mask;
@@ -55,6 +56,7 @@ static int ec_get_version(struct chardev_priv *priv, char *str, int maxlen)
 	};
 	struct ec_response_get_version *resp;
 	struct cros_ec_command *msg;
+	struct cros_ec_device *ec_dev;
 	int ret;
 
 	msg = kzalloc(sizeof(*msg) + sizeof(*resp), GFP_KERNEL);
@@ -64,12 +66,19 @@ static int ec_get_version(struct chardev_priv *priv, char *str, int maxlen)
 	msg->command = EC_CMD_GET_VERSION + priv->cmd_offset;
 	msg->insize = sizeof(*resp);
 
-	ret = cros_ec_cmd_xfer_status(priv->ec_dev, msg);
-	if (ret < 0) {
-		snprintf(str, maxlen,
-			 "Unknown EC version, returned error: %d\n",
-			 msg->result);
-		goto exit;
+	revocable_try_access_with_scoped(priv->rev, ec_dev) {
+		if (!ec_dev) {
+			ret = -ENODEV;
+			goto exit;
+		}
+
+		ret = cros_ec_cmd_xfer_status(ec_dev, msg);
+		if (ret < 0) {
+			snprintf(str, maxlen,
+				 "Unknown EC version, returned error: %d\n",
+				 msg->result);
+			goto exit;
+		}
 	}
 
 	resp = (struct ec_response_get_version *)msg->data;
@@ -92,10 +101,15 @@ static int cros_ec_chardev_mkbp_event(struct notifier_block *nb,
 {
 	struct chardev_priv *priv = container_of(nb, struct chardev_priv,
 						 notifier);
-	struct cros_ec_device *ec_dev = priv->ec_dev;
+	struct cros_ec_device *ec_dev;
 	struct ec_event *event;
-	unsigned long event_bit = 1 << ec_dev->event_data.event_type;
-	int total_size = sizeof(*event) + ec_dev->event_size;
+	unsigned long event_bit;
+	int total_size;
+
+	revocable_try_access_or_return_err(priv->rev, ec_dev, NOTIFY_DONE);
+
+	event_bit = 1 << ec_dev->event_data.event_type;
+	total_size = sizeof(*event) + ec_dev->event_size;
 
 	if (!(event_bit & priv->event_mask) ||
 	    (priv->event_len + total_size) > CROS_MAX_EVENT_LEN)
@@ -166,7 +180,8 @@ static int cros_ec_chardev_open(struct inode *inode, struct file *filp)
 	if (!priv)
 		return -ENOMEM;
 
-	priv->ec_dev = ec_dev;
+	priv->rev = ec_dev->its_rev;
+	revocable_get(priv->rev);
 	priv->cmd_offset = ec->cmd_offset;
 	filp->private_data = priv;
 	INIT_LIST_HEAD(&priv->events);
@@ -178,6 +193,7 @@ static int cros_ec_chardev_open(struct inode *inode, struct file *filp)
 					       &priv->notifier);
 	if (ret) {
 		dev_err(ec_dev->dev, "failed to register event notifier\n");
+		revocable_put(priv->rev);
 		kfree(priv);
 	}
 
@@ -251,11 +267,13 @@ static ssize_t cros_ec_chardev_read(struct file *filp, char __user *buffer,
 static int cros_ec_chardev_release(struct inode *inode, struct file *filp)
 {
 	struct chardev_priv *priv = filp->private_data;
-	struct cros_ec_device *ec_dev = priv->ec_dev;
+	struct cros_ec_device *ec_dev;
 	struct ec_event *event, *e;
 
-	blocking_notifier_chain_unregister(&ec_dev->event_notifier,
-					   &priv->notifier);
+	revocable_try_access_or_skip_scoped(priv->rev, ec_dev)
+		blocking_notifier_chain_unregister(&ec_dev->event_notifier,
+						   &priv->notifier);
+	revocable_put(priv->rev);
 
 	list_for_each_entry_safe(event, e, &priv->events, node) {
 		list_del(&event->node);
@@ -273,6 +291,7 @@ static long cros_ec_chardev_ioctl_xcmd(struct chardev_priv *priv, void __user *a
 {
 	struct cros_ec_command *s_cmd;
 	struct cros_ec_command u_cmd;
+	struct cros_ec_device *ec_dev;
 	long ret;
 
 	if (copy_from_user(&u_cmd, arg, sizeof(u_cmd)))
@@ -299,10 +318,17 @@ static long cros_ec_chardev_ioctl_xcmd(struct chardev_priv *priv, void __user *a
 	}
 
 	s_cmd->command += priv->cmd_offset;
-	ret = cros_ec_cmd_xfer(priv->ec_dev, s_cmd);
-	/* Only copy data to userland if data was received. */
-	if (ret < 0)
-		goto exit;
+	revocable_try_access_with_scoped(priv->rev, ec_dev) {
+		if (!ec_dev) {
+			ret = -ENODEV;
+			goto exit;
+		}
+
+		ret = cros_ec_cmd_xfer(ec_dev, s_cmd);
+		/* Only copy data to userland if data was received. */
+		if (ret < 0)
+			goto exit;
+	}
 
 	if (copy_to_user(arg, s_cmd, sizeof(*s_cmd) + s_cmd->insize))
 		ret = -EFAULT;
@@ -313,10 +339,12 @@ static long cros_ec_chardev_ioctl_xcmd(struct chardev_priv *priv, void __user *a
 
 static long cros_ec_chardev_ioctl_readmem(struct chardev_priv *priv, void __user *arg)
 {
-	struct cros_ec_device *ec_dev = priv->ec_dev;
+	struct cros_ec_device *ec_dev;
 	struct cros_ec_readmem s_mem = { };
 	long num;
 
+	revocable_try_access_or_return(priv->rev, ec_dev);
+
 	/* Not every platform supports direct reads */
 	if (!ec_dev->cmd_readmem)
 		return -ENOTTY;
@@ -370,11 +398,25 @@ static const struct file_operations chardev_fops = {
 #endif
 };
 
+static void cros_ec_chardev_free(void *data)
+{
+	struct revocable *rev = data;
+
+	revocable_put(rev);
+}
+
 static int cros_ec_chardev_probe(struct platform_device *pdev)
 {
 	struct cros_ec_dev *ec = dev_get_drvdata(pdev->dev.parent);
 	struct cros_ec_platform *ec_platform = dev_get_platdata(ec->dev);
+	struct revocable *rev = ec->ec_dev->its_rev;
 	struct miscdevice *misc;
+	int ret;
+
+	revocable_get(rev);
+	ret = devm_add_action_or_reset(&pdev->dev, cros_ec_chardev_free, rev);
+	if (ret)
+		return ret;
 
 	/* Create a char device: we want to create it anew */
 	misc = devm_kzalloc(&pdev->dev, sizeof(*misc), GFP_KERNEL);
-- 
2.51.0


^ permalink raw reply related

* [PATCH v11 4/5] platform/chrome: Protect cros_ec_device lifecycle with revocable
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney
In-Reply-To: <20260513091043.6766-1-tzungbi@kernel.org>

The cros_ec_device can be unregistered when the underlying device is
removed.  Other kernel drivers that interact with the EC may hold a
pointer to the cros_ec_device, creating a risk of a use-after-free
error if the EC device is removed while still being referenced.

To prevent this, use the revocable mechanism and convert the underlying
device drivers into resource providers of cros_ec_device.

---
v11:
- No changes.

v10: https://lore.kernel.org/all/20260508105448.31799-9-tzungbi@kernel.org
- No changes.

v9: https://lore.kernel.org/all/20260427135841.96266-9-tzungbi@kernel.org
- New to the series.
- Change revocable API usages accordingly.
- Rename "revocable_provider" -> "its_rev".

v5 - v8:
- Doesn't exist.

v4: https://lore.kernel.org/all/20250923075302.591026-5-tzungbi@kernel.org
- No changes.

v3: https://lore.kernel.org/all/20250912081718.3827390-5-tzungbi@kernel.org
- Initialize the revocable provider in cros_ec_device_alloc() instead of
  spreading in protocol device drivers.

v2: https://lore.kernel.org/all/20250820081645.847919-5-tzungbi@kernel.org
- Rename "ref_proxy" -> "revocable".

v1: https://lore.kernel.org/all/20250814091020.1302888-3-tzungbi@kernel.org

Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
 drivers/platform/chrome/cros_ec.c           | 11 +++++++++++
 include/linux/platform_data/cros_ec_proto.h |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
index 1da79e3d215b..2702a1bbfeb5 100644
--- a/drivers/platform/chrome/cros_ec.c
+++ b/drivers/platform/chrome/cros_ec.c
@@ -16,6 +16,7 @@
 #include <linux/platform_device.h>
 #include <linux/platform_data/cros_ec_commands.h>
 #include <linux/platform_data/cros_ec_proto.h>
+#include <linux/revocable.h>
 #include <linux/slab.h>
 #include <linux/suspend.h>
 
@@ -37,6 +38,7 @@ static void cros_ec_device_free(void *data)
 
 	mutex_destroy(&ec_dev->lock);
 	lockdep_unregister_key(&ec_dev->lockdep_key);
+	revocable_revoke(ec_dev->its_rev);
 }
 
 struct cros_ec_device *cros_ec_device_alloc(struct device *dev)
@@ -47,6 +49,15 @@ struct cros_ec_device *cros_ec_device_alloc(struct device *dev)
 	if (!ec_dev)
 		return NULL;
 
+	ec_dev->its_rev = revocable_alloc(ec_dev);
+	if (!ec_dev->its_rev)
+		return NULL;
+	/*
+	 * Drop the extra reference for the caller as the caller is the
+	 * resource provider.
+	 */
+	revocable_put(ec_dev->its_rev);
+
 	ec_dev->din_size = sizeof(struct ec_host_response) +
 			   sizeof(struct ec_response_get_protocol_info) +
 			   EC_MAX_RESPONSE_OVERHEAD;
diff --git a/include/linux/platform_data/cros_ec_proto.h b/include/linux/platform_data/cros_ec_proto.h
index de14923720a5..e8c3bd03403c 100644
--- a/include/linux/platform_data/cros_ec_proto.h
+++ b/include/linux/platform_data/cros_ec_proto.h
@@ -12,6 +12,7 @@
 #include <linux/lockdep_types.h>
 #include <linux/mutex.h>
 #include <linux/notifier.h>
+#include <linux/revocable.h>
 
 #include <linux/platform_data/cros_ec_commands.h>
 
@@ -165,6 +166,7 @@ struct cros_ec_command {
  * @pd: The platform_device used by the mfd driver to interface with the
  *      PD behind an EC.
  * @panic_notifier: EC panic notifier.
+ * @its_rev: The revocable_provider to this device.
  */
 struct cros_ec_device {
 	/* These are used by other drivers that want to talk to the EC */
@@ -211,6 +213,7 @@ struct cros_ec_device {
 	struct platform_device *pd;
 
 	struct blocking_notifier_head panic_notifier;
+	struct revocable *its_rev;
 };
 
 /**
-- 
2.51.0


^ permalink raw reply related

* [PATCH v11 3/5] gpio: Leverage revocable for accessing struct gpio_chip
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney,
	Bartosz Golaszewski
In-Reply-To: <20260513091043.6766-1-tzungbi@kernel.org>

The underlying chip can be removed asynchronously.  `gdev->srcu` is
currently used to synchronize accesses to `gdev->chip`.

Revocable encapsulates those details.  Use revocable to access the
struct gpio_chip and remove `gdev->srcu`.

Tested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
v11:
- Add Tested-by tag.
- Squash
  - https://lore.kernel.org/all/20260508105448.31799-5-tzungbi@kernel.org
  - https://lore.kernel.org/all/20260508105448.31799-6-tzungbi@kernel.org
  - https://lore.kernel.org/all/20260508105448.31799-7-tzungbi@kernel.org
  - https://lore.kernel.org/all/20260508105448.31799-8-tzungbi@kernel.org

v10: https://lore.kernel.org/all/20260508105448.31799-4-tzungbi@kernel.org
- Change revocable API usages accordingly.

v9: https://lore.kernel.org/all/20260427135841.96266-4-tzungbi@kernel.org
- New to the series.
- Use a statically allocated resource provider.
- Rename "chip_rp" -> "chip_rev".

v4 - v8:
- Doesn't exist.

v3: https://lore.kernel.org/all/20260213092958.864411-8-tzungbi@kernel.org
- Change revocable API usages accordingly.

v2: https://lore.kernel.org/all/20260203061059.975605-8-tzungbi@kernel.org
- Change usages accordingly after applying
  https://lore.kernel.org/all/20260129143733.45618-2-tzungbi@kernel.org.
  - Add __rcu for `chip_rp`.
  - Pass pointer of pointer to revocable_provider_revoke().
- Rebase accordingly after applying
  https://lore.kernel.org/all/20260203060210.972243-1-tzungbi@kernel.org.

v1: https://lore.kernel.org/all/20260116081036.352286-13-tzungbi@kernel.org

---
 drivers/gpio/gpiolib-cdev.c  |  77 ++++------
 drivers/gpio/gpiolib-sysfs.c |  31 ++---
 drivers/gpio/gpiolib.c       | 263 ++++++++++++++---------------------
 drivers/gpio/gpiolib.h       |  28 +---
 4 files changed, 150 insertions(+), 249 deletions(-)

diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c
index f36b7c06996d..4837497c5e6e 100644
--- a/drivers/gpio/gpiolib-cdev.c
+++ b/drivers/gpio/gpiolib-cdev.c
@@ -22,6 +22,7 @@
 #include <linux/overflow.h>
 #include <linux/pinctrl/consumer.h>
 #include <linux/poll.h>
+#include <linux/revocable.h>
 #include <linux/seq_file.h>
 #include <linux/spinlock.h>
 #include <linux/string.h>
@@ -210,11 +211,9 @@ static long linehandle_ioctl(struct file *file, unsigned int cmd,
 	DECLARE_BITMAP(vals, GPIOHANDLES_MAX);
 	unsigned int i;
 	int ret;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&lh->gdev->srcu);
-
-	if (!rcu_access_pointer(lh->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&lh->gdev->chip_rev, gc);
 
 	switch (cmd) {
 	case GPIOHANDLE_GET_LINE_VALUES_IOCTL:
@@ -1432,11 +1431,9 @@ static long linereq_ioctl(struct file *file, unsigned int cmd,
 {
 	struct linereq *lr = file->private_data;
 	void __user *ip = (void __user *)arg;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&lr->gdev->srcu);
-
-	if (!rcu_access_pointer(lr->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&lr->gdev->chip_rev, gc);
 
 	switch (cmd) {
 	case GPIO_V2_LINE_GET_VALUES_IOCTL:
@@ -1463,10 +1460,10 @@ static __poll_t linereq_poll(struct file *file,
 {
 	struct linereq *lr = file->private_data;
 	__poll_t events = 0;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&lr->gdev->srcu);
-
-	if (!rcu_access_pointer(lr->gdev->chip))
+	revocable_try_access_with(&lr->gdev->chip_rev, gc);
+	if (!gc)
 		return EPOLLHUP | EPOLLERR;
 
 	poll_wait(file, &lr->wait, wait);
@@ -1485,11 +1482,9 @@ static ssize_t linereq_read(struct file *file, char __user *buf,
 	struct gpio_v2_line_event le;
 	ssize_t bytes_read = 0;
 	int ret;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&lr->gdev->srcu);
-
-	if (!rcu_access_pointer(lr->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&lr->gdev->chip_rev, gc);
 
 	if (count < sizeof(le))
 		return -EINVAL;
@@ -1759,10 +1754,10 @@ static __poll_t lineevent_poll(struct file *file,
 {
 	struct lineevent_state *le = file->private_data;
 	__poll_t events = 0;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&le->gdev->srcu);
-
-	if (!rcu_access_pointer(le->gdev->chip))
+	revocable_try_access_with(&le->gdev->chip_rev, gc);
+	if (!gc)
 		return EPOLLHUP | EPOLLERR;
 
 	poll_wait(file, &le->wait, wait);
@@ -1797,11 +1792,9 @@ static ssize_t lineevent_read(struct file *file, char __user *buf,
 	ssize_t bytes_read = 0;
 	ssize_t ge_size;
 	int ret;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&le->gdev->srcu);
-
-	if (!rcu_access_pointer(le->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&le->gdev->chip_rev, gc);
 
 	/*
 	 * When compatible system call is being used the struct gpioevent_data,
@@ -1879,11 +1872,9 @@ static long lineevent_ioctl(struct file *file, unsigned int cmd,
 	struct lineevent_state *le = file->private_data;
 	void __user *ip = (void __user *)arg;
 	struct gpiohandle_data ghd;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&le->gdev->srcu);
-
-	if (!rcu_access_pointer(le->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&le->gdev->chip_rev, gc);
 
 	/*
 	 * We can get the value for an event line but not set it,
@@ -2165,10 +2156,9 @@ static void gpio_desc_to_lineinfo(struct gpio_desc *desc,
 	u32 debounce_period_us;
 	unsigned long dflags;
 	const char *label;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return;
+	revocable_try_access_or_return_void(&desc->gdev->chip_rev, gc);
 
 	memset(info, 0, sizeof(*info));
 	info->offset = gpiod_hwgpio(desc);
@@ -2201,10 +2191,10 @@ static void gpio_desc_to_lineinfo(struct gpio_desc *desc,
 	    test_bit(GPIOD_FLAG_IS_HOGGED, &dflags) ||
 	    test_bit(GPIOD_FLAG_EXPORT, &dflags) ||
 	    test_bit(GPIOD_FLAG_SYSFS, &dflags) ||
-	    !gpiochip_line_is_valid(guard.gc, info->offset)) {
+	    !gpiochip_line_is_valid(gc, info->offset)) {
 		info->flags |= GPIO_V2_LINE_FLAG_USED;
 	} else if (!atomic) {
-		if (!pinctrl_gpio_can_use_line(guard.gc, info->offset))
+		if (!pinctrl_gpio_can_use_line(gc, info->offset))
 			info->flags |= GPIO_V2_LINE_FLAG_USED;
 	}
 
@@ -2385,12 +2375,10 @@ static long gpio_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	struct gpio_chardev_data *cdev = file->private_data;
 	struct gpio_device *gdev = cdev->gdev;
 	void __user *ip = (void __user *)arg;
-
-	guard(srcu)(&gdev->srcu);
+	struct gpio_chip *gc;
 
 	/* We fail any subsequent ioctl():s when the chip is gone */
-	if (!rcu_access_pointer(gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&gdev->chip_rev, gc);
 
 	/* Fill in the struct and pass to userspace */
 	switch (cmd) {
@@ -2448,12 +2436,9 @@ static void lineinfo_changed_func(struct work_struct *work)
 		 * Pin functions are in general much more static and while it's
 		 * not 100% bullet-proof, it's good enough for most cases.
 		 */
-		scoped_guard(srcu, &ctx->gdev->srcu) {
-			gc = srcu_dereference(ctx->gdev->chip, &ctx->gdev->srcu);
-			if (gc &&
-			    !pinctrl_gpio_can_use_line(gc, ctx->chg.info.offset))
+		revocable_try_access_or_skip_scoped(&ctx->gdev->chip_rev, gc)
+			if (!pinctrl_gpio_can_use_line(gc, ctx->chg.info.offset))
 				ctx->chg.info.flags |= GPIO_V2_LINE_FLAG_USED;
-		}
 	}
 
 	ret = kfifo_in_spinlocked(&ctx->cdev->events, &ctx->chg, 1,
@@ -2534,10 +2519,10 @@ static __poll_t lineinfo_watch_poll(struct file *file,
 {
 	struct gpio_chardev_data *cdev = file->private_data;
 	__poll_t events = 0;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&cdev->gdev->srcu);
-
-	if (!rcu_access_pointer(cdev->gdev->chip))
+	revocable_try_access_with(&cdev->gdev->chip_rev, gc);
+	if (!gc)
 		return EPOLLHUP | EPOLLERR;
 
 	poll_wait(file, &cdev->wait, pollt);
@@ -2557,11 +2542,9 @@ static ssize_t lineinfo_watch_read(struct file *file, char __user *buf,
 	ssize_t bytes_read = 0;
 	int ret;
 	size_t event_size;
+	struct gpio_chip *gc;
 
-	guard(srcu)(&cdev->gdev->srcu);
-
-	if (!rcu_access_pointer(cdev->gdev->chip))
-		return -ENODEV;
+	revocable_try_access_or_return(&cdev->gdev->chip_rev, gc);
 
 #ifndef CONFIG_GPIO_CDEV_V1
 	event_size = sizeof(struct gpio_v2_line_info_changed);
diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index fc06b0c2881b..c40320433ff7 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -10,6 +10,7 @@
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/printk.h>
+#include <linux/revocable.h>
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/srcu.h>
@@ -215,10 +216,9 @@ static int gpio_sysfs_request_irq(struct gpiod_data *data, unsigned char flags)
 	struct gpio_desc *desc = data->desc;
 	unsigned long irq_flags;
 	int ret;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	data->irq = gpiod_to_irq(desc);
 	if (data->irq < 0)
@@ -244,7 +244,7 @@ static int gpio_sysfs_request_irq(struct gpiod_data *data, unsigned char flags)
 	 * Remove this redundant call (along with the corresponding unlock)
 	 * when those drivers have been fixed.
 	 */
-	ret = gpiochip_lock_as_irq(guard.gc, gpiod_hwgpio(desc));
+	ret = gpiochip_lock_as_irq(gc, gpiod_hwgpio(desc));
 	if (ret < 0)
 		goto err_clr_bits;
 
@@ -258,7 +258,7 @@ static int gpio_sysfs_request_irq(struct gpiod_data *data, unsigned char flags)
 	return 0;
 
 err_unlock:
-	gpiochip_unlock_as_irq(guard.gc, gpiod_hwgpio(desc));
+	gpiochip_unlock_as_irq(gc, gpiod_hwgpio(desc));
 err_clr_bits:
 	clear_bit(GPIOD_FLAG_EDGE_RISING, &desc->flags);
 	clear_bit(GPIOD_FLAG_EDGE_FALLING, &desc->flags);
@@ -273,14 +273,13 @@ static int gpio_sysfs_request_irq(struct gpiod_data *data, unsigned char flags)
 static void gpio_sysfs_free_irq(struct gpiod_data *data)
 {
 	struct gpio_desc *desc = data->desc;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return;
+	revocable_try_access_or_return_void(&desc->gdev->chip_rev, gc);
 
 	data->irq_flags = 0;
 	free_irq(data->irq, data);
-	gpiochip_unlock_as_irq(guard.gc, gpiod_hwgpio(desc));
+	gpiochip_unlock_as_irq(gc, gpiod_hwgpio(desc));
 	clear_bit(GPIOD_FLAG_EDGE_RISING, &desc->flags);
 	clear_bit(GPIOD_FLAG_EDGE_FALLING, &desc->flags);
 }
@@ -473,13 +472,12 @@ static DEVICE_ATTR_RO(ngpio);
 static int export_gpio_desc(struct gpio_desc *desc)
 {
 	int offset, ret;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	offset = gpiod_hwgpio(desc);
-	if (!gpiochip_line_is_valid(guard.gc, offset)) {
+	if (!gpiochip_line_is_valid(gc, offset)) {
 		pr_debug_ratelimited("%s: GPIO %d masked\n", __func__,
 				     gpiod_hwgpio(desc));
 		return -EINVAL;
@@ -732,6 +730,7 @@ int gpiod_export(struct gpio_desc *desc, bool direction_may_change)
 	struct gpio_device *gdev;
 	struct attribute **attrs;
 	int status;
+	struct gpio_chip *gc;
 
 	/* can't export until sysfs is available ... */
 	if (!class_is_registered(&gpio_class)) {
@@ -744,9 +743,7 @@ int gpiod_export(struct gpio_desc *desc, bool direction_may_change)
 		return -EINVAL;
 	}
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	if (test_and_set_bit(GPIOD_FLAG_EXPORT, &desc->flags))
 		return -EPERM;
@@ -769,7 +766,7 @@ int gpiod_export(struct gpio_desc *desc, bool direction_may_change)
 
 	desc_data->desc = desc;
 	mutex_init(&desc_data->mutex);
-	if (guard.gc->direction_input && guard.gc->direction_output)
+	if (gc->direction_input && gc->direction_output)
 		desc_data->direction_can_change = direction_may_change;
 	else
 		desc_data->direction_can_change = false;
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 1e6dce430dca..5ce12f3b753f 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -23,6 +23,7 @@
 #include <linux/nospec.h>
 #include <linux/of.h>
 #include <linux/pinctrl/consumer.h>
+#include <linux/revocable.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/srcu.h>
@@ -334,7 +335,10 @@ EXPORT_SYMBOL(gpio_device_get_label);
  */
 struct gpio_chip *gpio_device_get_chip(struct gpio_device *gdev)
 {
-	return rcu_dereference_check(gdev->chip, 1);
+	struct gpio_chip *gc;
+
+	revocable_try_access_with(&gdev->chip_rev, gc);
+	return gc;
 }
 EXPORT_SYMBOL_GPL(gpio_device_get_chip);
 
@@ -424,8 +428,6 @@ static int gpiochip_get_direction(struct gpio_chip *gc, unsigned int offset)
 {
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (WARN_ON(!gc->get_direction))
 		return -EOPNOTSUPP;
 
@@ -453,14 +455,13 @@ int gpiod_get_direction(struct gpio_desc *desc)
 	unsigned long flags;
 	unsigned int offset;
 	int ret;
+	struct gpio_chip *gc;
 
 	ret = validate_desc(desc, __func__);
 	if (ret <= 0)
 		return -EINVAL;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	offset = gpiod_hwgpio(desc);
 	flags = READ_ONCE(desc->flags);
@@ -473,7 +474,7 @@ int gpiod_get_direction(struct gpio_desc *desc)
 	    test_bit(GPIOD_FLAG_IS_OUT, &flags))
 		return 0;
 
-	ret = gpiochip_get_direction(guard.gc, offset);
+	ret = gpiochip_get_direction(gc, offset);
 	if (ret < 0)
 		return ret;
 
@@ -561,9 +562,7 @@ static struct gpio_desc *gpio_name_to_desc(const char * const name)
 
 	list_for_each_entry_srcu(gdev, &gpio_devices, list,
 				 srcu_read_lock_held(&gpio_devices_srcu)) {
-		guard(srcu)(&gdev->srcu);
-
-		gc = srcu_dereference(gdev->chip, &gdev->srcu);
+		revocable_try_access_with(&gdev->chip_rev, gc);
 		if (!gc)
 			continue;
 
@@ -874,10 +873,10 @@ static void gpiodev_release(struct device *dev)
 	synchronize_srcu(&gdev->desc_srcu);
 	cleanup_srcu_struct(&gdev->desc_srcu);
 
+	revocable_put(&gdev->chip_rev);
 	ida_free(&gpio_ida, gdev->id);
 	kfree_const(gdev->label);
 	kfree(gdev->descs);
-	cleanup_srcu_struct(&gdev->srcu);
 	kfree(gdev);
 }
 
@@ -1049,9 +1048,7 @@ static void gpiochip_setup_devs(void)
 
 	list_for_each_entry_srcu(gdev, &gpio_devices, list,
 				 srcu_read_lock_held(&gpio_devices_srcu)) {
-		guard(srcu)(&gdev->srcu);
-
-		gc = srcu_dereference(gdev->chip, &gdev->srcu);
+		revocable_try_access_with(&gdev->chip_rev, gc);
 		if (!gc) {
 			dev_err(&gdev->dev, "Underlying GPIO chip is gone\n");
 			continue;
@@ -1154,14 +1151,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 		goto err_free_gdev;
 	gdev->id = ret;
 
-	ret = init_srcu_struct(&gdev->srcu);
-	if (ret)
-		goto err_free_ida;
-	rcu_assign_pointer(gdev->chip, gc);
-
 	ret = init_srcu_struct(&gdev->desc_srcu);
 	if (ret)
-		goto err_cleanup_gdev_srcu;
+		goto err_free_ida;
 
 	ret = dev_set_name(&gdev->dev, GPIOCHIP_NAME "%d", gdev->id);
 	if (ret)
@@ -1210,6 +1202,10 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 	else
 		gdev->owner = THIS_MODULE;
 
+	ret = revocable_init(&gdev->chip_rev, gc);
+	if (ret)
+		goto err_put_device;
+
 	scoped_guard(mutex, &gpio_devices_lock) {
 		/*
 		 * TODO: this allocates a Linux GPIO number base in the global
@@ -1224,7 +1220,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 			if (base < 0) {
 				ret = base;
 				base = 0;
-				goto err_put_device;
+				goto err_revoke_chip_rev;
 			}
 
 			/*
@@ -1244,7 +1240,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 		ret = gpiodev_add_to_list_unlocked(gdev);
 		if (ret) {
 			gpiochip_err(gc, "GPIO integer space overlap, cannot add chip\n");
-			goto err_put_device;
+			goto err_revoke_chip_rev;
 		}
 	}
 
@@ -1343,14 +1339,14 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 	scoped_guard(mutex, &gpio_devices_lock)
 		list_del_rcu(&gdev->list);
 	synchronize_srcu(&gpio_devices_srcu);
+err_revoke_chip_rev:
+	revocable_revoke(&gdev->chip_rev);
 err_put_device:
 	gpio_device_put(gdev);
 	goto err_print_message;
 
 err_cleanup_desc_srcu:
 	cleanup_srcu_struct(&gdev->desc_srcu);
-err_cleanup_gdev_srcu:
-	cleanup_srcu_struct(&gdev->srcu);
 err_free_ida:
 	ida_free(&gpio_ida, gdev->id);
 err_free_gdev:
@@ -1387,8 +1383,7 @@ void gpiochip_remove(struct gpio_chip *gc)
 	synchronize_srcu(&gpio_devices_srcu);
 
 	/* Numb the device, cancelling all outstanding operations */
-	rcu_assign_pointer(gdev->chip, NULL);
-	synchronize_srcu(&gdev->srcu);
+	revocable_revoke(&gdev->chip_rev);
 	gpio_device_teardown_shared(gdev);
 	gpiochip_irqchip_remove(gc);
 	acpi_gpiochip_remove(gc);
@@ -1449,11 +1444,11 @@ struct gpio_device *gpio_device_find(const void *data,
 		if (!device_is_registered(&gdev->dev))
 			continue;
 
-		guard(srcu)(&gdev->srcu);
-
-		gc = srcu_dereference(gdev->chip, &gdev->srcu);
+		revocable_try_access_with(&gdev->chip_rev, gc);
+		if (!gc)
+			continue;
 
-		if (gc && match(gc, data))
+		if (match(gc, data))
 			return gpio_device_get(gdev);
 	}
 
@@ -2554,16 +2549,15 @@ int gpiod_request_commit(struct gpio_desc *desc, const char *label)
 {
 	unsigned int offset;
 	int ret;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	if (test_and_set_bit(GPIOD_FLAG_REQUESTED, &desc->flags))
 		return -EBUSY;
 
 	offset = gpiod_hwgpio(desc);
-	if (!gpiochip_line_is_valid(guard.gc, offset)) {
+	if (!gpiochip_line_is_valid(gc, offset)) {
 		ret = -EINVAL;
 		goto out_clear_bit;
 	}
@@ -2572,15 +2566,15 @@ int gpiod_request_commit(struct gpio_desc *desc, const char *label)
 	 * before IRQs are enabled, for non-sleeping (SOC) GPIOs.
 	 */
 
-	if (guard.gc->request) {
-		ret = guard.gc->request(guard.gc, offset);
+	if (gc->request) {
+		ret = gc->request(gc, offset);
 		if (ret > 0)
 			ret = -EBADE;
 		if (ret)
 			goto out_clear_bit;
 	}
 
-	if (guard.gc->get_direction)
+	if (gc->get_direction)
 		gpiod_get_direction(desc);
 
 	ret = desc_set_label(desc, label ? : "?");
@@ -2617,16 +2611,17 @@ int gpiod_request(struct gpio_desc *desc, const char *label)
 void gpiod_free_commit(struct gpio_desc *desc)
 {
 	unsigned long flags;
+	struct gpio_chip *gc;
 
 	might_sleep();
 
-	CLASS(gpio_chip_guard, guard)(desc);
+	revocable_try_access_or_return_void(&desc->gdev->chip_rev, gc);
 
 	flags = READ_ONCE(desc->flags);
 
-	if (guard.gc && test_bit(GPIOD_FLAG_REQUESTED, &flags)) {
-		if (guard.gc->free)
-			guard.gc->free(guard.gc, gpiod_hwgpio(desc));
+	if (test_bit(GPIOD_FLAG_REQUESTED, &flags)) {
+		if (gc->free)
+			gc->free(gc, gpiod_hwgpio(desc));
 
 		clear_bit(GPIOD_FLAG_ACTIVE_LOW, &flags);
 		clear_bit(GPIOD_FLAG_REQUESTED, &flags);
@@ -2778,15 +2773,14 @@ EXPORT_SYMBOL_GPL(gpiochip_free_own_desc);
 int gpio_do_set_config(struct gpio_desc *desc, unsigned long config)
 {
 	int ret;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
-	if (!guard.gc->set_config)
+	if (!gc->set_config)
 		return -ENOTSUPP;
 
-	ret = guard.gc->set_config(guard.gc, gpiod_hwgpio(desc), config);
+	ret = gc->set_config(gc, gpiod_hwgpio(desc), config);
 	if (ret > 0)
 		ret = -EBADE;
 
@@ -2899,8 +2893,6 @@ static int gpiochip_direction_input(struct gpio_chip *gc, unsigned int offset)
 {
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (WARN_ON(!gc->direction_input))
 		return -EOPNOTSUPP;
 
@@ -2916,8 +2908,6 @@ static int gpiochip_direction_output(struct gpio_chip *gc, unsigned int offset,
 {
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (WARN_ON(!gc->direction_output))
 		return -EOPNOTSUPP;
 
@@ -2955,17 +2945,16 @@ EXPORT_SYMBOL_GPL(gpiod_direction_input);
 int gpiod_direction_input_nonotify(struct gpio_desc *desc)
 {
 	int ret = 0, dir;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	/*
 	 * It is legal to have no .get() and .direction_input() specified if
 	 * the chip is output-only, but you can't specify .direction_input()
 	 * and not support the .get() operation, that doesn't make sense.
 	 */
-	if (!guard.gc->get && guard.gc->direction_input) {
+	if (!gc->get && gc->direction_input) {
 		gpiod_warn(desc,
 			   "%s: missing get() but have direction_input()\n",
 			   __func__);
@@ -2978,11 +2967,10 @@ int gpiod_direction_input_nonotify(struct gpio_desc *desc)
 	 * direction (if .get_direction() is supported) else we silently
 	 * assume we are in input mode after this.
 	 */
-	if (guard.gc->direction_input) {
-		ret = gpiochip_direction_input(guard.gc,
-					       gpiod_hwgpio(desc));
-	} else if (guard.gc->get_direction) {
-		dir = gpiochip_get_direction(guard.gc, gpiod_hwgpio(desc));
+	if (gc->direction_input) {
+		ret = gpiochip_direction_input(gc, gpiod_hwgpio(desc));
+	} else if (gc->get_direction) {
+		dir = gpiochip_get_direction(gc, gpiod_hwgpio(desc));
 		if (dir < 0)
 			return dir;
 
@@ -3007,8 +2995,6 @@ static int gpiochip_set(struct gpio_chip *gc, unsigned int offset, int value)
 {
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (WARN_ON(unlikely(!gc->set)))
 		return -EOPNOTSUPP;
 
@@ -3022,31 +3008,28 @@ static int gpiochip_set(struct gpio_chip *gc, unsigned int offset, int value)
 static int gpiod_direction_output_raw_commit(struct gpio_desc *desc, int value)
 {
 	int val = !!value, ret = 0, dir;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	/*
 	 * It's OK not to specify .direction_output() if the gpiochip is
 	 * output-only, but if there is then not even a .set() operation it
 	 * is pretty tricky to drive the output line.
 	 */
-	if (!guard.gc->set && !guard.gc->direction_output) {
+	if (!gc->set && !gc->direction_output) {
 		gpiod_warn(desc,
 			   "%s: missing set() and direction_output() operations\n",
 			   __func__);
 		return -EIO;
 	}
 
-	if (guard.gc->direction_output) {
-		ret = gpiochip_direction_output(guard.gc,
-						gpiod_hwgpio(desc), val);
+	if (gc->direction_output) {
+		ret = gpiochip_direction_output(gc, gpiod_hwgpio(desc), val);
 	} else {
 		/* Check that we are in output mode if we can */
-		if (guard.gc->get_direction) {
-			dir = gpiochip_get_direction(guard.gc,
-						     gpiod_hwgpio(desc));
+		if (gc->get_direction) {
+			dir = gpiochip_get_direction(gc, gpiod_hwgpio(desc));
 			if (dir < 0)
 				return dir;
 
@@ -3061,7 +3044,7 @@ static int gpiod_direction_output_raw_commit(struct gpio_desc *desc, int value)
 		 * If we can't actively set the direction, we are some
 		 * output-only chip, so just drive the output as desired.
 		 */
-		ret = gpiochip_set(guard.gc, gpiod_hwgpio(desc), val);
+		ret = gpiochip_set(gc, gpiod_hwgpio(desc), val);
 		if (ret)
 			return ret;
 	}
@@ -3199,20 +3182,18 @@ int gpiod_direction_output_nonotify(struct gpio_desc *desc, int value)
 int gpiod_enable_hw_timestamp_ns(struct gpio_desc *desc, unsigned long flags)
 {
 	int ret;
+	struct gpio_chip *gc;
 
 	VALIDATE_DESC(desc);
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
-	if (!guard.gc->en_hw_timestamp) {
+	if (!gc->en_hw_timestamp) {
 		gpiod_warn(desc, "%s: hw ts not supported\n", __func__);
 		return -ENOTSUPP;
 	}
 
-	ret = guard.gc->en_hw_timestamp(guard.gc,
-					gpiod_hwgpio(desc), flags);
+	ret = gc->en_hw_timestamp(gc, gpiod_hwgpio(desc), flags);
 	if (ret)
 		gpiod_warn(desc, "%s: hw ts request failed\n", __func__);
 
@@ -3232,20 +3213,18 @@ EXPORT_SYMBOL_GPL(gpiod_enable_hw_timestamp_ns);
 int gpiod_disable_hw_timestamp_ns(struct gpio_desc *desc, unsigned long flags)
 {
 	int ret;
+	struct gpio_chip *gc;
 
 	VALIDATE_DESC(desc);
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
-	if (!guard.gc->dis_hw_timestamp) {
+	if (!gc->dis_hw_timestamp) {
 		gpiod_warn(desc, "%s: hw ts not supported\n", __func__);
 		return -ENOTSUPP;
 	}
 
-	ret = guard.gc->dis_hw_timestamp(guard.gc, gpiod_hwgpio(desc),
-					 flags);
+	ret = gc->dis_hw_timestamp(gc, gpiod_hwgpio(desc), flags);
 	if (ret)
 		gpiod_warn(desc, "%s: hw ts release failed\n", __func__);
 
@@ -3363,8 +3342,6 @@ static int gpiochip_get(struct gpio_chip *gc, unsigned int offset)
 {
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	/* Make sure this is called after checking for gc->get(). */
 	ret = gc->get(gc, offset);
 	if (ret > 1) {
@@ -3406,18 +3383,10 @@ static int gpio_chip_get_value(struct gpio_chip *gc, const struct gpio_desc *des
 
 static int gpiod_get_raw_value_commit(const struct gpio_desc *desc)
 {
-	struct gpio_device *gdev;
 	struct gpio_chip *gc;
 	int value;
 
-	/* FIXME Unable to use gpio_chip_guard due to const desc. */
-	gdev = desc->gdev;
-
-	guard(srcu)(&gdev->srcu);
-
-	gc = srcu_dereference(gdev->chip, &gdev->srcu);
-	if (!gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	value = gpio_chip_get_value(gc, desc);
 	value = value < 0 ? value : !!value;
@@ -3428,8 +3397,6 @@ static int gpiod_get_raw_value_commit(const struct gpio_desc *desc)
 static int gpio_chip_get_multiple(struct gpio_chip *gc,
 				  unsigned long *mask, unsigned long *bits)
 {
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (gc->get_multiple) {
 		int ret;
 
@@ -3456,9 +3423,10 @@ static int gpio_chip_get_multiple(struct gpio_chip *gc,
 /* The 'other' chip must be protected with its GPIO device's SRCU. */
 static bool gpio_device_chip_cmp(struct gpio_device *gdev, struct gpio_chip *gc)
 {
-	guard(srcu)(&gdev->srcu);
+	struct gpio_chip *chip;
 
-	return gc == srcu_dereference(gdev->chip, &gdev->srcu);
+	revocable_try_access_with(&gdev->chip_rev, chip);
+	return chip ? chip == gc : false;
 }
 
 int gpiod_get_array_value_complex(bool raw, bool can_sleep,
@@ -3481,11 +3449,7 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 		if (!can_sleep)
 			WARN_ON(array_info->gdev->can_sleep);
 
-		guard(srcu)(&array_info->gdev->srcu);
-		gc = srcu_dereference(array_info->gdev->chip,
-				      &array_info->gdev->srcu);
-		if (!gc)
-			return -ENODEV;
+		revocable_try_access_or_return(&array_info->gdev->chip_rev, gc);
 
 		ret = gpio_chip_get_multiple(gc, array_info->get_mask,
 					     value_bitmap);
@@ -3509,31 +3473,29 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 		unsigned long *mask, *bits;
 		int first, j;
 
-		CLASS(gpio_chip_guard, guard)(desc_array[i]);
-		if (!guard.gc)
-			return -ENODEV;
+		revocable_try_access_or_return(&desc_array[i]->gdev->chip_rev, gc);
 
-		if (likely(guard.gc->ngpio <= FASTPATH_NGPIO)) {
+		if (likely(gc->ngpio <= FASTPATH_NGPIO)) {
 			mask = fastpath_mask;
 			bits = fastpath_bits;
 		} else {
 			gfp_t flags = can_sleep ? GFP_KERNEL : GFP_ATOMIC;
 
-			mask = bitmap_alloc(guard.gc->ngpio, flags);
+			mask = bitmap_alloc(gc->ngpio, flags);
 			if (!mask)
 				return -ENOMEM;
 
-			bits = bitmap_alloc(guard.gc->ngpio, flags);
+			bits = bitmap_alloc(gc->ngpio, flags);
 			if (!bits) {
 				bitmap_free(mask);
 				return -ENOMEM;
 			}
 		}
 
-		bitmap_zero(mask, guard.gc->ngpio);
+		bitmap_zero(mask, gc->ngpio);
 
 		if (!can_sleep)
-			WARN_ON(guard.gc->can_sleep);
+			WARN_ON(gc->can_sleep);
 
 		/* collect all inputs belonging to the same chip */
 		first = i;
@@ -3548,9 +3510,9 @@ int gpiod_get_array_value_complex(bool raw, bool can_sleep,
 				i = find_next_zero_bit(array_info->get_mask,
 						       array_size, i);
 		} while ((i < array_size) &&
-			 gpio_device_chip_cmp(desc_array[i]->gdev, guard.gc));
+			 gpio_device_chip_cmp(desc_array[i]->gdev, gc));
 
-		ret = gpio_chip_get_multiple(guard.gc, mask, bits);
+		ret = gpio_chip_get_multiple(gc, mask, bits);
 		if (ret) {
 			if (mask != fastpath_mask)
 				bitmap_free(mask);
@@ -3699,15 +3661,14 @@ EXPORT_SYMBOL_GPL(gpiod_get_array_value);
 static int gpio_set_open_drain_value_commit(struct gpio_desc *desc, bool value)
 {
 	int ret = 0, offset = gpiod_hwgpio(desc);
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	if (value) {
-		ret = gpiochip_direction_input(guard.gc, offset);
+		ret = gpiochip_direction_input(gc, offset);
 	} else {
-		ret = gpiochip_direction_output(guard.gc, offset, 0);
+		ret = gpiochip_direction_output(gc, offset, 0);
 		if (!ret)
 			set_bit(GPIOD_FLAG_IS_OUT, &desc->flags);
 	}
@@ -3728,17 +3689,16 @@ static int gpio_set_open_drain_value_commit(struct gpio_desc *desc, bool value)
 static int gpio_set_open_source_value_commit(struct gpio_desc *desc, bool value)
 {
 	int ret = 0, offset = gpiod_hwgpio(desc);
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	if (value) {
-		ret = gpiochip_direction_output(guard.gc, offset, 1);
+		ret = gpiochip_direction_output(gc, offset, 1);
 		if (!ret)
 			set_bit(GPIOD_FLAG_IS_OUT, &desc->flags);
 	} else {
-		ret = gpiochip_direction_input(guard.gc, offset);
+		ret = gpiochip_direction_input(gc, offset);
 	}
 	trace_gpio_direction(desc_to_gpio(desc), !value, ret);
 	if (ret < 0)
@@ -3751,15 +3711,15 @@ static int gpio_set_open_source_value_commit(struct gpio_desc *desc, bool value)
 
 static int gpiod_set_raw_value_commit(struct gpio_desc *desc, bool value)
 {
+	struct gpio_chip *gc;
+
 	if (unlikely(!test_bit(GPIOD_FLAG_IS_OUT, &desc->flags)))
 		return -EPERM;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	trace_gpio_value(desc_to_gpio(desc), 0, value);
-	return gpiochip_set(guard.gc, gpiod_hwgpio(desc), value);
+	return gpiochip_set(gc, gpiod_hwgpio(desc), value);
 }
 
 /*
@@ -3780,8 +3740,6 @@ static int gpiochip_set_multiple(struct gpio_chip *gc,
 	unsigned int i;
 	int ret;
 
-	lockdep_assert_held(&gc->gpiodev->srcu);
-
 	if (gc->set_multiple) {
 		ret = gc->set_multiple(gc, mask, bits);
 		if (ret > 0)
@@ -3826,11 +3784,7 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
 				return -EPERM;
 		}
 
-		guard(srcu)(&array_info->gdev->srcu);
-		gc = srcu_dereference(array_info->gdev->chip,
-				      &array_info->gdev->srcu);
-		if (!gc)
-			return -ENODEV;
+		revocable_try_access_or_return(&array_info->gdev->chip_rev, gc);
 
 		if (!raw && !bitmap_empty(array_info->invert_mask, array_size))
 			bitmap_xor(value_bitmap, value_bitmap,
@@ -3854,31 +3808,30 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
 		unsigned long *mask, *bits;
 		int count = 0;
 
-		CLASS(gpio_chip_guard, guard)(desc_array[i]);
-		if (!guard.gc)
-			return -ENODEV;
+		revocable_try_access_or_return(&desc_array[i]->gdev->chip_rev,
+					       gc);
 
-		if (likely(guard.gc->ngpio <= FASTPATH_NGPIO)) {
+		if (likely(gc->ngpio <= FASTPATH_NGPIO)) {
 			mask = fastpath_mask;
 			bits = fastpath_bits;
 		} else {
 			gfp_t flags = can_sleep ? GFP_KERNEL : GFP_ATOMIC;
 
-			mask = bitmap_alloc(guard.gc->ngpio, flags);
+			mask = bitmap_alloc(gc->ngpio, flags);
 			if (!mask)
 				return -ENOMEM;
 
-			bits = bitmap_alloc(guard.gc->ngpio, flags);
+			bits = bitmap_alloc(gc->ngpio, flags);
 			if (!bits) {
 				bitmap_free(mask);
 				return -ENOMEM;
 			}
 		}
 
-		bitmap_zero(mask, guard.gc->ngpio);
+		bitmap_zero(mask, gc->ngpio);
 
 		if (!can_sleep)
-			WARN_ON(guard.gc->can_sleep);
+			WARN_ON(gc->can_sleep);
 
 		do {
 			struct gpio_desc *desc = desc_array[i];
@@ -3917,10 +3870,10 @@ int gpiod_set_array_value_complex(bool raw, bool can_sleep,
 				i = find_next_zero_bit(array_info->set_mask,
 						       array_size, i);
 		} while ((i < array_size) &&
-			 gpio_device_chip_cmp(desc_array[i]->gdev, guard.gc));
+			 gpio_device_chip_cmp(desc_array[i]->gdev, gc));
 		/* push collected bits to outputs */
 		if (count != 0) {
-			ret = gpiochip_set_multiple(guard.gc, mask, bits);
+			ret = gpiochip_set_multiple(gc, mask, bits);
 			if (ret)
 				return ret;
 		}
@@ -4126,7 +4079,6 @@ EXPORT_SYMBOL_GPL(gpiod_is_shared);
  */
 int gpiod_to_irq(const struct gpio_desc *desc)
 {
-	struct gpio_device *gdev;
 	struct gpio_chip *gc;
 	int offset;
 	int ret;
@@ -4135,12 +4087,7 @@ int gpiod_to_irq(const struct gpio_desc *desc)
 	if (ret <= 0)
 		return -EINVAL;
 
-	gdev = desc->gdev;
-	/* FIXME Cannot use gpio_chip_guard due to const desc. */
-	guard(srcu)(&gdev->srcu);
-	gc = srcu_dereference(gdev->chip, &gdev->srcu);
-	if (!gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	offset = gpiod_hwgpio(desc);
 	if (gc->to_irq) {
@@ -5100,18 +5047,16 @@ int gpiod_hog(struct gpio_desc *desc, const char *name,
 	struct gpio_desc *local_desc;
 	int hwnum;
 	int ret;
+	struct gpio_chip *gc;
 
-	CLASS(gpio_chip_guard, guard)(desc);
-	if (!guard.gc)
-		return -ENODEV;
+	revocable_try_access_or_return(&desc->gdev->chip_rev, gc);
 
 	if (test_and_set_bit(GPIOD_FLAG_IS_HOGGED, &desc->flags))
 		return 0;
 
 	hwnum = gpiod_hwgpio(desc);
 
-	local_desc = gpiochip_request_own_desc(guard.gc, hwnum, name,
-					       lflags, dflags);
+	local_desc = gpiochip_request_own_desc(gc, hwnum, name, lflags, dflags);
 	if (IS_ERR(local_desc)) {
 		clear_bit(GPIOD_FLAG_IS_HOGGED, &desc->flags);
 		ret = PTR_ERR(local_desc);
@@ -5481,9 +5426,7 @@ static int gpiolib_seq_show(struct seq_file *s, void *v)
 	if (priv->newline)
 		seq_putc(s, '\n');
 
-	guard(srcu)(&gdev->srcu);
-
-	gc = srcu_dereference(gdev->chip, &gdev->srcu);
+	revocable_try_access_with(&gdev->chip_rev, gc);
 	if (!gc) {
 		seq_printf(s, "%s: (dangling chip)\n", dev_name(&gdev->dev));
 		return 0;
diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
index dc4cb61a9318..efbff4a1cd4e 100644
--- a/drivers/gpio/gpiolib.h
+++ b/drivers/gpio/gpiolib.h
@@ -16,6 +16,7 @@
 #include <linux/gpio/driver.h>
 #include <linux/module.h>
 #include <linux/notifier.h>
+#include <linux/revocable.h>
 #include <linux/spinlock.h>
 #include <linux/string.h>
 #include <linux/srcu.h>
@@ -31,7 +32,6 @@ struct fwnode_handle;
  * @chrdev: character device for the GPIO device
  * @id: numerical ID number for the GPIO chip
  * @owner: helps prevent removal of modules exporting active GPIOs
- * @chip: pointer to the corresponding gpiochip, holding static
  * data for this device
  * @descs: array of ngpio descriptors.
  * @valid_mask: If not %NULL, holds bitmask of GPIOs which are valid to be
@@ -54,7 +54,7 @@ struct fwnode_handle;
  *                 process context
  * @device_notifier: used to notify character device wait queues about the GPIO
  *                   device being unregistered
- * @srcu: protects the pointer to the underlying GPIO chip
+ * @chip_rev: revocable provider handle for the corresponding struct gpio_chip.
  * @pin_ranges: range of pins served by the GPIO driver
  *
  * This state container holds most of the runtime variable data
@@ -67,7 +67,6 @@ struct gpio_device {
 	struct cdev		chrdev;
 	int			id;
 	struct module		*owner;
-	struct gpio_chip __rcu	*chip;
 	struct gpio_desc	*descs;
 	unsigned long		*valid_mask;
 	struct srcu_struct	desc_srcu;
@@ -81,7 +80,7 @@ struct gpio_device {
 	rwlock_t		line_state_lock;
 	struct workqueue_struct	*line_state_wq;
 	struct blocking_notifier_head device_notifier;
-	struct srcu_struct	srcu;
+	struct revocable	chip_rev;
 
 #ifdef CONFIG_PINCTRL
 	/*
@@ -225,27 +224,6 @@ struct gpio_desc {
 
 #define gpiod_not_found(desc)		(IS_ERR(desc) && PTR_ERR(desc) == -ENOENT)
 
-struct gpio_chip_guard {
-	struct gpio_device *gdev;
-	struct gpio_chip *gc;
-	int idx;
-};
-
-DEFINE_CLASS(gpio_chip_guard,
-	     struct gpio_chip_guard,
-	     srcu_read_unlock(&_T.gdev->srcu, _T.idx),
-	     ({
-		struct gpio_chip_guard _guard;
-
-		_guard.gdev = desc->gdev;
-		_guard.idx = srcu_read_lock(&_guard.gdev->srcu);
-		_guard.gc = srcu_dereference(_guard.gdev->chip,
-					     &_guard.gdev->srcu);
-
-		_guard;
-	     }),
-	     struct gpio_desc *desc)
-
 int gpiod_request(struct gpio_desc *desc, const char *label);
 int gpiod_request_commit(struct gpio_desc *desc, const char *label);
 void gpiod_free(struct gpio_desc *desc);
-- 
2.51.0


^ permalink raw reply related

* [PATCH v11 2/5] revocable: Add KUnit test cases
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney,
	Bartosz Golaszewski
In-Reply-To: <20260513091043.6766-1-tzungbi@kernel.org>

Add KUnit test cases for the revocable API.

The test cases cover the following scenarios:

- Basic: Verifies that a consumer can successfully access the resource.
- Revocation: Verifies that after the provider revokes the resource,
  the consumer correctly receives a NULL pointer on a subsequent access.
- Try Access Macro: Same as "Revocation" but uses the macro-level
  helpers.
- Concurrent Access: Verifies that multiple threads can access the
  resource concurrently.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
v11:
- Move the test to drivers/base/test/.
- Add R-b tag.

v10: https://lore.kernel.org/all/20260508105448.31799-3-tzungbi@kernel.org
- Merge revocable_test_try_access_macro*() cases.
- Change revocable API usages accordingly.

v9: https://lore.kernel.org/all/20260427135841.96266-3-tzungbi@kernel.org
- Add test cases for embedded resource provider.

v8: https://lore.kernel.org/all/20260213092307.858908-3-tzungbi@kernel.org
- Squash:
  - c259cd7ea3c9 revocable: fix missing module license and description
  - a243f7fb11fe revocable: Add KUnit test for provider lifetime races
  - 988357628c2c revocable: Add KUnit test for concurrent access
- Change accordingly due to its dependency "revocable: Revocable resource
  management" changes.

v7: https://lore.kernel.org/all/20260116080235.350305-3-tzungbi@kernel.org
- "2025" -> "2026" in copyright.
- Rename the test name "macro" -> "try_access_macro".

v6: https://lore.kernel.org/all/20251106152330.11733-3-tzungbi@kernel.org
- Rename REVOCABLE_TRY_ACCESS_WITH() -> REVOCABLE_TRY_ACCESS_SCOPED().
- Add tests for new REVOCABLE_TRY_ACCESS_WITH().

v5: https://lore.kernel.org/all/20251016054204.1523139-3-tzungbi@kernel.org
- No changes.

v4: https://lore.kernel.org/all/20250923075302.591026-3-tzungbi@kernel.org
- REVOCABLE() -> REVOCABLE_TRY_ACCESS_WITH().
- revocable_release() -> revocable_withdraw_access().

v3: https://lore.kernel.org/all/20250912081718.3827390-3-tzungbi@kernel.org
- No changes.

v2: https://lore.kernel.org/all/20250820081645.847919-3-tzungbi@kernel.org
- New in the series.

A way to run the test:
$ ./tools/testing/kunit/kunit.py run \
        --kconfig_add CONFIG_REVOCABLE_KUNIT_TEST=y \
        revocable_test
Or
$ ./tools/testing/kunit/kunit.py run \
        --kconfig_add CONFIG_REVOCABLE_KUNIT_TEST=y \
        --kconfig_add CONFIG_PROVE_LOCKING=y \
        --kconfig_add CONFIG_DEBUG_KERNEL=y \
        --kconfig_add CONFIG_DEBUG_INFO=y \
        --kconfig_add CONFIG_DEBUG_INFO_DWARF5=y \
        --kconfig_add CONFIG_KASAN=y \
        --kconfig_add CONFIG_DETECT_HUNG_TASK=y \
        --kconfig_add CONFIG_DEFAULT_HUNG_TASK_TIMEOUT="10" \
        --arch=x86_64 \
        --make_options="C=1 W=1" \
        revocable_test
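
For readers unfamiliar with the pattern, the consumer-visible behaviour
checked by the "Revocation" scenario can be modelled in userspace roughly
as below. This is only a sketch and not part of the patch: every model_*
name is hypothetical, C11 atomics stand in for the kernel's
synchronization, and the model does not wait for in-flight readers the
way a real revoke would have to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model only; all names here are hypothetical. */
struct model_revocable {
	_Atomic(void *) res;	/* NULL once revoked */
	atomic_int readers;	/* accesses currently in flight */
};

static void model_init(struct model_revocable *rev, void *res)
{
	atomic_init(&rev->res, res);
	atomic_init(&rev->readers, 0);
}

/* Begin an access; returns the resource, or NULL if already revoked. */
static void *model_try_access(struct model_revocable *rev)
{
	atomic_fetch_add(&rev->readers, 1);
	return atomic_load(&rev->res);
}

/* End an access; paired with every try_access, even one that got NULL. */
static void model_withdraw_access(struct model_revocable *rev)
{
	atomic_fetch_sub(&rev->readers, 1);
}

/*
 * Provider side: hide the resource from all later accesses.
 * Simplification: a real implementation would also wait here until
 * the in-flight readers have withdrawn.
 */
static void model_revoke(struct model_revocable *rev)
{
	atomic_store(&rev->res, NULL);
}

/* Access, revoke, access again; returns 1 if the model behaves. */
static int model_revocation_scenario(void)
{
	struct model_revocable rev;
	int payload = 42;
	void *before, *after;

	model_init(&rev, &payload);

	before = model_try_access(&rev);
	model_withdraw_access(&rev);

	model_revoke(&rev);

	after = model_try_access(&rev);
	model_withdraw_access(&rev);

	return before == (void *)&payload && after == NULL &&
	       atomic_load(&rev.readers) == 0;
}
```

Calling model_revocation_scenario() follows the same order as the KUnit
case: a successful access, a revoke, then an access that sees NULL.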

---
 MAINTAINERS                        |   1 +
 drivers/base/test/Kconfig          |   5 +
 drivers/base/test/Makefile         |   2 +
 drivers/base/test/revocable-test.c | 406 +++++++++++++++++++++++++++++
 4 files changed, 414 insertions(+)
 create mode 100644 drivers/base/test/revocable-test.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 424847de7a17..24c884e19cd5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22869,6 +22869,7 @@ L:	driver-core@lists.linux.dev
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
 F:	drivers/base/revocable.c
+F:	drivers/base/test/revocable-test.c
 F:	include/linux/revocable.h
 
 RFKILL
diff --git a/drivers/base/test/Kconfig b/drivers/base/test/Kconfig
index 2756870615cc..fde950fcfac9 100644
--- a/drivers/base/test/Kconfig
+++ b/drivers/base/test/Kconfig
@@ -18,3 +18,8 @@ config DRIVER_PE_KUNIT_TEST
 	tristate "KUnit Tests for property entry API" if !KUNIT_ALL_TESTS
 	depends on KUNIT
 	default KUNIT_ALL_TESTS
+
+config REVOCABLE_KUNIT_TEST
+	tristate "KUnit tests for revocable" if !KUNIT_ALL_TESTS
+	depends on KUNIT
+	default KUNIT_ALL_TESTS
diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile
index e321dfc7e922..7b5832d38436 100644
--- a/drivers/base/test/Makefile
+++ b/drivers/base/test/Makefile
@@ -6,3 +6,5 @@ obj-$(CONFIG_DM_KUNIT_TEST)	+= platform-device-test.o
 
 obj-$(CONFIG_DRIVER_PE_KUNIT_TEST) += property-entry-test.o
 CFLAGS_property-entry-test.o += $(DISABLE_STRUCTLEAK_PLUGIN)
+
+obj-$(CONFIG_REVOCABLE_KUNIT_TEST) += revocable-test.o
diff --git a/drivers/base/test/revocable-test.c b/drivers/base/test/revocable-test.c
new file mode 100644
index 000000000000..85ec83412bbf
--- /dev/null
+++ b/drivers/base/test/revocable-test.c
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2026 Google LLC
+ *
+ * KUnit tests for the revocable API.
+ *
+ * The test cases cover the following scenarios:
+ *
+ * - Basic: Verifies that a consumer can successfully access the resource.
+ *
+ * - Revocation: Verifies that after the provider revokes the resource,
+ *   the consumer correctly receives a NULL pointer on a subsequent access.
+ *
+ * - Try Access Macro: Same as "Revocation" but uses the macro-level
+ *   helpers.
+ *
+ * - Concurrent Access: Verifies that multiple threads can access the resource.
+ */
+
+#include <kunit/test.h>
+
+#include <linux/completion.h>
+#include <linux/delay.h>
+#include <linux/kthread.h>
+#include <linux/refcount.h>
+#include <linux/revocable.h>
+
+static int get_refcount(struct revocable *rev)
+{
+	return refcount_read(&rev->kref.refcount);
+}
+
+static void revocable_test_basic(struct kunit *test)
+{
+	struct revocable *rev;
+	struct revocable_handle rh;
+	void *real_res = (void *)0x12345678, *res;
+
+	rev = revocable_alloc(real_res);
+	KUNIT_ASSERT_NOT_NULL(test, rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	KUNIT_EXPECT_FALSE(test, rev->embedded);
+
+	revocable_handle_init(rev, &rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+	revocable_handle_deinit(&rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	revocable_revoke(rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+	revocable_put(rev);
+}
+
+static void revocable_embedded_test_basic(struct kunit *test)
+{
+	struct revocable rev;
+	struct revocable_handle rh;
+	void *real_res = (void *)0x12345678, *res;
+
+	revocable_init(&rev, real_res);
+	KUNIT_EXPECT_TRUE(test, rev.embedded);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 2);
+
+	revocable_handle_init(&rev, &rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 3);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 3);
+	revocable_handle_deinit(&rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 2);
+	revocable_revoke(&rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 1);
+	revocable_put(&rev);
+}
+
+static void revocable_test_revocation(struct kunit *test)
+{
+	struct revocable *rev;
+	struct revocable_handle rh;
+	void *real_res = (void *)0x12345678, *res;
+
+	rev = revocable_alloc(real_res);
+	KUNIT_ASSERT_NOT_NULL(test, rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	KUNIT_EXPECT_FALSE(test, rev->embedded);
+
+	revocable_handle_init(rev, &rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+	revocable_revoke(rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, NULL);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	revocable_handle_deinit(&rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+	revocable_put(rev);
+}
+
+static void revocable_embedded_test_revocation(struct kunit *test)
+{
+	struct revocable rev;
+	struct revocable_handle rh;
+	void *real_res = (void *)0x12345678, *res;
+
+	revocable_init(&rev, real_res);
+	KUNIT_EXPECT_TRUE(test, rev.embedded);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 2);
+
+	revocable_handle_init(&rev, &rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 3);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 3);
+	revocable_revoke(&rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 2);
+
+	res = revocable_try_access(&rh);
+	KUNIT_EXPECT_PTR_EQ(test, res, NULL);
+	revocable_withdraw_access(&rh);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 2);
+	revocable_handle_deinit(&rh);
+	KUNIT_EXPECT_EQ(test, get_refcount(&rev), 1);
+	revocable_put(&rev);
+}
+
+static int call_revocable_try_access_or_return_err(struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return_err(rev, res, -ENXIO);
+	return 0;
+}
+
+static int call_revocable_try_access_or_return(struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return(rev, res);
+	return 0;
+}
+
+static void call_revocable_try_access_or_return_void(struct kunit *test,
+						     struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return_void(rev, res);
+	KUNIT_FAIL(test, "unreachable");
+}
+
+static int call_revocable_try_access_or_return_err_scoped(struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return_err_scoped(rev, res, -ENXIO) {}
+	return 0;
+}
+
+static int call_revocable_try_access_or_return_scoped(struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return_scoped(rev, res) {}
+	return 0;
+}
+
+static void call_revocable_try_access_or_return_void_scoped(struct kunit *test,
+							    struct revocable *rev)
+{
+	void *res;
+
+	revocable_try_access_or_return_void_scoped(rev, res) {}
+	KUNIT_FAIL(test, "unreachable");
+}
+
+static void revocable_test_try_access_macro(struct kunit *test)
+{
+	struct revocable *rev;
+	void *real_res = (void *)0x12345678, *res;
+	int ret;
+	bool accessed;
+
+	rev = revocable_alloc(real_res);
+	KUNIT_ASSERT_NOT_NULL(test, rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	KUNIT_EXPECT_FALSE(test, rev->embedded);
+
+	{
+		revocable_try_access_with(rev, res);
+		KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+		KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+	}
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+
+	accessed = false;
+	revocable_try_access_with_scoped(rev, res) {
+		KUNIT_EXPECT_PTR_EQ(test, res, real_res);
+		KUNIT_EXPECT_EQ(test, get_refcount(rev), 3);
+		accessed = true;
+	}
+	KUNIT_EXPECT_TRUE(test, accessed);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+
+	revocable_revoke(rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+
+	{
+		revocable_try_access_with(rev, res);
+		KUNIT_EXPECT_PTR_EQ(test, res, NULL);
+		KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	}
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+
+	accessed = false;
+	revocable_try_access_with_scoped(rev, res) {
+		KUNIT_EXPECT_PTR_EQ(test, res, NULL);
+		KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+		accessed = true;
+	}
+	KUNIT_EXPECT_TRUE(test, accessed);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+
+	ret = call_revocable_try_access_or_return_err(rev);
+	KUNIT_EXPECT_EQ(test, ret, -ENXIO);
+
+	ret = call_revocable_try_access_or_return(rev);
+	KUNIT_EXPECT_EQ(test, ret, -ENODEV);
+
+	call_revocable_try_access_or_return_void(test, rev);
+
+	ret = call_revocable_try_access_or_return_err_scoped(rev);
+	KUNIT_EXPECT_EQ(test, ret, -ENXIO);
+
+	ret = call_revocable_try_access_or_return_scoped(rev);
+	KUNIT_EXPECT_EQ(test, ret, -ENODEV);
+
+	call_revocable_try_access_or_return_void_scoped(test, rev);
+
+	accessed = false;
+	revocable_try_access_or_skip_scoped(rev, res)
+		accessed = true;
+	KUNIT_EXPECT_FALSE(test, accessed);
+
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+	revocable_put(rev);
+}
+
+struct test_concurrent_access_context {
+	struct completion started, enter;
+	struct task_struct *thread;
+
+	union {
+		/* Used by test provider. */
+		struct revocable *rev;
+
+		/* Used by test consumer. */
+		struct {
+			struct completion exit;
+			struct revocable_handle rh;
+			struct kunit *test;
+			void *expected_res;
+		};
+	};
+};
+
+static int test_concurrent_access_provider(void *data)
+{
+	struct test_concurrent_access_context *ctx = data;
+
+	complete(&ctx->started);
+
+	wait_for_completion(&ctx->enter);
+	revocable_revoke(ctx->rev);
+
+	return 0;
+}
+
+static int test_concurrent_access_consumer(void *data)
+{
+	struct test_concurrent_access_context *ctx = data;
+	void *res;
+
+	complete(&ctx->started);
+
+	wait_for_completion(&ctx->enter);
+	res = revocable_try_access(&ctx->rh);
+	KUNIT_EXPECT_PTR_EQ(ctx->test, res, ctx->expected_res);
+
+	wait_for_completion(&ctx->exit);
+	revocable_withdraw_access(&ctx->rh);
+
+	return 0;
+}
+
+static void revocable_test_concurrent_access(struct kunit *test)
+{
+	struct revocable *rev;
+	void *real_res = (void *)0x12345678;
+	struct test_concurrent_access_context *ctx;
+	int i;
+
+	rev = revocable_alloc(real_res);
+	KUNIT_ASSERT_NOT_NULL(test, rev);
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 2);
+	KUNIT_EXPECT_FALSE(test, rev->embedded);
+
+	ctx = kunit_kmalloc_array(test, 3, sizeof(*ctx), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, ctx);
+
+	for (i = 0; i < 3; ++i) {
+		ctx[i].test = test;
+		init_completion(&ctx[i].started);
+		init_completion(&ctx[i].enter);
+
+		if (i == 0) {
+			/* Transfer the ownership of provider reference too. */
+			ctx[i].rev = rev;
+			ctx[i].thread = kthread_run(
+				test_concurrent_access_provider, ctx + i,
+				"revocable_%d", i);
+		} else {
+			init_completion(&ctx[i].exit);
+			revocable_handle_init(rev, &ctx[i].rh);
+			KUNIT_EXPECT_EQ(test, get_refcount(rev), 2 + i);
+
+			ctx[i].thread = kthread_run(
+				test_concurrent_access_consumer, ctx + i,
+				"revocable_handle_%d", i);
+		}
+		KUNIT_ASSERT_FALSE(test, IS_ERR(ctx[i].thread));
+
+		wait_for_completion(&ctx[i].started);
+	}
+
+	ctx[1].expected_res = real_res;
+	/* consumer1 enters read-side critical section. */
+	complete(&ctx[1].enter);
+	msleep(100);
+
+	/* provider0 revokes the resource. */
+	complete(&ctx[0].enter);
+	msleep(100);
+	/* provider0 can't exit.  It's waiting for the grace period. */
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 4);
+
+	ctx[2].expected_res = NULL;
+	/* consumer2 enters read-side critical section. */
+	complete(&ctx[2].enter);
+	msleep(100);
+
+	/* consumer{1,2} exit read-side critical section. */
+	for (i = 1; i < 3; ++i) {
+		complete(&ctx[i].exit);
+		kthread_stop(ctx[i].thread);
+		revocable_handle_deinit(&ctx[i].rh);
+	}
+
+	kthread_stop(ctx[0].thread);
+	/* provider0 exits as all readers exit their critical section. */
+	KUNIT_EXPECT_EQ(test, get_refcount(rev), 1);
+
+	/* Drop the caller reference. */
+	revocable_put(rev);
+}
+
+static struct kunit_case revocable_test_cases[] = {
+	KUNIT_CASE(revocable_test_basic),
+	KUNIT_CASE(revocable_embedded_test_basic),
+	KUNIT_CASE(revocable_test_revocation),
+	KUNIT_CASE(revocable_embedded_test_revocation),
+	KUNIT_CASE(revocable_test_try_access_macro),
+	KUNIT_CASE(revocable_test_concurrent_access),
+	{}
+};
+
+static struct kunit_suite revocable_test_suite = {
+	.name = "revocable_test",
+	.test_cases = revocable_test_cases,
+};
+
+kunit_test_suite(revocable_test_suite);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Tzung-Bi Shih <tzungbi@kernel.org>");
+MODULE_DESCRIPTION("KUnit tests for the revocable API");
-- 
2.51.0



* [PATCH v11 1/5] revocable: Revocable resource management
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney,
	Bartosz Golaszewski
In-Reply-To: <20260513091043.6766-1-tzungbi@kernel.org>

The "revocable" mechanism is a synchronization primitive designed to
manage safe access to resources that can be asynchronously removed or
invalidated.  Its primary purpose is to prevent Use-After-Free (UAF)
errors when interacting with resources whose lifetimes are not
guaranteed to outlast their consumers.

This is particularly useful in systems where resources can disappear
unexpectedly, such as those provided by hot-pluggable devices like
USB.  When a consumer holds a reference to such a resource, the
underlying device might be removed, causing the resource's memory to
be freed.  Subsequent access attempts by the consumer would then lead
to UAF errors.

Revocable addresses this by providing a form of "weak reference" and
a controlled access method.  It allows a resource consumer to safely
attempt to access the resource.  The mechanism guarantees that any
access granted is valid for the duration of its use.  If the resource
has already been revoked (i.e., freed), the access attempt will fail
safely, typically by returning NULL, instead of causing a crash.

It uses a provider/consumer model built on Sleepable RCU (SRCU) to
guarantee safe memory access:

- A resource provider, such as a driver for a hot-pluggable device,
  allocates a struct revocable and initializes it with a pointer
  to the resource.

- A resource consumer that wants to access the resource allocates a
  struct revocable_handle containing a reference to the provider.

- To access the resource, the consumer uses revocable_try_access().
  This function enters an SRCU read-side critical section and returns
  the pointer to the resource.  If the provider has already freed the
  resource, it returns NULL.  After use, the consumer calls
  revocable_withdraw_access() to exit the SRCU critical section.  Macro
  helpers such as revocable_try_access_with() wrap this pairing.

  The API provides the following contract:

  - revocable_try_access() can be safely called from both process and
    atomic contexts.
  - It is permitted to sleep within the critical section established
    between revocable_try_access() and revocable_withdraw_access().
  - revocable_try_access() and the matching revocable_withdraw_access()
    must occur in the same context.  For example, it is illegal to
    invoke revocable_withdraw_access() in an irq handler if the matching
    revocable_try_access() was invoked in process context.

- When the provider needs to remove the resource, it calls
  revocable_revoke().  This function sets the internal resource
  pointer to NULL and then calls synchronize_srcu() to wait for all
  current readers to finish before the resource can be completely torn
  down.

Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
v11:
- Add Reviewed-by and Tested-by tags.

v10: https://lore.kernel.org/all/20260508105448.31799-2-tzungbi@kernel.org
- Drop unused header.
- Unify handling of embedded and dynamic allocation.
- Rename:
  - struct revocable_consumer -> struct revocable_handle.
  - revocable_init() -> revocable_handle_init().
  - revocable_deinit() -> revocable_handle_deinit().
  - revocable_embed_init() -> revocable_init().

v9: https://lore.kernel.org/all/20260427135841.96266-2-tzungbi@kernel.org
- Add revocable_embed_init() and revocable_embed_destroy() for embedded
  resource provider per
  https://lore.kernel.org/all/CAMRc=MehkJc-js=Wk9vBAcXOpazqjtYDLPUEhmbN8U7Wu2YpgA@mail.gmail.com

v8: https://lore.kernel.org/all/20260213092307.858908-2-tzungbi@kernel.org
- Squash:
  - fdeb3ca3cca8 revocable: Remove redundant synchronize_srcu() call
  - 4d7dc4d1a62d revocable: Fix races in revocable_alloc() using RCU
  - 377563ce0653 revocable: fix SRCU index corruption by requiring caller-provided storage
- Rename macro names:
  - REVOCABLE_TRY_ACCESS_WITH() -> revocable_try_access_with().
  - REVOCABLE_TRY_ACCESS_SCOPED() -> revocable_try_access_with_scoped().
- Rename terminology.  Normal users should now only "see" provider
  handles, so the provider handle gets the shorter name to echo the main
  concept.
  - struct revocable -> struct revocable_consumer.
  - struct revocable_provider -> struct revocable.
  - revocable_provider_alloc() -> revocable_alloc().
  - revocable_provider_revoke() -> revocable_revoke().
- New APIs:
  - revocable_get().
  - revocable_put().
  - revocable_try_access_or_return_err().
  - revocable_try_access_or_return().
  - revocable_try_access_or_return_void().
  - revocable_try_access_or_return_err_scoped().
  - revocable_try_access_or_return_scoped().
  - revocable_try_access_or_return_void_scoped().
  - revocable_try_access_or_skip_scoped().
- Add API contract that revocable_try_access() works from process and
  atomic context while also allowing sleeping inside the critical
  sections.
- Add revocable.h to the DRIVER CORE entry in MAINTAINERS.

v7: https://lore.kernel.org/all/20260116080235.350305-2-tzungbi@kernel.org
- "2025" -> "2026" in copyright.
- Documentation/
  - Rephrase section "Revocable vs. Devres (devm)".
  - Include sections for struct revocable_provider and struct revocable.
- Minor rename: "revocable" -> "access_rev" for DEFINE_FREE().
- Add Acked-by tag.

v6: https://lore.kernel.org/all/20251106152330.11733-2-tzungbi@kernel.org
- Rename REVOCABLE_TRY_ACCESS_WITH() -> REVOCABLE_TRY_ACCESS_SCOPED().
- Add new REVOCABLE_TRY_ACCESS_WITH().
- Remove Acked-by tags as the API names changed a bit.

v5: https://lore.kernel.org/all/20251016054204.1523139-2-tzungbi@kernel.org
- No changes.

v4: https://lore.kernel.org/all/20250923075302.591026-2-tzungbi@kernel.org
- Rename:
  - revocable_provider_free() -> revocable_provider_revoke().
  - REVOCABLE() -> REVOCABLE_TRY_ACCESS_WITH().
  - revocable_release() -> revocable_withdraw_access().
- rcu_dereference() -> srcu_dereference() to fix a warning from lock debugging.
- Move most docs to kernel-doc, include them in Documentation/, and modify the
  commit message accordingly.
- Fix some doc errors.
- Add Acked-by tags.

v3: https://lore.kernel.org/all/20250912081718.3827390-2-tzungbi@kernel.org
- No changes.

v2: https://lore.kernel.org/all/20250820081645.847919-2-tzungbi@kernel.org
- Rename "ref_proxy" -> "revocable".
- Add introduction in kernel-doc format in revocable.c.
- Add MAINTAINERS entry.
- Add copyright.
- Move from lib/ to drivers/base/.
- EXPORT_SYMBOL() -> EXPORT_SYMBOL_GPL().
- Add Documentation/.
- Rename _get() -> try_access(); _put() -> release().
- Fix a sparse warning by removing the redundant __rcu annotations.
- Fix a sparse warning by adding __acquires() and __releases() annotations.

v1: https://lore.kernel.org/all/20250814091020.1302888-2-tzungbi@kernel.org

A way to verify Documentation/:
- `make O=build SPHINXDIRS=driver-api/driver-model/ htmldocs`.
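
A rough userspace model of the reference counting that the KUnit
revocation test in this series checks (2 after init, 3 with one handle,
2 after revoke, 1 after handle deinit) is sketched below.  All toy_*
names are made up for illustration.  The model is single-threaded and
leaves out SRCU entirely, so it only mirrors the counting, not the
synchronization:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model for illustration only; not the kernel implementation. */
struct toy_revocable {
	void *res;	/* set to NULL once revoked */
	int refs;	/* stands in for the kref */
};

struct toy_handle {
	struct toy_revocable *rev;
};

static void toy_init(struct toy_revocable *rev, void *res)
{
	rev->res = res;
	rev->refs = 2;	/* one for the provider, one for the caller */
}

static void toy_put(struct toy_revocable *rev)
{
	assert(rev->refs > 0);
	rev->refs--;
}

static void toy_revoke(struct toy_revocable *rev)
{
	rev->res = NULL;	/* the real code also waits for SRCU readers */
	toy_put(rev);		/* drops the provider reference */
}

static void toy_handle_init(struct toy_revocable *rev, struct toy_handle *h)
{
	rev->refs++;		/* each handle pins the revocable object */
	h->rev = rev;
}

static void toy_handle_deinit(struct toy_handle *h)
{
	toy_put(h->rev);
	h->rev = NULL;
}

static void *toy_try_access(struct toy_handle *h)
{
	return h->rev->res;	/* NULL after revocation */
}

/* Walks the same steps as the revocation test; returns the final count. */
int toy_lifecycle_demo(void)
{
	int payload = 42;
	struct toy_revocable rev;
	struct toy_handle h;

	toy_init(&rev, &payload);
	assert(rev.refs == 2);

	toy_handle_init(&rev, &h);
	assert(rev.refs == 3);
	assert(toy_try_access(&h) == &payload);

	toy_revoke(&rev);
	assert(rev.refs == 2);
	assert(toy_try_access(&h) == NULL);

	toy_handle_deinit(&h);
	assert(rev.refs == 1);

	toy_put(&rev);		/* caller reference */
	return rev.refs;
}
```

toy_lifecycle_demo() is expected to return 0 once the caller reference
is dropped, which corresponds to the point where the real code would run
its release callback.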

---
 .../driver-api/driver-model/index.rst         |   1 +
 .../driver-api/driver-model/revocable.rst     | 384 ++++++++++++++++++
 MAINTAINERS                                   |   9 +
 drivers/base/Makefile                         |   2 +-
 drivers/base/revocable.c                      | 267 ++++++++++++
 include/linux/revocable.h                     | 204 ++++++++++
 6 files changed, 866 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/driver-api/driver-model/revocable.rst
 create mode 100644 drivers/base/revocable.c
 create mode 100644 include/linux/revocable.h

diff --git a/Documentation/driver-api/driver-model/index.rst b/Documentation/driver-api/driver-model/index.rst
index abeb4b36636b..cc90b20bb192 100644
--- a/Documentation/driver-api/driver-model/index.rst
+++ b/Documentation/driver-api/driver-model/index.rst
@@ -14,3 +14,4 @@ Driver Model
    overview
    platform
    porting
+   revocable
diff --git a/Documentation/driver-api/driver-model/revocable.rst b/Documentation/driver-api/driver-model/revocable.rst
new file mode 100644
index 000000000000..9a20c2032695
--- /dev/null
+++ b/Documentation/driver-api/driver-model/revocable.rst
@@ -0,0 +1,384 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Revocable Resource Management
+==============================
+
+Overview
+========
+
+.. kernel-doc:: drivers/base/revocable.c
+   :doc: Overview
+
+Revocable vs. Devres (devm)
+===========================
+
+Revocable and Devres address different problems in resource management:
+
+*   **Devres:** Primarily addresses **resource leaks**.  The lifetime of the
+    resources is tied to the lifetime of the device.  The resource is
+    automatically freed when the device is unbound.  This cleanup happens
+    irrespective of any potential active users.
+
+*   **Revocable:** Primarily addresses **invalid memory access**,
+    such as Use-After-Free (UAF).  It's an independent synchronization
+    primitive that decouples consumer access from the resource's actual
+    presence.  Consumers interact with a "revocable object" (an intermediary),
+    not the underlying resource directly.  This revocable object persists as
+    long as there are active references to it from consumer handles.
+
+**Key Distinctions & How They Complement Each Other:**
+
+1.  **Reference Target:** Consumers hold a reference to the *revocable object*,
+    not the encapsulated resource itself.
+
+2.  **Resource Lifetime vs. Access:** The underlying resource's lifetime is
+    independent of the number of references to the revocable object.  The
+    resource can be freed at any point.  A common scenario is the resource
+    being freed by `devres` when the providing device is unbound.
+
+3.  **Safe Access:** Revocable provides a safe way to attempt access.  Before
+    using the resource, a consumer uses the Revocable API (e.g.,
+    revocable_try_access()).  This function checks if the resource is still
+    valid.  It returns a pointer to the resource only if it hasn't been
+    revoked; otherwise, it returns NULL.  This prevents UAF by providing a
+    clear signal that the resource is gone.
+
+4.  **Complementary Usage:** `devres` and Revocable work well together.
+    `devres` can handle the automatic allocation and deallocation of a
+    resource tied to a device.  The Revocable mechanism can be layered on top
+    to provide safe access for consumers whose lifetimes might extend beyond
+    the provider device's lifetime.  For instance, a userspace program might
+    keep a character device file open even after the physical device has been
+    removed.  In this case:
+
+    *   `devres` frees the device-specific resource upon unbinding.
+    *   The Revocable mechanism ensures that any subsequent operations on the
+        open file handle, which attempt to access the now-freed resource,
+        will fail gracefully (e.g., revocable_try_access() returns NULL)
+        instead of causing a UAF.
+
+In summary, `devres` ensures resources are *released* to prevent leaks, while
+the Revocable mechanism ensures that attempts to *access* these resources are
+done safely, even if the resource has been released.
+
+API and Usage
+=============
+
+For Resource Providers
+----------------------
+
+There are two ways to manage the resource provider handle (``struct revocable``):
+
+Dynamic Allocation
+~~~~~~~~~~~~~~~~~~
+
+If the lifetime of the ``struct revocable`` is not tied to another specific
+kernel object, or if multiple independent consumers need to hold references,
+dynamic allocation should be used.
+
+*   **Creation:** Use revocable_alloc() to allocate and initialize.
+*   **Ownership:** The caller receives a reference, and the provider holds
+    another.
+*   **Revocation:** Call revocable_revoke() when the resource is going away.
+    This drops the provider's reference.
+*   **Cleanup:** The caller *must* call revocable_put() to release its reference
+    when it no longer needs the handle.  The memory is freed automatically when
+    the last reference is dropped.
+
+Embedded Allocation
+~~~~~~~~~~~~~~~~~~~
+
+If the ``struct revocable`` can be embedded within a parent kernel object
+(e.g., a foo_device struct), this method can be simpler as the lifetime is
+inherently tied to the parent.
+
+*   **Initialization:** Declare a ``struct revocable`` within your parent
+    structure and initialize it with revocable_init().
+*   **Ownership:** The caller receives a reference, and the provider holds
+    another.
+*   **Revocation:** Call revocable_revoke() when the resource is going away.
+    This drops the provider's reference.
+*   **Cleanup:** The owner *must* call revocable_put() during the parent
+    object's teardown process and ensure that no more consumers can access
+    it.  This cleans up internal resources like the SRCU domain.  The memory
+    for the ``struct revocable`` is freed when the parent object is freed.
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_alloc
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_init
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_revoke
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_get
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_put
+
+Example Usage (Dynamic Allocation)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    struct foo_device {
+        struct revocable *rev;
+        ...
+    };
+
+    int foo_device_probe(struct device *dev)
+    {
+        struct foo_device *foo_dev;
+        void *res;
+        int ret;
+
+        foo_dev = devm_kzalloc(dev, sizeof(*foo_dev), GFP_KERNEL);
+        if (!foo_dev)
+            return -ENOMEM;
+
+        // Acquire the actual resource.
+        res = ...(...);
+
+        // Allocate the revocable handle.
+        foo_dev->rev = revocable_alloc(res);
+        if (!foo_dev->rev)
+            return -ENOMEM;
+
+        dev_set_drvdata(dev, foo_dev);
+        // ... further device setup ...
+        return 0;
+    }
+
+    void foo_device_remove(struct device *dev)
+    {
+        struct foo_device *foo_dev = dev_get_drvdata(dev);
+
+        // Drop the reference.
+        revocable_put(foo_dev->rev);
+    }
+
+    // Provider side would use revocable_revoke() on foo_dev->rev.
+    // Consumer side would use revocable_try_access_* macros on foo_dev->rev.
+
+Example Usage (Embedded Allocation)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    struct foo_device {
+        struct revocable rev;
+        ...
+    };
+
+    int foo_device_probe(struct device *dev)
+    {
+        struct foo_device *foo_dev;
+        void *res;
+        int ret;
+
+        foo_dev = devm_kzalloc(dev, sizeof(*foo_dev), GFP_KERNEL);
+        if (!foo_dev)
+            return -ENOMEM;
+
+        // Acquire the actual resource.
+        res = ...(...);
+
+        // Initialize the embedded revocable.
+        ret = revocable_init(&foo_dev->rev, res);
+        if (ret)
+            return ret;
+
+        dev_set_drvdata(dev, foo_dev);
+        // ... further device setup ...
+        return 0;
+    }
+
+    void foo_device_remove(struct device *dev)
+    {
+        struct foo_device *foo_dev = dev_get_drvdata(dev);
+
+        // Cleanup the embedded revocable internal state.
+        revocable_put(&foo_dev->rev);
+    }
+
+    // Provider side would use revocable_revoke() on &foo_dev->rev.
+    // Consumer side would use revocable_try_access_* macros on &foo_dev->rev.
+
+For Resource Consumers
+----------------------
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_handle
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_handle_init
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_handle_deinit
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_try_access
+
+.. kernel-doc:: drivers/base/revocable.c
+   :identifiers: revocable_withdraw_access
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_with
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        revocable_try_access_with(rev, res);
+        // Always check if the resource is valid.
+        if (!res) {
+            pr_warn("Resource is not available\n");
+            return -EAGAIN;
+        }
+
+        // 'res' is guaranteed to be valid until this function exits.
+        do_something_with(res);
+        return 0;
+    } // revocable_withdraw_access() is automatically called here.
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_or_return_err
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        // Returns -ENXIO if access fails.
+        revocable_try_access_or_return_err(rev, res, -ENXIO);
+
+        // 'res' is guaranteed to be valid if we reach here.
+        do_something_with(res);
+        return 0;
+    } // revocable_withdraw_access() is automatically called here.
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_or_return
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        // Returns -ENODEV if access fails.
+        revocable_try_access_or_return(rev, res);
+
+        // 'res' is guaranteed to be valid if we reach here.
+        do_something_with(res);
+        return 0;
+    } // revocable_withdraw_access() is automatically called here.
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_with_scoped
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        revocable_try_access_with_scoped(rev, res) {
+            // Always check if the resource is valid.
+            if (!res) {
+                pr_warn("Resource is not available\n");
+                return -EAGAIN;
+            }
+
+            // 'res' is valid for the rest of this block.
+            do_something_with(res);
+        }
+        // revocable_withdraw_access() is automatically called here.
+
+        return 0;
+    }
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_or_return_err_scoped
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        // Returns -ENXIO if access fails.
+        revocable_try_access_or_return_err_scoped(rev, res, -ENXIO) {
+            // 'res' is guaranteed to be valid in this block.
+            do_something_with(res);
+        }
+        // revocable_withdraw_access() is automatically called here.
+
+        return 0; // Only reached if resource was accessed.
+    }
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_or_return_scoped
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        // Returns -ENODEV if access fails.
+        revocable_try_access_or_return_scoped(rev, res) {
+            // 'res' is guaranteed to be valid in this block.
+            do_something_with(res);
+        }
+        // revocable_withdraw_access() is automatically called here.
+
+        return 0; // Only reached if resource was accessed.
+    }
+
+.. kernel-doc:: include/linux/revocable.h
+   :identifiers: revocable_try_access_or_skip_scoped
+
+Example Usage
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int consumer_use_resource(struct revocable *rev)
+    {
+        struct foo_resource *res;
+
+        revocable_try_access_or_skip_scoped(rev, res) {
+            // This block is ONLY entered if 'res' is not NULL.
+            do_something_with(res);
+        }
+        // revocable_withdraw_access() is automatically called here.
+
+        return 0;
+    }
diff --git a/MAINTAINERS b/MAINTAINERS
index b2040011a386..424847de7a17 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7813,6 +7813,7 @@ F:	include/linux/fwnode.h
 F:	include/linux/kobj*
 F:	include/linux/ksysfs.h
 F:	include/linux/property.h
+F:	include/linux/revocable.h
 F:	include/linux/sysfs.h
 F:	kernel/ksysfs.c
 F:	lib/kobj*
@@ -22862,6 +22863,14 @@ F:	include/uapi/linux/rseq.h
 F:	kernel/rseq.c
 F:	tools/testing/selftests/rseq/
 
+REVOCABLE RESOURCE MANAGEMENT
+M:	Tzung-Bi Shih <tzungbi@kernel.org>
+L:	driver-core@lists.linux.dev
+S:	Maintained
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
+F:	drivers/base/revocable.c
+F:	include/linux/revocable.h
+
 RFKILL
 M:	Johannes Berg <johannes@sipsolutions.net>
 L:	linux-wireless@vger.kernel.org
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 8074a10183dc..bdf854694e39 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -6,7 +6,7 @@ obj-y			:= component.o core.o bus.o dd.o syscore.o \
 			   cpu.o firmware.o init.o map.o devres.o \
 			   attribute_container.o transport_class.o \
 			   topology.o container.o property.o cacheinfo.o \
-			   swnode.o faux.o
+			   swnode.o faux.o revocable.o
 obj-$(CONFIG_AUXILIARY_BUS) += auxiliary.o
 obj-$(CONFIG_DEVTMPFS)	+= devtmpfs.o
 obj-y			+= power/
diff --git a/drivers/base/revocable.c b/drivers/base/revocable.c
new file mode 100644
index 000000000000..3fb747d749ab
--- /dev/null
+++ b/drivers/base/revocable.c
@@ -0,0 +1,267 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2026 Google LLC
+ *
+ * Revocable resource management
+ */
+
+#include <linux/kref.h>
+#include <linux/revocable.h>
+#include <linux/slab.h>
+#include <linux/srcu.h>
+
+/**
+ * DOC: Overview
+ *
+ * The "revocable" mechanism is a synchronization primitive designed to
+ * manage safe access to resources that can be asynchronously removed or
+ * invalidated.  Its primary purpose is to prevent Use-After-Free (UAF)
+ * errors when interacting with resources whose lifetimes are not
+ * guaranteed to outlast their consumers.
+ *
+ * This is particularly useful in systems where resources can disappear
+ * unexpectedly, such as those provided by hot-pluggable devices like
+ * USB.  When a consumer holds a reference to such a resource, the
+ * underlying device might be removed, causing the resource's memory to
+ * be freed.  Subsequent access attempts by the consumer would then lead
+ * to UAF errors.
+ *
+ * Revocable addresses this by providing a form of "weak reference" and
+ * a controlled access method.  It allows a resource consumer to safely
+ * attempt to access the resource.  The mechanism guarantees that any
+ * access granted is valid for the duration of its use.  If the resource
+ * has already been revoked (i.e., freed), the access attempt will fail
+ * safely, typically by returning NULL, instead of causing a crash.
+ *
+ * It uses a provider/consumer model built on Sleepable RCU (SRCU) to
+ * guarantee safe memory access:
+ *
+ * - A resource provider, such as a driver for a hot-pluggable device,
+ *   allocates a struct revocable and initializes it with a pointer
+ *   to the resource.
+ *
+ * - A resource consumer that wants to access the resource allocates a
+ *   struct revocable_handle containing a reference to the provider.
+ *
+ * - To access the resource, the consumer uses revocable_try_access().
+ *   This function enters an SRCU read-side critical section and returns
+ *   the pointer to the resource.  If the provider has already freed the
+ *   resource, it returns NULL.  After use, the consumer calls
+ *   revocable_withdraw_access() to exit the SRCU critical section.  Macro
+ *   helpers such as revocable_try_access_with() wrap this pairing.
+ *
+ *   The API provides the following contract:
+ *
+ *   - revocable_try_access() can be safely called from both process and
+ *     atomic contexts.
+ *   - It is permitted to sleep within the critical section established
+ *     between revocable_try_access() and revocable_withdraw_access().
+ *   - revocable_try_access() and the matching revocable_withdraw_access()
+ *     must occur in the same context.  For example, it is illegal to
+ *     invoke revocable_withdraw_access() in an irq handler if the matching
+ *     revocable_try_access() was invoked in process context.
+ *
+ * - When the provider needs to remove the resource, it calls
+ *   revocable_revoke().  This function sets the internal resource
+ *   pointer to NULL and then calls synchronize_srcu() to wait for all
+ *   current readers to finish before the resource can be completely torn
+ *   down.
+ */
+
+static void revocable_release(struct kref *kref)
+{
+	struct revocable *rev = container_of(kref, typeof(*rev), kref);
+
+	cleanup_srcu_struct(&rev->srcu);
+
+	if (!rev->embedded)
+		kfree(rev);
+}
+
+/**
+ * revocable_alloc() - Allocate struct revocable.
+ * @res: The pointer of resource.
+ *
+ * This allocates a resource provider handle and holds 2 initial reference
+ * counts to the handle.  If revocable_alloc() succeeds:
+ *
+ * - The provider should call revocable_revoke() for dropping a reference.
+ * - The caller should call revocable_put() for dropping another reference.
+ *
+ * Return: The pointer of struct revocable.  NULL on errors.
+ */
+struct revocable *revocable_alloc(void *res)
+{
+	struct revocable *rev;
+	int ret;
+
+	rev = kzalloc_obj(*rev);
+	if (!rev)
+		return NULL;
+
+	ret = revocable_init(rev, res);
+	if (ret) {
+		kfree(rev);
+		return NULL;
+	}
+
+	rev->embedded = false;
+	return rev;
+}
+EXPORT_SYMBOL_GPL(revocable_alloc);
+
+/**
+ * revocable_init() - Initialize struct revocable.
+ * @rev: The pointer of resource provider.
+ * @res: The pointer of resource.
+ *
+ * This initializes a resource provider handle embedded within another
+ * structure and holds 2 initial reference counts to the handle.
+ *
+ * If revocable_init() succeeds:
+ *
+ * - The provider should call revocable_revoke() for dropping a reference.
+ * - The caller should call revocable_put() for dropping another reference.
+ */
+int revocable_init(struct revocable *rev, void *res)
+{
+	int ret;
+
+	ret = init_srcu_struct(&rev->srcu);
+	if (ret)
+		return ret;
+
+	RCU_INIT_POINTER(rev->res, res);
+	kref_init(&rev->kref);
+	kref_get(&rev->kref);
+	rev->embedded = true;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(revocable_init);
+
+/**
+ * revocable_revoke() - Revoke the managed resource.
+ * @rev: The pointer of resource provider.
+ *
+ * This sets the resource `(struct revocable *)->res` to NULL to indicate
+ * the resource has gone.
+ *
+ * This drops a reference count to the resource provider.  If it is the
+ * final reference, revocable_release() will be called to free the internal
+ * resources.
+ */
+void revocable_revoke(struct revocable *rev)
+{
+	rcu_assign_pointer(rev->res, NULL);
+	synchronize_srcu(&rev->srcu);
+	revocable_put(rev);
+}
+EXPORT_SYMBOL_GPL(revocable_revoke);
+
+/**
+ * revocable_get() - Increase a reference count to the provider handle.
+ * @rev: The pointer of resource provider.
+ *
+ * This increments the reference count.
+ */
+void revocable_get(struct revocable *rev)
+{
+	kref_get(&rev->kref);
+}
+EXPORT_SYMBOL_GPL(revocable_get);
+
+/**
+ * revocable_put() - Decrease a reference count to the provider handle.
+ * @rev: The pointer of resource provider.
+ *
+ * This drops a reference count to the resource provider.  If it is the
+ * final reference, revocable_release() will be called to free the internal
+ * resources.
+ */
+void revocable_put(struct revocable *rev)
+{
+	kref_put(&rev->kref, revocable_release);
+}
+EXPORT_SYMBOL_GPL(revocable_put);
+
+/**
+ * revocable_handle_init() - Initialize struct revocable_handle.
+ * @rev: The pointer of resource provider.
+ * @rh: The pointer of revocable_handle.
+ *
+ * This initializes a handle owned by the consumer and holds a reference
+ * count to the resource provider.
+ */
+void revocable_handle_init(struct revocable *rev, struct revocable_handle *rh)
+{
+	revocable_get(rev);
+	rh->rev = rev;
+}
+EXPORT_SYMBOL_GPL(revocable_handle_init);
+
+/**
+ * revocable_handle_deinit() - Deinitialize struct revocable_handle.
+ * @rh: The pointer of revocable_handle.
+ *
+ * This drops a reference count to the resource provider.  If it is the
+ * final reference, revocable_release() will be called to free the internal
+ * resources.
+ */
+void revocable_handle_deinit(struct revocable_handle *rh)
+{
+	struct revocable *rev = rh->rev;
+
+	revocable_put(rev);
+}
+EXPORT_SYMBOL_GPL(revocable_handle_deinit);
+
+/**
+ * revocable_try_access() - Try to access the resource.
+ * @rh: The pointer of revocable_handle.
+ *
+ * This enters an SRCU read-side critical section and dereferences the
+ * resource pointer.
+ *
+ * The function is safe to be called from both process and atomic contexts.
+ * While holding the access (i.e. before calling revocable_withdraw_access()),
+ * the caller is allowed to sleep.
+ *
+ * Note that revocable_try_access() and the matching
+ * revocable_withdraw_access() must occur in the same context.  For example, it
+ * is illegal to invoke revocable_withdraw_access() in an irq handler if the
+ * matching revocable_try_access() was invoked in process context.
+ *
+ * Return: The pointer to the resource.  NULL if the resource has gone.
+ */
+void *revocable_try_access(struct revocable_handle *rh)
+	__acquires(&rh->rev->srcu)
+{
+	struct revocable *rev = rh->rev;
+
+	rh->idx = srcu_read_lock(&rev->srcu);
+	return srcu_dereference(rev->res, &rev->srcu);
+}
+EXPORT_SYMBOL_GPL(revocable_try_access);
+
+/**
+ * revocable_withdraw_access() - Stop accessing the resource.
+ * @rh: The pointer of revocable_handle.
+ *
+ * Call this function to indicate the resource is no longer used.  It exits
+ * the SRCU critical section.
+ *
+ * The function is safe to be called from both process and atomic contexts.
+ *
+ * Note that revocable_try_access() and the matching
+ * revocable_withdraw_access() must occur in the same context.  For example, it
+ * is illegal to invoke revocable_withdraw_access() in an irq handler if the
+ * matching revocable_try_access() was invoked in process context.
+ */
+void revocable_withdraw_access(struct revocable_handle *rh)
+	__releases(&rh->rev->srcu)
+{
+	struct revocable *rev = rh->rev;
+
+	srcu_read_unlock(&rev->srcu, rh->idx);
+}
+EXPORT_SYMBOL_GPL(revocable_withdraw_access);
diff --git a/include/linux/revocable.h b/include/linux/revocable.h
new file mode 100644
index 000000000000..b66d41b92ee5
--- /dev/null
+++ b/include/linux/revocable.h
@@ -0,0 +1,204 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2026 Google LLC
+ */
+
+#ifndef __LINUX_REVOCABLE_H
+#define __LINUX_REVOCABLE_H
+
+#include <linux/cleanup.h>
+#include <linux/kref.h>
+#include <linux/srcu.h>
+
+/**
+ * struct revocable - A handle for resource provider.
+ * @srcu: The SRCU to protect the resource.
+ * @res:  The pointer of resource.  It can point to anything.
+ * @kref: The refcount for this handle.
+ * @embedded: Indicate if the handle is embedded in another struct.
+ *
+ * Note: All members of this structure are intended to be opaque and should
+ * not be accessed directly by the users.
+ */
+struct revocable {
+	struct srcu_struct srcu;
+	void __rcu *res;
+	struct kref kref;
+	bool embedded;
+};
+
+/**
+ * struct revocable_handle - A handle for resource consumer.
+ * @rev: The pointer of resource provider.
+ * @idx: The index for the SRCU critical section.
+ *
+ * Note: All members of this structure are intended to be opaque and should
+ * not be accessed directly by the users.
+ */
+struct revocable_handle {
+	struct revocable *rev;
+	int idx;
+};
+
+struct revocable *revocable_alloc(void *res);
+int revocable_init(struct revocable *rev, void *res);
+void revocable_revoke(struct revocable *rev);
+void revocable_get(struct revocable *rev);
+void revocable_put(struct revocable *rev);
+
+void revocable_handle_init(struct revocable *rev, struct revocable_handle *rh);
+void revocable_handle_deinit(struct revocable_handle *rh);
+void *revocable_try_access(struct revocable_handle *rh)
+	__acquires(&rh->rev->srcu);
+void revocable_withdraw_access(struct revocable_handle *rh)
+	__releases(&rh->rev->srcu);
+
+DEFINE_FREE(access_rev, struct revocable_handle *, {
+	revocable_withdraw_access(_T);
+	revocable_handle_deinit(_T);
+})
+
+#define _revocable_try_access_with(_rev, _rh, _res)				\
+	struct revocable_handle _rh;						\
+	struct revocable_handle *__UNIQUE_ID(name) __free(access_rev) = &_rh;	\
+										\
+	revocable_handle_init(_rev, &_rh);					\
+	_res = revocable_try_access(&_rh)
+
+/**
+ * revocable_try_access_with() - A helper for accessing revocable resource
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * The macro simplifies the access-release cycle for consumers, ensuring that
+ * corresponding revocable_withdraw_access() and revocable_handle_deinit() are
+ * called, even in the case of an early exit.
+ *
+ * It creates a local variable in the current scope.  @_res is populated with
+ * the result of revocable_try_access().  Callers **must** check if @_res is
+ * ``NULL`` before using it.  The revocable_withdraw_access() function is
+ * automatically called when the scope is exited.
+ *
+ * Note: It shares the same issue with guard() in cleanup.h.  No goto statements
+ * are allowed before the helper.  Otherwise, the compiler fails with
+ * "jump bypasses initialization of variable with __attribute__((cleanup))".
+ */
+#define revocable_try_access_with(_rev, _res)					\
+	_revocable_try_access_with(_rev, __UNIQUE_ID(name), _res)
+
+/**
+ * revocable_try_access_or_return_err() - Variant of revocable_try_access_with()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ * @_err: The error code to return if resource is revoked.
+ *
+ * Similar to revocable_try_access_with() but returns from the current function
+ * with @_err if the resource is revoked.  Callers don't need to check @_res for
+ * ``NULL`` as this handles the revocation case by returning early.
+ */
+#define revocable_try_access_or_return_err(_rev, _res, _err)			\
+	_revocable_try_access_with(_rev, __UNIQUE_ID(name), _res);		\
+	if (!_res)								\
+		return _err
+
+/**
+ * revocable_try_access_or_return() - Variant of revocable_try_access_with()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_or_return_err() but returns -ENODEV if the
+ * resource is revoked.
+ */
+#define revocable_try_access_or_return(_rev, _res)				\
+	revocable_try_access_or_return_err(_rev, _res, -ENODEV)
+
+/**
+ * revocable_try_access_or_return_void() - Variant of revocable_try_access_with()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_or_return_err() but returns void if the
+ * resource is revoked.
+ */
+#define revocable_try_access_or_return_void(_rev, _res)				\
+	revocable_try_access_or_return_err(_rev, _res, )
+
+#define _revocable_try_access_with_scoped(_rev, _rh, _label, _res)		\
+	for (struct revocable_handle _rh,					\
+			*__UNIQUE_ID(name) __free(access_rev) = &_rh;		\
+	     ({ revocable_handle_init(_rev, &_rh);				\
+		_res = revocable_try_access(&_rh);				\
+		true; });							\
+	     ({ goto _label; }))						\
+		if (0) {							\
+_label:										\
+			break;							\
+		} else
+
+/**
+ * revocable_try_access_with_scoped() - Variant of revocable_try_access_with()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_with() but with an explicit scope from a
+ * temporary ``for`` loop.
+ */
+#define revocable_try_access_with_scoped(_rev, _res)				\
+	_revocable_try_access_with_scoped(_rev, __UNIQUE_ID(name),		\
+					  __UNIQUE_ID(label), _res)
+
+/**
+ * revocable_try_access_or_return_err_scoped() - Variant of revocable_try_access_with_scoped()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ * @_err: The error code to return if resource is revoked.
+ *
+ * Similar to revocable_try_access_with_scoped() but returns from the current
+ * function with @_err if the resource is revoked.  Callers don't need to check
+ * @_res for ``NULL`` as this handles the revocation case by returning early.
+ */
+#define revocable_try_access_or_return_err_scoped(_rev, _res, _err)		\
+	_revocable_try_access_with_scoped(_rev, __UNIQUE_ID(name),		\
+					  __UNIQUE_ID(label), _res)		\
+	if (!_res) {								\
+		return _err;							\
+	} else
+
+/**
+ * revocable_try_access_or_return_scoped() - Variant of revocable_try_access_with_scoped()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_or_return_err_scoped() but returns -ENODEV
+ * if the resource is revoked.
+ */
+#define revocable_try_access_or_return_scoped(_rev, _res)			\
+	revocable_try_access_or_return_err_scoped(_rev, _res, -ENODEV)
+
+/**
+ * revocable_try_access_or_return_void_scoped() - Variant of revocable_try_access_with_scoped()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_or_return_err_scoped() but returns void
+ * if the resource is revoked.
+ */
+#define revocable_try_access_or_return_void_scoped(_rev, _res)			\
+	revocable_try_access_or_return_err_scoped(_rev, _res, )
+
+/**
+ * revocable_try_access_or_skip_scoped() - Variant of revocable_try_access_with_scoped()
+ * @_rev: The pointer of resource provider.
+ * @_res: A pointer variable that will be assigned the resource.
+ *
+ * Similar to revocable_try_access_with_scoped() but skips the following code
+ * block if the resource is revoked.
+ */
+#define revocable_try_access_or_skip_scoped(_rev, _res)				\
+	_revocable_try_access_with_scoped(_rev, __UNIQUE_ID(name),		\
+					  __UNIQUE_ID(label), _res)		\
+	if (!_res) {								\
+		break;								\
+	} else
+
+#endif /* __LINUX_REVOCABLE_H */
-- 
2.51.0


^ permalink raw reply related

* [PATCH v11 0/5] drivers/base: Introduce revocable
From: Tzung-Bi Shih @ 2026-05-13  9:10 UTC (permalink / raw)
  To: Arnd Bergmann, Greg Kroah-Hartman, Bartosz Golaszewski,
	Linus Walleij
  Cc: Benson Leung, tzungbi, linux-kernel, chrome-platform, driver-core,
	linux-doc, linux-gpio, Rafael J. Wysocki, Danilo Krummrich,
	Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
	Jason Gunthorpe, Johan Hovold, Paul E . McKenney

This series introduces the "revocable" mechanism, a synchronization
primitive designed to prevent Use-After-Free errors.

- Patch 1 introduces the revocable which is an implementation of ideas
  from the talk [1].

- Patch 2 adds KUnit test cases.

- Patch 3 transitions the UAF prevention logic within the GPIO core
  (gpiolib) to use the "revocable" mechanism.

  The existing code aims to prevent UAF issues when the underlying GPIO
  chip is removed.  This patch replaces that custom logic with the generic
  "revocable" API, which is designed to handle such lifecycle
  dependencies.  There should be no changes in behavior.

- Patches 4 to 5 use the "revocable" mechanism to fix a UAF in
  cros_ec_chardev driver.  Alternatively, [2] is a series for fixing the
  same issue without using "revocable".

Since v9, there are two ways to manage the resource provider handle.
- Embedded allocation: patch 3 might be the potential user.
- Dynamic allocation: patches 4 to 5 might be the potential user.

[1] https://lpc.events/event/17/contributions/1627/
[2] https://lore.kernel.org/all/20260427134659.95181-1-tzungbi@kernel.org
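
For readers who want to poke at the access/revoke contract outside the
kernel, here is a minimal userspace sketch.  It is not code from this
series: every model_* name is hypothetical, and a C11 atomic reader
counter with a busy-wait stands in for SRCU.  Reference counting and the
consumer handle are left out.  It only illustrates two things: a
consumer sees NULL after revocation, and revoke does not return while an
access window is still open.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct revocable. */
struct model_revocable {
	_Atomic(void *) res;	/* resource pointer, NULL once revoked */
	atomic_int readers;	/* consumers inside an access window */
};

static void model_init(struct model_revocable *rev, void *res)
{
	atomic_init(&rev->res, res);
	atomic_init(&rev->readers, 0);
}

/* Models revocable_try_access(): open a window, then read the pointer. */
static void *model_try_access(struct model_revocable *rev)
{
	atomic_fetch_add(&rev->readers, 1);
	return atomic_load(&rev->res);
}

/* Models revocable_withdraw_access(): close the window. */
static void model_withdraw_access(struct model_revocable *rev)
{
	atomic_fetch_sub(&rev->readers, 1);
}

/*
 * Models revocable_revoke(): clear the pointer first, then wait for open
 * windows to close.  The kernel sleeps in synchronize_srcu() and waits
 * only for pre-existing readers; this model busy-waits until the counter
 * reaches zero, so it also waits for readers that arrived later.
 */
static void model_revoke(struct model_revocable *rev)
{
	atomic_store(&rev->res, NULL);
	while (atomic_load(&rev->readers) != 0)
		;
}

/* A consumer: copy out an int, or fail with -ENODEV once revoked. */
static int model_read_value(struct model_revocable *rev, int *out)
{
	int *val = model_try_access(rev);
	int ret = -ENODEV;

	if (val) {
		*out = *val;
		ret = 0;
	}
	/* Always withdraw, even when the resource is already gone. */
	model_withdraw_access(rev);
	return ret;
}
```

The consumer withdraws access on both paths, which mirrors what the
cleanup-based helper macros in the header arrange automatically.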

---
v11:
- Rebase onto v7.1-rc3.
- Squash patches 4 to 7 into patch 3.  A single patch for GPIO.

v10: https://lore.kernel.org/all/20260508105448.31799-1-tzungbi@kernel.org
- Unify handling of embedded and dynamic allocation.

v9: https://lore.kernel.org/all/20260427135841.96266-1-tzungbi@kernel.org
- Rebase onto v7.1-rc1.
- Remove the selftests patch as it makes less sense to test revocable
  APIs via kselftests.
- Merge patches 7 to 11 from
  https://lore.kernel.org/all/20260213092958.864411-1-tzungbi@kernel.org
  into the series.
- Merge patch from
  https://lore.kernel.org/all/20250923075302.591026-5-tzungbi@kernel.org
- Merge patch from
  https://lore.kernel.org/all/20250912081718.3827390-6-tzungbi@kernel.org

v8: https://lore.kernel.org/all/20260213092307.858908-1-tzungbi@kernel.org
- Rework on the revocable APIs.  See changelog in [PATCH v8 1/3] for details.

v7: https://lore.kernel.org/all/20260116080235.350305-1-tzungbi@kernel.org
- Rebase onto next-20260115.

v6: https://lore.kernel.org/all/20251106152330.11733-1-tzungbi@kernel.org
- Rebase onto next-20251106.
- Separate revocable core and use cases.

v5: https://lore.kernel.org/all/20251016054204.1523139-1-tzungbi@kernel.org
- Rebase onto next-20251015.
- Add more context about the PoC.
- Support multiple revocable providers in the PoC.

v4: https://lore.kernel.org/all/20250923075302.591026-1-tzungbi@kernel.org
- Rebase onto next-20250922.
- Remove the 5th patch from v3.
- Add fops replacement PoC in 5th - 7th patches.

v3: https://lore.kernel.org/all/20250912081718.3827390-1-tzungbi@kernel.org
- Rebase onto https://lore.kernel.org/all/20250828083601.856083-1-tzungbi@kernel.org
  and next-20250912.
- The 4th patch changed accordingly.

v2: https://lore.kernel.org/all/20250820081645.847919-1-tzungbi@kernel.org
- Rename "ref_proxy" -> "revocable".
- Add test cases in KUnit and selftest.

v1: https://lore.kernel.org/all/20250814091020.1302888-1-tzungbi@kernel.org

Tzung-Bi Shih (5):
  revocable: Revocable resource management
  revocable: Add KUnit test cases
  gpio: Leverage revocable for accessing struct gpio_chip
  platform/chrome: Protect cros_ec_device lifecycle with revocable
  platform/chrome: cros_ec_chardev: Consume cros_ec_device via revocable

 .../driver-api/driver-model/index.rst         |   1 +
 .../driver-api/driver-model/revocable.rst     | 384 +++++++++++++++++
 MAINTAINERS                                   |  10 +
 drivers/base/Makefile                         |   2 +-
 drivers/base/revocable.c                      | 267 ++++++++++++
 drivers/base/test/Kconfig                     |   5 +
 drivers/base/test/Makefile                    |   2 +
 drivers/base/test/revocable-test.c            | 406 ++++++++++++++++++
 drivers/gpio/gpiolib-cdev.c                   |  77 ++--
 drivers/gpio/gpiolib-sysfs.c                  |  31 +-
 drivers/gpio/gpiolib.c                        | 263 +++++-------
 drivers/gpio/gpiolib.h                        |  28 +-
 drivers/platform/chrome/cros_ec.c             |  11 +
 drivers/platform/chrome/cros_ec_chardev.c     |  80 +++-
 include/linux/platform_data/cros_ec_proto.h   |   3 +
 include/linux/revocable.h                     | 204 +++++++++
 16 files changed, 1505 insertions(+), 269 deletions(-)
 create mode 100644 Documentation/driver-api/driver-model/revocable.rst
 create mode 100644 drivers/base/revocable.c
 create mode 100644 drivers/base/test/revocable-test.c
 create mode 100644 include/linux/revocable.h

-- 
2.51.0


^ permalink raw reply

* [PATCH v7 08/43] fscrypt: add documentation about extent encryption
From: Daniel Vacek @ 2026-05-13  8:52 UTC (permalink / raw)
  To: Chris Mason, Josef Bacik, Eric Biggers, Theodore Y. Ts'o,
	Jaegeuk Kim, Jens Axboe, David Sterba, Jonathan Corbet,
	Shuah Khan
  Cc: linux-block, Daniel Vacek, linux-fscrypt, linux-btrfs,
	linux-kernel, linux-doc
In-Reply-To: <20260513085340.3673127-1-neelx@suse.com>

From: Josef Bacik <josef@toxicpanda.com>

Add a couple of sections to the fscrypt documentation about per-extent
encryption.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Daniel Vacek <neelx@suse.com>
---

v7 changes:
 * Fix spelling and typos.
No changes in v6.
v5: https://lore.kernel.org/linux-btrfs/7b2cc4dd423c3930e51b1ef5dd209164ff11c05a.1706116485.git.josef@toxicpanda.com/
---
 Documentation/filesystems/fscrypt.rst | 41 +++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index c0dd35f1af12..a1b0b50da869 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -283,6 +283,21 @@ alternative master keys or to support rotating master keys.  Instead,
 the master keys may be wrapped in userspace, e.g. as is done by the
 `fscrypt <https://github.com/google/fscrypt>`_ tool.
 
+Per-extent encryption keys
+--------------------------
+
+For certain file systems, such as btrfs, it's desired to derive a
+per-extent encryption key.  This is to enable features such as snapshots
+and reflink, where you could have different inodes pointing at the same
+extent.  When a new extent is created fscrypt randomly generates a
+16-byte nonce and the file system stores it alongside the extent.
+Then, it uses a KDF (as described in `Key derivation function`_) to
+derive the extent's key from the master key and nonce.
+
+Currently the inode's master key and encryption policy must match the
+extent, so you cannot share extents between inodes that were encrypted
+differently.
+
 DIRECT_KEY policies
 -------------------
 
@@ -1483,6 +1498,27 @@ by the kernel and is used as KDF input or as a tweak to cause
 different files to be encrypted differently; see `Per-file encryption
 keys`_ and `DIRECT_KEY policies`_.
 
+Extent encryption context
+-------------------------
+
+The extent encryption context mirrors the important parts of the above
+`Encryption context`_, with a few omissions.  The struct is defined as
+follows::
+
+        struct fscrypt_extent_context {
+                u8 version;
+                u8 encryption_mode;
+                u8 master_key_identifier[FSCRYPT_KEY_IDENTIFIER_SIZE];
+                u8 nonce[FSCRYPT_FILE_NONCE_SIZE];
+        };
+
+Currently all fields must match the containing inode's encryption
+context, with the exception of the nonce.
+
+Additionally extent encryption is only supported with
+FSCRYPT_EXTENT_CONTEXT_V2 using the standard policy; all other policies
+are disallowed.
+
 Data path changes
 -----------------
 
@@ -1506,6 +1542,11 @@ buffer.  Some filesystems, such as UBIFS, already use temporary
 buffers regardless of encryption.  Other filesystems, such as ext4 and
 F2FS, have to allocate bounce pages specially for encryption.
 
+Inline encryption is not optional for extent encryption based file
+systems; the number of objects required to be kept around is too much.
+Inline encryption handles the object lifetime details which results in a
+cleaner implementation.
+
 Filename hashing and encoding
 -----------------------------
 
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH] Documentation: KVM: Document guest-visible compatibility expectations
From: Marc Zyngier @ 2026-05-13  8:42 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: David Woodhouse, Jonathan Corbet, Shuah Khan, kvm, linux-doc,
	linux-kernel, Sean Christopherson, Jim Mattson, Oliver Upton,
	Joey Gouly, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
	Will Deacon, Raghavendra Rao Ananta, Eric Auger, Kees Cook,
	Arnd Bergmann, Nathan Chancellor, linux-arm-kernel, kvmarm,
	linux-kselftest
In-Reply-To: <baff82ca-6321-4b16-aa61-b2d6d60b6535@redhat.com>

On Mon, 11 May 2026 17:56:15 +0100,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 5/11/26 18:38, David Woodhouse wrote:
> > Not *everything* is in CPUID; one recent exception that comes to mind
> > is the SUPPRESS_EOI_BROADCAST quirk. But on x86 we preserve the
> > existing behaviour of older kernels — even when that behaviour doesn't
> > make much sense, as with SUPPRESS_EOI_BROADCAST where older KVM would
> > *advertise* the feature, but not actually *implement* it. Nevertheless,
> > that remains the default behaviour of future kernels unless userspace
> > explicitly opts in to fully enable (or disable) the feature.
> > 
> > But this documentation update isn't even asking for that compatible-by-
> > default behaviour, even though that is the right thing to do. It's only
> > asking that it be *possible* to reinstate the old behaviour, for
> > userspace that *knows* about the change and explicitly wants to go back
> > to the old way to remain compatible.
> 
> Yep, these are the "quirks"---if it's too early for Arm to commit to
> that, I guess it's fine.

Compatible by default means nothing, because userspace needs to
discover the combined capabilities of the host and KVM. This is not a
"CPU model" architecture.

If userspace is not a total joke, it will read all the ID registers,
and configure what it wants to see, assuming it is a feature that can
be configured (not everything can, because the architecture itself is
not fully backward compatible).

Yes, this is buggy at times, because the combinatorial explosion of
CPU capabilities and supported features makes it pretty hard to test
(and really nobody actually does). But overall, it works, and QEMU is
growing an infrastructure to manage it in a "user friendly" way.

But really, this isn't what David is asking. He's demanding "bug for
bug" compatibility. For that, we have two possible cases:

- this is a behaviour that, while undesirable, is allowed by the
  architecture: fine, we preserve the behaviour and add another way to
  expose the one we really want. It is ugly, but we manage.

- this is a behaviour that is not allowed by the architecture: we fix
  it for good. We do that on every release. Some minor, some much more
  visible. And there is no way we will add this sort of "bring the
  bugs back" type of behaviour. Especially when it is really obvious
  that no SW can make any reasonable use of the defect. We allow
  userspace to keep behaving as before, but the guest will not see a
  non-compliant behaviour.

That being said, there is a way out of that: convince people in charge
of the architecture that the non-compliant KVM behaviour is actually
valuable, and deserves to be tolerated. This has happened before (VHE
only and NV2 only, just to name two recent changes).

Other terrible hacks (such as GICv3's GICD_TYPER.num_LPIs which KVM
doesn't support) were added at the request of cloud vendors that David
might be familiar with, so it isn't like it is a brand new process.

And once it is in the architecture, it becomes a behaviour that is
allowed to be exposed to a guest, for better or worse.

These are the rules we have followed since we started KVM/arm, and I
intend to stick to them.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply

* Re: [PATCH 0/3] Documentation/gpu: tables of contents cleanups and fixes
From: Jani Nikula @ 2026-05-13  8:20 UTC (permalink / raw)
  To: Jonathan Corbet, dri-devel, linux-doc, Maxime Ripard,
	Thomas Zimmermann, Maarten Lankhorst
  Cc: Randy Dunlap
In-Reply-To: <875x4uw3bo.fsf@trenco.lwn.net>

On Mon, 11 May 2026, Jonathan Corbet <corbet@lwn.net> wrote:
> Jani Nikula <jani.nikula@intel.com> writes:
>
>> On Fri, 08 May 2026, Jani Nikula <jani.nikula@intel.com> wrote:
>>> Make the GPU documentation slightly easier to navigate.
>>
>> Maxime, Thomas, Maarten, Jon -
>>
>> Any preferences which tree to merge this through? I'm thinking either
>> drm-misc-next or docs-next.
>
> Usually I stand back from DRM docs, expecting them to go through the DRM
> tree.  I can certainly pick up this set if that's best, but I wasn't
> expecting to.

Thanks. Since I got no reply from the others, and thought this was
benign enough, I went ahead and pushed this to drm-misc-next.

Also thanks Randy for having a look at it!

BR,
Jani.


-- 
Jani Nikula, Intel

^ permalink raw reply

* Re: [PATCH] docs: reporting-issues: clarify advice wording
From: Thorsten Leemhuis @ 2026-05-13  8:08 UTC (permalink / raw)
  To: Chen-Shi-Hong; +Cc: corbet, skhan, linux-doc, linux-kernel
In-Reply-To: <20260512150431.894-1-eric039eric@gmail.com>

On 5/12/26 17:04, Chen-Shi-Hong wrote:
> A previous change

You mean the one you sent earlier that we discussed? But that was not
applied, unless I'm missing something. So this should have been a v2 of
your patch (with a changelog) and against the same base (the one that
still has "these advices"). You want to send that as v3 now.

> replaced "these advices" with "this advice", but that
> wording can be read too narrowly and may seem to refer only to a single
> recommendation.
> 
> Use "all of this advice" instead

That's great, thx. And thx to Jonathan for the suggestion!

Ciao, Thorsten

> to make it clearer that the sentence
> refers to the broader set of recommendations in the paragraph.
> 
> Signed-off-by: Chen-Shi-Hong <eric039eric@gmail.com>
> ---
>  Documentation/admin-guide/reporting-issues.rst | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/reporting-issues.rst b/Documentation/admin-guide/reporting-issues.rst
> index 731865b5e8ff..87dd874fffcf 100644
> --- a/Documentation/admin-guide/reporting-issues.rst
> +++ b/Documentation/admin-guide/reporting-issues.rst
> @@ -129,7 +129,7 @@ After these preparations you'll now enter the main part:
>     situations; during the merge window that actually might be even the best
>     approach, but in that development phase it can be an even better idea to
>     suspend your efforts for a few days anyway. Whatever version you choose,
> -   ideally use a 'vanilla' build. Ignoring this advice will dramatically
> +   ideally use a 'vanilla' build. Ignoring all of this advice will dramatically
>     increase the risk your report will be rejected or ignored.
>  
>   * Ensure the kernel you just installed does not 'taint' itself when
> @@ -795,7 +795,7 @@ Install a fresh kernel for testing
>      situations; during the merge window that actually might be even the best
>      approach, but in that development phase it can be an even better idea to
>      suspend your efforts for a few days anyway. Whatever version you choose,
> -    ideally use a 'vanilla' built. Ignoring this advice will dramatically
> +    ideally use a 'vanilla' built. Ignoring all of this advice will dramatically
>      increase the risk your report will be rejected or ignored.*
>  
>  As mentioned in the detailed explanation for the first step already: Like most


^ permalink raw reply

* Re: [PATCH 2/3] mm/zswap: Implement proactive writeback
From: Hao Jia @ 2026-05-13  8:04 UTC (permalink / raw)
  To: Nhat Pham
  Cc: Yosry Ahmed, akpm, tj, hannes, shakeel.butt, mhocko, mkoutny,
	chengming.zhou, muchun.song, roman.gushchin, cgroups, linux-mm,
	linux-kernel, linux-doc, Hao Jia, Alexandre Ghiti
In-Reply-To: <CAKEwX=MOixJAUGiwUcMQa0Stvg-mR-MvpDRD8WA4YMtRvnUYTg@mail.gmail.com>



On 2026/5/12 23:47, Nhat Pham wrote:
> On Tue, May 12, 2026 at 2:32 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>>
>>
>>
>> On 2026/5/12 03:57, Yosry Ahmed wrote:
>>> On Mon, May 11, 2026 at 12:49 PM Nhat Pham <nphamcs@gmail.com> wrote:
>>>>
>>>> On Mon, May 11, 2026 at 3:52 AM Hao Jia <jiahao.kernel@gmail.com> wrote:
>>>>>
>>>>> From: Hao Jia <jiahao1@lixiang.com>
>>>>>
>>>>> Zswap currently writes back pages to backing swap devices reactively,
>>>>> triggered either by memory pressure via the shrinker or by the pool
>>>>> reaching its size limit. This reactive approach offers no precise
>>>>> control over when writeback happens, which can disturb latency-sensitive
>>>>> workloads, and it cannot direct writeback at a specific memory cgroup.
>>>>> However, there are scenarios where users might want to proactively
>>>>> write back cold pages from zswap to the backing swap device, for
>>>>> example, to free up memory for other applications or to prepare for
>>>>> upcoming memory-intensive workloads.
>>>>>
>>>>> Therefore, implement a proactive writeback mechanism for zswap by
>>>>> adding a new cgroup interface file memory.zswap.proactive_writeback
>>>>> within the memory controller.
>>>>
>>
>> Thanks Nhat, Yosry — let me address both comments together.
>>
>>>>
>>>> We already have memory.reclaim, no? Would that not work to create
>>>> headroom generally for your use case? Is there a reason why we are
>>>> treating zswap memory as special here?
>>>
>>
>> Apologies for the lack of detailed explanation in the patch description,
>> which led to the confusion.
>>
>> While we are already utilizing memory.reclaim, it does not fully address
>> our requirements.
>>
>> Our deployment runs a userspace proactive reclaimer that drives
>> memory.reclaim based on the system's runtime state (memory/CPU/IO
>> pressure, refault rate, ...) and workload-specific
>> policy. That first stage compresses cold anon pages into zswap. Entries
>> that then remain in zswap past a policy-defined age threshold are
>> considered "twice cold", and the reclaimer wants
>> to write them back to the backing swap device at a moment of its own
>> choosing, to further reclaim the DRAM still held by the compressed data.
>>
>> This is the "second-level offloading" pattern described in Meta's TMO
>> paper [1]. zswap proactive writeback is what this series introduces to
>> address that second-level offloading stage.
>>
>> [1] https://www.pdl.cmu.edu/ftp/NVM/tmo_asplos22.pdf
> 
> Yeah that's what we've been trying to work on as well :) We are
> working on a couple of improvements to the mechanism side of this path
> (cc Alex) - hopefully it will help your use case too!
> 
> Anyway, back to my original inquiry: I understand your use case. It's
> pretty similar to our goal. What I'm not getting is why is
> memory.reclaim (which you already use) not sufficient for zswap ->
> disk swap offloading too?
> 
> Zswap objects are organized into LRU and exposed to the shrinker
> interface. Echo-ing to memory.reclaim should also offload some zswap
> entries, correct? Are there still cold zswap entries that escape this,
> somehow?
> 

Yes, the memory.reclaim path does drive some zswap writeback, but
it is not enough for our case.

1. For a memcg that has reached steady state (a common case being
when memory.current is below the policy target), the userspace
reclaimer may not invoke memory.reclaim on it for a long time,
and so no second-level offloading happens through
memory.reclaim. In this state we want
memory.zswap.proactive_writeback to write back entries that
have sat in zswap past an age threshold, to further reclaim
the DRAM still held by the compressed data.

2. Even when memory.reclaim is running, the fraction of zswap
residency that ends up reaching the backing swap device is
still very small for many of our workloads, and the userspace
reclaimer has no way to participate in or control the
granularity of zswap writeback. So in our deployment we prefer
to leave the zswap shrinker disabled, decouple LRU -> zswap
from zswap -> swap, and use a dedicated proactive-writeback
interface that lifts the writeback policy into userspace where
it can evolve independently of the kernel.
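
To illustrate the decoupling described in the two points above, here is a minimal sketch of how a userspace reclaimer might decide on the two stages separately. Everything in it is hypothetical: the struct, the field names, and the helpers are invented for illustration, the thread does not say how the reclaimer learns entry ages, and the write format of memory.zswap.proactive_writeback is not specified here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-memcg snapshot kept by a userspace reclaimer. */
struct memcg_snapshot {
	uint64_t memory_current;     /* bytes, as read from memory.current */
	uint64_t policy_target;      /* bytes, the reclaimer's own target (assumption) */
	uint64_t zswap_bytes;        /* compressed bytes currently held in zswap */
	uint64_t oldest_entry_age_s; /* age of the coldest zswap entry (assumption) */
};

/* First stage (LRU -> zswap): only runs while the memcg is above target. */
bool wants_memory_reclaim(const struct memcg_snapshot *s)
{
	return s->memory_current > s->policy_target;
}

/*
 * Second stage (zswap -> swap): decided independently of the first stage,
 * so a memcg in steady state can still have old entries written back.
 * Returns the number of bytes the reclaimer would request, capped per batch.
 */
uint64_t zswap_writeback_request(const struct memcg_snapshot *s,
				 uint64_t age_threshold_s,
				 uint64_t max_batch_bytes)
{
	if (s->zswap_bytes == 0)
		return 0;
	if (s->oldest_entry_age_s < age_threshold_s)
		return 0;
	return s->zswap_bytes < max_batch_bytes ? s->zswap_bytes : max_batch_bytes;
}
```

In this sketch a memcg below its target never triggers the first stage, yet it can still produce a non-zero writeback request once its entries cross the age threshold, which is the steady-state case from point 1.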

Thanks,
Hao

> Furthermore, we already have a way to detect the "twice cold" entries
> you mentioned: the referenced bit. This is analogous to the way we
> treat uncompressed pages.
> 
>>
>>
>>> +1, why do we need to specifically proactively reclaim the compressed memory?
>>>
>>> Also, if we do need to minimize the compressed memory and force higher
>>> writeback rates, we can do so with memory.zswap.max, right?
>>
>> Here are a few reasons why memory.zswap.max is not enough:
>>
>> 1. Writing memory.zswap.max itself does not trigger any writeback
>> immediately. For a memcg that has reached steady state (on which the
>> userspace reclaimer is no longer invoking
>> memory.reclaim), after enough time has passed, the reclaimer has no good
>> way to trigger proactive writeback for second-level offloading by
>> lowering memory.zswap.max, because in steady
>> state nothing drives the zswap_store() -> shrink_memcg() path. The
>> userspace reclaimer still has no control over when proactive writeback
>> happens.
>>
>> 2. memory.zswap.max currently triggers zswap writeback via zswap_store()
>> -> shrink_memcg(), and each over-limit event can write back at most
>> NR_NODES entries. If zswap residency is far
>> above memory.zswap.max, converging to the target size requires at least
>> O(over-limit pages / NR_NODES) zswap_store() events, with no batching —
>> proactive writeback therefore has
>> significant latency.
>>
>> 3. memory.zswap.max is a stateful interface. If the userspace reclaimer
>> crashes for any reason mid-operation, it may leave memory.zswap.max at
>> some set value, putting the application in a persistently throttled
>> bad state.
>>
>> 4. Once the userspace reclaimer has lowered memory.zswap.max, if the
>> workload is rapidly expanding and triggers memory reclaim via
>> memory.high / kswapd / etc., the actual amount written
>> back can exceed what was intended.
> 
> One more reason: IIRC, when you set memory.zswap.max to a value other
> than 0 max, every zswap store incurs a pretty expensive check
> (obj_cgroup_may_zswap), which does a force flush
> (__mem_cgroup_flush_stats). That was pretty expensive last time some
> of our internal services played with it. So yeah, it's not ideal...
> 
> (if you're using this, might wanna profile this as well).
> 
>>
>> Thanks,
>> Hao


* Re: [PATCH 09/12] swap: push down setting sis->bdev into ->swap_activate
From: Damien Le Moal @ 2026-05-13  7:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Darrick J. Wong, Andrew Morton, Chris Li, Kairui Song,
	Christian Brauner, Jens Axboe, David Sterba, Theodore Ts'o,
	Jaegeuk Kim, Chao Yu, Trond Myklebust, Anna Schumaker,
	Namjae Jeon, Hyunchul Lee, Steve French, Paulo Alcantara,
	Carlos Maiolino, Naohiro Aota, linux-xfs, linux-fsdevel,
	linux-doc, linux-mm, linux-block, linux-btrfs, linux-ext4,
	linux-f2fs-devel, linux-nfs, linux-cifs
In-Reply-To: <20260513074608.GA3693@lst.de>

On 5/13/26 16:46, Christoph Hellwig wrote:
> On Wed, May 13, 2026 at 04:44:53PM +0900, Damien Le Moal wrote:
>> Hmmm... With zonefs, swap files can be created on top of conventional zone
>> files. So enforcing "no swap on zoned device" here would break that.
> 
> We can check that none of the extents fall onto sequential zones instead
> of just devices.
> 
> I still wonder why you bother with swap to zonefs at all, though.

Yeah. I do not think anyone actually uses that... But since it has been there
from the start, we are kind of stuck with it now.
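
For reference, the per-extent check Christoph suggests could look roughly like the sketch below. It is a standalone model, not kernel code: the zone and extent structs, the enum, and the function name are all hypothetical, and a real implementation would use the block layer's zone information rather than a flat array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of a zoned device layout. */
enum zone_kind { ZONE_CONVENTIONAL, ZONE_SEQUENTIAL };

struct zone {
	uint64_t start; /* first sector of the zone */
	uint64_t len;   /* zone length in sectors */
	enum zone_kind kind;
};

/* Hypothetical swap extent, in the same sector units. */
struct extent {
	uint64_t start;
	uint64_t len;
};

/*
 * Return true if no extent overlaps a sequential zone, i.e. the swap file
 * sits entirely on conventional zones and could be allowed.
 */
bool extents_avoid_sequential_zones(const struct extent *ext, size_t nr_ext,
				    const struct zone *zones, size_t nr_zones)
{
	for (size_t i = 0; i < nr_ext; i++) {
		if (ext[i].len == 0)
			continue; /* an empty extent cannot overlap anything */
		for (size_t j = 0; j < nr_zones; j++) {
			if (zones[j].kind != ZONE_SEQUENTIAL)
				continue;
			/* Half-open interval overlap test. */
			if (ext[i].start < zones[j].start + zones[j].len &&
			    zones[j].start < ext[i].start + ext[i].len)
				return false;
		}
	}
	return true;
}
```

With a check like this, a zonefs swap file on a conventional zone file would still pass, while any extent touching a sequential zone would be rejected.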


-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH v7 09/20] KVM: arm64: Set up MDCR_EL2 to handle a Partitioned PMU
From: Oliver Upton @ 2026-05-13  7:57 UTC (permalink / raw)
  To: Colton Lewis
  Cc: kvm, Alexandru Elisei, Paolo Bonzini, Jonathan Corbet,
	Russell King, Catalin Marinas, Will Deacon, Marc Zyngier,
	Oliver Upton, Mingwei Zhang, Joey Gouly, Suzuki K Poulose,
	Zenghui Yu, Mark Rutland, Shuah Khan, Ganapatrao Kulkarni,
	James Clark, linux-doc, linux-kernel, linux-arm-kernel, kvmarm,
	linux-perf-users, linux-kselftest
In-Reply-To: <20260504211813.1804997-10-coltonlewis@google.com>

On Mon, May 04, 2026 at 09:18:02PM +0000, Colton Lewis wrote:
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 3ad6b7c6e4ba7..0ab89c91e19cb 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -36,20 +36,43 @@ static int cpu_has_spe(u64 dfr0)
>   */
>  static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu)
>  {
> +	int hpmn = kvm_pmu_hpmn(vcpu);
> +
>  	preempt_disable();
>  
>  	/*
>  	 * This also clears MDCR_EL2_E2PB_MASK and MDCR_EL2_E2TB_MASK
>  	 * to disable guest access to the profiling and trace buffers
>  	 */
> -	vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN,
> -					 *host_data_ptr(nr_event_counters));
> +
> +	vcpu->arch.mdcr_el2 = FIELD_PREP(MDCR_EL2_HPMN, hpmn);
>  	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
>  				MDCR_EL2_TPMS |
>  				MDCR_EL2_TTRF |
>  				MDCR_EL2_TPMCR |
>  				MDCR_EL2_TDRA |
> -				MDCR_EL2_TDOSA);
> +				MDCR_EL2_TDOSA |
> +				MDCR_EL2_HPME);
> +
> +	if (kvm_vcpu_pmu_is_partitioned(vcpu)) {
> +		/*
> +		 * Filtering these should be redundant because we trap
> +		 * all the TYPER and FILTR registers anyway and ensure
> +		 * they filter EL2, but set the bits if they are here.
> +		 */
> +		if (is_pmuv3p1(read_pmuver()))
> +			vcpu->arch.mdcr_el2 |= MDCR_EL2_HPMD;
> +		if (is_pmuv3p5(read_pmuver()))
> +			vcpu->arch.mdcr_el2 |= MDCR_EL2_HCCD;

Neither of these controls is of any consequence on hardware that does not
implement it (the bits are RES0). Set them unconditionally?

> +		/*
> +		 * Take out the coarse grain traps if we are using
> +		 * fine grain traps.
> +		 */
> +		if (kvm_vcpu_pmu_use_fgt(vcpu))

I think open coding the check here would actually improve readability.

		if (cpus_have_final_cap(ARM64_HAS_FGT) &&
		    (cpus_have_final_cap(ARM64_HAS_HPMN0) ||
		     vcpu->kvm->arch.nr_pmu_counters != 0))
			vcpu->arch.mdcr_el2 &= ~(MDCR_EL2_TPM | MDCR_EL2_TPMCR);
> +
> +/**
> + * kvm_pmu_hpmn() - Calculate HPMN field value
> + * @vcpu: Pointer to struct kvm_vcpu
> + *
> + * Calculate the appropriate value to set for MDCR_EL2.HPMN. If
> + * partitioned, this is the number of counters set for the guest if
> + * supported, falling back to max_guest_counters if needed. If we are not
> + * partitioned or can't set the implied HPMN value, fall back to the
> + * host value.
> + *
> + * Return: A valid HPMN value
> + */
> +u8 kvm_pmu_hpmn(struct kvm_vcpu *vcpu)
> +{
> +	u8 nr_guest_cntr = vcpu->kvm->arch.nr_pmu_counters;
> +
> +	if (kvm_vcpu_pmu_is_partitioned(vcpu)
> +	    && !vcpu_on_unsupported_cpu(vcpu)
> +	    && (cpus_have_final_cap(ARM64_HAS_HPMN0) || nr_guest_cntr > 0))
> +		return nr_guest_cntr;
> +
> +	return *host_data_ptr(nr_event_counters);
> +}

This helper isn't helpful. Just open code it in the place where we are
computing MDCR_EL2.
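
Taken together, the suggestions above (set HPMD/HCCD unconditionally, open code the FGT check, open code the HPMN selection) would give roughly the shape below. This is a standalone model so that it compiles outside the kernel: the MODEL_* constants, struct mdcr_inputs, and model_mdcr_el2() are hypothetical stand-ins, the kernel's own MDCR_EL2_* definitions remain authoritative for the bit layout, and the other trap bits from the patch (TPMS, TTRF, TDRA, TDOSA) are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the MDCR_EL2 fields used in the patch. */
#define MODEL_HPMN_MASK 0x1fULL
#define MODEL_TPMCR     (1ULL << 5)
#define MODEL_TPM       (1ULL << 6)
#define MODEL_HPME      (1ULL << 7)
#define MODEL_HPMD      (1ULL << 17)
#define MODEL_HCCD      (1ULL << 23)

/* Hypothetical inputs standing in for the vcpu state and CPU capabilities. */
struct mdcr_inputs {
	bool partitioned;          /* kvm_vcpu_pmu_is_partitioned() */
	bool on_unsupported_cpu;   /* vcpu_on_unsupported_cpu() */
	bool has_fgt;              /* ARM64_HAS_FGT */
	bool has_hpmn0;            /* ARM64_HAS_HPMN0 */
	uint8_t nr_guest_counters; /* kvm->arch.nr_pmu_counters */
	uint8_t nr_host_counters;  /* host nr_event_counters */
};

uint64_t model_mdcr_el2(const struct mdcr_inputs *in)
{
	/* HPMN = 0 is only usable with HPMN0; otherwise need >= 1 counter. */
	bool guest_hpmn_ok = in->has_hpmn0 || in->nr_guest_counters != 0;
	uint8_t hpmn = in->nr_host_counters;
	uint64_t mdcr;

	/* Open-coded HPMN selection instead of a separate helper. */
	if (in->partitioned && !in->on_unsupported_cpu && guest_hpmn_ok)
		hpmn = in->nr_guest_counters;

	mdcr = hpmn & MODEL_HPMN_MASK;
	mdcr |= MODEL_TPM | MODEL_TPMCR | MODEL_HPME;

	if (in->partitioned) {
		/* Set unconditionally: the bits are RES0 where unimplemented. */
		mdcr |= MODEL_HPMD | MODEL_HCCD;

		/* Drop the coarse grain traps when fine grain traps are usable. */
		if (in->has_fgt && guest_hpmn_ok)
			mdcr &= ~(MODEL_TPM | MODEL_TPMCR);
	}

	return mdcr;
}
```

Having both conditions in one place makes it visible that the HPMN selection and the coarse-trap removal share the same "HPMN0 or a non-zero counter count" test.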

> @@ -542,6 +542,13 @@ u8 kvm_arm_pmu_get_max_counters(struct kvm *kvm)
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_PMUV3_IMPDEF_TRAPS))
>  		return 1;
>  
> +	/*
> +	 * If partitioned then we are limited by the max counters in
> +	 * the guest partition.
> +	 */
> +	if (kvm_pmu_is_partitioned(arm_pmu))
> +		return arm_pmu->max_guest_counters;
> +

Ok, this is exactly what I was getting at earlier. What about a VM with
an emulated PMU? It should use the cntr_mask calculation, not the guest
range.

Thanks,
Oliver


