* [PATCH 1/3] Revert "selftests: revocable: Add kselftest cases"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
@ 2026-01-24 17:05 ` Johan Hovold
2026-01-24 17:05 ` [PATCH 2/3] Revert "revocable: Add Kunit test cases" Johan Hovold
` (4 subsequent siblings)
5 siblings, 0 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-24 17:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Rafael J . Wysocki, Danilo Krummrich, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Johan Hovold
This reverts commit 41242977c2d5705de33d3f58dc3bf5bc2f9de6cc.
The new revocable functionality is fundamentally broken and at a minimum
needs to be redesigned.
Drop the revocable selftests to allow the implementation to be reverted.
Signed-off-by: Johan Hovold <johan@kernel.org>
---
MAINTAINERS | 1 -
tools/testing/selftests/Makefile | 1 -
.../selftests/drivers/base/revocable/Makefile | 7 -
.../drivers/base/revocable/revocable_test.c | 136 ------------
.../drivers/base/revocable/test-revocable.sh | 39 ----
.../base/revocable/test_modules/Makefile | 10 -
.../revocable/test_modules/revocable_test.c | 195 ------------------
7 files changed, 389 deletions(-)
delete mode 100644 tools/testing/selftests/drivers/base/revocable/Makefile
delete mode 100644 tools/testing/selftests/drivers/base/revocable/revocable_test.c
delete mode 100755 tools/testing/selftests/drivers/base/revocable/test-revocable.sh
delete mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
delete mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
diff --git a/MAINTAINERS b/MAINTAINERS
index cb91537cf246..bf38181c07e4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22377,7 +22377,6 @@ S: Maintained
F: drivers/base/revocable.c
F: drivers/base/revocable_test.c
F: include/linux/revocable.h
-F: tools/testing/selftests/drivers/base/revocable/
RFKILL
M: Johannes Berg <johannes@sipsolutions.net>
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 11b6515ce3d0..56e44a98d6a5 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -17,7 +17,6 @@ TARGETS += damon
TARGETS += devices/error_logs
TARGETS += devices/probe
TARGETS += dmabuf-heaps
-TARGETS += drivers/base/revocable
TARGETS += drivers/dma-buf
TARGETS += drivers/ntsync
TARGETS += drivers/s390x/uvdevice
diff --git a/tools/testing/selftests/drivers/base/revocable/Makefile b/tools/testing/selftests/drivers/base/revocable/Makefile
deleted file mode 100644
index afa5ca0fa452..000000000000
--- a/tools/testing/selftests/drivers/base/revocable/Makefile
+++ /dev/null
@@ -1,7 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-TEST_GEN_MODS_DIR := test_modules
-TEST_GEN_PROGS_EXTENDED := revocable_test
-TEST_PROGS := test-revocable.sh
-
-include ../../../lib.mk
diff --git a/tools/testing/selftests/drivers/base/revocable/revocable_test.c b/tools/testing/selftests/drivers/base/revocable/revocable_test.c
deleted file mode 100644
index f024164e9273..000000000000
--- a/tools/testing/selftests/drivers/base/revocable/revocable_test.c
+++ /dev/null
@@ -1,136 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright 2026 Google LLC
- *
- * A selftest for the revocable API.
- *
- * The test cases cover the following scenarios:
- *
- * - Basic: Verifies that a consumer can successfully access the resource
- * provided via the provider.
- *
- * - Revocation: Verifies that after the provider revokes the resource,
- * the consumer correctly receives a NULL pointer on a subsequent access.
- *
- * - Try Access Macro: Same as "Revocation" but uses the
- * REVOCABLE_TRY_ACCESS_WITH() and REVOCABLE_TRY_ACCESS_SCOPED().
- */
-
-#include <fcntl.h>
-#include <unistd.h>
-
-#include "../../../kselftest_harness.h"
-
-#define DEBUGFS_PATH "/sys/kernel/debug/revocable_test"
-#define TEST_CMD_RESOURCE_GONE "resource_gone"
-#define TEST_DATA "12345678"
-#define TEST_MAGIC_OFFSET 0x1234
-#define TEST_MAGIC_OFFSET2 0x5678
-
-FIXTURE(revocable_fixture) {
- int pfd;
- int cfd;
-};
-
-FIXTURE_SETUP(revocable_fixture) {
- int ret;
-
- self->pfd = open(DEBUGFS_PATH "/provider", O_WRONLY);
- ASSERT_NE(-1, self->pfd)
- TH_LOG("failed to open provider fd");
-
- ret = write(self->pfd, TEST_DATA, strlen(TEST_DATA));
- ASSERT_NE(-1, ret) {
- close(self->pfd);
- TH_LOG("failed to write test data");
- }
-
- self->cfd = open(DEBUGFS_PATH "/consumer", O_RDONLY);
- ASSERT_NE(-1, self->cfd)
- TH_LOG("failed to open consumer fd");
-}
-
-FIXTURE_TEARDOWN(revocable_fixture) {
- close(self->cfd);
- close(self->pfd);
-}
-
-/*
- * ASSERT_* is only available in TEST or TEST_F block. Use
- * macro for the helper.
- */
-#define READ_TEST_DATA(_fd, _offset, _data, _msg) \
- do { \
- int ret; \
- \
- ret = lseek(_fd, _offset, SEEK_SET); \
- ASSERT_NE(-1, ret) \
- TH_LOG("failed to lseek"); \
- \
- ret = read(_fd, _data, sizeof(_data) - 1); \
- ASSERT_NE(-1, ret) \
- TH_LOG(_msg); \
- data[ret] = '\0'; \
- } while (0)
-
-TEST_F(revocable_fixture, basic) {
- char data[16];
-
- READ_TEST_DATA(self->cfd, 0, data, "failed to read test data");
- EXPECT_STREQ(TEST_DATA, data);
-}
-
-TEST_F(revocable_fixture, revocation) {
- char data[16];
- int ret;
-
- READ_TEST_DATA(self->cfd, 0, data, "failed to read test data");
- EXPECT_STREQ(TEST_DATA, data);
-
- ret = write(self->pfd, TEST_CMD_RESOURCE_GONE,
- strlen(TEST_CMD_RESOURCE_GONE));
- ASSERT_NE(-1, ret)
- TH_LOG("failed to write resource gone cmd");
-
- READ_TEST_DATA(self->cfd, 0, data,
- "failed to read test data after resource gone");
- EXPECT_STREQ("(null)", data);
-}
-
-TEST_F(revocable_fixture, try_access_macro) {
- char data[16];
- int ret;
-
- READ_TEST_DATA(self->cfd, TEST_MAGIC_OFFSET, data,
- "failed to read test data");
- EXPECT_STREQ(TEST_DATA, data);
-
- ret = write(self->pfd, TEST_CMD_RESOURCE_GONE,
- strlen(TEST_CMD_RESOURCE_GONE));
- ASSERT_NE(-1, ret)
- TH_LOG("failed to write resource gone cmd");
-
- READ_TEST_DATA(self->cfd, TEST_MAGIC_OFFSET, data,
- "failed to read test data after resource gone");
- EXPECT_STREQ("(null)", data);
-}
-
-TEST_F(revocable_fixture, try_access_macro2) {
- char data[16];
- int ret;
-
- READ_TEST_DATA(self->cfd, TEST_MAGIC_OFFSET2, data,
- "failed to read test data");
- EXPECT_STREQ(TEST_DATA, data);
-
- ret = write(self->pfd, TEST_CMD_RESOURCE_GONE,
- strlen(TEST_CMD_RESOURCE_GONE));
- ASSERT_NE(-1, ret)
- TH_LOG("failed to write resource gone cmd");
-
- READ_TEST_DATA(self->cfd, TEST_MAGIC_OFFSET2, data,
- "failed to read test data after resource gone");
- EXPECT_STREQ("(null)", data);
-}
-
-TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/drivers/base/revocable/test-revocable.sh b/tools/testing/selftests/drivers/base/revocable/test-revocable.sh
deleted file mode 100755
index 3a34be28001a..000000000000
--- a/tools/testing/selftests/drivers/base/revocable/test-revocable.sh
+++ /dev/null
@@ -1,39 +0,0 @@
-#!/bin/bash
-# SPDX-License-Identifier: GPL-2.0
-
-mod_name="revocable_test"
-ksft_fail=1
-ksft_skip=4
-
-if [ "$(id -u)" -ne 0 ]; then
- echo "$0: Must be run as root"
- exit "$ksft_skip"
-fi
-
-if ! which insmod > /dev/null 2>&1; then
- echo "$0: Need insmod"
- exit "$ksft_skip"
-fi
-
-if ! which rmmod > /dev/null 2>&1; then
- echo "$0: Need rmmod"
- exit "$ksft_skip"
-fi
-
-insmod test_modules/"${mod_name}".ko
-
-if [ ! -d /sys/kernel/debug/revocable_test/ ]; then
- mount -t debugfs none /sys/kernel/debug/
-
- if [ ! -d /sys/kernel/debug/revocable_test/ ]; then
- echo "$0: Error mounting debugfs"
- exit "$ksft_fail"
- fi
-fi
-
-./revocable_test
-ret=$?
-
-rmmod "${mod_name}"
-
-exit "${ret}"
diff --git a/tools/testing/selftests/drivers/base/revocable/test_modules/Makefile b/tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
deleted file mode 100644
index f29e4f909402..000000000000
--- a/tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
+++ /dev/null
@@ -1,10 +0,0 @@
-TESTMODS_DIR := $(realpath $(dir $(abspath $(lastword $(MAKEFILE_LIST)))))
-KDIR ?= /lib/modules/$(shell uname -r)/build
-
-obj-m += revocable_test.o
-
-all:
- $(Q)$(MAKE) -C $(KDIR) M=$(TESTMODS_DIR)
-
-clean:
- $(Q)$(MAKE) -C $(KDIR) M=$(TESTMODS_DIR) clean
diff --git a/tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c b/tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
deleted file mode 100644
index 1b0692eb75f3..000000000000
--- a/tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
+++ /dev/null
@@ -1,195 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright 2026 Google LLC
- *
- * A kernel module for testing the revocable API.
- */
-
-#include <linux/debugfs.h>
-#include <linux/module.h>
-#include <linux/revocable.h>
-#include <linux/slab.h>
-
-#define TEST_CMD_RESOURCE_GONE "resource_gone"
-#define TEST_MAGIC_OFFSET 0x1234
-#define TEST_MAGIC_OFFSET2 0x5678
-
-static struct dentry *debugfs_dir;
-
-struct revocable_test_provider_priv {
- struct revocable_provider *rp;
- struct dentry *dentry;
- char res[16];
-};
-
-static int revocable_test_consumer_open(struct inode *inode, struct file *filp)
-{
- struct revocable *rev;
- struct revocable_provider *rp = inode->i_private;
-
- rev = revocable_alloc(rp);
- if (!rev)
- return -ENOMEM;
- filp->private_data = rev;
-
- return 0;
-}
-
-static int revocable_test_consumer_release(struct inode *inode,
- struct file *filp)
-{
- struct revocable *rev = filp->private_data;
-
- revocable_free(rev);
- return 0;
-}
-
-static ssize_t revocable_test_consumer_read(struct file *filp,
- char __user *buf,
- size_t count, loff_t *offset)
-{
- char *res;
- char data[16];
- size_t len;
- struct revocable *rev = filp->private_data;
-
- switch (*offset) {
- case 0:
- res = revocable_try_access(rev);
- snprintf(data, sizeof(data), "%s", res ?: "(null)");
- revocable_withdraw_access(rev);
- break;
- case TEST_MAGIC_OFFSET:
- {
- REVOCABLE_TRY_ACCESS_WITH(rev, res);
- snprintf(data, sizeof(data), "%s", res ?: "(null)");
- }
- break;
- case TEST_MAGIC_OFFSET2:
- REVOCABLE_TRY_ACCESS_SCOPED(rev, res)
- snprintf(data, sizeof(data), "%s", res ?: "(null)");
- break;
- default:
- return 0;
- }
-
- len = min_t(size_t, strlen(data), count);
- if (copy_to_user(buf, data, len))
- return -EFAULT;
-
- *offset = len;
- return len;
-}
-
-static const struct file_operations revocable_test_consumer_fops = {
- .open = revocable_test_consumer_open,
- .release = revocable_test_consumer_release,
- .read = revocable_test_consumer_read,
- .llseek = default_llseek,
-};
-
-static int revocable_test_provider_open(struct inode *inode, struct file *filp)
-{
- struct revocable_test_provider_priv *priv;
-
- priv = kzalloc(sizeof(*priv), GFP_KERNEL);
- if (!priv)
- return -ENOMEM;
- filp->private_data = priv;
-
- return 0;
-}
-
-static int revocable_test_provider_release(struct inode *inode,
- struct file *filp)
-{
- struct revocable_test_provider_priv *priv = filp->private_data;
-
- debugfs_remove(priv->dentry);
- if (priv->rp)
- revocable_provider_revoke(priv->rp);
- kfree(priv);
-
- return 0;
-}
-
-static ssize_t revocable_test_provider_write(struct file *filp,
- const char __user *buf,
- size_t count, loff_t *offset)
-{
- size_t copied;
- char data[64];
- struct revocable_test_provider_priv *priv = filp->private_data;
-
- copied = strncpy_from_user(data, buf, sizeof(data));
- if (copied < 0)
- return copied;
- if (copied == sizeof(data))
- data[sizeof(data) - 1] = '\0';
-
- /*
- * Note: The test can't just close the FD for signaling the
- * resource gone. Subsequent file operations on the opening
- * FD of debugfs return -EIO after calling debugfs_remove().
- * See also debugfs_file_get().
- *
- * Here is a side command channel for signaling the resource
- * gone.
- */
- if (!strcmp(data, TEST_CMD_RESOURCE_GONE)) {
- revocable_provider_revoke(priv->rp);
- priv->rp = NULL;
- } else {
- if (priv->res[0] != '\0')
- return 0;
-
- strscpy(priv->res, data);
-
- priv->rp = revocable_provider_alloc(&priv->res);
- if (!priv->rp)
- return -ENOMEM;
-
- priv->dentry = debugfs_create_file("consumer", 0400,
- debugfs_dir, priv->rp,
- &revocable_test_consumer_fops);
- if (!priv->dentry) {
- revocable_provider_revoke(priv->rp);
- return -ENOMEM;
- }
- }
-
- return copied;
-}
-
-static const struct file_operations revocable_test_provider_fops = {
- .open = revocable_test_provider_open,
- .release = revocable_test_provider_release,
- .write = revocable_test_provider_write,
-};
-
-static int __init revocable_test_init(void)
-{
- debugfs_dir = debugfs_create_dir("revocable_test", NULL);
- if (!debugfs_dir)
- return -ENOMEM;
-
- if (!debugfs_create_file("provider", 0200, debugfs_dir, NULL,
- &revocable_test_provider_fops)) {
- debugfs_remove_recursive(debugfs_dir);
- return -ENOMEM;
- }
-
- return 0;
-}
-
-static void __exit revocable_test_exit(void)
-{
- debugfs_remove_recursive(debugfs_dir);
-}
-
-module_init(revocable_test_init);
-module_exit(revocable_test_exit);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Tzung-Bi Shih <tzungbi@kernel.org>");
-MODULE_DESCRIPTION("Revocable Kselftest");
--
2.52.0
^ permalink raw reply related [flat|nested] 70+ messages in thread* [PATCH 2/3] Revert "revocable: Add Kunit test cases"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
2026-01-24 17:05 ` [PATCH 1/3] Revert "selftests: revocable: Add kselftest cases" Johan Hovold
@ 2026-01-24 17:05 ` Johan Hovold
2026-01-24 17:05 ` [PATCH 3/3] Revert "revocable: Revocable resource management" Johan Hovold
` (3 subsequent siblings)
5 siblings, 0 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-24 17:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Rafael J . Wysocki, Danilo Krummrich, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Johan Hovold
This reverts commit d9a1ff40f5aa9b493f26812c8850423e386ba7c9.
The new revocable functionality is fundamentally broken and at a minimum
needs to be redesigned.
Drop the revocable Kunit tests to allow the implementation to be reverted.
Signed-off-by: Johan Hovold <johan@kernel.org>
---
MAINTAINERS | 1 -
drivers/base/Kconfig | 8 --
drivers/base/Makefile | 3 -
drivers/base/revocable_test.c | 142 ----------------------------------
4 files changed, 154 deletions(-)
delete mode 100644 drivers/base/revocable_test.c
diff --git a/MAINTAINERS b/MAINTAINERS
index bf38181c07e4..fac638cbb40a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22375,7 +22375,6 @@ M: Tzung-Bi Shih <tzungbi@kernel.org>
L: linux-kernel@vger.kernel.org
S: Maintained
F: drivers/base/revocable.c
-F: drivers/base/revocable_test.c
F: include/linux/revocable.h
RFKILL
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 8f7d7b9d81ac..1786d87b29e2 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -250,11 +250,3 @@ config FW_DEVLINK_SYNC_STATE_TIMEOUT
work on.
endmenu
-
-# Kunit test cases
-config REVOCABLE_KUNIT_TEST
- tristate "Kunit tests for revocable" if !KUNIT_ALL_TESTS
- depends on KUNIT
- default KUNIT_ALL_TESTS
- help
- Kunit tests for the revocable API.
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4185aaa9bbb9..bdf854694e39 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -35,6 +35,3 @@ ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG
# define_trace.h needs to know how to find our header
CFLAGS_trace.o := -I$(src)
obj-$(CONFIG_TRACING) += trace.o
-
-# Kunit test cases
-obj-$(CONFIG_REVOCABLE_KUNIT_TEST) += revocable_test.o
diff --git a/drivers/base/revocable_test.c b/drivers/base/revocable_test.c
deleted file mode 100644
index 873a44082b6c..000000000000
--- a/drivers/base/revocable_test.c
+++ /dev/null
@@ -1,142 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright 2026 Google LLC
- *
- * Kunit tests for the revocable API.
- *
- * The test cases cover the following scenarios:
- *
- * - Basic: Verifies that a consumer can successfully access the resource
- * provided via the provider.
- *
- * - Revocation: Verifies that after the provider revokes the resource,
- * the consumer correctly receives a NULL pointer on a subsequent access.
- *
- * - Try Access Macro: Same as "Revocation" but uses the
- * REVOCABLE_TRY_ACCESS_WITH() and REVOCABLE_TRY_ACCESS_SCOPED().
- */
-
-#include <kunit/test.h>
-#include <linux/revocable.h>
-
-static void revocable_test_basic(struct kunit *test)
-{
- struct revocable_provider *rp;
- struct revocable *rev;
- void *real_res = (void *)0x12345678, *res;
-
- rp = revocable_provider_alloc(real_res);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rp);
-
- rev = revocable_alloc(rp);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rev);
-
- res = revocable_try_access(rev);
- KUNIT_EXPECT_PTR_EQ(test, res, real_res);
- revocable_withdraw_access(rev);
-
- revocable_free(rev);
- revocable_provider_revoke(rp);
-}
-
-static void revocable_test_revocation(struct kunit *test)
-{
- struct revocable_provider *rp;
- struct revocable *rev;
- void *real_res = (void *)0x12345678, *res;
-
- rp = revocable_provider_alloc(real_res);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rp);
-
- rev = revocable_alloc(rp);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rev);
-
- res = revocable_try_access(rev);
- KUNIT_EXPECT_PTR_EQ(test, res, real_res);
- revocable_withdraw_access(rev);
-
- revocable_provider_revoke(rp);
-
- res = revocable_try_access(rev);
- KUNIT_EXPECT_PTR_EQ(test, res, NULL);
- revocable_withdraw_access(rev);
-
- revocable_free(rev);
-}
-
-static void revocable_test_try_access_macro(struct kunit *test)
-{
- struct revocable_provider *rp;
- struct revocable *rev;
- void *real_res = (void *)0x12345678, *res;
-
- rp = revocable_provider_alloc(real_res);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rp);
-
- rev = revocable_alloc(rp);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rev);
-
- {
- REVOCABLE_TRY_ACCESS_WITH(rev, res);
- KUNIT_EXPECT_PTR_EQ(test, res, real_res);
- }
-
- revocable_provider_revoke(rp);
-
- {
- REVOCABLE_TRY_ACCESS_WITH(rev, res);
- KUNIT_EXPECT_PTR_EQ(test, res, NULL);
- }
-
- revocable_free(rev);
-}
-
-static void revocable_test_try_access_macro2(struct kunit *test)
-{
- struct revocable_provider *rp;
- struct revocable *rev;
- void *real_res = (void *)0x12345678, *res;
- bool accessed;
-
- rp = revocable_provider_alloc(real_res);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rp);
-
- rev = revocable_alloc(rp);
- KUNIT_ASSERT_NOT_ERR_OR_NULL(test, rev);
-
- accessed = false;
- REVOCABLE_TRY_ACCESS_SCOPED(rev, res) {
- KUNIT_EXPECT_PTR_EQ(test, res, real_res);
- accessed = true;
- }
- KUNIT_EXPECT_TRUE(test, accessed);
-
- revocable_provider_revoke(rp);
-
- accessed = false;
- REVOCABLE_TRY_ACCESS_SCOPED(rev, res) {
- KUNIT_EXPECT_PTR_EQ(test, res, NULL);
- accessed = true;
- }
- KUNIT_EXPECT_TRUE(test, accessed);
-
- revocable_free(rev);
-}
-
-static struct kunit_case revocable_test_cases[] = {
- KUNIT_CASE(revocable_test_basic),
- KUNIT_CASE(revocable_test_revocation),
- KUNIT_CASE(revocable_test_try_access_macro),
- KUNIT_CASE(revocable_test_try_access_macro2),
- {}
-};
-
-static struct kunit_suite revocable_test_suite = {
- .name = "revocable_test",
- .test_cases = revocable_test_cases,
-};
-
-kunit_test_suite(revocable_test_suite);
-
-MODULE_DESCRIPTION("KUnit tests for the revocable API");
-MODULE_LICENSE("GPL");
--
2.52.0
^ permalink raw reply related [flat|nested] 70+ messages in thread* [PATCH 3/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
2026-01-24 17:05 ` [PATCH 1/3] Revert "selftests: revocable: Add kselftest cases" Johan Hovold
2026-01-24 17:05 ` [PATCH 2/3] Revert "revocable: Add Kunit test cases" Johan Hovold
@ 2026-01-24 17:05 ` Johan Hovold
2026-01-24 17:37 ` Johan Hovold
2026-01-24 17:46 ` Danilo Krummrich
2026-01-24 18:42 ` [PATCH 0/3] " Laurent Pinchart
` (2 subsequent siblings)
5 siblings, 2 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-24 17:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Rafael J . Wysocki, Danilo Krummrich, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Johan Hovold
This reverts commit 2da75b5b72559e5214a24aaf7ac933bb5bc2f1fe.
The revocable implementation uses two separate abstractions, struct
revocable_provider and struct revocable, in order to store the SRCU read
lock index which must be passed unaltered to srcu_read_unlock() in the
same context when a resource is no longer needed:
struct revocable {
struct revocable_provider *rp;
int idx;
};
void *revocable_try_access(struct revocable *rev)
{
struct revocable_provider *rp = rev->rp;
rev->idx = srcu_read_lock(&rp->srcu);
return srcu_dereference(rp->res, &rp->srcu);
}
void revocable_withdraw_access(struct revocable *rev)
{
struct revocable_provider *rp = rev->rp;
srcu_read_unlock(&rp->srcu, rev->idx);
}
Multiple threads may however share the same struct revocable and
therefore potentially overwrite the SRCU index of another thread which
can cause the SRCU synchronisation in revocable_provider_revoke() to
never complete.
This can specifically happen when two threads share an open file and a
revocable resource has been allocated on open(), which was the primary
example of how revocable could be used (e.g. see the now reverted
revocable selftests).
This would also affect any future revocable resource that may be
accessed from multiple contexts (e.g. a gpio used in both process and
interrupt context).
Revert the revocable implementation until a redesign has been proposed
and evaluated properly.
Signed-off-by: Johan Hovold <johan@kernel.org>
---
.../driver-api/driver-model/index.rst | 1 -
.../driver-api/driver-model/revocable.rst | 152 -----------
MAINTAINERS | 7 -
drivers/base/Makefile | 2 +-
drivers/base/revocable.c | 241 ------------------
include/linux/revocable.h | 69 -----
6 files changed, 1 insertion(+), 471 deletions(-)
delete mode 100644 Documentation/driver-api/driver-model/revocable.rst
delete mode 100644 drivers/base/revocable.c
delete mode 100644 include/linux/revocable.h
diff --git a/Documentation/driver-api/driver-model/index.rst b/Documentation/driver-api/driver-model/index.rst
index 8e1ee21185df..4831bdd92e5c 100644
--- a/Documentation/driver-api/driver-model/index.rst
+++ b/Documentation/driver-api/driver-model/index.rst
@@ -14,7 +14,6 @@ Driver Model
overview
platform
porting
- revocable
.. only:: subproject and html
diff --git a/Documentation/driver-api/driver-model/revocable.rst b/Documentation/driver-api/driver-model/revocable.rst
deleted file mode 100644
index 22a442cc8d7f..000000000000
--- a/Documentation/driver-api/driver-model/revocable.rst
+++ /dev/null
@@ -1,152 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-==============================
-Revocable Resource Management
-==============================
-
-Overview
-========
-
-.. kernel-doc:: drivers/base/revocable.c
- :doc: Overview
-
-Revocable vs. Devres (devm)
-===========================
-
-It's important to understand the distinct roles of the Revocable and Devres,
-and how they can complement each other. They address different problems in
-resource management:
-
-* **Devres:** Primarily address **resource leaks**. The lifetime of the
- resources is tied to the lifetime of the device. The resource is
- automatically freed when the device is unbound. This cleanup happens
- irrespective of any potential active users.
-
-* **Revocable:** Primarily addresses **invalid memory access**,
- such as Use-After-Free (UAF). It's an independent synchronization
- primitive that decouples consumer access from the resource's actual
- presence. Consumers interact with a "revocable object" (an intermediary),
- not the underlying resource directly. This revocable object persists as
- long as there are active references to it from consumer handles.
-
-**Key Distinctions & How They Complement Each Other:**
-
-1. **Reference Target:** Consumers of a resource managed by the Revocable
- mechanism hold a reference to the *revocable object*, not the
- encapsulated resource itself.
-
-2. **Resource Lifetime vs. Access:** The underlying resource's lifetime is
- independent of the number of references to the revocable object. The
- resource can be freed at any point. A common scenario is the resource
- being freed by `devres` when the providing device is unbound.
-
-3. **Safe Access:** Revocable provides a safe way to attempt access. Before
- using the resource, a consumer uses the Revocable API (e.g.,
- revocable_try_access()). This function checks if the resource is still
- valid. It returns a pointer to the resource only if it hasn't been
- revoked; otherwise, it returns NULL. This prevents UAF by providing a
- clear signal that the resource is gone.
-
-4. **Complementary Usage:** `devres` and Revocable work well together.
- `devres` can handle the automatic allocation and deallocation of a
- resource tied to a device. The Revocable mechanism can be layered on top
- to provide safe access for consumers whose lifetimes might extend beyond
- the provider device's lifetime. For instance, a userspace program might
- keep a character device file open even after the physical device has been
- removed. In this case:
-
- * `devres` frees the device-specific resource upon unbinding.
- * The Revocable mechanism ensures that any subsequent operations on the
- open file handle, which attempt to access the now-freed resource,
- will fail gracefully (e.g., revocable_try_access() returns NULL)
- instead of causing a UAF.
-
-In summary, `devres` ensures resources are *released* to prevent leaks, while
-the Revocable mechanism ensures that attempts to *access* these resources are
-done safely, even if the resource has been released.
-
-API and Usage
-=============
-
-For Resource Providers
-----------------------
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_provider
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_provider_alloc
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: devm_revocable_provider_alloc
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_provider_revoke
-
-For Resource Consumers
-----------------------
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_alloc
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_free
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_try_access
-
-.. kernel-doc:: drivers/base/revocable.c
- :identifiers: revocable_withdraw_access
-
-.. kernel-doc:: include/linux/revocable.h
- :identifiers: REVOCABLE_TRY_ACCESS_WITH
-
-Example Usage
-~~~~~~~~~~~~~
-
-.. code-block:: c
-
- void consumer_use_resource(struct revocable *rev)
- {
- struct foo_resource *res;
-
- REVOCABLE_TRY_ACCESS_WITH(rev, res);
- // Always check if the resource is valid.
- if (!res) {
- pr_warn("Resource is not available\n");
- return;
- }
-
- // At this point, 'res' is guaranteed to be valid until
- // this block exits.
- do_something_with(res);
-
- } // revocable_withdraw_access() is automatically called here.
-
-.. kernel-doc:: include/linux/revocable.h
- :identifiers: REVOCABLE_TRY_ACCESS_SCOPED
-
-Example Usage
-~~~~~~~~~~~~~
-
-.. code-block:: c
-
- void consumer_use_resource(struct revocable *rev)
- {
- struct foo_resource *res;
-
- REVOCABLE_TRY_ACCESS_SCOPED(rev, res) {
- // Always check if the resource is valid.
- if (!res) {
- pr_warn("Resource is not available\n");
- return;
- }
-
- // At this point, 'res' is guaranteed to be valid until
- // this block exits.
- do_something_with(res);
- }
-
- // revocable_withdraw_access() is automatically called here.
- }
diff --git a/MAINTAINERS b/MAINTAINERS
index fac638cbb40a..da9dbc1a4019 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -22370,13 +22370,6 @@ F: include/uapi/linux/rseq.h
F: kernel/rseq.c
F: tools/testing/selftests/rseq/
-REVOCABLE RESOURCE MANAGEMENT
-M: Tzung-Bi Shih <tzungbi@kernel.org>
-L: linux-kernel@vger.kernel.org
-S: Maintained
-F: drivers/base/revocable.c
-F: include/linux/revocable.h
-
RFKILL
M: Johannes Berg <johannes@sipsolutions.net>
L: linux-wireless@vger.kernel.org
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index bdf854694e39..8074a10183dc 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -6,7 +6,7 @@ obj-y := component.o core.o bus.o dd.o syscore.o \
cpu.o firmware.o init.o map.o devres.o \
attribute_container.o transport_class.o \
topology.o container.o property.o cacheinfo.o \
- swnode.o faux.o revocable.o
+ swnode.o faux.o
obj-$(CONFIG_AUXILIARY_BUS) += auxiliary.o
obj-$(CONFIG_DEVTMPFS) += devtmpfs.o
obj-y += power/
diff --git a/drivers/base/revocable.c b/drivers/base/revocable.c
deleted file mode 100644
index b068e18a847d..000000000000
--- a/drivers/base/revocable.c
+++ /dev/null
@@ -1,241 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright 2026 Google LLC
- *
- * Revocable resource management
- */
-
-#include <linux/device.h>
-#include <linux/kref.h>
-#include <linux/revocable.h>
-#include <linux/slab.h>
-#include <linux/srcu.h>
-
-/**
- * DOC: Overview
- *
- * The "revocable" mechanism is a synchronization primitive designed to manage
- * safe access to resources that can be asynchronously removed or invalidated.
- * Its primary purpose is to prevent Use-After-Free (UAF) errors when
- * interacting with resources whose lifetimes are not guaranteed to outlast
- * their consumers.
- *
- * This is particularly useful in systems where resources can disappear
- * unexpectedly, such as those provided by hot-pluggable devices like USB.
- * When a consumer holds a reference to such a resource, the underlying device
- * might be removed, causing the resource's memory to be freed. Subsequent
- * access attempts by the consumer would then lead to UAF errors.
- *
- * Revocable addresses this by providing a form of "weak reference" and a
- * controlled access method. It allows a resource consumer to safely attempt to
- * access the resource. The mechanism guarantees that any access granted is
- * valid for the duration of its use. If the resource has already been
- * revoked (i.e., freed), the access attempt will fail safely, typically by
- * returning NULL, instead of causing a crash.
- *
- * The implementation uses a provider/consumer model built on Sleepable
- * RCU (SRCU) to guarantee safe memory access:
- *
- * - A resource provider, such as a driver for a hot-pluggable device,
- * allocates a struct revocable_provider and initializes it with a pointer
- * to the resource.
- *
- * - A resource consumer that wants to access the resource allocates a
- * struct revocable which acts as a handle containing a reference to the
- * provider.
- *
- * - To access the resource, the consumer uses revocable_try_access().
- * This function enters an SRCU read-side critical section and returns
- * the pointer to the resource. If the provider has already freed the
- * resource, it returns NULL. After use, the consumer calls
- * revocable_withdraw_access() to exit the SRCU critical section. The
- * REVOCABLE_TRY_ACCESS_WITH() and REVOCABLE_TRY_ACCESS_SCOPED() are
- * convenient helpers for doing that.
- *
- * - When the provider needs to remove the resource, it calls
- * revocable_provider_revoke(). This function sets the internal resource
- * pointer to NULL and then calls synchronize_srcu() to wait for all
- * current readers to finish before the resource can be completely torn
- * down.
- */
-
-/**
- * struct revocable_provider - A handle for resource provider.
- * @srcu: The SRCU to protect the resource.
- * @res: The pointer of resource. It can point to anything.
- * @kref: The refcount for this handle.
- */
-struct revocable_provider {
- struct srcu_struct srcu;
- void __rcu *res;
- struct kref kref;
-};
-
-/**
- * struct revocable - A handle for resource consumer.
- * @rp: The pointer of resource provider.
- * @idx: The index for the RCU critical section.
- */
-struct revocable {
- struct revocable_provider *rp;
- int idx;
-};
-
-/**
- * revocable_provider_alloc() - Allocate struct revocable_provider.
- * @res: The pointer of resource.
- *
- * This holds an initial refcount to the struct.
- *
- * Return: The pointer of struct revocable_provider. NULL on errors.
- */
-struct revocable_provider *revocable_provider_alloc(void *res)
-{
- struct revocable_provider *rp;
-
- rp = kzalloc(sizeof(*rp), GFP_KERNEL);
- if (!rp)
- return NULL;
-
- init_srcu_struct(&rp->srcu);
- rcu_assign_pointer(rp->res, res);
- kref_init(&rp->kref);
-
- return rp;
-}
-EXPORT_SYMBOL_GPL(revocable_provider_alloc);
-
-static void revocable_provider_release(struct kref *kref)
-{
- struct revocable_provider *rp = container_of(kref,
- struct revocable_provider, kref);
-
- cleanup_srcu_struct(&rp->srcu);
- kfree(rp);
-}
-
-/**
- * revocable_provider_revoke() - Revoke the managed resource.
- * @rp: The pointer of resource provider.
- *
- * This sets the resource `(struct revocable_provider *)->res` to NULL to
- * indicate the resource has gone.
- *
- * This drops the refcount to the resource provider. If it is the final
- * reference, revocable_provider_release() will be called to free the struct.
- */
-void revocable_provider_revoke(struct revocable_provider *rp)
-{
- rcu_assign_pointer(rp->res, NULL);
- synchronize_srcu(&rp->srcu);
- kref_put(&rp->kref, revocable_provider_release);
-}
-EXPORT_SYMBOL_GPL(revocable_provider_revoke);
-
-static void devm_revocable_provider_revoke(void *data)
-{
- struct revocable_provider *rp = data;
-
- revocable_provider_revoke(rp);
-}
-
-/**
- * devm_revocable_provider_alloc() - Dev-managed revocable_provider_alloc().
- * @dev: The device.
- * @res: The pointer of resource.
- *
- * It is convenient to allocate providers via this function if the @res is
- * also tied to the lifetime of the @dev. revocable_provider_revoke() will
- * be called automatically when the device is unbound.
- *
- * This holds an initial refcount to the struct.
- *
- * Return: The pointer of struct revocable_provider. NULL on errors.
- */
-struct revocable_provider *devm_revocable_provider_alloc(struct device *dev,
- void *res)
-{
- struct revocable_provider *rp;
-
- rp = revocable_provider_alloc(res);
- if (!rp)
- return NULL;
-
- if (devm_add_action_or_reset(dev, devm_revocable_provider_revoke, rp))
- return NULL;
-
- return rp;
-}
-EXPORT_SYMBOL_GPL(devm_revocable_provider_alloc);
-
-/**
- * revocable_alloc() - Allocate struct revocable.
- * @rp: The pointer of resource provider.
- *
- * This holds a refcount to the resource provider.
- *
- * Return: The pointer of struct revocable. NULL on errors.
- */
-struct revocable *revocable_alloc(struct revocable_provider *rp)
-{
- struct revocable *rev;
-
- rev = kzalloc(sizeof(*rev), GFP_KERNEL);
- if (!rev)
- return NULL;
-
- rev->rp = rp;
- kref_get(&rp->kref);
-
- return rev;
-}
-EXPORT_SYMBOL_GPL(revocable_alloc);
-
-/**
- * revocable_free() - Free struct revocable.
- * @rev: The pointer of struct revocable.
- *
- * This drops a refcount to the resource provider. If it is the final
- * reference, revocable_provider_release() will be called to free the struct.
- */
-void revocable_free(struct revocable *rev)
-{
- struct revocable_provider *rp = rev->rp;
-
- kref_put(&rp->kref, revocable_provider_release);
- kfree(rev);
-}
-EXPORT_SYMBOL_GPL(revocable_free);
-
-/**
- * revocable_try_access() - Try to access the resource.
- * @rev: The pointer of struct revocable.
- *
- * This tries to de-reference to the resource and enters a RCU critical
- * section.
- *
- * Return: The pointer to the resource. NULL if the resource has gone.
- */
-void *revocable_try_access(struct revocable *rev) __acquires(&rev->rp->srcu)
-{
- struct revocable_provider *rp = rev->rp;
-
- rev->idx = srcu_read_lock(&rp->srcu);
- return srcu_dereference(rp->res, &rp->srcu);
-}
-EXPORT_SYMBOL_GPL(revocable_try_access);
-
-/**
- * revocable_withdraw_access() - Stop accessing to the resource.
- * @rev: The pointer of struct revocable.
- *
- * Call this function to indicate the resource is no longer used. It exits
- * the RCU critical section.
- */
-void revocable_withdraw_access(struct revocable *rev) __releases(&rev->rp->srcu)
-{
- struct revocable_provider *rp = rev->rp;
-
- srcu_read_unlock(&rp->srcu, rev->idx);
-}
-EXPORT_SYMBOL_GPL(revocable_withdraw_access);
diff --git a/include/linux/revocable.h b/include/linux/revocable.h
deleted file mode 100644
index 659ba01c58db..000000000000
--- a/include/linux/revocable.h
+++ /dev/null
@@ -1,69 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright 2026 Google LLC
- */
-
-#ifndef __LINUX_REVOCABLE_H
-#define __LINUX_REVOCABLE_H
-
-#include <linux/compiler.h>
-#include <linux/cleanup.h>
-
-struct device;
-struct revocable;
-struct revocable_provider;
-
-struct revocable_provider *revocable_provider_alloc(void *res);
-void revocable_provider_revoke(struct revocable_provider *rp);
-struct revocable_provider *devm_revocable_provider_alloc(struct device *dev,
- void *res);
-
-struct revocable *revocable_alloc(struct revocable_provider *rp);
-void revocable_free(struct revocable *rev);
-void *revocable_try_access(struct revocable *rev) __acquires(&rev->rp->srcu);
-void revocable_withdraw_access(struct revocable *rev) __releases(&rev->rp->srcu);
-
-DEFINE_FREE(access_rev, struct revocable *, if (_T) revocable_withdraw_access(_T))
-
-/**
- * REVOCABLE_TRY_ACCESS_WITH() - A helper for accessing revocable resource
- * @_rev: The consumer's ``struct revocable *`` handle.
- * @_res: A pointer variable that will be assigned the resource.
- *
- * The macro simplifies the access-release cycle for consumers, ensuring that
- * revocable_withdraw_access() is always called, even in the case of an early
- * exit.
- *
- * It creates a local variable in the current scope. @_res is populated with
- * the result of revocable_try_access(). The consumer code **must** check if
- * @_res is ``NULL`` before using it. The revocable_withdraw_access() function
- * is automatically called when the scope is exited.
- *
- * Note: It shares the same issue with guard() in cleanup.h. No goto statements
- * are allowed before the helper. Otherwise, the compiler fails with
- * "jump bypasses initialization of variable with __attribute__((cleanup))".
- */
-#define REVOCABLE_TRY_ACCESS_WITH(_rev, _res) \
- struct revocable *__UNIQUE_ID(name) __free(access_rev) = _rev; \
- _res = revocable_try_access(_rev)
-
-#define _REVOCABLE_TRY_ACCESS_SCOPED(_rev, _label, _res) \
- for (struct revocable *__UNIQUE_ID(name) __free(access_rev) = _rev; \
- (_res = revocable_try_access(_rev)) || true; ({ goto _label; })) \
- if (0) { \
-_label: \
- break; \
- } else
-
-/**
- * REVOCABLE_TRY_ACCESS_SCOPED() - A helper for accessing revocable resource
- * @_rev: The consumer's ``struct revocable *`` handle.
- * @_res: A pointer variable that will be assigned the resource.
- *
- * Similar to REVOCABLE_TRY_ACCESS_WITH() but with an explicit scope from a
- * temporary ``for`` loop.
- */
-#define REVOCABLE_TRY_ACCESS_SCOPED(_rev, _res) \
- _REVOCABLE_TRY_ACCESS_SCOPED(_rev, __UNIQUE_ID(label), _res)
-
-#endif /* __LINUX_REVOCABLE_H */
--
2.52.0
^ permalink raw reply related [flat|nested] 70+ messages in thread* Re: [PATCH 3/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 ` [PATCH 3/3] Revert "revocable: Revocable resource management" Johan Hovold
@ 2026-01-24 17:37 ` Johan Hovold
2026-01-24 17:46 ` Danilo Krummrich
1 sibling, 0 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-24 17:37 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Rafael J . Wysocki, Danilo Krummrich, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat, Jan 24, 2026 at 06:05:35PM +0100, Johan Hovold wrote:
> This reverts commit 2da75b5b72559e5214a24aaf7ac933bb5bc2f1fe.
I forgot to update the commit ids before posting. This was supposed to
say:
62eb557580eb2177cf16c3fd2b6efadff297b29a
and similar for patch 1 and 2:
9d4502fef00fa7a798d3c0806d4da4466a7ffc6f
cd7693419bb5abd91ad2f407dab69c480e417a61
Sorry about that.
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 3/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 ` [PATCH 3/3] Revert "revocable: Revocable resource management" Johan Hovold
2026-01-24 17:37 ` Johan Hovold
@ 2026-01-24 17:46 ` Danilo Krummrich
2026-01-26 13:20 ` Johan Hovold
1 sibling, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-24 17:46 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> The revocable implementation uses two separate abstractions, struct
> revocable_provider and struct revocable, in order to store the SRCU read
> lock index which must be passed unaltered to srcu_read_unlock() in the
> same context when a resource is no longer needed:
>
> struct revocable {
> struct revocable_provider *rp;
> int idx;
> };
>
> void *revocable_try_access(struct revocable *rev)
> {
> struct revocable_provider *rp = rev->rp;
>
> rev->idx = srcu_read_lock(&rp->srcu);
> return srcu_dereference(rp->res, &rp->srcu);
> }
>
> void revocable_withdraw_access(struct revocable *rev)
> {
> struct revocable_provider *rp = rev->rp;
>
> srcu_read_unlock(&rp->srcu, rev->idx);
> }
>
> Multiple threads may however share the same struct revocable and
> therefore potentially overwrite the SRCU index of another thread which
> can cause the SRCU synchronisation in revocable_provider_revoke() to
> never complete.
I think the easiest fix would be to just return the index to the caller and let
the corresponding revocable macro accessors handle it, such that it is still
transparent to the user.
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 3/3] Revert "revocable: Revocable resource management"
2026-01-24 17:46 ` Danilo Krummrich
@ 2026-01-26 13:20 ` Johan Hovold
2026-01-27 15:57 ` Tzung-Bi Shih
0 siblings, 1 reply; 70+ messages in thread
From: Johan Hovold @ 2026-01-26 13:20 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat, Jan 24, 2026 at 06:46:11PM +0100, Danilo Krummrich wrote:
> On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > The revocable implementation uses two separate abstractions, struct
> > revocable_provider and struct revocable, in order to store the SRCU read
> > lock index which must be passed unaltered to srcu_read_unlock() in the
> > same context when a resource is no longer needed:
> >
> > struct revocable {
> > struct revocable_provider *rp;
> > int idx;
> > };
> >
> > void *revocable_try_access(struct revocable *rev)
> > {
> > struct revocable_provider *rp = rev->rp;
> >
> > rev->idx = srcu_read_lock(&rp->srcu);
> > return srcu_dereference(rp->res, &rp->srcu);
> > }
> >
> > void revocable_withdraw_access(struct revocable *rev)
> > {
> > struct revocable_provider *rp = rev->rp;
> >
> > srcu_read_unlock(&rp->srcu, rev->idx);
> > }
> >
> > Multiple threads may however share the same struct revocable and
> > therefore potentially overwrite the SRCU index of another thread which
> > can cause the SRCU synchronisation in revocable_provider_revoke() to
> > never complete.
>
> I think the easiest fix would be to just return the index to the caller and let
> the corresponding revocable macro accessors handle it, such that it is still
> transparent to the user.
It can certainly be made to work, but it will be a very different API
since there would no longer be any need for the non-intuitive
revocable_provider and revocable split.
And that's not the kind of change that should be done incrementally
as that makes it much harder to review (and it affects all of the
current code; api, implementation, docs, tests).
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 3/3] Revert "revocable: Revocable resource management"
2026-01-26 13:20 ` Johan Hovold
@ 2026-01-27 15:57 ` Tzung-Bi Shih
0 siblings, 0 replies; 70+ messages in thread
From: Tzung-Bi Shih @ 2026-01-27 15:57 UTC (permalink / raw)
To: Johan Hovold
Cc: Danilo Krummrich, Greg Kroah-Hartman, Rafael J . Wysocki,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Mon, Jan 26, 2026 at 02:20:24PM +0100, Johan Hovold wrote:
> On Sat, Jan 24, 2026 at 06:46:11PM +0100, Danilo Krummrich wrote:
> > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > The revocable implementation uses two separate abstractions, struct
> > > revocable_provider and struct revocable, in order to store the SRCU read
> > > lock index which must be passed unaltered to srcu_read_unlock() in the
> > > same context when a resource is no longer needed:
> > >
> > > struct revocable {
> > > struct revocable_provider *rp;
> > > int idx;
> > > };
> > >
> > > void *revocable_try_access(struct revocable *rev)
> > > {
> > > struct revocable_provider *rp = rev->rp;
> > >
> > > rev->idx = srcu_read_lock(&rp->srcu);
> > > return srcu_dereference(rp->res, &rp->srcu);
> > > }
> > >
> > > void revocable_withdraw_access(struct revocable *rev)
> > > {
> > > struct revocable_provider *rp = rev->rp;
> > >
> > > srcu_read_unlock(&rp->srcu, rev->idx);
> > > }
> > >
> > > Multiple threads may however share the same struct revocable and
> > > therefore potentially overwrite the SRCU index of another thread which
> > > can cause the SRCU synchronisation in revocable_provider_revoke() to
> > > never complete.
> >
> > I think the easiest fix would be to just return the index to the caller and let
> > the corresponding revocable macro accessors handle it, such that it is still
> > transparent to the user.
>
> It can certainly be made to work, but it will be a very different API
> since there would no longer be any need for the non-intuitive
> revocable_provider and revocable split.
Thank you both for the suggestions. I'll try it out.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
` (2 preceding siblings ...)
2026-01-24 17:05 ` [PATCH 3/3] Revert "revocable: Revocable resource management" Johan Hovold
@ 2026-01-24 18:42 ` Laurent Pinchart
2026-01-24 19:08 ` Danilo Krummrich
2026-01-27 15:57 ` Tzung-Bi Shih
5 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-24 18:42 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> I was surprised to learn that the revocable functionality was merged last week
> given the community feedback on list and at LPC, but not least since there are
> no users of it, which we are supposed to require to be able to evaluate it
> properly.
>
> The chromeos ec driver issue which motivated this work turned out not to need
> it as was found during review. And the example gpiolib conversion was posted
> the very same morning that this was merged which hardly provides enough time
> for evaluation (even if Bartosz quickly reported a performance regression).
>
> Turns out there are correctness issues with both the gpiolib conversion and
> the revocable design itself that can lead to use-after-free and hung tasks (see
> [1] and patch 3/3).
>
> And as was pointed out repeatedly during review, and again at the day of the
> merge, this does not look like the right interface for the chardev unplug
> issue.
>
> Revert the revocable implementation until a redesign has been proposed and
> evaluated properly.
I have voiced some of the concerns listed above. This was merge way too
quickly, without proper review and evaluation of the API as a solution
for the problem at hand. I don't want to see this API spreading through
drivers the same way devm_kzalloc() did without developers understanding
the limitations, it's just another recipe for disaster. I trust that we
have enough knowledge and wisdom in the community to implement correct
solutions to the producer-consumer races.
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> [1] https://lore.kernel.org/all/aXT45B6vLf9R3Pbf@hovoldconsulting.com/
>
>
> Johan Hovold (3):
> Revert "selftests: revocable: Add kselftest cases"
> Revert "revocable: Add Kunit test cases"
> Revert "revocable: Revocable resource management"
>
> .../driver-api/driver-model/index.rst | 1 -
> .../driver-api/driver-model/revocable.rst | 152 -----------
> MAINTAINERS | 9 -
> drivers/base/Kconfig | 8 -
> drivers/base/Makefile | 5 +-
> drivers/base/revocable.c | 241 ------------------
> drivers/base/revocable_test.c | 142 -----------
> include/linux/revocable.h | 69 -----
> tools/testing/selftests/Makefile | 1 -
> .../selftests/drivers/base/revocable/Makefile | 7 -
> .../drivers/base/revocable/revocable_test.c | 136 ----------
> .../drivers/base/revocable/test-revocable.sh | 39 ---
> .../base/revocable/test_modules/Makefile | 10 -
> .../revocable/test_modules/revocable_test.c | 195 --------------
> 14 files changed, 1 insertion(+), 1014 deletions(-)
> delete mode 100644 Documentation/driver-api/driver-model/revocable.rst
> delete mode 100644 drivers/base/revocable.c
> delete mode 100644 drivers/base/revocable_test.c
> delete mode 100644 include/linux/revocable.h
> delete mode 100644 tools/testing/selftests/drivers/base/revocable/Makefile
> delete mode 100644 tools/testing/selftests/drivers/base/revocable/revocable_test.c
> delete mode 100755 tools/testing/selftests/drivers/base/revocable/test-revocable.sh
> delete mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
> delete mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
` (3 preceding siblings ...)
2026-01-24 18:42 ` [PATCH 0/3] " Laurent Pinchart
@ 2026-01-24 19:08 ` Danilo Krummrich
2026-01-25 12:47 ` Greg Kroah-Hartman
2026-01-27 15:57 ` Tzung-Bi Shih
5 siblings, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-24 19:08 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> this does not look like the right interface for the chardev unplug issue.
I think it depends, we should do everything to prevent having the issue in the
first place, e.g. ensure that we synchronize the unplug properly on device
driver unbind.
Sometimes, however, this isn't possible; this is where a revocable mechanism can
come in handy to prevent UAF of device resources -- DRM is a good example for
this.
But to be fair, I also want to point out that there is a quite significant
difference regarding the usefulness of the revocable concept in C compared to in
Rust due to language capabilities.
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-24 19:08 ` Danilo Krummrich
@ 2026-01-25 12:47 ` Greg Kroah-Hartman
2026-01-25 13:22 ` Laurent Pinchart
` (3 more replies)
0 siblings, 4 replies; 70+ messages in thread
From: Greg Kroah-Hartman @ 2026-01-25 12:47 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Johan Hovold, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > this does not look like the right interface for the chardev unplug issue.
>
> I think it depends, we should do everything to prevent having the issue in the
> first place, e.g. ensure that we synchronize the unplug properly on device
> driver unbind.
>
> Sometimes, however, this isn't possible; this is where a revocable mechanism can
> come in handy to prevent UAF of device resources -- DRM is a good example for
> this.
This is not "possible" for almost all real devices so we need something
like this for almost all classes of devices, DRM just shows the extremes
involved, v4l2 is also another good example.
Note, other OSes also have this same problem, look at all the work the
BSDs are going through at the moment just to get closer to the place
where we are in Linux today with removable devices and they have hit our
same problems.
> But to be fair, I also want to point out that there is a quite significant
> difference regarding the usefulness of the revocable concept in C compared to in
> Rust due to language capabilities.
True, but we do need something. I took these patches without a real
user as a base for us to start working off of. The rust implementation
has shown that the design-pattern is a good solution for the problem,
and so I feel we should work with it and try to get this working
properly. We've been sitting and talking about it for years now, and
here is the first real code submission that is getting us closer to fix
the problem properly. It might not be perfict, but let's evolve it from
here for what is found not to work correctly.
So I don't want to take these reverts, let's try this out, by putting
this into the driver core now, we have the base to experiment with in a
"safe" way in lots of different driver subsytems at the same time. If
it doesn't work out, worst case we revert it in a release or two because
it didn't get used.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 12:47 ` Greg Kroah-Hartman
@ 2026-01-25 13:22 ` Laurent Pinchart
2026-01-25 14:07 ` Danilo Krummrich
2026-01-25 13:24 ` Laurent Pinchart
` (2 subsequent siblings)
3 siblings, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-25 13:22 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Danilo Krummrich, Johan Hovold, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, Jason Gunthorpe,
linux-doc, linux-kselftest, linux-kernel
On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg KH wrote:
> On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > this does not look like the right interface for the chardev unplug issue.
> >
> > I think it depends, we should do everything to prevent having the issue in the
> > first place, e.g. ensure that we synchronize the unplug properly on device
> > driver unbind.
> >
> > Sometimes, however, this isn't possible; this is where a revocable mechanism can
> > come in handy to prevent UAF of device resources -- DRM is a good example for
> > this.
>
> This is not "possible" for almost all real devices so we need something
> like this for almost all classes of devices, DRM just shows the extremes
> involved, v4l2 is also another good example.
>
> Note, other OSes also have this same problem, look at all the work the
> BSDs are going through at the moment just to get closer to the place
> where we are in Linux today with removable devices and they have hit our
> same problems.
>
> > But to be fair, I also want to point out that there is a quite significant
> > difference regarding the usefulness of the revocable concept in C compared to in
> > Rust due to language capabilities.
>
> True, but we do need something. I took these patches without a real
> user as a base for us to start working off of. The rust implementation
> has shown that the design-pattern is a good solution for the problem,
> and so I feel we should work with it and try to get this working
> properly. We've been sitting and talking about it for years now, and
> here is the first real code submission that is getting us closer to fix
> the problem properly. It might not be perfict, but let's evolve it from
> here for what is found not to work correctly.
>
> So I don't want to take these reverts, let's try this out, by putting
> this into the driver core now, we have the base to experiment with in a
> "safe" way in lots of different driver subsytems at the same time. If
> it doesn't work out, worst case we revert it in a release or two because
> it didn't get used.
It's the wrong solution for most cases, if not all. It will spread in
drivers and then become another big piece of technical debt we'll wish
we had never merged. We know what the right solution to the cdev race
is, moving forward with the revocable API is shooting ourselves in the
foot. Or both feet. Or much worse. I can't stop everybody from harming
themselves, but I object vigourously when that impacts me. I want this
to be reverted now.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 13:22 ` Laurent Pinchart
@ 2026-01-25 14:07 ` Danilo Krummrich
2026-01-29 1:09 ` Laurent Pinchart
0 siblings, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-25 14:07 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Sun Jan 25, 2026 at 2:22 PM CET, Laurent Pinchart wrote:
> It's the wrong solution for most cases, if not all. It will spread in drivers
> and then become another big piece of technical debt we'll wish we had never
> merged.
It is a matter of how the revocable pattern is adopted. I.e. I don't think
drivers should create instances of revocable (device) resources by themselves.
Instead, I think it should be up to the corresponding subsystems to adopt the
pattern in the way necessary and make it accessible to drivers instead.
> We know what the right solution to the cdev race is
So, what is it? Assuming that this is what you are referring to, how do you
prevent accesses to (potentially freed) device resources after the bus device
has been unbound from the driver for subsystems that may still call back into
the driver after device unbind?
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 14:07 ` Danilo Krummrich
@ 2026-01-29 1:09 ` Laurent Pinchart
0 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 1:09 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Sun, Jan 25, 2026 at 03:07:41PM +0100, Danilo Krummrich wrote:
> On Sun Jan 25, 2026 at 2:22 PM CET, Laurent Pinchart wrote:
> > It's the wrong solution for most cases, if not all. It will spread in drivers
> > and then become another big piece of technical debt we'll wish we had never
> > merged.
>
> It is a matter of how the revocable pattern is adopted. I.e. I don't think
> drivers should create instances of revocable (device) resources by themselves.
> Instead, I think it should be up to the corresponding subsystems to adopt the
> pattern in the way necessary and make it accessible to drivers instead.
>
> > We know what the right solution to the cdev race is
>
> So, what is it? Assuming that this is what you are referring to, how do you
> prevent accesses to (potentially freed) device resources after the bus device
> has been unbound from the driver for subsystems that may still call back into
> the driver after device unbind?
I've answered this question in another e-mail in this thread, see
https://lore.kernel.org/all/20260129010822.GA3310904@killaraus/
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 12:47 ` Greg Kroah-Hartman
2026-01-25 13:22 ` Laurent Pinchart
@ 2026-01-25 13:24 ` Laurent Pinchart
2026-01-25 17:53 ` Danilo Krummrich
2026-01-26 13:50 ` Johan Hovold
3 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-25 13:24 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Danilo Krummrich, Johan Hovold, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, Jason Gunthorpe,
linux-doc, linux-kselftest, linux-kernel
On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg KH wrote:
> On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > this does not look like the right interface for the chardev unplug issue.
> >
> > I think it depends, we should do everything to prevent having the issue in the
> > first place, e.g. ensure that we synchronize the unplug properly on device
> > driver unbind.
> >
> > Sometimes, however, this isn't possible; this is where a revocable mechanism can
> > come in handy to prevent UAF of device resources -- DRM is a good example for
> > this.
>
> This is not "possible" for almost all real devices so we need something
> like this for almost all classes of devices, DRM just shows the extremes
> involved, v4l2 is also another good example.
Revocable is not needed in V4L2.
> Note, other OSes also have this same problem, look at all the work the
> BSDs are going through at the moment just to get closer to the place
> where we are in Linux today with removable devices and they have hit our
> same problems.
>
> > But to be fair, I also want to point out that there is a quite significant
> > difference regarding the usefulness of the revocable concept in C compared to in
> > Rust due to language capabilities.
>
> True, but we do need something. I took these patches without a real
> user as a base for us to start working off of. The rust implementation
> has shown that the design-pattern is a good solution for the problem,
> and so I feel we should work with it and try to get this working
> properly. We've been sitting and talking about it for years now, and
> here is the first real code submission that is getting us closer to fix
> the problem properly. It might not be perfict, but let's evolve it from
> here for what is found not to work correctly.
>
> So I don't want to take these reverts, let's try this out, by putting
> this into the driver core now, we have the base to experiment with in a
> "safe" way in lots of different driver subsytems at the same time. If
> it doesn't work out, worst case we revert it in a release or two because
> it didn't get used.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 12:47 ` Greg Kroah-Hartman
2026-01-25 13:22 ` Laurent Pinchart
2026-01-25 13:24 ` Laurent Pinchart
@ 2026-01-25 17:53 ` Danilo Krummrich
2026-01-26 0:07 ` Jason Gunthorpe
2026-01-26 13:50 ` Johan Hovold
3 siblings, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-25 17:53 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Johan Hovold, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sun Jan 25, 2026 at 1:47 PM CET, Greg Kroah-Hartman wrote:
> On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
>> But to be fair, I also want to point out that there is a quite significant
>> difference regarding the usefulness of the revocable concept in C compared to in
>> Rust due to language capabilities.
>
> True, but we do need something. I took these patches without a real
> user as a base for us to start working off of. The rust implementation
> has shown that the design-pattern is a good solution for the problem,
> and so I feel we should work with it and try to get this working
> properly.
I agree, it's a matter of figuring out the best way to adopt this pattern. (For
those reading along, [1] details the Rust implementation to illustrate why it
may not make sense to adopt it in the same way).
> So I don't want to take these reverts, let's try this out, by putting this
> into the driver core now, we have the base to experiment with in a "safe" way
> in lots of different driver subsytems at the same time.
I also don't think this should be reverted -- I think it makes sense to start
experimenting to figure out what's the best way to adopt this pattern.
I think DRM has already shown interest in adopting this, and I think I have also
seen patch series doing preparation work to be able to adopt this pattern as
well.
Perhaps, to address some of the concerns, a good middle ground could be to mark
the feature as experimental with a separate Kconfig in the meantime?
--
[1] Revocable in Rust
In Rust (most) device resources are only ever handed out to drivers within an
opaque container type specific for device resources, hence it is named
Devres<T>.
The Devres container type uses the Revocable type internally to protect the
actual device resource; the resource is released automatically once the
corresponding device is unbound from the driver (for which it uses the "normal"
devres infrastructure).
There are mainly two ways to access a device resource with a Devres container:
Devres::access() and Devres::try_access().
Devres::access() provides a zero-cost access to the inner device resource, but
requires a proof that it is called from a scope where it is guaranteed that the
device remains bound. This is achieved by Devres::access() taking a
&Device<Bound> argument, i.e. a reference (not reference count) of a device that
is guaranteed to be bound for the entire lifetime of the reference.
Let's have a look at an example with an IRQ handler:
struct MyIrqData {
bar: Devres<pci::Bar>,
}
impl irq::Handler for MyIrqData {
fn handle(&self, dev: &Device<Bound>) -> IrqReturn {
// Directly access the inner `pci::Bar`; fails if the resource
// did not originate from `dev`.
//
// (Internally, this is just a pointer comparison between `dev`
// and the device the `Devres` container has been created with.)
if let Ok(bar) = self.bar.access(dev) else {
return IrqReturn::None;
}
// `bar` is a `&pci::Bar`; no (S)RCU read side critical section
// involved.
bar.write32(...);
IrqReturn::Handled
}
}
Since the Rust IRQ handler always guarantees that it won't race with device
unbind, we can provide a &Device<Bound> cookie and hence directly access the
device resource with no cost. Due to the type system representation of the
device context state, this is checked at compile time.
We can do this for anything that guarantees a scope where the device must be
bound. For instance, if we guarantee that a workqueue is torn down on device
unbind, we can provide a &Device<Bound> cookie for all work items scheduled on
this workqueue. The same applies to subsystem and filesystem callbacks etc.
But, if we are in a scope where we don't have a &Device<Bound>, it means that
this may be executed after device unbind. Consequently, drivers can't call
Devres::access() for direct access of the device resource (because it may be a
UAF) and, instead, have to fall back to Devres::try_access() and friends.
Let's take some DRM IOCTL for instance:
struct MyDriver {
bar: Devres<pci::Bar>,
}
fn ioctl_vm_create(
drm: &drm::Device<MyDriver>,
req: &mut uapi::drm_mydriver_vm_create,
file: &drm::File<MyDriverFile>,
) -> Result<u32> {
// Runs the closure in an (S)RCU read side critical section if the
// resource is available, returns ENXIO otherwise.
drm.bar.try_access_with(|bar| {
// (S)RCU read side critical section starts here.
bar.write32(...);
// (S)RCU read side critical section ends here.
}).ok_or(ENXIO)?;
Ok(0)
}
I think those examples make it obvious why a revocable implementation on the C
side can't provide the same value and ergonomics due to language limitations,
yet I think it makes sense to start experimenting how subsystems can adopt this
design-pattern in C.
One additional note, while this overall can come across a bit paranoid, it
achieves a clear goal: It becomes impossible for drivers to mess this up and
create memory safety bugs, while at the same time causing zero overhead in hot
paths, that can be proven to not have a potential for such bugs in the first
place.
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 17:53 ` Danilo Krummrich
@ 2026-01-26 0:07 ` Jason Gunthorpe
2026-01-26 16:08 ` Danilo Krummrich
0 siblings, 1 reply; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-26 0:07 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel
On Sun, Jan 25, 2026 at 06:53:15PM +0100, Danilo Krummrich wrote:
>
> Let's take some DRM IOCTL for instance:
>
> struct MyDriver {
> bar: Devres<pci::Bar>,
> }
>
> fn ioctl_vm_create(
> drm: &drm::Device<MyDriver>,
> req: &mut uapi::drm_mydriver_vm_create,
> file: &drm::File<MyDriverFile>,
> ) -> Result<u32> {
> // Runs the closure in an (S)RCU read side critical section if the
> // resource is available, returns ENXIO otherwise.
> drm.bar.try_access_with(|bar| {
> // (S)RCU read side critical section starts here.
>
> bar.write32(...);
>
> // (S)RCU read side critical section ends here.
> }).ok_or(ENXIO)?;
>
> Ok(0)
> }
That's the whole issue with DRM right there - allowing driver code to
run after the driver has unregistered from the subsystem is
*dangerous* and creates all these bugs.
From a rust perspective I would argue you should be looking at every
one of those try_access_with() sites in drivers as a code smell that
says something is questionable in the driver or subsystem.
In many other subsystems a driver should *never* use
"try_access_with". Unfortunately the rust solution forces
synchronize_srcu()'s anyway (which Bartoz rightly pointed out as an
unacceptable performance issue). IMHO since rust has the Device<Bound>
stuff the revocable should have used rwsem, because the expectation
should be that the majority uses access, not try_access.
> I think those examples make it obvious why a revocable implementation on the C
> side can't provide the same value and ergonomics due to language limitations,
> yet I think it makes sense to start experimenting how subsystems can adopt this
> design-pattern in C.
The most important part of this pattern, IMHO, is documenting when you
are in a safe Device<Bound> scope or not.
What the C version of revocable does is just enforce it *everwhere*
without any thought as to if it is papering over a bigger problem.
In C protecting some limited "device resources" is not nearly good
enough for alot of drivers since the root issue here is often the
author doesn't understand the undocumented contexts when it is
"try_access_with" vs "access" and then makes a lot more errors than
just "device resources". :(
Frankly I don't think "iterating" is going to salvage this idea. The
real value from rust was not in creating a thin wrapper around
SRCU. It is in all the other stuff Danilo has explained.
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 0:07 ` Jason Gunthorpe
@ 2026-01-26 16:08 ` Danilo Krummrich
2026-01-26 17:07 ` Jason Gunthorpe
0 siblings, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-26 16:08 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel
On Mon Jan 26, 2026 at 1:07 AM CET, Jason Gunthorpe wrote:
> That's the whole issue with DRM right there - allowing driver code to
> run after the driver has unregistered from the subsystem is
> *dangerous* and creates all these bugs.
Unfortunately, it is necessary (at least to a certain extend) in DRM. I think
there is space for improvements, but I don't think we can get rid of this
entirely, especially on the KMS side AFAIK.
(KMS is not exactly my core competence, so it would be better for someone else
to explain the details.)
> From a rust perspective I would argue you should be looking at every
> one of those try_access_with() sites in drivers as a code smell that
> says something is questionable in the driver or subsystem.
Agreed, if that is necessary it requires special attention and justification.
> In many other subsystems a driver should *never* use "try_access_with".
> Unfortunately the rust solution forces synchronize_srcu()'s anyway (which
> Bartoz rightly pointed out as an unacceptable performance issue).
I'm already working on patches that get this down to a single
synchronize_{s}rcu() call on driver unbind, which is not a hot path at all. So,
this should be fine. This should work for C code as well.
> IMHO since rust has the Device<Bound> stuff the revocable should have used
> rwsem, because the expectation should be that the majority uses access, not
> try_access.
Yes, the majority of uses is access(), not try_access(); not sure if rwsem is
the better solution though.
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 16:08 ` Danilo Krummrich
@ 2026-01-26 17:07 ` Jason Gunthorpe
2026-01-26 22:36 ` Danilo Krummrich
2026-01-28 23:40 ` Laurent Pinchart
0 siblings, 2 replies; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-26 17:07 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel
On Mon, Jan 26, 2026 at 05:08:20PM +0100, Danilo Krummrich wrote:
> On Mon Jan 26, 2026 at 1:07 AM CET, Jason Gunthorpe wrote:
> > That's the whole issue with DRM right there - allowing driver code to
> > run after the driver has unregistered from the subsystem is
> > *dangerous* and creates all these bugs.
>
> Unfortunately, it is necessary (at least to a certain extend) in DRM. I think
> there is space for improvements, but I don't think we can get rid of this
> entirely, especially on the KMS side AFAIK.
From what I saw alot of the issue with DRM was how it works the fops,
instead of the core subsytem managing the fops and calling the driver,
the driver managed its own fops and called back to the core.
Sure, the issues may be very hard to fix in existing code, but I find
it hard to swallow the idea that a subsystem couldn't own all the
fops/etc and guard every driver callback with a lock to generate the
right kind of fence..
> > IMHO since rust has the Device<Bound> stuff the revocable should have used
> > rwsem, because the expectation should be that the majority uses access, not
> > try_access.
>
> Yes, the majority of uses is access(), not try_access(); not sure if rwsem is
> the better solution though.
rwsem is much faster on destroy and somewhat slower on read. Which
sounds to match the use case here. Ie you wouldn't need to do special
effort to bundle the synchronize_srcu()
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 17:07 ` Jason Gunthorpe
@ 2026-01-26 22:36 ` Danilo Krummrich
2026-01-28 23:40 ` Laurent Pinchart
1 sibling, 0 replies; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-26 22:36 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Greg Kroah-Hartman, Johan Hovold, Rafael J . Wysocki,
Tzung-Bi Shih, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Laurent Pinchart, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel
On Mon Jan 26, 2026 at 6:07 PM CET, Jason Gunthorpe wrote:
> On Mon, Jan 26, 2026 at 05:08:20PM +0100, Danilo Krummrich wrote:
>> Yes, the majority of uses is access(), not try_access(); not sure if rwsem is
>> the better solution though.
>
> rwsem is much faster on destroy and somewhat slower on read. Which
> sounds to match the use case here. Ie you wouldn't need to do special
> effort to bundle the synchronize_srcu()
While not that many, the try_access() cases may still be in hot paths, whereas
the destroy case is always in a cold path, i.e. device driver unbind. Also note
that if the resource is released "manually" before driver unbind, we use
revoke_nosync() as the destructor already guarantees that there are no more
users.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 17:07 ` Jason Gunthorpe
2026-01-26 22:36 ` Danilo Krummrich
@ 2026-01-28 23:40 ` Laurent Pinchart
1 sibling, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-28 23:40 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Danilo Krummrich, Greg Kroah-Hartman, Johan Hovold,
Rafael J . Wysocki, Tzung-Bi Shih, Bartosz Golaszewski,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel
On Mon, Jan 26, 2026 at 01:07:20PM -0400, Jason Gunthorpe wrote:
> On Mon, Jan 26, 2026 at 05:08:20PM +0100, Danilo Krummrich wrote:
> > On Mon Jan 26, 2026 at 1:07 AM CET, Jason Gunthorpe wrote:
> > > That's the whole issue with DRM right there - allowing driver code to
> > > run after the driver has unregistered from the subsystem is
> > > *dangerous* and creates all these bugs.
> >
> > Unfortunately, it is necessary (at least to a certain extend) in DRM. I think
> > there is space for improvements, but I don't think we can get rid of this
> > entirely, especially on the KMS side AFAIK.
>
> From what I saw alot of the issue with DRM was how it works the fops,
> instead of the core subsytem managing the fops and calling the driver,
> the driver managed its own fops and called back to the core.
>
> Sure, the issues may be very hard to fix in existing code, but I find
> it hard to swallow the idea that a subsystem couldn't own all the
> fops/etc and guard every driver callback with a lock to generate the
> right kind of fence..
I also don't see a real technical reason. Retrofitting the right
solution in all existing drivers wouldn't be for the faint-hearted
though, so I understand the appeal for subsystems of a quick and easy
suboptimal implementation.
> > > IMHO since rust has the Device<Bound> stuff the revocable should have used
> > > rwsem, because the expectation should be that the majority uses access, not
> > > try_access.
> >
> > Yes, the majority of uses is access(), not try_access(); not sure if rwsem is
> > the better solution though.
>
> rwsem is much faster on destroy and somewhat slower on read. Which
> sounds to match the use case here. Ie you wouldn't need to do special
> effort to bundle the synchronize_srcu()
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-25 12:47 ` Greg Kroah-Hartman
` (2 preceding siblings ...)
2026-01-25 17:53 ` Danilo Krummrich
@ 2026-01-26 13:50 ` Johan Hovold
2026-01-27 21:18 ` Bartosz Golaszewski
2026-02-03 12:15 ` Johan Hovold
3 siblings, 2 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-26 13:50 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > this does not look like the right interface for the chardev unplug issue.
> >
> > I think it depends, we should do everything to prevent having the issue in the
> > first place, e.g. ensure that we synchronize the unplug properly on device
> > driver unbind.
> >
> > Sometimes, however, this isn't possible; this is where a revocable mechanism can
> > come in handy to prevent UAF of device resources -- DRM is a good example for
> > this.
>
> This is not "possible" for almost all real devices so we need something
> like this for almost all classes of devices, DRM just shows the extremes
> involved, v4l2 is also another good example.
It's certainly possible to handle the chardev unplug issue without
revocable as several subsystems already do. All you need is a refcount,
a lock and a flag.
It may be possible to provide a generic solutions at the chardev level
or some kind of helper implementation (similar to revocable) for
subsystems to use directly.
But revocable appears to be too fine grained for this as when the
device goes away all operations must cease. There's no need to track
mmio regions individually as was suggested. This may be the mental model
for someone working with rust, but it isn't necessarily a good fit for
the rest of the kernel.
> > But to be fair, I also want to point out that there is a quite significant
> > difference regarding the usefulness of the revocable concept in C compared to in
> > Rust due to language capabilities.
>
> True, but we do need something. I took these patches without a real
> user as a base for us to start working off of. The rust implementation
> has shown that the design-pattern is a good solution for the problem,
> and so I feel we should work with it and try to get this working
> properly. We've been sitting and talking about it for years now, and
> here is the first real code submission that is getting us closer to fix
> the problem properly. It might not be perfict, but let's evolve it from
> here for what is found not to work correctly.
It's a design pattern that's perhaps needed for rust, but not
necessarily elsewhere. But either way there is no need to rush this. If
it turns out to be usable, it can be merged along with a future user.
Dropping the revocable_provider and revocable abstraction split should
even make it more palatable.
And with a new interface and a non-trivial user we can see what the
end-result looks like and decide where to go from there.
> So I don't want to take these reverts, let's try this out, by putting
> this into the driver core now, we have the base to experiment with in a
> "safe" way in lots of different driver subsytems at the same time. If
> it doesn't work out, worst case we revert it in a release or two because
> it didn't get used.
Please reconsider. Perhaps I didn't stress the point enough that the
current API needs to be reworked completely since there's no longer any
need for the two revocable abstractions.
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 13:50 ` Johan Hovold
@ 2026-01-27 21:18 ` Bartosz Golaszewski
2026-01-27 23:52 ` Jason Gunthorpe
2026-01-28 15:48 ` Johan Hovold
2026-02-03 12:15 ` Johan Hovold
1 sibling, 2 replies; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-27 21:18 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Danilo Krummrich, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Bartosz Golaszewski
On Mon, Jan 26, 2026 at 2:50 PM Johan Hovold <johan@kernel.org> wrote:
>
> On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> > On Sat, Jan 24, 2026 at 08:08:28PM +0100, Danilo Krummrich wrote:
> > > On Sat Jan 24, 2026 at 6:05 PM CET, Johan Hovold wrote:
> > > > this does not look like the right interface for the chardev unplug issue.
> > >
> > > I think it depends, we should do everything to prevent having the issue in the
> > > first place, e.g. ensure that we synchronize the unplug properly on device
> > > driver unbind.
> > >
> > > Sometimes, however, this isn't possible; this is where a revocable mechanism can
> > > come in handy to prevent UAF of device resources -- DRM is a good example for
> > > this.
> >
> > This is not "possible" for almost all real devices so we need something
> > like this for almost all classes of devices, DRM just shows the extremes
> > involved, v4l2 is also another good example.
>
> It's certainly possible to handle the chardev unplug issue without
> revocable as several subsystems already do. All you need is a refcount,
> a lock and a flag.
>
> It may be possible to provide a generic solutions at the chardev level
> or some kind of helper implementation (similar to revocable) for
> subsystems to use directly.
>
This echoes the heated exchange I recently had with Johan elsewhere so
I would like to chime in and use the wider forum of driver core
maintainers to settle an important question. It seems there are two
camps in this discussion: one whose perception of the problem is
limited to character devices being referenced from user-space at the
time of the driver unbind (favoring fixing the issues at the vfs
level) and another extending the problem to any driver unbinding where
we cannot ensure a proper ordering of the teardown (for whatever
reason: fw_devlink=off, helper auxiliary devices acting as
intermediates, or even user-space unbinding a driver manually with
bus-level sysfs attributes) leaving consumers of resources exposed by
providers that are gone with dangling references (focusing the
solutions on the subsystem level).
The question is: should we work towards making the kernel gracefully
handle any such situation or is it acceptable that if we do
"non-standard" things, we can trigger invalid memory accesses from
user-space. I'm asking this because I've been sending patches to
several subsystems addressing life-time issues at the subsystem level
with SRCU and I've faced resistance from Johan at least twice - not
based on the implementation details but on the philosophy itself of
synchronizing all accesses from consumers to providers (SRCU,
revocable or otherwise).
I myself am in the latter camp. My thinking is: if we expose an
interface, it should work correctly. In particular: it should not
allow the user (even root!) to crash the kernel. In addition: there
seems to be an agreement that rust in linux is good because of its
memory safety features. The issues we're discussing would have never
happened, had the code been written in rust so we should not just
accept them as normal in C or tell the user to "just not do it".
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 21:18 ` Bartosz Golaszewski
@ 2026-01-27 23:52 ` Jason Gunthorpe
2026-01-28 9:40 ` Bartosz Golaszewski
2026-01-29 1:08 ` Laurent Pinchart
2026-01-28 15:48 ` Johan Hovold
1 sibling, 2 replies; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-27 23:52 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Johan Hovold, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Wolfram Sang, Simona Vetter,
Dan Williams, linux-doc, linux-kselftest, linux-kernel,
Bartosz Golaszewski
On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
> maintainers to settle an important question. It seems there are two
> camps in this discussion: one whose perception of the problem is
> limited to character devices being referenced from user-space at the
> time of the driver unbind (favoring fixing the issues at the vfs
> level) and another extending the problem to any driver unbinding where
> we cannot ensure a proper ordering of the teardown (for whatever
> reason:
I don't think you can defend any position where the user can do
echo XYZ > /sys/.../YY
and the kernel has an oops.
Meaing in this discussion if the user does
echo ".." > /sys/bus/XX/drivers/YY/unbind
The kernel shouldn't oops or warn.
I've seen many kinds of bogus arguments over the years (especially
misunderstanding what the module refcount does!!), but ultimately I
think this is the generally agreed expectation.
However, in practice this isn't a common work flow, it is quite alot
of tricky work to understand how a subsystem works and put in the
necessary protections, and frankly many subsystems have had these bugs
for their entire existance. It isn't urgent.
Several subsystems do get it right, it is very possible and the best
practices are much more aligned with the Device<Bound> stuff in
Rust. ie guarantee in most contexts that remove() can't run.
I'm not surprised to hear pushback on trying to fix it, especially if
the proposed fixes are not subsystem comphrenesive in
nature. Sprinkling SRCU around as partial patches, especially in
drivers, is not a good idea, IMHO.
The reason cdev keeps coming up is because there are few common ways a
typical driver can actually generate concurrent operations during and
after remove that would be problematic.
File descriptors, subsystem callbacks, work queues, timers,
interrupts, and notifiers.
The latter already have robust schemes to help the driver shutdown and
end the concurrent operations. ie cancel_work_sync(),
del_timer_sync(), free_irq(), and *notifier_unregister().
Many wrappered file descriptors are safe. For example the sysfs usage
in a driver is sync stopped during device_del's calls to sysfs remove
functions.
IMHO the largest systemic issue in this space is people making their
own fops without understanding the lifecycle model and without
hand-rolling a special a "_sync" kind of shutdown around it.
A managed fops with a sync destruction operation would go a long way
to closing these issues.
ie the gpiolib example was basically all fops, one work and a bunch of
places where the protection was redundant.
Yes there are other cases, and certainly I've commonly seen cases of
drivers reaching into other drivers, and subsystem non-file ops, but
these cases usualy have other more fundamental issues with remove
races :(
So I would probably also take a strong position that introducing
something like DevRes where you try to wrapper MMIO or other device
resources is adamently not something we want to do. Not because I
don't care about these removal races, but because I want the drivers
to run in a Device<Bound> context with very few exceptions.
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 23:52 ` Jason Gunthorpe
@ 2026-01-28 9:40 ` Bartosz Golaszewski
2026-01-28 10:01 ` Wolfram Sang
2026-01-29 1:08 ` Laurent Pinchart
1 sibling, 1 reply; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-28 9:40 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Johan Hovold, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Wolfram Sang, Simona Vetter,
Dan Williams, linux-doc, linux-kselftest, linux-kernel,
Bartosz Golaszewski, Bartosz Golaszewski
On Wed, 28 Jan 2026 00:52:32 +0100, Jason Gunthorpe <jgg@nvidia.com> said:
> On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
>> maintainers to settle an important question. It seems there are two
>> camps in this discussion: one whose perception of the problem is
>> limited to character devices being referenced from user-space at the
>> time of the driver unbind (favoring fixing the issues at the vfs
>> level) and another extending the problem to any driver unbinding where
>> we cannot ensure a proper ordering of the teardown (for whatever
>> reason:
>
> I don't think you can defend any position where the user can do
>
> echo XYZ > /sys/.../YY
>
> and the kernel has an oops.
>
> Meaing in this discussion if the user does
> echo ".." > /sys/bus/XX/drivers/YY/unbind
>
> The kernel shouldn't oops or warn.
>
> I've seen many kinds of bogus arguments over the years (especially
> misunderstanding what the module refcount does!!), but ultimately I
> think this is the generally agreed expectation.
>
> However, in practice this isn't a common work flow, it is quite alot
> of tricky work to understand how a subsystem works and put in the
> necessary protections, and frankly many subsystems have had these bugs
> for their entire existance. It isn't urgent.
>
In general, that's true. However I have first bumped into this can of worms
at work when a client's custom device using a USB-to-I2C adapter and creating
devices dynamically from user-space would crash the kernel or freeze a thread
when the device would be disconnected with processes still referencing file
descriptors. It may not be common but I hope we do care about corner-cases
too.
So I'd say: it's fine as is until it isn't. :)
> Several subsystems do get it right, it is very possible and the best
> practices are much more aligned with the Device<Bound> stuff in
> Rust. ie guarantee in most contexts that remove() can't run.
>
> I'm not surprised to hear pushback on trying to fix it, especially if
> the proposed fixes are not subsystem comphrenesive in
> nature. Sprinkling SRCU around as partial patches, especially in
> drivers, is not a good idea, IMHO.
>
For sure. I've been trying to address it exclusively in subsystem code, where
this kind of logic belongs. After all, subsystem code is where we do complex
things to make drivers simpler. One exception is I2C where the logic is so
broken we need to first rework a lot of drivers. Wolfram is on board with that
though.
> The reason cdev keeps coming up is because there are few common ways a
> typical driver can actually generate concurrent operations during and
> after remove that would be problematic.
>
> File descriptors, subsystem callbacks, work queues, timers,
> interrupts, and notifiers.
>
> The latter already have robust schemes to help the driver shutdown and
> end the concurrent operations. ie cancel_work_sync(),
> del_timer_sync(), free_irq(), and *notifier_unregister().
>
> Many wrappered file descriptors are safe. For example the sysfs usage
> in a driver is sync stopped during device_del's calls to sysfs remove
> functions.
>
> IMHO the largest systemic issue in this space is people making their
> own fops without understanding the lifecycle model and without
> hand-rolling a special a "_sync" kind of shutdown around it.
>
> A managed fops with a sync destruction operation would go a long way
> to closing these issues.
>
> ie the gpiolib example was basically all fops, one work and a bunch of
> places where the protection was redundant.
>
> Yes there are other cases, and certainly I've commonly seen cases of
> drivers reaching into other drivers, and subsystem non-file ops, but
> these cases usualy have other more fundamental issues with remove
> races :(
>
> So I would probably also take a strong position that introducing
> something like DevRes where you try to wrapper MMIO or other device
> resources is adamently not something we want to do. Not because I
> don't care about these removal races, but because I want the drivers
> to run in a Device<Bound> context with very few exceptions.
>
> Jason
>
Rust makes it very clear who owns what. That's something that we struggle
a lot with in C, where we have drivers that create objects and then pass
them on to the subsystem for management but the latter still references
objects that are destroyed when the driver is unbound. Devres when used
right is great and the lifetime is very clear: from binding to unbinding.
The using right is the hard part though. :)
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 9:40 ` Bartosz Golaszewski
@ 2026-01-28 10:01 ` Wolfram Sang
2026-01-28 15:05 ` Jason Gunthorpe
0 siblings, 1 reply; 70+ messages in thread
From: Wolfram Sang @ 2026-01-28 10:01 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Jason Gunthorpe, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Laurent Pinchart,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
[-- Attachment #1: Type: text/plain, Size: 1172 bytes --]
> One exception is I2C where the logic is so broken we need to first
> rework a lot of drivers.
Let's say "bitrotten" instead of broken. People used what was available
at that time and they prevented the kernel from crashing, at least. And
up to now, nobody had the bandwidth to improve that part in I2C.
> Wolfram is on board with that though.
Ack. I want to emphasize here that for I2C the SRCU part goes into the
subsystem, not into the drivers.
(Disclaimer: I don't have the time to even read all the mails discussing
'revocable' despite it maybe being used in I2C. I am busy enough
handling the preparations needed to improve the I2C core to handle the
lifecycle issues. If 'revocable' is the final piece or not is a second
step for me. Maybe even a third.)
> > The reason cdev keeps coming up is because there are few common ways a
> > typical driver can actually generate concurrent operations during and
> > after remove that would be problematic.
Let me point out again that Dan Williams already had a PoC-patch for
handling the cdev issue generically [1]. Dunno if this fact is present
in the current discussion.
[1] https://lkml.org/lkml/2021/1/20/999
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 10:01 ` Wolfram Sang
@ 2026-01-28 15:05 ` Jason Gunthorpe
2026-01-28 15:20 ` Bartosz Golaszewski
2026-01-28 16:58 ` Wolfram Sang
0 siblings, 2 replies; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 15:05 UTC (permalink / raw)
To: Wolfram Sang
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Laurent Pinchart,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Wed, Jan 28, 2026 at 11:01:03AM +0100, Wolfram Sang wrote:
>
> > One exception is I2C where the logic is so broken we need to first
> > rework a lot of drivers.
>
> Let's say "bitrotten" instead of broken. People used what was available
> at that time and they prevented the kernel from crashing, at least. And
> up to now, nobody had the bandwidth to improve that part in I2C.
>
> > Wolfram is on board with that though.
>
> Ack. I want to emphasize here that for I2C the SRCU part goes into the
> subsystem, not into the drivers.
I would just gently advise again that SRCU is not a pancea and should
only be used if the read side sections are super performance
critical. I'm not sure that describes I2C. rwsem is often a simpler
and better choice.
> > > The reason cdev keeps coming up is because there are few common ways a
> > > typical driver can actually generate concurrent operations during and
> > > after remove that would be problematic.
>
> Let me point out again that Dan Williams already had a PoC-patch for
> handling the cdev issue generically [1]. Dunno if this fact is present
> in the current discussion.
>
> [1] https://lkml.org/lkml/2021/1/20/999
Yeah, this was brought up a couple drafts of possible options were
exchanged already but nothing was really focused on and polished.
It is a tricky problem to find a storage location for the lock and
revoke so that the fops shim can access it while not disturbing the
actual driver.
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 15:05 ` Jason Gunthorpe
@ 2026-01-28 15:20 ` Bartosz Golaszewski
2026-01-28 16:01 ` Jason Gunthorpe
2026-01-28 16:58 ` Wolfram Sang
1 sibling, 1 reply; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-28 15:20 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Wolfram Sang, Johan Hovold, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Simona Vetter, Dan Williams,
linux-doc, linux-kselftest, linux-kernel, Bartosz Golaszewski
On Wed, Jan 28, 2026 at 4:05 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jan 28, 2026 at 11:01:03AM +0100, Wolfram Sang wrote:
> >
> > > One exception is I2C where the logic is so broken we need to first
> > > rework a lot of drivers.
> >
> > Let's say "bitrotten" instead of broken. People used what was available
> > at that time and they prevented the kernel from crashing, at least. And
> > up to now, nobody had the bandwidth to improve that part in I2C.
> >
> > > Wolfram is on board with that though.
> >
> > Ack. I want to emphasize here that for I2C the SRCU part goes into the
> > subsystem, not into the drivers.
>
> I would just gently advise again that SRCU is not a pancea and should
> only be used if the read side sections are super performance
> critical. I'm not sure that describes I2C. rwsem is often a simpler
> and better choice.
>
Sure. We're not there yet so that's TBD. The reason for using SRCU in
GPIO was the fact that we have consumers that may call into GPIOLIB
from atomic context - so we must not sleep - but we also have sleeping
controllers - so we must not take a spinlock before entering driver
code. SRCU was about the only synchronization mechanism that worked in
all cases as the locking at the subsystem level needs to be
centralized.
Also SRCU protects pointers by design, which makes the split into
subsystem- and provider-owned data logical in this case:
struct subsys_data {
struct driver_data __rcu *priv;
struct device dev;
}
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 15:20 ` Bartosz Golaszewski
@ 2026-01-28 16:01 ` Jason Gunthorpe
2026-01-30 11:27 ` Bartosz Golaszewski
0 siblings, 1 reply; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 16:01 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Wolfram Sang, Johan Hovold, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Simona Vetter, Dan Williams,
linux-doc, linux-kselftest, linux-kernel, Bartosz Golaszewski
On Wed, Jan 28, 2026 at 04:20:02PM +0100, Bartosz Golaszewski wrote:
> On Wed, Jan 28, 2026 at 4:05 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 11:01:03AM +0100, Wolfram Sang wrote:
> > >
> > > > One exception is I2C where the logic is so broken we need to first
> > > > rework a lot of drivers.
> > >
> > > Let's say "bitrotten" instead of broken. People used what was available
> > > at that time and they prevented the kernel from crashing, at least. And
> > > up to now, nobody had the bandwidth to improve that part in I2C.
> > >
> > > > Wolfram is on board with that though.
> > >
> > > Ack. I want to emphasize here that for I2C the SRCU part goes into the
> > > subsystem, not into the drivers.
> >
> > I would just gently advise again that SRCU is not a pancea and should
> > only be used if the read side sections are super performance
> > critical. I'm not sure that describes I2C. rwsem is often a simpler
> > and better choice.
> >
>
> Sure. We're not there yet so that's TBD. The reason for using SRCU in
> GPIO was the fact that we have consumers that may call into GPIOLIB
> from atomic context - so we must not sleep - but we also have sleeping
> controllers - so we must not take a spinlock before entering driver
> code. SRCU was about the only synchronization mechanism that worked in
> all cases as the locking at the subsystem level needs to be
> centralized.
Oh, I didn't know you could safely use SRCU within atomic
sections. That's a pretty good reason to use it in some places.
> struct subsys_data {
> struct driver_data __rcu *priv;
> struct device dev;
> }
IMHO this is not a good pattern.
The thing to RCU protect is the *driver ops* pointer, not the driver
data. It is what I was saying before, the goal is to not call ops if
the driver is revoked so that the ops can assume they are in a
Device<Bound> context.
It should not be normal for the core code to be touching priv at all,
it has no natural reason to ever load it. It does have to fetch ops!
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 16:01 ` Jason Gunthorpe
@ 2026-01-30 11:27 ` Bartosz Golaszewski
0 siblings, 0 replies; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-30 11:27 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Wolfram Sang, Johan Hovold, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Simona Vetter, Dan Williams,
linux-doc, linux-kselftest, linux-kernel, Bartosz Golaszewski
On Wed, Jan 28, 2026 at 5:01 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Jan 28, 2026 at 04:20:02PM +0100, Bartosz Golaszewski wrote:
> > On Wed, Jan 28, 2026 at 4:05 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 11:01:03AM +0100, Wolfram Sang wrote:
> > > >
> > > > > One exception is I2C where the logic is so broken we need to first
> > > > > rework a lot of drivers.
> > > >
> > > > Let's say "bitrotten" instead of broken. People used what was available
> > > > at that time and they prevented the kernel from crashing, at least. And
> > > > up to now, nobody had the bandwidth to improve that part in I2C.
> > > >
> > > > > Wolfram is on board with that though.
> > > >
> > > > Ack. I want to emphasize here that for I2C the SRCU part goes into the
> > > > subsystem, not into the drivers.
> > >
> > > I would just gently advise again that SRCU is not a pancea and should
> > > only be used if the read side sections are super performance
> > > critical. I'm not sure that describes I2C. rwsem is often a simpler
> > > and better choice.
> > >
> >
> > Sure. We're not there yet so that's TBD. The reason for using SRCU in
> > GPIO was the fact that we have consumers that may call into GPIOLIB
> > from atomic context - so we must not sleep - but we also have sleeping
> > controllers - so we must not take a spinlock before entering driver
> > code. SRCU was about the only synchronization mechanism that worked in
> > all cases as the locking at the subsystem level needs to be
> > centralized.
>
> Oh, I didn't know you could safely use SRCU within atomic
> sections. That's a pretty good reason to use it in some places.
>
> > struct subsys_data {
> > struct driver_data __rcu *priv;
> > struct device dev;
> > }
>
> IMHO this is not a good pattern.
>
> The thing to RCU protect is the *driver ops* pointer, not the driver
> data. It is what I was saying before, the goal is to not call ops if
> the driver is revoked so that the ops can assume they are in a
> Device<Bound> context.
>
> It should not be normal for the core code to be touching priv at all,
> it has no natural reason to ever load it. It does have to fetch ops!
>
This is just an example and priv can hold whatever. Ideally this would
be ops but it's not always the case with existing code where ops and
driver data are often mixed into a single structure.
Bart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 15:05 ` Jason Gunthorpe
2026-01-28 15:20 ` Bartosz Golaszewski
@ 2026-01-28 16:58 ` Wolfram Sang
1 sibling, 0 replies; 70+ messages in thread
From: Wolfram Sang @ 2026-01-28 16:58 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Laurent Pinchart,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
[-- Attachment #1: Type: text/plain, Size: 806 bytes --]
> I would just gently advise again that SRCU is not a pancea and should
> only be used if the read side sections are super performance
> critical. I'm not sure that describes I2C. rwsem is often a simpler
> and better choice.
You might be totally right. But in any case, we need to prepare the
subsystem first. That is where I am now. Next steps come after that.
> > [1] https://lkml.org/lkml/2021/1/20/999
>
> Yeah, this was brought up a couple drafts of possible options were
> exchanged already but nothing was really focused on and polished.
>
> It is a tricky problem to find a storage location for the lock and
> revoke so that the fops shim can access it while not disturbing the
> actual driver.
I can imagine. Thanks for the heads up, nice to see it was not
forgotten!
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 23:52 ` Jason Gunthorpe
2026-01-28 9:40 ` Bartosz Golaszewski
@ 2026-01-29 1:08 ` Laurent Pinchart
2026-01-29 1:23 ` Jason Gunthorpe
2026-01-29 22:29 ` Danilo Krummrich
1 sibling, 2 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 1:08 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Tue, Jan 27, 2026 at 07:52:32PM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
> > maintainers to settle an important question. It seems there are two
> > camps in this discussion: one whose perception of the problem is
> > limited to character devices being referenced from user-space at the
> > time of the driver unbind (favoring fixing the issues at the vfs
> > level) and another extending the problem to any driver unbinding where
> > we cannot ensure a proper ordering of the teardown (for whatever
> > reason:
>
> I don't think you can defend any position where the user can do
>
> echo XYZ > /sys/.../YY
>
> and the kernel has an oops.
>
> Meaing in this discussion if the user does
> echo ".." > /sys/bus/XX/drivers/YY/unbind
>
> The kernel shouldn't oops or warn.
>
> I've seen many kinds of bogus arguments over the years (especially
> misunderstanding what the module refcount does!!), but ultimately I
> think this is the generally agreed expectation.
>
> However, in practice this isn't a common work flow, it is quite alot
> of tricky work to understand how a subsystem works and put in the
> necessary protections, and frankly many subsystems have had these bugs
> for their entire existance. It isn't urgent.
I completely agree with this approach. I don't think anyone really
advocates for considering that causing an oops from userspace is a
normal behaviour. While the various discussions on this topic that
escalated into heated arguments over time may have created an impression
that we have two camps, I think we are all fundamentally in the same
camp: we want to kernel to run reliably without crashing, with
performant technical solutions. And let's remember that we achieve this
goal best when we work together.
> Several subsystems do get it right, it is very possible and the best
> practices are much more aligned with the Device<Bound> stuff in
> Rust. ie guarantee in most contexts that remove() can't run.
>
> I'm not surprised to hear pushback on trying to fix it, especially if
> the proposed fixes are not subsystem comphrenesive in
> nature. Sprinkling SRCU around as partial patches, especially in
> drivers, is not a good idea, IMHO.
>
> The reason cdev keeps coming up is because there are few common ways a
> typical driver can actually generate concurrent operations during and
> after remove that would be problematic.
>
> File descriptors, subsystem callbacks, work queues, timers,
> interrupts, and notifiers.
>
> The latter already have robust schemes to help the driver shutdown and
> end the concurrent operations. ie cancel_work_sync(),
> del_timer_sync(), free_irq(), and *notifier_unregister().
One a side note, devm_request_irq() is another of the devm_* helpers
that cause race conditions, as interrupt handlers can run right after
.remove() returns, which drivers will most likely not handle correctly.
> Many wrappered file descriptors are safe. For example the sysfs usage
> in a driver is sync stopped during device_del's calls to sysfs remove
> functions.
>
> IMHO the largest systemic issue in this space is people making their
> own fops without understanding the lifecycle model and without
> hand-rolling a special a "_sync" kind of shutdown around it.
>
> A managed fops with a sync destruction operation would go a long way
> to closing these issues.
That's what I've been advocating for. The best way to ensure that driver
code will not accessed data freed at .remove() time is to prevent the
code to run at all.
We have different classes of problems. It's tempting to think that a
single solution can handle them all, but that's often not the best
option. Based on what I've seen so far, handling the fops vs. .remove()
race is best done by ensuring that fops do not race with .remove(). Dan
Williams showed that it can be done quite simply. Conceptually speaking
(not using (S)RCU or other specific kernel primitives, to try and
describe the concepts clearly), it works as follows.
1. At the beginning of .remove(), flag that the device is being removed.
2. At the entry point of all fops, check the flag and return with an
error if set. This prevents *new* fops from being entered once
.remove() has started.
3. At the entry point of all fops, flag that a fop is in progress (with
a counter).
4. In .remove(), wake up all threads sleeping in fops.
5. At the exit point of all fops, decrease the fop-in-progress counter.
6. In .remove(), wait until the fop-in-progress counter reaches 0. At
that point, it is guaranteed that all fops will have completed and
that no new fop can run.
7. Complete .remove(), freeing resources.
Most of this of course would need to be handled in helpers at the cdev
level, and wrapped in subsystem helpers. Drivers will need to be
modified to
- Call a helper at the beginning of .remove() to handle (1).
- Wake up threads sleeping in fops (3), as only drivers know about these
(in the general case, in some subsystems this may be handled in
helpers).
- Call a helper to handle (6).
The only fop that this doesn't handle is .release(), as we can't block
.remove() until userspace closes all open file descriptors and unmaps
all memory. This is also the race condition that is the easiest to
trigger (along with the race conditions related to threads sleeping in
fops).
Solving the .release() race also requires driver involvement too. By
definition drivers have tasks to perform in .release() that need to
access resources (MMIO, data structures, ...) - otherwise there would be
no .release() operation. Drivers need to synchronize between .remove()
and .release() and skip release tasks that have been performed by
.remove() already. Strategy depends on subsystems, in the V4L2 case
things can be more complex as it's common for drivers to create multiple
cdevs, but it's entirely maangeable with clear concepts that can be
documented, implemented, and checked during review.
> ie the gpiolib example was basically all fops, one work and a bunch of
> places where the protection was redundant.
>
> Yes there are other cases, and certainly I've commonly seen cases of
> drivers reaching into other drivers, and subsystem non-file ops, but
> these cases usualy have other more fundamental issues with remove
> races :(
Drivers using resources provided by other drivers is a more complex
problem. I'm thinking about a driver acquiring a GPIO in .probe(), and
the GPIO provider disappearing at a random point of time. Or a clock, or
a regulator. These issues are more rare if we ignore unbinding drivers
forcefully through sysfs, but they can still occur, so we should try to
find a solution here too (and the sysfs unbind issue is also worth
fixing). There, unlike in the cdev case, some sort of revocable API
could make sense.
> So I would probably also take a strong position that introducing
> something like DevRes where you try to wrapper MMIO or other device
> resources is adamently not something we want to do. Not because I
> don't care about these removal races, but because I want the drivers
> to run in a Device<Bound> context with very few exceptions.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 1:08 ` Laurent Pinchart
@ 2026-01-29 1:23 ` Jason Gunthorpe
2026-01-29 3:42 ` dan.j.williams
2026-01-29 10:38 ` Laurent Pinchart
2026-01-29 22:29 ` Danilo Krummrich
1 sibling, 2 replies; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-29 1:23 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> > The latter already have robust schemes to help the driver shutdown and
> > end the concurrent operations. ie cancel_work_sync(),
> > del_timer_sync(), free_irq(), and *notifier_unregister().
>
> One a side note, devm_request_irq() is another of the devm_* helpers
> that cause race conditions, as interrupt handlers can run right after
> .remove() returns, which drivers will most likely not handle correctly.
Yes! You *cannot* intermix devm and non-devm approaches without
creating very subtle bugs exactly like this. If your subsystem does
not provide a "devm register" helper its drivers shouldn't use devm.
> 1. At the beginning of .remove(), flag that the device is being removed.
>
> 2. At the entry point of all fops, check the flag and return with an
> error if set. This prevents *new* fops from being entered once
> .remove() has started.
>
> 3. At the entry point of all fops, flag that a fop is in progress (with
> a counter).
>
> 4. In .remove(), wake up all threads sleeping in fops.
>
> 5. At the exit point of all fops, decrease the fop-in-progress counter.
>
> 6. In .remove(), wait until the fop-in-progress counter reaches 0. At
> that point, it is guaranteed that all fops will have completed and
> that no new fop can run.
>
> 7. Complete .remove(), freeing resources.
This is is basically just open coding a rwsem.. :)
SRCU should be faster than this, IIRC it has less cache line bouncing.
But sure, it is all easy once you figure out how to give the fops shim
some place to store all this state since people would not agree to
make this a universal cost to all fops.
> > Yes there are other cases, and certainly I've commonly seen cases of
> > drivers reaching into other drivers, and subsystem non-file ops, but
> > these cases usualy have other more fundamental issues with remove
> > races :(
>
> Drivers using resources provided by other drivers is a more complex
> problem. I'm thinking about a driver acquiring a GPIO in .probe(), and
> the GPIO provider disappearing at a random point of time. Or a clock, or
> a regulator. These issues are more rare if we ignore unbinding drivers
> forcefully through sysfs, but they can still occur, so we should try to
> find a solution here too (and the sysfs unbind issue is also worth
> fixing). There, unlike in the cdev case, some sort of revocable API
> could make sense.
IMHO the sanest answer is to force the depending driver to unplug
before allowing the dependend driver to remove. Isn't that what the
dev link stuff does? IDK it has been forever now since I've done
embedded stuff..
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 1:23 ` Jason Gunthorpe
@ 2026-01-29 3:42 ` dan.j.williams
2026-01-29 9:56 ` Danilo Krummrich
2026-01-29 10:38 ` Laurent Pinchart
1 sibling, 1 reply; 70+ messages in thread
From: dan.j.williams @ 2026-01-29 3:42 UTC (permalink / raw)
To: Jason Gunthorpe, Laurent Pinchart
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> > > The latter already have robust schemes to help the driver shutdown and
> > > end the concurrent operations. ie cancel_work_sync(),
> > > del_timer_sync(), free_irq(), and *notifier_unregister().
> >
> > One a side note, devm_request_irq() is another of the devm_* helpers
> > that cause race conditions, as interrupt handlers can run right after
> > .remove() returns, which drivers will most likely not handle correctly.
>
> Yes! You *cannot* intermix devm and non-devm approaches without
> creating very subtle bugs exactly like this. If your subsystem does
> not provide a "devm register" helper its drivers shouldn't use devm.
I wonder if we should have a proactive debug mode that checks for
idiomatic devres usage and flags:
- registering devres actions while the driver is detached
- registering devres actions for a device with a driver that has a
.remove() method
- passing a devres allocation to a kobject API
- invoking devres release actions from a kobject release API
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 3:42 ` dan.j.williams
@ 2026-01-29 9:56 ` Danilo Krummrich
2026-01-29 10:43 ` Laurent Pinchart
2026-01-30 0:36 ` dan.j.williams
0 siblings, 2 replies; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-29 9:56 UTC (permalink / raw)
To: dan.j.williams
Cc: Jason Gunthorpe, Laurent Pinchart, Bartosz Golaszewski,
Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote:
> Jason Gunthorpe wrote:
>> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
>> > > The latter already have robust schemes to help the driver shutdown and
>> > > end the concurrent operations. ie cancel_work_sync(),
>> > > del_timer_sync(), free_irq(), and *notifier_unregister().
>> >
>> > One a side note, devm_request_irq() is another of the devm_* helpers
>> > that cause race conditions, as interrupt handlers can run right after
>> > .remove() returns, which drivers will most likely not handle correctly.
>>
>> Yes! You *cannot* intermix devm and non-devm approaches without
>> creating very subtle bugs exactly like this. If your subsystem does
>> not provide a "devm register" helper its drivers shouldn't use devm.
>
> I wonder if we should have a proactive debug mode that checks for
> idiomatic devres usage and flags:
>
> - registering devres actions while the driver is detached
In Rust we already enforce this through the type system, i.e. we even fail to
compile the code when this is violated. (I.e. for all scopes that guarantee that
a device is bound (and will remain throughout) we give out a &Device<Bound>,
which serves as a cookie.)
In C I don't really see how that would be possible without additional locking,
as the only thing we could check is dev->driver, which obviously is racy.
> - registering devres actions for a device with a driver that has a
> .remove() method
This is perfectly valid, drivers might still be performing teardown operations
on the device (if it did not get hot unplugged).
> - passing a devres allocation to a kobject API
Well, this is an ownership violation. Technically, devres owns this allocation
and devres releases the allocation when the device is unbound. Which makes this
allocation only ever valid to access from a scope that is guaranteed that the
device is (and remains) bound.
> - invoking devres release actions from a kobject release API
This is similar to "registering devres actions while the driver is detached" and
falls into the boarder problem class of "messing with devres objects outside of
a Device<Bound> scope".
Again, I think in C we can't properly protect against this. While we could also
give out a "Device<Bound>" token for scopes where we can guarantee that the
device is actually bound to a driver in C, we can't control the lifetime of the
token as we can in Rust, which makes it pointless.
So, the best thing we can probably do is document that "This must only be called
from a scope that guarantees that the device remains bound throughout." for all
the devres accessors.
There may be one thing that comes to my mind that we could do though. While we
can't catch those at compile time, we might be able to catch those on runtime.
For instance, we could "abuse" lockdep and register a fake lock for a
Device<Bound> scope and put a lockdep assert into all the devres accessors.
However, fixing up all the violations that come up when introducing this sounds
like a huge pain. :)
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 9:56 ` Danilo Krummrich
@ 2026-01-29 10:43 ` Laurent Pinchart
2026-01-30 0:36 ` dan.j.williams
1 sibling, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 10:43 UTC (permalink / raw)
To: Danilo Krummrich
Cc: dan.j.williams, Jason Gunthorpe, Bartosz Golaszewski,
Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Thu, Jan 29, 2026 at 10:56:22AM +0100, Danilo Krummrich wrote:
> On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote:
> > Jason Gunthorpe wrote:
> >> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> >> > > The latter already have robust schemes to help the driver shutdown and
> >> > > end the concurrent operations. ie cancel_work_sync(),
> >> > > del_timer_sync(), free_irq(), and *notifier_unregister().
> >> >
> >> > One a side note, devm_request_irq() is another of the devm_* helpers
> >> > that cause race conditions, as interrupt handlers can run right after
> >> > .remove() returns, which drivers will most likely not handle correctly.
> >>
> >> Yes! You *cannot* intermix devm and non-devm approaches without
> >> creating very subtle bugs exactly like this. If your subsystem does
> >> not provide a "devm register" helper its drivers shouldn't use devm.
> >
> > I wonder if we should have a proactive debug mode that checks for
> > idiomatic devres usage and flags:
> >
> > - registering devres actions while the driver is detached
>
> In Rust we already enforce this through the type system, i.e. we even fail to
> compile the code when this is violated. (I.e. for all scopes that guarantee that
> a device is bound (and will remain throughout) we give out a &Device<Bound>,
> which serves as a cookie.)
>
> In C I don't really see how that would be possible without additional locking,
> as the only thing we could check is dev->driver, which obviously is racy.
>
> > - registering devres actions for a device with a driver that has a
> > .remove() method
>
> This is perfectly valid, drivers might still be performing teardown operations
> on the device (if it did not get hot unplugged).
>
> > - passing a devres allocation to a kobject API
>
> Well, this is an ownership violation. Technically, devres owns this allocation
> and devres releases the allocation when the device is unbound. Which makes this
> allocation only ever valid to access from a scope that is guaranteed that the
> device is (and remains) bound.
>
> > - invoking devres release actions from a kobject release API
>
> This is similar to "registering devres actions while the driver is detached" and
> falls into the boarder problem class of "messing with devres objects outside of
> a Device<Bound> scope".
>
> Again, I think in C we can't properly protect against this. While we could also
> give out a "Device<Bound>" token for scopes where we can guarantee that the
> device is actually bound to a driver in C, we can't control the lifetime of the
> token as we can in Rust, which makes it pointless.
>
> So, the best thing we can probably do is document that "This must only be called
> from a scope that guarantees that the device remains bound throughout." for all
> the devres accessors.
>
> There may be one thing that comes to my mind that we could do though. While we
> can't catch those at compile time, we might be able to catch those on runtime.
C will not allow catching as many things as compile time as rust, that's
for sure, but I don't think it's the main issue here. The core of the
problem, in my opinion, is that we have seen a proliferation of devres
and similar helpers that were quickly adopted by drivers as it made
their life easier, *but* without any documentation of the caveats and
correct usage patterns. We have APIs that don't tell how to use them
correctly, so we can hardly blame driver developers for not doing it
right. In many cases we haven't even thought about what the right (and
wrong) usage patterns are, and in some cases the APIs have been
developed with so little awareness of these issues that there's no
correct usage pattern. Fixing this is the first step, then we can see
how to use the features provided by the programming language to minimize
the risk of incorrect usage.
> For instance, we could "abuse" lockdep and register a fake lock for a
> Device<Bound> scope and put a lockdep assert into all the devres accessors.
>
> However, fixing up all the violations that come up when introducing this sounds
> like a huge pain. :)
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 9:56 ` Danilo Krummrich
2026-01-29 10:43 ` Laurent Pinchart
@ 2026-01-30 0:36 ` dan.j.williams
1 sibling, 0 replies; 70+ messages in thread
From: dan.j.williams @ 2026-01-30 0:36 UTC (permalink / raw)
To: Danilo Krummrich, dan.j.williams
Cc: Jason Gunthorpe, Laurent Pinchart, Bartosz Golaszewski,
Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
Danilo Krummrich wrote:
> On Thu Jan 29, 2026 at 4:42 AM CET, dan.j.williams wrote:
> > Jason Gunthorpe wrote:
> >> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> >> > > The latter already have robust schemes to help the driver shutdown and
> >> > > end the concurrent operations. ie cancel_work_sync(),
> >> > > del_timer_sync(), free_irq(), and *notifier_unregister().
> >> >
> >> > One a side note, devm_request_irq() is another of the devm_* helpers
> >> > that cause race conditions, as interrupt handlers can run right after
> >> > .remove() returns, which drivers will most likely not handle correctly.
> >>
> >> Yes! You *cannot* intermix devm and non-devm approaches without
> >> creating very subtle bugs exactly like this. If your subsystem does
> >> not provide a "devm register" helper its drivers shouldn't use devm.
> >
> > I wonder if we should have a proactive debug mode that checks for
> > idiomatic devres usage and flags:
> >
> > - registering devres actions while the driver is detached
>
> In Rust we already enforce this through the type system, i.e. we even fail to
> compile the code when this is violated. (I.e. for all scopes that guarantee that
> a device is bound (and will remain throughout) we give out a &Device<Bound>,
> which serves as a cookie.)
>
> In C I don't really see how that would be possible without additional locking,
> as the only thing we could check is dev->driver, which obviously is racy.
>
> > - registering devres actions for a device with a driver that has a
> > .remove() method
>
> This is perfectly valid, drivers might still be performing teardown operations
> on the device (if it did not get hot unplugged).
Yes, this one is a soft warning. It is perfectly valid, but the first
thing I look for in a device that uses devm in ->probe() and also has a
->remove() method is dependencies of the devm object on the ->remove()
managed object. That case is potentially mitigated if all resources are
devm acquired and no ->remove() is needed.
> > - passing a devres allocation to a kobject API
>
> Well, this is an ownership violation. Technically, devres owns this allocation
> and devres releases the allocation when the device is unbound. Which makes this
> allocation only ever valid to access from a scope that is guaranteed that the
> device is (and remains) bound.
To be clear I am talking about:
dev2 = devm_kzalloc(dev1...)
init(dev2)
device_register(dev2)
...where it must be valid past unbind of dev1.
> > - invoking devres release actions from a kobject release API
>
> This is similar to "registering devres actions while the driver is detached" and
> falls into the boarder problem class of "messing with devres objects outside of
> a Device<Bound> scope".
>
> Again, I think in C we can't properly protect against this. While we could also
> give out a "Device<Bound>" token for scopes where we can guarantee that the
> device is actually bound to a driver in C, we can't control the lifetime of the
> token as we can in Rust, which makes it pointless.
This is why Rust remains on my, learn when I have time, pile. The goal
would not be to "properly protect", but "sufficiently warn" the
unsuspecting. If anything is going to get me to convert a subsystem to
Rust it is to get help from the compiler for the review burden of the
above abuses.
> So, the best thing we can probably do is document that "This must only be called
> from a scope that guarantees that the device remains bound throughout." for all
> the devres accessors.
Agree.
> There may be one thing that comes to my mind that we could do though. While we
> can't catch those at compile time, we might be able to catch those on runtime.
>
> For instance, we could "abuse" lockdep and register a fake lock for a
> Device<Bound> scope and put a lockdep assert into all the devres accessors.
>
> However, fixing up all the violations that come up when introducing this sounds
> like a huge pain. :)
Right, and as you said there are many valid uses that are not typically
recommended that would make the warning not useful.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 1:23 ` Jason Gunthorpe
2026-01-29 3:42 ` dan.j.williams
@ 2026-01-29 10:38 ` Laurent Pinchart
2026-01-29 13:34 ` Jason Gunthorpe
1 sibling, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 10:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Wed, Jan 28, 2026 at 09:23:22PM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> > > The latter already have robust schemes to help the driver shutdown and
> > > end the concurrent operations. ie cancel_work_sync(),
> > > del_timer_sync(), free_irq(), and *notifier_unregister().
> >
> > One a side note, devm_request_irq() is another of the devm_* helpers
> > that cause race conditions, as interrupt handlers can run right after
> > .remove() returns, which drivers will most likely not handle correctly.
>
> Yes! You *cannot* intermix devm and non-devm approaches without
> creating very subtle bugs exactly like this. If your subsystem does
> not provide a "devm register" helper its drivers shouldn't use devm.
I'd relax that rule a bit. There are resources that drivers must never,
ever access after .remove(), such as MMIO. Using devm_ioremap* should
therefore be safe in all cases.
> > 1. At the beginning of .remove(), flag that the device is being removed.
> >
> > 2. At the entry point of all fops, check the flag and return with an
> > error if set. This prevents *new* fops from being entered once
> > .remove() has started.
> >
> > 3. At the entry point of all fops, flag that a fop is in progress (with
> > a counter).
> >
> > 4. In .remove(), wake up all threads sleeping in fops.
> >
> > 5. At the exit point of all fops, decrease the fop-in-progress counter.
> >
> > 6. In .remove(), wait until the fop-in-progress counter reaches 0. At
> > that point, it is guaranteed that all fops will have completed and
> > that no new fop can run.
> >
> > 7. Complete .remove(), freeing resources.
>
> This is is basically just open coding a rwsem.. :)
Yes, and that's why I wrote that my explanation was conceptual :-) I
think it's important for developers (at least the ones developing
subsystem integration for this, if not all driver authors) to understand
the concepts behind it. Then we can optimize performance with the right
kernel primitives. If we start the API documentation by talking about
SRCU, people will be lost.
> SRCU should be faster than this, IIRC it has less cache line bouncing.
>
> But sure, it is all easy once you figure out how to give the fops shim
> some place to store all this state since people would not agree to
> make this a universal cost to all fops.
I didn't see any push back against Dan's proposal to store that
information in struct cdev, did I miss something ?
> > > Yes there are other cases, and certainly I've commonly seen cases of
> > > drivers reaching into other drivers, and subsystem non-file ops, but
> > > these cases usualy have other more fundamental issues with remove
> > > races :(
> >
> > Drivers using resources provided by other drivers is a more complex
> > problem. I'm thinking about a driver acquiring a GPIO in .probe(), and
> > the GPIO provider disappearing at a random point of time. Or a clock, or
> > a regulator. These issues are more rare if we ignore unbinding drivers
> > forcefully through sysfs, but they can still occur, so we should try to
> > find a solution here too (and the sysfs unbind issue is also worth
> > fixing). There, unlike in the cdev case, some sort of revocable API
> > could make sense.
>
> IMHO the sanest answer is to force the depending driver to unplug
> before allowing the dependend driver to remove. Isn't that what the
> dev link stuff does? IDK it has been forever now since I've done
> embedded stuff..
I think it's the simplest answer, yes. It's a bit like using a canon to
kill a fly though, but all other solutions would I think be much more
complex.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 10:38 ` Laurent Pinchart
@ 2026-01-29 13:34 ` Jason Gunthorpe
2026-01-29 14:52 ` Laurent Pinchart
0 siblings, 1 reply; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-29 13:34 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Thu, Jan 29, 2026 at 12:38:50PM +0200, Laurent Pinchart wrote:
> On Wed, Jan 28, 2026 at 09:23:22PM -0400, Jason Gunthorpe wrote:
> > On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> > > > The latter already have robust schemes to help the driver shutdown and
> > > > end the concurrent operations. ie cancel_work_sync(),
> > > > del_timer_sync(), free_irq(), and *notifier_unregister().
> > >
> > > One a side note, devm_request_irq() is another of the devm_* helpers
> > > that cause race conditions, as interrupt handlers can run right after
> > > .remove() returns, which drivers will most likely not handle correctly.
> >
> > Yes! You *cannot* intermix devm and non-devm approaches without
> > creating very subtle bugs exactly like this. If your subsystem does
> > not provide a "devm register" helper its drivers shouldn't use devm.
>
> I'd relax that rule a bit. There are resources that drivers must never,
> ever access after .remove(), such as MMIO. Using devm_ioremap* should
> therefore be safe in all cases.
Yeah, probably, but I've seen driver using devm before & after
non-devm and it is just too hard to tell if things are going to
even work right.
To be fair the IRQ issue is always more involved. The subsystem should
provide a state after unregistration where the memory is still around
and the IRQ path into the subsystem becomes a NOP. The driver then
frees the IRQ, fences work and releases the driver memory.
It is hard to do this sequence with devm..
I think a lot of places manage without this because seeing interrupts
after unregister is probably a rare race condition in their HW.
> > But sure, it is all easy once you figure out how to give the fops shim
> > some place to store all this state since people would not agree to
> > make this a universal cost to all fops.
>
> I didn't see any push back against Dan's proposal to store that
> information in struct cdev, did I miss something ?
I also don't see an issue with that, especially if we can stack misc
on top of cdev to share the same logic.
I think if you take that idea and the other proposal to shim the fops
with ones that use the cdev data then we can see some
cdev_unregister_sync() primitive.
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 13:34 ` Jason Gunthorpe
@ 2026-01-29 14:52 ` Laurent Pinchart
0 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 14:52 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bartosz Golaszewski, Johan Hovold, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski
On Thu, Jan 29, 2026 at 09:34:40AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 12:38:50PM +0200, Laurent Pinchart wrote:
> > On Wed, Jan 28, 2026 at 09:23:22PM -0400, Jason Gunthorpe wrote:
> > > On Thu, Jan 29, 2026 at 03:08:22AM +0200, Laurent Pinchart wrote:
> > > > > The latter already have robust schemes to help the driver shutdown and
> > > > > end the concurrent operations. ie cancel_work_sync(),
> > > > > del_timer_sync(), free_irq(), and *notifier_unregister().
> > > >
> > > > One a side note, devm_request_irq() is another of the devm_* helpers
> > > > that cause race conditions, as interrupt handlers can run right after
> > > > .remove() returns, which drivers will most likely not handle correctly.
> > >
> > > Yes! You *cannot* intermix devm and non-devm approaches without
> > > creating very subtle bugs exactly like this. If your subsystem does
> > > not provide a "devm register" helper its drivers shouldn't use devm.
> >
> > I'd relax that rule a bit. There are resources that drivers must never,
> > ever access after .remove(), such as MMIO. Using devm_ioremap* should
> > therefore be safe in all cases.
>
> Yeah, probably, but I've seen driver using devm before & after
> non-devm and it is just too hard to tell if things are going to
> even work right.
>
> To be fair the IRQ issue is always more involved. The subsystem should
> provide a state after unregistration where the memory is still around
> and the IRQ path into the subsystem becomes a NOP. The driver then
> frees the IRQ, fences work and releases the driver memory.
>
> It is hard to do this sequence with devm..
>
> I think a lot of places manage without this because seeing interrupts
> after unregister is probably a rare race condition in their HW.
>
> > > But sure, it is all easy once you figure out how to give the fops shim
> > > some place to store all this state since people would not agree to
> > > make this a universal cost to all fops.
> >
> > I didn't see any push back against Dan's proposal to store that
> > information in struct cdev, did I miss something ?
>
> I also don't see an issue with that, especially if we can stack misc
> on top of cdev to share the same logic.
>
> I think if you take that idea and the other proposal to shim the fops
> with ones that use the cdev data then we can see some
> cdev_unregister_sync() primitive.
I think we'll need to split that primitive in two (or add a second
primitive), as drivers need to wake up thread sleeping in fops between
flagging the cdev as being unregistered and completing the
unregistration with cdev_unregister_sync(). How to wake those threads up
is highly driver-specific (or at least subsystem-specific), so we need
two functions. Other than that, I think we're on the same page.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 1:08 ` Laurent Pinchart
2026-01-29 1:23 ` Jason Gunthorpe
@ 2026-01-29 22:29 ` Danilo Krummrich
2026-01-30 9:10 ` Laurent Pinchart
1 sibling, 1 reply; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-29 22:29 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Jason Gunthorpe, Bartosz Golaszewski, Johan Hovold,
Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski, maarten.lankhorst, mripard,
tzimmermann
(Cc: Maxime, Thomas, Maarten)
On Thu Jan 29, 2026 at 2:08 AM CET, Laurent Pinchart wrote:
> That's what I've been advocating for. The best way to ensure that driver
> code will not accessed data freed at .remove() time is to prevent the
> code to run at all.
With this we are in full agreement, I think that'd be best too. But, I also
think that sometimes this isn't possible. For instance, DRM has such a case with
atomic mode setting.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 22:29 ` Danilo Krummrich
@ 2026-01-30 9:10 ` Laurent Pinchart
2026-02-03 9:10 ` Maxime Ripard
0 siblings, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-30 9:10 UTC (permalink / raw)
To: Danilo Krummrich
Cc: Jason Gunthorpe, Bartosz Golaszewski, Johan Hovold,
Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, linux-doc, linux-kselftest,
linux-kernel, Bartosz Golaszewski, maarten.lankhorst, mripard,
tzimmermann
On Thu, Jan 29, 2026 at 11:29:03PM +0100, Danilo Krummrich wrote:
> (Cc: Maxime, Thomas, Maarten)
>
> On Thu Jan 29, 2026 at 2:08 AM CET, Laurent Pinchart wrote:
> > That's what I've been advocating for. The best way to ensure that driver
> > code will not accessed data freed at .remove() time is to prevent the
> > code to run at all.
>
> With this we are in full agreement, I think that'd be best too. But, I also
> think that sometimes this isn't possible. For instance, DRM has such a case with
> atomic mode setting.
I don't see why it would be impossible there.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-30 9:10 ` Laurent Pinchart
@ 2026-02-03 9:10 ` Maxime Ripard
2026-02-03 13:59 ` Laurent Pinchart
0 siblings, 1 reply; 70+ messages in thread
From: Maxime Ripard @ 2026-02-03 9:10 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Danilo Krummrich, Jason Gunthorpe, Bartosz Golaszewski,
Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, linux-doc,
linux-kselftest, linux-kernel, Bartosz Golaszewski,
maarten.lankhorst, tzimmermann
[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]
Hi,
On Fri, Jan 30, 2026 at 11:10:49AM +0200, Laurent Pinchart wrote:
> On Thu, Jan 29, 2026 at 11:29:03PM +0100, Danilo Krummrich wrote:
> > (Cc: Maxime, Thomas, Maarten)
> >
> > On Thu Jan 29, 2026 at 2:08 AM CET, Laurent Pinchart wrote:
> > > That's what I've been advocating for. The best way to ensure that driver
> > > code will not accessed data freed at .remove() time is to prevent the
> > > code to run at all.
> >
> > With this we are in full agreement, I think that'd be best too. But, I also
> > think that sometimes this isn't possible. For instance, DRM has such a case with
> > atomic mode setting.
>
> I don't see why it would be impossible there.
I'm not quite sure what you have in mind there, but DRM always allowed
the DRM driver to stick around longer than its device to accomodate the
fact that userspace might still have an open fd to it.
If userspace has an open fd, it can still call ioctl so preventing to
run any code is going to be difficult.
Maxime
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-02-03 9:10 ` Maxime Ripard
@ 2026-02-03 13:59 ` Laurent Pinchart
0 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-02-03 13:59 UTC (permalink / raw)
To: Maxime Ripard
Cc: Danilo Krummrich, Jason Gunthorpe, Bartosz Golaszewski,
Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, linux-doc,
linux-kselftest, linux-kernel, Bartosz Golaszewski,
maarten.lankhorst, tzimmermann
On Tue, Feb 03, 2026 at 10:10:57AM +0100, Maxime Ripard wrote:
> On Fri, Jan 30, 2026 at 11:10:49AM +0200, Laurent Pinchart wrote:
> > On Thu, Jan 29, 2026 at 11:29:03PM +0100, Danilo Krummrich wrote:
> > > (Cc: Maxime, Thomas, Maarten)
> > >
> > > On Thu Jan 29, 2026 at 2:08 AM CET, Laurent Pinchart wrote:
> > > > That's what I've been advocating for. The best way to ensure that driver
> > > > code will not accessed data freed at .remove() time is to prevent the
> > > > code to run at all.
> > >
> > > With this we are in full agreement, I think that'd be best too. But, I also
> > > think that sometimes this isn't possible. For instance, DRM has such a case with
> > > atomic mode setting.
> >
> > I don't see why it would be impossible there.
>
> I'm not quite sure what you have in mind there, but DRM always allowed
> the DRM driver to stick around longer than its device to accomodate the
> fact that userspace might still have an open fd to it.
>
> If userspace has an open fd, it can still call ioctl so preventing to
> run any code is going to be difficult.
Preventing new ioctls (and other fops) from being called isn't
difficult, they can be blocked at the entry point, outside of the
driver. In-progress ioctls running in other threads can also be forced
to complete before .remove() frees all memory. I think this is the right
thing to do.
The only fop that we can't completely prevent from running is
.release(), as we can't wait until userspace closes all file handles and
unmaps all memory before .remove() completes. There are solutions for
that.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 21:18 ` Bartosz Golaszewski
2026-01-27 23:52 ` Jason Gunthorpe
@ 2026-01-28 15:48 ` Johan Hovold
2026-01-29 9:11 ` Bartosz Golaszewski
2026-01-29 13:27 ` Linus Walleij
1 sibling, 2 replies; 70+ messages in thread
From: Johan Hovold @ 2026-01-28 15:48 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Greg Kroah-Hartman, Danilo Krummrich, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Bartosz Golaszewski
On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
> On Mon, Jan 26, 2026 at 2:50 PM Johan Hovold <johan@kernel.org> wrote:
> > It's certainly possible to handle the chardev unplug issue without
> > revocable as several subsystems already do. All you need is a refcount,
> > a lock and a flag.
> >
> > It may be possible to provide a generic solutions at the chardev level
> > or some kind of helper implementation (similar to revocable) for
> > subsystems to use directly.
>
> This echoes the heated exchange I recently had with Johan elsewhere so
> I would like to chime in and use the wider forum of driver core
> maintainers to settle an important question. It seems there are two
> camps in this discussion: one whose perception of the problem is
> limited to character devices being referenced from user-space at the
> time of the driver unbind (favoring fixing the issues at the vfs
> level) and another extending the problem to any driver unbinding where
> we cannot ensure a proper ordering of the teardown (for whatever
> reason: fw_devlink=off, helper auxiliary devices acting as
> intermediates, or even user-space unbinding a driver manually with
> bus-level sysfs attributes) leaving consumers of resources exposed by
> providers that are gone with dangling references (focusing the
> solutions on the subsystem level).
What I've been trying to get across is that the chardev hot-unplug issue
is real and needs to be fixed where it still exists, while the manual
unbinding of drivers by root is a corner case which does not need to be
addressed at *any* cost.
If addressing the latter by wrapping every resource access in code that
adds enough runtime overhead and makes drivers harder to write and
maintain it *may* not be worth it and we should instead explore
alternatives.
This may involve tracking consumers like fw_devlink already does today
so that they are unbound before their dependencies are.
Because in the end, how sound is a model where we allow critical
resources to silently go away while a device is still in use (e.g. you
won't discover that your emergency shutdown gpio is gone until you
actually need it)?
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 15:48 ` Johan Hovold
@ 2026-01-29 9:11 ` Bartosz Golaszewski
2026-01-29 10:56 ` Laurent Pinchart
2026-01-29 13:27 ` Linus Walleij
1 sibling, 1 reply; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-29 9:11 UTC (permalink / raw)
To: Johan Hovold
Cc: Bartosz Golaszewski, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Linus Walleij, Jonathan Corbet,
Shuah Khan, Laurent Pinchart, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Wed, Jan 28, 2026 at 4:48 PM Johan Hovold <johan@kernel.org> wrote:
>
> On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
> > On Mon, Jan 26, 2026 at 2:50 PM Johan Hovold <johan@kernel.org> wrote:
>
> > > It's certainly possible to handle the chardev unplug issue without
> > > revocable as several subsystems already do. All you need is a refcount,
> > > a lock and a flag.
> > >
> > > It may be possible to provide a generic solutions at the chardev level
> > > or some kind of helper implementation (similar to revocable) for
> > > subsystems to use directly.
> >
> > This echoes the heated exchange I recently had with Johan elsewhere so
> > I would like to chime in and use the wider forum of driver core
> > maintainers to settle an important question. It seems there are two
> > camps in this discussion: one whose perception of the problem is
> > limited to character devices being referenced from user-space at the
> > time of the driver unbind (favoring fixing the issues at the vfs
> > level) and another extending the problem to any driver unbinding where
> > we cannot ensure a proper ordering of the teardown (for whatever
> > reason: fw_devlink=off, helper auxiliary devices acting as
> > intermediates, or even user-space unbinding a driver manually with
> > bus-level sysfs attributes) leaving consumers of resources exposed by
> > providers that are gone with dangling references (focusing the
> > solutions on the subsystem level).
>
> What I've been trying to get across is that the chardev hot-unplug issue
> is real and needs to be fixed where it still exists, while the manual
> unbinding of drivers by root is a corner case which does not need to be
> addressed at *any* cost.
>
> If addressing the latter by wrapping every resource access in code that
> adds enough runtime overhead and makes drivers harder to write and
> maintain it *may* not be worth it and we should instead explore
> alternatives.
>
Alright, so we *do* agree at least on some parts. :)
I agree that any such change should not affect drivers. If you look at
the GPIO changes I did or the proposed nvmem rework - it never touched
drivers, only the subsystem level code. The latter especially is
really tiny, in fact:
drivers/nvmem/core.c | 172 +++++++++++++++++++++++---------------
drivers/nvmem/internals.h | 17 +++-
is all you need to make it not crash in the situations I described
under that series. Runtime overhead in read-sections with SRCU or
read-write semaphores is negligible and typically we only have to
write on driver unbind. So that "wrapping every resource access"
sounds scary but really is not.
GPIO work was bigger but it addressed way more synchronization issues
than just supplier unbinding.
For I2C both the problem is different (subsystem waiting forever for
consumers to release all references) and the culprit: memory used to
hold the reference-counted struct device is released the supplier
unbind unconditionally. Unfortunately there's no way around it other
than to first move it into a separate chunk managed by i2c core. But
that's not the synchronization part that leaks into the drivers, just
the need to move struct device out of struct i2c_adapter.
> This may involve tracking consumers like fw_devlink already does today
> so that they are unbound before their dependencies are.
>
During Saravana's talk at LPC we did briefly speak about whether it
would be possible to enforce devlinks for ALL devices linked in a
consumer-supplier fashion. I did in fact look into it for a bit on my
way back and it too would require at least subsystem-level changes
across all subsystems because you need to add that entry point at the
time of the resource being requested so it's not a no-cost operation.
But it is an alternative, yes though it'll require a comparable amount
of gap-plugging IMO.
> Because in the end, how sound is a model where we allow critical
> resources to silently go away while a device is still in use (e.g. you
> won't discover that your emergency shutdown gpio is gone until you
> actually need it)?
>
Well, we do allow it at the moment. It doesn't seem like devlink will
be able to cover 100% of use-cases anytime soon.
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 9:11 ` Bartosz Golaszewski
@ 2026-01-29 10:56 ` Laurent Pinchart
2026-01-29 13:50 ` Bartosz Golaszewski
0 siblings, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 10:56 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Johan Hovold, Bartosz Golaszewski, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, Jason Gunthorpe, linux-doc,
linux-kselftest, linux-kernel
On Thu, Jan 29, 2026 at 10:11:46AM +0100, Bartosz Golaszewski wrote:
> On Wed, Jan 28, 2026 at 4:48 PM Johan Hovold <johan@kernel.org> wrote:
> > On Tue, Jan 27, 2026 at 10:18:27PM +0100, Bartosz Golaszewski wrote:
> > > On Mon, Jan 26, 2026 at 2:50 PM Johan Hovold <johan@kernel.org> wrote:
> >
> > > > It's certainly possible to handle the chardev unplug issue without
> > > > revocable as several subsystems already do. All you need is a refcount,
> > > > a lock and a flag.
> > > >
> > > > It may be possible to provide a generic solutions at the chardev level
> > > > or some kind of helper implementation (similar to revocable) for
> > > > subsystems to use directly.
> > >
> > > This echoes the heated exchange I recently had with Johan elsewhere so
> > > I would like to chime in and use the wider forum of driver core
> > > maintainers to settle an important question. It seems there are two
> > > camps in this discussion: one whose perception of the problem is
> > > limited to character devices being referenced from user-space at the
> > > time of the driver unbind (favoring fixing the issues at the vfs
> > > level) and another extending the problem to any driver unbinding where
> > > we cannot ensure a proper ordering of the teardown (for whatever
> > > reason: fw_devlink=off, helper auxiliary devices acting as
> > > intermediates, or even user-space unbinding a driver manually with
> > > bus-level sysfs attributes) leaving consumers of resources exposed by
> > > providers that are gone with dangling references (focusing the
> > > solutions on the subsystem level).
> >
> > What I've been trying to get across is that the chardev hot-unplug issue
> > is real and needs to be fixed where it still exists, while the manual
> > unbinding of drivers by root is a corner case which does not need to be
> > addressed at *any* cost.
> >
> > If addressing the latter by wrapping every resource access in code that
> > adds enough runtime overhead and makes drivers harder to write and
> > maintain it *may* not be worth it and we should instead explore
> > alternatives.
>
> Alright, so we *do* agree at least on some parts. :)
>
> I agree that any such change should not affect drivers. If you look at
> the GPIO changes I did or the proposed nvmem rework - it never touched
> drivers, only the subsystem level code. The latter especially is
> really tiny, in fact:
>
> drivers/nvmem/core.c | 172 +++++++++++++++++++++++---------------
> drivers/nvmem/internals.h | 17 +++-
>
> is all you need to make it not crash in the situations I described
> under that series. Runtime overhead in read-sections with SRCU or
> read-write semaphores is negligible and typically we only have to
> write on driver unbind. So that "wrapping every resource access"
> sounds scary but really is not.
>
> GPIO work was bigger but it addressed way more synchronization issues
> than just supplier unbinding.
>
> For I2C both the problem is different (subsystem waiting forever for
> consumers to release all references) and the culprit: memory used to
> hold the reference-counted struct device is released the supplier
> unbind unconditionally. Unfortunately there's no way around it other
> than to first move it into a separate chunk managed by i2c core.
Isn't there ? Can't the driver-specific data structure be
reference-counted instead of unconditionally freed at unbind time ?
> But
> that's not the synchronization part that leaks into the drivers, just
> the need to move struct device out of struct i2c_adapter.
>
> > This may involve tracking consumers like fw_devlink already does today
> > so that they are unbound before their dependencies are.
>
> During Saravana's talk at LPC we did briefly speak about whether it
> would be possible to enforce devlinks for ALL devices linked in a
> consumer-supplier fashion. I did in fact look into it for a bit on my
> way back and it too would require at least subsystem-level changes
> across all subsystems because you need to add that entry point at the
> time of the resource being requested so it's not a no-cost operation.
> But it is an alternative, yes though it'll require a comparable amount
> of gap-plugging IMO.
I recall at least one driver (omap3isp) having a circular resource
issue. The ISP hardware block has the ability to produce a clock for the
camera sensor, and the camera sensor is a resource acquired by the ISP
driver. It's quite rare, but it happens. I would however not reject a
solution that would solve the 99.99% of the problem without addressing
this.
> > Because in the end, how sound is a model where we allow critical
> > resources to silently go away while a device is still in use (e.g. you
> > won't discover that your emergency shutdown gpio is gone until you
> > actually need it)?
>
> Well, we do allow it at the moment. It doesn't seem like devlink will
> be able to cover 100% of use-cases anytime soon.
We have this issue because designing resource management is hard. The
decision we made not to pay that cost has now turned into a huge
technical debt. There's no easy way around it, it won't be easier to
solve it correctly today than it was years ago. I don't know when we
will be able to fix the issue, but I know it will happen only when we
decide to face the situation and stop with band-aids.
What I think is the biggest issue at the moment is the lack of
motivation/time/money to address this huge, but I'm hopeful because I
trust the technical expertise of the community.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 10:56 ` Laurent Pinchart
@ 2026-01-29 13:50 ` Bartosz Golaszewski
2026-01-29 14:28 ` Jason Gunthorpe
2026-01-29 14:49 ` Laurent Pinchart
0 siblings, 2 replies; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-29 13:50 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Johan Hovold, Bartosz Golaszewski, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, Jason Gunthorpe, linux-doc,
linux-kselftest, linux-kernel, Bartosz Golaszewski
On Thu, 29 Jan 2026 11:56:34 +0100, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> said:
> On Thu, Jan 29, 2026 at 10:11:46AM +0100, Bartosz Golaszewski wrote:
>>
>> For I2C both the problem is different (subsystem waiting forever for
>> consumers to release all references) and the culprit: memory used to
>> hold the reference-counted struct device is released the supplier
>> unbind unconditionally. Unfortunately there's no way around it other
>> than to first move it into a separate chunk managed by i2c core.
>
> Isn't there ? Can't the driver-specific data structure be
> reference-counted instead of unconditionally freed at unbind time ?
>
Oh, for sure, if we did from the start. But we did not and there are now
hundreds of i2c drivers that do:
struct my_i2c_drv_data {
struct i2c_adapter adap;
int my_other_drv_data;
};
and in probe:
struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
(or just kzalloc() with kfree() in remove, it doesn't matter)
and the ownership of that data belongs to the driver. There's no way we could
address it now so the next best thing is to work towards moving the ownership
of struct i2c_adapter to the i2c core and make it reference counted using the
internal kobject of the associated struct device.
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 13:50 ` Bartosz Golaszewski
@ 2026-01-29 14:28 ` Jason Gunthorpe
2026-01-29 14:45 ` Laurent Pinchart
2026-01-29 14:49 ` Laurent Pinchart
1 sibling, 1 reply; 70+ messages in thread
From: Jason Gunthorpe @ 2026-01-29 14:28 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Laurent Pinchart, Johan Hovold, Bartosz Golaszewski,
Greg Kroah-Hartman, Danilo Krummrich, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, linux-doc,
linux-kselftest, linux-kernel
On Thu, Jan 29, 2026 at 08:50:30AM -0500, Bartosz Golaszewski wrote:
> and the ownership of that data belongs to the driver. There's no way we could
> address it now so the next best thing is to work towards moving the ownership
Think positive!
If this is common:
struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
Then change it into
struct my_i2c_drv_data *data = devm_i2c_adaptor_alloc(struct my_i2c_drv_data, adap);
With Coccinelle or sed.
Audit all the drivers to catch the stragglers.
Now you have a refcount. Look at how fwctl_alloc_device() works to
understand the pattern.
Kernel community has done far harder transformations than this :)
Sure it is 200 drivers, I would ask Coccinelle team for help.
Here is how I would approach it.
First, grep to find all the candidates:
$ git grep -E '^\s+struct i2c_adapter[^*]*;'
Get a kernel built with all those compiling and get a clangd database
going. Make a list of all possible candidate files with grep.
AI tells me (and AI is never right about Coccinelle sadly) that you
could use this:
// C1: Find any struct that has a member of type "struct i2c_adapter"
@ has_i2c_adapter_struct @
type S;
@@
struct S {
...
struct i2c_adapter;
...
};
// C2: Replace sizeof(...) with fixme_sizeof(...)
@ sizeof_i2c_adapter_struct depends on has_i2c_adapter_struct @
type has_i2c_adapter_struct.S;
@@
- sizeof(struct S)
+ fixme_sizeof(struct S)
The idea being the only reason to do sizeof(S) is for an allocation
and we want to find every allocation of a wrapper struct to fix it.
Now you have an index of all lines that need touching.
Look for common patterns, use Coccinelle or sed to make bulk
replacements. Group patches of all similar transformations. Sweep
through with grep to clean anything not caught. Probably there will be
a couple drivers doing something utterly insane, meditate on them and
clean them up by hand (this is what clangd is helpful for)
Betcha you can get through it in a few hours!
Jason
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 14:28 ` Jason Gunthorpe
@ 2026-01-29 14:45 ` Laurent Pinchart
0 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 14:45 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Bartosz Golaszewski, Johan Hovold, Bartosz Golaszewski,
Greg Kroah-Hartman, Danilo Krummrich, Rafael J . Wysocki,
Tzung-Bi Shih, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, linux-doc,
linux-kselftest, linux-kernel
On Thu, Jan 29, 2026 at 10:28:36AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 29, 2026 at 08:50:30AM -0500, Bartosz Golaszewski wrote:
>
> > and the ownership of that data belongs to the driver. There's no way we could
> > address it now so the next best thing is to work towards moving the ownership
>
> Think positive!
>
> If this is common:
>
> struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
>
> Then change it into
>
> struct my_i2c_drv_data *data = devm_i2c_adaptor_alloc(struct my_i2c_drv_data, adap);
>
> With Coccinelle or sed.
>
> Audit all the drivers to catch the stragglers.
>
> Now you have a refcount. Look at how fwctl_alloc_device() works to
> understand the pattern.
>
> Kernel community has done far harder transformations than this :)
>
> Sure it is 200 drivers, I would ask Coccinelle team for help.
We rewrote the device model between v2.4 and v2.6 (and by "we" I mostly
mean kudos to Greg for that work, as well as to all the people who
worked with him who I don't know about). That impacted *all* the
drivers. We can do this if we want to.
> Here is how I would approach it.
>
> First, grep to find all the candidates:
>
> $ git grep -E '^\s+struct i2c_adapter[^*]*;'
>
> Get a kernel built with all those compiling and get a clangd database
> going. Make a list of all possible candidate files with grep.
>
> AI tells me (and AI is never right about Coccinelle sadly) that you
> could use this:
>
> // C1: Find any struct that has a member of type "struct i2c_adapter"
> @ has_i2c_adapter_struct @
> type S;
> @@
> struct S {
> ...
> struct i2c_adapter;
> ...
> };
>
> // C2: Replace sizeof(...) with fixme_sizeof(...)
> @ sizeof_i2c_adapter_struct depends on has_i2c_adapter_struct @
> type has_i2c_adapter_struct.S;
> @@
> - sizeof(struct S)
> + fixme_sizeof(struct S)
>
> The idea being the only reason to do sizeof(S) is for an allocation
> and we want to find every allocation of a wrapper struct to fix it.
>
> Now you have an index of all lines that need touching.
>
> Look for common patterns, use Coccinelle or sed to make bulk
> replacements. Group patches of all similar transformations. Sweep
> through with grep to clean anything not caught. Probably there will be
> a couple drivers doing something utterly insane, meditate on them and
> clean them up by hand (this is what clangd is helpful for)
>
> Betcha you can get through it in a few hours!
>
> Jason
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 13:50 ` Bartosz Golaszewski
2026-01-29 14:28 ` Jason Gunthorpe
@ 2026-01-29 14:49 ` Laurent Pinchart
2026-01-29 22:00 ` Danilo Krummrich
2026-01-30 11:19 ` Bartosz Golaszewski
1 sibling, 2 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-29 14:49 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Johan Hovold, Bartosz Golaszewski, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, Jason Gunthorpe, linux-doc,
linux-kselftest, linux-kernel
On Thu, Jan 29, 2026 at 08:50:30AM -0500, Bartosz Golaszewski wrote:
> On Thu, 29 Jan 2026 11:56:34 +0100, Laurent Pinchart said:
> > On Thu, Jan 29, 2026 at 10:11:46AM +0100, Bartosz Golaszewski wrote:
> >>
> >> For I2C both the problem is different (subsystem waiting forever for
> >> consumers to release all references) and the culprit: memory used to
> >> hold the reference-counted struct device is released the supplier
> >> unbind unconditionally. Unfortunately there's no way around it other
> >> than to first move it into a separate chunk managed by i2c core.
> >
> > Isn't there ? Can't the driver-specific data structure be
> > reference-counted instead of unconditionally freed at unbind time ?
>
> Oh, for sure, if we did from the start. But we did not and there are now
> hundreds of i2c drivers that do:
>
> struct my_i2c_drv_data {
> struct i2c_adapter adap;
> int my_other_drv_data;
> };
>
> and in probe:
>
> struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
>
> (or just kzalloc() with kfree() in remove, it doesn't matter)
>
> and the ownership of that data belongs to the driver. There's no way we could
> address it now so the next best thing is to work towards moving the ownership
> of struct i2c_adapter to the i2c core and make it reference counted using the
> internal kobject of the associated struct device.
What I'm reading here is essentially that we rolled out devm_kzalloc()
too quickly without understanding the consequences, and it has spread so
much that it can't be fixed properly now, so we need to find a
workaround. And now we're trying to work around the problem by rolling
out a revocable API that has barely seen any testing, and is known to
have design issues. Does any one else see the irony ? :-)
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 14:49 ` Laurent Pinchart
@ 2026-01-29 22:00 ` Danilo Krummrich
2026-01-30 11:19 ` Bartosz Golaszewski
1 sibling, 0 replies; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-29 22:00 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Bartosz Golaszewski, Johan Hovold, Bartosz Golaszewski,
Greg Kroah-Hartman, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, Jason Gunthorpe, linux-doc,
linux-kselftest, linux-kernel
On Thu Jan 29, 2026 at 3:49 PM CET, Laurent Pinchart wrote:
> On Thu, Jan 29, 2026 at 08:50:30AM -0500, Bartosz Golaszewski wrote:
>> On Thu, 29 Jan 2026 11:56:34 +0100, Laurent Pinchart said:
>> > On Thu, Jan 29, 2026 at 10:11:46AM +0100, Bartosz Golaszewski wrote:
>> >>
>> >> For I2C both the problem is different (subsystem waiting forever for
>> >> consumers to release all references) and the culprit: memory used to
>> >> hold the reference-counted struct device is released the supplier
>> >> unbind unconditionally. Unfortunately there's no way around it other
>> >> than to first move it into a separate chunk managed by i2c core.
>> >
>> > Isn't there ? Can't the driver-specific data structure be
>> > reference-counted instead of unconditionally freed at unbind time ?
>>
>> Oh, for sure, if we did from the start. But we did not and there are now
>> hundreds of i2c drivers that do:
>>
>> struct my_i2c_drv_data {
>> struct i2c_adapter adap;
>> int my_other_drv_data;
>> };
>>
>> and in probe:
>>
>> struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
>>
>> (or just kzalloc() with kfree() in remove, it doesn't matter)
>>
>> and the ownership of that data belongs to the driver. There's no way we could
>> address it now so the next best thing is to work towards moving the ownership
>> of struct i2c_adapter to the i2c core and make it reference counted using the
>> internal kobject of the associated struct device.
>
> What I'm reading here is essentially that we rolled out devm_kzalloc() too
> quickly without understanding the consequences, and it has spread so much that
> it can't be fixed properly now, so we need to find a workaround.
I'm pretty sure I don't have all the details about the problem with the
i2c_adapter, but from what I read here briefly the problem simply seems to be
that currently the driver providing the i2c_adapter frees the i2c_adapter on
driver unbind unconditionally.
I would argue that this is a violation of the driver core design anyways. I
mean, in the end an i2c_adapter is the same type of device as any other bus
device (platform, PCI, etc.).
For instance, when the physical device that is represented by a PCI device is
removed from the machine, the corresponding struct pci_dev is not forcefully
freed either, it stays around as long as there are references to the device.
So, I fail to see how devm_kmalloc() and frieds are the root cause of the
i2c_adapter lifetime problem?
> And now we're trying to work around the problem by rolling out a revocable API
> that has barely seen any testing, and is known to have design issues. Does any
> one else see the irony ? :-)
I don't think anyone is trying to work around problems with devm_kmalloc() and
friends. It's just system memory, i.e. nothing that needs to be revoked. We can
just not use devm_kmalloc() and friends if we need the memory to outlive driver
unbind for some reason. The problem is about real device resources, such as I/O
memory mappings that *must* be released when a driver is unbound from its
device. So, revocable is clearly not a fix for devm_kmalloc() et al.
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 14:49 ` Laurent Pinchart
2026-01-29 22:00 ` Danilo Krummrich
@ 2026-01-30 11:19 ` Bartosz Golaszewski
1 sibling, 0 replies; 70+ messages in thread
From: Bartosz Golaszewski @ 2026-01-30 11:19 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Johan Hovold, Bartosz Golaszewski, Greg Kroah-Hartman,
Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Linus Walleij, Jonathan Corbet, Shuah Khan, Wolfram Sang,
Simona Vetter, Dan Williams, Jason Gunthorpe, linux-doc,
linux-kselftest, linux-kernel
On Thu, Jan 29, 2026 at 3:49 PM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> On Thu, Jan 29, 2026 at 08:50:30AM -0500, Bartosz Golaszewski wrote:
> > On Thu, 29 Jan 2026 11:56:34 +0100, Laurent Pinchart said:
> > > On Thu, Jan 29, 2026 at 10:11:46AM +0100, Bartosz Golaszewski wrote:
> > >>
> > >> For I2C both the problem is different (subsystem waiting forever for
> > >> consumers to release all references) and the culprit: memory used to
> > >> hold the reference-counted struct device is released the supplier
> > >> unbind unconditionally. Unfortunately there's no way around it other
> > >> than to first move it into a separate chunk managed by i2c core.
> > >
> > > Isn't there ? Can't the driver-specific data structure be
> > > reference-counted instead of unconditionally freed at unbind time ?
> >
> > Oh, for sure, if we did from the start. But we did not and there are now
> > hundreds of i2c drivers that do:
> >
> > struct my_i2c_drv_data {
> > struct i2c_adapter adap;
> > int my_other_drv_data;
> > };
> >
> > and in probe:
> >
> > struct my_i2c_drv_data *data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
> >
> > (or just kzalloc() with kfree() in remove, it doesn't matter)
> >
> > and the ownership of that data belongs to the driver. There's no way we could
> > address it now so the next best thing is to work towards moving the ownership
> > of struct i2c_adapter to the i2c core and make it reference counted using the
> > internal kobject of the associated struct device.
>
> What I'm reading here is essentially that we rolled out devm_kzalloc()
> too quickly without understanding the consequences, and it has spread so
> much that it can't be fixed properly now, so we need to find a
> workaround. And now we're trying to work around the problem by rolling
> out a revocable API that has barely seen any testing, and is known to
> have design issues. Does any one else see the irony ? :-)
>
No, this has nothing to do with devm_anything(). It's resource
ownership. What driver creates at probe(), the driver should destroy
at remove(). I really hate with a passion the pattern where a driver
creates something in probe() and then tosses it over to the subsystem
for management. If an entity registers the struct device, it should
also be the one who allocates and manages the memory for it. Devres
just made it a bit easier to commit this kind of errors but they would
exist nevertheless.
Bartosz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 15:48 ` Johan Hovold
2026-01-29 9:11 ` Bartosz Golaszewski
@ 2026-01-29 13:27 ` Linus Walleij
1 sibling, 0 replies; 70+ messages in thread
From: Linus Walleij @ 2026-01-29 13:27 UTC (permalink / raw)
To: Johan Hovold
Cc: Bartosz Golaszewski, Greg Kroah-Hartman, Danilo Krummrich,
Rafael J . Wysocki, Tzung-Bi Shih, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel,
Bartosz Golaszewski
Doing a quick interlude here,
On Wed, Jan 28, 2026 at 4:48 PM Johan Hovold <johan@kernel.org> wrote:
> What I've been trying to get across is that the chardev hot-unplug issue
> is real and needs to be fixed where it still exists, while the manual
> unbinding of drivers by root is a corner case which does not need to be
> addressed at *any* cost.
I agree with Johans stance here.
I might have a skewed view of history and be a bit self-delusional,
but at the time I began the work with the GPIO character device, what
we were seeing were these GPIO adapters on USB.
Those were/are hot-pluggable, and provide GPIOs that are not
for "system critical" things, i.e. not critical for the computer/system
it is running on.
Instead these are things like the Mikroe Clickboard GPIO expander
and TTY, or these GPIOs that come on FTDI USB adapters. These
can and are certainly being used for things that are critical to the thing
they are used for, such as regulating water reserves for your local
hydro power plant. Of course only a madman would unplug that,
but from Linux' POV they are very hot-pluggable, by their very nature.
[Side quest on what are some insanities industrial system people
do here... but these adapters are *very* popular.]
Since they are hot-pluggable and will never appear in DTS files
or such, a character device or other userspace ABI is the right
abstraction, and it needs to be able to come and go without crashing
or wasting memory. Plug/unplug the Mikroe clickboard GPIO
10.000 times should be fine under any circumstances.
And that is the problem that need to be solved *first*, before
removing things on (local) I2C buses or unbinding random devices
from sysfs making them unpluggable in theory.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-26 13:50 ` Johan Hovold
2026-01-27 21:18 ` Bartosz Golaszewski
@ 2026-02-03 12:15 ` Johan Hovold
2026-02-03 12:26 ` Greg Kroah-Hartman
1 sibling, 1 reply; 70+ messages in thread
From: Johan Hovold @ 2026-02-03 12:15 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
Hi Greg,
On Mon, Jan 26, 2026 at 02:50:05PM +0100, Johan Hovold wrote:
> On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> > True, but we do need something. I took these patches without a real
> > user as a base for us to start working off of. The rust implementation
> > has shown that the design-pattern is a good solution for the problem,
> > and so I feel we should work with it and try to get this working
> > properly. We've been sitting and talking about it for years now, and
> > here is the first real code submission that is getting us closer to fix
> > the problem properly. It might not be perfict, but let's evolve it from
> > here for what is found not to work correctly.
>
> It's a design pattern that's perhaps needed for rust, but not
> necessarily elsewhere. But either way there is no need to rush this. If
> it turns out to be usable, it can be merged along with a future user.
>
> Dropping the revocable_provider and revocable abstraction split should
> even make it more palatable.
>
> And with a new interface and a non-trivial user we can see what the
> end-result looks like and decide where to go from there.
>
> > So I don't want to take these reverts, let's try this out, by putting
> > this into the driver core now, we have the base to experiment with in a
> > "safe" way in lots of different driver subsytems at the same time. If
> > it doesn't work out, worst case we revert it in a release or two because
> > it didn't get used.
>
> Please reconsider. Perhaps I didn't stress the point enough that the
> current API needs to be reworked completely since there's no longer any
> need for the two revocable abstractions.
I noticed that you picked up the proposed incremental fixes to the
issues I reported without anyone even having reviewed them. The fixes
being incremental makes it a lot harder to review, but based on a quick
look it seems there needs to be further changes.
And again, what is the rush? Anyone wanting to experiment with this
functionality only needs to apply a single patch. And exposing the API
before it is stable is just going to be a mess as subsystems may start
using it from day one.
So please, just drop it for 6.20. You can still merge this for the next
cycle when the basic functionality has been fixed.
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-02-03 12:15 ` Johan Hovold
@ 2026-02-03 12:26 ` Greg Kroah-Hartman
2026-02-03 12:30 ` [PATCH] driver core: disable revocable code from build Greg Kroah-Hartman
` (2 more replies)
0 siblings, 3 replies; 70+ messages in thread
From: Greg Kroah-Hartman @ 2026-02-03 12:26 UTC (permalink / raw)
To: Johan Hovold
Cc: Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue, Feb 03, 2026 at 01:15:50PM +0100, Johan Hovold wrote:
> Hi Greg,
>
> On Mon, Jan 26, 2026 at 02:50:05PM +0100, Johan Hovold wrote:
> > On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
>
> > > True, but we do need something. I took these patches without a real
> > > user as a base for us to start working off of. The rust implementation
> > > has shown that the design-pattern is a good solution for the problem,
> > > and so I feel we should work with it and try to get this working
> > > properly. We've been sitting and talking about it for years now, and
> > > here is the first real code submission that is getting us closer to fix
> > > the problem properly. It might not be perfict, but let's evolve it from
> > > here for what is found not to work correctly.
> >
> > It's a design pattern that's perhaps needed for rust, but not
> > necessarily elsewhere. But either way there is no need to rush this. If
> > it turns out to be usable, it can be merged along with a future user.
> >
> > Dropping the revocable_provider and revocable abstraction split should
> > even make it more palatable.
> >
> > And with a new interface and a non-trivial user we can see what the
> > end-result looks like and decide where to go from there.
> >
> > > So I don't want to take these reverts, let's try this out, by putting
> > > this into the driver core now, we have the base to experiment with in a
> > > "safe" way in lots of different driver subsytems at the same time. If
> > > it doesn't work out, worst case we revert it in a release or two because
> > > it didn't get used.
> >
> > Please reconsider. Perhaps I didn't stress the point enough that the
> > current API needs to be reworked completely since there's no longer any
> > need for the two revocable abstractions.
>
> I noticed that you picked up the proposed incremental fixes to the
> issues I reported without anyone even having reviewed them. The fixes
> being incremental makes it a lot harder to review, but based on a quick
> look it seems there needs to be further changes.
>
> And again, what is the rush? Anyone wanting to experiment with this
> functionality only needs to apply a single patch. And exposing the API
> before it is stable is just going to be a mess as subsystems may start
> using it from day one.
>
> So please, just drop it for 6.20. You can still merge this for the next
> cycle when the basic functionality has been fixed.
The fixes seemed correct on my review, what was wrong with them? And
having the code fixed for known issues is a good thing here, it gives
the gpio people a base to test their work on.
As no one is currently using this, I will disable this from the build,
but keeping the code in the tree right now is a good thing as I do feel
this is the right way forward, and others can work off of it easier this
way.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 70+ messages in thread* [PATCH] driver core: disable revocable code from build
2026-02-03 12:26 ` Greg Kroah-Hartman
@ 2026-02-03 12:30 ` Greg Kroah-Hartman
2026-02-03 13:20 ` Danilo Krummrich
2026-02-04 2:14 ` Tzung-Bi Shih
2026-02-03 13:57 ` [PATCH 0/3] Revert "revocable: Revocable resource management" Laurent Pinchart
2026-02-04 14:36 ` Johan Hovold
2 siblings, 2 replies; 70+ messages in thread
From: Greg Kroah-Hartman @ 2026-02-03 12:30 UTC (permalink / raw)
To: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
The revocable code is still under active discussion, and there is no
in-kernel users of it. So disable it from the build for now so that no
one suffers from it being present in the tree, yet leave it in the
source tree so that others can easily test it by reverting this commit
and building off of it for future releases.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/base/Kconfig | 2 +-
drivers/base/Makefile | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 8f7d7b9d81ac..9f318b98144d 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -254,7 +254,7 @@ endmenu
# Kunit test cases
config REVOCABLE_KUNIT_TEST
tristate "Kunit tests for revocable" if !KUNIT_ALL_TESTS
- depends on KUNIT
+ depends on KUNIT && BROKEN
default KUNIT_ALL_TESTS
help
Kunit tests for the revocable API.
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 4185aaa9bbb9..4c6607616a73 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -6,7 +6,7 @@ obj-y := component.o core.o bus.o dd.o syscore.o \
cpu.o firmware.o init.o map.o devres.o \
attribute_container.o transport_class.o \
topology.o container.o property.o cacheinfo.o \
- swnode.o faux.o revocable.o
+ swnode.o faux.o
obj-$(CONFIG_AUXILIARY_BUS) += auxiliary.o
obj-$(CONFIG_DEVTMPFS) += devtmpfs.o
obj-y += power/
--
2.53.0
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [PATCH] driver core: disable revocable code from build
2026-02-03 12:30 ` [PATCH] driver core: disable revocable code from build Greg Kroah-Hartman
@ 2026-02-03 13:20 ` Danilo Krummrich
2026-02-04 2:14 ` Tzung-Bi Shih
1 sibling, 0 replies; 70+ messages in thread
From: Danilo Krummrich @ 2026-02-03 13:20 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Johan Hovold, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue Feb 3, 2026 at 1:30 PM CET, Greg Kroah-Hartman wrote:
> The revocable code is still under active discussion, and there is no
> in-kernel users of it. So disable it from the build for now so that no
> one suffers from it being present in the tree, yet leave it in the
> source tree so that others can easily test it by reverting this commit
> and building off of it for future releases.
>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Danilo Krummrich <dakr@kernel.org>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH] driver core: disable revocable code from build
2026-02-03 12:30 ` [PATCH] driver core: disable revocable code from build Greg Kroah-Hartman
2026-02-03 13:20 ` Danilo Krummrich
@ 2026-02-04 2:14 ` Tzung-Bi Shih
2026-02-04 5:28 ` [PATCH] selftests: Disable " Tzung-Bi Shih
1 sibling, 1 reply; 70+ messages in thread
From: Tzung-Bi Shih @ 2026-02-04 2:14 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue, Feb 03, 2026 at 01:30:37PM +0100, Greg Kroah-Hartman wrote:
> The revocable code is still under active discussion, and there is no
> in-kernel users of it. So disable it from the build for now so that no
> one suffers from it being present in the tree, yet leave it in the
> source tree so that others can easily test it by reverting this commit
> and building off of it for future releases.
>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
And also:
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 11b6515ce3d0..56e44a98d6a5 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -17,7 +17,6 @@ TARGETS += damon
TARGETS += devices/error_logs
TARGETS += devices/probe
TARGETS += dmabuf-heaps
-TARGETS += drivers/base/revocable
TARGETS += drivers/dma-buf
TARGETS += drivers/ntsync
TARGETS += drivers/s390x/uvdevice
With that,
Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org>
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [PATCH] selftests: Disable revocable code from build
2026-02-04 2:14 ` Tzung-Bi Shih
@ 2026-02-04 5:28 ` Tzung-Bi Shih
2026-02-04 8:21 ` Greg Kroah-Hartman
0 siblings, 1 reply; 70+ messages in thread
From: Tzung-Bi Shih @ 2026-02-04 5:28 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
The revocable code is still under active discussion, and there is no
in-kernel users of it. So disable it from the build for now so that no
one suffers from it being present in the tree, yet leave it in the
source tree so that others can easily test it by reverting this commit
and building off of it for future releases.
Fixes: dd7762c73b1c ("driver core: disable revocable code from build")
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
---
Greg: I realized "driver core: disable revocable code from build" is
already in driver-core-testing branch. Sent this independent patch
in case it'd need to.
tools/testing/selftests/Makefile | 1 -
1 file changed, 1 deletion(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 11b6515ce3d0..56e44a98d6a5 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -17,7 +17,6 @@ TARGETS += damon
TARGETS += devices/error_logs
TARGETS += devices/probe
TARGETS += dmabuf-heaps
-TARGETS += drivers/base/revocable
TARGETS += drivers/dma-buf
TARGETS += drivers/ntsync
TARGETS += drivers/s390x/uvdevice
--
2.53.0.rc2.204.g2597b5adb4-goog
^ permalink raw reply related [flat|nested] 70+ messages in thread* Re: [PATCH] selftests: Disable revocable code from build
2026-02-04 5:28 ` [PATCH] selftests: Disable " Tzung-Bi Shih
@ 2026-02-04 8:21 ` Greg Kroah-Hartman
0 siblings, 0 replies; 70+ messages in thread
From: Greg Kroah-Hartman @ 2026-02-04 8:21 UTC (permalink / raw)
To: Tzung-Bi Shih
Cc: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Wed, Feb 04, 2026 at 05:28:18AM +0000, Tzung-Bi Shih wrote:
> The revocable code is still under active discussion, and there is no
> in-kernel users of it. So disable it from the build for now so that no
> one suffers from it being present in the tree, yet leave it in the
> source tree so that others can easily test it by reverting this commit
> and building off of it for future releases.
>
> Fixes: dd7762c73b1c ("driver core: disable revocable code from build")
> Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
> ---
> Greg: I realized "driver core: disable revocable code from build" is
> already in driver-core-testing branch. Sent this independent patch
> in case it'd need to.
>
> tools/testing/selftests/Makefile | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
> index 11b6515ce3d0..56e44a98d6a5 100644
> --- a/tools/testing/selftests/Makefile
> +++ b/tools/testing/selftests/Makefile
> @@ -17,7 +17,6 @@ TARGETS += damon
> TARGETS += devices/error_logs
> TARGETS += devices/probe
> TARGETS += dmabuf-heaps
> -TARGETS += drivers/base/revocable
> TARGETS += drivers/dma-buf
> TARGETS += drivers/ntsync
> TARGETS += drivers/s390x/uvdevice
> --
> 2.53.0.rc2.204.g2597b5adb4-goog
Thanks, I'll merge this into the other commit so that it all happens at
once, and can be reverted easier.
greg k-h
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-02-03 12:26 ` Greg Kroah-Hartman
2026-02-03 12:30 ` [PATCH] driver core: disable revocable code from build Greg Kroah-Hartman
@ 2026-02-03 13:57 ` Laurent Pinchart
2026-02-03 15:44 ` Greg Kroah-Hartman
2026-02-04 14:36 ` Johan Hovold
2 siblings, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-02-03 13:57 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, Jason Gunthorpe,
linux-doc, linux-kselftest, linux-kernel
On Tue, Feb 03, 2026 at 01:26:58PM +0100, Greg KH wrote:
> On Tue, Feb 03, 2026 at 01:15:50PM +0100, Johan Hovold wrote:
> > On Mon, Jan 26, 2026 at 02:50:05PM +0100, Johan Hovold wrote:
> > > On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> >
> > > > True, but we do need something. I took these patches without a real
> > > > user as a base for us to start working off of. The rust implementation
> > > > has shown that the design-pattern is a good solution for the problem,
> > > > and so I feel we should work with it and try to get this working
> > > > properly. We've been sitting and talking about it for years now, and
> > > > here is the first real code submission that is getting us closer to fix
> > > > the problem properly. It might not be perfict, but let's evolve it from
> > > > here for what is found not to work correctly.
> > >
> > > It's a design pattern that's perhaps needed for rust, but not
> > > necessarily elsewhere. But either way there is no need to rush this. If
> > > it turns out to be usable, it can be merged along with a future user.
> > >
> > > Dropping the revocable_provider and revocable abstraction split should
> > > even make it more palatable.
> > >
> > > And with a new interface and a non-trivial user we can see what the
> > > end-result looks like and decide where to go from there.
> > >
> > > > So I don't want to take these reverts, let's try this out, by putting
> > > > this into the driver core now, we have the base to experiment with in a
> > > > "safe" way in lots of different driver subsytems at the same time. If
> > > > it doesn't work out, worst case we revert it in a release or two because
> > > > it didn't get used.
> > >
> > > Please reconsider. Perhaps I didn't stress the point enough that the
> > > current API needs to be reworked completely since there's no longer any
> > > need for the two revocable abstractions.
> >
> > I noticed that you picked up the proposed incremental fixes to the
> > issues I reported without anyone even having reviewed them. The fixes
> > being incremental makes it a lot harder to review, but based on a quick
> > look it seems there needs to be further changes.
> >
> > And again, what is the rush? Anyone wanting to experiment with this
> > functionality only needs to apply a single patch. And exposing the API
> > before it is stable is just going to be a mess as subsystems may start
> > using it from day one.
> >
> > So please, just drop it for 6.20. You can still merge this for the next
> > cycle when the basic functionality has been fixed.
>
> The fixes seemed correct on my review, what was wrong with them? And
> having the code fixed for known issues is a good thing here, it gives
> the gpio people a base to test their work on.
>
> As no one is currently using this, I will disable this from the build,
> but keeping the code in the tree right now is a good thing as I do feel
> this is the right way forward, and others can work off of it easier this
> way.
The reason why I (and others) would like to see this series reverted is
because be believe it is *not* the right way forward. There's no
consensus on that topic. While having code in mainline can improve
collaboration and accelerate development, we don't traditionally do that
when a large portion of the parties involved believe the approach is
wrong, and no user should be added until a consensus on the API is
found.
In the worst case, if no consensus can be found, someone has to make a
decision, but doing so while discussions are ongoing, so close after LPC
where different approaches were discussed and proposed, is very
worrying. Do we want to prioritize your feeling that this is the right
way forward over trying to find an argument, while discussions are
progressing ?
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-02-03 13:57 ` [PATCH 0/3] Revert "revocable: Revocable resource management" Laurent Pinchart
@ 2026-02-03 15:44 ` Greg Kroah-Hartman
0 siblings, 0 replies; 70+ messages in thread
From: Greg Kroah-Hartman @ 2026-02-03 15:44 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Johan Hovold, Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Wolfram Sang, Simona Vetter, Dan Williams, Jason Gunthorpe,
linux-doc, linux-kselftest, linux-kernel
On Tue, Feb 03, 2026 at 03:57:11PM +0200, Laurent Pinchart wrote:
> On Tue, Feb 03, 2026 at 01:26:58PM +0100, Greg KH wrote:
> > On Tue, Feb 03, 2026 at 01:15:50PM +0100, Johan Hovold wrote:
> > > On Mon, Jan 26, 2026 at 02:50:05PM +0100, Johan Hovold wrote:
> > > > On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> > >
> > > > > True, but we do need something. I took these patches without a real
> > > > > user as a base for us to start working off of. The rust implementation
> > > > > has shown that the design-pattern is a good solution for the problem,
> > > > > and so I feel we should work with it and try to get this working
> > > > > properly. We've been sitting and talking about it for years now, and
> > > > > here is the first real code submission that is getting us closer to fix
> > > > > the problem properly. It might not be perfict, but let's evolve it from
> > > > > here for what is found not to work correctly.
> > > >
> > > > It's a design pattern that's perhaps needed for rust, but not
> > > > necessarily elsewhere. But either way there is no need to rush this. If
> > > > it turns out to be usable, it can be merged along with a future user.
> > > >
> > > > Dropping the revocable_provider and revocable abstraction split should
> > > > even make it more palatable.
> > > >
> > > > And with a new interface and a non-trivial user we can see what the
> > > > end-result looks like and decide where to go from there.
> > > >
> > > > > So I don't want to take these reverts, let's try this out, by putting
> > > > > this into the driver core now, we have the base to experiment with in a
> > > > > "safe" way in lots of different driver subsytems at the same time. If
> > > > > it doesn't work out, worst case we revert it in a release or two because
> > > > > it didn't get used.
> > > >
> > > > Please reconsider. Perhaps I didn't stress the point enough that the
> > > > current API needs to be reworked completely since there's no longer any
> > > > need for the two revocable abstractions.
> > >
> > > I noticed that you picked up the proposed incremental fixes to the
> > > issues I reported without anyone even having reviewed them. The fixes
> > > being incremental makes it a lot harder to review, but based on a quick
> > > look it seems there needs to be further changes.
> > >
> > > And again, what is the rush? Anyone wanting to experiment with this
> > > functionality only needs to apply a single patch. And exposing the API
> > > before it is stable is just going to be a mess as subsystems may start
> > > using it from day one.
> > >
> > > So please, just drop it for 6.20. You can still merge this for the next
> > > cycle when the basic functionality has been fixed.
> >
> > The fixes seemed correct on my review, what was wrong with them? And
> > having the code fixed for known issues is a good thing here, it gives
> > the gpio people a base to test their work on.
> >
> > As no one is currently using this, I will disable this from the build,
> > but keeping the code in the tree right now is a good thing as I do feel
> > this is the right way forward, and others can work off of it easier this
> > way.
>
> The reason why I (and others) would like to see this series reverted is
> because be believe it is *not* the right way forward. There's no
> consensus on that topic. While having code in mainline can improve
> collaboration and accelerate development, we don't traditionally do that
> when a large portion of the parties involved believe the approach is
> wrong, and no user should be added until a consensus on the API is
> found.
Ah, but I do think this is the way forward, given that the pattern/idea
works in the rust side of the kernel, and it's exactly what I've been
asking for for years now :)
But yes, without a real user, it's hard for me to justify it. But, I
want it present in the tree now so that lots of others can play with it
easily. If it turns out it is not correct, and does not work properly,
then great, we will delete the files entirely. But I'm not so sure that
we are there yet.
thanks,
greg "we all want more hours in a day" k-h
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-02-03 12:26 ` Greg Kroah-Hartman
2026-02-03 12:30 ` [PATCH] driver core: disable revocable code from build Greg Kroah-Hartman
2026-02-03 13:57 ` [PATCH 0/3] Revert "revocable: Revocable resource management" Laurent Pinchart
@ 2026-02-04 14:36 ` Johan Hovold
2 siblings, 0 replies; 70+ messages in thread
From: Johan Hovold @ 2026-02-04 14:36 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Danilo Krummrich, Rafael J . Wysocki, Tzung-Bi Shih,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue, Feb 03, 2026 at 01:26:58PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Feb 03, 2026 at 01:15:50PM +0100, Johan Hovold wrote:
> > On Mon, Jan 26, 2026 at 02:50:05PM +0100, Johan Hovold wrote:
> > > On Sun, Jan 25, 2026 at 01:47:14PM +0100, Greg Kroah-Hartman wrote:
> > I noticed that you picked up the proposed incremental fixes to the
> > issues I reported without anyone even having reviewed them. The fixes
> > being incremental makes it a lot harder to review, but based on a quick
> > look it seems there needs to be further changes.
> >
> > And again, what is the rush? Anyone wanting to experiment with this
> > functionality only needs to apply a single patch. And exposing the API
> > before it is stable is just going to be a mess as subsystems may start
> > using it from day one.
> >
> > So please, just drop it for 6.20. You can still merge this for the next
> > cycle when the basic functionality has been fixed.
>
> The fixes seemed correct on my review, what was wrong with them? And
> having the code fixed for known issues is a good thing here, it gives
> the gpio people a base to test their work on.
Turns out the new revocable design is also fundamentally broken.
I've already spent too much time on this when I should be doing other
things, but here is an updated revert which explains things:
https://lore.kernel.org/r/20260204142849.22055-1-johan@kernel.org
> As no one is currently using this, I will disable this from the build,
> but keeping the code in the tree right now is a good thing as I do feel
> this is the right way forward, and others can work off of it easier this
> way.
API design should not be done incrementally in-tree. It just makes
things harder for reviewers, adds noise, and without any benefit for
anyone when the interface keeps changing every other week.
Please just merge the reverts and we can work out a way forward.
Johan
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-24 17:05 [PATCH 0/3] Revert "revocable: Revocable resource management" Johan Hovold
` (4 preceding siblings ...)
2026-01-24 19:08 ` Danilo Krummrich
@ 2026-01-27 15:57 ` Tzung-Bi Shih
2026-01-28 14:23 ` Johan Hovold
2026-01-29 15:01 ` Tzung-Bi Shih
5 siblings, 2 replies; 70+ messages in thread
From: Tzung-Bi Shih @ 2026-01-27 15:57 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> I was surprised to learn that the revocable functionality was merged last week
> given the community feedback on list and at LPC, but not least since there are
> no users of it, which we are supposed to require to be able to evaluate it
> properly.
>
> The chromeos ec driver issue which motivated this work turned out not to need
> it as was found during review. And the example gpiolib conversion was posted
Thanks for sharing your thoughts on revocable.
Regarding cros_ec_chardev, my last attempt [2] to solve its hot-plug issue by
synchronizing misc_{de,}register() with file operations using a global lock
highlighted the difficulty of alternatives: that approach serialized all file
operations and could easily lead to hung tasks if any file operation slept.
Given the drawbacks of [2], I believe cros_ec_chardev remains a valid use
case for revocable.
> the very same morning that this was merged which hardly provides enough time
> for evaluation (even if Bartosz quickly reported a performance regression).
The gpiolib conversion was provided as the first concrete user to enable
this evaluation process. The performance regression Bartosz identified is
valuable feedback, and I believe it is addressed by [3]. I plan to send the
next version of the series after v7.0-rc1 and revisit the issue.
[3] https://lore.kernel.org/all/20260121040204.2699886-1-tzungbi@kernel.org/
> Turns out there are correctness issues with both the gpiolib conversion and
> the revocable design itself that can lead to use-after-free and hung tasks (see
> [1] and patch 3/3).
I appreciate you identifying the issues where multiple threads share a file
descriptor; this is a case I overlooked. I see these kinds of findings as a
positive outcome of having wider review and a concrete user, allowing us to
identify and fix issues in the design.
> And as was pointed out repeatedly during review, and again at the day of the
> merge, this does not look like the right interface for the chardev unplug
> issue.
My focus has been on miscdevice [2], but I suspect cdev solutions for device
hot-plug would face similar synchronization challenges between device removal
and in-flight file operations.
> Revert the revocable implementation until a redesign has been proposed and
> evaluated properly.
I'll work on addressing the discovered issues and send follow-up fixes. I
believe keeping the current series in linux-next would be beneficial, as it
allows for easier testing and wider evaluation by others, rather than
reverting at this stage.
>
> Johan
>
>
> [1] https://lore.kernel.org/all/aXT45B6vLf9R3Pbf@hovoldconsulting.com/
---
[2]:
diff --git a/drivers/char/misc.c b/drivers/char/misc.c
...
+static struct miscdevice *find_miscdevice(int minor)
+{
+ struct miscdevice *c;
+
+ list_for_each_entry(c, &misc_list, list)
+ if (c->minor == minor)
+ return c;
+ return NULL;
+}
+
+static __poll_t misc_sync_poll(struct file *filp, poll_table *wait)
+{
+ struct miscdevice *c;
+
+ guard(mutex)(&misc_mtx);
+ c = find_miscdevice(iminor(filp->f_inode));
+ if (!c)
+ return -ENODEV;
+ if (!c->fops->poll)
+ return 0;
+
+ return c->fops->poll(filp, wait);
+}
...
+static const struct file_operations misc_sync_fops = {
+ .poll = misc_sync_poll,
+ .read = misc_sync_read,
+ .unlocked_ioctl = misc_sync_ioctl,
+ .release = misc_sync_release,
+};
+
static int misc_open(struct inode *inode, struct file *file)
{
int minor = iminor(inode);
@@ -161,6 +237,7 @@ static int misc_open(struct inode *inode, struct file *file)
replace_fops(file, new_fops);
if (file->f_op->open)
err = file->f_op->open(inode, file);
+ file->f_op = &misc_sync_fops;
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 15:57 ` Tzung-Bi Shih
@ 2026-01-28 14:23 ` Johan Hovold
2026-01-28 23:28 ` Laurent Pinchart
2026-01-29 15:01 ` Tzung-Bi Shih
1 sibling, 1 reply; 70+ messages in thread
From: Johan Hovold @ 2026-01-28 14:23 UTC (permalink / raw)
To: Tzung-Bi Shih
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue, Jan 27, 2026 at 03:57:44PM +0000, Tzung-Bi Shih wrote:
> On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> > I was surprised to learn that the revocable functionality was merged last week
> > given the community feedback on list and at LPC, but not least since there are
> > no users of it, which we are supposed to require to be able to evaluate it
> > properly.
> >
> > The chromeos ec driver issue which motivated this work turned out not to need
> > it as was found during review. And the example gpiolib conversion was posted
> Regarding cros_ec_chardev, my last attempt [2] to solve its hot-plug issue by
> synchronizing misc_{de,}register() with file operations using a global lock
> highlighted the difficulty of alternatives: that approach serialized all file
> operations and could easily lead to hung tasks if any file operation slept.
Yeah, fixing it at the chardev (or miscdevice) level would still require
some changes at the driver level (e.g. to wakeup sleeping tasks).
> Given the drawbacks of [2], I believe cros_ec_chardev remains a valid use
> case for revocable.
Yes, possibly.
Miscdevice is a bit of a special case however since you cannot tie the
lifetime of your driver data to that of the miscdev, but there are ways
to address that. And some drivers already handle this (i.e. without any
changes to miscdevice).
But notably the revocable design seems to derive from this quirk of
miscdevice.
[ Side note for completeness: Looking at the cros_ec driver it doesn't
seem to involve any hotpluggable buses so this looks like a mostly
theoretical issue of a developer with root access actively unbinding
an ec-driver while in use. ]
> > the very same morning that this was merged which hardly provides enough time
> > for evaluation (even if Bartosz quickly reported a performance regression).
>
> The gpiolib conversion was provided as the first concrete user to enable
> this evaluation process. The performance regression Bartosz identified is
> valuable feedback, and I believe it is addressed by [3]. I plan to send the
> next version of the series after v7.0-rc1 and revisit the issue.
>
> [3] https://lore.kernel.org/all/20260121040204.2699886-1-tzungbi@kernel.org/
>
> > Turns out there are correctness issues with both the gpiolib conversion and
> > the revocable design itself that can lead to use-after-free and hung tasks (see
> > [1] and patch 3/3).
>
> I appreciate you identifying the issues where multiple threads share a file
> descriptor; this is a case I overlooked. I see these kinds of findings as a
> positive outcome of having wider review and a concrete user, allowing us to
> identify and fix issues in the design.
Yes, that's exactly why we always require a user *before* merging
something like this. To be able to determine that the interface is sane.
Now the user showed up on the same day as the merge, which is obviously
not enough time for proper review and discussion.
[ And by short-circuiting the normal process you probably reduce the
motivation for people to spend time on the example conversion too. ]
> > And as was pointed out repeatedly during review, and again at the day of the
> > merge, this does not look like the right interface for the chardev unplug
> > issue.
>
> My focus has been on miscdevice [2], but I suspect cdev solutions for device
> hot-plug would face similar synchronization challenges between device removal
> and in-flight file operations.
But we need to look at that before throwing up our hands and saying that
it's not possible and that each driver should take care of this itself.
> > Revert the revocable implementation until a redesign has been proposed and
> > evaluated properly.
>
> I'll work on addressing the discovered issues and send follow-up fixes. I
> believe keeping the current series in linux-next would be beneficial, as it
> allows for easier testing and wider evaluation by others, rather than
> reverting at this stage.
To the contrary, now is right time to do the revert as there are
fundamental problems with the current interface that will require a
redesign. That's not the kind of work that should be done during an rc
stabilisation cycle.
Give people a chance to see for themselves what the gpiolib conversion
looks like, determine whether this abstraction makes the code more or
less readable, and think about possible further uses of a mechanism like
this.
Johan
> > [1] https://lore.kernel.org/all/aXT45B6vLf9R3Pbf@hovoldconsulting.com/
^ permalink raw reply [flat|nested] 70+ messages in thread* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-28 14:23 ` Johan Hovold
@ 2026-01-28 23:28 ` Laurent Pinchart
0 siblings, 0 replies; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-28 23:28 UTC (permalink / raw)
To: Johan Hovold
Cc: Tzung-Bi Shih, Greg Kroah-Hartman, Rafael J . Wysocki,
Danilo Krummrich, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Wed, Jan 28, 2026 at 03:23:10PM +0100, Johan Hovold wrote:
> On Tue, Jan 27, 2026 at 03:57:44PM +0000, Tzung-Bi Shih wrote:
> > On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> > > I was surprised to learn that the revocable functionality was merged last week
> > > given the community feedback on list and at LPC, but not least since there are
> > > no users of it, which we are supposed to require to be able to evaluate it
> > > properly.
> > >
> > > The chromeos ec driver issue which motivated this work turned out not to need
> > > it as was found during review. And the example gpiolib conversion was posted
>
> > Regarding cros_ec_chardev, my last attempt [2] to solve its hot-plug issue by
> > synchronizing misc_{de,}register() with file operations using a global lock
> > highlighted the difficulty of alternatives: that approach serialized all file
> > operations and could easily lead to hung tasks if any file operation slept.
>
> Yeah, fixing it at the chardev (or miscdevice) level would still require
> some changes at the driver level (e.g. to wakeup sleeping tasks).
Waking up sleeping tasks is a core part of the fix. Drivers will need to
do so explicitly in their .remove() operation, through a
subsystem-specific wrapper around a cdev helper function.
> > Given the drawbacks of [2], I believe cros_ec_chardev remains a valid use
> > case for revocable.
>
> Yes, possibly.
>
> Miscdevice is a bit of a special case however since you cannot tie the
> lifetime of your driver data to that of the miscdev, but there are ways
> to address that. And some drivers already handle this (i.e. without any
> changes to miscdevice).
>
> But notably the revocable design seems to derive from this quirk of
> miscdevice.
>
> [ Side note for completeness: Looking at the cros_ec driver it doesn't
> seem to involve any hotpluggable buses so this looks like a mostly
> theoretical issue of a developer with root access actively unbinding
> an ec-driver while in use. ]
>
> > > the very same morning that this was merged which hardly provides enough time
> > > for evaluation (even if Bartosz quickly reported a performance regression).
> >
> > The gpiolib conversion was provided as the first concrete user to enable
> > this evaluation process. The performance regression Bartosz identified is
> > valuable feedback, and I believe it is addressed by [3]. I plan to send the
> > next version of the series after v7.0-rc1 and revisit the issue.
> >
> > [3] https://lore.kernel.org/all/20260121040204.2699886-1-tzungbi@kernel.org/
> >
> > > Turns out there are correctness issues with both the gpiolib conversion and
> > > the revocable design itself that can lead to use-after-free and hung tasks (see
> > > [1] and patch 3/3).
> >
> > I appreciate you identifying the issues where multiple threads share a file
> > descriptor; this is a case I overlooked. I see these kinds of findings as a
> > positive outcome of having wider review and a concrete user, allowing us to
> > identify and fix issues in the design.
>
> Yes, that's exactly why we always require a user *before* merging
> something like this. To be able to determine that the interface is sane.
>
> Now the user showed up on the same day as the merge, which is obviously
> not enough time for proper review and discussion.
>
> [ And by short-circuiting the normal process you probably reduce the
> motivation for people to spend time on the example conversion too. ]
>
> > > And as was pointed out repeatedly during review, and again at the day of the
> > > merge, this does not look like the right interface for the chardev unplug
> > > issue.
> >
> > My focus has been on miscdevice [2], but I suspect cdev solutions for device
> > hot-plug would face similar synchronization challenges between device removal
> > and in-flight file operations.
>
> But we need to look at that before throwing up our hands and saying that
> it's not possible and that each driver should take care of this itself.
There has been previous attempts, including a functional patch series,
that didn't get merge due to the sole reason that it duplicated a small
amount of logic also present in debugfs. I'll reply to Danilo somewhere
else in this mail thread with more information.
> > > Revert the revocable implementation until a redesign has been proposed and
> > > evaluated properly.
> >
> > I'll work on addressing the discovered issues and send follow-up fixes. I
> > believe keeping the current series in linux-next would be beneficial, as it
> > allows for easier testing and wider evaluation by others, rather than
> > reverting at this stage.
>
> To the contrary, now is right time to do the revert as there are
> fundamental problems with the current interface that will require a
> redesign. That's not the kind of work that should be done during an rc
> stabilisation cycle.
>
> Give people a chance to see for themselves what the gpiolib conversion
> looks like, determine whether this abstraction makes the code more or
> less readable, and think about possible further uses of a mechanism like
> this.
fully agreed. Follow-up fixes is not the right way forward. As Johan
said, the quick merge despite the open issues, circumventing the normal
review process, damages trust, and reduces my motivation to review your
work and help shaping a good API to solve this problem.
> > > [1] https://lore.kernel.org/all/aXT45B6vLf9R3Pbf@hovoldconsulting.com/
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-27 15:57 ` Tzung-Bi Shih
2026-01-28 14:23 ` Johan Hovold
@ 2026-01-29 15:01 ` Tzung-Bi Shih
2026-01-30 9:12 ` Laurent Pinchart
1 sibling, 1 reply; 70+ messages in thread
From: Tzung-Bi Shih @ 2026-01-29 15:01 UTC (permalink / raw)
To: Johan Hovold
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
Bartosz Golaszewski, Linus Walleij, Jonathan Corbet, Shuah Khan,
Laurent Pinchart, Wolfram Sang, Simona Vetter, Dan Williams,
Jason Gunthorpe, linux-doc, linux-kselftest, linux-kernel
On Tue, Jan 27, 2026 at 03:57:44PM +0000, Tzung-Bi Shih wrote:
> On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> > Turns out there are correctness issues with both the gpiolib conversion and
> > the revocable design itself that can lead to use-after-free and hung tasks (see
> > [1] and patch 3/3).
> [...]
> > Revert the revocable implementation until a redesign has been proposed and
> > evaluated properly.
>
> I'll work on addressing the discovered issues and send follow-up fixes. I
> believe keeping the current series in linux-next would be beneficial, as it
> allows for easier testing and wider evaluation by others, rather than
> reverting at this stage.
FWIW: https://lore.kernel.org/all/20260129143733.45618-2-tzungbi@kernel.org/
and https://lore.kernel.org/all/20260129143733.45618-4-tzungbi@kernel.org/
are the proposed fixes.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-29 15:01 ` Tzung-Bi Shih
@ 2026-01-30 9:12 ` Laurent Pinchart
2026-01-30 17:41 ` Danilo Krummrich
0 siblings, 1 reply; 70+ messages in thread
From: Laurent Pinchart @ 2026-01-30 9:12 UTC (permalink / raw)
To: Tzung-Bi Shih
Cc: Johan Hovold, Greg Kroah-Hartman, Rafael J . Wysocki,
Danilo Krummrich, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Thu, Jan 29, 2026 at 03:01:50PM +0000, Tzung-Bi Shih wrote:
> On Tue, Jan 27, 2026 at 03:57:44PM +0000, Tzung-Bi Shih wrote:
> > On Sat, Jan 24, 2026 at 06:05:32PM +0100, Johan Hovold wrote:
> > > Turns out there are correctness issues with both the gpiolib conversion and
> > > the revocable design itself that can lead to use-after-free and hung tasks (see
> > > [1] and patch 3/3).
> > [...]
> > > Revert the revocable implementation until a redesign has been proposed and
> > > evaluated properly.
> >
> > I'll work on addressing the discovered issues and send follow-up fixes. I
> > believe keeping the current series in linux-next would be beneficial, as it
> > allows for easier testing and wider evaluation by others, rather than
> > reverting at this stage.
>
> FWIW: https://lore.kernel.org/all/20260129143733.45618-2-tzungbi@kernel.org/
> and https://lore.kernel.org/all/20260129143733.45618-4-tzungbi@kernel.org/
> are the proposed fixes.
I won't review that, sorry. As multiple people said in this mail thread,
the API needs to go back to the design board.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [PATCH 0/3] Revert "revocable: Revocable resource management"
2026-01-30 9:12 ` Laurent Pinchart
@ 2026-01-30 17:41 ` Danilo Krummrich
0 siblings, 0 replies; 70+ messages in thread
From: Danilo Krummrich @ 2026-01-30 17:41 UTC (permalink / raw)
To: Laurent Pinchart
Cc: Tzung-Bi Shih, Johan Hovold, Greg Kroah-Hartman,
Rafael J . Wysocki, Bartosz Golaszewski, Linus Walleij,
Jonathan Corbet, Shuah Khan, Wolfram Sang, Simona Vetter,
Dan Williams, Jason Gunthorpe, linux-doc, linux-kselftest,
linux-kernel
On Fri Jan 30, 2026 at 10:12 AM CET, Laurent Pinchart wrote:
> On Thu, Jan 29, 2026 at 03:01:50PM +0000, Tzung-Bi Shih wrote:
>> FWIW: https://lore.kernel.org/all/20260129143733.45618-2-tzungbi@kernel.org/
>> and https://lore.kernel.org/all/20260129143733.45618-4-tzungbi@kernel.org/
>> are the proposed fixes.
>
> I won't review that, sorry. As multiple people said in this mail thread,
> the API needs to go back to the design board.
Of course, you are entirely free to choose what you want to review, but I think
the justification you give is not entirely fair.
The patch series was on the list since August, you were explicitly Cc'd from the
get-go.
In v3 you said:
"To be clear, I'm not against the API being merged for the use cases that would
benefit from it, but I don't want to see drivers using it to protect from the
cdev/unregistration race."
This was a reply to Bartosz saying:
"Yeah, I'm not against this going upstream. If more development is needed for
this to be usable in other parts of the kernel, that can be done gradually.
Literally no subsystem ever was perfect on day 1."
Just to be clear, I'm not saying there are no issues to be addressed. And I'm
also not saying that you never raised concerns, you clearly did.
But, given the above, I don't think it's fair to request a revert as a
precondition to give constructive feedback for improvements and fixes.
- Danilo
Link: https://lore.kernel.org/chrome-platform/20250912145416.GL31682@pendragon.ideasonboard.com/
^ permalink raw reply [flat|nested] 70+ messages in thread