public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] driver core: Don't link the device to the bus until we're ready to probe
@ 2026-03-21  3:06 Douglas Anderson
  2026-03-21  5:41 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 11+ messages in thread
From: Douglas Anderson @ 2026-03-21  3:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich
  Cc: Alan Stern, Kay Sievers, Saravana Kannan, Douglas Anderson,
	stable, driver-core, linux-kernel

The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:

  really_probe()
  __driver_probe_device()
  driver_probe_device()
  __driver_attach()
  bus_for_each_dev()
  driver_attach()
  bus_add_driver()
  driver_register()
  __platform_driver_register()
  init_module() [some module]
  do_one_initcall()
  do_init_module()
  load_module()
  __arm64_sys_finit_module()
  invoke_syscall()

Practically speaking, in the case this happened
device_links_driver_bound() got called for the device before
"dev->fwnode->dev" was assigned. This prevented
__fw_devlink_pickup_dangling_consumers() from being called which meant
that other devices waiting on our driver's sub-nodes were stuck
deferring forever.

Fix the problem by adjusting where we link the device. Notably:
* Make sure we assign the dev->fwnode->dev before we link the device,
  since that needs to happen before a device probes.
* Make sure we link the device _before_ sending the UEVENT, as
  described in commit 2023c610dc54 ("Driver core: add new device to
  bus's list before probing").

[1] Captured on a machine running a downstream 6.6 kernel

Cc: stable@vger.kernel.org
Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
This is a super tricky area of code, and I don't have a lot of
confidence that I got this exactly right. If you think I should do
something different, please yell!

Notably:
* I don't know 100% for sure what happens if probe runs at the same
  time as (or before) the call to `bus_notify(dev,
  BUS_NOTIFY_ADD_DEVICE)`. Presumably we're at least no worse off than
  we were before this patch.
* I haven't dug into whether there could be other types of race
  conditions with a driver trying to probe the device at the same time
  we call bus_probe_device().

I can try to dig into those things, but I also figured that people on
the mailing lists would know this code a whole lot better than I do.

Feel free to consider this RFC patch as a bug report and post a
totally different solution. I won't be offended! :-)

I have only done very minimal testing with this patch so far (the
system doesn't seem to blow up with this patch). I'm kicking off tests
for the weekend, but I figured I'd post this up in parallel.

 drivers/base/base.h |  1 +
 drivers/base/bus.c  | 16 ++++++++++++++--
 drivers/base/core.c | 24 ++++++++++++++++++------
 3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 1af95ac68b77..d4933ba7b651 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -166,6 +166,7 @@ static inline void auxiliary_bus_init(void) { }
 struct kobject *virtual_device_parent(void);
 
 int bus_add_device(struct device *dev);
+void bus_link_device(struct device *dev);
 void bus_probe_device(struct device *dev);
 void bus_remove_device(struct device *dev);
 void bus_notify(struct device *dev, enum bus_notifier_event value);
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index bb61d8adbab1..573c22ebb5b7 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -510,7 +510,6 @@ EXPORT_SYMBOL_GPL(bus_for_each_drv);
  *
  * - Add device's bus attributes.
  * - Create links to device's bus.
- * - Add the device to its bus's list of devices.
  */
 int bus_add_device(struct device *dev)
 {
@@ -545,7 +544,6 @@ int bus_add_device(struct device *dev)
 	if (error)
 		goto out_subsys;
 
-	klist_add_tail(&dev->p->knode_bus, &sp->klist_devices);
 	return 0;
 
 out_subsys:
@@ -557,6 +555,20 @@ int bus_add_device(struct device *dev)
 	return error;
 }
 
+/**
+ * bus_link_device - Make a device findable in the list of devices
+ * @dev: device being linked
+ *
+ * - Add the device to its bus's list of devices.
+ */
+void bus_link_device(struct device *dev)
+{
+	struct subsys_private *sp = bus_to_subsys(dev->bus);
+
+	if (sp)
+		klist_add_tail(&dev->p->knode_bus, &sp->klist_devices);
+}
+
 /**
  * bus_probe_device - probe drivers for a new device
  * @dev: device to probe
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 791f9e444df8..6ff4725d677a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3663,12 +3663,6 @@ int device_add(struct device *dev)
 		devtmpfs_create_node(dev);
 	}
 
-	/* Notify clients of device addition.  This call must come
-	 * after dpm_sysfs_add() and before kobject_uevent().
-	 */
-	bus_notify(dev, BUS_NOTIFY_ADD_DEVICE);
-	kobject_uevent(&dev->kobj, KOBJ_ADD);
-
 	/*
 	 * Check if any of the other devices (consumers) have been waiting for
 	 * this device (supplier) to be added so that they can create a device
@@ -3680,12 +3674,30 @@ int device_add(struct device *dev)
 	 * But this also needs to happen before bus_probe_device() to make sure
 	 * waiting consumers can link to it before the driver is bound to the
 	 * device and the driver sync_state callback is called for this device.
+	 *
+	 * Because a bus may be probed the moment bus_link_device() is called,
+	 * this must happen before bus_link_device().
 	 */
 	if (dev->fwnode && !dev->fwnode->dev) {
 		dev->fwnode->dev = dev;
 		fw_devlink_link_device(dev);
 	}
 
+	/*
+	 * The moment we link the bus in, it's possible for another thread
+	 * (one registering a new driver) to notice it and start probing.
+	 * At the same time, we need to link the bus before the uevent is
+	 * sent announcing the device or user programs might try to access
+	 * the device before it has been added to the bus.
+	 */
+	bus_link_device(dev);
+
+	/* Notify clients of device addition.  This call must come
+	 * after dpm_sysfs_add() and before kobject_uevent().
+	 */
+	bus_notify(dev, BUS_NOTIFY_ADD_DEVICE);
+	kobject_uevent(&dev->kobj, KOBJ_ADD);
+
 	bus_probe_device(dev);
 
 	/*
-- 
2.53.0.959.g497ff81fa9-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-03-26 21:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-21  3:06 [RFC PATCH] driver core: Don't link the device to the bus until we're ready to probe Douglas Anderson
2026-03-21  5:41 ` Greg Kroah-Hartman
2026-03-21  7:35   ` Doug Anderson
2026-03-21  7:42     ` Greg Kroah-Hartman
2026-03-21  8:05       ` Doug Anderson
2026-03-21 15:54         ` Alan Stern
2026-03-23 17:07           ` Doug Anderson
2026-03-23 17:47             ` Alan Stern
2026-03-23 22:24               ` Doug Anderson
2026-03-24 15:21                 ` Alan Stern
2026-03-26 21:49                   ` Doug Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox