* [BUG] Deferred probe loop with child devices
@ 2025-06-09 23:57 Sean Anderson
2025-06-10 18:34 ` [PATCH] driver core: Prevent deferred probe loops Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-09 23:57 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich
Cc: Rob Herring, linux-kernel@vger.kernel.org, Saravana Kannan,
devicetree@vger.kernel.org
Hi,
I've been running into a deferred probe loop when a device gets
EPROBE_DEFER after registering a bus with children:
deferred_probe_work_func()
driver_probe_device(parent)
test_parent_probe(parent)
device_add(child)
(probe successful)
driver_bound(child)
driver_deferred_probe_trigger()
return -EPROBE_DEFER
driver_deferred_probe_add(parent)
// deferred_trigger_count changed, so...
driver_deferred_probe_trigger()
Because there was another successful probe during the parent's probe,
driver_probe_device thinks we need to retry the whole probe process. But
we will never make progress this way because the only thing that changed
was a direct result of our own probe function.
Ideally I'd like to ignore probe events resulting from our own children
when probing. I think this would need a per-device probe counter that
gets added to the parent's on removal. Is that the right way to approach
things?
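To make the loop concrete, here is a minimal userspace sketch in C. It is
a toy model under stated assumptions, not driver-core code: a single
global counter stands in for deferred_trigger_count, and all model_*
names are hypothetical. It only mirrors the "did the counter change
during probe?" comparison from the trace above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the driver core's global trigger counter. */
static int model_trigger_count;

/* Models driver_bound(): every successful bind bumps the counter. */
static void model_driver_bound(void)
{
	model_trigger_count++;
}

/*
 * Models a parent probe that binds a child and then defers.
 * Returns true when the probe "deferred".
 */
static bool model_parent_probe(void)
{
	model_driver_bound();	/* the child binds successfully */
	return true;		/* ...then the parent defers */
}

/*
 * Models the check described above: retrigger whenever the counter
 * moved during the probe. Returns the number of probe attempts made
 * before the loop settles, capped at max_rounds.
 */
static int model_count_probe_attempts(int max_rounds)
{
	int attempts = 0;
	bool retrigger = true;

	assert(max_rounds > 0);
	while (retrigger && attempts < max_rounds) {
		int before = model_trigger_count;
		bool deferred = model_parent_probe();

		attempts++;
		retrigger = deferred && before != model_trigger_count;
	}
	return attempts;
}
```

In this model the loop never settles on its own: calling
model_count_probe_attempts() with any cap runs until the cap is hit,
because the parent's own child moves the counter on every attempt.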
I've attached a minimal example below. When you load it, the console
will be filled with
test_parent_driver test_parent_driver.0: probing...
If this occurs because the module for the affected resource is missing
then the entire boot process will come to a halt (or not, depending on
how you look at things) while waiting for the parent.
While this example is contrived, this situation really does occur with
netdevs that acquire resources after creating their internal MDIO bus.
Reordering things so the MDIO bus is created last is not a very
satisfying solution, since the affected resource may itself be on the
MDIO bus.
--Sean
---
drivers/base/test/Makefile | 1 +
drivers/base/test/test_deferred_probe.c | 103 ++++++++++++++++++++++++
2 files changed, 104 insertions(+)
create mode 100644 drivers/base/test/test_deferred_probe.c
diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile
index e321dfc7e922..f5ba5bca7bce 100644
--- a/drivers/base/test/Makefile
+++ b/drivers/base/test/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE) += test_async_driver_probe.o
+obj-m += test_deferred_probe.o
obj-$(CONFIG_DM_KUNIT_TEST) += root-device-test.o
obj-$(CONFIG_DM_KUNIT_TEST) += platform-device-test.o
diff --git a/drivers/base/test/test_deferred_probe.c b/drivers/base/test/test_deferred_probe.c
new file mode 100644
index 000000000000..89b68afed348
--- /dev/null
+++ b/drivers/base/test/test_deferred_probe.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2025 Sean Anderson <sean.anderson@seco.com>
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+
+static struct platform_driver child_driver = {
+ .driver = {
+ .name = "test_child_driver",
+ },
+};
+
+static int test_parent_probe(struct platform_device *pdev)
+{
+ struct platform_device *child;
+ struct device *dev = &pdev->dev;
+ int ret;
+
+ dev_info(dev, "probing...\n");
+
+ /* Probe a child on a bus of some kind */
+ child = platform_device_alloc("test_child_driver", 0);
+ if (!child)
+ return -ENOMEM;
+
+ child->dev.parent = dev;
+ ret = platform_device_add(child);
+ if (ret) {
+ dev_err(dev, "could not add child: %d\n", ret);
+ platform_device_put(child);
+ return ret;
+ }
+
+ /* Whoops, we got -EPROBE_DEFER from something else! */
+ platform_device_unregister(child);
+ return dev_err_probe(dev, -EPROBE_DEFER, "deferring...\n");
+}
+
+static struct platform_driver parent_driver = {
+ .driver = {
+ .name = "test_parent_driver",
+ },
+ .probe = test_parent_probe,
+};
+
+static struct platform_device *parent;
+
+static int __init test_deferred_probe_init(void)
+{
+ int ret;
+
+ ret = platform_driver_register(&parent_driver);
+ if (ret) {
+ pr_err("could not register parent driver: %d\n", ret);
+ return ret;
+ }
+
+ ret = platform_driver_register(&child_driver);
+ if (ret) {
+ pr_err("could not register child driver: %d\n", ret);
+ goto err_parent_driver;
+ }
+
+ parent = platform_device_alloc("test_parent_driver", 0);
+ if (!parent) {
+ ret = -ENOMEM;
+ goto err_child_driver;
+ }
+
+ pr_info("registering parent device\n");
+ ret = platform_device_add(parent);
+ if (ret) {
+ pr_err("Failed to add parent: %d\n", ret);
+ platform_device_put(parent);
+ goto err_child_driver;
+ }
+
+ return 0;
+
+err_child_driver:
+ platform_driver_unregister(&child_driver);
+err_parent_driver:
+ platform_driver_unregister(&parent_driver);
+ return ret;
+}
+module_init(test_deferred_probe_init);
+
+static void __exit test_deferred_probe_exit(void)
+{
+ platform_driver_unregister(&parent_driver);
+ platform_driver_unregister(&child_driver);
+ platform_device_unregister(parent);
+}
+module_exit(test_deferred_probe_exit);
+
+MODULE_DESCRIPTION("Test module for deferred driver probing");
+MODULE_AUTHOR("Sean Anderson <sean.anderson@seco.com>");
+MODULE_LICENSE("GPL");
--
2.35.1.1320.gc452695387.dirty
* [PATCH] driver core: Prevent deferred probe loops
2025-06-09 23:57 [BUG] Deferred probe loop with child devices Sean Anderson
@ 2025-06-10 18:34 ` Sean Anderson
2025-06-10 23:32 ` Saravana Kannan
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-10 18:34 UTC (permalink / raw)
To: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel
Cc: devicetree, Christoph Hellwig, Rob Herring, Grant Likely,
Saravana Kannan, Sean Anderson
A deferred probe loop can occur when a device returns EPROBE_DEFER after
registering a bus with children:
deferred_probe_work_func()
driver_probe_device(parent)
test_parent_probe(parent)
device_add(child)
(probe successful)
driver_bound(child)
driver_deferred_probe_trigger()
return -EPROBE_DEFER
driver_deferred_probe_add(parent)
// deferred_trigger_count changed, so...
driver_deferred_probe_trigger()
Because there was another successful probe during the parent's probe,
driver_probe_device thinks we need to retry the whole probe process. But
we will never make progress this way because the only thing that changed
was a direct result of our own probe function.
To prevent this, add a per-device trigger_count. This allows us to
determine if the global deferred_trigger_count was modified by some
unrelated device or only by our own children. The read side does the
work of summing children because I expect most deferred devices to be
childless. The alternative is to walk up the device's parents in
driver_deferred_probe_trigger.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---
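For illustration only, and not part of the patch: a userspace sketch in C
of the accounting described above. It assumes a toy device with a parent
pointer and a counter, and all model_* names are hypothetical. As a
simplification it reads only the device's own counter, which is enough
when children are removed before the parent defers; the patch itself also
sums live children via device_for_each_child().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: just a parent link and a per-device trigger counter. */
struct model_dev {
	struct model_dev *parent;
	int trigger_count;
};

/* Toy stand-in for the global deferred_trigger_count. */
static int model_global_count;

/* A successful bind bumps the global and the device's own counter. */
static void model_bind(struct model_dev *dev)
{
	model_global_count++;
	if (dev)
		dev->trigger_count++;
}

/* On removal, fold the child's count into its parent. */
static void model_remove(struct model_dev *dev)
{
	if (dev->parent)
		dev->parent->trigger_count += dev->trigger_count;
}

/* Retrigger only if the global moved by more than dev accounts for. */
static bool model_needs_retrigger(const struct model_dev *dev, int old_global)
{
	assert(dev != NULL);
	return model_global_count > old_global + dev->trigger_count;
}

/*
 * One probe attempt of @parent: a child binds and is removed again,
 * optionally an unrelated device binds in between, then the parent
 * defers. Returns whether a retrigger would be requested.
 */
static bool model_probe_round(struct model_dev *parent, bool unrelated_bind)
{
	struct model_dev child = { .parent = parent, .trigger_count = 0 };
	struct model_dev other = { .parent = NULL, .trigger_count = 0 };
	int old_global = model_global_count;

	model_bind(&child);
	if (unrelated_bind)
		model_bind(&other);
	model_remove(&child);
	return model_needs_retrigger(parent, old_global);
}
```

For a freshly created parent, model_probe_round(&parent, false) reports
that no retrigger is needed, while the same round with an unrelated bind
in the middle does request one.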
drivers/base/base.h | 2 +-
drivers/base/core.c | 8 ++++-
drivers/base/dd.c | 67 ++++++++++++++++++++++++++++++++++--------
include/linux/device.h | 3 ++
4 files changed, 66 insertions(+), 14 deletions(-)
diff --git a/drivers/base/base.h b/drivers/base/base.h
index 123031a757d9..54263b186d1f 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -201,7 +201,7 @@ int devres_release_all(struct device *dev);
void device_block_probing(void);
void device_unblock_probing(void);
void deferred_probe_extend_timeout(void);
-void driver_deferred_probe_trigger(void);
+void driver_deferred_probe_trigger(struct device *dev);
const char *device_get_devnode(const struct device *dev, umode_t *mode,
kuid_t *uid, kgid_t *gid, const char **tmp);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index cbc0099d8ef2..8ba231ec469b 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1858,7 +1858,7 @@ void __init wait_for_init_devices_probe(void)
pr_info("Trying to probe devices needed for running init ...\n");
fw_devlink_best_effort = true;
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(NULL);
/*
* Wait for all "best effort" probes to finish before going back to
@@ -3739,6 +3739,9 @@ int device_add(struct device *dev)
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
glue_dir = get_glue_dir(dev);
kobject_del(&dev->kobj);
+ if (parent)
+ atomic_add(atomic_read(&dev->trigger_count),
+ &dev->parent->trigger_count);
Error:
cleanup_glue_dir(dev, glue_dir);
parent_error:
@@ -3899,6 +3902,9 @@ void device_del(struct device *dev)
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
glue_dir = get_glue_dir(dev);
kobject_del(&dev->kobj);
+ if (parent)
+ atomic_add(atomic_read(&dev->trigger_count),
+ &parent->trigger_count);
cleanup_glue_dir(dev, glue_dir);
memalloc_noio_restore(noio_flag);
put_device(parent);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b526e0e0f52d..8ce638c02275 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -156,6 +156,7 @@ void driver_deferred_probe_del(struct device *dev)
static bool driver_deferred_probe_enable;
/**
* driver_deferred_probe_trigger() - Kick off re-probing deferred devices
+ * @dev: the successfully-bound device, or %NULL if not applicable
*
* This functions moves all devices from the pending list to the active
* list and schedules the deferred probe workqueue to process them. It
@@ -172,7 +173,7 @@ static bool driver_deferred_probe_enable;
* changes in the midst of a probe, then deferred processing should be triggered
* again.
*/
-void driver_deferred_probe_trigger(void)
+void driver_deferred_probe_trigger(struct device *dev)
{
if (!driver_deferred_probe_enable)
return;
@@ -184,6 +185,10 @@ void driver_deferred_probe_trigger(void)
*/
mutex_lock(&deferred_probe_mutex);
atomic_inc(&deferred_trigger_count);
+ if (dev) {
+ smp_wmb(); /* paired with device_needs_retrigger */
+ atomic_inc(&dev->trigger_count);
+ }
list_splice_tail_init(&deferred_probe_pending_list,
&deferred_probe_active_list);
mutex_unlock(&deferred_probe_mutex);
@@ -216,7 +221,7 @@ void device_block_probing(void)
void device_unblock_probing(void)
{
defer_all_probes = false;
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(NULL);
}
/**
@@ -308,7 +313,7 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
fw_devlink_drivers_done();
driver_deferred_probe_timeout = 0;
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(NULL);
flush_work(&deferred_probe_work);
mutex_lock(&deferred_probe_mutex);
@@ -347,7 +352,7 @@ static int deferred_probe_initcall(void)
&deferred_devs_fops);
driver_deferred_probe_enable = true;
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(NULL);
/* Sort as many dependencies as possible before exiting initcalls */
flush_work(&deferred_probe_work);
initcalls_done = true;
@@ -359,7 +364,7 @@ static int deferred_probe_initcall(void)
* Trigger deferred probe again, this time we won't defer anything
* that is optional
*/
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(NULL);
flush_work(&deferred_probe_work);
if (driver_deferred_probe_timeout > 0) {
@@ -415,7 +420,7 @@ static void driver_bound(struct device *dev)
* kick off retrying all pending devices
*/
driver_deferred_probe_del(dev);
- driver_deferred_probe_trigger();
+ driver_deferred_probe_trigger(dev);
bus_notify(dev, BUS_NOTIFY_BOUND_DRIVER);
kobject_uevent(&dev->kobj, KOBJ_BIND);
@@ -806,6 +811,47 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
return ret;
}
+/**
+ * dev_get_trigger_count() - Recursively read trigger_count
+ * @dev: device to read from
+ * @data: pointer to the int result; should be initialized to 0
+ *
+ * Read @dev's trigger_count, as well as all its children's trigger counts,
+ * recursively. The result is the number of times @dev or any of its
+ * (possibly-removed) children have been successfully probed.
+ *
+ * Return: 0
+ */
+static int dev_get_trigger_count(struct device *dev, void *data)
+{
+ *(int *)data += atomic_read(&dev->trigger_count);
+ return device_for_each_child(dev, dev_get_trigger_count, data);
+}
+
+/*
+ * device_needs_retrigger() - Determine if we need to re-trigger a deferred probe
+ * @dev: Device that failed to probe with %EPROBE_DEFER
+ * @old_trigger_count: Value of deferred_trigger_count before probing the device
+ *
+ * The resource @dev was looking for could have been probed between when @dev
+ * looked up the resource and when the probe process finished. If this occurred
+ * we need to retrigger deferred probing so that @dev gets another shot at
+ * probing. However, we need to ignore deferred probe triggers from @dev's own
+ * children, since that could result in an infinite probe loop.
+ *
+ * Return: %true if we should retrigger probing of deferred devices
+ */
+static bool device_needs_retrigger(struct device *dev, int old_trigger_count)
+{
+ int dev_trigger_count = 0;
+ int new_trigger_count;
+
+ dev_get_trigger_count(dev, &dev_trigger_count);
+ smp_rmb(); /* paired with driver_deferred_probe_trigger */
+ new_trigger_count = atomic_read(&deferred_trigger_count);
+ return new_trigger_count > old_trigger_count + dev_trigger_count;
+}
+
/**
* driver_probe_device - attempt to bind device & driver together
* @drv: driver to bind a device to
@@ -830,12 +876,9 @@ static int driver_probe_device(const struct device_driver *drv, struct device *d
if (ret == -EPROBE_DEFER || ret == EPROBE_DEFER) {
driver_deferred_probe_add(dev);
- /*
- * Did a trigger occur while probing? Need to re-trigger if yes
- */
- if (trigger_count != atomic_read(&deferred_trigger_count) &&
- !defer_all_probes)
- driver_deferred_probe_trigger();
+ if (!defer_all_probes &&
+ device_needs_retrigger(dev, trigger_count))
+ driver_deferred_probe_trigger(NULL);
}
atomic_dec(&probe_count);
wake_up_all(&probe_waitqueue);
diff --git a/include/linux/device.h b/include/linux/device.h
index 4940db137fff..9c9153adb8d6 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -486,6 +486,8 @@ struct device_physical_location {
* @p: Holds the private data of the driver core portions of the device.
* See the comment of the struct device_private for detail.
* @kobj: A top-level, abstract class from which other classes are derived.
+ * @trigger_count: Number of times this device (or any of its removed children)
+ * has been successfully bound to a driver.
* @init_name: Initial name of the device.
* @type: The type of device.
* This identifies the device type and carries type-specific
@@ -581,6 +583,7 @@ struct device_physical_location {
*/
struct device {
struct kobject kobj;
+ atomic_t trigger_count;
struct device *parent;
struct device_private *p;
--
2.35.1.1320.gc452695387.dirty
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-10 18:34 ` [PATCH] driver core: Prevent deferred probe loops Sean Anderson
@ 2025-06-10 23:32 ` Saravana Kannan
2025-06-10 23:44 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Saravana Kannan @ 2025-06-10 23:32 UTC (permalink / raw)
To: Sean Anderson
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>
> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> registering a bus with children:
This is a broken driver. A parent device shouldn't register child
devices unless it is fully ready itself. It's not logical to say the
child devices are available, if the parent itself isn't fully ready.
So, adding child devices/the bus should be the last thing done in the
parent's probe function.
I know there are odd exceptions where the parent depends on the child,
so they might add the child a bit earlier in the probe, but in those
cases, the parent's probe should still do all the checks ahead of
time.
Can you be more specific about the actual failure you are seeing?
Thanks,
Saravana
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-10 23:32 ` Saravana Kannan
@ 2025-06-10 23:44 ` Sean Anderson
2025-06-11 12:23 ` Greg Kroah-Hartman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-10 23:44 UTC (permalink / raw)
To: Saravana Kannan
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/10/25 19:32, Saravana Kannan wrote:
> On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> registering a bus with children:
>
> This is a broken driver. A parent device shouldn't register child
> devices unless it is fully ready itself. It's not logical to say the
> child devices are available, if the parent itself isn't fully ready.
> So, adding child devices/the bus should be the last thing done in the
> parent's probe function.
>
> I know there are odd exceptions where the parent depends on the child,
> so they might add the child a bit earlier in the probe
This is exactly the case here. So the bus probing cannot happen any
later than it already does.
> but in those cases, the parent's probe should still do all the checks
> ahead of time.
Such as what? How is the parent going to know the resource is missing
without checking for it?
> Can you be more specific about the actual failure you are seeing?
MAC is looking for a PCS that's on its internal MDIO bus, but that PCS's
driver isn't loaded. The PCS has to be loaded at probe time because
phylink_create needs it, and phylink is necessary to register the
netdev. The latter situation is not ideal, but it would be quite a bit
of work to untangle.
--Sean
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-10 23:44 ` Sean Anderson
@ 2025-06-11 12:23 ` Greg Kroah-Hartman
2025-06-12 15:53 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-06-11 12:23 UTC (permalink / raw)
To: Sean Anderson
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> On 6/10/25 19:32, Saravana Kannan wrote:
> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >>
> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> registering a bus with children:
> >
> > This is a broken driver. A parent device shouldn't register child
> > devices unless it is fully ready itself. It's not logical to say the
> > child devices are available, if the parent itself isn't fully ready.
> > So, adding child devices/the bus should be the last thing done in the
> > parent's probe function.
> >
> > I know there are odd exceptions where the parent depends on the child,
> > so they might add the child a bit earlier in the probe
>
> This is exactly the case here. So the bus probing cannot happen any
> later than it already does.
Please fix the driver not to do this.
> > but in those cases, the parent's probe should still do all the checks
> > ahead of time.
>
> Such as what? How is the parent going to know the resource is missing
> without checking for it?
>
> > Can you be more specific about the actual failure you are seeing?
>
> MAC is looking for a PCS that's on its internal MDIO bus, but that PCS's
> driver isn't loaded. The PCS has to be loaded at probe time because
> phylink_create needs it, and phylink is necessary to register the
> netdev. The latter situation is not ideal, but it would be quite a bit
> of work to untangle.
Please untangle, don't put stuff in the driver core for broken
subsystems. That is just pushing the maintenance of this from the driver
authors to the driver core maintainers for the next 20+ years :(
thanks,
greg k-h
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-11 12:23 ` Greg Kroah-Hartman
@ 2025-06-12 15:53 ` Sean Anderson
2025-06-12 17:56 ` Saravana Kannan
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-12 15:53 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
>> On 6/10/25 19:32, Saravana Kannan wrote:
>> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >>
>> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> >> registering a bus with children:
>> >
>> > This is a broken driver. A parent device shouldn't register child
>> > devices unless it is fully ready itself. It's not logical to say the
>> > child devices are available, if the parent itself isn't fully ready.
>> > So, adding child devices/the bus should be the last thing done in the
>> > parent's probe function.
>> >
>> > I know there are odd exceptions where the parent depends on the child,
>> > so they might add the child a bit earlier in the probe
>>
>> This is exactly the case here. So the bus probing cannot happen any
>> later than it already does.
>
> Please fix the driver not to do this.
How? The driver needs the PCS to work. And the PCS can live on the MDIO
bus.
>> > but in those cases, the parent's probe should still do all the checks
>> > ahead of time.
>>
>> Such as what? How is the parent going to know the resource is missing
>> without checking for it?
>>
>> > Can you be more specific about the actual failure you are seeing?
>>
>> MAC is looking for a PCS that's on its internal MDIO bus, but that PCS's
>> driver isn't loaded. The PCS has to be loaded at probe time because
>> phylink_create needs it, and phylink is necessary to register the
>> netdev. The latter situation is not ideal, but it would be quite a bit
>> of work to untangle.
>
> Please untangle, don't put stuff in the driver core for broken
> subsystems. That is just pushing the maintenance of this from the driver
> authors to the driver core maintainers for the next 20+ years :(
What makes it broken? The "mess" has already been made in netdev. The driver
authors have already pushed it off onto phylink.
And by "quite a bit of work to untangle" I mean the PCS affects settings
(ethtool ksettings, MII IOCTL) that are exposed to userspace as soon as
the netdev is registered. So we cannot move to a "delayed" lookup
without breaking reading/modifying the settings. We could of course fake
it, but what happens when e.g. userspace looks at the settings and
breaks because we are not reporting the right capabilities (which would
have been reported in the past)?
--Sean
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-12 15:53 ` Sean Anderson
@ 2025-06-12 17:56 ` Saravana Kannan
2025-06-12 20:40 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Saravana Kannan @ 2025-06-12 17:56 UTC (permalink / raw)
To: Sean Anderson
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>
> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> >> On 6/10/25 19:32, Saravana Kannan wrote:
> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >>
> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> >> registering a bus with children:
> >> >
> >> > This is a broken driver. A parent device shouldn't register child
> >> > devices unless it is fully read itself. It's not logical to say the
> >> > child devices are available, if the parent itself isn't fully ready.
> >> > So, adding child devices/the bus should be the last thing done in the
> >> > parent's probe function.
> >> >
> >> > I know there are odd exceptions where the parent depends on the child,
> >> > so they might add the child a bit earlier in the probe
> >>
> >> This is exactly the case here. So the bus probing cannot happen any
> >> later than it already does.
> >
> > Please fix the driver not to do this.
>
> How? The driver needs the PCS to work. And the PCS can live on the MDIO
> bus.
Obviously I don't know the full details, but you could implement it as
an MFD, so the bus part would not get removed even if the PCS fails to
probe. Then the PCS can probe when whatever it needs ends up probing.
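A toy user-space model of the split suggested here may make the intent easier to see. It is not kernel code and does not use the MFD API. The names bus_probe, mac_probe, probe_mac_until_idle and pcs_bound are hypothetical, and deferred_trigger_count only imitates the driver core counter mentioned earlier in the thread. The assumption is that the bus lives in its own device, so the MAC's probe registers nothing and the counter cannot change as a side effect of that probe:

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517 /* value used by the kernel */

static unsigned int deferred_trigger_count; /* models the driver core counter */
static bool pcs_bound;

/* Hypothetical standalone bus device: registering children never fails
 * because of the MAC, and the PCS binds only once its driver is loaded. */
static int bus_probe(bool pcs_driver_loaded)
{
	if (pcs_driver_loaded && !pcs_bound) {
		pcs_bound = true;
		deferred_trigger_count++; /* a successful bind triggers retries */
	}
	return 0;
}

/* Hypothetical MAC probe: it only looks the PCS up, registering nothing. */
static int mac_probe(void)
{
	return pcs_bound ? 0 : -EPROBE_DEFER;
}

/* Probe the MAC until it binds or nothing changed during its own probe.
 * Returns the number of passes made. */
static int probe_mac_until_idle(int max_passes)
{
	int passes = 0;
	bool retry = true;

	assert(max_passes > 0);
	while (retry && passes < max_passes) {
		unsigned int before = deferred_trigger_count;
		int ret = mac_probe();

		passes++;
		/* Retry only if we deferred and something bound meanwhile. */
		retry = ret == -EPROBE_DEFER && deferred_trigger_count != before;
	}
	return passes;
}
```

In this model the MAC is probed once and then waits while the PCS driver is missing. Once bus_probe(true) binds the PCS, the next MAC probe succeeds. Whether a real conversion behaves this way depends on details (suspend/resume, teardown order) that the model ignores.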
>
> >> > but in those cases, the parent's probe should still do all the checks
> >> > ahead of time.
> >>
> >> Such as what? How is the parent going to know the resource is missing
> >> without checking for it?
> >>
> >> > Can you be more specific about the actual failure you are seeing?
> >>
> >> MAC is looking for a PCS that's on its internal MDIO bus, but that PCS's
> >> driver isn't loaded. The PCS has to be loaded at probe time because
> >> phylink_create needs it, and phylink is necessary to register the
> >> netdev. The latter situation is not ideal, but it would be quite a bit
> >> of work to untangle.
I meant, point to a specific device in a DT and the driver. Provide
logs of the failure if possible, etc. Tell me which device is failing
and why, etc. That way, I can take a closer look or give you other
suggestions.
-Saravana
> >
> > Please untangle, don't put stuff in the driver core for broken
> > subsystems. That is just pushing the maintaince of this from the driver
> > authors to the driver core maintainers for the next 20+ years :(
>
> What makes it broken? The "mess" has already been made in netdev. The driver
> authors have already pushed it off onto phylink.
>
> And by "quite a bit of work to untangle" I mean the PCS affects settings
> (ethtool ksettings, MII IOCTL) that are exposed to userspace as soon as
> the netdev is registered. So we cannot move to a "delayed" lookup
> without breaking reading/modifying the settings. We could of course fake
> it, but what happens when e.g. userspace looks at the settings and
> breaks because we are not reporting the right capabilities (which would
> have been reported in the past)?
>
> --Sean
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-12 17:56 ` Saravana Kannan
@ 2025-06-12 20:40 ` Sean Anderson
2025-06-17 8:50 ` Greg Kroah-Hartman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-12 20:40 UTC (permalink / raw)
To: Saravana Kannan
Cc: Greg Kroah-Hartman, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/12/25 13:56, Saravana Kannan wrote:
> On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
>> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
>> >> On 6/10/25 19:32, Saravana Kannan wrote:
>> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >>
>> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> >> >> registering a bus with children:
>> >> >
>> >> > This is a broken driver. A parent device shouldn't register child
>> >> > devices unless it is fully read itself. It's not logical to say the
>> >> > child devices are available, if the parent itself isn't fully ready.
>> >> > So, adding child devices/the bus should be the last thing done in the
>> >> > parent's probe function.
>> >> >
>> >> > I know there are odd exceptions where the parent depends on the child,
>> >> > so they might add the child a bit earlier in the probe
>> >>
>> >> This is exactly the case here. So the bus probing cannot happen any
>> >> later than it already does.
>> >
>> > Please fix the driver not to do this.
>>
>> How? The driver needs the PCS to work. And the PCS can live on the MDIO
>> bus.
>
> Obviously I don't know the full details, but you could implement it as
> MFD. So the bus part would not get removed even if the PCS fails to
> probe. Then the PCS can probe when whatever it needs ends up probing.
I was thinking about making the MDIO bus a separate device. But I think
it will be tricky to get suspend/resume working correctly. And this
makes conversions more difficult because you cannot just add some
pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
invariably created as a child of the MAC).
And what happens if a developer doesn't realize they have to split off
the MDIO bus before converting? Everything works fine, except if there
is some problem loading the PCS driver, which they may not test. Is this
prohibition against failing after creating a bus documented anywhere? I
don't recall seeing it...
>>
>> >> > but in those cases, the parent's probe should still do all the checks
>> >> > ahead of time.
>> >>
>> >> Such as what? How is the parent going to know the resource is missing
>> >> without checking for it?
>> >>
>> >> > Can you be more specific about the actual failure you are seeing?
>> >>
>> >> MAC is looking for a PCS that's on its internal MDIO bus, but that PCS's
>> >> driver isn't loaded. The PCS has to be loaded at probe time because
>> >> phylink_create needs it, and phylink is necessary to register the
>> >> netdev. The latter situation is not ideal, but it would be quite a bit
>> >> of work to untangle.
>
> I meant, point to a specific device in a DT and the driver. Provide
> logs of the failure if possible, etc. Tell me which device is failing
> and why, etc. That way, I can take a closer look or give you other
> suggestions.
See [1]. Devicetree is not upstream yet (working on it...) but it looks
like
&gem0 {
	pcs-handle = <&pcs0>;
	post-init-providers = <&pcs0>;
	sfp = <&sfp0>;
	managed = "in-band-status";
	phy-mode = "1000base-x";
	nvmem-cells = <&eth_address 0>;
	nvmem-cell-names = "mac-address";
	/delete-property/ phy-handle;
	mdio {
		pcs0: ethernet-pcs@0 {
			#clock-cells = <0>;
			compatible = "xlnx,pcs-16.2", "xlnx,pcs";
			reg = <0>;
			clocks = <&si570>;
			clock-names = "refclk";
			interrupts-extended = <&gic GIC_SPI 106 IRQ_TYPE_LEVEL_HIGH>;
			interrupt-names = "an";
			reset-gpios = <&axi_gpio_2 0 GPIO_ACTIVE_HIGH>;
			done-gpios = <&axi_gpio_3 0 GPIO_ACTIVE_HIGH>;
			xlnx,pcs-modes = "sgmii", "1000base-x";
		};
	};
};
or in another instance
&eth_0_eth_buf {
	clocks = <&zynqmp_clk PL0_REF>;
	clock-names = "s_axi_lite_clk";
	pcs-handle = <&pcs4>;
	managed = "in-band-status";
	phy-handle = <&phy5>;
	phy-mode = "sgmii";
	post-init-providers = <&pcs4>, <&phy5>;
	nvmem-cells = <&eth_address 4>;
	nvmem-cell-names = "mac-address";
	iommus = <&smmu 0x200>, <&smmu 0x240>, <&smmu 0x248>;
	/delete-property/ local-mac-address;
	/delete-property/ xlnx,phy-type;
	mdio {
		reset-gpios = <&gpio 38 GPIO_ACTIVE_LOW>;
		reset-delay-us = <10000>;
		reset-post-delay-us = <50000>;
		pcs4: ethernet-pcs@1 {
			#reset-cells = <1>;
			compatible = "xlnx,pcs-16.2", "xlnx,pcs";
			reg = <1>;
			clocks = <&vc5 4>;
			clock-names = "refclk";
			assigned-clocks = <&vc5 4>;
			assigned-clock-rates = <125000000>;
			reset-gpios = <&axi_gpio_2 4 GPIO_ACTIVE_HIGH>;
			xlnx,pcs-modes = "sgmii";
		};
		phy2: ethernet-phy@2 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <2>;
			interrupts-extended = <&gpio 118 IRQ_TYPE_LEVEL_LOW>;
		};
		phy4: ethernet-phy@4 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <4>;
			interrupts-extended = <&gpio 119 IRQ_TYPE_LEVEL_LOW>;
		};
		phy5: ethernet-phy@5 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <5>;
			interrupts-extended = <&gpio 117 IRQ_TYPE_LEVEL_LOW>;
		};
		phy6: ethernet-phy@6 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <6>;
			interrupts-extended = <&gpio 120 IRQ_TYPE_LEVEL_LOW>;
		};
	};
};
In the second case, the phy also has the same relationship, but it is
not an issue since the phy is looked up in ndo_open instead of
phylink_create.
The ZCU102/106 supports this, but the in-tree devicetree only has hard
IP as opposed to things done in the FPGA. However, the "default"
configuration for xilinx_axienet assumes that the PCS is on the MAC's
MDIO bus [2].
--Sean
[1] https://lore.kernel.org/netdev/20250610233134.3588011-1-sean.anderson@linux.dev/
[2] https://github.com/Xilinx/device-tree-xlnx/blob/ac65e0142e52331244ea5799880650fb1e726ab7/axi_ethernet/data/axi_ethernet.tcl#L280
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-12 20:40 ` Sean Anderson
@ 2025-06-17 8:50 ` Greg Kroah-Hartman
2025-06-17 15:35 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-06-17 8:50 UTC (permalink / raw)
To: Sean Anderson
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
> On 6/12/25 13:56, Saravana Kannan wrote:
> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >>
> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >>
> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> >> >> registering a bus with children:
> >> >> >
> >> >> > This is a broken driver. A parent device shouldn't register child
> >> >> > devices unless it is fully read itself. It's not logical to say the
> >> >> > child devices are available, if the parent itself isn't fully ready.
> >> >> > So, adding child devices/the bus should be the last thing done in the
> >> >> > parent's probe function.
> >> >> >
> >> >> > I know there are odd exceptions where the parent depends on the child,
> >> >> > so they might add the child a bit earlier in the probe
> >> >>
> >> >> This is exactly the case here. So the bus probing cannot happen any
> >> >> later than it already does.
> >> >
> >> > Please fix the driver not to do this.
> >>
> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
> >> bus.
> >
> > Obviously I don't know the full details, but you could implement it as
> > MFD. So the bus part would not get removed even if the PCS fails to
> > probe. Then the PCS can probe when whatever it needs ends up probing.
>
> I was thinking about making the MDIO bus a separate device. But I think
> it will be tricky to get suspend/resume working correctly. And this
> makes conversions more difficult because you cannot just add some
> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
> invariably created as a child of the MAC).
>
> And what happens if a developer doesn't realize they have to split off
> the MDIO bus before converting? Everything works fine, except if there
> is some problem loading the PCS driver, which they may not test. Is this
> prohibition against failing after creating a bus documented anywhere? I
> don't recall seeing it...
What do you mean "failing after creating a bus"? If a bus fails to be
created, you fail like normal; there is no difference here.
And if MFD doesn't work, there's always the aux-bus code, perhaps that
should be used here instead?
thanks,
greg k-h
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-17 8:50 ` Greg Kroah-Hartman
@ 2025-06-17 15:35 ` Sean Anderson
2025-06-17 15:49 ` Greg Kroah-Hartman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-17 15:35 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/17/25 04:50, Greg Kroah-Hartman wrote:
> On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
>> On 6/12/25 13:56, Saravana Kannan wrote:
>> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >>
>> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
>> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
>> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
>> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >> >>
>> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> >> >> >> registering a bus with children:
>> >> >> >
>> >> >> > This is a broken driver. A parent device shouldn't register child
>> >> >> > devices unless it is fully read itself. It's not logical to say the
>> >> >> > child devices are available, if the parent itself isn't fully ready.
>> >> >> > So, adding child devices/the bus should be the last thing done in the
>> >> >> > parent's probe function.
>> >> >> >
>> >> >> > I know there are odd exceptions where the parent depends on the child,
>> >> >> > so they might add the child a bit earlier in the probe
>> >> >>
>> >> >> This is exactly the case here. So the bus probing cannot happen any
>> >> >> later than it already does.
>> >> >
>> >> > Please fix the driver not to do this.
>> >>
>> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
>> >> bus.
>> >
>> > Obviously I don't know the full details, but you could implement it as
>> > MFD. So the bus part would not get removed even if the PCS fails to
>> > probe. Then the PCS can probe when whatever it needs ends up probing.
>>
>> I was thinking about making the MDIO bus a separate device. But I think
>> it will be tricky to get suspend/resume working correctly. And this
>> makes conversions more difficult because you cannot just add some
>> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
>> invariably created as a child of the MAC).
>>
>> And what happens if a developer doesn't realize they have to split off
>> the MDIO bus before converting? Everything works fine, except if there
>> is some problem loading the PCS driver, which they may not test. Is this
>> prohibition against failing after creating a bus documented anywhere? I
>> don't recall seeing it...
>
> What do you mean "failing after creating a bus"? If a bus is failed to
> be created, you fail like normal, no difference here.
Creating the bus is successful, but there's an EPROBE_DEFER failure after
that, which induces the probe loop described in my initial email.
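As a rough illustration of that loop, here is a simplified user-space C model (not kernel code). The names parent_probe, child_bound and run_deferred_probe are hypothetical. deferred_trigger_count imitates the driver core counter from the call trace in the initial email, and the retry condition is a simplification of what the driver core does when a device defers. A max_passes cap is added only so the model terminates:

```c
#include <assert.h>
#include <stdbool.h>

#define EPROBE_DEFER 517 /* value used by the kernel */

/* Toy stand-in for the driver core's global trigger counter. */
static unsigned int deferred_trigger_count;

/* A child that binds successfully bumps the counter. */
static void child_bound(void)
{
	deferred_trigger_count++;
}

/* Hypothetical parent probe: registers its bus (the child binds at once),
 * then finds that a resource is missing and defers. */
static int parent_probe(bool resource_available)
{
	child_bound();
	if (!resource_available)
		return -EPROBE_DEFER; /* the bus is torn down on this path */
	return 0;
}

/* Returns the number of probe passes made before the parent binds, goes
 * idle, or max_passes is exhausted. */
static int run_deferred_probe(bool resource_available, int max_passes)
{
	int passes = 0;
	bool retry = true;

	assert(max_passes > 0);
	while (retry && passes < max_passes) {
		unsigned int before = deferred_trigger_count;
		int ret = parent_probe(resource_available);

		passes++;
		/* Retry only if we deferred and something bound meanwhile. */
		retry = ret == -EPROBE_DEFER && deferred_trigger_count != before;
	}
	return passes;
}
```

With the resource missing, every pass sees the counter change because of the parent's own child, so the model retries until the cap is reached. With the resource present, one pass is enough.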
> And if MFD doesn't work, there's always the aux-bus code, perhaps that
> should be used here instead?
I will have a look. However, I expect both of these approaches to
require fairly invasive conversions for existing drivers. Ideally, I
would like to keep conversions simple.
--Sean
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-17 15:35 ` Sean Anderson
@ 2025-06-17 15:49 ` Greg Kroah-Hartman
2025-06-17 17:14 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-06-17 15:49 UTC (permalink / raw)
To: Sean Anderson
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Tue, Jun 17, 2025 at 11:35:04AM -0400, Sean Anderson wrote:
> On 6/17/25 04:50, Greg Kroah-Hartman wrote:
> > On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
> >> On 6/12/25 13:56, Saravana Kannan wrote:
> >> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >>
> >> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> >> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> >> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
> >> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >> >>
> >> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> >> >> >> registering a bus with children:
> >> >> >> >
> >> >> >> > This is a broken driver. A parent device shouldn't register child
> >> >> >> > devices unless it is fully read itself. It's not logical to say the
> >> >> >> > child devices are available, if the parent itself isn't fully ready.
> >> >> >> > So, adding child devices/the bus should be the last thing done in the
> >> >> >> > parent's probe function.
> >> >> >> >
> >> >> >> > I know there are odd exceptions where the parent depends on the child,
> >> >> >> > so they might add the child a bit earlier in the probe
> >> >> >>
> >> >> >> This is exactly the case here. So the bus probing cannot happen any
> >> >> >> later than it already does.
> >> >> >
> >> >> > Please fix the driver not to do this.
> >> >>
> >> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
> >> >> bus.
> >> >
> >> > Obviously I don't know the full details, but you could implement it as
> >> > MFD. So the bus part would not get removed even if the PCS fails to
> >> > probe. Then the PCS can probe when whatever it needs ends up probing.
> >>
> >> I was thinking about making the MDIO bus a separate device. But I think
> >> it will be tricky to get suspend/resume working correctly. And this
> >> makes conversions more difficult because you cannot just add some
> >> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
> >> invariably created as a child of the MAC).
> >>
> >> And what happens if a developer doesn't realize they have to split off
> >> the MDIO bus before converting? Everything works fine, except if there
> >> is some problem loading the PCS driver, which they may not test. Is this
> >> prohibition against failing after creating a bus documented anywhere? I
> >> don't recall seeing it...
> >
> > What do you mean "failing after creating a bus"? If a bus is failed to
> > be created, you fail like normal, no difference here.
>
> Creating the bus is successful, but there's an EPROBE_DEFER failure after
> that. Which induces the probe loop as described in my initial email.
Then don't allow a defer to happen :)
Or better yet, just succeed and spin up a new thread for the new bus to
attach its devices to. That's what many other busses do today.
thanks,
greg k-h
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-17 15:49 ` Greg Kroah-Hartman
@ 2025-06-17 17:14 ` Sean Anderson
2025-06-19 8:21 ` Greg Kroah-Hartman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-17 17:14 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/17/25 11:49, Greg Kroah-Hartman wrote:
> On Tue, Jun 17, 2025 at 11:35:04AM -0400, Sean Anderson wrote:
>> On 6/17/25 04:50, Greg Kroah-Hartman wrote:
>> > On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
>> >> On 6/12/25 13:56, Saravana Kannan wrote:
>> >> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >>
>> >> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
>> >> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
>> >> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
>> >> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >> >> >>
>> >> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> >> >> >> >> registering a bus with children:
>> >> >> >> >
>> >> >> >> > This is a broken driver. A parent device shouldn't register child
>> >> >> >> > devices unless it is fully read itself. It's not logical to say the
>> >> >> >> > child devices are available, if the parent itself isn't fully ready.
>> >> >> >> > So, adding child devices/the bus should be the last thing done in the
>> >> >> >> > parent's probe function.
>> >> >> >> >
>> >> >> >> > I know there are odd exceptions where the parent depends on the child,
>> >> >> >> > so they might add the child a bit earlier in the probe
>> >> >> >>
>> >> >> >> This is exactly the case here. So the bus probing cannot happen any
>> >> >> >> later than it already does.
>> >> >> >
>> >> >> > Please fix the driver not to do this.
>> >> >>
>> >> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
>> >> >> bus.
>> >> >
>> >> > Obviously I don't know the full details, but you could implement it as
>> >> > MFD. So the bus part would not get removed even if the PCS fails to
>> >> > probe. Then the PCS can probe when whatever it needs ends up probing.
>> >>
>> >> I was thinking about making the MDIO bus a separate device. But I think
>> >> it will be tricky to get suspend/resume working correctly. And this
>> >> makes conversions more difficult because you cannot just add some
>> >> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
>> >> invariably created as a child of the MAC).
>> >>
>> >> And what happens if a developer doesn't realize they have to split off
>> >> the MDIO bus before converting? Everything works fine, except if there
>> >> is some problem loading the PCS driver, which they may not test. Is this
>> >> prohibition against failing after creating a bus documented anywhere? I
>> >> don't recall seeing it...
>> >
>> > What do you mean "failing after creating a bus"? If a bus is failed to
>> > be created, you fail like normal, no difference here.
>>
>> Creating the bus is successful, but there's an EPROBE_DEFER failure after
>> that. Which induces the probe loop as described in my initial email.
>
> Then don't allow a defer to happen :)
Well, I could require all PCS drivers to be built-in I guess. But I suspect
users will want them to be modules to reduce kernel size.
> Or better yet, just succeed and spin up a new thread for the new bus to
> attach it's devices to. That's what many other busses do today.
Sorry, I'm not sure I follow. How can you attach a device to a thread? Do
you have an example for this?
--Sean
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-17 17:14 ` Sean Anderson
@ 2025-06-19 8:21 ` Greg Kroah-Hartman
2025-06-19 16:19 ` Sean Anderson
0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-06-19 8:21 UTC (permalink / raw)
To: Sean Anderson
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On Tue, Jun 17, 2025 at 01:14:31PM -0400, Sean Anderson wrote:
> On 6/17/25 11:49, Greg Kroah-Hartman wrote:
> > On Tue, Jun 17, 2025 at 11:35:04AM -0400, Sean Anderson wrote:
> >> On 6/17/25 04:50, Greg Kroah-Hartman wrote:
> >> > On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
> >> >> On 6/12/25 13:56, Saravana Kannan wrote:
> >> >> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >>
> >> >> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> >> >> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> >> >> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
> >> >> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >> >> >>
> >> >> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> >> >> >> >> registering a bus with children:
> >> >> >> >> >
> >> >> >> >> > This is a broken driver. A parent device shouldn't register child
> >> >> >> >> > devices unless it is fully read itself. It's not logical to say the
> >> >> >> >> > child devices are available, if the parent itself isn't fully ready.
> >> >> >> >> > So, adding child devices/the bus should be the last thing done in the
> >> >> >> >> > parent's probe function.
> >> >> >> >> >
> >> >> >> >> > I know there are odd exceptions where the parent depends on the child,
> >> >> >> >> > so they might add the child a bit earlier in the probe
> >> >> >> >>
> >> >> >> >> This is exactly the case here. So the bus probing cannot happen any
> >> >> >> >> later than it already does.
> >> >> >> >
> >> >> >> > Please fix the driver not to do this.
> >> >> >>
> >> >> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
> >> >> >> bus.
> >> >> >
> >> >> > Obviously I don't know the full details, but you could implement it as
> >> >> > MFD. So the bus part would not get removed even if the PCS fails to
> >> >> > probe. Then the PCS can probe when whatever it needs ends up probing.
> >> >>
> >> >> I was thinking about making the MDIO bus a separate device. But I think
> >> >> it will be tricky to get suspend/resume working correctly. And this
> >> >> makes conversions more difficult because you cannot just add some
> >> >> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
> >> >> invariably created as a child of the MAC).
> >> >>
> >> >> And what happens if a developer doesn't realize they have to split off
> >> >> the MDIO bus before converting? Everything works fine, except if there
> >> >> is some problem loading the PCS driver, which they may not test. Is this
> >> >> prohibition against failing after creating a bus documented anywhere? I
> >> >> don't recall seeing it...
> >> >
> >> > What do you mean "failing after creating a bus"? If a bus is failed to
> >> > be created, you fail like normal, no difference here.
> >>
> >> Creating the bus is successful, but there's an EPROBE_DEFER failure after
> >> that. Which induces the probe loop as described in my initial email.
> >
> > Then don't allow a defer to happen :)
>
> Well, I could require all PCS drivers to be built-in I guess. But I suspect
> users will want them to be modules to reduce kernel size.
True, then just auto-load them as needed like all other busses do.
> > Or better yet, just succeed and spin up a new thread for the new bus to
> > attach it's devices to. That's what many other busses do today.
>
> Sorry, I'm not sure I follow. How can you attach a device to a thread? Do
> you have an example for this?
Busses discover their devices in a thread, which then calls probe for
them when needed. A device isn't being attached to a thread, sorry for
the confusion.
thanks,
greg k-h
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-19 8:21 ` Greg Kroah-Hartman
@ 2025-06-19 16:19 ` Sean Anderson
2025-06-19 16:33 ` Greg Kroah-Hartman
0 siblings, 1 reply; 15+ messages in thread
From: Sean Anderson @ 2025-06-19 16:19 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely
On 6/19/25 04:21, Greg Kroah-Hartman wrote:
> On Tue, Jun 17, 2025 at 01:14:31PM -0400, Sean Anderson wrote:
>> On 6/17/25 11:49, Greg Kroah-Hartman wrote:
>> > On Tue, Jun 17, 2025 at 11:35:04AM -0400, Sean Anderson wrote:
>> >> On 6/17/25 04:50, Greg Kroah-Hartman wrote:
>> >> > On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
>> >> >> On 6/12/25 13:56, Saravana Kannan wrote:
>> >> >> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >> >>
>> >> >> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
>> >> >> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
>> >> >> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
>> >> >> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
>> >> >> >> >> >>
>> >> >> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
>> >> >> >> >> >> registering a bus with children:
>> >> >> >> >> >
>> >> >> >> >> > This is a broken driver. A parent device shouldn't register child
>> >> >> >> >> > devices unless it is fully read itself. It's not logical to say the
>> >> >> >> >> > child devices are available, if the parent itself isn't fully ready.
>> >> >> >> >> > So, adding child devices/the bus should be the last thing done in the
>> >> >> >> >> > parent's probe function.
>> >> >> >> >> >
>> >> >> >> >> > I know there are odd exceptions where the parent depends on the child,
>> >> >> >> >> > so they might add the child a bit earlier in the probe
>> >> >> >> >>
>> >> >> >> >> This is exactly the case here. So the bus probing cannot happen any
>> >> >> >> >> later than it already does.
>> >> >> >> >
>> >> >> >> > Please fix the driver not to do this.
>> >> >> >>
>> >> >> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
>> >> >> >> bus.
>> >> >> >
>> >> >> > Obviously I don't know the full details, but you could implement it as
>> >> >> > MFD. So the bus part would not get removed even if the PCS fails to
>> >> >> > probe. Then the PCS can probe when whatever it needs ends up probing.
>> >> >>
>> >> >> I was thinking about making the MDIO bus a separate device. But I think
>> >> >> it will be tricky to get suspend/resume working correctly. And this
>> >> >> makes conversions more difficult because you cannot just add some
>> >> >> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
>> >> >> invariably created as a child of the MAC).
>> >> >>
>> >> >> And what happens if a developer doesn't realize they have to split off
>> >> >> the MDIO bus before converting? Everything works fine, except if there
>> >> >> is some problem loading the PCS driver, which they may not test. Is this
>> >> >> prohibition against failing after creating a bus documented anywhere? I
>> >> >> don't recall seeing it...
>> >> >
>> >> > What do you mean "failing after creating a bus"? If a bus is failed to
>> >> > be created, you fail like normal, no difference here.
>> >>
>> >> Creating the bus is successful, but there's an EPROBE_DEFER failure after
>> >> that. Which induces the probe loop as described in my initial email.
>> >
>> > Then don't allow a defer to happen :)
>>
>> Well, I could require all PCS drivers to be built-in I guess. But I suspect
>> users will want them to be modules to reduce kernel size.
>
> True, then just auto-load them as needed like all other busses do.
>
>> > Or better yet, just succeed and spin up a new thread for the new bus to
>> > attach it's devices to. That's what many other busses do today.
>>
>> Sorry, I'm not sure I follow. How can you attach a device to a thread? Do
>> you have an example for this?
>
> Busses discover their devices in a thread, which then calls probe for
> them when needed. A device isn't being attached to a thread, sorry for
> the confusion.
OK, just to clarify, the subsystem I linked above is not a bus, it's an
internal API. Think GPIO or PWM. The devices typically live on an MDIO
bus (although a platform bus wouldn't be out of the question). So it's
"not our job" to load the module; that should be done by the bus. From
our perspective, if we look up a device and it's not there we don't
really know if it's ever going to show up.

Regarding the auxiliary bus, I tried it out and it works. The conversion
is around +50 lines, which is not ideal. Ideally I would like to push
complexity into subsystem code rather than making drivers deal with it,
but I don't really see a good way to do this in the subsystem. There are
just a lot of assumptions along the lines of "when you register the
device you must know what capabilities it supports." For example, fixed
links (MAC to MAC) are validated when the phylink is created and the
whole process fails if there's an incompatibility. Which can occur if
the late-binding component is the thing that adds support for the fixed
link.

--Sean
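
For readers following along, the retry condition behind the loop can be
modelled in a few lines of userspace C. This is a simplified sketch, not
the driver core's code: trigger_count stands in for
deferred_trigger_count from the trace in the original report, and
child_bound(), bus_first_probe(), resource_first_probe() and
run_deferred_probe() are hypothetical names made up for the
illustration. It assumes a resource that never shows up, and it caps the
number of attempts so that the model terminates.

```c
#include <stdbool.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the driver core's deferred_trigger_count. */
static unsigned int trigger_count;

/* Models driver_bound(child): every successful bind bumps the counter. */
static void child_bound(void)
{
	trigger_count++;
}

/*
 * Parent probe that registers its bus first (the child binds right away)
 * and only then looks for a resource that is assumed never to appear.
 */
static int bus_first_probe(void)
{
	child_bound();
	return -EPROBE_DEFER;
}

/* Parent probe that looks for the missing resource before adding children. */
static int resource_first_probe(void)
{
	return -EPROBE_DEFER;
}

/*
 * Simplified retry rule: after a deferral, probe again only if some device
 * was bound while the probe was running. Returns the number of attempts
 * made, capped at max_attempts.
 */
static int run_deferred_probe(int (*probe)(void), int max_attempts)
{
	int attempts = 0;
	bool retry = true;

	while (retry && attempts < max_attempts) {
		unsigned int before = trigger_count;
		int ret = probe();

		attempts++;
		retry = (ret == -EPROBE_DEFER) && (trigger_count != before);
	}

	return attempts;
}
```

With a cap of 10, run_deferred_probe(bus_first_probe, 10) uses up all 10
attempts, because the parent's own child keeps bumping the counter. By
contrast, run_deferred_probe(resource_first_probe, 10) stops after a
single attempt, since nothing else was bound in the meantime.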
* Re: [PATCH] driver core: Prevent deferred probe loops
2025-06-19 16:19 ` Sean Anderson
@ 2025-06-19 16:33 ` Greg Kroah-Hartman
0 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-06-19 16:33 UTC (permalink / raw)
To: Sean Anderson
Cc: Saravana Kannan, Rafael J . Wysocki, Danilo Krummrich,
linux-kernel, devicetree, Christoph Hellwig, Rob Herring,
Grant Likely

On Thu, Jun 19, 2025 at 12:19:23PM -0400, Sean Anderson wrote:
> On 6/19/25 04:21, Greg Kroah-Hartman wrote:
> > On Tue, Jun 17, 2025 at 01:14:31PM -0400, Sean Anderson wrote:
> >> On 6/17/25 11:49, Greg Kroah-Hartman wrote:
> >> > On Tue, Jun 17, 2025 at 11:35:04AM -0400, Sean Anderson wrote:
> >> >> On 6/17/25 04:50, Greg Kroah-Hartman wrote:
> >> >> > On Thu, Jun 12, 2025 at 04:40:48PM -0400, Sean Anderson wrote:
> >> >> >> On 6/12/25 13:56, Saravana Kannan wrote:
> >> >> >> > On Thu, Jun 12, 2025 at 8:53 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >> >>
> >> >> >> >> On 6/11/25 08:23, Greg Kroah-Hartman wrote:
> >> >> >> >> > On Tue, Jun 10, 2025 at 07:44:27PM -0400, Sean Anderson wrote:
> >> >> >> >> >> On 6/10/25 19:32, Saravana Kannan wrote:
> >> >> >> >> >> > On Tue, Jun 10, 2025 at 11:35 AM Sean Anderson <sean.anderson@linux.dev> wrote:
> >> >> >> >> >> >>
> >> >> >> >> >> >> A deferred probe loop can occur when a device returns EPROBE_DEFER after
> >> >> >> >> >> >> registering a bus with children:
> >> >> >> >> >> >
> >> >> >> >> >> > This is a broken driver. A parent device shouldn't register child
> >> >> >> >> >> > devices unless it is fully ready itself. It's not logical to say the
> >> >> >> >> >> > child devices are available, if the parent itself isn't fully ready.
> >> >> >> >> >> > So, adding child devices/the bus should be the last thing done in the
> >> >> >> >> >> > parent's probe function.
> >> >> >> >> >> >
> >> >> >> >> >> > I know there are odd exceptions where the parent depends on the child,
> >> >> >> >> >> > so they might add the child a bit earlier in the probe
> >> >> >> >> >>
> >> >> >> >> >> This is exactly the case here. So the bus probing cannot happen any
> >> >> >> >> >> later than it already does.
> >> >> >> >> >
> >> >> >> >> > Please fix the driver not to do this.
> >> >> >> >>
> >> >> >> >> How? The driver needs the PCS to work. And the PCS can live on the MDIO
> >> >> >> >> bus.
> >> >> >> >
> >> >> >> > Obviously I don't know the full details, but you could implement it as
> >> >> >> > MFD. So the bus part would not get removed even if the PCS fails to
> >> >> >> > probe. Then the PCS can probe when whatever it needs ends up probing.
> >> >> >>
> >> >> >> I was thinking about making the MDIO bus a separate device. But I think
> >> >> >> it will be tricky to get suspend/resume working correctly. And this
> >> >> >> makes conversions more difficult because you cannot just add some
> >> >> >> pcs_get/pcs_put calls, you have to split out the MDIO bus too (which is
> >> >> >> invariably created as a child of the MAC).
> >> >> >>
> >> >> >> And what happens if a developer doesn't realize they have to split off
> >> >> >> the MDIO bus before converting? Everything works fine, except if there
> >> >> >> is some problem loading the PCS driver, which they may not test. Is this
> >> >> >> prohibition against failing after creating a bus documented anywhere? I
> >> >> >> don't recall seeing it...
> >> >> >
> >> >> > What do you mean "failing after creating a bus"? If a bus fails to
> >> >> > be created, you fail like normal, no difference here.
> >> >>
> >> >> Creating the bus is successful, but there's an EPROBE_DEFER failure after
> >> >> that. Which induces the probe loop as described in my initial email.
> >> >
> >> > Then don't allow a defer to happen :)
> >>
> >> Well, I could require all PCS drivers to be built-in I guess. But I suspect
> >> users will want them to be modules to reduce kernel size.
> >
> > True, then just auto-load them as needed like all other busses do.
> >
> >> > Or better yet, just succeed and spin up a new thread for the new bus to
> >> > attach its devices to. That's what many other busses do today.
> >>
> >> Sorry, I'm not sure I follow. How can you attach a device to a thread? Do
> >> you have an example for this?
> >
> > Busses discover their devices in a thread, which then calls probe for
> > them when needed. A device isn't being attached to a thread, sorry for
> > the confusion.
>
> OK, just to clarify, the subsystem I linked above is not a bus, it's an
> internal API. Think GPIO or PWM. The devices typically live on an MDIO
> bus (although a platform bus wouldn't be out of the question). So it's
> "not our job" to load the module; that should be done by the bus. From
> our perspective, if we look up a device and it's not there we don't
> really know if it's ever going to show up.

That's fine, it will show up when it shows up, don't wait around for it :)

> Regarding the auxiliary bus, I tried it out and it works. The conversion
> is around +50 lines, which is not ideal. Ideally I would like to push
> complexity into subsystem code rather than making drivers deal with it,
> but I don't really see a good way to do this in the subsystem. There are
> just a lot of assumptions along the lines of "when you register the
> device you must know what capabilities it supports." For example, fixed
> links (MAC to MAC) are validated when the phylink is created and the
> whole process fails if there's an incompatibility. Which can occur if
> the late-binding component is the thing that adds support for the fixed
> link.

50+ lines is not much, let's see the patch!

thanks,

greg k-h
Thread overview: 15+ messages
2025-06-09 23:57 [BUG] Deferred probe loop with child devices Sean Anderson
2025-06-10 18:34 ` [PATCH] driver core: Prevent deferred probe loops Sean Anderson
2025-06-10 23:32 ` Saravana Kannan
2025-06-10 23:44 ` Sean Anderson
2025-06-11 12:23 ` Greg Kroah-Hartman
2025-06-12 15:53 ` Sean Anderson
2025-06-12 17:56 ` Saravana Kannan
2025-06-12 20:40 ` Sean Anderson
2025-06-17 8:50 ` Greg Kroah-Hartman
2025-06-17 15:35 ` Sean Anderson
2025-06-17 15:49 ` Greg Kroah-Hartman
2025-06-17 17:14 ` Sean Anderson
2025-06-19 8:21 ` Greg Kroah-Hartman
2025-06-19 16:19 ` Sean Anderson
2025-06-19 16:33 ` Greg Kroah-Hartman