From: "Danilo Krummrich" <dakr@kernel.org>
To: "Douglas Anderson" <dianders@chromium.org>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Rafael J . Wysocki" <rafael@kernel.org>,
"Alan Stern" <stern@rowland.harvard.edu>,
"Saravana Kannan" <saravanak@kernel.org>,
"Christoph Hellwig" <hch@lst.de>,
"Eric Dumazet" <edumazet@google.com>,
"Johan Hovold" <johan@kernel.org>,
"Leon Romanovsky" <leon@kernel.org>,
"Alexander Lobakin" <aleksander.lobakin@intel.com>,
"Alexey Kardashevskiy" <aik@ozlabs.ru>,
"Robin Murphy" <robin.murphy@arm.com>, <stable@vger.kernel.org>,
<driver-core@lists.linux.dev>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v4 1/9] driver core: Don't let a device probe until it's ready
Date: Sun, 05 Apr 2026 22:58:02 +0200 [thread overview]
Message-ID: <DHLITCTY913U.J59JSQOVL0NH@kernel.org> (raw)
In-Reply-To: <20260403170432.v4.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid>
On Sat Apr 4, 2026 at 2:04 AM CEST, Douglas Anderson wrote:
> Instead of adding another flag to the bitfields already in "struct
> device", instead add a new "flags" field and use that. This allows us
> to freely change the bit from different thread without holding the
> device lock and without worrying about corrupting nearby bits.
I was just about to pick up this patch series (Greg mentioned to pick it up next
week, but we agreed offlist that I will pick it now, so it gets a few more
cycles in linux-next).
Due to this, taking a second glance at the code, I noticed the below issue.
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 09b98f02f559..f07745659de3 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3688,6 +3688,19 @@ int device_add(struct device *dev)
> fw_devlink_link_device(dev);
> }
>
> + /*
> + * The moment the device was linked into the bus's "klist_devices" in
> + * bus_add_device() then it's possible that probe could have been
> + * attempted in a different thread via userspace loading a driver
> + * matching the device. "ready_to_prove" being unset would have
> + * blocked those attempts. Now that all of the above initialization has
> + * happened, unblock probe. If probe happens through another thread
> + * after this point but before bus_probe_device() runs then it's fine.
> + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> + * will notice (under device_lock) that the device is already bound.
> + */
> + dev_set_ready_to_probe(dev);
By converting this to a bitop, we now avoid races with other bitfields (such as
dev->can_match), but I think we still need to take the device lock for this one
specifically:
Task 0 (device_add): Task 1 (__driver_probe_device):
dev->fwnode->dev = dev;
device_lock(dev);
device_lock(dev); if (dev_ready_to_probe())
dev_set_ready_to_probe() access(fwnode->dev);
device_unlock(dev); device_unlock(dev);
Otherwise, nothing prevents the above dev->fwnode->dev = dev assignment to be
re-ordered with dev_set_ready_to_probe() and we are back to the problem the
commit attempts to solve in the first place.
(Technically, this could also be solved with explicit memory barriers - here and
below -, but __driver_probe_device() is always called with the device lock held,
so just taking the device lock seems *much* simpler. Also, in the absolute
majority of cases the lock should be uncontended in device_add() anyways.)
> +
> bus_probe_device(dev);
>
> /*
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 37c7e54e0e4c..8ec93128ea98 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
> if (dev->driver)
> return -EBUSY;
>
> + /*
> + * In device_add(), the "struct device" gets linked into the subsystem's
> + * list of devices and broadcast to userspace (via uevent) before we're
> + * quite ready to probe. Those open pathways to driver probe before
> + * we've finished enough of device_add() to reliably support probe.
> + * Detect this and tell other pathways to try again later. device_add()
> + * itself will also try to probe immediately after setting
> + * "ready_to_probe".
> + */
> + if (!dev_ready_to_probe(dev))
> + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready to probe\n");
> +
> dev->can_match = true;
Focused on ordering from the above, I also noticed that this ordering of
dev_ready_to_probe() and dev->can_match = true is actually pretty subtle and we
should add the following comment.
/*
* Set can_match = true after calling dev_ready_to_probe(), so
* driver_deferred_probe_add() won't actually add the device to the
* deferred probe list when dev_ready_to_probe() returns false.
*
* When dev_ready_to_probe() returns false, it means that device_add()
* will do another probe() attempt for us.
*/
As it would be nice to land this for v7.1-rc1, I can apply both changes on
apply, i.e. not need to resend AFAIC.
Greg, Rafael: does that work for you?
Thanks,
Danilo
next prev parent reply other threads:[~2026-04-05 20:58 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-04 0:04 [PATCH v4 0/9] driver core: Fix some race conditions Douglas Anderson
2026-04-04 0:04 ` Douglas Anderson
2026-04-04 0:04 ` Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 1/9] driver core: Don't let a device probe until it's ready Douglas Anderson
2026-04-04 17:35 ` Danilo Krummrich
2026-04-05 20:58 ` Danilo Krummrich [this message]
2026-04-05 22:39 ` Doug Anderson
2026-04-06 6:39 ` Greg Kroah-Hartman
2026-04-06 14:15 ` Danilo Krummrich
2026-04-06 6:32 ` Marc Zyngier
2026-04-06 14:41 ` Doug Anderson
2026-04-06 14:59 ` Danilo Krummrich
2026-04-06 16:34 ` Marc Zyngier
2026-04-06 16:43 ` Danilo Krummrich
2026-04-06 17:06 ` Marc Zyngier
2026-04-06 18:11 ` Danilo Krummrich
2026-04-06 18:59 ` Doug Anderson
2026-04-06 16:45 ` Doug Anderson
2026-04-04 0:04 ` [PATCH v4 2/9] driver core: Replace dev->can_match with dev_can_match() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 3/9] driver core: Replace dev->dma_iommu with dev_dma_iommu() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 4/9] driver core: Replace dev->dma_skip_sync with dev_dma_skip_sync() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 5/9] driver core: Replace dev->dma_ops_bypass with dev_dma_ops_bypass() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 6/9] driver core: Replace dev->state_synced with dev_state_synced() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 7/9] driver core: Replace dev->dma_coherent with dev_dma_coherent() Douglas Anderson
2026-04-04 0:05 ` Douglas Anderson
2026-04-04 0:05 ` Douglas Anderson
2026-04-06 5:49 ` Vinod Koul
2026-04-06 5:49 ` Vinod Koul
2026-04-06 5:49 ` Vinod Koul
2026-04-04 0:05 ` [PATCH v4 8/9] driver core: Replace dev->of_node_reused with dev_of_node_reused() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 9/9] driver core: Replace dev->offline + ->offline_disabled with accessors Douglas Anderson
2026-04-04 17:11 ` [PATCH v4 0/9] driver core: Fix some race conditions Rafael J. Wysocki
2026-04-04 17:11 ` Rafael J. Wysocki
2026-04-04 17:11 ` Rafael J. Wysocki
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 22:43 ` Doug Anderson
2026-04-05 22:43 ` Doug Anderson
2026-04-05 22:43 ` Doug Anderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DHLITCTY913U.J59JSQOVL0NH@kernel.org \
--to=dakr@kernel.org \
--cc=aik@ozlabs.ru \
--cc=aleksander.lobakin@intel.com \
--cc=dianders@chromium.org \
--cc=driver-core@lists.linux.dev \
--cc=edumazet@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=hch@lst.de \
--cc=johan@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=robin.murphy@arm.com \
--cc=saravanak@kernel.org \
--cc=stable@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.