From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Danilo Krummrich <dakr@kernel.org>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Alan Stern <stern@rowland.harvard.edu>,
Saravana Kannan <saravanak@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Eric Dumazet <edumazet@google.com>,
Johan Hovold <johan@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
Alexey Kardashevskiy <aik@ozlabs.ru>,
Robin Murphy <robin.murphy@arm.com>,
stable@vger.kernel.org, driver-core@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/9] driver core: Don't let a device probe until it's ready
Date: Mon, 6 Apr 2026 08:39:23 +0200 [thread overview]
Message-ID: <2026040606-brewery-veteran-e013@gregkh> (raw)
In-Reply-To: <CAD=FV=Wgw7kU+Xse6dwjE+U06_A_tWcA93UXu6TTf0Erh+Mt8Q@mail.gmail.com>
On Sun, Apr 05, 2026 at 03:39:26PM -0700, Doug Anderson wrote:
> Hi,
>
> On Sun, Apr 5, 2026 at 1:58 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Sat Apr 4, 2026 at 2:04 AM CEST, Douglas Anderson wrote:
> > > Instead of adding another flag to the bitfields already in "struct
> > > device", instead add a new "flags" field and use that. This allows us
> > > to freely change the bit from different thread without holding the
> > > device lock and without worrying about corrupting nearby bits.
> >
> > I was just about to pick up this patch series (Greg mentioned to pick it up next
> > week, but we agreed offlist that I will pick it now, so it gets a few more
> > cycles in linux-next).
> >
> > Due to this, taking a second glance at the code, I noticed the below issue.
> >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index 09b98f02f559..f07745659de3 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -3688,6 +3688,19 @@ int device_add(struct device *dev)
> > > fw_devlink_link_device(dev);
> > > }
> > >
> > > + /*
> > > + * The moment the device was linked into the bus's "klist_devices" in
> > > + * bus_add_device() then it's possible that probe could have been
> > > + * attempted in a different thread via userspace loading a driver
> > > + * matching the device. "ready_to_prove" being unset would have
> > > + * blocked those attempts. Now that all of the above initialization has
> > > + * happened, unblock probe. If probe happens through another thread
> > > + * after this point but before bus_probe_device() runs then it's fine.
> > > + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> > > + * will notice (under device_lock) that the device is already bound.
> > > + */
> > > + dev_set_ready_to_probe(dev);
> >
> > By converting this to a bitop, we now avoid races with other bitfields (such as
> > dev->can_match), but I think we still need to take the device lock for this one
> > specifically:
> >
> > Task 0 (device_add): Task 1 (__driver_probe_device):
> >
> > dev->fwnode->dev = dev;
> > device_lock(dev);
> > device_lock(dev); if (dev_ready_to_probe())
> > dev_set_ready_to_probe() access(fwnode->dev);
> > device_unlock(dev); device_unlock(dev);
> >
> > Otherwise, nothing prevents the above dev->fwnode->dev = dev assignment to be
> > re-ordered with dev_set_ready_to_probe() and we are back to the problem the
> > commit attempts to solve in the first place.
>
> Ah, that sounds like a reasonable concern, and I agree that taking the
> device_lock() here seems like the cleanest solution.
>
>
> > > @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
> > > if (dev->driver)
> > > return -EBUSY;
> > >
> > > + /*
> > > + * In device_add(), the "struct device" gets linked into the subsystem's
> > > + * list of devices and broadcast to userspace (via uevent) before we're
> > > + * quite ready to probe. Those open pathways to driver probe before
> > > + * we've finished enough of device_add() to reliably support probe.
> > > + * Detect this and tell other pathways to try again later. device_add()
> > > + * itself will also try to probe immediately after setting
> > > + * "ready_to_probe".
> > > + */
> > > + if (!dev_ready_to_probe(dev))
> > > + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready to probe\n");
> > > +
> > > dev->can_match = true;
> >
> > Focused on ordering from the above, I also noticed that this ordering of
> > dev_ready_to_probe() and dev->can_match = true is actually pretty subtle and we
> > should add the following comment.
> >
> > /*
> > * Set can_match = true after calling dev_ready_to_probe(), so
> > * driver_deferred_probe_add() won't actually add the device to the
> > * deferred probe list when dev_ready_to_probe() returns false.
> > *
> > * When dev_ready_to_probe() returns false, it means that device_add()
> > * will do another probe() attempt for us.
> > */
>
> Sure. That seems useful for future readers.
>
>
> > As it would be nice to land this for v7.1-rc1, I can apply both changes on
> > apply, i.e. not need to resend AFAIC.
>
> Thanks! I'm happy to resend a new version if need be, but I'm also
> happy if you want to make changes when applying.
New version is always best :)
thanks,
greg k-h
next prev parent reply other threads:[~2026-04-06 6:39 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-04 0:04 [PATCH v4 0/9] driver core: Fix some race conditions Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 1/9] driver core: Don't let a device probe until it's ready Douglas Anderson
2026-04-04 17:35 ` Danilo Krummrich
2026-04-05 20:58 ` Danilo Krummrich
2026-04-05 22:39 ` Doug Anderson
2026-04-06 6:39 ` Greg Kroah-Hartman [this message]
2026-04-06 14:15 ` Danilo Krummrich
2026-04-06 6:32 ` Marc Zyngier
2026-04-06 14:41 ` Doug Anderson
2026-04-06 14:59 ` Danilo Krummrich
2026-04-06 16:34 ` Marc Zyngier
2026-04-06 16:43 ` Danilo Krummrich
2026-04-06 17:06 ` Marc Zyngier
2026-04-06 18:11 ` Danilo Krummrich
2026-04-06 18:59 ` Doug Anderson
2026-04-06 16:45 ` Doug Anderson
2026-04-04 0:04 ` [PATCH v4 2/9] driver core: Replace dev->can_match with dev_can_match() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 3/9] driver core: Replace dev->dma_iommu with dev_dma_iommu() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 4/9] driver core: Replace dev->dma_skip_sync with dev_dma_skip_sync() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 5/9] driver core: Replace dev->dma_ops_bypass with dev_dma_ops_bypass() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 6/9] driver core: Replace dev->state_synced with dev_state_synced() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 7/9] driver core: Replace dev->dma_coherent with dev_dma_coherent() Douglas Anderson
2026-04-06 5:49 ` Vinod Koul
2026-04-04 0:05 ` [PATCH v4 8/9] driver core: Replace dev->of_node_reused with dev_of_node_reused() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 9/9] driver core: Replace dev->offline + ->offline_disabled with accessors Douglas Anderson
2026-04-04 17:11 ` [PATCH v4 0/9] driver core: Fix some race conditions Rafael J. Wysocki
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 22:43 ` Doug Anderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2026040606-brewery-veteran-e013@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=aik@ozlabs.ru \
--cc=aleksander.lobakin@intel.com \
--cc=dakr@kernel.org \
--cc=dianders@chromium.org \
--cc=driver-core@lists.linux.dev \
--cc=edumazet@google.com \
--cc=hch@lst.de \
--cc=johan@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=robin.murphy@arm.com \
--cc=saravanak@kernel.org \
--cc=stable@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox