From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Doug Anderson <dianders@chromium.org>
Cc: Danilo Krummrich <dakr@kernel.org>,
"Rafael J . Wysocki" <rafael@kernel.org>,
Alan Stern <stern@rowland.harvard.edu>,
Saravana Kannan <saravanak@kernel.org>,
Christoph Hellwig <hch@lst.de>,
Eric Dumazet <edumazet@google.com>,
Johan Hovold <johan@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
Alexander Lobakin <aleksander.lobakin@intel.com>,
Alexey Kardashevskiy <aik@ozlabs.ru>,
Robin Murphy <robin.murphy@arm.com>,
stable@vger.kernel.org, driver-core@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/9] driver core: Don't let a device probe until it's ready
Date: Mon, 6 Apr 2026 08:39:23 +0200 [thread overview]
Message-ID: <2026040606-brewery-veteran-e013@gregkh> (raw)
In-Reply-To: <CAD=FV=Wgw7kU+Xse6dwjE+U06_A_tWcA93UXu6TTf0Erh+Mt8Q@mail.gmail.com>
On Sun, Apr 05, 2026 at 03:39:26PM -0700, Doug Anderson wrote:
> Hi,
>
> On Sun, Apr 5, 2026 at 1:58 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >
> > On Sat Apr 4, 2026 at 2:04 AM CEST, Douglas Anderson wrote:
> > > Instead of adding another flag to the bitfields already in "struct
> > > device", instead add a new "flags" field and use that. This allows us
> > > to freely change the bit from different thread without holding the
> > > device lock and without worrying about corrupting nearby bits.
> >
> > I was just about to pick up this patch series (Greg mentioned to pick it up next
> > week, but we agreed offlist that I will pick it now, so it gets a few more
> > cycles in linux-next).
> >
> > Due to this, taking a second glance at the code, I noticed the below issue.
> >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index 09b98f02f559..f07745659de3 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -3688,6 +3688,19 @@ int device_add(struct device *dev)
> > > fw_devlink_link_device(dev);
> > > }
> > >
> > > + /*
> > > + * The moment the device was linked into the bus's "klist_devices" in
> > > + * bus_add_device() then it's possible that probe could have been
> > > + * attempted in a different thread via userspace loading a driver
> > > + * matching the device. "ready_to_prove" being unset would have
> > > + * blocked those attempts. Now that all of the above initialization has
> > > + * happened, unblock probe. If probe happens through another thread
> > > + * after this point but before bus_probe_device() runs then it's fine.
> > > + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> > > + * will notice (under device_lock) that the device is already bound.
> > > + */
> > > + dev_set_ready_to_probe(dev);
> >
> > By converting this to a bitop, we now avoid races with other bitfields (such as
> > dev->can_match), but I think we still need to take the device lock for this one
> > specifically:
> >
> > Task 0 (device_add): Task 1 (__driver_probe_device):
> >
> > dev->fwnode->dev = dev;
> > device_lock(dev);
> > device_lock(dev); if (dev_ready_to_probe())
> > dev_set_ready_to_probe() access(fwnode->dev);
> > device_unlock(dev); device_unlock(dev);
> >
> > Otherwise, nothing prevents the above dev->fwnode->dev = dev assignment to be
> > re-ordered with dev_set_ready_to_probe() and we are back to the problem the
> > commit attempts to solve in the first place.
>
> Ah, that sounds like a reasonable concern, and I agree that taking the
> device_lock() here seems like the cleanest solution.
>
>
> > > @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
> > > if (dev->driver)
> > > return -EBUSY;
> > >
> > > + /*
> > > + * In device_add(), the "struct device" gets linked into the subsystem's
> > > + * list of devices and broadcast to userspace (via uevent) before we're
> > > + * quite ready to probe. Those open pathways to driver probe before
> > > + * we've finished enough of device_add() to reliably support probe.
> > > + * Detect this and tell other pathways to try again later. device_add()
> > > + * itself will also try to probe immediately after setting
> > > + * "ready_to_probe".
> > > + */
> > > + if (!dev_ready_to_probe(dev))
> > > + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready to probe\n");
> > > +
> > > dev->can_match = true;
> >
> > Focused on ordering from the above, I also noticed that this ordering of
> > dev_ready_to_probe() and dev->can_match = true is actually pretty subtle and we
> > should add the following comment.
> >
> > /*
> > * Set can_match = true after calling dev_ready_to_probe(), so
> > * driver_deferred_probe_add() won't actually add the device to the
> > * deferred probe list when dev_ready_to_probe() returns false.
> > *
> > * When dev_ready_to_probe() returns false, it means that device_add()
> > * will do another probe() attempt for us.
> > */
>
> Sure. That seems useful for future readers.
>
>
> > As it would be nice to land this for v7.1-rc1, I can apply both changes on
> > apply, i.e. not need to resend AFAIC.
>
> Thanks! I'm happy to resend a new version if need be, but I'm also
> happy if you want to make changes when applying.
New version is always best :)
thanks,
greg k-h
next prev parent reply other threads:[~2026-04-06 6:39 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-04 0:04 [PATCH v4 0/9] driver core: Fix some race conditions Douglas Anderson
2026-04-04 0:04 ` Douglas Anderson
2026-04-04 0:04 ` Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 1/9] driver core: Don't let a device probe until it's ready Douglas Anderson
2026-04-04 17:35 ` Danilo Krummrich
2026-04-05 20:58 ` Danilo Krummrich
2026-04-05 22:39 ` Doug Anderson
2026-04-06 6:39 ` Greg Kroah-Hartman [this message]
2026-04-06 14:15 ` Danilo Krummrich
2026-04-06 6:32 ` Marc Zyngier
2026-04-06 14:41 ` Doug Anderson
2026-04-06 14:59 ` Danilo Krummrich
2026-04-06 16:34 ` Marc Zyngier
2026-04-06 16:43 ` Danilo Krummrich
2026-04-06 17:06 ` Marc Zyngier
2026-04-06 18:11 ` Danilo Krummrich
2026-04-06 18:59 ` Doug Anderson
2026-04-06 16:45 ` Doug Anderson
2026-04-04 0:04 ` [PATCH v4 2/9] driver core: Replace dev->can_match with dev_can_match() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 3/9] driver core: Replace dev->dma_iommu with dev_dma_iommu() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 4/9] driver core: Replace dev->dma_skip_sync with dev_dma_skip_sync() Douglas Anderson
2026-04-04 0:04 ` [PATCH v4 5/9] driver core: Replace dev->dma_ops_bypass with dev_dma_ops_bypass() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 6/9] driver core: Replace dev->state_synced with dev_state_synced() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 7/9] driver core: Replace dev->dma_coherent with dev_dma_coherent() Douglas Anderson
2026-04-04 0:05 ` Douglas Anderson
2026-04-04 0:05 ` Douglas Anderson
2026-04-06 5:49 ` Vinod Koul
2026-04-06 5:49 ` Vinod Koul
2026-04-06 5:49 ` Vinod Koul
2026-04-04 0:05 ` [PATCH v4 8/9] driver core: Replace dev->of_node_reused with dev_of_node_reused() Douglas Anderson
2026-04-04 0:05 ` [PATCH v4 9/9] driver core: Replace dev->offline + ->offline_disabled with accessors Douglas Anderson
2026-04-04 17:11 ` [PATCH v4 0/9] driver core: Fix some race conditions Rafael J. Wysocki
2026-04-04 17:11 ` Rafael J. Wysocki
2026-04-04 17:11 ` Rafael J. Wysocki
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 5:27 ` Greg Kroah-Hartman
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 12:02 ` Danilo Krummrich
2026-04-05 22:43 ` Doug Anderson
2026-04-05 22:43 ` Doug Anderson
2026-04-05 22:43 ` Doug Anderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2026040606-brewery-veteran-e013@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=aik@ozlabs.ru \
--cc=aleksander.lobakin@intel.com \
--cc=dakr@kernel.org \
--cc=dianders@chromium.org \
--cc=driver-core@lists.linux.dev \
--cc=edumazet@google.com \
--cc=hch@lst.de \
--cc=johan@kernel.org \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=robin.murphy@arm.com \
--cc=saravanak@kernel.org \
--cc=stable@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.