public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: "Danilo Krummrich" <dakr@kernel.org>
To: "Douglas Anderson" <dianders@chromium.org>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Rafael J . Wysocki" <rafael@kernel.org>,
	"Alan Stern" <stern@rowland.harvard.edu>,
	"Kay Sievers" <kay.sievers@vrfy.org>,
	"Saravana Kannan" <saravanak@kernel.org>,
	<stable@vger.kernel.org>, <driver-core@lists.linux.dev>,
	<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] driver core: Don't let a device probe until it's ready
Date: Tue, 31 Mar 2026 16:42:51 +0200	[thread overview]
Message-ID: <DHH1PD0ASG8H.1K3KG9L658DYN@kernel.org> (raw)
In-Reply-To: <20260330072839.v2.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid>

On Mon Mar 30, 2026 at 4:28 PM CEST, Douglas Anderson wrote:
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 09b98f02f559..4caa3fd1ecdb 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3688,6 +3688,21 @@ int device_add(struct device *dev)
>  		fw_devlink_link_device(dev);
>  	}
>  
> +	/*
> +	 * The moment the device was linked into the bus's "klist_devices" in
> +	 * bus_add_device() then it's possible that probe could have been
> +	 * attempted in a different thread via userspace loading a driver
> +	 * matching the device. "ready_to_probe" being false would have blocked
> +	 * those attempts. Now that all of the above initialization has
> +	 * happened, unblock probe. If probe happens through another thread
> +	 * after this point but before bus_probe_device() runs then it's fine.
> +	 * bus_probe_device() -> device_initial_probe() -> __device_attach()
> +	 * will notice (under device_lock) that the device is already bound.
> +	 */
> +	device_lock(dev);
> +	dev->ready_to_probe = true;
> +	device_unlock(dev);
> +
>  	bus_probe_device(dev);
>  
>  	/*
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 37c7e54e0e4c..a1762254828f 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
>  	if (dev->driver)
>  		return -EBUSY;
>  
> +	/*
> +	 * In device_add(), the "struct device" gets linked into the subsystem's
> +	 * list of devices and broadcast to userspace (via uevent) before we're
> +	 * quite ready to probe. Those open pathways to driver probe before
> +	 * we've finished enough of device_add() to reliably support probe.
> +	 * Detect this and tell other pathways to try again later. device_add()
> +	 * itself will also try to probe immediately after setting
> +	 * "ready_to_probe".
> +	 */
> +	if (!dev->ready_to_probe)
> +		return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready_to_probe");

Are we sure this dev->ready_to_probe dance does not introduce a new subtle bug
considering that ready_to_probe is within a bitfield of struct device?

I.e. are we sure there are no potential concurrent modifications of other fields
in this bitfield that are not protected with the device lock?

For instance, in __driver_attach() we set dev->can_match if
driver_match_device() returns -EPROBE_DEFER without the device lock held.

This is exactly the case you want to protect against, i.e. device_add() racing
with __driver_attach().

So, there is a chance that the dev->ready_to_probe change gets interleaved with
a dev->can_match change.

I think all this goes away if we stop using bitfields for synchronization; we
should convert some of those to flags that we can modify with set_bit() and
friends instead.

> +
>  	dev->can_match = true;
>  	dev_dbg(dev, "bus: '%s': %s: matched device with driver %s\n",
>  		drv->bus->name, __func__, drv->name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index e65d564f01cd..e2f83384b627 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -553,6 +553,8 @@ struct device_physical_location {
>   * @dma_skip_sync: DMA sync operations can be skipped for coherent buffers.
>   * @dma_iommu: Device is using default IOMMU implementation for DMA and
>   *		doesn't rely on dma_ops structure.
> + * @ready_to_probe: If set to %true then device_add() has finished enough
> + *		initialization that probe could be called.
>   *
>   * At the lowest level, every device in a Linux system is represented by an
>   * instance of struct device. The device structure contains the information
> @@ -675,6 +677,7 @@ struct device {
>  #ifdef CONFIG_IOMMU_DMA
>  	bool			dma_iommu:1;
>  #endif
> +	bool			ready_to_probe:1;
>  };

  parent reply	other threads:[~2026-03-31 14:42 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-30 14:28 [PATCH v2] driver core: Don't let a device probe until it's ready Douglas Anderson
2026-03-30 14:49 ` Alan Stern
2026-03-30 15:36 ` Rafael J. Wysocki
2026-03-31 14:42 ` Danilo Krummrich [this message]
2026-03-31 15:26   ` Doug Anderson
2026-04-01 21:05     ` Danilo Krummrich
2026-04-01 21:09       ` Doug Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DHH1PD0ASG8H.1K3KG9L658DYN@kernel.org \
    --to=dakr@kernel.org \
    --cc=dianders@chromium.org \
    --cc=driver-core@lists.linux.dev \
    --cc=gregkh@linuxfoundation.org \
    --cc=kay.sievers@vrfy.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafael@kernel.org \
    --cc=saravanak@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox