From: "Danilo Krummrich" <dakr@kernel.org>
To: "Doug Anderson" <dianders@chromium.org>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Rafael J . Wysocki" <rafael@kernel.org>,
"Alan Stern" <stern@rowland.harvard.edu>,
"Kay Sievers" <kay.sievers@vrfy.org>,
"Saravana Kannan" <saravanak@kernel.org>,
<stable@vger.kernel.org>, <driver-core@lists.linux.dev>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] driver core: Don't let a device probe until it's ready
Date: Wed, 01 Apr 2026 23:05:52 +0200 [thread overview]
Message-ID: <DHI4H61YSZGE.BBM3H9B3H8F7@kernel.org> (raw)
In-Reply-To: <CAD=FV=XJ2qOZ7ftDg70AhD0GRX6TfQb6OVyNaUfFg42+hmwxGQ@mail.gmail.com>
On Tue Mar 31, 2026 at 5:26 PM CEST, Doug Anderson wrote:
> Hi,
>
> On Tue, Mar 31, 2026 at 7:42 AM Danilo Krummrich <dakr@kernel.org> wrote:
>>
>> > @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
>> > if (dev->driver)
>> > return -EBUSY;
>> >
>> > + /*
>> > + * In device_add(), the "struct device" gets linked into the subsystem's
>> > + * list of devices and broadcast to userspace (via uevent) before we're
>> > + * quite ready to probe. Those open pathways to driver probe before
>> > + * we've finished enough of device_add() to reliably support probe.
>> > + * Detect this and tell other pathways to try again later. device_add()
>> > + * itself will also try to probe immediately after setting
>> > + * "ready_to_probe".
>> > + */
>> > + if (!dev->ready_to_probe)
>> > + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready_to_probe");
>>
>> Are we sure this dev->ready_to_probe dance does not introduce a new subtle bug
>> considering that ready_to_probe is within a bitfield of struct device?
>>
>> I.e. are we sure there are no potential concurrent modifications of other fields
>> in this bitfield that are not protected with the device lock?
>>
>> For instance, in __driver_attach() we set dev->can_match if
>> driver_match_device() returns -EPROBE_DEFER without the device lock held.
>
> Bleh. Thank you for catching this. I naively assumed the device lock
> protected the bitfield, but I didn't verify that.
>
>
>> This is exactly the case you want to protect against, i.e. device_add() racing
>> with __driver_attach().
>>
>> So, there is a chance that the dev->ready_to_probe change gets interleaved with
>> a dev->can_match change.
>>
>> I think all this goes away if we stop using bitfields for synchronization; we
>> should convert some of those to flags that we can modify with set_bit() and
>> friends instead.
>
> That sounds reasonable to me. Do you want me to send a v3 where I
> create a new "unsigned long flags" in struct device and introduce this
> as the first flag? If there are additional bitfields you want me to
> convert, I can send them as additional patches in the series as long
> as it's not too big of a change...
I think the one with the biggest potential to cause real issues is can_match, as
it is modified without the device lock held from __driver_attach(), which can be
called at any time concurrently.
(I think there are others as well, but they are more on the theoretical side of
things. For instance, dma_skip_sync is modified by dma_set_mask(), which
strictly speaking does not require the device lock to be held. In practice,
that's probably never an issue since dma_set_mask() is typically called from bus
callbacks usually, but it's not strictly a requirement.)
More in general, from a robustness point of view, everything that is set once at
device creation time is fine to be a bitfield; bits that are used for
synchronization or are modified concurrently, I'd rather use bitops.
next prev parent reply other threads:[~2026-04-01 21:05 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-30 14:28 [PATCH v2] driver core: Don't let a device probe until it's ready Douglas Anderson
2026-03-30 14:49 ` Alan Stern
2026-03-30 15:36 ` Rafael J. Wysocki
2026-03-31 14:42 ` Danilo Krummrich
2026-03-31 15:26 ` Doug Anderson
2026-04-01 21:05 ` Danilo Krummrich [this message]
2026-04-01 21:09 ` Doug Anderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DHI4H61YSZGE.BBM3H9B3H8F7@kernel.org \
--to=dakr@kernel.org \
--cc=dianders@chromium.org \
--cc=driver-core@lists.linux.dev \
--cc=gregkh@linuxfoundation.org \
--cc=kay.sievers@vrfy.org \
--cc=linux-kernel@vger.kernel.org \
--cc=rafael@kernel.org \
--cc=saravanak@kernel.org \
--cc=stable@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.