From: Lukas Wunner <lukas@wunner.de>
To: Keith Busch <kbusch@kernel.org>
Cc: "Jozef Matejcik (Nokia)" <jozef.matejcik@nokia.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>
Subject: Re: pci_probe called concurrently in machine with 2 identical PCI devices causing race condition
Date: Fri, 4 Jul 2025 10:03:30 +0200
Message-ID: <aGeK0lgzJOp7BBqR@wunner.de>
In-Reply-To: <aF1qRv0XlT4EDN-Y@kbusch-mbp>

On Thu, Jun 26, 2025 at 09:41:58AM -0600, Keith Busch wrote:
> On Thu, Jun 26, 2025 at 02:26:56PM +0200, Lukas Wunner wrote:
> > On Thu, Jun 26, 2025 at 12:20:48PM +0000, Jozef Matejcik (Nokia) wrote:
> > > However, I think this can happen in any machine with 2 identical
> > > PCI devices, because as far as I know, existing PCI drivers usually
> > > do not assume that probe function can be called from multiple threads.
> >
> > That can happen all the time and it would be a bug in the driver
> > if it caused issues.
>
> Wait, is that true? I thought that would only happen if the driver
> indicated probe_type PROBE_PREFER_ASYNCHRONOUS. The default appears to
> still be the same as PROBE_FORCE_SYNCHRONOUS.

You're right, and additionally PROBE_PREFER_ASYNCHRONOUS is only honored
on deferred probing. It appears Jozef is using an out-of-tree driver,
so it's unclear if those conditions are met, but if they are, then the
driver's ->probe() hook may be executed multiple times concurrently.

I guess I went out on a limb with the above-quoted statement, so I
apologize for that.
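
For illustration only (this is not code from the thread or from Jozef's
driver): a minimal userspace sketch in C that uses pthreads to stand in
for two concurrent ->probe() invocations on identical devices. All names
(fake_dev, fake_probe, probe_two_concurrently) are hypothetical, and the
sketch assumes the driver keeps a global instance counter. It shows how
a lock around that shared state keeps the two probes from racing.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical driver-global state shared by every device instance. */
static pthread_mutex_t fake_lock = PTHREAD_MUTEX_INITIALIZER;
static int fake_next_index;	/* next unused instance number */

/* Hypothetical per-device state. */
struct fake_dev {
	int index;
};

/* Stand-in for a ->probe() hook; may run in several threads at once. */
static void *fake_probe(void *arg)
{
	struct fake_dev *dev = arg;

	assert(dev != NULL);

	/* Without this lock, both probes could read the same counter value. */
	pthread_mutex_lock(&fake_lock);
	dev->index = fake_next_index++;
	pthread_mutex_unlock(&fake_lock);

	return NULL;
}

/* Probe two devices concurrently; returns 0 on success, an error number otherwise. */
static int probe_two_concurrently(struct fake_dev *a, struct fake_dev *b)
{
	pthread_t ta, tb;
	int ret;

	ret = pthread_create(&ta, NULL, fake_probe, a);
	if (ret)
		return ret;

	ret = pthread_create(&tb, NULL, fake_probe, b);
	if (ret) {
		pthread_join(ta, NULL);
		return ret;
	}

	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return 0;
}
```

Calling probe_two_concurrently() on two fake_dev structures leaves them
with distinct index values regardless of which thread runs first. A real
driver would use a kernel primitive such as a mutex or an IDA instead.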

I've just submitted a patch to honor PROBE_PREFER_ASYNCHRONOUS also on
initial probing:

https://lore.kernel.org/r/53abe6f5ac7c631f95f5d061aa748b192eda0379.1751614426.git.lukas@wunner.de

Would you mind giving it a spin to ascertain that initial probing does
happen asynchronously with it? The nvme driver (which you co-maintain)
already opts in to async probing, so should take advantage of it right
away. GPU drivers seem particularly guilty of lengthy probe times,
so you might want to test async probing with those as well, in order to
have quicker booting on machines used for training neural networks.

Thanks!

Lukas