From: "Uwe Kleine-König" <u.kleine-koenig@baylibre.com>
To: Danilo Krummrich <dakr@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Saravana Kannan <saravanak@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
driver-core@lists.linux.dev, rust-for-linux@vger.kernel.org,
"linux-kernel@vger.kernel.org Sudeep Holla"
<sudeep.holla@kernel.org>,
Cristian Marussi <cristian.marussi@arm.com>,
arm-scmi@vger.kernel.org
Subject: Probe function registering another driver [Was: Re: [GIT PULL] Driver core changes for 7.0-rc1]
Date: Wed, 15 Apr 2026 11:48:15 +0200 [thread overview]
Message-ID: <ad9cglZCwtsVsGmq@monoceros> (raw)
In-Reply-To: <DHT95ARBC9W8.37Z54YM70UCT2@kernel.org>
[-- Attachment #1: Type: text/plain, Size: 7411 bytes --]
Hello Danilo,
[expanded Cc: a bit for the affected driver]
On Wed, Apr 15, 2026 at 01:04:47AM +0200, Danilo Krummrich wrote:
> On Tue Apr 14, 2026 at 8:39 PM CEST, Uwe Kleine-König wrote:
> > does that mean that there is a driver involved that somehow violates driver
> > core assumptions and should be fixed even without the consistent locking?
>
> Most likely. There are two known cases where interactions with this commit are
> expected.
>
> (1) One of the drivers probed on your machine gets stuck within probe() (or
> any other place where the device lock is held, e.g. bus callbacks) for
> some reason, e.g. due to a deadlock. In this case this commit would
> potentially cause other tasks to get stuck in driver_attach() when they
> attempt to register a driver for the same bus the bad one sits on.
>
> This is also the main reason why we eventually reverted this commit, i.e.
> despite not being the root cause of an issue, it makes an already bad
> situation worse.
>
> (2) If there is a driver probed on your machine that registers another driver
> from within its probe() function for the same bus it results in a
> deadlock. Note that this is transitive -- if a driver is probed on bus A,
> which e.g. deploys devices on bus B that are subsequently probed, and then
> in one of the probe() calls on bus B a driver is registered for bus A,
> that is a deadlock as well.
>
> For instance, this could happen when a platform driver that runs a PCIe
> root complex deploys the corresponding PCI devices and one of the
> corresponding PCI drivers registers a platform driver from probe().
>
> Anyways, for the underlying problem this reveals, the exact constellation
> doesn't matter. The anti-pattern it reveals is that drivers shouldn't be
> registered from another driver's probe() function in the first place.
>
> I fixed a few drivers having this anti-pattern and all of them had other
> (lifetime) issues due to this and I think there are other potential
> deadlock scenarios as well.
>
> > Hints about how to approach the issue (if there is any) welcome.
>
> For (1) I think it's obvious, and I think it wouldn't have gone unnoticed if any
> of the drivers were bad to the point that they're getting stuck in probe() or
> any other place where the device lock is held.
>
> As for (2) I think the best way to catch it is lockdep. Unfortunately, lockdep
> won't be very helpful without some additional tricks, since the driver core
> calls lockdep_set_novalidate_class() for the device lock to avoid false
> positives.
Thanks for the patch, indeed it creates a lockdep splat on my machine:
[ 2.151192] optee: probing for conduit method.
[ 2.195336] optee: revision 4.9
[ 2.203597] optee: Asynchronous notifications enabled
[ 2.203937] optee: dynamic shared memory is enabled
[ 2.218444]
[ 2.218466] ============================================
[ 2.218474] WARNING: possible recursive locking detected
[ 2.218484] 7.0.0-dirty #32 Tainted: G W
[ 2.218496] --------------------------------------------
[ 2.218500] swapper/0/1 is trying to acquire lock:
[ 2.218510] c2035cb4 (dev->mutex#3){+.+.}-{4:4}, at: __driver_attach+0x0/0x270
[ 2.218565]
[ 2.218565] but task is already holding lock:
[ 2.218570] c2035cb4 (dev->mutex#3){+.+.}-{4:4}, at: __driver_attach+0x130/0x270
[ 2.218601]
[ 2.218601] other info that might help us debug this:
[ 2.218607] Possible unsafe locking scenario:
[ 2.218607]
[ 2.218611] CPU0
[ 2.218614] ----
[ 2.218617] lock(dev->mutex#3);
[ 2.218631] lock(dev->mutex#3);
[ 2.218643]
[ 2.218643] *** DEADLOCK ***
[ 2.218643]
[ 2.218647] May be due to missing lock nesting notation
[ 2.218647]
[ 2.218651] 2 locks held by swapper/0/1:
[ 2.218659] #0: c2035cb4 (dev->mutex#3){+.+.}-{4:4}, at: __driver_attach+0x130/0x270
[ 2.218693] #1: c2690cb4 (dev->mutex#59){+.+.}-{4:4}, at: __device_attach+0x34/0x200
[ 2.218728]
[ 2.218728] stack backtrace:
[ 2.218738] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 7.0.0-dirty #32 PREEMPT
[ 2.218757] Tainted: [W]=WARN
[ 2.218762] Hardware name: STM32 (Device Tree Support)
[ 2.218770] Call trace:
[ 2.218780] unwind_backtrace from show_stack+0x18/0x1c
[ 2.218814] show_stack from dump_stack_lvl+0x68/0x88
[ 2.218843] dump_stack_lvl from print_deadlock_bug+0x370/0x380
[ 2.218871] print_deadlock_bug from __lock_acquire+0x1498/0x1f38
[ 2.218895] __lock_acquire from lock_acquire+0x138/0x40c
[ 2.218918] lock_acquire from __driver_attach+0x40/0x270
[ 2.218939] __driver_attach from bus_for_each_dev+0x78/0xc8
[ 2.218966] bus_for_each_dev from bus_add_driver+0xe8/0x238
[ 2.218996] bus_add_driver from driver_register+0x8c/0x140
[ 2.219022] driver_register from scmi_optee_service_probe+0x150/0x1f0
[ 2.219053] scmi_optee_service_probe from really_probe+0xe8/0x424
[ 2.219079] really_probe from __driver_probe_device+0xa4/0x1fc
[ 2.219097] __driver_probe_device from driver_probe_device+0x3c/0xd8
[ 2.219117] driver_probe_device from __device_attach_driver+0xbc/0x174
[ 2.219136] __device_attach_driver from bus_for_each_drv+0x8c/0xe0
[ 2.219160] bus_for_each_drv from __device_attach+0xb0/0x200
[ 2.219184] __device_attach from device_initial_probe+0x50/0x6c
[ 2.219203] device_initial_probe from bus_probe_device+0x2c/0x84
[ 2.219228] bus_probe_device from device_add+0x618/0x87c
[ 2.219257] device_add from optee_enumerate_devices+0x210/0x2cc
[ 2.219286] optee_enumerate_devices from optee_probe+0x8a0/0xa14
[ 2.219311] optee_probe from platform_probe+0x64/0x98
[ 2.219335] platform_probe from really_probe+0xe8/0x424
[ 2.219355] really_probe from __driver_probe_device+0xa4/0x1fc
[ 2.219374] __driver_probe_device from driver_probe_device+0x3c/0xd8
[ 2.219393] driver_probe_device from __driver_attach+0x13c/0x270
[ 2.219412] __driver_attach from bus_for_each_dev+0x78/0xc8
[ 2.219436] bus_for_each_dev from bus_add_driver+0xe8/0x238
[ 2.219465] bus_add_driver from driver_register+0x8c/0x140
[ 2.219490] driver_register from optee_core_init+0x18/0x3c
[ 2.219519] optee_core_init from do_one_initcall+0x74/0x424
[ 2.219548] do_one_initcall from kernel_init_freeable+0x2a8/0x328
[ 2.219574] kernel_init_freeable from kernel_init+0x1c/0x138
[ 2.219599] kernel_init from ret_from_fork+0x14/0x28
[ 2.219620] Exception stack(0xdd811fb0 to 0xdd811ff8)
[ 2.219634] 1fa0: 00000000 00000000 00000000 00000000
[ 2.219648] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 2.219659] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 2.221255] arm-scmi arm-scmi.0.auto: Using scmi_optee_transport
[ 2.221289] arm-scmi arm-scmi.0.auto: SCMI max-rx-timeout: 30ms / max-msg-size: 104bytes / max-msg: 20
The anti-pattern here is that scmi_optee_service_probe() calls
platform_driver_register(&scmi_optee_driver), see
drivers/firmware/arm_scmi/transports/optee.c.
Best regards
Uwe
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2026-04-15 9:48 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-11 23:04 [GIT PULL] Driver core changes for 7.0-rc1 Danilo Krummrich
2026-02-12 3:58 ` pr-tracker-bot
2026-03-01 7:44 ` Linus Torvalds
2026-03-01 7:56 ` Linus Torvalds
2026-03-01 13:01 ` Gary Guo
2026-03-01 13:04 ` Danilo Krummrich
2026-03-01 18:17 ` Linus Torvalds
2026-03-01 20:21 ` Danilo Krummrich
2026-03-01 21:01 ` Linus Torvalds
2026-03-02 19:19 ` Danilo Krummrich
2026-03-01 18:20 ` Danilo Krummrich
2026-04-14 18:39 ` Uwe Kleine-König
2026-04-14 23:04 ` Danilo Krummrich
2026-04-15 9:48 ` Uwe Kleine-König [this message]
2026-04-15 14:16 ` Probe function registering another driver [Was: Re: [GIT PULL] Driver core changes for 7.0-rc1] Cristian Marussi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ad9cglZCwtsVsGmq@monoceros \
--to=u.kleine-koenig@baylibre.com \
--cc=akpm@linux-foundation.org \
--cc=arm-scmi@vger.kernel.org \
--cc=cristian.marussi@arm.com \
--cc=dakr@kernel.org \
--cc=driver-core@lists.linux.dev \
--cc=gregkh@linuxfoundation.org \
--cc=rafael@kernel.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=saravanak@kernel.org \
--cc=sudeep.holla@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox