From: Sasha Levin <sashal@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Danilo Krummrich <dakr@kernel.org>,
driver-core@lists.linux.dev, linux-kernel@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>,
Maxime Ripard <mripard@kernel.org>,
David Gow <davidgow@google.com>, Stephen Boyd <sboyd@kernel.org>,
Brendan Higgins <brendanhiggins@google.com>,
Rae Moar <rmoar@google.com>,
linux-kselftest@vger.kernel.org, kunit-dev@googlegroups.com
Subject: Re: [CRASH] kunit failures in platform-device-devm
Date: Mon, 2 Mar 2026 07:31:24 -0500
Message-ID: <20260302123125.2282292-1-sashal@kernel.org>
In-Reply-To: <aaRH-aXYKntYyjRS@google.com>
This response was AI-generated by bug-bot. The analysis may contain errors - please verify independently.
Hi Alice,
Thanks for the detailed report. Here is my analysis.
___
1. Bug Summary
The platform-device-devm kunit test suite crashes with a general
protection fault in queued_spin_lock_slowpath() during device
registration, followed by cascading failures including sysfs duplicate
filename errors. The root issue is test isolation: earlier kunit tests
(including the intentional NULL dereference in kunit_test_null_dereference)
can corrupt kernel state, and the platform-device-devm tests use raw
platform device APIs without kunit-managed cleanup, so a test that
crashes leaves its device registered and the following tests fail in
turn. This is a test infrastructure issue, not a driver core bug.
2. Stack Trace Analysis
First crash (Oops #3; two earlier oopses had already occurred):
Oops: general protection fault, probably for non-canonical address 0xb4c3c33fcc9f57f6: 0000 [#3] SMP PTI
CPU: 0 UID: 0 PID: 2500 Comm: kunit_try_catch Tainted: G D W N 7.0.0-rc1-00138-g0c21570fbd3d-dirty #3 PREEMPT(lazy)
Tainted: [D]=DIE, [W]=WARN, [N]=TEST
RIP: 0010:queued_spin_lock_slowpath+0x120/0x1c0
RAX: b4c3c340405a5a26 RBX: ffffb222800e3ce8 RCX: 0000000000050000
RDX: ffffa0a4fec1ddd0 RSI: 0000000000000010 RDI: ffffa0a4c2b43340
Call Trace:
<TASK>
klist_iter_exit+0x2c/0x70
? __pfx___device_attach_driver+0x10/0x10
bus_for_each_drv+0x12a/0x160
__device_attach+0xbf/0x160
device_initial_probe+0x2f/0x50
bus_probe_device+0x8f/0x110
device_add+0x23f/0x3d0
platform_device_add+0x137/0x1d0
platform_device_devm_register_unregister_test+0x6c/0x2e0
kunit_try_run_case+0x8f/0x190
kunit_generic_run_threadfn_adapter+0x1d/0x40
kthread+0x142/0x160
ret_from_fork+0xc7/0x1f0
ret_from_fork_asm+0x1a/0x30
</TASK>
The fault occurs in queued_spin_lock_slowpath()
(kernel/locking/qspinlock.c), called from klist_iter_exit() at
lib/klist.c:311. RAX holds 0xb4c3c340405a5a26 and the faulting address
is 0xb4c3c33fcc9f57f6; neither is a canonical x86-64 address, which
indicates corrupted klist data. The call chain runs in process context:
platform_device_devm_register_unregister_test() calls
platform_device_add() -> device_add() -> bus_probe_device() ->
__device_attach() -> bus_for_each_drv() (drivers/base/bus.c:420),
which iterates the bus's klist_drivers. When klist_iter_exit() then
tries to take the klist spinlock, it touches the corrupted memory.
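For reference, the "non-canonical" check can be sketched in a few lines of user-space C. This is only an illustration, assuming 4-level paging with 48-bit virtual addresses; the function name is made up for this example and is not a kernel helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses (4-level paging). An address is
 * canonical when bits 63..47 are all zeros or all ones.
 */
bool is_canonical_48(uint64_t addr)
{
	uint64_t upper = addr >> 47;	/* bit 47 plus bits 48..63: 17 bits */

	return upper == 0 || upper == 0x1ffff;
}
```

Both values from the oops (the faulting address and RAX) fail this check, while ordinary kernel pointers from the same register dump, such as RDI, pass it.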
Second failure (duplicate sysfs entry):
sysfs: cannot create duplicate filename '/devices/platform/test'
Call Trace:
<TASK>
dump_stack_lvl+0x2d/0x70
sysfs_create_dir_ns+0xe8/0x130
kobject_add_internal+0x1dd/0x360
kobject_add+0x88/0xf0
device_add+0x171/0x3d0
platform_device_add+0x137/0x1d0
platform_device_devm_register_get_unregister_with_devm_test+0x6c/0x2f0
kunit_try_run_case+0x8f/0x190
kunit_generic_run_threadfn_adapter+0x1d/0x40
kthread+0x142/0x160
ret_from_fork+0xc7/0x1f0
ret_from_fork_asm+0x1a/0x30
</TASK>
The assertion at drivers/base/test/platform-device-test.c:97 fails
with ret == -17 (EEXIST) because the first test crashed without
unregistering its device, leaving "/devices/platform/test" in sysfs.
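The cascade can be reproduced with a toy user-space model. This is a sketch only: toy_bus, toy_device_add() and toy_device_del() are hypothetical stand-ins for the sysfs namespace and the platform device calls, not kernel code.

```c
#include <errno.h>
#include <string.h>

#define MAX_DEVS 8

/* Hypothetical stand-in for the set of names under /devices/platform. */
struct toy_bus {
	const char *names[MAX_DEVS];
};

/* Returns 0 on success, -EEXIST for a duplicate name, -ENOSPC when full. */
int toy_device_add(struct toy_bus *bus, const char *name)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_DEVS; i++) {
		if (!bus->names[i]) {
			if (free_slot < 0)
				free_slot = i;
		} else if (strcmp(bus->names[i], name) == 0) {
			return -EEXIST;
		}
	}
	if (free_slot < 0)
		return -ENOSPC;
	bus->names[free_slot] = name;
	return 0;
}

/* Remove a name; a test that crashes never gets this far. */
void toy_device_del(struct toy_bus *bus, const char *name)
{
	for (int i = 0; i < MAX_DEVS; i++) {
		if (bus->names[i] && strcmp(bus->names[i], name) == 0)
			bus->names[i] = NULL;
	}
}
```

If the first "test" is added and never deleted, a second toy_device_add(&bus, "test") returns -EEXIST, which is the -17 seen in the failed assertion.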
3. Root Cause Analysis
This is a test isolation problem, not a driver core bug. Two issues
combine to cause the failures:
(a) Corrupted kernel state from earlier oopses. The Oops header shows
"[#3]", meaning this is the third kernel oops since boot. The taint
flags are consistent with that: [D]=DIE records an earlier oops and
[W]=WARN an earlier warning. The kunit_test_null_dereference() function
in lib/kunit/kunit-test.c:117 intentionally dereferences NULL to test
kunit's fault handling. After multiple oopses, kernel data structures
(including the platform bus klist) can be corrupted, which would explain
the non-canonical address (0xb4c3c33fcc9f57f6) seen during spinlock
acquisition.
(b) Missing test-managed cleanup. The four tests in
platform_device_devm_test_suite all use the raw kernel APIs
platform_device_alloc() and platform_device_add() directly, and all
use the same hardcoded name "test" with PLATFORM_DEVID_NONE
(drivers/base/test/platform-device-test.c:62-77). If a test crashes
before reaching platform_device_unregister(), the device remains
registered and subsequent tests cannot register a device with the
same name.
By contrast, the platform_device_find_by_null_test() in the same
file already uses the kunit-managed helpers kunit_platform_device_alloc()
and kunit_platform_device_add() from lib/kunit/platform.c (added in
commit 5ac79730324c "platform: Add test managed platform_device/driver
APIs"), which automatically unregister the device when the test exits,
even on crash.
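To illustrate why deferred cleanup helps, here is a small user-space model of the idea. It is a sketch under loud assumptions: every toy_* name is hypothetical, setjmp/longjmp stands in for a test case being aborted, and none of this is how KUnit is implemented. It only shows that cleanup registered up front still runs when the test body never reaches its own unregister call.

```c
#include <setjmp.h>

#define MAX_ACTIONS 8

typedef void (*toy_action_fn)(void *ctx);

/* Hypothetical miniature of a test context with deferred cleanup actions. */
struct toy_test {
	jmp_buf abort_env;
	toy_action_fn fn[MAX_ACTIONS];
	void *ctx[MAX_ACTIONS];
	int nr_actions;
	void *priv;
};

/* Queue fn(ctx) to run when the case ends; returns 0, or -1 if full. */
int toy_add_action(struct toy_test *t, toy_action_fn fn, void *ctx)
{
	if (t->nr_actions == MAX_ACTIONS)
		return -1;
	t->fn[t->nr_actions] = fn;
	t->ctx[t->nr_actions] = ctx;
	t->nr_actions++;
	return 0;
}

/* Abort the running case, standing in for a failed assertion or fault. */
void toy_abort(struct toy_test *t)
{
	longjmp(t->abort_env, 1);
}

/* Run one case; deferred actions run in reverse order even after an abort. */
int toy_run_case(struct toy_test *t, void (*body)(struct toy_test *))
{
	int aborted = 0;

	if (setjmp(t->abort_env) != 0)
		aborted = 1;
	else
		body(t);

	while (t->nr_actions > 0) {
		t->nr_actions--;
		t->fn[t->nr_actions](t->ctx[t->nr_actions]);
	}
	return aborted;
}

/* Cleanup action: mark the toy device as no longer registered. */
void toy_unregister_action(void *ctx)
{
	*(int *)ctx = 0;
}

/* Example case: register, defer the cleanup, then abort before unregistering. */
void toy_crashing_case(struct toy_test *t)
{
	int *registered = t->priv;

	*registered = 1;
	toy_add_action(t, toy_unregister_action, registered);
	toy_abort(t);
	*registered = 0;	/* never reached: the explicit unregister */
}
```

Running toy_crashing_case through toy_run_case() reports the abort, yet the device flag ends up cleared because the deferred action ran. With the raw APIs there is no such deferred step, so the flag would stay set.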
4. Affected Versions
The platform-device-devm tests were introduced in commit b4cc44301b9d
("drivers: base: Add basic devm tests for platform devices") by
Maxime Ripard, which predates the kunit-managed platform device helpers
from commit 5ac79730324c. All kernel versions containing these tests
are affected by this test isolation issue. This is not a regression in
the driver core itself.
5. Relevant Commits and Fixes
- b4cc44301b9d ("drivers: base: Add basic devm tests for platform devices")
  Introduced the test suite with raw platform device APIs.
- 699fb50d99039 ("drivers: base: Free devm resources when unregistering a device")
  Fixed devm resource release for unprobed devices; updated test expectations.
- 5ac79730324c ("platform: Add test managed platform_device/driver APIs")
  Added kunit_platform_device_alloc()/kunit_platform_device_add()
  helpers that automatically clean up on test exit.
- 86a5f32ed8813 ("drivers: base: test: Add ...find_device_by...(... NULL) tests")
  Added platform_device_find_by_null_test which already uses the
  kunit-managed helpers correctly.
No fix for this specific test isolation issue exists in mainline yet.
6. Suggested Actions
The devm tests should be updated to use the kunit-managed platform
device helpers (kunit_platform_device_alloc/kunit_platform_device_add)
from include/kunit/platform_device.h, similar to how
platform_device_find_by_null_test() already does. This would ensure
proper cleanup even when tests crash.
One subtlety: the devm tests specifically check that
platform_device_unregister() releases devm resources, so
kunit_platform_device_add() (which auto-unregisters) needs to be
used carefully. The kunit cleanup action should be removed or
disabled before the explicit unregister call, to avoid a second
unregister (and a possible double-free) at test exit.
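The ordering concern can be modelled in user-space C as follows. This is a sketch with hypothetical names (toy_managed_dev and friends). How the deferred action would actually be removed with the real KUnit helpers is not covered here and would need to be checked against lib/kunit/platform.c.

```c
#include <stdbool.h>

/* Hypothetical device with one deferred "unregister at test exit" action. */
struct toy_managed_dev {
	bool registered;
	bool cleanup_pending;
	int unregister_calls;
};

void toy_unregister(struct toy_managed_dev *d)
{
	d->unregister_calls++;
	d->registered = false;
}

/* Managed add: register the device and arm the deferred cleanup. */
void toy_managed_add(struct toy_managed_dev *d)
{
	d->registered = true;
	d->cleanup_pending = true;
}

/* Explicit unregister inside the test: disarm the deferred action first. */
void toy_unregister_in_test(struct toy_managed_dev *d)
{
	d->cleanup_pending = false;
	toy_unregister(d);
}

/* What the framework would do when the test case exits. */
void toy_test_exit(struct toy_managed_dev *d)
{
	if (d->cleanup_pending) {
		d->cleanup_pending = false;
		toy_unregister(d);
	}
}
```

On the normal path (add, explicit unregister, exit) and on the crash path (add, exit) the device is unregistered exactly once. Without the disarm step in toy_unregister_in_test(), the normal path would unregister twice.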
Alternatively, each test could pass PLATFORM_DEVID_AUTO instead of
PLATFORM_DEVID_NONE so that every device gets a unique instance name,
which would at least prevent the cascading sysfs duplicate errors.
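A rough sketch of the naming difference, as I understand platform_device_add(): with PLATFORM_DEVID_NONE the instance name is just the device name, while PLATFORM_DEVID_AUTO appends an allocated id and an ".auto" suffix. The toy_* names and the plain counter used in place of the ID allocator are assumptions made for this example.

```c
#include <stdio.h>

/* Toy stand-ins mirroring PLATFORM_DEVID_NONE / PLATFORM_DEVID_AUTO. */
#define TOY_DEVID_NONE (-1)
#define TOY_DEVID_AUTO (-2)

/*
 * Format the instance name into buf. For TOY_DEVID_AUTO the next id is
 * taken from *next_auto_id, a counter standing in for an ID allocator.
 * Returns the snprintf() result.
 */
int toy_dev_name(char *buf, size_t len, const char *name, int id,
		 int *next_auto_id)
{
	if (id == TOY_DEVID_NONE)
		return snprintf(buf, len, "%s", name);
	if (id == TOY_DEVID_AUTO)
		return snprintf(buf, len, "%s.%d.auto", name,
				(*next_auto_id)++);
	return snprintf(buf, len, "%s.%d", name, id);
}
```

Two devices named "test" registered with the AUTO id end up with distinct instance names, so a leftover device from a crashed test no longer collides with the next one. The leaked device itself is still not cleaned up.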
For the probed tests (probed_platform_device_devm_register_unregister_test
and probed_platform_device_devm_register_get_unregister_with_devm_test),
there is a similar need to register/unregister the fake_driver with
kunit-managed helpers like kunit_platform_driver_register().
In the short term, you can work around this by running the
platform-device-devm suite in isolation:
    ./tools/testing/kunit/kunit.py run --make_options LLVM=1 \
        --arch x86_64 --kconfig_add CONFIG_RUST=y \
        --kconfig_add CONFIG_PCI=y platform-device-devm
This avoids the corrupted state from earlier intentional-crash tests.