* [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react()
2026-06-15 16:44 [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests wen.yang
@ 2026-06-15 16:44 ` wen.yang
2026-06-17 11:12 ` Nam Cao
2026-06-17 15:58 ` Nam Cao
2026-06-15 16:44 ` [PATCH 2/3] rv/reactors: add KUnit tests for reactor_printk wen.yang
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: wen.yang @ 2026-06-15 16:44 UTC (permalink / raw)
To: Gabriele Monaco
Cc: Nam Cao, linux-trace-kernel, linux-kernel, Wen Yang,
Thomas Weißschuh
From: Wen Yang <wen.yang@linux.dev>
The DEFINE_WAIT_OVERRIDE_MAP() macro creates a lockdep map with
wait_type_inner = LD_WAIT_CONFIG, which inherits the outer context's
wait type. When rv_react() is called from a LD_WAIT_FREE context
(e.g., a KUnit test with busy-wait), and the reactor callback triggers
a timer interrupt during the busy-loop, the interrupt exit path attempts
to schedule (preempt_schedule_irq -> __schedule -> rq->__lock), which is
LD_WAIT_SPIN. Lockdep then reports:
[ BUG: Invalid wait context ]
context-{5:5}
1 lock held by kunit_try_catch/209:
#0: rv_react_map-wait-type-override at rv_react+0x9d/0xf0
The wait_type_override map allowed the outer LD_WAIT_FREE to propagate
inward, but scheduling from an interrupt is LD_WAIT_SPIN, violating the
constraint.
Fix by explicitly setting wait_type_inner = LD_WAIT_SPIN, which is the
tightest constraint rv_react() callbacks must satisfy: they may not
sleep (LD_WAIT_SLEEP) or use mutexes, but can use spinlocks and be
interrupted. This matches the documented LD_WAIT_FREE constraint.
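
The nesting rule discussed above can be modelled in userspace. This is only a sketch, not the lockdep implementation: it assumes lockdep rejects an acquisition whose outer wait type is greater than the innermost constraint currently in force, and that the wait types are ordered FREE < SPIN < CONFIG < SLEEP. All `model_` and `MODEL_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified userspace model of lockdep wait types. The ordering is an
 * assumption modelled on enum lockdep_wait_type; the MODEL_ prefix marks
 * these as illustrative names, not kernel symbols.
 */
enum model_wait_type {
	MODEL_WAIT_INV = 0,	/* not checked */
	MODEL_WAIT_FREE,	/* wait-free code */
	MODEL_WAIT_SPIN,	/* raw spinlocks, spin loops */
	MODEL_WAIT_CONFIG,	/* spinlock_t (sleeps on PREEMPT_RT) */
	MODEL_WAIT_SLEEP,	/* mutexes and other sleeping locks */
};

/*
 * Hypothetical helper: a lock may be taken only if its outer wait type
 * does not exceed the innermost constraint currently held.
 */
bool model_wait_context_ok(enum model_wait_type curr_inner,
			   enum model_wait_type next_outer)
{
	return next_outer <= curr_inner;
}

/* Walks through the three cases relevant to this patch. */
void model_self_check(void)
{
	/* Inner constraint FREE: taking a SPIN-type lock is rejected. */
	assert(!model_wait_context_ok(MODEL_WAIT_FREE, MODEL_WAIT_SPIN));
	/* Inner constraint SPIN: a SPIN-type lock is accepted. */
	assert(model_wait_context_ok(MODEL_WAIT_SPIN, MODEL_WAIT_SPIN));
	/* Inner constraint SPIN: a sleeping lock is still rejected. */
	assert(!model_wait_context_ok(MODEL_WAIT_SPIN, MODEL_WAIT_SLEEP));
}
```

In this model, an inner constraint of FREE rejects the scheduler's raw spinlock, while an inner constraint of SPIN accepts it and still rejects sleeping locks.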
Fixes: 69d8895cb9a9 ("rv: Add explicit lockdep context for reactors")
Signed-off-by: Wen Yang <wen.yang@linux.dev>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
kernel/trace/rv/rv_reactors.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/trace/rv/rv_reactors.c b/kernel/trace/rv/rv_reactors.c
index 460af07f7aba..423f843bbd68 100644
--- a/kernel/trace/rv/rv_reactors.c
+++ b/kernel/trace/rv/rv_reactors.c
@@ -465,7 +465,13 @@ int init_rv_reactors(struct dentry *root_dir)
void rv_react(struct rv_monitor *monitor, const char *msg, ...)
{
- static DEFINE_WAIT_OVERRIDE_MAP(rv_react_map, LD_WAIT_FREE);
+#ifdef CONFIG_LOCKDEP
+ static struct lockdep_map rv_react_map = {
+ .name = "rv_react",
+ .wait_type_outer = LD_WAIT_FREE,
+ .wait_type_inner = LD_WAIT_SPIN,
+ };
+#endif
va_list args;
if (!rv_reacting_on() || !monitor->react)
--
2.25.1
* Re: [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react()
2026-06-15 16:44 ` [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react() wen.yang
@ 2026-06-17 11:12 ` Nam Cao
2026-06-17 15:58 ` Nam Cao
1 sibling, 0 replies; 14+ messages in thread
From: Nam Cao @ 2026-06-17 11:12 UTC (permalink / raw)
To: wen.yang, Gabriele Monaco
Cc: linux-trace-kernel, linux-kernel, Wen Yang, Thomas Weißschuh
wen.yang@linux.dev writes:
> The DEFINE_WAIT_OVERRIDE_MAP() macro creates a lockdep map with
> wait_type_inner = LD_WAIT_CONFIG, which inherits the outer context's
> wait type. When rv_react() is called from a LD_WAIT_FREE context
> (e.g., a KUnit test with busy-wait), and the reactor callback triggers
> a timer interrupt during the busy-loop,
I am confused by the last sentence. How can the reactor callback trigger a
timer interrupt?
Do you mean a timer interrupt happens in the middle of the reactor
callback? And this only happens sporadically, right?
> the interrupt exit path attempts
> to schedule (preempt_schedule_irq -> __schedule -> rq->__lock), which is
> LD_WAIT_SPIN. Lockdep then reports:
>
> [ BUG: Invalid wait context ]
> context-{5:5}
> 1 lock held by kunit_try_catch/209:
> #0: rv_react_map-wait-type-override at rv_react+0x9d/0xf0
>
> The wait_type_override map allowed the outer LD_WAIT_FREE to propagate
> inward, but scheduling from an interrupt is LD_WAIT_SPIN, violating the
> constraint.
>
> Fix by explicitly setting wait_type_inner = LD_WAIT_SPIN, which is the
> tightest constraint rv_react() callbacks must satisfy: they may not
> sleep (LD_WAIT_SLEEP) or use mutexes, but can use spinlocks and be
> interrupted. This matches the documented LD_WAIT_FREE constraint.
These concepts are new to me. Let me do some studying before reviewing.
Nam
* Re: [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react()
2026-06-15 16:44 ` [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react() wen.yang
2026-06-17 11:12 ` Nam Cao
@ 2026-06-17 15:58 ` Nam Cao
1 sibling, 0 replies; 14+ messages in thread
From: Nam Cao @ 2026-06-17 15:58 UTC (permalink / raw)
To: wen.yang, Gabriele Monaco
Cc: linux-trace-kernel, linux-kernel, Wen Yang, Thomas Weißschuh
wen.yang@linux.dev writes:
> void rv_react(struct rv_monitor *monitor, const char *msg, ...)
> {
> - static DEFINE_WAIT_OVERRIDE_MAP(rv_react_map, LD_WAIT_FREE);
> +#ifdef CONFIG_LOCKDEP
> + static struct lockdep_map rv_react_map = {
> + .name = "rv_react",
> + .wait_type_outer = LD_WAIT_FREE,
> + .wait_type_inner = LD_WAIT_SPIN,
> + };
> +#endif
> va_list args;
>
> if (!rv_reacting_on() || !monitor->react)
From my limited understanding of lockdep, this looks fine to me. It will
no longer warn us if a reactor takes a raw_spin_lock, but I think that's fine.
But I would wait for Thomas's thoughts on this. He will be back next
week.
Nam
* [PATCH 2/3] rv/reactors: add KUnit tests for reactor_printk
2026-06-15 16:44 [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests wen.yang
2026-06-15 16:44 ` [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react() wen.yang
@ 2026-06-15 16:44 ` wen.yang
2026-06-15 16:44 ` [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic wen.yang
2026-06-17 15:41 ` [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests Gabriele Monaco
3 siblings, 0 replies; 14+ messages in thread
From: wen.yang @ 2026-06-15 16:44 UTC (permalink / raw)
To: Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel, Wen Yang
From: Wen Yang <wen.yang@linux.dev>
Add KUnit tests for the printk reactor covering:
- Reactor registration and unregistration lifecycle
- React callback invocation via rv_react()
- Double registration rejection
- Multiple register/unregister cycles
The mock callback calls vprintk_deferred() — the same path as the real
reactor — then busy-waits to simulate I/O back-pressure, exercising the
LD_WAIT_FREE constraint of rv_react() under load.
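
The timed busy-wait described above can be sketched in userspace. This is an illustration only: it substitutes POSIX `clock_gettime(CLOCK_MONOTONIC)` for the kernel's `sched_clock()` and omits `cpu_relax()`. The `model_` names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Same duration as MOCK_REACT_DURATION_NS in the patch: 5 ms. */
#define MODEL_REACT_DURATION_NS 5000000ULL

/* Userspace stand-in for sched_clock(): monotonic time in nanoseconds. */
uint64_t model_clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Spin without sleeping or taking locks until at least duration_ns has
 * elapsed. Returns the time actually spent in the loop.
 */
uint64_t model_busy_wait(uint64_t duration_ns)
{
	uint64_t start = model_clock_ns();
	uint64_t now;

	do {
		now = model_clock_ns();
	} while (now - start < duration_ns);

	return now - start;
}

/* The loop never returns before the requested duration has elapsed. */
void model_busy_wait_check(void)
{
	assert(model_busy_wait(MODEL_REACT_DURATION_NS) >= MODEL_REACT_DURATION_NS);
}
```

Holding the CPU for several milliseconds makes it likely that a timer interrupt arrives while the callback is still running.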
Signed-off-by: Wen Yang <wen.yang@linux.dev>
---
kernel/trace/rv/Kconfig | 10 ++
kernel/trace/rv/Makefile | 1 +
kernel/trace/rv/reactor_printk_kunit.c | 123 +++++++++++++++++++++++++
3 files changed, 134 insertions(+)
create mode 100644 kernel/trace/rv/reactor_printk_kunit.c
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index 3884b14df375..ff47895c897f 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -104,6 +104,16 @@ config RV_REACT_PRINTK
Enables the printk reactor. The printk reactor emits a printk()
message if an exception is found.
+config RV_REACT_PRINTK_KUNIT
+ bool "KUnit tests for reactor_printk" if !KUNIT_ALL_TESTS
+ depends on RV_REACT_PRINTK && KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ This builds KUnit tests for the printk reactor. These are only
+ for development and testing, not for regular kernel use cases.
+
+ If unsure, say N.
+
config RV_REACT_PANIC
bool "Panic reactor"
depends on RV_REACTORS
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index 94498da35b37..ef0a2dcb927c 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -23,4 +23,5 @@ obj-$(CONFIG_RV_MON_NOMISS) += monitors/nomiss/nomiss.o
# Add new monitors here
obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
+obj-$(CONFIG_RV_REACT_PRINTK_KUNIT) += reactor_printk_kunit.o
obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
diff --git a/kernel/trace/rv/reactor_printk_kunit.c b/kernel/trace/rv/reactor_printk_kunit.c
new file mode 100644
index 000000000000..933aa5602226
--- /dev/null
+++ b/kernel/trace/rv/reactor_printk_kunit.c
@@ -0,0 +1,123 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for reactor_printk
+ *
+ */
+
+#include <kunit/test.h>
+#include <linux/rv.h>
+#include <linux/printk.h>
+#include <linux/sched/clock.h>
+#include <linux/processor.h>
+
+/*
+ * Simulated execution time for mock_printk_react (sched_clock units,
+ * nanoseconds). Models the time a real printk reactor callback may consume
+ * under I/O pressure, exercising the LD_WAIT_FREE constraint of rv_react().
+ */
+#define MOCK_REACT_DURATION_NS 5000000ULL
+
+/*
+ * Mock react callback mirroring rv_printk_reaction().
+ *
+ * Calls vprintk_deferred() — the same path as the real reactor — then holds
+ * the CPU for MOCK_REACT_DURATION_NS via a sched_clock() timed busy-loop,
+ * simulating a callback that is slow due to I/O back-pressure.
+ * sched_clock() is notrace and lock-free; no sleep or lock acquisition is
+ * performed, satisfying the LD_WAIT_FREE constraint of rv_react().
+ */
+__printf(1, 0) static void mock_printk_react(const char *msg, va_list args)
+{
+ u64 start = sched_clock();
+
+ vprintk_deferred(msg, args);
+
+ while (sched_clock() - start < MOCK_REACT_DURATION_NS)
+ cpu_relax();
+}
+
+static struct rv_reactor mock_printk_reactor = {
+ .name = "test_printk",
+ .description = "test printk reactor",
+ .react = mock_printk_react,
+};
+
+/* Test 1: register and unregister reactor */
+static void test_printk_register_unregister(struct kunit *test)
+{
+ int ret;
+
+ ret = rv_register_reactor(&mock_printk_reactor);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+ KUNIT_EXPECT_STREQ(test, mock_printk_reactor.name, "test_printk");
+
+ rv_unregister_reactor(&mock_printk_reactor);
+}
+
+/* Test 2: react callback is invoked via rv_react() */
+static void test_printk_react_called(struct kunit *test)
+{
+ struct rv_reactor reactor = {
+ .name = "printk_cb_test",
+ .react = mock_printk_react,
+ };
+ struct rv_monitor monitor = {
+ .name = "test_monitor",
+ .reactor = &reactor,
+ .react = mock_printk_react,
+ };
+
+ rv_react(&monitor, "printk violation message");
+}
+
+/* Test 3: double registration should fail */
+static void test_printk_double_register(struct kunit *test)
+{
+ int ret;
+
+ ret = rv_register_reactor(&mock_printk_reactor);
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ ret = rv_register_reactor(&mock_printk_reactor);
+ KUNIT_EXPECT_NE(test, ret, 0);
+
+ rv_unregister_reactor(&mock_printk_reactor);
+}
+
+/* Test 4: register/unregister cycle */
+static void test_printk_register_cycle(struct kunit *test)
+{
+ int ret, i;
+
+ for (i = 0; i < 5; i++) {
+ ret = rv_register_reactor(&mock_printk_reactor);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+
+ rv_unregister_reactor(&mock_printk_reactor);
+ }
+}
+
+/* Test 5: react callback is not NULL (printk reactors must provide react) */
+static void test_printk_react_not_null(struct kunit *test)
+{
+ KUNIT_EXPECT_NOT_NULL(test, mock_printk_reactor.react);
+}
+
+static struct kunit_case reactor_printk_kunit_cases[] = {
+ KUNIT_CASE(test_printk_register_unregister),
+ KUNIT_CASE(test_printk_react_called),
+ KUNIT_CASE(test_printk_double_register),
+ KUNIT_CASE(test_printk_register_cycle),
+ KUNIT_CASE(test_printk_react_not_null),
+ {}
+};
+
+static struct kunit_suite reactor_printk_kunit_suite = {
+ .name = "rv_reactor_printk",
+ .test_cases = reactor_printk_kunit_cases,
+};
+
+kunit_test_suite(reactor_printk_kunit_suite);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KUnit tests for reactor_printk");
--
2.25.1
* [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic
2026-06-15 16:44 [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests wen.yang
2026-06-15 16:44 ` [PATCH 1/3] rv/reactors: fix lockdep "Invalid wait context" in rv_react() wen.yang
2026-06-15 16:44 ` [PATCH 2/3] rv/reactors: add KUnit tests for reactor_printk wen.yang
@ 2026-06-15 16:44 ` wen.yang
2026-06-20 23:30 ` XIAO WU
2026-06-17 15:41 ` [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests Gabriele Monaco
3 siblings, 1 reply; 14+ messages in thread
From: wen.yang @ 2026-06-15 16:44 UTC (permalink / raw)
To: Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel, Wen Yang
From: Wen Yang <wen.yang@linux.dev>
Add KUnit tests for the panic reactor covering:
- Reactor registration and unregistration lifecycle
- Panic notifier chain reachability
The real rv_panic_reaction() calls vpanic(), which is __noreturn and
halts the system. KUnit cannot test across that boundary. Instead, the
test drives atomic_notifier_call_chain(&panic_notifier_list, ...) directly
with a high-priority mock notifier that returns NOTIFY_STOP, verifying
that the panic notification propagates without triggering real handlers
(kdump, watchdog, reboot).
The mock notifier busy-waits to simulate real handler execution time
(e.g., crash_save_vmcoreinfo, emergency_restart preamble) under the
panic context constraints.
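
The interception technique described above can be modelled in userspace. This sketch is not the kernel notifier implementation. It assumes that a chain is kept sorted by descending priority and that traversal stops at the first callback returning a stop value. All `model_` and `MODEL_` names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MODEL_NOTIFY_DONE 0	/* continue with the next callback */
#define MODEL_NOTIFY_STOP 1	/* stop traversing the chain */

struct model_notifier {
	int (*call)(struct model_notifier *nb, const void *data);
	int priority;
	struct model_notifier *next;
};

static int model_real_calls;	/* invocations of the "real" handler */

/* Insert nb so the chain stays sorted by descending priority. */
void model_chain_register(struct model_notifier **head,
			  struct model_notifier *nb)
{
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Invoke callbacks in order until one returns MODEL_NOTIFY_STOP. */
void model_chain_call(struct model_notifier *head, const void *data)
{
	for (; head; head = head->next)
		if (head->call(head, data) == MODEL_NOTIFY_STOP)
			break;
}

/* Stands in for a real panic handler such as kdump. */
static int model_real_handler(struct model_notifier *nb, const void *data)
{
	(void)nb;
	(void)data;
	model_real_calls++;
	return MODEL_NOTIFY_DONE;
}

/* Stands in for the test's mock notifier. */
static int model_mock_handler(struct model_notifier *nb, const void *data)
{
	(void)nb;
	(void)data;
	return MODEL_NOTIFY_STOP;
}

/* Returns how many times the "real" handler ran. */
int model_demo(void)
{
	struct model_notifier real = { .call = model_real_handler, .priority = 0 };
	struct model_notifier mock = { .call = model_mock_handler, .priority = INT_MAX };
	struct model_notifier *chain = NULL;

	model_real_calls = 0;
	model_chain_register(&chain, &real);
	model_chain_register(&chain, &mock);
	model_chain_call(chain, "panic violation message");
	return model_real_calls;
}
```

In this model, entries of equal priority keep their registration order, so a notifier already registered at INT_MAX would still run before the mock.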
Signed-off-by: Wen Yang <wen.yang@linux.dev>
---
kernel/trace/rv/Kconfig | 10 +++
kernel/trace/rv/Makefile | 1 +
kernel/trace/rv/reactor_panic_kunit.c | 106 ++++++++++++++++++++++++++
3 files changed, 117 insertions(+)
create mode 100644 kernel/trace/rv/reactor_panic_kunit.c
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index ff47895c897f..6c6c43c5f86c 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -121,3 +121,13 @@ config RV_REACT_PANIC
help
Enables the panic reactor. The panic reactor emits a printk()
message if an exception is found and panic()s the system.
+
+config RV_REACT_PANIC_KUNIT
+ bool "KUnit tests for reactor_panic" if !KUNIT_ALL_TESTS
+ depends on RV_REACT_PANIC && KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ This builds KUnit tests for the panic reactor. These are only
+ for development and testing, not for regular kernel use cases.
+
+ If unsure, say N.
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index ef0a2dcb927c..2ebfe5e5068c 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -25,3 +25,4 @@ obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
obj-$(CONFIG_RV_REACT_PRINTK_KUNIT) += reactor_printk_kunit.o
obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
+obj-$(CONFIG_RV_REACT_PANIC_KUNIT) += reactor_panic_kunit.o
diff --git a/kernel/trace/rv/reactor_panic_kunit.c b/kernel/trace/rv/reactor_panic_kunit.c
new file mode 100644
index 000000000000..f9a09ae7aaad
--- /dev/null
+++ b/kernel/trace/rv/reactor_panic_kunit.c
@@ -0,0 +1,106 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for reactor_panic
+ *
+ */
+
+#include <kunit/test.h>
+#include <linux/rv.h>
+#include <linux/panic_notifier.h>
+#include <linux/notifier.h>
+#include <linux/limits.h>
+#include <linux/sched/clock.h>
+#include <linux/processor.h>
+
+/* Simulated execution time for mock panic notifier (nanoseconds). */
+#define RV_PANIC_NOTIFIER_EXEC_NS 2000000ULL
+
+/* Test state */
+static struct {
+ bool notifier_called;
+} panic_test_state;
+
+/*
+ * Mock panic notifier callback.
+ *
+ * Runs at INT_MAX priority and returns NOTIFY_STOP to prevent real panic
+ * handlers (kdump, watchdog) from executing during the test. Busy-waits
+ * RV_PANIC_NOTIFIER_EXEC_NS to simulate a real handler's execution time.
+ */
+static int mock_panic_notifier_fn(struct notifier_block *nb,
+ unsigned long action, void *data)
+{
+ char *msg = data;
+ u64 start = sched_clock();
+
+ panic_test_state.notifier_called = true;
+ pr_emerg("KUnit: reactor_panic test intercepted panic notifier: %s\n",
+ msg ? msg : "(no message)");
+
+ while (sched_clock() - start < RV_PANIC_NOTIFIER_EXEC_NS)
+ cpu_relax();
+
+ return NOTIFY_STOP;
+}
+
+static struct notifier_block mock_panic_nb = {
+ .notifier_call = mock_panic_notifier_fn,
+ .priority = INT_MAX,
+};
+
+static struct rv_reactor mock_panic_reactor = {
+ .name = "test_panic",
+ .description = "test panic reactor",
+};
+
+static int reactor_panic_kunit_init(struct kunit *test)
+{
+ panic_test_state.notifier_called = false;
+ return 0;
+}
+
+/* Test 1: register and unregister reactor */
+static void test_panic_register_unregister(struct kunit *test)
+{
+ int ret;
+
+ ret = rv_register_reactor(&mock_panic_reactor);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+ KUNIT_EXPECT_STREQ(test, mock_panic_reactor.name, "test_panic");
+
+ rv_unregister_reactor(&mock_panic_reactor);
+}
+
+/*
+ * Test 2: panic notifier chain is reachable.
+ *
+ * vpanic() calls atomic_notifier_call_chain(&panic_notifier_list, ...).
+ * Drive the chain directly to verify panic notifiers receive the notification —
+ * the observable side-effect of reactor_panic without halting the system.
+ */
+static void test_panic_notifier_called(struct kunit *test)
+{
+ atomic_notifier_chain_register(&panic_notifier_list, &mock_panic_nb);
+ atomic_notifier_call_chain(&panic_notifier_list, 0,
+ "panic violation message");
+ atomic_notifier_chain_unregister(&panic_notifier_list, &mock_panic_nb);
+
+ KUNIT_EXPECT_TRUE(test, panic_test_state.notifier_called);
+}
+
+static struct kunit_case reactor_panic_kunit_cases[] = {
+ KUNIT_CASE(test_panic_register_unregister),
+ KUNIT_CASE(test_panic_notifier_called),
+ {}
+};
+
+static struct kunit_suite reactor_panic_kunit_suite = {
+ .name = "rv_reactor_panic",
+ .init = reactor_panic_kunit_init,
+ .test_cases = reactor_panic_kunit_cases,
+};
+
+kunit_test_suite(reactor_panic_kunit_suite);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("KUnit tests for reactor_panic");
--
2.25.1
* Re: [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic
2026-06-15 16:44 ` [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic wen.yang
@ 2026-06-20 23:30 ` XIAO WU
2026-06-21 3:34 ` Wen Yang
0 siblings, 1 reply; 14+ messages in thread
From: XIAO WU @ 2026-06-20 23:30 UTC (permalink / raw)
To: wen.yang, Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
Hi Wen,
I came across a Sashiko AI code review [1] that flagged a potential NULL
pointer dereference in the `test_panic_register_unregister()` test case
added by this patch (commit 8655782285e2). The review's analysis seemed
plausible, so I spun up a QEMU environment to see whether it could be
reproduced in practice.
The short version: yes, it triggers a real kernel BUG + Oops. See below
for the crash log and the reproduction approach.
On Tue, 16 Jun 2026 at 00:44, Wen Yang wrote:
> Add KUnit tests for the panic reactor covering:
> - Reactor registration and unregistration lifecycle
> - Panic notifier chain reachability
...
> +static void test_panic_register_unregister(struct kunit *test)
> +{
> + int ret;
> +
> + ret = rv_register_reactor(&mock_panic_reactor);
> + KUNIT_EXPECT_EQ(test, ret, 0);
> + KUNIT_EXPECT_STREQ(test, mock_panic_reactor.name, "test_panic");
> +
> + rv_unregister_reactor(&mock_panic_reactor);
This is the function the review highlighted. The issue is:
- `KUNIT_EXPECT_EQ()` does *not* abort the test on failure.
- If `rv_register_reactor()` fails (e.g. because another reactor
named "test_panic" was already registered), the .list node of the
statically-allocated `mock_panic_reactor` is never added to any
list — it remains zero-initialized (prev = NULL, next = NULL).
- `rv_unregister_reactor()` then unconditionally calls `list_del()`
on this uninitialized list_head, which hits the NULL pointers.
I was able to reproduce this reliably. The trigger condition is
surprisingly simple: if any code path registers a reactor named
"test_panic" before the KUnit suite runs, the test crashes the kernel.
[Reproduction approach]
I rebuilt the kernel with a small late_initcall in rv_reactors.c that
pre-registers "test_panic" (simulating what would happen if, say, a
kernel module or another subsystem registered a reactor with the same
name before the KUnit tests execute):
static int __init prereg_test_panic(void)
{
static struct rv_reactor prereg = {
.name = "test_panic",
.description = "pre-registered to simulate name collision",
};
return rv_register_reactor(&prereg);
}
late_initcall(prereg_test_panic);
The KUnit tests then auto-run at boot (kunit_run_all_tests). The
test_panic_register_unregister case fails registration with -EINVAL due
to the duplicate name, the KUNIT_EXPECT_EQ does not abort, and
rv_unregister_reactor() crashes on the uninitialized list.
[Crash log — kernel 7.1.0-next-20260615, CONFIG_DEBUG_LIST=y]
Reactor test_panic is already registered
# test_panic_register_unregister: EXPECTATION FAILED at kernel/trace/rv/reactor_panic_kunit.c:68
Expected ret == 0, but
ret == -22 (0xffffffffffffffea)
list_del corruption, ffffffff8ecce2f8->next is NULL
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:52!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 5028 Comm: kunit_try_catch Tainted: G N
RIP: 0010:__list_del_entry_valid_or_report+0xf2/0x200
Call Trace:
<TASK>
rv_unregister_reactor+0x37/0x190
test_panic_register_unregister+0x1de/0x2e0
kunit_try_run_case+0x1d2/0x520
kunit_generic_run_threadfn_adapter+0x89/0x100
kthread+0x387/0x4a0
ret_from_fork+0xb2c/0xdd0
</TASK>
Kernel panic - not syncing: Fatal exception
The crash is in `rv_unregister_reactor()`, called from
`test_panic_register_unregister()`. The `list_del()` in
`rv_unregister_reactor()` has no guard against a list node that was
never added to any list. With CONFIG_DEBUG_LIST=y the corruption is
caught explicitly; without it this would be a silent NULL dereference.
[Suggested fix]
The most straightforward fix is to use `KUNIT_ASSERT_EQ()` instead of
`KUNIT_EXPECT_EQ()` for the registration result, so the test aborts
before reaching `rv_unregister_reactor()` on a failed registration:
static void test_panic_register_unregister(struct kunit *test)
{
int ret;
ret = rv_register_reactor(&mock_panic_reactor);
- KUNIT_EXPECT_EQ(test, ret, 0);
+ KUNIT_ASSERT_EQ(test, ret, 0);
KUNIT_EXPECT_STREQ(test, mock_panic_reactor.name, "test_panic");
rv_unregister_reactor(&mock_panic_reactor);
}
An alternative (or complementary) approach would be to add a guard in
`rv_unregister_reactor()` itself — e.g. checking whether the reactor is
actually on the list before calling `list_del()`. That would make the
API more robust against future callers making the same mistake.
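
The guard idea can be sketched in userspace as follows. This is a hypothetical model, not the code in rv_reactors.c: it uses a minimal doubly linked list, omits locking, and invents a return value for the unregister path. All `model_` names are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct model_list_head {
	struct model_list_head *next, *prev;
};

struct model_reactor {
	const char *name;
	struct model_list_head list;	/* zero-initialized until registered */
};

/* Empty circular list: the head points to itself. */
static struct model_list_head model_reactors = { &model_reactors, &model_reactors };

#define model_entry(ptr) \
	((struct model_reactor *)((char *)(ptr) - offsetof(struct model_reactor, list)))

/* Walk the list to check whether r is actually linked into it. */
bool model_is_registered(const struct model_reactor *r)
{
	const struct model_list_head *pos;

	for (pos = model_reactors.next; pos != &model_reactors; pos = pos->next)
		if (pos == &r->list)
			return true;
	return false;
}

/* Reject duplicate names, otherwise append to the tail of the list. */
int model_register(struct model_reactor *r)
{
	struct model_list_head *pos;

	for (pos = model_reactors.next; pos != &model_reactors; pos = pos->next)
		if (strcmp(model_entry(pos)->name, r->name) == 0)
			return -EINVAL;

	r->list.prev = model_reactors.prev;
	r->list.next = &model_reactors;
	model_reactors.prev->next = &r->list;
	model_reactors.prev = &r->list;
	return 0;
}

/* Guarded removal: never touch a node that was not added to the list. */
int model_unregister(struct model_reactor *r)
{
	if (!model_is_registered(r))
		return -EINVAL;

	r->list.prev->next = r->list.next;
	r->list.next->prev = r->list.prev;
	r->list.next = r->list.prev = NULL;
	return 0;
}

/* Replays the name collision; returns 0 if every step behaves as expected. */
int model_demo_collision(void)
{
	static struct model_reactor first = { .name = "test_panic" };
	static struct model_reactor second = { .name = "test_panic" };

	if (model_register(&first) != 0)
		return 1;
	if (model_register(&second) != -EINVAL)
		return 2;
	if (model_unregister(&second) != -EINVAL)	/* no crash here */
		return 3;
	if (model_unregister(&first) != 0)
		return 4;
	return 0;
}
```

The membership walk costs O(n) per unregister, which seems acceptable for a short reactor list. The test-side change to `KUNIT_ASSERT_EQ()` is still needed, so that a failed registration is not followed by further steps.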
The same pattern likely applies to the printk reactor tests in patch
2/3, though I haven't tested those.
Full PoC code follows.
[PoC part 1 — Kernel-space: late_initcall to create the name collision]
This is what was added to kernel/trace/rv/rv_reactors.c (or could be
built as a standalone kernel module — see preregister.c below). It
pre-registers "test_panic" before KUnit auto-runs, so the test's own
rv_register_reactor() fails with -EINVAL:
static int __init prereg_test_panic(void)
{
static struct rv_reactor prereg = {
.name = "test_panic",
.description = "pre-registered to simulate name collision",
};
return rv_register_reactor(&prereg);
}
late_initcall(prereg_test_panic);
[PoC part 2 — Userspace: trigger the KUnit test via debugfs]
poc.c:
---8<----------------------------------------------------------------
/*
* POC: NULL pointer dereference in rv_unregister_reactor()
*
* Bug location: kernel/trace/rv/reactor_panic_kunit.c
* test_panic_register_unregister()
*
* Bug: When rv_register_reactor() fails (because "test_panic" is already
* registered), the test calls rv_unregister_reactor() unconditionally.
* This performs list_del() on a zero-initialized list_head (never added
* to any list), causing a NULL pointer dereference crash.
*
* Trigger: With a kernel that has pre-registered "test_panic" reactor,
* simply trigger the KUnit test via debugfs "run" file. The test's
* rv_register_reactor() fails with -EINVAL (duplicate), and the subsequent
* rv_unregister_reactor() crashes on the uninitialized list.
*/
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#define KUNIT_RUN_PATH "/sys/kernel/debug/kunit/rv_reactor_panic/run"
int main(int argc, char **argv)
{
int fd, ret;
setbuf(stdout, NULL);
printf("[+] POC: Triggering NULL deref in rv_unregister_reactor\n");
printf("[+] Target: %s\n\n", KUNIT_RUN_PATH);
/* Mount debugfs if needed */
if (access("/sys/kernel/debug", F_OK) != 0) {
printf("[*] Mounting debugfs...\n");
ret = system("mount -t debugfs none /sys/kernel/debug/ 2>/dev/null");
(void)ret;
}
/* Verify KUnit path exists */
if (access(KUNIT_RUN_PATH, W_OK) != 0) {
printf("[-] Cannot access %s: %m\n", KUNIT_RUN_PATH);
printf("[*] Available KUnit suites:\n");
fflush(stdout);
system("ls -la /sys/kernel/debug/kunit/ 2>&1");
return 1;
}
printf("[*] Test_panic reactor should be pre-registered at boot\n");
printf("[*] Triggering KUnit test suite...\n\n");
/*
* Write to the KUnit run file. This executes
* __kunit_test_suites_init() -> kunit_run_tests() which
* runs the reactor_panic_kunit test cases including
* test_panic_register_unregister.
*
* With "test_panic" pre-registered:
* 1. rv_register_reactor() returns -EINVAL (duplicate)
* 2. KUNIT_EXPECT_EQ doesn't abort
* 3. rv_unregister_reactor() calls list_del() on NULL list
* 4. BOOM: list corruption / NULL deref / kernel crash
*/
fd = open(KUNIT_RUN_PATH, O_WRONLY);
if (fd < 0) {
printf("[-] open failed: %m\n");
return 1;
}
printf("[!] Writing to %s - triggering the crash now...\n",
KUNIT_RUN_PATH);
fflush(stdout);
ret = write(fd, "1", 1);
if (ret < 0) {
printf("[-] write failed: %m\n");
} else {
printf("[+] Write succeeded (ret=%d)\n", ret);
}
close(fd);
/*
* If we reach here without crashing, let the user know
*/
printf("\n[*] If the system is still alive, check dmesg:\n");
printf(" dmesg | grep -i -E 'list_del|list_add|list_corrupt|NULL|BUG|oops'\n");
printf("\n[*] dmesg output:\n");
fflush(stdout);
system("dmesg | tail -60");
printf("\n[+] POC completed.\n");
return 0;
}
---8<----------------------------------------------------------------
Compile with:
gcc -o poc poc.c -static
[PoC part 3 — Kernel module alternative (standalone)]
If you prefer not to modify rv_reactors.c directly, the same name
collision can be created by loading this module before running the
KUnit test. Note: this requires rv_register_reactor() to be exported
(or resolved via kallsyms), which it may not be in the current tree.
In that case the late_initcall approach above is the way to go.
preregister.c:
---8<----------------------------------------------------------------
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/rv.h>
static struct rv_reactor prereg_reactor = {
.name = "test_panic",
.description = "pre-registered to trigger KUnit bug",
};
static int __init prereg_init(void)
{
int ret;
ret = rv_register_reactor(&prereg_reactor);
if (ret < 0) {
pr_err("preregister: rv_register_reactor failed: %d\n", ret);
return ret;
}
pr_info("preregister: registered 'test_panic' reactor\n");
return 0;
}
static void __exit prereg_exit(void)
{
rv_unregister_reactor(&prereg_reactor);
pr_info("preregister: unregistered 'test_panic' reactor\n");
}
module_init(prereg_init);
module_exit(prereg_exit);
MODULE_LICENSE("GPL");
---8<----------------------------------------------------------------
[1] https://sashiko.dev/#/patchset/cover.1781541556.git.wen.yang%40linux.dev
(Sashiko AI code review — "Null Pointer Dereference", Severity: High)
Thanks,
XIAO
* Re: [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic
2026-06-20 23:30 ` XIAO WU
@ 2026-06-21 3:34 ` Wen Yang
0 siblings, 0 replies; 14+ messages in thread
From: Wen Yang @ 2026-06-21 3:34 UTC (permalink / raw)
To: XIAO WU, Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
On 6/21/26 07:30, XIAO WU wrote:
> Hi Wen,
>
> I came across a Sashiko AI code review [1] that flagged a potential NULL
> pointer dereference in the `test_panic_register_unregister()` test case
> added by this patch (commit 8655782285e2). The review's analysis seemed
> plausible, so I spun up a QEMU environment to see whether it could be
> reproduced in practice.
>
> The short version: yes, it triggers a real kernel BUG + Oops. See below
> for the crash log and the reproduction approach.
>
> On Tue, 16 Jun 2026 at 00:44, Wen Yang wrote:
> > Add KUnit tests for the panic reactor covering:
> > - Reactor registration and unregistration lifecycle
> > - Panic notifier chain reachability
> ...
> > +static void test_panic_register_unregister(struct kunit *test)
> > +{
> > + int ret;
> > +
> > + ret = rv_register_reactor(&mock_panic_reactor);
> > + KUNIT_EXPECT_EQ(test, ret, 0);
> > + KUNIT_EXPECT_STREQ(test, mock_panic_reactor.name, "test_panic");
> > +
> > + rv_unregister_reactor(&mock_panic_reactor);
>
> This is the function the review highlighted. The issue is:
>
> - `KUNIT_EXPECT_EQ()` does *not* abort the test on failure.
> - If `rv_register_reactor()` fails (e.g. because another reactor
> named "test_panic" was already registered), the .list node of the
> statically-allocated `mock_panic_reactor` is never added to any
> list — it remains zero-initialized (prev = NULL, next = NULL).
> - `rv_unregister_reactor()` then unconditionally calls `list_del()`
> on this uninitialized list_head, which hits the NULL pointers.
>
> I was able to reproduce this reliably. The trigger condition is
> surprisingly simple: if any code path registers a reactor named
> "test_panic" before the KUnit suite runs, the test crashes the kernel.
>
> [Reproduction approach]
>
> I rebuilt the kernel with a small late_initcall in rv_reactors.c that
> pre-registers "test_panic" (simulating what would happen if, say, a
> kernel module or another subsystem registered a reactor with the same
> name before the KUnit tests execute):
>
> static int __init prereg_test_panic(void)
> {
> static struct rv_reactor prereg = {
> .name = "test_panic",
> .description = "pre-registered to simulate name collision",
> };
> return rv_register_reactor(&prereg);
> }
> late_initcall(prereg_test_panic);
>
> The KUnit tests then auto-run at boot (kunit_run_all_tests). The
> test_panic_register_unregister case fails registration with -EINVAL due
> to the duplicate name, the KUNIT_EXPECT_EQ does not abort, and
> rv_unregister_reactor() crashes on the uninitialized list.
>
> [Crash log — kernel 7.1.0-next-20260615, CONFIG_DEBUG_LIST=y]
>
> Reactor test_panic is already registered
> # test_panic_register_unregister: EXPECTATION FAILED at kernel/trace/rv/reactor_panic_kunit.c:68
> Expected ret == 0, but
> ret == -22 (0xffffffffffffffea)
> list_del corruption, ffffffff8ecce2f8->next is NULL
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:52!
> Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> CPU: 1 UID: 0 PID: 5028 Comm: kunit_try_catch Tainted: G N
> RIP: 0010:__list_del_entry_valid_or_report+0xf2/0x200
> Call Trace:
> <TASK>
> rv_unregister_reactor+0x37/0x190
> test_panic_register_unregister+0x1de/0x2e0
> kunit_try_run_case+0x1d2/0x520
> kunit_generic_run_threadfn_adapter+0x89/0x100
> kthread+0x387/0x4a0
> ret_from_fork+0xb2c/0xdd0
> </TASK>
> Kernel panic - not syncing: Fatal exception
>
> The crash is in `rv_unregister_reactor()`, called from
> `test_panic_register_unregister()`. The `list_del()` in
> `rv_unregister_reactor()` has no guard against a list node that was
> never added to any list. With CONFIG_DEBUG_LIST=y the corruption is
> caught explicitly; without it this would be a silent NULL dereference.
>
> [Suggested fix]
>
> The most straightforward fix is to use `KUNIT_ASSERT_EQ()` instead of
> `KUNIT_EXPECT_EQ()` for the registration result, so the test aborts
> before reaching `rv_unregister_reactor()` on a failed registration:
>
> static void test_panic_register_unregister(struct kunit *test)
> {
> int ret;
>
> ret = rv_register_reactor(&mock_panic_reactor);
> - KUNIT_EXPECT_EQ(test, ret, 0);
> + KUNIT_ASSERT_EQ(test, ret, 0);
> KUNIT_EXPECT_STREQ(test, mock_panic_reactor.name, "test_panic");
>
> rv_unregister_reactor(&mock_panic_reactor);
> }
>
> An alternative (or complementary) approach would be to add a guard in
> `rv_unregister_reactor()` itself — e.g. checking whether the reactor is
> actually on the list before calling `list_del()`. That would make the
> API more robust against future callers making the same mistake.
>
> The same pattern likely applies to the printk reactor tests in patch
> 2/3, though I haven't tested those.
>
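The guard mentioned in the quoted review (checking that the reactor is
actually on the list before calling list_del()) could look roughly like the
userspace model below. This is only a sketch under stated assumptions:
struct list_head_model, struct reactor_model, model_register_reactor() and
model_unregister_reactor() are hypothetical stand-ins, not the kernel's
rv_reactor API, and walking the list to test membership is just one possible
check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_head_model {
	struct list_head_model *next, *prev;
};

/* Hypothetical reduced reactor: only the fields this sketch needs. */
struct reactor_model {
	const char *name;
	struct list_head_model list;	/* zero-initialized until registered */
};

/* Empty circular list: head points at itself. */
struct list_head_model reactor_list = { &reactor_list, &reactor_list };

/* True only if @entry is currently linked into @head. */
bool model_list_contains(const struct list_head_model *head,
			 const struct list_head_model *entry)
{
	const struct list_head_model *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		if (pos == entry)
			return true;
	return false;
}

/* Reject duplicate names; on failure the node is never linked. */
int model_register_reactor(struct reactor_model *r)
{
	struct list_head_model *pos;

	assert(r != NULL);
	for (pos = reactor_list.next; pos != &reactor_list; pos = pos->next) {
		struct reactor_model *cur = (struct reactor_model *)
			((char *)pos - offsetof(struct reactor_model, list));

		if (strcmp(cur->name, r->name) == 0)
			return -EINVAL;
	}
	/* Append at the tail of the circular list. */
	r->list.prev = reactor_list.prev;
	r->list.next = &reactor_list;
	reactor_list.prev->next = &r->list;
	reactor_list.prev = &r->list;
	return 0;
}

/* Guarded unregister: refuse to unlink a node that was never added. */
int model_unregister_reactor(struct reactor_model *r)
{
	assert(r != NULL);
	if (!model_list_contains(&reactor_list, &r->list))
		return -EINVAL;
	r->list.prev->next = r->list.next;
	r->list.next->prev = r->list.prev;
	r->list.next = r->list.prev = NULL;
	return 0;
}
```

With this shape, a second reactor named "test_panic" fails to register with
-EINVAL, and unregistering it afterwards returns an error instead of touching
its zero-initialized list node.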
Okay, thank you.
We've noted that this is related to a KUnit test. We'll incorporate the
improvement in v2.
Thanks again.
--
Wen
> Full PoC code follows.
>
> [PoC part 1 — Kernel-space: late_initcall to create the name collision]
>
> This is what was added to kernel/trace/rv/rv_reactors.c (or could be
> built as a standalone kernel module — see preregister.c below). It
> pre-registers "test_panic" before KUnit auto-runs, so the test's own
> rv_register_reactor() fails with -EINVAL:
>
> static int __init prereg_test_panic(void)
> {
> static struct rv_reactor prereg = {
> .name = "test_panic",
> .description = "pre-registered to simulate name collision",
> };
> return rv_register_reactor(&prereg);
> }
> late_initcall(prereg_test_panic);
>
> [PoC part 2 — Userspace: trigger the KUnit test via debugfs]
>
> poc.c:
> ---8<----------------------------------------------------------------
> /*
> * POC: NULL pointer dereference in rv_unregister_reactor()
> *
> * Bug location: kernel/trace/rv/reactor_panic_kunit.c
> * test_panic_register_unregister()
> *
> * Bug: When rv_register_reactor() fails (because "test_panic" is already
> * registered), the test calls rv_unregister_reactor() unconditionally.
> * This performs list_del() on a zero-initialized list_head (never added
> * to any list), causing a NULL pointer dereference crash.
> *
> * Trigger: With a kernel that has pre-registered "test_panic" reactor,
> * simply trigger the KUnit test via debugfs "run" file. The test's
> * rv_register_reactor() fails with -EINVAL (duplicate), and the
> * subsequent rv_unregister_reactor() crashes on the uninitialized list.
> */
>
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <errno.h>
>
> #define KUNIT_RUN_PATH "/sys/kernel/debug/kunit/rv_reactor_panic/run"
>
> int main(int argc, char **argv)
> {
> int fd, ret;
>
> setbuf(stdout, NULL);
>
> printf("[+] POC: Triggering NULL deref in rv_unregister_reactor\n");
> printf("[+] Target: %s\n\n", KUNIT_RUN_PATH);
>
> /* Mount debugfs if needed */
> if (access("/sys/kernel/debug", F_OK) != 0) {
> printf("[*] Mounting debugfs...\n");
> ret = system("mount -t debugfs none /sys/kernel/debug/ 2>/dev/null");
> (void)ret;
> }
>
> /* Verify KUnit path exists */
> if (access(KUNIT_RUN_PATH, W_OK) != 0) {
> printf("[-] Cannot access %s: %m\n", KUNIT_RUN_PATH);
> printf("[*] Available KUnit suites:\n");
> fflush(stdout);
> system("ls -la /sys/kernel/debug/kunit/ 2>&1");
> return 1;
> }
>
> printf("[*] Test_panic reactor should be pre-registered at boot\n");
> printf("[*] Triggering KUnit test suite...\n\n");
>
> /*
> * Write to the KUnit run file. This executes
> * __kunit_test_suites_init() -> kunit_run_tests() which
> * runs the reactor_panic_kunit test cases including
> * test_panic_register_unregister.
> *
> * With "test_panic" pre-registered:
> * 1. rv_register_reactor() returns -EINVAL (duplicate)
> * 2. KUNIT_EXPECT_EQ doesn't abort
> * 3. rv_unregister_reactor() calls list_del() on NULL list
> * 4. BOOM: list corruption / NULL deref / kernel crash
> */
> fd = open(KUNIT_RUN_PATH, O_WRONLY);
> if (fd < 0) {
> printf("[-] open failed: %m\n");
> return 1;
> }
>
> printf("[!] Writing to %s - triggering the crash now...\n",
> KUNIT_RUN_PATH);
> fflush(stdout);
>
> ret = write(fd, "1", 1);
> if (ret < 0) {
> printf("[-] write failed: %m\n");
> } else {
> printf("[+] Write succeeded (ret=%d)\n", ret);
> }
>
> close(fd);
>
> /*
> * If we reach here without crashing, let the user know
> */
> printf("\n[*] If the system is still alive, check dmesg:\n");
> printf(" dmesg | grep -i -E 'list_del|list_add|list_corrupt|NULL|BUG|oops'\n");
> printf("\n[*] dmesg output:\n");
> fflush(stdout);
> system("dmesg | tail -60");
>
> printf("\n[+] POC completed.\n");
>
> return 0;
> }
> ---8<----------------------------------------------------------------
>
> Compile with:
> gcc -o poc poc.c -static
>
> [PoC part 3 — Kernel module alternative (standalone)]
>
> If you prefer not to modify rv_reactors.c directly, the same name
> collision can be created by loading this module before running the
> KUnit test. Note: this requires rv_register_reactor() to be exported
> (or resolved via kallsyms), which it may not be in the current tree.
> In that case the late_initcall approach above is the way to go.
>
> preregister.c:
> ---8<----------------------------------------------------------------
> #include <linux/module.h>
> #include <linux/kernel.h>
> #include <linux/rv.h>
>
> static struct rv_reactor prereg_reactor = {
> .name = "test_panic",
> .description = "pre-registered to trigger KUnit bug",
> };
>
> static int __init prereg_init(void)
> {
> int ret;
> ret = rv_register_reactor(&prereg_reactor);
> if (ret < 0) {
> pr_err("preregister: rv_register_reactor failed: %d\n", ret);
> return ret;
> }
> pr_info("preregister: registered 'test_panic' reactor\n");
> return 0;
> }
>
> static void __exit prereg_exit(void)
> {
> rv_unregister_reactor(&prereg_reactor);
> pr_info("preregister: unregistered 'test_panic' reactor\n");
> }
>
> module_init(prereg_init);
> module_exit(prereg_exit);
> MODULE_LICENSE("GPL");
> ---8<----------------------------------------------------------------
>
>
> [1]
> https://sashiko.dev/#/patchset/cover.1781541556.git.wen.yang%40linux.dev
> (Sashiko AI code review — "Null Pointer Dereference", Severity: High)
>
> Thanks,
> XIAO
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-15 16:44 [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests wen.yang
` (2 preceding siblings ...)
2026-06-15 16:44 ` [PATCH 3/3] rv/reactors: add KUnit tests for reactor_panic wen.yang
@ 2026-06-17 15:41 ` Gabriele Monaco
2026-06-17 15:52 ` Nam Cao
2026-06-17 17:11 ` Wen Yang
3 siblings, 2 replies; 14+ messages in thread
From: Gabriele Monaco @ 2026-06-17 15:41 UTC (permalink / raw)
To: wen.yang; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
On Tue, 2026-06-16 at 00:44 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
>
> We occasionally hit a lockdep "Invalid wait context" warning in
> production
> environments when rv_react() callbacks are interrupted.
>
> The bug is intermittent in production. KUnit tests with busy-wait
> callbacks
> can reproduce it by holding the CPU long enough for a timer interrupt
> to fire
> during rv_react(), exposing the lockdep constraint violation:
>
> [ 44.820913] =============================
> [ 44.820923] [ BUG: Invalid wait context ]
> [ 44.821137] 7.1.0-rc7-next-20260612-virtme #6 Tainted:
> G N
> [ 44.821203] -----------------------------
It's nice to have reactors kunit coverage, I need to go through them
more carefully but I like the idea.
Are those tests supposed to trigger this issue though? Under what
configuration?
I reverted the lockdep fix and ran the tests in vng on both x86_64 and
arm64, both with preempt_rt and without, but I see no splat.
Repeating the tests multiple times from debugfs also didn't seem to
help. Both machines were relatively large (128 and 48 CPUs).
The config was the bare vng one with kunit built-in, lockdep and the
reactors tests.
What am I missing?
Thanks,
Gabriele
> [ 44.821211] kunit_try_catch/209 is trying to lock:
> [ 44.821244] ffff8a743ed3e8a0 (&rq->__lock){-...}-{2:2}, at:
> __schedule+0x102/0x13d0
> [ 44.821688] other info that might help us debug this:
> [ 44.821708] context-{5:5}
> [ 44.821730] 1 lock held by kunit_try_catch/209:
> [ 44.821745] #0: ffffffffb6ba62c0 (rv_react_map-wait-type-
> override){+.+.}-{1:1}, at: rv_react+0x9d/0xf0
> [ 44.821803] stack backtrace:
> [ 44.822110] CPU: 10 UID: 0 PID: 209 Comm: kunit_try_catch Tainted:
> G N 7.1.0-rc7-next-20260612-virtme #6
> PREEMPT_{RT,(full)}
> [ 44.822197] Tainted: [N]=TEST
> [ 44.822210] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX,
> arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 44.822328] Call Trace:
> [ 44.822377] <TASK>
> [ 44.822806] dump_stack_lvl+0x78/0xe0
> [ 44.822860] __lock_acquire+0x926/0x1c90
> [ 44.822888] lock_acquire+0xd3/0x310
> [ 44.822901] ? __schedule+0x102/0x13d0
> [ 44.822919] ? rcu_qs+0x2d/0x1a0
> [ 44.822954] _raw_spin_lock_nested+0x36/0x50
> [ 44.822966] ? __schedule+0x102/0x13d0
> [ 44.822979] __schedule+0x102/0x13d0
> [ 44.822993] ? mark_held_locks+0x40/0x70
> [ 44.823009] preempt_schedule_irq+0x37/0x70
> [ 44.823018] irqentry_exit+0x1da/0x8c0
> [ 44.823032] asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [ 44.823093] RIP: 0010:mock_printk_react+0x2a/0x50
> [ 44.823250] Code: f3 0f 1e fa 0f 1f 44 00 00 41 54 49 89 f4 55 48
> 89 fd 53 e8 18 8b db ff 4c 89 e6 48 89 ef 48 89 c3 e8 fa 8e ed ff eb
> 02 f3 90 <e8> 01 8b db ff 48 29 d8 48 3d 3f 4b 4c 00 76 ee 5b 5d 41
> 5c c3 cc
> [ 44.823303] RSP: 0018:ffffd1c3c0733d38 EFLAGS: 00000297
> [ 44.823332] RAX: 00000000000119f3 RBX: 0000000a74e60d1c RCX:
> 000000000000001f
> [ 44.823342] RDX: 0000000000000000 RSI: 000000003348c8a2 RDI:
> ffffffffc1abbfd9
> [ 44.823351] RBP: ffffffffb671b613 R08: 0000000000000002 R09:
> 0000000000000000
> [ 44.823359] R10: 0000000000000001 R11: 0000000000000000 R12:
> ffffd1c3c0733d60
> [ 44.823367] R13: ffffffffb575a5fd R14: ffffd1c3c0017be8 R15:
> ffffd1c3c00179f8
> [ 44.823397] ? rv_react+0x9d/0xf0
> [ 44.823437] ? mock_printk_react+0x2f/0x50
> [ 44.823448] rv_react+0xb4/0xf0
> [ 44.823455] ? rv_react+0x9d/0xf0
> [ 44.823476] test_printk_react_called+0x83/0xb0
> [ 44.823486] ? __pfx_mock_printk_react+0x10/0x10
> [ 44.823502] ? __pfx_mock_printk_react+0x10/0x10
> [ 44.823513] kunit_try_run_case+0x97/0x190
> [ 44.823534] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
> [ 44.823544] kunit_generic_run_threadfn_adapter+0x21/0x40
> [ 44.823551] kthread+0x124/0x160
> [ 44.823562] ? __pfx_kthread+0x10/0x10
> [ 44.823574] ret_from_fork+0x291/0x3b0
> [ 44.823585] ? __pfx_kthread+0x10/0x10
> [ 44.823595] ret_from_fork_asm+0x1a/0x30
> [ 44.823641] </TASK>
>
>
> Patch 1 fixes the lockdep bug by correcting rv_react()'s
> wait_type_inner
> from LD_WAIT_CONFIG (which inherits the outer context) to
> LD_WAIT_SPIN
> (the tightest constraint callbacks must satisfy).
>
> Patch 2 adds KUnit tests for reactor_printk. The busy-wait in the
> mock
> callback reproduces the timer interrupt scenario that exposes the
> bug.
>
> Patch 3 adds KUnit tests for reactor_panic, exercising the panic
> notifier
> chain without halting the system.
>
> Tested with CONFIG_PROVE_LOCKING=y and CONFIG_KUNIT=y.
>
>
> Wen Yang (3):
> rv/reactors: fix lockdep "Invalid wait context" in rv_react()
> rv/reactors: add KUnit tests for reactor_printk
> rv/reactors: add KUnit tests for reactor_panic
>
> kernel/trace/rv/Kconfig | 20 ++++
> kernel/trace/rv/Makefile | 2 +
> kernel/trace/rv/reactor_panic_kunit.c | 106 +++++++++++++++++++++
> kernel/trace/rv/reactor_printk_kunit.c | 123
> +++++++++++++++++++++++++
> kernel/trace/rv/rv_reactors.c | 8 +-
> 5 files changed, 258 insertions(+), 1 deletion(-)
> create mode 100644 kernel/trace/rv/reactor_panic_kunit.c
> create mode 100644 kernel/trace/rv/reactor_printk_kunit.c
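The {outer:inner} pairs in the splat quoted above can be read with a small
model of lockdep's wait-context check. A rough sketch, assuming the numbering
FREE=1, SPIN=2, CONFIG=3, SLEEP=4, MAX=5 (which is consistent with the
context-{5:5}, {2:2} and {1:1} values printed in the log) and assuming the
rule that a new lock is rejected when its outer wait type exceeds the
tightest inner type currently in effect. struct held_model and both
functions are hypothetical names for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed numbering, chosen to match the values printed in the splat. */
enum wait_type_model {
	WT_INV = 0,	/* not checked */
	WT_FREE,	/* 1: wait-free */
	WT_SPIN,	/* 2: raw spinning locks, e.g. rq->__lock {2:2} */
	WT_CONFIG,	/* 3: preemptible on PREEMPT_RT */
	WT_SLEEP,	/* 4: sleeping locks */
	WT_MAX		/* 5: plain task context, context-{5:5} */
};

/* One held lock (or override map) as seen by this simplified model. */
struct held_model {
	enum wait_type_model inner;
	bool is_override;	/* wait-type override annotation */
};

/*
 * Walk the held locks in acquisition order. A normal lock can only
 * tighten the current inner type; an override map replaces it outright.
 */
enum wait_type_model effective_inner(enum wait_type_model context,
				     const struct held_model *held, size_t n)
{
	enum wait_type_model cur = context;
	size_t i;

	assert(held != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (held[i].inner == WT_INV)
			continue;	/* unchecked: imposes no constraint */
		if (held[i].is_override || held[i].inner < cur)
			cur = held[i].inner;
	}
	return cur;
}

/* A lock is acceptable only if its outer type fits inside the context. */
bool wait_context_ok(enum wait_type_model next_outer,
		     enum wait_type_model context,
		     const struct held_model *held, size_t n)
{
	return next_outer <= effective_inner(context, held, n);
}
```

In this model, a held override map with inner type 1 rejects rq->__lock
(outer type 2), which is the combination printed in the splat. With inner
type 2 the same acquisition passes, while a sleeping lock (outer type 4) is
still rejected.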
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-17 15:41 ` [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests Gabriele Monaco
@ 2026-06-17 15:52 ` Nam Cao
2026-06-17 16:14 ` Gabriele Monaco
2026-06-17 17:11 ` Wen Yang
1 sibling, 1 reply; 14+ messages in thread
From: Nam Cao @ 2026-06-17 15:52 UTC (permalink / raw)
To: Gabriele Monaco, wen.yang; +Cc: linux-trace-kernel, linux-kernel
Gabriele Monaco <gmonaco@redhat.com> writes:
> Are those tests supposed to trigger this issue though? Under what
> configuration?
>
> I reverted the lockdep fix and run the tests in vng on both x86_64 and
> arm64, both preempt_rt and not but I see no splat.
> Repeating the tests multiple times from debugfs also didn't seem to
> help. Both machines were relatively large (128 and 48 CPUs).
>
> The config was the bare vng one with kunit built-in, lockdep and the
> reactors tests.
>
> What am I missing?
I haven't tried to reproduce it, but it seems quite rare. From the looks of
it, adding some delay into the reactor function should make the issue
more easily reproducible.
Nam
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-17 15:52 ` Nam Cao
@ 2026-06-17 16:14 ` Gabriele Monaco
0 siblings, 0 replies; 14+ messages in thread
From: Gabriele Monaco @ 2026-06-17 16:14 UTC (permalink / raw)
To: Nam Cao, wen.yang; +Cc: linux-trace-kernel, linux-kernel
On Wed, 2026-06-17 at 17:52 +0200, Nam Cao wrote:
> Gabriele Monaco <gmonaco@redhat.com> writes:
> > Are those tests supposed to trigger this issue though? Under what
> > configuration?
> >
> > I reverted the lockdep fix and run the tests in vng on both x86_64
> > and arm64, both preempt_rt and not but I see no splat.
> > Repeating the tests multiple times from debugfs also didn't seem to
> > help. Both machines were relatively large (128 and 48 CPUs).
> >
> > The config was the bare vng one with kunit built-in, lockdep and
> > the reactors tests.
> >
> > What am I missing?
>
> I haven't tried to reproduce it, but seems quite rare. From the look
> of it, adding some delay into the reactor function should make the
> issue more easily reproducible.
Yeah, the tests should be doing that, but even increasing the delay
didn't help. I should probably try on physical machines to make
interrupts more likely, but at least the tick should be running.
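The delay idea discussed above amounts to spinning inside the callback for
longer than one tick period, so that a timer interrupt is likely to land
while the callback is still running. A userspace analogue of such a
busy-wait, as a sketch only: now_ns() and busy_wait_ns() are hypothetical
helpers, and the 5 ms figure in the note below is an arbitrary example
rather than a value taken from the tests.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;	/* keep the build quiet when NDEBUG is defined */
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin without sleeping until at least @duration_ns has elapsed.
 * Returns the time actually spent in the loop, in nanoseconds.
 */
uint64_t busy_wait_ns(uint64_t duration_ns)
{
	uint64_t start = now_ns();
	uint64_t elapsed;

	do {
		elapsed = now_ns() - start;
	} while (elapsed < duration_ns);

	return elapsed;
}
```

For example, busy_wait_ns(5000000) spins for at least 5 ms, which would span
several tick periods on a kernel built with a 1000 Hz tick. An in-kernel
version would presumably read a kernel clock and call cpu_relax() in the
loop instead.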
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-17 15:41 ` [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests Gabriele Monaco
2026-06-17 15:52 ` Nam Cao
@ 2026-06-17 17:11 ` Wen Yang
2026-06-18 15:35 ` Gabriele Monaco
1 sibling, 1 reply; 14+ messages in thread
From: Wen Yang @ 2026-06-17 17:11 UTC (permalink / raw)
To: Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
On 6/17/26 23:41, Gabriele Monaco wrote:
> On Tue, 2026-06-16 at 00:44 +0800, wen.yang@linux.dev wrote:
>> From: Wen Yang <wen.yang@linux.dev>
>>
>> We occasionally hit a lockdep "Invalid wait context" warning in
>> production
>> environments when rv_react() callbacks are interrupted.
>>
>> The bug is intermittent in production. KUnit tests with busy-wait
>> callbacks
>> can reproduce it by holding the CPU long enough for a timer interrupt
>> to fire
>> during rv_react(), exposing the lockdep constraint violation:
>>
>> [ 44.820913] =============================
>> [ 44.820923] [ BUG: Invalid wait context ]
>> [ 44.821137] 7.1.0-rc7-next-20260612-virtme #6 Tainted:
>> G N
>> [ 44.821203] -----------------------------
>
> It's nice to have reactors kunit coverage, I need to go through them
> more carefully but I like the idea.
>
> Are those tests supposed to trigger this issue though? Under what
> configuration?
>
> I reverted the lockdep fix and run the tests in vng on both x86_64 and
> arm64, both preempt_rt and not but I see no splat.
> Repeating the tests multiple times from debugfs also didn't seem to
> help. Both machines were relatively large (128 and 48 CPUs).
>
> The config was the bare vng one with kunit built-in, lockdep and the
> reactors tests.
>
> What am I missing?
>
Thank you for your feedback.
I am using a WSL dev environment with 12 cores and 16 GB of memory. The
config of the tested kernel is as follows:
$ make savedefconfig
$ cat defconfig
CONFIG_WERROR=y
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RT=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_CGROUPS=y
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_CGROUP_RDMA=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_CPUSETS=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_PERF=y
CONFIG_CGROUP_BPF=y
CONFIG_CGROUP_MISC=y
CONFIG_CGROUP_DEBUG=y
CONFIG_NAMESPACES=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_EXPERT=y
CONFIG_PROFILING=y
CONFIG_KEXEC=y
CONFIG_SMP=y
CONFIG_IOSF_MBI=y
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_NUMA=y
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
# CONFIG_MTRR_SANITIZER is not set
CONFIG_EFI=y
CONFIG_EFI_STUB=y
CONFIG_EFI_MIXED=y
CONFIG_HZ_1000=y
CONFIG_HIBERNATION=y
CONFIG_PM_DEBUG=y
CONFIG_PM_TRACE_RTC=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_BGRT=y
CONFIG_IA32_EMULATION=y
CONFIG_KVM=y
CONFIG_KVM_INTEL=y
CONFIG_KVM_AMD=y
# CONFIG_SCHED_MC is not set
CONFIG_KPROBES=y
CONFIG_JUMP_LABEL=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_BLK_CGROUP_IOLATENCY=y
CONFIG_BLK_CGROUP_IOCOST=y
CONFIG_BLK_CGROUP_IOPRIO=y
CONFIG_BINFMT_MISC=y
# CONFIG_COMPAT_BRK is not set
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_ZONE_DEVICE=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_DIAG is not set
CONFIG_TCP_CONG_ADVANCED=y
# CONFIG_TCP_CONG_BIC is not set
# CONFIG_TCP_CONG_WESTWOOD is not set
# CONFIG_TCP_CONG_HTCP is not set
# CONFIG_IPV6 is not set
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_SCHED=y
CONFIG_NET_CLS_CGROUP=y
CONFIG_NET_EMATCH=y
CONFIG_NET_CLS_ACT=y
CONFIG_DNS_RESOLVER=y
CONFIG_CGROUP_NET_PRIO=y
# CONFIG_WIRELESS is not set
CONFIG_NET_9P=y
CONFIG_NET_9P_VIRTIO=y
CONFIG_PCI=y
CONFIG_PCIEPORTBUS=y
CONFIG_HOTPLUG_PCI=y
CONFIG_PCCARD=y
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_DEBUG_DEVRES=y
CONFIG_CONNECTOR=y
CONFIG_FW_CFG_SYSFS=y
CONFIG_FW_CFG_SYSFS_CMDLINE=y
# CONFIG_EFI_DISABLE_RUNTIME is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_VIRTIO_BLK=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_VIRTIO=y
CONFIG_ATA=y
CONFIG_SATA_AHCI=y
CONFIG_ATA_PIIX=y
CONFIG_PATA_AMD=y
CONFIG_PATA_OLDPIIX=y
CONFIG_PATA_SCH=y
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_BLK_DEV_DM=y
CONFIG_DM_MIRROR=y
CONFIG_DM_ZERO=y
CONFIG_MACINTOSH_DRIVERS=y
CONFIG_MAC_EMUMOUSEBTN=y
CONFIG_NETDEVICES=y
CONFIG_NETCONSOLE=y
CONFIG_VIRTIO_NET=y
# CONFIG_ETHERNET is not set
CONFIG_PHYLIB=y
CONFIG_REALTEK_PHY=y
# CONFIG_WLAN is not set
CONFIG_INPUT_FF_MEMLESS=y
CONFIG_INPUT_EVDEV=y
CONFIG_INPUT_JOYSTICK=y
CONFIG_INPUT_TABLET=y
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_INPUT_MISC=y
# CONFIG_LEGACY_PTYS is not set
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
CONFIG_SERIAL_8250_DETECT_IRQ=y
CONFIG_SERIAL_8250_RSA=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_NONSTANDARD=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_HW_RANDOM=y
# CONFIG_HW_RANDOM_INTEL is not set
# CONFIG_HW_RANDOM_AMD is not set
CONFIG_NVRAM=y
CONFIG_HPET=y
# CONFIG_HPET_MMAP is not set
CONFIG_I2C_I801=y
CONFIG_PTP_1588_CLOCK=y
CONFIG_WATCHDOG=y
CONFIG_I6300ESB_WDT=y
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_INTEL=y
CONFIG_DRM=y
# CONFIG_DRM_FBDEV_EMULATION is not set
CONFIG_DRM_BOCHS=y
CONFIG_DRM_VIRTIO_GPU=y
CONFIG_FB=y
CONFIG_FB_VESA=y
CONFIG_BACKLIGHT_CLASS_DEVICE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_SOUND=y
CONFIG_SND=y
CONFIG_SND_HRTIMER=y
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=y
# CONFIG_SND_DRIVERS is not set
CONFIG_SND_INTEL8X0=y
CONFIG_SND_HDA_HWDEP=y
CONFIG_SND_HDA_INTEL=y
CONFIG_SND_HDA_CODEC_REALTEK=y
# CONFIG_SND_PCMCIA is not set
# CONFIG_SND_X86 is not set
# CONFIG_HID is not set
CONFIG_RTC_CLASS=y
CONFIG_DMADEVICES=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_VIRTIO_INPUT=y
CONFIG_VIRTIO_MMIO=y
CONFIG_EEEPC_LAPTOP=y
CONFIG_ACPI_WMI=y
CONFIG_MAILBOX=y
CONFIG_PCC=y
CONFIG_AMD_IOMMU=y
CONFIG_INTEL_IOMMU=y
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_IRQ_REMAP=y
CONFIG_VIRTIO_IOMMU=y
CONFIG_FS_DAX=y
CONFIG_QUOTA=y
CONFIG_QUOTA_NETLINK_INTERFACE=y
CONFIG_QFMT_V2=y
CONFIG_FUSE_FS=y
CONFIG_VIRTIO_FS=y
CONFIG_OVERLAY_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_PROC_KCORE=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_SQUASHFS=y
CONFIG_SQUASHFS_XZ=y
CONFIG_SQUASHFS_ZSTD=y
CONFIG_9P_FS=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ASCII=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_UTF8=y
CONFIG_KEYS=y
CONFIG_SECURITYFS=y
CONFIG_CRYPTO_AUTHENC=y
CONFIG_CRYPTO_RSA=y
CONFIG_CRYPTO_AES=y
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CCM=y
CONFIG_CRYPTO_GCM=y
CONFIG_CRYPTO_SEQIV=y
CONFIG_CRYPTO_ECHAINIV=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_SHA256=y
CONFIG_ASYMMETRIC_KEY_TYPE=y
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
CONFIG_X509_CERTIFICATE_PARSER=y
CONFIG_PKCS7_MESSAGE_PARSER=y
CONFIG_SYSTEM_TRUSTED_KEYRING=y
CONFIG_PRINTK_TIME=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_WX=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_SCHEDSTATS=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_DEBUG_ATOMIC=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_CSD_LOCK_WAIT_DEBUG=y
CONFIG_CSD_LOCK_WAIT_DEBUG_DEFAULT=y
CONFIG_DEBUG_KOBJECT=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_RV=y
CONFIG_RV_MON_WWNR=y
CONFIG_RV_MON_RTAPP=y
CONFIG_RV_MON_STALL=y
CONFIG_RV_MON_DEADLINE=y
CONFIG_RV_REACT_PRINTK_KUNIT=y
CONFIG_RV_REACT_PANIC_KUNIT=y
CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
CONFIG_EARLY_PRINTK_DBGP=y
CONFIG_DEBUG_BOOT_PARAMS=y
CONFIG_DEBUG_ENTRY=y
CONFIG_KUNIT=y
# CONFIG_KUNIT_DEBUGFS is not set
Then, using vng to build the kernel and run the kselftests (KUnit is already
built in) reproduces the issue:
$ vng --build
$ vng -v --run arch/x86/boot/bzImage --user root -- tools/testing/selftests/verification/verificationtest-ktap
--
Best wishes,
Wen
> Thanks,
> Gabriele
>
>> [ 44.821211] kunit_try_catch/209 is trying to lock:
>> [ 44.821244] ffff8a743ed3e8a0 (&rq->__lock){-...}-{2:2}, at:
>> __schedule+0x102/0x13d0
>> [ 44.821688] other info that might help us debug this:
>> [ 44.821708] context-{5:5}
>> [ 44.821730] 1 lock held by kunit_try_catch/209:
>> [ 44.821745] #0: ffffffffb6ba62c0 (rv_react_map-wait-type-
>> override){+.+.}-{1:1}, at: rv_react+0x9d/0xf0
>> [ 44.821803] stack backtrace:
>> [ 44.822110] CPU: 10 UID: 0 PID: 209 Comm: kunit_try_catch Tainted:
>> G N 7.1.0-rc7-next-20260612-virtme #6
>> PREEMPT_{RT,(full)}
>> [ 44.822197] Tainted: [N]=TEST
>> [ 44.822210] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX,
>> arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
>> [ 44.822328] Call Trace:
>> [ 44.822377] <TASK>
>> [ 44.822806] dump_stack_lvl+0x78/0xe0
>> [ 44.822860] __lock_acquire+0x926/0x1c90
>> [ 44.822888] lock_acquire+0xd3/0x310
>> [ 44.822901] ? __schedule+0x102/0x13d0
>> [ 44.822919] ? rcu_qs+0x2d/0x1a0
>> [ 44.822954] _raw_spin_lock_nested+0x36/0x50
>> [ 44.822966] ? __schedule+0x102/0x13d0
>> [ 44.822979] __schedule+0x102/0x13d0
>> [ 44.822993] ? mark_held_locks+0x40/0x70
>> [ 44.823009] preempt_schedule_irq+0x37/0x70
>> [ 44.823018] irqentry_exit+0x1da/0x8c0
>> [ 44.823032] asm_sysvec_apic_timer_interrupt+0x1a/0x20
>> [ 44.823093] RIP: 0010:mock_printk_react+0x2a/0x50
>> [ 44.823250] Code: f3 0f 1e fa 0f 1f 44 00 00 41 54 49 89 f4 55 48
>> 89 fd 53 e8 18 8b db ff 4c 89 e6 48 89 ef 48 89 c3 e8 fa 8e ed ff eb
>> 02 f3 90 <e8> 01 8b db ff 48 29 d8 48 3d 3f 4b 4c 00 76 ee 5b 5d 41
>> 5c c3 cc
>> [ 44.823303] RSP: 0018:ffffd1c3c0733d38 EFLAGS: 00000297
>> [ 44.823332] RAX: 00000000000119f3 RBX: 0000000a74e60d1c RCX:
>> 000000000000001f
>> [ 44.823342] RDX: 0000000000000000 RSI: 000000003348c8a2 RDI:
>> ffffffffc1abbfd9
>> [ 44.823351] RBP: ffffffffb671b613 R08: 0000000000000002 R09:
>> 0000000000000000
>> [ 44.823359] R10: 0000000000000001 R11: 0000000000000000 R12:
>> ffffd1c3c0733d60
>> [ 44.823367] R13: ffffffffb575a5fd R14: ffffd1c3c0017be8 R15:
>> ffffd1c3c00179f8
>> [ 44.823397] ? rv_react+0x9d/0xf0
>> [ 44.823437] ? mock_printk_react+0x2f/0x50
>> [ 44.823448] rv_react+0xb4/0xf0
>> [ 44.823455] ? rv_react+0x9d/0xf0
>> [ 44.823476] test_printk_react_called+0x83/0xb0
>> [ 44.823486] ? __pfx_mock_printk_react+0x10/0x10
>> [ 44.823502] ? __pfx_mock_printk_react+0x10/0x10
>> [ 44.823513] kunit_try_run_case+0x97/0x190
>> [ 44.823534] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
>> [ 44.823544] kunit_generic_run_threadfn_adapter+0x21/0x40
>> [ 44.823551] kthread+0x124/0x160
>> [ 44.823562] ? __pfx_kthread+0x10/0x10
>> [ 44.823574] ret_from_fork+0x291/0x3b0
>> [ 44.823585] ? __pfx_kthread+0x10/0x10
>> [ 44.823595] ret_from_fork_asm+0x1a/0x30
>> [ 44.823641] </TASK>
>>
>>
>> Patch 1 fixes the lockdep bug by correcting rv_react()'s
>> wait_type_inner
>> from LD_WAIT_CONFIG (which inherits the outer context) to
>> LD_WAIT_SPIN
>> (the tightest constraint callbacks must satisfy).
>>
>> Patch 2 adds KUnit tests for reactor_printk. The busy-wait in the
>> mock
>> callback reproduces the timer interrupt scenario that exposes the
>> bug.
>>
>> Patch 3 adds KUnit tests for reactor_panic, exercising the panic
>> notifier
>> chain without halting the system.
>>
>> Tested with CONFIG_PROVE_LOCKING=y and CONFIG_KUNIT=y.
>>
>>
>> Wen Yang (3):
>> rv/reactors: fix lockdep "Invalid wait context" in rv_react()
>> rv/reactors: add KUnit tests for reactor_printk
>> rv/reactors: add KUnit tests for reactor_panic
>>
>> kernel/trace/rv/Kconfig | 20 ++++
>> kernel/trace/rv/Makefile | 2 +
>> kernel/trace/rv/reactor_panic_kunit.c | 106 +++++++++++++++++++++
>> kernel/trace/rv/reactor_printk_kunit.c | 123
>> +++++++++++++++++++++++++
>> kernel/trace/rv/rv_reactors.c | 8 +-
>> 5 files changed, 258 insertions(+), 1 deletion(-)
>> create mode 100644 kernel/trace/rv/reactor_panic_kunit.c
>> create mode 100644 kernel/trace/rv/reactor_printk_kunit.c
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-17 17:11 ` Wen Yang
@ 2026-06-18 15:35 ` Gabriele Monaco
2026-06-20 9:13 ` Wen Yang
0 siblings, 1 reply; 14+ messages in thread
From: Gabriele Monaco @ 2026-06-18 15:35 UTC (permalink / raw)
To: Wen Yang; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
On Thu, 2026-06-18 at 01:11 +0800, Wen Yang wrote:
> Thank you for your feedback.
> I am using a WSL dev environment with 12 cores and 16GB. The config
> of the tested kernel code is as follows:
Uhm, that's a strange one, I cannot get a machine like that.
The closest I have is a 16-CPU machine where I can limit the resources in vng.
> And then, using vng to build and run kselftests (since kunit is
> already
> built-in) can reproduce this issue:
>
> $ vng --build
>
> $ vng -v --run arch/x86/boot/bzImage --user root --
> tools/testing/selftests/verification/verificationtest-ktap
Well, whenever I pass some argument to vng (instead of just vng -v, which
brings up an interactive shell), I see an unrelated lockdep splat in
timekeeping_init(), but all is clear when the KUnit tests run.
I'm going to try to understand better what's going on; I don't think I can
reproduce it easily.
Thanks,
Gabriele
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
2026-06-18 15:35 ` Gabriele Monaco
@ 2026-06-20 9:13 ` Wen Yang
0 siblings, 0 replies; 14+ messages in thread
From: Wen Yang @ 2026-06-20 9:13 UTC (permalink / raw)
To: Gabriele Monaco; +Cc: Nam Cao, linux-trace-kernel, linux-kernel
On 6/18/26 23:35, Gabriele Monaco wrote:
> On Thu, 2026-06-18 at 01:11 +0800, Wen Yang wrote:
>> Thank you for your feedback.
>> I am using a WSL dev environment with 12 cores and 16GB. The config
>> of the tested kernel code is as follows:
>
> Uhm that's a strange one, I cannot get a machine like that..
> The closest is a 16 CPUs where I can limit the resources in vng.
>
I switched to a server with 32 cores and 126 GB of memory, based on the
following code:
https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
8970865b788e
Then, using the above defconfig, I manually enabled:
CONFIG_RV_REACT_PRINTK_KUNIT=y
CONFIG_RV_REACT_PANIC_KUNIT=y
The issue can still be reproduced, as follows:
...
[ 2.467818] rtc_cmos PNP0B00:00: setting system clock to
2026-06-20T08:53:17 UTC (1781945597)
[ 2.467959] rtc_cmos PNP0B00:00: alarms up to one day, y3k, 242 bytes
nvram, hpet irqs
[ 2.468796] i6300ESB timer 0000:00:03.0: initialized. heartbeat=30
sec (nowayout=0)
[ 2.469135] device-mapper: ioctl: 4.50.0-ioctl (2025-04-28)
initialised: dm-devel@lists.linux.dev
[ 2.471695] NET: Registered PF_PACKET protocol family
[ 2.471724] 9pnet: Installing 9P2000 support
[ 2.473143] Key type dns_resolver registered
[ 2.483567] IPI shorthand broadcast: enabled
[ 2.508735] sched_clock: Marking stable (2449003255,
59304729)->(2516047879, -7739895)
[ 2.510654] registered taskstats version 1
[ 2.511333] Loading compiled-in X.509 certificates
[ 2.535612] Demotion targets for Node 0: null
[ 2.536818] netconsole: network logging started
[ 2.537105] clk: Disabling unused clocks
[ 2.537129] ALSA device list:
[ 2.537132] No soundcards found.
[ 2.537133] KTAP version 1
[ 2.537134] 1..2
[ 2.538768] KTAP version 1
[ 2.538769] # Subtest: rv_reactor_printk
[ 2.538770] # module: reactor_printk_kunit
[ 2.538771] 1..5
[ 2.539198] ok 1 test_printk_register_unregister
[ 2.539278] printk violation message
[ 2.539309]
[ 2.539309] =============================
[ 2.539310] [ BUG: Invalid wait context ]
[ 2.539310] 7.1.0-rc5-virtme #15 Tainted: G N
[ 2.539311] -----------------------------
[ 2.539311] kunit_try_catch/420 is trying to lock:
[ 2.539312] ffff8e347e93e1a0 (&rq->__lock){-...}-{2:2}, at:
__schedule+0xf5/0x1390
[ 2.539317] other info that might help us debug this:
[ 2.539317] context-{5:5}
[ 2.539317] 1 lock held by kunit_try_catch/420:
[ 2.539318] #0: ffffffffb9d8c3e0
(rv_react_map-wait-type-override){+.+.}-{1:1}, at: rv_react+0x56/0xd0
[ 2.539321] stack backtrace:
[ 2.539322] CPU: 22 UID: 0 PID: 420 Comm: kunit_try_catch Tainted: G
N 7.1.0-rc5-virtme #15 PREEMPT_{RT,(full)}
[ 2.539323] Tainted: [N]=TEST
[ 2.539324] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.15.0-1 04/01/2014
[ 2.539324] Call Trace:
[ 2.539325] <TASK>
[ 2.539326] dump_stack_lvl+0x82/0xd0
[ 2.539328] __lock_acquire+0xabc/0x27e0
[ 2.539330] ? desc_read_finalized_seq+0x2e/0x90
[ 2.539333] lock_acquire+0xd5/0x320
[ 2.539334] ? __schedule+0xf5/0x1390
[ 2.539336] _raw_spin_lock_nested+0x39/0x50
[ 2.539338] ? __schedule+0xf5/0x1390
[ 2.539339] __schedule+0xf5/0x1390
[ 2.539341] ? mark_held_locks+0x49/0x80
[ 2.539342] preempt_schedule_irq+0x37/0x70
[ 2.539343] irqentry_exit+0x1c5/0x750
[ 2.539344] ? rcu_is_watching+0x11/0x50
[ 2.539346] ? trace_hardirqs_off_finish+0xac/0xd0
[ 2.539348] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 2.539349] RIP: 0010:mock_printk_react+0x2a/0x50
[ 2.539351] Code: f3 0f 1e fa 0f 1f 44 00 00 41 54 49 89 f4 55 48 89
fd 53 e8 c8 a2 dd ff 4c 89 e6 48 89 ef 48 89 c3 e8 ba 21 ee ff eb 02 f3
90 <e8> b1 a2 dd ff 48 29 d8 48 3d 3f 4b 4c 00 76 ee 5b 5d 41 5c c3 cc
[ 2.539352] RSP: 0000:ffffd26a80ebfd48 EFLAGS: 00000297
[ 2.539353] RAX: 0000000000005a6c RBX: 0000000097d06aed RCX: 000000000000001f
[ 2.539354] RDX: 0000000000000000 RSI: 00000000282920b2 RDI: fffffffff2cc40ea
[ 2.539354] RBP: ffffffffb9aac37e R08: 0000000000000002 R09: 0000000000000000
[ 2.539355] R10: 0000000000000001 R11: 0000000000000000 R12: ffffd26a80ebfd70
[ 2.539355] R13: ffffffffb8a6a366 R14: ffffffffb8e49e70 R15: ffffd26a80013be8
[ 2.539355] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 2.539357] ? rv_react+0x56/0xd0
[ 2.539360] ? mock_printk_react+0x2f/0x50
[ 2.539362] rv_react+0x9c/0xd0
[ 2.539363] ? rv_react+0x56/0xd0
[ 2.539365] test_printk_react_called+0x83/0xb0
[ 2.539367] ? __pfx_mock_printk_react+0x10/0x10
[ 2.539368] ? __pfx_mock_printk_react+0x10/0x10
[ 2.539370] kunit_try_run_case+0x74/0x160
[ 2.539372] ? lockdep_hardirqs_on+0xc1/0x140
[ 2.539373] ? _raw_spin_unlock_irqrestore+0x46/0x80
[ 2.539375] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
[ 2.539375] kunit_generic_run_threadfn_adapter+0x21/0x40
[ 2.539376] kthread+0x126/0x170
[ 2.539378] ? __pfx_kthread+0x10/0x10
[ 2.539378] ret_from_fork+0x22b/0x310
[ 2.539380] ? __pfx_kthread+0x10/0x10
[ 2.539381] ret_from_fork_asm+0x1a/0x30
[ 2.539384] </TASK>
[ 2.544290] kunit_try_catch (420) used greatest stack depth: 12920 bytes left
[ 2.544542] ok 2 test_printk_react_called
[ 2.544768] Reactor test_printk is already registered
[ 2.544819] ok 3 test_printk_double_register
[ 2.544953] ok 4 test_printk_register_cycle
[ 2.545121] ok 5 test_printk_react_not_null
[ 2.545121] # rv_reactor_printk: pass:5 fail:0 skip:0 total:5
[ 2.545122] # Totals: pass:5 fail:0 skip:0 total:5
[ 2.545123] ok 1 rv_reactor_printk
[ 2.545123] KTAP version 1
[ 2.545123] # Subtest: rv_reactor_panic
[ 2.545124] # module: reactor_panic_kunit
[ 2.545124] 1..2
[ 2.545284] ok 1 test_panic_register_unregister
[ 2.545337] KUnit: reactor_panic test intercepted panic notifier: panic violation message
[ 2.547624] ok 2 test_panic_notifier_called
[ 2.547624] # rv_reactor_panic: pass:2 fail:0 skip:0 total:2
[ 2.547625] # Totals: pass:2 fail:0 skip:0 total:2
[ 2.547625] ok 2 rv_reactor_panic
[ 2.612894] ata2: found unknown device (class 0)
[ 2.616059] ata2.00: ATAPI: QEMU DVD-ROM, 2.5+, max UDMA/100
[ 2.621249] kobject: 'devlink' (0000000023b6b59a): kobject_add_internal: parent: 'virtual', set: '(null)'
[ 2.621380] kobject: ':ata2--scsi:1:0:0:0' (0000000079538692): kobject_add_internal: parent: 'devlink', set: 'devices'
[ 2.621467] kobject: ':ata2--scsi:1:0:0:0' (0000000079538692): kobject_uevent_env
[ 2.621473] kobject: ':ata2--scsi:1:0:0:0' (0000000079538692): fill_kobj_path: path = '/devices/virtual/devlink/:ata2--scsi:1:0:0:0'
[ 2.624159] scsi 1:0:0:0: CD-ROM QEMU QEMU DVD-ROM 2.5+ PQ: 0 ANSI: 5
[ 2.637661] kobject: 'target1:0:0' (00000000e7e99ebc): kobject_add_internal: parent: 'host1', set: 'devices'
...
The warning above appears in the middle of the console output rather
than at the end, because KUnit is built in and its tests run before
init starts.
In addition, there is a separate issue at the very end of the log:
WARNING: kernel/exit.c:902 at do_exit+0x9d8/0xc60, CPU#0: virtme-ng-init/1)...
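
Because the splat is buried mid-boot, it can help to pull it out of a
saved console log with a small script. The following is only a minimal
sketch and not part of this series: the function name extract_splats
and the choice of "[ BUG:" and "</TASK>" as start and end markers are
assumptions, picked to match the log above.

```python
import re

# Matches the leading "[    2.539310] " timestamp that printk adds.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s?")


def extract_splats(lines, start_marker="[ BUG:", end_marker="</TASK>"):
    """Return a list of splats; each splat is a list of log lines.

    Hypothetical helper: a splat is assumed to start at the first line
    containing start_marker and to end at the next line containing
    end_marker (both inclusive).
    """
    splats, current = [], None
    for raw in lines:
        # Drop the trailing newline and the printk timestamp, if any.
        line = TIMESTAMP.sub("", raw.rstrip("\n"))
        if current is None and start_marker in line:
            current = []
        if current is not None:
            current.append(line)
            if end_marker in line:
                splats.append(current)
                current = None
    if current is not None:
        # The log ended before the end marker was seen; keep what we have.
        splats.append(current)
    return splats
```

It takes any iterable of lines, for example an open file holding the
captured console output, and leaves the surrounding KTAP results
untouched.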
--
Best wishes,
Wen