Linux Trace Kernel
From: Gabriele Monaco <gmonaco@redhat.com>
To: wen.yang@linux.dev
Cc: Nam Cao <namcao@linutronix.de>,
	linux-trace-kernel@vger.kernel.org,
	 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/3] rv/reactors: fix lockdep warning and add KUnit tests
Date: Wed, 17 Jun 2026 17:41:20 +0200
Message-ID: <2bcfa0bda551c0e1ba137b728dbe7886ff5c2579.camel@redhat.com>
In-Reply-To: <cover.1781541556.git.wen.yang@linux.dev>

On Tue, 2026-06-16 at 00:44 +0800, wen.yang@linux.dev wrote:
> From: Wen Yang <wen.yang@linux.dev>
> 
> We occasionally hit a lockdep "Invalid wait context" warning in production
> environments when rv_react() callbacks are interrupted.
> 
> The bug is intermittent in production. KUnit tests with busy-wait callbacks
> can reproduce it by holding the CPU long enough for a timer interrupt to fire
> during rv_react(), exposing the lockdep constraint violation:
> 
> [   44.820913] =============================
> [   44.820923] [ BUG: Invalid wait context ]
> [   44.821137] 7.1.0-rc7-next-20260612-virtme #6 Tainted: G                 N
> [   44.821203] -----------------------------

It's nice to have KUnit coverage for the reactors. I need to go through
the tests more carefully, but I like the idea.

Are those tests supposed to trigger this issue, though? Under what
configuration?

I reverted the lockdep fix and ran the tests in vng on both x86_64 and
arm64, with and without PREEMPT_RT, but I see no splat.
Repeating the tests multiple times from debugfs didn't help either.
Both machines were relatively large (128 and 48 CPUs).

The config was the bare vng one with KUnit built in, lockdep, and the
reactor tests.

What am I missing?

Thanks,
Gabriele

> [   44.821211] kunit_try_catch/209 is trying to lock:
> [   44.821244] ffff8a743ed3e8a0 (&rq->__lock){-...}-{2:2}, at: __schedule+0x102/0x13d0
> [   44.821688] other info that might help us debug this:
> [   44.821708] context-{5:5}
> [   44.821730] 1 lock held by kunit_try_catch/209:
> [   44.821745]  #0: ffffffffb6ba62c0 (rv_react_map-wait-type-override){+.+.}-{1:1}, at: rv_react+0x9d/0xf0
> [   44.821803] stack backtrace:
> [   44.822110] CPU: 10 UID: 0 PID: 209 Comm: kunit_try_catch Tainted: G                 N  7.1.0-rc7-next-20260612-virtme #6 PREEMPT_{RT,(full)}
> [   44.822197] Tainted: [N]=TEST
> [   44.822210] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   44.822328] Call Trace:
> [   44.822377]  <TASK>
> [   44.822806]  dump_stack_lvl+0x78/0xe0
> [   44.822860]  __lock_acquire+0x926/0x1c90
> [   44.822888]  lock_acquire+0xd3/0x310
> [   44.822901]  ? __schedule+0x102/0x13d0
> [   44.822919]  ? rcu_qs+0x2d/0x1a0
> [   44.822954]  _raw_spin_lock_nested+0x36/0x50
> [   44.822966]  ? __schedule+0x102/0x13d0
> [   44.822979]  __schedule+0x102/0x13d0
> [   44.822993]  ? mark_held_locks+0x40/0x70
> [   44.823009]  preempt_schedule_irq+0x37/0x70
> [   44.823018]  irqentry_exit+0x1da/0x8c0
> [   44.823032]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
> [   44.823093] RIP: 0010:mock_printk_react+0x2a/0x50
> [   44.823250] Code: f3 0f 1e fa 0f 1f 44 00 00 41 54 49 89 f4 55 48 89 fd 53 e8 18 8b db ff 4c 89 e6 48 89 ef 48 89 c3 e8 fa 8e ed ff eb 02 f3 90 <e8> 01 8b db ff 48 29 d8 48 3d 3f 4b 4c 00 76 ee 5b 5d 41 5c c3 cc
> [   44.823303] RSP: 0018:ffffd1c3c0733d38 EFLAGS: 00000297
> [   44.823332] RAX: 00000000000119f3 RBX: 0000000a74e60d1c RCX: 000000000000001f
> [   44.823342] RDX: 0000000000000000 RSI: 000000003348c8a2 RDI: ffffffffc1abbfd9
> [   44.823351] RBP: ffffffffb671b613 R08: 0000000000000002 R09: 0000000000000000
> [   44.823359] R10: 0000000000000001 R11: 0000000000000000 R12: ffffd1c3c0733d60
> [   44.823367] R13: ffffffffb575a5fd R14: ffffd1c3c0017be8 R15: ffffd1c3c00179f8
> [   44.823397]  ? rv_react+0x9d/0xf0
> [   44.823437]  ? mock_printk_react+0x2f/0x50
> [   44.823448]  rv_react+0xb4/0xf0
> [   44.823455]  ? rv_react+0x9d/0xf0
> [   44.823476]  test_printk_react_called+0x83/0xb0
> [   44.823486]  ? __pfx_mock_printk_react+0x10/0x10
> [   44.823502]  ? __pfx_mock_printk_react+0x10/0x10
> [   44.823513]  kunit_try_run_case+0x97/0x190
> [   44.823534]  ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10
> [   44.823544]  kunit_generic_run_threadfn_adapter+0x21/0x40
> [   44.823551]  kthread+0x124/0x160
> [   44.823562]  ? __pfx_kthread+0x10/0x10
> [   44.823574]  ret_from_fork+0x291/0x3b0
> [   44.823585]  ? __pfx_kthread+0x10/0x10
> [   44.823595]  ret_from_fork_asm+0x1a/0x30
> [   44.823641]  </TASK>
> 
> 
> Patch 1 fixes the lockdep bug by correcting rv_react()'s wait_type_inner
> from LD_WAIT_CONFIG (which inherits the outer context) to LD_WAIT_SPIN
> (the tightest constraint callbacks must satisfy).
> 
> Patch 2 adds KUnit tests for reactor_printk. The busy-wait in the mock
> callback reproduces the timer interrupt scenario that exposes the bug.
> 
> Patch 3 adds KUnit tests for reactor_panic, exercising the panic notifier
> chain without halting the system.
> 
> Tested with CONFIG_PROVE_LOCKING=y and CONFIG_KUNIT=y.
> 
> 
> Wen Yang (3):
>   rv/reactors: fix lockdep "Invalid wait context" in rv_react()
>   rv/reactors: add KUnit tests for reactor_printk
>   rv/reactors: add KUnit tests for reactor_panic
> 
>  kernel/trace/rv/Kconfig                |  20 ++++
>  kernel/trace/rv/Makefile               |   2 +
>  kernel/trace/rv/reactor_panic_kunit.c  | 106 +++++++++++++++++++++
>  kernel/trace/rv/reactor_printk_kunit.c | 123 +++++++++++++++++++++++++
>  kernel/trace/rv/rv_reactors.c          |   8 +-
>  5 files changed, 258 insertions(+), 1 deletion(-)
>  create mode 100644 kernel/trace/rv/reactor_panic_kunit.c
>  create mode 100644 kernel/trace/rv/reactor_printk_kunit.c
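
Editorial sketch (not part of the patch series): the splat quoted above
can be read as a comparison of small integers. The snippet below is a
simplified userspace model of that comparison, not the kernel's lockdep
code. It assumes the allowed level starts at the task context, that a
held wait-type override map replaces that level with its own inner
level, and that a new lock is rejected when its outer level is higher
than the allowed one. The struct and function names are hypothetical.
The numbers 5, 1 and 2 come from "context-{5:5}", the held
rv_react_map "{1:1}" and &rq->__lock "{2:2}" in the report; level 2 is
assumed to be the value the cover letter calls LD_WAIT_SPIN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a held lock; not a kernel structure. */
struct held_model {
	int inner;           /* inner wait-type level; 0 means "not set" */
	bool is_override;    /* true for a wait-type override map */
};

/*
 * Return true when a lock whose outer level is next_outer may be taken.
 * The allowed level starts at the task context and is then narrowed
 * (or, for an override map, replaced) by each held lock in order.
 */
static bool wait_context_ok(int context, const struct held_model *held,
			    size_t nheld, int next_outer)
{
	int allowed = context;

	for (size_t i = 0; i < nheld; i++) {
		if (held[i].inner == 0)
			continue;
		if (held[i].is_override || held[i].inner < allowed)
			allowed = held[i].inner;
	}
	return next_outer <= allowed;
}

/* Replay the numbers from the splat: context 5, map level 1, rq lock 2. */
static void replay_splat(void)
{
	const struct held_model map_before = { .inner = 1, .is_override = true };
	const struct held_model map_after  = { .inner = 2, .is_override = true };

	/* Level-2 lock under a level-1 map: rejected, as in the report. */
	assert(!wait_context_ok(5, &map_before, 1, 2));
	/* Same lock under a level-2 map: accepted by this model. */
	assert(wait_context_ok(5, &map_after, 1, 2));
}
```

Under these assumptions, the model only rejects the rq lock when
preemption actually happens while the map is held, which is consistent
with the cover letter's claim that a timer interrupt must fire inside
rv_react() for the warning to appear.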

