From: Thomas Gleixner <tglx@kernel.org>
To: Ihor Solodrai <ihor.solodrai@linux.dev>,
LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Gabriele Monaco <gmonaco@redhat.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Michael Jeanson <mjeanson@efficios.com>,
Jens Axboe <axboe@kernel.dk>,
"Paul E. McKenney" <paulmck@kernel.org>,
"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
Florian Weimer <fweimer@redhat.com>,
Tim Chen <tim.c.chen@intel.com>,
Yury Norov <yury.norov@gmail.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
bpf <bpf@vger.kernel.org>,
sched-ext@lists.linux.dev, Kernel Team <kernel-team@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Puranjay Mohan <puranjay@kernel.org>, Tejun Heo <tj@kernel.org>
Subject: Re: [patch V5 00/20] sched: Rewrite MM CID management
Date: Wed, 28 Jan 2026 12:57:20 +0100
Message-ID: <877bt29cgv.ffs@tglx>
In-Reply-To: <2b7463d7-0f58-4e34-9775-6e2115cfb971@linux.dev>
On Tue, Jan 27 2026 at 16:01, Ihor Solodrai wrote:
> BPF CI caught a deadlock on current bpf-next tip (35538dba51b4).
> Job: https://github.com/kernel-patches/bpf/actions/runs/21417415035/job/61670254640
>
> It appears to be related to this series. Pasting a splat below.
The deadlock splat is completely unrelated as it is a consequence of the
panic which is triggered by the watchdog:
> [ 45.009755] watchdog: CPU2: Watchdog detected hard LOCKUP on cpu 2
...
> [ 46.053170] lock(&nmi_desc[NMI_LOCAL].lock);
> [ 46.053172] <Interrupt>
> [ 46.053173] lock(&nmi_desc[NMI_LOCAL].lock);
...
> Any ideas what might be going on?
Without a full backtrace of all CPUs it's hard to tell because it's
unclear what is holding the runqueue lock of CPU2 long enough to trigger
the hard lockup watchdog.
I'm pretty sure the CID changes are unrelated; the new code just happens
to show up as the messenger which gets stuck on the lock forever.
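One possible way to obtain such an all-CPU backtrace is sketched below. It assumes the CI guest kernel is built with the hard lockup detector on an SMP configuration and that the commands run as root before the test is reproduced; whether the BPF CI image permits this is an assumption, not something stated in the report.

```shell
# Assumption: executed as root inside the test VM before re-running test_progs.

# Ask the hard lockup detector to dump the stacks of all CPUs, not only
# the CPU that was detected as stuck, when a hard lockup is reported.
sysctl -w kernel.hardlockup_all_cpu_backtrace=1

# Verify that the setting took effect (prints 1 when enabled).
cat /proc/sys/kernel/hardlockup_all_cpu_backtrace
```

With this enabled, the log should also show what the other CPUs were doing at the time of the lockup, which may reveal the holder of the CPU2 runqueue lock.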
> [ 46.053209] CPU: 2 UID: 0 PID: 126 Comm: test_progs Tainted: G OE 6.19.0-rc5-g748c6d52700a-dirty #1 PREEMPT(full)
> [ 46.053214] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> [ 46.053215] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 46.053217] Call Trace:
> [ 46.053220] <NMI>
> [ 46.053223] dump_stack_lvl+0x5d/0x80
> [ 46.053227] print_usage_bug.part.0+0x22b/0x2c0
> [ 46.053231] lock_acquire+0x272/0x2b0
> [ 46.053235] ? __register_nmi_handler+0x83/0x350
> [ 46.053240] _raw_spin_lock_irqsave+0x39/0x60
> [ 46.053242] ? __register_nmi_handler+0x83/0x350
> [ 46.053246] __register_nmi_handler+0x83/0x350
> [ 46.053250] native_stop_other_cpus+0x31c/0x460
> [ 46.053255] ? __pfx_native_stop_other_cpus+0x10/0x10
> [ 46.053260] vpanic+0x1c5/0x3f0
vpanic() really should disable lockdep here before taking that lock in
NMI context. The resulting lockdep splat is not really useful.

Thanks.

tglx
Thread overview: 74+ messages
2025-11-19 17:26 [patch V5 00/20] sched: Rewrite MM CID management Thomas Gleixner
2025-11-19 17:26 ` [patch V5 01/20] sched/mmcid: Revert the complex " Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 02/20] sched/mmcid: Use proper data structures Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 03/20] sched/mmcid: Cacheline align MM CID storage Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 04/20] sched: Fixup whitespace damage Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 05/20] sched/mmcid: Move scheduler code out of global header Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 06/20] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed() Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 07/20] cpumask: Introduce cpumask_weighted_or() Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:26 ` [patch V5 08/20] sched/mmcid: Use cpumask_weighted_or() Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 09/20] cpumask: Cache num_possible_cpus() Thomas Gleixner
2025-11-20 11:20 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-21 22:56 ` [patch V5 09/20] " Marek Szyprowski
2025-11-22 15:36 ` Thomas Gleixner
2025-11-22 16:24 ` Marek Szyprowski
2025-11-22 19:09 ` Paul E. McKenney
2025-11-23 19:03 ` [tip: core/rseq] cpu: Initialize __num_possible_cpus correctly tip-bot2 for Thomas Gleixner
2025-11-22 18:47 ` [patch V5 09/20] cpumask: Cache num_possible_cpus() Paul E. McKenney
2025-11-22 19:10 ` Thomas Gleixner
2025-11-22 0:27 ` Nathan Chancellor
2025-11-26 4:36 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 10/20] sched/mmcid: Convert mm CID mask to a bitmap Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 11/20] signal: Move MMCID exit out of sighand lock Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 12/20] sched/mmcid: Move initialization out of line Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 13/20] sched/mmcid: Provide precomputed maximal value Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 14/20] sched/mmcid: Serialize sched_mm_cid_fork()/exit() with a mutex Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 15/20] sched/mmcid: Introduce per task/CPU ownership infrastructure Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 16/20] sched/mmcid: Provide new scheduler CID mechanism Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 17/20] sched/mmcid: Provide CID ownership mode fixup functions Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 18/20] irqwork: Move data struct to a types header Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 19/20] sched/mmcid: Implement deferred mode change Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` tip-bot2 for Thomas Gleixner
2025-11-19 17:27 ` [patch V5 20/20] sched/mmcid: Switch over to the new mechanism Thomas Gleixner
2025-11-20 11:19 ` [tip: core/rseq] " tip-bot2 for Thomas Gleixner
2025-11-22 0:43 ` [patch V5 20/20] " Nathan Chancellor
2025-11-22 15:02 ` Thomas Gleixner
2025-11-22 16:54 ` Shrikanth Hegde
2025-11-23 19:03 ` [tip: core/rseq] sched/mmcid: Ensure that per CPU threshold is > 0 tip-bot2 for Thomas Gleixner
2025-11-26 4:36 ` [tip: core/rseq] sched/mmcid: Switch over to the new mechanism tip-bot2 for Thomas Gleixner
2026-01-28 0:01 ` [patch V5 00/20] sched: Rewrite MM CID management Ihor Solodrai
2026-01-28 8:46 ` Peter Zijlstra
2026-01-28 11:57 ` Thomas Gleixner [this message]
2026-01-28 12:58 ` Shrikanth Hegde
2026-01-28 13:56 ` Thomas Gleixner
2026-01-28 22:24 ` Thomas Gleixner
2026-01-28 22:33 ` Ihor Solodrai
2026-01-28 23:08 ` Ihor Solodrai
2026-01-29 17:06 ` Thomas Gleixner