From: sashiko-bot@kernel.org
To: "Boqun Feng" <boqun@kernel.org>
Cc: bpf@vger.kernel.org, ojeda@kernel.org
Subject: Re: [PATCH v2 01/12] preempt: Track NMI nesting to separate per-CPU counter
Date: Tue, 26 May 2026 16:12:03 +0000 [thread overview]
Message-ID: <20260526161204.6BDFB1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260526152148.30514-2-boqun@kernel.org>
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Race condition in __nmi_exit() causing preempt_count underflow and severe state corruption.
- [High] Non-atomic update of preempt_count in __nmi_enter() clobbers state changes made by nested NMIs.
--
commit c247872c3bac47f83b6c7df80c559c9576868ea1
Author: Joel Fernandes <joelagnelf@nvidia.com>
preempt: Track NMI nesting to separate per-CPU counter
Move NMI nesting tracking from the preempt_count bits to a separate per-CPU
counter (nmi_nesting) to free up bits for other uses. Reduce NMI_BITS from
4 to 1, using it only to detect if we are in an NMI, and cap nesting at 15.
Link: https://patch.msgid.link/20260121223933.1568682-3-lyude@redhat.com
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index d57cab4d4c06f..1a0360a1000f1 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
[ ... ]
> @@ -102,14 +104,17 @@ void irq_exit_rcu(void);
> */
>
> /*
> - * nmi_enter() can nest up to 15 times; see NMI_BITS.
> + * nmi_enter() can nest - nesting is tracked in a per-CPU counter.
> */
> #define __nmi_enter() \
> do { \
> lockdep_off(); \
> arch_nmi_enter(); \
> - BUG_ON(in_nmi() == NMI_MASK); \
> - __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
> + /* Maximum NMI nesting is 15. */ \
> + BUG_ON(__this_cpu_read(nmi_nesting) >= 15); \
> + __this_cpu_inc(nmi_nesting); \
> + __preempt_count_add(HARDIRQ_OFFSET); \
> + preempt_count_set(preempt_count() | NMI_MASK); \
[Severity: High]
Does this non-atomic read-modify-write of preempt_count risk clobbering
state changes?
If a nested non-maskable exception (such as an MCE) interrupts the CPU exactly
after the outer NMI reads preempt_count() but before it executes
preempt_count_set(), the outer NMI retains a stale snapshot.
If the nested exception modifies preempt_count (such as setting the
PREEMPT_NEED_RESCHED flag via set_preempt_need_resched()), those changes
could be unconditionally overwritten and lost when the outer NMI resumes and
blindly writes back its stale snapshot.
> } while (0)
>
> #define nmi_enter() \
> @@ -124,8 +129,12 @@ void irq_exit_rcu(void);
>
> #define __nmi_exit() \
> do { \
> + unsigned int nesting; \
> BUG_ON(!in_nmi()); \
> - __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
> + __preempt_count_sub(HARDIRQ_OFFSET); \
> + nesting = __this_cpu_dec_return(nmi_nesting); \
> + if (!nesting) \
> + __preempt_count_sub(NMI_OFFSET); \
[Severity: Critical]
Can a nested NMI here cause preempt_count underflow?
If a nested non-maskable exception interrupts execution exactly after
nesting = __this_cpu_dec_return(nmi_nesting) but before the outer NMI executes
__preempt_count_sub(NMI_OFFSET), the outer NMI will have decremented
nmi_nesting to 0.
The nested exception would then run nmi_enter(), see nmi_nesting == 0,
increment it to 1, and set NMI_MASK. On exit, the nested exception will
decrement nmi_nesting back to 0 and clear NMI_MASK via
__preempt_count_sub(NMI_OFFSET).
When the nested exception returns, the outer NMI resumes execution with its
local nesting variable as 0, and blindly executes
__preempt_count_sub(NMI_OFFSET). Since the nested exception already cleared
the NMI_MASK bit, this second subtraction could underflow bit 20 (NMI_OFFSET),
borrowing from the upper bits and corrupting flags like PREEMPT_NEED_RESCHED.
> arch_nmi_exit(); \
> lockdep_on(); \
> } while (0)
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260526152148.30514-1-boqun@kernel.org?part=1
next prev parent reply other threads:[~2026-05-26 16:12 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-26 15:21 [PATCH v2 00/12] Refcounted interrupt disable and SpinLockIrq for rust (Part 1) Boqun Feng
2026-05-26 15:21 ` [PATCH v2 01/12] preempt: Track NMI nesting to separate per-CPU counter Boqun Feng
2026-05-26 16:12 ` sashiko-bot [this message]
2026-06-04 12:36 ` Boqun Feng
2026-05-26 15:21 ` [PATCH v2 02/12] preempt: Introduce HARDIRQ_DISABLE_BITS Boqun Feng
2026-05-26 15:21 ` [PATCH v2 03/12] preempt: Introduce __preempt_count_{sub, add}_return() Boqun Feng
2026-05-26 15:21 ` [PATCH v2 04/12] openrisc: Include <linux/cpumask.h> in smp.h Boqun Feng
2026-05-26 15:21 ` [PATCH v2 05/12] irq & spin_lock: Add counted interrupt disabling/enabling Boqun Feng
2026-05-26 16:19 ` bot+bpf-ci
2026-05-26 17:54 ` sashiko-bot
2026-05-28 10:43 ` Peter Zijlstra
2026-05-28 14:31 ` Boqun Feng
2026-05-26 15:21 ` [PATCH v2 06/12] irq: Add KUnit test for refcounted interrupt enable/disable Boqun Feng
2026-05-26 18:18 ` sashiko-bot
2026-05-26 15:21 ` [PATCH v2 07/12] locking: Switch to _irq_{disable,enable}() variants in cleanup guards Boqun Feng
2026-05-28 10:45 ` Peter Zijlstra
2026-05-28 14:31 ` Boqun Feng
2026-05-26 15:21 ` [PATCH v2 08/12] sched: Remove the unused preempt_offset parameter of __cant_sleep() Boqun Feng
2026-05-26 15:21 ` [PATCH v2 09/12] sched: Avoid signed comparison of preempt_count() in __cant_migrate() Boqun Feng
2026-05-26 15:21 ` [PATCH v2 10/12] preempt: Introduce HAS_SEPARATE_PREEMPT_RESCHED_BITS Boqun Feng
2026-05-26 19:57 ` sashiko-bot
2026-06-04 12:40 ` Boqun Feng
2026-05-26 15:21 ` [PATCH v2 11/12] arm64: sched/preempt: Enable HAS_SEPARATE_PREEMPT_RESCHED_BITS Boqun Feng
2026-05-28 10:50 ` Peter Zijlstra
2026-05-26 15:21 ` [PATCH v2 12/12] s390/preempt: " Boqun Feng
2026-05-28 10:53 ` Peter Zijlstra
2026-05-28 14:41 ` Boqun Feng
2026-05-28 15:18 ` Heiko Carstens
2026-05-27 16:18 ` [PATCH v2 00/12] Refcounted interrupt disable and SpinLockIrq for rust (Part 1) Peter Zijlstra
2026-05-27 16:33 ` Boqun Feng
2026-06-03 19:20 ` Boqun Feng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260526161204.6BDFB1F000E9@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=boqun@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=ojeda@kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.