From: Thomas Gleixner <tglx@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>,
Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, ada.coupriediaz@arm.com,
catalin.marinas@arm.com, linux-kernel@vger.kernel.org,
luto@kernel.org, ruanjinjie@huawei.com, vladimir.murzin@arm.com,
will@kernel.org
Subject: Re: [PATCH 1/2] arm64/entry: Fix involuntary preemption exception masking
Date: Fri, 20 Mar 2026 15:11:20 +0100
Message-ID: <87h5qak2uv.ffs@tglx>
In-Reply-To: <20260320130433.GV3738786@noisy.programming.kicks-ass.net>

On Fri, Mar 20 2026 at 14:04, Peter Zijlstra wrote:
> On Fri, Mar 20, 2026 at 11:30:25AM +0000, Mark Rutland wrote:
>> Thomas, Peter, I have a couple of things I'd like to check:
>>
>> (1) The generic irq entry code will preempt from any exception (e.g. a
>> synchronous fault) where interrupts were unmasked in the original
>> context. Is that intentional/necessary, or was that just the way the
>> x86 code happened to be implemented?
>>
>> I assume that it'd be fine if arm64 only preempted from true
>> interrupts, but if that was intentional/necessary I can go rework
>> this.
>
> So NMI-from-kernel must not trigger resched IIRC. There is some code
> that relies on this somewhere. And on x86 many of those synchronous
> exceptions are marked as NMI, since they can happen with IRQs disabled
> inside locks etc.
>
> But for the rest I don't think we care particularly. Notably page-fault
> will already schedule itself when possible (faults leading to IO and
> blocking).

Right. In general we allow preemption on any interrupt, trap and exception
when:

  1) the interrupted context had interrupts enabled

  2) RCU was watching in the original context

This _is_ intentional as there is no reason to defer preemption in such
a case. The RT people might get upset if you do so.

NMI-like exceptions, which are not allowed to schedule, should therefore
never go through irqentry_enter() and irqentry_exit().

irqentry_nmi_enter() and irqentry_nmi_exit() exist for a technical
reason and are not just of decorative nature. :)
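
To make the two conditions and the NMI restriction concrete, here is a
minimal standalone sketch. It is not the kernel's generic entry code:
struct toy_entry_state and toy_may_preempt_on_exit() are hypothetical names
invented for illustration, and the NMI-like case is modelled as a plain
flag rather than as a separate entry/exit path.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the interrupted context; not a kernel type. */
struct toy_entry_state {
	bool irqs_were_enabled;	/* condition 1: interrupts were enabled */
	bool rcu_was_watching;	/* condition 2: RCU was watching */
	bool nmi_like;		/* entered via an NMI-style path */
};

/*
 * Decide whether the exit path may preempt. This only models the rules
 * stated above; the real code has more to consider.
 */
static bool toy_may_preempt_on_exit(const struct toy_entry_state *st,
				    bool need_resched)
{
	if (st->nmi_like)		/* NMI-like exceptions never schedule */
		return false;
	if (!st->irqs_were_enabled)	/* interrupted code had IRQs masked */
		return false;
	if (!st->rcu_was_watching)	/* must return to the RCU-idle context */
		return false;
	return need_resched;
}
```

In this model a synchronous fault taken with interrupts enabled and RCU
watching is treated exactly like a true interrupt, which matches the
"any interrupt, trap and exception" wording above.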
>> (2) The generic irq entry code only preempts when RCU was watching in
>> the original context. IIUC that's just to avoid preempting from the
>> idle thread. Is it functionally necessary to avoid that, or is that
>> just an optimization?
>>
>> I'm asking because historically arm64 didn't check that, and I
>> haven't bothered checking here. I don't know whether we have a
>> latent functional bug.
>
> Like I told you on IRC, I *think* this is just an optimization, since if
> you hit idle, the idle loop will take care of scheduling. But I can't
> quite remember the details here, and wish we'd have written a sensible
> comment at that spot.
There is one, but it's obviously not detailed enough.
> Other places where RCU isn't watching are userspace and KVM. The first
> isn't relevant because this is return-to-kernel, and the second I'm not
> sure about.
>
> Thomas, can you remember?

Yes. It's not an optimization. It's a correctness issue.

If the interrupted context is RCU idle then you have to carefully go
back to that context, so that the context can tell RCU it is done with
the idle state and RCU has to pay attention again. Otherwise all of this
becomes imbalanced.

This is about context-level nesting:

       ...
  L1.A ct_cpuidle_enter();

          -> interrupt
  L2.A       ct_irq_enter();
             ... // Set NEED_RESCHED
  L2.B       ct_irq_exit();

       ...
  L1.B ct_cpuidle_exit();

Scheduling between #L2.B and #L1.B makes RCU rightfully upset. Think
about it this way:

  L1.A preempt_disable();
  L2.A    local_bh_disable();
          ..
  L2.B    local_bh_enable();
          if (need_resched())
                  schedule();
  L1.B preempt_enable();

RCU is not any different. For context-level nesting of any kind the only
valid order is:

  L1.A -> L2.A -> L2.B -> L1.B

Pretty obvious if you actually think about it, no?
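
For illustration only, the ordering rule can be modelled as a small stack
in plain C. This is a toy, not kernel code: struct toy_nest, toy_enter(),
toy_exit() and toy_may_schedule() are hypothetical names, and the two
context kinds merely stand in for the cpuidle and irq levels above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context kinds standing in for the L1/L2 levels above. */
enum toy_ctx { TOY_CPUIDLE = 1, TOY_IRQ = 2 };

#define TOY_MAX_DEPTH 8

struct toy_nest {
	enum toy_ctx stack[TOY_MAX_DEPTH];
	size_t depth;
	bool imbalanced;	/* set once the nesting order is violated */
};

/* Open a context level (L1.A, L2.A). */
static void toy_enter(struct toy_nest *n, enum toy_ctx c)
{
	if (n->depth == TOY_MAX_DEPTH) {
		n->imbalanced = true;
		return;
	}
	n->stack[n->depth++] = c;
}

/* Close a context level (L2.B, L1.B); it must be the innermost one. */
static void toy_exit(struct toy_nest *n, enum toy_ctx c)
{
	if (n->depth == 0 || n->stack[n->depth - 1] != c) {
		n->imbalanced = true;
		return;
	}
	n->depth--;
}

/* Scheduling is only valid once every opened level has been closed. */
static bool toy_may_schedule(const struct toy_nest *n)
{
	return !n->imbalanced && n->depth == 0;
}
```

After toy_enter(TOY_CPUIDLE), toy_enter(TOY_IRQ), toy_exit(TOY_IRQ) the
model still has the cpuidle level open, so toy_may_schedule() returns
false. That is the window between #L2.B and #L1.B.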
Thanks,
tglx