public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
From: Yeoreum Yun <yeoreum.yun@arm.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, ptosi@google.com,
	pcc@google.com, catalin.marinas@arm.com, bigeasy@linutronix.de,
	linux-kernel@vger.kernel.org, joey.gouly@arm.com,
	mcgrof@kernel.org, broonie@kernel.org, geert@linux-m68k.org,
	kristina.martsenko@arm.com, james.morse@arm.com,
	Will Deacon <will@kernel.org>,
	rppt@kernel.org
Subject: Re: [PATCH] arm64/trap: fix broken ct->nmi_nesting when die() is called in a kthread
Date: Tue, 3 Jun 2025 12:14:18 +0100	[thread overview]
Message-ID: <aD7ZCnNUqxb9XWNh@e129823.arm.com> (raw)
In-Reply-To: <aD7EA2Bmp-mrWZaN@J2N7QTR9R3.cambridge.arm.com>

Hi Mark,

> On Mon, Jun 02, 2025 at 06:50:53PM +0100, Yeoreum Yun wrote:
> > > One of the reasons we treak BRK as an NMI is that exception entry for
> > > BRK will leave all DAIF bits set, whereas schedule() should be called
> > > with debug and SError unmasked (but IRQ+FIQ masked). Generally, calling
> > > ct_nmi_enter() prevents preemption (and hence calls to schedule()).
> >
> > I think ct_nmi_enter() doesn't prevents preemption but
> > debug_exception_enter() disables preemption.
>
> Yep, sorry for the confusion there. I had erroneously pattern-matched on
> the nmi_nesting values and I had confused that with the similar
> manipulation of the preempt count.
>
> > > Another is that we may have a BUG() or WARN() in entry code where the
> > > task could be in an inconsistent state, and we need to treat the
> > > exception like an NMI to avoid consuming that inconsistent state.
> >
> > So, let's think the "inconsistent" state like:
> >   -> el0_enter()
> > 	  -> enter_from_user_mode()
> > 		  -> before update ct_state (context_tracking.state), call BUG()/WARN()
> > 			  -> el1_dbg()
> >
> > It need to call ct_nmi_enter() in el1_dbg() right?
>
> Yes. The critical things are that RCU may not be watching, and all other
> entry accounting may be in an intermediate/inconsistent state, since the
> BUG()/WARN() could be anywhere in that C code. Currently that means we
> must call ct_nmi_enter().
>
> The other problem to bear in mind is that we don't have a way to
> distinguish these BUG()/WARN() cases from others throughout the kernel,
> which is why we currently unconditionally treat this as an NMI entry.
>
> > > To handle that properly, we need to:
> > >
> > > (a) Figure out what to do with entry code. Last I looked I was under the
> > >     impression that x86 either didn't have a problem here, or simply
> > >     ignored it.
> >
> > TBH, in above case, x86 seems context_traking.state will be broken...
>
> That's certainly possible, that was the impression I had last time I
> looked, but I haven't looked at this in detail for a short while, and I
> may have missed something.
>
> > > (b) Handle BUG/WARN traps separately from other BRKs, such that we can
> > >     use local_daif_inherit(), and treat this as a special function call
> > >     rather than an NMI.
> > >
> > > (c) Somehow teach make_task_dead() to handle the case where DAIF.D
> > >     and/or DAIF.A are set. Most likely we simply have to panic() here,
> > >     as with BUG() in interrupt context.
> >
> > Right... It should handle for DAIF.D and DAIF.A bits...
>
> Yes.
>
> [...]

Thanks for clarficiation :D

>
> > > As-is, I think an extra warning in the case of a BUG() is fine given
> > > the larger functional issues.
> > >
> > > I do not think this patch is correct as-is.
> >
> > So, what I think:
> >   1. arm64_enter_el1_dbg() should ct_nmi_enter() as it is.
> >   2. in bug_handler() while handling BUG_TYPE, add above ct_nmi_exit()
> >      conditional call.
> >   3. DAIF.D and DAIF.A handling.
>
> No, that is not safe. In step 2, calling ct_nmi_exit() would undo *all*
> of the ct_nmi_enter() logic, and may stop RCU from watching if the
> exception was entered from some intermediate/inconsistent state.

Yes if call ct_nmi_enter() without condition.
But I imply with the condition check what I posted.
if CT_NESTING_IRQ_NONIDLE,
it wouldn't need call and that cpu can be watched by RCU.

>
> If we want to change anything now, it should be the DAIF.DA handling,
> but even for that I'm not sure what the best approach is, and that'll
> require some changes to core code.
>
> Please leave this as-if for now.
>

Not now. But waiting for Ada's patch merged.
and let me talk with you again please.

Thanks for your confirmation again!

--
Sincerely,
Yeoreum Yun


  reply	other threads:[~2025-06-03 11:20 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-30  9:27 [PATCH] arm64/trap: fix broken ct->nmi_nesting when die() is called in a kthread Yeoreum Yun
2025-06-02 12:47 ` Will Deacon
2025-06-02 14:54   ` Yeoreum Yun
2025-06-02 15:11     ` Mark Rutland
2025-06-02 17:50       ` Yeoreum Yun
2025-06-03  9:44         ` Mark Rutland
2025-06-03 11:14           ` Yeoreum Yun [this message]
2025-06-03 13:46             ` Mark Rutland
2025-06-03 15:23               ` Yeoreum Yun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aD7ZCnNUqxb9XWNh@e129823.arm.com \
    --to=yeoreum.yun@arm.com \
    --cc=bigeasy@linutronix.de \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=geert@linux-m68k.org \
    --cc=james.morse@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kristina.martsenko@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mcgrof@kernel.org \
    --cc=pcc@google.com \
    --cc=ptosi@google.com \
    --cc=rppt@kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox