From: Pingfan Liu <kernelfans@gmail.com>
To: Mark Rutland <mark.rutland@arm.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
linux-arm-kernel@lists.infradead.org,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Joey Gouly <joey.gouly@arm.com>,
Sami Tolvanen <samitolvanen@google.com>,
Julien Thierry <julien.thierry@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Yuichi Ito <ito-yuichi@fujitsu.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead
Date: Sat, 9 Oct 2021 11:49:47 +0800
Message-ID: <YWERWy5tMdbaEUU8@piliu.users.ipa.redhat.com>
In-Reply-To: <20211008172513.GD976@C02TD0UTHF1T.local>

On Fri, Oct 08, 2021 at 06:25:13PM +0100, Mark Rutland wrote:
> On Fri, Oct 08, 2021 at 10:55:04PM +0800, Pingfan Liu wrote:
> > On Fri, Oct 08, 2021 at 12:01:25PM +0800, Pingfan Liu wrote:
> > > Sorry that I missed this message and I am just back from a long
> > > festival.
> > >
> > > Adding Paul for RCU guidance.
> > >
> > > On Thu, Sep 30, 2021 at 02:32:57PM +0100, Mark Rutland wrote:
> > > > On Sat, Sep 25, 2021 at 11:39:55PM +0800, Pingfan Liu wrote:
> > > > > On Fri, Sep 24, 2021 at 06:53:06PM +0100, Mark Rutland wrote:
> > > > > > On Fri, Sep 24, 2021 at 09:28:33PM +0800, Pingfan Liu wrote:
> > > > > > > In enter_el1_irq_or_nmi(), it can be the case which NMI interrupts an
> > > > > > > irq, which makes the condition !interrupts_enabled(regs) fail to detect
> > > > > > > the NMI. This will cause a mistaken account for irq.
> > > > > >
> > > > > Sorry about the confusing word "account", it should be "lockdep/rcu/.."
> > > > >
> > > > > > Can you please explain this in more detail? It's not clear which
> > > > > > specific case you mean when you say "NMI interrupts an irq", as that
> > > > > > could mean a number of distinct scenarios.
> > > > > >
> > > > > > AFAICT, if we're in an IRQ handler (with NMIs unmasked), and an NMI
> > > > > > causes a new exception we'll do the right thing. So either I'm missing a
> > > > > > subtlety or you're describing a different scenario..
> > > > > >
> > > > > > Note that the entry code is only trying to distinguish between:
> > > > > >
> > > > > > a) This exception is *definitely* an NMI (because regular interrupts
> > > > > > were masked).
> > > > > >
> > > > > > > b) This exception is *either* an IRQ or an NMI (and this *cannot* be
> > > > > > distinguished until we acknowledge the interrupt), so we treat it as
> > > > > > an IRQ for now.
> > > > > >
> > > > > b) is the aim.
> > > > >
> > > > > At the entry, enter_el1_irq_or_nmi() -> enter_from_kernel_mode()->rcu_irq_enter()/rcu_irq_enter_check_tick() etc.
> > > > > While at irqchip level, gic_handle_irq()->gic_handle_nmi()->nmi_enter(),
> > > > > which does not call rcu_irq_enter_check_tick(). So it is not proper to
> > > > > "treat it as an IRQ for now"
> > > >
> > > > I'm struggling to understand the problem here. What is "not proper", and
> > > > why?
> > > >
> > > > Do you think there's a correctness problem, or that we're doing more
> > > > work than necessary?
> > > >
> > > I had thought it just did redundant accounting. But after revisiting RCU
> > > code, I think it confronts a real bug.
> > >
> > > > If you could give a specific example of a problem, it would really help.
> > > >
> > > Refer to rcu_nmi_enter(), which can be called by
> > > enter_from_kernel_mode():
> > >
> > > ||noinstr void rcu_nmi_enter(void)
> > > ||{
> > > ||    ...
> > > ||    if (rcu_dynticks_curr_cpu_in_eqs()) {
> > > ||
> > > ||        if (!in_nmi())
> > > ||            rcu_dynticks_task_exit();
> > > ||
> > > ||        // RCU is not watching here ...
> > > ||        rcu_dynticks_eqs_exit();
> > > ||        // ... but is watching here.
> > > ||
> > > ||        if (!in_nmi()) {
> > > ||            instrumentation_begin();
> > > ||            rcu_cleanup_after_idle();
> > > ||            instrumentation_end();
> > > ||        }
> > > ||
> > > ||        instrumentation_begin();
> > > ||        // instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
> > > ||        instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
> > > ||        // instrumentation for the noinstr rcu_dynticks_eqs_exit()
> > > ||        instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
> > > ||
> > > ||        incby = 1;
> > > ||    } else if (!in_nmi()) {
> > > ||        instrumentation_begin();
> > > ||        rcu_irq_enter_check_tick();
> > > ||    } else {
> > > ||        instrumentation_begin();
> > > ||    }
> > > ||    ...
> > > ||}
> > >
> >
> > I forgot to add the context needed to understand this case:
> > On arm64, at present, a pNMI (akin to NMI) may call rcu_nmi_enter()
> > without calling "__preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);".
> > As a result it can be mistaken for a normal interrupt in
> > rcu_nmi_enter().
>
> I appreciate that there's a window where we treat the pNMI like an IRQ,
> but that's by design, and we account for this in gic_handle_irq() and
> gic_handle_nmi() where we "upgrade" to NMI context with
> nmi_enter()..nmi_exit().
>
> The idea is that we have two cases:
>
> 1) If we take a pNMI from a context where IRQs were masked, we know it
> must be a pNMI, and perform the NMI entry immediately to avoid
> reentrancy problems.
>
> I think we're all happy with this case.
>
Right.

> 2) If we take a pNMI from a context where IRQs were unmasked, we don't know
> whether the trigger was a pNMI/IRQ until we read from the GIC, and
> since we *could* have taken an IRQ, this is equivalent to taking a
> spurious IRQ, and while handling that, taking the NMI, e.g.
>
> < run with IRQs unmasked >
> ~~~ take IRQ ~~~
> < enter IRQ >
> ~~~ take NMI exception ~~~
> < enter NMI >
> < handle NMI >
> < exit NMI >
> ~~~ return from NMI exception ~~~
> < handle IRQ / spurious / do-nothing >
> < exit IRQ >
> ~~~ return from IRQ exception ~~~
> < continue running with IRQs unmasked >
>
Yes, I am on the same page here. (I think I used a wrong example in my
previous email, which caused the confusion.)

> ... except that we don't do the HW NMI exception entry/exit, just all
> the necessary SW accounting.
>
A small but important point: local_irq_save() etc. cannot disable pNMI.
>
> Note that case (2) can *never* nest within itself or within case (1).
>
> Do you have a specific example of something that goes wrong with the
> above? e.g. something that's inconsistent with that rationale?
>
Please see my comment below.

> > And this may cause the following issue:
> > > There are 3 pieces of code put under the
> > > protection of if (!in_nmi()). At least the last one,
> > > "rcu_irq_enter_check_tick()", can trigger a hard lockup bug, because it
> > > is supposed to hold a spin lock with irqs off via
> > > "raw_spin_lock_rcu_node(rdp->mynode)", but a pNMI can breach it. The same
> > > scenario applies to rcu_nmi_exit()->rcu_prepare_for_idle().

Sorry, that was a wrong example, since it falls under case (1).

Concentrating on the spin lock rcu_node->lock, there are two ways to take it:
  raw_spin_lock_rcu_node()
  raw_spin_trylock_rcu_node()

Now suppose the following deadlock scenario:

note_gp_changes() in non-irq-context
{
    local_irq_save(flags);
    ...
    raw_spin_trylock_rcu_node(rnp)  // hold lock
    needwake = __note_gp_changes(rnp, rdp); ------\
    raw_spin_unlock_irqrestore_rcu_node(rnp, flags); \
}                                                      \
                                                        \---> pNMI breaks in, because local_irq_save() cannot disable it.
    rcu_irq_enter() without __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET)
      ->rcu_nmi_enter()
        {
            else if (!in_nmi())
                rcu_irq_enter_check_tick()
                  ->__rcu_irq_enter_check_tick()
                    {
                        ...
                        raw_spin_lock_rcu_node(rdp->mynode);
                        // Oops, deadlock!
                    }
        }
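
The scenario above can be sketched as a small user-space model. This is only an illustration, not kernel code: every model_* name is hypothetical, the bit offsets are assumed to mirror the preempt_count layout in include/linux/preempt.h, and the real __rcu_irq_enter_check_tick() takes the node lock only under additional conditions that are not modelled here. The sketch shows what the in_nmi() test would decide if the pNMI were not accounted as an NMI. Whether the arm64 entry code can actually reach that state is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror the preempt_count layout in include/linux/preempt.h. */
#define MODEL_HARDIRQ_OFFSET (1u << 16)
#define MODEL_NMI_OFFSET     (1u << 20)
#define MODEL_NMI_MASK       (0xfu << 20)

/* Hypothetical per-CPU state: the counter plus "is rcu_node->lock held?". */
struct model_cpu {
	unsigned int preempt_count;
	bool node_lock_held;
};

enum model_result {
	MODEL_LOCK_FREE,    /* !in_nmi() branch reached, lock not contended */
	MODEL_LOCK_SKIPPED, /* in_nmi() was true, lock never touched */
	MODEL_DEADLOCK      /* !in_nmi() branch reached with lock already held */
};

/* Stand-in for in_nmi(): true when any NMI bit is set in the counter. */
static bool model_in_nmi(const struct model_cpu *cpu)
{
	return (cpu->preempt_count & MODEL_NMI_MASK) != 0;
}

/* Stand-in for the "else if (!in_nmi())" branch of rcu_nmi_enter(). */
static enum model_result model_rcu_nmi_enter(const struct model_cpu *cpu)
{
	if (model_in_nmi(cpu))
		return MODEL_LOCK_SKIPPED;
	/* Here the real code may call raw_spin_lock_rcu_node(). */
	return cpu->node_lock_held ? MODEL_DEADLOCK : MODEL_LOCK_FREE;
}

/*
 * A pNMI arrives. If account_as_nmi is true, the entry path first does the
 * equivalent of __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET).
 */
static enum model_result model_pnmi_entry(struct model_cpu *cpu,
					  bool account_as_nmi)
{
	unsigned int saved;
	enum model_result res;

	assert(cpu != NULL);
	saved = cpu->preempt_count;
	if (account_as_nmi)
		cpu->preempt_count += MODEL_NMI_OFFSET + MODEL_HARDIRQ_OFFSET;
	res = model_rcu_nmi_enter(cpu);
	cpu->preempt_count = saved; /* undo the accounting on exit */
	return res;
}
```

With node_lock_held set, model_pnmi_entry(&cpu, false) reports MODEL_DEADLOCK while model_pnmi_entry(&cpu, true) reports MODEL_LOCK_SKIPPED, which is the distinction the example relies on.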
> > >
> > > As for the first two "if (!in_nmi())" checks, I have no idea why they
> > > are there, other than an NMI breaching spin_lock_irq(). I hope Paul
> > > can give some guidance.
>
> That code (in enter_from_kernel_mode()) only runs in case 2, where it
> cannot be nested within a pNMI, so I struggle to see how this can
> deadlock. If it can, then I would expect the general case of a pNMI
> nesting within an IRQ would be broken?
>
Sorry again for the earlier misleading example. I hope the new
example helps.

> Can you give a concrete example of a sequence that would lockup?
> Currently I can't see how that's possible.
>
It seems the RCU subsystem draws a strict semantic distinction between NMIs
and normal interrupts. Besides the deadlock example, there may be other
surprises to deal with (I will follow up on that in a separate mail with Paul).

Thanks,
Pingfan