public inbox for linux-kernel@vger.kernel.org
From: Pingfan Liu <kernelfans@gmail.com>
To: Mark Rutland <mark.rutland@arm.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Joey Gouly <joey.gouly@arm.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Julien Thierry <julien.thierry@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Yuichi Ito <ito-yuichi@fujitsu.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCHv2 1/5] arm64/entry-common: push the judgement of nmi ahead
Date: Fri, 8 Oct 2021 12:01:25 +0800
Message-ID: <YV/ClUNWvMga3qud@piliu.users.ipa.redhat.com>
In-Reply-To: <20210930133257.GB18258@lakrids.cambridge.arm.com>

Sorry for the late reply. I missed this message and have only just
returned from a long holiday.

Adding Paul for RCU guidance.

On Thu, Sep 30, 2021 at 02:32:57PM +0100, Mark Rutland wrote:
> On Sat, Sep 25, 2021 at 11:39:55PM +0800, Pingfan Liu wrote:
> > On Fri, Sep 24, 2021 at 06:53:06PM +0100, Mark Rutland wrote:
> > > On Fri, Sep 24, 2021 at 09:28:33PM +0800, Pingfan Liu wrote:
> > > > In enter_el1_irq_or_nmi(), it can be the case which NMI interrupts an
> > > > irq, which makes the condition !interrupts_enabled(regs) fail to detect
> > > > the NMI. This will cause a mistaken account for irq.
> > > 
> > Sorry about the confusing word "account", it should be "lockdep/rcu/.."
> > 
> > > Can you please explain this in more detail? It's not clear which
> > > specific case you mean when you say "NMI interrupts an irq", as that
> > > could mean a number of distinct scenarios.
> > > 
> > > AFAICT, if we're in an IRQ handler (with NMIs unmasked), and an NMI
> > > causes a new exception we'll do the right thing. So either I'm missing a
> > > subtlety or you're describing a different scenario..
> > > 
> > > Note that the entry code is only trying to distinguish between:
> > > 
> > > a) This exception is *definitely* an NMI (because regular interrupts
> > >    were masked).
> > > 
> > > b) This exception is *either* and IRQ or an NMI (and this *cannot* be
> > >    distinguished until we acknowledge the interrupt), so we treat it as
> > >    an IRQ for now.
> > > 
> > b) is the aim.
> > 
> > At the entry, enter_el1_irq_or_nmi() -> enter_from_kernel_mode()->rcu_irq_enter()/rcu_irq_enter_check_tick() etc.
> > While at irqchip level, gic_handle_irq()->gic_handle_nmi()->nmi_enter(),
> > which does not call rcu_irq_enter_check_tick(). So it is not proper to
> > "treat it as an IRQ for now"
> 
> I'm struggling to understand the problem here. What is "not proper", and
> why?
> 
> Do you think there's a correctness problem, or that we're doing more
> work than necessary? 
> 
I had thought it only did redundant accounting. But after revisiting the
RCU code, I think there is a real bug.

> If you could give a specific example of a problem, it would really help.
> 
Refer to rcu_nmi_enter(), which can be reached from
enter_from_kernel_mode() (via rcu_irq_enter()):

||noinstr void rcu_nmi_enter(void)
||{
||        ...
||        if (rcu_dynticks_curr_cpu_in_eqs()) {
||
||                if (!in_nmi())
||                        rcu_dynticks_task_exit();
||
||                // RCU is not watching here ...
||                rcu_dynticks_eqs_exit();
||                // ... but is watching here.
||
||                if (!in_nmi()) {
||                        instrumentation_begin();
||                        rcu_cleanup_after_idle();
||                        instrumentation_end();
||                }
||
||                instrumentation_begin();
||                // instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
||                instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));
||                // instrumentation for the noinstr rcu_dynticks_eqs_exit()
||                instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
||
||                incby = 1;
||        } else if (!in_nmi()) {
||                instrumentation_begin();
||                rcu_irq_enter_check_tick();
||        } else  {
||                instrumentation_begin();
||        }
||        ...
||}

There are three pieces of code guarded by if (!in_nmi()). At least the
last one, rcu_irq_enter_check_tick(), can trigger a hard lockup, because
it takes a spinlock with IRQs off via
"raw_spin_lock_rcu_node(rdp->mynode)", but a pNMI can still interrupt
that critical section. The same scenario applies to
rcu_nmi_exit()->rcu_prepare_for_idle().
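
To make the hazard concrete, here is a toy userspace model in C. It is
only a sketch of the general rule that a lock taken with IRQs off does
not exclude an NMI, so code that may run in NMI context must not take
it. All names (toy_lock, toy_check_tick, classified_as_nmi, ...) are
hypothetical and are not kernel APIs, and the trylock stands in for a
real spinlock that would spin forever. Whether the arm64 entry path can
actually reach the failing state is the open question here; the model
does not settle it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of this is kernel code. */
struct toy_lock { bool held; };

static struct toy_lock node_lock;  /* stands in for the rcu_node lock      */
static bool irqs_masked;           /* stands in for "regular IRQs are off" */

static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;      /* a real spinlock would spin here */
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models a helper that needs the node lock, like the check-tick path. */
static bool toy_check_tick(void)
{
	if (!toy_trylock(&node_lock))
		return false;      /* would be a hard lockup */
	toy_unlock(&node_lock);
	return true;
}

/*
 * Models exception entry. classified_as_nmi is how the entry code
 * judged the exception; the lock-taking helper only runs when the
 * exception is NOT classified as an NMI, mirroring if (!in_nmi()).
 */
static bool toy_exception_entry(bool classified_as_nmi)
{
	if (!classified_as_nmi)
		return toy_check_tick();
	return true;
}

/* Regular IRQs stay pending while masked; an NMI is always delivered. */
static bool toy_deliver(bool is_nmi, bool classified_as_nmi)
{
	if (!is_nmi && irqs_masked)
		return true;       /* nothing runs, nothing can deadlock */
	return toy_exception_entry(classified_as_nmi);
}

/*
 * Take the lock with IRQs masked, let one exception arrive inside the
 * critical section, and report whether the system survives.
 */
bool toy_locked_section_hit_by(bool is_nmi, bool classified_as_nmi)
{
	bool ok;

	irqs_masked = true;
	toy_trylock(&node_lock);
	ok = toy_deliver(is_nmi, classified_as_nmi);
	toy_unlock(&node_lock);
	irqs_masked = false;
	return ok;
}
```

In this model, toy_locked_section_hit_by(true, false) is the only call
that reports a lockup: an NMI arrives inside the critical section and is
handled as if it were an IRQ. A regular IRQ is held off by the masking,
and an NMI that is classified as an NMI skips the lock-taking helper.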

As for the first two "if (!in_nmi())" checks, I do not know why they are
needed, other than to keep an NMI from breaching a spin_lock_irq()
section. I hope Paul can give some guidance.


Thanks,

	Pingfan


> I'm aware that we do more work than strictly necessary when we take a
> pNMI from a context with IRQs enabled, but that's how we'd intended this
> to work, as it's vastly simpler to manage the state that way. Unless
> there's a real problem with that approach I'd prefer to leave it as-is.
> 
> Thanks,
> Mark.
