From: Nicholas Piggin <npiggin@gmail.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org, "Carol L. Soto" <clsoto@us.ibm.com>
Subject: Re: [PATCH] powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
Date: Wed, 21 Mar 2018 13:16:15 +1000
Message-ID: <20180321131615.7b8bd8e4@roar.ozlabs.ibm.com>
In-Reply-To: <1521599548.16434.263.camel@kernel.crashing.org>

Adding Carol...

On Wed, 21 Mar 2018 13:32:28 +1100
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Wed, 2018-03-21 at 12:22 +1000, Nicholas Piggin wrote:
> > force_external_irq_replay() can be called in the do_IRQ path with
> > interrupts hard enabled and soft disabled if may_hard_irq_enable() set
> > MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
> > store sequence. If a maskable interrupt hits during this sequence, it
> > will go to the masked handler to be marked pending in irq_happened.
> > This update will be lost when the interrupt returns and the store
> > instruction executes.  This can result in unpredictable latencies,
> > timeouts, lockups, etc.
> > 
> > Fix this by ensuring hard interrupts are disabled before modifying
> > irq_happened.
> > 
> > This could cause any maskable asynchronous interrupt to get lost, but
> > it was noticed on a P9 SMP system doing RDMA NVMe target over 100GbE,
> > so very high external interrupt rate and high IPI rate. The hang was
> > bisected down to enabling doorbell interrupts for IPIs. These provided
> > an interrupt type that could run at high rates in the do_IRQ path,
> > stressing the race.
> > 
> > Fixes: 1d607bb3bd ("powerpc/irq: Add mechanism to force a replay of interrupts")
> > Reported-by: Carol L. Soto <clsoto@us.ibm.com>
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---  
> 
> Nice one. We need that back into the distros asap.


Yes, Carol is working on this with the distros. Backporting should be
simple, so I'm hoping to get this upstream soon to help the process.
Can we take that as a Reviewed-by? :)

Thanks,
Nick
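
For readers following the thread, the lost update described in the quoted
commit message can be sketched as a small user-space model. This is not
the kernel code and not the patch itself: struct sim_cpu, the sim_*
functions, the hw_pending field and the bit values are all hypothetical
names chosen for illustration. The model forces a doorbell to arrive
between the load and the store of irq_happened, first with interrupts
hard enabled (the racy case) and then after hard disabling (the approach
the commit message describes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the kernel defines its own flags. */
#define SIM_IRQ_DBELL 0x02u /* doorbell interrupt pending */
#define SIM_IRQ_EE    0x04u /* external interrupt pending */

/* Hypothetical stand-in for the per-CPU state discussed in the thread. */
struct sim_cpu {
    bool msr_ee;          /* models MSR[EE]: true means hard enabled */
    uint8_t irq_happened; /* models local_paca->irq_happened */
    uint8_t hw_pending;   /* assumption: held by "hardware" while EE=0 */
};

/* An interrupt arrives while the CPU is soft disabled. */
static void sim_raise(struct sim_cpu *cpu, uint8_t bit)
{
    if (cpu->msr_ee)
        cpu->irq_happened |= bit; /* masked handler marks it pending */
    else
        cpu->hw_pending |= bit;   /* not delivered yet, stays pending */
}

/* True if the interrupt is recorded nowhere, i.e. it was lost. */
bool sim_is_lost(const struct sim_cpu *cpu, uint8_t bit)
{
    return ((cpu->irq_happened | cpu->hw_pending) & bit) == 0;
}

/* Racy variant: load, modify, store with MSR[EE]=1. */
void sim_replay_racy(struct sim_cpu *cpu)
{
    uint8_t tmp = cpu->irq_happened; /* load */
    sim_raise(cpu, SIM_IRQ_DBELL);   /* interrupt hits mid-sequence */
    tmp |= SIM_IRQ_EE;               /* modify the stale copy */
    cpu->irq_happened = tmp;         /* store wipes the handler's update */
}

/* Fixed variant: hard disable before touching irq_happened. */
void sim_replay_fixed(struct sim_cpu *cpu)
{
    cpu->msr_ee = false;             /* hard disable first */
    uint8_t tmp = cpu->irq_happened; /* load */
    sim_raise(cpu, SIM_IRQ_DBELL);   /* cannot be taken during the update */
    tmp |= SIM_IRQ_EE;               /* modify */
    cpu->irq_happened = tmp;         /* store loses nothing */
}

/* Runs both variants from the same starting state and checks the result. */
void sim_demo(void)
{
    struct sim_cpu racy = { .msr_ee = true, .irq_happened = 0, .hw_pending = 0 };
    struct sim_cpu fixed = racy;

    sim_replay_racy(&racy);
    assert(sim_is_lost(&racy, SIM_IRQ_DBELL));

    sim_replay_fixed(&fixed);
    assert(!sim_is_lost(&fixed, SIM_IRQ_DBELL));
}
```

Calling sim_demo() from a main() exercises both paths. In the racy case
the doorbell bit set by the handler is overwritten by the stale store; in
the fixed case the doorbell is never taken inside the sequence, so it is
still pending when interrupts are hard enabled again.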


Thread overview:
2018-03-21  2:22 [PATCH] powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened Nicholas Piggin
2018-03-21  2:32 ` Benjamin Herrenschmidt
2018-03-21  3:16   ` Nicholas Piggin [this message]
2018-03-21  4:11     ` Benjamin Herrenschmidt
2018-03-23 11:11 ` Michael Ellerman
