From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Nicholas Piggin <npiggin@gmail.com>, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
Date: Wed, 21 Mar 2018 13:32:28 +1100
Message-ID: <1521599548.16434.263.camel@kernel.crashing.org>
In-Reply-To: <20180321022228.9606-1-npiggin@gmail.com>

On Wed, 2018-03-21 at 12:22 +1000, Nicholas Piggin wrote:
> force_external_irq_replay() can be called in the do_IRQ path with
> interrupts hard enabled and soft disabled if may_hard_irq_enable() set
> MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
> store sequence. If a maskable interrupt hits during this sequence, it
> will go to the masked handler to be marked pending in irq_happened.
> This update will be lost when the interrupt returns and the store
> instruction executes.  This can result in unpredictable latencies,
> timeouts, lockups, etc.
> 
> Fix this by ensuring hard interrupts are disabled before modifying
> irq_happened.
> 
> This could cause any maskable asynchronous interrupt to get lost, but
> it was noticed on a P9 SMP system doing RDMA NVMe target over 100GbE,
> so very high external interrupt rate and high IPI rate. The hang was
> bisected down to enabling doorbell interrupts for IPIs. These provided
> an interrupt type that could run at high rates in the do_IRQ path,
> stressing the race.
> 
> Fixes: 1d607bb3bd ("powerpc/irq: Add mechanism to force a replay of interrupts")
> Reported-by: Carol L. Soto <clsoto@us.ibm.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

Nice one. We need that back into the distros asap.

> This has survived stress testing quite well so far, may need a little
> more testing but I'd like to post it now to get some more comments.
> 
> We can optimise the mtmsr a bit more (e.g., skip it if interrupts are
> already disabled or EE already set), but I've got some other patches
> pending which change things there slightly, so I prefer to have this
> minimal fix now, then make such changes upstream later.
> 
> ---
>  arch/powerpc/kernel/irq.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index f88038847790..061aa0f47bb1 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -476,6 +476,14 @@ void force_external_irq_replay(void)
>  	 */
>  	WARN_ON(!arch_irqs_disabled());
>  
> +	/*
> +	 * Interrupts must always be hard disabled before irq_happened is
> +	 * modified (to prevent lost update in case of interrupt between
> +	 * load and store).
> +	 */
> +	__hard_irq_disable();
> +	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> +
>  	/* Indicate in the PACA that we have an interrupt to replay */
>  	local_paca->irq_happened |= PACA_IRQ_EE;
>  }
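
The lost update described in the commit message can be modelled in
user space. The sketch below is not kernel code: struct cpu_model,
deliver(), racy_replay() and fixed_replay() are hypothetical names, and
the IRQ_* bit values are chosen only for illustration. It replays the
load / interrupt / store interleaving deterministically, then the same
sequence with hard interrupts disabled first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits for this model only. */
#define IRQ_HARD_DIS 0x01
#define IRQ_DBELL    0x02
#define IRQ_EE       0x04

/* Hypothetical stand-in for the per-CPU state involved in the race. */
struct cpu_model {
	uint8_t irq_happened;	/* pending-interrupt flags */
	int hard_enabled;	/* models MSR[EE] */
};

/*
 * Models a maskable interrupt arriving while soft-disabled.
 * If hard-enabled, the masked handler runs and records the interrupt
 * in irq_happened; returns 1. If hard-disabled, the interrupt is held
 * off (it stays pending in hardware); returns 0.
 */
static int deliver(struct cpu_model *c, uint8_t bit)
{
	assert(c != NULL);
	if (!c->hard_enabled)
		return 0;
	c->irq_happened |= bit;
	return 1;
}

/* Unfixed sequence: an interrupt lands between the load and the store. */
static int racy_replay(struct cpu_model *c, uint8_t bit)
{
	uint8_t tmp = c->irq_happened;	/* load */
	int taken = deliver(c, bit);	/* masked handler sets its bit */

	tmp |= IRQ_EE;			/* modify the stale copy */
	c->irq_happened = tmp;		/* store overwrites the handler's bit */
	return taken;
}

/* Fixed sequence: hard disable before touching irq_happened. */
static int fixed_replay(struct cpu_model *c, uint8_t bit)
{
	uint8_t tmp;
	int taken;

	c->hard_enabled = 0;		/* models __hard_irq_disable() */
	tmp = c->irq_happened;		/* load */
	taken = deliver(c, bit);	/* held off, nothing is modified */
	tmp |= IRQ_HARD_DIS | IRQ_EE;
	c->irq_happened = tmp;		/* store: no concurrent update to lose */
	return taken;
}
```

In this model, racy_replay() on a hard-enabled CPU reports the doorbell
as taken yet leaves IRQ_DBELL clear in irq_happened, while
fixed_replay() leaves the doorbell undelivered so it can still be taken
once interrupts are hard-enabled again.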

Thread overview: 5+ messages
2018-03-21  2:22 [PATCH] powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened Nicholas Piggin
2018-03-21  2:32 ` Benjamin Herrenschmidt [this message]
2018-03-21  3:16   ` Nicholas Piggin
2018-03-21  4:11     ` Benjamin Herrenschmidt
2018-03-23 11:11 ` Michael Ellerman
