stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <gregkh@linuxfoundation.org>
To: alistair@popple.id.au, andrew.donnellan@au1.ibm.com,
	gregkh@linuxfoundation.org, mpe@ellerman.id.au
Cc: <stable@vger.kernel.org>, <stable-commits@vger.kernel.org>
Subject: Patch "powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"" has been added to the 4.3-stable tree
Date: Tue, 26 Jan 2016 22:56:06 -0800	[thread overview]
Message-ID: <1453877766237137@kroah.com> (raw)


This is a note to let you know that I've just added the patch titled

    powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"

to the 4.3-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powerpc-opal-irqchip-fix-deadlock-introduced-by-fix-double-endian-conversion.patch
and it can be found in the queue-4.3 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


>From 036592fbbe753d236402a0ae68148e7c143a0f0e Mon Sep 17 00:00:00 2001
From: Alistair Popple <alistair@popple.id.au>
Date: Fri, 18 Dec 2015 17:16:17 +1100
Subject: powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"

From: Alistair Popple <alistair@popple.id.au>

commit 036592fbbe753d236402a0ae68148e7c143a0f0e upstream.

Commit 25642e1459ac ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.

Fixes: 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/powerpc/platforms/powernv/opal-irqchip.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

--- a/arch/powerpc/platforms/powernv/opal-irqchip.c
+++ b/arch/powerpc/platforms/powernv/opal-irqchip.c
@@ -83,7 +83,19 @@ static void opal_event_unmask(struct irq
 	set_bit(d->hwirq, &opal_event_irqchip.mask);
 
 	opal_poll_events(&events);
-	opal_handle_events(be64_to_cpu(events));
+	last_outstanding_events = be64_to_cpu(events);
+
+	/*
+	 * We can't just handle the events now with opal_handle_events().
+	 * If we did we would deadlock when opal_event_unmask() is called from
+	 * handle_level_irq() with the irq descriptor lock held, because
+	 * calling opal_handle_events() would call generic_handle_irq() and
+	 * then handle_level_irq() which would try to take the descriptor lock
+	 * again. Instead queue the events for later.
+	 */
+	if (last_outstanding_events & opal_event_irqchip.mask)
+		/* Need to retrigger the interrupt */
+		irq_work_queue(&opal_event_irq_work);
 }
 
 static int opal_event_set_type(struct irq_data *d, unsigned int flow_type)


Patches currently in stable-queue which might be from alistair@popple.id.au are

queue-4.3/powerpc-opal-irqchip-fix-double-endian-conversion.patch
queue-4.3/powerpc-opal-irqchip-fix-deadlock-introduced-by-fix-double-endian-conversion.patch

                 reply	other threads:[~2016-01-27  7:00 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1453877766237137@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=alistair@popple.id.au \
    --cc=andrew.donnellan@au1.ibm.com \
    --cc=mpe@ellerman.id.au \
    --cc=stable-commits@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).