Alpha arch development list
From: Matt Turner <mattst88@gmail.com>
To: linux-alpha@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
	Richard Henderson <richard.henderson@linaro.org>,
	Magnus Lindholm <linmag7@gmail.com>,
	Ivan Kokshaysky <ink@unseen.parts>,
	Matt Turner <mattst88@gmail.com>
Subject: [PATCH 2/3] alpha: Fix SMP IPI loss when target CPU is in interrupt handler
Date: Sat, 30 May 2026 16:25:43 -0400
Message-ID: <20260530202544.59231-3-mattst88@gmail.com>
In-Reply-To: <20260530202544.59231-1-mattst88@gmail.com>

On EV7/IO7, the wripir PALcall delivers IPIs as edge-triggered hardware
signals through the IO7 I/O controller. If the target CPU is already
executing at IPL=7 inside do_entInt handling another interrupt, the IPI
edge is lost: the hardware never re-delivers it when the CPU drops back
to IPL=0.

The software IPI bit in ipi_data[cpu].bits is set before wripir is
called, so it remains set after the interrupt handler returns. But
because no hardware edge fires, handle_ipi() is never invoked again,
and the sending CPU spins forever in csd_lock_wait.
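The failure mode described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not kernel code: the
names edge_cpu, edge_send_ipi, edge_handle_ipi and edge_lower_ipl are
hypothetical, and the "edge is dropped at IPL=7" rule is taken from the
description above rather than from hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the race; none of this is kernel API. */
enum { EDGE_IPL_MAX = 7, EDGE_NR_IPI_TYPES = 3 };

struct edge_cpu {
	int ipl;                /* current interrupt priority level */
	unsigned long ipi_bits; /* stands in for ipi_data[cpu].bits */
	unsigned int handled;   /* number of handler invocations */
};

/* Stands in for handle_ipi(): consume every pending software bit. */
static void edge_handle_ipi(struct edge_cpu *cpu)
{
	cpu->ipi_bits = 0;
	cpu->handled++;
}

/*
 * Stands in for the sender: set the software bit first, then raise the
 * hardware edge. Returns true if the edge reached the handler.
 */
static bool edge_send_ipi(struct edge_cpu *target, unsigned int type)
{
	assert(type < EDGE_NR_IPI_TYPES);
	target->ipi_bits |= 1UL << type;

	if (target->ipl >= EDGE_IPL_MAX)
		return false;   /* assumed behaviour: edge dropped, not latched */

	edge_handle_ipi(target);
	return true;
}

/* Dropping back to IPL=0 does not replay a lost edge in this model. */
static void edge_lower_ipl(struct edge_cpu *cpu)
{
	cpu->ipl = 0;
}
```

In this model, an IPI sent while the target is at IPL=7 leaves its software
bit set with the handler never having run, which is the state a wait=1
sender would spin on.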

This race is the root cause of a 15-year-old SMP deadlock on EV7/Marvel
systems. It is reliably triggered by workloads that generate many
synchronous IPIs (TLB flushes via on_each_cpu(wait=1)) while the
target CPU receives concurrent I/O or RTC interrupts.

Fix: add alpha_poll_ipi_inirq(), called from do_entInt within each
interrupt handler's irq_enter/irq_exit bracket. It checks
ipi_data[smp_processor_id()].bits and drains any pending IPIs that
arrived while we were at IPL=7, before irq_exit() opens the softirq
window where a TLB-flush softirq could itself deadlock on
alpha_smp_ipi_lock. The check is a single READ_ONCE, so the overhead is
negligible when no IPI was missed.
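The poll-and-drain pattern can be sketched in user space with C11 atomics.
This is a minimal model, not the patch itself: poll_ipi_bits, poll_post_ipi,
poll_handle_ipi and poll_ipi_inirq are hypothetical names, a relaxed atomic
load only roughly approximates READ_ONCE, and atomic_exchange approximates
the xchg() that handle_ipi() uses to claim the pending bits.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for ipi_data[cpu].bits on one CPU. */
static atomic_ulong poll_ipi_bits;

/* Counts how many individual IPI types were dispatched. */
static unsigned int poll_dispatched;

/* Sender side: publish the software bit for one IPI type. */
static void poll_post_ipi(unsigned int type)
{
	assert(type < 8 * sizeof(unsigned long));
	atomic_fetch_or(&poll_ipi_bits, 1UL << type);
}

/* Stands in for handle_ipi(): atomically claim all bits, dispatch each. */
static void poll_handle_ipi(void)
{
	unsigned long ops = atomic_exchange(&poll_ipi_bits, 0UL);

	while (ops) {
		ops &= ops - 1; /* clear the lowest set bit */
		poll_dispatched++;
	}
}

/*
 * Stands in for alpha_poll_ipi_inirq(): a single load on the fast path,
 * a full drain only when something is pending.
 * Returns 1 if pending IPIs were drained, 0 otherwise.
 */
static int poll_ipi_inirq(void)
{
	if (!atomic_load_explicit(&poll_ipi_bits, memory_order_relaxed))
		return 0;

	poll_handle_ipi();
	return 1;
}
```

Because the drain claims the bits with an exchange, a bit posted between the
initial load and the exchange is still handled rather than lost.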

For the RTC interrupt (case 1 in do_entInt), handle_irq() already calls
its own irq_enter()/irq_exit() internally. The outer irq_enter/irq_exit
pair added here is intentional: it keeps irq_count > 0 while handle_irq()
runs, so handle_irq()'s inner irq_exit() sees a non-zero count and skips
the softirq window. The softirq window is deferred until the outer
irq_exit(), which runs after alpha_poll_ipi_inirq() has already drained
any pending IPIs. Without this outer bracket, irq_exit() inside
handle_irq() could open the softirq window before any missed IPIs are
rescued, risking a deadlock on alpha_smp_ipi_lock.
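The ordering argument above can be checked with a toy nesting counter. This
is a sketch under simplifying assumptions: the nest_* names are hypothetical,
and the model treats "count drops to zero in irq_exit()" as the only
condition for the softirq window, which is a simplification of the real
irq_exit().

```c
#include <assert.h>
#include <string.h>

static int nest_irq_count;  /* models the hardirq nesting depth */
static char nest_log[64];   /* records the order of events */

/* Append an event name to the comma-separated log. */
static void nest_record(const char *event)
{
	assert(strlen(nest_log) + strlen(event) + 2 <= sizeof(nest_log));
	if (nest_log[0])
		strcat(nest_log, ",");
	strcat(nest_log, event);
}

static void nest_irq_enter(void)
{
	nest_irq_count++;
}

/* The softirq window opens only when the outermost bracket closes. */
static void nest_irq_exit(void)
{
	assert(nest_irq_count > 0);
	if (--nest_irq_count == 0)
		nest_record("softirq");
}

/* Models handle_irq(), which carries its own enter/exit pair. */
static void nest_handle_irq(void)
{
	nest_irq_enter();
	nest_record("rtc");
	nest_irq_exit();
}

/* Models case 1 of do_entInt, with or without the outer bracket. */
static const char *nest_rtc_path(int outer_bracket)
{
	nest_log[0] = '\0';
	nest_irq_count = 0;

	if (outer_bracket)
		nest_irq_enter();
	nest_handle_irq();
	nest_record("poll");    /* stands in for alpha_poll_ipi_inirq() */
	if (outer_bracket)
		nest_irq_exit();

	return nest_log;
}
```

With the outer bracket the model logs the poll before the softirq window;
without it the softirq window opens first, which is the ordering the patch
is trying to avoid.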

Approximately 98% of rescued IPIs are IPI_CALL_FUNC (the TLB-flush
type), confirming that IO7 genuinely drops the hardware edge rather than
holding it pending until IPL falls.

A lost IPI_CALL_FUNC only deadlocks when the sender is blocking (wait=1).
wait=0 callers do not hang, but silently skip the function on the remote
CPU, which may be a correctness issue in its own right.

This fix is complementary to the alpha_smp_ipi_lock serialization
(previous commit). Both are required:
  - Serialization prevents two CPUs simultaneously issuing wait=1 IPIs
    from deadlocking each other in csd_lock_wait.
  - This fix prevents a single wait=1 caller from deadlocking due to an
    IPI edge lost to an IPL=7 window on the remote CPU.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Matt Turner <mattst88@gmail.com>
---
 arch/alpha/kernel/irq_alpha.c | 29 ++++++++++++++++++++++++++++-
 arch/alpha/kernel/proto.h     |  1 +
 arch/alpha/kernel/smp.c       | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

diff --git ./arch/alpha/kernel/irq_alpha.c ./arch/alpha/kernel/irq_alpha.c
index ac941172ae66..0e4234ef7ea0 100644
--- ./arch/alpha/kernel/irq_alpha.c
+++ ./arch/alpha/kernel/irq_alpha.c
@@ -69,22 +69,49 @@ do_entInt(unsigned long type, unsigned long vector,
 		break;
 #endif
 	case 1:
-		/* handle_irq() already does irq_enter()/irq_exit() */
+		/*
+		 * Wrap handle_irq() in our own irq_enter/irq_exit so that the
+		 * inner irq_exit() inside handle_irq() does not run softirqs
+		 * (irq_count remains > 0). We poll for lost IPIs before the
+		 * outer irq_exit(), which is where softirqs may run. This
+		 * prevents a TLB flush softirq from deadlocking on
+		 * alpha_smp_ipi_lock while the sending CPU waits for our ACK.
+		 */
+		irq_enter();
 		handle_irq(RTC_IRQ);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
+		irq_exit();
 		break;
 	case 2:
 		irq_enter();
 		alpha_mv.machine_check(vector, la_ptr);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	case 3:
 		irq_enter();
 		alpha_mv.device_interrupt(vector);
+#ifdef CONFIG_SMP
+		/*
+		 * Drain any IPIs whose edge was lost while we were at IPL=7.
+		 * Must be called before irq_exit() to prevent softirqs (e.g.
+		 * a TLB flush) from deadlocking on alpha_smp_ipi_lock while
+		 * the sending CPU spins in csd_lock_wait.
+		 */
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	case 4:
 		irq_enter();
 		perf_irq(la_ptr, regs);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	default:
diff --git ./arch/alpha/kernel/proto.h ./arch/alpha/kernel/proto.h
index f138bd494628..04879e0b2932 100644
--- ./arch/alpha/kernel/proto.h
+++ ./arch/alpha/kernel/proto.h
@@ -120,6 +120,7 @@ extern void unregister_srm_console(void);
 /* smp.c */
 extern void setup_smp(void);
 extern void handle_ipi(struct pt_regs *);
+extern void alpha_poll_ipi_inirq(struct pt_regs *);
 extern void __init smp_callin(void);
 
 /* bios32.c */
diff --git ./arch/alpha/kernel/smp.c ./arch/alpha/kernel/smp.c
index d900da49b0d8..099e1ac6a0d6 100644
--- ./arch/alpha/kernel/smp.c
+++ ./arch/alpha/kernel/smp.c
@@ -557,6 +557,41 @@ handle_ipi(struct pt_regs *regs)
 		recv_secondary_console_msg();
 }
 
+/*
+ * On EV7/IO7, IPI signals are edge-triggered. If an IPI arrives while this
+ * CPU is executing at IPL=7 (inside another interrupt handler), the hardware
+ * edge is lost. The software bit in ipi_data[] remains set but handle_ipi()
+ * is never re-invoked, causing the sending CPU to spin forever in csd_lock_wait.
+ *
+ * Call this from within hardirq context (between irq_enter and irq_exit) to
+ * drain any IPIs that arrived while we were running at IPL=7, before irq_exit()
+ * opens the softirq window where a TLB flush could deadlock on alpha_smp_ipi_lock.
+ */
+void alpha_poll_ipi_inirq(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+	unsigned long bits = READ_ONCE(ipi_data[cpu].bits);
+
+	if (!bits)
+		return;
+
+	/*
+	 * Peek at type bits before handle_ipi() clears them via xchg().
+	 * Bits arriving after this READ_ONCE are drained but not counted;
+	 * the counters are approximate but sufficient for diagnosis.
+	 * Note: handle_ipi() also increments ipi_count, so the "IPI:" row
+	 * in /proc/interrupts includes both normal and rescued deliveries.
+	 */
+	if (bits & (1UL << IPI_RESCHEDULE))
+		cpu_data[cpu].rescued_reschedule_count++;
+	if (bits & (1UL << IPI_CALL_FUNC))
+		cpu_data[cpu].rescued_call_func_count++;
+	if (bits & (1UL << IPI_CPU_STOP))
+		cpu_data[cpu].rescued_cpu_stop_count++;
+
+	handle_ipi(regs);
+}
+
 void
 arch_smp_send_reschedule(int cpu)
 {
-- 
2.53.0


Thread overview: 5+ messages
2026-05-30 20:25 [PATCH 0/3] alpha SMP fixes for EV7/Marvel Matt Turner
2026-05-30 20:25 ` [PATCH 1/3] alpha: smp: Serialize all synchronous IPI operations to fix SMP deadlock Matt Turner
2026-05-30 20:25 ` Matt Turner [this message]
2026-05-30 20:25 ` [PATCH 3/3] alpha: Break down rescued IPI counter by type in /proc/interrupts Matt Turner
2026-05-31  8:24 ` [PATCH 0/3] alpha SMP fixes for EV7/Marvel Magnus Lindholm
