[PATCH 0/3] alpha SMP fixes for EV7/Marvel

Alpha arch development list

* [PATCH 0/3] alpha SMP fixes for EV7/Marvel
@ 2026-05-30 20:25 Matt Turner
  2026-05-30 20:25 ` [PATCH 1/3] alpha: smp: Serialize all synchronous IPI operations to fix SMP deadlock Matt Turner
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Matt Turner @ 2026-05-30 20:25 UTC
  To: linux-alpha
  Cc: linux-kernel, Richard Henderson, Magnus Lindholm, Ivan Kokshaysky,
	Matt Turner

I acquired an AlphaServer ES47 in 2010, and it's never been stable --
deadlocking after random amounts of time. I could never make any
connections with load, uptime, etc.

The only connection I could make was that the git test suite would always
trigger the deadlock.

I spent some time over the last week playing with Claude and have found
*a* solution. With the first two patches in place, I've successfully run
the git test suite 6 times in a row. I've never previously seen it run
successfully without deadlocking the system.

The first patch is generally applicable (not specific to EV7/Marvel).
I'm unsure why this would never have caused problems on other systems
(or why it would only be relevant for EV7/Marvel). That gives me some
pause.

The second patch applies only to EV7/Marvel, I believe. tl;dr: IPIs seem
to be lost.

The third patch adds some accounting to /proc/interrupts to report the
number of lost interrupts, confirming the problem from patch 2.

Please review.

Matt

Matt Turner (3):
  alpha: smp: Serialize all synchronous IPI operations to fix SMP
    deadlock
  alpha: Fix SMP IPI loss when target CPU is in interrupt handler
  alpha: Break down rescued IPI counter by type in /proc/interrupts

 arch/alpha/include/asm/smp.h  | 12 +++++
 arch/alpha/kernel/irq.c       | 12 +++++
 arch/alpha/kernel/irq_alpha.c | 29 ++++++++++-
 arch/alpha/kernel/proto.h     |  1 +
 arch/alpha/kernel/smp.c       | 97 +++++++++++++++++++++++++++++++++++
 arch/alpha/mm/tlbflush.c      |  3 ++
 6 files changed, 153 insertions(+), 1 deletion(-)

-- 
2.53.0


* [PATCH 1/3] alpha: smp: Serialize all synchronous IPI operations to fix SMP deadlock
From: Matt Turner @ 2026-05-30 20:25 UTC
  To: linux-alpha
  Cc: linux-kernel, Richard Henderson, Magnus Lindholm, Ivan Kokshaysky,
	Matt Turner

Two or more CPUs simultaneously calling any function that uses
on_each_cpu(wait=1) or smp_call_function(wait=1) deadlock: each blocks
in csd_lock_wait spinning while waiting for the remote CPU to signal CSD
completion. While spinning, neither CPU can receive the other's IPI, so
neither completion signal arrives — permanent hang.

Affected callers: smp_imb, flush_tlb_all, flush_tlb_mm, flush_tlb_page,
flush_icache_user_page (smp.c) and migrate_flush_tlb_page (tlbflush.c).

Introduce alpha_smp_ipi_lock (plain spinlock, defined in smp.c, declared
in asm/smp.h) and apply it to all six callers. Rather than spin_lock(),
use a trylock loop with alpha_drain_ipi(): if the lock is held, the loser
actively drains any pending IPI bits on the local CPU before retrying.
This is necessary because some callers hold IRQs disabled (e.g. paths
that take spin_lock_irqsave), so no RTC interrupt will fire to rescue a
lost wripir edge via alpha_poll_ipi_inirq(). alpha_drain_ipi() calls
handle_ipi() under local_irq_save/restore, satisfying handle_ipi()'s
requirement that IRQs be disabled, without touching lockdep
hardirq-context state.

This fix is necessary but not sufficient. A separate, independent
deadlock path exists: if the target CPU is inside do_entInt at IPL=7
when wripir fires, the hardware IPI edge is lost and the sending CPU
spins forever even when only one CPU is issuing a wait=1 call. That
race is fixed independently by alpha_poll_ipi_inirq() (see follow-on
commit). Both fixes are required for a complete solution.

The deadlock has been observed on EV7/Marvel under workloads generating
a high rate of synchronous TLB flush IPIs (e.g. the git test suite).

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Matt Turner <mattst88@gmail.com>
---
 arch/alpha/include/asm/smp.h |  9 ++++++
 arch/alpha/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++
 arch/alpha/mm/tlbflush.c     |  3 ++
 3 files changed, 74 insertions(+)

diff --git ./arch/alpha/include/asm/smp.h ./arch/alpha/include/asm/smp.h
index 2264ae72673b..8bd529376cf6 100644
--- ./arch/alpha/include/asm/smp.h
+++ ./arch/alpha/include/asm/smp.h
@@ -48,6 +48,15 @@ extern int smp_num_cpus;
 extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
 
+/*
+ * Global spinlock serializing all synchronous (wait=1) IPI callers.
+ * Callers must use the trylock+alpha_drain_ipi() pattern, not spin_lock(),
+ * because some call sites hold IRQs disabled and cannot rely on the RTC
+ * interrupt to rescue a lost wripir edge.
+ */
+extern spinlock_t alpha_smp_ipi_lock;
+extern void alpha_drain_ipi(void);
+
 #else /* CONFIG_SMP */
 
 #define hard_smp_processor_id()		0
diff --git ./arch/alpha/kernel/smp.c ./arch/alpha/kernel/smp.c
index ed06367ece57..d900da49b0d8 100644
--- ./arch/alpha/kernel/smp.c
+++ ./arch/alpha/kernel/smp.c
@@ -597,11 +597,61 @@ ipi_imb(void *ignored)
 	imb();
 }
 
+/*
+ * Serialize all synchronous (wait=1) IPI operations to prevent cross-CPU
+ * deadlock on EV7/Marvel.  If two CPUs simultaneously call any function that
+ * uses on_each_cpu(wait=1) or smp_call_function(wait=1), each blocks in
+ * csd_lock_wait spinning for the remote CPU to signal completion.  While
+ * spinning, neither CPU can receive the other's IPI, so neither completion
+ * signal arrives — permanent hang.
+ *
+ * A plain spinlock (not irqsave) is intentional: the CPU that loses the lock
+ * race spins with IRQs enabled and can service the winner's IPI before
+ * taking the lock itself.
+ *
+ * All callers of synchronous IPIs — including migrate_flush_tlb_page in
+ * tlbflush.c — must hold this lock.
+ */
+DEFINE_SPINLOCK(alpha_smp_ipi_lock);
+
+/*
+ * Drain any pending IPIs for this CPU while spinning on alpha_smp_ipi_lock.
+ *
+ * The lock holder has already sent a wripir but is blocked in csd_lock_wait
+ * waiting for our IPI ACK.  We cannot simply spin on the lock: if IRQs are
+ * disabled (e.g. caller holds a spin_lock_irqsave), no RTC interrupt will
+ * fire and the lost wripir edge is never rescued by alpha_poll_ipi_inirq.
+ *
+ * Call this from the trylock loop so the IPI is processed even with IRQs
+ * disabled, breaking the circular wait.
+ *
+ * handle_ipi() requires IRQs disabled: generic_smp_call_function_interrupt
+ * asserts lockdep_assert_irqs_disabled().  Use local_irq_save/restore so
+ * this is safe whether the caller has IRQs enabled (e.g. page fault path)
+ * or disabled (e.g. spin_lock_irqsave holder).  Avoid __irq_enter_raw/
+ * __irq_exit_raw: those manipulate lockdep hardirq-context state and trigger
+ * a lockdep WARNING when called while lockdep already tracks hardirq context.
+ */
+void alpha_drain_ipi(void)
+{
+	unsigned long flags;
+
+	if (!READ_ONCE(ipi_data[smp_processor_id()].bits))
+		return;
+
+	local_irq_save(flags);
+	handle_ipi(NULL); /* regs unused in handle_ipi() */
+	local_irq_restore(flags);
+}
+
 void
 smp_imb(void)
 {
 	/* Must wait other processors to flush their icache before continue. */
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	on_each_cpu(ipi_imb, NULL, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 }
 EXPORT_SYMBOL(smp_imb);
 
@@ -616,7 +666,10 @@ flush_tlb_all(void)
 {
 	/* Although we don't have any data to pass, we do want to
 	   synchronize with the other processors.  */
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	on_each_cpu(ipi_flush_tlb_all, NULL, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 }
 
 #define asn_locked() (cpu_data[smp_processor_id()].asn_lock)
@@ -651,7 +704,10 @@ flush_tlb_mm(struct mm_struct *mm)
 		}
 	}
 
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	smp_call_function(ipi_flush_tlb_mm, mm, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 
 	preempt_enable();
 }
@@ -702,7 +758,10 @@ flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
 	data.mm = mm;
 	data.addr = addr;
 
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	smp_call_function(ipi_flush_tlb_page, &data, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 
 	preempt_enable();
 }
@@ -752,7 +811,10 @@ flush_icache_user_page(struct vm_area_struct *vma, struct page *page,
 		}
 	}
 
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	smp_call_function(ipi_flush_icache_page, mm, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 
 	preempt_enable();
 }
diff --git ./arch/alpha/mm/tlbflush.c ./arch/alpha/mm/tlbflush.c
index ccbc317b9a34..37607d08796b 100644
--- ./arch/alpha/mm/tlbflush.c
+++ ./arch/alpha/mm/tlbflush.c
@@ -89,7 +89,10 @@ void migrate_flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)
 	 * This is the "combined" version of flush_tlb_mm + per-page invalidate.
 	 */
 	preempt_disable();
+	while (!spin_trylock(&alpha_smp_ipi_lock))
+		alpha_drain_ipi();
 	on_each_cpu(ipi_flush_mm_and_page, &d, 1);
+	spin_unlock(&alpha_smp_ipi_lock);
 
 	/*
 	 * mimic flush_tlb_mm()'s mm_users<=1 optimization.
-- 
2.53.0

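The trylock-and-drain loop that patch 1 adds to each caller can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions: try_lock(), post_call_func(), drain_ipi() and
trylock_or_drain() are hypothetical stand-ins for spin_trylock(), the
sender side of smp_call_function(), alpha_drain_ipi() and the per-caller
loop; C11 atomics replace the kernel primitives; and the two CPUs are
stepped sequentially rather than run in parallel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 2
#define IPI_CALL_FUNC 1   /* bit number; mirrors the name used in the patch */

/* Hypothetical user-space stand-ins for the kernel objects in the patch. */
static atomic_flag ipi_lock = ATOMIC_FLAG_INIT;   /* models alpha_smp_ipi_lock */
static atomic_ulong pending[NR_CPUS];             /* models ipi_data[cpu].bits */
static atomic_bool csd_busy[NR_CPUS];             /* models a locked CSD */

static bool try_lock(void) { return !atomic_flag_test_and_set(&ipi_lock); }
static void unlock(void)   { atomic_flag_clear(&ipi_lock); }

/* Sender side: mark the request busy, then post the software IPI bit. */
static void post_call_func(int target)
{
	atomic_store(&csd_busy[target], true);
	atomic_fetch_or(&pending[target], 1UL << IPI_CALL_FUNC);
}

/* Receiver side: consume pending bits and acknowledge the request. */
static unsigned long drain_ipi(int cpu)
{
	unsigned long bits;

	assert(cpu >= 0 && cpu < NR_CPUS);
	bits = atomic_exchange(&pending[cpu], 0);
	if (bits & (1UL << IPI_CALL_FUNC))
		atomic_store(&csd_busy[cpu], false);   /* completion signal */
	return bits;
}

/*
 * One iteration of the loser's loop: try the lock, and on failure service
 * our own pending IPIs so the lock holder can make progress.
 * Returns true once the lock has been taken.
 */
static bool trylock_or_drain(int cpu)
{
	if (try_lock())
		return true;
	drain_ipi(cpu);
	return false;
}
```

Stepping the model by hand: CPU 0 takes the lock and posts a call-function
request to CPU 1. CPU 1's first trylock_or_drain(1) fails to take the lock
but clears its pending bit and the busy flag. A plain spin_lock() with IRQs
disabled would not make that progress.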


* [PATCH 2/3] alpha: Fix SMP IPI loss when target CPU is in interrupt handler
From: Matt Turner @ 2026-05-30 20:25 UTC
  To: linux-alpha
  Cc: linux-kernel, Richard Henderson, Magnus Lindholm, Ivan Kokshaysky,
	Matt Turner

On EV7/IO7, the wripir PALcall delivers IPIs as edge-triggered hardware
signals through the IO7 I/O controller. If the target CPU is already
executing at IPL=7 inside do_entInt handling another interrupt, the IPI
edge is lost: the hardware never re-delivers it when the CPU drops back
to IPL=0.

The software IPI bit in ipi_data[cpu].bits is set before wripir is
called, so it remains set after the interrupt handler returns. But
because no hardware edge fires, handle_ipi() is never invoked again,
and the sending CPU spins forever in csd_lock_wait.

This race is the root cause of a 15-year SMP deadlock on EV7/Marvel
systems. It is reliably triggered by workloads that generate many
synchronous IPIs (TLB flushes via on_each_cpu(wait=1)) while the
target CPU receives concurrent I/O or RTC interrupts.

Fix: add alpha_poll_ipi_inirq(), called from do_entInt within each
interrupt handler's irq_enter/irq_exit bracket. It checks
ipi_data[smp_processor_id()].bits and drains any pending IPIs that
arrived while we were at IPL=7, before irq_exit() opens the softirq
window where a TLB-flush softirq could itself deadlock on
alpha_smp_ipi_lock. The check is a single READ_ONCE so there is no
overhead when no IPI was missed.

For the RTC interrupt (case 1 in do_entInt), handle_irq() already calls
its own irq_enter()/irq_exit() internally. The outer irq_enter/irq_exit
pair added here is intentional: it keeps irq_count > 0 while handle_irq()
runs, so handle_irq()'s inner irq_exit() sees a non-zero count and skips
the softirq window. The softirq window is deferred until the outer
irq_exit(), which runs after alpha_poll_ipi_inirq() has already drained
any pending IPIs. Without this outer bracket, irq_exit() inside
handle_irq() could open the softirq window before any missed IPIs are
rescued, risking a deadlock on alpha_smp_ipi_lock.

Approximately 98% of rescued IPIs are IPI_CALL_FUNC (the TLB-flush
type), confirming that IO7 genuinely drops the hardware edge rather than
holding it pending until IPL falls.

A lost IPI_CALL_FUNC only deadlocks when the sender is blocking (wait=1).
wait=0 callers do not hang, but silently skip the function on the remote
CPU, which may be a correctness issue in its own right.

This fix is complementary to the alpha_smp_ipi_lock serialization
(previous commit). Both are required:
  - Serialization prevents two CPUs simultaneously issuing wait=1 IPIs
    from deadlocking each other in csd_lock_wait.
  - This fix prevents a single wait=1 caller from deadlocking due to an
    IPI edge lost to an IPL=7 window on the remote CPU.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Matt Turner <mattst88@gmail.com>
---
 arch/alpha/kernel/irq_alpha.c | 29 ++++++++++++++++++++++++++++-
 arch/alpha/kernel/proto.h     |  1 +
 arch/alpha/kernel/smp.c       | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 64 insertions(+), 1 deletion(-)

diff --git ./arch/alpha/kernel/irq_alpha.c ./arch/alpha/kernel/irq_alpha.c
index ac941172ae66..0e4234ef7ea0 100644
--- ./arch/alpha/kernel/irq_alpha.c
+++ ./arch/alpha/kernel/irq_alpha.c
@@ -69,22 +69,49 @@ do_entInt(unsigned long type, unsigned long vector,
 		break;
 #endif
 	case 1:
-		/* handle_irq() already does irq_enter()/irq_exit() */
+		/*
+		 * Wrap handle_irq() in our own irq_enter/irq_exit so that the
+		 * inner irq_exit() inside handle_irq() does not run softirqs
+		 * (irq_count remains > 0). We poll for lost IPIs before the
+		 * outer irq_exit(), which is where softirqs may run. This
+		 * prevents a TLB flush softirq from deadlocking on
+		 * alpha_smp_ipi_lock while the sending CPU waits for our ACK.
+		 */
+		irq_enter();
 		handle_irq(RTC_IRQ);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
+		irq_exit();
 		break;
 	case 2:
 		irq_enter();
 		alpha_mv.machine_check(vector, la_ptr);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	case 3:
 		irq_enter();
 		alpha_mv.device_interrupt(vector);
+#ifdef CONFIG_SMP
+		/*
+		 * Drain any IPIs whose edge was lost while we were at IPL=7.
+		 * Must be called before irq_exit() to prevent softirqs (e.g.
+		 * a TLB flush) from deadlocking on alpha_smp_ipi_lock while
+		 * the sending CPU spins in csd_lock_wait.
+		 */
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	case 4:
 		irq_enter();
 		perf_irq(la_ptr, regs);
+#ifdef CONFIG_SMP
+		alpha_poll_ipi_inirq(regs);
+#endif
 		irq_exit();
 		break;
 	default:
diff --git ./arch/alpha/kernel/proto.h ./arch/alpha/kernel/proto.h
index f138bd494628..04879e0b2932 100644
--- ./arch/alpha/kernel/proto.h
+++ ./arch/alpha/kernel/proto.h
@@ -120,6 +120,7 @@ extern void unregister_srm_console(void);
 /* smp.c */
 extern void setup_smp(void);
 extern void handle_ipi(struct pt_regs *);
+extern void alpha_poll_ipi_inirq(struct pt_regs *);
 extern void __init smp_callin(void);

 /* bios32.c */
diff --git ./arch/alpha/kernel/smp.c ./arch/alpha/kernel/smp.c
index d900da49b0d8..099e1ac6a0d6 100644
--- ./arch/alpha/kernel/smp.c
+++ ./arch/alpha/kernel/smp.c
@@ -557,6 +557,41 @@ handle_ipi(struct pt_regs *regs)
 		recv_secondary_console_msg();
 }

+/*
+ * On EV7/IO7, IPI signals are edge-triggered. If an IPI arrives while this
+ * CPU is executing at IPL=7 (inside another interrupt handler), the hardware
+ * edge is lost. The software bit in ipi_data[] remains set but handle_ipi()
+ * is never re-invoked, causing the sending CPU to spin forever in csd_lock_wait.
+ *
+ * Call this from within hardirq context (between irq_enter and irq_exit) to
+ * drain any IPIs that arrived while we were running at IPL=7, before irq_exit()
+ * opens the softirq window where a TLB flush could deadlock on alpha_smp_ipi_lock.
+ */
+void alpha_poll_ipi_inirq(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+	unsigned long bits = READ_ONCE(ipi_data[cpu].bits);
+
+	if (!bits)
+		return;
+
+	/*
+	 * Peek at type bits before handle_ipi() clears them via xchg().
+	 * Bits arriving after this READ_ONCE are drained but not counted;
+	 * the counters are approximate but sufficient for diagnosis.
+	 * Note: handle_ipi() also increments ipi_count, so the "IPI:" row
+	 * in /proc/interrupts includes both normal and rescued deliveries.
+	 */
+	if (bits & (1UL << IPI_RESCHEDULE))
+		cpu_data[cpu].rescued_reschedule_count++;
+	if (bits & (1UL << IPI_CALL_FUNC))
+		cpu_data[cpu].rescued_call_func_count++;
+	if (bits & (1UL << IPI_CPU_STOP))
+		cpu_data[cpu].rescued_cpu_stop_count++;
+
+	handle_ipi(regs);
+}
+
 void
 arch_smp_send_reschedule(int cpu)
 {
-- 
2.53.0

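The lost-edge race described in the commit message above can be modelled
in a few lines of user-space C. This is a sketch, not kernel code: struct
cpu_model and the model_*() functions are hypothetical names, and the rule
"an edge sent while the target is at IPL 7 is dropped" is taken from the
commit message as an assumption rather than verified against hardware.

```c
#include <assert.h>
#include <stdbool.h>

#define IPI_CALL_FUNC 1   /* bit number; mirrors the name used in the patch */

/* Hypothetical single-CPU model; none of these names are kernel APIs. */
struct cpu_model {
	int ipl;                 /* current interrupt priority level */
	unsigned long bits;      /* models ipi_data[cpu].bits */
	unsigned long handled;   /* IPIs processed by the handler */
	unsigned long rescued;   /* polls that found work left behind */
};

/* Models handle_ipi(): consume and count every pending software bit. */
static void model_handle_ipi(struct cpu_model *c)
{
	unsigned long bits = c->bits;

	c->bits = 0;
	while (bits) {
		c->handled++;
		bits &= bits - 1;   /* clear the lowest set bit */
	}
}

/*
 * Models the sender: the software bit is always recorded, but the hardware
 * edge only reaches the handler when the target is below IPL 7.
 * Returns true if the edge was delivered.
 */
static bool model_send_ipi(struct cpu_model *c, int type)
{
	assert(type >= 0 && type < (int)(8 * sizeof(unsigned long)));
	c->bits |= 1UL << type;
	if (c->ipl >= 7)
		return false;       /* edge lost: the behaviour the patch assumes */
	model_handle_ipi(c);
	return true;
}

/* Models alpha_poll_ipi_inirq(): run at the end of an interrupt handler. */
static void model_poll_ipi(struct cpu_model *c)
{
	if (!c->bits)
		return;             /* nothing was missed */
	c->rescued++;
	model_handle_ipi(c);
}
```

In this model, sending to a CPU at IPL 7 leaves the software bit set with
nothing handled, which corresponds to the sender spinning in csd_lock_wait.
A later model_poll_ipi() processes the bit and bumps the rescue counter,
and a send at IPL 0 is handled immediately with no rescue needed.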

* [PATCH 3/3] alpha: Break down rescued IPI counter by type in /proc/interrupts
From: Matt Turner @ 2026-05-30 20:25 UTC
  To: linux-alpha
  Cc: linux-kernel, Richard Henderson, Magnus Lindholm, Ivan Kokshaysky,
	Matt Turner

Add per-type rescued IPI counters to cpuinfo_alpha:
  rescued_reschedule_count (RIP:)
  rescued_call_func_count  (RIF:)
  rescued_cpu_stop_count   (RIS:)

alpha_poll_ipi_inirq() peeks at ipi_data[cpu].bits before handle_ipi()
clears them via xchg(), then increments the appropriate per-CPU counter.
Expose all three as separate rows in /proc/interrupts alongside the
existing "IPI:" row.

This lets us distinguish the deadlock-causing subset (RIF: IPI_CALL_FUNC,
of which wait=1 callers are the ones that deadlock) from the harmless
majority (RIP: reschedule). A non-zero RIF count confirms the EV7/IO7
edge-triggered IPI loss hypothesis.

Note: handle_ipi() also increments ipi_count unconditionally, so the
"IPI:" row in /proc/interrupts includes both normal and rescued deliveries.
The RIP:/RIF:/RIS: counters sample bits before the xchg(); bits that
arrive after the READ_ONCE are drained by handle_ipi() but not reflected
in these counters — they are approximate but sufficient for diagnosis.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Matt Turner <mattst88@gmail.com>
---
 arch/alpha/include/asm/smp.h |  3 +++
 arch/alpha/kernel/irq.c      | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git ./arch/alpha/include/asm/smp.h ./arch/alpha/include/asm/smp.h
index 8bd529376cf6..98f522ee367f 100644
--- ./arch/alpha/include/asm/smp.h
+++ ./arch/alpha/include/asm/smp.h
@@ -31,6 +31,9 @@ struct cpuinfo_alpha {
 	int need_new_asn;
 	int asn_lock;
 	unsigned long ipi_count;
+	unsigned long rescued_reschedule_count;
+	unsigned long rescued_call_func_count;
+	unsigned long rescued_cpu_stop_count;
 	unsigned long prof_multiplier;
 	unsigned long prof_counter;
 	unsigned char mcheck_expected;
diff --git ./arch/alpha/kernel/irq.c ./arch/alpha/kernel/irq.c
index c67047c5d830..34709e1c42c5 100644
--- ./arch/alpha/kernel/irq.c
+++ ./arch/alpha/kernel/irq.c
@@ -76,6 +76,18 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 	for_each_online_cpu(j)
 		seq_printf(p, "%10lu ", cpu_data[j].ipi_count);
 	seq_putc(p, '\n');
+	seq_puts(p, "RIP: ");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10lu ", cpu_data[j].rescued_reschedule_count);
+	seq_puts(p, "          Rescued IPIs: reschedule\n");
+	seq_puts(p, "RIF: ");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10lu ", cpu_data[j].rescued_call_func_count);
+	seq_puts(p, "          Rescued IPIs: call function\n");
+	seq_puts(p, "RIS: ");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10lu ", cpu_data[j].rescued_cpu_stop_count);
+	seq_puts(p, "          Rescued IPIs: cpu stop\n");
 #endif
 	seq_puts(p, "PMI: ");
 	for_each_online_cpu(j)
-- 
2.53.0

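The row layout that patch 3 emits (label, one 10-column counter per CPU,
then a description) can be reproduced in user space to check what a reader
of /proc/interrupts would see. This is a sketch: format_rescued_row() is a
hypothetical helper that uses snprintf() in place of the kernel's
seq_puts()/seq_printf(), and the counter values are supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one /proc/interrupts-style row into buf: a label, one right-aligned
 * 10-column counter per CPU, then a description.  Returns the number of
 * characters written (excluding the NUL), or -1 if buf is too small.
 */
static int format_rescued_row(char *buf, size_t len, const char *label,
			      const unsigned long *counts, int ncpus,
			      const char *desc)
{
	size_t off;
	int n, i;

	assert(buf && label && counts && desc);

	n = snprintf(buf, len, "%s ", label);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off = (size_t)n;

	for (i = 0; i < ncpus; i++) {
		n = snprintf(buf + off, len - off, "%10lu ", counts[i]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	/* Ten spaces of padding before the description, as in the patch. */
	n = snprintf(buf + off, len - off, "          %s\n", desc);
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	return (int)(off + (size_t)n);
}
```

Called with the label "RIF:", per-CPU counts {3, 0} and the description
"Rescued IPIs: call function", the helper produces a single line whose
counters are right-aligned under each CPU column.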


* Re: [PATCH 0/3] alpha SMP fixes for EV7/Marvel
From: Magnus Lindholm @ 2026-05-31  8:24 UTC
  To: Matt Turner; +Cc: linux-alpha, linux-kernel, Richard Henderson, Ivan Kokshaysky

On Sat, May 30, 2026 at 10:25 PM Matt Turner <mattst88@gmail.com> wrote:
>
> I acquired an AlphaServer ES47 in 2010, and it's never been stable --
> deadlocking after random amounts of time. I could never make any
> connections with load, uptime, etc.
>
> The only connection I could make was that the git test suite would always
> trigger the deadlock.
>
> I spent some time over the last week playing with Claude and have found
> *a* solution. With the first two patches in place, I've successfully run
> the git test suite 6 times in a row. I've never previously seen it run
> successfully without deadlocking the system.
>
> The first patch is generally applicable (not specific to EV7/Marvel).
> I'm unsure why this would never have caused problems on other systems
> (or why it would only be relevant for EV7/Marvel). That gives me some
> pause.
>
> The second patch applies only to EV7/Marvel, I believe. tl;dr: IPIs seem
> to be lost.
>
> The third patch adds some accounting to /proc/interrupts to report the
> number of lost interrupts, confirming the problem from patch 2.
>
> Please review.
>
> Matt
>

Hi Matt,

Thanks for working on this. This is very impressive work, and it looks like
you're close to nailing down some long-standing bugs and making the Marvel
platform a lot more usable with SMP kernels. The lost-edge IPI diagnosis looks
plausible, but I hit a few issues while reviewing/testing the series.

First, after applying the series I hit a build failure. Patch 1 adds:

extern spinlock_t alpha_smp_ipi_lock;

to arch/alpha/include/asm/smp.h, but that header can be included before
spinlock_t is defined, e.g. while building kernel/sched/rq-offsets.s:

arch/alpha/include/asm/smp.h:60:8: error: unknown type name 'spinlock_t'

Including <linux/spinlock_types.h> from asm/smp.h, or avoiding exposing
spinlock_t from that early header, fixes that part.

Patch 2 also appears not to be buildable independently: it updates
cpu_data[].rescued_{reschedule,call_func,cpu_stop}_count, but those fields are
only introduced in patch 3. Please either move the struct additions into patch
2, move the accounting into patch 3, or squash those patches.

I also wonder if alpha_drain_ipi() should disable interrupts before looking at
the per-CPU IPI word. That would avoid reading ipi_data[smp_processor_id()].bits
before local IRQs are disabled, and would keep the CPU lookup and pending-bit
check in the same IRQ-disabled section:

	local_irq_save(flags);
	cpu = smp_processor_id();
	if (READ_ONCE(ipi_data[cpu].bits))
		handle_ipi(NULL);
	local_irq_restore(flags);

That looks safer than reading ipi_data[smp_processor_id()].bits before
local_irq_save().

On the design side, patch 1 says it serializes all synchronous IPI operations,
but it seems to wrap only the Alpha arch TLB/icache/IMB users. Either the commit
message should narrow that claim, or the serialization needs to live lower in
the IPI/call-function path. As posted, the patch is closer to: "serialize a
subset of Alpha arch synchronous IPI users, mainly TLB/cache/IMB flushes".

Also, the series does not apply cleanly to current v7.1-rc1 directly. It appears
to depend on the Alpha GENERIC_ENTRY series:

Link: https://lore.kernel.org/linux-alpha/20260529142322.1362438-1-linmag7@gmail.com/T/#t

which is still under review and not in mainline yet. Please mention that
dependency in the cover letter and include the base commit and/or a lore link
to the prerequisite series.

Finally, this adds a global spin_trylock()/spin_unlock() around hot paths such
as migrate_flush_tlb_page(). That has no impact on non-Alpha architectures, but
it serializes these operations for all Alpha SMP systems, while the bug
description is EV7/Marvel/IO7-specific. Can this be justified for non-EV7
systems, or gated to the affected platform?

Thanks,
Magnus

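One way to picture the platform-gating option raised at the end of the
review is a pair of helpers that take the global lock only when a boot-time
flag says the platform needs it. This is a user-space sketch under loud
assumptions: ipi_serialize_needed, sync_ipi_begin(), sync_ipi_end() and
drain_pending() are hypothetical names that appear nowhere in the series,
and how the flag would be set for EV7/Marvel is left open.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical: set once at boot on platforms that need serialization. */
static bool ipi_serialize_needed;

static atomic_flag ipi_lock = ATOMIC_FLAG_INIT;  /* models alpha_smp_ipi_lock */
static bool ipi_lock_held;                       /* debug aid for this sketch */
static unsigned long lock_acquisitions;          /* instrumentation only */

/* Placeholder for draining pending IPIs while waiting for the lock. */
static void drain_pending(void)
{
}

/* Enter a synchronous-IPI section; a no-op on unaffected platforms. */
static void sync_ipi_begin(void)
{
	if (!ipi_serialize_needed)
		return;
	while (atomic_flag_test_and_set(&ipi_lock))
		drain_pending();
	ipi_lock_held = true;
	lock_acquisitions++;
}

/* Leave the section; must pair with sync_ipi_begin(). */
static void sync_ipi_end(void)
{
	if (!ipi_serialize_needed)
		return;
	assert(ipi_lock_held);   /* catch unbalanced begin/end */
	ipi_lock_held = false;
	atomic_flag_clear(&ipi_lock);
}
```

With the flag clear, callers pay only for a predictable branch; with it
set, they get the same trylock-and-drain behaviour as in patch 1. The flag
must not change between a begin and its matching end.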