* [RFC] genirq: Fix lockup in handle_edge_irq
@ 2025-07-01 16:35 Liangyan
From: Liangyan @ 2025-07-01 16:35 UTC
To: tglx; +Cc: linux-kernel, Liangyan, Yicong Shen
Yicong reported a softlockup in a guest VM, triggered by the irqbalance
service changing the affinity of a NIC IRQ.
When the affinity of a NIC IRQ is changed from cpu 0 to cpu 1 while cpu 0
is handling the first interrupt of this IRQ in handle_edge_irq(), the
second interrupt is delivered to cpu 1, whose handler sees INPROGRESS,
sets the IRQS_PENDING flag and masks the line. cpu 0 then invokes
handle_irq_event() again after finishing the first interrupt. If the
interval between two interrupts is smaller than the latency of one pass
through the loop in handle_edge_irq() (i.e. unmask_irq() plus
handle_irq_event()), cpu 0 keeps invoking handle_irq_event() and never
exits handle_edge_irq(), which eventually results in a softlockup
(hardlockup detection is not enabled in the guest VM).
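For reference, the tail of handle_edge_irq() in kernel/irq/chip.c looks
roughly like this (an abridged sketch, not the verbatim kernel source):

	do {
		if (unlikely(!desc->action)) {
			mask_irq(desc);
			break;
		}

		/*
		 * An interrupt that arrived on another CPU while this
		 * one was handled has been masked and marked
		 * IRQS_PENDING. Unmask the line again, unless the
		 * interrupt got disabled in the meantime.
		 */
		if (unlikely(desc->istate & IRQS_PENDING)) {
			if (!irqd_irq_disabled(&desc->irq_data) &&
			    irqd_irq_masked(&desc->irq_data))
				unmask_irq(desc);
		}

		handle_irq_event(desc);

	} while ((desc->istate & IRQS_PENDING) &&
		 !irqd_irq_disabled(&desc->irq_data));

As long as another CPU keeps setting IRQS_PENDING before
handle_irq_event() returns, this CPU never leaves the do/while loop.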
In our production guest VMs, which carry heavy network traffic, the NIC
raises more than 1000 interrupts per second and every mask/unmask of the
IRQ traps to the host and takes more than 1ms. A new interrupt therefore
almost always arrives before one pass through the loop completes, so this
softlockup is easy to reproduce. With bpftrace we observed cpu 0 invoking
handle_irq_event() more than 5000 times within a single handle_edge_irq()
invocation when the softlockup occurred.
To fix this, defer the unmask_irq() call until the loop has drained
IRQS_PENDING, which bounds the number of handle_irq_event() invocations.
The race looks like this:
cpu 0                                   cpu 1
handle_edge_irq
  spin_lock
  do {
    unmask_irq if IRQS_PENDING
                                        handle_edge_irq
    handle_irq_event
      istate &= ~IRQS_PENDING
      spin_unlock
                                          spin_lock
                                          istate |= IRQS_PENDING
      handle_irq_event_percpu             mask_ack_irq
                                          spin_unlock
      spin_lock
  } while(istate & IRQS_PENDING)
  spin_unlock
The softlockup traces look something like this:
-----
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [swapper/1:0]
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G L
Hardware name: ByteDance Inc. OpenStack Nova, BIOS
RIP: 0010:__do_softirq+0x78/0x2ac
RSP: 0018:ffffa02a00134f98 EFLAGS: 00000246
RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 00000000ffffffff
RDX: 00000000000000c1 RSI: ffffffff9e801040 RDI: 0000000000000016
RBP: ffffa02a000c7dd8 R08: 000002ea2320b76b R09: 7fffffffffffffff
R10: 000002ea3a1c0080 R11: 00000000002fefff R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000080
FS: 0000000000000000(0000) GS:ffff89323e840000(0000)
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2e5957c000 CR3: 0000000167a9a005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
__irq_exit_rcu+0xb9/0xf0
sysvec_apic_timer_interrupt+0x72/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:cpuidle_enter_state+0xd2/0x400
RSP: 0018:ffffa02a000c7e80 EFLAGS: 00000202
RAX: ffff89323e870bc0 RBX: 0000000000000001 RCX: 00000000ffffffff
RDX: 0000000000000016 RSI: ffffffff9e801040 RDI: 0000000000000000
RBP: ffff89323e87c700 R08: 000002ea22ebdf87 R09: 0000000000000018
R10: 000000000000010d R11: 000000000000020a R12: ffffffff9dab58e0
R13: 000002ea22ebdf87 R14: 0000000000000001 R15: 0000000000000000
cpuidle_enter+0x29/0x40
cpuidle_idle_call+0xfa/0x160
do_idle+0x7b/0xe0
cpu_startup_entry+0x19/0x20
start_secondary+0x116/0x140
secondary_startup_64_no_verify+0xe5/0xeb
</TASK>
Signed-off-by: Liangyan <liangyan.peng@bytedance.com>
Reported-by: Yicong Shen <shenyicong.1023@bytedance.com>
---
kernel/irq/chip.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 2b274007e8ba..9f5c50e75e6b 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -764,6 +764,8 @@ EXPORT_SYMBOL_GPL(handle_fasteoi_nmi);
  */
 void handle_edge_irq(struct irq_desc *desc)
 {
+	bool need_unmask = false;
+
 	guard(raw_spinlock)(&desc->lock);
 
 	if (!irq_can_handle(desc)) {
@@ -791,12 +793,16 @@ void handle_edge_irq(struct irq_desc *desc)
 		if (unlikely(desc->istate & IRQS_PENDING)) {
 			if (!irqd_irq_disabled(&desc->irq_data) &&
 			    irqd_irq_masked(&desc->irq_data))
-				unmask_irq(desc);
+				need_unmask = true;
 		}
 
 		handle_irq_event(desc);
 
 	} while ((desc->istate & IRQS_PENDING) && !irqd_irq_disabled(&desc->irq_data));
+
+	if (need_unmask && !irqd_irq_disabled(&desc->irq_data) &&
+	    irqd_irq_masked(&desc->irq_data))
+		unmask_irq(desc);
 }
 EXPORT_SYMBOL(handle_edge_irq);
--
2.20.1
* Re: [RFC] genirq: Fix lockup in handle_edge_irq
From: Thomas Gleixner @ 2025-07-02 13:17 UTC
To: Liangyan; +Cc: linux-kernel, Liangyan, Yicong Shen
On Wed, Jul 02 2025 at 00:35, Liangyan wrote:
>  void handle_edge_irq(struct irq_desc *desc)
>  {
> +	bool need_unmask = false;
> +
>  	guard(raw_spinlock)(&desc->lock);
>
>  	if (!irq_can_handle(desc)) {
> @@ -791,12 +793,16 @@ void handle_edge_irq(struct irq_desc *desc)
>  		if (unlikely(desc->istate & IRQS_PENDING)) {
>  			if (!irqd_irq_disabled(&desc->irq_data) &&
>  			    irqd_irq_masked(&desc->irq_data))
> -				unmask_irq(desc);
> +				need_unmask = true;
>  		}
>
>  		handle_irq_event(desc);
>
>  	} while ((desc->istate & IRQS_PENDING) && !irqd_irq_disabled(&desc->irq_data));
> +
> +	if (need_unmask && !irqd_irq_disabled(&desc->irq_data) &&
> +	    irqd_irq_masked(&desc->irq_data))
> +		unmask_irq(desc);
This might work in your setup by some definition of "works", but it
breaks the semantics of this handler because of this:
device interrupt       CPU0                           CPU1
                       handle_edge_irq()
                         set(INPROGRESS);

                         do {
                            handle_event();

device interrupt
                                                      handle_edge_irq()
                                                      if (INPROGRESS) {
                                                         set(PENDING);
                                                         mask();
                                                         return;
                                                      }
                            ...
                            if (PENDING) {
                               need_unmask = true;
                            }
                            handle_event();

device interrupt   << possible FAIL
because there are enough edge type interrupt controllers out there which
lose an edge when the line is masked at the interrupt controller
level. As edge type interrupts are fire and forget from the device
perspective, the interrupt is not retriggered when unmasking later.
That's the reason why this handler is written the way it is and this
cannot be changed for obvious reasons.
So no, this is not going to happen.
The only possible solution for this is to analyze all interrupt
controllers, which are involved in the delivery chain, and establish
whether they are affected by the above problem. If not, then that
particular delivery chain combination of interrupt controllers can be
changed to use a different flow handler along with a profound
explanation why this is correct under all circumstances.
As you failed to provide any information about the involved controllers,
I cannot even give any hint about a possible solution.
Thanks,
tglx
* Re: [External] Re: [RFC] genirq: Fix lockup in handle_edge_irq
From: Liangyan @ 2025-07-03 15:31 UTC
To: Thomas Gleixner; +Cc: linux-kernel, Yicong Shen, ziqianlu, songmuchun, yuanzhu
Hello Thomas,
We have this softlockup issue in a guest vm, so the related IRQ is from
a virtio-net tx queue, the interrupt controller is the virt pci msix
controller, and the related components are pci_msi_controller,
virtio_pci, virtio_net and qemu.
And according to the qemu msix.c source code, when an irq is unmasked,
qemu will fire a new one if the msix pending bit is set.
So it seems that this msi-x controller will not lose interrupts while
they are masked.
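For reference, the unmask path in qemu's hw/pci/msix.c has roughly the
following shape (a paraphrased sketch, not the verbatim qemu source):

/*
 * Paraphrased sketch of qemu's MSI-X unmask handling (hw/pci/msix.c).
 * When a vector transitions from masked to unmasked and its pending
 * bit is set, the pending bit is cleared and the message is re-sent,
 * so an edge raised while the vector was masked is not lost.
 */
static void msix_handle_mask_update(PCIDevice *dev, int vector,
                                    bool was_masked)
{
    bool is_masked = msix_is_masked(dev, vector);

    if (is_masked == was_masked) {
        return;
    }

    if (!is_masked && msix_is_pending(dev, vector)) {
        msix_clr_pending(dev, vector);
        msix_notify(dev, vector);   /* re-fire the interrupt */
    }
}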
Do you have any suggestion for this virtual MSI-X controller? Thanks.
Regards,
Liangyan
On 2025/7/2 21:17, Thomas Gleixner wrote:
> On Wed, Jul 02 2025 at 00:35, Liangyan wrote:
>>  void handle_edge_irq(struct irq_desc *desc)
>>  {
>> +	bool need_unmask = false;
>> +
>>  	guard(raw_spinlock)(&desc->lock);
>>
>>  	if (!irq_can_handle(desc)) {
>> @@ -791,12 +793,16 @@ void handle_edge_irq(struct irq_desc *desc)
>>  		if (unlikely(desc->istate & IRQS_PENDING)) {
>>  			if (!irqd_irq_disabled(&desc->irq_data) &&
>>  			    irqd_irq_masked(&desc->irq_data))
>> -				unmask_irq(desc);
>> +				need_unmask = true;
>>  		}
>>
>>  		handle_irq_event(desc);
>>
>>  	} while ((desc->istate & IRQS_PENDING) && !irqd_irq_disabled(&desc->irq_data));
>> +
>> +	if (need_unmask && !irqd_irq_disabled(&desc->irq_data) &&
>> +	    irqd_irq_masked(&desc->irq_data))
>> +		unmask_irq(desc);
>
> This might work in your setup by some definition of "works", but it
> breaks the semantics of this handler because of this:
>
> device interrupt       CPU0                           CPU1
>                        handle_edge_irq()
>                          set(INPROGRESS);
>
>                          do {
>                             handle_event();
>
> device interrupt
>                                                       handle_edge_irq()
>                                                       if (INPROGRESS) {
>                                                          set(PENDING);
>                                                          mask();
>                                                          return;
>                                                       }
>                             ...
>                             if (PENDING) {
>                                need_unmask = true;
>                             }
>                             handle_event();
>
> device interrupt   << possible FAIL
>
> because there are enough edge type interrupt controllers out there which
> lose an edge when the line is masked at the interrupt controller
> level. As edge type interrupts are fire and forget from the device
> perspective, the interrupt is not retriggered when unmasking later.
>
> That's the reason why this handler is written the way it is and this
> cannot be changed for obvious reasons.
>
> So no, this is not going to happen.
>
> The only possible solution for this is to analyze all interrupt
> controllers, which are involved in the delivery chain, and establish
> whether they are affected by the above problem. If not, then that
> particular delivery chain combination of interrupt controllers can be
> changed to use a different flow handler along with a profound
> explanation why this is correct under all circumstances.
>
> As you failed to provide any information about the involved controllers,
> I cannot even give any hint about a possible solution.
>
> Thanks,
>
> tglx
>
>
* Re: [External] Re: [RFC] genirq: Fix lockup in handle_edge_irq
From: Thomas Gleixner @ 2025-07-04 14:42 UTC
To: Liangyan; +Cc: linux-kernel, Yicong Shen, ziqianlu, songmuchun, yuanzhu
Liangyan!
Please don't top post and trim your reply. See:
https://people.kernel.org/tglx/notes-about-netiquette
for further explanation.
On Thu, Jul 03 2025 at 23:31, Liangyan wrote:
> We have this softlockup issue in a guest vm, so the related IRQ is from
> a virtio-net tx queue, the interrupt controller is the virt pci msix
> controller, and the related components are pci_msi_controller,
> virtio_pci, virtio_net and qemu.
That's a random list of pieces, which are not necessarily related to the
interrupt control flow. You have to look at the actual interrupt domain
hierarchy of the interrupt in question. /sys/kernel/debug/irq/irqs/$N.
> And according to the qemu msix.c source code, when an irq is unmasked,
> qemu will fire a new one if the msix pending bit is set.
> So it seems that this msi-x controller will not lose interrupts while
> they are masked.
That's correct and behaving according to specification. Though
unfortunately not all PCI-MSI-X implementations are specification
compliant, so we can't do that unconditionally. There is also no way to
detect whether there is a sane implementation in the hardware
[emulation] or not.
So playing games with the unmask is not really feasible. But let's take
a step back and look at the actual problem.
It only happens when the interrupt affinity is moved or the interrupt
has multiple target CPUs enabled in the effective affinity mask. x86 and
arm64 enforce the effective affinity to be a single CPU, so on those
architectures the problem only arises when the interrupt affinity
changes.
Now we can use that fact and check whether the CPU, which observes
INPROGRESS, is the target CPU in the effective affinity mask. If so,
then the obvious cure is to busy poll the INPROGRESS flag instead of
doing the mask()/PENDING/unmask() dance.
Something like the uncompiled and therefore untested patch below should
do the trick. If you find bugs in it, you can keep and fix them :)
Thanks,
tglx
---
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -449,22 +449,31 @@ void unmask_threaded_irq(struct irq_desc
 	unmask_irq(desc);
 }
 
-static bool irq_check_poll(struct irq_desc *desc)
+/* Busy wait until INPROGRESS is cleared */
+static bool irq_wait_on_inprogress(struct irq_desc *desc)
 {
-	if (!(desc->istate & IRQS_POLL_INPROGRESS))
-		return false;
-	return irq_wait_for_poll(desc);
+	if (IS_ENABLED(CONFIG_SMP)) {
+		do {
+			raw_spin_unlock(&desc->lock);
+			while (irqd_irq_inprogress(&desc->irq_data))
+				cpu_relax();
+			raw_spin_lock(&desc->lock);
+		} while (irqd_irq_inprogress(&desc->irq_data));
+	}
+	/* Might have been disabled in meantime */
+	return !irqd_irq_disabled(&desc->irq_data) && desc->action;
 }
 
 static bool irq_can_handle_pm(struct irq_desc *desc)
 {
-	unsigned int mask = IRQD_IRQ_INPROGRESS | IRQD_WAKEUP_ARMED;
+	struct irq_data *irqd = &desc->irq_data;
+	const struct cpumask *aff;
 
 	/*
 	 * If the interrupt is not in progress and is not an armed
 	 * wakeup interrupt, proceed.
 	 */
-	if (!irqd_has_set(&desc->irq_data, mask))
+	if (!irqd_has_set(irqd, IRQD_IRQ_INPROGRESS | IRQD_WAKEUP_ARMED))
 		return true;
 
 	/*
@@ -472,13 +481,53 @@ static bool irq_can_handle_pm(struct irq
 	 * and suspended, disable it and notify the pm core about the
 	 * event.
 	 */
-	if (irq_pm_check_wakeup(desc))
+	if (unlikely(irqd_has_set(irqd, IRQD_WAKEUP_ARMED))) {
+		irq_pm_handle_wakeup(desc);
+		return false;
+	}
+
+	/* Check whether the interrupt is polled on another CPU */
+	if (unlikely(desc->istate & IRQS_POLL_INPROGRESS)) {
+		if (WARN_ONCE(irq_poll_cpu == smp_processor_id(),
+			      "irq poll in progress on cpu %d for irq %d\n",
+			      smp_processor_id(), desc->irq_data.irq))
+			return false;
+		return irq_wait_on_inprogress(desc);
+	}
+
+	if (!IS_ENABLED(CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK) ||
+	    !irqd_is_single_target(irqd) || desc->handle_irq != handle_edge_irq)
 		return false;
 
 	/*
-	 * Handle a potential concurrent poll on a different core.
+	 * If the interrupt affinity was moved to this CPU and the
+	 * interrupt is currently handled on the previous target CPU, then
+	 * busy wait for INPROGRESS to be cleared. Otherwise for edge type
+	 * interrupts the handler might get stuck on the previous target:
+	 *
+	 * CPU 0                        CPU 1 (new target)
+	 * handle_edge_irq()
+	 * repeat:
+	 *	handle_event()		handle_edge_irq()
+	 *				if (INPROGRESS) {
+	 *				    set(PENDING);
+	 *				    mask();
+	 *				    return;
+	 *				}
+	 *	if (PENDING) {
+	 *	    clear(PENDING);
+	 *	    unmask();
+	 *	    goto repeat;
+	 *	}
+	 *
+	 * This happens when the device raises interrupts with a high rate
+	 * and always before handle_event() completes and the CPU0 handler
+	 * can clear INPROGRESS. This has been observed in virtual machines.
 	 */
-	return irq_check_poll(desc);
+	aff = irq_data_get_effective_affinity_mask(irqd);
+	if (cpumask_first(aff) != smp_processor_id())
+		return false;
+	return irq_wait_on_inprogress(desc);
 }
 
 static inline bool irq_can_handle_actions(struct irq_desc *desc)
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -20,6 +20,7 @@
 #define istate core_internal_state__do_not_mess_with_it
 
 extern bool noirqdebug;
+extern int irq_poll_cpu;
 
 extern struct irqaction chained_action;
 
@@ -112,7 +113,6 @@ irqreturn_t handle_irq_event(struct irq_
 int check_irq_resend(struct irq_desc *desc, bool inject);
 void clear_irq_resend(struct irq_desc *desc);
 void irq_resend_init(struct irq_desc *desc);
-bool irq_wait_for_poll(struct irq_desc *desc);
 void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action);
 void wake_threads_waitq(struct irq_desc *desc);
 
@@ -277,11 +277,11 @@ static inline bool irq_is_nmi(struct irq
 }
 
 #ifdef CONFIG_PM_SLEEP
-bool irq_pm_check_wakeup(struct irq_desc *desc);
+void irq_pm_handle_wakeup(struct irq_desc *desc);
 void irq_pm_install_action(struct irq_desc *desc, struct irqaction *action);
 void irq_pm_remove_action(struct irq_desc *desc, struct irqaction *action);
 #else
-static inline bool irq_pm_check_wakeup(struct irq_desc *desc) { return false; }
+static inline void irq_pm_handle_wakeup(struct irq_desc *desc) { }
 static inline void
 irq_pm_install_action(struct irq_desc *desc, struct irqaction *action) { }
 static inline void
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -13,17 +13,13 @@
 
 #include "internals.h"
 
-bool irq_pm_check_wakeup(struct irq_desc *desc)
+void irq_pm_handle_wakeup(struct irq_desc *desc)
 {
-	if (irqd_is_wakeup_armed(&desc->irq_data)) {
-		irqd_clear(&desc->irq_data, IRQD_WAKEUP_ARMED);
-		desc->istate |= IRQS_SUSPENDED | IRQS_PENDING;
-		desc->depth++;
-		irq_disable(desc);
-		pm_system_irq_wakeup(irq_desc_get_irq(desc));
-		return true;
-	}
-	return false;
+	irqd_clear(&desc->irq_data, IRQD_WAKEUP_ARMED);
+	desc->istate |= IRQS_SUSPENDED | IRQS_PENDING;
+	desc->depth++;
+	irq_disable(desc);
+	pm_system_irq_wakeup(irq_desc_get_irq(desc));
 }
 
 /*
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -19,45 +19,10 @@ static int irqfixup __read_mostly;
 #define POLL_SPURIOUS_IRQ_INTERVAL (HZ/10)
 static void poll_spurious_irqs(struct timer_list *unused);
 static DEFINE_TIMER(poll_spurious_irq_timer, poll_spurious_irqs);
-static int irq_poll_cpu;
+int irq_poll_cpu;
 static atomic_t irq_poll_active;
 
 /*
- * We wait here for a poller to finish.
- *
- * If the poll runs on this CPU, then we yell loudly and return
- * false. That will leave the interrupt line disabled in the worst
- * case, but it should never happen.
- *
- * We wait until the poller is done and then recheck disabled and
- * action (about to be disabled). Only if it's still active, we return
- * true and let the handler run.
- */
-bool irq_wait_for_poll(struct irq_desc *desc)
-{
-	lockdep_assert_held(&desc->lock);
-
-	if (WARN_ONCE(irq_poll_cpu == smp_processor_id(),
-		      "irq poll in progress on cpu %d for irq %d\n",
-		      smp_processor_id(), desc->irq_data.irq))
-		return false;
-
-#ifdef CONFIG_SMP
-	do {
-		raw_spin_unlock(&desc->lock);
-		while (irqd_irq_inprogress(&desc->irq_data))
-			cpu_relax();
-		raw_spin_lock(&desc->lock);
-	} while (irqd_irq_inprogress(&desc->irq_data));
-	/* Might have been disabled in meantime */
-	return !irqd_irq_disabled(&desc->irq_data) && desc->action;
-#else
-	return false;
-#endif
-}
-
-
-/*
  * Recovery handler for misrouted interrupts.
  */
 static bool try_one_irq(struct irq_desc *desc, bool force)
* Re: [External] Re: [RFC] genirq: Fix lockup in handle_edge_irq
From: Liangyan @ 2025-07-04 15:36 UTC
To: Thomas Gleixner, Liangyan
Cc: linux-kernel, Yicong Shen, ziqianlu, songmuchun, yuanzhu
On 2025/7/4 22:42, Thomas Gleixner wrote:
> Liangyan!
>
> Please don't top post and trim your reply. See:
>
> https://people.kernel.org/tglx/notes-about-netiquette
Got it, thanks for the guidance, Thomas!
>
> for further explanation.
>
> On Thu, Jul 03 2025 at 23:31, Liangyan wrote:
>> We have this softlockup issue in a guest vm, so the related IRQ is from
>> a virtio-net tx queue, the interrupt controller is the virt pci msix
>> controller, and the related components are pci_msi_controller,
>> virtio_pci, virtio_net and qemu.
>
> That's a random list of pieces, which are not necessarily related to the
> interrupt control flow. You have to look at the actual interrupt domain
> hierarchy of the interrupt in question. /sys/kernel/debug/irq/irqs/$N.
>
>> And according to the qemu msix.c source code, when an irq is unmasked,
>> qemu will fire a new one if the msix pending bit is set.
>> So it seems that this msi-x controller will not lose interrupts while
>> they are masked.
>
> That's correct and behaving according to specification. Though
> unfortunately not all PCI-MSI-X implementations are specification
> compliant, so we can't do that unconditionally. There is also no way to
> detect whether there is a sane implementation in the hardware
> [emulation] or not.
>
> So playing games with the unmask is not really feasible. But let's take
> a step back and look at the actual problem.
>
> It only happens when the interrupt affinity is moved or the interrupt
> has multiple target CPUs enabled in the effective affinity mask. x86 and
> arm64 enforce the effective affinity to be a single CPU, so on those
> architectures the problem only arises when the interrupt affinity
> changes.
>
> Now we can use that fact and check whether the CPU, which observes
> INPROGRESS, is the target CPU in the effective affinity mask. If so,
> then the obvious cure is to busy poll the INPROGRESS flag instead of
> doing the mask()/PENDING/unmask() dance.
>
> Something like the uncompiled and therefore untested patch below should
> do the trick. If you find bugs in it, you can keep and fix them :)
Great, thanks for the patch. I will test it and report back later.
Regards,
Liangyan
* Re: [External] Re: [RFC] genirq: Fix lockup in handle_edge_irq
From: Liangyan @ 2025-07-08 1:43 UTC
To: Thomas Gleixner; +Cc: linux-kernel, Yicong Shen, ziqianlu, songmuchun, yuanzhu
On 2025/7/4 22:42, Thomas Gleixner wrote:
>
> So playing games with the unmask is not really feasible. But let's take
> a step back and look at the actual problem.
>
> It only happens when the interrupt affinity is moved or the interrupt
> has multiple target CPUs enabled in the effective affinity mask. x86 and
> arm64 enforce the effective affinity to be a single CPU, so on those
> architectures the problem only arises when the interrupt affinity
> changes.
>
> Now we can use that fact and check whether the CPU, which observes
> INPROGRESS, is the target CPU in the effective affinity mask. If so,
> then the obvious cure is to busy poll the INPROGRESS flag instead of
> doing the mask()/PENDING/unmask() dance.
>
> Something like the uncompiled and therefore untested patch below should
> do the trick. If you find bugs in it, you can keep and fix them :)
>
Hello Thomas,
After more than a day of testing, the softlockup has not recurred, so
your patch appears to fix the problem. Thanks.
Would you consider upstreaming this patch?
Regards,
Liangyan