* [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts
@ 2025-11-25 21:50 Thomas Gleixner
2025-11-25 21:50 ` [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI Thomas Gleixner
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Thomas Gleixner @ 2025-11-25 21:50 UTC (permalink / raw)
To: LKML; +Cc: x86, Luigi Rizzo

A small update to V1 which can be found here:

   https://lore.kernel.org/lkml/20251125101912.564125647@linutronix.de

Luigi reported that the retrigger mechanism for posted MSI interrupts is
broken. That happens because retrigger sends an IPI to the actual allocated
vector, which is handled correctly, but lacks an EOI. That leaves a stale
APIC ISR bit around.
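
The effect of the missing EOI can be sketched with a small userspace model. This is an illustration only and not part of the series: `struct apic_model` and its helpers are hypothetical names, and the model reduces the local APIC to its in-service register (ISR), assuming that a vector is dispatched only if its priority class (vector >> 4) is above the class of the highest in-service vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: only the in-service register of a local APIC. */
struct apic_model {
	uint8_t isr[256 / 8];	/* one bit per vector */
};

/* Return the highest vector marked in-service, or -1 if none is. */
static int apic_highest_isr(const struct apic_model *apic)
{
	for (int vec = 255; vec >= 0; vec--) {
		if (apic->isr[vec / 8] & (1u << (vec % 8)))
			return vec;
	}
	return -1;
}

/*
 * Try to dispatch @vec. Dispatch succeeds only if the priority class
 * (upper four bits of the vector) is above the class of the highest
 * in-service vector. On success the ISR bit is set.
 */
static bool apic_dispatch(struct apic_model *apic, int vec)
{
	int isrv = apic_highest_isr(apic);

	assert(vec >= 16 && vec <= 255);
	if (isrv >= 0 && (vec >> 4) <= (isrv >> 4))
		return false;
	apic->isr[vec / 8] |= (uint8_t)(1u << (vec % 8));
	return true;
}

/* EOI clears the highest in-service bit. */
static void apic_eoi_model(struct apic_model *apic)
{
	int isrv = apic_highest_isr(apic);

	if (isrv >= 0)
		apic->isr[isrv / 8] &= (uint8_t)~(1u << (isrv % 8));
}

/*
 * Retrigger scenario: the device vector is raised by IPI and handled.
 * Returns true if a lower priority vector can be dispatched afterwards.
 */
static bool lower_vector_deliverable(bool handler_issues_eoi)
{
	struct apic_model apic = { { 0 } };

	/* Retriggered device vector (arbitrary example value) */
	if (!apic_dispatch(&apic, 0x61))
		return false;
	if (handler_issues_eoi)
		apic_eoi_model(&apic);
	/* Unrelated vector in a lower priority class */
	return apic_dispatch(&apic, 0x41);
}
```

In this model lower_vector_deliverable(false) returns false, which corresponds to the stale ISR bit, while lower_vector_deliverable(true) returns true.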

The following series addresses this and does some related cleanups in that
area on top.

Changes vs. V1:

  Use __this_cpu_*() - Luigi

Thanks,

	tglx
---
 arch/x86/include/asm/irq_remapping.h |   12 ++++++-
 arch/x86/kernel/irq.c                |   54 +++++++++++++++++++++++------------
 drivers/iommu/intel/irq_remapping.c  |   12 +++----
 3 files changed, 52 insertions(+), 26 deletions(-)

* [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI
2025-11-25 21:50 [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Thomas Gleixner
@ 2025-11-25 21:50 ` Thomas Gleixner
2025-12-17 17:48 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2025-11-25 21:50 ` [patch V2 2/3] x86/irq: Cleanup posted MSI code Thomas Gleixner
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2025-11-25 21:50 UTC (permalink / raw)
To: LKML; +Cc: x86, Luigi Rizzo, stable
From: Thomas Gleixner <tglx@linutronix.de>

Luigi reported that retriggering a posted MSI interrupt does not work
correctly.

The reason is that the retrigger happens at the vector domain by sending an
IPI to the actual vector on the target CPU. That works correctly exactly
once because the posted MSI interrupt chip does not issue an EOI as that's
only required for the posted MSI notification vector itself.

As a consequence the vector becomes stale in the ISR, which not only
affects this vector but also any lower priority vector in the affected APIC
because the ISR bit is not cleared.

Luigi proposed to set the vector in the remap PIR bitmap and raise the
posted MSI notification vector. That works, but that still does not cure a
related problem:

  If there is ever a stray interrupt on such a vector, then the related
  APIC ISR bit becomes stale due to the lack of EOI as described above.
  Unlikely to happen, but if it happens it's not debuggable at all.

So instead of playing games with the PIR, this can be actually solved
for both cases by:

 1) Keeping track of the posted interrupt vector handler state

 2) Implementing a posted MSI specific irq_ack() callback which checks that
    state. If the posted vector handler is inactive it issues an EOI,
    otherwise it delegates that to the posted handler.

This is correct versus affinity changes and concurrent events on the posted
vector as the actual handler invocation is serialized through the interrupt
descriptor lock.

Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Reported-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luigi Rizzo <lrizzo@google.com>
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com
---
V2: Use __this_cpu...() - Luigi
---
 arch/x86/include/asm/irq_remapping.h |    7 +++++++
 arch/x86/kernel/irq.c                |   23 +++++++++++++++++++++++
 drivers/iommu/intel/irq_remapping.c  |    8 ++++----
 3 files changed, 34 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -87,4 +87,11 @@ static inline void panic_if_irq_remap(co
 }
 
 #endif /* CONFIG_IRQ_REMAP */
+
+#ifdef CONFIG_X86_POSTED_MSI
+void intel_ack_posted_msi_irq(struct irq_data *irqd);
+#else
+#define intel_ack_posted_msi_irq	NULL
+#endif
+
 #endif /* __X86_IRQ_REMAPPING_H */
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -396,6 +396,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm
 
 /* Posted Interrupt Descriptors for coalesced MSIs to be posted */
 DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
+static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
 
 void intel_posted_msi_init(void)
 {
@@ -413,6 +414,25 @@ void intel_posted_msi_init(void)
 	this_cpu_write(posted_msi_pi_desc.ndst, destination);
 }
 
+void intel_ack_posted_msi_irq(struct irq_data *irqd)
+{
+	irq_move_irq(irqd);
+
+	/*
+	 * Handle the rare case that irq_retrigger() raised the actual
+	 * assigned vector on the target CPU, which means that it was not
+	 * invoked via the posted MSI handler below. In that case APIC EOI
+	 * is required as otherwise the ISR entry becomes stale and lower
+	 * priority interrupts are never going to be delivered after that.
+	 *
+	 * If the posted handler invoked the device interrupt handler then
+	 * the EOI would be premature because it would acknowledge the
+	 * posted vector.
+	 */
+	if (unlikely(!__this_cpu_read(posted_msi_handler_active)))
+		apic_eoi();
+}
+
 static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs)
 {
 	unsigned long pir_copy[NR_PIR_WORDS];
@@ -445,6 +465,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
 
 	pid = this_cpu_ptr(&posted_msi_pi_desc);
 
+	/* Mark the handler active for intel_ack_posted_msi_irq() */
+	__this_cpu_write(posted_msi_handler_active, true);
 	inc_irq_stat(posted_msi_notification_count);
 	irq_enter();
 
@@ -473,6 +495,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
 
 	apic_eoi();
 	irq_exit();
+	__this_cpu_write(posted_msi_handler_active, false);
 	set_irq_regs(old_regs);
 }
 #endif /* X86_POSTED_MSI */
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1303,17 +1303,17 @@ static struct irq_chip intel_ir_chip = {
  *	irq_enter();
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *	apic_eoi()
@@ -1322,7 +1322,7 @@ static struct irq_chip intel_ir_chip = {
  */
 static struct irq_chip intel_ir_chip_post_msi = {
 	.name			= "INTEL-IR-POST",
-	.irq_ack		= irq_move_irq,
+	.irq_ack		= intel_ack_posted_msi_irq,
 	.irq_set_affinity	= intel_ir_set_affinity,
 	.irq_compose_msi_msg	= intel_ir_compose_msi_msg,
 	.irq_set_vcpu_affinity	= intel_ir_set_vcpu_affinity,

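
The decision made by the posted MSI specific irq_ack() callback can be modeled in userspace as follows. This is a sketch under simplifying assumptions, not the kernel code: the per-CPU flag becomes a plain global, `eoi_count` stands in for apic_eoi(), irq_move_irq() is left out, and all `model_*` names are hypothetical. It shows that each hardware-delivered vector ends up with exactly one EOI on both paths.

```c
#include <assert.h>
#include <stdbool.h>

static bool handler_active;	/* stands in for posted_msi_handler_active */
static int eoi_count;		/* counts what would be apic_eoi() calls */

/* Mirrors the check in intel_ack_posted_msi_irq(), minus irq_move_irq(). */
static void model_ack_posted_msi_irq(void)
{
	if (!handler_active)
		eoi_count++;
}

/* Device interrupt flow: ack first, the driver handler is omitted. */
static void model_handle_edge_irq(void)
{
	model_ack_posted_msi_irq();
}

/* Notification vector path: @nr_msis coalesced MSIs, one EOI at the end. */
static int model_posted_notification(int nr_msis)
{
	assert(nr_msis >= 0);
	eoi_count = 0;
	handler_active = true;
	for (int i = 0; i < nr_msis; i++)
		model_handle_edge_irq();
	eoi_count++;		/* the notification handler's own EOI */
	handler_active = false;
	return eoi_count;
}

/* Retrigger path: the device vector itself was raised by IPI. */
static int model_retriggered_vector(void)
{
	eoi_count = 0;
	model_handle_edge_irq();
	return eoi_count;
}
```

Both model_posted_notification(3) and model_retriggered_vector() return 1 here: three coalesced MSIs share the single EOI of the notification vector, and the retriggered vector gets its own.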
* [tip: x86/urgent] x86/msi: Make irq_retrigger() functional for posted MSI
2025-11-25 21:50 ` [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI Thomas Gleixner
@ 2025-12-17 17:48 ` tip-bot2 for Thomas Gleixner
0 siblings, 0 replies; 11+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-12-17 17:48 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Luigi Rizzo, Thomas Gleixner, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     0edc78b82bea85e1b2165d8e870a5c3535919695
Gitweb:        https://git.kernel.org/tip/0edc78b82bea85e1b2165d8e870a5c3535919695
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Nov 2025 22:50:45 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 17 Dec 2025 18:41:52 +01:00

x86/msi: Make irq_retrigger() functional for posted MSI

Luigi reported that retriggering a posted MSI interrupt does not work
correctly.

The reason is that the retrigger happens at the vector domain by sending an
IPI to the actual vector on the target CPU. That works correctly exactly
once because the posted MSI interrupt chip does not issue an EOI as that's
only required for the posted MSI notification vector itself.

As a consequence the vector becomes stale in the ISR, which not only
affects this vector but also any lower priority vector in the affected APIC
because the ISR bit is not cleared.

Luigi proposed to set the vector in the remap PIR bitmap and raise the
posted MSI notification vector. That works, but that still does not cure a
related problem:

  If there is ever a stray interrupt on such a vector, then the related
  APIC ISR bit becomes stale due to the lack of EOI as described above.
  Unlikely to happen, but if it happens it's not debuggable at all.

So instead of playing games with the PIR, this can be actually solved
for both cases by:

 1) Keeping track of the posted interrupt vector handler state

 2) Implementing a posted MSI specific irq_ack() callback which checks that
    state. If the posted vector handler is inactive it issues an EOI,
    otherwise it delegates that to the posted handler.

This is correct versus affinity changes and concurrent events on the posted
vector as the actual handler invocation is serialized through the interrupt
descriptor lock.

Fixes: ed1e48ea4370 ("iommu/vt-d: Enable posted mode for device MSIs")
Reported-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Luigi Rizzo <lrizzo@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251125214631.044440658@linutronix.de
Closes: https://lore.kernel.org/lkml/20251124104836.3685533-1-lrizzo@google.com
---
 arch/x86/include/asm/irq_remapping.h |  7 +++++++
 arch/x86/kernel/irq.c                | 23 +++++++++++++++++++++++
 drivers/iommu/intel/irq_remapping.c  |  8 ++++----
 3 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 5a0d424..4e55d17 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -87,4 +87,11 @@ static inline void panic_if_irq_remap(const char *msg)
 }
 
 #endif /* CONFIG_IRQ_REMAP */
+
+#ifdef CONFIG_X86_POSTED_MSI
+void intel_ack_posted_msi_irq(struct irq_data *irqd);
+#else
+#define intel_ack_posted_msi_irq	NULL
+#endif
+
 #endif /* __X86_IRQ_REMAPPING_H */
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 86f4e57..b2fe618 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -397,6 +397,7 @@ DEFINE_IDTENTRY_SYSVEC_SIMPLE(sysvec_kvm_posted_intr_nested_ipi)
 
 /* Posted Interrupt Descriptors for coalesced MSIs to be posted */
 DEFINE_PER_CPU_ALIGNED(struct pi_desc, posted_msi_pi_desc);
+static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
 
 void intel_posted_msi_init(void)
 {
@@ -414,6 +415,25 @@ void intel_posted_msi_init(void)
 	this_cpu_write(posted_msi_pi_desc.ndst, destination);
 }
 
+void intel_ack_posted_msi_irq(struct irq_data *irqd)
+{
+	irq_move_irq(irqd);
+
+	/*
+	 * Handle the rare case that irq_retrigger() raised the actual
+	 * assigned vector on the target CPU, which means that it was not
+	 * invoked via the posted MSI handler below. In that case APIC EOI
+	 * is required as otherwise the ISR entry becomes stale and lower
+	 * priority interrupts are never going to be delivered after that.
+	 *
+	 * If the posted handler invoked the device interrupt handler then
+	 * the EOI would be premature because it would acknowledge the
+	 * posted vector.
+	 */
+	if (unlikely(!__this_cpu_read(posted_msi_handler_active)))
+		apic_eoi();
+}
+
 static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_regs *regs)
 {
 	unsigned long pir_copy[NR_PIR_WORDS];
@@ -446,6 +466,8 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 
 	pid = this_cpu_ptr(&posted_msi_pi_desc);
 
+	/* Mark the handler active for intel_ack_posted_msi_irq() */
+	__this_cpu_write(posted_msi_handler_active, true);
 	inc_irq_stat(posted_msi_notification_count);
 	irq_enter();
 
@@ -474,6 +496,7 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 
 	apic_eoi();
 	irq_exit();
+	__this_cpu_write(posted_msi_handler_active, false);
 	set_irq_regs(old_regs);
 }
 #endif /* X86_POSTED_MSI */
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 4f9b01d..8bcbfe3 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1303,17 +1303,17 @@ static struct irq_chip intel_ir_chip = {
  *	irq_enter();
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *		handle_edge_irq()
  *			irq_chip_ack_parent()
- *				irq_move_irq(); // No EOI
+ *				intel_ack_posted_msi_irq(); // No EOI
  *			handle_irq_event()
  *				driver_handler()
  *	apic_eoi()
@@ -1322,7 +1322,7 @@ static struct irq_chip intel_ir_chip = {
  */
 static struct irq_chip intel_ir_chip_post_msi = {
 	.name			= "INTEL-IR-POST",
-	.irq_ack		= irq_move_irq,
+	.irq_ack		= intel_ack_posted_msi_irq,
 	.irq_set_affinity	= intel_ir_set_affinity,
 	.irq_compose_msi_msg	= intel_ir_compose_msi_msg,
 	.irq_set_vcpu_affinity	= intel_ir_set_vcpu_affinity,

* [patch V2 2/3] x86/irq: Cleanup posted MSI code
2025-11-25 21:50 [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Thomas Gleixner
2025-11-25 21:50 ` [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI Thomas Gleixner
@ 2025-11-25 21:50 ` Thomas Gleixner
2025-12-17 17:48 ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
2025-11-25 21:50 ` [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported() Thomas Gleixner
2025-12-17 17:03 ` [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Luigi Rizzo
3 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2025-11-25 21:50 UTC (permalink / raw)
To: LKML; +Cc: x86, Luigi Rizzo

Make code and comments readable and use __this_cpu..() as this is
guaranteed to be invoked with interrupts disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Use __this_cpu...() - Luigi
---
 arch/x86/kernel/irq.c |   31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -400,11 +400,9 @@ static DEFINE_PER_CPU_CACHE_HOT(bool, po
 
 void intel_posted_msi_init(void)
 {
-	u32 destination;
-	u32 apic_id;
+	u32 destination, apic_id;
 
 	this_cpu_write(posted_msi_pi_desc.nv, POSTED_MSI_NOTIFICATION_VECTOR);
-
 	/*
 	 * APIC destination ID is stored in bit 8:15 while in XAPIC mode.
 	 * VT-d spec. CH 9.11
@@ -448,8 +446,8 @@ static __always_inline bool handle_pendi
 }
 
 /*
- * Performance data shows that 3 is good enough to harvest 90+% of the benefit
- * on high IRQ rate workload.
+ * Performance data shows that 3 is good enough to harvest 90+% of the
+ * benefit on high interrupt rate workloads.
  */
 #define MAX_POSTED_MSI_COALESCING_LOOP 3
 
@@ -459,11 +457,8 @@ static __always_inline bool handle_pendi
  */
 DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 {
+	struct pi_desc *pid = this_cpu_ptr(&posted_msi_pi_desc);
 	struct pt_regs *old_regs = set_irq_regs(regs);
-	struct pi_desc *pid;
-	int i = 0;
-
-	pid = this_cpu_ptr(&posted_msi_pi_desc);
 
 	/* Mark the handler active for intel_ack_posted_msi_irq() */
 	__this_cpu_write(posted_msi_handler_active, true);
@@ -471,25 +466,25 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi
 	irq_enter();
 
 	/*
-	 * Max coalescing count includes the extra round of handle_pending_pir
-	 * after clearing the outstanding notification bit. Hence, at most
-	 * MAX_POSTED_MSI_COALESCING_LOOP - 1 loops are executed here.
+	 * Loop only MAX_POSTED_MSI_COALESCING_LOOP - 1 times here to take
+	 * the final handle_pending_pir() invocation after clearing the
+	 * outstanding notification bit into account.
 	 */
-	while (++i < MAX_POSTED_MSI_COALESCING_LOOP) {
+	for (int i = 1; i < MAX_POSTED_MSI_COALESCING_LOOP; i++) {
 		if (!handle_pending_pir(pid->pir, regs))
 			break;
 	}
 
 	/*
-	 * Clear outstanding notification bit to allow new IRQ notifications,
-	 * do this last to maximize the window of interrupt coalescing.
+	 * Clear the outstanding notification bit to rearm the notification
+	 * mechanism.
 	 */
 	pi_clear_on(pid);
 
 	/*
-	 * There could be a race of PI notification and the clearing of ON bit,
-	 * process PIR bits one last time such that handling the new interrupts
-	 * are not delayed until the next IRQ.
+	 * Clearing the ON bit can race with a notification. Process the
+	 * PIR bits one last time so that handling the new interrupts is
+	 * not delayed until the next notification happens.
 	 */
 	handle_pending_pir(pid->pir, regs);
 

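
The loop bound in the notification handler can be checked with a small userspace model. This is an illustrative sketch, not kernel code: `model_handle_pending_pir()` is a hypothetical stand-in which reports pending work for a configurable number of calls, and pi_clear_on() is reduced to a comment. The model counts how often the PIR is scanned per notification, including the final pass after the ON bit is cleared.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_POSTED_MSI_COALESCING_LOOP 3

static int rounds_with_work;	/* how many scans still find pending bits */
static int scans;		/* number of PIR scans performed */

/* Hypothetical stand-in for handle_pending_pir(): true if work was found. */
static bool model_handle_pending_pir(void)
{
	scans++;
	if (rounds_with_work > 0) {
		rounds_with_work--;
		return true;
	}
	return false;
}

/* Returns the number of PIR scans done for one notification. */
static int model_notification(int pending_rounds)
{
	assert(pending_rounds >= 0);
	rounds_with_work = pending_rounds;
	scans = 0;

	/* At most MAX_POSTED_MSI_COALESCING_LOOP - 1 scans in the loop */
	for (int i = 1; i < MAX_POSTED_MSI_COALESCING_LOOP; i++) {
		if (!model_handle_pending_pir())
			break;
	}

	/* pi_clear_on() would happen here, followed by the final scan */
	model_handle_pending_pir();
	return scans;
}
```

Under this model a continuously busy PIR results in exactly MAX_POSTED_MSI_COALESCING_LOOP scans, while an idle one results in two: the scan which terminates the loop and the final one.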
* [tip: x86/irq] x86/irq: Cleanup posted MSI code
2025-11-25 21:50 ` [patch V2 2/3] x86/irq: Cleanup posted MSI code Thomas Gleixner
@ 2025-12-17 17:48 ` tip-bot2 for Thomas Gleixner
2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-12-17 17:48 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/irq branch of tip:

Commit-ID:     329e2051476858f264e2c217c6db4e68e203d5db
Gitweb:        https://git.kernel.org/tip/329e2051476858f264e2c217c6db4e68e203d5db
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Nov 2025 22:50:47 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 17 Dec 2025 18:44:16 +01:00

x86/irq: Cleanup posted MSI code

Make code and comments readable and use __this_cpu..() as this is
guaranteed to be invoked with interrupts disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251125214631.108458942@linutronix.de
---
 arch/x86/kernel/irq.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index b2fe618..7bc640d 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -401,11 +401,9 @@ static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
 
 void intel_posted_msi_init(void)
 {
-	u32 destination;
-	u32 apic_id;
+	u32 destination, apic_id;
 
 	this_cpu_write(posted_msi_pi_desc.nv, POSTED_MSI_NOTIFICATION_VECTOR);
-
 	/*
 	 * APIC destination ID is stored in bit 8:15 while in XAPIC mode.
 	 * VT-d spec. CH 9.11
@@ -449,8 +447,8 @@ static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_reg
 }
 
 /*
- * Performance data shows that 3 is good enough to harvest 90+% of the benefit
- * on high IRQ rate workload.
+ * Performance data shows that 3 is good enough to harvest 90+% of the
+ * benefit on high interrupt rate workloads.
  */
 #define MAX_POSTED_MSI_COALESCING_LOOP 3
 
@@ -460,11 +458,8 @@ static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_reg
  */
 DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 {
+	struct pi_desc *pid = __this_cpu_ptr(&posted_msi_pi_desc);
 	struct pt_regs *old_regs = set_irq_regs(regs);
-	struct pi_desc *pid;
-	int i = 0;
-
-	pid = this_cpu_ptr(&posted_msi_pi_desc);
 
 	/* Mark the handler active for intel_ack_posted_msi_irq() */
 	__this_cpu_write(posted_msi_handler_active, true);
@@ -472,25 +467,25 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 	irq_enter();
 
 	/*
-	 * Max coalescing count includes the extra round of handle_pending_pir
-	 * after clearing the outstanding notification bit. Hence, at most
-	 * MAX_POSTED_MSI_COALESCING_LOOP - 1 loops are executed here.
+	 * Loop only MAX_POSTED_MSI_COALESCING_LOOP - 1 times here to take
+	 * the final handle_pending_pir() invocation after clearing the
+	 * outstanding notification bit into account.
 	 */
-	while (++i < MAX_POSTED_MSI_COALESCING_LOOP) {
+	for (int i = 1; i < MAX_POSTED_MSI_COALESCING_LOOP; i++) {
 		if (!handle_pending_pir(pid->pir, regs))
 			break;
 	}
 
 	/*
-	 * Clear outstanding notification bit to allow new IRQ notifications,
-	 * do this last to maximize the window of interrupt coalescing.
+	 * Clear the outstanding notification bit to rearm the notification
+	 * mechanism.
 	 */
 	pi_clear_on(pid);
 
 	/*
-	 * There could be a race of PI notification and the clearing of ON bit,
-	 * process PIR bits one last time such that handling the new interrupts
-	 * are not delayed until the next IRQ.
+	 * Clearing the ON bit can race with a notification. Process the
+	 * PIR bits one last time so that handling the new interrupts is
+	 * not delayed until the next notification happens.
 	 */
 	handle_pending_pir(pid->pir, regs);
 

* [tip: x86/irq] x86/irq: Cleanup posted MSI code
2025-11-25 21:50 ` [patch V2 2/3] x86/irq: Cleanup posted MSI code Thomas Gleixner
2025-12-17 17:48 ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
@ 2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-12-18 22:03 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/irq branch of tip:

Commit-ID:     4021a6dad720273a95ac3c0816fc48e35e77dace
Gitweb:        https://git.kernel.org/tip/4021a6dad720273a95ac3c0816fc48e35e77dace
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Nov 2025 22:50:47 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 18 Dec 2025 22:59:40 +01:00

x86/irq: Cleanup posted MSI code

Make code and comments readable and use __this_cpu..() as this is
guaranteed to be invoked with interrupts disabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251125214631.108458942@linutronix.de
---
 arch/x86/kernel/irq.c | 31 +++++++++++++------------------
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index b2fe618..d817feb 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -401,11 +401,9 @@ static DEFINE_PER_CPU_CACHE_HOT(bool, posted_msi_handler_active);
 
 void intel_posted_msi_init(void)
 {
-	u32 destination;
-	u32 apic_id;
+	u32 destination, apic_id;
 
 	this_cpu_write(posted_msi_pi_desc.nv, POSTED_MSI_NOTIFICATION_VECTOR);
-
 	/*
 	 * APIC destination ID is stored in bit 8:15 while in XAPIC mode.
 	 * VT-d spec. CH 9.11
@@ -449,8 +447,8 @@ static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_reg
 }
 
 /*
- * Performance data shows that 3 is good enough to harvest 90+% of the benefit
- * on high IRQ rate workload.
+ * Performance data shows that 3 is good enough to harvest 90+% of the
+ * benefit on high interrupt rate workloads.
  */
 #define MAX_POSTED_MSI_COALESCING_LOOP 3
 
@@ -460,11 +458,8 @@ static __always_inline bool handle_pending_pir(unsigned long *pir, struct pt_reg
  */
 DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 {
+	struct pi_desc *pid = this_cpu_ptr(&posted_msi_pi_desc);
 	struct pt_regs *old_regs = set_irq_regs(regs);
-	struct pi_desc *pid;
-	int i = 0;
-
-	pid = this_cpu_ptr(&posted_msi_pi_desc);
 
 	/* Mark the handler active for intel_ack_posted_msi_irq() */
 	__this_cpu_write(posted_msi_handler_active, true);
@@ -472,25 +467,25 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_posted_msi_notification)
 	irq_enter();
 
 	/*
-	 * Max coalescing count includes the extra round of handle_pending_pir
-	 * after clearing the outstanding notification bit. Hence, at most
-	 * MAX_POSTED_MSI_COALESCING_LOOP - 1 loops are executed here.
+	 * Loop only MAX_POSTED_MSI_COALESCING_LOOP - 1 times here to take
+	 * the final handle_pending_pir() invocation after clearing the
+	 * outstanding notification bit into account.
 	 */
-	while (++i < MAX_POSTED_MSI_COALESCING_LOOP) {
+	for (int i = 1; i < MAX_POSTED_MSI_COALESCING_LOOP; i++) {
 		if (!handle_pending_pir(pid->pir, regs))
 			break;
 	}
 
 	/*
-	 * Clear outstanding notification bit to allow new IRQ notifications,
-	 * do this last to maximize the window of interrupt coalescing.
+	 * Clear the outstanding notification bit to rearm the notification
+	 * mechanism.
 	 */
 	pi_clear_on(pid);
 
 	/*
-	 * There could be a race of PI notification and the clearing of ON bit,
-	 * process PIR bits one last time such that handling the new interrupts
-	 * are not delayed until the next IRQ.
+	 * Clearing the ON bit can race with a notification. Process the
+	 * PIR bits one last time so that handling the new interrupts is
+	 * not delayed until the next notification happens.
 	 */
 	handle_pending_pir(pid->pir, regs);
 

* [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported()
2025-11-25 21:50 [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Thomas Gleixner
2025-11-25 21:50 ` [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI Thomas Gleixner
2025-11-25 21:50 ` [patch V2 2/3] x86/irq: Cleanup posted MSI code Thomas Gleixner
@ 2025-11-25 21:50 ` Thomas Gleixner
2025-12-17 17:48 ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
2025-12-17 17:03 ` [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Luigi Rizzo
3 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2025-11-25 21:50 UTC (permalink / raw)
To: LKML; +Cc: x86, Luigi Rizzo

posted_msi_supported() is a misnomer as it actually checks whether it is
enabled or not. Aside of that this does not take CONFIG_X86_POSTED_MSI into
account which is required to actually use it.

Rename it to posted_msi_enabled() and make the return value depend on
CONFIG_X86_POSTED_MSI, which allows the compiler to eliminate the related
dead code and data if disabled:

   text    data     bss     dec     hex filename
  10046     701    3296   14043    36db drivers/iommu/intel/irq_remapping.o
   9904     413    3296   13613    352d drivers/iommu/intel/irq_remapping.o

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/irq_remapping.h |    5 +++--
 drivers/iommu/intel/irq_remapping.c  |    4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -67,9 +67,10 @@ static inline struct irq_domain *arch_ge
 
 extern bool enable_posted_msi;
 
-static inline bool posted_msi_supported(void)
+static inline bool posted_msi_enabled(void)
 {
-	return enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
+	return IS_ENABLED(CONFIG_X86_POSTED_MSI) &&
+		enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
 }
 
 #else /* CONFIG_IRQ_REMAP */
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1368,7 +1368,7 @@ static void intel_irq_remapping_prepare_
 		break;
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		if (posted_msi_supported()) {
+		if (posted_msi_enabled()) {
 			prepare_irte_posted(irte);
 			data->irq_2_iommu.posted_msi = 1;
 		}
@@ -1460,7 +1460,7 @@ static int intel_irq_remapping_alloc(str
 		irq_data->hwirq = (index << 16) + i;
 		irq_data->chip_data = ird;
 
-		if (posted_msi_supported() &&
+		if (posted_msi_enabled() &&
 		    ((info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) ||
 		     (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX)))
 			irq_data->chip = &intel_ir_chip_post_msi;

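
How IS_ENABLED() turns a Kconfig symbol into a compile time constant can be sketched in userspace. The macros below are a simplified imitation of the preprocessor trick in include/linux/kconfig.h and only cover built-in (=y) options. All `SKETCH_*` and `sketch_*` names, the two CONFIG_SKETCH_* symbols and `enable_flag` are made up for illustration. Because the left operand of && is a constant 0 when the option is off, the compiler can discard the rest of the expression and the code depending on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Expands to "0," only for symbols defined as 1, as Kconfig does for =y */
#define SKETCH_ARG_PLACEHOLDER_1		0,
#define SKETCH_SECOND_ARG(ignored, val, ...)	val
#define SKETCH_IS_DEFINED(x)			SKETCH_IS_DEFINED_1(x)
#define SKETCH_IS_DEFINED_1(val) \
	SKETCH_IS_DEFINED_2(SKETCH_ARG_PLACEHOLDER_##val)
#define SKETCH_IS_DEFINED_2(arg1_or_junk) \
	SKETCH_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_ENABLED(option)		SKETCH_IS_DEFINED(option)

/* Hypothetical options: CONFIG_SKETCH_OFF is deliberately left undefined */
#define CONFIG_SKETCH_ON 1

/* Both results are integer constant expressions */
static_assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_ON) == 1,
	      "defined option must evaluate to 1");
static_assert(SKETCH_IS_ENABLED(CONFIG_SKETCH_OFF) == 0,
	      "undefined option must evaluate to 0");

static bool enable_flag = true;	/* models a runtime switch */

static bool sketch_feature_on_enabled(void)
{
	return SKETCH_IS_ENABLED(CONFIG_SKETCH_ON) && enable_flag;
}

static bool sketch_feature_off_enabled(void)
{
	/* Constant 0 on the left: the runtime switch is never evaluated */
	return SKETCH_IS_ENABLED(CONFIG_SKETCH_OFF) && enable_flag;
}
```

With the runtime switch set, sketch_feature_on_enabled() returns true and sketch_feature_off_enabled() returns false, the latter without reading `enable_flag`.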
* [tip: x86/irq] x86/irq_remapping: Sanitize posted_msi_supported()
2025-11-25 21:50 ` [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported() Thomas Gleixner
@ 2025-12-17 17:48 ` tip-bot2 for Thomas Gleixner
2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-12-17 17:48 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/irq branch of tip:

Commit-ID:     64d4c88270cf90089434e3db67ed443fd982a9a2
Gitweb:        https://git.kernel.org/tip/64d4c88270cf90089434e3db67ed443fd982a9a2
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Nov 2025 22:50:49 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 17 Dec 2025 18:44:17 +01:00

x86/irq_remapping: Sanitize posted_msi_supported()

posted_msi_supported() is a misnomer as it actually checks whether it is
enabled or not. Aside of that this does not take CONFIG_X86_POSTED_MSI into
account which is required to actually use it.

Rename it to posted_msi_enabled() and make the return value depend on
CONFIG_X86_POSTED_MSI, which allows the compiler to eliminate the related
dead code and data if disabled:

   text    data     bss     dec     hex filename
  10046     701    3296   14043    36db drivers/iommu/intel/irq_remapping.o
   9904     413    3296   13613    352d drivers/iommu/intel/irq_remapping.o

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251125214631.170499997@linutronix.de
---
 arch/x86/include/asm/irq_remapping.h | 5 +++--
 drivers/iommu/intel/irq_remapping.c  | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 4e55d17..37b94f4 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -67,9 +67,10 @@ static inline struct irq_domain *arch_get_ir_parent_domain(void)
 
 extern bool enable_posted_msi;
 
-static inline bool posted_msi_supported(void)
+static inline bool posted_msi_enabled(void)
 {
-	return enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
+	return IS_ENABLED(CONFIG_X86_POSTED_MSI) &&
+		enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
 }
 
 #else /* CONFIG_IRQ_REMAP */
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 8bcbfe3..ecb591e 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1368,7 +1368,7 @@ static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
 		break;
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		if (posted_msi_supported()) {
+		if (posted_msi_enabled()) {
 			prepare_irte_posted(irte);
 			data->irq_2_iommu.posted_msi = 1;
 		}
@@ -1460,7 +1460,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 		irq_data->hwirq = (index << 16) + i;
 		irq_data->chip_data = ird;
 
-		if (posted_msi_supported() &&
+		if (posted_msi_enabled() &&
 		    ((info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) ||
 		     (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX)))
 			irq_data->chip = &intel_ir_chip_post_msi;

* [tip: x86/irq] x86/irq_remapping: Sanitize posted_msi_supported()
2025-11-25 21:50 ` [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported() Thomas Gleixner
2025-12-17 17:48 ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
@ 2025-12-18 22:03 ` tip-bot2 for Thomas Gleixner
1 sibling, 0 replies; 11+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2025-12-18 22:03 UTC (permalink / raw)
To: linux-tip-commits; +Cc: Thomas Gleixner, x86, linux-kernel

The following commit has been merged into the x86/irq branch of tip:

Commit-ID:     d441e38a2c87824afc7e656e634e55141d015307
Gitweb:        https://git.kernel.org/tip/d441e38a2c87824afc7e656e634e55141d015307
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 25 Nov 2025 22:50:49 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 18 Dec 2025 22:59:40 +01:00

x86/irq_remapping: Sanitize posted_msi_supported()

posted_msi_supported() is a misnomer as it actually checks whether it is
enabled or not. Aside of that this does not take CONFIG_X86_POSTED_MSI into
account which is required to actually use it.

Rename it to posted_msi_enabled() and make the return value depend on
CONFIG_X86_POSTED_MSI, which allows the compiler to eliminate the related
dead code and data if disabled:

   text    data     bss     dec     hex filename
  10046     701    3296   14043    36db drivers/iommu/intel/irq_remapping.o
   9904     413    3296   13613    352d drivers/iommu/intel/irq_remapping.o

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251125214631.170499997@linutronix.de
---
 arch/x86/include/asm/irq_remapping.h | 5 +++--
 drivers/iommu/intel/irq_remapping.c  | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 4e55d17..37b94f4 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -67,9 +67,10 @@ static inline struct irq_domain *arch_get_ir_parent_domain(void)
 
 extern bool enable_posted_msi;
 
-static inline bool posted_msi_supported(void)
+static inline bool posted_msi_enabled(void)
 {
-	return enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
+	return IS_ENABLED(CONFIG_X86_POSTED_MSI) &&
+		enable_posted_msi && irq_remapping_cap(IRQ_POSTING_CAP);
 }
 
 #else /* CONFIG_IRQ_REMAP */
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 8bcbfe3..ecb591e 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1368,7 +1368,7 @@ static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
 		break;
 	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
 	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		if (posted_msi_supported()) {
+		if (posted_msi_enabled()) {
 			prepare_irte_posted(irte);
 			data->irq_2_iommu.posted_msi = 1;
 		}
@@ -1460,7 +1460,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
 
 		irq_data->hwirq = (index << 16) + i;
 		irq_data->chip_data = ird;
-		if (posted_msi_supported() &&
+		if (posted_msi_enabled() &&
 		    ((info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) ||
 		     (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX)))
 			irq_data->chip = &intel_ir_chip_post_msi;

^ permalink raw reply related [flat|nested] 11+ messages in thread
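The commit message above says that gating posted_msi_enabled() on CONFIG_X86_POSTED_MSI lets the compiler drop the related dead code. A minimal userspace sketch of that idea follows; it is not the kernel code. DEMO_POSTED_MSI, demo_irq_posting_cap(), demo_pick_chip(), demo_self_check() and enum demo_chip are hypothetical names invented for illustration, and a plain 0/1 macro stands in for the kernel's IS_ENABLED() on a Kconfig symbol.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for CONFIG_X86_POSTED_MSI. The kernel's real
 * IS_ENABLED() macro works on Kconfig symbols; this sketch only needs a
 * plain 0/1 constant. Build with -DDEMO_POSTED_MSI=1 to flip it.
 */
#ifndef DEMO_POSTED_MSI
#define DEMO_POSTED_MSI 0
#endif

enum demo_chip { DEMO_CHIP_REMAPPED, DEMO_CHIP_POSTED_MSI };

/* Runtime switch, named after the variable in the patch. */
bool enable_posted_msi = true;

/* Hypothetical stand-in for irq_remapping_cap(IRQ_POSTING_CAP). */
bool demo_irq_posting_cap(void)
{
	return true;
}

/* The constant comes first, so with 0 the whole expression is false. */
static inline bool posted_msi_enabled(void)
{
	return DEMO_POSTED_MSI && enable_posted_msi && demo_irq_posting_cap();
}

/* Caller shaped like the chip selection in the second diff hunk. */
enum demo_chip demo_pick_chip(void)
{
	if (posted_msi_enabled())
		return DEMO_CHIP_POSTED_MSI;	/* unreachable when DEMO_POSTED_MSI is 0 */
	return DEMO_CHIP_REMAPPED;
}

/* Usage check: the result tracks the compile-time constant. */
void demo_self_check(void)
{
	assert(posted_msi_enabled() == (DEMO_POSTED_MSI != 0));
	assert(demo_pick_chip() ==
	       (DEMO_POSTED_MSI ? DEMO_CHIP_POSTED_MSI : DEMO_CHIP_REMAPPED));
}
```

With the constant at 0 the runtime knob and the capability check no longer matter, so an optimizing compiler is free to discard the posted MSI branch and anything only that branch references. That is presumably where the text and data savings in the size output above come from.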
* Re: [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts
2025-11-25 21:50 [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Thomas Gleixner
` (2 preceding siblings ...)
2025-11-25 21:50 ` [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported() Thomas Gleixner
@ 2025-12-17 17:03 ` Luigi Rizzo
2025-12-17 17:37 ` Thomas Gleixner
3 siblings, 1 reply; 11+ messages in thread
From: Luigi Rizzo @ 2025-12-17 17:03 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: LKML, x86

On Tue, Nov 25, 2025 at 10:50 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> A small update to V1 which can be found here:
>
> https://lore.kernel.org/lkml/20251125101912.564125647@linutronix.de
>
> Luigi reported that the retrigger mechanism for posted MSI interrupts is
> broken. That happens because retrigger sends an IPI to the actual allocated
> vector, which is handled correctly, but lacks an EOI. That leaves a stale
> APIC ISR bit around.
>
> The following series addresses this and does some related cleanups in that
> area on top.

What is happening with this series, is there any blocker ?

cheers
luigi

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts
2025-12-17 17:03 ` [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Luigi Rizzo
@ 2025-12-17 17:37 ` Thomas Gleixner
0 siblings, 0 replies; 11+ messages in thread
From: Thomas Gleixner @ 2025-12-17 17:37 UTC (permalink / raw)
To: Luigi Rizzo; +Cc: LKML, x86

On Wed, Dec 17 2025 at 18:03, Luigi Rizzo wrote:
> On Tue, Nov 25, 2025 at 10:50 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> A small update to V1 which can be found here:
>>
>> https://lore.kernel.org/lkml/20251125101912.564125647@linutronix.de
>>
>> Luigi reported that the retrigger mechanism for posted MSI interrupts is
>> broken. That happens because retrigger sends an IPI to the actual allocated
>> vector, which is handled correctly, but lacks an EOI. That leaves a stale
>> APIC ISR bit around.
>>
>> The following series addresses this and does some related cleanups in that
>> area on top.
>
> What is happening with this series, is there any blocker ?

I don't think so. It's probably because people were traveling to Japan or
distracted otherwise. I'll take care of it now.

Thanks,
tglx

^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-12-18 22:03 UTC | newest]
Thread overview: 11+ messages
2025-11-25 21:50 [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Thomas Gleixner
2025-11-25 21:50 ` [patch V2 1/3] x86/msi: Make irq_retrigger() functional for posted MSI Thomas Gleixner
2025-12-17 17:48   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2025-11-25 21:50 ` [patch V2 2/3] x86/irq: Cleanup posted MSI code Thomas Gleixner
2025-12-17 17:48   ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
2025-12-18 22:03   ` tip-bot2 for Thomas Gleixner
2025-11-25 21:50 ` [patch V2 3/3] x86/irq_remapping: Sanitize posted_msi_supported() Thomas Gleixner
2025-12-17 17:48   ` [tip: x86/irq] " tip-bot2 for Thomas Gleixner
2025-12-18 22:03   ` tip-bot2 for Thomas Gleixner
2025-12-17 17:03 ` [patch V2 0/3] x86/irq: Bugfix and cleanup for posted MSI interrupts Luigi Rizzo
2025-12-17 17:37   ` Thomas Gleixner