* [PATCH v6.6 RESEND 0/2] x86/irq: Plug vector setup race
@ 2025-08-22 3:38 Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 1/2] x86/irq: Factor out handler invocation from common_interrupt() Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race Jinjie Ruan
0 siblings, 2 replies; 7+ messages in thread
From: Jinjie Ruan @ 2025-08-22 3:38 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, hpa, prarit, gregkh, x86, stable,
linux-kernel
Cc: ruanjinjie
There is a vector setup race, which overwrites the interrupt
descriptor in the per CPU vector array resulting in a disfunctional device.
CPU0 CPU1
interrupt is raised in APIC IRR
but not handled
free_irq()
per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
request_irq() common_interrupt()
d = this_cpu_read(vector_irq[vector]);
per_cpu(vector_irq, CPU1)[vector] = desc;
if (d == VECTOR_SHUTDOWN)
this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
free_irq() cannot observe the pending vector in the CPU1 APIC as there is
no way to query the remote CPUs APIC IRR.
This requires that request_irq() uses the same vector/CPU as the one which
was freed, but this also can be triggered by a spurious interrupt.
Interestingly enough this problem managed to be hidden for more than a
decade.
Prevent this by reevaluating vector_irq under the vector lock, which is
held by the interrupt activation code when vector_irq is updated.
The first patch provides context for subsequent real bugfix patch.
Fixes: 9345005f4eed ("x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs")
Cc: stable@vger.kernel.org#6.6.x
Cc: gregkh@linuxfoundation.org
v1 -> RESEND
- Add upstream commit ID.
Jacob Pan (1):
x86/irq: Factor out handler invocation from common_interrupt()
Thomas Gleixner (1):
x86/irq: Plug vector setup race
arch/x86/kernel/irq.c | 70 ++++++++++++++++++++++++++++++++++---------
1 file changed, 56 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v6.6 RESEND 1/2] x86/irq: Factor out handler invocation from common_interrupt()
2025-08-22 3:38 [PATCH v6.6 RESEND 0/2] x86/irq: Plug vector setup race Jinjie Ruan
@ 2025-08-22 3:38 ` Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race Jinjie Ruan
1 sibling, 0 replies; 7+ messages in thread
From: Jinjie Ruan @ 2025-08-22 3:38 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, hpa, prarit, gregkh, x86, stable,
linux-kernel
Cc: ruanjinjie
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
commit 6087c7f36ab293a06bc0bcf3857ed4d7eb1f9905 upstream.
Prepare for calling external interrupt handlers directly from the posted
MSI demultiplexing loop. Extract the common code from common_interrupt() to
avoid code duplication.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240423174114.526704-8-jacob.jun.pan@linux.intel.com
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/x86/kernel/irq.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 6573678c4bf4..1f066268ec29 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -242,18 +242,10 @@ static __always_inline void handle_irq(struct irq_desc *desc,
__handle_irq(desc, regs);
}
-/*
- * common_interrupt() handles all normal device IRQ's (the special SMP
- * cross-CPU interrupts have their own entry points).
- */
-DEFINE_IDTENTRY_IRQ(common_interrupt)
+static __always_inline void call_irq_handler(int vector, struct pt_regs *regs)
{
- struct pt_regs *old_regs = set_irq_regs(regs);
struct irq_desc *desc;
- /* entry code tells RCU that we're not quiescent. Check it. */
- RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
-
desc = __this_cpu_read(vector_irq[vector]);
if (likely(!IS_ERR_OR_NULL(desc))) {
handle_irq(desc, regs);
@@ -268,7 +260,20 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
__this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
}
}
+}
+
+/*
+ * common_interrupt() handles all normal device IRQ's (the special SMP
+ * cross-CPU interrupts have their own entry points).
+ */
+DEFINE_IDTENTRY_IRQ(common_interrupt)
+{
+ struct pt_regs *old_regs = set_irq_regs(regs);
+
+ /* entry code tells RCU that we're not quiescent. Check it. */
+ RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
+ call_irq_handler(vector, regs);
set_irq_regs(old_regs);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race
2025-08-22 3:38 [PATCH v6.6 RESEND 0/2] x86/irq: Plug vector setup race Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 1/2] x86/irq: Factor out handler invocation from common_interrupt() Jinjie Ruan
@ 2025-08-22 3:38 ` Jinjie Ruan
2025-08-22 8:57 ` Greg KH
1 sibling, 1 reply; 7+ messages in thread
From: Jinjie Ruan @ 2025-08-22 3:38 UTC (permalink / raw)
To: tglx, mingo, bp, dave.hansen, hpa, prarit, gregkh, x86, stable,
linux-kernel
Cc: ruanjinjie
From: Thomas Gleixner <tglx@linutronix.de>
commit ce0b5eedcb753697d43f61dd2e27d68eb5d3150f upstream.
Hogan reported a vector setup race, which overwrites the interrupt
descriptor in the per CPU vector array resulting in a disfunctional device.
CPU0 CPU1
interrupt is raised in APIC IRR
but not handled
free_irq()
per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
request_irq() common_interrupt()
d = this_cpu_read(vector_irq[vector]);
per_cpu(vector_irq, CPU1)[vector] = desc;
if (d == VECTOR_SHUTDOWN)
this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
free_irq() cannot observe the pending vector in the CPU1 APIC as there is
no way to query the remote CPUs APIC IRR.
This requires that request_irq() uses the same vector/CPU as the one which
was freed, but this also can be triggered by a spurious interrupt.
Interestingly enough this problem managed to be hidden for more than a
decade.
Prevent this by reevaluating vector_irq under the vector lock, which is
held by the interrupt activation code when vector_irq is updated.
To avoid ifdeffery or IS_ENABLED() nonsense, move the
[un]lock_vector_lock() declarations out under the
CONFIG_IRQ_DOMAIN_HIERARCHY guard as it's only provided when
CONFIG_X86_LOCAL_APIC=y.
The current CONFIG_IRQ_DOMAIN_HIERARCHY guard is selected by
CONFIG_X86_LOCAL_APIC, but can also be selected by other parts of the
Kconfig system, which makes 32-bit UP builds with CONFIG_X86_LOCAL_APIC=n
fail.
Can we just get rid of this !APIC nonsense once and forever?
Fixes: 9345005f4eed ("x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs")
Cc: stable@vger.kernel.org#6.6.x
Cc: gregkh@linuxfoundation.org
Reported-by: Hogan Wang <hogan.wang@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Hogan Wang <hogan.wang@huawei.com>
Link: https://lore.kernel.org/all/draft-87ikjhrhhh.ffs@tglx
[ Conflicts in arch/x86/kernel/irq.c because call_irq_handler() has been
refactored to do apic_eoi() according to the return value.
Conflicts in arch/x86/include/asm/hw_irq.h because (un)lock_vector_lock()
are already controlled by CONFIG_X86_LOCAL_APIC. ]
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
---
arch/x86/kernel/irq.c | 65 +++++++++++++++++++++++++++++++++----------
1 file changed, 51 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 1f066268ec29..29d0fc94232e 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -242,24 +242,59 @@ static __always_inline void handle_irq(struct irq_desc *desc,
__handle_irq(desc, regs);
}
-static __always_inline void call_irq_handler(int vector, struct pt_regs *regs)
+static struct irq_desc *reevaluate_vector(int vector)
{
- struct irq_desc *desc;
+ struct irq_desc *desc = __this_cpu_read(vector_irq[vector]);
+
+ if (!IS_ERR_OR_NULL(desc))
+ return desc;
+
+ if (desc == VECTOR_UNUSED)
+ pr_emerg_ratelimited("No irq handler for %d.%u\n", smp_processor_id(), vector);
+ else
+ __this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
+ return NULL;
+}
+
+static __always_inline bool call_irq_handler(int vector, struct pt_regs *regs)
+{
+ struct irq_desc *desc = __this_cpu_read(vector_irq[vector]);
- desc = __this_cpu_read(vector_irq[vector]);
if (likely(!IS_ERR_OR_NULL(desc))) {
handle_irq(desc, regs);
- } else {
- apic_eoi();
-
- if (desc == VECTOR_UNUSED) {
- pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
- __func__, smp_processor_id(),
- vector);
- } else {
- __this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
- }
+ return true;
}
+
+ /*
+ * Reevaluate with vector_lock held to prevent a race against
+ * request_irq() setting up the vector:
+ *
+ * CPU0 CPU1
+ * interrupt is raised in APIC IRR
+ * but not handled
+ * free_irq()
+ * per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
+ *
+ * request_irq() common_interrupt()
+ * d = this_cpu_read(vector_irq[vector]);
+ *
+ * per_cpu(vector_irq, CPU1)[vector] = desc;
+ *
+ * if (d == VECTOR_SHUTDOWN)
+ * this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
+ *
+ * This requires that the same vector on the same target CPU is
+ * handed out or that a spurious interrupt hits that CPU/vector.
+ */
+ lock_vector_lock();
+ desc = reevaluate_vector(vector);
+ unlock_vector_lock();
+
+ if (!desc)
+ return false;
+
+ handle_irq(desc, regs);
+ return true;
}
/*
@@ -273,7 +308,9 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
/* entry code tells RCU that we're not quiescent. Check it. */
RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
- call_irq_handler(vector, regs);
+ if (unlikely(!call_irq_handler(vector, regs)))
+ apic_eoi();
+
set_irq_regs(old_regs);
}
--
2.34.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race
2025-08-22 3:38 ` [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race Jinjie Ruan
@ 2025-08-22 8:57 ` Greg KH
2025-08-22 9:25 ` Jinjie Ruan
0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2025-08-22 8:57 UTC (permalink / raw)
To: Jinjie Ruan
Cc: tglx, mingo, bp, dave.hansen, hpa, prarit, x86, stable,
linux-kernel
On Fri, Aug 22, 2025 at 03:38:25AM +0000, Jinjie Ruan wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> commit ce0b5eedcb753697d43f61dd2e27d68eb5d3150f upstream.
>
> Hogan reported a vector setup race, which overwrites the interrupt
> descriptor in the per CPU vector array resulting in a disfunctional device.
>
> CPU0 CPU1
> interrupt is raised in APIC IRR
> but not handled
> free_irq()
> per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
>
> request_irq() common_interrupt()
> d = this_cpu_read(vector_irq[vector]);
>
> per_cpu(vector_irq, CPU1)[vector] = desc;
>
> if (d == VECTOR_SHUTDOWN)
> this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
>
> free_irq() cannot observe the pending vector in the CPU1 APIC as there is
> no way to query the remote CPUs APIC IRR.
>
> This requires that request_irq() uses the same vector/CPU as the one which
> was freed, but this also can be triggered by a spurious interrupt.
>
> Interestingly enough this problem managed to be hidden for more than a
> decade.
>
> Prevent this by reevaluating vector_irq under the vector lock, which is
> held by the interrupt activation code when vector_irq is updated.
>
> To avoid ifdeffery or IS_ENABLED() nonsense, move the
> [un]lock_vector_lock() declarations out under the
> CONFIG_IRQ_DOMAIN_HIERARCHY guard as it's only provided when
> CONFIG_X86_LOCAL_APIC=y.
>
> The current CONFIG_IRQ_DOMAIN_HIERARCHY guard is selected by
> CONFIG_X86_LOCAL_APIC, but can also be selected by other parts of the
> Kconfig system, which makes 32-bit UP builds with CONFIG_X86_LOCAL_APIC=n
> fail.
>
> Can we just get rid of this !APIC nonsense once and forever?
>
> Fixes: 9345005f4eed ("x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs")
> Cc: stable@vger.kernel.org#6.6.x
> Cc: gregkh@linuxfoundation.org
> Reported-by: Hogan Wang <hogan.wang@huawei.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Hogan Wang <hogan.wang@huawei.com>
> Link: https://lore.kernel.org/all/draft-87ikjhrhhh.ffs@tglx
> [ Conflicts in arch/x86/kernel/irq.c because call_irq_handler() has been
> refactored to do apic_eoi() according to the return value.
> Conflicts in arch/x86/include/asm/hw_irq.h because (un)lock_vector_lock()
> are already controlled by CONFIG_X86_LOCAL_APIC. ]
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> ---
> arch/x86/kernel/irq.c | 65 +++++++++++++++++++++++++++++++++----------
> 1 file changed, 51 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 1f066268ec29..29d0fc94232e 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -242,24 +242,59 @@ static __always_inline void handle_irq(struct irq_desc *desc,
> __handle_irq(desc, regs);
> }
>
> -static __always_inline void call_irq_handler(int vector, struct pt_regs *regs)
> +static struct irq_desc *reevaluate_vector(int vector)
> {
> - struct irq_desc *desc;
> + struct irq_desc *desc = __this_cpu_read(vector_irq[vector]);
> +
> + if (!IS_ERR_OR_NULL(desc))
> + return desc;
> +
> + if (desc == VECTOR_UNUSED)
> + pr_emerg_ratelimited("No irq handler for %d.%u\n", smp_processor_id(), vector);
> + else
> + __this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
> + return NULL;
> +}
> +
> +static __always_inline bool call_irq_handler(int vector, struct pt_regs *regs)
> +{
> + struct irq_desc *desc = __this_cpu_read(vector_irq[vector]);
>
> - desc = __this_cpu_read(vector_irq[vector]);
> if (likely(!IS_ERR_OR_NULL(desc))) {
> handle_irq(desc, regs);
> - } else {
> - apic_eoi();
> -
> - if (desc == VECTOR_UNUSED) {
> - pr_emerg_ratelimited("%s: %d.%u No irq handler for vector\n",
> - __func__, smp_processor_id(),
> - vector);
> - } else {
> - __this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
> - }
> + return true;
> }
> +
> + /*
> + * Reevaluate with vector_lock held to prevent a race against
> + * request_irq() setting up the vector:
> + *
> + * CPU0 CPU1
> + * interrupt is raised in APIC IRR
> + * but not handled
> + * free_irq()
> + * per_cpu(vector_irq, CPU1)[vector] = VECTOR_SHUTDOWN;
> + *
> + * request_irq() common_interrupt()
> + * d = this_cpu_read(vector_irq[vector]);
> + *
> + * per_cpu(vector_irq, CPU1)[vector] = desc;
> + *
> + * if (d == VECTOR_SHUTDOWN)
> + * this_cpu_write(vector_irq[vector], VECTOR_UNUSED);
> + *
> + * This requires that the same vector on the same target CPU is
> + * handed out or that a spurious interrupt hits that CPU/vector.
> + */
> + lock_vector_lock();
> + desc = reevaluate_vector(vector);
> + unlock_vector_lock();
> +
> + if (!desc)
> + return false;
> +
> + handle_irq(desc, regs);
> + return true;
> }
>
> /*
> @@ -273,7 +308,9 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
> /* entry code tells RCU that we're not quiescent. Check it. */
> RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
>
> - call_irq_handler(vector, regs);
> + if (unlikely(!call_irq_handler(vector, regs)))
> + apic_eoi();
> +
This chunk does not look correct. The original commit did not have
this, so why add it here? Where did it come from?
The original patch said:
- if (unlikely(call_irq_handler(vector, regs)))
+ if (unlikely(!call_irq_handler(vector, regs)))
And was not an if statement.
So did you forget to backport something else here? Why is this not
identical to what the original was?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race
2025-08-22 8:57 ` Greg KH
@ 2025-08-22 9:25 ` Jinjie Ruan
2025-08-22 9:48 ` Greg KH
0 siblings, 1 reply; 7+ messages in thread
From: Jinjie Ruan @ 2025-08-22 9:25 UTC (permalink / raw)
To: Greg KH
Cc: tglx, mingo, bp, dave.hansen, hpa, prarit, x86, stable,
linux-kernel
On 2025/8/22 16:57, Greg KH wrote:
> On Fri, Aug 22, 2025 at 03:38:25AM +0000, Jinjie Ruan wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>>
>> commit ce0b5eedcb753697d43f61dd2e27d68eb5d3150f upstream.
>>
[...]
>>
>> /*
>> @@ -273,7 +308,9 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
>> /* entry code tells RCU that we're not quiescent. Check it. */
>> RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
>>
>> - call_irq_handler(vector, regs);
>> + if (unlikely(!call_irq_handler(vector, regs)))
>> + apic_eoi();
>> +
>
> This chunk does not look correct. The original commit did not have
> this, so why add it here? Where did it come from?
>
> The original patch said:
> - if (unlikely(call_irq_handler(vector, regs)))
> + if (unlikely(!call_irq_handler(vector, regs)))
>
> And was not an if statement.
>
> So did you forget to backport something else here? Why is this not
> identical to what the original was?
The if statement is introduced in commit 1b03d82ba15e ("x86/irq: Install
posted MSI notification handler") which is a patch in patch set
https://lore.kernel.org/all/20240423174114.526704-1-jacob.jun.pan@linux.intel.com/,
but it seems to be a performance optimization patch set, and this patch
includes additional modifications. The context conflict is merely a
simple refactoring, but the cost of the entire round of this patch set
is too high.
>
> thanks,
>
> greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race
2025-08-22 9:25 ` Jinjie Ruan
@ 2025-08-22 9:48 ` Greg KH
2025-08-22 9:50 ` Greg KH
0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2025-08-22 9:48 UTC (permalink / raw)
To: Jinjie Ruan
Cc: tglx, mingo, bp, dave.hansen, hpa, prarit, x86, stable,
linux-kernel
On Fri, Aug 22, 2025 at 05:25:56PM +0800, Jinjie Ruan wrote:
>
>
> On 2025/8/22 16:57, Greg KH wrote:
> > On Fri, Aug 22, 2025 at 03:38:25AM +0000, Jinjie Ruan wrote:
> >> From: Thomas Gleixner <tglx@linutronix.de>
> >>
> >> commit ce0b5eedcb753697d43f61dd2e27d68eb5d3150f upstream.
> >>
>
> [...]
>
> >>
> >> /*
> >> @@ -273,7 +308,9 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
> >> /* entry code tells RCU that we're not quiescent. Check it. */
> >> RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
> >>
> >> - call_irq_handler(vector, regs);
> >> + if (unlikely(!call_irq_handler(vector, regs)))
> >> + apic_eoi();
> >> +
> >
> > This chunk does not look correct. The original commit did not have
> > this, so why add it here? Where did it come from?
> >
> > The original patch said:
> > - if (unlikely(call_irq_handler(vector, regs)))
> > + if (unlikely(!call_irq_handler(vector, regs)))
> >
> > And was not an if statement.
> >
> > So did you forget to backport something else here? Why is this not
> > identical to what the original was?
>
> The if statement is introduced in commit 1b03d82ba15e ("x86/irq: Install
> posted MSI notification handler") which is a patch in patch set
> https://lore.kernel.org/all/20240423174114.526704-1-jacob.jun.pan@linux.intel.com/,
> but it seems to be a performance optimization patch set, and this patch
> includes additional modifications. The context conflict is merely a
> simple refactoring, but the cost of the entire round of this patch set
> is too high.
Why is it "too high"? We almost NEVER want to deviate from what is in
mainline, as every time wo do that it adds the potential for bugs AND it
increases our maintance burden (i.e. later patches will not apply.)
For a kernel that has to live as long as this one does, we need to try
to stick to what is in mainline as close as possible. Otherwise it
becomes unworkable.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race
2025-08-22 9:48 ` Greg KH
@ 2025-08-22 9:50 ` Greg KH
0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2025-08-22 9:50 UTC (permalink / raw)
To: Jinjie Ruan
Cc: tglx, mingo, bp, dave.hansen, hpa, prarit, x86, stable,
linux-kernel
On Fri, Aug 22, 2025 at 11:48:56AM +0200, Greg KH wrote:
> On Fri, Aug 22, 2025 at 05:25:56PM +0800, Jinjie Ruan wrote:
> >
> >
> > On 2025/8/22 16:57, Greg KH wrote:
> > > On Fri, Aug 22, 2025 at 03:38:25AM +0000, Jinjie Ruan wrote:
> > >> From: Thomas Gleixner <tglx@linutronix.de>
> > >>
> > >> commit ce0b5eedcb753697d43f61dd2e27d68eb5d3150f upstream.
> > >>
> >
> > [...]
> >
> > >>
> > >> /*
> > >> @@ -273,7 +308,9 @@ DEFINE_IDTENTRY_IRQ(common_interrupt)
> > >> /* entry code tells RCU that we're not quiescent. Check it. */
> > >> RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
> > >>
> > >> - call_irq_handler(vector, regs);
> > >> + if (unlikely(!call_irq_handler(vector, regs)))
> > >> + apic_eoi();
> > >> +
> > >
> > > This chunk does not look correct. The original commit did not have
> > > this, so why add it here? Where did it come from?
> > >
> > > The original patch said:
> > > - if (unlikely(call_irq_handler(vector, regs)))
> > > + if (unlikely(!call_irq_handler(vector, regs)))
> > >
> > > And was not an if statement.
> > >
> > > So did you forget to backport something else here? Why is this not
> > > identical to what the original was?
> >
> > The if statement is introduced in commit 1b03d82ba15e ("x86/irq: Install
> > posted MSI notification handler") which is a patch in patch set
> > https://lore.kernel.org/all/20240423174114.526704-1-jacob.jun.pan@linux.intel.com/,
> > but it seems to be a performance optimization patch set, and this patch
> > includes additional modifications. The context conflict is merely a
> > simple refactoring, but the cost of the entire round of this patch set
> > is too high.
>
> Why is it "too high"? We almost NEVER want to deviate from what is in
> mainline, as every time wo do that it adds the potential for bugs AND it
> increases our maintance burden (i.e. later patches will not apply.)
>
> For a kernel that has to live as long as this one does, we need to try
> to stick to what is in mainline as close as possible. Otherwise it
> becomes unworkable.
Also, I will push back, why not just use 6.12.y to resolve this issue
instead? Why are you stuck on 6.6.y for now, and what are you going to
do with those systems once 6.6.y goes end-of-life? Why postpone the
inevitable?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-08-22 9:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-22 3:38 [PATCH v6.6 RESEND 0/2] x86/irq: Plug vector setup race Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 1/2] x86/irq: Factor out handler invocation from common_interrupt() Jinjie Ruan
2025-08-22 3:38 ` [PATCH v6.6 RESEND 2/2] x86/irq: Plug vector setup race Jinjie Ruan
2025-08-22 8:57 ` Greg KH
2025-08-22 9:25 ` Jinjie Ruan
2025-08-22 9:48 ` Greg KH
2025-08-22 9:50 ` Greg KH
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).