* [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
@ 2025-02-06 15:06 ` Roger Pau Monne
2025-02-10 10:20 ` Jan Beulich
2025-02-06 15:06 ` [PATCH v2 2/5] x86/irq: drop fixup_irqs() parameters Roger Pau Monne
` (5 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monne @ 2025-02-06 15:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The current shutdown logic in smp_send_stop() will disable the APs while
having interrupts enabled on the BSP or possibly other APs. On AMD systems
this can lead to local APIC errors:
APIC error on CPU0: 00(08), Receive accept error
Such error message can be printed in a loop, thus blocking the system from
rebooting. I assume this loop is created by the error being triggered by
the console interrupt, which is further stirred by the ESR handler
printing to the console.
Intel SDM states:
"Receive Accept Error.
Set when the local APIC detects that the message it received was not
accepted by any APIC on the APIC bus, including itself. Used only on P6
family and Pentium processors."
So the error shouldn't trigger on any Intel CPU supported by Xen.
However AMD doesn't make such claims, and indeed the error is broadcasted
to all local APICs when an interrupt targets a CPU that's already offline.
To prevent the error from stalling the shutdown process perform the
disabling of APs and the BSP local APIC with interrupts disabled on all
CPUs in the system, so that by the time interrupts are unmasked on the BSP
the local APIC is already disabled. This can still lead to a spurious:
APIC error on CPU0: 00(00)
As a result of an LVT Error getting injected while interrupts are masked on
the CPU, and the vector only handled after the local APIC is already
disabled.
Note the NMI crash path doesn't have such issue, because disabling of APs
and the caller local APIC is already done in the same contiguous region
with interrupts disabled. There's a possible window on the NMI crash path
(nmi_shootdown_cpus()) where some APs might be disabled (and thus
interrupts targeting them raising "Receive accept error") before others APs
have interrupts disabled. However the shutdown NMI will be handled,
regardless of whether the AP is processing a local APIC error, and hence
such interrupts will not cause the shutdown process to get stuck.
Remove the call to fixup_irqs() in smp_send_stop(), as it doesn't achieve
the intended goal of moving all interrupts to the BSP anyway, because when
called the APs are still set as online in cpu_online_map.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- Change the approach and instead rely on having interrupts uniformly
disabled when offlining APs.
---
xen/arch/x86/smp.c | 27 ++++++++++++++++++++-------
1 file changed, 20 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 02a6ed7593f3..074baae2cc3b 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -345,6 +345,11 @@ void __stop_this_cpu(void)
static void cf_check stop_this_cpu(void *dummy)
{
+ const bool *stop_aps = dummy;
+
+ while ( !*stop_aps )
+ cpu_relax();
+
__stop_this_cpu();
for ( ; ; )
halt();
@@ -357,16 +362,25 @@ static void cf_check stop_this_cpu(void *dummy)
void smp_send_stop(void)
{
unsigned int cpu = smp_processor_id();
+ bool stop_aps = false;
+
+ /*
+ * Perform AP offlining and disabling of interrupt controllers with all
+ * CPUs on the system having interrupts disabled to prevent interrupt
+ * delivery errors. On AMD systems "Receive accept error" will be
+ * broadcasted to local APICs if interrupts target CPUs that are offline.
+ */
+ if ( num_online_cpus() > 1 )
+ smp_call_function(stop_this_cpu, &stop_aps, 0);
+
+ local_irq_disable();
if ( num_online_cpus() > 1 )
{
int timeout = 10;
- local_irq_disable();
- fixup_irqs(cpumask_of(cpu), 0);
- local_irq_enable();
-
- smp_call_function(stop_this_cpu, NULL, 0);
+ /* Signal APs to stop. */
+ stop_aps = true;
/* Wait 10ms for all other CPUs to go offline. */
while ( (num_online_cpus() > 1) && (timeout-- > 0) )
@@ -375,13 +389,12 @@ void smp_send_stop(void)
if ( cpu_online(cpu) )
{
- local_irq_disable();
disable_IO_APIC();
hpet_disable();
__stop_this_cpu();
x2apic_enabled = (current_local_apic_mode() == APIC_MODE_X2APIC);
- local_irq_enable();
}
+ local_irq_enable();
}
void smp_send_nmi_allbutself(void)
--
2.46.0
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs
2025-02-06 15:06 ` [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs Roger Pau Monne
@ 2025-02-10 10:20 ` Jan Beulich
2025-02-10 10:46 ` Roger Pau Monné
0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2025-02-10 10:20 UTC (permalink / raw)
To: Roger Pau Monne, Andrew Cooper; +Cc: xen-devel
On 06.02.2025 16:06, Roger Pau Monne wrote:
> The current shutdown logic in smp_send_stop() will disable the APs while
> having interrupts enabled on the BSP or possibly other APs. On AMD systems
> this can lead to local APIC errors:
>
> APIC error on CPU0: 00(08), Receive accept error
>
> Such error message can be printed in a loop, thus blocking the system from
> rebooting. I assume this loop is created by the error being triggered by
> the console interrupt, which is further stirred by the ESR handler
> printing to the console.
>
> Intel SDM states:
>
> "Receive Accept Error.
>
> Set when the local APIC detects that the message it received was not
> accepted by any APIC on the APIC bus, including itself. Used only on P6
> family and Pentium processors."
>
> So the error shouldn't trigger on any Intel CPU supported by Xen.
>
> However AMD doesn't make such claims, and indeed the error is broadcasted
> to all local APICs when an interrupt targets a CPU that's already offline.
>
> To prevent the error from stalling the shutdown process perform the
> disabling of APs and the BSP local APIC with interrupts disabled on all
> CPUs in the system, so that by the time interrupts are unmasked on the BSP
> the local APIC is already disabled. This can still lead to a spurious:
>
> APIC error on CPU0: 00(00)
>
> As a result of an LVT Error getting injected while interrupts are masked on
> the CPU, and the vector only handled after the local APIC is already
> disabled.
Isn't this bogus, too? As in: Error interrupt without any ESR bits set? Since
I can already see our QA folks report this as another issue, can we perhaps
somehow amend the log message in that case, indicating we think it's bogus?
> Note the NMI crash path doesn't have such issue, because disabling of APs
> and the caller local APIC is already done in the same contiguous region
> with interrupts disabled. There's a possible window on the NMI crash path
> (nmi_shootdown_cpus()) where some APs might be disabled (and thus
> interrupts targeting them raising "Receive accept error") before others APs
> have interrupts disabled. However the shutdown NMI will be handled,
> regardless of whether the AP is processing a local APIC error, and hence
> such interrupts will not cause the shutdown process to get stuck.
>
> Remove the call to fixup_irqs() in smp_send_stop(), as it doesn't achieve
> the intended goal of moving all interrupts to the BSP anyway, because when
> called the APs are still set as online in cpu_online_map.
This is a little too little for my taste: The fact the APs are still online
was, after all, intended to be covered by passing cpumask_of(cpu).
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
I suppose there simply is no "good" commit to blame here with a Fixes: tag.
> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -345,6 +345,11 @@ void __stop_this_cpu(void)
>
> static void cf_check stop_this_cpu(void *dummy)
> {
> + const bool *stop_aps = dummy;
> +
> + while ( !*stop_aps )
> + cpu_relax();
> +
> __stop_this_cpu();
> for ( ; ; )
> halt();
> @@ -357,16 +362,25 @@ static void cf_check stop_this_cpu(void *dummy)
> void smp_send_stop(void)
> {
> unsigned int cpu = smp_processor_id();
> + bool stop_aps = false;
> +
> + /*
> + * Perform AP offlining and disabling of interrupt controllers with all
> + * CPUs on the system having interrupts disabled to prevent interrupt
> + * delivery errors. On AMD systems "Receive accept error" will be
> + * broadcasted to local APICs if interrupts target CPUs that are offline.
> + */
> + if ( num_online_cpus() > 1 )
> + smp_call_function(stop_this_cpu, &stop_aps, 0);
> +
> + local_irq_disable();
With the extensive comment I think this is going to be okay. Just one grammar
thing (and I'm curious myself), mainly to Andrew as a native speaker (or any
other native speakers who read this): While I can find the form you use even
in things calling themselves dictionaries, I've still been under the impression
that it is "be broadcast". (If so, also somewhere in the description then.)
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs
2025-02-10 10:20 ` Jan Beulich
@ 2025-02-10 10:46 ` Roger Pau Monné
2025-02-10 11:09 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2025-02-10 10:46 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Mon, Feb 10, 2025 at 11:20:20AM +0100, Jan Beulich wrote:
> On 06.02.2025 16:06, Roger Pau Monne wrote:
> > The current shutdown logic in smp_send_stop() will disable the APs while
> > having interrupts enabled on the BSP or possibly other APs. On AMD systems
> > this can lead to local APIC errors:
> >
> > APIC error on CPU0: 00(08), Receive accept error
> >
> > Such error message can be printed in a loop, thus blocking the system from
> > rebooting. I assume this loop is created by the error being triggered by
> > the console interrupt, which is further stirred by the ESR handler
> > printing to the console.
> >
> > Intel SDM states:
> >
> > "Receive Accept Error.
> >
> > Set when the local APIC detects that the message it received was not
> > accepted by any APIC on the APIC bus, including itself. Used only on P6
> > family and Pentium processors."
> >
> > So the error shouldn't trigger on any Intel CPU supported by Xen.
> >
> > However AMD doesn't make such claims, and indeed the error is broadcasted
> > to all local APICs when an interrupt targets a CPU that's already offline.
> >
> > To prevent the error from stalling the shutdown process perform the
> > disabling of APs and the BSP local APIC with interrupts disabled on all
> > CPUs in the system, so that by the time interrupts are unmasked on the BSP
> > the local APIC is already disabled. This can still lead to a spurious:
> >
> > APIC error on CPU0: 00(00)
> >
> > As a result of an LVT Error getting injected while interrupts are masked on
> > the CPU, and the vector only handled after the local APIC is already
> > disabled.
>
> Isn't this bogus, too? As in: Error interrupt without any ESR bits set? Since
> I can already see our QA folks report this as another issue, can we perhaps
> somehow amend the log message in that case, indicating we think it's bogus?
Note that the disable_local_APIC() call in __stop_this_cpu() does also
call clear_local_APIC(), which will attempt to clear ESR also.
Further patches in the series prevent the error from triggering in the
first place, since an attempt is made to disable or mask all possible
external interrupt sources Xen knows about before doing AP shutdown.
>
> > Note the NMI crash path doesn't have such issue, because disabling of APs
> > and the caller local APIC is already done in the same contiguous region
> > with interrupts disabled. There's a possible window on the NMI crash path
> > (nmi_shootdown_cpus()) where some APs might be disabled (and thus
> > interrupts targeting them raising "Receive accept error") before others APs
> > have interrupts disabled. However the shutdown NMI will be handled,
> > regardless of whether the AP is processing a local APIC error, and hence
> > such interrupts will not cause the shutdown process to get stuck.
> >
> > Remove the call to fixup_irqs() in smp_send_stop(), as it doesn't achieve
> > the intended goal of moving all interrupts to the BSP anyway, because when
> > called the APs are still set as online in cpu_online_map.
>
> This is a little too little for my taste: The fact the APs are still online
> was, after all, intended to be covered by passing cpumask_of(cpu).
>
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> I suppose there simply is no "good" commit to blame here with a Fixes: tag.
Wondered the same, but I couldn't find any suitable commit. It's been
like this forever as far as I can tell. It's possible my previous
fixes for fixup_irqs() made this worse, but I don't think it was
correct to begin with.
>
> > --- a/xen/arch/x86/smp.c
> > +++ b/xen/arch/x86/smp.c
> > @@ -345,6 +345,11 @@ void __stop_this_cpu(void)
> >
> > static void cf_check stop_this_cpu(void *dummy)
> > {
> > + const bool *stop_aps = dummy;
> > +
> > + while ( !*stop_aps )
> > + cpu_relax();
> > +
> > __stop_this_cpu();
> > for ( ; ; )
> > halt();
> > @@ -357,16 +362,25 @@ static void cf_check stop_this_cpu(void *dummy)
> > void smp_send_stop(void)
> > {
> > unsigned int cpu = smp_processor_id();
> > + bool stop_aps = false;
> > +
> > + /*
> > + * Perform AP offlining and disabling of interrupt controllers with all
> > + * CPUs on the system having interrupts disabled to prevent interrupt
> > + * delivery errors. On AMD systems "Receive accept error" will be
> > + * broadcasted to local APICs if interrupts target CPUs that are offline.
> > + */
> > + if ( num_online_cpus() > 1 )
> > + smp_call_function(stop_this_cpu, &stop_aps, 0);
> > +
> > + local_irq_disable();
>
> With the extensive comment I think this is going to be okay. Just one grammar
> thing (and I'm curious myself), mainly to Andrew as a native speaker (or any
> other native speakers who read this): While I can find the form you use even
> in things calling themselves dictionaries, I've still been under the impression
> that it is "be broadcast". (If so, also somewhere in the description then.)
I don't have a strong opinion. I also looked it up and seemed to be
correct, but might not be fine to use in this context.
Thanks, Roger.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs
2025-02-10 10:46 ` Roger Pau Monné
@ 2025-02-10 11:09 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-02-10 11:09 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, xen-devel
On 10.02.2025 11:46, Roger Pau Monné wrote:
> On Mon, Feb 10, 2025 at 11:20:20AM +0100, Jan Beulich wrote:
>> On 06.02.2025 16:06, Roger Pau Monne wrote:
>>> The current shutdown logic in smp_send_stop() will disable the APs while
>>> having interrupts enabled on the BSP or possibly other APs. On AMD systems
>>> this can lead to local APIC errors:
>>>
>>> APIC error on CPU0: 00(08), Receive accept error
>>>
>>> Such error message can be printed in a loop, thus blocking the system from
>>> rebooting. I assume this loop is created by the error being triggered by
>>> the console interrupt, which is further stirred by the ESR handler
>>> printing to the console.
>>>
>>> Intel SDM states:
>>>
>>> "Receive Accept Error.
>>>
>>> Set when the local APIC detects that the message it received was not
>>> accepted by any APIC on the APIC bus, including itself. Used only on P6
>>> family and Pentium processors."
>>>
>>> So the error shouldn't trigger on any Intel CPU supported by Xen.
>>>
>>> However AMD doesn't make such claims, and indeed the error is broadcasted
>>> to all local APICs when an interrupt targets a CPU that's already offline.
>>>
>>> To prevent the error from stalling the shutdown process perform the
>>> disabling of APs and the BSP local APIC with interrupts disabled on all
>>> CPUs in the system, so that by the time interrupts are unmasked on the BSP
>>> the local APIC is already disabled. This can still lead to a spurious:
>>>
>>> APIC error on CPU0: 00(00)
>>>
>>> As a result of an LVT Error getting injected while interrupts are masked on
>>> the CPU, and the vector only handled after the local APIC is already
>>> disabled.
>>
>> Isn't this bogus, too? As in: Error interrupt without any ESR bits set? Since
>> I can already see our QA folks report this as another issue, can we perhaps
>> somehow amend the log message in that case, indicating we think it's bogus?
>
> Note that the disable_local_APIC() call in __stop_this_cpu() does also
> call clear_local_APIC(), which will attempt to clear ESR also.
Hmm, that's odd. I wonder whether we shouldn't suppress this when called
from disable_local_APIC().
> Further patches in the series prevent the error from triggering in the
> first place, since an attempt is made to disable or mask all possible
> external interrupt sources Xen knows about before doing AP shutdown.
Right.
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 2/5] x86/irq: drop fixup_irqs() parameters
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
2025-02-06 15:06 ` [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs Roger Pau Monne
@ 2025-02-06 15:06 ` Roger Pau Monne
2025-02-06 15:06 ` [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown Roger Pau Monne
` (4 subsequent siblings)
6 siblings, 0 replies; 20+ messages in thread
From: Roger Pau Monne @ 2025-02-06 15:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The solely remaining caller always passes the same globally available
parameters. Drop the parameters and modify fixup_irqs() to use
cpu_online_map in place of the input mask parameter, and always be verbose
in its output printing.
While there remove some of the checks given the single context where
fixup_irqs() is now called, which should always be in the CPU offline path,
after the CPU going offline has been removed from cpu_online_map.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
There's more cleanup that can likely be done here, but it's best if such
cleanup is done after the cpu_mask and old_cpu_mask irq_desc fields are
converted from cpu masks to integers, as logic delivery mode should never
be used for external interrupts now.
---
xen/arch/x86/include/asm/irq.h | 4 ++--
xen/arch/x86/irq.c | 30 +++++++++++++-----------------
xen/arch/x86/smpboot.c | 2 +-
3 files changed, 16 insertions(+), 20 deletions(-)
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index d3bc76806808..354868ba31ab 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -168,8 +168,8 @@ void free_domain_pirqs(struct domain *d);
int map_domain_emuirq_pirq(struct domain *d, int pirq, int emuirq);
int unmap_domain_pirq_emuirq(struct domain *d, int pirq);
-/* Evacuate interrupts assigned to CPUs not present in the input CPU mask. */
-void fixup_irqs(const cpumask_t *mask, bool verbose);
+/* Evacuate interrupts assigned to CPUs not present in the CPU online map. */
+void fixup_irqs(void);
void fixup_eoi(void);
int init_irq_data(void);
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index e56bacc88d84..ff3ac832f4b9 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2590,17 +2590,21 @@ static int __init cf_check setup_dump_irqs(void)
}
__initcall(setup_dump_irqs);
-/* Evacuate interrupts assigned to CPUs not present in the input CPU mask. */
-void fixup_irqs(const cpumask_t *mask, bool verbose)
+/* Evacuate interrupts assigned to CPUs not present in the CPU online map. */
+void fixup_irqs(void)
{
+ const unsigned int cpu = smp_processor_id();
unsigned int irq;
static int warned;
struct irq_desc *desc;
+ /* Only to be called from the context of a CPU going offline. */
+ ASSERT(!cpu_online(cpu));
+
for ( irq = 0; irq < nr_irqs; irq++ )
{
bool break_affinity = false, set_affinity = true, check_irr = false;
- unsigned int vector, cpu = smp_processor_id();
+ unsigned int vector;
cpumask_t *affinity = this_cpu(scratch_cpumask);
if ( irq == 2 )
@@ -2644,12 +2648,6 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
}
if ( desc->arch.move_in_progress &&
- /*
- * Only attempt to adjust the mask if the current CPU is going
- * offline, otherwise the whole system is going down and leaving
- * stale data in the masks is fine.
- */
- !cpu_online(cpu) &&
cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) )
{
/*
@@ -2691,16 +2689,17 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
/*
* Avoid shuffling the interrupt around as long as current target CPUs
- * are a subset of the input mask. What fixup_irqs() cares about is
- * evacuating interrupts from CPUs not in the input mask.
+ * are a subset of the online mask. What fixup_irqs() cares about is
+ * evacuating interrupts from CPUs not in the online mask.
*/
- if ( !desc->action || cpumask_subset(desc->arch.cpu_mask, mask) )
+ if ( !desc->action || cpumask_subset(desc->arch.cpu_mask,
+ &cpu_online_map) )
{
spin_unlock(&desc->lock);
continue;
}
- if ( !cpumask_intersects(mask, desc->affinity) )
+ if ( !cpumask_intersects(&cpu_online_map, desc->affinity) )
{
break_affinity = true;
cpumask_setall(affinity);
@@ -2716,7 +2715,7 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
* the interrupt, signal to check whether there are any pending vectors
* to be handled in the local APIC after the interrupt has been moved.
*/
- if ( !cpu_online(cpu) && cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
+ if ( cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
check_irr = true;
if ( desc->handler->set_affinity )
@@ -2743,9 +2742,6 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
spin_unlock(&desc->lock);
- if ( !verbose )
- continue;
-
if ( !set_affinity )
printk("Cannot set affinity for IRQ%u\n", irq);
else if ( break_affinity )
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 79a79c54c304..891a29fca146 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1282,7 +1282,7 @@ void __cpu_disable(void)
/* It's now safe to remove this processor from the online map */
cpumask_clear_cpu(cpu, &cpu_online_map);
- fixup_irqs(&cpu_online_map, 1);
+ fixup_irqs();
fixup_eoi();
}
--
2.46.0
^ permalink raw reply related [flat|nested] 20+ messages in thread* [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
2025-02-06 15:06 ` [PATCH v2 1/5] x86/shutdown: offline APs with interrupts disabled on all CPUs Roger Pau Monne
2025-02-06 15:06 ` [PATCH v2 2/5] x86/irq: drop fixup_irqs() parameters Roger Pau Monne
@ 2025-02-06 15:06 ` Roger Pau Monne
2025-02-10 10:24 ` Jan Beulich
2025-02-06 15:06 ` [PATCH v2 4/5] x86/pci: disable MSI(-X) on all devices at shutdown Roger Pau Monne
` (3 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monne @ 2025-02-06 15:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Move the disabling of interrupt sources so it's done ahead of the offlining
of APs. This is to prevent AMD systems triggering "Receive accept error"
when interrupts target CPUs that are no longer online.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- New in this version.
---
xen/arch/x86/smp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index 074baae2cc3b..f931db0d71c6 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -374,6 +374,8 @@ void smp_send_stop(void)
smp_call_function(stop_this_cpu, &stop_aps, 0);
local_irq_disable();
+ disable_IO_APIC();
+ hpet_disable();
if ( num_online_cpus() > 1 )
{
@@ -389,8 +391,6 @@ void smp_send_stop(void)
if ( cpu_online(cpu) )
{
- disable_IO_APIC();
- hpet_disable();
__stop_this_cpu();
x2apic_enabled = (current_local_apic_mode() == APIC_MODE_X2APIC);
}
--
2.46.0
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown
2025-02-06 15:06 ` [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown Roger Pau Monne
@ 2025-02-10 10:24 ` Jan Beulich
2025-02-10 10:47 ` Roger Pau Monné
0 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2025-02-10 10:24 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 06.02.2025 16:06, Roger Pau Monne wrote:
> Move the disabling of interrupt sources so it's done ahead of the offlining
> of APs. This is to prevent AMD systems triggering "Receive accept error"
> when interrupts target CPUs that are no longer online.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Changes since v1:
> - New in this version.
Ah, you decided to effectively split the original patch.
Reviewed-by: Jan Beulich <jbeulich@suse.com>
with a question:
> --- a/xen/arch/x86/smp.c
> +++ b/xen/arch/x86/smp.c
> @@ -374,6 +374,8 @@ void smp_send_stop(void)
> smp_call_function(stop_this_cpu, &stop_aps, 0);
>
> local_irq_disable();
> + disable_IO_APIC();
> + hpet_disable();
Is this then taking care of the bogus error interrupt observing ESR=0x00?
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown
2025-02-10 10:24 ` Jan Beulich
@ 2025-02-10 10:47 ` Roger Pau Monné
0 siblings, 0 replies; 20+ messages in thread
From: Roger Pau Monné @ 2025-02-10 10:47 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Mon, Feb 10, 2025 at 11:24:58AM +0100, Jan Beulich wrote:
> On 06.02.2025 16:06, Roger Pau Monne wrote:
> > Move the disabling of interrupt sources so it's done ahead of the offlining
> > of APs. This is to prevent AMD systems triggering "Receive accept error"
> > when interrupts target CPUs that are no longer online.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v1:
> > - New in this version.
>
> Ah, you decided to effectively split the original patch.
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with a question:
>
> > --- a/xen/arch/x86/smp.c
> > +++ b/xen/arch/x86/smp.c
> > @@ -374,6 +374,8 @@ void smp_send_stop(void)
> > smp_call_function(stop_this_cpu, &stop_aps, 0);
> >
> > local_irq_disable();
> > + disable_IO_APIC();
> > + hpet_disable();
>
> Is this then taking care of the bogus error interrupt observing ESR=0x00?
This, with the extra commits that follow should prevent the error from
triggering yes. With all patches in the series applied I'm no longer
able to see the ESR=0x00 message.
Thanks, Roger.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 4/5] x86/pci: disable MSI(-X) on all devices at shutdown
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
` (2 preceding siblings ...)
2025-02-06 15:06 ` [PATCH v2 3/5] x86/smp: perform disabling on interrupts ahead of AP shutdown Roger Pau Monne
@ 2025-02-06 15:06 ` Roger Pau Monne
2025-02-10 10:41 ` Jan Beulich
2025-02-06 15:06 ` [PATCH v2 5/5] x86/iommu: disable interrupts " Roger Pau Monne
` (2 subsequent siblings)
6 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monne @ 2025-02-06 15:06 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini
Attempt to disable MSI(-X) capabilities on all PCI devices know by Xen at
shutdown. Doing such disabling should facilitate kexec chained kernel from
booting more reliably, as device MSI(-X) interrupt generation should be
quiesced.
It would also prevent "Receive accept error" being raised as a result of
non-disabled interrupts targeting offline CPUs.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- Split from bigger patch.
- Iterate over all devices, even if the handler returns failure.
---
xen/arch/x86/crash.c | 1 +
xen/arch/x86/include/asm/msi.h | 1 +
xen/arch/x86/msi.c | 18 ++++++++++++++++++
xen/arch/x86/smp.c | 1 +
xen/drivers/passthrough/pci.c | 33 +++++++++++++++++++++++++++++++++
xen/include/xen/pci.h | 4 ++++
6 files changed, 58 insertions(+)
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index a789416ca3ae..c946225c0b9b 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -177,6 +177,7 @@ static void nmi_shootdown_cpus(void)
disable_IO_APIC();
hpet_disable();
+ pci_disable_msi_all();
}
}
diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
index 63adb19820e8..7f9e531f73e6 100644
--- a/xen/arch/x86/include/asm/msi.h
+++ b/xen/arch/x86/include/asm/msi.h
@@ -86,6 +86,7 @@ extern int pci_enable_msi(struct pci_dev *pdev, struct msi_info *msi,
extern void pci_disable_msi(struct msi_desc *msi_desc);
extern int pci_prepare_msix(u16 seg, u8 bus, u8 devfn, bool off);
extern void pci_cleanup_msi(struct pci_dev *pdev);
+extern void pci_disable_msi_all(void);
extern int setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc);
extern int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
hw_irq_controller *handler);
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index e2360579deda..c9fe942c46f3 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -1248,6 +1248,24 @@ void pci_cleanup_msi(struct pci_dev *pdev)
msi_free_irqs(pdev);
}
+static int cf_check disable_msi(struct pci_dev *pdev, void *arg)
+{
+ msi_set_enable(pdev, 0);
+ msix_set_enable(pdev, 0);
+
+ return 0;
+}
+
+/* Disable MSI and/or MSI-X on all devices known by Xen. */
+void pci_disable_msi_all(void)
+{
+ int rc = pci_iterate_devices(disable_msi, NULL);
+
+ if ( rc )
+ printk(XENLOG_ERR
+ "Failed to disable MSI(-X) on some devices: %d\n", rc);
+}
+
int pci_reset_msix_state(struct pci_dev *pdev)
{
unsigned int pos = pdev->msix_pos;
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index f931db0d71c6..f58c8d3cafe1 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -376,6 +376,7 @@ void smp_send_stop(void)
local_irq_disable();
disable_IO_APIC();
hpet_disable();
+ pci_disable_msi_all();
if ( num_online_cpus() > 1 )
{
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 777c6b1a7fdc..945118383f45 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1803,6 +1803,39 @@ int iommu_do_pci_domctl(
return ret;
}
+struct segment_iter {
+ int (*handler)(struct pci_dev *pdev, void *arg);
+ void *arg;
+ int rc;
+};
+
+static int cf_check iterate_all(struct pci_seg *pseg, void *arg)
+{
+ struct segment_iter *iter = arg;
+ struct pci_dev *pdev;
+
+ list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
+ {
+ int rc = iter->handler(pdev, iter->arg);
+
+ if ( !iter->rc )
+ iter->rc = rc;
+ }
+
+ return 0;
+}
+
+int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg),
+ void *arg)
+{
+ struct segment_iter iter = {
+ .handler = handler,
+ .arg = arg,
+ };
+
+ return pci_segments_iterate(iterate_all, &iter) ?: iter.rc;
+}
+
/*
* Local variables:
* mode: C
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index f784e9116059..983c592124a8 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -226,6 +226,10 @@ struct pci_dev *pci_get_pdev(const struct domain *d, pci_sbdf_t sbdf);
struct pci_dev *pci_get_real_pdev(pci_sbdf_t sbdf);
void pci_check_disable_device(u16 seg, u8 bus, u8 devfn);
+/* Iterate without locking or preemption over all PCI devices known by Xen. */
+int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg),
+ void *arg);
+
uint8_t pci_conf_read8(pci_sbdf_t sbdf, unsigned int reg);
uint16_t pci_conf_read16(pci_sbdf_t sbdf, unsigned int reg);
uint32_t pci_conf_read32(pci_sbdf_t sbdf, unsigned int reg);
--
2.46.0
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH v2 4/5] x86/pci: disable MSI(-X) on all devices at shutdown
2025-02-06 15:06 ` [PATCH v2 4/5] x86/pci: disable MSI(-X) on all devices at shutdown Roger Pau Monne
@ 2025-02-10 10:41 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-02-10 10:41 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, xen-devel
On 06.02.2025 16:06, Roger Pau Monne wrote:
> --- a/xen/arch/x86/crash.c
> +++ b/xen/arch/x86/crash.c
> @@ -177,6 +177,7 @@ static void nmi_shootdown_cpus(void)
>
> disable_IO_APIC();
> hpet_disable();
> + pci_disable_msi_all();
> }
Apart from my concern below regarding use of the function in this context,
for both uses I wonder in how far the order of the three calls above may
matter. I can't really give a precise reason, but to me it feels like the
PCI device processing may better be done first.
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1803,6 +1803,39 @@ int iommu_do_pci_domctl(
> return ret;
> }
>
> +struct segment_iter {
> + int (*handler)(struct pci_dev *pdev, void *arg);
> + void *arg;
> + int rc;
> +};
> +
> +static int cf_check iterate_all(struct pci_seg *pseg, void *arg)
> +{
> + struct segment_iter *iter = arg;
> + struct pci_dev *pdev;
> +
> + list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> + {
> + int rc = iter->handler(pdev, iter->arg);
> +
> + if ( !iter->rc )
> + iter->rc = rc;
> + }
> +
> + return 0;
> +}
> +
> +int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg),
> + void *arg)
> +{
> + struct segment_iter iter = {
> + .handler = handler,
> + .arg = arg,
> + };
> +
> + return pci_segments_iterate(iterate_all, &iter) ?: iter.rc;
> +}
My earlier concern remains as far as e.g. list traversal goes, especially
when we're called from nmi_shootdown_cpus() context. The lists themselves
may be screwed, after all. Whereas disable_IO_APIC() and hpet_disable()
don't involve any list traversal, and even if they did those lists would
be stable post-boot.
We may want to talk about the up- and down-sides of this on the x86 call
later in the day.
> --- a/xen/include/xen/pci.h
> +++ b/xen/include/xen/pci.h
> @@ -226,6 +226,10 @@ struct pci_dev *pci_get_pdev(const struct domain *d, pci_sbdf_t sbdf);
> struct pci_dev *pci_get_real_pdev(pci_sbdf_t sbdf);
> void pci_check_disable_device(u16 seg, u8 bus, u8 devfn);
>
> +/* Iterate without locking or preemption over all PCI devices known by Xen. */
> +int pci_iterate_devices(int (*handler)(struct pci_dev *pdev, void *arg),
> + void *arg);
Oh, I see you added the comment here that I did ask for. As it's pretty
important for people to notice, may I ask that it be replicated in (or
ahead of) the function definition? And then there perhaps also mentioning
that one needs to be aware of the function being expected to run with IRQs
off (to make clear that it's not a simple matter of adding preemption
checks, for example).
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 5/5] x86/iommu: disable interrupts at shutdown
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
` (3 preceding siblings ...)
2025-02-06 15:06 ` [PATCH v2 4/5] x86/pci: disable MSI(-X) on all devices at shutdown Roger Pau Monne
@ 2025-02-06 15:06 ` Roger Pau Monne
2025-02-10 11:04 ` Jan Beulich
2025-02-10 10:02 ` [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors " Roger Pau Monné
2025-02-11 6:39 ` [PATCH " Jan Beulich
6 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monne @ 2025-02-06 15:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Add a new hook to inhibit interrupt generation by the IOMMU(s). Note the
hook is currently only implemented for x86 IOMMUs. The purpose is to
disable interrupt generation at shutdown so any kexec chained image finds
the IOMMU(s) in a quiesced state.
It would also prevent "Receive accept error" being raised as a result of
non-disabled interrupts targeting offline CPUs.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
- New in this version.
---
xen/arch/x86/crash.c | 1 +
xen/arch/x86/smp.c | 1 +
xen/drivers/passthrough/amd/iommu.h | 1 +
xen/drivers/passthrough/amd/iommu_init.c | 17 +++++++++++++++++
xen/drivers/passthrough/amd/pci_amd_iommu.c | 1 +
xen/drivers/passthrough/iommu.c | 6 ++++++
xen/drivers/passthrough/vtd/iommu.c | 19 +++++++++++++++++++
xen/include/xen/iommu.h | 3 +++
8 files changed, 49 insertions(+)
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index c946225c0b9b..a5feed298d1e 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -178,6 +178,7 @@ static void nmi_shootdown_cpus(void)
disable_IO_APIC();
hpet_disable();
pci_disable_msi_all();
+ iommu_quiesce();
}
}
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index f58c8d3cafe1..06c0e84d40b3 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -377,6 +377,7 @@ void smp_send_stop(void)
disable_IO_APIC();
hpet_disable();
pci_disable_msi_all();
+ iommu_quiesce();
if ( num_online_cpus() > 1 )
{
diff --git a/xen/drivers/passthrough/amd/iommu.h b/xen/drivers/passthrough/amd/iommu.h
index c32e9e9a1602..00e81b4b2ab3 100644
--- a/xen/drivers/passthrough/amd/iommu.h
+++ b/xen/drivers/passthrough/amd/iommu.h
@@ -292,6 +292,7 @@ extern unsigned long *shared_intremap_inuse;
void cf_check amd_iommu_resume(void);
int __must_check cf_check amd_iommu_suspend(void);
void cf_check amd_iommu_crash_shutdown(void);
+void cf_check amd_iommu_quiesce(void);
static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift)
{
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 302362502033..2e60aef151b0 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1611,3 +1611,20 @@ void cf_check amd_iommu_resume(void)
invalidate_all_domain_pages();
}
}
+
+void cf_check amd_iommu_quiesce(void)
+{
+ struct amd_iommu *iommu;
+
+ for_each_amd_iommu ( iommu )
+ {
+ if ( iommu->ctrl.int_cap_xt_en )
+ {
+ iommu->ctrl.int_cap_xt_en = false;
+ writeq(iommu->ctrl.raw,
+ iommu->mmio_base + IOMMU_CONTROL_MMIO_OFFSET);
+ }
+ else
+ amd_iommu_msi_enable(iommu, IOMMU_CONTROL_DISABLED);
+ }
+}
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index f96f59440bcc..d00697edb3a2 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -791,6 +791,7 @@ static const struct iommu_ops __initconst_cf_clobber _iommu_ops = {
.crash_shutdown = amd_iommu_crash_shutdown,
.get_reserved_device_memory = amd_iommu_get_reserved_device_memory,
.dump_page_tables = amd_dump_page_tables,
+ .quiesce = amd_iommu_quiesce,
};
static const struct iommu_init_ops __initconstrel _iommu_init_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 9e74a1fc72fa..35c08ee0612c 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -663,6 +663,12 @@ void iommu_crash_shutdown(void)
#endif
}
+void iommu_quiesce(void)
+{
+ if ( iommu_enabled )
+ iommu_vcall(iommu_get_ops(), quiesce);
+}
+
int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
{
const struct iommu_ops *ops;
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 9d7a9977a6a6..a1927d9f126d 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -3207,6 +3207,24 @@ static int cf_check intel_iommu_quarantine_init(struct pci_dev *pdev,
return rc;
}
+static void cf_check vtd_quiesce(void)
+{
+ const struct acpi_drhd_unit *drhd;
+
+ for_each_drhd_unit ( drhd )
+ {
+ const struct vtd_iommu *iommu = drhd->iommu;
+ uint32_t sts = dmar_readl(iommu->reg, DMAR_FECTL_REG);
+
+ /*
+ * Open code dma_msi_mask() to avoid taking the spinlock which could
+ * deadlock if called from crash context.
+ */
+ sts |= DMA_FECTL_IM;
+ dmar_writel(iommu->reg, DMAR_FECTL_REG, sts);
+ }
+}
+
static const struct iommu_ops __initconst_cf_clobber vtd_ops = {
.page_sizes = PAGE_SIZE_4K,
.init = intel_iommu_domain_init,
@@ -3236,6 +3254,7 @@ static const struct iommu_ops __initconst_cf_clobber vtd_ops = {
.iotlb_flush = iommu_flush_iotlb,
.get_reserved_device_memory = intel_iommu_get_reserved_device_memory,
.dump_page_tables = vtd_dump_page_tables,
+ .quiesce = vtd_quiesce,
};
const struct iommu_init_ops __initconstrel intel_iommu_init_ops = {
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index b928c67e1995..77a514019cc6 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -314,6 +314,8 @@ struct iommu_ops {
*/
int (*dt_xlate)(device_t *dev, const struct dt_phandle_args *args);
#endif
+ /* Inhibit all interrupt generation, to be used at shutdown. */
+ void (*quiesce)(void);
};
/*
@@ -404,6 +406,7 @@ static inline int iommu_do_domctl(struct xen_domctl *domctl, struct domain *d,
int __must_check iommu_suspend(void);
void iommu_resume(void);
void iommu_crash_shutdown(void);
+void iommu_quiesce(void);
int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt);
int iommu_quarantine_dev_init(device_t *dev);
--
2.46.0
^ permalink raw reply related [flat|nested] 20+ messages in thread* Re: [PATCH v2 5/5] x86/iommu: disable interrupts at shutdown
2025-02-06 15:06 ` [PATCH v2 5/5] x86/iommu: disable interrupts " Roger Pau Monne
@ 2025-02-10 11:04 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-02-10 11:04 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 06.02.2025 16:06, Roger Pau Monne wrote:
> Add a new hook to inhibit interrupt generation by the IOMMU(s). Note the
> hook is currently only implemented for x86 IOMMUs.
Yet then ...
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -663,6 +663,12 @@ void iommu_crash_shutdown(void)
> #endif
> }
>
> +void iommu_quiesce(void)
> +{
> + if ( iommu_enabled )
> + iommu_vcall(iommu_get_ops(), quiesce);
> +}
..., with this being in common code, the function wants checking that the pointer
to be called through isn't NULL.
As to the use on the crash path - iommu_crash_shutdown() was called a few lines
earlier. Why would we still need iommu_quiesce() there?
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
` (4 preceding siblings ...)
2025-02-06 15:06 ` [PATCH v2 5/5] x86/iommu: disable interrupts " Roger Pau Monne
@ 2025-02-10 10:02 ` Roger Pau Monné
2025-02-10 18:29 ` Oleksii Kurochko
2025-02-11 6:39 ` [PATCH " Jan Beulich
6 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2025-02-10 10:02 UTC (permalink / raw)
To: Oleksii Kurochko, Jan Beulich, Andrew Cooper
Cc: xen-devel, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini
Hello,
This should have had a 'for-4.20?' tag in the subject name, as
otherwise we will need to add an errata to the release notes to notice
that reboot can sometimes fail on AMD boxes.
Also adding Oleksii.
Thanks, Roger.
On Thu, Feb 06, 2025 at 04:06:10PM +0100, Roger Pau Monne wrote:
> Hello,
>
> The following series aims to prevent local APIC errors from stalling the
> shtudown process. On XenServer testing we have seen reports of AMD
> boxes sporadically getting stuck in a spam of:
>
> APIC error on CPU0: 00(08), Receive accept error
>
> Messages during shutdown, as a result of device interrupts targeting
> CPUs that are offline (and have the local APIC disabled).
>
> First patch strictly solves the issue of shutdown getting stuck, further
> patches aim to quiesce interrupts from all devices (known by Xen) as an
> attempt to prevent a spurious "APIC error on CPU0: 00(00)" plus also
> make kexec more reliable.
>
> Thanks, Roger.
>
> Roger Pau Monne (5):
> x86/shutdown: offline APs with interrupts disabled on all CPUs
> x86/irq: drop fixup_irqs() parameters
> x86/smp: perform disabling on interrupts ahead of AP shutdown
> x86/pci: disable MSI(-X) on all devices at shutdown
> x86/iommu: disable interrupts at shutdown
>
> xen/arch/x86/crash.c | 2 ++
> xen/arch/x86/include/asm/irq.h | 4 +--
> xen/arch/x86/include/asm/msi.h | 1 +
> xen/arch/x86/irq.c | 30 ++++++++-----------
> xen/arch/x86/msi.c | 18 +++++++++++
> xen/arch/x86/smp.c | 33 +++++++++++++++------
> xen/arch/x86/smpboot.c | 2 +-
> xen/drivers/passthrough/amd/iommu.h | 1 +
> xen/drivers/passthrough/amd/iommu_init.c | 17 +++++++++++
> xen/drivers/passthrough/amd/pci_amd_iommu.c | 1 +
> xen/drivers/passthrough/iommu.c | 6 ++++
> xen/drivers/passthrough/pci.c | 33 +++++++++++++++++++++
> xen/drivers/passthrough/vtd/iommu.c | 19 ++++++++++++
> xen/include/xen/iommu.h | 3 ++
> xen/include/xen/pci.h | 4 +++
> 15 files changed, 145 insertions(+), 29 deletions(-)
>
> --
> 2.46.0
>
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-10 10:02 ` [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors " Roger Pau Monné
@ 2025-02-10 18:29 ` Oleksii Kurochko
2025-02-11 8:38 ` Roger Pau Monné
0 siblings, 1 reply; 20+ messages in thread
From: Oleksii Kurochko @ 2025-02-10 18:29 UTC (permalink / raw)
To: Roger Pau Monné
Cc: xen-devel, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini
[-- Attachment #1: Type: text/plain, Size: 2510 bytes --]
Hello Roger,
On 2/10/25 11:02 AM, Roger Pau Monné wrote:
> Hello,
>
> This should have had a 'for-4.20?' tag in the subject name, as
> otherwise we will need to add an errata to the release notes to notice
> that reboot can sometimes fail on AMD boxes.
>
> Also adding Oleksii.
>
> Thanks, Roger.
>
> On Thu, Feb 06, 2025 at 04:06:10PM +0100, Roger Pau Monne wrote:
>> Hello,
>>
>> The following series aims to prevent local APIC errors from stalling the
>> shtudown process. On XenServer testing we have seen reports of AMD
>> boxes sporadically getting stuck in a spam of:
How often this issue happen?
>>
>> APIC error on CPU0: 00(08), Receive accept error
>>
>> Messages during shutdown, as a result of device interrupts targeting
>> CPUs that are offline (and have the local APIC disabled).
>>
>> First patch strictly solves the issue of shutdown getting stuck, further
>> patches aim to quiesce interrupts from all devices (known by Xen) as an
>> attempt to prevent a spurious "APIC error on CPU0: 00(00)" plus also
>> make kexec more reliable.
If the first patch solves does it make sense to consider, at least, it to be merged?
~ Oleksii
>>
>> Thanks, Roger.
>>
>> Roger Pau Monne (5):
>> x86/shutdown: offline APs with interrupts disabled on all CPUs
>> x86/irq: drop fixup_irqs() parameters
>> x86/smp: perform disabling on interrupts ahead of AP shutdown
>> x86/pci: disable MSI(-X) on all devices at shutdown
>> x86/iommu: disable interrupts at shutdown
>>
>> xen/arch/x86/crash.c | 2 ++
>> xen/arch/x86/include/asm/irq.h | 4 +--
>> xen/arch/x86/include/asm/msi.h | 1 +
>> xen/arch/x86/irq.c | 30 ++++++++-----------
>> xen/arch/x86/msi.c | 18 +++++++++++
>> xen/arch/x86/smp.c | 33 +++++++++++++++------
>> xen/arch/x86/smpboot.c | 2 +-
>> xen/drivers/passthrough/amd/iommu.h | 1 +
>> xen/drivers/passthrough/amd/iommu_init.c | 17 +++++++++++
>> xen/drivers/passthrough/amd/pci_amd_iommu.c | 1 +
>> xen/drivers/passthrough/iommu.c | 6 ++++
>> xen/drivers/passthrough/pci.c | 33 +++++++++++++++++++++
>> xen/drivers/passthrough/vtd/iommu.c | 19 ++++++++++++
>> xen/include/xen/iommu.h | 3 ++
>> xen/include/xen/pci.h | 4 +++
>> 15 files changed, 145 insertions(+), 29 deletions(-)
>>
>> --
>> 2.46.0
>>
[-- Attachment #2: Type: text/html, Size: 3202 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-10 18:29 ` Oleksii Kurochko
@ 2025-02-11 8:38 ` Roger Pau Monné
2025-02-11 9:26 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2025-02-11 8:38 UTC (permalink / raw)
To: Oleksii Kurochko
Cc: xen-devel, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini
On Mon, Feb 10, 2025 at 07:29:35PM +0100, Oleksii Kurochko wrote:
> Hello Roger,
>
> On 2/10/25 11:02 AM, Roger Pau Monné wrote:
> > Hello,
> >
> > This should have had a 'for-4.20?' tag in the subject name, as
> > otherwise we will need to add an errata to the release notes to notice
> > that reboot can sometimes fail on AMD boxes.
> >
> > Also adding Oleksii.
> >
> > Thanks, Roger.
> >
> > On Thu, Feb 06, 2025 at 04:06:10PM +0100, Roger Pau Monne wrote:
> > > Hello,
> > >
> > > The following series aims to prevent local APIC errors from stalling the
> > > shtudown process. On XenServer testing we have seen reports of AMD
> > > boxes sporadically getting stuck in a spam of:
>
> How often this issue happen?
Hard to tell, we have certainly hit it more than once on XenRT, but
I don't have figures about its probability. I have at least 14
reports from XenRT from the last 6 months, but there's possibly a lot
more that could have been classified as a different kind of issue.
> > >
> > > APIC error on CPU0: 00(08), Receive accept error
> > >
> > > Messages during shutdown, as a result of device interrupts targeting
> > > CPUs that are offline (and have the local APIC disabled).
> > >
> > > First patch strictly solves the issue of shutdown getting stuck, further
> > > patches aim to quiesce interrupts from all devices (known by Xen) as an
> > > attempt to prevent a spurious "APIC error on CPU0: 00(00)" plus also
> > > make kexec more reliable.
>
> If the first patch solves does it make sense to consider, at least, it to be merged?
First one sure, the rest I think are also worth considering. They get
rid of the resulting innocuous "APIC error on CPU0: 00(00)" message.
Also remaining patches are likely to provide the kexec kernel with a
better context, as they quiesce interrupts from devices.
I will send a new version soon, hopefully we can discuss over that one
which patches we want to pick. With my XenServer hat on I plan to
backport the whole series into our patch queue.
Thanks, Roger.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-11 8:38 ` Roger Pau Monné
@ 2025-02-11 9:26 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-02-11 9:26 UTC (permalink / raw)
To: Roger Pau Monné, Oleksii Kurochko
Cc: xen-devel, Andrew Cooper, Anthony PERARD, Michal Orzel,
Julien Grall, Stefano Stabellini
On 11.02.2025 09:38, Roger Pau Monné wrote:
> On Mon, Feb 10, 2025 at 07:29:35PM +0100, Oleksii Kurochko wrote:
>> On 2/10/25 11:02 AM, Roger Pau Monné wrote:
>>> This should have had a 'for-4.20?' tag in the subject name, as
>>> otherwise we will need to add an errata to the release notes to notice
>>> that reboot can sometimes fail on AMD boxes.
>>>
>>> Also adding Oleksii.
>>>
>>> Thanks, Roger.
>>>
>>> On Thu, Feb 06, 2025 at 04:06:10PM +0100, Roger Pau Monne wrote:
>>>> Hello,
>>>>
>>>> The following series aims to prevent local APIC errors from stalling the
>>>> shtudown process. On XenServer testing we have seen reports of AMD
>>>> boxes sporadically getting stuck in a spam of:
>>
>> How often this issue happen?
>
> Hard to tell, we have certainly hit it more than once on XenRT, but
> I don't have figures about its probability. I have at least 14
> reports from XenRT from the last 6 months, but there's possibly a lot
> more that could have been classified as a different kind of issue.
Our QA tells me that this is severely hindering their work.
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-06 15:06 [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown Roger Pau Monne
` (5 preceding siblings ...)
2025-02-10 10:02 ` [PATCH for-4.20? v2 0/5] xen/x86: prevent local APIC errors " Roger Pau Monné
@ 2025-02-11 6:39 ` Jan Beulich
2025-02-11 8:50 ` Roger Pau Monné
6 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2025-02-11 6:39 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, xen-devel
On 06.02.2025 16:06, Roger Pau Monne wrote:
> The following series aims to prevent local APIC errors from stalling the
> shtudown process. On XenServer testing we have seen reports of AMD
> boxes sporadically getting stuck in a spam of:
>
> APIC error on CPU0: 00(08), Receive accept error
>
> Messages during shutdown, as a result of device interrupts targeting
> CPUs that are offline (and have the local APIC disabled).
One more thought here: Have you/we perhaps discovered the reason why there
was that 1ms delay at the end of fixup_irqs() that was badly commented,
and that you removed in e2bb28d62158 ("x86/irq: forward pending interrupts
to new destination in fixup_irqs()")? May be worth mentioning that by way
of a Fixes: tag.
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-11 6:39 ` [PATCH " Jan Beulich
@ 2025-02-11 8:50 ` Roger Pau Monné
2025-02-11 9:20 ` Jan Beulich
0 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2025-02-11 8:50 UTC (permalink / raw)
To: Jan Beulich
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, xen-devel
On Tue, Feb 11, 2025 at 07:39:12AM +0100, Jan Beulich wrote:
> On 06.02.2025 16:06, Roger Pau Monne wrote:
> > The following series aims to prevent local APIC errors from stalling the
> > shtudown process. On XenServer testing we have seen reports of AMD
> > boxes sporadically getting stuck in a spam of:
> >
> > APIC error on CPU0: 00(08), Receive accept error
> >
> > Messages during shutdown, as a result of device interrupts targeting
> > CPUs that are offline (and have the local APIC disabled).
>
> One more thought here: Have you/we perhaps discovered the reason why there
> was that 1ms delay at the end of fixup_irqs() that was badly commented,
> and that you removed in e2bb28d62158 ("x86/irq: forward pending interrupts
> to new destination in fixup_irqs()")? May be worth mentioning that by way
> of a Fixes: tag.
Hm, so you think the delay was added there as a way to ensure any
pending interrupts would get drained (ie: serviced) on the old target?
I'm maybe a bit confused, but I don't think the delay would help much
with preventing the local APIC errors? Regardless of the wait, if the
interrupts target offline CPUs there's a chance receive accept errors
will be triggered on AMD.
Thanks, Roger.
^ permalink raw reply [flat|nested] 20+ messages in thread* Re: [PATCH v2 0/5] xen/x86: prevent local APIC errors at shutdown
2025-02-11 8:50 ` Roger Pau Monné
@ 2025-02-11 9:20 ` Jan Beulich
0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2025-02-11 9:20 UTC (permalink / raw)
To: Roger Pau Monné
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, xen-devel
On 11.02.2025 09:50, Roger Pau Monné wrote:
> On Tue, Feb 11, 2025 at 07:39:12AM +0100, Jan Beulich wrote:
>> On 06.02.2025 16:06, Roger Pau Monne wrote:
>>> The following series aims to prevent local APIC errors from stalling the
>>> shtudown process. On XenServer testing we have seen reports of AMD
>>> boxes sporadically getting stuck in a spam of:
>>>
>>> APIC error on CPU0: 00(08), Receive accept error
>>>
>>> Messages during shutdown, as a result of device interrupts targeting
>>> CPUs that are offline (and have the local APIC disabled).
>>
>> One more thought here: Have you/we perhaps discovered the reason why there
>> was that 1ms delay at the end of fixup_irqs() that was badly commented,
>> and that you removed in e2bb28d62158 ("x86/irq: forward pending interrupts
>> to new destination in fixup_irqs()")? May be worth mentioning that by way
>> of a Fixes: tag.
>
> Hm, so you think the delay was added there as a way to ensure any
> pending interrupts would get drained (ie: serviced) on the old target?
So far I didn't have the slightest idea why that call had been there. This
at least gives a possible reason.
> I'm maybe a bit confused, but I don't think the delay would help much
> with preventing the local APIC errors? Regardless of the wait, if the
> interrupts target offline CPUs there's a chance receive accept errors
> will be triggered on AMD.
But fixup_irqs() right now runs ahead of actually offlining the APs.
Jan
^ permalink raw reply [flat|nested] 20+ messages in thread