* [PATCH 00/12] x86/irq: drop interrupt target cpumasks
@ 2025-11-20 9:06 Roger Pau Monne
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
` (5 more replies)
0 siblings, 6 replies; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, Jason Andryuk
Hello,
After the adjustments that dropped logical delivery mode for all
external interrupts the Xen internal interrupt descriptors no longer
require a cpumask to store the possible targets of the interrupt. Using
physical delivery mode only allows a single CPU to be the target of the
interrupt, and hence the field can be converted into an unsigned int.
The series contain a bit of cleanup and code adjustments ahead of the
removal of the cpumasks that's done in the last 3 patches.
A couple of notes from the series:
* It's possible that further simplifications can be applied to
fixup_irqs() (and other logic) after dropping the cpumasks. I
haven't looked closely enough, but in any case that would be a
followup change(s) on top of this series.
* Current code seems to confuse interrupt affinity (possible interrupt
destinations) with the current target of the interrupt. For example
irq_set_affinity() doesn't set the interrupt affinity, but rather
migrates the interrupt from one target CPU to another.
Thanks, Roger.
Roger Pau Monne (12):
x86/apic: remove cpu_mask_to_apicid hook
x86/apic: remove vector_allocation_cpumask hook
x86/irq: introduce local irq_desc
x86/irq: set accurate cpu_mask for high priority vectors used by
external interrupts
x86/io-apic: purge usage of logical mode
x86/i8259: redo workaround for AMD spurious PIC interrupts
x86/io-apic: fix usage of setup_ioapic_dest()
x86/irq: adjust bind_irq_vector() to take a single CPU as parameter
x86/irq: convert cpumask parameter to integer in
{,p}irq_set_affinity()
x86/irq: convert irq_desc cpu_mask field to integer
x86/irq: convert irq_desc old_cpu_mask field to integer
x86/irq: convert irq_desc pending_mask field to integer
xen/arch/x86/genapic/bigsmp.c | 2 -
xen/arch/x86/genapic/default.c | 2 -
xen/arch/x86/genapic/delivery.c | 7 +-
xen/arch/x86/genapic/x2apic.c | 9 -
xen/arch/x86/hpet.c | 11 +-
xen/arch/x86/hvm/hvm.c | 2 +-
xen/arch/x86/i8259.c | 17 +-
xen/arch/x86/include/asm/genapic.h | 17 +-
xen/arch/x86/include/asm/irq.h | 22 +-
xen/arch/x86/include/asm/msi.h | 3 +-
xen/arch/x86/io_apic.c | 61 +++---
xen/arch/x86/irq.c | 249 ++++++++++-------------
xen/arch/x86/msi.c | 17 +-
xen/arch/x86/platform_hypercall.c | 2 +-
xen/arch/x86/smpboot.c | 12 +-
xen/common/cpu.c | 1 +
xen/common/event_channel.c | 6 +-
xen/drivers/passthrough/amd/iommu_init.c | 2 +-
xen/drivers/passthrough/vtd/iommu.c | 2 +-
xen/include/xen/cpumask.h | 1 +
xen/include/xen/irq.h | 3 +-
21 files changed, 179 insertions(+), 269 deletions(-)
--
2.51.0
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
@ 2025-11-20 9:06 ` Roger Pau Monne
2025-11-20 12:19 ` Jan Beulich
2025-11-20 12:25 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook Roger Pau Monne
` (4 subsequent siblings)
5 siblings, 2 replies; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
All implementations use the same hook. No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/genapic/bigsmp.c | 1 -
xen/arch/x86/genapic/default.c | 1 -
xen/arch/x86/genapic/delivery.c | 2 +-
xen/arch/x86/genapic/x2apic.c | 2 --
xen/arch/x86/include/asm/genapic.h | 13 +------------
5 files changed, 2 insertions(+), 17 deletions(-)
diff --git a/xen/arch/x86/genapic/bigsmp.c b/xen/arch/x86/genapic/bigsmp.c
index ddb3a0b5d727..066feb4a1755 100644
--- a/xen/arch/x86/genapic/bigsmp.c
+++ b/xen/arch/x86/genapic/bigsmp.c
@@ -48,7 +48,6 @@ const struct genapic __initconst_cf_clobber apic_bigsmp = {
APIC_INIT("bigsmp", probe_bigsmp),
.init_apic_ldr = init_apic_ldr_phys,
.vector_allocation_cpumask = vector_allocation_cpumask_phys,
- .cpu_mask_to_apicid = cpu_mask_to_apicid_phys,
.send_IPI_mask = send_IPI_mask_phys,
.send_IPI_self = send_IPI_self_legacy
};
diff --git a/xen/arch/x86/genapic/default.c b/xen/arch/x86/genapic/default.c
index 16e1875f6378..ab9a292464d6 100644
--- a/xen/arch/x86/genapic/default.c
+++ b/xen/arch/x86/genapic/default.c
@@ -18,7 +18,6 @@ const struct genapic __initconst_cf_clobber apic_default = {
APIC_INIT("default", NULL),
.init_apic_ldr = init_apic_ldr_flat,
.vector_allocation_cpumask = vector_allocation_cpumask_phys,
- .cpu_mask_to_apicid = cpu_mask_to_apicid_phys,
.send_IPI_mask = send_IPI_mask_flat,
.send_IPI_self = send_IPI_self_legacy
};
diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c
index 15100439be05..5d105e848502 100644
--- a/xen/arch/x86/genapic/delivery.c
+++ b/xen/arch/x86/genapic/delivery.c
@@ -35,7 +35,7 @@ const cpumask_t *cf_check vector_allocation_cpumask_phys(int cpu)
return cpumask_of(cpu);
}
-unsigned int cf_check cpu_mask_to_apicid_phys(const cpumask_t *cpumask)
+unsigned int cpu_mask_to_apicid(const cpumask_t *cpumask)
{
/* As we are using single CPU as destination, pick only one CPU here */
return cpu_physical_id(cpumask_any(cpumask));
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index 58157c217ee8..f4709ab92950 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -142,7 +142,6 @@ static const struct genapic __initconst_cf_clobber apic_x2apic_phys = {
APIC_INIT("x2apic_phys", NULL),
.init_apic_ldr = init_apic_ldr_phys,
.vector_allocation_cpumask = vector_allocation_cpumask_phys,
- .cpu_mask_to_apicid = cpu_mask_to_apicid_phys,
.send_IPI_mask = send_IPI_mask_x2apic_phys,
.send_IPI_self = send_IPI_self_x2apic
};
@@ -162,7 +161,6 @@ static const struct genapic __initconst_cf_clobber apic_x2apic_mixed = {
* hence are set to use Physical destination mode handlers.
*/
.vector_allocation_cpumask = vector_allocation_cpumask_phys,
- .cpu_mask_to_apicid = cpu_mask_to_apicid_phys,
/*
* The following fields are exclusively used by IPIs and hence are set to
diff --git a/xen/arch/x86/include/asm/genapic.h b/xen/arch/x86/include/asm/genapic.h
index 04d3f1de7a1f..6d979279bd2b 100644
--- a/xen/arch/x86/include/asm/genapic.h
+++ b/xen/arch/x86/include/asm/genapic.h
@@ -25,7 +25,6 @@ struct genapic {
void (*init_apic_ldr)(void);
const cpumask_t *(*vector_allocation_cpumask)(int cpu);
- unsigned int (*cpu_mask_to_apicid)(const cpumask_t *cpumask);
void (*send_IPI_mask)(const cpumask_t *mask, int vector);
void (*send_IPI_self)(uint8_t vector);
};
@@ -36,16 +35,6 @@ struct genapic {
#define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
#define init_apic_ldr() alternative_vcall(genapic.init_apic_ldr)
-#define cpu_mask_to_apicid(mask) ({ \
- /* \
- * There are a number of places where the address of a local variable \
- * gets passed here. The use of ?: in alternative_call<N>() triggers an \
- * "address of ... is always true" warning in such a case with at least \
- * gcc 7 and 8. Hence the seemingly pointless local variable here. \
- */ \
- const cpumask_t *m_ = (mask); \
- alternative_call(genapic.cpu_mask_to_apicid, m_); \
-})
#define vector_allocation_cpumask(cpu) \
alternative_call(genapic.vector_allocation_cpumask, cpu)
@@ -59,7 +48,7 @@ void cf_check init_apic_ldr_flat(void);
void cf_check send_IPI_mask_flat(const cpumask_t *cpumask, int vector);
void cf_check init_apic_ldr_phys(void);
-unsigned int cf_check cpu_mask_to_apicid_phys(const cpumask_t *cpumask);
+unsigned int cpu_mask_to_apicid(const cpumask_t *cpumask);
void cf_check send_IPI_mask_phys(const cpumask_t *mask, int vector);
const cpumask_t *cf_check vector_allocation_cpumask_phys(int cpu);
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
@ 2025-11-20 9:06 ` Roger Pau Monne
2025-11-20 12:25 ` Jan Beulich
2025-11-20 12:30 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 03/12] x86/irq: introduce local irq_desc Roger Pau Monne
` (3 subsequent siblings)
5 siblings, 2 replies; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
All implementations use the same hook which is a wrapper around
cpumask_of(cpu). Adjust callers to no longer use such dummy mask with a
single CPU set, as the CPU is already known to the caller. This removes a
pair of usages of for_each_cpu().
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/genapic/bigsmp.c | 1 -
xen/arch/x86/genapic/default.c | 1 -
xen/arch/x86/genapic/delivery.c | 5 -----
xen/arch/x86/genapic/x2apic.c | 7 -------
xen/arch/x86/include/asm/genapic.h | 4 ----
xen/arch/x86/irq.c | 19 +++++++------------
6 files changed, 7 insertions(+), 30 deletions(-)
diff --git a/xen/arch/x86/genapic/bigsmp.c b/xen/arch/x86/genapic/bigsmp.c
index 066feb4a1755..c0bcee659f1b 100644
--- a/xen/arch/x86/genapic/bigsmp.c
+++ b/xen/arch/x86/genapic/bigsmp.c
@@ -47,7 +47,6 @@ static int __init cf_check probe_bigsmp(void)
const struct genapic __initconst_cf_clobber apic_bigsmp = {
APIC_INIT("bigsmp", probe_bigsmp),
.init_apic_ldr = init_apic_ldr_phys,
- .vector_allocation_cpumask = vector_allocation_cpumask_phys,
.send_IPI_mask = send_IPI_mask_phys,
.send_IPI_self = send_IPI_self_legacy
};
diff --git a/xen/arch/x86/genapic/default.c b/xen/arch/x86/genapic/default.c
index ab9a292464d6..58b5884aac0d 100644
--- a/xen/arch/x86/genapic/default.c
+++ b/xen/arch/x86/genapic/default.c
@@ -17,7 +17,6 @@
const struct genapic __initconst_cf_clobber apic_default = {
APIC_INIT("default", NULL),
.init_apic_ldr = init_apic_ldr_flat,
- .vector_allocation_cpumask = vector_allocation_cpumask_phys,
.send_IPI_mask = send_IPI_mask_flat,
.send_IPI_self = send_IPI_self_legacy
};
diff --git a/xen/arch/x86/genapic/delivery.c b/xen/arch/x86/genapic/delivery.c
index 5d105e848502..777570f3b633 100644
--- a/xen/arch/x86/genapic/delivery.c
+++ b/xen/arch/x86/genapic/delivery.c
@@ -30,11 +30,6 @@ void cf_check init_apic_ldr_phys(void)
/* We only deliver in phys mode - no setup needed. */
}
-const cpumask_t *cf_check vector_allocation_cpumask_phys(int cpu)
-{
- return cpumask_of(cpu);
-}
-
unsigned int cpu_mask_to_apicid(const cpumask_t *cpumask)
{
/* As we are using single CPU as destination, pick only one CPU here */
diff --git a/xen/arch/x86/genapic/x2apic.c b/xen/arch/x86/genapic/x2apic.c
index f4709ab92950..89d66bc627d7 100644
--- a/xen/arch/x86/genapic/x2apic.c
+++ b/xen/arch/x86/genapic/x2apic.c
@@ -141,7 +141,6 @@ static void cf_check send_IPI_mask_x2apic_cluster(
static const struct genapic __initconst_cf_clobber apic_x2apic_phys = {
APIC_INIT("x2apic_phys", NULL),
.init_apic_ldr = init_apic_ldr_phys,
- .vector_allocation_cpumask = vector_allocation_cpumask_phys,
.send_IPI_mask = send_IPI_mask_x2apic_phys,
.send_IPI_self = send_IPI_self_x2apic
};
@@ -156,12 +155,6 @@ static const struct genapic __initconst_cf_clobber apic_x2apic_phys = {
static const struct genapic __initconst_cf_clobber apic_x2apic_mixed = {
APIC_INIT("x2apic_mixed", NULL),
- /*
- * The following fields are exclusively used by external interrupts and
- * hence are set to use Physical destination mode handlers.
- */
- .vector_allocation_cpumask = vector_allocation_cpumask_phys,
-
/*
* The following fields are exclusively used by IPIs and hence are set to
* use Cluster Logical destination mode handlers. Note that init_apic_ldr
diff --git a/xen/arch/x86/include/asm/genapic.h b/xen/arch/x86/include/asm/genapic.h
index 6d979279bd2b..33ec492a6ff5 100644
--- a/xen/arch/x86/include/asm/genapic.h
+++ b/xen/arch/x86/include/asm/genapic.h
@@ -24,7 +24,6 @@ struct genapic {
int (*probe)(void);
void (*init_apic_ldr)(void);
- const cpumask_t *(*vector_allocation_cpumask)(int cpu);
void (*send_IPI_mask)(const cpumask_t *mask, int vector);
void (*send_IPI_self)(uint8_t vector);
};
@@ -35,8 +34,6 @@ struct genapic {
#define TARGET_CPUS ((const typeof(cpu_online_map) *)&cpu_online_map)
#define init_apic_ldr() alternative_vcall(genapic.init_apic_ldr)
-#define vector_allocation_cpumask(cpu) \
- alternative_call(genapic.vector_allocation_cpumask, cpu)
extern struct genapic genapic;
extern const struct genapic apic_default;
@@ -50,7 +47,6 @@ void cf_check send_IPI_mask_flat(const cpumask_t *cpumask, int vector);
void cf_check init_apic_ldr_phys(void);
unsigned int cpu_mask_to_apicid(const cpumask_t *cpumask);
void cf_check send_IPI_mask_phys(const cpumask_t *mask, int vector);
-const cpumask_t *cf_check vector_allocation_cpumask_phys(int cpu);
void generic_apic_probe(void);
void generic_bigsmp_probe(void);
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 80f7417d2103..7009a3c6d0dd 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -621,16 +621,12 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
for_each_cpu(cpu, mask)
{
- const cpumask_t *vec_mask;
- int new_cpu;
int vector, offset;
/* Only try and allocate irqs on cpus that are present. */
if (!cpu_online(cpu))
continue;
- vec_mask = vector_allocation_cpumask(cpu);
-
vector = current_vector;
offset = current_offset;
next:
@@ -650,13 +646,12 @@ next:
&& test_bit(vector, irq_used_vectors) )
goto next;
- if ( cpumask_test_cpu(0, vec_mask) &&
+ if ( !cpu &&
vector >= FIRST_LEGACY_VECTOR && vector <= LAST_LEGACY_VECTOR )
goto next;
- for_each_cpu(new_cpu, vec_mask)
- if (per_cpu(vector_irq, new_cpu)[vector] >= 0)
- goto next;
+ if ( per_cpu(vector_irq, cpu)[vector] >= 0 )
+ goto next;
/* Found one! */
current_vector = vector;
current_offset = offset;
@@ -688,12 +683,12 @@ next:
release_old_vec(desc);
}
- trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
+ trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, cpumask_of(cpu));
- for_each_cpu(new_cpu, vec_mask)
- per_cpu(vector_irq, new_cpu)[vector] = irq;
+ per_cpu(vector_irq, cpu)[vector] = irq;
desc->arch.vector = vector;
- cpumask_copy(desc->arch.cpu_mask, vec_mask);
+ cpumask_clear(desc->arch.cpu_mask);
+ cpumask_set_cpu(cpu, desc->arch.cpu_mask);
desc->arch.used = IRQ_USED;
ASSERT((desc->arch.used_vectors == NULL)
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 03/12] x86/irq: introduce local irq_desc
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
2025-11-20 9:06 ` [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook Roger Pau Monne
@ 2025-11-20 9:06 ` Roger Pau Monne
2025-11-20 12:39 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts Roger Pau Monne
` (2 subsequent siblings)
5 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Avoid duplicated calls to irq_to_desc() by storing the result in a local
variable. No functional change.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/smpboot.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8d3161248de0..7fab5552335b 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1455,12 +1455,16 @@ void __init smp_intr_init(void)
*/
for ( seridx = 0; seridx <= SERHND_IDX; seridx++ )
{
+ struct irq_desc *desc;
+
if ( (irq = serial_irq(seridx)) < 0 )
continue;
vector = alloc_hipriority_vector();
per_cpu(vector_irq, cpu)[vector] = irq;
- irq_to_desc(irq)->arch.vector = vector;
- cpumask_copy(irq_to_desc(irq)->arch.cpu_mask, &cpu_online_map);
+
+ desc = irq_to_desc(irq);
+ desc->arch.vector = vector;
+ cpumask_copy(desc->arch.cpu_mask, &cpu_online_map);
}
/* Direct IPI vectors. */
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
` (2 preceding siblings ...)
2025-11-20 9:06 ` [PATCH 03/12] x86/irq: introduce local irq_desc Roger Pau Monne
@ 2025-11-20 9:06 ` Roger Pau Monne
2025-11-20 12:45 ` Andrew Cooper
2025-11-20 13:06 ` Jan Beulich
2025-11-20 9:06 ` [PATCH 05/12] x86/io-apic: purge usage of logical mode Roger Pau Monne
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
5 siblings, 2 replies; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Setting the irq descriptor target CPU mask of high priority interrupts to
contain all online CPUs is not accurate. External interrupts are
exclusively delivered using physical destination mode, and hence can only
target a single CPU. Setting the descriptor CPU mask to contain all online
CPUs makes it impossible for Xen to figure out which CPU the interrupt is
really targeting.
Instead handle high priority vectors used by external interrupts similarly
to normal vectors, keeping the target CPU mask accurate. Introduce
specific code in _assign_irq_vector() to deal with moving high priority
vectors across CPUs, this is needed at least for fixup_irqs() to be able to
evacuate those if the target CPU goes offline.
Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/irq.c | 24 ++++++++++++++++++------
xen/arch/x86/smpboot.c | 3 ++-
2 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 7009a3c6d0dd..5cd934ea2a32 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -547,6 +547,20 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
cpumask_t tmp_mask;
cpumask_and(&tmp_mask, mask, &cpu_online_map);
+
+ /*
+ * High priority vectors are reserved on all CPUs, hence moving them
+ * just requires changing the target CPU. There's no need for vector
+ * allocation on the destination.
+ */
+ if ( old_vector >= FIRST_HIPRIORITY_VECTOR &&
+ old_vector <= LAST_HIPRIORITY_VECTOR )
+ {
+ cpumask_copy(desc->arch.cpu_mask,
+ cpumask_of(cpumask_any(&tmp_mask)));
+ return 0;
+ }
+
if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
desc->arch.vector = old_vector;
return 0;
@@ -756,12 +770,10 @@ void setup_vector_irq(unsigned int cpu)
if ( !irq_desc_initialized(desc) )
continue;
vector = irq_to_vector(irq);
- if ( vector >= FIRST_HIPRIORITY_VECTOR &&
- vector <= LAST_HIPRIORITY_VECTOR )
- cpumask_set_cpu(cpu, desc->arch.cpu_mask);
- else if ( !cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
- continue;
- per_cpu(vector_irq, cpu)[vector] = irq;
+ if ( (vector >= FIRST_HIPRIORITY_VECTOR &&
+ vector <= LAST_HIPRIORITY_VECTOR) ||
+ cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
+ per_cpu(vector_irq, cpu)[vector] = irq;
}
}
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 7fab5552335b..69cc9bbc6e2c 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1464,7 +1464,8 @@ void __init smp_intr_init(void)
desc = irq_to_desc(irq);
desc->arch.vector = vector;
- cpumask_copy(desc->arch.cpu_mask, &cpu_online_map);
+ cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+ cpumask_setall(desc->affinity);
}
/* Direct IPI vectors. */
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 05/12] x86/io-apic: purge usage of logical mode
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
` (3 preceding siblings ...)
2025-11-20 9:06 ` [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts Roger Pau Monne
@ 2025-11-20 9:06 ` Roger Pau Monne
2025-11-20 13:18 ` Andrew Cooper
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
5 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:06 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The IO-APIC RTEs are unconditionally programmed with physical destination
mode, and hence the field to set in the RTE is always physical_dest.
Remove the mode parameter from SET_DEST() and take the opportunity to
convert it into a function, there's no need for it to be a macro.
This is a benign fix, because due to the endianness of x86 the start of the
physical_dest and logical_dest fields on the RTE overlap.
While there also adjust setup_IO_APIC_irqs() to use the target CPU set by
the assign_irq_vector(), rather than using TARGET_CPUS and possibly
creating a mismatch between the target CPU in desc->arch.cpu_mask and the
one programmed in the IO-APIC RTE.
Fixes: 032d6733a458 ("x86/apic: remove delivery and destination mode fields from drivers")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/io_apic.c | 27 +++++++++++++++------------
1 file changed, 15 insertions(+), 12 deletions(-)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 46c2a43dac6d..3c59bf0e15da 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1037,12 +1037,14 @@ static hw_irq_controller ioapic_edge_type;
#define IOAPIC_EDGE 0
#define IOAPIC_LEVEL 1
-#define SET_DEST(ent, mode, val) do { \
- if (x2apic_enabled && iommu_intremap) \
- (ent).dest.dest32 = (val); \
- else \
- (ent).dest.mode.mode##_dest = (val); \
-} while (0)
+static void set_entry_dest(struct IO_APIC_route_entry *entry,
+ unsigned int apic_id)
+{
+ if ( x2apic_enabled && iommu_intremap )
+ entry->dest.dest32 = apic_id;
+ else
+ entry->dest.physical.physical_dest = apic_id;
+}
static inline void ioapic_register_intr(int irq, unsigned long trigger)
{
@@ -1109,7 +1111,8 @@ static void __init setup_IO_APIC_irqs(void)
if (platform_legacy_irq(irq))
disable_8259A_irq(irq_to_desc(irq));
- SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
+ set_entry_dest(&entry,
+ cpu_mask_to_apicid(irq_to_desc(irq)->arch.cpu_mask));
spin_lock_irqsave(&ioapic_lock, flags);
__ioapic_write_entry(apic, pin, false, entry);
spin_unlock_irqrestore(&ioapic_lock, flags);
@@ -1140,7 +1143,7 @@ static void __init setup_ExtINT_IRQ0_pin(unsigned int apic, unsigned int pin, in
*/
entry.dest_mode = 0; /* physical delivery */
entry.mask = 0; /* unmask IRQ now */
- SET_DEST(entry, logical, cpu_mask_to_apicid(TARGET_CPUS));
+ set_entry_dest(&entry, cpu_mask_to_apicid(TARGET_CPUS));
entry.delivery_mode = dest_Fixed;
entry.polarity = 0;
entry.trigger = 0;
@@ -1432,7 +1435,7 @@ void disable_IO_APIC(void)
entry.dest_mode = 0; /* Physical */
entry.delivery_mode = dest_ExtINT; /* ExtInt */
entry.vector = 0;
- SET_DEST(entry, physical, get_apic_id());
+ set_entry_dest(&entry, get_apic_id());
/*
* Add it to the IO-APIC irq-routing table:
@@ -1806,7 +1809,7 @@ static void __init unlock_ExtINT_logic(void)
entry1.dest_mode = 0; /* physical delivery */
entry1.mask = 0; /* unmask IRQ now */
- SET_DEST(entry1, physical, get_apic_id());
+ set_entry_dest(&entry1, get_apic_id());
entry1.delivery_mode = dest_ExtINT;
entry1.polarity = entry0.polarity;
entry1.trigger = 0;
@@ -2137,7 +2140,7 @@ int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int a
cpumask_t *mask = this_cpu(scratch_cpumask);
cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- SET_DEST(entry, logical, cpu_mask_to_apicid(mask));
+ set_entry_dest(&entry, cpu_mask_to_apicid(mask));
} else {
printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
@@ -2334,7 +2337,7 @@ int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
cpumask_t *mask = this_cpu(scratch_cpumask);
cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- SET_DEST(rte, logical, cpu_mask_to_apicid(mask));
+ set_entry_dest(&rte, cpu_mask_to_apicid(mask));
}
else
{
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
` (4 preceding siblings ...)
2025-11-20 9:06 ` [PATCH 05/12] x86/io-apic: purge usage of logical mode Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-20 9:58 ` [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest() Roger Pau Monne
` (6 more replies)
5 siblings, 7 replies; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Do not set legacy IRQs target CPU mask to all CPUs, as the interrupt is
strictly targeting a single CPU, even if a spurious interrupt can be
delivered to other CPUs.
Instead set the target CPU mask correctly, as do_IRQ() will always deal
with spurious interrupts generated on AMD hardware. Otherwise fixup_irqs()
would assume the IRQ is targeting all CPUs, and would simply not perform
the required affinity move if the target CPU goes offline.
Most of this is unlikely to make any difference in practice, the IO-APIC
setup will take over those entries and set the proper destination as part
of startup setup.
Fixes: 87f37449d586 ("x86/i8259: do not assume interrupts always target CPU0")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/i8259.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index 5c7e21a7515c..41f949a36531 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -407,21 +407,14 @@ void __init init_IRQ(void)
per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
/*
- * The interrupt affinity logic never targets interrupts to offline
- * CPUs, hence it's safe to use cpumask_all here.
- *
* Legacy PIC interrupts are only targeted to CPU0, but depending on
* the platform they can be distributed to any online CPU in hardware.
- * Note this behavior has only been observed on AMD hardware. In order
- * to cope install all active legacy vectors on all CPUs.
- *
- * IO-APIC will change the destination mask if/when taking ownership of
- * the interrupt.
+ * Note this behavior has only been observed on AMD hardware. Set the
+ * target CPU as expected here, and cope with the possibly spurious
+ * interrupts in do_IRQ(). This behavior has only been observed
+ * during AP bringup.
*/
- cpumask_copy(desc->arch.cpu_mask,
- (boot_cpu_data.x86_vendor &
- (X86_VENDOR_AMD | X86_VENDOR_HYGON) ? &cpumask_all
- : cpumask_of(cpu)));
+ cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
desc->arch.vector = LEGACY_VECTOR(irq);
}
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest()
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-20 16:02 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter Roger Pau Monne
` (5 subsequent siblings)
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Attempt to clarify the purpose of setup_ioapic_dest(), as the comment ahead
of it is outdated, and looks to be a verbatim copy from Linux from one of
the code imports.
The function serves two purposes: shuffling the interrupts across CPUs
after SMP bringup or re-assigning all interrupts to CPU#0 if no IRQ
balancing is set at run time. However the function won't perform any of
those functions correctly, as it was unconditionally using
desc->arch.cpu_mask as the target CPU mask for interrupts (which is the
current target anyway).
Fix by adding a new `shuffle` parameter that's used to signal whether the
function is intended to balance interrupts across CPUs, or to re-assign all
interrupts to the BSP.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I couldn't find a specific Fixes tag to use here, I think this has been
broken all along.
---
xen/arch/x86/include/asm/irq.h | 2 +-
xen/arch/x86/io_apic.c | 13 +++++++------
xen/arch/x86/platform_hypercall.c | 2 +-
xen/arch/x86/smpboot.c | 3 ++-
4 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index 7315150b66b4..df7b48c8653e 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -137,7 +137,7 @@ int i8259A_resume(void);
void enable_IO_APIC(void);
void setup_IO_APIC(void);
void disable_IO_APIC(void);
-void setup_ioapic_dest(void);
+void setup_ioapic_dest(bool shuffle);
vmask_t *io_apic_get_used_vector_map(unsigned int irq);
extern unsigned int io_apic_irqs;
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 3c59bf0e15da..19960d291c47 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -717,12 +717,14 @@ static int __init find_isa_irq_apic(int irq, int type)
static int pin_2_irq(int idx, int apic, int pin);
/*
- * This function currently is only a helper for the i386 smp boot process where
- * we need to reprogram the ioredtbls to cater for the cpus which have come online
- * so mask in all cases should simply be TARGET_CPUS
+ * This function serves two different purposes: shuffling the IO-APIC
+ * interrupts across CPUs after SMP bringup, or re-assigning all interrupts to
+ * the BSP if IRQ balancing is disabled at runtime. Such functional
+ * distinction is signaled by the `shuffle` parameter.
*/
-void /*__init*/ setup_ioapic_dest(void)
+void setup_ioapic_dest(bool shuffle)
{
+ const cpumask_t *mask = shuffle ? TARGET_CPUS : cpumask_of(0);
int pin, ioapic, irq, irq_entry;
if (skip_ioapic_setup)
@@ -740,8 +742,7 @@ void /*__init*/ setup_ioapic_dest(void)
desc = irq_to_desc(irq);
spin_lock_irqsave(&desc->lock, flags);
- BUG_ON(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
- set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
+ set_ioapic_affinity_irq(desc, mask);
spin_unlock_irqrestore(&desc->lock, flags);
}
}
diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index 79bb99e0b602..22fd65a6aa7e 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -337,7 +337,7 @@ ret_t do_platform_op(
case QUIRK_NOIRQBALANCING:
printk("Platform quirk -- Disabling IRQ balancing/affinity.\n");
opt_noirqbalance = 1;
- setup_ioapic_dest();
+ setup_ioapic_dest(false);
break;
case QUIRK_IOAPIC_BAD_REGSEL:
dprintk(XENLOG_WARNING,
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 69cc9bbc6e2c..55ea62d7d67f 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1433,7 +1433,8 @@ void __init smp_cpus_done(void)
check_nmi_watchdog();
}
- setup_ioapic_dest();
+ if ( !opt_noirqbalance )
+ setup_ioapic_dest(true);
mtrr_save_state();
mtrr_aps_sync_end();
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
2025-11-20 9:58 ` [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest() Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-24 10:52 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity() Roger Pau Monne
` (4 subsequent siblings)
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
The vector will be targeting a single CPU at a time, so passing a mask is
not needed. Simplify the interface and adjust callers to make use of it.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/hpet.c | 2 +-
xen/arch/x86/include/asm/irq.h | 2 +-
xen/arch/x86/io_apic.c | 2 +-
xen/arch/x86/irq.c | 23 ++++++++++-------------
4 files changed, 13 insertions(+), 16 deletions(-)
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index a69abe2650a8..abf4eaf86db1 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -352,7 +352,7 @@ static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch)
* Technically we don't want to bind the IRQ to any CPU yet, but we need to
* specify at least one online one here. Use the BSP.
*/
- ret = bind_irq_vector(ch->msi.irq, HPET_BROADCAST_VECTOR, cpumask_of(0));
+ ret = bind_irq_vector(ch->msi.irq, HPET_BROADCAST_VECTOR, 0);
if ( ret )
return ret;
cpumask_setall(desc->affinity);
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index df7b48c8653e..355332188932 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -199,7 +199,7 @@ void setup_vector_irq(unsigned int cpu);
void move_native_irq(struct irq_desc *desc);
void move_masked_irq(struct irq_desc *desc);
-int bind_irq_vector(int irq, int vector, const cpumask_t *mask);
+int bind_irq_vector(int irq, int vector, unsigned int cpu);
void cf_check end_nonmaskable_irq(struct irq_desc *desc, uint8_t vector);
void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask);
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 19960d291c47..dfbe27b12d54 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1859,7 +1859,7 @@ static void __init check_timer(void)
vector = IRQ0_VECTOR;
clear_irq_vector(0);
- if ((ret = bind_irq_vector(0, vector, &cpumask_all)))
+ if ((ret = bind_irq_vector(0, vector, smp_processor_id())))
printk(KERN_ERR"..IRQ0 is not set correctly with ioapic!!!, err:%d\n", ret);
irq_desc[0].status &= ~IRQ_DISABLED;
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 5cd934ea2a32..e09559fce856 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -150,26 +150,23 @@ static void trace_irq_mask(uint32_t event, int irq, int vector,
}
static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
- const cpumask_t *cpu_mask)
+ unsigned int cpu)
{
- cpumask_t online_mask;
- int cpu;
-
BUG_ON((unsigned)vector >= X86_IDT_VECTORS);
- cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
- if (cpumask_empty(&online_mask))
+ if ( !cpu_online(cpu) )
return -EINVAL;
if ( (desc->arch.vector == vector) &&
- cpumask_equal(desc->arch.cpu_mask, &online_mask) )
+ cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
return 0;
if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
return -EBUSY;
- trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
- for_each_cpu(cpu, &online_mask)
- per_cpu(vector_irq, cpu)[vector] = desc->irq;
+
+ trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, cpumask_of(cpu));
+ per_cpu(vector_irq, cpu)[vector] = desc->irq;
desc->arch.vector = vector;
- cpumask_copy(desc->arch.cpu_mask, &online_mask);
+ cpumask_clear(desc->arch.cpu_mask);
+ cpumask_set_cpu(cpu, desc->arch.cpu_mask);
if ( desc->arch.used_vectors )
{
ASSERT(!test_bit(vector, desc->arch.used_vectors));
@@ -179,7 +176,7 @@ static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
return 0;
}
-int __init bind_irq_vector(int irq, int vector, const cpumask_t *mask)
+int __init bind_irq_vector(int irq, int vector, unsigned int cpu)
{
struct irq_desc *desc = irq_to_desc(irq);
unsigned long flags;
@@ -189,7 +186,7 @@ int __init bind_irq_vector(int irq, int vector, const cpumask_t *mask)
spin_lock_irqsave(&desc->lock, flags);
spin_lock(&vector_lock);
- ret = _bind_irq_vector(desc, vector, mask);
+ ret = _bind_irq_vector(desc, vector, cpu);
spin_unlock(&vector_lock);
spin_unlock_irqrestore(&desc->lock, flags);
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity()
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
2025-11-20 9:58 ` [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest() Roger Pau Monne
2025-11-20 9:58 ` [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-24 11:20 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer Roger Pau Monne
` (3 subsequent siblings)
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini
Existing callers where already generating the passed cpumask using
cpumask_of() to contain a single target CPU. Reduce complexity by passing
the single target CPU as an integer parameter.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
The function names are misleading, as {,p}irq_set_affinity() doesn't adjust
the affinity of the interrupt (desc->affinity) but the interrupt target
itself. Further cleanup might be helpful to correctly differentiate
between setting interrupt affinity vs setting interrupt target.
---
xen/arch/x86/hvm/hvm.c | 2 +-
xen/arch/x86/include/asm/irq.h | 2 +-
xen/arch/x86/irq.c | 8 ++++----
xen/common/event_channel.c | 6 ++----
xen/include/xen/irq.h | 3 +--
5 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0ff242d4a0d6..33521599a844 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -485,7 +485,7 @@ void hvm_migrate_pirq(struct hvm_pirq_dpci *pirq_dpci, const struct vcpu *v)
if ( !desc )
return;
ASSERT(MSI_IRQ(desc - irq_desc));
- irq_set_affinity(desc, cpumask_of(v->processor));
+ irq_set_affinity(desc, v->processor);
spin_unlock_irq(&desc->lock);
}
}
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index 355332188932..73abc8323a8d 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -202,7 +202,7 @@ void move_masked_irq(struct irq_desc *desc);
int bind_irq_vector(int irq, int vector, unsigned int cpu);
void cf_check end_nonmaskable_irq(struct irq_desc *desc, uint8_t vector);
-void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask);
+void irq_set_affinity(struct irq_desc *desc, unsigned int cpu);
int init_domain_irq_mapping(struct domain *d);
void cleanup_domain_irq_mapping(struct domain *d);
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index e09559fce856..bfb94852a6dc 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -947,7 +947,7 @@ unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
}
/* For re-setting irq interrupt affinity for specific irq */
-void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask)
+void irq_set_affinity(struct irq_desc *desc, unsigned int cpu)
{
if (!desc->handler->set_affinity)
return;
@@ -955,19 +955,19 @@ void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask)
ASSERT(spin_is_locked(&desc->lock));
desc->status &= ~IRQ_MOVE_PENDING;
smp_wmb();
- cpumask_copy(desc->arch.pending_mask, mask);
+ cpumask_copy(desc->arch.pending_mask, cpumask_of(cpu));
smp_wmb();
desc->status |= IRQ_MOVE_PENDING;
}
-void pirq_set_affinity(struct domain *d, int pirq, const cpumask_t *mask)
+void pirq_set_affinity(struct domain *d, int pirq, unsigned int cpu)
{
unsigned long flags;
struct irq_desc *desc = domain_spin_lock_irq_desc(d, pirq, &flags);
if ( !desc )
return;
- irq_set_affinity(desc, mask);
+ irq_set_affinity(desc, cpu);
spin_unlock_irqrestore(&desc->lock, flags);
}
diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 67700b050ad1..8e155649b171 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -1226,8 +1226,7 @@ int evtchn_bind_vcpu(evtchn_port_t port, unsigned int vcpu_id)
break;
unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
chn->notify_vcpu_id = v->vcpu_id;
- pirq_set_affinity(d, chn->u.pirq.irq,
- cpumask_of(v->processor));
+ pirq_set_affinity(d, chn->u.pirq.irq, v->processor);
link_pirq_port(port, chn, v);
break;
#endif
@@ -1712,7 +1711,6 @@ void evtchn_destroy_final(struct domain *d)
void evtchn_move_pirqs(struct vcpu *v)
{
struct domain *d = v->domain;
- const cpumask_t *mask = cpumask_of(v->processor);
unsigned int port;
struct evtchn *chn;
@@ -1720,7 +1718,7 @@ void evtchn_move_pirqs(struct vcpu *v)
for ( port = v->pirq_evtchn_head; port; port = chn->u.pirq.next_port )
{
chn = evtchn_from_port(d, port);
- pirq_set_affinity(d, chn->u.pirq.irq, mask);
+ pirq_set_affinity(d, chn->u.pirq.irq, v->processor);
}
read_unlock(&d->event_lock);
}
diff --git a/xen/include/xen/irq.h b/xen/include/xen/irq.h
index 95034c0d6bb5..f0b119d23521 100644
--- a/xen/include/xen/irq.h
+++ b/xen/include/xen/irq.h
@@ -193,8 +193,7 @@ extern void desc_guest_eoi(struct irq_desc *desc, struct pirq *pirq);
extern int pirq_guest_unmask(struct domain *d);
extern int pirq_guest_bind(struct vcpu *v, struct pirq *pirq, int will_share);
extern void pirq_guest_unbind(struct domain *d, struct pirq *pirq);
-extern void pirq_set_affinity(struct domain *d, int pirq,
- const cpumask_t *mask);
+extern void pirq_set_affinity(struct domain *d, int pirq, unsigned int cpu);
extern struct irq_desc *domain_spin_lock_irq_desc(
struct domain *d, int pirq, unsigned long *pflags);
extern struct irq_desc *pirq_spin_lock_irq_desc(
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
` (2 preceding siblings ...)
2025-11-20 9:58 ` [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity() Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-24 12:01 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask " Roger Pau Monne
` (2 subsequent siblings)
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Anthony PERARD,
Michal Orzel, Julien Grall, Stefano Stabellini, Jason Andryuk
With the dropping of logical target mode for interrupts Xen can now
guarantee that an irq_desc can only target a single CPU. As such, simplify
the fields and convert the cpu_mask that used to contain the set of target
CPUs into a integer that will contain the single target CPU.
No functional change intended in how interrupts are assigned or distributed
across CPUs.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/hpet.c | 9 ++-
xen/arch/x86/i8259.c | 2 +-
xen/arch/x86/include/asm/irq.h | 10 +--
xen/arch/x86/include/asm/msi.h | 3 +-
xen/arch/x86/io_apic.c | 27 +++----
xen/arch/x86/irq.c | 97 ++++++++++--------------
xen/arch/x86/msi.c | 17 +----
xen/arch/x86/smpboot.c | 2 +-
xen/common/cpu.c | 1 +
xen/drivers/passthrough/amd/iommu_init.c | 2 +-
xen/drivers/passthrough/vtd/iommu.c | 2 +-
xen/include/xen/cpumask.h | 1 +
12 files changed, 70 insertions(+), 103 deletions(-)
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index abf4eaf86db1..168675420f06 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -310,9 +310,9 @@ static void cf_check hpet_msi_set_affinity(
struct msi_msg msg = ch->msi.msg;
/* This really is only for dump_irqs(). */
- cpumask_copy(desc->arch.cpu_mask, mask);
+ desc->arch.cpu = cpumask_any(mask);
- msg.dest32 = cpu_mask_to_apicid(mask);
+ msg.dest32 = cpu_physical_id(desc->arch.cpu);
msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32);
if ( msg.dest32 != ch->msi.msg.dest32 )
@@ -337,7 +337,8 @@ static int __hpet_setup_msi_irq(struct irq_desc *desc)
{
struct msi_msg msg;
- msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg);
+ msg.dest32 = cpu_physical_id(desc->arch.cpu);
+ msi_compose_msg(desc->arch.vector, &msg);
return hpet_msi_write(desc->action->dev_id, &msg);
}
@@ -501,7 +502,7 @@ static void set_channel_irq_affinity(struct hpet_event_channel *ch)
* we're no longer at risk of missing IRQs (provided hpet_broadcast_enter()
* keeps setting the new deadline only afterwards).
*/
- cpumask_copy(desc->arch.cpu_mask, cpumask_of(ch->cpu));
+ desc->arch.cpu = ch->cpu;
spin_unlock(&desc->lock);
diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index 41f949a36531..5ad5e622c0e1 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -414,7 +414,7 @@ void __init init_IRQ(void)
* interrupts in do_IRQ(). This behavior has only been observed
* during AP bringup.
*/
- cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+ desc->arch.cpu = cpu;
desc->arch.vector = LEGACY_VECTOR(irq);
}
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index 73abc8323a8d..97c706acebf2 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -69,13 +69,9 @@ struct irq_desc;
struct arch_irq_desc {
int16_t vector; /* vector itself is only 8 bits, */
int16_t old_vector; /* but we use -1 for unassigned */
- /*
- * Except for high priority interrupts @cpu_mask may have bits set for
- * offline CPUs. Consumers need to be careful to mask this down to
- * online ones as necessary. There is supposed to always be a non-
- * empty intersection with cpu_online_map.
- */
- cpumask_var_t cpu_mask;
+/* Special target CPU values. */
+#define CPU_INVALID ~0U
+ unsigned int cpu; /* Target CPU of the interrupt. */
cpumask_var_t old_cpu_mask;
cpumask_var_t pending_mask;
vmask_t *used_vectors;
diff --git a/xen/arch/x86/include/asm/msi.h b/xen/arch/x86/include/asm/msi.h
index 00059d4a3a9d..2f91294105be 100644
--- a/xen/arch/x86/include/asm/msi.h
+++ b/xen/arch/x86/include/asm/msi.h
@@ -235,8 +235,7 @@ struct arch_msix {
};
void early_msi_init(void);
-void msi_compose_msg(unsigned vector, const cpumask_t *cpu_mask,
- struct msi_msg *msg);
+void msi_compose_msg(unsigned vector, struct msi_msg *msg);
void __msi_set_enable(pci_sbdf_t sbdf, int pos, int enable);
void cf_check mask_msi_irq(struct irq_desc *desc);
void cf_check unmask_msi_irq(struct irq_desc *desc);
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index dfbe27b12d54..a50f7061161a 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1112,8 +1112,7 @@ static void __init setup_IO_APIC_irqs(void)
if (platform_legacy_irq(irq))
disable_8259A_irq(irq_to_desc(irq));
- set_entry_dest(&entry,
- cpu_mask_to_apicid(irq_to_desc(irq)->arch.cpu_mask));
+ set_entry_dest(&entry, cpu_physical_id(irq_to_desc(irq)->arch.cpu));
spin_lock_irqsave(&ioapic_lock, flags);
__ioapic_write_entry(apic, pin, false, entry);
spin_unlock_irqrestore(&ioapic_lock, flags);
@@ -2137,14 +2136,11 @@ int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int a
return vector;
entry.vector = vector;
- if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
- cpumask_t *mask = this_cpu(scratch_cpumask);
-
- cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- set_entry_dest(&entry, cpu_mask_to_apicid(mask));
+ if (cpu_online(desc->arch.cpu)) {
+ set_entry_dest(&entry, cpu_physical_id(desc->arch.cpu));
} else {
- printk(XENLOG_ERR "IRQ%d: no target CPU (%*pb vs %*pb)\n",
- irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
+ printk(XENLOG_ERR "IRQ%d: target CPU %u is offline\n",
+ irq, desc->arch.cpu);
desc->status |= IRQ_DISABLED;
}
@@ -2333,17 +2329,12 @@ int ioapic_guest_write(unsigned long physbase, unsigned int reg, u32 val)
/* Set the vector field to the real vector! */
rte.vector = desc->arch.vector;
- if ( cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS) )
- {
- cpumask_t *mask = this_cpu(scratch_cpumask);
-
- cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
- set_entry_dest(&rte, cpu_mask_to_apicid(mask));
- }
+ if ( cpu_online(desc->arch.cpu) )
+ set_entry_dest(&rte, cpu_physical_id(desc->arch.cpu));
else
{
- gprintk(XENLOG_ERR, "IRQ%d: no target CPU (%*pb vs %*pb)\n",
- irq, CPUMASK_PR(desc->arch.cpu_mask), CPUMASK_PR(TARGET_CPUS));
+ gprintk(XENLOG_ERR, "IRQ%d: target CPU %u offline\n",
+ irq, desc->arch.cpu);
desc->status |= IRQ_DISABLED;
rte.mask = 1;
}
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index bfb94852a6dc..a56d1e8fc267 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -156,8 +156,7 @@ static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
if ( !cpu_online(cpu) )
return -EINVAL;
- if ( (desc->arch.vector == vector) &&
- cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
+ if ( (desc->arch.vector == vector) && cpu == desc->arch.cpu )
return 0;
if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
return -EBUSY;
@@ -165,8 +164,7 @@ static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, cpumask_of(cpu));
per_cpu(vector_irq, cpu)[vector] = desc->irq;
desc->arch.vector = vector;
- cpumask_clear(desc->arch.cpu_mask);
- cpumask_set_cpu(cpu, desc->arch.cpu_mask);
+ desc->arch.cpu = cpu;
if ( desc->arch.used_vectors )
{
ASSERT(!test_bit(vector, desc->arch.used_vectors));
@@ -195,23 +193,21 @@ int __init bind_irq_vector(int irq, int vector, unsigned int cpu)
static void _clear_irq_vector(struct irq_desc *desc)
{
- unsigned int cpu, old_vector, irq = desc->irq;
+ unsigned int cpu = desc->arch.cpu, old_vector, irq = desc->irq;
unsigned int vector = desc->arch.vector;
cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
BUG_ON(!valid_irq_vector(vector));
/* Always clear desc->arch.vector */
- cpumask_and(tmp_mask, desc->arch.cpu_mask, &cpu_online_map);
-
- for_each_cpu(cpu, tmp_mask)
+ if ( cpu_online(cpu) )
{
ASSERT(per_cpu(vector_irq, cpu)[vector] == irq);
per_cpu(vector_irq, cpu)[vector] = ~irq;
}
desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
- cpumask_clear(desc->arch.cpu_mask);
+ desc->arch.cpu = CPU_INVALID;
if ( desc->arch.used_vectors )
{
@@ -219,7 +215,7 @@ static void _clear_irq_vector(struct irq_desc *desc)
clear_bit(vector, desc->arch.used_vectors);
}
- trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, tmp_mask);
+ trace_irq_mask(TRC_HW_IRQ_CLEAR_VECTOR, irq, vector, cpumask_of(cpu));
if ( unlikely(desc->arch.move_in_progress) )
{
@@ -392,22 +388,16 @@ int irq_to_vector(int irq)
int arch_init_one_irq_desc(struct irq_desc *desc)
{
- if ( !zalloc_cpumask_var(&desc->arch.cpu_mask) )
- return -ENOMEM;
-
if ( !alloc_cpumask_var(&desc->arch.old_cpu_mask) )
- {
- free_cpumask_var(desc->arch.cpu_mask);
return -ENOMEM;
- }
if ( !alloc_cpumask_var(&desc->arch.pending_mask) )
{
free_cpumask_var(desc->arch.old_cpu_mask);
- free_cpumask_var(desc->arch.cpu_mask);
return -ENOMEM;
}
+ desc->arch.cpu = CPU_INVALID;
desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.creator_domid = DOMID_INVALID;
@@ -553,12 +543,12 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
if ( old_vector >= FIRST_HIPRIORITY_VECTOR &&
old_vector <= LAST_HIPRIORITY_VECTOR )
{
- cpumask_copy(desc->arch.cpu_mask,
- cpumask_of(cpumask_any(&tmp_mask)));
+ desc->arch.cpu = cpumask_any(&tmp_mask);
return 0;
}
- if (cpumask_intersects(&tmp_mask, desc->arch.cpu_mask)) {
+ if ( cpumask_test_cpu(desc->arch.cpu, &tmp_mask) )
+ {
desc->arch.vector = old_vector;
return 0;
}
@@ -570,7 +560,7 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
* If the current destination is online refuse to shuffle. Retry after
* the in-progress movement has finished.
*/
- if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) )
+ if ( cpu_online(desc->arch.cpu) )
return -EAGAIN;
/*
@@ -591,11 +581,10 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
* in the 'mask' parameter.
*/
desc->arch.vector = desc->arch.old_vector;
- cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, mask);
+ desc->arch.cpu = cpumask_any(desc->arch.old_cpu_mask);
/* Undo any possibly done cleanup. */
- for_each_cpu(cpu, desc->arch.cpu_mask)
- per_cpu(vector_irq, cpu)[desc->arch.vector] = irq;
+ per_cpu(vector_irq, desc->arch.cpu)[desc->arch.vector] = irq;
/* Cancel the pending move and release the current vector. */
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
@@ -669,7 +658,7 @@ next:
if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count )
{
- ASSERT(!cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map));
+ ASSERT(!cpu_online(desc->arch.cpu));
/*
* Special case when evacuating an interrupt from a CPU to be
* offlined and the interrupt was already in the process of being
@@ -684,8 +673,9 @@ next:
}
else if ( valid_irq_vector(old_vector) )
{
- cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
- &cpu_online_map);
+ cpumask_clear(desc->arch.old_cpu_mask);
+ if ( cpu_online(desc->arch.cpu) )
+ cpumask_set_cpu(desc->arch.cpu, desc->arch.old_cpu_mask);
desc->arch.old_vector = desc->arch.vector;
if ( !cpumask_empty(desc->arch.old_cpu_mask) )
desc->arch.move_in_progress = 1;
@@ -698,8 +688,7 @@ next:
per_cpu(vector_irq, cpu)[vector] = irq;
desc->arch.vector = vector;
- cpumask_clear(desc->arch.cpu_mask);
- cpumask_set_cpu(cpu, desc->arch.cpu_mask);
+ desc->arch.cpu = cpu;
desc->arch.used = IRQ_USED;
ASSERT((desc->arch.used_vectors == NULL)
@@ -762,14 +751,14 @@ void setup_vector_irq(unsigned int cpu)
/* Mark the inuse vectors */
for ( irq = 0; irq < nr_irqs; ++irq )
{
- struct irq_desc *desc = irq_to_desc(irq);
+ const struct irq_desc *desc = irq_to_desc(irq);
if ( !irq_desc_initialized(desc) )
continue;
vector = irq_to_vector(irq);
if ( (vector >= FIRST_HIPRIORITY_VECTOR &&
vector <= LAST_HIPRIORITY_VECTOR) ||
- cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
+ cpu == desc->arch.cpu )
per_cpu(vector_irq, cpu)[vector] = irq;
}
}
@@ -847,8 +836,7 @@ void cf_check irq_move_cleanup_interrupt(void)
if (!desc->arch.move_cleanup_count)
goto unlock;
- if ( vector == desc->arch.vector &&
- cpumask_test_cpu(me, desc->arch.cpu_mask) )
+ if ( vector == desc->arch.vector && me == desc->arch.cpu )
goto unlock;
irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
@@ -910,8 +898,7 @@ void cf_check irq_complete_move(struct irq_desc *desc)
vector = (u8)get_irq_regs()->entry_vector;
me = smp_processor_id();
- if ( vector == desc->arch.vector &&
- cpumask_test_cpu(me, desc->arch.cpu_mask) )
+ if ( vector == desc->arch.vector && me == desc->arch.cpu )
send_cleanup_vector(desc);
}
@@ -919,7 +906,6 @@ unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
{
int ret;
unsigned long flags;
- cpumask_t dest_mask;
if ( mask && !cpumask_intersects(mask, &cpu_online_map) )
return BAD_APICID;
@@ -932,18 +918,12 @@ unsigned int set_desc_affinity(struct irq_desc *desc, const cpumask_t *mask)
return BAD_APICID;
if ( mask )
- {
cpumask_copy(desc->affinity, mask);
- cpumask_and(&dest_mask, mask, desc->arch.cpu_mask);
- }
else
- {
cpumask_setall(desc->affinity);
- cpumask_copy(&dest_mask, desc->arch.cpu_mask);
- }
- cpumask_and(&dest_mask, &dest_mask, &cpu_online_map);
- return cpu_mask_to_apicid(&dest_mask);
+ ASSERT(cpu_online(desc->arch.cpu));
+ return cpu_physical_id(desc->arch.cpu);
}
/* For re-setting irq interrupt affinity for specific irq */
@@ -1687,8 +1667,7 @@ int pirq_guest_bind(struct vcpu *v, struct pirq *pirq, int will_share)
cpumask_setall(desc->affinity);
affinity = &cpumask_all;
}
- else if ( !cpumask_intersects(desc->arch.cpu_mask,
- &cpu_online_map) )
+ else if ( !cpu_online(desc->arch.cpu) )
affinity = desc->affinity;
if ( affinity )
desc->handler->set_affinity(desc, affinity);
@@ -1966,6 +1945,15 @@ static void do_IRQ_guest(struct irq_desc *desc, unsigned int vector)
}
}
+static const cpumask_t *get_cpumask(unsigned int cpu)
+{
+ if ( cpu < nr_cpu_ids )
+ return cpumask_of(cpu);
+
+ ASSERT(cpu == CPU_INVALID);
+ return &cpumask_none;
+}
+
void do_IRQ(struct cpu_user_regs *regs)
{
struct irqaction *action;
@@ -2013,7 +2001,8 @@ void do_IRQ(struct cpu_user_regs *regs)
spin_lock(&desc->lock);
printk("IRQ%d a={%*pbl}[{%*pbl},{%*pbl}] v=%02x[%02x] t=%s s=%08x\n",
~irq, CPUMASK_PR(desc->affinity),
- CPUMASK_PR(desc->arch.cpu_mask),
+ /* TODO: handle hipri vectors nicely. */
+ CPUMASK_PR(get_cpumask(desc->arch.cpu)),
CPUMASK_PR(desc->arch.old_cpu_mask),
desc->arch.vector, desc->arch.old_vector,
desc->handler->typename, desc->status);
@@ -2545,7 +2534,8 @@ static void cf_check dump_irqs(unsigned char key)
printk(" IRQ:%4d vec:%02x %-15s status=%03x aff:{%*pbl}/{%*pbl} ",
irq, desc->arch.vector, desc->handler->typename, desc->status,
- CPUMASK_PR(desc->affinity), CPUMASK_PR(desc->arch.cpu_mask));
+ CPUMASK_PR(desc->affinity),
+ CPUMASK_PR(get_cpumask(desc->arch.cpu)));
if ( ssid )
printk("Z=%-25s ", ssid);
@@ -2683,8 +2673,7 @@ void fixup_irqs(void)
* interrupts.
*/
if ( apic_irr_read(desc->arch.old_vector) )
- send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)),
- desc->arch.vector);
+ send_IPI_mask(cpumask_of(desc->arch.cpu), desc->arch.vector);
/*
* This CPU is going offline, remove it from ->arch.old_cpu_mask
@@ -2709,8 +2698,7 @@ void fixup_irqs(void)
* are a subset of the online mask. What fixup_irqs() cares about is
* evacuating interrupts from CPUs not in the online mask.
*/
- if ( !desc->action || cpumask_subset(desc->arch.cpu_mask,
- &cpu_online_map) )
+ if ( !desc->action || cpu_online(desc->arch.cpu) )
{
spin_unlock(&desc->lock);
continue;
@@ -2732,7 +2720,7 @@ void fixup_irqs(void)
* the interrupt, signal to check whether there are any pending vectors
* to be handled in the local APIC after the interrupt has been moved.
*/
- if ( cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
+ if ( cpu == desc->arch.cpu )
check_irr = true;
if ( desc->handler->set_affinity )
@@ -2754,8 +2742,7 @@ void fixup_irqs(void)
* desc in order for any in-flight interrupts to be delivered to
* the lapic.
*/
- send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)),
- desc->arch.vector);
+ send_IPI_mask(cpumask_of(desc->arch.cpu), desc->arch.vector);
spin_unlock(&desc->lock);
diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
index 5389bc08674a..f70abac687e4 100644
--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -154,24 +154,13 @@ static bool msix_memory_decoded(const struct pci_dev *dev, unsigned int pos)
/*
* MSI message composition
*/
-void msi_compose_msg(unsigned vector, const cpumask_t *cpu_mask, struct msi_msg *msg)
+void msi_compose_msg(unsigned vector, struct msi_msg *msg)
{
memset(msg, 0, sizeof(*msg));
if ( vector < FIRST_DYNAMIC_VECTOR )
return;
- if ( cpu_mask )
- {
- cpumask_t *mask = this_cpu(scratch_cpumask);
-
- if ( !cpumask_intersects(cpu_mask, &cpu_online_map) )
- return;
-
- cpumask_and(mask, cpu_mask, &cpu_online_map);
- msg->dest32 = cpu_mask_to_apicid(mask);
- }
-
msg->address_hi = MSI_ADDR_BASE_HI;
msg->address_lo = MSI_ADDR_BASE_LO |
MSI_ADDR_DESTMODE_PHYS |
@@ -531,7 +520,9 @@ int __setup_msi_irq(struct irq_desc *desc, struct msi_desc *msidesc,
desc->msi_desc = msidesc;
desc->handler = handler;
- msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg);
+ msg.dest32 = cpu_physical_id(desc->arch.cpu);
+
+ msi_compose_msg(desc->arch.vector, &msg);
ret = write_msi_msg(msidesc, &msg, true);
if ( unlikely(ret) )
{
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 55ea62d7d67f..3492073d5c8c 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1465,7 +1465,7 @@ void __init smp_intr_init(void)
desc = irq_to_desc(irq);
desc->arch.vector = vector;
- cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+ desc->arch.cpu = cpu;
cpumask_setall(desc->affinity);
}
diff --git a/xen/common/cpu.c b/xen/common/cpu.c
index f09af0444b6a..f52f02e2a9d6 100644
--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -16,6 +16,7 @@ unsigned int __read_mostly nr_cpumask_bits
const cpumask_t cpumask_all = {
.bits[0 ... (BITS_TO_LONGS(NR_CPUS) - 1)] = ~0UL
};
+const cpumask_t cpumask_none;
/*
* cpu_bit_bitmap[] is a special, "compressed" data structure that
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 00d2c46cbcd5..0d03b9d86254 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -507,7 +507,7 @@ static void cf_check set_x2apic_affinity(
if ( dest == BAD_APICID )
return;
- msi_compose_msg(desc->arch.vector, NULL, &iommu->msi.msg);
+ msi_compose_msg(desc->arch.vector, &iommu->msi.msg);
iommu->msi.msg.dest32 = dest;
ctrl.dest_mode = MASK_EXTR(iommu->msi.msg.address_lo,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 90f36ac22b8a..d2dbf74bc8bb 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1194,7 +1194,7 @@ static void cf_check dma_msi_set_affinity(
return;
}
- msi_compose_msg(desc->arch.vector, NULL, &msg);
+ msi_compose_msg(desc->arch.vector, &msg);
msg.dest32 = dest;
if (x2apic_enabled)
msg.address_hi = dest & 0xFFFFFF00;
diff --git a/xen/include/xen/cpumask.h b/xen/include/xen/cpumask.h
index b713bb69a9cb..1d72c179d7a4 100644
--- a/xen/include/xen/cpumask.h
+++ b/xen/include/xen/cpumask.h
@@ -289,6 +289,7 @@ static inline const cpumask_t *cpumask_of(unsigned int cpu)
#define cpumask_bits(maskp) ((maskp)->bits)
extern const cpumask_t cpumask_all;
+extern const cpumask_t cpumask_none;
/*
* cpumask_var_t: struct cpumask for stack usage.
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask field to integer
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
` (3 preceding siblings ...)
2025-11-20 9:58 ` [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-24 12:54 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 12/12] x86/irq: convert irq_desc pending_mask " Roger Pau Monne
2025-11-20 15:05 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Jan Beulich
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
As the cpu_mask field has already been converted to an integer, propagate
such change to the field that stores the previous target CPU and convert it
to an integer.
Also convert the move_cleanup_count field into a boolean, since the
previous target will always be a single CPU.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/include/asm/irq.h | 4 +-
xen/arch/x86/irq.c | 90 +++++++++++++---------------------
2 files changed, 35 insertions(+), 59 deletions(-)
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index 97c706acebf2..bc59ce7c3ffb 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -72,10 +72,10 @@ struct arch_irq_desc {
/* Special target CPU values. */
#define CPU_INVALID ~0U
unsigned int cpu; /* Target CPU of the interrupt. */
- cpumask_var_t old_cpu_mask;
+ unsigned int old_cpu;
cpumask_var_t pending_mask;
vmask_t *used_vectors;
- unsigned move_cleanup_count;
+ bool move_cleanup : 1;
bool move_in_progress : 1;
int8_t used;
/*
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index a56d1e8fc267..680f190da065 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -115,7 +115,7 @@ static void release_old_vec(struct irq_desc *desc)
unsigned int vector = desc->arch.old_vector;
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
- cpumask_clear(desc->arch.old_cpu_mask);
+ desc->arch.old_cpu = CPU_INVALID;
if ( !valid_irq_vector(vector) )
ASSERT_UNREACHABLE();
@@ -195,7 +195,6 @@ static void _clear_irq_vector(struct irq_desc *desc)
{
unsigned int cpu = desc->arch.cpu, old_vector, irq = desc->irq;
unsigned int vector = desc->arch.vector;
- cpumask_t *tmp_mask = this_cpu(scratch_cpumask);
BUG_ON(!valid_irq_vector(vector));
@@ -221,10 +220,10 @@ static void _clear_irq_vector(struct irq_desc *desc)
{
/* If we were in motion, also clear desc->arch.old_vector */
old_vector = desc->arch.old_vector;
- cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
- for_each_cpu(cpu, tmp_mask)
+ if ( cpu_online(desc->arch.old_cpu) )
{
+ cpu = desc->arch.old_cpu;
ASSERT(per_cpu(vector_irq, cpu)[old_vector] == irq);
TRACE_TIME(TRC_HW_IRQ_MOVE_FINISH, irq, old_vector, cpu);
per_cpu(vector_irq, cpu)[old_vector] = ~irq;
@@ -388,16 +387,11 @@ int irq_to_vector(int irq)
int arch_init_one_irq_desc(struct irq_desc *desc)
{
- if ( !alloc_cpumask_var(&desc->arch.old_cpu_mask) )
- return -ENOMEM;
-
if ( !alloc_cpumask_var(&desc->arch.pending_mask) )
- {
- free_cpumask_var(desc->arch.old_cpu_mask);
return -ENOMEM;
- }
desc->arch.cpu = CPU_INVALID;
+ desc->arch.old_cpu = CPU_INVALID;
desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.creator_domid = DOMID_INVALID;
@@ -554,7 +548,7 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
}
}
- if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count )
+ if ( desc->arch.move_in_progress || desc->arch.move_cleanup )
{
/*
* If the current destination is online refuse to shuffle. Retry after
@@ -570,9 +564,9 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
* ->arch.old_cpu_mask.
*/
ASSERT(valid_irq_vector(desc->arch.old_vector));
- ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, &cpu_online_map));
+ ASSERT(cpu_online(desc->arch.old_cpu));
- if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) )
+ if ( cpumask_test_cpu(desc->arch.old_cpu, mask) )
{
/*
* Fallback to the old destination if moving is in progress and the
@@ -581,16 +575,16 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
* in the 'mask' parameter.
*/
desc->arch.vector = desc->arch.old_vector;
- desc->arch.cpu = cpumask_any(desc->arch.old_cpu_mask);
+ desc->arch.cpu = desc->arch.old_cpu;
/* Undo any possibly done cleanup. */
per_cpu(vector_irq, desc->arch.cpu)[desc->arch.vector] = irq;
/* Cancel the pending move and release the current vector. */
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
- cpumask_clear(desc->arch.old_cpu_mask);
+ desc->arch.old_cpu = CPU_INVALID;
desc->arch.move_in_progress = 0;
- desc->arch.move_cleanup_count = 0;
+ desc->arch.move_cleanup = false;
if ( desc->arch.used_vectors )
{
ASSERT(test_bit(old_vector, desc->arch.used_vectors));
@@ -656,7 +650,7 @@ next:
current_vector = vector;
current_offset = offset;
- if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count )
+ if ( desc->arch.move_in_progress || desc->arch.move_cleanup )
{
ASSERT(!cpu_online(desc->arch.cpu));
/*
@@ -673,12 +667,13 @@ next:
}
else if ( valid_irq_vector(old_vector) )
{
- cpumask_clear(desc->arch.old_cpu_mask);
- if ( cpu_online(desc->arch.cpu) )
- cpumask_set_cpu(desc->arch.cpu, desc->arch.old_cpu_mask);
+ desc->arch.old_cpu = CPU_INVALID;
desc->arch.old_vector = desc->arch.vector;
- if ( !cpumask_empty(desc->arch.old_cpu_mask) )
+ if ( cpu_online(desc->arch.cpu) )
+ {
+ desc->arch.old_cpu = desc->arch.cpu;
desc->arch.move_in_progress = 1;
+ }
else
/* This can happen while offlining a CPU. */
release_old_vec(desc);
@@ -833,7 +828,7 @@ void cf_check irq_move_cleanup_interrupt(void)
if (desc->handler->enable == enable_8259A_irq)
goto unlock;
- if (!desc->arch.move_cleanup_count)
+ if ( !desc->arch.move_cleanup )
goto unlock;
if ( vector == desc->arch.vector && me == desc->arch.cpu )
@@ -862,13 +857,10 @@ void cf_check irq_move_cleanup_interrupt(void)
TRACE_TIME(TRC_HW_IRQ_MOVE_CLEANUP, irq, vector, me);
per_cpu(vector_irq, me)[vector] = ~irq;
- desc->arch.move_cleanup_count--;
+ desc->arch.move_cleanup = false;
- if ( desc->arch.move_cleanup_count == 0 )
- {
- ASSERT(vector == desc->arch.old_vector);
- release_old_vec(desc);
- }
+ ASSERT(vector == desc->arch.old_vector);
+ release_old_vec(desc);
unlock:
spin_unlock(&desc->lock);
}
@@ -876,12 +868,11 @@ unlock:
static void send_cleanup_vector(struct irq_desc *desc)
{
- cpumask_and(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
- &cpu_online_map);
- desc->arch.move_cleanup_count = cpumask_weight(desc->arch.old_cpu_mask);
-
- if ( desc->arch.move_cleanup_count )
- send_IPI_mask(desc->arch.old_cpu_mask, IRQ_MOVE_CLEANUP_VECTOR);
+ if ( cpu_online(desc->arch.old_cpu) )
+ {
+ desc->arch.move_cleanup = true;
+ send_IPI_mask(cpumask_of(desc->arch.old_cpu), IRQ_MOVE_CLEANUP_VECTOR);
+ }
else
release_old_vec(desc);
@@ -2003,7 +1994,7 @@ void do_IRQ(struct cpu_user_regs *regs)
~irq, CPUMASK_PR(desc->affinity),
/* TODO: handle hipri vectors nicely. */
CPUMASK_PR(get_cpumask(desc->arch.cpu)),
- CPUMASK_PR(desc->arch.old_cpu_mask),
+ CPUMASK_PR(get_cpumask(desc->arch.old_cpu)),
desc->arch.vector, desc->arch.old_vector,
desc->handler->typename, desc->status);
spin_unlock(&desc->lock);
@@ -2636,26 +2627,14 @@ void fixup_irqs(void)
continue;
}
- if ( desc->arch.move_cleanup_count )
+ if ( desc->arch.move_cleanup && !cpu_online(desc->arch.old_cpu) )
{
/* The cleanup IPI may have got sent while we were still online. */
- cpumask_andnot(affinity, desc->arch.old_cpu_mask,
- &cpu_online_map);
- desc->arch.move_cleanup_count -= cpumask_weight(affinity);
- if ( !desc->arch.move_cleanup_count )
- release_old_vec(desc);
- else
- /*
- * Adjust old_cpu_mask to account for the offline CPUs,
- * otherwise further calls to fixup_irqs() could subtract those
- * again and possibly underflow the counter.
- */
- cpumask_andnot(desc->arch.old_cpu_mask, desc->arch.old_cpu_mask,
- affinity);
+ desc->arch.move_cleanup = false;
+ release_old_vec(desc);
}
- if ( desc->arch.move_in_progress &&
- cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) )
+ if ( desc->arch.move_in_progress && cpu == desc->arch.old_cpu )
{
/*
* This to be offlined CPU was the target of an interrupt that's
@@ -2685,12 +2664,9 @@ void fixup_irqs(void)
* per-cpu vector table will no longer have ->arch.old_vector
* setup, and hence ->arch.old_cpu_mask would be stale.
*/
- cpumask_clear_cpu(cpu, desc->arch.old_cpu_mask);
- if ( cpumask_empty(desc->arch.old_cpu_mask) )
- {
- desc->arch.move_in_progress = 0;
- release_old_vec(desc);
- }
+ desc->arch.old_cpu = CPU_INVALID;
+ desc->arch.move_in_progress = 0;
+ release_old_vec(desc);
}
/*
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 12/12] x86/irq: convert irq_desc pending_mask field to integer
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
` (4 preceding siblings ...)
2025-11-20 9:58 ` [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask " Roger Pau Monne
@ 2025-11-20 9:58 ` Roger Pau Monne
2025-11-24 12:57 ` Jan Beulich
2025-11-20 15:05 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Jan Beulich
6 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monne @ 2025-11-20 9:58 UTC (permalink / raw)
To: xen-devel; +Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Much like the rest of the fields that relate to the current or old CPU
target, the pending_mask can be converted into an integer field.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
xen/arch/x86/include/asm/irq.h | 2 +-
xen/arch/x86/irq.c | 14 +++++---------
2 files changed, 6 insertions(+), 10 deletions(-)
diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h
index bc59ce7c3ffb..55047772eb46 100644
--- a/xen/arch/x86/include/asm/irq.h
+++ b/xen/arch/x86/include/asm/irq.h
@@ -73,7 +73,7 @@ struct arch_irq_desc {
#define CPU_INVALID ~0U
unsigned int cpu; /* Target CPU of the interrupt. */
unsigned int old_cpu;
- cpumask_var_t pending_mask;
+ unsigned int pending_cpu;
vmask_t *used_vectors;
bool move_cleanup : 1;
bool move_in_progress : 1;
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 680f190da065..8d7947116e33 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -387,11 +387,9 @@ int irq_to_vector(int irq)
int arch_init_one_irq_desc(struct irq_desc *desc)
{
- if ( !alloc_cpumask_var(&desc->arch.pending_mask) )
- return -ENOMEM;
-
desc->arch.cpu = CPU_INVALID;
desc->arch.old_cpu = CPU_INVALID;
+ desc->arch.pending_cpu = CPU_INVALID;
desc->arch.vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
desc->arch.creator_domid = DOMID_INVALID;
@@ -760,8 +758,6 @@ void setup_vector_irq(unsigned int cpu)
void move_masked_irq(struct irq_desc *desc)
{
- cpumask_t *pending_mask = desc->arch.pending_mask;
-
if (likely(!(desc->status & IRQ_MOVE_PENDING)))
return;
@@ -779,10 +775,10 @@ void move_masked_irq(struct irq_desc *desc)
*
* For correct operation this depends on the caller masking the irqs.
*/
- if ( likely(cpumask_intersects(pending_mask, &cpu_online_map)) )
- desc->handler->set_affinity(desc, pending_mask);
+ if ( likely(cpu_online(desc->arch.pending_cpu)) )
+ desc->handler->set_affinity(desc, cpumask_of(desc->arch.pending_cpu));
- cpumask_clear(pending_mask);
+ desc->arch.pending_cpu = CPU_INVALID;
}
void move_native_irq(struct irq_desc *desc)
@@ -926,7 +922,7 @@ void irq_set_affinity(struct irq_desc *desc, unsigned int cpu)
ASSERT(spin_is_locked(&desc->lock));
desc->status &= ~IRQ_MOVE_PENDING;
smp_wmb();
- cpumask_copy(desc->arch.pending_mask, cpumask_of(cpu));
+ desc->arch.pending_cpu = cpu;
smp_wmb();
desc->status |= IRQ_MOVE_PENDING;
}
--
2.51.0
^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
@ 2025-11-20 12:19 ` Jan Beulich
2025-11-20 12:25 ` Andrew Cooper
1 sibling, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 12:19 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:06, Roger Pau Monne wrote:
> All implementations use the same hook. No functional change intended.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(even more so that I had effectively the same patch, just dealing with the other
hook at the same time)
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook
2025-11-20 9:06 ` [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook Roger Pau Monne
@ 2025-11-20 12:25 ` Jan Beulich
2025-11-20 12:30 ` Andrew Cooper
1 sibling, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 12:25 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:06, Roger Pau Monne wrote:
> All implementations use the same hook which is a wrapper around
> cpumask_of(cpu). Adjust callers to no longer use such dummy mask with a
> single CPU set, as the CPU is already known to the caller. This removes a
> pair of usages of for_each_cpu().
>
> No functional change intended.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one possible adjustment:
> @@ -688,12 +683,12 @@ next:
> release_old_vec(desc);
> }
>
> - trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
> + trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, cpumask_of(cpu));
>
> - for_each_cpu(new_cpu, vec_mask)
> - per_cpu(vector_irq, new_cpu)[vector] = irq;
> + per_cpu(vector_irq, cpu)[vector] = irq;
> desc->arch.vector = vector;
> - cpumask_copy(desc->arch.cpu_mask, vec_mask);
> + cpumask_clear(desc->arch.cpu_mask);
> + cpumask_set_cpu(cpu, desc->arch.cpu_mask);
This being a LOCKed operation, might we better use
cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
here? Otoh I expect this to go away later in the series anyway, so possibly not
a big deal.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
2025-11-20 12:19 ` Jan Beulich
@ 2025-11-20 12:25 ` Andrew Cooper
1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 12:25 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich
On 20/11/2025 9:06 am, Roger Pau Monne wrote:
> All implementations use the same hook. No functional change intended.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook
2025-11-20 9:06 ` [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook Roger Pau Monne
2025-11-20 12:25 ` Jan Beulich
@ 2025-11-20 12:30 ` Andrew Cooper
1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 12:30 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich
On 20/11/2025 9:06 am, Roger Pau Monne wrote:
> All implementations use the same hook which is a wrapper around
> cpumask_of(cpu). Adjust callers to no longer use such dummy mask with a
> single CPU set, as the CPU is already known to the caller. This removes a
> pair of usages of for_each_cpu().
>
> No functional change intended.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one suggestion.
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index 80f7417d2103..7009a3c6d0dd 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -650,13 +646,12 @@ next:
> && test_bit(vector, irq_used_vectors) )
> goto next;
>
> - if ( cpumask_test_cpu(0, vec_mask) &&
> + if ( !cpu &&
> vector >= FIRST_LEGACY_VECTOR && vector <= LAST_LEGACY_VECTOR )
> goto next;
>
> - for_each_cpu(new_cpu, vec_mask)
> - if (per_cpu(vector_irq, new_cpu)[vector] >= 0)
> - goto next;
> + if ( per_cpu(vector_irq, cpu)[vector] >= 0 )
> + goto next;
Please can we have a blank line here for legibility.
> /* Found one! */
> current_vector = vector;
> current_offset = offset;
> @@ -688,12 +683,12 @@ next:
> release_old_vec(desc);
> }
>
> - trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, vec_mask);
> + trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, cpumask_of(cpu));
This highlights that there's a problem with tracing on large systems,
not that there's anything we can really do about it.
~Andrew
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 03/12] x86/irq: introduce local irq_desc
2025-11-20 9:06 ` [PATCH 03/12] x86/irq: introduce local irq_desc Roger Pau Monne
@ 2025-11-20 12:39 ` Andrew Cooper
0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 12:39 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich
On 20/11/2025 9:06 am, Roger Pau Monne wrote:
> Avoid duplicated calls to irq_to_desc() by storing the result in a local
> variable. No functional change.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-20 9:06 ` [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts Roger Pau Monne
@ 2025-11-20 12:45 ` Andrew Cooper
2025-11-20 12:53 ` Jan Beulich
2025-11-20 13:06 ` Jan Beulich
1 sibling, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 12:45 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich
On 20/11/2025 9:06 am, Roger Pau Monne wrote:
> Setting the irq descriptor target CPU mask of high priority interrupts to
> contain all online CPUs is not accurate. External interrupts are
> exclusively delivered using physical destination mode, and hence can only
> target a single CPU. Setting the descriptor CPU mask to contain all online
> CPUs makes it impossible for Xen to figure out which CPU the interrupt is
> really targeting.
>
> Instead handle high priority vectors used by external interrupts similarly
> to normal vectors, keeping the target CPU mask accurate. Introduce
> specific code in _assign_irq_vector() to deal with moving high priority
> vectors across CPUs, this is needed at least for fixup_irqs() to be able to
> evacuate those if the target CPU goes offline.
>
> Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Which external interrupts do we have like this?
Looking at Jan's series, the VMX Posted Interrupt vector is like this,
but I can't see a case of getting a high priority vector, and
fixup_irqs() being a legitimate thing to do.
~Andrew
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-20 12:45 ` Andrew Cooper
@ 2025-11-20 12:53 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 12:53 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monne, xen-devel
On 20.11.2025 13:45, Andrew Cooper wrote:
> On 20/11/2025 9:06 am, Roger Pau Monne wrote:
>> Setting the irq descriptor target CPU mask of high priority interrupts to
>> contain all online CPUs is not accurate. External interrupts are
>> exclusively delivered using physical destination mode, and hence can only
>> target a single CPU. Setting the descriptor CPU mask to contain all online
>> CPUs makes it impossible for Xen to figure out which CPU the interrupt is
>> really targeting.
>>
>> Instead handle high priority vectors used by external interrupts similarly
>> to normal vectors, keeping the target CPU mask accurate. Introduce
>> specific code in _assign_irq_vector() to deal with moving high priority
>> vectors across CPUs, this is needed at least for fixup_irqs() to be able to
>> evacuate those if the target CPU goes offline.
>>
>> Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Which external interrupts do we have like this?
>
> Looking at Jan's series, the VMX Posted Interrupt vector is like this,
> but I can't see a case of getting a high priority vector, and
> fixup_irqs() being a legitimate thing to do.
The serial IRQ is hiprio and wants to be subject to moving around.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-20 9:06 ` [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts Roger Pau Monne
2025-11-20 12:45 ` Andrew Cooper
@ 2025-11-20 13:06 ` Jan Beulich
2025-11-21 8:29 ` Roger Pau Monné
1 sibling, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 13:06 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:06, Roger Pau Monne wrote:
> Setting the irq descriptor target CPU mask of high priority interrupts to
> contain all online CPUs is not accurate. External interrupts are
> exclusively delivered using physical destination mode, and hence can only
> target a single CPU. Setting the descriptor CPU mask to contain all online
> CPUs makes it impossible for Xen to figure out which CPU the interrupt is
> really targeting.
>
> Instead handle high priority vectors used by external interrupts similarly
> to normal vectors, keeping the target CPU mask accurate. Introduce
> specific code in _assign_irq_vector() to deal with moving high priority
> vectors across CPUs, this is needed at least for fixup_irqs() to be able to
> evacuate those if the target CPU goes offline.
>
> Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one further request:
> @@ -756,12 +770,10 @@ void setup_vector_irq(unsigned int cpu)
> if ( !irq_desc_initialized(desc) )
> continue;
> vector = irq_to_vector(irq);
> - if ( vector >= FIRST_HIPRIORITY_VECTOR &&
> - vector <= LAST_HIPRIORITY_VECTOR )
> - cpumask_set_cpu(cpu, desc->arch.cpu_mask);
> - else if ( !cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
> - continue;
> - per_cpu(vector_irq, cpu)[vector] = irq;
> + if ( (vector >= FIRST_HIPRIORITY_VECTOR &&
> + vector <= LAST_HIPRIORITY_VECTOR) ||
> + cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
> + per_cpu(vector_irq, cpu)[vector] = irq;
Going beyond desc->arch.cpu_mask for hiprio vectors may deserve a comment here. When
the vector is global, this is necessary. But for e.g. the serial IRQ (which still
moves, but isn't bound to multiple CPUs, the more normal way of respecting
desc->arch.cpu_mask would be sufficient. It is merely (largely) benign if we set
vector_irq[] also for other CPUs. "Largely" because strictly speaking if that vector
triggered on the wrong CPU for whatever reason, we rather shouldn't treat it as a
legitimate interrupt.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 05/12] x86/io-apic: purge usage of logical mode
2025-11-20 9:06 ` [PATCH 05/12] x86/io-apic: purge usage of logical mode Roger Pau Monne
@ 2025-11-20 13:18 ` Andrew Cooper
2025-11-20 14:27 ` Jan Beulich
0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 13:18 UTC (permalink / raw)
To: Roger Pau Monne, xen-devel; +Cc: Jan Beulich
On 20/11/2025 9:06 am, Roger Pau Monne wrote:
> The IO-APIC RTEs are unconditionally programmed with physical destination
> mode, and hence the field to set in the RTE is always physical_dest.
>
> Remove the mode parameter from SET_DEST() and take the opportunity to
> convert it into a function, there's no need for it to be a macro.
>
> This is a benign fix, because due to the endianness of x86 the start of the
> physical_dest and logical_dest fields on the RTE overlap.
RTEs do not have overlapping fields; it's Xen's abstraction of the
IO-APIC which is buggy.
For starters, Xen's IO_APIC_route_entry still only has a 4-bit
physical_dest field which hasn't been true since the Pentium 4 days.
This might explain some of the interrupt bugs we see.
~Andrew
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 05/12] x86/io-apic: purge usage of logical mode
2025-11-20 13:18 ` Andrew Cooper
@ 2025-11-20 14:27 ` Jan Beulich
2025-11-20 18:30 ` Andrew Cooper
0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 14:27 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monne, xen-devel
On 20.11.2025 14:18, Andrew Cooper wrote:
> On 20/11/2025 9:06 am, Roger Pau Monne wrote:
>> The IO-APIC RTEs are unconditionally programmed with physical destination
>> mode, and hence the field to set in the RTE is always physical_dest.
>>
>> Remove the mode parameter from SET_DEST() and take the opportunity to
>> convert it into a function, there's no need for it to be a macro.
>>
>> This is a benign fix, because due to the endianness of x86 the start of the
>> physical_dest and logical_dest fields on the RTE overlap.
>
> RTEs do not have overlapping fields; it's Xen's abstraction of the
> IO-APIC which is buggy.
I wouldn't put it this negatively. In the old days, ...
> For starters, Xen's IO_APIC_route_entry still only has a 4-bit
> physical_dest field which hasn't been true since the Pentium 4 days.
> This might explain some of the interrupt bugs we see.
... as you mention here, the two fields were distinct (and hence overlapping).
In a number of places we passed "logical" to SET_DEST() as the middle argument,
thus covering for the too narrow field width of physical_dest. Dropping that
parameter and always using physical_dest requires that field to be widened,
though (or else we'll end up chopping off the top 4 bits, as we already do in
disable_IO_APIC() and unlock_ExtINT_logic() - both benign as long as the CPU
used always has APIC ID 0, which will at least typically be the case, I think).
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
` (5 preceding siblings ...)
2025-11-20 9:58 ` [PATCH 12/12] x86/irq: convert irq_desc pending_mask " Roger Pau Monne
@ 2025-11-20 15:05 ` Jan Beulich
2025-11-21 8:35 ` Roger Pau Monné
6 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 15:05 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> --- a/xen/arch/x86/i8259.c
> +++ b/xen/arch/x86/i8259.c
> @@ -407,21 +407,14 @@ void __init init_IRQ(void)
> per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
>
> /*
> - * The interrupt affinity logic never targets interrupts to offline
> - * CPUs, hence it's safe to use cpumask_all here.
> - *
> * Legacy PIC interrupts are only targeted to CPU0, but depending on
> * the platform they can be distributed to any online CPU in hardware.
> - * Note this behavior has only been observed on AMD hardware. In order
> - * to cope install all active legacy vectors on all CPUs.
> - *
> - * IO-APIC will change the destination mask if/when taking ownership of
> - * the interrupt.
> + * Note this behavior has only been observed on AMD hardware. Set the
> + * target CPU as expected here, and cope with the possibly spurious
> + * interrupts in do_IRQ(). This behavior has only been observed
> + * during AP bringup.
> */
> - cpumask_copy(desc->arch.cpu_mask,
> - (boot_cpu_data.x86_vendor &
> - (X86_VENDOR_AMD | X86_VENDOR_HYGON) ? &cpumask_all
> - : cpumask_of(cpu)));
> + cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
> desc->arch.vector = LEGACY_VECTOR(irq);
> }
Doesn't this collide with what setup_vector_irq() does (see also patch 04)? If
you don't set all bits here, not all CPUs will have the vector_irq[] slot set
correctly for do_IRQ() to actually be able to associate the vector with the
right IRQ.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest()
2025-11-20 9:58 ` [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest() Roger Pau Monne
@ 2025-11-20 16:02 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-20 16:02 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> Attempt to clarify the purpose of setup_ioapic_dest(), as the comment ahead
> of it is outdated, and looks to be a verbatim copy from Linux from one of
> the code imports.
>
> The function serves two purposes: shuffling the interrupts across CPUs
> after SMP bringup or re-assigning all interrupts to CPU#0 if no IRQ
> balancing is set at run time. However the function won't perform any of
> those functions correctly, as it was unconditionally using
> desc->arch.cpu_mask as the target CPU mask for interrupts (which is the
> current target anyway).
>
> Fix by adding a new `shuffle` parameter that's used to signal whether the
> function is intended to balance interrupts across CPUs, or to re-assign all
> interrupts to the BSP.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> I couldn't find a specific Fixes tag to use here, I think this has been
> broken all along.
Perhaps dddd88c891af ("Auto-disable IRQ balancing/affinity on buggy chipsets"),
which is where the 2nd use of the function was introduced?
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -717,12 +717,14 @@ static int __init find_isa_irq_apic(int irq, int type)
> static int pin_2_irq(int idx, int apic, int pin);
>
> /*
> - * This function currently is only a helper for the i386 smp boot process where
> - * we need to reprogram the ioredtbls to cater for the cpus which have come online
> - * so mask in all cases should simply be TARGET_CPUS
> + * This function serves two different purposes: shuffling the IO-APIC
> + * interrupts across CPUs after SMP bringup, or re-assigning all interrupts to
> + * the BSP if IRQ balancing is disabled at runtime. Such functional
> + * distinction is signaled by the `shuffle` parameter.
> */
> -void /*__init*/ setup_ioapic_dest(void)
> +void setup_ioapic_dest(bool shuffle)
> {
> + const cpumask_t *mask = shuffle ? TARGET_CPUS : cpumask_of(0);
Don't we want to aim at getting rid of TARGET_CPUS, which is now only hiding
something invariant? IOW should we perhaps avoid introducing new uses?
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 05/12] x86/io-apic: purge usage of logical mode
2025-11-20 14:27 ` Jan Beulich
@ 2025-11-20 18:30 ` Andrew Cooper
2025-11-21 7:38 ` Jan Beulich
0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2025-11-20 18:30 UTC (permalink / raw)
To: Jan Beulich; +Cc: Roger Pau Monne, xen-devel
On 20/11/2025 2:27 pm, Jan Beulich wrote:
> On 20.11.2025 14:18, Andrew Cooper wrote:
>> On 20/11/2025 9:06 am, Roger Pau Monne wrote:
>>> The IO-APIC RTEs are unconditionally programmed with physical destination
>>> mode, and hence the field to set in the RTE is always physical_dest.
>>>
>>> Remove the mode parameter from SET_DEST() and take the opportunity to
>>> convert it into a function, there's no need for it to be a macro.
>>>
>>> This is a benign fix, because due to the endianness of x86 the start of the
>>> physical_dest and logical_dest fields on the RTE overlap.
>> RTEs do not have overlapping fields; it's Xen's abstraction of the
>> IO-APIC which is buggy.
> I wouldn't put it this negatively. In the old days, ...
>
>> For starters, Xen's IO_APIC_route_entry still only has a 4-bit
>> physical_dest field which hasn't been true since the Pentium 4 days.
>> This might explain some of the interrupt bugs we see.
> ... as you mention here, the two fields were distinct (and hence overlapping).
Since when?
Even in the oldest manuals I can find, it's a single field called
destination, and who's contents is interpreted differently depending on
the logical mode bit.
From a hardware point of view, there will either be 4 or 8 bits of
storage, and it will have nothing to do with the lower bits.
> In a number of places we passed "logical" to SET_DEST() as the middle argument,
> thus covering for the too narrow field width of physical_dest. Dropping that
> parameter and always using physical_dest requires that field to be widened,
> though (or else we'll end up chopping off the top 4 bits, as we already do in
> disable_IO_APIC() and unlock_ExtINT_logic() - both benign as long as the CPU
> used always has APIC ID 0, which will at least typically be the case, I think).
Latent or not, it's still in need of fixing.
It looks like the code Xen inherited was added to Linux in e1d919786
(Jan 2008, even then only x86_32) and deleted by d83e94acd957 (August 2008).
It looks like we're 17 years late undoing this...
~Andrew
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 05/12] x86/io-apic: purge usage of logical mode
2025-11-20 18:30 ` Andrew Cooper
@ 2025-11-21 7:38 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-21 7:38 UTC (permalink / raw)
To: Andrew Cooper; +Cc: Roger Pau Monne, xen-devel
On 20.11.2025 19:30, Andrew Cooper wrote:
> On 20/11/2025 2:27 pm, Jan Beulich wrote:
>> On 20.11.2025 14:18, Andrew Cooper wrote:
>>> On 20/11/2025 9:06 am, Roger Pau Monne wrote:
>>>> The IO-APIC RTEs are unconditionally programmed with physical destination
>>>> mode, and hence the field to set in the RTE is always physical_dest.
>>>>
>>>> Remove the mode parameter from SET_DEST() and take the opportunity to
>>>> convert it into a function, there's no need for it to be a macro.
>>>>
>>>> This is a benign fix, because due to the endianness of x86 the start of the
>>>> physical_dest and logical_dest fields on the RTE overlap.
>>> RTEs do not have overlapping fields; it's Xen's abstraction of the
>>> IO-APIC which is buggy.
>> I wouldn't put it this negatively. In the old days, ...
>>
>>> For starters, Xen's IO_APIC_route_entry still only has a 4-bit
>>> physical_dest field which hasn't been true since the Pentium 4 days.
>>> This might explain some of the interrupt bugs we see.
>> ... as you mention here, the two fields were distinct (and hence overlapping).
>
> Since when?
>
> Even in the oldest manuals I can find, it's a single field called
> destination, and who's contents is interpreted differently depending on
> the logical mode bit.
And these two different meanings are reflected in the union-ized definition
in Xen. Which I still think is fair to describe as "overlapping". And the
use of a union itself isn't buggy imo; it's some of the code using it which
is.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-20 13:06 ` Jan Beulich
@ 2025-11-21 8:29 ` Roger Pau Monné
2025-11-24 10:16 ` Jan Beulich
0 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monné @ 2025-11-21 8:29 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Thu, Nov 20, 2025 at 02:06:39PM +0100, Jan Beulich wrote:
> On 20.11.2025 10:06, Roger Pau Monne wrote:
> > Setting the irq descriptor target CPU mask of high priority interrupts to
> > contain all online CPUs is not accurate. External interrupts are
> > exclusively delivered using physical destination mode, and hence can only
> > target a single CPU. Setting the descriptor CPU mask to contain all online
> > CPUs makes it impossible for Xen to figure out which CPU the interrupt is
> > really targeting.
> >
> > Instead handle high priority vectors used by external interrupts similarly
> > to normal vectors, keeping the target CPU mask accurate. Introduce
> > specific code in _assign_irq_vector() to deal with moving high priority
> > vectors across CPUs, this is needed at least for fixup_irqs() to be able to
> > evacuate those if the target CPU goes offline.
> >
> > Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with one further request:
>
> > @@ -756,12 +770,10 @@ void setup_vector_irq(unsigned int cpu)
> > if ( !irq_desc_initialized(desc) )
> > continue;
> > vector = irq_to_vector(irq);
> > - if ( vector >= FIRST_HIPRIORITY_VECTOR &&
> > - vector <= LAST_HIPRIORITY_VECTOR )
> > - cpumask_set_cpu(cpu, desc->arch.cpu_mask);
> > - else if ( !cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
> > - continue;
> > - per_cpu(vector_irq, cpu)[vector] = irq;
> > + if ( (vector >= FIRST_HIPRIORITY_VECTOR &&
> > + vector <= LAST_HIPRIORITY_VECTOR) ||
> > + cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
> > + per_cpu(vector_irq, cpu)[vector] = irq;
>
> Going beyond desc->arch.cpu_mask for hiprio vectors may deserve a comment here. When
> the vector is global, this is necessary. But for e.g. the serial IRQ (which still
> moves, but isn't bound to multiple CPUs, the more normal way of respecting
> desc->arch.cpu_mask would be sufficient.
Note that hiprio vectors are handled specially in _assign_irq_vector()
and the logic to set per_cpu(vector_irq, cpu)[vector] is
short-circuited assuming that hiprio vectors are already setup on all
CPUs.
> It is merely (largely) benign if we set
> vector_irq[] also for other CPUs. "Largely" because strictly speaking if that vector
> triggered on the wrong CPU for whatever reason, we rather shouldn't treat it as a
> legitimate interrupt.
I see, yes, we could handle hiprio vectors more like normal ones and
do a bit more work in _assign_irq_vector() to also set the
vector_irq[] array at bind time, but then we would need the cleanup
logic as the interrupt migrates, which is a bit of overhead when we
know the vector won't be re-used for anything else.
What about I add the following comment:
if ( cpumask_test_cpu(cpu, desc->arch.cpu_mask) ||
/*
* High-priority vectors are reserved on all CPUs. Set
* the vector to IRQ assignment on all CPUs, even if the
* interrupt is not targeting this CPU. That makes
* shuffling those interrupts around CPUs easier.
*/
(vector >= FIRST_HIPRIORITY_VECTOR &&
vector <= LAST_HIPRIORITY_VECTOR) )
per_cpu(vector_irq, cpu)[vector] = irq;
Thanks, Roger.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts
2025-11-20 15:05 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Jan Beulich
@ 2025-11-21 8:35 ` Roger Pau Monné
2025-11-24 10:20 ` Jan Beulich
0 siblings, 1 reply; 36+ messages in thread
From: Roger Pau Monné @ 2025-11-21 8:35 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, xen-devel
On Thu, Nov 20, 2025 at 04:05:38PM +0100, Jan Beulich wrote:
> On 20.11.2025 10:58, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/i8259.c
> > +++ b/xen/arch/x86/i8259.c
> > @@ -407,21 +407,14 @@ void __init init_IRQ(void)
> > per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
> >
> > /*
> > - * The interrupt affinity logic never targets interrupts to offline
> > - * CPUs, hence it's safe to use cpumask_all here.
> > - *
> > * Legacy PIC interrupts are only targeted to CPU0, but depending on
> > * the platform they can be distributed to any online CPU in hardware.
> > - * Note this behavior has only been observed on AMD hardware. In order
> > - * to cope install all active legacy vectors on all CPUs.
> > - *
> > - * IO-APIC will change the destination mask if/when taking ownership of
> > - * the interrupt.
> > + * Note this behavior has only been observed on AMD hardware. Set the
> > + * target CPU as expected here, and cope with the possibly spurious
> > + * interrupts in do_IRQ(). This behavior has only been observed
> > + * during AP bringup.
> > */
> > - cpumask_copy(desc->arch.cpu_mask,
> > - (boot_cpu_data.x86_vendor &
> > - (X86_VENDOR_AMD | X86_VENDOR_HYGON) ? &cpumask_all
> > - : cpumask_of(cpu)));
> > + cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
> > desc->arch.vector = LEGACY_VECTOR(irq);
> > }
>
> Doesn't this collide with what setup_vector_irq() does (see also patch 04)? If
> you don't set all bits here, not all CPUs will have the vector_irq[] slot set
> correctly for do_IRQ() to actually be able to associate the vector with the
> right IRQ.
For the AMD workaround I've only ever saw the spurious vector
triggering on CPUs different than the BSP at bringup, I don't think we
strictly need to bind all legacy vectors on all possible CPUs. Well
behaved PIC interrupts will only target the BSP, and that's properly
setup.
Thanks, Roger.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts
2025-11-21 8:29 ` Roger Pau Monné
@ 2025-11-24 10:16 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 10:16 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, xen-devel
On 21.11.2025 09:29, Roger Pau Monné wrote:
> On Thu, Nov 20, 2025 at 02:06:39PM +0100, Jan Beulich wrote:
>> On 20.11.2025 10:06, Roger Pau Monne wrote:
>>> Setting the irq descriptor target CPU mask of high priority interrupts to
>>> contain all online CPUs is not accurate. External interrupts are
>>> exclusively delivered using physical destination mode, and hence can only
>>> target a single CPU. Setting the descriptor CPU mask to contain all online
>>> CPUs makes it impossible for Xen to figure out which CPU the interrupt is
>>> really targeting.
>>>
>>> Instead handle high priority vectors used by external interrupts similarly
>>> to normal vectors, keeping the target CPU mask accurate. Introduce
>>> specific code in _assign_irq_vector() to deal with moving high priority
>>> vectors across CPUs, this is needed at least for fixup_irqs() to be able to
>>> evacuate those if the target CPU goes offline.
>>>
>>> Fixes: fc0c3fa2ad5c ("x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)")
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> with one further request:
>>
>>> @@ -756,12 +770,10 @@ void setup_vector_irq(unsigned int cpu)
>>> if ( !irq_desc_initialized(desc) )
>>> continue;
>>> vector = irq_to_vector(irq);
>>> - if ( vector >= FIRST_HIPRIORITY_VECTOR &&
>>> - vector <= LAST_HIPRIORITY_VECTOR )
>>> - cpumask_set_cpu(cpu, desc->arch.cpu_mask);
>>> - else if ( !cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
>>> - continue;
>>> - per_cpu(vector_irq, cpu)[vector] = irq;
>>> + if ( (vector >= FIRST_HIPRIORITY_VECTOR &&
>>> + vector <= LAST_HIPRIORITY_VECTOR) ||
>>> + cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
>>> + per_cpu(vector_irq, cpu)[vector] = irq;
>>
>> Going beyond desc->arch.cpu_mask for hiprio vectors may deserve a comment here. When
>> the vector is global, this is necessary. But for e.g. the serial IRQ (which still
>> moves, but isn't bound to multiple CPUs, the more normal way of respecting
>> desc->arch.cpu_mask would be sufficient.
>
> Note that hiprio vectors are handled specially in _assign_irq_vector()
> and the logic to set per_cpu(vector_irq, cpu)[vector] is
> short-circuited assuming that hiprio vectors are already setup on all
> CPUs.
>
>> It is merely (largely) benign if we set
>> vector_irq[] also for other CPUs. "Largely" because strictly speaking if that vector
>> triggered on the wrong CPU for whatever reason, we rather shouldn't treat it as a
>> legitimate interrupt.
>
> I see, yes, we could handle hiprio vectors more like normal ones and
> do a bit more work in _assign_irq_vector() to also set the
> vector_irq[] array at bind time, but then we would need the cleanup
> logic as the interrupt migrates, which is a bit of overhead when we
> know the vector won't be re-used for anything else.
>
> What about I add the following comment:
>
> if ( cpumask_test_cpu(cpu, desc->arch.cpu_mask) ||
> /*
> * High-priority vectors are reserved on all CPUs. Set
> * the vector to IRQ assignment on all CPUs, even if the
> * interrupt is not targeting this CPU. That makes
> * shuffling those interrupts around CPUs easier.
> */
> (vector >= FIRST_HIPRIORITY_VECTOR &&
> vector <= LAST_HIPRIORITY_VECTOR) )
> per_cpu(vector_irq, cpu)[vector] = irq;
Thanks, that's fine with me.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts
2025-11-21 8:35 ` Roger Pau Monné
@ 2025-11-24 10:20 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 10:20 UTC (permalink / raw)
To: Roger Pau Monné; +Cc: Andrew Cooper, xen-devel
On 21.11.2025 09:35, Roger Pau Monné wrote:
> On Thu, Nov 20, 2025 at 04:05:38PM +0100, Jan Beulich wrote:
>> On 20.11.2025 10:58, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/i8259.c
>>> +++ b/xen/arch/x86/i8259.c
>>> @@ -407,21 +407,14 @@ void __init init_IRQ(void)
>>> per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
>>>
>>> /*
>>> - * The interrupt affinity logic never targets interrupts to offline
>>> - * CPUs, hence it's safe to use cpumask_all here.
>>> - *
>>> * Legacy PIC interrupts are only targeted to CPU0, but depending on
>>> * the platform they can be distributed to any online CPU in hardware.
>>> - * Note this behavior has only been observed on AMD hardware. In order
>>> - * to cope install all active legacy vectors on all CPUs.
>>> - *
>>> - * IO-APIC will change the destination mask if/when taking ownership of
>>> - * the interrupt.
>>> + * Note this behavior has only been observed on AMD hardware. Set the
>>> + * target CPU as expected here, and cope with the possibly spurious
>>> + * interrupts in do_IRQ(). This behavior has only been observed
>>> + * during AP bringup.
>>> */
>>> - cpumask_copy(desc->arch.cpu_mask,
>>> - (boot_cpu_data.x86_vendor &
>>> - (X86_VENDOR_AMD | X86_VENDOR_HYGON) ? &cpumask_all
>>> - : cpumask_of(cpu)));
>>> + cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
>>> desc->arch.vector = LEGACY_VECTOR(irq);
>>> }
>>
>> Doesn't this collide with what setup_vector_irq() does (see also patch 04)? If
>> you don't set all bits here, not all CPUs will have the vector_irq[] slot set
>> correctly for do_IRQ() to actually be able to associate the vector with the
>> right IRQ.
>
> For the AMD workaround I've only ever saw the spurious vector
> triggering on CPUs different than the BSP at bringup,
Besides this possibly being an artifact (the issue occurring once during boot
may hide it otherwise potentially also occurring later), I think we want to be
conservative and (continue to) cover all possible cases unless we have an
explanation for why the issue would be AP-bringup-time-only.
Jan
> I don't think we
> strictly need to bind all legacy vectors on all possible CPUs. Well
> behaved PIC interrupts will only target the BSP, and that's properly
> setup.
>
> Thanks, Roger.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter
2025-11-20 9:58 ` [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter Roger Pau Monne
@ 2025-11-24 10:52 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 10:52 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> The vector will be targeting a single CPU at a time, so passing a mask is
> not needed. Simplify the interface and adjust callers to make use of it.
Considering the two callers we have, and considering the function is __init,
do we need the function parameter at all? Can't we uniformly use the BSP?
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -352,7 +352,7 @@ static int __init hpet_setup_msi_irq(struct hpet_event_channel *ch)
> * Technically we don't want to bind the IRQ to any CPU yet, but we need to
> * specify at least one online one here. Use the BSP.
> */
> - ret = bind_irq_vector(ch->msi.irq, HPET_BROADCAST_VECTOR, cpumask_of(0));
> + ret = bind_irq_vector(ch->msi.irq, HPET_BROADCAST_VECTOR, 0);
Irrespective of the remark above, the comment then will want re-wording some,
too.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity()
2025-11-20 9:58 ` [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity() Roger Pau Monne
@ 2025-11-24 11:20 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 11:20 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> Existing callers where already generating the passed cpumask using
> cpumask_of() to contain a single target CPU. Reduce complexity by passing
> the single target CPU as an integer parameter.
>
> No functional change intended.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> The function names are misleading, as {,p}irq_set_affinity() doesn't adjust
> the affinity of the interrupt (desc->affinity) but the interrupt target
> itself. Further cleanup might be helpful to correctly differentiate
> between setting interrupt affinity vs setting interrupt target.
Indeed, I wanted to mention this as well. As you touch all callers anyway,
how about doing a rename (whether to "target" or "binding" or yet something
else) right here?
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer
2025-11-20 9:58 ` [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer Roger Pau Monne
@ 2025-11-24 12:01 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 12:01 UTC (permalink / raw)
To: Roger Pau Monne
Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
Stefano Stabellini, Jason Andryuk, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -310,9 +310,9 @@ static void cf_check hpet_msi_set_affinity(
> struct msi_msg msg = ch->msi.msg;
>
> /* This really is only for dump_irqs(). */
> - cpumask_copy(desc->arch.cpu_mask, mask);
> + desc->arch.cpu = cpumask_any(mask);
Going from the comment, couldn't you use CPU_INVALID here? Then again, see
"x86/HPET: drop .set_affinity hook", where the function goes away anyway.
> @@ -337,7 +337,8 @@ static int __hpet_setup_msi_irq(struct irq_desc *desc)
> {
> struct msi_msg msg;
>
> - msi_compose_msg(desc->arch.vector, desc->arch.cpu_mask, &msg);
> + msg.dest32 = cpu_physical_id(desc->arch.cpu);
> + msi_compose_msg(desc->arch.vector, &msg);
> return hpet_msi_write(desc->action->dev_id, &msg);
> }
Setting msg.dest32 ahead of calling msi_compose_msg() feels odd. It makes things
look as if this was an input to the function, when by its name it rather would
want to be an output. Furthermore this is dead code right now, as the function
clears the entire structure first thing. Imo it being the function to fill the
field should be retained; instead of the CPU mask you'd once again make it a
scalar parameter. For the case where NULL was passed before, ...
> --- a/xen/arch/x86/include/asm/irq.h
> +++ b/xen/arch/x86/include/asm/irq.h
> @@ -69,13 +69,9 @@ struct irq_desc;
> struct arch_irq_desc {
> int16_t vector; /* vector itself is only 8 bits, */
> int16_t old_vector; /* but we use -1 for unassigned */
> - /*
> - * Except for high priority interrupts @cpu_mask may have bits set for
> - * offline CPUs. Consumers need to be careful to mask this down to
> - * online ones as necessary. There is supposed to always be a non-
> - * empty intersection with cpu_online_map.
> - */
> - cpumask_var_t cpu_mask;
> +/* Special target CPU values. */
> +#define CPU_INVALID ~0U
... you already make a suitable constant available. (Nit: The expansion wants
parenthesizing.)
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1112,8 +1112,7 @@ static void __init setup_IO_APIC_irqs(void)
> if (platform_legacy_irq(irq))
> disable_8259A_irq(irq_to_desc(irq));
>
> - set_entry_dest(&entry,
> - cpu_mask_to_apicid(irq_to_desc(irq)->arch.cpu_mask));
> + set_entry_dest(&entry, cpu_physical_id(irq_to_desc(irq)->arch.cpu));
I may as well mention this here: Looks like this patch removes all call sites
of cpu_mask_to_apicid(). That would leave the function unreachable, i.e. violating
a Misra rule, so I think the function needs dropping right here.
> @@ -2137,14 +2136,11 @@ int io_apic_set_pci_routing (int ioapic, int pin, int irq, int edge_level, int a
> return vector;
> entry.vector = vector;
>
> - if (cpumask_intersects(desc->arch.cpu_mask, TARGET_CPUS)) {
> - cpumask_t *mask = this_cpu(scratch_cpumask);
> -
> - cpumask_and(mask, desc->arch.cpu_mask, TARGET_CPUS);
> - set_entry_dest(&entry, cpu_mask_to_apicid(mask));
> + if (cpu_online(desc->arch.cpu)) {
Can CPU_INVALID make it here? If so, it needs guarding against. If not, an
assertion may be nice. (Same possibly elsewhere.)
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -156,8 +156,7 @@ static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
>
> if ( !cpu_online(cpu) )
> return -EINVAL;
> - if ( (desc->arch.vector == vector) &&
> - cpumask_test_cpu(cpu, desc->arch.cpu_mask) )
> + if ( (desc->arch.vector == vector) && cpu == desc->arch.cpu )
Please can you be consistent with parentheses on both sides of the &&?
(I'd prefer the excess ones to be dropped, but the alternative is also
okay.)
> @@ -684,8 +673,9 @@ next:
> }
> else if ( valid_irq_vector(old_vector) )
> {
> - cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
> - &cpu_online_map);
> + cpumask_clear(desc->arch.old_cpu_mask);
> + if ( cpu_online(desc->arch.cpu) )
> + cpumask_set_cpu(desc->arch.cpu, desc->arch.old_cpu_mask);
As mentioned for an earlier patch, to avoid the LOCK-ed update
cpumask_copy() may be better to use. Yet iirc like there, likely this
goes away again later in the series (by the title right in the next patch),
so perhaps not a big deal.
> --- a/xen/common/cpu.c
> +++ b/xen/common/cpu.c
> @@ -16,6 +16,7 @@ unsigned int __read_mostly nr_cpumask_bits
> const cpumask_t cpumask_all = {
> .bits[0 ... (BITS_TO_LONGS(NR_CPUS) - 1)] = ~0UL
> };
> +const cpumask_t cpumask_none;
This feels wasteful at least for larger NR_CPUS. And it's likely going to
violate some Misra rule on non-x86, for having no user. On x86, as long as
NR_CPUS <= 8 * PAGE_SIZE, you could (re-)use zero_page[] instead.
> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -507,7 +507,7 @@ static void cf_check set_x2apic_affinity(
> if ( dest == BAD_APICID )
> return;
>
> - msi_compose_msg(desc->arch.vector, NULL, &iommu->msi.msg);
> + msi_compose_msg(desc->arch.vector, &iommu->msi.msg);
> iommu->msi.msg.dest32 = dest;
With the outlined adjustment above, it looks like the explicit setting
of .dest32 would then not be needed here (and perhaps elsewhere) anymore.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask field to integer
2025-11-20 9:58 ` [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask " Roger Pau Monne
@ 2025-11-24 12:54 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 12:54 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -115,7 +115,7 @@ static void release_old_vec(struct irq_desc *desc)
> unsigned int vector = desc->arch.old_vector;
>
> desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> - cpumask_clear(desc->arch.old_cpu_mask);
> + desc->arch.old_cpu = CPU_INVALID;
With this, ...
> @@ -221,10 +220,10 @@ static void _clear_irq_vector(struct irq_desc *desc)
> {
> /* If we were in motion, also clear desc->arch.old_vector */
> old_vector = desc->arch.old_vector;
> - cpumask_and(tmp_mask, desc->arch.old_cpu_mask, &cpu_online_map);
>
> - for_each_cpu(cpu, tmp_mask)
> + if ( cpu_online(desc->arch.old_cpu) )
... you pretty certainly want to guard against the value making it here (even
if just accidentally), and thus triggering the assertion in cpumask_check().
(Again possibly elsewhere as well.)
> @@ -581,16 +575,16 @@ static int _assign_irq_vector(struct irq_desc *desc, const cpumask_t *mask)
> * in the 'mask' parameter.
> */
> desc->arch.vector = desc->arch.old_vector;
> - desc->arch.cpu = cpumask_any(desc->arch.old_cpu_mask);
> + desc->arch.cpu = desc->arch.old_cpu;
>
> /* Undo any possibly done cleanup. */
> per_cpu(vector_irq, desc->arch.cpu)[desc->arch.vector] = irq;
>
> /* Cancel the pending move and release the current vector. */
> desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> - cpumask_clear(desc->arch.old_cpu_mask);
> + desc->arch.old_cpu = CPU_INVALID;
> desc->arch.move_in_progress = 0;
> - desc->arch.move_cleanup_count = 0;
> + desc->arch.move_cleanup = false;
Nit: Excess blank.
> @@ -2003,7 +1994,7 @@ void do_IRQ(struct cpu_user_regs *regs)
> ~irq, CPUMASK_PR(desc->affinity),
> /* TODO: handle hipri vectors nicely. */
> CPUMASK_PR(get_cpumask(desc->arch.cpu)),
> - CPUMASK_PR(desc->arch.old_cpu_mask),
> + CPUMASK_PR(get_cpumask(desc->arch.old_cpu)),
I should have asked on the previous patch already: Does it actually make sense
to still print these in mask form? Without that you wouldn't need get_cpumask(),
and as a result you also wouldn't need cpumask_none.
> @@ -2685,12 +2664,9 @@ void fixup_irqs(void)
> * per-cpu vector table will no longer have ->arch.old_vector
> * setup, and hence ->arch.old_cpu_mask would be stale.
> */
> - cpumask_clear_cpu(cpu, desc->arch.old_cpu_mask);
> - if ( cpumask_empty(desc->arch.old_cpu_mask) )
> - {
> - desc->arch.move_in_progress = 0;
> - release_old_vec(desc);
> - }
> + desc->arch.old_cpu = CPU_INVALID;
> + desc->arch.move_in_progress = 0;
As you touch the line anyway, switch to using "false"?
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH 12/12] x86/irq: convert irq_desc pending_mask field to integer
2025-11-20 9:58 ` [PATCH 12/12] x86/irq: convert irq_desc pending_mask " Roger Pau Monne
@ 2025-11-24 12:57 ` Jan Beulich
0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2025-11-24 12:57 UTC (permalink / raw)
To: Roger Pau Monne; +Cc: Andrew Cooper, xen-devel
On 20.11.2025 10:58, Roger Pau Monne wrote:
> @@ -779,10 +775,10 @@ void move_masked_irq(struct irq_desc *desc)
> *
> * For correct operation this depends on the caller masking the irqs.
> */
> - if ( likely(cpumask_intersects(pending_mask, &cpu_online_map)) )
> - desc->handler->set_affinity(desc, pending_mask);
> + if ( likely(cpu_online(desc->arch.pending_cpu)) )
Same remark again regarding the guarding against hitting ...
> + desc->handler->set_affinity(desc, cpumask_of(desc->arch.pending_cpu));
>
> - cpumask_clear(pending_mask);
> + desc->arch.pending_cpu = CPU_INVALID;
... the value stored here.
Jan
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2025-11-24 12:57 UTC | newest]
Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-20 9:06 [PATCH 00/12] x86/irq: drop interrupt target cpumasks Roger Pau Monne
2025-11-20 9:06 ` [PATCH 01/12] x86/apic: remove cpu_mask_to_apicid hook Roger Pau Monne
2025-11-20 12:19 ` Jan Beulich
2025-11-20 12:25 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 02/12] x86/apic: remove vector_allocation_cpumask hook Roger Pau Monne
2025-11-20 12:25 ` Jan Beulich
2025-11-20 12:30 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 03/12] x86/irq: introduce local irq_desc Roger Pau Monne
2025-11-20 12:39 ` Andrew Cooper
2025-11-20 9:06 ` [PATCH 04/12] x86/irq: set accurate cpu_mask for high priority vectors used by external interrupts Roger Pau Monne
2025-11-20 12:45 ` Andrew Cooper
2025-11-20 12:53 ` Jan Beulich
2025-11-20 13:06 ` Jan Beulich
2025-11-21 8:29 ` Roger Pau Monné
2025-11-24 10:16 ` Jan Beulich
2025-11-20 9:06 ` [PATCH 05/12] x86/io-apic: purge usage of logical mode Roger Pau Monne
2025-11-20 13:18 ` Andrew Cooper
2025-11-20 14:27 ` Jan Beulich
2025-11-20 18:30 ` Andrew Cooper
2025-11-21 7:38 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Roger Pau Monne
2025-11-20 9:58 ` [PATCH 07/12] x86/io-apic: fix usage of setup_ioapic_dest() Roger Pau Monne
2025-11-20 16:02 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 08/12] x86/irq: adjust bind_irq_vector() to take a single CPU as parameter Roger Pau Monne
2025-11-24 10:52 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 09/12] x86/irq: convert cpumask parameter to integer in {,p}irq_set_affinity() Roger Pau Monne
2025-11-24 11:20 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 10/12] x86/irq: convert irq_desc cpu_mask field to integer Roger Pau Monne
2025-11-24 12:01 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 11/12] x86/irq: convert irq_desc old_cpu_mask " Roger Pau Monne
2025-11-24 12:54 ` Jan Beulich
2025-11-20 9:58 ` [PATCH 12/12] x86/irq: convert irq_desc pending_mask " Roger Pau Monne
2025-11-24 12:57 ` Jan Beulich
2025-11-20 15:05 ` [PATCH 06/12] x86/i8259: redo workaround for AMD spurious PIC interrupts Jan Beulich
2025-11-21 8:35 ` Roger Pau Monné
2025-11-24 10:20 ` Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.