* [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI
@ 2026-06-26 10:59 Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 1/5] iommu/amd: kvm/svm: Improve API between SVM and AMD IOMMU Sairaj Kodilkar
` (4 more replies)
0 siblings, 5 replies; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
Introduction
============
On newer generations of AMD processors, the IOMMU AVIC/x2AVIC feature can be
enabled with the new Guest APIC Physical Processor Interrupt (GAPPI) mode,
an alternative way of handling AVIC guest interrupts that target
non-running vCPUs (i.e. IRTE[IsRun]=0).
With GAPPI enabled, the IOMMU delivers the posted interrupt to the physical
CPU described by the IRTE destination fields, with the wakeup vector carried
in ga_tag.
For more details on GAPPI, see section 2.2.5.4, "Guest APIC Physical
Processor Interrupt", of the AMD I/O Virtualization Technology (IOMMU)
Specification [1].
Implementation Details
======================
GAPPI reuses the posted-interrupt wakeup path introduced for Intel VMX: the
IOMMU sets ga_tag to POSTED_INTR_WAKEUP_VECTOR and SVM registers the handler
with kvm_set_posted_intr_wakeup_handler().
SVM maintains a per-CPU list of vCPUs that are scheduled out.
When a CPU receives a GAPPI interrupt from the IOMMU, the handler
walks that list, finds vCPUs with a pending IRR bit, and wakes them.
The IRTE destination is chosen as the last host CPU where the vCPU ran, to
reduce unnecessary VMEXITs from GAPPI deliveries.
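As a rough illustration of the wakeup path described above, the following
user-space sketch models a per-CPU list of scheduled-out vCPUs and a handler
that wakes only those with a pending interrupt. All names here
(vcpu_sketch, wakeup_list_add, gappi_wakeup_sketch, NR_CPUS_SKETCH) are
hypothetical, the irr_pending flag stands in for a real IRR scan, and locking
is deliberately left out; it is not the code from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical number of physical CPUs */

/* Hypothetical stand-in for a scheduled-out vCPU. */
struct vcpu_sketch {
	int id;
	bool irr_pending;		/* models "an IRR bit is pending" */
	bool woken;			/* set once the handler wakes the vCPU */
	struct vcpu_sketch *next;	/* link in the per-CPU wakeup list */
};

/* One singly linked wakeup list per physical CPU. */
struct vcpu_sketch *wakeup_list[NR_CPUS_SKETCH];

/* Queue a scheduled-out vCPU on the list of the CPU it last ran on. */
void wakeup_list_add(int cpu, struct vcpu_sketch *v)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	assert(v != NULL);
	v->next = wakeup_list[cpu];
	wakeup_list[cpu] = v;
}

/*
 * Model of the handler that runs on @cpu when the wakeup vector arrives:
 * walk only this CPU's list and wake vCPUs that have a pending interrupt.
 * Returns the number of vCPUs woken.
 */
int gappi_wakeup_sketch(int cpu)
{
	int woken = 0;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	for (struct vcpu_sketch *v = wakeup_list[cpu]; v; v = v->next) {
		if (v->irr_pending) {
			v->woken = true;
			woken++;
		}
	}
	return woken;
}
```

The point of the sketch is that the handler on a given CPU never touches the
lists of other CPUs, so the scan cost is bounded by the number of vCPUs that
last ran on that CPU.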
The first patch refactors the SVM/IOMMU interface: apicid (formerly cpu)
denotes the running vCPU's host APIC ID or, when scheduled out, the APIC ID
of the pCPU hosting the vCPU on its GAPPI wakeup list. Explicit running and
posted-interrupt flags replace the old ga_log_intr boolean, since apicid no
longer implies vCPU state on its own.
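The flag encoding described above can be sketched in isolation. The bit
positions mirror the AMD_IOMMU_FLAG_* definitions added by patch 1; the
helper names (make_flags, decode_flags) and the irte_state enum are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit layout as the AMD_IOMMU_FLAG_* macros in patch 1. */
#define FLAG_VCPU_RUNNING_SHIFT	0
#define FLAG_VCPU_RUNNING	(1 << FLAG_VCPU_RUNNING_SHIFT)
#define FLAG_POSTED_INTR_SHIFT	1
#define FLAG_POSTED_INTR	(1 << FLAG_POSTED_INTR_SHIFT)

/* Hypothetical summary of what the IOMMU side derives from the flags. */
enum irte_state {
	IRTE_RUNNING,		/* is_run = 1, no notification needed */
	IRTE_NOT_RUNNING_NOTIFY,/* is_run = 0, notification requested */
	IRTE_NOT_RUNNING_QUIET,	/* is_run = 0, no notification */
};

/*
 * Build the flags word the way the SVM side of patch 1 does: a
 * non-negative apicid means "running", and start_blocking requests a
 * notification for the scheduled-out vCPU.
 */
int make_flags(int apicid, bool start_blocking)
{
	int running = (apicid >= 0) << FLAG_VCPU_RUNNING_SHIFT;
	int posted = (int)start_blocking << FLAG_POSTED_INTR_SHIFT;

	return running | posted;
}

/* The running bit takes precedence over the posted-interrupt bit. */
enum irte_state decode_flags(int flags)
{
	if (flags & FLAG_VCPU_RUNNING)
		return IRTE_RUNNING;
	if (flags & FLAG_POSTED_INTR)
		return IRTE_NOT_RUNNING_NOTIFY;
	return IRTE_NOT_RUNNING_QUIET;
}
```

Carrying the state in explicit bits is what allows a later patch to put a
valid APIC ID in apicid even when the vCPU is not running.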
Advantages
==========
With GALOG, the IOMMU can only generate a single interrupt, configured through
MMIO offset 0x180 (XT IOMMU GA Log Interrupt Control Register), and appends
the vCPU information to the GALOG buffer. The hypervisor has to scan this
buffer in order to wake up the vCPUs, which can introduce significant latency
and even cause buffer overflow under high interrupt rates.
GAPPI resolves this problem by distributing the posted interrupts across
multiple CPUs.
[1] https://docs.amd.com/v/u/en-US/48882_3.11_IOMMU_PUB
Sairaj Kodilkar (5):
iommu/amd: kvm/svm: Improve API between SVM and AMD IOMMU
iommu/amd: Configure IRTE to use the GAPPI for posted interrupts
kvm/svm: Introduce per-CPU lock and wakeup queue
kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload
iommu/amd: Provide kernel command line option to enable GAPPI
arch/x86/include/asm/irq_remapping.h | 4 +-
arch/x86/kvm/svm/avic.c | 136 +++++++++++++++++++++++----
arch/x86/kvm/svm/svm.c | 2 +
arch/x86/kvm/svm/svm.h | 5 +
drivers/iommu/amd/amd_iommu.h | 1 +
drivers/iommu/amd/amd_iommu_types.h | 6 +-
drivers/iommu/amd/init.c | 23 ++++-
drivers/iommu/amd/iommu.c | 41 +++++---
include/linux/amd-iommu.h | 15 ++-
9 files changed, 192 insertions(+), 41 deletions(-)
base-commit: 8cd9520d35a6c38db6567e97dd93b1f11f185dc6
--
2.34.1
* [RFC PATCH 1/5] iommu/amd: kvm/svm: Improve API between SVM and AMD IOMMU
2026-06-26 10:59 [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI Sairaj Kodilkar
@ 2026-06-26 10:59 ` Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 2/5] iommu/amd: Configure IRTE to use the GAPPI for posted interrupts Sairaj Kodilkar
` (3 subsequent siblings)
4 siblings, 0 replies; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
The parameter name "cpu" is misleading because it holds the physical APIC
ID of the CPU. Rename it to "apicid".
Introduce flags that describe the vCPU state (running or not) and the
posted-interrupt setting. This is needed because the following patch
overloads the apicid (formerly cpu) field to carry the GAPPI destination
when the vCPU is not running, so apicid can no longer be used to determine
whether the vCPU is running.
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
---
arch/x86/include/asm/irq_remapping.h | 4 ++--
arch/x86/kvm/svm/avic.c | 22 ++++++++++++++--------
drivers/iommu/amd/iommu.c | 24 ++++++++++++------------
include/linux/amd-iommu.h | 14 ++++++++++----
4 files changed, 38 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 37b94f484ef3..40a2206a33c0 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -35,8 +35,8 @@ struct amd_iommu_pi_data {
u64 vapic_addr; /* Physical address of the vCPU's vAPIC. */
u32 ga_tag;
u32 vector; /* Guest vector of the interrupt */
- int cpu;
- bool ga_log_intr;
+ int apicid;
+ int flags;
bool is_guest_mode;
void *ir_data;
};
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index cdd5a6dc646f..7862b13c5409 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -932,6 +932,7 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
struct vcpu_svm *svm = to_svm(vcpu);
u64 entry;
int ret;
+ int posted_intr;
/*
* Prevent the vCPU from being scheduled out or migrated until
@@ -949,10 +950,11 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
*/
entry = svm->avic_physical_id_entry;
if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK) {
- pi_data.cpu = entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
+ pi_data.apicid = entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
+ pi_data.flags = AMD_IOMMU_FLAG_VCPU_RUNNING;
} else {
- pi_data.cpu = -1;
- pi_data.ga_log_intr = entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR;
+ posted_intr = !!(entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR);
+ pi_data.flags = posted_intr << AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
}
ret = irq_set_vcpu_affinity(host_irq, &pi_data);
@@ -1004,10 +1006,14 @@ enum avic_vcpu_action {
AVIC_START_BLOCKING = BIT(1),
};
-static void avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu,
+static void avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int apicid,
enum avic_vcpu_action action)
{
- bool ga_log_intr = (action & AVIC_START_BLOCKING);
+ int posted_intr = !!(action & AVIC_START_BLOCKING) <<
+ AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
+ int is_vcpu_running = (apicid >= 0) <<
+ AMD_IOMMU_FLAG_VCPU_RUNNING_SHIFT;
+ int flags = posted_intr | is_vcpu_running;
struct vcpu_svm *svm = to_svm(vcpu);
struct kvm_kernel_irqfd *irqfd;
@@ -1024,9 +1030,9 @@ static void avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int cpu,
void *data = irqfd->irq_bypass_data;
if (!(action & AVIC_TOGGLE_ON_OFF))
- WARN_ON_ONCE(amd_iommu_update_ga(data, cpu, ga_log_intr));
- else if (cpu >= 0)
- WARN_ON_ONCE(amd_iommu_activate_guest_mode(data, cpu, ga_log_intr));
+ WARN_ON_ONCE(amd_iommu_update_ga(data, apicid, flags));
+ else if (is_vcpu_running)
+ WARN_ON_ONCE(amd_iommu_activate_guest_mode(data, apicid, flags));
else
WARN_ON_ONCE(amd_iommu_deactivate_guest_mode(data));
}
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 57dc8fabc7d9..76f0e469490e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3959,19 +3959,19 @@ static const struct irq_domain_ops amd_ir_domain_ops = {
.deactivate = irq_remapping_deactivate,
};
-static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu,
- bool ga_log_intr)
+static void __amd_iommu_update_ga(struct irte_ga *entry, int apicid, int flags)
{
- if (cpu >= 0) {
+ if (flags & AMD_IOMMU_FLAG_VCPU_RUNNING) {
entry->lo.fields_vapic.destination =
- APICID_TO_IRTE_DEST_LO(cpu);
+ APICID_TO_IRTE_DEST_LO(apicid);
entry->hi.fields.destination =
- APICID_TO_IRTE_DEST_HI(cpu);
+ APICID_TO_IRTE_DEST_HI(apicid);
entry->lo.fields_vapic.is_run = true;
entry->lo.fields_vapic.ga_log_intr = false;
} else {
entry->lo.fields_vapic.is_run = false;
- entry->lo.fields_vapic.ga_log_intr = ga_log_intr;
+ entry->lo.fields_vapic.ga_log_intr = !!(flags &
+ AMD_IOMMU_FLAG_POSTED_INTR);
}
}
@@ -3992,7 +3992,7 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu,
* and thus don't require an invalidation to ensure the IOMMU consumes fresh
* information.
*/
-int amd_iommu_update_ga(void *data, int cpu, bool ga_log_intr)
+int amd_iommu_update_ga(void *data, int apicid, int flags)
{
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
@@ -4006,14 +4006,14 @@ int amd_iommu_update_ga(void *data, int cpu, bool ga_log_intr)
if (!ir_data->iommu)
return -ENODEV;
- __amd_iommu_update_ga(entry, cpu, ga_log_intr);
+ __amd_iommu_update_ga(entry, apicid, flags);
return __modify_irte_ga(ir_data->iommu, ir_data->irq_2_irte.devid,
ir_data->irq_2_irte.index, entry);
}
EXPORT_SYMBOL(amd_iommu_update_ga);
-int amd_iommu_activate_guest_mode(void *data, int cpu, bool ga_log_intr)
+int amd_iommu_activate_guest_mode(void *data, int apicid, int flags)
{
struct amd_ir_data *ir_data = (struct amd_ir_data *)data;
struct irte_ga *entry = (struct irte_ga *) ir_data->entry;
@@ -4036,7 +4036,7 @@ int amd_iommu_activate_guest_mode(void *data, int cpu, bool ga_log_intr)
entry->hi.fields.vector = ir_data->ga_vector;
entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
- __amd_iommu_update_ga(entry, cpu, ga_log_intr);
+ __amd_iommu_update_ga(entry, apicid, flags);
return modify_irte_ga(ir_data->iommu, ir_data->irq_2_irte.devid,
ir_data->irq_2_irte.index, entry);
@@ -4107,8 +4107,8 @@ static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *info)
ir_data->ga_vector = pi_data->vector;
ir_data->ga_tag = pi_data->ga_tag;
if (pi_data->is_guest_mode)
- ret = amd_iommu_activate_guest_mode(ir_data, pi_data->cpu,
- pi_data->ga_log_intr);
+ ret = amd_iommu_activate_guest_mode(ir_data, pi_data->apicid,
+ pi_data->flags);
else
ret = amd_iommu_deactivate_guest_mode(ir_data);
} else {
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index edcee9f5335a..3dd9074e5967 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -30,8 +30,8 @@ static inline void amd_iommu_detect(void) { }
/* IOMMU AVIC Function */
extern int amd_iommu_register_ga_log_notifier(int (*notifier)(u32));
-extern int amd_iommu_update_ga(void *data, int cpu, bool ga_log_intr);
-extern int amd_iommu_activate_guest_mode(void *data, int cpu, bool ga_log_intr);
+extern int amd_iommu_update_ga(void *data, int apicid, int flags);
+extern int amd_iommu_activate_guest_mode(void *data, int apicid, int flags);
extern int amd_iommu_deactivate_guest_mode(void *data);
#else /* defined(CONFIG_AMD_IOMMU) && defined(CONFIG_IRQ_REMAP) */
@@ -42,12 +42,12 @@ amd_iommu_register_ga_log_notifier(int (*notifier)(u32))
return 0;
}
-static inline int amd_iommu_update_ga(void *data, int cpu, bool ga_log_intr)
+static inline int amd_iommu_update_ga(void *data, int apicid, int flags)
{
return 0;
}
-static inline int amd_iommu_activate_guest_mode(void *data, int cpu, bool ga_log_intr)
+static inline int amd_iommu_activate_guest_mode(void *data, int apicid, int flags)
{
return 0;
}
@@ -76,4 +76,10 @@ static inline int amd_iommu_snp_disable(void) { return 0; }
static inline bool amd_iommu_sev_tio_supported(void) { return false; }
#endif
+#define AMD_IOMMU_FLAG_VCPU_RUNNING_SHIFT 0
+#define AMD_IOMMU_FLAG_VCPU_RUNNING BIT(0)
+
+#define AMD_IOMMU_FLAG_POSTED_INTR_SHIFT 1
+#define AMD_IOMMU_FLAG_POSTED_INTR BIT(1)
+
#endif /* _ASM_X86_AMD_IOMMU_H */
--
2.34.1
* [RFC PATCH 2/5] iommu/amd: Configure IRTE to use the GAPPI for posted interrupts
2026-06-26 10:59 [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 1/5] iommu/amd: kvm/svm: Improve API between SVM and AMD IOMMU Sairaj Kodilkar
@ 2026-06-26 10:59 ` Sairaj Kodilkar
2026-06-26 11:29 ` sashiko-bot
2026-06-26 10:59 ` [RFC PATCH 3/5] kvm/svm: Introduce per-CPU lock and wakeup queue Sairaj Kodilkar
` (2 subsequent siblings)
4 siblings, 1 reply; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
When GAPPI is enabled, a guest-mode IRTE with is_run clear delivers a
GAPPI interrupt to the physical APIC ID in the IRTE destination fields,
with the wake vector carried in ga_tag. Program the guest-activated IRTE
accordingly.
Reuse POSTED_INTR_WAKEUP_VECTOR for GAPPI. The vector is already reserved
for the posted-interrupt wakeup handler on x86 and is handled by the IDT
entry sysvec_kvm_posted_intr_wakeup_ipi. Intel VMX already uses the same
vector for the wakeup case.
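A minimal sketch of the IRTE programming for a non-running vCPU, following
the logic of this patch, is given here for orientation. The struct
irte_sketch layout, the helper names, and the split of the APIC ID into a
24-bit low part and an 8-bit high part are assumptions made for this
illustration; the real field layout is defined in amd_iommu_types.h. The
wakeup vector is passed in as a parameter rather than hard-coded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed split of a 32-bit APIC ID across the two destination fields. */
#define APICID_DEST_LO(x)	((x) & 0xffffffu)
#define APICID_DEST_HI(x)	(((x) >> 24) & 0xffu)

/* Hypothetical, flattened view of the IRTE fields this patch touches. */
struct irte_sketch {
	bool is_run;
	bool ga_log_intr;
	bool gappi_dis;
	uint32_t dest_lo;
	uint32_t dest_hi;
	uint32_t ga_tag;
};

/*
 * Program the entry for a vCPU that is not running. With GAPPI the
 * destination fields name the physical CPU to interrupt; gappi_dis
 * suppresses the interrupt when no notification is wanted and the
 * hardware supports that bit. Without GAPPI only ga_log_intr is used.
 */
void irte_set_not_running(struct irte_sketch *e, bool gappi,
			  bool gappi_dis_sup, bool posted_intr,
			  uint32_t apicid)
{
	assert(e != NULL);
	e->is_run = false;
	if (gappi) {
		e->gappi_dis = !posted_intr && gappi_dis_sup;
		e->dest_lo = APICID_DEST_LO(apicid);
		e->dest_hi = APICID_DEST_HI(apicid);
	} else {
		e->ga_log_intr = posted_intr;
	}
}

/* With GAPPI, ga_tag carries the wakeup vector instead of the VM tag. */
uint32_t irte_pick_ga_tag(bool gappi, uint32_t wakeup_vector,
			  uint32_t vm_ga_tag)
{
	return gappi ? wakeup_vector : vm_ga_tag;
}
```

Note that in the GAPPI case the sketch, like the patch, leaves ga_log_intr
untouched, and in the GALOG case it leaves gappi_dis and the destination
fields untouched.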
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
---
arch/x86/kvm/svm/avic.c | 5 +++++
drivers/iommu/amd/amd_iommu.h | 1 +
drivers/iommu/amd/amd_iommu_types.h | 4 +++-
drivers/iommu/amd/init.c | 3 +++
drivers/iommu/amd/iommu.c | 21 +++++++++++++++++----
include/linux/amd-iommu.h | 1 +
6 files changed, 30 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 7862b13c5409..b666efb5d91c 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -955,6 +955,11 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
} else {
posted_intr = !!(entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR);
pi_data.flags = posted_intr << AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
+ /* GAPPI is disabled at this point (amd_iommu_gappi is
+ * enabled in the following patches) hence keep the
+ * apicid as 0.
+ */
+ pi_data.apicid = 0;
}
ret = irq_set_vcpu_affinity(host_irq, &pi_data);
diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
index 834d8fabfba3..044179cab12e 100644
--- a/drivers/iommu/amd/amd_iommu.h
+++ b/drivers/iommu/amd/amd_iommu.h
@@ -41,6 +41,7 @@ int amd_iommu_enable(void);
void amd_iommu_disable(void);
int amd_iommu_reenable(int mode);
int amd_iommu_enable_faulting(unsigned int cpu);
+extern bool amd_iommu_gappi;
extern int amd_iommu_guest_ir;
extern enum protection_domain_mode amd_iommu_pgtable;
extern int amd_iommu_gpt_level;
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index f9f718087893..26d7a9796e64 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -113,6 +113,7 @@
/* Extended Feature 2 Bits */
#define FEATURE_SEVSNPIO_SUP BIT_ULL(1)
#define FEATURE_GCR3TRPMODE BIT_ULL(3)
+#define FEATURE_GAPPIDISSUP BIT_ULL(4)
#define FEATURE_SNPAVICSUP GENMASK_ULL(7, 5)
#define FEATURE_SNPAVICSUP_GAM(x) \
(FIELD_GET(FEATURE_SNPAVICSUP, x) == 0x1)
@@ -1004,7 +1005,8 @@ union irte_ga_lo {
no_fault : 1,
/* ------ */
ga_log_intr : 1,
- rsvd1 : 3,
+ rsvd1 : 2,
+ gappi_dis : 1,
is_run : 1,
/* ------ */
guest_mode : 1,
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 3bdb380d23e9..2e1889f8a9e4 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -160,6 +160,9 @@ u8 amd_iommu_hpt_level;
/* Guest page table level */
int amd_iommu_gpt_level = PAGE_MODE_4_LEVEL;
+bool amd_iommu_gappi;
+EXPORT_SYMBOL(amd_iommu_gappi);
+
int amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_VAPIC;
static int amd_iommu_xt_mode = IRQ_REMAP_XAPIC_MODE;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 76f0e469490e..4690cecc9aa7 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3969,9 +3969,18 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int apicid, int flags)
entry->lo.fields_vapic.is_run = true;
entry->lo.fields_vapic.ga_log_intr = false;
} else {
- entry->lo.fields_vapic.is_run = false;
- entry->lo.fields_vapic.ga_log_intr = !!(flags &
- AMD_IOMMU_FLAG_POSTED_INTR);
+ bool posted_intr = !!(flags & AMD_IOMMU_FLAG_POSTED_INTR);
+ if (amd_iommu_gappi) {
+ entry->lo.fields_vapic.gappi_dis = !posted_intr &&
+ check_feature2(FEATURE_GAPPIDISSUP);
+ entry->lo.fields_vapic.is_run = false;
+ entry->lo.fields_vapic.destination =
+ APICID_TO_IRTE_DEST_LO(apicid);
+ entry->hi.fields.destination = APICID_TO_IRTE_DEST_HI(apicid);
+ } else {
+ entry->lo.fields_vapic.is_run = false;
+ entry->lo.fields_vapic.ga_log_intr = posted_intr;
+ }
}
}
@@ -4034,7 +4043,11 @@ int amd_iommu_activate_guest_mode(void *data, int apicid, int flags)
entry->lo.fields_vapic.guest_mode = 1;
entry->hi.fields.ga_root_ptr = ir_data->ga_root_ptr;
entry->hi.fields.vector = ir_data->ga_vector;
- entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
+
+ if (amd_iommu_gappi)
+ entry->lo.fields_vapic.ga_tag = POSTED_INTR_WAKEUP_VECTOR;
+ else
+ entry->lo.fields_vapic.ga_tag = ir_data->ga_tag;
__amd_iommu_update_ga(entry, apicid, flags);
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 3dd9074e5967..87e76f617ea1 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -82,4 +82,5 @@ static inline bool amd_iommu_sev_tio_supported(void) { return false; }
#define AMD_IOMMU_FLAG_POSTED_INTR_SHIFT 1
#define AMD_IOMMU_FLAG_POSTED_INTR BIT(1)
+extern bool amd_iommu_gappi;
#endif /* _ASM_X86_AMD_IOMMU_H */
--
2.34.1
* [RFC PATCH 3/5] kvm/svm: Introduce per-CPU lock and wakeup queue
2026-06-26 10:59 [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 1/5] iommu/amd: kvm/svm: Improve API between SVM and AMD IOMMU Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 2/5] iommu/amd: Configure IRTE to use the GAPPI for posted interrupts Sairaj Kodilkar
@ 2026-06-26 10:59 ` Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 4/5] kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload Sairaj Kodilkar
2026-06-26 10:59 ` [RFC PATCH 5/5] iommu/amd: Provide kernel command line option to enable GAPPI Sairaj Kodilkar
4 siblings, 0 replies; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
Add a per-CPU list and raw spinlock that SVM will use to track vCPUs that
are blocked and waiting for a wakeup. The GAPPI interrupt handler will
scan the corresponding CPU's list to decide which vCPUs to wake.
Initialize both in avic_hardware_setup() when AVIC is enabled.
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
---
arch/x86/kvm/svm/avic.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index b666efb5d91c..e7a4c0e90e7a 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -63,6 +63,9 @@
ga_tag; \
})
+static DEFINE_PER_CPU(struct list_head, gappi_vcpu_wakeup_list);
+static DEFINE_PER_CPU(raw_spinlock_t, gappi_vcpu_wakeup_list_lock);
+
static_assert(__AVIC_GATAG(AVIC_VM_ID_MASK, AVIC_VCPU_IDX_MASK) == -1u);
#define AVIC_AUTO_MODE -1
@@ -1315,12 +1318,18 @@ static bool __init avic_want_avic_enabled(void)
*/
bool __init avic_hardware_setup(void)
{
+ int cpu;
avic = avic_want_avic_enabled();
if (!avic)
return false;
pr_info("AVIC enabled\n");
+ for_each_possible_cpu(cpu) {
+ INIT_LIST_HEAD(&per_cpu(gappi_vcpu_wakeup_list, cpu));
+ raw_spin_lock_init(&per_cpu(gappi_vcpu_wakeup_list_lock, cpu));
+ }
+
/* AVIC is a prerequisite for x2AVIC. */
x2avic_enabled = boot_cpu_has(X86_FEATURE_X2AVIC);
if (x2avic_enabled) {
--
2.34.1
* [RFC PATCH 4/5] kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload
2026-06-26 10:59 [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI Sairaj Kodilkar
` (2 preceding siblings ...)
2026-06-26 10:59 ` [RFC PATCH 3/5] kvm/svm: Introduce per-CPU lock and wakeup queue Sairaj Kodilkar
@ 2026-06-26 10:59 ` Sairaj Kodilkar
2026-06-26 11:25 ` sashiko-bot
2026-06-26 10:59 ` [RFC PATCH 5/5] iommu/amd: Provide kernel command line option to enable GAPPI Sairaj Kodilkar
4 siblings, 1 reply; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
When a vCPU is unloaded from a physical CPU, enqueue it on that CPU's
GAPPI wakeup list. Remove the vCPU from the wakeup list when it is loaded
onto a CPU again.
Also enqueue from avic_pi_update_irte() when the vCPU is not running and
ir_list is still empty. This handles the case where vCPU load skips
the per-CPU wakeup-list update because ir_list is empty.
The GAPPI wakeup handler walks this CPU's list and wakes vCPUs that still
have a pending IRR. Install it with kvm_set_posted_intr_wakeup_handler()
so deliveries on POSTED_INTR_WAKEUP_VECTOR invoke it.
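The load/unload bookkeeping described above can be modelled with a small
state machine. This is a user-space sketch with hypothetical names
(vcpu_state, vcpu_load_sketch, vcpu_put_sketch, queue_len); it counts list
members instead of linking them, omits locking, and adds its own guard for a
vCPU that has never been loaded, which is a choice made for this sketch
rather than something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU_SKETCH 4	/* hypothetical number of physical CPUs */

/* Hypothetical per-vCPU tracking state. */
struct vcpu_state {
	int gappi_cpu;	/* last CPU the vCPU ran on, -1 if it never ran */
	bool queued;	/* true while the vCPU sits on a wakeup list */
};

/* Number of vCPUs queued per physical CPU (stands in for the lists). */
int queue_len[NCPU_SKETCH];

/* vCPU load: leave the old CPU's list and remember the new CPU. */
void vcpu_load_sketch(struct vcpu_state *v, int cpu)
{
	assert(v != NULL);
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	if (v->queued) {
		queue_len[v->gappi_cpu]--;
		v->queued = false;
	}
	v->gappi_cpu = cpu;
}

/* vCPU unload: queue on the list of the CPU it last ran on. */
void vcpu_put_sketch(struct vcpu_state *v)
{
	assert(v != NULL);
	if (v->gappi_cpu < 0)	/* sketch-only guard: never loaded */
		return;
	if (!v->queued) {	/* enqueue at most once */
		queue_len[v->gappi_cpu]++;
		v->queued = true;
	}
}
```

A vCPU is therefore on at most one list at a time, and that list always
belongs to the CPU named by gappi_cpu, which is also the CPU the IRTE
destination points at.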
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
---
arch/x86/kvm/svm/avic.c | 110 +++++++++++++++++++++++++++++++++++-----
arch/x86/kvm/svm/svm.c | 2 +
arch/x86/kvm/svm/svm.h | 5 ++
3 files changed, 104 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index e7a4c0e90e7a..d238f65a8172 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -877,6 +877,9 @@ int avic_init_vcpu(struct vcpu_svm *svm)
INIT_LIST_HEAD(&svm->ir_list);
raw_spin_lock_init(&svm->ir_list_lock);
+ INIT_LIST_HEAD(&svm->gappi_vcpu_wakeup_list);
+ svm->gappi_cpu = -1;
+
if (!enable_apicv || !irqchip_in_kernel(vcpu->kvm))
return 0;
@@ -889,6 +892,44 @@ int avic_init_vcpu(struct vcpu_svm *svm)
return ret;
}
+static void avic_add_vcpu_to_gappi_wakeup_list(struct vcpu_svm *svm, int cpu)
+{
+ struct list_head *wakeup_list;
+ raw_spinlock_t *spinlock;
+
+ if (WARN_ON(cpu < 0))
+ return;
+
+ wakeup_list = &per_cpu(gappi_vcpu_wakeup_list, cpu);
+ spinlock = &per_cpu(gappi_vcpu_wakeup_list_lock, cpu);
+ raw_spin_lock(spinlock);
+ if (list_empty(&svm->gappi_vcpu_wakeup_list))
+ list_add_tail(&svm->gappi_vcpu_wakeup_list, wakeup_list);
+ raw_spin_unlock(spinlock);
+}
+
+static void avic_remove_vcpu_from_gappi_wakeup_list(struct vcpu_svm *svm, int cpu)
+{
+ raw_spinlock_t *spinlock;
+
+ if (WARN_ON(cpu < 0))
+ return;
+
+ spinlock = &per_cpu(gappi_vcpu_wakeup_list_lock, cpu);
+ raw_spin_lock(spinlock);
+ if (!list_empty(&svm->gappi_vcpu_wakeup_list))
+ list_del_init(&svm->gappi_vcpu_wakeup_list);
+ raw_spin_unlock(spinlock);
+}
+
+void avic_destroy_vcpu(struct vcpu_svm *svm)
+{
+ if (svm->gappi_cpu != -1 && amd_iommu_gappi) {
+ avic_remove_vcpu_from_gappi_wakeup_list(svm, svm->gappi_cpu);
+ svm->gappi_cpu = -1;
+ }
+}
+
void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
{
avic_handle_dfr_update(vcpu);
@@ -899,13 +940,18 @@ static void svm_ir_list_del(struct kvm_kernel_irqfd *irqfd)
{
struct kvm_vcpu *vcpu = irqfd->irq_bypass_vcpu;
unsigned long flags;
+ struct vcpu_svm *svm;
if (!vcpu)
return;
- raw_spin_lock_irqsave(&to_svm(vcpu)->ir_list_lock, flags);
+ svm = to_svm(vcpu);
+
+ raw_spin_lock_irqsave(&svm->ir_list_lock, flags);
list_del(&irqfd->vcpu_list);
- raw_spin_unlock_irqrestore(&to_svm(vcpu)->ir_list_lock, flags);
+ if (list_empty(&svm->ir_list))
+ avic_remove_vcpu_from_gappi_wakeup_list(svm, svm->gappi_cpu);
+ raw_spin_unlock_irqrestore(&svm->ir_list_lock, flags);
}
int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
@@ -936,6 +982,7 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
u64 entry;
int ret;
int posted_intr;
+ bool is_vcpu_waiting = false;
/*
* Prevent the vCPU from being scheduled out or migrated until
@@ -958,16 +1005,18 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
} else {
posted_intr = !!(entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR);
pi_data.flags = posted_intr << AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
- /* GAPPI is disabled at this point (amd_iommu_gappi is
- * enabled in the following patches) hence keep the
- * apicid as 0.
- */
- pi_data.apicid = 0;
+ if (amd_iommu_gappi) {
+ pi_data.apicid = kvm_cpu_get_apicid(svm->gappi_cpu);
+ if (list_empty(&svm->ir_list)) {
+ avic_add_vcpu_to_gappi_wakeup_list(svm, svm->gappi_cpu);
+ is_vcpu_waiting = true;
+ }
+ }
}
ret = irq_set_vcpu_affinity(host_irq, &pi_data);
if (ret)
- return ret;
+ goto gappi_err_out;
/*
* Revert to legacy mode if the IOMMU didn't provide metadata
@@ -976,12 +1025,17 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
*/
if (WARN_ON_ONCE(!pi_data.ir_data)) {
irq_set_vcpu_affinity(host_irq, NULL);
- return -EIO;
+ ret = -EIO;
+ goto gappi_err_out;
}
irqfd->irq_bypass_data = pi_data.ir_data;
list_add(&irqfd->vcpu_list, &svm->ir_list);
return 0;
+gappi_err_out:
+ if (is_vcpu_waiting)
+ avic_remove_vcpu_from_gappi_wakeup_list(svm, svm->gappi_cpu);
+ return ret;
}
return irq_set_vcpu_affinity(host_irq, NULL);
}
@@ -1015,7 +1069,7 @@ enum avic_vcpu_action {
};
static void avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int apicid,
- enum avic_vcpu_action action)
+ int cpu, enum avic_vcpu_action action)
{
int posted_intr = !!(action & AVIC_START_BLOCKING) <<
AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
@@ -1031,8 +1085,22 @@ static void avic_update_iommu_vcpu_affinity(struct kvm_vcpu *vcpu, int apicid,
* Here, we go through the per-vcpu ir_list to update all existing
* interrupt remapping table entry targeting this vcpu.
*/
- if (list_empty(&svm->ir_list))
+ if (list_empty(&svm->ir_list)) {
+ if (amd_iommu_gappi && cpu >= 0)
+ svm->gappi_cpu = cpu;
return;
+ }
+
+ if (is_vcpu_running && amd_iommu_gappi) {
+ /* IF condition handles the initial state */
+ if (svm->gappi_cpu != -1)
+ avic_remove_vcpu_from_gappi_wakeup_list(svm, svm->gappi_cpu);
+
+ svm->gappi_cpu = cpu; /* Store cpu no as target for GAPPI */
+ } else if (amd_iommu_gappi) {
+ apicid = kvm_cpu_get_apicid(svm->gappi_cpu);
+ avic_add_vcpu_to_gappi_wakeup_list(svm, svm->gappi_cpu);
+ }
list_for_each_entry(irqfd, &svm->ir_list, vcpu_list) {
void *data = irqfd->irq_bypass_data;
@@ -1094,7 +1162,7 @@ static void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu,
WRITE_ONCE(kvm_svm->avic_physical_id_table[vcpu->vcpu_id], entry);
- avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, action);
+ avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, cpu, action);
raw_spin_unlock_irqrestore(&svm->ir_list_lock, flags);
}
@@ -1137,7 +1205,7 @@ static void __avic_vcpu_put(struct kvm_vcpu *vcpu, enum avic_vcpu_action action)
*/
raw_spin_lock_irqsave(&svm->ir_list_lock, flags);
- avic_update_iommu_vcpu_affinity(vcpu, -1, action);
+ avic_update_iommu_vcpu_affinity(vcpu, -1, -1, action);
WARN_ON_ONCE(entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR);
@@ -1310,6 +1378,21 @@ static bool __init avic_want_avic_enabled(void)
return true;
}
+static void avic_gappi_wakeup_handler(void)
+{
+ int cpu = smp_processor_id();
+ struct list_head *vcpu_wakeup_list = &per_cpu(gappi_vcpu_wakeup_list, cpu);
+ raw_spinlock_t *spinlock = &per_cpu(gappi_vcpu_wakeup_list_lock, cpu);
+ struct vcpu_svm *svm;
+
+ raw_spin_lock(spinlock);
+ list_for_each_entry(svm, vcpu_wakeup_list, gappi_vcpu_wakeup_list) {
+ if (kvm_lapic_find_highest_irr(&svm->vcpu) >= 0)
+ kvm_vcpu_wake_up(&svm->vcpu);
+ }
+ raw_spin_unlock(spinlock);
+}
+
/*
* Note:
* - The module param avic enable both xAPIC and x2APIC mode.
@@ -1353,6 +1436,7 @@ bool __init avic_hardware_setup(void)
enable_ipiv = false;
amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
+ kvm_set_posted_intr_wakeup_handler(&avic_gappi_wakeup_handler);
return true;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e02a38da5296..b687133f8528 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1356,6 +1356,8 @@ static void svm_vcpu_free(struct kvm_vcpu *vcpu)
WARN_ON_ONCE(!list_empty(&svm->ir_list));
+ avic_destroy_vcpu(svm);
+
svm_leave_nested(vcpu);
svm_free_nested(svm);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5137416be593..47d5bb5d7103 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -362,6 +362,10 @@ struct vcpu_svm {
/* Guest GIF value, used when vGIF is not enabled */
bool guest_gif;
+
+ /* GAPPI related fields */
+ struct list_head gappi_vcpu_wakeup_list;
+ int gappi_cpu;
};
struct svm_cpu_data {
@@ -909,6 +913,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb);
int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
int avic_init_vcpu(struct vcpu_svm *svm);
+void avic_destroy_vcpu(struct vcpu_svm *svm);
void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
void avic_vcpu_put(struct kvm_vcpu *vcpu);
void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu);
--
2.34.1
* [RFC PATCH 5/5] iommu/amd: Provide kernel command line option to enable GAPPI
2026-06-26 10:59 [RFC PATCH 0/5] Add support for AMD IOMMU GAPPI Sairaj Kodilkar
` (3 preceding siblings ...)
2026-06-26 10:59 ` [RFC PATCH 4/5] kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload Sairaj Kodilkar
@ 2026-06-26 10:59 ` Sairaj Kodilkar
2026-06-26 11:25 ` sashiko-bot
4 siblings, 1 reply; 9+ messages in thread
From: Sairaj Kodilkar @ 2026-06-26 10:59 UTC
To: H. Peter Anvin, Joerg Roedel (AMD), Borislav Petkov, Dave Hansen,
Ingo Molnar, Paolo Bonzini, Robin Murphy, Sairaj Kodilkar,
Sean Christopherson, Suravee Suthikulpanit, Thomas Gleixner,
Vasant Hegde, Will Deacon, iommu, kvm, linux-kernel, x86
Enable GAPPI when the IOMMU GAPPISup extended feature bit is set and
the kernel is booted with the "amd_iommu=gappi" command line parameter.
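The two-step enablement (request on the command line, then confirm against
hardware support) can be sketched as follows. The names iommu_opts_sketch,
parse_opts_sketch and gappi_effective are hypothetical, and the
comma-separated walk over the option string is an assumption about how
options are combined; only the strncmp() prefix match on "gappi" and the
feature check are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of parsing the option string. */
struct iommu_opts_sketch {
	bool gappi;	/* "gappi" was requested */
	int unknown;	/* number of unrecognised options */
};

/* Walk a comma-separated option string and record a "gappi" request. */
void parse_opts_sketch(const char *str, struct iommu_opts_sketch *o)
{
	assert(str != NULL && o != NULL);
	while (*str) {
		if (strncmp(str, "gappi", 5) == 0)
			o->gappi = true;
		else
			o->unknown++;
		/* Skip to the next option. */
		str += strcspn(str, ",");
		while (*str == ',')
			str++;
	}
}

/*
 * A request only takes effect when the feature bit is present and guest
 * interrupt remapping runs in vAPIC mode; otherwise it is dropped.
 */
bool gappi_effective(bool requested, bool feature_gappi, bool guest_ir_vapic)
{
	return requested && feature_gappi && guest_ir_vapic;
}
```

Because the match is a five-character prefix comparison, an option such as
"gappi_foo" would also be accepted by this sketch.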
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Sairaj Kodilkar <sarunkod@amd.com>
---
drivers/iommu/amd/amd_iommu_types.h | 2 ++
drivers/iommu/amd/init.c | 20 ++++++++++++++++----
2 files changed, 18 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 26d7a9796e64..de0b23306944 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -94,6 +94,7 @@
#define FEATURE_X2APIC BIT_ULL(2)
#define FEATURE_NX BIT_ULL(3)
#define FEATURE_GT BIT_ULL(4)
+#define FEATURE_GAPPI BIT_ULL(5)
#define FEATURE_IA BIT_ULL(6)
#define FEATURE_GA BIT_ULL(7)
#define FEATURE_HE BIT_ULL(8)
@@ -191,6 +192,7 @@
#define CONTROL_EPH_EN 45
#define CONTROL_XT_EN 50
#define CONTROL_INTCAPXT_EN 51
+#define CONTROL_GAPPI_EN 55
#define CONTROL_GCR3TRPMODE 58
#define CONTROL_IRTCACHEDIS 59
#define CONTROL_SNPAVIC_EN 61
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 2e1889f8a9e4..b3e1d8ee7e9d 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3038,11 +3038,20 @@ static void enable_iommus_vapic(void)
return;
}
+ if (amd_iommu_gappi &&
+ !(check_feature(FEATURE_GAPPI) &&
+ AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))) {
+ pr_warn("GAPPI is not supported.\n");
+ amd_iommu_gappi = false;
+ }
+
/* Enabling GAM and SNPAVIC support */
for_each_iommu(iommu) {
- if (iommu_init_ga_log(iommu) ||
- iommu_ga_log_enable(iommu))
- return;
+ if (amd_iommu_gappi)
+ iommu_feature_enable(iommu, CONTROL_GAPPI_EN);
+ else if (iommu_init_ga_log(iommu) ||
+ iommu_ga_log_enable(iommu))
+ return;
iommu_feature_enable(iommu, CONTROL_GAM_EN);
if (amd_iommu_snp_en)
@@ -3050,7 +3059,8 @@ static void enable_iommus_vapic(void)
}
amd_iommu_irq_ops.capability |= (1 << IRQ_POSTING_CAP);
- pr_info("Virtual APIC enabled\n");
+ pr_info("Virtual APIC enabled with %s\n",
+ amd_iommu_gappi ? "GAPPI" : "GALOG");
#endif
}
@@ -3741,6 +3751,8 @@ static int __init parse_amd_iommu_options(char *str)
} else if (strncmp(str, "v2_pgsizes_only", 15) == 0) {
pr_info("Restricting V1 page-sizes to 4KiB/2MiB/1GiB");
amd_iommu_pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;
+ } else if (strncmp(str, "gappi", 5) == 0) {
+ amd_iommu_gappi = true;
} else {
pr_notice("Unknown option - '%s'\n", str);
}
--
2.34.1
* Re: [RFC PATCH 4/5] kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload
2026-06-26 10:59 ` [RFC PATCH 4/5] kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload Sairaj Kodilkar
@ 2026-06-26 11:25 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-26 11:25 UTC (permalink / raw)
To: Sairaj Kodilkar; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [High] `avic_remove_vcpu_from_gappi_wakeup_list` and `avic_add_vcpu_to_gappi_wakeup_list` acquire a raw spinlock without disabling interrupts, leading to a hard deadlock with the GAPPI IPI hardirq handler.
- [High] Uninitialized `svm->gappi_cpu` (-1) is unconditionally passed to `kvm_cpu_get_apicid()` and `avic_add_vcpu_to_gappi_wakeup_list()`, causing an out-of-bounds array read and a userspace-triggerable `WARN_ON`.
- [High] The GAPPI wakeup handler is registered with the core kernel but never unregistered, leading to a dangling pointer if the KVM module is unloaded.
- [Medium] The per-CPU GAPPI wakeup lists lack CPU hotplug support, resulting in permanently lost device interrupts and hanging vCPUs if a CPU is offlined.
--
commit 3af48ba2bd8326c04b2aa47ac47a9f53e0cd61a3
Author: Sairaj Kodilkar <sarunkod@amd.com>
kvm/svm: Update the per-CPU wakeup-list during vCPU load and unload
When a vCPU is unloaded from a physical CPU, enqueue it on that CPU's
GAPPI wakeup list. Remove the vCPU from the wakeup list when it is loaded
onto a CPU again.
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index e7a4c0e90e7a7..d238f65a8172b 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -889,6 +892,44 @@ int avic_init_vcpu(struct vcpu_svm *svm)
> return ret;
> }
>
> +static void avic_add_vcpu_to_gappi_wakeup_list(struct vcpu_svm *svm, int cpu)
> +{
> + struct list_head *wakeup_list;
> + raw_spinlock_t *spinlock;
> +
> + if (WARN_ON(cpu < 0))
> + return;
> +
> + wakeup_list = &per_cpu(gappi_vcpu_wakeup_list, cpu);
> + spinlock = &per_cpu(gappi_vcpu_wakeup_list_lock, cpu);
> + raw_spin_lock(spinlock);
> + if (list_empty(&svm->gappi_vcpu_wakeup_list))
> + list_add_tail(&svm->gappi_vcpu_wakeup_list, wakeup_list);
> + raw_spin_unlock(spinlock);
> +}
> +
> +static void avic_remove_vcpu_from_gappi_wakeup_list(struct vcpu_svm *svm, int cpu)
> +{
> + raw_spinlock_t *spinlock;
> +
> + if (WARN_ON(cpu < 0))
> + return;
> +
> + spinlock = &per_cpu(gappi_vcpu_wakeup_list_lock, cpu);
> + raw_spin_lock(spinlock);
> + if (!list_empty(&svm->gappi_vcpu_wakeup_list))
> + list_del_init(&svm->gappi_vcpu_wakeup_list);
> + raw_spin_unlock(spinlock);
> +}
[Severity: High]
Can this cause a hard deadlock with the GAPPI IPI hardirq handler?
When a user process destroys a VM, avic_destroy_vcpu() is called from
svm_vcpu_free() in normal thread context with interrupts enabled. If the
POSTED_INTR_WAKEUP_VECTOR IPI fires on the same CPU while this thread holds
the gappi_vcpu_wakeup_list_lock non-recursive raw spinlock, the interrupt
handler will spin forever attempting to acquire it.
Should these functions use raw_spin_lock_irqsave() to disable interrupts
during lock acquisition?
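The reasoning above can be sketched with a toy, single-CPU userspace model.
This is not kernel code: struct cpu_model and every model_* function are
hypothetical names invented for illustration. The model only captures the
rule that a handler which needs a lock deadlocks if it interrupts the context
already holding that lock, and that masking interrupts for the duration of
the critical section (as raw_spin_lock_irqsave() does) closes that window.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; none of these names exist in the kernel. */
struct cpu_model {
	bool irqs_enabled;	/* models the CPU interrupt flag */
	bool lock_held;		/* models gappi_vcpu_wakeup_list_lock */
};

/* Models raw_spin_lock(): takes the lock, leaves interrupts untouched. */
void model_lock(struct cpu_model *c)
{
	assert(!c->lock_held);
	c->lock_held = true;
}

/* Models raw_spin_unlock(). */
void model_unlock(struct cpu_model *c)
{
	c->lock_held = false;
}

/* Models raw_spin_lock_irqsave(): save and mask interrupts, then lock. */
bool model_lock_irqsave(struct cpu_model *c)
{
	bool saved = c->irqs_enabled;

	c->irqs_enabled = false;
	model_lock(c);
	return saved;
}

/* Models raw_spin_unlock_irqrestore(): unlock, then restore interrupts. */
void model_unlock_irqrestore(struct cpu_model *c, bool saved)
{
	c->lock_held = false;
	c->irqs_enabled = saved;
}

/*
 * True if a wakeup interrupt arriving now on this CPU would deadlock:
 * the handler can run (interrupts enabled) and needs a lock that the
 * interrupted context already holds.
 */
bool model_irq_would_deadlock(const struct cpu_model *c)
{
	return c->irqs_enabled && c->lock_held;
}

/* Runs both locking variants and reports whether each is deadlock-prone. */
void model_compare(bool *plain_deadlocks, bool *irqsave_deadlocks)
{
	struct cpu_model c = { .irqs_enabled = true, .lock_held = false };
	bool flags;

	model_lock(&c);
	*plain_deadlocks = model_irq_would_deadlock(&c);
	model_unlock(&c);

	flags = model_lock_irqsave(&c);
	*irqsave_deadlocks = model_irq_would_deadlock(&c);
	model_unlock_irqrestore(&c, flags);
}
```

In this model only the plain-lock variant leaves a window in which the
handler could spin on a lock held by the context it interrupted.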
[ ... ]
> @@ -958,16 +1005,18 @@ int avic_pi_update_irte(struct kvm_kernel_irqfd *irqfd, struct kvm *kvm,
> } else {
> posted_intr = !!(entry & AVIC_PHYSICAL_ID_ENTRY_GA_LOG_INTR);
> pi_data.flags = posted_intr << AMD_IOMMU_FLAG_POSTED_INTR_SHIFT;
> - /* GAPPI is disabled at this point (amd_iommu_gappi is
> - * enabled in the following patches) hence keep the
> - * apicid as 0.
> - */
> - pi_data.apicid = 0;
> + if (amd_iommu_gappi) {
> + pi_data.apicid = kvm_cpu_get_apicid(svm->gappi_cpu);
> + if (list_empty(&svm->ir_list)) {
> + avic_add_vcpu_to_gappi_wakeup_list(svm, svm->gappi_cpu);
> + is_vcpu_waiting = true;
> + }
> + }
> }
[Severity: High]
Is it possible for svm->gappi_cpu to still be -1 here?
If userspace creates a VM but inhibits AVIC (for example, via an invalid
APIC ID), avic_vcpu_load() bails out early leaving svm->gappi_cpu as -1.
If userspace later assigns a device interrupt via KVM_IRQFD,
avic_pi_update_irte() will pass -1 to kvm_cpu_get_apicid().
This triggers a negative index array read yielding a garbage APIC ID, and
also hits the WARN_ON(cpu < 0) inside avic_add_vcpu_to_gappi_wakeup_list().
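One possible shape for a guard is sketched below as a userspace model, under
the assumption that an unset gappi_cpu should simply skip the GAPPI setup
rather than be looked up. MODEL_NR_CPUS, model_cpu_to_apicid and
model_get_wakeup_apicid() are hypothetical names; the real lookup is
kvm_cpu_get_apicid(), whose internals are not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* hypothetical table size for the model */

/* Hypothetical CPU-number to APIC-ID table. */
static const uint32_t model_cpu_to_apicid[MODEL_NR_CPUS] = { 0, 2, 4, 6 };

/*
 * Returns true and fills *apicid only when cpu is a valid index.
 * A negative cpu (for example an uninitialized -1) is rejected before
 * the table is touched, and *apicid is left unmodified.
 */
bool model_get_wakeup_apicid(int cpu, uint32_t *apicid)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return false;

	*apicid = model_cpu_to_apicid[cpu];
	return true;
}
```

A caller in this model would program the wakeup destination only when the
function returns true, so an unset CPU never reaches the table read.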
[ ... ]
> @@ -1353,6 +1436,7 @@ bool __init avic_hardware_setup(void)
> enable_ipiv = false;
>
> amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
> + kvm_set_posted_intr_wakeup_handler(&avic_gappi_wakeup_handler);
>
> return true;
> }
[Severity: Medium]
Does the per-CPU GAPPI wakeup list need CPU hotplug coordination?
If a physical CPU is taken offline while a VM is running, any blocked
vCPUs queued on that offlined CPU's gappi_vcpu_wakeup_list might permanently
miss their GAPPI wakeups. Without a hotplug state notifier to migrate these
vCPUs, will they hang indefinitely?
[Severity: High]
Is there a missing module unload teardown for the wakeup handler?
This registers avic_gappi_wakeup_handler with the core kernel, but it
doesn't seem to be unregistered (via kvm_set_posted_intr_wakeup_handler(NULL))
during module teardown in avic_hardware_unsetup(). If the kvm-amd module
is unloaded, could a subsequent POSTED_INTR_WAKEUP_VECTOR jump to unmapped
memory and cause a kernel panic?
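The register/unregister pairing asked about above can be sketched as a small
userspace model. All model_* names are hypothetical, and the model assumes
that passing NULL to the setter restores a no-op handler, which is the
behavior the review relies on for kvm_set_posted_intr_wakeup_handler(NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many wakeups the module's handler actually processed. */
int model_wakeups_handled;

/* No-op handler installed whenever no module owns the vector. */
static void model_dummy_handler(void)
{
}

static void (*model_wakeup_handler)(void) = model_dummy_handler;

/* Models the setter: NULL falls back to the no-op handler. */
void model_set_wakeup_handler(void (*handler)(void))
{
	model_wakeup_handler = handler ? handler : model_dummy_handler;
}

/* Stands in for the module's GAPPI wakeup handler. */
void model_avic_wakeup_handler(void)
{
	model_wakeups_handled++;
}

/* Setup registers the handler; unsetup must undo exactly that. */
void model_hardware_setup(void)
{
	model_set_wakeup_handler(model_avic_wakeup_handler);
}

void model_hardware_unsetup(void)
{
	model_set_wakeup_handler(NULL);
}

/* Models the wakeup vector firing on a CPU. */
void model_deliver_wakeup_vector(void)
{
	model_wakeup_handler();
}
```

In this model a vector delivered after unsetup lands in the no-op handler
instead of in code that belonged to the unloaded module.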
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260626105906.14577-1-sarunkod@amd.com?part=4
* Re: [RFC PATCH 5/5] iommu/amd: Provide kernel command line option to enable GAPPI
2026-06-26 10:59 ` [RFC PATCH 5/5] iommu/amd: Provide kernel command line option to enable GAPPI Sairaj Kodilkar
@ 2026-06-26 11:25 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-26 11:25 UTC (permalink / raw)
To: Sairaj Kodilkar; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
New issues:
- [High] The global variable `amd_iommu_gappi` is not cleared to `false` when `enable_iommus_vapic()` returns early.
- [High] The `CONTROL_GAPPI_EN` feature is enabled but never disabled during IOMMU teardown or kexec initialization, leaking hardware state.
- [High] The GA log interrupt handler unconditionally re-enables `CONTROL_GALOG_EN` upon an overflow event, even when GAPPI is used and GA log memory is uninitialized.
Pre-existing issues:
- [High] `iommu_init_ga_log()` unconditionally allocates memory with `GFP_KERNEL` and reallocates pointers during suspend/resume cycles, causing a memory leak and a sleep-in-atomic bug.
- [High] `enable_iommus_vapic()` returns immediately on error without rolling back previously initialized IOMMUs, leading to an inconsistent initialization state.
--
commit a43ef908c557e1883a80f54d875c75488a6d0f52
Author: Sairaj Kodilkar <sarunkod@amd.com>
iommu/amd: Provide kernel command line option to enable GAPPI
This commit introduces the amd_iommu=gappi command line parameter to enable
the GAPPI extended feature when supported by the IOMMU hardware.
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -3038,11 +3038,20 @@ static void enable_iommus_vapic(void)
> return;
> }
>
[Severity: High]
If enable_iommus_vapic() returns early before this point because
FEATURE_GAM_VAPIC is missing or SNP is active, the newly added
amd_iommu_gappi flag is never cleared.
Could this cause issues later in amd_iommu_update_ga() when KVM expects
GAPPI behavior but Virtual APIC mode is logically disabled, triggering
warnings and failing interrupt delivery?
[Severity: High]
When recovering from a previous kernel in the kexec path earlier in this
function, the cleanup loop clears CONTROL_GALOG_EN and CONTROL_GAINT_EN.
Since this patch introduces CONTROL_GAPPI_EN, does the cleanup loop also
need to call iommu_feature_disable(iommu, CONTROL_GAPPI_EN) to prevent
hardware state corruption if the previous kernel left GAPPI active? A
similar omission appears to exist in iommu_disable().
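The cleanup being asked for can be illustrated with a userspace model of the
control register as a plain 64-bit value. Bit 55 for GAPPI comes from the
patch; the GALOG and GAINT bit positions and all model_* helpers are
assumptions made for this sketch, and the real iommu_feature_disable()
operates on MMIO rather than on a local variable.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CONTROL_GALOG_EN	28	/* assumed position, for the model only */
#define MODEL_CONTROL_GAINT_EN	29	/* assumed position, for the model only */
#define MODEL_CONTROL_GAPPI_EN	55	/* CONTROL_GAPPI_EN from the patch */

/* Returns ctrl with the given control bit set. */
uint64_t model_feature_enable(uint64_t ctrl, unsigned int bit)
{
	return ctrl | (1ULL << bit);
}

/* Returns ctrl with the given control bit cleared. */
uint64_t model_feature_disable(uint64_t ctrl, unsigned int bit)
{
	return ctrl & ~(1ULL << bit);
}

/*
 * Models the kexec cleanup loop body for one IOMMU: clear every
 * guest-interrupt control bit a previous kernel may have left set,
 * including the newly introduced GAPPI enable.
 */
uint64_t model_kexec_cleanup(uint64_t ctrl)
{
	ctrl = model_feature_disable(ctrl, MODEL_CONTROL_GALOG_EN);
	ctrl = model_feature_disable(ctrl, MODEL_CONTROL_GAINT_EN);
	ctrl = model_feature_disable(ctrl, MODEL_CONTROL_GAPPI_EN);
	return ctrl;
}
```

Without the third call, a register inherited with bit 55 set would keep
GAPPI active in this model even though the new kernel never requested it.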
> + if (amd_iommu_gappi &&
> + !(check_feature(FEATURE_GAPPI) &&
> + AMD_IOMMU_GUEST_IR_VAPIC(amd_iommu_guest_ir))) {
> + pr_warn("GAPPI is not supported.\n");
> + amd_iommu_gappi = false;
> + }
> +
> /* Enabling GAM and SNPAVIC support */
> for_each_iommu(iommu) {
> - if (iommu_init_ga_log(iommu) ||
> - iommu_ga_log_enable(iommu))
> - return;
> + if (amd_iommu_gappi)
> + iommu_feature_enable(iommu, CONTROL_GAPPI_EN);
> + else if (iommu_init_ga_log(iommu) ||
> + iommu_ga_log_enable(iommu))
> + return;
[Severity: High]
When amd_iommu_gappi is true, iommu_init_ga_log() is intentionally bypassed,
meaning the GA log memory is never allocated.
If a spurious GA log overflow interrupt fires, amd_iommu_int_thread_galog()
processes it by calling amd_iommu_restart_ga_log(), which blindly re-enables
CONTROL_GALOG_EN.
Will the hardware then attempt to write log entries to an uninitialized
physical address, potentially causing kernel memory corruption or fatal
IOMMU target aborts?
[Severity: High]
This isn't a bug introduced by this patch, but returning immediately on
failure here leaves the system in an inconsistent state.
If a subsequent IOMMU fails initialization, the loop aborts without rolling
back the features (like CONTROL_GAM_EN or CONTROL_SNPAVIC_EN) enabled on
previously processed IOMMUs. Could this result in partially initialized
interrupt routing?
[Severity: High]
This isn't a bug introduced by this patch, but calling iommu_init_ga_log()
here unconditionally allocates memory with GFP_KERNEL.
During system resume, amd_iommu_syscore_ops.resume calls
enable_iommus_vapic() with hardware interrupts disabled. Does calling a
GFP_KERNEL allocator in this atomic context cause a sleep-in-atomic bug,
and does unconditionally overwriting iommu->ga_log leak the previously
allocated memory on every resume cycle?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260626105906.14577-1-sarunkod@amd.com?part=5
* Re: [RFC PATCH 2/5] iommu/amd: Configure IRTE to use the GAPPI for posted interrupts
2026-06-26 10:59 ` [RFC PATCH 2/5] iommu/amd: Configure IRTE to use the GAPPI for posted interrupts Sairaj Kodilkar
@ 2026-06-26 11:29 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-26 11:29 UTC (permalink / raw)
To: Sairaj Kodilkar; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] `amd_iommu_update_ga` dynamically modifies the `gappi_dis` field in the IRTE without issuing an IOMMU invalidation, which may result in permanently lost wakeups.
--
commit f265b23208b1df21d27bc13bee3b5f94ce45d71c
Author: Sairaj Kodilkar <sarunkod@amd.com>
iommu/amd: Configure IRTE to use the GAPPI for posted interrupts
This commit updates the AMD IOMMU driver to support Guest APIC Physical
Processor Interrupt (GAPPI) mode. It configures the Interrupt Remapping
Table Entry (IRTE) for wakeups using the reserved POSTED_INTR_WAKEUP_VECTOR
when the guest is not running.
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
[ ... ]
> @@ -3969,9 +3969,18 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int apicid, int flags)
> entry->lo.fields_vapic.is_run = true;
> entry->lo.fields_vapic.ga_log_intr = false;
> } else {
> - entry->lo.fields_vapic.is_run = false;
> - entry->lo.fields_vapic.ga_log_intr = !!(flags &
> - AMD_IOMMU_FLAG_POSTED_INTR);
> + bool posted_intr = !!(flags & AMD_IOMMU_FLAG_POSTED_INTR);
> + if (amd_iommu_gappi) {
> + entry->lo.fields_vapic.gappi_dis = !posted_intr &&
> + check_feature2(FEATURE_GAPPIDISSUP);
[Severity: Medium]
Does this code need to issue an IOMMU cache invalidation after modifying
gappi_dis?
Since __amd_iommu_update_ga() is a fast-path update function, it appears
there is no subsequent IOMMU cache flush when gappi_dis changes. If KVM
schedules out a vCPU (setting gappi_dis = 1) and then the vCPU blocks
(setting gappi_dis = 0), hardware that caches the gappi_dis bit might not
observe the transition back to 0.
Could this caching behavior lead to permanently dropped wakeup interrupts
and indefinite vCPU hangs?
> + entry->lo.fields_vapic.is_run = false;
> + entry->lo.fields_vapic.destination =
> + APICID_TO_IRTE_DEST_LO(apicid);
> + entry->hi.fields.destination = APICID_TO_IRTE_DEST_HI(apicid);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260626105906.14577-1-sarunkod@amd.com?part=2