* [PATCH v3 1/2] KVM: arm64: vgic: Fix race between LPI release and re-registration
2026-07-03 11:01 [PATCH v3 0/2] KVM: arm64: vgic: Fix racy LPI release and re-registration handling Carlos López
@ 2026-07-03 11:01 ` Carlos López
2026-07-03 11:01 ` [PATCH v3 2/2] KVM: arm64: vgic: Mitigate potential LPI registration failure Carlos López
From: Carlos López @ 2026-07-03 11:01 UTC
To: kvmarm, linux-kernel
Cc: maz, oupton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, Carlos López
Fix a potential race between decrementing an LPI's reference count and
evicting that structure from the LPI xarray.

LPI structures are maintained in the VGIC LPI xarray (dist->lpi_xa).
When the reference count of an LPI structure drops to zero,
vgic_release_lpi_locked() removes the structure from the xarray and
frees it under the xarray lock.

However, the release of an LPI can race with a concurrent
re-registration of the same INTID via vgic_add_lpi() on another CPU,
since the reference count drop and the xarray eviction are not performed
in a single atomic step. This can happen e.g. if the guest issues a
DISCARD while the LPI is still referenced from a vCPU's active-pending
list (ap_list), and the same INTID is re-mapped via MAPTI.

In particular, vgic_release_lpi_locked() is called from two distinct
paths: direct release via vgic_put_irq(), and deferred release via
vgic_release_deleted_lpis(). During direct release, the issue can result
in deleting a newly registered LPI from the xarray:

  CPU0 (Releasing LPI)                 CPU1 (Adding new LPI)
  ====================                 =====================
  vgic_put_irq()
    __vgic_put_irq()
      refcount_dec_and_test()
                                       vgic_add_lpi()
                                         xa_lock_irqsave()
                                         old_irq = xa_load(.., intid)
                                         vgic_try_get_irq_ref(old_irq) == false
                  new IRQ inserted -->   __xa_store(.., intid, ..)
                                         xa_unlock_irqrestore()
    xa_lock_irqsave();
    vgic_release_lpi_locked()
      __xa_erase(.., irq->intid) <-- BUG: new IRQ is erased
      kfree_rcu(old_irq)

In the deferred release path, the old IRQ can be leaked:

  CPU0 (Releasing LPI)                 CPU1 (Adding new LPI)
  ====================                 =====================
  vgic_put_irq_norelease()
    __vgic_put_irq()
      refcount_dec_and_test()
    irq->pending_release = true
                                       vgic_add_lpi()
                                         xa_lock_irqsave()
                                         old_irq = xa_load(.., intid)
                                         vgic_try_get_irq_ref(old_irq) == false
          BUG: old IRQ overwritten -->   __xa_store(.., intid, ..)
                                         xa_unlock_irqrestore()
  vgic_release_deleted_lpis()
    xa_lock_irqsave()
    xa_for_each() { .. } <-- old IRQ with pending_release = true
                             is gone, so it cannot be released

To fix the direct release path, perform the reference count drop under
the xarray lock, so that vgic_add_lpi() never encounters the
to-be-released LPI.
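
For readers unfamiliar with the decrement-and-lock pattern that the direct
release fix relies on, the following is a minimal userspace sketch. It
assumes C11 atomics and a pthread mutex in place of the kernel's refcount_t
and the xarray lock; all toy_* names are hypothetical and are not kernel
APIs. It only illustrates the idea that the final decrement is decided while
the lock is held, so no other lock holder can observe a zero-refcount entry
that is still published.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
struct toy_obj {
	atomic_int refcount;
};

struct toy_table {
	pthread_mutex_t lock;
	struct toy_obj *slot;	/* single-entry stand-in for the LPI xarray */
};

/*
 * Drop one reference. Returns true with tbl->lock held if this was the
 * last reference, false (lock not held) otherwise.
 */
static bool toy_dec_and_lock(struct toy_obj *obj, struct toy_table *tbl)
{
	int old = atomic_load(&obj->refcount);

	/* Fast path: not the last reference, decrement without the lock. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(&obj->refcount, &old, old - 1))
			return false;
	}

	/* Slow path: possibly the last reference, decide under the lock. */
	pthread_mutex_lock(&tbl->lock);
	if (atomic_fetch_sub(&obj->refcount, 1) == 1)
		return true;
	pthread_mutex_unlock(&tbl->lock);
	return false;
}

/* Returns true if this call dropped the last reference and unpublished obj. */
static bool toy_put(struct toy_obj *obj, struct toy_table *tbl)
{
	if (!toy_dec_and_lock(obj, tbl))
		return false;

	/* The count reached zero under the lock, so the slot is still ours. */
	assert(tbl->slot == obj);
	tbl->slot = NULL;
	pthread_mutex_unlock(&tbl->lock);
	return true;
}
```

With two references held, the first toy_put() takes the lock-free fast path
and leaves the slot populated; the second one takes the lock, drops the
count to zero and clears the slot before unlocking.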

The same solution does not work for the deferred release path, since
there the refcount drop happens under a raw spinlock, where the xarray
lock cannot be taken. Instead, update vgic_add_lpi() so that if it
evicts a non-NULL LPI whose refcount is zero from the xarray, it takes
on the responsibility of releasing it. If this happens,
vgic_release_deleted_lpis() will iterate the xarray normally and will
simply not find the already released structure.
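
The ownership hand-off described above can be sketched in plain C as
follows. This is a single-threaded toy model, assuming a one-slot container
in place of the xarray and malloc/free in place of the kernel allocators;
the toy_* names and the freed counter are hypothetical, and callers are
assumed to hold whatever lock protects the slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_irq {
	int refcount;		/* 0 means dead, awaiting deferred release */
	bool pending_release;
};

/* One-slot stand-in for the LPI xarray. */
struct toy_slot {
	struct toy_irq *entry;
};

/* Store a new entry and hand back whatever was there before. */
static struct toy_irq *toy_store(struct toy_slot *s, struct toy_irq *irq)
{
	struct toy_irq *old = s->entry;

	s->entry = irq;
	return old;
}

/* Take a reference unless irq is NULL or already dead. */
static bool toy_try_get(struct toy_irq *irq)
{
	if (!irq || irq->refcount == 0)
		return false;
	irq->refcount++;
	return true;
}

/*
 * Register a new entry or reuse a live one. If a dead entry is evicted,
 * the registering side frees it and bumps *freed.
 */
static struct toy_irq *toy_add(struct toy_slot *s, int *freed)
{
	struct toy_irq *irq = calloc(1, sizeof(*irq));
	struct toy_irq *old;

	if (!irq)
		return NULL;
	irq->refcount = 1;

	old = s->entry;
	if (toy_try_get(old)) {
		free(irq);	/* a live entry already exists, reuse it */
		return old;
	}

	old = toy_store(s, irq);
	if (old) {		/* dead entry evicted: we now own its release */
		free(old);
		(*freed)++;
	}
	return irq;
}

/* Deferred sweep: release a dead entry only if it is still published. */
static int toy_sweep(struct toy_slot *s)
{
	struct toy_irq *irq = s->entry;

	if (!irq || !irq->pending_release)
		return 0;
	s->entry = NULL;
	free(irq);
	return 1;
}
```

In this model, once toy_add() has evicted and freed a dead entry, a later
toy_sweep() finds only the live replacement and releases nothing, which
mirrors the behaviour described for vgic_release_deleted_lpis().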
Reported-by: Claude:claude-opus-4-6
Fixes: 3a08a6ca7c37 ("KVM: arm64: vgic-v3: Use bare refcount for VGIC LPIs")
Fixes: d54594accf73 ("KVM: arm64: vgic-v3: Erase LPIs from xarray outside of raw spinlocks")
Signed-off-by: Carlos López <clopez@suse.de>
---
arch/arm64/kvm/vgic/vgic-its.c | 10 +++++++++-
arch/arm64/kvm/vgic/vgic.c | 10 ++++++----
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 67d107e9a77d..577286069368 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -116,7 +116,15 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
 		kfree(irq);
 		irq = oldirq;
 	} else {
-		ret = xa_err(__xa_store(&dist->lpi_xa, intid, irq, 0));
+		/*
+		 * The entry is either empty or contains a dead LPI (refcount=0)
+		 * from the deferred release path, pending cleanup by
+		 * vgic_release_deleted_lpis(). Evict and free it if present.
+		 */
+		oldirq = __xa_store(&dist->lpi_xa, intid, irq, 0);
+		ret = xa_err(oldirq);
+		if (!ret && oldirq)
+			kfree_rcu(oldirq, rcu);
 	}
 
 	xa_unlock_irqrestore(&dist->lpi_xa, flags);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 5a4768d8cd4f..cc09e0c45b46 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -167,12 +167,14 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 		guard(spinlock_irqsave)(&dist->lpi_xa.xa_lock);
 	}
 
-	if (!__vgic_put_irq(kvm, irq))
+	if (!irq_is_lpi(kvm, irq->intid))
 		return;
 
-	xa_lock_irqsave(&dist->lpi_xa, flags);
-	vgic_release_lpi_locked(dist, irq);
-	xa_unlock_irqrestore(&dist->lpi_xa, flags);
+	if (refcount_dec_and_lock_irqsave(&irq->refcount,
+					  &dist->lpi_xa.xa_lock, &flags)) {
+		vgic_release_lpi_locked(dist, irq);
+		xa_unlock_irqrestore(&dist->lpi_xa, flags);
+	}
 }
 
 static void vgic_release_deleted_lpis(struct kvm *kvm)
--
2.51.0
* [PATCH v3 2/2] KVM: arm64: vgic: Mitigate potential LPI registration failure
From: Carlos López @ 2026-07-03 11:01 UTC
To: kvmarm, linux-kernel
Cc: maz, oupton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, Carlos López,
Sashiko
Mitigate a potential failure when inserting a new LPI into the VGIC LPI
xarray.

When vgic_add_lpi() is preparing to register a new LPI, it pre-allocates
an xarray entry using xa_reserve_irq(), so that it can later perform the
insertion under the xarray lock without allocating.

However, since xa_reserve_irq() is called before acquiring that lock,
there is a potential race where xa_reserve_irq() observes a populated
entry and therefore does not allocate, and another CPU then removes that
entry before the xarray lock is taken to perform the insertion:

  CPU0 (Adding new LPI)                CPU1 (Releasing LPI)
  =====================                ====================
  vgic_add_lpi()
    /* Entry populated, does not allocate */
    xa_reserve_irq(.., intid, ..)
                                       vgic_release_deleted_lpis()
                                         xa_lock_irqsave()
                                         vgic_release_lpi_locked()
                 xarray node freed -->     __xa_erase(.., intid)
                                         xa_unlock_irqrestore()
    xa_lock_irqsave()
    xa_load(.., intid) == NULL
    vgic_try_get_irq_ref(NULL) == false
    __xa_store(.., intid, irq, 0) <-- xarray node was freed, gfp=0
                                      cannot allocate, returns -ENOMEM

This can happen e.g. if the guest issues a DISCARD while the LPI is
still referenced from a vCPU's active-pending list (ap_list), and the
same INTID is re-mapped via MAPTI.

Mitigate this by passing GFP_NOWAIT to __xa_store(), so that the
allocation can happen under the lock in the rare case that this
condition is hit.
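
The interleaving above can be replayed deterministically with a toy model.
The sketch below is a deliberate simplification and not the xarray
implementation: it assumes a one-slot container whose backing "node" must
exist before a value can be stored, and the sim_* names and the may_alloc
flag (standing in for a zero versus non-zero gfp argument) are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* One-slot toy container; node models the internal backing storage. */
struct sim_xa {
	void **node;	/* NULL when no backing storage is allocated */
};

/* Reserve backing storage; a no-op if the slot is already backed. */
static int sim_reserve(struct sim_xa *xa)
{
	if (xa->node)
		return 0;
	xa->node = calloc(1, sizeof(*xa->node));
	return xa->node ? 0 : -ENOMEM;
}

/* Erasing the only entry also frees its backing storage. */
static void sim_erase(struct sim_xa *xa)
{
	free(xa->node);
	xa->node = NULL;
}

/* Store a value, allocating backing storage only if may_alloc is set. */
static int sim_store(struct sim_xa *xa, void *val, bool may_alloc)
{
	if (!xa->node) {
		if (!may_alloc)
			return -ENOMEM;
		xa->node = calloc(1, sizeof(*xa->node));
		if (!xa->node)
			return -ENOMEM;
	}
	*xa->node = val;
	return 0;
}

/* Replay the diagram's interleaving on a single thread. */
static int sim_replay(bool may_alloc)
{
	struct sim_xa xa = { NULL };
	int old_lpi = 1, new_lpi = 2;
	int ret;

	sim_store(&xa, &old_lpi, true);	/* the old LPI is registered */
	sim_reserve(&xa);		/* CPU0: entry populated, no allocation */
	sim_erase(&xa);			/* CPU1: old LPI released, node freed */
	ret = sim_store(&xa, &new_lpi, may_alloc);	/* CPU0: store under lock */
	sim_erase(&xa);
	return ret;
}
```

In this model sim_replay(false) fails with -ENOMEM, while sim_replay(true)
succeeds. In the kernel, a GFP_NOWAIT allocation can still fail under memory
pressure, which is consistent with the change being described as a
mitigation.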
Reported-by: Sashiko <sashiko-bot@kernel.org>
Fixes: 1d6f83f60f79 ("KVM: arm64: vgic: Store LPIs in an xarray")
Signed-off-by: Carlos López <clopez@suse.de>
---
arch/arm64/kvm/vgic/vgic-its.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 577286069368..ace3e59fff97 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -121,7 +121,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
 		 * from the deferred release path, pending cleanup by
 		 * vgic_release_deleted_lpis(). Evict and free it if present.
 		 */
-		oldirq = __xa_store(&dist->lpi_xa, intid, irq, 0);
+		oldirq = __xa_store(&dist->lpi_xa, intid, irq, GFP_NOWAIT);
 		ret = xa_err(oldirq);
 		if (!ret && oldirq)
 			kfree_rcu(oldirq, rcu);
--
2.51.0