* [PATCH v2] KVM: arm64: vgic: Fix race between LPI release and re-registration
@ 2026-07-03 2:15 Carlos López
2026-07-03 8:44 ` Oliver Upton
0 siblings, 1 reply; 3+ messages in thread
From: Carlos López @ 2026-07-03 2:15 UTC (permalink / raw)
To: kvmarm, linux-kernel
Cc: Carlos López, Marc Zyngier, Oliver Upton, Joey Gouly,
Steffen Eiden, Suzuki K Poulose, Zenghui Yu, Catalin Marinas,
Will Deacon,
moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)

Fix a potential race between decrementing an LPI's reference count and
evicting that structure from the LPI xarray.

LPI structures are maintained in the VGIC LPI xarray (dist->lpi_xa).
When the reference count of an LPI structure drops to zero,
vgic_release_lpi_locked() removes the structure from the xarray and
frees it under the xarray lock.

However, the release of an LPI can race with a concurrent LPI
re-registration with the same INTID via vgic_add_lpi() on another CPU,
since the reference count drop and the xarray eviction are not performed
in a single atomic step. This can happen e.g. if the guest issues a
DISCARD while the LPI is still referenced from a vCPU's active-pending
list (ap_list), and the same INTID is re-mapped via MAPTI.

In particular, vgic_release_lpi_locked() is called from two distinct
paths: direct release via vgic_put_irq(), and deferred release via
vgic_release_deleted_lpis(). During direct release, the issue can result
in deleting a newly registered LPI from the xarray:

  CPU0 (Releasing LPI)                    CPU1 (Adding new LPI)
  ====================                    =====================
  vgic_put_irq()
    __vgic_put_irq()
      refcount_dec_and_test()
                                          vgic_add_lpi()
                                            xa_lock_irqsave(..);
                                            old_irq = xa_load(&dist->lpi_xa, intid);
                                            vgic_try_get_irq_ref(old_irq) == false
                       new IRQ inserted --> __xa_store(&dist->lpi_xa, intid, ..)
                                            xa_unlock_irqrestore(..);
    xa_lock_irqsave(..);
    vgic_release_lpi_locked()
      __xa_erase(&dist->lpi_xa, irq->intid); <-- BUG: new IRQ is erased
      kfree_rcu(old_irq)
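
To make the interleaving above concrete, here is a minimal single-threaded
model in userspace C, not kernel code. Everything in it is an assumption
made for illustration: struct model_irq, the single slot standing in for one
index of dist->lpi_xa, and the model_* helpers are hypothetical names. It
replays the steps in the order shown in the diagram and reports whether the
newly registered IRQ has been dropped from the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vgic_irq: only the refcount matters. */
struct model_irq {
	int refcount;
};

/* Hypothetical stand-in for one index of dist->lpi_xa. */
static struct model_irq *slot;

/* Models vgic_add_lpi(): reuse a live entry, otherwise store a new one. */
static struct model_irq *model_add_lpi(void)
{
	struct model_irq *irq;

	if (slot && slot->refcount > 0) {
		slot->refcount++;
		return slot;
	}

	irq = calloc(1, sizeof(*irq));
	if (!irq)
		return NULL;
	irq->refcount = 1;
	slot = irq;	/* silently overwrites a dead (refcount == 0) entry */
	return irq;
}

/* Models __xa_erase(): clears the slot by index, whatever it holds. */
static void model_erase_by_index(void)
{
	slot = NULL;
}

/*
 * Replays the diagram's ordering in one thread. Returns 1 if the new IRQ
 * still holds a reference but is no longer reachable through the slot,
 * 0 if it survived, -1 on allocation failure.
 */
static int replay_direct_release_race(void)
{
	struct model_irq *old_irq, *new_irq;
	int lost;

	slot = NULL;
	old_irq = model_add_lpi();	/* initial mapping, refcount == 1 */
	if (!old_irq)
		return -1;

	old_irq->refcount--;		/* CPU0: drops to 0, lock not taken yet */

	new_irq = model_add_lpi();	/* CPU1: cannot get a ref, stores new IRQ */
	if (!new_irq) {
		free(old_irq);
		slot = NULL;
		return -1;
	}

	model_erase_by_index();		/* CPU0: erase by INTID hits the new IRQ */

	lost = (slot != new_irq) && (new_irq->refcount == 1);
	free(old_irq);
	free(new_irq);
	return lost;
}
```

In this model replay_direct_release_race() returns 1: the new IRQ is still
referenced, yet a lookup by INTID would no longer find it.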

During the deferred release path, the old IRQ can be leaked:

  CPU0 (Releasing LPI)                    CPU1 (Adding new LPI)
  ====================                    =====================
  vgic_put_irq_norelease()
    __vgic_put_irq()
      refcount_dec_and_test()
    irq->pending_release = true
                                          vgic_add_lpi()
                                            xa_lock_irqsave(..);
                                            old_irq = xa_load(&dist->lpi_xa, intid);
                                            vgic_try_get_irq_ref() == false
               BUG: old IRQ overwritten --> __xa_store(&dist->lpi_xa, intid, ..)
                                            xa_unlock_irqrestore(..);
  vgic_release_deleted_lpis()
    xa_lock_irqsave(..);
    xa_for_each() { .. } <-- old IRQ with pending_release = true
                             is gone, so it cannot be released

To fix the direct release path, move the reference count drop inside
the xarray lock, making sure that vgic_add_lpi() never encounters the
to-be-released LPI.

To fix the deferred release path, since the refcount drop must happen
under a raw spinlock, the same solution does not work. Instead, update
vgic_add_lpi(), so that if it evicts a non-NULL refcount=0 LPI from the
xarray, it takes on the responsibility of releasing it. If this happens,
vgic_release_deleted_lpis() will iterate the xarray normally and will
simply not find the already released structure.
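
The deferred path and the proposed eviction in vgic_add_lpi() can be
sketched with a similar single-threaded userspace model. This is only an
illustration under stated assumptions: struct model_irq, the single slot,
the frees counter and every model_* helper are hypothetical, and the
evict_dead flag is just a switch between the old and the proposed behaviour.
The model counts how many objects get released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vgic_irq. */
struct model_irq {
	int refcount;
	bool pending_release;
};

static struct model_irq *slot;	/* stand-in for one index of dist->lpi_xa */
static int frees;		/* number of objects released by the model */

static void model_free(struct model_irq *irq)
{
	free(irq);
	frees++;
}

/* Models vgic_add_lpi(); evict_dead selects the behaviour proposed above. */
static struct model_irq *model_add_lpi(bool evict_dead)
{
	struct model_irq *old = slot, *irq;

	if (old && old->refcount > 0) {
		old->refcount++;
		return old;
	}

	irq = calloc(1, sizeof(*irq));
	if (!irq)
		return NULL;
	irq->refcount = 1;
	slot = irq;			/* the store hands back the previous entry */
	if (evict_dead && old)
		model_free(old);	/* the adder releases the dead entry */
	return irq;
}

/* Models vgic_release_deleted_lpis(): frees an entry flagged for release. */
static void model_release_deleted(void)
{
	if (slot && slot->pending_release) {
		model_free(slot);
		slot = NULL;
	}
}

/* Replays the deferred path; returns the number of released objects. */
static int replay_deferred_release(bool evict_dead)
{
	struct model_irq *old_irq, *new_irq;

	slot = NULL;
	frees = 0;

	old_irq = model_add_lpi(evict_dead);
	if (!old_irq)
		return -1;
	old_irq->refcount--;			/* vgic_put_irq_norelease() */
	old_irq->pending_release = true;

	new_irq = model_add_lpi(evict_dead);	/* re-registration, same INTID */
	if (!new_irq) {
		free(old_irq);
		slot = NULL;
		return -1;
	}

	model_release_deleted();		/* sweep only sees new_irq */

	if (!evict_dead)
		free(old_irq);	/* tidy up the demo; not counted as a release */
	free(new_irq);
	slot = NULL;
	return frees;
}
```

With evict_dead set to false the model releases nothing, which corresponds
to the leak in the second diagram; with evict_dead set to true the dead
entry is released exactly once, by the adder.
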
Reported-by: Claude:claude-opus-4-6
Fixes: 3a08a6ca7c37 ("KVM: arm64: vgic-v3: Use bare refcount for VGIC LPIs")
Fixes: d54594accf73 ("KVM: arm64: vgic-v3: Erase LPIs from xarray outside of raw spinlocks")
Signed-off-by: Carlos López <clopez@suse.de>
---
v2:
* Address Sashiko's review. Fix the direct release path by decrementing the
refcount under the xarray spinlock, preventing a UAF that would have been
introduced in v1.
---
arch/arm64/kvm/vgic/vgic-its.c | 10 +++++++++-
arch/arm64/kvm/vgic/vgic.c | 5 +++--
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 67d107e9a77d..577286069368 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -116,7 +116,15 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
 		kfree(irq);
 		irq = oldirq;
 	} else {
-		ret = xa_err(__xa_store(&dist->lpi_xa, intid, irq, 0));
+		/*
+		 * The entry is either empty or contains a dead LPI (refcount=0)
+		 * from the deferred release path, pending cleanup by
+		 * vgic_release_deleted_lpis(). Evict and free it if present.
+		 */
+		oldirq = __xa_store(&dist->lpi_xa, intid, irq, 0);
+		ret = xa_err(oldirq);
+		if (!ret && oldirq)
+			kfree_rcu(oldirq, rcu);
 	}
 
 	xa_unlock_irqrestore(&dist->lpi_xa, flags);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 5a4768d8cd4f..c32e6c9777e5 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -167,11 +167,12 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 		guard(spinlock_irqsave)(&dist->lpi_xa.xa_lock);
 	}
 
-	if (!__vgic_put_irq(kvm, irq))
+	if (!irq_is_lpi(kvm, irq->intid))
 		return;
 
 	xa_lock_irqsave(&dist->lpi_xa, flags);
-	vgic_release_lpi_locked(dist, irq);
+	if (refcount_dec_and_test(&irq->refcount))
+		vgic_release_lpi_locked(dist, irq);
 	xa_unlock_irqrestore(&dist->lpi_xa, flags);
 }
 
base-commit: 1ee27dacbe5dc4def481794d899d67b0d4570094
--
2.51.0

* Re: [PATCH v2] KVM: arm64: vgic: Fix race between LPI release and re-registration
From: Oliver Upton @ 2026-07-03 8:44 UTC (permalink / raw)
To: Carlos López
Cc: kvmarm, linux-kernel, Marc Zyngier, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)

Hi Carlos,

Thanks for reporting this ugly bug.

On Fri, Jul 03, 2026 at 04:15:08AM +0200, Carlos López wrote:
> To fix the direct release path, move the reference count drop inside
> the xarray lock, making sure that vgic_add_lpi() never encounters the
> to-be-released LPI.

As Sashiko pointed out, this is going to massively regress performance
of LPI injection. I don't think this is going to be a viable option.

> To fix the deferred release path, since the refcount drop must happen
> under a raw spinlock, the same solution does not work. Instead, update
> vgic_add_lpi(), so that if it evicts a non-NULL refcount=0 LPI from the
> xarray, it takes on the responsibility of releasing it. If this happens,
> vgic_release_deleted_lpis() will iterate the xarray normally and will
> simply not find the already released structure.
>
> Reported-by: Claude:claude-opus-4-6
> Fixes: 3a08a6ca7c37 ("KVM: arm64: vgic-v3: Use bare refcount for VGIC LPIs")
> Fixes: d54594accf73 ("KVM: arm64: vgic-v3: Erase LPIs from xarray outside of raw spinlocks")
> Signed-off-by: Carlos López <clopez@suse.de>
> ---
> v2:
> * Address Sashiko's review. Fix the direct release path by decrementing the
> refcount under the xarray spinlock, preventing a UAF that would have been
> introduced in v1.

So I actually agree with your approach in v1, vgic_release_lpi_locked()
should do an __xa_cmpxchg() to only erase if the to-be-deleted IRQ that
it owns remains in the xarray.

I believe the UAF could've been avoided by unconditionally calling
kfree_rcu() in vgic_release_lpi_locked() and not attempting to cleanup
dead LPIs in vgic_add_lpi(). IOW, whoever takes the refcount of an LPI
to 0 always has the responsibility of freeing it.

Maybe below would be enough?

Thanks,
Oliver

diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 5a4768d8cd4f..4c79e1096af4 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -132,7 +132,14 @@ struct vgic_irq *vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid)
 static void vgic_release_lpi_locked(struct vgic_dist *dist, struct vgic_irq *irq)
 {
 	lockdep_assert_held(&dist->lpi_xa.xa_lock);
-	__xa_erase(&dist->lpi_xa, irq->intid);
+
+	/*
+	 * Another LPI could've been inserted prior to taking the xa_lock, as
+	 * vgic_add_lpi() can only take a reference on a pre-existing LPI if
+	 * the refcount is nonzero. While freeing the object is always done here,
+	 * only delete the entry @INTID if it is this IRQ.
+	 */
+	__xa_cmpxchg(&dist->lpi_xa, irq->intid, irq, NULL, 0);
 	kfree_rcu(irq, rcu);
 }
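
The compare-and-exchange idea in the diff above can be illustrated with a
small userspace model. This is a sketch, not the XArray implementation: the
slots array, model_cmpxchg() and model_release() are hypothetical names, and
the model only captures the semantics assumed here, namely that the entry is
replaced only when it still equals the expected pointer and that the entry
found is returned.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 4

/* Hypothetical stand-in for the xarray, indexed by a small INTID. */
static void *slots[NR_SLOTS];

/*
 * Replace slots[index] with @entry only if it currently holds @old.
 * Returns the entry that was found, whether or not it was replaced.
 */
static void *model_cmpxchg(unsigned long index, void *old, void *entry)
{
	void *cur = slots[index];

	if (cur == old)
		slots[index] = entry;
	return cur;
}

/* Erase the slot only if it still holds @irq. Returns 1 if it was erased. */
static int model_release(unsigned long intid, void *irq)
{
	return model_cmpxchg(intid, irq, NULL) == irq;
}

/*
 * A stale release (old IRQ already replaced) must leave the new entry in
 * place; a release by the current owner must clear it. Returns 1 if both
 * hold in the model.
 */
static int demo_stale_release_keeps_new_entry(void)
{
	int old_irq = 0, new_irq = 0;	/* only their addresses matter */
	int erased_stale, erased_owner;

	slots[1] = &new_irq;		/* re-registration already happened */

	erased_stale = model_release(1, &old_irq);
	if (slots[1] != &new_irq)
		return 0;

	erased_owner = model_release(1, &new_irq);
	return !erased_stale && erased_owner && slots[1] == NULL;
}
```

In the model, the stale release becomes a no-op on the slot, while the
caller is still free to release its own object afterwards, which matches
the "whoever takes the refcount to 0 frees it" rule described above.
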

* Re: [PATCH v2] KVM: arm64: vgic: Fix race between LPI release and re-registration
From: Carlos López @ 2026-07-03 9:16 UTC (permalink / raw)
To: Oliver Upton
Cc: kvmarm, linux-kernel, Marc Zyngier, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon,
moderated list:KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)

Hi,

I was about to reply to Sashiko, but I'd rather talk to a human :)

On 7/3/26 10:44 AM, Oliver Upton wrote:
> Hi Carlos,
>
> Thanks for reporting this ugly bug.
>
> On Fri, Jul 03, 2026 at 04:15:08AM +0200, Carlos López wrote:
>> To fix the direct release path, move the reference count drop inside
>> the xarray lock, making sure that vgic_add_lpi() never encounters the
>> to-be-released LPI.
>
> As Sashiko pointed out, this is going to massively regress performance
> of LPI injection. I don't think this is going to be a viable option.

I think we can just use refcount_dec_and_lock_irqsave(), no? Then we
grab the lock only if the refcount drops to 0.
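
For illustration, the general decrement-and-lock pattern can be sketched in
userspace C11. This is a model under assumptions, not the kernel helper: the
atomic_flag spinlock, the lock_acquisitions counter and the model_* names
are hypothetical, and interrupt state saving is left out entirely. The point
it shows is that the lock is only touched when the count is about to reach
zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag model_lock = ATOMIC_FLAG_INIT; /* stand-in for the xa_lock */
static int lock_acquisitions;			  /* bookkeeping for the demo */

static void model_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&model_lock,
						 memory_order_acquire))
		;	/* spin until the flag is clear */
	lock_acquisitions++;
}

static void model_lock_release(void)
{
	atomic_flag_clear_explicit(&model_lock, memory_order_release);
}

/* Decrement unless the count is 1 (or lower); true if it decremented. */
static bool model_dec_not_one(atomic_int *r)
{
	int old = atomic_load(r);

	while (old > 1) {
		/* On failure, old is reloaded and the bound is re-checked. */
		if (atomic_compare_exchange_weak(r, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Returns true with the lock held if the count reached zero, otherwise
 * returns false with the lock not held.
 */
static bool model_dec_and_lock(atomic_int *r)
{
	if (model_dec_not_one(r))
		return false;		/* fast path: lock never touched */

	model_lock_acquire();
	if (atomic_fetch_sub(r, 1) != 1) {
		model_lock_release();	/* someone else took a reference */
		return false;
	}
	return true;
}

/* Three puts on a count of 3; returns how often the lock was taken. */
static int demo_lock_taken_only_on_final_put(void)
{
	atomic_int refcount = 3;
	bool first, second, last;

	lock_acquisitions = 0;
	first = model_dec_and_lock(&refcount);	/* 3 -> 2, no lock */
	second = model_dec_and_lock(&refcount);	/* 2 -> 1, no lock */
	last = model_dec_and_lock(&refcount);	/* 1 -> 0, lock held */
	if (last)
		model_lock_release();

	return (!first && !second && last) ? lock_acquisitions : -1;
}
```

In the demo, only the final put takes the lock, so non-final puts in this
model stay lock-free.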

As for the other issue (spurious -ENOMEM on __xa_store()), it's a
preexisting issue, but should be fixed by just passing GFP_NOWAIT to
__xa_store(). I can add another patch in v3 for this.

>> To fix the deferred release path, since the refcount drop must happen
>> under a raw spinlock, the same solution does not work. Instead, update
>> vgic_add_lpi(), so that if it evicts a non-NULL refcount=0 LPI from the
>> xarray, it takes on the responsibility of releasing it. If this happens,
>> vgic_release_deleted_lpis() will iterate the xarray normally and will
>> simply not find the already released structure.
>>
>> Reported-by: Claude:claude-opus-4-6
>> Fixes: 3a08a6ca7c37 ("KVM: arm64: vgic-v3: Use bare refcount for VGIC LPIs")
>> Fixes: d54594accf73 ("KVM: arm64: vgic-v3: Erase LPIs from xarray outside of raw spinlocks")
>> Signed-off-by: Carlos López <clopez@suse.de>
>> ---
>> v2:
>> * Address Sashiko's review. Fix the direct release path by decrementing the
>> refcount under the xarray spinlock, preventing a UAF that would have been
>> introduced in v1.
>
> So I actually agree with your approach in v1, vgic_release_lpi_locked()
> should do an __xa_cmpxchg() to only erase if the to-be-deleted IRQ that
> it owns remains in the xarray.
>
> I believe the UAF could've been avoided by unconditionally calling
> kfree_rcu() in vgic_release_lpi_locked() and not attempting to cleanup
> dead LPIs in vgic_add_lpi(). IOW, whoever takes the refcount of an LPI
> to 0 always has the responsibility of freeing it.

I think this would solve the direct release path, but not the deferred
path. If vgic_add_lpi() does not perform any cleanup, and encounters an
IRQ that was vgic_put_irq_norelease()-ed before
vgic_release_deleted_lpis() grabs the xarray lock, then the struct is
overwritten without being released.

> Maybe below would be enough?
>
> Thanks,
> Oliver
>
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 5a4768d8cd4f..4c79e1096af4 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -132,7 +132,14 @@ struct vgic_irq *vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid)
>  static void vgic_release_lpi_locked(struct vgic_dist *dist, struct vgic_irq *irq)
>  {
>  	lockdep_assert_held(&dist->lpi_xa.xa_lock);
> -	__xa_erase(&dist->lpi_xa, irq->intid);
> +
> +	/*
> +	 * Another LPI could've been inserted prior to taking the xa_lock, as
> +	 * vgic_add_lpi() can only take a reference on a pre-existing LPI if
> +	 * the refcount is nonzero. While freeing the object is always done here,
> +	 * only delete the entry @INTID if it is this IRQ.
> +	 */
> +	__xa_cmpxchg(&dist->lpi_xa, irq->intid, irq, NULL, 0);
>  	kfree_rcu(irq, rcu);
>  }
>