All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs
@ 2026-06-04 21:16 Hyunwoo Kim
  2026-06-05  5:47 ` Oliver Upton
  0 siblings, 1 reply; 4+ messages in thread
From: Hyunwoo Kim @ 2026-06-04 21:16 UTC (permalink / raw)
  To: maz, oupton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23
  Cc: linux-arm-kernel, kvmarm, imv4bel

vgic_v3_fold_lr_state() walks the ap_list from last_lr_irq without holding
the ap_list_lock, relying on vgic_irq being freed via kfree_rcu() and on
interrupts being disabled. vgic_flush_pending_lpis() removes entries with
list_del(), which clobbers a node's next pointer, so when another vCPU
disables LPIs via GICR_CTLR the walk can follow the clobbered next pointer
from a removed node, or from the node that last_lr_irq points to.

Remove entries with list_del_rcu() so that the next pointer stays valid
until the walk completes.

Fixes: 3cfd59f81e0f ("KVM: arm64: GICv3: Handle LR overflow when EOImode==0")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 arch/arm64/kvm/vgic/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1e9fe8764584d..73efc0f95bfb1 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -204,7 +204,7 @@ void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
 	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
 		if (irq_is_lpi(vcpu->kvm, irq->intid)) {
 			raw_spin_lock(&irq->irq_lock);
-			list_del(&irq->ap_list);
+			list_del_rcu(&irq->ap_list);
 			irq->vcpu = NULL;
 			raw_spin_unlock(&irq->irq_lock);
 			deleted |= vgic_put_irq_norelease(vcpu->kvm, irq);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs
  2026-06-04 21:16 [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs Hyunwoo Kim
@ 2026-06-05  5:47 ` Oliver Upton
  2026-06-05  8:17   ` Marc Zyngier
  0 siblings, 1 reply; 4+ messages in thread
From: Oliver Upton @ 2026-06-05  5:47 UTC (permalink / raw)
  To: Hyunwoo Kim
  Cc: maz, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, linux-arm-kernel,
	kvmarm

Hi Hyunwoo,

On Fri, Jun 05, 2026 at 06:16:08AM +0900, Hyunwoo Kim wrote:
> vgic_v3_fold_lr_state() walks the ap_list from last_lr_irq without holding
> the ap_list_lock, relying on vgic_irq being freed via kfree_rcu() and on
> interrupts being disabled. vgic_flush_pending_lpis() removes entries with
> list_del(), which clobbers a node's next pointer, so when another vCPU
> disables LPIs via GICR_CTLR the walk can follow the clobbered next pointer
> from a removed node, or from the node that last_lr_irq points to.
> 
> Remove entries with list_del_rcu() so that the next pointer stays valid
> until the walk completes.
> 
> Fixes: 3cfd59f81e0f ("KVM: arm64: GICv3: Handle LR overflow when EOImode==0")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>

Changing only one of the writer paths to use the rculist helpers does
not make the ap_list an rculist. Insertions are not RCU-safe, nor are
deleations from vgic_prune_ap_list().

And TBH, the real bug here is the fact that vgic_v3_fold_lr_state() isn't
taking the ap_list_lock.

Thanks,
Oliver

> ---
>  arch/arm64/kvm/vgic/vgic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 1e9fe8764584d..73efc0f95bfb1 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -204,7 +204,7 @@ void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
>  	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
>  		if (irq_is_lpi(vcpu->kvm, irq->intid)) {
>  			raw_spin_lock(&irq->irq_lock);
> -			list_del(&irq->ap_list);
> +			list_del_rcu(&irq->ap_list);
>  			irq->vcpu = NULL;
>  			raw_spin_unlock(&irq->irq_lock);
>  			deleted |= vgic_put_irq_norelease(vcpu->kvm, irq);
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs
  2026-06-05  5:47 ` Oliver Upton
@ 2026-06-05  8:17   ` Marc Zyngier
  2026-06-07 22:25     ` Hyunwoo Kim
  0 siblings, 1 reply; 4+ messages in thread
From: Marc Zyngier @ 2026-06-05  8:17 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Hyunwoo Kim, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, linux-arm-kernel,
	kvmarm

On Fri, 05 Jun 2026 06:47:17 +0100,
Oliver Upton <oupton@kernel.org> wrote:
> 
> Hi Hyunwoo,
> 
> On Fri, Jun 05, 2026 at 06:16:08AM +0900, Hyunwoo Kim wrote:
> > vgic_v3_fold_lr_state() walks the ap_list from last_lr_irq without holding
> > the ap_list_lock, relying on vgic_irq being freed via kfree_rcu() and on
> > interrupts being disabled. vgic_flush_pending_lpis() removes entries with
> > list_del(), which clobbers a node's next pointer, so when another vCPU
> > disables LPIs via GICR_CTLR the walk can follow the clobbered next pointer
> > from a removed node, or from the node that last_lr_irq points to.
> > 
> > Remove entries with list_del_rcu() so that the next pointer stays valid
> > until the walk completes.
> > 
> > Fixes: 3cfd59f81e0f ("KVM: arm64: GICv3: Handle LR overflow when EOImode==0")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> 
> Changing only one of the writer paths to use the rculist helpers does
> not make the ap_list an rculist. Insertions are not RCU-safe, nor are
> deleations from vgic_prune_ap_list().
> 
> And TBH, the real bug here is the fact that vgic_v3_fold_lr_state() isn't
> taking the ap_list_lock.

Yup, that'd be more sensible. I need to convince myself that there is
no possible path from vgic_v*_fold_lr() to vgic_irq_queue_unlock(),
because that one does actually acquire that lock.

	M.

-- 
Jazz isn't dead. It just smells funny.


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs
  2026-06-05  8:17   ` Marc Zyngier
@ 2026-06-07 22:25     ` Hyunwoo Kim
  0 siblings, 0 replies; 4+ messages in thread
From: Hyunwoo Kim @ 2026-06-07 22:25 UTC (permalink / raw)
  To: Marc Zyngier, oupton
  Cc: joey.gouly, seiden, suzuki.poulose, yuzenghui, catalin.marinas,
	will, Sascha.Bischoff, jic23, linux-arm-kernel, kvmarm, imv4bel

On Fri, Jun 05, 2026 at 09:17:56AM +0100, Marc Zyngier wrote:
> On Fri, 05 Jun 2026 06:47:17 +0100,
> Oliver Upton <oupton@kernel.org> wrote:
> > 
> > Hi Hyunwoo,
> > 
> > On Fri, Jun 05, 2026 at 06:16:08AM +0900, Hyunwoo Kim wrote:
> > > vgic_v3_fold_lr_state() walks the ap_list from last_lr_irq without holding
> > > the ap_list_lock, relying on vgic_irq being freed via kfree_rcu() and on
> > > interrupts being disabled. vgic_flush_pending_lpis() removes entries with
> > > list_del(), which clobbers a node's next pointer, so when another vCPU
> > > disables LPIs via GICR_CTLR the walk can follow the clobbered next pointer
> > > from a removed node, or from the node that last_lr_irq points to.
> > > 
> > > Remove entries with list_del_rcu() so that the next pointer stays valid
> > > until the walk completes.
> > > 
> > > Fixes: 3cfd59f81e0f ("KVM: arm64: GICv3: Handle LR overflow when EOImode==0")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > 
> > Changing only one of the writer paths to use the rculist helpers does
> > not make the ap_list an rculist. Insertions are not RCU-safe, nor are
> > deleations from vgic_prune_ap_list().
> > 
> > And TBH, the real bug here is the fact that vgic_v3_fold_lr_state() isn't
> > taking the ap_list_lock.
> 
> Yup, that'd be more sensible. I need to convince myself that there is
> no possible path from vgic_v*_fold_lr() to vgic_irq_queue_unlock(),
> because that one does actually acquire that lock.

I did some more digging into this. What are your thoughts on this 
ap_list_lock approach? It doesn't seem like there's a deadlock problem.
Additionally, this also fixes the eventfd lock-nesting.

This is quite a substantial change, though.


Best regards,
Hyunwoo Kim

---

vgic_v3_fold_lr_state() and vgic_v2_fold_lr_state() walk the tail of the
ap_list starting from last_lr_irq to replay EOIcount-based deactivations.
This walk runs without the ap_list_lock, yet the other paths that touch the
ap_list (vgic_flush_state, vgic_flush_pending_lpis, vgic_prune_ap_list,
vgic_queue_irq_unlock) all take that lock.

Another vCPU can therefore change the ap_list during the walk. For example,
clearing ENABLE_LPIS in GICR_CTLR makes vgic_flush_pending_lpis() remove an
LPI with list_del(), and prune and interrupt injection also add and remove
entries. A lock-free walk can follow the next pointer of a node that has
already been removed. last_lr_irq is also recorded at flush time and
consumed at fold time, so it spans the guest run and may be unlinked from
the ap_list or released in the meantime.

Take the ap_list_lock in vgic_fold_state(), as vgic_flush_state() does, to
serialize the walk against the ap_list modifiers.

vgic_put_irq() takes lpi_xa, and the lock order places lpi_xa above the
ap_list_lock, so calling it while the fold holds the ap_list_lock inverts
that order. The fold therefore only drops the reference with
vgic_put_irq_norelease() and reclaims the LPIs with
vgic_release_deleted_lpis() after dropping the lock, the same way
vgic_flush_pending_lpis() and vgic_prune_ap_list() already do.

Pin last_lr_irq with vgic_get_irq_ref() in vgic_flush_lr_state() so it is
not released during the guest run, continue the tail walk only while it is
still on this vCPU's ap_list, and drop the reference at the end of the
fold.

kvm_notify_acked_irq() takes a regular spinlock via eventfd_signal()
through a registered irqfd resampler, and can re-enter the vgic via
kvm_set_irq() to reach vgic_queue_irq_unlock(), which takes the
ap_list_lock. Neither may run under the raw ap_list_lock, so during the
fold, record only the SPI in question and notify after dropping the
ap_list_lock. The vgic_v3_deactivate() and vgic_v2_deactivate() paths,
which do not take the ap_list_lock, notify directly as before.

---

diff --git a/arch/arm64/kvm/vgic/vgic-v2.c b/arch/arm64/kvm/vgic/vgic-v2.c
index cafa3cb32bda6..fd3db099e1fe9 100644
--- a/arch/arm64/kvm/vgic/vgic-v2.c
+++ b/arch/arm64/kvm/vgic/vgic-v2.c
@@ -53,7 +53,7 @@ static bool lr_signals_eoi_mi(u32 lr_val)
 	       !(lr_val & GICH_LR_HW);
 }

-static void vgic_v2_fold_lr(struct kvm_vcpu *vcpu, u32 val)
+static bool vgic_v2_fold_lr(struct kvm_vcpu *vcpu, u32 val, unsigned long *eoi_spis)
 {
 	u32 cpuid, intid = val & GICH_LR_VIRTUALID;
 	struct vgic_irq *irq;
@@ -63,9 +63,13 @@ static void vgic_v2_fold_lr(struct kvm_vcpu *vcpu, u32 val)
 	cpuid = FIELD_GET(GICH_LR_PHYSID_CPUID, val) & 7;

 	/* Notify fds when the guest EOI'ed a level-triggered SPI */
-	if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid))
-		kvm_notify_acked_irq(vcpu->kvm, 0,
-				     intid - VGIC_NR_PRIVATE_IRQS);
+	if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid)) {
+		if (eoi_spis)
+			__set_bit(intid - VGIC_NR_PRIVATE_IRQS, eoi_spis);
+		else
+			kvm_notify_acked_irq(vcpu->kvm, 0,
+					     intid - VGIC_NR_PRIVATE_IRQS);
+	}

 	irq = vgic_get_vcpu_irq(vcpu, intid);

@@ -98,7 +102,7 @@ static void vgic_v2_fold_lr(struct kvm_vcpu *vcpu, u32 val)
 		irq->on_lr = false;
 	}

-	vgic_put_irq(vcpu->kvm, irq);
+	return vgic_put_irq_norelease(vcpu->kvm, irq);
 }

 static u32 vgic_v2_compute_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq);
@@ -110,19 +114,25 @@ static u32 vgic_v2_compute_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq);
  *   - transferred as is in case of edge sensitive IRQs
  *   - set to the line-level (resample time) for level sensitive IRQs
  */
-void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
+bool vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu, unsigned long *eoi_spis)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_v2_cpu_if *cpuif = &vgic_cpu->vgic_v2;
 	u32 eoicount = FIELD_GET(GICH_HCR_EOICOUNT, cpuif->vgic_hcr);
-	struct vgic_irq *irq = *host_data_ptr(last_lr_irq);
+	struct vgic_irq *last = *host_data_ptr(last_lr_irq);
+	struct vgic_irq *irq = last;
+	bool deleted = false;

 	DEBUG_SPINLOCK_BUG_ON(!irqs_disabled());
+	lockdep_assert_held(&vgic_cpu->ap_list_lock);

 	for (int lr = 0; lr < vgic_cpu->vgic_v2.used_lrs; lr++)
-		vgic_v2_fold_lr(vcpu, cpuif->vgic_lr[lr]);
+		deleted |= vgic_v2_fold_lr(vcpu, cpuif->vgic_lr[lr], eoi_spis);

 	/* See the GICv3 equivalent for the EOIcount handling rationale */
+	if (!last || last->vcpu != vcpu)
+		goto done;
+
 	list_for_each_entry_continue(irq, &vgic_cpu->ap_list_head, ap_list) {
 		u32 lr;

@@ -141,11 +151,17 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 		if (lr & GICH_LR_HW)
 			writel_relaxed(FIELD_GET(GICH_LR_PHYSID_CPUID, lr),
 				       kvm_vgic_global_state.gicc_base + GIC_CPU_DEACTIVATE);
-		vgic_v2_fold_lr(vcpu, lr);
+		deleted |= vgic_v2_fold_lr(vcpu, lr, eoi_spis);
 		eoicount--;
 	}

+done:
+	if (last)
+		deleted |= vgic_put_irq_norelease(vcpu->kvm, last);
+
 	cpuif->used_lrs = 0;
+
+	return deleted;
 }

 void vgic_v2_deactivate(struct kvm_vcpu *vcpu, u32 val)
@@ -205,7 +221,7 @@ void vgic_v2_deactivate(struct kvm_vcpu *vcpu, u32 val)
 		writel_relaxed(FIELD_GET(GICH_LR_PHYSID_CPUID, lr),
 			       kvm_vgic_global_state.gicc_base + GIC_CPU_DEACTIVATE);

-	vgic_v2_fold_lr(vcpu, lr);
+	vgic_v2_fold_lr(vcpu, lr, NULL);

 put:
 	vgic_put_irq(vcpu->kvm, irq);
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 9e841e7afd4a7..f019edf574c3e 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -71,7 +71,7 @@ static bool lr_signals_eoi_mi(u64 lr_val)
 	       !(lr_val & ICH_LR_HW);
 }

-static void vgic_v3_fold_lr(struct kvm_vcpu *vcpu, u64 val)
+static bool vgic_v3_fold_lr(struct kvm_vcpu *vcpu, u64 val, unsigned long *eoi_spis)
 {
 	struct vgic_irq *irq;
 	bool is_v2_sgi = false;
@@ -87,7 +87,7 @@ static void vgic_v3_fold_lr(struct kvm_vcpu *vcpu, u64 val)

 	irq = vgic_get_vcpu_irq(vcpu, intid);
 	if (!irq)	/* An LPI could have been unmapped. */
-		return;
+		return false;

 	scoped_guard(raw_spinlock, &irq->irq_lock) {
 		/* Always preserve the active bit for !LPIs, note deactivation */
@@ -125,12 +125,15 @@ static void vgic_v3_fold_lr(struct kvm_vcpu *vcpu, u64 val)

 	/* Notify fds when the guest EOI'ed a level-triggered SPI, and drop the refcount */
 	if (deactivated && lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid)) {
-		kvm_notify_acked_irq(vcpu->kvm, 0,
-				     intid - VGIC_NR_PRIVATE_IRQS);
+		if (eoi_spis)
+			__set_bit(intid - VGIC_NR_PRIVATE_IRQS, eoi_spis);
+		else
+			kvm_notify_acked_irq(vcpu->kvm, 0,
+					     intid - VGIC_NR_PRIVATE_IRQS);
 		atomic_dec_if_positive(&vcpu->kvm->arch.vgic.active_spis);
 	}

-	vgic_put_irq(vcpu->kvm, irq);
+	return vgic_put_irq_norelease(vcpu->kvm, irq);
 }

 static u64 vgic_v3_compute_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq);
@@ -143,17 +146,20 @@ static void vgic_v3_deactivate_phys(u32 intid)
 		gic_write_dir(intid);
 }

-void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
+bool vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu, unsigned long *eoi_spis)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_v3_cpu_if *cpuif = &vgic_cpu->vgic_v3;
 	u32 eoicount = FIELD_GET(ICH_HCR_EL2_EOIcount, cpuif->vgic_hcr);
-	struct vgic_irq *irq = *host_data_ptr(last_lr_irq);
+	struct vgic_irq *last = *host_data_ptr(last_lr_irq);
+	struct vgic_irq *irq = last;
+	bool deleted = false;

 	DEBUG_SPINLOCK_BUG_ON(!irqs_disabled());
+	lockdep_assert_held(&vgic_cpu->ap_list_lock);

 	for (int lr = 0; lr < cpuif->used_lrs; lr++)
-		vgic_v3_fold_lr(vcpu, cpuif->vgic_lr[lr]);
+		deleted |= vgic_v3_fold_lr(vcpu, cpuif->vgic_lr[lr], eoi_spis);

 	/*
 	 * EOIMode=0: use EOIcount to emulate deactivation. We are
@@ -161,8 +167,12 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 	 * just pick one active interrupt after the other in the tail part
 	 * of the ap_list, past the LRs, and replay the deactivation as if
 	 * the CPU was doing it. We also rely on priority drop to have taken
-	 * place, and the list to be sorted by priority.
+	 * place, and the list to be sorted by priority. Skip if a remote
+	 * flush/prune unlinked last_lr_irq during the guest run.
 	 */
+	if (!last || last->vcpu != vcpu)
+		goto done;
+
 	list_for_each_entry_continue(irq, &vgic_cpu->ap_list_head, ap_list) {
 		u64 lr;

@@ -185,11 +195,17 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 		if (lr & ICH_LR_HW)
 			vgic_v3_deactivate_phys(FIELD_GET(ICH_LR_PHYS_ID_MASK, lr));

-		vgic_v3_fold_lr(vcpu, lr);
+		deleted |= vgic_v3_fold_lr(vcpu, lr, eoi_spis);
 		eoicount--;
 	}

+done:
+	if (last)
+		deleted |= vgic_put_irq_norelease(vcpu->kvm, last);
+
 	cpuif->used_lrs = 0;
+
+	return deleted;
 }

 void vgic_v3_deactivate(struct kvm_vcpu *vcpu, u64 val)
@@ -278,7 +294,7 @@ void vgic_v3_deactivate(struct kvm_vcpu *vcpu, u64 val)
 	if (lr & ICH_LR_HW)
 		vgic_v3_deactivate_phys(FIELD_GET(ICH_LR_PHYS_ID_MASK, lr));

-	vgic_v3_fold_lr(vcpu, lr);
+	vgic_v3_fold_lr(vcpu, lr, NULL);

 put:
 	vgic_put_irq(vcpu->kvm, irq);
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1e9fe8764584d..2461a77576a2b 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -145,7 +145,7 @@ static __must_check bool __vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 	return refcount_dec_and_test(&irq->refcount);
 }

-static __must_check bool vgic_put_irq_norelease(struct kvm *kvm, struct vgic_irq *irq)
+__must_check bool vgic_put_irq_norelease(struct kvm *kvm, struct vgic_irq *irq)
 {
 	if (!__vgic_put_irq(kvm, irq))
 		return false;
@@ -855,6 +855,11 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)

 static void vgic_fold_state(struct kvm_vcpu *vcpu)
 {
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	DECLARE_BITMAP(eoi_spis, VGIC_MAX_SPI - VGIC_NR_PRIVATE_IRQS + 1);
+	bool deleted = false;
+	int spi;
+
 	if (vgic_is_v5(vcpu->kvm)) {
 		vgic_v5_fold_ppi_state(vcpu);
 		return;
@@ -863,10 +868,21 @@ static void vgic_fold_state(struct kvm_vcpu *vcpu)
 	if (!*host_data_ptr(last_lr_irq))
 		return;

-	if (kvm_vgic_global_state.type == VGIC_V2)
-		vgic_v2_fold_lr_state(vcpu);
-	else
-		vgic_v3_fold_lr_state(vcpu);
+	bitmap_zero(eoi_spis, VGIC_MAX_SPI - VGIC_NR_PRIVATE_IRQS + 1);
+
+	scoped_guard(raw_spinlock, &vgic_cpu->ap_list_lock) {
+		if (kvm_vgic_global_state.type == VGIC_V2)
+			deleted = vgic_v2_fold_lr_state(vcpu, eoi_spis);
+		else
+			deleted = vgic_v3_fold_lr_state(vcpu, eoi_spis);
+	}
+
+	if (unlikely(deleted))
+		vgic_release_deleted_lpis(vcpu->kvm);
+
+	/* kvm_notify_acked_irq() grabs regular spinlocks; call after unlock. */
+	for_each_set_bit(spi, eoi_spis, VGIC_MAX_SPI - VGIC_NR_PRIVATE_IRQS + 1)
+		kvm_notify_acked_irq(vcpu->kvm, 0, spi);
 }

 /* Requires the irq_lock to be held. */
@@ -1023,6 +1039,10 @@ static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
 			break;
 	}

+	/* Pin the EOIcount walk start; it may be freed during the guest run. */
+	if (*host_data_ptr(last_lr_irq))
+		vgic_get_irq_ref(*host_data_ptr(last_lr_irq));
+
 	/* Nuke remaining LRs */
 	for (int i = count ; i < kvm_vgic_global_state.nr_lr; i++)
 		vgic_clear_lr(vcpu, i);
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 9d941241c8a2b..e52012f4bdec9 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -262,6 +262,7 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
 struct vgic_irq *vgic_get_irq(struct kvm *kvm, u32 intid);
 struct vgic_irq *vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
+__must_check bool vgic_put_irq_norelease(struct kvm *kvm, struct vgic_irq *irq);
 struct kvm_vcpu *vgic_target_oracle(struct vgic_irq *irq);
 bool vgic_get_phys_line_level(struct vgic_irq *irq);
 void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
@@ -276,7 +277,7 @@ int vgic_check_iorange(struct kvm *kvm, phys_addr_t ioaddr,
 		       phys_addr_t addr, phys_addr_t alignment,
 		       phys_addr_t size);

-void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu);
+bool vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu, unsigned long *eoi_spis);
 void vgic_v2_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v2_deactivate(struct kvm_vcpu *vcpu, u32 val);
 void vgic_v2_clear_lr(struct kvm_vcpu *vcpu, int lr);
@@ -317,7 +318,7 @@ static inline void vgic_get_irq_ref(struct vgic_irq *irq)
 	WARN_ON_ONCE(!vgic_try_get_irq_ref(irq));
 }

-void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu);
+bool vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu, unsigned long *eoi_spis);
 void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, struct vgic_irq *irq, int lr);
 void vgic_v3_clear_lr(struct kvm_vcpu *vcpu, int lr);
 void vgic_v3_deactivate(struct kvm_vcpu *vcpu, u64 val);

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-07 22:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-04 21:16 [PATCH] KVM: arm64: vgic: Use list_del_rcu() when flushing pending LPIs Hyunwoo Kim
2026-06-05  5:47 ` Oliver Upton
2026-06-05  8:17   ` Marc Zyngier
2026-06-07 22:25     ` Hyunwoo Kim

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.