Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
@ 2026-06-04 20:59 Hyunwoo Kim
  2026-06-05  6:00 ` Oliver Upton
  0 siblings, 1 reply; 6+ messages in thread
From: Hyunwoo Kim @ 2026-06-04 20:59 UTC (permalink / raw)
  To: maz, oupton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	eric.auger, christoffer.dall, andre.przywara
  Cc: linux-arm-kernel, kvmarm, imv4bel

vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
an interrupt to another vCPU. After reacquiring the locks it only checks
that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
before moving the interrupt, which assumes that an interrupt whose affinity
is preserved is still queued on this vCPU's ap_list.

That assumption no longer holds if the interrupt is taken off the ap_list
while the locks are dropped. vgic_flush_pending_lpis() removes the
interrupt from the list and sets irq->vcpu to NULL, but leaves
enabled/pending/target_vcpu untouched. As the interrupt is still enabled
and pending, vgic_target_oracle() returns the same target_vcpu, so the
affinity check passes and list_del() is run a second time on an entry that
has already been removed.

Also check that the interrupt is still assigned to this vCPU
(irq->vcpu == vcpu) before moving it.

Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 arch/arm64/kvm/vgic/vgic.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1e9fe8764584..18b280de9a29 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -818,15 +818,16 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
 		raw_spin_lock(&irq->irq_lock);
 
 		/*
-		 * If the affinity has been preserved, move the
-		 * interrupt around. Otherwise, it means things have
-		 * changed while the interrupt was unlocked, and we
-		 * need to replay this.
+		 * If the interrupt is still ours and its affinity has
+		 * been preserved, move it around. Otherwise, it means
+		 * things have changed while the interrupt was unlocked
+		 * (it may even have been taken off the list with its
+		 * affinity left untouched), and we need to replay this.
 		 *
 		 * In all cases, we cannot trust the list not to have
 		 * changed, so we restart from the beginning.
 		 */
-		if (target_vcpu == vgic_target_oracle(irq)) {
+		if (irq->vcpu == vcpu && target_vcpu == vgic_target_oracle(irq)) {
 			struct vgic_cpu *new_cpu = &target_vcpu->arch.vgic_cpu;
 
 			list_del(&irq->ap_list);
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
  2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
@ 2026-06-05  6:00 ` Oliver Upton
  2026-06-05  7:42   ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Oliver Upton @ 2026-06-05  6:00 UTC (permalink / raw)
  To: Hyunwoo Kim
  Cc: maz, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	eric.auger, christoffer.dall, andre.przywara, linux-arm-kernel,
	kvmarm

On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> an interrupt to another vCPU. After reacquiring the locks it only checks
> that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> before moving the interrupt, which assumes that an interrupt whose affinity
> is preserved is still queued on this vCPU's ap_list.
> 
> That assumption no longer holds if the interrupt is taken off the ap_list
> while the locks are dropped. vgic_flush_pending_lpis() removes the
> interrupt from the list and sets irq->vcpu to NULL, but leaves
> enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> and pending, vgic_target_oracle() returns the same target_vcpu, so the
> affinity check passes and list_del() is run a second time on an entry that
> has already been removed.
> 
> Also check that the interrupt is still assigned to this vCPU
> (irq->vcpu == vcpu) before moving it.
> 
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>

Looking at this and the other VGIC patch you sent (which should've been
a combined series), are you trying to deal with a vCPU writing to
another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
behind the back of vCPU A?

That is extremely relevant information as the off-the-cuff reaction is
that no race exists. But since the GIC architecture is awesome and
allows for this sort of insanity, it obviously does....

Anyway, for LPIs resident on a particular RD, there's zero expectation
that the pending state is preserved when EnableLPIs=0. So I'd rather
vgic_flush_pending_lpis() just invalidate the pending state.

Beyond that, I see two other fixes for lifetime issues around the
vgic_irq in the middle of migration. I'd like to see explicit RCU
protection around the release && reacquire of the ap_list_lock rather
than depending on the precondition that IRQs are disabled.

Then vgic_flush_pending_lpis() should leave IRQs intact that are pending
a migration (e.g. irq->vcpu != vgic_target_oracle()) as the only expectation
we need to uphold is that LPIs resident on the RD have the pending state cleared.

Although I think we could benefit from the wetware implementation of the
GIC giving this a once over too. Any thoughts Marc?

Thanks,
Oliver

> ---
>  arch/arm64/kvm/vgic/vgic.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 1e9fe8764584..18b280de9a29 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -818,15 +818,16 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
>  		raw_spin_lock(&irq->irq_lock);
>  
>  		/*
> -		 * If the affinity has been preserved, move the
> -		 * interrupt around. Otherwise, it means things have
> -		 * changed while the interrupt was unlocked, and we
> -		 * need to replay this.
> +		 * If the interrupt is still ours and its affinity has
> +		 * been preserved, move it around. Otherwise, it means
> +		 * things have changed while the interrupt was unlocked
> +		 * (it may even have been taken off the list with its
> +		 * affinity left untouched), and we need to replay this.
>  		 *
>  		 * In all cases, we cannot trust the list not to have
>  		 * changed, so we restart from the beginning.
>  		 */
> -		if (target_vcpu == vgic_target_oracle(irq)) {
> +		if (irq->vcpu == vcpu && target_vcpu == vgic_target_oracle(irq)) {
>  			struct vgic_cpu *new_cpu = &target_vcpu->arch.vgic_cpu;
>  
>  			list_del(&irq->ap_list);
> -- 
> 2.43.0
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
  2026-06-05  6:00 ` Oliver Upton
@ 2026-06-05  7:42   ` Marc Zyngier
  2026-06-05  8:43     ` Oliver Upton
  0 siblings, 1 reply; 6+ messages in thread
From: Marc Zyngier @ 2026-06-05  7:42 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Hyunwoo Kim, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	eric.auger, christoffer.dall, andre.przywara, linux-arm-kernel,
	kvmarm

On Fri, 05 Jun 2026 07:00:37 +0100,
Oliver Upton <oupton@kernel.org> wrote:
> 
> On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > an interrupt to another vCPU. After reacquiring the locks it only checks
> > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > before moving the interrupt, which assumes that an interrupt whose affinity
> > is preserved is still queued on this vCPU's ap_list.
> > 
> > That assumption no longer holds if the interrupt is taken off the ap_list
> > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > affinity check passes and list_del() is run a second time on an entry that
> > has already been removed.
> > 
> > Also check that the interrupt is still assigned to this vCPU
> > (irq->vcpu == vcpu) before moving it.
> > 
> > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> 
> Looking at this and the other VGIC patch you sent (which should've been
> a combined series), are you trying to deal with a vCPU writing to
> another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> behind the back of vCPU A?
> 
> That is extremely relevant information as the off-the-cuff reaction is
> that no race exists. But since the GIC architecture is awesome and
> allows for this sort of insanity, it obviously does....
> 
> Anyway, for LPIs resident on a particular RD, there's zero expectation
> that the pending state is preserved when EnableLPIs=0. So I'd rather
> vgic_flush_pending_lpis() just invalidate the pending state.

Just clearing the pending state introduces a potential problem as we
now have an interrupt that is neither active nor pending on the AP
list. It is not impossible to solve (we now have similar behaviours
with SPI deactivation from another vcpu), but that requires posting a
KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.

> Beyond that, I see two other fixes for lifetime issues around the
> vgic_irq in the middle of migration. I'd like to see explicit RCU
> protection around the release && reacquire of the ap_list_lock rather
> than depending on the precondition that IRQs are disabled.

I'm not sure I follow. Are you suggesting turning the AP list into an
RCU protected list?

Thanks,

	M.

-- 
Jazz isn't dead. It just smells funny.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
  2026-06-05  7:42   ` Marc Zyngier
@ 2026-06-05  8:43     ` Oliver Upton
  2026-06-10 13:52       ` Hyunwoo Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Oliver Upton @ 2026-06-05  8:43 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Hyunwoo Kim, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	eric.auger, christoffer.dall, andre.przywara, linux-arm-kernel,
	kvmarm

On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> On Fri, 05 Jun 2026 07:00:37 +0100,
> Oliver Upton <oupton@kernel.org> wrote:
> > 
> > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > is preserved is still queued on this vCPU's ap_list.
> > > 
> > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > affinity check passes and list_del() is run a second time on an entry that
> > > has already been removed.
> > > 
> > > Also check that the interrupt is still assigned to this vCPU
> > > (irq->vcpu == vcpu) before moving it.
> > > 
> > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > 
> > Looking at this and the other VGIC patch you sent (which should've been
> > a combined series), are you trying to deal with a vCPU writing to
> > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > behind the back of vCPU A?
> > 
> > That is extremely relevant information as the off-the-cuff reaction is
> > that no race exists. But since the GIC architecture is awesome and
> > allows for this sort of insanity, it obviously does....
> > 
> > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > vgic_flush_pending_lpis() just invalidate the pending state.
> 
> Just clearing the pending state introduces a potential problem as we
> now have an interrupt that is neither active nor pending on the AP
> list. It is not impossible to solve (we now have similar behaviours
> with SPI deactivation from another vcpu), but that requires posting a
> KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.

Right, I was suggesting that in addition to deleting the LPI from the AP
list we actually invalidate the pending state so that someone sitting on
a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
NULL

> > Beyond that, I see two other fixes for lifetime issues around the
> > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > protection around the release && reacquire of the ap_list_lock rather
> > than depending on the precondition that IRQs are disabled.
> 
> I'm not sure I follow. Are you suggesting turning the AP list into an
> RCU protected list?

No, sorry, I should expand a little.

We store a reference on the vgic_irq struct in the AP list, which is
stable so long as the ap_list_lock is held. It should be possible for
the refcount to drop to 0 between releasing the ap_list_lock and
reacquiring it.

So either vgic_prune_ap_list() takes an additional reference on the
vgic_irq before dropping the ap_list_lock or rely on RCU to protect
vgic_irq structs observed with a non-zero refcount.

Thanks,
Oliver


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
  2026-06-05  8:43     ` Oliver Upton
@ 2026-06-10 13:52       ` Hyunwoo Kim
  2026-06-10 16:00         ` Marc Zyngier
  0 siblings, 1 reply; 6+ messages in thread
From: Hyunwoo Kim @ 2026-06-10 13:52 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Marc Zyngier, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	andre.przywara, linux-arm-kernel, kvmarm, imv4bel

On Fri, Jun 05, 2026 at 01:43:32AM -0700, Oliver Upton wrote:
> On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> > On Fri, 05 Jun 2026 07:00:37 +0100,
> > Oliver Upton <oupton@kernel.org> wrote:
> > > 
> > > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > > is preserved is still queued on this vCPU's ap_list.
> > > > 
> > > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > > affinity check passes and list_del() is run a second time on an entry that
> > > > has already been removed.
> > > > 
> > > > Also check that the interrupt is still assigned to this vCPU
> > > > (irq->vcpu == vcpu) before moving it.
> > > > 
> > > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > > 
> > > Looking at this and the other VGIC patch you sent (which should've been
> > > a combined series), are you trying to deal with a vCPU writing to
> > > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > > behind the back of vCPU A?
> > > 
> > > That is extremely relevant information as the off-the-cuff reaction is
> > > that no race exists. But since the GIC architecture is awesome and
> > > allows for this sort of insanity, it obviously does....
> > > 
> > > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > > vgic_flush_pending_lpis() just invalidate the pending state.
> > 
> > Just clearing the pending state introduces a potential problem as we
> > now have an interrupt that is neither active nor pending on the AP
> > list. It is not impossible to solve (we now have similar behaviours
> > with SPI deactivation from another vcpu), but that requires posting a
> > KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.
> 
> Right, I was suggesting that in addition to deleting the LPI from the AP
> list we actually invalidate the pending state so that someone sitting on
> a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
> NULL
> 
> > > Beyond that, I see two other fixes for lifetime issues around the
> > > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > > protection around the release && reacquire of the ap_list_lock rather
> > > than depending on the precondition that IRQs are disabled.
> > 
> > I'm not sure I follow. Are you suggesting turning the AP list into an
> > RCU protected list?
> 
> No, sorry, I should expand a little.
> 
> We store a reference on the vgic_irq struct in the AP list, which is
> stable so long as the ap_list_lock is held. It should be possible for
> the refcount to drop to 0 between releasing the ap_list_lock and
> reacquiring it.
> 
> So either vgic_prune_ap_list() takes an additional reference on the
> vgic_irq before dropping the ap_list_lock or rely on RCU to protect
> vgic_irq structs observed with a non-zero refcount.

What are your thoughts on this approach?


Best regards,
Hyunwoo Kim

---

diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index 933983bb2005..7fb871c3ccd8 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -523,7 +523,7 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	 * Retire all pending LPIs on this vcpu anyway as we're
 	 * going to destroy it.
 	 */
-	vgic_flush_pending_lpis(vcpu);
+	vgic_flush_pending_lpis(vcpu, true);

 	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
 	kfree(vgic_cpu->private_irqs);
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 5913a20d8301..f85d63f17af0 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -303,7 +303,7 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
 		if (ctlr != GICR_CTLR_ENABLE_LPIS)
 			return;

-		vgic_flush_pending_lpis(vcpu);
+		vgic_flush_pending_lpis(vcpu, false);
 		vgic_its_invalidate_all_caches(vcpu->kvm);
 		atomic_set_release(&vgic_cpu->ctlr, 0);
 	} else {
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 1e9fe8764584..09629a38fc0a 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -192,7 +192,7 @@ static void vgic_release_deleted_lpis(struct kvm *kvm)
 	xa_unlock_irqrestore(&dist->lpi_xa, flags);
 }

-void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
+void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 	struct vgic_irq *irq, *tmp;
@@ -204,6 +204,13 @@ void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
 	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
 		if (irq_is_lpi(vcpu->kvm, irq->intid)) {
 			raw_spin_lock(&irq->irq_lock);
+			/* Leave interrupts pending a migration for prune. */
+			if (!destroy && irq->vcpu != vgic_target_oracle(irq)) {
+				raw_spin_unlock(&irq->irq_lock);
+				continue;
+			}
+			/* Pending state is not preserved across EnableLPIs=0. */
+			irq->pending_latch = false;
 			list_del(&irq->ap_list);
 			irq->vcpu = NULL;
 			raw_spin_unlock(&irq->irq_lock);
@@ -797,6 +804,9 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)

 		/* This interrupt looks like it has to be migrated. */

+		/* Keep the interrupt alive while the locks are dropped. */
+		vgic_get_irq_ref(irq);
+
 		raw_spin_unlock(&irq->irq_lock);
 		raw_spin_unlock(&vgic_cpu->ap_list_lock);

@@ -839,6 +849,8 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
 		raw_spin_unlock(&vcpuB->arch.vgic_cpu.ap_list_lock);
 		raw_spin_unlock(&vcpuA->arch.vgic_cpu.ap_list_lock);

+		deleted_lpis |= vgic_put_irq_norelease(vcpu->kvm, irq);
+
 		if (target_vcpu_needs_kick) {
 			kvm_make_request(KVM_REQ_IRQ_PENDING, target_vcpu);
 			kvm_vcpu_kick(target_vcpu);
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 9d941241c8a2..c1ac24ede899 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -341,7 +341,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu);
 bool vgic_has_its(struct kvm *kvm);
 int kvm_vgic_register_its_device(void);
 void vgic_enable_lpis(struct kvm_vcpu *vcpu);
-void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu);
+void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy);
 int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
 int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
 int vgic_v3_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
  2026-06-10 13:52       ` Hyunwoo Kim
@ 2026-06-10 16:00         ` Marc Zyngier
  0 siblings, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2026-06-10 16:00 UTC (permalink / raw)
  To: Hyunwoo Kim
  Cc: Oliver Upton, joey.gouly, seiden, suzuki.poulose, yuzenghui,
	catalin.marinas, will, Sascha.Bischoff, jic23, timothy.hayes,
	andre.przywara, linux-arm-kernel, kvmarm

On Wed, 10 Jun 2026 14:52:10 +0100,
Hyunwoo Kim <imv4bel@gmail.com> wrote:
> 
> On Fri, Jun 05, 2026 at 01:43:32AM -0700, Oliver Upton wrote:
> > On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> > > On Fri, 05 Jun 2026 07:00:37 +0100,
> > > Oliver Upton <oupton@kernel.org> wrote:
> > > > 
> > > > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > > > is preserved is still queued on this vCPU's ap_list.
> > > > > 
> > > > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > > > affinity check passes and list_del() is run a second time on an entry that
> > > > > has already been removed.
> > > > > 
> > > > > Also check that the interrupt is still assigned to this vCPU
> > > > > (irq->vcpu == vcpu) before moving it.
> > > > > 
> > > > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > > > 
> > > > Looking at this and the other VGIC patch you sent (which should've been
> > > > a combined series), are you trying to deal with a vCPU writing to
> > > > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > > > behind the back of vCPU A?
> > > > 
> > > > That is extremely relevant information as the off-the-cuff reaction is
> > > > that no race exists. But since the GIC architecture is awesome and
> > > > allows for this sort of insanity, it obviously does....
> > > > 
> > > > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > > > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > > > vgic_flush_pending_lpis() just invalidate the pending state.
> > > 
> > > Just clearing the pending state introduces a potential problem as we
> > > now have an interrupt that is neither active nor pending on the AP
> > > list. It is not impossible to solve (we now have similar behaviours
> > > with SPI deactivation from another vcpu), but that requires posting a
> > > KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.
> > 
> > Right, I was suggesting that in addition to deleting the LPI from the AP
> > list we actually invalidate the pending state so that someone sitting on
> > a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
> > NULL
> > 
> > > > Beyond that, I see two other fixes for lifetime issues around the
> > > > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > > > protection around the release && reacquire of the ap_list_lock rather
> > > > than depending on the precondition that IRQs are disabled.
> > > 
> > > I'm not sure I follow. Are you suggesting turning the AP list into an
> > > RCU protected list?
> > 
> > No, sorry, I should expand a little.
> > 
> > We store a reference on the vgic_irq struct in the AP list, which is
> > stable so long as the ap_list_lock is held. It should be possible for
> > the refcount to drop to 0 between releasing the ap_list_lock and
> > reacquiring it.
> > 
> > So either vgic_prune_ap_list() takes an additional reference on the
> > vgic_irq before dropping the ap_list_lock or rely on RCU to protect
> > vgic_irq structs observed with a non-zero refcount.
> 
> What are your thoughts on this approach?
> 
> 
> Best regards,
> Hyunwoo Kim
> 
> ---
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> index 933983bb2005..7fb871c3ccd8 100644
> --- a/arch/arm64/kvm/vgic/vgic-init.c
> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> @@ -523,7 +523,7 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	 * Retire all pending LPIs on this vcpu anyway as we're
>  	 * going to destroy it.
>  	 */
> -	vgic_flush_pending_lpis(vcpu);
> +	vgic_flush_pending_lpis(vcpu, true);
> 
>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>  	kfree(vgic_cpu->private_irqs);
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 5913a20d8301..f85d63f17af0 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -303,7 +303,7 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>  		if (ctlr != GICR_CTLR_ENABLE_LPIS)
>  			return;
> 
> -		vgic_flush_pending_lpis(vcpu);
> +		vgic_flush_pending_lpis(vcpu, false);
>  		vgic_its_invalidate_all_caches(vcpu->kvm);
>  		atomic_set_release(&vgic_cpu->ctlr, 0);
>  	} else {
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 1e9fe8764584..09629a38fc0a 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -192,7 +192,7 @@ static void vgic_release_deleted_lpis(struct kvm *kvm)
>  	xa_unlock_irqrestore(&dist->lpi_xa, flags);
>  }
> 
> -void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
> +void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  	struct vgic_irq *irq, *tmp;
> @@ -204,6 +204,13 @@ void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
>  	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
>  		if (irq_is_lpi(vcpu->kvm, irq->intid)) {
>  			raw_spin_lock(&irq->irq_lock);
> +			/* Leave interrupts pending a migration for prune. */
> +			if (!destroy && irq->vcpu != vgic_target_oracle(irq)) {
> +				raw_spin_unlock(&irq->irq_lock);
> +				continue;
> +			}

It's rather unclear to me what the semantics of this are.

If vcpu-a decides to nuke the LPIs of vcpu-b and the LPI had in the
meantime been migrated to vcpu-c, but obviously not observed by vcpu-c
yet as the LPI is still on vcpu-b's AP-list, then I don't see the
point in keeping this state.

Am I missing something obvious?

> +			/* Pending state is not preserved across EnableLPIs=0. */
> +			irq->pending_latch = false;

That part I agree with.

>  			list_del(&irq->ap_list);
>  			irq->vcpu = NULL;
>  			raw_spin_unlock(&irq->irq_lock);
> @@ -797,6 +804,9 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
> 
>  		/* This interrupt looks like it has to be migrated. */
> 
> +		/* Keep the interrupt alive while the locks are dropped. */
> +		vgic_get_irq_ref(irq);
> +
>  		raw_spin_unlock(&irq->irq_lock);
>  		raw_spin_unlock(&vgic_cpu->ap_list_lock);
> 
> @@ -839,6 +849,8 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
>  		raw_spin_unlock(&vcpuB->arch.vgic_cpu.ap_list_lock);
>  		raw_spin_unlock(&vcpuA->arch.vgic_cpu.ap_list_lock);
> 
> +		deleted_lpis |= vgic_put_irq_norelease(vcpu->kvm, irq);
> +
>  		if (target_vcpu_needs_kick) {
>  			kvm_make_request(KVM_REQ_IRQ_PENDING, target_vcpu);
>  			kvm_vcpu_kick(target_vcpu);
> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> index 9d941241c8a2..c1ac24ede899 100644
> --- a/arch/arm64/kvm/vgic/vgic.h
> +++ b/arch/arm64/kvm/vgic/vgic.h
> @@ -341,7 +341,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu);
>  bool vgic_has_its(struct kvm *kvm);
>  int kvm_vgic_register_its_device(void);
>  void vgic_enable_lpis(struct kvm_vcpu *vcpu);
> -void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu);
> +void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy);
>  int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
>  int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
>  int vgic_v3_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
> 

I reckon this would work just as well with just the pending state
being removed in vgic_flush_pending_lpis(), and the reference holding
hack in gvgic_prune_ap_list().

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-06-10 16:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
2026-06-05  6:00 ` Oliver Upton
2026-06-05  7:42   ` Marc Zyngier
2026-06-05  8:43     ` Oliver Upton
2026-06-10 13:52       ` Hyunwoo Kim
2026-06-10 16:00         ` Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox