From: Marc Zyngier <maz@kernel.org>
To: Hyunwoo Kim <imv4bel@gmail.com>
Cc: Oliver Upton <oupton@kernel.org>,
joey.gouly@arm.com, seiden@linux.ibm.com, suzuki.poulose@arm.com,
yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
Sascha.Bischoff@arm.com, jic23@kernel.org, timothy.hayes@arm.com,
andre.przywara@arm.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev
Subject: Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
Date: Wed, 10 Jun 2026 17:00:25 +0100 [thread overview]
Message-ID: <865x3qtmg6.wl-maz@kernel.org> (raw)
In-Reply-To: <ailsCnyoS82r_QRz@v4bel>
On Wed, 10 Jun 2026 14:52:10 +0100,
Hyunwoo Kim <imv4bel@gmail.com> wrote:
>
> On Fri, Jun 05, 2026 at 01:43:32AM -0700, Oliver Upton wrote:
> > On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> > > On Fri, 05 Jun 2026 07:00:37 +0100,
> > > Oliver Upton <oupton@kernel.org> wrote:
> > > >
> > > > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > > > is preserved is still queued on this vCPU's ap_list.
> > > > >
> > > > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > > > affinity check passes and list_del() is run a second time on an entry that
> > > > > has already been removed.
> > > > >
> > > > > Also check that the interrupt is still assigned to this vCPU
> > > > > (irq->vcpu == vcpu) before moving it.
> > > > >
> > > > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > > >
> > > > Looking at this and the other VGIC patch you sent (which should've been
> > > > a combined series), are you trying to deal with a vCPU writing to
> > > > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > > > behind the back of vCPU A?
> > > >
> > > > That is extremely relevant information as the off-the-cuff reaction is
> > > > that no race exists. But since the GIC architecture is awesome and
> > > > allows for this sort of insanity, it obviously does....
> > > >
> > > > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > > > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > > > vgic_flush_pending_lpis() just invalidate the pending state.
> > >
> > > Just clearing the pending state introduces a potential problem as we
> > > now have an interrupt that is neither active nor pending on the AP
> > > list. It is not impossible to solve (we now have similar behaviours
> > > with SPI deactivation from another vcpu), but that requires posting a
> > > KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.
> >
> > Right, I was suggesting that in addition to deleting the LPI from the AP
> > list we actually invalidate the pending state so that someone sitting on
> > a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
> > NULL
> >
> > > > Beyond that, I see two other fixes for lifetime issues around the
> > > > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > > > protection around the release && reacquire of the ap_list_lock rather
> > > > than depending on the precondition that IRQs are disabled.
> > >
> > > I'm not sure I follow. Are you suggesting turning the AP list into an
> > > RCU protected list?
> >
> > No, sorry, I should expand a little.
> >
> > We store a reference on the vgic_irq struct in the AP list, which is
> > stable so long as the ap_list_lock is held. It should be possible for
> > the refcount to drop to 0 between releasing the ap_list_lock and
> > reacquiring it.
> >
> > So either vgic_prune_ap_list() takes an additional reference on the
> > vgic_irq before dropping the ap_list_lock or rely on RCU to protect
> > vgic_irq structs observed with a non-zero refcount.
>
> What are your thoughts on this approach?
>
>
> Best regards,
> Hyunwoo Kim
>
> ---
>
> diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
> index 933983bb2005..7fb871c3ccd8 100644
> --- a/arch/arm64/kvm/vgic/vgic-init.c
> +++ b/arch/arm64/kvm/vgic/vgic-init.c
> @@ -523,7 +523,7 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> * Retire all pending LPIs on this vcpu anyway as we're
> * going to destroy it.
> */
> - vgic_flush_pending_lpis(vcpu);
> + vgic_flush_pending_lpis(vcpu, true);
>
> INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
> kfree(vgic_cpu->private_irqs);
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 5913a20d8301..f85d63f17af0 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -303,7 +303,7 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
> if (ctlr != GICR_CTLR_ENABLE_LPIS)
> return;
>
> - vgic_flush_pending_lpis(vcpu);
> + vgic_flush_pending_lpis(vcpu, false);
> vgic_its_invalidate_all_caches(vcpu->kvm);
> atomic_set_release(&vgic_cpu->ctlr, 0);
> } else {
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 1e9fe8764584..09629a38fc0a 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -192,7 +192,7 @@ static void vgic_release_deleted_lpis(struct kvm *kvm)
> xa_unlock_irqrestore(&dist->lpi_xa, flags);
> }
>
> -void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
> +void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy)
> {
> struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> struct vgic_irq *irq, *tmp;
> @@ -204,6 +204,13 @@ void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
> list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {
> if (irq_is_lpi(vcpu->kvm, irq->intid)) {
> raw_spin_lock(&irq->irq_lock);
> + /* Leave interrupts pending a migration for prune. */
> + if (!destroy && irq->vcpu != vgic_target_oracle(irq)) {
> + raw_spin_unlock(&irq->irq_lock);
> + continue;
> + }
It's rather unclear to me what the semantics of this are.
If vcpu-a decides to nuke the LPIs of vcpu-b and the LPI had in the
meantime been migrated to vcpu-c, but obviously not observed by vcpu-c
yet as the LPI is still on vcpu-b's AP-list, then I don't see the
point in keeping this state.
Am I missing something obvious?
> + /* Pending state is not preserved across EnableLPIs=0. */
> + irq->pending_latch = false;
That part I agree with.
> list_del(&irq->ap_list);
> irq->vcpu = NULL;
> raw_spin_unlock(&irq->irq_lock);
> @@ -797,6 +804,9 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
>
> /* This interrupt looks like it has to be migrated. */
>
> + /* Keep the interrupt alive while the locks are dropped. */
> + vgic_get_irq_ref(irq);
> +
> raw_spin_unlock(&irq->irq_lock);
> raw_spin_unlock(&vgic_cpu->ap_list_lock);
>
> @@ -839,6 +849,8 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
> raw_spin_unlock(&vcpuB->arch.vgic_cpu.ap_list_lock);
> raw_spin_unlock(&vcpuA->arch.vgic_cpu.ap_list_lock);
>
> + deleted_lpis |= vgic_put_irq_norelease(vcpu->kvm, irq);
> +
> if (target_vcpu_needs_kick) {
> kvm_make_request(KVM_REQ_IRQ_PENDING, target_vcpu);
> kvm_vcpu_kick(target_vcpu);
> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> index 9d941241c8a2..c1ac24ede899 100644
> --- a/arch/arm64/kvm/vgic/vgic.h
> +++ b/arch/arm64/kvm/vgic/vgic.h
> @@ -341,7 +341,7 @@ void vgic_v3_put(struct kvm_vcpu *vcpu);
> bool vgic_has_its(struct kvm *kvm);
> int kvm_vgic_register_its_device(void);
> void vgic_enable_lpis(struct kvm_vcpu *vcpu);
> -void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu);
> +void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu, bool destroy);
> int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
> int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr *attr);
> int vgic_v3_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
>
I reckon this would work just as well with just the pending state
being removed in vgic_flush_pending_lpis(), and the reference holding
hack in gvgic_prune_ap_list().
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
prev parent reply other threads:[~2026-06-10 16:00 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
2026-06-05 6:00 ` Oliver Upton
2026-06-05 7:42 ` Marc Zyngier
2026-06-05 8:43 ` Oliver Upton
2026-06-10 13:52 ` Hyunwoo Kim
2026-06-10 16:00 ` Marc Zyngier [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=865x3qtmg6.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=Sascha.Bischoff@arm.com \
--cc=andre.przywara@arm.com \
--cc=catalin.marinas@arm.com \
--cc=imv4bel@gmail.com \
--cc=jic23@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=oupton@kernel.org \
--cc=seiden@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=timothy.hayes@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox