From: Oliver Upton <oupton@kernel.org>
To: Hyunwoo Kim <imv4bel@gmail.com>
Cc: maz@kernel.org, joey.gouly@arm.com, seiden@linux.ibm.com,
suzuki.poulose@arm.com, yuzenghui@huawei.com,
catalin.marinas@arm.com, will@kernel.org,
Sascha.Bischoff@arm.com, jic23@kernel.org, timothy.hayes@arm.com,
eric.auger@linaro.org, christoffer.dall@linaro.org,
andre.przywara@arm.com, linux-arm-kernel@lists.infradead.org,
kvmarm@lists.linux.dev
Subject: Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
Date: Thu, 4 Jun 2026 23:00:37 -0700 [thread overview]
Message-ID: <aiJmBfXAxcytfGha@kernel.org> (raw)
In-Reply-To: <aiHnI1mu6SGQrgnz@v4bel>
On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> an interrupt to another vCPU. After reacquiring the locks it only checks
> that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> before moving the interrupt, which assumes that an interrupt whose affinity
> is preserved is still queued on this vCPU's ap_list.
>
> That assumption no longer holds if the interrupt is taken off the ap_list
> while the locks are dropped. vgic_flush_pending_lpis() removes the
> interrupt from the list and sets irq->vcpu to NULL, but leaves
> enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> and pending, vgic_target_oracle() returns the same target_vcpu, so the
> affinity check passes and list_del() is run a second time on an entry that
> has already been removed.
>
> Also check that the interrupt is still assigned to this vCPU
> (irq->vcpu == vcpu) before moving it.
>
> Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Looking at this and the other VGIC patch you sent (which should've been
a combined series), are you trying to deal with a vCPU writing to
another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
behind the back of vCPU A?
That is extremely relevant information as the off-the-cuff reaction is
that no race exists. But since the GIC architecture is awesome and
allows for this sort of insanity, it obviously does....
Anyway, for LPIs resident on a particular RD, there's zero expectation
that the pending state is preserved when EnableLPIs=0. So I'd rather
vgic_flush_pending_lpis() just invalidate the pending state.
Beyond that, I see two other fixes for lifetime issues around the
vgic_irq in the middle of migration. I'd like to see explicit RCU
protection around the release && reacquire of the ap_list_lock rather
than depending on the precondition that IRQs are disabled.
Then vgic_flush_pending_lpis() should leave IRQs intact that are pending
a migration (e.g. irq->vcpu != vgic_target_oracle()) as the only expectation
we need to uphold is that LPIs resident on the RD have the pending state cleared.
Although I think we could benefit from the wetware implementation of the
GIC giving this a once over too. Any thoughts Marc?
Thanks,
Oliver
> ---
> arch/arm64/kvm/vgic/vgic.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index 1e9fe8764584..18b280de9a29 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -818,15 +818,16 @@ static void vgic_prune_ap_list(struct kvm_vcpu *vcpu)
> raw_spin_lock(&irq->irq_lock);
>
> /*
> - * If the affinity has been preserved, move the
> - * interrupt around. Otherwise, it means things have
> - * changed while the interrupt was unlocked, and we
> - * need to replay this.
> + * If the interrupt is still ours and its affinity has
> + * been preserved, move it around. Otherwise, it means
> + * things have changed while the interrupt was unlocked
> + * (it may even have been taken off the list with its
> + * affinity left untouched), and we need to replay this.
> *
> * In all cases, we cannot trust the list not to have
> * changed, so we restart from the beginning.
> */
> - if (target_vcpu == vgic_target_oracle(irq)) {
> + if (irq->vcpu == vcpu && target_vcpu == vgic_target_oracle(irq)) {
> struct vgic_cpu *new_cpu = &target_vcpu->arch.vgic_cpu;
>
> list_del(&irq->ap_list);
> --
> 2.43.0
>
next prev parent reply other threads:[~2026-06-05 6:00 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
2026-06-05 6:00 ` Oliver Upton [this message]
2026-06-05 7:42 ` Marc Zyngier
2026-06-05 8:43 ` Oliver Upton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiJmBfXAxcytfGha@kernel.org \
--to=oupton@kernel.org \
--cc=Sascha.Bischoff@arm.com \
--cc=andre.przywara@arm.com \
--cc=catalin.marinas@arm.com \
--cc=christoffer.dall@linaro.org \
--cc=eric.auger@linaro.org \
--cc=imv4bel@gmail.com \
--cc=jic23@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=maz@kernel.org \
--cc=seiden@linux.ibm.com \
--cc=suzuki.poulose@arm.com \
--cc=timothy.hayes@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.