All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oliver Upton <oupton@kernel.org>
To: Marc Zyngier <maz@kernel.org>
Cc: Hyunwoo Kim <imv4bel@gmail.com>,
	joey.gouly@arm.com, seiden@linux.ibm.com, suzuki.poulose@arm.com,
	yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org,
	Sascha.Bischoff@arm.com, jic23@kernel.org, timothy.hayes@arm.com,
	eric.auger@linaro.org, christoffer.dall@linaro.org,
	andre.przywara@arm.com, linux-arm-kernel@lists.infradead.org,
	kvmarm@lists.linux.dev
Subject: Re: [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it
Date: Fri, 5 Jun 2026 01:43:32 -0700	[thread overview]
Message-ID: <aiKMNIUMv9GQiIbD@kernel.org> (raw)
In-Reply-To: <87ecila0w3.wl-maz@kernel.org>

On Fri, Jun 05, 2026 at 08:42:52AM +0100, Marc Zyngier wrote:
> On Fri, 05 Jun 2026 07:00:37 +0100,
> Oliver Upton <oupton@kernel.org> wrote:
> > 
> > On Fri, Jun 05, 2026 at 05:59:15AM +0900, Hyunwoo Kim wrote:
> > > vgic_prune_ap_list() drops both ap_list_lock and irq_lock while migrating
> > > an interrupt to another vCPU. After reacquiring the locks it only checks
> > > that the affinity is unchanged (target_vcpu == vgic_target_oracle(irq))
> > > before moving the interrupt, which assumes that an interrupt whose affinity
> > > is preserved is still queued on this vCPU's ap_list.
> > > 
> > > That assumption no longer holds if the interrupt is taken off the ap_list
> > > while the locks are dropped. vgic_flush_pending_lpis() removes the
> > > interrupt from the list and sets irq->vcpu to NULL, but leaves
> > > enabled/pending/target_vcpu untouched. As the interrupt is still enabled
> > > and pending, vgic_target_oracle() returns the same target_vcpu, so the
> > > affinity check passes and list_del() is run a second time on an entry that
> > > has already been removed.
> > > 
> > > Also check that the interrupt is still assigned to this vCPU
> > > (irq->vcpu == vcpu) before moving it.
> > > 
> > > Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
> > > Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> > 
> > Looking at this and the other VGIC patch you sent (which should've been
> > a combined series), are you trying to deal with a vCPU writing to
> > another vCPU's redistributor? I.e. vCPU B setting GICR_CTLR.EnableLPIs=0
> > behind the back of vCPU A?
> > 
> > That is extremely relevant information as the off-the-cuff reaction is
> > that no race exists. But since the GIC architecture is awesome and
> > allows for this sort of insanity, it obviously does....
> > 
> > Anyway, for LPIs resident on a particular RD, there's zero expectation
> > that the pending state is preserved when EnableLPIs=0. So I'd rather
> > vgic_flush_pending_lpis() just invalidate the pending state.
> 
> Just clearing the pending state introduces a potential problem as we
> now have an interrupt that is neither active nor pending on the AP
> list. It is not impossible to solve (we now have similar behaviours
> with SPI deactivation from another vcpu), but that requires posting a
> KVM_REQ_VGIC_PROCESS_UPDATE to the target vcpu.

Right, I was suggesting that in addition to deleting the LPI from the AP
list we actually invalidate the pending state so that someone sitting on
a pointer to a to-be-freed LPI sees vgic_target_oracle() returning
NULL

> > Beyond that, I see two other fixes for lifetime issues around the
> > vgic_irq in the middle of migration. I'd like to see explicit RCU
> > protection around the release && reacquire of the ap_list_lock rather
> > than depending on the precondition that IRQs are disabled.
> 
> I'm not sure I follow. Are you suggesting turning the AP list into an
> RCU protected list?

No, sorry, I should expand a little.

We store a reference on the vgic_irq struct in the AP list, which is
stable so long as the ap_list_lock is held. It should be possible for
the refcount to drop to 0 between releasing the ap_list_lock and
reacquiring it.

So either vgic_prune_ap_list() takes an additional reference on the
vgic_irq before dropping the ap_list_lock or rely on RCU to protect
vgic_irq structs observed with a non-zero refcount.

Thanks,
Oliver


      reply	other threads:[~2026-06-05  8:43 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-04 20:59 [PATCH] KVM: arm64: vgic: Check the interrupt is still ours before migrating it Hyunwoo Kim
2026-06-05  6:00 ` Oliver Upton
2026-06-05  7:42   ` Marc Zyngier
2026-06-05  8:43     ` Oliver Upton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aiKMNIUMv9GQiIbD@kernel.org \
    --to=oupton@kernel.org \
    --cc=Sascha.Bischoff@arm.com \
    --cc=andre.przywara@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=christoffer.dall@linaro.org \
    --cc=eric.auger@linaro.org \
    --cc=imv4bel@gmail.com \
    --cc=jic23@kernel.org \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=maz@kernel.org \
    --cc=seiden@linux.ibm.com \
    --cc=suzuki.poulose@arm.com \
    --cc=timothy.hayes@arm.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.