From: Marc Zyngier <maz@kernel.org>
To: kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
kvm@vger.kernel.org
Cc: Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Oliver Upton <oliver.upton@linux.dev>,
Zenghui Yu <yuzenghui@huawei.com>,
Christoffer Dall <christoffer.dall@arm.com>,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Subject: [PATCH 08/33] KVM: arm64: Add LR overflow handling documentation
Date: Mon, 3 Nov 2025 16:54:52 +0000 [thread overview]
Message-ID: <20251103165517.2960148-9-maz@kernel.org> (raw)
In-Reply-To: <20251103165517.2960148-1-maz@kernel.org>
Add a bit of documentation describing how we are dealing with LR
overflow. This is mostly a braindump of how things are expected
to work. For now anyway.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/vgic/vgic.c | 81 +++++++++++++++++++++++++++++++++++++-
1 file changed, 80 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 6dd5a10081e27..8d7f6803e601b 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -825,7 +825,86 @@ static int compute_ap_list_depth(struct kvm_vcpu *vcpu,
return count;
}
-/* Requires the VCPU's ap_list_lock to be held. */
+/*
+ * Dealing with LR overflow is close to black magic -- dress accordingly.
+ *
+ * We have to present an almost infinite number of interrupts through a very
+ * limited number of registers. Therefore crucial decisions must be made to
+ * ensure we feed the most relevant interrupts into the LRs, and yet have
+ * some facilities to let the guest interact with those that are not there.
+ *
+ * All considerations below are in the context of interrupts targeting a
+ * single vcpu with non-idle state (either pending, active, or both),
+ * colloquially called the ap_list:
+ *
+ * - Pending interrupts must have priority over active interrupts. This also
+ * excludes pending+active interrupts. This ensures that a guest can
+ * perform priority drops on any number of interrupts, and yet be
+ * presented the next pending one.
+ *
+ * - Deactivation of interrupts outside of the LRs must be tracked by using
+ * either the EOIcount-driven maintenance interrupt, and sometimes by
+ * trapping the DIR register.
+ *
+ * - For EOImode=0, a non-zero EOIcount means walking the ap_list past the
+ * point that made it into the LRs, and deactivate interrupts that would
+ * have made it onto the LRs if we had the space.
+ *
+ * - The MI-generation bits must be used to try and force an exit when the
+ * guest has done enough changes to the LRs that we want to reevaluate the
+ * situation:
+ *
+ * - if the total number of pending interrupts exceeds the number of
+ * LR, NPIE must be set in order to exit once no pending interrupts
+ * are present in the LRs, allowing us to populate the next batch.
+ *
+ * - if there are active interrupts outside of the LRs, then LRENPIE
+ * must be set so that we exit on deactivation of one of these, and
+ * work out which one is to be deactivated. Note that this is not
+ * enough to deal with EOImode=1, see below.
+ *
+ * - if the overall number of interrupts exceeds the number of LRs,
+ * then UIE must be set to allow refilling of the LRs once the
+ * majority of them has been processed.
+ *
+ * - as usual, MI triggers are only an optimisation, since we cannot
+ * rely on the MI being delivered in timely manner...
+ *
+ * - EOImode=1 creates some additional problems:
+ *
+ * - deactivation can happen in any order, and we cannot rely on
+ * EOImode=0's coupling of priority-drop and deactivation which
+ * imposes strict reverse Ack order. This means that DIR must be set
+ * if we have active interrupts outside of the LRs.
+ *
+ * - deactivation of SPIs can occur on any CPU, while the SPI is only
+ * present in the ap_list of the CPU that actually ack-ed it. In that
+ * case, EOIcount doesn't provide enough information, and we must
+ * resort to trapping DIR even if we don't overflow the LRs. Bonus
+ * point for not trapping DIR when no SPIs are pending or active in
+ * the whole VM.
+ *
+ * - LPIs do not suffer the same problem as SPIs on deactivation, as we
+ * have to essentially discard the active state, see below.
+ *
+ * - Virtual LPIs have an active state (surprise!), which gets removed on
+ * priority drop (EOI). However, EOIcount doesn't get bumped when the LPI
+ * is not present in the LR (surprise again!). Special care must therefore
+ * be taken to remove the active state from any activated LPI when exiting
+ * from the guest. This is in a way no different from what happens on the
+ * physical side. We still rely on the running priority to have been
+ * removed from the APRs, irrespective of the LPI being present in the LRs
+ * or not.
+ *
+ * - Virtual SGIs directly injected via GICv4.1 must not affect EOIcount, as
+ * they are not managed in SW and don't have a true active state. So only
+ * set vSGIEOICount when no SGIs are in the ap_list.
+ *
+ * - GICv2 SGIs with multiple sources are injected one source at a time, as
+ * if they were made pending sequentially. This may mean that we don't
+ * always present the HPPI if other interrupts with lower priority are
+ * pending in the LRs. Big deal.
+ */
static void vgic_flush_lr_state(struct kvm_vcpu *vcpu)
{
struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
--
2.47.3
next prev parent reply other threads:[~2025-11-03 16:55 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-03 16:54 [PATCH 00/33] KVM: arm64: Add LR overflow infrastructure Marc Zyngier
2025-11-03 16:54 ` [PATCH 01/33] irqchip/gic: Add missing GICH_HCR control bits Marc Zyngier
2025-11-03 16:54 ` [PATCH 02/33] irqchip/gic: Expose CPU interface VA to KVM Marc Zyngier
2025-11-03 16:54 ` [PATCH 03/33] irqchip/apple-aic: Spit out ICH_MIDR_EL2 value on spurious vGIC MI Marc Zyngier
2025-11-04 11:13 ` Zenghui Yu
2025-11-03 16:54 ` [PATCH 04/33] KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant Marc Zyngier
2025-11-03 16:54 ` [PATCH 05/33] KVM: arm64: GICv3: Detect and work around the lack of ICV_DIR_EL1 trapping Marc Zyngier
2025-11-04 8:50 ` Yao Yuan
2025-11-04 9:04 ` Marc Zyngier
2025-11-04 9:40 ` Yao Yuan
2025-11-05 2:01 ` kernel test robot
2025-11-05 11:31 ` Marc Zyngier
2025-11-03 16:54 ` [PATCH 06/33] KVM: arm64: Repack struct vgic_irq fields Marc Zyngier
2025-11-03 16:54 ` [PATCH 07/33] KVM: arm64: Add tracking of vgic_irq being present in a LR Marc Zyngier
2025-11-03 16:54 ` Marc Zyngier [this message]
2025-11-03 16:54 ` [PATCH 09/33] KVM: arm64: GICv3: Drop LPI active state when folding LRs Marc Zyngier
2025-11-03 16:54 ` [PATCH 10/33] KVM: arm64: GICv3: Preserve EOIcount on exit Marc Zyngier
2025-11-03 16:54 ` [PATCH 11/33] KVM: arm64: GICv3: Decouple ICH_HCR_EL2 programming from LRs Marc Zyngier
2025-11-03 16:54 ` [PATCH 12/33] KVM: arm64: GICv3: Extract LR folding primitive Marc Zyngier
2025-11-03 16:54 ` [PATCH 13/33] KVM: arm64: GICv3: Extract LR computing primitive Marc Zyngier
2025-11-03 16:54 ` [PATCH 14/33] KVM: arm64: GICv2: Preserve EOIcount on exit Marc Zyngier
2025-11-03 16:54 ` [PATCH 15/33] KVM: arm64: GICv2: Decouple GICH_HCR programming from LRs being loaded Marc Zyngier
2025-11-03 16:55 ` [PATCH 16/33] KVM: arm64: GICv2: Extract LR folding primitive Marc Zyngier
2025-11-03 16:55 ` [PATCH 17/33] KVM: arm64: GICv2: Extract LR computing primitive Marc Zyngier
2025-11-03 16:55 ` [PATCH 18/33] KVM: arm64: Compute vgic state irrespective of the number of interrupts Marc Zyngier
2025-11-03 16:55 ` [PATCH 19/33] KVM: arm64: Eagerly save VMCR on exit Marc Zyngier
2025-11-03 16:55 ` [PATCH 20/33] KVM: arm64: Revamp vgic maintenance interrupt configuration Marc Zyngier
2025-11-03 16:55 ` [PATCH 21/33] KVM: arm64: Make vgic_target_oracle() globally available Marc Zyngier
2025-11-03 16:55 ` [PATCH 22/33] KVM: arm64: Invert ap_list sorting to push active interrupts out Marc Zyngier
2025-11-03 16:55 ` [PATCH 23/33] KVM: arm64: Move undeliverable interrupts to the end of ap_list Marc Zyngier
2025-11-03 16:55 ` [PATCH 24/33] KVM: arm64: Use MI to detect groups being enabled/disabled Marc Zyngier
2025-11-03 16:55 ` [PATCH 25/33] KVM: arm64: Add AP-list overflow split/splice Marc Zyngier
2025-11-03 16:55 ` [PATCH 26/33] KVM: arm64: GICv3: Handle LR overflow when EOImode==0 Marc Zyngier
2025-11-03 16:55 ` [PATCH 27/33] KVM: arm64: GICv3: Handle deactivation via ICV_DIR_EL1 traps Marc Zyngier
2025-11-03 16:55 ` [PATCH 28/33] KVM: arm64: GICv3: Add GICv2 SGI handling to deactivation primitive Marc Zyngier
2025-11-03 16:55 ` [PATCH 29/33] KVM: arm64: GICv3: Set ICH_HCR_EL2.TDIR when interrupts overflow LR capacity Marc Zyngier
2025-11-03 16:55 ` [PATCH 30/33] KVM: arm64: GICv2: Handle LR overflow when EOImode==0 Marc Zyngier
2025-11-03 16:55 ` [PATCH 31/33] KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps Marc Zyngier
2025-11-03 16:55 ` [PATCH 32/33] KVM: arm64: GICv2: Always trap GICV_DIR register Marc Zyngier
2025-11-03 16:55 ` [PATCH 33/33] KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation Marc Zyngier
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251103165517.2960148-9-maz@kernel.org \
--to=maz@kernel.org \
--cc=Volodymyr_Babchuk@epam.com \
--cc=christoffer.dall@arm.com \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=oliver.upton@linux.dev \
--cc=suzuki.poulose@arm.com \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.