* [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
@ 2025-11-17 9:15 Marc Zyngier
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
` (7 more replies)
0 siblings, 8 replies; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
This is a follow-up to the original series [1] (and fixes [2][3])
with a bunch of bug-fixes and improvements. At least one patch has
already been posted, but I thought I might repost it as part of a
series, since I accumulated more stuff:
- The first patch addresses Mark's observation that the no-vgic-v3
test has been broken once more. At some point, we'll have to retire
that functionality, because even if we keep fixing the SR handling,
nobody tests the actual interrupt state exposure to userspace, which
I'm pretty sure has badly been broken for at least 5 years.
- The second one addresses a report from Fuad that on QEMU,
ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
the host exploding on deactivating an interrupt. This behaviour is
allowed by the spec, so make sure we clear all trap bits
- Running vgic_irq in an L1 guest (the test being an L2) results in a
MI storm on the host, as the state synchronisation is done at the
wrong place, much like it was on the non-NV path before it was
reworked. Apply the same methods to the NV code, and enjoy much
better MI emulation, now tested all the way into an L3.
- Nuke a small leftover from previous rework.
- Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
the trap prevents too many spurious MIs in an L1 guest, as the write
to ICH_HCR_EL2 does exactly nothing on its own when running under
FEAT_NV2.
Oliver: this is starting to be a large series of fixes on top of the
existing series, plus the two patches you have already added. I'd be
happy to respin a full v4 with the fixes squashed into their original
patches. On the other hand, if you want to see the history in its full
glory, that also works for me.
[1] https://msgid.link/20251109171619.1507205-1-maz@kernel.org
[2] https://msgid.link/20251113172524.2795158-1-maz@kernel.org
[3] https://lore.kernel.org/kvmarm/86frahu21h.wl-maz@kernel.org
Marc Zyngier (5):
KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is
configured
KVM: arm64: GICv3: Completely disable trapping on vcpu exit
KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI
emulation
KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers
KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
arch/arm64/include/asm/kvm_hyp.h | 1 +
arch/arm64/kvm/hyp/vgic-v3-sr.c | 11 +++-
arch/arm64/kvm/vgic/vgic-v3-nested.c | 78 ++++++++++++++++------------
arch/arm64/kvm/vgic/vgic-v3.c | 3 ++
arch/arm64/kvm/vgic/vgic.c | 6 ++-
arch/arm64/kvm/vgic/vgic.h | 1 +
6 files changed, 62 insertions(+), 38 deletions(-)
--
2.47.3
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
@ 2025-11-17 9:15 ` Marc Zyngier
2025-11-17 10:34 ` Fuad Tabba
2025-11-17 11:29 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit Marc Zyngier
` (6 subsequent siblings)
7 siblings, 2 replies; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
Configuring GICv3 to deal with the lack of GIC in the guest relies
on not setting ICH_HCR_EL2.En in the shadow register, as this is
an indication of the fact that we want to trap all system registers
to report an UNDEF in the guest.
Make sure we leave vgic_hcr untouched in this case.
Reported-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/r/72e1e8b5-e397-4dc5-9cd6-a32b6af3d739@sirena.org.uk
Fixes: 877324a1b5415 ("KVM: arm64: Revamp vgic maintenance interrupt configuration")
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 598621b14a30d..1d6dd1b545bdd 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -26,6 +26,9 @@ void vgic_v3_configure_hcr(struct kvm_vcpu *vcpu,
{
struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
+ if (!irqchip_in_kernel(vcpu->kvm))
+ return;
+
cpuif->vgic_hcr = ICH_HCR_EL2_En;
if (irqs_pending_outside_lrs(als))
--
2.47.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
@ 2025-11-17 9:15 ` Marc Zyngier
2025-11-17 10:36 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation Marc Zyngier
` (5 subsequent siblings)
7 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
Fuad reports that on QEMU, the DIR trapping is still effective after
a vcpu exit and that the host is running nVHE, resulting in a BUG()
(we only expect DIR to be trapped for the guest, and never the host).
As it turns out, this is an implementation-dependent behaviour, which
the architecture allows, but that seem to be relatively uncommon across
implementations.
Fix this by completely zeroing the ICH_HCR_EL2 register when the
vcpu exits.
Reported-by: Fuad Tabba <tabba@google.com>
Fixes: ca30799f7c2d0 ("KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant")
Closes: https://lore.kernel.org/r/CA+EHjTzRwswNq+hZQDD5tXj+-0nr04OmR201mHmi82FJ0VHuJA@mail.gmail.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index e950efa225478..71199e1a92940 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -243,7 +243,7 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
cpu_if->vgic_hcr |= val & ICH_HCR_EL2_EOIcount;
}
- write_gicreg(compute_ich_hcr(cpu_if) & ~ICH_HCR_EL2_En, ICH_HCR_EL2);
+ write_gicreg(0, ICH_HCR_EL2);
}
void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if)
--
2.47.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
2025-11-17 9:15 ` [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit Marc Zyngier
@ 2025-11-17 9:15 ` Marc Zyngier
2025-11-17 11:24 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers Marc Zyngier
` (4 subsequent siblings)
7 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
The current approach to nested GICv3 support is to not do anything
while L2 is running, wait a transition from L2 to L1 to resync
LRs, VMCR and HCR, and only then evaluate the state to decide
whether to generate a maintenance interrupt.
This doesn't provide a good quality of emulation, and it would be
far preferable to find out early that we need to perform a switch.
Move the LRs/VMCR and HCR resync into vgic_v3_sync_nested(), so
that we have most of the state available. As we turning the vgic
off at this stage to avoid a screaming host MI, add a new helper
vgic_v3_flush_nested() that switches the vgic on again. The MI can
then be directly injected as required.
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/include/asm/kvm_hyp.h | 1 +
arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
arch/arm64/kvm/vgic/vgic-v3-nested.c | 69 ++++++++++++++++------------
arch/arm64/kvm/vgic/vgic.c | 6 ++-
arch/arm64/kvm/vgic/vgic.h | 1 +
5 files changed, 46 insertions(+), 33 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index dbf16a9f67728..76ce2b94bd97e 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -77,6 +77,7 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
u64 __gic_v3_get_lr(unsigned int lr);
+void __gic_v3_set_lr(u64 val, int lr);
void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 71199e1a92940..99342c13e1794 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -60,7 +60,7 @@ u64 __gic_v3_get_lr(unsigned int lr)
unreachable();
}
-static void __gic_v3_set_lr(u64 val, int lr)
+void __gic_v3_set_lr(u64 val, int lr)
{
switch (lr & 0xf) {
case 0:
diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
index 17bceef83269e..bf37fd3198ba7 100644
--- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
+++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
@@ -70,13 +70,14 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
* - on L2 put: perform the inverse transformation, so that the result of L2
* running becomes visible to L1 in the VNCR-accessible registers.
*
- * - there is nothing to do on L2 entry, as everything will have happened
- * on load. However, this is the point where we detect that an interrupt
- * targeting L1 and prepare the grand switcheroo.
+ * - there is nothing to do on L2 entry apart from enabling the vgic, as
+ * everything will have happened on load. However, this is the point where
+ * we detect that an interrupt targeting L1 and prepare the grand
+ * switcheroo.
*
- * - on L2 exit: emulate the HW bit, and deactivate corresponding the L1
- * interrupt. The L0 active state will be cleared by the HW if the L1
- * interrupt was itself backed by a HW interrupt.
+ * - on L2 exit: resync the LRs and VMCR, emulate the HW bit, and deactivate
+ * corresponding the L1 interrupt. The L0 active state will be cleared by
+ * the HW if the L1 interrupt was itself backed by a HW interrupt.
*
* Maintenance Interrupt (MI) management:
*
@@ -265,15 +266,30 @@ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu,
s_cpu_if->used_lrs = hweight16(shadow_if->lr_map);
}
+void vgic_v3_flush_nested(struct kvm_vcpu *vcpu)
+{
+ u64 val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
+
+ write_sysreg_s(val | vgic_ich_hcr_trap_bits(), SYS_ICH_HCR_EL2);
+}
+
void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
{
struct shadow_if *shadow_if = get_shadow_if();
int i;
for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
- u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
+ u64 val, host_lr, lr;
struct vgic_irq *irq;
+ host_lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
+
+ /* Propagate the new LR state */
+ lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
+ val = lr & ~ICH_LR_STATE;
+ val |= host_lr & ICH_LR_STATE;
+ __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
+
if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE))
continue;
@@ -286,12 +302,21 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */
continue;
- lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
- if (!(lr & ICH_LR_STATE))
+ if (!(host_lr & ICH_LR_STATE))
irq->active = false;
vgic_put_irq(vcpu->kvm, irq);
}
+
+ /* We need these to be synchronised to generate the MI */
+ __vcpu_assign_sys_reg(vcpu, ICH_VMCR_EL2, read_sysreg_s(SYS_ICH_VMCR_EL2));
+ __vcpu_rmw_sys_reg(vcpu, ICH_HCR_EL2, &=, ~ICH_HCR_EL2_EOIcount);
+ __vcpu_rmw_sys_reg(vcpu, ICH_HCR_EL2, |=, read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EL2_EOIcount);
+
+ write_sysreg_s(0, SYS_ICH_HCR_EL2);
+ isb();
+
+ vgic_v3_nested_update_mi(vcpu);
}
static void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu,
@@ -325,7 +350,8 @@ void vgic_v3_load_nested(struct kvm_vcpu *vcpu)
__vgic_v3_restore_vmcr_aprs(cpu_if);
__vgic_v3_activate_traps(cpu_if);
- __vgic_v3_restore_state(cpu_if);
+ for (int i = 0; i < cpu_if->used_lrs; i++)
+ __gic_v3_set_lr(cpu_if->vgic_lr[i], i);
/*
* Propagate the number of used LRs for the benefit of the HYP
@@ -338,36 +364,19 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu)
{
struct shadow_if *shadow_if = get_shadow_if();
struct vgic_v3_cpu_if *s_cpu_if = &shadow_if->cpuif;
- u64 val;
int i;
__vgic_v3_save_aprs(s_cpu_if);
- __vgic_v3_deactivate_traps(s_cpu_if);
- __vgic_v3_save_state(s_cpu_if);
-
- /*
- * Translate the shadow state HW fields back to the virtual ones
- * before copying the shadow struct back to the nested one.
- */
- val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
- val &= ~ICH_HCR_EL2_EOIcount_MASK;
- val |= (s_cpu_if->vgic_hcr & ICH_HCR_EL2_EOIcount_MASK);
- __vcpu_assign_sys_reg(vcpu, ICH_HCR_EL2, val);
- __vcpu_assign_sys_reg(vcpu, ICH_VMCR_EL2, s_cpu_if->vgic_vmcr);
for (i = 0; i < 4; i++) {
__vcpu_assign_sys_reg(vcpu, ICH_AP0RN(i), s_cpu_if->vgic_ap0r[i]);
__vcpu_assign_sys_reg(vcpu, ICH_AP1RN(i), s_cpu_if->vgic_ap1r[i]);
}
- for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
- val = __vcpu_sys_reg(vcpu, ICH_LRN(i));
-
- val &= ~ICH_LR_STATE;
- val |= s_cpu_if->vgic_lr[lr_map_idx_to_shadow_idx(shadow_if, i)] & ICH_LR_STATE;
+ for (i = 0; i < s_cpu_if->used_lrs; i++)
+ __gic_v3_set_lr(0, i);
- __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
- }
+ __vgic_v3_deactivate_traps(s_cpu_if);
vcpu->arch.vgic_cpu.vgic_v3.used_lrs = 0;
}
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index a2f408754774e..4e4db52008c10 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -1056,8 +1056,9 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
* abort the entry procedure and inject the exception at the
* beginning of the run loop.
*
- * - Otherwise, do exactly *NOTHING*. The guest state is
- * already loaded, and we can carry on with running it.
+ * - Otherwise, do exactly *NOTHING* apart from enabling the virtual
+ * CPU interface. The guest state is already loaded, and we can
+ * carry on with running it.
*
* If we have NV, but are not in a nested state, compute the
* maintenance interrupt state, as it may fire.
@@ -1066,6 +1067,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
if (kvm_vgic_vcpu_pending_irq(vcpu))
kvm_make_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu);
+ vgic_v3_flush_nested(vcpu);
return;
}
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index ec3a61e8e6b30..5f0fc96b4dc29 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -446,6 +446,7 @@ static inline bool kvm_has_gicv3(struct kvm *kvm)
return kvm_has_feat(kvm, ID_AA64PFR0_EL1, GIC, IMP);
}
+void vgic_v3_flush_nested(struct kvm_vcpu *vcpu);
void vgic_v3_sync_nested(struct kvm_vcpu *vcpu);
void vgic_v3_load_nested(struct kvm_vcpu *vcpu);
void vgic_v3_put_nested(struct kvm_vcpu *vcpu);
--
2.47.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
` (2 preceding siblings ...)
2025-11-17 9:15 ` [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation Marc Zyngier
@ 2025-11-17 9:15 ` Marc Zyngier
2025-11-17 11:25 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En Marc Zyngier
` (3 subsequent siblings)
7 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
There's a bizarre or'ing of a 0 with the guest's ICH_HCR_EL2's
value, which is a leftover from the host workaround merging
code. Just kill it.
Fixes: ca30799f7c2d0 ("KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant")
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/vgic/vgic-v3-nested.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
index bf37fd3198ba7..40f7a37e0685c 100644
--- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
+++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
@@ -323,10 +323,9 @@ static void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu,
struct vgic_v3_cpu_if *s_cpu_if)
{
struct vgic_v3_cpu_if *host_if = &vcpu->arch.vgic_cpu.vgic_v3;
- u64 val = 0;
int i;
- s_cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) | val;
+ s_cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
s_cpu_if->vgic_vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2);
s_cpu_if->vgic_sre = host_if->vgic_sre;
--
2.47.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
` (3 preceding siblings ...)
2025-11-17 9:15 ` [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers Marc Zyngier
@ 2025-11-17 9:15 ` Marc Zyngier
2025-11-17 11:35 ` Fuad Tabba
2025-11-18 7:16 ` Oliver Upton
2025-11-17 9:40 ` [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Fuad Tabba
` (2 subsequent siblings)
7 siblings, 2 replies; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:15 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm
Cc: Joey Gouly, Suzuki K Poulose, Oliver Upton, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
interrupt is vain. This only hits memory, and the guest hasn't cleared
anything -- the MI will fire.
For example, running the vgic_irq test under NV results in about 800
maintenance interrupts being actually handled by the L1 guest,
when none were expected.
As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
the host in NV, giving it the opportunity to retire the pending
MI. With this, the above test tuns to completion without any MI
being actually handled.
Yes, this is really poor...
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 99342c13e1794..f503cf01ac82c 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
}
write_gicreg(0, ICH_HCR_EL2);
+
+ /*
+ * Hack alert: On NV, this results in a trap so that the above
+ * write actually takes effect...
+ */
+ isb();
+ read_gicreg(ICH_MISR_EL2);
}
void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if)
diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
index 40f7a37e0685c..d6797632157a0 100644
--- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
+++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
@@ -94,8 +94,10 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
*
* - because most of the ICH_*_EL2 registers live in the VNCR page, the
* quality of emulation is poor: L1 can setup the vgic so that an MI would
- * immediately fire, and not observe anything until the next exit. Trying
- * to read ICH_MISR_EL2 would do the trick, for example.
+ * immediately fire, and not observe anything until the next exit.
+ * Similarly, a pending MI is not immediately disabled by clearing
+ * ICH_HCR_EL2.En. Trying to read ICH_MISR_EL2 would do the trick, for
+ * example.
*
* System register emulation:
*
--
2.47.3
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
` (4 preceding siblings ...)
2025-11-17 9:15 ` [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En Marc Zyngier
@ 2025-11-17 9:40 ` Fuad Tabba
2025-11-17 9:54 ` Marc Zyngier
2025-11-18 7:20 ` Oliver Upton
2025-11-18 23:34 ` Oliver Upton
7 siblings, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 9:40 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
>
> This is a follow-up to the original series [1] (and fixes [2][3])
> with a bunch of bug-fixes and improvements. At least one patch has
> already been posted, but I thought I might repost it as part of a
> series, since I accumulated more stuff:
I'd like to test this series as well. Do you have it applied in one of
your branches at
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
, or which commit is it based on?
Thanks,
/fuad
> - The first patch addresses Mark's observation that the no-vgic-v3
> test has been broken once more. At some point, we'll have to retire
> that functionality, because even if we keep fixing the SR handling,
> nobody tests the actual interrupt state exposure to userspace, which
> I'm pretty sure has badly been broken for at least 5 years.
>
> - The second one addresses a report from Fuad that on QEMU,
> ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
> the host exploding on deactivating an interrupt. This behaviour is
> allowed by the spec, so make sure we clear all trap bits
>
> - Running vgic_irq in an L1 guest (the test being an L2) results in a
> MI storm on the host, as the state synchronisation is done at the
> wrong place, much like it was on the non-NV path before it was
> reworked. Apply the same methods to the NV code, and enjoy much
> better MI emulation, now tested all the way into an L3.
>
> - Nuke a small leftover from previous rework.
>
> - Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
> the trap prevents too many spurious MIs in an L1 guest, as the write
> to ICH_HCR_EL2 does exactly nothing on its own when running under
> FEAT_NV2.
>
> Oliver: this is starting to be a large series of fixes on top of the
> existing series, plus the two patches you have already added. I'd be
> happy to respin a full v4 with the fixes squashed into their original
> patches. On the other hand, if you want to see the history in its full
> glory, that also works for me.
>
> [1] https://msgid.link/20251109171619.1507205-1-maz@kernel.org
> [2] https://msgid.link/20251113172524.2795158-1-maz@kernel.org
> [3] https://lore.kernel.org/kvmarm/86frahu21h.wl-maz@kernel.org
>
> Marc Zyngier (5):
> KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is
> configured
> KVM: arm64: GICv3: Completely disable trapping on vcpu exit
> KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI
> emulation
> KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers
> KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
>
> arch/arm64/include/asm/kvm_hyp.h | 1 +
> arch/arm64/kvm/hyp/vgic-v3-sr.c | 11 +++-
> arch/arm64/kvm/vgic/vgic-v3-nested.c | 78 ++++++++++++++++------------
> arch/arm64/kvm/vgic/vgic-v3.c | 3 ++
> arch/arm64/kvm/vgic/vgic.c | 6 ++-
> arch/arm64/kvm/vgic/vgic.h | 1 +
> 6 files changed, 62 insertions(+), 38 deletions(-)
>
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 9:40 ` [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Fuad Tabba
@ 2025-11-17 9:54 ` Marc Zyngier
2025-11-17 10:18 ` Fuad Tabba
0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 9:54 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Fuad,
On Mon, 17 Nov 2025 09:40:47 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
> On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> >
> > This is a follow-up to the original series [1] (and fixes [2][3])
> > with a bunch of bug-fixes and improvements. At least one patch has
> > already been posted, but I thought I might repost it as part of a
> > series, since I accumulated more stuff:
>
> I'd like to test this series as well. Do you have it applied in one of
> your branches at
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
> , or which commit is it based on?
I just pushed a new branch
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow-fixes
that is based on -rc5, kvmarm/next, kvmarm-fixes-6.18-rc3 plus these
patches. Let me know how this fares for you.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 9:54 ` Marc Zyngier
@ 2025-11-17 10:18 ` Fuad Tabba
2025-11-17 12:54 ` Fuad Tabba
0 siblings, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 10:18 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:54, Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Fuad,
>
> On Mon, 17 Nov 2025 09:40:47 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> > On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > This is a follow-up to the original series [1] (and fixes [2][3])
> > > with a bunch of bug-fixes and improvements. At least one patch has
> > > already been posted, but I thought I might repost it as part of a
> > > series, since I accumulated more stuff:
> >
> > I'd like to test this series as well. Do you have it applied in one of
> > your branches at
> > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
> > , or which commit is it based on?
>
> I just pushed a new branch
>
> https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow-fixes
>
> that is based on -rc5, kvmarm/next, kvmarm-fixes-6.18-rc3 plus these
> patches. Let me know how this fares for you.
Great! I've applied the pKVM patches on top of it. So far so good.
I'll test this series more thoroughly and review it as well. Stay tuned...
Cheers,
/fuad
>
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
@ 2025-11-17 10:34 ` Fuad Tabba
2025-11-17 11:28 ` Marc Zyngier
2025-11-17 11:29 ` Fuad Tabba
1 sibling, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 10:34 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
>
> Configuring GICv3 to deal with the lack of GIC in the guest relies
> on not setting ICH_HCR_EL2.En in the shadow register, as this is
> an indication of the fact that we want to trap all system registers
> to report an UNDEF in the guest.
>
> Make sure we leave vgic_hcr untouched in this case.
>
> Reported-by: Mark Brown <broonie@kernel.org>
> Tested-by: Mark Brown <broonie@kernel.org>
> Closes: https://lore.kernel.org/r/72e1e8b5-e397-4dc5-9cd6-a32b6af3d739@sirena.org.uk
> Fixes: 877324a1b5415 ("KVM: arm64: Revamp vgic maintenance interrupt configuration")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 598621b14a30d..1d6dd1b545bdd 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -26,6 +26,9 @@ void vgic_v3_configure_hcr(struct kvm_vcpu *vcpu,
> {
> struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
>
> + if (!irqchip_in_kernel(vcpu->kvm))
> + return;
> +
Bear with me, since I'm not too familiar with this code. This is the
only function that initializes cpuif->vgic_hcr. Should we be
explicitly setting it to 0, or warn if ICH_HCR_EL2_En is set?
Cheers,
/fuad
> cpuif->vgic_hcr = ICH_HCR_EL2_En;
>
> if (irqs_pending_outside_lrs(als))
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit
2025-11-17 9:15 ` [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit Marc Zyngier
@ 2025-11-17 10:36 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 10:36 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:20, Marc Zyngier <maz@kernel.org> wrote:
>
> Fuad reports that on QEMU, the DIR trapping is still effective after
> a vcpu exit and that the host is running nVHE, resulting in a BUG()
> (we only expect DIR to be trapped for the guest, and never the host).
>
> As it turns out, this is an implementation-dependent behaviour, which
> the architecture allows, but that seem to be relatively uncommon across
> implementations.
>
> Fix this by completely zeroing the ICH_HCR_EL2 register when the
> vcpu exits.
>
> Reported-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> Fixes: ca30799f7c2d0 ("KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant")
> Closes: https://lore.kernel.org/r/CA+EHjTzRwswNq+hZQDD5tXj+-0nr04OmR201mHmi82FJ0VHuJA@mail.gmail.com
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index e950efa225478..71199e1a92940 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -243,7 +243,7 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> cpu_if->vgic_hcr |= val & ICH_HCR_EL2_EOIcount;
> }
>
> - write_gicreg(compute_ich_hcr(cpu_if) & ~ICH_HCR_EL2_En, ICH_HCR_EL2);
> + write_gicreg(0, ICH_HCR_EL2);
> }
>
> void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if)
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
2025-11-17 9:15 ` [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation Marc Zyngier
@ 2025-11-17 11:24 ` Fuad Tabba
2025-11-17 11:34 ` Marc Zyngier
0 siblings, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:24 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
>
> The current approach to nested GICv3 support is to not do anything
> while L2 is running, wait a transition from L2 to L1 to resync
> LRs, VMCR and HCR, and only then evaluate the state to decide
> whether to generate a maintenance interrupt.
>
> This doesn't provide a good quality of emulation, and it would be
> far preferable to find out early that we need to perform a switch.
>
> Move the LRs/VMCR and HCR resync into vgic_v3_sync_nested(), so
> that we have most of the state available. As we turning the vgic
> off at this stage to avoid a screaming host MI, add a new helper
> vgic_v3_flush_nested() that switches the vgic on again. The MI can
> then be directly injected as required.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/include/asm/kvm_hyp.h | 1 +
> arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
> arch/arm64/kvm/vgic/vgic-v3-nested.c | 69 ++++++++++++++++------------
> arch/arm64/kvm/vgic/vgic.c | 6 ++-
> arch/arm64/kvm/vgic/vgic.h | 1 +
> 5 files changed, 46 insertions(+), 33 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index dbf16a9f67728..76ce2b94bd97e 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -77,6 +77,7 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
>
> u64 __gic_v3_get_lr(unsigned int lr);
> +void __gic_v3_set_lr(u64 val, int lr);
>
> void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
> void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 71199e1a92940..99342c13e1794 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -60,7 +60,7 @@ u64 __gic_v3_get_lr(unsigned int lr)
> unreachable();
> }
>
> -static void __gic_v3_set_lr(u64 val, int lr)
> +void __gic_v3_set_lr(u64 val, int lr)
> {
> switch (lr & 0xf) {
> case 0:
> diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> index 17bceef83269e..bf37fd3198ba7 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> @@ -70,13 +70,14 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
> * - on L2 put: perform the inverse transformation, so that the result of L2
> * running becomes visible to L1 in the VNCR-accessible registers.
> *
> - * - there is nothing to do on L2 entry, as everything will have happened
> - * on load. However, this is the point where we detect that an interrupt
> - * targeting L1 and prepare the grand switcheroo.
> + * - there is nothing to do on L2 entry apart from enabling the vgic, as
> + * everything will have happened on load. However, this is the point where
> + * we detect that an interrupt targeting L1 and prepare the grand
> + * switcheroo.
> *
> - * - on L2 exit: emulate the HW bit, and deactivate corresponding the L1
> - * interrupt. The L0 active state will be cleared by the HW if the L1
> - * interrupt was itself backed by a HW interrupt.
> + * - on L2 exit: resync the LRs and VMCR, emulate the HW bit, and deactivate
> + * corresponding the L1 interrupt. The L0 active state will be cleared by
> + * the HW if the L1 interrupt was itself backed by a HW interrupt.
> *
> * Maintenance Interrupt (MI) management:
> *
> @@ -265,15 +266,30 @@ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu,
> s_cpu_if->used_lrs = hweight16(shadow_if->lr_map);
> }
>
> +void vgic_v3_flush_nested(struct kvm_vcpu *vcpu)
> +{
> + u64 val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
> +
> + write_sysreg_s(val | vgic_ich_hcr_trap_bits(), SYS_ICH_HCR_EL2);
> +}
> +
> void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> {
> struct shadow_if *shadow_if = get_shadow_if();
> int i;
>
> for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
> - u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> + u64 val, host_lr, lr;
> struct vgic_irq *irq;
>
> + host_lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> +
> + /* Propagate the new LR state */
> + lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> + val = lr & ~ICH_LR_STATE;
> + val |= host_lr & ICH_LR_STATE;
> + __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
> +
As I said before, I am outside of my comfort zone here. However,
should the following check be changed to use the merged 'val', rather
than the guest lr as it was?
Cheers,
/fuad
> if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE))
> continue;
>
> @@ -286,12 +302,21 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */
> continue;
>
> - lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> - if (!(lr & ICH_LR_STATE))
> + if (!(host_lr & ICH_LR_STATE))
> irq->active = false;
>
> vgic_put_irq(vcpu->kvm, irq);
> }
> +
> + /* We need these to be synchronised to generate the MI */
> + __vcpu_assign_sys_reg(vcpu, ICH_VMCR_EL2, read_sysreg_s(SYS_ICH_VMCR_EL2));
> + __vcpu_rmw_sys_reg(vcpu, ICH_HCR_EL2, &=, ~ICH_HCR_EL2_EOIcount);
> + __vcpu_rmw_sys_reg(vcpu, ICH_HCR_EL2, |=, read_sysreg_s(SYS_ICH_HCR_EL2) & ICH_HCR_EL2_EOIcount);
> +
> + write_sysreg_s(0, SYS_ICH_HCR_EL2);
> + isb();
> +
> + vgic_v3_nested_update_mi(vcpu);
> }
>
> static void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu,
> @@ -325,7 +350,8 @@ void vgic_v3_load_nested(struct kvm_vcpu *vcpu)
> __vgic_v3_restore_vmcr_aprs(cpu_if);
> __vgic_v3_activate_traps(cpu_if);
>
> - __vgic_v3_restore_state(cpu_if);
> + for (int i = 0; i < cpu_if->used_lrs; i++)
> + __gic_v3_set_lr(cpu_if->vgic_lr[i], i);
>
> /*
> * Propagate the number of used LRs for the benefit of the HYP
> @@ -338,36 +364,19 @@ void vgic_v3_put_nested(struct kvm_vcpu *vcpu)
> {
> struct shadow_if *shadow_if = get_shadow_if();
> struct vgic_v3_cpu_if *s_cpu_if = &shadow_if->cpuif;
> - u64 val;
> int i;
>
> __vgic_v3_save_aprs(s_cpu_if);
> - __vgic_v3_deactivate_traps(s_cpu_if);
> - __vgic_v3_save_state(s_cpu_if);
> -
> - /*
> - * Translate the shadow state HW fields back to the virtual ones
> - * before copying the shadow struct back to the nested one.
> - */
> - val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
> - val &= ~ICH_HCR_EL2_EOIcount_MASK;
> - val |= (s_cpu_if->vgic_hcr & ICH_HCR_EL2_EOIcount_MASK);
> - __vcpu_assign_sys_reg(vcpu, ICH_HCR_EL2, val);
> - __vcpu_assign_sys_reg(vcpu, ICH_VMCR_EL2, s_cpu_if->vgic_vmcr);
>
> for (i = 0; i < 4; i++) {
> __vcpu_assign_sys_reg(vcpu, ICH_AP0RN(i), s_cpu_if->vgic_ap0r[i]);
> __vcpu_assign_sys_reg(vcpu, ICH_AP1RN(i), s_cpu_if->vgic_ap1r[i]);
> }
>
> - for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
> - val = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> -
> - val &= ~ICH_LR_STATE;
> - val |= s_cpu_if->vgic_lr[lr_map_idx_to_shadow_idx(shadow_if, i)] & ICH_LR_STATE;
> + for (i = 0; i < s_cpu_if->used_lrs; i++)
> + __gic_v3_set_lr(0, i);
>
> - __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
> - }
> + __vgic_v3_deactivate_traps(s_cpu_if);
>
> vcpu->arch.vgic_cpu.vgic_v3.used_lrs = 0;
> }
> diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
> index a2f408754774e..4e4db52008c10 100644
> --- a/arch/arm64/kvm/vgic/vgic.c
> +++ b/arch/arm64/kvm/vgic/vgic.c
> @@ -1056,8 +1056,9 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
> * abort the entry procedure and inject the exception at the
> * beginning of the run loop.
> *
> - * - Otherwise, do exactly *NOTHING*. The guest state is
> - * already loaded, and we can carry on with running it.
> + * - Otherwise, do exactly *NOTHING* apart from enabling the virtual
> + * CPU interface. The guest state is already loaded, and we can
> + * carry on with running it.
> *
> * If we have NV, but are not in a nested state, compute the
> * maintenance interrupt state, as it may fire.
> @@ -1066,6 +1067,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
> if (kvm_vgic_vcpu_pending_irq(vcpu))
> kvm_make_request(KVM_REQ_GUEST_HYP_IRQ_PENDING, vcpu);
>
> + vgic_v3_flush_nested(vcpu);
> return;
> }
>
> diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> index ec3a61e8e6b30..5f0fc96b4dc29 100644
> --- a/arch/arm64/kvm/vgic/vgic.h
> +++ b/arch/arm64/kvm/vgic/vgic.h
> @@ -446,6 +446,7 @@ static inline bool kvm_has_gicv3(struct kvm *kvm)
> return kvm_has_feat(kvm, ID_AA64PFR0_EL1, GIC, IMP);
> }
>
> +void vgic_v3_flush_nested(struct kvm_vcpu *vcpu);
> void vgic_v3_sync_nested(struct kvm_vcpu *vcpu);
> void vgic_v3_load_nested(struct kvm_vcpu *vcpu);
> void vgic_v3_put_nested(struct kvm_vcpu *vcpu);
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers
2025-11-17 9:15 ` [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers Marc Zyngier
@ 2025-11-17 11:25 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:25 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 at 09:20, Marc Zyngier <maz@kernel.org> wrote:
>
> There's a bizarre or'ing of a 0 with the guest's ICH_HCR_EL2's
> value, which is a leftover from the host workaround merging
> code. Just kill it.
>
> Fixes: ca30799f7c2d0 ("KVM: arm64: Turn vgic-v3 errata traps into a patched-in constant")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
Reviewed-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> arch/arm64/kvm/vgic/vgic-v3-nested.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> index bf37fd3198ba7..40f7a37e0685c 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> @@ -323,10 +323,9 @@ static void vgic_v3_create_shadow_state(struct kvm_vcpu *vcpu,
> struct vgic_v3_cpu_if *s_cpu_if)
> {
> struct vgic_v3_cpu_if *host_if = &vcpu->arch.vgic_cpu.vgic_v3;
> - u64 val = 0;
> int i;
>
> - s_cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2) | val;
> + s_cpu_if->vgic_hcr = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
> s_cpu_if->vgic_vmcr = __vcpu_sys_reg(vcpu, ICH_VMCR_EL2);
> s_cpu_if->vgic_sre = host_if->vgic_sre;
>
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured
2025-11-17 10:34 ` Fuad Tabba
@ 2025-11-17 11:28 ` Marc Zyngier
0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 11:28 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 10:34:36 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
> On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> >
> > Configuring GICv3 to deal with the lack of GIC in the guest relies
> > on not setting ICH_HCR_EL2.En in the shadow register, as this is
> > an indication of the fact that we want to trap all system registers
> > to report an UNDEF in the guest.
> >
> > Make sure we leave vgic_hcr untouched in this case.
> >
> > Reported-by: Mark Brown <broonie@kernel.org>
> > Tested-by: Mark Brown <broonie@kernel.org>
> > Closes: https://lore.kernel.org/r/72e1e8b5-e397-4dc5-9cd6-a32b6af3d739@sirena.org.uk
> > Fixes: 877324a1b5415 ("KVM: arm64: Revamp vgic maintenance interrupt configuration")
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> > index 598621b14a30d..1d6dd1b545bdd 100644
> > --- a/arch/arm64/kvm/vgic/vgic-v3.c
> > +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> > @@ -26,6 +26,9 @@ void vgic_v3_configure_hcr(struct kvm_vcpu *vcpu,
> > {
> > struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
> >
> > + if (!irqchip_in_kernel(vcpu->kvm))
> > + return;
> > +
>
> Bear with me, since I'm not too familiar with this code. This is the
> only function that initializes cpuif->vgic_hcr. Should we be
> explicitly setting it to 0, or warn if ICH_HCR_EL2_En is set?
It must be zero by construction, as the vcpu structure (which this is
part of) is always zeroed at allocation time, and the only way to
populate this field to have created an irqchip and start populating
the shadow structure, right after this test.
If we start doubting the state of bits of the vcpu structure that
should never be written to because of some particular configuration,
then we should address this globally, not piecemeal.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
2025-11-17 10:34 ` Fuad Tabba
@ 2025-11-17 11:29 ` Fuad Tabba
1 sibling, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:29 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
>
> Configuring GICv3 to deal with the lack of GIC in the guest relies
> on not setting ICH_HCR_EL2.En in the shadow register, as this is
> an indication of the fact that we want to trap all system registers
> to report an UNDEF in the guest.
>
> Make sure we leave vgic_hcr untouched in this case.
>
> Reported-by: Mark Brown <broonie@kernel.org>
> Tested-by: Mark Brown <broonie@kernel.org>
> Closes: https://lore.kernel.org/r/72e1e8b5-e397-4dc5-9cd6-a32b6af3d739@sirena.org.uk
> Fixes: 877324a1b5415 ("KVM: arm64: Revamp vgic maintenance interrupt configuration")
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
Reviewed-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> arch/arm64/kvm/vgic/vgic-v3.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
> index 598621b14a30d..1d6dd1b545bdd 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3.c
> @@ -26,6 +26,9 @@ void vgic_v3_configure_hcr(struct kvm_vcpu *vcpu,
> {
> struct vgic_v3_cpu_if *cpuif = &vcpu->arch.vgic_cpu.vgic_v3;
>
> + if (!irqchip_in_kernel(vcpu->kvm))
> + return;
> +
> cpuif->vgic_hcr = ICH_HCR_EL2_En;
>
> if (irqs_pending_outside_lrs(als))
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
2025-11-17 11:24 ` Fuad Tabba
@ 2025-11-17 11:34 ` Marc Zyngier
2025-11-17 11:37 ` Fuad Tabba
0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 11:34 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 11:24:24 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
>
> On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> >
> > The current approach to nested GICv3 support is to not do anything
> > while L2 is running, wait a transition from L2 to L1 to resync
> > LRs, VMCR and HCR, and only then evaluate the state to decide
> > whether to generate a maintenance interrupt.
> >
> > This doesn't provide a good quality of emulation, and it would be
> > far preferable to find out early that we need to perform a switch.
> >
> > Move the LRs/VMCR and HCR resync into vgic_v3_sync_nested(), so
> > that we have most of the state available. As we turning the vgic
> > off at this stage to avoid a screaming host MI, add a new helper
> > vgic_v3_flush_nested() that switches the vgic on again. The MI can
> > then be directly injected as required.
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/include/asm/kvm_hyp.h | 1 +
> > arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
> > arch/arm64/kvm/vgic/vgic-v3-nested.c | 69 ++++++++++++++++------------
> > arch/arm64/kvm/vgic/vgic.c | 6 ++-
> > arch/arm64/kvm/vgic/vgic.h | 1 +
> > 5 files changed, 46 insertions(+), 33 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index dbf16a9f67728..76ce2b94bd97e 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -77,6 +77,7 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> > int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
> >
> > u64 __gic_v3_get_lr(unsigned int lr);
> > +void __gic_v3_set_lr(u64 val, int lr);
> >
> > void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
> > void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
> > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > index 71199e1a92940..99342c13e1794 100644
> > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > @@ -60,7 +60,7 @@ u64 __gic_v3_get_lr(unsigned int lr)
> > unreachable();
> > }
> >
> > -static void __gic_v3_set_lr(u64 val, int lr)
> > +void __gic_v3_set_lr(u64 val, int lr)
> > {
> > switch (lr & 0xf) {
> > case 0:
> > diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > index 17bceef83269e..bf37fd3198ba7 100644
> > --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > @@ -70,13 +70,14 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
> > * - on L2 put: perform the inverse transformation, so that the result of L2
> > * running becomes visible to L1 in the VNCR-accessible registers.
> > *
> > - * - there is nothing to do on L2 entry, as everything will have happened
> > - * on load. However, this is the point where we detect that an interrupt
> > - * targeting L1 and prepare the grand switcheroo.
> > + * - there is nothing to do on L2 entry apart from enabling the vgic, as
> > + * everything will have happened on load. However, this is the point where
> > + * we detect that an interrupt targeting L1 and prepare the grand
> > + * switcheroo.
> > *
> > - * - on L2 exit: emulate the HW bit, and deactivate corresponding the L1
> > - * interrupt. The L0 active state will be cleared by the HW if the L1
> > - * interrupt was itself backed by a HW interrupt.
> > + * - on L2 exit: resync the LRs and VMCR, emulate the HW bit, and deactivate
> > + * corresponding the L1 interrupt. The L0 active state will be cleared by
> > + * the HW if the L1 interrupt was itself backed by a HW interrupt.
> > *
> > * Maintenance Interrupt (MI) management:
> > *
> > @@ -265,15 +266,30 @@ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu,
> > s_cpu_if->used_lrs = hweight16(shadow_if->lr_map);
> > }
> >
> > +void vgic_v3_flush_nested(struct kvm_vcpu *vcpu)
> > +{
> > + u64 val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
> > +
> > + write_sysreg_s(val | vgic_ich_hcr_trap_bits(), SYS_ICH_HCR_EL2);
> > +}
> > +
> > void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> > {
> > struct shadow_if *shadow_if = get_shadow_if();
> > int i;
> >
> > for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
> > - u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> > + u64 val, host_lr, lr;
> > struct vgic_irq *irq;
> >
> > + host_lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> > +
> > + /* Propagate the new LR state */
> > + lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> > + val = lr & ~ICH_LR_STATE;
> > + val |= host_lr & ICH_LR_STATE;
> > + __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
> > +
>
> As I said before, I am outside of my comfort zone here. However,
> should the following check be changed to use the merged 'val', rather
> than the guest lr as it was?
[...]
>
> > if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE))
> > continue;
No, this decision must be taken based on the *original* state, before
the L2 guest was run. If the LR was in an invalid state the first
place, there is nothing to do.
> >
> > @@ -286,12 +302,21 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> > if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */
> > continue;
> >
> > - lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> > - if (!(lr & ICH_LR_STATE))
> > + if (!(host_lr & ICH_LR_STATE))
> > irq->active = false;
And here, if we see that the *new* state (as fished out of the HW LRs)
is now invalid, this means that a deactivation has taken place in L2,
and we must propagate it to L1.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-17 9:15 ` [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En Marc Zyngier
@ 2025-11-17 11:35 ` Fuad Tabba
2025-11-17 11:42 ` Marc Zyngier
2025-11-18 7:16 ` Oliver Upton
1 sibling, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:35 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 09:22, Marc Zyngier <maz@kernel.org> wrote:
>
> FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
> effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
> interrupt is vain. This only hits memory, and the guest hasn't cleared
> anything -- the MI will fire.
>
> For example, running the vgic_irq test under NV results in about 800
> maintenance interrupts being actually handled by the L1 guest,
> when none were expected.
>
> As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
> ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
> the host in NV, giving it the opportunity to retire the pending
> MI. With this, the above test tuns to completion without any MI
> being actually handled.
nit: tuns->runs
>
> Yes, this is really poor...
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
> arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 99342c13e1794..f503cf01ac82c 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> }
>
> write_gicreg(0, ICH_HCR_EL2);
> +
> + /*
> + * Hack alert: On NV, this results in a trap so that the above
> + * write actually takes effect...
> + */
> + isb();
> + read_gicreg(ICH_MISR_EL2);
> }
nit: is it worth gating this with "ARM64_HAS_NESTED_VIRT"?
Otherwise,
Reviewed-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if)
> diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> index 40f7a37e0685c..d6797632157a0 100644
> --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> @@ -94,8 +94,10 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
> *
> * - because most of the ICH_*_EL2 registers live in the VNCR page, the
> * quality of emulation is poor: L1 can setup the vgic so that an MI would
> - * immediately fire, and not observe anything until the next exit. Trying
> - * to read ICH_MISR_EL2 would do the trick, for example.
> + * immediately fire, and not observe anything until the next exit.
> + * Similarly, a pending MI is not immediately disabled by clearing
> + * ICH_HCR_EL2.En. Trying to read ICH_MISR_EL2 would do the trick, for
> + * example.
> *
> * System register emulation:
> *
> --
> 2.47.3
>
>
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
2025-11-17 11:34 ` Marc Zyngier
@ 2025-11-17 11:37 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:37 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 at 11:34, Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 17 Nov 2025 11:24:24 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> >
> > On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > The current approach to nested GICv3 support is to not do anything
> > > while L2 is running, wait a transition from L2 to L1 to resync
> > > LRs, VMCR and HCR, and only then evaluate the state to decide
> > > whether to generate a maintenance interrupt.
> > >
> > > This doesn't provide a good quality of emulation, and it would be
> > > far preferable to find out early that we need to perform a switch.
> > >
> > > Move the LRs/VMCR and HCR resync into vgic_v3_sync_nested(), so
> > > that we have most of the state available. As we turning the vgic
> > > off at this stage to avoid a screaming host MI, add a new helper
> > > vgic_v3_flush_nested() that switches the vgic on again. The MI can
> > > then be directly injected as required.
> > >
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > > arch/arm64/include/asm/kvm_hyp.h | 1 +
> > > arch/arm64/kvm/hyp/vgic-v3-sr.c | 2 +-
> > > arch/arm64/kvm/vgic/vgic-v3-nested.c | 69 ++++++++++++++++------------
> > > arch/arm64/kvm/vgic/vgic.c | 6 ++-
> > > arch/arm64/kvm/vgic/vgic.h | 1 +
> > > 5 files changed, 46 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > > index dbf16a9f67728..76ce2b94bd97e 100644
> > > --- a/arch/arm64/include/asm/kvm_hyp.h
> > > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > > @@ -77,6 +77,7 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> > > int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
> > >
> > > u64 __gic_v3_get_lr(unsigned int lr);
> > > +void __gic_v3_set_lr(u64 val, int lr);
> > >
> > > void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
> > > void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
> > > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > index 71199e1a92940..99342c13e1794 100644
> > > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > @@ -60,7 +60,7 @@ u64 __gic_v3_get_lr(unsigned int lr)
> > > unreachable();
> > > }
> > >
> > > -static void __gic_v3_set_lr(u64 val, int lr)
> > > +void __gic_v3_set_lr(u64 val, int lr)
> > > {
> > > switch (lr & 0xf) {
> > > case 0:
> > > diff --git a/arch/arm64/kvm/vgic/vgic-v3-nested.c b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > > index 17bceef83269e..bf37fd3198ba7 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-v3-nested.c
> > > @@ -70,13 +70,14 @@ static int lr_map_idx_to_shadow_idx(struct shadow_if *shadow_if, int idx)
> > > * - on L2 put: perform the inverse transformation, so that the result of L2
> > > * running becomes visible to L1 in the VNCR-accessible registers.
> > > *
> > > - * - there is nothing to do on L2 entry, as everything will have happened
> > > - * on load. However, this is the point where we detect that an interrupt
> > > - * targeting L1 and prepare the grand switcheroo.
> > > + * - there is nothing to do on L2 entry apart from enabling the vgic, as
> > > + * everything will have happened on load. However, this is the point where
> > > + * we detect that an interrupt targeting L1 and prepare the grand
> > > + * switcheroo.
> > > *
> > > - * - on L2 exit: emulate the HW bit, and deactivate corresponding the L1
> > > - * interrupt. The L0 active state will be cleared by the HW if the L1
> > > - * interrupt was itself backed by a HW interrupt.
> > > + * - on L2 exit: resync the LRs and VMCR, emulate the HW bit, and deactivate
> > > + * corresponding the L1 interrupt. The L0 active state will be cleared by
> > > + * the HW if the L1 interrupt was itself backed by a HW interrupt.
> > > *
> > > * Maintenance Interrupt (MI) management:
> > > *
> > > @@ -265,15 +266,30 @@ static void vgic_v3_create_shadow_lr(struct kvm_vcpu *vcpu,
> > > s_cpu_if->used_lrs = hweight16(shadow_if->lr_map);
> > > }
> > >
> > > +void vgic_v3_flush_nested(struct kvm_vcpu *vcpu)
> > > +{
> > > + u64 val = __vcpu_sys_reg(vcpu, ICH_HCR_EL2);
> > > +
> > > + write_sysreg_s(val | vgic_ich_hcr_trap_bits(), SYS_ICH_HCR_EL2);
> > > +}
> > > +
> > > void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> > > {
> > > struct shadow_if *shadow_if = get_shadow_if();
> > > int i;
> > >
> > > for_each_set_bit(i, &shadow_if->lr_map, kvm_vgic_global_state.nr_lr) {
> > > - u64 lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> > > + u64 val, host_lr, lr;
> > > struct vgic_irq *irq;
> > >
> > > + host_lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> > > +
> > > + /* Propagate the new LR state */
> > > + lr = __vcpu_sys_reg(vcpu, ICH_LRN(i));
> > > + val = lr & ~ICH_LR_STATE;
> > > + val |= host_lr & ICH_LR_STATE;
> > > + __vcpu_assign_sys_reg(vcpu, ICH_LRN(i), val);
> > > +
> >
> > As I said before, I am outside of my comfort zone here. However,
> > should the following check be changed to use the merged 'val', rather
> > than the guest lr as it was?
>
> [...]
>
> >
> > > if (!(lr & ICH_LR_HW) || !(lr & ICH_LR_STATE))
> > > continue;
>
> No, this decision must be taken based on the *original* state, before
> the L2 guest was run. If the LR was in an invalid state the first
> place, there is nothing to do.
>
> > >
> > > @@ -286,12 +302,21 @@ void vgic_v3_sync_nested(struct kvm_vcpu *vcpu)
> > > if (WARN_ON(!irq)) /* Shouldn't happen as we check on load */
> > > continue;
> > >
> > > - lr = __gic_v3_get_lr(lr_map_idx_to_shadow_idx(shadow_if, i));
> > > - if (!(lr & ICH_LR_STATE))
> > > + if (!(host_lr & ICH_LR_STATE))
> > > irq->active = false;
>
> And here, if we see that the *new* state (as fished out of the HW LRs)
> is now invalid, this means that a deactivation has taken place in L2,
> and we must propagate it to L1.
Thanks for the clarification.
Reviewed-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> Thanks,
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-17 11:35 ` Fuad Tabba
@ 2025-11-17 11:42 ` Marc Zyngier
2025-11-17 11:48 ` Fuad Tabba
0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-17 11:42 UTC (permalink / raw)
To: Fuad Tabba
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 11:35:18 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
> On Mon, 17 Nov 2025 at 09:22, Marc Zyngier <maz@kernel.org> wrote:
> >
> > FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
> > effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
> > interrupt is vain. This only hits memory, and the guest hasn't cleared
> > anything -- the MI will fire.
> >
> > For example, running the vgic_irq test under NV results in about 800
> > maintenance interrupts being actually handled by the L1 guest,
> > when none were expected.
> >
> > As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
> > ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
> > the host in NV, giving it the opportunity to retire the pending
> > MI. With this, the above test tuns to completion without any MI
> > being actually handled.
>
> nit: tuns->runs
>
>
> >
> > Yes, this is really poor...
> >
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
> > arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
> > 2 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > index 99342c13e1794..f503cf01ac82c 100644
> > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > @@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> > }
> >
> > write_gicreg(0, ICH_HCR_EL2);
> > +
> > + /*
> > + * Hack alert: On NV, this results in a trap so that the above
> > + * write actually takes effect...
> > + */
> > + isb();
> > + read_gicreg(ICH_MISR_EL2);
> > }
>
> nit: is it worth gating this with "ARM64_HAS_NESTED_VIRT"?
This is in a *guest*, which knows nothing about being virtualised!
> Otherwise,
> Reviewed-by: Fuad Tabba <tabba@google.com>
Thanks!
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-17 11:42 ` Marc Zyngier
@ 2025-11-17 11:48 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 11:48 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
On Mon, 17 Nov 2025 at 11:42, Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 17 Nov 2025 11:35:18 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> > On Mon, 17 Nov 2025 at 09:22, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
> > > effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
> > > interrupt is vain. This only hits memory, and the guest hasn't cleared
> > > anything -- the MI will fire.
> > >
> > > For example, running the vgic_irq test under NV results in about 800
> > > maintenance interrupts being actually handled by the L1 guest,
> > > when none were expected.
> > >
> > > As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
> > > ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
> > > the host in NV, giving it the opportunity to retire the pending
> > > MI. With this, the above test tuns to completion without any MI
> > > being actually handled.
> >
> > nit: tuns->runs
> >
> >
> > >
> > > Yes, this is really poor...
> > >
> > > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > > ---
> > > arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
> > > arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
> > > 2 files changed, 11 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > index 99342c13e1794..f503cf01ac82c 100644
> > > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > > @@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> > > }
> > >
> > > write_gicreg(0, ICH_HCR_EL2);
> > > +
> > > + /*
> > > + * Hack alert: On NV, this results in a trap so that the above
> > > + * write actually takes effect...
> > > + */
> > > + isb();
> > > + read_gicreg(ICH_MISR_EL2);
> > > }
> >
> > nit: is it worth gating this with "ARM64_HAS_NESTED_VIRT"?
>
> This is in a *guest*, which knows nothing about being virtualised!
Nested makes my head hurt :D
Cheers,
/fuad
>
> > Otherwise,
> > Reviewed-by: Fuad Tabba <tabba@google.com>
>
> Thanks!
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 10:18 ` Fuad Tabba
@ 2025-11-17 12:54 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-17 12:54 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Oliver Upton, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Mon, 17 Nov 2025 at 10:18, Fuad Tabba <tabba@google.com> wrote:
>
> Hi Marc,
>
> On Mon, 17 Nov 2025 at 09:54, Marc Zyngier <maz@kernel.org> wrote:
> >
> > Hi Fuad,
> >
> > On Mon, 17 Nov 2025 09:40:47 +0000,
> > Fuad Tabba <tabba@google.com> wrote:
> > >
> > > Hi Marc,
> > >
> > > On Mon, 17 Nov 2025 at 09:15, Marc Zyngier <maz@kernel.org> wrote:
> > > >
> > > > This is a follow-up to the original series [1] (and fixes [2][3])
> > > > with a bunch of bug-fixes and improvements. At least one patch has
> > > > already been posted, but I thought I might repost it as part of a
> > > > series, since I accumulated more stuff:
> > >
> > > I'd like to test this series as well. Do you have it applied in one of
> > > your branches at
> > > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
> > > , or which commit is it based on?
> >
> > I just pushed a new branch
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow-fixes
> >
> > that is based on -rc5, kvmarm/next, kvmarm-fixes-6.18-rc3 plus these
> > patches. Let me know how this fares for you.
>
> Great! I've applied the pKVM patches on top of it. So far so good.
> I'll test this series more thoroughly and review it as well. Stay tuned...
For this series, and the series that it sits on top of (i.e.,
https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow-fixes):
Tested-by: Fuad Tabba <tabba@google.com>
On QEMU, nVHE, hVHE protected mode, and protected VMs (with the latest
Android pKVM patches applied on top).
Thanks,
/fuad
> Cheers,
> /fuad
>
> >
> > Thanks,
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-17 9:15 ` [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En Marc Zyngier
2025-11-17 11:35 ` Fuad Tabba
@ 2025-11-18 7:16 ` Oliver Upton
2025-11-18 8:54 ` Marc Zyngier
1 sibling, 1 reply; 28+ messages in thread
From: Oliver Upton @ 2025-11-18 7:16 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Christoffer Dall, Fuad Tabba, Mark Brown
Hey Marc,
On Mon, Nov 17, 2025 at 09:15:27AM +0000, Marc Zyngier wrote:
> FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
> effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
> interrupt is vain. This only hits memory, and the guest hasn't cleared
> anything -- the MI will fire.
>
> For example, running the vgic_irq test under NV results in about 800
> maintenance interrupts being actually handled by the L1 guest,
> when none were expected.
>
> As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
> ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
> the host in NV, giving it the opportunity to retire the pending
> MI. With this, the above test tuns to completion without any MI
> being actually handled.
Just to make sure I'm following, the scenario you're talking about is
we've already put the vgic into a non-nested state, populated an LR with
the pending MI at the time of that switch and L0 has no signal for when
it can drop the LR / pending state.
> Yes, this is really poor...
+1 :-/
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
> arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> index 99342c13e1794..f503cf01ac82c 100644
> --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> @@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> }
>
> write_gicreg(0, ICH_HCR_EL2);
> +
> + /*
> + * Hack alert: On NV, this results in a trap so that the above
> + * write actually takes effect...
> + */
> + isb();
> + read_gicreg(ICH_MISR_EL2);
I'm all for writing correct code but since we don't actually care about
the value of MISR_EL2 do we need the ISB? There's no need for a CSE for
non-NV and you'd get it 'for free' by way of exception entry in the NV
case.
Thanks,
Oliver
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
` (5 preceding siblings ...)
2025-11-17 9:40 ` [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Fuad Tabba
@ 2025-11-18 7:20 ` Oliver Upton
2025-11-18 13:59 ` Fuad Tabba
2025-11-18 23:34 ` Oliver Upton
7 siblings, 1 reply; 28+ messages in thread
From: Oliver Upton @ 2025-11-18 7:20 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Christoffer Dall, Fuad Tabba, Mark Brown
On Mon, Nov 17, 2025 at 09:15:22AM +0000, Marc Zyngier wrote:
> This is a follow-up to the original series [1] (and fixes [2][3])
> with a bunch of bug-fixes and improvements. At least one patch has
> already been posted, but I thought I might repost it as part of a
> series, since I accumulated more stuff:
>
> - The first patch addresses Mark's observation that the no-vgic-v3
> test has been broken once more. At some point, we'll have to retire
> that functionality, because even if we keep fixing the SR handling,
> nobody tests the actual interrupt state exposure to userspace, which
> I'm pretty sure has badly been broken for at least 5 years.
>
> - The second one addresses a report from Fuad that on QEMU,
> ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
> the host exploding on deactivating an interrupt. This behaviour is
> allowed by the spec, so make sure we clear all trap bits
>
> - Running vgic_irq in an L1 guest (the test being an L2) results in a
> MI storm on the host, as the state synchronisation is done at the
> wrong place, much like it was on the non-NV path before it was
> reworked. Apply the same methods to the NV code, and enjoy much
> better MI emulation, now tested all the way into an L3.
>
> - Nuke a small leftover from previous rework.
>
> - Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
> the trap prevents too many spurious MIs in an L1 guest, as the write
> to ICH_HCR_EL2 does exactly nothing on its own when running under
> FEAT_NV2.
>
> Oliver: this is starting to be a large series of fixes on top of the
> existing series, plus the two patches you have already added. I'd be
> happy to respin a full v4 with the fixes squashed into their original
> patches. On the other hand, if you want to see the history in its full
> glory, that also works for me.
I'll pick up these patches in a moment but at this point I'd prefer a
clean history. Plan is to send out the 6.19 pull sometime next week so
any time before then would be great for v4.
Thanks for ironing out all the quirks :)
Best,
Oliver
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
2025-11-18 7:16 ` Oliver Upton
@ 2025-11-18 8:54 ` Marc Zyngier
0 siblings, 0 replies; 28+ messages in thread
From: Marc Zyngier @ 2025-11-18 8:54 UTC (permalink / raw)
To: Oliver Upton
Cc: kvmarm, linux-arm-kernel, kvm, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Christoffer Dall, Fuad Tabba, Mark Brown
On Tue, 18 Nov 2025 07:16:49 +0000,
Oliver Upton <oupton@kernel.org> wrote:
>
> Hey Marc,
>
> On Mon, Nov 17, 2025 at 09:15:27AM +0000, Marc Zyngier wrote:
> > FEAT_NV2 is pretty terrible for anything that tries to enforce immediate
> > effects, and writing to ICH_HCR_EL2 in the hope to disable a maintenance
> > interrupt is vain. This only hits memory, and the guest hasn't cleared
> > anything -- the MI will fire.
> >
> > For example, running the vgic_irq test under NV results in about 800
> > maintenance interrupts being actually handled by the L1 guest,
> > when none were expected.
> >
> > As a cheap workaround, read back ICH_MISR_EL2 after writing 0 to
> > ICH_HCR_EL2. This is very cheap on real HW, and causes a trap to
> > the host in NV, giving it the opportunity to retire the pending
> > MI. With this, the above test tuns to completion without any MI
> > being actually handled.
>
> Just to make sure I'm following, the scenario you're talking about is
> we've already put the vgic into a non-nested state, populated an LR with
> the pending MI at the time of that switch and L0 has no signal for when
> it can drop the LR / pending state.
Exactly. Only an exit can cause it to reevaluate the state and retire
the pending MI.
>
> > Yes, this is really poor...
>
> +1 :-/
>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > arch/arm64/kvm/hyp/vgic-v3-sr.c | 7 +++++++
> > arch/arm64/kvm/vgic/vgic-v3-nested.c | 6 ++++--
> > 2 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > index 99342c13e1794..f503cf01ac82c 100644
> > --- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > +++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
> > @@ -244,6 +244,13 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if)
> > }
> >
> > write_gicreg(0, ICH_HCR_EL2);
> > +
> > + /*
> > + * Hack alert: On NV, this results in a trap so that the above
> > + * write actually takes effect...
> > + */
> > + isb();
> > + read_gicreg(ICH_MISR_EL2);
>
> I'm all for writing correct code but since we don't actually care about
> the value of MISR_EL2 do we need the ISB? There's no need for a CSE for
> non-NV and you'd get it 'for free' by way of exception entry in the NV
> case.
Yup, that's a good point. And exceptions are of course guaranteed to
be in program order anyway, so the ISB can go. I'll add comment to
that effect.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-18 7:20 ` Oliver Upton
@ 2025-11-18 13:59 ` Fuad Tabba
2025-11-18 19:06 ` Marc Zyngier
0 siblings, 1 reply; 28+ messages in thread
From: Fuad Tabba @ 2025-11-18 13:59 UTC (permalink / raw)
To: Oliver Upton
Cc: Marc Zyngier, kvmarm, linux-arm-kernel, kvm, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Christoffer Dall, Mark Brown
On Tue, 18 Nov 2025 at 07:20, Oliver Upton <oupton@kernel.org> wrote:
>
> On Mon, Nov 17, 2025 at 09:15:22AM +0000, Marc Zyngier wrote:
> > This is a follow-up to the original series [1] (and fixes [2][3])
> > with a bunch of bug-fixes and improvements. At least one patch has
> > already been posted, but I thought I might repost it as part of a
> > series, since I accumulated more stuff:
> >
> > - The first patch addresses Mark's observation that the no-vgic-v3
> > test has been broken once more. At some point, we'll have to retire
> > that functionality, because even if we keep fixing the SR handling,
> > nobody tests the actual interrupt state exposure to userspace, which
> > I'm pretty sure has badly been broken for at least 5 years.
> >
> > - The second one addresses a report from Fuad that on QEMU,
> > ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
> > the host exploding on deactivating an interrupt. This behaviour is
> > allowed by the spec, so make sure we clear all trap bits
> >
> > - Running vgic_irq in an L1 guest (the test being an L2) results in a
> > MI storm on the host, as the state synchronisation is done at the
> > wrong place, much like it was on the non-NV path before it was
> > reworked. Apply the same methods to the NV code, and enjoy much
> > better MI emulation, now tested all the way into an L3.
> >
> > - Nuke a small leftover from previous rework.
> >
> > - Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
> > the trap prevents too many spurious MIs in an L1 guest, as the write
> > to ICH_HCR_EL2 does exactly nothing on its own when running under
> > FEAT_NV2.
> >
> > Oliver: this is starting to be a large series of fixes on top of the
> > existing series, plus the two patches you have already added. I'd be
> > happy to respin a full v4 with the fixes squashed into their original
> > patches. On the other hand, if you want to see the history in its full
> > glory, that also works for me.
>
> I'll pick up these patches in a moment but at this point I'd prefer a
> clean history. Plan is to send out the 6.19 pull sometime next week so
> any time before then would be great for v4.
I'm happy to take that for another spin Marc before you send it, if
it's different from the ToT I tested. In that case, just send me a
pointer to the branch.
Cheers,
/fuad
> Thanks for ironing out all the quirks :)
>
> Best,
> Oliver
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-18 13:59 ` Fuad Tabba
@ 2025-11-18 19:06 ` Marc Zyngier
2025-11-19 10:37 ` Fuad Tabba
0 siblings, 1 reply; 28+ messages in thread
From: Marc Zyngier @ 2025-11-18 19:06 UTC (permalink / raw)
To: Fuad Tabba
Cc: Oliver Upton, kvmarm, linux-arm-kernel, kvm, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Fuad,
On Tue, 18 Nov 2025 13:59:14 +0000,
Fuad Tabba <tabba@google.com> wrote:
>
> On Tue, 18 Nov 2025 at 07:20, Oliver Upton <oupton@kernel.org> wrote:
> >
> > On Mon, Nov 17, 2025 at 09:15:22AM +0000, Marc Zyngier wrote:
> > > This is a follow-up to the original series [1] (and fixes [2][3])
> > > with a bunch of bug-fixes and improvements. At least one patch has
> > > already been posted, but I thought I might repost it as part of a
> > > series, since I accumulated more stuff:
> > >
> > > - The first patch addresses Mark's observation that the no-vgic-v3
> > > test has been broken once more. At some point, we'll have to retire
> > > that functionality, because even if we keep fixing the SR handling,
> > > nobody tests the actual interrupt state exposure to userspace, which
> > > I'm pretty sure has badly been broken for at least 5 years.
> > >
> > > - The second one addresses a report from Fuad that on QEMU,
> > > ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
> > > the host exploding on deactivating an interrupt. This behaviour is
> > > allowed by the spec, so make sure we clear all trap bits
> > >
> > > - Running vgic_irq in an L1 guest (the test being an L2) results in a
> > > MI storm on the host, as the state synchronisation is done at the
> > > wrong place, much like it was on the non-NV path before it was
> > > reworked. Apply the same methods to the NV code, and enjoy much
> > > better MI emulation, now tested all the way into an L3.
> > >
> > > - Nuke a small leftover from previous rework.
> > >
> > > - Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
> > > the trap prevents too many spurious MIs in an L1 guest, as the write
> > > to ICH_HCR_EL2 does exactly nothing on its own when running under
> > > FEAT_NV2.
> > >
> > > Oliver: this is starting to be a large series of fixes on top of the
> > > existing series, plus the two patches you have already added. I'd be
> > > happy to respin a full v4 with the fixes squashed into their original
> > > patches. On the other hand, if you want to see the history in its full
> > > glory, that also works for me.
> >
> > I'll pick up these patches in a moment but at this point I'd prefer a
> > clean history. Plan is to send out the 6.19 pull sometime next week so
> > any time before then would be great for v4.
>
> I'm happy to take that for another spin Marc before you send it, if
> it's different from the ToT I tested. In that case, just send me a
> pointer to the branch.
I've just pushed out a full branch at [1]. Please make sure to merge
kvmarm-fixes-6.18-3 in, as it fixes a couple of nasties (small
conflict expected, but the resolution should be obvious).
For my own testing, I added -rc6 on top.
Note that I didn't take your Tested-by: tags, as you are about to
retest the whole thing anyway. If all goes well (fingers crossed),
Oliver will be able to apply any further tag once I post these
patches.
Thanks,
M.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
` (6 preceding siblings ...)
2025-11-18 7:20 ` Oliver Upton
@ 2025-11-18 23:34 ` Oliver Upton
7 siblings, 0 replies; 28+ messages in thread
From: Oliver Upton @ 2025-11-18 23:34 UTC (permalink / raw)
To: kvmarm, linux-arm-kernel, kvm, Marc Zyngier
Cc: Oliver Upton, Joey Gouly, Suzuki K Poulose, Zenghui Yu,
Christoffer Dall, Fuad Tabba, Mark Brown
On Mon, 17 Nov 2025 09:15:22 +0000, Marc Zyngier wrote:
> This is a follow-up to the original series [1] (and fixes [2][3])
> with a bunch of bug-fixes and improvements. At least one patch has
> already been posted, but I thought I might repost it as part of a
> series, since I accumulated more stuff:
>
> - The first patch addresses Mark's observation that the no-vgic-v3
> test has been broken once more. At some point, we'll have to retire
> that functionality, because even if we keep fixing the SR handling,
> nobody tests the actual interrupt state exposure to userspace, which
> I'm pretty sure has badly been broken for at least 5 years.
>
> [...]
Applied to next, thanks!
[1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured
https://git.kernel.org/kvmarm/kvmarm/c/3c14fb1b1c88
[2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit
https://git.kernel.org/kvmarm/kvmarm/c/8be00d1ba3a1
[3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
https://git.kernel.org/kvmarm/kvmarm/c/34586ff89152
[4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers
https://git.kernel.org/kvmarm/kvmarm/c/22c299785240
[5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
https://git.kernel.org/kvmarm/kvmarm/c/54cf1341324a
--
Best,
Oliver
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly)
2025-11-18 19:06 ` Marc Zyngier
@ 2025-11-19 10:37 ` Fuad Tabba
0 siblings, 0 replies; 28+ messages in thread
From: Fuad Tabba @ 2025-11-19 10:37 UTC (permalink / raw)
To: Marc Zyngier
Cc: Oliver Upton, kvmarm, linux-arm-kernel, kvm, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Christoffer Dall, Mark Brown
Hi Marc,
On Tue, 18 Nov 2025 at 19:06, Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Fuad,
>
> On Tue, 18 Nov 2025 13:59:14 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > On Tue, 18 Nov 2025 at 07:20, Oliver Upton <oupton@kernel.org> wrote:
> > >
> > > On Mon, Nov 17, 2025 at 09:15:22AM +0000, Marc Zyngier wrote:
> > > > This is a follow-up to the original series [1] (and fixes [2][3])
> > > > with a bunch of bug-fixes and improvements. At least one patch has
> > > > already been posted, but I thought I might repost it as part of a
> > > > series, since I accumulated more stuff:
> > > >
> > > > - The first patch addresses Mark's observation that the no-vgic-v3
> > > > test has been broken once more. At some point, we'll have to retire
> > > > that functionality, because even if we keep fixing the SR handling,
> > > > nobody tests the actual interrupt state exposure to userspace, which
> > > > I'm pretty sure has badly been broken for at least 5 years.
> > > >
> > > > - The second one addresses a report from Fuad that on QEMU,
> > > > ICH_HCR_EL2.TDIR traps ICC_DIR_EL1 on top of ICV_DIR_EL1, leading to
> > > > the host exploding on deactivating an interrupt. This behaviour is
> > > > allowed by the spec, so make sure we clear all trap bits
> > > >
> > > > - Running vgic_irq in an L1 guest (the test being an L2) results in a
> > > > MI storm on the host, as the state synchronisation is done at the
> > > > wrong place, much like it was on the non-NV path before it was
> > > > reworked. Apply the same methods to the NV code, and enjoy much
> > > > better MI emulation, now tested all the way into an L3.
> > > >
> > > > - Nuke a small leftover from previous rework.
> > > >
> > > > - Force a read-back of ICH_MISR_EL2 when disabling the vgic, so that
> > > > the trap prevents too many spurious MIs in an L1 guest, as the write
> > > > to ICH_HCR_EL2 does exactly nothing on its own when running under
> > > > FEAT_NV2.
> > > >
> > > > Oliver: this is starting to be a large series of fixes on top of the
> > > > existing series, plus the two patches you have already added. I'd be
> > > > happy to respin a full v4 with the fixes squashed into their original
> > > > patches. On the other hand, if you want to see the history in its full
> > > > glory, that also works for me.
> > >
> > > I'll pick up these patches in a moment but at this point I'd prefer a
> > > clean history. Plan is to send out the 6.19 pull sometime next week so
> > > any time before then would be great for v4.
> >
> > I'm happy to take that for another spin Marc before you send it, if
> > it's different from the ToT I tested. In that case, just send me a
> > pointer to the branch.
>
> I've just pushed out a full branch at [1]. Please make sure to merge
> kvmarm-fixes-6.18-3 in, as it fixes a couple of nasties (small
> conflict expected, but the resolution should be obvious).
For this branch [1]:
Tested-by: Fuad Tabba <tabba@google.com>
On QEMU, nVHE, hVHE protected mode (non-protected VMs with and without
the Android pKVM patches), and protected VMs (with the Android pKVM
patches).
Cheers,
/fuad
> For my own testing, I added -rc6 on top.
>
> Note that I didn't take your Tested-by: tags, as you are about to
> retest the whole thing anyway. If all goes well (fingers crossed),
> Oliver will be able to apply any further tag once I post these
> patches.
>
> Thanks,
>
> M.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=kvm-arm64/vgic-lr-overflow
>
> --
> Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2025-11-19 10:38 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-17 9:15 [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Marc Zyngier
2025-11-17 9:15 ` [PATCH v3 1/5] KVM: arm64: GICv3: Don't advertise ICH_HCR_EL2.En==1 when no vgic is configured Marc Zyngier
2025-11-17 10:34 ` Fuad Tabba
2025-11-17 11:28 ` Marc Zyngier
2025-11-17 11:29 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 2/5] KVM: arm64: GICv3: Completely disable trapping on vcpu exit Marc Zyngier
2025-11-17 10:36 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 3/5] KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation Marc Zyngier
2025-11-17 11:24 ` Fuad Tabba
2025-11-17 11:34 ` Marc Zyngier
2025-11-17 11:37 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 4/5] KVM: arm64: GICv3: Remove vgic_hcr workaround handling leftovers Marc Zyngier
2025-11-17 11:25 ` Fuad Tabba
2025-11-17 9:15 ` [PATCH v3 5/5] KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En Marc Zyngier
2025-11-17 11:35 ` Fuad Tabba
2025-11-17 11:42 ` Marc Zyngier
2025-11-17 11:48 ` Fuad Tabba
2025-11-18 7:16 ` Oliver Upton
2025-11-18 8:54 ` Marc Zyngier
2025-11-17 9:40 ` [PATCH v3 0/5] KVM: arm64: Add LR overflow infrastructure (the dregs, the bad and the ugly) Fuad Tabba
2025-11-17 9:54 ` Marc Zyngier
2025-11-17 10:18 ` Fuad Tabba
2025-11-17 12:54 ` Fuad Tabba
2025-11-18 7:20 ` Oliver Upton
2025-11-18 13:59 ` Fuad Tabba
2025-11-18 19:06 ` Marc Zyngier
2025-11-19 10:37 ` Fuad Tabba
2025-11-18 23:34 ` Oliver Upton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).