Linux kernel -stable discussions
 help / color / mirror / Atom feed
* [PATCH v2] KVM: arm64: Preserve all guest ZCR_EL2.LEN values
@ 2026-05-28 23:01 Mark Brown
  2026-05-29  9:22 ` Marc Zyngier
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Brown @ 2026-05-28 23:01 UTC (permalink / raw)
  To: Marc Zyngier, Oliver Upton, Joey Gouly, Steffen Eiden,
	Suzuki K Poulose, Catalin Marinas, Will Deacon
  Cc: Mark Rutland, linux-arm-kernel, kvmarm, linux-kernel, Mark Brown,
	stable

Since commit b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps") when
guests write to ZCR_EL2 we have clamped the value of ZCR_EL2.LEN to be
at most that configuring the maximum guest VL when accessed directly as
ZCR_EL2. This is not clearly the behaviour the architecture documents
for ZCR_EL2.LEN, while things are a little ambiguous currently there is
a fairly direct reading that suggests values will be read as written.
Further, the documented procedure for enumerating vector lengths means
that it is expected that values larger than the largest supported vector
length will be written in practice.

The reasoning for the current behaviour is not specifically articulated, my
best guess is that it is intended to ensure that the guest can not see an
effective VL greater than the maximum that has been configured, though
this will be ineffective when a VHE guest uses the ZCR_EL1 accessor.
The VL can instead be constrained by configuring ZCR_EL2 when loading
L2 guest state:

 - When the L2 guest is running in EL1 or EL0 state configure
   ZCR_EL2.LEN to the minimum of the guest ZCR_EL2.LEN and
   vcpu_sve_max_vq(vcpu)-1.
 - When the L2 guest is running at EL2 configure the maximum VL for
   the guest in ZCR_EL2.LEN like we do for L1 guests and load the
   guest ZCR_EL2 into ZCR_EL1.

This will ensure that the effective VL seen by the guest is always
constrained by all of the maximum VL configured by the VMM and the
ZCR_ELx values in effect.

With the above implemented we can simplify the emulation for trapped
writes to ZCR_EL2 to store LEN configured by the guest directly,
matching the handling for ZCR_EL1. We still need to sanitise the values
written to ensure no unsupported fields are written, do this by adding
the register to our generic sanitisation infrastructure. This register
is a bit unusual in that as an unnamed field at bits 8:4 which is
RAZ/WI, for the purposes of sanitisation treat these bits as though they
were RES0.

Fixes: b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
Changes in v2:
- Use generic santisation infrastructure.
- Remove redundant !is_hyp_ctxt() check.
- Also fix ZCR_EL2 configuration in __hyp_sve_restore_guest().
- Commit message clarifications.
- Link to v1: https://patch.msgid.link/20260522-kvm-arm64-fix-zcr-len-nv-v1-1-ec254e9078cf@kernel.org
---
 arch/arm64/include/asm/kvm_host.h       |  2 +-
 arch/arm64/kvm/hyp/include/hyp/switch.h | 16 ++++++++++------
 arch/arm64/kvm/nested.c                 |  5 +++++
 arch/arm64/kvm/sys_regs.c               |  6 +-----
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65eead8362e0..a49042bfa801 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -511,7 +511,6 @@ enum vcpu_sysreg {
 	ACTLR_EL2,	/* Auxiliary Control Register (EL2) */
 	CPTR_EL2,	/* Architectural Feature Trap Register (EL2) */
 	HACR_EL2,	/* Hypervisor Auxiliary Control Register */
-	ZCR_EL2,	/* SVE Control Register (EL2) */
 	TTBR0_EL2,	/* Translation Table Base Register 0 (EL2) */
 	TTBR1_EL2,	/* Translation Table Base Register 1 (EL2) */
 	TCR_EL2,	/* Translation Control Register (EL2) */
@@ -543,6 +542,7 @@ enum vcpu_sysreg {
 	SCTLR2_EL2,	/* System Control Register 2 (EL2) */
 	MDCR_EL2,	/* Monitor Debug Configuration Register (EL2) */
 	CNTHCTL_EL2,	/* Counter-timer Hypervisor Control register */
+	ZCR_EL2,	/* SVE Control Register (EL2) */
 
 	/* Any VNCR-capable reg goes after this point */
 	MARKER(__VNCR_START__),
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index bf0eb5e43427..320cd45d49c5 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -462,11 +462,13 @@ static inline bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)
 
 static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 {
+	u64 zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
+
 	/*
 	 * The vCPU's saved SVE state layout always matches the max VL of the
 	 * vCPU. Start off with the max VL so we can load the SVE state.
 	 */
-	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
+	sve_cond_update_zcr_vq(zcr_el2, SYS_ZCR_EL2);
 	__sve_restore_state(vcpu_sve_pffr(vcpu),
 			    &vcpu->arch.ctxt.fp_regs.fpsr,
 			    true);
@@ -476,8 +478,10 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	 * nested guest, as the guest hypervisor could select a smaller VL. Slap
 	 * that into hardware before wrapping up.
 	 */
-	if (is_nested_ctxt(vcpu))
-		sve_cond_update_zcr_vq(__vcpu_sys_reg(vcpu, ZCR_EL2), SYS_ZCR_EL2);
+	if (is_nested_ctxt(vcpu)) {
+		zcr_el2 = min(zcr_el2, __vcpu_sys_reg(vcpu, ZCR_EL2));
+		sve_cond_update_zcr_vq(zcr_el2, SYS_ZCR_EL2);
+	}
 
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, vcpu_sve_zcr_elx(vcpu)), SYS_ZCR);
 }
@@ -501,11 +505,11 @@ static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
 		return;
 
 	if (vcpu_has_sve(vcpu)) {
+		zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
+
 		/* A guest hypervisor may restrict the effective max VL. */
 		if (is_nested_ctxt(vcpu))
-			zcr_el2 = __vcpu_sys_reg(vcpu, ZCR_EL2);
-		else
-			zcr_el2 = vcpu_sve_max_vq(vcpu) - 1;
+			zcr_el2 = min(zcr_el2, __vcpu_sys_reg(vcpu, ZCR_EL2));
 
 		write_sysreg_el2(zcr_el2, SYS_ZCR);
 
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 883b6c1008fb..38f672e94087 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1834,6 +1834,11 @@ int kvm_init_nv_sysregs(struct kvm_vcpu *vcpu)
 	resx.res1 = VNCR_EL2_RES1;
 	set_sysreg_masks(kvm, VNCR_EL2, resx);
 
+	/* ZCR_EL2 - bits 8:4 are RAZ/WI so treat them as RES0 */
+	resx.res0 = ZCR_ELx_RES0 | GENMASK_ULL(8, 4);
+	resx.res1 = ZCR_ELx_RES1;
+	set_sysreg_masks(kvm, ZCR_EL2, resx);
+
 out:
 	for (enum vcpu_sysreg sr = __SANITISED_REG_START__; sr < NR_SYS_REGS; sr++)
 		__vcpu_rmw_sys_reg(vcpu, sr, |=, 0);
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 148fc3400ea8..77e1b5d223c6 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2862,8 +2862,6 @@ static bool access_zcr_el2(struct kvm_vcpu *vcpu,
 			   struct sys_reg_params *p,
 			   const struct sys_reg_desc *r)
 {
-	unsigned int vq;
-
 	if (guest_hyp_sve_traps_enabled(vcpu)) {
 		kvm_inject_nested_sve_trap(vcpu);
 		return false;
@@ -2874,9 +2872,7 @@ static bool access_zcr_el2(struct kvm_vcpu *vcpu,
 		return true;
 	}
 
-	vq = SYS_FIELD_GET(ZCR_ELx, LEN, p->regval) + 1;
-	vq = min(vq, vcpu_sve_max_vq(vcpu));
-	__vcpu_assign_sys_reg(vcpu, ZCR_EL2, vq - 1);
+	__vcpu_assign_sys_reg(vcpu, ZCR_EL2, p->regval);
 	return true;
 }
 

---
base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
change-id: 20260519-kvm-arm64-fix-zcr-len-nv-9e9e7bae012a

Best regards,
--  
Mark Brown <broonie@kernel.org>


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] KVM: arm64: Preserve all guest ZCR_EL2.LEN values
  2026-05-28 23:01 [PATCH v2] KVM: arm64: Preserve all guest ZCR_EL2.LEN values Mark Brown
@ 2026-05-29  9:22 ` Marc Zyngier
  2026-05-29  9:46   ` Joey Gouly
  0 siblings, 1 reply; 3+ messages in thread
From: Marc Zyngier @ 2026-05-29  9:22 UTC (permalink / raw)
  To: Mark Brown
  Cc: Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	kvmarm, linux-kernel, stable

Thanks for respinning this patch quickly.

On Fri, 29 May 2026 00:01:44 +0100,
Mark Brown <broonie@kernel.org> wrote:
> 
> Since commit b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps") when
> guests write to ZCR_EL2 we have clamped the value of ZCR_EL2.LEN to be
> at most that configuring the maximum guest VL when accessed directly as
> ZCR_EL2. This is not clearly the behaviour the architecture documents
> for ZCR_EL2.LEN, while things are a little ambiguous currently there is
> a fairly direct reading that suggests values will be read as written.
> Further, the documented procedure for enumerating vector lengths means
> that it is expected that values larger than the largest supported vector
> length will be written in practice.

Honestly, that's not the core issue. And even $SUBJECT fails to
capture what is at stake here.

> 
> The reasoning for the current behaviour is not specifically articulated, my

I don't think there is a reasoning behind each and every bug.

> best guess is that it is intended to ensure that the guest can not see an
> effective VL greater than the maximum that has been configured, though
> this will be ineffective when a VHE guest uses the ZCR_EL1 accessor.

This last point *IS* the core problem. It is that the guest can access
VLs beyond what is intended by the VM configuration. Not getting the
read-as-written behaviour really is secondary compared to that issue.

[...]

I've rewritten the commit message to make it plain what the problem
is, see below. I've also slightly tidied up access_zcr_el2(), but the
fix otherwise looks good.

Thanks,

	M.

KVM: arm64: Correctly cap ZCR_EL2 provided by a guest hypervisor

ZCR_EL2 can be updated by a VHE guest hypervisor either using ZCR_EL2
(which traps) or ZCR_EL1 (which does not trap). KVM handles both in
different way:

- on ZCR_EL2 trap, ZCR_EL2.LEN is immediately capped at the VM's own
  VL limit. This has the potential to break existing SW that relies
  on the full LEN field to be stateful.

- on ZCR_EL1 access, we do absolutely nothing.

On restoring the SVE context for an L2 guest, we directly restore the
guest hypervisor's view of ZCR_EL2 into the physical ZCR_EL2. If the
guest's view of the register was updated using the ZCR_EL2 accessor,
the value has already been sanitised (with the caveat mentioned above).

But if the guest used ZCR_EL1, the raw value is written into the HW,
and the L2 guest can now access VLs that it shouldn't.

Fix all the above by moving the VL capping to the restore points,
ensuring that:

- the HW is always programmed with a capped value, irrespective of
  the accessor being used,

- the ZCR_EL2.LEN field is always completely stateful, irrespective
  of the accessor being used.

Additionally, move ZCR_EL2 to be a sanitised register, ensuring that
only the LEN field is actually stateful. This requires some creative
construction of the RES0 mask, as the sysreg generation script does
not yet generate RAZ/WI fields.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] KVM: arm64: Preserve all guest ZCR_EL2.LEN values
  2026-05-29  9:22 ` Marc Zyngier
@ 2026-05-29  9:46   ` Joey Gouly
  0 siblings, 0 replies; 3+ messages in thread
From: Joey Gouly @ 2026-05-29  9:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Brown, Oliver Upton, Steffen Eiden, Suzuki K Poulose,
	Catalin Marinas, Will Deacon, Mark Rutland, linux-arm-kernel,
	kvmarm, linux-kernel, stable

On Fri, May 29, 2026 at 10:22:41AM +0100, Marc Zyngier wrote:
> Thanks for respinning this patch quickly.
> 
> On Fri, 29 May 2026 00:01:44 +0100,
> Mark Brown <broonie@kernel.org> wrote:
> > 
> > Since commit b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps") when
> > guests write to ZCR_EL2 we have clamped the value of ZCR_EL2.LEN to be
> > at most that configuring the maximum guest VL when accessed directly as
> > ZCR_EL2. This is not clearly the behaviour the architecture documents
> > for ZCR_EL2.LEN, while things are a little ambiguous currently there is
> > a fairly direct reading that suggests values will be read as written.
> > Further, the documented procedure for enumerating vector lengths means
> > that it is expected that values larger than the largest supported vector
> > length will be written in practice.
> 
> Honestly, that's not the core issue. And even $SUBJECT fails to
> capture what is at stake here.
> 
> > 
> > The reasoning for the current behaviour is not specifically articulated, my
> 
> I don't think there is a reasoning behind each and every bug.
> 
> > best guess is that it is intended to ensure that the guest can not see an
> > effective VL greater than the maximum that has been configured, though
> > this will be ineffective when a VHE guest uses the ZCR_EL1 accessor.
> 
> This last point *IS* the core problem. It is that the guest can access
> VLs beyond what is intended by the VM configuration. Not getting the
> read-as-written behaviour really is secondary compared to that issue.
> 
> [...]
> 
> I've rewritten the commit message to make it plain what the problem
> is, see below. I've also slightly tidied up access_zcr_el2(), but the
> fix otherwise looks good.

Seems like you're starting your own CMAAS! https://lore.kernel.org/kvmarm/86cxyzxymq.wl-maz@kernel.org/

> 
> Thanks,
> 
> 	M.
> 
> KVM: arm64: Correctly cap ZCR_EL2 provided by a guest hypervisor
> 
> ZCR_EL2 can be updated by a VHE guest hypervisor either using ZCR_EL2
> (which traps) or ZCR_EL1 (which does not trap). KVM handles both in
> different way:
> 
> - on ZCR_EL2 trap, ZCR_EL2.LEN is immediately capped at the VM's own
>   VL limit. This has the potential to break existing SW that relies
>   on the full LEN field to be stateful.
> 
> - on ZCR_EL1 access, we do absolutely nothing.
> 
> On restoring the SVE context for an L2 guest, we directly restore the
> guest hypervisor's view of ZCR_EL2 into the physical ZCR_EL2. If the
> guest's view of the register was updated using the ZCR_EL2 accessor,
> the value has already been sanitised (with the caveat mentioned above).
> 
> But if the guest used ZCR_EL1, the raw value is written into the HW,
> and the L2 guest can now access VLs that it shouldn't.
> 
> Fix all the above by moving the VL capping to the restore points,
> ensuring that:
> 
> - the HW is always programmed with a capped value, irrespective of
>   the accessor being used,
> 
> - the ZCR_EL2.LEN field is always completely stateful, irrespective
>   of the accessor being used.
> 
> Additionally, move ZCR_EL2 to be a sanitised register, ensuring that
> only the LEN field is actually stateful. This requires some creative
> construction of the RES0 mask, as the sysreg generation script does
> not yet generate RAZ/WI fields.
> 
> -- 

Reviewed-by: Joey Gouly <joey.gouly@arm.com>

Thanks,
Joey

> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-29  9:46 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-28 23:01 [PATCH v2] KVM: arm64: Preserve all guest ZCR_EL2.LEN values Mark Brown
2026-05-29  9:22 ` Marc Zyngier
2026-05-29  9:46   ` Joey Gouly

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox