* [PATCH 0/2] KVM: arm64: AT + SR accessor fixes
@ 2025-08-09 14:48 Marc Zyngier
2025-08-09 14:48 ` [PATCH 1/2] KVM: arm64: nv: Fix ATS12 handling of single-stage translation Marc Zyngier
2025-08-09 14:48 ` [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors Marc Zyngier
0 siblings, 2 replies; 7+ messages in thread
From: Marc Zyngier @ 2025-08-09 14:48 UTC (permalink / raw)
To: kvmarm, kvm, linux-arm-kernel
Cc: Volodymyr Babchuk, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu
Volodymyr having reported[1] a couple of rather interesting bugs while
running Xen under KVM, here's a couple of patches to plug these
issues:
- a fix for ATS12 stopping the walk at S1 under the wrong conditions
- a much larger fix for the vcpu_{read,write}_sys_reg() accessors,
fixing the fate of TPIDR*_EL{0,1} and PAR_EL1, and overall becoming
much easier to work with
The latter is a pretty large change, but is worth it IMO as it makes
everything much more straightforward.
Volodymyr, I'd very much welcome your feedback on those, as despite my
best effort, I didn't manage to even boot Debian's packaging of Xen
(Grub just refuses to run *anything* after Xen being installed -- I
guess it's not tested at all).
[1] https://lore.kernel.org/r/20250806141707.3479194-1-volodymyr_babchuk@epam.com
Marc Zyngier (2):
KVM: arm64: nv: Fix ATS12 handling of single-stage translation
KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
arch/arm64/include/asm/kvm_host.h | 4 +-
arch/arm64/kvm/at.c | 6 +-
arch/arm64/kvm/sys_regs.c | 243 +++++++++++++++---------------
3 files changed, 130 insertions(+), 123 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] KVM: arm64: nv: Fix ATS12 handling of single-stage translation
2025-08-09 14:48 [PATCH 0/2] KVM: arm64: AT + SR accessor fixes Marc Zyngier
@ 2025-08-09 14:48 ` Marc Zyngier
2025-08-09 14:48 ` [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors Marc Zyngier
1 sibling, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2025-08-09 14:48 UTC (permalink / raw)
To: kvmarm, kvm, linux-arm-kernel
Cc: Volodymyr Babchuk, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu
Volodymyr reports that using a Xen DomU as a nested guest (where
HCR_EL2.E2H == 0), ATS12 results in a translation that stops at
the L2's S1, which isn't something you'd normally expects.
Comparing the code against the spec proves to be illuminating,
and suggests that the author of such code must have been tired,
cross-eyed, drunk, or maybe all of the above.
The gist of it is that, apart from HCR_EL2.VM or HCR_EL2.DC being
0, only the use of the EL2&0 translation regime limits the walk
to S1 only, and that we must finish the S2 walk in any other case.
Which solves the above issue, as E2H==0 indicates that ATS12 walks
the EL1&0 translation regime.
Explicitly checking for EL2&0 fixes this.
Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Suggested-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: be04cebf3e788 ("KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}")
Link: https://lore.kernel.org/r/20250806141707.3479194-2-volodymyr_babchuk@epam.com
---
arch/arm64/kvm/at.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kvm/at.c b/arch/arm64/kvm/at.c
index 0e56105339493..d71ca4ddc9d1e 100644
--- a/arch/arm64/kvm/at.c
+++ b/arch/arm64/kvm/at.c
@@ -1420,10 +1420,10 @@ void __kvm_at_s12(struct kvm_vcpu *vcpu, u32 op, u64 vaddr)
return;
/*
- * If we only have a single stage of translation (E2H=0 or
- * TGE=1), exit early. Same thing if {VM,DC}=={0,0}.
+ * If we only have a single stage of translation (EL2&0), exit
+ * early. Same thing if {VM,DC}=={0,0}.
*/
- if (!vcpu_el2_e2h_is_set(vcpu) || vcpu_el2_tge_is_set(vcpu) ||
+ if (compute_translation_regime(vcpu, op) == TR_EL20 ||
!(vcpu_read_sys_reg(vcpu, HCR_EL2) & (HCR_VM | HCR_DC)))
return;
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
2025-08-09 14:48 [PATCH 0/2] KVM: arm64: AT + SR accessor fixes Marc Zyngier
2025-08-09 14:48 ` [PATCH 1/2] KVM: arm64: nv: Fix ATS12 handling of single-stage translation Marc Zyngier
@ 2025-08-09 14:48 ` Marc Zyngier
2025-08-12 20:23 ` Oliver Upton
1 sibling, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2025-08-09 14:48 UTC (permalink / raw)
To: kvmarm, kvm, linux-arm-kernel
Cc: Volodymyr Babchuk, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu
Volodymyr reports (again!) that under some circumstances (E2H==0,
walking S1 PTs), PAR_EL1 doesn't report the value of the latest
walk in the CPU register, but that instead the value is written to
the backing store.
Further investigation indicates that the root cause of this is
that a group of registers (PAR_EL1, TPIDR*_EL{0,1}) should always
be considered as "on CPU", as they are not remapped between EL1 and
EL2. We fail to treat them accordingly, and end-up considering that
the register (PAR_EL1 in this example) should be written to memory
instead of in the register.
While it would be possible to quickly work around it, it is obvious
that the way we track these things at the moment is pretty horrible,
and could do with some improvement.
Revamp the whole thing by:
- defining a location for a register (memory, cpu), potentially
depending on the state of the vcpu
- define a transformation for this register (mapped register, potential
translation, special register needing some particular attention)
- convey this information in a structure that can be easily passed
around
As a result, the accessors themselves become much simpler, as the
state is explicit instead of being driven by hard-to-understand
conventions.
We get rid of the "pure EL2 register" notion, which wasn't very
useful, and add sanitisation of the values by applying the RESx
masks as required, something that was missing so far.
And of course, we add the missing registers to the list, with the
indication that they are always loaded.
Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: fedc612314acf ("KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg()")
Link: https://lore.kerbnel.org/r/20250806141707.3479194-3-volodymyr_babchuk@epam.com
---
arch/arm64/include/asm/kvm_host.h | 4 +-
arch/arm64/kvm/sys_regs.c | 243 +++++++++++++++---------------
2 files changed, 127 insertions(+), 120 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2f2394cce24ed..a9661d35bfd02 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1160,8 +1160,8 @@ u64 kvm_vcpu_apply_reg_masks(const struct kvm_vcpu *, enum vcpu_sysreg, u64);
__v; \
})
-u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg);
-void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg);
+u64 vcpu_read_sys_reg(const struct kvm_vcpu *, enum vcpu_sysreg);
+void vcpu_write_sys_reg(struct kvm_vcpu *, u64, enum vcpu_sysreg);
static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val)
{
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 82ffb3b3b3cf7..c9aa41da8d3a3 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -82,43 +82,55 @@ static bool write_to_read_only(struct kvm_vcpu *vcpu,
"sys_reg write to read-only register");
}
-#define PURE_EL2_SYSREG(el2) \
- case el2: { \
- *el1r = el2; \
- return true; \
- }
+enum sr_loc_attr {
+ SR_LOC_MEMORY = 0, /* Register definitely in memory */
+ SR_LOC_LOADED = BIT(0), /* Register on CPU, unless it cannot */
+ SR_LOC_MAPPED = BIT(1), /* Register in a different CPU register */
+ SR_LOC_XLATED = BIT(2), /* Register translated to fit another reg */
+ SR_LOC_SPECIAL = BIT(3), /* Demanding register, implies loaded */
+};
-#define MAPPED_EL2_SYSREG(el2, el1, fn) \
- case el2: { \
- *xlate = fn; \
- *el1r = el1; \
- return true; \
- }
+struct sr_loc {
+ enum sr_loc_attr loc;
+ enum vcpu_sysreg map_reg;
+ u64 (*xlate)(u64);
+};
-static bool get_el2_to_el1_mapping(unsigned int reg,
- unsigned int *el1r, u64 (**xlate)(u64))
+static void locate_mapped_el2_register(const struct kvm_vcpu *vcpu,
+ enum vcpu_sysreg reg,
+ enum vcpu_sysreg map_reg,
+ u64 (*xlate)(u64),
+ struct sr_loc *loc)
{
+ if (!is_hyp_ctxt(vcpu)) {
+ loc->loc = SR_LOC_MEMORY;
+ return;
+ }
+
+ loc->loc = SR_LOC_LOADED | SR_LOC_MAPPED;
+ loc->map_reg = map_reg;
+
+ if (xlate != NULL && !vcpu_el2_e2h_is_set(vcpu)) {
+ loc->loc |= SR_LOC_XLATED;
+ loc->xlate = xlate;
+ }
+}
+
+#define MAPPED_EL2_SYSREG(r, m, t) \
+ case r: { \
+ locate_mapped_el2_register(vcpu, r, m, t, loc); \
+ break; \
+ }
+
+static void locate_register(const struct kvm_vcpu *vcpu, enum vcpu_sysreg reg,
+ struct sr_loc *loc)
+{
+ if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU)) {
+ loc->loc = SR_LOC_MEMORY;
+ return;
+ }
+
switch (reg) {
- PURE_EL2_SYSREG( VPIDR_EL2 );
- PURE_EL2_SYSREG( VMPIDR_EL2 );
- PURE_EL2_SYSREG( ACTLR_EL2 );
- PURE_EL2_SYSREG( HCR_EL2 );
- PURE_EL2_SYSREG( MDCR_EL2 );
- PURE_EL2_SYSREG( HSTR_EL2 );
- PURE_EL2_SYSREG( HACR_EL2 );
- PURE_EL2_SYSREG( VTTBR_EL2 );
- PURE_EL2_SYSREG( VTCR_EL2 );
- PURE_EL2_SYSREG( TPIDR_EL2 );
- PURE_EL2_SYSREG( HPFAR_EL2 );
- PURE_EL2_SYSREG( HCRX_EL2 );
- PURE_EL2_SYSREG( HFGRTR_EL2 );
- PURE_EL2_SYSREG( HFGWTR_EL2 );
- PURE_EL2_SYSREG( HFGITR_EL2 );
- PURE_EL2_SYSREG( HDFGRTR_EL2 );
- PURE_EL2_SYSREG( HDFGWTR_EL2 );
- PURE_EL2_SYSREG( HAFGRTR_EL2 );
- PURE_EL2_SYSREG( CNTVOFF_EL2 );
- PURE_EL2_SYSREG( CNTHCTL_EL2 );
MAPPED_EL2_SYSREG(SCTLR_EL2, SCTLR_EL1,
translate_sctlr_el2_to_sctlr_el1 );
MAPPED_EL2_SYSREG(CPTR_EL2, CPACR_EL1,
@@ -144,125 +156,120 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
MAPPED_EL2_SYSREG(ZCR_EL2, ZCR_EL1, NULL );
MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL );
MAPPED_EL2_SYSREG(SCTLR2_EL2, SCTLR2_EL1, NULL );
+ case CNTHCTL_EL2:
+ /* CNTHCTL_EL2 is super special, until we support NV2.1 */
+ loc->loc = ((is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) ?
+ SR_LOC_SPECIAL : SR_LOC_MEMORY);
+ break;
+ case TPIDR_EL0:
+ case TPIDRRO_EL0:
+ case TPIDR_EL1:
+ case PAR_EL1:
+ /* These registers are always loaded, no matter what */
+ loc->loc = SR_LOC_LOADED;
+ break;
default:
- return false;
+ /*
+ * Non-mapped EL2 registers are by definition in memory, but
+ * we don't need to distinguish them here, as the CPU
+ * register accessors will bail out and we'll end-up using
+ * the backing store.
+ *
+ * EL1 registers are, however, only loaded if we're
+ * not in hypervisor context.
+ */
+ loc->loc = is_hyp_ctxt(vcpu) ? SR_LOC_MEMORY : SR_LOC_LOADED;
}
}
-u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
+u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, enum vcpu_sysreg reg)
{
- u64 val = 0x8badf00d8badf00d;
- u64 (*xlate)(u64) = NULL;
- unsigned int el1r;
+ struct sr_loc loc = {};
- if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
- goto memory_read;
+ locate_register(vcpu, reg, &loc);
- if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
- if (!is_hyp_ctxt(vcpu))
- goto memory_read;
+ if (loc.loc & SR_LOC_SPECIAL) {
+ u64 val;
/*
- * CNTHCTL_EL2 requires some special treatment to
- * account for the bits that can be set via CNTKCTL_EL1.
+ * CNTHCTL_EL2 requires some special treatment to account
+ * for the bits that can be set via CNTKCTL_EL1 when E2H==1.
*/
switch (reg) {
case CNTHCTL_EL2:
- if (vcpu_el2_e2h_is_set(vcpu)) {
- val = read_sysreg_el1(SYS_CNTKCTL);
- val &= CNTKCTL_VALID_BITS;
- val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
- return val;
- }
- break;
+ val = read_sysreg_el1(SYS_CNTKCTL);
+ val &= CNTKCTL_VALID_BITS;
+ val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
+ return val;
+ default:
+ WARN_ON_ONCE(1);
}
-
- /*
- * If this register does not have an EL1 counterpart,
- * then read the stored EL2 version.
- */
- if (reg == el1r)
- goto memory_read;
-
- /*
- * If we have a non-VHE guest and that the sysreg
- * requires translation to be used at EL1, use the
- * in-memory copy instead.
- */
- if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
- goto memory_read;
-
- /* Get the current version of the EL1 counterpart. */
- WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
- if (reg >= __SANITISED_REG_START__)
- val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
-
- return val;
}
- /* EL1 register can't be on the CPU if the guest is in vEL2. */
- if (unlikely(is_hyp_ctxt(vcpu)))
- goto memory_read;
+ if (loc.loc & SR_LOC_LOADED) {
+ enum vcpu_sysreg map_reg = reg;
+ u64 val = 0x8badf00d8badf00d;
- if (__vcpu_read_sys_reg_from_cpu(reg, &val))
- return val;
+ if (loc.loc & SR_LOC_MAPPED)
+ map_reg = loc.map_reg;
+
+ if (!(loc.loc & SR_LOC_XLATED) &&
+ __vcpu_read_sys_reg_from_cpu(map_reg, &val)) {
+ if (reg >= __SANITISED_REG_START__)
+ val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
+
+ return val;
+ }
+ }
-memory_read:
return __vcpu_sys_reg(vcpu, reg);
}
-void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
+void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, enum vcpu_sysreg reg)
{
- u64 (*xlate)(u64) = NULL;
- unsigned int el1r;
+ struct sr_loc loc = {};
- if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
- goto memory_write;
-
- if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
- if (!is_hyp_ctxt(vcpu))
- goto memory_write;
-
- /*
- * Always store a copy of the write to memory to avoid having
- * to reverse-translate virtual EL2 system registers for a
- * non-VHE guest hypervisor.
- */
- __vcpu_assign_sys_reg(vcpu, reg, val);
+ locate_register(vcpu, reg, &loc);
+ if (loc.loc & SR_LOC_SPECIAL) {
switch (reg) {
case CNTHCTL_EL2:
/*
- * If E2H=0, CNHTCTL_EL2 is a pure shadow register.
- * Otherwise, some of the bits are backed by
+ * If E2H=1, some of the bits are backed by
* CNTKCTL_EL1, while the rest is kept in memory.
* Yes, this is fun stuff.
*/
- if (vcpu_el2_e2h_is_set(vcpu))
- write_sysreg_el1(val, SYS_CNTKCTL);
- return;
+ write_sysreg_el1(val, SYS_CNTKCTL);
+ break;
+ default:
+ WARN_ON_ONCE(1);
}
-
- /* No EL1 counterpart? We're done here.? */
- if (reg == el1r)
- return;
-
- if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
- val = xlate(val);
-
- /* Redirect this to the EL1 version of the register. */
- WARN_ON(!__vcpu_write_sys_reg_to_cpu(val, el1r));
- return;
}
- /* EL1 register can't be on the CPU if the guest is in vEL2. */
- if (unlikely(is_hyp_ctxt(vcpu)))
- goto memory_write;
+ if (loc.loc & SR_LOC_LOADED) {
+ enum vcpu_sysreg map_reg = reg;
+ u64 xlated_val;
- if (__vcpu_write_sys_reg_to_cpu(val, reg))
- return;
+ if (reg >= __SANITISED_REG_START__)
+ val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
+
+ if (loc.loc & SR_LOC_MAPPED)
+ map_reg = loc.map_reg;
+
+ if (loc.loc & SR_LOC_XLATED)
+ xlated_val = loc.xlate(val);
+ else
+ xlated_val = val;
+
+ __vcpu_write_sys_reg_to_cpu(xlated_val, map_reg);
+
+ /*
+ * Fall through to write the backing store anyway, which
+ * allows translated registers to be directly read without a
+ * reverse translation.
+ */
+ }
-memory_write:
__vcpu_assign_sys_reg(vcpu, reg, val);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
2025-08-09 14:48 ` [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors Marc Zyngier
@ 2025-08-12 20:23 ` Oliver Upton
2025-08-13 6:54 ` Marc Zyngier
2025-08-14 16:16 ` Marc Zyngier
0 siblings, 2 replies; 7+ messages in thread
From: Oliver Upton @ 2025-08-12 20:23 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, kvm, linux-arm-kernel, Volodymyr Babchuk, Joey Gouly,
Suzuki K Poulose, Zenghui Yu
On Sat, Aug 09, 2025 at 03:48:11PM +0100, Marc Zyngier wrote:
> @@ -144,125 +156,120 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
> MAPPED_EL2_SYSREG(ZCR_EL2, ZCR_EL1, NULL );
> MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL );
> MAPPED_EL2_SYSREG(SCTLR2_EL2, SCTLR2_EL1, NULL );
> + case CNTHCTL_EL2:
> + /* CNTHCTL_EL2 is super special, until we support NV2.1 */
> + loc->loc = ((is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) ?
> + SR_LOC_SPECIAL : SR_LOC_MEMORY);
> + break;
> + case TPIDR_EL0:
> + case TPIDRRO_EL0:
> + case TPIDR_EL1:
> + case PAR_EL1:
> + /* These registers are always loaded, no matter what */
> + loc->loc = SR_LOC_LOADED;
> + break;
> default:
> - return false;
> + /*
> + * Non-mapped EL2 registers are by definition in memory, but
> + * we don't need to distinguish them here, as the CPU
> + * register accessors will bail out and we'll end-up using
> + * the backing store.
> + *
> + * EL1 registers are, however, only loaded if we're
> + * not in hypervisor context.
> + */
> + loc->loc = is_hyp_ctxt(vcpu) ? SR_LOC_MEMORY : SR_LOC_LOADED;
Hmm... I get the feeling that this flow is becoming even more subtle.
There's some implicit coupling between this switch statement and the
__vcpu_{read,write}_sys_reg_from_cpu() which feels like it could be
error prone. Especially since we're gonna lose the WARN() that would
inform us if an on-CPU register was actually redirected to memory.
I'm wondering if we need some macro hell containing the block of
registers we handle on-CPU and expand that can be expanded into this
triage switch case as well as the sysreg accessor.
What you have definitely seems correct, though. I'll twiddle a bit and
see if I come up with something, although I imagine what you have is
what we'll use in the end anyway.
Thanks,
Oliver
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
2025-08-12 20:23 ` Oliver Upton
@ 2025-08-13 6:54 ` Marc Zyngier
2025-08-14 16:16 ` Marc Zyngier
1 sibling, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2025-08-13 6:54 UTC (permalink / raw)
To: Oliver Upton
Cc: kvmarm, kvm, linux-arm-kernel, Volodymyr Babchuk, Joey Gouly,
Suzuki K Poulose, Zenghui Yu
Hey Oliver,
Thanks for looking into this.
On Tue, 12 Aug 2025 21:23:33 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Sat, Aug 09, 2025 at 03:48:11PM +0100, Marc Zyngier wrote:
> > @@ -144,125 +156,120 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
> > MAPPED_EL2_SYSREG(ZCR_EL2, ZCR_EL1, NULL );
> > MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL );
> > MAPPED_EL2_SYSREG(SCTLR2_EL2, SCTLR2_EL1, NULL );
> > + case CNTHCTL_EL2:
> > + /* CNTHCTL_EL2 is super special, until we support NV2.1 */
> > + loc->loc = ((is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) ?
> > + SR_LOC_SPECIAL : SR_LOC_MEMORY);
> > + break;
> > + case TPIDR_EL0:
> > + case TPIDRRO_EL0:
> > + case TPIDR_EL1:
> > + case PAR_EL1:
> > + /* These registers are always loaded, no matter what */
> > + loc->loc = SR_LOC_LOADED;
> > + break;
> > default:
> > - return false;
> > + /*
> > + * Non-mapped EL2 registers are by definition in memory, but
> > + * we don't need to distinguish them here, as the CPU
> > + * register accessors will bail out and we'll end-up using
> > + * the backing store.
> > + *
> > + * EL1 registers are, however, only loaded if we're
> > + * not in hypervisor context.
> > + */
> > + loc->loc = is_hyp_ctxt(vcpu) ? SR_LOC_MEMORY : SR_LOC_LOADED;
>
> Hmm... I get the feeling that this flow is becoming even more subtle.
> There's some implicit coupling between this switch statement and the
> __vcpu_{read,write}_sys_reg_from_cpu() which feels like it could be
> error prone. Especially since we're gonna lose the WARN() that would
> inform us if an on-CPU register was actually redirected to memory.
This implicit behaviour was already present, and nobody noticed it.
See how the FGT2 registers are currently missing from the list of
"pure" registers. We didn't see the problem because the fallback saves
us. This is what decided me to throw away the "pure" annotation and
lump both non-remapped EL2 and EL1 registers together.
> I'm wondering if we need some macro hell containing the block of
> registers we handle on-CPU and expand that can be expanded into this
> triage switch case as well as the sysreg accessor.
That's an interesting approach. A bit tricky, because we do rely on
heavy inlining and constant propagation in the CPU accessors, but
maybe there's a way to deal with that...
> What you have definitely seems correct, though. I'll twiddle a bit and
> see if I come up with something, although I imagine what you have is
> what we'll use in the end anyway.
I'll have a look in parallel and post my findings, if any.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
2025-08-12 20:23 ` Oliver Upton
2025-08-13 6:54 ` Marc Zyngier
@ 2025-08-14 16:16 ` Marc Zyngier
2025-08-15 18:56 ` Oliver Upton
1 sibling, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2025-08-14 16:16 UTC (permalink / raw)
To: Oliver Upton
Cc: kvmarm, kvm, linux-arm-kernel, Volodymyr Babchuk, Joey Gouly,
Suzuki K Poulose, Zenghui Yu
On Tue, 12 Aug 2025 21:23:33 +0100,
Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Sat, Aug 09, 2025 at 03:48:11PM +0100, Marc Zyngier wrote:
> > @@ -144,125 +156,120 @@ static bool get_el2_to_el1_mapping(unsigned int reg,
> > MAPPED_EL2_SYSREG(ZCR_EL2, ZCR_EL1, NULL );
> > MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL );
> > MAPPED_EL2_SYSREG(SCTLR2_EL2, SCTLR2_EL1, NULL );
> > + case CNTHCTL_EL2:
> > + /* CNTHCTL_EL2 is super special, until we support NV2.1 */
> > + loc->loc = ((is_hyp_ctxt(vcpu) && vcpu_el2_e2h_is_set(vcpu)) ?
> > + SR_LOC_SPECIAL : SR_LOC_MEMORY);
> > + break;
> > + case TPIDR_EL0:
> > + case TPIDRRO_EL0:
> > + case TPIDR_EL1:
> > + case PAR_EL1:
> > + /* These registers are always loaded, no matter what */
> > + loc->loc = SR_LOC_LOADED;
> > + break;
> > default:
> > - return false;
> > + /*
> > + * Non-mapped EL2 registers are by definition in memory, but
> > + * we don't need to distinguish them here, as the CPU
> > + * register accessors will bail out and we'll end-up using
> > + * the backing store.
> > + *
> > + * EL1 registers are, however, only loaded if we're
> > + * not in hypervisor context.
> > + */
> > + loc->loc = is_hyp_ctxt(vcpu) ? SR_LOC_MEMORY : SR_LOC_LOADED;
>
> Hmm... I get the feeling that this flow is becoming even more subtle.
> There's some implicit coupling between this switch statement and the
> __vcpu_{read,write}_sys_reg_from_cpu() which feels like it could be
> error prone. Especially since we're gonna lose the WARN() that would
> inform us if an on-CPU register was actually redirected to memory.
>
> I'm wondering if we need some macro hell containing the block of
> registers we handle on-CPU and expand that can be expanded into this
> triage switch case as well as the sysreg accessor.
>
> What you have definitely seems correct, though. I'll twiddle a bit and
> see if I come up with something, although I imagine what you have is
> what we'll use in the end anyway.
My current conclusion is that a macro hack is not really practical, if
only because we end-up here from out-of-line C code, and that at this
stage we've lost all symbolic information.
We *could* take the nuclear option of re-modelling the sysreg enum as
a bunch of #define, similar to the way we deal with vcpu flags, and
have accessors for the various bits of information, but that comes
with two different problems:
- we don't have a good way to iterate over symbolic registers
- we need to repaint a large portion of the code base
Given that, I've taken another approach, which is to move all these
things close together (no more inlining), and add enough WARN_ON()s
that you really have to try and game the code to miss something and
not get caught. In the process, I found a couple of extra stragglers
that are always loaded when running a 32bit guest (the *32_EL2
registers...).
I've pushed the current state on my kvm-arm64/at-fixes-6.17 branch,
and I'll try to repost patches over the weekend.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors
2025-08-14 16:16 ` Marc Zyngier
@ 2025-08-15 18:56 ` Oliver Upton
0 siblings, 0 replies; 7+ messages in thread
From: Oliver Upton @ 2025-08-15 18:56 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, kvm, linux-arm-kernel, Volodymyr Babchuk, Joey Gouly,
Suzuki K Poulose, Zenghui Yu
On Thu, Aug 14, 2025 at 05:16:57PM +0100, Marc Zyngier wrote:
> My current conclusion is that a macro hack is not really practical, if
> only because we end-up here from out-of-line C code, and that at this
> stage we've lost all symbolic information.
>
> We *could* take the nuclear option of re-modelling the sysreg enum as
> a bunch of #define, similar to the way we deal with vcpu flags, and
> have accessors for the various bits of information, but that comes
> with two different problems:
>
> - we don't have a good way to iterate over symbolic registers
>
> - we need to repaint a large portion of the code base
>
> Given that, I've taken another approach, which is to move all these
> things close together (no more inlining), and add enough WARN_ON()s
> that you really have to try and game the code to miss something and
> not get caught. In the process, I found a couple of extra stragglers
> that are always loaded when running a 32bit guest (the *32_EL2
> registers...).
>
> I've pushed the current state on my kvm-arm64/at-fixes-6.17 branch,
> and I'll try to repost patches over the weekend.
Thanks! I've taken a glance at the branch and LGTM. Just wanted to make
sure we have sufficient idiotproofing for the next time I misuse this
stuff ;-)
Best,
Oliver
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2025-08-15 20:20 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-09 14:48 [PATCH 0/2] KVM: arm64: AT + SR accessor fixes Marc Zyngier
2025-08-09 14:48 ` [PATCH 1/2] KVM: arm64: nv: Fix ATS12 handling of single-stage translation Marc Zyngier
2025-08-09 14:48 ` [PATCH 2/2] KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessors Marc Zyngier
2025-08-12 20:23 ` Oliver Upton
2025-08-13 6:54 ` Marc Zyngier
2025-08-14 16:16 ` Marc Zyngier
2025-08-15 18:56 ` Oliver Upton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).