linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs
@ 2025-08-05 13:56 Fuad Tabba
  2025-08-05 13:56 ` [PATCH v1 1/4] KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host " Fuad Tabba
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 13:56 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, will, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, vdonnefort, qperret, sebastianene, keirf,
	smostafa, tabba

This patch series is mainly about fixing issues we've encountered in
pKVM (in downstream Android code), related to the handling of protected
VM access to restricted registers and injecting undefined exceptions
into a protected guest.

The last patch isn't pKVM specific, but a fix to the vgic-v2 code
encountered while fixing the issues in this series. The issue it fixes
was indirectly introduced into the code with hVHE.

Based on Linux 6.16.

Cheers,
/fuad

Fuad Tabba (4):
  KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host for protected VMs
  KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef
    exception
  arm64: vgic-v2: Fix guest endianness check in hVHE mode

 arch/arm64/include/asm/kvm_emulate.h     | 184 +++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h        |   3 -
 arch/arm64/kvm/hyp/nvhe/sys_regs.c       |   5 +
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c |   2 +-
 arch/arm64/kvm/sys_regs.c                | 184 -----------------------
 5 files changed, 190 insertions(+), 188 deletions(-)


base-commit: 038d61fd642278bab63ee8ef722c50d10ab01e8f
-- 
2.50.1.565.gc32cd1483b-goog



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v1 1/4] KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host for protected VMs
  2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
@ 2025-08-05 13:56 ` Fuad Tabba
  2025-08-05 13:56 ` [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code Fuad Tabba
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 13:56 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, will, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, vdonnefort, qperret, sebastianene, keirf,
	smostafa, tabba

Since commit 17efc1acee62 ("arm64: Expose AIDR_EL1 via sysfs"), AIDR_EL1
is read early during boot. Therefore, a guest running as a protected VM
will fail to boot because when it attempts to access AIDR_EL1, since
access to that register is restricted in pKVM for protected guests.

Similar to how MIDR_EL1 is handled by the host for protected VMs, let
the host handle accesses to AIDR_EL1 as well as REVIDR_EL1. However note
that, unlike MIDR_EL1, AIDR_EL1 and REVIDR_EL1 are trapped by
HCR_EL2.TID1. Therefore, explicitly mark them as handled by the host for
protected VMs. TID1 is always set in pKVM, because it needs to restrict
access to SMIDR_EL1, which is also trapped by that bit.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/sys_regs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index 1ddd9ed3cbb3..bbd60013cf9e 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -372,6 +372,9 @@ static const struct sys_reg_desc pvm_sys_reg_descs[] = {
 
 	/* Debug and Trace Registers are restricted. */
 
+	/* Group 1 ID registers */
+	HOST_HANDLED(SYS_REVIDR_EL1),
+
 	/* AArch64 mappings of the AArch32 ID registers */
 	/* CRm=1 */
 	AARCH32(SYS_ID_PFR0_EL1),
@@ -460,6 +463,7 @@ static const struct sys_reg_desc pvm_sys_reg_descs[] = {
 
 	HOST_HANDLED(SYS_CCSIDR_EL1),
 	HOST_HANDLED(SYS_CLIDR_EL1),
+	HOST_HANDLED(SYS_AIDR_EL1),
 	HOST_HANDLED(SYS_CSSELR_EL1),
 	HOST_HANDLED(SYS_CTR_EL0),
 
-- 
2.50.1.565.gc32cd1483b-goog



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
  2025-08-05 13:56 ` [PATCH v1 1/4] KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host " Fuad Tabba
@ 2025-08-05 13:56 ` Fuad Tabba
  2025-08-05 14:38   ` Will Deacon
  2025-08-05 13:56 ` [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception Fuad Tabba
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 13:56 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, will, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, vdonnefort, qperret, sebastianene, keirf,
	smostafa, tabba

Allow vcpu_{read,write}_sys_reg() to be called from EL2. This makes it
possible for hyp to use existing helper functions to access the vCPU
context.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_emulate.h | 184 +++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h    |   3 -
 arch/arm64/kvm/sys_regs.c            | 184 ---------------------------
 3 files changed, 184 insertions(+), 187 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 0720898f563e..1f449ef4564c 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -224,6 +224,190 @@ static inline bool vcpu_is_host_el0(const struct kvm_vcpu *vcpu)
 	return is_hyp_ctxt(vcpu) && !vcpu_is_el2(vcpu);
 }
 
+#define PURE_EL2_SYSREG(el2)						\
+	case el2: {							\
+		*el1r = el2;						\
+		return true;						\
+	}
+
+#define MAPPED_EL2_SYSREG(el2, el1, fn)					\
+	case el2: {							\
+		*xlate = fn;						\
+		*el1r = el1;						\
+		return true;						\
+	}
+
+static bool get_el2_to_el1_mapping(unsigned int reg,
+				   unsigned int *el1r, u64 (**xlate)(u64))
+{
+	switch (reg) {
+		PURE_EL2_SYSREG(  VPIDR_EL2	);
+		PURE_EL2_SYSREG(  VMPIDR_EL2	);
+		PURE_EL2_SYSREG(  ACTLR_EL2	);
+		PURE_EL2_SYSREG(  HCR_EL2	);
+		PURE_EL2_SYSREG(  MDCR_EL2	);
+		PURE_EL2_SYSREG(  HSTR_EL2	);
+		PURE_EL2_SYSREG(  HACR_EL2	);
+		PURE_EL2_SYSREG(  VTTBR_EL2	);
+		PURE_EL2_SYSREG(  VTCR_EL2	);
+		PURE_EL2_SYSREG(  RVBAR_EL2	);
+		PURE_EL2_SYSREG(  TPIDR_EL2	);
+		PURE_EL2_SYSREG(  HPFAR_EL2	);
+		PURE_EL2_SYSREG(  HCRX_EL2	);
+		PURE_EL2_SYSREG(  HFGRTR_EL2	);
+		PURE_EL2_SYSREG(  HFGWTR_EL2	);
+		PURE_EL2_SYSREG(  HFGITR_EL2	);
+		PURE_EL2_SYSREG(  HDFGRTR_EL2	);
+		PURE_EL2_SYSREG(  HDFGWTR_EL2	);
+		PURE_EL2_SYSREG(  HAFGRTR_EL2	);
+		PURE_EL2_SYSREG(  CNTVOFF_EL2	);
+		PURE_EL2_SYSREG(  CNTHCTL_EL2	);
+		MAPPED_EL2_SYSREG(SCTLR_EL2,   SCTLR_EL1,
+				  translate_sctlr_el2_to_sctlr_el1	     );
+		MAPPED_EL2_SYSREG(CPTR_EL2,    CPACR_EL1,
+				  translate_cptr_el2_to_cpacr_el1	     );
+		MAPPED_EL2_SYSREG(TTBR0_EL2,   TTBR0_EL1,
+				  translate_ttbr0_el2_to_ttbr0_el1	     );
+		MAPPED_EL2_SYSREG(TTBR1_EL2,   TTBR1_EL1,   NULL	     );
+		MAPPED_EL2_SYSREG(TCR_EL2,     TCR_EL1,
+				  translate_tcr_el2_to_tcr_el1		     );
+		MAPPED_EL2_SYSREG(VBAR_EL2,    VBAR_EL1,    NULL	     );
+		MAPPED_EL2_SYSREG(AFSR0_EL2,   AFSR0_EL1,   NULL	     );
+		MAPPED_EL2_SYSREG(AFSR1_EL2,   AFSR1_EL1,   NULL	     );
+		MAPPED_EL2_SYSREG(ESR_EL2,     ESR_EL1,     NULL	     );
+		MAPPED_EL2_SYSREG(FAR_EL2,     FAR_EL1,     NULL	     );
+		MAPPED_EL2_SYSREG(MAIR_EL2,    MAIR_EL1,    NULL	     );
+		MAPPED_EL2_SYSREG(TCR2_EL2,    TCR2_EL1,    NULL	     );
+		MAPPED_EL2_SYSREG(PIR_EL2,     PIR_EL1,     NULL	     );
+		MAPPED_EL2_SYSREG(PIRE0_EL2,   PIRE0_EL1,   NULL	     );
+		MAPPED_EL2_SYSREG(POR_EL2,     POR_EL1,     NULL	     );
+		MAPPED_EL2_SYSREG(AMAIR_EL2,   AMAIR_EL1,   NULL	     );
+		MAPPED_EL2_SYSREG(ELR_EL2,     ELR_EL1,	    NULL	     );
+		MAPPED_EL2_SYSREG(SPSR_EL2,    SPSR_EL1,    NULL	     );
+		MAPPED_EL2_SYSREG(ZCR_EL2,     ZCR_EL1,     NULL	     );
+		MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL	     );
+	default:
+		return false;
+	}
+}
+
+static inline u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
+{
+	u64 val = 0x8badf00d8badf00d;
+	u64 (*xlate)(u64) = NULL;
+	unsigned int el1r;
+
+	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+		goto memory_read;
+
+	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+		if (!is_hyp_ctxt(vcpu))
+			goto memory_read;
+
+		/*
+		 * CNTHCTL_EL2 requires some special treatment to
+		 * account for the bits that can be set via CNTKCTL_EL1.
+		 */
+		switch (reg) {
+		case CNTHCTL_EL2:
+			if (vcpu_el2_e2h_is_set(vcpu)) {
+				val = read_sysreg_el1(SYS_CNTKCTL);
+				val &= CNTKCTL_VALID_BITS;
+				val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
+				return val;
+			}
+			break;
+		}
+
+		/*
+		 * If this register does not have an EL1 counterpart,
+		 * then read the stored EL2 version.
+		 */
+		if (reg == el1r)
+			goto memory_read;
+
+		/*
+		 * If we have a non-VHE guest and that the sysreg
+		 * requires translation to be used at EL1, use the
+		 * in-memory copy instead.
+		 */
+		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+			goto memory_read;
+
+		/* Get the current version of the EL1 counterpart. */
+		WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
+		if (reg >= __SANITISED_REG_START__)
+			val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
+
+		return val;
+	}
+
+	/* EL1 register can't be on the CPU if the guest is in vEL2. */
+	if (unlikely(is_hyp_ctxt(vcpu)))
+		goto memory_read;
+
+	if (__vcpu_read_sys_reg_from_cpu(reg, &val))
+		return val;
+
+memory_read:
+	return __vcpu_sys_reg(vcpu, reg);
+}
+
+static inline void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
+{
+	u64 (*xlate)(u64) = NULL;
+	unsigned int el1r;
+
+	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
+		goto memory_write;
+
+	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
+		if (!is_hyp_ctxt(vcpu))
+			goto memory_write;
+
+		/*
+		 * Always store a copy of the write to memory to avoid having
+		 * to reverse-translate virtual EL2 system registers for a
+		 * non-VHE guest hypervisor.
+		 */
+		__vcpu_assign_sys_reg(vcpu, reg, val);
+
+		switch (reg) {
+		case CNTHCTL_EL2:
+			/*
+			 * If E2H=0, CNHTCTL_EL2 is a pure shadow register.
+			 * Otherwise, some of the bits are backed by
+			 * CNTKCTL_EL1, while the rest is kept in memory.
+			 * Yes, this is fun stuff.
+			 */
+			if (vcpu_el2_e2h_is_set(vcpu))
+				write_sysreg_el1(val, SYS_CNTKCTL);
+			return;
+		}
+
+		/* No EL1 counterpart? We're done here.? */
+		if (reg == el1r)
+			return;
+
+		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
+			val = xlate(val);
+
+		/* Redirect this to the EL1 version of the register. */
+		WARN_ON(!__vcpu_write_sys_reg_to_cpu(val, el1r));
+		return;
+	}
+
+	/* EL1 register can't be on the CPU if the guest is in vEL2. */
+	if (unlikely(is_hyp_ctxt(vcpu)))
+		goto memory_write;
+
+	if (__vcpu_write_sys_reg_to_cpu(val, reg))
+		return;
+
+memory_write:
+	__vcpu_assign_sys_reg(vcpu, reg, val);
+}
+
 /*
  * The layout of SPSR for an AArch32 state is different when observed from an
  * AArch64 SPSR_ELx or an AArch32 SPSR_*. This function generates the AArch32
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3e41a880b062..1b0f9c63dc93 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1138,9 +1138,6 @@ u64 kvm_vcpu_apply_reg_masks(const struct kvm_vcpu *, enum vcpu_sysreg, u64);
 		__v;							\
 	})
 
-u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg);
-void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg);
-
 static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val)
 {
 	/*
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c20bd6f21e60..94c46cc040ea 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -82,190 +82,6 @@ static bool write_to_read_only(struct kvm_vcpu *vcpu,
 			"sys_reg write to read-only register");
 }
 
-#define PURE_EL2_SYSREG(el2)						\
-	case el2: {							\
-		*el1r = el2;						\
-		return true;						\
-	}
-
-#define MAPPED_EL2_SYSREG(el2, el1, fn)					\
-	case el2: {							\
-		*xlate = fn;						\
-		*el1r = el1;						\
-		return true;						\
-	}
-
-static bool get_el2_to_el1_mapping(unsigned int reg,
-				   unsigned int *el1r, u64 (**xlate)(u64))
-{
-	switch (reg) {
-		PURE_EL2_SYSREG(  VPIDR_EL2	);
-		PURE_EL2_SYSREG(  VMPIDR_EL2	);
-		PURE_EL2_SYSREG(  ACTLR_EL2	);
-		PURE_EL2_SYSREG(  HCR_EL2	);
-		PURE_EL2_SYSREG(  MDCR_EL2	);
-		PURE_EL2_SYSREG(  HSTR_EL2	);
-		PURE_EL2_SYSREG(  HACR_EL2	);
-		PURE_EL2_SYSREG(  VTTBR_EL2	);
-		PURE_EL2_SYSREG(  VTCR_EL2	);
-		PURE_EL2_SYSREG(  RVBAR_EL2	);
-		PURE_EL2_SYSREG(  TPIDR_EL2	);
-		PURE_EL2_SYSREG(  HPFAR_EL2	);
-		PURE_EL2_SYSREG(  HCRX_EL2	);
-		PURE_EL2_SYSREG(  HFGRTR_EL2	);
-		PURE_EL2_SYSREG(  HFGWTR_EL2	);
-		PURE_EL2_SYSREG(  HFGITR_EL2	);
-		PURE_EL2_SYSREG(  HDFGRTR_EL2	);
-		PURE_EL2_SYSREG(  HDFGWTR_EL2	);
-		PURE_EL2_SYSREG(  HAFGRTR_EL2	);
-		PURE_EL2_SYSREG(  CNTVOFF_EL2	);
-		PURE_EL2_SYSREG(  CNTHCTL_EL2	);
-		MAPPED_EL2_SYSREG(SCTLR_EL2,   SCTLR_EL1,
-				  translate_sctlr_el2_to_sctlr_el1	     );
-		MAPPED_EL2_SYSREG(CPTR_EL2,    CPACR_EL1,
-				  translate_cptr_el2_to_cpacr_el1	     );
-		MAPPED_EL2_SYSREG(TTBR0_EL2,   TTBR0_EL1,
-				  translate_ttbr0_el2_to_ttbr0_el1	     );
-		MAPPED_EL2_SYSREG(TTBR1_EL2,   TTBR1_EL1,   NULL	     );
-		MAPPED_EL2_SYSREG(TCR_EL2,     TCR_EL1,
-				  translate_tcr_el2_to_tcr_el1		     );
-		MAPPED_EL2_SYSREG(VBAR_EL2,    VBAR_EL1,    NULL	     );
-		MAPPED_EL2_SYSREG(AFSR0_EL2,   AFSR0_EL1,   NULL	     );
-		MAPPED_EL2_SYSREG(AFSR1_EL2,   AFSR1_EL1,   NULL	     );
-		MAPPED_EL2_SYSREG(ESR_EL2,     ESR_EL1,     NULL	     );
-		MAPPED_EL2_SYSREG(FAR_EL2,     FAR_EL1,     NULL	     );
-		MAPPED_EL2_SYSREG(MAIR_EL2,    MAIR_EL1,    NULL	     );
-		MAPPED_EL2_SYSREG(TCR2_EL2,    TCR2_EL1,    NULL	     );
-		MAPPED_EL2_SYSREG(PIR_EL2,     PIR_EL1,     NULL	     );
-		MAPPED_EL2_SYSREG(PIRE0_EL2,   PIRE0_EL1,   NULL	     );
-		MAPPED_EL2_SYSREG(POR_EL2,     POR_EL1,     NULL	     );
-		MAPPED_EL2_SYSREG(AMAIR_EL2,   AMAIR_EL1,   NULL	     );
-		MAPPED_EL2_SYSREG(ELR_EL2,     ELR_EL1,	    NULL	     );
-		MAPPED_EL2_SYSREG(SPSR_EL2,    SPSR_EL1,    NULL	     );
-		MAPPED_EL2_SYSREG(ZCR_EL2,     ZCR_EL1,     NULL	     );
-		MAPPED_EL2_SYSREG(CONTEXTIDR_EL2, CONTEXTIDR_EL1, NULL	     );
-	default:
-		return false;
-	}
-}
-
-u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
-{
-	u64 val = 0x8badf00d8badf00d;
-	u64 (*xlate)(u64) = NULL;
-	unsigned int el1r;
-
-	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
-		goto memory_read;
-
-	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
-		if (!is_hyp_ctxt(vcpu))
-			goto memory_read;
-
-		/*
-		 * CNTHCTL_EL2 requires some special treatment to
-		 * account for the bits that can be set via CNTKCTL_EL1.
-		 */
-		switch (reg) {
-		case CNTHCTL_EL2:
-			if (vcpu_el2_e2h_is_set(vcpu)) {
-				val = read_sysreg_el1(SYS_CNTKCTL);
-				val &= CNTKCTL_VALID_BITS;
-				val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
-				return val;
-			}
-			break;
-		}
-
-		/*
-		 * If this register does not have an EL1 counterpart,
-		 * then read the stored EL2 version.
-		 */
-		if (reg == el1r)
-			goto memory_read;
-
-		/*
-		 * If we have a non-VHE guest and that the sysreg
-		 * requires translation to be used at EL1, use the
-		 * in-memory copy instead.
-		 */
-		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
-			goto memory_read;
-
-		/* Get the current version of the EL1 counterpart. */
-		WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
-		if (reg >= __SANITISED_REG_START__)
-			val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
-
-		return val;
-	}
-
-	/* EL1 register can't be on the CPU if the guest is in vEL2. */
-	if (unlikely(is_hyp_ctxt(vcpu)))
-		goto memory_read;
-
-	if (__vcpu_read_sys_reg_from_cpu(reg, &val))
-		return val;
-
-memory_read:
-	return __vcpu_sys_reg(vcpu, reg);
-}
-
-void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
-{
-	u64 (*xlate)(u64) = NULL;
-	unsigned int el1r;
-
-	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
-		goto memory_write;
-
-	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
-		if (!is_hyp_ctxt(vcpu))
-			goto memory_write;
-
-		/*
-		 * Always store a copy of the write to memory to avoid having
-		 * to reverse-translate virtual EL2 system registers for a
-		 * non-VHE guest hypervisor.
-		 */
-		__vcpu_assign_sys_reg(vcpu, reg, val);
-
-		switch (reg) {
-		case CNTHCTL_EL2:
-			/*
-			 * If E2H=0, CNHTCTL_EL2 is a pure shadow register.
-			 * Otherwise, some of the bits are backed by
-			 * CNTKCTL_EL1, while the rest is kept in memory.
-			 * Yes, this is fun stuff.
-			 */
-			if (vcpu_el2_e2h_is_set(vcpu))
-				write_sysreg_el1(val, SYS_CNTKCTL);
-			return;
-		}
-
-		/* No EL1 counterpart? We're done here.? */
-		if (reg == el1r)
-			return;
-
-		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
-			val = xlate(val);
-
-		/* Redirect this to the EL1 version of the register. */
-		WARN_ON(!__vcpu_write_sys_reg_to_cpu(val, el1r));
-		return;
-	}
-
-	/* EL1 register can't be on the CPU if the guest is in vEL2. */
-	if (unlikely(is_hyp_ctxt(vcpu)))
-		goto memory_write;
-
-	if (__vcpu_write_sys_reg_to_cpu(val, reg))
-		return;
-
-memory_write:
-	__vcpu_assign_sys_reg(vcpu, reg, val);
-}
-
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 14
 
-- 
2.50.1.565.gc32cd1483b-goog



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
  2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
  2025-08-05 13:56 ` [PATCH v1 1/4] KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host " Fuad Tabba
  2025-08-05 13:56 ` [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code Fuad Tabba
@ 2025-08-05 13:56 ` Fuad Tabba
  2025-08-05 14:35   ` Will Deacon
  2025-08-05 18:41   ` Marc Zyngier
  2025-08-05 13:56 ` [PATCH v1 4/4] arm64: vgic-v2: Fix guest endianness check in hVHE mode Fuad Tabba
  2025-08-05 14:39 ` [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Will Deacon
  4 siblings, 2 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 13:56 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, will, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, vdonnefort, qperret, sebastianene, keirf,
	smostafa, tabba

In pKVM, a race condition can occur if a guest updates its VBAR_EL1
register and, before a vCPU exit synchronizes this change, the
hypervisor needs to inject an undefined exception into a protected
guest.

In this scenario, the vCPU still holds the stale VBAR_EL1 value from
before the guest's update. When pKVM injects the exception, it ends up
using the stale value.

Explicitly read the live value of VBAR_EL1 from the guest and update the
vCPU value immediately before pending the exception. This ensures the
vCPU's value is the same as the guest's and that the exception will be
handled at the correct address upon resuming the guest.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/sys_regs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index bbd60013cf9e..b34b10be1ad7 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -253,6 +253,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
 
 	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
 	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
+	vcpu_write_sys_reg(vcpu, read_sysreg_el1(SYS_VBAR), VBAR_EL1);
 
 	kvm_pend_exception(vcpu, EXCEPT_AA64_EL1_SYNC);
 
-- 
2.50.1.565.gc32cd1483b-goog



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v1 4/4] arm64: vgic-v2: Fix guest endianness check in hVHE mode
  2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
                   ` (2 preceding siblings ...)
  2025-08-05 13:56 ` [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception Fuad Tabba
@ 2025-08-05 13:56 ` Fuad Tabba
  2025-08-05 14:39 ` [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Will Deacon
  4 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 13:56 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, will, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, vdonnefort, qperret, sebastianene, keirf,
	smostafa, tabba

In hVHE when running at the hypervisor, SCTLR_EL1 refers to the
hypervisor's System Control Register rather than the guest's. Make sure
to access the guest's register to determine its endianness.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
index 87a54375bd6e..78579b31a420 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
@@ -20,7 +20,7 @@ static bool __is_be(struct kvm_vcpu *vcpu)
 	if (vcpu_mode_is_32bit(vcpu))
 		return !!(read_sysreg_el2(SYS_SPSR) & PSR_AA32_E_BIT);
 
-	return !!(read_sysreg(SCTLR_EL1) & SCTLR_ELx_EE);
+	return !!(read_sysreg_el1(SYS_SCTLR) & SCTLR_ELx_EE);
 }
 
 /*
-- 
2.50.1.565.gc32cd1483b-goog



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
  2025-08-05 13:56 ` [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception Fuad Tabba
@ 2025-08-05 14:35   ` Will Deacon
  2025-08-05 15:25     ` Fuad Tabba
  2025-08-05 18:41   ` Marc Zyngier
  1 sibling, 1 reply; 13+ messages in thread
From: Will Deacon @ 2025-08-05 14:35 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

On Tue, Aug 05, 2025 at 02:56:16PM +0100, Fuad Tabba wrote:
> In pKVM, a race condition can occur if a guest updates its VBAR_EL1
> register and, before a vCPU exit synchronizes this change, the
> hypervisor needs to inject an undefined exception into a protected
> guest.
> 
> In this scenario, the vCPU still holds the stale VBAR_EL1 value from
> before the guest's update. When pKVM injects the exception, it ends up
> using the stale value.
> 
> Explicitly read the live value of VBAR_EL1 from the guest and update the
> vCPU value immediately before pending the exception. This ensures the
> vCPU's value is the same as the guest's and that the exception will be
> handled at the correct address upon resuming the guest.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> index bbd60013cf9e..b34b10be1ad7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -253,6 +253,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
>  
>  	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
>  	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +	vcpu_write_sys_reg(vcpu, read_sysreg_el1(SYS_VBAR), VBAR_EL1);

You say "in pKVM" in the commit message, but why isn't this applicable
to !VHE in general?

Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  2025-08-05 13:56 ` [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code Fuad Tabba
@ 2025-08-05 14:38   ` Will Deacon
  2025-08-05 15:51     ` Fuad Tabba
  0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2025-08-05 14:38 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

On Tue, Aug 05, 2025 at 02:56:15PM +0100, Fuad Tabba wrote:
> Allow vcpu_{read,write}_sys_reg() to be called from EL2. This makes it
> possible for hyp to use existing helper functions to access the vCPU
> context.
> 
> No functional change intended.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h | 184 +++++++++++++++++++++++++++
>  arch/arm64/include/asm/kvm_host.h    |   3 -
>  arch/arm64/kvm/sys_regs.c            | 184 ---------------------------
>  3 files changed, 184 insertions(+), 187 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 0720898f563e..1f449ef4564c 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -224,6 +224,190 @@ static inline bool vcpu_is_host_el0(const struct kvm_vcpu *vcpu)
>  	return is_hyp_ctxt(vcpu) && !vcpu_is_el2(vcpu);
>  }
>  
> +#define PURE_EL2_SYSREG(el2)						\
> +	case el2: {							\
> +		*el1r = el2;						\
> +		return true;						\
> +	}
> +
> +#define MAPPED_EL2_SYSREG(el2, el1, fn)					\
> +	case el2: {							\
> +		*xlate = fn;						\
> +		*el1r = el1;						\
> +		return true;						\
> +	}
> +
> +static bool get_el2_to_el1_mapping(unsigned int reg,
> +				   unsigned int *el1r, u64 (**xlate)(u64))
> +{

I guess this needs to be 'inline' too, but...

> +static inline u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
> +{
> +	u64 val = 0x8badf00d8badf00d;
> +	u64 (*xlate)(u64) = NULL;
> +	unsigned int el1r;
> +
> +	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
> +		goto memory_read;
> +
> +	if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
> +		if (!is_hyp_ctxt(vcpu))
> +			goto memory_read;
> +
> +		/*
> +		 * CNTHCTL_EL2 requires some special treatment to
> +		 * account for the bits that can be set via CNTKCTL_EL1.
> +		 */
> +		switch (reg) {
> +		case CNTHCTL_EL2:
> +			if (vcpu_el2_e2h_is_set(vcpu)) {
> +				val = read_sysreg_el1(SYS_CNTKCTL);
> +				val &= CNTKCTL_VALID_BITS;
> +				val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
> +				return val;
> +			}
> +			break;
> +		}
> +
> +		/*
> +		 * If this register does not have an EL1 counterpart,
> +		 * then read the stored EL2 version.
> +		 */
> +		if (reg == el1r)
> +			goto memory_read;
> +
> +		/*
> +		 * If we have a non-VHE guest and that the sysreg
> +		 * requires translation to be used at EL1, use the
> +		 * in-memory copy instead.
> +		 */
> +		if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
> +			goto memory_read;
> +
> +		/* Get the current version of the EL1 counterpart. */
> +		WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
> +		if (reg >= __SANITISED_REG_START__)
> +			val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
> +
> +		return val;
> +	}
> +
> +	/* EL1 register can't be on the CPU if the guest is in vEL2. */
> +	if (unlikely(is_hyp_ctxt(vcpu)))
> +		goto memory_read;
> +
> +	if (__vcpu_read_sys_reg_from_cpu(reg, &val))
> +		return val;
> +
> +memory_read:
> +	return __vcpu_sys_reg(vcpu, reg);
> +}

... isn't this now pretty huge to be inlining? Similarly for the write
accessor. What does the bloat-o-meter script say?

Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs
  2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
                   ` (3 preceding siblings ...)
  2025-08-05 13:56 ` [PATCH v1 4/4] arm64: vgic-v2: Fix guest endianness check in hVHE mode Fuad Tabba
@ 2025-08-05 14:39 ` Will Deacon
  2025-08-05 15:20   ` Fuad Tabba
  4 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2025-08-05 14:39 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

Hi Fuad,

On Tue, Aug 05, 2025 at 02:56:13PM +0100, Fuad Tabba wrote:
> This patch series is mainly about fixing issues we've encountered in
> pKVM (in downstream Android code), related to the handling of protected
> VM access to restricted registers and injecting undefined exceptions
> into a protected guest.
> 
> The last patch isn't pKVM specific, but a fix to the vgic-v2 code
> encountered while fixing the issues in this series. The issue it fixes
> was indirectly introduced into the code with hVHE.
> 
> Based on Linux 6.16.

Thanks for putting this together so quickly.

I left some small comments on patches 2 and 3 but, for the sake of
transparency (I already mentioned the GIC thing to Marc), please can
you add:

Reported-by: Will Deacon <will@kernel.org>

on patches 1 and 4?

Cheers,

Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs
  2025-08-05 14:39 ` [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Will Deacon
@ 2025-08-05 15:20   ` Fuad Tabba
  0 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 15:20 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

Hi Will,

On Tue, 5 Aug 2025 at 15:39, Will Deacon <will@kernel.org> wrote:
>
> Hi Fuad,
>
> On Tue, Aug 05, 2025 at 02:56:13PM +0100, Fuad Tabba wrote:
> > This patch series is mainly about fixing issues we've encountered in
> > pKVM (in downstream Android code), related to the handling of protected
> > VM access to restricted registers and injecting undefined exceptions
> > into a protected guest.
> >
> > The last patch isn't pKVM specific, but a fix to the vgic-v2 code
> > encountered while fixing the issues in this series. The issue it fixes
> > was indirectly introduced into the code with hVHE.
> >
> > Based on Linux 6.16.
>
> Thanks for putting this together so quickly.
>
> I left some small comments on patches 2 and 3 but, for the sake of
> transparency (I already mentioned the GIC thing to Marc), please can
> you add:
>
> Reported-by: Will Deacon <will@kernel.org>
>
> on patches 1 and 4?

Will do.

Cheers,
/fuad

>
> Cheers,
>
> Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
  2025-08-05 14:35   ` Will Deacon
@ 2025-08-05 15:25     ` Fuad Tabba
  0 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 15:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

Hi Will,

On Tue, 5 Aug 2025 at 15:35, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Aug 05, 2025 at 02:56:16PM +0100, Fuad Tabba wrote:
> > In pKVM, a race condition can occur if a guest updates its VBAR_EL1
> > register and, before a vCPU exit synchronizes this change, the
> > hypervisor needs to inject an undefined exception into a protected
> > guest.
> >
> > In this scenario, the vCPU still holds the stale VBAR_EL1 value from
> > before the guest's update. When pKVM injects the exception, it ends up
> > using the stale value.
> >
> > Explicitly read the live value of VBAR_EL1 from the guest and update the
> > vCPU value immediately before pending the exception. This ensures the
> > vCPU's value is the same as the guest's and that the exception will be
> > handled at the correct address upon resuming the guest.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > index bbd60013cf9e..b34b10be1ad7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -253,6 +253,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
> >
> >       *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> >       *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +     vcpu_write_sys_reg(vcpu, read_sysreg_el1(SYS_VBAR), VBAR_EL1);
>
> You say "in pKVM" in the commit message, but why isn't this applicable
> to !VHE in general?

Because this function is only used in pKVM. I guess there's nothing
stopping anyone from using it in other non-VHE code in the future
though

Cheers,
/fuad

> Will


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
  2025-08-05 14:38   ` Will Deacon
@ 2025-08-05 15:51     ` Fuad Tabba
  0 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 15:51 UTC (permalink / raw)
  To: Will Deacon
  Cc: kvmarm, linux-arm-kernel, maz, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

Hi Will,

On Tue, 5 Aug 2025 at 15:39, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Aug 05, 2025 at 02:56:15PM +0100, Fuad Tabba wrote:
> > Allow vcpu_{read,write}_sys_reg() to be called from EL2. This makes it
> > possible for hyp to use existing helper functions to access the vCPU
> > context.
> >
> > No functional change intended.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_emulate.h | 184 +++++++++++++++++++++++++++
> >  arch/arm64/include/asm/kvm_host.h    |   3 -
> >  arch/arm64/kvm/sys_regs.c            | 184 ---------------------------
> >  3 files changed, 184 insertions(+), 187 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > index 0720898f563e..1f449ef4564c 100644
> > --- a/arch/arm64/include/asm/kvm_emulate.h
> > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > @@ -224,6 +224,190 @@ static inline bool vcpu_is_host_el0(const struct kvm_vcpu *vcpu)
> >       return is_hyp_ctxt(vcpu) && !vcpu_is_el2(vcpu);
> >  }
> >
> > +#define PURE_EL2_SYSREG(el2)                                         \
> > +     case el2: {                                                     \
> > +             *el1r = el2;                                            \
> > +             return true;                                            \
> > +     }
> > +
> > +#define MAPPED_EL2_SYSREG(el2, el1, fn)                                      \
> > +     case el2: {                                                     \
> > +             *xlate = fn;                                            \
> > +             *el1r = el1;                                            \
> > +             return true;                                            \
> > +     }
> > +
> > +static bool get_el2_to_el1_mapping(unsigned int reg,
> > +                                unsigned int *el1r, u64 (**xlate)(u64))
> > +{
>
> I guess this needs to be 'inline' too, but...
>
> > +static inline u64 vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
> > +{
> > +     u64 val = 0x8badf00d8badf00d;
> > +     u64 (*xlate)(u64) = NULL;
> > +     unsigned int el1r;
> > +
> > +     if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
> > +             goto memory_read;
> > +
> > +     if (unlikely(get_el2_to_el1_mapping(reg, &el1r, &xlate))) {
> > +             if (!is_hyp_ctxt(vcpu))
> > +                     goto memory_read;
> > +
> > +             /*
> > +              * CNTHCTL_EL2 requires some special treatment to
> > +              * account for the bits that can be set via CNTKCTL_EL1.
> > +              */
> > +             switch (reg) {
> > +             case CNTHCTL_EL2:
> > +                     if (vcpu_el2_e2h_is_set(vcpu)) {
> > +                             val = read_sysreg_el1(SYS_CNTKCTL);
> > +                             val &= CNTKCTL_VALID_BITS;
> > +                             val |= __vcpu_sys_reg(vcpu, reg) & ~CNTKCTL_VALID_BITS;
> > +                             return val;
> > +                     }
> > +                     break;
> > +             }
> > +
> > +             /*
> > +              * If this register does not have an EL1 counterpart,
> > +              * then read the stored EL2 version.
> > +              */
> > +             if (reg == el1r)
> > +                     goto memory_read;
> > +
> > +             /*
> > +              * If we have a non-VHE guest and that the sysreg
> > +              * requires translation to be used at EL1, use the
> > +              * in-memory copy instead.
> > +              */
> > +             if (!vcpu_el2_e2h_is_set(vcpu) && xlate)
> > +                     goto memory_read;
> > +
> > +             /* Get the current version of the EL1 counterpart. */
> > +             WARN_ON(!__vcpu_read_sys_reg_from_cpu(el1r, &val));
> > +             if (reg >= __SANITISED_REG_START__)
> > +                     val = kvm_vcpu_apply_reg_masks(vcpu, reg, val);
> > +
> > +             return val;
> > +     }
> > +
> > +     /* EL1 register can't be on the CPU if the guest is in vEL2. */
> > +     if (unlikely(is_hyp_ctxt(vcpu)))
> > +             goto memory_read;
> > +
> > +     if (__vcpu_read_sys_reg_from_cpu(reg, &val))
> > +             return val;
> > +
> > +memory_read:
> > +     return __vcpu_sys_reg(vcpu, reg);
> > +}
>
> ... isn't this now pretty huge to be inlining? Similarly for the write
> accessor. What does the bloat-o-meter script say?

In terms of the bloat-o-meter script:

add/remove: 4/4 grow/shrink: 27/5 up/down: 16720/-108 (16612)
Function                                     old     new   delta
vcpu_read_sys_reg                            648    4132   +3484
__vcpu_read_sys_reg_from_cpu                 428    2112   +1684
vcpu_write_sys_reg                           672    2016   +1344
get_el2_to_el1_mapping                       404    1616   +1212
is_hyp_ctxt                                  216    1256   +1040
__vcpu_write_sys_reg_to_cpu                  420    1232    +812
vcpu_el2_e2h_is_set                          296    1032    +736
...
Total: Before=33050106, After=33066718, chg +0.05%

Looking more carefully at the code, I should be using __vcpu_sys_reg()
and __vcpu_assign_sys_reg() directly, since we're !VHE. I'll fix it
when I respin.

Thanks!
/fuad

> Will
>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
  2025-08-05 13:56 ` [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception Fuad Tabba
  2025-08-05 14:35   ` Will Deacon
@ 2025-08-05 18:41   ` Marc Zyngier
  2025-08-05 18:43     ` Fuad Tabba
  1 sibling, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2025-08-05 18:41 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, linux-arm-kernel, oliver.upton, will, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

On Tue, 05 Aug 2025 14:56:16 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> In pKVM, a race condition can occur if a guest updates its VBAR_EL1
> register and, before a vCPU exit synchronizes this change, the
> hypervisor needs to inject an undefined exception into a protected
> guest.
> 
> In this scenario, the vCPU still holds the stale VBAR_EL1 value from
> before the guest's update. When pKVM injects the exception, it ends up
> using the stale value.
> 
> Explicitly read the live value of VBAR_EL1 from the guest and update the
> vCPU value immediately before pending the exception. This ensures the
> vCPU's value is the same as the guest's and that the exception will be
> handled at the correct address upon resuming the guest.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> index bbd60013cf9e..b34b10be1ad7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -253,6 +253,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
>  
>  	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
>  	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +	vcpu_write_sys_reg(vcpu, read_sysreg_el1(SYS_VBAR), VBAR_EL1);
>  
>  	kvm_pend_exception(vcpu, EXCEPT_AA64_EL1_SYNC);
>  

There is something I don't understand. vcpu_write_sys_reg() is only
useful if you make use of the SYSREGS_ON_CPU flag. Which is only
driven by the VHE code (in arch/arm64/kvm/hyp/vhe/sysreg-sr.c).

As a consequence, this only writes to memory, since the flag is always
false, and we take the following path:

static inline void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
	u64 (*xlate)(u64) = NULL;
	unsigned int el1r;

	if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
		goto memory_write;

[...]
memory_write:
	__vcpu_assign_sys_reg(vcpu, reg, val);
}

My conclusion so far is that you only ever need to write to the shadow
view of the register, and that the previous patch serves no purpose.

Am I missing anything?

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception
  2025-08-05 18:41   ` Marc Zyngier
@ 2025-08-05 18:43     ` Fuad Tabba
  0 siblings, 0 replies; 13+ messages in thread
From: Fuad Tabba @ 2025-08-05 18:43 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, linux-arm-kernel, oliver.upton, will, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, vdonnefort, qperret,
	sebastianene, keirf, smostafa

Hi Marc,

On Tue, 5 Aug 2025 at 19:41, Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 05 Aug 2025 14:56:16 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > In pKVM, a race condition can occur if a guest updates its VBAR_EL1
> > register and, before a vCPU exit synchronizes this change, the
> > hypervisor needs to inject an undefined exception into a protected
> > guest.
> >
> > In this scenario, the vCPU still holds the stale VBAR_EL1 value from
> > before the guest's update. When pKVM injects the exception, it ends up
> > using the stale value.
> >
> > Explicitly read the live value of VBAR_EL1 from the guest and update the
> > vCPU value immediately before pending the exception. This ensures the
> > vCPU's value is the same as the guest's and that the exception will be
> > handled at the correct address upon resuming the guest.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > index bbd60013cf9e..b34b10be1ad7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -253,6 +253,7 @@ static void inject_undef64(struct kvm_vcpu *vcpu)
> >
> >       *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> >       *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +     vcpu_write_sys_reg(vcpu, read_sysreg_el1(SYS_VBAR), VBAR_EL1);
> >
> >       kvm_pend_exception(vcpu, EXCEPT_AA64_EL1_SYNC);
> >
>
> There is something I don't understand. vcpu_write_sys_reg() is only
> useful if you make use of the SYSREGS_ON_CPU flag. Which is only
> driven by the VHE code (in arch/arm64/kvm/hyp/vhe/sysreg-sr.c).
>
> As a consequence, this only writes to memory, since the flag is always
> false, and we take the following path:
>
> static inline void vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
> {
>         u64 (*xlate)(u64) = NULL;
>         unsigned int el1r;
>
>         if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU))
>                 goto memory_write;
>
> [...]
> memory_write:
>         __vcpu_assign_sys_reg(vcpu, reg, val);
> }
>
> My conclusion so far is that you only ever need to write to the shadow
> view of the register, and that the previous patch serves no purpose.
>
> Am I missing anything?

No, you're not. I realized that when I replied to Will (patch 2). I'll
drop that patch and fix this when I repin.

Cheers,
/fuad

/fuad

>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-08-05 18:47 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-05 13:56 [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Fuad Tabba
2025-08-05 13:56 ` [PATCH v1 1/4] KVM: arm64: Handle AIDR_EL1 and REVIDR_EL1 in host " Fuad Tabba
2025-08-05 13:56 ` [PATCH v1 2/4] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code Fuad Tabba
2025-08-05 14:38   ` Will Deacon
2025-08-05 15:51     ` Fuad Tabba
2025-08-05 13:56 ` [PATCH v1 3/4] KVM: arm64: Sync protected guest VBAR_EL1 on injecting an undef exception Fuad Tabba
2025-08-05 14:35   ` Will Deacon
2025-08-05 15:25     ` Fuad Tabba
2025-08-05 18:41   ` Marc Zyngier
2025-08-05 18:43     ` Fuad Tabba
2025-08-05 13:56 ` [PATCH v1 4/4] arm64: vgic-v2: Fix guest endianness check in hVHE mode Fuad Tabba
2025-08-05 14:39 ` [PATCH v1 0/4] KVM: arm64: Fixes to handling of restricted registers for protected VMs Will Deacon
2025-08-05 15:20   ` Fuad Tabba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).