public inbox for kvm@vger.kernel.org
* [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support
@ 2026-03-25  0:36 Wei-Lin Chang
  2026-03-25  0:36 ` [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV Wei-Lin Chang
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-25  0:36 UTC (permalink / raw)
  To: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

Hi,

This series adds basic support for running nested guests (L2) in
kselftest. The first patch adds library functions. While designing the
APIs for userspace, I referenced Joey's approach for kvm-unit-tests [1].
In summary, four preparatory functions are provided for userspace to
set up state to run an L2 in EL1:

 - prepare_l2_stack()            <- sets up stack for L2
 - prepare_hyp_state()           <- sets up vEL2 registers
 - prepare_eret_destination()    <- userspace passes a function pointer
                                    for L2 to run
 - prepare_nested_sync_handler() <- sets up hvc handler in order to
                                    regain control after L2's hvc

After calling those functions, userspace can vcpu_run(), and when
run_l2() is called within the guest, the supplied function will be run
in L2, with the control flow managed by the library code in nested.c and
nested_asm.S. After running the L2 function, run_l2() will automatically
return. Note that the L2 function supplied by the user does not have to
call hvc.
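
The control flow described above can be modelled in plain C. This is
only a host-side sketch for illustration, not code from the series: the
enum, struct cpu_model, model_step() and model_run_l2() are hypothetical
names, and registers are reduced to "which piece of code comes next".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the pieces of code involved in one run_l2(). */
enum site { CALLER, RUN_L2, L2_FUNC, DO_HVC, HVC_HANDLER, AFTER_HVC };

/* Toy CPU state: registers hold a "site" instead of a real address. */
struct cpu_model {
	int el;                       /* 2 = vEL2 (L1), 1 = EL1 (L2) */
	enum site pc, lr;
	enum site elr_el2, tpidr_el2; /* set up before vcpu_run() */
	enum site saved_lr;           /* stands in for the stack slot */
};

/* Execute the code at m->pc once and return the site that ran. */
static enum site model_step(struct cpu_model *m)
{
	enum site executed = m->pc;

	switch (executed) {
	case RUN_L2:
		m->saved_lr = m->lr;   /* stp x29, lr, [sp, #-16]! */
		m->lr = m->tpidr_el2;  /* mrs x0, tpidr_el2; mov lr, x0 */
		m->el = 1;             /* eret drops to EL1 ... */
		m->pc = m->elr_el2;    /* ... at the user-supplied function */
		break;
	case L2_FUNC:
		m->pc = m->lr;         /* a plain ret lands in do_hvc */
		break;
	case DO_HVC:
		m->el = 2;             /* hvc traps back to vEL2 */
		m->pc = HVC_HANDLER;
		break;
	case HVC_HANDLER:
		m->pc = AFTER_HVC;     /* handler rewrites the return pc */
		break;
	case AFTER_HVC:
		m->lr = m->saved_lr;   /* ldp x29, lr, [sp], #16 */
		m->pc = m->lr;         /* ret to the caller of run_l2() */
		break;
	case CALLER:
		break;
	}
	return executed;
}

/* Model one call to run_l2(); records the visited sites in trace[]. */
static size_t model_run_l2(struct cpu_model *m, enum site *trace, size_t max)
{
	size_t n = 0;

	assert(m->el == 2);            /* run_l2() is called from vEL2 */
	m->lr = CALLER;                /* what "bl run_l2" leaves in lr */
	m->pc = RUN_L2;
	while (m->pc != CALLER && n < max)
		trace[n++] = model_step(m);
	return n;
}
```

Starting from a model with el = 2, elr_el2 = L2_FUNC and tpidr_el2 =
DO_HVC, the trace visits RUN_L2, L2_FUNC, DO_HVC, HVC_HANDLER and
AFTER_HVC in that order and ends back in the caller at vEL2.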

Patch 2 demonstrates usage of the APIs introduced above, with a simple
L1 -> L2 -> L1 sequence, with an empty L2 function.

Patch 3 enhances the library functions by setting up L2 -> L1 stage-2
translation. Currently the translation is simple: start level 0, 4
levels, 4KB granules, normal cacheable, 48-bit IA, 40-bit OA.
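
To make the 4-level, 4KB-granule layout concrete, here is a
self-contained host-side sketch of such a walk. It is an illustration
under stated assumptions, not the library code: toy_mem, toy_alloc_table(),
toy_s2_map() and toy_s2_translate() are hypothetical, tables live in a
small static pool whose "physical address" is the pool index shifted by
12, and memory attributes are omitted from the descriptors.

```c
#include <assert.h>
#include <stdint.h>

#define S2_LEVELS      4
#define S2_PAGE_SHIFT  12
#define S2_IDX_BITS    9
#define S2_DESC_VALID  UINT64_C(1)                  /* bit 0 */
#define S2_DESC_TABLE  UINT64_C(3)                  /* levels 0-2: table */
#define S2_DESC_PAGE   UINT64_C(3)                  /* level 3: page */
#define S2_ADDR_MASK   UINT64_C(0x0000fffffffff000) /* bits [47:12] */

/* Toy "physical memory" holding page tables only (assumption). */
#define TOY_MAX_TABLES 16
static uint64_t toy_mem[TOY_MAX_TABLES][512];
static unsigned int toy_next_table;

/* Hand out the next zeroed table and return its toy physical address. */
static uint64_t toy_alloc_table(void)
{
	assert(toy_next_table < TOY_MAX_TABLES);
	return (uint64_t)toy_next_table++ << S2_PAGE_SHIFT;
}

static uint64_t *toy_table(uint64_t pa)
{
	return toy_mem[pa >> S2_PAGE_SHIFT];
}

/* Table index for `ipa` at `level`: shifts are 39, 30, 21 and 12. */
static unsigned int s2_index(uint64_t ipa, int level)
{
	int shift = S2_PAGE_SHIFT + S2_IDX_BITS * (S2_LEVELS - 1 - level);

	return (ipa >> shift) & 0x1ff;
}

/* Map one 4KB page, allocating intermediate tables on demand. */
static void toy_s2_map(uint64_t root, uint64_t ipa, uint64_t pa)
{
	uint64_t table = root;
	uint64_t *desc;
	int level;

	assert((ipa & 0xfff) == 0 && (pa & 0xfff) == 0);

	for (level = 0; level < S2_LEVELS - 1; level++) {
		desc = &toy_table(table)[s2_index(ipa, level)];
		if (!*desc)
			*desc = toy_alloc_table() | S2_DESC_TABLE;
		table = *desc & S2_ADDR_MASK;
	}
	toy_table(table)[s2_index(ipa, 3)] = (pa & S2_ADDR_MASK) | S2_DESC_PAGE;
}

/* Walk the tables; returns 0 and fills *pa, or -1 if unmapped. */
static int toy_s2_translate(uint64_t root, uint64_t ipa, uint64_t *pa)
{
	uint64_t table = root, desc;
	int level;

	for (level = 0; level < S2_LEVELS; level++) {
		desc = toy_table(table)[s2_index(ipa, level)];
		if (!(desc & S2_DESC_VALID))
			return -1;
		table = desc & S2_ADDR_MASK;
	}
	*pa = table | (ipa & 0xfff);
	return 0;
}
```

With an identity mapping installed for a page, translating any address
inside that page yields the same address, while a neighbouring unmapped
page reports a fault.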

[1]: https://lore.kernel.org/kvmarm/20260306142656.2775185-1-joey.gouly@arm.com/

Wei-Lin Chang (3):
  KVM: arm64: selftests: Add library functions for NV
  KVM: arm64: selftests: Add basic NV selftest
  KVM: arm64: selftests: Enable stage-2 in NV preparation functions

 tools/testing/selftests/kvm/Makefile.kvm      |   3 +
 .../selftests/kvm/arm64/hello_nested.c        |  65 ++++++++
 .../selftests/kvm/include/arm64/nested.h      |  25 +++
 .../selftests/kvm/include/arm64/processor.h   |   9 +
 .../testing/selftests/kvm/lib/arm64/nested.c  | 154 ++++++++++++++++++
 .../selftests/kvm/lib/arm64/nested_asm.S      |  35 ++++
 6 files changed, 291 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/hello_nested.c
 create mode 100644 tools/testing/selftests/kvm/include/arm64/nested.h
 create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested.c
 create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested_asm.S

-- 
2.43.0



* [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV
  2026-03-25  0:36 [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Wei-Lin Chang
@ 2026-03-25  0:36 ` Wei-Lin Chang
  2026-03-25  9:03   ` Marc Zyngier
  2026-03-25  0:36 ` [PATCH 2/3] KVM: arm64: selftests: Add basic NV selftest Wei-Lin Chang
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-25  0:36 UTC (permalink / raw)
  To: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

The API is designed for userspace to first call prepare_{l2_stack,
hyp_state, eret_destination, nested_sync_handler}, with a function
supplied to prepare_eret_destination() to be run in L2. Then run_l2()
can be called in L1 to run the given function in L2.

Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
 tools/testing/selftests/kvm/Makefile.kvm      |  2 +
 .../selftests/kvm/include/arm64/nested.h      | 18 ++++++
 .../testing/selftests/kvm/lib/arm64/nested.c  | 61 +++++++++++++++++++
 .../selftests/kvm/lib/arm64/nested_asm.S      | 35 +++++++++++
 4 files changed, 116 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/include/arm64/nested.h
 create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested.c
 create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested_asm.S

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 98da9fa4b8b7..5e681e8e0cd7 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -34,6 +34,8 @@ LIBKVM_arm64 += lib/arm64/gic.c
 LIBKVM_arm64 += lib/arm64/gic_v3.c
 LIBKVM_arm64 += lib/arm64/gic_v3_its.c
 LIBKVM_arm64 += lib/arm64/handlers.S
+LIBKVM_arm64 += lib/arm64/nested.c
+LIBKVM_arm64 += lib/arm64/nested_asm.S
 LIBKVM_arm64 += lib/arm64/processor.c
 LIBKVM_arm64 += lib/arm64/spinlock.c
 LIBKVM_arm64 += lib/arm64/ucall.c
diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
new file mode 100644
index 000000000000..739ff2ee0161
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/arm64/nested.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * ARM64 Nested virtualization defines
+ */
+
+#ifndef SELFTEST_KVM_NESTED_H
+#define SELFTEST_KVM_NESTED_H
+
+void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
+void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
+void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
+void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
+
+void run_l2(void);
+void after_hvc(void);
+void do_hvc(void);
+
+#endif /* SELFTEST_KVM_NESTED_H */
diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
new file mode 100644
index 000000000000..111d02f44cfe
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM64 Nested virtualization helpers
+ */
+
+#include "kvm_util.h"
+#include "nested.h"
+#include "processor.h"
+#include "test_util.h"
+
+#include <asm/sysreg.h>
+
+static void hvc_handler(struct ex_regs *regs)
+{
+	GUEST_ASSERT_EQ(get_current_el(), 2);
+	GUEST_PRINTF("hvc handler\n");
+	regs->pstate = PSR_MODE_EL2h | PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT;
+	regs->pc = (u64)after_hvc;
+}
+
+void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
+{
+	size_t l2_stack_size;
+	uint64_t l2_stack_paddr;
+
+	l2_stack_size = vm->page_size == 4096 ? DEFAULT_STACK_PGS * vm->page_size :
+					 vm->page_size;
+	l2_stack_paddr = __vm_phy_pages_alloc(vm, l2_stack_size / vm->page_size,
+					      0, 0, false);
+	vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), l2_stack_paddr + l2_stack_size);
+}
+
+void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
+{
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);
+}
+
+void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
+{
+	vm_paddr_t do_hvc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)do_hvc);
+	vm_paddr_t l2_pc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)l2_pc);
+
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SPSR_EL2), PSR_MODE_EL1h |
+							    PSR_D_BIT     |
+							    PSR_A_BIT     |
+							    PSR_I_BIT     |
+							    PSR_F_BIT);
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ELR_EL2), l2_pc_paddr);
+	/* HACK: use TPIDR_EL2 to pass address, see run_l2() in nested_asm.S */
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TPIDR_EL2), do_hvc_paddr);
+}
+
+void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
+{
+	if (!vm->handlers) {
+		vm_init_descriptor_tables(vm);
+		vcpu_init_descriptor_tables(vcpu);
+	}
+	vm_install_sync_handler(vm, VECTOR_SYNC_LOWER_64,
+				ESR_ELx_EC_HVC64, hvc_handler);
+}
diff --git a/tools/testing/selftests/kvm/lib/arm64/nested_asm.S b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
new file mode 100644
index 000000000000..4ecf2d510a6f
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * ARM64 Nested virtualization assembly helpers
+ */
+
+.globl run_l2
+.globl after_hvc
+.globl do_hvc
+run_l2:
+	/*
+	 * At this point TPIDR_EL2 contains the gpa of do_hvc, set up by
+	 * prepare_eret_destination(). The gpa of do_hvc has to be passed in
+	 * because we want L2 to issue an hvc after it returns from the
+	 * user-supplied function. For that to happen the lr must be
+	 * controlled; at this point it holds the address of the instruction
+	 * following this run_l2() call, which is not useful for L2.
+	 * Additionally, L1 can't translate a gva into a gpa, so we can't
+	 * calculate it here.
+	 *
+	 * So first save lr, then move TPIDR_EL2 to lr so that when the
+	 * user-supplied L2 function returns, L2 jumps to do_hvc and lets the
+	 * L1 hvc handler take control. This implies we expect the L2 code to
+	 * preserve lr and end with a regular ret, which is true for normal C
+	 * functions. The hvc handler will jump back to after_hvc when
+	 * finished, where lr is restored and we can return from run_l2().
+	 */
+	stp	x29, lr, [sp, #-16]!
+	mrs	x0, tpidr_el2
+	mov	lr, x0
+	eret
+after_hvc:
+	ldp	x29, lr, [sp], #16
+	ret
+do_hvc:
+	hvc #0
-- 
2.43.0



* [PATCH 2/3] KVM: arm64: selftests: Add basic NV selftest
  2026-03-25  0:36 [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Wei-Lin Chang
  2026-03-25  0:36 ` [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV Wei-Lin Chang
@ 2026-03-25  0:36 ` Wei-Lin Chang
  2026-03-25  0:36 ` [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions Wei-Lin Chang
  2026-03-25  8:00 ` [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Itaru Kitayama
  3 siblings, 0 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-25  0:36 UTC (permalink / raw)
  To: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

Add a simple NV selftest that uses the NV library functions to eret from
vEL2 to EL1, then issues an hvc to return to vEL2.

Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
 tools/testing/selftests/kvm/Makefile.kvm      |  1 +
 .../selftests/kvm/arm64/hello_nested.c        | 65 +++++++++++++++++++
 2 files changed, 66 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/hello_nested.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 5e681e8e0cd7..d7499609cd0c 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -167,6 +167,7 @@ TEST_GEN_PROGS_arm64 += arm64/arch_timer_edge_cases
 TEST_GEN_PROGS_arm64 += arm64/at
 TEST_GEN_PROGS_arm64 += arm64/debug-exceptions
 TEST_GEN_PROGS_arm64 += arm64/hello_el2
+TEST_GEN_PROGS_arm64 += arm64/hello_nested
 TEST_GEN_PROGS_arm64 += arm64/host_sve
 TEST_GEN_PROGS_arm64 += arm64/hypercalls
 TEST_GEN_PROGS_arm64 += arm64/external_aborts
diff --git a/tools/testing/selftests/kvm/arm64/hello_nested.c b/tools/testing/selftests/kvm/arm64/hello_nested.c
new file mode 100644
index 000000000000..16c600539810
--- /dev/null
+++ b/tools/testing/selftests/kvm/arm64/hello_nested.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * hello_nested - Go from vEL2 to EL1 then back
+ */
+#include "kvm_util.h"
+#include "nested.h"
+#include "processor.h"
+#include "test_util.h"
+#include "ucall.h"
+
+static void l2_guest_code(void)
+{
+	/* nothing */
+}
+
+static void guest_code(void)
+{
+	GUEST_ASSERT_EQ(get_current_el(), 2);
+	GUEST_PRINTF("vEL2 entry\n");
+	run_l2();
+	GUEST_DONE();
+}
+
+int main(void)
+{
+	struct kvm_vcpu_init init;
+	struct kvm_vcpu *vcpu;
+	struct kvm_vm *vm;
+	struct ucall uc;
+
+	TEST_REQUIRE(kvm_check_cap(KVM_CAP_ARM_EL2));
+	vm = vm_create(1);
+
+	kvm_get_default_vcpu_target(vm, &init);
+	init.features[0] |= BIT(KVM_ARM_VCPU_HAS_EL2);
+	vcpu = aarch64_vcpu_add(vm, 0, &init, guest_code);
+	kvm_arch_vm_finalize_vcpus(vm);
+
+	prepare_l2_stack(vm, vcpu);
+	prepare_hyp_state(vm, vcpu);
+	prepare_eret_destination(vm, vcpu, l2_guest_code);
+	prepare_nested_sync_handler(vm, vcpu);
+
+	while (true) {
+		vcpu_run(vcpu);
+
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_PRINTF:
+			pr_info("%s", uc.buffer);
+			break;
+		case UCALL_DONE:
+			pr_info("DONE!\n");
+			goto end;
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+			fallthrough;
+		default:
+			TEST_FAIL("Unhandled ucall: %ld\n", uc.cmd);
+		}
+	}
+
+end:
+	kvm_vm_free(vm);
+	return 0;
+}
-- 
2.43.0



* [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions
  2026-03-25  0:36 [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Wei-Lin Chang
  2026-03-25  0:36 ` [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV Wei-Lin Chang
  2026-03-25  0:36 ` [PATCH 2/3] KVM: arm64: selftests: Add basic NV selftest Wei-Lin Chang
@ 2026-03-25  0:36 ` Wei-Lin Chang
  2026-03-25  6:23   ` Itaru Kitayama
  2026-03-25  9:18   ` Marc Zyngier
  2026-03-25  8:00 ` [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Itaru Kitayama
  3 siblings, 2 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-25  0:36 UTC (permalink / raw)
  To: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

Introduce library functions for setting up guest stage-2 page tables,
then use them to give L2 an identity-mapped stage-2 and enable it.

The translation regime and the stage-2 page table built are simple:
start level 0, 4 levels, 4KB granules, normal cacheable, 48-bit IA,
40-bit OA.
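
As a worked example of how those parameters could be packed into a
VTCR_EL2 value, here is a standalone sketch. The SK_* macros and
sketch_vtcr() are hypothetical names local to this example; the field
positions and encodings follow the architectural VTCR_EL2 layout, and
every field not listed (SH0 included) is left at zero.

```c
#include <assert.h>
#include <stdint.h>

/* VTCR_EL2 field positions; SK_* names are local to this sketch. */
#define SK_VTCR_T0SZ_SHIFT	0	/* T0SZ,  bits [5:0]   */
#define SK_VTCR_SL0_SHIFT	6	/* SL0,   bits [7:6]   */
#define SK_VTCR_IRGN0_SHIFT	8	/* IRGN0, bits [9:8]   */
#define SK_VTCR_ORGN0_SHIFT	10	/* ORGN0, bits [11:10] */
#define SK_VTCR_TG0_SHIFT	14	/* TG0,   bits [15:14] */
#define SK_VTCR_PS_SHIFT	16	/* PS,    bits [18:16] */

/*
 * Build a VTCR_EL2 value for: 40-bit OA (PS = 0b010), 4KB granule
 * (TG0 = 0b00), write-back write-allocate walks (0b01), start level 0
 * with 4KB granules (SL0 = 0b10), and an input size of ia_bits.
 */
static uint64_t sketch_vtcr(unsigned int ia_bits)
{
	/* This sketch only covers input sizes up to 48 bits. */
	assert(ia_bits > 0 && ia_bits <= 48);

	return (UINT64_C(2) << SK_VTCR_PS_SHIFT) |
	       (UINT64_C(0) << SK_VTCR_TG0_SHIFT) |
	       (UINT64_C(1) << SK_VTCR_ORGN0_SHIFT) |
	       (UINT64_C(1) << SK_VTCR_IRGN0_SHIFT) |
	       (UINT64_C(2) << SK_VTCR_SL0_SHIFT) |
	       ((uint64_t)(64 - ia_bits) << SK_VTCR_T0SZ_SHIFT);
}
```

For a 48-bit IA, T0SZ is 64 - 48 = 16, so sketch_vtcr(48) evaluates to
0x20590.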

The nested page table code is adapted from lib/x86/vmx.c.

Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
---
 .../selftests/kvm/include/arm64/nested.h      |  7 ++
 .../selftests/kvm/include/arm64/processor.h   |  9 ++
 .../testing/selftests/kvm/lib/arm64/nested.c  | 97 ++++++++++++++++++-
 3 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
index 739ff2ee0161..0be10a775e48 100644
--- a/tools/testing/selftests/kvm/include/arm64/nested.h
+++ b/tools/testing/selftests/kvm/include/arm64/nested.h
@@ -6,6 +6,13 @@
 #ifndef SELFTEST_KVM_NESTED_H
 #define SELFTEST_KVM_NESTED_H
 
+uint64_t get_l1_vtcr(void);
+
+void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
+		uint64_t nested_paddr, uint64_t paddr, uint64_t size);
+void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
+			uint32_t memslot);
+
 void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
 void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
 void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
index ac97a1c436fc..5de2e932d95a 100644
--- a/tools/testing/selftests/kvm/include/arm64/processor.h
+++ b/tools/testing/selftests/kvm/include/arm64/processor.h
@@ -104,6 +104,15 @@
 #define TCR_HA			(UL(1) << 39)
 #define TCR_DS			(UL(1) << 59)
 
+/* VTCR_EL2 specific flags */
+#define VTCR_EL2_T0SZ_BITS(x)	((UL(64) - (x)) << VTCR_EL2_T0SZ_SHIFT)
+
+#define VTCR_EL2_SL0_LV0_4K	(UL(2) << VTCR_EL2_SL0_SHIFT)
+#define VTCR_EL2_SL0_LV1_4K	(UL(1) << VTCR_EL2_SL0_SHIFT)
+#define VTCR_EL2_SL0_LV2_4K	(UL(0) << VTCR_EL2_SL0_SHIFT)
+
+#define VTCR_EL2_PS_40_BITS	(UL(2) << VTCR_EL2_PS_SHIFT)
+
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
  */
diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
index 111d02f44cfe..910f8cd30f96 100644
--- a/tools/testing/selftests/kvm/lib/arm64/nested.c
+++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
@@ -1,8 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * ARM64 Nested virtualization helpers
+ * ARM64 Nested virtualization helpers, nested page table code adapted from
+ * ../x86/vmx.c.
  */
 
+#include <linux/sizes.h>
+
 #include "kvm_util.h"
 #include "nested.h"
 #include "processor.h"
@@ -18,6 +21,87 @@ static void hvc_handler(struct ex_regs *regs)
 	regs->pc = (u64)after_hvc;
 }
 
+uint64_t get_l1_vtcr(void)
+{
+	return VTCR_EL2_PS_40_BITS | VTCR_EL2_TG0_4K | VTCR_EL2_ORGN0_WBWA |
+	       VTCR_EL2_IRGN0_WBWA | VTCR_EL2_SL0_LV0_4K | VTCR_EL2_T0SZ_BITS(48);
+}
+
+static void __nested_pg_map(struct kvm_vm *vm, uint64_t guest_pgd,
+		     uint64_t nested_paddr, uint64_t paddr, uint64_t flags)
+{
+	uint8_t attr_idx = flags & (PTE_ATTRINDX_MASK >> PTE_ATTRINDX_SHIFT);
+	uint64_t pg_attr;
+	uint64_t *ptep;
+
+	TEST_ASSERT((nested_paddr % vm->page_size) == 0,
+		"L2 IPA not on page boundary,\n"
+		"  nested_paddr: 0x%lx vm->page_size: 0x%x", nested_paddr, vm->page_size);
+	TEST_ASSERT((paddr % vm->page_size) == 0,
+		"Guest physical address not on page boundary,\n"
+		"  paddr: 0x%lx vm->page_size: 0x%x", paddr, vm->page_size);
+	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
+		"Physical address beyond maximum supported,\n"
+		"  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
+		paddr, vm->max_gfn, vm->page_size);
+
+	ptep = addr_gpa2hva(vm, guest_pgd) + ((nested_paddr >> 39) & 0x1ffu) * 8;
+	if (!*ptep)
+		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PGD_TYPE_TABLE | PTE_VALID;
+	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 30) & 0x1ffu) * 8;
+	if (!*ptep)
+		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PUD_TYPE_TABLE | PTE_VALID;
+	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 21) & 0x1ffu) * 8;
+	if (!*ptep)
+		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PMD_TYPE_TABLE | PTE_VALID;
+	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 12) & 0x1ffu) * 8;
+
+	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
+	pg_attr |= PTE_SHARED;
+
+	*ptep = (paddr & GENMASK(47, 12)) | pg_attr;
+}
+
+void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
+		uint64_t nested_paddr, uint64_t paddr, uint64_t size)
+{
+	size_t npages = size / SZ_4K;
+
+	TEST_ASSERT(nested_paddr + size > nested_paddr, "Vaddr overflow");
+	TEST_ASSERT(paddr + size > paddr, "Paddr overflow");
+
+	while (npages--) {
+		__nested_pg_map(vm, guest_pgd, nested_paddr, paddr, MT_NORMAL);
+		nested_paddr += SZ_4K;
+		paddr += SZ_4K;
+	}
+}
+
+/*
+ * Prepare an identity shadow page table that maps all the
+ * physical pages in VM.
+ */
+void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
+			uint32_t memslot)
+{
+	sparsebit_idx_t i, last;
+	struct userspace_mem_region *region =
+		memslot2region(vm, memslot);
+
+	i = (region->region.guest_phys_addr >> vm->page_shift) - 1;
+	last = i + (region->region.memory_size >> vm->page_shift);
+	for (;;) {
+		i = sparsebit_next_clear(region->unused_phy_pages, i);
+		if (i > last)
+			break;
+
+		nested_map(vm, guest_pgd,
+			   (uint64_t)i << vm->page_shift,
+			   (uint64_t)i << vm->page_shift,
+			   1 << vm->page_shift);
+	}
+}
+
 void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
 {
 	size_t l2_stack_size;
@@ -32,7 +116,16 @@ void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
 
 void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
 {
-	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);
+	vm_paddr_t guest_pgd;
+
+	guest_pgd = vm_phy_pages_alloc(vm, 1,
+				       KVM_GUEST_PAGE_TABLE_MIN_PADDR,
+				       vm->memslots[MEM_REGION_PT]);
+	nested_map_memslot(vm, guest_pgd, 0);
+
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW | HCR_EL2_VM);
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTTBR_EL2), guest_pgd);
+	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTCR_EL2), get_l1_vtcr());
 }
 
 void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
-- 
2.43.0



* Re: [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions
  2026-03-25  0:36 ` [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions Wei-Lin Chang
@ 2026-03-25  6:23   ` Itaru Kitayama
  2026-03-26 21:34     ` Wei-Lin Chang
  2026-03-25  9:18   ` Marc Zyngier
  1 sibling, 1 reply; 11+ messages in thread
From: Itaru Kitayama @ 2026-03-25  6:23 UTC (permalink / raw)
  To: Wei-Lin Chang
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

Hi Wei Lin,
On Wed, Mar 25, 2026 at 12:36:20AM +0000, Wei-Lin Chang wrote:
> Introduce library functions for setting up guest stage-2 page tables,
> then use that to give L2 an identity mapped stage-2 and enable it.
> 
> The translation and stage-2 page table built is simple, start level 0,
> 4 levels, 4KB granules, normal cachable, 48-bit IA, 40-bit OA.
> 
> The nested page table code is adapted from lib/x86/vmx.c.
> 
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
>  .../selftests/kvm/include/arm64/nested.h      |  7 ++
>  .../selftests/kvm/include/arm64/processor.h   |  9 ++
>  .../testing/selftests/kvm/lib/arm64/nested.c  | 97 ++++++++++++++++++-
>  3 files changed, 111 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> index 739ff2ee0161..0be10a775e48 100644
> --- a/tools/testing/selftests/kvm/include/arm64/nested.h
> +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> @@ -6,6 +6,13 @@
>  #ifndef SELFTEST_KVM_NESTED_H
>  #define SELFTEST_KVM_NESTED_H
>  
> +uint64_t get_l1_vtcr(void);

Wouldn't the u64 type be simpler? Also, since this configures the guest
hypervisor's stage-2 translation table, this gives the impression that
the configuration (IA and OA sizes, etc.) is stored somewhere.

> +
> +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +		uint64_t nested_paddr, uint64_t paddr, uint64_t size);
> +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +			uint32_t memslot);
> +
>  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
>  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
>  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
> index ac97a1c436fc..5de2e932d95a 100644
> --- a/tools/testing/selftests/kvm/include/arm64/processor.h
> +++ b/tools/testing/selftests/kvm/include/arm64/processor.h
> @@ -104,6 +104,15 @@
>  #define TCR_HA			(UL(1) << 39)
>  #define TCR_DS			(UL(1) << 59)
>  
> +/* VTCR_EL2 specific flags */
> +#define VTCR_EL2_T0SZ_BITS(x)	((UL(64) - (x)) << VTCR_EL2_T0SZ_SHIFT)
> +
> +#define VTCR_EL2_SL0_LV0_4K	(UL(2) << VTCR_EL2_SL0_SHIFT)
> +#define VTCR_EL2_SL0_LV1_4K	(UL(1) << VTCR_EL2_SL0_SHIFT)
> +#define VTCR_EL2_SL0_LV2_4K	(UL(0) << VTCR_EL2_SL0_SHIFT)
> +
> +#define VTCR_EL2_PS_40_BITS	(UL(2) << VTCR_EL2_PS_SHIFT)
> +
>  /*
>   * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
>   */
> diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> index 111d02f44cfe..910f8cd30f96 100644
> --- a/tools/testing/selftests/kvm/lib/arm64/nested.c
> +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> @@ -1,8 +1,11 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /*
> - * ARM64 Nested virtualization helpers
> + * ARM64 Nested virtualization helpers, nested page table code adapted from
> + * ../x86/vmx.c.
>   */
>  
> +#include <linux/sizes.h>
> +
>  #include "kvm_util.h"
>  #include "nested.h"
>  #include "processor.h"
> @@ -18,6 +21,87 @@ static void hvc_handler(struct ex_regs *regs)
>  	regs->pc = (u64)after_hvc;
>  }
>  
> +uint64_t get_l1_vtcr(void)
> +{
> +	return VTCR_EL2_PS_40_BITS | VTCR_EL2_TG0_4K | VTCR_EL2_ORGN0_WBWA |
> +	       VTCR_EL2_IRGN0_WBWA | VTCR_EL2_SL0_LV0_4K | VTCR_EL2_T0SZ_BITS(48);
> +}
> +
> +static void __nested_pg_map(struct kvm_vm *vm, uint64_t guest_pgd,
> +		     uint64_t nested_paddr, uint64_t paddr, uint64_t flags)
> +{
> +	uint8_t attr_idx = flags & (PTE_ATTRINDX_MASK >> PTE_ATTRINDX_SHIFT);
> +	uint64_t pg_attr;
> +	uint64_t *ptep;
> +
> +	TEST_ASSERT((nested_paddr % vm->page_size) == 0,
> +		"L2 IPA not on page boundary,\n"
> +		"  nested_paddr: 0x%lx vm->page_size: 0x%x", nested_paddr, vm->page_size);
> +	TEST_ASSERT((paddr % vm->page_size) == 0,
> +		"Guest physical address not on page boundary,\n"
> +		"  paddr: 0x%lx vm->page_size: 0x%x", paddr, vm->page_size);
> +	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
> +		"Physical address beyond maximum supported,\n"
> +		"  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
> +		paddr, vm->max_gfn, vm->page_size);
> +
> +	ptep = addr_gpa2hva(vm, guest_pgd) + ((nested_paddr >> 39) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PGD_TYPE_TABLE | PTE_VALID;

Same here, but given that these are stage-2 translation tables, should
this be KVM_PTE_VALID?

Thanks,
Itaru.
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 30) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PUD_TYPE_TABLE | PTE_VALID;
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 21) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PMD_TYPE_TABLE | PTE_VALID;
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 12) & 0x1ffu) * 8;
> +
> +	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
> +	pg_attr |= PTE_SHARED;
> +
> +	*ptep = (paddr & GENMASK(47, 12)) | pg_attr;
> +}
> +
> +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +		uint64_t nested_paddr, uint64_t paddr, uint64_t size)
> +{
> +	size_t npages = size / SZ_4K;
> +
> +	TEST_ASSERT(nested_paddr + size > nested_paddr, "Vaddr overflow");
> +	TEST_ASSERT(paddr + size > paddr, "Paddr overflow");
> +
> +	while (npages--) {
> +		__nested_pg_map(vm, guest_pgd, nested_paddr, paddr, MT_NORMAL);
> +		nested_paddr += SZ_4K;
> +		paddr += SZ_4K;
> +	}
> +}
> +
> +/*
> + * Prepare an identity shadow page table that maps all the
> + * physical pages in VM.
> + */
> +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +			uint32_t memslot)
> +{
> +	sparsebit_idx_t i, last;
> +	struct userspace_mem_region *region =
> +		memslot2region(vm, memslot);
> +
> +	i = (region->region.guest_phys_addr >> vm->page_shift) - 1;
> +	last = i + (region->region.memory_size >> vm->page_shift);
> +	for (;;) {
> +		i = sparsebit_next_clear(region->unused_phy_pages, i);
> +		if (i > last)
> +			break;
> +
> +		nested_map(vm, guest_pgd,
> +			   (uint64_t)i << vm->page_shift,
> +			   (uint64_t)i << vm->page_shift,
> +			   1 << vm->page_shift);
> +	}
> +}
> +
>  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
>  {
>  	size_t l2_stack_size;
> @@ -32,7 +116,16 @@ void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
>  
>  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
>  {
> -	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);
> +	vm_paddr_t guest_pgd;
> +
> +	guest_pgd = vm_phy_pages_alloc(vm, 1,
> +				       KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> +				       vm->memslots[MEM_REGION_PT]);
> +	nested_map_memslot(vm, guest_pgd, 0);
> +
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW | HCR_EL2_VM);
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTTBR_EL2), guest_pgd);
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTCR_EL2), get_l1_vtcr());
>  }
>  
>  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
> -- 
> 2.43.0
> 


* Re: [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support
  2026-03-25  0:36 [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Wei-Lin Chang
                   ` (2 preceding siblings ...)
  2026-03-25  0:36 ` [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions Wei-Lin Chang
@ 2026-03-25  8:00 ` Itaru Kitayama
  3 siblings, 0 replies; 11+ messages in thread
From: Itaru Kitayama @ 2026-03-25  8:00 UTC (permalink / raw)
  To: Wei-Lin Chang
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

Hi Wei Lin,

On Wed, Mar 25, 2026 at 12:36:17AM +0000, Wei-Lin Chang wrote:
> Hi,
> 
> This series adds basic support for running nested guests (L2) in
> kselftest. The first patch adds library functions. While designing the
> APIs for userspace, I referenced Joey's approach for kvm-unit-tests [1].
> In summary, four preparatory functions are provided for userspace to
> set up state to run an L2 in EL1:
> 
>  - prepare_l2_stack()            <- sets up stack for L2
>  - prepare_hyp_state()           <- sets up vEL2 registers
>  - prepare_eret_destination()    <- userspace passes a function pointer
>                                     for L2 to run
>  - prepare_nested_sync_handler() <- sets up hvc handler in order to
>                                     regain control after L2's hvc
> 
> After calling those functions, userspace can vcpu_run(), and when
> run_l2() is called within the guest, the supplied function will be run
> in L2, with the control flow managed by the library code in nested.c and
> nested_asm.S. After running the L2 function, run_l2() will automatically
> return. Note that the L2 function supplied by the user does not have to
> call hvc.
> 
> Patch 2 demonstrates usage of the APIs introduced above, with a simple
> L1 -> L2 -> L1 sequence, with an empty L2 function.
> 
> Patch 3 enhances the library functions by setting up L2 -> L1 stage-2
> translation. Currently the translation is simple, with start level 0, 4
> levels, 4KB granules, normal cachable, 48-bit IA, 40-bit OA.
> 
> [1]: https://lore.kernel.org/kvmarm/20260306142656.2775185-1-joey.gouly@arm.com/

It looks like this selftest assumes the nested guest's MMU is disabled
(L2 IPA to L1 IPA to PA), but I couldn't find an explicit SCTLR.M bit
operation in this series. How do you make sure it is always off?

Thanks,
Itaru.

> 
> Wei-Lin Chang (3):
>   KVM: arm64: selftests: Add library functions for NV
>   KVM: arm64: selftests: Add basic NV selftest
>   KVM: arm64: selftests: Enable stage-2 in NV preparation functions
> 
>  tools/testing/selftests/kvm/Makefile.kvm      |   3 +
>  .../selftests/kvm/arm64/hello_nested.c        |  65 ++++++++
>  .../selftests/kvm/include/arm64/nested.h      |  25 +++
>  .../selftests/kvm/include/arm64/processor.h   |   9 +
>  .../testing/selftests/kvm/lib/arm64/nested.c  | 154 ++++++++++++++++++
>  .../selftests/kvm/lib/arm64/nested_asm.S      |  35 ++++
>  6 files changed, 291 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/arm64/hello_nested.c
>  create mode 100644 tools/testing/selftests/kvm/include/arm64/nested.h
>  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested.c
>  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> 
> -- 
> 2.43.0
> 


* Re: [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV
  2026-03-25  0:36 ` [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV Wei-Lin Chang
@ 2026-03-25  9:03   ` Marc Zyngier
  2026-03-26 14:28     ` Wei-Lin Chang
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-03-25  9:03 UTC (permalink / raw)
  To: Wei-Lin Chang
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

On Wed, 25 Mar 2026 00:36:18 +0000,
Wei-Lin Chang <weilin.chang@arm.com> wrote:
> 
> The API is designed for userspace to first call prepare_{l2_stack,
> hyp_state, eret_destination, nested_sync_handler}, with a function
> supplied to prepare_eret_destination() to be run in L2. Then run_l2()
> can be called in L1 to run the given function in L2.
> 
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
>  tools/testing/selftests/kvm/Makefile.kvm      |  2 +
>  .../selftests/kvm/include/arm64/nested.h      | 18 ++++++
>  .../testing/selftests/kvm/lib/arm64/nested.c  | 61 +++++++++++++++++++
>  .../selftests/kvm/lib/arm64/nested_asm.S      | 35 +++++++++++
>  4 files changed, 116 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/include/arm64/nested.h
>  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested.c
>  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> 
> diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
> index 98da9fa4b8b7..5e681e8e0cd7 100644
> --- a/tools/testing/selftests/kvm/Makefile.kvm
> +++ b/tools/testing/selftests/kvm/Makefile.kvm
> @@ -34,6 +34,8 @@ LIBKVM_arm64 += lib/arm64/gic.c
>  LIBKVM_arm64 += lib/arm64/gic_v3.c
>  LIBKVM_arm64 += lib/arm64/gic_v3_its.c
>  LIBKVM_arm64 += lib/arm64/handlers.S
> +LIBKVM_arm64 += lib/arm64/nested.c
> +LIBKVM_arm64 += lib/arm64/nested_asm.S
>  LIBKVM_arm64 += lib/arm64/processor.c
>  LIBKVM_arm64 += lib/arm64/spinlock.c
>  LIBKVM_arm64 += lib/arm64/ucall.c
> diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> new file mode 100644
> index 000000000000..739ff2ee0161
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * ARM64 Nested virtualization defines
> + */
> +
> +#ifndef SELFTEST_KVM_NESTED_H
> +#define SELFTEST_KVM_NESTED_H
> +
> +void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> +void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> +void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> +void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> +
> +void run_l2(void);
> +void after_hvc(void);
> +void do_hvc(void);
> +
> +#endif /* SELFTEST_KVM_NESTED_H */
> diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> new file mode 100644
> index 000000000000..111d02f44cfe
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> @@ -0,0 +1,61 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * ARM64 Nested virtualization helpers
> + */
> +
> +#include "kvm_util.h"
> +#include "nested.h"
> +#include "processor.h"
> +#include "test_util.h"
> +
> +#include <asm/sysreg.h>
> +
> +static void hvc_handler(struct ex_regs *regs)
> +{
> +	GUEST_ASSERT_EQ(get_current_el(), 2);
> +	GUEST_PRINTF("hvc handler\n");
> +	regs->pstate = PSR_MODE_EL2h | PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT;
> +	regs->pc = (u64)after_hvc;
> +}
> +
> +void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> +{
> +	size_t l2_stack_size;
> +	uint64_t l2_stack_paddr;
> +
> +	l2_stack_size = vm->page_size == 4096 ? DEFAULT_STACK_PGS * vm->page_size :
> +					 vm->page_size;

Please use symbolic constants. Also, this looks wrong if the default
stack size is 32k and the page size is 16k. You probably want to
express a stack size directly, rather than a number of pages.
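
A minimal sketch of sizing the stack in bytes and deriving the page count
from it, so the result does not depend on the granule. L2_STACK_SIZE and
l2_stack_nr_pages() are hypothetical names, and 32k is used only because it
is the figure mentioned above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: L2 stack size in bytes, the same for every page size. */
#define L2_STACK_SIZE	((size_t)32 * 1024)

/* Pages needed to back L2_STACK_SIZE bytes, rounded up to a whole page. */
static size_t l2_stack_nr_pages(size_t page_size)
{
	assert(page_size != 0);
	return (L2_STACK_SIZE + page_size - 1) / page_size;
}
```

With this, 4k pages give 8 pages, 16k pages give 2, and 64k pages give 1,
and the initial stack pointer is simply the allocation base plus the page
count times the page size.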

> +	l2_stack_paddr = __vm_phy_pages_alloc(vm, l2_stack_size / vm->page_size,
> +					      0, 0, false);
> +	vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), l2_stack_paddr + l2_stack_size);
> +}
> +
> +void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> +{
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);

Surely the E2H value matters. Or are you planning to only run this on
configurations that hardcode E2H==0? That'd be pretty limiting.

> +}
> +
> +void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
> +{
> +	vm_paddr_t do_hvc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)do_hvc);
> +	vm_paddr_t l2_pc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)l2_pc);
> +
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SPSR_EL2), PSR_MODE_EL1h |
> +							    PSR_D_BIT     |
> +							    PSR_A_BIT     |
> +							    PSR_I_BIT     |
> +							    PSR_F_BIT);
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ELR_EL2), l2_pc_paddr);
> +	/* HACK: use TPIDR_EL2 to pass address, see run_l2() in nested_asm.S */
> +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TPIDR_EL2), do_hvc_paddr);
> +}
> +
> +void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> +{
> +	if (!vm->handlers) {
> +		vm_init_descriptor_tables(vm);
> +		vcpu_init_descriptor_tables(vcpu);
> +	}
> +	vm_install_sync_handler(vm, VECTOR_SYNC_LOWER_64,
> +				ESR_ELx_EC_HVC64, hvc_handler);
> +}
> diff --git a/tools/testing/selftests/kvm/lib/arm64/nested_asm.S b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> new file mode 100644
> index 000000000000..4ecf2d510a6f
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * ARM64 Nested virtualization assembly helpers
> + */
> +
> +.globl run_l2
> +.globl after_hvc
> +.globl do_hvc
> +run_l2:
> +	/*
> +	 * At this point TPIDR_EL2 will contain the gpa of do_hvc from
> +	 * prepare_eret_destination(). gpa of do_hvc have to be passed in
> +	 * because we want L2 to issue an hvc after it returns from the user
> +	 * passed function. In order for that to happen the lr must be
> +	 * controlled, which at this point holds the value of the address of
> +	 * the next instruction after this run_l2() call, which is not useful
> +	 * for L2. Additionally, L1 can't translate gva into gpa, so we can't
> +	 * calculate it here.
> +	 *
> +	 * So first save lr, then move TPIDR_EL2 to lr so when the user supplied
> +	 * L2 function returns, L2 jumps to do_hvc and let the L1 hvc handler
> +	 * take control. This implies we expect the L2 code to preserve lr and
> +	 * calls a regular ret in the end, which is true for normal C functions.
> +	 * The hvc handler will jump back to after_hvc when finished, and lr
> +	 * will be restored and we can return run_l2().
> +	 */
> +	stp	x29, lr, [sp, #-16]!
> +	mrs	x0, tpidr_el2
> +	mov	lr, x0
> +	eret
> +after_hvc:
> +	ldp	x29, lr, [sp], #16
> +	ret
> +do_hvc:
> +	hvc #0

This probably works for a single instruction L2 guest, but not having
any save/restore of the L2 context makes it hard to build anything on
top of this.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions
  2026-03-25  0:36 ` [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions Wei-Lin Chang
  2026-03-25  6:23   ` Itaru Kitayama
@ 2026-03-25  9:18   ` Marc Zyngier
  2026-03-26 21:16     ` Wei-Lin Chang
  1 sibling, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-03-25  9:18 UTC (permalink / raw)
  To: Wei-Lin Chang
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

On Wed, 25 Mar 2026 00:36:20 +0000,
Wei-Lin Chang <weilin.chang@arm.com> wrote:
> 
> Introduce library functions for setting up guest stage-2 page tables,
> then use that to give L2 an identity mapped stage-2 and enable it.
> 
> The translation and stage-2 page table built is simple, start level 0,
> 4 levels, 4KB granules, normal cachable, 48-bit IA, 40-bit OA.

That's a no-go. The most common NV-capable hardware out there can realistically
only do 16kB at S2, and is limited to a 36-bit IPA. We can't really
afford to say "too bad" and leave the main development platform behind.
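
As a rough sketch, the size-related VTCR_EL2 fields could be derived from
parameters rather than hardcoded. The S2_* names and helpers below are
hypothetical, not the kernel's macros; the field positions and encodings are
as I read them from the architecture (T0SZ at bit 0, TG0 at bit 14, PS at
bit 16) and should be checked before use. SL0, cacheability and
shareability are deliberately left out:

```c
#include <assert.h>
#include <stdint.h>

/* Field positions in VTCR_EL2; local illustrative names. */
#define S2_T0SZ_SHIFT	0
#define S2_TG0_SHIFT	14
#define S2_PS_SHIFT	16

/* TG0 encoding for a given page shift (12, 14 or 16). */
static uint64_t s2_tg0(unsigned int page_shift)
{
	switch (page_shift) {
	case 12: return 0;	/* 4kB */
	case 16: return 1;	/* 64kB */
	case 14: return 2;	/* 16kB */
	}
	assert(0);
	return 0;
}

/* PS encoding for a given output address size in bits. */
static uint64_t s2_ps(unsigned int oa_bits)
{
	switch (oa_bits) {
	case 32: return 0;
	case 36: return 1;
	case 40: return 2;
	case 42: return 3;
	case 44: return 4;
	case 48: return 5;
	case 52: return 6;
	}
	assert(0);
	return 0;
}

/* T0SZ, TG0 and PS only; the caller still has to add SL0 and attributes. */
static uint64_t s2_vtcr_size_fields(unsigned int page_shift,
				    unsigned int ipa_bits,
				    unsigned int oa_bits)
{
	assert(ipa_bits > page_shift && ipa_bits <= 52);
	return ((uint64_t)(64 - ipa_bits) << S2_T0SZ_SHIFT) |
	       (s2_tg0(page_shift) << S2_TG0_SHIFT) |
	       (s2_ps(oa_bits) << S2_PS_SHIFT);
}
```

For the 16kB/36-bit case this yields T0SZ=28, TG0=0b10, PS=0b001, whereas
the values in this patch correspond to (12, 48, 40).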

> 
> The nested page table code is adapted from lib/x86/vmx.c.

I guess this starting point is the main issue.

> 
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
>  .../selftests/kvm/include/arm64/nested.h      |  7 ++
>  .../selftests/kvm/include/arm64/processor.h   |  9 ++
>  .../testing/selftests/kvm/lib/arm64/nested.c  | 97 ++++++++++++++++++-
>  3 files changed, 111 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> index 739ff2ee0161..0be10a775e48 100644
> --- a/tools/testing/selftests/kvm/include/arm64/nested.h
> +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> @@ -6,6 +6,13 @@
>  #ifndef SELFTEST_KVM_NESTED_H
>  #define SELFTEST_KVM_NESTED_H
>  
> +uint64_t get_l1_vtcr(void);
> +
> +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +		uint64_t nested_paddr, uint64_t paddr, uint64_t size);
> +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> +			uint32_t memslot);
> +
>  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
>  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
>  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
> index ac97a1c436fc..5de2e932d95a 100644
> --- a/tools/testing/selftests/kvm/include/arm64/processor.h
> +++ b/tools/testing/selftests/kvm/include/arm64/processor.h
> @@ -104,6 +104,15 @@
>  #define TCR_HA			(UL(1) << 39)
>  #define TCR_DS			(UL(1) << 59)
>  
> +/* VTCR_EL2 specific flags */
> +#define VTCR_EL2_T0SZ_BITS(x)	((UL(64) - (x)) << VTCR_EL2_T0SZ_SHIFT)
> +
> +#define VTCR_EL2_SL0_LV0_4K	(UL(2) << VTCR_EL2_SL0_SHIFT)
> +#define VTCR_EL2_SL0_LV1_4K	(UL(1) << VTCR_EL2_SL0_SHIFT)
> +#define VTCR_EL2_SL0_LV2_4K	(UL(0) << VTCR_EL2_SL0_SHIFT)
> +
> +#define VTCR_EL2_PS_40_BITS	(UL(2) << VTCR_EL2_PS_SHIFT)
> +
>  /*
>   * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
>   */
> diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> index 111d02f44cfe..910f8cd30f96 100644
> --- a/tools/testing/selftests/kvm/lib/arm64/nested.c
> +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> @@ -1,8 +1,11 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /*
> - * ARM64 Nested virtualization helpers
> + * ARM64 Nested virtualization helpers, nested page table code adapted from
> + * ../x86/vmx.c.
>   */
>  
> +#include <linux/sizes.h>
> +
>  #include "kvm_util.h"
>  #include "nested.h"
>  #include "processor.h"
> @@ -18,6 +21,87 @@ static void hvc_handler(struct ex_regs *regs)
>  	regs->pc = (u64)after_hvc;
>  }
>  
> +uint64_t get_l1_vtcr(void)
> +{
> +	return VTCR_EL2_PS_40_BITS | VTCR_EL2_TG0_4K | VTCR_EL2_ORGN0_WBWA |
> +	       VTCR_EL2_IRGN0_WBWA | VTCR_EL2_SL0_LV0_4K | VTCR_EL2_T0SZ_BITS(48);

Irk. See above.

> +}
> +
> +static void __nested_pg_map(struct kvm_vm *vm, uint64_t guest_pgd,
> +		     uint64_t nested_paddr, uint64_t paddr, uint64_t flags)
> +{
> +	uint8_t attr_idx = flags & (PTE_ATTRINDX_MASK >> PTE_ATTRINDX_SHIFT);
> +	uint64_t pg_attr;
> +	uint64_t *ptep;
> +
> +	TEST_ASSERT((nested_paddr % vm->page_size) == 0,
> +		"L2 IPA not on page boundary,\n"
> +		"  nested_paddr: 0x%lx vm->page_size: 0x%x", nested_paddr, vm->page_size);
> +	TEST_ASSERT((paddr % vm->page_size) == 0,
> +		"Guest physical address not on page boundary,\n"
> +		"  paddr: 0x%lx vm->page_size: 0x%x", paddr, vm->page_size);
> +	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
> +		"Physical address beyond maximum supported,\n"
> +		"  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
> +		paddr, vm->max_gfn, vm->page_size);
> +
> +	ptep = addr_gpa2hva(vm, guest_pgd) + ((nested_paddr >> 39) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PGD_TYPE_TABLE | PTE_VALID;
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 30) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PUD_TYPE_TABLE | PTE_VALID;
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 21) & 0x1ffu) * 8;
> +	if (!*ptep)
> +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PMD_TYPE_TABLE | PTE_VALID;
> +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 12) & 0x1ffu) * 8;
> +
> +	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
> +	pg_attr |= PTE_SHARED;
> +
> +	*ptep = (paddr & GENMASK(47, 12)) | pg_attr;

Please use named constants, and write a page table generator that is
independent of page, IA and OA sizes, as advertised to the guest.
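
A minimal sketch of the geometry such a generator could be driven by,
replacing the fixed 39/30/21/12 shifts and the 0x1ff mask. All names are
hypothetical. It assumes 8-byte descriptors, a last level of 3, and no
concatenated tables at the start level, which a real stage-2 generator
would also have to handle:

```c
#include <assert.h>
#include <stdint.h>

#define S2_LAST_LEVEL	3

/* Each table is one page of 8-byte descriptors. */
static unsigned int s2_bits_per_level(unsigned int page_shift)
{
	return page_shift - 3;
}

/* Lookup levels needed to resolve ia_bits, ignoring concatenation. */
static unsigned int s2_nr_levels(unsigned int page_shift, unsigned int ia_bits)
{
	unsigned int bpl = s2_bits_per_level(page_shift);

	assert(ia_bits > page_shift);
	return (ia_bits - page_shift + bpl - 1) / bpl;
}

/* Lowest input address bit resolved at the given level (0..3). */
static unsigned int s2_level_shift(unsigned int page_shift, unsigned int level)
{
	assert(level <= S2_LAST_LEVEL);
	return page_shift +
	       (S2_LAST_LEVEL - level) * s2_bits_per_level(page_shift);
}

/* Descriptor index for ipa within the table at the given level. */
static uint64_t s2_index(unsigned int page_shift, unsigned int level,
			 uint64_t ipa)
{
	uint64_t mask = (UINT64_C(1) << s2_bits_per_level(page_shift)) - 1;

	return (ipa >> s2_level_shift(page_shift, level)) & mask;
}
```

The walk then becomes a loop from level (4 - s2_nr_levels()) down to level
3. With 4kB pages and a 48-bit IA this reproduces the four levels and the
shifts used in the patch; with 16kB pages and a 36-bit IA it gives two
levels with shifts 25 and 14.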

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

* Re: [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV
  2026-03-25  9:03   ` Marc Zyngier
@ 2026-03-26 14:28     ` Wei-Lin Chang
  0 siblings, 0 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-26 14:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

On Wed, Mar 25, 2026 at 09:03:47AM +0000, Marc Zyngier wrote:
> On Wed, 25 Mar 2026 00:36:18 +0000,
> Wei-Lin Chang <weilin.chang@arm.com> wrote:
> > 
> > The API is designed for userspace to first call prepare_{l2_stack,
> > hyp_state, eret_destination, nested_sync_handler}, with a function
> > supplied to prepare_eret_destination() to be run in L2. Then run_l2()
> > can be called in L1 to run the given function in L2.
> > 
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> >  tools/testing/selftests/kvm/Makefile.kvm      |  2 +
> >  .../selftests/kvm/include/arm64/nested.h      | 18 ++++++
> >  .../testing/selftests/kvm/lib/arm64/nested.c  | 61 +++++++++++++++++++
> >  .../selftests/kvm/lib/arm64/nested_asm.S      | 35 +++++++++++
> >  4 files changed, 116 insertions(+)
> >  create mode 100644 tools/testing/selftests/kvm/include/arm64/nested.h
> >  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested.c
> >  create mode 100644 tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> > 
> > diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
> > index 98da9fa4b8b7..5e681e8e0cd7 100644
> > --- a/tools/testing/selftests/kvm/Makefile.kvm
> > +++ b/tools/testing/selftests/kvm/Makefile.kvm
> > @@ -34,6 +34,8 @@ LIBKVM_arm64 += lib/arm64/gic.c
> >  LIBKVM_arm64 += lib/arm64/gic_v3.c
> >  LIBKVM_arm64 += lib/arm64/gic_v3_its.c
> >  LIBKVM_arm64 += lib/arm64/handlers.S
> > +LIBKVM_arm64 += lib/arm64/nested.c
> > +LIBKVM_arm64 += lib/arm64/nested_asm.S
> >  LIBKVM_arm64 += lib/arm64/processor.c
> >  LIBKVM_arm64 += lib/arm64/spinlock.c
> >  LIBKVM_arm64 += lib/arm64/ucall.c
> > diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> > new file mode 100644
> > index 000000000000..739ff2ee0161
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> > @@ -0,0 +1,18 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * ARM64 Nested virtualization defines
> > + */
> > +
> > +#ifndef SELFTEST_KVM_NESTED_H
> > +#define SELFTEST_KVM_NESTED_H
> > +
> > +void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> > +void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> > +void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> > +void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> > +
> > +void run_l2(void);
> > +void after_hvc(void);
> > +void do_hvc(void);
> > +
> > +#endif /* SELFTEST_KVM_NESTED_H */
> > diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > new file mode 100644
> > index 000000000000..111d02f44cfe
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > @@ -0,0 +1,61 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * ARM64 Nested virtualization helpers
> > + */
> > +
> > +#include "kvm_util.h"
> > +#include "nested.h"
> > +#include "processor.h"
> > +#include "test_util.h"
> > +
> > +#include <asm/sysreg.h>
> > +
> > +static void hvc_handler(struct ex_regs *regs)
> > +{
> > +	GUEST_ASSERT_EQ(get_current_el(), 2);
> > +	GUEST_PRINTF("hvc handler\n");
> > +	regs->pstate = PSR_MODE_EL2h | PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT;
> > +	regs->pc = (u64)after_hvc;
> > +}
> > +
> > +void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> > +{
> > +	size_t l2_stack_size;
> > +	uint64_t l2_stack_paddr;
> > +
> > +	l2_stack_size = vm->page_size == 4096 ? DEFAULT_STACK_PGS * vm->page_size :
> > +					 vm->page_size;
> 
> Please use symbolic constants. Also, this looks wrong if the default
> stack size is 32k and the page size is 16k. You probably want to
> express a stack size directly, rather than a number of pages.

Makes sense, will fix the size of the stack.

> 
> > +	l2_stack_paddr = __vm_phy_pages_alloc(vm, l2_stack_size / vm->page_size,
> > +					      0, 0, false);
> > +	vcpu_set_reg(vcpu, ARM64_CORE_REG(sp_el1), l2_stack_paddr + l2_stack_size);
> > +}
> > +
> > +void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> > +{
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);
> 
> Surely the E2H value matters. Or are you planning to only run this on
> configuration that hardcode E2H==0? That'd be pretty limiting.

Yes, it does matter. I was tunnel-visioned on making the L1 <-> L2
transition work with the bare minimum, and missed what we will want in
the future.

> 
> > +}
> > +
> > +void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
> > +{
> > +	vm_paddr_t do_hvc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)do_hvc);
> > +	vm_paddr_t l2_pc_paddr = addr_gva2gpa(vm, (vm_vaddr_t)l2_pc);
> > +
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_SPSR_EL2), PSR_MODE_EL1h |
> > +							    PSR_D_BIT     |
> > +							    PSR_A_BIT     |
> > +							    PSR_I_BIT     |
> > +							    PSR_F_BIT);
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_ELR_EL2), l2_pc_paddr);
> > +	/* HACK: use TPIDR_EL2 to pass address, see run_l2() in nested_asm.S */
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TPIDR_EL2), do_hvc_paddr);
> > +}
> > +
> > +void prepare_nested_sync_handler(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> > +{
> > +	if (!vm->handlers) {
> > +		vm_init_descriptor_tables(vm);
> > +		vcpu_init_descriptor_tables(vcpu);
> > +	}
> > +	vm_install_sync_handler(vm, VECTOR_SYNC_LOWER_64,
> > +				ESR_ELx_EC_HVC64, hvc_handler);
> > +}
> > diff --git a/tools/testing/selftests/kvm/lib/arm64/nested_asm.S b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> > new file mode 100644
> > index 000000000000..4ecf2d510a6f
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/lib/arm64/nested_asm.S
> > @@ -0,0 +1,35 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * ARM64 Nested virtualization assembly helpers
> > + */
> > +
> > +.globl run_l2
> > +.globl after_hvc
> > +.globl do_hvc
> > +run_l2:
> > +	/*
> > +	 * At this point TPIDR_EL2 will contain the gpa of do_hvc from
> > +	 * prepare_eret_destination(). gpa of do_hvc have to be passed in
> > +	 * because we want L2 to issue an hvc after it returns from the user
> > +	 * passed function. In order for that to happen the lr must be
> > +	 * controlled, which at this point holds the value of the address of
> > +	 * the next instruction after this run_l2() call, which is not useful
> > +	 * for L2. Additionally, L1 can't translate gva into gpa, so we can't
> > +	 * calculate it here.
> > +	 *
> > +	 * So first save lr, then move TPIDR_EL2 to lr so when the user supplied
> > +	 * L2 function returns, L2 jumps to do_hvc and let the L1 hvc handler
> > +	 * take control. This implies we expect the L2 code to preserve lr and
> > +	 * calls a regular ret in the end, which is true for normal C functions.
> > +	 * The hvc handler will jump back to after_hvc when finished, and lr
> > +	 * will be restored and we can return run_l2().
> > +	 */
> > +	stp	x29, lr, [sp, #-16]!
> > +	mrs	x0, tpidr_el2
> > +	mov	lr, x0
> > +	eret
> > +after_hvc:
> > +	ldp	x29, lr, [sp], #16
> > +	ret
> > +do_hvc:
> > +	hvc #0
> 
> This probably works for a single instruction L2 guest, but not having
> any save/restore of the L2 context makes it hard to build anything on
> top of this.

Agreed, we need L2 save/restore to meaningfully test NV.

Thanks,
Wei-Lin Chang

> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

* Re: [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions
  2026-03-25  9:18   ` Marc Zyngier
@ 2026-03-26 21:16     ` Wei-Lin Chang
  0 siblings, 0 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-26 21:16 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

On Wed, Mar 25, 2026 at 09:18:20AM +0000, Marc Zyngier wrote:
> On Wed, 25 Mar 2026 00:36:20 +0000,
> Wei-Lin Chang <weilin.chang@arm.com> wrote:
> > 
> > Introduce library functions for setting up guest stage-2 page tables,
> > then use that to give L2 an identity mapped stage-2 and enable it.
> > 
> > The translation and stage-2 page table built is simple, start level 0,
> > 4 levels, 4KB granules, normal cachable, 48-bit IA, 40-bit OA.
> 
> That's a no go. The most common NV-capable out there can realistically
> only do 16kB at S2, and is limited to 36bit IPA. We can't really
> afford to say "too bad" and leave the main development platform behind.

Interesting, I didn't know that! I didn't consider the guest stage-2
granule size < host stage-2 granule size case either.

> 
> > 
> > The nested page table code is adapted from lib/x86/vmx.c.
> 
> I guess this starting point is the main issue.
> 
> > 
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> >  .../selftests/kvm/include/arm64/nested.h      |  7 ++
> >  .../selftests/kvm/include/arm64/processor.h   |  9 ++
> >  .../testing/selftests/kvm/lib/arm64/nested.c  | 97 ++++++++++++++++++-
> >  3 files changed, 111 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> > index 739ff2ee0161..0be10a775e48 100644
> > --- a/tools/testing/selftests/kvm/include/arm64/nested.h
> > +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> > @@ -6,6 +6,13 @@
> >  #ifndef SELFTEST_KVM_NESTED_H
> >  #define SELFTEST_KVM_NESTED_H
> >  
> > +uint64_t get_l1_vtcr(void);
> > +
> > +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +		uint64_t nested_paddr, uint64_t paddr, uint64_t size);
> > +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +			uint32_t memslot);
> > +
> >  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> >  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> >  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> > diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
> > index ac97a1c436fc..5de2e932d95a 100644
> > --- a/tools/testing/selftests/kvm/include/arm64/processor.h
> > +++ b/tools/testing/selftests/kvm/include/arm64/processor.h
> > @@ -104,6 +104,15 @@
> >  #define TCR_HA			(UL(1) << 39)
> >  #define TCR_DS			(UL(1) << 59)
> >  
> > +/* VTCR_EL2 specific flags */
> > +#define VTCR_EL2_T0SZ_BITS(x)	((UL(64) - (x)) << VTCR_EL2_T0SZ_SHIFT)
> > +
> > +#define VTCR_EL2_SL0_LV0_4K	(UL(2) << VTCR_EL2_SL0_SHIFT)
> > +#define VTCR_EL2_SL0_LV1_4K	(UL(1) << VTCR_EL2_SL0_SHIFT)
> > +#define VTCR_EL2_SL0_LV2_4K	(UL(0) << VTCR_EL2_SL0_SHIFT)
> > +
> > +#define VTCR_EL2_PS_40_BITS	(UL(2) << VTCR_EL2_PS_SHIFT)
> > +
> >  /*
> >   * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
> >   */
> > diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > index 111d02f44cfe..910f8cd30f96 100644
> > --- a/tools/testing/selftests/kvm/lib/arm64/nested.c
> > +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > @@ -1,8 +1,11 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  /*
> > - * ARM64 Nested virtualization helpers
> > + * ARM64 Nested virtualization helpers, nested page table code adapted from
> > + * ../x86/vmx.c.
> >   */
> >  
> > +#include <linux/sizes.h>
> > +
> >  #include "kvm_util.h"
> >  #include "nested.h"
> >  #include "processor.h"
> > @@ -18,6 +21,87 @@ static void hvc_handler(struct ex_regs *regs)
> >  	regs->pc = (u64)after_hvc;
> >  }
> >  
> > +uint64_t get_l1_vtcr(void)
> > +{
> > +	return VTCR_EL2_PS_40_BITS | VTCR_EL2_TG0_4K | VTCR_EL2_ORGN0_WBWA |
> > +	       VTCR_EL2_IRGN0_WBWA | VTCR_EL2_SL0_LV0_4K | VTCR_EL2_T0SZ_BITS(48);
> 
> Irk. See above.
> 
> > +}
> > +
> > +static void __nested_pg_map(struct kvm_vm *vm, uint64_t guest_pgd,
> > +		     uint64_t nested_paddr, uint64_t paddr, uint64_t flags)
> > +{
> > +	uint8_t attr_idx = flags & (PTE_ATTRINDX_MASK >> PTE_ATTRINDX_SHIFT);
> > +	uint64_t pg_attr;
> > +	uint64_t *ptep;
> > +
> > +	TEST_ASSERT((nested_paddr % vm->page_size) == 0,
> > +		"L2 IPA not on page boundary,\n"
> > +		"  nested_paddr: 0x%lx vm->page_size: 0x%x", nested_paddr, vm->page_size);
> > +	TEST_ASSERT((paddr % vm->page_size) == 0,
> > +		"Guest physical address not on page boundary,\n"
> > +		"  paddr: 0x%lx vm->page_size: 0x%x", paddr, vm->page_size);
> > +	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
> > +		"Physical address beyond maximum supported,\n"
> > +		"  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
> > +		paddr, vm->max_gfn, vm->page_size);
> > +
> > +	ptep = addr_gpa2hva(vm, guest_pgd) + ((nested_paddr >> 39) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PGD_TYPE_TABLE | PTE_VALID;
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 30) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PUD_TYPE_TABLE | PTE_VALID;
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 21) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PMD_TYPE_TABLE | PTE_VALID;
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 12) & 0x1ffu) * 8;
> > +
> > +	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
> > +	pg_attr |= PTE_SHARED;
> > +
> > +	*ptep = (paddr & GENMASK(47, 12)) | pg_attr;
> 
> Please use named constants, and write a page table generator that is
> independent of page, IA and OA sizes, as advertised to the guest.

Ack, and yes, I should absolutely consider what is advertised to the
guest...

Thanks,
Wei-Lin Chang

> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

* Re: [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions
  2026-03-25  6:23   ` Itaru Kitayama
@ 2026-03-26 21:34     ` Wei-Lin Chang
  0 siblings, 0 replies; 11+ messages in thread
From: Wei-Lin Chang @ 2026-03-26 21:34 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: kvm, linux-kselftest, linux-arm-kernel, kvmarm, linux-kernel,
	Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Joey Gouly,
	Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon

On Wed, Mar 25, 2026 at 03:23:28PM +0900, Itaru Kitayama wrote:
> Hi Wei Lin,

Hi,

> On Wed, Mar 25, 2026 at 12:36:20AM +0000, Wei-Lin Chang wrote:
> > Introduce library functions for setting up guest stage-2 page tables,
> > then use that to give L2 an identity mapped stage-2 and enable it.
> > 
> > The translation and stage-2 page table built is simple, start level 0,
> > 4 levels, 4KB granules, normal cachable, 48-bit IA, 40-bit OA.
> > 
> > The nested page table code is adapted from lib/x86/vmx.c.
> > 
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> >  .../selftests/kvm/include/arm64/nested.h      |  7 ++
> >  .../selftests/kvm/include/arm64/processor.h   |  9 ++
> >  .../testing/selftests/kvm/lib/arm64/nested.c  | 97 ++++++++++++++++++-
> >  3 files changed, 111 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/kvm/include/arm64/nested.h b/tools/testing/selftests/kvm/include/arm64/nested.h
> > index 739ff2ee0161..0be10a775e48 100644
> > --- a/tools/testing/selftests/kvm/include/arm64/nested.h
> > +++ b/tools/testing/selftests/kvm/include/arm64/nested.h
> > @@ -6,6 +6,13 @@
> >  #ifndef SELFTEST_KVM_NESTED_H
> >  #define SELFTEST_KVM_NESTED_H
> >  
> > +uint64_t get_l1_vtcr(void);
> 
> Using a type u64 is simpler? And I think you configure guest
> hypervisor's stage 2 translation table, I felt this gives us
> an impression somewhere the configuration IA and OA sizes etc 
> are stored.

Sure, u64 is okay.
In this version I basically just used hard-coded values wherever I
needed the IA and OA sizes and other related values, e.g. page shift, which is
not good and, as Marc said, would not even work on some platforms. I'll make
it more modular in the next iteration.

> 
> > +
> > +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +		uint64_t nested_paddr, uint64_t paddr, uint64_t size);
> > +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +			uint32_t memslot);
> > +
> >  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> >  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu);
> >  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc);
> > diff --git a/tools/testing/selftests/kvm/include/arm64/processor.h b/tools/testing/selftests/kvm/include/arm64/processor.h
> > index ac97a1c436fc..5de2e932d95a 100644
> > --- a/tools/testing/selftests/kvm/include/arm64/processor.h
> > +++ b/tools/testing/selftests/kvm/include/arm64/processor.h
> > @@ -104,6 +104,15 @@
> >  #define TCR_HA			(UL(1) << 39)
> >  #define TCR_DS			(UL(1) << 59)
> >  
> > +/* VTCR_EL2 specific flags */
> > +#define VTCR_EL2_T0SZ_BITS(x)	((UL(64) - (x)) << VTCR_EL2_T0SZ_SHIFT)
> > +
> > +#define VTCR_EL2_SL0_LV0_4K	(UL(2) << VTCR_EL2_SL0_SHIFT)
> > +#define VTCR_EL2_SL0_LV1_4K	(UL(1) << VTCR_EL2_SL0_SHIFT)
> > +#define VTCR_EL2_SL0_LV2_4K	(UL(0) << VTCR_EL2_SL0_SHIFT)
> > +
> > +#define VTCR_EL2_PS_40_BITS	(UL(2) << VTCR_EL2_PS_SHIFT)
> > +
> >  /*
> >   * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
> >   */
> > diff --git a/tools/testing/selftests/kvm/lib/arm64/nested.c b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > index 111d02f44cfe..910f8cd30f96 100644
> > --- a/tools/testing/selftests/kvm/lib/arm64/nested.c
> > +++ b/tools/testing/selftests/kvm/lib/arm64/nested.c
> > @@ -1,8 +1,11 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  /*
> > - * ARM64 Nested virtualization helpers
> > + * ARM64 Nested virtualization helpers, nested page table code adapted from
> > + * ../x86/vmx.c.
> >   */
> >  
> > +#include <linux/sizes.h>
> > +
> >  #include "kvm_util.h"
> >  #include "nested.h"
> >  #include "processor.h"
> > @@ -18,6 +21,87 @@ static void hvc_handler(struct ex_regs *regs)
> >  	regs->pc = (u64)after_hvc;
> >  }
> >  
> > +uint64_t get_l1_vtcr(void)
> > +{
> > +	return VTCR_EL2_PS_40_BITS | VTCR_EL2_TG0_4K | VTCR_EL2_ORGN0_WBWA |
> > +	       VTCR_EL2_IRGN0_WBWA | VTCR_EL2_SL0_LV0_4K | VTCR_EL2_T0SZ_BITS(48);
> > +}
> > +
> > +static void __nested_pg_map(struct kvm_vm *vm, uint64_t guest_pgd,
> > +		     uint64_t nested_paddr, uint64_t paddr, uint64_t flags)
> > +{
> > +	uint8_t attr_idx = flags & (PTE_ATTRINDX_MASK >> PTE_ATTRINDX_SHIFT);
> > +	uint64_t pg_attr;
> > +	uint64_t *ptep;
> > +
> > +	TEST_ASSERT((nested_paddr % vm->page_size) == 0,
> > +		"L2 IPA not on page boundary,\n"
> > +		"  nested_paddr: 0x%lx vm->page_size: 0x%x", nested_paddr, vm->page_size);
> > +	TEST_ASSERT((paddr % vm->page_size) == 0,
> > +		"Guest physical address not on page boundary,\n"
> > +		"  paddr: 0x%lx vm->page_size: 0x%x", paddr, vm->page_size);
> > +	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
> > +		"Physical address beyond maximum supported,\n"
> > +		"  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
> > +		paddr, vm->max_gfn, vm->page_size);
> > +
> > +	ptep = addr_gpa2hva(vm, guest_pgd) + ((nested_paddr >> 39) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PGD_TYPE_TABLE | PTE_VALID;
> 
> Same but given this is stage 2 translation tables, KVM_PTE_VALID?

I see your point, but KVM_PTE_VALID is only defined for KVM, not here in
kselftest userspace. However, since I will redo the page table generator,
I can add this. Let's see.
Thanks for the suggestions!
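
For reference, a selftest-local set of stage-2 descriptor definitions
might look roughly like the sketch below. The S2_* names are made up for
illustration (nothing here is defined in this series or in the existing
selftest headers). The bit positions are the architectural stage-2 ones,
where leaf entries carry MemAttr[3:0] and S2AP rather than the stage-1
AttrIndx and AP fields. It assumes 4KB granules and at most a 48-bit OA.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the stage-2 descriptor format. */
#define S2_PTE_VALID		(UINT64_C(1) << 0)
#define S2_PTE_TYPE_TABLE	(UINT64_C(1) << 1)	/* levels 0-2: next-level table */
#define S2_PTE_TYPE_PAGE	(UINT64_C(1) << 1)	/* level 3: page */
#define S2_PTE_MEMATTR(x)	((uint64_t)(x) << 2)	/* MemAttr[3:0], bits [5:2] */
#define S2_PTE_S2AP_R		(UINT64_C(1) << 6)	/* stage-2 read permission */
#define S2_PTE_S2AP_W		(UINT64_C(1) << 7)	/* stage-2 write permission */
#define S2_PTE_SH_INNER		(UINT64_C(3) << 8)	/* Inner Shareable */
#define S2_PTE_AF		(UINT64_C(1) << 10)	/* Access flag */

/* Normal memory, Outer and Inner Write-Back cacheable (FWB not in use). */
#define S2_MEMATTR_NORMAL_WB	0xf

/* Output address bits [47:12]: 4KB granule, at most 48-bit OA. */
#define S2_PTE_ADDR_MASK	UINT64_C(0x0000fffffffff000)

/* Descriptor pointing at the next-level table located at table_pa. */
static inline uint64_t s2_table_desc(uint64_t table_pa)
{
	assert((table_pa & ~S2_PTE_ADDR_MASK) == 0);
	return table_pa | S2_PTE_TYPE_TABLE | S2_PTE_VALID;
}

/* Level-3 descriptor mapping one read/write page of Normal WB memory. */
static inline uint64_t s2_page_desc(uint64_t pa)
{
	assert((pa & ~S2_PTE_ADDR_MASK) == 0);
	return pa | S2_PTE_MEMATTR(S2_MEMATTR_NORMAL_WB) |
	       S2_PTE_S2AP_R | S2_PTE_S2AP_W | S2_PTE_SH_INNER |
	       S2_PTE_AF | S2_PTE_TYPE_PAGE | S2_PTE_VALID;
}
```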

Thanks,
Wei-Lin Chang

> 
> Thanks,
> Itaru.
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 30) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PUD_TYPE_TABLE | PTE_VALID;
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 21) & 0x1ffu) * 8;
> > +	if (!*ptep)
> > +		*ptep = (vm_alloc_page_table(vm) & GENMASK(47, 12)) | PMD_TYPE_TABLE | PTE_VALID;
> > +	ptep = addr_gpa2hva(vm, *ptep & GENMASK(47, 12)) + ((nested_paddr >> 12) & 0x1ffu) * 8;
> > +
> > +	pg_attr = PTE_AF | PTE_ATTRINDX(attr_idx) | PTE_TYPE_PAGE | PTE_VALID;
> > +	pg_attr |= PTE_SHARED;
> > +
> > +	*ptep = (paddr & GENMASK(47, 12)) | pg_attr;
> > +}
> > +
> > +void nested_map(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +		uint64_t nested_paddr, uint64_t paddr, uint64_t size)
> > +{
> > +	size_t npages = size / SZ_4K;
> > +
> > +	TEST_ASSERT(nested_paddr + size > nested_paddr, "Vaddr overflow");
> > +	TEST_ASSERT(paddr + size > paddr, "Paddr overflow");
> > +
> > +	while (npages--) {
> > +		__nested_pg_map(vm, guest_pgd, nested_paddr, paddr, MT_NORMAL);
> > +		nested_paddr += SZ_4K;
> > +		paddr += SZ_4K;
> > +	}
> > +}
> > +
> > +/*
> > + * Prepare an identity shadow page table that maps all the
> > + * physical pages in VM.
> > + */
> > +void nested_map_memslot(struct kvm_vm *vm, vm_paddr_t guest_pgd,
> > +			uint32_t memslot)
> > +{
> > +	sparsebit_idx_t i, last;
> > +	struct userspace_mem_region *region =
> > +		memslot2region(vm, memslot);
> > +
> > +	i = (region->region.guest_phys_addr >> vm->page_shift) - 1;
> > +	last = i + (region->region.memory_size >> vm->page_shift);
> > +	for (;;) {
> > +		i = sparsebit_next_clear(region->unused_phy_pages, i);
> > +		if (i > last)
> > +			break;
> > +
> > +		nested_map(vm, guest_pgd,
> > +			   (uint64_t)i << vm->page_shift,
> > +			   (uint64_t)i << vm->page_shift,
> > +			   1 << vm->page_shift);
> > +	}
> > +}
> > +
> >  void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> >  {
> >  	size_t l2_stack_size;
> > @@ -32,7 +116,16 @@ void prepare_l2_stack(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> >  
> >  void prepare_hyp_state(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
> >  {
> > -	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW);
> > +	vm_paddr_t guest_pgd;
> > +
> > +	guest_pgd = vm_phy_pages_alloc(vm, 1,
> > +				       KVM_GUEST_PAGE_TABLE_MIN_PADDR,
> > +				       vm->memslots[MEM_REGION_PT]);
> > +	nested_map_memslot(vm, guest_pgd, 0);
> > +
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_HCR_EL2), HCR_EL2_RW | HCR_EL2_VM);
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTTBR_EL2), guest_pgd);
> > +	vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_VTCR_EL2), get_l1_vtcr());
> >  }
> >  
> >  void prepare_eret_destination(struct kvm_vm *vm, struct kvm_vcpu *vcpu, void *l2_pc)
> > -- 
> > 2.43.0
> > 

Thread overview: 11+ messages
2026-03-25  0:36 [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Wei-Lin Chang
2026-03-25  0:36 ` [PATCH 1/3] KVM: arm64: selftests: Add library functions for NV Wei-Lin Chang
2026-03-25  9:03   ` Marc Zyngier
2026-03-26 14:28     ` Wei-Lin Chang
2026-03-25  0:36 ` [PATCH 2/3] KVM: arm64: sefltests: Add basic NV selftest Wei-Lin Chang
2026-03-25  0:36 ` [PATCH 3/3] KVM: arm64: selftests: Enable stage-2 in NV preparation functions Wei-Lin Chang
2026-03-25  6:23   ` Itaru Kitayama
2026-03-26 21:34     ` Wei-Lin Chang
2026-03-25  9:18   ` Marc Zyngier
2026-03-26 21:16     ` Wei-Lin Chang
2026-03-25  8:00 ` [PATCH 0/3] KVM: arm64: selftests: Basic nested guest support Itaru Kitayama