public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH v2 00/13] Improve CET tests
@ 2025-06-26  7:34 Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 01/13] x86: cet: Pass virtual addresses to invlpg Mathias Krause
                   ` (12 more replies)
  0 siblings, 13 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

Hi,

this is a v2 of [1] and [2]. It's merging the two series as well as
integrating feedback and minor fixes.

[1] https://lore.kernel.org/kvm/20250513072250.568180-1-chao.gao@intel.com/
[2] https://lore.kernel.org/kvm/20250620153912.214600-1-minipli@grsecurity.net/

Please apply!

Thanks,
Mathias


Chao Gao (7):
  x86: cet: Remove unnecessary memory zeroing for shadow stack
  x86: cet: Directly check for #CP exception in run_in_user()
  x86: cet: Validate #CP error code
  x86: cet: Use report_skip()
  x86: cet: Drop unnecessary casting
  x86: cet: Validate writing unaligned values to SSP MSR causes #GP
  x86: cet: Validate CET states during VMX transitions

Mathias Krause (5):
  x86: cet: Make shadow stack less fragile
  x86: cet: Simplify IBT test
  x86: cet: Use symbolic values for the #CP error codes
  x86: cet: Test far returns too
  x86: Avoid top-most page for vmalloc on x86-64

Yang Weijiang (1):
  x86: cet: Pass virtual addresses to invlpg

 lib/x86/msr.h      |   1 +
 lib/x86/usermode.c |   4 ++
 lib/x86/vm.c       |   2 +
 x86/vmx.h          |   8 +++-
 x86/cet.c          | 110 ++++++++++++++++++++++++++-------------------
 x86/lam.c          |  10 ++---
 x86/vmx_tests.c    |  81 +++++++++++++++++++++++++++++++++
 x86/unittests.cfg  |   7 +++
 8 files changed, 171 insertions(+), 52 deletions(-)

-- 
2.47.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 01/13] x86: cet: Pass virtual addresses to invlpg
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 02/13] x86: cet: Remove unnecessary memory zeroing for shadow stack Mathias Krause
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause, Yang Weijiang

From: Yang Weijiang <weijiang.yang@intel.com>

Correct the parameter passed to invlpg.

The invlpg instruction should take a virtual address instead of a physical
address when flushing TLBs. Using shstk_phys results in TLBs associated
with the virtual address (shstk_virt) not being flushed, and the virtual
address may not be treated as a shadow stack address if there is a stale
TLB. So, subsequent shadow stack accesses to shstk_virt may cause a #PF,
which terminates the test unexpectedly.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/cet.c b/x86/cet.c
index 42d2b1fc043f..51a54a509a47 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -100,7 +100,7 @@ int main(int ac, char **av)
 	*ptep |= PT_DIRTY_MASK;
 
 	/* Flush the paging cache. */
-	invlpg((void *)shstk_phys);
+	invlpg((void *)shstk_virt);
 
 	/* Enable shadow-stack protection */
 	wrmsr(MSR_IA32_U_CET, ENABLE_SHSTK_BIT);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 02/13] x86: cet: Remove unnecessary memory zeroing for shadow stack
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 01/13] x86: cet: Pass virtual addresses to invlpg Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 03/13] x86: cet: Directly check for #CP exception in run_in_user() Mathias Krause
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

Skip mapping the shadow stack as a writable page and the redundant memory
zeroing.

Currently, the shadow stack is allocated using alloc_page(), then mapped as
a writable page, zeroed, and finally mapped as a shadow stack page. The
memory zeroing is redundant as alloc_page() already does that.

This also eliminates the need for invlpg, as the shadow stack is no
longer mapped writable.

Signed-off-by: Chao Gao <chao.gao@intel.com>
[mks: drop invlpg() as it's no longer needed, adapted changelog accordingly]
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
v2:
- dropped invlpg(), no longer needed

 x86/cet.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index 51a54a509a47..ce4de5e44c35 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -67,7 +67,6 @@ int main(int ac, char **av)
 {
 	char *shstk_virt;
 	unsigned long shstk_phys;
-	unsigned long *ptep;
 	pteval_t pte = 0;
 	bool rvc;
 
@@ -90,17 +89,8 @@ int main(int ac, char **av)
 	shstk_phys = (unsigned long)virt_to_phys(alloc_page());
 
 	/* Install the new page. */
-	pte = shstk_phys | PT_PRESENT_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
+	pte = shstk_phys | PT_PRESENT_MASK | PT_USER_MASK | PT_DIRTY_MASK;
 	install_pte(current_page_table(), 1, shstk_virt, pte, 0);
-	memset(shstk_virt, 0x0, PAGE_SIZE);
-
-	/* Mark it as shadow-stack page. */
-	ptep = get_pte_level(current_page_table(), shstk_virt, 1);
-	*ptep &= ~PT_WRITABLE_MASK;
-	*ptep |= PT_DIRTY_MASK;
-
-	/* Flush the paging cache. */
-	invlpg((void *)shstk_virt);
 
 	/* Enable shadow-stack protection */
 	wrmsr(MSR_IA32_U_CET, ENABLE_SHSTK_BIT);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 03/13] x86: cet: Directly check for #CP exception in run_in_user()
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 01/13] x86: cet: Pass virtual addresses to invlpg Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 02/13] x86: cet: Remove unnecessary memory zeroing for shadow stack Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 04/13] x86: cet: Validate #CP error code Mathias Krause
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

Current CET tests validate if a #CP exception is raised by registering
a #CP handler. This handler counts the #CP exceptions and raises a #GP
exception, which is then caught by the run_in_user() infrastructure to
switch back to the kernel. This is convoluted.

Catch the #CP exception directly by run_in_user() to avoid the manual
counting of #CP exceptions and the #CP->#GP dance.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 23 ++++-------------------
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index ce4de5e44c35..1ce576fe0291 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -8,9 +8,6 @@
 #include "alloc_page.h"
 #include "fault_test.h"
 
-static int cp_count;
-static unsigned long invalid_offset = 0xffffffffffffff;
-
 static u64 cet_shstk_func(void)
 {
 	unsigned long *ret_addr, *ssp;
@@ -54,15 +51,6 @@ static u64 cet_ibt_func(void)
 #define ENABLE_SHSTK_BIT 0x1
 #define ENABLE_IBT_BIT   0x4
 
-static void handle_cp(struct ex_regs *regs)
-{
-	cp_count++;
-	printf("In #CP exception handler, error_code = 0x%lx\n",
-		regs->error_code);
-	/* Below jmp is expected to trigger #GP */
-	asm("jmpq *%0": :"m"(invalid_offset));
-}
-
 int main(int ac, char **av)
 {
 	char *shstk_virt;
@@ -70,7 +58,6 @@ int main(int ac, char **av)
 	pteval_t pte = 0;
 	bool rvc;
 
-	cp_count = 0;
 	if (!this_cpu_has(X86_FEATURE_SHSTK)) {
 		printf("SHSTK not enabled\n");
 		return report_summary();
@@ -82,7 +69,6 @@ int main(int ac, char **av)
 	}
 
 	setup_vm();
-	handle_exception(CP_VECTOR, handle_cp);
 
 	/* Allocate one page for shadow-stack. */
 	shstk_virt = alloc_vpage();
@@ -102,15 +88,14 @@ int main(int ac, char **av)
 	write_cr4(read_cr4() | X86_CR4_CET);
 
 	printf("Unit test for CET user mode...\n");
-	run_in_user((usermode_func)cet_shstk_func, GP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(cp_count == 1, "Completed shadow-stack protection test successfully.");
-	cp_count = 0;
+	run_in_user((usermode_func)cet_shstk_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
+	report(rvc, "Shadow-stack protection test.");
 
 	/* Enable indirect-branch tracking */
 	wrmsr(MSR_IA32_U_CET, ENABLE_IBT_BIT);
 
-	run_in_user((usermode_func)cet_ibt_func, GP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(cp_count == 1, "Completed Indirect-branch tracking test successfully.");
+	run_in_user((usermode_func)cet_ibt_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
+	report(rvc, "Indirect-branch tracking test.");
 
 	write_cr4(read_cr4() & ~X86_CR4_CET);
 	wrmsr(MSR_IA32_U_CET, 0);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 04/13] x86: cet: Validate #CP error code
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (2 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 03/13] x86: cet: Directly check for #CP exception in run_in_user() Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 05/13] x86: cet: Use report_skip() Mathias Krause
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

The #CP exceptions include an error code that provides additional
information about how the exception occurred. Previously, CET tests simply
printed these error codes without validation.

Enhance the CET tests to validate the #CP error code.

This requires the run_in_user() infrastructure to catch the exception
vector, error code, and rflags, similar to what check_exception_table()
does.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 lib/x86/usermode.c | 4 ++++
 x86/cet.c          | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c
index c3ec0ad763d3..f896e3bdcbdb 100644
--- a/lib/x86/usermode.c
+++ b/lib/x86/usermode.c
@@ -23,6 +23,10 @@ static void restore_exec_to_jmpbuf(void)
 
 static void restore_exec_to_jmpbuf_exception_handler(struct ex_regs *regs)
 {
+	this_cpu_write_exception_vector(regs->vector);
+	this_cpu_write_exception_rflags_rf((regs->rflags >> 16) & 1);
+	this_cpu_write_exception_error_code(regs->error_code);
+
 	/* longjmp must happen after iret, so do not do it now.  */
 	regs->rip = (unsigned long)&restore_exec_to_jmpbuf;
 	regs->cs = KERNEL_CS;
diff --git a/x86/cet.c b/x86/cet.c
index 1ce576fe0291..b9699b0de787 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -89,13 +89,13 @@ int main(int ac, char **av)
 
 	printf("Unit test for CET user mode...\n");
 	run_in_user((usermode_func)cet_shstk_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(rvc, "Shadow-stack protection test.");
+	report(rvc && exception_error_code() == 1, "Shadow-stack protection test.");
 
 	/* Enable indirect-branch tracking */
 	wrmsr(MSR_IA32_U_CET, ENABLE_IBT_BIT);
 
 	run_in_user((usermode_func)cet_ibt_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(rvc, "Indirect-branch tracking test.");
+	report(rvc && exception_error_code() == 3, "Indirect-branch tracking test.");
 
 	write_cr4(read_cr4() & ~X86_CR4_CET);
 	wrmsr(MSR_IA32_U_CET, 0);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 05/13] x86: cet: Use report_skip()
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (3 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 04/13] x86: cet: Validate #CP error code Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 06/13] x86: cet: Drop unnecessary casting Mathias Krause
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

report_skip() function is preferred for skipping inapplicable tests when
the necessary hardware features are unavailable. For example, with this
patch applied, the test output is as follows if IBT is not supported:

SKIP: IBT not enabled
SUMMARY: 1 tests, 1 skipped

Previously, it printed:

IBT not enabled
SUMMARY: 0 tests

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index b9699b0de787..1459c3dd3d67 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -59,12 +59,12 @@ int main(int ac, char **av)
 	bool rvc;
 
 	if (!this_cpu_has(X86_FEATURE_SHSTK)) {
-		printf("SHSTK not enabled\n");
+		report_skip("SHSTK not enabled");
 		return report_summary();
 	}
 
 	if (!this_cpu_has(X86_FEATURE_IBT)) {
-		printf("IBT not enabled\n");
+		report_skip("IBT not enabled");
 		return report_summary();
 	}
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 06/13] x86: cet: Drop unnecessary casting
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (4 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 05/13] x86: cet: Use report_skip() Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 07/13] x86: cet: Validate writing unaligned values to SSP MSR causes #GP Mathias Krause
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

cet_shstk_func() and cet_ibt_func() have the same type as usermode_func.
So, remove the unnecessary casting.

Signed-off-by: Chao Gao <chao.gao@intel.com>
[mks: make the types really equal by using uint64_t]
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
v2: 
- change return type to uint64_t

 x86/cet.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index 1459c3dd3d67..61db913d3985 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -8,7 +8,7 @@
 #include "alloc_page.h"
 #include "fault_test.h"
 
-static u64 cet_shstk_func(void)
+static uint64_t cet_shstk_func(void)
 {
 	unsigned long *ret_addr, *ssp;
 
@@ -31,7 +31,7 @@ static u64 cet_shstk_func(void)
 	return 0;
 }
 
-static u64 cet_ibt_func(void)
+static uint64_t cet_ibt_func(void)
 {
 	/*
 	 * In below assembly code, the first instruction at label 2 is not
@@ -88,13 +88,13 @@ int main(int ac, char **av)
 	write_cr4(read_cr4() | X86_CR4_CET);
 
 	printf("Unit test for CET user mode...\n");
-	run_in_user((usermode_func)cet_shstk_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
+	run_in_user(cet_shstk_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
 	report(rvc && exception_error_code() == 1, "Shadow-stack protection test.");
 
 	/* Enable indirect-branch tracking */
 	wrmsr(MSR_IA32_U_CET, ENABLE_IBT_BIT);
 
-	run_in_user((usermode_func)cet_ibt_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
+	run_in_user(cet_ibt_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
 	report(rvc && exception_error_code() == 3, "Indirect-branch tracking test.");
 
 	write_cr4(read_cr4() & ~X86_CR4_CET);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 07/13] x86: cet: Validate writing unaligned values to SSP MSR causes #GP
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (5 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 06/13] x86: cet: Drop unnecessary casting Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions Mathias Krause
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

Validate that writing invalid values to SSP MSRs triggers a #GP exception.
This verifies that necessary validity checks are performed by the hardware
or the underlying VMM.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/x86/cet.c b/x86/cet.c
index 61db913d3985..72af7526df69 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -56,6 +56,7 @@ int main(int ac, char **av)
 	char *shstk_virt;
 	unsigned long shstk_phys;
 	pteval_t pte = 0;
+	u8 vector;
 	bool rvc;
 
 	if (!this_cpu_has(X86_FEATURE_SHSTK)) {
@@ -100,5 +101,9 @@ int main(int ac, char **av)
 	write_cr4(read_cr4() & ~X86_CR4_CET);
 	wrmsr(MSR_IA32_U_CET, 0);
 
+	/* SSP should be 4-Byte aligned */
+	vector = wrmsr_safe(MSR_IA32_PL3_SSP, 0x1);
+	report(vector == GP_VECTOR, "MSR_IA32_PL3_SSP alignment test.");
+
 	return report_summary();
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (6 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 07/13] x86: cet: Validate writing unaligned values to SSP MSR causes #GP Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-07-11  8:51   ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 09/13] x86: cet: Make shadow stack less fragile Mathias Krause
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

From: Chao Gao <chao.gao@intel.com>

Add tests to verify that CET states are correctly handled during VMX
transitions.

The following behaviors are verified:

1. Host states are loaded from VMCS iff "Load CET" VM-exit control is set
2. Guest states are loaded from VMCS iff "Load CET" VM-entry control is set
3. Guest states are saved to VMCS during VM exits unconditionally
4. Invalid guest or host CET states leads to VM entry failures.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 lib/x86/msr.h     |  1 +
 x86/vmx.h         |  8 +++--
 x86/vmx_tests.c   | 81 +++++++++++++++++++++++++++++++++++++++++++++++
 x86/unittests.cfg |  7 ++++
 4 files changed, 95 insertions(+), 2 deletions(-)

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index cc4cb8551ea1..df6d3049f8ca 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -296,6 +296,7 @@
 #define MSR_IA32_FEATURE_CONTROL        0x0000003a
 #define MSR_IA32_TSC_ADJUST		0x0000003b
 #define MSR_IA32_U_CET                  0x000006a0
+#define MSR_IA32_S_CET                  0x000006a2
 #define MSR_IA32_PL3_SSP                0x000006a7
 #define MSR_IA32_PKRS			0x000006e1
 
diff --git a/x86/vmx.h b/x86/vmx.h
index 9cd90488cea6..33373bd1a2a9 100644
--- a/x86/vmx.h
+++ b/x86/vmx.h
@@ -356,6 +356,7 @@ enum Encoding {
 	GUEST_PENDING_DEBUG	= 0x6822ul,
 	GUEST_SYSENTER_ESP	= 0x6824ul,
 	GUEST_SYSENTER_EIP	= 0x6826ul,
+	GUEST_S_CET		= 0x6828ul,
 
 	/* Natural-Width Host State Fields */
 	HOST_CR0		= 0x6c00ul,
@@ -369,7 +370,8 @@ enum Encoding {
 	HOST_SYSENTER_ESP	= 0x6c10ul,
 	HOST_SYSENTER_EIP	= 0x6c12ul,
 	HOST_RSP		= 0x6c14ul,
-	HOST_RIP		= 0x6c16ul
+	HOST_RIP		= 0x6c16ul,
+	HOST_S_CET		= 0x6c18ul,
 };
 
 #define VMX_ENTRY_FAILURE	(1ul << 31)
@@ -449,6 +451,7 @@ enum Ctrl_exi {
 	EXI_SAVE_EFER		= 1UL << 20,
 	EXI_LOAD_EFER		= 1UL << 21,
 	EXI_SAVE_PREEMPT	= 1UL << 22,
+	EXI_LOAD_CET		= 1UL << 28,
 };
 
 enum Ctrl_ent {
@@ -457,7 +460,8 @@ enum Ctrl_ent {
 	ENT_LOAD_PERF		= 1UL << 13,
 	ENT_LOAD_PAT		= 1UL << 14,
 	ENT_LOAD_EFER		= 1UL << 15,
-	ENT_LOAD_BNDCFGS	= 1UL << 16
+	ENT_LOAD_BNDCFGS	= 1UL << 16,
+	ENT_LOAD_CET		= 1UL << 20,
 };
 
 enum Ctrl_pin {
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 0b3cfe50c614..6383bc83da97 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -11377,6 +11377,85 @@ static void vmx_posted_interrupts_test(void)
 	enter_guest();
 }
 
+static u64 guest_s_cet = -1;
+
+static void vmx_cet_test_guest(void)
+{
+	guest_s_cet = rdmsr(MSR_IA32_S_CET);
+	vmcall();
+}
+
+static void vmx_cet_test(void)
+{
+	struct vmcs *curr;
+	u64 val;
+
+	if (!(ctrl_exit_rev.clr & EXI_LOAD_CET)) {
+		report_skip("Load CET state exit control is not available");
+		return;
+	}
+
+	if (!(ctrl_enter_rev.clr & ENT_LOAD_CET)) {
+		report_skip("Load CET state entry control is not available");
+		return;
+	}
+
+	/* Allow the guest to read GUEST_S_CET directly */
+	msr_bmp_init();
+
+	/*
+	 * Check whether VMCS transitions load host and guest values
+	 * according to the settings of the relevant VM-entry and exit
+	 * controls.
+	 */
+	vmcs_write(HOST_S_CET, 2);
+	vmcs_write(GUEST_S_CET, 2);
+	test_set_guest(vmx_cet_test_guest);
+
+	enter_guest();
+	val = rdmsr(MSR_IA32_S_CET);
+
+	/* Validate both guest/host S_CET MSR have the default values */
+	report(val == 0 && guest_s_cet == 0, "Load CET state disabled");
+
+	/*
+	 * CPU supports the 1-setting of the 'load CET' VM-entry control,
+	 * the contents of the IA32_S_CET and IA32_INTERRUPT_SSP_TABLE_ADDR
+	 * MSRs are saved into the corresponding fields
+	 */
+	report(vmcs_read(GUEST_S_CET) == 0, "S_CET is unconditionally saved");
+
+	/* Enable load CET state entry/exit controls and retest */
+	vmcs_set_bits(EXI_CONTROLS, EXI_LOAD_CET);
+	vmcs_set_bits(ENT_CONTROLS, ENT_LOAD_CET);
+	vmcs_write(GUEST_S_CET, 2);
+	test_override_guest(vmx_cet_test_guest);
+
+	enter_guest();
+	val = rdmsr(MSR_IA32_S_CET);
+
+	/* Validate both guest/host S_CET MSR are loaded from VMCS */
+	report(val == 2 && guest_s_cet == 2, "Load CET state enabled");
+
+	/*
+	 * Validate that bit 10 (SUPPRESS) and Bit 11 (TRACKER) cannot be
+	 * both set
+	 */
+	val = BIT(10) | BIT(11);
+	vmcs_write(GUEST_S_CET, val);
+	test_guest_state("Load invalid guest CET state", true, val, "GUEST_S_CET");
+
+	/* Following test_vmx_vmlaunch() needs a "not launched" VMCS */
+	vmcs_save(&curr);
+	vmcs_clear(curr);
+	make_vmcs_current(curr);
+
+	vmcs_write(HOST_S_CET, val);
+	test_vmx_vmlaunch(VMXERR_ENTRY_INVALID_HOST_STATE_FIELD);
+
+	test_set_guest_finished();
+}
+
 #define TEST(name) { #name, .v2 = name }
 
 /* name/init/guest_main/exit_handler/syscall_handler/guest_regs */
@@ -11490,5 +11569,7 @@ struct vmx_test vmx_tests[] = {
 	TEST(vmx_pf_vpid_test),
 	TEST(vmx_exception_test),
 	TEST(vmx_canonical_test),
+	/* "Load CET" VM-entry/exit controls tests. */
+	TEST(vmx_cet_test),
 	{ NULL, NULL, NULL, NULL, NULL, {0} },
 };
diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index a2b351ff552a..d07f65b6b207 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -427,6 +427,13 @@ arch = x86_64
 groups = vmx nested_exception
 check = /sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr=Y
 
+[vmx_cet_test]
+file = vmx.flat
+extra_params = -cpu max,+vmx -append "vmx_cet_test"
+arch = x86_64
+groups = vmx
+timeout = 240
+
 [debug]
 file = debug.flat
 arch = x86_64
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 09/13] x86: cet: Make shadow stack less fragile
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (7 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 10/13] x86: cet: Simplify IBT test Mathias Krause
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

The CET shadow stack test has certain assumptions about the code, namely
that it was compiled with frame pointers enabled and the return address
won't be 0xdeaddead.

Make the code less fragile by actually lifting these assumptions to (1)
explicitly mention the dependency to the frame pointer by making us of
__builtin_frame_address(0) and (2) modify the return address by toggling
bits instead of writing a fixed value. Also ensure that write will
actually be generated by the compiler by making it a 'volatile' write.

Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index 72af7526df69..50546c5eee05 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -10,14 +10,14 @@
 
 static uint64_t cet_shstk_func(void)
 {
-	unsigned long *ret_addr, *ssp;
+	unsigned long *ret_addr = __builtin_frame_address(0) + sizeof(void *);
+	unsigned long *ssp;
 
 	/* rdsspq %rax */
 	asm volatile (".byte 0xf3, 0x48, 0x0f, 0x1e, 0xc8" : "=a"(ssp));
 
-	asm("movq %%rbp,%0" : "=r"(ret_addr));
 	printf("The return-address in shadow-stack = 0x%lx, in normal stack = 0x%lx\n",
-	       *ssp, *(ret_addr + 1));
+	       *ssp, *ret_addr);
 
 	/*
 	 * In below line, it modifies the return address, it'll trigger #CP
@@ -26,7 +26,7 @@ static uint64_t cet_shstk_func(void)
 	 * when HW detects the violation.
 	 */
 	printf("Try to temper the return-address, this causes #CP on returning...\n");
-	*(ret_addr + 1) = 0xdeaddead;
+	*(volatile unsigned long *)ret_addr ^= 0xdeaddead;
 
 	return 0;
 }
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 10/13] x86: cet: Simplify IBT test
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (8 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 09/13] x86: cet: Make shadow stack less fragile Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 11/13] x86: cet: Use symbolic values for the #CP error codes Mathias Krause
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

The inline assembly of cet_ibt_func() does unnecessary things and
doesn't mention the clobbered registers.

Fix that by reducing the code to what's needed (an indirect jump to a
target lacking the ENDBR instruction) and passing and output register
variable for it.

Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index 50546c5eee05..dfc2484cba5d 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -33,18 +33,17 @@ static uint64_t cet_shstk_func(void)
 
 static uint64_t cet_ibt_func(void)
 {
+	unsigned long tmp;
 	/*
 	 * In below assembly code, the first instruction at label 2 is not
 	 * endbr64, it'll trigger #CP with error code 0x3, and the execution
 	 * is terminated when HW detects the violation.
 	 */
 	printf("No endbr64 instruction at jmp target, this triggers #CP...\n");
-	asm volatile ("movq $2, %rcx\n"
-		      "dec %rcx\n"
-		      "leaq 2f(%rip), %rax\n"
-		      "jmp *%rax \n"
-		      "2:\n"
-		      "dec %rcx\n");
+	asm volatile ("leaq 2f(%%rip), %0\n\t"
+		      "jmpq *%0\n\t"
+		      "2:"
+		      : "=r"(tmp));
 	return 0;
 }
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 11/13] x86: cet: Use symbolic values for the #CP error codes
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (9 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 10/13] x86: cet: Simplify IBT test Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 12/13] x86: cet: Test far returns too Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64 Mathias Krause
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

Use symbolic names for the #CP exception error codes.

Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
 x86/cet.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/x86/cet.c b/x86/cet.c
index dfc2484cba5d..dbaecc7d74d7 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -47,6 +47,13 @@ static uint64_t cet_ibt_func(void)
 	return 0;
 }
 
+#define CP_ERR_NEAR_RET	0x0001
+#define CP_ERR_FAR_RET	0x0002
+#define CP_ERR_ENDBR	0x0003
+#define CP_ERR_RSTORSSP	0x0004
+#define CP_ERR_SETSSBSY	0x0005
+#define CP_ERR_ENCL		BIT(15)
+
 #define ENABLE_SHSTK_BIT 0x1
 #define ENABLE_IBT_BIT   0x4
 
@@ -87,15 +94,17 @@ int main(int ac, char **av)
 	/* Enable CET master control bit in CR4. */
 	write_cr4(read_cr4() | X86_CR4_CET);
 
-	printf("Unit test for CET user mode...\n");
+	printf("Unit tests for CET user mode...\n");
 	run_in_user(cet_shstk_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(rvc && exception_error_code() == 1, "Shadow-stack protection test.");
+	report(rvc && exception_error_code() == CP_ERR_NEAR_RET,
+	       "NEAR RET shadow-stack protection test");
 
 	/* Enable indirect-branch tracking */
 	wrmsr(MSR_IA32_U_CET, ENABLE_IBT_BIT);
 
 	run_in_user(cet_ibt_func, CP_VECTOR, 0, 0, 0, 0, &rvc);
-	report(rvc && exception_error_code() == 3, "Indirect-branch tracking test.");
+	report(rvc && exception_error_code() == CP_ERR_ENDBR,
+	       "Indirect-branch tracking test");
 
 	write_cr4(read_cr4() & ~X86_CR4_CET);
 	wrmsr(MSR_IA32_U_CET, 0);
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 12/13] x86: cet: Test far returns too
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (10 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 11/13] x86: cet: Use symbolic values for the #CP error codes Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64 Mathias Krause
  12 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

Add a test for far returns which has a dedicated error code.

Tested-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
v2:
- rebase on top of Chao Gao's series

 x86/cet.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/x86/cet.c b/x86/cet.c
index dbaecc7d74d7..99333c35a028 100644
--- a/x86/cet.c
+++ b/x86/cet.c
@@ -31,6 +31,34 @@ static uint64_t cet_shstk_func(void)
 	return 0;
 }
 
+static uint64_t cet_shstk_far_ret(void)
+{
+	struct far_pointer32 fp = {
+		.offset = (uintptr_t)&&far_func,
+		.selector = USER_CS,
+	};
+
+	if (fp.offset != (uintptr_t)&&far_func) {
+		printf("Code address too high.\n");
+		return -1;
+	}
+
+	printf("Try to temper the return-address of far-called function...\n");
+
+	/* The NOP isn't superfluous, the called function tries to skip it. */
+	asm goto ("lcall *%0; nop" : : "m" (fp) : : far_func);
+
+	printf("Uhm... how did we get here?! This should have #CP'ed!\n");
+
+	return 0;
+far_func:
+	asm volatile (/* mess with the ret addr, make it point past the NOP */
+		      "incq (%rsp)\n\t"
+		      /* 32-bit return, just as we have been called */
+		      "lret");
+	__builtin_unreachable();
+}
+
 static uint64_t cet_ibt_func(void)
 {
 	unsigned long tmp;
@@ -99,6 +127,10 @@ int main(int ac, char **av)
 	report(rvc && exception_error_code() == CP_ERR_NEAR_RET,
 	       "NEAR RET shadow-stack protection test");
 
+	run_in_user(cet_shstk_far_ret, CP_VECTOR, 0, 0, 0, 0, &rvc);
+	report(rvc && exception_error_code() == CP_ERR_FAR_RET,
+	       "FAR RET shadow-stack protection test");
+
 	/* Enable indirect-branch tracking */
 	wrmsr(MSR_IA32_U_CET, ENABLE_IBT_BIT);
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64
  2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
                   ` (11 preceding siblings ...)
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 12/13] x86: cet: Test far returns too Mathias Krause
@ 2025-06-26  7:34 ` Mathias Krause
  2025-09-12 14:37   ` Sean Christopherson
  12 siblings, 1 reply; 17+ messages in thread
From: Mathias Krause @ 2025-06-26  7:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm, Mathias Krause

The x86-64 implementation if setup_mmu() doesn't initialize 'vfree_top'
and leaves it at its zero-value. This isn't wrong per se, however, it
leads to odd configurations when the first vmalloc/vmap page gets
allocated. It'll be the very last page in the virtual address space --
which is an interesting corner case -- but its boundary will probably
wrap. It does so, for CET's shadow stack, at least, which loads the
shadow stack pointer with the base address of the mapped page plus its
size, i.e. 0xffffffff_fffff000 + 4096, which wraps to 0x0.

The CPU seems to handle such configurations just fine. However, it feels
odd to set the shadow stack pointer to "NULL".

To avoid the wrapping, ignore the top most page by initializing
'vfree_top' to just one page below.

Reviewed-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
v2:
- change comment in x86/lam.c too

 lib/x86/vm.c |  2 ++
 x86/lam.c    | 10 +++++-----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/x86/vm.c b/lib/x86/vm.c
index 90f73fbb2dfd..27e7bb4004ef 100644
--- a/lib/x86/vm.c
+++ b/lib/x86/vm.c
@@ -191,6 +191,8 @@ void *setup_mmu(phys_addr_t end_of_memory, void *opt_mask)
         end_of_memory = (1ul << 32);  /* map mmio 1:1 */
 
     setup_mmu_range(cr3, 0, end_of_memory);
+    /* skip the last page for out-of-bound and wrap-around reasons */
+    init_alloc_vpage((void *)(~(PAGE_SIZE - 1)));
 #else
     setup_mmu_range(cr3, 0, (2ul << 30));
     setup_mmu_range(cr3, 3ul << 30, (1ul << 30));
diff --git a/x86/lam.c b/x86/lam.c
index 1af6c5fdd80a..87efc5dd4a72 100644
--- a/x86/lam.c
+++ b/x86/lam.c
@@ -197,11 +197,11 @@ static void test_lam_sup(void)
 	int vector;
 
 	/*
-	 * KUT initializes vfree_top to 0 for X86_64, and each virtual address
-	 * allocation decreases the size from vfree_top. It's guaranteed that
-	 * the return value of alloc_vpage() is considered as kernel mode
-	 * address and canonical since only a small amount of virtual address
-	 * range is allocated in this test.
+	 * KUT initializes vfree_top to -PAGE_SIZE for X86_64, and each virtual
+	 * address allocation decreases the size from vfree_top. It's
+	 * guaranteed that the return value of alloc_vpage() is considered as
+	 * kernel mode address and canonical since only a small amount of
+	 * virtual address range is allocated in this test.
 	 */
 	vaddr = alloc_vpage();
 	vaddr_mmio = alloc_vpage();
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions Mathias Krause
@ 2025-07-11  8:51   ` Mathias Krause
  0 siblings, 0 replies; 17+ messages in thread
From: Mathias Krause @ 2025-07-11  8:51 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Sean Christopherson, Chao Gao, kvm

On 26.06.25 09:34, Mathias Krause wrote:
> [...]

> diff --git a/x86/unittests.cfg b/x86/unittests.cfg
> index a2b351ff552a..d07f65b6b207 100644
> --- a/x86/unittests.cfg
> +++ b/x86/unittests.cfg
> @@ -427,6 +427,13 @@ arch = x86_64
>  groups = vmx nested_exception
>  check = /sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr=Y
>  
> +[vmx_cet_test]
> +file = vmx.flat
> +extra_params = -cpu max,+vmx -append "vmx_cet_test"
> +arch = x86_64
> +groups = vmx
> +timeout = 240
> +
>  [debug]
>  file = debug.flat
>  arch = x86_64

This needs the following fixup since commit a7794f16c84a ("scripts: Add
'test_args' test definition parameter"):

diff --git a/x86/unittests.cfg b/x86/unittests.cfg
index ec07d26b7016..a814bfacf052 100644
--- a/x86/unittests.cfg
+++ b/x86/unittests.cfg
@@ -455,7 +455,8 @@ check = /sys/module/kvm_intel/parameters/allow_smaller_maxphyaddr=Y

 [vmx_cet_test]
 file = vmx.flat
-extra_params = -cpu max,+vmx -append "vmx_cet_test"
+test_args = "vmx_cet_test"
+qemu_params = -cpu max,+vmx
 arch = x86_64
 groups = vmx
 timeout = 240

I can fold it into a respin if you want me to.

Thanks,
Mathias

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64
  2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64 Mathias Krause
@ 2025-09-12 14:37   ` Sean Christopherson
  2025-09-14 11:05     ` Chao Gao
  0 siblings, 1 reply; 17+ messages in thread
From: Sean Christopherson @ 2025-09-12 14:37 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Paolo Bonzini, Chao Gao, kvm

On Thu, Jun 26, 2025, Mathias Krause wrote:
> The x86-64 implementation if setup_mmu() doesn't initialize 'vfree_top'
> and leaves it at its zero-value. This isn't wrong per se, however, it
> leads to odd configurations when the first vmalloc/vmap page gets
> allocated. It'll be the very last page in the virtual address space --
> which is an interesting corner case -- but its boundary will probably
> wrap. It does so, for CET's shadow stack, at least, which loads the
> shadow stack pointer with the base address of the mapped page plus its
> size, i.e. 0xffffffff_fffff000 + 4096, which wraps to 0x0.
> 
> The CPU seems to handle such configurations just fine. However, it feels
> odd to set the shadow stack pointer to "NULL".
> 
> To avoid the wrapping, ignore the top most page by initializing
> 'vfree_top' to just one page below.
> 
> Reviewed-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> ---
> v2:
> - change comment in x86/lam.c too
> 
>  lib/x86/vm.c |  2 ++
>  x86/lam.c    | 10 +++++-----
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/x86/vm.c b/lib/x86/vm.c
> index 90f73fbb2dfd..27e7bb4004ef 100644
> --- a/lib/x86/vm.c
> +++ b/lib/x86/vm.c
> @@ -191,6 +191,8 @@ void *setup_mmu(phys_addr_t end_of_memory, void *opt_mask)
>          end_of_memory = (1ul << 32);  /* map mmio 1:1 */
>  
>      setup_mmu_range(cr3, 0, end_of_memory);
> +    /* skip the last page for out-of-bound and wrap-around reasons */
> +    init_alloc_vpage((void *)(~(PAGE_SIZE - 1)));

This breaks the eventinj test (leads to SHUTDOWN).  I haven't spent any time
trying to figure out why.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64
  2025-09-12 14:37   ` Sean Christopherson
@ 2025-09-14 11:05     ` Chao Gao
  0 siblings, 0 replies; 17+ messages in thread
From: Chao Gao @ 2025-09-14 11:05 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Mathias Krause, Paolo Bonzini, kvm

On Fri, Sep 12, 2025 at 07:37:54AM -0700, Sean Christopherson wrote:
>On Thu, Jun 26, 2025, Mathias Krause wrote:
>> The x86-64 implementation if setup_mmu() doesn't initialize 'vfree_top'
>> and leaves it at its zero-value. This isn't wrong per se, however, it
>> leads to odd configurations when the first vmalloc/vmap page gets
>> allocated. It'll be the very last page in the virtual address space --
>> which is an interesting corner case -- but its boundary will probably
>> wrap. It does so, for CET's shadow stack, at least, which loads the
>> shadow stack pointer with the base address of the mapped page plus its
>> size, i.e. 0xffffffff_fffff000 + 4096, which wraps to 0x0.
>> 
>> The CPU seems to handle such configurations just fine. However, it feels
>> odd to set the shadow stack pointer to "NULL".
>> 
>> To avoid the wrapping, ignore the top most page by initializing
>> 'vfree_top' to just one page below.
>> 
>> Reviewed-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
>> ---
>> v2:
>> - change comment in x86/lam.c too
>> 
>>  lib/x86/vm.c |  2 ++
>>  x86/lam.c    | 10 +++++-----
>>  2 files changed, 7 insertions(+), 5 deletions(-)
>> 
>> diff --git a/lib/x86/vm.c b/lib/x86/vm.c
>> index 90f73fbb2dfd..27e7bb4004ef 100644
>> --- a/lib/x86/vm.c
>> +++ b/lib/x86/vm.c
>> @@ -191,6 +191,8 @@ void *setup_mmu(phys_addr_t end_of_memory, void *opt_mask)
>>          end_of_memory = (1ul << 32);  /* map mmio 1:1 */
>>  
>>      setup_mmu_range(cr3, 0, end_of_memory);
>> +    /* skip the last page for out-of-bound and wrap-around reasons */
>> +    init_alloc_vpage((void *)(~(PAGE_SIZE - 1)));
>
>This breaks the eventinj test (leads to SHUTDOWN).  I haven't spent any time
>trying to figure out why.

do_iret in eventinj.c is buggy. When crafting an IRET frame, it just sets
RFLAGS, CS and RIP properly, leaving SS and RSP as 0. So after IRET, RSP
becomes 0. Then, a nested NMI is injected right after IRET and accesses the
stack. If the stack (i.e., the page starts from 0xffffffff_fffff000) isn't
mapped, the NMI will cause TRIPLE FAULT.

The issue goes unnoticed because w/o this patch, the handle_irq()->malloc() at
the beginning of eventinj.c:main() allocates and maps the page 0xffffffff_fffff000

The issue can be fixed with below diff:

index 6fbb2d0f..945f7d11 100644
--- a/x86/eventinj.c
+++ b/x86/eventinj.c
@@ -132,7 +132,7 @@ static void nested_nmi_iret_isr(struct ex_regs *r)
 {
        printf("Nested NMI isr running rip=%lx\n", r->rip);

-       if (r->rip == iret_stack[-3])
+       if (r->rip == iret_stack[-4])
                test_count++;
 }

@@ -152,6 +152,7 @@ asm("do_iret:"
        "mov 8(%esp), %edx \n\t"        // virt_stack
 #endif
        "xchg %"R "dx, %"R "sp \n\t"    // point to new stack
+       "push"W" %"R "dx \n\t"
        "pushf"W" \n\t"
        "mov %cs, %ecx \n\t"
        "push"W" %"R "cx \n\t"

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-09-14 11:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-26  7:34 [kvm-unit-tests PATCH v2 00/13] Improve CET tests Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 01/13] x86: cet: Pass virtual addresses to invlpg Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 02/13] x86: cet: Remove unnecessary memory zeroing for shadow stack Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 03/13] x86: cet: Directly check for #CP exception in run_in_user() Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 04/13] x86: cet: Validate #CP error code Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 05/13] x86: cet: Use report_skip() Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 06/13] x86: cet: Drop unnecessary casting Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 07/13] x86: cet: Validate writing unaligned values to SSP MSR causes #GP Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 08/13] x86: cet: Validate CET states during VMX transitions Mathias Krause
2025-07-11  8:51   ` Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 09/13] x86: cet: Make shadow stack less fragile Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 10/13] x86: cet: Simplify IBT test Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 11/13] x86: cet: Use symbolic values for the #CP error codes Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 12/13] x86: cet: Test far returns too Mathias Krause
2025-06-26  7:34 ` [kvm-unit-tests PATCH v2 13/13] x86: Avoid top-most page for vmalloc on x86-64 Mathias Krause
2025-09-12 14:37   ` Sean Christopherson
2025-09-14 11:05     ` Chao Gao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox