* [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access
@ 2023-04-03 10:56 Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 1/4] x86: Use existing CR0.WP / CR4.SMEP bit definitions Mathias Krause
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 10:56 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson, Mathias Krause
v2: https://lore.kernel.org/kvm/20230331135709.132713-1-minipli@grsecurity.net/
This series adds explicit tests that verify a page fault will occur for
attempts to write to an r/o page while CR0.WP is 1 as well as access is
granted when CR0.WP is 0.
There are existing tests already, e.g. in pks.c, pku.c, smap.c or even
access.c that implicitly test this. However, they all either explicitly
(via INVLPG) or implicitly (via CR3 reload) flush the TLB before doing
the access which might lead to false positives if the access succeeded
before, e.g. because CR0.WP was 0 before.
Better to have an explicit test, especially to back up the changes of
[1] which were missing the emulator case, initially.
Changes from v2 to v3 integrate Sean's feedback, especially the
changelogs were rewritten to avoid pronouns and the guts of the CR0.WP
toggling test were moved to a helper function.
Please apply!
Thanks,
Mathias
[1] https://lore.kernel.org/kvm/20230322013731.102955-1-minipli@grsecurity.net/
Mathias Krause (4):
x86: Use existing CR0.WP / CR4.SMEP bit definitions
x86/access: CR0.WP toggling write to r/o data test
x86/access: Forced emulation support
x86/access: Try emulation for CR0.WP test as well
x86/access.c | 114 ++++++++++++++++++++++++++++++++++++++++++++-------
x86/pks.c | 5 +--
x86/pku.c | 5 +--
3 files changed, 104 insertions(+), 20 deletions(-)
--
2.39.2
^ permalink raw reply [flat|nested] 13+ messages in thread
* [kvm-unit-tests PATCH v3 1/4] x86: Use existing CR0.WP / CR4.SMEP bit definitions
2023-04-03 10:56 [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access Mathias Krause
@ 2023-04-03 10:56 ` Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 2/4] x86/access: CR0.WP toggling write to r/o data test Mathias Krause
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 10:56 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson, Mathias Krause
Use the existing bit definitions from x86/processor.h instead of
defining one-off versions in individual tests.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
x86/access.c | 11 ++++-------
x86/pks.c | 5 ++---
x86/pku.c | 5 ++---
3 files changed, 8 insertions(+), 13 deletions(-)
diff --git a/x86/access.c b/x86/access.c
index 4dfec23073f7..203353a3f74f 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -20,9 +20,6 @@ static int invalid_mask;
#define PT_BASE_ADDR_MASK ((pt_element_t)((((pt_element_t)1 << 36) - 1) & PAGE_MASK))
#define PT_PSE_BASE_ADDR_MASK (PT_BASE_ADDR_MASK & ~(1ull << 21))
-#define CR0_WP_MASK (1UL << 16)
-#define CR4_SMEP_MASK (1UL << 20)
-
#define PFERR_PRESENT_MASK (1U << 0)
#define PFERR_WRITE_MASK (1U << 1)
#define PFERR_USER_MASK (1U << 2)
@@ -239,9 +236,9 @@ static void set_cr0_wp(int wp)
{
unsigned long cr0 = shadow_cr0;
- cr0 &= ~CR0_WP_MASK;
+ cr0 &= ~X86_CR0_WP;
if (wp)
- cr0 |= CR0_WP_MASK;
+ cr0 |= X86_CR0_WP;
if (cr0 != shadow_cr0) {
write_cr0(cr0);
shadow_cr0 = cr0;
@@ -272,9 +269,9 @@ static unsigned set_cr4_smep(ac_test_t *at, int smep)
unsigned long cr4 = shadow_cr4;
unsigned r;
- cr4 &= ~CR4_SMEP_MASK;
+ cr4 &= ~X86_CR4_SMEP;
if (smep)
- cr4 |= CR4_SMEP_MASK;
+ cr4 |= X86_CR4_SMEP;
if (cr4 == shadow_cr4)
return 0;
diff --git a/x86/pks.c b/x86/pks.c
index ef95fb96c597..bda15efc546d 100644
--- a/x86/pks.c
+++ b/x86/pks.c
@@ -5,7 +5,6 @@
#include "x86/vm.h"
#include "x86/msr.h"
-#define CR0_WP_MASK (1UL << 16)
#define PTE_PKEY_BIT 59
#define SUPER_BASE (1 << 23)
#define SUPER_VAR(v) (*((__typeof__(&(v))) (((unsigned long)&v) + SUPER_BASE)))
@@ -18,9 +17,9 @@ static void set_cr0_wp(int wp)
{
unsigned long cr0 = read_cr0();
- cr0 &= ~CR0_WP_MASK;
+ cr0 &= ~X86_CR0_WP;
if (wp)
- cr0 |= CR0_WP_MASK;
+ cr0 |= X86_CR0_WP;
write_cr0(cr0);
}
diff --git a/x86/pku.c b/x86/pku.c
index 51ff412cef82..6c0d72cc6f81 100644
--- a/x86/pku.c
+++ b/x86/pku.c
@@ -5,7 +5,6 @@
#include "x86/vm.h"
#include "x86/msr.h"
-#define CR0_WP_MASK (1UL << 16)
#define PTE_PKEY_BIT 59
#define USER_BASE (1 << 23)
#define USER_VAR(v) (*((__typeof__(&(v))) (((unsigned long)&v) + USER_BASE)))
@@ -18,9 +17,9 @@ static void set_cr0_wp(int wp)
{
unsigned long cr0 = read_cr0();
- cr0 &= ~CR0_WP_MASK;
+ cr0 &= ~X86_CR0_WP;
if (wp)
- cr0 |= CR0_WP_MASK;
+ cr0 |= X86_CR0_WP;
write_cr0(cr0);
}
--
2.39.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [kvm-unit-tests PATCH v3 2/4] x86/access: CR0.WP toggling write to r/o data test
2023-04-03 10:56 [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 1/4] x86: Use existing CR0.WP / CR4.SMEP bit definitions Mathias Krause
@ 2023-04-03 10:56 ` Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 4/4] x86/access: Try emulation for CR0.WP test as well Mathias Krause
3 siblings, 0 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 10:56 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson, Mathias Krause
KUT has tests that verify a supervisor write access to an r/o page is
successful when CR0.WP=0, but lacks a test that explicitly verifies that
the same access faults after setting CR0.WP=1 without flushing any
associated TLB entries, either explicitly (INVLPG) or implicitly (write
to CR3). Add such a test.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
For v3 I moved the guts to a helper but omitted the "as expected" part
from the printf() as it not only keeps the generated output below 80
chars but also is kinda superfluous when it triggers.
Here's an example failure run (with the full series applied):
: run
: ....................................................................................................................................................................................................................................................................................................
: test pte.a pte.p pde.a pde.p write fep: FAIL: unexpected fault
: Dump mapping: address: 0xffff923042007000
: ------L4 I292: 2100027
: ------L3 I193: 2101027
: ------L2 I16: 2102021
: ------L1 I7: 2000061
: do_cr0_wp_access: emulated supervisor write with CR0.WP=0 did not SUCCEED
:
: test pte.a pte.p pde.a pde.p write cr0.wp fep: FAIL: unexpected access
: Dump mapping: address: 0xffff923042007000
: ------L4 I292: 2100027
: ------L3 I193: 2101027
: ------L2 I16: 2102021
: ------L1 I7: 2000061
: do_cr0_wp_access: emulated supervisor write with CR0.WP=1 did not FAULT
:
: 19169286 tests, 1 failures
: FAIL access
As can be seen, ac_test_check() already prints the failure reason, no
need to mention it again.
x86/access.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 59 insertions(+), 2 deletions(-)
diff --git a/x86/access.c b/x86/access.c
index 203353a3f74f..a01278451b96 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -575,9 +575,10 @@ fault:
at->expected_error &= ~PFERR_FETCH_MASK;
}
-static void ac_set_expected_status(ac_test_t *at)
+static void __ac_set_expected_status(ac_test_t *at, bool flush)
{
- invlpg(at->virt);
+ if (flush)
+ invlpg(at->virt);
if (at->ptep)
at->expected_pte = *at->ptep;
@@ -599,6 +600,11 @@ static void ac_set_expected_status(ac_test_t *at)
ac_emulate_access(at, at->flags);
}
+static void ac_set_expected_status(ac_test_t *at)
+{
+ __ac_set_expected_status(at, true);
+}
+
static pt_element_t ac_get_pt(ac_test_t *at, int i, pt_element_t *ptep)
{
pt_element_t pte;
@@ -1061,6 +1067,56 @@ err:
return 0;
}
+#define TOGGLE_CR0_WP_TEST_BASE_FLAGS \
+ (AC_PDE_PRESENT_MASK | AC_PDE_ACCESSED_MASK | \
+ AC_PTE_PRESENT_MASK | AC_PTE_ACCESSED_MASK | \
+ AC_ACCESS_WRITE_MASK)
+
+static int do_cr0_wp_access(ac_test_t *at, int flags)
+{
+ const bool cr0_wp = !!(flags & AC_CPU_CR0_WP_MASK);
+
+ at->flags = TOGGLE_CR0_WP_TEST_BASE_FLAGS | flags;
+ __ac_set_expected_status(at, false);
+
+ /*
+ * Under VMX the guest might own the CR0.WP bit, requiring KVM to
+ * manually keep track of it where needed, e.g. in the guest page
+ * table walker.
+ *
+ * Load CR0.WP with the inverse value of what will be used during
+ * the access test and toggle EFER.NX to coerce KVM into rebuilding
+ * the current MMU context based on the soon-to-be-stale CR0.WP.
+ */
+ set_cr0_wp(!cr0_wp);
+ set_efer_nx(1);
+ set_efer_nx(0);
+
+ if (!ac_test_do_access(at)) {
+ printf("%s: supervisor write with CR0.WP=%d did not %s\n",
+ __FUNCTION__, cr0_wp, cr0_wp ? "FAULT" : "SUCCEED");
+
+ return 1;
+ }
+
+ return 0;
+}
+
+static int check_toggle_cr0_wp(ac_pt_env_t *pt_env)
+{
+ ac_test_t at;
+ int err = 0;
+
+ ac_test_init(&at, 0xffff923042007000ul, pt_env);
+ at.flags = TOGGLE_CR0_WP_TEST_BASE_FLAGS;
+ ac_test_setup_ptes(&at);
+
+ err += do_cr0_wp_access(&at, 0);
+ err += do_cr0_wp_access(&at, AC_CPU_CR0_WP_MASK);
+
+ return err == 0;
+}
+
static int check_effective_sp_permissions(ac_pt_env_t *pt_env)
{
unsigned long ptr1 = 0xffff923480000000;
@@ -1150,6 +1206,7 @@ const ac_test_fn ac_test_cases[] =
check_pfec_on_prefetch_pte,
check_large_pte_dirty_for_nowp,
check_smep_andnot_wp,
+ check_toggle_cr0_wp,
check_effective_sp_permissions,
};
--
2.39.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 10:56 [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 1/4] x86: Use existing CR0.WP / CR4.SMEP bit definitions Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 2/4] x86/access: CR0.WP toggling write to r/o data test Mathias Krause
@ 2023-04-03 10:56 ` Mathias Krause
2023-04-03 11:28 ` Mathias Krause
2023-04-04 0:04 ` Sean Christopherson
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 4/4] x86/access: Try emulation for CR0.WP test as well Mathias Krause
3 siblings, 2 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 10:56 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson, Mathias Krause
Add support to enforce access tests to be handled by the emulator, if
supported by KVM. Exclude it from the ac_test_exec() test, though, to
not slow it down too much.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
x86/access.c | 35 ++++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
diff --git a/x86/access.c b/x86/access.c
index a01278451b96..674077297978 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -82,6 +82,13 @@ enum {
AC_CPU_CR4_SMEP_BIT,
AC_CPU_CR4_PKE_BIT,
+ NR_AC_TEST_FLAGS,
+
+ /*
+ * synthetic flags follow, won't be exercised by ac_test_exec().
+ */
+ AC_FEP_BIT,
+
NR_AC_FLAGS
};
@@ -121,6 +128,8 @@ enum {
#define AC_CPU_CR4_SMEP_MASK (1 << AC_CPU_CR4_SMEP_BIT)
#define AC_CPU_CR4_PKE_MASK (1 << AC_CPU_CR4_PKE_BIT)
+#define AC_FEP_MASK (1 << AC_FEP_BIT)
+
const char *ac_names[] = {
[AC_PTE_PRESENT_BIT] = "pte.p",
[AC_PTE_ACCESSED_BIT] = "pte.a",
@@ -152,6 +161,7 @@ const char *ac_names[] = {
[AC_CPU_CR0_WP_BIT] = "cr0.wp",
[AC_CPU_CR4_SMEP_BIT] = "cr4.smep",
[AC_CPU_CR4_PKE_BIT] = "cr4.pke",
+ [AC_FEP_BIT] = "fep",
};
static inline void *va(pt_element_t phys)
@@ -396,7 +406,7 @@ static void ac_test_init(ac_test_t *at, unsigned long virt, ac_pt_env_t *pt_env)
static int ac_test_bump_one(ac_test_t *at)
{
at->flags = ((at->flags | invalid_mask) + 1) & ~invalid_mask;
- return at->flags < (1 << NR_AC_FLAGS);
+ return at->flags < (1 << NR_AC_TEST_FLAGS);
}
#define F(x) ((flags & x##_MASK) != 0)
@@ -799,10 +809,13 @@ static int ac_test_do_access(ac_test_t *at)
if (F(AC_ACCESS_TWICE)) {
asm volatile ("mov $fixed2, %%rsi \n\t"
- "mov (%[addr]), %[reg] \n\t"
+ "cmp $0, %[fep] \n\t"
+ "jz 1f \n\t"
+ KVM_FEP
+ "1: mov (%[addr]), %[reg] \n\t"
"fixed2:"
: [reg]"=r"(r), [fault]"=a"(fault), "=b"(e)
- : [addr]"r"(at->virt)
+ : [addr]"r"(at->virt), [fep]"r"(F(AC_FEP))
: "rsi");
fault = 0;
}
@@ -823,9 +836,15 @@ static int ac_test_do_access(ac_test_t *at)
"jnz 2f \n\t"
"cmp $0, %[write] \n\t"
"jnz 1f \n\t"
- "mov (%[addr]), %[reg] \n\t"
+ "cmp $0, %[fep] \n\t"
+ "jz 0f \n\t"
+ KVM_FEP
+ "0: mov (%[addr]), %[reg] \n\t"
"jmp done \n\t"
- "1: mov %[reg], (%[addr]) \n\t"
+ "1: cmp $0, %[fep] \n\t"
+ "jz 0f \n\t"
+ KVM_FEP
+ "0: mov %[reg], (%[addr]) \n\t"
"jmp done \n\t"
"2: call *%[addr] \n\t"
"done: \n"
@@ -843,6 +862,7 @@ static int ac_test_do_access(ac_test_t *at)
[write]"r"(F(AC_ACCESS_WRITE)),
[user]"r"(F(AC_ACCESS_USER)),
[fetch]"r"(F(AC_ACCESS_FETCH)),
+ [fep]"r"(F(AC_FEP)),
[user_ds]"i"(USER_DS),
[user_cs]"i"(USER_CS),
[user_stack_top]"r"(user_stack + sizeof user_stack),
@@ -1233,6 +1253,11 @@ int ac_test_run(int pt_levels)
invalid_mask |= AC_PTE_BIT36_MASK;
}
+ if (!is_fep_available()) {
+ printf("FEP not available, skipping emulation tests\n");
+ invalid_mask |= AC_FEP_MASK;
+ }
+
ac_env_int(&pt_env, pt_levels);
ac_test_init(&at, 0xffff923400000000ul, &pt_env);
--
2.39.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [kvm-unit-tests PATCH v3 4/4] x86/access: Try emulation for CR0.WP test as well
2023-04-03 10:56 [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access Mathias Krause
` (2 preceding siblings ...)
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support Mathias Krause
@ 2023-04-03 10:56 ` Mathias Krause
3 siblings, 0 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 10:56 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson, Mathias Krause
Enhance the CR.WP toggling test to do additional tests via the emulator
as these used to trigger bugs when CR0.WP is guest owned.
Link: https://lore.kernel.org/kvm/ea3a8fbc-2bf8-7442-e498-3e5818384c83@grsecurity.net/
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
Instead of testing 'invalid_mask', I simply added yet another call to
is_fep_available(). It's cleaner, IMO, as it's easier to grasp than
'!(invalid_mask & AC_FEP_MASK)'. The additional exception doesn't
influence the tests, as all preparation is done in do_cr0_wp_access().
x86/access.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/x86/access.c b/x86/access.c
index 674077297978..eab3959bc871 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -1107,14 +1107,17 @@ static int do_cr0_wp_access(ac_test_t *at, int flags)
* Load CR0.WP with the inverse value of what will be used during
* the access test and toggle EFER.NX to coerce KVM into rebuilding
* the current MMU context based on the soon-to-be-stale CR0.WP.
+ *
+ * This used to trigger a bug in the emulator, testable via FEP.
*/
set_cr0_wp(!cr0_wp);
set_efer_nx(1);
set_efer_nx(0);
if (!ac_test_do_access(at)) {
- printf("%s: supervisor write with CR0.WP=%d did not %s\n",
- __FUNCTION__, cr0_wp, cr0_wp ? "FAULT" : "SUCCEED");
+ printf("%s: %ssupervisor write with CR0.WP=%d did not %s\n",
+ __FUNCTION__, (flags & AC_FEP_MASK) ? "emulated " : "",
+ cr0_wp, cr0_wp ? "FAULT" : "SUCCEED");
return 1;
}
@@ -1133,6 +1136,10 @@ static int check_toggle_cr0_wp(ac_pt_env_t *pt_env)
err += do_cr0_wp_access(&at, 0);
err += do_cr0_wp_access(&at, AC_CPU_CR0_WP_MASK);
+ if (is_fep_available()) {
+ err += do_cr0_wp_access(&at, AC_FEP_MASK);
+ err += do_cr0_wp_access(&at, AC_FEP_MASK | AC_CPU_CR0_WP_MASK);
+ }
return err == 0;
}
--
2.39.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support Mathias Krause
@ 2023-04-03 11:28 ` Mathias Krause
2023-04-03 19:06 ` Sean Christopherson
2023-04-04 0:04 ` Sean Christopherson
1 sibling, 1 reply; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 11:28 UTC (permalink / raw)
To: kvm; +Cc: Sean Christopherson
On 03.04.23 12:56, Mathias Krause wrote:
> Add support to enforce access tests to be handled by the emulator, if
> supported by KVM. Exclude it from the ac_test_exec() test, though, to
> not slow it down too much.
I tend to read a lot of objdumps and when initially looking at the
generated code it was kinda hard to recognize the FEP instruction, as
the FEP actually decodes to UD2 followed by some IMUL instruction that
lacks a byte, so when objdump does its linear disassembly, it eats a
byte from the to-be-emulated instruction. Like, "FEP; int $0xc3" would
decode to:
0: 0f 0b ud2
2: 6b 76 6d cd imul $0xffffffcd,0x6d(%rsi),%esi
6: c3 retq
This is slightly confusing, especially when the shortened instruction is
actually a valid one as above ("retq" vs "int $0xc3").
I have the below diff to "fix" that. It adds 0x3e to the FEP which would
restore objdump's ability to generate a proper disassembly that won't
destroy the to-be-emulated instruction. As 0x3e decodes to the DS prefix
byte, which the emulator assumes by default anyways, this should mostly
be a no-op. However, it helped me to get a proper code dump.
If there's interest, I can send a proper patch. If not, this might help
others to understand garbled objdumps involving the FEP ;)
--- a/lib/x86/desc.h
+++ b/lib/x86/desc.h
@@ -104,6 +104,7 @@ typedef struct __attribute__((packed)) {
/* Forced emulation prefix, used to invoke the emulator unconditionally. */
#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"
+#define KVM_FEP_PRETTY KVM_FEP ".byte 0x3e;"
#define ASM_TRY_FEP(catch) __ASM_TRY(KVM_FEP, catch)
static inline bool is_fep_available(void)
diff --git a/x86/access.c b/x86/access.c
index eab3959bc871..ab1913313fbb 100644
--- a/x86/access.c
+++ b/x86/access.c
@@ -811,7 +811,7 @@ static int ac_test_do_access(ac_test_t *at)
asm volatile ("mov $fixed2, %%rsi \n\t"
"cmp $0, %[fep] \n\t"
"jz 1f \n\t"
- KVM_FEP
+ KVM_FEP_PRETTY
"1: mov (%[addr]), %[reg] \n\t"
"fixed2:"
: [reg]"=r"(r), [fault]"=a"(fault), "=b"(e)
@@ -838,12 +838,12 @@ static int ac_test_do_access(ac_test_t *at)
"jnz 1f \n\t"
"cmp $0, %[fep] \n\t"
"jz 0f \n\t"
- KVM_FEP
+ KVM_FEP_PRETTY
"0: mov (%[addr]), %[reg] \n\t"
"jmp done \n\t"
"1: cmp $0, %[fep] \n\t"
"jz 0f \n\t"
- KVM_FEP
+ KVM_FEP_PRETTY
"0: mov %[reg], (%[addr]) \n\t"
"jmp done \n\t"
"2: call *%[addr] \n\t"
Thanks,
Mathias
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 11:28 ` Mathias Krause
@ 2023-04-03 19:06 ` Sean Christopherson
2023-04-03 20:06 ` Mathias Krause
0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2023-04-03 19:06 UTC (permalink / raw)
To: Mathias Krause; +Cc: kvm
On Mon, Apr 03, 2023, Mathias Krause wrote:
> On 03.04.23 12:56, Mathias Krause wrote:
> > Add support to enforce access tests to be handled by the emulator, if
> > supported by KVM. Exclude it from the ac_test_exec() test, though, to
> > not slow it down too much.
>
> I tend to read a lot of objdumps and when initially looking at the
> generated code it was kinda hard to recognize the FEP instruction, as
> the FEP actually decodes to UD2 followed by some IMUL instruction that
> lacks a byte, so when objdump does its linear disassembly, it eats a
> byte from the to-be-emulated instruction. Like, "FEP; int $0xc3" would
> decode to:
> 0: 0f 0b ud2
> 2: 6b 76 6d cd imul $0xffffffcd,0x6d(%rsi),%esi
> 6: c3 retq
> This is slightly confusing, especially when the shortened instruction is
> actually a valid one as above ("retq" vs "int $0xc3").
>
> I have the below diff to "fix" that. It adds 0x3e to the FEP which would
> restore objdump's ability to generate a proper disassembly that won't
> destroy the to-be-emulated instruction. As 0x3e decodes to the DS prefix
> byte, which the emulator assumes by default anyways, this should mostly
> be a no-op. However, it helped me to get a proper code dump.a
I agree that the objdump output is annoying, but I don't love the idea of cramming
in a prefix that's _mostly_ a nop.
Given that FEP utilizes extremely specialized, off-by-default KVM code, what about
reworking FEP in KVM itself to play nice with objdump (and other disasm tools)?
E.g. "officially" change the magic prefix to include a trailing 0x3e. Though IMO,
even better would be a magic value that decodes to a multi-byte nop, e.g.
0F 1F 44 00 00. The only "requirement" is that the magic value doesn't get
false positives, and I can't imagine any of our test environments generate a ud2
followed by a multi-byte nop.
Keeping KVM-Unit-Tests and KVM synchronized on the correct FEP value would be a
pain, but disconnects between KVM and KUT are nothing new.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 19:06 ` Sean Christopherson
@ 2023-04-03 20:06 ` Mathias Krause
2023-04-04 0:08 ` Sean Christopherson
0 siblings, 1 reply; 13+ messages in thread
From: Mathias Krause @ 2023-04-03 20:06 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm
On 03.04.23 21:06, Sean Christopherson wrote:
> On Mon, Apr 03, 2023, Mathias Krause wrote:
>> On 03.04.23 12:56, Mathias Krause wrote:
>>> Add support to enforce access tests to be handled by the emulator, if
>>> supported by KVM. Exclude it from the ac_test_exec() test, though, to
>>> not slow it down too much.
>>
>> I tend to read a lot of objdumps and when initially looking at the
>> generated code it was kinda hard to recognize the FEP instruction, as
>> the FEP actually decodes to UD2 followed by some IMUL instruction that
>> lacks a byte, so when objdump does its linear disassembly, it eats a
>> byte from the to-be-emulated instruction. Like, "FEP; int $0xc3" would
>> decode to:
>> 0: 0f 0b ud2
>> 2: 6b 76 6d cd imul $0xffffffcd,0x6d(%rsi),%esi
>> 6: c3 retq
>> This is slightly confusing, especially when the shortened instruction is
>> actually a valid one as above ("retq" vs "int $0xc3").
>>
>> I have the below diff to "fix" that. It adds 0x3e to the FEP which would
>> restore objdump's ability to generate a proper disassembly that won't
>> destroy the to-be-emulated instruction. As 0x3e decodes to the DS prefix
>> byte, which the emulator assumes by default anyways, this should mostly
>> be a no-op. However, it helped me to get a proper code dump.a
>
> I agree that the objdump output is annoying, but I don't love the idea of cramming
> in a prefix that's _mostly_ a nop.
>
> Given that FEP utilizes extremely specialized, off-by-default KVM code, what about
> reworking FEP in KVM itself to play nice with objdump (and other disasm tools)?
> E.g. "officially" change the magic prefix to include a trailing 0x3e. Though IMO,
> even better would be a magic value that decodes to a multi-byte nop, e.g.
> 0F 1F 44 00 00. The only "requirement" is that the magic value doesn't get
> false positives, and I can't imagine any of our test environments generate a ud2
> followed by a multi-byte nop.
Well, the above is a bad choice, actually, as that mirrors what GNU as
might generate when asked to generate a 5-byte NOP, e.g. for filling the
inter-function gap. And if there is a UD2 as last instruction and we're
unlucky to hit it, e.g. because we're hitting some UBSAN instrumented
error handling, we'll instead trigger "forced emulation" in KVM and
fall-through to the next function. Have fun debugging that! ;P
$ echo 'foo: ret; ud2; .balign 8; baz: int3' | as - && objdump -d
[...]
0000000000000000 <foo>:
0: c3 retq
1: 0f 0b ud2
3: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
0000000000000008 <baz>:
8: cc int3
>
> Keeping KVM-Unit-Tests and KVM synchronized on the correct FEP value would be a
> pain, but disconnects between KVM and KUT are nothing new.
Nah, that would be an ABI break, IMO a no-go. What I was suggesting was
pretty much the patch: change selective users in KUT to make them
objdump-friendly. The FEP as-is is ABI, IMO, and cannot be changed any
more. We could add a FEP2 that's more objdump-friedly, but there's
really no need to support yet another byte combination in KVM itself. It
can all be handled by its users, e.g. in KUT as proposed with the 0x3e
suffix.
But, as I said, it's mostly just me trying to read disassembly and I see
others don't have the need to. So this doesn't need to lead anywhere.
But I thought I bring it up, in case there's others questioning why some
of the KUT code dumps look so weird in objdump ;)
Thanks,
Mathias
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support Mathias Krause
2023-04-03 11:28 ` Mathias Krause
@ 2023-04-04 0:04 ` Sean Christopherson
2023-04-04 7:27 ` Mathias Krause
1 sibling, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2023-04-04 0:04 UTC (permalink / raw)
To: Mathias Krause; +Cc: kvm
On Mon, Apr 03, 2023, Mathias Krause wrote:
> Add support to enforce access tests to be handled by the emulator, if
> supported by KVM. Exclude it from the ac_test_exec() test, though, to
> not slow it down too much.
IMO, the slowdown is nowhere near bad enought to warrant exclusion. On bare metal
without KASAN and other debug gunk, the total runtime with EPT enabled is <6s.
With EPT disabled, it's <8s. In a VM, they times are <16s and <26s respectively.
Those are perfectly reasonable, and forcing emulation actually makes the EPT case,
interesting. And the KASAN/debug builds are so horrendously slow that I think we
should figure out a way to special case those kernels anyways.
If you don't object, I'll include FEP as a regular flag when applying.
One other fun thing the usage from vmx_pf_exception_test_guest(), which runs afoul
of a KVM bug. The VMX #PF test runs the access test as an L2 guest (from KVM's
perspective), i.e. triggers emulation from L2. KVM's emulator is goofy and checks
nested intercepts for PAUSE even on vanilla NOPs. SVM filters out non-PAUSE instructions
on the backend, but VMX does not (VMX doesn't have any logic for PAUSE interception
and just ends up injecting a #UD).
I'll post this as a KVM patch.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9ae4044f076f..1e560457bf9a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7898,6 +7898,21 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
/* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */
break;
+ case x86_intercept_pause:
+ /*
+ * PAUSE is a single-byte NOP with a REPE prefix, i.e. collides
+ * with vanilla NOPs in the emulator. Apply the interception
+ * check only to actual PAUSE instructions. Don't check
+ * PAUSE-loop-exiting, software can't expect a given PAUSE to
+ * exit, i.e. KVM is within its rights to allow L2 to execute
+ * the PAUSE.
+ */
+ if ((info->rep_prefix != REPE_PREFIX) ||
+ !nested_cpu_has2(vmcs12, CPU_BASED_PAUSE_EXITING))
+ return X86EMUL_CONTINUE;
+
+ break;
+
/* TODO: check more intercepts... */
default:
break;
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-03 20:06 ` Mathias Krause
@ 2023-04-04 0:08 ` Sean Christopherson
0 siblings, 0 replies; 13+ messages in thread
From: Sean Christopherson @ 2023-04-04 0:08 UTC (permalink / raw)
To: Mathias Krause; +Cc: kvm
On Mon, Apr 03, 2023, Mathias Krause wrote:
> On 03.04.23 21:06, Sean Christopherson wrote:
> > Keeping KVM-Unit-Tests and KVM synchronized on the correct FEP value would be a
> > pain, but disconnects between KVM and KUT are nothing new.
>
> Nah, that would be an ABI break, IMO a no-go. What I was suggesting was
> pretty much the patch: change selective users in KUT to make them
> objdump-friendly. The FEP as-is is ABI, IMO
I would be amazed if anything other that KVM's own tests utilizes the feature.
IMO, it really shouldn't be considered ABI.
Paolo?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-04 0:04 ` Sean Christopherson
@ 2023-04-04 7:27 ` Mathias Krause
2023-04-04 16:36 ` Sean Christopherson
0 siblings, 1 reply; 13+ messages in thread
From: Mathias Krause @ 2023-04-04 7:27 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm
On 04.04.23 02:04, Sean Christopherson wrote:
> On Mon, Apr 03, 2023, Mathias Krause wrote:
>> Add support to enforce access tests to be handled by the emulator, if
>> supported by KVM. Exclude it from the ac_test_exec() test, though, to
>> not slow it down too much.
>
> IMO, the slowdown is nowhere near bad enought to warrant exclusion. On bare metal
> without KASAN and other debug gunk, the total runtime with EPT enabled is <6s.
> With EPT disabled, it's <8s. In a VM, they times are <16s and <26s respectively.
> Those are perfectly reasonable, and forcing emulation actually makes the EPT case,
> interesting. And the KASAN/debug builds are so horrendously slow that I think we
> should figure out a way to special case those kernels anyways.
You must have a more beefy machine than I do. Testing bare metal on a
NUC12 (i7-1260P) with kvm.ko loaded with force_emulation_prefix=1 and
not excluding AC_FEP_BIT from ac_test_bump() gives me a runtime of
little over 41s with EPT enabled and, funnily, only 9s with EPT
disabled, as that implicitly excludes the CR4.PKE tests, reducing the
number of tests to run by a factor of 10 (~38 million tests down do 3.8
million). The non-EPT run took 56s in a VM but the EPT one ran into the
90s timeout. After lifting that, it was 2m22s. :/
For comparison, the runtime of the access test on bare with this series
applied as-is is 10s with EPT enabled and 5s with EPT disabled. In a VM
I get 10s with EPT and 29s without.
>
> If you don't object, I'll include FEP as a regular flag when applying.
My concerns are that the additional FEP access tests will push the
runtime over the 90s limit not only for my setup but for some CI setups
as well, as they are notoriously resource constraint (and I vaguely
remember a discussion about already hitting timeouts for the gitlab CI
pipeline?).
So yes, I do object. Please keep the AC_FEP_BIT excluded from the
ac_test_bump() tests. It'll cause trouble for CI setups, I'm certain.
>
> One other fun thing the usage from vmx_pf_exception_test_guest(), which runs afoul
> of a KVM bug. The VMX #PF test runs the access test as an L2 guest (from KVM's
> perspective), i.e. triggers emulation from L2. KVM's emulator is goofy and checks
> nested intercepts for PAUSE even on vanilla NOPs. SVM filters out non-PAUSE instructions
> on the backend, but VMX does not (VMX doesn't have any logic for PAUSE interception
> and just ends up injecting a #UD).
Hehe, nice :D
>
> I'll post this as a KVM patch.
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9ae4044f076f..1e560457bf9a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7898,6 +7898,21 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
> /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */
> break;
>
> + case x86_intercept_pause:
> + /*
> + * PAUSE is a single-byte NOP with a REPE prefix, i.e. collides
> + * with vanilla NOPs in the emulator. Apply the interception
> + * check only to actual PAUSE instructions. Don't check
> + * PAUSE-loop-exiting, software can't expect a given PAUSE to
> + * exit, i.e. KVM is within its rights to allow L2 to execute
> + * the PAUSE.
> + */
> + if ((info->rep_prefix != REPE_PREFIX) ||
> + !nested_cpu_has2(vmcs12, CPU_BASED_PAUSE_EXITING))
> + return X86EMUL_CONTINUE;
> +
> + break;
> +
> /* TODO: check more intercepts... */
> default:
> break;
>
Thanks,
Mathias
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-04 7:27 ` Mathias Krause
@ 2023-04-04 16:36 ` Sean Christopherson
2023-04-05 8:01 ` Mathias Krause
0 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2023-04-04 16:36 UTC (permalink / raw)
To: Mathias Krause; +Cc: kvm
On Tue, Apr 04, 2023, Mathias Krause wrote:
> On 04.04.23 02:04, Sean Christopherson wrote:
> > On Mon, Apr 03, 2023, Mathias Krause wrote:
> >> Add support to enforce access tests to be handled by the emulator, if
> >> supported by KVM. Exclude it from the ac_test_exec() test, though, to
> >> not slow it down too much.
> >
> > IMO, the slowdown is nowhere near bad enought to warrant exclusion. On bare metal
> > without KASAN and other debug gunk, the total runtime with EPT enabled is <6s.
> > With EPT disabled, it's <8s. In a VM, they times are <16s and <26s respectively.
> > Those are perfectly reasonable, and forcing emulation actually makes the EPT case,
> > interesting. And the KASAN/debug builds are so horrendously slow that I think we
> > should figure out a way to special case those kernels anyways.
>
> You must have a more beefy machine than I do.
I doubt it, my numbers are from a Haswell with a whopping 6 cores (and the core
count should be irrelevant).
> Testing bare metal on a NUC12 (i7-1260P) with kvm.ko loaded with
> force_emulation_prefix=1 and not excluding AC_FEP_BIT from ac_test_bump()
> gives me a runtime of little over 41s with EPT enabled and, funnily, only 9s
> with EPT disabled, as that implicitly excludes the CR4.PKE tests, reducing
> the number of tests to run by a factor of 10 (~38 million tests down do 3.8
> million).
Ah, right, fancy new features. Running on an Icelake, i.e. with 5-level paging
and PKRU support, is indeed quite painful.
After much fiddling, I think the best option is to add a separate config entry
to enable FEP, and have that entry be nodefault, i.e. a "manual" testcase. Ditto
for the nVMX #PF variant. That will allow CI and other runners to enable the
test for compatible configs, e.g. when running on bare metal, without causing
problems for existing setups. Well, unless there are setups that do a generic
"-g nodefault", but x86 doesn't currently have any nodefault tests so that's
quite unlikely.
The only downside is that the CR0.WP testcase will also become manual only. We
could obviously have it ignore the opt-in flag, but there's value is containing
it to the opt-in testcase, e.g. it becomes very obvious that emulation is relevant
to the failure when the FEP version fails but the non-FEP version does not. And
I also think we should mark the VPID-based variants nodefault, as they have 4+
minute runtimes in VMs, i.e. we should encourage use of "-g nodefault" in CI when
appropriate.
I'll post a v4, there are other cleanups needed in the access test, e.g. the darn
thing doesn't use report_summary() and so actually getting it to report a SKIP is
impossible.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support
2023-04-04 16:36 ` Sean Christopherson
@ 2023-04-05 8:01 ` Mathias Krause
0 siblings, 0 replies; 13+ messages in thread
From: Mathias Krause @ 2023-04-05 8:01 UTC (permalink / raw)
To: Sean Christopherson; +Cc: kvm
On 04.04.23 18:36, Sean Christopherson wrote:
> On Tue, Apr 04, 2023, Mathias Krause wrote:
>> Testing bare metal on a NUC12 (i7-1260P) with kvm.ko loaded with
>> force_emulation_prefix=1 and not excluding AC_FEP_BIT from ac_test_bump()
>> gives me a runtime of little over 41s with EPT enabled and, funnily, only 9s
>> with EPT disabled, as that implicitly excludes the CR4.PKE tests, reducing
>> the number of tests to run by a factor of 10 (~38 million tests down do 3.8
>> million).
>
> Ah, right, fancy new features. Running on an Icelake, i.e. with 5-level paging
> and PKRU support, is indeed quite painful.
Jepp.
>
> After much fiddling, I think the best option is to add a separate config entry
> to enable FEP, and have that entry be nodefault, i.e. a "manual" testcase. Ditto
> for the nVMX #PF variant. That will allow CI and other runners to enable the
> test for compatible configs, e.g. when running on bare metal, without causing
> problems for existing setups. Well, unless there are setups that do a generic
> "-g nodefault", but x86 doesn't currently have any nodefault tests so that's
> quite unlikely.
Yeah, that works too. I was also thinking of a commandline parameter to
allow ac_test_bump() to take the FEP bit into account to allow testing
forced emulation, but not by default.
>
> The only downside is that the CR0.WP testcase will also become manual only. We
> could obviously have it ignore the opt-in flag, but there's value is containing
> it to the opt-in testcase, e.g. it becomes very obvious that emulation is relevant
> to the failure when the FEP version fails but the non-FEP version does not.
Well, I think there's much more value in doing the CR0.WP emulation
tests by default than there is to bind them to the "manual" test.
They're fast, only two accesses, so runtime is no argument here. They
also test an important aspects of KVM's MMU role, so we shouldn't bury
these behind a "nodefault" flag.
I'd rather see the command line option be a toggle for the access bits
permutation test only, which are dog slow with forced emulation.
> And
> I also think we should mark the VPID-based variants nodefault, as they have 4+
> minute runtimes in VMs, i.e. we should encourage use of "-g nodefault" in CI when
> appropriate.
Ok, no objection from my side, as I have no dog in that fight ;)
>
> I'll post a v4, there are other cleanups needed in the access test, e.g. the darn
> thing doesn't use report_summary() and so actually getting it to report a SKIP is
> impossible.
I think that's just because it's such an old test that predates the
reporting infrastructure. But yeah, it could get some spring cleanup,
like dropping the #defines for true and false ;)
Thanks,
Mathias
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-04-05 8:01 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-03 10:56 [kvm-unit-tests PATCH v3 0/4] Tests for CR0.WP=0/1 r/o write access Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 1/4] x86: Use existing CR0.WP / CR4.SMEP bit definitions Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 2/4] x86/access: CR0.WP toggling write to r/o data test Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 3/4] x86/access: Forced emulation support Mathias Krause
2023-04-03 11:28 ` Mathias Krause
2023-04-03 19:06 ` Sean Christopherson
2023-04-03 20:06 ` Mathias Krause
2023-04-04 0:08 ` Sean Christopherson
2023-04-04 0:04 ` Sean Christopherson
2023-04-04 7:27 ` Mathias Krause
2023-04-04 16:36 ` Sean Christopherson
2023-04-05 8:01 ` Mathias Krause
2023-04-03 10:56 ` [kvm-unit-tests PATCH v3 4/4] x86/access: Try emulation for CR0.WP test as well Mathias Krause
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox