* [RFC PATCH 0/8] Introduce CET supervisor xstate support
@ 2023-09-14 3:23 Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 1/8] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Yang Weijiang
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
Hi, x86 maintainers,
Please review this series for CET virtualization enabling, the series is
considered as a necessary part for supporting guest CET. See related
discussion here [*].
Thanks!
----------------------------------------------------------------------------
CET supervisor state, i.e., IA32_PL{0,1,2}_SSP, are xsave-managed MSRs,
it can be opt-in via IA32_XSS[bit 12]. Currently host supervisor shadow
stack are not enabled and the feature bit is not set. But from KVM usage
perspective, enabling host CET supervisor state is required for guest CET
supervisor MSRs management. The benefits are: 1) No need to manually save/
restore the 3 MSRs when vCPU is switched in/out. 2) Omit manually saving/
reloading the MSRs at VM-Exit/VM-Entry. 3) Make guest CET user mode and
supervisor mode states managed within current FPU framework in consistent
manners.
This series tries to:
1) Fix issues resulted from CET virtualizaiton enabling and CET usage in guest.
2) Add CET supervisor xstate support in kernel.
3) Introduce kernel dynamic xfeature set for CET supervisor state optimization.
4) Change guest fpstate settings to hold kernel dynamic xfeatures.
For guest fpstate, we have xstate_bv[12] == xcomp_bv[12] == 1 in xstate_header,
and CET supervisor mode state are saved/reloaded when xsaves/xrstors runs
on fpstate area.
For non-guest fpstate we have xstate_bv[12] == xcomp_bv[12] == 0, then HW can
optimize xsaves/xrstors operations.
Basic tests done (based on v6.6-rc1 kernel tree):
1. selftests: x86:amx_64/test_fpu, kvm: amx_test,smm_test,state_test etc.
2. Guest launch with IA32_PL{0,1,2}_SSP read/write in host/guest kernel.
3. Guest live migration tests.
No perceivable issues (mainly dmesg) are found in both host and guest during
above tests.
Patch1: Fix a potential CET xstate dependency issue in guest kernel.
Patch2: Fix an inconsistent size issue in guest fpstate allocation.
Patch3: Introduce CET supervisor xstate support.
Patch4: Introduce kernel dynamic xfeature set for optimization.
Patch5: Remove kernel dynamic xfeatures from normal fpstate.
Patch6: Opt-in kernel dynamic xfeatures when resize guest xsave area.
Patch7: Include kernel dynamic xfetures when allocate guest xsave area.
Patch8: Check unexpected/invalid fpstate settings before fire xsave.
[*]: https://lore.kernel.org/all/806e26c2-8d21-9cc9-a0b7-7787dd231729@intel.com/
Yang Weijiang (8):
x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit
x86/fpu/xstate: Fix guest fpstate allocation size calculation
x86/fpu/xstate: Add CET supervisor mode state support
x86/fpu/xstate: Introduce kernel dynamic xfeature set
x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features
x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size
x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures
x86/fpu/xstate: WARN if normal fpstate contains kernel dynamic xfeatures
arch/x86/include/asm/fpu/types.h | 14 ++++++--
arch/x86/include/asm/fpu/xstate.h | 6 ++--
arch/x86/kernel/fpu/core.c | 56 ++++++++++++++++++++++++++-----
arch/x86/kernel/fpu/xstate.c | 49 ++++++++++++++++++++++++---
arch/x86/kernel/fpu/xstate.h | 5 +++
5 files changed, 112 insertions(+), 18 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [RFC PATCH 1/8] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation Yang Weijiang
` (6 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
Remove XFEATURE_CET_USER entry from dependency array as the entry doesn't
reflect true dependency between CET features and the xstate bit, instead
manually check and add the bit back if either SHSTK or IBT is supported.
Both user mode shadow stack and indirect branch tracking features depend
on XFEATURE_CET_USER bit in XSS to automatically save/restore user mode
xstate registers, i.e., IA32_U_CET and IA32_PL3_SSP whenever necessary.
Although in real world a platform with IBT but no SHSTK is rare, but in
virtualization world it's common, guest SHSTK and IBT can be controlled
independently via userspace app.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/xstate.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index cadf68737e6b..12c8cb278346 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -73,7 +73,6 @@ static unsigned short xsave_cpuid_features[] __initdata = {
[XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT,
[XFEATURE_PKRU] = X86_FEATURE_OSPKE,
[XFEATURE_PASID] = X86_FEATURE_ENQCMD,
- [XFEATURE_CET_USER] = X86_FEATURE_SHSTK,
[XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE,
[XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE,
};
@@ -798,6 +797,14 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
fpu_kernel_cfg.max_features &= ~BIT_ULL(i);
}
+ /*
+ * Manually add CET user mode xstate bit if either SHSTK or IBT is
+ * available. Both features depend on the xstate bit to save/restore
+ * CET user mode state.
+ */
+ if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT))
+ fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER);
+
if (!cpu_feature_enabled(X86_FEATURE_XFD))
fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC;
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 1/8] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-10-21 0:29 ` Sean Christopherson
2023-09-14 3:23 ` [RFC PATCH 3/8] x86/fpu/xstate: Add CET supervisor mode state support Yang Weijiang
` (5 subsequent siblings)
7 siblings, 1 reply; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
Fix guest xsave area allocation size from fpu_user_cfg.default_size to
fpu_kernel_cfg.default_size so that the xsave area size is consistent
with fpstate->size set in __fpstate_reset().
With the fix, guest fpstate size is sufficient for KVM supported guest
xfeatures.
Fixes: 69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup");
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index a86d37052a64..a42d8ad26ce6 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -220,7 +220,9 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
struct fpstate *fpstate;
unsigned int size;
- size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
+ size = fpu_kernel_cfg.default_size +
+ ALIGN(offsetof(struct fpstate, regs), 64);
+
fpstate = vzalloc(size);
if (!fpstate)
return false;
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 3/8] x86/fpu/xstate: Add CET supervisor mode state support
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 1/8] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 4/8] x86/fpu/xstate: Introduce kernel dynamic xfeature set Yang Weijiang
` (4 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
Add supervisor mode state support within FPU xstate management framework.
Although supervisor shadow stack is not enabled/used today in kernel,KVM
requires the support because when KVM advertises shadow stack feature to
guest, architechturally it claims the support for both user and supervisor
modes for Linux and non-Linux guest OSes.
With the xstate support, guest supervisor mode shadow stack state can be
properly saved/restored when 1) guest/host FPU context is swapped 2) vCPU
thread is sched out/in.
The alternative is to enable it in KVM domain, but KVM maintainers NAKed
the solution. The external discussion can be found at [*], it ended up
with adding the support in kernel instead of KVM domain.
Note, in KVM case, guest CET supervisor state i.e., IA32_PL{0,1,2}_MSRs,
are preserved after VM-Exit until host/guest fpstates are swapped, but
since host supervisor shadow stack is disabled, the preserved MSRs won't
hurt host.
[*]: https://lore.kernel.org/all/806e26c2-8d21-9cc9-a0b7-7787dd231729@intel.com/
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/include/asm/fpu/types.h | 14 ++++++++++++--
arch/x86/include/asm/fpu/xstate.h | 6 +++---
arch/x86/kernel/fpu/xstate.c | 6 +++++-
3 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index eb810074f1e7..c6fd13a17205 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -116,7 +116,7 @@ enum xfeature {
XFEATURE_PKRU,
XFEATURE_PASID,
XFEATURE_CET_USER,
- XFEATURE_CET_KERNEL_UNUSED,
+ XFEATURE_CET_KERNEL,
XFEATURE_RSRVD_COMP_13,
XFEATURE_RSRVD_COMP_14,
XFEATURE_LBR,
@@ -139,7 +139,7 @@ enum xfeature {
#define XFEATURE_MASK_PKRU (1 << XFEATURE_PKRU)
#define XFEATURE_MASK_PASID (1 << XFEATURE_PASID)
#define XFEATURE_MASK_CET_USER (1 << XFEATURE_CET_USER)
-#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL_UNUSED)
+#define XFEATURE_MASK_CET_KERNEL (1 << XFEATURE_CET_KERNEL)
#define XFEATURE_MASK_LBR (1 << XFEATURE_LBR)
#define XFEATURE_MASK_XTILE_CFG (1 << XFEATURE_XTILE_CFG)
#define XFEATURE_MASK_XTILE_DATA (1 << XFEATURE_XTILE_DATA)
@@ -264,6 +264,16 @@ struct cet_user_state {
u64 user_ssp;
};
+/*
+ * State component 12 is Control-flow Enforcement supervisor states
+ */
+struct cet_supervisor_state {
+ /* supervisor ssp pointers */
+ u64 pl0_ssp;
+ u64 pl1_ssp;
+ u64 pl2_ssp;
+};
+
/*
* State component 15: Architectural LBR configuration state.
* The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index d4427b88ee12..3b4a038d3c57 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -51,7 +51,8 @@
/* All currently supported supervisor features */
#define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
- XFEATURE_MASK_CET_USER)
+ XFEATURE_MASK_CET_USER | \
+ XFEATURE_MASK_CET_KERNEL)
/*
* A supervisor state component may not always contain valuable information,
@@ -78,8 +79,7 @@
* Unsupported supervisor features. When a supervisor feature in this mask is
* supported in the future, move it to the supported supervisor feature mask.
*/
-#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \
- XFEATURE_MASK_CET_KERNEL)
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)
/* All supervisor states including supported and unsupported states. */
#define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 12c8cb278346..c3ed86732d33 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -51,7 +51,7 @@ static const char *xfeature_names[] =
"Protection Keys User registers",
"PASID state",
"Control-flow User registers",
- "Control-flow Kernel registers (unused)",
+ "Control-flow Kernel registers",
"unknown xstate feature",
"unknown xstate feature",
"unknown xstate feature",
@@ -73,6 +73,7 @@ static unsigned short xsave_cpuid_features[] __initdata = {
[XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT,
[XFEATURE_PKRU] = X86_FEATURE_OSPKE,
[XFEATURE_PASID] = X86_FEATURE_ENQCMD,
+ [XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK,
[XFEATURE_XTILE_CFG] = X86_FEATURE_AMX_TILE,
[XFEATURE_XTILE_DATA] = X86_FEATURE_AMX_TILE,
};
@@ -277,6 +278,7 @@ static void __init print_xstate_features(void)
print_xstate_feature(XFEATURE_MASK_PKRU);
print_xstate_feature(XFEATURE_MASK_PASID);
print_xstate_feature(XFEATURE_MASK_CET_USER);
+ print_xstate_feature(XFEATURE_MASK_CET_KERNEL);
print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
}
@@ -346,6 +348,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
XFEATURE_MASK_BNDCSR | \
XFEATURE_MASK_PASID | \
XFEATURE_MASK_CET_USER | \
+ XFEATURE_MASK_CET_KERNEL | \
XFEATURE_MASK_XTILE)
/*
@@ -546,6 +549,7 @@ static bool __init check_xstate_against_struct(int nr)
case XFEATURE_PASID: return XCHECK_SZ(sz, nr, struct ia32_pasid_state);
case XFEATURE_XTILE_CFG: return XCHECK_SZ(sz, nr, struct xtile_cfg);
case XFEATURE_CET_USER: return XCHECK_SZ(sz, nr, struct cet_user_state);
+ case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state);
case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true;
default:
XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr);
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 4/8] x86/fpu/xstate: Introduce kernel dynamic xfeature set
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
` (2 preceding siblings ...)
2023-09-14 3:23 ` [RFC PATCH 3/8] x86/fpu/xstate: Add CET supervisor mode state support Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 5/8] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features Yang Weijiang
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
Define a new kernel xfeature set including the features can be dynamically
enabled, i.e., the relevant feature is enabled on demand. The xfeature set
is currently used by KVM to configure __guest__ fpstate, i.e., calculating
the xfeature and fpstate storage size etc. The xfeature set is initialized
once and used whenever it's referenced to avoid repeat calculation.
Currently it's used when 1) guest fpstate __state_size is calculated while
guest permits are configured 2) guest vCPU is created and its fpstate is
initialized.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/xstate.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c3ed86732d33..eaec05bc1b3c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -84,6 +84,8 @@ static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
{ [ 0 ... XFEATURE_MAX - 1] = -1};
static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init;
+u64 fpu_kernel_dynamic_xfeatures __ro_after_init;
+
#define XSTATE_FLAG_SUPERVISOR BIT(0)
#define XSTATE_FLAG_ALIGNED64 BIT(1)
@@ -740,6 +742,23 @@ static void __init fpu__init_disable_system_xstate(unsigned int legacy_size)
fpstate_reset(¤t->thread.fpu);
}
+static unsigned short xsave_kernel_dynamic_xfeatures[] = {
+ [XFEATURE_CET_KERNEL] = X86_FEATURE_SHSTK,
+};
+
+static void __init init_kernel_dynamic_xfeatures(void)
+{
+ unsigned short cid;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(xsave_kernel_dynamic_xfeatures); i++) {
+ cid = xsave_kernel_dynamic_xfeatures[i];
+
+ if (cid && boot_cpu_has(cid))
+ fpu_kernel_dynamic_xfeatures |= BIT_ULL(i);
+ }
+}
+
/*
* Enable and initialize the xsave feature.
* Called once per system bootup.
@@ -809,6 +828,8 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
if (boot_cpu_has(X86_FEATURE_SHSTK) || boot_cpu_has(X86_FEATURE_IBT))
fpu_kernel_cfg.max_features |= BIT_ULL(XFEATURE_CET_USER);
+ init_kernel_dynamic_xfeatures();
+
if (!cpu_feature_enabled(X86_FEATURE_XFD))
fpu_kernel_cfg.max_features &= ~XFEATURE_MASK_USER_DYNAMIC;
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 5/8] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
` (3 preceding siblings ...)
2023-09-14 3:23 ` [RFC PATCH 4/8] x86/fpu/xstate: Introduce kernel dynamic xfeature set Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 6/8] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size Yang Weijiang
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
The kernel dynamic xfeatures are supported by host, i.e., they're enabled
in xsaves/xrstors operating xfeature set (XCR0 | XSS), but the corresponding
CPU features are disabled for the time-being in host kernel so the bits are
not necessarily set by default.
Remove the bits from fpu_kernel_cfg.default_features so that the bits in
xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be optimized by HW
for normal fpstate.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/xstate.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index eaec05bc1b3c..4753c677e2e1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -845,6 +845,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
/* Clean out dynamic features from default */
fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+ fpu_kernel_cfg.default_features &= ~fpu_kernel_dynamic_xfeatures;
fpu_user_cfg.default_features = fpu_user_cfg.max_features;
fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 6/8] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
` (4 preceding siblings ...)
2023-09-14 3:23 ` [RFC PATCH 5/8] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 7/8] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 8/8] x86/fpu/xstate: WARN if normal fpstate contains " Yang Weijiang
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
When user space requests guest xstate permits, the sufficient xstate size
is calculated from permitted mask. Currently the max guest permits are set
to fpu_kernel_cfg.default_features, and the latter doesn't include kernel
dynamic xfeatures, so add them back for correct guest fpstate size.
If guest dynamic xfeatures are enabled, KVM re-allocates guest fpstate area
with above resulting size before launches VM.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/xstate.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4753c677e2e1..c5d903b4df4d 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1636,9 +1636,17 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
/* Calculate the resulting kernel state size */
mask = permitted | requested;
- /* Take supervisor states into account on the host */
+ /*
+ * Take supervisor states into account on the host. And add
+ * kernel dynamic xfeatures to guest since guest kernel may
+ * enable corresponding CPU feaures and the xstate registers
+ * need to be saved/restored properly.
+ */
if (!guest)
mask |= xfeatures_mask_supervisor();
+ else
+ mask |= fpu_kernel_dynamic_xfeatures;
+
ksize = xstate_calculate_size(mask, compacted);
/* Calculate the resulting user state size */
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 7/8] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
` (5 preceding siblings ...)
2023-09-14 3:23 ` [RFC PATCH 6/8] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 8/8] x86/fpu/xstate: WARN if normal fpstate contains " Yang Weijiang
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
The guest fpstate is sized with fpu_kernel_cfg.default_size (by preceding
fix) and the kernel dynamic xfeatures are not taken into account, so add
the support and tweak fpstate xfeatures and size accordingly.
Below configuration steps are currently enforced to get guest fpstate:
1) User space sets thread group xstate permits via arch_prctl().
2) User space creates vcpu thread.
3) User space enables guest dynamic xfeatures.
In #1, guest fpstate size (i.e., __state_size [1]) is induced from
(fpu_kernel_cfg.default_features | user dynamic xfeatures) [2].
In #2, guest fpstate size is calculated with fpu_kernel_cfg.default_size
and fpstate->size is set to the same. fpstate->xfeatures is set to
fpu_kernel_cfg.default_features.
In #3, guest fpstate is re-allocated as [1] and fpstate->xfeatures is
set to [2].
By adding kernel dynamic xfeatures in above #1 and #2, guest xstate area
size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic
_xfeatures | user dynamic xfeatures)[3], and guest fpstate->xfeatures is
set to [3]. Then host xsaves/xrstors can act on all guest xfeatures.
The user_* fields remain unchanged for compatibility of non-compacted KVM
uAPIs.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/core.c | 56 +++++++++++++++++++++++++++++-------
arch/x86/kernel/fpu/xstate.c | 2 +-
arch/x86/kernel/fpu/xstate.h | 2 ++
3 files changed, 49 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index a42d8ad26ce6..e5819b38545a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -33,6 +33,8 @@ DEFINE_STATIC_KEY_FALSE(__fpu_state_size_dynamic);
DEFINE_PER_CPU(u64, xfd_state);
#endif
+extern unsigned int xstate_calculate_size(u64 xfeatures, bool compacted);
+
/* The FPU state configuration data for kernel and user space */
struct fpu_state_config fpu_kernel_cfg __ro_after_init;
struct fpu_state_config fpu_user_cfg __ro_after_init;
@@ -193,8 +195,6 @@ void fpu_reset_from_exception_fixup(void)
}
#if IS_ENABLED(CONFIG_KVM)
-static void __fpstate_reset(struct fpstate *fpstate, u64 xfd);
-
static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
{
struct fpu_state_perm *fpuperm;
@@ -215,28 +215,64 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED;
}
-bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
+static struct fpstate *__fpu_alloc_init_guest_fpstate(struct fpu_guest *gfpu)
{
+ bool compacted = cpu_feature_enabled(X86_FEATURE_XCOMPACTED);
+ unsigned int gfpstate_size, size;
struct fpstate *fpstate;
- unsigned int size;
+ u64 xfeatures;
+
+ /*
+ * fpu_kernel_cfg.default_features includes all enabled xfeatures
+ * except those dynamic xfeatures. Compared with user dynamic
+ * xfeatures, the kernel dynamic ones are enabled for guest by
+ * default, so add the kernel dynamic xfeatures back when calculate
+ * guest fpstate size.
+ *
+ * If the user dynamic xfeatures are enabled, the guest fpstate will
+ * be re-allocated to hold all guest enabled xfeatures, so omit user
+ * dynamic xfeatures here.
+ */
+ xfeatures = fpu_kernel_cfg.default_features |
+ fpu_kernel_dynamic_xfeatures;
+
+ gfpstate_size = xstate_calculate_size(xfeatures, compacted);
- size = fpu_kernel_cfg.default_size +
- ALIGN(offsetof(struct fpstate, regs), 64);
+ size = gfpstate_size + ALIGN(offsetof(struct fpstate, regs), 64);
fpstate = vzalloc(size);
if (!fpstate)
- return false;
+ return NULL;
+ /*
+ * Initialize sizes and feature masks, use fpu_user_cfg.*
+ * for user_* settings for compatibility of exiting uAPIs.
+ */
+ fpstate->size = gfpstate_size;
+ fpstate->xfeatures = xfeatures;
+ fpstate->user_size = fpu_user_cfg.default_size;
+ fpstate->user_xfeatures = fpu_user_cfg.default_features;
+ fpstate->xfd = 0;
- /* Leave xfd to 0 (the reset value defined by spec) */
- __fpstate_reset(fpstate, 0);
fpstate_init_user(fpstate);
fpstate->is_valloc = true;
fpstate->is_guest = true;
gfpu->fpstate = fpstate;
- gfpu->xfeatures = fpu_user_cfg.default_features;
+ gfpu->xfeatures = xfeatures;
gfpu->perm = fpu_user_cfg.default_features;
+ return fpstate;
+}
+
+bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
+{
+ struct fpstate *fpstate;
+
+ fpstate = __fpu_alloc_init_guest_fpstate(gfpu);
+
+ if (!fpstate)
+ return false;
+
/*
* KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
* to userspace, even when XSAVE is unsupported, so that restoring FPU
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c5d903b4df4d..87149aba6f11 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -561,7 +561,7 @@ static bool __init check_xstate_against_struct(int nr)
return true;
}
-static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
+unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
{
unsigned int topmost = fls64(xfeatures) - 1;
unsigned int offset = xstate_offsets[topmost];
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..9c6e3ca05c5c 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -10,6 +10,8 @@
DECLARE_PER_CPU(u64, xfd_state);
#endif
+extern u64 fpu_kernel_dynamic_xfeatures;
+
static inline void xstate_init_xcomp_bv(struct xregs_state *xsave, u64 mask)
{
/*
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [RFC PATCH 8/8] x86/fpu/xstate: WARN if normal fpstate contains kernel dynamic xfeatures
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
` (6 preceding siblings ...)
2023-09-14 3:23 ` [RFC PATCH 7/8] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures Yang Weijiang
@ 2023-09-14 3:23 ` Yang Weijiang
7 siblings, 0 replies; 11+ messages in thread
From: Yang Weijiang @ 2023-09-14 3:23 UTC (permalink / raw)
To: x86, linux-kernel
Cc: dave.hansen, tglx, peterz, seanjc, pbonzini, rick.p.edgecombe,
kvm, yang.zhong, jing2.liu, chao.gao, Yang Weijiang
fpu_kernel_dynamic_xfeatures now are __ONLY__ enabled by guest kernel and
used for guest fpstate, i.e., none for normal fpstate. The bits are added
when guest fpstate is allocated and fpstate->is_guest set to %true.
For normal fpstate, the bits should have been removed when init system FPU
settings, WARN_ONCE() if normal fpstate contains kernel dynamic xfeatures
before xsaves is executed.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kernel/fpu/xstate.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 9c6e3ca05c5c..c2b33a5db53d 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -186,6 +186,9 @@ static inline void os_xsave(struct fpstate *fpstate)
WARN_ON_FPU(!alternatives_patched);
xfd_validate_state(fpstate, mask, false);
+ WARN_ON_FPU(!fpstate->is_guest &&
+ (mask & fpu_kernel_dynamic_xfeatures));
+
XSTATE_XSAVE(&fpstate->regs.xsave, lmask, hmask, err);
/* We should never fault when copying to a kernel buffer: */
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation
2023-09-14 3:23 ` [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation Yang Weijiang
@ 2023-10-21 0:29 ` Sean Christopherson
2023-10-21 0:35 ` Sean Christopherson
0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2023-10-21 0:29 UTC (permalink / raw)
To: Yang Weijiang
Cc: x86, linux-kernel, dave.hansen, tglx, peterz, pbonzini,
rick.p.edgecombe, kvm, yang.zhong, jing2.liu, chao.gao
On Wed, Sep 13, 2023, Yang Weijiang wrote:
> Fix guest xsave area allocation size from fpu_user_cfg.default_size to
> fpu_kernel_cfg.default_size so that the xsave area size is consistent
> with fpstate->size set in __fpstate_reset().
>
> With the fix, guest fpstate size is sufficient for KVM supported guest
> xfeatures.
>
> Fixes: 69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup");
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
> arch/x86/kernel/fpu/core.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index a86d37052a64..a42d8ad26ce6 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -220,7 +220,9 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
> struct fpstate *fpstate;
> unsigned int size;
>
> - size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
> + size = fpu_kernel_cfg.default_size +
> + ALIGN(offsetof(struct fpstate, regs), 64);
This looks sketchy and incomplete. I haven't looked at the gory details of
fpu_user_cfg vs. fpu_kernel_cfg, but the rest of this function uses fpu_user_cfg,
including a check on fpu_user_cfg.default_size. That makes me think that changing
just the allocation size isn't quite right.
/* Leave xfd to 0 (the reset value defined by spec) */
__fpstate_reset(fpstate, 0);
fpstate_init_user(fpstate);
fpstate->is_valloc = true;
fpstate->is_guest = true;
gfpu->fpstate = fpstate;
gfpu->xfeatures = fpu_user_cfg.default_features;
gfpu->perm = fpu_user_cfg.default_features;
/*
* KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
* to userspace, even when XSAVE is unsupported, so that restoring FPU
* state on a different CPU that does support XSAVE can cleanly load
* the incoming state using its natural XSAVE. In other words, KVM's
* uABI size may be larger than this host's default size. Conversely,
* the default size should never be larger than KVM's base uABI size;
* all features that can expand the uABI size must be opt-in.
*/
gfpu->uabi_size = sizeof(struct kvm_xsave);
if (WARN_ON_ONCE(fpu_user_cfg.default_size > gfpu->uabi_size))
gfpu->uabi_size = fpu_user_cfg.default_size;
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation
2023-10-21 0:29 ` Sean Christopherson
@ 2023-10-21 0:35 ` Sean Christopherson
0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2023-10-21 0:35 UTC (permalink / raw)
To: Yang Weijiang
Cc: x86, linux-kernel, dave.hansen, tglx, peterz, pbonzini,
rick.p.edgecombe, kvm, yang.zhong, jing2.liu, chao.gao
On Fri, Oct 20, 2023, Sean Christopherson wrote:
> On Wed, Sep 13, 2023, Yang Weijiang wrote:
> > Fix guest xsave area allocation size from fpu_user_cfg.default_size to
> > fpu_kernel_cfg.default_size so that the xsave area size is consistent
> > with fpstate->size set in __fpstate_reset().
> >
> > With the fix, guest fpstate size is sufficient for KVM supported guest
> > xfeatures.
> >
> > Fixes: 69f6ed1d14c6 ("x86/fpu: Provide infrastructure for KVM FPU cleanup");
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > ---
> > arch/x86/kernel/fpu/core.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > index a86d37052a64..a42d8ad26ce6 100644
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -220,7 +220,9 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
> > struct fpstate *fpstate;
> > unsigned int size;
> >
> > - size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
> > + size = fpu_kernel_cfg.default_size +
> > + ALIGN(offsetof(struct fpstate, regs), 64);
>
> This looks sketchy and incomplete. I haven't looked at the gory details of
> fpu_user_cfg vs. fpu_kernel_cfg, but the rest of this function uses fpu_user_cfg,
> including a check on fpu_user_cfg.default_size. That makes me think that changing
> just the allocation size isn't quite right.
Shoot, I didn't realize the CET virtualization series included this a day later.
I'll respond there.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2023-10-21 0:35 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-14 3:23 [RFC PATCH 0/8] Introduce CET supervisor xstate support Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 1/8] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 2/8] x86/fpu/xstate: Fix guest fpstate allocation size calculation Yang Weijiang
2023-10-21 0:29 ` Sean Christopherson
2023-10-21 0:35 ` Sean Christopherson
2023-09-14 3:23 ` [RFC PATCH 3/8] x86/fpu/xstate: Add CET supervisor mode state support Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 4/8] x86/fpu/xstate: Introduce kernel dynamic xfeature set Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 5/8] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 6/8] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 7/8] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures Yang Weijiang
2023-09-14 3:23 ` [RFC PATCH 8/8] x86/fpu/xstate: WARN if normal fpstate contains " Yang Weijiang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox