kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] Introduce CET supervisor state support
@ 2024-11-26 10:17 Chao Gao
  2024-11-26 10:17 ` [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

This v2 is essentially a resend of the v1 series. I took over this work
from Weijiang, so I added my Signed-off-by and incremented the version
number. This repost is to seek more feedback on this work, which is a
dependency for CET KVM support. In turn, CET KVM support is a dependency
for both FRED KVM support and CET AMD support.

==Background==

This series spins off from CET KVM virtualization enabling series [1].
The purpose is to get these preparation work resolved ahead of KVM part
landing. There was a discussion about introducing CET supervisor state
support [2] [3].

CET supervisor state, i.e., IA32_PL{0,1,2}_SSP, are xsave-managed MSRs,
it can be enabled via IA32_XSS[bit 12]. KVM relies on host side CET
supervisor state support to fully enable guest CET MSR contents storage.
The benefits are: 1) No need to manually save/restore the 3 MSRs when
vCPU fpu context is sched in/out. 2) Omit manually swapping the three
MSRs at VM-Exit/VM-Entry for guest/host. 3) Make guest CET user/supervisor
states managed in a consistent manner within host kernel FPU framework.

==Solution==

This series tries to:
1) Fix existing issue regarding enabling guest supervisor states support.
2) Add CET supervisor state support in core kernel.
3) Introduce new FPU config for guest fpstate setup.

With the preparation work landed, for guest fpstate, we have xstate_bv[12]
== xcomp_bv[12] == 1 and CET supervisor state is saved/reloaded when
xsaves/xrstors executes on guest fpstate.
For non-guest/normal fpstate, we have xstate_bv[12] == xcomp_bv[12] == 0,
then HW can optimize xsaves/xrstors operations.

==Performance==

We measured context-switching performance with the benchmark [4] in following
three cases.

case 1: the baseline. i.e., this series isn't applied
case 2: baseline + this series. CET-S space is allocated for guest fpu only.
case 3: baseline + allocate CET-S space for all tasks. Hardware init
        optimization avoids writing out CET-S space on each XSAVES.

the data are as follows

case |IA32_XSS[12] | Space | RFBM[12] | Drop%	
-----+-------------+-------+----------+------
  1  |	   0	   | None  |	0     |  0.0%
  2  |	   1	   | None  |	0     |  0.2%
  3  |	   1	   | 24B?  |	1     |  0.2%

Case 2 and 3 have no difference in performnace. But case 2 is preferred because
it can save 24B of CET-S space for all non-vCPU threads with just a one-line
change in patch 3:

+	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;

Patch 4 and 5 have their own merits. Regardless of the approach we take, using
different FPU configuration settings for the guest and the kernel improves
readability, decouples them from each other, and arguably enhances
extensibility.

==Changelog==

v1->v2:
 - rebase onto the latest kvm-x86/next
 - Add performance data to the cover-letter
 - v1: https://lore.kernel.org/kvm/73802bff-833c-4233-9a5b-88af0d062c82@intel.com/

==Organization==

Patch1: Preserve guest supervisor xfeatures in __state_perm.
Patch2: Enable CET supervisor xstate support.
Patch3: Introduce kernel dynamic xfeature set.
Patch4: Initialize fpu_guest_cfg settings.
Patch5: Create guest fpstate with fpu_guest_cfg.
Patch6: Check invalid fpstate config before executes xsaves.

[1]: https://lore.kernel.org/all/20240219074733.122080-1-weijiang.yang@intel.com/
[2]: https://lore.kernel.org/all/ZM1jV3UPL0AMpVDI@google.com/
[3]: https://lore.kernel.org/all/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/
[4]: https://github.com/antonblanchard/will-it-scale/blob/master/tests/context_switch1.c

Sean Christopherson (1):
  x86/fpu/xstate: Always preserve non-user xfeatures/flags in
    __state_perm

Yang Weijiang (5):
  x86/fpu/xstate: Add CET supervisor mode state support
  x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
  x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration
  x86/fpu/xstate: Create guest fpstate with guest specific config
  x86/fpu/xstate: Warn if CET supervisor state is detected in normal
    fpstate

 arch/x86/include/asm/fpu/types.h  | 16 ++++++++--
 arch/x86/include/asm/fpu/xstate.h | 11 ++++---
 arch/x86/kernel/fpu/core.c        | 53 ++++++++++++++++++++++++-------
 arch/x86/kernel/fpu/xstate.c      | 35 +++++++++++++++-----
 arch/x86/kernel/fpu/xstate.h      |  2 ++
 5 files changed, 90 insertions(+), 27 deletions(-)

-- 
2.46.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2025-03-04 22:20   ` Dave Hansen
  2024-11-26 10:17 ` [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support Chao Gao
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Sean Christopherson <seanjc@google.com>

When granting userspace or a KVM guest access to an xfeature, preserve the
entity's existing supervisor and software-defined permissions as tracked
by __state_perm, i.e. use __state_perm to track *all* permissions even
though all supported supervisor xfeatures are granted to all FPUs and
FPU_GUEST_PERM_LOCKED disallows changing permissions.

Effectively clobbering supervisor permissions results in inconsistent
behavior, as xstate_get_group_perm() will report supervisor features for
process that do NOT request access to dynamic user xfeatures, whereas any
and all supervisor features will be absent from the set of permissions for
any process that is granted access to one or more dynamic xfeatures (which
right now means AMX).

The inconsistency isn't problematic because fpu_xstate_prctl() already
strips out everything except user xfeatures:

        case ARCH_GET_XCOMP_PERM:
                /*
                 * Lockless snapshot as it can also change right after the
                 * dropping the lock.
                 */
                permitted = xstate_get_host_group_perm();
                permitted &= XFEATURE_MASK_USER_SUPPORTED;
                return put_user(permitted, uptr);

        case ARCH_GET_XCOMP_GUEST_PERM:
                permitted = xstate_get_guest_group_perm();
                permitted &= XFEATURE_MASK_USER_SUPPORTED;
                return put_user(permitted, uptr);

and similarly KVM doesn't apply the __state_perm to supervisor states
(kvm_get_filtered_xcr0() incorporates xstate_get_guest_group_perm()):

        case 0xd: {
                u64 permitted_xcr0 = kvm_get_filtered_xcr0();
                u64 permitted_xss = kvm_caps.supported_xss;

But if KVM in particular were to ever change, dropping supervisor
permissions would result in subtle bugs in KVM's reporting of supported
CPUID settings.  And the above behavior also means that having supervisor
xfeatures in __state_perm is correctly handled by all users.

Dropping supervisor permissions also creates another landmine for KVM.  If
more dynamic user xfeatures are ever added, requesting access to multiple
xfeatures in separate ARCH_REQ_XCOMP_GUEST_PERM calls will result in the
second invocation of __xstate_request_perm() computing the wrong ksize, as
as the mask passed to xstate_calculate_size() would not contain *any*
supervisor features.

Commit 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE
permissions") fudged around the size issue for userspace FPUs, but for
reasons unknown skipped guest FPUs.  Lack of a fix for KVM "works" only
because KVM doesn't yet support virtualizing features that have supervisor
xfeatures, i.e. as of today, KVM guest FPUs will never need the relevant
xfeatures.

Simply extending the hack-a-fix for guests would temporarily solve the
ksize issue, but wouldn't address the inconsistency issue and would leave
another lurking pitfall for KVM.  KVM support for virtualizing CET will
likely add CET_KERNEL as a guest-only xfeature, i.e. CET_KERNEL will not
be set in xfeatures_mask_supervisor() and would again be dropped when
granting access to dynamic xfeatures.

Note, the existing clobbering behavior is rather subtle.  The @permitted
parameter to __xstate_request_perm() comes from:

	permitted = xstate_get_group_perm(guest);

which is either fpu->guest_perm.__state_perm or fpu->perm.__state_perm,
where __state_perm is initialized to:

        fpu->perm.__state_perm          = fpu_kernel_cfg.default_features;

and copied to the guest side of things:

	/* Same defaults for guests */
	fpu->guest_perm = fpu->perm;

fpu_kernel_cfg.default_features contains everything except the dynamic
xfeatures, i.e. everything except XFEATURE_MASK_XTILE_DATA:

        fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
        fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;

When __xstate_request_perm() restricts the local "mask" variable to
compute the user state size:

	mask &= XFEATURE_MASK_USER_SUPPORTED;
	usize = xstate_calculate_size(mask, false);

it subtly overwrites the target __state_perm with "mask" containing only
user xfeatures:

	perm = guest ? &fpu->guest_perm : &fpu->perm;
	/* Pairs with the READ_ONCE() in xstate_get_group_perm() */
	WRITE_ONCE(perm->__state_perm, mask);

Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Weijiang Yang <weijiang.yang@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: John Allen <john.allen@amd.com>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/all/ZTqgzZl-reO1m01I@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/kernel/fpu/xstate.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 22abb5ee0cf2..414fed7eb278 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1622,16 +1622,20 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
 	if ((permitted & requested) == requested)
 		return 0;
 
-	/* Calculate the resulting kernel state size */
+	/*
+	 * Calculate the resulting kernel state size.  Note, @permitted also
+	 * contains supervisor xfeatures even though supervisor are always
+	 * permitted for kernel and guest FPUs, and never permitted for user
+	 * FPUs.
+	 */
 	mask = permitted | requested;
-	/* Take supervisor states into account on the host */
-	if (!guest)
-		mask |= xfeatures_mask_supervisor();
 	ksize = xstate_calculate_size(mask, compacted);
 
-	/* Calculate the resulting user state size */
-	mask &= XFEATURE_MASK_USER_SUPPORTED;
-	usize = xstate_calculate_size(mask, false);
+	/*
+	 * Calculate the resulting user state size.  Take care not to clobber
+	 * the supervisor xfeatures in the new mask!
+	 */
+	usize = xstate_calculate_size(mask & XFEATURE_MASK_USER_SUPPORTED, false);
 
 	if (!guest) {
 		ret = validate_sigaltstack(usize);
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
  2024-11-26 10:17 ` [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2025-03-04 22:26   ` Dave Hansen
  2024-11-26 10:17 ` [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Chao Gao
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Yang Weijiang <weijiang.yang@intel.com>

Add supervisor mode state support within FPU xstate management framework.
Although supervisor shadow stack is not enabled/used today in kernel,KVM
requires the support because when KVM advertises shadow stack feature to
guest, architecturally it claims the support for both user and supervisor
modes for guest OSes(Linux or non-Linux).

CET supervisor states not only includes PL{0,1,2}_SSP but also IA32_S_CET
MSR, but the latter is not xsave-managed. In virtualization world, guest
IA32_S_CET is saved/stored into/from VM control structure. With supervisor
xstate support, guest supervisor mode shadow stack state can be properly
saved/restored when 1) guest/host FPU context is swapped 2) vCPU
thread is sched out/in.

The alternative is to enable it in KVM domain, but KVM maintainers NAKed
the solution. The external discussion can be found at [*], it ended up
with adding the support in kernel instead of KVM domain.

Note, in KVM case, guest CET supervisor state i.e., IA32_PL{0,1,2}_MSRs,
are preserved after VM-Exit until host/guest fpstates are swapped, but
since host supervisor shadow stack is disabled, the preserved MSRs won't
hurt host.

[*]: https://lore.kernel.org/all/806e26c2-8d21-9cc9-a0b7-7787dd231729@intel.com/

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/fpu/types.h  | 14 ++++++++++++--
 arch/x86/include/asm/fpu/xstate.h |  6 +++---
 arch/x86/kernel/fpu/xstate.c      |  6 +++++-
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index de16862bf230..b49e65120d34 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -118,7 +118,7 @@ enum xfeature {
 	XFEATURE_PKRU,
 	XFEATURE_PASID,
 	XFEATURE_CET_USER,
-	XFEATURE_CET_KERNEL_UNUSED,
+	XFEATURE_CET_KERNEL,
 	XFEATURE_RSRVD_COMP_13,
 	XFEATURE_RSRVD_COMP_14,
 	XFEATURE_LBR,
@@ -141,7 +141,7 @@ enum xfeature {
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
 #define XFEATURE_MASK_CET_USER		(1 << XFEATURE_CET_USER)
-#define XFEATURE_MASK_CET_KERNEL	(1 << XFEATURE_CET_KERNEL_UNUSED)
+#define XFEATURE_MASK_CET_KERNEL	(1 << XFEATURE_CET_KERNEL)
 #define XFEATURE_MASK_LBR		(1 << XFEATURE_LBR)
 #define XFEATURE_MASK_XTILE_CFG		(1 << XFEATURE_XTILE_CFG)
 #define XFEATURE_MASK_XTILE_DATA	(1 << XFEATURE_XTILE_DATA)
@@ -266,6 +266,16 @@ struct cet_user_state {
 	u64 user_ssp;
 };
 
+/*
+ * State component 12 is Control-flow Enforcement supervisor states
+ */
+struct cet_supervisor_state {
+	/* supervisor ssp pointers  */
+	u64 pl0_ssp;
+	u64 pl1_ssp;
+	u64 pl2_ssp;
+};
+
 /*
  * State component 15: Architectural LBR configuration state.
  * The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index d4427b88ee12..3b4a038d3c57 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -51,7 +51,8 @@
 
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
-					    XFEATURE_MASK_CET_USER)
+					    XFEATURE_MASK_CET_USER | \
+					    XFEATURE_MASK_CET_KERNEL)
 
 /*
  * A supervisor state component may not always contain valuable information,
@@ -78,8 +79,7 @@
  * Unsupported supervisor features. When a supervisor feature in this mask is
  * supported in the future, move it to the supported supervisor feature mask.
  */
-#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \
-					      XFEATURE_MASK_CET_KERNEL)
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 414fed7eb278..2cf6ec536c0d 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -54,7 +54,7 @@ static const char *xfeature_names[] =
 	"Protection Keys User registers",
 	"PASID state",
 	"Control-flow User registers",
-	"Control-flow Kernel registers (unused)",
+	"Control-flow Kernel registers",
 	"unknown xstate feature",
 	"unknown xstate feature",
 	"unknown xstate feature",
@@ -77,6 +77,7 @@ static unsigned short xsave_cpuid_features[] __initdata = {
 	[XFEATURE_PKRU]				= X86_FEATURE_OSPKE,
 	[XFEATURE_PASID]			= X86_FEATURE_ENQCMD,
 	[XFEATURE_CET_USER]			= X86_FEATURE_SHSTK,
+	[XFEATURE_CET_KERNEL]			= X86_FEATURE_SHSTK,
 	[XFEATURE_XTILE_CFG]			= X86_FEATURE_AMX_TILE,
 	[XFEATURE_XTILE_DATA]			= X86_FEATURE_AMX_TILE,
 };
@@ -282,6 +283,7 @@ static void __init print_xstate_features(void)
 	print_xstate_feature(XFEATURE_MASK_PKRU);
 	print_xstate_feature(XFEATURE_MASK_PASID);
 	print_xstate_feature(XFEATURE_MASK_CET_USER);
+	print_xstate_feature(XFEATURE_MASK_CET_KERNEL);
 	print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
 	print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
 }
@@ -351,6 +353,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
 	 XFEATURE_MASK_BNDCSR |			\
 	 XFEATURE_MASK_PASID |			\
 	 XFEATURE_MASK_CET_USER |		\
+	 XFEATURE_MASK_CET_KERNEL |		\
 	 XFEATURE_MASK_XTILE)
 
 /*
@@ -551,6 +554,7 @@ static bool __init check_xstate_against_struct(int nr)
 	case XFEATURE_PASID:	  return XCHECK_SZ(sz, nr, struct ia32_pasid_state);
 	case XFEATURE_XTILE_CFG:  return XCHECK_SZ(sz, nr, struct xtile_cfg);
 	case XFEATURE_CET_USER:	  return XCHECK_SZ(sz, nr, struct cet_user_state);
+	case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state);
 	case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true;
 	default:
 		XSTATE_WARN_ON(1, "No structure for xstate: %d\n", nr);
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
  2024-11-26 10:17 ` [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
  2024-11-26 10:17 ` [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2025-03-04 22:37   ` Dave Hansen
  2024-11-26 10:17 ` [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Chao Gao
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Yang Weijiang <weijiang.yang@intel.com>

Define a new XFEATURE_MASK_KERNEL_DYNAMIC mask to specify the features
that can be optionally enabled by kernel components. This is similar to
XFEATURE_MASK_USER_DYNAMIC in that it contains optional xfeatures that
can allows the FPU buffer to be dynamically sized. The difference is that
the KERNEL variant contains supervisor features and will be enabled by
kernel components that need them, and not directly by the user. Currently
it's used by KVM to configure guest dedicated fpstate for calculating
the xfeature and fpstate storage size etc.

The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which
is supported by host as they're enabled in kernel XSS MSR setting but
relevant CPU feature, i.e., supervisor shadow stack, is not enabled in
host kernel therefore it can be omitted for normal fpstate by default.

Remove the kernel dynamic feature from fpu_kernel_cfg.default_features
so that the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors
can be optimized by HW for normal fpstate.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/fpu/xstate.h | 5 ++++-
 arch/x86/kernel/fpu/xstate.c      | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 3b4a038d3c57..a212d3851429 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,9 +46,12 @@
 #define XFEATURE_MASK_USER_RESTORE	\
 	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
 
-/* Features which are dynamically enabled for a process on request */
+/* Features which are dynamically enabled per userspace request */
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
+/* Features which are dynamically enabled per kernel side request */
+#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
 					    XFEATURE_MASK_CET_USER | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 2cf6ec536c0d..c38e477e3e45 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 	/* Clean out dynamic features from default */
 	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
 	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
 
 	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
 	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
                   ` (2 preceding siblings ...)
  2024-11-26 10:17 ` [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2025-03-04 22:50   ` Dave Hansen
  2024-11-26 10:17 ` [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config Chao Gao
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Yang Weijiang <weijiang.yang@intel.com>

Define new fpu_guest_cfg to hold all guest FPU settings so that it can
differ from generic kernel FPU settings, e.g., enabling CET supervisor
xstate by default for guest fpstate while it's remained disabled in
kernel FPU config.

The kernel dynamic xfeatures are specifically used by guest fpstate now,
add the mask for guest fpstate so that guest_perm.__state_perm ==
(fpu_kernel_cfg.default_xfeature | XFEATURE_MASK_KERNEL_DYNAMIC). And
if guest fpstate is re-allocated to hold user dynamic xfeatures, the
resulting permissions are consumed before calculate new guest fpstate.

With new guest FPU config added, there're 3 categories of FPU configs in
kernel, the usages and key fields are recapped as below.

kernel FPU config:
  @fpu_kernel_cfg.max_features
  - all known and CPU supported user and supervisor features except
    independent kernel features

  @fpu_kernel_cfg.default_features
  - all known and CPU supported user and supervisor features except
    dynamic kernel features, independent kernel features and dynamic
    userspace features.

  @fpu_kernel_cfg.max_size
  - size of compacted buffer with 'fpu_kernel_cfg.max_features'

  @fpu_kernel_cfg.default_size
  - size of compacted buffer with 'fpu_kernel_cfg.default_features'

user FPU config:
  @fpu_user_cfg.max_features
  - all known and CPU supported user features

  @fpu_user_cfg.default_features
  - all known and CPU supported user features except dynamic userspace
    features.

  @fpu_user_cfg.max_size
  - size of non-compacted buffer with 'fpu_user_cfg.max_features'

  @fpu_user_cfg.default_size
  - size of non-compacted buffer with 'fpu_user_cfg.default_features'

guest FPU config:
  @fpu_guest_cfg.max_features
  - all known and CPU supported user and supervisor features except
    independent kernel features.

  @fpu_guest_cfg.default_features
  - all known and CPU supported user and supervisor features except
    independent kernel features and dynamic userspace features.

  @fpu_guest_cfg.max_size
  - size of compacted buffer with 'fpu_guest_cfg.max_features'

  @fpu_guest_cfg.default_size
  - size of compacted buffer with 'fpu_guest_cfg.default_features'

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/include/asm/fpu/types.h |  2 +-
 arch/x86/kernel/fpu/core.c       | 14 +++++++++++---
 arch/x86/kernel/fpu/xstate.c     | 10 ++++++++++
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index b49e65120d34..da6583a1c0a2 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -611,6 +611,6 @@ struct fpu_state_config {
 };
 
 /* FPU state configuration information */
-extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
+extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg, fpu_guest_cfg;
 
 #endif /* _ASM_X86_FPU_TYPES_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1209c7aebb21..9e2e5c46cf28 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -33,9 +33,10 @@ DEFINE_STATIC_KEY_FALSE(__fpu_state_size_dynamic);
 DEFINE_PER_CPU(u64, xfd_state);
 #endif
 
-/* The FPU state configuration data for kernel and user space */
+/* The FPU state configuration data for kernel, user space and guest. */
 struct fpu_state_config	fpu_kernel_cfg __ro_after_init;
 struct fpu_state_config fpu_user_cfg __ro_after_init;
+struct fpu_state_config fpu_guest_cfg __ro_after_init;
 
 /*
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
@@ -536,8 +537,15 @@ void fpstate_reset(struct fpu *fpu)
 	fpu->perm.__state_perm		= fpu_kernel_cfg.default_features;
 	fpu->perm.__state_size		= fpu_kernel_cfg.default_size;
 	fpu->perm.__user_state_size	= fpu_user_cfg.default_size;
-	/* Same defaults for guests */
-	fpu->guest_perm = fpu->perm;
+
+	/* Guest permission settings */
+	fpu->guest_perm.__state_perm	= fpu_guest_cfg.default_features;
+	fpu->guest_perm.__state_size	= fpu_guest_cfg.default_size;
+	/*
+	 * Set guest's __user_state_size to fpu_user_cfg.default_size so that
+	 * existing uAPIs can still work.
+	 */
+	fpu->guest_perm.__user_state_size = fpu_user_cfg.default_size;
 }
 
 static inline void fpu_inherit_perms(struct fpu *dst_fpu)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c38e477e3e45..17c3255dfa41 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -686,6 +686,7 @@ static int __init init_xstate_size(void)
 {
 	/* Recompute the context size for enabled features: */
 	unsigned int user_size, kernel_size, kernel_default_size;
+	unsigned int guest_default_size;
 	bool compacted = cpu_feature_enabled(X86_FEATURE_XCOMPACTED);
 
 	/* Uncompacted user space size */
@@ -707,13 +708,18 @@ static int __init init_xstate_size(void)
 	kernel_default_size =
 		xstate_calculate_size(fpu_kernel_cfg.default_features, compacted);
 
+	guest_default_size =
+		xstate_calculate_size(fpu_guest_cfg.default_features, compacted);
+
 	if (!paranoid_xstate_size_valid(kernel_size))
 		return -EINVAL;
 
 	fpu_kernel_cfg.max_size = kernel_size;
 	fpu_user_cfg.max_size = user_size;
+	fpu_guest_cfg.max_size = kernel_size;
 
 	fpu_kernel_cfg.default_size = kernel_default_size;
+	fpu_guest_cfg.default_size = guest_default_size;
 	fpu_user_cfg.default_size =
 		xstate_calculate_size(fpu_user_cfg.default_features, false);
 
@@ -829,6 +835,10 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
 	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
 
+	fpu_guest_cfg.max_features = fpu_kernel_cfg.max_features;
+	fpu_guest_cfg.default_features = fpu_guest_cfg.max_features;
+	fpu_guest_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+
 	/* Store it for paranoia check at the end */
 	xfeatures = fpu_kernel_cfg.max_features;
 
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
                   ` (3 preceding siblings ...)
  2024-11-26 10:17 ` [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2025-03-04 23:02   ` Dave Hansen
  2024-11-26 10:17 ` [PATCH v2 6/6] x86/fpu/xstate: Warn if CET supervisor state is detected in normal fpstate Chao Gao
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Yang Weijiang <weijiang.yang@intel.com>

Use fpu_guest_cfg to calculate guest fpstate settings, open code for
__fpstate_reset() to avoid using kernel FPU config.

Below configuration steps are currently enforced to get guest fpstate:
1) Kernel sets up guest FPU settings in fpu__init_system_xstate().
2) User space sets vCPU thread group xstate permits via arch_prctl().
3) User space creates guest fpstate via __fpu_alloc_init_guest_fpstate()
   for vcpu thread.
4) User space enables guest dynamic xfeatures and re-allocate guest
   fpstate.

By adding kernel dynamic xfeatures in above #1 and #2, guest xstate area
size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic
xfeatures | user dynamic xfeatures), then host xsaves/xrstors can operate
for all guest xfeatures.

The user_* fields remain unchanged for compatibility with KVM uAPIs.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 arch/x86/kernel/fpu/core.c | 39 +++++++++++++++++++++++++++++---------
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9e2e5c46cf28..00d7dcf45b34 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -194,8 +194,6 @@ void fpu_reset_from_exception_fixup(void)
 }
 
 #if IS_ENABLED(CONFIG_KVM)
-static void __fpstate_reset(struct fpstate *fpstate, u64 xfd);
-
 static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
 {
 	struct fpu_state_perm *fpuperm;
@@ -216,25 +214,48 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
 	gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED;
 }
 
-bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
+static struct fpstate *__fpu_alloc_init_guest_fpstate(struct fpu_guest *gfpu)
 {
 	struct fpstate *fpstate;
 	unsigned int size;
 
-	size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
+	/*
+	 * fpu_guest_cfg.default_size is initialized to hold all enabled
+	 * xfeatures except the user dynamic xfeatures. If the user dynamic
+	 * xfeatures are enabled, the guest fpstate will be re-allocated to
+	 * hold all guest enabled xfeatures, so omit user dynamic xfeatures
+	 * here.
+	 */
+	size = fpu_guest_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
+
 	fpstate = vzalloc(size);
 	if (!fpstate)
-		return false;
+		return NULL;
+	/*
+	 * Initialize sizes and feature masks, use fpu_user_cfg.*
+	 * for user_* settings for compatibility of exiting uAPIs.
+	 */
+	fpstate->size		= fpu_guest_cfg.default_size;
+	fpstate->xfeatures	= fpu_guest_cfg.default_features;
+	fpstate->user_size	= fpu_user_cfg.default_size;
+	fpstate->user_xfeatures	= fpu_user_cfg.default_features;
+	fpstate->xfd		= 0;
 
-	/* Leave xfd to 0 (the reset value defined by spec) */
-	__fpstate_reset(fpstate, 0);
 	fpstate_init_user(fpstate);
 	fpstate->is_valloc	= true;
 	fpstate->is_guest	= true;
 
 	gfpu->fpstate		= fpstate;
-	gfpu->xfeatures		= fpu_user_cfg.default_features;
-	gfpu->perm		= fpu_user_cfg.default_features;
+	gfpu->xfeatures		= fpu_guest_cfg.default_features;
+	gfpu->perm		= fpu_guest_cfg.default_features;
+
+	return fpstate;
+}
+
+bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
+{
+	if (!__fpu_alloc_init_guest_fpstate(gfpu))
+		return false;
 
 	/*
 	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 6/6] x86/fpu/xstate: Warn if CET supervisor state is detected in normal fpstate
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
                   ` (4 preceding siblings ...)
  2024-11-26 10:17 ` [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config Chao Gao
@ 2024-11-26 10:17 ` Chao Gao
  2024-11-26 10:28 ` [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
  2025-02-19  1:44 ` Chao Gao
  7 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:17 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen,
	Chao Gao

From: Yang Weijiang <weijiang.yang@intel.com>

CET supervisor state bit is __ONLY__ enabled for guest fpstate, i.e.,
never for normal kernel fpstate. The bit is set when guest FPU config
is initialized.

For normal fpstate, the bit should have been removed when initializes
kernel FPU config settings, WARN_ONCE() if kernel detects normal fpstate
xfeatures contains CET supervisor state bit before xsaves operation.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/kernel/fpu/xstate.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 0b86a5002c84..3b60d3775705 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -187,6 +187,8 @@ static inline void os_xsave(struct fpstate *fpstate)
 	WARN_ON_FPU(!alternatives_patched);
 	xfd_validate_state(fpstate, mask, false);
 
+	WARN_ON_FPU(!fpstate->is_guest && (mask & XFEATURE_MASK_CET_KERNEL));
+
 	XSTATE_XSAVE(&fpstate->regs.xsave, lmask, hmask, err);
 
 	/* We should never fault when copying to a kernel buffer: */
-- 
2.46.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
                   ` (5 preceding siblings ...)
  2024-11-26 10:17 ` [PATCH v2 6/6] x86/fpu/xstate: Warn if CET supervisor state is detected in normal fpstate Chao Gao
@ 2024-11-26 10:28 ` Chao Gao
  2025-01-11  1:26   ` Sean Christopherson
  2025-02-19  1:44 ` Chao Gao
  7 siblings, 1 reply; 23+ messages in thread
From: Chao Gao @ 2024-11-26 10:28 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
>This v2 is essentially a resend of the v1 series. I took over this work
>from Weijiang, so I added my Signed-off-by and incremented the version
>number. This repost is to seek more feedback on this work, which is a
>dependency for CET KVM support. In turn, CET KVM support is a dependency
>for both FRED KVM support and CET AMD support.

This series is primarily for the CET KVM series. Merging it through the tip
tree means this code will not have an actual user until the CET KVM series
is merged. A good proposal from Rick is that x86 maintainers can ack this
series, and then it can be picked up by the KVM maintainers along with the
CET KVM series. Dave, Paolo and Sean, are you okay with this approach?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2024-11-26 10:28 ` [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
@ 2025-01-11  1:26   ` Sean Christopherson
  2025-02-11  2:09     ` Xin Li
  2025-03-04 19:40     ` Xin Li
  0 siblings, 2 replies; 23+ messages in thread
From: Sean Christopherson @ 2025-01-11  1:26 UTC (permalink / raw)
  To: Chao Gao
  Cc: tglx, dave.hansen, x86, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Nov 26, 2024, Chao Gao wrote:
> On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
> >This v2 is essentially a resend of the v1 series. I took over this work
> >from Weijiang, so I added my Signed-off-by and incremented the version
> >number. This repost is to seek more feedback on this work, which is a
> >dependency for CET KVM support. In turn, CET KVM support is a dependency
> >for both FRED KVM support and CET AMD support.
> 
> This series is primarily for the CET KVM series. Merging it through the tip
> tree means this code will not have an actual user until the CET KVM series
> is merged. A good proposal from Rick is that x86 maintainers can ack this
> series, and then it can be picked up by the KVM maintainers along with the
> CET KVM series. Dave, Paolo and Sean, are you okay with this approach?

Boris indicated off-list that he would prefer to take this through tip and give
KVM an immutable branch.  I'm a-ok with either approach.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2025-01-11  1:26   ` Sean Christopherson
@ 2025-02-11  2:09     ` Xin Li
  2025-03-04 19:40     ` Xin Li
  1 sibling, 0 replies; 23+ messages in thread
From: Xin Li @ 2025-02-11  2:09 UTC (permalink / raw)
  To: Sean Christopherson, Chao Gao, Borislav Petkov
  Cc: tglx, dave.hansen, x86, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 1/10/2025 5:26 PM, Sean Christopherson wrote:
> On Tue, Nov 26, 2024, Chao Gao wrote:
>> On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
>>> This v2 is essentially a resend of the v1 series. I took over this work
>> >from Weijiang, so I added my Signed-off-by and incremented the version
>>> number. This repost is to seek more feedback on this work, which is a
>>> dependency for CET KVM support. In turn, CET KVM support is a dependency
>>> for both FRED KVM support and CET AMD support.
>>
>> This series is primarily for the CET KVM series. Merging it through the tip
>> tree means this code will not have an actual user until the CET KVM series
>> is merged. A good proposal from Rick is that x86 maintainers can ack this
>> series, and then it can be picked up by the KVM maintainers along with the
>> CET KVM series. Dave, Paolo and Sean, are you okay with this approach?
> 
> Boris indicated off-list that he would prefer to take this through tip and give
> KVM an immutable branch.  I'm a-ok with either approach.
> 

So is the plan to merge this patch set in the v6.14 cycle?

Thanks!
     Xin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
                   ` (6 preceding siblings ...)
  2024-11-26 10:28 ` [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
@ 2025-02-19  1:44 ` Chao Gao
  7 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-02-19  1:44 UTC (permalink / raw)
  To: tglx, dave.hansen, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
>This v2 is essentially a resend of the v1 series. I took over this work
>from Weijiang, so I added my Signed-off-by and incremented the version
>number. This repost is to seek more feedback on this work, which is a
>dependency for CET KVM support. In turn, CET KVM support is a dependency
>for both FRED KVM support and CET AMD support.
>
>==Background==
>
>This series spins off from CET KVM virtualization enabling series [1].
>The purpose is to get these preparation work resolved ahead of KVM part
>landing. There was a discussion about introducing CET supervisor state
>support [2] [3].

Gentle ping.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2025-01-11  1:26   ` Sean Christopherson
  2025-02-11  2:09     ` Xin Li
@ 2025-03-04 19:40     ` Xin Li
  2025-03-04 19:53       ` Sean Christopherson
  1 sibling, 1 reply; 23+ messages in thread
From: Xin Li @ 2025-03-04 19:40 UTC (permalink / raw)
  To: Sean Christopherson, Chao Gao, Borislav Petkov
  Cc: tglx, dave.hansen, x86, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 1/10/2025 5:26 PM, Sean Christopherson wrote:
> On Tue, Nov 26, 2024, Chao Gao wrote:
>> On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
>>> This v2 is essentially a resend of the v1 series. I took over this work
>> >from Weijiang, so I added my Signed-off-by and incremented the version
>>> number. This repost is to seek more feedback on this work, which is a
>>> dependency for CET KVM support. In turn, CET KVM support is a dependency
>>> for both FRED KVM support and CET AMD support.
>>
>> This series is primarily for the CET KVM series. Merging it through the tip
>> tree means this code will not have an actual user until the CET KVM series
>> is merged. A good proposal from Rick is that x86 maintainers can ack this
>> series, and then it can be picked up by the KVM maintainers along with the
>> CET KVM series. Dave, Paolo and Sean, are you okay with this approach?
> 
> Boris indicated off-list that he would prefer to take this through tip and give
> KVM an immutable branch.  I'm a-ok with either approach.
> 

Hi Sean and Boris,

At this point because KVM is the only user of this feature, would it
make more sense to take this patch set through KVM x86 tree?

Thanks!
     Xin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] Introduce CET supervisor state support
  2025-03-04 19:40     ` Xin Li
@ 2025-03-04 19:53       ` Sean Christopherson
  0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2025-03-04 19:53 UTC (permalink / raw)
  To: Xin Li
  Cc: Chao Gao, Borislav Petkov, tglx, dave.hansen, x86, pbonzini,
	linux-kernel, kvm, peterz, rick.p.edgecombe, mlevitsk,
	weijiang.yang, john.allen

On Tue, Mar 04, 2025, Xin Li wrote:
> On 1/10/2025 5:26 PM, Sean Christopherson wrote:
> > On Tue, Nov 26, 2024, Chao Gao wrote:
> > > On Tue, Nov 26, 2024 at 06:17:04PM +0800, Chao Gao wrote:
> > > > This v2 is essentially a resend of the v1 series. I took over this work
> > > >from Weijiang, so I added my Signed-off-by and incremented the version
> > > > number. This repost is to seek more feedback on this work, which is a
> > > > dependency for CET KVM support. In turn, CET KVM support is a dependency
> > > > for both FRED KVM support and CET AMD support.
> > > 
> > > This series is primarily for the CET KVM series. Merging it through the tip
> > > tree means this code will not have an actual user until the CET KVM series
> > > is merged. A good proposal from Rick is that x86 maintainers can ack this
> > > series, and then it can be picked up by the KVM maintainers along with the
> > > CET KVM series. Dave, Paolo and Sean, are you okay with this approach?
> > 
> > Boris indicated off-list that he would prefer to take this through tip and give
> > KVM an immutable branch.  I'm a-ok with either approach.
> > 
> 
> Hi Sean and Boris,
> 
> At this point because KVM is the only user of this feature, would it
> make more sense to take this patch set through KVM x86 tree?

Which tree it goes through is largely irrelevant, it needs explicit acceptance
from the tip tree folks no matter what.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm
  2024-11-26 10:17 ` [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
@ 2025-03-04 22:20   ` Dave Hansen
  2025-03-05  8:25     ` Chao Gao
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2025-03-04 22:20 UTC (permalink / raw)
  To: Chao Gao, tglx, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 11/26/24 02:17, Chao Gao wrote:
> When granting userspace or a KVM guest access to an xfeature, preserve the
> entity's existing supervisor and software-defined permissions as tracked
> by __state_perm, i.e. use __state_perm to track *all* permissions even
> though all supported supervisor xfeatures are granted to all FPUs and
> FPU_GUEST_PERM_LOCKED disallows changing permissions.

Should we document what __state_perm contains either in
fpu_state_perm or xstate_get_group_perm()?

Either way:

Acked-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support
  2024-11-26 10:17 ` [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support Chao Gao
@ 2025-03-04 22:26   ` Dave Hansen
  2025-03-05  8:32     ` Chao Gao
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2025-03-04 22:26 UTC (permalink / raw)
  To: Chao Gao, tglx, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 11/26/24 02:17, Chao Gao wrote:
...
> The alternative is to enable it in KVM domain, but KVM maintainers NAKed
> the solution. The external discussion can be found at [*], it ended up
> with adding the support in kernel instead of KVM domain.
So, there's a lot of changelog here, but scant details.

This patch enables XFEATURE_CET_KERNEL everywhere it's available, right?
So, this patch at least wastes the XSAVE buffer space and doesn't
actually get anything. Right?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
  2024-11-26 10:17 ` [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Chao Gao
@ 2025-03-04 22:37   ` Dave Hansen
  2025-03-05  8:43     ` Chao Gao
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2025-03-04 22:37 UTC (permalink / raw)
  To: Chao Gao, tglx, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

Subjects should ideally be written without large identifiers:

	Subject: x86/fpu/xstate: Introduce dynamic kernel features

On 11/26/24 02:17, Chao Gao wrote:
> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features
> so that the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors
> can be optimized by HW for normal fpstate.

I'm really confused why this changelog hunk is here.

Right now, all kernel XSAVE buffers have the same supervisor xfeatures.
This introduces the idea that they can have different xfeature sets.

The _purpose_ of this is to save space when only some FPUs need a given
feature. This saves space in 'struct fpu'. It probably doesn't actually
make XSAVE/XRSTOR any faster because the init optimization is very
likely to be in place for these features.

I'm not sure what point the changelog was trying to make about xstate_bv
and xcomp_bv.  xstate_bv[12] would have been 0 if the feature was in its
init state.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration
  2024-11-26 10:17 ` [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Chao Gao
@ 2025-03-04 22:50   ` Dave Hansen
  2025-03-05  8:57     ` Chao Gao
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2025-03-04 22:50 UTC (permalink / raw)
  To: Chao Gao, tglx, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 11/26/24 02:17, Chao Gao wrote:
> From: Yang Weijiang <weijiang.yang@intel.com>
> 
> Define new fpu_guest_cfg to hold all guest FPU settings so that it can
> differ from generic kernel FPU settings, e.g., enabling CET supervisor
> xstate by default for guest fpstate while it's remained disabled in
> kernel FPU config.
> 
> The kernel dynamic xfeatures are specifically used by guest fpstate now,
> add the mask for guest fpstate so that guest_perm.__state_perm ==
> (fpu_kernel_cfg.default_xfeature | XFEATURE_MASK_KERNEL_DYNAMIC). And
> if guest fpstate is re-allocated to hold user dynamic xfeatures, the
> resulting permissions are consumed before calculate new guest fpstate.

This kinda restates what the code does, but I don't think it matches the
code.

> With new guest FPU config added, there're 3 categories of FPU configs in
> kernel, the usages and key fields are recapped as below.

This changelog is pretty rough. It's got a lot of words but not much
substance.


> kernel FPU config:
>   @fpu_kernel_cfg.max_features
>   - all known and CPU supported user and supervisor features except
>     independent kernel features
> 
>   @fpu_kernel_cfg.default_features
>   - all known and CPU supported user and supervisor features except
>     dynamic kernel features, independent kernel features and dynamic
>     userspace features.
> 
>   @fpu_kernel_cfg.max_size
>   - size of compacted buffer with 'fpu_kernel_cfg.max_features'
> 
>   @fpu_kernel_cfg.default_size
>   - size of compacted buffer with 'fpu_kernel_cfg.default_features'
> 
> user FPU config:
>   @fpu_user_cfg.max_features
>   - all known and CPU supported user features
> 
>   @fpu_user_cfg.default_features
>   - all known and CPU supported user features except dynamic userspace
>     features.
> 
>   @fpu_user_cfg.max_size
>   - size of non-compacted buffer with 'fpu_user_cfg.max_features'
> 
>   @fpu_user_cfg.default_size
>   - size of non-compacted buffer with 'fpu_user_cfg.default_features'
> 
> guest FPU config:
>   @fpu_guest_cfg.max_features
>   - all known and CPU supported user and supervisor features except
>     independent kernel features.
> 
>   @fpu_guest_cfg.default_features
>   - all known and CPU supported user and supervisor features except
>     independent kernel features and dynamic userspace features.
> 
>   @fpu_guest_cfg.max_size
>   - size of compacted buffer with 'fpu_guest_cfg.max_features'
> 
>   @fpu_guest_cfg.default_size
>   - size of compacted buffer with 'fpu_guest_cfg.default_features'

This looks like kerneldoc, not changelog. Are you sure you want it _here_?


> diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> index b49e65120d34..da6583a1c0a2 100644
> --- a/arch/x86/include/asm/fpu/types.h
> +++ b/arch/x86/include/asm/fpu/types.h
> @@ -611,6 +611,6 @@ struct fpu_state_config {
>  };
>  
>  /* FPU state configuration information */
> -extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
> +extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg, fpu_guest_cfg;
>  
>  #endif /* _ASM_X86_FPU_TYPES_H */
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 1209c7aebb21..9e2e5c46cf28 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -33,9 +33,10 @@ DEFINE_STATIC_KEY_FALSE(__fpu_state_size_dynamic);
>  DEFINE_PER_CPU(u64, xfd_state);
>  #endif
>  
> -/* The FPU state configuration data for kernel and user space */
> +/* The FPU state configuration data for kernel, user space and guest. */
>  struct fpu_state_config	fpu_kernel_cfg __ro_after_init;
>  struct fpu_state_config fpu_user_cfg __ro_after_init;
> +struct fpu_state_config fpu_guest_cfg __ro_after_init;
>  
>  /*
>   * Represents the initial FPU state. It's mostly (but not completely) zeroes,
> @@ -536,8 +537,15 @@ void fpstate_reset(struct fpu *fpu)
>  	fpu->perm.__state_perm		= fpu_kernel_cfg.default_features;
>  	fpu->perm.__state_size		= fpu_kernel_cfg.default_size;
>  	fpu->perm.__user_state_size	= fpu_user_cfg.default_size;
> -	/* Same defaults for guests */
> -	fpu->guest_perm = fpu->perm;
> +
> +	/* Guest permission settings */
> +	fpu->guest_perm.__state_perm	= fpu_guest_cfg.default_features;
> +	fpu->guest_perm.__state_size	= fpu_guest_cfg.default_size;
> +	/*
> +	 * Set guest's __user_state_size to fpu_user_cfg.default_size so that
> +	 * existing uAPIs can still work.
> +	 */
> +	fpu->guest_perm.__user_state_size = fpu_user_cfg.default_size;
>  }
>  
>  static inline void fpu_inherit_perms(struct fpu *dst_fpu)
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c38e477e3e45..17c3255dfa41 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -686,6 +686,7 @@ static int __init init_xstate_size(void)
>  {
>  	/* Recompute the context size for enabled features: */
>  	unsigned int user_size, kernel_size, kernel_default_size;
> +	unsigned int guest_default_size;
>  	bool compacted = cpu_feature_enabled(X86_FEATURE_XCOMPACTED);
>  
>  	/* Uncompacted user space size */
> @@ -707,13 +708,18 @@ static int __init init_xstate_size(void)
>  	kernel_default_size =
>  		xstate_calculate_size(fpu_kernel_cfg.default_features, compacted);
>  
> +	guest_default_size =
> +		xstate_calculate_size(fpu_guest_cfg.default_features, compacted);
> +
>  	if (!paranoid_xstate_size_valid(kernel_size))
>  		return -EINVAL;
>  
>  	fpu_kernel_cfg.max_size = kernel_size;
>  	fpu_user_cfg.max_size = user_size;
> +	fpu_guest_cfg.max_size = kernel_size;
>  
>  	fpu_kernel_cfg.default_size = kernel_default_size;
> +	fpu_guest_cfg.default_size = guest_default_size;
>  	fpu_user_cfg.default_size =
>  		xstate_calculate_size(fpu_user_cfg.default_features, false);
>  
> @@ -829,6 +835,10 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>  	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>  	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>  
> +	fpu_guest_cfg.max_features = fpu_kernel_cfg.max_features;
> +	fpu_guest_cfg.default_features = fpu_guest_cfg.max_features;
> +	fpu_guest_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;

I thought this was saying above that it was _setting_ dynamic features.

Why _not_ set them by default?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config
  2024-11-26 10:17 ` [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config Chao Gao
@ 2025-03-04 23:02   ` Dave Hansen
  2025-03-05  9:04     ` Chao Gao
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Hansen @ 2025-03-04 23:02 UTC (permalink / raw)
  To: Chao Gao, tglx, x86, seanjc, pbonzini, linux-kernel, kvm
  Cc: peterz, rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On 11/26/24 02:17, Chao Gao wrote:
> From: Yang Weijiang <weijiang.yang@intel.com>
> 
> Use fpu_guest_cfg to calculate guest fpstate settings, open code for
> __fpstate_reset() to avoid using kernel FPU config.
> 
> Below configuration steps are currently enforced to get guest fpstate:
> 1) Kernel sets up guest FPU settings in fpu__init_system_xstate().
> 2) User space sets vCPU thread group xstate permits via arch_prctl().
> 3) User space creates guest fpstate via __fpu_alloc_init_guest_fpstate()
>    for vcpu thread.
> 4) User space enables guest dynamic xfeatures and re-allocate guest
>    fpstate.
> 
> By adding kernel dynamic xfeatures in above #1 and #2, guest xstate area
> size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic
> xfeatures | user dynamic xfeatures), then host xsaves/xrstors can operate
> for all guest xfeatures.

These changelogs have a lot of content, but they're honestly not helping
my along here very much.

>  arch/x86/kernel/fpu/core.c | 39 +++++++++++++++++++++++++++++---------
>  1 file changed, 30 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 9e2e5c46cf28..00d7dcf45b34 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -194,8 +194,6 @@ void fpu_reset_from_exception_fixup(void)
>  }
>  
>  #if IS_ENABLED(CONFIG_KVM)
> -static void __fpstate_reset(struct fpstate *fpstate, u64 xfd);
> -
>  static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
>  {
>  	struct fpu_state_perm *fpuperm;
> @@ -216,25 +214,48 @@ static void fpu_init_guest_permissions(struct fpu_guest *gfpu)
>  	gfpu->perm = perm & ~FPU_GUEST_PERM_LOCKED;
>  }
>  
> -bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
> +static struct fpstate *__fpu_alloc_init_guest_fpstate(struct fpu_guest *gfpu)
>  {
>  	struct fpstate *fpstate;
>  	unsigned int size;
>  
> -	size = fpu_user_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
> +	/*
> +	 * fpu_guest_cfg.default_size is initialized to hold all enabled
> +	 * xfeatures except the user dynamic xfeatures. If the user dynamic
> +	 * xfeatures are enabled, the guest fpstate will be re-allocated to
> +	 * hold all guest enabled xfeatures, so omit user dynamic xfeatures
> +	 * here.
> +	 */
> +	size = fpu_guest_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
> +
>  	fpstate = vzalloc(size);
>  	if (!fpstate)
> -		return false;
> +		return NULL;
> +	/*
> +	 * Initialize sizes and feature masks, use fpu_user_cfg.*
> +	 * for user_* settings for compatibility of exiting uAPIs.
> +	 */
> +	fpstate->size		= fpu_guest_cfg.default_size;
> +	fpstate->xfeatures	= fpu_guest_cfg.default_features;
> +	fpstate->user_size	= fpu_user_cfg.default_size;
> +	fpstate->user_xfeatures	= fpu_user_cfg.default_features;
> +	fpstate->xfd		= 0;
>  
> -	/* Leave xfd to 0 (the reset value defined by spec) */
> -	__fpstate_reset(fpstate, 0);
>  	fpstate_init_user(fpstate);
>  	fpstate->is_valloc	= true;
>  	fpstate->is_guest	= true;
>  
>  	gfpu->fpstate		= fpstate;
> -	gfpu->xfeatures		= fpu_user_cfg.default_features;
> -	gfpu->perm		= fpu_user_cfg.default_features;
> +	gfpu->xfeatures		= fpu_guest_cfg.default_features;
> +	gfpu->perm		= fpu_guest_cfg.default_features;
> +
> +	return fpstate;
> +}
> +
> +bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
> +{
> +	if (!__fpu_alloc_init_guest_fpstate(gfpu))
> +		return false;
>  
>  	/*
>  	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state

This series is starting to look backward to me.

The normal way you do these things is that you introduce new
abstractions and refactor the code. Then you go adding features.

For instance, this series should spend a few patches introducing
'fpu_guest_cfg' and using it before ever introducing the concept of a
dynamic xfeature.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm
  2025-03-04 22:20   ` Dave Hansen
@ 2025-03-05  8:25     ` Chao Gao
  0 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-03-05  8:25 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, x86, seanjc, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Mar 04, 2025 at 02:20:31PM -0800, Dave Hansen wrote:
>On 11/26/24 02:17, Chao Gao wrote:
>> When granting userspace or a KVM guest access to an xfeature, preserve the
>> entity's existing supervisor and software-defined permissions as tracked
>> by __state_perm, i.e. use __state_perm to track *all* permissions even
>> though all supported supervisor xfeatures are granted to all FPUs and
>> FPU_GUEST_PERM_LOCKED disallows changing permissions.
>
>Should we document what __state_perm contains either in
>fpu_state_perm or xstate_get_group_perm()?

Yes. I think we should document it. Will apply this change:

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index da6583a1c0a2..93481583dc85 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -417,9 +417,11 @@ struct fpu_state_perm {
 	/*
 	 * @__state_perm:
 	 *
-	 * This bitmap indicates the permission for state components, which
-	 * are available to a thread group. The permission prctl() sets the
-	 * enabled state bits in thread_group_leader()->thread.fpu.
+	 * This bitmap indicates the permission for state components
+	 * available to a thread group, including both user and supervisor
+	 * components and software-defined bits like FPU_GUEST_PERM_LOCKED.
+	 * The permission prctl() sets the enabled state bits in
+	 * thread_group_leader()->thread.fpu.
 	 *
 	 * All run time operations use the per thread information in the
 	 * currently active fpu.fpstate which contains the xfeature masks

>
>Either way:
>
>Acked-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support
  2025-03-04 22:26   ` Dave Hansen
@ 2025-03-05  8:32     ` Chao Gao
  0 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-03-05  8:32 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, x86, seanjc, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Mar 04, 2025 at 02:26:01PM -0800, Dave Hansen wrote:
>On 11/26/24 02:17, Chao Gao wrote:
>...
>> The alternative is to enable it in KVM domain, but KVM maintainers NAKed
>> the solution. The external discussion can be found at [*], it ended up
>> with adding the support in kernel instead of KVM domain.
>So, there's a lot of changelog here, but scant details.

Let me refine the changelog to ensure it's clear and concise and includes all
necessary details.

>
>This patch enables XFEATURE_CET_KERNEL everywhere it's available, right?

Yes

>So, this patch at least wastes the XSAVE buffer space and doesn't
>actually get anything. Right?

Yes

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
  2025-03-04 22:37   ` Dave Hansen
@ 2025-03-05  8:43     ` Chao Gao
  0 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-03-05  8:43 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, x86, seanjc, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Mar 04, 2025 at 02:37:00PM -0800, Dave Hansen wrote:
>Subjects should ideally be written without large identifiers:
>
>	Subject: x86/fpu/xstate: Introduce dynamic kernel features

will do.

>
>On 11/26/24 02:17, Chao Gao wrote:
>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features
>> so that the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors
>> can be optimized by HW for normal fpstate.
>
>I'm really confused why this changelog hunk is here.
>
>Right now, all kernel XSAVE buffers have the same supervisor xfeatures.
>This introduces the idea that they can have different xfeature sets.
>
>The _purpose_ of this is to save space when only some FPUs need a given
>feature. This saves space in 'struct fpu'. It probably doesn't actually
>make XSAVE/XRSTOR any faster because the init optimization is very
>likely to be in place for these features.

Thanks for the detailed explanation. You're absolutely right—this change
aims to reduce the size of struct fpu. I see now that implying a perf
improvement for XSAVE/XRSTOR was misleading, as the underlying optimization
is likely handled by the init optimization. I'll revise the changelog to
clarify this and incorporate your wording to avoid confusion.

>
>I'm not sure what point the changelog was trying to make about xstate_bv
>and xcomp_bv.  xstate_bv[12] would have been 0 if the feature was in its
>init state.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration
  2025-03-04 22:50   ` Dave Hansen
@ 2025-03-05  8:57     ` Chao Gao
  0 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-03-05  8:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, x86, seanjc, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Mar 04, 2025 at 02:50:48PM -0800, Dave Hansen wrote:
>On 11/26/24 02:17, Chao Gao wrote:
>> From: Yang Weijiang <weijiang.yang@intel.com>
>> 
>> Define new fpu_guest_cfg to hold all guest FPU settings so that it can
>> differ from generic kernel FPU settings, e.g., enabling CET supervisor
>> xstate by default for guest fpstate while it's remained disabled in
>> kernel FPU config.
>> 
>> The kernel dynamic xfeatures are specifically used by guest fpstate now,
>> add the mask for guest fpstate so that guest_perm.__state_perm ==
>> (fpu_kernel_cfg.default_xfeature | XFEATURE_MASK_KERNEL_DYNAMIC). And
>> if guest fpstate is re-allocated to hold user dynamic xfeatures, the
>> resulting permissions are consumed before calculate new guest fpstate.
>
>This kinda restates what the code does, but I don't think it matches the
>code.

Actually, it does (see more below)

>
>> With new guest FPU config added, there're 3 categories of FPU configs in
>> kernel, the usages and key fields are recapped as below.
>
>This changelog is pretty rough. It's got a lot of words but not much
>substance.

Will revise the changelog.

[...]

>
>
>> 
>>   @fpu_guest_cfg.default_size
>>   - size of compacted buffer with 'fpu_guest_cfg.default_features'
>
>This looks like kerneldoc, not changelog. Are you sure you want it _here_?

No, I agree this should be a comment above fpu_guest/kernel/user_cfg.

>
>
>> @@ -829,6 +835,10 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>  	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>  	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>  
>> +	fpu_guest_cfg.max_features = fpu_kernel_cfg.max_features;
>> +	fpu_guest_cfg.default_features = fpu_guest_cfg.max_features;
>> +	fpu_guest_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>
>I thought this was saying above that it was _setting_ dynamic features.
>
>Why _not_ set them by default?

At first glance, I had the same question.

XFEATURE_MASK_*KERNEL*_DYNAMIC is not excluded here, so it is enabled by
default. I believe the confusion arises partly from the order of the
patches. I will reorder the patches as you suggested in patch 5.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config
  2025-03-04 23:02   ` Dave Hansen
@ 2025-03-05  9:04     ` Chao Gao
  0 siblings, 0 replies; 23+ messages in thread
From: Chao Gao @ 2025-03-05  9:04 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, x86, seanjc, pbonzini, linux-kernel, kvm, peterz,
	rick.p.edgecombe, mlevitsk, weijiang.yang, john.allen

On Tue, Mar 04, 2025 at 03:02:04PM -0800, Dave Hansen wrote:
>On 11/26/24 02:17, Chao Gao wrote:
>> From: Yang Weijiang <weijiang.yang@intel.com>
>> 
>> Use fpu_guest_cfg to calculate guest fpstate settings, open code for
>> __fpstate_reset() to avoid using kernel FPU config.
>> 
>> Below configuration steps are currently enforced to get guest fpstate:
>> 1) Kernel sets up guest FPU settings in fpu__init_system_xstate().
>> 2) User space sets vCPU thread group xstate permits via arch_prctl().
>> 3) User space creates guest fpstate via __fpu_alloc_init_guest_fpstate()
>>    for vcpu thread.
>> 4) User space enables guest dynamic xfeatures and re-allocate guest
>>    fpstate.
>> 
>> By adding kernel dynamic xfeatures in above #1 and #2, guest xstate area
>> size is expanded to hold (fpu_kernel_cfg.default_features | kernel dynamic
>> xfeatures | user dynamic xfeatures), then host xsaves/xrstors can operate
>> for all guest xfeatures.
>
>These changelogs have a lot of content, but they're honestly not helping
>my along here very much.

Will revise the changelog.

[...]

>> +bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
>> +{
>> +	if (!__fpu_alloc_init_guest_fpstate(gfpu))
>> +		return false;
>>  
>>  	/*
>>  	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
>
>This series is starting to look backward to me.
>
>The normal way you do these things is that you introduce new
>abstractions and refactor the code. Then you go adding features.
>
>For instance, this series should spend a few patches introducing
>'fpu_guest_cfg' and using it before ever introducing the concept of a
>dynamic xfeature.

This suggestion makes sense to me. Will do.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-03-05  9:04 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-26 10:17 [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
2024-11-26 10:17 ` [PATCH v2 1/6] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
2025-03-04 22:20   ` Dave Hansen
2025-03-05  8:25     ` Chao Gao
2024-11-26 10:17 ` [PATCH v2 2/6] x86/fpu/xstate: Add CET supervisor mode state support Chao Gao
2025-03-04 22:26   ` Dave Hansen
2025-03-05  8:32     ` Chao Gao
2024-11-26 10:17 ` [PATCH v2 3/6] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Chao Gao
2025-03-04 22:37   ` Dave Hansen
2025-03-05  8:43     ` Chao Gao
2024-11-26 10:17 ` [PATCH v2 4/6] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Chao Gao
2025-03-04 22:50   ` Dave Hansen
2025-03-05  8:57     ` Chao Gao
2024-11-26 10:17 ` [PATCH v2 5/6] x86/fpu/xstate: Create guest fpstate with guest specific config Chao Gao
2025-03-04 23:02   ` Dave Hansen
2025-03-05  9:04     ` Chao Gao
2024-11-26 10:17 ` [PATCH v2 6/6] x86/fpu/xstate: Warn if CET supervisor state is detected in normal fpstate Chao Gao
2024-11-26 10:28 ` [PATCH v2 0/6] Introduce CET supervisor state support Chao Gao
2025-01-11  1:26   ` Sean Christopherson
2025-02-11  2:09     ` Xin Li
2025-03-04 19:40     ` Xin Li
2025-03-04 19:53       ` Sean Christopherson
2025-02-19  1:44 ` Chao Gao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).