kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/8] Introduce CET supervisor state support
@ 2025-03-18 15:31 Chao Gao
  2025-03-18 15:31 ` [PATCH v4 1/8] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Chao Gao @ 2025-03-18 15:31 UTC (permalink / raw)
  To: x86, linux-kernel, kvm, tglx, dave.hansen, seanjc, pbonzini
  Cc: peterz, rick.p.edgecombe, weijiang.yang, john.allen, bp,
	chang.seok.bae, xin3.li, Chao Gao, Adamos Ttofari,
	Aruna Ramakrishna, Dave Hansen, Eric Biggers, H. Peter Anvin,
	Ingo Molnar, Li RongQing, Maxim Levitsky, Mitchell Levy,
	Samuel Holland, Stanislav Spassov, Uros Bizjak,
	Vignesh Balasubramanian, Zhao Liu

==Changelog==
v3->v4:
 - Remove fpu_guest_cfg.
   The fact that only the default_features and default_size fields of
   fpu_guest_cfg are used suggests that a full fpu_guest_cfg may not be
   necessary. Adding two members, "guest_default_xfeatures" and
   "guest_default_size", or even a single "guest_only_xfeatures" member in
   fpu_kernel_cfg, similar to "independent_xfeatures", is more logical. To
   facilitate discussion, implement this approach in this version.
 - Extract the fix for inconsistencies in fpu_guest and post it separately
   (Chang)
 - Rename XFEATURE_MASK_KERNEL_DYNAMIC to XFEATURE_MASK_SUPERVISOR_GUEST as
   tglx noted "this dynamic naming is really bad":

   https://lore.kernel.org/all/87sg1owmth.ffs@nanos.tec.linutronix.de/

 - Rerun performance tests and update the performance claims in the cover-letter
   (Dave)
 - Tighten down the changelogs and drop useless comments (Dave)
 - Reorder the patches to put the CET supervisor state patch before the
   "guest-only" optimization, allowing maintainers to easily adopt or omit the
   optimization.
 - v3: https://lore.kernel.org/kvm/20250307164123.1613414-1-chao.gao@intel.com/

v2->v3:
 - reorder patches to add fpu_guest_cfg first and then introduce dynamic kernel
   feature concept (Dave)
 - Revise changelog for all patches except the first and the last one (Dave)
 - Split up patches that do multiple things into separate patches.
 - collect tags for patch 1
 - v2: https://lore.kernel.org/kvm/20241126101710.62492-1-chao.gao@intel.com/

v1->v2:
 - rebase onto the latest kvm-x86/next
 - Add performance data to the cover-letter
 - v1: https://lore.kernel.org/kvm/73802bff-833c-4233-9a5b-88af0d062c82@intel.com/

==Background==

This series spins off from CET KVM virtualization enabling series [1].
The purpose is to get these preparation work resolved ahead of KVM part
landing. There was a discussion about introducing CET supervisor state
support [2] [3].

CET supervisor state, i.e., IA32_PL{0,1,2}_SSP, are xsave-managed MSRs,
it can be enabled via IA32_XSS[bit 12]. KVM relies on host side CET
supervisor state support to fully enable guest CET MSR contents storage.
The benefits are: 1) No need to manually save/restore the 3 MSRs when
vCPU fpu context is sched in/out. 2) Omit manually swapping the three
MSRs at VM-Exit/VM-Entry for guest/host. 3) Make guest CET user/supervisor
states managed in a consistent manner within host kernel FPU framework.

==Solution==

This series tries to:
1) Fix existing issue regarding enabling guest supervisor states support.
2) Add CET supervisor state support in core kernel.
3) Introduce guest default features and size for guest fpstate setup.

With the preparation work landed, for guest fpstate, we have xstate_bv[12]
== xcomp_bv[12] == 1 and CET supervisor state is saved/reloaded when
xsaves/xrstors executes on guest fpstate.
For non-guest/normal fpstate, we have xstate_bv[12] == xcomp_bv[12] == 0,
then HW can optimize xsaves/xrstors operations.

==Performance==

We measured context-switching performance with the benchmark [4] in following
three cases.

case 1: the baseline. i.e., this series isn't applied
case 2: baseline + this series. CET-S space is allocated for guest fpu only.
case 3: baseline + allocate CET-S space for all tasks. Hardware init
        optimization avoids writing out CET-S space on each XSAVES.

The performance differences in the three cases are very small and fall within the
run-to-run variation.

Case 2 is preferred over Case 3 because it can save 24B of CET-S space for all
non-vCPU threads with just a one-line change:

+	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_SUPERVISOR_GUEST

We believe adding guest defaults has its own merits. It improves readability,
decouples host FPUs and guest FPUs, and arguably enhances extensibility.

[1]: https://lore.kernel.org/all/20240219074733.122080-1-weijiang.yang@intel.com/
[2]: https://lore.kernel.org/all/ZM1jV3UPL0AMpVDI@google.com/
[3]: https://lore.kernel.org/all/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/
[4]: https://github.com/antonblanchard/will-it-scale/blob/master/tests/context_switch1.c


Chao Gao (4):
  x86/fpu: Drop @perm from guest pseudo FPU container
  x86/fpu/xstate: Differentiate default features for host and guest FPUs
  x86/fpu: Initialize guest FPU permissions from guest defaults
  x86/fpu: Initialize guest fpstate and FPU pseudo container from guest
    defaults

Sean Christopherson (1):
  x86/fpu/xstate: Always preserve non-user xfeatures/flags in
    __state_perm

Yang Weijiang (3):
  x86/fpu/xstate: Add CET supervisor xfeature support
  x86/fpu/xstate: Introduce "guest-only" supervisor xfeature set
  x86/fpu/xstate: Warn if guest-only supervisor states are detected in
    normal fpstate

 arch/x86/include/asm/fpu/types.h  | 58 ++++++++++++++++++++++---------
 arch/x86/include/asm/fpu/xstate.h |  9 +++--
 arch/x86/kernel/fpu/core.c        | 36 ++++++++++++-------
 arch/x86/kernel/fpu/xstate.c      | 42 +++++++++++++++-------
 arch/x86/kernel/fpu/xstate.h      |  2 ++
 5 files changed, 102 insertions(+), 45 deletions(-)

-- 
2.46.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2025-04-04  1:06 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-18 15:31 [PATCH v4 0/8] Introduce CET supervisor state support Chao Gao
2025-03-18 15:31 ` [PATCH v4 1/8] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Chao Gao
2025-04-01 17:17   ` Chang S. Bae
2025-04-01 17:56     ` Sean Christopherson
2025-03-18 15:31 ` [PATCH v4 2/8] x86/fpu: Drop @perm from guest pseudo FPU container Chao Gao
2025-04-01 17:16   ` Chang S. Bae
2025-04-02  1:56     ` Chao Gao
2025-03-18 15:31 ` [PATCH v4 3/8] x86/fpu/xstate: Add CET supervisor xfeature support Chao Gao
2025-04-01 17:15   ` Chang S. Bae
2025-04-02  2:28     ` Chao Gao
2025-04-02 21:37     ` Dave Hansen
2025-04-03 13:26       ` Chao Gao
2025-04-03 14:04       ` Ingo Molnar
2025-03-18 15:31 ` [PATCH v4 4/8] x86/fpu/xstate: Differentiate default features for host and guest FPUs Chao Gao
2025-04-01 17:18   ` Chang S. Bae
2025-04-02  3:16     ` Chao Gao
2025-03-18 15:31 ` [PATCH v4 5/8] x86/fpu: Initialize guest FPU permissions from guest defaults Chao Gao
2025-03-18 15:31 ` [PATCH v4 6/8] x86/fpu: Initialize guest fpstate and FPU pseudo container " Chao Gao
2025-03-18 15:31 ` [PATCH v4 7/8] x86/fpu/xstate: Introduce "guest-only" supervisor xfeature set Chao Gao
2025-04-01 17:16   ` Chang S. Bae
2025-04-02  4:29     ` Chao Gao
2025-03-18 15:31 ` [PATCH v4 8/8] x86/fpu/xstate: Warn if guest-only supervisor states are detected in normal fpstate Chao Gao
2025-04-01 17:17   ` Chang S. Bae
2025-04-02 14:30     ` Chao Gao
2025-04-04  0:02       ` Chang S. Bae
2025-04-04  1:06         ` Dave Hansen
2025-04-01 17:20 ` [PATCH v4 0/8] Introduce CET supervisor state support Chang S. Bae
2025-04-02 21:12 ` Edgecombe, Rick P
2025-04-02 21:35   ` Dave Hansen
2025-04-02 21:44     ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).