From: "Yang, Weijiang" <weijiang.yang@intel.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <kvm@vger.kernel.org>,
<dave.hansen@intel.com>, <pbonzini@redhat.com>,
<seanjc@google.com>, <peterz@infradead.org>, <chao.gao@intel.com>,
<rick.p.edgecombe@intel.com>, <john.allen@amd.com>
Subject: Re: [PATCH v7 04/26] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set
Date: Fri, 8 Dec 2023 23:57:00 +0800
Message-ID: <c1ee3d5b-5c54-4eea-b3d7-7385d39bae45@intel.com>
In-Reply-To: <8e7b64f06fe2a8132a8f9f76d673ac663ecfd854.camel@redhat.com>
On 12/7/2023 12:11 AM, Maxim Levitsky wrote:
> On Wed, 2023-12-06 at 11:00 +0800, Yang, Weijiang wrote:
>> On 12/5/2023 5:55 PM, Maxim Levitsky wrote:
>>> On Fri, 2023-12-01 at 15:49 +0800, Yang, Weijiang wrote:
>>>> On 12/1/2023 1:33 AM, Maxim Levitsky wrote:
>>>>> On Fri, 2023-11-24 at 00:53 -0500, Yang Weijiang wrote:
>>>>>> Define a new XFEATURE_MASK_KERNEL_DYNAMIC set including the features that can be
>>>>> I am not sure though that this name is correct, but I don't know if I can
>>>>> suggest a better name.
>>>> It's a symmetry of XFEATURE_MASK_USER_DYNAMIC ;-)
>>>>>> optionally enabled by kernel components, i.e., the features are required by
>>>>>> specific kernel components. Currently it's used by KVM to configure guest
>>>>>> dedicated fpstate for calculating the xfeature and fpstate storage size etc.
>>>>>>
>>>>>> The kernel dynamic xfeatures now only contain XFEATURE_CET_KERNEL, which is
>>>>>> supported by host as they're enabled in xsaves/xrstors operating xfeature set
>>>>>> (XCR0 | XSS), but the relevant CPU feature, i.e., supervisor shadow stack, is
>>>>>> not enabled in host kernel so it can be omitted for normal fpstate by default.
>>>>>>
>>>>>> Remove the kernel dynamic feature from fpu_kernel_cfg.default_features so that
>>>>>> the bits in xstate_bv and xcomp_bv are cleared and xsaves/xrstors can be
>>>>>> optimized by HW for normal fpstate.
>>>>>>
>>>>>> Suggested-by: Dave Hansen <dave.hansen@intel.com>
>>>>>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>>>>>> ---
>>>>>> arch/x86/include/asm/fpu/xstate.h | 5 ++++-
>>>>>> arch/x86/kernel/fpu/xstate.c | 1 +
>>>>>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
>>>>>> index 3b4a038d3c57..a212d3851429 100644
>>>>>> --- a/arch/x86/include/asm/fpu/xstate.h
>>>>>> +++ b/arch/x86/include/asm/fpu/xstate.h
>>>>>> @@ -46,9 +46,12 @@
>>>>>> #define XFEATURE_MASK_USER_RESTORE \
>>>>>> (XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
>>>>>>
>>>>>> -/* Features which are dynamically enabled for a process on request */
>>>>>> +/* Features which are dynamically enabled per userspace request */
>>>>>> #define XFEATURE_MASK_USER_DYNAMIC XFEATURE_MASK_XTILE_DATA
>>>>>>
>>>>>> +/* Features which are dynamically enabled per kernel side request */
>>>>> I suggest to explain this a bit better. How about something like that:
>>>>>
>>>>> "Kernel features that are not enabled by default for all processes, but can
>>>>> be still used by some processes, for example to support guest virtualization"
>>>> It looks good to me, will apply it in next version, thanks!
>>>>
>>>>> But feel free to keep it as is or propose something else. IMHO this will
>>>>> be confusing this way or another.
>>>>>
>>>>>
>>>>> Another question: kernel already has a notion of 'independent features'
>>>>> which are currently kernel features that are enabled in IA32_XSS but not present in 'fpu_kernel_cfg.max_features'
>>>>>
>>>>> Currently only 'XFEATURE_LBR' is in this set. These features are saved/restored manually
>>>>> from independent buffer (in case of LBRs, perf code cares for this).
>>>>>
>>>>> Does it make sense to add CET_S to there as well instead of having XFEATURE_MASK_KERNEL_DYNAMIC,
>>>> CET_S here refers to PL{0,1,2}_SSP, right?
>>>>
>>>> IMHO, perf relies on dedicated code to switch LBR MSRs for various reasons, e.g., overhead: the feature
>>>> owns dozens of MSRs, and removing the xfeature bit offloads that burden from the common FPU/xsave framework.
>>> This is true, but the question that begs to be asked is what the true purpose of the 'independent features' is
>>> from the POV of the kernel FPU framework. IMHO these are features that the framework is not aware of, except
>>> that it enables it in IA32_XSS (and in XCR0 in the future).
>> This is the original intention for introducing independent features (first called dynamic features, renamed later). From the
>> changelog, the major concern is overhead:
> Yes, and to some extent the reason why we want to have CET supervisor state not saved on normal thread's FPU state is also overhead,
> because in theory if the kernel did save it, the MSRs will be in INIT state and thus XSAVES shouldn't have any functional impact,
> even if it saves/restores them for nothing.
CET supervisor state in a normal thread's FPU state won't always be in INIT state. Per the SDM, its INIT state is defined only when all 3 MSRs are 0,
but if a guest is using supervisor CET, then with vCPU migration between pCPUs, more and more MSRs would hold non-zero contents.
This doesn't impact host kernel behavior because host CET_S is still disabled, but it does impact host XSAVES/XRSTORS behavior.
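To illustrate the INIT-state point, here is a minimal user-space sketch in C, not kernel code. It assumes CET_S is XSAVE state component 12 and consists of IA32_PL{0,1,2}_SSP. The names struct cet_s_state, cet_s_in_init_state() and model_xstate_bv() are hypothetical, and every other requested component is simply treated as in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: CET_S is XSAVE state component 12 (PL0/PL1/PL2 SSP MSRs). */
#define XFEATURE_CET_KERNEL_BIT 12

/* Hypothetical container for the three supervisor shadow-stack pointers. */
struct cet_s_state {
	uint64_t pl0_ssp;
	uint64_t pl1_ssp;
	uint64_t pl2_ssp;
};

/* The component is in INIT state only if all three MSRs are zero. */
static bool cet_s_in_init_state(const struct cet_s_state *s)
{
	assert(s);
	return !s->pl0_ssp && !s->pl1_ssp && !s->pl2_ssp;
}

/*
 * Toy model of the XSAVES init optimization for CET_S only: the
 * XSTATE_BV bit is cleared when the component is requested but in
 * INIT state. All other requested components are assumed in use.
 */
static uint64_t model_xstate_bv(uint64_t rfbm, const struct cet_s_state *s)
{
	uint64_t bit = 1ULL << XFEATURE_CET_KERNEL_BIT;
	uint64_t bv = rfbm;

	if ((rfbm & bit) && cet_s_in_init_state(s))
		bv &= ~bit;
	return bv;
}
```

In this model, once any of the three MSRs is left non-zero on a pCPU (e.g. after running a guest that uses supervisor CET), the bit stays set in XSTATE_BV and the component is actually saved, even though the host never enabled CET_S itself.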
> In other words, as I said, independent features = features that FPU state doesn't manage, and are just optionally enabled,
> so that a custom code can do a custom xsave(s)/xrstor(s), likely from/to a custom area to save/load these features.
>
> It might make sense to rename independent features again to something like 'unmanaged features' or 'manual features' or something
> like that.
>
>
> Another interesting question that arises here, is once KVM supports arch LBRs, it will likely need to expose the XFEATURE_LBR
> to the guest and will need to context switch it similar to CET_S state, which strengthens the argument that CET_S should
> be in the same group as the 'independent features'.
>
> Depending on the performance impact, XFEATURE_LBR might even need to be dynamically allocated.
This is most likely true for fpu_guest_cfg rather than fpu_kernel_cfg. Let me think it over. Thanks for bringing up this brilliant idea :-)
> For the reference this is the patch series that introduced the arch LBRs to KVM:
> https://www.spinics.net/lists/kvm/msg296507.html
>
>
> Best regards,
> Maxim Levitsky
>
>> commit f0dccc9da4c0fda049e99326f85db8c242fd781f
>> Author: Kan Liang <kan.liang@linux.intel.com>
>> Date: Fri Jul 3 05:49:26 2020 -0700
>>
>> x86/fpu/xstate: Support dynamic supervisor feature for LBR
>>
>> "However, the kernel should not save/restore the LBR state component at
>> each context switch, like other state components, because of the
>> following unique features of LBR:
>> - The LBR state component only contains valuable information when LBR
>> is enabled in the perf subsystem, but for most of the time, LBR is
>> disabled.
>> - The size of the LBR state component is huge. For the current
>> platform, it's 808 bytes.
>> If the kernel saves/restores the LBR state at each context switch, for
>> most of the time, it is just a waste of space and cycles."
>>
>>> For the guest only features, like CET_S, it is also kind of the same thing (xsave but to guest state area only).
>>> I don't insist that we add CET_S to independent features, but I just gave an idea that maybe that is better
>>> from complexity point of view to add CET there. It's up to you to decide.
>>>
>>> Sean what do you think?
>>>
>>> Best regards,
>>> Maxim Levitsky
>>>
>>>
>>>> But CET only has 3 supervisor MSRs and they need to be managed together with user mode MSRs.
>>>> Enabling it in common FPU framework would make the switch/swap much easier without additional
>>>> support code.
>>>>
>>>>> and maybe rename the
>>>>> 'XFEATURE_MASK_INDEPENDENT' to something like 'XFEATURES_THE_KERNEL_DOESNT_CARE_ABOUT'
>>>>> (terrible name, but you might think of a better name)
>>>>>
>>>>>
>>>>>> +#define XFEATURE_MASK_KERNEL_DYNAMIC XFEATURE_MASK_CET_KERNEL
>>>>>> +
>>>>>> /* All currently supported supervisor features */
>>>>>> #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
>>>>>> XFEATURE_MASK_CET_USER | \
>>>>>> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
>>>>>> index b57d909facca..ba4172172afd 100644
>>>>>> --- a/arch/x86/kernel/fpu/xstate.c
>>>>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>>>>> @@ -824,6 +824,7 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
>>>>>> /* Clean out dynamic features from default */
>>>>>> fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
>>>>>> fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>>> + fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_KERNEL_DYNAMIC;
>>>>>>
>>>>>> fpu_user_cfg.default_features = fpu_user_cfg.max_features;
>>>>>> fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
>>>>> Best regards,
>>>>> Maxim Levitsky
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>
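For reference, the effect of the quoted hunk in fpu__init_system_xstate() can be sketched as plain mask arithmetic. This is a stand-alone illustration rather than the kernel code: the bit positions (12 for CET_KERNEL, 18 for XTILE_DATA) are assumptions taken from the current xfeature numbering, and kernel_default_features() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define XFEATURE_MASK_CET_KERNEL	(1ULL << 12)
#define XFEATURE_MASK_XTILE_DATA	(1ULL << 18)

#define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
#define XFEATURE_MASK_KERNEL_DYNAMIC	XFEATURE_MASK_CET_KERNEL

/*
 * Hypothetical helper mirroring the quoted hunk: start from the
 * maximum feature set and clear both dynamic sets from the default.
 */
static uint64_t kernel_default_features(uint64_t max_features)
{
	uint64_t def = max_features;

	def &= ~XFEATURE_MASK_USER_DYNAMIC;
	def &= ~XFEATURE_MASK_KERNEL_DYNAMIC;

	/* The default set is always a subset of the maximum set. */
	assert((def & max_features) == def);
	return def;
}
```

With CET_KERNEL cleared from the default set, the corresponding bits in xstate_bv and xcomp_bv stay clear for normal fpstate, which is the optimization the changelog describes.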