From: Sean Christopherson <seanjc@google.com>
To: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Weijiang Yang <weijiang.yang@intel.com>,
Dave Hansen <dave.hansen@intel.com>,
pbonzini@redhat.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, peterz@infradead.org,
chao.gao@intel.com, rick.p.edgecombe@intel.com,
john.allen@amd.com
Subject: Re: [PATCH v6 06/25] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size
Date: Fri, 3 Nov 2023 07:33:01 -0700
Message-ID: <ZUUEnXcqgY7O0jp7@google.com>
In-Reply-To: <f4e2d8c79ca3f238aafd61a82a3f5ad5c2d6bcab.camel@redhat.com>

On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> On Wed, 2023-11-01 at 07:16 -0700, Sean Christopherson wrote:
> > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > On Thu, 2023-10-26 at 10:24 -0700, Sean Christopherson wrote:
> > > > --
> > > > From: Sean Christopherson <seanjc@google.com>
> > > > Date: Thu, 26 Oct 2023 10:17:33 -0700
> > > > Subject: [PATCH] x86/fpu/xstate: Always preserve non-user xfeatures/flags in
> > > > __state_perm
> > > >
> > > > Fixes: 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE permissions")
> > > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > > ---
> > > > arch/x86/kernel/fpu/xstate.c | 18 +++++++++++-------
> > > > 1 file changed, 11 insertions(+), 7 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > > > index ef6906107c54..73f6bc00d178 100644
> > > > --- a/arch/x86/kernel/fpu/xstate.c
> > > > +++ b/arch/x86/kernel/fpu/xstate.c
> > > > @@ -1601,16 +1601,20 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
> > > > if ((permitted & requested) == requested)
> > > > return 0;
> > > >
> > > > - /* Calculate the resulting kernel state size */
> > > > + /*
> > > > + * Calculate the resulting kernel state size. Note, @permitted also
> > > > + * contains supervisor xfeatures, even though supervisor xfeatures are
> > > > + * always permitted for kernel and guest FPUs, and never permitted for
> > > > + * user FPUs.
> > > > + */
> > > > mask = permitted | requested;
> > > > - /* Take supervisor states into account on the host */
> > > > - if (!guest)
> > > > - mask |= xfeatures_mask_supervisor();
> > > > ksize = xstate_calculate_size(mask, compacted);
> > >
> > > This might not work with kernel dynamic features, because
> > > xfeatures_mask_supervisor() will return all supported supervisor features.
> >
> > I don't understand what you mean by "This".
>
> >
> > Somewhat of a side topic, I feel very strongly that we should use "guest only"
> > terminology instead of "dynamic". There is nothing dynamic about whether or not
> > XFEATURE_CET_KERNEL is allowed; there's not even a real "decision" beyond checking
> > whether or not CET is supported.
>
> > > Therefore at least until we have an actual kernel dynamic feature (a feature
> > > used by the host kernel and not KVM, and which has to be dynamic like AMX),
> > > I suggest that KVM stops using the permission API completely for the guest
> > > FPU state, and just gives all the features it wants to enable right to
> >
> > By "it", I assume you mean userspace?
> >
> > > __fpu_alloc_init_guest_fpstate() (Guest FPU permission API IMHO should be
> > > deprecated and ignored)
> >
> > KVM allocates guest FPU state during KVM_CREATE_VCPU, so not using prctl() would
> > either require KVM to defer allocating guest FPU state until KVM_SET_CPUID{,2},
> > or would require a VM-scoped KVM ioctl() to let userspace opt-in to
> >
> > Allocating guest FPU state during KVM_SET_CPUID{,2} would get messy, as KVM allows
> > multiple calls to KVM_SET_CPUID{,2} so long as the vCPU hasn't done KVM_RUN. E.g.
> > KVM would need to support actually resizing guest FPU state, which would be extra
> > complexity without any meaningful benefit.
>
>
> OK, I understand you now. What you claim is that it is legal to do this:
>
> - KVM_SET_XSAVE
> - KVM_SET_CPUID (with AMX enabled)
>
> KVM_SET_CPUID will have to resize the xstate which is already valid.

I was actually talking about

  KVM_SET_CPUID2 (with dynamic user feature #1)
  KVM_SET_CPUID2 (with dynamic user feature #2)

The second call through __xstate_request_perm() will be done with only user
xfeatures in @permitted and so the kernel will compute the wrong ksize.

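To make the failure mode concrete, here is a toy userspace model of the two back-to-back requests. It is only a sketch, not kernel code: the feature bits, the per-feature sizes, and the function names are all made up for illustration, and it assumes the pre-fix behavior is to store only user bits back into the permission mask.

```c
#include <assert.h>
#include <stdint.h>

/* Toy feature bits and sizes; hypothetical values, not the real XSAVE layout. */
#define F_BASE        (1ULL << 0)  /* default user feature */
#define F_DYN_USER1   (1ULL << 1)  /* dynamic user feature #1 */
#define F_DYN_USER2   (1ULL << 2)  /* dynamic user feature #2 */
#define F_SUPERVISOR  (1ULL << 3)  /* stands in for a supervisor xfeature */
#define USER_MASK     (F_BASE | F_DYN_USER1 | F_DYN_USER2)

static const unsigned int feature_size[] = { 512, 8192, 4096, 24 };

/* Sum the toy sizes of every feature set in @mask. */
unsigned int calc_size(uint64_t mask)
{
	unsigned int size = 0;

	for (unsigned int i = 0; i < 4; i++)
		if (mask & (1ULL << i))
			size += feature_size[i];
	return size;
}

/* Model of a request that writes only user bits back into *perm. */
unsigned int request_perm_user_only(uint64_t *perm, uint64_t requested)
{
	uint64_t mask = *perm | requested;
	unsigned int ksize;

	assert((requested & ~USER_MASK) == 0); /* only user features are requested */
	ksize = calc_size(mask);
	*perm = mask & USER_MASK;              /* supervisor bits are dropped here */
	return ksize;
}

/* Model of a request that preserves the non-user bits in *perm. */
unsigned int request_perm_preserving(uint64_t *perm, uint64_t requested)
{
	uint64_t mask = *perm | requested;
	unsigned int ksize;

	assert((requested & ~USER_MASK) == 0);
	ksize = calc_size(mask);
	*perm = mask;                          /* supervisor bits survive */
	return ksize;
}
```

Starting from F_BASE | F_SUPERVISOR, the first request sizes the buffer the same way in both models. On the second request the user-only model no longer sees F_SUPERVISOR in the stored mask, so its ksize comes up short by that feature's size, while the preserving model still accounts for it.
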
> Your patch to fix the __xstate_request_perm() does seem to be correct in a
> sense that it will preserve the kernel fpu components in the fpu permissions.
>
> However note that kernel fpu permissions come from
> 'fpu_kernel_cfg.default_features' which don't include the dynamic kernel
> xfeatures (added a few patches before this one).

CET_KERNEL isn't dynamic! It's guest-only. There are no runtime decisions as to
whether or not CET_KERNEL is allowed. All guest FPUs get CET_KERNEL, no kernel FPUs
get CET_KERNEL.

That matters because I am also proposing that we add a dedicated, defined-at-boot
fpu_guest_cfg instead of bolting on a "dynamic", which is what I meant by this:

 : Or even better if it doesn't cause weirdness elsewhere, a dedicated
 : fpu_guest_cfg. For me at least, a fpu_guest_cfg would make it easier to
 : understand what all is going on.

That way, initialization of permissions is simply

	fpu->guest_perm = fpu_guest_cfg.default_features;

and there's no need to differentiate between guest and kernel FPUs when reallocating
for dynamic user xfeatures because guest_perm.__state_perm already holds the correct
data.

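As a rough illustration of that idea, here is a toy userspace model, not kernel code. The toy_* types and helpers and the bit values are hypothetical; only the notion of a separate, fixed guest config with its own default_features comes from the proposal above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy feature bits; hypothetical values, not the real xfeature numbers. */
#define F_BASE        (1ULL << 0)  /* default feature for every FPU */
#define F_DYN_USER    (1ULL << 1)  /* dynamic user feature, granted on request */
#define F_GUEST_ONLY  (1ULL << 2)  /* stands in for CET_KERNEL */

struct toy_fpu_cfg  { uint64_t default_features; };
struct toy_fpu_perm { uint64_t state_perm; };

/* Both configs are fixed up front; nothing about them changes at runtime. */
static const struct toy_fpu_cfg toy_kernel_cfg = { F_BASE };
static const struct toy_fpu_cfg toy_guest_cfg  = { F_BASE | F_GUEST_ONLY };

/* Permissions start out as whatever the matching config says. */
void toy_perm_init(struct toy_fpu_perm *perm, const struct toy_fpu_cfg *cfg)
{
	perm->state_perm = cfg->default_features;
}

/*
 * One grant path for both kinds of FPU: the stored mask already holds the
 * right non-user bits, so no guest vs. kernel special case is needed.
 */
uint64_t toy_perm_grant(struct toy_fpu_perm *perm, uint64_t requested)
{
	assert((requested & ~F_DYN_USER) == 0); /* only dynamic user bits */
	perm->state_perm |= requested;
	return perm->state_perm;
}
```

Granting F_DYN_USER to a guest-initialized toy_fpu_perm leaves F_GUEST_ONLY set, and granting it to a kernel-initialized one never picks F_GUEST_ONLY up, without the grant helper knowing which kind of FPU it was handed.
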
> Therefore an attempt to resize the xstate to include a kernel dynamic feature by
> __xfd_enable_feature will fail.
>
> If kvm on the other hand includes all the kernel dynamic features in the
> initial allocation of FPU state (not optimal but possible),

This is what I am suggesting.

 : There are definitely scenarios where CET will not be exposed to KVM guests, but
 : I don't see any reason to make the guest FPU space dynamically sized for CET.
 : It's what, 40 bytes?

> then later call to __xstate_request_perm for a userspace dynamic feature
> (which can still happen) will mess up the xstate, because again the
> permission code assumes that only default kernel features were granted the
> permissions.
>
>
> This has to be solved one way or another.
>
> >
> > The only benefit I can think of for a VM-scoped ioctl() is that it would allow a
> > single process to host multiple VMs with different dynamic xfeature requirements.
> > But such a setup is mostly theoretical. Maybe it'll affect the SEV migration
> > helper at some point? But even that isn't guaranteed.
> >
> > So while I agree that ARCH_GET_XCOMP_GUEST_PERM isn't ideal, practically speaking
> > it's sufficient for all current use cases. Unless a concrete use case comes along,
> > deprecating ARCH_GET_XCOMP_GUEST_PERM in favor of a KVM ioctl() would be churn for
> > both the kernel and userspace without any meaningful benefit, or really even any
> > true change in behavior.
>
>
> ARCH_GET_XCOMP_GUEST_PERM/ARCH_SET_XCOMP_GUEST_PERM is not a good API from
> usability POV, because it is redundant.
>
> KVM already has an API called KVM_SET_CPUID2, by which qemu/userspace
> instructs KVM how much space to allocate to support a VM with *this*
> CPUID.
>
> For example if qemu asks for nested SVM/VMX, then kvm will allocate on demand
> state for it (also at least 8K/vCPU btw). The same should apply for AMX -
> Qemu sets AMX xsave bit in CPUID - that permits KVM to allocate the extra
> state when needed.
>
> I don't see why we need an extra and non KVM API for that.

I don't necessarily disagree, but what's done is done. We missed our chance to
propose a different mechanism, and at this point undoing all of that without good
cause is unlikely to benefit anyone. If a use case comes along that needs something
"better" than the prctl() API, then I agree it'd be worth revisiting.