Re: [PATCH 0/3] KVM: x86: SGX vs. XCR0 cleanups


From: Zhi Wang <zhi.wang.linux@gmail.com>
To: "Huang, Kai" <kai.huang@intel.com>
Cc: "Christopherson,, Sean" <seanjc@google.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/3] KVM: x86: SGX vs. XCR0 cleanups
Date: Thu, 13 Apr 2023 09:07:03 +0300
Message-ID: <20230413090703.00002690.zhi.wang.linux@gmail.com>
In-Reply-To: <e1e7a37a29c2c7ad22cd14181f24b06088eca451.camel@intel.com>

On Wed, 12 Apr 2023 12:07:13 +0000
"Huang, Kai" <kai.huang@intel.com> wrote:

> On Thu, 2023-04-06 at 13:01 +0300, Zhi Wang wrote:
> > On Wed, 5 Apr 2023 19:10:40 -0700
> > Sean Christopherson <seanjc@google.com> wrote:
> > 
> > > On Wed, Apr 05, 2023, Huang, Kai wrote:
> > > > On Tue, 2023-04-04 at 17:59 -0700, Sean Christopherson wrote:
> > > > > *** WARNING *** ABI breakage.
> > > > > 
> > > > > Stop adjusting the guest's CPUID info for the allowed XFRM (a.k.a. XCR0)
> > > > > for SGX enclaves.  Past me didn't understand the roles and responsibilities
> > > > > between userspace and KVM with respect to CPUID leafs, i.e. I thought I was
> > > > > being helpful by having KVM adjust the entries.
> > > > 
> > > > Actually I am not clear about this topic.
> > > > 
> > > > So the rule is KVM should never adjust CPUID entries passed from userspace?
> > > 
> > > Yes, except for true runtime entries where a CPUID leaf is dynamic based on other
> > > CPU state, e.g. CR4 bits, MISC_ENABLES in the MONITOR/MWAIT case, etc.
> > > 
> > > > What if the userspace passed the incorrect CPUID entries?  Should KVM sanitize
> > > > those CPUID entries to ensure there's no insane configuration?  My concern is if
> > > > we allow guest to be created with insane CPUID configurations, the guest can be
> > > > > confused and behave unexpectedly.
> > > 
> > > It is userspace's responsibility to provide a sane, correct setup.  The one
> > > exception is that KVM rejects KVM_SET_CPUID{2} if userspace attempts to define an
> > > unsupported virtual address width, the argument being that a malicious userspace
> > > could attack KVM by coercing KVM into stuffing a non-canonical address into e.g. a
> > > VMCS field.
> > > 
> > > The reason for KVM punting to userspace is that it's all but impossible to define
> > > what is/isn't sane.  A really good example would be an alternative we (Google)
> > > considered for the "smaller MAXPHYADDR" fiasco, the underlying problem being that
> > > migrating a vCPU with MAXPHYADDR=46 to a system with MAXPHYADDR=52 will incorrectly
> > > miss reserved bit #PFs.
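
To make the missed #PF concrete, here is a rough sketch (not KVM code; the
function and constant names are made up for illustration). It assumes the usual
x86 paging rule that the PTE physical-address bits from MAXPHYADDR up to bit 51
are reserved, and computes which bits a MAXPHYADDR=46 guest expects to fault on
that MAXPHYADDR=52 hardware accepts:

```python
# Illustrative model only. Names are hypothetical, not taken from KVM.

PTE_ADDR_HIGH_BIT = 51  # upper bound of the PTE physical-address field


def reserved_pa_bits(maxphyaddr: int) -> int:
    """Return a mask of PTE bits [51:maxphyaddr], which are reserved."""
    if not 0 < maxphyaddr <= PTE_ADDR_HIGH_BIT + 1:
        raise ValueError("maxphyaddr out of range")
    all_bits = (1 << (PTE_ADDR_HIGH_BIT + 1)) - 1  # bits 51:0
    legal_bits = (1 << maxphyaddr) - 1             # bits maxphyaddr-1:0
    return all_bits & ~legal_bits


def missed_reserved_bits(guest_maxphyaddr: int, host_maxphyaddr: int) -> int:
    """Bits the guest expects to be reserved but the host treats as legal."""
    return reserved_pa_bits(guest_maxphyaddr) & ~reserved_pa_bits(host_maxphyaddr)


missed = missed_reserved_bits(46, 52)
print(f"reserved for the guest, legal for the host: {missed:#x}")
```

A guest PTE with any of those bits set should raise #PF(RSVD) from the guest's
point of view, but the hardware page walk on the destination host has no reason
to flag it.
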
> > > 
> > > Rather than teach KVM to try and deal with smaller MAXPHYADDRs, an idea we considered
> > > was to instead enumerate guest.MAXPHYADDR=52 on platforms with host.MAXPHYADDR=46 in
> > > anticipation of eventual migration.  So long as userspace doesn't actually enumerate
> > > memslots in the illegal address space, KVM would be able to treat such accesses as
> > > emulated MMIO, and would only need to intercept #PF(RSVD).
> > > 
> > > Circling back to "what's sane", enumerating guest.MAXPHYADDR > host.MAXPHYADDR
> > > definitely qualifies as insane since it really can't work correctly, but in our
> > > opinion it was far superior to running with allow_smaller_maxphyaddr=true.
> > > 
> > > And sane is not the same thing as architecturally legal.  AMX is a good example
> > > of this.  It's _technically_ legal to enumerate support for XFEATURE_TILE_CFG but
> > > not XFEATURE_TILE_DATA in CPUID, but illegal to actually try to enable TILE_CFG
> > > in XCR0 without also enabling TILE_DATA.  KVM should arguably reject CPUID configs
> > > with TILE_CFG but not TILE_DATA, and vice versa, but then KVM is rejecting a 100%
> > > architecturally valid, if insane, CPUID configuration.  Ditto for nearly all of
> > > the VMX control bits versus their CPUID counterparts.
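
A small sketch of the AMX example, assuming the XCR0 bit positions 17
(TILE_CFG) and 18 (TILE_DATA). It models only the "both or neither" pairing
rule and no other XCR0 constraint; the helper names are hypothetical and this
is not how KVM implements anything:

```python
# Illustrative model only. Helper names are hypothetical.

XFEATURE_TILE_CFG = 1 << 17   # XCR0 bit 17
XFEATURE_TILE_DATA = 1 << 18  # XCR0 bit 18
AMX_MASK = XFEATURE_TILE_CFG | XFEATURE_TILE_DATA


def xcr0_amx_pair_is_legal(xcr0: int) -> bool:
    """TILE_CFG and TILE_DATA must be enabled together or not at all."""
    return (xcr0 & AMX_MASK) in (0, AMX_MASK)


def enableable_amx_bits(enumerated_xcr0: int) -> int:
    """AMX bits a guest could actually enable given what CPUID enumerates.

    The guest may only set enumerated bits, and the pairing rule requires
    both, so enumerating just one of the two leaves nothing usable.
    """
    if (enumerated_xcr0 & AMX_MASK) == AMX_MASK:
        return AMX_MASK
    return 0


# CPUID that enumerates TILE_CFG without TILE_DATA: legal to describe,
# but the guest can never enable the one bit it was told about.
cfg_only = 0x7 | XFEATURE_TILE_CFG
print(xcr0_amx_pair_is_legal(cfg_only), hex(enableable_amx_bits(cfg_only)))
```

So the CPUID configuration is accepted as a description, while the only XCR0
value that would use it is rejected, which is the gap between "legal" and
"sane" described above.
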
> > > 
> > > And sometimes there are good reasons to run a VM with a truly insane configuration,
> > > e.g. for testing purposes.
> > > 
> > > TL;DR: trying to enforce "sane" CPUID/feature configuration is a gigantic can of worms.
> > 
> > Interesting point. I was digging into the CPUID virtualization of TDX/SNP.
> > It would be nice to have a conclusion of what is "sane" and what is the
> > proper role for KVM, as firmware/TDX module is going to validate the "sane"
> > CPUID.
> > 
> > TDX/SNP requires the CPUID to be pre-configured and validated before creating
> > a CC guest. (It is done via TDH.MNG.INIT in TDX and inserting a CPUID page in
> > SNP_LAUNCH_UPDATE in SNP).
> > 
> > IIUC according to what you mentioned, KVM should be treated like a "CPUID box"
> > for QEMU, and the checks in KVM are only there to ensure that the requirements
> > of a chosen entry are literally possible and correct. KVM should not care
> > whether the combination or usage of the chosen entries is insane, which gives
> > QEMU flexibility.
> > 
> > As the valid CPUIDs have been decided when creating a CC guest, what should be
> > the proper behavior (basically any new checks?) of KVM for the later
> > SET_CPUID2? My gut feeling is KVM should know the "CPUID box" is reduced
> > at least, because some KVM code paths rely on guest CPUID configuration.
> 
> For a TDX guest my preference is for KVM to save all CPUID entries in TDH.MNG.INIT
> and manually make the vCPU's CPUID point to the saved CPUIDs.  And then KVM just
> ignores SET_CPUID2 for the TDX guest.
> 
> Not sure whether the AMD counterpart can be done in a similar way though.

I took a look at the AMD SNP kernel [1]. It supports both host-managed and
firmware-managed CPUID. The host-managed CPUID is provided via a GHCB message
call, which may be removed in future versions according to the SNP firmware ABI
spec [2]:

7.1 CPUID Reporting
Note: This guest message may be removed in future versions as it is redundant
with the CPUID page in SNP_LAUNCH_UPDATE. (See Section 8.17.)

So the CPUID virtualization styles of TDX and SNP will be aligned eventually:
both will configure the supported CPUID with the firmware/TDX module before
creating a vCPU.
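
To illustrate the open question about a later SET_CPUID2, here is a minimal
sketch of two possible policies once the CPUID has been fixed at launch:
"ignore" (the preference quoted above for TDX) and "verify" (reject anything
that differs from the launch-time table). Everything here is hypothetical: the
class, the policy names and the register values are placeholders, not KVM or
firmware behavior.

```python
# Illustrative model only. All names and values are placeholders.
from typing import Dict, Tuple

CpuidKey = Tuple[int, int]              # (leaf, subleaf)
CpuidRegs = Tuple[int, int, int, int]   # (eax, ebx, ecx, edx)
CpuidTable = Dict[CpuidKey, CpuidRegs]


class ConfidentialVm:
    """Holds the CPUID table that was validated at launch time."""

    def __init__(self, launch_cpuid: CpuidTable, policy: str = "ignore"):
        if policy not in ("ignore", "verify"):
            raise ValueError(f"unknown policy: {policy}")
        self._launch_cpuid = dict(launch_cpuid)
        self._policy = policy

    def set_cpuid2(self, requested: CpuidTable) -> CpuidTable:
        """Return the table the vCPU ends up with after a SET_CPUID2-like call."""
        if self._policy == "verify" and requested != self._launch_cpuid:
            raise ValueError("requested CPUID differs from launch-time CPUID")
        # In both policies the launch-time table is what the vCPU uses.
        return dict(self._launch_cpuid)


launch = {(0x0, 0): (0x1, 0x0, 0x0, 0x0)}    # placeholder values
other = {(0x0, 0): (0x2, 0x0, 0x0, 0x0)}     # placeholder values

vm = ConfidentialVm(launch, policy="ignore")
print(vm.set_cpuid2(other) == launch)
```

With "ignore", userspace never learns that its request was dropped; with
"verify", a mismatch is reported back. Which of the two (if either) is right is
exactly the part I am not sure about.
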

[1] https://github.com/AMDESE/linux/blob/upmv10-host-snp-v8-rfc/arch/x86/kvm/svm/sev.c
[2] https://www.amd.com/system/files/TechDocs/56860.pdf
