From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Sean Christopherson <seanjc@google.com>,
Rick P Edgecombe <rick.p.edgecombe@intel.com>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
Kai Huang <kai.huang@intel.com>,
"binbin.wu@linux.intel.com" <binbin.wu@linux.intel.com>,
Reinette Chatre <reinette.chatre@intel.com>,
Yan Y Zhao <yan.y.zhao@intel.com>,
"tony.lindgren@linux.intel.com" <tony.lindgren@linux.intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Isaku Yamahata <isaku.yamahata@intel.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: (Proposal) New TDX Global Metadata To Report FIXED0 and FIXED1 CPUID Bits
Date: Tue, 17 Dec 2024 12:27:33 +0800
Message-ID: <eedbef51-ab8f-4d28-af8b-ba405d060015@intel.com>
In-Reply-To: <Z2DZpJz5K9W92NAE@google.com>
On 12/17/2024 9:53 AM, Sean Christopherson wrote:
> On Tue, Dec 10, 2024, Rick P Edgecombe wrote:
>> On Tue, 2024-12-10 at 11:22 +0800, Xiaoyao Li wrote:
>>>> The solution in this proposal decreases the work the VMM has to do, but
>>>> in the long term won't remove hand coding completely. As long as we are
>>>> designing something, what kind of bar should we target?
>>>
>>> For this specific #VE reduction case, I think userspace doesn't need to
>>> do any hand coding. Userspace just treats the bits related to #VE
>>> reduction as configurable as reported by TDX module/KVM. And userspace
>>> doesn't care if the value seen by TD guest is matched with what gets
>>> configured by it because they are out of control of userspace.
>>
>> Besides a specific problem, here reduced #VE is also an example of increasing
>> complexity for TD CPUID. If we have more things like it, it could make this
>> interface too rigid.
>
> I agree with Rick in that having QEMU treat them as configurable is going to be
> a disaster. But I don't think it's actually problematic in practice.
To correct the proposal: QEMU should treat them as whatever KVM reports.
The TDX module reports these #VE-reduction-related CPUID bits as
configurable because it allows the VMM to paravirtualize them. If KVM
doesn't support paravirtualizing them, KVM can clear them from the
configurable bits and add them to the fixed0 bits when it reports to
userspace.
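As a rough illustration of that filtering step, here is a minimal sketch in C. The struct, the function name, and the single 32-bit mask per category are all assumptions made for the example; they are not taken from KVM or from the TDX module interface.

```c
#include <stdint.h>

/* Hypothetical per-register view of what gets reported to userspace. */
struct cpuid_bit_masks {
	uint32_t configurable;	/* bits userspace may set to 0 or 1 */
	uint32_t fixed0;	/* bits that are always 0 */
	uint32_t fixed1;	/* bits that are always 1 */
};

/*
 * Move every #VE-reduction-related bit that the TDX module reports as
 * configurable, but that KVM cannot paravirtualize, from the
 * configurable mask into the fixed0 mask. fixed1 is left untouched.
 */
static struct cpuid_bit_masks
kvm_filter_masks(struct cpuid_bit_masks tdx, uint32_t ve_reduction_bits,
		 uint32_t kvm_paravirt_supported)
{
	uint32_t unsupported = tdx.configurable & ve_reduction_bits &
			       ~kvm_paravirt_supported;

	tdx.configurable &= ~unsupported;
	tdx.fixed0 |= unsupported;
	return tdx;
}
```

With this shape, userspace only ever sees a bit as configurable when both the TDX module and KVM allow it to be configured.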
> If QEMU (or KVM) has no visibility into the state of the guest's view of the
> affected features, then it doesn't matter whether they are fixed or configurable.
> They're effectively Schrödinger's bits: until QEMU/KVM actually looks at them,
> they're neither dead nor alive, and since QEMU/KVM *can't* look at them, who cares?
To some degree, I think it matters. As I explained above, if KVM reports
a bit as configurable to userspace, it means the TDX module allows it to
be configured and KVM allows it to be paravirtualized as well. So
userspace can configure it as 1 when the user wants it. This is how the
VMM is going to present the feature to the TD guest.
However, how the TD guest uses it is up to the guest itself.
1) When the TD guest doesn't enable #VE reduction: the configuration
from the VMM doesn't matter. The CPUIDs are fixed1 and the related
operation leads to #VE.
2) When the TD guest enables #VE reduction and doesn't enable
TDCS.FEATURE_PARAVIRT_CTRL of the related bit: the configuration from
the VMM doesn't matter. The CPUIDs are fixed0 and the related operation
leads to #GP.
3) When the TD guest enables #VE reduction and enables
TDCS.FEATURE_PARAVIRT_CTRL of the related bit: the configuration from
the VMM matters.
- When the VMM configures the bits to 1, the related operation leads to
#VE (for paravirtualization).
- When the VMM configures the bits to 0, the related operation leads to
#GP.
So for case 3), it does matter.
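The three cases can be written down as a small decision function. This is only a sketch of the behavior described above. Every name in it is hypothetical, and the assumption that in case 3) the guest-visible CPUID bit equals the value configured by the VMM is my reading of that case, not something quoted from the TDX module spec.

```c
#include <stdbool.h>

/* Hypothetical names; none of these come from KVM or the TDX module. */
enum td_fault { TD_FAULT_VE, TD_FAULT_GP };

struct td_feature_state {
	bool ve_reduction_enabled;	/* guest enabled #VE reduction */
	bool paravirt_ctrl_enabled;	/* guest set the related TDCS.FEATURE_PARAVIRT_CTRL bit */
	bool vmm_configured_bit;	/* value the VMM configured for the CPUID bit */
};

/* CPUID bit as seen by the TD guest (case 3 is an assumption, see above). */
static bool guest_visible_cpuid_bit(const struct td_feature_state *s)
{
	if (!s->ve_reduction_enabled)
		return true;		/* case 1: behaves as fixed1 */
	if (!s->paravirt_ctrl_enabled)
		return false;		/* case 2: behaves as fixed0 */
	return s->vmm_configured_bit;	/* case 3: VMM configuration matters */
}

/* Fault raised when the guest performs the related operation. */
static enum td_fault fault_on_related_operation(const struct td_feature_state *s)
{
	return guest_visible_cpuid_bit(s) ? TD_FAULT_VE : TD_FAULT_GP;
}
```

The VMM-configured value is consulted only on the last path, which is why the configuration matters only in case 3).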
> So, if the TDX Module *requires* them to be set/cleared when the TD is created,
> then they should be reported as fixed. If the TDX module doesn't care, then they
> should be reported as configurable. The fact that the guest can muck with things
> under the hood doesn't factor into that logic.
Yes, I agree with that.
> If TDX pulls something like this for features that KVM cares about, then we have
> problems, but that's already true today. If a feature requires KVM support, it
> doesn't really matter if the feature is fixed or configurable. What matters is
> that KVM has a chance to enforce that the feature can be used by the guest if
> and only if KVM has the proper support in place. Because if KVM is completely
> unaware of a feature, it's impossible for KVM to know that the feature needs to
> be rejected.
I agree.
With the proposed fixed0/fixed1 information, in addition to the
configurable bits, KVM can fully validate the TDX module against its
capabilities. When a violation occurs (e.g., a bit that KVM doesn't
support is reported as fixed1 by the TDX module), KVM can just refuse to
enable TDX.
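A minimal sketch of such a check, assuming KVM has one mask of supported bits per reported CPUID register. The struct and function names are hypothetical and only illustrate the fixed1 half of the validation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pairing of a TDX-reported fixed1 mask with KVM's view. */
struct cpuid_reg_report {
	uint32_t fixed1;	/* bits the TDX module forces to 1 */
	uint32_t kvm_supported;	/* bits KVM is able to support */
};

/*
 * Return false if any register has a fixed1 bit that KVM does not
 * support; the caller would then refuse to enable TDX.
 */
static bool kvm_can_enable_tdx(const struct cpuid_reg_report *regs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (regs[i].fixed1 & ~regs[i].kvm_supported)
			return false;
	}
	return true;
}
```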
> This isn't unique to TDX, CoCo, or firmware. Every new feature that lands in
> hardware needs to either be "benign" or have the appropriate virtualization
> controls. KVM already has to deal with cases where features can effectively be
> used without KVM's knowledge. E.g. there are plenty of instruction-level
> virtualization holes, and SEV-ES doubled down by essentially forcing KVM to let
> the guest write XCR0 and XSS directly.
>
> It all works, so long as the hardware vendor doesn't screw up and let the guest
> use a feature that impacts host safety and/or functionality, without the hypervisor's
> knowledge.
>
> So, just don't screw up :-)