public inbox for linux-coco@lists.linux.dev
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jon Lange <jlange@microsoft.com>, Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	John Starks <John.Starks@microsoft.com>,
	Will Deacon <will@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
	LKML <linux-kernel@vger.kernel.org>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Subject: Re: [EXTERNAL] Re: "Paravisor" Feature Enumeration
Date: Tue, 6 Jan 2026 22:39:08 +0000
Message-ID: <43ae1b15-c911-4ecd-aaaa-15bc23ec6192@citrix.com>
In-Reply-To: <CH8PR21MB522275E86FF5D33B04CE7A75CA87A@CH8PR21MB5222.namprd21.prod.outlook.com>

On 06/01/2026 2:12 am, Jon Lange wrote:
> Andrew wrote:
>
>> Are we saying that, inside an opaque blob that a customer provides to a CSP to run we might have:
>> * a paravisor and an unaware OS, or
>> * svsm and a fully-aware OS, or
>> * something in-between these two.
>> and we're looking for a way to describe which piece of the interior stack owns which capability/service?
>> I think the discussion would benefit greatly from having a couple of concrete examples of data this wants to hold,
>> and how it is to be used at different levels of the interior software stack.
> Here are two examples.  In both examples, the OS is running behind a paravisor but I wouldn't term it an "unaware OS".  Rather, the paravisor is present because of the set of services it provides, and it is running in paravisor mode (not SVSM mode) because the implementation benefits from taking full management responsibility for the confidential trust boundary (e.g. determination of when/how to validate/accept pages).  In such a configuration, where the paravisor has management responsibility for the confidential trust boundary, all of the enlightenments in the guest OS for managing confidentiality state must be suppressed.  The straightforward way to do this is for the paravisor to suppress the confidential VM enumeration information visible to the guest OS (the "SNP available" CPUID bit, or the "TDX active" bit, for example).
>
> Note that this occurs out of necessity because we can't have the paravisor and the guest OS fighting over who has the right/responsibility to execute PVALIDATE, or TDG.MEM.PAGE.ACCEPT, or whatever.  The kernel today only has two concepts of its execution mode: either it is a confidential VM, in which case it takes full responsibility, or it is not a confidential VM, in which case it ignores the responsibility.  When a paravisor (not SVSM) is active, we have to operate in the second mode because the first mode would provoke precisely the conflict we're trying to avoid. 
>
> First example: a confidential VM running under a paravisor wants to obtain an attestation report for itself to pass to a third party to vouch for the fact that it is a confidential VM.  Assume in this example that the relying party is aware of the paravisor and the paravisor's measurements, so the evidence provided in such an attestation report can successfully be verified as authentic.  In order for this to be possible, the kernel has to know that it's running in a confidential VM in a mode where attestation reports are available but where the responsibility for confidential memory state management is suppressed.  This is a third state beyond the two states described above.  This isn't just a userspace problem because access to the attestation service is mediated by a kernel-mode driver that needs to know how to configure itself (such configuration today is based on CPUID and not on ACPI).
>
> Second example: a confidential VM running under a paravisor determines that one of the devices available to it is a TDISP device that requires the OS - not the paravisor - to perform the operations required to configure the device, to obtain and verify its attestation information, and to consent to activating the device in the TDISP RUN state.  In order for the OS to be able to execute that sequence, the device has to know that it is running as a confidential VM so it knows that TDISP configuration may be necessary.

Thank you - that is helpful.

So overall, we're wanting the paravisor to be able to express "You're in
a confidential VM, but you're not in charge" to the OS.

Hiding the SNP / TDX bit is of course necessary.  They have well defined
meanings which the OS cannot use when it's not in charge.
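
As a rough sketch of the "confidential, but not in charge" state next to
the two modes the kernel has today, something like the following would
capture it.  Every name here (cvm_role, CVM_ROLE_*, the two helpers) is
made up for illustration and does not exist in Linux.

```c
#include <stdbool.h>

/* Hypothetical names; none of these exist in Linux today. */
enum cvm_role {
	CVM_ROLE_NONE,		/* not a confidential VM */
	CVM_ROLE_OWNER,		/* confidential VM, OS manages page state */
	CVM_ROLE_DELEGATED,	/* confidential VM, paravisor manages page state */
};

/* Only the owner may issue PVALIDATE / TDG.MEM.PAGE.ACCEPT itself. */
static bool cvm_may_accept_pages(enum cvm_role role)
{
	return role == CVM_ROLE_OWNER;
}

/* Attestation and TDISP setup are relevant in any confidential VM. */
static bool cvm_is_confidential(enum cvm_role role)
{
	return role != CVM_ROLE_NONE;
}
```

The point of splitting the two predicates is that an attestation driver
or TDISP code would key off cvm_is_confidential(), while the memory
acceptance paths would key off cvm_may_accept_pages().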

In your first example, when you say "attestation report", do you mean of
the whole encrypted VM, or only the "OS" part of it?  After all, a
paravisor could be running multiple OSes.

Whichever it is, this is clearly a service provided by the paravisor,
with some kind of API that's going to be of the form "execute
VM(M)CALL/etc with these regs".  TDISP also involves CPU-initiated
actions, some of which may need a paravisor API.


What you're really describing is "just another hypervisor".  So really,
on x86, the paravisor (which does control CPUID in this scenario) ought
to hide the outer data, advertise itself at 0x4000_0000, and Linux wants
a new paravirt mode for this new kind of virtual platform, which is
probably not going to be very different from a typical KVM/Xen HVM/Hyper-V
guest today.
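
For reference, hypervisor detection at 0x4000_0000 conventionally works
by reading a 12-byte signature out of EBX, ECX and EDX.  A minimal
sketch of the decode step, taking the register values as plain
arguments so it does not depend on executing CPUID; the helper names
are made up, and any paravisor signature string would be an assumption
since none has been defined:

```c
#include <stdint.h>
#include <string.h>

/* Base of the hypervisor CPUID range on x86. */
#define HV_CPUID_BASE 0x40000000u

/*
 * Assemble the 12-byte signature from the EBX, ECX, EDX values returned
 * by CPUID leaf HV_CPUID_BASE.  Bytes are taken least significant first,
 * which matches how x86 would store the registers to memory.
 */
static void hv_signature(uint32_t ebx, uint32_t ecx, uint32_t edx,
			 char out[13])
{
	const uint32_t regs[3] = { ebx, ecx, edx };

	for (int r = 0; r < 3; r++)
		for (int b = 0; b < 4; b++)
			out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xff);
	out[12] = '\0';
}

/* Compare the decoded signature against an expected string. */
static int hv_signature_is(uint32_t ebx, uint32_t ecx, uint32_t edx,
			   const char *want)
{
	char sig[13];

	hv_signature(ebx, ecx, edx, sig);
	return strncmp(sig, want, 12) == 0;
}
```

A guest would feed this the output of CPUID leaf 0x4000_0000 and, on a
match for the paravisor's signature, select the corresponding paravirt
mode in the same way existing hypervisor guests are detected.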

Anything else, and it seems like you're just re-inventing the wheel but
a little more square.

Do you foresee a need to pass anything other than "here's a handful of
services that are available to you"?  An ACPI table might be an
approach, but this seems like it could be a leaf or two and nothing more.
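
To illustrate what "a leaf or two" could look like, here is a sketch of
a feature bitmap as it might be returned in one register of a paravisor
leaf.  The bit assignments and names (PV_FEAT_*, pv_has_feature) are
purely hypothetical; nothing here is specified anywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical service bits; not part of any specification. */
#define PV_FEAT_ATTESTATION	(UINT32_C(1) << 0)	/* attestation reports */
#define PV_FEAT_TDISP		(UINT32_C(1) << 1)	/* OS-driven TDISP setup */

/* True if every bit in 'feature' is set in the enumerated bitmap. */
static bool pv_has_feature(uint32_t features, uint32_t feature)
{
	return (features & feature) == feature;
}
```

An attestation driver would then probe with
pv_has_feature(features, PV_FEAT_ATTESTATION) rather than with the
SNP/TDX bits, which stay hidden in this configuration.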


There's no common enumeration scheme between different architectures,
but I'm a firm believer that things ought to be enumerated in the
typical way for the architecture/platform.  This means CPUID on x86, and
things like devicetree on ARM.  It's slightly ugly duplicating
information, but it's less ugly than shoehorning a non-typical
enumeration scheme into an existing infrastructure.

~Andrew


