From: Sean Christopherson <seanjc@google.com>
To: Michael Roth <michael.roth@amd.com>
Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-mm@kvack.org, linux-crypto@vger.kernel.org,
x86@kernel.org, linux-kernel@vger.kernel.org,
tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de,
thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org,
pbonzini@redhat.com, vkuznets@redhat.com, jmattson@google.com,
luto@kernel.org, dave.hansen@linux.intel.com, slp@redhat.com,
pgonda@google.com, peterz@infradead.org,
srinivas.pandruvada@linux.intel.com, rientjes@google.com,
dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de,
vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com,
tony.luck@intel.com, sathyanarayanan.kuppuswamy@linux.intel.com,
alpergun@google.com, jarkko@kernel.org, ashish.kalra@amd.com,
nikunj.dadhania@amd.com, pankaj.gupta@amd.com,
liam.merwick@oracle.com, zhi.a.wang@intel.com,
Brijesh Singh <brijesh.singh@amd.com>
Subject: Re: [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
Date: Fri, 2 Feb 2024 14:54:55 -0800 [thread overview]
Message-ID: <Zb1yv67h6gkYqqv9@google.com> (raw)
In-Reply-To: <20240116041457.wver7acnwthjaflr@amd.com>
On Mon, Jan 15, 2024, Michael Roth wrote:
> On Wed, Jan 10, 2024 at 07:45:36AM -0800, Sean Christopherson wrote:
> > On Sat, Dec 30, 2023, Michael Roth wrote:
> > > diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> > > index b1beb2fe8766..d4325b26724c 100644
> > > --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> > > +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> > > @@ -485,6 +485,34 @@ Returns: 0 on success, -negative on error
> > >
> > > See the SEV-SNP specification for further detail on the launch input.
> > >
> > > +20. KVM_SNP_LAUNCH_UPDATE
> > > +-------------------------
> > > +
> > > +The KVM_SNP_LAUNCH_UPDATE is used for encrypting a memory region. It also
> > > +calculates a measurement of the memory contents. The measurement is a signature
> > > +of the memory contents that can be sent to the guest owner as an attestation
> > > +that the memory was encrypted correctly by the firmware.
> > > +
> > > +Parameters (in): struct kvm_snp_launch_update
> > > +
> > > +Returns: 0 on success, -negative on error
> > > +
> > > +::
> > > +
> > > + struct kvm_sev_snp_launch_update {
> > > + __u64 start_gfn; /* Guest page number to start from. */
> > > + __u64 uaddr; /* userspace address need to be encrypted */
> >
> > Huh? Why is KVM taking a userspace address? IIUC, the address unconditionally
> > gets translated into a gfn, so why not pass a gfn?
> >
> > And speaking of gfns, AFAICT start_gfn is never used.
>
> I think having both the uaddr and start_gfn parameters makes sense. I
> think it's only awkward because how I'm using the memslot to translate
> the uaddr to a GFN in the current implementation,
Yes.
> > Oof, reading more of the code, this *requires* an effective in-place copy-and-convert
> > of guest memory.
>
> Yes, I'm having some trouble locating the various threads, but initially
> there were some discussions about having a userspace API that allows for
> UPM/gmem pages to be pre-populated before they are in-place encrypted, but
> we'd all eventually decided that having KVM handle this internally was
> the simplest approach.
>
> So that's how it's done here, KVM_SNP_LAUNCH_UPDATE copies the pages into
> gmem, then passes those pages on to firmware for encryption. Then the
> VMM is expected to mark the GFN range as private via
> KVM_SET_MEMORY_ATTRIBUTES, since the VMM understands what constitutes
> initial private/encrypted payload. I should document that better in
> KVM_SNP_LAUNCH_UPDATE docs however.
That's fine. As above, my complaints are that the unencrypted source *must* be
covered by a memslot. That's beyond ugly.
> > What's "the IMI"? Initial Measurement Image? I assume this is essentially just
> > a flag that communicates whether or not the page should be measured?
>
> This is actually for loading a measured migration agent into the target
> system so that it can handle receiving the encrypted guest data from the
> source. There's still a good deal of planning around how live migration
> will be handled however so if you think it's premature to expose this
> via KVM I can remove the related fields.
Yes, please. Though FWIW, I honestly hope KVM_SEV_SNP_LAUNCH_UPDATE goes away
and we end up with a common uAPI for populating guest memory:
https://lore.kernel.org/all/Zbrj5WKVgMsUFDtb@google.com
> > > + __u8 page_type; /* page type */
> > > + __u8 vmpl3_perms; /* VMPL3 permission mask */
> > > + __u8 vmpl2_perms; /* VMPL2 permission mask */
> > > + __u8 vmpl1_perms; /* VMPL1 permission mask */
> >
> > Why? KVM doesn't support VMPLs.
>
> It does actually get used by the SVSM.
> I can remove these but then we'd need some capability bit or something to
> know when they are available if they get re-introduced.
_If_. We don't merge dead code, and we _definitely_ don't merge dead code that
creates ABI.
> > > + kvaddr = pfn_to_kaddr(pfns[i]);
> > > + if (!virt_addr_valid(kvaddr)) {
> >
> > I really, really don't like that this assume guest_memfd is backed by struct page.
>
> There are similar enforcements in the SEV/SEV-ES code. There was some
> initial discussion about relaxing this for SNP so we could support
> things like /dev/mem-mapped guest memory, but then guest_memfd came
> along and made that to be an unlikely use-case in the near-term given
> that it relies on alloc_pages() currently and explicitly guards against
> mmap()'ing pages in userspace.
>
> I think it makes to keep the current tightened restrictions in-place
> until such a use-case comes along, since otherwise we are relaxing a
> bunch of currently-useful sanity checks that span all throughout the code
> to support some nebulous potential use-case that might never come along.
> I think it makes more sense to cross that bridge when we get there.
I disagree. You say "sanity checks", while I see a bunch of arbitrary assumptions
that don't need to exist. Yes, today guest_memfd always uses struct page memory,
but that should have _zero_ impact on SNP. Arbitrary assumptions often cause a
lot of confusion for future readers, e.g. a few years from now, if the above code
still exists, someone might wonder what is so special about struct page memory,
and then waste a bunch of time trying to figure out that there's no technical
reason SNP "requires" struct page memory.
This is partly why I was pushing for guest_memfd to handle some of this; the
gmem code _knows_ what backing type it's using, it _knows_ if the direct map is
valid, etc.
> > > + pr_err("Invalid HVA 0x%llx for GFN 0x%llx\n", (uint64_t)kvaddr, gfn);
> > > + ret = -EINVAL;
> > > + goto e_release;
> > > + }
> > > +
> > > + ret = kvm_read_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE);
> >
> > Good gravy. If I'm reading this correctly, KVM:
> >
> > 1. Translates an HVA into a GFN.
> > 2. Gets the PFN for that GFN from guest_memfd
> > 3. Verifies the PFN is not assigned to the guest
> > 4. Copies memory from the shared memslot page to the guest_memfd page
> > 5. Converts the page to private and asks the PSP to encrypt it
> >
> > (a) As above, why is #1 a thing?
>
> Yah, it's probably best to avoid this, as proposed above.
>
> > (b) Why are KVM's memory attributes never consulted?
>
> It doesn't really matter if the attributes are set before or after
> KVM_SNP_LAUNCH_UPDATE, only that by the time the guest actually launches
> they pages get set to private so they get faulted in from gmem. We could
> document our expectations and enforce them here if that's preferable
> however. Maybe requiring KVM_SET_MEMORY_ATTRIBUTES(private) in advance
> would make it easier to enforce that userspace does the right thing.
> I'll see how that looks if there are no objections.
Userspace owns whether a page is PRIVATE or SHARED, full stop. If KVM can't
honor that, then we need to come up with better uAPI.
> > (c) What prevents TOCTOU issues with respect to the RMP?
>
> Time-of-use will be when the guest faults the gmem page in with C-bit
> set. If it is not in the expected Guest-owned/pre-validated state that
> SEV_CMD_SNP_LAUNCH_UPDATE expected/set, then the guest will get an RMP
> fault or #VC exception for unvalidated page access. It will also fail
> attestation. Not sure if that covers the scenarios you had in mind.
I don't think this covers what I was asking, but I suspect my concern will go
away once the new APIs come along, so let's table this for now.
>
> > (d) Why is *KVM* copying memory into guest_memfd?
>
> As mentioned above, there were various discussions of ways of allowing
> userspace to pwrite() to the guest_memfd in advance of
> "sealing"/"binding" it and then encrypting it in place. I think this was
> one of the related threads:
>
> https://lore.kernel.org/linux-mm/YkyKywkQYbr9U0CA@google.com/
>
> My read at the time was that the requirements between pKVM/TDX/SNP were all
> so unique that we'd spin forever trying to come up with a userspace ABI
> that worked for everyone. At the time you'd suggested for pKVM to handle
> their specific requirements internally in pKVM to avoid unecessary churn on
> TDX/SNP side, and I took the same approach with SNP in implementing it
> internally in SNP's KVM interfaces since it seemed unlikely there would
> be much common ground with how TDX handles it via KVM_TDX_INIT_MEM_REGION.
Yeah, the whole "intra-memslot copy" thing threw me.
> > (e) What guarantees the direct map is valid for guest_memfd?
>
> Are you suggesting this may change in the near-term?
I asking because, when I asked, I was unaware that the plan was to shatter the
direct to address the 2MiB vs. 4KiB erratum (as opposed to unmapping guest_memfd
pfns).
> > > + if (ret) {
> > > + pr_err("SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
> > > + ret, *error);
> > > + snp_page_reclaim(pfns[i]);
> > > +
> > > + /*
> > > + * When invalid CPUID function entries are detected, the firmware
> > > + * corrects these entries for debugging purpose and leaves the
> > > + * page unencrypted so it can be provided users for debugging
> > > + * and error-reporting.
> > > + *
> > > + * Copy the corrected CPUID page back to shared memory so
> > > + * userpsace can retrieve this information.
> >
> > Why? IIUC, this is basically backdooring reads/writes into guest_memfd to avoid
> > having to add proper mmap() support.
>
> The CPUID page is private/encrypted, so it needs to be a gmem page.
> SNP firmware is doing the backdooring when it writes the unencrypted,
> expected contents into the page during failure. What's wrong with copying
> the contents back into the source page so userspace can be use of it?
> If we implement the above-mentioned changes then the source page is just
> a userspace buffer that isn't necessarily associated with a memslot, so
> it becomes even more straightforward.
>
> Would that be acceptable?
Yes, I am specifically complaining about writing guest memory on failure, which is
all kinds of weird.
> > > + kvfree(pfns);
> >
> > Saving PFNs from guest_memfd, which is fully owned by KVM, is so unnecessarily
> > complex. Add a guest_memfd API (or three) to do this safely, e.g. to lock the
> > pages, do (and track) the RMP conversion, etc...
>
> Is adding 3 gmem APIs and tracking RMP states inside gmem really less
> complex than what's going on here?
I think we covered this in PUCK? Holler if you still have questions here.
next prev parent reply other threads:[~2024-02-02 22:54 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-30 17:23 [PATCH v11 00/35] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2023-12-30 17:23 ` [PATCH v11 01/35] KVM: Add hugepage support for dedicated guest memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 02/35] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 03/35] KVM: Use AS_INACCESSIBLE when creating guest_memfd inode Michael Roth
2023-12-30 17:23 ` [PATCH v11 04/35] KVM: x86: Add gmem hook for initializing memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 05/35] KVM: x86: Add gmem hook for invalidating memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 06/35] KVM: x86/mmu: Pass around full 64-bit error code for KVM page faults Michael Roth
2024-02-06 20:51 ` Sean Christopherson
2024-02-12 10:00 ` Paolo Bonzini
2024-02-12 16:42 ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 07/35] KVM: x86: Add KVM_X86_SNP_VM vm_type Michael Roth
2023-12-30 17:23 ` [PATCH v11 08/35] KVM: x86: Define RMP page fault error bits for #NPF Michael Roth
2023-12-30 17:23 ` [PATCH v11 09/35] KVM: x86: Determine shared/private faults based on vm_type Michael Roth
2024-02-12 10:31 ` Paolo Bonzini
2024-02-12 16:27 ` Sean Christopherson
2024-02-12 16:47 ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 10/35] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests Michael Roth
2023-12-30 17:23 ` [PATCH v11 11/35] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
2023-12-30 17:23 ` [PATCH v11 12/35] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
2023-12-30 17:23 ` [PATCH v11 13/35] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 14/35] KVM: SEV: Add initial SEV-SNP support Michael Roth
2023-12-30 17:23 ` [PATCH v11 15/35] KVM: SEV: Add KVM_SNP_INIT command Michael Roth
2024-02-06 23:51 ` Paolo Bonzini
2024-03-20 17:28 ` Paolo Bonzini
2023-12-30 17:23 ` [PATCH v11 16/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2023-12-30 17:23 ` [PATCH v11 17/35] KVM: Add HVA range operator Michael Roth
2023-12-30 17:23 ` [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2024-01-10 15:45 ` Sean Christopherson
2024-01-16 4:14 ` Michael Roth
2024-02-02 22:54 ` Sean Christopherson [this message]
2024-02-06 23:43 ` Paolo Bonzini
2024-02-07 2:43 ` Sean Christopherson
2024-02-07 8:03 ` Paolo Bonzini
2024-02-09 1:52 ` Michael Roth
2024-02-09 14:34 ` Sean Christopherson
2024-03-18 21:02 ` Peter Gonda
2023-12-30 17:23 ` [PATCH v11 19/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2023-12-30 17:23 ` [PATCH v11 20/35] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 21/35] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 22/35] KVM: SEV: Add support to handle " Michael Roth
2023-12-30 17:23 ` [PATCH v11 23/35] KVM: x86: Export the kvm_zap_gfn_range() for the SNP use Michael Roth
2023-12-30 17:23 ` [PATCH v11 24/35] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
2023-12-30 17:23 ` [PATCH v11 25/35] KVM: SEV: Use a VMSA physical address variable for populating VMCB Michael Roth
2023-12-30 17:23 ` [PATCH v11 26/35] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
2024-01-05 22:08 ` Jacob Xu
2024-01-08 15:53 ` Sean Christopherson
2023-12-30 17:23 ` [PATCH v11 27/35] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 28/35] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2024-03-11 5:50 ` Binbin Wu
2023-12-30 17:23 ` [PATCH v11 29/35] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2023-12-30 17:23 ` [PATCH v11 30/35] KVM: x86: Add gmem hook for determining max NPT mapping level Michael Roth
2024-02-12 10:50 ` Paolo Bonzini
2024-02-12 17:03 ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 31/35] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 32/35] KVM: SVM: Add module parameter to enable the SEV-SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 33/35] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2023-12-30 17:23 ` [PATCH v11 34/35] crypto: ccp: Add the SNP_SET_CONFIG_{START,END} commands Michael Roth
2023-12-30 17:23 ` [PATCH v11 35/35] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zb1yv67h6gkYqqv9@google.com \
--to=seanjc@google.com \
--cc=ak@linux.intel.com \
--cc=alpergun@google.com \
--cc=ardb@kernel.org \
--cc=ashish.kalra@amd.com \
--cc=bp@alien8.de \
--cc=brijesh.singh@amd.com \
--cc=dave.hansen@linux.intel.com \
--cc=dovmurik@linux.ibm.com \
--cc=hpa@zytor.com \
--cc=jarkko@kernel.org \
--cc=jmattson@google.com \
--cc=jroedel@suse.de \
--cc=kirill@shutemov.name \
--cc=kvm@vger.kernel.org \
--cc=liam.merwick@oracle.com \
--cc=linux-coco@lists.linux.dev \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=michael.roth@amd.com \
--cc=mingo@redhat.com \
--cc=nikunj.dadhania@amd.com \
--cc=pankaj.gupta@amd.com \
--cc=pbonzini@redhat.com \
--cc=peterz@infradead.org \
--cc=pgonda@google.com \
--cc=rientjes@google.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=slp@redhat.com \
--cc=srinivas.pandruvada@linux.intel.com \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
--cc=tobin@ibm.com \
--cc=tony.luck@intel.com \
--cc=vbabka@suse.cz \
--cc=vkuznets@redhat.com \
--cc=x86@kernel.org \
--cc=zhi.a.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.