All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Michael Roth <michael.roth@amd.com>
Cc: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	linux-mm@kvack.org,  linux-crypto@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	 tglx@linutronix.de, mingo@redhat.com, jroedel@suse.de,
	 thomas.lendacky@amd.com, hpa@zytor.com, ardb@kernel.org,
	pbonzini@redhat.com,  vkuznets@redhat.com, jmattson@google.com,
	luto@kernel.org,  dave.hansen@linux.intel.com, slp@redhat.com,
	pgonda@google.com,  peterz@infradead.org,
	srinivas.pandruvada@linux.intel.com,  rientjes@google.com,
	dovmurik@linux.ibm.com, tobin@ibm.com, bp@alien8.de,
	 vbabka@suse.cz, kirill@shutemov.name, ak@linux.intel.com,
	tony.luck@intel.com,  sathyanarayanan.kuppuswamy@linux.intel.com,
	alpergun@google.com,  jarkko@kernel.org, ashish.kalra@amd.com,
	nikunj.dadhania@amd.com,  pankaj.gupta@amd.com,
	liam.merwick@oracle.com, zhi.a.wang@intel.com,
	 Brijesh Singh <brijesh.singh@amd.com>
Subject: Re: [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command
Date: Wed, 10 Jan 2024 07:45:36 -0800	[thread overview]
Message-ID: <ZZ67oJwzAsSvui5U@google.com> (raw)
In-Reply-To: <20231230172351.574091-19-michael.roth@amd.com>

On Sat, Dec 30, 2023, Michael Roth wrote:
> From: Brijesh Singh <brijesh.singh@amd.com>
> 
> The KVM_SEV_SNP_LAUNCH_UPDATE command can be used to insert data into
> the guest's memory. The data is encrypted with the cryptographic context
> created with the KVM_SEV_SNP_LAUNCH_START.
> 
> In addition to the inserting data, it can insert a two special pages
> into the guests memory: the secrets page and the CPUID page.
> 
> While terminating the guest, reclaim the guest pages added in the RMP
> table. If the reclaim fails, then the page is no longer safe to be
> released back to the system and leak them.
> 
> For more information see the SEV-SNP specification.

Please rewrite all changelogs to explain what *KVM* support is being added, why
the proposed uAPI looks like it does, and how the new uAPI is intended be used.

Porividing a crash course on the relevant hardware behavior is definitely helpful,
but the changelog absolutely needs to explain/justify the patch.

> Co-developed-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  .../virt/kvm/x86/amd-memory-encryption.rst    |  28 +++
>  arch/x86/kvm/svm/sev.c                        | 181 ++++++++++++++++++
>  include/uapi/linux/kvm.h                      |  19 ++
>  3 files changed, 228 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/x86/amd-memory-encryption.rst b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> index b1beb2fe8766..d4325b26724c 100644
> --- a/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> +++ b/Documentation/virt/kvm/x86/amd-memory-encryption.rst
> @@ -485,6 +485,34 @@ Returns: 0 on success, -negative on error
>  
>  See the SEV-SNP specification for further detail on the launch input.
>  
> +20. KVM_SNP_LAUNCH_UPDATE
> +-------------------------
> +
> +The KVM_SNP_LAUNCH_UPDATE is used for encrypting a memory region. It also
> +calculates a measurement of the memory contents. The measurement is a signature
> +of the memory contents that can be sent to the guest owner as an attestation
> +that the memory was encrypted correctly by the firmware.
> +
> +Parameters (in): struct  kvm_snp_launch_update
> +
> +Returns: 0 on success, -negative on error
> +
> +::
> +
> +        struct kvm_sev_snp_launch_update {
> +                __u64 start_gfn;        /* Guest page number to start from. */
> +                __u64 uaddr;            /* userspace address need to be encrypted */

Huh?  Why is KVM taking a userspace address?  IIUC, the address unconditionally
gets translated into a gfn, so why not pass a gfn?

And speaking of gfns, AFAICT start_gfn is never used.

Oof, reading more of the code, this *requires* an effective in-place copy-and-convert
of guest memory.

> +                __u32 len;              /* length of memory region */

Bytes?  Pages?  One field above operates on frame numbers, one apparently operates
on a byte-granularity address.

> +                __u8 imi_page;          /* 1 if memory is part of the IMI */

What's "the IMI"?  Initial Measurement Image?  I assume this is essentially just
a flag that communicates whether or not the page should be measured?

> +                __u8 page_type;         /* page type */
> +                __u8 vmpl3_perms;       /* VMPL3 permission mask */
> +                __u8 vmpl2_perms;       /* VMPL2 permission mask */
> +                __u8 vmpl1_perms;       /* VMPL1 permission mask */

Why?  KVM doesn't support VMPLs.

> +static int snp_page_reclaim(u64 pfn)
> +{
> +	struct sev_data_snp_page_reclaim data = {0};
> +	int err, rc;
> +
> +	data.paddr = __sme_set(pfn << PAGE_SHIFT);
> +	rc = sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err);
> +	if (rc) {
> +		/*
> +		 * If the reclaim failed, then page is no longer safe
> +		 * to use.

Uh, why can reclaim fail, and why does the kernel apparently not care about
leaking pages?  AFAICT, nothing ever complains beyond a pr_debug.  That sounds
bonkers to me, i.e. at the very minimum, why doesn't this warrant a WARN_ON_ONCE?

> +		 */
> +		snp_leak_pages(pfn, 1);
> +	}
> +
> +	return rc;
> +}
> +
> +static int host_rmp_make_shared(u64 pfn, enum pg_level level, bool leak)
> +{
> +	int rc;
> +
> +	rc = rmp_make_shared(pfn, level);
> +	if (rc && leak)
> +		snp_leak_pages(pfn,
> +			       page_level_size(level) >> PAGE_SHIFT);

Completely unnecessary wrap.

> +
> +	return rc;
> +}
> +
>  static void sev_unbind_asid(struct kvm *kvm, unsigned int handle)
>  {
>  	struct sev_data_deactivate deactivate;
> @@ -1990,6 +2020,154 @@ static int snp_launch_start(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return rc;
>  }
>  
> +static int snp_launch_update_gfn_handler(struct kvm *kvm,
> +					 struct kvm_gfn_range *range,
> +					 void *opaque)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct kvm_memory_slot *memslot = range->slot;
> +	struct sev_data_snp_launch_update data = {0};
> +	struct kvm_sev_snp_launch_update params;
> +	struct kvm_sev_cmd *argp = opaque;
> +	int *error = &argp->error;
> +	int i, n = 0, ret = 0;
> +	unsigned long npages;
> +	kvm_pfn_t *pfns;
> +	gfn_t gfn;
> +
> +	if (!kvm_slot_can_be_private(memslot)) {
> +		pr_err("SEV-SNP requires private memory support via guest_memfd.\n");

Yeah, no.  Sprinkling pr_err() all over the place in user-triggerable error paths
is not acceptable.  I get that it's often hard to extract "what went wrong" out
of the kernel, but adding pr_err() everywhere is not a viable solution.

> +		return -EINVAL;
> +	}
> +
> +	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params))) {
> +		pr_err("Failed to copy user parameters for SEV-SNP launch.\n");
> +		return -EFAULT;
> +	}
> +
> +	data.gctx_paddr = __psp_pa(sev->snp_context);
> +
> +	npages = range->end - range->start;
> +	pfns = kvmalloc_array(npages, sizeof(*pfns), GFP_KERNEL_ACCOUNT);
> +	if (!pfns)
> +		return -ENOMEM;
> +
> +	pr_debug("%s: GFN range 0x%llx-0x%llx, type %d\n", __func__,
> +		 range->start, range->end, params.page_type);
> +
> +	for (gfn = range->start, i = 0; gfn < range->end; gfn++, i++) {
> +		int order, level;
> +		bool assigned;
> +		void *kvaddr;
> +
> +		ret = __kvm_gmem_get_pfn(kvm, memslot, gfn, &pfns[i], &order, false);
> +		if (ret)
> +			goto e_release;
> +
> +		n++;
> +		ret = snp_lookup_rmpentry((u64)pfns[i], &assigned, &level);
> +		if (ret || assigned) {
> +			pr_err("Failed to ensure GFN 0x%llx is in initial shared state, ret: %d, assigned: %d\n",
> +			       gfn, ret, assigned);
> +			return -EFAULT;
> +		}
> +
> +		kvaddr = pfn_to_kaddr(pfns[i]);
> +		if (!virt_addr_valid(kvaddr)) {

I really, really don't like that this assume guest_memfd is backed by struct page.

> +			pr_err("Invalid HVA 0x%llx for GFN 0x%llx\n", (uint64_t)kvaddr, gfn);
> +			ret = -EINVAL;
> +			goto e_release;
> +		}
> +
> +		ret = kvm_read_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE);

Good gravy.  If I'm reading this correctly, KVM:

  1. Translates an HVA into a GFN.
  2. Gets the PFN for that GFN from guest_memfd
  3. Verifies the PFN is not assigned to the guest
  4. Copies memory from the shared memslot page to the guest_memfd page
  5. Converts the page to private and asks the PSP to encrypt it

(a) As above, why is #1 a thing?
(b) Why are KVM's memory attributes never consulted?
(c) What prevents TOCTOU issues with respect to the RMP?
(d) Why is *KVM* copying memory into guest_memfd?
(e) What guarantees the direct map is valid for guest_memfd?
(f) Why does KVM's uAPI *require* the source page to come from the same memslot?

> +		if (ret) {
> +			pr_err("Guest read failed, ret: 0x%x\n", ret);
> +			goto e_release;
> +		}
> +
> +		ret = rmp_make_private(pfns[i], gfn << PAGE_SHIFT, PG_LEVEL_4K,
> +				       sev_get_asid(kvm), true);
> +		if (ret) {
> +			ret = -EFAULT;
> +			goto e_release;
> +		}
> +
> +		data.address = __sme_set(pfns[i] << PAGE_SHIFT);
> +		data.page_size = PG_LEVEL_TO_RMP(PG_LEVEL_4K);
> +		data.page_type = params.page_type;
> +		data.vmpl3_perms = params.vmpl3_perms;
> +		data.vmpl2_perms = params.vmpl2_perms;
> +		data.vmpl1_perms = params.vmpl1_perms;
> +		ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE,
> +				      &data, error);
> +		if (ret) {
> +			pr_err("SEV-SNP launch update failed, ret: 0x%x, fw_error: 0x%x\n",
> +			       ret, *error);
> +			snp_page_reclaim(pfns[i]);
> +
> +			/*
> +			 * When invalid CPUID function entries are detected, the firmware
> +			 * corrects these entries for debugging purpose and leaves the
> +			 * page unencrypted so it can be provided users for debugging
> +			 * and error-reporting.
> +			 *
> +			 * Copy the corrected CPUID page back to shared memory so
> +			 * userpsace can retrieve this information.

Why?  IIUC, this is basically backdooring reads/writes into guest_memfd to avoid
having to add proper mmap() support.
 
> +			 */
> +			if (params.page_type == SNP_PAGE_TYPE_CPUID &&
> +			    *error == SEV_RET_INVALID_PARAM) {
> +				int ret;

Ugh, do not shadow variables.

> +
> +				host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true);
> +
> +				ret = kvm_write_guest_page(kvm, gfn, kvaddr, 0, PAGE_SIZE);
> +				if (ret)
> +					pr_err("Failed to write CPUID page back to userspace, ret: 0x%x\n",
> +					       ret);
> +			}
> +
> +			goto e_release;
> +		}
> +	}
> +
> +e_release:
> +	/* Content of memory is updated, mark pages dirty */
> +	for (i = 0; i < n; i++) {
> +		set_page_dirty(pfn_to_page(pfns[i]));
> +		mark_page_accessed(pfn_to_page(pfns[i]));
> +
> +		/*
> +		 * If its an error, then update RMP entry to change page ownership
> +		 * to the hypervisor.
> +		 */
> +		if (ret)
> +			host_rmp_make_shared(pfns[i], PG_LEVEL_4K, true);
> +
> +		put_page(pfn_to_page(pfns[i]));
> +	}
> +
> +	kvfree(pfns);

Saving PFNs from guest_memfd, which is fully owned by KVM, is so unnecessarily
complex.  Add a guest_memfd API (or three) to do this safely, e.g. to lock the
pages, do (and track) the RMP conversion, etc...

  reply	other threads:[~2024-01-10 15:45 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-30 17:23 [PATCH v11 00/35] Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support Michael Roth
2023-12-30 17:23 ` [PATCH v11 01/35] KVM: Add hugepage support for dedicated guest memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 02/35] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 03/35] KVM: Use AS_INACCESSIBLE when creating guest_memfd inode Michael Roth
2023-12-30 17:23 ` [PATCH v11 04/35] KVM: x86: Add gmem hook for initializing memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 05/35] KVM: x86: Add gmem hook for invalidating memory Michael Roth
2023-12-30 17:23 ` [PATCH v11 06/35] KVM: x86/mmu: Pass around full 64-bit error code for KVM page faults Michael Roth
2024-02-06 20:51   ` Sean Christopherson
2024-02-12 10:00     ` Paolo Bonzini
2024-02-12 16:42       ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 07/35] KVM: x86: Add KVM_X86_SNP_VM vm_type Michael Roth
2023-12-30 17:23 ` [PATCH v11 08/35] KVM: x86: Define RMP page fault error bits for #NPF Michael Roth
2023-12-30 17:23 ` [PATCH v11 09/35] KVM: x86: Determine shared/private faults based on vm_type Michael Roth
2024-02-12 10:31   ` Paolo Bonzini
2024-02-12 16:27     ` Sean Christopherson
2024-02-12 16:47       ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 10/35] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests Michael Roth
2023-12-30 17:23 ` [PATCH v11 11/35] KVM: SEV: Select KVM_GENERIC_PRIVATE_MEM when CONFIG_KVM_AMD_SEV=y Michael Roth
2023-12-30 17:23 ` [PATCH v11 12/35] KVM: SEV: Add support to handle AP reset MSR protocol Michael Roth
2023-12-30 17:23 ` [PATCH v11 13/35] KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 14/35] KVM: SEV: Add initial SEV-SNP support Michael Roth
2023-12-30 17:23 ` [PATCH v11 15/35] KVM: SEV: Add KVM_SNP_INIT command Michael Roth
2024-02-06 23:51   ` Paolo Bonzini
2024-03-20 17:28   ` Paolo Bonzini
2023-12-30 17:23 ` [PATCH v11 16/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command Michael Roth
2023-12-30 17:23 ` [PATCH v11 17/35] KVM: Add HVA range operator Michael Roth
2023-12-30 17:23 ` [PATCH v11 18/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command Michael Roth
2024-01-10 15:45   ` Sean Christopherson [this message]
2024-01-16  4:14     ` Michael Roth
2024-02-02 22:54       ` Sean Christopherson
2024-02-06 23:43         ` Paolo Bonzini
2024-02-07  2:43           ` Sean Christopherson
2024-02-07  8:03             ` Paolo Bonzini
2024-02-09  1:52           ` Michael Roth
2024-02-09 14:34             ` Sean Christopherson
2024-03-18 21:02   ` Peter Gonda
2023-12-30 17:23 ` [PATCH v11 19/35] KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command Michael Roth
2023-12-30 17:23 ` [PATCH v11 20/35] KVM: SEV: Add support to handle GHCB GPA register VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 21/35] KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT Michael Roth
2023-12-30 17:23 ` [PATCH v11 22/35] KVM: SEV: Add support to handle " Michael Roth
2023-12-30 17:23 ` [PATCH v11 23/35] KVM: x86: Export the kvm_zap_gfn_range() for the SNP use Michael Roth
2023-12-30 17:23 ` [PATCH v11 24/35] KVM: SEV: Add support to handle RMP nested page faults Michael Roth
2023-12-30 17:23 ` [PATCH v11 25/35] KVM: SEV: Use a VMSA physical address variable for populating VMCB Michael Roth
2023-12-30 17:23 ` [PATCH v11 26/35] KVM: SEV: Support SEV-SNP AP Creation NAE event Michael Roth
2024-01-05 22:08   ` Jacob Xu
2024-01-08 15:53     ` Sean Christopherson
2023-12-30 17:23 ` [PATCH v11 27/35] KVM: SEV: Add support for GHCB-based termination requests Michael Roth
2023-12-30 17:23 ` [PATCH v11 28/35] KVM: SEV: Implement gmem hook for initializing private pages Michael Roth
2024-03-11  5:50   ` Binbin Wu
2023-12-30 17:23 ` [PATCH v11 29/35] KVM: SEV: Implement gmem hook for invalidating " Michael Roth
2023-12-30 17:23 ` [PATCH v11 30/35] KVM: x86: Add gmem hook for determining max NPT mapping level Michael Roth
2024-02-12 10:50   ` Paolo Bonzini
2024-02-12 17:03     ` Michael Roth
2023-12-30 17:23 ` [PATCH v11 31/35] KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 32/35] KVM: SVM: Add module parameter to enable the SEV-SNP Michael Roth
2023-12-30 17:23 ` [PATCH v11 33/35] KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event Michael Roth
2023-12-30 17:23 ` [PATCH v11 34/35] crypto: ccp: Add the SNP_SET_CONFIG_{START,END} commands Michael Roth
2023-12-30 17:23 ` [PATCH v11 35/35] KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event Michael Roth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZZ67oJwzAsSvui5U@google.com \
    --to=seanjc@google.com \
    --cc=ak@linux.intel.com \
    --cc=alpergun@google.com \
    --cc=ardb@kernel.org \
    --cc=ashish.kalra@amd.com \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dovmurik@linux.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jarkko@kernel.org \
    --cc=jmattson@google.com \
    --cc=jroedel@suse.de \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=liam.merwick@oracle.com \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=michael.roth@amd.com \
    --cc=mingo@redhat.com \
    --cc=nikunj.dadhania@amd.com \
    --cc=pankaj.gupta@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pgonda@google.com \
    --cc=rientjes@google.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=slp@redhat.com \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tobin@ibm.com \
    --cc=tony.luck@intel.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=x86@kernel.org \
    --cc=zhi.a.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.