public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>,
	linux-kernel@vger.kernel.org,  kvm@vger.kernel.org,
	michael.roth@amd.com, isaku.yamahata@linux.intel.com
Subject: Re: [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data
Date: Mon, 10 Jun 2024 16:40:52 -0700	[thread overview]
Message-ID: <ZmeOxAtwfTsDCi1x@google.com> (raw)
In-Reply-To: <de1b0bbc-b781-4372-88ad-81f26c9152c2@redhat.com>

On Tue, Jun 11, 2024, Paolo Bonzini wrote:
> On 6/10/24 23:48, Paolo Bonzini wrote:
> > On Sat, Jun 8, 2024 at 1:03 AM Sean Christopherson <seanjc@google.com> wrote:
> > > SNP folks and/or Paolo, what's the plan for this?  I don't see how what's sitting
> > > in kvm/next can possibly be correct without conditioning population on the folio
> > > being !uptodate.
> > 
> > I don't think I have time to look at it closely until Friday; but
> > thanks for reminding me.
> 
> Ok, I'm officially confused.  I think I understand what you did in your
> suggested code.  Limiting it to the bare minimum (keeping the callback
> instead of CONFIG_HAVE_KVM_GMEM_INITIALIZE) it would be something
> like what I include at the end of the message.
> 
> But the discussion upthread was about whether to do the check for
> RMP state in sev.c, or do it in common code using folio_mark_uptodate().
> I am not sure what you mean by "cannot possibly be correct", and
> whether it's referring to kvm_gmem_populate() in general or the
> callback in sev_gmem_post_populate().

Doing fallocate() before KVM_SEV_SNP_LAUNCH_UPDATE will cause the latter to fail.
That likely works for QEMU, at least for now, but it's unnecessarily restrictive
and IMO incorrect/wrong.

E.g. a more convoluted, fallocate() + PUNCH_HOLE + KVM_SEV_SNP_LAUNCH_UPDATE will
work (I think?  AFAICT adding and removing pages directly to/from the RMP doesn't
affect SNP's measurement, only pages that are added via SNP_LAUNCH_UPDATE affect
the measurement).

Punting the sanity check to vendor code is also gross and will make it harder to
provide a consistent, unified ABI for all architectures.  E.g. SNP returns -EINVAL
if the page is already assigned, which is quite misleading.

> The change below looks like just an optimization to me, which
> suggests that I'm missing something glaring.

I really dislike @prepare.  There are two paths that should actually initialize
the contents of the folio, and they are mutually exclusive and have meaningfully
different behavior.  Faulting in memory via kvm_gmem_get_pfn() explicitly zeros
the folio _if necessary_, whereas kvm_gmem_populate() initializes the folio with
user-provided data _and_ requires that the folio be !uptodate.

If we fix the above oddity where fallocate() initializes memory, then there's
no need to try and handle the initialization in a common chokepoint as the two
relevant paths will naturally have unique code.

The below is also still suboptimal for TDX, as KVM will zero the memory and then
TDX-module will also zero memory on PAGE.AUGA.

And I find SNP to be the odd one.  IIUC, the ASP (the artist formerly known as
the PSP) doesn't provide any guarantees about the contents of a page that is
assigned to a guest without bouncing through SNP_LAUNCH_UPDATE.  It'd be nice to
explicitly document that somewhere in the SNP code. E.g. if guest_memfd invokes
a common kvm_gmem_initialize_folio() or whatever, then SNP's implementation can
clearly capture that KVM zeros the page to protect the _host_ data.

> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index d4206e53a9c81..a0417ef5b86eb 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -52,37 +52,39 @@ static int kvm_gmem_prepare_folio(struct inode *inode, pgoff_t index, struct fol
>  static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool prepare)
>  {
>  	struct folio *folio;
> +	int r;
>  	/* TODO: Support huge pages. */
>  	folio = filemap_grab_folio(inode->i_mapping, index);
>  	if (IS_ERR(folio))
>  		return folio;
> -	/*
> -	 * Use the up-to-date flag to track whether or not the memory has been
> -	 * zeroed before being handed off to the guest.  There is no backing
> -	 * storage for the memory, so the folio will remain up-to-date until
> -	 * it's removed.
> -	 *
> -	 * TODO: Skip clearing pages when trusted firmware will do it when
> -	 * assigning memory to the guest.
> -	 */
> -	if (!folio_test_uptodate(folio)) {
> -		unsigned long nr_pages = folio_nr_pages(folio);
> -		unsigned long i;
> +	if (prepare) {
> +		/*
> +		 * Use the up-to-date flag to track whether or not the memory has
> +		 * been handed off to the guest.  There is no backing storage for
> +		 * the memory, so the folio will remain up-to-date until it's
> +		 * removed.
> +		 *
> +		 * Take the occasion of the first prepare operation to clear it.
> +		 */
> +		if (!folio_test_uptodate(folio)) {
> +			unsigned long nr_pages = folio_nr_pages(folio);
> +			unsigned long i;
> -		for (i = 0; i < nr_pages; i++)
> -			clear_highpage(folio_page(folio, i));
> +			for (i = 0; i < nr_pages; i++)
> +				clear_highpage(folio_page(folio, i));
> +		}
> +
> +		r = kvm_gmem_prepare_folio(inode, index, folio);
> +		if (r < 0)
> +			goto err_unlock_put;
>  		folio_mark_uptodate(folio);
> -	}
> -
> -	if (prepare) {
> -		int r =	kvm_gmem_prepare_folio(inode, index, folio);
> -		if (r < 0) {
> -			folio_unlock(folio);
> -			folio_put(folio);
> -			return ERR_PTR(r);
> +	} else {
> +		if (folio_test_uptodate(folio)) {
> +			r = -EEXIST;
> +			goto err_unlock_put;
>  		}
>  	}

  reply	other threads:[~2024-06-10 23:40 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-04 18:50 [PATCH 00/11] KVM: guest_memfd: New hooks and functionality for SEV-SNP and TDX Paolo Bonzini
2024-04-04 18:50 ` [PATCH 01/11] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Paolo Bonzini
2024-04-29 13:14   ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 02/11] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode Paolo Bonzini
2024-04-29 13:15   ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 03/11] KVM: guest_memfd: pass error up from filemap_grab_folio Paolo Bonzini
2024-04-04 18:50 ` [PATCH 04/11] filemap: add FGP_CREAT_ONLY Paolo Bonzini
2024-04-25  5:52   ` Paolo Bonzini
2024-04-29 13:26     ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 05/11] KVM: guest_memfd: limit overzealous WARN Paolo Bonzini
2024-04-04 18:50 ` [PATCH 06/11] KVM: guest_memfd: Add hook for initializing memory Paolo Bonzini
2024-04-22 10:53   ` Xu Yilun
2024-05-07 16:17     ` Paolo Bonzini
2024-04-04 18:50 ` [PATCH 07/11] KVM: guest_memfd: extract __kvm_gmem_get_pfn() Paolo Bonzini
2024-04-09 23:35   ` Michael Roth
2024-04-24 22:34   ` Sean Christopherson
2024-04-24 22:59     ` Sean Christopherson
2024-04-04 18:50 ` [PATCH 08/11] KVM: guest_memfd: extract __kvm_gmem_punch_hole() Paolo Bonzini
2024-04-04 18:50 ` [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data Paolo Bonzini
2024-04-22 14:44   ` Xu Yilun
2024-04-23 23:50   ` Isaku Yamahata
2024-04-24 22:24     ` Sean Christopherson
2024-04-25  1:12       ` Isaku Yamahata
2024-04-25  6:01         ` Paolo Bonzini
2024-04-25 16:00           ` Sean Christopherson
2024-04-25 16:51             ` Isaku Yamahata
2024-04-26  5:44               ` Paolo Bonzini
2024-04-26 17:15                 ` Isaku Yamahata
2024-04-26  5:41             ` Paolo Bonzini
2024-04-26 15:17               ` Sean Christopherson
2024-06-07 23:03                 ` Sean Christopherson
2024-06-10 21:48                   ` Paolo Bonzini
2024-06-10 22:26                     ` Paolo Bonzini
2024-06-10 23:40                       ` Sean Christopherson [this message]
2024-06-11  6:09                         ` Paolo Bonzini
2024-04-24 22:32   ` Sean Christopherson
2024-04-25  5:56     ` Paolo Bonzini
2024-04-04 18:50 ` [PATCH 10/11] KVM: guest_memfd: Add hook for invalidating memory Paolo Bonzini
2024-04-04 18:50 ` [PATCH 11/11] KVM: x86: Add gmem hook for determining max NPT mapping level Paolo Bonzini
2024-04-09 23:46   ` Michael Roth
2024-04-19 18:26     ` Isaku Yamahata
2024-04-22 14:52   ` Xu Yilun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZmeOxAtwfTsDCi1x@google.com \
    --to=seanjc@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=isaku.yamahata@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox