All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>,
	linux-kernel@vger.kernel.org,  kvm@vger.kernel.org,
	michael.roth@amd.com, isaku.yamahata@linux.intel.com
Subject: Re: [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data
Date: Mon, 10 Jun 2024 16:40:52 -0700	[thread overview]
Message-ID: <ZmeOxAtwfTsDCi1x@google.com> (raw)
In-Reply-To: <de1b0bbc-b781-4372-88ad-81f26c9152c2@redhat.com>

On Tue, Jun 11, 2024, Paolo Bonzini wrote:
> On 6/10/24 23:48, Paolo Bonzini wrote:
> > On Sat, Jun 8, 2024 at 1:03 AM Sean Christopherson <seanjc@google.com> wrote:
> > > SNP folks and/or Paolo, what's the plan for this?  I don't see how what's sitting
> > > in kvm/next can possibly be correct without conditioning population on the folio
> > > being !uptodate.
> > 
> > I don't think I have time to look at it closely until Friday; but
> > thanks for reminding me.
> 
> Ok, I'm officially confused.  I think I understand what you did in your
> suggested code.  Limiting it to the bare minimum (keeping the callback
> instead of CONFIG_HAVE_KVM_GMEM_INITIALIZE) it would be something
> like what I include at the end of the message.
> 
> But the discussion upthread was about whether to do the check for
> RMP state in sev.c, or do it in common code using folio_mark_uptodate().
> I am not sure what you mean by "cannot possibly be correct", and
> whether it's referring to kvm_gmem_populate() in general or the
> callback in sev_gmem_post_populate().

Doing fallocate() before KVM_SEV_SNP_LAUNCH_UPDATE will cause the latter to fail.
That likely works for QEMU, at least for now, but it's unnecessarily restrictive
and IMO incorrect/wrong.

E.g. a more convoluted, fallocate() + PUNCH_HOLE + KVM_SEV_SNP_LAUNCH_UPDATE will
work (I think?  AFAICT adding and removing pages directly to/from the RMP doesn't
affect SNP's measurement, only pages that are added via SNP_LAUNCH_UPDATE affect
the measurement).

Punting the sanity check to vendor code is also gross and will make it harder to
provide a consistent, unified ABI for all architectures.  E.g. SNP returns -EINVAL
if the page is already assigned, which is quite misleading.

> The change below looks like just an optimization to me, which
> suggests that I'm missing something glaring.

I really dislike @prepare.  There are two paths that should actually initialize
the contents of the folio, and they are mutually exclusive and have meaningfully
different behavior.  Faulting in memory via kvm_gmem_get_pfn() explicitly zeros
the folio _if necessary_, whereas kvm_gmem_populate() initializes the folio with
user-provided data _and_ requires that the folio be !uptodate.

If we fix the above oddity where fallocate() initializes memory, then there's
no need to try and handle the initialization in a common chokepoint as the two
relevant paths will naturally have unique code.

The below is also still suboptimal for TDX, as KVM will zero the memory and then
TDX-module will also zero memory on PAGE.AUGA.

And I find SNP to be the odd one.  IIUC, the ASP (the artist formerly known as
the PSP) doesn't provide any guarantees about the contents of a page that is
assigned to a guest without bouncing through SNP_LAUNCH_UPDATE.  It'd be nice to
explicitly document that somewhere in the SNP code. E.g. if guest_memfd invokes
a common kvm_gmem_initialize_folio() or whatever, then SNP's implementation can
clearly capture that KVM zeros the page to protect the _host_ data.

> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index d4206e53a9c81..a0417ef5b86eb 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -52,37 +52,39 @@ static int kvm_gmem_prepare_folio(struct inode *inode, pgoff_t index, struct fol
>  static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index, bool prepare)
>  {
>  	struct folio *folio;
> +	int r;
>  	/* TODO: Support huge pages. */
>  	folio = filemap_grab_folio(inode->i_mapping, index);
>  	if (IS_ERR(folio))
>  		return folio;
> -	/*
> -	 * Use the up-to-date flag to track whether or not the memory has been
> -	 * zeroed before being handed off to the guest.  There is no backing
> -	 * storage for the memory, so the folio will remain up-to-date until
> -	 * it's removed.
> -	 *
> -	 * TODO: Skip clearing pages when trusted firmware will do it when
> -	 * assigning memory to the guest.
> -	 */
> -	if (!folio_test_uptodate(folio)) {
> -		unsigned long nr_pages = folio_nr_pages(folio);
> -		unsigned long i;
> +	if (prepare) {
> +		/*
> +		 * Use the up-to-date flag to track whether or not the memory has
> +		 * been handed off to the guest.  There is no backing storage for
> +		 * the memory, so the folio will remain up-to-date until it's
> +		 * removed.
> +		 *
> +		 * Take the occasion of the first prepare operation to clear it.
> +		 */
> +		if (!folio_test_uptodate(folio)) {
> +			unsigned long nr_pages = folio_nr_pages(folio);
> +			unsigned long i;
> -		for (i = 0; i < nr_pages; i++)
> -			clear_highpage(folio_page(folio, i));
> +			for (i = 0; i < nr_pages; i++)
> +				clear_highpage(folio_page(folio, i));
> +		}
> +
> +		r = kvm_gmem_prepare_folio(inode, index, folio);
> +		if (r < 0)
> +			goto err_unlock_put;
>  		folio_mark_uptodate(folio);
> -	}
> -
> -	if (prepare) {
> -		int r =	kvm_gmem_prepare_folio(inode, index, folio);
> -		if (r < 0) {
> -			folio_unlock(folio);
> -			folio_put(folio);
> -			return ERR_PTR(r);
> +	} else {
> +		if (folio_test_uptodate(folio)) {
> +			r = -EEXIST;
> +			goto err_unlock_put;
>  		}
>  	}

  reply	other threads:[~2024-06-10 23:40 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-04 18:50 [PATCH 00/11] KVM: guest_memfd: New hooks and functionality for SEV-SNP and TDX Paolo Bonzini
2024-04-04 18:50 ` [PATCH 01/11] mm: Introduce AS_INACCESSIBLE for encrypted/confidential memory Paolo Bonzini
2024-04-29 13:14   ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 02/11] KVM: guest_memfd: Use AS_INACCESSIBLE when creating guest_memfd inode Paolo Bonzini
2024-04-29 13:15   ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 03/11] KVM: guest_memfd: pass error up from filemap_grab_folio Paolo Bonzini
2024-04-04 18:50 ` [PATCH 04/11] filemap: add FGP_CREAT_ONLY Paolo Bonzini
2024-04-25  5:52   ` Paolo Bonzini
2024-04-29 13:26     ` Vlastimil Babka
2024-04-04 18:50 ` [PATCH 05/11] KVM: guest_memfd: limit overzealous WARN Paolo Bonzini
2024-04-04 18:50 ` [PATCH 06/11] KVM: guest_memfd: Add hook for initializing memory Paolo Bonzini
2024-04-22 10:53   ` Xu Yilun
2024-05-07 16:17     ` Paolo Bonzini
2024-04-04 18:50 ` [PATCH 07/11] KVM: guest_memfd: extract __kvm_gmem_get_pfn() Paolo Bonzini
2024-04-09 23:35   ` Michael Roth
2024-04-24 22:34   ` Sean Christopherson
2024-04-24 22:59     ` Sean Christopherson
2024-04-04 18:50 ` [PATCH 08/11] KVM: guest_memfd: extract __kvm_gmem_punch_hole() Paolo Bonzini
2024-04-04 18:50 ` [PATCH 09/11] KVM: guest_memfd: Add interface for populating gmem pages with user data Paolo Bonzini
2024-04-22 14:44   ` Xu Yilun
2024-04-23 23:50   ` Isaku Yamahata
2024-04-24 22:24     ` Sean Christopherson
2024-04-25  1:12       ` Isaku Yamahata
2024-04-25  6:01         ` Paolo Bonzini
2024-04-25 16:00           ` Sean Christopherson
2024-04-25 16:51             ` Isaku Yamahata
2024-04-26  5:44               ` Paolo Bonzini
2024-04-26 17:15                 ` Isaku Yamahata
2024-04-26  5:41             ` Paolo Bonzini
2024-04-26 15:17               ` Sean Christopherson
2024-06-07 23:03                 ` Sean Christopherson
2024-06-10 21:48                   ` Paolo Bonzini
2024-06-10 22:26                     ` Paolo Bonzini
2024-06-10 23:40                       ` Sean Christopherson [this message]
2024-06-11  6:09                         ` Paolo Bonzini
2024-04-24 22:32   ` Sean Christopherson
2024-04-25  5:56     ` Paolo Bonzini
2024-04-04 18:50 ` [PATCH 10/11] KVM: guest_memfd: Add hook for invalidating memory Paolo Bonzini
2024-04-04 18:50 ` [PATCH 11/11] KVM: x86: Add gmem hook for determining max NPT mapping level Paolo Bonzini
2024-04-09 23:46   ` Michael Roth
2024-04-19 18:26     ` Isaku Yamahata
2024-04-22 14:52   ` Xu Yilun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZmeOxAtwfTsDCi1x@google.com \
    --to=seanjc@google.com \
    --cc=isaku.yamahata@intel.com \
    --cc=isaku.yamahata@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.