Re: [RFC PATCH] KVM: TDX: Decouple TDX init mem region from kvm_gmem_populate()

linux-kernel.vger.kernel.org archive mirror

From: Vishal Annapurve <vannapurve@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Ira Weiny <ira.weiny@intel.com>, Yan Zhao <yan.y.zhao@intel.com>,
	 Michael Roth <michael.roth@amd.com>,
	pbonzini@redhat.com, kvm@vger.kernel.org,
	 linux-kernel@vger.kernel.org, rick.p.edgecombe@intel.com,
	kai.huang@intel.com,  adrian.hunter@intel.com,
	reinette.chatre@intel.com, xiaoyao.li@intel.com,
	 tony.lindgren@intel.com, binbin.wu@linux.intel.com,
	dmatlack@google.com,  isaku.yamahata@intel.com, david@redhat.com,
	ackerleytng@google.com,  tabba@google.com, chao.p.peng@intel.com
Subject: Re: [RFC PATCH] KVM: TDX: Decouple TDX init mem region from kvm_gmem_populate()
Date: Tue, 5 Aug 2025 07:30:17 -0700
Message-ID: <CAGtprH-VcO4VJkgRZJPS8SEpdO6Xsjmw3CeGYCFchjkdROoMLg@mail.gmail.com>
In-Reply-To: <CAGtprH9ELoYmwA+brSx-kWH5qSK==u8huW=4otEZ5evu_GTvtQ@mail.gmail.com>

On Mon, Aug 4, 2025 at 6:20 PM Vishal Annapurve <vannapurve@google.com> wrote:
>
> On Mon, Aug 4, 2025 at 5:22 PM Sean Christopherson <seanjc@google.com> wrote:
> > > > > > IIUC, the suggestion in the link is to abandon kvm_gmem_populate().
> > > > > > For TDX, it means adopting the approach in this RFC patch, right?
> > > > > Yes, IMO this RFC is following the right approach as posted.
> >
> > I don't think we want to abandon kvm_gmem_populate().  Unless I'm missing something,
> > SNP has the same AB-BA problem as TDX.  The copy_from_user() on @src can trigger
> > a page fault, and resolving the page fault may require taking mm->mmap_lock.
> >
> > Fundamentally, TDX and SNP are doing the same thing: copying from source to guest
> > memory.  The only differences are in the mechanics of the copy+encrypt, everything
> > else is the same.  I.e. I don't expect that we'll find a magic solution that works
> > well for one and not the other.
> >
> > I also don't want to end up with wildly different ABI for SNP vs. everything else.
> > E.g. cond_resched() needs to be called if the to-be-initialized range is large,
> > which means dropping mmu_lock between pages, whereas kvm_gmem_populate() can
> > yield without dropping invalidate_lock, which means that the behavior of populating
> > guest_memfd memory will be quite different with respect to guest_memfd operations.
>
> I would think that TDX/CCA VMs [1] will run into similar behavior
> of needing to simulate stage2 faults, i.e. KVM will end up picking up
> and dropping mmu_lock for each page anyway, at least for these two
> platforms.
>
> [1] https://lore.kernel.org/kvm/20250611104844.245235-5-steven.price@arm.com/
> (rmi_rtt_create())
>
> >
> > Pulling in the RFC text:
> >
> > : I think the only different scenario is SNP, where the host must write
> > : initial contents to guest memory.
> > :
> > : Will this work for all cases CCA/SNP/TDX during initial memory
> > : population from within KVM:
> > : 1) Simulate stage2 fault
> > : 2) Take a KVM mmu read lock
> >
> > Doing all of this under mmu_lock is pretty much a non-starter.

Looking closer at the CPU <-> PSP communication, which is not implemented
to work within an atomic context, I now agree that this wouldn't work
for SNP VMs.


> >
> > : 3) Check that the needed gpa is mapped in EPT/NPT entries
> >
> > No, KVM's page tables are not the source of truth.  S-EPT is a special snowflake,
> > and I'd like to avoid foisting the same requirements on NPT.
>
> I agree this would be a new requirement.
>
> >
> > : 4) For SNP, if src != null, make the target pfn to be shared, copy
> > : contents and then make the target pfn back to private.
> >
> > Copying from userspace under spinlock (rwlock) is illegal, as accessing userspace
> > memory might_fault() and thus might_sleep().
>
> I would think that a combination of get_user_pages() and
> kmap_local_pfn() will prevent this situation of might_fault().
>
> >
> > : 5) For TDX, if src != null, pass the same address for source and
> > : target (likely this works for CCA too)
> > : 6) Invoke appropriate memory encryption operations
> > : 7) measure contents
> > : 8) release the KVM mmu read lock
> > :
> > : If this scheme works, ideally we should also not call RMP table
> > : population logic from guest_memfd, but from KVM NPT fault handling
> > : logic directly (a bit of cosmetic change).
> >
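For illustration only, the ordering behind the get_user_pages() +
kmap_local_pfn() suggestion quoted above (do the part that can fault
before taking the lock, then copy under the lock) can be modeled in
userspace roughly as below. This is a sketch under stated assumptions,
not kernel code: every name in it (guest_page_model, stage_source(),
populate_locked(), populate_page()) is hypothetical, a pthread mutex
stands in for mmu_lock, and a staging copy stands in for pinning the
source page, which is a simpler technique than what GUP actually does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_MODEL 4096

/* Hypothetical model of one guest page guarded by a lock (stand-in for mmu_lock). */
struct guest_page_model {
	pthread_mutex_t lock;
	unsigned char data[PAGE_SIZE_MODEL];
	int populated;
};

static int guest_page_model_init(struct guest_page_model *gp)
{
	memset(gp->data, 0, sizeof(gp->data));
	gp->populated = 0;
	return pthread_mutex_init(&gp->lock, NULL);
}

/*
 * Phase 1, no lock held: read the source into a resident staging buffer.
 * Any fault on @src is taken here. Assumption: this copy models "pin the
 * source first"; real pinning would not copy the data.
 */
static int stage_source(unsigned char *staging, const unsigned char *src,
			size_t len)
{
	if (!src || len > PAGE_SIZE_MODEL)
		return -1;

	memset(staging, 0, PAGE_SIZE_MODEL);
	memcpy(staging, src, len);
	return 0;
}

/*
 * Phase 2, lock held: copy from the staging buffer only. Nothing in this
 * section touches @src, so nothing in it can fault on user memory.
 */
static int populate_locked(struct guest_page_model *gp,
			   const unsigned char *staging)
{
	int ret = -1;

	assert(staging != NULL);

	pthread_mutex_lock(&gp->lock);
	if (!gp->populated) {
		memcpy(gp->data, staging, PAGE_SIZE_MODEL);
		gp->populated = 1;
		ret = 0;
	}
	pthread_mutex_unlock(&gp->lock);

	return ret;
}

/* Stage first, then take the lock: returns 0 on success, -1 otherwise. */
static int populate_page(struct guest_page_model *gp,
			 const unsigned char *src, size_t len)
{
	unsigned char staging[PAGE_SIZE_MODEL];

	if (stage_source(staging, src, len))
		return -1;

	return populate_locked(gp, staging);
}
```

A caller would run guest_page_model_init() once and then populate_page()
per page, so the lock is picked up and dropped for each page. Note that
the model says nothing about the SNP case, where the firmware call itself
cannot run in atomic context.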
