From: Yan Zhao <yan.y.zhao@intel.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: <seanjc@google.com>, <pbonzini@redhat.com>,
<dave.hansen@linux.intel.com>, <tglx@kernel.org>,
<mingo@redhat.com>, <bp@alien8.de>, <kas@kernel.org>,
<x86@kernel.org>, <linux-kernel@vger.kernel.org>,
<kvm@vger.kernel.org>, <linux-coco@lists.linux.dev>,
<kai.huang@intel.com>, <rick.p.edgecombe@intel.com>,
<yilun.xu@linux.intel.com>, <vannapurve@google.com>,
<ackerleytng@google.com>, <sagis@google.com>,
<binbin.wu@linux.intel.com>, <xiaoyao.li@intel.com>,
<isaku.yamahata@intel.com>
Subject: Re: [PATCH 1/2] x86/virt/tdx: Use PFN directly for mapping guest private memory
Date: Wed, 25 Mar 2026 17:10:56 +0800
Message-ID: <acOmoOP2fFUyOByC@yzhao56-desk.sh.intel.com>
In-Reply-To: <a14531ab-f069-41f9-8c5c-9fe6f28a9454@intel.com>
On Thu, Mar 19, 2026 at 11:05:09AM -0700, Dave Hansen wrote:
> On 3/18/26 17:57, Yan Zhao wrote:
> > Remove the completely unnecessary assumption that memory mapped into a TDX
> > guest is backed by refcounted struct page memory. From KVM's point of view,
> > TDH_MEM_PAGE_ADD and TDH_MEM_PAGE_AUG are glorified writes to PTEs, so they
> > have no business placing requirements on how KVM and guest_memfd manage
> > memory.
>
> I think this goes a bit too far.
>
> It's one thing to say that it's more convenient for KVM to stick with
> pfns because it's what KVM uses now. Or, that the goals of using 'struct
> page' can be accomplished other ways. It's quite another to say what
> other bits of the codebase have "business" doing.
I explained the background in the cover letter, intending to add a link to it
in the final patches when they are merged.
I can also expand the patch logs with that background.
> Sean, can we tone this down a _bit_ to help guide folks in the future?
Sorry for being lazy and not expanding the patch logs from Sean's original
patch tagged "DO NOT MERGE".
> > Rip out the misguided struct page assumptions/constraints and instead have
>
> Could we maybe tone down the editorializing a bit, please? Folks can
> have honest disagreements about this stuff while not being "misguided".
You are right. I need to make it clear.
> > the two SEAMCALL wrapper APIs take PFN directly. This ensures that for
> > future huge page support in S-EPT, the kernel doesn't pick up even worse
> > assumptions like "a hugepage must be contained in a single folio".
>
> I don't really understand what this is saying.
>
> Is the concern that KVM might want to set up page tables for memory that
> differ from how it was allocated? I'm a bit worried that this assumes
> something about folios that doesn't always hold.
>
> I think the hugetlbfs gigantic support uses folios in at least a few
> spots today.
Below is the background of this problem. I'll try to include a short summary in
the next version's patch logs.
In TDX huge page v3, I added logic that assumes PFNs are contained in a single
folio in both TDX's map/unmap paths [1][2]:
	if (start_idx + npages > folio_nr_pages(folio))
		return TDX_OPERAND_INVALID;
This not only assumes the PFNs have a corresponding struct page, but also
assumes they are contained in a single folio: with only base_page + npages,
it's not easy to get the i-th page's pointer without first ensuring the pages
are contained in a single folio.
This should work since current KVM/guest_memfd only allocates memory with
struct page and maps it into S-EPT at a level lower than or equal to the
backend folio size. That is, a single S-EPT mapping cannot span multiple backend
folios.
However, Ackerley's 1G hugetlb-based gmem splits the backend folio [3] ahead of
splitting/unmapping it from S-EPT [4], due to the implementation limitations
mentioned at [5]. This triggers the warning in [1] when TDX's unmap callback is
invoked.
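To make the failure concrete, here is a minimal userspace sketch of the check
quoted above. It is only a model under stated assumptions: struct mock_folio,
mapping_fits_in_folio() and the PAGES_PER_* constants are hypothetical names
that exist only in this example, and the backend folio is assumed to have been
split down to 4K folios while the S-EPT mapping is still 2M.

```c
#include <stdbool.h>

#define PAGES_PER_2M 512UL		/* 2M / 4K */
#define PAGES_PER_1G (512UL * 512UL)	/* 1G / 4K */

/* Hypothetical stand-in for a folio: only its size in 4K pages matters here. */
struct mock_folio {
	unsigned long nr_pages;
};

/*
 * Models the v3 check: a mapping of npages starting at page index start_idx
 * within the folio is accepted only if it does not run past the folio's end.
 */
static bool mapping_fits_in_folio(const struct mock_folio *folio,
				  unsigned long start_idx,
				  unsigned long npages)
{
	return start_idx + npages <= folio->nr_pages;
}
```

With a 1G folio, a 2M mapping at index 0 passes the check. Once the folio has
been split (nr_pages == 1 in this model), the same 2M range no longer fits in
any single folio, so the check rejects it even though the PFNs themselves are
unchanged.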
Moreover, Google's future gmem may manage PFNs independently, so TDX's private
memory may have no corresponding struct page, and KVM would map it via
VM_PFNMAP, similar to mapping pass-through MMIOs or other PFNs without struct
page or with non-refcounted struct page in normal VMs. Given that KVM has
suffered a lot from handling VM_PFNMAP memory for non-refcounted struct page [6]
in normal VMs, and that TDX mapping/unmapping callbacks have no semantic reason
to dictate where and how KVM/guest_memfd should allocate and map memory, Sean
suggested dropping the unnecessary assumption that memory to be mapped/unmapped
to/from S-EPT must be contained in a single folio (though he didn't object to
reasonable sanity checks on whether the PFNs are TDX convertible).
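As an illustration of what a PFN-only sanity check could look like, below is a
rough userspace sketch that needs no struct page or folio at all. Everything in
it is an assumption made for this example: struct mock_convertible_range and
pfn_range_is_convertible() are hypothetical names, not existing kernel
interfaces, and the sketch requires the whole range to sit inside one
convertible range rather than merging adjacent ranges.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor of one TDX-convertible PFN range, [start, end). */
struct mock_convertible_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Returns true if [pfn, pfn + npages) lies entirely within a single
 * convertible range. Only PFN arithmetic is used.
 */
static bool pfn_range_is_convertible(const struct mock_convertible_range *ranges,
				     size_t nr_ranges,
				     unsigned long pfn, unsigned long npages)
{
	size_t i;

	/* Reject empty ranges and ranges that wrap around. */
	if (!npages || pfn + npages < pfn)
		return false;

	for (i = 0; i < nr_ranges; i++) {
		if (pfn >= ranges[i].start_pfn &&
		    pfn + npages <= ranges[i].end_pfn)
			return true;
	}

	return false;
}
```

A check of this shape depends only on where the memory is, not on how
KVM/guest_memfd allocated or refcounted it.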
[1] https://lore.kernel.org/kvm/20260106101929.24937-1-yan.y.zhao@intel.com
[2] https://lore.kernel.org/kvm/20260106101826.24870-1-yan.y.zhao@intel.com
[3] https://github.com/googleprodkernel/linux-cc/blob/wip-gmem-conversions-hugetlb-restructuring-12-08-25/virt/kvm/guest_memfd.c#L909
[4] https://github.com/googleprodkernel/linux-cc/blob/wip-gmem-conversions-hugetlb-restructuring-12-08-25/virt/kvm/guest_memfd.c#L918
[5] https://lore.kernel.org/kvm/diqzqzrzdfvh.fsf@google.com/
[6] https://lore.kernel.org/all/20241010182427.1434605-1-seanjc@google.com