From: Sean Christopherson <seanjc@google.com>
To: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: bp@alien8.de, chao.gao@intel.com, dave.hansen@intel.com,
	 isaku.yamahata@intel.com, kai.huang@intel.com, kas@kernel.org,
	 kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org,  mingo@redhat.com,
	pbonzini@redhat.com, tglx@linutronix.de,  vannapurve@google.com,
	x86@kernel.org, yan.y.zhao@intel.com,  xiaoyao.li@intel.com,
	binbin.wu@intel.com
Subject: Re: [PATCH v4 11/16] KVM: TDX: Add x86 ops for external spt cache
Date: Fri, 16 Jan 2026 16:53:57 -0800
Message-ID: <aWrdpZCCDDAffZRM@google.com>
In-Reply-To: <20251121005125.417831-12-rick.p.edgecombe@intel.com>

On Thu, Nov 20, 2025, Rick Edgecombe wrote:
> Move mmu_external_spt_cache behind x86 ops.
> 
> In the mirror/external MMU concept, the KVM MMU manages a non-active EPT
> tree for private memory (the mirror). The actual active EPT tree for the
> private memory is protected inside the TDX module. Whenever the mirror EPT
> is changed, it needs to call out into one of a set of x86 ops that
> implement various update operations with TDX specific SEAMCALLs and other
> tricks. These implementations operate on the TDX S-EPT (the external).
> 
> In reality these external operations are designed narrowly with respect to
> TDX particulars. On the surface, what TDX specific things are happening to
> fulfill these update operations are mostly hidden from the MMU, but there
> is one particular area of interest where some details leak through.
> 
> The S-EPT needs pages to use for the S-EPT page tables. These page tables
> need to be allocated before taking the mmu lock, like all the rest. So the
> KVM MMU pre-allocates pages for TDX to use for the S-EPT in the same place
> where it pre-allocates the other page tables. It’s not too bad and fits
> nicely with the others.
> 
> However, Dynamic PAMT will need even more pages for the same operations.
> Further, these pages will need to be handed to the arch/x86 side which used
> them for DPAMT updates, which is hard for the existing KVM based cache.
> The details living in core MMU code start to add up.
> 
> So in preparation to make it more complicated, move the external page
> table cache into TDX code by putting it behind some x86 ops. Have one for
> topping up and one for allocation. Don’t go so far to try to hide the
> existence of external page tables completely from the generic MMU, as they
> are currently stored in their mirror struct kvm_mmu_page and it’s quite
> handy.
> 
> To plumb the memory cache operations through tdx.c, export some of
> the functions temporarily. This will be removed in future changes.
> 
> Acked-by: Kiryl Shutsemau <kas@kernel.org>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---

NAK.  I kinda sorta get why you did this?  But the pages KVM uses for page tables
are KVM's, not to be mixed with PAMT pages.

Eww.  Definitely a hard "no".  In tdp_mmu_alloc_sp_for_split(), the allocation
comes from KVM:

	if (mirror) {
		sp->external_spt = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
		if (!sp->external_spt) {
			free_page((unsigned long)sp->spt);
			kmem_cache_free(mmu_page_header_cache, sp);
			return NULL;
		}
	}

But then in kvm_tdp_mmu_map(), via kvm_mmu_alloc_external_spt(), the allocation
comes from get_tdx_prealloc_page():

  static void *tdx_alloc_external_fault_cache(struct kvm_vcpu *vcpu)
  {
	struct page *page = get_tdx_prealloc_page(&to_tdx(vcpu)->prealloc);

	if (WARN_ON_ONCE(!page))
		return (void *)__get_free_page(GFP_ATOMIC | __GFP_ACCOUNT);

	return page_address(page);
  }

But then regardless of where the page came from, KVM frees it.  Seriously.

  static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
  {
	free_page((unsigned long)sp->external_spt);  <=====
	free_page((unsigned long)sp->spt);
	kmem_cache_free(mmu_page_header_cache, sp);
  }
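
To make the asymmetry concrete, here is a minimal userspace sketch, not kernel
code and not a proposed fix.  Every name in it (model_page, model_pool,
kvm_alloc, pool_topup, pool_get, model_free) is hypothetical, and calloc()/free()
stand in for the page allocator.  It assumes each page records where it came
from, so the free path can hand the page back to its owner instead of blindly
calling free_page() on everything.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model: who owns a given page. */
enum page_origin { ORIGIN_KVM, ORIGIN_TDX_PREALLOC };

struct model_page {
	enum page_origin origin;
	struct model_page *next;	/* links pages sitting in a pool */
};

/* Stand-in for a per-vCPU pre-allocation pool. */
struct model_pool {
	struct model_page *head;
	int cnt;
};

/* Stand-in for get_zeroed_page(): a page owned by "KVM". */
struct model_page *kvm_alloc(void)
{
	struct model_page *p = calloc(1, sizeof(*p));

	if (p)
		p->origin = ORIGIN_KVM;
	return p;
}

/* Fill the pool until it holds at least @min pages. */
int pool_topup(struct model_pool *pool, int min)
{
	while (pool->cnt < min) {
		struct model_page *p = calloc(1, sizeof(*p));

		if (!p)
			return -1;
		p->origin = ORIGIN_TDX_PREALLOC;
		p->next = pool->head;
		pool->head = p;
		pool->cnt++;
	}
	return 0;
}

/* Take one page out of the pool, or NULL if the pool is empty. */
struct model_page *pool_get(struct model_pool *pool)
{
	struct model_page *p = pool->head;

	if (!p)
		return NULL;
	pool->head = p->next;
	p->next = NULL;
	pool->cnt--;
	return p;
}

/* Free path that dispatches on origin instead of assuming one owner. */
void model_free(struct model_pool *pool, struct model_page *p)
{
	assert(pool != NULL);
	if (!p)
		return;
	if (p->origin == ORIGIN_TDX_PREALLOC) {
		/* Pool-owned pages go back to the pool. */
		p->next = pool->head;
		pool->head = p;
		pool->cnt++;
	} else {
		/* KVM-owned pages go back to the allocator. */
		free(p);
	}
}
```

In this model a page taken with pool_get() and released with model_free()
returns to the pool, while a kvm_alloc() page is simply freed.  The code quoted
above has no such distinction: both kinds of page end up in the same
free_page() call.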

Oh, and the hugepage series also fumbles its topup (why there's yet another
topup API, I have no idea).

  static int tdx_topup_vm_split_cache(struct kvm *kvm, enum pg_level level)
  {
	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
	struct tdx_prealloc *prealloc = &kvm_tdx->prealloc_split_cache;
	int cnt = tdx_min_split_cache_sz(kvm, level);

	while (READ_ONCE(prealloc->cnt) < cnt) {
		struct page *page = alloc_page(GFP_KERNEL);  <==== should be GFP_KERNEL_ACCOUNT

		if (!page)
			return -ENOMEM;

		spin_lock(&kvm_tdx->prealloc_split_cache_lock);
		list_add(&page->lru, &prealloc->page_list);
		prealloc->cnt++;
		spin_unlock(&kvm_tdx->prealloc_split_cache_lock);
	}

	return 0;
  }
