From: Kishen Maloor <kishen.maloor@intel.com>
To: Xu Yilun <yilun.xu@linux.intel.com>, <kas@kernel.org>,
	<djbw@kernel.org>, <rick.p.edgecombe@intel.com>, <x86@kernel.org>,
	<peter.fang@intel.com>
Cc: <linux-coco@lists.linux.dev>, <linux-kernel@vger.kernel.org>,
	<kvm@vger.kernel.org>, <sohil.mehta@intel.com>,
	<yilun.xu@intel.com>, <baolu.lu@linux.intel.com>,
	<zhenzhong.duan@intel.com>, <xiaoyao.li@intel.com>
Subject: Re: [PATCH 02/15] x86/virt/tdx: Add extra memory to TDX Module for Extensions
Date: Sat, 6 Jun 2026 21:38:00 -0700
Message-ID: <f44d997e-49fe-4d48-84e3-e260bb9d3164@intel.com>
In-Reply-To: <20260522034128.3144354-3-yilun.xu@linux.intel.com>

On 5/21/26 8:41 PM, Xu Yilun wrote:
> TDX Module introduces a new concept called "TDX Module Extensions" to
> support long running / hard-irq preemptible flows inside. This makes TDX
> Module capable of handling complex tasks through "Extension SEAMCALLs".
> Adding more memory to TDX Module is the first step to enable Extensions.
> 
> Currently, TDX Module memory use is relatively static. But, the
> Extensions need to use memory more dynamically. While 'static' here
> means the kernel provides necessary amount of memory to TDX Module for
> its basic functionalities, 'dynamic' means extra memory is needed only
> if new add-on features are to be enabled. So add a new memory feeding
> process backed by a new SEAMCALL TDH.EXT.MEM.ADD.
> 
> The process is mostly the same as adding PAMT. The kernel queries TDX
> Module how much memory needed, allocates it, hands it over, and never
> gets it back.
> 
> TDH.EXT.MEM.ADD uses a new parameter type HPA_LIST_INFO to provide
> control (private) pages to TDX Module. This type represents a list of
> pages for TDX Module to access. It needs a 'root page' which contains
> the list of HPAs of the pages. It collapses the HPA of the root page
> and the number of valid HPAs into a 64 bit raw value for SEAMCALL
> parameters. The root page is always a medium, TDX Module never keeps
> the root page.
> 
> Introduce a tdx_clflush_hpa_list() helper to flush shared cache before
> SEAMCALL, to avoid shared cache writeback damaging these private pages.
> 
> For now, TDX Module Extensions consumes relatively large amount of
> memory (~50MB). Use contiguous page allocation to avoid permanently
> fragment too much memory. Print the allocation amount on TDX Module
> Extensions initialization for visibility.
> 
> Co-developed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
>   arch/x86/virt/vmx/tdx/tdx.h |   1 +
>   arch/x86/virt/vmx/tdx/tdx.c | 118 ++++++++++++++++++++++++++++++++++++
>   2 files changed, 119 insertions(+)
> 
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index a5eec8e3cc71..2335f88bbb10 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -46,6 +46,7 @@
>   #define TDH_PHYMEM_PAGE_WBINVD		41
>   #define TDH_VP_WR			43
>   #define TDH_SYS_CONFIG			45
> +#define TDH_EXT_MEM_ADD			61
>   #define TDH_SYS_DISABLE			69
>   
>   /*
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index c0c6281b08a5..622399d8da68 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -31,6 +31,7 @@
>   #include <linux/syscore_ops.h>
>   #include <linux/idr.h>
>   #include <linux/kvm_types.h>
> +#include <linux/bitfield.h>
>   #include <asm/page.h>
>   #include <asm/special_insns.h>
>   #include <asm/msr-index.h>
> @@ -1179,6 +1180,123 @@ static __init int init_tdmrs(struct tdmr_info_list *tdmr_list)
>   	return 0;
>   }
>   
> +static void tdx_clflush_hpa_list(struct page *root, unsigned int nr_pages)
> +{
> +	u64 *entries = page_to_virt(root);
> +	int i;
> +
> +	for (i = 0; i < nr_pages; i++)
> +		clflush_cache_range(__va(entries[i]), PAGE_SIZE);
> +}
> +
> +#define HPA_LIST_INFO_FIRST_ENTRY	GENMASK_U64(11, 3)
> +#define HPA_LIST_INFO_PFN		GENMASK_U64(51, 12)
> +#define HPA_LIST_INFO_LAST_ENTRY	GENMASK_U64(63, 55)
> +
> +static u64 to_hpa_list_info(struct page *root, unsigned int nr_pages)
> +{
> +	return FIELD_PREP(HPA_LIST_INFO_FIRST_ENTRY, 0) |
> +	       FIELD_PREP(HPA_LIST_INFO_PFN, page_to_pfn(root)) |
> +	       FIELD_PREP(HPA_LIST_INFO_LAST_ENTRY, nr_pages - 1);
> +}
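
For reference while reading the masks above, here is a minimal user-space
sketch of the same packing. It is not kernel code: hpa_list_info() and the
SKETCH_* names are hypothetical stand-ins for to_hpa_list_info() and
FIELD_PREP(), and only the bit positions are taken from the patch. It
assumes nr_pages is in the range 1..512.

```c
#include <stdint.h>

/* Bit positions copied from the HPA_LIST_INFO_* masks in the patch. */
#define SKETCH_FIRST_ENTRY_SHIFT	3	/* bits 11:3  */
#define SKETCH_PFN_SHIFT		12	/* bits 51:12 */
#define SKETCH_LAST_ENTRY_SHIFT		55	/* bits 63:55 */
#define SKETCH_ENTRY_MASK		0x1FFULL		/* 9-bit field  */
#define SKETCH_PFN_MASK			0xFFFFFFFFFFULL		/* 40-bit field */

/*
 * Hypothetical stand-in for to_hpa_list_info(): pack the root page's PFN
 * and the index of the last valid entry into one 64-bit value. The first
 * entry index is always 0, as in the patch. Assumes 1 <= nr_pages <= 512.
 */
static uint64_t hpa_list_info(uint64_t root_pfn, unsigned int nr_pages)
{
	uint64_t first = 0;
	uint64_t last = (uint64_t)nr_pages - 1;

	return ((first & SKETCH_ENTRY_MASK) << SKETCH_FIRST_ENTRY_SHIFT) |
	       ((root_pfn & SKETCH_PFN_MASK) << SKETCH_PFN_SHIFT) |
	       ((last & SKETCH_ENTRY_MASK) << SKETCH_LAST_ENTRY_SHIFT);
}
```

With a full root page (512 entries) the last-entry field is 511, which fills
all nine bits of 63:55, so one root page is the largest batch this encoding
can describe.
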
> +
> +static int tdx_ext_mem_add(struct page *root, unsigned int nr_pages)
> +{
> +	struct tdx_module_args args = {
> +		.rcx = to_hpa_list_info(root, nr_pages),
> +	};
> +	u64 r;
> +
> +	tdx_clflush_hpa_list(root, nr_pages);
> +
> +	do {
> +		/*
> +		 * TDH_EXT_MEM_ADD is designed to use output parameter RCX to
> +		 * override/update input parameter RCX, so the caller doesn't
> +		 * have to do manual parameter update on retry call.
> +		 */
> +		r = seamcall_ret(TDH_EXT_MEM_ADD, &args);
> +	} while (r == TDX_INTERRUPTED_RESUMABLE);

The retry loop compares the full return value against
TDX_INTERRUPTED_RESUMABLE. Should it mask with TDX_SEAMCALL_STATUS_MASK
first, in case the module sets any of the lower detail bits?

Ditto for TDH.EXT.INIT in patch 3.
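
To illustrate the concern, a small user-space sketch. The helper name
tdx_status_is() is hypothetical, and the two constant values are what I
believe the existing definitions to be, so treat them as assumptions and
check them against the headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; verify against the kernel's TDX headers. */
#define TDX_SEAMCALL_STATUS_MASK	0xFFFFFFFF00000000ULL
#define TDX_INTERRUPTED_RESUMABLE	0x8000000300000000ULL

/*
 * Hypothetical helper: compare only the status class in the upper 32 bits
 * and ignore whatever detail the module reports in the lower 32 bits.
 */
static bool tdx_status_is(uint64_t r, uint64_t status)
{
	return (r & TDX_SEAMCALL_STATUS_MASK) == status;
}
```

The loop condition would then read
"while (tdx_status_is(r, TDX_INTERRUPTED_RESUMABLE))", which keeps retrying
even if a detail bit is set. With the bare "==" such a value would fall out
of the loop and be reported as -EFAULT.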

> +
> +	if (r != TDX_SUCCESS)
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int tdx_ext_mem_setup(void)
> +{
> +	unsigned int nr_pages;
> +	struct page *page;
> +	u64 *root;
> +	unsigned int i;
> +	int ret;
> +
> +	nr_pages = tdx_sysinfo.ext.memory_pool_required_pages;
> +	/*
> +	 * memory_pool_required_pages == 0 means no need to add pages,
> +	 * skip the memory setup.
> +	 */
> +	if (!nr_pages)
> +		return 0;
> +
> +	root = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +	if (!root)
> +		return -ENOMEM;
> +
> +	page = alloc_contig_pages(nr_pages, GFP_KERNEL, numa_mem_id(),
> +				  &node_online_map);

The SEAMCALL takes a scatter list (HPA_LIST_INFO), so the module
doesn't require contiguity. If the goal is just to avoid scattering
pages across many 2MB regions, dense, 2MB-aligned allocations should
be able to achieve that without a single pool-wide contiguous block.

> +	if (!page) {
> +		ret = -ENOMEM;
> +		goto out_free_root;
> +	}
> +
> +	for (i = 0; i < nr_pages;) {
> +		unsigned int nents = min(nr_pages - i,
> +					 PAGE_SIZE / sizeof(*root));
> +		int j;
> +
> +		for (j = 0; j < nents; j++)
> +			root[j] = page_to_phys(page + i + j);

Would it be better to allocate per-batch (i.e. one root page's worth
at a time) rather than the whole pool up front?

That way an intermediate TDH.EXT.MEM.ADD failure wouldn't leak
all nr_pages. Also, a batch is up to 512 pages (= 2MB) and its allocation
could be 2MB-aligned, addressing your fragmentation concern.
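
The batch arithmetic I have in mind, as a user-space sketch rather than
kernel code. All names here are hypothetical, and 4KB pages with 8-byte HPA
entries are assumed, which gives 512 entries per root page as in the loop
above.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_ENTRIES_PER_ROOT	((unsigned int)(SKETCH_PAGE_SIZE / sizeof(uint64_t)))

/* Number of TDH.EXT.MEM.ADD calls needed to hand over nr_pages. */
static unsigned int nr_batches(unsigned int nr_pages)
{
	return (nr_pages + SKETCH_ENTRIES_PER_ROOT - 1) / SKETCH_ENTRIES_PER_ROOT;
}

/* Pages in batch number 'batch' (0-based); only the last one may be short. */
static unsigned int batch_len(unsigned int nr_pages, unsigned int batch)
{
	unsigned int left = nr_pages - batch * SKETCH_ENTRIES_PER_ROOT;

	return left < SKETCH_ENTRIES_PER_ROOT ? left : SKETCH_ENTRIES_PER_ROOT;
}

/*
 * With per-batch allocation, pages that have been allocated by the time
 * batch 'failed' returns an error. Later batches were never allocated.
 */
static unsigned int pages_allocated_at_failure(unsigned int nr_pages,
					       unsigned int failed)
{
	unsigned int upto = (failed + 1) * SKETCH_ENTRIES_PER_ROOT;

	return upto < nr_pages ? upto : nr_pages;
}
```

Taking the ~50MB figure from the commit message at face value, that is
12800 pages, or 25 batches of 2MB each. A failure in the third batch would
then strand 1536 pages instead of all 12800.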

> +
> +		ret = tdx_ext_mem_add(virt_to_page(root), nents);
> +		/*
> +		 * No SEAMCALLs to reclaim the added pages. For simple error
> +		 * handling, leak all pages.
> +		 */
> +		WARN_ON_ONCE(ret);
> +		if (ret)
> +			break;
> +
> +		i += nents;
> +	}
> +
> +	/*
> +	 * Extensions memory can't be reclaimed once added, print out the
> +	 * amount, stop tracking it and free the root page, no matter success
> +	 * or failure.
> +	 */
> +	pr_info("%lu KB allocated for TDX Module Extensions\n",
> +		nr_pages * PAGE_SIZE / 1024);
> +
> +out_free_root:
> +	kfree(root);
> +
> +	return ret;
> +}
> +
> +static int __maybe_unused init_tdx_ext(void)

Could this be named init_tdx_extensions() instead to disambiguate
from tdx_ext_init() in patch 3?

> +{
> +	if (!(tdx_sysinfo.features.tdx_features0 & TDX_FEATURES0_EXT))
> +		return 0;
> +
> +	/* No feature requires TDX Module Extensions. */
> +	if (!tdx_sysinfo.ext.ext_required)
> +		return 0;
> +
> +	return tdx_ext_mem_setup();
> +}
> +
>   static __init int init_tdx_module(void)
>   {
>   	int ret;

