From: Dave Hansen <dave.hansen@intel.com>
To: Xu Yilun <yilun.xu@linux.intel.com>,
linux-coco@lists.linux.dev, linux-pci@vger.kernel.org
Cc: chao.gao@intel.com, dave.jiang@intel.com,
baolu.lu@linux.intel.com, yilun.xu@intel.com,
zhenzhong.duan@intel.com, kvm@vger.kernel.org,
rick.p.edgecombe@intel.com, dave.hansen@linux.intel.com,
dan.j.williams@intel.com, kas@kernel.org, x86@kernel.org
Subject: Re: [PATCH v1 08/26] x86/virt/tdx: Add tdx_enable_ext() to enable of TDX Module Extensions
Date: Mon, 17 Nov 2025 09:34:30 -0800 [thread overview]
Message-ID: <cfcfb160-fcd2-4a75-9639-5f7f0894d14b@intel.com> (raw)
In-Reply-To: <20251117022311.2443900-9-yilun.xu@linux.intel.com>
I really dislike subjects like this. I honestly don't need to know what
the function's name is. The _rest_ of the subject is just words that
don't tell me _anything_ about what this patch does.
In this case, I suspect it's because the patch is doing about 15
discrete things and it's impossible to write a subject that's anything
other than some form of:
x86/virt/tdx: Implement $FOO by making miscellaneous changes
So it's a symptom of the real disease.
On 11/16/25 18:22, Xu Yilun wrote:
> From: Zhenzhong Duan <zhenzhong.duan@intel.com>
>
> Add a kAPI tdx_enable_ext() for kernel to enable TDX Module Extensions
> after basic TDX Module initialization.
>
> The extension initialization uses the new TDH.EXT.MEM.ADD and
> TDX.EXT.INIT seamcalls. TDH.EXT.MEM.ADD add pages to a shared memory
> pool for extensions to consume.
"Shared memory" is an exceedingly unfortunate term to use here. They're
TDX private memory, right?
> The number of pages required is
> published in the MEMORY_POOL_REQUIRED_PAGES field from TDH.SYS.RD. Then
> on TDX.EXT.INIT, the extensions consume from the pool and initialize.
This all seems backwards to me. I don't need to read the ABI names in
the changelog. I *REALLY* don't need to read the TDX documentation names
for them. If *ANYTHING* these names should be trivialy mappable to the
patch that sits below this changelog. They're not.
This changelog _should_ begin:
Currently, TDX module memory use is relatively static. But, some
new features (called "TDX Module Extensions") need to use memory
more dynamically.
How much memory does this consume?
> TDH.EXT.MEM.ADD is the first user of tdx_page_array. It provides pages
> to TDX Module as control (private) pages. A tdx_clflush_page_array()
> helper is introduced to flush shared cache before SEAMCALL, to avoid
> shared cache write back damages these private pages.
First, this talks about "control pages". But I don't know what a control
page is.
Second, these all need to be in imperative voice. Not:
It provides pages to TDX Module as control (private) pages.
Do this:
Provide pages to TDX Module as control (private) pages.
> TDH.EXT.MEM.ADD uses HPA_LIST_INFO as parameter so could leverage the
> 'first_entry' field to simplify the interrupted - retry flow. Host
> don't have to care about partial page adding and 'first_entry'.
>
> Use a new version TDH.SYS.CONFIG for VMM to tell TDX Module which
> optional features (e.g. TDX Connect, and selecting TDX Connect implies
> selecting TDX Module Extensions) to use and let TDX Module update its
> global metadata (e.g. memory_pool_required_pages for TDX Module
> Extensions). So after calling this new version TDH.SYS.CONFIG, VMM
> updates the cached tdx_sysinfo.
>
> Note that this extension initialization does not impact existing
> in-flight SEAMCALLs that are not implemented by the extension. So only
> the first user of an extension-seamcall needs invoke this helper.
Ahh, so this is another bit of very useful information buried deep in
this changelog.
Extensions consume memory, but they're *optional*.
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 3a3ea3fa04f2..1eeb77a6790a 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -125,11 +125,13 @@ static __always_inline u64 sc_retry(sc_func_t func, u64 fn,
> #define seamcall(_fn, _args) sc_retry(__seamcall, (_fn), (_args))
> #define seamcall_ret(_fn, _args) sc_retry(__seamcall_ret, (_fn), (_args))
> #define seamcall_saved_ret(_fn, _args) sc_retry(__seamcall_saved_ret, (_fn), (_args))
> +int tdx_enable_ext(void);
> const char *tdx_dump_mce_info(struct mce *m);
>
> /* Bit definitions of TDX_FEATURES0 metadata field */
> #define TDX_FEATURES0_TDXCONNECT BIT_ULL(6)
> #define TDX_FEATURES0_NO_RBP_MOD BIT_ULL(18)
> +#define TDX_FEATURES0_EXT BIT_ULL(39)
>
> const struct tdx_sys_info *tdx_get_sysinfo(void);
>
> @@ -223,6 +225,7 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
> u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page);
> #else
> static inline void tdx_init(void) { }
> +static inline int tdx_enable_ext(void) { return -ENODEV; }
> static inline u32 tdx_get_nr_guest_keyids(void) { return 0; }
> static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; }
> static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; }
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 4370d3d177f6..b84678165d00 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -46,6 +46,8 @@
> #define TDH_PHYMEM_PAGE_WBINVD 41
> #define TDH_VP_WR 43
> #define TDH_SYS_CONFIG 45
> +#define TDH_EXT_INIT 60
> +#define TDH_EXT_MEM_ADD 61
>
> /*
> * SEAMCALL leaf:
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index 9a5c32dc1767..bbf93cad5bf2 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -59,6 +59,9 @@ static LIST_HEAD(tdx_memlist);
> static struct tdx_sys_info tdx_sysinfo __ro_after_init;
> static bool tdx_module_initialized __ro_after_init;
>
> +static DEFINE_MUTEX(tdx_module_ext_lock);
> +static bool tdx_module_ext_initialized;
> +
> typedef void (*sc_err_func_t)(u64 fn, u64 err, struct tdx_module_args *args);
>
> static inline void seamcall_err(u64 fn, u64 err, struct tdx_module_args *args)
> @@ -517,7 +520,7 @@ EXPORT_SYMBOL_GPL(tdx_page_array_ctrl_release);
> #define HPA_LIST_INFO_PFN GENMASK_U64(51, 12)
> #define HPA_LIST_INFO_LAST_ENTRY GENMASK_U64(63, 55)
>
> -static u64 __maybe_unused hpa_list_info_assign_raw(struct tdx_page_array *array)
> +static u64 hpa_list_info_assign_raw(struct tdx_page_array *array)
> {
> return FIELD_PREP(HPA_LIST_INFO_FIRST_ENTRY, 0) |
> FIELD_PREP(HPA_LIST_INFO_PFN, page_to_pfn(array->root)) |
> @@ -1251,7 +1254,14 @@ static __init int config_tdx_module(struct tdmr_info_list *tdmr_list,
> args.rcx = __pa(tdmr_pa_array);
> args.rdx = tdmr_list->nr_consumed_tdmrs;
> args.r8 = global_keyid;
> - ret = seamcall_prerr(TDH_SYS_CONFIG, &args);
> +
> + if (tdx_sysinfo.features.tdx_features0 & TDX_FEATURES0_TDXCONNECT) {
> + args.r9 |= TDX_FEATURES0_TDXCONNECT;
> + args.r11 = ktime_get_real_seconds();
> + ret = seamcall_prerr(TDH_SYS_CONFIG | (1ULL << TDX_VERSION_SHIFT), &args);
> + } else {
> + ret = seamcall_prerr(TDH_SYS_CONFIG, &args);
> + }
I'm in the first actual hunk of code and I'm lost. I don't have any idea
what the "(1ULL << TDX_VERSION_SHIFT)" is doing.
Also, bifurcating code paths is discouraged. It's much better to not
copy and paste the code and instead name your variables and change
*them* in a single path:
u64 module_function = TDH_SYS_CONFIG;
u64 features = 0;
u64 timestamp = 0;
if (tdx_sysinfo.features.tdx_features0 & TDX_FEATURES0_TDXCONNECT) {
features |= TDX_FEATURES0_TDXCONNECT;
timestamp = ktime_get_real_seconds();
module_function |= 1ULL << TDX_VERSION_SHIFT;
}
ret = seamcall_prerr(module_function, &args);
This would also provide a place to say what the heck is going on with
the whole "(1ULL << TDX_VERSION_SHIFT)" thing. Just hacking it in and
open-coding makes it actually harder to comment and describe it.
> /* Free the array as it is not required anymore. */
> kfree(tdmr_pa_array);
> @@ -1411,6 +1421,11 @@ static __init int init_tdx_module(void)
> if (ret)
> goto err_free_pamts;
>
> + /* configuration to tdx module may change tdx_sysinfo, update it */
> + ret = get_tdx_sys_info(&tdx_sysinfo);
> + if (ret)
> + goto err_reset_pamts;
> +
> /* Config the key of global KeyID on all packages */
> ret = config_global_keyid();
> if (ret)
> @@ -1488,6 +1503,160 @@ static __init int tdx_enable(void)
> }
> subsys_initcall(tdx_enable);
>
> +static int enable_tdx_ext(void)
> +{
Comments, please. "ext" can mean too many things. What does this do and
why can it fail?
> + struct tdx_module_args args = {};
> + u64 r;
> +
> + if (!tdx_sysinfo.ext.ext_required)
> + return 0;
Is this an optimization or is it functionally required?
> + do {
> + r = seamcall(TDH_EXT_INIT, &args);
> + cond_resched();
> + } while (r == TDX_INTERRUPTED_RESUMABLE);
> +
> + if (r != TDX_SUCCESS)
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static void tdx_ext_mempool_free(struct tdx_page_array *mempool)
> +{
> + /*
> + * Some pages may have been touched by the TDX module.
> + * Flush cache before returning these pages to kernel.
> + */
> + wbinvd_on_all_cpus();
> + tdx_page_array_free(mempool);
> +}
> +
> +DEFINE_FREE(tdx_ext_mempool_free, struct tdx_page_array *,
> + if (!IS_ERR_OR_NULL(_T)) tdx_ext_mempool_free(_T))
> +
> +/*
> + * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
> + * a CLFLUSH of pages is required before handing them to the TDX module.
> + * Be conservative and make the code simpler by doing the CLFLUSH
> + * unconditionally.
> + */
> +static void tdx_clflush_page(struct page *page)
> +{
> + clflush_cache_range(page_to_virt(page), PAGE_SIZE);
> +}
arch/x86/virt/vmx/tdx/tdx.c has:
static void tdx_clflush_page(struct page *page)
{
clflush_cache_range(page_to_virt(page), PAGE_SIZE);
}
Seems odd to see this here.
> +static void tdx_clflush_page_array(struct tdx_page_array *array)
> +{
> + for (int i = 0; i < array->nents; i++)
> + tdx_clflush_page(array->pages[array->offset + i]);
> +}
> +
> +static int tdx_ext_mem_add(struct tdx_page_array *mempool)
> +{
I just realized the 'mempool' has nothing to do with 'struct mempool',
which makes this a rather unfortunate naming choice.
> + struct tdx_module_args args = {
> + .rcx = hpa_list_info_assign_raw(mempool),
> + };
> + u64 r;
> +
> + tdx_clflush_page_array(mempool);
> +
> + do {
> + r = seamcall_ret(TDH_EXT_MEM_ADD, &args);
> + cond_resched();
> + } while (r == TDX_INTERRUPTED_RESUMABLE);
> +
> + if (r != TDX_SUCCESS)
> + return -EFAULT;
> +
> + return 0;
> +}
> +
> +static struct tdx_page_array *tdx_ext_mempool_setup(void)
> +{
> + unsigned int nr_pages, nents, offset = 0;
> + int ret;
> +
> + nr_pages = tdx_sysinfo.ext.memory_pool_required_pages;
> + if (!nr_pages)
> + return NULL;
> +
> + struct tdx_page_array *mempool __free(tdx_page_array_free) =
> + tdx_page_array_alloc(nr_pages);
> + if (!mempool)
> + return ERR_PTR(-ENOMEM);
> +
> + while (1) {
> + nents = tdx_page_array_fill_root(mempool, offset);
This is really difficult to understand. It's not really filling a
"root", it's populating an array. The structure of the loop is also
rather non-obvious. It's doing:
while (1) {
fill(&array);
tell_tdx_module(&array);
}
Why can't it be:
while (1)
fill(&array);
while (1)
tell_tdx_module(&array);
for example?
> + if (!nents)
> + break;
> +
> + ret = tdx_ext_mem_add(mempool);
> + if (ret)
> + return ERR_PTR(ret);
> +
> + offset += nents;
> + }
> +
> + return no_free_ptr(mempool);
> +}
This patch is getting waaaaaaaaaaaaaaay too long. I'd say it needs to be
4 or 5 patches, just eyeballing it.
Call be old fashioned, but I suspect the use of __free() here is atually
hurting readability.
> +static int init_tdx_ext(void)
> +{
> + int ret;
> +
> + if (!(tdx_sysinfo.features.tdx_features0 & TDX_FEATURES0_EXT))
> + return -EOPNOTSUPP;
> +
> + struct tdx_page_array *mempool __free(tdx_ext_mempool_free) =
> + tdx_ext_mempool_setup();
> + /* Return NULL is OK, means no need to setup mempool */
> + if (IS_ERR(mempool))
> + return PTR_ERR(mempool);
That's a somewhat odd comment to put above an if() that doesn't return NULL.
> + ret = enable_tdx_ext();
> + if (ret)
> + return ret;
> +
> + /* Extension memory is never reclaimed once assigned */
> + if (mempool)
> + tdx_page_array_ctrl_leak(no_free_ptr(mempool));
> +
> + return 0;
> +}
> +/**
> + * tdx_enable_ext - Enable TDX module extensions.
> + *
> + * This function can be called in parallel by multiple callers.
> + *
> + * Return 0 if TDX module extension is enabled successfully, otherwise error.
> + */
> +int tdx_enable_ext(void)
> +{
> + int ret;
> +
> + if (!tdx_module_initialized)
> + return -ENOENT;
> +
> + guard(mutex)(&tdx_module_ext_lock);
> +
> + if (tdx_module_ext_initialized)
> + return 0;
> +
> + ret = init_tdx_ext();
> + if (ret) {
> + pr_debug("module extension initialization failed (%d)\n", ret);
> + return ret;
> + }
> +
> + pr_debug("module extension initialized\n");
> + tdx_module_ext_initialized = true;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(tdx_enable_ext);
> +
> static bool is_pamt_page(unsigned long phys)
> {
> struct tdmr_info_list *tdmr_list = &tdx_tdmr_list;
> @@ -1769,17 +1938,6 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td)
> return page_to_phys(td->tdr_page);
> }
>
> -/*
> - * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
> - * a CLFLUSH of pages is required before handing them to the TDX module.
> - * Be conservative and make the code simpler by doing the CLFLUSH
> - * unconditionally.
> - */
> -static void tdx_clflush_page(struct page *page)
> -{
> - clflush_cache_range(page_to_virt(page), PAGE_SIZE);
> -}
Ahh, here's the code move.
This should be in its own patch.
next prev parent reply other threads:[~2025-11-17 17:34 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-17 2:22 [PATCH v1 00/26] PCI/TSM: TDX Connect: SPDM Session and IDE Establishment Xu Yilun
2025-11-17 2:22 ` [PATCH v1 01/26] coco/tdx-host: Introduce a "tdx_host" device Xu Yilun
2025-12-19 11:19 ` Jonathan Cameron
2025-11-17 2:22 ` [PATCH v1 02/26] x86/virt/tdx: Move bit definitions of TDX_FEATURES0 to public header Xu Yilun
2025-11-17 2:22 ` [PATCH v1 03/26] coco/tdx-host: Support Link TSM for TDX host Xu Yilun
2025-12-19 11:18 ` Jonathan Cameron
2025-11-17 2:22 ` [PATCH v1 04/26] x86/tdx: Move all TDX error defines into <asm/shared/tdx_errno.h> Xu Yilun
2025-11-17 2:22 ` [PATCH v1 05/26] mm: Add __free() support for __free_page() Xu Yilun
2025-12-19 11:22 ` Jonathan Cameron
2025-12-23 9:41 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 06/26] x86/virt/tdx: Add tdx_page_array helpers for new TDX Module objects Xu Yilun
2025-11-17 16:41 ` Dave Hansen
2025-11-18 12:47 ` Xu Yilun
2026-02-11 16:24 ` dan.j.williams
2025-11-18 19:09 ` Dave Hansen
2025-11-19 16:20 ` dan.j.williams
2025-11-19 18:05 ` Dave Hansen
2025-11-19 19:10 ` dan.j.williams
2025-11-20 8:34 ` Xu Yilun
2025-11-20 6:28 ` Xu Yilun
2025-12-19 11:32 ` Jonathan Cameron
2025-12-23 10:07 ` Xu Yilun
2026-02-17 7:37 ` Tony Lindgren
2025-11-17 2:22 ` [PATCH v1 07/26] x86/virt/tdx: Read TDX global metadata for TDX Module Extensions Xu Yilun
2025-11-17 16:52 ` Dave Hansen
2025-11-18 13:00 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 08/26] x86/virt/tdx: Add tdx_enable_ext() to enable of " Xu Yilun
2025-11-17 17:34 ` Dave Hansen [this message]
2025-11-18 17:14 ` Xu Yilun
2025-11-18 18:32 ` Dave Hansen
2025-11-20 6:09 ` Xu Yilun
2025-11-20 15:23 ` Dave Hansen
2025-11-20 18:00 ` dan.j.williams
2025-11-21 12:54 ` Xu Yilun
2025-11-21 15:15 ` Dave Hansen
2025-11-21 15:38 ` Dave Hansen
2025-11-24 10:41 ` Xu Yilun
2025-11-24 10:52 ` Xu Yilun
2025-12-08 10:02 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 09/26] ACPICA: Add KEYP table definition Xu Yilun
2025-11-17 2:22 ` [PATCH v1 10/26] acpi: Add KEYP support to fw_table parsing Xu Yilun
2025-12-19 11:44 ` Jonathan Cameron
2025-11-17 2:22 ` [PATCH v1 11/26] iommu/vt-d: Cache max domain ID to avoid redundant calculation Xu Yilun
2025-12-19 11:53 ` Jonathan Cameron
2025-12-23 10:09 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 12/26] iommu/vt-d: Reserve the MSB domain ID bit for the TDX module Xu Yilun
2025-12-19 11:51 ` Jonathan Cameron
2025-12-19 11:52 ` Jonathan Cameron
2025-12-23 10:39 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 13/26] x86/virt/tdx: Read TDX Connect global metadata for TDX Connect Xu Yilun
2025-11-17 2:22 ` [PATCH v1 14/26] mm: Add __free() support for folio_put() Xu Yilun
2025-12-19 11:55 ` Jonathan Cameron
2025-12-23 10:44 ` Xu Yilun
2025-11-17 2:22 ` [PATCH v1 15/26] x86/virt/tdx: Extend tdx_page_array to support IOMMU_MT Xu Yilun
2025-11-17 19:19 ` Dave Hansen
2025-11-17 2:23 ` [PATCH v1 16/26] x86/virt/tdx: Add a helper to loop on TDX_INTERRUPTED_RESUMABLE Xu Yilun
2025-11-17 2:23 ` [PATCH v1 17/26] x86/virt/tdx: Add SEAMCALL wrappers for trusted IOMMU setup and clear Xu Yilun
2025-11-17 2:23 ` [PATCH v1 18/26] iommu/vt-d: Export a helper to do function for each dmar_drhd_unit Xu Yilun
2025-11-17 2:23 ` [PATCH v1 19/26] coco/tdx-host: Setup all trusted IOMMUs on TDX Connect init Xu Yilun
2025-11-17 2:23 ` [PATCH v1 20/26] coco/tdx-host: Add a helper to exchange SPDM messages through DOE Xu Yilun
2025-11-17 2:23 ` [PATCH v1 21/26] x86/virt/tdx: Add SEAMCALL wrappers for SPDM management Xu Yilun
2025-11-17 2:23 ` [PATCH v1 22/26] coco/tdx-host: Implement SPDM session setup Xu Yilun
2025-11-17 2:23 ` [PATCH v1 23/26] coco/tdx-host: Parse ACPI KEYP table to init IDE for PCI host bridges Xu Yilun
2025-12-19 12:02 ` Jonathan Cameron
2025-11-17 2:23 ` [PATCH v1 24/26] x86/virt/tdx: Add SEAMCALL wrappers for IDE stream management Xu Yilun
2025-11-17 2:23 ` [PATCH v1 25/26] coco/tdx-host: Implement IDE stream setup/teardown Xu Yilun
2025-11-17 2:23 ` [PATCH v1 26/26] coco/tdx-host: Finally enable SPDM session and IDE Establishment Xu Yilun
2025-12-19 12:06 ` Jonathan Cameron
2025-12-23 10:45 ` Xu Yilun
2025-11-17 23:05 ` [PATCH v1 00/26] PCI/TSM: TDX Connect: SPDM Session " Dave Hansen
2025-11-18 1:07 ` Xu Yilun
2025-11-19 15:18 ` Dave Hansen
2025-11-19 15:50 ` dan.j.williams
2025-11-19 16:19 ` Dave Hansen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cfcfb160-fcd2-4a75-9639-5f7f0894d14b@intel.com \
--to=dave.hansen@intel.com \
--cc=baolu.lu@linux.intel.com \
--cc=chao.gao@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=dave.jiang@intel.com \
--cc=kas@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linux-coco@lists.linux.dev \
--cc=linux-pci@vger.kernel.org \
--cc=rick.p.edgecombe@intel.com \
--cc=x86@kernel.org \
--cc=yilun.xu@intel.com \
--cc=yilun.xu@linux.intel.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox