From: "Dan Williams (nvidia)" <djbw@kernel.org>
To: Xu Yilun <yilun.xu@linux.intel.com>,
kas@kernel.org, djbw@kernel.org, rick.p.edgecombe@intel.com,
x86@kernel.org, peter.fang@intel.com
Cc: linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, sohil.mehta@intel.com,
yilun.xu@intel.com, yilun.xu@linux.intel.com,
baolu.lu@linux.intel.com, zhenzhong.duan@intel.com,
xiaoyao.li@intel.com
Subject: Re: [PATCH 02/15] x86/virt/tdx: Add extra memory to TDX Module for Extensions
Date: Fri, 12 Jun 2026 16:49:36 -0700
Message-ID: <6a2c9b10574ce_9b8551005d@djbw-dev.notmuch>
In-Reply-To: <20260522034128.3144354-3-yilun.xu@linux.intel.com>

Xu Yilun wrote:
> TDX Module introduces a new concept called "TDX Module Extensions" to
> support long running / hard-irq preemptible flows inside. This makes TDX
> Module capable of handling complex tasks through "Extension SEAMCALLs".
> Adding more memory to TDX Module is the first step to enable Extensions.

Like I said on the cover letter, I think "long running hard-irq
preemptible" invites more questions than it answers. The service calls
are not "long running" on their own. I think it is sufficient to say
they are resumable, unlike typical calls that run to completion while
monopolizing the CPU.

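
To make "resumable" concrete, here is a minimal user-space sketch of the calling pattern. It is not the TDX Module's implementation: the status codes, the `budget` parameter, and the `call_args` layout are all hypothetical. It only shows a callee that yields after a bounded amount of work and records its progress in the argument block, so the caller can re-issue the same call unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; not the TDX Module's values. */
enum call_status { CALL_SUCCESS = 0, CALL_INTERRUPTED_RESUMABLE = 1 };

/* In/out argument block: the callee rewrites 'next' to record progress. */
struct call_args {
	uint32_t next;	/* first item not yet processed */
	uint32_t last;	/* last item to process (inclusive) */
};

/*
 * Stand-in for a resumable service call. It handles at most 'budget'
 * items, then returns early so the caller's CPU is not monopolized.
 */
static enum call_status resumable_call(struct call_args *args, uint32_t budget)
{
	while (args->next <= args->last) {
		if (budget == 0)
			return CALL_INTERRUPTED_RESUMABLE;
		args->next++;
		budget--;
	}
	return CALL_SUCCESS;
}

/* Re-issue the call until it completes; returns the number of invocations. */
static unsigned int call_until_done(struct call_args *args, uint32_t budget)
{
	unsigned int calls = 0;
	enum call_status status;

	assert(budget > 0);	/* a zero budget would never make progress */
	do {
		/* No manual cursor update: the callee already advanced 'next'. */
		status = resumable_call(args, budget);
		calls++;
	} while (status == CALL_INTERRUPTED_RESUMABLE);

	return calls;
}
```

Each individual invocation is short; only the overall operation spans several calls, which is why "resumable" describes it better than "long running".
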
> Currently, TDX Module memory use is relatively static. But, the
> Extensions need to use memory more dynamically. While 'static' here
> means the kernel provides necessary amount of memory to TDX Module for
> its basic functionalities, 'dynamic' means extra memory is needed only
> if new add-on features are to be enabled. So add a new memory feeding
> process backed by a new SEAMCALL TDH.EXT.MEM.ADD.

Rick commented on this as well, but a simpler way to say it is that
extensions receive a one-time memory pool allocation at init time. The
extension uses that pool as the baseline for its own internal state and
for the data of the service APIs it offers.

> The process is mostly the same as adding PAMT. The kernel queries TDX
> Module how much memory needed, allocates it, hands it over, and never
> gets it back.
>
> TDH.EXT.MEM.ADD uses a new parameter type HPA_LIST_INFO to provide
> control (private) pages to TDX Module. This type represents a list of
> pages for TDX Module to access. It needs a 'root page' which contains
> the list of HPAs of the pages. It collapses the HPA of the root page
> and the number of valid HPAs into a 64 bit raw value for SEAMCALL
> parameters. The root page is always a medium, TDX Module never keeps
> the root page.

I mention this below as well, but I do not think the reader cares that
the TDX Module calls an array of physical addresses a "root" page.

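
For readers who want to see how the list's address and entry count collapse into one 64-bit value, here is a user-space sketch using the field positions from the quoted patch (first entry in bits 11:3, PFN in bits 51:12, last entry in bits 63:55). `MASK64()`, `field_prep()`, `field_get()`, and `pack_hpa_list()` are hypothetical stand-ins for the kernel's `GENMASK_U64()` and `FIELD_PREP()` helpers, and the 512-entry limit assumes a 4 KiB page of 8-byte addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous 64-bit mask covering bits hi..lo (stand-in for GENMASK_U64). */
#define MASK64(hi, lo) \
	((~UINT64_C(0) >> (63 - (hi))) & (~UINT64_C(0) << (lo)))

#define LIST_FIRST_ENTRY	MASK64(11, 3)
#define LIST_PFN		MASK64(51, 12)
#define LIST_LAST_ENTRY		MASK64(63, 55)

/* Shift 'val' into the field selected by 'mask' (stand-in for FIELD_PREP). */
static uint64_t field_prep(uint64_t mask, uint64_t val)
{
	/* (mask & -mask) isolates the lowest set bit, i.e. 1 << shift. */
	return (val * (mask & -mask)) & mask;
}

/* Extract the field selected by 'mask' from 'val'. */
static uint64_t field_get(uint64_t mask, uint64_t val)
{
	return (val & mask) / (mask & -mask);
}

/* Pack the list page's PFN and entry count; entries are indexed 0..511. */
static uint64_t pack_hpa_list(uint64_t list_pfn, unsigned int nr_entries)
{
	assert(nr_entries >= 1 && nr_entries <= 512);

	return field_prep(LIST_FIRST_ENTRY, 0) |
	       field_prep(LIST_PFN, list_pfn) |
	       field_prep(LIST_LAST_ENTRY, nr_entries - 1);
}
```

The first and last entry fields are 9 bits wide, which is exactly enough to index the 512 eight-byte slots of a 4 KiB page.
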
>
> Introduce a tdx_clflush_hpa_list() helper to flush shared cache before
> SEAMCALL, to avoid shared cache writeback damaging these private pages.
>
> For now, TDX Module Extensions consumes relatively large amount of
> memory (~50MB). Use contiguous page allocation to avoid permanently
> fragment too much memory. Print the allocation amount on TDX Module
> Extensions initialization for visibility.

To be clear, I believe there is a low chance of fragmentation given that
this allocation happens early. However, at 10s of MB there is a benefit
to isolating blocks of PFNs that will never be returned, so it makes
sense to not use the buddy allocator for that.

> Co-developed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
> arch/x86/virt/vmx/tdx/tdx.h | 1 +
> arch/x86/virt/vmx/tdx/tdx.c | 118 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 119 insertions(+)
>
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index a5eec8e3cc71..2335f88bbb10 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -46,6 +46,7 @@
> #define TDH_PHYMEM_PAGE_WBINVD 41
> #define TDH_VP_WR 43
> #define TDH_SYS_CONFIG 45
> +#define TDH_EXT_MEM_ADD 61
> #define TDH_SYS_DISABLE 69
>
> /*
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index c0c6281b08a5..622399d8da68 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -31,6 +31,7 @@
> #include <linux/syscore_ops.h>
> #include <linux/idr.h>
> #include <linux/kvm_types.h>
> +#include <linux/bitfield.h>
> #include <asm/page.h>
> #include <asm/special_insns.h>
> #include <asm/msr-index.h>
> @@ -1179,6 +1180,123 @@ static __init int init_tdmrs(struct tdmr_info_list *tdmr_list)
> return 0;
> }
>
> +static void tdx_clflush_hpa_list(struct page *root, unsigned int nr_pages)
> +{
> +	u64 *entries = page_to_virt(root);
> +	int i;
> +
> +	for (i = 0; i < nr_pages; i++)
> +		clflush_cache_range(__va(entries[i]), PAGE_SIZE);
> +}
> +
> +#define HPA_LIST_INFO_FIRST_ENTRY	GENMASK_U64(11, 3)
> +#define HPA_LIST_INFO_PFN		GENMASK_U64(51, 12)
> +#define HPA_LIST_INFO_LAST_ENTRY	GENMASK_U64(63, 55)
> +
> +static u64 to_hpa_list_info(struct page *root, unsigned int nr_pages)
> +{
> +	return FIELD_PREP(HPA_LIST_INFO_FIRST_ENTRY, 0) |
> +	       FIELD_PREP(HPA_LIST_INFO_PFN, page_to_pfn(root)) |
> +	       FIELD_PREP(HPA_LIST_INFO_LAST_ENTRY, nr_pages - 1);
> +}
> +
> +static int tdx_ext_mem_add(struct page *root, unsigned int nr_pages)
> +{
> +	struct tdx_module_args args = {
> +		.rcx = to_hpa_list_info(root, nr_pages),
> +	};
> +	u64 r;
> +
> +	tdx_clflush_hpa_list(root, nr_pages);
> +
> +	do {
> +		/*
> +		 * TDH_EXT_MEM_ADD is designed to use output parameter RCX to
> +		 * override/update input parameter RCX, so the caller doesn't
> +		 * have to do manual parameter update on retry call.
> +		 */
> +		r = seamcall_ret(TDH_EXT_MEM_ADD, &args);
> +	} while (r == TDX_INTERRUPTED_RESUMABLE);
> +
> +	if (r != TDX_SUCCESS)
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int tdx_ext_mem_setup(void)
> +{
> +	unsigned int nr_pages;
> +	struct page *page;
> +	u64 *root;
> +	unsigned int i;
> +	int ret;
> +
> +	nr_pages = tdx_sysinfo.ext.memory_pool_required_pages;
> +	/*
> +	 * memory_pool_required_pages == 0 means no need to add pages,
> +	 * skip the memory setup.
> +	 */
> +	if (!nr_pages)
> +		return 0;
> +
> +	root = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +	if (!root)
> +		return -ENOMEM;

I think this "root" term is a holdover from the complicated TDX Connect
case where it might sometimes be this odd "singleton" object? You could
just define it like this for actual type safety:

	struct tdx_hpa_list {
		u64 phys[PAGE_SIZE / sizeof(u64)];
	};

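
As a rough illustration of what the typed list buys, here is a user-space sketch. `SKETCH_PAGE_SIZE`, `struct hpa_list`, `hpa_list_capacity()`, and `hpa_list_set()` are hypothetical names chosen for this example, and 4 KiB pages are assumed. The point is that the array length and the one-page size become properties of the type that the compiler can check, rather than arithmetic repeated at each call site.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages, as on x86 */

/* One page worth of physical addresses, mirroring the suggested struct. */
struct hpa_list {
	uint64_t phys[SKETCH_PAGE_SIZE / sizeof(uint64_t)];
};

/* Compile-time check that the list occupies exactly one page. */
_Static_assert(sizeof(struct hpa_list) == SKETCH_PAGE_SIZE,
	       "an HPA list must fill exactly one page");

/* Number of addresses the list can hold, derived from the type itself. */
static size_t hpa_list_capacity(const struct hpa_list *list)
{
	return sizeof(list->phys) / sizeof(list->phys[0]);
}

/* Store one address; returns 0 on success, -1 if 'idx' is out of range. */
static int hpa_list_set(struct hpa_list *list, size_t idx, uint64_t phys)
{
	assert(list != NULL);

	if (idx >= hpa_list_capacity(list))
		return -1;

	list->phys[idx] = phys;
	return 0;
}
```

With a bare `u64 *`, neither the capacity nor the size check is available without restating `PAGE_SIZE / sizeof(...)` by hand.
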
> +
> +	page = alloc_contig_pages(nr_pages, GFP_KERNEL, numa_mem_id(),
> +				  &node_online_map);
> +	if (!page) {
> +		ret = -ENOMEM;
> +		goto out_free_root;
> +	}
> +
> +	for (i = 0; i < nr_pages;) {
> +		unsigned int nents = min(nr_pages - i,
> +					 PAGE_SIZE / sizeof(*root));

This looks wrong. Is that sizeof(struct page), or the size of a physical
address? It becomes less error prone if you do:

	min(nr_pages - i, ARRAY_SIZE(hpa_list->phys))

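
The batching this loop performs can be sketched in user space as follows. Everything here is illustrative rather than the patch's code: `submit_all()`, `count_entries()`, and the `SKETCH_*` macros are hypothetical, 4 KiB pages are assumed, and the callback stands in for the SEAMCALL wrapper. The batch size comes from the array itself, so there is no question about which `sizeof` is meant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct hpa_list {
	uint64_t phys[SKETCH_PAGE_SIZE / sizeof(uint64_t)];
};

/* Hypothetical consumer standing in for the SEAMCALL wrapper. */
typedef int (*hpa_list_submit_fn)(const struct hpa_list *list, size_t nents);

static size_t sketch_total;	/* running count of entries submitted */

/* Example consumer: only counts the entries it is handed. */
static int count_entries(const struct hpa_list *list, size_t nents)
{
	(void)list;
	sketch_total += nents;
	return 0;
}

/*
 * Hand a physically contiguous range to 'submit' one list at a time.
 * Returns the number of submissions made, or -1 if one of them failed.
 */
static int submit_all(uint64_t base_phys, size_t nr_pages,
		      struct hpa_list *list, hpa_list_submit_fn submit)
{
	int calls = 0;

	for (size_t i = 0; i < nr_pages;) {
		size_t room = SKETCH_ARRAY_SIZE(list->phys);
		size_t nents = nr_pages - i < room ? nr_pages - i : room;

		for (size_t j = 0; j < nents; j++)
			list->phys[j] = base_phys +
					(uint64_t)(i + j) * SKETCH_PAGE_SIZE;

		if (submit(list, nents))
			return -1;

		calls++;
		i += nents;
	}

	return calls;
}
```

For scale: taking the changelog's ~50MB as 50 MiB gives 12800 4 KiB pages, which at 512 addresses per list works out to 25 submissions.
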
> +		int j;
> +
> +		for (j = 0; j < nents; j++)

You can declare j in the for loop.

> +			root[j] = page_to_phys(page + i + j);
> +
> +		ret = tdx_ext_mem_add(virt_to_page(root), nents);
> +		/*
> +		 * No SEAMCALLs to reclaim the added pages. For simple error
> +		 * handling, leak all pages.
> +		 */
> +		WARN_ON_ONCE(ret);

Perhaps, to be friendlier to folks without the source code in front of
them, drop the comment and do:

	WARN(ret, "Fatal: TDX Module failed (%d) to accept memory, stranded %u pages\n", ret, nr_pages);

...the once flavor is not needed, right? It's toast at this point.
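
A small user-space sketch of that message, mainly to show the format specifiers lining up with the argument types: `nr_pages` is an `unsigned int` in the quoted patch, so `%u` is the matching conversion. `format_stranded()` is a hypothetical helper that uses `snprintf()` in place of the kernel's `WARN()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the diagnostic into 'buf'. Returns the snprintf() result: the
 * length the full message needs, excluding the terminating NUL.
 */
static int format_stranded(char *buf, size_t len, int err,
			   unsigned int nr_pages)
{
	assert(buf != NULL && len > 0);

	/* %d pairs with the int error code, %u with the unsigned count. */
	return snprintf(buf, len,
			"Fatal: TDX Module failed (%d) to accept memory, stranded %u pages\n",
			err, nr_pages);
}
```
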