Linux Confidential Computing Development
* Re: [RFC PATCH 12/15] KVM: TDX: Add in-kernel Quote generation
From: Dan Williams (nvidia) @ 2026-06-13  0:20 UTC (permalink / raw)
  To: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, peter.fang
  Cc: linux-coco, linux-kernel, kvm, sohil.mehta, yilun.xu, yilun.xu,
	baolu.lu, zhenzhong.duan, xiaoyao.li
In-Reply-To: <20260522034128.3144354-13-yilun.xu@linux.intel.com>

Xu Yilun wrote:
> From: Peter Fang <peter.fang@intel.com>
> 
> Provide an in-kernel path for TDX Quote generation when handling
> TDG.VP.VMCALL<GetQuote>, without requiring an exit to userspace.
> 
> Use the core TDX API when the TDX Quoting extension is available. For
> simplicity, each KVM guest checks for availability only once during
> initialization. KVM does not handle Quoting service disruptions.
> 
> Signed-off-by: Peter Fang <peter.fang@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> ---
[..]
> +static u64 __get_quote_kernel(struct kvm_vcpu *vcpu, struct tdx_quote_req *req,
> +			      size_t req_len, gpa_t req_gpa, size_t total_len)
> +{
> +	struct tdx_td *td = &to_kvm_tdx(vcpu->kvm)->td;
> +
> +	/* Only support version 1 as defined in the GHCI spec */
> +	if (req->version != 1)
> +		return TDX_QUOTE_STATUS_ERROR;
> +
> +	if ((size_t)req->in_len + TDX_QUOTE_REQ_HDR_SIZE > req_len)
> +		return TDX_QUOTE_STATUS_ERROR;
> +
> +	/* The caller frees the quote data */

No, it is freed by cleanup as far as I can see

> +	void *quote_data __free(kvfree) =

...this shadows the global "quote_data". A global really should be
properly namespaced.


* Re: [RFC PATCH 14/15] x86/virt/tdx: Embed version info in SEAMCALL leaf function definitions
From: Xu Yilun @ 2026-06-13 15:55 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: kas, djbw, rick.p.edgecombe, x86, peter.fang, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <dd9027c7-ea84-4cee-9484-4e464a766b0d@intel.com>

On Fri, Jun 12, 2026 at 08:47:26AM +0300, Adrian Hunter wrote:
> On 22/05/2026 06:41, Xu Yilun wrote:
> > Embed version information in SEAMCALL leaf function definitions rather
> > than let the caller open code them. For now, only TDH.VP.INIT is
> > involved.
> 
> > @@ -31,7 +44,7 @@
> >  #define TDH_VP_CREATE			10
> >  #define TDH_MNG_KEY_FREEID		20
> >  #define TDH_MNG_INIT			21
> > -#define TDH_VP_INIT			22
> > +#define TDH_VP_INIT			SEAMCALL_LEAF_VER(22, 1)
> 
> FWIW I find the macro a bit ugly, and hiding the version number in
> the leaf number macro a little counter-intuitive compared with setting
> it at the call site.  It anyway needs some explanation at the call site.

We actually discussed this and realized we don't need to keep the
version. This is because:

  1. Newer version SEAMCALLs are always compatible with older ones.
  2. System security requires us to stop using an older TDX module when
     there is a newer one. So don't try to support an older TDX module
     which doesn't understand newer version SEAMCALLs.

https://lore.kernel.org/all/ca331aa3-6304-4e07-9ed9-94dc69726382@intel.com/
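
For context, a minimal userspace sketch of the encoding being debated: the
version is OR-ed into the leaf number above a fixed shift. The shift of 16,
the 8-bit version mask, the macro body, and the helper names are assumptions
for illustration; the series' actual SEAMCALL_LEAF_VER definition is not
quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; not taken from the quoted patch. */
#define TDX_VERSION_SHIFT	16
#define DEMO_VERSION_MASK	0xffULL
#define SEAMCALL_LEAF_VER(leaf, ver) \
	((uint64_t)(leaf) | ((uint64_t)(ver) << TDX_VERSION_SHIFT))

#define TDH_VP_INIT_V0	22ULL				/* bare leaf number */
#define TDH_VP_INIT_V1	SEAMCALL_LEAF_VER(22, 1)	/* leaf 22, version 1 */

/* Hypothetical helper: recover the leaf number from an encoded value. */
static inline uint64_t demo_seamcall_leaf(uint64_t fn)
{
	return fn & ((1ULL << TDX_VERSION_SHIFT) - 1);
}

/* Hypothetical helper: recover the version from an encoded value. */
static inline uint64_t demo_seamcall_version(uint64_t fn)
{
	return (fn >> TDX_VERSION_SHIFT) & DEMO_VERSION_MASK;
}
```

With these definitions the macro form and the open-coded
`22 | (1ULL << TDX_VERSION_SHIFT)` at the call site produce the same value, so
the choice between them is purely about where the reader looks for the version.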

> 
> > @@ -2217,8 +2217,8 @@ u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
> >  		.r8 = x2apicid,
> >  	};
> >  
> > -	/* apicid requires version == 1. */
> > -	return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
> > +	/* apicid requires version == 1. See TDH_VP_INIT definition.*/
> > +	return seamcall(TDH_VP_INIT, &args);
> 
> Now the reader has to go look at TDH_VP_INIT.

mm.. I think I should just delete the comment.


* Re: [PATCH 04/15] x86/virt/tdx: Enable the Extensions right after basic TDX Module init
From: Peter Fang @ 2026-06-14  7:00 UTC (permalink / raw)
  To: Xu Yilun
  Cc: Kishen Maloor, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <aiaVk9Lx7iakgd4g@yilunxu-OptiPlex-7050>

On Mon, Jun 08, 2026 at 06:12:35PM +0800, Xu Yilun wrote:
> 
> > 
> > The handling of tdx_quote_init() in Patch 6 suggests a more
> > best-effort approach.
> 
> TDX Quoting is however a clear self-contained add-on feature from OS POV.
> Though I'm not sure if a TDX platform is still a safe TCB with DICE
> available but failed, and good for "best-effort" policy? Maybe Peter
> could answer.

The DICE extension is just one of the ways to generate a Quote for the
guest. If DICE is not available, TDX can fall back to the existing
userspace SGX Quoting flow. So I think a best-effort approach makes
sense here.

> > 


* Re: [RFC PATCH 05/15] x86/virt/tdx: Move tdx_tdr_pa() up in the file
From: Peter Fang @ 2026-06-14  7:04 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <0f4ee112-59c6-49b0-8d0b-886f32ec410a@intel.com>

On Thu, Jun 11, 2026 at 07:21:17PM +0300, Adrian Hunter wrote:
> On 22/05/2026 06:41, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Move the tdx_tdr_pa() in preparation for upcoming changes to use them
> 
> them -> it

Ack. Thanks for catching this.

> 


* Re: [RFC PATCH 06/15] x86/virt/tdx: Initialize Quoting extension during bringup
From: Peter Fang @ 2026-06-14  7:10 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: kas@kernel.org, djbw@kernel.org, yilun.xu@linux.intel.com,
	x86@kernel.org, Xu, Yilun, Duan, Zhenzhong,
	baolu.lu@linux.intel.com, Li, Xiaoyao,
	linux-kernel@vger.kernel.org, Mehta, Sohil, kvm@vger.kernel.org,
	linux-coco@lists.linux.dev
In-Reply-To: <f9ebc92839c94430055fe2a48114054a39b0e56e.camel@intel.com>

On Thu, May 28, 2026 at 02:35:49PM -0700, Edgecombe, Rick P wrote:
> On Fri, 2026-05-22 at 11:41 +0800, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Initialize the Quoting extension and fetch its metadata during TDX
> > bringup.
> > 
> > Because Quoting is an optional TDX feature, do not let its
> > initialization failures cause TDX bringup to fail.
> > 
> > This patch
> > 
> 
> Don't say "this patch" in tip logs. The patch is a temporary format, and some
> x86 maintainers hate the term in logs.

Thanks, will fix in the next revision.

> 
> >  does not include the opt-in portion of the initialization.
> > It mainly lays the groundwork for TDX Quoting support. Opt-in will be
> > added in a follow-up patch once the feature can be properly used by the
> > system.
> 
> This could be imperative mood.

Will fix this as well.

> 
> > 
> > Signed-off-by: Peter Fang <peter.fang@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
> 


* Re: [RFC PATCH 06/15] x86/virt/tdx: Initialize Quoting extension during bringup
From: Peter Fang @ 2026-06-14  7:20 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <55b1972a-bfc9-4229-a7c6-7d46b03d9e6c@intel.com>

On Thu, Jun 11, 2026 at 07:22:18PM +0300, Adrian Hunter wrote:
> On 22/05/2026 06:41, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Initialize the Quoting extension and fetch its metadata during TDX
> > bringup.
> > 
> > Because Quoting is an optional TDX feature, do not let its
> > initialization failures cause TDX bringup to fail.
> 
> Is there a reason Linux needs to support TDX with failed Quote
> extension initialization?

The Quoting extension is not the only way to get TD Quotes. If this
extension fails, the host can still fall back to the legacy SGX-based
Quoting in userspace. I think the decision to actually fall back can be
left to userspace at that point.

> 
> > +static void tdx_quote_init(void)
> > +{
> > +	struct tdx_module_args args = {};
> > +	u64 r;
> > +
> > +	do {
> > +		r = seamcall(TDH_QUOTE_INIT, &args);
> > +	} while (r == TDX_INTERRUPTED_RESUMABLE);
> > +
> > +	if (r)
> 
> Elsewhere it tends to be:
> 
> 	if (r != TDX_SUCCESS)

Good catch. I'll fix this. Thanks!
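
A minimal userspace sketch of the retry-then-check pattern under review, with
the explicit comparison against the success code. The DEMO_* status values and
demo_call() are placeholders invented for this example; they are not the real
TDX status codes or the real seamcall().

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder status codes for illustration only, not the real TDX values. */
#define DEMO_SUCCESS			0ULL
#define DEMO_INTERRUPTED_RESUMABLE	1ULL
#define DEMO_ERROR			2ULL

/* Hypothetical stand-in for a SEAMCALL: interrupted *budget times, then done. */
static uint64_t demo_call(unsigned int *budget, uint64_t final_status)
{
	if (*budget) {
		(*budget)--;
		return DEMO_INTERRUPTED_RESUMABLE;
	}
	return final_status;
}

/* Retry while the call reports "interrupted, resumable"; 0 on success. */
static int demo_init(unsigned int interruptions, uint64_t final_status)
{
	uint64_t r;

	do {
		r = demo_call(&interruptions, final_status);
	} while (r == DEMO_INTERRUPTED_RESUMABLE);

	if (r != DEMO_SUCCESS)	/* explicit comparison, as suggested above */
		return -1;

	return 0;
}
```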

> 


* Re: [RFC PATCH 06/15] x86/virt/tdx: Initialize Quoting extension during bringup
From: Peter Fang @ 2026-06-14  7:50 UTC (permalink / raw)
  To: Dan Williams (nvidia)
  Cc: Xu Yilun, kas, rick.p.edgecombe, x86, linux-coco, linux-kernel,
	kvm, sohil.mehta, yilun.xu, baolu.lu, zhenzhong.duan, xiaoyao.li
In-Reply-To: <6a2c9d8b8bfe9_9b85510018@djbw-dev.notmuch>

On Fri, Jun 12, 2026 at 05:00:11PM -0700, Dan Williams (nvidia) wrote:
> Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Initialize the Quoting extension and fetch its metadata during TDX
> > bringup.
> > 
> > Because Quoting is an optional TDX feature, do not let its
> > initialization failures cause TDX bringup to fail.
> 
> Is this micro-optimization worth it? What are the classes of quote-init
> failures vs just make the policy be anything in the module must init.

Since there is a fallback option to do the Quoting in userspace, I think
it is probably not worth shooting down TDX entirely over quote-init
failures.

The quote-init failures can come from:

  1. Quoting init SEAMCALL failures, which look pretty opaque to the
     kernel and there's not much it can do about it.
  2. Quoting buffer allocation failures, which *are* understood by the
     kernel, and it could maybe try something else. Right now, we just
     treat it the same as 1.

This is helpful because I think the question of "what if the Quoting
extension fails" has come up enough times that it warrants some
explanation in the patch log. Thanks.

> 
> > This patch does not include the opt-in portion of the initialization.
> > It mainly lays the groundwork for TDX Quoting support. Opt-in will be
> > added in a follow-up patch once the feature can be properly used by the
> > system.
> 
> It is unconditionally calling quote init even if the feature is not
> present. Is that not a problem?

Good question... I should reorder the patches so this looks more
straightforward. I enable everything in patch 15 (including the check
for the Quoting feature) and I think that just creates confusion for
folks looking at this patch.

> 


* Re: [RFC PATCH 07/15] x86/virt/tdx: Prepare Quote buffer during extension bringup
From: Peter Fang @ 2026-06-14 10:28 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: kas@kernel.org, djbw@kernel.org, yilun.xu@linux.intel.com,
	x86@kernel.org, Xu, Yilun, Duan, Zhenzhong,
	baolu.lu@linux.intel.com, Li, Xiaoyao,
	linux-kernel@vger.kernel.org, Mehta, Sohil, kvm@vger.kernel.org,
	linux-coco@lists.linux.dev
In-Reply-To: <1a4d1126d6fe86e94fa8e1de6764656853e61106.camel@intel.com>

On Thu, May 28, 2026 at 03:30:36PM -0700, Edgecombe, Rick P wrote:
> On Fri, 2026-05-22 at 11:41 +0800, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > The host uses a Quote buffer to communicate with the TDX module when
> > generating Quotes.
> > 
> 
> Can this be put in common terms. This is going to mean nothing to someone
> reading this that doesn't already know the feature.

I'll add more background in common terms here.

> 
> >  Because the Quote buffer is shared with TDX guests,
> 
> Why capitalize "Quote"?

This is again the balance between using common terms vs TDX language. In
general, TDX docs capitalize terms a lot. TDX attestation docs always
refer to the attestation blob as "Quotes".

I mainly went with "Quotes" in the logs because that term has already
been used everywhere in the tdx-guest code/logs (see tdx-guest.c). So I
wanted to preserve some consistency at least in the logs. In the added
host code and prints, I'm starting to just use "quotes" because that
seems to be the more common convention in the TDX host code. I'm happy
to make adjustments if this doesn't make sense.

> 
> > prepare the required metadata during Quoting extension bringup.
> 
> What does prepare the required metadata mean?

That's a poor choice of words on my part. I'll rephrase it in the next
revision. I mainly just wanted to convey "prepare struct quote_data".

> 
> How does it being shared with TDX guest suggest this? Just that TDX guests will
> need them? Is the reason just that only one is needed, so do it during global
> init? 

Yes, that's exactly it. I'll make it clearer.

> 
> > +static struct quote_data {
> > +	void *buf;
> > +	u64 buf_len;
> > +	u64 *hpa_list;
> > +	phys_addr_t hpa_list_pa;
> > +} quote_data;
> 
> Hmm, I think this should separate the type and variable declaration. It's not a
> common pattern. I don't think there is an official rule.

Sure, I'll fix this.

> 
> > +	qlist = vmalloc_array(qlist_npages, PAGE_SIZE);
> > +	if (!qlist) {
> > +		err = -ENOMEM;
> > +		goto out_err;
> 
> Just return ENOMEM here. vfree() doesn't do any work if passed NULL, but it's
> weird flow.

Will do.

> 
> > +	}
> > +
> > +	/*
> > +	 * Make sure unfilled entries are always -1, which means NULL in TDX.
> 
> Huh?

I'll add more explanation here (see below).

> 
> > +	 * Only the last page needs to be filled. All the other pages will be
> > +	 * fully populated.
> > +	 */
> > +	memset((u8 *)qlist + (qlist_npages - 1) * PAGE_SIZE, 0xff, PAGE_SIZE);
> 
> What are the entries? And what is a -1 in u8? Or is it supposed to be u64?
> Please make this a lot clearer.

Yeah I was trying to create all-1 u64 entries. This is pretty
under-commented. I'll redo the comments.
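
To make the byte-versus-u64 point concrete, a small userspace sketch: filling
a page with 0xff bytes leaves every 64-bit entry reading as all-ones, i.e.
(u64)-1. The 4 KiB page size and the helper name are assumptions for the
example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE	4096u
#define DEMO_ENTRIES	(DEMO_PAGE_SIZE / sizeof(uint64_t))

/* Fill one page with 0xff bytes; return 1 if every u64 entry reads as -1. */
static int demo_page_is_all_ones(uint64_t *page)
{
	size_t i;

	memset(page, 0xff, DEMO_PAGE_SIZE);

	for (i = 0; i < DEMO_ENTRIES; i++)
		if (page[i] != UINT64_MAX)	/* UINT64_MAX == (uint64_t)-1 */
			return 0;
	return 1;
}
```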

> 
> > +	/* Populate HPA_LINKED_LIST as per TDX ABI spec */
> > +	for (i = 0, j = 0; j < nr_pages; i++) {
> > +		if ((i % HPAS_PER_PAGE) == HPAS_PER_PAGE - 1) {
> > +			/*
> > +			 * The last entry always points to the next page. The
> > +			 * address of the following entry must be on next page's
> > +			 * boundary.
> > +			 */
> 
> Can you maybe just explain this format that you are building in like one
> sentence at the beginning of the function? "The quote buffer is passed to the
> tdx module in a format that like... (some common terms that have no TDX
> jargon)."

Will do. This part is pretty under-commented as well.
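
A rough userspace sketch of the layout as described in the quoted comments:
each list page holds u64 entries, the last entry of a page links to the next
list page, and unused entries are all-ones. The addresses are made up, and the
4 KiB page size, the names, and the bounds handling are assumptions rather than
the TDX ABI definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE		4096u
#define DEMO_HPAS_PER_PAGE	(DEMO_PAGE_SIZE / sizeof(uint64_t))	/* 512 */

/*
 * Fill @list (@list_pages pages of u64 entries) so it describes @nr_pages
 * data pages. Made-up addresses: data page j is at data_base + j * PAGE_SIZE,
 * list page k is at list_base + k * PAGE_SIZE. Returns the number of entries
 * written (data plus links), or 0 if @list is too small.
 */
static size_t demo_build_list(uint64_t *list, size_t list_pages,
			      uint64_t list_base, uint64_t data_base,
			      size_t nr_pages)
{
	size_t total = list_pages * DEMO_HPAS_PER_PAGE;
	size_t i, j;

	/* Start with every entry marked unused (all-ones). */
	memset(list, 0xff, total * sizeof(uint64_t));

	for (i = 0, j = 0; j < nr_pages; i++) {
		if (i >= total)
			return 0;
		if (i % DEMO_HPAS_PER_PAGE == DEMO_HPAS_PER_PAGE - 1) {
			size_t next = i / DEMO_HPAS_PER_PAGE + 1;

			if (next >= list_pages)
				return 0;
			/* Last slot of a list page links to the next one. */
			list[i] = list_base + next * DEMO_PAGE_SIZE;
			continue;
		}
		list[i] = data_base + j * DEMO_PAGE_SIZE;
		j++;
	}
	return i;
}
```

For 768 data pages this uses 511 data entries plus one link on the first list
page and 257 data entries on the second, leaving the rest all-ones.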

> 
> > +	qdata->buf = qbuf;
> > +	qdata->buf_len = (u64)nr_pages * PAGE_SIZE;
> > +	qdata->hpa_list = qlist;
> > +
> > +	pfn = vmalloc_to_pfn(qlist);
> 
> Do we need a vmalloc_to_pa() helper? Maybe put it in terms of tdx format. Like
> vmalloc_pfn_to_tdxpa() and keep it here? The tdx update stuff does this a bunch
> too.

That's a really good idea. I'll do that.

> 
> > +	qdata->hpa_list_pa = PFN_PHYS(pfn);
> > +
> > +	return 0;
> > +
> > +out_err:
> > +	vfree(qlist);
> > +
> > +	return err;
> 
> It only returns -ENOMEM, so do we need the err var?

Good point. I think I had some other errors that I later removed. I'll
just return -ENOMEM directly here.

> 
> > +}
> > +
> >  static void tdx_quote_init(void)
> >  {
> >  	struct tdx_module_args args = {};
> > +	unsigned int nr_quote_pages;
> >  	u64 r;
> >  
> >  	do {
> > @@ -1218,7 +1295,13 @@ static void tdx_quote_init(void)
> >  		return;
> >  
> >  	/* Quoting metadata is valid only after initialization */
> > -	get_tdx_sys_info_quote(&tdx_sysinfo.quote);
> > +	if (get_tdx_sys_info_quote(&tdx_sysinfo.quote))
> > +		return;
> 
> How come this patch gets error handling? Why is it needed now when it wasn't
> before?

Previously, get_tdx_sys_info_quote() just happened to be the last
statement in tdx_quote_init() so getting an error didn't require an
early return. tdx_quote_init() wasn't doing much at the time. But now
the code can't see a valid max_quote_size if get_tdx_sys_info_quote()
fails.

> 
> > +
> > +	nr_quote_pages = PAGE_ALIGN(tdx_sysinfo.quote.max_quote_size) /
> > +			 PAGE_SIZE;
> > +	if (tdx_quote_create_buf(nr_quote_pages, &quote_data))
> > +		pr_err("Failed to create quote buffer\n");
> 
> Err... what happens in ENOMEM scenario? NULL pointer later?

Yes. struct quote_data remains uninitialized so it will have NULL
pointers. All the added APIs will take this into account so there won't
be NULL pointer accesses.

> 
> >  }
> >  
> >  /* Initialize the TDX Module Extensions then Extension-SEAMCALLs can be used */
> 


* Re: [RFC PATCH 09/15] x86/virt/tdx: Add interface to generate a Quote
From: Peter Fang @ 2026-06-14 11:29 UTC (permalink / raw)
  To: Edgecombe, Rick P
  Cc: kas@kernel.org, djbw@kernel.org, yilun.xu@linux.intel.com,
	x86@kernel.org, Xu, Yilun, Duan, Zhenzhong,
	baolu.lu@linux.intel.com, Li, Xiaoyao,
	linux-kernel@vger.kernel.org, Mehta, Sohil, kvm@vger.kernel.org,
	linux-coco@lists.linux.dev
In-Reply-To: <a10ad58ed8092e4e7d81be1995438efd21647fde.camel@intel.com>

On Thu, May 28, 2026 at 03:30:45PM -0700, Edgecombe, Rick P wrote:
> > +
> > +	/* TDH.QUOTE.GET expects the input data to fit in a page */
> > +	if (in_data_len > PAGE_SIZE)
> > +		return NULL;
> 
> Do we really need this check? We can't trust the caller to pass the right size?

There is a similar check for this in_data_len on the KVM side in patch
12, but it is for a different reason. The check in KVM is to make sure
it maps valid guest memory pages into the kernel, while here we make
sure it complies with the SEAMCALL API. That said, the KVM check does
make the check here kinda redundant... I can remove this for simplicity.

> 
> > +
> > +	mutex_lock(&tdx_quote_lock);
> > +
> > +	/*
> > +	 * Use the first page of the quote buffer for input data. The buffer
> > +	 * must be at least one page in size. @in_data may not be page-aligned,
> > +	 * but TDH.QUOTE.GET expects page-aligned addresses.
> > +	 */
> > +	memcpy(quote_data.buf, in_data, (size_t)in_data_len);
> > +
> > +	r = tdx_quote_get(td, quote_data.hpa_list[0], (u64)in_data_len,
> > +			  quote_data.hpa_list_pa, quote_data.buf_len, &out_len);
> > +	if (r || !out_len || out_len > quote_data.buf_len)
> 
> 
> How do these various error conditions happen?

"r" is a SEAMCALL error just like any other SEAMCALL. If r == 0
(SUCCESS), there is no documented scenario for when "!out_len" or
"out_len > quote_data.buf_len" would occur. I would assume these would
be TDX module bugs.

The reason I check the last 2 conditions is mainly to protect the
kernel:

  - "!out_len" will cause kvmemdup() to return ZERO_SIZE_PTR
  - "out_len > quote_data.buf_len" will cause out-of-bounds memory
    access in kvmemdup()
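
The two defensive checks can be sketched in userspace as below, with malloc()
and memcpy() standing in for kvmemdup(). The function name and signature are
hypothetical and only illustrate validating a reported length before copying
out of a shared buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy @out_len bytes out of a shared buffer of @buf_len bytes. Returns a
 * heap copy the caller must free(), or NULL if the reported length is zero,
 * larger than the buffer, or the allocation fails.
 */
static void *demo_dup_checked(const void *buf, uint64_t buf_len,
			      uint64_t out_len)
{
	void *copy;

	/* Reject a zero-length result and a length past the end of @buf. */
	if (!out_len || out_len > buf_len)
		return NULL;

	copy = malloc((size_t)out_len);
	if (copy)
		memcpy(copy, buf, (size_t)out_len);
	return copy;
}
```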

> 
> > +		goto out;
> > +
> > +	/*
> > +	 * The quote buffer is a shared resource, so use it only for the
> > +	 * SEAMCALL and copy the data out as soon as possible.
> > +	 */
> > +	quote_dup = kvmemdup(quote_data.buf, out_len, GFP_KERNEL);
> 
> So at init time we allocate a vmalloc for the quote and pre-populate the
> hpa_list. Then we use it every time and copy the contents to a new vmalloc.
> Would it really be that hard to keep the hpa list allocation around, do a
> vmalloc here and update the pfn list. Then do get quote on that and pass back
> the vmalloc we just allocated? Just feels like global reuse way has extra pieces
> in it. Compared to the whole quoting operation, this vmalloc_to_pfn() loop is
> probably not very expensive.

Hm interesting idea. But a Quote buffer could be close to 4MB in the worst
case. Let's say max_quote_size is 3MB, that's 768 vmalloc_to_pfn() calls
each time... That sounds a bit excessive right?
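
The page count above is just the buffer size rounded up to whole pages. A
small sketch of that arithmetic, assuming 4 KiB pages (the helper name is made
up for the example):

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE	4096u

/* Round @bytes up to whole pages, like PAGE_ALIGN(bytes) / PAGE_SIZE. */
static size_t demo_pages_for(size_t bytes)
{
	return (bytes + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

With these assumptions a 3 MiB buffer is 768 pages and a 4 MiB buffer is 1024,
so a per-request rebuild of the list would cost one PFN lookup per page.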

The extra bits mainly come from using kvmemdup() I think. Having to use
kvfree() on it does feel a bit annoying but that was the tradeoff I
made...

> 


* Re: [RFC PATCH 09/15] x86/virt/tdx: Add interface to generate a Quote
From: Peter Fang @ 2026-06-14 11:36 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <7c7d21c6-1f8a-42c6-a950-8fd61d702679@intel.com>

On Thu, Jun 11, 2026 at 08:15:50PM +0300, Adrian Hunter wrote:
> On 22/05/2026 06:41, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Use the TDX Quoting extension's TDH.QUOTE.GET SEAMCALL to generate a
> > Quote. Since the interface is shared across all KVM instances,
> > serialize access to the SEAMCALL buffer with a mutex.
> 
> Isn't the concurrency configurable, so supporting only 1 instance
> is a decision of the software implementation, not a TDX limitation?

Ah yes, I should document that. I'll put that in the patch log.

> 
> > +static u64 tdx_quote_get(struct tdx_td *td, u64 in_data_pa, u64 in_data_len,
> > +			 u64 hpa_list_pa, u64 total_len, u64 *quote_len)
> > +{
> > +	struct tdx_module_args args = {
> > +		.rcx = tdx_tdr_pa(td),
> > +		/* Don't bother specifying the quote id */
> 
> Need to explain why

Will do. It's because we use whatever the default Quote ID is.

> 
> ...
> 
> > +	r = tdx_quote_get(td, quote_data.hpa_list[0], (u64)in_data_len,
> > +			  quote_data.hpa_list_pa, quote_data.buf_len, &out_len);
> > +	if (r || !out_len || out_len > quote_data.buf_len)
> 
> Is r != TDX_SUCCESS more consistent

Yep I can fix that. Thanks.

> 


* Re: [RFC PATCH 10/15] x86/tdx: Move and rename Quote request structure
From: Peter Fang @ 2026-06-14 11:50 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <5f9474ed-bacb-44d5-a0fc-5a29a1e79b60@intel.com>

On Thu, Jun 11, 2026 at 08:16:37PM +0300, Adrian Hunter wrote:
> > -static int wait_for_quote_completion(struct tdx_quote_buf *quote_buf, u32 timeout)
> > +static int wait_for_quote_completion(struct tdx_quote_req *quote_buf, u32 timeout)
> 
> Seems inconsistent to rename the struct but not the variable names

Good catch, I'll fix that.

> 
> >  {
> >  	int i = 0;
> 
> Please note, the timeout condition in wait_for_quote_completion() is
> broken, in that the final value of i is timeout + 1 not timeout.
> Since you are in the same area, that needs fixing too.

Thanks for catching that. This needs to be fixed. We can submit a
separate guest-only patch.
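
The loop body is not quoted in this thread, so the following is only a sketch
of how such an off-by-one can arise, assuming the condition uses a
post-increment. Both function names are hypothetical, and the sleep is
omitted.

```c
#include <assert.h>

/*
 * Hypothetical loop shape: poll up to @timeout times while @pending stays
 * nonzero. The post-increment still runs when the comparison fails.
 */
static int demo_poll_post_inc(int pending, int timeout)
{
	int i = 0;

	while (pending && i++ < timeout)
		;	/* a real loop would sleep here */

	return i;	/* timeout + 1 when the wait expires */
}

/* Same loop with the increment moved into the body: i stops at @timeout. */
static int demo_poll_fixed(int pending, int timeout)
{
	int i = 0;

	while (pending && i < timeout)
		i++;	/* a real loop would sleep, then count the attempt */

	return i;
}
```

With the first shape, a later `i == timeout` test never matches on expiry,
which is the kind of breakage described above.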

> 


* Re: [RFC PATCH 10/15] x86/tdx: Move and rename Quote request structure
From: Peter Fang @ 2026-06-14 11:51 UTC (permalink / raw)
  To: Dan Williams (nvidia)
  Cc: Xu Yilun, kas, rick.p.edgecombe, x86, linux-coco, linux-kernel,
	kvm, sohil.mehta, yilun.xu, baolu.lu, zhenzhong.duan, xiaoyao.li
In-Reply-To: <6a2c9e7570dd_9b855100eb@djbw-dev.notmuch>

On Fri, Jun 12, 2026 at 05:04:05PM -0700, Dan Williams (nvidia) wrote:
> >  }
> >  #endif /* CONFIG_INTEL_TDX_GUEST && CONFIG_KVM_GUEST */
> >  
> > +#if defined(CONFIG_INTEL_TDX_GUEST) || defined(CONFIG_KVM_INTEL_TDX)
> > +/* struct tdx_quote_req: Format of Quote request message.
> > + * @version: Quote format version, filled by TD.
> > + * @status: Status code of Quote request, filled by VMM.
> > + * @in_len: Length of TDREPORT, filled by TD.
> > + * @out_len: Length of Quote data, filled by VMM.
> > + * @data: Quote data on output or TDREPORT on input.
> > + *
> > + * More details of Quote request message can be found in TDX
> > + * Guest-Host Communication Interface (GHCI) for Intel TDX 1.0,
> > + * section titled "TDG.VP.VMCALL<GetQuote>"
> > + */
> > +struct tdx_quote_req {
> > +	u64 version;
> > +	u64 status;
> > +	u32 in_len;
> > +	u32 out_len;
> > +	u8 data[];
> > +};
> > +#endif /* CONFIG_INTEL_TDX_GUEST || CONFIG_KVM_INTEL_TDX */
> 
> Drop the ifdef guards.
> 
> There is no cost to allowing a data structure to be defined
> unconditionally. Usually the ifdef guards are to prevent compilation
> errors when symbols do not resolve.
> 
> Otherwise looks ok.
> 
> Reviewed-by: Dan Williams <djbw@kernel.org>

Will do, thanks for the review Dan!
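
For reference, a userspace mirror of the quoted layout together with the kind
of length check patch 12 performs on it. Treating the header size as the
offset of the flexible array member is an assumption (the definition of
TDX_QUOTE_REQ_HDR_SIZE is not quoted here), and the demo_* names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the quoted layout, using fixed-width types. */
struct demo_quote_req {
	uint64_t version;
	uint64_t status;
	uint32_t in_len;
	uint32_t out_len;
	uint8_t data[];
};

/* Assumption: the header is everything before the flexible array member. */
#define DEMO_QUOTE_REQ_HDR_SIZE	offsetof(struct demo_quote_req, data)

/* Return 1 if the header plus @in_len bytes of input fit in @req_len bytes. */
static int demo_req_fits(const struct demo_quote_req *req, size_t req_len)
{
	/* Widen before adding so that, on 64-bit builds, the sum cannot wrap. */
	return (size_t)req->in_len + DEMO_QUOTE_REQ_HDR_SIZE <= req_len;
}
```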


* Re: [RFC PATCH 12/15] KVM: TDX: Add in-kernel Quote generation
From: Peter Fang @ 2026-06-14 11:57 UTC (permalink / raw)
  To: Dan Williams (nvidia)
  Cc: Xu Yilun, kas, rick.p.edgecombe, x86, linux-coco, linux-kernel,
	kvm, sohil.mehta, yilun.xu, baolu.lu, zhenzhong.duan, xiaoyao.li
In-Reply-To: <6a2ca24f16277_9b85510070@djbw-dev.notmuch>

On Fri, Jun 12, 2026 at 05:20:31PM -0700, Dan Williams (nvidia) wrote:
> [..]
> > +static u64 __get_quote_kernel(struct kvm_vcpu *vcpu, struct tdx_quote_req *req,
> > +			      size_t req_len, gpa_t req_gpa, size_t total_len)
> > +{
> > +	struct tdx_td *td = &to_kvm_tdx(vcpu->kvm)->td;
> > +
> > +	/* Only support version 1 as defined in the GHCI spec */
> > +	if (req->version != 1)
> > +		return TDX_QUOTE_STATUS_ERROR;
> > +
> > +	if ((size_t)req->in_len + TDX_QUOTE_REQ_HDR_SIZE > req_len)
> > +		return TDX_QUOTE_STATUS_ERROR;
> > +
> > +	/* The caller frees the quote data */
> 
> No, it is freed by cleanup as far as I can see

Ah makes sense. I'll fix it up.

> 
> > +	void *quote_data __free(kvfree) =
> 
> ...this shadows the global "quote_data". A global really should be
> properly namespaced.

Good point... I'll fix the naming. Thanks.
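
A small userspace sketch of the shadowing issue, using the GCC/Clang cleanup
attribute that the kernel's __free() is built on. The int global and the
demo_* names are stand-ins invented for the example; only the name
"quote_data" comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int quote_data = 42;	/* stand-in for the file-scope object */

/* Cleanup handler: receives a pointer to the variable going out of scope. */
static void demo_free(void **p)
{
	free(*p);
}

static size_t demo_shadowed_size(void)
{
	/* This local hides the file-scope quote_data until the block ends. */
	void *quote_data __attribute__((cleanup(demo_free))) = malloc(16);

	return sizeof(quote_data);	/* size of the pointer, not the int */
}

static int demo_global_value(void)
{
	return quote_data;	/* the file-scope object is visible here */
}
```

Inside demo_shadowed_size() there is no way to name the file-scope object,
which is why a distinct, prefixed name for the global avoids the confusion.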


* Re: [RFC PATCH 13/15] KVM: TDX: Support event-notify interrupts only with userspace quoting
From: Peter Fang @ 2026-06-14 12:57 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <7090f4af-3a6d-40fd-82ab-0ba6272534dd@intel.com>

On Thu, Jun 11, 2026 at 10:36:52PM +0300, Adrian Hunter wrote:
> On 22/05/2026 06:41, Xu Yilun wrote:
> > From: Peter Fang <peter.fang@intel.com>
> > 
> > Tie userspace SetupEventNotifyInterrupt support to userspace Quote
> > generation. Delivering event-notify interrupts via userspace breaks if
> > KVM never exits to userspace in the first place.
> 
> Breaks how exactly?
> 
> Seems like a TDX guest has no way to know whether the VMM will use
> the Event Notify Interrupt anyway, so it cannot rely upon it, so
> it should already handle the case when the interrupt does not fire.

Hm that's an interesting point. But isn't the whole point of
SetupEventNotifyInterrupt to set up a contract with the host VMM? The
GHCI spec is quite loose about this.

If we say "the host VMM is not required to honor this contract", then
maybe this doesn't truly break anything. But then this stance kind of
makes this whole feature moot, or at least not very useful?

Not adding this patch feels like making this problem worse, right?
Because now we will have platforms that won't ever fire these
interrupts, and the host still tells the guest SetupEventNotifyInterrupt
is supported.

> 
> > 
> > No known guest currently requires event-notify interrupt support, so
> > defer adding in-kernel support for now. Linux TDX guests use polling
> > only.
> 
> If no guest is using it, then why does it need special treatment?

Just to maintain the status quo, basically. Seems like previously there was
some interest in adding this support to the guest at some point. This
patch simply turns off this feature when quoting is not done in
userspace. But platforms that do quoting in userspace (e.g. don't
support DICE extension) can observe the same behavior as today, if/when
such a guest comes into existence.

> 
> > 
> > @@ -7335,6 +7335,9 @@ inputs and outputs of the TDVMCALL.  Currently the following values of
> >     queued successfully, the TDX guest can poll the status field in the
> >     shared-memory area to check whether the Quote generation is completed or
> >     not. When completed, the generated Quote is returned via the same buffer.
> > +   If the host kernel generates Quotes through the TDX Quoting service provided
> > +   by the TDX module, KVM processes the GetQuote request and it will not appear
> > +   in userspace.
> 
> There is an Attestation section in Documentation/virt/kvm/x86/intel-tdx.rst
> that could be updated too.

Can you please point me to it? I couldn't find that section in that
file.

> 
> > +                  KVM only supports version 1 of the GetQuote request.
> 
> Is that relevant here?

Documenting this came up during some internal discussions. But yeah it
looks a bit out of place. I can remove it.

> 
> >  
> >   * ``TDVMCALL_GET_TD_VM_CALL_INFO``: the guest has requested the support
> >     status of TDVMCALLs.  The output values for the given leaf should be
> > @@ -7342,7 +7345,10 @@ inputs and outputs of the TDVMCALL.  Currently the following values of
> >     field of the union.
> >  
> >   * ``TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT``: the guest has requested to
> > -   set up a notification interrupt for vector ``vector``.
> > +   set up a notification interrupt for vector ``vector``.  Since this TDVMCALL
> > +   is used to optimize ``TDVMCALL_GET_QUOTE``, KVM disables this support in
> > +   userspace VMM if ``TDVMCALL_GET_QUOTE`` is completely handled in the kernel.
> > +   KVM may add kernel support for this in the future.
> 
> Is that really necessary?

I think this is related to the discussion above about how hard the host VMM
should try to honor the SetupEventNotifyInterrupt contract.

> 


* Re: [RFC PATCH 13/15] KVM: TDX: Support event-notify interrupts only with userspace quoting
From: Adrian Hunter @ 2026-06-15  4:39 UTC (permalink / raw)
  To: Peter Fang
  Cc: Xu Yilun, kas, djbw, rick.p.edgecombe, x86, linux-coco,
	linux-kernel, kvm, sohil.mehta, yilun.xu, baolu.lu,
	zhenzhong.duan, xiaoyao.li
In-Reply-To: <20260614125750.GB3425618@pedri>

>>> @@ -7335,6 +7335,9 @@ inputs and outputs of the TDVMCALL.  Currently the following values of
>>>     queued successfully, the TDX guest can poll the status field in the
>>>     shared-memory area to check whether the Quote generation is completed or
>>>     not. When completed, the generated Quote is returned via the same buffer.
>>> +   If the host kernel generates Quotes through the TDX Quoting service provided
>>> +   by the TDX module, KVM processes the GetQuote request and it will not appear
>>> +   in userspace.
>>
>> There is an Attestation section in Documentation/virt/kvm/x86/intel-tdx.rst
>> that could be updated too.
> 
> Can you please point me to it? I couldn't find that section in that
> file.

Sorry, got the file name wrong: Documentation/arch/x86/tdx.rst



* RE: [RFC PATCH 0/6] Support virtio-mem memory hotplug in TDX guests
From: Duan, Zhenzhong @ 2026-06-15  7:54 UTC (permalink / raw)
  To: Kiryl Shutsemau
  Cc: marcandre.lureau@redhat.com, david@kernel.org, Edgecombe, Rick P,
	prsampat@amd.com, pbonzini@redhat.com, mst@redhat.com,
	peterx@redhat.com, Qiang, Chenyi, Reshetova, Elena,
	michaeluth@amd.com, ackerleytng@google.com,
	linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, Xu, Yilun,
	Li, Xiaoyao, Peng, Chao P
In-Reply-To: <aiv0y-Op9bfP-CVO@thinkstation>

>-----Original Message-----
>From: Kiryl Shutsemau <kas@kernel.org>
>Subject: Re: [RFC PATCH 0/6] Support virtio-mem memory hotplug in TDX guests
>
>On Thu, Jun 04, 2026 at 05:35:45AM -0400, Zhenzhong Duan wrote:
>> 2. Re-accepting already-accepted memory returns errors. Ignoring these errors
>> can mislead the guest into believing re-accepted memory is zeroed when it
>> contains stale data.
>
>Re-accepting concern is valid, but often overblown.

> Reaccepting memory that never got allocated is fine.

I don't quite understand. "Reaccepting" implies accepting memory that was
already accepted earlier. For that to happen, the memory must have already
been allocated on the VMM side, correct?

>
>> == About this series ==
>>
>> This series takes a different direction, supporting start-private memory
>> and addressing the limitations of previous series [1] by implementing a
>> callback-based infrastructure that integrates TDX memory acceptance and
>> release operations with proper subblock granularity.
>
>You are presenting these callbacks as generic memory hotplug thingy, but
>it is only plugged into virtio mem. ACPI hotplug won't accept/release
>memory unless I miss something. Are you expecting them to cover non
>virtio cases too?

You are right, I didn't add ACPI hotplug in this series. I'm working on RFCv2
supporting both virtio-mem and ACPI hotplug in eager/lazy accept mode.

>
>And these callbacks feel like a very ad-hoc solution.

OK, will drop the callbacks in RFCv2.

>
>> See Rick and Paolo's
>> discussion about using TDG.MEM.PAGE.RELEASE in [1].
>
>Having RELEASE in hotplug path without addressing private->shared
>conversion first is odd. That's the most obvious path that has to be
>covered first.
>
>Hm?

This patch series assumes that memory is plugged in as private memory
and must remain private prior to being unplugged. During the unplugging
process, memory is allocated from the buddy system and marked as
FAKE_OFFLINE. Because all free memory within the buddy system is
strictly private, shared memory can never be unplugged.

Shared memory is originally converted from private memory allocated by
the buddy system. Consequently, the driver must convert any shared
memory back to private and return it to the buddy system before it can
be unplugged.
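
For illustration, the invariant described above can be sketched as a
small state machine. This is a userspace model only, not code from the
series: the enum values and helper names are hypothetical, and the only
rule taken from the text is that unplug is reachable solely from
free, private memory in the buddy allocator.

```c
#include <stdbool.h>

/* Hypothetical states for one memory block; names are illustrative only. */
enum blk_state {
	BLK_BUDDY_FREE,    /* private, free in the buddy allocator */
	BLK_ALLOC_PRIVATE, /* private, allocated to a driver */
	BLK_ALLOC_SHARED,  /* converted to shared by the owning driver */
	BLK_FAKE_OFFLINE,  /* taken from the buddy allocator for unplug */
};

/* Move to 'to' only if the block is currently in 'from'. */
static bool blk_move(enum blk_state *s, enum blk_state from, enum blk_state to)
{
	if (*s != from)
		return false;
	*s = to;
	return true;
}

/* Driver allocates private memory from the buddy allocator. */
static bool blk_alloc(enum blk_state *s)
{
	return blk_move(s, BLK_BUDDY_FREE, BLK_ALLOC_PRIVATE);
}

/* Driver converts its private allocation to shared. */
static bool blk_share(enum blk_state *s)
{
	return blk_move(s, BLK_ALLOC_PRIVATE, BLK_ALLOC_SHARED);
}

/* Driver converts shared memory back to private. */
static bool blk_unshare(enum blk_state *s)
{
	return blk_move(s, BLK_ALLOC_SHARED, BLK_ALLOC_PRIVATE);
}

/* Driver returns private memory to the buddy allocator. */
static bool blk_free(enum blk_state *s)
{
	return blk_move(s, BLK_ALLOC_PRIVATE, BLK_BUDDY_FREE);
}

/* Unplug takes free (hence private) memory only; shared can never match. */
static bool blk_unplug(enum blk_state *s)
{
	return blk_move(s, BLK_BUDDY_FREE, BLK_FAKE_OFFLINE);
}
```

In this model a shared block has to go through blk_unshare() and
blk_free() before blk_unplug() can succeed, which mirrors the ordering
requirement stated above.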

>
>> == Future work ==
>> support lazy accept
>
>It would be nice to have some outline on how we will get there to
>understand if this patchset is stepping stone or dead end that has to be
>thrown away later on.

I realized the callbacks are used specifically for eager accept; they are
not useful for lazy accept. So, I will drop them in RFCv2.

>
>Hot[un]plug is often used to manage an overcommitted host. Eager accept
>might be counter-productive.

Agreed, I should have taken lazy accept into consideration from the start.

Thanks
Zhenzhong

^ permalink raw reply

* Re: [PATCH RFC 0/3] KVM: guest_memfd: folio migration for non-confidential VMs
From: Alexandru Elisei @ 2026-06-15 10:43 UTC (permalink / raw)
  To: Shivank Garg
  Cc: Matthew Wilcox (Oracle), Jan Kara, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Zi Yan, David Hildenbrand, Matthew Brost,
	Joshua Hahn, Rakie Kim, Byungchul Park, Gregory Price, Ying Huang,
	Alistair Popple, Paolo Bonzini, Shuah Khan, Chao Peng,
	Nikunj A Dadhania, Ira Weiny, Michael Roth, Pankaj Gupta,
	Ackerley Tng, Fuad Tabba, Sean Christopherson, Vishal Annapurve,
	Nikita Kalyazin, Patrick Roy, Pratik Sampat, Ashish Kalra,
	linux-fsdevel, linux-coco, linux-mm, linux-kernel, kvm,
	linux-kselftest
In-Reply-To: <20260611-shivank-gmem-migrate-v1-0-2d266bfc6f95@amd.com>

Hi,

On Thu, Jun 11, 2026 at 01:05:07PM +0000, Shivank Garg wrote:
> guest_memfd folios are currently marked unmovable, so the kernel cannot
> perform NUMA-balancing, memory compaction, etc. This is unavoidable for
> confidential VMs (SEV-SNP, TDX), since memory is encrypted and copying it
> needs firmware assistance. However, for non-confidential VMs (like
> Firecracker), we can migrate the folios.
> 
> This series enables folio migration for non-confidential guest_memfd and
> also lays the groundwork for migrating confidential guest_memfd later.
> Once firmware-assisted copying support is available, those VMs can be
> made movable, the confidential folio content can be copied separately,
> and the destination folio marked with FOLIO_CONTENT_COPIED so
> __migrate_folio() skips the host-side folio_mc_copy().

I always thought that one of the nice things about using guest_memfd as a
memory backend, as opposed to host userspace mappings, is that the host
cannot unmap VM memory because of KSM, automatic NUMA balancing, hugepage
collapse, compaction, etc, acting on the host userspace mapping of the
VM memory, and outside of the VMM's or KVM's control.

I think it would be useful to preserve this behaviour, even in the absence
of confidential VMs (i.e., guest_memfd file descriptor created with
GUEST_MEMFD_FLAG_MMAP).

Thanks,
Alex

> 
> Testing
> -------
> Host: 7.1-rc7 + this, 2 NUMA nodes
> 
> - KVM selftest: allocate folios on node 0, migrate them to node 1 and
>   back and verify resulting NUMA node and the folio contents at each
>   step.
> 
> - Firecracker [1]: booted a microVM backed by guest_memfd. While the
>   guest was running, forced host-side migration of its folios via
>   migratepages(8) and explicit move_pages(2) of guest_memfd
>   pages. Verify with /proc/firecracker_pid/numa_maps.
> 
> [1] https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
>     and change builder.rs to remove GUEST_MEMFD_FLAG_NO_DIRECT_MAP from
>     vm.create_guest_memfd()
> 
> Best regards,
> Shivank
> 
> Signed-off-by: Shivank Garg <shivankg@amd.com>
> ---
> Shivank Garg (3):
>       mm: split AS_UNMOVABLE back out of AS_INACCESSIBLE
>       KVM: guest_memfd: support folio migration for non-confidential VMs
>       KVM: selftests: exercise guest_memfd folio migration
> 
>  include/linux/pagemap.h                        | 24 ++++++--
>  mm/compaction.c                                | 12 ++--
>  mm/migrate.c                                   |  2 +-
>  tools/testing/selftests/kvm/guest_memfd_test.c | 77 ++++++++++++++++++++++++++
>  virt/kvm/guest_memfd.c                         | 49 ++++++++++++++--
>  5 files changed, 149 insertions(+), 15 deletions(-)
> ---
> base-commit: 4549871118cf616eecdd2d939f78e3b9e1dddc48
> change-id: 20260611-shivank-gmem-migrate-8c1c519b30a6
> 
> Best regards,
> -- 
> Shivank Garg <shivankg@amd.com>
> 
> 

^ permalink raw reply

* Re: [PATCH v13 09/22] KVM: selftests: Expose functions to get default sregs values
From: Chenyi Qiang @ 2026-06-15 10:54 UTC (permalink / raw)
  To: Binbin Wu, Lisa Wang
  Cc: Andrew Jones, Ackerley Tng, Chao Gao, Dave Hansen, Erdem Aktas,
	Ira Weiny, Isaku Yamahata, Kiryl Shutsemau, linux-kselftest,
	Paolo Bonzini, Pratik R. Sampat, Reinette Chatre, Rick Edgecombe,
	Roger Wang, Ryan Afranji, Sagi Shahar, Sean Christopherson,
	Shuah Khan, Oliver Upton, Jeremiah McReynolds, kvm, linux-coco,
	linux-kernel, x86
In-Reply-To: <434e7f9a-5f64-4488-bf9d-5be8c3f9eefe@linux.intel.com>



On 6/8/2026 2:39 PM, Binbin Wu wrote:
> On 5/22/2026 7:16 AM, Lisa Wang wrote:
> 
> [...]
> 
>> +
>> +static inline u64 kvm_get_default_cr4(void)
>> +{
>> +	u64 cr4 = X86_CR4_PAE | X86_CR4_OSFXSR;
>> +
>> +	if (kvm_cpu_has(X86_FEATURE_XSAVE))
>> +		cr4 |= X86_CR4_OSXSAVE;
>> +	return cr4;
>> +}
>> +
> 
> [...]
> 
>> @@ -647,16 +643,12 @@ static void vcpu_init_sregs(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
>>  	vcpu_sregs_get(vcpu, &sregs);
>>  
>>  	sregs.idt.base = vm->arch.idt;
>> -	sregs.idt.limit = NUM_INTERRUPTS * sizeof(struct idt_entry) - 1;
>> +	sregs.idt.limit = kvm_get_default_idt_limit();
>>  	sregs.gdt.base = vm->arch.gdt;
>> -	sregs.gdt.limit = getpagesize() - 1;
>> -
>> -	sregs.cr0 = X86_CR0_PE | X86_CR0_NE | X86_CR0_PG;
>> -	sregs.cr4 |= X86_CR4_PAE | X86_CR4_OSFXSR;
>> -	if (kvm_cpu_has(X86_FEATURE_XSAVE))
>> -		sregs.cr4 |= X86_CR4_OSXSAVE;
>> -	if (vm->mmu.pgtable_levels == 5)
>> -		sregs.cr4 |= X86_CR4_LA57;
> 
> I guess the 5-level paging thing is dropped unexpectedly during rebase?
> 
> 
>> +	sregs.gdt.limit = kvm_get_default_gdt_limit();
>>
>> +	sregs.cr0 = kvm_get_default_cr0();
>> +	sregs.cr4 |= kvm_get_default_cr4();
>>  	sregs.efer |= (EFER_LME | EFER_LMA | EFER_NX);

Also, sregs.efer |= kvm_get_default_efer() is dropped unexpectedly during rebase.

>>  
>>  	kvm_seg_set_unusable(&sregs.ldt);
>>
> 
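
For illustration, a minimal sketch of a default-CR4 helper that keeps
the 5-level paging bit. This is not the code from the series: the
function name and its two parameters are hypothetical stand-ins for
kvm_cpu_has(X86_FEATURE_XSAVE) and vm->mmu.pgtable_levels, and the bit
positions are the architectural CR4 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define X86_CR4_PAE     (1ULL << 5)
#define X86_CR4_OSFXSR  (1ULL << 9)
#define X86_CR4_LA57    (1ULL << 12)
#define X86_CR4_OSXSAVE (1ULL << 18)

/*
 * Hypothetical helper: the parameters replace the selftest's CPU feature
 * query and per-VM page table depth so the sketch stands alone.
 */
static uint64_t default_cr4(bool has_xsave, int pgtable_levels)
{
	uint64_t cr4 = X86_CR4_PAE | X86_CR4_OSFXSR;

	if (has_xsave)
		cr4 |= X86_CR4_OSXSAVE;
	/* The bit that went missing in the rebase. */
	if (pgtable_levels == 5)
		cr4 |= X86_CR4_LA57;
	return cr4;
}
```

Because LA57 depends on per-VM state, a helper that takes no arguments
cannot set it by itself; either the helper grows a parameter as assumed
here, or the caller ORs the bit in afterwards.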


^ permalink raw reply

* Re: [PATCH RFC 0/3] KVM: guest_memfd: folio migration for non-confidential VMs
From: Alexandru Elisei @ 2026-06-15 11:04 UTC (permalink / raw)
  To: Shivank Garg
  Cc: Matthew Wilcox (Oracle), Jan Kara, Andrew Morton, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Zi Yan, David Hildenbrand, Matthew Brost,
	Joshua Hahn, Rakie Kim, Byungchul Park, Gregory Price, Ying Huang,
	Alistair Popple, Paolo Bonzini, Shuah Khan, Chao Peng,
	Nikunj A Dadhania, Ira Weiny, Michael Roth, Pankaj Gupta,
	Ackerley Tng, Fuad Tabba, Sean Christopherson, Vishal Annapurve,
	Nikita Kalyazin, Patrick Roy, Pratik Sampat, Ashish Kalra,
	linux-fsdevel, linux-coco, linux-mm, linux-kernel, kvm,
	linux-kselftest
In-Reply-To: <ai_XK__RTXMCEcCG@raptor>

Hi,

On Mon, Jun 15, 2026 at 11:43:14AM +0100, Alexandru Elisei wrote:
> Hi,
> 
> On Thu, Jun 11, 2026 at 01:05:07PM +0000, Shivank Garg wrote:
> > guest_memfd folios are currently marked unmovable, so the kernel cannot
> > perform NUMA-balancing, memory compaction, etc. This is unavoidable for
> > confidential VMs (SEV-SNP, TDX), since memory is encrypted and copying it
> > needs firmware assistance. However, for non-confidential VMs (like
> > Firecracker), we can migrate the folios.
> > 
> > This series enables folio migration for non-confidential guest_memfd and
> > also lays the groundwork for migrating confidential guest_memfd later.
> > Once firmware-assisted copying support is available, those VMs can be
> > made movable, the confidential folio content can be copied separately,
> > and the destination folio marked with FOLIO_CONTENT_COPIED so
> > __migrate_folio() skips the host-side folio_mc_copy().
> 
> I always thought that one of the nice things about using guest_memfd as a
> memory backend, as opposed to host userspace mappings, is that the host
> cannot unmap VM memory because of KSM, automatic NUMA balancing, hugepage
> collapse, compaction, etc, acting on the host userspace mapping of the
> VM memory, and outside of the VMM's or KVM's control.
> 
> I think it would be useful to preserve this behaviour, even in the absence
> of confidential VMs (i.e., guest_memfd file descriptor created with
> GUEST_MEMFD_FLAG_MMAP).

Just to be clear, I was thinking that it might be useful for both
behaviours to exist (migratable and non-migratable) for non-confidential
VMs, and allow KVM or userspace to decide which they prefer for a
guest_memfd.

Thanks,
Alex

^ permalink raw reply

* Re: [PATCH v14 10/44] arm64: RMI: Add support for SRO
From: Steven Price @ 2026-06-15 11:45 UTC (permalink / raw)
  To: Dan Williams (nvidia), Gavin Shan, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco, Ganapatrao Kulkarni, Shanker Donthineni,
	Alper Gun, Aneesh Kumar K . V, Emi Kisanuki, Vishal Annapurve,
	WeiLin.Chang, Lorenzo.Pieralisi2
In-Reply-To: <6a2c91398fad5_a003b10027@djbw-dev.notmuch>

Hi Dan,

On 13/06/2026 00:07, Dan Williams (nvidia) wrote:
> Steven Price wrote:
> [..]
>>> alloc_pages_exact() will fail if the requested size exceeds the maximal
>>> allowed
>>> size (1 << MAX_PAGE_ORDER). The maximal size is usually smaller than
>>> PUD_SIZE
>>> but PUD_SIZE is allowed by the RMM.
>>
>> This is an area where to be honest I'm really not sure what to do.
>> Technically the RMM is allowed to ask for a contiguous range of 512GB
>> pages (on a 4K system - larger with larger page sizes) - but clearly no
>> real OS is going to be able to provide anything like that.
>>
>> In practise we don't expect the RMM to do anything so crazy. It's not
>> really clear to me whether even 2MB (PMD_SIZE) is needed. But the spec
>> is written to be generic.
>>
>> So my current approach is to calculate the required size and pass it
>> into alloc_pages_exact(). For "stupidly large" values this will fail and
>> Linux just doesn't support an RMM which attempts this. If there is ever
>> a usecase which needs this then we'd need to find a different method of
>> providing the memory (most likely some form of carveout to avoid
>> fragmentation). But my view is we should wait for that usecase to be
>> identified first.
> 
> Just some comparison comments as I am also going through the TDX patches
> which enable "Extension SEAMCALLs". These new SEAMCALLs are similar to
> the SRO mechanism [1].

Looks like at least at the moment it's much more one-way than the SRO
mechanism - there's no reclaim mechanism (yet).

> TDX asks for an upfront delegation of memory at init time using
> alloc_contig_pages() that is never returned until the entire module is
> shut down. alloc_contig_pages() is not subject to the MAX_ORDER limit,
> but not sure that alloc_contig_pages() is suitable for small+dynamic
> runtime memory add / release that SRO potentially wants to do?

Yeah I'm not sure quite what is best. I expect the RMM to only request
contiguous memory for very small allocations to use as hardware page
tables. It's an issue I'm trying to work through that the specification
doesn't provide any guidance for what sort of allocations the host
should expect to provide.
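
To put rough numbers on the alloc_pages_exact() limit discussed above,
here is a small userspace sketch of the size arithmetic. It assumes 4K
pages and MAX_PAGE_ORDER == 10 (both are configuration dependent), and
the helper names are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed configuration: 4K pages, MAX_PAGE_ORDER of 10. */
#define PAGE_SHIFT     12
#define MAX_PAGE_ORDER 10

/* Sizes for a 4K granule; the 512GB entry is the next level above PUD. */
#define PMD_SIZE   (1ULL << 21) /* 2MB */
#define PUD_SIZE   (1ULL << 30) /* 1GB */
#define SIZE_512GB (1ULL << 39)

/* Largest physically contiguous block the buddy allocator can hand out. */
static uint64_t max_buddy_bytes(void)
{
	return 1ULL << (PAGE_SHIFT + MAX_PAGE_ORDER);
}

/* True if a request of 'bytes' stays within the buddy allocator limit. */
static bool fits_buddy(uint64_t bytes)
{
	return bytes <= max_buddy_bytes();
}
```

Under these assumptions the limit is 4MB, so a PMD_SIZE request fits
while PUD_SIZE and anything larger would fail.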

> Does SRO always balance the size of RMI_OP_MEM_REQ_DONATE with
> RMI_OP_MEM_REQ_RECLAIM, or might some donate requests be a one way
> donation like TDX? Just poking to see if there is a path to preallocate
> a pool vs the fine grained per-operation alloc/free.

The spec is unfortunately not prescriptive on this point. For an
operation which eventually fails, the expectation is that the RMM will
return all the memory that was provided (and exactly that memory). But
the specification doesn't actually require that.

The problem is that there are situations where a racing operation on
another CPU could trigger this to not happen. For example, a new page
table needs to be allocated to complete a map operation, but then a
racing operation on another CPU makes use of this page table (e.g. due to
a map at a different address), the memory for the page table cannot be
returned even if the operation doesn't complete because it's in use from
the racing operation.

I don't believe the current RMM design will actually do this - but it's
not something we actually want to prevent in the spec.

Equally the expectation is that all the donated memory for a guest will
be returned when the guest is destroyed. But we don't have anything in
the spec to enforce this.

I don't particularly expect a pool to be that useful for the expected
memory allocation patterns as I expect SRO donations to be long lived.
We don't (yet at least) have a concept of donating memory just for
"scratch" memory during an operation. Although the SRO mechanism doesn't
rule that out.

Thanks,
Steve


^ permalink raw reply

* Re: [RFC PATCH] mm/vmalloc: add vmalloc_decrypted() and vzalloc_decrypted()
From: Jason Gunthorpe @ 2026-06-15 12:09 UTC (permalink / raw)
  To: Michael Kelley
  Cc: Catalin Marinas, Christoph Hellwig, Kameron Carr,
	akpm@linux-foundation.org, urezki@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, rppt@kernel.org,
	linux-coco@lists.linux.dev, Suzuki K Poulose
In-Reply-To: <SN6PR02MB4157EC032AD55D182FBC1318D4182@SN6PR02MB4157.namprd02.prod.outlook.com>

On Fri, Jun 12, 2026 at 07:06:00PM +0000, Michael Kelley wrote:

> > I thought arches are either preserving the memory content or zeroing
> > it, you are saying some arch leaves it as garbage? I'd argue that's an
> > arch bug and they should clear it in their path.
> 
> AMD SEV-SNP leaves the memory contents as garbage after an encryption
> or decryption state change. On the flip side, my understanding has been
> that TDX zeroes the memory (or at least has an option to do so) after
> such a state change, though a couple of AI chats say TDX also leaves
> garbage. To be sure, I'd have to run an experiment to check in a TDX
> guest on Hyper-V.

So there are many bugs then if the pre-zero is lost and you have to
zero it again. Even swiotlb doesn't reliably zero its pools in the
right order under these rules, though alloc coherent does get it
right at least.

IMHO this is too sketchy to be usable and optimizing for AMD is not
the right call, IMHO.

> > Otherwise this sharp edge is not documented and we have many other
> > places getting it wrong, eg system_heap_allocate() doesn't re-zero the
> > memory after decrypting it.
> 
> In the Hyper-V code that uses set_memory_decrypted()/encrypted(),
> there's always an explicit call to set the memory to zero afterwards.

Good for it, maybe next time improve the APIs :(

Even more compelling that hyper-v should be using the dma api..

Jason

^ permalink raw reply

* [PATCH] PCI/TSM: Resume device to D0 for CMA-SPDM operation
From: Lukas Wunner @ 2026-06-15 13:19 UTC (permalink / raw)
  To: Dan Williams, Ashish Kalra, Tom Lendacky
  Cc: Vivaik Balasubrawmanian, John Allen, Bjorn Helgaas, linux-coco,
	linux-pci, Jonathan Cameron, Aneesh Kumar K.V, Yilun Xu,
	Zhenzhong Duan, Alexey Kardashevskiy

Per PCIe r7.0 sec 6.31.3, CMA-SPDM operation in non-D0 states is optional.
The spec does not define a way to determine if it's supported, so resume
to D0 unconditionally for the duration of a CMA-SPDM exchange.  Vivaik has
talked to Windows engineers and they said that Windows does the same.

Note that for plain DOE operation, it is sufficient for the device to be
in D3hot and its parents in D0 because config space remains accessible in
D3hot.  So CMA-SPDM goes beyond the requirements of plain DOE and hence
resuming to D0 needs to (only) be done in code paths which use DOE
specifically for CMA-SPDM.

The pattern used herein for runtime resume is the best practice introduced
by commit ef8057b07c72 ("PM: runtime: Wrapper macros for ACQUIRE()/
ACQUIRE_ERR()").

Fixes: 3225f52cde56 ("PCI/TSM: Establish Secure Sessions and Link Encryption")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v6.19+
Cc: Vivaik Balasubrawmanian <vivaik.balasubrawmanian@intel.com>
---
We're in the merge window for v7.2 and this isn't super urgent,
so it's targeting v7.3 via tsm.git/next.

Technically I'd have permission to apply myself,
but I wouldn't want to without acks from Dan and AMD!
Thanks for taking a look!

 drivers/crypto/ccp/sev-dev-tsm.c | 6 ++++++
 drivers/pci/tsm.c                | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/crypto/ccp/sev-dev-tsm.c b/drivers/crypto/ccp/sev-dev-tsm.c
index b07ae52..108204f7 100644
--- a/drivers/crypto/ccp/sev-dev-tsm.c
+++ b/drivers/crypto/ccp/sev-dev-tsm.c
@@ -7,6 +7,7 @@
 #include <linux/tsm.h>
 #include <linux/iommu.h>
 #include <linux/pci-doe.h>
+#include <linux/pm_runtime.h>
 #include <linux/bitfield.h>
 #include <linux/module.h>
 
@@ -30,6 +31,7 @@ static int sev_tio_spdm_cmd(struct tio_dsm *dsm, int ret)
 {
 	struct tsm_dsm_tio *dev_data = &dsm->data;
 	struct tsm_spdm *spdm = &dev_data->spdm;
+	int pm_ret;
 
 	/* Check the main command handler response before entering the loop */
 	if (ret == 0 && dev_data->psp_ret != SEV_RET_SUCCESS)
@@ -38,6 +40,10 @@ static int sev_tio_spdm_cmd(struct tio_dsm *dsm, int ret)
 	if (ret <= 0)
 		return ret;
 
+	PM_RUNTIME_ACQUIRE(&dsm->tsm.base_tsm.pdev->dev, pm);
+	if ((pm_ret = PM_RUNTIME_ACQUIRE_ERR(&pm)))
+		return pm_ret;
+
 	/* ret > 0 means "SPDM requested" */
 	while (ret == PCI_DOE_FEATURE_CMA || ret == PCI_DOE_FEATURE_SSESSION) {
 		ret = pci_doe(dsm->tsm.doe_mb, PCI_VENDOR_ID_PCI_SIG, ret,
diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c
index 5fdcd7f..af1817e 100644
--- a/drivers/pci/tsm.c
+++ b/drivers/pci/tsm.c
@@ -12,6 +12,7 @@
 #include <linux/pci.h>
 #include <linux/pci-doe.h>
 #include <linux/pci-tsm.h>
+#include <linux/pm_runtime.h>
 #include <linux/sysfs.h>
 #include <linux/tsm.h>
 #include <linux/xarray.h>
@@ -886,6 +887,7 @@ int pci_tsm_doe_transfer(struct pci_dev *pdev, u8 type, const void *req,
 			 size_t req_sz, void *resp, size_t resp_sz)
 {
 	struct pci_tsm_pf0 *tsm;
+	int rc;
 
 	if (!pdev->tsm || !is_pci_tsm_pf0(pdev))
 		return -ENXIO;
@@ -894,6 +896,10 @@ int pci_tsm_doe_transfer(struct pci_dev *pdev, u8 type, const void *req,
 	if (!tsm->doe_mb)
 		return -ENXIO;
 
+	PM_RUNTIME_ACQUIRE(&pdev->dev, pm);
+	if ((rc = PM_RUNTIME_ACQUIRE_ERR(&pm)))
+		return rc;
+
 	return pci_doe(tsm->doe_mb, PCI_VENDOR_ID_PCI_SIG, type, req, req_sz,
 		       resp, resp_sz);
 }
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH v7 2/6] firmware: hwrng: arm_smccc_trng: Register as an SMCCC device
From: Andre Przywara @ 2026-06-15 15:15 UTC (permalink / raw)
  To: Aneesh Kumar K.V (Arm), linux-coco, linux-arm-kernel,
	linux-kernel
  Cc: Catalin Marinas, Greg KH, Jeremy Linton, Jonathan Cameron,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Will Deacon,
	Steven Price, Suzuki K Poulose
In-Reply-To: <20260611130429.295516-3-aneesh.kumar@kernel.org>

Hi Aneesh,

thanks for doing this, we have thought about this for quite a while, but 
no one dared to just bite the bullet...

On 6/11/26 15:04, Aneesh Kumar K.V (Arm) wrote:
> The SMCCC TRNG interface is a firmware-provided SMCCC service rather than a
> standalone platform device. Now that the SMCCC core has an SMCCC bus,
> create an arm-smccc-trng device for the discovered TRNG service and convert
> the hwrng driver to an SMCCC driver.
> 
> The SMCCC id table preserves module autoloading for systems where the TRNG
> driver is built as a module.
> 
> The sysfs device path changes from the old smccc_trng platform-device path
> to an arm-smccc device path. No known userspace dependency on the old path
> was found; a Debian Code Search lookup for the existing platform-device
> name/path did not find any users.
> 
> Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
> ---
>   arch/arm64/include/asm/archrandom.h     |  2 +-
>   drivers/char/hw_random/arm_smccc_trng.c | 32 +++++++++-----
>   drivers/firmware/smccc/smccc.c          | 58 +++++++++++++++++++++----
>   3 files changed, 71 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/archrandom.h b/arch/arm64/include/asm/archrandom.h
> index 8babfbe31f95..7605dd81bd1e 100644
> --- a/arch/arm64/include/asm/archrandom.h
> +++ b/arch/arm64/include/asm/archrandom.h
> @@ -12,7 +12,7 @@
>   
>   extern bool smccc_trng_available;
>   
> -static inline bool __init smccc_probe_trng(void)
> +static inline bool smccc_probe_trng(void)
>   {
>   	struct arm_smccc_res res;
>   
> diff --git a/drivers/char/hw_random/arm_smccc_trng.c b/drivers/char/hw_random/arm_smccc_trng.c
> index dcb8e7f37f25..8f7f9d830cf2 100644
> --- a/drivers/char/hw_random/arm_smccc_trng.c
> +++ b/drivers/char/hw_random/arm_smccc_trng.c
> @@ -16,8 +16,10 @@
>   #include <linux/device.h>
>   #include <linux/hw_random.h>
>   #include <linux/module.h>
> -#include <linux/platform_device.h>
>   #include <linux/arm-smccc.h>
> +#include <linux/arm-smccc-bus.h>
> +
> +#include <asm/archrandom.h>
>   
>   #ifdef CONFIG_ARM64
>   #define ARM_SMCCC_TRNG_RND	ARM_SMCCC_TRNG_RND64
> @@ -94,29 +96,37 @@ static int smccc_trng_read(struct hwrng *rng, void *data, size_t max, bool wait)
>   	return copied;
>   }
>   
> -static int smccc_trng_probe(struct platform_device *pdev)
> +static int smccc_trng_probe(struct arm_smccc_device *sdev)
>   {
>   	struct hwrng *trng;
>   
> -	trng = devm_kzalloc(&pdev->dev, sizeof(*trng), GFP_KERNEL);
> +	/* validate the minimum version requirement */
> +	if (!smccc_probe_trng())
> +		return -ENODEV;
> +
> +	trng = devm_kzalloc(&sdev->dev, sizeof(*trng), GFP_KERNEL);
>   	if (!trng)
>   		return -ENOMEM;
>   
>   	trng->name = "smccc_trng";
>   	trng->read = smccc_trng_read;
>   
> -	return devm_hwrng_register(&pdev->dev, trng);
> +	return devm_hwrng_register(&sdev->dev, trng);
>   }
>   
> -static struct platform_driver smccc_trng_driver = {
> -	.driver = {
> -		.name		= "smccc_trng",
> -	},
> -	.probe		= smccc_trng_probe,
> +static const struct arm_smccc_device_id smccc_trng_id_table[] = {
> +	{ .name = "arm-smccc-trng" },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(arm_smccc, smccc_trng_id_table);
> +
> +static struct arm_smccc_driver smccc_trng_driver = {
> +	.name	  = KBUILD_MODNAME,
> +	.probe	  = smccc_trng_probe,
> +	.id_table = smccc_trng_id_table,
>   };
> -module_platform_driver(smccc_trng_driver);
> +module_arm_smccc_driver(smccc_trng_driver);
>   
> -MODULE_ALIAS("platform:smccc_trng");
>   MODULE_AUTHOR("Andre Przywara");
>   MODULE_DESCRIPTION("Arm SMCCC TRNG firmware interface support");
>   MODULE_LICENSE("GPL");
> diff --git a/drivers/firmware/smccc/smccc.c b/drivers/firmware/smccc/smccc.c
> index bdee057db2fd..a47696f3a5de 100644
> --- a/drivers/firmware/smccc/smccc.c
> +++ b/drivers/firmware/smccc/smccc.c
> @@ -9,7 +9,8 @@
>   #include <linux/init.h>
>   #include <linux/arm-smccc.h>
>   #include <linux/kernel.h>
> -#include <linux/platform_device.h>
> +#include <linux/arm-smccc-bus.h>
> +
>   #include <asm/archrandom.h>
>   
>   static u32 smccc_version = ARM_SMCCC_VERSION_1_0;
> @@ -81,16 +82,55 @@ bool arm_smccc_hypervisor_has_uuid(const uuid_t *hyp_uuid)
>   }
>   EXPORT_SYMBOL_GPL(arm_smccc_hypervisor_has_uuid);
>   
> +struct smccc_device_info {
> +	u32 func_id;
> +	bool requires_smc;
> +	const char *device_name;
> +};
> +
> +static const struct smccc_device_info smccc_devices[] __initconst = {
> +	{
> +		.func_id        = ARM_SMCCC_TRNG_VERSION,
> +		.requires_smc   = false,
> +		.device_name    = "arm-smccc-trng",
> +	},
> +};
> +
> +static bool __init smccc_probe_smccc_device(const struct smccc_device_info *smccc_dev)
> +{
> +	unsigned long ret;
> +	struct arm_smccc_res res;
> +
> +	if (smccc_conduit == SMCCC_CONDUIT_NONE)
> +		return false;
> +
> +	if (smccc_dev->requires_smc && smccc_conduit != SMCCC_CONDUIT_SMC)
> +		return false;
> +
> +	arm_smccc_1_1_invoke(smccc_dev->func_id, &res);
> +	ret = res.a0;

Mostly a nit:
Why the assignment to a variable of the same type here? Wouldn't it be 
cleaner to let "ret" be an "int"? Then you can save the cast below.
Or drop the assignment, and just cast res.a0 below directly.
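
For illustration, a userspace sketch of why the value has to be
narrowed to 32 bits somewhere, whichever spelling is chosen. It assumes
the error code can arrive as a 32-bit value in a 64-bit register
without sign extension; the two helper names are hypothetical, and
uint64_t stands in for the kernel's unsigned long on arm64.

```c
#include <stdbool.h>
#include <stdint.h>

#define SMCCC_RET_NOT_SUPPORTED (-1)

/* Narrow the register value to a signed 32-bit int before comparing. */
static bool not_supported_narrowed(uint64_t a0)
{
	int32_t ret = (int32_t)a0; /* keeps the low 32 bits on GCC/Clang */

	return ret == SMCCC_RET_NOT_SUPPORTED;
}

/* Compare the full 64-bit register value without narrowing. */
static bool not_supported_full_width(uint64_t a0)
{
	return a0 == (uint64_t)SMCCC_RET_NOT_SUPPORTED;
}
```

With a0 == 0xffffffff the narrowed comparison reports "not supported"
while the full-width one does not, so declaring "ret" as an int and
casting res.a0 directly both give the same, intended result.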

In any case, I tested this in a KVM guest, and it worked flawlessly: the 
device is created, works, and sysfs looks good, both with this file 
compiled in (=y), and also as a module. Module autoloading also seems to 
work.
So that's:

Tested-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.


> +
> +	if ((s32)ret == SMCCC_RET_NOT_SUPPORTED)
> +		return false;
> +
> +	return true;
> +}
> +
>   static int __init smccc_devices_init(void)
>   {
> -	struct platform_device *pdev;
> -
> -	if (smccc_trng_available) {
> -		pdev = platform_device_register_simple("smccc_trng", -1,
> -						       NULL, 0);
> -		if (IS_ERR(pdev))
> -			pr_err("smccc_trng: could not register device: %ld\n",
> -			       PTR_ERR(pdev));
> +	struct arm_smccc_device *sdev;
> +	const struct smccc_device_info *smccc_dev;
> +
> +	for (int i = 0; i < ARRAY_SIZE(smccc_devices); i++) {
> +		smccc_dev = &smccc_devices[i];
> +
> +		if (!smccc_probe_smccc_device(smccc_dev))
> +			continue;
> +
> +		sdev = arm_smccc_device_register(smccc_dev->device_name);
> +		if (IS_ERR(sdev))
> +			pr_err("%s: could not register device: %ld\n",
> +			       smccc_dev->device_name, PTR_ERR(sdev));
>   	}
>   
>   	return 0;


^ permalink raw reply

* Re: [PATCH] PCI/TSM: Resume device to D0 for CMA-SPDM operation
From: Lukas Wunner @ 2026-06-15 15:37 UTC (permalink / raw)
  To: sashiko-reviews
  Cc: linux-pci, Dan Williams, Ashish Kalra, Tom Lendacky,
	Alexey Kardashevskiy, linux-coco, Jonathan Cameron
In-Reply-To: <20260615134252.B34A21F000E9@smtp.kernel.org>

On Mon, Jun 15, 2026 at 01:42:52PM +0000, sashiko-bot@kernel.org wrote:
> > +++ b/drivers/crypto/ccp/sev-dev-tsm.c
> > @@ -38,6 +40,10 @@ static int sev_tio_spdm_cmd(struct tio_dsm *dsm, int ret)
> >  	if (ret <= 0)
> >  		return ret;
> >  
> > +	PM_RUNTIME_ACQUIRE(&dsm->tsm.base_tsm.pdev->dev, pm);
> > +	if ((pm_ret = PM_RUNTIME_ACQUIRE_ERR(&pm)))
> > +		return pm_ret;
> > +
> 
> [Severity: Medium]
> The commit message mentions that "resuming to D0 needs to (only) be done in
> code paths which use DOE specifically for CMA-SPDM", but does this code
> unconditionally resume the device to D0 regardless of the operation?
> 
> Looking at sev_tio_spdm_cmd(), the resume happens before the loop checks if
> the operation is actually PCI_DOE_FEATURE_CMA or PCI_DOE_FEATURE_SSESSION.
> Could this cause unnecessary power transitions for generic operations where
> accessing the config space in D3hot would be sufficient?

Moving PM_RUNTIME_ACQUIRE() inside the while loop may lead to repeated
D0 -> D3hot -> D0 -> D3hot ... transitions (depending on autosuspend
settings of the device, which are user-configurable through sysfs).
It would also lead to overhead induced by runtime PM code (repeated
spinlock acquisition etc).

So I believe keeping PM_RUNTIME_ACQUIRE() outside the while loop is
the right thing to do, but I'll leave this to AMD engineers to decide.
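
To illustrate the transition count, here is a small userspace model. It
is not kernel code: struct fake_dev and its helpers are hypothetical,
and the model assumes the worst case where the device autosuspends as
soon as the last reference is dropped.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a runtime-PM-managed device. */
struct fake_dev {
	int usage;      /* outstanding runtime PM references */
	bool suspended; /* true while the device is "in D3hot" */
	int resumes;    /* number of D3hot -> D0 transitions */
};

/* Take a reference, resuming the device if it was suspended. */
static void fake_get(struct fake_dev *d)
{
	if (d->usage++ == 0 && d->suspended) {
		d->suspended = false;
		d->resumes++;
	}
}

/* Drop a reference; worst case, the device suspends immediately. */
static void fake_put(struct fake_dev *d)
{
	if (--d->usage == 0)
		d->suspended = true;
}

/* One reference held across all rounds of the exchange. */
static int run_acquire_outside(struct fake_dev *d, int rounds)
{
	fake_get(d);
	for (int i = 0; i < rounds; i++)
		; /* one DOE round trip would happen here */
	fake_put(d);
	return d->resumes;
}

/* A reference taken and dropped around every round. */
static int run_acquire_inside(struct fake_dev *d, int rounds)
{
	for (int i = 0; i < rounds; i++) {
		fake_get(d);
		/* one DOE round trip would happen here */
		fake_put(d);
	}
	return d->resumes;
}
```

Starting from a suspended device, three rounds cost one resume with the
reference held outside the loop and three with it taken inside, before
counting any locking overhead.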

> [Severity: High]
> Does using PM_RUNTIME_ACQUIRE() here risk leaving the hardware permanently
> stuck in D0 during the device removal path?
> 
> The PM_RUNTIME_ACQUIRE() macro queues an asynchronous pm_runtime_put() when
> dropping the reference. If sev_tio_spdm_cmd() is called during teardown
> sequences, the pending asynchronous suspend can be cancelled by a subsequent
> device_del() call, which disables runtime PM. 
> 
> Should a synchronous runtime put be used here to ensure the device actually
> suspends before the removal completes?

If the device is deleted anyway, we don't care about leaked references.
And we absolutely do not want to synchronously runtime suspend here.

> > +++ b/drivers/pci/tsm.c
> > @@ -894,6 +896,10 @@ int pci_tsm_doe_transfer(struct pci_dev *pdev, u8 type, const void *req,
> >  	if (!tsm->doe_mb)
> >  		return -ENXIO;
> >  
> > +	PM_RUNTIME_ACQUIRE(&pdev->dev, pm);
> > +	if ((rc = PM_RUNTIME_ACQUIRE_ERR(&pm)))
> > +		return rc;
> > +
> 
> [Severity: Medium]
> Similarly to sev_tio_spdm_cmd(), does this code unconditionally resume the
> device to D0, ignoring the type parameter?
> 
> If a generic DOE transfer is requested that does not strictly require D0, it
> appears this will still force a transition to D0, contradicting the commit
> message's stated power management requirements.

Hallucination, this code does not perform "generic DOE" exchanges, only
CMA-SPDM ones.

> [Severity: High]
> Is this exported API also susceptible to the same asynchronous put regression
> mentioned above?
> 
> If pci_tsm_doe_transfer() is invoked during a teardown sequence, the async
> idle and suspend queued by the macro might get canceled by device_del(),
> leaving the device in D0.

We have to leave de-enumerated devices in D0 to ensure that a subsequent
rescan successfully re-enumerates them.  E.g. leaving a Downstream Port
in D3hot upon de-enumeration would leave any children inaccessible.

We also leave unbound devices in D0 for similar reasons.

Thanks,

Lukas

^ permalink raw reply

* Re: [PATCH 00/15] Enable TDX Module Extensions and DICE-based TDX Quoting
From: Xu Yilun @ 2026-06-15 15:22 UTC (permalink / raw)
  To: Dan Williams (nvidia)
  Cc: kas, rick.p.edgecombe, x86, peter.fang, linux-coco, linux-kernel,
	kvm, sohil.mehta, yilun.xu, baolu.lu, zhenzhong.duan, xiaoyao.li
In-Reply-To: <6a2c821a99e3_9b8551002a@djbw-dev.notmuch>

> The internal implementation details of extension seamcalls bury the
> lead on why this mechanism is important, why Linux should care, and why
> this brings TDX in line with the other major CC architectures. Something
> like:
> 
> ===
> To date, SEAMCALLs have been short lived routines that monopolize the
> CPU for their duration. This limits their utility for implementing
> higher order security protocols or pushes complexity into Linux. The
> Linux appetite for ingesting complexity is low, so TDX now adds a new
> class of SEAMCALLs that are preemptible and resumable. This capability
> enables higher order service APIs to carry out a security protocol like
> "establish an SPDM session".
> 
> The TDX "Extension SEAMCALL" capability is akin to ARM CCA's "Stateful
> RMI Operations (SRO)", and achieves similar externalized complexity
> relief as a dedicated hardware coprocessor like AMD SEV-SNP. The

I may not include the ARM/AMD examples, not sure I can explain them
well.

> mechanism is "give the service environment some memory", "invoke the
> service API", and "continue invoking until complete". All protocol state
> is internal the service API.
> 
> The simplest class of extension SEAMCALLs to support are in support of
> "DICE-based TDX Quoting", a service to turn guest launch attestation
> reports into a document that can be externally verified.
> ===

[...]

> > The Extensions consume a relatively large amount of memory (~50MB). So it
> > is designed to be off by default.
> 
> This confuses the TDX design with the Linux design, and sets up "50MB" as
> something to be quibbled with. The Linux design is to turn on all the
> features that Linux knows about all the time. Unless and until the "any
> available, all the time" becomes untenable it just simplifies the init
> flow to not play piecemeal games. Await evidence to change the simple
> policy. Suffice to say the cost of this policy will burn 10s of
> megabytes.

[...]

> 
> > == Some history ==
> > 
> > The TDX Module Extensions part was first posted along with TDX
> > Connect [2]. Now this part is remarkably smaller because we've removed
> > the generic tdx_page_array abstraction for HPA_LIST_INFO. TDX Module
> > Extensions is the first user of HPA_LIST_INFO, and doesn't use it in a
> > typical way (HPA_LIST_INFO can only hold at most 2MB memory). There
> > isn't enough justification to make the abstraction in this series. A
> > possible plan is to rebuild tdx_page_array iteratively when more use
> > cases arise.
> 
> No need to talk about details not in this series. I would maybe just
> note that quoting is the simplest first consumer and was chosen as the
> lead vehicle over TDX Connect previously posted in case anyone asks.

Sounds good to me, will include most of them, thanks.

^ permalink raw reply

