From: Kiryl Shutsemau <kas@kernel.org>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "tglx@kernel.org" <tglx@kernel.org>,
"Hansen, Dave" <dave.hansen@intel.com>,
"seanjc@google.com" <seanjc@google.com>,
"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
"ackerleytng@google.com" <ackerleytng@google.com>,
"hpa@zytor.com" <hpa@zytor.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"mingo@redhat.com" <mingo@redhat.com>,
"Huang, Kai" <kai.huang@intel.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"Verma, Vishal L" <vishal.l.verma@intel.com>,
"Gao, Chao" <chao.gao@intel.com>
Subject: Re: [PATCH 3/4] x86/virt/tdx: Add SEAMCALL wrapper for TDH.SYS.DISABLE
Date: Tue, 17 Mar 2026 09:47:46 +0000 [thread overview]
Message-ID: <abkdNQnvgJWmaKWk@thinkstation> (raw)
In-Reply-To: <ac341cedf0e387f2ba5cf3fc9d2ad5fcaa2ec3c8.camel@intel.com>
On Mon, Mar 16, 2026 at 09:15:13PM +0000, Edgecombe, Rick P wrote:
> On Mon, 2026-03-16 at 11:51 +0000, Kiryl Shutsemau wrote:
> > On Fri, Mar 06, 2026 at 05:03:57PM -0800, Rick Edgecombe wrote:
> > > From: Vishal Verma <vishal.l.verma@intel.com>
> > >
> > > Some early TDX-capable platforms have an erratum where a partial write
> > > to TDX private memory can cause a machine check on a subsequent read.
> > > On these platforms, kexec and kdump have been disabled in these cases,
> > > because the old kernel cannot safely hand off TDX state to the new
> > > kernel. Later TDX modules support the TDH.SYS.DISABLE SEAMCALL, which
> > > provides a way to cleanly disable TDX and allow kexec to proceed.
> >
> > Does it need to be enumerated?
> >
> > I don't see this SEAMCALL be covered in the public documentation.
> > </me looking around>
> > Ah! Found it the the draft. So the feature is not yet finalized.
> >
> > "Support of TDH.SYS.DISABLE is enumerated by TDX_FEATURES0. SYS_DISABLE
> > (bit 53)"
> >
> > I am seeing the next patch calling it unconditionally. Is it okay?
>
> We debated checking the feature bit before allowing kexec, but decided it was
> simpler to just blindly call and ignore the errors. The reasoning was that this
> is already a somewhat exotic scenario being addressed, and future modules will
> have the feature. So maintaining a check for the feature bit only helps a little
> bit, for a short time. And then only if the user would rather have kexec blocked
> than attempt it. Do you think it is worth it?
No, I see very limited reason to support stale TDX modules. Users are
expected to keep the module up-to-date, so skipping enumeration should
be okay. But it deserves explanation in the commit message or a comment.
> > > This can be a long running operation, and the time needed largely
> > > depends on the amount of memory that has been allocated to TDs. If all
> > > TDs have been destroyed prior to the sys_disable call, then it is fast,
> > > with only needing to override the TDX module memory.
> > >
> > > After the SEAMCALL completes, the TDX module is disabled and all memory
> > > resources allocated to TDX are freed and reset. The next kernel can then
> > > re-initialize the TDX module from scratch via the normal TDX bring-up
> > > sequence.
> > >
> > > The SEAMCALL may be interrupted by an interrupt. In this case, it
> > > returns TDX_INTERRUPTED_RESUMABLE, and it must be retried in a loop
> > > until the operation completes successfully.
> > >
> > > Add a tdx_sys_disable() helper, which implements the retry loop around
> > > the SEAMCALL to provide this functionality.
> > >
> > > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > > ---
> > > arch/x86/include/asm/tdx.h | 3 +++
> > > arch/x86/virt/vmx/tdx/tdx.c | 18 ++++++++++++++++++
> > > arch/x86/virt/vmx/tdx/tdx.h | 1 +
> > > 3 files changed, 22 insertions(+)
> > >
> > > diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> > > index f0826b0a512a..baaf43a09e99 100644
> > > --- a/arch/x86/include/asm/tdx.h
> > > +++ b/arch/x86/include/asm/tdx.h
> > > @@ -173,6 +173,8 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
> > > return level - 1;
> > > }
> > >
> > > +void tdx_sys_disable(void);
> > > +
> > > u64 tdh_vp_enter(struct tdx_vp *vp, struct tdx_module_args *args);
> > > u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
> > > u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2);
> > > @@ -204,6 +206,7 @@ static inline void tdx_init(void) { }
> > > static inline u32 tdx_get_nr_guest_keyids(void) { return 0; }
> > > static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; }
> > > static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; }
> > > +static inline void tdx_sys_disable(void) { }
> > > #endif /* CONFIG_INTEL_TDX_HOST */
> > >
> > > #endif /* !__ASSEMBLER__ */
> > > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > > index 0802d0fd18a4..68bd2618dde4 100644
> > > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > > @@ -37,6 +37,7 @@
> > > #include <asm/msr.h>
> > > #include <asm/cpufeature.h>
> > > #include <asm/tdx.h>
> > > +#include <asm/shared/tdx_errno.h>
> > > #include <asm/cpu_device_id.h>
> > > #include <asm/processor.h>
> > > #include <asm/mce.h>
> > > @@ -1940,3 +1941,20 @@ u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page)
> > > return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> > > }
> > > EXPORT_SYMBOL_FOR_KVM(tdh_phymem_page_wbinvd_hkid);
> > > +
> > > +void tdx_sys_disable(void)
> > > +{
> > > + struct tdx_module_args args = {};
> > > +
> > > + /*
> > > + * SEAMCALLs that can return TDX_INTERRUPTED_RESUMABLE are guaranteed
> > > + * to make forward progress between interrupts, so it is safe to loop
> > > + * unconditionally here.
> > > + *
> > > + * This is a 'destructive' SEAMCALL, in that no other SEAMCALL can be
> > > + * run after this until a full reinitialization is done.
> > > + */
> > > + while (seamcall(TDH_SYS_DISABLE, &args) == TDX_INTERRUPTED_RESUMABLE)
> > > + ;
> >
> > Silently ignore any other errors?
>
> Do you think it's worth a warn? There are a couple other considerations.
> - Kai brought up offline that we should handle TDX_SYS_BUSY here too.
> - Previous kexec patches had trouble solving races around tdx enabling. So we
> have to handle the seamcall failures.
>
> So we have to exclude a few different errors in different ways. And then the
> warn worthy error codes either don't impact anything, or the new kernel will
> fail to initialize the TDX module and give notice there.
The delayed error is harder to debug. It can be useful to leave a
breadcrumbs.
Also, do we want to make try_init_module_global() return failure after
tdx_sys_disable()? I guess, TDH_SYS_LP_INIT will fail anyway, so it
shouldn't matter.
--
Kiryl Shutsemau / Kirill A. Shutemov
next prev parent reply other threads:[~2026-03-17 9:47 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-07 1:03 [PATCH 0/4] Fuller TDX kexec support Rick Edgecombe
2026-03-07 1:03 ` [PATCH 1/4] x86/tdx: Move all TDX error defines into <asm/shared/tdx_errno.h> Rick Edgecombe
2026-03-08 23:47 ` Huang, Kai
2026-03-09 16:20 ` Edgecombe, Rick P
2026-03-07 1:03 ` [PATCH 2/4] x86/virt/tdx: Pull kexec cache flush logic into arch/x86 Rick Edgecombe
2026-03-09 0:23 ` Huang, Kai
2026-03-09 16:23 ` Edgecombe, Rick P
2026-03-07 1:03 ` [PATCH 3/4] x86/virt/tdx: Add SEAMCALL wrapper for TDH.SYS.DISABLE Rick Edgecombe
2026-03-16 11:51 ` Kiryl Shutsemau
2026-03-16 21:15 ` Edgecombe, Rick P
2026-03-17 9:47 ` Kiryl Shutsemau [this message]
2026-03-17 21:55 ` Edgecombe, Rick P
2026-03-07 1:03 ` [PATCH 4/4] KVM: x86: Disable the TDX module during kexec and kdump Rick Edgecombe
2026-03-09 8:15 ` Chao Gao
2026-03-09 16:24 ` Edgecombe, Rick P
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abkdNQnvgJWmaKWk@thinkstation \
--to=kas@kernel.org \
--cc=ackerleytng@google.com \
--cc=bp@alien8.de \
--cc=chao.gao@intel.com \
--cc=dave.hansen@intel.com \
--cc=hpa@zytor.com \
--cc=kai.huang@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-coco@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=seanjc@google.com \
--cc=tglx@kernel.org \
--cc=vishal.l.verma@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox