From: Erdem Aktas <erdemaktas@google.com>
To: dan.j.williams@intel.com
Cc: Vishal Annapurve <vannapurve@google.com>,
Dave Hansen <dave.hansen@intel.com>,
Chao Gao <chao.gao@intel.com>,
"Reshetova, Elena" <elena.reshetova@intel.com>,
"linux-coco@lists.linux.dev" <linux-coco@lists.linux.dev>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"x86@kernel.org" <x86@kernel.org>,
"Chatre, Reinette" <reinette.chatre@intel.com>,
"Weiny, Ira" <ira.weiny@intel.com>,
"Huang, Kai" <kai.huang@intel.com>,
"yilun.xu@linux.intel.com" <yilun.xu@linux.intel.com>,
"sagis@google.com" <sagis@google.com>,
"paulmck@kernel.org" <paulmck@kernel.org>,
"nik.borisov@suse.com" <nik.borisov@suse.com>,
Borislav Petkov <bp@alien8.de>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
"Kirill A. Shutemov" <kas@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH v2 00/21] Runtime TDX Module update support
Date: Tue, 28 Oct 2025 10:00:08 -0700
Message-ID: <CAAYXXYyVC0Sm+1PBw=xoYNDV7aa54c_6KTGjMdwVaBAJOd8Hpw@mail.gmail.com>
In-Reply-To: <690026ac52509_10e2100cd@dwillia2-mobl4.notmuch>
On Mon, Oct 27, 2025 at 7:14 PM <dan.j.williams@intel.com> wrote:
>
> Vishal Annapurve wrote:
> [..]
> > Problem 2 should be solved in the TDX module as it is the state owner
> > and should be given a chance to ensure that nothing else can affect
> > it's state. Kernel is just opting-in to toggle the already provided
> > TDX module ABI. I don't think this is adding complexity to the kernel.
>
> It makes the interface hard to reason about, that is complexity.
Could you clarify what you mean here? Which interface do you need to
reason about? The TDX module has a feature described in its spec; this
has nothing to do with the kernel. The kernel executes TDH.SYS.SHUTDOWN
and, if it fails, returns the error code to userspace. There is nothing
here to reason about, and it is not clear how this adds complexity to
the kernel.
>
> Consider an urgent case where update is more important than the
> consistency of ongoing builds. The kernel's job is its own self
> consistency and security model, when that remains in tact root is
> allowed to make informed decisions.
>
The whole update is initiated by userspace, so IMO it is not the
kernel's job to decide what to do. It should try to update the TDX
module and return an error code to userspace if that fails. It is up
to userspace to resolve the conflict and retry the installation. If
you are saying that userspace is not trusted with such a critical
action: again, the whole process is initiated and controlled by
userspace, so there is inherent trust there.
Consistency? How does a TD Preserve failure impact the kernel's
consistency? On the contrary, bypassing AVOID_COMPAT_SENSITIVE will
break consistency for some TDs.
> You might say, well add a --force option for that, and that is also
> userspace prerogative to perform otherwise destructive operations with
> the degrees of freedom the kernel allows.
IMO, this is something userspace should decide; the kernel's job is to
provide the necessary interface for it.
>
> I think we have reached the useful end of this thread. I support moving
> ahead with the dead simple, "this may clobber your builds", for now. We
> can always circle back to add more complexity later if that proves "too
> simple" in practice.
>
It is not clear how you reached that conclusion. We are one of the
users of this feature and we have explained multiple times that we
prefer the update to fail if there is any risk of corrupting TD
state. I have not seen feedback or a preference from any other users,
and I have not seen a reasonable argument for why you prefer the
"clobber your builds" option.
Also, the "clobber your builds" option will impact TDX live
migration. Since TDX live migration is still WIP, it will be very
hard to foresee the challenges this decision introduces there. What
about TDX Connect? Are we going to come back and keep updating this
every time we find an issue?
Since the update process is initiated and controlled by userspace, it
is the userspace application's prerogative to make the informed
decision on whether an urgent update warrants potentially destructive
actions. The kernel's role is to provide a reliable mechanism to
interact with the TDX Module and report outcomes accurately.
Ideally, the ABI should allow userspace to provide flags that can
also be used to configure the TD Preserve update option. If you do not
want to change the ABI, you can expose those as a module param so
userspace can make the decision by itself.
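A rough illustration of that split, not an ABI proposal: the toy model
below passes a userspace-chosen flag through to a stand-in for the TDX
module and simply reports what the module answers. Only
AVOID_COMPAT_SENSITIVE is a name from this thread; UpdateFlags,
FakeTdxModule, request_update and the "OK"/"REFUSED" strings are
hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from enum import Flag


class UpdateFlags(Flag):
    """Hypothetical userspace-supplied update flags (not a real kernel ABI)."""
    NONE = 0
    AVOID_COMPAT_SENSITIVE = 1  # flag name taken from this thread


@dataclass
class FakeTdxModule:
    """Toy stand-in for the TDX module; it only tracks TDs still being built."""
    tds_in_build: int = 0

    def shutdown(self, flags: UpdateFlags) -> str:
        # With the flag set, the module refuses to shut down while an
        # update could corrupt TD state that is still being built.
        if UpdateFlags.AVOID_COMPAT_SENSITIVE in flags and self.tds_in_build > 0:
            return "REFUSED"
        return "OK"


def request_update(module: FakeTdxModule, flags: UpdateFlags) -> str:
    """Kernel-side model: forward the caller's flags, report the outcome."""
    return module.shutdown(flags)
```

In this model the policy (fail safely vs. force) lives entirely in the
flag value that userspace picks; the pass-through function makes no
decision of its own.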
To address some of your previous concerns:
It shifts complexity to userspace, which is something everyone here
seems to prefer. The problem is that a TD Preserve update would
corrupt TDs that are in the build stage (it also impacts TDX LM and
possibly some TDX Connect functionality). Since the TDX module knows
about such TDs, letting it fail the update ensures they are not
corrupted; hence it is a fix for a problem.
TDH.SYS.SHUTDOWN may fail for multiple reasons, such as TDX_SYS_BUSY,
so the kernel needs to handle error cases anyway and should return the
error to userspace.
Userspace can then apply whatever logic it has to finish or cancel the
existing TD builds and retry the TD Preserve update.
You might be concerned about forward progress. As I said above, there
may be other cases that prevent the TD Preserve update from
succeeding, so forward progress is not guaranteed anyway, and it is
not the kernel's job to figure that out. It will return the error code
to userspace and let userspace resolve the conflict.
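The userspace side of that flow could look roughly like the sketch
below: try the update, hand any error to a policy callback that
finishes or cancels whatever is blocking, and retry a bounded number
of times. TDX_SYS_BUSY is the error named above; update_with_retry,
try_update, resolve_conflict and the "OK"/"NOT_ATTEMPTED" strings are
hypothetical names for illustration, not an existing tool or interface.

```python
from typing import Callable

OK = "OK"
BUSY = "TDX_SYS_BUSY"  # error name mentioned in this thread


def update_with_retry(
    try_update: Callable[[], str],
    resolve_conflict: Callable[[str], None],
    max_attempts: int = 3,
) -> str:
    """Retry an update, letting userspace policy react to each failure.

    Returns the last status seen; forward progress is not guaranteed,
    so the caller must handle a non-OK result.
    """
    status = "NOT_ATTEMPTED"
    for attempt in range(max_attempts):
        status = try_update()
        if status == OK:
            return status
        # Only spend effort resolving the conflict if we will retry.
        if attempt + 1 < max_attempts:
            resolve_conflict(status)
    return status
```

The bound on attempts is a userspace policy choice; nothing in the
sketch requires the kernel to know why an attempt failed beyond the
status it reports.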