From: Chao Gao <chao.gao@intel.com>
To: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
linux-kernel@vger.kernel.org
Cc: binbin.wu@linux.intel.com, dave.hansen@linux.intel.com,
djbw@kernel.org, ira.weiny@intel.com, kai.huang@intel.com,
kas@kernel.org, nik.borisov@suse.com, paulmck@kernel.org,
pbonzini@redhat.com, reinette.chatre@intel.com,
rick.p.edgecombe@intel.com, sagis@google.com, seanjc@google.com,
tony.lindgren@linux.intel.com, vannapurve@google.com,
vishal.l.verma@intel.com, yilun.xu@linux.intel.com,
xiaoyao.li@intel.com, yan.y.zhao@intel.com,
Chao Gao <chao.gao@intel.com>, Thomas Gleixner <tglx@kernel.org>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: [PATCH v9 12/23] x86/virt/seamldr: Introduce skeleton for TDX module updates
Date: Wed, 13 May 2026 08:09:55 -0700
Message-ID: <20260513151045.1420990-13-chao.gao@intel.com>
In-Reply-To: <20260513151045.1420990-1-chao.gao@intel.com>

TDX module updates require careful synchronization with other TDX
operations. The requirements are (#1/#2 reflect current behavior that
must be preserved):

1. SEAMCALLs need to be callable from both process and IRQ contexts.
2. SEAMCALLs need to be able to run concurrently across CPUs.
3. During updates, only update-related SEAMCALLs are permitted; all
   other SEAMCALLs must not be called.
4. During updates, all online CPUs must participate in the update work.

No single lock primitive satisfies all requirements. For instance,
rwlock_t handles #1/#2 but fails #4: CPUs spinning with IRQs disabled
cannot be directed to perform update work.

Use stop_machine() as it is the only well-understood mechanism that can
meet all requirements.

TDX module updates also consist of several steps (see Intel® Trust Domain
Extensions (Intel® TDX) Module Base Architecture Specification, Chapter
"TD-Preserving TDX module Update"). Ordering requirements between steps
mandate lockstep synchronization across all CPUs.

multi_cpu_stop() provides a good example of executing a multi-step task
in lockstep across CPUs, but it does not synchronize the individual
steps inside the callback itself.

Implement a similar state machine as the skeleton for TDX module
updates. Each state represents one step in the update flow, and the
state advances only after all CPUs acknowledge completion of the current
step. This acknowledgment mechanism provides the required lockstep
execution.

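The acknowledgment scheme described above can be modeled in userspace as
a rough sketch. This is not the kernel code from this patch: pthreads
stand in for CPUs, a mutex stands in for the raw spinlock, sched_yield()
stands in for cpu_relax(), and all demo_* names and the two intermediate
states are hypothetical. Each thread checks, when it enters a step, that
every thread has already completed the previous step.

```c
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

#define DEMO_MAX_THREADS 16

enum demo_state { DEMO_START, DEMO_STEP_A, DEMO_STEP_B, DEMO_DONE };

struct demo_ctrl {
	pthread_mutex_t lock;		/* protects every field below */
	int state;			/* current target state */
	int num_ack;			/* threads that finished the current state */
	int num_threads;
	int work_done[DEMO_DONE + 1];	/* per-state completion counts */
	int violations;			/* lockstep breaches observed */
};

/* Userspace stand-in for READ_ONCE(): read the state under the lock. */
static int demo_read_state(struct demo_ctrl *c)
{
	int state;

	pthread_mutex_lock(&c->lock);
	state = c->state;
	pthread_mutex_unlock(&c->lock);
	return state;
}

/* Do the (dummy) work for @cur, then ack. The last acker advances the state. */
static void demo_work_and_ack(struct demo_ctrl *c, int cur)
{
	pthread_mutex_lock(&c->lock);
	/* Every thread must have completed the previous step before this one. */
	if (cur > DEMO_STEP_A && c->work_done[cur - 1] != c->num_threads)
		c->violations++;
	c->work_done[cur]++;
	if (++c->num_ack == c->num_threads) {
		c->num_ack = 0;
		c->state = cur + 1;
	}
	pthread_mutex_unlock(&c->lock);
}

static void *demo_thread(void *arg)
{
	struct demo_ctrl *c = arg;
	int cur = DEMO_START, next;

	do {
		next = demo_read_state(c);
		if (next == cur) {
			sched_yield();	/* userspace stand-in for cpu_relax() */
			continue;
		}
		cur = next;
		demo_work_and_ack(c, cur);
	} while (cur != DEMO_DONE);

	return NULL;
}

/* Returns the number of lockstep violations seen, or -1 on bad input. */
int demo_run_lockstep(int num_threads)
{
	pthread_t tids[DEMO_MAX_THREADS];
	struct demo_ctrl c = { .state = DEMO_START + 1, .num_threads = num_threads };
	int i;

	if (num_threads < 1 || num_threads > DEMO_MAX_THREADS)
		return -1;

	pthread_mutex_init(&c.lock, NULL);
	for (i = 0; i < num_threads; i++)
		if (pthread_create(&tids[i], NULL, demo_thread, &c))
			abort();	/* a missing thread would stall the others */
	for (i = 0; i < num_threads; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&c.lock);

	return c.violations;
}
```

Because the state only moves forward once num_ack reaches the thread
count, no thread can observe step N+1 while another is still in step N,
so demo_run_lockstep() is expected to report zero violations.
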
The update flow is intentionally simpler than multi_cpu_stop() in two ways:

a) use a spinlock to protect the control data instead of atomic_t and
   explicit memory barriers.
b) omit touch_nmi_watchdog() and rcu_momentary_eqs(), which exist
   there for debugging and are not strictly needed for this update flow.

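For contrast with point a), a lockless acknowledgment path built from an
atomic counter plus explicit ordering might look like the sketch below.
It is a userspace C11 analogue with hypothetical names (lockless_ctrl,
lockless_ack_state() and friends), not the multi_cpu_stop() source and
not part of this patch. It only illustrates the ordering reasoning that
a spinlock makes unnecessary.

```c
#include <stdatomic.h>

struct lockless_ctrl {
	atomic_int state;	/* current target state */
	atomic_int acks_left;	/* acknowledgments still outstanding */
	int num_threads;
};

void lockless_init(struct lockless_ctrl *c, int num_threads, int first_state)
{
	c->num_threads = num_threads;
	atomic_init(&c->acks_left, num_threads);
	atomic_init(&c->state, first_state);
}

static void lockless_set_state(struct lockless_ctrl *c, int newstate)
{
	/* Re-arm the counter before publishing the new state. */
	atomic_store_explicit(&c->acks_left, c->num_threads,
			      memory_order_relaxed);
	/* Release: a thread that sees newstate also sees the re-armed counter. */
	atomic_store_explicit(&c->state, newstate, memory_order_release);
}

/* Acquire pairs with the release store in lockless_set_state(). */
int lockless_read_state(struct lockless_ctrl *c)
{
	return atomic_load_explicit(&c->state, memory_order_acquire);
}

void lockless_ack_state(struct lockless_ctrl *c)
{
	int cur;

	/* fetch_sub returns the old value: 1 means this was the last ack. */
	if (atomic_fetch_sub_explicit(&c->acks_left, 1,
				      memory_order_acq_rel) == 1) {
		cur = atomic_load_explicit(&c->state, memory_order_relaxed);
		lockless_set_state(c, cur + 1);
	}
}
```

Correctness here depends on the store ordering in lockless_set_state()
and on the acquire/release pairing. With a lock around both fields, as
in this patch, none of that needs to be reasoned about.
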
Potential alternative to stop_machine()
=======================================

An alternative approach is to lock all KVM entry points and kick all
vCPUs. Here, KVM entry points refer to KVM VM/vCPU ioctl entry points,
implemented in KVM common code (virt/kvm). Adding a locking mechanism
there would affect all architectures KVM supports. To lock only TDX
vCPUs, new logic would be needed to identify TDX vCPUs, which the KVM
common code currently lacks. This would add significant complexity and
maintenance overhead to KVM for this TDX-specific use case, so this
approach is not taken.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
v9:
 - Extract control-data initialization into a separate helper. [Dave]
 - Drop touch_nmi_watchdog() and rcu_momentary_eqs(), as they are not
   needed here.
 - Rename thread_ack to num_ack to make it clear that it counts the
   number of acknowledgments.
 - Rename set_target_state() to __set_target_state() to mark it as an
   internal helper. Add a comment noting that __set_target_state()
   does not take the lock, unlike ack_state().
 - Update the changelog to explain why a spinlock is used instead of
   atomic_t plus memory barriers to protect the control data.
---
arch/x86/virt/vmx/tdx/seamldr.c | 87 ++++++++++++++++++++++++++++++++-
1 file changed, 86 insertions(+), 1 deletion(-)
diff --git a/arch/x86/virt/vmx/tdx/seamldr.c b/arch/x86/virt/vmx/tdx/seamldr.c
index 929203ec96f2..7befe4a08f33 100644
--- a/arch/x86/virt/vmx/tdx/seamldr.c
+++ b/arch/x86/virt/vmx/tdx/seamldr.c
@@ -7,8 +7,10 @@
#define pr_fmt(fmt) "seamldr: " fmt
#include <linux/mm.h>
+#include <linux/nmi.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
+#include <linux/stop_machine.h>
#include <asm/seamldr.h>
@@ -154,6 +156,84 @@ static int init_seamldr_params(struct seamldr_params *params, const u8 *data, u3
return 0;
}
+/*
+ * During a TDX module update, all CPUs start from MODULE_UPDATE_START and
+ * progress to MODULE_UPDATE_DONE. Each state is associated with certain
+ * work. For some states, just one CPU needs to perform the work, while
+ * other CPUs just wait during those states.
+ */
+enum module_update_state {
+	MODULE_UPDATE_START,
+	MODULE_UPDATE_DONE,
+};
+
+static struct update_ctrl {
+	enum module_update_state state;
+	int num_ack;
+	/*
+	 * Protect update_ctrl. Raw spinlock as it will be acquired from
+	 * interrupt-disabled contexts.
+	 */
+	raw_spinlock_t lock;
+} update_ctrl;
+
+/* Called with ctrl->lock held or during initialization. */
+static void __set_target_state(struct update_ctrl *ctrl,
+			       enum module_update_state newstate)
+{
+	/* Reset ack counter. */
+	ctrl->num_ack = 0;
+	ctrl->state = newstate;
+}
+
+/* Last one to ack a state moves to the next state. */
+static void ack_state(struct update_ctrl *ctrl)
+{
+	raw_spin_lock(&ctrl->lock);
+
+	ctrl->num_ack++;
+	if (ctrl->num_ack == num_online_cpus())
+		__set_target_state(ctrl, ctrl->state + 1);
+
+	raw_spin_unlock(&ctrl->lock);
+}
+
+static void init_state(struct update_ctrl *ctrl)
+{
+	raw_spin_lock_init(&ctrl->lock);
+	__set_target_state(ctrl, MODULE_UPDATE_START + 1);
+}
+
+/*
+ * See multi_cpu_stop() from where this multi-cpu state-machine was
+ * adopted.
+ */
+static int do_seamldr_install_module(void *seamldr_params)
+{
+	enum module_update_state newstate, curstate = MODULE_UPDATE_START;
+	int ret = 0;
+
+	do {
+		newstate = READ_ONCE(update_ctrl.state);
+
+		if (curstate == newstate) {
+			cpu_relax();
+			continue;
+		}
+
+		curstate = newstate;
+		switch (curstate) {
+		/* TODO: add the update steps. */
+		default:
+			break;
+		}
+
+		ack_state(&update_ctrl);
+	} while (curstate != MODULE_UPDATE_DONE);
+
+	return ret;
+}
+
+
/**
* seamldr_install_module - Install a new TDX module.
* @data: Pointer to the TDX module image.
@@ -174,7 +254,12 @@ int seamldr_install_module(const u8 *data, u32 size)
 	if (ret)
 		goto out;
-	/* TODO: Update TDX module here */
+	/* Ensure a stable set of online CPUs for the update process. */
+	cpus_read_lock();
+	init_state(&update_ctrl);
+	ret = stop_machine_cpuslocked(do_seamldr_install_module, params, cpu_online_mask);
+	cpus_read_unlock();
+
 out:
 	kfree(params);
 	return ret;
--
2.52.0