* [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
@ 2025-01-15 16:08 Paolo Bonzini
2025-01-15 16:08 ` [PATCH v3 01/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
` (15 more replies)
0 siblings, 16 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:08 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
Hi,
This is the final-ish version of the "SEAMCALL Wrappers" RFC[0], with
all the wrappers extracted out of the corresponding TDX patches.
This version of the series uses u64 only for guest physical addresses
and error return values:
* u64 pfn is replaced by struct page
* u64 level is replaced by int level
* u64 tdr and u64 tdvpr are replaced by structs that contain struct page
for them as well as for tdcs and tdcx.
A couple of functions are also moved over from KVM to tdx.h:
static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
static inline int pg_level_to_tdx_sept_level(enum pg_level level)
The plan is to include these in kvm.git together with their first user.
Thanks,
Paolo
Isaku Yamahata (6):
x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT
pages
x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial
contents
x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX
guest KeyID
x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
Kai Huang (2):
x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest
x86/virt/tdx: Read essential global metadata for KVM
Rick Edgecombe (6):
x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation
x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation
x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management
x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access
x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations
arch/x86/include/asm/tdx.h | 68 ++++
arch/x86/virt/vmx/tdx/tdx.c | 403 ++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 47 ++-
arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 50 +++
arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 19 +
5 files changed, 580 insertions(+), 7 deletions(-)
--
2.43.5
* [PATCH v3 01/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
@ 2025-01-15 16:08 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 02/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
` (14 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:08 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. Pre-TDX Intel hardware has support for a memory encryption
architecture called MK-TME, which repurposes several high bits of
the physical address as "KeyID". TDX ends up reserving a sub-range of
MK-TME KeyIDs as "TDX private KeyIDs".
Like MK-TME, these KeyIDs can be associated with an ephemeral key. For TDX
this association is done by the TDX module. It also has its own tracking
for which KeyIDs are in use. To do this ephemeral key setup and manipulate
the TDX module's internal tracking, KVM will use the following SEAMCALLs:
TDH.MNG.KEY.CONFIG: Mark the KeyID as in use, and initialize its
ephemeral key.
TDH.MNG.KEY.FREEID: Mark the KeyID as not in use.
These SEAMCALLs both operate on TDR structures, which are set up using the
TDH.MNG.CREATE SEAMCALL (added later in this series). KVM's use of these
operations will go like:
- tdx_guest_keyid_alloc()
- Initialize TD and TDR page with TDH.MNG.CREATE (not yet added), passing
KeyID
- TDH.MNG.KEY.CONFIG to initialize the key
- TD runs, teardown is started
- TDH.MNG.KEY.FREEID
- tdx_guest_keyid_free()
Don't try to combine the tdx_guest_keyid_alloc() and TDH.MNG.KEY.CONFIG
operations because TDH.MNG.CREATE and some locking need to be done in the
middle. Don't combine TDH.MNG.KEY.FREEID and tdx_guest_keyid_free() so they
are symmetrical with the creation path.
So implement tdh_mng_key_config() and tdh_mng_key_freeid() as functions
separate from tdx_guest_keyid_alloc() and tdx_guest_keyid_free().
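The ordering described above can be modeled in a small userspace sketch.
Nothing here calls the TDX module: the state enum and the model_*()
functions are hypothetical stand-ins that only encode which step has to
precede which, and the real teardown path may allow more transitions than
this toy does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states for one guest KeyID (illustration only). */
enum keyid_state {
	KEYID_UNALLOCATED,
	KEYID_ALLOCATED,	/* after tdx_guest_keyid_alloc() */
	KEYID_BOUND,		/* after TDH.MNG.CREATE was given the KeyID */
	KEYID_CONFIGURED,	/* after TDH.MNG.KEY.CONFIG */
	KEYID_RELEASED,		/* after TDH.MNG.KEY.FREEID */
};

/* Move from 'from' to 'to'; refuse if the KeyID is in any other state. */
static bool model_step(enum keyid_state *s, enum keyid_state from,
		       enum keyid_state to)
{
	assert(s != NULL);
	if (*s != from)
		return false;
	*s = to;
	return true;
}

static bool model_keyid_alloc(enum keyid_state *s)
{
	return model_step(s, KEYID_UNALLOCATED, KEYID_ALLOCATED);
}

static bool model_mng_create(enum keyid_state *s)
{
	return model_step(s, KEYID_ALLOCATED, KEYID_BOUND);
}

static bool model_key_config(enum keyid_state *s)
{
	return model_step(s, KEYID_BOUND, KEYID_CONFIGURED);
}

static bool model_key_freeid(enum keyid_state *s)
{
	return model_step(s, KEYID_CONFIGURED, KEYID_RELEASED);
}

static bool model_keyid_free(enum keyid_state *s)
{
	return model_step(s, KEYID_RELEASED, KEYID_UNALLOCATED);
}
```

Because the create step sits between allocation and key configuration, a
combined "alloc and configure" helper has no valid place in this sequence,
which is the reason given above for keeping the functions separate.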
The TDX module provides SEAMCALLs to hand pages to the TDX module for
storing TDX controlled state. SEAMCALLs that operate on this state are
directed to the appropriate TD VM using references to the pages originally
provided for managing the TD's state. So the host kernel needs to track
these pages, both as an ID for specifying which TD to operate on, and to
allow them to be eventually reclaimed. The TD VM associated pages are
called TDR (Trust Domain Root) and TDCS (Trust Domain Control Structure).
Introduce "struct tdx_td" for holding references to pages provided to the
TDX module for this TD VM associated state. Don't plan for any TD
associated state that is controlled by KVM to live in this struct. Only
expect it to hold data for concepts specific to the TDX architecture, for
which there can't already be storage in KVM.
Add both the TDR page and an array of TDCS pages, even though the SEAMCALL
wrappers will only need to know about the TDR pages for directing the
SEAMCALLs to the right TD. Adding the TDCS pages to this struct will let
all of the TD VM associated pages handed to the TDX module be tracked in
one location. To specify physical pages, use struct page pointers rather
than raw physical addresses, matching the rest of this series.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-2-rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 12 ++++++++++++
arch/x86/virt/vmx/tdx/tdx.c | 25 +++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 16 +++++++++-------
3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index eba178996d84..5045ab1c3d5b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -116,6 +116,18 @@ static inline u64 sc_retry(sc_func_t func, u64 fn,
int tdx_cpu_enable(void);
int tdx_enable(void);
const char *tdx_dump_mce_info(struct mce *m);
+
+struct tdx_td {
+ /* TD root structure: */
+ struct page *tdr_page;
+
+ int tdcs_nr_pages;
+ /* TD control structure: */
+ struct page **tdcs_pages;
+};
+
+u64 tdh_mng_key_config(struct tdx_td *td);
+u64 tdh_mng_key_freeid(struct tdx_td *td);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 7fdb37387886..1ffbdb840004 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1456,3 +1456,28 @@ void __init tdx_init(void)
check_tdx_erratum();
}
+
+static inline u64 tdx_tdr_pa(struct tdx_td *td)
+{
+ return page_to_phys(td->tdr_page);
+}
+
+u64 tdh_mng_key_config(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_KEY_CONFIG, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_config);
+
+u64 tdh_mng_key_freeid(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_KEY_FREEID, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_freeid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 4e3d533cdd61..5579317f67ab 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -15,13 +15,15 @@
/*
* TDX module SEAMCALL leaf functions
*/
-#define TDH_PHYMEM_PAGE_RDMD 24
-#define TDH_SYS_KEY_CONFIG 31
-#define TDH_SYS_INIT 33
-#define TDH_SYS_RD 34
-#define TDH_SYS_LP_INIT 35
-#define TDH_SYS_TDMR_INIT 36
-#define TDH_SYS_CONFIG 45
+#define TDH_MNG_KEY_CONFIG 8
+#define TDH_MNG_KEY_FREEID 20
+#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_SYS_KEY_CONFIG 31
+#define TDH_SYS_INIT 33
+#define TDH_SYS_RD 34
+#define TDH_SYS_LP_INIT 35
+#define TDH_SYS_TDMR_INIT 36
+#define TDH_SYS_CONFIG 45
/* TDX page types */
#define PT_NDA 0x0
--
2.43.5
* [PATCH v3 02/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-15 16:08 ` [PATCH v3 01/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 03/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
` (13 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. It defines various control structures that hold state for things
like TDs or vCPUs. These control structures are stored in pages given to
the TDX module and encrypted with either the global KeyID or the guest
KeyIDs.
To manipulate these control structures the TDX module defines a few
SEAMCALLs. KVM will use these during the process of creating a TD as
follows:
1) Allocate a unique TDX KeyID for a new guest.
2) Call TDH.MNG.CREATE to create a "TD Root" (TDR) page, together with
the newly allocated KeyID. Unlike the rest of the TDX guest, the TDR
page is crypto-protected by the 'global KeyID'.
3) Call the previously added TDH.MNG.KEY.CONFIG on each package to
configure the KeyID for the guest. After this step, the KeyID to
protect the guest is ready and the rest of the guest will be protected
by this KeyID.
4) Call TDH.MNG.ADDCX to add TD Control Structure (TDCS) pages.
5) Call TDH.MNG.INIT to initialize the TDCS.
To reclaim these pages for use by the kernel other SEAMCALLs are needed,
which will be added in future patches.
Add tdh_mng_addcx(), tdh_mng_create() and tdh_mng_init() to export these
SEAMCALLs so that KVM can use them to create TDs.
For SEAMCALLs that give a page to the TDX module to be encrypted, CLFLUSH
the page mapped with KeyID 0, such that any dirty cache lines don't write
back later and clobber TD memory or control structures. Don't worry about
the other MK-TME KeyIDs because the kernel doesn't use them. The TDX docs
specify that this flush is not needed unless the TDX module exposes the
CLFLUSH_BEFORE_ALLOC feature bit. Be conservative and always flush. Add a
helper function to facilitate this.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-3-rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 51 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 57 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 5045ab1c3d5b..131356bed6f5 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -126,8 +126,11 @@ struct tdx_td {
struct page **tdcs_pages;
};
+u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_mng_key_config(struct tdx_td *td);
+u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_mng_key_freeid(struct tdx_td *td);
+u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 1ffbdb840004..ce4b1e96c5b0 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1462,6 +1462,29 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td)
return page_to_phys(td->tdr_page);
}
+/*
+ * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
+ * a CLFLUSH of pages is required before handing them to the TDX module.
+ * Be conservative and make the code simpler by doing the CLFLUSH
+ * unconditionally.
+ */
+static void tdx_clflush_page(struct page *page)
+{
+ clflush_cache_range(page_to_virt(page), PAGE_SIZE);
+}
+
+u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_phys(tdcs_page),
+ .rdx = tdx_tdr_pa(td),
+ };
+
+ tdx_clflush_page(tdcs_page);
+ return seamcall(TDH_MNG_ADDCX, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1472,6 +1495,18 @@ u64 tdh_mng_key_config(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mng_key_config);
+u64 tdh_mng_create(struct tdx_td *td, u16 hkid)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = hkid,
+ };
+
+ tdx_clflush_page(td->tdr_page);
+ return seamcall(TDH_MNG_CREATE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_create);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1481,3 +1516,19 @@ u64 tdh_mng_key_freeid(struct tdx_td *td)
return seamcall(TDH_MNG_KEY_FREEID, &args);
}
EXPORT_SYMBOL_GPL(tdh_mng_key_freeid);
+
+u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = td_params,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MNG_INIT, &args);
+
+ *extended_err = args.rcx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mng_init);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 5579317f67ab..0861c3f09576 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -15,8 +15,11 @@
/*
* TDX module SEAMCALL leaf functions
*/
+#define TDH_MNG_ADDCX 1
#define TDH_MNG_KEY_CONFIG 8
+#define TDH_MNG_CREATE 9
#define TDH_MNG_KEY_FREEID 20
+#define TDH_MNG_INIT 21
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
--
2.43.5
* [PATCH v3 03/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-15 16:08 ` [PATCH v3 01/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 02/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
` (12 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. It defines various control structures that hold state for
virtualized components of the TD (i.e. VMs or vCPUs). These control
structures are stored in pages given to the TDX module and encrypted
with either the global KeyID or the guest KeyIDs.
To manipulate these control structures the TDX module defines a few
SEAMCALLs. KVM will use these during the process of creating a vCPU as
follows:
1) Call TDH.VP.CREATE to create a TD vCPU Root (TDVPR) page for each
vCPU.
2) Call TDH.VP.ADDCX to add per-vCPU control pages (TDCX) for each vCPU.
3) Call TDH.VP.INIT to initialize the TDCX for each vCPU.
To reclaim these pages for use by the kernel other SEAMCALLs are needed,
which will be added in future patches.
Export functions to allow KVM to make these SEAMCALLs. Export two
variants for TDH.VP.INIT, in order to support the planned logic of KVM
to support TDX modules with and without the ENUM_TOPOLOGY feature. If
KVM can drop support for the !ENUM_TOPOLOGY case, this could go down to a
single version. Leave that for later discussion.
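The two variants differ in the version field of the SEAMCALL leaf: per the
TDX_VERSION_SHIFT comment added by this patch, bits 15:0 carry the leaf
number and bits 23:16 the version. A minimal userspace sketch of that
encoding follows; the function names and LEAF_VERSION_SHIFT are
illustrative, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors TDX_VERSION_SHIFT from the patch; the name is illustrative. */
#define LEAF_VERSION_SHIFT	16

/* Build a SEAMCALL function number from a leaf number and a version. */
static uint64_t seamcall_leaf(uint16_t leaf, unsigned int version)
{
	assert(version <= 0xff);	/* the version field is 8 bits wide */
	return (uint64_t)leaf | ((uint64_t)version << LEAF_VERSION_SHIFT);
}

/* Extract bits 15:0, the leaf number. */
static uint16_t leaf_number(uint64_t fn)
{
	return (uint16_t)(fn & 0xffff);
}

/* Extract bits 23:16, the version number. */
static unsigned int leaf_version(uint64_t fn)
{
	return (unsigned int)((fn >> LEAF_VERSION_SHIFT) & 0xff);
}
```

With TDH_VP_INIT defined as 22, version 0 yields a function number of 22
and version 1 yields 0x10016, which is the value tdh_vp_init_apicid()
passes in the diff below.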
The TDX module provides SEAMCALLs to hand pages to the TDX module for
storing TDX controlled state. SEAMCALLs that operate on this state are
directed to the appropriate TD vCPU using references to the pages
originally provided for managing the vCPU's state. So the host kernel
needs to track these pages, both as an ID for specifying which vCPU to
operate on, and to allow them to be eventually reclaimed. The vCPU
associated pages are called TDVPR (Trust Domain Virtual Processor Root)
and TDCX (Trust Domain Control Extension).
Introduce "struct tdx_vp" for holding references to pages provided to the
TDX module for the TD vCPU associated state. Don't plan for any vCPU
associated state that is controlled by KVM to live in this struct. Only
expect it to hold data for concepts specific to the TDX architecture, for
which there can't already be storage in KVM.
Add both the TDVPR page and an array of TDCX pages, even though the
SEAMCALL wrappers will only need to know about the TDVPR pages for
directing the SEAMCALLs to the right vCPU. Adding the TDCX pages to this
struct will let all of the vCPU associated pages handed to the TDX module be
tracked in one location. To specify physical pages, use struct page
pointers rather than raw physical addresses, matching the rest of this
series.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-4-rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 15 +++++++++++
arch/x86/virt/vmx/tdx/tdx.c | 54 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 11 ++++++++
3 files changed, 80 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 131356bed6f5..2b993ea24297 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -124,13 +124,28 @@ struct tdx_td {
int tdcs_nr_pages;
/* TD control structure: */
struct page **tdcs_pages;
+
+ /* Size of `tdcx_pages` in struct tdx_vp */
+ int tdcx_nr_pages;
+};
+
+struct tdx_vp {
+ /* TDVP root page */
+ struct page *tdvpr_page;
+
+ /* TD vCPU control structure: */
+ struct page **tdcx_pages;
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
+u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
+u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
+u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index ce4b1e96c5b0..a3804e8bdf55 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -5,6 +5,7 @@
* Intel Trusted Domain Extensions (TDX) support
*/
+#include "asm/page_types.h"
#define pr_fmt(fmt) "virt/tdx: " fmt
#include <linux/types.h>
@@ -1462,6 +1463,11 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td)
return page_to_phys(td->tdr_page);
}
+static inline u64 tdx_tdvpr_pa(struct tdx_vp *td)
+{
+ return page_to_phys(td->tdvpr_page);
+}
+
/*
* The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
* a CLFLUSH of pages is required before handing them to the TDX module.
@@ -1485,6 +1491,18 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_phys(tdcx_page),
+ .rdx = tdx_tdvpr_pa(vp),
+ };
+
+ tdx_clflush_page(tdcx_page);
+ return seamcall(TDH_VP_ADDCX, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_addcx);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1507,6 +1525,18 @@ u64 tdh_mng_create(struct tdx_td *td, u16 hkid)
}
EXPORT_SYMBOL_GPL(tdh_mng_create);
+u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = tdx_tdr_pa(td),
+ };
+
+ tdx_clflush_page(vp->tdvpr_page);
+ return seamcall(TDH_VP_CREATE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_create);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1532,3 +1562,27 @@ u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err)
return ret;
}
EXPORT_SYMBOL_GPL(tdh_mng_init);
+
+u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = initial_rcx,
+ };
+
+ return seamcall(TDH_VP_INIT, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_init);
+
+u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = initial_rcx,
+ .r8 = x2apicid,
+ };
+
+ /* apicid requires version == 1. */
+ return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_init_apicid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 0861c3f09576..f0464f7d9780 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,10 +16,13 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
+#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
+#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
@@ -28,6 +31,14 @@
#define TDH_SYS_TDMR_INIT 36
#define TDH_SYS_CONFIG 45
+/*
+ * SEAMCALL leaf:
+ *
+ * Bit 15:0 Leaf number
+ * Bit 23:16 Version number
+ */
+#define TDX_VERSION_SHIFT 16
+
/* TDX page types */
#define PT_NDA 0x0
#define PT_RSVD 0x1
--
2.43.5
* [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (2 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 03/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:38 ` Dave Hansen
2025-01-15 16:09 ` [PATCH v3 05/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
` (11 subsequent siblings)
15 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. The TDX module uses pages provided by the host for both control
structures and for TD guest pages. These pages are encrypted using the
MK-TME encryption engine, with its special requirements around cache
invalidation. For its own security, the TDX module ensures pages are
flushed properly and tracks which usage they are currently assigned to. For
creating and tearing down TD VMs and vCPUs KVM will need to use the
TDH.PHYMEM.PAGE.RECLAIM, TDH.PHYMEM.CACHE.WB, and TDH.PHYMEM.PAGE.WBINVD
SEAMCALLs.
Add tdh_phymem_page_reclaim() to enable KVM to call
TDH.PHYMEM.PAGE.RECLAIM to reclaim the page for use by the host kernel.
This effectively resets its state in the TDX module's page tracking
(PAMT), if the page is available to be reclaimed. This will be used by KVM
to reclaim the various types of pages owned by the TDX module. It will
have a small wrapper in KVM that retries in the case of a relevant error
code. Don't implement this wrapper in arch/x86 because KVM's solution
around retrying SEAMCALLs will be better located in a single place.
Add tdh_phymem_cache_wb() to enable KVM to call TDH.PHYMEM.CACHE.WB to do
a cache write back in a way that the TDX module can verify, before it
allows a KeyID to be freed. The KVM code will use this to have a small
wrapper that handles retries. Since the TDH.PHYMEM.CACHE.WB operation is
interruptible, have tdh_phymem_cache_wb() take a resume argument to pass
this info to the TDX module for restarts. It is worth noting that this
SEAMCALL uses a SEAM specific MSR to do the write back in sections. In
this way it does export some new functionality that affects CPU state.
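The resume argument can be illustrated with a userspace sketch of a retry
loop. This is not the KVM wrapper the text refers to: fake_cache_wb() is a
hypothetical stand-in for tdh_phymem_cache_wb(), and the STATUS_* values
are placeholders rather than real TDX status codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder status values, not the TDX module's encodings. */
#define STATUS_SUCCESS			0ULL
#define STATUS_INTERRUPTED_RESUMABLE	1ULL

/* Fake write-back engine that handles one chunk per call. */
struct fake_wb {
	int total_chunks;
	int chunks_left;
	int calls;
};

/* Stand-in for the SEAMCALL wrapper: resume == false starts over. */
static uint64_t fake_cache_wb(struct fake_wb *wb, bool resume)
{
	assert(wb != NULL);
	if (!resume)
		wb->chunks_left = wb->total_chunks;
	wb->calls++;
	if (wb->chunks_left > 0)
		wb->chunks_left--;
	return wb->chunks_left ? STATUS_INTERRUPTED_RESUMABLE : STATUS_SUCCESS;
}

/* Start fresh once, then keep resuming until the write back completes. */
static uint64_t cache_wb_until_done(struct fake_wb *wb)
{
	bool resume = false;
	uint64_t err;

	do {
		err = fake_cache_wb(wb, resume);
		resume = true;
	} while (err == STATUS_INTERRUPTED_RESUMABLE);

	return err;
}
```

The only point of the sketch is that the first call passes resume == false
and every retry after an interruption passes resume == true, so that work
already done is not repeated.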
Add tdh_phymem_page_wbinvd_tdr() to enable KVM to call
TDH.PHYMEM.PAGE.WBINVD to do a cache write back and invalidate of a TDR,
using the global KeyID. The underlying TDH.PHYMEM.PAGE.WBINVD SEAMCALL
requires the related KeyID to be encoded into the SEAMCALL args. Since the
global KeyID is not exposed to KVM, a dedicated wrapper is needed for TDR
focused TDH.PHYMEM.PAGE.WBINVD operations.
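Encoding a KeyID into a physical address amounts to placing it in the bits
just above the physical address bits. A minimal userspace sketch, where
phys_bits is a plain parameter standing in for
boot_cpu_data.x86_phys_bits and both function names are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Place the KeyID directly above the physical address bits. */
static uint64_t keyed_paddr(uint16_t hkid, uint64_t paddr,
			    unsigned int phys_bits)
{
	assert(phys_bits < 64);
	/*
	 * Widen before shifting: a 16-bit value is promoted to int, and
	 * shifting an int by 32 or more bits is undefined behavior.
	 */
	return paddr | ((uint64_t)hkid << phys_bits);
}

/* Recover the KeyID from a keyed physical address. */
static uint16_t paddr_keyid(uint64_t keyed, unsigned int phys_bits)
{
	assert(phys_bits < 64);
	return (uint16_t)(keyed >> phys_bits);
}
```

For example, with 46 physical address bits, KeyID 3 and address 0x1000
combine to (3ULL << 46) | 0x1000.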
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-5-rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 16 ++++++++++++++
arch/x86/virt/vmx/tdx/tdx.c | 42 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 61 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 2b993ea24297..4176f734118b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -137,6 +137,19 @@ struct tdx_vp {
struct page **tdcx_pages;
};
+
+static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
+{
+ u64 ret;
+
+ ret = page_to_phys(page);
+ /* KeyID bits are just above the physical address bits: */
+ ret |= (u64)hkid << boot_cpu_data.x86_phys_bits;
+
+ return ret;
+
+}
+
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
@@ -146,6 +159,9 @@ u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_phymem_cache_wb(bool resume);
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a3804e8bdf55..39c4d023e452 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1586,3 +1586,45 @@ u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
}
EXPORT_SYMBOL_GPL(tdh_vp_init_apicid);
+
+/*
+ * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX defined formats.
+ * So despite the names, they must be interpreted specially as described by the spec. Return
+ * them only for error reporting purposes.
+ */
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_phys(page),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_PHYMEM_PAGE_RECLAIM, &args);
+
+ *tdx_pt = args.rcx;
+ *tdx_owner = args.rdx;
+ *tdx_size = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+
+u64 tdh_phymem_cache_wb(bool resume)
+{
+ struct tdx_module_args args = {
+ .rcx = resume ? 1 : 0,
+ };
+
+ return seamcall(TDH_PHYMEM_CACHE_WB, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_cache_wb);
+
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
+{
+ struct tdx_module_args args = {};
+
+ args.rcx = mk_keyed_paddr(tdx_global_keyid, td->tdr_page);
+
+ return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index f0464f7d9780..7a15c9afcdfa 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -24,11 +24,14 @@
#define TDH_MNG_INIT 21
#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_PHYMEM_PAGE_RECLAIM 28
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
#define TDH_SYS_LP_INIT 35
#define TDH_SYS_TDMR_INIT 36
+#define TDH_PHYMEM_CACHE_WB 40
+#define TDH_PHYMEM_PAGE_WBINVD 41
#define TDH_SYS_CONFIG 45
/*
--
2.43.5
* [PATCH v3 05/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (3 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 06/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
` (10 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. The TDX module has TD scoped and vCPU scoped "metadata fields".
These fields are a bit like VMCS fields, and are stored in data structures
maintained by the TDX module. Export 3 SEAMCALLs for use in reading and
writing these fields:
Make tdh_mng_rd() use TDH.MNG.RD to read the TD scoped metadata.
Make tdh_vp_rd()/tdh_vp_wr() use TDH.VP.RD/WR to read/write the vCPU
scoped metadata.
KVM will use these by creating inline helpers that target various metadata
sizes. Export the raw SEAMCALL leaf, to avoid exporting the large number
of various sized helpers.
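As a rough illustration of what sized helpers built on top of a raw
data/mask write could compute, here is a userspace sketch. It assumes the
mask argument selects which bits of a field are replaced; size_mask() and
masked_write() are hypothetical names that model that assumption, not the
TDX module's implementation or KVM's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low 'bits' bits of a 64-bit field. */
static uint64_t size_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Model of a masked write: only the bits set in 'mask' are replaced. */
static uint64_t masked_write(uint64_t old, uint64_t data, uint64_t mask)
{
	return (old & ~mask) | (data & mask);
}
```

An 8-bit helper would pass size_mask(8) as the mask and a 64-bit helper
size_mask(64), while the exported wrapper itself stays size-agnostic.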
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Message-ID: <20241203010317.827803-6-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 47 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 53 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 4176f734118b..0a5ecda98713 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -155,9 +155,12 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
+u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
+u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
+u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
u64 tdh_phymem_cache_wb(bool resume);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 39c4d023e452..387fec057bd3 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1537,6 +1537,23 @@ u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp)
}
EXPORT_SYMBOL_GPL(tdh_vp_create);
+u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = field,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MNG_RD, &args);
+
+ /* R8: Content of the field, or 0 in case of error. */
+ *data = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mng_rd);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1574,6 +1591,36 @@ u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx)
}
EXPORT_SYMBOL_GPL(tdh_vp_init);
+u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = field,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_VP_RD, &args);
+
+ /* R8: Content of the field, or 0 in case of error. */
+ *data = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_vp_rd);
+
+u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = field,
+ .r8 = data,
+ .r9 = mask,
+ };
+
+ return seamcall(TDH_VP_WR, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_wr);
+
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 7a15c9afcdfa..aacd38b12989 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -19,11 +19,13 @@
#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
+#define TDH_MNG_RD 11
#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_VP_RD 26
#define TDH_PHYMEM_PAGE_RECLAIM 28
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
@@ -32,6 +34,7 @@
#define TDH_SYS_TDMR_INIT 36
#define TDH_PHYMEM_CACHE_WB 40
#define TDH_PHYMEM_PAGE_WBINVD 41
+#define TDH_VP_WR 43
#define TDH_SYS_CONFIG 45
/*
--
2.43.5
* [PATCH v3 06/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. The TDX module has the concept of flushing vCPUs. These flushes
include both a flush of the translation caches and also any other state
internal to the TDX module. Before freeing a KeyID, this flush operation
needs to be done. KVM will need to perform the flush on each pCPU
associated with the TD, and also perform a TD scoped operation that checks
if the flush has been done on all vCPUs associated with the TD.
Add a tdh_vp_flush() function to be used to call TDH.VP.FLUSH on each pCPU
associated with the TD during TD teardown. It will also be called when
disabling TDX and during vCPU migration between pCPUs.
Add tdh_mng_vpflushdone() to be used by KVM to call TDH.MNG.VPFLUSHDONE.
KVM will use this during TD teardown to verify that TDH.VP.FLUSH has been
called sufficiently, and advance the state machine that will allow for
reclaiming the TD's KeyID.
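The ordering described above can be modeled with a small userspace sketch. This is a toy state machine, not the TDX module's real one: struct toy_td, the toy_* functions and TOY_ERR_FLUSH_PENDING are all invented for illustration, and any other teardown steps that sit between the flush and the KeyID release are left out.

```c
#include <assert.h>	/* only needed by callers that check results */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

#define TOY_NR_VCPUS		3
#define TOY_SUCCESS		0ULL
#define TOY_ERR_FLUSH_PENDING	1ULL	/* placeholder, not a real TDX status code */

/* Toy TD: tracks which vCPUs were flushed and how far teardown has progressed. */
struct toy_td {
	bool vcpu_flushed[TOY_NR_VCPUS];
	bool flush_done;
	bool keyid_freed;
};

/* Stand-in for tdh_vp_flush() on one vCPU. */
static u64 toy_vp_flush(struct toy_td *td, int vcpu)
{
	td->vcpu_flushed[vcpu] = true;
	return TOY_SUCCESS;
}

/* Stand-in for tdh_mng_vpflushdone(): succeeds only once every vCPU was flushed. */
static u64 toy_mng_vpflushdone(struct toy_td *td)
{
	for (int i = 0; i < TOY_NR_VCPUS; i++)
		if (!td->vcpu_flushed[i])
			return TOY_ERR_FLUSH_PENDING;
	td->flush_done = true;
	return TOY_SUCCESS;
}

/* Stand-in for tdh_mng_key_freeid(): refuses to run before the flush check passed. */
static u64 toy_mng_key_freeid(struct toy_td *td)
{
	if (!td->flush_done)
		return TOY_ERR_FLUSH_PENDING;
	td->keyid_freed = true;
	return TOY_SUCCESS;
}

/* Teardown order: flush every vCPU, verify the flushes, then release the KeyID. */
static u64 toy_td_teardown(struct toy_td *td)
{
	u64 ret;

	for (int i = 0; i < TOY_NR_VCPUS; i++) {
		ret = toy_vp_flush(td, i);
		if (ret != TOY_SUCCESS)
			return ret;
	}
	ret = toy_mng_vpflushdone(td);
	if (ret != TOY_SUCCESS)
		return ret;
	return toy_mng_key_freeid(td);
}
```

In this model, calling the verification step early simply fails, which mirrors why the per-vCPU flush has to come first.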
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-7-rick.p.edgecombe@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 20 ++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 24 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 0a5ecda98713..1d84cf8e2abe 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -156,6 +156,8 @@ u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
+u64 tdh_vp_flush(struct tdx_vp *vp);
+u64 tdh_mng_vpflushdone(struct tdx_td *td);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 387fec057bd3..83fc01bfd55d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1554,6 +1554,26 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
}
EXPORT_SYMBOL_GPL(tdh_mng_rd);
+u64 tdh_vp_flush(struct tdx_vp *vp)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ };
+
+ return seamcall(TDH_VP_FLUSH, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_flush);
+
+u64 tdh_mng_vpflushdone(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_VPFLUSHDONE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_vpflushdone);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index aacd38b12989..62cb7832c42d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -20,6 +20,8 @@
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
+#define TDH_VP_FLUSH 18
+#define TDH_MNG_VPFLUSHDONE 19
#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
--
2.43.5
* [PATCH v3 07/14] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a Secure EPT
(S-EPT or SEPT) tree per TD for private GPA to HPA translation. Wrap the
TDH.MEM.SEPT.ADD SEAMCALL with tdh_mem_sept_add() to provide pages to the
TDX module for building a TD's SEPT tree. (Refer to these pages as SEPT
pages).
Callers need to allocate and provide a normal page to tdh_mem_sept_add(),
which then passes the page to the TDX module via the SEAMCALL
TDH.MEM.SEPT.ADD. The TDX module then installs the page into the SEPT tree and
encrypts this SEPT page with the TD's guest keyID. The kernel cannot use
the SEPT page until after reclaiming it via TDH.MEM.SEPT.REMOVE or
TDH.PHYMEM.PAGE.RECLAIM.
Before passing the page to the TDX module, tdh_mem_sept_add() performs a
CLFLUSH on the page mapped with keyID 0 to ensure that any dirty cache
lines don't write back later and clobber TD memory or control structures.
Don't worry about the other MK-TME keyIDs because the kernel doesn't use
them. The TDX docs specify that this flush is not needed unless the TDX
module exposes the CLFLUSH_BEFORE_ALLOC feature bit. Do the CLFLUSH
unconditionally for two reasons: first, a single path that handles both the
!CLFLUSH_BEFORE_ALLOC and CLFLUSH_BEFORE_ALLOC cases keeps the solution
simple; second, starting with the conservative choice avoids wading into
any correctness uncertainty.
Callers should specify "GPA" and "level" for the TDX module to install the
SEPT page at the specified position in the SEPT. Do not include the root
page level in "level" since TDH.MEM.SEPT.ADD can only add non-root pages to
the SEPT. Ensure "level" is between 1 and 3 for a 4-level SEPT or between 1
and 4 for a 5-level SEPT.
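A minimal userspace sketch of the level handling described above follows. The toy_* names and the local enum are hypothetical mirrors written for this example. The level arithmetic copies pg_level_to_tdx_sept_level() from this patch, the range check restates the 1..3 and 1..4 bounds from the changelog, and the packing follows the "gpa | level" expression in tdh_mem_sept_add(), assuming a 4K-aligned GPA so that the low bits are free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

/* Local mirror of the kernel's enum pg_level values, for illustration only. */
enum toy_pg_level {
	TOY_PG_LEVEL_NONE,
	TOY_PG_LEVEL_4K,
	TOY_PG_LEVEL_2M,
	TOY_PG_LEVEL_1G,
};

/* Same arithmetic as pg_level_to_tdx_sept_level(): 4K -> 0, 2M -> 1, 1G -> 2. */
static int toy_pg_level_to_sept_level(enum toy_pg_level level)
{
	return (int)level - 1;
}

/* Non-root levels only: 1..3 for a 4-level SEPT, 1..4 for a 5-level SEPT. */
static bool toy_sept_add_level_ok(int level, int sept_levels)
{
	return level >= 1 && level <= sept_levels - 1;
}

/* Pack GPA and level into one operand; assumes the GPA is 4K aligned. */
static u64 toy_pack_gpa_level(u64 gpa, int level)
{
	assert((gpa & 0xfffULL) == 0);
	return gpa | (u64)level;
}
```

For example, a GPA of 0x40000000 at level 2 packs to 0x40000002 in this sketch.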
Call tdh_mem_sept_add() during the TD's build time or during the TD's
runtime. Check for errors from the function return value and retrieve
extended error info from the function output parameters.
The TDX module has many internal locks. To avoid staying in SEAM mode for
too long, SEAMCALLs return a BUSY error code to the kernel instead of
spinning on the locks. Depending on the specific SEAMCALL, the caller
may need to handle this error in specific ways (e.g., retry). Therefore,
return the SEAMCALL error code directly to the caller. Don't attempt to
handle it in the core kernel.
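One way a caller might deal with a BUSY return is a bounded retry loop, sketched below in userspace C. The names are hypothetical: toy_seamcall() is a mock that reports busy a configurable number of times, and TOY_BUSY is a placeholder value rather than a real TDX status code. Whether retrying is appropriate at all depends on the specific SEAMCALL and on the caller's locking, which this sketch does not model.

```c
#include <assert.h>	/* only needed by callers that check results */
#include <stdint.h>

typedef uint64_t u64;

#define TOY_SUCCESS	0ULL
#define TOY_BUSY	1ULL	/* placeholder, not a real TDX status code */

/* Mock module: reports BUSY for the first busy_left calls, then succeeds. */
struct toy_module {
	int busy_left;
	int calls;
};

static u64 toy_seamcall(struct toy_module *m)
{
	m->calls++;
	if (m->busy_left > 0) {
		m->busy_left--;
		return TOY_BUSY;
	}
	return TOY_SUCCESS;
}

/* Caller-side policy: one initial attempt plus at most max_retries retries. */
static u64 toy_call_with_retry(struct toy_module *m, int max_retries)
{
	u64 ret;
	int tries = 0;

	do {
		ret = toy_seamcall(m);
	} while (ret == TOY_BUSY && tries++ < max_retries);

	return ret;
}
```

Because the wrapper hands back the raw status, the retry bound and any back-off stay a decision of the caller rather than of the core kernel.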
TDH.MEM.SEPT.ADD effectively manages two internal resources of the TDX
module: it installs page table pages in the SEPT tree and also updates the
TDX module's page metadata (PAMT). Don't add a wrapper for the matching
SEAMCALL for removing a SEPT page (TDH.MEM.SEPT.REMOVE) because KVM, as the
only in-kernel user, will only tear down the SEPT tree when the TD is being
torn down. When this happens, KVM can instead perform other operations that
reclaim the SEPT pages for the host kernel to use and update the PAMT, and
let the SEPT get trashed.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073624.22114-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 7 ++++++-
arch/x86/virt/vmx/tdx/tdx.c | 19 +++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 1 +
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 1d84cf8e2abe..1be640718692 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -137,7 +137,6 @@ struct tdx_vp {
struct page **tdcx_pages;
};
-
static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
{
u64 ret;
@@ -147,10 +146,16 @@ static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
ret |= hkid << boot_cpu_data.x86_phys_bits;
return ret;
+}
+static inline int pg_level_to_tdx_sept_level(enum pg_level level)
+{
+ WARN_ON_ONCE(level == PG_LEVEL_NONE);
+ return level - 1;
}
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 83fc01bfd55d..77f9c9c2514c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1491,6 +1491,25 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(page),
+ };
+ u64 ret;
+
+ tdx_clflush_page(page);
+ ret = seamcall_ret(TDH_MEM_SEPT_ADD, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_sept_add);
+
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 62cb7832c42d..308d3aa565d7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,6 +16,7 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
--
2.43.5
* [PATCH v3 08/14] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a Secure EPT
(S-EPT or SEPT) tree per TD to translate TD's private memory accessed
using a private GPA. Wrap the SEAMCALL TDH.MEM.PAGE.ADD with
tdh_mem_page_add() and TDH.MEM.PAGE.AUG with tdh_mem_page_aug() to add TD
private pages and map them to the TD's private GPAs in the SEPT.
Callers of tdh_mem_page_add() and tdh_mem_page_aug() allocate and provide
normal pages to the wrappers, which in turn pass those pages to the TDX
module. Before passing the pages to the TDX module, tdh_mem_page_add() and
tdh_mem_page_aug() perform a CLFLUSH on the page mapped with keyID 0 to
ensure that any dirty cache lines don't write back later and clobber TD
memory or control structures. Don't worry about the other MK-TME keyIDs
because the kernel doesn't use them. The TDX docs specify that this flush
is not needed unless the TDX module exposes the CLFLUSH_BEFORE_ALLOC
feature bit. Do the CLFLUSH unconditionally for two reasons: first, a
single path that handles both the !CLFLUSH_BEFORE_ALLOC and
CLFLUSH_BEFORE_ALLOC cases keeps the solution simple; second, starting with
the conservative choice avoids wading into any correctness uncertainty.
Call tdh_mem_page_add() to add a private page to a TD during the TD's build
time (i.e., before TDH.MR.FINALIZE). Specify which GPA the 4K private page
will map to. No need to specify level info since TDH.MEM.PAGE.ADD only adds
pages at 4K level. To provide initial contents to the TD, provide an additional
source page residing in memory managed by the host kernel itself (encrypted
with a shared keyID). The TDX module will copy the initial contents from
the source page in shared memory into the private page after mapping the
page in the SEPT to the specified private GPA. The TDX module allows the
source page to be the same page as the private page to be added. In that
case, the TDX module converts and encrypts the source page as a TD private
page.
Call tdh_mem_page_aug() to add a private page to a TD during the TD's
runtime (i.e., after TDH.MR.FINALIZE). TDH.MEM.PAGE.AUG supports adding
huge pages. Specify which GPA the private page will map to, along with
level info embedded in the lower bits of the GPA. The TDX module will
recognize the added page as the TD's private page after the TD's acceptance
with TDCALL TDG.MEM.PAGE.ACCEPT.
tdh_mem_page_add() and tdh_mem_page_aug() may fail. Callers can check the
function return value and retrieve extended error info from the function
output parameters.
The TDX module has many internal locks. To avoid staying in SEAM mode for
too long, SEAMCALLs return a BUSY error code to the kernel instead of
spinning on the locks. Depending on the specific SEAMCALL, the caller
may need to handle this error in specific ways (e.g., retry). Therefore,
return the SEAMCALL error code directly to the caller. Don't attempt to
handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073636.22129-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 39 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 43 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 1be640718692..5a615cfefe36 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -155,8 +155,10 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
}
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2);
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
+u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 77f9c9c2514c..e9ab7346e0ca 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1491,6 +1491,26 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(page),
+ .r9 = page_to_phys(source),
+ };
+ u64 ret;
+
+ tdx_clflush_page(page);
+ ret = seamcall_ret(TDH_MEM_PAGE_ADD, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_add);
+
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2)
{
struct tdx_module_args args = {
@@ -1522,6 +1542,25 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
}
EXPORT_SYMBOL_GPL(tdh_vp_addcx);
+u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(page),
+ };
+ u64 ret;
+
+ tdx_clflush_page(page);
+ ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 308d3aa565d7..80e6ef006085 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,8 +16,10 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_PAGE_ADD 2
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
+#define TDH_MEM_PAGE_AUG 6
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
--
2.43.5
* [PATCH v3 09/14] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
The TDX module defines a TLB tracking protocol to make sure that no logical
processor holds any stale Secure EPT (S-EPT or SEPT) TLB translations for a
given TD private GPA range. After a successful TDH.MEM.RANGE.BLOCK,
TDH.MEM.TRACK, and kicking off all vCPUs, the TDX module ensures that the
subsequent TDH.VP.ENTER on each vCPU will flush all stale TLB entries for
the specified GPA ranges in TDH.MEM.RANGE.BLOCK. Wrap the
TDH.MEM.RANGE.BLOCK with tdh_mem_range_block() and TDH.MEM.TRACK with
tdh_mem_track() to enable the kernel to assist the TDX module in TLB
tracking management.
The caller of tdh_mem_range_block() needs to specify "GPA" and "level" to
request the TDX module to block the subsequent creation of TLB translation
for a GPA range. This GPA range can correspond to a SEPT page or a TD
private page at any level.
Contentions and errors are possible with the SEAMCALL TDH.MEM.RANGE.BLOCK.
Therefore, the caller of tdh_mem_range_block() needs to check the function
return value and retrieve extended error info from the function output
params.
Upon TDH.MEM.RANGE.BLOCK success, no new TLB entries will be created for
the specified private GPA range, though the existing TLB translations may
still persist. TDH.MEM.TRACK will then advance the TD's epoch counter to
ensure the TDX module will flush TLBs in all vCPUs once the vCPUs re-enter
the TD. TDH.MEM.TRACK will fail to advance the TD's epoch counter if there
are vCPUs still running in non-root mode at the previous TD epoch counter.
So to ensure private GPA translations are flushed, callers must first call
tdh_mem_range_block(), then tdh_mem_track(), and lastly send IPIs to kick
all the vCPUs and force them to re-enter, thus triggering the TLB flush.
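The three-step ordering above (block, track, kick) can be sketched as a toy userspace model. All names here are hypothetical, and the model is deliberately simplified: it assumes that a vCPU entering at an older epoch than the TD's gets its TLB flushed, and it does not model contention or the case where tracking fails because vCPUs are still running at the previous epoch.

```c
#include <assert.h>	/* only needed by callers that check results */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t u64;

#define TOY_NR_VCPUS 2

struct toy_vcpu {
	u64 seen_epoch;		/* TD epoch observed at the last entry */
	bool stale_tlb;		/* may still cache a translation for the range */
};

struct toy_td {
	u64 epoch;
	bool range_blocked;
	struct toy_vcpu vcpus[TOY_NR_VCPUS];
};

/* Step 1, stand-in for tdh_mem_range_block(): no new translations after this. */
static void toy_range_block(struct toy_td *td)
{
	td->range_blocked = true;
}

/* Step 2, stand-in for tdh_mem_track(): advance the TD's epoch counter. */
static void toy_track(struct toy_td *td)
{
	td->epoch++;
}

/* Step 3, kick and re-enter: an entry at an older epoch flushes that vCPU. */
static void toy_kick_and_reenter(struct toy_td *td)
{
	for (int i = 0; i < TOY_NR_VCPUS; i++) {
		struct toy_vcpu *v = &td->vcpus[i];

		if (v->seen_epoch != td->epoch) {
			v->stale_tlb = false;
			v->seen_epoch = td->epoch;
		}
	}
}

static bool toy_any_stale(const struct toy_td *td)
{
	for (int i = 0; i < TOY_NR_VCPUS; i++)
		if (td->vcpus[i].stale_tlb)
			return true;
	return false;
}

/* The full sequence, in the order the changelog requires. */
static void toy_zap_range(struct toy_td *td)
{
	toy_range_block(td);
	toy_track(td);
	toy_kick_and_reenter(td);
}
```

In this model a kick without a preceding track leaves the stale entries in place, which is why the order of the three steps matters.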
Don't export a single combined operation; instead, export functions that
just expose the block and track operations, for several reasons:
1. The vCPU kick should use KVM's functionality for doing this, which can better
target sending IPIs to only the minimum required pCPUs.
2. tdh_mem_track() doesn't need to be executed if a vCPU has not entered a TD,
which is information only KVM knows.
3. Leaving the operations separate will allow for batching many
tdh_mem_range_block() calls before a tdh_mem_track(). While this batching will
not be done initially by KVM, it demonstrates that keeping mem block and track
as separate operations is a generally good design.
Contentions are also possible in TDH.MEM.TRACK. For example, TDH.MEM.TRACK
may contend with TDH.VP.ENTER when advancing the TD epoch counter.
tdh_mem_track() does not retry on behalf of the caller. Callers can either
avoid the contention themselves or retry on their own.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073648.22143-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 5a615cfefe36..cd259d80a11c 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -159,6 +159,7 @@ u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
+u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, int level, u64 *ext_err1, u64 *ext_err2);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
@@ -172,6 +173,7 @@ u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_mem_track(struct tdx_td *tdr);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
#else
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index e9ab7346e0ca..cf488ef83da1 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1561,6 +1561,23 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u
}
EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, int level, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_range_block);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1734,6 +1751,16 @@ u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+u64 tdh_mem_track(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MEM_TRACK, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mem_track);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 80e6ef006085..bfbc6a07ee2e 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -20,6 +20,7 @@
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MEM_PAGE_AUG 6
+#define TDH_MEM_RANGE_BLOCK 7
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
@@ -37,6 +38,7 @@
#define TDH_SYS_RD 34
#define TDH_SYS_LP_INIT 35
#define TDH_SYS_TDMR_INIT 36
+#define TDH_MEM_TRACK 38
#define TDH_PHYMEM_CACHE_WB 40
#define TDH_PHYMEM_PAGE_WBINVD 41
#define TDH_VP_WR 43
--
2.43.5
* [PATCH v3 10/14] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a single Secure
EPT (S-EPT or SEPT) tree per TD to translate TD's private memory accessed
using a private GPA. Wrap the SEAMCALL TDH.MEM.PAGE.REMOVE with
tdh_mem_page_remove() and TDH.PHYMEM.PAGE.WBINVD with
tdh_phymem_page_wbinvd_hkid() to unmap a TD private page from the SEPT,
remove the TD private page from the TDX module and flush cache lines to
memory after removal of the private page.
Callers should specify "GPA" and "level" when calling tdh_mem_page_remove()
to indicate to the TDX module which TD private page to unmap and remove.
TDH.MEM.PAGE.REMOVE may fail, and the caller of tdh_mem_page_remove() can
check the function return value and retrieve extended error information
from the function output parameters. Follow the TLB tracking protocol
before calling tdh_mem_page_remove() to remove a TD private page to avoid
SEAMCALL failure.
After removing a TD's private page, the TDX module does not write back and
invalidate cache lines associated with the page and the page's keyID (i.e.,
the TD's guest keyID). Therefore, provide tdh_phymem_page_wbinvd_hkid() to
allow the caller to pass in the TD's guest keyID and invoke
TDH.PHYMEM.PAGE.WBINVD to perform this action.
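To illustrate how a keyID can be combined with a physical address, here is a small userspace sketch modeled on the mk_keyed_paddr() helper from this series. It is an assumption that tdh_phymem_page_wbinvd_hkid() builds its operand this way, since its body is not shown in this part of the thread. The toy_* names are hypothetical, and phys_bits stands in for boot_cpu_data.x86_phys_bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint16_t u16;

/* Place the keyID in the bits directly above the physical address bits. */
static u64 toy_mk_keyed_paddr(u16 hkid, u64 paddr, unsigned int phys_bits)
{
	assert(phys_bits < 64);
	/* Widen hkid first so the shift is done in 64-bit arithmetic. */
	return paddr | ((u64)hkid << phys_bits);
}

/* Inverse of the above, handy for checking the packing. */
static u16 toy_keyed_paddr_hkid(u64 keyed, unsigned int phys_bits)
{
	return (u16)(keyed >> phys_bits);
}
```

With 46 physical address bits, keyID 5 and address 0x1000 pack to (5 << 46) | 0x1000, and keyID 0 leaves the address unchanged.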
Before reusing the page, the host kernel needs to map the page with keyID 0
and invoke movdir64b() to convert the TD private page to a normal shared
page.
TDH.MEM.PAGE.REMOVE and TDH.PHYMEM.PAGE.WBINVD may encounter contention
inside the TDX module for TDX's internal resources. To avoid staying in SEAM
mode for too long, the TDX module will return a BUSY error code to the kernel
instead of spinning on the locks. The caller may need to handle this error
in specific ways (e.g., retry). The wrappers return the SEAMCALL error code
directly to the caller. Don't attempt to handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073658.22157-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 1 +
3 files changed, 30 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index cd259d80a11c..50b06d91073e 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -174,8 +174,10 @@ u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
u64 tdh_mem_track(struct tdx_td *tdr);
+u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *ext_err1, u64 *ext_err2);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
+u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index cf488ef83da1..91f76501592a 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1761,6 +1761,23 @@ u64 tdh_mem_track(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mem_track);
+u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_PAGE_REMOVE, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_remove);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
@@ -1780,3 +1797,13 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
+
+u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page)
+{
+ struct tdx_module_args args = {};
+
+ args.rcx = mk_keyed_paddr(hkid, page);
+
+ return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index bfbc6a07ee2e..ee07527df279 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -33,6 +33,7 @@
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_VP_RD 26
#define TDH_PHYMEM_PAGE_RECLAIM 28
+#define TDH_MEM_PAGE_REMOVE 29
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
--
2.43.5
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v3 11/14] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (9 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 10/14] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest Paolo Bonzini
` (4 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
The TDX module measures the TD during the build process and saves the
measurement in TDCS.MRTD to facilitate TD attestation of the initial
contents of the TD. Wrap the SEAMCALL TDH.MR.EXTEND with tdh_mr_extend()
and TDH.MR.FINALIZE with tdh_mr_finalize() to enable the host kernel to
assist the TDX module in performing the measurement.
The measurement in TDCS.MRTD is a SHA-384 digest of the build process.
SEAMCALLs TDH.MNG.INIT and TDH.MEM.PAGE.ADD initialize and contribute to
the MRTD digest calculation.
The caller of tdh_mr_extend() should break the TD private page into chunks
of size TDX_EXTENDMR_CHUNKSIZE and invoke tdh_mr_extend() to add the page
content into the digest calculation. Failures are possible with
TDH.MR.EXTEND (e.g., due to SEPT walking). The caller of tdh_mr_extend()
can check the function return value and retrieve extended error information
from the function output parameters.
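The chunking described above can be sketched in userspace as follows. The page size (4 KiB) and chunk size (256 bytes) are assumptions made for illustration, since the value of TDX_EXTENDMR_CHUNKSIZE is not given in this message, and `fake_mr_extend()` is a hypothetical stand-in for tdh_mr_extend() that simply counts calls.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096u /* assumed 4 KiB private page */
#define SKETCH_EXTENDMR_CHUNK 256u  /* assumed chunk size, not stated above */

/* Test knobs for the fake wrapper. */
static unsigned int fake_extend_calls;
static uint64_t fake_fail_at_gpa = UINT64_MAX;

/* Stand-in for tdh_mr_extend(): 0 on success, nonzero on failure. */
static uint64_t fake_mr_extend(uint64_t gpa, uint64_t *ext_err1,
			       uint64_t *ext_err2)
{
	fake_extend_calls++;
	*ext_err1 = 0;
	*ext_err2 = 0;

	if (gpa == fake_fail_at_gpa) {
		*ext_err1 = gpa; /* pretend extended error info */
		return 1;
	}
	return 0;
}

/* Extend the measurement with one page, chunk by chunk; stop on first error. */
static uint64_t extend_page(uint64_t page_gpa, uint64_t *ext_err1,
			    uint64_t *ext_err2)
{
	assert((page_gpa % SKETCH_PAGE_SIZE) == 0);

	for (uint64_t off = 0; off < SKETCH_PAGE_SIZE;
	     off += SKETCH_EXTENDMR_CHUNK) {
		uint64_t ret = fake_mr_extend(page_gpa + off, ext_err1,
					      ext_err2);
		if (ret)
			return ret;
	}
	return 0;
}
```

Under these assumed sizes one page takes 16 calls, and a failure on any chunk leaves the extended error values from that call in the output parameters.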
Calling tdh_mr_finalize() completes the measurement. The TDX module then
turns the TD into the runnable state. Further TDH.MEM.PAGE.ADD and
TDH.MR.EXTEND calls will fail.
TDH.MR.FINALIZE may fail due to errors such as the TD having no vCPUs or
contention. Check the function return value when calling tdh_mr_finalize()
to determine the exact reason for failure. Take proper locks on the caller's
side to avoid contention failures, or handle the BUSY error in specific
ways (e.g., retry). Return the SEAMCALL error code directly to the caller.
Do not attempt to handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073709.22171-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 50b06d91073e..0c89afffdac4 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -164,6 +164,8 @@ u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
+u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *ext_err1, u64 *ext_err2);
+u64 tdh_mr_finalize(struct tdx_td *td);
u64 tdh_vp_flush(struct tdx_vp *vp);
u64 tdh_mng_vpflushdone(struct tdx_td *td);
u64 tdh_mng_key_freeid(struct tdx_td *td);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 91f76501592a..55851a0591d2 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1629,6 +1629,33 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
}
EXPORT_SYMBOL_GPL(tdh_mng_rd);
+u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *ext_err1, u64 *ext_err2)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MR_EXTEND, &args);
+
+ *ext_err1 = args.rcx;
+ *ext_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mr_extend);
+
+u64 tdh_mr_finalize(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MR_FINALIZE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mr_finalize);
+
u64 tdh_vp_flush(struct tdx_vp *vp)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index ee07527df279..64932450aba3 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -24,6 +24,8 @@
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
+#define TDH_MR_EXTEND 16
+#define TDH_MR_FINALIZE 17
#define TDH_VP_FLUSH 18
#define TDH_MNG_VPFLUSHDONE 19
#define TDH_VP_CREATE 10
--
2.43.5
* [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (10 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 11/14] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 17:14 ` Adrian Hunter
2025-01-15 16:09 ` [PATCH v3 13/14] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
` (3 subsequent siblings)
15 siblings, 1 reply; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Adrian Hunter
From: Kai Huang <kai.huang@intel.com>
Intel TDX protects guest VMs from a malicious host and certain physical
attacks. TDX introduces a new operation mode, Secure Arbitration Mode
(SEAM), to isolate and protect guest VMs. A TDX guest VM runs in SEAM and,
unlike VMX, direct control and interaction with the guest by the host VMM
is not possible. Instead, the Intel TDX Module, which also runs in SEAM,
provides a SEAMCALL API.
The SEAMCALL that provides the ability to enter a guest is TDH.VP.ENTER.
The TDX Module processes TDH.VP.ENTER, and enters the guest via VMX
VMLAUNCH/VMRESUME instructions. When a guest VM-exit requires host VMM
interaction, the TDH.VP.ENTER SEAMCALL returns to the host VMM (KVM).
Add tdh_vp_enter() to wrap the SEAMCALL invocation of TDH.VP.ENTER.
TDH.VP.ENTER is different from other SEAMCALLs in several ways:
- it may take some time to return as the guest executes
- it uses more arguments
- after it returns some host state may need to be restored
TDH.VP.ENTER arguments are passed through General Purpose Registers (GPRs).
For the special case of the TD guest invoking TDG.VP.VMCALL, nearly any GPR
can be used, as well as XMM0 to XMM15. Notably, RBP is not used, and Linux
mandates the TDX Module feature NO_RBP_MOD, which is enforced elsewhere.
Additionally, XMM registers are not required for the existing Guest
Hypervisor Communication Interface and are handled by existing KVM code
should they be modified by the guest.
There are 2 input formats and 5 output formats for TDH.VP.ENTER arguments.
Input #1 : Initial entry or following a previous async. TD Exit
Input #2 : Following a previous TDCALL(TDG.VP.VMCALL)
Output #1 : On Error (No TD Entry)
Output #2 : Async. Exits with a VMX Architectural Exit Reason
Output #3 : Async. Exits with a non-VMX TD Exit Status
Output #4 : Async. Exits with Cross-TD Exit Details
Output #5 : On TDCALL(TDG.VP.VMCALL)
Currently, to keep things simple, the wrapper function does not attempt
to support different formats, and just passes all the GPRs that could be
used. The GPR values are held by KVM in the area set aside for guest
GPRs. KVM code uses the guest GPR area (vcpu->arch.regs[]) to set up for
or process results of tdh_vp_enter().
Therefore changing tdh_vp_enter() to use more complex argument formats
would also alter the way KVM code interacts with tdh_vp_enter().
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Message-ID: <20241121201448.36170-2-adrian.hunter@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 1 +
arch/x86/virt/vmx/tdx/tdx.c | 8 ++++++++
arch/x86/virt/vmx/tdx/tdx.h | 1 +
3 files changed, 10 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 0c89afffdac4..6531b69a53ac 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -154,6 +154,7 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
return level - 1;
}
+u64 tdh_vp_enter(struct tdx_vp *vp, struct tdx_module_args *args);
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2);
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 55851a0591d2..bb6f8ef9661e 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1479,6 +1479,14 @@ static void tdx_clflush_page(struct page *page)
clflush_cache_range(page_to_virt(page), PAGE_SIZE);
}
+u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
+{
+ args->rcx = tdx_tdvpr_pa(td);
+
+ return __seamcall_saved_ret(TDH_VP_ENTER, args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_enter);
+
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 64932450aba3..b71b375b10c0 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -15,6 +15,7 @@
/*
* TDX module SEAMCALL leaf functions
*/
+#define TDH_VP_ENTER 0
#define TDH_MNG_ADDCX 1
#define TDH_MEM_PAGE_ADD 2
#define TDH_MEM_SEPT_ADD 3
--
2.43.5
* [PATCH v3 13/14] x86/virt/tdx: Read essential global metadata for KVM
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (11 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 14/14] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
` (2 subsequent siblings)
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
From: Kai Huang <kai.huang@intel.com>
KVM needs two classes of global metadata to create and run TDX guests:
- "TD Control Structures"
- "TD Configurability"
The first class contains the sizes of TDX guest per-VM and per-vCPU
control structures. KVM will need to use them to allocate enough space
for those control structures.
The second class reports which features are configurable for TDX
guests. KVM will need to use it to configure TDX guests properly.
Read them for KVM TDX to use.
The code change is auto-generated by re-running the script in [1] after
uncommenting the "td_conf" and "td_ctrl" parts to regenerate
tdx_global_metadata.{h,c} and update the existing copies in the
kernel.
#python tdx.py global_metadata.json tdx_global_metadata.h \
tdx_global_metadata.c
The 'global_metadata.json' can be fetched from [2].
Note that as of this writing, the JSON file only allows a maximum of 32
CPUID entries. While this is enough for current contents of the CPUID
leaves, there were plans to change the JSON per TDX module release which
would change the ABI and potentially prevent future versions of the TDX
module from working with older kernels.
While discussions are ongoing with the TDX module team on what exactly
constitutes an ABI breakage, in the meantime the TDX module team has
agreed to not increase the number of CPUID entries beyond 128 without
an opt-in. Therefore the file was tweaked by hand to change the maximum
number of CPUID_CONFIGs to 128.
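To illustrate why the array bound matters, here is a userspace sketch of the read pattern used by the generated code: read the entry count, reject it if it exceeds the fixed array size, then read each entry while carrying the first error forward. The field IDs and `fake_read_field()` are hypothetical stand-ins for the real metadata field IDs and read_sys_metadata_field(). Unlike the generated code, the sketch returns early if reading the count fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUID_CONFIGS 128 /* array bound used in the patch */

/* Hypothetical field IDs, not the real TDX metadata field IDs. */
#define FAKE_FIELD_NUM_CONFIGS 0x1000ULL
#define FAKE_FIELD_LEAF_BASE   0x2000ULL

struct sketch_td_conf {
	uint16_t num_cpuid_config;
	uint64_t cpuid_config_leaves[SKETCH_MAX_CPUID_CONFIGS];
};

/* Test knob: the entry count the fake "module" reports. */
static uint64_t fake_num_configs = 4;

/* Stand-in for read_sys_metadata_field(): 0 on success, -errno on failure. */
static int fake_read_field(uint64_t field_id, uint64_t *val)
{
	if (field_id == FAKE_FIELD_NUM_CONFIGS) {
		*val = fake_num_configs;
		return 0;
	}
	if (field_id >= FAKE_FIELD_LEAF_BASE &&
	    field_id < FAKE_FIELD_LEAF_BASE + SKETCH_MAX_CPUID_CONFIGS) {
		*val = field_id - FAKE_FIELD_LEAF_BASE + 100;
		return 0;
	}
	return -EIO;
}

static int read_td_conf(struct sketch_td_conf *conf)
{
	int ret;
	uint64_t val;

	assert(conf != NULL);

	ret = fake_read_field(FAKE_FIELD_NUM_CONFIGS, &val);
	if (ret)
		return ret;
	conf->num_cpuid_config = val;

	/* Refuse a count that would overflow the fixed-size array. */
	if (conf->num_cpuid_config > SKETCH_MAX_CPUID_CONFIGS)
		return -EINVAL;

	/* Same chaining idiom as the generated code: stop at the first error. */
	for (unsigned int i = 0; i < conf->num_cpuid_config; i++)
		if (!ret && !(ret = fake_read_field(FAKE_FIELD_LEAF_BASE + i, &val)))
			conf->cpuid_config_leaves[i] = val;

	return ret;
}
```

In this sketch, a module reporting more entries than the kernel's array can hold is rejected with -EINVAL rather than silently truncated, which is the compatibility concern the paragraph above describes.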
Link: https://lore.kernel.org/kvm/0853b155ec9aac09c594caa60914ed6ea4dc0a71.camel@intel.com/ [1]
Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [2]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20241030190039.77971-4-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 50 +++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 19 ++++++++
2 files changed, 69 insertions(+)
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
index 8027a24d1c6e..13ad2663488b 100644
--- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
@@ -37,12 +37,62 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
return ret;
}
+static int get_tdx_sys_info_td_ctrl(struct tdx_sys_info_td_ctrl *sysinfo_td_ctrl)
+{
+ int ret = 0;
+ u64 val;
+
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000000, &val)))
+ sysinfo_td_ctrl->tdr_base_size = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000100, &val)))
+ sysinfo_td_ctrl->tdcs_base_size = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000200, &val)))
+ sysinfo_td_ctrl->tdvps_base_size = val;
+
+ return ret;
+}
+
+static int get_tdx_sys_info_td_conf(struct tdx_sys_info_td_conf *sysinfo_td_conf)
+{
+ int ret = 0;
+ u64 val;
+ int i, j;
+
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000000, &val)))
+ sysinfo_td_conf->attributes_fixed0 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000001, &val)))
+ sysinfo_td_conf->attributes_fixed1 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000002, &val)))
+ sysinfo_td_conf->xfam_fixed0 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000003, &val)))
+ sysinfo_td_conf->xfam_fixed1 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000100000004, &val)))
+ sysinfo_td_conf->num_cpuid_config = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000100000008, &val)))
+ sysinfo_td_conf->max_vcpus_per_td = val;
+ if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_leaves))
+ return -EINVAL;
+ for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++)
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000300000400 + i, &val)))
+ sysinfo_td_conf->cpuid_config_leaves[i] = val;
+ if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_values))
+ return -EINVAL;
+ for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++)
+ for (j = 0; j < 2; j++)
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000300000500 + i * 2 + j, &val)))
+ sysinfo_td_conf->cpuid_config_values[i][j] = val;
+
+ return ret;
+}
+
static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
{
int ret = 0;
ret = ret ?: get_tdx_sys_info_features(&sysinfo->features);
ret = ret ?: get_tdx_sys_info_tdmr(&sysinfo->tdmr);
+ ret = ret ?: get_tdx_sys_info_td_ctrl(&sysinfo->td_ctrl);
+ ret = ret ?: get_tdx_sys_info_td_conf(&sysinfo->td_conf);
return ret;
}
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
index 6dd3c9695f59..060a2ad744bf 100644
--- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
@@ -17,9 +17,28 @@ struct tdx_sys_info_tdmr {
u16 pamt_1g_entry_size;
};
+struct tdx_sys_info_td_ctrl {
+ u16 tdr_base_size;
+ u16 tdcs_base_size;
+ u16 tdvps_base_size;
+};
+
+struct tdx_sys_info_td_conf {
+ u64 attributes_fixed0;
+ u64 attributes_fixed1;
+ u64 xfam_fixed0;
+ u64 xfam_fixed1;
+ u16 num_cpuid_config;
+ u16 max_vcpus_per_td;
+ u64 cpuid_config_leaves[128];
+ u64 cpuid_config_values[128][2];
+};
+
struct tdx_sys_info {
struct tdx_sys_info_features features;
struct tdx_sys_info_tdmr tdmr;
+ struct tdx_sys_info_td_ctrl td_ctrl;
+ struct tdx_sys_info_td_conf td_conf;
};
#endif
--
2.43.5
* [PATCH v3 14/14] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (12 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 13/14] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
@ 2025-01-15 16:09 ` Paolo Bonzini
2025-01-15 16:39 ` [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Dave Hansen
2025-01-15 19:14 ` Edgecombe, Rick P
15 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 16:09 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata
From: Isaku Yamahata <isaku.yamahata@intel.com>
Intel TDX protects guest VMs from a malicious host and certain physical
attacks. Pre-TDX Intel hardware has support for a memory encryption
architecture called MK-TME, which repurposes several high bits of
physical address as "KeyID". The BIOS reserves a sub-range of MK-TME
KeyIDs as "TDX private KeyIDs".
Each TDX guest must be assigned a unique TDX KeyID when it is
created. The kernel reserves the first TDX private KeyID for
crypto-protection of specific TDX module data which has a lifecycle that
exceeds the KeyID reserved for the TD's use. The rest of the KeyIDs are
left for TDX guests to use.
Create a small KeyID allocator. Export
tdx_guest_keyid_alloc()/tdx_guest_keyid_free() so that KVM can allocate
and free TDX guest KeyIDs.
Don't provide the stub functions when CONFIG_INTEL_TDX_HOST=n since they
are not supposed to be called in this case.
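The allocator's contract can be pictured with the following userspace sketch, which hands out the lowest free KeyID in a fixed range and returns a negative errno when the range is exhausted. It is not the kernel implementation: the patch uses the kernel IDA, whereas this sketch uses a plain array with no locking, and `SKETCH_KEYID_START` and `SKETCH_NR_KEYIDS` are made-up values standing in for tdx_guest_keyid_start and tdx_nr_guest_keyids.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_KEYID_START 33u /* hypothetical first guest KeyID */
#define SKETCH_NR_KEYIDS   4u  /* hypothetical number of guest KeyIDs */

/* One flag per KeyID in [START, START + NR - 1]; no locking in this sketch. */
static bool sketch_in_use[SKETCH_NR_KEYIDS];

/* Return the lowest free KeyID in the range, or -ENOSPC if none is free. */
static int sketch_keyid_alloc(void)
{
	for (unsigned int i = 0; i < SKETCH_NR_KEYIDS; i++) {
		if (!sketch_in_use[i]) {
			sketch_in_use[i] = true;
			return (int)(SKETCH_KEYID_START + i);
		}
	}
	return -ENOSPC;
}

/* Return a previously allocated KeyID to the pool. */
static void sketch_keyid_free(unsigned int keyid)
{
	assert(keyid >= SKETCH_KEYID_START &&
	       keyid < SKETCH_KEYID_START + SKETCH_NR_KEYIDS);

	sketch_in_use[keyid - SKETCH_KEYID_START] = false;
}
```

A caller would treat any negative return from the allocation function as failure to create another TDX guest, and free the KeyID only after the guest has been torn down.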
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20241030190039.77971-5-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 17 +++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 6531b69a53ac..58182b245e36 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -117,6 +117,9 @@ int tdx_cpu_enable(void);
int tdx_enable(void);
const char *tdx_dump_mce_info(struct mce *m);
+int tdx_guest_keyid_alloc(void);
+void tdx_guest_keyid_free(unsigned int keyid);
+
struct tdx_td {
/* TD root structure: */
struct page *tdr_page;
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index bb6f8ef9661e..bff350626b08 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -28,6 +28,7 @@
#include <linux/log2.h>
#include <linux/acpi.h>
#include <linux/suspend.h>
+#include <linux/idr.h>
#include <asm/page.h>
#include <asm/special_insns.h>
#include <asm/msr-index.h>
@@ -43,6 +44,8 @@ static u32 tdx_global_keyid __ro_after_init;
static u32 tdx_guest_keyid_start __ro_after_init;
static u32 tdx_nr_guest_keyids __ro_after_init;
+static DEFINE_IDA(tdx_guest_keyid_pool);
+
static DEFINE_PER_CPU(bool, tdx_lp_initialized);
static struct tdmr_info_list tdx_tdmr_list;
@@ -1458,6 +1461,20 @@ void __init tdx_init(void)
check_tdx_erratum();
}
+int tdx_guest_keyid_alloc(void)
+{
+ return ida_alloc_range(&tdx_guest_keyid_pool, tdx_guest_keyid_start,
+ tdx_guest_keyid_start + tdx_nr_guest_keyids - 1,
+ GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(tdx_guest_keyid_alloc);
+
+void tdx_guest_keyid_free(unsigned int keyid)
+{
+ ida_free(&tdx_guest_keyid_pool, keyid);
+}
+EXPORT_SYMBOL_GPL(tdx_guest_keyid_free);
+
static inline u64 tdx_tdr_pa(struct tdx_td *td)
{
return page_to_phys(td->tdr_page);
--
2.43.5
* Re: [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management
2025-01-15 16:09 ` [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
@ 2025-01-15 16:38 ` Dave Hansen
0 siblings, 0 replies; 22+ messages in thread
From: Dave Hansen @ 2025-01-15 16:38 UTC (permalink / raw)
To: Paolo Bonzini, linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
On 1/15/25 08:09, Paolo Bonzini wrote:
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -137,6 +137,19 @@ struct tdx_vp {
> struct page **tdcx_pages;
> };
>
> +
> +static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
> +{
> + u64 ret;
> +
> + ret = page_to_phys(page);
> + /* KeyID bits are just above the physical address bits: */
> + ret |= hkid << boot_cpu_data.x86_phys_bits;
> +
> + return ret;
> +
> +}
> +
> u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
Paolo, any chance you could fix up the whitespace goofiness before
applying? It's a super minor thing and I think later patches fix up at
least some of it, but it's a bit wonky.
* Re: [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (13 preceding siblings ...)
2025-01-15 16:09 ` [PATCH v3 14/14] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
@ 2025-01-15 16:39 ` Dave Hansen
2025-01-15 19:14 ` Edgecombe, Rick P
15 siblings, 0 replies; 22+ messages in thread
From: Dave Hansen @ 2025-01-15 16:39 UTC (permalink / raw)
To: Paolo Bonzini, linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
The series looks fine. For the bits that don't have it yet:
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
* Re: [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest
2025-01-15 16:09 ` [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest Paolo Bonzini
@ 2025-01-15 17:14 ` Adrian Hunter
0 siblings, 0 replies; 22+ messages in thread
From: Adrian Hunter @ 2025-01-15 17:14 UTC (permalink / raw)
To: Paolo Bonzini, linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
On 15/01/25 18:09, Paolo Bonzini wrote:
> From: Kai Huang <kai.huang@intel.com>
>
> Intel TDX protects guest VM's from malicious host and certain physical
> attacks. TDX introduces a new operation mode, Secure Arbitration Mode
> (SEAM) to isolate and protect guest VM's. A TDX guest VM runs in SEAM and,
> unlike VMX, direct control and interaction with the guest by the host VMM
> is not possible. Instead, Intel TDX Module, which also runs in SEAM,
> provides a SEAMCALL API.
>
> The SEAMCALL that provides the ability to enter a guest is TDH.VP.ENTER.
> The TDX Module processes TDH.VP.ENTER, and enters the guest via VMX
> VMLAUNCH/VMRESUME instructions. When a guest VM-exit requires host VMM
> interaction, the TDH.VP.ENTER SEAMCALL returns to the host VMM (KVM).
>
> Add tdh_vp_enter() to wrap the SEAMCALL invocation of TDH.VP.ENTER.
>
> TDH.VP.ENTER is different from other SEAMCALLS in several ways:
> - it may take some time to return as the guest executes
> - it uses more arguments
> - after it returns some host state may need to be restored
>
> TDH.VP.ENTER arguments are passed through General Purpose Registers (GPRs).
> For the special case of the TD guest invoking TDG.VP.VMCALL, nearly any GPR
> can be used, as well as XMM0 to XMM15. Notably, RBP is not used, and Linux
> mandates the TDX Module feature NO_RBP_MOD, which is enforced elsewhere.
> Additionally, XMM registers are not required for the existing Guest
> Hypervisor Communication Interface and are handled by existing KVM code
> should they be modified by the guest.
>
> There are 2 input formats and 5 output formats for TDH.VP.ENTER arguments.
> Input #1 : Initial entry or following a previous async. TD Exit
> Input #2 : Following a previous TDCALL(TDG.VP.VMCALL)
> Output #1 : On Error (No TD Entry)
> Output #2 : Async. Exits with a VMX Architectural Exit Reason
> Output #3 : Async. Exits with a non-VMX TD Exit Status
> Output #4 : Async. Exits with Cross-TD Exit Details
> Output #5 : On TDCALL(TDG.VP.VMCALL)
>
> Currently, to keep things simple, the wrapper function does not attempt
> to support different formats, and just passes all the GPRs that could be
> used. The GPR values are held by KVM in the area set aside for guest
> GPRs. KVM code uses the guest GPR area (vcpu->arch.regs[]) to set up for
> or process results of tdh_vp_enter().
>
> Therefore changing tdh_vp_enter() to use more complex argument formats
> would also alter the way KVM code interacts with tdh_vp_enter().
>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Message-ID: <20241121201448.36170-2-adrian.hunter@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/include/asm/tdx.h | 1 +
> arch/x86/virt/vmx/tdx/tdx.c | 8 ++++++++
> arch/x86/virt/vmx/tdx/tdx.h | 1 +
> 3 files changed, 10 insertions(+)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 0c89afffdac4..6531b69a53ac 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -154,6 +154,7 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
> return level - 1;
> }
>
> +u64 tdh_vp_enter(struct tdx_vp *vp, struct tdx_module_args *args);
> u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
> u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2);
> u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2);
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index 55851a0591d2..bb6f8ef9661e 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1479,6 +1479,14 @@ static void tdx_clflush_page(struct page *page)
> clflush_cache_range(page_to_virt(page), PAGE_SIZE);
> }
>
> +u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
> +{
> + args->rcx = tdx_tdvpr_pa(td);
> +
> + return __seamcall_saved_ret(TDH_VP_ENTER, args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_vp_enter);
> +
> u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
> {
> struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 64932450aba3..b71b375b10c0 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -15,6 +15,7 @@
> /*
> * TDX module SEAMCALL leaf functions
> */
> +#define TDH_VP_ENTER 0
> #define TDH_MNG_ADDCX 1
> #define TDH_MEM_PAGE_ADD 2
> #define TDH_MEM_SEPT_ADD 3
FWIW I was planning to squash the noinstr change into this
patch, and amend the commit message accordingly, like so:
From: Adrian Hunter <adrian.hunter@intel.com>
Date: Wed, 15 Jan 2025 19:04:31 +0200
Subject: [PATCH] amend! x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX
guest
x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest
Intel TDX protects guest VMs from a malicious host and certain physical
attacks. TDX introduces a new operation mode, Secure Arbitration Mode
(SEAM), to isolate and protect guest VMs. A TDX guest VM runs in SEAM and,
unlike VMX, direct control and interaction with the guest by the host VMM
is not possible. Instead, the Intel TDX Module, which also runs in SEAM,
provides a SEAMCALL API.
The SEAMCALL that provides the ability to enter a guest is TDH.VP.ENTER.
The TDX Module processes TDH.VP.ENTER, and enters the guest via VMX
VMLAUNCH/VMRESUME instructions. When a guest VM-exit requires host VMM
interaction, the TDH.VP.ENTER SEAMCALL returns to the host VMM (KVM).
Add tdh_vp_enter() to wrap the SEAMCALL invocation of TDH.VP.ENTER.
Make tdh_vp_enter() noinstr because KVM requires VM entry to be noinstr
for 2 reasons:
1. The use of context tracking via guest_state_enter_irqoff() and
guest_state_exit_irqoff()
2. The need to avoid IRET between VM-exit and NMI handling in order to
avoid prematurely releasing NMI inhibit.
Consequently make __seamcall_saved_ret() noinstr also. Note,
tdh_vp_enter() is the only caller of __seamcall_saved_ret().
Essentially, __seamcall_saved_ret() exists to serve the register passing
requirements of TDH.VP.ENTER SEAMCALL, and is unlikely to be used for
anything else.
TDH.VP.ENTER is different from other SEAMCALLs in several ways:
- it may take some time to return as the guest executes
- it uses more arguments
- after it returns some host state may need to be restored
TDH.VP.ENTER arguments are passed through General Purpose Registers (GPRs).
For the special case of the TD guest invoking TDG.VP.VMCALL, nearly any GPR
can be used, as well as XMM0 to XMM15. Notably, RBP is not used, and Linux
mandates the TDX Module feature NO_RBP_MOD, which is enforced elsewhere.
Additionally, XMM registers are not required for the existing Guest
Hypervisor Communication Interface and are handled by existing KVM code
should they be modified by the guest.
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
arch/x86/virt/vmx/tdx/seamcall.S | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 2 +-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/virt/vmx/tdx/seamcall.S b/arch/x86/virt/vmx/tdx/seamcall.S
index 5b1f2286aea9..6854c52c374b 100644
--- a/arch/x86/virt/vmx/tdx/seamcall.S
+++ b/arch/x86/virt/vmx/tdx/seamcall.S
@@ -41,6 +41,9 @@ SYM_FUNC_START(__seamcall_ret)
TDX_MODULE_CALL host=1 ret=1
SYM_FUNC_END(__seamcall_ret)
+/* KVM requires non-instrumentable __seamcall_saved_ret() for TDH.VP.ENTER */
+.section .noinstr.text, "ax"
+
/*
* __seamcall_saved_ret() - Host-side interface functions to SEAM software
* (the P-SEAMLDR or the TDX module), with saving output registers to the
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index bb6f8ef9661e..c9c198e0b48c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1479,7 +1479,7 @@ static void tdx_clflush_page(struct page *page)
clflush_cache_range(page_to_virt(page), PAGE_SIZE);
}
-u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
+noinstr u64 tdh_vp_enter(struct tdx_vp *td, struct tdx_module_args *args)
{
args->rcx = tdx_tdvpr_pa(td);
--
2.43.0
* Re: [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (14 preceding siblings ...)
2025-01-15 16:39 ` [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Dave Hansen
@ 2025-01-15 19:14 ` Edgecombe, Rick P
2025-01-15 19:36 ` Paolo Bonzini
15 siblings, 1 reply; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-01-15 19:14 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On Wed, 2025-01-15 at 11:08 -0500, Paolo Bonzini wrote:
> Hi,
>
> This is the final-ish version of the "SEAMCALL Wrappers" RFC[0], with
> all the wrappers extracted out of the corresponding TDX patches.
> This version of the series uses u64 only for guest physical addresses
> and error return values:
>
> * u64 pfn is replaced by struct page
>
> * u64 level is replaced by int level
>
> * u64 tdr and u64 tdvpr are replaced by structs that contain struct page
> for them as well as for tdcs and tdcx.
>
> A couple functions are also moved over from KVM to tdx.h
>
> static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
> static inline int pg_level_to_tdx_sept_level(enum pg_level level)
>
> The plan is to include these in kvm.git together with their first user.
It looks like you missed these build issues and bugs from v2:
https://lore.kernel.org/kvm/6345272506c5bc707f11b6f54c4bd5015cedcd95.camel@intel.com/
https://lore.kernel.org/kvm/3f8fa8fc98b532add1ff14034c0c868cdbeca7f8.camel@intel.com/
* Re: [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-15 19:14 ` Edgecombe, Rick P
@ 2025-01-15 19:36 ` Paolo Bonzini
2025-01-15 20:06 ` Edgecombe, Rick P
2025-01-16 22:11 ` Edgecombe, Rick P
0 siblings, 2 replies; 22+ messages in thread
From: Paolo Bonzini @ 2025-01-15 19:36 UTC (permalink / raw)
To: Edgecombe, Rick P, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On 1/15/25 20:14, Edgecombe, Rick P wrote:
> It looks like you missed these build issues and bugs from v2:
> https://lore.kernel.org/kvm/6345272506c5bc707f11b6f54c4bd5015cedcd95.camel@intel.com/
> https://lore.kernel.org/kvm/3f8fa8fc98b532add1ff14034c0c868cdbeca7f8.camel@intel.com/
I did, I'll update tomorrow and repost.
WRT hkid, I interpreted "I'd personally probably just keep 'hkid' as an
int everywhere until the point where it gets shoved into the TDX module
ABI" as "it can be u16 in the SEAMCALLs and in mk_keyed_paddr" (as the
latter builds an argument to the SEAMCALLs).
I understood his objection to be more about
tdx_guest_keyid_alloc/tdx_guest_keyid_free and struct kvm_tdx:
> Oh, and casts like this:
>
>> static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
>> @@ -2354,7 +2354,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
>> ret = tdx_guest_keyid_alloc();
>> if (ret < 0)
>> return ret;
>> - kvm_tdx->hkid = ret;
>> + kvm_tdx->hkid = (u16)ret;
>> + kvm_tdx->hkid_assigned = true;
>
> are a bit silly, don't you think?
so I didn't change tdx_guest_keyid_alloc().
Paolo
* Re: [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-15 19:36 ` Paolo Bonzini
@ 2025-01-15 20:06 ` Edgecombe, Rick P
2025-01-16 22:11 ` Edgecombe, Rick P
1 sibling, 0 replies; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-01-15 20:06 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On Wed, 2025-01-15 at 20:36 +0100, Paolo Bonzini wrote:
> WRT hkid, I interpreted "I'd personally probably just keep 'hkid' as an
> int everywhere until the point where it gets shoved into the TDX module
> ABI" as "it can be u16 in the SEAMCALLs and in mk_keyed_paddr" (as the
> latter builds an argument to the SEAMCALLs).
>
> I understood his objection to be more about
> tdx_guest_keyid_alloc/tdx_guest_keyid_free and struct kvm_tdx:
>
> > Oh, and casts like this:
> >
> > > static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
> > > @@ -2354,7 +2354,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
> > > ret = tdx_guest_keyid_alloc();
> > > if (ret < 0)
> > > return ret;
> > > - kvm_tdx->hkid = ret;
> > > + kvm_tdx->hkid = (u16)ret;
> > > + kvm_tdx->hkid_assigned = true;
> >
> > are a bit silly, don't you think?
>
> so I didn't change tdx_guest_keyid_alloc().
There was a related comment on the GPA union Yan was suggesting:
https://lore.kernel.org/kvm/753cd9f1-5eb7-480f-ae4f-d263aaecdd6c@intel.com/
Basically that the bit fields have subtle behavior when you shift them
(ironically the exact bug that happened with u16 keyid).
But I think your reasoning seems valid, especially since Dave has since quoted
that function without commenting on that aspect. So let's leave it.
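
To illustrate the shifting subtlety discussed above, here is a minimal
userspace sketch. It assumes a hypothetical KeyID position (bit 46) and a
hypothetical helper name, example_keyed_paddr(); it takes a raw physical
address and is not the kernel's mk_keyed_paddr(). It only shows why a narrow
KeyID type should be widened before it is shifted into the address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical position of the KeyID field; the real value depends on the CPU. */
#define EXAMPLE_KEYID_SHIFT 46

/*
 * Combine a physical address with a host KeyID.  The cast to uint64_t
 * happens before the shift: without it, hkid would be promoted to int,
 * and shifting a 32-bit int by 46 is undefined behavior.
 */
static inline uint64_t example_keyed_paddr(uint16_t hkid, uint64_t paddr)
{
	/* The address must not already use the KeyID bits. */
	assert((paddr >> EXAMPLE_KEYID_SHIFT) == 0);

	return paddr | ((uint64_t)hkid << EXAMPLE_KEYID_SHIFT);
}

/* Integer promotion: a shifted uint16_t has the size of int, not of uint64_t. */
_Static_assert(sizeof((uint16_t)1 << 1) == sizeof(int),
	       "uint16_t operands are promoted to int before shifting");
```

The same promotion applies to bit fields narrower than int, which is why a
union with bit fields can behave unexpectedly when its members are shifted.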
* Re: [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-15 19:36 ` Paolo Bonzini
2025-01-15 20:06 ` Edgecombe, Rick P
@ 2025-01-16 22:11 ` Edgecombe, Rick P
1 sibling, 0 replies; 22+ messages in thread
From: Edgecombe, Rick P @ 2025-01-16 22:11 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On Wed, 2025-01-15 at 20:36 +0100, Paolo Bonzini wrote:
> On 1/15/25 20:14, Edgecombe, Rick P wrote:
> > It looks like you missed these build issues and bugs from v2:
> > https://lore.kernel.org/kvm/6345272506c5bc707f11b6f54c4bd5015cedcd95.camel@intel.com/
> > https://lore.kernel.org/kvm/3f8fa8fc98b532add1ff14034c0c868cdbeca7f8.camel@intel.com/
>
> I did, I'll update tomorrow and repost.
Hey, one more thing, we've been seeing some compiler-sensitive warnings about
stack frame size in init_tdx_module() in the latest kvm-coco-queue. The struct
tdx_sys_info is the main stack-allocated variable in that function.
In this commit (x86/virt/tdx: Read essential global metadata for KVM), struct
tdx_sys_info gets expanded a huge amount:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/commit/?h=kvm-coco-queue&id=6691a42a26844247526ed08aa21ee748a949c408
And in this later commit (KVM: VMX: Initialize TDX during KVM module load), the
stack-allocated variable is moved to a static allocation:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/commit/?h=kvm-coco-queue&id=34f786697b382750739f8c4e13ebb3da348c307c
So the move of the arch/x86 patches earlier opens up a window where, depending
on compiler optimizations, the stack size may be 3304 bytes and trigger
CONFIG_FRAME_WARN-related errors.
The solution could be to move the "arch/x86/virt/vmx/tdx/tdx.c" related changes
in the second patch to a separate patch and put it before the first one, or swap
the order of the two patches. These changes are before the VM/vCPU creation
patches, so I think that would be your area. But let me know if you want to get
a bigger posting back from us to include it all.
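
As a rough illustration of the stack-versus-static difference described
above, here is a minimal userspace sketch. The structure, its size and all
function names are hypothetical stand-ins; this is not the layout of struct
tdx_sys_info or the code of init_tdx_module().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a large metadata structure (3200 bytes). */
struct example_sys_info {
	uint64_t fields[400];
};

/* Placeholder for the code that fills in the metadata; always succeeds here. */
static int example_read_metadata(struct example_sys_info *info)
{
	assert(info != NULL);
	info->fields[0] = 1;
	return 0;
}

/* Variant A: the structure is a local, so it counts against this frame. */
int example_init_on_stack(void)
{
	struct example_sys_info info = { 0 };

	return example_read_metadata(&info);
}

/* Variant B: static storage keeps the frame small whatever the structure size. */
static struct example_sys_info example_info;

int example_init_static(void)
{
	return example_read_metadata(&example_info);
}
```

Built with GCC's -Wframe-larger-than= option set below the structure size,
variant A would be expected to warn while variant B would not, although the
exact frame size depends on the compiler and optimization level.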
Thread overview: 22+ messages
2025-01-15 16:08 [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-15 16:08 ` [PATCH v3 01/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 02/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 03/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 04/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
2025-01-15 16:38 ` Dave Hansen
2025-01-15 16:09 ` [PATCH v3 05/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 06/14] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 07/14] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 08/14] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 09/14] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 10/14] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 11/14] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 12/14] x86/virt/tdx: Add SEAMCALL wrapper to enter/exit TDX guest Paolo Bonzini
2025-01-15 17:14 ` Adrian Hunter
2025-01-15 16:09 ` [PATCH v3 13/14] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
2025-01-15 16:09 ` [PATCH v3 14/14] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
2025-01-15 16:39 ` [PATCH v3 00/14] x86/virt/tdx: Add SEAMCALL wrappers for KVM Dave Hansen
2025-01-15 19:14 ` Edgecombe, Rick P
2025-01-15 19:36 ` Paolo Bonzini
2025-01-15 20:06 ` Edgecombe, Rick P
2025-01-16 22:11 ` Edgecombe, Rick P