* [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM
@ 2025-01-01 7:49 Paolo Bonzini
2025-01-01 7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
` (13 more replies)
0 siblings, 14 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
This is a completed version of Rick's RFC series at
https://lore.kernel.org/r/20241203010317.827803-1-rick.p.edgecombe@intel.com/.
Due to EPANETTONE I didn't use the latest RFC, which is fixed here.
As in the patches that I sent ten minutes ago, I took all the "Add
SEAMCALL wrappers" patches from the various TDX parts and placed them
in a single series, so that they can be reviewed and provided in a topic
branch by Dave.
I will rebase kvm-coco-queue on top of these, but I almost definitely
will not manage to finish and push the result before getting the first
NMIs from the rest of the family. In the meanwhile, this gives people
time to review while I'm not available.
Paolo
Isaku Yamahata (6):
x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT
pages
x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial
contents
x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX
guest KeyID
Kai Huang (1):
x86/virt/tdx: Read essential global metadata for KVM
Rick Edgecombe (6):
x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation
x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation
x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management
x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access
x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations
Yuan Yao (1):
[WORKAROUND] x86/virt/tdx: Retry seamcall when TDX_OPERAND_BUSY with
operand SEPT
arch/x86/include/asm/tdx.h | 50 +++
arch/x86/virt/vmx/tdx/tdx.c | 432 ++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 46 ++-
arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 50 +++
arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 19 +
5 files changed, 590 insertions(+), 7 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 48+ messages in thread
* [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-02 19:44 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
` (12 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. Pre-TDX Intel hardware has support for a memory encryption
architecture called MK-TME, which repurposes several high bits of
physical address as "KeyID". TDX ends up with reserving a sub-range of
MK-TME KeyIDs as "TDX private KeyIDs".
Like MK-TME, these KeyIDs can be associated with an ephemeral key. For TDX
this association is done by the TDX module. It also has its own tracking
for which KeyIDs are in use. To do this ephemeral key setup and manipulate
the TDX module's internal tracking, KVM will use the following SEAMCALLs:
TDH.MNG.KEY.CONFIG: Mark the KeyID as in use, and initialize its
ephemeral key.
TDH.MNG.KEY.FREEID: Mark the KeyID as not in use.
These SEAMCALLs both operate on TDR structures, which are setup using the
previously added TDH.MNG.CREATE SEAMCALL. KVM's use of these operations
will go like:
- tdx_guest_keyid_alloc()
- Initialize TD and TDR page with TDH.MNG.CREATE (not yet-added), passing
KeyID
- TDH.MNG.KEY.CONFIG to initialize the key
- TD runs, teardown is started
- TDH.MNG.KEY.FREEID
- tdx_guest_keyid_free()
Don't try to combine the tdx_guest_keyid_alloc() and TDH.MNG.KEY.CONFIG
operations because TDH.MNG.CREATE and some locking need to be done in the
middle. Don't combine TDH.MNG.KEY.FREEID and tdx_guest_keyid_free() so they
are symmetrical with the creation path.
So implement tdh_mng_key_config() and tdh_mng_key_freeid() as separate
functions than tdx_guest_keyid_alloc() and tdx_guest_keyid_free().
The TDX module provides SEAMCALLs to hand pages to the TDX module for
storing TDX controlled state. SEAMCALLs that operate on this state are
directed to the appropriate TD VM using references to the pages originally
provided for managing the TD's state. So the host kernel needs to track
these pages, both as an ID for specifying which TD to operate on, and to
allow them to be eventually reclaimed. The TD VM associated pages are
called TDR (Trust Domain Root) and TDCS (Trust Domain Control Structure).
Introduce "struct tdx_td" for holding references to pages provided to the
TDX module for this TD VM associated state. Don't plan for any TD
associated state that is controlled by KVM to live in this struct. Only
expect it to hold data for concepts specific to the TDX architecture, for
which there can't already be preexisting storage for in KVM.
Add both the TDR page and an array of TDCS pages, even though the SEAMCALL
wrappers will only need to know about the TDR pages for directing the
SEAMCALLs to the right TD. Adding the TDCS pages to this struct will let
all of the TD VM associated pages handed to the TDX module be tracked in
one location. For a type to specify physical pages, use KVM's hpa_t type.
Do this for KVM's benefit This is the common type used to hold physical
addresses in KVM, so will make interoperability easier.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-2-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 12 ++++++++++++
arch/x86/virt/vmx/tdx/tdx.c | 25 +++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 16 +++++++++-------
3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index eba178996d84..5045ab1c3d5b 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -116,6 +116,18 @@ static inline u64 sc_retry(sc_func_t func, u64 fn,
int tdx_cpu_enable(void);
int tdx_enable(void);
const char *tdx_dump_mce_info(struct mce *m);
+
+struct tdx_td {
+ /* TD root structure: */
+ struct page *tdr_page;
+
+ int tdcs_nr_pages;
+ /* TD control structure: */
+ struct page **tdcs_pages;
+};
+
+u64 tdh_mng_key_config(struct tdx_td *td);
+u64 tdh_mng_key_freeid(struct tdx_td *td);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 7fdb37387886..55842d5b9474 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1456,3 +1456,28 @@ void __init tdx_init(void)
check_tdx_erratum();
}
+
+static inline u64 tdx_tdr_pa(struct tdx_td *td)
+{
+ return page_to_pfn(td->tdr_page) << PAGE_SHIFT;
+}
+
+u64 tdh_mng_key_config(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_KEY_CONFIG, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_config);
+
+u64 tdh_mng_key_freeid(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_KEY_FREEID, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_freeid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 4e3d533cdd61..5579317f67ab 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -15,13 +15,15 @@
/*
* TDX module SEAMCALL leaf functions
*/
-#define TDH_PHYMEM_PAGE_RDMD 24
-#define TDH_SYS_KEY_CONFIG 31
-#define TDH_SYS_INIT 33
-#define TDH_SYS_RD 34
-#define TDH_SYS_LP_INIT 35
-#define TDH_SYS_TDMR_INIT 36
-#define TDH_SYS_CONFIG 45
+#define TDH_MNG_KEY_CONFIG 8
+#define TDH_MNG_KEY_FREEID 20
+#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_SYS_KEY_CONFIG 31
+#define TDH_SYS_INIT 33
+#define TDH_SYS_RD 34
+#define TDH_SYS_LP_INIT 35
+#define TDH_SYS_TDMR_INIT 36
+#define TDH_SYS_CONFIG 45
/* TDX page types */
#define PT_NDA 0x0
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-01 7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-03 17:51 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 03/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
` (11 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious hosts and certain physical
attacks. It defines various control structures that hold state for things
like TDs or vCPUs. These control structures are stored in pages given to
the TDX module and encrypted with either the global KeyID or the guest
KeyIDs.
To manipulate these control structures the TDX module defines a few
SEAMCALLs. KVM will use these during the process of creating a TD as
follows:
1) Allocate a unique TDX KeyID for a new guest.
1) Call TDH.MNG.CREATE to create a "TD Root" (TDR) page, together with
the new allocated KeyID. Unlike the rest of the TDX guest, the TDR
page is crypto-protected by the 'global KeyID'.
2) Call the previously added TDH.MNG.KEY.CONFIG on each package to
configure the KeyID for the guest. After this step, the KeyID to
protect the guest is ready and the rest of the guest will be protected
by this KeyID.
3) Call TDH.MNG.ADDCX to add TD Control Structure (TDCS) pages.
4) Call TDH.MNG.INIT to initialize the TDCS.
To reclaim these pages for use by the kernel other SEAMCALLs are needed,
which will be added in future patches.
Add tdh_mng_addcx(), tdh_mng_create() and tdh_mng_init() to export these
SEAMCALLs so that KVM can use them to create TDs.
For SEAMCALLs that give a page to the TDX module to be encrypted, CLFLUSH
the page mapped with KeyID 0, such that any dirty cache lines don't write
back later and clobber TD memory or control structures. Don't worry about
the other MK-TME KeyIDs because the kernel doesn't use them. The TDX docs
specify that this flush is not needed unless the TDX module exposes the
CLFLUSH_BEFORE_ALLOC feature bit. Be conservative and always flush. Add a
helper function to facilitate this.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-3-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 51 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 57 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 5045ab1c3d5b..425d282834e0 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -126,8 +126,11 @@ struct tdx_td {
struct page **tdcs_pages;
};
+u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_mng_key_config(struct tdx_td *td);
+u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_mng_key_freeid(struct tdx_td *td);
+u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 55842d5b9474..c180551ecce3 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1462,6 +1462,29 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td)
return page_to_pfn(td->tdr_page) << PAGE_SHIFT;
}
+/*
+ * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
+ * a CLFLUSH of pages is required before handing them to the TDX module.
+ * Be conservative and make the code simpler by doing the CLFLUSH
+ * unconditionally.
+ */
+static void tdx_clflush_page(struct page *tdr)
+{
+ clflush_cache_range(page_to_virt(tdr), PAGE_SIZE);
+}
+
+u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_pfn(tdcs_page) << PAGE_SHIFT,
+ .rdx = tdx_tdr_pa(td),
+ };
+
+ tdx_clflush_page(tdcs_page);
+ return seamcall(TDH_MNG_ADDCX, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1472,6 +1495,18 @@ u64 tdh_mng_key_config(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mng_key_config);
+u64 tdh_mng_create(struct tdx_td *td, u64 hkid)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = hkid,
+ };
+
+ tdx_clflush_page(td->tdr_page);
+ return seamcall(TDH_MNG_CREATE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_create);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1481,3 +1516,19 @@ u64 tdh_mng_key_freeid(struct tdx_td *td)
return seamcall(TDH_MNG_KEY_FREEID, &args);
}
EXPORT_SYMBOL_GPL(tdh_mng_key_freeid);
+
+u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = td_params,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MNG_INIT, &args);
+
+ *extended_err = args.rcx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mng_init);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 5579317f67ab..0861c3f09576 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -15,8 +15,11 @@
/*
* TDX module SEAMCALL leaf functions
*/
+#define TDH_MNG_ADDCX 1
#define TDH_MNG_KEY_CONFIG 8
+#define TDH_MNG_CREATE 9
#define TDH_MNG_KEY_FREEID 20
+#define TDH_MNG_INIT 21
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 03/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-01 7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
2025-01-01 7:49 ` [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-01 7:49 ` [PATCH 04/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
` (10 subsequent siblings)
13 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. It defines various control structures that hold state for
virtualized components of the TD (i.e. VMs or vCPUs) These control
structures are stored in pages given to the TDX module and encrypted
with either the global KeyID or the guest KeyIDs.
To manipulate these control structures the TDX module defines a few
SEAMCALLs. KVM will use these during the process of creating a vCPU as
follows:
1) Call TDH.VP.CREATE to create a TD vCPU Root (TDVPR) page for each
vCPU.
2) Call TDH.VP.ADDCX to add per-vCPU control pages (TDCX) for each vCPU.
3) Call TDH.VP.INIT to initialize the TDCX for each vCPU.
To reclaim these pages for use by the kernel other SEAMCALLs are needed,
which will be added in future patches.
Export functions to allow KVM to make these SEAMCALLs. Export two
variants for TDH.VP.CREATE, in order to support the planned logic of KVM
to support TDX modules with and without the ENUM_TOPOLOGY feature. If
KVM can drop support for the !ENUM_TOPOLOGY case, this could go down a
single version. Leave that for later discussion.
The TDX module provides SEAMCALLs to hand pages to the TDX module for
storing TDX controlled state. SEAMCALLs that operate on this state are
directed to the appropriate TD vCPU using references to the pages
originally provided for managing the vCPU's state. So the host kernel
needs to track these pages, both as an ID for specifying which vCPU to
operate on, and to allow them to be eventually reclaimed. The vCPU
associated pages are called TDVPR (Trust Domain Virtual Processor Root)
and TDCX (Trust Domain Control Extension).
Introduce "struct tdx_vp" for holding references to pages provided to the
TDX module for the TD vCPU associated state. Don't plan for any vCPU
associated state that is controlled by KVM to live in this struct. Only
expect it to hold data for concepts specific to the TDX architecture, for
which there can't already be preexisting storage for in KVM.
Add both the TDVPR page and an array of TDCX pages, even though the
SEAMCALL wrappers will only need to know about the TDVPR pages for
directing the SEAMCALLs to the right vCPU. Adding the TDCX pages to this
struct will let all of the vCPU associated pages handed to the TDX module be
tracked in one location. For a type to specify physical pages, use KVM's
hpa_t type. Do this for KVM's benefit This is the common type used to hold
physical addresses in KVM, so will make interoperability easier.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-4-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 15 +++++++++++
arch/x86/virt/vmx/tdx/tdx.c | 54 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 11 ++++++++
3 files changed, 80 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 425d282834e0..aa2c8f297557 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -124,13 +124,28 @@ struct tdx_td {
int tdcs_nr_pages;
/* TD control structure: */
struct page **tdcs_pages;
+
+ /* Size of `tdcx_pages` in struct tdx_vp */
+ int tdcx_nr_pages;
+};
+
+struct tdx_vp {
+ /* TDVP root page */
+ struct page *tdvpr_page;
+
+ /* TD vCPU control structure: */
+ struct page **tdcx_pages;
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
+u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
+u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
+u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index c180551ecce3..fceef24959ff 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -5,6 +5,7 @@
* Intel Trusted Domain Extensions (TDX) support
*/
+#include "asm/page_types.h"
#define pr_fmt(fmt) "virt/tdx: " fmt
#include <linux/types.h>
@@ -1462,6 +1463,11 @@ static inline u64 tdx_tdr_pa(struct tdx_td *td)
return page_to_pfn(td->tdr_page) << PAGE_SHIFT;
}
+static inline u64 tdx_tdvpr_pa(struct tdx_vp *td)
+{
+ return page_to_pfn(td->tdvpr_page) << PAGE_SHIFT;
+}
+
/*
* The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
* a CLFLUSH of pages is required before handing them to the TDX module.
@@ -1485,6 +1491,18 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_pfn(tdcx_page) << PAGE_SHIFT,
+ .rdx = tdx_tdvpr_pa(vp),
+ };
+
+ tdx_clflush_page(tdcx_page);
+ return seamcall(TDH_VP_ADDCX, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_addcx);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1507,6 +1525,18 @@ u64 tdh_mng_create(struct tdx_td *td, u64 hkid)
}
EXPORT_SYMBOL_GPL(tdh_mng_create);
+u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = tdx_tdr_pa(td),
+ };
+
+ tdx_clflush_page(vp->tdvpr_page);
+ return seamcall(TDH_VP_CREATE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_create);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1532,3 +1562,27 @@ u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err)
return ret;
}
EXPORT_SYMBOL_GPL(tdh_mng_init);
+
+u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = initial_rcx,
+ };
+
+ return seamcall(TDH_VP_INIT, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_init);
+
+u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = initial_rcx,
+ .r8 = x2apicid,
+ };
+
+ /* apicid requires version == 1. */
+ return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_init_apicid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 0861c3f09576..f0464f7d9780 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,10 +16,13 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
+#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
+#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
@@ -28,6 +31,14 @@
#define TDH_SYS_TDMR_INIT 36
#define TDH_SYS_CONFIG 45
+/*
+ * SEAMCALL leaf:
+ *
+ * Bit 15:0 Leaf number
+ * Bit 23:16 Version number
+ */
+#define TDX_VERSION_SHIFT 16
+
/* TDX page types */
#define PT_NDA 0x0
#define PT_RSVD 0x1
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 04/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (2 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 03/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-01 7:49 ` [PATCH 05/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
` (9 subsequent siblings)
13 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. The TDX module uses pages provided by the host for both control
structures and for TD guest pages. These pages are encrypted using the
MK-TME encryption engine, with its special requirements around cache
invalidation. For its own security, the TDX module ensures pages are
flushed properly and track which usage they are currently assigned. For
creating and tearing down TD VMs and vCPUs KVM will need to use the
TDH.PHYMEM.PAGE.RECLAIM, TDH.PHYMEM.CACHE.WB, and TDH.PHYMEM.PAGE.WBINVD
SEAMCALLs.
Add tdh_phymem_page_reclaim() to enable KVM to call
TDH.PHYMEM.PAGE.RECLAIM to reclaim the page for use by the host kernel.
This effectively resets its state in the TDX module's page tracking
(PAMT), if the page is available to be reclaimed. This will be used by KVM
to reclaim the various types of pages owned by the TDX module. It will
have a small wrapper in KVM that retries in the case of a relevant error
code. Don't implement this wrapper in arch/x86 because KVM's solution
around retrying SEAMCALLs will be better located in a single place.
Add tdh_phymem_cache_wb() to enable KVM to call TDH.PHYMEM.CACHE.WB to do
a cache write back in a way that the TDX module can verify, before it
allows a KeyID to be freed. The KVM code will use this to have a small
wrapper that handles retries. Since the TDH.PHYMEM.CACHE.WB operation is
interruptible, have tdh_phymem_cache_wb() take a resume argument to pass
this info to the TDX module for restarts. It is worth noting that this
SEAMCALL uses a SEAM specific MSR to do the write back in sections. In
this way it does export some new functionality that affects CPU state.
Add tdh_phymem_page_wbinvd_tdr() to enable KVM to call
TDH.PHYMEM.PAGE.WBINVD to do a cache write back and invalidate of a TDR,
using the global KeyID. The underlying TDH.PHYMEM.PAGE.WBINVD SEAMCALL
requires the related KeyID to be encoded into the SEAMCALL args. Since the
global KeyID is not exposed to KVM, a dedicated wrapper is needed for TDR
focused TDH.PHYMEM.PAGE.WBINVD operations.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-5-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 42 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 48 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index aa2c8f297557..dbdfa7d59673 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -146,6 +146,9 @@ u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_phymem_cache_wb(bool resume);
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index fceef24959ff..a5036184d7d1 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1586,3 +1586,45 @@ u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
return seamcall(TDH_VP_INIT | (1ULL << TDX_VERSION_SHIFT), &args);
}
EXPORT_SYMBOL_GPL(tdh_vp_init_apicid);
+
+/*
+ * TDX ABI defines output operands as PT, OWNER and SIZE. These are TDX defined fomats.
+ * So despite the names, they must be interpted specially as described by the spec. Return
+ * them only for error reporting purposes.
+ */
+u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size)
+{
+ struct tdx_module_args args = {
+ .rcx = page_to_pfn(page) << PAGE_SHIFT,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_PHYMEM_PAGE_RECLAIM, &args);
+
+ *tdx_pt = args.rcx;
+ *tdx_owner = args.rdx;
+ *tdx_size = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+
+u64 tdh_phymem_cache_wb(bool resume)
+{
+ struct tdx_module_args args = {
+ .rcx = resume ? 1 : 0,
+ };
+
+ return seamcall(TDH_PHYMEM_CACHE_WB, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_cache_wb);
+
+u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
+{
+ struct tdx_module_args args = {};
+
+ args.rcx = tdx_tdr_pa(td) | ((u64)tdx_global_keyid << boot_cpu_data.x86_phys_bits);
+
+ return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index f0464f7d9780..7a15c9afcdfa 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -24,11 +24,14 @@
#define TDH_MNG_INIT 21
#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_PHYMEM_PAGE_RECLAIM 28
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
#define TDH_SYS_LP_INIT 35
#define TDH_SYS_TDMR_INIT 36
+#define TDH_PHYMEM_CACHE_WB 40
+#define TDH_PHYMEM_PAGE_WBINVD 41
#define TDH_SYS_CONFIG 45
/*
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 05/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (3 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 04/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-01 7:49 ` [PATCH 06/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
` (8 subsequent siblings)
13 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. The TDX module has TD scoped and vCPU scoped "metadata fields".
These fields are a bit like VMCS fields, and stored in data structures
maintained by the TDX module. Export 3 SEAMCALLs for use in reading and
writing these fields:
Make tdh_mng_rd() use MNG.VP.RD to read the TD scoped metadata.
Make tdh_vp_rd()/tdh_vp_wr() use TDH.VP.RD/WR to read/write the vCPU
scoped metadata.
KVM will use these by creating inline helpers that target various metadata
sizes. Export the raw SEAMCALL leaf, to avoid exporting the large number
of various sized helpers.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Message-ID: <20241203010317.827803-6-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 47 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 3 +++
3 files changed, 53 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index dbdfa7d59673..6342309fe011 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -142,9 +142,12 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
+u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
+u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
+u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
u64 tdh_phymem_cache_wb(bool resume);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a5036184d7d1..7cfd07772b20 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1537,6 +1537,23 @@ u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp)
}
EXPORT_SYMBOL_GPL(tdh_vp_create);
+u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ .rdx = field,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MNG_RD, &args);
+
+ /* R8: Content of the field, or 0 in case of error. */
+ *data = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mng_rd);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1574,6 +1591,36 @@ u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx)
}
EXPORT_SYMBOL_GPL(tdh_vp_init);
+u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = field,
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_VP_RD, &args);
+
+ /* R8: Content of the field, or 0 in case of error. */
+ *data = args.r8;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_vp_rd);
+
+u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ .rdx = field,
+ .r8 = data,
+ .r9 = mask,
+ };
+
+ return seamcall(TDH_VP_WR, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_wr);
+
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 7a15c9afcdfa..aacd38b12989 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -19,11 +19,13 @@
#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
+#define TDH_MNG_RD 11
#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
#define TDH_VP_INIT 22
#define TDH_PHYMEM_PAGE_RDMD 24
+#define TDH_VP_RD 26
#define TDH_PHYMEM_PAGE_RECLAIM 28
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
@@ -32,6 +34,7 @@
#define TDH_SYS_TDMR_INIT 36
#define TDH_PHYMEM_CACHE_WB 40
#define TDH_PHYMEM_PAGE_WBINVD 41
+#define TDH_VP_WR 43
#define TDH_SYS_CONFIG 45
/*
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 06/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (4 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 05/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-01 7:49 ` [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
` (7 subsequent siblings)
13 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Sean Christopherson, Isaku Yamahata, Binbin Wu, Yuan Yao
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. The TDX module has the concept of flushing vCPUs. These flushes
include both a flush of the translation caches and also any other state
internal to the TDX module. Before freeing a KeyID, this flush operation
needs to be done. KVM will need to perform the flush on each pCPU
associated with the TD, and also perform a TD scoped operation that checks
if the flush has been done on all vCPU's associated with the TD.
Add a tdh_vp_flush() function to be used to call TDH.VP.FLUSH on each pCPU
associated with the TD during TD teardown. It will also be called when
disabling TDX and during vCPU migration between pCPUs.
Add tdh_mng_vpflushdone() to be used by KVM to call TDH.MNG.VPFLUSHDONE.
KVM will use this during TD teardown to verify that TDH.VP.FLUSH has been
called sufficiently, and advance the state machine that will allow for
reclaiming the TD's KeyID.
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
Message-ID: <20241203010317.827803-7-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 20 ++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 24 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 6342309fe011..8fc1836c5d09 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -143,6 +143,8 @@ u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
+u64 tdh_vp_flush(struct tdx_vp *vp);
+u64 tdh_mng_vpflushdone(struct tdx_td *td);
u64 tdh_mng_key_freeid(struct tdx_td *td);
u64 tdh_mng_init(struct tdx_td *td, u64 td_params, u64 *extended_err);
u64 tdh_vp_init(struct tdx_vp *vp, u64 initial_rcx);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 7cfd07772b20..986b58b63121 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1554,6 +1554,26 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
}
EXPORT_SYMBOL_GPL(tdh_mng_rd);
+u64 tdh_vp_flush(struct tdx_vp *vp)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdvpr_pa(vp),
+ };
+
+ return seamcall(TDH_VP_FLUSH, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_vp_flush);
+
+u64 tdh_mng_vpflushdone(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MNG_VPFLUSHDONE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_vpflushdone);
+
u64 tdh_mng_key_freeid(struct tdx_td *td)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index aacd38b12989..62cb7832c42d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -20,6 +20,8 @@
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
+#define TDH_VP_FLUSH 18
+#define TDH_MNG_VPFLUSHDONE 19
#define TDH_VP_CREATE 10
#define TDH_MNG_KEY_FREEID 20
#define TDH_MNG_INIT 21
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (5 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 06/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-02 21:59 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
` (6 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a Secure EPT
(S-EPT or SEPT) tree per TD for private GPA to HPA translation. Wrap the
TDH.MEM.SEPT.ADD SEAMCALL with tdh_mem_sept_add() to provide pages to the
TDX module for building a TD's SEPT tree. (Refer to these pages as SEPT
pages).
Callers need to allocate and provide a normal page to tdh_mem_sept_add(),
which then passes the page to the TDX module via the SEAMCALL
TDH.MEM.SEPT.ADD. The TDX module then installs the page into SEPT tree and
encrypts this SEPT page with the TD's guest keyID. The kernel cannot use
the SEPT page until after reclaiming it via TDH.MEM.SEPT.REMOVE or
TDH.PHYMEM.PAGE.RECLAIM.
Before passing the page to the TDX module, tdh_mem_sept_add() performs a
CLFLUSH on the page mapped with keyID 0 to ensure that any dirty cache
lines don't write back later and clobber TD memory or control structures.
Don't worry about the other MK-TME keyIDs because the kernel doesn't use
them. The TDX docs specify that this flush is not needed unless the TDX
module exposes the CLFLUSH_BEFORE_ALLOC feature bit. Do the CLFLUSH
unconditionally for two reasons: make the solution simpler by having a
single path that can handle both !CLFLUSH_BEFORE_ALLOC and
CLFLUSH_BEFORE_ALLOC cases. Avoid wading into any correctness uncertainty
by going with a conservative solution to start.
Callers should specify "GPA" and "level" for the TDX module to install the
SEPT page at the specified position in the SEPT. Do not include the root
page level in "level" since TDH.MEM.SEPT.ADD can only add non-root pages to
the SEPT. Ensure "level" is between 1 and 3 for a 4-level SEPT or between 1
and 4 for a 5-level SEPT.
Call tdh_mem_sept_add() during the TD's build time or during the TD's
runtime. Check for errors from the function return value and retrieve
extended error info from the function output parameters.
The TDX module has many internal locks. To avoid staying in SEAM mode for
too long, SEAMCALLs returns a BUSY error code to the kernel instead of
spinning on the locks. Depending on the specific SEAMCALL, the caller
may need to handle this error in specific ways (e.g., retry). Therefore,
return the SEAMCALL error code directly to the caller. Don't attempt to
handle it in the core kernel.
TDH.MEM.SEPT.ADD effectively manages two internal resources of the TDX
module: it installs page table pages in the SEPT tree and also updates the
TDX module's page metadata (PAMT). Don't add a wrapper for the matching
SEAMCALL for removing a SEPT page (TDH.MEM.SEPT.REMOVE) because KVM, as the
only in-kernel user, will only tear down the SEPT tree when the TD is being
torn down. When this happens it can just do other operations that reclaim
the SEPT pages for the host kernels to use, update the PAMT and let the
SEPT get trashed.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073624.22114-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 1 +
arch/x86/virt/vmx/tdx/tdx.c | 19 +++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 1 +
3 files changed, 21 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 8fc1836c5d09..9e0c60a41d5d 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -138,6 +138,7 @@ struct tdx_vp {
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 986b58b63121..a97a470dda23 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1491,6 +1491,25 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = hpa,
+ };
+ u64 ret;
+
+ clflush_cache_range(__va(hpa), PAGE_SIZE);
+ ret = seamcall_ret(TDH_MEM_SEPT_ADD, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_sept_add);
+
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 62cb7832c42d..308d3aa565d7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,6 +16,7 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (6 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-02 23:32 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
` (5 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a Secure EPT
(S-EPT or SEPT) tree per TD to translate TD's private memory accessed
using a private GPA. Wrap the SEAMCALL TDH.MEM.PAGE.ADD with
tdh_mem_page_add() and TDH.MEM.PAGE.AUG with tdh_mem_page_aug() to add TD
private pages and map them to the TD's private GPAs in the SEPT.
Callers of tdh_mem_page_add() and tdh_mem_page_aug() allocate and provide
normal pages to the wrappers, who further pass those pages to the TDX
module. Before passing the pages to the TDX module, tdh_mem_page_add() and
tdh_mem_page_aug() perform a CLFLUSH on the page mapped with keyID 0 to
ensure that any dirty cache lines don't write back later and clobber TD
memory or control structures. Don't worry about the other MK-TME keyIDs
because the kernel doesn't use them. The TDX docs specify that this flush
is not needed unless the TDX module exposes the CLFLUSH_BEFORE_ALLOC
feature bit. Do the CLFLUSH unconditionally for two reasons: make the
solution simpler by having a single path that can handle both
!CLFLUSH_BEFORE_ALLOC and CLFLUSH_BEFORE_ALLOC cases. Avoid wading into any
correctness uncertainty by going with a conservative solution to start.
Call tdh_mem_page_add() to add a private page to a TD during the TD's build
time (i.e., before TDH.MR.FINALIZE). Specify which GPA the 4K private page
will map to. No need to specify level info since TDH.MEM.PAGE.ADD only adds
pages at 4K level. To provide initial contents to TD, provide an additional
source page residing in memory managed by the host kernel itself (encrypted
with a shared keyID). The TDX module will copy the initial contents from
the source page in shared memory into the private page after mapping the
page in the SEPT to the specified private GPA. The TDX module allows the
source page to be the same page as the private page to be added. In that
case, the TDX module converts and encrypts the source page as a TD private
page.
Call tdh_mem_page_aug() to add a private page to a TD during the TD's
runtime (i.e., after TDH.MR.FINALIZE). TDH.MEM.PAGE.AUG supports adding
huge pages. Specify which GPA the private page will map to, along with
level info embedded in the lower bits of the GPA. The TDX module will
recognize the added page as the TD's private page after the TD's acceptance
with TDCALL TDG.MEM.PAGE.ACCEPT.
tdh_mem_page_add() and tdh_mem_page_aug() may fail. Callers can check
function return value and retrieve extended error info from the function
output parameters.
The TDX module has many internal locks. To avoid staying in SEAM mode for
too long, SEAMCALLs returns a BUSY error code to the kernel instead of
spinning on the locks. Depending on the specific SEAMCALL, the caller
may need to handle this error in specific ways (e.g., retry). Therefore,
return the SEAMCALL error code directly to the caller. Don't attempt to
handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073636.22129-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 39 +++++++++++++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 43 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 9e0c60a41d5d..32f3981d56c5 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -138,8 +138,10 @@ struct tdx_vp {
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx, u64 *rdx);
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
+u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a97a470dda23..f39197d4eafc 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1491,6 +1491,26 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = hpa,
+ .r9 = source,
+ };
+ u64 ret;
+
+ clflush_cache_range(__va(hpa), PAGE_SIZE);
+ ret = seamcall_ret(TDH_MEM_PAGE_ADD, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_add);
+
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx)
{
struct tdx_module_args args = {
@@ -1522,6 +1542,25 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
}
EXPORT_SYMBOL_GPL(tdh_vp_addcx);
+u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = hpa,
+ };
+ u64 ret;
+
+ clflush_cache_range(__va(hpa), PAGE_SIZE);
+ ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 308d3aa565d7..80e6ef006085 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -16,8 +16,10 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_PAGE_ADD 2
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
+#define TDH_MEM_PAGE_AUG 6
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (7 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-03 1:28 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
` (4 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX module defines a TLB tracking protocol to make sure that no logical
processor holds any stale Secure EPT (S-EPT or SEPT) TLB translations for a
given TD private GPA range. After a successful TDH.MEM.RANGE.BLOCK,
TDH.MEM.TRACK, and kicking off all vCPUs, TDX module ensures that the
subsequent TDH.VP.ENTER on each vCPU will flush all stale TLB entries for
the specified GPA ranges in TDH.MEM.RANGE.BLOCK. Wrap the
TDH.MEM.RANGE.BLOCK with tdh_mem_range_block() and TDH.MEM.TRACK with
tdh_mem_track() to enable the kernel to assist the TDX module in TLB
tracking management.
The caller of tdh_mem_range_block() needs to specify "GPA" and "level" to
request the TDX module to block the subsequent creation of TLB translation
for a GPA range. This GPA range can correspond to a SEPT page or a TD
private page at any level.
Contentions and errors are possible with the SEAMCALL TDH.MEM.RANGE.BLOCK.
Therefore, the caller of tdh_mem_range_block() needs to check the function
return value and retrieve extended error info from the function output
params.
Upon TDH.MEM.RANGE.BLOCK success, no new TLB entries will be created for
the specified private GPA range, though the existing TLB translations may
still persist.
Call tdh_mem_track() after tdh_mem_range_block(). No extra info is required
except the TDR HPA to denote the TD. TDH.MEM.TRACK will advance the TD's
epoch counter to ensure TDX module will flush TLBs in all vCPUs once the
vCPUs re-enter the TD. TDH.MEM.TRACK will fail to advance TD's epoch
counter if there are vCPUs still running in non-root mode at the previous
TD epoch counter. Therefore, send IPIs to kick off vCPUs after
tdh_mem_track() to avoid the failure by forcing all vCPUs to re-enter the
TD.
Contentions are also possible in TDH.MEM.TRACK. For example, TDH.MEM.TRACK
may contend with TDH.VP.ENTER when advancing the TD epoch counter.
tdh_mem_track() does not provide the retries for the caller. Callers can
choose to avoid contentions or retry on their own.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073648.22143-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 32f3981d56c5..f0b7b7b7d506 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -142,6 +142,7 @@ u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx,
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx);
+u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
@@ -155,6 +156,7 @@ u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_mem_track(struct tdx_td *tdr);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
#else
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f39197d4eafc..c7e6f30d0a14 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1561,6 +1561,23 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
}
EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_range_block);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1734,6 +1751,16 @@ u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+u64 tdh_mem_track(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MEM_TRACK, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mem_track);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 80e6ef006085..4b0ad536afd9 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -20,6 +20,7 @@
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MEM_PAGE_AUG 6
+#define TDH_MEM_RANGE_BLOCK 7
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
@@ -35,6 +36,7 @@
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
+#define TDH_MEM_TRACK 38
#define TDH_SYS_LP_INIT 35
#define TDH_SYS_TDMR_INIT 36
#define TDH_PHYMEM_CACHE_WB 40
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (8 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-03 1:36 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
` (3 subsequent siblings)
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
TDX architecture introduces the concept of private GPA vs shared GPA,
depending on the GPA.SHARED bit. The TDX module maintains a single Secure
EPT (S-EPT or SEPT) tree per TD to translate TD's private memory accessed
using a private GPA. Wrap the SEAMCALL TDH.MEM.PAGE.REMOVE with
tdh_mem_page_remove() and TDH_PHYMEM_PAGE_WBINVD with
tdh_phymem_page_wbinvd_hkid() to unmap a TD private page from the SEPT,
remove the TD private page from the TDX module and flush cache lines to
memory after removal of the private page.
Callers should specify "GPA" and "level" when calling tdh_mem_page_remove()
to indicate to the TDX module which TD private page to unmap and remove.
TDH.MEM.PAGE.REMOVE may fail, and the caller of tdh_mem_page_remove() can
check the function return value and retrieve extended error information
from the function output parameters. Follow the TLB tracking protocol
before calling tdh_mem_page_remove() to remove a TD private page to avoid
SEAMCALL failure.
After removing a TD's private page, the TDX module does not write back and
invalidate cache lines associated with the page and the page's keyID (i.e.,
the TD's guest keyID). Therefore, provide tdh_phymem_page_wbinvd_hkid() to
allow the caller to pass in the TD's guest keyID and invoke
TDH_PHYMEM_PAGE_WBINVD to perform this action.
Before reusing the page, the host kernel needs to map the page with keyID 0
and invoke movdir64b() to convert the TD private page to a normal shared
page.
TDH.MEM.PAGE.REMOVE and TDH_PHYMEM_PAGE_WBINVD may meet contentions inside
the TDX module for TDX's internal resources. To avoid staying in SEAM mode
for too long, TDX module will return a BUSY error code to the kernel
instead of spinning on the locks. The caller may need to handle this error
in specific ways (e.g., retry). The wrappers return the SEAMCALL error code
directly to the caller. Don't attempt to handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073658.22157-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 1 +
3 files changed, 30 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index f0b7b7b7d506..74938f725481 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -157,8 +157,10 @@ u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
u64 tdh_mem_track(struct tdx_td *tdr);
+u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
+u64 tdh_phymem_page_wbinvd_hkid(u64 hpa, u64 hkid);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index c7e6f30d0a14..cde55e9b3280 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1761,6 +1761,23 @@ u64 tdh_mem_track(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mem_track);
+u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa | level,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_PAGE_REMOVE, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_remove);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
@@ -1780,3 +1797,13 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
+
+u64 tdh_phymem_page_wbinvd_hkid(u64 hpa, u64 hkid)
+{
+ struct tdx_module_args args = {};
+
+ args.rcx = hpa | (hkid << boot_cpu_data.x86_phys_bits);
+
+ return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 4b0ad536afd9..d49cdd9b0577 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -33,6 +33,7 @@
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_VP_RD 26
#define TDH_PHYMEM_PAGE_RECLAIM 28
+#define TDH_MEM_PAGE_REMOVE 29
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (9 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-03 18:02 ` Edgecombe, Rick P
2025-01-07 7:01 ` Yan Zhao
2025-01-01 7:49 ` [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
` (2 subsequent siblings)
13 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata, Sean Christopherson
From: Isaku Yamahata <isaku.yamahata@intel.com>
The TDX module measures the TD during the build process and saves the
measurement in TDCS.MRTD to facilitate TD attestation of the initial
contents of the TD. Wrap the SEAMCALL TDH.MR.EXTEND with tdh_mr_extend()
and TDH.MR.FINALIZE with tdh_mr_finalize() to enable the host kernel to
assist the TDX module in performing the measurement.
The measurement in TDCS.MRTD is a SHA-384 digest of the build process.
SEAMCALLs TDH.MNG.INIT and TDH.MEM.PAGE.ADD initialize and contribute to
the MRTD digest calculation.
The caller of tdh_mr_extend() should break the TD private page into chunks
of size TDX_EXTENDMR_CHUNKSIZE and invoke tdh_mr_extend() to add the page
content into the digest calculation. Failures are possible with
TDH.MR.EXTEND (e.g., due to SEPT walking). The caller of tdh_mr_extend()
can check the function return value and retrieve extended error information
from the function output parameters.
Calling tdh_mr_finalize() completes the measurement. The TDX module then
turns the TD into the runnable state. Further TDH.MEM.PAGE.ADD and
TDH.MR.EXTEND calls will fail.
TDH.MR.FINALIZE may fail due to errors such as the TD having no vCPUs or
contentions. Check function return value when calling tdh_mr_finalize() to
determine the exact reason for failure. Take proper locks on the caller's
side to avoid contention failures, or handle the BUSY error in specific
ways (e.g., retry). Return the SEAMCALL error code directly to the caller.
Do not attempt to handle it in the core kernel.
[Kai: Switched from generic seamcall export]
[Yan: Re-wrote the changelog]
Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20241112073709.22171-1-yan.y.zhao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 2 ++
arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx.h | 2 ++
3 files changed, 31 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 74938f725481..6981a3d75eb2 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -147,6 +147,8 @@ u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
+u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx);
+u64 tdh_mr_finalize(struct tdx_td *td);
u64 tdh_vp_flush(struct tdx_vp *vp);
u64 tdh_mng_vpflushdone(struct tdx_td *td);
u64 tdh_mng_key_freeid(struct tdx_td *td);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index cde55e9b3280..84fe5bc79434 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1629,6 +1629,33 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
}
EXPORT_SYMBOL_GPL(tdh_mng_rd);
+u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx)
+{
+ struct tdx_module_args args = {
+ .rcx = gpa,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MR_EXTEND, &args);
+
+ *rcx = args.rcx;
+ *rdx = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mr_extend);
+
+u64 tdh_mr_finalize(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MR_FINALIZE, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mr_finalize);
+
u64 tdh_vp_flush(struct tdx_vp *vp)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index d49cdd9b0577..a1e34773bab7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -24,6 +24,8 @@
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
+#define TDH_MR_EXTEND 16
+#define TDH_MR_FINALIZE 17
#define TDH_VP_FLUSH 18
#define TDH_MNG_VPFLUSHDONE 19
#define TDH_VP_CREATE 10
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (10 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-03 18:26 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 13/13] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
2025-01-02 19:43 ` [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Edgecombe, Rick P
13 siblings, 1 reply; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao
From: Kai Huang <kai.huang@intel.com>
KVM needs two classes of global metadata to create and run TDX guests:
- "TD Control Structures"
- "TD Configurability"
The first class contains the sizes of TDX guest per-VM and per-vCPU
control structures. KVM will need to use them to allocate enough space
for those control structures.
The second class contains info which reports things like which features
are configurable to TDX guest etc. KVM will need to use them to
properly configure TDX guests.
Read them for KVM TDX to use.
The code change is auto-generated by re-running the script in [1] after
uncommenting the "td_conf" and "td_ctrl" part to regenerate the
tdx_global_metadata.{hc} and update them to the existing ones in the
kernel.
#python tdx.py global_metadata.json tdx_global_metadata.h \
tdx_global_metadata.c
The 'global_metadata.json' can be fetched from [2].
Link: https://lore.kernel.org/kvm/0853b155ec9aac09c594caa60914ed6ea4dc0a71.camel@intel.com/ [1]
Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [2]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20241030190039.77971-4-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 50 +++++++++++++++++++++
arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 19 ++++++++
2 files changed, 69 insertions(+)
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
index 8027a24d1c6e..13ad2663488b 100644
--- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
@@ -37,12 +37,62 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
return ret;
}
+static int get_tdx_sys_info_td_ctrl(struct tdx_sys_info_td_ctrl *sysinfo_td_ctrl)
+{
+ int ret = 0;
+ u64 val;
+
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000000, &val)))
+ sysinfo_td_ctrl->tdr_base_size = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000100, &val)))
+ sysinfo_td_ctrl->tdcs_base_size = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9800000100000200, &val)))
+ sysinfo_td_ctrl->tdvps_base_size = val;
+
+ return ret;
+}
+
+static int get_tdx_sys_info_td_conf(struct tdx_sys_info_td_conf *sysinfo_td_conf)
+{
+ int ret = 0;
+ u64 val;
+ int i, j;
+
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000000, &val)))
+ sysinfo_td_conf->attributes_fixed0 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000001, &val)))
+ sysinfo_td_conf->attributes_fixed1 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000002, &val)))
+ sysinfo_td_conf->xfam_fixed0 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x1900000300000003, &val)))
+ sysinfo_td_conf->xfam_fixed1 = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000100000004, &val)))
+ sysinfo_td_conf->num_cpuid_config = val;
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000100000008, &val)))
+ sysinfo_td_conf->max_vcpus_per_td = val;
+ if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_leaves))
+ return -EINVAL;
+ for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++)
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000300000400 + i, &val)))
+ sysinfo_td_conf->cpuid_config_leaves[i] = val;
+ if (sysinfo_td_conf->num_cpuid_config > ARRAY_SIZE(sysinfo_td_conf->cpuid_config_values))
+ return -EINVAL;
+ for (i = 0; i < sysinfo_td_conf->num_cpuid_config; i++)
+ for (j = 0; j < 2; j++)
+ if (!ret && !(ret = read_sys_metadata_field(0x9900000300000500 + i * 2 + j, &val)))
+ sysinfo_td_conf->cpuid_config_values[i][j] = val;
+
+ return ret;
+}
+
static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
{
int ret = 0;
ret = ret ?: get_tdx_sys_info_features(&sysinfo->features);
ret = ret ?: get_tdx_sys_info_tdmr(&sysinfo->tdmr);
+ ret = ret ?: get_tdx_sys_info_td_ctrl(&sysinfo->td_ctrl);
+ ret = ret ?: get_tdx_sys_info_td_conf(&sysinfo->td_conf);
return ret;
}
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
index 6dd3c9695f59..060a2ad744bf 100644
--- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
@@ -17,9 +17,28 @@ struct tdx_sys_info_tdmr {
u16 pamt_1g_entry_size;
};
+struct tdx_sys_info_td_ctrl {
+ u16 tdr_base_size;
+ u16 tdcs_base_size;
+ u16 tdvps_base_size;
+};
+
+struct tdx_sys_info_td_conf {
+ u64 attributes_fixed0;
+ u64 attributes_fixed1;
+ u64 xfam_fixed0;
+ u64 xfam_fixed1;
+ u16 num_cpuid_config;
+ u16 max_vcpus_per_td;
+ u64 cpuid_config_leaves[128];
+ u64 cpuid_config_values[128][2];
+};
+
struct tdx_sys_info {
struct tdx_sys_info_features features;
struct tdx_sys_info_tdmr tdmr;
+ struct tdx_sys_info_td_ctrl td_ctrl;
+ struct tdx_sys_info_td_conf td_conf;
};
#endif
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* [PATCH 13/13] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (11 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
@ 2025-01-01 7:49 ` Paolo Bonzini
2025-01-02 19:43 ` [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Edgecombe, Rick P
13 siblings, 0 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-01 7:49 UTC (permalink / raw)
To: linux-kernel, kvm
Cc: kai.huang, rick.p.edgecombe, dave.hansen, yan.y.zhao,
Isaku Yamahata
From: Isaku Yamahata <isaku.yamahata@intel.com>
Intel TDX protects guest VMs from malicious host and certain physical
attacks. Pre-TDX Intel hardware has support for a memory encryption
architecture called MK-TME, which repurposes several high bits of
physical address as "KeyID". The BIOS reserves a sub-range of MK-TME
KeyIDs as "TDX private KeyIDs".
Each TDX guest must be assigned with a unique TDX KeyID when it is
created. The kernel reserves the first TDX private KeyID for
crypto-protection of specific TDX module data which has a lifecycle that
exceeds the KeyID reserved for the TD's use. The rest of the KeyIDs are
left for TDX guests to use.
Create a small KeyID allocator. Export
tdx_guest_keyid_alloc()/tdx_guest_keyid_free() to allocate and free TDX
guest KeyID for KVM to use.
Don't provide the stub functions when CONFIG_INTEL_TDX_HOST=n since they
are not supposed to be called in this case.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20241030190039.77971-5-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/include/asm/tdx.h | 3 +++
arch/x86/virt/vmx/tdx/tdx.c | 17 +++++++++++++++++
2 files changed, 20 insertions(+)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 6981a3d75eb2..c2fe56a3c7fa 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -117,6 +117,9 @@ int tdx_cpu_enable(void);
int tdx_enable(void);
const char *tdx_dump_mce_info(struct mce *m);
+int tdx_guest_keyid_alloc(void);
+void tdx_guest_keyid_free(unsigned int keyid);
+
struct tdx_td {
/* TD root structure: */
struct page *tdr_page;
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 84fe5bc79434..1a5912a8b5c5 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -28,6 +28,7 @@
#include <linux/log2.h>
#include <linux/acpi.h>
#include <linux/suspend.h>
+#include <linux/idr.h>
#include <asm/page.h>
#include <asm/special_insns.h>
#include <asm/msr-index.h>
@@ -43,6 +44,8 @@ static u32 tdx_global_keyid __ro_after_init;
static u32 tdx_guest_keyid_start __ro_after_init;
static u32 tdx_nr_guest_keyids __ro_after_init;
+static DEFINE_IDA(tdx_guest_keyid_pool);
+
static DEFINE_PER_CPU(bool, tdx_lp_initialized);
static struct tdmr_info_list tdx_tdmr_list;
@@ -1458,6 +1461,20 @@ void __init tdx_init(void)
check_tdx_erratum();
}
+int tdx_guest_keyid_alloc(void)
+{
+ return ida_alloc_range(&tdx_guest_keyid_pool, tdx_guest_keyid_start,
+ tdx_guest_keyid_start + tdx_nr_guest_keyids - 1,
+ GFP_KERNEL);
+}
+EXPORT_SYMBOL_GPL(tdx_guest_keyid_alloc);
+
+void tdx_guest_keyid_free(unsigned int keyid)
+{
+ ida_free(&tdx_guest_keyid_pool, keyid);
+}
+EXPORT_SYMBOL_GPL(tdx_guest_keyid_free);
+
static inline u64 tdx_tdr_pa(struct tdx_td *td)
{
return page_to_pfn(td->tdr_page) << PAGE_SHIFT;
--
2.43.5
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
` (12 preceding siblings ...)
2025-01-01 7:49 ` [PATCH 13/13] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
@ 2025-01-02 19:43 ` Edgecombe, Rick P
13 siblings, 0 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-02 19:43 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> This is a completed version of Rick's RFC series at
> https://lore.kernel.org/r/20241203010317.827803-1-rick.p.edgecombe@intel.com/.
Thanks!
> Due to EPANETTONE I didn't use the latest RFC, which is fixed here.
>
> As in the patches that I sent ten minutes ago, I took all the "Add
> SEAMCALL wrappers" patches from the various TDX parts and placed them
> in a single series, so that they can be reviewed and provided in a topic
> branch by Dave.
The last plan was to have host metadata go through tip (currently v9 is queued
in tip "x86/tdx"), and everything else go through KVM tree with Dave acks. I
don't see any big problem in changing that, but we had been telling him to
expect to just give acks on the other patches. The reason to separate them that
way was because the other patches were tightly related to KVM's usage, where as
has host metadata had other users in mind too. Do you want to change that plan?
I mention this because I'm not sure if you were objecting to the current state
or just slipped the mind.
Also, did you see that Dave had acked patches 1 through 6? So if you are good
with them too, then I think we should call those done. For the MMU ones, Yan has
some updates to try to address Dave's general feedback from the first batch of
SEAMCALLs. We can comment the updates.
For the others, I had pinged Dave on "x86/virt/tdx: Read essential global
metadata for KVM" before the break, but missed that "x86/virt/tdx: Add SEAMCALL
wrappers for TDX KeyID management" had Dave's general agreement but not an
offical ack. If we collect acks on those last two, we should have everything
needed to queue "VM/vCPU creation".
>
> I will rebase kvm-coco-queue on top of these, but I almost definitely
> will not manage to finish and push the result before getting the first
> NMIs from the rest of the family. In the meanwhile, this gives people
> time to review while I'm not available.
You mentioned wanting to use it as an exercise to learn the code, so I'll leave
it to you.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
2025-01-01 7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
@ 2025-01-02 19:44 ` Edgecombe, Rick P
0 siblings, 0 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-02 19:44 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, yuan.yao@intel.com, binbin.wu@linux.intel.com,
sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
> Reviewed-by: Yuan Yao <yuan.yao@intel.com>
> Message-ID: <20241203010317.827803-2-rick.p.edgecombe@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
This and the next 5 patches are missing Dave's ack:
https://lore.kernel.org/kvm/0854fbf0-c885-4331-8e9d-30eaa557b266@intel.com/
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-01 7:49 ` [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
@ 2025-01-02 21:59 ` Edgecombe, Rick P
2025-01-07 5:27 ` Yan Zhao
2025-01-07 19:48 ` Dave Hansen
0 siblings, 2 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-02 21:59 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
>
> +u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx)
> +{
> + struct tdx_module_args args = {
> + .rcx = gpa | level,
The bit packing here is a bit magic. Yan had been looking at adding a union like
below to use internally in tdh_mem_sept_add() to pack the rcx register.
union tdx_sept_gpa_mapping_info {
struct {
u64 level : 3;
u64 reserved1 : 9;
u64 gfn : 40;
u64 reserved2 : 12;
};
u64 full;
};
It might be a bit verbose, but I agree something should be done.
> + .rdx = tdx_tdr_pa(td),
> + .r8 = hpa,
> + };
> + u64 ret;
> +
> + clflush_cache_range(__va(hpa), PAGE_SIZE);
> + ret = seamcall_ret(TDH_MEM_SEPT_ADD, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
This was changed to this:
*extended_err1 = args.rcx;
*extended_err2 = args.rdx;
The reasoning is rcx and rdx are not very clear names. The caller only uses them
for printing raw error codes, so don't bother creating a union or anything
fancy. The print statements should print the raw error code to help in debugging
situations where new bits are defined and returned by a future buggy TDX module.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_sept_add);
> +
> u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
> {
> struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 62cb7832c42d..308d3aa565d7 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -16,6 +16,7 @@
> * TDX module SEAMCALL leaf functions
> */
> #define TDH_MNG_ADDCX 1
> +#define TDH_MEM_SEPT_ADD 3
> #define TDH_VP_ADDCX 4
> #define TDH_MNG_KEY_CONFIG 8
> #define TDH_MNG_CREATE 9
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-01 7:49 ` [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
@ 2025-01-02 23:32 ` Edgecombe, Rick P
2025-01-06 5:50 ` Yan Zhao
0 siblings, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-02 23:32 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com,
Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index a97a470dda23..f39197d4eafc 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1491,6 +1491,26 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
> }
> EXPORT_SYMBOL_GPL(tdh_mng_addcx);
>
> +u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx, u64 *rdx)
> +{
u64 gpa could be gfn_t.
hpa and source should be struct pages, per:
https://lore.kernel.org/kvm/d92e5301-9ca4-469a-8ae5-b36426e67356@intel.com/
> + struct tdx_module_args args = {
> + .rcx = gpa,
This could potentially also use union tdx_sept_gpa_mapping_info.
> + .rdx = tdx_tdr_pa(td),
> + .r8 = hpa,
> + .r9 = source,
> + };
> + u64 ret;
> +
> + clflush_cache_range(__va(hpa), PAGE_SIZE);
> + ret = seamcall_ret(TDH_MEM_PAGE_ADD, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
Similar to the last patch, these could be extended_err1, extended_err2.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_page_add);
> +
> u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx)
> {
> struct tdx_module_args args = {
> @@ -1522,6 +1542,25 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
> }
> EXPORT_SYMBOL_GPL(tdh_vp_addcx);
>
> +u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
> +{
hpa should be struct page, or as Yan had been ready to propose a folio and idx.
I would have thought a struct page would be sufficient for now. She also planned
to add a level arg, which today should always be 4k, but would be needed for
future huge page support.
I think we should try to keep it as simple as possible for now.
> + struct tdx_module_args args = {
> + .rcx = gpa,
> + .rdx = tdx_tdr_pa(td),
> + .r8 = hpa,
> + };
> + u64 ret;
> +
> + clflush_cache_range(__va(hpa), PAGE_SIZE);
> + ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
Similar to the others, these could be extended_err1, extended_err2.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
> +
> u64 tdh_mng_key_config(struct tdx_td *td)
> {
> struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 308d3aa565d7..80e6ef006085 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -16,8 +16,10 @@
> * TDX module SEAMCALL leaf functions
> */
> #define TDH_MNG_ADDCX 1
> +#define TDH_MEM_PAGE_ADD 2
> #define TDH_MEM_SEPT_ADD 3
> #define TDH_VP_ADDCX 4
> +#define TDH_MEM_PAGE_AUG 6
> #define TDH_MNG_KEY_CONFIG 8
> #define TDH_MNG_CREATE 9
> #define TDH_MNG_RD 11
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
2025-01-01 7:49 ` [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
@ 2025-01-03 1:28 ` Edgecombe, Rick P
2025-01-07 6:40 ` Yan Zhao
0 siblings, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-03 1:28 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> TDX module defines a TLB tracking protocol to make sure that no logical
> processor holds any stale Secure EPT (S-EPT or SEPT) TLB translations for a
> given TD private GPA range. After a successful TDH.MEM.RANGE.BLOCK,
> TDH.MEM.TRACK, and kicking off all vCPUs, TDX module ensures that the
> subsequent TDH.VP.ENTER on each vCPU will flush all stale TLB entries for
> the specified GPA ranges in TDH.MEM.RANGE.BLOCK. Wrap the
> TDH.MEM.RANGE.BLOCK with tdh_mem_range_block() and TDH.MEM.TRACK with
> tdh_mem_track() to enable the kernel to assist the TDX module in TLB
> tracking management.
>
> The caller of tdh_mem_range_block() needs to specify "GPA" and "level" to
> request the TDX module to block the subsequent creation of TLB translation
> for a GPA range. This GPA range can correspond to a SEPT page or a TD
> private page at any level.
>
> Contentions and errors are possible with the SEAMCALL TDH.MEM.RANGE.BLOCK.
> Therefore, the caller of tdh_mem_range_block() needs to check the function
> return value and retrieve extended error info from the function output
> params.
>
> Upon TDH.MEM.RANGE.BLOCK success, no new TLB entries will be created for
> the specified private GPA range, though the existing TLB translations may
> still persist.
>
> Call tdh_mem_track() after tdh_mem_range_block(). No extra info is required
> except the TDR HPA to denote the TD. TDH.MEM.TRACK will advance the TD's
> epoch counter to ensure TDX module will flush TLBs in all vCPUs once the
> vCPUs re-enter the TD. TDH.MEM.TRACK will fail to advance TD's epoch
> counter if there are vCPUs still running in non-root mode at the previous
> TD epoch counter. Therefore, send IPIs to kick off vCPUs after
> tdh_mem_track() to avoid the failure by forcing all vCPUs to re-enter the
> TD.
This patch doesn't implement the functionality described in the above paragraph,
the caller does it. It also doesn't explain why it leaves it all for the callers
instead of implementing the described flow on the x86 side. I think Dave will
want to understand this, so how about this instead:
So to ensure private GPA translations are flushed, callers must first call
tdh_mem_range_block(), then tdh_mem_track(), and lastly send IPIs to kick all
the vCPUs and trigger a TLB flush forcing them to re-enter. Don't export a
single operation and instead export functions that just expose the block and
track operations. Do this for a couple reasons:
1. The vCPU kick should use KVM's functionality for doing this, which can better
target sending IPIs to only the minimum required pCPUs.
2. tdh_mem_track() doesn't need to be executed if a vCPU has not entered a TD,
which is information only KVM knows.
3. Leaving the operations separate will allow for batching many
tdh_mem_range_block() calls before a tdh_mem_track(). While this batching will
not be done initially by KVM, it demonstrates that keeping mem block and track
as separate operations is a generally good design.
The above also loses the bit about ensuring all vCPUs are kicked to avoid epoch
related errors. I didn't think it was needed to justify the wrappers.
>
> Contentions are also possible in TDH.MEM.TRACK. For example, TDH.MEM.TRACK
> may contend with TDH.VP.ENTER when advancing the TD epoch counter.
> tdh_mem_track() does not provide the retries for the caller. Callers can
> choose to avoid contentions or retry on their own.
>
> [Kai: Switched from generic seamcall export]
> [Yan: Re-wrote the changelog]
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20241112073648.22143-1-yan.y.zhao@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/include/asm/tdx.h | 2 ++
> arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
> arch/x86/virt/vmx/tdx/tdx.h | 2 ++
> 3 files changed, 31 insertions(+)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 32f3981d56c5..f0b7b7b7d506 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -142,6 +142,7 @@ u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, u64 hpa, u64 source, u64 *rcx,
> u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, u64 level, u64 hpa, u64 *rcx, u64 *rdx);
> u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
> u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx);
> +u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx);
> u64 tdh_mng_key_config(struct tdx_td *td);
> u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
> u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
> @@ -155,6 +156,7 @@ u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
> u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
> u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
> u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
> +u64 tdh_mem_track(struct tdx_td *tdr);
> u64 tdh_phymem_cache_wb(bool resume);
> u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
> #else
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index f39197d4eafc..c7e6f30d0a14 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1561,6 +1561,23 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
> }
> EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
>
> +u64 tdh_mem_range_block(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx)
> +{
> + struct tdx_module_args args = {
> + .rcx = gpa | level,
> + .rdx = tdx_tdr_pa(td),
> + };
> + u64 ret;
> +
> + ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
Similar to the others, these could be called extended_err1, extended_err2.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_range_block);
> +
> u64 tdh_mng_key_config(struct tdx_td *td)
> {
> struct tdx_module_args args = {
> @@ -1734,6 +1751,16 @@ u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64
> }
> EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
>
> +u64 tdh_mem_track(struct tdx_td *td)
> +{
> + struct tdx_module_args args = {
> + .rcx = tdx_tdr_pa(td),
> + };
> +
> + return seamcall(TDH_MEM_TRACK, &args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_track);
> +
> u64 tdh_phymem_cache_wb(bool resume)
> {
> struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 80e6ef006085..4b0ad536afd9 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -20,6 +20,7 @@
> #define TDH_MEM_SEPT_ADD 3
> #define TDH_VP_ADDCX 4
> #define TDH_MEM_PAGE_AUG 6
> +#define TDH_MEM_RANGE_BLOCK 7
> #define TDH_MNG_KEY_CONFIG 8
> #define TDH_MNG_CREATE 9
> #define TDH_MNG_RD 11
> @@ -35,6 +36,7 @@
> #define TDH_SYS_KEY_CONFIG 31
> #define TDH_SYS_INIT 33
> #define TDH_SYS_RD 34
> +#define TDH_MEM_TRACK 38
> #define TDH_SYS_LP_INIT 35
> #define TDH_SYS_TDMR_INIT 36
> #define TDH_PHYMEM_CACHE_WB 40
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-01 7:49 ` [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
@ 2025-01-03 1:36 ` Edgecombe, Rick P
2025-01-07 6:43 ` Yan Zhao
0 siblings, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-03 1:36 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> TDX architecture introduces the concept of private GPA vs shared GPA,
> depending on the GPA.SHARED bit. The TDX module maintains a single Secure
> EPT (S-EPT or SEPT) tree per TD to translate TD's private memory accessed
> using a private GPA. Wrap the SEAMCALL TDH.MEM.PAGE.REMOVE with
> tdh_mem_page_remove() and TDH_PHYMEM_PAGE_WBINVD with
> tdh_phymem_page_wbinvd_hkid() to unmap a TD private page from the SEPT,
> remove the TD private page from the TDX module and flush cache lines to
> memory after removal of the private page.
>
> Callers should specify "GPA" and "level" when calling tdh_mem_page_remove()
> to indicate to the TDX module which TD private page to unmap and remove.
>
> TDH.MEM.PAGE.REMOVE may fail, and the caller of tdh_mem_page_remove() can
> check the function return value and retrieve extended error information
> from the function output parameters. Follow the TLB tracking protocol
> before calling tdh_mem_page_remove() to remove a TD private page to avoid
> SEAMCALL failure.
>
> After removing a TD's private page, the TDX module does not write back and
> invalidate cache lines associated with the page and the page's keyID (i.e.,
> the TD's guest keyID). Therefore, provide tdh_phymem_page_wbinvd_hkid() to
> allow the caller to pass in the TD's guest keyID and invoke
> TDH_PHYMEM_PAGE_WBINVD to perform this action.
>
> Before reusing the page, the host kernel needs to map the page with keyID 0
> and invoke movdir64b() to convert the TD private page to a normal shared
> page.
>
> TDH.MEM.PAGE.REMOVE and TDH_PHYMEM_PAGE_WBINVD may meet contentions inside
> the TDX module for TDX's internal resources. To avoid staying in SEAM mode
> for too long, TDX module will return a BUSY error code to the kernel
> instead of spinning on the locks. The caller may need to handle this error
> in specific ways (e.g., retry). The wrappers return the SEAMCALL error code
> directly to the caller. Don't attempt to handle it in the core kernel.
>
> [Kai: Switched from generic seamcall export]
> [Yan: Re-wrote the changelog]
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20241112073658.22157-1-yan.y.zhao@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/include/asm/tdx.h | 2 ++
> arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
> arch/x86/virt/vmx/tdx/tdx.h | 1 +
> 3 files changed, 30 insertions(+)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index f0b7b7b7d506..74938f725481 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -157,8 +157,10 @@ u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
> u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
> u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
> u64 tdh_mem_track(struct tdx_td *tdr);
> +u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx);
> u64 tdh_phymem_cache_wb(bool resume);
> u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
> +u64 tdh_phymem_page_wbinvd_hkid(u64 hpa, u64 hkid);
> #else
> static inline void tdx_init(void) { }
> static inline int tdx_cpu_enable(void) { return -ENODEV; }
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index c7e6f30d0a14..cde55e9b3280 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1761,6 +1761,23 @@ u64 tdh_mem_track(struct tdx_td *td)
> }
> EXPORT_SYMBOL_GPL(tdh_mem_track);
>
> +u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64 level, u64 *rcx, u64 *rdx)
level could be an int instead of a u64. An enum was also discussed, but
considered to be not completely necessary. Probably we could even lose the level
arg, depending on what we want to do about the one for page.aug.
> +{
> + struct tdx_module_args args = {
> + .rcx = gpa | level,
Yan had "= gpa | (level & 0x7)" here, to make sure to only apply bits 0-2.
> + .rdx = tdx_tdr_pa(td),
> + };
> + u64 ret;
> +
> + ret = seamcall_ret(TDH_MEM_PAGE_REMOVE, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
Switch to extended_err1/2 if the others get changed.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mem_page_remove);
> +
> u64 tdh_phymem_cache_wb(bool resume)
> {
> struct tdx_module_args args = {
> @@ -1780,3 +1797,13 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
> return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> }
> EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
> +
> +u64 tdh_phymem_page_wbinvd_hkid(u64 hpa, u64 hkid)
> +{
> + struct tdx_module_args args = {};
> +
> + args.rcx = hpa | (hkid << boot_cpu_data.x86_phys_bits);
> +
> + return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid);
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index 4b0ad536afd9..d49cdd9b0577 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -33,6 +33,7 @@
> #define TDH_PHYMEM_PAGE_RDMD 24
> #define TDH_VP_RD 26
> #define TDH_PHYMEM_PAGE_RECLAIM 28
> +#define TDH_MEM_PAGE_REMOVE 29
> #define TDH_SYS_KEY_CONFIG 31
> #define TDH_SYS_INIT 33
> #define TDH_SYS_RD 34
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation
2025-01-01 7:49 ` [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
@ 2025-01-03 17:51 ` Edgecombe, Rick P
0 siblings, 0 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-03 17:51 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, yuan.yao@intel.com, binbin.wu@linux.intel.com,
sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> +/*
> + * The TDX module exposes a CLFLUSH_BEFORE_ALLOC bit to specify whether
> + * a CLFLUSH of pages is required before handing them to the TDX module.
> + * Be conservative and make the code simpler by doing the CLFLUSH
> + * unconditionally.
> + */
> +static void tdx_clflush_page(struct page *tdr)
tdr could be "page" here, as it later is used for non-tdr pages.
> +{
> + clflush_cache_range(page_to_virt(tdr), PAGE_SIZE);
> +}
> +
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-01 7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
@ 2025-01-03 18:02 ` Edgecombe, Rick P
2025-01-14 22:03 ` Paolo Bonzini
2025-01-07 7:01 ` Yan Zhao
1 sibling, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-03 18:02 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> From: Isaku Yamahata <isaku.yamahata@intel.com>
>
> The TDX module measures the TD during the build process and saves the
> measurement in TDCS.MRTD to facilitate TD attestation of the initial
> contents of the TD. Wrap the SEAMCALL TDH.MR.EXTEND with tdh_mr_extend()
> and TDH.MR.FINALIZE with tdh_mr_finalize() to enable the host kernel to
> assist the TDX module in performing the measurement.
>
> The measurement in TDCS.MRTD is a SHA-384 digest of the build process.
> SEAMCALLs TDH.MNG.INIT and TDH.MEM.PAGE.ADD initialize and contribute to
> the MRTD digest calculation.
>
> The caller of tdh_mr_extend() should break the TD private page into chunks
> of size TDX_EXTENDMR_CHUNKSIZE and invoke tdh_mr_extend() to add the page
> content into the digest calculation. Failures are possible with
> TDH.MR.EXTEND (e.g., due to SEPT walking). The caller of tdh_mr_extend()
> can check the function return value and retrieve extended error information
> from the function output parameters.
>
> Calling tdh_mr_finalize() completes the measurement. The TDX module then
> turns the TD into the runnable state. Further TDH.MEM.PAGE.ADD and
> TDH.MR.EXTEND calls will fail.
>
> TDH.MR.FINALIZE may fail due to errors such as the TD having no vCPUs or
> contentions. Check function return value when calling tdh_mr_finalize() to
> determine the exact reason for failure. Take proper locks on the caller's
> side to avoid contention failures, or handle the BUSY error in specific
> ways (e.g., retry). Return the SEAMCALL error code directly to the caller.
> Do not attempt to handle it in the core kernel.
>
> [Kai: Switched from generic seamcall export]
> [Yan: Re-wrote the changelog]
> Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20241112073709.22171-1-yan.y.zhao@intel.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/include/asm/tdx.h | 2 ++
> arch/x86/virt/vmx/tdx/tdx.c | 27 +++++++++++++++++++++++++++
> arch/x86/virt/vmx/tdx/tdx.h | 2 ++
> 3 files changed, 31 insertions(+)
>
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 74938f725481..6981a3d75eb2 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -147,6 +147,8 @@ u64 tdh_mng_key_config(struct tdx_td *td);
> u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
> u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
> u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
> +u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx);
> +u64 tdh_mr_finalize(struct tdx_td *td);
> u64 tdh_vp_flush(struct tdx_vp *vp);
> u64 tdh_mng_vpflushdone(struct tdx_td *td);
> u64 tdh_mng_key_freeid(struct tdx_td *td);
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index cde55e9b3280..84fe5bc79434 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -1629,6 +1629,33 @@ u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data)
> }
> EXPORT_SYMBOL_GPL(tdh_mng_rd);
>
> +u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx)
gpa should be type gpa_t to avoid bare u64 types.
> +{
> + struct tdx_module_args args = {
> + .rcx = gpa,
> + .rdx = tdx_tdr_pa(td),
> + };
> + u64 ret;
> +
> + ret = seamcall_ret(TDH_MR_EXTEND, &args);
> +
> + *rcx = args.rcx;
> + *rdx = args.rdx;
Could be extended_err1/2.
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdh_mr_extend);
> +
> +u64 tdh_mr_finalize(struct tdx_td *td)
> +{
> + struct tdx_module_args args = {
> + .rcx = tdx_tdr_pa(td),
> + };
> +
> + return seamcall(TDH_MR_FINALIZE, &args);
> +}
> +EXPORT_SYMBOL_GPL(tdh_mr_finalize);
> +
> u64 tdh_vp_flush(struct tdx_vp *vp)
> {
> struct tdx_module_args args = {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> index d49cdd9b0577..a1e34773bab7 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.h
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -24,6 +24,8 @@
> #define TDH_MNG_KEY_CONFIG 8
> #define TDH_MNG_CREATE 9
> #define TDH_MNG_RD 11
> +#define TDH_MR_EXTEND 16
> +#define TDH_MR_FINALIZE 17
> #define TDH_VP_FLUSH 18
> #define TDH_MNG_VPFLUSHDONE 19
> #define TDH_VP_CREATE 10
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM
2025-01-01 7:49 ` [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
@ 2025-01-03 18:26 ` Edgecombe, Rick P
0 siblings, 0 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-03 18:26 UTC (permalink / raw)
To: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, Huang, Kai, dave.hansen@linux.intel.com
On Wed, 2025-01-01 at 02:49 -0500, Paolo Bonzini wrote:
> Link: https://lore.kernel.org/kvm/0853b155ec9aac09c594caa60914ed6ea4dc0a71.camel@intel.com/ [1]
> Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [2]
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Message-ID: <20241030190039.77971-4-rick.p.edgecombe@intel.com>
It looks like this has the code changes from:
https://lore.kernel.org/kvm/20241221010704.14155-1-kai.huang@intel.com/
but not the log updates. The main missing piece is the explanation of why to go
with 128 size over the documented 32 and to check the size at runtime. I think
the history is good to record in the log.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-02 23:32 ` Edgecombe, Rick P
@ 2025-01-06 5:50 ` Yan Zhao
2025-01-06 19:21 ` Edgecombe, Rick P
2025-01-14 23:32 ` Paolo Bonzini
0 siblings, 2 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-06 5:50 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Fri, Jan 03, 2025 at 07:32:46AM +0800, Edgecombe, Rick P wrote:
> > +u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, u64 hpa, u64 *rcx, u64 *rdx)
> > +{
>
> hpa should be struct page, or as Yan had been ready to propose a folio and idx.
The consideration of folio + idx is to make sure the physical addresses for
"level" is contiguous, while allowing KVM to request mapping of a small page
contained in a huge folio.
> I would have thought a struct page would be sufficient for now. She also planned
> to add a level arg, which today should always be 4k, but would be needed for
> future huge page support.
Yes, in previous version, "level" is embedded in param "gpa" and implicitly set
to 0 by KVM.
The planned changes to tdh_mem_page_aug() are as follows:
- Use struct tdx_td instead of raw TDR u64.
- Use extended_err1/2 instead of rcx/rdx for output.
- Change "u64 gpa" to "gfn_t gfn".
- Use union tdx_sept_gpa_mapping_info to initialize args.rcx.
- Use "struct folio *" + "unsigned long folio_page_idx" instead of raw
hpa for guest page in tdh_mem_page_aug() and fail if a page (huge
or not) to aug is not contained in a single folio.
- Add a new param "level".
- Fail the wrapper if "level" is not 4K-1G.
- Call tdx_clflush_page() instead of clflush_cache_range() and loops
tdx_clflush_page() for each 4k page in a huge page to aug.
+u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, int level, struct folio *private_folio,
+ unsigned long folio_page_idx, u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = level, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(folio_page(private_folio, folio_page_idx)),
+ };
+ unsigned long nr_pages = 1 << (level * 9);
+ u64 ret;
+
+ if (!(level >= TDX_PS_4K && level < TDX_PS_NR) ||
+ (folio_page_idx + nr_pages > folio_nr_pages(private_folio)))
+ return -EINVAL;
+
+ while (nr_pages--)
+ tdx_clflush_page(folio_page(private_folio, folio_page_idx++));
+
+ ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
The corresponding changes in KVM:
static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn)
{
- put_page(pfn_to_page(pfn));
+ folio_put(page_folio(pfn_to_page(pfn)));
}
static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
enum pg_level level, kvm_pfn_t pfn)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
- hpa_t hpa = pfn_to_hpa(pfn);
- gpa_t gpa = gfn_to_gpa(gfn);
+ int tdx_level = pg_level_to_tdx_sept_level(level);
+ struct page *private_page = pfn_to_page(pfn);
u64 entry, level_state;
u64 err;
- err = tdh_mem_page_aug(tdr_pa, gpa, hpa, &entry, &level_state);
+ err = tdh_mem_page_aug(&kvm_tdx->td, gfn, tdx_level, page_folio(private_page),
+ folio_page_idx(page_folio(private_page), private_page),
+ &entry, &level_state);
if (unlikely(err & TDX_OPERAND_BUSY)) {
tdx_unpin(kvm, pfn);
return -EBUSY;
@@ -1620,9 +1621,9 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
* migration. Until guest_memfd supports page migration, prevent page
* migration.
* TODO: Once guest_memfd introduces callback on page migration,
- * implement it and remove get_page/put_page().
+ * implement it and remove folio_get/folio_put().
*/
- get_page(pfn_to_page(pfn));
+ folio_get(page_folio(pfn_to_page(pfn)));
> I think we should try to keep it as simple as possible for now.
Yeah.
So, do you think we need to have tdh_mem_page_aug() to support 4K level page
only and ask for Dave's review again for huge page?
Do we need to add param "level" ?
- if yes, "struct page" looks not fit.
- if not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-06 5:50 ` Yan Zhao
@ 2025-01-06 19:21 ` Edgecombe, Rick P
2025-01-07 6:37 ` Yan Zhao
2025-01-14 23:32 ` Paolo Bonzini
1 sibling, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-06 19:21 UTC (permalink / raw)
To: Zhao, Yan Y
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Mon, 2025-01-06 at 13:50 +0800, Yan Zhao wrote:
> > I think we should try to keep it as simple as possible for now.
> Yeah.
> So, do you think we need to have tdh_mem_page_aug() to support 4K level page
> only and ask for Dave's review again for huge page?
>
> Do we need to add param "level" ?
> - if yes, "struct page" looks not fit.
> - if not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
My thoughts would be we should export just what is needed for today to keep
things simple and speedy (skip level arg, support order 0 only), especially if
we can drop all folio checks. The SEAMCALL wrappers will not be set in stone and
it will be easier to review huge page required stuff in the context of already
settled 4k support.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-02 21:59 ` Edgecombe, Rick P
@ 2025-01-07 5:27 ` Yan Zhao
2025-01-07 19:48 ` Dave Hansen
1 sibling, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 5:27 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
Here's the proposed new version with changelog:
SEAMCALL RFC:
- Use struct tdx_td instead of raw TDR u64.
- Use "struct page *" for sept_page instead of raw hpa.
- Change "u64 level" to "int level".
- Rename "level" to "tdx_level" as it's tdx specfic. (Rick)
- Change "u64 gpa" to "gfn_t gfn". (Reinette)
- Introduce a union tdx_sept_gpa_mapping_info to initialize args.rcx.
(Reinette)
- Call tdx_clflush_page() instead of clflush_cache_range().
- Use extended_err1/2 instead of rcx/rdx for output.
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index c6d0668de167..ea5a98fd1dd5 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -105,6 +105,20 @@ struct tdx_module_args {
u64 rsi;
};
+/*
+ * Used to specify SEPT GPA mapping information for SEPT related SEAMCALLs and
+ * TDCALLs.
+ */
+union tdx_sept_gpa_mapping_info {
+ struct {
+ u64 level : 3;
+ u64 reserved1 : 9;
+ u64 gfn : 40;
+ u64 reserved2 : 12;
+ };
+ u64 full;
+};
+
/* Used to communicate with the TDX module */
u64 __tdcall(u64 fn, struct tdx_module_args *args);
u64 __tdcall_ret(u64 fn, struct tdx_module_args *args);
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index bbb8f0bae9ba..787c359a5fc9 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -33,6 +33,7 @@
#ifndef __ASSEMBLY__
#include <uapi/asm/mce.h>
+#include <linux/kvm_types.h>
#include "tdx_global_metadata.h"
/*
@@ -143,6 +144,8 @@ struct tdx_vp {
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_sept_add(struct tdx_td *td, gfn_t gfn, int tdx_level, struct page *sept_page,
+ u64 *extended_err1, u64 *extended_err2);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a16c507296df..adb2059b6b5f 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1583,6 +1583,27 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_sept_add(struct tdx_td *td, gfn_t gfn, int tdx_level, struct page *sept_page,
+ u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = tdx_level, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(sept_page),
+ };
+ u64 ret;
+
+ tdx_clflush_page(sept_page);
+ ret = seamcall_ret(TDH_MEM_SEPT_ADD, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_sept_add);
+
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 08b01b7fe7c2..0d1ba0d0ac82 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -18,6 +18,7 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-06 19:21 ` Edgecombe, Rick P
@ 2025-01-07 6:37 ` Yan Zhao
0 siblings, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 6:37 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, Jan 07, 2025 at 03:21:40AM +0800, Edgecombe, Rick P wrote:
> On Mon, 2025-01-06 at 13:50 +0800, Yan Zhao wrote:
> > > I think we should try to keep it as simple as possible for now.
> > Yeah.
> > So, do you think we need to have tdh_mem_page_aug() to support 4K level page
> > only and ask for Dave's review again for huge page?
> >
> > Do we need to add param "level" ?
> > - if yes, "struct page" looks not fit.
> > - if not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
>
> My thoughts would be we should export just what is needed for today to keep
> things simple and speedy (skip level arg, support order 0 only), especially if
> we can drop all folio checks. The SEAMCALL wrappers will not be set in stone and
> it will be easier to review huge page required stuff in the context of already
> settled 4k support.
Ok. Attached the new diff for tdh_mem_page_aug() to support 4K only.
Have compiled and tested in my local env.
(I kept the tdx_level in tdh_mem_range_block() and tdh_mem_page_remove() in
later patches).
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 787c359a5fc9..1db93e4886df 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -144,9 +144,14 @@ struct tdx_vp {
};
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
+u64 tdh_mem_page_add(struct tdx_td *td, gfn_t gfn, struct page *private_page,
+ struct page *source_page, u64 *extended_err1, u64 *extended_err2);
u64 tdh_mem_sept_add(struct tdx_td *td, gfn_t gfn, int tdx_level, struct page *sept_page,
u64 *extended_err1, u64 *extended_err2);
u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
+struct folio;
+u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, struct page *private_page,
+ u64 *extended_err1, u64 *extended_err2);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index adb2059b6b5f..cfedff43e1e0 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1583,6 +1583,28 @@ u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page)
}
EXPORT_SYMBOL_GPL(tdh_mng_addcx);
+u64 tdh_mem_page_add(struct tdx_td *td, gfn_t gfn, struct page *private_page,
+ struct page *source_page, u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = 0, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(private_page),
+ .r9 = page_to_phys(source_page),
+ };
+ u64 ret;
+
+ tdx_clflush_page(private_page);
+ ret = seamcall_ret(TDH_MEM_PAGE_ADD, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_add);
+
u64 tdh_mem_sept_add(struct tdx_td *td, gfn_t gfn, int tdx_level, struct page *sept_page,
u64 *extended_err1, u64 *extended_err2)
{
@@ -1616,6 +1638,28 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
}
EXPORT_SYMBOL_GPL(tdh_vp_addcx);
+u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, struct page *private_page,
+ u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = 0, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ .r8 = page_to_phys(private_page),
+ };
+ u64 ret;
+
+ tdx_clflush_page(private_page);
+
+ ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 0d1ba0d0ac82..8a56e790f64d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -18,8 +18,10 @@
* TDX module SEAMCALL leaf functions
*/
#define TDH_MNG_ADDCX 1
+#define TDH_MEM_PAGE_ADD 2
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
+#define TDH_MEM_PAGE_AUG 6
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking
2025-01-03 1:28 ` Edgecombe, Rick P
@ 2025-01-07 6:40 ` Yan Zhao
0 siblings, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 6:40 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
New diffs and changelog:
SEAMCALL RFC:
- Use struct tdx_td instead of raw TDR u64.
- Use extended_err1/2 instead of rcx/rdx for output.
- Change "u64 level" to "int tdx_level".
- Change "u64 gpa" to "gfn_t gfn". (Reinette)
- Use union tdx_sept_gpa_mapping_info to initialize args.rcx in
tdh_mem_range_block(). (Reinette)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 1db93e4886df..980daa142e92 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -152,6 +152,8 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page);
struct folio;
u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, struct page *private_page,
u64 *extended_err1, u64 *extended_err2);
+u64 tdh_mem_range_block(struct tdx_td *td, gfn_t gfn, int tdx_level,
+ u64 *extended_err1, u64 *extended_err2);
u64 tdh_mng_key_config(struct tdx_td *td);
u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
@@ -165,6 +167,7 @@ u64 tdh_vp_rd(struct tdx_vp *vp, u64 field, u64 *data);
u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
+u64 tdh_mem_track(struct tdx_td *td);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
#else
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index cfedff43e1e0..120a415c1d7a 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1660,6 +1660,25 @@ u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, struct page *private_page,
}
EXPORT_SYMBOL_GPL(tdh_mem_page_aug);
+u64 tdh_mem_range_block(struct tdx_td *td, gfn_t gfn, int tdx_level,
+ u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = tdx_level, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_RANGE_BLOCK, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_range_block);
+
u64 tdh_mng_key_config(struct tdx_td *td)
{
struct tdx_module_args args = {
@@ -1833,6 +1852,16 @@ u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_reclaim);
+u64 tdh_mem_track(struct tdx_td *td)
+{
+ struct tdx_module_args args = {
+ .rcx = tdx_tdr_pa(td),
+ };
+
+ return seamcall(TDH_MEM_TRACK, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mem_track);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 8a56e790f64d..24e32c838926 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -22,6 +22,7 @@
#define TDH_MEM_SEPT_ADD 3
#define TDH_VP_ADDCX 4
#define TDH_MEM_PAGE_AUG 6
+#define TDH_MEM_RANGE_BLOCK 7
#define TDH_MNG_KEY_CONFIG 8
#define TDH_MNG_CREATE 9
#define TDH_MNG_RD 11
@@ -37,6 +38,7 @@
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
+#define TDH_MEM_TRACK 38
#define TDH_SYS_LP_INIT 35
#define TDH_SYS_TDMR_INIT 36
#define TDH_PHYMEM_CACHE_WB 40
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-03 1:36 ` Edgecombe, Rick P
@ 2025-01-07 6:43 ` Yan Zhao
2025-01-07 6:52 ` Yan Zhao
2025-01-07 22:13 ` Dave Hansen
0 siblings, 2 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 6:43 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
New diffs and changelog:
SEAMCALL RFC:
- For tdh_mem_page_remove()
a) Use struct tdx_td instead of raw TDR u64
b) Change "u64 level" to "int tdx_level".
c) Change "u64 gpa" to "gfn_t gfn". (Reinette)
d) Use union tdx_sept_gpa_mapping_info to initialize args.rcx. (Reinette)
e) Use extended_err1/2 instead of rcx/rdx for output.
- For tdh_phymem_page_wbinvd_hkid()
a) Use "struct page *" instead of raw hpa.
b) Change "u64 hkid" to "u16 hkid" (Reinette)
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 980daa142e92..be0fc55186a8 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -168,8 +168,11 @@ u64 tdh_vp_wr(struct tdx_vp *vp, u64 field, u64 data, u64 mask);
u64 tdh_vp_init_apicid(struct tdx_vp *vp, u64 initial_rcx, u32 x2apicid);
u64 tdh_phymem_page_reclaim(struct page *page, u64 *tdx_pt, u64 *tdx_owner, u64 *tdx_size);
u64 tdh_mem_track(struct tdx_td *td);
+u64 tdh_mem_page_remove(struct tdx_td *td, gfn_t gfn, int tdx_level,
+ u64 *extended_err1, u64 *extended_err2);
u64 tdh_phymem_cache_wb(bool resume);
u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td);
+u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid);
#else
static inline void tdx_init(void) { }
static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 120a415c1d7a..b4e4cfce3475 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1862,6 +1862,25 @@ u64 tdh_mem_track(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mem_track);
+u64 tdh_mem_page_remove(struct tdx_td *td, gfn_t gfn, int tdx_level,
+ u64 *extended_err1, u64 *extended_err2)
+{
+ union tdx_sept_gpa_mapping_info gpa_info = { .level = tdx_level, .gfn = gfn, };
+ struct tdx_module_args args = {
+ .rcx = gpa_info.full,
+ .rdx = tdx_tdr_pa(td),
+ };
+ u64 ret;
+
+ ret = seamcall_ret(TDH_MEM_PAGE_REMOVE, &args);
+
+ *extended_err1 = args.rcx;
+ *extended_err2 = args.rdx;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(tdh_mem_page_remove);
+
u64 tdh_phymem_cache_wb(bool resume)
{
struct tdx_module_args args = {
@@ -1882,3 +1901,12 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_tdr);
+u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid)
+{
+ struct tdx_module_args args = {};
+
+ args.rcx = page_to_phys(page) | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
+
+ return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 24e32c838926..4de17d9c2e8c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -35,6 +35,7 @@
#define TDH_PHYMEM_PAGE_RDMD 24
#define TDH_VP_RD 26
#define TDH_PHYMEM_PAGE_RECLAIM 28
+#define TDH_MEM_PAGE_REMOVE 29
#define TDH_SYS_KEY_CONFIG 31
#define TDH_SYS_INIT 33
#define TDH_SYS_RD 34
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-07 6:43 ` Yan Zhao
@ 2025-01-07 6:52 ` Yan Zhao
2025-01-07 22:13 ` Dave Hansen
1 sibling, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 6:52 UTC (permalink / raw)
To: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, Jan 07, 2025 at 02:43:11PM +0800, Yan Zhao wrote:
...
> +u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid)
> +{
> + struct tdx_module_args args = {};
> +
> + args.rcx = page_to_phys(page) | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
> +
> + return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> +}
For type of hkid is changed from u64 to u16.
Here's a fixup patch to further have tdh_phymem_page_wbinvd_tdr() in [1] and
the tdh_phymem_page_wbinvd_hkid() in this patch to use the common helper
set_hkid_to_hpa().
[1] https://lore.kernel.org/kvm/20250101074959.412696-11-pbonzini@redhat.com/
commit 41f66e12a400516c6a851f0755f8abbe4dacb39b
Author: Yan Zhao <yan.y.zhao@intel.com>
Date: Wed Dec 11 18:11:24 2024 +0800
x86/virt/tdx: Move set_hkid_to_hpa() to x86 common header and use it
Move set_hkid_to_hpa() from KVM TDX to x86 common header and have
tdh_phymem_page_wbinvd_tdr() and tdh_phymem_page_wbinvd_hkid() to use it.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 5420d07ee81c..5f3931e62c06 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -144,7 +144,13 @@ struct tdx_vp {
struct page **tdcx_pages;
};
+static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
+{
+ return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
+}
+
/* SEAMCALL wrappers for creating/destroying/running TDX guests */
+u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_vp_enter(u64 tdvpr, struct tdx_module_args *args);
u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
u64 tdh_mem_page_add(struct tdx_td *td, gfn_t gfn, struct page *private_page,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index c3a84eb4694a..d86bfcbd6873 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -222,11 +222,6 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
*/
static DEFINE_PER_CPU(struct list_head, associated_tdvcpus);
-static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
-{
- return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
-}
-
static __always_inline union vmx_exit_reason tdexit_exit_reason(struct kvm_vcpu *vcpu)
{
return (union vmx_exit_reason)(u32)(to_tdx(vcpu)->vp_enter_ret);
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 6f002e36e421..0d7a0a27bd3e 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1930,7 +1930,7 @@ u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td)
{
struct tdx_module_args args = {};
- args.rcx = tdx_tdr_pa(td) | ((u64)tdx_global_keyid << boot_cpu_data.x86_phys_bits);
+ args.rcx = set_hkid_to_hpa(tdx_tdr_pa(td), tdx_global_keyid);
return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
}
@@ -1940,7 +1940,7 @@ u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid)
{
struct tdx_module_args args = {};
- args.rcx = page_to_phys(page) | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
+ args.rcx = set_hkid_to_hpa(page_to_phys(page), hkid);
return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
}
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-01 7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
2025-01-03 18:02 ` Edgecombe, Rick P
@ 2025-01-07 7:01 ` Yan Zhao
2025-01-07 22:05 ` Dave Hansen
1 sibling, 1 reply; 48+ messages in thread
From: Yan Zhao @ 2025-01-07 7:01 UTC (permalink / raw)
To: Paolo Bonzini
Cc: linux-kernel, kvm, kai.huang, rick.p.edgecombe, dave.hansen,
Isaku Yamahata, Sean Christopherson
Here's the KVM side fixup for patches 7-11. (attached in case you want
to take a look).
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index ced4d09b2673..c3a84eb4694a 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1606,13 +1606,11 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
enum pg_level level, kvm_pfn_t pfn)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
- hpa_t hpa = pfn_to_hpa(pfn);
- gpa_t gpa = gfn_to_gpa(gfn);
+ struct page *private_page = pfn_to_page(pfn);
u64 entry, level_state;
u64 err;
- err = tdh_mem_page_aug(tdr_pa, gpa, hpa, &entry, &level_state);
+ err = tdh_mem_page_aug(&kvm_tdx->td, gfn, private_page, &entry, &level_state);
if (unlikely(err & TDX_OPERAND_BUSY)) {
tdx_unpin(kvm, pfn);
return -EBUSY;
@@ -1687,9 +1685,6 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
{
int tdx_level = pg_level_to_tdx_sept_level(level);
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
- gpa_t gpa = gfn_to_gpa(gfn);
- hpa_t hpa = pfn_to_hpa(pfn);
u64 err, entry, level_state;
/* TODO: handle large pages. */
@@ -1705,7 +1700,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
* condition with other vcpu sept operation. Race only with
* TDH.VP.ENTER.
*/
- err = tdh_mem_page_remove(tdr_pa, gpa, tdx_level, &entry,
+ err = tdh_mem_page_remove(&kvm_tdx->td, gfn, tdx_level, &entry,
&level_state);
} while (unlikely(err == TDX_ERROR_SEPT_BUSY));
@@ -1734,7 +1729,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
* this page was removed above, other thread shouldn't be
* repeatedly operating on this page. Just retry loop.
*/
- err = tdh_phymem_page_wbinvd_hkid(hpa, (u16)kvm_tdx->hkid);
+ err = tdh_phymem_page_wbinvd_hkid(pfn_to_page(pfn), kvm_tdx->hkid);
} while (unlikely(err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)));
if (KVM_BUG_ON(err, kvm)) {
@@ -1751,13 +1746,11 @@ int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
{
int tdx_level = pg_level_to_tdx_sept_level(level);
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
- gpa_t gpa = gfn_to_gpa(gfn);
- hpa_t hpa = __pa(private_spt);
+ struct page *sept_page = virt_to_page(private_spt);
u64 err, entry, level_state;
- err = tdh_mem_sept_add(tdr_pa, gpa, tdx_level, hpa, &entry,
- &level_state);
+ err = tdh_mem_sept_add(&kvm_tdx->td, gfn, tdx_level, sept_page,
+ &entry, &level_state);
if (unlikely(err == TDX_ERROR_SEPT_BUSY))
return -EAGAIN;
if (KVM_BUG_ON(err, kvm)) {
@@ -1773,14 +1766,13 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
{
int tdx_level = pg_level_to_tdx_sept_level(level);
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
- gpa_t gpa = gfn_to_gpa(gfn) & KVM_HPAGE_MASK(level);
+ gfn_t base_gfn = gfn_round_for_level(gfn, level);
u64 err, entry, level_state;
/* For now large page isn't supported yet. */
WARN_ON_ONCE(level != PG_LEVEL_4K);
- err = tdh_mem_range_block(tdr_pa, gpa, tdx_level, &entry, &level_state);
+ err = tdh_mem_range_block(&kvm_tdx->td, base_gfn, tdx_level, &entry, &level_state);
if (unlikely(err == TDX_ERROR_SEPT_BUSY))
return -EAGAIN;
if (KVM_BUG_ON(err, kvm)) {
@@ -1817,7 +1809,6 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
static void tdx_track(struct kvm *kvm)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
u64 err;
/* If TD isn't finalized, it's before any vcpu running. */
@@ -1827,7 +1818,7 @@ static void tdx_track(struct kvm *kvm)
lockdep_assert_held_write(&kvm->mmu_lock);
do {
- err = tdh_mem_track(tdr_pa);
+ err = tdh_mem_track(&kvm_tdx->td);
} while (unlikely((err & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY));
if (KVM_BUG_ON(err, kvm))
@@ -2756,7 +2747,6 @@ void tdx_flush_tlb_all(struct kvm_vcpu *vcpu)
static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
{
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
guard(mutex)(&kvm->slots_lock);
@@ -2769,7 +2759,7 @@ static int tdx_td_finalizemr(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
if (atomic64_read(&kvm_tdx->nr_premapped))
return -EINVAL;
- cmd->hw_error = tdh_mr_finalize(tdr_pa);
+ cmd->hw_error = tdh_mr_finalize(&kvm_tdx->td);
if ((cmd->hw_error & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY)
return -EAGAIN;
if (KVM_BUG_ON(cmd->hw_error, kvm)) {
@@ -3029,7 +3019,6 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
{
u64 error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS;
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
- hpa_t tdr_pa = page_to_phys(kvm_tdx->td.tdr_page);
struct tdx_gmem_post_populate_arg *arg = _arg;
struct kvm_vcpu *vcpu = arg->vcpu;
gpa_t gpa = gfn_to_gpa(gfn);
@@ -3068,9 +3057,8 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
ret = 0;
do {
- err = tdh_mem_page_add(tdr_pa, gpa, pfn_to_hpa(pfn),
- pfn_to_hpa(page_to_pfn(page)),
- &entry, &level_state);
+ err = tdh_mem_page_add(&kvm_tdx->td, gfn, pfn_to_page(pfn),
+ page, &entry, &level_state);
} while (err == TDX_ERROR_SEPT_BUSY);
if (err) {
ret = -EIO;
@@ -3082,7 +3070,7 @@ static int tdx_gmem_post_populate(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
if (arg->flags & KVM_TDX_MEASURE_MEMORY_REGION) {
for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) {
- err = tdh_mr_extend(tdr_pa, gpa + i, &entry,
+ err = tdh_mr_extend(&kvm_tdx->td, gpa + i, &entry,
&level_state);
if (err) {
ret = -EIO;
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-02 21:59 ` Edgecombe, Rick P
2025-01-07 5:27 ` Yan Zhao
@ 2025-01-07 19:48 ` Dave Hansen
2025-01-08 1:12 ` Yan Zhao
1 sibling, 1 reply; 48+ messages in thread
From: Dave Hansen @ 2025-01-07 19:48 UTC (permalink / raw)
To: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/2/25 13:59, Edgecombe, Rick P wrote:
> union tdx_sept_gpa_mapping_info {
> struct {
> u64 level : 3;
> u64 reserved1 : 9;
> u64 gfn : 40;
> u64 reserved2 : 12;
> };
> u64 full;
> };
This is functionally OK, but seeing bitfields on a value that's probably
going to get shifted around makes me nervous because of:
> https://lore.kernel.org/lkml/20231111020019.553664-1-michael.roth@amd.com/
I wouldn't NAK it just for this, but it's also not how I would code it up.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-07 7:01 ` Yan Zhao
@ 2025-01-07 22:05 ` Dave Hansen
0 siblings, 0 replies; 48+ messages in thread
From: Dave Hansen @ 2025-01-07 22:05 UTC (permalink / raw)
To: Yan Zhao, Paolo Bonzini
Cc: linux-kernel, kvm, kai.huang, rick.p.edgecombe, dave.hansen,
Isaku Yamahata, Sean Christopherson
On 1/6/25 23:01, Yan Zhao wrote:
> Here's the KVM side fixup for patches 7-11. (attached in case you want
> to take a look).
Note that this removes a few lines of code without hurting readability.
That's a sign that the changes were in the right direction.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-07 6:43 ` Yan Zhao
2025-01-07 6:52 ` Yan Zhao
@ 2025-01-07 22:13 ` Dave Hansen
2025-01-08 0:41 ` Yan Zhao
1 sibling, 1 reply; 48+ messages in thread
From: Dave Hansen @ 2025-01-07 22:13 UTC (permalink / raw)
To: Yan Zhao, Edgecombe, Rick P
Cc: kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/6/25 22:43, Yan Zhao wrote:
> +u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid)
> +{
> + struct tdx_module_args args = {};
> +
> + args.rcx = page_to_phys(page) | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
I've seen this idiom enough times. You need a helper:
u64 mk_keyed_paddr(struct page *page, u64 keyid)
{
u64 ret;
ret = page_to_phys(page);
/* KeyID bits are just above the physical address bits: */
ret |= keyid << boot_cpu_data.x86_phys_bits;
return ret;
}
Although I'm also debating a bit what the typing on 'keyid' should be.
Right now it's quite tied to the physical address width, but that's not
fundamental to TDX. It could absolutely change in the future.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-07 22:13 ` Dave Hansen
@ 2025-01-08 0:41 ` Yan Zhao
2025-01-08 16:31 ` Dave Hansen
0 siblings, 1 reply; 48+ messages in thread
From: Yan Zhao @ 2025-01-08 0:41 UTC (permalink / raw)
To: Dave Hansen
Cc: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, Jan 07, 2025 at 02:13:19PM -0800, Dave Hansen wrote:
> On 1/6/25 22:43, Yan Zhao wrote:
> > +u64 tdh_phymem_page_wbinvd_hkid(struct page *page, u16 hkid)
> > +{
> > + struct tdx_module_args args = {};
> > +
> > + args.rcx = page_to_phys(page) | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
>
> I've seen this idiom enough times. You need a helper:
>
> u64 mk_keyed_paddr(struct page *page, u64 keyid)
> {
> u64 ret;
>
> ret = page_to_phys(page);
> /* KeyID bits are just above the physical address bits: */
> ret |= keyid << boot_cpu_data.x86_phys_bits;
>
> return ret;
> }
Thanks. There's actually a helper at [1].
I made the helper in [1] as a fixup patch since it needs to be applied to both
tdh_phymem_page_wbinvd_tdr() and tdh_phymem_page_wbinvd_hkid().
[1] https://lore.kernel.org/all/Z3zPSPrFtxM7l5cD@yzhao56-desk.sh.intel.com/
>
> Although I'm also debating a bit what the typing on 'keyid' should be.
> Right now it's quite tied to the physical address width, but that's not
> fundamental to TDX. It could absolutely change in the future.
Changing from "u64" to "u16" was raised during our internal review.
The main reasons for this change are to avoid overflow and also because:
- In MSR IA32_TME_ACTIVATE, MK_TME_KEYID_BITS and TDX_RESERVED_KEYID_BITS are
only defined with 4 bits, so at most 16 bits in HPA can be configured as
KeyIDs.
- TDX spec explictly requires KeyID to be 16 bits for SEAMCALLs TDH.SYS.CONFIG
and TDH.MNG.CREATE.
- Corresponding variables for tdx_global_keyid, tdx_guest_keyid_start,
tdx_nr_guest_keyids, nr_tdx_keyids, tdx_keyid_start are all u16 in TDX module
code.
There is a proposed fix to change the type of KeyID to u16 as shown below (not
yet split and sent out). Do you think this change to u16 makes sense?
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 296d123a6c1c..5f3931e62c06 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -122,7 +122,7 @@ const char *tdx_dump_mce_info(struct mce *m);
const struct tdx_sys_info *tdx_get_sysinfo(void);
int tdx_guest_keyid_alloc(void);
-void tdx_guest_keyid_free(unsigned int keyid);
+void tdx_guest_keyid_free(u16 keyid);
struct tdx_td {
/* TD root structure: */
@@ -164,7 +164,7 @@ u64 tdh_mem_page_aug(struct tdx_td *td, gfn_t gfn, struct page *private_page,
u64 tdh_mem_range_block(struct tdx_td *td, gfn_t gfn, int tdx_level,
u64 *extended_err1, u64 *extended_err2);
u64 tdh_mng_key_config(struct tdx_td *td);
-u64 tdh_mng_create(struct tdx_td *td, u64 hkid);
+u64 tdh_mng_create(struct tdx_td *td, u16 hkid);
u64 tdh_vp_create(struct tdx_td *td, struct tdx_vp *vp);
u64 tdh_mng_rd(struct tdx_td *td, u64 field, u64 *data);
u64 tdh_mr_extend(struct tdx_td *td, gpa_t gpa, u64 *extended_err1, u64 *extended_err2);
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index d86bfcbd6873..4e7330df4e32 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -308,13 +308,13 @@ static __always_inline void tdvmcall_set_return_val(struct kvm_vcpu *vcpu,
static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx)
{
tdx_guest_keyid_free(kvm_tdx->hkid);
- kvm_tdx->hkid = -1;
+ kvm_tdx->hkid_assigned = false;
atomic_dec(&nr_configured_hkid);
}
static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx)
{
- return kvm_tdx->hkid > 0;
+ return kvm_tdx->hkid_assigned;
}
static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
@@ -2354,7 +2354,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
ret = tdx_guest_keyid_alloc();
if (ret < 0)
return ret;
- kvm_tdx->hkid = ret;
+ kvm_tdx->hkid = (u16)ret;
+ kvm_tdx->hkid_assigned = true;
atomic_inc(&nr_configured_hkid);
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index b07ca9ecaf95..67b44e98bf49 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -29,7 +29,8 @@ struct kvm_tdx {
u64 attributes;
u64 xfam;
- int hkid;
+ u16 hkid;
+ bool hkid_assigned;
u64 tsc_offset;
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 1986d360e9b7..5dd83e59c5b6 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -40,9 +40,9 @@
#include <asm/mce.h>
#include "tdx.h"
-static u32 tdx_global_keyid __ro_after_init;
-static u32 tdx_guest_keyid_start __ro_after_init;
-static u32 tdx_nr_guest_keyids __ro_after_init;
+static u16 tdx_global_keyid __ro_after_init;
+static u16 tdx_guest_keyid_start __ro_after_init;
+static u16 tdx_nr_guest_keyids __ro_after_init;
static DEFINE_IDA(tdx_guest_keyid_pool);
@@ -979,7 +979,7 @@ static int construct_tdmrs(struct list_head *tmb_list,
return ret;
}
-static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid)
+static int config_tdx_module(struct tdmr_info_list *tdmr_list, u16 global_keyid)
{
struct tdx_module_args args = {};
u64 *tdmr_pa_array;
@@ -1375,8 +1375,8 @@ const char *tdx_dump_mce_info(struct mce *m)
return "TDX private memory error. Possible kernel bug.";
}
-static __init int record_keyid_partitioning(u32 *tdx_keyid_start,
- u32 *nr_tdx_keyids)
+static __init int record_keyid_partitioning(u16 *tdx_keyid_start,
+ u16 *nr_tdx_keyids)
{
u32 _nr_mktme_keyids, _tdx_keyid_start, _nr_tdx_keyids;
int ret;
@@ -1394,6 +1394,11 @@ static __init int record_keyid_partitioning(u32 *tdx_keyid_start,
/* TDX KeyIDs start after the last MKTME KeyID. */
_tdx_keyid_start = _nr_mktme_keyids + 1;
+ /*
+ * In IA32_TME_ACTIVATE, MK_TME_KEYID_BITS and TDX_RESERVED_KEYID_BITS
+ * are only 4 bits, so at most 16 bits in HPA can be configured as
+ * KeyIDs.
+ */
*tdx_keyid_start = _tdx_keyid_start;
*nr_tdx_keyids = _nr_tdx_keyids;
@@ -1467,7 +1472,7 @@ static void __init check_tdx_erratum(void)
void __init tdx_init(void)
{
- u32 tdx_keyid_start, nr_tdx_keyids;
+ u16 tdx_keyid_start, nr_tdx_keyids;
int err;
err = record_keyid_partitioning(&tdx_keyid_start, &nr_tdx_keyids);
@@ -1544,7 +1549,7 @@ int tdx_guest_keyid_alloc(void)
}
EXPORT_SYMBOL_GPL(tdx_guest_keyid_alloc);
-void tdx_guest_keyid_free(unsigned int keyid)
+void tdx_guest_keyid_free(u16 keyid)
{
ida_free(&tdx_guest_keyid_pool, keyid);
}
@@ -1697,7 +1702,7 @@ u64 tdh_mng_key_config(struct tdx_td *td)
}
EXPORT_SYMBOL_GPL(tdh_mng_key_config);
-u64 tdh_mng_create(struct tdx_td *td, u64 hkid)
+u64 tdh_mng_create(struct tdx_td *td, u16 hkid)
{
struct tdx_module_args args = {
.rcx = tdx_tdr_pa(td),
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-07 19:48 ` Dave Hansen
@ 2025-01-08 1:12 ` Yan Zhao
2025-01-08 1:20 ` Dave Hansen
0 siblings, 1 reply; 48+ messages in thread
From: Yan Zhao @ 2025-01-08 1:12 UTC (permalink / raw)
To: Dave Hansen
Cc: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, Jan 07, 2025 at 11:48:12AM -0800, Dave Hansen wrote:
> On 1/2/25 13:59, Edgecombe, Rick P wrote:
> > union tdx_sept_gpa_mapping_info {
> > struct {
> > u64 level : 3;
> > u64 reserved1 : 9;
> > u64 gfn : 40;
> > u64 reserved2 : 12;
> > };
> > u64 full;
> > };
>
> This is functionally OK, but seeing bitfields on a value that's probably
> going to get shifted around makes me nervous because of:
This is defined according to the TDX spec.
e.g. in TDH.MEM.SEPT.ADD:
RCX | EPT mapping information:
----|---------------------------------------------------------------------------
| Bits | Name | Description
|------|----------|---------------------------------------------------------
| 2:0 | Level | Level of the non-leaf Secure EPT entry that will map the
| | | new Secure EPT page - see 3.6.1
| | | Level must between 1 and 3 for a 4-level EPT or between
| | | 1 and 4 for a 5-level EPT.
|------|----------|---------------------------------------------------------
| 11:3 | Reserved | Reserved: must be 0
|------|----------|---------------------------------------------------------
| 51:12| GPA | Bits 51:12 of the guest physical address of to be mapped
| | | for the new Secure EPT page Depending on the level, the
| | | following least significant bits must be 0:
| | | Level 1 (EPT): Bits 20:12
| | | Level 2 (EPD): Bits 29:12
| | | Level 3 (EPDPT): Bits 38:12
| | | Level 4 (EPML4): Bits 47:12
|------|----------|---------------------------------------------------------
| 63:52| Reserved | Reserved: must be 0
So, why does this bitfields definition make things worse?
> > https://lore.kernel.org/lkml/20231111020019.553664-1-michael.roth@amd.com/
> I wouldn't NAK it just for this, but it's also not how I would code it up.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-08 1:12 ` Yan Zhao
@ 2025-01-08 1:20 ` Dave Hansen
2025-01-08 1:43 ` Edgecombe, Rick P
0 siblings, 1 reply; 48+ messages in thread
From: Dave Hansen @ 2025-01-08 1:20 UTC (permalink / raw)
To: Yan Zhao
Cc: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/7/25 17:12, Yan Zhao wrote:
> So, why does this bitfields definition make things worse?
Look at the kernel page table management. Why don't we use bitfields for
_that_? Look at the link I sent. Bitfields can cause some really goofy
unexpected behavior if you pass them around like they were a full type.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-08 1:20 ` Dave Hansen
@ 2025-01-08 1:43 ` Edgecombe, Rick P
2025-01-08 2:14 ` Yan Zhao
0 siblings, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-08 1:43 UTC (permalink / raw)
To: Hansen, Dave, Zhao, Yan Y
Cc: Yamahata, Isaku, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com
On Tue, 2025-01-07 at 17:20 -0800, Dave Hansen wrote:
> On 1/7/25 17:12, Yan Zhao wrote:
> > So, why does this bitfields definition make things worse?
>
> Look at the kernel page table management. Why don't we use bitfields for
> _that_? Look at the link I sent. Bitfields can cause some really goofy
> unexpected behavior if you pass them around like they were a full type.
Huh, so this enum is unsafe for reading out the individual fields because if
shifting them, it will perform the shift with the size of the source bit field
size. It is safe in the way it is being used in these patches, which is to
encode a u64. But if we ever started to use tdx_sept_gpa_mapping_info to process
output from a SEAMCALL, or something, we could set ourselves up for the same
problem as the SEV bug.
Let's not open code the encoding in each SEAMCALL though. What about replacing
it with just a helper that encodes the u64 gpa from two args: gfn and tdx_level.
We could add some specific over-size behavior for the fields, but I'd think it
would be ok to keep it simple. Maybe something like this:
static u64 encode_gpa_mapping_info(gfn_t gfn, unsigned int tdx_level)
{
u64 val = 0;
val |= level;
val |= gfn << TDX_MAPPING_INFO_GFN_SHIFT;
return val;
}
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages
2025-01-08 1:43 ` Edgecombe, Rick P
@ 2025-01-08 2:14 ` Yan Zhao
0 siblings, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-08 2:14 UTC (permalink / raw)
To: Edgecombe, Rick P
Cc: Hansen, Dave, Yamahata, Isaku, kvm@vger.kernel.org,
pbonzini@redhat.com, linux-kernel@vger.kernel.org,
sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com
On Wed, Jan 08, 2025 at 09:43:37AM +0800, Edgecombe, Rick P wrote:
> On Tue, 2025-01-07 at 17:20 -0800, Dave Hansen wrote:
> > On 1/7/25 17:12, Yan Zhao wrote:
> > > So, why does this bitfields definition make things worse?
> >
> > Look at the kernel page table management. Why don't we use bitfields for
> > _that_? Look at the link I sent. Bitfields can cause some really goofy
> > unexpected behavior if you pass them around like they were a full type.
>
> Huh, so this enum is unsafe for reading out the individual fields because if
> shifting them, it will perform the shift with the size of the source bit field
> size. It is safe in the way it is being used in these patches, which is to
> encode a u64. But if we ever started to use tdx_sept_gpa_mapping_info to process
> output from a SEAMCALL, or something, we could set ourselves up for the same
> problem as the SEV bug.
Thanks for the explanation!
Sorry that I didn't clearly explain the usage to Dave.
> Let's not open code the encoding in each SEAMCALL though. What about replacing
> it with just a helper that encodes the u64 gpa from two args: gfn and tdx_level.
> We could add some specific over-size behavior for the fields, but I'd think it
> would be ok to keep it simple. Maybe something like this:
>
> static u64 encode_gpa_mapping_info(gfn_t gfn, unsigned int tdx_level)
> {
> u64 val = 0;
>
> val |= level;
> val |= gfn << TDX_MAPPING_INFO_GFN_SHIFT;
>
> return val;
> }
That's a clever alternative :)
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-08 0:41 ` Yan Zhao
@ 2025-01-08 16:31 ` Dave Hansen
2025-01-09 2:19 ` Yan Zhao
0 siblings, 1 reply; 48+ messages in thread
From: Dave Hansen @ 2025-01-08 16:31 UTC (permalink / raw)
To: Yan Zhao
Cc: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/7/25 16:41, Yan Zhao wrote:
> There is a proposed fix to change the type of KeyID to u16 as shown below (not
> yet split and sent out). Do you think this change to u16 makes sense?
I just think that the concept of a KeyID and the current implementation
on today's hardware are different things. Don't confuse an
implementation with the _concept_.
It can also make a lot of sense to pass around a 16-bit value in an
'int' in some cases. Think about NUMA nodes. You can't have negative
NUMA nodes in hardware, but we use 'int' in the kernel everywhere
because NUMA_NO_NODE gets passed around a lot.
Anyway, my point is that the underlying hardware types stop having
meaning at _some_ level of abstraction in the interfaces.
I'd personally probably just keep 'hkid' as an int everywhere until the
point where it gets shoved into the TDX module ABI.
Oh, and casts like this:
> static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
> @@ -2354,7 +2354,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
> ret = tdx_guest_keyid_alloc();
> if (ret < 0)
> return ret;
> - kvm_tdx->hkid = ret;
> + kvm_tdx->hkid = (u16)ret;
> + kvm_tdx->hkid_assigned = true;
are a bit silly, don't you think?
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page
2025-01-08 16:31 ` Dave Hansen
@ 2025-01-09 2:19 ` Yan Zhao
0 siblings, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-09 2:19 UTC (permalink / raw)
To: Dave Hansen
Cc: Edgecombe, Rick P, kvm@vger.kernel.org, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, Jan 08, 2025 at 08:31:14AM -0800, Dave Hansen wrote:
> On 1/7/25 16:41, Yan Zhao wrote:
> > There is a proposed fix to change the type of KeyID to u16 as shown below (not
> > yet split and sent out). Do you think this change to u16 makes sense?
>
> I just think that the concept of a KeyID and the current implementation
> on today's hardware are different things. Don't confuse an
> implementation with the _concept_.
>
> It can also make a lot of sense to pass around a 16-bit value in an
> 'int' in some cases. Think about NUMA nodes. You can't have negative
> NUMA nodes in hardware, but we use 'int' in the kernel everywhere
> because NUMA_NO_NODE gets passed around a lot.
>
> Anyway, my point is that the underlying hardware types stop having
> meaning at _some_ level of abstraction in the interfaces.
Thanks for explaining the reasoning behind the preference to "int" and pointing
to the example of NUMA nodes.
It helps a lot for me to understand the underlying principle in kernel design!
Regarding the TDX hkid, do we need a similar check for the hkid (as that for
NUMA nodes) to avoid unexpected SEAMCALL error or overflow?
static inline bool numa_valid_node(int nid)
{
return nid >= 0 && nid < MAX_NUMNODES;
}
> I'd personally probably just keep 'hkid' as an int everywhere until the
> point where it gets shoved into the TDX module ABI.
>
> Oh, and casts like this:
>
> > static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu)
> > @@ -2354,7 +2354,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params,
> > ret = tdx_guest_keyid_alloc();
> > if (ret < 0)
> > return ret;
> > - kvm_tdx->hkid = ret;
> > + kvm_tdx->hkid = (u16)ret;
> > + kvm_tdx->hkid_assigned = true;
>
> are a bit silly, don't you think?
Agreed. That's the part I don't like about this fix.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-03 18:02 ` Edgecombe, Rick P
@ 2025-01-14 22:03 ` Paolo Bonzini
2025-01-14 22:10 ` Dave Hansen
2025-01-15 1:24 ` Yan Zhao
0 siblings, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-14 22:03 UTC (permalink / raw)
To: Edgecombe, Rick P, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/3/25 19:02, Edgecombe, Rick P wrote:
>> +u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx)
>
> gpa should be type gpa_t to avoid bare u64 types.
gpa_t is defined in kvm_types.h, I am not sure arch/x86/virt should
include it.
Paolo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-14 22:03 ` Paolo Bonzini
@ 2025-01-14 22:10 ` Dave Hansen
2025-01-15 1:24 ` Yan Zhao
1 sibling, 0 replies; 48+ messages in thread
From: Dave Hansen @ 2025-01-14 22:10 UTC (permalink / raw)
To: Paolo Bonzini, Edgecombe, Rick P, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Cc: Zhao, Yan Y, sean.j.christopherson@intel.com, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/14/25 14:03, Paolo Bonzini wrote:
> On 1/3/25 19:02, Edgecombe, Rick P wrote:
>>> +u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx)
>>
>> gpa should be type gpa_t to avoid bare u64 types.
>
> gpa_t is defined in kvm_types.h, I am not sure arch/x86/virt should
> include it.
Yeah, _ideally_ we'd just keep it one type across the stack, but I can
understand why the KVM folks wouldn't want it to leave KVM-specific code.
My biggest beef elsewhere was with the indiscriminate use of u64. Using
it for passing a gpa which is also 64-bits is far from indiscriminate.
It's fine.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-06 5:50 ` Yan Zhao
2025-01-06 19:21 ` Edgecombe, Rick P
@ 2025-01-14 23:32 ` Paolo Bonzini
2025-01-15 0:49 ` Edgecombe, Rick P
2025-01-15 5:49 ` Yan Zhao
1 sibling, 2 replies; 48+ messages in thread
From: Paolo Bonzini @ 2025-01-14 23:32 UTC (permalink / raw)
To: Yan Zhao, Edgecombe, Rick P
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On 1/6/25 06:50, Yan Zhao wrote:
> Yeah.
> So, do you think we need to have tdh_mem_page_aug() to support 4K level page
> only and ask for Dave's review again for huge page?
You're right that TDH.MEM.PAGE.AUG is basically the only case in which a
struct folio is involved; on the other hand that also means that the
arch/x86/virt part of large page support will be tiny and I don't think
it will be a problem to review it again (for either Dave or myself).
> Do we need to add param "level" ?
> - if yes, "struct page" looks not fit.
Maybe, but I think adding folio knowledge now would be a bit too
hypothetical.
> - if not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
I think it makes sense to add "int level" now everywhere, even if it is
just to match the SEPT API and to have the same style for computing the
SEAMCALL arguments. I'd rather keep the arguments simple with just "gpa
| level" (i.e. gpa/level instead of gfn/level) as the computation:
that's because gpa is more obviously a u64.
I've pushed to kvm-coco-queue; if you have some time to double check
what I did that's great, otherwise if I don't hear from you I'll post
around noon European time the v3 of this series.
I have also asked Amazon, since they use KVM without struct page,
whether it is a problem to have struct page pervasively in the API and
they don't care.
Paolo
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-14 23:32 ` Paolo Bonzini
@ 2025-01-15 0:49 ` Edgecombe, Rick P
2025-01-15 2:02 ` Edgecombe, Rick P
2025-01-15 5:49 ` Yan Zhao
1 sibling, 1 reply; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-15 0:49 UTC (permalink / raw)
To: pbonzini@redhat.com, Zhao, Yan Y
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, 2025-01-15 at 00:32 +0100, Paolo Bonzini wrote:
> I've pushed to kvm-coco-queue; if you have some time to double check
> what I did that's great, otherwise if I don't hear from you I'll post
> around noon European time the v3 of this series.
Level handling and gpa encoding looks fine to me.
I get a build error:
In file included from /home/rpedgeco/repos/linux/include/asm-
generic/memory_model.h:5,
from arch/x86/include/asm/page.h:89,
from linux/arch/x86/include/asm/processor.h:20,
from linux/arch/x86/include/asm/timex.h:5,
from linux/include/linux/timex.h:67,
from linux/include/linux/clocksource.h:13,
from linux/include/linux/clockchips.h:14,
from linux/arch/x86/kernel/i8253.c:6:
linux/arch/x86/include/asm/tdx.h: In function ‘mk_keyed_paddr’:
linux/include/asm-generic/memory_model.h:38:58: error: ‘vmemmap’ undeclared
(first use in this function); did you mean ‘mem_map’?
38 | #define __page_to_pfn(page) (unsigned long)((page) - vmemmap)
| ^~~~~~~
...and needed this:
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 201f2e910411..a8a6fbd7bf71 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -35,6 +35,7 @@
#include <uapi/asm/mce.h>
#include <asm/tdx_global_metadata.h>
+#include <linux/pgtable.h>
/*
* Used by the #VE exception handler to gather the #VE exception
Nit, whitespace errors:
+static inline u64 mk_keyed_paddr(u16 hkid, struct page *page)
+{
+ u64 ret;
+
+ ret = page_to_phys(page);
+ /* KeyID bits are just above the physical address bits: */
+ ret |= hkid << boot_cpu_data.x86_phys_bits;
+
^ extra tab here
+ return ret;
+}
+
+static inline int pg_level_to_tdx_sept_level(enum pg_level level)
+{
+ WARN_ON_ONCE(level == PG_LEVEL_NONE);
+ return level - 1;
+ ^ spaces instead of tabs
+
+}
Lastly, TD's are not booting for me, with a QEMU error. Still debugging this.
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents
2025-01-14 22:03 ` Paolo Bonzini
2025-01-14 22:10 ` Dave Hansen
@ 2025-01-15 1:24 ` Yan Zhao
1 sibling, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-15 1:24 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Edgecombe, Rick P, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, sean.j.christopherson@intel.com,
Huang, Kai, dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, Jan 14, 2025 at 11:03:06PM +0100, Paolo Bonzini wrote:
> On 1/3/25 19:02, Edgecombe, Rick P wrote:
> > > +u64 tdh_mr_extend(struct tdx_td *td, u64 gpa, u64 *rcx, u64 *rdx)
> >
> > gpa should be type gpa_t to avoid bare u64 types.
>
> gpa_t is defined in kvm_types.h, I am not sure arch/x86/virt should include
> it.
>
Hmm, it's following
https://lore.kernel.org/kvm/e00c6169-802b-452b-939d-49ce5622c816@intel.com.
But if Dave is fine, it's ok.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-15 0:49 ` Edgecombe, Rick P
@ 2025-01-15 2:02 ` Edgecombe, Rick P
0 siblings, 0 replies; 48+ messages in thread
From: Edgecombe, Rick P @ 2025-01-15 2:02 UTC (permalink / raw)
To: pbonzini@redhat.com, Zhao, Yan Y
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Tue, 2025-01-14 at 16:49 -0800, Rick Edgecombe wrote:
> Lastly, TD's are not booting for me, with a QEMU error. Still debugging this.
I was able to boot kvm-coco-queue with the below two fixes. I'll glue the TDX
tests back on tomorrow and test further. Regarding the mk_keyed_paddr()
cast/shift issue, Dave had expressed a preference for int over u16 for keyids:
https://lore.kernel.org/kvm/3a32ce4a-b108-4f06-a22d-14e9c2e135f7@intel.com/
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 201f2e910411..e83f4bac6e9a 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -149,7 +149,7 @@ static inline u64 mk_keyed_paddr(u16 hkid, struct page
*page)
ret = page_to_phys(page);
/* KeyID bits are just above the physical address bits: */
- ret |= hkid << boot_cpu_data.x86_phys_bits;
+ ret |= (u64)hkid << boot_cpu_data.x86_phys_bits;
return ret;
}
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index df6928a62e2d..307b6ee083d0 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2307,7 +2307,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params
*td_params,
struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
cpumask_var_t packages;
struct page **tdcs_pages = NULL;
- struct page *page, *tdr_page;
+ struct page *tdr_page;
int ret, i;
u64 err, rcx;
@@ -2333,7 +2333,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_params
*td_params,
for (i = 0; i < kvm_tdx->td.tdcs_nr_pages; i++) {
tdcs_pages[i] = alloc_page(GFP_KERNEL);
- if (!page)
+ if (!tdcs_pages[i])
goto free_tdcs;
}
^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages
2025-01-14 23:32 ` Paolo Bonzini
2025-01-15 0:49 ` Edgecombe, Rick P
@ 2025-01-15 5:49 ` Yan Zhao
1 sibling, 0 replies; 48+ messages in thread
From: Yan Zhao @ 2025-01-15 5:49 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Edgecombe, Rick P, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, Huang, Kai,
dave.hansen@linux.intel.com, Yamahata, Isaku
On Wed, Jan 15, 2025 at 12:32:24AM +0100, Paolo Bonzini wrote:
> On 1/6/25 06:50, Yan Zhao wrote:
> > Yeah.
> > So, do you think we need to have tdh_mem_page_aug() to support 4K level page
> > only and ask for Dave's review again for huge page?
>
> You're right that TDH.MEM.PAGE.AUG is basically the only case in which a
> struct folio is involved; on the other hand that also means that the
> arch/x86/virt part of large page support will be tiny and I don't think it
> will be a problem to review it again (for either Dave or myself).
>
> > Do we need to add param "level" ?
> > - if yes, "struct page" looks not fit.
>
> Maybe, but I think adding folio knowledge now would be a bit too
> hypothetical.
>
> > - if not, hardcode it as 0 in the wrapper and convert "pfn" to "struct page"?
>
> I think it makes sense to add "int level" now everywhere, even if it is just
> to match the SEPT API and to have the same style for computing the SEAMCALL
> arguments. I'd rather keep the arguments simple with just "gpa | level"
> (i.e. gpa/level instead of gfn/level) as the computation: that's because gpa
> is more obviously a u64.
>
> I've pushed to kvm-coco-queue; if you have some time to double check what I
> did that's great, otherwise if I don't hear from you I'll post around noon
> European time the v3 of this series.
For tdh_mem_sept_add(), tdh_mem_page_aug(), tdh_mem_page_add()
- Use tdx_clflush_page() instead of invoking clflush_cache_range() directly to
share the common comment of tdx_clflush_page().
- prefer page_to_phy() over page_to_pfn()?
https://lore.kernel.org/kvm/0070a616-5233-4a8d-8797-eb9f182f074d@intel.com/
4d0824a1daba ("x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages")
contains unexpected changes to tdh_mem_sept_add().
u64 tdh_mem_sept_add(struct tdx_td *td, u64 gpa, int level, struct page *page, u64 *ext_err1, u64 *ext_err2)
{
+ u64 hpa = page_to_pfn(page) << PAGE_SHIFT;
struct tdx_module_args args = {
.rcx = gpa | level,
.rdx = tdx_tdr_pa(td),
- .r8 = page_to_pfn(page) << PAGE_SHIFT,
+ .r8 = hpa,
};
u64 ret;
- clflush_cache_range(page_to_virt(page), PAGE_SIZE);
+ clflush_cache_range(__va(hpa), PAGE_SIZE);
ret = seamcall_ret(TDH_MEM_SEPT_ADD, &args);
*ext_err1 = args.rcx;
@@ -1522,6 +1544,26 @@ u64 tdh_vp_addcx(struct tdx_vp *vp, struct page *tdcx_page)
}
EXPORT_SYMBOL_GPL(tdh_vp_addcx);
>
> I have also asked Amazon, since they use KVM without struct page, whether it
> is a problem to have struct page pervasively in the API and they don't care.
>
> Paolo
>
^ permalink raw reply [flat|nested] 48+ messages in thread
end of thread, other threads:[~2025-01-15 5:51 UTC | newest]
Thread overview: 48+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-01 7:49 [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Paolo Bonzini
2025-01-01 7:49 ` [PATCH 01/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management Paolo Bonzini
2025-01-02 19:44 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 02/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX TD creation Paolo Bonzini
2025-01-03 17:51 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 03/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX vCPU creation Paolo Bonzini
2025-01-01 7:49 ` [PATCH 04/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX page cache management Paolo Bonzini
2025-01-01 7:49 ` [PATCH 05/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX VM/vCPU field access Paolo Bonzini
2025-01-01 7:49 ` [PATCH 06/13] x86/virt/tdx: Add SEAMCALL wrappers for TDX flush operations Paolo Bonzini
2025-01-01 7:49 ` [PATCH 07/13] x86/virt/tdx: Add SEAMCALL wrapper tdh_mem_sept_add() to add SEPT pages Paolo Bonzini
2025-01-02 21:59 ` Edgecombe, Rick P
2025-01-07 5:27 ` Yan Zhao
2025-01-07 19:48 ` Dave Hansen
2025-01-08 1:12 ` Yan Zhao
2025-01-08 1:20 ` Dave Hansen
2025-01-08 1:43 ` Edgecombe, Rick P
2025-01-08 2:14 ` Yan Zhao
2025-01-01 7:49 ` [PATCH 08/13] x86/virt/tdx: Add SEAMCALL wrappers to add TD private pages Paolo Bonzini
2025-01-02 23:32 ` Edgecombe, Rick P
2025-01-06 5:50 ` Yan Zhao
2025-01-06 19:21 ` Edgecombe, Rick P
2025-01-07 6:37 ` Yan Zhao
2025-01-14 23:32 ` Paolo Bonzini
2025-01-15 0:49 ` Edgecombe, Rick P
2025-01-15 2:02 ` Edgecombe, Rick P
2025-01-15 5:49 ` Yan Zhao
2025-01-01 7:49 ` [PATCH 09/13] x86/virt/tdx: Add SEAMCALL wrappers to manage TDX TLB tracking Paolo Bonzini
2025-01-03 1:28 ` Edgecombe, Rick P
2025-01-07 6:40 ` Yan Zhao
2025-01-01 7:49 ` [PATCH 10/13] x86/virt/tdx: Add SEAMCALL wrappers to remove a TD private page Paolo Bonzini
2025-01-03 1:36 ` Edgecombe, Rick P
2025-01-07 6:43 ` Yan Zhao
2025-01-07 6:52 ` Yan Zhao
2025-01-07 22:13 ` Dave Hansen
2025-01-08 0:41 ` Yan Zhao
2025-01-08 16:31 ` Dave Hansen
2025-01-09 2:19 ` Yan Zhao
2025-01-01 7:49 ` [PATCH 11/13] x86/virt/tdx: Add SEAMCALL wrappers for TD measurement of initial contents Paolo Bonzini
2025-01-03 18:02 ` Edgecombe, Rick P
2025-01-14 22:03 ` Paolo Bonzini
2025-01-14 22:10 ` Dave Hansen
2025-01-15 1:24 ` Yan Zhao
2025-01-07 7:01 ` Yan Zhao
2025-01-07 22:05 ` Dave Hansen
2025-01-01 7:49 ` [PATCH 12/13] x86/virt/tdx: Read essential global metadata for KVM Paolo Bonzini
2025-01-03 18:26 ` Edgecombe, Rick P
2025-01-01 7:49 ` [PATCH 13/13] x86/virt/tdx: Add tdx_guest_keyid_alloc/free() to alloc and free TDX guest KeyID Paolo Bonzini
2025-01-02 19:43 ` [PATCH v2 00/13] x86/virt/tdx: Add SEAMCALL wrappers for KVM Edgecombe, Rick P
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox