kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
@ 2024-10-14 11:31 Kai Huang
  2024-10-14 11:31 ` [PATCH v5 1/8] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

TL;DR:

This series does necessary tweaks to TDX host "global metadata" reading
code to fix some immediate issues in the TDX module initialization code,
with intention to also provide a flexible code base to support sharing
global metadata to KVM (and other kernel components) for future needs.

This series, and additional patches to initialize TDX when loading KVM
module and read essential metadata fields for KVM TDX can be found at
[1].

Hi Dave (and maintainers),

This series targets x86 tip.  Also add Dan, KVM maintainers and KVM list
so people can also review and comment.

This is a pre-work of the "quite near future" KVM TDX support (see the
kvm-coco-queue branch [2], which already includes all the patches in the
v2 of this series).  I appreciate if you can review, comment and take
this series if the patches look good to you.

v4 -> v5:
  - Change the build_sysmd_read(_size) macro approach to what Dave
    suggested (which is basically the approach used in v3) and rebase
    the rest patches to it:

    https://lore.kernel.org/lkml/408dee3f-a466-4746-92d3-adf54d35ec7c@intel.com/

  - Rebase to latest tip/master

v3 -> v4:
  - Change to add a build_sysmd_read(_size) macro to build one primitive
    for each metadata field element size, similar to build_mmio_read()
    macro -- Dan.

    https://lore.kernel.org/kvm/66db75497a213_22a2294b@dwillia2-xfh.jf.intel.com.notmuch/

  - Replace TD_SYSINFO_MAP() with READ_SYS_INFO() and #undef it after
    use -- Adrian, Dan.

    https://lore.kernel.org/kvm/66db7469dbfdd_22a2294c0@dwillia2-xfh.jf.intel.com.notmuch/

  - Use permalink in the changelog -- Dan.
  - Other comments from Dan, Adrian and Nikolay.  Please see individual
    patches.
  - Collect tags from Dan, Adrian, Nikolay (Thanks!).

 v3: https://lore.kernel.org/kvm/5235e05e-1d73-4f70-9b5d-b8648b1f4524@intel.com/T/

v2 -> v3 (address comments from Dan):
  - Replace the first couple of "metadata reading tweaks" patches with
    two new patches using a different approach (removin the 'struct
    field_mapping' and reimplement the TD_SYSINFO_MAP()):

    https://lore.kernel.org/kvm/a107b067-861d-43f4-86b5-29271cb93dad@intel.com/T/#m7cfb3c146214d94b24e978eeb8708d92c0b14ac6
    https://lore.kernel.org/kvm/a107b067-861d-43f4-86b5-29271cb93dad@intel.com/T/#mbe65f0903ff7835bc418a907f0d02d7a9e0b78be
    https://lore.kernel.org/kvm/a107b067-861d-43f4-86b5-29271cb93dad@intel.com/T/#m80cde5e6504b3af74d933ea0cbfc3ca9d24697d3

  - Split out the renaming of 'struct tdx_tdmr_sysinfo' as a separate
    patch and place it at the beginning of this series.

    https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#m8fec7c429242d640cf5e756eb68e3b822e6dff8b

  - Refine this cover letter ("More info" section)

    https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#m11868c9f486dcc4cfbbb690c7c18dfa4570e433f

  - Address other comments.  See changelog of individual patches.

 v2: https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/

v1 -> v2:
  - Fix comments from Chao and Nikolay.
  - A new patch to refine an out-dated comment by Nikolay.
  - Collect tags from Nikolay (thanks!).

 v1: https://lore.kernel.org/linux-kernel/cover.1718538552.git.kai.huang@intel.com/T/

=== More info ===

TDX module provides a set of "global metadata fields" for software to
query.  They report things like TDX module version, supported features
fields required for creating TDX guests and so on.

Today the TDX host code already reads "TD Memory Region" (TDMR) related
metadata fields for module initialization.  There are immediate needs
that require TDX host code to read more metadata fields:

 - Dump basic TDX module info [3];
 - Reject module with no NO_RBP_MOD feature support [4];
 - Read CMR info to fix a module initialization failure bug [5].

Also, the "quite near future" KVM TDX support requires to read more
global metadata fields.  In the longer term, the TDX Connect [6] (which
supports assigning trusted IO devices to TDX guest) may also require
other kernel components (e.g., pci/vt-d) to access more metadata.

To meet all of those, the idea is the TDX host core-kernel to provide a
centralized, canonical, and read-only structure to contain all global
metadata that comes out of TDX module for all kernel components to use.

An alternative way is to expose metadata reading API(s) for in-kernel
TDX users to use, but the reasons of choosing to provide a centural
structure are, quoted from Dan:

  The idea that x86 gets to review growth to this structure over time is
  an asset for maintainability and oversight of what is happening in the
  downstream consumers like KVM and TSM (for TDX Connect).

  A dynamic retrieval API removes that natural auditing of data structure
  patches from tip.git.

  Yes, it requires more touches than letting use cases consume new
  metadata fields at will, but that's net positive for maintainence of
  the kernel and the feedback loop to the TDX module.

This series starts to track all global metadata fields into a single
'struct tdx_sys_info', and reads more metadata fields to that structure
to address the immediate needs as mentioned above.

More fields will be added to support KVM TDX.  For the initial support
all metadata fields are populated in this single structure and shared to
KVM via a 'const pointer' to that structure (see last patches in [1]).


[1] https://github.com/intel/tdx/commits/kvm-tdxinit-host-metadata-v4/
[2] https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=kvm-coco-queue
[3] https://lore.kernel.org/lkml/4b3adb59-50ea-419e-ad02-e19e8ca20dee@intel.com/
[4] https://lore.kernel.org/lkml/fc0e8ab7-86d4-4428-be31-82e1ece6dd21@intel.com/
[5] https://github.com/canonical/tdx/issues/135#issuecomment-2151570238
[6] https://cdrdv2.intel.com/v1/dl/getContent/773614





Kai Huang (8):
  x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec
    better
  x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification
  x86/virt/tdx: Prepare to support reading other global metadata fields
  x86/virt/tdx: Refine a comment to reflect the latest TDX spec
  x86/virt/tdx: Start to track all global metadata in one structure
  x86/virt/tdx: Print TDX module version
  x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD
    mitigation
  x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find
    memory holes

 arch/x86/virt/vmx/tdx/tdx.c | 273 +++++++++++++++++++++++++++---------
 arch/x86/virt/vmx/tdx/tdx.h |  86 ++++++++++--
 2 files changed, 287 insertions(+), 72 deletions(-)


base-commit: 4587b6326696c7cd54e6f899225c366d8f144040
-- 
2.46.2


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v5 1/8] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification Kai Huang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The TDX module provides a set of "Global Metadata Fields".  They report
things like TDX module version, supported features, and fields related
to create/run TDX guests and so on.

TDX organizes those metadata fields by "Classes" based on the meaning of
those fields.  E.g., for now the kernel only reads "TD Memory Region"
(TDMR) related fields for module initialization.  Those fields are
defined under class "TDMR Info".

There are both immediate needs to read more metadata fields for module
initialization and near-future needs for other kernel components like
KVM to run TDX guests.  To meet all those requirements, the idea is the
TDX host core-kernel to provide a centralized, canonical, and read-only
structure for the global metadata that comes out from the TDX module for
all kernel components to use.

More specifically, the target is to end up with something like:

       struct tdx_sys_info {
	       struct tdx_sys_info_classA a;
	       struct tdx_sys_info_classB b;
	       ...
       };

Currently the kernel organizes all fields under "TDMR Info" class in
'struct tdx_tdmr_sysinfo'.  To prepare for the above target, rename the
structure to 'struct tdx_sys_info_tdmr' to follow the class name better.

No functional change intended.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---

v4 -> v5:
 - No change

v3 -> v4:
 - "global metadata fields" -> "Global Metadata Fields" - Ardian.
 - "Class"es -> "Classes" - Ardian.
 - Add tag from Ardian and Dan.

v2 -> v3:
 - Split out as a separate patch and place it at beginning:

   https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#m8fec7c429242d640cf5e756eb68e3b822e6dff8b
 
 - Rename to 'struct tdx_sys_info_tdmr':

   https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#md73dd9b02a492acf4a6facae63e8d030e320967d
   https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#m8fec7c429242d640cf5e756eb68e3b822e6dff8b


---
 arch/x86/virt/vmx/tdx/tdx.c | 36 ++++++++++++++++++------------------
 arch/x86/virt/vmx/tdx/tdx.h |  2 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 4e2b2e2ac9f9..e979bf442929 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -272,7 +272,7 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 
 static int read_sys_metadata_field16(u64 field_id,
 				     int offset,
-				     struct tdx_tdmr_sysinfo *ts)
+				     struct tdx_sys_info_tdmr *ts)
 {
 	u16 *ts_member = ((void *)ts) + offset;
 	u64 tmp;
@@ -298,9 +298,9 @@ struct field_mapping {
 
 #define TD_SYSINFO_MAP(_field_id, _offset) \
 	{ .field_id = MD_FIELD_ID_##_field_id,	   \
-	  .offset   = offsetof(struct tdx_tdmr_sysinfo, _offset) }
+	  .offset   = offsetof(struct tdx_sys_info_tdmr, _offset) }
 
-/* Map TD_SYSINFO fields into 'struct tdx_tdmr_sysinfo': */
+/* Map TD_SYSINFO fields into 'struct tdx_sys_info_tdmr': */
 static const struct field_mapping fields[] = {
 	TD_SYSINFO_MAP(MAX_TDMRS,	      max_tdmrs),
 	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr),
@@ -309,16 +309,16 @@ static const struct field_mapping fields[] = {
 	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]),
 };
 
-static int get_tdx_tdmr_sysinfo(struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret;
 	int i;
 
-	/* Populate 'tdmr_sysinfo' fields using the mapping structure above: */
+	/* Populate 'sysinfo_tdmr' fields using the mapping structure above: */
 	for (i = 0; i < ARRAY_SIZE(fields); i++) {
 		ret = read_sys_metadata_field16(fields[i].field_id,
 						fields[i].offset,
-						tdmr_sysinfo);
+						sysinfo_tdmr);
 		if (ret)
 			return ret;
 	}
@@ -342,13 +342,13 @@ static int tdmr_size_single(u16 max_reserved_per_tdmr)
 }
 
 static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
-			   struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	size_t tdmr_sz, tdmr_array_sz;
 	void *tdmr_array;
 
-	tdmr_sz = tdmr_size_single(tdmr_sysinfo->max_reserved_per_tdmr);
-	tdmr_array_sz = tdmr_sz * tdmr_sysinfo->max_tdmrs;
+	tdmr_sz = tdmr_size_single(sysinfo_tdmr->max_reserved_per_tdmr);
+	tdmr_array_sz = tdmr_sz * sysinfo_tdmr->max_tdmrs;
 
 	/*
 	 * To keep things simple, allocate all TDMRs together.
@@ -367,7 +367,7 @@ static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
 	 * at a given index in the TDMR list.
 	 */
 	tdmr_list->tdmr_sz = tdmr_sz;
-	tdmr_list->max_tdmrs = tdmr_sysinfo->max_tdmrs;
+	tdmr_list->max_tdmrs = sysinfo_tdmr->max_tdmrs;
 	tdmr_list->nr_consumed_tdmrs = 0;
 
 	return 0;
@@ -921,11 +921,11 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
 /*
  * Construct a list of TDMRs on the preallocated space in @tdmr_list
  * to cover all TDX memory regions in @tmb_list based on the TDX module
- * TDMR global information in @tdmr_sysinfo.
+ * TDMR global information in @sysinfo_tdmr.
  */
 static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
-			   struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret;
 
@@ -934,12 +934,12 @@ static int construct_tdmrs(struct list_head *tmb_list,
 		return ret;
 
 	ret = tdmrs_set_up_pamt_all(tdmr_list, tmb_list,
-			tdmr_sysinfo->pamt_entry_size);
+			sysinfo_tdmr->pamt_entry_size);
 	if (ret)
 		return ret;
 
 	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, tmb_list,
-			tdmr_sysinfo->max_reserved_per_tdmr);
+			sysinfo_tdmr->max_reserved_per_tdmr);
 	if (ret)
 		tdmrs_free_pamt_all(tdmr_list);
 
@@ -1098,7 +1098,7 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list)
 
 static int init_tdx_module(void)
 {
-	struct tdx_tdmr_sysinfo tdmr_sysinfo;
+	struct tdx_sys_info_tdmr sysinfo_tdmr;
 	int ret;
 
 	/*
@@ -1117,17 +1117,17 @@ static int init_tdx_module(void)
 	if (ret)
 		goto out_put_tdxmem;
 
-	ret = get_tdx_tdmr_sysinfo(&tdmr_sysinfo);
+	ret = get_tdx_sys_info_tdmr(&sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Allocate enough space for constructing TDMRs */
-	ret = alloc_tdmr_list(&tdx_tdmr_list, &tdmr_sysinfo);
+	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &tdmr_sysinfo);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdmrs;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index b701f69485d3..148f9b4d1140 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -100,7 +100,7 @@ struct tdx_memblock {
 };
 
 /* "TDMR info" part of "Global Scope Metadata" for constructing TDMRs */
-struct tdx_tdmr_sysinfo {
+struct tdx_sys_info_tdmr {
 	u16 max_tdmrs;
 	u16 max_reserved_per_tdmr;
 	u16 pamt_entry_size[TDX_PS_NR];
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
  2024-10-14 11:31 ` [PATCH v5 1/8] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 15:56   ` Dave Hansen
  2024-10-14 11:31 ` [PATCH v5 3/8] x86/virt/tdx: Prepare to support reading other global metadata fields Kai Huang
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Dan noticed [1] that read_sys_metadata_field16() has a runtime warning
to validate that the metadata field size matches the passed in buffer
size.  In turns out that all the information to perform that validation
is available at build time.  Rework TD_SYSINFO_MAP() to stop providing
runtime data to read_sys_metadata_field16() and instead just pass typed
fields to read_sys_metadata_field16() and let the compiler catch any
mismatches.  Rename TD_SYSINFO_MAP() to READ_SYS_INFO() since it no
longer stores the mapping of metadata field ID and the structure member
offset.

For now READ_SYS_INFO() is only used in get_tdx_sys_info_tdmr().  Future
changes will need to read other metadata fields that are organized in
different structures.  Do #undef READ_SYS_INFO() after use so the same
pattern can be used for reading other metadata fields.  To avoid needing
to duplicate build-time verification in each READ_SYS_INFO() in future
changes, add a wrapper macro to do build-time verification and call it
from READ_SYS_INFO().

The READ_SYS_INFO() has a couple quirks for readability.  It requires
the function that uses it to define a local variable @ret to carry the
error code and set the initial value to 0.  It also hard-codes the
variable name of the structure pointer used in the function, but it is
less code, build-time verifiable, and the same readability as the former
'struct field_mapping' approach.

Link: http://lore.kernel.org/66b16121c48f4_4fc729424@dwillia2-xfh.jf.intel.com.notmuch [1]
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v4 -> v5:
 - No change

v3 -> v4:
 - Rename TD_SYSINFO_MAP() to READ_SYS_INFO() - Ardian.
 - #undef READ_SYS_INFO() - Ardian.
 - Rewrite changelog based on text from Dan, with some clarification
   around using READ_SYS_INFO() and #undef it.
 - Move the BUILD_BUG_ON() out of read_sys_metadata_field16() - Dan.
 - Use permalink in changelog - Dan.

v2 -> v3:
 - Remove 'struct field_mapping' and reimplement TD_SYSINFO_MAP().


---
 arch/x86/virt/vmx/tdx/tdx.c | 58 ++++++++++++++-----------------------
 1 file changed, 21 insertions(+), 37 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index e979bf442929..2f7e4abc1bb9 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -270,60 +270,44 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 	return 0;
 }
 
-static int read_sys_metadata_field16(u64 field_id,
-				     int offset,
-				     struct tdx_sys_info_tdmr *ts)
+static int __read_sys_metadata_field16(u64 field_id, u16 *val)
 {
-	u16 *ts_member = ((void *)ts) + offset;
 	u64 tmp;
 	int ret;
 
-	if (WARN_ON_ONCE(MD_FIELD_ID_ELE_SIZE_CODE(field_id) !=
-			MD_FIELD_ID_ELE_SIZE_16BIT))
-		return -EINVAL;
-
 	ret = read_sys_metadata_field(field_id, &tmp);
 	if (ret)
 		return ret;
 
-	*ts_member = tmp;
+	*val = tmp;
 
 	return 0;
 }
 
-struct field_mapping {
-	u64 field_id;
-	int offset;
-};
-
-#define TD_SYSINFO_MAP(_field_id, _offset) \
-	{ .field_id = MD_FIELD_ID_##_field_id,	   \
-	  .offset   = offsetof(struct tdx_sys_info_tdmr, _offset) }
-
-/* Map TD_SYSINFO fields into 'struct tdx_sys_info_tdmr': */
-static const struct field_mapping fields[] = {
-	TD_SYSINFO_MAP(MAX_TDMRS,	      max_tdmrs),
-	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr),
-	TD_SYSINFO_MAP(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]),
-	TD_SYSINFO_MAP(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]),
-	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]),
-};
+#define read_sys_metadata_field16(_field_id, _val)		\
+({								\
+	BUILD_BUG_ON(MD_FIELD_ID_ELE_SIZE_CODE(_field_id) !=	\
+			MD_FIELD_ID_ELE_SIZE_16BIT);		\
+	__read_sys_metadata_field16(_field_id, _val);		\
+})
 
 static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
-	int ret;
-	int i;
+	int ret = 0;
 
-	/* Populate 'sysinfo_tdmr' fields using the mapping structure above: */
-	for (i = 0; i < ARRAY_SIZE(fields); i++) {
-		ret = read_sys_metadata_field16(fields[i].field_id,
-						fields[i].offset,
-						sysinfo_tdmr);
-		if (ret)
-			return ret;
-	}
+#define READ_SYS_INFO(_field_id, _member)				\
+	ret = ret ?: read_sys_metadata_field16(MD_FIELD_ID_##_field_id,	\
+					&sysinfo_tdmr->_member)
 
-	return 0;
+	READ_SYS_INFO(MAX_TDMRS,	     max_tdmrs);
+	READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
+	READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
+	READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
+	READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);
+
+#undef READ_SYS_INFO
+
+	return ret;
 }
 
 /* Calculate the actual TDMR size */
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 3/8] x86/virt/tdx: Prepare to support reading other global metadata fields
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
  2024-10-14 11:31 ` [PATCH v5 1/8] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
  2024-10-14 11:31 ` [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 4/8] x86/virt/tdx: Refine a comment to reflect the latest TDX spec Kai Huang
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The TDX module provides a set of "Global Metadata Fields".  They report
things like TDX module version, supported features, and fields related
to create/run TDX guests and so on.  TDX supports 8/16/32/64 bits
metadata field element sizes.  For a given metadata field, the element
size is encoded in the metadata field ID.

For now the kernel only reads "TD Memory Region" (TDMR) related metadata
fields and they are all 16-bit.  Thus the kernel only has one primitive
__read_sys_metadata_field16() to read 16-bit metadata field and the
macro, read_sys_metadata_field16(), which does additional build-time
check of the field ID makes sure the field is indeed 16-bit.

Future changes will need to read more metadata fields with different
element sizes.  Rework the {__}read_sys_metadata_field16() primitives to
work with all 8/16/32/64 element sizes.

Note the new primitive __read_sys_metadata_field() takes 'void *buf' and
'int size' and explicitly uses memcpy() to copy the SEAMCALL returned
data (u64) to a pointer of u8/u16/u32/u64, instead of depending on the
compiler to know the size and copy.  But this is fine since the wrapper
read_sys_metadata_field(), which works with a pointer to u8/u16/32/u64,
passes the sizeof() to the __read_sys_metadata_field() internally.  And
it has BUILD_BUG_ON() to verify the metadata element size encoded in the
field ID indeed matches the size passed to __read_sys_metadata_field().

This ensures the users of read_sys_metadata_field() will never screw up.
Also add a comment to point out __read_sys_metadata_field() should not
be used directly.

Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v4 -> v5:
 - Change back to what Dave suggested and update changelog:

   https://lore.kernel.org/lkml/408dee3f-a466-4746-92d3-adf54d35ec7c@intel.com/
   
v3 -> v4:
 - Change to use one primitive for each element size, similar to
   build_mmio_read() macro - Dan.
 - Rewrite changelog based on the new code.
 - "global metadata fields" -> "Global Metadata Fields" - Ardian.

v2 -> v3:
 - Rename read_sys_metadata_field() to tdh_sys_rd() so the former can be
   used as the high level wrapper.  Get rid of "stbuf_" prefix since
   people don't like it.
 
 - Rewrite after removing 'struct field_mapping' and reimplementing
   TD_SYSINFO_MAP().
 
---
 arch/x86/virt/vmx/tdx/tdx.c | 21 +++++++++++----------
 arch/x86/virt/vmx/tdx/tdx.h |  3 ++-
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 2f7e4abc1bb9..d63efb2d50d1 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -250,7 +250,7 @@ static int build_tdx_memlist(struct list_head *tmb_list)
 	return ret;
 }
 
-static int read_sys_metadata_field(u64 field_id, u64 *data)
+static int tdh_sys_rd(u64 field_id, u64 *data)
 {
 	struct tdx_module_args args = {};
 	int ret;
@@ -270,25 +270,26 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 	return 0;
 }
 
-static int __read_sys_metadata_field16(u64 field_id, u16 *val)
+/* Don't use this directly, but use read_sys_metadata_field() instead. */
+static int __read_sys_metadata_field(u64 field_id, void *val, int size)
 {
 	u64 tmp;
 	int ret;
 
-	ret = read_sys_metadata_field(field_id, &tmp);
+	ret = tdh_sys_rd(field_id, &tmp);
 	if (ret)
 		return ret;
 
-	*val = tmp;
+	memcpy(val, &tmp, size);
 
 	return 0;
 }
 
-#define read_sys_metadata_field16(_field_id, _val)		\
-({								\
-	BUILD_BUG_ON(MD_FIELD_ID_ELE_SIZE_CODE(_field_id) !=	\
-			MD_FIELD_ID_ELE_SIZE_16BIT);		\
-	__read_sys_metadata_field16(_field_id, _val);		\
+/* @_val must be a pointer to u8/u16/u32/u64 */
+#define read_sys_metadata_field(_field_id, _val)			\
+({									\
+	BUILD_BUG_ON(MD_FIELD_ELE_SIZE(_field_id) != sizeof(*(_val)));	\
+	__read_sys_metadata_field(_field_id, _val, sizeof(*(_val)));	\
 })
 
 static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
@@ -296,7 +297,7 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 	int ret = 0;
 
 #define READ_SYS_INFO(_field_id, _member)				\
-	ret = ret ?: read_sys_metadata_field16(MD_FIELD_ID_##_field_id,	\
+	ret = ret ?: read_sys_metadata_field(MD_FIELD_ID_##_field_id,	\
 					&sysinfo_tdmr->_member)
 
 	READ_SYS_INFO(MAX_TDMRS,	     max_tdmrs);
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 148f9b4d1140..7a8204a05bf7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -53,7 +53,8 @@
 #define MD_FIELD_ID_ELE_SIZE_CODE(_field_id)	\
 		(((_field_id) & GENMASK_ULL(33, 32)) >> 32)
 
-#define MD_FIELD_ID_ELE_SIZE_16BIT	1
+#define MD_FIELD_ELE_SIZE(_field_id)   \
+	(1 << MD_FIELD_ID_ELE_SIZE_CODE(_field_id))
 
 struct tdmr_reserved_area {
 	u64 offset;
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 4/8] x86/virt/tdx: Refine a comment to reflect the latest TDX spec
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (2 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 3/8] x86/virt/tdx: Prepare to support reading other global metadata fields Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 5/8] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The old versions of "Intel TDX Module v1.5 ABI Specification" contain
the definitions of all global metadata field IDs directly in a table.

However, the latest spec moves those definitions to a dedicated
'global_metadata.json' file as part of a new (separate) "Intel TDX
Module v1.5 ABI definitions" [1].

Update the comment to reflect this.

[1]: https://cdrdv2.intel.com/v1/dl/getContent/795381

Reported-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
---
 arch/x86/virt/vmx/tdx/tdx.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 7a8204a05bf7..8345ae1f7fb1 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -29,7 +29,7 @@
 /*
  * Global scope metadata field ID.
  *
- * See Table "Global Scope Metadata", TDX module 1.5 ABI spec.
+ * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
  */
 #define MD_FIELD_ID_MAX_TDMRS			0x9100000100000008ULL
 #define MD_FIELD_ID_MAX_RESERVED_PER_TDMR	0x9100000100000009ULL
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 5/8] x86/virt/tdx: Start to track all global metadata in one structure
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (3 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 4/8] x86/virt/tdx: Refine a comment to reflect the latest TDX spec Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 6/8] x86/virt/tdx: Print TDX module version Kai Huang
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The TDX module provides a set of "Global Metadata Fields".  They report
things like TDX module version, supported features, and fields related
to create/run TDX guests and so on.

Currently the kernel only reads "TD Memory Region" (TDMR) related fields
for module initialization.  There are immediate needs which require the
TDX module initialization to read more global metadata including module
version, supported features and "Convertible Memory Regions" (CMRs).

Also, KVM will need to read more metadata fields to support baseline TDX
guests.  In the longer term, other TDX features like TDX Connect (which
supports assigning trusted IO devices to TDX guest) may also require
other kernel components such as pci/vt-d to access global metadata.

To meet all those requirements, the idea is the TDX host core-kernel to
to provide a centralized, canonical, and read-only structure for the
global metadata that comes out from the TDX module for all kernel
components to use.

As the first step, introduce a new 'struct tdx_sys_info' to track all
global metadata fields.

TDX categories global metadata fields into different "Classes".  E.g.,
the TDMR related fields are under class "TDMR Info".  Instead of making
'struct tdx_sys_info' a plain structure to contain all metadata fields,
organize them in smaller structures based on the "Class".

This allows those metadata fields to be used in finer granularity thus
makes the code more clear.  E.g., the construct_tdmr() can just take the
structure which contains "TDMR Info" metadata fields.

Add a new function get_tdx_sys_info() as the placeholder to read all
metadata fields, and call it at the beginning of init_tdx_module().  For
now it only calls get_tdx_sys_info_tdmr() to read TDMR related fields.

Note there is a functional change: get_tdx_sys_info_tdmr() is moved from
after build_tdx_memlist() to before it, but it is fine to do so.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
---

v4 -> v5:
 - Rebase due to patch 3 change.

v3 -> v4:
 - "global metadata fields" -> "Global Metadata Fields" - Ardian.
 - "Class"es -> "Classes" - Ardian.
 - Add tag from Ardian.

v2 -> v3:
 - Split out the part to rename 'struct tdx_tdmr_sysinfo' to 'struct
   tdx_sys_info_tdmr'.


---
 arch/x86/virt/vmx/tdx/tdx.c | 19 ++++++++++++-------
 arch/x86/virt/vmx/tdx/tdx.h | 36 +++++++++++++++++++++++++++++-------
 2 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index d63efb2d50d1..a4496c4c765f 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -311,6 +311,11 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 	return ret;
 }
 
+static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
+{
+	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1083,9 +1088,13 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list)
 
 static int init_tdx_module(void)
 {
-	struct tdx_sys_info_tdmr sysinfo_tdmr;
+	struct tdx_sys_info sysinfo;
 	int ret;
 
+	ret = get_tdx_sys_info(&sysinfo);
+	if (ret)
+		return ret;
+
 	/*
 	 * To keep things simple, assume that all TDX-protected memory
 	 * will come from the page allocator.  Make sure all pages in the
@@ -1102,17 +1111,13 @@ static int init_tdx_module(void)
 	if (ret)
 		goto out_put_tdxmem;
 
-	ret = get_tdx_sys_info_tdmr(&sysinfo_tdmr);
-	if (ret)
-		goto err_free_tdxmem;
-
 	/* Allocate enough space for constructing TDMRs */
-	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo_tdmr);
+	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo.tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo_tdmr);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr);
 	if (ret)
 		goto err_free_tdmrs;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 8345ae1f7fb1..4c924124347c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -81,6 +81,35 @@ struct tdmr_info {
 	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
 } __packed __aligned(TDMR_INFO_ALIGNMENT);
 
+/*
+ * Data structures for "Global Scope Metadata".
+ *
+ * TDX global metadata fields are categorized by "Classes".  See the
+ * "global_metadata.json" in the "TDX 1.5 ABI Definitions".
+ *
+ * 'struct tdx_sys_info' is the main structure to contain all metadata
+ * used by the kernel.  It contains sub-structures with each reflecting
+ * the "Class" in the 'global_metadata.json'.
+ *
+ * Note the structure name may not exactly follow the name of the
+ * "Class" in the TDX spec, but the comment of that structure always
+ * reflect that.
+ *
+ * Also note not all metadata fields in each class are defined, only
+ * those used by the kernel are.
+ */
+
+/* Class "TDMR info" */
+struct tdx_sys_info_tdmr {
+	u16 max_tdmrs;
+	u16 max_reserved_per_tdmr;
+	u16 pamt_entry_size[TDX_PS_NR];
+};
+
+struct tdx_sys_info {
+	struct tdx_sys_info_tdmr tdmr;
+};
+
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!
@@ -100,13 +129,6 @@ struct tdx_memblock {
 	int nid;
 };
 
-/* "TDMR info" part of "Global Scope Metadata" for constructing TDMRs */
-struct tdx_sys_info_tdmr {
-	u16 max_tdmrs;
-	u16 max_reserved_per_tdmr;
-	u16 pamt_entry_size[TDX_PS_NR];
-};
-
 /* Warn if kernel has less than TDMR_NR_WARN TDMRs after allocation */
 #define TDMR_NR_WARN 4
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 6/8] x86/virt/tdx: Print TDX module version
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (4 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 5/8] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 7/8] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Currently the kernel doesn't print any TDX module version information.
In practice such information is useful, especially to the developers.

For instance:

1) When something goes wrong around using TDX, the module version is
   normally the first information the users want to know [1].

2) After initializing TDX module, the users want to quickly know module
   version to see whether the loaded module is the expected one.

Dump TDX module version.  The actual dmesg will look like:

  virt/tdx: Initializing TDX module: 1.5.00.00.0481 (build_date 20230323).

And dump right after reading global metadata, so that this information is
printed no matter whether module initialization fails or not.

Link: https://lore.kernel.org/lkml/4b3adb59-50ea-419e-ad02-e19e8ca20dee@intel.com/ [1]
Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v4 -> v5:
 - Rebase due to patch 3 change.

v3 -> v4:
 - Omit dumping TDX_FEATURES0 - Dan.
 - As a result, move TDX_FEATURES0 related code out to NO_MOD_RBP patch.
 - Update changelog accordingly.
 - Simplify changelog for the use case 2).
 - Use permalink - Dan.

v2 -> v3:
 - 'struct tdx_sysinfo_module_info' -> 'struct tdx_sys_info_features'
 - 'struct tdx_sysinfo_module_version' -> 'struct tdx_sys_info_version'
 - Remove the 'sys_attributes' and the check of debug/production module.

 https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#md73dd9b02a492acf4a6facae63e8d030e320967d


---
 arch/x86/virt/vmx/tdx/tdx.c | 50 +++++++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h | 19 +++++++++++++-
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index a4496c4c765f..130ddac47f64 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -292,6 +292,26 @@ static int __read_sys_metadata_field(u64 field_id, void *val, int size)
 	__read_sys_metadata_field(_field_id, _val, sizeof(*(_val)));	\
 })
 
+static int get_tdx_sys_info_version(struct tdx_sys_info_version *sysinfo_version)
+{
+	int ret = 0;
+
+#define READ_SYS_INFO(_field_id, _member)				\
+	ret = ret ?: read_sys_metadata_field(MD_FIELD_ID_##_field_id,	\
+					&sysinfo_version->_member)
+
+	READ_SYS_INFO(MAJOR_VERSION,    major);
+	READ_SYS_INFO(MINOR_VERSION,    minor);
+	READ_SYS_INFO(UPDATE_VERSION,   update);
+	READ_SYS_INFO(INTERNAL_VERSION, internal);
+	READ_SYS_INFO(BUILD_NUM,	build_num);
+	READ_SYS_INFO(BUILD_DATE,	build_date);
+
+#undef READ_SYS_INFO
+
+	return ret;
+}
+
 static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret = 0;
@@ -313,9 +333,37 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 
 static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
 {
+	int ret;
+
+	ret = get_tdx_sys_info_version(&sysinfo->version);
+	if (ret)
+		return ret;
+
 	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
 }
 
+static void print_sys_info_version(struct tdx_sys_info_version *version)
+{
+	/*
+	 * TDX module version encoding:
+	 *
+	 *   <major>.<minor>.<update>.<internal>.<build_num>
+	 *
+	 * When printed as text, <major> and <minor> are 1-digit,
+	 * <update> and <internal> are 2-digits and <build_num>
+	 * is 4-digits.
+	 */
+	pr_info("Initializing TDX module: %u.%u.%02u.%02u.%04u (build_date %u).\n",
+			version->major, version->minor,	version->update,
+			version->internal, version->build_num,
+			version->build_date);
+}
+
+static void print_basic_sys_info(struct tdx_sys_info *sysinfo)
+{
+	print_sys_info_version(&sysinfo->version);
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1095,6 +1143,8 @@ static int init_tdx_module(void)
 	if (ret)
 		return ret;
 
+	print_basic_sys_info(&sysinfo);
+
 	/*
 	 * To keep things simple, assume that all TDX-protected memory
 	 * will come from the page allocator.  Make sure all pages in the
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 4c924124347c..0203528da024 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -31,6 +31,12 @@
  *
  * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
  */
+#define MD_FIELD_ID_BUILD_DATE			0x8800000200000001ULL
+#define MD_FIELD_ID_BUILD_NUM			0x8800000100000002ULL
+#define MD_FIELD_ID_MINOR_VERSION		0x0800000100000003ULL
+#define MD_FIELD_ID_MAJOR_VERSION		0x0800000100000004ULL
+#define MD_FIELD_ID_UPDATE_VERSION		0x0800000100000005ULL
+#define MD_FIELD_ID_INTERNAL_VERSION		0x0800000100000006ULL
 #define MD_FIELD_ID_MAX_TDMRS			0x9100000100000008ULL
 #define MD_FIELD_ID_MAX_RESERVED_PER_TDMR	0x9100000100000009ULL
 #define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE		0x9100000100000010ULL
@@ -99,6 +105,16 @@ struct tdmr_info {
  * those used by the kernel are.
  */
 
+/* Class "TDX Module Version" */
+struct tdx_sys_info_version {
+	u16 major;
+	u16 minor;
+	u16 update;
+	u16 internal;
+	u16 build_num;
+	u32 build_date;
+};
+
 /* Class "TDMR info" */
 struct tdx_sys_info_tdmr {
 	u16 max_tdmrs;
@@ -107,7 +123,8 @@ struct tdx_sys_info_tdmr {
 };
 
 struct tdx_sys_info {
-	struct tdx_sys_info_tdmr tdmr;
+	struct tdx_sys_info_version	version;
+	struct tdx_sys_info_tdmr	tdmr;
 };
 
 /*
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 7/8] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (5 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 6/8] x86/virt/tdx: Print TDX module version Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-14 11:31 ` [PATCH v5 8/8] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
  2024-10-15 15:30 ` [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Old TDX modules can clobber RBP in the TDH.VP.ENTER SEAMCALL.  However
RBP is used as frame pointer in the x86_64 calling convention, and
clobbering RBP could result in bad things like being unable to unwind
the stack if any non-maskable exceptions (NMI, #MC etc) happens in that
gap.

A new "NO_RBP_MOD" feature was introduced to more recent TDX modules to
not clobber RBP.  This feature is reported in the TDX_FEATURES0 global
metadata field via bit 18.

Don't initialize the TDX module if this feature is not supported [1].

Link: https://lore.kernel.org/all/fc0e8ab7-86d4-4428-be31-82e1ece6dd21@intel.com/ [1]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---

v4 -> v5:
 - Rebase due to patch 3 change.

v3 -> v4:
 - Move reading TDX_FEATURES0 code to this patch.
 - Change patch title and use permalink - Dan.

 Hi Dan, Ardian, Nikolay,

 The code to read TDX_FEATURES0 was not included in this patch when you
 gave your tag.  I didn't remove them.  Please let me know if you want
 me to remove your tag.  Thanks!

v2 -> v3:
 - check_module_compatibility() -> check_features().
 - Improve error message.

 https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#md9e2eeef927838cbf20d7b361cdbea518b8aec50

---
 arch/x86/virt/vmx/tdx/tdx.c | 36 ++++++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h | 16 ++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 130ddac47f64..c877d02ca057 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -292,6 +292,21 @@ static int __read_sys_metadata_field(u64 field_id, void *val, int size)
 	__read_sys_metadata_field(_field_id, _val, sizeof(*(_val)));	\
 })
 
+static int get_tdx_sys_info_features(struct tdx_sys_info_features *sysinfo_features)
+{
+	int ret = 0;
+
+#define READ_SYS_INFO(_field_id, _member)				\
+	ret = ret ?: read_sys_metadata_field(MD_FIELD_ID_##_field_id,	\
+					&sysinfo_features->_member)
+
+	READ_SYS_INFO(TDX_FEATURES0, tdx_features0);
+
+#undef READ_SYS_INFO
+
+	return ret;
+}
+
 static int get_tdx_sys_info_version(struct tdx_sys_info_version *sysinfo_version)
 {
 	int ret = 0;
@@ -335,6 +350,10 @@ static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
 {
 	int ret;
 
+	ret = get_tdx_sys_info_features(&sysinfo->features);
+	if (ret)
+		return ret;
+
 	ret = get_tdx_sys_info_version(&sysinfo->version);
 	if (ret)
 		return ret;
@@ -364,6 +383,18 @@ static void print_basic_sys_info(struct tdx_sys_info *sysinfo)
 	print_sys_info_version(&sysinfo->version);
 }
 
+static int check_features(struct tdx_sys_info *sysinfo)
+{
+	u64 tdx_features0 = sysinfo->features.tdx_features0;
+
+	if (!(tdx_features0 & TDX_FEATURES0_NO_RBP_MOD)) {
+		pr_err("frame pointer (RBP) clobber bug present, upgrade TDX module\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1145,6 +1176,11 @@ static int init_tdx_module(void)
 
 	print_basic_sys_info(&sysinfo);
 
+	/* Check whether the kernel can support this module */
+	ret = check_features(&sysinfo);
+	if (ret)
+		return ret;
+
 	/*
 	 * To keep things simple, assume that all TDX-protected memory
 	 * will come from the page allocator.  Make sure all pages in the
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 0203528da024..18c54e1e3a4a 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -31,6 +31,7 @@
  *
  * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
  */
+#define MD_FIELD_ID_TDX_FEATURES0		0x0A00000300000008ULL
 #define MD_FIELD_ID_BUILD_DATE			0x8800000200000001ULL
 #define MD_FIELD_ID_BUILD_NUM			0x8800000100000002ULL
 #define MD_FIELD_ID_MINOR_VERSION		0x0800000100000003ULL
@@ -105,6 +106,20 @@ struct tdmr_info {
  * those used by the kernel are.
  */
 
+/*
+ * Class "TDX Module Info".
+ *
+ * This class also contains other fields like SYS_ATTRIBUTES and the
+ * NUM_TDX_FEATURES.  For now only TDX_FEATURES0 is needed, but still
+ * keep the structure to follow the spec (and for future extension).
+ */
+struct tdx_sys_info_features {
+	u64 tdx_features0;
+};
+
+/* Bit definitions of TDX_FEATURES0 metadata field */
+#define TDX_FEATURES0_NO_RBP_MOD	_BITULL(18)
+
 /* Class "TDX Module Version" */
 struct tdx_sys_info_version {
 	u16 major;
@@ -123,6 +138,7 @@ struct tdx_sys_info_tdmr {
 };
 
 struct tdx_sys_info {
+	struct tdx_sys_info_features	features;
 	struct tdx_sys_info_version	version;
 	struct tdx_sys_info_tdmr	tdmr;
 };
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v5 8/8] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (6 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 7/8] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
@ 2024-10-14 11:31 ` Kai Huang
  2024-10-15 15:30 ` [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
  8 siblings, 0 replies; 17+ messages in thread
From: Kai Huang @ 2024-10-14 11:31 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

A TDX module initialization failure was reported on a Emerald Rapids
platform:

  virt/tdx: initialization failed: TDMR [0x0, 0x80000000): reserved areas exhausted.
  virt/tdx: module initialization failed (-28)

As part of initializing the TDX module, the kernel informs the TDX
module of all "TDX-usable memory regions" using an array of TDX defined
structure "TD Memory Region" (TDMR).  Each TDMR must be in 1GB aligned
and in 1GB granularity, and all "non-TDX-usable memory holes" within a
given TDMR are marked as "reserved areas".  The TDX module reports a
maximum number of reserved areas that can be supported per TDMR (16).

The kernel builds the "TDX-usable memory regions" based on memblocks
(which reflects e820), and uses this list to find all "reserved areas"
for each TDMR.

It turns out that the kernel's view of memory holes is too fine grained
and sometimes exceeds the number of holes that the TDX module can track
per TDMR [1], resulting in the above failure.

Thankfully the module also lists memory that is potentially convertible
in a list of "Convertible Memory Regions" (CMRs).  That coarser grained
CMR list tends to track usable memory in the memory map even if it might
be reserved for host usage like 'ACPI data' [2].

Use that list to relax what the kernel considers unusable memory.  If it
falls in a CMR no need to instantiate a hole, and rely on the fact that
kernel will keep what it considers 'reserved' out of the page allocator.

Also dump the CMRs in dmesg.  They are helpful when something goes wrong
around "constructing the TDMRs and configuring the TDX module with
them".  Note there are no existing userspace tools that the user can get
CMRs since they can only be read via SEAMCALL (no CPUID, MSR etc).

[1] BIOS-E820 table of the problematic platform:

  BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
  BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
  BIOS-e820: [mem 0x0000000000100000-0x000000005d168fff] usable
  BIOS-e820: [mem 0x000000005d169000-0x000000005d22afff] ACPI data
  BIOS-e820: [mem 0x000000005d22b000-0x000000005d3cefff] usable
  BIOS-e820: [mem 0x000000005d3cf000-0x000000005d469fff] reserved
  BIOS-e820: [mem 0x000000005d46a000-0x000000005e5b2fff] usable
  BIOS-e820: [mem 0x000000005e5b3000-0x000000005e5c2fff] reserved
  BIOS-e820: [mem 0x000000005e5c3000-0x000000005e5d2fff] usable
  BIOS-e820: [mem 0x000000005e5d3000-0x000000005e5e4fff] reserved
  BIOS-e820: [mem 0x000000005e5e5000-0x000000005eb57fff] usable
  BIOS-e820: [mem 0x000000005eb58000-0x0000000061357fff] ACPI NVS
  BIOS-e820: [mem 0x0000000061358000-0x000000006172afff] usable
  BIOS-e820: [mem 0x000000006172b000-0x0000000061794fff] ACPI data
  BIOS-e820: [mem 0x0000000061795000-0x00000000617fefff] usable
  BIOS-e820: [mem 0x00000000617ff000-0x0000000061912fff] ACPI data
  BIOS-e820: [mem 0x0000000061913000-0x0000000061998fff] usable
  BIOS-e820: [mem 0x0000000061999000-0x00000000619dffff] ACPI data
  BIOS-e820: [mem 0x00000000619e0000-0x00000000619e1fff] usable
  BIOS-e820: [mem 0x00000000619e2000-0x00000000619e9fff] reserved
  BIOS-e820: [mem 0x00000000619ea000-0x0000000061a26fff] usable
  BIOS-e820: [mem 0x0000000061a27000-0x0000000061baefff] ACPI data
  BIOS-e820: [mem 0x0000000061baf000-0x00000000623c2fff] usable
  BIOS-e820: [mem 0x00000000623c3000-0x0000000062471fff] reserved
  BIOS-e820: [mem 0x0000000062472000-0x0000000062823fff] usable
  BIOS-e820: [mem 0x0000000062824000-0x0000000063a24fff] reserved
  BIOS-e820: [mem 0x0000000063a25000-0x0000000063d57fff] usable
  BIOS-e820: [mem 0x0000000063d58000-0x0000000064157fff] reserved
  BIOS-e820: [mem 0x0000000064158000-0x0000000064158fff] usable
  BIOS-e820: [mem 0x0000000064159000-0x0000000064194fff] reserved
  BIOS-e820: [mem 0x0000000064195000-0x000000006e9cefff] usable
  BIOS-e820: [mem 0x000000006e9cf000-0x000000006eccefff] reserved
  BIOS-e820: [mem 0x000000006eccf000-0x000000006f6fefff] ACPI NVS
  BIOS-e820: [mem 0x000000006f6ff000-0x000000006f7fefff] ACPI data
  BIOS-e820: [mem 0x000000006f7ff000-0x000000006f7fffff] usable
  BIOS-e820: [mem 0x000000006f800000-0x000000008fffffff] reserved
  ......

[2] Convertible Memory Regions of the problematic platform:

  virt/tdx: CMR: [0x100000, 0x6f800000)
  virt/tdx: CMR: [0x100000000, 0x107a000000)
  virt/tdx: CMR: [0x1080000000, 0x207c000000)
  virt/tdx: CMR: [0x2080000000, 0x307c000000)
  virt/tdx: CMR: [0x3080000000, 0x407c000000)

Fixes: dde3b60d572c ("x86/virt/tdx: Designate reserved areas for all TDMRs")
Signed-off-by: Kai Huang <kai.huang@intel.com>
---

v4 -> v5:
 - Rebase due to patch 3 change.

v3 -> v4:
 - Trim down changelog - Dan.
 - "must be marked as reserved areas" -> "are marked as reserved areas" - Ardian.
 - Remove all WARN_ON_ONCE() for CMR sanity checks, and clarify in the
   comment that CMRs are verified by MCHECK before it enables TDX so we
   can trust hardware.
 - Change CMR_BASE(i) macro back to just define CMR_BASE and do the
   "+i" in the code.

v2 -> v3:

 - Add the Fixes tag, although this patch depends on previous patches.
 - CMR_BASE0 -> CMR_BASE(_i), CMR_SIZE0 -> CMR_SIZE(_i) to silence the
   build-check error.

v1 -> v2:
 - Change to walk over CMRs directly to find out memory holes, instead
   of walking over TDX memory blocks and explicitly check whether a hole
   is subregion of CMR.  (Chao)
 - Mention any constant macro definitions in global metadata structures
   are TDX architectural. (Binbin)
 - Slightly improve the changelog.


---
 arch/x86/virt/vmx/tdx/tdx.c | 103 ++++++++++++++++++++++++++++++------
 arch/x86/virt/vmx/tdx/tdx.h |  12 +++++
 2 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index c877d02ca057..b1e419765ed7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -327,6 +327,58 @@ static int get_tdx_sys_info_version(struct tdx_sys_info_version *sysinfo_version
 	return ret;
 }
 
+/* Update the @sysinfo_cmr->num_cmrs to trim tail empty CMRs */
+static void trim_empty_tail_cmrs(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int i;
+
+	/*
+	 * The TDX module may report the maximum number of CMRs that
+	 * TDX architecturally supports as the actual number of CMRs,
+	 * despite the latter is smaller.  In this case all the tail
+	 * CMRs will be empty.  Trim them away.
+	 *
+	 * Note MCHECK verifies CMRs before enabling TDX on hardware.
+	 * Skip other sanity checks (e.g., verify CMR is 4KB aligned)
+	 * but trust MCHECK to work properly.  CMRs are printed later
+	 * anyway, and the worst case is module fails to initialize.
+	 */
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
+		if (!sysinfo_cmr->cmr_size[i])
+			break;
+
+	sysinfo_cmr->num_cmrs = i;
+}
+
+static int get_tdx_sys_info_cmr(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int ret = 0;
+	u16 i;
+
+#define READ_SYS_INFO(_field_id, _member)				\
+	ret = ret ?: read_sys_metadata_field(MD_FIELD_ID_##_field_id,	\
+					&sysinfo_cmr->_member)
+
+	READ_SYS_INFO(NUM_CMRS, num_cmrs);
+
+	if (ret)
+		return ret;
+
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
+		READ_SYS_INFO(CMR_BASE + i, cmr_base[i]);
+		READ_SYS_INFO(CMR_SIZE + i, cmr_size[i]);
+	}
+
+	if (ret)
+		return ret;
+
+	trim_empty_tail_cmrs(sysinfo_cmr);
+
+#undef READ_SYS_INFO
+
+	return 0;
+}
+
 static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret = 0;
@@ -358,6 +410,10 @@ static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
 	if (ret)
 		return ret;
 
+	ret = get_tdx_sys_info_cmr(&sysinfo->cmr);
+	if (ret)
+		return ret;
+
 	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
 }
 
@@ -378,9 +434,23 @@ static void print_sys_info_version(struct tdx_sys_info_version *version)
 			version->build_date);
 }
 
+static void print_sys_info_cmr(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int i;
+
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
+		u64 cmr_base = sysinfo_cmr->cmr_base[i];
+		u64 cmr_size = sysinfo_cmr->cmr_size[i];
+
+		pr_info("CMR[%d]: [0x%llx, 0x%llx)\n", i, cmr_base,
+				cmr_base + cmr_size);
+	}
+}
+
 static void print_basic_sys_info(struct tdx_sys_info *sysinfo)
 {
 	print_sys_info_version(&sysinfo->version);
+	print_sys_info_cmr(&sysinfo->cmr);
 }
 
 static int check_features(struct tdx_sys_info *sysinfo)
@@ -816,29 +886,28 @@ static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr,
 }
 
 /*
- * Go through @tmb_list to find holes between memory areas.  If any of
+ * Go through all CMRs in @sysinfo_cmr to find memory holes.  If any of
  * those holes fall within @tdmr, set up a TDMR reserved area to cover
  * the hole.
  */
-static int tdmr_populate_rsvd_holes(struct list_head *tmb_list,
+static int tdmr_populate_rsvd_holes(struct tdx_sys_info_cmr *sysinfo_cmr,
 				    struct tdmr_info *tdmr,
 				    int *rsvd_idx,
 				    u16 max_reserved_per_tdmr)
 {
-	struct tdx_memblock *tmb;
 	u64 prev_end;
-	int ret;
+	int i, ret;
 
 	/*
 	 * Start looking for reserved blocks at the
 	 * beginning of the TDMR.
 	 */
 	prev_end = tdmr->base;
-	list_for_each_entry(tmb, tmb_list, list) {
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
 		u64 start, end;
 
-		start = PFN_PHYS(tmb->start_pfn);
-		end   = PFN_PHYS(tmb->end_pfn);
+		start = sysinfo_cmr->cmr_base[i];
+		end   = start + sysinfo_cmr->cmr_size[i];
 
 		/* Break if this region is after the TDMR */
 		if (start >= tdmr_end(tdmr))
@@ -939,16 +1008,16 @@ static int rsvd_area_cmp_func(const void *a, const void *b)
 
 /*
  * Populate reserved areas for the given @tdmr, including memory holes
- * (via @tmb_list) and PAMTs (via @tdmr_list).
+ * (via @sysinfo_cmr) and PAMTs (via @tdmr_list).
  */
 static int tdmr_populate_rsvd_areas(struct tdmr_info *tdmr,
-				    struct list_head *tmb_list,
+				    struct tdx_sys_info_cmr *sysinfo_cmr,
 				    struct tdmr_info_list *tdmr_list,
 				    u16 max_reserved_per_tdmr)
 {
 	int ret, rsvd_idx = 0;
 
-	ret = tdmr_populate_rsvd_holes(tmb_list, tdmr, &rsvd_idx,
+	ret = tdmr_populate_rsvd_holes(sysinfo_cmr, tdmr, &rsvd_idx,
 			max_reserved_per_tdmr);
 	if (ret)
 		return ret;
@@ -967,10 +1036,10 @@ static int tdmr_populate_rsvd_areas(struct tdmr_info *tdmr,
 
 /*
  * Populate reserved areas for all TDMRs in @tdmr_list, including memory
- * holes (via @tmb_list) and PAMTs.
+ * holes (via @sysinfo_cmr) and PAMTs.
  */
 static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
-					 struct list_head *tmb_list,
+					 struct tdx_sys_info_cmr *sysinfo_cmr,
 					 u16 max_reserved_per_tdmr)
 {
 	int i;
@@ -979,7 +1048,7 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
 		int ret;
 
 		ret = tdmr_populate_rsvd_areas(tdmr_entry(tdmr_list, i),
-				tmb_list, tdmr_list, max_reserved_per_tdmr);
+				sysinfo_cmr, tdmr_list, max_reserved_per_tdmr);
 		if (ret)
 			return ret;
 	}
@@ -994,7 +1063,8 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
  */
 static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
-			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr,
+			   struct tdx_sys_info_cmr *sysinfo_cmr)
 {
 	int ret;
 
@@ -1007,7 +1077,7 @@ static int construct_tdmrs(struct list_head *tmb_list,
 	if (ret)
 		return ret;
 
-	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, tmb_list,
+	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, sysinfo_cmr,
 			sysinfo_tdmr->max_reserved_per_tdmr);
 	if (ret)
 		tdmrs_free_pamt_all(tdmr_list);
@@ -1203,7 +1273,8 @@ static int init_tdx_module(void)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr,
+			&sysinfo.cmr);
 	if (ret)
 		goto err_free_tdmrs;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 18c54e1e3a4a..eeb42bd75a75 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -43,6 +43,9 @@
 #define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE		0x9100000100000010ULL
 #define MD_FIELD_ID_PAMT_2M_ENTRY_SIZE		0x9100000100000011ULL
 #define MD_FIELD_ID_PAMT_1G_ENTRY_SIZE		0x9100000100000012ULL
+#define MD_FIELD_ID_NUM_CMRS			0x9000000100000000ULL
+#define MD_FIELD_ID_CMR_BASE			0x9000000300000080ULL
+#define MD_FIELD_ID_CMR_SIZE			0x9000000300000100ULL
 
 /*
  * Sub-field definition of metadata field ID.
@@ -130,6 +133,14 @@ struct tdx_sys_info_version {
 	u32 build_date;
 };
 
+/* Class "CMR Info" */
+#define TDX_MAX_CMRS	32
+struct tdx_sys_info_cmr {
+	u16 num_cmrs;
+	u64 cmr_base[TDX_MAX_CMRS];
+	u64 cmr_size[TDX_MAX_CMRS];
+};
+
 /* Class "TDMR info" */
 struct tdx_sys_info_tdmr {
 	u16 max_tdmrs;
@@ -140,6 +151,7 @@ struct tdx_sys_info_tdmr {
 struct tdx_sys_info {
 	struct tdx_sys_info_features	features;
 	struct tdx_sys_info_version	version;
+	struct tdx_sys_info_cmr		cmr;
 	struct tdx_sys_info_tdmr	tdmr;
 };
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification
  2024-10-14 11:31 ` [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification Kai Huang
@ 2024-10-14 15:56   ` Dave Hansen
  2024-10-14 19:13     ` Dan Williams
  0 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2024-10-14 15:56 UTC (permalink / raw)
  To: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov

On 10/14/24 04:31, Kai Huang wrote:
> +#define READ_SYS_INFO(_field_id, _member)				\
> +	ret = ret ?: read_sys_metadata_field16(MD_FIELD_ID_##_field_id,	\
> +					&sysinfo_tdmr->_member)
>  
> -	return 0;
> +	READ_SYS_INFO(MAX_TDMRS,	     max_tdmrs);
> +	READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
> +	READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
> +	READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
> +	READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);

I know what Dan asked for here, but I dislike how this ended up.

The existing stuff *has* type safety, despite the void*.  It at least
checks the size, which is the biggest problem.

Also, this isn't really an unrolled loop.  It still effectively has
gotos, just like the for loop did.  It just buries the goto in the "ret
= ret ?: " construct.  It hides the control flow logic.

Logically, this whole function is

	ret = read_something1();
	if (ret)
		goto out;

	ret = read_something2();
	if (ret)
		goto out;

	...

I'd *much* rather have that goto be:

	for () {
		ret = read_something();
		if (ret)
			break; // aka. goto out
	}

Than have something *look* like straight control flow when it isn't.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification
  2024-10-14 15:56   ` Dave Hansen
@ 2024-10-14 19:13     ` Dan Williams
  2024-10-15 11:34       ` Huang, Kai
  0 siblings, 1 reply; 17+ messages in thread
From: Dan Williams @ 2024-10-14 19:13 UTC (permalink / raw)
  To: Dave Hansen, Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo,
	hpa, dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov

Dave Hansen wrote:
> On 10/14/24 04:31, Kai Huang wrote:
> > +#define READ_SYS_INFO(_field_id, _member)				\
> > +	ret = ret ?: read_sys_metadata_field16(MD_FIELD_ID_##_field_id,	\
> > +					&sysinfo_tdmr->_member)
> >  
> > -	return 0;
> > +	READ_SYS_INFO(MAX_TDMRS,	     max_tdmrs);
> > +	READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
> > +	READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
> > +	READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
> > +	READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);
> 
> I know what Dan asked for here, but I dislike how this ended up.
> 
> The existing stuff *has* type safety, despite the void*.  It at least
> checks the size, which is the biggest problem.
> 
> Also, this isn't really an unrolled loop.  It still effectively has
> gotos, just like the for loop did.  It just buries the goto in the "ret
> = ret ?: " construct.  It hides the control flow logic.
> 
> Logically, this whole function is
> 
> 	ret = read_something1();
> 	if (ret)
> 		goto out;
> 
> 	ret = read_something2();
> 	if (ret)
> 		goto out;
> 
> 	...
> 
> I'd *much* rather have that goto be:
> 
> 	for () {
> 		ret = read_something();
> 		if (ret)
> 			break; // aka. goto out
> 	}
> 
> Than have something *look* like straight control flow when it isn't.

Yeah, the hiding of the control flow was the weakest part of the
suggestion. My main gripe was runtime validation of details that could
be validated at compile time.

There is no real need for control flow at all, i.e. early exit is not
needed as these are not resources that need to be unwound. It simply
needs to count whether all of the reads happened, so something like this
is sufficient:

    success += READ_SYS_INFO(MAX_TDMRS,             max_tdmrs);
    success += READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
    success += READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
    success += READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
    success += READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);
    
    if (success != 5)
    	return false;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification
  2024-10-14 19:13     ` Dan Williams
@ 2024-10-15 11:34       ` Huang, Kai
  0 siblings, 0 replies; 17+ messages in thread
From: Huang, Kai @ 2024-10-15 11:34 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	Williams, Dan J, kirill.shutemov@linux.intel.com,
	pbonzini@redhat.com, tglx@linutronix.de
  Cc: kvm@vger.kernel.org, nik.borisov@suse.com, Hunter, Adrian,
	linux-kernel@vger.kernel.org, Edgecombe, Rick P, x86@kernel.org,
	Yamahata, Isaku

On Mon, 2024-10-14 at 12:13 -0700, Dan Williams wrote:
> Dave Hansen wrote:
> > On 10/14/24 04:31, Kai Huang wrote:
> > > +#define READ_SYS_INFO(_field_id, _member)				\
> > > +	ret = ret ?: read_sys_metadata_field16(MD_FIELD_ID_##_field_id,	\
> > > +					&sysinfo_tdmr->_member)
> > >  
> > > -	return 0;
> > > +	READ_SYS_INFO(MAX_TDMRS,	     max_tdmrs);
> > > +	READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
> > > +	READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
> > > +	READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
> > > +	READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);
> > 
> > I know what Dan asked for here, but I dislike how this ended up.
> > 
> > The existing stuff *has* type safety, despite the void*.  It at least
> > checks the size, which is the biggest problem.
> > 
> > Also, this isn't really an unrolled loop.  It still effectively has
> > gotos, just like the for loop did.  It just buries the goto in the "ret
> > = ret ?: " construct.  It hides the control flow logic.
> > 
> > Logically, this whole function is
> > 
> > 	ret = read_something1();
> > 	if (ret)
> > 		goto out;
> > 
> > 	ret = read_something2();
> > 	if (ret)
> > 		goto out;
> > 
> > 	...
> > 
> > I'd *much* rather have that goto be:
> > 
> > 	for () {
> > 		ret = read_something();
> > 		if (ret)
> > 			break; // aka. goto out
> > 	}
> > 
> > Than have something *look* like straight control flow when it isn't.

Yeah understood.  Thanks for letting me know.

The 'for() loop' approach would need the original 'struct field_mapping' to hold
the mapping between field ID and the offset/size info, though.

> 
> Yeah, the hiding of the control flow was the weakest part of the
> suggestion. My main gripe was runtime validation of details that could
> be validated at compile time.

I am looking into how to do build-time verification while still using the
original 'struct field_mapping' approach.  If we can do this, I hope this can
address your concern about doing runtime check instead of build-time?

Adrian provided one suggestion [*] that we can use __builtin_choose_expr() to
achieve this:

"
BUILD_BUG_ON() requires a function, but it is still
be possible to add a build time check in TD_SYSINFO_MAP
e.g.

#define TD_SYSINFO_CHECK_SIZE(_field_id, _size)			\
	__builtin_choose_expr(MD_FIELD_ELE_SIZE(_field_id) == _size, _size,
(void)0)

#define _TD_SYSINFO_MAP(_field_id, _offset, _size)		\
	{ .field_id = _field_id,				\
	  .offset   = _offset,					\
	  .size	    = TD_SYSINFO_CHECK_SIZE(_field_id, _size) }

#define TD_SYSINFO_MAP(_field_id, _struct, _member)		\
	_TD_SYSINFO_MAP(MD_FIELD_ID_##_field_id,		\
			offsetof(_struct, _member),		\
			sizeof(typeof(((_struct *)0)->_member)))
"

I tried this, and it worked for most cases where the field ID is a simple
integer constant, but I got build error for the CMRs:

	for (u16 i = 0; i < cmr_info->num_cmrs; i++) {
		const struct field_mapping fields[] = {
			TD_SYSINFO_CMRINFO_MAP(CMR_BASE0 + i, cmr_base[i]),
			TD_SYSINFO_CMRINFO_MAP(CMR_SIZE0 + i, cmr_size[i]),
		};
		...
	}

.. where field ID for CMR[i] is calculated by CMR0.

The MD_FIELD_ELE_SIZE(_field_id) works for 'CMR_BASE0 + i' for BUILD_BUG_ON(),
but somehow the compiler fails to determine the 'MD_FIELD_ELE_SIZE(_field_id) ==
_size' as a constant_express and caused build failure.  I am still looking into
this.

[*]
https://lore.kernel.org/kvm/cover.1721186590.git.kai.huang@intel.com/T/#m379ce041f025dc20e7b58fa6dbdc484c2ce53af4

> There is no real need for control flow at all, i.e. early exit is not
> needed as these are not resources that need to be unwound. It simply
> needs to count whether all of the reads happened, so something like this
> is sufficient:
> 
>     success += READ_SYS_INFO(MAX_TDMRS,             max_tdmrs);
>     success += READ_SYS_INFO(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr);
>     success += READ_SYS_INFO(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]);
>     success += READ_SYS_INFO(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]);
>     success += READ_SYS_INFO(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]);
>     
>     if (success != 5)
>     	return false;
> 

If we go with this approach, it seems we can even get rid of the @success.

	int ret = 0;

#define READ_SYS_INFO(_field_id, _member)	\
	read_sys_metadata_field(MD_FIELD_ID_##_field_id, \
	                                     &sysinfo_tdmr->_member)

	ret |= READ_SYS_INFO(MAX_TDMRS, max_tdmrs);
	...
#undef READ_SYS_INFO

	return ret;

The tdh_sys_rd() always return -EIO when TDH.SYS.RD fails, so the above either
return 0 when all reads were successful or -EIO when there's any failed.

I can also go with route if Dave is fine?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
  2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (7 preceding siblings ...)
  2024-10-14 11:31 ` [PATCH v5 8/8] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
@ 2024-10-15 15:30 ` Dave Hansen
  2024-10-15 16:29   ` Paolo Bonzini
  8 siblings, 1 reply; 17+ messages in thread
From: Dave Hansen @ 2024-10-15 15:30 UTC (permalink / raw)
  To: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov

[-- Attachment #1: Type: text/plain, Size: 1793 bytes --]

I'm having one of those "I hate this all" moments.  Look at what we say
in the code:

>   * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".

Basically step one in verifying that this is all right is: Hey, humans,
please go parse a machine-readable format.  That's insanity.  If Intel
wants to publish JSON as the canonical source of truth, that's fine.
It's great, actually.  But let's stop playing human JSON parser and make
the computers do it for us, OK?

Let's just generate the code.  Basically, as long as the generated C is
marginally readable, I'm OK with it.  The most important things are:

 1. Adding a field is dirt simple
 2. Using the generated C is simple

In 99% of the cases, nobody ends up having to ever look at the generated
code.

Take a look at the attached python program and generated C file.  I
think they qualify.  We can check the script into tools/scripts/ and it
can get re-run when new json comes out or when a new field is needed.
You'd could call the generated code like this:

#include <generated.h>

	read_gunk(&tgm);

and use it like this:

	foo = tgm.BUILD_NUM;
	bar = tgm.BUILD_DATE;

Any field you want to add is a single addition to the python list and
re-running the script.  There's not even any need to do:

#define TDX_FOO_BAR_BUILD_DATE 0x8800000200000001

because it's unnecessary when you have:

	ret |= read_...(0x8800000200000001, &tgm.BUILD_DATE);

that links the magic number and the "BUILD_DATE" so closely together
anyway.  We also don't need type safety *here* at the "read" because
it's machine generated in the first place.  If there's a type mismatch
between "0x8800000200000001" and "tgm.BUILD_DATE" we have bigger
problems on our hands.

All the type checking comes when the code consumes tgm.BUILD_DATE (or
whatever).

[-- Attachment #2: tdx.py --]
[-- Type: text/x-python, Size: 853 bytes --]

#!/usr/bin/python3
import json
import sys

filefd = open(sys.argv[1])
jsonstr = filefd.read()
filefd.close()

j = json.loads(jsonstr)

print("static struct tdx_global_metadata tgm")
print("{")

def find_field(name):
	for f in j['Fields']:
		if f['Field Name'] == name:
			return f
	return None

fields = """
TDX_FEATURES0
BUILD_DATE
BUILD_NUM
MINOR_VERSION
""".strip().split("\n")

for fn in fields:
	f = find_field(fn)
	name = f['Field Name']
	element_bytes = int(f['Element Size (Bytes)'])
	element_bits = element_bytes * 8
	print("\tu%d %s;" % (element_bits, name))

print("}")


print("static void read_gunk()")
print("{")
print("\tint ret = 0;")
print("")
for fn in fields:
	f = find_field(fn)
	print("\tret |= read_sys_metadata_field(%s, &tgm.%s);" %
			(f['Base FIELD_ID (Hex)'],
			 f['Field Name']))
print("")
print("\treturn ret;")
print("}")

[-- Attachment #3: tdxm.c --]
[-- Type: text/x-csrc, Size: 457 bytes --]

static struct tdx_global_metadata tgm
{
	u64 TDX_FEATURES0;
	u32 BUILD_DATE;
	u16 BUILD_NUM;
	u16 MINOR_VERSION;
}
static void read_gunk()
{
	int ret = 0;

	ret |= read_sys_metadata_field(0x0A00000300000008, &tgm.TDX_FEATURES0);
	ret |= read_sys_metadata_field(0x8800000200000001, &tgm.BUILD_DATE);
	ret |= read_sys_metadata_field(0x8800000100000002, &tgm.BUILD_NUM);
	ret |= read_sys_metadata_field(0x0800000100000003, &tgm.MINOR_VERSION);

	return ret;
}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
  2024-10-15 15:30 ` [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
@ 2024-10-15 16:29   ` Paolo Bonzini
  2024-10-15 19:04     ` Dan Williams
  2024-10-28 12:07     ` Huang, Kai
  0 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2024-10-15 16:29 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, x86, linux-kernel, kvm, rick.p.edgecombe,
	isaku.yamahata, adrian.hunter, nik.borisov

[-- Attachment #1: Type: text/plain, Size: 2059 bytes --]

On Tue, Oct 15, 2024 at 5:30 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> I'm having one of those "I hate this all" moments.  Look at what we say
> in the code:
>
> >   * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
>
> Basically step one in verifying that this is all right is: Hey, humans,
> please go parse a machine-readable format.  That's insanity.  If Intel
> wants to publish JSON as the canonical source of truth, that's fine.
> It's great, actually.  But let's stop playing human JSON parser and make
> the computers do it for us, OK?
>
> Let's just generate the code.  Basically, as long as the generated C is
> marginally readable, I'm OK with it.  The most important things are:
>
>  1. Adding a field is dirt simple
>  2. Using the generated C is simple
>
> In 99% of the cases, nobody ends up having to ever look at the generated
> code.
>
> Take a look at the attached python program and generated C file.  I
> think they qualify.  We can check the script into tools/scripts/ and it
> can get re-run when new json comes out or when a new field is needed.
> You'd could call the generated code like this:

Ok, so let's move this thing forward. Here is a more polished script
and the output. Untested beyond compilation.

Kai, feel free to include it in v6 with my

Signed-off-by: Paolo Bonzini <pbonzini@redhat.om>

I made an attempt at adding array support and using it with the CMR
information; just to see if Intel is actually trying to make
global_metadata.json accurate. The original code has

  for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
    READ_SYS_INFO(CMR_BASE + i, cmr_base[i]);
    READ_SYS_INFO(CMR_SIZE + i, cmr_size[i]);
  }

The generated code instead always tries to read 32 fields and returns
non-zero from get_tdx_sys_info_cmr if they are missing. If it fails to
read the fields above NUM_CMRS, just remove that part of the tdx.py
script and make sure that a comment in the code shames the TDX ABI
documentation adequately. :)

Thanks,

Paolo

[-- Attachment #2: tdx-generated.c --]
[-- Type: text/x-csrc, Size: 2046 bytes --]

/* Automatically generated by tdx.py */

static int get_tdx_sys_info_version(struct tdx_sys_info_version *out)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x8800000200000001, &val))) out->build_date = val;
	if (!ret && !(ret = read_sys_metadata_field(0x8800000100000002, &val))) out->build_num = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000003, &val))) out->minor_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000004, &val))) out->major_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000005, &val))) out->update_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000006, &val))) out->internal_version = val;

	return ret;
}

static int get_tdx_sys_info_features(struct tdx_sys_info_features *out)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x0A00000300000008, &val))) out->tdx_features0 = val;

	return ret;
}

static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *out)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000008, &val))) out->max_tdmrs = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000009, &val))) out->max_reserved_per_tdmr = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000010, &val))) out->pamt_4k_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000011, &val))) out->pamt_2m_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val))) out->pamt_1g_entry_size = val;

	return ret;
}

static int get_tdx_sys_info_cmr(struct tdx_sys_info_cmr *out)
{
	int ret = 0;
	u64 val;
	int i;

	if (!ret && !(ret = read_sys_metadata_field(0x9000000100000000, &val))) out->num_cmrs = val;
	for (i = 0; i < 32; i++)
		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000080 + i, &val))) out->cmr_base[i] = val;
	for (i = 0; i < 32; i++)
		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000100 + i, &val))) out->cmr_size[i] = val;

	return ret;
}

[-- Attachment #3: tdx-generated.h --]
[-- Type: text/x-chdr, Size: 547 bytes --]

/* Automatically generated by tdx.py */
#ifndef TDX_DATA_GENERATED_H
#define TDX_DATA_GENERATED_H 1

struct tdx_sys_info_version {
	u32 build_date;
	u16 build_num;
	u16 minor_version;
	u16 major_version;
	u16 update_version;
	u16 internal_version;
};

struct tdx_sys_info_features {
	u64 tdx_features0;
};

struct tdx_sys_info_tdmr {
	u16 max_tdmrs;
	u16 max_reserved_per_tdmr;
	u16 pamt_4k_entry_size;
	u16 pamt_2m_entry_size;
	u16 pamt_1g_entry_size;
};

struct tdx_sys_info_cmr {
	u16 num_cmrs;
	u64 cmr_base[32];
	u64 cmr_size[32];
};

#endif

[-- Attachment #4: tdx.py --]
[-- Type: text/x-python, Size: 3350 bytes --]

#! /usr/bin/env python3
import json
import sys

# Note: this script does not run as part of the build process.
# It is used to generate structs from the TDX global_metadata.json
# file, and functions to fill in said structs.  Rerun it if
# you need more fields.

TDX_STRUCTS = {
    "tdx_sys_info_version": [
        "BUILD_DATE",
        "BUILD_NUM",
        "MINOR_VERSION",
        "MAJOR_VERSION",
        "UPDATE_VERSION",
        "INTERNAL_VERSION",
    ],
    "tdx_sys_info_features": [
        "TDX_FEATURES0"
    ],
    "tdx_sys_info_tdmr": [
        "MAX_TDMRS",
        "MAX_RESERVED_PER_TDMR",
        "PAMT_4K_ENTRY_SIZE",
        "PAMT_2M_ENTRY_SIZE",
        "PAMT_1G_ENTRY_SIZE",
    ],
    "tdx_sys_info_cmr": [
        "NUM_CMRS", "CMR_BASE", "CMR_SIZE"
    ],
}


def print_struct(name, fields, file):
    print("struct %s {" % (name,), file=file)
    for f in fields:
        fname = f["Field Name"]
        element_bytes = int(f["Element Size (Bytes)"])
        element_bits = element_bytes * 8
        num_fields = int(f["Num Fields"])
        if num_fields == 1:
            print("\tu%d %s;" % (element_bits, fname.lower()), file=file)
        else:
            print(
                "\tu%d %s[%d];" % (element_bits, fname.lower(), num_fields), file=file
            )
    print("};", file=file)


def print_field(number, member, indent, file):
    print(
        "%sif (!ret && !(ret = read_sys_metadata_field(%s, &val))) out->%s = val;"
        % (indent, number, member),
        file=file,
    )


def print_function(name, fields, file):
    print("static int get_%s(struct %s *out)" % (name, name), file=file)
    print("{", file=file)

    print("\tint ret = 0;", file=file)
    print("\tu64 val;", file=file)
    for f in fields:
        num_fields = int(f["Num Fields"])
        if num_fields > 1:
            print("\tint i;", file=file)
            break

    print(file=file)
    for f in fields:
        num_fields = int(f["Num Fields"])
        if num_fields == 1:
            print_field(
                f["Base FIELD_ID (Hex)"],
                f["Field Name"].lower(),
                indent="\t",
                file=file,
            )
        else:
            print("\tfor (i = 0; i < %d; i++)" % (num_fields,), file=file)
            print_field(
                f["Base FIELD_ID (Hex)"] + " + i",
                f["Field Name"].lower() + "[i]",
                indent="\t\t",
                file=file,
            )

    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)


jsonfile = sys.argv[1]
hfile = sys.argv[2]
cfile = sys.argv[3]

with open(jsonfile, "r") as f:
    json_in = json.load(f)
    fields = {x["Field Name"]: x for x in json_in["Fields"]}

with open(hfile, "w") as f:
    print("/* Automatically generated by tdx.py */", file=f)
    print("#ifndef TDX_DATA_GENERATED_H", file=f)
    print("#define TDX_DATA_GENERATED_H 1", file=f)
    for name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_struct(name, [fields[x] for x in field_names], file=f)
    print(file=f)
    print("#endif", file=f)

with open(cfile, "w") as f:
    print("/* Automatically generated by tdx.py */", file=f)
    for name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_function(name, [fields[x] for x in field_names], file=f)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
  2024-10-15 16:29   ` Paolo Bonzini
@ 2024-10-15 19:04     ` Dan Williams
  2024-10-15 21:11       ` Huang, Kai
  2024-10-28 12:07     ` Huang, Kai
  1 sibling, 1 reply; 17+ messages in thread
From: Dan Williams @ 2024-10-15 19:04 UTC (permalink / raw)
  To: Paolo Bonzini, Dave Hansen
  Cc: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, x86, linux-kernel, kvm, rick.p.edgecombe,
	isaku.yamahata, adrian.hunter, nik.borisov

Paolo Bonzini wrote:
> On Tue, Oct 15, 2024 at 5:30 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > I'm having one of those "I hate this all" moments.  Look at what we say
> > in the code:
> >
> > >   * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
> >
> > Basically step one in verifying that this is all right is: Hey, humans,
> > please go parse a machine-readable format.  That's insanity.  If Intel
> > wants to publish JSON as the canonical source of truth, that's fine.
> > It's great, actually.  But let's stop playing human JSON parser and make
> > the computers do it for us, OK?
> >
> > Let's just generate the code.  Basically, as long as the generated C is
> > marginally readable, I'm OK with it.  The most important things are:
> >
> >  1. Adding a field is dirt simple
> >  2. Using the generated C is simple
> >
> > In 99% of the cases, nobody ends up having to ever look at the generated
> > code.
> >
> > Take a look at the attached python program and generated C file.  I
> > think they qualify.  We can check the script into tools/scripts/ and it
> > can get re-run when new json comes out or when a new field is needed.
> > You'd could call the generated code like this:
> 
> Ok, so let's move this thing forward. Here is a more polished script
> and the output. Untested beyond compilation.
> 
> Kai, feel free to include it in v6 with my
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.om>
> 
> I made an attempt at adding array support and using it with the CMR
> information; just to see if Intel is actually trying to make
> global_metadata.json accurate. The original code has
> 
>   for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
>     READ_SYS_INFO(CMR_BASE + i, cmr_base[i]);
>     READ_SYS_INFO(CMR_SIZE + i, cmr_size[i]);
>   }
> 
> The generated code instead always tries to read 32 fields and returns
> non-zero from get_tdx_sys_info_cmr if they are missing. If it fails to
> read the fields above NUM_CMRS, just remove that part of the tdx.py
> script and make sure that a comment in the code shames the TDX ABI
> documentation adequately. :)

Thanks for doing this Paolo, I regret not pushing harder [1] / polishing
up the bash+jq script I threw together to do the same.

I took a look at your script and the autogenerated code and it looks good
to me.

Feel free to add my Reviewed-by on a patch that adds that collateral to
the tools/ directory.

[1]: http://lore.kernel.org/66b19beaadd28_4fc729410@dwillia2-xfh.jf.intel.com.notmuch

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
  2024-10-15 19:04     ` Dan Williams
@ 2024-10-15 21:11       ` Huang, Kai
  0 siblings, 0 replies; 17+ messages in thread
From: Huang, Kai @ 2024-10-15 21:11 UTC (permalink / raw)
  To: Dan Williams, Paolo Bonzini, Dave Hansen
  Cc: kirill.shutemov, tglx, bp, peterz, mingo, hpa, seanjc, x86,
	linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov



On 16/10/2024 8:04 am, Dan Williams wrote:
> Paolo Bonzini wrote:
>> On Tue, Oct 15, 2024 at 5:30 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>>
>>> I'm having one of those "I hate this all" moments.  Look at what we say
>>> in the code:
>>>
>>>>    * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
>>>
>>> Basically step one in verifying that this is all right is: Hey, humans,
>>> please go parse a machine-readable format.  That's insanity.  If Intel
>>> wants to publish JSON as the canonical source of truth, that's fine.
>>> It's great, actually.  But let's stop playing human JSON parser and make
>>> the computers do it for us, OK?
>>>
>>> Let's just generate the code.  Basically, as long as the generated C is
>>> marginally readable, I'm OK with it.  The most important things are:
>>>
>>>   1. Adding a field is dirt simple
>>>   2. Using the generated C is simple
>>>
>>> In 99% of the cases, nobody ends up having to ever look at the generated
>>> code.
>>>
>>> Take a look at the attached python program and generated C file.  I
>>> think they qualify.  We can check the script into tools/scripts/ and it
>>> can get re-run when new json comes out or when a new field is needed.
>>> You'd could call the generated code like this:
>>
>> Ok, so let's move this thing forward. Here is a more polished script
>> and the output. Untested beyond compilation.
>>
>> Kai, feel free to include it in v6 with my
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.om>
>>
>> I made an attempt at adding array support and using it with the CMR
>> information; just to see if Intel is actually trying to make
>> global_metadata.json accurate. The original code has
>>
>>    for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
>>      READ_SYS_INFO(CMR_BASE + i, cmr_base[i]);
>>      READ_SYS_INFO(CMR_SIZE + i, cmr_size[i]);
>>    }
>>
>> The generated code instead always tries to read 32 fields and returns
>> non-zero from get_tdx_sys_info_cmr if they are missing. If it fails to
>> read the fields above NUM_CMRS, just remove that part of the tdx.py
>> script and make sure that a comment in the code shames the TDX ABI
>> documentation adequately. :)
> 
> Thanks for doing this Paolo, I regret not pushing harder [1] / polishing
> up the bash+jq script I threw together to do the same.
> 
> I took a look at your script and the autogenerated code and it looks good
> to me.
> 
> Feel free to add my Reviewed-by on a patch that adds that collateral to
> the tools/ directory.
> 
> [1]: http://lore.kernel.org/66b19beaadd28_4fc729410@dwillia2-xfh.jf.intel.com.notmuch

Hi Dave/Paolo/Dan,

I'll go with this for the next version.

Thanks for all your feedback and the code you shared.  I appreciate.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump
  2024-10-15 16:29   ` Paolo Bonzini
  2024-10-15 19:04     ` Dan Williams
@ 2024-10-28 12:07     ` Huang, Kai
  1 sibling, 0 replies; 17+ messages in thread
From: Huang, Kai @ 2024-10-28 12:07 UTC (permalink / raw)
  To: pbonzini@redhat.com, Hansen, Dave
  Cc: Edgecombe, Rick P, seanjc@google.com, bp@alien8.de,
	x86@kernel.org, peterz@infradead.org, hpa@zytor.com,
	mingo@redhat.com, kirill.shutemov@linux.intel.com,
	tglx@linutronix.de, linux-kernel@vger.kernel.org, Williams, Dan J,
	Yamahata, Isaku, kvm@vger.kernel.org, Hunter, Adrian,
	nik.borisov@suse.com

[-- Attachment #1: Type: text/plain, Size: 10709 bytes --]

On Tue, 2024-10-15 at 18:29 +0200, Paolo Bonzini wrote:
> On Tue, Oct 15, 2024 at 5:30 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > 
> > I'm having one of those "I hate this all" moments.  Look at what we say
> > in the code:
> > 
> > >   * See the "global_metadata.json" in the "TDX 1.5 ABI definitions".
> > 
> > Basically step one in verifying that this is all right is: Hey, humans,
> > please go parse a machine-readable format.  That's insanity.  If Intel
> > wants to publish JSON as the canonical source of truth, that's fine.
> > It's great, actually.  But let's stop playing human JSON parser and make
> > the computers do it for us, OK?
> > 
> > Let's just generate the code.  Basically, as long as the generated C is
> > marginally readable, I'm OK with it.  The most important things are:
> > 
> >  1. Adding a field is dirt simple
> >  2. Using the generated C is simple
> > 
> > In 99% of the cases, nobody ends up having to ever look at the generated
> > code.
> > 
> > Take a look at the attached python program and generated C file.  I
> > think they qualify.  We can check the script into tools/scripts/ and it
> > can get re-run when new json comes out or when a new field is needed.
> > You'd could call the generated code like this:
> 
> Ok, so let's move this thing forward. Here is a more polished script
> and the output. Untested beyond compilation.
> 
> Kai, feel free to include it in v6 with my
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.om>

Hi Dave, Paolo,

I updated the script mainly for two purposes: 

1) Add auto-generation of main structure 'struct tdx_sys_info' and the main
function 'get_tdx_sys_info()' which reads metadata to the main structure.  

Without it, adding a new field of a new "Class" won't be that simple.  Paolo's
script only generates 'struct tdx_sys_info_xx' sub-structures and
'get_tdx_sys_info_xx()' sub-functions.  We will need to manually add the new
sub-structure to 'struct tdx_sys_info' and add code to call the new function to
read it.

2) Add support of reading large metadata field which has multiple 64-bit
elements.

Yes a metadata field can consist multiple 64-bit elements.  The example is
CPUID_CONFIG_VALUE, which is an array, with each field consisting 2 64-bit
elements to represent 128-bit CPUID value EAX/EBX/ECX/EDX.

KVM will need to use this CPUID_CONFIG_VALUE to create TD.

The way to read such fields (see TDX 1.5 Base spec, 18.6.3. Arrays of Metadata
Fields, Figure 18.3: Example of an Array of Four 48 Byte TDCS.RTMR Fields, Each
Composed of 6 Elements):

	Base field ID: X
	
	---------------------------------
	| elem 0 (X)	| elem 1 (X+1)	|	Field 0
	---------------------------------
	| elem 0 (X+2)	| elem 1 (X+3)	|	Field 1
	---------------------------------
	....

To make generation simple, I updated the script to generate such case as two-
dimensional array.  E.g.,

	Field Name: "XXX"
	Num Field: M
	Num Elements: N

   -> 
	u64 xxx[M][N];

And the code to read:

	for (i = 0; i < M; i++)
		for (j = 0; j < N; J++)
			read...(BASE_ID + i * N + j, &val);

There are also some other minor updates.  E.g., after asking TDX module guys,
they said we should use value reported via "NUM_CMRS" to loop when reading CMRs,
so I adjusted the script to use 'sysinfo_cmr->num_cmrs' to loop.

I pasted the script below, so that I can have a premalink of this script to use
in the next version of this series (for the sake to make the auto-generated code
reproducible).  I also attached the updated script and auto-generated files.

Btw, the checkpatch.pl has below complain about the generated code:

ERROR: do not use assignment in if condition
#112: FILE: arch/x86/virt/vmx/tdx/tdx_global_metadata.c:15:
+	if (!ret && !(ret = read_sys_metadata_field(0x8800000200000001, &val)))

I didn't address this because this seems fine to me.

--- the updated script ---

#! /usr/bin/env python3
import json
import sys

# Note: this script does not run as part of the build process.
# It is used to generate structs from the TDX global_metadata.json
# file, and functions to fill in said structs.  Rerun it if
# you need more fields.

TDX_STRUCTS = {
    "version": [
        "BUILD_DATE",
        "BUILD_NUM",
        "MINOR_VERSION",
        "MAJOR_VERSION",
        "UPDATE_VERSION",
        "INTERNAL_VERSION",
    ],
    "features": [
        "TDX_FEATURES0"
    ],
    "tdmr": [
        "MAX_TDMRS",
        "MAX_RESERVED_PER_TDMR",
        "PAMT_4K_ENTRY_SIZE",
        "PAMT_2M_ENTRY_SIZE",
        "PAMT_1G_ENTRY_SIZE",
    ],
    "cmr": [
        "NUM_CMRS", "CMR_BASE", "CMR_SIZE"
    ],
#   "td_ctrl": [
#        "TDR_BASE_SIZE",
#        "TDCS_BASE_SIZE",
#        "TDVPS_BASE_SIZE",
#    ],
#    "td_conf": [
#        "ATTRIBUTES_FIXED0",
#        "ATTRIBUTES_FIXED1",
#        "XFAM_FIXED0",
#        "XFAM_FIXED1",
#        "NUM_CPUID_CONFIG",
#        "MAX_VCPUS_PER_TD",
#        "CPUID_CONFIG_LEAVES",
#        "CPUID_CONFIG_VALUES",
#    ],
}

def print_class_struct_field(field_name, element_bytes, num_fields,
num_elements, file):
    element_type = "u%s" % (element_bytes * 8)
    element_array = ""
    if num_fields > 1:
        element_array += "[%d]" % (num_fields)
    if num_elements > 1:
        element_array += "[%d]" % (num_elements)
    print("\t%s %s%s;" % (element_type, field_name, element_array), file=file)

def print_class_struct(class_name, fields, file):
    struct_name = "tdx_sys_info_%s" % (class_name)
    print("struct %s {" % (struct_name), file=file)
    for f in fields:
        print_class_struct_field(
            f["Field Name"].lower(),
            int(f["Element Size (Bytes)"]),
            int(f["Num Fields"]),
            int(f["Num Elements"]),
            file=file)
    print("};", file=file)

def print_read_field(field_id, struct_var, struct_member, indent, file):
    print(
        "%sif (!ret && !(ret = read_sys_metadata_field(%s, &val)))\n%s\t%s->%s =
val;"
        % (indent, field_id, indent, struct_var, struct_member),
        file=file,
    )

def print_class_function(class_name, fields, file):
    func_name = "get_tdx_sys_info_%s" % (class_name)
    struct_name = "tdx_sys_info_%s" % (class_name)
    struct_var = "sysinfo_%s" % (class_name)

    print("static int %s(struct %s *%s)" % (func_name, struct_name, struct_var),
file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print("\tu64 val;", file=file)

    has_i = 0
    has_j = 0
    for f in fields:
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        if num_fields > 1:
            has_i = 1
        if num_elements > 1:
            has_j = 1

    if has_i == 1 and has_j == 1:
        print("\tint i, j;", file=file)
    elif has_i == 1:
        print("\tint i;", file=file)

    print(file=file)
    for f in fields:
        fname = f["Field Name"]
        field_id = f["Base FIELD_ID (Hex)"]
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        struct_member = fname.lower()
        indent = "\t"
        if num_fields > 1:
            if fname == "CMR_BASE" or fname == "CMR_SIZE":
                limit = "sysinfo_cmr->num_cmrs"
            elif fname == "CPUID_CONFIG_LEAVES" or fname ==
"CPUID_CONFIG_VALUES":
                limit = "sysinfo_td_conf->num_cpuid_config"
            else:
                limit = "%d" %(num_fields)
            print("%sfor (i = 0; i < %s; i++)" % (indent, limit), file=file)
            indent += "\t"
            field_id += " + i"
            struct_member += "[i]"
        if num_elements > 1:
            print("%sfor (j = 0; j < %d; j++)" % (indent, num_elements),
file=file)
            indent += "\t"
            field_id += " * 2 + j"
            struct_member += "[j]"

        print_read_field(
            field_id,
            struct_var,
            struct_member,
            indent,
            file=file,
        )

    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

def print_main_struct(file):
    print("struct tdx_sys_info {", file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        struct_name = "tdx_sys_info_%s" % (class_name)
        struct_var = class_name
        print("\tstruct %s %s;" % (struct_name, struct_var), file=file)
    print("};", file=file)

def print_main_function(file):
    print("static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)",
file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print(file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        func_name = "get_tdx_sys_info_" + class_name
        struct_var = class_name
        print("\tret = ret ?: %s(&sysinfo->%s);" % (func_name, struct_var),
file=file)
    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

jsonfile = sys.argv[1]
hfile = sys.argv[2]
cfile = sys.argv[3]
hfileifdef = hfile.replace(".", "_")

with open(jsonfile, "r") as f:
    json_in = json.load(f)
    fields = {x["Field Name"]: x for x in json_in["Fields"]}

with open(hfile, "w") as f:
    print("/* SPDX-License-Identifier: GPL-2.0 */", file=f)
    print("/* Automatically generated TDX global metadata structures. */",
file=f)
    print("#ifndef _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print("#define _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print(file=f)
    print("#include <linux/types.h>", file=f)
    print(file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print_class_struct(class_name, [fields[x] for x in field_names], file=f)
        print(file=f)
    print_main_struct(file=f)
    print(file=f)
    print("#endif", file=f)

with open(cfile, "w") as f:
    print("// SPDX-License-Identifier: GPL-2.0", file=f)
    print("/*", file=f)
    print(" * Automatically generated functions to read TDX global metadata.",
file=f)
    print(" *", file=f)
    print(" * This file doesn't compile on its own as it lacks of inclusion",
file=f)
    print(" * of SEAMCALL wrapper primitive which reads global metadata.",
file=f)
    print(" * Include this file to other C file instead.", file=f)
    print(" */", file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_class_function(class_name, [fields[x] for x in field_names],
file=f)
    print(file=f)
    print_main_function(file=f)





[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: tdx.py --]
[-- Type: text/x-python3; name="tdx.py", Size: 6530 bytes --]

#! /usr/bin/env python3
import json
import sys

# Note: this script does not run as part of the build process.
# It is used to generate structs from the TDX global_metadata.json
# file, and functions to fill in said structs.  Rerun it if
# you need more fields.

TDX_STRUCTS = {
    "version": [
        "BUILD_DATE",
        "BUILD_NUM",
        "MINOR_VERSION",
        "MAJOR_VERSION",
        "UPDATE_VERSION",
        "INTERNAL_VERSION",
    ],
    "features": [
        "TDX_FEATURES0"
    ],
    "tdmr": [
        "MAX_TDMRS",
        "MAX_RESERVED_PER_TDMR",
        "PAMT_4K_ENTRY_SIZE",
        "PAMT_2M_ENTRY_SIZE",
        "PAMT_1G_ENTRY_SIZE",
    ],
    "cmr": [
        "NUM_CMRS", "CMR_BASE", "CMR_SIZE"
    ],
#   "td_ctrl": [
#        "TDR_BASE_SIZE",
#        "TDCS_BASE_SIZE",
#        "TDVPS_BASE_SIZE",
#    ],
#    "td_conf": [
#        "ATTRIBUTES_FIXED0",
#        "ATTRIBUTES_FIXED1",
#        "XFAM_FIXED0",
#        "XFAM_FIXED1",
#        "NUM_CPUID_CONFIG",
#        "MAX_VCPUS_PER_TD",
#        "CPUID_CONFIG_LEAVES",
#        "CPUID_CONFIG_VALUES",
#    ],
}

def print_class_struct_field(field_name, element_bytes, num_fields, num_elements, file):
    element_type = "u%s" % (element_bytes * 8)
    element_array = ""
    if num_fields > 1:
        element_array += "[%d]" % (num_fields)
    if num_elements > 1:
        element_array += "[%d]" % (num_elements)
    print("\t%s %s%s;" % (element_type, field_name, element_array), file=file)

def print_class_struct(class_name, fields, file):
    struct_name = "tdx_sys_info_%s" % (class_name)
    print("struct %s {" % (struct_name), file=file)
    for f in fields:
        print_class_struct_field(
            f["Field Name"].lower(),
            int(f["Element Size (Bytes)"]),
            int(f["Num Fields"]),
            int(f["Num Elements"]),
            file=file)
    print("};", file=file)

def print_read_field(field_id, struct_var, struct_member, indent, file):
    print(
        "%sif (!ret && !(ret = read_sys_metadata_field(%s, &val)))\n%s\t%s->%s = val;"
        % (indent, field_id, indent, struct_var, struct_member),
        file=file,
    )

def print_class_function(class_name, fields, file):
    func_name = "get_tdx_sys_info_%s" % (class_name)
    struct_name = "tdx_sys_info_%s" % (class_name)
    struct_var = "sysinfo_%s" % (class_name)

    print("static int %s(struct %s *%s)" % (func_name, struct_name, struct_var), file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print("\tu64 val;", file=file)

    has_i = 0
    has_j = 0
    for f in fields:
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        if num_fields > 1:
            has_i = 1
        if num_elements > 1:
            has_j = 1

    if has_i == 1 and has_j == 1:
        print("\tint i, j;", file=file)
    elif has_i == 1:
        print("\tint i;", file=file)

    print(file=file)
    for f in fields:
        fname = f["Field Name"]
        field_id = f["Base FIELD_ID (Hex)"]
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        struct_member = fname.lower()
        indent = "\t"
        if num_fields > 1:
            if fname == "CMR_BASE" or fname == "CMR_SIZE":
                limit = "sysinfo_cmr->num_cmrs"
            elif fname == "CPUID_CONFIG_LEAVES" or fname == "CPUID_CONFIG_VALUES":
                limit = "sysinfo_td_conf->num_cpuid_config"
            else:
                limit = "%d" %(num_fields)
            print("%sfor (i = 0; i < %s; i++)" % (indent, limit), file=file)
            indent += "\t"
            field_id += " + i"
            struct_member += "[i]"
        if num_elements > 1:
            print("%sfor (j = 0; j < %d; j++)" % (indent, num_elements), file=file)
            indent += "\t"
            field_id += " * 2 + j"
            struct_member += "[j]"

        print_read_field(
            field_id,
            struct_var,
            struct_member,
            indent,
            file=file,
        )

    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

def print_main_struct(file):
    print("struct tdx_sys_info {", file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        struct_name = "tdx_sys_info_%s" % (class_name)
        struct_var = class_name
        print("\tstruct %s %s;" % (struct_name, struct_var), file=file)
    print("};", file=file)

def print_main_function(file):
    print("static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)", file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print(file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        func_name = "get_tdx_sys_info_" + class_name
        struct_var = class_name
        print("\tret = ret ?: %s(&sysinfo->%s);" % (func_name, struct_var), file=file)
    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

jsonfile = sys.argv[1]
hfile = sys.argv[2]
cfile = sys.argv[3]
hfileifdef = hfile.replace(".", "_")

with open(jsonfile, "r") as f:
    json_in = json.load(f)
    fields = {x["Field Name"]: x for x in json_in["Fields"]}

with open(hfile, "w") as f:
    print("/* SPDX-License-Identifier: GPL-2.0 */", file=f)
    print("/* Automatically generated TDX global metadata structures. */", file=f)
    print("#ifndef _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print("#define _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print(file=f)
    print("#include <linux/types.h>", file=f)
    print(file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print_class_struct(class_name, [fields[x] for x in field_names], file=f)
        print(file=f)
    print_main_struct(file=f)
    print(file=f)
    print("#endif", file=f)

with open(cfile, "w") as f:
    print("// SPDX-License-Identifier: GPL-2.0", file=f)
    print("/*", file=f)
    print(" * Automatically generated functions to read TDX global metadata.", file=f)
    print(" *", file=f)
    print(" * This file doesn't compile on its own as it lacks of inclusion", file=f)
    print(" * of SEAMCALL wrapper primitive which reads global metadata.", file=f)
    print(" * Include this file to other C file instead.", file=f)
    print(" */", file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_class_function(class_name, [fields[x] for x in field_names], file=f)
    print(file=f)
    print_main_function(file=f)

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #3: tdx_global_metadata.c --]
[-- Type: text/x-csrc; name="tdx_global_metadata.c", Size: 2873 bytes --]

// SPDX-License-Identifier: GPL-2.0
/*
 * Automatically generated functions to read TDX global metadata.
 *
 * This file doesn't compile on its own as it lacks of inclusion
 * of SEAMCALL wrapper primitive which reads global metadata.
 * Include this file to other C file instead.
 */

static int get_tdx_sys_info_version(struct tdx_sys_info_version *sysinfo_version)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x8800000200000001, &val)))
		sysinfo_version->build_date = val;
	if (!ret && !(ret = read_sys_metadata_field(0x8800000100000002, &val)))
		sysinfo_version->build_num = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000003, &val)))
		sysinfo_version->minor_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000004, &val)))
		sysinfo_version->major_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000005, &val)))
		sysinfo_version->update_version = val;
	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000006, &val)))
		sysinfo_version->internal_version = val;

	return ret;
}

static int get_tdx_sys_info_features(struct tdx_sys_info_features *sysinfo_features)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x0A00000300000008, &val)))
		sysinfo_features->tdx_features0 = val;

	return ret;
}

static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000008, &val)))
		sysinfo_tdmr->max_tdmrs = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000009, &val)))
		sysinfo_tdmr->max_reserved_per_tdmr = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000010, &val)))
		sysinfo_tdmr->pamt_4k_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000011, &val)))
		sysinfo_tdmr->pamt_2m_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
		sysinfo_tdmr->pamt_1g_entry_size = val;

	return ret;
}

static int get_tdx_sys_info_cmr(struct tdx_sys_info_cmr *sysinfo_cmr)
{
	int ret = 0;
	u64 val;
	int i;

	if (!ret && !(ret = read_sys_metadata_field(0x9000000100000000, &val)))
		sysinfo_cmr->num_cmrs = val;
	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000080 + i, &val)))
			sysinfo_cmr->cmr_base[i] = val;
	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000100 + i, &val)))
			sysinfo_cmr->cmr_size[i] = val;

	return ret;
}

static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
{
	int ret = 0;

	ret = ret ?: get_tdx_sys_info_version(&sysinfo->version);
	ret = ret ?: get_tdx_sys_info_features(&sysinfo->features);
	ret = ret ?: get_tdx_sys_info_tdmr(&sysinfo->tdmr);
	ret = ret ?: get_tdx_sys_info_cmr(&sysinfo->cmr);

	return ret;
}

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #4: tdx_global_metadata.h --]
[-- Type: text/x-chdr; name="tdx_global_metadata.h", Size: 858 bytes --]

/* SPDX-License-Identifier: GPL-2.0 */
/* Automatically generated TDX global metadata structures. */
#ifndef _X86_VIRT_TDX_AUTO_GENERATED_TDX_GLOBAL_METADATA_H
#define _X86_VIRT_TDX_AUTO_GENERATED_TDX_GLOBAL_METADATA_H

#include <linux/types.h>

struct tdx_sys_info_version {
	u32 build_date;
	u16 build_num;
	u16 minor_version;
	u16 major_version;
	u16 update_version;
	u16 internal_version;
};

struct tdx_sys_info_features {
	u64 tdx_features0;
};

struct tdx_sys_info_tdmr {
	u16 max_tdmrs;
	u16 max_reserved_per_tdmr;
	u16 pamt_4k_entry_size;
	u16 pamt_2m_entry_size;
	u16 pamt_1g_entry_size;
};

struct tdx_sys_info_cmr {
	u16 num_cmrs;
	u64 cmr_base[32];
	u64 cmr_size[32];
};

struct tdx_sys_info {
	struct tdx_sys_info_version version;
	struct tdx_sys_info_features features;
	struct tdx_sys_info_tdmr tdmr;
	struct tdx_sys_info_cmr cmr;
};

#endif

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2024-10-28 12:07 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-14 11:31 [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
2024-10-14 11:31 ` [PATCH v5 1/8] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
2024-10-14 11:31 ` [PATCH v5 2/8] x86/virt/tdx: Rework TD_SYSINFO_MAP to support build-time verification Kai Huang
2024-10-14 15:56   ` Dave Hansen
2024-10-14 19:13     ` Dan Williams
2024-10-15 11:34       ` Huang, Kai
2024-10-14 11:31 ` [PATCH v5 3/8] x86/virt/tdx: Prepare to support reading other global metadata fields Kai Huang
2024-10-14 11:31 ` [PATCH v5 4/8] x86/virt/tdx: Refine a comment to reflect the latest TDX spec Kai Huang
2024-10-14 11:31 ` [PATCH v5 5/8] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
2024-10-14 11:31 ` [PATCH v5 6/8] x86/virt/tdx: Print TDX module version Kai Huang
2024-10-14 11:31 ` [PATCH v5 7/8] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
2024-10-14 11:31 ` [PATCH v5 8/8] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
2024-10-15 15:30 ` [PATCH v5 0/8] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
2024-10-15 16:29   ` Paolo Bonzini
2024-10-15 19:04     ` Dan Williams
2024-10-15 21:11       ` Huang, Kai
2024-10-28 12:07     ` Huang, Kai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).