public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
@ 2024-11-11 10:39 Kai Huang
  2024-11-11 10:39 ` [PATCH v7 01/10] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

This series does necessary tweaks to TDX host "global metadata" reading
code to fix some immediate issues in the TDX module initialization code,
with intention to also provide a flexible code base to support sharing
global metadata to KVM (and other kernel components) for future needs.

This series, and additional patches to initialize TDX when loading KVM
module and read essential metadata fields for KVM TDX can be found at
[1].

Hi Dave (and maintainers),

This series targets x86 tip.  Also add Dan, KVM maintainers and KVM list
so people can also review and comment.

This is a pre-work of the "quite near future" KVM TDX support.  I
appreciate if you can review, comment and take this series if the
patches look good to you.

History:

v6 -> v7:
 - Collect tags from Dan and Nikolay (Thanks!)
 - Address nit comments from Dan in patch 3 changelog.
 - Rebase to tip/x86/tdx

v5 -> v6:
 - Change to use a script [*] to auto-generate metadata reading code.

  - https://lore.kernel.org/kvm/f25673ea-08c5-474b-a841-095656820b67@intel.com/
  - https://lore.kernel.org/kvm/CABgObfYXUxqQV_FoxKjC8U3t5DnyM45nz5DpTxYZv2x_uFK_Kw@mail.gmail.com/

   Per Dave, this patchset doesn't contain a patch to add the script
   to the kernel tree but append it in this cover letter in order to
   minimize the review effort.

 - Change to use auto-generated code to read TDX module version,
   supported features and CMRs in one patch, and made that from and
   signed by Paolo.
 - Couple of new patches due to using the auto-generated code
 - Remove the "reading metadata" part (due to they are auto-generated
   in one patch now) from the consumer patches.

Pervious versions and more background please see:

 - https://lore.kernel.org/kvm/9a06e2cf469cbca2777ac2c4ef70579e6bb934d5.camel@intel.com/T/

[1]: https://github.com/intel/tdx/tree/kvm-tdxinit-host-metadata-v7

[*] The script used to generate the patch 3:

#! /usr/bin/env python3
import json
import sys

# Note: this script does not run as part of the build process.
# It is used to generate structs from the TDX global_metadata.json
# file, and functions to fill in said structs.  Rerun it if
# you need more fields.

TDX_STRUCTS = {
    "version": [
        "BUILD_DATE",
        "BUILD_NUM",
        "MINOR_VERSION",
        "MAJOR_VERSION",
        "UPDATE_VERSION",
        "INTERNAL_VERSION",
    ],
    "features": [
        "TDX_FEATURES0"
    ],
    "tdmr": [
        "MAX_TDMRS",
        "MAX_RESERVED_PER_TDMR",
        "PAMT_4K_ENTRY_SIZE",
        "PAMT_2M_ENTRY_SIZE",
        "PAMT_1G_ENTRY_SIZE",
    ],
    "cmr": [
        "NUM_CMRS", "CMR_BASE", "CMR_SIZE"
    ],
}

STRUCT_PREFIX = "tdx_sys_info"
FUNC_PREFIX = "get_tdx_sys_info"
STRVAR_PREFIX = "sysinfo"

def print_class_struct_field(field_name, element_bytes, num_fields, num_elements, file):
    element_type = "u%s" % (element_bytes * 8)
    element_array = ""
    if num_fields > 1:
        element_array += "[%d]" % (num_fields)
    if num_elements > 1:
        element_array += "[%d]" % (num_elements)
    print("\t%s %s%s;" % (element_type, field_name, element_array), file=file)

def print_class_struct(class_name, fields, file):
    struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
    print("struct %s {" % (struct_name), file=file)
    for f in fields:
        print_class_struct_field(
            f["Field Name"].lower(),
            int(f["Element Size (Bytes)"]),
            int(f["Num Fields"]),
            int(f["Num Elements"]),
            file=file)
    print("};", file=file)

def print_read_field(field_id, struct_var, struct_member, indent, file):
    print(
        "%sif (!ret && !(ret = read_sys_metadata_field(%s, &val)))\n%s\t%s->%s = val;"
        % (indent, field_id, indent, struct_var, struct_member),
        file=file,
    )

def print_class_function(class_name, fields, file):
    func_name = "%s_%s" % (FUNC_PREFIX, class_name)
    struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
    struct_var = "%s_%s" % (STRVAR_PREFIX, class_name)

    print("static int %s(struct %s *%s)" % (func_name, struct_name, struct_var), file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print("\tu64 val;", file=file)

    has_i = 0
    has_j = 0
    for f in fields:
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        if num_fields > 1:
            has_i = 1
        if num_elements > 1:
            has_j = 1

    if has_i == 1 and has_j == 1:
        print("\tint i, j;", file=file)
    elif has_i == 1:
        print("\tint i;", file=file)

    print(file=file)
    for f in fields:
        fname = f["Field Name"]
        field_id = f["Base FIELD_ID (Hex)"]
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        struct_member = fname.lower()
        indent = "\t"
        if num_fields > 1:
            if fname == "CMR_BASE" or fname == "CMR_SIZE":
                limit = "%s_%s->num_cmrs" %(STRVAR_PREFIX, "cmr")
            elif fname == "CPUID_CONFIG_LEAVES" or fname == "CPUID_CONFIG_VALUES":
                limit = "%s_%s->num_cpuid_config" %(STRVAR_PREFIX, "td_conf")
            else:
                limit = "%d" %(num_fields)
            print("%sfor (i = 0; i < %s; i++)" % (indent, limit), file=file)
            indent += "\t"
            field_id += " + i"
            struct_member += "[i]"
        if num_elements > 1:
            print("%sfor (j = 0; j < %d; j++)" % (indent, num_elements), file=file)
            indent += "\t"
            field_id += " * 2 + j"
            struct_member += "[j]"

        print_read_field(
            field_id,
            struct_var,
            struct_member,
            indent,
            file=file,
        )

    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

def print_main_struct(file):
    print("struct tdx_sys_info {", file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
        struct_var = class_name
        print("\tstruct %s %s;" % (struct_name, struct_var), file=file)
    print("};", file=file)

def print_main_function(file):
    print("static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)", file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print(file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        func_name = "%s_%s" % (FUNC_PREFIX, class_name)
        struct_var = class_name
        print("\tret = ret ?: %s(&sysinfo->%s);" % (func_name, struct_var), file=file)
    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

jsonfile = sys.argv[1]
hfile = sys.argv[2]
cfile = sys.argv[3]
hfileifdef = hfile.replace(".", "_")

with open(jsonfile, "r") as f:
    json_in = json.load(f)
    fields = {x["Field Name"]: x for x in json_in["Fields"]}

with open(hfile, "w") as f:
    print("/* SPDX-License-Identifier: GPL-2.0 */", file=f)
    print("/* Automatically generated TDX global metadata structures. */", file=f)
    print("#ifndef _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print("#define _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print(file=f)
    print("#include <linux/types.h>", file=f)
    print(file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print_class_struct(class_name, [fields[x] for x in field_names], file=f)
        print(file=f)
    print_main_struct(file=f)
    print(file=f)
    print("#endif", file=f)

with open(cfile, "w") as f:
    print("// SPDX-License-Identifier: GPL-2.0", file=f)
    print("/*", file=f)
    print(" * Automatically generated functions to read TDX global metadata.", file=f)
    print(" *", file=f)
    print(" * This file doesn't compile on its own as it lacks of inclusion", file=f)
    print(" * of SEAMCALL wrapper primitive which reads global metadata.", file=f)
    print(" * Include this file to other C file instead.", file=f)
    print(" */", file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_class_function(class_name, [fields[x] for x in field_names], file=f)
    print(file=f)
    print_main_function(file=f)





Kai Huang (9):
  x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec
    better
  x86/virt/tdx: Start to track all global metadata in one structure
  x86/virt/tdx: Use dedicated struct members for PAMT entry sizes
  x86/virt/tdx: Add missing header file inclusion to local tdx.h
  x86/virt/tdx: Switch to use auto-generated global metadata reading
    code
  x86/virt/tdx: Trim away tail null CMRs
  x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find
    memory holes
  x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD
    mitigation
  x86/virt/tdx: Print TDX module version

Paolo Bonzini (1):
  x86/virt/tdx: Use auto-generated code to read global metadata

 arch/x86/virt/vmx/tdx/tdx.c                 | 178 ++++++++++++--------
 arch/x86/virt/vmx/tdx/tdx.h                 |  43 +----
 arch/x86/virt/vmx/tdx/tdx_global_metadata.c |  89 ++++++++++
 arch/x86/virt/vmx/tdx/tdx_global_metadata.h |  42 +++++
 4 files changed, 247 insertions(+), 105 deletions(-)
 create mode 100644 arch/x86/virt/vmx/tdx/tdx_global_metadata.c
 create mode 100644 arch/x86/virt/vmx/tdx/tdx_global_metadata.h


base-commit: 7ae15e2f69bad06527668b478dff7c099ad2e6ae
-- 
2.46.2


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v7 01/10] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 02/10] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The TDX module provides a set of "Global Metadata Fields".  They report
things like TDX module version, supported features, and fields related
to create/run TDX guests and so on.

TDX organizes those metadata fields by "Classes" based on the meaning of
those fields.  E.g., for now the kernel only reads "TD Memory Region"
(TDMR) related fields for module initialization.  Those fields are
defined under class "TDMR Info".

There are both immediate needs to read more metadata fields for module
initialization and near-future needs for other kernel components like
KVM to run TDX guests.  To meet all those requirements, the idea is the
TDX host core-kernel to provide a centralized, canonical, and read-only
structure for the global metadata that comes out from the TDX module for
all kernel components to use.

More specifically, the target is to end up with something like:

       struct tdx_sys_info {
	       struct tdx_sys_info_classA a;
	       struct tdx_sys_info_classB b;
	       ...
       };

Currently the kernel organizes all fields under "TDMR Info" class in
'struct tdx_tdmr_sysinfo'.  To prepare for the above target, rename the
structure to 'struct tdx_sys_info_tdmr' to follow the class name better.

No functional change intended.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 36 ++++++++++++++++++------------------
 arch/x86/virt/vmx/tdx/tdx.h |  2 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 4e2b2e2ac9f9..e979bf442929 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -272,7 +272,7 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 
 static int read_sys_metadata_field16(u64 field_id,
 				     int offset,
-				     struct tdx_tdmr_sysinfo *ts)
+				     struct tdx_sys_info_tdmr *ts)
 {
 	u16 *ts_member = ((void *)ts) + offset;
 	u64 tmp;
@@ -298,9 +298,9 @@ struct field_mapping {
 
 #define TD_SYSINFO_MAP(_field_id, _offset) \
 	{ .field_id = MD_FIELD_ID_##_field_id,	   \
-	  .offset   = offsetof(struct tdx_tdmr_sysinfo, _offset) }
+	  .offset   = offsetof(struct tdx_sys_info_tdmr, _offset) }
 
-/* Map TD_SYSINFO fields into 'struct tdx_tdmr_sysinfo': */
+/* Map TD_SYSINFO fields into 'struct tdx_sys_info_tdmr': */
 static const struct field_mapping fields[] = {
 	TD_SYSINFO_MAP(MAX_TDMRS,	      max_tdmrs),
 	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr),
@@ -309,16 +309,16 @@ static const struct field_mapping fields[] = {
 	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]),
 };
 
-static int get_tdx_tdmr_sysinfo(struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret;
 	int i;
 
-	/* Populate 'tdmr_sysinfo' fields using the mapping structure above: */
+	/* Populate 'sysinfo_tdmr' fields using the mapping structure above: */
 	for (i = 0; i < ARRAY_SIZE(fields); i++) {
 		ret = read_sys_metadata_field16(fields[i].field_id,
 						fields[i].offset,
-						tdmr_sysinfo);
+						sysinfo_tdmr);
 		if (ret)
 			return ret;
 	}
@@ -342,13 +342,13 @@ static int tdmr_size_single(u16 max_reserved_per_tdmr)
 }
 
 static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
-			   struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	size_t tdmr_sz, tdmr_array_sz;
 	void *tdmr_array;
 
-	tdmr_sz = tdmr_size_single(tdmr_sysinfo->max_reserved_per_tdmr);
-	tdmr_array_sz = tdmr_sz * tdmr_sysinfo->max_tdmrs;
+	tdmr_sz = tdmr_size_single(sysinfo_tdmr->max_reserved_per_tdmr);
+	tdmr_array_sz = tdmr_sz * sysinfo_tdmr->max_tdmrs;
 
 	/*
 	 * To keep things simple, allocate all TDMRs together.
@@ -367,7 +367,7 @@ static int alloc_tdmr_list(struct tdmr_info_list *tdmr_list,
 	 * at a given index in the TDMR list.
 	 */
 	tdmr_list->tdmr_sz = tdmr_sz;
-	tdmr_list->max_tdmrs = tdmr_sysinfo->max_tdmrs;
+	tdmr_list->max_tdmrs = sysinfo_tdmr->max_tdmrs;
 	tdmr_list->nr_consumed_tdmrs = 0;
 
 	return 0;
@@ -921,11 +921,11 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
 /*
  * Construct a list of TDMRs on the preallocated space in @tdmr_list
  * to cover all TDX memory regions in @tmb_list based on the TDX module
- * TDMR global information in @tdmr_sysinfo.
+ * TDMR global information in @sysinfo_tdmr.
  */
 static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
-			   struct tdx_tdmr_sysinfo *tdmr_sysinfo)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
 	int ret;
 
@@ -934,12 +934,12 @@ static int construct_tdmrs(struct list_head *tmb_list,
 		return ret;
 
 	ret = tdmrs_set_up_pamt_all(tdmr_list, tmb_list,
-			tdmr_sysinfo->pamt_entry_size);
+			sysinfo_tdmr->pamt_entry_size);
 	if (ret)
 		return ret;
 
 	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, tmb_list,
-			tdmr_sysinfo->max_reserved_per_tdmr);
+			sysinfo_tdmr->max_reserved_per_tdmr);
 	if (ret)
 		tdmrs_free_pamt_all(tdmr_list);
 
@@ -1098,7 +1098,7 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list)
 
 static int init_tdx_module(void)
 {
-	struct tdx_tdmr_sysinfo tdmr_sysinfo;
+	struct tdx_sys_info_tdmr sysinfo_tdmr;
 	int ret;
 
 	/*
@@ -1117,17 +1117,17 @@ static int init_tdx_module(void)
 	if (ret)
 		goto out_put_tdxmem;
 
-	ret = get_tdx_tdmr_sysinfo(&tdmr_sysinfo);
+	ret = get_tdx_sys_info_tdmr(&sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Allocate enough space for constructing TDMRs */
-	ret = alloc_tdmr_list(&tdx_tdmr_list, &tdmr_sysinfo);
+	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &tdmr_sysinfo);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo_tdmr);
 	if (ret)
 		goto err_free_tdmrs;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index b701f69485d3..148f9b4d1140 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -100,7 +100,7 @@ struct tdx_memblock {
 };
 
 /* "TDMR info" part of "Global Scope Metadata" for constructing TDMRs */
-struct tdx_tdmr_sysinfo {
+struct tdx_sys_info_tdmr {
 	u16 max_tdmrs;
 	u16 max_reserved_per_tdmr;
 	u16 pamt_entry_size[TDX_PS_NR];
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 02/10] x86/virt/tdx: Start to track all global metadata in one structure
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
  2024-11-11 10:39 ` [PATCH v7 01/10] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 03/10] x86/virt/tdx: Use auto-generated code to read global metadata Kai Huang
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

The TDX module provides a set of "Global Metadata Fields".  They report
things like TDX module version, supported features, and fields related
to create/run TDX guests and so on.

Currently the kernel only reads "TD Memory Region" (TDMR) related fields
for module initialization.  There are immediate needs which require the
TDX module initialization to read more global metadata including module
version, supported features and "Convertible Memory Regions" (CMRs).

Also, KVM will need to read more metadata fields to support baseline TDX
guests.  In the longer term, other TDX features like TDX Connect (which
supports assigning trusted IO devices to TDX guest) may also require
other kernel components such as pci/vt-d to access global metadata.

To meet all those requirements, the idea is the TDX host core-kernel to
to provide a centralized, canonical, and read-only structure for the
global metadata that comes out from the TDX module for all kernel
components to use.

As the first step, introduce a new 'struct tdx_sys_info' to track all
global metadata fields.

TDX categories global metadata fields into different "Classes".  E.g.,
the TDMR related fields are under class "TDMR Info".  Instead of making
'struct tdx_sys_info' a plain structure to contain all metadata fields,
organize them in smaller structures based on the "Class".

This allows those metadata fields to be used in finer granularity thus
makes the code more clear.  E.g., the construct_tdmr() can just take the
structure which contains "TDMR Info" metadata fields.

Add a new function get_tdx_sys_info() as the placeholder to read all
metadata fields, and call it at the beginning of init_tdx_module().  For
now it only calls get_tdx_sys_info_tdmr() to read TDMR related fields.

Note there is a functional change: get_tdx_sys_info_tdmr() is moved from
after build_tdx_memlist() to before it, but it is fine to do so.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 19 ++++++++++++-------
 arch/x86/virt/vmx/tdx/tdx.h | 19 ++++++++++++-------
 2 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index e979bf442929..7a2f979092e7 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -326,6 +326,11 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 	return 0;
 }
 
+static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
+{
+	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1098,9 +1103,13 @@ static int init_tdmrs(struct tdmr_info_list *tdmr_list)
 
 static int init_tdx_module(void)
 {
-	struct tdx_sys_info_tdmr sysinfo_tdmr;
+	struct tdx_sys_info sysinfo;
 	int ret;
 
+	ret = get_tdx_sys_info(&sysinfo);
+	if (ret)
+		return ret;
+
 	/*
 	 * To keep things simple, assume that all TDX-protected memory
 	 * will come from the page allocator.  Make sure all pages in the
@@ -1117,17 +1126,13 @@ static int init_tdx_module(void)
 	if (ret)
 		goto out_put_tdxmem;
 
-	ret = get_tdx_sys_info_tdmr(&sysinfo_tdmr);
-	if (ret)
-		goto err_free_tdxmem;
-
 	/* Allocate enough space for constructing TDMRs */
-	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo_tdmr);
+	ret = alloc_tdmr_list(&tdx_tdmr_list, &sysinfo.tdmr);
 	if (ret)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo_tdmr);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr);
 	if (ret)
 		goto err_free_tdmrs;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 148f9b4d1140..2600ec3752f5 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -80,6 +80,18 @@ struct tdmr_info {
 	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
 } __packed __aligned(TDMR_INFO_ALIGNMENT);
 
+/* Class "TDMR info" */
+struct tdx_sys_info_tdmr {
+	u16 max_tdmrs;
+	u16 max_reserved_per_tdmr;
+	u16 pamt_entry_size[TDX_PS_NR];
+};
+
+/* Kernel used global metadata fields */
+struct tdx_sys_info {
+	struct tdx_sys_info_tdmr tdmr;
+};
+
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!
@@ -99,13 +111,6 @@ struct tdx_memblock {
 	int nid;
 };
 
-/* "TDMR info" part of "Global Scope Metadata" for constructing TDMRs */
-struct tdx_sys_info_tdmr {
-	u16 max_tdmrs;
-	u16 max_reserved_per_tdmr;
-	u16 pamt_entry_size[TDX_PS_NR];
-};
-
 /* Warn if kernel has less than TDMR_NR_WARN TDMRs after allocation */
 #define TDMR_NR_WARN 4
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 03/10] x86/virt/tdx: Use auto-generated code to read global metadata
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
  2024-11-11 10:39 ` [PATCH v7 01/10] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
  2024-11-11 10:39 ` [PATCH v7 02/10] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 04/10] x86/virt/tdx: Use dedicated struct members for PAMT entry sizes Kai Huang
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

From: Paolo Bonzini <pbonzini@redhat.com>

The TDX module provides a set of "Global Metadata Fields".  Currently
the kernel only reads "TD Memory Region" (TDMR) related fields for
module initialization.  There are needs to read more global metadata
fields including TDX module version [1], supported features [2] and
"Convertible Memory Regions" (CMRs) to fix a module initialization
failure [3].  Future changes to support KVM TDX and other features like
TDX Connect will need to read more.

The current global metadata reading code has limitations (e.g., it only
has a primitive helper to read metadata field with 16-bit element size,
while TDX supports 8/16/32/64 bits metadata element sizes).  It needs
tweaks in order to read more metadata fields.

But even with the tweaks, when new code is added to read a new field,
the reviewers will still need to review against the spec to make sure
the new code doesn't screw up things like using the wrong metadata
field ID (each metadata field is associated with a unique field ID,
which is a TDX-defined u64 constant) etc.

TDX documents all global metadata fields in a 'global_metadata.json'
file as part of TDX spec [4].  JSON format is machine readable.  Instead
of tweaking the metadata reading code, use a script [5] to generate the
code so that:

  1) Using the generated C is simple.
  2) Adding a field is simple, e.g., the script just pulls the field ID
     out of the JSON for a given field thus no manual review is needed.

Specifically, to match the layout of the 'struct tdx_sys_info' and its
sub-structures, the script uses a table with each entry containing the
the name of the sub-structures (which reflects the "Class") and the
"Field Name" of all its fields, and auto-generate:

  1) The 'struct tdx_sys_info' and all 'struct tdx_sys_info_xx'
     sub-structures in 'tdx_global_metadata.h'

  2) The main function 'get_tdx_sys_info()' which reads all metadata to
     'struct tdx_sys_info' and the 'get_tdx_sys_info_xx()' functions
     which read 'struct tdx_sys_info_xx()' in 'tdx_global_metadata.c'.

Using the generated C is simple: 1) include "tdx_global_metadata.h" to
the local "tdx.h"; 2) explicitly include "tdx_global_metadata.c" to the
local "tdx.c" after the read_sys_metadata_field() primitive (which is a
wrapper of TDH.SYS.RD SEAMCALL to read global metadata).

Adding a field is also simple: 1) just add the new field to an existing
structure, or add it with a new structure; 2) re-run the script to
generate the new code; 3) update the existing tdx_global_metadata.{hc}
with the new ones.

For now, use the auto-generated code to read the aforesaid metadata
fields: 1) TDX module version; 2) supported features; 3) CMRs.

Reading CMRs is more complicated than reading a simple field, since
there are two arrays containing the "CMR_BASE" and "CMR_SIZE" for each
CMR respectively.

TDX spec [3] section "Metadata Access Interface", sub-section "Arrays of
Metadata Fields" defines the way to read metadata fields in an array.
There's a "Base field ID" (say, X) for the array and the field ID for
entry array[i] is X + i.

For CMRs, the field "NUM_CMRS" reports the number of CMR entries that
can be read, and the code needs to use the value reported via "NUM_CMRS"
to loop despite the JSON file says the "Num Fields" of both "CMR_BASE"
and "CMR_SIZE" are 32.

The tdx_global_metadata.{hc} can be generated by running below:

 #python tdx_global_metadata.py global_metadata.json \
	tdx_global_metadata.h tdx_global_metadata.c

.. where the tdx_global_metadata.py can be found in [5] and the
global_metadata.json can be fetched from [4].

Link: https://lore.kernel.org/4b3adb59-50ea-419e-ad02-e19e8ca20dee@intel.com/ [1]
Link: https://lore.kernel.org/fc0e8ab7-86d4-4428-be31-82e1ece6dd21@intel.com/ [2]
Link: https://github.com/canonical/tdx/issues/135 [3]
Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [4]
Link: https://lore.kernel.org/20241031104433.855336-1-kai.huang@intel.com/ [5]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 89 +++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx_global_metadata.h | 42 ++++++++++
 2 files changed, 131 insertions(+)
 create mode 100644 arch/x86/virt/vmx/tdx/tdx_global_metadata.c
 create mode 100644 arch/x86/virt/vmx/tdx/tdx_global_metadata.h

diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
new file mode 100644
index 000000000000..2fe57e084453
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
@@ -0,0 +1,89 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Automatically generated functions to read TDX global metadata.
+ *
+ * This file doesn't compile on its own as it lacks of inclusion
+ * of SEAMCALL wrapper primitive which reads global metadata.
+ * Include this file to other C file instead.
+ */
+
+static int get_tdx_sys_info_version(struct tdx_sys_info_version *sysinfo_version)
+{
+	int ret = 0;
+	u64 val;
+
+	if (!ret && !(ret = read_sys_metadata_field(0x8800000200000001, &val)))
+		sysinfo_version->build_date = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x8800000100000002, &val)))
+		sysinfo_version->build_num = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000003, &val)))
+		sysinfo_version->minor_version = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000004, &val)))
+		sysinfo_version->major_version = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000005, &val)))
+		sysinfo_version->update_version = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x0800000100000006, &val)))
+		sysinfo_version->internal_version = val;
+
+	return ret;
+}
+
+static int get_tdx_sys_info_features(struct tdx_sys_info_features *sysinfo_features)
+{
+	int ret = 0;
+	u64 val;
+
+	if (!ret && !(ret = read_sys_metadata_field(0x0A00000300000008, &val)))
+		sysinfo_features->tdx_features0 = val;
+
+	return ret;
+}
+
+static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
+{
+	int ret = 0;
+	u64 val;
+
+	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000008, &val)))
+		sysinfo_tdmr->max_tdmrs = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000009, &val)))
+		sysinfo_tdmr->max_reserved_per_tdmr = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000010, &val)))
+		sysinfo_tdmr->pamt_4k_entry_size = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000011, &val)))
+		sysinfo_tdmr->pamt_2m_entry_size = val;
+	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
+		sysinfo_tdmr->pamt_1g_entry_size = val;
+
+	return ret;
+}
+
+static int get_tdx_sys_info_cmr(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int ret = 0;
+	u64 val;
+	int i;
+
+	if (!ret && !(ret = read_sys_metadata_field(0x9000000100000000, &val)))
+		sysinfo_cmr->num_cmrs = val;
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
+		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000080 + i, &val)))
+			sysinfo_cmr->cmr_base[i] = val;
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
+		if (!ret && !(ret = read_sys_metadata_field(0x9000000300000100 + i, &val)))
+			sysinfo_cmr->cmr_size[i] = val;
+
+	return ret;
+}
+
+static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
+{
+	int ret = 0;
+
+	ret = ret ?: get_tdx_sys_info_version(&sysinfo->version);
+	ret = ret ?: get_tdx_sys_info_features(&sysinfo->features);
+	ret = ret ?: get_tdx_sys_info_tdmr(&sysinfo->tdmr);
+	ret = ret ?: get_tdx_sys_info_cmr(&sysinfo->cmr);
+
+	return ret;
+}
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.h b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
new file mode 100644
index 000000000000..fde370b855f1
--- /dev/null
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Automatically generated TDX global metadata structures. */
+#ifndef _X86_VIRT_TDX_AUTO_GENERATED_TDX_GLOBAL_METADATA_H
+#define _X86_VIRT_TDX_AUTO_GENERATED_TDX_GLOBAL_METADATA_H
+
+#include <linux/types.h>
+
+struct tdx_sys_info_version {
+	u32 build_date;
+	u16 build_num;
+	u16 minor_version;
+	u16 major_version;
+	u16 update_version;
+	u16 internal_version;
+};
+
+struct tdx_sys_info_features {
+	u64 tdx_features0;
+};
+
+struct tdx_sys_info_tdmr {
+	u16 max_tdmrs;
+	u16 max_reserved_per_tdmr;
+	u16 pamt_4k_entry_size;
+	u16 pamt_2m_entry_size;
+	u16 pamt_1g_entry_size;
+};
+
+struct tdx_sys_info_cmr {
+	u16 num_cmrs;
+	u64 cmr_base[32];
+	u64 cmr_size[32];
+};
+
+struct tdx_sys_info {
+	struct tdx_sys_info_version version;
+	struct tdx_sys_info_features features;
+	struct tdx_sys_info_tdmr tdmr;
+	struct tdx_sys_info_cmr cmr;
+};
+
+#endif
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 04/10] x86/virt/tdx: Use dedicated struct members for PAMT entry sizes
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (2 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 03/10] x86/virt/tdx: Use auto-generated code to read global metadata Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 05/10] x86/virt/tdx: Add missing header file inclusion to local tdx.h Kai Huang
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Currently, the 'struct tdmr_sys_info_tdmr' which includes TDMR related
fields defines the PAMT entry sizes for TDX supported page sizes (4KB,
2MB and 1GB) as an array:

	struct tdx_sys_info_tdmr {
		...
		u16 pamt_entry_sizes[TDX_PS_NR];
	};

PAMT entry sizes are needed when allocating PAMTs for each TDMR.  Using
the array to contain PAMT entry sizes reduces the number of arguments
that need to be passed when calling tdmr_set_up_pamt().  It also makes
the code pattern like below clearer:

	for (pgsz = TDX_PS_4K; pgsz < TDX_PS_NR; pgsz++) {
		pamt_size[pgsz] = tdmr_get_pamt_sz(tdmr, pgsz,
					pamt_entry_size[pgsz]);
		tdmr_pamt_size += pamt_size[pgsz];
	}

However, the auto-generated metadata reading code generates a structure
member for each field.  The 'global_metadata.json' has a dedicated field
for each PAMT entry size, and the new 'struct tdx_sys_info_tdmr' looks
like:

	struct tdx_sys_info_tdmr {
		...
		u16 pamt_4k_entry_size;
		u16 pamt_2m_entry_size;
		u16 pamt_1g_entry_size;
	};

To prepare to use the auto-generated code, make the existing 'struct
tdx_sys_info_tdmr' look like the generated one.  But when passing to
tdmrs_set_up_pamt_all(), build a local array of PAMT entry sizes from
the structure so the code to allocate PAMTs can stay the same.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 14 +++++++++-----
 arch/x86/virt/vmx/tdx/tdx.h |  4 +++-
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 7a2f979092e7..28537a6c47fc 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -304,9 +304,9 @@ struct field_mapping {
 static const struct field_mapping fields[] = {
 	TD_SYSINFO_MAP(MAX_TDMRS,	      max_tdmrs),
 	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr),
-	TD_SYSINFO_MAP(PAMT_4K_ENTRY_SIZE,    pamt_entry_size[TDX_PS_4K]),
-	TD_SYSINFO_MAP(PAMT_2M_ENTRY_SIZE,    pamt_entry_size[TDX_PS_2M]),
-	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_entry_size[TDX_PS_1G]),
+	TD_SYSINFO_MAP(PAMT_4K_ENTRY_SIZE,    pamt_4k_entry_size),
+	TD_SYSINFO_MAP(PAMT_2M_ENTRY_SIZE,    pamt_2m_entry_size),
+	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_1g_entry_size),
 };
 
 static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
@@ -932,14 +932,18 @@ static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
 			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
 {
+	u16 pamt_entry_size[TDX_PS_NR] = {
+		sysinfo_tdmr->pamt_4k_entry_size,
+		sysinfo_tdmr->pamt_2m_entry_size,
+		sysinfo_tdmr->pamt_1g_entry_size,
+	};
 	int ret;
 
 	ret = fill_out_tdmrs(tmb_list, tdmr_list);
 	if (ret)
 		return ret;
 
-	ret = tdmrs_set_up_pamt_all(tdmr_list, tmb_list,
-			sysinfo_tdmr->pamt_entry_size);
+	ret = tdmrs_set_up_pamt_all(tdmr_list, tmb_list, pamt_entry_size);
 	if (ret)
 		return ret;
 
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 2600ec3752f5..ec879d54eb5c 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -84,7 +84,9 @@ struct tdmr_info {
 struct tdx_sys_info_tdmr {
 	u16 max_tdmrs;
 	u16 max_reserved_per_tdmr;
-	u16 pamt_entry_size[TDX_PS_NR];
+	u16 pamt_4k_entry_size;
+	u16 pamt_2m_entry_size;
+	u16 pamt_1g_entry_size;
 };
 
 /* Kernel used global metadata fields */
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 05/10] x86/virt/tdx: Add missing header file inclusion to local tdx.h
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (3 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 04/10] x86/virt/tdx: Use dedicated struct members for PAMT entry sizes Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 06/10] x86/virt/tdx: Switch to use auto-generated global metadata reading code Kai Huang
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Compiler attributes __packed and __aligned, and DECLARE_FLEX_ARRAY() are
currently used in arch/x86/virt/vmx/tdx/tdx.h, but the relevant headers
are not included explicitly.

There's no build issue in the current code since this "tdx.h" is only
included by arch/x86/virt/vmx/tdx/tdx.c and it includes bunch of other
<linux/xxx.h> before including "tdx.h".  But for the better explicitly
include the relevant headers to "tdx.h".  Also include <linux/types.h>
for basic variable types like u16.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index ec879d54eb5c..b1d705c3ab2a 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -2,6 +2,9 @@
 #ifndef _X86_VIRT_TDX_H
 #define _X86_VIRT_TDX_H
 
+#include <linux/types.h>
+#include <linux/compiler_attributes.h>
+#include <linux/stddef.h>
 #include <linux/bits.h>
 
 /*
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 06/10] x86/virt/tdx: Switch to use auto-generated global metadata reading code
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (4 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 05/10] x86/virt/tdx: Add missing header file inclusion to local tdx.h Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs Kai Huang
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Now the caller to read global metadata has been tweaked to be ready to
use auto-generated metadata reading code.  Switch to use it.

Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 61 +------------------------------------
 arch/x86/virt/vmx/tdx/tdx.h | 45 +--------------------------
 2 files changed, 2 insertions(+), 104 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 28537a6c47fc..43ec56db5084 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -270,66 +270,7 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 	return 0;
 }
 
-static int read_sys_metadata_field16(u64 field_id,
-				     int offset,
-				     struct tdx_sys_info_tdmr *ts)
-{
-	u16 *ts_member = ((void *)ts) + offset;
-	u64 tmp;
-	int ret;
-
-	if (WARN_ON_ONCE(MD_FIELD_ID_ELE_SIZE_CODE(field_id) !=
-			MD_FIELD_ID_ELE_SIZE_16BIT))
-		return -EINVAL;
-
-	ret = read_sys_metadata_field(field_id, &tmp);
-	if (ret)
-		return ret;
-
-	*ts_member = tmp;
-
-	return 0;
-}
-
-struct field_mapping {
-	u64 field_id;
-	int offset;
-};
-
-#define TD_SYSINFO_MAP(_field_id, _offset) \
-	{ .field_id = MD_FIELD_ID_##_field_id,	   \
-	  .offset   = offsetof(struct tdx_sys_info_tdmr, _offset) }
-
-/* Map TD_SYSINFO fields into 'struct tdx_sys_info_tdmr': */
-static const struct field_mapping fields[] = {
-	TD_SYSINFO_MAP(MAX_TDMRS,	      max_tdmrs),
-	TD_SYSINFO_MAP(MAX_RESERVED_PER_TDMR, max_reserved_per_tdmr),
-	TD_SYSINFO_MAP(PAMT_4K_ENTRY_SIZE,    pamt_4k_entry_size),
-	TD_SYSINFO_MAP(PAMT_2M_ENTRY_SIZE,    pamt_2m_entry_size),
-	TD_SYSINFO_MAP(PAMT_1G_ENTRY_SIZE,    pamt_1g_entry_size),
-};
-
-static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
-{
-	int ret;
-	int i;
-
-	/* Populate 'sysinfo_tdmr' fields using the mapping structure above: */
-	for (i = 0; i < ARRAY_SIZE(fields); i++) {
-		ret = read_sys_metadata_field16(fields[i].field_id,
-						fields[i].offset,
-						sysinfo_tdmr);
-		if (ret)
-			return ret;
-	}
-
-	return 0;
-}
-
-static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
-{
-	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
-}
+#include "tdx_global_metadata.c"
 
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index b1d705c3ab2a..0128b963b723 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -5,7 +5,7 @@
 #include <linux/types.h>
 #include <linux/compiler_attributes.h>
 #include <linux/stddef.h>
-#include <linux/bits.h>
+#include "tdx_global_metadata.h"
 
 /*
  * This file contains both macros and data structures defined by the TDX
@@ -29,35 +29,6 @@
 #define	PT_NDA		0x0
 #define	PT_RSVD		0x1
 
-/*
- * Global scope metadata field ID.
- *
- * See Table "Global Scope Metadata", TDX module 1.5 ABI spec.
- */
-#define MD_FIELD_ID_MAX_TDMRS			0x9100000100000008ULL
-#define MD_FIELD_ID_MAX_RESERVED_PER_TDMR	0x9100000100000009ULL
-#define MD_FIELD_ID_PAMT_4K_ENTRY_SIZE		0x9100000100000010ULL
-#define MD_FIELD_ID_PAMT_2M_ENTRY_SIZE		0x9100000100000011ULL
-#define MD_FIELD_ID_PAMT_1G_ENTRY_SIZE		0x9100000100000012ULL
-
-/*
- * Sub-field definition of metadata field ID.
- *
- * See Table "MD_FIELD_ID (Metadata Field Identifier / Sequence Header)
- * Definition", TDX module 1.5 ABI spec.
- *
- *  - Bit 33:32: ELEMENT_SIZE_CODE -- size of a single element of metadata
- *
- *	0: 8 bits
- *	1: 16 bits
- *	2: 32 bits
- *	3: 64 bits
- */
-#define MD_FIELD_ID_ELE_SIZE_CODE(_field_id)	\
-		(((_field_id) & GENMASK_ULL(33, 32)) >> 32)
-
-#define MD_FIELD_ID_ELE_SIZE_16BIT	1
-
 struct tdmr_reserved_area {
 	u64 offset;
 	u64 size;
@@ -83,20 +54,6 @@ struct tdmr_info {
 	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
 } __packed __aligned(TDMR_INFO_ALIGNMENT);
 
-/* Class "TDMR info" */
-struct tdx_sys_info_tdmr {
-	u16 max_tdmrs;
-	u16 max_reserved_per_tdmr;
-	u16 pamt_4k_entry_size;
-	u16 pamt_2m_entry_size;
-	u16 pamt_1g_entry_size;
-};
-
-/* Kernel used global metadata fields */
-struct tdx_sys_info {
-	struct tdx_sys_info_tdmr tdmr;
-};
-
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (5 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 06/10] x86/virt/tdx: Switch to use auto-generated global metadata reading code Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 16:32   ` Nikolay Borisov
  2024-11-11 19:41   ` Dave Hansen
  2024-11-11 10:39 ` [PATCH v7 08/10] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

TDX architecturally supports up to 32 CMRs.  The global metadata field
"NUM_CMRS" reports the number of CMR entries that can be read by the
kernel.  However, that field may just report the maximum number of CMRs
albeit the actual number of CMRs is smaller, in which case there are
tail null CMRs (size is 0).

Trim away those null CMRs, and print valid CMRs since they are useful
at least to developers.

More information about CMR can be found at "Intel TDX ISA Background:
Convertible Memory Ranges (CMRs)" in TDX 1.5 base spec [1], and
"CMR_INFO" in TDX 1.5 ABI spec [2].

Now get_tdx_sys_info() just reads kernel-needed global metadata to
kernel structure, and it is auto-generated.  Add a wrapper function
init_tdx_sys_info() to invoke get_tdx_sys_info() and provide room to do
additional things like dealing with CMRs.

Link: https://cdrdv2.intel.com/v1/dl/getContent/733575 [1]
Link: https://cdrdv2.intel.com/v1/dl/getContent/733579 [2]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 56 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 43ec56db5084..e81bdcfc20bf 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -272,6 +272,60 @@ static int read_sys_metadata_field(u64 field_id, u64 *data)
 
 #include "tdx_global_metadata.c"
 
+/* Update the @sysinfo_cmr->num_cmrs to trim tail null CMRs */
+static void trim_null_tail_cmrs(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int i;
+
+	/*
+	 * The TDX module may report the maximum number of CMRs that
+	 * TDX architecturally supports as the actual number of CMRs,
+	 * despite the latter is smaller.  In this case some tail
+	 * CMR(s) will be null (size is 0).  Trim them away.
+	 *
+	 * Note the CMRs are generated by the BIOS, but the MCHECK
+	 * verifies CMRs before enabling TDX on hardware.  Skip other
+	 * sanity checks (e.g., verify CMR is 4KB aligned) but trust
+	 * MCHECK to work properly.
+	 *
+	 * The spec doesn't say whether it's legal to have null CMRs
+	 * in the middle of valid CMRs.  For now assume no sane BIOS
+	 * would do that (even MCHECK allows).
+	 */
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++)
+		if (!sysinfo_cmr->cmr_size[i])
+			break;
+
+	sysinfo_cmr->num_cmrs = i;
+}
+
+static void print_cmrs(struct tdx_sys_info_cmr *sysinfo_cmr)
+{
+	int i;
+
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
+		u64 cmr_base = sysinfo_cmr->cmr_base[i];
+		u64 cmr_size = sysinfo_cmr->cmr_size[i];
+
+		pr_info("CMR[%d]: [0x%llx, 0x%llx)\n", i, cmr_base,
+				cmr_base + cmr_size);
+	}
+}
+
+static int init_tdx_sys_info(struct tdx_sys_info *sysinfo)
+{
+	int ret;
+
+	ret = get_tdx_sys_info(sysinfo);
+	if (ret)
+		return ret;
+
+	trim_null_tail_cmrs(&sysinfo->cmr);
+	print_cmrs(&sysinfo->cmr);
+
+	return 0;
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1051,7 +1105,7 @@ static int init_tdx_module(void)
 	struct tdx_sys_info sysinfo;
 	int ret;
 
-	ret = get_tdx_sys_info(&sysinfo);
+	ret = init_tdx_sys_info(&sysinfo);
 	if (ret)
 		return ret;
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 08/10] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (6 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 09/10] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

A TDX module initialization failure was reported on a Emerald Rapids
platform [*]:

  virt/tdx: initialization failed: TDMR [0x0, 0x80000000): reserved areas exhausted.
  virt/tdx: module initialization failed (-28)

As part of initializing the TDX module, the kernel informs the TDX
module of all "TDX-usable memory regions" using an array of TDX defined
structure "TD Memory Region" (TDMR).  Each TDMR must be in 1GB aligned
and in 1GB granularity, and all "non-TDX-usable memory holes" within a
given TDMR are marked as "reserved areas".  The TDX module reports a
maximum number of reserved areas that can be supported per TDMR (16).

The kernel builds the "TDX-usable memory regions" based on memblocks
(which reflects e820), and uses this list to find all "reserved areas"
for each TDMR.

It turns out that the kernel's view of memory holes is too fine grained
and sometimes exceeds the number of holes that the TDX module can track
per TDMR [1], resulting in the above failure.

Thankfully the module also lists memory that is potentially convertible
in a list of "Convertible Memory Regions" (CMRs).  That coarser grained
CMR list tends to track usable memory in the memory map even if it might
be reserved for host usage like 'ACPI data' [2].

Use that list to relax what the kernel considers unusable memory.  If it
falls in a CMR no need to instantiate a hole, and rely on the fact that
kernel will keep what it considers 'reserved' out of the page allocator.

[1] BIOS-E820 table of the problematic platform:

  BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
  BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
  BIOS-e820: [mem 0x0000000000100000-0x000000005d168fff] usable
  BIOS-e820: [mem 0x000000005d169000-0x000000005d22afff] ACPI data
  BIOS-e820: [mem 0x000000005d22b000-0x000000005d3cefff] usable
  BIOS-e820: [mem 0x000000005d3cf000-0x000000005d469fff] reserved
  BIOS-e820: [mem 0x000000005d46a000-0x000000005e5b2fff] usable
  BIOS-e820: [mem 0x000000005e5b3000-0x000000005e5c2fff] reserved
  BIOS-e820: [mem 0x000000005e5c3000-0x000000005e5d2fff] usable
  BIOS-e820: [mem 0x000000005e5d3000-0x000000005e5e4fff] reserved
  BIOS-e820: [mem 0x000000005e5e5000-0x000000005eb57fff] usable
  BIOS-e820: [mem 0x000000005eb58000-0x0000000061357fff] ACPI NVS
  BIOS-e820: [mem 0x0000000061358000-0x000000006172afff] usable
  BIOS-e820: [mem 0x000000006172b000-0x0000000061794fff] ACPI data
  BIOS-e820: [mem 0x0000000061795000-0x00000000617fefff] usable
  BIOS-e820: [mem 0x00000000617ff000-0x0000000061912fff] ACPI data
  BIOS-e820: [mem 0x0000000061913000-0x0000000061998fff] usable
  BIOS-e820: [mem 0x0000000061999000-0x00000000619dffff] ACPI data
  BIOS-e820: [mem 0x00000000619e0000-0x00000000619e1fff] usable
  BIOS-e820: [mem 0x00000000619e2000-0x00000000619e9fff] reserved
  BIOS-e820: [mem 0x00000000619ea000-0x0000000061a26fff] usable
  BIOS-e820: [mem 0x0000000061a27000-0x0000000061baefff] ACPI data
  BIOS-e820: [mem 0x0000000061baf000-0x00000000623c2fff] usable
  BIOS-e820: [mem 0x00000000623c3000-0x0000000062471fff] reserved
  BIOS-e820: [mem 0x0000000062472000-0x0000000062823fff] usable
  BIOS-e820: [mem 0x0000000062824000-0x0000000063a24fff] reserved
  BIOS-e820: [mem 0x0000000063a25000-0x0000000063d57fff] usable
  BIOS-e820: [mem 0x0000000063d58000-0x0000000064157fff] reserved
  BIOS-e820: [mem 0x0000000064158000-0x0000000064158fff] usable
  BIOS-e820: [mem 0x0000000064159000-0x0000000064194fff] reserved
  BIOS-e820: [mem 0x0000000064195000-0x000000006e9cefff] usable
  BIOS-e820: [mem 0x000000006e9cf000-0x000000006eccefff] reserved
  BIOS-e820: [mem 0x000000006eccf000-0x000000006f6fefff] ACPI NVS
  BIOS-e820: [mem 0x000000006f6ff000-0x000000006f7fefff] ACPI data
  BIOS-e820: [mem 0x000000006f7ff000-0x000000006f7fffff] usable
  BIOS-e820: [mem 0x000000006f800000-0x000000008fffffff] reserved
  ......

[2] Convertible Memory Regions of the problematic platform:

  virt/tdx: CMR: [0x100000, 0x6f800000)
  virt/tdx: CMR: [0x100000000, 0x107a000000)
  virt/tdx: CMR: [0x1080000000, 0x207c000000)
  virt/tdx: CMR: [0x2080000000, 0x307c000000)
  virt/tdx: CMR: [0x3080000000, 0x407c000000)

Link: https://github.com/canonical/tdx/issues/135 [*]
Fixes: dde3b60d572c ("x86/virt/tdx: Designate reserved areas for all TDMRs")
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index e81bdcfc20bf..9acb12c75e9b 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -747,29 +747,28 @@ static int tdmr_add_rsvd_area(struct tdmr_info *tdmr, int *p_idx, u64 addr,
 }
 
 /*
- * Go through @tmb_list to find holes between memory areas.  If any of
+ * Go through all CMRs in @sysinfo_cmr to find memory holes.  If any of
  * those holes fall within @tdmr, set up a TDMR reserved area to cover
  * the hole.
  */
-static int tdmr_populate_rsvd_holes(struct list_head *tmb_list,
+static int tdmr_populate_rsvd_holes(struct tdx_sys_info_cmr *sysinfo_cmr,
 				    struct tdmr_info *tdmr,
 				    int *rsvd_idx,
 				    u16 max_reserved_per_tdmr)
 {
-	struct tdx_memblock *tmb;
 	u64 prev_end;
-	int ret;
+	int i, ret;
 
 	/*
 	 * Start looking for reserved blocks at the
 	 * beginning of the TDMR.
 	 */
 	prev_end = tdmr->base;
-	list_for_each_entry(tmb, tmb_list, list) {
+	for (i = 0; i < sysinfo_cmr->num_cmrs; i++) {
 		u64 start, end;
 
-		start = PFN_PHYS(tmb->start_pfn);
-		end   = PFN_PHYS(tmb->end_pfn);
+		start = sysinfo_cmr->cmr_base[i];
+		end   = start + sysinfo_cmr->cmr_size[i];
 
 		/* Break if this region is after the TDMR */
 		if (start >= tdmr_end(tdmr))
@@ -870,16 +869,16 @@ static int rsvd_area_cmp_func(const void *a, const void *b)
 
 /*
  * Populate reserved areas for the given @tdmr, including memory holes
- * (via @tmb_list) and PAMTs (via @tdmr_list).
+ * (via @sysinfo_cmr) and PAMTs (via @tdmr_list).
  */
 static int tdmr_populate_rsvd_areas(struct tdmr_info *tdmr,
-				    struct list_head *tmb_list,
+				    struct tdx_sys_info_cmr *sysinfo_cmr,
 				    struct tdmr_info_list *tdmr_list,
 				    u16 max_reserved_per_tdmr)
 {
 	int ret, rsvd_idx = 0;
 
-	ret = tdmr_populate_rsvd_holes(tmb_list, tdmr, &rsvd_idx,
+	ret = tdmr_populate_rsvd_holes(sysinfo_cmr, tdmr, &rsvd_idx,
 			max_reserved_per_tdmr);
 	if (ret)
 		return ret;
@@ -898,10 +897,10 @@ static int tdmr_populate_rsvd_areas(struct tdmr_info *tdmr,
 
 /*
  * Populate reserved areas for all TDMRs in @tdmr_list, including memory
- * holes (via @tmb_list) and PAMTs.
+ * holes (via @sysinfo_cmr) and PAMTs.
  */
 static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
-					 struct list_head *tmb_list,
+					 struct tdx_sys_info_cmr *sysinfo_cmr,
 					 u16 max_reserved_per_tdmr)
 {
 	int i;
@@ -910,7 +909,7 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
 		int ret;
 
 		ret = tdmr_populate_rsvd_areas(tdmr_entry(tdmr_list, i),
-				tmb_list, tdmr_list, max_reserved_per_tdmr);
+				sysinfo_cmr, tdmr_list, max_reserved_per_tdmr);
 		if (ret)
 			return ret;
 	}
@@ -925,7 +924,8 @@ static int tdmrs_populate_rsvd_areas_all(struct tdmr_info_list *tdmr_list,
  */
 static int construct_tdmrs(struct list_head *tmb_list,
 			   struct tdmr_info_list *tdmr_list,
-			   struct tdx_sys_info_tdmr *sysinfo_tdmr)
+			   struct tdx_sys_info_tdmr *sysinfo_tdmr,
+			   struct tdx_sys_info_cmr *sysinfo_cmr)
 {
 	u16 pamt_entry_size[TDX_PS_NR] = {
 		sysinfo_tdmr->pamt_4k_entry_size,
@@ -942,7 +942,7 @@ static int construct_tdmrs(struct list_head *tmb_list,
 	if (ret)
 		return ret;
 
-	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, tmb_list,
+	ret = tdmrs_populate_rsvd_areas_all(tdmr_list, sysinfo_cmr,
 			sysinfo_tdmr->max_reserved_per_tdmr);
 	if (ret)
 		tdmrs_free_pamt_all(tdmr_list);
@@ -1131,7 +1131,8 @@ static int init_tdx_module(void)
 		goto err_free_tdxmem;
 
 	/* Cover all TDX-usable memory regions in TDMRs */
-	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr);
+	ret = construct_tdmrs(&tdx_memlist, &tdx_tdmr_list, &sysinfo.tdmr,
+			&sysinfo.cmr);
 	if (ret)
 		goto err_free_tdmrs;
 
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 09/10] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (7 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 08/10] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 10:39 ` [PATCH v7 10/10] x86/virt/tdx: Print TDX module version Kai Huang
  2024-11-11 20:33 ` [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Old TDX modules can clobber RBP in the TDH.VP.ENTER SEAMCALL.  However
RBP is used as frame pointer in the x86_64 calling convention, and
clobbering RBP could result in bad things like being unable to unwind
the stack if any non-maskable exceptions (NMI, #MC etc) happens in that
gap.

A new "NO_RBP_MOD" feature was introduced to more recent TDX modules to
not clobber RBP.  This feature is reported in the TDX_FEATURES0 global
metadata field via bit 18.

Don't initialize the TDX module if this feature is not supported [1].

Note the bit definitions of TDX_FEATURES0 are not auto-generated in
tdx_global_metadata.h.  Manually define a macro for it in "tdx.h".

Link: https://lore.kernel.org/fc0e8ab7-86d4-4428-be31-82e1ece6dd21@intel.com/ [1]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 17 +++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h |  4 ++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 9acb12c75e9b..9bc827a6cee8 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -326,6 +326,18 @@ static int init_tdx_sys_info(struct tdx_sys_info *sysinfo)
 	return 0;
 }
 
+static int check_features(struct tdx_sys_info *sysinfo)
+{
+	u64 tdx_features0 = sysinfo->features.tdx_features0;
+
+	if (!(tdx_features0 & TDX_FEATURES0_NO_RBP_MOD)) {
+		pr_err("frame pointer (RBP) clobber bug present, upgrade TDX module\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 /* Calculate the actual TDMR size */
 static int tdmr_size_single(u16 max_reserved_per_tdmr)
 {
@@ -1109,6 +1121,11 @@ static int init_tdx_module(void)
 	if (ret)
 		return ret;
 
+	/* Check whether the kernel can support this module */
+	ret = check_features(&sysinfo);
+	if (ret)
+		return ret;
+
 	/*
 	 * To keep things simple, assume that all TDX-protected memory
 	 * will come from the page allocator.  Make sure all pages in the
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 0128b963b723..c8be00f6b15a 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <linux/compiler_attributes.h>
 #include <linux/stddef.h>
+#include <linux/bits.h>
 #include "tdx_global_metadata.h"
 
 /*
@@ -54,6 +55,9 @@ struct tdmr_info {
 	DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas);
 } __packed __aligned(TDMR_INFO_ALIGNMENT);
 
+/* Bit definitions of TDX_FEATURES0 metadata field */
+#define TDX_FEATURES0_NO_RBP_MOD	BIT(18)
+
 /*
  * Do not put any hardware-defined TDX structure representations below
  * this comment!
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v7 10/10] x86/virt/tdx: Print TDX module version
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (8 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 09/10] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
@ 2024-11-11 10:39 ` Kai Huang
  2024-11-11 20:33 ` [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
  10 siblings, 0 replies; 20+ messages in thread
From: Kai Huang @ 2024-11-11 10:39 UTC (permalink / raw)
  To: dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov, kai.huang

Currently the kernel doesn't print any TDX module version information.
In practice such information is useful, especially to the developers.

For instance:

1) When something goes wrong around using TDX, the module version is
   normally the first information the users want to know [1].

2) The users want to quickly know module version to see whether the
   loaded module is the expected one.

Dump TDX module version.  The actual dmesg will look like:

  virt/tdx: module version: 1.5.00.00.0481 (build_date 20230323).

And dump right after reading global metadata, so that this information is
printed no matter whether module initialization fails or not.

Link: https://lore.kernel.org/4b3adb59-50ea-419e-ad02-e19e8ca20dee@intel.com/ [1]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/x86/virt/vmx/tdx/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 9bc827a6cee8..6982e100536d 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -312,6 +312,23 @@ static void print_cmrs(struct tdx_sys_info_cmr *sysinfo_cmr)
 	}
 }
 
+static void print_module_version(struct tdx_sys_info_version *version)
+{
+       /*
+	* TDX module version encoding:
+	*
+	*   <major>.<minor>.<update>.<internal>.<build_num>
+	*
+	* When printed as text, <major> and <minor> are 1-digit,
+	* <update> and <internal> are 2-digits and <build_num>
+	* is 4-digits.
+	*/
+	pr_info("module version: %u.%u.%02u.%02u.%04u (build_date %u).\n",
+			version->major_version,	 version->minor_version,
+			version->update_version, version->internal_version,
+			version->build_num,	 version->build_date);
+}
+
 static int init_tdx_sys_info(struct tdx_sys_info *sysinfo)
 {
 	int ret;
@@ -322,6 +339,7 @@ static int init_tdx_sys_info(struct tdx_sys_info *sysinfo)
 
 	trim_null_tail_cmrs(&sysinfo->cmr);
 	print_cmrs(&sysinfo->cmr);
+	print_module_version(&sysinfo->version);
 
 	return 0;
 }
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs
  2024-11-11 10:39 ` [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs Kai Huang
@ 2024-11-11 16:32   ` Nikolay Borisov
  2024-11-11 19:30     ` Huang, Kai
  2024-11-11 19:41   ` Dave Hansen
  1 sibling, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2024-11-11 16:32 UTC (permalink / raw)
  To: Kai Huang, dave.hansen, kirill.shutemov, tglx, bp, peterz, mingo,
	hpa, dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter



On 11.11.24 г. 12:39 ч., Kai Huang wrote:
> TDX architecturally supports up to 32 CMRs.  The global metadata field
> "NUM_CMRS" reports the number of CMR entries that can be read by the
> kernel.  However, that field may just report the maximum number of CMRs
> albeit the actual number of CMRs is smaller, in which case there are
> tail null CMRs (size is 0).

nit: Is it safe to assume that null CMRs are going to be sequential and 
always at the end? Nothing in the TDX module spec suggests this. I.e 
can't we have :


1. Valid CMR region
2. ZERO CMR
3. Valid CMR

Sure, it might be a dummy and pointless but nothing prevents such CMR 
records. In any case I think the mentioning of "tail" is a bit too much 
detail and adds to unnecessary mental overload. Simply say you trim 
empty CMR's and that such regions will be sequential (if that's the 
case) and be done with it.

Because having "tail null cmr" can be interpreted as also having  there 
might be "non-tail null CMR", which doesn't seem to be the case?

> 
> Trim away those null CMRs, and print valid CMRs since they are useful
> at least to developers.
> 
> More information about CMR can be found at "Intel TDX ISA Background:
> Convertible Memory Ranges (CMRs)" in TDX 1.5 base spec [1], and
> "CMR_INFO" in TDX 1.5 ABI spec [2].
> 
> Now get_tdx_sys_info() just reads kernel-needed global metadata to
> kernel structure, and it is auto-generated.  Add a wrapper function
> init_tdx_sys_info() to invoke get_tdx_sys_info() and provide room to do
> additional things like dealing with CMRs.
> 
> Link: https://cdrdv2.intel.com/v1/dl/getContent/733575 [1]
> Link: https://cdrdv2.intel.com/v1/dl/getContent/733579 [2]
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs
  2024-11-11 16:32   ` Nikolay Borisov
@ 2024-11-11 19:30     ` Huang, Kai
  0 siblings, 0 replies; 20+ messages in thread
From: Huang, Kai @ 2024-11-11 19:30 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	kirill.shutemov@linux.intel.com, tglx@linutronix.de,
	pbonzini@redhat.com, Williams, Dan J, nik.borisov@suse.com
  Cc: kvm@vger.kernel.org, Hunter, Adrian, linux-kernel@vger.kernel.org,
	Edgecombe, Rick P, x86@kernel.org, Yamahata, Isaku

On Mon, 2024-11-11 at 18:32 +0200, Nikolay Borisov wrote:
> 
> On 11.11.24 г. 12:39 ч., Kai Huang wrote:
> > TDX architecturally supports up to 32 CMRs.  The global metadata field
> > "NUM_CMRS" reports the number of CMR entries that can be read by the
> > kernel.  However, that field may just report the maximum number of CMRs
> > albeit the actual number of CMRs is smaller, in which case there are
> > tail null CMRs (size is 0).
> 
> nit: Is it safe to assume that null CMRs are going to be sequential and 
> always at the end? Nothing in the TDX module spec suggests this. I.e 
> can't we have :
> 
> 
> 1. Valid CMR region
> 2. ZERO CMR
> 3. Valid CMR
> 
> Sure, it might be a dummy and pointless but nothing prevents such CMR 
> records. In any case I think the mentioning of "tail" is a bit too much 
> detail and adds to unnecessary mental overload. Simply say you trim 
> empty CMR's and that such regions will be sequential (if that's the 
> case) and be done with it.
> 
> Because having "tail null cmr" can be interpreted as also having  there 
> might be "non-tail null CMR", which doesn't seem to be the case?

It's described in the comment in the code:

+	 * Note the CMRs are generated by the BIOS, but the MCHECK
+	 * verifies CMRs before enabling TDX on hardware.  Skip other
+	 * sanity checks (e.g., verify CMR is 4KB aligned) but trust
+	 * MCHECK to work properly.
+	 *
+	 * The spec doesn't say whether it's legal to have null CMRs
+	 * in the middle of valid CMRs.  For now assume no sane BIOS
+	 * would do that (even MCHECK allows).

I don't see why a sane BIOS would need to do that, and we have never seen such
case in reality.  IMO we don't need to be too skeptical now.  If we see this can
indeed happen in the future, we can always come up with a patch to fix.
 
[...]


> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> 

Thanks!


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs
  2024-11-11 10:39 ` [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs Kai Huang
  2024-11-11 16:32   ` Nikolay Borisov
@ 2024-11-11 19:41   ` Dave Hansen
  2024-11-11 20:22     ` Huang, Kai
  1 sibling, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2024-11-11 19:41 UTC (permalink / raw)
  To: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov

On 11/11/24 02:39, Kai Huang wrote:
> TDX architecturally supports up to 32 CMRs.  The global metadata field
> "NUM_CMRS" reports the number of CMR entries that can be read by the
> kernel.  However, that field may just report the maximum number of CMRs
> albeit the actual number of CMRs is smaller, in which case there are
> tail null CMRs (size is 0).
> 
> Trim away those null CMRs, and print valid CMRs since they are useful
> at least to developers.
> 
> More information about CMR can be found at "Intel TDX ISA Background:
> Convertible Memory Ranges (CMRs)" in TDX 1.5 base spec [1], and
> "CMR_INFO" in TDX 1.5 ABI spec [2].
> 
> Now get_tdx_sys_info() just reads kernel-needed global metadata to
> kernel structure, and it is auto-generated.  Add a wrapper function
> init_tdx_sys_info() to invoke get_tdx_sys_info() and provide room to do
> additional things like dealing with CMRs.

I'm not sure I understand why this patch is here.

What does trimming buy us in the first place?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs
  2024-11-11 19:41   ` Dave Hansen
@ 2024-11-11 20:22     ` Huang, Kai
  0 siblings, 0 replies; 20+ messages in thread
From: Huang, Kai @ 2024-11-11 20:22 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	kirill.shutemov@linux.intel.com, tglx@linutronix.de,
	pbonzini@redhat.com, Williams, Dan J
  Cc: kvm@vger.kernel.org, nik.borisov@suse.com, Hunter, Adrian,
	linux-kernel@vger.kernel.org, Edgecombe, Rick P, x86@kernel.org,
	Yamahata, Isaku

On Mon, 2024-11-11 at 11:41 -0800, Dave Hansen wrote:
> On 11/11/24 02:39, Kai Huang wrote:
> > TDX architecturally supports up to 32 CMRs.  The global metadata field
> > "NUM_CMRS" reports the number of CMR entries that can be read by the
> > kernel.  However, that field may just report the maximum number of CMRs
> > albeit the actual number of CMRs is smaller, in which case there are
> > tail null CMRs (size is 0).
> > 
> > Trim away those null CMRs, and print valid CMRs since they are useful
> > at least to developers.
> > 
> > More information about CMR can be found at "Intel TDX ISA Background:
> > Convertible Memory Ranges (CMRs)" in TDX 1.5 base spec [1], and
> > "CMR_INFO" in TDX 1.5 ABI spec [2].
> > 
> > Now get_tdx_sys_info() just reads kernel-needed global metadata to
> > kernel structure, and it is auto-generated.  Add a wrapper function
> > init_tdx_sys_info() to invoke get_tdx_sys_info() and provide room to do
> > additional things like dealing with CMRs.
> 
> I'm not sure I understand why this patch is here.
> 
> What does trimming buy us in the first place?

I think the global metadata provided by the core kernel should only reflect the
real valid CMRs via 'num_cmrs', so when the kernel uses CMRs it can just get
valid ones.

One immediate need is the next patch will need to loop over CMRs to set up
reserved areas for TDMRs.  If we don't trim here, we will need to explicitly
skip all null CMRs in each loop.  This will result in more duplicated code and
is not as clean as trimming at early IMO.

I should call this out here though.  

How about I clarify this in the changelog like below?

"
A future fix to a module initialization failure issue will need to loop over all
CMRs.  Trim away those null CMRs once for all here so that the kernel doesn't
need to explicitly skip them each time when it uses CMRs in later time.  Also
print valid CMRs since they are useful at least to developers.
"

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
  2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
                   ` (9 preceding siblings ...)
  2024-11-11 10:39 ` [PATCH v7 10/10] x86/virt/tdx: Print TDX module version Kai Huang
@ 2024-11-11 20:33 ` Dave Hansen
  2024-11-11 20:49   ` Huang, Kai
  10 siblings, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2024-11-11 20:33 UTC (permalink / raw)
  To: Kai Huang, kirill.shutemov, tglx, bp, peterz, mingo, hpa,
	dan.j.williams, seanjc, pbonzini
  Cc: x86, linux-kernel, kvm, rick.p.edgecombe, isaku.yamahata,
	adrian.hunter, nik.borisov

On 11/11/24 02:39, Kai Huang wrote:
> This series does necessary tweaks to TDX host "global metadata" reading
> code to fix some immediate issues in the TDX module initialization code,
> with intention to also provide a flexible code base to support sharing
> global metadata to KVM (and other kernel components) for future needs.

Could we please just limit this to the bug fix and the new TD metadata
infrastructure?  Let's not mix it all up with the debugging printk()s.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
  2024-11-11 20:33 ` [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
@ 2024-11-11 20:49   ` Huang, Kai
  2024-11-11 21:00     ` Dave Hansen
  0 siblings, 1 reply; 20+ messages in thread
From: Huang, Kai @ 2024-11-11 20:49 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	kirill.shutemov@linux.intel.com, tglx@linutronix.de,
	pbonzini@redhat.com, Williams, Dan J
  Cc: kvm@vger.kernel.org, nik.borisov@suse.com, Hunter, Adrian,
	linux-kernel@vger.kernel.org, Edgecombe, Rick P, x86@kernel.org,
	Yamahata, Isaku

On Mon, 2024-11-11 at 12:33 -0800, Dave Hansen wrote:
> On 11/11/24 02:39, Kai Huang wrote:
> > This series does necessary tweaks to TDX host "global metadata" reading
> > code to fix some immediate issues in the TDX module initialization code,
> > with intention to also provide a flexible code base to support sharing
> > global metadata to KVM (and other kernel components) for future needs.
> 
> Could we please just limit this to the bug fix and the new TD metadata
> infrastructure?  Let's not mix it all up with the debugging printk()s.

It also has a patch to fail module initialization when NO_MOD_BBP feature is not
support.

Just want to confirm, do you want to remove the code to:

 - print CMRs;
 - print TDX module versoin;

?

Then I will need to:

 - remove "printing CMRs" in patch 7 ("x86/virt/tdx: Trim away tail null CMRs").
 - remove patch 10 ("x86/virt/tdx: Print TDX module version").

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
  2024-11-11 20:49   ` Huang, Kai
@ 2024-11-11 21:00     ` Dave Hansen
  2024-11-11 21:28       ` Huang, Kai
  0 siblings, 1 reply; 20+ messages in thread
From: Dave Hansen @ 2024-11-11 21:00 UTC (permalink / raw)
  To: Huang, Kai, seanjc@google.com, bp@alien8.de, peterz@infradead.org,
	hpa@zytor.com, mingo@redhat.com, kirill.shutemov@linux.intel.com,
	tglx@linutronix.de, pbonzini@redhat.com, Williams, Dan J
  Cc: kvm@vger.kernel.org, nik.borisov@suse.com, Hunter, Adrian,
	linux-kernel@vger.kernel.org, Edgecombe, Rick P, x86@kernel.org,
	Yamahata, Isaku

On 11/11/24 12:49, Huang, Kai wrote:
> It also has a patch to fail module initialization when NO_MOD_BBP feature is not
> support.
> 
> Just want to confirm, do you want to remove the code to:
> 
>  - print CMRs;
>  - print TDX module versoin;

What is your goal?  What is the bare minimum amount of code to get there?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
  2024-11-11 21:00     ` Dave Hansen
@ 2024-11-11 21:28       ` Huang, Kai
  2024-11-13 11:26         ` Huang, Kai
  0 siblings, 1 reply; 20+ messages in thread
From: Huang, Kai @ 2024-11-11 21:28 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	tglx@linutronix.de, kirill.shutemov@linux.intel.com,
	pbonzini@redhat.com, Williams, Dan J
  Cc: Yamahata, Isaku, kvm@vger.kernel.org, nik.borisov@suse.com,
	Hunter, Adrian, linux-kernel@vger.kernel.org, Edgecombe, Rick P,
	x86@kernel.org

On Mon, 2024-11-11 at 13:00 -0800, Hansen, Dave wrote:
> On 11/11/24 12:49, Huang, Kai wrote:
> > It also has a patch to fail module initialization when NO_MOD_BBP feature is not
> > support.
> > 
> > Just want to confirm, do you want to remove the code to:
> > 
> >  - print CMRs;
> >  - print TDX module versoin;
> 
> What is your goal?  What is the bare minimum amount of code to get there?

The goal is to get everything that KVM TDX needs merged, plus the bug fix.

KVM TDX needs the new metadata infrastructure and the NO_MOD_BRP patch, so yeah
only printing CMRs and TDX module version are not needed.

I'll remove them in the next version.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump
  2024-11-11 21:28       ` Huang, Kai
@ 2024-11-13 11:26         ` Huang, Kai
  0 siblings, 0 replies; 20+ messages in thread
From: Huang, Kai @ 2024-11-13 11:26 UTC (permalink / raw)
  To: Hansen, Dave, seanjc@google.com, bp@alien8.de,
	peterz@infradead.org, hpa@zytor.com, mingo@redhat.com,
	kirill.shutemov@linux.intel.com, tglx@linutronix.de,
	pbonzini@redhat.com, Williams, Dan J
  Cc: Edgecombe, Rick P, kvm@vger.kernel.org, nik.borisov@suse.com,
	Hunter, Adrian, linux-kernel@vger.kernel.org, Yamahata, Isaku,
	x86@kernel.org

[-- Attachment #1: Type: text/plain, Size: 2510 bytes --]

On Mon, 2024-11-11 at 21:28 +0000, Huang, Kai wrote:
> On Mon, 2024-11-11 at 13:00 -0800, Hansen, Dave wrote:
> > On 11/11/24 12:49, Huang, Kai wrote:
> > > It also has a patch to fail module initialization when NO_MOD_BBP feature is not
> > > support.
> > > 
> > > Just want to confirm, do you want to remove the code to:
> > > 
> > >  - print CMRs;
> > >  - print TDX module versoin;
> > 
> > What is your goal?  What is the bare minimum amount of code to get there?
> 
> The goal is to get everything that KVM TDX needs merged, plus the bug fix.
> 
> KVM TDX needs the new metadata infrastructure and the NO_MOD_BRP patch, so yeah
> only printing CMRs and TDX module version are not needed.
> 
> I'll remove them in the next version.

I removed the "version" part in the 'tdx_global_metadata.py' script in order to
remove the code which reads TDX module version from the auto-generated code. 
For the sake of having a lore link of the script that I used in the new version,
I attached the updated script here.  It just got "version" part removed thus is
not interesting to read.

And Sorry I didn't provide enough info about the "goal" in my previous reply:

The goal of this series is to provide a new TDX global metadata infrastructure
to:

1) address two issues in the current TDX module initialization code, and
2) have an extendable infrastructure which is super easy to read more metadata
and share with KVM for KVM TDX support (and other kernel components for TDX
Connect in the future).

And the reason that we need a new global metadata infrastructure is the current
one can only read TDMR related metadata fields and it is not extendable to read
more metadata fields, which is required to address both 1) and 2) above.

Specifically, below two issues in the current module initialization code need to
be addressed:

1) Module initialization may fail on some large systems (e.g., with 4 or more
sockets).
2) Some old modules can clobber host's RBP when existing from the TDX guest, and
currently they can be initialized successfully.  We don't want to use such
modules thus we should just fail to initialize them to avoid memory/cpu cycle
cost of initializing TDX module.

The minimal code to achieve this goal is to remove the code which prints TDX
module version and CMR info in this series.  After removing them, the fist 6
patches in this series introduce the new metadata infrastructure, and the rest
patches address the two above issues.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: tdx_global_metadata.py --]
[-- Type: text/x-python3; name="tdx_global_metadata.py", Size: 6151 bytes --]

#! /usr/bin/env python3
import json
import sys

# Note: this script does not run as part of the build process.
# It is used to generate structs from the TDX global_metadata.json
# file, and functions to fill in said structs.  Rerun it if
# you need more fields.

TDX_STRUCTS = {
    "features": [
        "TDX_FEATURES0"
    ],
    "tdmr": [
        "MAX_TDMRS",
        "MAX_RESERVED_PER_TDMR",
        "PAMT_4K_ENTRY_SIZE",
        "PAMT_2M_ENTRY_SIZE",
        "PAMT_1G_ENTRY_SIZE",
    ],
    "cmr": [
        "NUM_CMRS", "CMR_BASE", "CMR_SIZE"
    ],
}

STRUCT_PREFIX = "tdx_sys_info"
FUNC_PREFIX = "get_tdx_sys_info"
STRVAR_PREFIX = "sysinfo"

def print_class_struct_field(field_name, element_bytes, num_fields, num_elements, file):
    element_type = "u%s" % (element_bytes * 8)
    element_array = ""
    if num_fields > 1:
        element_array += "[%d]" % (num_fields)
    if num_elements > 1:
        element_array += "[%d]" % (num_elements)
    print("\t%s %s%s;" % (element_type, field_name, element_array), file=file)

def print_class_struct(class_name, fields, file):
    struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
    print("struct %s {" % (struct_name), file=file)
    for f in fields:
        print_class_struct_field(
            f["Field Name"].lower(),
            int(f["Element Size (Bytes)"]),
            int(f["Num Fields"]),
            int(f["Num Elements"]),
            file=file)
    print("};", file=file)

def print_read_field(field_id, struct_var, struct_member, indent, file):
    print(
        "%sif (!ret && !(ret = read_sys_metadata_field(%s, &val)))\n%s\t%s->%s = val;"
        % (indent, field_id, indent, struct_var, struct_member),
        file=file,
    )

def print_class_function(class_name, fields, file):
    func_name = "%s_%s" % (FUNC_PREFIX, class_name)
    struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
    struct_var = "%s_%s" % (STRVAR_PREFIX, class_name)

    print("static int %s(struct %s *%s)" % (func_name, struct_name, struct_var), file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print("\tu64 val;", file=file)

    has_i = 0
    has_j = 0
    for f in fields:
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        if num_fields > 1:
            has_i = 1
        if num_elements > 1:
            has_j = 1

    if has_i == 1 and has_j == 1:
        print("\tint i, j;", file=file)
    elif has_i == 1:
        print("\tint i;", file=file)

    print(file=file)
    for f in fields:
        fname = f["Field Name"]
        field_id = f["Base FIELD_ID (Hex)"]
        num_fields = int(f["Num Fields"])
        num_elements = int(f["Num Elements"])
        struct_member = fname.lower()
        indent = "\t"
        if num_fields > 1:
            if fname == "CMR_BASE" or fname == "CMR_SIZE":
                limit = "%s_%s->num_cmrs" %(STRVAR_PREFIX, "cmr")
            elif fname == "CPUID_CONFIG_LEAVES" or fname == "CPUID_CONFIG_VALUES":
                limit = "%s_%s->num_cpuid_config" %(STRVAR_PREFIX, "td_conf")
            else:
                limit = "%d" %(num_fields)
            print("%sfor (i = 0; i < %s; i++)" % (indent, limit), file=file)
            indent += "\t"
            field_id += " + i"
            struct_member += "[i]"
        if num_elements > 1:
            print("%sfor (j = 0; j < %d; j++)" % (indent, num_elements), file=file)
            indent += "\t"
            field_id += " * 2 + j"
            struct_member += "[j]"

        print_read_field(
            field_id,
            struct_var,
            struct_member,
            indent,
            file=file,
        )

    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

def print_main_struct(file):
    print("struct tdx_sys_info {", file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        struct_name = "%s_%s" % (STRUCT_PREFIX, class_name)
        struct_var = class_name
        print("\tstruct %s %s;" % (struct_name, struct_var), file=file)
    print("};", file=file)

def print_main_function(file):
    print("static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)", file=file)
    print("{", file=file)
    print("\tint ret = 0;", file=file)
    print(file=file)
    for class_name, field_names in TDX_STRUCTS.items():
        func_name = "%s_%s" % (FUNC_PREFIX, class_name)
        struct_var = class_name
        print("\tret = ret ?: %s(&sysinfo->%s);" % (func_name, struct_var), file=file)
    print(file=file)
    print("\treturn ret;", file=file)
    print("}", file=file)

jsonfile = sys.argv[1]
hfile = sys.argv[2]
cfile = sys.argv[3]
hfileifdef = hfile.replace(".", "_")

with open(jsonfile, "r") as f:
    json_in = json.load(f)
    fields = {x["Field Name"]: x for x in json_in["Fields"]}

with open(hfile, "w") as f:
    print("/* SPDX-License-Identifier: GPL-2.0 */", file=f)
    print("/* Automatically generated TDX global metadata structures. */", file=f)
    print("#ifndef _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print("#define _X86_VIRT_TDX_AUTO_GENERATED_" + hfileifdef.upper(), file=f)
    print(file=f)
    print("#include <linux/types.h>", file=f)
    print(file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print_class_struct(class_name, [fields[x] for x in field_names], file=f)
        print(file=f)
    print_main_struct(file=f)
    print(file=f)
    print("#endif", file=f)

with open(cfile, "w") as f:
    print("// SPDX-License-Identifier: GPL-2.0", file=f)
    print("/*", file=f)
    print(" * Automatically generated functions to read TDX global metadata.", file=f)
    print(" *", file=f)
    print(" * This file doesn't compile on its own as it lacks of inclusion", file=f)
    print(" * of SEAMCALL wrapper primitive which reads global metadata.", file=f)
    print(" * Include this file to other C file instead.", file=f)
    print(" */", file=f)
    for class_name, field_names in TDX_STRUCTS.items():
        print(file=f)
        print_class_function(class_name, [fields[x] for x in field_names], file=f)
    print(file=f)
    print_main_function(file=f)

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-11-13 11:26 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-11 10:39 [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Kai Huang
2024-11-11 10:39 ` [PATCH v7 01/10] x86/virt/tdx: Rename 'struct tdx_tdmr_sysinfo' to reflect the spec better Kai Huang
2024-11-11 10:39 ` [PATCH v7 02/10] x86/virt/tdx: Start to track all global metadata in one structure Kai Huang
2024-11-11 10:39 ` [PATCH v7 03/10] x86/virt/tdx: Use auto-generated code to read global metadata Kai Huang
2024-11-11 10:39 ` [PATCH v7 04/10] x86/virt/tdx: Use dedicated struct members for PAMT entry sizes Kai Huang
2024-11-11 10:39 ` [PATCH v7 05/10] x86/virt/tdx: Add missing header file inclusion to local tdx.h Kai Huang
2024-11-11 10:39 ` [PATCH v7 06/10] x86/virt/tdx: Switch to use auto-generated global metadata reading code Kai Huang
2024-11-11 10:39 ` [PATCH v7 07/10] x86/virt/tdx: Trim away tail null CMRs Kai Huang
2024-11-11 16:32   ` Nikolay Borisov
2024-11-11 19:30     ` Huang, Kai
2024-11-11 19:41   ` Dave Hansen
2024-11-11 20:22     ` Huang, Kai
2024-11-11 10:39 ` [PATCH v7 08/10] x86/virt/tdx: Reduce TDMR's reserved areas by using CMRs to find memory holes Kai Huang
2024-11-11 10:39 ` [PATCH v7 09/10] x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation Kai Huang
2024-11-11 10:39 ` [PATCH v7 10/10] x86/virt/tdx: Print TDX module version Kai Huang
2024-11-11 20:33 ` [PATCH v7 00/10] TDX host: metadata reading tweaks, bug fix and info dump Dave Hansen
2024-11-11 20:49   ` Huang, Kai
2024-11-11 21:00     ` Dave Hansen
2024-11-11 21:28       ` Huang, Kai
2024-11-13 11:26         ` Huang, Kai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox