linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part)
@ 2023-08-11 22:18 Dexuan Cui
  2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

Hyper-V provides two modes for running Intel TDX VMs:

1) TD Partitioning mode with a paravisor (see [1]).
2) In "fully enlightened" mode with normal TDX shared bit control
   over page encryption, and no paravisor

The first mode is similar to AMD SEV-SNP's vTOM mode (see [2]).
The second is similar to AMD SEV-SNP's C-bit mode(see [2]).

For #2, the v6 patchset was [3], which is later split into 2 parts:
the generic TDX part (see [4][5]), and the Hyper-V specific part, i.e.
the first 5 patches of this patchset. For the second part, I rebased
the patches to Tianyu's fully enlighted SNP v5 patchset [2]. Since this
is a straightforward rebasing, I keep the existing Acked-by and
Reviewed-by in the v6 patchset [3].

The next 3 patches of this patchset add the support for #1.

The last patch (the 9th patch) just makes some cleanup.

Please review all the 9 patches, which are also on my github
branch [6].

I tested the patches for a regular VM, a VBS VM, a SNP VM
with the paravisor, and a TDX VM with the paravisor and a TDX
VM without the paravisor, and an ARM64 VM. All the VMs worked
as expected.

If the patches all look good, I expect that Tianyu's patches [2]
are merged into the Hyper-V tree's hyperv-next branch first, then
these 9 patches can be merged afterwards.

Note: Tianyu will post v6, and I expect that the 9 patches can
still apply atop Tianyu's v6 -- if not, I'll post a new version.

Thanks,
Dexuan

References:
[1] Intel TDX Module v1.5 TD Partitioning Architecture Specification
[2] https://lwn.net/ml/linux-kernel/20230810160412.820246-1-ltykernel%40gmail.com/
[3] https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/
[4] https://lwn.net/ml/linux-kernel/20230811214826.9609-1-decui%40microsoft.com/
[5] https://github.com/dcui/tdx/commits/decui/mainline/x86/tdx/v10
[6] https://github.com/dcui/tdx/commits/decui/mainline/x86/hyperv/tdx

Dexuan Cui (9):
  x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
  x86/hyperv: Support hypercalls for fully enlightened TDX guests
  Drivers: hv: vmbus: Support fully enlightened TDX guests
  x86/hyperv: Fix serial console interrupts for fully enlightened TDX
    guests
  Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
  x86/hyperv: Introduce a global variable hyperv_paravisor_present
  Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the
    paravisor
  x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the
    paravisor
  x86/hyperv: Remove hv_isolation_type_en_snp

 arch/x86/hyperv/hv_apic.c          |  15 +++-
 arch/x86/hyperv/hv_init.c          |  53 ++++++++++---
 arch/x86/hyperv/ivm.c              | 120 ++++++++++++++++++++++++++---
 arch/x86/include/asm/hyperv-tlfs.h |   3 +-
 arch/x86/include/asm/mshyperv.h    |  29 ++++++-
 arch/x86/kernel/cpu/mshyperv.c     |  72 ++++++++++++++---
 drivers/hv/connection.c            |   2 +-
 drivers/hv/hv.c                    |  88 +++++++++++++++++----
 drivers/hv/hv_common.c             |  46 ++++++++---
 drivers/hv/hyperv_vmbus.h          |  11 +++
 include/asm-generic/mshyperv.h     |   3 +
 11 files changed, 379 insertions(+), 63 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-18 14:18   ` Tianyu Lan
  2023-08-22 14:18   ` Tianyu Lan
  2023-08-11 22:18 ` [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened " Dexuan Cui
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

No logic change to SNP/VBS guests.

hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to
do some TDX-specific operations, e.g. for a fully enlightened TDX guest
(i.e. without the paravisor), hv_do_hypercall() should use
__tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V
Event/Message/Monitor pages specially.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/ivm.c              | 9 +++++++++
 arch/x86/include/asm/hyperv-tlfs.h | 3 ++-
 arch/x86/include/asm/mshyperv.h    | 3 +++
 arch/x86/kernel/cpu/mshyperv.c     | 2 ++
 drivers/hv/hv_common.c             | 6 ++++++
 include/asm-generic/mshyperv.h     | 1 +
 6 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index ee08a0cd6da38..d4aafe8b6b50d 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -524,3 +524,12 @@ bool hv_isolation_type_en_snp(void)
 	return static_branch_unlikely(&isolation_type_en_snp);
 }
 
+DEFINE_STATIC_KEY_FALSE(isolation_type_tdx);
+/*
+ * hv_isolation_type_tdx - Check if the system runs in an Intel TDX based
+ * isolated VM.
+ */
+bool hv_isolation_type_tdx(void)
+{
+	return static_branch_unlikely(&isolation_type_tdx);
+}
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 4bf0b315b0ce9..2ff26f53cd624 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -169,7 +169,8 @@
 enum hv_isolation_type {
 	HV_ISOLATION_TYPE_NONE	= 0,
 	HV_ISOLATION_TYPE_VBS	= 1,
-	HV_ISOLATION_TYPE_SNP	= 2
+	HV_ISOLATION_TYPE_SNP	= 2,
+	HV_ISOLATION_TYPE_TDX	= 3
 };
 
 /* Hyper-V specific model specific registers (MSRs) */
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 0b0d1eb249d0a..83fc3a79f1557 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -27,6 +27,7 @@ union hv_ghcb;
 
 DECLARE_STATIC_KEY_FALSE(isolation_type_snp);
 DECLARE_STATIC_KEY_FALSE(isolation_type_en_snp);
+DECLARE_STATIC_KEY_FALSE(isolation_type_tdx);
 
 typedef int (*hyperv_fill_flush_list_func)(
 		struct hv_guest_mapping_flush_list *flush,
@@ -59,6 +60,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
 int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
 int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
 
+bool hv_isolation_type_tdx(void);
+
 static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 {
 	u64 input_address = input ? virt_to_phys(input) : 0;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index b7d73f3107c63..a50fd3650ea9b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -418,6 +418,8 @@ static void __init ms_hyperv_init_platform(void)
 			static_branch_enable(&isolation_type_en_snp);
 		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
 			static_branch_enable(&isolation_type_snp);
+		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
+			static_branch_enable(&isolation_type_tdx);
 		}
 	}
 
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 2d43ba2bc925d..da3307533f4d7 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -521,6 +521,12 @@ bool __weak hv_isolation_type_en_snp(void)
 }
 EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
 
+bool __weak hv_isolation_type_tdx(void)
+{
+	return false;
+}
+EXPORT_SYMBOL_GPL(hv_isolation_type_tdx);
+
 void __weak hv_setup_vmbus_handler(void (*handler)(void))
 {
 }
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index efd0d2aedad39..c5e657c3cdf4c 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -66,6 +66,7 @@ extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
 extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
 extern bool hv_isolation_type_snp(void);
 extern bool hv_isolation_type_en_snp(void);
+extern bool hv_isolation_type_tdx(void);
 
 /* Helper functions that provide a consistent pattern for checking Hyper-V hypercall status. */
 static inline int hv_result(u64 status)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened TDX guests
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
  2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-23  7:05   ` Tianyu Lan
  2023-08-11 22:18 ` [PATCH 3/9] Drivers: hv: vmbus: Support " Dexuan Cui
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

A fully enlightened TDX guest on Hyper-V (i.e. without the paravisor) only
uses the GHCI call rather than hv_hypercall_pg.

In hv_do_hypercall(), Hyper-V requires that the input/output addresses
must have the cc_mask.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c       |  8 ++++++++
 arch/x86/hyperv/ivm.c           | 17 +++++++++++++++++
 arch/x86/include/asm/mshyperv.h | 15 +++++++++++++++
 drivers/hv/hv_common.c          | 10 ++++++++--
 include/asm-generic/mshyperv.h  |  1 +
 5 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 547ebf6a03bc9..d8ea54663113c 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -481,6 +481,10 @@ void __init hyperv_init(void)
 	/* Hyper-V requires to write guest os id via ghcb in SNP IVM. */
 	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
 
+	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
+	if (hv_isolation_type_tdx())
+		goto skip_hypercall_pg_init;
+
 	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
 			VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_ROX,
 			VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
@@ -520,6 +524,7 @@ void __init hyperv_init(void)
 		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 	}
 
+skip_hypercall_pg_init:
 	/*
 	 * hyperv_init() is called before LAPIC is initialized: see
 	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
@@ -647,6 +652,9 @@ bool hv_is_hyperv_initialized(void)
 	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return false;
 
+	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
+	if (hv_isolation_type_tdx())
+		return true;
 	/*
 	 * Verify that earlier initialization succeeded by checking
 	 * that the hypercall page is setup
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index d4aafe8b6b50d..5792cddea4914 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -533,3 +533,20 @@ bool hv_isolation_type_tdx(void)
 {
 	return static_branch_unlikely(&isolation_type_tdx);
 }
+
+#ifdef CONFIG_INTEL_TDX_GUEST
+
+u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2)
+{
+	struct tdx_hypercall_args args = { };
+
+	args.r10 = control;
+	args.rdx = param1;
+	args.r8  = param2;
+
+	(void)__tdx_hypercall_ret(&args);
+
+	return args.r11;
+}
+
+#endif
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 83fc3a79f1557..4c68564a165e5 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -61,7 +61,12 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
 int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
 
 bool hv_isolation_type_tdx(void);
+u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
 
+/*
+ * If the hypercall involves no input or output parameters, the hypervisor
+ * ignores the corresponding GPA pointer.
+ */
 static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 {
 	u64 input_address = input ? virt_to_phys(input) : 0;
@@ -69,6 +74,10 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
+	if (hv_isolation_type_tdx())
+		return hv_tdx_hypercall(control,
+					cc_mkdec(input_address),
+					cc_mkdec(output_address));
 	if (!hv_hypercall_pg)
 		return U64_MAX;
 
@@ -112,6 +121,9 @@ static inline u64 _hv_do_fast_hypercall8(u64 control, u64 input1)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
+	if (hv_isolation_type_tdx())
+		return hv_tdx_hypercall(control, input1, 0);
+
 	{
 		__asm__ __volatile__("mov %[thunk_target], %%r8\n"
 				     ALTERNATIVE(CALL_NOSPEC, "vmmcall", X86_FEATURE_SEV_ES)
@@ -158,6 +170,9 @@ static inline u64 _hv_do_fast_hypercall16(u64 control, u64 input1, u64 input2)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
+	if (hv_isolation_type_tdx())
+		return hv_tdx_hypercall(control, input1, input2);
+
 	{
 		__asm__ __volatile__("mov %[output], %%r8\n"
 		     ALTERNATIVE(CALL_NOSPEC, "vmmcall", X86_FEATURE_SEV_ES)
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index da3307533f4d7..897bbb96f4118 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -381,10 +381,10 @@ int hv_common_cpu_init(unsigned int cpu)
 			*outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
 		}
 
-		if (hv_isolation_type_en_snp()) {
+		if (hv_isolation_type_en_snp() || hv_isolation_type_tdx()) {
 			ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);
 			if (ret) {
-				kfree(*inputarg);
+				/* It may be unsafe to free *inputarg */
 				*inputarg = NULL;
 				return ret;
 			}
@@ -567,3 +567,9 @@ u64 __weak hv_ghcb_hypercall(u64 control, void *input, void *output, u32 input_s
 	return HV_STATUS_INVALID_PARAMETER;
 }
 EXPORT_SYMBOL_GPL(hv_ghcb_hypercall);
+
+u64 __weak hv_tdx_hypercall(u64 control, u64 param1, u64 param2)
+{
+	return HV_STATUS_INVALID_PARAMETER;
+}
+EXPORT_SYMBOL_GPL(hv_tdx_hypercall);
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index c5e657c3cdf4c..30fa75facd784 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -283,6 +283,7 @@ enum hv_isolation_type hv_get_isolation_type(void);
 bool hv_is_isolation_supported(void);
 bool hv_isolation_type_snp(void);
 u64 hv_ghcb_hypercall(u64 control, void *input, void *output, u32 input_size);
+u64 hv_tdx_hypercall(u64 control, u64 param1, u64 param2);
 void hyperv_cleanup(void);
 bool hv_query_ext_cap(u64 cap_query);
 void hv_setup_dma_ops(struct device *dev, bool coherent);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/9] Drivers: hv: vmbus: Support fully enlightened TDX guests
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
  2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
  2023-08-11 22:18 ` [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened " Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-11 22:18 ` [PATCH 4/9] x86/hyperv: Fix serial console interrupts for " Dexuan Cui
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

Add Hyper-V specific code so that a fully enlightened TDX guest (i.e.
without the paravisor) can run on Hyper-V:
  Don't use hv_vp_assist_page. Use GHCI instead.
  Don't try to use the unsupported HV_REGISTER_CRASH_CTL.
  Don't trust (use) Hyper-V's TLB-flushing hypercalls.
  Don't use lazy EOI.
  Share the SynIC Event/Message pages with the hypervisor.
  Don't use the Hyper-V TSC page for now, because non-trivial work is
    required to share the page with the hypervisor.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_apic.c      | 15 ++++++++++++---
 arch/x86/hyperv/hv_init.c      | 19 +++++++++++++++----
 arch/x86/kernel/cpu/mshyperv.c | 23 +++++++++++++++++++++++
 drivers/hv/hv.c                | 17 +++++++++++++++--
 4 files changed, 65 insertions(+), 9 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 1fbda2f94184e..cb7429046d18d 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -177,8 +177,11 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector,
 	    (exclude_self && weight == 1 && cpumask_test_cpu(this_cpu, mask)))
 		return true;
 
-	if (!hv_hypercall_pg)
-		return false;
+	/* A fully enlightened TDX VM uses GHCI rather than hv_hypercall_pg. */
+	if (!hv_hypercall_pg) {
+		if (ms_hyperv.paravisor_present || !hv_isolation_type_tdx())
+			return false;
+	}
 
 	if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
 		return false;
@@ -231,9 +234,15 @@ static bool __send_ipi_one(int cpu, int vector)
 
 	trace_hyperv_send_ipi_one(cpu, vector);
 
-	if (!hv_hypercall_pg || (vp == VP_INVAL))
+	if (vp == VP_INVAL)
 		return false;
 
+	/* A fully enlightened TDX VM uses GHCI rather than hv_hypercall_pg. */
+	if (!hv_hypercall_pg) {
+		if (ms_hyperv.paravisor_present || !hv_isolation_type_tdx())
+			return false;
+	}
+
 	if ((vector < HV_IPI_LOW_VECTOR) || (vector > HV_IPI_HIGH_VECTOR))
 		return false;
 
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index d8ea54663113c..4bcd0a6f94760 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -80,7 +80,7 @@ static int hyperv_init_ghcb(void)
 static int hv_cpu_init(unsigned int cpu)
 {
 	union hv_vp_assist_msr_contents msr = { 0 };
-	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[cpu];
+	struct hv_vp_assist_page **hvp;
 	int ret;
 
 	ret = hv_common_cpu_init(cpu);
@@ -90,6 +90,7 @@ static int hv_cpu_init(unsigned int cpu)
 	if (!hv_vp_assist_page)
 		return 0;
 
+	hvp = &hv_vp_assist_page[cpu];
 	if (hv_root_partition) {
 		/*
 		 * For root partition we get the hypervisor provided VP assist
@@ -447,11 +448,21 @@ void __init hyperv_init(void)
 	if (hv_common_init())
 		return;
 
-	hv_vp_assist_page = kcalloc(num_possible_cpus(),
-				    sizeof(*hv_vp_assist_page), GFP_KERNEL);
+	/*
+	 * The VP assist page is useless to a TDX guest: the only use we
+	 * would have for it is lazy EOI, which can not be used with TDX.
+	 */
+	if (hv_isolation_type_tdx())
+		hv_vp_assist_page = NULL;
+	else
+		hv_vp_assist_page = kcalloc(num_possible_cpus(),
+					    sizeof(*hv_vp_assist_page),
+					    GFP_KERNEL);
 	if (!hv_vp_assist_page) {
 		ms_hyperv.hints &= ~HV_X64_ENLIGHTENED_VMCS_RECOMMENDED;
-		goto common_free;
+
+		if (!hv_isolation_type_tdx())
+			goto common_free;
 	}
 
 	if (hv_isolation_type_snp()) {
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index a50fd3650ea9b..507df0f64ae18 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -420,6 +420,29 @@ static void __init ms_hyperv_init_platform(void)
 			static_branch_enable(&isolation_type_snp);
 		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
 			static_branch_enable(&isolation_type_tdx);
+
+			/* A TDX VM must use x2APIC and doesn't use lazy EOI. */
+			ms_hyperv.hints &= ~HV_X64_APIC_ACCESS_RECOMMENDED;
+
+			if (!ms_hyperv.paravisor_present) {
+				/*
+				 * The ms_hyperv.shared_gpa_boundary_active in
+				 * a fully enlightened TDX VM is 0, but the GPAs
+				 * of the SynIC Event/Message pages and VMBus
+				 * Moniter pages in such a VM still need to be
+				 * added by this offset.
+				 */
+				ms_hyperv.shared_gpa_boundary = cc_mkdec(0);
+
+				/* To be supported: more work is required.  */
+				ms_hyperv.features &= ~HV_MSR_REFERENCE_TSC_AVAILABLE;
+
+				/* HV_REGISTER_CRASH_CTL is unsupported. */
+				ms_hyperv.misc_features &= ~HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
+
+				/* Don't trust Hyper-V's TLB-flushing hypercalls. */
+				ms_hyperv.hints &= ~HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
+			}
 		}
 	}
 
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index ec6e35a0d9bf6..28bbb354324d6 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -121,11 +121,15 @@ int hv_synic_alloc(void)
 				(void *)get_zeroed_page(GFP_ATOMIC);
 			if (hv_cpu->synic_event_page == NULL) {
 				pr_err("Unable to allocate SYNIC event page\n");
+
+				free_page((unsigned long)hv_cpu->synic_message_page);
+				hv_cpu->synic_message_page = NULL;
 				goto err;
 			}
 		}
 
-		if (hv_isolation_type_en_snp()) {
+		if (!ms_hyperv.paravisor_present &&
+		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)
 				hv_cpu->synic_message_page, 1);
 			if (ret) {
@@ -174,7 +178,8 @@ void hv_synic_free(void)
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		/* It's better to leak the page if the encryption fails. */
-		if (hv_isolation_type_en_snp()) {
+		if (!ms_hyperv.paravisor_present &&
+		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
 			if (hv_cpu->synic_message_page) {
 				ret = set_memory_encrypted((unsigned long)
 					hv_cpu->synic_message_page, 1);
@@ -232,6 +237,10 @@ void hv_synic_enable_regs(unsigned int cpu)
 	} else {
 		simp.base_simp_gpa = virt_to_phys(hv_cpu->synic_message_page)
 			>> HV_HYP_PAGE_SHIFT;
+
+		if (hv_isolation_type_tdx())
+			simp.base_simp_gpa |= ms_hyperv.shared_gpa_boundary
+				>> HV_HYP_PAGE_SHIFT;
 	}
 
 	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
@@ -251,6 +260,10 @@ void hv_synic_enable_regs(unsigned int cpu)
 	} else {
 		siefp.base_siefp_gpa = virt_to_phys(hv_cpu->synic_event_page)
 			>> HV_HYP_PAGE_SHIFT;
+
+		if (hv_isolation_type_tdx())
+			siefp.base_siefp_gpa |= ms_hyperv.shared_gpa_boundary
+				>> HV_HYP_PAGE_SHIFT;
 	}
 
 	hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/9] x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (2 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 3/9] Drivers: hv: vmbus: Support " Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-23  8:17   ` Tianyu Lan
  2023-08-11 22:18 ` [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM Dexuan Cui
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

When a fully enlightened TDX guest runs on Hyper-V, the UEFI firmware sets
the HW_REDUCED flag and consequently ttyS0 interrupts can't work. Fix the
issue by overriding x86_init.acpi.reduced_hw_early_init().

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 507df0f64ae18..b4214e37e9124 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -324,6 +324,26 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
 }
 #endif
 
+/*
+ * When a fully enlightened TDX VM runs on Hyper-V, the firmware sets the
+ * HW_REDUCED flag: refer to acpi_tb_create_local_fadt(). Consequently ttyS0
+ * interrupts can't work because request_irq() -> ... -> irq_to_desc() returns
+ * NULL for ttyS0. This happens because mp_config_acpi_legacy_irqs() sees a
+ * nr_legacy_irqs() of 0, so it doesn't initialize the array 'mp_irqs[]', and
+ * later setup_IO_APIC_irqs() -> find_irq_entry() fails to find the legacy irqs
+ * from the array and hence doesn't create the necessary irq description info.
+ *
+ * Clone arch/x86/kernel/acpi/boot.c: acpi_generic_reduced_hw_init() here,
+ * except don't change 'legacy_pic', which keeps its default value
+ * 'default_legacy_pic'. This way, mp_config_acpi_legacy_irqs() sees a non-zero
+ * nr_legacy_irqs() and eventually serial console interrupts works properly.
+ */
+static void __init reduced_hw_init(void)
+{
+	x86_init.timers.timer_init	= x86_init_noop;
+	x86_init.irqs.pre_vector_init	= x86_init_noop;
+}
+
 static void __init ms_hyperv_init_platform(void)
 {
 	int hv_max_functions_eax;
@@ -442,6 +462,8 @@ static void __init ms_hyperv_init_platform(void)
 
 				/* Don't trust Hyper-V's TLB-flushing hypercalls. */
 				ms_hyperv.hints &= ~HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
+
+				x86_init.acpi.reduced_hw_early_init = reduced_hw_init;
 			}
 		}
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (3 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 4/9] x86/hyperv: Fix serial console interrupts for " Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-23  9:13   ` Tianyu Lan
  2023-08-11 22:18 ` [PATCH 6/9] x86/hyperv: Introduce a global variable hyperv_paravisor_present Dexuan Cui
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

Don't set *this_cpu_ptr(hyperv_pcpu_input_arg) before the function
set_memory_decrypted() returns, otherwise we run into this ticky issue:

For a fully enlightened TDX/SNP VM, in hv_common_cpu_init(),
*this_cpu_ptr(hyperv_pcpu_input_arg) is an encrypted page before
the set_memory_decrypted() returns.

When such a VM has more than 64 VPs, if the hyperv_pcpu_input_arg is not
NULL, hv_common_cpu_init() -> set_memory_decrypted() -> ... ->
cpa_flush() -> on_each_cpu() -> ... -> hv_send_ipi_mask() -> ... ->
__send_ipi_mask_ex() tries to call hv_do_fast_hypercall16() with the
hyperv_pcpu_input_arg as the hypercall input page, which must be a
decrypted page in such a VM, but the page is still encrypted at this
point, and a fatal fault is triggered.

Fix the issue by setting *this_cpu_ptr(hyperv_pcpu_input_arg) after
set_memory_decrypted(): if the hyperv_pcpu_input_arg is NULL,
__send_ipi_mask_ex() returns HV_STATUS_INVALID_PARAMETER immediately,
and hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
which can use x2apic_send_IPI_all(), which may be slightly slower than
the hypercall but still works correctly in such a VM.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/hv_common.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 897bbb96f4118..4c858e1636da7 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -360,6 +360,7 @@ int hv_common_cpu_init(unsigned int cpu)
 	u64 msr_vp_index;
 	gfp_t flags;
 	int pgcount = hv_root_partition ? 2 : 1;
+	void *mem;
 	int ret;
 
 	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
@@ -372,25 +373,40 @@ int hv_common_cpu_init(unsigned int cpu)
 	 * allocated if this CPU was previously online and then taken offline
 	 */
 	if (!*inputarg) {
-		*inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
-		if (!(*inputarg))
+		mem = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
+		if (!mem)
 			return -ENOMEM;
 
 		if (hv_root_partition) {
 			outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
-			*outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
+			*outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
 		}
 
 		if (hv_isolation_type_en_snp() || hv_isolation_type_tdx()) {
-			ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);
+			ret = set_memory_decrypted((unsigned long)mem, pgcount);
 			if (ret) {
-				/* It may be unsafe to free *inputarg */
-				*inputarg = NULL;
+				/* It may be unsafe to free 'mem' */
 				return ret;
 			}
 
-			memset(*inputarg, 0x00, pgcount * PAGE_SIZE);
+			memset(mem, 0x00, pgcount * HV_HYP_PAGE_SIZE);
 		}
+
+		/*
+		 * In a fully enlightened TDX/SNP VM with more than 64 VPs, if
+		 * hyperv_pcpu_input_arg is not NULL, set_memory_decrypted() ->
+		 * ... -> cpa_flush()-> ... -> __send_ipi_mask_ex() tries to
+		 * use hyperv_pcpu_input_arg as the hypercall input page, which
+		 * must be a decrypted page in such a VM, but the page is still
+		 * encrypted before set_memory_decrypted() returns. Fix this by
+		 * setting *inputarg after the above set_memory_decrypted(): if
+		 * hyperv_pcpu_input_arg is NULL, __send_ipi_mask_ex() returns
+		 * HV_STATUS_INVALID_PARAMETER immediately, and the function
+		 * hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
+		 * which may be slightly slower than the hypercall, but still
+		 * works correctly in such a VM.
+		 */
+		*inputarg = mem;
 	}
 
 	msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 6/9] x86/hyperv: Introduce a global variable hyperv_paravisor_present
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (4 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-11 22:18 ` [PATCH 7/9] Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor Dexuan Cui
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

The new variable hyperv_paravisor_present is set only when the VM
is a SNP/TDX with the paravisor running: see ms_hyperv_init_platform().

In many places, hyperv_paravisor_present can replace
ms_hyperv.paravisor_present, and it's also used to replace
hv_isolation_type_snp() in drivers/hv/hv.c.

Call hv_vtom_init() when it's a VBS VM or when hyperv_paravisor_present
is true (i.e. the VM is a SNP/TDX VM with the paravisor).

Enhance hv_vtom_init() for a TDX VM with the paravisor.

The biggest motive to introduce hyperv_paravisor_present is that we
can not use ms_hyperv.paravisor_present in arch/x86/include/asm/mshyperv.h:
that would introduce a complicated header file dependency issue.

In arch/x86/include/asm/mshyperv.h, _hv_do_fast_hypercall8()
is changed to specially handle HVCALL_SIGNAL_EVENT for a TDX VM with the
paravisor, because such a VM must use TDX GHCI (see hv_tdx_hypercall())
for this hypercall. See vmbus_set_event() -> hv_do_fast_hypercall8() ->
_hv_do_fast_hypercall8() -> hv_tdx_hypercall().

In hv_common_cpu_init(), don't decrypt the hyperv_pcpu_input_arg
for a TDX VM with the paravisor, just like we don't decrypt the page
for a SNP VM with the paravisor.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_apic.c       |  4 ++--
 arch/x86/hyperv/hv_init.c       |  4 ++--
 arch/x86/hyperv/ivm.c           | 20 ++++++++++++++++++--
 arch/x86/include/asm/mshyperv.h |  9 ++++++---
 arch/x86/kernel/cpu/mshyperv.c  | 13 ++++++++++---
 drivers/hv/connection.c         |  2 +-
 drivers/hv/hv.c                 | 12 ++++++------
 drivers/hv/hv_common.c          |  6 +++++-
 include/asm-generic/mshyperv.h  |  1 +
 9 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index cb7429046d18d..8958836500d01 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -179,7 +179,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector,
 
 	/* A fully enlightened TDX VM uses GHCI rather than hv_hypercall_pg. */
 	if (!hv_hypercall_pg) {
-		if (ms_hyperv.paravisor_present || !hv_isolation_type_tdx())
+		if (hyperv_paravisor_present || !hv_isolation_type_tdx())
 			return false;
 	}
 
@@ -239,7 +239,7 @@ static bool __send_ipi_one(int cpu, int vector)
 
 	/* A fully enlightened TDX VM uses GHCI rather than hv_hypercall_pg. */
 	if (!hv_hypercall_pg) {
-		if (ms_hyperv.paravisor_present || !hv_isolation_type_tdx())
+		if (hyperv_paravisor_present || !hv_isolation_type_tdx())
 			return false;
 	}
 
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 4bcd0a6f94760..e67e2430fba35 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -663,8 +663,8 @@ bool hv_is_hyperv_initialized(void)
 	if (x86_hyper_type != X86_HYPER_MS_HYPERV)
 		return false;
 
-	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
-	if (hv_isolation_type_tdx())
+	/* A TDX VM with no paravisor uses TDX GHCI call rather than hv_hypercall_pg */
+	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
 		return true;
 	/*
 	 * Verify that earlier initialization succeeded by checking
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index 5792cddea4914..0d54bc8b06b4a 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -459,13 +459,29 @@ int hv_snp_boot_ap(int cpu, unsigned long start_ip)
 
 void __init hv_vtom_init(void)
 {
+	enum hv_isolation_type type = hv_get_isolation_type();
 	/*
 	 * By design, a VM using vTOM doesn't see the SEV setting,
 	 * so SEV initialization is bypassed and sev_status isn't set.
 	 * Set it here to indicate a vTOM VM.
 	 */
-	sev_status = MSR_AMD64_SNP_VTOM;
-	cc_vendor = CC_VENDOR_AMD;
+	switch (type) {
+	case HV_ISOLATION_TYPE_VBS:
+		fallthrough;
+
+	case HV_ISOLATION_TYPE_SNP:
+		sev_status = MSR_AMD64_SNP_VTOM;
+		cc_vendor = CC_VENDOR_AMD;
+		break;
+
+	case HV_ISOLATION_TYPE_TDX:
+		cc_vendor = CC_VENDOR_INTEL;
+		break;
+
+	default:
+		panic("hv_vtom_init: unsupported isolation type %d\n", type);
+	}
+
 	cc_set_mask(ms_hyperv.shared_gpa_boundary);
 	physical_mask &= ms_hyperv.shared_gpa_boundary - 1;
 
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 4c68564a165e5..9fa31dce69727 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -42,6 +42,7 @@ static inline unsigned char hv_get_nmi_reason(void)
 
 #if IS_ENABLED(CONFIG_HYPERV)
 extern int hyperv_init_cpuhp;
+extern bool hyperv_paravisor_present;
 
 extern void *hv_hypercall_pg;
 
@@ -74,7 +75,7 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx())
+	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
 		return hv_tdx_hypercall(control,
 					cc_mkdec(input_address),
 					cc_mkdec(output_address));
@@ -121,7 +122,9 @@ static inline u64 _hv_do_fast_hypercall8(u64 control, u64 input1)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx())
+	if (hv_isolation_type_tdx() &&
+		(!hyperv_paravisor_present ||
+		 control == (HVCALL_SIGNAL_EVENT | HV_HYPERCALL_FAST_BIT)))
 		return hv_tdx_hypercall(control, input1, 0);
 
 	{
@@ -170,7 +173,7 @@ static inline u64 _hv_do_fast_hypercall16(u64 control, u64 input1, u64 input2)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
-	if (hv_isolation_type_tdx())
+	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
 		return hv_tdx_hypercall(control, input1, input2);
 
 	{
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index b4214e37e9124..ddcc62185e4ae 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -40,6 +40,12 @@ bool hv_root_partition;
 bool hv_nested;
 struct ms_hyperv_info ms_hyperv;
 
+/*
+ * Used in modules via hv_do_hypercall(): see arch/x86/include/asm/mshyperv.h.
+ * Exported in drivers/hv/hv_common.c to not break the ARM64 build.
+ */
+bool hyperv_paravisor_present __ro_after_init;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 static inline unsigned int hv_get_nested_reg(unsigned int reg)
 {
@@ -430,6 +436,8 @@ static void __init ms_hyperv_init_platform(void)
 			ms_hyperv.shared_gpa_boundary =
 				BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
 
+		hyperv_paravisor_present = !!ms_hyperv.paravisor_present;
+
 		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
 			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
 
@@ -444,7 +452,7 @@ static void __init ms_hyperv_init_platform(void)
 			/* A TDX VM must use x2APIC and doesn't use lazy EOI. */
 			ms_hyperv.hints &= ~HV_X64_APIC_ACCESS_RECOMMENDED;
 
-			if (!ms_hyperv.paravisor_present) {
+			if (!hyperv_paravisor_present) {
 				/*
 				 * The ms_hyperv.shared_gpa_boundary_active in
 				 * a fully enlightened TDX VM is 0, but the GPAs
@@ -535,8 +543,7 @@ static void __init ms_hyperv_init_platform(void)
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	if ((hv_get_isolation_type() == HV_ISOLATION_TYPE_VBS) ||
-	    ((hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) &&
-	    ms_hyperv.paravisor_present))
+	    hyperv_paravisor_present)
 		hv_vtom_init();
 	/*
 	 * Setup the hook to get control post apic initialization.
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 02b54f85dc607..7f64fc942323b 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -484,7 +484,7 @@ void vmbus_set_event(struct vmbus_channel *channel)
 
 	++channel->sig_events;
 
-	if (hv_isolation_type_snp())
+	if (hv_isolation_type_snp() && hyperv_paravisor_present)
 		hv_ghcb_hypercall(HVCALL_SIGNAL_EVENT, &channel->sig_event,
 				NULL, sizeof(channel->sig_event));
 	else
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 28bbb354324d6..20bc44923e4f0 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -109,7 +109,7 @@ int hv_synic_alloc(void)
 		 * Synic message and event pages are allocated by paravisor.
 		 * Skip these pages allocation here.
 		 */
-		if (!hv_isolation_type_snp() && !hv_root_partition) {
+		if (!hyperv_paravisor_present && !hv_root_partition) {
 			hv_cpu->synic_message_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
 			if (hv_cpu->synic_message_page == NULL) {
@@ -128,7 +128,7 @@ int hv_synic_alloc(void)
 			}
 		}
 
-		if (!ms_hyperv.paravisor_present &&
+		if (!hyperv_paravisor_present &&
 		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)
 				hv_cpu->synic_message_page, 1);
@@ -226,7 +226,7 @@ void hv_synic_enable_regs(unsigned int cpu)
 	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
 	simp.simp_enabled = 1;
 
-	if (hv_isolation_type_snp() || hv_root_partition) {
+	if (hyperv_paravisor_present || hv_root_partition) {
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (simp.base_simp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
@@ -249,7 +249,7 @@ void hv_synic_enable_regs(unsigned int cpu)
 	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
 	siefp.siefp_enabled = 1;
 
-	if (hv_isolation_type_snp() || hv_root_partition) {
+	if (hyperv_paravisor_present || hv_root_partition) {
 		/* Mask out vTOM bit. ioremap_cache() maps decrypted */
 		u64 base = (siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT) &
 				~ms_hyperv.shared_gpa_boundary;
@@ -336,7 +336,7 @@ void hv_synic_disable_regs(unsigned int cpu)
 	 * addresses.
 	 */
 	simp.simp_enabled = 0;
-	if (hv_isolation_type_snp() || hv_root_partition) {
+	if (hyperv_paravisor_present || hv_root_partition) {
 		iounmap(hv_cpu->synic_message_page);
 		hv_cpu->synic_message_page = NULL;
 	} else {
@@ -348,7 +348,7 @@ void hv_synic_disable_regs(unsigned int cpu)
 	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
 	siefp.siefp_enabled = 0;
 
-	if (hv_isolation_type_snp() || hv_root_partition) {
+	if (hyperv_paravisor_present || hv_root_partition) {
 		iounmap(hv_cpu->synic_event_page);
 		hv_cpu->synic_event_page = NULL;
 	} else {
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 4c858e1636da7..c0b0ac44ffa3c 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -40,6 +40,9 @@
 bool __weak hv_root_partition;
 EXPORT_SYMBOL_GPL(hv_root_partition);
 
+bool __weak hyperv_paravisor_present;
+EXPORT_SYMBOL_GPL(hyperv_paravisor_present);
+
 bool __weak hv_nested;
 EXPORT_SYMBOL_GPL(hv_nested);
 
@@ -382,7 +385,8 @@ int hv_common_cpu_init(unsigned int cpu)
 			*outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
 		}
 
-		if (hv_isolation_type_en_snp() || hv_isolation_type_tdx()) {
+		if (!hyperv_paravisor_present &&
+		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)mem, pgcount);
 			if (ret) {
 				/* It may be unsafe to free 'mem' */
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 30fa75facd784..a6e4f38222c81 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -176,6 +176,7 @@ extern int vmbus_interrupt;
 extern int vmbus_irq;
 
 extern bool hv_root_partition;
+extern bool hyperv_paravisor_present;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 7/9] Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (5 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 6/9] x86/hyperv: Introduce a global variable hyperv_paravisor_present Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-11 22:18 ` [PATCH 8/9] x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM " Dexuan Cui
  2023-08-11 22:18 ` [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp Dexuan Cui
  8 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

The post_msg_page was removed in
commit 9a6b1a170ca8 ("Drivers: hv: vmbus: Remove the per-CPU post_msg_page")

However, it turns out that we need to bring it back, but only for a TDX VM
with the paravisor: in such a VM, the hyperv_pcpu_input_arg is not decrypted,
but the HVCALL_POST_MESSAGE in such a VM needs a decrypted page as the
hypercall input page: see the comments in hyperv_init() for a detailed
explanation.

Except for HVCALL_POST_MESSAGE and HVCALL_SIGNAL_EVENT, the other hypercalls
in a TDX VM with the paravisor still use hv_hypercall_pg and must use the
hyperv_pcpu_input_arg (which is encrypted in such a VM), when a hypercall
input page is used.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c | 20 +++++++++++--
 drivers/hv/hv.c           | 63 ++++++++++++++++++++++++++++++++++-----
 drivers/hv/hyperv_vmbus.h | 11 +++++++
 3 files changed, 85 insertions(+), 9 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e67e2430fba35..9c12a199ea62c 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -485,6 +485,22 @@ void __init hyperv_init(void)
 	 * Setup the hypercall page and enable hypercalls.
 	 * 1. Register the guest ID
 	 * 2. Enable the hypercall and register the hypercall page
+	 *
+	 * A TDX VM with no paravisor only uses TDX GHCI rather than hv_hypercall_pg:
+	 * when the hypercall input is a page, such a VM must pass a decrypted
+	 * page to Hyper-V, e.g. hv_post_message() uses the per-CPU page
+	 * hyperv_pcpu_input_arg, which is decrypted if no paravisor is present.
+	 *
+	 * A TDX VM with the paravisor uses hv_hypercall_pg for most hypercalls,
+	 * which are handled by the paravisor and the VM must use an encrypted
+	 * input page: in such a VM, the hyperv_pcpu_input_arg is encrypted and
+	 * used in the hypercalls, e.g. see hv_mark_gpa_visibility() and
+	 * hv_arch_irq_unmask(). Such a VM uses TDX GHCI for two hypercalls:
+	 * 1. HVCALL_SIGNAL_EVENT: see vmbus_set_event() and _hv_do_fast_hypercall8().
+	 * 2. HVCALL_POST_MESSAGE: the input page must be a decrypted page, i.e.
+	 * hv_post_message() in such a VM can't use the encrypted hyperv_pcpu_input_arg;
+	 * instead, hv_post_message() uses the post_msg_page, which is decrypted
+	 * in such a VM and is only used in such a VM.
 	 */
 	guest_id = hv_generate_guest_id(LINUX_VERSION_CODE);
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
@@ -492,8 +508,8 @@ void __init hyperv_init(void)
 	/* Hyper-V requires to write guest os id via ghcb in SNP IVM. */
 	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
 
-	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
-	if (hv_isolation_type_tdx())
+	/* A TDX VM with no paravisor only uses TDX GHCI rather than hv_hypercall_pg */
+	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
 		goto skip_hypercall_pg_init;
 
 	hv_hypercall_pg = __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START,
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 20bc44923e4f0..6b5f1805d4749 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -57,20 +57,39 @@ int hv_post_message(union hv_connection_id connection_id,
 
 	local_irq_save(flags);
 
-	aligned_msg = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	/*
+	 * A TDX VM with the paravisor must use the decrypted post_msg_page: see
+	 * the comment in struct hv_per_cpu_context. A SNP VM with the paravisor
+	 * can use the encrypted hyperv_pcpu_input_arg because it copies the
+	 * input into the GHCB page, which has been decrypted by the paravisor.
+	 */
+	if (hv_isolation_type_tdx() && hyperv_paravisor_present)
+		aligned_msg = this_cpu_ptr(hv_context.cpu_context)->post_msg_page;
+	else
+		aligned_msg = *this_cpu_ptr(hyperv_pcpu_input_arg);
+
 	aligned_msg->connectionid = connection_id;
 	aligned_msg->reserved = 0;
 	aligned_msg->message_type = message_type;
 	aligned_msg->payload_size = payload_size;
 	memcpy((void *)aligned_msg->payload, payload, payload_size);
 
-	if (hv_isolation_type_snp())
-		status = hv_ghcb_hypercall(HVCALL_POST_MESSAGE,
-				(void *)aligned_msg, NULL,
-				sizeof(*aligned_msg));
-	else
+	if (hyperv_paravisor_present) {
+		if (hv_isolation_type_tdx()) {
+			u64 gpa_boundary = ms_hyperv.shared_gpa_boundary;
+			u64 in = virt_to_phys(aligned_msg) | gpa_boundary;
+
+			status = hv_tdx_hypercall(HVCALL_POST_MESSAGE, in, 0);
+		} else if (hv_isolation_type_snp())
+			status = hv_ghcb_hypercall(HVCALL_POST_MESSAGE,
+						   aligned_msg, NULL,
+						   sizeof(*aligned_msg));
+		else
+			status = HV_STATUS_INVALID_PARAMETER;
+	} else {
 		status = hv_do_hypercall(HVCALL_POST_MESSAGE,
 				aligned_msg, NULL);
+	}
 
 	local_irq_restore(flags);
 
@@ -105,6 +124,24 @@ int hv_synic_alloc(void)
 		tasklet_init(&hv_cpu->msg_dpc,
 			     vmbus_on_msg_dpc, (unsigned long) hv_cpu);
 
+		if (hyperv_paravisor_present && hv_isolation_type_tdx()) {
+			hv_cpu->post_msg_page = (void *)get_zeroed_page(GFP_ATOMIC);
+			if (hv_cpu->post_msg_page == NULL) {
+				pr_err("Unable to allocate post msg page\n");
+				goto err;
+			}
+
+			ret = set_memory_decrypted((unsigned long)hv_cpu->post_msg_page, 1);
+			if (ret) {
+				pr_err("Failed to decrypt post msg page: %d\n", ret);
+				/* Just leak the page, as it's unsafe to free the page. */
+				hv_cpu->post_msg_page = NULL;
+				goto err;
+			}
+
+			memset(hv_cpu->post_msg_page, 0, PAGE_SIZE);
+		}
+
 		/*
 		 * Synic message and event pages are allocated by paravisor.
 		 * Skip these pages allocation here.
@@ -178,7 +215,18 @@ void hv_synic_free(void)
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
 		/* It's better to leak the page if the encryption fails. */
-		if (!ms_hyperv.paravisor_present &&
+		if (hyperv_paravisor_present && hv_isolation_type_tdx()) {
+			if (hv_cpu->post_msg_page) {
+				ret = set_memory_encrypted((unsigned long)
+					hv_cpu->post_msg_page, 1);
+				if (ret) {
+					pr_err("Failed to encrypt post msg page: %d\n", ret);
+					hv_cpu->post_msg_page = NULL;
+				}
+			}
+		}
+
+		if (!hyperv_paravisor_present &&
 		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
 			if (hv_cpu->synic_message_page) {
 				ret = set_memory_encrypted((unsigned long)
@@ -199,6 +247,7 @@ void hv_synic_free(void)
 			}
 		}
 
+		free_page((unsigned long)hv_cpu->post_msg_page);
 		free_page((unsigned long)hv_cpu->synic_event_page);
 		free_page((unsigned long)hv_cpu->synic_message_page);
 	}
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 55f2086841ae4..f6b1e710f8055 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -123,6 +123,17 @@ struct hv_per_cpu_context {
 	void *synic_message_page;
 	void *synic_event_page;
 
+	/*
+	 * The page is only used in hv_post_message() for a TDX VM (with the
+	 * paravisor) to post a messages to Hyper-V: when such a VM calls
+	 * HVCALL_POST_MESSAGE, it can't use the hyperv_pcpu_input_arg (which
+	 * is encrypted in such a VM) as the hypercall input page, because
+	 * the input page for HVCALL_POST_MESSAGE must be decrypted in such a
+	 * VM, so post_msg_page (which is decrypted in hv_synic_alloc()) is
+	 * introduced for this purpose. See hyperv_init() for more comments.
+	 */
+	void *post_msg_page;
+
 	/*
 	 * Starting with win8, we can take channel interrupts on any CPU;
 	 * we will manage the tasklet that handles events messages on a per CPU
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 8/9] x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (6 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 7/9] Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-11 22:18 ` [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp Dexuan Cui
  8 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

When the paravisor is present, a SNP VM must use GHCB to access some
special MSRs, including HV_X64_MSR_GUEST_OS_ID and some SynIC MSRs.

Similarly, when the paravisor is present, a TDX VM must use TDX GHCI
to access the same MSRs.

Implement hv_tdx_read_msr() and hv_tdx_write_msr(), and use the helper
functions hv_ivm_msr_read() and hv_ivm_msr_write() to access the MSRs
in a unified way for SNP/TDX VMs with the paravisor.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c       |  8 ++--
 arch/x86/hyperv/ivm.c           | 72 +++++++++++++++++++++++++++++++--
 arch/x86/include/asm/mshyperv.h |  8 ++--
 arch/x86/kernel/cpu/mshyperv.c  |  8 ++--
 4 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 9c12a199ea62c..abd0a8bd3f15e 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -505,8 +505,8 @@ void __init hyperv_init(void)
 	guest_id = hv_generate_guest_id(LINUX_VERSION_CODE);
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, guest_id);
 
-	/* Hyper-V requires to write guest os id via ghcb in SNP IVM. */
-	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
+	/* With the paravisor, the VM must also write the ID via GHCB/GHCI */
+	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
 
 	/* A TDX VM with no paravisor only uses TDX GHCI rather than hv_hypercall_pg */
 	if (hv_isolation_type_tdx() && !hyperv_paravisor_present)
@@ -595,7 +595,7 @@ void __init hyperv_init(void)
 
 clean_guest_os_id:
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
-	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
+	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
 	cpuhp_remove_state(cpuhp);
 free_ghcb_page:
 	free_percpu(hv_ghcb_pg);
@@ -616,7 +616,7 @@ void hyperv_cleanup(void)
 
 	/* Reset our OS id */
 	wrmsrl(HV_X64_MSR_GUEST_OS_ID, 0);
-	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
+	hv_ivm_msr_write(HV_X64_MSR_GUEST_OS_ID, 0);
 
 	/*
 	 * Reset hypercall page reference before reset the page,
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index 0d54bc8b06b4a..b8fb1557c1986 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -184,7 +184,49 @@ bool hv_ghcb_negotiate_protocol(void)
 	return true;
 }
 
-void hv_ghcb_msr_write(u64 msr, u64 value)
+#define EXIT_REASON_MSR_READ		31
+#define EXIT_REASON_MSR_WRITE		32
+
+static void hv_tdx_read_msr(u64 msr, u64 *val)
+{
+	struct tdx_hypercall_args args = {
+		.r10 = TDX_HYPERCALL_STANDARD,
+		.r11 = EXIT_REASON_MSR_READ,
+		.r12 = msr,
+	};
+
+#ifdef CONFIG_INTEL_TDX_GUEST
+	u64 ret = __tdx_hypercall_ret(&args);
+#else
+	u64 ret = HV_STATUS_INVALID_PARAMETER;
+#endif
+
+	if (WARN_ONCE(ret, "Failed to emulate MSR read: %lld\n", ret))
+		*val = 0;
+	else
+		*val = args.r11;
+}
+
+static void hv_tdx_write_msr(u64 msr, u64 val)
+{
+	struct tdx_hypercall_args args = {
+		.r10 = TDX_HYPERCALL_STANDARD,
+		.r11 = EXIT_REASON_MSR_WRITE,
+		.r12 = msr,
+		.r13 = val,
+	};
+
+#ifdef CONFIG_INTEL_TDX_GUEST
+	u64 ret = __tdx_hypercall(&args);
+#else
+	u64 ret = HV_STATUS_INVALID_PARAMETER;
+	(void)args;
+#endif
+
+	WARN_ONCE(ret, "Failed to emulate MSR write: %lld\n", ret);
+}
+
+static void hv_ghcb_msr_write(u64 msr, u64 value)
 {
 	union hv_ghcb *hv_ghcb;
 	void **ghcb_base;
@@ -212,9 +254,20 @@ void hv_ghcb_msr_write(u64 msr, u64 value)
 
 	local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(hv_ghcb_msr_write);
 
-void hv_ghcb_msr_read(u64 msr, u64 *value)
+void hv_ivm_msr_write(u64 msr, u64 value)
+{
+	if (!hyperv_paravisor_present)
+		return;
+
+	if (hv_isolation_type_tdx())
+		hv_tdx_write_msr(msr, value);
+	else if (hv_isolation_type_snp())
+		hv_ghcb_msr_write(msr, value);
+}
+EXPORT_SYMBOL_GPL(hv_ivm_msr_write);
+
+static void hv_ghcb_msr_read(u64 msr, u64 *value)
 {
 	union hv_ghcb *hv_ghcb;
 	void **ghcb_base;
@@ -244,7 +297,18 @@ void hv_ghcb_msr_read(u64 msr, u64 *value)
 			| ((u64)lower_32_bits(hv_ghcb->ghcb.save.rdx) << 32);
 	local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(hv_ghcb_msr_read);
+
+void hv_ivm_msr_read(u64 msr, u64 *value)
+{
+	if (!hyperv_paravisor_present)
+		return;
+
+	if (hv_isolation_type_tdx())
+		hv_tdx_read_msr(msr, value);
+	else if (hv_isolation_type_snp())
+		hv_ghcb_msr_read(msr, value);
+}
+EXPORT_SYMBOL_GPL(hv_ivm_msr_read);
 
 /*
  * hv_mark_gpa_visibility - Set pages visible to host via hvcall.
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 9fa31dce69727..3f22324ef2e25 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -255,15 +255,15 @@ int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
 int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
-void hv_ghcb_msr_write(u64 msr, u64 value);
-void hv_ghcb_msr_read(u64 msr, u64 *value);
+void hv_ivm_msr_write(u64 msr, u64 value);
+void hv_ivm_msr_read(u64 msr, u64 *value);
 bool hv_ghcb_negotiate_protocol(void);
 void __noreturn hv_ghcb_terminate(unsigned int set, unsigned int reason);
 void hv_vtom_init(void);
 int hv_snp_boot_ap(int cpu, unsigned long start_ip);
 #else
-static inline void hv_ghcb_msr_write(u64 msr, u64 value) {}
-static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
+static inline void hv_ivm_msr_write(u64 msr, u64 value) {}
+static inline void hv_ivm_msr_read(u64 msr, u64 *value) {}
 static inline bool hv_ghcb_negotiate_protocol(void) { return false; }
 static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {}
 static inline void hv_vtom_init(void) {}
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index ddcc62185e4ae..fb585d3b080b1 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -72,8 +72,8 @@ u64 hv_get_non_nested_register(unsigned int reg)
 {
 	u64 value;
 
-	if (hv_is_synic_reg(reg) && hv_isolation_type_snp())
-		hv_ghcb_msr_read(reg, &value);
+	if (hv_is_synic_reg(reg) && hyperv_paravisor_present)
+		hv_ivm_msr_read(reg, &value);
 	else
 		rdmsrl(reg, value);
 	return value;
@@ -82,8 +82,8 @@ EXPORT_SYMBOL_GPL(hv_get_non_nested_register);
 
 void hv_set_non_nested_register(unsigned int reg, u64 value)
 {
-	if (hv_is_synic_reg(reg) && hv_isolation_type_snp()) {
-		hv_ghcb_msr_write(reg, value);
+	if (hv_is_synic_reg(reg) && hyperv_paravisor_present) {
+		hv_ivm_msr_write(reg, value);
 
 		/* Write proxy bit via wrmsl instruction */
 		if (hv_is_sint_reg(reg))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp
  2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
                   ` (7 preceding siblings ...)
  2023-08-11 22:18 ` [PATCH 8/9] x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM " Dexuan Cui
@ 2023-08-11 22:18 ` Dexuan Cui
  2023-08-16  4:14   ` Dexuan Cui
  8 siblings, 1 reply; 17+ messages in thread
From: Dexuan Cui @ 2023-08-11 22:18 UTC (permalink / raw)
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li, Dexuan Cui

In ms_hyperv_init_platform(), do not distinguish between a SNP VM with
the paravisor and a SNP VM without the paravisor.

Replace hv_isolation_type_en_snp() with
!hyperv_paravisor_present && hv_isolation_type_snp().

The hv_isolation_type_en_snp() in drivers/hv/hv.c and
drivers/hv/hv_common.c can be changed to hv_isolation_type_snp() since
we know !hyperv_paravisor_present is true there.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c      |  2 +-
 arch/x86/hyperv/ivm.c          | 12 +-----------
 arch/x86/kernel/cpu/mshyperv.c |  6 ++----
 drivers/hv/hv.c                |  4 ++--
 drivers/hv/hv_common.c         |  8 +-------
 5 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index abd0a8bd3f15e..b23466c1cd574 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -117,7 +117,7 @@ static int hv_cpu_init(unsigned int cpu)
 			 * is blocked to run in Confidential VM. So only decrypt assist
 			 * page in non-root partition here.
 			 */
-			if (*hvp && hv_isolation_type_en_snp()) {
+			if (*hvp && !hyperv_paravisor_present && hv_isolation_type_snp()) {
 				WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
 				memset(*hvp, 0, PAGE_SIZE);
 			}
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index b8fb1557c1986..068f05574067c 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -586,7 +586,7 @@ bool hv_is_isolation_supported(void)
 DEFINE_STATIC_KEY_FALSE(isolation_type_snp);
 
 /*
- * hv_isolation_type_snp - Check system runs in the AMD SEV-SNP based
+ * hv_isolation_type_snp - Check if the system runs in an AMD SEV-SNP based
  * isolation VM.
  */
 bool hv_isolation_type_snp(void)
@@ -594,16 +594,6 @@ bool hv_isolation_type_snp(void)
 	return static_branch_unlikely(&isolation_type_snp);
 }
 
-DEFINE_STATIC_KEY_FALSE(isolation_type_en_snp);
-/*
- * hv_isolation_type_en_snp - Check system runs in the AMD SEV-SNP based
- * isolation enlightened VM.
- */
-bool hv_isolation_type_en_snp(void)
-{
-	return static_branch_unlikely(&isolation_type_en_snp);
-}
-
 DEFINE_STATIC_KEY_FALSE(isolation_type_tdx);
 /*
  * hv_isolation_type_tdx - Check if the system runs in an Intel TDX based
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index fb585d3b080b1..6b464ed3cf546 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -306,7 +306,7 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
 	 *  Override wakeup_secondary_cpu_64 callback for SEV-SNP
 	 *  enlightened guest.
 	 */
-	if (hv_isolation_type_en_snp())
+	if (!hyperv_paravisor_present && hv_isolation_type_snp())
 		apic->wakeup_secondary_cpu_64 = hv_snp_boot_ap;
 
 	if (!hv_root_partition)
@@ -442,9 +442,7 @@ static void __init ms_hyperv_init_platform(void)
 			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
 
 
-		if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
-			static_branch_enable(&isolation_type_en_snp);
-		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
+		if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
 			static_branch_enable(&isolation_type_snp);
 		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
 			static_branch_enable(&isolation_type_tdx);
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 6b5f1805d4749..932b8bc239acb 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -166,7 +166,7 @@ int hv_synic_alloc(void)
 		}
 
 		if (!hyperv_paravisor_present &&
-		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
+		    (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)
 				hv_cpu->synic_message_page, 1);
 			if (ret) {
@@ -227,7 +227,7 @@ void hv_synic_free(void)
 		}
 
 		if (!hyperv_paravisor_present &&
-		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
+		    (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
 			if (hv_cpu->synic_message_page) {
 				ret = set_memory_encrypted((unsigned long)
 					hv_cpu->synic_message_page, 1);
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index c0b0ac44ffa3c..d3f95a1be1e99 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -386,7 +386,7 @@ int hv_common_cpu_init(unsigned int cpu)
 		}
 
 		if (!hyperv_paravisor_present &&
-		    (hv_isolation_type_en_snp() || hv_isolation_type_tdx())) {
+		    (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)mem, pgcount);
 			if (ret) {
 				/* It may be unsafe to free 'mem' */
@@ -535,12 +535,6 @@ bool __weak hv_isolation_type_snp(void)
 }
 EXPORT_SYMBOL_GPL(hv_isolation_type_snp);
 
-bool __weak hv_isolation_type_en_snp(void)
-{
-	return false;
-}
-EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
-
 bool __weak hv_isolation_type_tdx(void)
 {
 	return false;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* RE: [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp
  2023-08-11 22:18 ` [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp Dexuan Cui
@ 2023-08-16  4:14   ` Dexuan Cui
  0 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-16  4:14 UTC (permalink / raw)
  To: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
	brijesh.singh@amd.com, dan.j.williams@intel.com,
	dave.hansen@intel.com, dave.hansen@linux.intel.com, Haiyang Zhang,
	hpa@zytor.com, jane.chu@oracle.com,
	kirill.shutemov@linux.intel.com, KY Srinivasan,
	linux-hyperv@vger.kernel.org, luto@kernel.org, mingo@redhat.com,
	peterz@infradead.org, rostedt@goodmis.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com,
	tglx@linutronix.de, tony.luck@intel.com, wei.liu@kernel.org,
	jason, nik.borisov@suse.com, Michael Kelley (LINUX)
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, Tianyu Lan,
	rick.p.edgecombe@intel.com, Anthony Davis, Mark Heslin, vkuznets,
	xiaoyao.li@intel.com

> From: Dexuan Cui <decui@microsoft.com>
> Sent: Friday, August 11, 2023 3:19 PM
> To: ak@linux.intel.com; arnd@arndb.de; bp@alien8.de;
> ...
> In ms_hyperv_init_platform(), do not distinguish between a SNP VM with
> the paravisor and a SNP VM without the paravisor.
> 
> Replace hv_isolation_type_en_snp() with
> !hyperv_paravisor_present && hv_isolation_type_snp().
> 
> The hv_isolation_type_en_snp() in drivers/hv/hv.c and
> drivers/hv/hv_common.c can be changed to hv_isolation_type_snp() since
> we know !hyperv_paravisor_present is true there.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Some of the existing hv_isolation_type_snp()'s need to test 
hyperv_paravisor_present as well, e.g.

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
@@ -52,7 +52,7 @@ static int hyperv_init_ghcb(void)
 	void *ghcb_va;
 	void **ghcb_base;
 
-	if (!hv_isolation_type_snp())
+	if (!hyperv_paravisor_present || !hv_isolation_type_snp())
 		return 0;
 
 	if (!hv_ghcb_pg)

The new version of this patch is here:
https://github.com/dcui/tdx/commit/323de396984e3c222ded23dce46155ae48a1992a

After Tianyu posts his v6 of the fully enlightened SNP patchset,
I'll rebase to his patchset and post v2.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
  2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
@ 2023-08-18 14:18   ` Tianyu Lan
  2023-08-22 14:18   ` Tianyu Lan
  1 sibling, 0 replies; 17+ messages in thread
From: Tianyu Lan @ 2023-08-18 14:18 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li

On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> No logic change to SNP/VBS guests.
> 
> hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to
> do some TDX-specific operations, e.g. for a fully enlightened TDX guest
> (i.e. without the paravisor), hv_do_hypercall() should use
> __tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V
> Event/Message/Monitor pages specially.
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Reviewed-by: Tianyu Lan <tiala@microsoft.com>

> ---
>   arch/x86/hyperv/ivm.c              | 9 +++++++++
>   arch/x86/include/asm/hyperv-tlfs.h | 3 ++-
>   arch/x86/include/asm/mshyperv.h    | 3 +++
>   arch/x86/kernel/cpu/mshyperv.c     | 2 ++
>   drivers/hv/hv_common.c             | 6 ++++++
>   include/asm-generic/mshyperv.h     | 1 +
>   6 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
> index ee08a0cd6da38..d4aafe8b6b50d 100644
> --- a/arch/x86/hyperv/ivm.c
> +++ b/arch/x86/hyperv/ivm.c
> @@ -524,3 +524,12 @@ bool hv_isolation_type_en_snp(void)
>   	return static_branch_unlikely(&isolation_type_en_snp);
>   }
>   
> +DEFINE_STATIC_KEY_FALSE(isolation_type_tdx);
> +/*
> + * hv_isolation_type_tdx - Check if the system runs in an Intel TDX based
> + * isolated VM.
> + */
> +bool hv_isolation_type_tdx(void)
> +{
> +	return static_branch_unlikely(&isolation_type_tdx);
> +}
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 4bf0b315b0ce9..2ff26f53cd624 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -169,7 +169,8 @@
>   enum hv_isolation_type {
>   	HV_ISOLATION_TYPE_NONE	= 0,
>   	HV_ISOLATION_TYPE_VBS	= 1,
> -	HV_ISOLATION_TYPE_SNP	= 2
> +	HV_ISOLATION_TYPE_SNP	= 2,
> +	HV_ISOLATION_TYPE_TDX	= 3
>   };
>   
>   /* Hyper-V specific model specific registers (MSRs) */
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 0b0d1eb249d0a..83fc3a79f1557 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -27,6 +27,7 @@ union hv_ghcb;
>   
>   DECLARE_STATIC_KEY_FALSE(isolation_type_snp);
>   DECLARE_STATIC_KEY_FALSE(isolation_type_en_snp);
> +DECLARE_STATIC_KEY_FALSE(isolation_type_tdx);
>   
>   typedef int (*hyperv_fill_flush_list_func)(
>   		struct hv_guest_mapping_flush_list *flush,
> @@ -59,6 +60,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
>   int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
>   int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
>   
> +bool hv_isolation_type_tdx(void);
> +
>   static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
>   {
>   	u64 input_address = input ? virt_to_phys(input) : 0;
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index b7d73f3107c63..a50fd3650ea9b 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -418,6 +418,8 @@ static void __init ms_hyperv_init_platform(void)
>   			static_branch_enable(&isolation_type_en_snp);
>   		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
>   			static_branch_enable(&isolation_type_snp);
> +		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
> +			static_branch_enable(&isolation_type_tdx);
>   		}
>   	}
>   
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 2d43ba2bc925d..da3307533f4d7 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -521,6 +521,12 @@ bool __weak hv_isolation_type_en_snp(void)
>   }
>   EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
>   
> +bool __weak hv_isolation_type_tdx(void)
> +{
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(hv_isolation_type_tdx);
> +
>   void __weak hv_setup_vmbus_handler(void (*handler)(void))
>   {
>   }
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> index efd0d2aedad39..c5e657c3cdf4c 100644
> --- a/include/asm-generic/mshyperv.h
> +++ b/include/asm-generic/mshyperv.h
> @@ -66,6 +66,7 @@ extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
>   extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
>   extern bool hv_isolation_type_snp(void);
>   extern bool hv_isolation_type_en_snp(void);
> +extern bool hv_isolation_type_tdx(void);
>   
>   /* Helper functions that provide a consistent pattern for checking Hyper-V hypercall status. */
>   static inline int hv_result(u64 status)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
  2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
  2023-08-18 14:18   ` Tianyu Lan
@ 2023-08-22 14:18   ` Tianyu Lan
  1 sibling, 0 replies; 17+ messages in thread
From: Tianyu Lan @ 2023-08-22 14:18 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li

On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> No logic change to SNP/VBS guests.
> 
> hv_isolation_type_tdx() will be used to instruct a TDX guest on Hyper-V to
> do some TDX-specific operations, e.g. for a fully enlightened TDX guest
> (i.e. without the paravisor), hv_do_hypercall() should use
> __tdx_hypercall() and such a guest on Hyper-V should handle the Hyper-V
> Event/Message/Monitor pages specially.
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Reviewed-by: Tianyu Lan <tiala@microsoft.com>

> ---
>   arch/x86/hyperv/ivm.c              | 9 +++++++++
>   arch/x86/include/asm/hyperv-tlfs.h | 3 ++-
>   arch/x86/include/asm/mshyperv.h    | 3 +++
>   arch/x86/kernel/cpu/mshyperv.c     | 2 ++
>   drivers/hv/hv_common.c             | 6 ++++++
>   include/asm-generic/mshyperv.h     | 1 +
>   6 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
> index ee08a0cd6da38..d4aafe8b6b50d 100644
> --- a/arch/x86/hyperv/ivm.c
> +++ b/arch/x86/hyperv/ivm.c
> @@ -524,3 +524,12 @@ bool hv_isolation_type_en_snp(void)
>   	return static_branch_unlikely(&isolation_type_en_snp);
>   }
>   
> +DEFINE_STATIC_KEY_FALSE(isolation_type_tdx);
> +/*
> + * hv_isolation_type_tdx - Check if the system runs in an Intel TDX based
> + * isolated VM.
> + */
> +bool hv_isolation_type_tdx(void)
> +{
> +	return static_branch_unlikely(&isolation_type_tdx);
> +}
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 4bf0b315b0ce9..2ff26f53cd624 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -169,7 +169,8 @@
>   enum hv_isolation_type {
>   	HV_ISOLATION_TYPE_NONE	= 0,
>   	HV_ISOLATION_TYPE_VBS	= 1,
> -	HV_ISOLATION_TYPE_SNP	= 2
> +	HV_ISOLATION_TYPE_SNP	= 2,
> +	HV_ISOLATION_TYPE_TDX	= 3
>   };
>   
>   /* Hyper-V specific model specific registers (MSRs) */
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 0b0d1eb249d0a..83fc3a79f1557 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -27,6 +27,7 @@ union hv_ghcb;
>   
>   DECLARE_STATIC_KEY_FALSE(isolation_type_snp);
>   DECLARE_STATIC_KEY_FALSE(isolation_type_en_snp);
> +DECLARE_STATIC_KEY_FALSE(isolation_type_tdx);
>   
>   typedef int (*hyperv_fill_flush_list_func)(
>   		struct hv_guest_mapping_flush_list *flush,
> @@ -59,6 +60,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
>   int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
>   int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
>   
> +bool hv_isolation_type_tdx(void);
> +
>   static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
>   {
>   	u64 input_address = input ? virt_to_phys(input) : 0;
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index b7d73f3107c63..a50fd3650ea9b 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -418,6 +418,8 @@ static void __init ms_hyperv_init_platform(void)
>   			static_branch_enable(&isolation_type_en_snp);
>   		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP) {
>   			static_branch_enable(&isolation_type_snp);
> +		} else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_TDX) {
> +			static_branch_enable(&isolation_type_tdx);
>   		}
>   	}
>   
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 2d43ba2bc925d..da3307533f4d7 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -521,6 +521,12 @@ bool __weak hv_isolation_type_en_snp(void)
>   }
>   EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
>   
> +bool __weak hv_isolation_type_tdx(void)
> +{
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(hv_isolation_type_tdx);
> +
>   void __weak hv_setup_vmbus_handler(void (*handler)(void))
>   {
>   }
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> index efd0d2aedad39..c5e657c3cdf4c 100644
> --- a/include/asm-generic/mshyperv.h
> +++ b/include/asm-generic/mshyperv.h
> @@ -66,6 +66,7 @@ extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
>   extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
>   extern bool hv_isolation_type_snp(void);
>   extern bool hv_isolation_type_en_snp(void);
> +extern bool hv_isolation_type_tdx(void);
>   
>   /* Helper functions that provide a consistent pattern for checking Hyper-V hypercall status. */
>   static inline int hv_result(u64 status)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened TDX guests
  2023-08-11 22:18 ` [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened " Dexuan Cui
@ 2023-08-23  7:05   ` Tianyu Lan
  2023-08-23 17:40     ` Dexuan Cui
  0 siblings, 1 reply; 17+ messages in thread
From: Tianyu Lan @ 2023-08-23  7:05 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li

On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> A fully enlightened TDX guest on Hyper-V (i.e. without the paravisor) only
> uses the GHCI call rather than hv_hypercall_pg.
> 
> In hv_do_hypercall(), Hyper-V requires that the input/output addresses
> must have the cc_mask.
> 
> Reviewed-by: Kuppuswamy Sathyanarayanan<sathyanarayanan.kuppuswamy@linux.intel.com>
> Reviewed-by: Michael Kelley<mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui<decui@microsoft.com>
> ---

Reviewed-by: Tianyu Lan <tiala@microsoft.com>

>   arch/x86/hyperv/hv_init.c       |  8 ++++++++
>   arch/x86/hyperv/ivm.c           | 17 +++++++++++++++++
>   arch/x86/include/asm/mshyperv.h | 15 +++++++++++++++
>   drivers/hv/hv_common.c          | 10 ++++++++--
>   include/asm-generic/mshyperv.h  |  1 +
>   5 files changed, 49 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 547ebf6a03bc9..d8ea54663113c 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -481,6 +481,10 @@ void __init hyperv_init(void)
>   	/* Hyper-V requires to write guest os id via ghcb in SNP IVM. */
>   	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
>   
> +	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
> +	if (hv_isolation_type_tdx())
> +		goto skip_hypercall_pg_init;
> +

Nitpick:
	Put hypercal page initialization code into a sepearate function and 
skip the function in the tdx guest instead of adding the label.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/9] x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests
  2023-08-11 22:18 ` [PATCH 4/9] x86/hyperv: Fix serial console interrupts for " Dexuan Cui
@ 2023-08-23  8:17   ` Tianyu Lan
  0 siblings, 0 replies; 17+ messages in thread
From: Tianyu Lan @ 2023-08-23  8:17 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li

On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> When a fully enlightened TDX guest runs on Hyper-V, the UEFI firmware sets
> the HW_REDUCED flag and consequently ttyS0 interrupts can't work. Fix the
> issue by overriding x86_init.acpi.reduced_hw_early_init().
> 
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Reviewed-by: Tianyu Lan <tiala@microsoft.com>
> ---
>   arch/x86/kernel/cpu/mshyperv.c | 22 ++++++++++++++++++++++
>   1 file changed, 22 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 507df0f64ae18..b4214e37e9124 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -324,6 +324,26 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>   }
>   #endif
>   
> +/*
> + * When a fully enlightened TDX VM runs on Hyper-V, the firmware sets the
> + * HW_REDUCED flag: refer to acpi_tb_create_local_fadt(). Consequently ttyS0
> + * interrupts can't work because request_irq() -> ... -> irq_to_desc() returns
> + * NULL for ttyS0. This happens because mp_config_acpi_legacy_irqs() sees a
> + * nr_legacy_irqs() of 0, so it doesn't initialize the array 'mp_irqs[]', and
> + * later setup_IO_APIC_irqs() -> find_irq_entry() fails to find the legacy irqs
> + * from the array and hence doesn't create the necessary irq description info.
> + *
> + * Clone arch/x86/kernel/acpi/boot.c: acpi_generic_reduced_hw_init() here,
> + * except don't change 'legacy_pic', which keeps its default value
> + * 'default_legacy_pic'. This way, mp_config_acpi_legacy_irqs() sees a non-zero
> + * nr_legacy_irqs() and eventually serial console interrupts works properly.
> + */
> +static void __init reduced_hw_init(void)
> +{
> +	x86_init.timers.timer_init	= x86_init_noop;
> +	x86_init.irqs.pre_vector_init	= x86_init_noop;
> +}
> +
>   static void __init ms_hyperv_init_platform(void)
>   {
>   	int hv_max_functions_eax;
> @@ -442,6 +462,8 @@ static void __init ms_hyperv_init_platform(void)
>   
>   				/* Don't trust Hyper-V's TLB-flushing hypercalls. */
>   				ms_hyperv.hints &= ~HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED;
> +
> +				x86_init.acpi.reduced_hw_early_init = reduced_hw_init;
>   			}
>   		}
>   	}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
  2023-08-11 22:18 ` [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM Dexuan Cui
@ 2023-08-23  9:13   ` Tianyu Lan
  0 siblings, 0 replies; 17+ messages in thread
From: Tianyu Lan @ 2023-08-23  9:13 UTC (permalink / raw)
  To: Dexuan Cui, ak, arnd, bp, brijesh.singh, dan.j.williams,
	dave.hansen, dave.hansen, haiyangz, hpa, jane.chu,
	kirill.shutemov, kys, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu,
	Jason, nik.borisov, mikelley
  Cc: x86, linux-kernel, linux-arch, Tianyu.Lan, rick.p.edgecombe,
	andavis, mheslin, vkuznets, xiaoyao.li

On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> Don't set *this_cpu_ptr(hyperv_pcpu_input_arg) before the function
> set_memory_decrypted() returns, otherwise we run into this ticky issue:
> 
> For a fully enlightened TDX/SNP VM, in hv_common_cpu_init(),
> *this_cpu_ptr(hyperv_pcpu_input_arg) is an encrypted page before
> the set_memory_decrypted() returns.
> 
> When such a VM has more than 64 VPs, if the hyperv_pcpu_input_arg is not
> NULL, hv_common_cpu_init() -> set_memory_decrypted() -> ... ->
> cpa_flush() -> on_each_cpu() -> ... -> hv_send_ipi_mask() -> ... ->
> __send_ipi_mask_ex() tries to call hv_do_fast_hypercall16() with the
> hyperv_pcpu_input_arg as the hypercall input page, which must be a
> decrypted page in such a VM, but the page is still encrypted at this
> point, and a fatal fault is triggered.
> 
> Fix the issue by setting *this_cpu_ptr(hyperv_pcpu_input_arg) after
> set_memory_decrypted(): if the hyperv_pcpu_input_arg is NULL,
> __send_ipi_mask_ex() returns HV_STATUS_INVALID_PARAMETER immediately,
> and hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
> which can use x2apic_send_IPI_all(), which may be slightly slower than
> the hypercall but still works correctly in such a VM.
>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>

Reviewed-by: Tianyu Lan <tiala@microsoft.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened TDX guests
  2023-08-23  7:05   ` Tianyu Lan
@ 2023-08-23 17:40     ` Dexuan Cui
  0 siblings, 0 replies; 17+ messages in thread
From: Dexuan Cui @ 2023-08-23 17:40 UTC (permalink / raw)
  To: Tianyu Lan, ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
	brijesh.singh@amd.com, dan.j.williams@intel.com,
	dave.hansen@intel.com, dave.hansen@linux.intel.com, Haiyang Zhang,
	hpa@zytor.com, jane.chu@oracle.com,
	kirill.shutemov@linux.intel.com, KY Srinivasan,
	linux-hyperv@vger.kernel.org, luto@kernel.org, mingo@redhat.com,
	peterz@infradead.org, rostedt@goodmis.org,
	sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com,
	tglx@linutronix.de, tony.luck@intel.com, wei.liu@kernel.org,
	jason, nik.borisov@suse.com, Michael Kelley (LINUX)
  Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, Tianyu Lan,
	rick.p.edgecombe@intel.com, Anthony Davis, Mark Heslin, vkuznets,
	xiaoyao.li@intel.com

> From: Tianyu Lan <ltykernel@gmail.com>
> Sent: Wednesday, August 23, 2023 12:06 AM
> To: Dexuan Cui <decui@microsoft.com>; ak@linux.intel.com;
> [...]
> On 8/12/2023 6:18 AM, Dexuan Cui wrote:
> > A fully enlightened TDX guest on Hyper-V (i.e. without the paravisor) only
> > uses the GHCI call rather than hv_hypercall_pg.
> >
> > In hv_do_hypercall(), Hyper-V requires that the input/output addresses
> > must have the cc_mask.
I'll remove the above sentence, since this is no longer true on
generally available Hyper-V: cc_mask still works, but it's not required.

> > [...]
> Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Thanks!

> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -481,6 +481,10 @@ void __init hyperv_init(void)
> >   	/* Hyper-V requires to write guest os id via ghcb in SNP IVM. */
> >   	hv_ghcb_msr_write(HV_X64_MSR_GUEST_OS_ID, guest_id);
> >
> > +	/* A TDX guest uses the GHCI call rather than hv_hypercall_pg. */
> > +	if (hv_isolation_type_tdx())
> > +		goto skip_hypercall_pg_init;
> > +
> 
> Nitpick:
> 	Put hypercal page initialization code into a sepearate function and
> skip the function in the tdx guest instead of adding the label.
This is doable. The intention here is to minimize the changes.
In the future, we'll introduce a hypercall function structure. We can
do code refactoring at that time.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-08-23 17:40 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-11 22:18 [PATCH 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
2023-08-11 22:18 ` [PATCH 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
2023-08-18 14:18   ` Tianyu Lan
2023-08-22 14:18   ` Tianyu Lan
2023-08-11 22:18 ` [PATCH 2/9] x86/hyperv: Support hypercalls for fully enlightened " Dexuan Cui
2023-08-23  7:05   ` Tianyu Lan
2023-08-23 17:40     ` Dexuan Cui
2023-08-11 22:18 ` [PATCH 3/9] Drivers: hv: vmbus: Support " Dexuan Cui
2023-08-11 22:18 ` [PATCH 4/9] x86/hyperv: Fix serial console interrupts for " Dexuan Cui
2023-08-23  8:17   ` Tianyu Lan
2023-08-11 22:18 ` [PATCH 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM Dexuan Cui
2023-08-23  9:13   ` Tianyu Lan
2023-08-11 22:18 ` [PATCH 6/9] x86/hyperv: Introduce a global variable hyperv_paravisor_present Dexuan Cui
2023-08-11 22:18 ` [PATCH 7/9] Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor Dexuan Cui
2023-08-11 22:18 ` [PATCH 8/9] x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM " Dexuan Cui
2023-08-11 22:18 ` [PATCH 9/9] x86/hyperv: Remove hv_isolation_type_en_snp Dexuan Cui
2023-08-16  4:14   ` Dexuan Cui

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).