* [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
@ 2023-06-16  4:46 Dexuan Cui
  2023-06-16  4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16  4:46 UTC
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
	mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

The two patches (which are based on the latest x86/tdx branch in the tip
tree) are the x86/tdx part of the v6 patchset:
https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/

The other patches of the v6 patchset need more changes in preparation for
the upcoming paravisor support, so let me post the x86/tdx part first.

This v7 patchset addresses Dave's comments on patch 1:
see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com/

Patch 2 is just a repost. There was a race between set_memory_encrypted()
and load_unaligned_zeropad(), which has been fixed by the 3 patches of
Kirill in the x86/tdx branch of the tip tree:
  3f6819dd192e ("x86/mm: Allow guest.enc_status_change_prepare() to fail")
  195edce08b63 ("x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()")
  94142c9d1bdf ("x86/mm: Fix enc_status_change_finish_noop()")
  (see https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/tdx)

If you want to view the patchset on github, it is here:
https://github.com/dcui/tdx/commits/decui/upstream-tip/x86/tdx/v7

Thanks,
Dexuan

Dexuan Cui (2):
  x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  x86/tdx: Support vmalloc() for tdx_enc_status_changed()

 arch/x86/coco/tdx/tdx.c | 123 +++++++++++++++++++++++++++++++---------
 1 file changed, 96 insertions(+), 27 deletions(-)

-- 
2.25.1



* [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
  2023-06-16  4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
@ 2023-06-16  4:47 ` Dexuan Cui
  2023-06-16  4:47 ` [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
  2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
  2 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16  4:47 UTC
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
	mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

The GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10
error code = TDG.VP.VMCALL_RETRY (1), in which case the guest must retry
the operation for the pages in the region starting at the GPA specified
in R11.

When a fully enlightened TDX guest runs on Hyper-V, Hyper-V can return
the retry error when set_memory_decrypted() is called to decrypt up to
1GB of swiotlb bounce buffers.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---

Changes in v2:
  Used __tdx_hypercall() directly in tdx_map_gpa().
  Added a max_retry_cnt of 1000.
  Renamed a few variables, e.g., r11 -> map_fail_paddr.

Changes in v3:
  Changed max_retry_cnt from 1000 to 3.

Changes in v4:
  __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT) -> __tdx_hypercall_ret()
  Added Kirill's Acked-by.

Changes in v5:
  Added Michael's Reviewed-by.

Changes in v6: None.

Changes in v7:
  Addressed Dave's comments:
  see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com
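
The retry policy described above can be modelled in user space. What
follows is only an illustrative sketch, not part of the patch: struct
fake_host and fake_map_gpa() are made-up stand-ins for the hypervisor and
for the MapGPA hypercall, and the "chunk" and "lie" fields are assumptions
used to imitate a host that converts a limited amount per call or reports
a bogus resume address. Only the loop in map_gpa_with_retry() mirrors the
logic of tdx_map_gpa() in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_OK	0
#define STATUS_RETRY	1	/* mirrors TDVMCALL_STATUS_RETRY */
#define PAGE_SZ		4096ULL

/* Hypothetical host model, for illustration only. */
struct fake_host {
	uint64_t chunk;		/* bytes converted per call; 0 = no progress */
	bool lie;		/* report a resume address outside the range */
	unsigned int calls;	/* number of simulated hypercalls */
};

/* Stand-in for MapGPA. On RETRY, *fail_gpa plays the role of R11. */
static uint64_t fake_map_gpa(struct fake_host *h, uint64_t start,
			     uint64_t len, uint64_t *fail_gpa)
{
	h->calls++;
	if (h->lie) {
		*fail_gpa = start + len;	/* == end, out of range */
		return STATUS_RETRY;
	}
	if (h->chunk >= len)
		return STATUS_OK;
	*fail_gpa = start + h->chunk;
	return STATUS_RETRY;
}

static bool map_gpa_with_retry(struct fake_host *h, uint64_t start,
			       uint64_t end)
{
	const int max_retries_per_page = 3;
	int retry_count = 0;

	assert(start <= end);

	while (retry_count < max_retries_per_page) {
		uint64_t fail_gpa = 0;
		uint64_t ret = fake_map_gpa(h, start, end - start, &fail_gpa);

		if (ret != STATUS_RETRY)
			return ret == STATUS_OK;

		/* The resume address is untrusted: sanity check it. */
		if (fail_gpa < start || fail_gpa >= end)
			return false;

		/* No forward progress: consume one retry. */
		if (fail_gpa == start) {
			retry_count++;
			continue;
		}

		/* Forward progress: resume there with a fresh budget. */
		start = fail_gpa;
		retry_count = 0;
	}

	return false;
}
```

In this model, a host that converts 4 pages per call finishes a 16-page
range after 4 calls, a host that never makes progress is abandoned after
3 calls, and a host that returns an out-of-range address is rejected on
the first call. The retry budget is per stalled address, so a slow but
progressing host is never cut off.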


 arch/x86/coco/tdx/tdx.c | 65 +++++++++++++++++++++++++++++++++--------
 1 file changed, 53 insertions(+), 12 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index cde174f4e239..5b62a1f5bd79 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -28,6 +28,8 @@
 #define TDVMCALL_MAP_GPA		0x10001
 #define TDVMCALL_REPORT_FATAL_ERROR	0x10003
 
+#define TDVMCALL_STATUS_RETRY		1
+
 /* MMIO direction */
 #define EPT_READ	0
 #define EPT_WRITE	1
@@ -777,14 +779,16 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
 }
 
 /*
- * Inform the VMM of the guest's intent for this physical page: shared with
- * the VMM or private to the guest.  The VMM is expected to change its mapping
- * of the page in response.
+ * Notify the VMM about page mapping conversion. More info about ABI
+ * can be found in TDX Guest-Host-Communication Interface (GHCI),
+ * section "TDG.VP.VMCALL<MapGPA>".
  */
-static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	const int max_retries_per_page = 3;
+	struct tdx_hypercall_args args;
+	u64 map_fail_paddr, ret;
+	int retry_count = 0;
 
 	if (!enc) {
 		/* Set the shared (decrypted) bits: */
@@ -792,12 +796,49 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 		end   |= cc_mkdec(0);
 	}
 
-	/*
-	 * Notify the VMM about page mapping conversion. More info about ABI
-	 * can be found in TDX Guest-Host-Communication Interface (GHCI),
-	 * section "TDG.VP.VMCALL<MapGPA>"
-	 */
-	if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
+	while (retry_count < max_retries_per_page) {
+		memset(&args, 0, sizeof(args));
+		args.r10 = TDX_HYPERCALL_STANDARD;
+		args.r11 = TDVMCALL_MAP_GPA;
+		args.r12 = start;
+		args.r13 = end - start;
+
+		ret = __tdx_hypercall_ret(&args);
+		if (ret != TDVMCALL_STATUS_RETRY)
+			return !ret;
+		/*
+		 * The guest must retry the operation for the pages in the
+		 * region starting at the GPA specified in R11. R11 comes
+		 * from the untrusted VMM. Sanity check it.
+		 */
+		map_fail_paddr = args.r11;
+		if (map_fail_paddr < start || map_fail_paddr >= end)
+			return false;
+
+		/* "Consume" a retry without forward progress */
+		if (map_fail_paddr == start) {
+			retry_count++;
+			continue;
+		}
+
+		start = map_fail_paddr;
+		retry_count = 0;
+	}
+
+	return false;
+}
+
+/*
+ * Inform the VMM of the guest's intent for this physical page: shared with
+ * the VMM or private to the guest. The VMM is expected to change its mapping
+ * of the page in response.
+ */
+static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+{
+	phys_addr_t start = __pa(vaddr);
+	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+
+	if (!tdx_map_gpa(start, end, enc))
 		return false;
 
 	/* private->shared conversion  requires only MapGPA call */
-- 
2.25.1



* [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
  2023-06-16  4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
  2023-06-16  4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
@ 2023-06-16  4:47 ` Dexuan Cui
  2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
  2 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16  4:47 UTC
  To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
	mikelley
  Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui

When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
allocates buffers using vzalloc() and needs to share them with the host OS
by calling set_memory_decrypted(), which does not yet work for vmalloc()
memory. Add that support by handling the pages one at a time.

Co-developed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/coco/tdx/tdx.c | 76 ++++++++++++++++++++++++++++-------------
 1 file changed, 52 insertions(+), 24 deletions(-)


Changes in v2:
  Changed tdx_enc_status_changed() in place.

Changes in v3:
  No change since v2.

Changes in v4:
  Added Kirill's Co-developed-by since Kirill helped to improve the
    code by adding tdx_enc_status_changed_phys().

  Thanks Kirill for the clarification on load_unaligned_zeropad()!

Changes in v5:
  Added Kirill's Signed-off-by.
  Added Michael's Reviewed-by.

Changes in v6: None.

Changes in v7: None.
  Note: there was a race between set_memory_encrypted() and
  load_unaligned_zeropad(), which has been fixed by the 3 patches of
  Kirill in the x86/tdx branch of the tip tree.
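
The "one page at a time" idea can be sketched in user space as follows.
This is an assumption-laden model, not kernel code: struct fake_mapping,
lookup_pa() and convert_phys() are hypothetical stand-ins for the page
tables, slow_virt_to_phys() and tdx_enc_status_changed_phys(), and the
"contiguous" flag plays the role of !is_vmalloc_addr(). The zero-length
early return is an addition of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL

/* Hypothetical page table: pfn_of[i] backs the i-th virtual page. */
struct fake_mapping {
	uint64_t vbase;
	const uint64_t *pfn_of;
	size_t npages;
	bool contiguous;	/* true: physically contiguous mapping */
};

/* Records how the range was split into physical conversions. */
struct convert_log {
	unsigned int calls;
	uint64_t bytes;
};

/* Stand-in for the physical-range conversion; always succeeds here. */
static bool convert_phys(struct convert_log *log, uint64_t start_pa,
			 uint64_t end_pa)
{
	log->calls++;
	log->bytes += end_pa - start_pa;
	return true;
}

/* Stand-in for a per-page virtual-to-physical lookup. */
static uint64_t lookup_pa(const struct fake_mapping *m, uint64_t vaddr)
{
	size_t idx = (size_t)((vaddr - m->vbase) / PAGE_SZ);

	return m->pfn_of[idx] * PAGE_SZ;
}

static bool enc_status_changed(const struct fake_mapping *m,
			       struct convert_log *log,
			       uint64_t vaddr, size_t numpages)
{
	uint64_t start = vaddr;
	uint64_t end = start + numpages * PAGE_SZ;

	/* Reject addresses that are not page aligned. */
	if (start % PAGE_SZ != 0)
		return false;
	if (numpages == 0)
		return true;
	assert(start >= m->vbase);
	assert((start - m->vbase) / PAGE_SZ + numpages <= m->npages);

	/* Contiguous memory: one conversion covers the whole range. */
	if (m->contiguous) {
		uint64_t pa = lookup_pa(m, start);

		return convert_phys(log, pa, pa + numpages * PAGE_SZ);
	}

	/* Scattered pages: translate and convert each page separately. */
	while (start < end) {
		uint64_t pa = lookup_pa(m, start);

		if (!convert_phys(log, pa, pa + PAGE_SZ))
			return false;
		start += PAGE_SZ;
	}

	return true;
}
```

With four scattered pages the model issues four single-page conversions,
while a contiguous four-page mapping needs only one. The per-page walk
trades more calls for correctness, because adjacent virtual pages in a
vmalloc() area need not be adjacent physically.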


diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 5b62a1f5bd79..8b2a2dcb2efd 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
 #include <linux/cpufeature.h>
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/mm.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
@@ -778,6 +779,34 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
 	return true;
 }
 
+static bool try_accept_page(phys_addr_t start, phys_addr_t end)
+{
+	/*
+	 * For shared->private conversion, accept the page using
+	 * TDX_ACCEPT_PAGE TDX module call.
+	 */
+	while (start < end) {
+		unsigned long len = end - start;
+
+		/*
+		 * Try larger accepts first. It gives chance to VMM to keep
+		 * 1G/2M SEPT entries where possible and speeds up process by
+		 * cutting number of hypercalls (if successful).
+		 */
+
+		if (try_accept_one(&start, len, PG_LEVEL_1G))
+			continue;
+
+		if (try_accept_one(&start, len, PG_LEVEL_2M))
+			continue;
+
+		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Notify the VMM about page mapping conversion. More info about ABI
  * can be found in TDX Guest-Host-Communication Interface (GHCI),
@@ -828,6 +857,19 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 	return false;
 }
 
+static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end,
+					bool enc)
+{
+	if (!tdx_map_gpa(start, end, enc))
+		return false;
+
+	/* private->shared conversion requires only MapGPA call */
+	if (!enc)
+		return true;
+
+	return try_accept_page(start, end);
+}
+
 /*
  * Inform the VMM of the guest's intent for this physical page: shared with
  * the VMM or private to the guest. The VMM is expected to change its mapping
@@ -835,37 +877,23 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
  */
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	unsigned long start = vaddr;
+	unsigned long end = start + numpages * PAGE_SIZE;
 
-	if (!tdx_map_gpa(start, end, enc))
+	if (offset_in_page(start) != 0)
 		return false;
 
-	/* private->shared conversion  requires only MapGPA call */
-	if (!enc)
-		return true;
+	if (!is_vmalloc_addr((void *)start))
+		return tdx_enc_status_changed_phys(__pa(start), __pa(end), enc);
 
-	/*
-	 * For shared->private conversion, accept the page using
-	 * TDX_ACCEPT_PAGE TDX module call.
-	 */
 	while (start < end) {
-		unsigned long len = end - start;
+		phys_addr_t start_pa = slow_virt_to_phys((void *)start);
+		phys_addr_t end_pa = start_pa + PAGE_SIZE;
 
-		/*
-		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M SEPT entries where possible and speeds up process by
-		 * cutting number of hypercalls (if successful).
-		 */
-
-		if (try_accept_one(&start, len, PG_LEVEL_1G))
-			continue;
-
-		if (try_accept_one(&start, len, PG_LEVEL_2M))
-			continue;
-
-		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+		if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
 			return false;
+
+		start += PAGE_SIZE;
 	}
 
 	return true;
-- 
2.25.1
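
The "try larger accepts first" loop that this patch moves into
try_accept_page() can also be modelled in user space. The sketch below is
illustrative only: this try_accept_one() is a simplified stand-in that
succeeds whenever the start address is aligned to the level size and
enough length remains, which is an assumption of the model (the real
helper also issues a TDX module call that may fail). The call counter is
likewise an addition for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum level { LVL_4K, LVL_2M, LVL_1G };

static uint64_t level_size(enum level lvl)
{
	switch (lvl) {
	case LVL_1G:
		return 1ULL << 30;
	case LVL_2M:
		return 1ULL << 21;
	default:
		return 1ULL << 12;
	}
}

/*
 * Simplified model: accept one page of the given size if the start is
 * aligned to it and the remaining length covers it. On success, advance
 * *start and count the call.
 */
static bool try_accept_one(uint64_t *start, uint64_t len, enum level lvl,
			   unsigned int *calls)
{
	uint64_t size = level_size(lvl);

	if (*start % size != 0 || len < size)
		return false;

	(*calls)++;
	*start += size;
	return true;
}

/* Walk [start, end), preferring 1G, then 2M, then 4K accepts. */
static bool accept_range(uint64_t start, uint64_t end, unsigned int *calls)
{
	while (start < end) {
		uint64_t len = end - start;

		if (try_accept_one(&start, len, LVL_1G, calls))
			continue;

		if (try_accept_one(&start, len, LVL_2M, calls))
			continue;

		if (!try_accept_one(&start, len, LVL_4K, calls))
			return false;
	}

	return true;
}
```

In this model, a range starting two 4K pages below a 2M boundary and
ending one 4K page past the next 2M boundary is covered by four accepts
(4K, 4K, 2M, 4K) rather than 515 individual 4K accepts, which is the
saving the code comment refers to.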



* Re: [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
  2023-06-16  4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
  2023-06-16  4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
  2023-06-16  4:47 ` [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
@ 2023-06-19 13:47 ` Kirill A. Shutemov
  2023-06-19 16:23   ` Dexuan Cui
  2 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2023-06-19 13:47 UTC
  To: Dexuan Cui
  Cc: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
	dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
	linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
	sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
	mikelley, linux-kernel, Tianyu.Lan, rick.p.edgecombe

On Thu, Jun 15, 2023 at 09:46:59PM -0700, Dexuan Cui wrote:
> The two patches (which are based on the latest x86/tdx branch in the tip
> tree) are the x86/tdx part of the v6 patchset:
> https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/
> 
> The other patches of the v6 patchset need more changes in preparation for
> the upcoming paravisor support, so let me post the x86/tdx part first.
> 
> This v7 patchset addresses Dave's comments on patch 1:
> see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com/
> 
> Patch 2 is just a repost. There was a race between set_memory_encrypted()
> and load_unaligned_zeropad(), which has been fixed by the 3 patches of
> Kirill in the x86/tdx branch of the tip tree:
>   3f6819dd192e ("x86/mm: Allow guest.enc_status_change_prepare() to fail")
>   195edce08b63 ("x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()")
>   94142c9d1bdf ("x86/mm: Fix enc_status_change_finish_noop()")
>   (see https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/tdx)
> 
> If you want to view the patchset on github, it is here:
> https://github.com/dcui/tdx/commits/decui/upstream-tip/x86/tdx/v7

JFYI, it won't apply to tip/master. The unaccepted memory work changed the
code you are patching.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov


* RE: [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
  2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
@ 2023-06-19 16:23   ` Dexuan Cui
  0 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-19 16:23 UTC
  To: Kirill A. Shutemov
  Cc: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
	brijesh.singh@amd.com, dan.j.williams@intel.com,
	dave.hansen@intel.com, dave.hansen@linux.intel.com, Haiyang Zhang,
	hpa@zytor.com, jane.chu@oracle.com,
	kirill.shutemov@linux.intel.com, KY Srinivasan,
	linux-arch@vger.kernel.org, linux-hyperv@vger.kernel.org,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	rostedt@goodmis.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	seanjc@google.com, tglx@linutronix.de, tony.luck@intel.com,
	wei.liu@kernel.org, x86@kernel.org, Michael Kelley (LINUX),
	linux-kernel@vger.kernel.org, Tianyu Lan,
	rick.p.edgecombe@intel.com

> From: Kirill A. Shutemov <kirill@shutemov.name>
> Sent: Monday, June 19, 2023 6:47 AM
> ...
> JFYI, it won't apply to tip/master. The unaccepted memory work changed the
> code you are patching.
Thanks for letting me know! I'll rebase to tip/master and repost shortly.
