* [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
@ 2023-06-16 4:46 Dexuan Cui
2023-06-16 4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16 4:46 UTC
To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
mikelley
Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui
The two patches (which are based on the latest x86/tdx branch in the tip
tree) are the x86/tdx part of the v6 patchset:
https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/
The other patches of the v6 patchset need more changes in preparation for
the upcoming paravisor support, so let me post the x86/tdx part first.
This v7 patchset addressed Dave's comments on patch 1:
see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com/
Patch 2 is just a repost. There was a race between set_memory_encrypted()
and load_unaligned_zeropad(), which has been fixed by the 3 patches of
Kirill in the x86/tdx branch of the tip tree:
3f6819dd192e ("x86/mm: Allow guest.enc_status_change_prepare() to fail")
195edce08b63 ("x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()")
94142c9d1bdf ("x86/mm: Fix enc_status_change_finish_noop()")
(see https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/tdx)
If you want to view the patchset on github, it is here:
https://github.com/dcui/tdx/commits/decui/upstream-tip/x86/tdx/v7
Thanks,
Dexuan
Dexuan Cui (2):
x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
x86/tdx: Support vmalloc() for tdx_enc_status_changed()
arch/x86/coco/tdx/tdx.c | 123 +++++++++++++++++++++++++++++++---------
1 file changed, 96 insertions(+), 27 deletions(-)
--
2.25.1
* [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed
2023-06-16 4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
@ 2023-06-16 4:47 ` Dexuan Cui
2023-06-16 4:47 ` [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
2 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16 4:47 UTC
To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
mikelley
Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui
The GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10
error code TDG.VP.VMCALL_RETRY (1), in which case the guest must retry the
operation for the pages in the region starting at the GPA specified
in R11.
When a fully enlightened TDX guest runs on Hyper-V, Hyper-V can return
the retry error when set_memory_decrypted() is called to decrypt up to
1GB of swiotlb bounce buffers.
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
Changes in v2:
Used __tdx_hypercall() directly in tdx_map_gpa().
Added a max_retry_cnt of 1000.
Renamed a few variables, e.g., r11 -> map_fail_paddr.
Changes in v3:
Changed max_retry_cnt from 1000 to 3.
Changes in v4:
__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT) -> __tdx_hypercall_ret()
Added Kirill's Acked-by.
Changes in v5:
Added Michael's Reviewed-by.
Changes in v6: None.
Changes in v7:
Addressed Dave's comments:
see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com
arch/x86/coco/tdx/tdx.c | 65 +++++++++++++++++++++++++++++++++--------
1 file changed, 53 insertions(+), 12 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index cde174f4e239..5b62a1f5bd79 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -28,6 +28,8 @@
#define TDVMCALL_MAP_GPA 0x10001
#define TDVMCALL_REPORT_FATAL_ERROR 0x10003
+#define TDVMCALL_STATUS_RETRY 1
+
/* MMIO direction */
#define EPT_READ 0
#define EPT_WRITE 1
@@ -777,14 +779,16 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
}
/*
- * Inform the VMM of the guest's intent for this physical page: shared with
- * the VMM or private to the guest. The VMM is expected to change its mapping
- * of the page in response.
+ * Notify the VMM about page mapping conversion. More info about ABI
+ * can be found in TDX Guest-Host-Communication Interface (GHCI),
+ * section "TDG.VP.VMCALL<MapGPA>".
*/
-static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
{
- phys_addr_t start = __pa(vaddr);
- phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+ const int max_retries_per_page = 3;
+ struct tdx_hypercall_args args;
+ u64 map_fail_paddr, ret;
+ int retry_count = 0;
if (!enc) {
/* Set the shared (decrypted) bits: */
@@ -792,12 +796,49 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
end |= cc_mkdec(0);
}
- /*
- * Notify the VMM about page mapping conversion. More info about ABI
- * can be found in TDX Guest-Host-Communication Interface (GHCI),
- * section "TDG.VP.VMCALL<MapGPA>"
- */
- if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
+ while (retry_count < max_retries_per_page) {
+ memset(&args, 0, sizeof(args));
+ args.r10 = TDX_HYPERCALL_STANDARD;
+ args.r11 = TDVMCALL_MAP_GPA;
+ args.r12 = start;
+ args.r13 = end - start;
+
+ ret = __tdx_hypercall_ret(&args);
+ if (ret != TDVMCALL_STATUS_RETRY)
+ return !ret;
+ /*
+ * The guest must retry the operation for the pages in the
+ * region starting at the GPA specified in R11. R11 comes
+ * from the untrusted VMM. Sanity check it.
+ */
+ map_fail_paddr = args.r11;
+ if (map_fail_paddr < start || map_fail_paddr >= end)
+ return false;
+
+ /* "Consume" a retry without forward progress */
+ if (map_fail_paddr == start) {
+ retry_count++;
+ continue;
+ }
+
+ start = map_fail_paddr;
+ retry_count = 0;
+ }
+
+ return false;
+}
+
+/*
+ * Inform the VMM of the guest's intent for this physical page: shared with
+ * the VMM or private to the guest. The VMM is expected to change its mapping
+ * of the page in response.
+ */
+static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
+{
+ phys_addr_t start = __pa(vaddr);
+ phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+
+ if (!tdx_map_gpa(start, end, enc))
return false;
/* private->shared conversion requires only MapGPA call */
--
2.25.1
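Editorial note: the retry policy in tdx_map_gpa() above can be exercised
outside the kernel with a small userspace model. The sketch below is only an
illustration under stated assumptions: map_gpa_fn, mock_host and mock_map_gpa
are hypothetical stand-ins for the real hypercall and for the host's
behaviour, and the shared-bit handling done via cc_mkdec() is left out. What
it keeps from the patch is the control flow: a resume address outside
[start, end) is rejected, a retry that makes no forward progress consumes one
of three attempts, and any forward progress resets the counter.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_OK            0
#define STATUS_RETRY         1  /* same value as TDVMCALL_STATUS_RETRY in the patch */
#define MAX_RETRIES_PER_PAGE 3  /* same limit as max_retries_per_page in the patch */

/*
 * Hypothetical stand-in for the MapGPA hypercall. Returns a status code; on
 * STATUS_RETRY it stores the address to resume from in *resume (the value
 * the real call hands back in R11).
 */
typedef uint64_t (*map_gpa_fn)(uint64_t start, uint64_t len,
                               uint64_t *resume, void *ctx);

bool map_gpa_with_retry(uint64_t start, uint64_t end,
                        map_gpa_fn call, void *ctx)
{
    int retry_count = 0;

    while (retry_count < MAX_RETRIES_PER_PAGE) {
        uint64_t resume = 0;
        uint64_t ret = call(start, end - start, &resume, ctx);

        if (ret != STATUS_RETRY)
            return ret == STATUS_OK;

        /* The resume address comes from the untrusted host: sanity check it. */
        if (resume < start || resume >= end)
            return false;

        /* No forward progress: consume one retry. */
        if (resume == start) {
            retry_count++;
            continue;
        }

        /* Forward progress: continue from the new address, reset the budget. */
        start = resume;
        retry_count = 0;
    }

    return false;
}

/* Hypothetical host that converts at most `chunk` bytes per call. */
struct mock_host {
    uint64_t chunk;
    unsigned int calls;
};

uint64_t mock_map_gpa(uint64_t start, uint64_t len,
                      uint64_t *resume, void *ctx)
{
    struct mock_host *host = ctx;

    host->calls++;
    if (len <= host->chunk)
        return STATUS_OK;

    /* Partial completion: ask the guest to retry from where we stopped. */
    *resume = start + host->chunk;
    return STATUS_RETRY;
}
```

With a mock host whose chunk is one 4K page, a four-page range completes
after four calls because each call makes progress. With a chunk of 0 the host
never advances, and the loop gives up after three calls.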
* [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
2023-06-16 4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
2023-06-16 4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
@ 2023-06-16 4:47 ` Dexuan Cui
2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
2 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-16 4:47 UTC
To: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
mikelley
Cc: linux-kernel, Tianyu.Lan, rick.p.edgecombe, Dexuan Cui
When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
allocates buffers using vzalloc(), and needs to share the buffers with the
host OS by calling set_memory_decrypted(), which does not work for
vmalloc() memory yet: __pa() is only valid for direct-map addresses, and a
vmalloc() range is not guaranteed to be physically contiguous. Add the
support by handling the pages one by one.
Co-developed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
arch/x86/coco/tdx/tdx.c | 76 ++++++++++++++++++++++++++++-------------
1 file changed, 52 insertions(+), 24 deletions(-)
Changes in v2:
Changed tdx_enc_status_changed() in place.
Changes in v3:
No change since v2.
Changes in v4:
Added Kirill's Co-developed-by since Kirill helped to improve the
code by adding tdx_enc_status_changed_phys().
Thanks Kirill for the clarification on load_unaligned_zeropad()!
Changes in v5:
Added Kirill's Signed-off-by.
Added Michael's Reviewed-by.
Changes in v6: None.
Changes in v7: None.
Note: there was a race between set_memory_encrypted() and
load_unaligned_zeropad(), which has been fixed by the 3 patches of
Kirill in the x86/tdx branch of the tip tree.
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 5b62a1f5bd79..8b2a2dcb2efd 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
#include <linux/cpufeature.h>
#include <linux/export.h>
#include <linux/io.h>
+#include <linux/mm.h>
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
@@ -778,6 +779,34 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
return true;
}
+static bool try_accept_page(phys_addr_t start, phys_addr_t end)
+{
+ /*
+ * For shared->private conversion, accept the page using
+ * TDX_ACCEPT_PAGE TDX module call.
+ */
+ while (start < end) {
+ unsigned long len = end - start;
+
+ /*
+ * Try larger accepts first. It gives chance to VMM to keep
+ * 1G/2M SEPT entries where possible and speeds up process by
+ * cutting number of hypercalls (if successful).
+ */
+
+ if (try_accept_one(&start, len, PG_LEVEL_1G))
+ continue;
+
+ if (try_accept_one(&start, len, PG_LEVEL_2M))
+ continue;
+
+ if (!try_accept_one(&start, len, PG_LEVEL_4K))
+ return false;
+ }
+
+ return true;
+}
+
/*
* Notify the VMM about page mapping conversion. More info about ABI
* can be found in TDX Guest-Host-Communication Interface (GHCI),
@@ -828,6 +857,19 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
return false;
}
+static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end,
+ bool enc)
+{
+ if (!tdx_map_gpa(start, end, enc))
+ return false;
+
+ /* private->shared conversion requires only MapGPA call */
+ if (!enc)
+ return true;
+
+ return try_accept_page(start, end);
+}
+
/*
* Inform the VMM of the guest's intent for this physical page: shared with
* the VMM or private to the guest. The VMM is expected to change its mapping
@@ -835,37 +877,23 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
*/
static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
{
- phys_addr_t start = __pa(vaddr);
- phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+ unsigned long start = vaddr;
+ unsigned long end = start + numpages * PAGE_SIZE;
- if (!tdx_map_gpa(start, end, enc))
+ if (offset_in_page(start) != 0)
return false;
- /* private->shared conversion requires only MapGPA call */
- if (!enc)
- return true;
+ if (!is_vmalloc_addr((void *)start))
+ return tdx_enc_status_changed_phys(__pa(start), __pa(end), enc);
- /*
- * For shared->private conversion, accept the page using
- * TDX_ACCEPT_PAGE TDX module call.
- */
while (start < end) {
- unsigned long len = end - start;
+ phys_addr_t start_pa = slow_virt_to_phys((void *)start);
+ phys_addr_t end_pa = start_pa + PAGE_SIZE;
- /*
- * Try larger accepts first. It gives chance to VMM to keep
- * 1G/2M SEPT entries where possible and speeds up process by
- * cutting number of hypercalls (if successful).
- */
-
- if (try_accept_one(&start, len, PG_LEVEL_1G))
- continue;
-
- if (try_accept_one(&start, len, PG_LEVEL_2M))
- continue;
-
- if (!try_accept_one(&start, len, PG_LEVEL_4K))
+ if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
return false;
+
+ start += PAGE_SIZE;
}
return true;
--
2.25.1
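Editorial note: the page-by-page walk that patch 2 adds for vmalloc()
addresses can be modelled in userspace as follows. This is a sketch under
stated assumptions, not the kernel code: toy_mapping is a hypothetical
lookup table playing the role of slow_virt_to_phys(), convert_phys_fn stands
in for tdx_enc_status_changed_phys(), and recorder/record_convert exist only
to observe which physical ranges get converted. The point it illustrates is
that each virtual page is translated on its own, so physically scattered
pages are each handed to the conversion routine as a separate
[pa, pa + page size) range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096UL

/* Hypothetical page table: phys[i] backs the i-th virtual page from vbase. */
struct toy_mapping {
    uint64_t vbase;        /* page-aligned start of the virtual range */
    const uint64_t *phys;  /* physical address of each backing page */
    size_t npages;
};

/* Hypothetical stand-in for tdx_enc_status_changed_phys(). */
typedef bool (*convert_phys_fn)(uint64_t start_pa, uint64_t end_pa, void *ctx);

bool convert_range_by_page(const struct toy_mapping *m, uint64_t vaddr,
                           size_t numpages, convert_phys_fn convert, void *ctx)
{
    size_t first, i;

    /* Like the patch, refuse a start address that is not page aligned. */
    if (vaddr % TOY_PAGE_SIZE != 0 || vaddr < m->vbase)
        return false;

    first = (vaddr - m->vbase) / TOY_PAGE_SIZE;
    if (first > m->npages || numpages > m->npages - first)
        return false;

    /* Translate and convert one page at a time. */
    for (i = 0; i < numpages; i++) {
        uint64_t pa = m->phys[first + i];

        if (!convert(pa, pa + TOY_PAGE_SIZE, ctx))
            return false;
    }

    return true;
}

/* Test helper: remembers the start of every range it was asked to convert. */
struct recorder {
    uint64_t seen[8];
    size_t count;
};

bool record_convert(uint64_t start_pa, uint64_t end_pa, void *ctx)
{
    struct recorder *r = ctx;

    if (end_pa - start_pa != TOY_PAGE_SIZE || r->count >= 8)
        return false;

    r->seen[r->count++] = start_pa;
    return true;
}
```

The cost of this approach is one conversion call per 4K page, so the
large-page fast path in try_accept_page() cannot help for such ranges; the
patch keeps the single-call path for addresses that are not vmalloc()
addresses.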
* Re: [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
2023-06-16 4:46 [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Dexuan Cui
2023-06-16 4:47 ` [PATCH v7 1/2] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
2023-06-16 4:47 ` [PATCH v7 2/2] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Dexuan Cui
@ 2023-06-19 13:47 ` Kirill A. Shutemov
2023-06-19 16:23 ` Dexuan Cui
2 siblings, 1 reply; 5+ messages in thread
From: Kirill A. Shutemov @ 2023-06-19 13:47 UTC
To: Dexuan Cui
Cc: ak, arnd, bp, brijesh.singh, dan.j.williams, dave.hansen,
dave.hansen, haiyangz, hpa, jane.chu, kirill.shutemov, kys,
linux-arch, linux-hyperv, luto, mingo, peterz, rostedt,
sathyanarayanan.kuppuswamy, seanjc, tglx, tony.luck, wei.liu, x86,
mikelley, linux-kernel, Tianyu.Lan, rick.p.edgecombe
On Thu, Jun 15, 2023 at 09:46:59PM -0700, Dexuan Cui wrote:
> The two patches (which are based on the latest x86/tdx branch in the tip
> tree) are the x86/tdx part of the v6 patchset:
> https://lwn.net/ml/linux-kernel/20230504225351.10765-1-decui@microsoft.com/
>
> The other patches of the v6 patchset need more changes in preparation for
> the upcoming paravisor support, so let me post the x86/tdx part first.
>
> This v7 patchset addressed Dave's comments on patch 1:
> see https://lwn.net/ml/linux-kernel/SA1PR21MB1335736123C2BCBBFD7460C3BF46A@SA1PR21MB1335.namprd21.prod.outlook.com/
>
> Patch 2 is just a repost. There was a race between set_memory_encrypted()
> and load_unaligned_zeropad(), which has been fixed by the 3 patches of
> Kirill in the x86/tdx branch of the tip tree:
> 3f6819dd192e ("x86/mm: Allow guest.enc_status_change_prepare() to fail")
> 195edce08b63 ("x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()")
> 94142c9d1bdf ("x86/mm: Fix enc_status_change_finish_noop()")
> (see https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=x86/tdx)
>
> If you want to view the patchset on github, it is here:
> https://github.com/dcui/tdx/commits/decui/upstream-tip/x86/tdx/v7
JFYI, it won't apply to tip/master. The unaccepted memory changes modified
the code you are patching.
--
Kiryl Shutsemau / Kirill A. Shutemov
* RE: [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part)
2023-06-19 13:47 ` [PATCH v7 0/2] Support TDX guests on Hyper-V (the x86/tdx part) Kirill A. Shutemov
@ 2023-06-19 16:23 ` Dexuan Cui
0 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2023-06-19 16:23 UTC
To: Kirill A. Shutemov
Cc: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
brijesh.singh@amd.com, dan.j.williams@intel.com,
dave.hansen@intel.com, dave.hansen@linux.intel.com, Haiyang Zhang,
hpa@zytor.com, jane.chu@oracle.com,
kirill.shutemov@linux.intel.com, KY Srinivasan,
linux-arch@vger.kernel.org, linux-hyperv@vger.kernel.org,
luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
rostedt@goodmis.org, sathyanarayanan.kuppuswamy@linux.intel.com,
seanjc@google.com, tglx@linutronix.de, tony.luck@intel.com,
wei.liu@kernel.org, x86@kernel.org, Michael Kelley (LINUX),
linux-kernel@vger.kernel.org, Tianyu Lan,
rick.p.edgecombe@intel.com
> From: Kirill A. Shutemov <kirill@shutemov.name>
> Sent: Monday, June 19, 2023 6:47 AM
> ...
> JFYI, it won't apply to tip/master. The unaccepted memory changes modified
> the code you are patching.
Thanks for letting me know! I'll rebase to tip/master and repost shortly.