linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dexuan Cui <decui@microsoft.com>
To: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
	brijesh.singh@amd.com, dan.j.williams@intel.com,
	dave.hansen@linux.intel.com, haiyangz@microsoft.com,
	hpa@zytor.com, jane.chu@oracle.com,
	kirill.shutemov@linux.intel.com, kys@microsoft.com,
	linux-arch@vger.kernel.org, linux-hyperv@vger.kernel.org,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	rostedt@goodmis.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	seanjc@google.com, tglx@linutronix.de, tony.luck@intel.com,
	wei.liu@kernel.org, x86@kernel.org, mikelley@microsoft.com
Cc: linux-kernel@vger.kernel.org, Tianyu.Lan@microsoft.com,
	Dexuan Cui <decui@microsoft.com>
Subject: [PATCH v3 2/6] x86/tdx: Support vmalloc() for tdx_enc_status_changed()
Date: Mon,  6 Feb 2023 11:24:15 -0800	[thread overview]
Message-ID: <20230206192419.24525-3-decui@microsoft.com> (raw)
In-Reply-To: <20230206192419.24525-1-decui@microsoft.com>

When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf()
allocates buffers using vzalloc(), and needs to share the buffers with the
host OS by calling set_memory_decrypted(), which is not working for
vmalloc() yet. Add the support by handling the pages one by one.

Signed-off-by: Dexuan Cui <decui@microsoft.com>

---

Changes in v2:
  Changed tdx_enc_status_changed() in place.

Hi, Dave, I checked the huge vmalloc mapping code, but still don't know
how to get the underlying huge page info (if huge page is in use) and
try to use PG_LEVEL_2M/1G in try_accept_page() for vmalloc: I checked
is_vm_area_hugepages() and  __vfree() -> __vunmap(), and I think the
underlying page allocation info is internal to the mm code, and there
is no mm API to for me get the info in tdx_enc_status_changed().

Hi, Kirill, the load_unaligned_zeropad() issue is not addressed in
this patch. The issue looks like a generic issue that also happens to
AMD SNP vTOM mode and C-bit mode. Will need to figure out how to
address the issue. If we decide to adjust direct mapping to have the
shared bit set, it lools like we need to do the below for each
'start_va' vmalloc page:
  pa = slow_virt_to_phys(start_va);
  set_memory_decrypted(phys_to_virt(pa), 1); -- this line calls
tdx_enc_status_changed() the second time for the same age, which is not
great. It looks like we need to find a way to reuse the cpa_flush()
related code in __set_memory_enc_pgtable() and make sure we call
tdx_enc_status_changed() only once for the same page from vmalloc()?

Changes in v3:
  No change since v2.

 arch/x86/coco/tdx/tdx.c | 69 ++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 25 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 6e2665c07395..2cad4b8c4dc4 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -7,6 +7,7 @@
 #include <linux/cpufeature.h>
 #include <linux/export.h>
 #include <linux/io.h>
+#include <linux/mm.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
@@ -800,6 +801,34 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len,
 	return true;
 }
 
+static bool try_accept_page(phys_addr_t start, phys_addr_t end)
+{
+	/*
+	 * For shared->private conversion, accept the page using
+	 * TDX_ACCEPT_PAGE TDX module call.
+	 */
+	while (start < end) {
+		unsigned long len = end - start;
+
+		/*
+		 * Try larger accepts first. It gives chance to VMM to keep
+		 * 1G/2M SEPT entries where possible and speeds up process by
+		 * cutting number of hypercalls (if successful).
+		 */
+
+		if (try_accept_one(&start, len, PG_LEVEL_1G))
+			continue;
+
+		if (try_accept_one(&start, len, PG_LEVEL_2M))
+			continue;
+
+		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+			return false;
+	}
+
+	return true;
+}
+
 /*
  * Notify the VMM about page mapping conversion. More info about ABI
  * can be found in TDX Guest-Host-Communication Interface (GHCI),
@@ -856,37 +885,27 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
  */
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end   = __pa(vaddr + numpages * PAGE_SIZE);
+	bool is_vmalloc = is_vmalloc_addr((void *)vaddr);
+	unsigned long len = numpages * PAGE_SIZE;
+	void *start_va = (void *)vaddr, *end_va = start_va + len;
+	phys_addr_t start_pa, end_pa;
 
-	if (!tdx_map_gpa(start, end, enc))
+	if (offset_in_page(start_va) != 0)
 		return false;
 
-	/* private->shared conversion  requires only MapGPA call */
-	if (!enc)
-		return true;
-
-	/*
-	 * For shared->private conversion, accept the page using
-	 * TDX_ACCEPT_PAGE TDX module call.
-	 */
-	while (start < end) {
-		unsigned long len = end - start;
-
-		/*
-		 * Try larger accepts first. It gives chance to VMM to keep
-		 * 1G/2M SEPT entries where possible and speeds up process by
-		 * cutting number of hypercalls (if successful).
-		 */
-
-		if (try_accept_one(&start, len, PG_LEVEL_1G))
-			continue;
+	while (start_va < end_va) {
+		start_pa = is_vmalloc ? slow_virt_to_phys(start_va) :
+					__pa(start_va);
+		end_pa = start_pa + (is_vmalloc ? PAGE_SIZE : len);
 
-		if (try_accept_one(&start, len, PG_LEVEL_2M))
-			continue;
+		if (!tdx_map_gpa(start_pa, end_pa, enc))
+			return false;
 
-		if (!try_accept_one(&start, len, PG_LEVEL_4K))
+		/* private->shared conversion requires only MapGPA call */
+		if (enc && !try_accept_page(start_pa, end_pa))
 			return false;
+
+		start_va += is_vmalloc ? PAGE_SIZE : len;
 	}
 
 	return true;
-- 
2.25.1


  parent reply	other threads:[~2023-02-06 19:26 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 19:24 [PATCH v3 0/6] Support TDX guests on Hyper-V Dexuan Cui
2023-02-06 19:24 ` [PATCH v3 1/6] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Dexuan Cui
2023-02-15 16:52   ` Kirill A. Shutemov
2023-02-06 19:24 ` Dexuan Cui [this message]
2023-02-17 13:19   ` [PATCH v3 2/6] x86/tdx: Support vmalloc() for tdx_enc_status_changed() Kirill A. Shutemov
2023-03-03  7:16     ` Dexuan Cui
2023-02-06 19:24 ` [PATCH v3 3/6] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
2023-02-06 19:24 ` [PATCH v3 4/6] x86/hyperv: Support hypercalls for " Dexuan Cui
2023-02-17  6:01   ` Sathyanarayanan Kuppuswamy
2023-02-06 19:24 ` [PATCH v3 5/6] Drivers: hv: vmbus: Support " Dexuan Cui
2023-02-06 19:24 ` [PATCH v3 6/6] x86/hyperv: Fix serial console interrupts for " Dexuan Cui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230206192419.24525-3-decui@microsoft.com \
    --to=decui@microsoft.com \
    --cc=Tianyu.Lan@microsoft.com \
    --cc=ak@linux.intel.com \
    --cc=arnd@arndb.de \
    --cc=bp@alien8.de \
    --cc=brijesh.singh@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=haiyangz@microsoft.com \
    --cc=hpa@zytor.com \
    --cc=jane.chu@oracle.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kys@microsoft.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).