From: Dexuan Cui <decui@microsoft.com>
To: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de,
brijesh.singh@amd.com, dan.j.williams@intel.com,
dave.hansen@intel.com, dave.hansen@linux.intel.com,
haiyangz@microsoft.com, hpa@zytor.com, jane.chu@oracle.com,
kirill.shutemov@linux.intel.com, kys@microsoft.com,
linux-hyperv@vger.kernel.org, luto@kernel.org, mingo@redhat.com,
peterz@infradead.org, rostedt@goodmis.org,
sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com,
tglx@linutronix.de, tony.luck@intel.com, wei.liu@kernel.org,
Jason@zx2c4.com, nik.borisov@suse.com, mikelley@microsoft.com
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
linux-arch@vger.kernel.org, Tianyu.Lan@microsoft.com,
rick.p.edgecombe@intel.com, andavis@redhat.com,
mheslin@redhat.com, vkuznets@redhat.com, xiaoyao.li@intel.com,
Dexuan Cui <decui@microsoft.com>
Subject: [PATCH v2 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
Date: Sun, 20 Aug 2023 13:27:11 -0700 [thread overview]
Message-ID: <20230820202715.29006-6-decui@microsoft.com> (raw)
In-Reply-To: <20230820202715.29006-1-decui@microsoft.com>
Don't set *this_cpu_ptr(hyperv_pcpu_input_arg) before the function
set_memory_decrypted() returns, otherwise we run into this ticky issue:
For a fully enlightened TDX/SNP VM, in hv_common_cpu_init(),
*this_cpu_ptr(hyperv_pcpu_input_arg) is an encrypted page before
the set_memory_decrypted() returns.
When such a VM has more than 64 VPs, if the hyperv_pcpu_input_arg is not
NULL, hv_common_cpu_init() -> set_memory_decrypted() -> ... ->
cpa_flush() -> on_each_cpu() -> ... -> hv_send_ipi_mask() -> ... ->
__send_ipi_mask_ex() tries to call hv_do_fast_hypercall16() with the
hyperv_pcpu_input_arg as the hypercall input page, which must be a
decrypted page in such a VM, but the page is still encrypted at this
point, and a fatal fault is triggered.
Fix the issue by setting *this_cpu_ptr(hyperv_pcpu_input_arg) after
set_memory_decrypted(): if the hyperv_pcpu_input_arg is NULL,
__send_ipi_mask_ex() returns HV_STATUS_INVALID_PARAMETER immediately,
and hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
which can use x2apic_send_IPI_all(), which may be slightly slower than
the hypercall but still works correctly in such a VM.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
Changes in v2: None
drivers/hv/hv_common.c | 30 +++++++++++++++++++++++-------
1 file changed, 23 insertions(+), 7 deletions(-)
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 897bbb96f4118..4c858e1636da7 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -360,6 +360,7 @@ int hv_common_cpu_init(unsigned int cpu)
u64 msr_vp_index;
gfp_t flags;
int pgcount = hv_root_partition ? 2 : 1;
+ void *mem;
int ret;
/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
@@ -372,25 +373,40 @@ int hv_common_cpu_init(unsigned int cpu)
* allocated if this CPU was previously online and then taken offline
*/
if (!*inputarg) {
- *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
- if (!(*inputarg))
+ mem = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
+ if (!mem)
return -ENOMEM;
if (hv_root_partition) {
outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
- *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
+ *outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
}
if (hv_isolation_type_en_snp() || hv_isolation_type_tdx()) {
- ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);
+ ret = set_memory_decrypted((unsigned long)mem, pgcount);
if (ret) {
- /* It may be unsafe to free *inputarg */
- *inputarg = NULL;
+ /* It may be unsafe to free 'mem' */
return ret;
}
- memset(*inputarg, 0x00, pgcount * PAGE_SIZE);
+ memset(mem, 0x00, pgcount * HV_HYP_PAGE_SIZE);
}
+
+ /*
+ * In a fully enlightened TDX/SNP VM with more than 64 VPs, if
+ * hyperv_pcpu_input_arg is not NULL, set_memory_decrypted() ->
+ * ... -> cpa_flush()-> ... -> __send_ipi_mask_ex() tries to
+ * use hyperv_pcpu_input_arg as the hypercall input page, which
+ * must be a decrypted page in such a VM, but the page is still
+ * encrypted before set_memory_decrypted() returns. Fix this by
+ * setting *inputarg after the above set_memory_decrypted(): if
+ * hyperv_pcpu_input_arg is NULL, __send_ipi_mask_ex() returns
+ * HV_STATUS_INVALID_PARAMETER immediately, and the function
+ * hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
+ * which may be slightly slower than the hypercall, but still
+ * works correctly in such a VM.
+ */
+ *inputarg = mem;
}
msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX);
--
2.25.1
next prev parent reply other threads:[~2023-08-20 20:33 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-20 20:27 [PATCH v2 0/9] Support TDX guests on Hyper-V (the Hyper-V specific part) Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 1/9] x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 2/9] x86/hyperv: Support hypercalls for fully enlightened " Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 3/9] Drivers: hv: vmbus: Support " Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 4/9] x86/hyperv: Fix serial console interrupts for " Dexuan Cui
2023-08-20 20:27 ` Dexuan Cui [this message]
2023-08-21 14:29 ` [PATCH v2 5/9] Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM Michael Kelley (LINUX)
2023-08-21 18:17 ` Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 6/9] x86/hyperv: Introduce a global variable hyperv_paravisor_present Dexuan Cui
2023-08-21 19:33 ` Michael Kelley (LINUX)
2023-08-23 4:23 ` Dexuan Cui
2023-08-23 4:28 ` Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 7/9] Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 8/9] x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM " Dexuan Cui
2023-08-21 19:33 ` Michael Kelley (LINUX)
2023-08-23 4:30 ` Dexuan Cui
2023-08-20 20:27 ` [PATCH v2 9/9] x86/hyperv: Remove hv_isolation_type_en_snp Dexuan Cui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230820202715.29006-6-decui@microsoft.com \
--to=decui@microsoft.com \
--cc=Jason@zx2c4.com \
--cc=Tianyu.Lan@microsoft.com \
--cc=ak@linux.intel.com \
--cc=andavis@redhat.com \
--cc=arnd@arndb.de \
--cc=bp@alien8.de \
--cc=brijesh.singh@amd.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@intel.com \
--cc=dave.hansen@linux.intel.com \
--cc=haiyangz@microsoft.com \
--cc=hpa@zytor.com \
--cc=jane.chu@oracle.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kys@microsoft.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mheslin@redhat.com \
--cc=mikelley@microsoft.com \
--cc=mingo@redhat.com \
--cc=nik.borisov@suse.com \
--cc=peterz@infradead.org \
--cc=rick.p.edgecombe@intel.com \
--cc=rostedt@goodmis.org \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=seanjc@google.com \
--cc=tglx@linutronix.de \
--cc=tony.luck@intel.com \
--cc=vkuznets@redhat.com \
--cc=wei.liu@kernel.org \
--cc=x86@kernel.org \
--cc=xiaoyao.li@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.