[PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args

* [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args
@ 2025-03-13  6:19 mhkelley58
  2025-03-13  6:19 ` [PATCH v2 1/6] Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments mhkelley58
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

This patch set introduces a new way to manage the use of the per-cpu
memory that usually holds the input and output arguments to Hyper-V
hypercalls. Current code allocates the "hyperv_pcpu_input_arg", and in
some configurations, the "hyperv_pcpu_output_arg". Each is a 4 KiB
page of memory allocated per-vCPU. A hypercall call site disables
interrupts, uses this memory to set up the input parameters for the
hypercall, reads the output results after hypercall execution, and
re-enables interrupts. The open coding of these steps has led to
inconsistencies, and in some cases, violation of the generic
requirements for the hypercall input and output as described in the
Hyper-V Top Level Functional Spec (TLFS)[1]. This patch set introduces
a new family of inline functions to replace the open coding. The new
functions encapsulate key aspects of the use of per-vCPU memory for
hypercall input and output, and ensure that the TLFS requirements are
met (max size of 1 page each for input and output, no overlap of input
and output, aligned to 8 bytes, etc.).

With this change, hypercall call sites no longer directly access
"hyperv_pcpu_input_arg" and "hyperv_pcpu_output_arg". Instead, one of
a family of new functions provides the per-cpu memory that a hypercall
call site uses to set up hypercall input and output areas.
Conceptually, there is no longer a difference between the "per-vCPU
input page" and "per-vCPU output page". Only a single per-vCPU page is
allocated, and it is used to provide both hypercall input and output.
All current hypercalls can fit their input and output within that single
page, though the new code allows easy changing to two pages should a
future hypercall require a full page for each of the input and output.

The new functions always zero the fixed-size portion of the hypercall
input area (but not any array portion -- see below) so that
uninitialized memory isn't inadvertently passed to the hypercall.
Current open-coded hypercall call sites are inconsistent on this point,
and use of the new functions addresses that inconsistency. The output
area is not zero'ed by the new code as it is Hyper-V's responsibility
to provide legal output.
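
As an illustration of the zeroing rule described above, here is a
minimal userspace sketch. The structure and the sketch_*() names are
hypothetical and are not part of this series. It zeroes only the fixed
portion, rounded up to 8 bytes, and leaves the array portion alone:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hypercall input: 16-byte fixed portion plus an array */
struct sketch_input {
	uint64_t partition_id;
	uint32_t flags;
	uint32_t reserved;
	uint64_t entries[];	/* array portion, filled in by the caller */
};

/* Round up to the next multiple of 8 */
static size_t sketch_align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Zero only the fixed-size portion; the array portion is left untouched */
static void sketch_zero_fixed(void *area, size_t fixed_size)
{
	assert(area != NULL);
	memset(area, 0, sketch_align8(fixed_size));
}
```

A caller would pass sizeof(struct sketch_input) as fixed_size, so that
reserved fields are cleared without touching the entries[] that follow.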

When the input or output (or both) contain an array, the new code
calculates and returns how many array entries fit within the per-cpu
memory page, which is effectively the "batch size" for the hypercall
processing multiple entries. This batch size can then be used in the
hypercall control word to specify the repetition count. This
calculation of the batch size replaces current open coding of the
batch size, which is prone to errors. Note that the array portion of
the input area is *not* zero'ed. The arrays are almost always 64-bit
GPAs or something similar, and zero'ing that much memory seems
wasteful at runtime when it will all be overwritten. The hypercall
call site is responsible for ensuring that no part of the array is
left uninitialized (just as with current code).
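
The batch size calculation can be sketched in userspace as follows.
This mirrors the logic of Patch 1 for the single-page case only; the
sketch_*() names are hypothetical, and the asserts stand in for checks
that the real code leaves to the compiler and the call site:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* one 4 KiB per-vCPU page, as in the text */

/* Return how many array entries fit; 0 if neither side has an array */
static uint32_t sketch_batch_size(uint32_t in_size, uint32_t in_entry_size,
				  uint32_t out_size, uint32_t out_entry_size)
{
	uint32_t space = SKETCH_PAGE_SIZE;
	uint32_t in_count = 0, out_count = 0;

	assert(in_size + out_size <= SKETCH_PAGE_SIZE);

	/* Two arrays split the page; one array gets all but the other side */
	if (in_entry_size && out_entry_size)
		space /= 2;
	else if (in_entry_size)
		space -= out_size;
	else if (out_entry_size)
		space -= in_size;

	if (in_entry_size) {
		assert(in_size <= space);
		in_count = (space - in_size) / in_entry_size;
	}
	if (out_entry_size) {
		assert(out_size <= space);
		out_count = (space - out_size) / out_entry_size;
	}

	/* With two arrays, the smaller count limits the batch */
	if (in_count && out_count)
		return in_count < out_count ? in_count : out_count;
	return in_count | out_count;
}
```

For example, a 16-byte fixed input followed by 8-byte entries and no
output yields (4096 - 16) / 8 = 510 entries per hypercall.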

The new family of functions is realized as a single inline function
that handles the most complex case, which is a hypercall with input
and output, both of which contain arrays. Simpler cases are mapped to
this most complex case with #define wrappers that provide zero or NULL
for some arguments. Several of the arguments to this new function
must be compile-time constants generated by "sizeof()" expressions.
As such, most of the code in the new function is evaluated by the
compiler, with the result that the runtime code paths are no longer
than with the current open coding. An exception is the new code
generated to zero the fixed-size portion of the input area in cases
where it was not previously done.
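
The "one general function plus #define wrappers" structure can be
sketched in userspace roughly as below. This is a simplified
illustration, not the code in Patch 1: arrays are omitted, a static
buffer stands in for the per-vCPU page, and all sketch_*() names are
hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Stand-in for the per-vCPU page; uint64_t gives 8-byte alignment */
static uint64_t sketch_page[SKETCH_PAGE_SIZE / sizeof(uint64_t)];

static uint32_t sketch_align8(uint32_t n)
{
	return (n + 7u) & ~7u;
}

/*
 * General case (arrays omitted for brevity). "input" and "output" are the
 * addresses of the caller's pointer variables, or NULL if unused. The
 * void ** store mirrors the kernel idiom; the kernel is built with
 * -fno-strict-aliasing.
 */
static void sketch_inout(void *input, uint32_t in_size,
			 void *output, uint32_t out_size)
{
	unsigned char *space = (unsigned char *)sketch_page;
	uint32_t in_total = sketch_align8(in_size);

	assert(in_total + sketch_align8(out_size) <= SKETCH_PAGE_SIZE);
	if (input) {
		*(void **)input = space;
		memset(space, 0, in_total);	/* zero the fixed input */
	}
	if (output)
		*(void **)output = space + in_total;	/* output follows input */
}

/* Simpler case expressed through the general one */
#define sketch_in(input, in_size) sketch_inout(input, in_size, NULL, 0)

struct sketch_args {
	uint32_t words[3];	/* 12 bytes, rounded up to 16 in the page */
};
```

Because the size arguments come from sizeof(), a compiler can fold the
alignment and offset arithmetic at each call site, which is the property
the paragraph above relies on.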

Use of the new function typically (but not always) saves a few lines
of code at each hypercall call site. This is traded off against the
lines of code added for the new functions. With code currently
upstream, the net is an addition of about 60 lines of code and comments.
However, as additional hypercall call sites are upstreamed from the
OpenHCL project[2] in support of Linux running in the Hyper-V root
partition and in VTLs other than VTL 0, the net lines of code added is
nearly zero.

A couple of hypercall call sites have requirements that are not fully
handled by the new function. These still require some manual
open-coded adjustment or open-coded batch size calculations -- see the
individual patches in this series. Suggestions on how to do better
are welcome.

The patches in the series do the following:

Patch 1: Introduce the new family of functions for assigning hypercall
         input and output arguments.

Patches 2 to 5: Change existing hypercall call sites to use one of the new
         functions. In some cases, tweaks to the hypercall argument data
         structures are necessary, but these tweaks are making the data
         structures more consistent with the overall pattern. These
         four patches are independent of each other, and can go in any
         order. The breakup into 4 patches is for ease of review.

Patch 6: Update the name of the variable used to hold the per-cpu memory
         used for hypercall arguments. Remove code for managing the
         per-cpu output page.

Patch 1 from v1 of the patch set has been dropped in v2. It was a bug
fix that has already been picked up.

The new code compiles and runs successfully on x86 and arm64. Separate
from this patch set, for evaluation purposes I also applied the
changes to the additional hypercall call sites in the OpenHCL
project[2]. However, I don't have the hardware or Hyper-V
configurations needed to test running in the Hyper-V root partition or
in a VTL other than VTL 0. So the related hypercall call sites still
need to be tested to make sure I didn't break anything. Hopefully
someone with the necessary configurations and Hyper-V versions can
help with that testing.

For gcc 9.4.0, I've looked at the generated code for a couple of
hypercall call sites on both x86 and arm64 to ensure that it boils
down to the equivalent of the current open coding. I have not looked
at the generated code for later gcc versions or for Clang/LLVM, but
there's no reason to expect something worse as the code isn't doing
anything tricky.

This patch set is built against linux-next-20250311.

[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
[2] https://github.com/microsoft/OHCL-Linux-Kernel

Michael Kelley (6):
  Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments
  x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1
  x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2
  Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments
  PCI: hv: Use hv_hvcall_*() to set up hypercall arguments
  Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg

 arch/x86/hyperv/hv_apic.c           |  10 ++-
 arch/x86/hyperv/hv_init.c           |  12 ++--
 arch/x86/hyperv/hv_vtl.c            |   9 +--
 arch/x86/hyperv/irqdomain.c         |  17 +++--
 arch/x86/hyperv/ivm.c               |  18 ++---
 arch/x86/hyperv/mmu.c               |  19 ++---
 arch/x86/hyperv/nested.c            |  14 ++--
 drivers/hv/hv.c                     |  11 +--
 drivers/hv/hv_balloon.c             |   4 +-
 drivers/hv/hv_common.c              |  57 +++++----------
 drivers/hv/hv_proc.c                |   8 +--
 drivers/hv/hyperv_vmbus.h           |   2 +-
 drivers/pci/controller/pci-hyperv.c |  18 +++--
 include/asm-generic/mshyperv.h      | 103 +++++++++++++++++++++++++++-
 include/hyperv/hvgdk_mini.h         |   6 +-
 15 files changed, 184 insertions(+), 124 deletions(-)

-- 
2.25.1

* [PATCH v2 1/6] Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-03-13  6:19 ` [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1 mhkelley58
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

Current code allocates the "hyperv_pcpu_input_arg", and in
some configurations, the "hyperv_pcpu_output_arg". Each is a 4 KiB
page of memory allocated per-vCPU. A hypercall call site disables
interrupts, uses this memory to set up the input parameters for the
hypercall, reads the output results after hypercall execution, and
re-enables interrupts. The open coding of these steps leads to
inconsistencies, and in some cases, violation of the generic
requirements for the hypercall input and output as described in the
Hyper-V Top Level Functional Spec (TLFS)[1].

To reduce these kinds of problems, introduce a family of inline
functions to replace the open coding. The functions provide a new way
to manage the use of this per-vCPU memory, which usually holds the input
and output arguments to Hyper-V hypercalls. The functions encapsulate
key aspects of the usage and ensure that the TLFS requirements are
met (max size of 1 page each for input and output, no overlap of
input and output, aligned to 8 bytes, etc.). Conceptually, there
is no longer a difference between the "per-vCPU input page" and
"per-vCPU output page". Only a single per-vCPU page is allocated, and
it provides both hypercall input and output memory. All current
hypercalls can fit their input and output within that single page,
though the new code allows easy changing to two pages should a future
hypercall require a full page for each of the input and output.

The new functions always zero the fixed-size portion of the hypercall
input area so that uninitialized memory is not inadvertently passed
to the hypercall. Current open-coded hypercall call sites are
inconsistent on this point, and use of the new functions addresses
that inconsistency. The output area is not zero'ed by the new code
as it is Hyper-V's responsibility to provide legal output.

When the input or output (or both) contain an array, the new functions
calculate and return how many array entries fit within the per-cpu
memory page, which is effectively the "batch size" for the hypercall
processing multiple entries. This batch size can then be used in the
hypercall control word to specify the repetition count. This
calculation of the batch size replaces current open coding of the
batch size, which is prone to errors. Note that the array portion of
the input area is *not* zero'ed. The arrays are almost always 64-bit
GPAs or something similar, and zero'ing that much memory seems
wasteful at runtime when it will all be overwritten. The hypercall
call site is responsible for ensuring that no part of the array is
left uninitialized (just as with current code).

The new functions are realized as a single inline function that
handles the most complex case, which is a hypercall with input
and output, both of which contain arrays. Simpler cases are mapped to
this most complex case with #define wrappers that provide zero or NULL
for some arguments. Several of the arguments to this new function
must be compile-time constants generated by "sizeof()"
expressions. As such, most of the code in the new function can be
evaluated by the compiler, with the result that the code paths are
no longer than with the current open coding. The one exception is
new code generated to zero the fixed-size portion of the input area
in cases where it is not currently done.

[1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---

Notes:
    Changes in v2:
    * Added comment that hv_hvcall_inout_array() should always be called with
      interrupts disabled because it is returning pointers to per-cpu memory
      [Nuno Das Neves]

 include/asm-generic/mshyperv.h | 103 +++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index b13b0cda4ac8..01e8763edc2c 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -135,6 +135,109 @@ static inline u64 hv_do_rep_hypercall(u16 code, u16 rep_count, u16 varhead_size,
 	return status;
 }

+/*
+ * Hypercall input and output argument setup
+ */
+
+/* Temporary mapping to be removed at the end of the patch series */
+#define hyperv_pcpu_arg hyperv_pcpu_input_arg
+
+/*
+ * Allocate one page that is shared between input and output args, which is
+ * sufficient for all current hypercalls. If a future hypercall requires
+ * more space, change this value to "2" and everything will work.
+ */
+#define HV_HVCALL_ARG_PAGES 1
+
+/*
+ * Allocate space for hypercall input and output arguments from the
+ * pre-allocated per-cpu hyperv_pcpu_arg page(s). A NULL value for the input
+ * or output indicates to allocate no space for that argument. For input and
+ * for output, specify the size of the fixed portion, and the size of an
+ * element in a variable size array. A zero value for entry_size indicates
+ * there is no array. The fixed size space for the input is zero'ed.
+ *
+ * When variable size arrays are present, the function returns the number of
+ * elements (i.e., the batch size) that fit in the available space.
+ *
+ * The four "size" arguments must be constants so the compiler can do most of
+ * the calculations. Then the generated inline code is no larger than if open
+ * coding the access to the hyperv_pcpu_arg and doing memset() on the input.
+ *
+ * This function must be called with interrupts disabled so the thread is not
+ * rescheduled onto another vCPU while accessing the per-cpu args page.
+ */
+static inline int hv_hvcall_inout_array(void *input, u32 in_size, u32 in_entry_size,
+					void *output, u32 out_size, u32 out_entry_size)
+{
+	u32 in_batch_count = 0, out_batch_count = 0, batch_count;
+	u32 in_total_size, out_total_size, offset;
+	u32 batch_space = HV_HYP_PAGE_SIZE * HV_HVCALL_ARG_PAGES;
+	void *space;
+
+	/*
+	 * If input and output have arrays, allocate half the space to input
+	 * and half to output. If only input has an array, the array can use
+	 * all the space except for the fixed size output (but not to exceed
+	 * one page), and vice versa.
+	 */
+	if (in_entry_size && out_entry_size)
+		batch_space = batch_space / 2;
+	else if (in_entry_size)
+		batch_space = min(HV_HYP_PAGE_SIZE, batch_space - out_size);
+	else if (out_entry_size)
+		batch_space = min(HV_HYP_PAGE_SIZE, batch_space - in_size);
+
+	if (in_entry_size)
+		in_batch_count = (batch_space - in_size) / in_entry_size;
+	if (out_entry_size)
+		out_batch_count = (batch_space - out_size) / out_entry_size;
+
+	/*
+	 * If input and output have arrays, use the smaller of the two batch
+	 * counts, in case they are different. If only one has an array, use
+	 * that batch count. batch_count will be zero if neither has an array.
+	 */
+	if (in_batch_count && out_batch_count)
+		batch_count = min(in_batch_count, out_batch_count);
+	else
+		batch_count = in_batch_count | out_batch_count;
+
+	in_total_size = ALIGN(in_size + (in_entry_size * batch_count), 8);
+	out_total_size = ALIGN(out_size + (out_entry_size * batch_count), 8);
+
+	space = *this_cpu_ptr(hyperv_pcpu_arg);
+	if (input) {
+		*(void **)input = space;
+		if (space)
+			/* Zero the fixed size portion, not any array portion */
+			memset(space, 0, ALIGN(in_size, 8));
+	}
+
+	if (output) {
+		if (in_total_size + out_total_size <= HV_HYP_PAGE_SIZE) {
+			offset = in_total_size;
+		} else {
+			offset = HV_HYP_PAGE_SIZE;
+			/* Need more than 1 page, but only 1 was allocated */
+			BUILD_BUG_ON(HV_HVCALL_ARG_PAGES == 1);
+		}
+		*(void **)output = space + offset;
+	}
+
+	return batch_count;
+}
+
+/* Wrappers for some of the simpler cases with only input, or with no arrays */
+#define hv_hvcall_in(input, in_size) \
+	hv_hvcall_inout_array(input, in_size, 0, NULL, 0, 0)
+
+#define hv_hvcall_inout(input, in_size, output, out_size) \
+	hv_hvcall_inout_array(input, in_size, 0, output, out_size, 0)
+
+#define hv_hvcall_in_array(input, in_size, in_entry_size) \
+	hv_hvcall_inout_array(input, in_size, in_entry_size, NULL, 0, 0)
+
 /* Generate the guest OS identifier as described in the Hyper-V TLFS */
 static inline u64 hv_generate_guest_id(u64 kernel_version)
 {
-- 
2.25.1

* [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
  2025-03-13  6:19 ` [PATCH v2 1/6] Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-03-21 19:21   ` Nuno Das Neves
  2025-03-13  6:19 ` [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2 mhkelley58
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

Update hypercall call sites to use the new hv_hvcall_*() functions
to set up hypercall arguments. Since these functions zero the
fixed portion of input memory, remove now redundant calls to memset()
and explicit zero'ing of input fields.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---

Notes:
    Changes in v2:
    * Fixed get_vtl() and hv_vtl_apicid_to_vp_id() to properly treat the input
      and output arguments as arrays [Nuno Das Neves]
    * Enhanced __send_ipi_mask_ex() and hv_map_interrupt() to check the number
      of computed banks in the hv_vpset against the batch_size. Since an
      hv_vpset currently represents a maximum of 4096 CPUs, the hv_vpset size
      does not exceed 512 bytes and there should always be sufficient space. But
      do the check just in case something changes. [Nuno Das Neves]
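
    The 512-byte figure above can be reproduced with a short userspace
    sketch. It assumes each bank is a 64-bit mask covering 64 VPs; the
    sketch_*() names are hypothetical, and the second helper mirrors the
    check added to __send_ipi_mask_ex():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VPS_PER_BANK 64u	/* assumed: one bit per VP in a 64-bit bank */

/* Bytes needed for the bank array covering max_vps virtual processors */
static uint32_t sketch_bank_bytes(uint32_t max_vps)
{
	uint32_t banks;

	assert(max_vps > 0);
	banks = (max_vps + SKETCH_VPS_PER_BANK - 1) / SKETCH_VPS_PER_BANK;
	return banks * (uint32_t)sizeof(uint64_t);
}

/* Reject a bank count that is empty, invalid, or larger than the batch size */
static bool sketch_banks_fit(int nr_bank, int batch_size)
{
	return nr_bank > 0 && nr_bank <= batch_size;
}
```

    With 4096 VPs this gives 64 banks of 8 bytes each, or 512 bytes,
    which is well below the space left in a 4 KiB page.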

 arch/x86/hyperv/hv_apic.c   | 10 ++++------
 arch/x86/hyperv/hv_init.c   |  6 ++----
 arch/x86/hyperv/hv_vtl.c    |  9 +++------
 arch/x86/hyperv/irqdomain.c | 17 ++++++++++-------
 4 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index f022d5f64fb6..b5d6a9b2e17a 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -108,21 +108,19 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
 {
 	struct hv_send_ipi_ex *ipi_arg;
 	unsigned long flags;
-	int nr_bank = 0;
+	int batch_size, nr_bank = 0;
 	u64 status = HV_STATUS_INVALID_PARAMETER;
 
 	if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
 		return false;
 
 	local_irq_save(flags);
-	ipi_arg = *this_cpu_ptr(hyperv_pcpu_input_arg);
-
+	batch_size = hv_hvcall_in_array(&ipi_arg, sizeof(*ipi_arg),
+					sizeof(ipi_arg->vp_set.bank_contents[0]));
 	if (unlikely(!ipi_arg))
 		goto ipi_mask_ex_done;
 
 	ipi_arg->vector = vector;
-	ipi_arg->reserved = 0;
-	ipi_arg->vp_set.valid_bank_mask = 0;
 
 	/*
 	 * Use HV_GENERIC_SET_ALL and avoid converting cpumask to VP_SET
@@ -139,7 +137,7 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector,
 		 * represented in VP_SET. Return an error and fall back to
 		 * native (architectural) method of sending IPIs.
 		 */
-		if (nr_bank <= 0)
+		if (nr_bank <= 0 || nr_bank > batch_size)
 			goto ipi_mask_ex_done;
 	} else {
 		ipi_arg->vp_set.format = HV_GENERIC_SET_ALL;
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index ddeb40930bc8..cc843905c23a 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -400,13 +400,11 @@ static u8 __init get_vtl(void)
 	u64 ret;
 
 	local_irq_save(flags);
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
 
-	memset(input, 0, struct_size(input, names, 1));
+	hv_hvcall_inout_array(&input, sizeof(*input), sizeof(input->names[0]),
+			      &output, sizeof(*output), sizeof(output->values[0]));
 	input->partition_id = HV_PARTITION_ID_SELF;
 	input->vp_index = HV_VP_INDEX_SELF;
-	input->input_vtl.as_uint8 = 0;
 	input->names[0] = HV_REGISTER_VSM_VP_STATUS;
 
 	ret = hv_do_hypercall(control, input, output);
diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
index 3f4e20d7b724..5d9aaebe5709 100644
--- a/arch/x86/hyperv/hv_vtl.c
+++ b/arch/x86/hyperv/hv_vtl.c
@@ -94,8 +94,7 @@ static int hv_vtl_bringup_vcpu(u32 target_vp_index, int cpu, u64 eip_ignored)
 
 	local_irq_save(irq_flags);
 
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	memset(input, 0, sizeof(*input));
+	hv_hvcall_in(&input, sizeof(*input));
 
 	input->partition_id = HV_PARTITION_ID_SELF;
 	input->vp_index = target_vp_index;
@@ -185,13 +184,11 @@ static int hv_vtl_apicid_to_vp_id(u32 apic_id)
 
 	local_irq_save(irq_flags);
 
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	memset(input, 0, sizeof(*input));
+	hv_hvcall_inout_array(&input, sizeof(*input), sizeof(input->apic_ids[0]),
+			      &output, 0, sizeof(*output));
 	input->partition_id = HV_PARTITION_ID_SELF;
 	input->apic_ids[0] = apic_id;
 
-	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
-
 	control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_ID_FROM_APIC_ID;
 	status = hv_do_hypercall(control, input, output);
 	ret = output[0];
diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
index 64b921360b0f..1f78b2ea7489 100644
--- a/arch/x86/hyperv/irqdomain.c
+++ b/arch/x86/hyperv/irqdomain.c
@@ -20,15 +20,15 @@ static int hv_map_interrupt(union hv_device_id device_id, bool level,
 	struct hv_device_interrupt_descriptor *intr_desc;
 	unsigned long flags;
 	u64 status;
-	int nr_bank, var_size;
+	int batch_size, nr_bank, var_size;
 
 	local_irq_save(flags);
 
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+	batch_size = hv_hvcall_inout_array(&input, sizeof(*input),
+			sizeof(input->interrupt_descriptor.target.vp_set.bank_contents[0]),
+			&output, sizeof(*output), 0);
 
 	intr_desc = &input->interrupt_descriptor;
-	memset(input, 0, sizeof(*input));
 	input->partition_id = hv_current_partition_id;
 	input->device_id = device_id.as_uint64;
 	intr_desc->interrupt_type = HV_X64_INTERRUPT_TYPE_FIXED;
@@ -40,7 +40,6 @@ static int hv_map_interrupt(union hv_device_id device_id, bool level,
 	else
 		intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_EDGE;
 
-	intr_desc->target.vp_set.valid_bank_mask = 0;
 	intr_desc->target.vp_set.format = HV_GENERIC_SET_SPARSE_4K;
 	nr_bank = cpumask_to_vpset(&(intr_desc->target.vp_set), cpumask_of(cpu));
 	if (nr_bank < 0) {
@@ -48,6 +47,11 @@ static int hv_map_interrupt(union hv_device_id device_id, bool level,
 		pr_err("%s: unable to generate VP set\n", __func__);
 		return EINVAL;
 	}
+	if (nr_bank > batch_size) {
+		local_irq_restore(flags);
+		pr_err("%s: nr_bank too large\n", __func__);
+		return EINVAL;
+	}
 	intr_desc->target.flags = HV_DEVICE_INTERRUPT_TARGET_PROCESSOR_SET;
 
 	/*
@@ -77,9 +81,8 @@ static int hv_unmap_interrupt(u64 id, struct hv_interrupt_entry *old_entry)
 	u64 status;
 
 	local_irq_save(flags);
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
 
-	memset(input, 0, sizeof(*input));
+	hv_hvcall_in(&input, sizeof(*input));
 	intr_entry = &input->interrupt_entry;
 	input->partition_id = hv_current_partition_id;
 	input->device_id = id;
-- 
2.25.1


* [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
  2025-03-13  6:19 ` [PATCH v2 1/6] Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments mhkelley58
  2025-03-13  6:19 ` [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1 mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-03-21 19:35   ` Nuno Das Neves
  2025-03-13  6:19 ` [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments mhkelley58
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

Update hypercall call sites to use the new hv_hvcall_*() functions
to set up hypercall arguments. Since these functions zero the
fixed portion of input memory, remove now redundant calls to memset()
and explicit zero'ing of input fields.

For hv_mark_gpa_visibility(), use the computed batch_size instead
of HV_MAX_MODIFY_GPA_REP_COUNT. Also update the associated gpa_page_list[]
field to have zero size, which is more consistent with other array
arguments to hypercalls. Due to the interaction with the calling
hv_vtom_set_host_visibility(), HV_MAX_MODIFY_GPA_REP_COUNT cannot be
completely eliminated without some further restructuring, but that's
for another patch set.

Similarly, for the nested flush functions, update the gpa_list[] to
have zero size. Again, separate restructuring would be required to
completely eliminate the need for HV_MAX_FLUSH_REP_COUNT.

Finally, hyperv_flush_tlb_others_ex() requires special handling
because the input consists of two arrays -- one for the hv_vp_set and
another for the gva list. The batch_size computed by hv_hvcall_in_array()
is adjusted to account for the number of entries in the hv_vp_set.
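
The adjustment for the two-array case can be sketched in userspace as
follows. The function name and the example numbers are hypothetical; the
arithmetic follows the description above, where space consumed by the
hv_vp_set banks is subtracted from the gva list capacity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * max_gvas: batch size computed as if the gva list were the only array.
 * nr_bank:  number of bank entries actually used by the VP set.
 * Returns the number of gva entries that still fit; a result <= 0 means
 * there is no room for even one entry.
 */
static int sketch_adjust_max_gvas(int max_gvas, int nr_bank,
				  size_t bank_size, size_t gva_size)
{
	assert(nr_bank >= 0 && gva_size != 0);
	return max_gvas - (int)(((size_t)nr_bank * bank_size) / gva_size);
}
```

For instance, if the initial batch size were 508 and the VP set used 64
banks of 8 bytes, 444 gva entries of 8 bytes each would remain.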

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---

Notes:
    Changes in v2:
    * In hyperv_flush_tlb_others_ex(), added check of the adjusted
      max_gvas to make sure it doesn't go to zero or negative, which would
      happen if there is insufficient space to hold the hv_vpset and have
      at least one entry in the gva list. Since an hv_vpset currently
      represents a maximum of 4096 CPUs, the hv_vpset size does not exceed
      512 bytes and there should always be sufficient space. But do the
      check just in case something changes. [Nuno Das Neves]

 arch/x86/hyperv/ivm.c       | 18 +++++++++---------
 arch/x86/hyperv/mmu.c       | 19 +++++--------------
 arch/x86/hyperv/nested.c    | 14 +++++---------
 include/hyperv/hvgdk_mini.h |  4 ++--
 4 files changed, 21 insertions(+), 34 deletions(-)

diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index ec7880271cf9..de983aae0cbe 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -465,30 +465,30 @@ static int hv_mark_gpa_visibility(u16 count, const u64 pfn[],
 {
 	struct hv_gpa_range_for_visibility *input;
 	u64 hv_status;
+	int batch_size;
 	unsigned long flags;
 
 	/* no-op if partition isolation is not enabled */
 	if (!hv_is_isolation_supported())
 		return 0;
 
-	if (count > HV_MAX_MODIFY_GPA_REP_COUNT) {
-		pr_err("Hyper-V: GPA count:%d exceeds supported:%lu\n", count,
-			HV_MAX_MODIFY_GPA_REP_COUNT);
+	local_irq_save(flags);
+	batch_size = hv_hvcall_in_array(&input, sizeof(*input),
+					sizeof(input->gpa_page_list[0]));
+	if (unlikely(!input)) {
+		local_irq_restore(flags);
 		return -EINVAL;
 	}
 
-	local_irq_save(flags);
-	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
-
-	if (unlikely(!input)) {
+	if (count > batch_size) {
+		pr_err("Hyper-V: GPA count:%d exceeds supported:%u\n", count,
+		       batch_size);
 		local_irq_restore(flags);
 		return -EINVAL;
 	}
 
 	input->partition_id = HV_PARTITION_ID_SELF;
 	input->host_visibility = visibility;
-	input->reserved0 = 0;
-	input->reserved1 = 0;
 	memcpy((void *)input->gpa_page_list, pfn, count * sizeof(*pfn));
 	hv_status = hv_do_rep_hypercall(
 			HVCALL_MODIFY_SPARSE_GPA_PAGE_HOST_VISIBILITY, count,
diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 1f7c3082a36d..7d143826e5dc 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -72,7 +72,7 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 
 	local_irq_save(flags);
 
-	flush = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	max_gvas = hv_hvcall_in_array(&flush, sizeof(*flush), sizeof(flush->gva_list[0]));
 
 	if (unlikely(!flush)) {
 		local_irq_restore(flags);
@@ -86,13 +86,10 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 		 */
 		flush->address_space = virt_to_phys(info->mm->pgd);
 		flush->address_space &= CR3_ADDR_MASK;
-		flush->flags = 0;
 	} else {
-		flush->address_space = 0;
 		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
 	}
 
-	flush->processor_mask = 0;
 	if (cpumask_equal(cpus, cpu_present_mask)) {
 		flush->flags |= HV_FLUSH_ALL_PROCESSORS;
 	} else {
@@ -139,8 +136,6 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 	 * We can flush not more than max_gvas with one hypercall. Flush the
 	 * whole address space if we were asked to do more.
 	 */
-	max_gvas = (PAGE_SIZE - sizeof(*flush)) / sizeof(flush->gva_list[0]);
-
 	if (info->end == TLB_FLUSH_ALL) {
 		flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
 		status = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
@@ -179,7 +174,7 @@ static u64 hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
 	if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED))
 		return HV_STATUS_INVALID_PARAMETER;
 
-	flush = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	max_gvas = hv_hvcall_in_array(&flush, sizeof(*flush), sizeof(flush->gva_list[0]));
 
 	if (info->mm) {
 		/*
@@ -188,14 +183,10 @@ static u64 hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
 		 */
 		flush->address_space = virt_to_phys(info->mm->pgd);
 		flush->address_space &= CR3_ADDR_MASK;
-		flush->flags = 0;
 	} else {
-		flush->address_space = 0;
 		flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
 	}
 
-	flush->hv_vp_set.valid_bank_mask = 0;
-
 	flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
 	nr_bank = cpumask_to_vpset_skip(&flush->hv_vp_set, cpus,
 			info->freed_tables ? NULL : cpu_is_lazy);
@@ -206,10 +197,10 @@ static u64 hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
 	 * We can flush not more than max_gvas with one hypercall. Flush the
 	 * whole address space if we were asked to do more.
 	 */
-	max_gvas =
-		(PAGE_SIZE - sizeof(*flush) - nr_bank *
-		 sizeof(flush->hv_vp_set.bank_contents[0])) /
+	max_gvas -= (nr_bank * sizeof(flush->hv_vp_set.bank_contents[0])) /
 		sizeof(flush->gva_list[0]);
+	if (max_gvas <= 0)
+		return HV_STATUS_INVALID_PARAMETER;
 
 	if (info->end == TLB_FLUSH_ALL) {
 		flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
diff --git a/arch/x86/hyperv/nested.c b/arch/x86/hyperv/nested.c
index 1083dc8646f9..88c39ac8d0aa 100644
--- a/arch/x86/hyperv/nested.c
+++ b/arch/x86/hyperv/nested.c
@@ -29,15 +29,13 @@ int hyperv_flush_guest_mapping(u64 as)
 
 	local_irq_save(flags);
 
-	flush = *this_cpu_ptr(hyperv_pcpu_input_arg);
-
+	hv_hvcall_in(&flush, sizeof(*flush));
 	if (unlikely(!flush)) {
 		local_irq_restore(flags);
 		goto fault;
 	}
 
 	flush->address_space = as;
-	flush->flags = 0;
 
 	status = hv_do_hypercall(HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE,
 				 flush, NULL);
@@ -90,25 +88,23 @@ int hyperv_flush_guest_mapping_range(u64 as,
 	u64 status;
 	unsigned long flags;
 	int ret = -ENOTSUPP;
-	int gpa_n = 0;
+	int batch_size, gpa_n = 0;
 
 	if (!hv_hypercall_pg || !fill_flush_list_func)
 		goto fault;
 
 	local_irq_save(flags);
 
-	flush = *this_cpu_ptr(hyperv_pcpu_input_arg);
-
+	batch_size = hv_hvcall_in_array(&flush, sizeof(*flush),
+					sizeof(flush->gpa_list[0]));
 	if (unlikely(!flush)) {
 		local_irq_restore(flags);
 		goto fault;
 	}
 
 	flush->address_space = as;
-	flush->flags = 0;
-
 	gpa_n = fill_flush_list_func(flush, data);
-	if (gpa_n < 0) {
+	if (gpa_n < 0 || gpa_n > batch_size) {
 		local_irq_restore(flags);
 		goto fault;
 	}
diff --git a/include/hyperv/hvgdk_mini.h b/include/hyperv/hvgdk_mini.h
index 58895883f636..70e5d7ee40c8 100644
--- a/include/hyperv/hvgdk_mini.h
+++ b/include/hyperv/hvgdk_mini.h
@@ -533,7 +533,7 @@ union hv_gpa_page_range {
 struct hv_guest_mapping_flush_list {
 	u64 address_space;
 	u64 flags;
-	union hv_gpa_page_range gpa_list[HV_MAX_FLUSH_REP_COUNT];
+	union hv_gpa_page_range gpa_list[];
 };
 
 struct hv_tlb_flush {	 /* HV_INPUT_FLUSH_VIRTUAL_ADDRESS_LIST */
@@ -1218,7 +1218,7 @@ struct hv_gpa_range_for_visibility {
 	u32 host_visibility : 2;
 	u32 reserved0 : 30;
 	u32 reserved1;
-	u64 gpa_page_list[HV_MAX_MODIFY_GPA_REP_COUNT];
+	u64 gpa_page_list[];
 } __packed;
 
 #if defined(CONFIG_X86)
-- 
2.25.1



* [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
                   ` (2 preceding siblings ...)
  2025-03-13  6:19 ` [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2 mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-03-21 20:11   ` Nuno Das Neves
  2025-03-13  6:19 ` [PATCH v2 5/6] PCI: " mhkelley58
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

Update hypercall call sites to use the new hv_hvcall_*() functions
to set up hypercall arguments. Since these functions zero the
fixed portion of input memory, remove now redundant zero'ing of
input fields.

hv_post_message() requires additional updates. The payload area is
treated as an array to avoid wasting cycles on zero'ing it and
then overwriting with memcpy(). To allow treatment as an array,
the corresponding payload[] field is updated to have zero size.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/hv/hv.c           | 9 ++++++---
 drivers/hv/hv_balloon.c   | 4 ++--
 drivers/hv/hv_common.c    | 2 +-
 drivers/hv/hv_proc.c      | 8 ++++----
 drivers/hv/hyperv_vmbus.h | 2 +-
 5 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index a38f84548bc2..e2dcbc816fc5 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -66,7 +66,8 @@ int hv_post_message(union hv_connection_id connection_id,
 	if (hv_isolation_type_tdx() && ms_hyperv.paravisor_present)
 		aligned_msg = this_cpu_ptr(hv_context.cpu_context)->post_msg_page;
 	else
-		aligned_msg = *this_cpu_ptr(hyperv_pcpu_input_arg);
+		hv_hvcall_in_array(&aligned_msg, sizeof(*aligned_msg),
+				   sizeof(aligned_msg->payload[0]));
 
 	aligned_msg->connectionid = connection_id;
 	aligned_msg->reserved = 0;
@@ -80,8 +81,10 @@ int hv_post_message(union hv_connection_id connection_id,
 						  virt_to_phys(aligned_msg), 0);
 		else if (hv_isolation_type_snp())
 			status = hv_ghcb_hypercall(HVCALL_POST_MESSAGE,
-						   aligned_msg, NULL,
-						   sizeof(*aligned_msg));
+						   aligned_msg,
+						   NULL,
+						   struct_size(aligned_msg, payload,
+							       HV_MESSAGE_PAYLOAD_QWORD_COUNT));
 		else
 			status = HV_STATUS_INVALID_PARAMETER;
 	} else {
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index fec2f18679e3..2def8b8794ee 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -1582,14 +1582,14 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
 	WARN_ON_ONCE(nents > HV_MEMORY_HINT_MAX_GPA_PAGE_RANGES);
 	WARN_ON_ONCE(sgl->length < (HV_HYP_PAGE_SIZE << page_reporting_order));
 	local_irq_save(flags);
-	hint = *this_cpu_ptr(hyperv_pcpu_input_arg);
+
+	hv_hvcall_in_array(&hint, sizeof(*hint), sizeof(hint->ranges[0]));
 	if (!hint) {
 		local_irq_restore(flags);
 		return -ENOSPC;
 	}
 
 	hint->heat_type = HV_EXTMEM_HEAT_HINT_COLD_DISCARD;
-	hint->reserved = 0;
 	for_each_sg(sgl, sg, nents, i) {
 		union hv_gpa_page_range *range;
 
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 9804adb4cc56..a6b1cdfbc8d4 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -293,7 +293,7 @@ void __init hv_get_partition_id(void)
 	u64 status, pt_id;
 
 	local_irq_save(flags);
-	output = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	hv_hvcall_inout(NULL, 0, &output, sizeof(*output));
 	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, &output);
 	pt_id = output->partition_id;
 	local_irq_restore(flags);
diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
index 2fae18e4f7d2..5c580ee1c23f 100644
--- a/drivers/hv/hv_proc.c
+++ b/drivers/hv/hv_proc.c
@@ -73,7 +73,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
 
 	local_irq_save(flags);
 
-	input_page = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	hv_hvcall_in_array(&input_page, sizeof(*input_page),
+			   sizeof(input_page->gpa_page_list[0]));
 
 	input_page->partition_id = partition_id;
 
@@ -124,9 +125,8 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
 	do {
 		local_irq_save(flags);
 
-		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
 		/* We don't do anything with the output right now */
-		output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+		hv_hvcall_inout(&input, sizeof(*input), &output, sizeof(*output));
 
 		input->lp_index = lp_index;
 		input->apic_id = apic_id;
@@ -167,7 +167,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
 	do {
 		local_irq_save(irq_flags);
 
-		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+		hv_hvcall_in(&input, sizeof(*input));
 
 		input->partition_id = partition_id;
 		input->vp_index = vp_index;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 29780f3a7478..44b5e8330d9d 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -101,7 +101,7 @@ struct hv_input_post_message {
 	u32 reserved;
 	u32 message_type;
 	u32 payload_size;
-	u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
+	u64 payload[];
 };
 
 
-- 
2.25.1



* [PATCH v2 5/6] PCI: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
                   ` (3 preceding siblings ...)
  2025-03-13  6:19 ` [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-03-21 20:18   ` Nuno Das Neves
  2025-03-13  6:19 ` [PATCH v2 6/6] Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg mhkelley58
  2025-04-01 19:29 ` [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args Easwar Hariharan
  6 siblings, 1 reply; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

Update hypercall call sites to use the new hv_hvcall_*() functions
to set up hypercall arguments. Since these functions zero the
fixed portion of input memory, remove now redundant calls to memset().

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
---

Notes:
    Changes in v2:
    * In hv_arch_irq_unmask(), added check of the number of computed banks
      in the hv_vpset against the batch_size. Since an hv_vpset currently
      represents a maximum of 4096 CPUs, the hv_vpset size does not exceed
      512 bytes and there should always be sufficient space. But do the
      check just in case something changes. [Nuno Das Neves]
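
The 512-byte figure in the note above can be reproduced with a small
userspace calculation. This is only a sketch: the helper names are
hypothetical, and it assumes the sparse-4K set format, in which each
64-bit bank covers 64 CPUs. It counts the bank array only; any fixed
header fields of the hv_vpset come in addition.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed figures: taken from the note above and the sparse-4K format. */
#define MAX_CPUS       4096u  /* CPUs that one hv_vpset can describe */
#define CPUS_PER_BANK  64u    /* one bit per CPU in a 64-bit bank */

/* Number of 64-bit banks needed to cover MAX_CPUS (hypothetical helper). */
static unsigned int vpset_max_banks(void)
{
	return MAX_CPUS / CPUS_PER_BANK;
}

/* Bytes taken by the bank array when every bank is populated. */
static size_t vpset_max_bank_bytes(void)
{
	return (size_t)vpset_max_banks() * sizeof(uint64_t);
}
```

With these assumptions, 4096 CPUs need 64 banks of 8 bytes each, which
is where the 512 bytes come from, and that is well below a 4 KiB
argument page.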

 drivers/pci/controller/pci-hyperv.c | 18 ++++++++----------
 include/hyperv/hvgdk_mini.h         |  2 +-
 2 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index ac27bda5ba26..82ac0e09943b 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -622,7 +622,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
 	struct pci_dev *pdev;
 	unsigned long flags;
 	u32 var_size = 0;
-	int cpu, nr_bank;
+	int cpu, nr_bank, batch_size;
 	u64 res;
 
 	dest = irq_data_get_effective_affinity_mask(data);
@@ -638,8 +638,8 @@ static void hv_arch_irq_unmask(struct irq_data *data)
 
 	local_irq_save(flags);
 
-	params = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	memset(params, 0, sizeof(*params));
+	batch_size = hv_hvcall_in_array(&params, sizeof(*params),
+					sizeof(params->int_target.vp_set.bank_contents[0]));
 	params->partition_id = HV_PARTITION_ID_SELF;
 	params->int_entry.source = HV_INTERRUPT_SOURCE_MSI;
 	params->int_entry.msi_entry.address.as_uint32 = int_desc->address & 0xffffffff;
@@ -671,7 +671,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
 		nr_bank = cpumask_to_vpset(&params->int_target.vp_set, tmp);
 		free_cpumask_var(tmp);
 
-		if (nr_bank <= 0) {
+		if (nr_bank <= 0 || nr_bank > batch_size) {
 			res = 1;
 			goto out;
 		}
@@ -1034,11 +1034,9 @@ static void hv_pci_read_mmio(struct device *dev, phys_addr_t gpa, int size, u32
 
 	/*
 	 * Must be called with interrupts disabled so it is safe
-	 * to use the per-cpu input argument page.  Use it for
-	 * both input and output.
+	 * to use the per-cpu argument page.
 	 */
-	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
-	out = *this_cpu_ptr(hyperv_pcpu_input_arg) + sizeof(*in);
+	hv_hvcall_inout(&in, sizeof(*in), &out, sizeof(*out));
 	in->gpa = gpa;
 	in->size = size;
 
@@ -1067,9 +1065,9 @@ static void hv_pci_write_mmio(struct device *dev, phys_addr_t gpa, int size, u32
 
 	/*
 	 * Must be called with interrupts disabled so it is safe
-	 * to use the per-cpu input argument memory.
+	 * to use the per-cpu argument page.
 	 */
-	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	hv_hvcall_in_array(&in, sizeof(*in), sizeof(in->data[0]));
 	in->gpa = gpa;
 	in->size = size;
 	switch (size) {
diff --git a/include/hyperv/hvgdk_mini.h b/include/hyperv/hvgdk_mini.h
index 70e5d7ee40c8..cb25ac1e3ac5 100644
--- a/include/hyperv/hvgdk_mini.h
+++ b/include/hyperv/hvgdk_mini.h
@@ -1342,7 +1342,7 @@ struct hv_mmio_write_input {
 	u64 gpa;
 	u32 size;
 	u32 reserved;
-	u8 data[HV_HYPERCALL_MMIO_MAX_DATA_LENGTH];
+	u8 data[];
 } __packed;
 
 #endif /* _HV_HVGDK_MINI_H */
-- 
2.25.1



* [PATCH v2 6/6] Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
                   ` (4 preceding siblings ...)
  2025-03-13  6:19 ` [PATCH v2 5/6] PCI: " mhkelley58
@ 2025-03-13  6:19 ` mhkelley58
  2025-04-01 19:29 ` [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args Easwar Hariharan
  6 siblings, 0 replies; 14+ messages in thread
From: mhkelley58 @ 2025-03-13  6:19 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

From: Michael Kelley <mhklinux@outlook.com>

All open-coded uses of hyperv_pcpu_input_arg and hyperv_pcpu_output_arg
have been replaced by hv_hvcall_*() functions, so combine
hyperv_pcpu_input_arg and hyperv_pcpu_output_arg into a single
hyperv_pcpu_arg. Remove the logic for managing a separate output arg, and
fix up comment references to the old variable names.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/hyperv/hv_init.c      |  6 ++--
 drivers/hv/hv.c                |  2 +-
 drivers/hv/hv_common.c         | 55 ++++++++++------------------------
 include/asm-generic/mshyperv.h |  6 +---
 4 files changed, 21 insertions(+), 48 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index cc843905c23a..e930fe75f2ca 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -483,16 +483,16 @@ void __init hyperv_init(void)
 	 * A TDX VM with no paravisor only uses TDX GHCI rather than hv_hypercall_pg:
 	 * when the hypercall input is a page, such a VM must pass a decrypted
 	 * page to Hyper-V, e.g. hv_post_message() uses the per-CPU page
-	 * hyperv_pcpu_input_arg, which is decrypted if no paravisor is present.
+	 * hyperv_pcpu_arg, which is decrypted if no paravisor is present.
 	 *
 	 * A TDX VM with the paravisor uses hv_hypercall_pg for most hypercalls,
 	 * which are handled by the paravisor and the VM must use an encrypted
-	 * input page: in such a VM, the hyperv_pcpu_input_arg is encrypted and
+	 * input page: in such a VM, the hyperv_pcpu_arg is encrypted and
 	 * used in the hypercalls, e.g. see hv_mark_gpa_visibility() and
 	 * hv_arch_irq_unmask(). Such a VM uses TDX GHCI for two hypercalls:
 	 * 1. HVCALL_SIGNAL_EVENT: see vmbus_set_event() and _hv_do_fast_hypercall8().
 	 * 2. HVCALL_POST_MESSAGE: the input page must be a decrypted page, i.e.
-	 * hv_post_message() in such a VM can't use the encrypted hyperv_pcpu_input_arg;
+	 * hv_post_message() in such a VM can't use the encrypted hyperv_pcpu_arg;
 	 * instead, hv_post_message() uses the post_msg_page, which is decrypted
 	 * in such a VM and is only used in such a VM.
 	 */
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index e2dcbc816fc5..854ef5d82807 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -60,7 +60,7 @@ int hv_post_message(union hv_connection_id connection_id,
 	/*
 	 * A TDX VM with the paravisor must use the decrypted post_msg_page: see
 	 * the comment in struct hv_per_cpu_context. A SNP VM with the paravisor
-	 * can use the encrypted hyperv_pcpu_input_arg because it copies the
+	 * can use the encrypted hyperv_pcpu_arg because it copies the
 	 * input into the GHCB page, which has been decrypted by the paravisor.
 	 */
 	if (hv_isolation_type_tdx() && ms_hyperv.paravisor_present)
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index a6b1cdfbc8d4..f3e219daf9fe 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -58,11 +58,8 @@ EXPORT_SYMBOL_GPL(hv_vp_index);
 u32 hv_max_vp_index;
 EXPORT_SYMBOL_GPL(hv_max_vp_index);
 
-void * __percpu *hyperv_pcpu_input_arg;
-EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
-
-void * __percpu *hyperv_pcpu_output_arg;
-EXPORT_SYMBOL_GPL(hyperv_pcpu_output_arg);
+void * __percpu *hyperv_pcpu_arg;
+EXPORT_SYMBOL_GPL(hyperv_pcpu_arg);
 
 static void hv_kmsg_dump_unregister(void);
 
@@ -85,11 +82,8 @@ void __init hv_common_free(void)
 	kfree(hv_vp_index);
 	hv_vp_index = NULL;
 
-	free_percpu(hyperv_pcpu_output_arg);
-	hyperv_pcpu_output_arg = NULL;
-
-	free_percpu(hyperv_pcpu_input_arg);
-	hyperv_pcpu_input_arg = NULL;
+	free_percpu(hyperv_pcpu_arg);
+	hyperv_pcpu_arg = NULL;
 }
 
 /*
@@ -281,11 +275,6 @@ static void hv_kmsg_dump_register(void)
 	}
 }
 
-static inline bool hv_output_page_exists(void)
-{
-	return hv_root_partition() || IS_ENABLED(CONFIG_HYPERV_VTL_MODE);
-}
-
 void __init hv_get_partition_id(void)
 {
 	struct hv_output_get_partition_id *output;
@@ -363,14 +352,8 @@ int __init hv_common_init(void)
 	 * (per-CPU) hypercall input page and thus this failure is
 	 * fatal on Hyper-V.
 	 */
-	hyperv_pcpu_input_arg = alloc_percpu(void  *);
-	BUG_ON(!hyperv_pcpu_input_arg);
-
-	/* Allocate the per-CPU state for output arg for root */
-	if (hv_output_page_exists()) {
-		hyperv_pcpu_output_arg = alloc_percpu(void *);
-		BUG_ON(!hyperv_pcpu_output_arg);
-	}
+	hyperv_pcpu_arg = alloc_percpu(void  *);
+	BUG_ON(!hyperv_pcpu_arg);
 
 	hv_vp_index = kmalloc_array(nr_cpu_ids, sizeof(*hv_vp_index),
 				    GFP_KERNEL);
@@ -459,32 +442,27 @@ void __init ms_hyperv_late_init(void)
 
 int hv_common_cpu_init(unsigned int cpu)
 {
-	void **inputarg, **outputarg;
+	void **inputarg;
 	u64 msr_vp_index;
 	gfp_t flags;
-	const int pgcount = hv_output_page_exists() ? 2 : 1;
+	const int pgcount = HV_HVCALL_ARG_PAGES;
 	void *mem;
 	int ret;
 
 	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
 	flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;
 
-	inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
+	inputarg = (void **)this_cpu_ptr(hyperv_pcpu_arg);
 
 	/*
-	 * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already
-	 * allocated if this CPU was previously online and then taken offline
+	 * hyperv_pcpu_arg memory is already allocated if this CPU was
+	 * previously online and then taken offline
 	 */
 	if (!*inputarg) {
 		mem = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags);
 		if (!mem)
 			return -ENOMEM;
 
-		if (hv_output_page_exists()) {
-			outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
-			*outputarg = (char *)mem + HV_HYP_PAGE_SIZE;
-		}
-
 		if (!ms_hyperv.paravisor_present &&
 		    (hv_isolation_type_snp() || hv_isolation_type_tdx())) {
 			ret = set_memory_decrypted((unsigned long)mem, pgcount);
@@ -498,13 +476,13 @@ int hv_common_cpu_init(unsigned int cpu)
 
 		/*
 		 * In a fully enlightened TDX/SNP VM with more than 64 VPs, if
-		 * hyperv_pcpu_input_arg is not NULL, set_memory_decrypted() ->
+		 * hyperv_pcpu_arg is not NULL, set_memory_decrypted() ->
 		 * ... -> cpa_flush()-> ... -> __send_ipi_mask_ex() tries to
-		 * use hyperv_pcpu_input_arg as the hypercall input page, which
+		 * use hyperv_pcpu_arg as the hypercall input page, which
 		 * must be a decrypted page in such a VM, but the page is still
 		 * encrypted before set_memory_decrypted() returns. Fix this by
 		 * setting *inputarg after the above set_memory_decrypted(): if
-		 * hyperv_pcpu_input_arg is NULL, __send_ipi_mask_ex() returns
+		 * hyperv_pcpu_arg is NULL, __send_ipi_mask_ex() returns
 		 * HV_STATUS_INVALID_PARAMETER immediately, and the function
 		 * hv_send_ipi_mask() falls back to orig_apic.send_IPI_mask(),
 		 * which may be slightly slower than the hypercall, but still
@@ -526,9 +504,8 @@ int hv_common_cpu_init(unsigned int cpu)
 int hv_common_cpu_die(unsigned int cpu)
 {
 	/*
-	 * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory
-	 * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg
-	 * may be used by the Hyper-V vPCI driver in reassigning interrupts
+	 * The hyperv_pcpu_arg memory is not freed when the CPU goes offline as
+	 * it may be used by the Hyper-V vPCI driver in reassigning interrupts
 	 * as part of the offlining process.  The interrupt reassignment
 	 * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and
 	 * called this function.
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 01e8763edc2c..015f87e35b5a 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -66,8 +66,7 @@ extern bool hv_nested;
 extern u64 hv_current_partition_id;
 extern enum hv_partition_type hv_curr_partition_type;
 
-extern void * __percpu *hyperv_pcpu_input_arg;
-extern void * __percpu *hyperv_pcpu_output_arg;
+extern void * __percpu *hyperv_pcpu_arg;
 
 extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
 extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
@@ -139,9 +138,6 @@ static inline u64 hv_do_rep_hypercall(u16 code, u16 rep_count, u16 varhead_size,
  * Hypercall input and output argument setup
  */
 
-/* Temporary mapping to be removed at the end of the patch series */
-#define hyperv_pcpu_arg hyperv_pcpu_input_arg
-
 /*
  * Allocate one page that is shared between input and output args, which is
  * sufficient for all current hypercalls. If a future hypercall requires
-- 
2.25.1



* Re: [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1
  2025-03-13  6:19 ` [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1 mhkelley58
@ 2025-03-21 19:21   ` Nuno Das Neves
  0 siblings, 0 replies; 14+ messages in thread
From: Nuno Das Neves @ 2025-03-21 19:21 UTC (permalink / raw)
  To: mhklinux, kys, haiyangz, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, hpa, lpieralisi, kw, manivannan.sadhasivam, robh,
	bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Update hypercall call sites to use the new hv_hvcall_*() functions
> to set up hypercall arguments. Since these functions zero the
> fixed portion of input memory, remove now redundant calls to memset()
> and explicit zero'ing of input fields.
> 
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> 
> Notes:
>     Changes in v2:
>     * Fixed get_vtl() and hv_vtl_apicid_to_vp_id() to properly treat the input
>       and output arguments as arrays [Nuno Das Neves]
>     * Enhanced __send_ipi_mask_ex() and hv_map_interrupt() to check the number
>       of computed banks in the hv_vpset against the batch_size. Since an
>       hv_vpset currently represents a maximum of 4096 CPUs, the hv_vpset size
>       does not exceed 512 bytes and there should always be sufficient space. But
>       do the check just in case something changes. [Nuno Das Neves]
> 
>  arch/x86/hyperv/hv_apic.c   | 10 ++++------
>  arch/x86/hyperv/hv_init.c   |  6 ++----
>  arch/x86/hyperv/hv_vtl.c    |  9 +++------
>  arch/x86/hyperv/irqdomain.c | 17 ++++++++++-------
>  4 files changed, 19 insertions(+), 23 deletions(-)
> 

Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>


* Re: [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2
  2025-03-13  6:19 ` [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2 mhkelley58
@ 2025-03-21 19:35   ` Nuno Das Neves
  0 siblings, 0 replies; 14+ messages in thread
From: Nuno Das Neves @ 2025-03-21 19:35 UTC (permalink / raw)
  To: mhklinux, kys, haiyangz, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, hpa, lpieralisi, kw, manivannan.sadhasivam, robh,
	bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Update hypercall call sites to use the new hv_hvcall_*() functions
> to set up hypercall arguments. Since these functions zero the
> fixed portion of input memory, remove now redundant calls to memset()
> and explicit zero'ing of input fields.
> 
> For hv_mark_gpa_visibility(), use the computed batch_size instead
> of HV_MAX_MODIFY_GPA_REP_COUNT. Also update the associated gpa_page_list[]
> field to have zero size, which is more consistent with other array
> arguments to hypercalls. Due to the interaction with the calling
> hv_vtom_set_host_visibility(), HV_MAX_MODIFY_GPA_REP_COUNT cannot be
> completely eliminated without some further restructuring, but that's
> for another patch set.
> 
> Similarly, for the nested flush functions, update the gpa_list[] to
> have zero size. Again, separate restructuring would be required to
> completely eliminate the need for HV_MAX_FLUSH_REP_COUNT.
> 
> Finally, hyperv_flush_tlb_others_ex() requires special handling
> because the input consists of two arrays -- one for the hv_vp_set and
> another for the gva list. The batch_size computed by hv_hvcall_in_array()
> is adjusted to account for the number of entries in the hv_vp_set.
> 
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> 
> Notes:
>     Changes in v2:
>     * In hyperv_flush_tlb_others_ex(), added check of the adjusted
>       max_gvas to make sure it doesn't go to zero or negative, which would
>       happen if there is insufficient space to hold the hv_vpset and have
>       at least one entry in the gva list. Since an hv_vpset currently
>       represents a maximum of 4096 CPUs, the hv_vpset size does not exceed
>       512 bytes and there should always be sufficient space. But do the
>       check just in case something changes. [Nuno Das Neves]
> 
>  arch/x86/hyperv/ivm.c       | 18 +++++++++---------
>  arch/x86/hyperv/mmu.c       | 19 +++++--------------
>  arch/x86/hyperv/nested.c    | 14 +++++---------
>  include/hyperv/hvgdk_mini.h |  4 ++--
>  4 files changed, 21 insertions(+), 34 deletions(-)
> 

Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>


* Re: [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-13  6:19 ` [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments mhkelley58
@ 2025-03-21 20:11   ` Nuno Das Neves
  2025-03-30 21:53     ` Michael Kelley
  0 siblings, 1 reply; 14+ messages in thread
From: Nuno Das Neves @ 2025-03-21 20:11 UTC (permalink / raw)
  To: mhklinux, kys, haiyangz, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, hpa, lpieralisi, kw, manivannan.sadhasivam, robh,
	bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Update hypercall call sites to use the new hv_hvcall_*() functions
> to set up hypercall arguments. Since these functions zero the
> fixed portion of input memory, remove now redundant zero'ing of
> input fields.
> 
> hv_post_message() requires additional updates. The payload area is
> treated as an array to avoid wasting cycles on zero'ing it and
> then overwriting with memcpy(). To allow treatment as an array,
> the corresponding payload[] field is updated to have zero size.
> 
I'd prefer to leave the payload field as a fixed-sized array.
Changing it to a flexible array makes it look like that input is
for a variable-sized or rep hypercall, and it makes the surrounding
code in hv_post_message() more complex and inscrutable as a result.

I suggest leaving hv_post_message() alone, except for changing
hyperv_pcpu_input_arg -> hyperv_pcpu_arg, and perhaps a comment
explaining why hv_hvcall_in() isn't used there.
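
For illustration only, the size bookkeeping at issue can be modeled in
userspace. The struct names, the helper names, and the payload count of
30 qwords are assumptions made for this sketch, and flex_size() is a
simplified stand-in for the kernel's struct_size(), which additionally
guards against arithmetic overflow.

```c
#include <stddef.h>
#include <stdint.h>

#define PAYLOAD_QWORDS 30  /* assumed value of HV_MESSAGE_PAYLOAD_QWORD_COUNT */

/* Fixed-size layout: sizeof() covers header and payload together. */
struct post_msg_fixed {
	uint32_t connectionid;
	uint32_t reserved;
	uint32_t message_type;
	uint32_t payload_size;
	uint64_t payload[PAYLOAD_QWORDS];
};

/* Flexible-array layout: sizeof() covers the header only. */
struct post_msg_flex {
	uint32_t connectionid;
	uint32_t reserved;
	uint32_t message_type;
	uint32_t payload_size;
	uint64_t payload[];
};

/* Size the caller passes when the payload is a fixed array. */
static size_t fixed_size(void)
{
	return sizeof(struct post_msg_fixed);
}

/* Header-only size reported by sizeof() for the flexible layout. */
static size_t flex_header_size(void)
{
	return sizeof(struct post_msg_flex);
}

/* Stand-in for struct_size(): header plus nr_payload entries, no overflow check. */
static size_t flex_size(size_t nr_payload)
{
	return offsetof(struct post_msg_flex, payload) +
	       nr_payload * sizeof(uint64_t);
}
```

Both layouts describe the same number of bytes for a full payload. The
difference is that the flexible form makes every caller that needs the
total size spell out the element count explicitly.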

> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  drivers/hv/hv.c           | 9 ++++++---
>  drivers/hv/hv_balloon.c   | 4 ++--
>  drivers/hv/hv_common.c    | 2 +-
>  drivers/hv/hv_proc.c      | 8 ++++----
>  drivers/hv/hyperv_vmbus.h | 2 +-
>  5 files changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index a38f84548bc2..e2dcbc816fc5 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -66,7 +66,8 @@ int hv_post_message(union hv_connection_id connection_id,
>  	if (hv_isolation_type_tdx() && ms_hyperv.paravisor_present)
>  		aligned_msg = this_cpu_ptr(hv_context.cpu_context)->post_msg_page;
>  	else
> -		aligned_msg = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +		hv_hvcall_in_array(&aligned_msg, sizeof(*aligned_msg),
> +				   sizeof(aligned_msg->payload[0]));
>  
>  	aligned_msg->connectionid = connection_id;
>  	aligned_msg->reserved = 0;
> @@ -80,8 +81,10 @@ int hv_post_message(union hv_connection_id connection_id,
>  						  virt_to_phys(aligned_msg), 0);
>  		else if (hv_isolation_type_snp())
>  			status = hv_ghcb_hypercall(HVCALL_POST_MESSAGE,
> -						   aligned_msg, NULL,
> -						   sizeof(*aligned_msg));
> +						   aligned_msg,
> +						   NULL,
> +						   struct_size(aligned_msg, payload,
> +							       HV_MESSAGE_PAYLOAD_QWORD_COUNT));

See my comment above, I'd prefer to leave this function mostly
alone to maintain readability.

>  		else
>  			status = HV_STATUS_INVALID_PARAMETER;
>  	} else {
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index fec2f18679e3..2def8b8794ee 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -1582,14 +1582,14 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
>  	WARN_ON_ONCE(nents > HV_MEMORY_HINT_MAX_GPA_PAGE_RANGES);
>  	WARN_ON_ONCE(sgl->length < (HV_HYP_PAGE_SIZE << page_reporting_order));
>  	local_irq_save(flags);
> -	hint = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +	hv_hvcall_in_array(&hint, sizeof(*hint), sizeof(hint->ranges[0]));

We should ensure the returned batch size is large enough for
"nents".
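
A rough userspace model of such a check is sketched below. It assumes
the batch size is the number of entries that fit in a 4 KiB page after
the fixed-size header, as in the open-coded max_gvas computation that
this series replaces. The function names and the -1 error value are
hypothetical.

```c
#include <stddef.h>

#define HYP_PAGE_SIZE 4096u  /* assumed 4 KiB hypercall argument page */

/*
 * Hypothetical model of the batch size: how many array entries fit in
 * the page after the fixed-size header.
 */
static int batch_size_for(size_t fixed_size, size_t entry_size)
{
	if (entry_size == 0 || fixed_size > HYP_PAGE_SIZE)
		return 0;
	return (int)((HYP_PAGE_SIZE - fixed_size) / entry_size);
}

/* Returns 0 if nents entries fit in one batch, -1 otherwise. */
static int check_nents(int nents, size_t fixed_size, size_t entry_size)
{
	int batch_size = batch_size_for(fixed_size, entry_size);

	if (nents < 0 || nents > batch_size)
		return -1;
	return 0;
}
```

In the real function the failure path would presumably also need to
restore interrupts before returning, as the existing !hint path does.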

>  	if (!hint) {
>  		local_irq_restore(flags);
>  		return -ENOSPC;
>  	}
>  
>  	hint->heat_type = HV_EXTMEM_HEAT_HINT_COLD_DISCARD;
> -	hint->reserved = 0;
>  	for_each_sg(sgl, sg, nents, i) {
>  		union hv_gpa_page_range *range;
>  
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 9804adb4cc56..a6b1cdfbc8d4 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -293,7 +293,7 @@ void __init hv_get_partition_id(void)
>  	u64 status, pt_id;
>  
>  	local_irq_save(flags);
> -	output = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	hv_hvcall_inout(NULL, 0, &output, sizeof(*output));
>  	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, &output);
>  	pt_id = output->partition_id;
>  	local_irq_restore(flags);
> diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> index 2fae18e4f7d2..5c580ee1c23f 100644
> --- a/drivers/hv/hv_proc.c
> +++ b/drivers/hv/hv_proc.c
> @@ -73,7 +73,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
>  
>  	local_irq_save(flags);
>  
> -	input_page = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	hv_hvcall_in_array(&input_page, sizeof(*input_page),
> +			   sizeof(input_page->gpa_page_list[0]));

We should ensure the returned batch size is large enough.

>  
>  	input_page->partition_id = partition_id;
>  
> @@ -124,9 +125,8 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
>  	do {
>  		local_irq_save(flags);
>  
> -		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
>  		/* We don't do anything with the output right now */
> -		output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> +		hv_hvcall_inout(&input, sizeof(*input), &output, sizeof(*output));
>  
>  		input->lp_index = lp_index;
>  		input->apic_id = apic_id;
> @@ -167,7 +167,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
>  	do {
>  		local_irq_save(irq_flags);
>  
> -		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +		hv_hvcall_in(&input, sizeof(*input));
>  
>  		input->partition_id = partition_id;
>  		input->vp_index = vp_index;
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index 29780f3a7478..44b5e8330d9d 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -101,7 +101,7 @@ struct hv_input_post_message {
>  	u32 reserved;
>  	u32 message_type;
>  	u32 payload_size;
> -	u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
> +	u64 payload[];

See my comment above, I'd prefer to keep this how it is.

>  };
>  
>  

Thanks
Nuno



* Re: [PATCH v2 5/6] PCI: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-13  6:19 ` [PATCH v2 5/6] PCI: " mhkelley58
@ 2025-03-21 20:18   ` Nuno Das Neves
  2025-03-30 21:53     ` Michael Kelley
  0 siblings, 1 reply; 14+ messages in thread
From: Nuno Das Neves @ 2025-03-21 20:18 UTC (permalink / raw)
  To: mhklinux, kys, haiyangz, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, hpa, lpieralisi, kw, manivannan.sadhasivam, robh,
	bhelgaas, arnd
  Cc: x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> Update hypercall call sites to use the new hv_hvcall_*() functions
> to set up hypercall arguments. Since these functions zero the
> fixed portion of input memory, remove now redundant calls to memset().
> 
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
> 
> Notes:
>     Changes in v2:
>     * In hv_arch_irq_unmask(), added check of the number of computed banks
>       in the hv_vpset against the batch_size. Since an hv_vpset currently
>       represents a maximum of 4096 CPUs, the hv_vpset size does not exceed
>       512 bytes and there should always be sufficient space. But do the
>       check just in case something changes. [Nuno Das Neves]
> 
>  drivers/pci/controller/pci-hyperv.c | 18 ++++++++----------
>  include/hyperv/hvgdk_mini.h         |  2 +-
>  2 files changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index ac27bda5ba26..82ac0e09943b 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -622,7 +622,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  	struct pci_dev *pdev;
>  	unsigned long flags;
>  	u32 var_size = 0;
> -	int cpu, nr_bank;
> +	int cpu, nr_bank, batch_size;
>  	u64 res;
>  
>  	dest = irq_data_get_effective_affinity_mask(data);
> @@ -638,8 +638,8 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  
>  	local_irq_save(flags);
>  
> -	params = *this_cpu_ptr(hyperv_pcpu_input_arg);
> -	memset(params, 0, sizeof(*params));
> +	batch_size = hv_hvcall_in_array(&params, sizeof(*params),
> +					sizeof(params->int_target.vp_set.bank_contents[0]));
>  	params->partition_id = HV_PARTITION_ID_SELF;
>  	params->int_entry.source = HV_INTERRUPT_SOURCE_MSI;
>  	params->int_entry.msi_entry.address.as_uint32 = int_desc->address & 0xffffffff;
> @@ -671,7 +671,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
>  		nr_bank = cpumask_to_vpset(&params->int_target.vp_set, tmp);
>  		free_cpumask_var(tmp);
>  
> -		if (nr_bank <= 0) {
> +		if (nr_bank <= 0 || nr_bank > batch_size) {
>  			res = 1;
>  			goto out;
>  		}
> @@ -1034,11 +1034,9 @@ static void hv_pci_read_mmio(struct device *dev, phys_addr_t gpa, int size, u32
>  
>  	/*
>  	 * Must be called with interrupts disabled so it is safe
> -	 * to use the per-cpu input argument page.  Use it for
> -	 * both input and output.
> +	 * to use the per-cpu argument page.
>  	 */
> -	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
> -	out = *this_cpu_ptr(hyperv_pcpu_input_arg) + sizeof(*in);
> +	hv_hvcall_inout(&in, sizeof(*in), &out, sizeof(*out));
>  	in->gpa = gpa;
>  	in->size = size;
>  
> @@ -1067,9 +1065,9 @@ static void hv_pci_write_mmio(struct device *dev, phys_addr_t gpa, int size, u32
>  
>  	/*
>  	 * Must be called with interrupts disabled so it is safe
> -	 * to use the per-cpu input argument memory.
> +	 * to use the per-cpu argument page.
>  	 */
> -	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	hv_hvcall_in_array(&in, sizeof(*in), sizeof(in->data[0]));
>  	in->gpa = gpa;
>  	in->size = size;
>  	switch (size) {
> diff --git a/include/hyperv/hvgdk_mini.h b/include/hyperv/hvgdk_mini.h
> index 70e5d7ee40c8..cb25ac1e3ac5 100644
> --- a/include/hyperv/hvgdk_mini.h
> +++ b/include/hyperv/hvgdk_mini.h
> @@ -1342,7 +1342,7 @@ struct hv_mmio_write_input {
>  	u64 gpa;
>  	u32 size;
>  	u32 reserved;
> -	u8 data[HV_HYPERCALL_MMIO_MAX_DATA_LENGTH];
> +	u8 data[];

As with the prior patch, I don't think this is worth changing. The
code in hv_pci_write_mmio() is more complicated as a result, and
changing the array means the struct no longer matches the Hyper-V
struct 1:1.

Thanks
Nuno

>  } __packed;
>  
>  #endif /* _HV_HVGDK_MINI_H */



* RE: [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-21 20:11   ` Nuno Das Neves
@ 2025-03-30 21:53     ` Michael Kelley
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Kelley @ 2025-03-30 21:53 UTC (permalink / raw)
  To: Nuno Das Neves, kys@microsoft.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, lpieralisi@kernel.org, kw@linux.com,
	manivannan.sadhasivam@linaro.org, robh@kernel.org,
	bhelgaas@google.com, arnd@arndb.de
  Cc: x86@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-arch@vger.kernel.org

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, March 21, 2025 1:11 PM
> 
> On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > Update hypercall call sites to use the new hv_hvcall_*() functions
> > to set up hypercall arguments. Since these functions zero the
> > fixed portion of input memory, remove now redundant zero'ing of
> > input fields.
> >
> > hv_post_message() requires additional updates. The payload area is
> > treated as an array to avoid wasting cycles on zero'ing it and
> > then overwriting with memcpy(). To allow treatment as an array,
> > the corresponding payload[] field is updated to have zero size.
> >
> I'd prefer to leave the payload field as a fixed-sized array.
> Changing it to a flexible array makes it look like that input is
> for a variable-sized or rep hypercall, and it makes the surrounding
> code in hv_post_message() more complex and inscrutable as a result.
> 
> I suggest leaving hv_post_message() alone, except for changing
> hyperv_pcpu_input_arg -> hyperv_pcpu_arg, and perhaps a comment
> explaining why hv_hvcall_input() isn't used there.
> 
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > ---
> >  drivers/hv/hv.c           | 9 ++++++---
> >  drivers/hv/hv_balloon.c   | 4 ++--
> >  drivers/hv/hv_common.c    | 2 +-
> >  drivers/hv/hv_proc.c      | 8 ++++----
> >  drivers/hv/hyperv_vmbus.h | 2 +-
> >  5 files changed, 14 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> > index a38f84548bc2..e2dcbc816fc5 100644
> > --- a/drivers/hv/hv.c
> > +++ b/drivers/hv/hv.c
> > @@ -66,7 +66,8 @@ int hv_post_message(union hv_connection_id connection_id,
> >  	if (hv_isolation_type_tdx() && ms_hyperv.paravisor_present)
> >  		aligned_msg = this_cpu_ptr(hv_context.cpu_context)->post_msg_page;
> >  	else
> > -		aligned_msg = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +		hv_hvcall_in_array(&aligned_msg, sizeof(*aligned_msg),
> > +				   sizeof(aligned_msg->payload[0]));
> >
> >  	aligned_msg->connectionid = connection_id;
> >  	aligned_msg->reserved = 0;
> > @@ -80,8 +81,10 @@ int hv_post_message(union hv_connection_id connection_id,
> >  						  virt_to_phys(aligned_msg), 0);
> >  		else if (hv_isolation_type_snp())
> >  			status = hv_ghcb_hypercall(HVCALL_POST_MESSAGE,
> > -						   aligned_msg, NULL,
> > -						   sizeof(*aligned_msg));
> > +						   aligned_msg,
> > +						   NULL,
> > +						   struct_size(aligned_msg, payload,
> > +						   HV_MESSAGE_PAYLOAD_QWORD_COUNT));
> 
> See my comment above, I'd prefer to leave this function mostly
> alone to maintain readability.

Let me try again to introduce hv_hvcall_*() without changing struct
hv_input_post_message. If that struct isn't changed, then the
hv_ghcb_hypercall() arguments don't have to change.

I'd like to reach a point where hyperv_pcpu_input_arg isn't referenced in any
open coding -- it should be referenced only internally in the hv_hvcall_*()
functions and where it is allocated and deallocated. The arguments to
hv_hvcall_in_array() will be slightly more complicated to prevent zero'ing
the entire payload area, but I don't think readability alone is justification
for open-coding the use of hyperv_pcpu_input_arg.
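
A minimal userspace sketch of the idea, assuming hv_hvcall_in_array()
zeroes only the fixed-size portion it is given and returns how many
array entries fit in the rest of the page, as the cover letter
describes. model_arg_page(), model_hvcall_in_array() and
model_post_message_setup() are stand-ins made up for illustration and
are not the kernel code; the struct mirrors hv_input_post_message with
the connection ID reduced to a plain u32. Passing offsetof(..., payload)
as the fixed size keeps the struct unchanged while still skipping the
zero'ing of the payload:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define HV_MESSAGE_PAYLOAD_QWORD_COUNT 30

/* Layout left as it is today: payload remains a fixed-size array. */
struct hv_input_post_message {
	uint32_t connectionid;	/* union hv_connection_id in the kernel */
	uint32_t reserved;
	uint32_t message_type;
	uint32_t payload_size;
	uint64_t payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
};

/* Hypothetical stand-in for the per-cpu hypercall argument page. */
static void *model_arg_page(void)
{
	static void *page;

	if (!page)
		page = malloc(MODEL_PAGE_SIZE);
	return page;
}

/*
 * Hypothetical model of hv_hvcall_in_array(): zero only the first
 * fixed_size bytes, then report how many elem_size entries fit in
 * the remainder of the page.
 */
static int model_hvcall_in_array(void **in, size_t fixed_size, size_t elem_size)
{
	assert(elem_size > 0 && fixed_size <= MODEL_PAGE_SIZE);
	*in = model_arg_page();
	memset(*in, 0, fixed_size);
	return (int)((MODEL_PAGE_SIZE - fixed_size) / elem_size);
}

/* Returns the batch size, or -1 if the payload does not fit. */
static int model_post_message_setup(struct hv_input_post_message **msg,
				    uint32_t message_type,
				    const void *payload, size_t payload_size)
{
	struct hv_input_post_message *m;
	void *p;
	int batch;

	if (payload_size > sizeof(m->payload))
		return -1;

	/* Only the 16-byte header is "fixed"; payload is treated as the array. */
	batch = model_hvcall_in_array(&p,
			offsetof(struct hv_input_post_message, payload),
			sizeof(m->payload[0]));
	if (batch < HV_MESSAGE_PAYLOAD_QWORD_COUNT)
		return -1;

	m = p;
	m->message_type = message_type;
	m->payload_size = (uint32_t)payload_size;
	memcpy(m->payload, payload, payload_size);
	*msg = m;
	return batch;
}
```

Since sizeof(*aligned_msg) still covers the whole payload in this
arrangement, the size passed to hv_ghcb_hypercall() would stay as it is.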

Other reviewers -- please chime in on whether the "no open coding" goal
should be kept. I can drop that goal if that's the way folks prefer.

> 
> >  		else
> >  			status = HV_STATUS_INVALID_PARAMETER;
> >  	} else {
> > diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> > index fec2f18679e3..2def8b8794ee 100644
> > --- a/drivers/hv/hv_balloon.c
> > +++ b/drivers/hv/hv_balloon.c
> > @@ -1582,14 +1582,14 @@ static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
> >  	WARN_ON_ONCE(nents > HV_MEMORY_HINT_MAX_GPA_PAGE_RANGES);
> >  	WARN_ON_ONCE(sgl->length < (HV_HYP_PAGE_SIZE << page_reporting_order));
> >  	local_irq_save(flags);
> > -	hint = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +
> > +	hv_hvcall_in_array(&hint, sizeof(*hint), sizeof(hint->ranges[0]));
> 
> We should ensure the returned batch size is large enough for
> "nents".

OK, right.  That test would replace the WARN_ON_ONCE() based on nents.
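
Roughly what I have in mind, as a userspace sketch rather than the
actual patch. model_batch_size() and model_check_nents() are
hypothetical names, the 4 KiB page size is assumed, and the structs are
simplified stand-ins for the kernel's hv_memory_hint and
hv_gpa_page_range:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096

/* Simplified stand-ins for the kernel types. */
union hv_gpa_page_range {
	uint64_t address_space;
};

struct hv_memory_hint {
	uint64_t heat_type_and_reserved;	/* 2-bit type + 62 reserved bits */
	union hv_gpa_page_range ranges[];
};

/* Hypothetical model of the batch size hv_hvcall_in_array() would return. */
static int model_batch_size(size_t fixed_size, size_t elem_size)
{
	assert(elem_size > 0 && fixed_size <= MODEL_PAGE_SIZE);
	return (int)((MODEL_PAGE_SIZE - fixed_size) / elem_size);
}

/* Reject a scatterlist with more entries than fit in one hypercall. */
static int model_check_nents(unsigned int nents)
{
	int batch_size = model_batch_size(sizeof(struct hv_memory_hint),
					  sizeof(union hv_gpa_page_range));

	if (nents > (unsigned int)batch_size)
		return -ENOSPC;
	return 0;
}
```

With an 8-byte header and 8-byte ranges the computed batch size is 511,
so the check covers the same limit the WARN_ON_ONCE() guards today, but
derives it from the value returned at the call site.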

> 
> >  	if (!hint) {
> >  		local_irq_restore(flags);
> >  		return -ENOSPC;
> >  	}
> >
> >  	hint->heat_type = HV_EXTMEM_HEAT_HINT_COLD_DISCARD;
> > -	hint->reserved = 0;
> >  	for_each_sg(sgl, sg, nents, i) {
> >  		union hv_gpa_page_range *range;
> >
> > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> > index 9804adb4cc56..a6b1cdfbc8d4 100644
> > --- a/drivers/hv/hv_common.c
> > +++ b/drivers/hv/hv_common.c
> > @@ -293,7 +293,7 @@ void __init hv_get_partition_id(void)
> >  	u64 status, pt_id;
> >
> >  	local_irq_save(flags);
> > -	output = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +	hv_hvcall_inout(NULL, 0, &output, sizeof(*output));
> >  	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, &output);
> >  	pt_id = output->partition_id;
> >  	local_irq_restore(flags);
> > diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> > index 2fae18e4f7d2..5c580ee1c23f 100644
> > --- a/drivers/hv/hv_proc.c
> > +++ b/drivers/hv/hv_proc.c
> > @@ -73,7 +73,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> >
> >  	local_irq_save(flags);
> >
> > -	input_page = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +	hv_hvcall_in_array(&input_page, sizeof(*input_page),
> > +			   sizeof(input_page->gpa_page_list[0]));
> 
> We should ensure the returned batch size is large enough.

OK.

> 
> >
> >  	input_page->partition_id = partition_id;
> >
> > @@ -124,9 +125,8 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
> >  	do {
> >  		local_irq_save(flags);
> >
> > -		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> >  		/* We don't do anything with the output right now */
> > -		output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> > +		hv_hvcall_inout(&input, sizeof(*input), &output, sizeof(*output));
> >
> >  		input->lp_index = lp_index;
> >  		input->apic_id = apic_id;
> > @@ -167,7 +167,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
> >  	do {
> >  		local_irq_save(irq_flags);
> >
> > -		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +		hv_hvcall_in(&input, sizeof(*input));
> >
> >  		input->partition_id = partition_id;
> >  		input->vp_index = vp_index;
> > diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> > index 29780f3a7478..44b5e8330d9d 100644
> > --- a/drivers/hv/hyperv_vmbus.h
> > +++ b/drivers/hv/hyperv_vmbus.h
> > @@ -101,7 +101,7 @@ struct hv_input_post_message {
> >  	u32 reserved;
> >  	u32 message_type;
> >  	u32 payload_size;
> > -	u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
> > +	u64 payload[];
> 
> See my comment above, I'd prefer to keep this how it is.
> 
> >  };
> >
> >
> 
> Thanks
> Nuno



* RE: [PATCH v2 5/6] PCI: hv: Use hv_hvcall_*() to set up hypercall arguments
  2025-03-21 20:18   ` Nuno Das Neves
@ 2025-03-30 21:53     ` Michael Kelley
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Kelley @ 2025-03-30 21:53 UTC (permalink / raw)
  To: Nuno Das Neves, kys@microsoft.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, lpieralisi@kernel.org, kw@linux.com,
	manivannan.sadhasivam@linaro.org, robh@kernel.org,
	bhelgaas@google.com, arnd@arndb.de
  Cc: x86@kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-arch@vger.kernel.org

From: Nuno Das Neves <nunodasneves@linux.microsoft.com> Sent: Friday, March 21, 2025 1:19 PM
> 
> On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> >
> > Update hypercall call sites to use the new hv_hvcall_*() functions
> > to set up hypercall arguments. Since these functions zero the
> > fixed portion of input memory, remove now redundant calls to memset().
> >
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> > Acked-by: Bjorn Helgaas <bhelgaas@google.com>
> > ---
> >
> > Notes:
> >     Changes in v2:
> >     * In hv_arch_irq_unmask(), added check of the number of computed banks
> >       in the hv_vpset against the batch_size. Since an hv_vpset currently
> >       represents a maximum of 4096 CPUs, the hv_vpset size does not exceed
> >       512 bytes and there should always be sufficient space. But do the
> >       check just in case something changes. [Nuno Das Neves]
> >
> >  drivers/pci/controller/pci-hyperv.c | 18 ++++++++----------
> >  include/hyperv/hvgdk_mini.h         |  2 +-
> >  2 files changed, 9 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > index ac27bda5ba26..82ac0e09943b 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -622,7 +622,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
> >  	struct pci_dev *pdev;
> >  	unsigned long flags;
> >  	u32 var_size = 0;
> > -	int cpu, nr_bank;
> > +	int cpu, nr_bank, batch_size;
> >  	u64 res;
> >
> >  	dest = irq_data_get_effective_affinity_mask(data);
> > @@ -638,8 +638,8 @@ static void hv_arch_irq_unmask(struct irq_data *data)
> >
> >  	local_irq_save(flags);
> >
> > -	params = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > -	memset(params, 0, sizeof(*params));
> > +	batch_size = hv_hvcall_in_array(&params, sizeof(*params),
> > +					sizeof(params->int_target.vp_set.bank_contents[0]));
> >  	params->partition_id = HV_PARTITION_ID_SELF;
> >  	params->int_entry.source = HV_INTERRUPT_SOURCE_MSI;
> >  	params->int_entry.msi_entry.address.as_uint32 = int_desc->address & 0xffffffff;
> > @@ -671,7 +671,7 @@ static void hv_arch_irq_unmask(struct irq_data *data)
> >  		nr_bank = cpumask_to_vpset(&params->int_target.vp_set, tmp);
> >  		free_cpumask_var(tmp);
> >
> > -		if (nr_bank <= 0) {
> > +		if (nr_bank <= 0 || nr_bank > batch_size) {
> >  			res = 1;
> >  			goto out;
> >  		}
> > @@ -1034,11 +1034,9 @@ static void hv_pci_read_mmio(struct device *dev, phys_addr_t gpa, int size, u32
> >
> >  	/*
> >  	 * Must be called with interrupts disabled so it is safe
> > -	 * to use the per-cpu input argument page.  Use it for
> > -	 * both input and output.
> > +	 * to use the per-cpu argument page.
> >  	 */
> > -	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > -	out = *this_cpu_ptr(hyperv_pcpu_input_arg) + sizeof(*in);
> > +	hv_hvcall_inout(&in, sizeof(*in), &out, sizeof(*out));
> >  	in->gpa = gpa;
> >  	in->size = size;
> >
> > @@ -1067,9 +1065,9 @@ static void hv_pci_write_mmio(struct device *dev, phys_addr_t gpa, int size, u32
> >
> >  	/*
> >  	 * Must be called with interrupts disabled so it is safe
> > -	 * to use the per-cpu input argument memory.
> > +	 * to use the per-cpu argument page.
> >  	 */
> > -	in = *this_cpu_ptr(hyperv_pcpu_input_arg);
> > +	hv_hvcall_in_array(&in, sizeof(*in), sizeof(in->data[0]));
> >  	in->gpa = gpa;
> >  	in->size = size;
> >  	switch (size) {
> > diff --git a/include/hyperv/hvgdk_mini.h b/include/hyperv/hvgdk_mini.h
> > index 70e5d7ee40c8..cb25ac1e3ac5 100644
> > --- a/include/hyperv/hvgdk_mini.h
> > +++ b/include/hyperv/hvgdk_mini.h
> > @@ -1342,7 +1342,7 @@ struct hv_mmio_write_input {
> >  	u64 gpa;
> >  	u32 size;
> >  	u32 reserved;
> > -	u8 data[HV_HYPERCALL_MMIO_MAX_DATA_LENGTH];
> > +	u8 data[];
> 
> As with the prior patch, I don't think this is worth changing. The
> code in hv_pci_write_mmio() is more complicated as a result, and
> changing the array means the struct no longer matches the Hyper-V
> struct 1:1.
> 

Given the goal of matching the Hyper-V structure definitions, I
can see that changing the "data" field to be a flexible array is
problematic. But what additional complications in
hv_pci_write_mmio() are you referring to?  There's only a
one-line change. Again, I'd like to not leave cases where use of
hyperv_pcpu_input_arg is open coded. I think hv_hvcall_*() can still
be used even if the "data" field is a fixed-size array.
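
As a quick layout check (a userspace sketch, not kernel code): the
struct names and model_batch_size() below are hypothetical, the value
64 for HV_HYPERCALL_MMIO_MAX_DATA_LENGTH and the 4 KiB page are
assumptions, and __packed is omitted because these fields have no
padding anyway. The point is that offsetof(..., data) on the fixed-size
variant equals sizeof() of the flexible-array variant, so either form
hands the same "fixed portion" and the same batch size to the helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096
#define HV_HYPERCALL_MMIO_MAX_DATA_LENGTH 64	/* assumed value */

/* Variant that matches the Hyper-V definition 1:1. */
struct mmio_write_fixed {
	uint64_t gpa;
	uint32_t size;
	uint32_t reserved;
	uint8_t data[HV_HYPERCALL_MMIO_MAX_DATA_LENGTH];
};

/* Variant proposed in v2 of the patch. */
struct mmio_write_flex {
	uint64_t gpa;
	uint32_t size;
	uint32_t reserved;
	uint8_t data[];
};

/* The header is the same 16 bytes in both variants. */
_Static_assert(offsetof(struct mmio_write_fixed, data) ==
	       sizeof(struct mmio_write_flex), "header sizes differ");
_Static_assert(offsetof(struct mmio_write_fixed, data) ==
	       offsetof(struct mmio_write_flex, data), "data offsets differ");

/* Hypothetical model of the batch size hv_hvcall_in_array() would return. */
static int model_batch_size(size_t fixed_size, size_t elem_size)
{
	assert(elem_size > 0 && fixed_size <= MODEL_PAGE_SIZE);
	return (int)((MODEL_PAGE_SIZE - fixed_size) / elem_size);
}

/* Fixed-size array: pass offsetof() instead of sizeof(*in). */
static int batch_with_fixed_array(void)
{
	return model_batch_size(offsetof(struct mmio_write_fixed, data),
				sizeof(((struct mmio_write_fixed *)0)->data[0]));
}

/* Flexible array: sizeof() of the struct is already just the header. */
static int batch_with_flex_array(void)
{
	return model_batch_size(sizeof(struct mmio_write_flex),
				sizeof(((struct mmio_write_flex *)0)->data[0]));
}
```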

Michael


* Re: [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args
  2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
                   ` (5 preceding siblings ...)
  2025-03-13  6:19 ` [PATCH v2 6/6] Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg mhkelley58
@ 2025-04-01 19:29 ` Easwar Hariharan
  6 siblings, 0 replies; 14+ messages in thread
From: Easwar Hariharan @ 2025-04-01 19:29 UTC (permalink / raw)
  To: mhklinux
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, hpa,
	lpieralisi, kw, manivannan.sadhasivam, robh, bhelgaas, arnd,
	eahariha, x86, linux-hyperv, linux-kernel, linux-pci, linux-arch

On 3/12/2025 11:19 PM, mhkelley58@gmail.com wrote:
> From: Michael Kelley <mhklinux@outlook.com>
> 
> This patch set introduces a new way to manage the use of the per-cpu
> memory that is usually the input and output arguments to Hyper-V
> hypercalls. Current code allocates the "hyperv_pcpu_input_arg", and in
> some configurations, the "hyperv_pcpu_output_arg". Each is a 4 KiB
> page of memory allocated per-vCPU. A hypercall call site disables
> interrupts, then uses this memory to set up the input parameters for
> the hypercall, read the output results after hypercall execution, and
> re-enable interrupts. The open coding of these steps has led to
> inconsistencies, and in some cases, violation of the generic
> requirements for the hypercall input and output as described in the
> Hyper-V Top Level Functional Spec (TLFS)[1]. This patch set introduces
> a new family of inline functions to replace the open coding. The new
> functions encapsulate key aspects of the use of per-vCPU memory for
> hypercall input and output, and ensure that the TLFS requirements are
> met (max size of 1 page each for input and output, no overlap of input
> and output, aligned to 8 bytes, etc.).
> 
> With this change, hypercall call sites no longer directly access
> "hyperv_pcpu_input_arg" and "hyperv_pcpu_output_arg". Instead, one of
> a family of new functions provides the per-cpu memory that a hypercall
> call site uses to set up hypercall input and output areas.
> Conceptually, there is no longer a difference between the "per-vCPU
> input page" and "per-vCPU output page". Only a single per-vCPU page is
> allocated, and it is used to provide both hypercall input and output.
> All current hypercalls can fit their input and output within that single
> page, though the new code allows easy changing to two pages should a
> future hypercall require a full page for each of the input and output.
> 
> The new functions always zero the fixed-size portion of the hypercall
> input area (but not any array portion -- see below) so that
> uninitialized memory isn't inadvertently passed to the hypercall.
> Current open-coded hypercall call sites are inconsistent on this point,
> and use of the new functions addresses that inconsistency. The output
> area is not zero'ed by the new code as it is Hyper-V's responsibility
> to provide legal output.
> 
> When the input or output (or both) contain an array, the new code
> calculates and returns how many array entries fit within the per-cpu
> memory page, which is effectively the "batch size" for the hypercall
> processing multiple entries. This batch size can then be used in the
> hypercall control word to specify the repetition count. This
> calculation of the batch size replaces current open coding of the
> batch size, which is prone to errors. Note that the array portion of
> the input area is *not* zero'ed. The arrays are almost always 64-bit
> GPAs or something similar, and zero'ing that much memory seems
> wasteful at runtime when it will all be overwritten. The hypercall
> call site is responsible for ensuring that no part of the array is
> left uninitialized (just as with current code).
> 
> The new family of functions is realized as a single inline function
> that handles the most complex case, which is a hypercall with input
> and output, both of which contain arrays. Simpler cases are mapped to
> this most complex case with #define wrappers that provide zero or NULL
> for some arguments. Several of the arguments to this new function
> must be compile-time constants generated by "sizeof()" expressions.
> As such, most of the code in the new function is evaluated by the
> compiler, with the result that the runtime code paths are no longer
> than with the current open coding. An exception is the new code
> generated to zero the fixed-size portion of the input area in cases
> where it was not previously done.
> 
> Use of the new function typically (but not always) saves a few lines
> of code at each hypercall call site. This is traded off against the
> lines of code added for the new functions. With code currently
> upstream, the net is an add of about 60 lines of code and comments.
> However, as additional hypercall call sites are upstreamed from the
> OpenHCL project[2] in support of Linux running in the Hyper-V root
> partition and in VTLs other than VTL 0, the net lines of code added is
> nearly zero.
> 
> A couple hypercall call sites have requirements that are not 100%
> handled by the new function. These still require some manual open-
> coded adjustment or open-coded batch size calculations -- see the
> individual patches in this series. Suggestions on how to do better
> are welcome.
> 
> The patches in the series do the following:
> 
> Patch 1: Introduce the new family of functions for assigning hypercall
>          input and output arguments.
> 
> Patch 2 to 5: Change existing hypercall call sites to use one of the new
>          functions. In some cases, tweaks to the hypercall argument data
>          structures are necessary, but these tweaks are making the data
>          structures more consistent with the overall pattern. These
>          four patches are independent of each other, and can go in any
>          order. The breakup into 4 patches is for ease of review.
> 
> Patch 6: Update the name of the variable used to hold the per-cpu memory
>          used for hypercall arguments. Remove code for managing the
> 	 per-cpu output page.
> 
> Patch 1 from v1 of the patch set has been dropped in v2. It was a bug
> fix that has already been picked up.
> 
> The new code compiles and runs successfully on x86 and arm64. Separate
> from this patch set, for evaluation purposes I also applied the
> changes to the additional hypercall call sites in the OpenHCL
> project[2]. However, I don't have the hardware or Hyper-V
> configurations needed to test running in the Hyper-V root partition or
> in a VTL other than VTL 0. So the related hypercall call sites still
> need to be tested to make sure I didn't break anything. Hopefully
> someone with the necessary configurations and Hyper-V versions can
> help with that testing.
> 
> For gcc 9.4.0, I've looked at the generated code for a couple of
> hypercall call sites on both x86 and arm64 to ensure that it boils
> down to the equivalent of the current open coding. I have not looked
> at the generated code for later gcc versions or for Clang/LLVM, but
> there's no reason to expect something worse as the code isn't doing
> anything tricky.
> 
> This patch set is built against linux-next20250311.
> 
> [1] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs
> [2] https://github.com/microsoft/OHCL-Linux-Kernel
> 
> Michael Kelley (6):
>   Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments
>   x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1
>   x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2
>   Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments
>   PCI: hv: Use hv_hvcall_*() to set up hypercall arguments
>   Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg
> 
>  arch/x86/hyperv/hv_apic.c           |  10 ++-
>  arch/x86/hyperv/hv_init.c           |  12 ++--
>  arch/x86/hyperv/hv_vtl.c            |   9 +--
>  arch/x86/hyperv/irqdomain.c         |  17 +++--
>  arch/x86/hyperv/ivm.c               |  18 ++---
>  arch/x86/hyperv/mmu.c               |  19 ++---
>  arch/x86/hyperv/nested.c            |  14 ++--
>  drivers/hv/hv.c                     |  11 +--
>  drivers/hv/hv_balloon.c             |   4 +-
>  drivers/hv/hv_common.c              |  57 +++++----------
>  drivers/hv/hv_proc.c                |   8 +--
>  drivers/hv/hyperv_vmbus.h           |   2 +-
>  drivers/pci/controller/pci-hyperv.c |  18 +++--
>  include/asm-generic/mshyperv.h      | 103 +++++++++++++++++++++++++++-
>  include/hyperv/hvgdk_mini.h         |   6 +-
>  15 files changed, 184 insertions(+), 124 deletions(-)
> 

I do intend to review this series but it's been hard to get time in between doing
commercial work. I'll leave it to Wei to determine how important my feedback is since
Nuno has reviewed v1.

I do feel that it's important for *someone* to review the series from the Linux
guest perspective.

Thanks,
Easwar (he/him)


Thread overview: 14+ messages
2025-03-13  6:19 [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args mhkelley58
2025-03-13  6:19 ` [PATCH v2 1/6] Drivers: hv: Introduce hv_hvcall_*() functions for hypercall arguments mhkelley58
2025-03-13  6:19 ` [PATCH v2 2/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 1 mhkelley58
2025-03-21 19:21   ` Nuno Das Neves
2025-03-13  6:19 ` [PATCH v2 3/6] x86/hyperv: Use hv_hvcall_*() to set up hypercall arguments -- part 2 mhkelley58
2025-03-21 19:35   ` Nuno Das Neves
2025-03-13  6:19 ` [PATCH v2 4/6] Drivers: hv: Use hv_hvcall_*() to set up hypercall arguments mhkelley58
2025-03-21 20:11   ` Nuno Das Neves
2025-03-30 21:53     ` Michael Kelley
2025-03-13  6:19 ` [PATCH v2 5/6] PCI: " mhkelley58
2025-03-21 20:18   ` Nuno Das Neves
2025-03-30 21:53     ` Michael Kelley
2025-03-13  6:19 ` [PATCH v2 6/6] Drivers: hv: Replace hyperv_pcpu_input/output_arg with hyperv_pcpu_arg mhkelley58
2025-04-01 19:29 ` [PATCH v2 0/6] hyperv: Introduce new way to manage hypercall args Easwar Hariharan