* [PATCH 1/4] mshv: Support larger memory deposits
2026-03-04 0:23 [PATCH 0/4] mshv: Fix and improve memory pre-depositing Stanislav Kinsburskii
@ 2026-03-04 0:23 ` Stanislav Kinsburskii
2026-03-05 19:43 ` Michael Kelley
2026-03-06 3:19 ` Mukesh R
2026-03-04 0:23 ` [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization Stanislav Kinsburskii
` (3 subsequent siblings)
4 siblings, 2 replies; 16+ messages in thread
From: Stanislav Kinsburskii @ 2026-03-04 0:23 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel
Convert hv_call_deposit_pages() into a wrapper supporting arbitrary number
of pages, and use it in the memory deposit code paths.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/hv_proc.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)
diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
index 5f4fd9c3231c..0f84a70def30 100644
--- a/drivers/hv/hv_proc.c
+++ b/drivers/hv/hv_proc.c
@@ -16,7 +16,7 @@
#define HV_DEPOSIT_MAX (HV_HYP_PAGE_SIZE / sizeof(u64) - 1)
/* Deposits exact number of pages. Must be called with interrupts enabled. */
-int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
+static int __hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
{
struct page **pages, *page;
int *counts;
@@ -108,6 +108,54 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
kfree(counts);
return ret;
}
+
+/**
+ * hv_call_deposit_pages - Deposit memory pages to a partition
+ * @node : NUMA node from which to allocate pages
+ * @partition_id: Target partition ID to deposit pages to
+ * @num_pages : Number of pages to deposit
+ *
+ * Deposits memory pages to the specified partition. The deposit is
+ * performed in chunks of HV_DEPOSIT_MAX pages to handle large requests
+ * efficiently.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
+{
+ u32 done;
+ int ret = 0;
+
+ /*
+ * Do a double deposit for L1VH. This reserves enough memory for
+ * Hypervisor Hot Restart (HHR).
+ *
+ * During HHR, every data structure must be recreated in the new
+ * ("proto") hypervisor. Memory is required by the proto hypervisor
+ * to do this work.
+ *
+ * For regular L1 partitions, more memory can be requested from the
+ * root during HHR by sending an asynchronous message. But this is
+ * not supported for L1VHs. A guest must not be allowed to block
+ * HHR by refusing to deposit more memory.
+ *
+ * So for L1VH a deposit is always required for both current needs
+ * and future HHR work.
+ */
+ if (hv_l1vh_partition())
+ num_pages *= 2;
+
+ for (done = 0; done < num_pages; done += HV_DEPOSIT_MAX) {
+ u32 to_deposit = min(num_pages - done, HV_DEPOSIT_MAX);
+
+ ret = __hv_call_deposit_pages(node, partition_id,
+ to_deposit);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
int hv_deposit_memory_node(int node, u64 partition_id,
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH 1/4] mshv: Support larger memory deposits
2026-03-04 0:23 ` [PATCH 1/4] mshv: Support larger memory deposits Stanislav Kinsburskii
@ 2026-03-05 19:43 ` Michael Kelley
2026-03-06 3:19 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Michael Kelley @ 2026-03-05 19:43 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:24 PM
>
> Convert hv_call_deposit_pages() into a wrapper supporting arbitrary number
> of pages, and use it in the memory deposit code paths.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/hv_proc.c | 50
> +++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> index 5f4fd9c3231c..0f84a70def30 100644
> --- a/drivers/hv/hv_proc.c
> +++ b/drivers/hv/hv_proc.c
> @@ -16,7 +16,7 @@
> #define HV_DEPOSIT_MAX (HV_HYP_PAGE_SIZE / sizeof(u64) - 1)
>
> /* Deposits exact number of pages. Must be called with interrupts enabled. */
> -int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> +static int __hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> {
> struct page **pages, *page;
> int *counts;
> @@ -108,6 +108,54 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> kfree(counts);
> return ret;
> }
> +
> +/**
> + * hv_call_deposit_pages - Deposit memory pages to a partition
> + * @node : NUMA node from which to allocate pages
> + * @partition_id: Target partition ID to deposit pages to
> + * @num_pages : Number of pages to deposit
> + *
> + * Deposits memory pages to the specified partition. The deposit is
> + * performed in chunks of HV_DEPOSIT_MAX pages to handle large requests
> + * efficiently.
> + *
> + * Return: 0 on success, negative error code on failure
For the failure case, a key fact seems to be that there's no attempt to
withdraw any pages that might have been successfully deposited. In
such a failure case, the caller has no information about how many pages
were, or were not, deposited. The 2x for L1VH further muddies the
picture.
__hv_call_deposit_pages() apparently assumes that if the underlying
hypercall fails, none of the pages were deposited. So it frees all the
allocated pages. But I wonder if that's really true. The hypercall is
a rep hypercall, which can get partly through the list, return to the
guest, then restart where it left off. If there's a failure after a
restart, I wonder if the hypercall goes back and withdraws any
pages that were successfully deposited before the restart. The
restart behaves like a new invocation of the hypercall.
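One way to give the caller the missing information is for the chunked
wrapper to report how many pages it handed over before failing. Below is a
minimal userspace sketch of that idea, not the patch's code: the sketch_
names are hypothetical, the callback stands in for
__hv_call_deposit_pages(), and it assumes a failed chunk deposits nothing,
which is exactly the assumption questioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* HV_HYP_PAGE_SIZE / sizeof(u64) - 1, assuming 4 KiB hypervisor pages. */
#define SKETCH_DEPOSIT_MAX 511u

/* Hypothetical stand-in for __hv_call_deposit_pages(): 0 or -errno. */
typedef int (*sketch_deposit_fn)(uint32_t num_pages, void *ctx);

/*
 * Deposit num_pages in chunks. On return, *deposited holds the pages
 * handed over by the chunks that succeeded, so a caller could withdraw
 * that amount on failure. Assumption: a failed chunk deposits nothing.
 */
static int sketch_deposit_chunked(uint64_t num_pages, sketch_deposit_fn fn,
				  void *ctx, uint64_t *deposited)
{
	uint64_t done = 0;
	int ret = 0;

	assert(fn && deposited);

	while (done < num_pages) {
		uint64_t left = num_pages - done;
		uint32_t chunk = left < SKETCH_DEPOSIT_MAX ?
				 (uint32_t)left : SKETCH_DEPOSIT_MAX;

		ret = fn(chunk, ctx);
		if (ret)
			break;
		done += chunk;
	}

	*deposited = done;
	return ret;
}

/* Test double: succeeds until a fixed budget of pages is exhausted. */
static int sketch_fake_deposit(uint32_t num_pages, void *ctx)
{
	uint64_t *budget = ctx;

	if (*budget < num_pages)
		return -ENOMEM;
	*budget -= num_pages;
	return 0;
}
```

With a budget of 1200 pages and a request for 2000, two full chunks succeed
and the third fails, so the caller would learn that 1022 pages are on
deposit. Whether the hypervisor keeps pages from a partially completed rep
hypercall is still the open question, and this sketch does not answer it.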
> + */
> +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
Perhaps the num_pages parameter should be a u64. The u32 imposes
a limit of 8 Tbytes on the amount of memory that can be deposited
(allowing for the 2x multiplier for L1VH partitions). Azure has VM sizes
today with up to 30 Tbytes of memory, so it's certainly possible.
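To make the limit concrete, here is a small sketch of the arithmetic,
assuming 4 KiB hypervisor pages. The sketch_ names are illustrative and not
part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB hypervisor pages (page shift of 12). */
#define SKETCH_HV_PAGE_SHIFT 12

/*
 * Largest deposit, in bytes, that a u32 page count can describe.
 * When the count is doubled for L1VH, the caller may pass at most
 * UINT32_MAX / 2 pages before "num_pages *= 2" wraps around.
 */
static uint64_t sketch_max_deposit_bytes(int l1vh)
{
	uint64_t max_pages = l1vh ? UINT32_MAX / 2 : UINT32_MAX;

	return max_pages << SKETCH_HV_PAGE_SHIFT;
}

/* The doubling as written in the patch, done in 32-bit arithmetic. */
static uint32_t sketch_double_u32(uint32_t num_pages)
{
	return num_pages * 2;	/* wraps modulo 2^32 */
}
```

Under that assumption the L1VH ceiling is one page short of 8 TiB, and a
30 TiB deposit would need 30 * 2^28 pages, which does not fit in a u32.
A count of 0x80000000 doubles to 0, in which case the loop in the patch
would run zero times and return 0.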
> +{
> + u32 done;
Same here. Use u64.
> + int ret = 0;
> +
> + /*
> + * Do a double deposit for L1VH. This reserves enough memory for
> + * Hypervisor Hot Restart (HHR).
> + *
> + * During HHR, every data structure must be recreated in the new
> + * ("proto") hypervisor. Memory is required by the proto hypervisor
> + * to do this work.
> + *
> + * For regular L1 partitions, more memory can be requested from the
> + * root during HHR by sending an asynchronous message. But this is
> + * not supported for L1VHs. A guest must not be allowed to block
> + * HHR by refusing to deposit more memory.
> + *
> + * So for L1VH a deposit is always required for both current needs
> + * and future HHR work.
> + */
> + if (hv_l1vh_partition())
> + num_pages *= 2;
> +
> + for (done = 0; done < num_pages; done += HV_DEPOSIT_MAX) {
> + u32 to_deposit = min(num_pages - done, HV_DEPOSIT_MAX);
> +
> + ret = __hv_call_deposit_pages(node, partition_id,
> + to_deposit);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
>
> int hv_deposit_memory_node(int node, u64 partition_id,
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/4] mshv: Support larger memory deposits
2026-03-04 0:23 ` [PATCH 1/4] mshv: Support larger memory deposits Stanislav Kinsburskii
2026-03-05 19:43 ` Michael Kelley
@ 2026-03-06 3:19 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:19 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> Convert hv_call_deposit_pages() into a wrapper supporting arbitrary number
> of pages, and use it in the memory deposit code paths.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/hv_proc.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 49 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> index 5f4fd9c3231c..0f84a70def30 100644
> --- a/drivers/hv/hv_proc.c
> +++ b/drivers/hv/hv_proc.c
> @@ -16,7 +16,7 @@
> #define HV_DEPOSIT_MAX (HV_HYP_PAGE_SIZE / sizeof(u64) - 1)
>
> /* Deposits exact number of pages. Must be called with interrupts enabled. */
> -int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> +static int __hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> {
> struct page **pages, *page;
> int *counts;
> @@ -108,6 +108,54 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> kfree(counts);
> return ret;
> }
> +
> +/**
> + * hv_call_deposit_pages - Deposit memory pages to a partition
> + * @node : NUMA node from which to allocate pages
> + * @partition_id: Target partition ID to deposit pages to
> + * @num_pages : Number of pages to deposit
> + *
> + * Deposits memory pages to the specified partition. The deposit is
> + * performed in chunks of HV_DEPOSIT_MAX pages to handle large requests
> + * efficiently.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> +{
> + u32 done;
> + int ret = 0;
> +
> + /*
> + * Do a double deposit for L1VH. This reserves enough memory for
> + * Hypervisor Hot Restart (HHR).
> + *
> + * During HHR, every data structure must be recreated in the new
> + * ("proto") hypervisor. Memory is required by the proto hypervisor
> + * to do this work.
> + *
> + * For regular L1 partitions, more memory can be requested from the
> + * root during HHR by sending an asynchronous message. But this is
> + * not supported for L1VHs. A guest must not be allowed to block
> + * HHR by refusing to deposit more memory.
> + *
> + * So for L1VH a deposit is always required for both current needs
> + * and future HHR work.
> + */
> + if (hv_l1vh_partition())
> + num_pages *= 2;
I'm not sure it is a good idea to do this unconditionally for all
L1VH cases. I'd like to experiment to see whether this is actually
true for all the passthru and interrupt-related hypercalls that fail
with insufficient memory.
> +
> + for (done = 0; done < num_pages; done += HV_DEPOSIT_MAX) {
> + u32 to_deposit = min(num_pages - done, HV_DEPOSIT_MAX);
> +
> + ret = __hv_call_deposit_pages(node, partition_id,
> + to_deposit);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
>
> int hv_deposit_memory_node(int node, u64 partition_id,
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization
2026-03-04 0:23 [PATCH 0/4] mshv: Fix and improve memory pre-depositing Stanislav Kinsburskii
2026-03-04 0:23 ` [PATCH 1/4] mshv: Support larger memory deposits Stanislav Kinsburskii
@ 2026-03-04 0:23 ` Stanislav Kinsburskii
2026-03-05 19:43 ` Michael Kelley
2026-03-06 3:26 ` Mukesh R
2026-03-04 0:23 ` [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization Stanislav Kinsburskii
` (2 subsequent siblings)
4 siblings, 2 replies; 16+ messages in thread
From: Stanislav Kinsburskii @ 2026-03-04 0:23 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel
Deposit enough pages upfront to avoid partition initialization failures due
to low memory. This also speeds up partition initialization.
Move page depositing from the hypercall wrapper to the partition
initialization code. The required number of pages is empirical. This logic
fits better in the partition initialization code than in the hypercall
wrapper.
A partition with nested capability requires 40x more pages (20 MB) to
accommodate the nested MSHV hypervisor. This may be improved in the future.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/mshv_root.h | 1 +
drivers/hv/mshv_root_hv_call.c | 6 ------
drivers/hv/mshv_root_main.c | 23 +++++++++++++++++++++--
3 files changed, 22 insertions(+), 8 deletions(-)
diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
index 947dfb76bb19..40cf7bdbd62f 100644
--- a/drivers/hv/mshv_root.h
+++ b/drivers/hv/mshv_root.h
@@ -106,6 +106,7 @@ struct mshv_partition {
struct hlist_node pt_hnode;
u64 pt_id;
+ u64 pt_flags;
refcount_t pt_ref_count;
struct mutex pt_mutex;
diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
index bdcb8de7fb47..b8d199f95299 100644
--- a/drivers/hv/mshv_root_hv_call.c
+++ b/drivers/hv/mshv_root_hv_call.c
@@ -15,7 +15,6 @@
#include "mshv_root.h"
/* Determined empirically */
-#define HV_INIT_PARTITION_DEPOSIT_PAGES 208
#define HV_UMAP_GPA_PAGES 512
#define HV_PAGE_COUNT_2M_ALIGNED(pg_count) (!((pg_count) & (0x200 - 1)))
@@ -139,11 +138,6 @@ int hv_call_initialize_partition(u64 partition_id)
input.partition_id = partition_id;
- ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id,
- HV_INIT_PARTITION_DEPOSIT_PAGES);
- if (ret)
- return ret;
-
do {
status = hv_do_fast_hypercall8(HVCALL_INITIALIZE_PARTITION,
*(u64 *)&input);
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index d753f41d3b57..fbfc50de332c 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -35,6 +35,10 @@
#include "mshv.h"
#include "mshv_root.h"
+/* The deposit values below are empirical and may need to be adjusted. */
+#define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
+#define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
+
MODULE_AUTHOR("Microsoft");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
@@ -1587,6 +1591,15 @@ mshv_partition_ioctl_set_msi_routing(struct mshv_partition *partition,
return ret;
}
+static u64
+mshv_partition_deposit_pages(struct mshv_partition *partition)
+{
+ if (partition->pt_flags &
+ HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE)
+ return MSHV_PARTITION_DEPOSIT_PAGES_NESTED;
+ return MSHV_PARTITION_DEPOSIT_PAGES;
+}
+
static long
mshv_partition_ioctl_initialize(struct mshv_partition *partition)
{
@@ -1595,6 +1608,11 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
if (partition->pt_initialized)
return 0;
+ ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
+ mshv_partition_deposit_pages(partition));
+ if (ret)
+ goto withdraw_mem;
+
ret = hv_call_initialize_partition(partition->pt_id);
if (ret)
goto withdraw_mem;
@@ -1610,8 +1628,8 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
finalize_partition:
hv_call_finalize_partition(partition->pt_id);
withdraw_mem:
- hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
-
+ hv_call_withdraw_memory(MSHV_PARTITION_DEPOSIT_PAGES,
+ NUMA_NO_NODE, partition->pt_id);
return ret;
}
@@ -2032,6 +2050,7 @@ mshv_ioctl_create_partition(void __user *user_arg, struct device *module_dev)
return -ENOMEM;
partition->pt_module_dev = module_dev;
+ partition->pt_flags = creation_flags;
partition->isolation_type = isolation_properties.isolation_type;
refcount_set(&partition->pt_ref_count, 1);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization
2026-03-04 0:23 ` [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization Stanislav Kinsburskii
@ 2026-03-05 19:43 ` Michael Kelley
2026-03-06 3:26 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Michael Kelley @ 2026-03-05 19:43 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:24 PM
>
> Deposit enough pages upfront to avoid partition initialization failures due
> to low memory. This also speeds up partition initialization.
>
> Move page depositing from the hypercall wrapper to the partition
> initialization code. The required number of pages is empirical. This logic
> fits better in the partition initialization code than in the hypercall
> wrapper.
>
> A partition with nested capability requires 40x more pages (20 MB) to
> accommodate the nested MSHV hypervisor. This may be improved in the future.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/mshv_root.h | 1 +
> drivers/hv/mshv_root_hv_call.c | 6 ------
> drivers/hv/mshv_root_main.c | 23 +++++++++++++++++++++--
> 3 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
> index 947dfb76bb19..40cf7bdbd62f 100644
> --- a/drivers/hv/mshv_root.h
> +++ b/drivers/hv/mshv_root.h
> @@ -106,6 +106,7 @@ struct mshv_partition {
>
> struct hlist_node pt_hnode;
> u64 pt_id;
> + u64 pt_flags;
> refcount_t pt_ref_count;
> struct mutex pt_mutex;
>
> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> index bdcb8de7fb47..b8d199f95299 100644
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c
> @@ -15,7 +15,6 @@
> #include "mshv_root.h"
>
> /* Determined empirically */
I think the above comment applies to HV_INIT_PARTITION_DEPOSIT_PAGES
(not to HV_UMAP_GPA_PAGES) and should be removed.
> -#define HV_INIT_PARTITION_DEPOSIT_PAGES 208
> #define HV_UMAP_GPA_PAGES 512
>
> #define HV_PAGE_COUNT_2M_ALIGNED(pg_count) (!((pg_count) & (0x200 - 1)))
> @@ -139,11 +138,6 @@ int hv_call_initialize_partition(u64 partition_id)
>
> input.partition_id = partition_id;
>
> - ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id,
> - HV_INIT_PARTITION_DEPOSIT_PAGES);
> - if (ret)
> - return ret;
> -
> do {
> status = hv_do_fast_hypercall8(HVCALL_INITIALIZE_PARTITION,
> *(u64 *)&input);
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index d753f41d3b57..fbfc50de332c 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -35,6 +35,10 @@
> #include "mshv.h"
> #include "mshv_root.h"
>
> +/* The deposit values below are empirical and may need to be adjusted. */
> +#define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> +#define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
Nit: The placement of these #defines *above* the MODULE_* notations seems
a bit odd to me.
> +
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
> @@ -1587,6 +1591,15 @@ mshv_partition_ioctl_set_msi_routing(struct
> mshv_partition *partition,
> return ret;
> }
>
> +static u64
> +mshv_partition_deposit_pages(struct mshv_partition *partition)
Nit: This function name makes it seem like it will "deposit pages". Maybe
mshv_partition_get_deposit_cnt(), or something similar, would be better?
> +{
> + if (partition->pt_flags &
> + HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE)
> + return MSHV_PARTITION_DEPOSIT_PAGES_NESTED;
> + return MSHV_PARTITION_DEPOSIT_PAGES;
> +}
> +
> static long
> mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> {
> @@ -1595,6 +1608,11 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> if (partition->pt_initialized)
> return 0;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + mshv_partition_deposit_pages(partition));
> + if (ret)
> + goto withdraw_mem;
> +
> ret = hv_call_initialize_partition(partition->pt_id);
> if (ret)
> goto withdraw_mem;
> @@ -1610,8 +1628,8 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> finalize_partition:
> hv_call_finalize_partition(partition->pt_id);
> withdraw_mem:
> - hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
> -
> + hv_call_withdraw_memory(MSHV_PARTITION_DEPOSIT_PAGES,
> + NUMA_NO_NODE, partition->pt_id);
What's the strategy here for withdrawing memory after a failure? As I noted in
Patch 1 of the series, there's no way to know how many pages were deposited.
Might have been zero, or significantly more than MSHV_PARTITION_DEPOSIT_PAGES.
And in Patches 3 and 4 of the series, there's no attempt to withdraw pages if
hv_call_deposit_pages() fails, which seems inconsistent.
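To put numbers on the gap, here is a sketch of the page counts involved,
assuming 4 KiB pages (PAGE_SHIFT == 12) and assuming the first argument of
hv_call_withdraw_memory() is a page count. The SKETCH_ and sketch_ names are
local to the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. PAGE_SHIFT == 12. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_SZ_512K		(512u * 1024u)
#define SKETCH_SZ_1M		(1024u * 1024u)

#define SKETCH_DEPOSIT_PAGES		(SKETCH_SZ_512K >> SKETCH_PAGE_SHIFT)
#define SKETCH_DEPOSIT_PAGES_NESTED	((20u * SKETCH_SZ_1M) >> SKETCH_PAGE_SHIFT)

/* Pages the patched init path requests when the deposit fully succeeds. */
static uint64_t sketch_pages_deposited(bool nested, bool l1vh)
{
	uint64_t pages = nested ? SKETCH_DEPOSIT_PAGES_NESTED
				: SKETCH_DEPOSIT_PAGES;

	return l1vh ? pages * 2 : pages;
}

/* Pages the patched error path asks to withdraw, whatever was deposited. */
static uint64_t sketch_pages_withdrawn(void)
{
	return SKETCH_DEPOSIT_PAGES;
}
```

With these assumptions the error path asks for 128 pages back, while a
nested-capable partition would have requested 5120 pages, or 10240 on L1VH.
Any pages deposited later by retry loops come on top of that.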
> return ret;
> }
>
> @@ -2032,6 +2050,7 @@ mshv_ioctl_create_partition(void __user *user_arg, struct device *module_dev)
> return -ENOMEM;
>
> partition->pt_module_dev = module_dev;
> + partition->pt_flags = creation_flags;
> partition->isolation_type = isolation_properties.isolation_type;
>
> refcount_set(&partition->pt_ref_count, 1);
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization
2026-03-04 0:23 ` [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization Stanislav Kinsburskii
2026-03-05 19:43 ` Michael Kelley
@ 2026-03-06 3:26 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:26 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> Deposit enough pages upfront to avoid partition initialization failures due
> to low memory. This also speeds up partition initialization.
I am curious what kind of failures are observed. Normally, a hypercall
would fail with insufficient memory, and we continue to deposit until it
succeeds, right? Is there an issue where some calls are not looping
in the deposit path?
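For reference, the deposit-and-retry pattern described here can be modelled
in userspace as below. This is only a sketch: the partition model, the
status values and every sketch_ name are hypothetical stand-ins, and it
assumes one page is deposited per retry.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

enum sketch_status {
	SKETCH_SUCCESS,
	SKETCH_INSUFFICIENT_MEMORY,
	SKETCH_INVALID,
};

/* Hypothetical partition model: pages on deposit vs. pages the call needs. */
struct sketch_partition {
	uint64_t deposited;
	uint64_t needed;
	unsigned int hypercalls;	/* number of attempts made */
};

/* Stand-in for the hypercall: fails until enough pages are on deposit. */
static enum sketch_status sketch_hypercall(struct sketch_partition *pt)
{
	pt->hypercalls++;
	return pt->deposited >= pt->needed ? SKETCH_SUCCESS
					   : SKETCH_INSUFFICIENT_MEMORY;
}

/* Stand-in for a deposit call that always succeeds. */
static int sketch_deposit(struct sketch_partition *pt, uint64_t pages)
{
	pt->deposited += pages;
	return 0;
}

/* Retry the call, depositing one page each time memory runs short. */
static int sketch_call_with_deposit(struct sketch_partition *pt)
{
	int ret;

	assert(pt);

	do {
		enum sketch_status status = sketch_hypercall(pt);

		if (status == SKETCH_SUCCESS)
			return 0;
		if (status != SKETCH_INSUFFICIENT_MEMORY)
			return -EINVAL;
		ret = sketch_deposit(pt, 1);
	} while (!ret);

	return ret;
}
```

In this model a call that needs 208 pages takes 209 attempts from an empty
deposit and a single attempt when the pages are there up front, which would
account for a speedup but not, by itself, for failures.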
> Move page depositing from the hypercall wrapper to the partition
> initialization code. The required number of pages is empirical. This logic
> fits better in the partition initialization code than in the hypercall
> wrapper.
>
> A partition with nested capability requires 40x more pages (20 MB) to
> accommodate the nested MSHV hypervisor. This may be improved in the future.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/mshv_root.h | 1 +
> drivers/hv/mshv_root_hv_call.c | 6 ------
> drivers/hv/mshv_root_main.c | 23 +++++++++++++++++++++--
> 3 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
> index 947dfb76bb19..40cf7bdbd62f 100644
> --- a/drivers/hv/mshv_root.h
> +++ b/drivers/hv/mshv_root.h
> @@ -106,6 +106,7 @@ struct mshv_partition {
>
> struct hlist_node pt_hnode;
> u64 pt_id;
> + u64 pt_flags;
> refcount_t pt_ref_count;
> struct mutex pt_mutex;
>
> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> index bdcb8de7fb47..b8d199f95299 100644
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c
> @@ -15,7 +15,6 @@
> #include "mshv_root.h"
>
> /* Determined empirically */
> -#define HV_INIT_PARTITION_DEPOSIT_PAGES 208
> #define HV_UMAP_GPA_PAGES 512
>
> #define HV_PAGE_COUNT_2M_ALIGNED(pg_count) (!((pg_count) & (0x200 - 1)))
> @@ -139,11 +138,6 @@ int hv_call_initialize_partition(u64 partition_id)
>
> input.partition_id = partition_id;
>
> - ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id,
> - HV_INIT_PARTITION_DEPOSIT_PAGES);
> - if (ret)
> - return ret;
> -
> do {
> status = hv_do_fast_hypercall8(HVCALL_INITIALIZE_PARTITION,
> *(u64 *)&input);
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index d753f41d3b57..fbfc50de332c 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -35,6 +35,10 @@
> #include "mshv.h"
> #include "mshv_root.h"
>
> +/* The deposit values below are empirical and may need to be adjusted. */
> +#define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> +#define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
This suggests action rather than count. imo, much better would be:
#define MSHV_PT_NUM_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
#define MSHV_PT_NUM_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> +
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> MODULE_DESCRIPTION("Microsoft Hyper-V root partition VMM interface /dev/mshv");
> @@ -1587,6 +1591,15 @@ mshv_partition_ioctl_set_msi_routing(struct mshv_partition *partition,
> return ret;
> }
>
> +static u64
> +mshv_partition_deposit_pages(struct mshv_partition *partition)
> +{
> + if (partition->pt_flags &
> + HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE)
> + return MSHV_PARTITION_DEPOSIT_PAGES_NESTED;
> + return MSHV_PARTITION_DEPOSIT_PAGES;
> +}
> +
> static long
> mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> {
> @@ -1595,6 +1608,11 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> if (partition->pt_initialized)
> return 0;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + mshv_partition_deposit_pages(partition));
> + if (ret)
> + goto withdraw_mem;
> +
> ret = hv_call_initialize_partition(partition->pt_id);
> if (ret)
> goto withdraw_mem;
> @@ -1610,8 +1628,8 @@ mshv_partition_ioctl_initialize(struct mshv_partition *partition)
> finalize_partition:
> hv_call_finalize_partition(partition->pt_id);
> withdraw_mem:
> - hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->pt_id);
> -
> + hv_call_withdraw_memory(MSHV_PARTITION_DEPOSIT_PAGES,
> + NUMA_NO_NODE, partition->pt_id);
> return ret;
> }
>
> @@ -2032,6 +2050,7 @@ mshv_ioctl_create_partition(void __user *user_arg, struct device *module_dev)
> return -ENOMEM;
>
> partition->pt_module_dev = module_dev;
> + partition->pt_flags = creation_flags;
> partition->isolation_type = isolation_properties.isolation_type;
>
> refcount_set(&partition->pt_ref_count, 1);
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization
2026-03-04 0:23 [PATCH 0/4] mshv: Fix and improve memory pre-depositing Stanislav Kinsburskii
2026-03-04 0:23 ` [PATCH 1/4] mshv: Support larger memory deposits Stanislav Kinsburskii
2026-03-04 0:23 ` [PATCH 2/4] mshv: Fix pre-depositing of pages for partition initialization Stanislav Kinsburskii
@ 2026-03-04 0:23 ` Stanislav Kinsburskii
2026-03-05 19:44 ` Michael Kelley
2026-03-06 3:33 ` Mukesh R
2026-03-04 0:23 ` [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation Stanislav Kinsburskii
2026-03-06 3:44 ` [PATCH 0/4] mshv: Fix and improve memory pre-depositing Mukesh R
4 siblings, 2 replies; 16+ messages in thread
From: Stanislav Kinsburskii @ 2026-03-04 0:23 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel
Deposit enough pages up front to avoid virtual processor creation failures
due to low memory. This also speeds up guest creation. A VP uses 25% more
pages in a partition with nested virtualization enabled, but the exact
number doesn't vary much, so deposit a fixed number of pages per VP that
works for nested virtualization.
Move page depositing from the hypercall wrapper to the virtual processor
creation code. The required number of pages is based on empirical data.
This logic fits better in the virtual processor creation code than in the
hypercall wrapper.
Also withdraw the deposited memory if virtual processor creation fails.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/hv_proc.c | 8 --------
drivers/hv/mshv_root_main.c | 11 ++++++++++-
2 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
index 0f84a70def30..3d41f52efd9a 100644
--- a/drivers/hv/hv_proc.c
+++ b/drivers/hv/hv_proc.c
@@ -251,14 +251,6 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
unsigned long irq_flags;
int ret = 0;
- /* Root VPs don't seem to need pages deposited */
- if (partition_id != hv_current_partition_id) {
- /* The value 90 is empirically determined. It may change. */
- ret = hv_call_deposit_pages(node, partition_id, 90);
- if (ret)
- return ret;
- }
-
do {
local_irq_save(irq_flags);
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index fbfc50de332c..48c842b6938d 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -38,6 +38,7 @@
/* The deposit values below are empirical and may need to be adjusted. */
#define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
#define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
+#define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
MODULE_AUTHOR("Microsoft");
MODULE_LICENSE("GPL");
@@ -1077,10 +1078,15 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
if (partition->pt_vp_array[args.vp_index])
return -EEXIST;
+ ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
+ MSHV_VP_DEPOSIT_PAGES);
+ if (ret)
+ return ret;
+
ret = hv_call_create_vp(NUMA_NO_NODE, partition->pt_id, args.vp_index,
0 /* Only valid for root partition VPs */);
if (ret)
- return ret;
+ goto withdraw_mem;
ret = hv_map_vp_state_page(partition->pt_id, args.vp_index,
HV_VP_STATE_PAGE_INTERCEPT_MESSAGE,
@@ -1177,6 +1183,9 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
intercept_msg_page, input_vtl_zero);
destroy_vp:
hv_call_delete_vp(partition->pt_id, args.vp_index);
+withdraw_mem:
+ hv_call_withdraw_memory(MSHV_VP_DEPOSIT_PAGES, NUMA_NO_NODE,
+ partition->pt_id);
out:
trace_mshv_create_vp(partition->pt_id, args.vp_index, ret);
return ret;
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization
2026-03-04 0:23 ` [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization Stanislav Kinsburskii
@ 2026-03-05 19:44 ` Michael Kelley
2026-03-06 3:33 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Michael Kelley @ 2026-03-05 19:44 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:24 PM
>
> Deposit enough pages up front to avoid virtual processor creation failures
> due to low memory. This also speeds up guest creation. A VP uses 25% more
> pages in a partition with nested virtualization enabled, but the exact
> number doesn't vary much, so deposit a fixed number of pages per VP that
> works for nested virtualization.
>
> Move page depositing from the hypercall wrapper to the virtual processor
> creation code. The required number of pages is based on empirical data.
> This logic fits better in the virtual processor creation code than in the
> hypercall wrapper.
>
> Also withdraw the deposited memory if virtual processor creation fails.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/hv_proc.c | 8 --------
> drivers/hv/mshv_root_main.c | 11 ++++++++++-
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> index 0f84a70def30..3d41f52efd9a 100644
> --- a/drivers/hv/hv_proc.c
> +++ b/drivers/hv/hv_proc.c
> @@ -251,14 +251,6 @@ int hv_call_create_vp(int node, u64 partition_id, u32
> vp_index, u32 flags)
> unsigned long irq_flags;
> int ret = 0;
>
> - /* Root VPs don't seem to need pages deposited */
> - if (partition_id != hv_current_partition_id) {
> - /* The value 90 is empirically determined. It may change. */
> - ret = hv_call_deposit_pages(node, partition_id, 90);
> - if (ret)
> - return ret;
> - }
> -
> do {
> local_irq_save(irq_flags);
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index fbfc50de332c..48c842b6938d 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -38,6 +38,7 @@
> /* The deposit values below are empirical and may need to be adjusted. */
> #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> +#define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
>
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> @@ -1077,10 +1078,15 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
> if (partition->pt_vp_array[args.vp_index])
> return -EEXIST;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + MSHV_VP_DEPOSIT_PAGES);
> + if (ret)
> + return ret;
> +
> ret = hv_call_create_vp(NUMA_NO_NODE, partition->pt_id, args.vp_index,
> 0 /* Only valid for root partition VPs */);
> if (ret)
> - return ret;
> + goto withdraw_mem;
>
> ret = hv_map_vp_state_page(partition->pt_id, args.vp_index,
> HV_VP_STATE_PAGE_INTERCEPT_MESSAGE,
> @@ -1177,6 +1183,9 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
> intercept_msg_page, input_vtl_zero);
> destroy_vp:
> hv_call_delete_vp(partition->pt_id, args.vp_index);
> +withdraw_mem:
> + hv_call_withdraw_memory(MSHV_VP_DEPOSIT_PAGES, NUMA_NO_NODE,
> + partition->pt_id);
If the partition is an L1VH partition, hv_call_deposit_pages() will have deposited
2 * MSHV_VP_DEPOSIT_PAGES, but here in the failure case you are withdrawing
only MSHV_VP_DEPOSIT_PAGES.
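To make the mismatch concrete, here is a minimal userspace sketch. It assumes 4 KiB pages and the L1VH doubling described in the cover letter; effective_deposit_pages() and leaked_pages() are hypothetical helpers for illustration, not functions from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M      (1U << 20)
#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)

/* Hypothetical: pages actually deposited for a request of `requested`. */
static uint32_t effective_deposit_pages(uint32_t requested, bool is_l1vh)
{
	/* Assumption from the cover letter: L1VH deposits are doubled. */
	return is_l1vh ? 2 * requested : requested;
}

/* Pages left deposited if the error path withdraws only `requested`. */
static uint32_t leaked_pages(uint32_t requested, bool is_l1vh)
{
	return effective_deposit_pages(requested, is_l1vh) - requested;
}
```

With these assumptions, an L1VH partition would keep MSHV_VP_DEPOSIT_PAGES extra pages after each failed VP creation unless the withdraw path uses the same effective count as the deposit path.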
> out:
> trace_mshv_create_vp(partition->pt_id, args.vp_index, ret);
> return ret;
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization
2026-03-04 0:23 ` [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization Stanislav Kinsburskii
2026-03-05 19:44 ` Michael Kelley
@ 2026-03-06 3:33 ` Mukesh R
1 sibling, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:33 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> Deposit enough pages up front to avoid virtual processor creation failures
> due to low memory. This also speeds up guest creation. A VP uses 25% more
> pages in a partition with nested virtualization enabled, but the exact
> number doesn't vary much, so deposit a fixed number of pages per VP that
> works for nested virtualization.
>
> Move page depositing from the hypercall wrapper to the virtual processor
> creation code. The required number of pages is based on empirical data.
> This logic fits better in the virtual processor creation code than in the
> hypercall wrapper.
>
> Also withdraw the deposited memory if virtual processor creation fails.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/hv_proc.c | 8 --------
> drivers/hv/mshv_root_main.c | 11 ++++++++++-
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> index 0f84a70def30..3d41f52efd9a 100644
> --- a/drivers/hv/hv_proc.c
> +++ b/drivers/hv/hv_proc.c
> @@ -251,14 +251,6 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
> unsigned long irq_flags;
> int ret = 0;
>
> - /* Root VPs don't seem to need pages deposited */
> - if (partition_id != hv_current_partition_id) {
> - /* The value 90 is empirically determined. It may change. */
> - ret = hv_call_deposit_pages(node, partition_id, 90);
> - if (ret)
> - return ret;
> - }
> -
> do {
> local_irq_save(irq_flags);
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index fbfc50de332c..48c842b6938d 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -38,6 +38,7 @@
> /* The deposit values below are empirical and may need to be adjusted. */
> #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> +#define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
This seems to assume that each VP will use up a total of 1M, and I don't
think that is the case. My understanding is that the hypervisor will reuse
the remaining chunks. IOW, 6M may be enough for 8 vcpus.
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> @@ -1077,10 +1078,15 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
> if (partition->pt_vp_array[args.vp_index])
> return -EEXIST;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + MSHV_VP_DEPOSIT_PAGES);
> + if (ret)
> + return ret;
> +
> ret = hv_call_create_vp(NUMA_NO_NODE, partition->pt_id, args.vp_index,
> 0 /* Only valid for root partition VPs */);
> if (ret)
> - return ret;
> + goto withdraw_mem;
>
> ret = hv_map_vp_state_page(partition->pt_id, args.vp_index,
> HV_VP_STATE_PAGE_INTERCEPT_MESSAGE,
> @@ -1177,6 +1183,9 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
> intercept_msg_page, input_vtl_zero);
> destroy_vp:
> hv_call_delete_vp(partition->pt_id, args.vp_index);
> +withdraw_mem:
> + hv_call_withdraw_memory(MSHV_VP_DEPOSIT_PAGES, NUMA_NO_NODE,
> + partition->pt_id);
> out:
> trace_mshv_create_vp(partition->pt_id, args.vp_index, ret);
> return ret;
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation
2026-03-04 0:23 [PATCH 0/4] mshv: Fix and improve memory pre-depositing Stanislav Kinsburskii
` (2 preceding siblings ...)
2026-03-04 0:23 ` [PATCH 3/4] mshv: Fix pre-depositing of pages for virtual processor initialization Stanislav Kinsburskii
@ 2026-03-04 0:23 ` Stanislav Kinsburskii
2026-03-05 19:44 ` Michael Kelley
` (2 more replies)
2026-03-06 3:44 ` [PATCH 0/4] mshv: Fix and improve memory pre-depositing Mukesh R
4 siblings, 3 replies; 16+ messages in thread
From: Stanislav Kinsburskii @ 2026-03-04 0:23 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel
Deposit enough pages up front to avoid guest address space region creation
failures due to low memory. This also speeds up guest creation.
Calculate the required number of pages based on the guest's physical
address space size, rounded up to 1 GB chunks. Even the smallest guests are
assumed to need at least 1 GB worth of deposits. This is because every
guest requires tens of megabytes of deposited pages for hypervisor
overhead, making smaller deposits impractical.
Estimating in 1 GB chunks prevents over-depositing for larger guests while
accepting some over-deposit for smaller ones. This trade-off keeps the
estimate close to actual needs for larger guests.
Also withdraw the deposited pages if address space region creation fails.
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/mshv_root_main.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index 48c842b6938d..cb5b4505f8eb 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -39,6 +39,7 @@
#define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
#define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
#define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
+#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> PAGE_SHIFT)
MODULE_AUTHOR("Microsoft");
MODULE_LICENSE("GPL");
@@ -1324,6 +1325,18 @@ static int mshv_prepare_pinned_region(struct mshv_mem_region *region)
return ret;
}
+static u64
+mshv_region_deposit_slat_pages(struct mshv_mem_region *region)
+{
+ u64 region_in_gbs, slat_pages;
+
+ /* SLAT needs 6 MB per 1 GB of address space. */
+ region_in_gbs = DIV_ROUND_UP(region->nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
+ slat_pages = region_in_gbs * MSHV_1G_DEPOSIT_PAGES;
+
+ return slat_pages;
+}
+
/*
* This maps two things: guest RAM and for pci passthru mmio space.
*
@@ -1364,6 +1377,11 @@ mshv_map_user_memory(struct mshv_partition *partition,
if (ret)
return ret;
+ ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
+ mshv_region_deposit_slat_pages(region));
+ if (ret)
+ goto free_region;
+
switch (region->mreg_type) {
case MSHV_REGION_TYPE_MEM_PINNED:
ret = mshv_prepare_pinned_region(region);
@@ -1392,7 +1410,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
region->hv_map_flags, ret);
if (ret)
- goto errout;
+ goto withdraw_memory;
spin_lock(&partition->pt_mem_regions_lock);
hlist_add_head(&region->hnode, &partition->pt_mem_regions);
@@ -1400,7 +1418,10 @@ mshv_map_user_memory(struct mshv_partition *partition,
return 0;
-errout:
+withdraw_memory:
+ hv_call_withdraw_memory(mshv_region_deposit_slat_pages(region),
+ NUMA_NO_NODE, partition->pt_id);
+free_region:
vfree(region);
return ret;
}
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation
2026-03-04 0:23 ` [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation Stanislav Kinsburskii
@ 2026-03-05 19:44 ` Michael Kelley
2026-03-06 4:15 ` mhklkml
2026-03-06 3:41 ` Mukesh R
2026-03-06 3:54 ` Mukesh R
2 siblings, 1 reply; 16+ messages in thread
From: Michael Kelley @ 2026-03-05 19:44 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:24 PM
>
> Deposit enough pages up front to avoid guest address space region creation
> failures due to low memory. This also speeds up guest creation.
>
> Calculate the required number of pages based on the guest's physical
> address space size, rounded up to 1 GB chunks. Even the smallest guests are
> assumed to need at least 1 GB worth of deposits. This is because every
> guest requires tens of megabytes of deposited pages for hypervisor
> overhead, making smaller deposits impractical.
>
> Estimating in 1 GB chunks prevents over-depositing for larger guests while
> accepting some over-deposit for smaller ones. This trade-off keeps the
> estimate close to actual needs for larger guests.
>
> Also withdraw the deposited pages if address space region creation fails.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/mshv_root_main.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 48c842b6938d..cb5b4505f8eb 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -39,6 +39,7 @@
> #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> #define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
> +#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> PAGE_SHIFT)
>
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> @@ -1324,6 +1325,18 @@ static int mshv_prepare_pinned_region(struct mshv_mem_region *region)
> return ret;
> }
>
> +static u64
> +mshv_region_deposit_slat_pages(struct mshv_mem_region *region)
Same nit about the function name. This one seems like it will "deposit slat pages".
> +{
> + u64 region_in_gbs, slat_pages;
> +
> + /* SLAT needs 6 MB per 1 GB of address space. */
> + region_in_gbs = DIV_ROUND_UP(region->nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
This local variable "region_in_gbs" is computed in units of bytes.
> + slat_pages = region_in_gbs * MSHV_1G_DEPOSIT_PAGES;
But here region_in_gbs is used as if it were in units of Gbytes. So the
slat_pages return value is much larger than intended.
> +
> + return slat_pages;
> +}
> +
> /*
> * This maps two things: guest RAM and for pci passthru mmio space.
> *
> @@ -1364,6 +1377,11 @@ mshv_map_user_memory(struct mshv_partition *partition,
> if (ret)
> return ret;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + mshv_region_deposit_slat_pages(region));
> + if (ret)
> + goto free_region;
> +
> switch (region->mreg_type) {
> case MSHV_REGION_TYPE_MEM_PINNED:
> ret = mshv_prepare_pinned_region(region);
> @@ -1392,7 +1410,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> region->hv_map_flags, ret);
>
> if (ret)
> - goto errout;
> + goto withdraw_memory;
>
> spin_lock(&partition->pt_mem_regions_lock);
> hlist_add_head(&region->hnode, &partition->pt_mem_regions);
> @@ -1400,7 +1418,10 @@ mshv_map_user_memory(struct mshv_partition *partition,
>
> return 0;
>
> -errout:
> +withdraw_memory:
> + hv_call_withdraw_memory(mshv_region_deposit_slat_pages(region),
> + NUMA_NO_NODE, partition->pt_id);
Again, for an L1VH partition, the actual number of pages deposited would
be 2x what mshv_region_deposit_slat_pages() returns.
> +free_region:
> vfree(region);
> return ret;
> }
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation
2026-03-05 19:44 ` Michael Kelley
@ 2026-03-06 4:15 ` mhklkml
0 siblings, 0 replies; 16+ messages in thread
From: mhklkml @ 2026-03-06 4:15 UTC (permalink / raw)
To: 'Michael Kelley', 'Stanislav Kinsburskii', kys,
haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
From: Michael Kelley <mhklinux@outlook.com> Sent: Thursday, March 5, 2026 11:45 AM
>
> From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:24 PM
> >
> > Deposit enough pages up front to avoid guest address space region creation
> > failures due to low memory. This also speeds up guest creation.
> >
> > Calculate the required number of pages based on the guest's physical
> > address space size, rounded up to 1 GB chunks. Even the smallest guests are
> > assumed to need at least 1 GB worth of deposits. This is because every
> > guest requires tens of megabytes of deposited pages for hypervisor
> > overhead, making smaller deposits impractical.
> >
> > Estimating in 1 GB chunks prevents over-depositing for larger guests while
> > accepting some over-deposit for smaller ones. This trade-off keeps the
> > estimate close to actual needs for larger guests.
> >
> > Also withdraw the deposited pages if address space region creation fails.
> >
> > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > ---
> > drivers/hv/mshv_root_main.c | 25 +++++++++++++++++++++++--
> > 1 file changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> > index 48c842b6938d..cb5b4505f8eb 100644
> > --- a/drivers/hv/mshv_root_main.c
> > +++ b/drivers/hv/mshv_root_main.c
> > @@ -39,6 +39,7 @@
> > #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> > #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> > #define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
> > +#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> PAGE_SHIFT)
> >
> > MODULE_AUTHOR("Microsoft");
> > MODULE_LICENSE("GPL");
> > @@ -1324,6 +1325,18 @@ static int mshv_prepare_pinned_region(struct mshv_mem_region *region)
> > return ret;
> > }
> >
> > +static u64
> > +mshv_region_deposit_slat_pages(struct mshv_mem_region *region)
>
> Same nit about the function name. This one seems like it will "deposit slat pages".
>
> > +{
> > + u64 region_in_gbs, slat_pages;
> > +
> > + /* SLAT needs 6 MB per 1 GB of address space. */
> > + region_in_gbs = DIV_ROUND_UP(region->nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
>
> This local variable "region_in_gbs" is computed in units of bytes.
Ignore this comment and the following one in this function. I saw the
ROUND_UP(), but somehow failed to see that it was DIV_ROUND_UP(). :-(
Michael
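For reference, a minimal userspace sketch of the distinction, assuming 4 KiB hypervisor pages. The macros below are local stand-ins written for this example, not the kernel definitions, and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins; assumption: 4 KiB hypervisor pages. */
#define SZ_1M (1ULL << 20)
#define SZ_1G (1ULL << 30)
#define HV_HYP_PAGE_SHIFT 12
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define ROUND_UP(n, d)     (DIV_ROUND_UP(n, d) * (d))
#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> HV_HYP_PAGE_SHIFT)

/* DIV_ROUND_UP yields a count of 1 GB chunks. */
static uint64_t region_in_gbs(uint64_t nr_pages)
{
	return DIV_ROUND_UP(nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
}

/* ROUND_UP would instead yield bytes, rounded up to a 1 GB multiple. */
static uint64_t region_rounded_bytes(uint64_t nr_pages)
{
	return ROUND_UP(nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
}

/* Deposit estimate: 6 MB worth of pages per 1 GB chunk. */
static uint64_t slat_deposit_pages(uint64_t nr_pages)
{
	return region_in_gbs(nr_pages) * MSHV_1G_DEPOSIT_PAGES;
}
```

Under these assumptions, a region one page larger than 1 GB counts as 2 chunks and gets 2 * 1536 pages, while the ROUND_UP variant would return a byte count of 2 GB, which is the reading the retracted comment was based on.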
>
> > + slat_pages = region_in_gbs * MSHV_1G_DEPOSIT_PAGES;
>
> But here region_in_gbs is used as if it were in units of Gbytes. So the
> slat_pages return value is much larger than intended.
>
> > +
> > + return slat_pages;
> > +}
> > +
> > /*
> > * This maps two things: guest RAM and for pci passthru mmio space.
> > *
> > @@ -1364,6 +1377,11 @@ mshv_map_user_memory(struct mshv_partition *partition,
> > if (ret)
> > return ret;
> >
> > + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> > + mshv_region_deposit_slat_pages(region));
> > + if (ret)
> > + goto free_region;
> > +
> > switch (region->mreg_type) {
> > case MSHV_REGION_TYPE_MEM_PINNED:
> > ret = mshv_prepare_pinned_region(region);
> > @@ -1392,7 +1410,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> > region->hv_map_flags, ret);
> >
> > if (ret)
> > - goto errout;
> > + goto withdraw_memory;
> >
> > spin_lock(&partition->pt_mem_regions_lock);
> > hlist_add_head(&region->hnode, &partition->pt_mem_regions);
> > @@ -1400,7 +1418,10 @@ mshv_map_user_memory(struct mshv_partition *partition,
> >
> > return 0;
> >
> > -errout:
> > +withdraw_memory:
> > + hv_call_withdraw_memory(mshv_region_deposit_slat_pages(region),
> > + NUMA_NO_NODE, partition->pt_id);
>
> Again, for an L1VH partition, the actual number of pages deposited would
> be 2x what mshv_region_deposit_slat_pages() returns.
>
> > +free_region:
> > vfree(region);
> > return ret;
> > }
> >
> >
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation
2026-03-04 0:23 ` [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation Stanislav Kinsburskii
2026-03-05 19:44 ` Michael Kelley
@ 2026-03-06 3:41 ` Mukesh R
2026-03-06 3:54 ` Mukesh R
2 siblings, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:41 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> Deposit enough pages up front to avoid guest address space region creation
> failures due to low memory. This also speeds up guest creation.
Does this imply that some hypercall fails without returning an
insufficient-memory status?
> Calculate the required number of pages based on the guest's physical
> address space size, rounded up to 1 GB chunks. Even the smallest guests are
> assumed to need at least 1 GB worth of deposits. This is because every
> guest requires tens of megabytes of deposited pages for hypervisor
> overhead, making smaller deposits impractical.
>
> Estimating in 1 GB chunks prevents over-depositing for larger guests while
> accepting some over-deposit for smaller ones. This trade-off keeps the
> estimate close to actual needs for larger guests.
>
> Also withdraw the deposited pages if address space region creation fails.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/mshv_root_main.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 48c842b6938d..cb5b4505f8eb 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -39,6 +39,7 @@
> #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> #define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
> +#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> PAGE_SHIFT)
>
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> @@ -1324,6 +1325,18 @@ static int mshv_prepare_pinned_region(struct mshv_mem_region *region)
> return ret;
> }
>
> +static u64
> +mshv_region_deposit_slat_pages(struct mshv_mem_region *region)
I don't think it is accurate to say slat pages, because in case of
overdeposit, they may be used for non-slat purposes according to my
understanding.
> +{
> + u64 region_in_gbs, slat_pages;
> +
> + /* SLAT needs 6 MB per 1 GB of address space. */
> + region_in_gbs = DIV_ROUND_UP(region->nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
> + slat_pages = region_in_gbs * MSHV_1G_DEPOSIT_PAGES;
> +
> + return slat_pages;
> +}
> +
Again, unconditionally depositing for each region is not good because
that is empirical, and hyp will reuse the leftover ram.
> /*
> * This maps two things: guest RAM and for pci passthru mmio space.
> *
> @@ -1364,6 +1377,11 @@ mshv_map_user_memory(struct mshv_partition *partition,
> if (ret)
> return ret;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + mshv_region_deposit_slat_pages(region));
> + if (ret)
> + goto free_region;
> +
> switch (region->mreg_type) {
> case MSHV_REGION_TYPE_MEM_PINNED:
> ret = mshv_prepare_pinned_region(region);
> @@ -1392,7 +1410,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> region->hv_map_flags, ret);
>
> if (ret)
> - goto errout;
> + goto withdraw_memory;
>
> spin_lock(&partition->pt_mem_regions_lock);
> hlist_add_head(&region->hnode, &partition->pt_mem_regions);
> @@ -1400,7 +1418,10 @@ mshv_map_user_memory(struct mshv_partition *partition,
>
> return 0;
>
> -errout:
> +withdraw_memory:
> + hv_call_withdraw_memory(mshv_region_deposit_slat_pages(region),
> + NUMA_NO_NODE, partition->pt_id);
> +free_region:
> vfree(region);
> return ret;
> }
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation
2026-03-04 0:23 ` [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation Stanislav Kinsburskii
2026-03-05 19:44 ` Michael Kelley
2026-03-06 3:41 ` Mukesh R
@ 2026-03-06 3:54 ` Mukesh R
2 siblings, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:54 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> Deposit enough pages up front to avoid guest address space region creation
> failures due to low memory. This also speeds up guest creation.
>
> Calculate the required number of pages based on the guest's physical
> address space size, rounded up to 1 GB chunks. Even the smallest guests are
> assumed to need at least 1 GB worth of deposits. This is because every
> guest requires tens of megabytes of deposited pages for hypervisor
> overhead, making smaller deposits impractical.
>
> Estimating in 1 GB chunks prevents over-depositing for larger guests while
> accepting some over-deposit for smaller ones. This trade-off keeps the
> estimate close to actual needs for larger guests.
>
> Also withdraw the deposited pages if address space region creation fails.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> ---
> drivers/hv/mshv_root_main.c | 25 +++++++++++++++++++++++--
> 1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 48c842b6938d..cb5b4505f8eb 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -39,6 +39,7 @@
> #define MSHV_PARTITION_DEPOSIT_PAGES (SZ_512K >> PAGE_SHIFT)
> #define MSHV_PARTITION_DEPOSIT_PAGES_NESTED (20 * SZ_1M >> PAGE_SHIFT)
> #define MSHV_VP_DEPOSIT_PAGES (1 * SZ_1M >> PAGE_SHIFT)
> +#define MSHV_1G_DEPOSIT_PAGES (6 * SZ_1M >> PAGE_SHIFT)
>
> MODULE_AUTHOR("Microsoft");
> MODULE_LICENSE("GPL");
> @@ -1324,6 +1325,18 @@ static int mshv_prepare_pinned_region(struct mshv_mem_region *region)
> return ret;
> }
>
> +static u64
> +mshv_region_deposit_slat_pages(struct mshv_mem_region *region)
> +{
> + u64 region_in_gbs, slat_pages;
> +
> + /* SLAT needs 6 MB per 1 GB of address space. */
> + region_in_gbs = DIV_ROUND_UP(region->nr_pages << HV_HYP_PAGE_SHIFT, SZ_1G);
> + slat_pages = region_in_gbs * MSHV_1G_DEPOSIT_PAGES;
> +
> + return slat_pages;
> +}
> +
> /*
> * This maps two things: guest RAM and for pci passthru mmio space.
> *
> @@ -1364,6 +1377,11 @@ mshv_map_user_memory(struct mshv_partition *partition,
> if (ret)
> return ret;
>
> + ret = hv_call_deposit_pages(NUMA_NO_NODE, partition->pt_id,
> + mshv_region_deposit_slat_pages(region));
> + if (ret)
> + goto free_region;
> +
Also, for MSHV_REGION_TYPE_MEM_PINNED, deposit is not needed.
> switch (region->mreg_type) {
> case MSHV_REGION_TYPE_MEM_PINNED:
> ret = mshv_prepare_pinned_region(region);
> @@ -1392,7 +1410,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
> region->hv_map_flags, ret);
>
> if (ret)
> - goto errout;
> + goto withdraw_memory;
>
> spin_lock(&partition->pt_mem_regions_lock);
> hlist_add_head(&region->hnode, &partition->pt_mem_regions);
> @@ -1400,7 +1418,10 @@ mshv_map_user_memory(struct mshv_partition *partition,
>
> return 0;
>
> -errout:
> +withdraw_memory:
> + hv_call_withdraw_memory(mshv_region_deposit_slat_pages(region),
> + NUMA_NO_NODE, partition->pt_id);
> +free_region:
> vfree(region);
> return ret;
> }
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/4] mshv: Fix and improve memory pre-depositing
2026-03-04 0:23 [PATCH 0/4] mshv: Fix and improve memory pre-depositing Stanislav Kinsburskii
` (3 preceding siblings ...)
2026-03-04 0:23 ` [PATCH 4/4] mshv: Pre-deposit pages for SLAT creation Stanislav Kinsburskii
@ 2026-03-06 3:44 ` Mukesh R
4 siblings, 0 replies; 16+ messages in thread
From: Mukesh R @ 2026-03-06 3:44 UTC (permalink / raw)
To: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli
Cc: linux-hyperv, linux-kernel
On 3/3/26 16:23, Stanislav Kinsburskii wrote:
> This series fixes and improves memory pre-depositing in the Microsoft
> Hypervisor (MSHV) driver to avoid partition and virtual processor
> creation failures due to insufficient deposited memory, and to speed
> up guest creation.
>
> The first patch converts hv_call_deposit_pages() into a wrapper that
> supports arbitrarily large deposit requests by splitting them into
> HV_DEPOSIT_MAX-sized chunks. It also doubles the deposit amount for
> L1 virtual hypervisor (L1VH) partitions to reserve memory for
> Hypervisor Hot Restart (HHR), since L1VH guests cannot request
> additional memory from the root partition during HHR.
>
> The second patch moves partition initialization page depositing from
> the hypercall wrapper to the partition initialization ioctl. The
> required number of pages is determined empirically. Partitions with
> nested virtualization capability require significantly more pages
> (20 MB) to accommodate the nested hypervisor. The partition creation
> flags are saved in the partition structure to allow selecting the
> correct deposit size at initialization time.
>
> The third patch moves virtual processor page depositing from
> hv_call_create_vp() to mshv_partition_ioctl_create_vp(). A fixed
> deposit of 1 MB per VP is used, which covers both regular and nested
> virtualization cases. Deposited memory is now properly withdrawn if
> VP creation fails.
>
> The fourth patch adds pre-depositing of pages for guest address space
> (SLAT) region creation. The deposit size is calculated based on the
> region size rounded up to 1 GB chunks, with 6 MB deposited per GB of
> address space. Deposited pages are withdrawn on failure.
Can't we just get away with changing the deposit for most cases to just
2M? My theory is that with that we won't see any measurable performance
hit, and it keeps things simple.
Thanks,
-Mukesh
> ---
>
> Stanislav Kinsburskii (4):
> mshv: Support larger memory deposits
> mshv: Fix pre-depositing of pages for partition initialization
> mshv: Fix pre-depositing of pages for virtual processor initialization
> mshv: Pre-deposit pages for SLAT creation
>
>
> drivers/hv/hv_proc.c | 58 +++++++++++++++++++++++++++++++++------
> drivers/hv/mshv_root.h | 1 +
> drivers/hv/mshv_root_hv_call.c | 6 ----
> drivers/hv/mshv_root_main.c | 59 +++++++++++++++++++++++++++++++++++++---
> 4 files changed, 104 insertions(+), 20 deletions(-)
>
^ permalink raw reply [flat|nested] 16+ messages in thread