Linux-HyperV List
* Re: [PATCH v2] x86/hyperv: Reserve 3 interrupt vectors used exclusively by mshv
From: Mukesh R @ 2026-02-20 18:56 UTC
  To: Wei Liu, Michael Kelley
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	kys@microsoft.com, haiyangz@microsoft.com, decui@microsoft.com,
	longli@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com
In-Reply-To: <20260220184520.GB3119916@liuwe-devbox-debian-v2.local>

On 2/20/26 10:45, Wei Liu wrote:
> On Fri, Feb 20, 2026 at 05:14:26PM +0000, Michael Kelley wrote:
>> From: Mukesh R <mrathor@linux.microsoft.com> Sent: Tuesday, February 17, 2026 3:12 PM
>>>
>>> MSVC compiler, used to compile the Microsoft Hyper-V hypervisor currently,
>>> has an assert intrinsic that uses interrupt vector 0x29 to create an
>>> exception. This will cause hypervisor to then crash and collect core. As
>>> such, if this interrupt number is assigned to a device by Linux and the
>>> device generates it, hypervisor will crash. There are two other such
>>> vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.
>>> Fortunately, the three vectors are part of the kernel driver space and
>>> that makes it feasible to reserve them early so they are not assigned
>>> later.
>>>
>>> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
>>> ---
>>>
>>> v1: Add ifndef CONFIG_X86_FRED (thanks hpa)
>>> v2: replace ifndef with cpu_feature_enabled() (thanks hpa and tglx)
>>>
>>>   arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
>>> index 579fb2c64cfd..88ca127dc6d4 100644
>>> --- a/arch/x86/kernel/cpu/mshyperv.c
>>> +++ b/arch/x86/kernel/cpu/mshyperv.c
>>> @@ -478,6 +478,28 @@ int hv_get_hypervisor_version(union hv_hypervisor_version_info *info)
>>>   }
>>>   EXPORT_SYMBOL_GPL(hv_get_hypervisor_version);
>>>
>>> +/*
>>> + * Reserve vectors hard coded in the hypervisor. If used outside, the hypervisor
>>> + * will either crash or hang or attempt to break into debugger.
>>> + */
>>> +static void hv_reserve_irq_vectors(void)
>>> +{
>>> +	#define HYPERV_DBG_FASTFAIL_VECTOR	0x29
>>> +	#define HYPERV_DBG_ASSERT_VECTOR	0x2C
>>> +	#define HYPERV_DBG_SERVICE_VECTOR	0x2D
>>> +
>>> +	if (cpu_feature_enabled(X86_FEATURE_FRED))
>>> +		return;
>>> +
>>> +	if (test_and_set_bit(HYPERV_DBG_ASSERT_VECTOR, system_vectors) ||
>>> +	    test_and_set_bit(HYPERV_DBG_SERVICE_VECTOR, system_vectors) ||
>>> +	    test_and_set_bit(HYPERV_DBG_FASTFAIL_VECTOR, system_vectors))
>>> +		BUG();
>>> +
>>> +	pr_info("Hyper-V:reserve vectors: %d %d %d\n", HYPERV_DBG_ASSERT_VECTOR,
>>> +		HYPERV_DBG_SERVICE_VECTOR, HYPERV_DBG_FASTFAIL_VECTOR);
>>
>> I'm a little late to the party here, but I've always seen Intel interrupt vectors
>> displayed as 2-digit hex numbers. This info message is displaying decimal,
>> which is atypical and will probably be confusing.
> 
> Noted. The pull request to Linus has been sent. We will change the
> format in a follow up patch.

Well, there is no 0x prefix, so should not be confusing, but no big
deal, whatever.....

Thanks,
-Mukesh
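
For illustration only (not part of the patch): a minimal userspace sketch of the two formats under discussion. The vector values and the decimal format string are taken from the quoted patch; the helper name format_reserved_vectors() and the "0x%02x" variant are assumptions, since the thread does not say what the follow-up patch will use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Vector numbers as defined in the patch under discussion. */
#define HYPERV_DBG_FASTFAIL_VECTOR	0x29
#define HYPERV_DBG_ASSERT_VECTOR	0x2C
#define HYPERV_DBG_SERVICE_VECTOR	0x2D

/*
 * Hypothetical helper: format the message into buf.
 * use_hex == 0 reproduces the patch's "%d %d %d" output;
 * use_hex != 0 shows one possible 2-digit hex form (an assumption).
 */
int format_reserved_vectors(char *buf, size_t len, int use_hex)
{
	assert(buf != NULL && len > 0);

	if (use_hex)
		return snprintf(buf, len,
				"Hyper-V:reserve vectors: 0x%02x 0x%02x 0x%02x\n",
				(unsigned int)HYPERV_DBG_ASSERT_VECTOR,
				(unsigned int)HYPERV_DBG_SERVICE_VECTOR,
				(unsigned int)HYPERV_DBG_FASTFAIL_VECTOR);

	return snprintf(buf, len, "Hyper-V:reserve vectors: %d %d %d\n",
			HYPERV_DBG_ASSERT_VECTOR,
			HYPERV_DBG_SERVICE_VECTOR,
			HYPERV_DBG_FASTFAIL_VECTOR);
}
```

With use_hex == 0 the three vectors print as 44 45 41, so a reader has to convert back to 0x2C, 0x2D and 0x29 to match them against the commit message.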




* Re: [PATCH v2] x86/hyperv: Reserve 3 interrupt vectors used exclusively by mshv
From: Wei Liu @ 2026-02-20 18:45 UTC
  To: Michael Kelley
  Cc: Mukesh R, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, kys@microsoft.com,
	haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
	longli@microsoft.com, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	hpa@zytor.com
In-Reply-To: <SN6PR02MB41574BE5CE887CADE406BAD3D468A@SN6PR02MB4157.namprd02.prod.outlook.com>

On Fri, Feb 20, 2026 at 05:14:26PM +0000, Michael Kelley wrote:
> From: Mukesh R <mrathor@linux.microsoft.com> Sent: Tuesday, February 17, 2026 3:12 PM
> > 
> > MSVC compiler, used to compile the Microsoft Hyper-V hypervisor currently,
> > has an assert intrinsic that uses interrupt vector 0x29 to create an
> > exception. This will cause hypervisor to then crash and collect core. As
> > such, if this interrupt number is assigned to a device by Linux and the
> > device generates it, hypervisor will crash. There are two other such
> > vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.
> > Fortunately, the three vectors are part of the kernel driver space and
> > that makes it feasible to reserve them early so they are not assigned
> > later.
> > 
> > Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> > ---
> > 
> > v1: Add ifndef CONFIG_X86_FRED (thanks hpa)
> > v2: replace ifndef with cpu_feature_enabled() (thanks hpa and tglx)
> > 
> >  arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index 579fb2c64cfd..88ca127dc6d4 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -478,6 +478,28 @@ int hv_get_hypervisor_version(union hv_hypervisor_version_info *info)
> >  }
> >  EXPORT_SYMBOL_GPL(hv_get_hypervisor_version);
> > 
> > +/*
> > + * Reserve vectors hard coded in the hypervisor. If used outside, the hypervisor
> > + * will either crash or hang or attempt to break into debugger.
> > + */
> > +static void hv_reserve_irq_vectors(void)
> > +{
> > +	#define HYPERV_DBG_FASTFAIL_VECTOR	0x29
> > +	#define HYPERV_DBG_ASSERT_VECTOR	0x2C
> > +	#define HYPERV_DBG_SERVICE_VECTOR	0x2D
> > +
> > +	if (cpu_feature_enabled(X86_FEATURE_FRED))
> > +		return;
> > +
> > +	if (test_and_set_bit(HYPERV_DBG_ASSERT_VECTOR, system_vectors) ||
> > +	    test_and_set_bit(HYPERV_DBG_SERVICE_VECTOR, system_vectors) ||
> > +	    test_and_set_bit(HYPERV_DBG_FASTFAIL_VECTOR, system_vectors))
> > +		BUG();
> > +
> > +	pr_info("Hyper-V:reserve vectors: %d %d %d\n", HYPERV_DBG_ASSERT_VECTOR,
> > +		HYPERV_DBG_SERVICE_VECTOR, HYPERV_DBG_FASTFAIL_VECTOR);
> 
> I'm a little late to the party here, but I've always seen Intel interrupt vectors
> displayed as 2-digit hex numbers. This info message is displaying decimal,
> which is atypical and will probably be confusing.

Noted. The pull request to Linus has been sent. We will change the
format in a follow up patch.

Wei


* RE: [PATCH v2] x86/hyperv: Reserve 3 interrupt vectors used exclusively by mshv
From: Michael Kelley @ 2026-02-20 17:14 UTC
  To: Mukesh R, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
	decui@microsoft.com, longli@microsoft.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com
In-Reply-To: <20260217231158.1184736-1-mrathor@linux.microsoft.com>

From: Mukesh R <mrathor@linux.microsoft.com> Sent: Tuesday, February 17, 2026 3:12 PM
> 
> MSVC compiler, used to compile the Microsoft Hyper-V hypervisor currently,
> has an assert intrinsic that uses interrupt vector 0x29 to create an
> exception. This will cause hypervisor to then crash and collect core. As
> such, if this interrupt number is assigned to a device by Linux and the
> device generates it, hypervisor will crash. There are two other such
> vectors hard coded in the hypervisor, 0x2C and 0x2D for debug purposes.
> Fortunately, the three vectors are part of the kernel driver space and
> that makes it feasible to reserve them early so they are not assigned
> later.
> 
> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com>
> ---
> 
> v1: Add ifndef CONFIG_X86_FRED (thanks hpa)
> v2: replace ifndef with cpu_feature_enabled() (thanks hpa and tglx)
> 
>  arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 579fb2c64cfd..88ca127dc6d4 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -478,6 +478,28 @@ int hv_get_hypervisor_version(union hv_hypervisor_version_info *info)
>  }
>  EXPORT_SYMBOL_GPL(hv_get_hypervisor_version);
> 
> +/*
> + * Reserve vectors hard coded in the hypervisor. If used outside, the hypervisor
> + * will either crash or hang or attempt to break into debugger.
> + */
> +static void hv_reserve_irq_vectors(void)
> +{
> +	#define HYPERV_DBG_FASTFAIL_VECTOR	0x29
> +	#define HYPERV_DBG_ASSERT_VECTOR	0x2C
> +	#define HYPERV_DBG_SERVICE_VECTOR	0x2D
> +
> +	if (cpu_feature_enabled(X86_FEATURE_FRED))
> +		return;
> +
> +	if (test_and_set_bit(HYPERV_DBG_ASSERT_VECTOR, system_vectors) ||
> +	    test_and_set_bit(HYPERV_DBG_SERVICE_VECTOR, system_vectors) ||
> +	    test_and_set_bit(HYPERV_DBG_FASTFAIL_VECTOR, system_vectors))
> +		BUG();
> +
> +	pr_info("Hyper-V:reserve vectors: %d %d %d\n", HYPERV_DBG_ASSERT_VECTOR,
> +		HYPERV_DBG_SERVICE_VECTOR, HYPERV_DBG_FASTFAIL_VECTOR);

I'm a little late to the party here, but I've always seen Intel interrupt vectors
displayed as 2-digit hex numbers. This info message is displaying decimal,
which is atypical and will probably be confusing.

Michael

> +}
> +
>  static void __init ms_hyperv_init_platform(void)
>  {
>  	int hv_max_functions_eax, eax;
> @@ -510,6 +532,11 @@ static void __init ms_hyperv_init_platform(void)
> 
>  	hv_identify_partition_type();
> 
> +#ifndef CONFIG_X86_FRED
> +	if (hv_root_partition())
> +		hv_reserve_irq_vectors();
> +#endif	/* CONFIG_X86_FRED */
> +
>  	if (cc_platform_has(CC_ATTR_SNP_SECURE_AVIC))
>  		ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED;
> 
> --
> 2.51.2.vfs.0.1
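
For illustration only: a userspace sketch of the reservation pattern in the quoted patch, i.e. test-and-set each vector's bit in a bitmap and treat an already-set bit as a conflict. The bitmap, the sketch_* helpers and the -1 return value are assumptions made for this example; the real code uses the kernel's atomic test_and_set_bit() on system_vectors and calls BUG() on conflict.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_NR_VECTORS	256
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MAP_WORDS \
	((SKETCH_NR_VECTORS + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Vector numbers from the patch. */
#define HYPERV_DBG_FASTFAIL_VECTOR	0x29
#define HYPERV_DBG_ASSERT_VECTOR	0x2C
#define HYPERV_DBG_SERVICE_VECTOR	0x2D

/* Report whether bit nr is set in map. */
bool sketch_test_bit(unsigned int nr, const unsigned long *map)
{
	assert(nr < SKETCH_NR_VECTORS);
	return (map[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Non-atomic stand-in for test_and_set_bit(): set bit nr, return old value. */
bool sketch_test_and_set_bit(unsigned int nr, unsigned long *map)
{
	bool was_set = sketch_test_bit(nr, map);

	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
	return was_set;
}

/*
 * Reserve the three vectors. Returns 0 on success, or -1 as soon as one
 * of them is found already taken (the patch calls BUG() at that point).
 */
int sketch_reserve_hv_vectors(unsigned long *map)
{
	if (sketch_test_and_set_bit(HYPERV_DBG_ASSERT_VECTOR, map) ||
	    sketch_test_and_set_bit(HYPERV_DBG_SERVICE_VECTOR, map) ||
	    sketch_test_and_set_bit(HYPERV_DBG_FASTFAIL_VECTOR, map))
		return -1;

	return 0;
}
```

Calling sketch_reserve_hv_vectors() on a zeroed bitmap succeeds and leaves bits 0x29, 0x2C and 0x2D set; a second call on the same bitmap reports a conflict because the first bit it tests is already taken.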
> 



* RE: [PATCH] mshv: Replace fixed memory deposit with status driven helper
From: Michael Kelley @ 2026-02-20 17:05 UTC
  To: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com
  Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <177153896491.48883.14285093878498416061.stgit@skinsburskii-cloud-desktop.internal.cloudapp.net>

From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Thursday, February 19, 2026 2:10 PM
> 
> Replace hardcoded HV_MAP_GPA_DEPOSIT_PAGES usage with
> hv_deposit_memory() which derives the deposit size from
> the hypercall status, and remove the now-unused constant.
> 
> The previous code always deposited a fixed 256 pages on
> insufficient memory, ignoring the actual demand reported
> by the hypervisor.

Does the hypervisor report a specific page count demand? I haven't
seen that anywhere. It seems like the deposit memory operation is
always something of a guess.

> hv_deposit_memory() handles different
> deposit statuses, aligning map-GPA retries with the rest
> of the codebase.
> 
> This approach may require more allocation and deposit
> hypercall iterations, but avoids over-depositing large
> fixed chunks when fewer pages would suffice. Until any
> performance impact is measured, the more frugal and
> consistent behavior is preferred.
> 
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>

From a purely functional standpoint, this change addresses the
concern that I raised. But I don’t have any intuition on the performance
impact of having to iterate. hv_deposit_memory() adds only a single
page for some of the statuses, so if there really is a large memory need,
the new code would iterate 256 times to achieve what the existing code
does.
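
To make the arithmetic concrete, here is a minimal sketch under simplifying assumptions: the hypervisor needs a fixed number of pages, and every retry deposits the same number of pages. deposit_iterations() is a hypothetical helper for this illustration, not a kernel function, and the real hv_deposit_memory() may pick a different size per status.

```c
#include <assert.h>

/*
 * Hypothetical model: number of deposit-and-retry rounds needed to
 * satisfy pages_needed when each round deposits pages_per_deposit
 * pages. This is ceil(pages_needed / pages_per_deposit).
 */
unsigned long deposit_iterations(unsigned long pages_needed,
				 unsigned long pages_per_deposit)
{
	assert(pages_per_deposit > 0);

	return (pages_needed + pages_per_deposit - 1) / pages_per_deposit;
}
```

Under this model, deposit_iterations(256, 256) is 1 for the existing fixed-size deposit, while deposit_iterations(256, 1) is 256 when only a single page is added per round.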

Any idea where the 256 came from in the first place? Was that
empirically determined like some of the other memory deposit counts?

In addition to a potential performance impact, I know the hypervisor tries
to detect denial-of-service attempts that make "too many" calls to the
hypervisor in a short period of time. In such a case, the hypervisor
suspends scheduling the VM for a few seconds before allowing it to resume.
Just need to make sure the hypervisor doesn't think the iterating is a 
denial-of-service attack. Or maybe that denial-of-service detection
doesn't apply to the root partition VM.

But from a functional standpoint,
Reviewed-by: Michael Kelley <mhklinux@outlook.com>

> ---
>  drivers/hv/mshv_root_hv_call.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> index 7f91096f95a8..317191462b63 100644
> --- a/drivers/hv/mshv_root_hv_call.c
> +++ b/drivers/hv/mshv_root_hv_call.c
> @@ -16,7 +16,6 @@
> 
>  /* Determined empirically */
>  #define HV_INIT_PARTITION_DEPOSIT_PAGES 208
> -#define HV_MAP_GPA_DEPOSIT_PAGES	256
>  #define HV_UMAP_GPA_PAGES		512
> 
>  #define HV_PAGE_COUNT_2M_ALIGNED(pg_count) (!((pg_count) & (0x200 - 1)))
> @@ -239,8 +238,7 @@ static int hv_do_map_gpa_hcall(u64 partition_id, u64 gfn, u64 page_struct_count,
>  		completed = hv_repcomp(status);
> 
>  		if (hv_result_needs_memory(status)) {
> -			ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id,
> -						    HV_MAP_GPA_DEPOSIT_PAGES);
> +			ret = hv_deposit_memory(partition_id, status);
>  			if (ret)
>  				break;
> 
> 
> 



* [PATCH 1/1] Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
From: Michael Kelley @ 2026-02-20 16:40 UTC
  To: kys, haiyangz, wei.liu, decui, longli, linux-hyperv; +Cc: linux-kernel

From: Michael Kelley <mhklinux@outlook.com>

When checking for VMBus channel interrupts, current code always scans the
full SynIC receive interrupt bit array to get the relid of the
interrupting channels. The array has HV_EVENT_FLAGS_COUNT (2048) bits.
But VMs rarely have more than 100 channels, and the relid is typically
a small integer that is densely assigned by the Hyper-V host. It's
wasteful to scan 2048 bits when it is highly unlikely that anything will
be found past bit 100. The waste is double with Confidential VMBus because
there are two receive interrupt arrays that must be scanned: one for the
hypervisor SynIC and one for the paravisor SynIC.

Improve the scanning by tracking the largest relid that has been offered
by the Hyper-V host. Then when checking for VMBus channel interrupts, only
scan up to this high water mark.

When channels are rescinded, it's not worth the complexity to recalculate
the high water mark. Hyper-V tends to reuse the rescinded relids for any
new channels that are subsequently added, and the performance benefit of
exactly tracking the high water mark would be minimal.

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
---
 drivers/hv/channel_mgmt.c | 16 ++++++++++++----
 drivers/hv/hyperv_vmbus.h |  3 ++-
 drivers/hv/vmbus_drv.c    |  7 +------
 3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 74fed2c073d4..61f7dffd0f50 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -384,8 +384,18 @@ static void free_channel(struct vmbus_channel *channel)
 
 void vmbus_channel_map_relid(struct vmbus_channel *channel)
 {
-	if (WARN_ON(channel->offermsg.child_relid >= MAX_CHANNEL_RELIDS))
+	u32 new_relid = channel->offermsg.child_relid;
+
+	if (WARN_ON(new_relid >= MAX_CHANNEL_RELIDS))
 		return;
+
+	/*
+	 * This function is always called in the tasklet for the connect CPU.
+	 * So updating the relid hiwater mark does not need to be atomic.
+	 */
+	if (new_relid > READ_ONCE(vmbus_connection.relid_hiwater))
+		WRITE_ONCE(vmbus_connection.relid_hiwater, new_relid);
+
 	/*
 	 * The mapping of the channel's relid is visible from the CPUs that
 	 * execute vmbus_chan_sched() by the time that vmbus_chan_sched() will
@@ -411,9 +421,7 @@ void vmbus_channel_map_relid(struct vmbus_channel *channel)
 	 *      of the VMBus driver and vmbus_chan_sched() can not run before
 	 *      vmbus_bus_resume() has completed execution (cf. resume_noirq).
 	 */
-	virt_store_mb(
-		vmbus_connection.channels[channel->offermsg.child_relid],
-		channel);
+	virt_store_mb(vmbus_connection.channels[new_relid], channel);
 }
 
 void vmbus_channel_unmap_relid(struct vmbus_channel *channel)
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index 7bd8f8486e85..2c90c81a3b0f 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -276,8 +276,9 @@ struct vmbus_connection {
 	struct list_head chn_list;
 	struct mutex channel_mutex;
 
-	/* Array of channels */
+	/* Array of channel pointers, indexed by relid */
 	struct vmbus_channel **channels;
+	u32 relid_hiwater;
 
 	/*
 	 * An offer message is handled first on the work_queue, and then
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 3e7a52918ce0..a96da105b593 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1258,17 +1258,12 @@ static void vmbus_chan_sched(void *event_page_addr)
 		return;
 	event = (union hv_synic_event_flags *)event_page_addr + VMBUS_MESSAGE_SINT;
 
-	maxbits = HV_EVENT_FLAGS_COUNT;
+	maxbits = READ_ONCE(vmbus_connection.relid_hiwater) + 1;
 	recv_int_page = event->flags;
 
 	if (unlikely(!recv_int_page))
 		return;
 
-	/*
-	 * Suggested-by: Michael Kelley <mhklinux@outlook.com>
-	 * One possible optimization would be to keep track of the largest relID that's in use,
-	 * and only scan up to that relID.
-	 */
 	for_each_set_bit(relid, recv_int_page, maxbits) {
 		void (*callback_fn)(void *context);
 		struct vmbus_channel *channel;
-- 
2.25.1
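
For illustration only: a userspace sketch of the high water mark idea described in the commit message above. It tracks the largest relid offered and limits the bit scan to that bound. The sketch_* names, the plain struct and the simple bit loop are assumptions for this example; the patch itself uses READ_ONCE()/WRITE_ONCE() on vmbus_connection.relid_hiwater and for_each_set_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* 2048 matches HV_EVENT_FLAGS_COUNT as stated in the commit message. */
#define SKETCH_EVENT_FLAGS_COUNT	2048
#define SKETCH_BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_FLAG_WORDS \
	(SKETCH_EVENT_FLAGS_COUNT / SKETCH_BITS_PER_LONG)

/* Stand-in for the one field the patch adds to struct vmbus_connection. */
struct sketch_connection {
	unsigned int relid_hiwater;
};

/* Record a newly offered relid; the mark only ever moves upward. */
void sketch_map_relid(struct sketch_connection *c, unsigned int relid)
{
	assert(relid < SKETCH_EVENT_FLAGS_COUNT);

	if (relid > c->relid_hiwater)
		c->relid_hiwater = relid;
}

/* Mark a relid as having a pending interrupt. */
void sketch_set_flag(unsigned long *flags, unsigned int relid)
{
	assert(relid < SKETCH_EVENT_FLAGS_COUNT);

	flags[relid / SKETCH_BITS_PER_LONG] |=
		1UL << (relid % SKETCH_BITS_PER_LONG);
}

/*
 * Store the set bits in [0, maxbits) into out[], in ascending order.
 * Returns the number of entries stored (at most out_len).
 */
size_t sketch_scan(const unsigned long *flags, unsigned int maxbits,
		   unsigned int *out, size_t out_len)
{
	size_t n = 0;

	for (unsigned int bit = 0; bit < maxbits && n < out_len; bit++) {
		if (flags[bit / SKETCH_BITS_PER_LONG] &
		    (1UL << (bit % SKETCH_BITS_PER_LONG)))
			out[n++] = bit;
	}
	return n;
}
```

A caller would pass relid_hiwater + 1 as maxbits, mirroring the patch. After relids 3, 17 and 9 have been offered, the scan covers 18 bits instead of 2048.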



* Re: [PATCH v4 00/21] paravirt: cleanup and reorg
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Jürgen Groß <jgross@suse.com>
  Cc: linux-riscv, linux-kernel, x86, linux-hyperv, virtualization,
	loongarch, linuxppc-dev, kvm, luto, tglx, mingo, bp, dave.hansen,
	hpa, kys, haiyangz, wei.liu, decui, peterz, will, boqun.feng,
	longman, jikos, jpoimboe, pawan.kumar.gupta, boris.ostrovsky,
	xen-devel, ajay.kaher, alexey.makhalov, bcm-kernel-feedback-list,
	linux, catalin.marinas, chenhuacai, kernel, maddy, mpe, npiggin,
	christophe.leroy, pjw, palmer, aou, alex, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, linux-arm-kernel, pbonzini, vkuznets, sstabellini,
	oleksandr_tyshchenko, daniel.lezcano, oleg
In-Reply-To: <20251127070844.21919-1-jgross@suse.com>

Hello:

This series was applied to riscv/linux.git (fixes)
by Borislav Petkov (AMD) <bp@alien8.de>:

On Thu, 27 Nov 2025 08:08:23 +0100 you wrote:
> Some cleanups and reorg of paravirt code and headers:
> 
> - The first 2 patches should be not controversial at all, as they
>   remove just some no longer needed #include and struct forward
>   declarations.
> 
> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has
>   no real value, as it just changes a crash to a BUG() (the stack
>   trace will basically be the same). As the maintainer of the main
>   paravirt user (Xen) I have never seen this crash/BUG() to happen.
> 
> [...]

Here is the summary with links:
  - [v4,05/21] paravirt: Remove asm/paravirt_api_clock.h
    https://git.kernel.org/riscv/c/68b10fd40d49
  - [v4,06/21] sched: Move clock related paravirt code to kernel/sched
    (no matching commit)
  - [v4,10/21] riscv/paravirt: Use common code for paravirt_steal_clock()
    https://git.kernel.org/riscv/c/ee9ffcf99f07

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v3 00/21] paravirt: cleanup and reorg
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Jürgen Groß <jgross@suse.com>
  Cc: linux-riscv, linux-kernel, x86, linux-hyperv, virtualization,
	loongarch, linuxppc-dev, kvm, luto, tglx, mingo, bp, dave.hansen,
	hpa, kys, haiyangz, wei.liu, decui, peterz, will, boqun.feng,
	longman, jikos, jpoimboe, pawan.kumar.gupta, boris.ostrovsky,
	xen-devel, ajay.kaher, alexey.makhalov, bcm-kernel-feedback-list,
	linux, catalin.marinas, chenhuacai, kernel, maddy, mpe, npiggin,
	christophe.leroy, pjw, palmer, aou, alex, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, linux-arm-kernel, pbonzini, vkuznets, sstabellini,
	oleksandr_tyshchenko, daniel.lezcano, oleg
In-Reply-To: <20251006074606.1266-1-jgross@suse.com>

Hello:

This series was applied to riscv/linux.git (fixes)
by Borislav Petkov (AMD) <bp@alien8.de>:

On Mon,  6 Oct 2025 09:45:45 +0200 you wrote:
> Some cleanups and reorg of paravirt code and headers:
> 
> - The first 2 patches should be not controversial at all, as they
>   remove just some no longer needed #include and struct forward
>   declarations.
> 
> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has
>   no real value, as it just changes a crash to a BUG() (the stack
>   trace will basically be the same). As the maintainer of the main
>   paravirt user (Xen) I have never seen this crash/BUG() to happen.
> 
> [...]

Here is the summary with links:
  - [v3,05/21] paravirt: Remove asm/paravirt_api_clock.h
    https://git.kernel.org/riscv/c/68b10fd40d49
  - [v3,06/21] sched: Move clock related paravirt code to kernel/sched
    (no matching commit)
  - [v3,10/21] riscv/paravirt: Use common code for paravirt_steal_clock()
    https://git.kernel.org/riscv/c/ee9ffcf99f07

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v2 00/21] paravirt: cleanup and reorg
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Jürgen Groß <jgross@suse.com>
  Cc: linux-riscv, linux-kernel, x86, linux-hyperv, virtualization,
	loongarch, linuxppc-dev, kvm, luto, tglx, mingo, bp, dave.hansen,
	hpa, kys, haiyangz, wei.liu, decui, peterz, will, boqun.feng,
	longman, jikos, jpoimboe, pawan.kumar.gupta, boris.ostrovsky,
	xen-devel, ajay.kaher, alexey.makhalov, bcm-kernel-feedback-list,
	linux, catalin.marinas, chenhuacai, kernel, maddy, mpe, npiggin,
	christophe.leroy, paul.walmsley, palmer, aou, alex, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, linux-arm-kernel, pbonzini, vkuznets, sstabellini,
	oleksandr_tyshchenko, daniel.lezcano, oleg
In-Reply-To: <20250917145220.31064-1-jgross@suse.com>

Hello:

This series was applied to riscv/linux.git (fixes)
by Borislav Petkov (AMD) <bp@alien8.de>:

On Wed, 17 Sep 2025 16:51:59 +0200 you wrote:
> Some cleanups and reorg of paravirt code and headers:
> 
> - The first 2 patches should be not controversial at all, as they
>   remove just some no longer needed #include and struct forward
>   declarations.
> 
> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has
>   no real value, as it just changes a crash to a BUG() (the stack
>   trace will basically be the same). As the maintainer of the main
>   paravirt user (Xen) I have never seen this crash/BUG() to happen.
> 
> [...]

Here is the summary with links:
  - [v2,05/21] paravirt: Remove asm/paravirt_api_clock.h
    https://git.kernel.org/riscv/c/68b10fd40d49
  - [v2,06/21] sched: Move clock related paravirt code to kernel/sched
    (no matching commit)
  - [v2,10/21] riscv/paravirt: Use common code for paravirt_steal_clock()
    https://git.kernel.org/riscv/c/ee9ffcf99f07

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH 00/14] paravirt: cleanup and reorg
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Jürgen Groß <jgross@suse.com>
  Cc: linux-riscv, linux-kernel, x86, linux-hyperv, virtualization,
	loongarch, linuxppc-dev, kvm, luto, tglx, mingo, bp, dave.hansen,
	hpa, kys, haiyangz, wei.liu, decui, peterz, will, boqun.feng,
	longman, jikos, jpoimboe, pawan.kumar.gupta, boris.ostrovsky,
	xen-devel, ajay.kaher, alexey.makhalov, bcm-kernel-feedback-list,
	linux, catalin.marinas, chenhuacai, kernel, maddy, mpe, npiggin,
	christophe.leroy, paul.walmsley, palmer, aou, alex, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, linux-arm-kernel, pbonzini, vkuznets, sstabellini,
	oleksandr_tyshchenko, daniel.lezcano
In-Reply-To: <20250911063433.13783-1-jgross@suse.com>

Hello:

This series was applied to riscv/linux.git (fixes)
by Borislav Petkov (AMD) <bp@alien8.de>:

On Thu, 11 Sep 2025 08:34:19 +0200 you wrote:
> Some cleanups and reorg of paravirt code and headers:
> 
> - The first 2 patches should be not controversial at all, as they
>   remove just some no longer needed #include and struct forward
>   declarations.
> 
> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has
>   no real value, as it just changes a crash to a BUG() (the stack
>   trace will basically be the same). As the maintainer of the main
>   paravirt user (Xen) I have never seen this crash/BUG() to happen.
> 
> [...]

Here is the summary with links:
  - [05/14] paravirt: remove asm/paravirt_api_clock.h
    https://git.kernel.org/riscv/c/68b10fd40d49
  - [06/14] sched: move clock related paravirt code to kernel/sched
    (no matching commit)
  - [10/14] riscv/paravirt: use common code for paravirt_steal_clock()
    https://git.kernel.org/riscv/c/ee9ffcf99f07

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v5 00/21] paravirt: cleanup and reorg
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Jürgen Groß <jgross@suse.com>
  Cc: linux-riscv, linux-kernel, x86, linux-hyperv, virtualization,
	loongarch, linuxppc-dev, kvm, luto, tglx, mingo, bp, dave.hansen,
	hpa, kys, haiyangz, wei.liu, decui, longli, peterz, will,
	boqun.feng, longman, jikos, jpoimboe, pawan.kumar.gupta,
	boris.ostrovsky, xen-devel, ajay.kaher, alexey.makhalov,
	bcm-kernel-feedback-list, linux, catalin.marinas, chenhuacai,
	kernel, maddy, mpe, npiggin, chleroy, pjw, palmer, aou, alex,
	juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, linux-arm-kernel, pbonzini, vkuznets,
	sstabellini, oleksandr_tyshchenko, daniel.lezcano, oleg
In-Reply-To: <20260105110520.21356-1-jgross@suse.com>

Hello:

This series was applied to riscv/linux.git (fixes)
by Borislav Petkov (AMD) <bp@alien8.de>:

On Mon,  5 Jan 2026 12:04:59 +0100 you wrote:
> Some cleanups and reorg of paravirt code and headers:
> 
> - The first 2 patches should be not controversial at all, as they
>   remove just some no longer needed #include and struct forward
>   declarations.
> 
> - The 3rd patch is removing CONFIG_PARAVIRT_DEBUG, which IMO has
>   no real value, as it just changes a crash to a BUG() (the stack
>   trace will basically be the same). As the maintainer of the main
>   paravirt user (Xen) I have never seen this crash/BUG() to happen.
> 
> [...]

Here is the summary with links:
  - [v5,05/21] paravirt: Remove asm/paravirt_api_clock.h
    https://git.kernel.org/riscv/c/68b10fd40d49
  - [v5,06/21] sched: Move clock related paravirt code to kernel/sched
    https://git.kernel.org/riscv/c/e6b2aa6d4004
  - [v5,10/21] riscv/paravirt: Use common code for paravirt_steal_clock()
    https://git.kernel.org/riscv/c/ee9ffcf99f07

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v3 0/9] arch,sysfb,efi: Support EDID on non-x86 EFI systems
From: patchwork-bot+linux-riscv @ 2026-02-20  4:10 UTC
  To: Thomas Zimmermann
  Cc: linux-riscv, ardb, javierm, arnd, richard.lyu, helgaas, x86,
	linux-arm-kernel, linux-kernel, linux-efi, loongarch, dri-devel,
	linux-hyperv, linux-pci, linux-fbdev
In-Reply-To: <20251126160854.553077-1-tzimmermann@suse.de>

Hello:

This series was applied to riscv/linux.git (fixes)
by Ard Biesheuvel <ardb@kernel.org>:

On Wed, 26 Nov 2025 17:03:17 +0100 you wrote:
> Replace screen_info and edid_info with sysfb_primary_display of type
> struct sysfb_display_info. Update all users. Then implement EDID support
> in the kernel EFI code.
> 
> Sysfb DRM drivers currently fetch the global edid_info directly, when
> they should get that information together with the screen_info from their
> device. Wrapping screen_info and edid_info in sysfb_primary_display and
> passing this to drivers enables this.
> 
> [...]

Here is the summary with links:
  - [v3,1/9] efi: earlycon: Reduce number of references to global screen_info
    https://git.kernel.org/riscv/c/b868070fbc02
  - [v3,2/9] efi: sysfb_efi: Reduce number of references to global screen_info
    (no matching commit)
  - [v3,3/9] sysfb: Add struct sysfb_display_info
    https://git.kernel.org/riscv/c/b945922619b7
  - [v3,4/9] sysfb: Replace screen_info with sysfb_primary_display
    (no matching commit)
  - [v3,5/9] sysfb: Pass sysfb_primary_display to devices
    https://git.kernel.org/riscv/c/08e583ad6857
  - [v3,6/9] sysfb: Move edid_info into sysfb_primary_display
    https://git.kernel.org/riscv/c/4fcae6358871
  - [v3,7/9] efi: Refactor init_primary_display() helpers
    (no matching commit)
  - [v3,8/9] efi: Support EDID information
    (no matching commit)
  - [v3,9/9] efi: libstub: Simplify interfaces for primary_display
    (no matching commit)

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH 2/2] mshv: Add kexec blocking support
From: Stanislav Kinsburskii @ 2026-02-19 22:16 UTC
  To: Mukesh R
  Cc: rppt, akpm, bhe, kys, haiyangz, wei.liu, decui, longli, kexec,
	linux-hyperv, linux-kernel
In-Reply-To: <32c4bc2a-5dd1-c54d-a089-45bfad6eec94@linux.microsoft.com>

On Thu, Feb 12, 2026 at 02:11:13PM -0800, Mukesh R wrote:
> On 1/28/26 09:42, Stanislav Kinsburskii wrote:
> > Add kexec notifier to prevent kexec when VMs are active or memory
> > is deposited. The notifier blocks kexec operations if:
> > - Active VMs exist in the partition table
> > - Pages are still deposited to the hypervisor
> > 
> > The kernel cannot access hypervisor deposited pages: any access
> > triggers a GPF. Until the deposited page state can be handed over
> > to the next kernel, kexec must be blocked if there is any shared
> > state between kernel and hypervisor.
> > 
> > For L1 host virtualization, attempt to withdraw all deposited memory before
> > allowing kexec to proceed. If withdrawal fails or pages remain deposited
> > block the kexec operation.
> > 
> > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > ---
> >   drivers/hv/Makefile            |    1 +
> >   drivers/hv/hv_proc.c           |    4 ++
> >   drivers/hv/mshv_kexec.c        |   66 ++++++++++++++++++++++++++++++++++++++++
> >   drivers/hv/mshv_root.h         |   14 ++++++++
> >   drivers/hv/mshv_root_hv_call.c |    2 +
> >   drivers/hv/mshv_root_main.c    |    7 ++++
> >   6 files changed, 94 insertions(+)
> >   create mode 100644 drivers/hv/mshv_kexec.c
> > 
> > diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
> > index a49f93c2d245..bb72be5cc525 100644
> > --- a/drivers/hv/Makefile
> > +++ b/drivers/hv/Makefile
> > @@ -15,6 +15,7 @@ hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
> >   hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o
> >   mshv_root-y := mshv_root_main.o mshv_synic.o mshv_eventfd.o mshv_irq.o \
> >   	       mshv_root_hv_call.o mshv_portid_table.o mshv_regions.o
> > +mshv_root-$(CONFIG_KEXEC) += mshv_kexec.o
> >   mshv_vtl-y := mshv_vtl_main.o
> >   # Code that must be built-in
> > diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> > index 89870c1b0087..39bbbedb0340 100644
> > --- a/drivers/hv/hv_proc.c
> > +++ b/drivers/hv/hv_proc.c
> > @@ -15,6 +15,8 @@
> >    */
> >   #define HV_DEPOSIT_MAX (HV_HYP_PAGE_SIZE / sizeof(u64) - 1)
> > +atomic_t hv_pages_deposited;
> > +
> >   /* Deposits exact number of pages. Must be called with interrupts enabled.  */
> >   int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> >   {
> > @@ -93,6 +95,8 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> >   		goto err_free_allocations;
> >   	}
> > +	atomic_add(page_count, &hv_pages_deposited);
> > +
> >   	ret = 0;
> >   	goto free_buf;
> > diff --git a/drivers/hv/mshv_kexec.c b/drivers/hv/mshv_kexec.c
> > new file mode 100644
> > index 000000000000..5222b2e4ff97
> > --- /dev/null
> > +++ b/drivers/hv/mshv_kexec.c
> > @@ -0,0 +1,66 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (c) 2026, Microsoft Corporation.
> > + *
> > + * Live update orchestration management for mshv_root module.
> > + *
> > + * Author: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > + */
> > +
> > +#include <linux/kexec.h>
> > +#include <linux/notifier.h>
> > +#include <asm/mshyperv.h>
> > +#include "mshv_root.h"
> > +
> > +static BLOCKING_NOTIFIER_HEAD(overlay_notify_chain);
> > +
> > +static int mshv_block_kexec_notify(struct notifier_block *nb,
> > +				   unsigned long action, void *arg)
> > +{
> > +	if (!hash_empty(mshv_root.pt_htable)) {
> > +		pr_warn("mshv: Cannot perform kexec while VMs are active\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	if (hv_l1vh_partition()) {
> > +		int err;
> > +
> > +		/* Attempt to withdraw all the deposited pages */
> > +		err = hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE,
> > +					      hv_current_partition_id);
> > +		if (err) {
> > +			pr_err("mshv: Failed to withdraw memory from L1 virtualization: %d\n",
> > +			       err);
> > +			return err;
> > +		}
> > +	}
> > +
> > +	if (atomic_read(&hv_pages_deposited)) {
> > +		pr_warn("mshv: Cannot perform kexec while pages are deposited\n");
> > +		return -EBUSY;
> > +	}
> > +	return 0;
> > +}
> > +
> 
> What guarantees another deposit won't happen after this? Are all CPUs
> "locked" in kexec path and not doing anything at this point?
> 

Yeah, this should be guarded.

Thanks,
Stanislav
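
As an illustration only (not the posted fix), the guard could be modeled
in userspace like this: the deposit path and the kexec check share one
lock, and a successful check sets a flag that refuses later deposits.
All names here (deposit_lock, deposits_frozen, model_deposit_pages,
model_withdraw_pages, model_kexec_check) are hypothetical, and a pthread
mutex stands in for whatever kernel primitive would actually be chosen.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
static pthread_mutex_t deposit_lock = PTHREAD_MUTEX_INITIALIZER;
static long pages_deposited;	/* protected by deposit_lock */
static bool deposits_frozen;	/* set once the kexec check has passed */

/* Deposit path: refuses new deposits after the kexec check succeeded. */
static int model_deposit_pages(long num_pages)
{
	int ret = 0;

	pthread_mutex_lock(&deposit_lock);
	if (deposits_frozen)
		ret = -EBUSY;
	else
		pages_deposited += num_pages;
	pthread_mutex_unlock(&deposit_lock);
	return ret;
}

/* Withdraw path: only ever lowers the count, so it is always allowed. */
static void model_withdraw_pages(long num_pages)
{
	pthread_mutex_lock(&deposit_lock);
	assert(num_pages <= pages_deposited);
	pages_deposited -= num_pages;
	pthread_mutex_unlock(&deposit_lock);
}

/*
 * Kexec check: the "no pages deposited" test and the freeze happen under
 * the same lock, so no deposit can slip in between them.
 */
static int model_kexec_check(void)
{
	int ret = 0;

	pthread_mutex_lock(&deposit_lock);
	if (pages_deposited)
		ret = -EBUSY;
	else
		deposits_frozen = true;
	pthread_mutex_unlock(&deposit_lock);
	return ret;
}
```

In this model a kexec that is later aborted would also have to clear
deposits_frozen again; that path is left out of the sketch.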

> Thanks,
> -Mukesh
> 
> 
> 
> > +static struct notifier_block mshv_kexec_notifier = {
> > +	.notifier_call = mshv_block_kexec_notify,
> > +};
> > +
> > +int __init mshv_kexec_init(void)
> > +{
> > +	int err;
> > +
> > +	err = kexec_block_notifier_register(&mshv_kexec_notifier);
> > +	if (err) {
> > +		pr_err("mshv: Could not register kexec notifier: %pe\n",
> > +		       ERR_PTR(err));
> > +		return err;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +void __exit mshv_kexec_exit(void)
> > +{
> > +	(void)kexec_block_notifier_unregister(&mshv_kexec_notifier);
> > +}
> > diff --git a/drivers/hv/mshv_root.h b/drivers/hv/mshv_root.h
> > index 3c1d88b36741..311f76262d10 100644
> > --- a/drivers/hv/mshv_root.h
> > +++ b/drivers/hv/mshv_root.h
> > @@ -17,6 +17,7 @@
> >   #include <linux/build_bug.h>
> >   #include <linux/mmu_notifier.h>
> >   #include <uapi/linux/mshv.h>
> > +#include <hyperv/hvhdk.h>
> >   /*
> >    * Hypervisor must be between these version numbers (inclusive)
> > @@ -319,6 +320,7 @@ int hv_call_get_partition_property_ex(u64 partition_id, u64 property_code, u64 a
> >   extern struct mshv_root mshv_root;
> >   extern enum hv_scheduler_type hv_scheduler_type;
> >   extern u8 * __percpu *hv_synic_eventring_tail;
> > +extern atomic_t hv_pages_deposited;
> >   struct mshv_mem_region *mshv_region_create(u64 guest_pfn, u64 nr_pages,
> >   					   u64 uaddr, u32 flags);
> > @@ -333,4 +335,16 @@ bool mshv_region_handle_gfn_fault(struct mshv_mem_region *region, u64 gfn);
> >   void mshv_region_movable_fini(struct mshv_mem_region *region);
> >   bool mshv_region_movable_init(struct mshv_mem_region *region);
> > +#if IS_ENABLED(CONFIG_KEXEC)
> > +int mshv_kexec_init(void);
> > +void mshv_kexec_exit(void);
> > +#else
> > +static inline int mshv_kexec_init(void)
> > +{
> > +	return 0;
> > +}
> > +
> > +static inline void mshv_kexec_exit(void) { }
> > +#endif
> > +
> >   #endif /* _MSHV_ROOT_H_ */
> > diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> > index 06f2bac8039d..4203af5190ee 100644
> > --- a/drivers/hv/mshv_root_hv_call.c
> > +++ b/drivers/hv/mshv_root_hv_call.c
> > @@ -73,6 +73,8 @@ int hv_call_withdraw_memory(u64 count, int node, u64 partition_id)
> >   		for (i = 0; i < completed; i++)
> >   			__free_page(pfn_to_page(output_page->gpa_page_list[i]));
> > +		atomic_sub(completed, &hv_pages_deposited);
> > +
> >   		if (!hv_result_success(status)) {
> >   			if (hv_result(status) == HV_STATUS_NO_RESOURCES)
> >   				status = HV_STATUS_SUCCESS;
> > diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> > index 5fc572e31cd7..d55aa69d130c 100644
> > --- a/drivers/hv/mshv_root_main.c
> > +++ b/drivers/hv/mshv_root_main.c
> > @@ -2330,6 +2330,10 @@ static int __init mshv_parent_partition_init(void)
> >   	if (ret)
> >   		goto deinit_root_scheduler;
> > +	ret = mshv_kexec_init();
> > +	if (ret)
> > +		goto deinit_irqfd_wq;
> > +
> >   	spin_lock_init(&mshv_root.pt_ht_lock);
> >   	hash_init(mshv_root.pt_htable);
> > @@ -2337,6 +2341,8 @@ static int __init mshv_parent_partition_init(void)
> >   	return 0;
> > +deinit_irqfd_wq:
> > +	mshv_irqfd_wq_cleanup();
> >   deinit_root_scheduler:
> >   	root_scheduler_deinit();
> >   exit_partition:
> > @@ -2356,6 +2362,7 @@ static void __exit mshv_parent_partition_exit(void)
> >   	hv_setup_mshv_handler(NULL);
> >   	mshv_port_table_fini();
> >   	misc_deregister(&mshv_dev);
> > +	mshv_kexec_exit();
> >   	mshv_irqfd_wq_cleanup();
> >   	root_scheduler_deinit();
> >   	if (hv_root_partition())
> > 
> > 
> 

^ permalink raw reply

* Re: [PATCH 1/2] kexec: Add permission notifier chain for kexec operations
From: Stanislav Kinsburskii @ 2026-02-19 22:13 UTC (permalink / raw)
  To: Mukesh R
  Cc: rppt, akpm, bhe, kys, haiyangz, wei.liu, decui, longli, kexec,
	linux-hyperv, linux-kernel
In-Reply-To: <fb651abf-0546-3bef-bf8f-597f35ddc0d6@linux.microsoft.com>

On Thu, Feb 12, 2026 at 02:12:29PM -0800, Mukesh R wrote:
> On 1/28/26 09:42, Stanislav Kinsburskii wrote:
> > Add a blocking notifier chain to allow subsystems to be notified
> > before kexec execution. This enables modules to perform necessary
> > cleanup or validation before the system transitions to a new kernel or
> > block kexec if not possible under current conditions.
> > 
> > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > ---
> >   include/linux/kexec.h |    6 ++++++
> >   kernel/kexec_core.c   |   24 ++++++++++++++++++++++++
> >   2 files changed, 30 insertions(+)
> > 
> > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > index ff7e231b0485..311037d30f9e 100644
> > --- a/include/linux/kexec.h
> > +++ b/include/linux/kexec.h
> > @@ -35,6 +35,7 @@ extern note_buf_t __percpu *crash_notes;
> >   #include <linux/ioport.h>
> >   #include <linux/module.h>
> >   #include <linux/highmem.h>
> > +#include <linux/notifier.h>
> >   #include <asm/kexec.h>
> >   #include <linux/crash_core.h>
> > @@ -532,10 +533,13 @@ extern bool kexec_file_dbg_print;
> >   extern void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size);
> >   extern void kimage_unmap_segment(void *buffer);
> > +extern int kexec_block_notifier_register(struct notifier_block *nb);
> > +extern int kexec_block_notifier_unregister(struct notifier_block *nb);
> >   #else /* !CONFIG_KEXEC_CORE */
> >   struct pt_regs;
> >   struct task_struct;
> >   struct kimage;
> > +struct notifier_block;
> >   static inline void __crash_kexec(struct pt_regs *regs) { }
> >   static inline void crash_kexec(struct pt_regs *regs) { }
> >   static inline int kexec_should_crash(struct task_struct *p) { return 0; }
> > @@ -543,6 +547,8 @@ static inline int kexec_crash_loaded(void) { return 0; }
> >   static inline void *kimage_map_segment(struct kimage *image, unsigned long addr, unsigned long size)
> >   { return NULL; }
> >   static inline void kimage_unmap_segment(void *buffer) { }
> > +static inline int kexec_block_notifier_register(struct notifier_block *nb) { return 0; }
> > +static inline int kexec_block_notifier_unregister(struct notifier_block *nb) { return 0; }
> >   #define kexec_in_progress false
> >   #endif /* CONFIG_KEXEC_CORE */
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index 0f92acdd354d..1e86a6f175f0 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -57,6 +57,20 @@ bool kexec_in_progress = false;
> >   bool kexec_file_dbg_print;
> > +static BLOCKING_NOTIFIER_HEAD(kexec_block_list);
> > +
> > +int kexec_block_notifier_register(struct notifier_block *nb)
> > +{
> > +	return blocking_notifier_chain_register(&kexec_block_list, nb);
> > +}
> > +EXPORT_SYMBOL_GPL(kexec_block_notifier_register);
> > +
> > +int kexec_block_notifier_unregister(struct notifier_block *nb)
> > +{
> > +	return blocking_notifier_chain_unregister(&kexec_block_list, nb);
> > +}
> > +EXPORT_SYMBOL_GPL(kexec_block_notifier_unregister);
> > +
> >   /*
> >    * When kexec transitions to the new kernel there is a one-to-one
> >    * mapping between physical and virtual addresses.  On processors
> > @@ -1124,6 +1138,12 @@ bool kexec_load_permitted(int kexec_image_type)
> >   	return true;
> >   }
> > +static int kexec_check_blockers(void)
> > +{
> > +	/* Notify subsystems of impending kexec */
> > +	return blocking_notifier_call_chain(&kexec_block_list, 0, NULL);
> > +}
> > +
> >   /*
> >    * Move into place and start executing a preloaded standalone
> >    * executable.  If nothing was preloaded return an error.
> > @@ -1139,6 +1159,10 @@ int kernel_kexec(void)
> >   		goto Unlock;
> >   	}
> > +	error = kexec_check_blockers();
> 
> This could take a long time, and I am not sure if it's a good idea
> to stall kexec with such dependencies.
> 

Whether the call takes time should not matter. liveupdate_reboot()
already introduced the same semantics below.

Thanks,
Stanislav
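
To make the semantics concrete, here is a simplified userspace model of
a veto chain: every registered callback is asked in turn, and the first
non-zero result aborts the operation. This is a sketch only, not the
kernel's notifier implementation; the names veto_block, veto_register,
veto_check_all, allow_always and veto_if_busy are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified model of a veto chain; not the kernel's notifier code. */
struct veto_block {
	int (*check)(void *arg);	/* 0 to allow, negative errno to veto */
	struct veto_block *next;
};

static struct veto_block *veto_head;

/* Add a callback at the head of the chain. */
static void veto_register(struct veto_block *vb)
{
	vb->next = veto_head;
	veto_head = vb;
}

/* Walk the chain and stop at the first callback that vetoes. */
static int veto_check_all(void *arg)
{
	struct veto_block *vb;

	for (vb = veto_head; vb; vb = vb->next) {
		int err = vb->check(arg);

		if (err)
			return err;
	}
	return 0;
}

/* Example callbacks: one never objects, one objects while *arg is set. */
static int allow_always(void *arg)
{
	(void)arg;
	return 0;
}

static int veto_if_busy(void *arg)
{
	return *(int *)arg ? -EBUSY : 0;
}
```

With both callbacks registered, veto_check_all() returns -EBUSY while
the busy flag is set and 0 once it is cleared, which is the behavior the
caller in kernel_kexec() relies on to bail out early.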

> Thanks,
> -Mukesh
> 
> 
> > +	if (error)
> > +		goto Unlock;
> > +
> >   	error = liveupdate_reboot();
> >   	if (error)
> >   		goto Unlock;
> > 
> > 
> 

^ permalink raw reply

* [PATCH] mshv: Replace fixed memory deposit with status driven helper
From: Stanislav Kinsburskii @ 2026-02-19 22:09 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel

Replace hardcoded HV_MAP_GPA_DEPOSIT_PAGES usage with
hv_deposit_memory() which derives the deposit size from
the hypercall status, and remove the now-unused constant.

The previous code always deposited a fixed 256 pages on
insufficient memory, ignoring the actual demand reported
by the hypervisor. hv_deposit_memory() handles different
deposit statuses, aligning map-GPA retries with the rest
of the codebase.

This approach may require more allocation and deposit
hypercall iterations, but avoids over-depositing large
fixed chunks when fewer pages would suffice. Until any
performance impact is measured, the more frugal and
consistent behavior is preferred.

Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
 drivers/hv/mshv_root_hv_call.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
index 7f91096f95a8..317191462b63 100644
--- a/drivers/hv/mshv_root_hv_call.c
+++ b/drivers/hv/mshv_root_hv_call.c
@@ -16,7 +16,6 @@
 
 /* Determined empirically */
 #define HV_INIT_PARTITION_DEPOSIT_PAGES 208
-#define HV_MAP_GPA_DEPOSIT_PAGES	256
 #define HV_UMAP_GPA_PAGES		512
 
 #define HV_PAGE_COUNT_2M_ALIGNED(pg_count) (!((pg_count) & (0x200 - 1)))
@@ -239,8 +238,7 @@ static int hv_do_map_gpa_hcall(u64 partition_id, u64 gfn, u64 page_struct_count,
 		completed = hv_repcomp(status);
 
 		if (hv_result_needs_memory(status)) {
-			ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id,
-						    HV_MAP_GPA_DEPOSIT_PAGES);
+			ret = hv_deposit_memory(partition_id, status);
 			if (ret)
 				break;
 



^ permalink raw reply related

* Re: [PATCH v3 4/4] mshv: Handle insufficient root memory hypervisor statuses
From: Stanislav Kinsburskii @ 2026-02-19 21:46 UTC (permalink / raw)
  To: Wei Liu
  Cc: Michael Kelley, kys@microsoft.com, haiyangz@microsoft.com,
	decui@microsoft.com, longli@microsoft.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260219064701.GQ2236050@liuwe-devbox-debian-v2.local>

On Thu, Feb 19, 2026 at 06:47:01AM +0000, Wei Liu wrote:
> On Fri, Feb 06, 2026 at 06:54:55PM +0000, Michael Kelley wrote:
> > From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Thursday, February 5, 2026 10:42 AM
> > > To: kys@microsoft.com; haiyangz@microsoft.com; wei.liu@kernel.org;
> > > decui@microsoft.com; longli@microsoft.com
> > > Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: [PATCH v3 4/4] mshv: Handle insufficient root memory hypervisor statuses
> > > 
> > > When creating guest partition objects, the hypervisor may fail to
> > > allocate root partition pages and return an insufficient memory status.
> > > In this case, deposit memory using the root partition ID instead.
> > > 
> > > Note: This error should never occur in a guest of L1VH partition context.
> > > 
> > > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > > ---
> > >  drivers/hv/hv_common.c      |    2 +
> > >  drivers/hv/hv_proc.c        |   14 ++++++++++
> > >  include/hyperv/hvgdk_mini.h |   58 ++++++++++++++++++++++---------------------
> > >  3 files changed, 46 insertions(+), 28 deletions(-)
> > > 
> > > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> > > index f20596276662..6b67ac616789 100644
> > > --- a/drivers/hv/hv_common.c
> > > +++ b/drivers/hv/hv_common.c
> > > @@ -794,6 +794,8 @@ static const struct hv_status_info hv_status_infos[] = {
> > >  	_STATUS_INFO(HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE,	-EIO),
> > >  	_STATUS_INFO(HV_STATUS_INSUFFICIENT_MEMORY,		-ENOMEM),
> > >  	_STATUS_INFO(HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY,	-ENOMEM),
> > > +	_STATUS_INFO(HV_STATUS_INSUFFICIENT_ROOT_MEMORY,	-ENOMEM),
> > > +	_STATUS_INFO(HV_STATUS_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY, 	-ENOMEM),
> > >  	_STATUS_INFO(HV_STATUS_INVALID_PARTITION_ID,		-EINVAL),
> > >  	_STATUS_INFO(HV_STATUS_INVALID_VP_INDEX,		-EINVAL),
> > >  	_STATUS_INFO(HV_STATUS_NOT_FOUND,			-EIO),
> > > diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> > > index 181f6d02bce3..5f4fd9c3231c 100644
> > > --- a/drivers/hv/hv_proc.c
> > > +++ b/drivers/hv/hv_proc.c
> > > @@ -121,6 +121,18 @@ int hv_deposit_memory_node(int node, u64 partition_id,
> > >  	case HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY:
> > >  		num_pages = HV_MAX_CONTIGUOUS_ALLOCATION_PAGES;
> > >  		break;
> > > +
> > > +	case HV_STATUS_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY:
> > > +		num_pages = HV_MAX_CONTIGUOUS_ALLOCATION_PAGES;
> > > +		fallthrough;
> > > +	case HV_STATUS_INSUFFICIENT_ROOT_MEMORY:
> > > +		if (!hv_root_partition()) {
> > > +			hv_status_err(hv_status, "Unexpected root memory deposit\n");
> > > +			return -ENOMEM;
> > > +		}
> > > +		partition_id = HV_PARTITION_ID_SELF;
> > > +		break;
> > > +
> > 
> > Per the discussion in v1 of this patch set, if the number of pages that should be
> > deposited in a particular situation is different from what this function provides,
> > the fallback is to use hv_call_deposit_pages() directly. From what I see, there's
> > only one such fallback case after a hypercall failure -- in hv_do_map_gpa_hcall().
> > The other uses of hv_call_deposit_pages() are initial deposits when creating a
> > VP or partition.
> > 
> > But if hv_call_deposit_pages() is used directly, the logic added here to detect
> > insufficient root memory and deposit to HV_PARTITION_ID_SELF isn't applied.
> > So if the hypercall in hv_do_map_gpa_hcall() fails with insufficient root
> > memory, the deposit is done to the wrong partition ID. If that case can
> > actually happen, then some additional logic is needed in
> > hv_do_map_gpa_hcall() to handle it. Or there needs to be a fallback
> > function that contains the logic.
> 
> Stanislav, how about this comment? Please submit a follow-up patch if
> necessary.
> 

I'll submit a follow-up patch.

Thanks,
Stanislav
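
For reference, the selection logic under discussion can be modeled
outside the kernel roughly as below: the status decides both how many
pages to deposit and which partition receives them, so any caller that
bypasses it loses the redirect to the root partition. This is a sketch
under assumptions: the enum values, MODEL_PARTITION_ID_SELF,
MODEL_DEFAULT_PAGES and MODEL_MAX_CONTIG_PAGES are placeholders, not the
real Hyper-V constants, and model_plan_deposit() is a hypothetical name.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real constants live in the Hyper-V headers. */
enum model_status {
	MODEL_INSUFFICIENT_MEMORY = 1,
	MODEL_INSUFFICIENT_CONTIGUOUS_MEMORY,
	MODEL_INSUFFICIENT_ROOT_MEMORY,
	MODEL_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY,
};

#define MODEL_PARTITION_ID_SELF	UINT64_MAX	/* placeholder */
#define MODEL_DEFAULT_PAGES	1u		/* assumed */
#define MODEL_MAX_CONTIG_PAGES	8u		/* assumed */

struct deposit_plan {
	uint64_t partition_id;	/* partition that receives the pages */
	uint32_t num_pages;	/* how many pages to deposit */
};

/* Derive the deposit target and size from the failed hypercall status. */
static int model_plan_deposit(enum model_status status, uint64_t partition_id,
			      bool is_root, struct deposit_plan *plan)
{
	plan->partition_id = partition_id;
	plan->num_pages = MODEL_DEFAULT_PAGES;

	switch (status) {
	case MODEL_INSUFFICIENT_MEMORY:
		break;
	case MODEL_INSUFFICIENT_CONTIGUOUS_MEMORY:
		plan->num_pages = MODEL_MAX_CONTIG_PAGES;
		break;
	case MODEL_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY:
		plan->num_pages = MODEL_MAX_CONTIG_PAGES;
		/* fall through */
	case MODEL_INSUFFICIENT_ROOT_MEMORY:
		/* Root-memory statuses are only expected on the root. */
		if (!is_root)
			return -ENOMEM;
		plan->partition_id = MODEL_PARTITION_ID_SELF;
		break;
	default:
		return -ENOMEM;
	}
	return 0;
}
```

A caller that hardcodes the partition ID and page count, as the map-GPA
retry path did, never reaches the root-memory branch of this switch.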

> Wei
> 
> > 
> > Other than that, everything else in this patch set looks good to me.
> > 
> > Michael
> > 
> > >  	default:
> > >  		hv_status_err(hv_status, "Unexpected!\n");
> > >  		return -ENOMEM;

^ permalink raw reply

* Re: [PATCH rdma-next 42/50] RDMA/bnxt_re: Complete CQ resize in a single step
From: Selvin Xavier @ 2026-02-19  8:02 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, Kalesh AP, Potnuri Bharat Teja, Michael Margolin,
	Gal Pressman, Yossi Leybovich, Cheng Xu, Kai Shen,
	Chengchang Tang, Junxian Huang, Abhijit Gangurde, Allen Hubbe,
	Krzysztof Czurylo, Tatyana Nikolova, Long Li, Konstantin Taranov,
	Yishai Hadas, Michal Kalderon, Bryan Tan, Vishnu Dasa,
	Broadcom internal kernel review list, Christian Benvenuti,
	Nelson Escobar, Dennis Dalessandro, Bernard Metzler, Zhu Yanjun,
	linux-kernel, linux-rdma, linux-hyperv
In-Reply-To: <CA+sbYW2Bzis0E-pLKZ_j3T748YeB8Bt_zM_t2pzh09_TGoUnHA@mail.gmail.com>


On Tue, Feb 17, 2026 at 4:22 PM Selvin Xavier
<selvin.xavier@broadcom.com> wrote:
>
> On Tue, Feb 17, 2026 at 1:27 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Tue, Feb 17, 2026 at 10:32:25AM +0530, Selvin Xavier wrote:
> > > On Mon, Feb 16, 2026 at 1:37 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Mon, Feb 16, 2026 at 09:29:29AM +0530, Selvin Xavier wrote:
> > > > > On Fri, Feb 13, 2026 at 4:31 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > > >
> > > > > > From: Leon Romanovsky <leonro@nvidia.com>
> > > > > >
> > > > > > There is no need to defer the CQ resize operation, as it is intended to
> > > > > > be completed in one pass. The current bnxt_re_resize_cq() implementation
> > > > > > does not handle concurrent CQ resize requests, and this will be addressed
> > > > > > in the following patches.
> > > > > > bnxt HW requires that the previous CQ memory be available with the HW until
> > > > > > HW generates a cut off cqe on the CQ that is being destroyed. This is the
> > > > > > reason for polling the completions in the user library after the resize_cq
> > > > > > call returns. Once the polling thread sees the expected CQE, it will invoke
> > > > > > the driver to free CQ memory.
> > > >
> > > > This flow is problematic. It requires the kernel to trust a user‑space
> > > > application, which is not acceptable. There is no guarantee that the
> > > > rdma-core implementation is correct or will invoke the interface properly.
> > > > Users can bypass rdma-core entirely and issue ioctls directly (syzkaller,
> > > > custom rdma-core variants, etc.), leading to umem leaks, races that overwrite
> > > > kernel memory, and access to fields that are now being modified. All of this
> > > > can occur silently and without any protections.
> > > >
> > > > > So ib_umem_release should wait. This patch doesn't guarantee that.
> > > >
> > > > The issue is that it was never guaranteed in the first place. It only appeared
> > > > to work under very controlled conditions.
> > > >
> > > > > Do you think there is a better way to handle this requirement?
> > > >
> > > > You should wait for BNXT_RE_WC_TYPE_COFF in the kernel before returning
> > > > from resize_cq.
> > > The difficulty is that libbnxt_re in rdma-core has the queue and the
> > > consumer index used for completion lookup. The driver therefore has to
> > > use copy_from_user to read the queue memory and then check for
> > > BNXT_RE_WC_TYPE_COFF, along with the queue consumer index and the
> > > relevant validity flags. I’ll explore if we have a way to handle this
> > > and get back.
> >
> > The thing is that you need to ensure that after libbnxt_re issues the resize_cq
> > command, the kernel won't require anything from user-space.
> >
> > Can you make your HW stop generating CQEs before resize_cq?
> We don't have this control (especially on the Receive CQ side). For
> the Tx side, maybe we can prevent posting to the Tx queue.
After discussing with other teams internally, we feel that the sequence
you proposed should work fine. As per the sequence, BNXT_RE_WC_TYPE_COFF
should be available when the resize request is returned from FW.
We will test your series and confirm the above behavior.
> >
> > Thanks


^ permalink raw reply

* [GIT PULL] Hyper-V patches for 7.0
From: Wei Liu @ 2026-02-19  7:45 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Wei Liu, Linux on Hyper-V List, Linux Kernel List, kys, haiyangz,
	decui, longli

Hi Linus,

The following changes since commit 18f7fcd5e69a04df57b563360b88be72471d6b62:

  Linux 6.19-rc8 (2026-02-01 14:01:13 -0800)

are available in the Git repository at:

  ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git tags/hyperv-next-signed-20260218

for you to fetch changes up to 158ebb578cd5f7881fdc7c4ecebddcf9463f91fd:

  mshv: Handle insufficient root memory hypervisor statuses (2026-02-19 06:42:11 +0000)

----------------------------------------------------------------
hyperv-next for v7.0
  - Debugfs support for MSHV statistics (Nuno Das Neves)
  - Support for the integrated scheduler (Stanislav Kinsburskii)
  - Various fixes for MSHV memory management and hypervisor status
    handling (Stanislav Kinsburskii)
  - Expose more capabilities and flags for MSHV partition management
    (Anatol Belski, Muminul Islam, Magnus Kulke)
  - Miscellaneous fixes to improve code quality and stability (Carlos
    López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh
    Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros
    Bizjak) 
  - PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka)
----------------------------------------------------------------
Anatol Belski (1):
      mshv: Add SMT_ENABLED_GUEST partition creation flag

Carlos López (1):
      mshv: clear eventfd counter on irqfd shutdown

Ethan Nelson-Moore (1):
      PCI: hv: remove unnecessary module_init/exit functions

Ethan Tidmore (1):
      x86/hyperv: Fix error pointer dereference

Jan Kiszka (1):
      Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT

Li RongQing (1):
      mshv: fix SRCU protection in irqfd resampler ack handler

Magnus Kulke (1):
      mshv: expose the scrub partition hypercall

Michael Kelley (7):
      PCI: hv: Remove unused field pci_bus in struct hv_pcibus_device
      mshv: Fix compiler warning about cast converting incompatible function type
      mshv: Use EPOLLIN and EPOLLHUP instead of POLLIN and POLLHUP
      Drivers: hv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
      x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()
      x86/hyperv: Update comment in hyperv_cleanup()
      Drivers: hv: vmbus: Simplify allocation of vmbus_evt

Mukesh R (3):
      x86/hyperv: fix a compiler warning in hv_crash.c
      x86/hyperv: Move hv crash init after hypercall pg setup
      mshv: make field names descriptive in a header struct

Mukesh Rathor (1):
      x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV

Muminul Islam (1):
      mshv: Add nested virtualization creation flag

Nuno Das Neves (3):
      mshv: Update hv_stats_page definitions
      mshv: Add data for printing stats page counters
      mshv: Add debugfs to view hypervisor statistics

Purna Pavan Chandra Aekkaladevi (1):
      mshv: Ignore second stats page map result failure

Stanislav Kinsburskii (8):
      mshv: Use typed hv_stats_page pointers
      mshv: Improve mshv_vp_stats_map/unmap(), add them to mshv_root.h
      mshv: Always map child vp stats pages regardless of scheduler type
      mshv: Add support for integrated scheduler
      mshv: Introduce hv_result_needs_memory() helper function
      mshv: Introduce hv_deposit_memory helper functions
      mshv: Handle insufficient contiguous memory hypervisor status
      mshv: Handle insufficient root memory hypervisor statuses

Uros Bizjak (3):
      x86/hyperv: Use savesegment() instead of inline asm() to save segment registers
      x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn
      mshv: Use try_cmpxchg() instead of cmpxchg()

 arch/x86/hyperv/hv_crash.c               |   3 +-
 arch/x86/hyperv/hv_init.c                |  20 +-
 arch/x86/hyperv/hv_vtl.c                 |   8 +-
 arch/x86/hyperv/ivm.c                    |  11 +-
 arch/x86/kernel/cpu/mshyperv.c           |  25 ++
 drivers/hv/Makefile                      |   1 +
 drivers/hv/hv.c                          |  12 +-
 drivers/hv/hv_common.c                   |   3 +
 drivers/hv/hv_proc.c                     |  53 ++-
 drivers/hv/hyperv_vmbus.h                |   4 +-
 drivers/hv/mshv_debugfs.c                | 726 +++++++++++++++++++++++++++++++
 drivers/hv/mshv_debugfs_counters.c       | 490 +++++++++++++++++++++
 drivers/hv/mshv_eventfd.c                |  22 +-
 drivers/hv/mshv_eventfd.h                |   1 -
 drivers/hv/mshv_regions.c                |  60 +--
 drivers/hv/mshv_root.h                   |  59 ++-
 drivers/hv/mshv_root_hv_call.c           | 104 +++--
 drivers/hv/mshv_root_main.c              | 238 ++++++----
 drivers/hv/mshv_vtl_main.c               |   5 +-
 drivers/hv/vmbus_drv.c                   |  86 +++-
 drivers/pci/controller/pci-hyperv-intf.c |  12 -
 drivers/pci/controller/pci-hyperv.c      |   1 -
 include/asm-generic/mshyperv.h           |  13 +
 include/hyperv/hvgdk_mini.h              |  58 +--
 include/hyperv/hvhdk.h                   |   9 +
 include/hyperv/hvhdk_mini.h              |   9 +-
 include/uapi/linux/mshv.h                |   2 +
 27 files changed, 1775 insertions(+), 260 deletions(-)
 create mode 100644 drivers/hv/mshv_debugfs.c
 create mode 100644 drivers/hv/mshv_debugfs_counters.c

^ permalink raw reply

* Re: [PATCH v3 0/4] Improve Hyper-V memory deposit error handling
From: Wei Liu @ 2026-02-19  6:49 UTC (permalink / raw)
  To: Stanislav Kinsburskii
  Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <177031674698.186911.179832109354647364.stgit@skinsburskii-cloud-desktop.internal.cloudapp.net>

On Thu, Feb 05, 2026 at 06:42:04PM +0000, Stanislav Kinsburskii wrote:
> This series extends the MSHV driver to properly handle additional
> memory-related error codes from the Microsoft Hypervisor by depositing
> memory pages when needed.
> 
> Currently, when the hypervisor returns HV_STATUS_INSUFFICIENT_MEMORY
> during partition creation, the driver calls hv_call_deposit_pages() to
> provide the necessary memory. However, there are other memory-related
> error codes that indicate the hypervisor needs additional memory
> resources, but the driver does not attempt to deposit pages for these
> cases.
> 
> This series introduces a dedicated helper function to identify all
> memory-related error codes (HV_STATUS_INSUFFICIENT_MEMORY,
> HV_STATUS_INSUFFICIENT_BUFFERS, HV_STATUS_INSUFFICIENT_DEVICE_DOMAINS, and
> HV_STATUS_INSUFFICIENT_ROOT_MEMORY) and ensures the driver attempts to
> deposit pages for all of them via the new hv_deposit_memory() helper.
> 
> With these changes, partition creation becomes more robust by handling
> all scenarios where the hypervisor requires additional memory deposits.
> 
> v3:
> - Fix uninitialized num_pages variable in hv_deposit_memory_node() in case
>   of HV_STATUS_INSUFFICIENT_ROOT_MEMORY status
> 

I fixed a typo pointed out by Mukesh in the previous version, dropped
the note from the commit message in the last patch, and applied this to
hyperv-next.

Please address Michael's comment in patch four and send out a follow-up
patch if necessary.

Wei
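
The retry pattern the series describes can be sketched in userspace as
follows: classify the status, deposit if it is memory-related, and issue
the call again. This is an illustrative model only. The enum values and
every name in it (model_needs_memory, model_call_with_deposit, fake_hv,
fake_hcall, fake_deposit) are hypothetical stand-ins, not the kernel
helpers, and the fake hypervisor simply counts how many pages it still
lacks.

```c
#include <errno.h>
#include <stdbool.h>

/* Placeholder status codes for the model; not the real HV_STATUS_* values. */
enum model_hv_status {
	MODEL_SUCCESS = 0,
	MODEL_NEED_MEMORY,
	MODEL_NEED_BUFFERS,
	MODEL_NEED_DEVICE_DOMAINS,
	MODEL_NEED_ROOT_MEMORY,
	MODEL_INVALID_PARAMETER,
};

/* True for every status that a memory deposit may resolve. */
static bool model_needs_memory(enum model_hv_status s)
{
	switch (s) {
	case MODEL_NEED_MEMORY:
	case MODEL_NEED_BUFFERS:
	case MODEL_NEED_DEVICE_DOMAINS:
	case MODEL_NEED_ROOT_MEMORY:
		return true;
	default:
		return false;
	}
}

/* Issue the call; on a memory-related failure deposit and retry. */
static int model_call_with_deposit(enum model_hv_status (*hcall)(void *ctx),
				   int (*deposit)(void *ctx,
						  enum model_hv_status s),
				   void *ctx)
{
	for (;;) {
		enum model_hv_status s = hcall(ctx);
		int ret;

		if (s == MODEL_SUCCESS)
			return 0;
		if (!model_needs_memory(s))
			return -EIO;

		ret = deposit(ctx, s);
		if (ret)
			return ret;
	}
}

/* Fake hypervisor: fails until enough single-page deposits were made. */
struct fake_hv {
	int pages_short;
	int deposits;
};

static enum model_hv_status fake_hcall(void *ctx)
{
	struct fake_hv *hv = ctx;

	return hv->pages_short > 0 ? MODEL_NEED_MEMORY : MODEL_SUCCESS;
}

static int fake_deposit(void *ctx, enum model_hv_status s)
{
	struct fake_hv *hv = ctx;

	(void)s;
	hv->pages_short--;
	hv->deposits++;
	return 0;
}
```

Starting the fake with pages_short = 3 makes the loop deposit three
times before the call succeeds; a status outside the memory-related set
ends the loop immediately with an error instead of depositing.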

^ permalink raw reply

* Re: [PATCH v3 4/4] mshv: Handle insufficient root memory hypervisor statuses
From: Wei Liu @ 2026-02-19  6:47 UTC (permalink / raw)
  To: Michael Kelley
  Cc: Stanislav Kinsburskii, kys@microsoft.com, haiyangz@microsoft.com,
	wei.liu@kernel.org, decui@microsoft.com, longli@microsoft.com,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SN6PR02MB4157F28C4F4CEFB886CF949ED466A@SN6PR02MB4157.namprd02.prod.outlook.com>

On Fri, Feb 06, 2026 at 06:54:55PM +0000, Michael Kelley wrote:
> From: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Sent: Thursday, February 5, 2026 10:42 AM
> > To: kys@microsoft.com; haiyangz@microsoft.com; wei.liu@kernel.org;
> > decui@microsoft.com; longli@microsoft.com
> > Cc: linux-hyperv@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: [PATCH v3 4/4] mshv: Handle insufficient root memory hypervisor statuses
> > 
> > When creating guest partition objects, the hypervisor may fail to
> > allocate root partition pages and return an insufficient memory status.
> > In this case, deposit memory using the root partition ID instead.
> > 
> > Note: This error should never occur in a guest of L1VH partition context.
> > 
> > Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> > ---
> >  drivers/hv/hv_common.c      |    2 +
> >  drivers/hv/hv_proc.c        |   14 ++++++++++
> >  include/hyperv/hvgdk_mini.h |   58 ++++++++++++++++++++++---------------------
> >  3 files changed, 46 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> > index f20596276662..6b67ac616789 100644
> > --- a/drivers/hv/hv_common.c
> > +++ b/drivers/hv/hv_common.c
> > @@ -794,6 +794,8 @@ static const struct hv_status_info hv_status_infos[] = {
> >  	_STATUS_INFO(HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE,	-EIO),
> >  	_STATUS_INFO(HV_STATUS_INSUFFICIENT_MEMORY,		-ENOMEM),
> >  	_STATUS_INFO(HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY,	-ENOMEM),
> > +	_STATUS_INFO(HV_STATUS_INSUFFICIENT_ROOT_MEMORY,	-ENOMEM),
> > +	_STATUS_INFO(HV_STATUS_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY, 	-ENOMEM),
> >  	_STATUS_INFO(HV_STATUS_INVALID_PARTITION_ID,		-EINVAL),
> >  	_STATUS_INFO(HV_STATUS_INVALID_VP_INDEX,		-EINVAL),
> >  	_STATUS_INFO(HV_STATUS_NOT_FOUND,			-EIO),
> > diff --git a/drivers/hv/hv_proc.c b/drivers/hv/hv_proc.c
> > index 181f6d02bce3..5f4fd9c3231c 100644
> > --- a/drivers/hv/hv_proc.c
> > +++ b/drivers/hv/hv_proc.c
> > @@ -121,6 +121,18 @@ int hv_deposit_memory_node(int node, u64 partition_id,
> >  	case HV_STATUS_INSUFFICIENT_CONTIGUOUS_MEMORY:
> >  		num_pages = HV_MAX_CONTIGUOUS_ALLOCATION_PAGES;
> >  		break;
> > +
> > +	case HV_STATUS_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY:
> > +		num_pages = HV_MAX_CONTIGUOUS_ALLOCATION_PAGES;
> > +		fallthrough;
> > +	case HV_STATUS_INSUFFICIENT_ROOT_MEMORY:
> > +		if (!hv_root_partition()) {
> > +			hv_status_err(hv_status, "Unexpected root memory deposit\n");
> > +			return -ENOMEM;
> > +		}
> > +		partition_id = HV_PARTITION_ID_SELF;
> > +		break;
> > +
> 
> Per the discussion in v1 of this patch set, if the number of pages that should be
> deposited in a particular situation is different from what this function provides,
> the fallback is to use hv_call_deposit_pages() directly. From what I see, there's
> only one such fallback case after a hypercall failure -- in hv_do_map_gpa_hcall().
> The other uses of hv_call_deposit_pages() are initial deposits when creating a
> VP or partition.
> 
> But if hv_call_deposit_pages() is used directly, the logic added here to detect
> insufficient root memory and deposit to HV_PARTITION_ID_SELF isn't applied.
> So if the hypercall in hv_do_map_gpa_hcall() fails with insufficient root
> memory, the deposit is done to the wrong partition ID. If that case can
> actually happen, then some additional logic is needed in
> hv_do_map_gpa_hcall() to handle it. Or there needs to be a fallback
> function that contains the logic.

Stanislav, how about this comment? Please submit a follow-up patch if
necessary.

Wei

> 
> Other than that, everything else in this patch set looks good to me.
> 
> Michael
> 
> >  	default:
> >  		hv_status_err(hv_status, "Unexpected!\n");
> >  		return -ENOMEM;
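
The concern raised above, that a direct hv_call_deposit_pages() fallback
skips the root-memory redirection, could be addressed by a small shared
helper. The following is a minimal userspace sketch of that idea, not the
kernel code: the enum values, SKETCH_PARTITION_ID_SELF and
sketch_deposit_target() are hypothetical stand-ins invented for
illustration, and the status values are placeholders rather than the real
hypervisor ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values for illustration only; not the real hypervisor ABI. */
enum sketch_hv_status {
	SKETCH_INSUFFICIENT_MEMORY = 1,
	SKETCH_INSUFFICIENT_CONTIGUOUS_MEMORY,
	SKETCH_INSUFFICIENT_ROOT_MEMORY,
	SKETCH_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY,
};

/* Hypothetical stand-in for HV_PARTITION_ID_SELF. */
#define SKETCH_PARTITION_ID_SELF UINT64_MAX

/*
 * Choose the partition that should receive a deposit after a failed
 * hypercall. Returns 0 and stores the choice in *target. Returns -1 for
 * an unknown status, or for a root-memory status seen outside the root
 * partition.
 */
static int sketch_deposit_target(enum sketch_hv_status status, bool is_root,
				 uint64_t partition_id, uint64_t *target)
{
	assert(target != NULL);

	switch (status) {
	case SKETCH_INSUFFICIENT_MEMORY:
	case SKETCH_INSUFFICIENT_CONTIGUOUS_MEMORY:
		/* Ordinary shortage: deposit to the partition that was named. */
		*target = partition_id;
		return 0;
	case SKETCH_INSUFFICIENT_ROOT_MEMORY:
	case SKETCH_INSUFFICIENT_CONTIGUOUS_ROOT_MEMORY:
		if (!is_root)
			return -1;
		/* Root shortage: redirect the deposit to the caller itself. */
		*target = SKETCH_PARTITION_ID_SELF;
		return 0;
	default:
		return -1;
	}
}
```

With a helper of this shape, both the generic deposit path and a caller
that picks its own page count could resolve the target partition the same
way before depositing.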

^ permalink raw reply

* Re: [PATCH v3 4/4] mshv: Handle insufficient root memory hypervisor statuses
From: Wei Liu @ 2026-02-19  6:44 UTC (permalink / raw)
  To: Anirudh Rayabharam
  Cc: Stanislav Kinsburskii, kys, haiyangz, wei.liu, decui, longli,
	linux-hyperv, linux-kernel
In-Reply-To: <aYWCmVxnO8R3vsc-@anirudh-surface.localdomain>

On Fri, Feb 06, 2026 at 05:56:41AM +0000, Anirudh Rayabharam wrote:
> On Thu, Feb 05, 2026 at 06:42:27PM +0000, Stanislav Kinsburskii wrote:
> > When creating guest partition objects, the hypervisor may fail to
> > allocate root partition pages and return an insufficient memory status.
> > In this case, deposit memory using the root partition ID instead.
> > 
> > Note: This error should never occur in a guest of L1VH partition context.
> 
> I think you should rephrase this to:
> 
> "... should never occur in an L1VH partition"
> 
> because none of the errors in this patch series occur inside a guest. They
> either occur in L1VH or root or both.

I have dropped this note from the commit message. If anything, this
should be in the code so that it can be kept up to date.

Wei

^ permalink raw reply

* Re: [PATCH v3 00/16] x86/msr: Inline rdmsr/wrmsr instructions
From: Jürgen Groß @ 2026-02-19  6:28 UTC (permalink / raw)
  To: H. Peter Anvin, linux-kernel, x86, linux-coco, kvm, linux-hyperv,
	virtualization, llvm
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Kiryl Shutsemau, Rick Edgecombe, Sean Christopherson,
	Paolo Bonzini, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Vitaly Kuznetsov, Boris Ostrovsky, xen-devel,
	Ajay Kaher, Alexey Makhalov, Broadcom internal kernel review list,
	Andy Lutomirski, Peter Zijlstra, Xin Li, Nathan Chancellor,
	Nick Desaulniers, Bill Wendling, Justin Stitt, Josh Poimboeuf,
	andy.cooper
In-Reply-To: <3D1FE2A7-F237-4232-9E39-6AFC75F3A4F0@zytor.com>


On 18.02.26 21:37, H. Peter Anvin wrote:
> On February 18, 2026 12:21:17 AM PST, Juergen Gross <jgross@suse.com> wrote:
>> When building a kernel with CONFIG_PARAVIRT_XXL the paravirt
>> infrastructure will always use functions for reading or writing MSRs,
>> even when running on bare metal.
>>
>> Switch to inline RDMSR/WRMSR instructions in this case, reducing the
>> paravirt overhead.
>>
>> The first patch is a prerequisite fix for alternative patching. It
>> is needed because the initial indirect call needs to be padded with
>> NOPs in some cases with the following patches.
>>
>> In order to make this less intrusive, some further reorganization of
>> the MSR access helpers is done in the patches 1-6.
>>
>> The next 4 patches are converting the non-paravirt case to use direct
>> inlining of the MSR access instructions, including the WRMSRNS
>> instruction and the immediate variants of RDMSR and WRMSR if possible.
>>
>> Patches 11-13 are some further preparations for making the real switch
>> to directly patch in the native MSR instructions easier.
>>
>> Patch 14 is switching the paravirt MSR function interface from normal
>> call ABI to one more similar to the native MSR instructions.
>>
>> Patch 15 is a little cleanup patch.
>>
>> Patch 16 is the final step for patching in the native MSR instructions
>> when not running as a Xen PV guest.
>>
>> This series has been tested to work with Xen PV and on bare metal.
>>
>> Note that there is more room for improvement. This series is sent out
>> to get a first impression of what the code will basically look like.
> 
> Does that mean you are considering this patchset an RFC? If so, you should put that in the subject header.

It is one possible solution.

> 
>> Right now the same problem is solved differently for the paravirt and
>> the non-paravirt cases. In case this is not desired, there are two
>> possibilities to merge the two implementations. Both solutions have
>> the common idea to have rather similar code for paravirt and
>> non-paravirt variants, but just use a different main macro for
>> generating the respective code. For making the code of both possible
>> scenarios more similar, the following variants are possible:
>>
>> 1. Remove the micro-optimizations of the non-paravirt case, making
>>    it similar to the paravirt code in my series. This has the
>>    advantage of being simpler, but might have a very small
>>    negative performance impact (probably not really detectable).
>>
>> 2. Add the same micro-optimizations to the paravirt case, which
>>    requires enhancing paravirt patching to support a to-be-patched
>>    indirect call in the middle of the initial code snippet.
>>
>> In both cases the native MSR function variants would no longer be
>> usable in the paravirt case, but this would mostly affect Xen, as it
>> would need to open code the WRMSR/RDMSR instructions to be used
>> instead of the native_*msr*() functions.
>>
>> Changes since V2:
>> - switch back to the paravirt approach
>>
>> Changes since V1:
>> - Use Xin Li's approach for inlining
>> - Several new patches
>>
>> Juergen Gross (16):
>>   x86/alternative: Support alt_replace_call() with instructions after
>>     call
>>   coco/tdx: Rename MSR access helpers
>>   x86/sev: Replace call of native_wrmsr() with native_wrmsrq()
>>   KVM: x86: Remove the KVM private read_msr() function
>>   x86/msr: Minimize usage of native_*() msr access functions
>>   x86/msr: Move MSR trace calls one function level up
>>   x86/opcode: Add immediate form MSR instructions
>>   x86/extable: Add support for immediate form MSR instructions
>>   x86/msr: Use the alternatives mechanism for WRMSR
>>   x86/msr: Use the alternatives mechanism for RDMSR
>>   x86/alternatives: Add ALTERNATIVE_4()
>>   x86/paravirt: Split off MSR related hooks into new header
>>   x86/paravirt: Prepare support of MSR instruction interfaces
>>   x86/paravirt: Switch MSR access pv_ops functions to instruction
>>     interfaces
>>   x86/msr: Reduce number of low level MSR access helpers
>>   x86/paravirt: Use alternatives for MSR access with paravirt
>>
>> arch/x86/coco/sev/internal.h              |   7 +-
>> arch/x86/coco/tdx/tdx.c                   |   8 +-
>> arch/x86/hyperv/ivm.c                     |   2 +-
>> arch/x86/include/asm/alternative.h        |   6 +
>> arch/x86/include/asm/fred.h               |   2 +-
>> arch/x86/include/asm/kvm_host.h           |  10 -
>> arch/x86/include/asm/msr.h                | 345 ++++++++++++++++------
>> arch/x86/include/asm/paravirt-msr.h       | 148 ++++++++++
>> arch/x86/include/asm/paravirt.h           |  67 -----
>> arch/x86/include/asm/paravirt_types.h     |  57 ++--
>> arch/x86/include/asm/qspinlock_paravirt.h |   4 +-
>> arch/x86/kernel/alternative.c             |   5 +-
>> arch/x86/kernel/cpu/mshyperv.c            |   7 +-
>> arch/x86/kernel/kvmclock.c                |   2 +-
>> arch/x86/kernel/paravirt.c                |  42 ++-
>> arch/x86/kvm/svm/svm.c                    |  16 +-
>> arch/x86/kvm/vmx/tdx.c                    |   2 +-
>> arch/x86/kvm/vmx/vmx.c                    |   8 +-
>> arch/x86/lib/x86-opcode-map.txt           |   5 +-
>> arch/x86/mm/extable.c                     |  35 ++-
>> arch/x86/xen/enlighten_pv.c               |  52 +++-
>> arch/x86/xen/pmu.c                        |   4 +-
>> tools/arch/x86/lib/x86-opcode-map.txt     |   5 +-
>> tools/objtool/check.c                     |   1 +
>> 24 files changed, 576 insertions(+), 264 deletions(-)
>> create mode 100644 arch/x86/include/asm/paravirt-msr.h
>>
> 
> Could you clarify *on the high design level* what "go back to the paravirt approach" means, and the motivation for that?

This is related to V2 of this series, where I used a static branch for
special casing Xen PV.

Peter Zijlstra commented on that asking to try harder using the pv_ops
hooks for Xen PV, too.

> Note that for Xen *most* MSRs fall in one of two categories: those that are dropped entirely and those that are just passed straight on to the hardware.
> 
> I don't know if anyone cares about optimizing PV Xen anymore, but at least in theory Xen can un-paravirtualize most sites.

The problem with that is that this would need to be taken care of at the
call sites, "poisoning" a lot of code with Xen specific paths. Or we'd
need to use the native variants explicitly at all places where Xen PV
would just use the MSR instructions itself. But please be aware that
there are plans to introduce a hypercall for Xen to speed up MSR accesses,
which would reduce the "passed through to hardware" cases to 0.


Juergen
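
To illustrate the point about call sites, here is a minimal userspace
sketch of a hook table in the spirit of pv_ops. It is not the code from
this series, which patches the call sites via alternatives instead of
keeping a runtime indirection. Every sketch_* name, the fake MSR array
and the "drop writes to MSR 0" policy are assumptions made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_MSRS 4u	/* tiny fake MSR space, indices 0..3 */

/* Simulated hardware registers; a real kernel would execute RDMSR/WRMSR. */
static uint64_t sketch_hw_msr[SKETCH_NR_MSRS];

/* Hook table: one level of indirection per MSR access. */
struct sketch_msr_ops {
	uint64_t (*read)(uint32_t msr);
	void (*write)(uint32_t msr, uint64_t val);
};

static uint64_t sketch_native_read(uint32_t msr)
{
	assert(msr < SKETCH_NR_MSRS);
	return sketch_hw_msr[msr];
}

static void sketch_native_write(uint32_t msr, uint64_t val)
{
	assert(msr < SKETCH_NR_MSRS);
	sketch_hw_msr[msr] = val;
}

/* Hypothetical policy: the PV backend drops accesses to MSR 0. */
static bool sketch_pv_drops(uint32_t msr)
{
	return msr == 0;
}

static uint64_t sketch_pv_read(uint32_t msr)
{
	/* Dropped MSRs read as 0; everything else is passed through. */
	return sketch_pv_drops(msr) ? 0 : sketch_native_read(msr);
}

static void sketch_pv_write(uint32_t msr, uint64_t val)
{
	if (!sketch_pv_drops(msr))
		sketch_native_write(msr, val);
}

static const struct sketch_msr_ops sketch_native_ops = {
	.read = sketch_native_read,
	.write = sketch_native_write,
};

static const struct sketch_msr_ops sketch_pv_ops = {
	.read = sketch_pv_read,
	.write = sketch_pv_write,
};

/* The caller sees only the hook table and has no backend-specific path. */
static uint64_t sketch_set_and_read_back(const struct sketch_msr_ops *ops,
					 uint32_t msr, uint64_t val)
{
	ops->write(msr, val);
	return ops->read(msr);
}
```

The decision whether an MSR is dropped or passed through lives in the
backend, so sketch_set_and_read_back() is identical for both tables.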

^ permalink raw reply

* RE: [EXTERNAL] [PATCH 1/1] Drivers: hv: vmbus: Simplify allocation of vmbus_evt
From: Michael Kelley @ 2026-02-19  0:33 UTC (permalink / raw)
  To: Wei Liu, Long Li
  Cc: KY Srinivasan, Haiyang Zhang, Dexuan Cui,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260218234756.GN2236050@liuwe-devbox-debian-v2.local>

From: Wei Liu <wei.liu@kernel.org>
> 
> On Wed, Feb 18, 2026 at 09:52:41PM +0000, Long Li wrote:
> > > From: Michael Kelley <mhklinux@outlook.com>
> > >
> > > The per-cpu variable vmbus_evt is currently dynamically allocated. It's only 8
> > > bytes, so just allocate it statically to simplify and save a few lines of code.
> > >
> > > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> >
> > Reviewed-by: Long Li <longli@microsoft.com>
> 
> Applied to hyperv-next.
> 
> This has a conflict with Jan Kiszka's patch. It is easy to resolve.
> Please check and shout if something is off.
> 

Thanks. The conflict resolution looks good to me.

Michael

^ permalink raw reply

* Re: [PATCH 1/4] mshv: Add nested virtualization creation flag
From: Wei Liu @ 2026-02-18 23:55 UTC (permalink / raw)
  To: Anatol Belski; +Cc: linux-hyperv, wei.liu, muislam
In-Reply-To: <20260218144802.1962513-1-anbelski@linux.microsoft.com>

On Wed, Feb 18, 2026 at 02:47:59PM +0000, Anatol Belski wrote:
> From: Muminul Islam <muislam@microsoft.com>
> 
> Introduce HV_PARTITION_CREATION_FLAG_NESTED_VIRTUALIZATION_CAPABLE to
> indicate support for nested virtualization during partition creation.
> 
> This enables clearer configuration and capability checks for nested
> virtualization scenarios.
> 
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
> Signed-off-by: Muminul Islam <muislam@microsoft.com>

Series applied. I squashed the first three patches into one.

Wei

^ permalink raw reply

* Re: [EXTERNAL] [PATCH 1/1] Drivers: hv: vmbus: Simplify allocation of vmbus_evt
From: Wei Liu @ 2026-02-18 23:47 UTC (permalink / raw)
  To: Long Li
  Cc: mhklinux@outlook.com, KY Srinivasan, Haiyang Zhang,
	wei.liu@kernel.org, Dexuan Cui, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <DS3PR21MB5735CA49BF2A99E320D69C8ACE6AA@DS3PR21MB5735.namprd21.prod.outlook.com>

On Wed, Feb 18, 2026 at 09:52:41PM +0000, Long Li wrote:
> > From: Michael Kelley <mhklinux@outlook.com>
> > 
> > The per-cpu variable vmbus_evt is currently dynamically allocated. It's only 8
> > bytes, so just allocate it statically to simplify and save a few lines of code.
> > 
> > Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> 
> Reviewed-by: Long Li <longli@microsoft.com>

Applied to hyperv-next.

This has a conflict with Jan Kiszka's patch. It is easy to resolve.
Please check and shout if something is off.

Wei

^ permalink raw reply

* Re: [PATCH] mshv: expose hv_call_scrub_partition
From: Wei Liu @ 2026-02-18 23:35 UTC (permalink / raw)
  To: Magnus Kulke
  Cc: wei.liu, haiyangz, kys, decui, linux-hyperv, skinsburskii,
	magnuskulke, linux-kernel
In-Reply-To: <20260218141911.555592-1-magnuskulke@linux.microsoft.com>

On Wed, Feb 18, 2026 at 03:19:11PM +0100, Magnus Kulke wrote:
> This hv call needs to be exposed for VMMs to be able to soft-reboot
> guests. It will reset APIC and state of para-virtualized devices like
> SynIC.
> 
> Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com>

Applied to hyperv-next. Thanks.

^ permalink raw reply

