* Re: [PATCH 3/7] mshv: Rename mshv_mem_region to mshv_region
From: Stanislav Kinsburskii @ 2026-04-03 16:09 UTC (permalink / raw)
To: Anirudh Rayabharam
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <20260403-gainful-cerulean-asp-5a1a2d@anirudhrb>
On Fri, Apr 03, 2026 at 03:54:52AM +0000, Anirudh Rayabharam wrote:
> On Thu, Apr 02, 2026 at 10:57:09AM -0700, Stanislav Kinsburskii wrote:
> > On Thu, Apr 02, 2026 at 04:33:24PM +0000, Anirudh Rayabharam wrote:
> > > On Wed, Apr 01, 2026 at 10:12:40PM +0000, Stanislav Kinsburskii wrote:
> > > > The mshv_mem_region structure represents guest address space regions,
> > > > which can be either RAM-backed memory or memory-mapped IO regions
> > > > without physical backing. The "mem_" prefix incorrectly suggests the
> > > > structure only handles memory regions, creating confusion about its
> > > > actual purpose.
> > > >
> > > > Remove the "mem_" prefix to align with existing function naming
> > > > (mshv_region_map, mshv_region_pin, etc.) and accurately reflect that
> > > > this structure manages arbitrary guest address space mappings
> > > > regardless of their backing type.
> > >
> > > I don't think the "mem_" prefix automatically suggested the backing
> > > type.
> > >
> >
> > What else can it suggest?
>
> To me it just suggested that the struct contained info or properties of
> a guest memory region. The name itself didn't suggest what backing type
> the memory has.
>
Right, that’s what the old name suggested to me too. And that’s the
problem. It’s inaccurate for MMIO regions. They have nothing to do with
memory. They are never backed by it.
> >
> > > Isn't mshv_region too vague now? Region of what?
> > >
> >
> > The region of address space, which can or can not be backed by memory.
>
> Yeah but that's not obvious just from the new name. The new name just
> says mshv_region and doesn't what the region is of.
>
I hear you, but what else can we do? Keeping mshv_mem_region isn’t good
for the reasons above. Renaming it to `mshv_gpa_region` also feels
redundant detailization.
To me, this is a good tradeoff. Yes, it’s a bit too generic. But it’s
concise, and we don’t have any other regions in the codebase. So I don’t
think it will cause confusion.
Thanks,
Stanislav
> Thanks,
> Anirudh.
>
> >
> > Thanks,
> > Stanislav
> >
> > > Thanks,
> > > Anirudh.
^ permalink raw reply
* RE: [PATCH v2 4/4] drivers: hv: mark channel attributes as const
From: Michael Kelley @ 2026-04-03 15:56 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260403-sysfs-const-hv-v2-4-8932ab8d41db@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Friday, April 3, 2026 1:29 AM
>
> These attributes are never modified, mark them as const.
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> Tested-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
But take a look at this analysis from Sashiko AI:
https://sashiko.dev/#/patchset/20260403-sysfs-const-hv-v2-0-8932ab8d41db%40weissschuh.net
This seems to be a valid concern with your earlier commit 7dd9fdb4939b9
"sysfs: attribute_group: enable const variants of is_visible()".
> ---
> drivers/hv/vmbus_drv.c | 38 +++++++++++++++++++-------------------
> 1 file changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index ecce6b72a2a2..b33b85764c19 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1695,7 +1695,7 @@ static ssize_t out_mask_show(struct vmbus_channel
> *channel, char *buf)
> mutex_unlock(&rbi->ring_buffer_mutex);
> return ret;
> }
> -static VMBUS_CHAN_ATTR_RO(out_mask);
> +static const VMBUS_CHAN_ATTR_RO(out_mask);
>
> static ssize_t in_mask_show(struct vmbus_channel *channel, char *buf)
> {
> @@ -1712,7 +1712,7 @@ static ssize_t in_mask_show(struct vmbus_channel
> *channel, char *buf)
> mutex_unlock(&rbi->ring_buffer_mutex);
> return ret;
> }
> -static VMBUS_CHAN_ATTR_RO(in_mask);
> +static const VMBUS_CHAN_ATTR_RO(in_mask);
>
> static ssize_t read_avail_show(struct vmbus_channel *channel, char *buf)
> {
> @@ -1729,7 +1729,7 @@ static ssize_t read_avail_show(struct vmbus_channel
> *channel, char *buf)
> mutex_unlock(&rbi->ring_buffer_mutex);
> return ret;
> }
> -static VMBUS_CHAN_ATTR_RO(read_avail);
> +static const VMBUS_CHAN_ATTR_RO(read_avail);
>
> static ssize_t write_avail_show(struct vmbus_channel *channel, char *buf)
> {
> @@ -1746,7 +1746,7 @@ static ssize_t write_avail_show(struct vmbus_channel
> *channel, char *buf)
> mutex_unlock(&rbi->ring_buffer_mutex);
> return ret;
> }
> -static VMBUS_CHAN_ATTR_RO(write_avail);
> +static const VMBUS_CHAN_ATTR_RO(write_avail);
>
> static ssize_t target_cpu_show(struct vmbus_channel *channel, char *buf)
> {
> @@ -1864,7 +1864,7 @@ static ssize_t target_cpu_store(struct vmbus_channel
> *channel,
>
> return ret ?: count;
> }
> -static VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
> +static const VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
>
> static ssize_t channel_pending_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1873,7 +1873,7 @@ static ssize_t channel_pending_show(struct vmbus_channel
> *channel,
> channel_pending(channel,
> vmbus_connection.monitor_pages[1]));
> }
> -static VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
> +static const VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
>
> static ssize_t channel_latency_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1882,19 +1882,19 @@ static ssize_t channel_latency_show(struct
> vmbus_channel *channel,
> channel_latency(channel,
> vmbus_connection.monitor_pages[1]));
> }
> -static VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
> +static const VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
>
> static ssize_t channel_interrupts_show(struct vmbus_channel *channel, char *buf)
> {
> return sprintf(buf, "%llu\n", channel->interrupts);
> }
> -static VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
> +static const VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
>
> static ssize_t channel_events_show(struct vmbus_channel *channel, char *buf)
> {
> return sprintf(buf, "%llu\n", channel->sig_events);
> }
> -static VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
> +static const VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
>
> static ssize_t channel_intr_in_full_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1902,7 +1902,7 @@ static ssize_t channel_intr_in_full_show(struct
> vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->intr_in_full);
> }
> -static VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
> +static const VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
>
> static ssize_t channel_intr_out_empty_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1910,7 +1910,7 @@ static ssize_t channel_intr_out_empty_show(struct
> vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->intr_out_empty);
> }
> -static VMBUS_CHAN_ATTR(intr_out_empty, 0444, channel_intr_out_empty_show,
> NULL);
> +static const VMBUS_CHAN_ATTR(intr_out_empty, 0444,
> channel_intr_out_empty_show, NULL);
>
> static ssize_t channel_out_full_first_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1918,7 +1918,7 @@ static ssize_t channel_out_full_first_show(struct
> vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->out_full_first);
> }
> -static VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show, NULL);
> +static const VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show,
> NULL);
>
> static ssize_t channel_out_full_total_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1926,14 +1926,14 @@ static ssize_t channel_out_full_total_show(struct
> vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->out_full_total);
> }
> -static VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show, NULL);
> +static const VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show,
> NULL);
>
> static ssize_t subchannel_monitor_id_show(struct vmbus_channel *channel,
> char *buf)
> {
> return sprintf(buf, "%u\n", channel->offermsg.monitorid);
> }
> -static VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show, NULL);
> +static const VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show,
> NULL);
>
> static ssize_t subchannel_id_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1941,7 +1941,7 @@ static ssize_t subchannel_id_show(struct vmbus_channel
> *channel,
> return sprintf(buf, "%u\n",
> channel->offermsg.offer.sub_channel_index);
> }
> -static VMBUS_CHAN_ATTR_RO(subchannel_id);
> +static const VMBUS_CHAN_ATTR_RO(subchannel_id);
>
> static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
> const struct bin_attribute *attr,
> @@ -1963,7 +1963,7 @@ static const struct bin_attribute chan_attr_ring_buffer = {
> },
> .mmap = hv_mmap_ring_buffer_wrapper,
> };
> -static struct attribute *vmbus_chan_attrs[] = {
> +static const struct attribute *const vmbus_chan_attrs[] = {
> &chan_attr_out_mask.attr,
> &chan_attr_in_mask.attr,
> &chan_attr_read_avail.attr,
> @@ -1992,7 +1992,7 @@ static const struct bin_attribute *const
> vmbus_chan_bin_attrs[] = {
> * each attribute, and returns 0 if an attribute is not visible.
> */
> static umode_t vmbus_chan_attr_is_visible(struct kobject *kobj,
> - struct attribute *attr, int idx)
> + const struct attribute *attr, int idx)
> {
> const struct vmbus_channel *channel =
> container_of(kobj, struct vmbus_channel, kobj);
> @@ -2030,9 +2030,9 @@ static size_t vmbus_chan_bin_size(struct kobject *kobj,
> }
>
> static const struct attribute_group vmbus_chan_group = {
> - .attrs = vmbus_chan_attrs,
> + .attrs_const = vmbus_chan_attrs,
> .bin_attrs = vmbus_chan_bin_attrs,
> - .is_visible = vmbus_chan_attr_is_visible,
> + .is_visible_const = vmbus_chan_attr_is_visible,
> .is_bin_visible = vmbus_chan_bin_attr_is_visible,
> .bin_size = vmbus_chan_bin_size,
> };
>
> --
> 2.53.0
>
^ permalink raw reply
* [PATCH] PCI: hv: Fix double ida_free in hv_pci_probe error path
From: Sahil Chandna @ 2026-04-03 12:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, lpieralisi, kwilczynski,
mani, robh, bhelgaas, dan.j.williams, mhklinux, linux-hyperv,
linux-pci, linux-kernel
If hv_pci_probe() fails after storing the domain number in
hbus->bridge->domain_nr, there is a call to free this domain_nr via
pci_bus_release_emul_domain_nr(), however, during cleanup, the bridge
release callback pci_release_host_bridge_dev() also frees the domain_nr
causing ida_free to be called on same ID twice and triggering following
warning:
ida_free called for id=28971 which is not allocated.
WARNING: lib/idr.c:594 at ida_free+0xdf/0x160, CPU#0: kworker/0:2/198
Call Trace:
pci_bus_release_emul_domain_nr+0x17/0x20
pci_release_host_bridge_dev+0x4b/0x60
device_release+0x3b/0xa0
kobject_put+0x8e/0x220
devm_pci_alloc_host_bridge_release+0xe/0x20
devres_release_all+0x9a/0xd0
device_unbind_cleanup+0x12/0xa0
really_probe+0x1c5/0x3f0
vmbus_add_channel_work+0x135/0x1a0
Fix this by letting pci core handle the free domain_nr and remove
the explicit free called in pci-hyperv driver.
Fixes: bcce8c74f1ce ("PCI: Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms")
Signed-off-by: Sahil Chandna <sahilchandna@linux.microsoft.com>
---
drivers/pci/controller/pci-hyperv.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 2c7a406b4ba8..5616ad5d2a8f 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3778,7 +3778,7 @@ static int hv_pci_probe(struct hv_device *hdev,
hbus->bridge->domain_nr);
if (!hbus->wq) {
ret = -ENOMEM;
- goto free_dom;
+ goto free_bus;
}
hdev->channel->next_request_id_callback = vmbus_next_request_id;
@@ -3874,8 +3874,6 @@ static int hv_pci_probe(struct hv_device *hdev,
vmbus_close(hdev->channel);
destroy_wq:
destroy_workqueue(hbus->wq);
-free_dom:
- pci_bus_release_emul_domain_nr(hbus->bridge->domain_nr);
free_bus:
kfree(hbus);
return ret;
--
2.53.0
^ permalink raw reply related
* [PATCH v2 3/4] drivers: hv: mark bus attributes as const
From: Thomas Weißschuh @ 2026-04-03 8:29 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
Cc: Michael Kelley, linux-hyperv, linux-kernel, Thomas Weißschuh
In-Reply-To: <20260403-sysfs-const-hv-v2-0-8932ab8d41db@weissschuh.net>
This attribute is never modified, mark it as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
drivers/hv/vmbus_drv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index d41b39ab628d..ecce6b72a2a2 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -639,9 +639,9 @@ static ssize_t hibernation_show(const struct bus_type *bus, char *buf)
return sprintf(buf, "%d\n", !!hv_is_hibernation_supported());
}
-static BUS_ATTR_RO(hibernation);
+static const BUS_ATTR_RO(hibernation);
-static struct attribute *vmbus_bus_attrs[] = {
+static const struct attribute *const vmbus_bus_attrs[] = {
&bus_attr_hibernation.attr,
NULL,
};
--
2.53.0
^ permalink raw reply related
* [PATCH v2 2/4] drivers: hv: use ATTRIBUTE_GROUPS helper
From: Thomas Weißschuh @ 2026-04-03 8:29 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
Cc: Michael Kelley, linux-hyperv, linux-kernel, Thomas Weißschuh
In-Reply-To: <20260403-sysfs-const-hv-v2-0-8932ab8d41db@weissschuh.net>
The current logic is equivalent to the helper.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
drivers/hv/vmbus_drv.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 5f9b7cc9080c..d41b39ab628d 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -645,10 +645,7 @@ static struct attribute *vmbus_bus_attrs[] = {
&bus_attr_hibernation.attr,
NULL,
};
-static const struct attribute_group vmbus_bus_group = {
- .attrs = vmbus_bus_attrs,
-};
-__ATTRIBUTE_GROUPS(vmbus_bus);
+ATTRIBUTE_GROUPS(vmbus_bus);
/*
* vmbus_uevent - add uevent for our device
--
2.53.0
^ permalink raw reply related
* [PATCH v2 1/4] drivers: hv: mark chan_attr_ring_buffer as const
From: Thomas Weißschuh @ 2026-04-03 8:29 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
Cc: Michael Kelley, linux-hyperv, linux-kernel, Thomas Weißschuh
In-Reply-To: <20260403-sysfs-const-hv-v2-0-8932ab8d41db@weissschuh.net>
The structure is never modified, mark it as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
drivers/hv/vmbus_drv.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index bc4fc1951ae1..5f9b7cc9080c 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1959,7 +1959,7 @@ static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
return channel->mmap_ring_buffer(channel, vma);
}
-static struct bin_attribute chan_attr_ring_buffer = {
+static const struct bin_attribute chan_attr_ring_buffer = {
.attr = {
.name = "ring",
.mode = 0600,
@@ -1985,7 +1985,7 @@ static struct attribute *vmbus_chan_attrs[] = {
NULL
};
-static const struct bin_attribute *vmbus_chan_bin_attrs[] = {
+static const struct bin_attribute *const vmbus_chan_bin_attrs[] = {
&chan_attr_ring_buffer,
NULL
};
--
2.53.0
^ permalink raw reply related
* [PATCH v2 0/4] drivers: hv: mark some sysfs attribute as const
From: Thomas Weißschuh @ 2026-04-03 8:29 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
Cc: Michael Kelley, linux-hyperv, linux-kernel, Thomas Weißschuh
The sysfs attribute structures are never modified.
Mark them as const where possible.
This excludes the device attributes for now, as these will have their
own migration.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
---
Changes in v2:
- Mark missed four channels attributes as const (Michael Kelley)
- Link to v1: https://patch.msgid.link/20260402-sysfs-const-hv-v1-0-a467d6f7726e@weissschuh.net
---
Thomas Weißschuh (4):
drivers: hv: mark chan_attr_ring_buffer as const
drivers: hv: use ATTRIBUTE_GROUPS helper
drivers: hv: mark bus attributes as const
drivers: hv: mark channel attributes as const
drivers/hv/vmbus_drv.c | 51 ++++++++++++++++++++++++--------------------------
1 file changed, 24 insertions(+), 27 deletions(-)
---
base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
change-id: 20260331-sysfs-const-hv-cad05718c68a
Best regards,
--
Thomas Weißschuh <linux@weissschuh.net>
^ permalink raw reply
* [PATCH v2 4/4] drivers: hv: mark channel attributes as const
From: Thomas Weißschuh @ 2026-04-03 8:29 UTC (permalink / raw)
To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
Cc: Michael Kelley, linux-hyperv, linux-kernel, Thomas Weißschuh
In-Reply-To: <20260403-sysfs-const-hv-v2-0-8932ab8d41db@weissschuh.net>
These attributes are never modified, mark them as const.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
drivers/hv/vmbus_drv.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index ecce6b72a2a2..b33b85764c19 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1695,7 +1695,7 @@ static ssize_t out_mask_show(struct vmbus_channel *channel, char *buf)
mutex_unlock(&rbi->ring_buffer_mutex);
return ret;
}
-static VMBUS_CHAN_ATTR_RO(out_mask);
+static const VMBUS_CHAN_ATTR_RO(out_mask);
static ssize_t in_mask_show(struct vmbus_channel *channel, char *buf)
{
@@ -1712,7 +1712,7 @@ static ssize_t in_mask_show(struct vmbus_channel *channel, char *buf)
mutex_unlock(&rbi->ring_buffer_mutex);
return ret;
}
-static VMBUS_CHAN_ATTR_RO(in_mask);
+static const VMBUS_CHAN_ATTR_RO(in_mask);
static ssize_t read_avail_show(struct vmbus_channel *channel, char *buf)
{
@@ -1729,7 +1729,7 @@ static ssize_t read_avail_show(struct vmbus_channel *channel, char *buf)
mutex_unlock(&rbi->ring_buffer_mutex);
return ret;
}
-static VMBUS_CHAN_ATTR_RO(read_avail);
+static const VMBUS_CHAN_ATTR_RO(read_avail);
static ssize_t write_avail_show(struct vmbus_channel *channel, char *buf)
{
@@ -1746,7 +1746,7 @@ static ssize_t write_avail_show(struct vmbus_channel *channel, char *buf)
mutex_unlock(&rbi->ring_buffer_mutex);
return ret;
}
-static VMBUS_CHAN_ATTR_RO(write_avail);
+static const VMBUS_CHAN_ATTR_RO(write_avail);
static ssize_t target_cpu_show(struct vmbus_channel *channel, char *buf)
{
@@ -1864,7 +1864,7 @@ static ssize_t target_cpu_store(struct vmbus_channel *channel,
return ret ?: count;
}
-static VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
+static const VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
static ssize_t channel_pending_show(struct vmbus_channel *channel,
char *buf)
@@ -1873,7 +1873,7 @@ static ssize_t channel_pending_show(struct vmbus_channel *channel,
channel_pending(channel,
vmbus_connection.monitor_pages[1]));
}
-static VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
+static const VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
static ssize_t channel_latency_show(struct vmbus_channel *channel,
char *buf)
@@ -1882,19 +1882,19 @@ static ssize_t channel_latency_show(struct vmbus_channel *channel,
channel_latency(channel,
vmbus_connection.monitor_pages[1]));
}
-static VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
+static const VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
static ssize_t channel_interrupts_show(struct vmbus_channel *channel, char *buf)
{
return sprintf(buf, "%llu\n", channel->interrupts);
}
-static VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
+static const VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
static ssize_t channel_events_show(struct vmbus_channel *channel, char *buf)
{
return sprintf(buf, "%llu\n", channel->sig_events);
}
-static VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
+static const VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
static ssize_t channel_intr_in_full_show(struct vmbus_channel *channel,
char *buf)
@@ -1902,7 +1902,7 @@ static ssize_t channel_intr_in_full_show(struct vmbus_channel *channel,
return sprintf(buf, "%llu\n",
(unsigned long long)channel->intr_in_full);
}
-static VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
+static const VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
static ssize_t channel_intr_out_empty_show(struct vmbus_channel *channel,
char *buf)
@@ -1910,7 +1910,7 @@ static ssize_t channel_intr_out_empty_show(struct vmbus_channel *channel,
return sprintf(buf, "%llu\n",
(unsigned long long)channel->intr_out_empty);
}
-static VMBUS_CHAN_ATTR(intr_out_empty, 0444, channel_intr_out_empty_show, NULL);
+static const VMBUS_CHAN_ATTR(intr_out_empty, 0444, channel_intr_out_empty_show, NULL);
static ssize_t channel_out_full_first_show(struct vmbus_channel *channel,
char *buf)
@@ -1918,7 +1918,7 @@ static ssize_t channel_out_full_first_show(struct vmbus_channel *channel,
return sprintf(buf, "%llu\n",
(unsigned long long)channel->out_full_first);
}
-static VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show, NULL);
+static const VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show, NULL);
static ssize_t channel_out_full_total_show(struct vmbus_channel *channel,
char *buf)
@@ -1926,14 +1926,14 @@ static ssize_t channel_out_full_total_show(struct vmbus_channel *channel,
return sprintf(buf, "%llu\n",
(unsigned long long)channel->out_full_total);
}
-static VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show, NULL);
+static const VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show, NULL);
static ssize_t subchannel_monitor_id_show(struct vmbus_channel *channel,
char *buf)
{
return sprintf(buf, "%u\n", channel->offermsg.monitorid);
}
-static VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show, NULL);
+static const VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show, NULL);
static ssize_t subchannel_id_show(struct vmbus_channel *channel,
char *buf)
@@ -1941,7 +1941,7 @@ static ssize_t subchannel_id_show(struct vmbus_channel *channel,
return sprintf(buf, "%u\n",
channel->offermsg.offer.sub_channel_index);
}
-static VMBUS_CHAN_ATTR_RO(subchannel_id);
+static const VMBUS_CHAN_ATTR_RO(subchannel_id);
static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
const struct bin_attribute *attr,
@@ -1963,7 +1963,7 @@ static const struct bin_attribute chan_attr_ring_buffer = {
},
.mmap = hv_mmap_ring_buffer_wrapper,
};
-static struct attribute *vmbus_chan_attrs[] = {
+static const struct attribute *const vmbus_chan_attrs[] = {
&chan_attr_out_mask.attr,
&chan_attr_in_mask.attr,
&chan_attr_read_avail.attr,
@@ -1992,7 +1992,7 @@ static const struct bin_attribute *const vmbus_chan_bin_attrs[] = {
* each attribute, and returns 0 if an attribute is not visible.
*/
static umode_t vmbus_chan_attr_is_visible(struct kobject *kobj,
- struct attribute *attr, int idx)
+ const struct attribute *attr, int idx)
{
const struct vmbus_channel *channel =
container_of(kobj, struct vmbus_channel, kobj);
@@ -2030,9 +2030,9 @@ static size_t vmbus_chan_bin_size(struct kobject *kobj,
}
static const struct attribute_group vmbus_chan_group = {
- .attrs = vmbus_chan_attrs,
+ .attrs_const = vmbus_chan_attrs,
.bin_attrs = vmbus_chan_bin_attrs,
- .is_visible = vmbus_chan_attr_is_visible,
+ .is_visible_const = vmbus_chan_attr_is_visible,
.is_bin_visible = vmbus_chan_bin_attr_is_visible,
.bin_size = vmbus_chan_bin_size,
};
--
2.53.0
^ permalink raw reply related
* Re: [PATCH] Drivers: hv: mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
From: Naman Jain @ 2026-04-03 7:25 UTC (permalink / raw)
To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: Saurabh Sengar, linux-hyperv@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <SN6PR02MB415797828F8DC5C9AC7E9131D45EA@SN6PR02MB4157.namprd02.prod.outlook.com>
On 4/3/2026 9:05 AM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Tuesday, March 31, 2026 10:40 PM
>>
>> When registering VTL0 memory via MSHV_ADD_VTL0_MEMORY, the kernel
>> computes pgmap->vmemmap_shift as the number of trailing zeros in the
>> OR of start_pfn and last_pfn, intending to use the largest compound
>> page order both endpoints are aligned to.
>>
>> However, this value is not clamped to MAX_FOLIO_ORDER, so a
>> sufficiently aligned range (e.g. physical range 0x800000000000-
>> 0x800080000000, corresponding to start_pfn=0x800000000 with 35
>> trailing zeros) can produce a shift larger than what
>> memremap_pages() accepts, triggering a WARN and returning -EINVAL:
>>
>> WARNING: ... memremap_pages+0x512/0x650
>> requested folio size unsupported
>>
>> The MAX_FOLIO_ORDER check was added by
>> commit 646b67d57589 ("mm/memremap: reject unreasonable folio/compound
>> page sizes in memremap_pages()").
>> When CONFIG_HAVE_GIGANTIC_FOLIOS=y, CONFIG_SPARSEMEM_VMEMMAP=y, and
>> CONFIG_HUGETLB_PAGE is not set, MAX_FOLIO_ORDER resolves to
>> (PUD_SHIFT - PAGE_SHIFT) = 18. Any range whose PFN alignment exceeds
>> order 18 hits this path.
>
> I'm not clear on what point you are making with this specific
> configuration that results in MAX_FOLIO_ORDER being 18. Is it just
> an example? Is 18 the largest expected value for MAX_FOLIO_ORDER?
> And note that PUD_SHIFT and PAGE_SHIFT might have different values
> on arm64 with a page size other than 4K.
>
Yes, this was just an example. It is not generalized enough, I will
remove it.
MAX_FOLIO_ORDER could go beyond 18.
>>
>> Fix this by clamping vmemmap_shift to MAX_FOLIO_ORDER so we always
>> request the largest order the kernel supports, rather than an
>> out-of-range value.
>>
>> Also fix the error path to propagate the actual error code from
>> devm_memremap_pages() instead of hard-coding -EFAULT, which was
>> masking the real -EINVAL return.
>>
>> Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>> drivers/hv/mshv_vtl_main.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/hv/mshv_vtl_main.c b/drivers/hv/mshv_vtl_main.c
>> index 5856975f32e12..255fed3a740c1 100644
>> --- a/drivers/hv/mshv_vtl_main.c
>> +++ b/drivers/hv/mshv_vtl_main.c
>> @@ -405,8 +405,12 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
>> /*
>> * Determine the highest page order that can be used for the given memory range.
>> * This works best when the range is aligned; i.e. both the start and the length.
>> + * Clamp to MAX_FOLIO_ORDER to avoid a WARN in memremap_pages() when the range
>> + * alignment exceeds the maximum supported folio order for this kernel config.
>> */
>> - pgmap->vmemmap_shift = count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn);
>> + pgmap->vmemmap_shift = min_t(unsigned long,
>> + count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn),
>> + MAX_FOLIO_ORDER);
>
> Is it necessary to use min_t() here, or would min() work? Neither count_trailing_zeros()
> nor MAX_FOLIO_ORDER is ever negative, so it seems like just min() would work with
> no potential for doing a bogus comparison or assignment.
>
min could work, yes. I just felt min_t is more safer for comparing these
two different types of values -
count_trailing_zeroes being 'int'
MAX_FOLIO_ORDER being a macro, calculated by applying bit operations.
and destination being unsigned int.
include/linux/mmzone.h:#define MAX_FOLIO_ORDER MAX_PAGE_ORDER
include/linux/mmzone.h:#define MAX_FOLIO_ORDER PFN_SECTION_SHIFT
include/linux/mmzone.h:#define MAX_FOLIO_ORDER (ilog2(SZ_16G) -
PAGE_SHIFT)
include/linux/mmzone.h:#define MAX_FOLIO_ORDER (ilog2(SZ_1G) -
PAGE_SHIFT)
include/linux/mmzone.h:#define MAX_FOLIO_ORDER (PUD_SHIFT -
PAGE_SHIFT)
I am fine with anything you suggest here.
> The shift is calculated using the originally passed in start_pfn and last_pfn, while the
> "range" struct in pgmap has an "end" value that is one page less. So is the idea to
> go ahead and create the mapping with folios of a size that includes that last page,
> and then just waste the last page of the last folio?
No, waste does not occur. The vmemmap_shift determines the folio order,
and memmap_init_zone_device() walks the range [start_pfn, last_pfn) in
steps of (1 << vmemmap_shift) pages, creating one folio per step. The OR
operation ensures both boundaries are aligned to multiples of (1 <<
vmemmap_shift), guaranteeing the range divides evenly into folios with
no partial folio at the end.
The intention is to find the highest folio order possible here, and if
it reaches the MAX_FOLIO_ORDER, restrict vmemmap_shift to it.
>
>> dev_dbg(vtl->module_dev,
>> "Add VTL0 memory: start: 0x%llx, end_pfn: 0x%llx, page order: %lu\n",
>> vtl0_mem.start_pfn, vtl0_mem.last_pfn, pgmap->vmemmap_shift);
>> @@ -415,7 +419,7 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
>> if (IS_ERR(addr)) {
>> dev_err(vtl->module_dev, "devm_memremap_pages error: %ld\n", PTR_ERR(addr));
>> kfree(pgmap);
>> - return -EFAULT;
>> + return PTR_ERR(addr);
>> }
>>
>> /* Don't free pgmap, since it has to stick around until the memory
>>
>> base-commit: 36ece9697e89016181e5ae87510e40fb31d86f2b
>> --
>> 2.43.0
>>
Thanks for reviewing.
Regards,
Naman
^ permalink raw reply
* [PATCH net-next 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
From: Erni Sri Satya Vennela @ 2026-04-03 7:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260403070948.9225-1-ernis@linux.microsoft.com>
In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq().
This can happen when mana_probe_port() fails for port 0, leaving
ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove()
hits the NULL entry and jumps past mana_destroy_eq().
Change 'goto out' to 'break' so the for-loop exits normally and
mana_destroy_eq() is always reached. Remove the now-unreferenced out:
label.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1a141c46ac27..97237d137cbf 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3747,7 +3747,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
if (!ndev) {
if (i == 0)
dev_err(dev, "No net device to remove\n");
- goto out;
+ break;
}
apc = netdev_priv(ndev);
@@ -3778,7 +3778,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
}
mana_destroy_eq(ac);
-out:
+
if (ac->per_port_queue_reset_wq) {
destroy_workqueue(ac->per_port_queue_reset_wq);
ac->per_port_queue_reset_wq = NULL;
--
2.34.1
^ permalink raw reply related
* [PATCH net-next 3/4] net: mana: Don't overwrite port probe error with add_adev result
From: Erni Sri Satya Vennela @ 2026-04-03 7:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260403070948.9225-1-ernis@linux.microsoft.com>
In mana_probe(), if mana_probe_port() fails for any port, the error
is stored in 'err' and the loop breaks. However, the subsequent
unconditional 'err = add_adev(gd, "eth")' overwrites this error.
If add_adev() succeeds, mana_probe() returns success despite ports
being left in a partially initialized state (ac->ports[i] == NULL).
Only call add_adev() when there is no prior error, so the probe
correctly fails and triggers mana_remove() cleanup.
Fixes: ced82fce77e9 ("net: mana: Probe rdma device in mana driver")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f6ad46736418..1a141c46ac27 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3680,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
for (i = 0; i < ac->num_ports; i++) {
err = mana_probe_port(ac, i, &ac->ports[i]);
- /* we log the port for which the probe failed and stop
- * probes for subsequent ports.
- * Note that we keep running ports, for which the probes
- * were successful, unless add_adev fails too
+ /* Log the port for which the probe failed, stop probing
+ * subsequent ports, and skip add_adev.
+ * Already-probed ports remain functional.
*/
if (err) {
dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3697,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
enable_work(&apc->queue_reset_work);
err = mana_attach(ac->ports[i]);
rtnl_unlock();
- /* we log the port for which the attach failed and stop
- * attach for subsequent ports
- * Note that we keep running ports, for which the attach
- * were successful, unless add_adev fails too
+ /* Log the port for which the attach failed, stop
+ * attaching subsequent ports, and skip add_adev.
+ * Already-attached ports remain functional.
*/
if (err) {
dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3709,7 +3707,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
}
}
- err = add_adev(gd, "eth");
+ if (!err)
+ err = add_adev(gd, "eth");
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
--
2.34.1
^ permalink raw reply related
* [PATCH net-next 2/4] net: mana: Init gf_stats_work before potential error paths in probe
From: Erni Sri Satya Vennela @ 2026-04-03 7:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260403070948.9225-1-ernis@linux.microsoft.com>
Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
while keeping schedule_delayed_work() at its original location.
Previously, if any function between mana_create_eq() and the
INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
in __flush_work() or debug object warnings with
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 57f146ea6f66..f6ad46736418 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3635,6 +3635,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
+ INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
err = mana_create_eq(ac);
if (err) {
dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3709,7 +3711,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
err = add_adev(gd, "eth");
- INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
out:
--
2.34.1
^ permalink raw reply related
* [PATCH net-next 1/4] net: mana: Init link_change_work before potential error paths in probe
From: Erni Sri Satya Vennela @ 2026-04-03 7:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260403070948.9225-1-ernis@linux.microsoft.com>
Move INIT_WORK(link_change_work) to right after the mana_context
allocation, before any error path that could reach mana_remove().
Previously, if mana_create_eq() or mana_query_device_cfg() failed,
mana_probe() would jump to the error path which calls mana_remove().
mana_remove() unconditionally calls disable_work_sync(link_change_work),
but the work struct had not been initialized yet. This can trigger
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 07630322545f..57f146ea6f66 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,6 +3631,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
ac->gdma_dev = gd;
gd->driver_data = ac;
+
+ INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
err = mana_create_eq(ac);
@@ -3648,8 +3650,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
ac->num_ports = num_ports;
-
- INIT_WORK(&ac->link_change_work, mana_link_state_handle);
} else {
if (ac->num_ports != num_ports) {
dev_err(dev, "The number of vPorts changed: %d->%d\n",
--
2.34.1
^ permalink raw reply related
* [PATCH net-next 0/4] net: mana: Fix probe/remove error path bugs
From: Erni Sri Satya Vennela @ 2026-04-03 7:09 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Fix four pre-existing bugs in mana_probe()/mana_remove() error handling
that can cause warnings on uninitialized work structs, masked errors,
and resource leaks when early probe steps fail.
Patches 1-2 move work struct initialization (link_change_work and
gf_stats_work) to before any error path that could trigger
mana_remove(), preventing WARN_ON in __flush_work() or debug object
warnings when sync cancellation runs on uninitialized work structs.
Patch 3 prevents add_adev() from overwriting a port probe error,
which could leave the driver in a broken state with NULL ports while
reporting success.
Patch 4 changes 'goto out' to 'break' in mana_remove()'s port loop
so that mana_destroy_eq() is always reached, preventing EQ leaks when
a NULL port is encountered.
Erni Sri Satya Vennela (4):
net: mana: Init link_change_work before potential error paths in probe
net: mana: Init gf_stats_work before potential error paths in probe
net: mana: Don't overwrite port probe error with add_adev result
net: mana: Fix EQ leak in mana_remove on NULL port
drivers/net/ethernet/microsoft/mana/mana_en.c | 28 +++++++++----------
1 file changed, 14 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply
* Re: [PATCH 4/6] mshv: limit SynIC management to MSHV-owned resources
From: Anirudh Rayabharam @ 2026-04-03 4:34 UTC (permalink / raw)
To: Jork Loeser
Cc: linux-hyperv, x86, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Arnd Bergmann,
Roman Kisel, Michael Kelley, linux-kernel, linux-arch
In-Reply-To: <20260327201920.2100427-5-jloeser@linux.microsoft.com>
On Fri, Mar 27, 2026 at 01:19:15PM -0700, Jork Loeser wrote:
> The SynIC is shared between VMBus and MSHV. VMBus owns the message
> page (SIMP), event flags page (SIEFP), global enable (SCONTROL), and
> SINT2. MSHV adds SINT0, SINT5, and the event ring page (SIRBP).
>
> Currently mshv_synic_init() redundantly enables SIMP, SIEFP, and
> SCONTROL that VMBus already configured, and mshv_synic_cleanup()
> disables all of them. This is wrong because MSHV can be torn down
> while VMBus is still active. In particular, a kexec reboot notifier
> tears down MSHV first. Disabling SCONTROL, SIMP, and SIEFP out from
> under VMBus causes its later cleanup to write SynIC MSRs while SynIC
> is disabled, which the hypervisor does not tolerate.
>
> Restrict MSHV to managing only the resources it owns:
> - SINT0, SINT5: mask on cleanup, unmask on init
> - SIRBP: enable/disable as before
> - SIMP, SIEFP, SCONTROL: on L1VH leave entirely to VMBus (it
> already enabled them); on root partition VMBus doesn't run, so
Actually, I believe VMBus does run on a nested root partition. Doesn't
it? That would then be another case to handle in this patch.
Thanks,
Anirudh.
^ permalink raw reply
* Re: [PATCH 4/6] mshv: limit SynIC management to MSHV-owned resources
From: Anirudh Rayabharam @ 2026-04-03 3:59 UTC (permalink / raw)
To: Jork Loeser
Cc: linux-hyperv, x86, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H . Peter Anvin, Arnd Bergmann,
Roman Kisel, Michael Kelley, linux-kernel, linux-arch
In-Reply-To: <b9681d7-6166-e7cf-feb-b924c583cdf@linux.microsoft.com>
On Thu, Apr 02, 2026 at 02:21:43PM -0700, Jork Loeser wrote:
> On Thu, 2 Apr 2026, Anirudh Rayabharam wrote:
>
> > Maybe it's time to extract all the synic management stuff to a separate
> > file to act like synic "driver" which facilitates synic access to
> > mutiple users i.e. mshv & vmbus.
>
> Yes, such a refactor maybe warranted. This series is about getting kexec to
> work on L1VH, and so such refactor would be ill-placed here.
Well, the limitations of the current code have been exposed while trying
to fix kexec. So I would say this is the best place to fix those
limitations.
If you're intending these patches to the backported into stable release
then you could first fix the issues and later patches *in the same
series* can do the refactor. That way, the patches that have the actual
fix can be backported and we can avoid backporting the refactor.
Thanks,
Anirudh.
^ permalink raw reply
* Re: [PATCH 3/7] mshv: Rename mshv_mem_region to mshv_region
From: Anirudh Rayabharam @ 2026-04-03 3:54 UTC (permalink / raw)
To: Stanislav Kinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <ac6t9d2CIvL469_t@skinsburskii.localdomain>
On Thu, Apr 02, 2026 at 10:57:09AM -0700, Stanislav Kinsburskii wrote:
> On Thu, Apr 02, 2026 at 04:33:24PM +0000, Anirudh Rayabharam wrote:
> > On Wed, Apr 01, 2026 at 10:12:40PM +0000, Stanislav Kinsburskii wrote:
> > > The mshv_mem_region structure represents guest address space regions,
> > > which can be either RAM-backed memory or memory-mapped IO regions
> > > without physical backing. The "mem_" prefix incorrectly suggests the
> > > structure only handles memory regions, creating confusion about its
> > > actual purpose.
> > >
> > > Remove the "mem_" prefix to align with existing function naming
> > > (mshv_region_map, mshv_region_pin, etc.) and accurately reflect that
> > > this structure manages arbitrary guest address space mappings
> > > regardless of their backing type.
> >
> > I don't think the "mem_" prefix automatically suggested the backing
> > type.
> >
>
> What else can it suggest?
To me it just suggested that the struct contained info or properties of
a guest memory region. The name itself didn't suggest what backing type
the memory has.
>
> > Isn't mshv_region too vague now? Region of what?
> >
>
> The region of address space, which can or can not be backed by memory.
Yeah but that's not obvious just from the new name. The new name just
says mshv_region and doesn't what the region is of.
Thanks,
Anirudh.
>
> Thanks,
> Stanislav
>
> > Thanks,
> > Anirudh.
^ permalink raw reply
* RE: [PATCH] Drivers: hv: mshv_vtl: Fix vmemmap_shift exceeding MAX_FOLIO_ORDER
From: Michael Kelley @ 2026-04-03 3:35 UTC (permalink / raw)
To: Naman Jain, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li, Michael Kelley
Cc: Saurabh Sengar, linux-hyperv@vger.kernel.org,
linux-kernel@vger.kernel.org
In-Reply-To: <20260401054005.1532381-1-namjain@linux.microsoft.com>
From: Naman Jain <namjain@linux.microsoft.com> Sent: Tuesday, March 31, 2026 10:40 PM
>
> When registering VTL0 memory via MSHV_ADD_VTL0_MEMORY, the kernel
> computes pgmap->vmemmap_shift as the number of trailing zeros in the
> OR of start_pfn and last_pfn, intending to use the largest compound
> page order both endpoints are aligned to.
>
> However, this value is not clamped to MAX_FOLIO_ORDER, so a
> sufficiently aligned range (e.g. physical range 0x800000000000-
> 0x800080000000, corresponding to start_pfn=0x800000000 with 35
> trailing zeros) can produce a shift larger than what
> memremap_pages() accepts, triggering a WARN and returning -EINVAL:
>
> WARNING: ... memremap_pages+0x512/0x650
> requested folio size unsupported
>
> The MAX_FOLIO_ORDER check was added by
> commit 646b67d57589 ("mm/memremap: reject unreasonable folio/compound
> page sizes in memremap_pages()").
> When CONFIG_HAVE_GIGANTIC_FOLIOS=y, CONFIG_SPARSEMEM_VMEMMAP=y, and
> CONFIG_HUGETLB_PAGE is not set, MAX_FOLIO_ORDER resolves to
> (PUD_SHIFT - PAGE_SHIFT) = 18. Any range whose PFN alignment exceeds
> order 18 hits this path.
I'm not clear on what point you are making with this specific
configuration that results in MAX_FOLIO_ORDER being 18. Is it just
an example? Is 18 the largest expected value for MAX_FOLIO_ORDER?
And note that PUD_SHIFT and PAGE_SHIFT might have different values
on arm64 with a page size other than 4K.
>
> Fix this by clamping vmemmap_shift to MAX_FOLIO_ORDER so we always
> request the largest order the kernel supports, rather than an
> out-of-range value.
>
> Also fix the error path to propagate the actual error code from
> devm_memremap_pages() instead of hard-coding -EFAULT, which was
> masking the real -EINVAL return.
>
> Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
> ---
> drivers/hv/mshv_vtl_main.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/mshv_vtl_main.c b/drivers/hv/mshv_vtl_main.c
> index 5856975f32e12..255fed3a740c1 100644
> --- a/drivers/hv/mshv_vtl_main.c
> +++ b/drivers/hv/mshv_vtl_main.c
> @@ -405,8 +405,12 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
> /*
> * Determine the highest page order that can be used for the given memory range.
> * This works best when the range is aligned; i.e. both the start and the length.
> + * Clamp to MAX_FOLIO_ORDER to avoid a WARN in memremap_pages() when the range
> + * alignment exceeds the maximum supported folio order for this kernel config.
> */
> - pgmap->vmemmap_shift = count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn);
> + pgmap->vmemmap_shift = min_t(unsigned long,
> + count_trailing_zeros(vtl0_mem.start_pfn | vtl0_mem.last_pfn),
> + MAX_FOLIO_ORDER);
Is it necessary to use min_t() here, or would min() work? Neither count_trailing_zeros()
nor MAX_FOLIO_ORDER is ever negative, so it seems like just min() would work with
no potential for doing a bogus comparison or assignment.
The shift is calculated using the originally passed in start_pfn and last_pfn, while the
"range" struct in pgmap has an "end" value that is one page less. So is the idea to
go ahead and create the mapping with folios of a size that includes that last page,
and then just waste the last page of the last folio?
> dev_dbg(vtl->module_dev,
> "Add VTL0 memory: start: 0x%llx, end_pfn: 0x%llx, page order: %lu\n",
> vtl0_mem.start_pfn, vtl0_mem.last_pfn, pgmap->vmemmap_shift);
> @@ -415,7 +419,7 @@ static int mshv_vtl_ioctl_add_vtl0_mem(struct mshv_vtl *vtl, void __user *arg)
> if (IS_ERR(addr)) {
> dev_err(vtl->module_dev, "devm_memremap_pages error: %ld\n", PTR_ERR(addr));
> kfree(pgmap);
> - return -EFAULT;
> + return PTR_ERR(addr);
> }
>
> /* Don't free pgmap, since it has to stick around until the memory
>
> base-commit: 36ece9697e89016181e5ae87510e40fb31d86f2b
> --
> 2.43.0
>
^ permalink raw reply
* RE: [PATCH 3/4] drivers: hv: mark bus attributes as const
From: Michael Kelley @ 2026-04-03 2:58 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260402-sysfs-const-hv-v1-3-a467d6f7726e@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Thursday, April 2, 2026 8:18 AM
>
> This attribute is never modified, mark it as const.
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> ---
> drivers/hv/vmbus_drv.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index d41b39ab628d..ecce6b72a2a2 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -639,9 +639,9 @@ static ssize_t hibernation_show(const struct bus_type *bus, char *buf)
> return sprintf(buf, "%d\n", !!hv_is_hibernation_supported());
> }
>
> -static BUS_ATTR_RO(hibernation);
> +static const BUS_ATTR_RO(hibernation);
>
> -static struct attribute *vmbus_bus_attrs[] = {
> +static const struct attribute *const vmbus_bus_attrs[] = {
> &bus_attr_hibernation.attr,
> NULL,
> };
>
> --
> 2.53.0
>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
^ permalink raw reply
* RE: [PATCH 2/4] drivers: hv: use ATTRIBUTE_GROUPS helper
From: Michael Kelley @ 2026-04-03 2:57 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260402-sysfs-const-hv-v1-2-a467d6f7726e@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Thursday, April 2, 2026 8:18 AM
>
> The current logic is equivalent to the helper.
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> ---
> drivers/hv/vmbus_drv.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 5f9b7cc9080c..d41b39ab628d 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -645,10 +645,7 @@ static struct attribute *vmbus_bus_attrs[] = {
> &bus_attr_hibernation.attr,
> NULL,
> };
> -static const struct attribute_group vmbus_bus_group = {
> - .attrs = vmbus_bus_attrs,
> -};
> -__ATTRIBUTE_GROUPS(vmbus_bus);
> +ATTRIBUTE_GROUPS(vmbus_bus);
>
> /*
> * vmbus_uevent - add uevent for our device
>
> --
> 2.53.0
>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
^ permalink raw reply
* RE: [PATCH 1/4] drivers: hv: mark chan_attr_ring_buffer as const
From: Michael Kelley @ 2026-04-03 2:52 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260402-sysfs-const-hv-v1-1-a467d6f7726e@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Thursday, April 2, 2026 8:18 AM
>
> The structure is never modified, mark it as const.
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> ---
> drivers/hv/vmbus_drv.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index bc4fc1951ae1..5f9b7cc9080c 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1959,7 +1959,7 @@ static int hv_mmap_ring_buffer_wrapper(struct file *filp,
> struct kobject *kobj,
> return channel->mmap_ring_buffer(channel, vma);
> }
>
> -static struct bin_attribute chan_attr_ring_buffer = {
> +static const struct bin_attribute chan_attr_ring_buffer = {
> .attr = {
> .name = "ring",
> .mode = 0600,
> @@ -1985,7 +1985,7 @@ static struct attribute *vmbus_chan_attrs[] = {
> NULL
> };
>
> -static const struct bin_attribute *vmbus_chan_bin_attrs[] = {
> +static const struct bin_attribute *const vmbus_chan_bin_attrs[] = {
> &chan_attr_ring_buffer,
> NULL
> };
>
> --
> 2.53.0
>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
^ permalink raw reply
* RE: [PATCH 0/4] drivers: hv: mark some sysfs attribute as const
From: Michael Kelley @ 2026-04-03 2:52 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260402-sysfs-const-hv-v1-0-a467d6f7726e@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Thursday, April 2, 2026 8:18 AM
>
> The sysfs attribute structures are never modified.
> Mark them as const where possible.
>
> This excludes the device attributes for now, as these will have their
> own migration.
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> ---
> Thomas Weißschuh (4):
> drivers: hv: mark chan_attr_ring_buffer as const
> drivers: hv: use ATTRIBUTE_GROUPS helper
> drivers: hv: mark bus attributes as const
> drivers: hv: mark channel attributes as const
>
> drivers/hv/vmbus_drv.c | 43 ++++++++++++++++++++-----------------------
> 1 file changed, 20 insertions(+), 23 deletions(-)
> ---
> base-commit: 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f
> change-id: 20260331-sysfs-const-hv-cad05718c68a
>
> Best regards,
> --
> Thomas Weißschuh <linux@weissschuh.net>
>
I built and tested a linux-next20260330 kernel with this patch set.
The testing was on a x86/x64 VM running on Hyper-V locally on my
laptop. I also applied a follow-on update to mark as const the remaining
4 channel attributes that were presumably missed in Patch 4 of this
series (see my comments there).
Everything worked fine with no issues.
For the series,
Tested-by: Michael Kelley <mhklinux@outlook.com>
^ permalink raw reply
* RE: [PATCH 4/4] drivers: hv: mark channel attributes as const
From: Michael Kelley @ 2026-04-03 2:48 UTC (permalink / raw)
To: Thomas Weißschuh, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
Dexuan Cui, Long Li
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260402-sysfs-const-hv-v1-4-a467d6f7726e@weissschuh.net>
From: Thomas Weißschuh <linux@weissschuh.net> Sent: Thursday, April 2, 2026 8:18 AM
>
> These attributes are never modified, mark them as const.
As Sashiko AI noted here [1], there are 4 channel attributes that you did not
mark as "const". Is there a reason, or is that just an oversight? There are a total
of 15 channel attributes, and this patch marks only 11 of them as const.
[1] https://sashiko.dev/#/patchset/20260402-sysfs-const-hv-v1-0-a467d6f7726e%40weissschuh.net
Michael
>
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
> ---
> drivers/hv/vmbus_drv.c | 30 +++++++++++++++---------------
> 1 file changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index ecce6b72a2a2..7ca038c1e5aa 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1864,7 +1864,7 @@ static ssize_t target_cpu_store(struct vmbus_channel *channel,
>
> return ret ?: count;
> }
> -static VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
> +static const VMBUS_CHAN_ATTR(cpu, 0644, target_cpu_show, target_cpu_store);
>
> static ssize_t channel_pending_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1873,7 +1873,7 @@ static ssize_t channel_pending_show(struct vmbus_channel *channel,
> channel_pending(channel,
> vmbus_connection.monitor_pages[1]));
> }
> -static VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
> +static const VMBUS_CHAN_ATTR(pending, 0444, channel_pending_show, NULL);
>
> static ssize_t channel_latency_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1882,19 +1882,19 @@ static ssize_t channel_latency_show(struct vmbus_channel *channel,
> channel_latency(channel,
> vmbus_connection.monitor_pages[1]));
> }
> -static VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
> +static const VMBUS_CHAN_ATTR(latency, 0444, channel_latency_show, NULL);
>
> static ssize_t channel_interrupts_show(struct vmbus_channel *channel, char *buf)
> {
> return sprintf(buf, "%llu\n", channel->interrupts);
> }
> -static VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
> +static const VMBUS_CHAN_ATTR(interrupts, 0444, channel_interrupts_show, NULL);
>
> static ssize_t channel_events_show(struct vmbus_channel *channel, char *buf)
> {
> return sprintf(buf, "%llu\n", channel->sig_events);
> }
> -static VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
> +static const VMBUS_CHAN_ATTR(events, 0444, channel_events_show, NULL);
>
> static ssize_t channel_intr_in_full_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1902,7 +1902,7 @@ static ssize_t channel_intr_in_full_show(struct vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->intr_in_full);
> }
> -static VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
> +static const VMBUS_CHAN_ATTR(intr_in_full, 0444, channel_intr_in_full_show, NULL);
>
> static ssize_t channel_intr_out_empty_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1910,7 +1910,7 @@ static ssize_t channel_intr_out_empty_show(struct vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->intr_out_empty);
> }
> -static VMBUS_CHAN_ATTR(intr_out_empty, 0444, channel_intr_out_empty_show, NULL);
> +static const VMBUS_CHAN_ATTR(intr_out_empty, 0444, channel_intr_out_empty_show, NULL);
>
> static ssize_t channel_out_full_first_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1918,7 +1918,7 @@ static ssize_t channel_out_full_first_show(struct vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->out_full_first);
> }
> -static VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show, NULL);
> +static const VMBUS_CHAN_ATTR(out_full_first, 0444, channel_out_full_first_show, NULL);
>
> static ssize_t channel_out_full_total_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1926,14 +1926,14 @@ static ssize_t channel_out_full_total_show(struct vmbus_channel *channel,
> return sprintf(buf, "%llu\n",
> (unsigned long long)channel->out_full_total);
> }
> -static VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show, NULL);
> +static const VMBUS_CHAN_ATTR(out_full_total, 0444, channel_out_full_total_show, NULL);
>
> static ssize_t subchannel_monitor_id_show(struct vmbus_channel *channel,
> char *buf)
> {
> return sprintf(buf, "%u\n", channel->offermsg.monitorid);
> }
> -static VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show, NULL);
> +static const VMBUS_CHAN_ATTR(monitor_id, 0444, subchannel_monitor_id_show, NULL);
>
> static ssize_t subchannel_id_show(struct vmbus_channel *channel,
> char *buf)
> @@ -1941,7 +1941,7 @@ static ssize_t subchannel_id_show(struct vmbus_channel *channel,
> return sprintf(buf, "%u\n",
> channel->offermsg.offer.sub_channel_index);
> }
> -static VMBUS_CHAN_ATTR_RO(subchannel_id);
> +static const VMBUS_CHAN_ATTR_RO(subchannel_id);
>
> static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
> const struct bin_attribute *attr,
> @@ -1963,7 +1963,7 @@ static const struct bin_attribute chan_attr_ring_buffer = {
> },
> .mmap = hv_mmap_ring_buffer_wrapper,
> };
> -static struct attribute *vmbus_chan_attrs[] = {
> +static const struct attribute *const vmbus_chan_attrs[] = {
> &chan_attr_out_mask.attr,
> &chan_attr_in_mask.attr,
> &chan_attr_read_avail.attr,
> @@ -1992,7 +1992,7 @@ static const struct bin_attribute *const vmbus_chan_bin_attrs[] = {
> * each attribute, and returns 0 if an attribute is not visible.
> */
> static umode_t vmbus_chan_attr_is_visible(struct kobject *kobj,
> - struct attribute *attr, int idx)
> + const struct attribute *attr, int idx)
> {
> const struct vmbus_channel *channel =
> container_of(kobj, struct vmbus_channel, kobj);
> @@ -2030,9 +2030,9 @@ static size_t vmbus_chan_bin_size(struct kobject *kobj,
> }
>
> static const struct attribute_group vmbus_chan_group = {
> - .attrs = vmbus_chan_attrs,
> + .attrs_const = vmbus_chan_attrs,
> .bin_attrs = vmbus_chan_bin_attrs,
> - .is_visible = vmbus_chan_attr_is_visible,
> + .is_visible_const = vmbus_chan_attr_is_visible,
> .is_bin_visible = vmbus_chan_bin_attr_is_visible,
> .bin_size = vmbus_chan_bin_size,
> };
>
> --
> 2.53.0
>
^ permalink raw reply
* [PATCH v2] PCI: hv: Allocate MMIO from above 4GB for the config window
From: Dexuan Cui @ 2026-04-02 23:43 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, lpieralisi, kwilczynski,
mani, robh, bhelgaas, jakeo, linux-hyperv, linux-pci,
linux-kernel, mhklinux, matthew.ruffell, kjlx
Cc: Krister Johansen, stable
There has been a longstanding MMIO conflict between the pci_hyperv
driver's config_window (see hv_allocate_config_window()) and the
hyperv_drm (or hyperv_fb) driver (see hyperv_setup_vram()): typically
both get MMIO from the low MMIO range below 4GB; this is not an issue
in the normal kernel since the VMBus driver reserves the framebuffer
MMIO range in vmbus_reserve_fb(), so the drm driver's hyperv_setup_vram()
can always get the reserved framebuffer MMIO; however, a Gen2 VM's
kdump kernel can fail to reserve the framebuffer MMIO in
vmbus_reserve_fb() because the screen_info.lfb_base is zero in the
kdump kernel due to several possible reasons (see the Link below for
more details):
1) on ARM64, the two syscalls (KEXEC_LOAD, KEXEC_FILE_LOAD) don't
initialize the screen_info.lfb_base for the kdump kernel;
2) on x86-64, the KEXEC_FILE_LOAD syscall initializes kdump kernel's
screen_info.lfb_base, but the KEXEC_LOAD syscall doesn't really do that
when the hyperv_drm driver loads, because the user-space kexec-tools
(i.e. the program 'kexec') doesn't recognize the hyperv_drm driver
(let's ignore the behavior of kexec-tools of very old versions).
When vmbus_reserve_fb() fails to reserve the framebuffer MMIO in the
kdump kernel, if pci_hyperv in the kdump kernel loads before hyperv_drm
loads, pci_hyperv's vmbus_allocate_mmio() gets the framebuffer MMIO
and tries to use it, but since the host thinks that the MMIO range is
still in use by hyperv_drm, the host refuses to accept the MMIO range
as the config window, and pci_hyperv's hv_pci_enter_d0() errors out,
e.g. an error can be "PCI Pass-through VSP failed D0 Entry with status
c0370048".
Typically, this pci_hyperv error in the kdump kernel was not fatal in
the past because the kdump kernel typically doesn't rely on pci_hyperv,
i.e. the root file system is on a VMBus SCSI device.
Now, a VM on Azure can boot from NVMe, i.e. the root file system can be
on a NVMe device, which depends on pci_hyperv. When the error occurs,
the kdump kernel fails to boot up since no root file system is detected.
Fix the MMIO conflict by allocating MMIO above 4GB for the config_window,
so it won't conflict with hyperv_drm's MMIO, which should be below the
4GB boundary. The size of config_window is small: it's only 8KB per PCI
device, so there should be sufficient MMIO space available above 4GB.
Note: we still need to figure out how to address the possible MMIO
conflict between hyperv_drm and pci_hyperv in the case of 32-bit PCI
MMIO BARs, but that's of low priority because all PCI devices available
to a Linux VM on Azure or on a modern host should use 64-bit BARs and
should not use 32-bit BARs -- I checked Mellanox VFs, MANA VFs, NVMe
devices, and GPUs in Linux VMs on Azure, and found no 32-bit BARs.
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Link: https://lore.kernel.org/all/SA1PR21MB692176C1BC53BFC9EAE5CF8EBF51A@SA1PR21MB6921.namprd21.prod.outlook.com/
Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>
Tested-by: Krister Johansen <johansen@templeofstupid.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
---
Changes since v1:
Updated the commit message and the comment to better explain
why screen_info.lfb_base can be 0 in the kdump kernel.
No code change since v1.
drivers/pci/controller/pci-hyperv.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 2c7a406b4ba8..1a79334ea9f4 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3403,9 +3403,26 @@ static int hv_allocate_config_window(struct hv_pcibus_device *hbus)
/*
* Set up a region of MMIO space to use for accessing configuration
- * space.
+ * space. Use the high MMIO range to not conflict with the hyperv_drm
+ * driver (which normally gets MMIO from the low MMIO range) in the
+ * kdump kernel of a Gen2 VM, which may fail to reserve the framebuffer
+ * MMIO range in vmbus_reserve_fb() due to screen_info.lfb_base being
+ * zero in the kdump kernel:
+ *
+ * on ARM64, the two syscalls (KEXEC_LOAD, KEXEC_FILE_LOAD) don't
+ * initialize the screen_info.lfb_base for the kdump kernel;
+ *
+ * on x86-64, the KEXEC_FILE_LOAD syscall initializes kdump kernel's
+ * screen_info.lfb_base (see bzImage64_load() -> setup_boot_parameters())
+ * but the KEXEC_LOAD syscall doesn't really do that when the hyperv_drm
+ * driver loads, because the user-space program 'kexec' doesn't
+ * recognize hyperv_drm: see the function setup_linux_vesafb() in the
+ * kexec-tools.git repo. Note: old versions of kexec-tools, e.g.
+ * v2.0.18, initialize screen_info.lfb_base if the hyperv_fb driver
+ * loads, but hyperv_fb is deprecated and has been removed from the
+ * mainline kernel.
*/
- ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, 0, -1,
+ ret = vmbus_allocate_mmio(&hbus->mem_config, hbus->hdev, SZ_4G, -1,
PCI_CONFIG_MMIO_LENGTH, 0x1000, false);
if (ret)
return ret;
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v2] Drivers: hv: mshv: fix integer overflow in memory region overlap check
From: Stanislav Kinsburskii @ 2026-04-02 23:25 UTC (permalink / raw)
To: Junrui Luo
Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
Nuno Das Neves, Anirudh Rayabharam, Mukesh Rathor, Muminul Islam,
Praveen K Paladugu, Jinank Jain, linux-hyperv, linux-kernel,
Yuhao Jiang, Roman Kisel, stable
In-Reply-To: <SYBPR01MB788138A30BC69B0F5C3316E5AF54A@SYBPR01MB7881.ausprd01.prod.outlook.com>
On Sat, Mar 28, 2026 at 05:18:45PM +0800, Junrui Luo wrote:
> mshv_partition_create_region() computes mem->guest_pfn + nr_pages to
> check for overlapping regions without verifying u64 wraparound. A
> sufficiently large guest_pfn can cause the addition to overflow,
> bypassing the overlap check and allowing creation of regions that wrap
> around the address space.
>
> Fix by using check_add_overflow() to reject such regions early, and
> validate that the region end does not exceed MAX_PHYSMEM_BITS. These
> checks also protect downstream callers that compute start_gfn +
> nr_pages on stored regions without overflow guards.
>
> Fixes: 621191d709b1 ("Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs")
> Reported-by: Yuhao Jiang <danisjiang@gmail.com>
> Suggested-by: Roman Kisel <romank@linux.microsoft.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
> ---
> Changes in v2:
> - Add a maximum check suggested by Roman Kisel
> - Link to v1: https://lore.kernel.org/all/SYBPR01MB7881689C0F58149DD986A6D1AF49A@SYBPR01MB7881.ausprd01.prod.outlook.com/
> ---
> drivers/hv/mshv_root_main.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 6f42423f7faa..32826247dbce 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -1174,11 +1174,20 @@ static int mshv_partition_create_region(struct mshv_partition *partition,
> {
> struct mshv_mem_region *rg;
> u64 nr_pages = HVPFN_DOWN(mem->size);
> + u64 new_region_end;
> +
> + /* Reject regions whose end address would wrap around */
> + if (check_add_overflow(mem->guest_pfn, nr_pages, &new_region_end))
> + return -EOVERFLOW;
> +
> + /* Reject regions beyond the maximum physical address */
nit: both comments are redundant - the meaning is clear from the code
itself.
> + if (new_region_end > HVPFN_DOWN(1ULL << MAX_PHYSMEM_BITS))
This maximum value check bugs me a bit.
First of all, why does it matter what is the region end? Potentially, there can be
regions not backed by host address space (leave alone host RAM), so why
intropducing this limitation?
Second, this check takes a host-specific constant (MAX_PHYSMEM_BITS) and rounds it down
to hypervisor-specific units which may not be aligned with the host page
size. Should this be host pages instead?
Thanks,
Stanislav
> + return -EINVAL;
>
> /* Reject overlapping regions */
> spin_lock(&partition->pt_mem_regions_lock);
> hlist_for_each_entry(rg, &partition->pt_mem_regions, hnode) {
> - if (mem->guest_pfn + nr_pages <= rg->start_gfn ||
> + if (new_region_end <= rg->start_gfn ||
> rg->start_gfn + rg->nr_pages <= mem->guest_pfn)
> continue;
> spin_unlock(&partition->pt_mem_regions_lock);
>
> ---
> base-commit: c369299895a591d96745d6492d4888259b004a9e
> change-id: 20260328-fixes-0296eb3dbb52
>
> Best regards,
> --
> Junrui Luo <moonafterrain@outlook.com>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox