Linux-HyperV List

Linux-HyperV List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG
From: Erni Sri Satya Vennela @ 2026-04-10  5:16 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260326174815.2012137-1-ernis@linux.microsoft.com>

On Thu, Mar 26, 2026 at 10:48:10AM -0700, Erni Sri Satya Vennela wrote:
> As a part of MANA hardening for CVM, validate that max_num_sq and
> max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
> values flow into apc->num_queues, which is used as an allocation count
> and loop bound. A zero value would result in zero-size allocations and
> incorrect driver behavior.
> 
> Return -EPROTO if either value is zero.
> 
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index b39e8b920791..a4197b4b0597 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index,
>  
>  	*max_sq = resp.max_num_sq;
>  	*max_rq = resp.max_num_rq;
> +
> +	if (*max_sq == 0 || *max_rq == 0) {
> +		netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
> +		return -EPROTO;
> +	}
> +
>  	if (resp.num_indirection_ent > 0 &&
>  	    resp.num_indirection_ent <= MANA_INDIRECT_TABLE_MAX_SIZE &&
>  	    is_power_of_2(resp.num_indirection_ent)) {
> -- 
> 2.34.1

Hi,

Gentle reminder regarding this patch.

I would really appreciate any feedback whenever you get a chance.
Please let me know if any changes are required from my side.

Thanks for your time.

Regards,
Vennela


^ permalink raw reply

* Re: [PATCH 04/61] ext4: Prefer IS_ERR_OR_NULL over manual NULL check
From: Theodore Ts'o @ 2026-04-10 15:18 UTC (permalink / raw)
  To: amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel, dri-devel,
	gfs2, intel-gfx, intel-wired-lan, iommu, kvm, linux-arm-kernel,
	linux-block, linux-bluetooth, linux-btrfs, linux-cifs, linux-clk,
	linux-erofs, linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv,
	linux-input, linux-kernel, linux-leds, linux-media, linux-mips,
	linux-mm, linux-modules, linux-mtd, linux-nfs, linux-omap,
	linux-phy, linux-pm, linux-rockchip, linux-s390, linux-scsi,
	linux-sctp, linux-security-module, linux-sh, linux-sound,
	linux-stm32, linux-trace-kernel, linux-usb, linux-wireless,
	netdev, ntfs3, samba-technical, sched-ext, target-devel,
	tipc-discussion, v9fs, Philipp Hahn
  Cc: Theodore Ts'o, Andreas Dilger
In-Reply-To: <20260310-b4-is_err_or_null-v1-4-bd63b656022d@avm.de>


On Tue, 10 Mar 2026 12:48:30 +0100, Philipp Hahn wrote:
> Prefer using IS_ERR_OR_NULL() over using IS_ERR() and a manual NULL
> check.
> 
> Change generated with coccinelle.

Applied, thanks!

[04/61] ext4: Prefer IS_ERR_OR_NULL over manual NULL check
        commit: 1d749e110277ce4103f27bd60d6181e52c0cc1e3

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>

^ permalink raw reply

* Re: [EXTERNAL] Re: [PATCH rdma-next v2] RDMA/mana_ib: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Jason Gunthorpe @ 2026-04-10 15:43 UTC (permalink / raw)
  To: Long Li
  Cc: Leon Romanovsky, Erni Sri Satya Vennela, Konstantin Taranov,
	linux-rdma@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <SA1PR21MB66833EBAF447BA0B102862FCCE4DA@SA1PR21MB6683.namprd21.prod.outlook.com>

On Sat, Mar 21, 2026 at 12:56:39AM +0000, Long Li wrote:

> How we rephrase this in this way: the driver should not corrupt or
> overflow other parts of the kernel if its device is misbehaving (or
> has a bug).

If we are going to do this CC hardening stuff I think I want to see a
more comphrensive approach, like if we detect an attack then the
kernel instantly crashes or something. Or at least an approach in
general agreed to by the CC and kernel community.

Igoring the issue and continuing seems just wrong.

This sprinkling of random checks in this series doesn't feel
comprehensive or cohesive to me.

Jason

^ permalink raw reply

* Re: [EXTERNAL] Re: [PATCH rdma-next 0/8] RDMA/mana_ib: Handle service reset for RDMA resources
From: Jason Gunthorpe @ 2026-04-10 15:49 UTC (permalink / raw)
  To: Long Li
  Cc: Leon Romanovsky, Konstantin Taranov, Jakub Kicinski,
	David S . Miller, Paolo Abeni, Eric Dumazet, Andrew Lunn,
	Haiyang Zhang, KY Srinivasan, Wei Liu, Dexuan Cui, Simon Horman,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SA1PR21MB66832D0A369DE7E411ACCDEDCE41A@SA1PR21MB6683.namprd21.prod.outlook.com>

On Tue, Mar 17, 2026 at 11:43:49PM +0000, Long Li wrote:

>    Today a DPC event on one NIC kills all RDMA connections and can
>    crash entire training jobs. 

All rdma connections on that nic, right?

>    If the ib_device persists and the driver
>    recreates firmware resources after recovery, raw verbs users can
>    resume without full teardown, and RDMA-CM users get the same
>    disconnect/reconnect behavior they have today.

No, I don't think this is feasible. There is too much state, the
kernel cannot just recreate things and transparently keep going
without userspace handshaking this. IMHO It is just the wrong model.

We have always gone for the model that userspace has to be involved in
the RAS and it has to recreate its operations on a fresh new verbs
FD. I think anything else is going to be so complicated and fragile.

I can't see any sensible way an already open verbs FD can survive a
device reset.

Jason

^ permalink raw reply

* RE: [EXTERNAL] Re: [PATCH rdma-next v2] RDMA/mana_ib: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Long Li @ 2026-04-10 22:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Erni Sri Satya Vennela, Konstantin Taranov,
	linux-rdma@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20260410154327.GA2551565@ziepe.ca>

> On Sat, Mar 21, 2026 at 12:56:39AM +0000, Long Li wrote:
> 
> > How we rephrase this in this way: the driver should not corrupt or
> > overflow other parts of the kernel if its device is misbehaving (or
> > has a bug).
> 
> If we are going to do this CC hardening stuff I think I want to see a more
> comphrensive approach, like if we detect an attack then the kernel instantly
> crashes or something. Or at least an approach in general agreed to by the CC and
> kernel community.
> 
> Igoring the issue and continuing seems just wrong.
> 
> This sprinkling of random checks in this series doesn't feel comprehensive or
> cohesive to me.
> 
> Jason

Can we follow the virtio BAD_RING()/vq->broken pattern in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/virtio/virtio_ring.c#n57.

Add a broken flag to mana_ib_dev. When any hardware response contains out-of-range values, mark the device broken and fail the operation - during probe this prevents device registration entirely, at runtime all subsequent operations return -EIO.

Long

^ permalink raw reply

* Re: [PATCH v2] Drivers: hv: mshv: fix integer overflow in memory region overlap check
From: vdso @ 2026-04-11  5:05 UTC (permalink / raw)
  To: Junrui Luo, Stanislav Kinsburskii
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Nuno Das Neves, Anirudh Rayabharam, Mukesh Rathor, Muminul Islam,
	Praveen K Paladugu, Jinank Jain, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yuhao Jiang, stable@vger.kernel.org
In-Reply-To: <89730D18-D9A3-4A18-87DD-E7A51625FF69@outlook.com>

> On 04/09/2026 8:06 PM PDT Junrui Luo <moonafterrain@outlook.com> wrote:
> 
>  
> On Thu, Apr 02, 2026 at 04:25:02PM -0700, Stanislav Kinsburskii wrote:
> > nit: both comments are redundant - the meaning is clear from the code
> > itself.
> 
> I will drop them in v3.
> 
> > This maximum value check bugs me a bit.
> > 
> > First of all, why does it matter what is the region end? Potentially, there can be
> > regions not backed by host address space (leave alone host RAM), so why
> > intropducing this limitation?
> > 
> > Second, this check takes a host-specific constant (MAX_PHYSMEM_BITS) and rounds it down
> > to hypervisor-specific units which may not be aligned with the host page
> > size. Should this be host pages instead?
>  
> This check was suggested by Roman in v1 review. Roman, could you
> share your thoughts on Stanislav's concerns? I'd like to align on whether an upper
> bound check is needed here.

Hey Junrui, Stanislav,

The idea was that there is a max PFN/GFN going over which is _architecturally_ impossible.
It might be ((1 << 52) - 1) << 12 or ((1 << 48) - 1) << 12 or similar depending on how
many bits are used for the guest phys. address (52, 48, 36) and page sizes. The gist is
that max is way below 0xFFFF_FFFF_FFFF_FFFF (1<<64 - 1) used for overflow checking in the
original v1 patch. Hence, instead of checking for overflow, one may check for that max
PFN/GFN catching way more bad input. Checking for that max also feels as more
domain-specific compared to the "generic" overflow check.

I went for the trade-off where the "impossible" PFN/GFN is computed simply as
(1 << max_possible_bits_in_the_address)/the_min_possible_page_size. Again, that's simple
and better than the oveflow check as it catches the region with PFNs/GFNs that just cannot
be used. I agree that may overshoot, and an even more specific check would have to fetch the
CPUID for the VM (or its ARM64 MMFR regs), etc. to get more specifics and a lower bound
on the bad GFN (but at the cost of more computation).

All in all, from the three options of (generic check for overflow, simple check
for arch bad PFNs/GFNs, an elaborated check with all specifics) I suggested the simple check.
Fast and still more useful than checking for overflow in my opinion.

Below I go into various details on "why impossible" and these 48 and 52 bits, and I most
likely will be re-iterating things that you know - just for the convenience of other
folks who don't play with MMU/EPT/etc often, and might find the topic interesting.
It may happen that I'll also learn something from your critique, thanks in advance :)

That max PFN/GFN is dictated by the maximum number of bits allowed to be used in
the physical address. Above that, an address would not be mapped or cached by
the machines that run the code in question. The page tables won't work as the number
of bits used for the physical address here is the sum of the bits used to index
into a page table + the offset bits. That also gives the number of bits in the virtual
address which the CPU needs to address the memory (again, in this modern case;
there were many years ago machines with 32 bit virt addresses and 36 bit phys
addresses used via PAE/AWE).

E.g. for the x64_86 and 48 bits in the phys addresses: the VAs have 9+9+9+9 bits per
each page table hierarchy + 12 bits for the page offset in 4KiB (1<<12) pages. That
also can be played as 9+9+9 and 21 bits which gives 3 level paging and 2MiB (1<<21)
pages but still _48_ bits for the VA and the phys. address. Sure can be 9+9 bits for
2-level pages and 30 bits for the page offset (1GiB pages, 1 << 30). Still, _48_
bits. Similar aruphmetic goes for _52_ bit addresses but going to need 5-level pages
this time.

Now, can you have 53 bits in the phys address (to address 1<<53 bytes)? No, you
cannot as that doesn't work with the existing x86_64 MMUs (and ARM64) - no virtual
address can be constructed to address 53 bits - not enough levels in the page table
hierarchy. As no VA can be constructed, the code won't be able to access the data.

Neither the EPT/2nd level set up by the hypervisor, nor for 1st level translation
set up by the guest or the host can go higher than 52 bits, even for MMIO. When MMIO
happens, that's a 2nd level fault/not present page and goes to the hypervisor but
_still_ the hv builds the EPT/2nd level map by the rules of the architecture and
cannot go above the architecturally imposed limits.

--
Cheers,
Roman

> 
> Thanks,
> Junrui Luo

^ permalink raw reply

* Re: [PATCH net-next v6] net: mana: Expose hardware diagnostic info via debugfs
From: Simon Horman @ 2026-04-12 12:49 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, kotaranov, shradhagupta, shirazsaleem,
	yury.norov, kees, ssengar, dipayanroy, gargaditya, linux-hyperv,
	netdev, linux-kernel, linux-rdma
In-Reply-To: <20260408081555.302620-1-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:15:46AM -0700, Erni Sri Satya Vennela wrote:
> Add debugfs entries to expose hardware configuration and diagnostic
> information that aids in debugging driver initialization and runtime
> operations without adding noise to dmesg.
> 
> The debugfs directory for each PCI device is named using pci_name()
> (the unique BDF address), and its creation and removal is integrated
> into mana_gd_setup() and mana_gd_cleanup_device() respectively, so
> that all callers (probe, remove, suspend, resume, shutdown) share a
> single code path.
> 
> Device-level entries (under /sys/kernel/debug/mana/<BDF>/):
>   - num_msix_usable, max_num_queues: Max resources from hardware
>   - gdma_protocol_ver, pf_cap_flags1: VF version negotiation results
>   - num_vports, bm_hostmode: Device configuration
> 
> Per-vPort entries (under /sys/kernel/debug/mana/<BDF>/vportN/):
>   - port_handle: Hardware vPort handle
>   - max_sq, max_rq: Max queues from vPort config
>   - indir_table_sz: Indirection table size
>   - steer_rx, steer_rss, steer_update_tab, steer_cqe_coalescing:
>     Last applied steering configuration parameters
> 
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> This patch depends on the following fixes submitted to net:
>   - "net: mana: Use pci_name() for debugfs directory naming"
>   - "net: mana: Move current_speed debugfs file to mana_init_port()"
> Conflict resolution may be needed when net merges into net-next.

Unfortunately this patch doesn't apply to net-next,
which is a requirement for our workflow.

-- 
pw-bot: changes-requested

^ permalink raw reply

* Re: [PATCH net 1/2] net: mana: Use pci_name() for debugfs directory naming
From: Simon Horman @ 2026-04-12 12:52 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shradhagupta, kees, kotaranov, yury.norov, linux-hyperv, netdev,
	linux-kernel
In-Reply-To: <20260408081224.302308-2-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:12:19AM -0700, Erni Sri Satya Vennela wrote:
> Use pci_name(pdev) for the per-device debugfs directory instead of
> hardcoded "0" for PFs and pci_slot_name(pdev->slot) for VFs. The
> previous approach had two issues:
> 
> 1. pci_slot_name() dereferences pdev->slot, which can be NULL for VFs
>    in environments like generic VFIO passthrough or nested KVM,
>    causing a NULL pointer dereference.
> 
> 2. Multiple PFs would all use "0", and VFs across different PCI
>    domains or buses could share the same slot name, leading to
>    -EEXIST errors from debugfs_create_dir().
> 
> pci_name(pdev) returns the unique BDF address, is always valid, and is
> unique across the system.
> 
> Fixes: 6607c17c6c5e ("net: mana: Enable debugfs files for MANA device")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net 2/2] net: mana: Move current_speed debugfs file to mana_init_port()
From: Simon Horman @ 2026-04-12 12:52 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shradhagupta, kees, kotaranov, yury.norov, linux-hyperv, netdev,
	linux-kernel
In-Reply-To: <20260408081224.302308-3-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:12:20AM -0700, Erni Sri Satya Vennela wrote:
> Move the current_speed debugfs file creation from mana_probe_port() to
> mana_init_port(). The file was previously created only during initial
> probe, but mana_cleanup_port_context() removes the entire vPort debugfs
> directory during detach/attach cycles. Since mana_init_port() recreates
> the directory on re-attach, moving current_speed here ensures it survives
> these cycles.
> 
> Fixes: 75cabb46935b ("net: mana: Add support for net_shaper_ops")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net 0/2] net: mana: Fix debugfs directory naming and file lifecycle
From: patchwork-bot+netdevbpf @ 2026-04-12 18:40 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shradhagupta, kees, kotaranov, yury.norov, linux-hyperv, netdev,
	linux-kernel
In-Reply-To: <20260408081224.302308-1-ernis@linux.microsoft.com>

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed,  8 Apr 2026 01:12:18 -0700 you wrote:
> This series fixes two pre-existing debugfs issues in the MANA driver.
> 
> Patch 1 fixes the per-device debugfs directory naming to use the unique
> PCI BDF address via pci_name(), avoiding a potential NULL pointer
> dereference when pdev->slot is NULL (e.g. VFIO passthrough, nested KVM)
> and preventing name collisions across multiple PFs or VFs.
> 
> [...]

Here is the summary with links:
  - [net,1/2] net: mana: Use pci_name() for debugfs directory naming
    https://git.kernel.org/netdev/net/c/c116f07ab9d2
  - [net,2/2] net: mana: Move current_speed debugfs file to mana_init_port()
    https://git.kernel.org/netdev/net/c/3b7c7fc97aea

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v6 0/2] net: mana: add ethtool private flag for full-page RX buffers
From: Jakub Kicinski @ 2026-04-12 19:59 UTC (permalink / raw)
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, leitao, kees, john.fastabend,
	hawk, bpf, daniel, ast, sdf, dipayanroy
In-Reply-To: <20260409183509.0b24dea6@kernel.org>

On Thu, 9 Apr 2026 18:35:09 -0700 Jakub Kicinski wrote:
> On Tue,  7 Apr 2026 12:59:17 -0700 Dipayaan Roy wrote:
> > This behavior is observed on a single platform; other platforms
> > perform better with page_pool fragments, indicating this is not a
> > page_pool issue but platform-specific.  
> 
> Well, someone has to run some experiments and confirm other ARM
> platforms are not impacted, with data. I was hoping to do it myself
> but doesn't look like that will happen in time for the merge window :(

Please repost with the perf analysis on other commercially available
ARM platform. Something like:

  This is a workaround applicable to only some platforms. Modifying
  driver X to use a similar workaround on [Ampere Max|nVidia
  Grace|Amazon Graviton 3|..] the performance for split pages is
  y% higher than when using single pages.
-- 
pw-bot: cr

^ permalink raw reply

* [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

Fix four pre-existing bugs in mana_probe()/mana_remove() error handling
that can cause warnings on uninitialized work structs, masked errors,
and resource leaks when early probe steps fail.

Patches 1-2 move work struct initialization (link_change_work and
gf_stats_work) to before any error path that could trigger
mana_remove(), preventing WARN_ON in __flush_work() or debug object
warnings when sync cancellation runs on uninitialized work structs.

Patch 3 prevents add_adev() from overwriting a port probe error,
which could leave the driver in a broken state with NULL ports while
reporting success.

Patch 4 changes 'goto out' to 'break' in mana_remove()'s port loop
so that mana_destroy_eq() is always reached, preventing EQ leaks when
a NULL port is encountered.
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
Erni Sri Satya Vennela (4):
  net: mana: Init link_change_work before potential error paths in probe
  net: mana: Init gf_stats_work before potential error paths in probe
  net: mana: Don't overwrite port probe error with add_adev result
  net: mana: Fix EQ leak in mana_remove on NULL port

 drivers/net/ethernet/microsoft/mana/mana_en.c | 28 +++++++++----------
 1 file changed, 14 insertions(+), 14 deletions(-)

-- 
2.34.1

^ permalink raw reply

* [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260413050843.605789-1-ernis@linux.microsoft.com>

Move INIT_WORK(link_change_work) to right after the mana_context
allocation, before any error path that could reach mana_remove().

Previously, if mana_create_eq() or mana_query_device_cfg() failed,
mana_probe() would jump to the error path which calls mana_remove().
mana_remove() unconditionally calls disable_work_sync(link_change_work),
but the work struct had not been initialized yet. This can trigger
CONFIG_DEBUG_OBJECTS_WORK enabled.

Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 07630322545f..57f146ea6f66 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,6 +3631,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 		ac->gdma_dev = gd;
 		gd->driver_data = ac;
+
+		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	}
 
 	err = mana_create_eq(ac);
@@ -3648,8 +3650,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 	if (!resuming) {
 		ac->num_ports = num_ports;
-
-		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	} else {
 		if (ac->num_ports != num_ports) {
 			dev_err(dev, "The number of vPorts changed: %d->%d\n",
-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260413050843.605789-1-ernis@linux.microsoft.com>

Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
while keeping schedule_delayed_work() at its original location.

Previously, if any function between mana_create_eq() and the
INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
in __flush_work() or debug object warnings with
CONFIG_DEBUG_OBJECTS_WORK enabled.

Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 57f146ea6f66..f6ad46736418 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3635,6 +3635,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	}
 
+	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
 	err = mana_create_eq(ac);
 	if (err) {
 		dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3709,7 +3711,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 	err = add_adev(gd, "eth");
 
-	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
 	schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
 
 out:
-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260413050843.605789-1-ernis@linux.microsoft.com>

In mana_probe(), if mana_probe_port() fails for any port, the error
is stored in 'err' and the loop breaks. However, the subsequent
unconditional 'err = add_adev(gd, "eth")' overwrites this error.
If add_adev() succeeds, mana_probe() returns success despite ports
being left in a partially initialized state (ac->ports[i] == NULL).

Only call add_adev() when there is no prior error, so the probe
correctly fails and triggers mana_remove() cleanup.

Fixes: ced82fce77e9 ("net: mana: Probe rdma device in mana driver")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f6ad46736418..1a141c46ac27 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3680,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 	if (!resuming) {
 		for (i = 0; i < ac->num_ports; i++) {
 			err = mana_probe_port(ac, i, &ac->ports[i]);
-			/* we log the port for which the probe failed and stop
-			 * probes for subsequent ports.
-			 * Note that we keep running ports, for which the probes
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the probe failed, stop probing
+			 * subsequent ports, and skip add_adev.
+			 * Already-probed ports remain functional.
 			 */
 			if (err) {
 				dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3697,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 			enable_work(&apc->queue_reset_work);
 			err = mana_attach(ac->ports[i]);
 			rtnl_unlock();
-			/* we log the port for which the attach failed and stop
-			 * attach for subsequent ports
-			 * Note that we keep running ports, for which the attach
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the attach failed, stop
+			 * attaching subsequent ports, and skip add_adev.
+			 * Already-attached ports remain functional.
 			 */
 			if (err) {
 				dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3709,7 +3707,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 		}
 	}
 
-	err = add_adev(gd, "eth");
+	if (!err)
+		err = add_adev(gd, "eth");
 
 	schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
 
-- 
2.34.1


^ permalink raw reply related

* [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260413050843.605789-1-ernis@linux.microsoft.com>

In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq().

This can happen when mana_probe_port() fails for port 0, leaving
ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove()
hits the NULL entry and jumps past mana_destroy_eq().

Change 'goto out' to 'break' so the for-loop exits normally and
mana_destroy_eq() is always reached. Remove the now-unreferenced out:
label.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1a141c46ac27..97237d137cbf 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3747,7 +3747,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 		if (!ndev) {
 			if (i == 0)
 				dev_err(dev, "No net device to remove\n");
-			goto out;
+			break;
 		}
 
 		apc = netdev_priv(ndev);
@@ -3778,7 +3778,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 	}
 
 	mana_destroy_eq(ac);
-out:
+
 	if (ac->per_port_queue_reset_wq) {
 		destroy_workqueue(ac->per_port_queue_reset_wq);
 		ac->per_port_queue_reset_wq = NULL;
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH v4 19/21] uio: replace deprecated mmap hook with mmap_prepare in uio_info
From: Shinichiro Kawasaki @ 2026-04-13  5:14 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle)
  Cc: Andrew Morton, Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com,
	linux-arm-kernel@lists.infradead.org,
	linux-mtd@lists.infradead.org, linux-staging@lists.linux.dev,
	linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Ryan Roberts
In-Reply-To: <157583e4477705b496896c7acd4ac88a937b8fa6.1774045440.git.ljs@kernel.org>

On Mar 20, 2026 / 22:39, Lorenzo Stoakes (Oracle) wrote:
> The f_op->mmap interface is deprecated, so update uio_info to use its
> successor, mmap_prepare.
> 
> Therefore, replace the uio_info->mmap hook with a new
> uio_info->mmap_prepare hook, and update its one user, target_core_user,
> to both specify this new mmap_prepare hook and also to use the new
> vm_ops->mapped() hook to continue to maintain a correct udev->kref
> refcount.
> 
> Then update uio_mmap() to utilise the mmap_prepare compatibility layer to
> invoke this callback from the uio mmap invocation.
> 
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Hello Lorenzo, since two weeks ago, I observe a failure during my kernel test
set targeting Linux for-next branch. On failure, kernel reported a WARN at
__vma_check_mmap_hook [1]. I bisected and found that this patch is the trigger.
Here I share my observations of the failure. Actions or advices for fix will be
appreciated.


The failure happens when TCMU device is set up with targetcli. When tcmu-runner
is running, the command lines below should successfully create a backstore for a
TCMU device, but it fails.

  $ sudo targetcli
  targetcli shell version 3.0.1
  Copyright 2011-2013 by Datera, Inc and others.
  For help on commands, type 'help'.
  
  /> cd /backstores/user:zbc
  /backstores/user:zbc> create name=test size=1M cfgstring=@/tmp/tmp.img
  UserBackedStorageObject creation failed.

On failure, tcmu-runner reports mmap failures:

  2026-04-13 12:23:49.271 1103 [CRIT] main:1302: Starting...
  2026-04-13 12:23:49.461 1103 [INFO] load_our_module:575: Inserted module 'target_core_user'
  2026-04-13 12:23:49.480 1103 [INFO] tcmur_register_handler:92: Handler fbo is registered
  2026-04-13 12:23:49.486 1103 [INFO] tcmur_register_handler:92: Handler zbc is registered
  2026-04-13 12:23:51.202 1103 [INFO] tcmur_register_handler:92: Handler rbd is registered
  2026-04-13 12:27:24.522 1103 [ERROR] device_open_shm:523: could not mmap /dev/uio0
  2026-04-13 12:27:24.550 1103 [ERROR] device_open_shm:523: could not mmap /dev/uio0

The failure was found with user:zbc handler. I confirmed that the failure is
recreated with fbo handler also. Then, this failure looks common for all TCMU
users.

At the failrue, kernel reported the WARN at __vma_check_mmap_hook [1]. The line
1287 in mm/util.c reported the WARN:

  1284 int __vma_check_mmap_hook(struct vm_area_struct *vma)
  1285 {
  1286         /* vm_ops->mapped is not valid if mmap() is specified. */
  1287         if (vma->vm_ops && WARN_ON_ONCE(vma->vm_ops->mapped))
  1288                 return -EINVAL;
  1289
  1290         return 0;
  1291 }
  1292 EXPORT_SYMBOL(__vma_check_mmap_hook);

When I reverted the commit from the kernel tag next-20260409, the failrue
disappeared.

If other information is required for fix, please let me know. Thanks in advance.


[1] dmesg

WARNING: mm/util.c:1287 at __vma_check_mmap_hook+0x61/0x90, CPU#0: tcmu-runner/1332
Modules linked in: target_core_pscsi target_core_file target_core_iblock xfs target_core_user target_core_mod rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security nf_tables ip6table_filter ip6_tables iptable_filter ip_tables qrtr irdma intel_rapl_msr intel_rapl_common ice intel_uncore_frequency intel_uncore_frequency_common skx_edac skx_edac_common libie_fwlog nfit sunrpc gnss idpf libnvdimm x86_pkg_temp_thermal libeth_xdp intel_powerclamp libeth ib_core spi_nor mtd coretemp kvm_intel kvm i40e iTCO_wdt irqbypass intel_pmc_bxt rapl ses vfat intel_cstate libie fat intel_uncore libie_adminq enclosure i2c_i801 spi_intel_pci i2c_smbus spi_intel lpc_ich wmi joydev mei_me ioatdma acpi_power_meter acpi_pad mei intel_pch_thermal dca fuse loop dm_multipath nfnetlink zram
 lz4hc_compress lz4_compress zstd_compress ast drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper nvme drm mpi3mr nvme_core scsi_transport_sas nvme_keyring nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser i2c_dev [last unloaded: null_blk]
CPU: 0 UID: 0 PID: 1332 Comm: tcmu-runner Not tainted 7.0.0-rc6-next-20260401-kts #1 PREEMPT(lazy) 
Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.5 05/18/2021
RIP: 0010:__vma_check_mmap_hook+0x61/0x90
Code: 00 00 00 00 fc ff df 48 8d 78 10 48 89 f9 48 c1 e9 03 80 3c 11 00 75 2a 48 83 78 10 00 75 0b 31 c0 48 83 c4 08 c3 cc cc cc cc <0f> 0b b8 ea ff ff ff eb ee 48 89 04 24 e8 6d 4c 1f 00 48 8b 04 24
RSP: 0018:ffff8881391f7488 EFLAGS: 00010282
RAX: ffffffffc2abca40 RBX: 0000000000000000 RCX: 1ffffffff855794a
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffffffffc2abca50
RBP: ffff8881391f76a0 R08: ffffffffa10016e9 R09: ffffed102723ee44
R10: ffffed102723ee45 R11: 0000000000000000 R12: ffff8881391f78e0
R13: ffff8881391f78f0 R14: ffff88810d1ec780 R15: ffff8881391f7a78
FS:  00007f154f1a9840(0000) GS:ffff888e9b440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5efd40fe88 CR3: 000000010cfbf005 CR4: 00000000007726f0
PKRU: 55555554
Call Trace:
 <TASK>
 __mmap_new_vma+0x116e/0x18d0
 ? __pfx___mmap_new_vma+0x10/0x10
 ? vma_merge_new_range+0x495/0xa00
 ? __pfx_vma_merge_new_range+0x10/0x10
 ? lock_acquire+0x126/0x140
 __mmap_region+0x651/0xa00
 ? __pfx_process_measurement+0x10/0x10
 ? __pfx___mmap_region+0x10/0x10
 ? __lock_acquire+0x55d/0xbd0
 ? __lock_acquire+0x55d/0xbd0
 ? lock_is_held_type+0x9a/0x110
 ? mas_find+0xc9/0x690
 ? arch_get_unmapped_area_topdown+0x2a7/0x890
 mmap_region+0x3c2/0x4c0
 ? __pfx_mmap_region+0x10/0x10
 ? security_mmap_addr+0x54/0xd0
 ? __get_unmapped_area+0x18c/0x300
 ? __pfx_uio_mmap+0x10/0x10
 do_mmap+0xa26/0x10f0
 ? lock_acquire+0x126/0x140
 ? __pfx_do_mmap+0x10/0x10
 ? __pfx_down_write_killable+0x10/0x10
 ? __lock_acquire+0x55d/0xbd0
 vm_mmap_pgoff+0x218/0x3a0
 ? __pfx_vm_mmap_pgoff+0x10/0x10
 ? __fget_files+0x1b4/0x2f0
 ksys_mmap_pgoff+0x229/0x570
 ? clockevents_program_event+0x144/0x370
 do_syscall_64+0xf4/0x1560
 ? do_syscall_64+0x1d7/0x1560
 ? __lock_release.isra.0+0x59/0x170
 ? do_syscall_64+0x34/0x1560
 ? lockdep_hardirqs_on_prepare.part.0+0x9b/0x140
 ? do_syscall_64+0x34/0x1560
 ? trace_hardirqs_on+0x19/0x1a0
 ? do_syscall_64+0xab/0x1560
 ? clear_bhb_loop+0x30/0x80
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f154f5728dc
Code: 1e fa 41 f7 c1 ff 0f 00 00 75 33 55 48 89 e5 41 54 41 89 cc 53 48 89 fb 48 85 ff 74 41 45 89 e2 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7c 5b 41 5c 5d c3 0f 1f 80 00 00 00 00 48 8b
RSP: 002b:00007ffeb6ac04f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f154f5728dc
RDX: 0000000000000003 RSI: 0000000040800000 RDI: 0000000000000000
RBP: 00007ffeb6ac0500 R08: 000000000000000c R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
R13: 000000001ddac980 R14: 00007f154fa629c0 R15: 00007f154fa62940
 </TASK>
irq event stamp: 62665
hardirqs last  enabled at (62679): [<ffffffff9e1cd23e>] __up_console_sem+0x5e/0x70
hardirqs last disabled at (62698): [<ffffffff9e1cd223>] __up_console_sem+0x43/0x70
softirqs last  enabled at (62692): [<ffffffff9dfc5d01>] handle_softirqs+0x5c1/0x8b0
softirqs last disabled at (62721): [<ffffffff9dfc6152>] __irq_exit_rcu+0x152/0x280
---[ end trace 0000000000000000 ]---
scsi host74: TCM_Loopback

^ permalink raw reply

* Re: [PATCH v4 19/21] uio: replace deprecated mmap hook with mmap_prepare in uio_info
From: Lorenzo Stoakes @ 2026-04-13  5:37 UTC (permalink / raw)
  To: Shinichiro Kawasaki
  Cc: Andrew Morton, Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com,
	linux-arm-kernel@lists.infradead.org,
	linux-mtd@lists.infradead.org, linux-staging@lists.linux.dev,
	linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Ryan Roberts
In-Reply-To: <adx2ws5z0NMIe5Yj@shinmob>

On Mon, Apr 13, 2026 at 05:14:08AM +0000, Shinichiro Kawasaki wrote:
> On Mar 20, 2026 / 22:39, Lorenzo Stoakes (Oracle) wrote:
> > The f_op->mmap interface is deprecated, so update uio_info to use its
> > successor, mmap_prepare.
> >
> > Therefore, replace the uio_info->mmap hook with a new
> > uio_info->mmap_prepare hook, and update its one user, target_core_user,
> > to both specify this new mmap_prepare hook and also to use the new
> > vm_ops->mapped() hook to continue to maintain a correct udev->kref
> > refcount.
> >
> > Then update uio_mmap() to utilise the mmap_prepare compatibility layer to
> > invoke this callback from the uio mmap invocation.
> >
> > Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
>
> Hello Lorenzo, since two weeks ago, I observe a failure during my kernel test
> set targeting Linux for-next branch. On failure, kernel reported a WARN at
> __vma_check_mmap_hook [1]. I bisected and found that this patch is the trigger.
> Here I share my observations of the failure. Actions or advices for fix will be
> appreciated.

Ugh yeah thanks, this actually needs to account for use of compatibility layer,
so probably we shouldn't even assert this as that isn't easily detectable.

I'll send a hotfix for this that can be bundled up with 7.1 patches.

Cheers, Lorenzo

^ permalink raw reply

* Re: [PATCH 1/8] hv: Select CONFIG_SYSFB only for CONFIG_HYPERV_VMBUS
From: Thomas Zimmermann @ 2026-04-13  7:17 UTC (permalink / raw)
  To: Saurabh Singh Sengar, javierm@redhat.com, arnd@arndb.de,
	ardb@kernel.org, ilias.apalodimas@linaro.org,
	chenhuacai@kernel.org, kernel@xen0n.name,
	maarten.lankhorst@linux.intel.com, mripard@kernel.org,
	airlied@gmail.com, simona@ffwll.ch, KY Srinivasan, Haiyang Zhang,
	wei.liu@kernel.org, Dexuan Cui, Long Li, deller@gmx.de
  Cc: linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-riscv@lists.infradead.org,
	dri-devel@lists.freedesktop.org, linux-hyperv@vger.kernel.org,
	linux-fbdev@vger.kernel.org, Michael Kelley, Saurabh Sengar,
	stable@vger.kernel.org
In-Reply-To: <KUZP153MB14449BBE44CBAEEA7621A4A0BE51A@KUZP153MB1444.APCP153.PROD.OUTLOOK.COM>

Hi

Am 02.04.26 um 12:50 schrieb Saurabh Singh Sengar:
>> Hyperv's sysfb access only exists in the VMBUS support. Therefore only select
>> CONFIG_SYSFB for CONFIG_HYPERV_VMBUS. Avoids sysfb code on systems
>> that don't need it.
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> Fixes: 96959283a58d ("Drivers: hv: Always select CONFIG_SYSFB for Hyper-V
>> guests")
>> Cc: Michael Kelley <mhklinux@outlook.com>
>> Cc: Saurabh Sengar <ssengar@linux.microsoft.com>
>> Cc: Wei Liu <wei.liu@kernel.org>
>> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
>> Cc: Haiyang Zhang <haiyangz@microsoft.com>
>> Cc: Dexuan Cui <decui@microsoft.com>
>> Cc: Long Li <longli@microsoft.com>
>> Cc: linux-hyperv@vger.kernel.org
>> Cc: <stable@vger.kernel.org> # v6.16+
>> ---
>>   drivers/hv/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index
>> 7937ac0cbd0f..2d0b3fcb0ff8 100644
>> --- a/drivers/hv/Kconfig
>> +++ b/drivers/hv/Kconfig
>> @@ -9,7 +9,6 @@ config HYPERV
>>   	select PARAVIRT
>>   	select X86_HV_CALLBACK_VECTOR if X86
>>   	select OF_EARLY_FLATTREE if OF
>> -	select SYSFB if EFI && !HYPERV_VTL_MODE
>>   	select IRQ_MSI_LIB if X86
>>   	help
>>   	  Select this option to run Linux as a Hyper-V client operating @@ -62,6
>> +61,7 @@ config HYPERV_VMBUS
>>   	tristate "Microsoft Hyper-V VMBus driver"
>>   	depends on HYPERV
>>   	default HYPERV
>> +	select SYSFB if EFI && !HYPERV_VTL_MODE
>>   	help
>>   	  Select this option to enable Hyper-V Vmbus driver.
>>
>> --
>> 2.53.0
> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>

This fix is independent from the rest of the series. Do you want to 
merge it or can I take it into DRM trees?

Best regards
Thomas

>

-- 
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstr. 146, 90461 Nürnberg, Germany, www.suse.com
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich, (HRB 36809, AG Nürnberg)



^ permalink raw reply

* RE: [EXTERNAL] Re: [PATCH 1/8] hv: Select CONFIG_SYSFB only for CONFIG_HYPERV_VMBUS
From: Saurabh Singh Sengar @ 2026-04-13  8:22 UTC (permalink / raw)
  To: Thomas Zimmermann, javierm@redhat.com, arnd@arndb.de,
	ardb@kernel.org, ilias.apalodimas@linaro.org,
	chenhuacai@kernel.org, kernel@xen0n.name,
	maarten.lankhorst@linux.intel.com, mripard@kernel.org,
	airlied@gmail.com, simona@ffwll.ch, KY Srinivasan, Haiyang Zhang,
	wei.liu@kernel.org, Dexuan Cui, Long Li, deller@gmx.de
  Cc: linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
	linux-efi@vger.kernel.org, linux-riscv@lists.infradead.org,
	dri-devel@lists.freedesktop.org, linux-hyperv@vger.kernel.org,
	linux-fbdev@vger.kernel.org, Michael Kelley, Saurabh Sengar,
	stable@vger.kernel.org, Wei Liu
In-Reply-To: <2fe8ce91-2dc5-4cf2-b7cf-d495e5cff14b@suse.de>

> Hi
> 
> Am 02.04.26 um 12:50 schrieb Saurabh Singh Sengar:
> >> Hyperv's sysfb access only exists in the VMBUS support. Therefore
> >> only select CONFIG_SYSFB for CONFIG_HYPERV_VMBUS. Avoids sysfb code
> >> on systems that don't need it.
> >>
> >> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> >> Fixes: 96959283a58d ("Drivers: hv: Always select CONFIG_SYSFB for
> >> Hyper-V
> >> guests")
> >> Cc: Michael Kelley <mhklinux@outlook.com>
> >> Cc: Saurabh Sengar <ssengar@linux.microsoft.com>
> >> Cc: Wei Liu <wei.liu@kernel.org>
> >> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> >> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> >> Cc: Dexuan Cui <decui@microsoft.com>
> >> Cc: Long Li <longli@microsoft.com>
> >> Cc: linux-hyperv@vger.kernel.org
> >> Cc: <stable@vger.kernel.org> # v6.16+
> >> ---
> >>   drivers/hv/Kconfig | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig index
> >> 7937ac0cbd0f..2d0b3fcb0ff8 100644
> >> --- a/drivers/hv/Kconfig
> >> +++ b/drivers/hv/Kconfig
> >> @@ -9,7 +9,6 @@ config HYPERV
> >>   	select PARAVIRT
> >>   	select X86_HV_CALLBACK_VECTOR if X86
> >>   	select OF_EARLY_FLATTREE if OF
> >> -	select SYSFB if EFI && !HYPERV_VTL_MODE
> >>   	select IRQ_MSI_LIB if X86
> >>   	help
> >>   	  Select this option to run Linux as a Hyper-V client operating @@
> >> -62,6
> >> +61,7 @@ config HYPERV_VMBUS
> >>   	tristate "Microsoft Hyper-V VMBus driver"
> >>   	depends on HYPERV
> >>   	default HYPERV
> >> +	select SYSFB if EFI && !HYPERV_VTL_MODE
> >>   	help
> >>   	  Select this option to enable Hyper-V Vmbus driver.
> >>
> >> --
> >> 2.53.0
> > Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> 
> This fix is independent from the rest of the series. Do you want to merge it or
> can I take it into DRM trees?

Please feel free to take it via DRM tree.
CC : Wei Liu

- Saurabh


^ permalink raw reply

* Re: [PATCH v2] Drivers: hv: mshv: fix integer overflow in memory region overlap check
From: Junrui Luo @ 2026-04-13  8:43 UTC (permalink / raw)
  To: vdso@mailbox.org, Stanislav Kinsburskii
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Nuno Das Neves, Anirudh Rayabharam, Mukesh Rathor, Muminul Islam,
	Praveen K Paladugu, Jinank Jain, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org, Yuhao Jiang, stable@vger.kernel.org
In-Reply-To: <319614096.43465.1775883935863@app.mailbox.org>

On Fri, Apr 10, 2026 at 09:05:35PM -0800, vdso@mailbox.org wrote:
> All in all, from the three options of (generic check for overflow, simple check
> for arch bad PFNs/GFNs, an elaborated check with all specifics) I suggested the simple check.
> Fast and still more useful than checking for overflow in my opinion.
 
Thanks Roman for the thorough write-up. Since the original patch mixes
host and hypervisor-side constants with an unclear unit, IMO we should
do the bounds check in bytes instead.

For instance:

	u64 start_gpa, end_gpa;

	if (check_mul_overflow(mem->guest_pfn, HV_HYP_PAGE_SIZE,
						   &start_gpa) ||
		check_add_overflow(start_gpa, mem->size, &end_gpa) ||
		end_gpa > (1ULL << MAX_PHYSMEM_BITS))
		return -EINVAL;

Both sides of the final comparison are bytes, so no host-vs-hv page
unit conversion is needed.

In addition, it changes return value from -EOVERFLOW to -EINVAL.

Does this approach look reasonable? Happy to iterate if either of you
would prefer a different choice.

Thanks,
Junrui Luo


^ permalink raw reply

* Re: [PATCH 01/11] arch: arm64: Export arch_smp_send_reschedule for mshv_vtl module
From: Naman Jain @ 2026-04-13 11:44 UTC (permalink / raw)
  To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
	ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB41570A9050B3EB6A905DFF56D450A@SN6PR02MB4157.namprd02.prod.outlook.com>



On 4/1/2026 10:24 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>
> 
> Nit: For the patch "Subject", the most common prefix for the file
> arch/arm64/kernel/smp.c is "arm64: smp:".  I'd suggest using that
> prefix for historical consistency.

Acked. Will change in v2.

> 
>> mshv_vtl_main.c calls smp_send_reschedule() which expands to
>> arch_smp_send_reschedule(). When CONFIG_MSHV_VTL=m, the module cannot
>> access this symbol since it is not exported on arm64.
>>
>> smp_send_reschedule() is used in mshv_vtl_cancel() to interrupt a vCPU
>> thread running on another CPU. When a vCPU is looping in
>> mshv_vtl_ioctl_return_to_lower_vtl(), it checks a per-CPU cancel flag
>> before each VTL0 entry. Setting cancel=1 alone is not enough if the
>> target CPU thread is sleeping - the IPI from smp_send_reschedule() kicks
>> the remote CPU out of idle/sleep so it re-checks the cancel flag and
>> exits the loop promptly.
>>
>> Other architectures (riscv, loongarch, powerpc) already export this
>> symbol. Add the same EXPORT_SYMBOL_GPL for arm64. This is required
>> for adding arm64 support in MSHV_VTL.
>>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>>   arch/arm64/kernel/smp.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index 1aa324104afb..26b1a4456ceb 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -1152,6 +1152,7 @@ void arch_smp_send_reschedule(int cpu)
>>   {
>>   	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>>   }
>> +EXPORT_SYMBOL_GPL(arch_smp_send_reschedule);
>>
>>   #ifdef CONFIG_ARM64_ACPI_PARKING_PROTOCOL
>>   void arch_send_wakeup_ipi(unsigned int cpu)
>> --
>> 2.43.0
>>
> 
> The "Subject" nit notwithstanding,
> 
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>

Thanks,
Naman

^ permalink raw reply

* Re: [PATCH 03/11] Drivers: hv: Add support to setup percpu vmbus handler
From: Naman Jain @ 2026-04-13 11:45 UTC (permalink / raw)
  To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
	ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB41570E0F113FE28CFC839476D450A@SN6PR02MB4157.namprd02.prod.outlook.com>



On 4/1/2026 10:25 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>
>> Add a wrapper function - hv_setup_percpu_vmbus_handler(), similar to
>> hv_setup_vmbus_handler() to allow setting up custom per-cpu VMBus
>> interrupt handler. This is required for arm64 support, to be added
>> in MSHV_VTL driver, where per-cpu VMBus interrupt handler will be
>> set to mshv_vtl_vmbus_isr() for VTL2 (Virtual Trust Level 2).
> 
> Needing both hv_setup_vmbus_handler() and
> hv_setup_percpu_vmbus_handler() seems unfortunate. Here's an
> alternate approach to consider:
> 
> 1. I think the x86 VMBus sysvec handler and the vmbus_percpu_isr()
> functions could both use the same vmbus_handler global variable.
> Looking at your changes in this patch set, hv_setup_vmbus_handler()
> and hv_setup_percpu_vmbus_handler() are used together and always
> set the same value.
> 
> 2. So move the global variable vmbus_handler out from arch/x86
> and into hv_common.c, and export it. The x86 sysvec handler can
> still reference it, and vmbus_percpu_isr() in vmbus_drv.c can
> also reference it.  No need to have vmbus_percpu_isr() under
> arch/arm64 or have a stub in hv_common.c.
> 
> 3. hv_setup_vmbus_handler() and hv_remove_vmbus_handler()
> also move to hv_common.c.  The __weak stubs go away.
> 
> With these changes, only hv_setup_vmbus_handler() needs to
> be called, and it works for both x86 with the sysvec handler and
> for arm64 with vmbus_percpu_isr().
> 
> I haven't coded this up, so maybe there's some problematic detail,
> but the idea seems like it would work. If it does work, some of my
> comments below are no longer applicable.
> 

This is a great suggestion. Current implementation looked complex in 
design and it was becoming more complex with the changes I was making 
while addressing Sashiko's AI review comments. However your suggestion 
looks much better. I'll implement it. Thanks for suggesting.

>>
>> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>>   arch/arm64/hyperv/mshyperv.c   | 13 +++++++++++++
>>   drivers/hv/hv_common.c         | 11 +++++++++++
>>   drivers/hv/vmbus_drv.c         |  7 +------
>>   include/asm-generic/mshyperv.h |  3 +++
>>   4 files changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c
>> index 4fdc26ade1d7..d4494ceeaad0 100644
>> --- a/arch/arm64/hyperv/mshyperv.c
>> +++ b/arch/arm64/hyperv/mshyperv.c
>> @@ -134,3 +134,16 @@ bool hv_is_hyperv_initialized(void)
>>   	return hyperv_initialized;
>>   }
>>   EXPORT_SYMBOL_GPL(hv_is_hyperv_initialized);
>> +
>> +static void (*vmbus_percpu_handler)(void);
>> +void hv_setup_percpu_vmbus_handler(void (*handler)(void))
>> +{
>> +	vmbus_percpu_handler = handler;
>> +}
>> +
>> +irqreturn_t vmbus_percpu_isr(int irq, void *dev_id)
>> +{
>> +	if (vmbus_percpu_handler)
>> +		vmbus_percpu_handler();
>> +	return IRQ_HANDLED;
>> +}
>> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
>> index d1ebc0ebd08f..a5064f558bf6 100644
>> --- a/drivers/hv/hv_common.c
>> +++ b/drivers/hv/hv_common.c
>> @@ -759,6 +759,17 @@ void __weak hv_setup_vmbus_handler(void (*handler)(void))
>>   }
>>   EXPORT_SYMBOL_GPL(hv_setup_vmbus_handler);
>>
>> +irqreturn_t __weak vmbus_percpu_isr(int irq, void *dev_id)
>> +{
>> +	return IRQ_HANDLED;
>> +}
>> +EXPORT_SYMBOL_GPL(vmbus_percpu_isr);
>> +
>> +void __weak hv_setup_percpu_vmbus_handler(void (*handler)(void))
>> +{
>> +}
>> +EXPORT_SYMBOL_GPL(hv_setup_percpu_vmbus_handler);
> 
> You've implemented hv_setup_percpu_vmbus_handler() following
> the pattern of hv_setup_vmbus_handler(), which is reasonable.
> But that turns out to be unnecessarily complicated. The existing
> hv_setup_vmbus_handler() has a portion in
> arch/x86/kernel/cpu/mshyperv.c as a special case because it uses a
> hard-coded interrupt vector on x86/x64, and has its own custom
> sysvec code. And there's a need for a __weak stub in hv_common.c
> so that vmbus_drv.c will compile on arm64.
> 
> But hv_setup_percpu_vmbus_handler() does not have the same
> requirements. It could be implemented entirely in vmbus_drv.c,
> with no code under arch/x86 or arch/arm64, and no __weak stub
> in hv_common.c.  vmbus_drv.c would just need to
> EXPORT_SYMBOL_FOR_MODULES, like it already does with vmbus_isr.
> I didn't code it up, but I think that approach would be simpler with
> fewer piece-parts scattered all over. If so, it would be worth
> breaking the symmetry with hv_setup_vmbus_handler().
> 

No longer applicable.

Regards,
Naman

^ permalink raw reply

* Re: [PATCH 04/11] Drivers: hv: Refactor mshv_vtl for ARM64 support to be added
From: Naman Jain @ 2026-04-13 11:46 UTC (permalink / raw)
  To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
	ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB41573C4A21BA96A534E5429CD450A@SN6PR02MB4157.namprd02.prod.outlook.com>



On 4/1/2026 10:26 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>
>> Refactor MSHV_VTL driver to move some of the x86 specific code to arch
>> specific files, and add corresponding functions for arm64.
>>
>> Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>>   arch/arm64/include/asm/mshyperv.h |  10 +++
>>   arch/x86/hyperv/hv_vtl.c          |  98 ++++++++++++++++++++++++++++
>>   arch/x86/include/asm/mshyperv.h   |   1 +
>>   drivers/hv/mshv_vtl_main.c        | 102 +-----------------------------
>>   4 files changed, 111 insertions(+), 100 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/mshyperv.h
>> b/arch/arm64/include/asm/mshyperv.h
>> index b721d3134ab6..804068e0941b 100644
>> --- a/arch/arm64/include/asm/mshyperv.h
>> +++ b/arch/arm64/include/asm/mshyperv.h
>> @@ -60,6 +60,16 @@ static inline u64 hv_get_non_nested_msr(unsigned int reg)
>>   				ARM_SMCCC_SMC_64,		\
>>   				ARM_SMCCC_OWNER_VENDOR_HYP,	\
>>   				HV_SMCCC_FUNC_NUMBER)
>> +#ifdef CONFIG_HYPERV_VTL_MODE
>> +/*
>> + * Get/Set the register. If the function returns `1`, that must be done via
>> + * a hypercall. Returning `0` means success.
>> + */
>> +static inline int hv_vtl_get_set_reg(struct hv_register_assoc *regs, bool set, u64 shared)
>> +{
>> +	return 1;
>> +}
>> +#endif
>>
>>   #include <asm-generic/mshyperv.h>
>>
>> diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
>> index 9b6a9bc4ab76..72a0bb4ae0c7 100644
>> --- a/arch/x86/hyperv/hv_vtl.c
>> +++ b/arch/x86/hyperv/hv_vtl.c
>> @@ -17,6 +17,8 @@
>>   #include <asm/realmode.h>
>>   #include <asm/reboot.h>
>>   #include <asm/smap.h>
>> +#include <uapi/asm/mtrr.h>
>> +#include <asm/debugreg.h>
>>   #include <linux/export.h>
>>   #include <../kernel/smpboot.h>
>>   #include "../../kernel/fpu/legacy.h"
>> @@ -281,3 +283,99 @@ void mshv_vtl_return_call(struct mshv_vtl_cpu_context *vtl0)
>>   	kernel_fpu_end();
>>   }
>>   EXPORT_SYMBOL(mshv_vtl_return_call);
>> +
>> +/* Static table mapping register names to their corresponding actions */
>> +static const struct {
>> +	enum hv_register_name reg_name;
>> +	int debug_reg_num;  /* -1 if not a debug register */
>> +	u32 msr_addr;       /* 0 if not an MSR */
>> +} reg_table[] = {
>> +	/* Debug registers */
>> +	{HV_X64_REGISTER_DR0, 0, 0},
>> +	{HV_X64_REGISTER_DR1, 1, 0},
>> +	{HV_X64_REGISTER_DR2, 2, 0},
>> +	{HV_X64_REGISTER_DR3, 3, 0},
>> +	{HV_X64_REGISTER_DR6, 6, 0},
>> +	/* MTRR MSRs */
>> +	{HV_X64_REGISTER_MSR_MTRR_CAP, -1, MSR_MTRRcap},
>> +	{HV_X64_REGISTER_MSR_MTRR_DEF_TYPE, -1, MSR_MTRRdefType},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE0, -1, MTRRphysBase_MSR(0)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE1, -1, MTRRphysBase_MSR(1)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE2, -1, MTRRphysBase_MSR(2)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE3, -1, MTRRphysBase_MSR(3)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE4, -1, MTRRphysBase_MSR(4)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE5, -1, MTRRphysBase_MSR(5)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE6, -1, MTRRphysBase_MSR(6)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE7, -1, MTRRphysBase_MSR(7)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE8, -1, MTRRphysBase_MSR(8)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASE9, -1, MTRRphysBase_MSR(9)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASEA, -1, MTRRphysBase_MSR(0xa)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASEB, -1, MTRRphysBase_MSR(0xb)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASEC, -1, MTRRphysBase_MSR(0xc)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASED, -1, MTRRphysBase_MSR(0xd)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASEE, -1, MTRRphysBase_MSR(0xe)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_BASEF, -1, MTRRphysBase_MSR(0xf)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK0, -1, MTRRphysMask_MSR(0)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK1, -1, MTRRphysMask_MSR(1)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK2, -1, MTRRphysMask_MSR(2)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK3, -1, MTRRphysMask_MSR(3)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK4, -1, MTRRphysMask_MSR(4)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK5, -1, MTRRphysMask_MSR(5)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK6, -1, MTRRphysMask_MSR(6)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK7, -1, MTRRphysMask_MSR(7)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK8, -1, MTRRphysMask_MSR(8)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASK9, -1, MTRRphysMask_MSR(9)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKA, -1, MTRRphysMask_MSR(0xa)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKB, -1, MTRRphysMask_MSR(0xb)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKC, -1, MTRRphysMask_MSR(0xc)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKD, -1, MTRRphysMask_MSR(0xd)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKE, -1, MTRRphysMask_MSR(0xe)},
>> +	{HV_X64_REGISTER_MSR_MTRR_PHYS_MASKF, -1, MTRRphysMask_MSR(0xf)},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX64K00000, -1, MSR_MTRRfix64K_00000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX16K80000, -1, MSR_MTRRfix16K_80000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX16KA0000, -1, MSR_MTRRfix16K_A0000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KC0000, -1, MSR_MTRRfix4K_C0000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KC8000, -1, MSR_MTRRfix4K_C8000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KD0000, -1, MSR_MTRRfix4K_D0000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KD8000, -1, MSR_MTRRfix4K_D8000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KE0000, -1, MSR_MTRRfix4K_E0000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KE8000, -1, MSR_MTRRfix4K_E8000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KF0000, -1, MSR_MTRRfix4K_F0000},
>> +	{HV_X64_REGISTER_MSR_MTRR_FIX4KF8000, -1, MSR_MTRRfix4K_F8000},
>> +};
>> +
>> +int hv_vtl_get_set_reg(struct hv_register_assoc *regs, bool set, u64 shared)
>> +{
>> +	u64 *reg64;
>> +	enum hv_register_name gpr_name;
>> +	int i;
>> +
>> +	gpr_name = regs->name;
>> +	reg64 = &regs->value.reg64;
>> +
>> +	/* Search for the register in the table */
>> +	for (i = 0; i < ARRAY_SIZE(reg_table); i++) {
>> +		if (reg_table[i].reg_name != gpr_name)
>> +			continue;
>> +		if (reg_table[i].debug_reg_num != -1) {
>> +			/* Handle debug registers */
>> +			if (gpr_name == HV_X64_REGISTER_DR6 && !shared)
>> +				goto hypercall;
>> +			if (set)
>> +				native_set_debugreg(reg_table[i].debug_reg_num, *reg64);
>> +			else
>> +				*reg64 = native_get_debugreg(reg_table[i].debug_reg_num);
>> +		} else {
>> +			/* Handle MSRs */
>> +			if (set)
>> +				wrmsrl(reg_table[i].msr_addr, *reg64);
>> +			else
>> +				rdmsrl(reg_table[i].msr_addr, *reg64);
>> +		}
>> +		return 0;
>> +	}
>> +
>> +hypercall:
>> +	return 1;
>> +}
>> +EXPORT_SYMBOL(hv_vtl_get_set_reg);
>> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
>> index f64393e853ee..d5355a5b7517 100644
>> --- a/arch/x86/include/asm/mshyperv.h
>> +++ b/arch/x86/include/asm/mshyperv.h
>> @@ -304,6 +304,7 @@ void mshv_vtl_return_call(struct mshv_vtl_cpu_context *vtl0);
>>   void mshv_vtl_return_call_init(u64 vtl_return_offset);
>>   void mshv_vtl_return_hypercall(void);
>>   void __mshv_vtl_return_call(struct mshv_vtl_cpu_context *vtl0);
>> +int hv_vtl_get_set_reg(struct hv_register_assoc *regs, bool set, u64 shared);
> 
> Can this move to asm-generic/mshyperv.h?  The function is no longer specific
> to x86/x64, so one would want to not declare it in the arch/x86 version
> of mshyperv.h. But maybe moving it to asm-generic/mshyperv.h breaks
> compilation on arm64 because there's also the static inline stub there.

This is still arch specific (x86 to be precise). For ARM64, we always 
want to return 1, which is to tell the client to use hypercall as a 
fallback option. Moving this x86 specific implementation (X64 registers, 
MTRR etc) to a common code, would not be right. One thing that could be 
done here was to move the "return 1" stub code from arm64 to asm-generic 
mshyperv.h, but that would not provide much benefit.

I am currently not planning to make any changes here. If I misunderstood 
your comment, please let me know.

Regards,
Naman


^ permalink raw reply

* Re: [PATCH 05/11] drivers: hv: Export vmbus_interrupt for mshv_vtl module
From: Naman Jain @ 2026-04-13 11:46 UTC (permalink / raw)
  To: Michael Kelley, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Catalin Marinas, Will Deacon,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, H . Peter Anvin, Arnd Bergmann, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Alexandre Ghiti
  Cc: Marc Zyngier, Timothy Hayes, Lorenzo Pieralisi, mrigendrachaubey,
	ssengar@linux.microsoft.com, linux-hyperv@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org
In-Reply-To: <SN6PR02MB4157F1DAEF3BC14A67D59FB8D450A@SN6PR02MB4157.namprd02.prod.outlook.com>



On 4/1/2026 10:26 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Monday, March 16, 2026 5:13 AM
>>
> 
> Nit:  For the patch Subject, capitalize "Drivers:" in the prefix.

Acked.

Thanks,
Naman

> 
>> vmbus_interrupt is used in mshv_vtl_main.c to set the SINT vector.
>> When CONFIG_MSHV_VTL=m and CONFIG_HYPERV_VMBUS=y (built-in), the module
>> cannot access vmbus_interrupt at load time since it is not exported.
>>
>> Export it using EXPORT_SYMBOL_FOR_MODULES consistent with the existing
>> pattern used for vmbus_isr.
>>
>> Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
>> ---
>>   drivers/hv/vmbus_drv.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index f99d4f2d3862..de191799a8f6 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -57,6 +57,7 @@ static DEFINE_PER_CPU(long, vmbus_evt);
>>   /* Values parsed from ACPI DSDT */
>>   int vmbus_irq;
>>   int vmbus_interrupt;
>> +EXPORT_SYMBOL_FOR_MODULES(vmbus_interrupt, "mshv_vtl");
>>
>>   /*
>>    * If the Confidential VMBus is used, the data on the "wire" is not
>> --
>> 2.43.0
>>
> 
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox