* Re: [PATCH v3] drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT
From: Sebastian Andrzej Siewior @ 2026-03-12 17:07 UTC (permalink / raw)
To: Jan Kiszka
Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
linux-hyperv, linux-kernel, Florian Bezdeka, RT, Mitchell Levy,
Michael Kelley, Saurabh Singh Sengar, Naman Jain
In-Reply-To: <289d8e52-40f8-4b22-8aa9-d0bd3bd15aae@siemens.com>
On 2026-02-16 17:24:56 [+0100], Jan Kiszka wrote:
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -25,6 +25,7 @@
> #include <linux/cpu.h>
> #include <linux/sched/isolation.h>
> #include <linux/sched/task_stack.h>
> +#include <linux/smpboot.h>
>
> #include <linux/delay.h>
> #include <linux/panic_notifier.h>
> @@ -1350,7 +1351,7 @@ static void vmbus_message_sched(struct hv_per_cpu_context *hv_cpu, void *message
> }
> }
>
> -void vmbus_isr(void)
> +static void __vmbus_isr(void)
> {
> struct hv_per_cpu_context *hv_cpu
> = this_cpu_ptr(hv_context.cpu_context);
> @@ -1363,6 +1364,53 @@ void vmbus_isr(void)
>
> add_interrupt_randomness(vmbus_interrupt);
This feeds entropy and expects to see the interrupt registers. But
since it is invoked from a thread here, it won't.
> }
> +
…
> +void vmbus_isr(void)
> +{
> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> + vmbus_irqd_wake();
> + } else {
> + lockdep_hardirq_threaded();
What clears this? This is wrongly placed. This should go to
sysvec_hyperv_callback() instead with its matching canceling part. The
add_interrupt_randomness() should also be there and not here.
sysvec_hyperv_stimer0() managed to do so.
Different question: What guarantees that there won't be another
interrupt before this one is done? The handshake appears to be
deprecated. The interrupt itself returns ACKing (or not) but the actual
handler is delayed to this thread. Depending on the userland it could
take some time and I don't know how impatient the host is.
> + __vmbus_isr();
Moving on. This (trying very hard here) even schedules tasklets. Why?
You need to disable BH before doing so. Otherwise it ends in ksoftirqd.
You don't want that.
Couldn't the whole logic be integrated into the IRQ code? Then we could
have mask/unmask if supported/provided and threaded interrupts. Then
sysvec_hyperv_reenlightenment() could use a proper threaded interrupt
instead of apic_eoi() + schedule_delayed_work().
> + }
> +}
> EXPORT_SYMBOL_FOR_MODULES(vmbus_isr, "mshv_vtl");
>
> static irqreturn_t vmbus_percpu_isr(int irq, void *dev_id)
Sebastian
^ permalink raw reply
* Re: [PATCH 00/61] treewide: Use IS_ERR_OR_NULL over manual NULL check - refactor
From: Jason Gunthorpe @ 2026-03-12 16:54 UTC (permalink / raw)
To: James Bottomley
Cc: Kuan-Wei Chiu, Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel,
cocci, dm-devel, dri-devel, gfs2, intel-gfx, intel-wired-lan,
iommu, kvm, linux-arm-kernel, linux-block, linux-bluetooth,
linux-btrfs, linux-cifs, linux-clk, linux-erofs, linux-ext4,
linux-fsdevel, linux-gpio, linux-hyperv, linux-input,
linux-kernel, linux-leds, linux-media, linux-mips, linux-mm,
linux-modules, linux-mtd, linux-nfs, linux-omap, linux-phy,
linux-pm, linux-rockchip, linux-s390, linux-scsi, linux-sctp,
linux-security-module, linux-sh, linux-sound, linux-stm32,
linux-trace-kernel, linux-usb, linux-wireless, netdev, ntfs3,
samba-technical, sched-ext, target-devel, tipc-discussion, v9fs
In-Reply-To: <f5688b895eaebabae6545a0d9baf8f1404e8454e.camel@HansenPartnership.com>
On Thu, Mar 12, 2026 at 11:32:37AM -0400, James Bottomley wrote:
> On Thu, 2026-03-12 at 09:57 -0300, Jason Gunthorpe wrote:
> > On Wed, Mar 11, 2026 at 02:40:36AM +0800, Kuan-Wei Chiu wrote:
> >
> > > IMHO, the necessity of IS_ERR_OR_NULL() often highlights a
> > > confusing or flawed API design. It usually implies that the caller
> > > is unsure whether a failure results in an error pointer or a NULL
> > > pointer.
> >
> > +1
> >
> > IS_ERR_OR_NULL() should always be looked on with suspicion. Very
> > little should be returning some tri-state 'ERR' 'NULL' 'SUCCESS'
> > pointer. What does the middle condition even mean? IS_ERR_OR_NULL()
> > implies ERR and NULL are semantically the same, so fix the things to
> > always use ERR.
>
> Not in any way supporting the original patch. However, the pattern
> ERR, NULL, PTR is used extensively in the dentry code of filesystems.
> See the try_lookup..() set of functions in fs/namei.c
>
> The meaning is
>
> PTR - I found it
> NULL - It definitely doesn't exist
> ERR - something went wrong during the lookup.
>
> So I don't think you can blanket say this pattern is wrong.
Lots of places would also return ENOENT; I'd argue that is easier to
use.
But yes, I did use the word "suspicion", not "blanket wrong" :)
Jason
^ permalink raw reply
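The ENOENT alternative mentioned in the reply above can be sketched in
userspace C. This is only an illustration under stated assumptions: the
ERR_PTR()/PTR_ERR()/IS_ERR() helpers below are simplified stand-ins
modelled on the kernel's <linux/err.h>, and find_alias() and
alias_status() are hypothetical names that do not exist in the kernel.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup that never returns NULL. */
const char *find_alias(const char *name)
{
	if (!name)
		return ERR_PTR(-EINVAL);	/* the lookup could not run */
	if (strcmp(name, "lo") == 0)
		return "loopback";
	return ERR_PTR(-ENOENT);		/* "not found" is just another errno */
}

/* A single IS_ERR() check covers every failure; 0 means success. */
int alias_status(const char *name)
{
	const char *alias = find_alias(name);

	return IS_ERR(alias) ? (int)PTR_ERR(alias) : 0;
}
```

With this shape the caller never needs IS_ERR_OR_NULL(); whether
"not found" deserves its own NULL state is the design question the
thread is debating.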
* RE: [EXTERNAL] Re: [PATCH net-next,V4, 3/3] net: mana: Add ethtool counters for RX CQEs in coalesced type
From: Haiyang Zhang @ 2026-03-12 16:34 UTC (permalink / raw)
To: Simon Horman, Haiyang Zhang
Cc: linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
KY Srinivasan, Wei Liu, Dexuan Cui, Long Li, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Konstantin Taranov, Erni Sri Satya Vennela, Dipayaan Roy,
Shradha Gupta, Shiraz Saleem, Kees Cook, Subbaraya Sundeep,
Aditya Garg, Breno Leitao, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, Paul Rosswurm
In-Reply-To: <20260311175835.GV461701@kernel.org>
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: Wednesday, March 11, 2026 1:59 PM
> To: Haiyang Zhang <haiyangz@linux.microsoft.com>
> Cc: linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; KY Srinivasan
> <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>; Wei Liu
> <wei.liu@kernel.org>; Dexuan Cui <DECUI@microsoft.com>; Long Li
> <longli@microsoft.com>; Andrew Lunn <andrew+netdev@lunn.ch>; David S.
> Miller <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub
> Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; Konstantin
> Taranov <kotaranov@microsoft.com>; Erni Sri Satya Vennela
> <ernis@linux.microsoft.com>; Dipayaan Roy
> <dipayanroy@linux.microsoft.com>; Shradha Gupta
> <shradhagupta@linux.microsoft.com>; Shiraz Saleem
> <shirazsaleem@microsoft.com>; Kees Cook <kees@kernel.org>; Subbaraya
> Sundeep <sbhatta@marvell.com>; Aditya Garg
> <gargaditya@linux.microsoft.com>; Breno Leitao <leitao@debian.org>; linux-
> kernel@vger.kernel.org; linux-rdma@vger.kernel.org; Paul Rosswurm
> <paulros@microsoft.com>
> Subject: [EXTERNAL] Re: [PATCH net-next,V4, 3/3] net: mana: Add ethtool
> counters for RX CQEs in coalesced type
>
> On Mon, Mar 09, 2026 at 02:20:45PM -0700, Haiyang Zhang wrote:
> > From: Haiyang Zhang <haiyangz@microsoft.com>
> >
> > For RX CQEs with type CQE_RX_COALESCED_4, to measure the coalescing
> > efficiency, add counters to count how many contain 2, 3, or 4 packets
> > respectively.
> > Also, add a counter for the error case of first packet with length == 0.
> >
> > Reviewed-by: Long Li <longli@microsoft.com>
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > ---
> > drivers/net/ethernet/microsoft/mana/mana_en.c | 21 ++++++++++++++++++-
> > .../ethernet/microsoft/mana/mana_ethtool.c | 15 +++++++++++--
> > include/net/mana/mana.h | 9 +++++---
> > 3 files changed, 39 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index fa30046dcd3d..85f7a56d0d90 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > @@ -2148,11 +2148,23 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq,
> > old_buf = NULL;
> > pktlen = oob->ppi[i].pkt_len;
> > if (pktlen == 0) {
> > - if (i == 0)
> > + /* Collect coalesced CQE count based on packets processed.
> > + * Coalesced CQEs have at least 2 packets, so index is i - 2.
> > + */
> > + if (i > 1) {
> > + u64_stats_update_begin(&rxq->stats.syncp);
> > + rxq->stats.coalesced_cqe[i - 2]++;
> > + u64_stats_update_end(&rxq->stats.syncp);
> > + } else if (i == 0) {
> > + /* Error case stat */
> > + u64_stats_update_begin(&rxq->stats.syncp);
> > + rxq->stats.pkt_len0_err++;
> > + u64_stats_update_end(&rxq->stats.syncp);
> > netdev_err_once(
> > ndev,
> > "RX pkt len=0, rq=%u, cq=%u, rxobj=0x%llx\n",
> > rxq->gdma_id, cq->gdma_id, rxq->rxobj);
> > + }
> > break;
>
> Hi Haiyang Zhang,
>
> As there is a break here, can the accounting logic above be moved out of
> the loop, and merged with the "Coalesced CQE with all 4 packets"
> accounting logic that is already there?
>
> As is, accounting seems split between, and slightly duplicated in, two
> locations.
Will do.
Thanks,
- Haiyang
^ permalink raw reply
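The restructuring suggested in the review above (do all accounting once,
after the loop) can be sketched in userspace C. This is a simplified
model, not the driver code: struct rx_stats, process_cqe() and
MAX_PKTS_PER_CQE are hypothetical names, and the sketch assumes that a
zero packet length terminates the list and that a coalesced CQE carries
at most 4 packets.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PKTS_PER_CQE 4	/* assumed: up to 4 packets per coalesced CQE */

/* Hypothetical per-queue counters, loosely mirroring the patch's fields. */
struct rx_stats {
	uint64_t coalesced_cqe[MAX_PKTS_PER_CQE - 1];	/* CQEs with 2, 3, 4 packets */
	uint64_t pkt_len0_err;				/* first packet had length 0 */
	uint64_t packets;				/* packets processed in total */
};

/*
 * Walk the packet lengths of one CQE; a zero length ends the list.
 * All accounting happens once, after the loop, keyed on how many
 * packets were processed. Returns that number.
 */
size_t process_cqe(const uint32_t pkt_len[MAX_PKTS_PER_CQE],
		   struct rx_stats *stats)
{
	size_t i;

	for (i = 0; i < MAX_PKTS_PER_CQE; i++) {
		if (pkt_len[i] == 0)
			break;
		stats->packets++;
	}

	if (i == 0)
		stats->pkt_len0_err++;		/* error case: nothing to process */
	else if (i >= 2)
		stats->coalesced_cqe[i - 2]++;	/* i is 2, 3 or 4 here */

	return i;
}
```

Because the loop index survives the break, the "2 or 3 packets" case and
the "all 4 packets" case fall out of the same expression.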
* Re: [PATCH 38/61] net: Prefer IS_ERR_OR_NULL over manual NULL check
From: Przemek Kitszel @ 2026-03-12 16:11 UTC (permalink / raw)
To: Philipp Hahn
Cc: amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel, dri-devel,
gfs2, intel-gfx, intel-wired-lan, iommu, kvm, linux-arm-kernel,
linux-block, linux-bluetooth, linux-btrfs, linux-cifs, linux-clk,
linux-erofs, linux-ext4, linux-fsdevel, linux-gpio, linux-hyperv,
linux-input, linux-kernel, linux-leds, linux-media, linux-mips,
linux-mm, linux-modules, linux-mtd, linux-nfs, linux-omap,
linux-phy, linux-pm, linux-rockchip, linux-s390, linux-scsi,
linux-sctp, linux-security-module, linux-sh, linux-sound,
linux-stm32, linux-trace-kernel, linux-usb, linux-wireless,
netdev, ntfs3, samba-technical, sched-ext, target-devel,
tipc-discussion, v9fs, Igor Russkikh, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Pavan Chebbi, Michael Chan, Potnuri Bharat Teja, Tony Nguyen,
Taras Chornyi, Maxime Coquelin, Alexandre Torgue,
Iyappan Subramanian, Keyur Chudgar, Quan Nguyen, Heiner Kallweit,
Russell King
In-Reply-To: <20260310-b4-is_err_or_null-v1-38-bd63b656022d@avm.de>
On 3/10/26 12:49, Philipp Hahn wrote:
> Prefer using IS_ERR_OR_NULL() over using IS_ERR() and a manual NULL
> check.
>
> Change generated with coccinelle.
>
> To: Igor Russkikh <irusskikh@marvell.com>
> To: Andrew Lunn <andrew+netdev@lunn.ch>
> To: "David S. Miller" <davem@davemloft.net>
> To: Eric Dumazet <edumazet@google.com>
> To: Jakub Kicinski <kuba@kernel.org>
> To: Paolo Abeni <pabeni@redhat.com>
> To: Pavan Chebbi <pavan.chebbi@broadcom.com>
> To: Michael Chan <mchan@broadcom.com>
> To: Potnuri Bharat Teja <bharat@chelsio.com>
> To: Tony Nguyen <anthony.l.nguyen@intel.com>
> To: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> To: Taras Chornyi <taras.chornyi@plvision.eu>
> To: Maxime Coquelin <mcoquelin.stm32@gmail.com>
> To: Alexandre Torgue <alexandre.torgue@foss.st.com>
> To: Iyappan Subramanian <iyappan@os.amperecomputing.com>
> To: Keyur Chudgar <keyur@os.amperecomputing.com>
> To: Quan Nguyen <quan@os.amperecomputing.com>
> To: Heiner Kallweit <hkallweit1@gmail.com>
> To: Russell King <linux@armlinux.org.uk>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: intel-wired-lan@lists.osuosl.org
> Cc: linux-stm32@st-md-mailman.stormreply.com
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-usb@vger.kernel.org
> Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
This change is too trivial, especially when combined like that; see
https://docs.kernel.org/process/maintainer-netdev.html#clean-up-patches
> ---
> drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 2 +-
> drivers/net/ethernet/broadcom/tg3.c | 2 +-
> drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 3 +--
> drivers/net/ethernet/intel/ice/devlink/devlink.c | 2 +-
> drivers/net/ethernet/marvell/prestera/prestera_router.c | 2 +-
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
> drivers/net/mdio/mdio-xgene.c | 2 +-
> drivers/net/usb/r8152.c | 2 +-
> 8 files changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> index e270327e47fd804cc8ee5cfd53ed1b993c955c41..43edef35c4b1ff606b2f1519a07fad4c9a990ad4 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
> @@ -810,7 +810,7 @@ static int __aq_ring_xdp_clean(struct aq_ring_s *rx_ring,
> }
>
> skb = aq_xdp_run_prog(aq_nic, &xdp, rx_ring, buff);
> - if (IS_ERR(skb) || !skb)
> + if (IS_ERR_OR_NULL(skb))
> continue;
>
> if (ptp_hwtstamp_len > 0)
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index 2328fce336447eb4a796f9300ccc0ab536ff0a35..8ed79f34f03d81184dcc12e6eaff009cb8f7756e 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -7943,7 +7943,7 @@ static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
>
> segs = skb_gso_segment(skb, tp->dev->features &
> ~(NETIF_F_TSO | NETIF_F_TSO6));
> - if (IS_ERR(segs) || !segs) {
> + if (IS_ERR_OR_NULL(segs)) {
> tnapi->tx_dropped++;
> goto tg3_tso_bug_end;
> }
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
> index 3307e50426819087ad985178c4a5383f16b8e7b4..1c8a6445d4b2e3535d8f1b7908dd02d8dd2f23fa 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
> @@ -1032,8 +1032,7 @@ static void ch_flower_stats_handler(struct work_struct *work)
> do {
> rhashtable_walk_start(&iter);
>
> - while ((flower_entry = rhashtable_walk_next(&iter)) &&
> - !IS_ERR(flower_entry)) {
> + while (!IS_ERR_OR_NULL((flower_entry = rhashtable_walk_next(&iter)))) {
> ret = cxgb4_get_filter_counters(adap->port[0],
> flower_entry->filter_id,
> &packets, &bytes,
> diff --git a/drivers/net/ethernet/intel/ice/devlink/devlink.c b/drivers/net/ethernet/intel/ice/devlink/devlink.c
> index 6c72bd15db6d75a1d4fa04ef8fefbd26fb6e84bd..3d08b9187fd76ca3198af28111b6f1c1765ea01e 100644
> --- a/drivers/net/ethernet/intel/ice/devlink/devlink.c
> +++ b/drivers/net/ethernet/intel/ice/devlink/devlink.c
> @@ -791,7 +791,7 @@ static void ice_traverse_tx_tree(struct devlink *devlink, struct ice_sched_node
> node->parent->rate_node);
> }
>
> - if (rate_node && !IS_ERR(rate_node))
> + if (!IS_ERR_OR_NULL(rate_node))
> node->rate_node = rate_node;
>
> traverse_children:
> diff --git a/drivers/net/ethernet/marvell/prestera/prestera_router.c b/drivers/net/ethernet/marvell/prestera/prestera_router.c
> index b036b173a308b5f994ad8538eb010fa27196988c..4492938e8a3da91d32efe8d45ccbe2eb437c0e49 100644
> --- a/drivers/net/ethernet/marvell/prestera/prestera_router.c
> +++ b/drivers/net/ethernet/marvell/prestera/prestera_router.c
> @@ -1061,7 +1061,7 @@ static void __prestera_k_arb_hw_state_upd(struct prestera_switch *sw,
> n = NULL;
> }
>
> - if (!IS_ERR(n) && n) {
> + if (!IS_ERR_OR_NULL(n)) {
> neigh_event_send(n, NULL);
> neigh_release(n);
> } else {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 6827c99bde8c22db42b363d2d36ad6f26075ed50..356a4e9ce04b1fcf8786d7274d31ace404be2cf6 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -1275,7 +1275,7 @@ static int stmmac_init_phy(struct net_device *dev)
> /* Some DT bindings do not set-up the PHY handle. Let's try to
> * manually parse it
> */
> - if (!phy_fwnode || IS_ERR(phy_fwnode)) {
> + if (IS_ERR_OR_NULL(phy_fwnode)) {
> int addr = priv->plat->phy_addr;
> struct phy_device *phydev;
>
> diff --git a/drivers/net/mdio/mdio-xgene.c b/drivers/net/mdio/mdio-xgene.c
> index a8f91a4b7fed0927ee14e408000cd3a2bfb9b09a..09b30b563295c6085dc1358ac361301e5cf6b2a8 100644
> --- a/drivers/net/mdio/mdio-xgene.c
> +++ b/drivers/net/mdio/mdio-xgene.c
> @@ -265,7 +265,7 @@ struct phy_device *xgene_enet_phy_register(struct mii_bus *bus, int phy_addr)
> struct phy_device *phy_dev;
>
> phy_dev = get_phy_device(bus, phy_addr, false);
> - if (!phy_dev || IS_ERR(phy_dev))
> + if (IS_ERR_OR_NULL(phy_dev))
> return NULL;
>
> if (phy_device_register(phy_dev))
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index 0c83bbbea2e7c322ee6339893e281237663bd3ae..73f17ebd7d40007eec5004f887a46249defd28ab 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -2218,7 +2218,7 @@ static void r8152_csum_workaround(struct r8152 *tp, struct sk_buff *skb,
>
> features &= ~(NETIF_F_SG | NETIF_F_IPV6_CSUM | NETIF_F_TSO6);
> segs = skb_gso_segment(skb, features);
> - if (IS_ERR(segs) || !segs)
> + if (IS_ERR_OR_NULL(segs))
> goto drop;
>
> __skb_queue_head_init(&seg_list);
>
^ permalink raw reply
* [PATCH] mshv: Fix use-after-free in mshv_map_user_memory error path
From: Stanislav Kinsburskii @ 2026-03-12 16:02 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli; +Cc: linux-hyperv, linux-kernel
In the error path of mshv_map_user_memory(), calling vfree() directly on
the region leaves the MMU notifier registered. When userspace later unmaps
the memory, the notifier fires and accesses the freed region, causing a
use-after-free and potential kernel panic.
Replace vfree() with mshv_region_put() to properly unregister
the MMU notifier before freeing the region.
Fixes: b9a66cd5ccbb9 ("mshv: Add support for movable memory regions")
Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
---
drivers/hv/mshv_root_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
index d753f41d3b57..796f3ca8308f 100644
--- a/drivers/hv/mshv_root_main.c
+++ b/drivers/hv/mshv_root_main.c
@@ -1388,7 +1388,7 @@ mshv_map_user_memory(struct mshv_partition *partition,
return 0;
errout:
- vfree(region);
+ mshv_region_put(region);
return ret;
}
^ permalink raw reply related
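The hazard described in the patch above (freeing an object while a
callback is still registered against it) can be modelled in userspace C.
This is only a sketch under stated assumptions: struct region,
region_create(), region_put(), map_region() and the live_notifiers
counter are hypothetical, and nothing here reflects the real internals
of the mshv driver.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical region with a reference count and a registered notifier. */
struct region {
	int refcount;
	bool notifier_registered;
};

static int live_notifiers;	/* notifiers that could still fire */

struct region *region_create(void)
{
	struct region *r = calloc(1, sizeof(*r));

	if (!r)
		return NULL;
	r->refcount = 1;
	r->notifier_registered = true;
	live_notifiers++;
	return r;
}

/* Drop one reference; on the last put, unregister first, then free. */
bool region_put(struct region *r)
{
	if (--r->refcount > 0)
		return false;
	if (r->notifier_registered) {
		r->notifier_registered = false;
		live_notifiers--;
	}
	free(r);
	return true;
}

/* Error path modelled on the fix: release via put, not a bare free(). */
struct region *map_region(bool later_step_fails)
{
	struct region *r = region_create();

	if (!r)
		return NULL;
	if (later_step_fails) {
		region_put(r);	/* unregisters the notifier, then frees */
		return NULL;
	}
	return r;
}

int notifiers_outstanding(void)
{
	return live_notifiers;
}
```

A bare free() in the failure branch would leave live_notifiers at 1
with no object behind it, which is the userspace analogue of the
dangling notifier.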
* Re: [PATCH 00/61] treewide: Use IS_ERR_OR_NULL over manual NULL check - refactor
From: James Bottomley @ 2026-03-12 15:32 UTC (permalink / raw)
To: Jason Gunthorpe, Kuan-Wei Chiu
Cc: Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel,
dri-devel, gfs2, intel-gfx, intel-wired-lan, iommu, kvm,
linux-arm-kernel, linux-block, linux-bluetooth, linux-btrfs,
linux-cifs, linux-clk, linux-erofs, linux-ext4, linux-fsdevel,
linux-gpio, linux-hyperv, linux-input, linux-kernel, linux-leds,
linux-media, linux-mips, linux-mm, linux-modules, linux-mtd,
linux-nfs, linux-omap, linux-phy, linux-pm, linux-rockchip,
linux-s390, linux-scsi, linux-sctp, linux-security-module,
linux-sh, linux-sound, linux-stm32, linux-trace-kernel, linux-usb,
linux-wireless, netdev, ntfs3, samba-technical, sched-ext,
target-devel, tipc-discussion, v9fs
In-Reply-To: <20260312125730.GI1469476@ziepe.ca>
On Thu, 2026-03-12 at 09:57 -0300, Jason Gunthorpe wrote:
> On Wed, Mar 11, 2026 at 02:40:36AM +0800, Kuan-Wei Chiu wrote:
>
> > IMHO, the necessity of IS_ERR_OR_NULL() often highlights a
> > confusing or flawed API design. It usually implies that the caller
> > is unsure whether a failure results in an error pointer or a NULL
> > pointer.
>
> +1
>
> IS_ERR_OR_NULL() should always be looked on with suspicion. Very
> little should be returning some tri-state 'ERR' 'NULL' 'SUCCESS'
> pointer. What does the middle condition even mean? IS_ERR_OR_NULL()
> implies ERR and NULL are semantically the same, so fix the things to
> always use ERR.
Not in any way supporting the original patch. However, the pattern
ERR, NULL, PTR is used extensively in the dentry code of filesystems.
See the try_lookup..() set of functions in fs/namei.c
The meaning is
PTR - I found it
NULL - It definitely doesn't exist
ERR - something went wrong during the lookup.
So I don't think you can blanket say this pattern is wrong.
Regards,
James
^ permalink raw reply
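The three-state convention described in the message above (PTR, NULL,
ERR) can be sketched in userspace C. The helpers are simplified
stand-ins modelled on the kernel's <linux/err.h>; struct entry,
try_lookup() and classify() are hypothetical and are not the functions
in fs/namei.c.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical entry type and table, for illustration only. */
struct entry {
	const char *name;
};

static struct entry table[] = { { "alpha" }, { "beta" } };

/*
 * Returns a valid pointer if found, NULL if the name definitely does
 * not exist, or ERR_PTR(-EINVAL) if the lookup could not be performed.
 */
struct entry *try_lookup(const char *name)
{
	size_t i;

	if (!name || !*name)
		return ERR_PTR(-EINVAL);
	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(table[i].name, name) == 0)
			return &table[i];
	return NULL;
}

/* Keeps all three outcomes distinct: 1 found, 0 absent, <0 error. */
int classify(const char *name)
{
	struct entry *e = try_lookup(name);

	if (IS_ERR(e))
		return (int)PTR_ERR(e);
	if (!e)
		return 0;
	return 1;
}
```

A caller that used IS_ERR_OR_NULL() here would fold "absent" and
"lookup failed" into one branch, which is fine only when the caller
truly does not care about the difference.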
* Re: [PATCH net-next v2] net: mana: Expose hardware diagnostic info via debugfs
From: Erni Sri Satya Vennela @ 2026-03-12 14:46 UTC (permalink / raw)
To: Simon Horman
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, kotaranov, shradhagupta, dipayanroy,
yury.norov, kees, shirazsaleem, linux-hyperv, netdev,
linux-kernel, linux-rdma
In-Reply-To: <20260311164653.GS461701@kernel.org>
On Wed, Mar 11, 2026 at 04:46:53PM +0000, Simon Horman wrote:
> On Mon, Mar 09, 2026 at 07:38:28AM -0700, Erni Sri Satya Vennela wrote:
>
> ...
>
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
>
> ...
>
> > @@ -2128,6 +2140,9 @@ int mana_gd_suspend(struct pci_dev *pdev, pm_message_t state)
> >
> > mana_gd_cleanup(pdev);
> >
> > + debugfs_remove_recursive(gc->mana_pci_debugfs);
> > + gc->mana_pci_debugfs = NULL;
>
> Hi Erni,
>
> The same cleanup of mana_pci_debugfs already appears in a couple of other
> places. It seems that all such cleanup is now paired with a call to
> mana_gd_cleanup().
>
> So could you consider performing the mana_pci_debugfs cleanup in
> mana_gd_cleanup()? Possibly also renaming that function?
>
Yes, I think it makes sense to combine them in one function.
I will make that change in the next version.
> > +
> > return 0;
> > }
> >
> > @@ -2140,6 +2155,12 @@ int mana_gd_resume(struct pci_dev *pdev)
> > struct gdma_context *gc = pci_get_drvdata(pdev);
> > int err;
> >
> > + if (gc->is_pf)
> > + gc->mana_pci_debugfs = debugfs_create_dir("0", mana_debugfs_root);
> > + else
> > + gc->mana_pci_debugfs = debugfs_create_dir(pci_slot_name(pdev->slot),
> > + mana_debugfs_root);
>
> Likewise the setup of mana_pci_debugfs seems to now always be paired
> with a call to mana_gd_setup().
>
Thank you for the review.
I will send the next version with updated changes.
Regards,
Vennela
> > +
> > err = mana_gd_setup(pdev);
> > if (err)
> > return err;
>
> ...
^ permalink raw reply
* Re: [PATCH 10/16] RDMA/efa: Use ib_copy_validate_udata_in_cm()
From: Michael Margolin @ 2026-03-12 13:22 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Gal Pressman, Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Junxian Huang, Kai Shen, Konstantin Taranov,
Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
Long Li, Michal Kalderon, Nelson Escobar, Satish Kharat,
Selvin Xavier, Yossi Leybovich, Chengchang Tang, Tatyana Nikolova,
Vishnu Dasa, Yishai Hadas, Zhu Yanjun, patches
In-Reply-To: <20260312120858.GH1448102@nvidia.com>
On Thu, Mar 12, 2026 at 09:08:58AM -0300, Jason Gunthorpe wrote:
> On Thu, Mar 12, 2026 at 08:20:20AM -0300, Jason Gunthorpe wrote:
> > On Thu, Mar 12, 2026 at 01:03:59PM +0200, Gal Pressman wrote:
> > > On 12/03/2026 2:24, Jason Gunthorpe wrote:
> > > > Add the missed check for unsupported comp_mask bits.
> > >
> > > Is it really missed? IIRC, it's intended.
> > >
> > > See the comment above your hunk, and efa_user_comp_handshake()?
> >
> > No, that is an illegal way to use a field called comp_mask.
> >
> > If the driver wants that it needs a new field "suggested feature flags
> > to enable"
> >
> > comp_mask is strictly to say that new fields are present and must be
> > processed by the kernel, and nothing else.
>
> We could also rename the struct field away from comp_mask? It is
> easier to add a comp_mask later.
>
> Jason
Agreed that this field should be renamed, as we shouldn't fail when an
unsupported value is requested. I'll send a patch.
Michael
^ permalink raw reply
* Re: [PATCH 00/61] treewide: Use IS_ERR_OR_NULL over manual NULL check - refactor
From: Jason Gunthorpe @ 2026-03-12 12:57 UTC (permalink / raw)
To: Kuan-Wei Chiu
Cc: Philipp Hahn, amd-gfx, apparmor, bpf, ceph-devel, cocci, dm-devel,
dri-devel, gfs2, intel-gfx, intel-wired-lan, iommu, kvm,
linux-arm-kernel, linux-block, linux-bluetooth, linux-btrfs,
linux-cifs, linux-clk, linux-erofs, linux-ext4, linux-fsdevel,
linux-gpio, linux-hyperv, linux-input, linux-kernel, linux-leds,
linux-media, linux-mips, linux-mm, linux-modules, linux-mtd,
linux-nfs, linux-omap, linux-phy, linux-pm, linux-rockchip,
linux-s390, linux-scsi, linux-sctp, linux-security-module,
linux-sh, linux-sound, linux-stm32, linux-trace-kernel, linux-usb,
linux-wireless, netdev, ntfs3, samba-technical, sched-ext,
target-devel, tipc-discussion, v9fs
In-Reply-To: <abBlpGKO842B3yl9@google.com>
On Wed, Mar 11, 2026 at 02:40:36AM +0800, Kuan-Wei Chiu wrote:
> IMHO, the necessity of IS_ERR_OR_NULL() often highlights a confusing or
> flawed API design. It usually implies that the caller is unsure whether
> a failure results in an error pointer or a NULL pointer.
+1
IS_ERR_OR_NULL() should always be looked on with suspicion. Very
little should be returning some tri-state 'ERR' 'NULL' 'SUCCESS'
pointer. What does the middle condition even mean? IS_ERR_OR_NULL()
implies ERR and NULL are semantically the same, so fix the things to
always use ERR.
If you want to improve things, work to get rid of the NULL checks this
script identifies. Remove ERR or NULL because only one can ever
happen, or fix the source to consistently return ERR.
Jason
^ permalink raw reply
* Re: [PATCH 10/16] RDMA/efa: Use ib_copy_validate_udata_in_cm()
From: Jason Gunthorpe @ 2026-03-12 12:08 UTC (permalink / raw)
To: Gal Pressman
Cc: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Junxian Huang, Kai Shen, Konstantin Taranov,
Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
Long Li, Michal Kalderon, Michael Margolin, Nelson Escobar,
Satish Kharat, Selvin Xavier, Yossi Leybovich, Chengchang Tang,
Tatyana Nikolova, Vishnu Dasa, Yishai Hadas, Zhu Yanjun, patches
In-Reply-To: <20260312112020.GE1448102@nvidia.com>
On Thu, Mar 12, 2026 at 08:20:20AM -0300, Jason Gunthorpe wrote:
> On Thu, Mar 12, 2026 at 01:03:59PM +0200, Gal Pressman wrote:
> > On 12/03/2026 2:24, Jason Gunthorpe wrote:
> > > Add the missed check for unsupported comp_mask bits.
> >
> > Is it really missed? IIRC, it's intended.
> >
> > See the comment above your hunk, and efa_user_comp_handshake()?
>
> No, that is an illegal way to use a field called comp_mask.
>
> If the driver wants that it needs a new field "suggested feature flags
> to enable"
>
> comp_mask is strictly to say that new fields are present and must be
> processed by the kernel, and nothing else.
We could also rename the struct field away from comp_mask? It is
easier to add a comp_mask later.
Jason
^ permalink raw reply
* [PATCH net-next] net: mana: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Erni Sri Satya Vennela @ 2026-03-12 11:25 UTC (permalink / raw)
To: longli, kotaranov, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
linux-hyperv, linux-kernel
Cc: Erni Sri Satya Vennela
As part of MANA hardening for CVM, clamp hardware-reported adapter
capability values from the MANA_IB_GET_ADAPTER_CAP response before
they are used by the IB subsystem.
The response fields (max_qp_count, max_cq_count, max_mr_count,
max_pd_count, max_inbound_read_limit, max_outbound_read_limit,
max_qp_wr, max_send_sge_count, max_recv_sge_count) are u32 but are
assigned to signed int members in struct ib_device_attr. If hardware
returns a value exceeding INT_MAX, the implicit u32-to-int conversion
produces a negative value, which can cause incorrect behavior in the
IB core and userspace applications.
Clamp these fields to INT_MAX in mana_ib_gd_query_adapter_caps() so
all downstream consumers receive safe values.
Additionally, fix an integer overflow in mana_ib_query_device() where
max_res_rd_atom is computed as max_qp_rd_atom * max_qp. Both operands
are int and the multiplication can overflow. Widen to s64 before
multiplying and clamp the result to INT_MAX.
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
drivers/infiniband/hw/mana/main.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 8d99cd00f002..2869660077ef 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -599,7 +599,8 @@ int mana_ib_query_device(struct ib_device *ibdev, struct ib_device_attr *props,
props->max_mr = dev->adapter_caps.max_mr_count;
props->max_pd = dev->adapter_caps.max_pd_count;
props->max_qp_rd_atom = dev->adapter_caps.max_inbound_read_limit;
- props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp;
+ props->max_res_rd_atom =
+ min_t(s64, (s64)props->max_qp_rd_atom * props->max_qp, INT_MAX);
props->max_qp_init_rd_atom = dev->adapter_caps.max_outbound_read_limit;
props->atomic_cap = IB_ATOMIC_NONE;
props->masked_atomic_cap = IB_ATOMIC_NONE;
@@ -694,20 +695,22 @@ int mana_ib_gd_query_adapter_caps(struct mana_ib_dev *dev)
caps->max_sq_id = resp.max_sq_id;
caps->max_rq_id = resp.max_rq_id;
caps->max_cq_id = resp.max_cq_id;
- caps->max_qp_count = resp.max_qp_count;
- caps->max_cq_count = resp.max_cq_count;
- caps->max_mr_count = resp.max_mr_count;
- caps->max_pd_count = resp.max_pd_count;
- caps->max_inbound_read_limit = resp.max_inbound_read_limit;
- caps->max_outbound_read_limit = resp.max_outbound_read_limit;
+ caps->max_qp_count = min_t(u32, resp.max_qp_count, INT_MAX);
+ caps->max_cq_count = min_t(u32, resp.max_cq_count, INT_MAX);
+ caps->max_mr_count = min_t(u32, resp.max_mr_count, INT_MAX);
+ caps->max_pd_count = min_t(u32, resp.max_pd_count, INT_MAX);
+ caps->max_inbound_read_limit = min_t(u32, resp.max_inbound_read_limit,
+ INT_MAX);
+ caps->max_outbound_read_limit = min_t(u32, resp.max_outbound_read_limit,
+ INT_MAX);
caps->mw_count = resp.mw_count;
caps->max_srq_count = resp.max_srq_count;
caps->max_qp_wr = min_t(u32,
resp.max_requester_sq_size / GDMA_MAX_SQE_SIZE,
resp.max_requester_rq_size / GDMA_MAX_RQE_SIZE);
caps->max_inline_data_size = resp.max_inline_data_size;
- caps->max_send_sge_count = resp.max_send_sge_count;
- caps->max_recv_sge_count = resp.max_recv_sge_count;
+ caps->max_send_sge_count = min_t(u32, resp.max_send_sge_count, INT_MAX);
+ caps->max_recv_sge_count = min_t(u32, resp.max_recv_sge_count, INT_MAX);
caps->feature_flags = resp.feature_flags;
caps->page_size_cap = PAGE_SZ_BM;
--
2.34.1
^ permalink raw reply related
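The two conversions the patch above guards against can be reproduced in
a small userspace C sketch. The function names clamp_u32_to_int() and
mul_clamped() are hypothetical; the kernel patch uses min_t() instead.
The sketch assumes a 32-bit int and non-negative multiplicands, which
is what the clamping step guarantees.

```c
#include <limits.h>
#include <stdint.h>

/* Clamp an unsigned 32-bit device-reported value into the range of int. */
int clamp_u32_to_int(uint32_t v)
{
	return v > (uint32_t)INT_MAX ? INT_MAX : (int)v;
}

/*
 * Multiply two ints in 64 bits, then clamp the product to INT_MAX.
 * Assumes a and b are non-negative, as they are after clamping.
 */
int mul_clamped(int a, int b)
{
	int64_t wide = (int64_t)a * b;

	return wide > INT_MAX ? INT_MAX : (int)wide;
}
```

Without the clamp, a reported value such as 0xFFFFFFFF would turn into
-1 when stored in an int on common ABIs; without the widening, the
int-by-int multiplication itself could overflow before any comparison
takes place.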
* Re: [PATCH 10/16] RDMA/efa: Use ib_copy_validate_udata_in_cm()
From: Jason Gunthorpe @ 2026-03-12 11:20 UTC (permalink / raw)
To: Gal Pressman
Cc: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Junxian Huang, Kai Shen, Konstantin Taranov,
Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
Long Li, Michal Kalderon, Michael Margolin, Nelson Escobar,
Satish Kharat, Selvin Xavier, Yossi Leybovich, Chengchang Tang,
Tatyana Nikolova, Vishnu Dasa, Yishai Hadas, Zhu Yanjun, patches
In-Reply-To: <cf3bbf89-0bb1-4c58-b78f-37afdb2ff99c@linux.dev>
On Thu, Mar 12, 2026 at 01:03:59PM +0200, Gal Pressman wrote:
> On 12/03/2026 2:24, Jason Gunthorpe wrote:
> > Add the missed check for unsupported comp_mask bits.
>
> Is it really missed? IIRC, it's intended.
>
> See the comment above your hunk, and efa_user_comp_handshake()?
No, that is an illegal way to use a field called comp_mask.
If the driver wants that it needs a new field "suggested feature flags
to enable"
comp_mask is strictly to say that new fields are present and must be
processed by the kernel, and nothing else.
Jason
^ permalink raw reply
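The comp_mask rule stated in the message above (every set bit announces
a field the kernel must process, so unknown bits are an error) can be
sketched in userspace C. Everything here is hypothetical: struct req,
the REQ_COMP_* bits and validate_req() are illustrative names, not the
EFA or RDMA core definitions, and -EOPNOTSUPP is an assumed choice of
error code.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits: each one says "this optional field is present". */
#define REQ_COMP_HAS_FOO	(1u << 0)
#define REQ_COMP_HAS_BAR	(1u << 1)
#define REQ_COMP_SUPPORTED	(REQ_COMP_HAS_FOO | REQ_COMP_HAS_BAR)

/* Hypothetical request layout, for illustration only. */
struct req {
	uint32_t comp_mask;
	uint32_t foo;	/* valid only if REQ_COMP_HAS_FOO is set */
	uint32_t bar;	/* valid only if REQ_COMP_HAS_BAR is set */
};

/* Reject any request that sets a bit this side does not understand. */
int validate_req(const struct req *r)
{
	if (r->comp_mask & ~REQ_COMP_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}
```

A field whose unknown bits are meant to be ignored silently is a
different contract, which is why the thread converges on renaming it
rather than relaxing this check.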
* Re: [PATCH 10/16] RDMA/efa: Use ib_copy_validate_udata_in_cm()
From: Gal Pressman @ 2026-03-12 11:03 UTC (permalink / raw)
To: Jason Gunthorpe, Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Junxian Huang, Kai Shen, Konstantin Taranov,
Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
Long Li, Michal Kalderon, Michael Margolin, Nelson Escobar,
Satish Kharat, Selvin Xavier, Yossi Leybovich, Chengchang Tang,
Tatyana Nikolova, Vishnu Dasa, Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <10-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
On 12/03/2026 2:24, Jason Gunthorpe wrote:
> Add the missed check for unsupported comp_mask bits.
Is it really missed? IIRC, it's intended.
See the comment above your hunk, and efa_user_comp_handshake()?
* Re: [PATCH v2 1/2] Drivers: hv: vmbus: Provide option to skip VMBus unload on panic
From: Wei Liu @ 2026-03-12 4:46 UTC (permalink / raw)
To: mhklinux
Cc: drawat.floss, maarten.lankhorst, mripard, tzimmermann, airlied,
simona, kys, haiyangz, wei.liu, decui, longli, ryasuoka, jfalempe,
dri-devel, linux-kernel, linux-hyperv, stable
In-Reply-To: <20260217182335.265585-1-mhklkml@zohomail.com>
Dexuan and Long, can you share your thoughts on this patch?
On Tue, Feb 17, 2026 at 10:23:34AM -0800, Michael Kelley wrote:
> From: Michael Kelley <mhklinux@outlook.com>
>
> Currently, VMBus code initiates a VMBus unload in the panic path so
> that if a kdump kernel is loaded, it can start fresh in setting up its
> own VMBus connection. However, a driver for the VMBus virtual frame
> buffer may need to flush dirty portions of the frame buffer back to
> the Hyper-V host so that panic information is visible in the graphics
> console. To support such flushing, provide exported functions for the
> frame buffer driver to specify that the VMBus unload should not be
> done by the VMBus driver, and to initiate the VMBus unload itself.
> Together these allow a frame buffer driver to delay the VMBus unload
> until after it has completed the flush.
>
> Ideally, the VMBus driver could use its own panic-path callback to do
> the unload after all frame buffer drivers have finished. But DRM frame
> buffer drivers use the kmsg dump callback, and there are no callbacks
> after that in the panic path. Hence this somewhat messy approach to
> properly sequencing the frame buffer flush and the VMBus unload.
>
> Fixes: 3671f3777758 ("drm/hyperv: Add support for drm_panic")
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
> ---
> Changes in v2: None
>
> drivers/hv/channel_mgmt.c | 1 +
> drivers/hv/hyperv_vmbus.h | 1 -
> drivers/hv/vmbus_drv.c | 25 ++++++++++++++++++-------
> include/linux/hyperv.h | 3 +++
> 4 files changed, 22 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 74fed2c073d4..5de83676dbad 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -944,6 +944,7 @@ void vmbus_initiate_unload(bool crash)
> else
> vmbus_wait_for_unload();
> }
> +EXPORT_SYMBOL_GPL(vmbus_initiate_unload);
>
> static void vmbus_setup_channel_state(struct vmbus_channel *channel,
> struct vmbus_channel_offer_channel *offer)
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index cdbc5f5c3215..5d3944fc93ae 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -440,7 +440,6 @@ void hv_vss_deinit(void);
> int hv_vss_pre_suspend(void);
> int hv_vss_pre_resume(void);
> void hv_vss_onchannelcallback(void *context);
> -void vmbus_initiate_unload(bool crash);
>
> static inline void hv_poll_channel(struct vmbus_channel *channel,
> void (*cb)(void *))
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 6785ad63a9cb..97dfa529d250 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -69,19 +69,29 @@ bool vmbus_is_confidential(void)
> }
> EXPORT_SYMBOL_GPL(vmbus_is_confidential);
>
> +static bool skip_vmbus_unload;
> +
> +/*
> + * Allow a VMBus framebuffer driver to specify that in the case of a panic,
> + * it will do the VMbus unload operation once it has flushed any dirty
> + * portions of the framebuffer to the Hyper-V host.
> + */
> +void vmbus_set_skip_unload(bool skip)
> +{
> + skip_vmbus_unload = skip;
> +}
> +EXPORT_SYMBOL_GPL(vmbus_set_skip_unload);
> +
> /*
> * The panic notifier below is responsible solely for unloading the
> * vmbus connection, which is necessary in a panic event.
> - *
> - * Notice an intrincate relation of this notifier with Hyper-V
> - * framebuffer panic notifier exists - we need vmbus connection alive
> - * there in order to succeed, so we need to order both with each other
> - * [see hvfb_on_panic()] - this is done using notifiers' priorities.
> */
> static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val,
> void *args)
> {
> - vmbus_initiate_unload(true);
> + if (!skip_vmbus_unload)
> + vmbus_initiate_unload(true);
> +
> return NOTIFY_DONE;
> }
> static struct notifier_block hyperv_panic_vmbus_unload_block = {
> @@ -2848,7 +2858,8 @@ static void hv_crash_handler(struct pt_regs *regs)
> {
> int cpu;
>
> - vmbus_initiate_unload(true);
> + if (!skip_vmbus_unload)
> + vmbus_initiate_unload(true);
> /*
> * In crash handler we can't schedule synic cleanup for all CPUs,
> * doing the cleanup for current CPU only. This should be sufficient
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index dfc516c1c719..b0502a336eb3 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -1334,6 +1334,9 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
> bool fb_overlap_ok);
> void vmbus_free_mmio(resource_size_t start, resource_size_t size);
>
> +void vmbus_initiate_unload(bool crash);
> +void vmbus_set_skip_unload(bool skip);
> +
> /*
> * GUID definitions of various offer types - services offered to the guest.
> */
> --
> 2.25.1
>
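The ordering problem described in the quoted commit message can be modelled in
user space roughly as follows. This is only a sketch: the demo_* names are
hypothetical, the framebuffer callback stands in for a DRM kmsg dump handler,
and nothing here is the actual vmbus or hyperv_drm code.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_skip_unload;			/* models skip_vmbus_unload */
static bool demo_connected = true;		/* VMBus connection still up? */
static int demo_flushes_while_connected;	/* flushes that reached the host */

static void demo_initiate_unload(void)
{
	demo_connected = false;
}

static void demo_set_skip_unload(bool skip)
{
	demo_skip_unload = skip;
}

/* Panic notifier: runs before the kmsg dump callbacks. */
static void demo_panic_notifier(void)
{
	if (!demo_skip_unload)
		demo_initiate_unload();
}

/* Hypothetical framebuffer kmsg dump callback: runs last in the panic path. */
static void demo_fb_kmsg_dump(void)
{
	if (demo_connected)
		demo_flushes_while_connected++;	/* flush can reach the host */
	if (demo_skip_unload)
		demo_initiate_unload();		/* driver does the deferred unload */
}

static void demo_panic(void)
{
	demo_panic_notifier();
	demo_fb_kmsg_dump();
	/* Either way, the connection must be unloaded before kdump starts. */
	assert(!demo_connected);
}
```

With demo_set_skip_unload(true) called at driver setup, the flush happens
while the connection is still alive and the unload follows it. Without that
call, the notifier unloads first and the flush is lost.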
* Re: [PATCH v2] mshv: Introduce tracing support
From: Wei Liu @ 2026-03-12 4:45 UTC (permalink / raw)
To: Stanislav Kinsburskii
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <177238674748.34040.4319844213401541666.stgit@skinsburskii-cloud-desktop.internal.cloudapp.net>
On Sun, Mar 01, 2026 at 05:39:10PM +0000, Stanislav Kinsburskii wrote:
> Introduce various trace events and use them in the corresponding places
> in the driver.
>
> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Applied to hyperv-next.
* Re: [PATCH 1/1] Drivers: hv: vmbus: Limit channel interrupt scan to relid high water mark
From: Wei Liu @ 2026-03-12 4:44 UTC (permalink / raw)
To: mhklinux
Cc: kys, haiyangz, wei.liu, decui, longli, linux-hyperv, linux-kernel
In-Reply-To: <20260220164045.1670-1-mhklkml@zohomail.com>
On Fri, Feb 20, 2026 at 08:40:45AM -0800, Michael Kelley wrote:
> From: Michael Kelley <mhklinux@outlook.com>
>
> When checking for VMBus channel interrupts, current code always scans the
> full SynIC receive interrupt bit array to get the relid of the
> interrupting channels. The array has HV_EVENT_FLAGS_COUNT (2048) bits.
> But VMs rarely have more than 100 channels, and the relid is typically
> a small integer that is densely assigned by the Hyper-V host. It's
> wasteful to scan 2048 bits when it is highly unlikely that anything will
> be found past bit 100. The waste is double with Confidential VMBus because
> there are two receive interrupt arrays that must be scanned: one for the
> hypervisor SynIC and one for the paravisor SynIC.
>
> Improve the scanning by tracking the largest relid that has been offered
> by the Hyper-V host. Then when checking for VMBus channel interrupts, only
> scan up to this high water mark.
>
> When channels are rescinded, it's not worth the complexity to recalculate
> the high water mark. Hyper-V tends to reuse the rescinded relids for any
> new channels that are subsequently added, and the performance benefit of
> exactly tracking the high water mark would be minimal.
>
> Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Applied to hyperv-next. Thanks.
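For readers skimming the thread, the idea in the quoted commit message can be
sketched in user-space C as below. The demo_* names and the plain bit loop are
assumptions for illustration; the real patch works on the SynIC event flags
page and is not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define DEMO_FLAGS_COUNT   2048	/* same size as HV_EVENT_FLAGS_COUNT above */
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS         (DEMO_FLAGS_COUNT / DEMO_BITS_PER_WORD)

/* Largest relid offered so far; never lowered when a channel is rescinded. */
static unsigned int demo_max_relid;

/* Called for every channel offer: only ever raises the high water mark. */
static void demo_note_offer(unsigned int relid)
{
	assert(relid < DEMO_FLAGS_COUNT);
	if (relid > demo_max_relid)
		demo_max_relid = relid;
}

/*
 * Collect the set bits in [0, demo_max_relid] instead of walking all
 * DEMO_FLAGS_COUNT bits. Returns the number of relids stored in found[].
 */
static size_t demo_scan(const unsigned long *bits, unsigned int *found,
			size_t max_found)
{
	size_t n = 0;
	unsigned int relid;

	for (relid = 0; relid <= demo_max_relid && n < max_found; relid++) {
		unsigned long word = bits[relid / DEMO_BITS_PER_WORD];

		if (word & (1UL << (relid % DEMO_BITS_PER_WORD)))
			found[n++] = relid;
	}
	return n;
}
```

Leaving the mark stale after a rescind only costs a few extra bits per scan,
which matches the trade-off the commit message argues for.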
* Re: [PATCH V0] mshv: pass struct mshv_user_mem_region by reference
From: Wei Liu @ 2026-03-12 4:32 UTC (permalink / raw)
To: Michael Kelley
Cc: Mukesh R, linux-hyperv@vger.kernel.org,
linux-kernel@vger.kernel.org, wei.liu@kernel.org
In-Reply-To: <SN6PR02MB4157FBAE767E7563898DA0BDD47CA@SN6PR02MB4157.namprd02.prod.outlook.com>
On Wed, Mar 04, 2026 at 06:45:12PM +0000, Michael Kelley wrote:
> From: Mukesh R <mrathor@linux.microsoft.com> Sent: Tuesday, March 3, 2026 4:03 PM
> >
> > For unstated reasons, function mshv_partition_ioctl_set_memory passes
> > struct mshv_user_mem_region by value instead of by reference. Change
> > it to pass by reference.
> >
> > Signed-off-by: Mukesh R <mrathor@linux.microsoft.com>
>
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>
>
Applied to hyperv-fixes.
* Re: [PATCH -hyperv 1/3] x86/hyperv: Save segment registers directly to memory in hv_hvcrash_ctxt_save()
From: Wei Liu @ 2026-03-12 4:27 UTC (permalink / raw)
To: Uros Bizjak
Cc: linux-hyperv, x86, linux-kernel, K. Y. Srinivasan, Haiyang Zhang,
Wei Liu, Dexuan Cui, Long Li, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin
In-Reply-To: <20260311102658.215693-1-ubizjak@gmail.com>
On Wed, Mar 11, 2026 at 11:25:58AM +0100, Uros Bizjak wrote:
> hv_hvcrash_ctxt_save() in arch/x86/hyperv/hv_crash.c currently saves
> segment registers via a general-purpose register (%eax). Update the
> code to save segment registers (cs, ss, ds, es, fs, gs) directly to
> the crash context memory using movw. This avoids unnecessary use of
> a general-purpose register, making the code simpler and more efficient.
>
> The size of the corresponding object file improves as follows:
>
> text data bss dec hex filename
> 4167 176 200 4543 11bf hv_crash-old.o
> 4151 176 200 4527 11af hv_crash-new.o
>
> No functional change occurs to the saved context contents; this is
> purely a code-quality improvement.
>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Dexuan Cui <decui@microsoft.com>
> Cc: Long Li <longli@microsoft.com>
> Cc: Thomas Gleixner <tglx@kernel.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
Series applied to hyperv-fixes. Thanks.
* [PATCH 15/16] RDMA: Remove redundant = {} for udata req structs
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
Now that all of the udata request structs are loaded with the helpers,
the callers should not pre-zero them. The helpers all guarantee that
the entire struct is filled with something.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 4 ++--
drivers/infiniband/hw/hns/hns_roce_main.c | 2 +-
drivers/infiniband/hw/hns/hns_roce_srq.c | 2 +-
drivers/infiniband/hw/mana/cq.c | 2 +-
drivers/infiniband/hw/mana/qp.c | 2 +-
drivers/infiniband/hw/mana/wq.c | 2 +-
drivers/infiniband/hw/mlx4/qp.c | 4 ++--
drivers/infiniband/hw/mlx5/cq.c | 2 +-
drivers/infiniband/hw/mlx5/main.c | 2 +-
drivers/infiniband/hw/mlx5/mr.c | 2 +-
drivers/infiniband/hw/mlx5/qp.c | 4 ++--
drivers/infiniband/hw/mlx5/srq.c | 2 +-
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 4 +++-
drivers/infiniband/hw/qedr/verbs.c | 8 ++++----
14 files changed, 22 insertions(+), 20 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index b491bcd886ccb0..f1020921f0e742 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -682,7 +682,7 @@ int efa_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
struct efa_com_create_qp_result create_qp_resp;
struct efa_dev *dev = to_edev(ibqp->device);
struct efa_ibv_create_qp_resp resp = {};
- struct efa_ibv_create_qp cmd = {};
+ struct efa_ibv_create_qp cmd;
struct efa_qp *qp = to_eqp(ibqp);
struct efa_ucontext *ucontext;
u16 supported_efa_flags = 0;
@@ -1121,7 +1121,7 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct efa_com_create_cq_result result;
struct ib_device *ibdev = ibcq->device;
struct efa_dev *dev = to_edev(ibdev);
- struct efa_ibv_create_cq cmd = {};
+ struct efa_ibv_create_cq cmd;
struct efa_cq *cq = to_ecq(ibcq);
int entries = attr->cqe;
bool set_src_addr;
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index ec6fb3f1177941..0dbe99aab6ad21 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -425,7 +425,7 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
struct hns_roce_ucontext *context = to_hr_ucontext(uctx);
struct hns_roce_dev *hr_dev = to_hr_dev(uctx->device);
struct hns_roce_ib_alloc_ucontext_resp resp = {};
- struct hns_roce_ib_alloc_ucontext ucmd = {};
+ struct hns_roce_ib_alloc_ucontext ucmd;
int ret = -EAGAIN;
if (!hr_dev->active)
diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c
index b37a76587aa868..601f8cdfce96a3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_srq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
@@ -406,7 +406,7 @@ static int alloc_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
struct ib_udata *udata,
struct hns_roce_ib_create_srq_resp *resp)
{
- struct hns_roce_ib_create_srq ucmd = {};
+ struct hns_roce_ib_create_srq ucmd;
struct hns_roce_ucontext *uctx;
int ret;
diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 3f932ef6e5fff6..f4cbe21763bf11 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -13,7 +13,7 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct mana_ib_create_cq_resp resp = {};
struct mana_ib_ucontext *mana_ucontext;
struct ib_device *ibdev = ibcq->device;
- struct mana_ib_create_cq ucmd = {};
+ struct mana_ib_create_cq ucmd;
struct mana_ib_dev *mdev;
bool is_rnic_cq;
u32 doorbell;
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 69c8d4f7a1f46b..ddc30d37d715f6 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -97,7 +97,7 @@ static int mana_ib_create_qp_rss(struct ib_qp *ibqp, struct ib_pd *pd,
container_of(pd->device, struct mana_ib_dev, ib_dev);
struct ib_rwq_ind_table *ind_tbl = attr->rwq_ind_tbl;
struct mana_ib_create_qp_rss_resp resp = {};
- struct mana_ib_create_qp_rss ucmd = {};
+ struct mana_ib_create_qp_rss ucmd;
mana_handle_t *mana_ind_table;
struct mana_port_context *mpc;
unsigned int ind_tbl_size;
diff --git a/drivers/infiniband/hw/mana/wq.c b/drivers/infiniband/hw/mana/wq.c
index aceeea7f17b339..5c2134a0b1a196 100644
--- a/drivers/infiniband/hw/mana/wq.c
+++ b/drivers/infiniband/hw/mana/wq.c
@@ -11,7 +11,7 @@ struct ib_wq *mana_ib_create_wq(struct ib_pd *pd,
{
struct mana_ib_dev *mdev =
container_of(pd->device, struct mana_ib_dev, ib_dev);
- struct mana_ib_create_wq ucmd = {};
+ struct mana_ib_create_wq ucmd;
struct mana_ib_wq *wq;
int err;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index cfb54ffcaac22c..790be09d985a1a 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -709,7 +709,7 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
struct ib_qp_init_attr *init_attr,
struct ib_udata *udata)
{
- struct mlx4_ib_create_qp_rss ucmd = {};
+ struct mlx4_ib_create_qp_rss ucmd;
int err;
if (!udata) {
@@ -4230,7 +4230,7 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
u32 wq_attr_mask, struct ib_udata *udata)
{
struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
- struct mlx4_ib_modify_wq ucmd = {};
+ struct mlx4_ib_modify_wq ucmd;
enum ib_wq_state cur_state, new_state;
int err;
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index f5e75e51c6763f..1f94863e755cc7 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -720,7 +720,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
int *cqe_size, int *index, int *inlen,
struct uverbs_attr_bundle *attrs)
{
- struct mlx5_ib_create_cq ucmd = {};
+ struct mlx5_ib_create_cq ucmd;
unsigned long page_size;
unsigned int page_offset_quantized;
__be64 *pas;
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index ff2c02c85625ce..fe3de414bfcad5 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2178,7 +2178,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx,
{
struct ib_device *ibdev = uctx->device;
struct mlx5_ib_dev *dev = to_mdev(ibdev);
- struct mlx5_ib_alloc_ucontext_req_v2 req = {};
+ struct mlx5_ib_alloc_ucontext_req_v2 req;
struct mlx5_ib_alloc_ucontext_resp resp = {};
struct mlx5_ib_ucontext *context = to_mucontext(uctx);
struct mlx5_bfreg_info *bfregi;
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 49dcc39836c047..37f3d19bd374ee 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1768,7 +1768,7 @@ int mlx5_ib_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
u32 *in = NULL;
void *mkc;
int err;
- struct mlx5_ib_alloc_mw req = {};
+ struct mlx5_ib_alloc_mw req;
struct {
__u32 comp_mask;
__u32 response_length;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 3b602ed0a2dafc..8f50e7342a7694 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4692,7 +4692,7 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
struct mlx5_ib_dev *dev = to_mdev(ibqp->device);
struct mlx5_ib_modify_qp_resp resp = {};
struct mlx5_ib_qp *qp = to_mqp(ibqp);
- struct mlx5_ib_modify_qp ucmd = {};
+ struct mlx5_ib_modify_qp ucmd;
enum ib_qp_type qp_type;
enum ib_qp_state cur_state, new_state;
int err = -EINVAL;
@@ -5379,7 +5379,7 @@ static int prepare_user_rq(struct ib_pd *pd,
struct mlx5_ib_rwq *rwq)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
- struct mlx5_ib_create_wq ucmd = {};
+ struct mlx5_ib_create_wq ucmd;
int err;
err = ib_copy_validate_udata_in_cm(udata, ucmd,
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 6d89c0242cab61..852f6f502d14d0 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -45,7 +45,7 @@ static int create_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
struct ib_udata *udata, int buf_size)
{
struct mlx5_ib_dev *dev = to_mdev(pd->device);
- struct mlx5_ib_create_srq ucmd = {};
+ struct mlx5_ib_create_srq ucmd;
struct mlx5_ib_ucontext *ucontext = rdma_udata_to_drv_context(
udata, struct mlx5_ib_ucontext, ibucontext);
int err;
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 8b285fcc638701..eed149f7a942b8 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -1311,12 +1311,14 @@ int ocrdma_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attrs,
if (status)
goto gen_err;
- memset(&ureq, 0, sizeof(ureq));
if (udata) {
status = ib_copy_validate_udata_in(udata, ureq, rsvd1);
if (status)
return status;
+ } else {
+ memset(&ureq, 0, sizeof(ureq));
}
+
ocrdma_set_qp_init_params(qp, pd, attrs);
if (udata == NULL)
qp->cap_flags |= (OCRDMA_QP_MW_BIND | OCRDMA_QP_LKEY0 |
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 42d20b35ff3fe0..679aa6f3a63bc5 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -264,7 +264,7 @@ int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
int rc;
struct qedr_ucontext *ctx = get_qedr_ucontext(uctx);
struct qedr_alloc_ucontext_resp uresp = {};
- struct qedr_alloc_ucontext_req ureq = {};
+ struct qedr_alloc_ucontext_req ureq;
struct qedr_dev *dev = get_qedr_dev(ibdev);
struct qed_rdma_add_user_out_params oparams;
struct qedr_user_mmap_entry *entry;
@@ -913,7 +913,7 @@ int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
};
struct qedr_dev *dev = get_qedr_dev(ibdev);
struct qed_rdma_create_cq_in_params params;
- struct qedr_create_cq_ureq ureq = {};
+ struct qedr_create_cq_ureq ureq;
int vector = attr->comp_vector;
int entries = attr->cqe;
struct qedr_cq *cq = get_qedr_cq(ibcq);
@@ -1541,7 +1541,7 @@ int qedr_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *init_attr,
struct qedr_dev *dev = get_qedr_dev(ibsrq->device);
struct qed_rdma_create_srq_out_params out_params;
struct qedr_pd *pd = get_qedr_pd(ibsrq->pd);
- struct qedr_create_srq_ureq ureq = {};
+ struct qedr_create_srq_ureq ureq;
u64 pbl_base_addr, phy_prod_pair_addr;
struct qedr_srq_hwq_info *hw_srq;
u32 page_cnt, page_size;
@@ -1837,7 +1837,7 @@ static int qedr_create_user_qp(struct qedr_dev *dev,
struct qed_rdma_create_qp_in_params in_params;
struct qed_rdma_create_qp_out_params out_params;
struct qedr_create_qp_uresp uresp = {};
- struct qedr_create_qp_ureq ureq = {};
+ struct qedr_create_qp_ureq ureq;
int alloc_and_init = rdma_protocol_roce(&dev->ibdev, 1);
struct qedr_ucontext *ctx = NULL;
struct qedr_pd *pd = NULL;
--
2.43.0
* [PATCH 14/16] RDMA/irdma: Add missing comp_mask check in alloc_ucontext
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
irdma has a comp_mask field that was never checked for validity; check
it.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/irdma/verbs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index b2978632241900..d695130b187bdd 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -296,7 +296,9 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
if (udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
return -EINVAL;
- ret = ib_copy_validate_udata_in(udata, req, rsvd8);
+ ret = ib_copy_validate_udata_in_cm(udata, req, rsvd8,
+ IRDMA_ALLOC_UCTX_USE_RAW_ATTR |
+ IRDMA_SUPPORT_WQE_FORMAT_V2);
if (ret)
return ret;
--
2.43.0
* [PATCH 01/16] RDMA: Consolidate patterns with offsetofend() to ib_copy_validate_udata_in()
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
Go treewide and consolidate all existing patterns using:
* offsetofend() and variations
* ib_is_udata_cleared()
* ib_copy_from_udata()
into a direct call to the new ib_copy_validate_udata_in().
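As a rough user-space model of the open-coded pattern being consolidated, the
three checks can be written as one function. This is a sketch under stated
assumptions, not the kernel helper: struct demo_req and demo_copy_validate()
are hypothetical, the error codes follow the mlx4/mlx5 hunks in the diff
below, and zero-filling the part userspace did not supply is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout, for illustration only. */
struct demo_req {
	uint32_t comp_mask;
	uint32_t reserved;	/* last member of the original ABI */
	uint64_t later_field;	/* appended by a later ABI revision */
};

/* User-space stand-in for the kernel's offsetofend(). */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* 'in' and 'inlen' stand in for the udata input buffer and udata->inlen. */
static int demo_copy_validate(struct demo_req *req, const void *in,
			      size_t inlen)
{
	const unsigned char *p = in;
	size_t copy_len = inlen < sizeof(*req) ? inlen : sizeof(*req);
	size_t i;

	assert(req != NULL && in != NULL);

	/* 1. Userspace must supply at least the original ABI. */
	if (inlen < demo_offsetofend(struct demo_req, reserved))
		return -EINVAL;

	/* 2. Bytes beyond what we know about must be zero. */
	for (i = sizeof(*req); i < inlen; i++)
		if (p[i])
			return -EOPNOTSUPP;

	/* 3. Copy what was supplied; anything missing reads as zero. */
	memset(req, 0, sizeof(*req));
	memcpy(req, in, copy_len);
	return 0;
}
```

A caller then only passes the struct and the name of the last mandatory
member, which is the shape of the ib_copy_validate_udata_in() calls in the
diff.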
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 47 +++---------------------
drivers/infiniband/hw/irdma/verbs.c | 10 +++---
drivers/infiniband/hw/mlx4/qp.c | 38 ++++----------------
drivers/infiniband/hw/mlx5/qp.c | 51 ++++++---------------------
4 files changed, 26 insertions(+), 120 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index fc498663cd372f..8d9357e2d513bb 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -699,29 +699,9 @@ int efa_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
if (err)
goto err_out;
- if (offsetofend(typeof(cmd), driver_qp_type) > udata->inlen) {
- ibdev_dbg(&dev->ibdev,
- "Incompatible ABI params, no input udata\n");
- err = -EINVAL;
+ err = ib_copy_validate_udata_in(udata, cmd, driver_qp_type);
+ if (err)
goto err_out;
- }
-
- if (udata->inlen > sizeof(cmd) &&
- !ib_is_udata_cleared(udata, sizeof(cmd),
- udata->inlen - sizeof(cmd))) {
- ibdev_dbg(&dev->ibdev,
- "Incompatible ABI params, unknown fields in udata\n");
- err = -EINVAL;
- goto err_out;
- }
-
- err = ib_copy_from_udata(&cmd, udata,
- min(sizeof(cmd), udata->inlen));
- if (err) {
- ibdev_dbg(&dev->ibdev,
- "Cannot copy udata for create_qp\n");
- goto err_out;
- }
if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_98)) {
ibdev_dbg(&dev->ibdev,
@@ -1160,28 +1140,9 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
goto err_out;
}
- if (offsetofend(typeof(cmd), num_sub_cqs) > udata->inlen) {
- ibdev_dbg(ibdev,
- "Incompatible ABI params, no input udata\n");
- err = -EINVAL;
+ err = ib_copy_validate_udata_in(udata, cmd, num_sub_cqs);
+ if (err)
goto err_out;
- }
-
- if (udata->inlen > sizeof(cmd) &&
- !ib_is_udata_cleared(udata, sizeof(cmd),
- udata->inlen - sizeof(cmd))) {
- ibdev_dbg(ibdev,
- "Incompatible ABI params, unknown fields in udata\n");
- err = -EINVAL;
- goto err_out;
- }
-
- err = ib_copy_from_udata(&cmd, udata,
- min(sizeof(cmd), udata->inlen));
- if (err) {
- ibdev_dbg(ibdev, "Cannot copy udata for create_cq\n");
- goto err_out;
- }
if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_58)) {
ibdev_dbg(ibdev,
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 7251cd7a21471e..b2978632241900 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -284,7 +284,6 @@ static void irdma_alloc_push_page(struct irdma_qp *iwqp)
static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
struct ib_udata *udata)
{
-#define IRDMA_ALLOC_UCTX_MIN_REQ_LEN offsetofend(struct irdma_alloc_ucontext_req, rsvd8)
#define IRDMA_ALLOC_UCTX_MIN_RESP_LEN offsetofend(struct irdma_alloc_ucontext_resp, rsvd)
struct ib_device *ibdev = uctx->device;
struct irdma_device *iwdev = to_iwdev(ibdev);
@@ -292,13 +291,14 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
struct irdma_alloc_ucontext_resp uresp = {};
struct irdma_ucontext *ucontext = to_ucontext(uctx);
struct irdma_uk_attrs *uk_attrs = &iwdev->rf->sc_dev.hw_attrs.uk_attrs;
+ int ret;
- if (udata->inlen < IRDMA_ALLOC_UCTX_MIN_REQ_LEN ||
- udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
+ if (udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
return -EINVAL;
- if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen)))
- return -EINVAL;
+ ret = ib_copy_validate_udata_in(udata, req, rsvd8);
+ if (ret)
+ return ret;
if (req.userspace_ver < 4 || req.userspace_ver > IRDMA_ABI_VER)
goto ver_error;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 1cb890d3d93cea..b87a4b7949a3a0 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -710,7 +710,6 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
struct ib_udata *udata)
{
struct mlx4_ib_create_qp_rss ucmd = {};
- size_t required_cmd_sz;
int err;
if (!udata) {
@@ -721,16 +720,10 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
if (udata->outlen)
return -EOPNOTSUPP;
- required_cmd_sz = offsetof(typeof(ucmd), reserved1) +
- sizeof(ucmd.reserved1);
- if (udata->inlen < required_cmd_sz) {
- pr_debug("invalid inlen\n");
- return -EINVAL;
- }
-
- if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+ err = ib_copy_validate_udata_in(udata, ucmd, reserved1);
+ if (err) {
pr_debug("copy failed\n");
- return -EFAULT;
+ return err;
}
if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved)))
@@ -739,13 +732,6 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
if (ucmd.comp_mask || ucmd.reserved1)
return -EOPNOTSUPP;
- if (udata->inlen > sizeof(ucmd) &&
- !ib_is_udata_cleared(udata, sizeof(ucmd),
- udata->inlen - sizeof(ucmd))) {
- pr_debug("inlen is not supported\n");
- return -EOPNOTSUPP;
- }
-
if (init_attr->qp_type != IB_QPT_RAW_PACKET) {
pr_debug("RSS QP with unsupported QP type %d\n",
init_attr->qp_type);
@@ -4269,22 +4255,12 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
{
struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
struct mlx4_ib_modify_wq ucmd = {};
- size_t required_cmd_sz;
enum ib_wq_state cur_state, new_state;
- int err = 0;
+ int err;
- required_cmd_sz = offsetof(typeof(ucmd), reserved) +
- sizeof(ucmd.reserved);
- if (udata->inlen < required_cmd_sz)
- return -EINVAL;
-
- if (udata->inlen > sizeof(ucmd) &&
- !ib_is_udata_cleared(udata, sizeof(ucmd),
- udata->inlen - sizeof(ucmd)))
- return -EOPNOTSUPP;
-
- if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+ if (err)
+ return err;
if (ucmd.comp_mask || ucmd.reserved)
return -EOPNOTSUPP;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 59f9ddb35d4620..d4d5e0d457a0b5 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4707,17 +4707,9 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
return -ENOSYS;
if (udata && udata->inlen) {
- if (udata->inlen < offsetofend(typeof(ucmd), ece_options))
- return -EINVAL;
-
- if (udata->inlen > sizeof(ucmd) &&
- !ib_is_udata_cleared(udata, sizeof(ucmd),
- udata->inlen - sizeof(ucmd)))
- return -EOPNOTSUPP;
-
- if (ib_copy_from_udata(&ucmd, udata,
- min(udata->inlen, sizeof(ucmd))))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, ece_options);
+ if (err)
+ return err;
if (ucmd.comp_mask & ~MLX5_IB_MODIFY_QP_OOO_DP ||
memchr_inv(&ucmd.burst_info.reserved, 0,
@@ -5389,25 +5381,11 @@ static int prepare_user_rq(struct ib_pd *pd,
struct mlx5_ib_dev *dev = to_mdev(pd->device);
struct mlx5_ib_create_wq ucmd = {};
int err;
- size_t required_cmd_sz;
-
- required_cmd_sz = offsetofend(struct mlx5_ib_create_wq,
- single_stride_log_num_of_bytes);
- if (udata->inlen < required_cmd_sz) {
- mlx5_ib_dbg(dev, "invalid inlen\n");
- return -EINVAL;
- }
-
- if (udata->inlen > sizeof(ucmd) &&
- !ib_is_udata_cleared(udata, sizeof(ucmd),
- udata->inlen - sizeof(ucmd))) {
- mlx5_ib_dbg(dev, "inlen is not supported\n");
- return -EOPNOTSUPP;
- }
-
- if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+ err = ib_copy_validate_udata_in(udata, ucmd,
+ single_stride_log_num_of_bytes);
+ if (err) {
mlx5_ib_dbg(dev, "copy failed\n");
- return -EFAULT;
+ return err;
}
if (ucmd.comp_mask & (~MLX5_IB_CREATE_WQ_STRIDING_RQ)) {
@@ -5626,7 +5604,6 @@ int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
struct mlx5_ib_dev *dev = to_mdev(wq->device);
struct mlx5_ib_rwq *rwq = to_mrwq(wq);
struct mlx5_ib_modify_wq ucmd = {};
- size_t required_cmd_sz;
int curr_wq_state;
int wq_state;
int inlen;
@@ -5634,17 +5611,9 @@ int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
void *rqc;
void *in;
- required_cmd_sz = offsetofend(struct mlx5_ib_modify_wq, reserved);
- if (udata->inlen < required_cmd_sz)
- return -EINVAL;
-
- if (udata->inlen > sizeof(ucmd) &&
- !ib_is_udata_cleared(udata, sizeof(ucmd),
- udata->inlen - sizeof(ucmd)))
- return -EOPNOTSUPP;
-
- if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+ if (err)
+ return err;
if (ucmd.comp_mask || ucmd.reserved)
return -EOPNOTSUPP;
--
2.43.0
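For readers following the conversions: the open-coded sequence removed from mlx5 above (a minimum length taken from offsetofend(), an all-zero check on any tail beyond sizeof(ucmd), then a bounded copy) can be modeled in userspace roughly as below. This is only a sketch. struct demo_cmd, demo_offsetofend() and demo_copy_validate() are hypothetical names, and it is an assumption that ib_copy_validate_udata_in() performs the same three steps, since the helper itself is not part of this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout; not a real uAPI struct. */
struct demo_cmd {
	uint32_t comp_mask;
	uint32_t reserved;	/* last member of the original ABI */
	uint64_t new_field;	/* pretend this was added in a later revision */
};

/* Userspace analogue of offsetofend(): first byte past MEMBER. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Copy a caller-supplied buffer of inlen bytes into dst (dst_size bytes).
 * Returns 0 or a negative errno, mirroring the removed open-coded checks.
 */
static int demo_copy_validate(void *dst, size_t dst_size, size_t min_len,
			      const uint8_t *in, size_t inlen)
{
	size_t i;

	assert(min_len <= dst_size);

	/* Too short: the mandatory prefix is missing. */
	if (inlen < min_len)
		return -EINVAL;

	/* Longer than we understand: the unknown tail must be all zero. */
	for (i = dst_size; i < inlen; i++)
		if (in[i])
			return -EOPNOTSUPP;

	/* Zero-fill first so members the caller did not send read as 0. */
	memset(dst, 0, dst_size);
	memcpy(dst, in, inlen < dst_size ? inlen : dst_size);
	return 0;
}
```

With min_len taken from the last member of the original ABI, an old 8-byte request is accepted and new_field reads as zero, while a longer request is accepted only if every byte past sizeof(struct demo_cmd) is zero.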
* [PATCH 09/16] RDMA/hns: Use ib_copy_validate_udata_in()
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
Use the last struct member as of the commit that first added each
struct to the kernel.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/hns/hns_roce_cq.c | 16 +--------------
drivers/infiniband/hw/hns/hns_roce_main.c | 4 ++--
drivers/infiniband/hw/hns/hns_roce_qp.c | 8 ++------
drivers/infiniband/hw/hns/hns_roce_srq.c | 25 +++--------------------
4 files changed, 8 insertions(+), 45 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c
index 857a913326cd88..621568e114054b 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -350,20 +350,6 @@ static int verify_cq_create_attr(struct hns_roce_dev *hr_dev,
return 0;
}
-static int get_cq_ucmd(struct hns_roce_cq *hr_cq, struct ib_udata *udata,
- struct hns_roce_ib_create_cq *ucmd)
-{
- struct ib_device *ibdev = hr_cq->ib_cq.device;
- int ret;
-
- ret = ib_copy_from_udata(ucmd, udata, min(udata->inlen, sizeof(*ucmd)));
- if (ret) {
- ibdev_err(ibdev, "failed to copy CQ udata, ret = %d.\n", ret);
- return ret;
- }
-
- return 0;
-}
static void set_cq_param(struct hns_roce_cq *hr_cq, u32 cq_entries, int vector,
struct hns_roce_ib_create_cq *ucmd)
@@ -428,7 +414,7 @@ int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr,
goto err_out;
if (udata) {
- ret = get_cq_ucmd(hr_cq, udata, &ucmd);
+ ret = ib_copy_validate_udata_in(udata, ucmd, db_addr);
if (ret)
goto err_out;
}
diff --git a/drivers/infiniband/hw/hns/hns_roce_main.c b/drivers/infiniband/hw/hns/hns_roce_main.c
index 1148d732f94fbf..ec6fb3f1177941 100644
--- a/drivers/infiniband/hw/hns/hns_roce_main.c
+++ b/drivers/infiniband/hw/hns/hns_roce_main.c
@@ -36,6 +36,7 @@
#include <rdma/ib_smi.h>
#include <rdma/ib_user_verbs.h>
#include <rdma/ib_cache.h>
+#include <rdma/uverbs_ioctl.h>
#include "hns_roce_common.h"
#include "hns_roce_device.h"
#include "hns_roce_hem.h"
@@ -433,8 +434,7 @@ static int hns_roce_alloc_ucontext(struct ib_ucontext *uctx,
resp.qp_tab_size = hr_dev->caps.num_qps;
resp.srq_tab_size = hr_dev->caps.num_srqs;
- ret = ib_copy_from_udata(&ucmd, udata,
- min(udata->inlen, sizeof(ucmd)));
+ ret = ib_copy_validate_udata_in(udata, ucmd, reserved);
if (ret)
goto error_out;
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 6a2dff4bd2d0fc..3d6eb22cbcd940 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -1130,13 +1130,9 @@ static int set_qp_param(struct hns_roce_dev *hr_dev, struct hns_roce_qp *hr_qp,
}
if (udata) {
- ret = ib_copy_from_udata(ucmd, udata,
- min(udata->inlen, sizeof(*ucmd)));
- if (ret) {
- ibdev_err(ibdev,
- "failed to copy QP ucmd, ret = %d\n", ret);
+ ret = ib_copy_validate_udata_in(udata, *ucmd, reserved);
+ if (ret)
return ret;
- }
uctx = rdma_udata_to_drv_context(udata, struct hns_roce_ucontext,
ibucontext);
diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c
index 8a6efb6b9c9eba..b37a76587aa868 100644
--- a/drivers/infiniband/hw/hns/hns_roce_srq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
@@ -346,14 +346,9 @@ static int alloc_srq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
int ret;
if (udata) {
- ret = ib_copy_from_udata(&ucmd, udata,
- min(udata->inlen, sizeof(ucmd)));
- if (ret) {
- ibdev_err(&hr_dev->ib_dev,
- "failed to copy SRQ udata, ret = %d.\n",
- ret);
+ ret = ib_copy_validate_udata_in(udata, ucmd, que_addr);
+ if (ret)
return ret;
- }
}
ret = alloc_srq_idx(hr_dev, srq, udata, ucmd.que_addr);
@@ -387,20 +382,6 @@ static void free_srq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq)
free_srq_idx(hr_dev, srq);
}
-static int get_srq_ucmd(struct hns_roce_srq *srq, struct ib_udata *udata,
- struct hns_roce_ib_create_srq *ucmd)
-{
- struct ib_device *ibdev = srq->ibsrq.device;
- int ret;
-
- ret = ib_copy_from_udata(ucmd, udata, min(udata->inlen, sizeof(*ucmd)));
- if (ret) {
- ibdev_err(ibdev, "failed to copy SRQ udata, ret = %d.\n", ret);
- return ret;
- }
-
- return 0;
-}
static void free_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
struct ib_udata *udata)
@@ -430,7 +411,7 @@ static int alloc_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
int ret;
if (udata) {
- ret = get_srq_ucmd(srq, udata, &ucmd);
+ ret = ib_copy_validate_udata_in(udata, ucmd, que_addr);
if (ret)
return ret;
--
2.43.0
* [PATCH 04/16] RDMA: Use ib_copy_validate_udata_in() for implicit full structs
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
In all of these cases git blame shows that the entire current struct
was introduced at once, so the last member is the right choice.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/erdma/erdma_verbs.c | 6 ++--
.../infiniband/hw/ionic/ionic_controlpath.c | 6 ++--
drivers/infiniband/hw/mthca/mthca_provider.c | 27 +++++++++------
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 10 +++---
drivers/infiniband/hw/qedr/verbs.c | 34 ++++++-------------
drivers/infiniband/hw/usnic/usnic_ib_verbs.c | 2 +-
drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c | 6 ++--
drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c | 6 ++--
8 files changed, 45 insertions(+), 52 deletions(-)
diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
index 04136a0281aa4c..5523b4e151e1ff 100644
--- a/drivers/infiniband/hw/erdma/erdma_verbs.c
+++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
@@ -1039,8 +1039,7 @@ int erdma_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attrs,
qp->attrs.rq_size = roundup_pow_of_two(attrs->cap.max_recv_wr);
if (uctx) {
- ret = ib_copy_from_udata(&ureq, udata,
- min(sizeof(ureq), udata->inlen));
+ ret = ib_copy_validate_udata_in(udata, ureq, rsvd0);
if (ret)
goto err_out_xa;
@@ -1980,8 +1979,7 @@ int erdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct erdma_ureq_create_cq ureq;
struct erdma_uresp_create_cq uresp;
- ret = ib_copy_from_udata(&ureq, udata,
- min(udata->inlen, sizeof(ureq)));
+ ret = ib_copy_validate_udata_in(udata, ureq, rsvd0);
if (ret)
goto err_out_xa;
diff --git a/drivers/infiniband/hw/ionic/ionic_controlpath.c b/drivers/infiniband/hw/ionic/ionic_controlpath.c
index 4842931f5316ee..cbdb0ea7782a49 100644
--- a/drivers/infiniband/hw/ionic/ionic_controlpath.c
+++ b/drivers/infiniband/hw/ionic/ionic_controlpath.c
@@ -373,7 +373,7 @@ int ionic_alloc_ucontext(struct ib_ucontext *ibctx, struct ib_udata *udata)
phys_addr_t db_phys = 0;
int rc;
- rc = ib_copy_from_udata(&req, udata, sizeof(req));
+ rc = ib_copy_validate_udata_in(udata, req, rsvd);
if (rc)
return rc;
@@ -1223,7 +1223,7 @@ int ionic_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
int udma_idx = 0, rc;
if (udata) {
- rc = ib_copy_from_udata(&req, udata, sizeof(req));
+ rc = ib_copy_validate_udata_in(udata, req, rsvd);
if (rc)
return rc;
}
@@ -2152,7 +2152,7 @@ int ionic_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
int rc;
if (udata) {
- rc = ib_copy_from_udata(&req, udata, sizeof(req));
+ rc = ib_copy_validate_udata_in(udata, req, rsvd);
if (rc)
return rc;
} else {
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 6a0795332616dc..7467e3dff7ebb8 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -402,8 +402,9 @@ static int mthca_create_srq(struct ib_srq *ibsrq,
return -EOPNOTSUPP;
if (udata) {
- if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd)))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, db_page);
+ if (err)
+ return err;
err = mthca_map_user_db(to_mdev(ibsrq->device), &context->uar,
context->db_tab, ucmd.db_index,
@@ -472,8 +473,9 @@ static int mthca_create_qp(struct ib_qp *ibqp,
case IB_QPT_UD:
{
if (udata) {
- if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd)))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, rq_db_index);
+ if (err)
+ return err;
err = mthca_map_user_db(dev, &context->uar,
context->db_tab,
@@ -594,8 +596,9 @@ static int mthca_create_cq(struct ib_cq *ibcq,
return -EINVAL;
if (udata) {
- if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd)))
- return -EFAULT;
+ err = ib_copy_validate_udata_in(udata, ucmd, set_db_index);
+ if (err)
+ return err;
err = mthca_map_user_db(to_mdev(ibdev), &context->uar,
context->db_tab, ucmd.set_db_index,
@@ -720,10 +723,9 @@ static int mthca_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *uda
goto out;
lkey = cq->resize_buf->buf.mr.ibmr.lkey;
} else {
- if (ib_copy_from_udata(&ucmd, udata, sizeof ucmd)) {
- ret = -EFAULT;
+ ret = ib_copy_validate_udata_in(udata, ucmd, reserved);
+ if (ret)
goto out;
- }
lkey = ucmd.lkey;
}
@@ -851,8 +853,11 @@ static struct ib_mr *mthca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
}
++context->reg_mr_warned;
ucmd.mr_attrs = 0;
- } else if (ib_copy_from_udata(&ucmd, udata, sizeof ucmd))
- return ERR_PTR(-EFAULT);
+ } else {
+ err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+ if (err)
+ return ERR_PTR(err);
+ }
mr = kmalloc_obj(*mr);
if (!mr)
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 7383b67e172312..8b285fcc638701 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -983,8 +983,9 @@ int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
return -EOPNOTSUPP;
if (udata) {
- if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
- return -EFAULT;
+ status = ib_copy_validate_udata_in(udata, ureq, rsvd);
+ if (status)
+ return status;
} else
ureq.dpp_cq = 0;
@@ -1312,8 +1313,9 @@ int ocrdma_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attrs,
memset(&ureq, 0, sizeof(ureq));
if (udata) {
- if (ib_copy_from_udata(&ureq, udata, sizeof(ureq)))
- return -EFAULT;
+ status = ib_copy_validate_udata_in(udata, ureq, rsvd1);
+ if (status)
+ return status;
}
ocrdma_set_qp_init_params(qp, pd, attrs);
if (udata == NULL)
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 2fa9e07710d31f..42d20b35ff3fe0 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -273,12 +273,9 @@ int qedr_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata)
return -EFAULT;
if (udata->inlen) {
- rc = ib_copy_from_udata(&ureq, udata,
- min(sizeof(ureq), udata->inlen));
- if (rc) {
- DP_ERR(dev, "Problem copying data from user space\n");
- return -EFAULT;
- }
+ rc = ib_copy_validate_udata_in(udata, ureq, reserved);
+ if (rc)
+ return rc;
ctx->edpm_mode = !!(ureq.context_flags &
QEDR_ALLOC_UCTX_EDPM_MODE);
ctx->db_rec = !!(ureq.context_flags & QEDR_ALLOC_UCTX_DB_REC);
@@ -949,12 +946,9 @@ int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
db_offset = DB_ADDR_SHIFT(DQ_PWM_OFFSET_UCM_RDMA_CQ_CONS_32BIT);
if (udata) {
- if (ib_copy_from_udata(&ureq, udata, min(sizeof(ureq),
- udata->inlen))) {
- DP_ERR(dev,
- "create cq: problem copying data from user space\n");
- goto err0;
- }
+ rc = ib_copy_validate_udata_in(udata, ureq, len);
+ if (rc)
+ return rc;
if (!ureq.len) {
DP_ERR(dev,
@@ -1575,12 +1569,9 @@ int qedr_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *init_attr,
hw_srq->max_sges = init_attr->attr.max_sge;
if (udata) {
- if (ib_copy_from_udata(&ureq, udata, min(sizeof(ureq),
- udata->inlen))) {
- DP_ERR(dev,
- "create srq: problem copying data from user space\n");
- goto err0;
- }
+ rc = ib_copy_validate_udata_in(udata, ureq, srq_len);
+ if (rc)
+ return rc;
rc = qedr_init_srq_user_params(udata, srq, &ureq, 0);
if (rc)
@@ -1860,12 +1851,9 @@ static int qedr_create_user_qp(struct qedr_dev *dev,
}
if (udata) {
- rc = ib_copy_from_udata(&ureq, udata, min(sizeof(ureq),
- udata->inlen));
- if (rc) {
- DP_ERR(dev, "Problem copying data from user space\n");
+ rc = ib_copy_validate_udata_in(udata, ureq, rq_len);
+ if (rc)
return rc;
- }
}
if (qedr_qp_has_sq(qp)) {
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 16b269128f52d3..615de9c4209bf1 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -476,7 +476,7 @@ int usnic_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
if (init_attr->create_flags)
return -EOPNOTSUPP;
- err = ib_copy_from_udata(&cmd, udata, sizeof(cmd));
+ err = ib_copy_validate_udata_in(udata, cmd, spec);
if (err) {
usnic_err("%s: cannot copy udata for create_qp\n",
dev_name(&us_ibdev->ib_dev.dev));
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
index 98b2a0090bf2a1..16aab967a20308 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
@@ -49,6 +49,7 @@
#include <rdma/ib_addr.h>
#include <rdma/ib_smi.h>
#include <rdma/ib_user_verbs.h>
+#include <rdma/uverbs_ioctl.h>
#include "pvrdma.h"
@@ -252,10 +253,9 @@ int pvrdma_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
dev_dbg(&dev->pdev->dev,
"create queuepair from user space\n");
- if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
- ret = -EFAULT;
+ ret = ib_copy_validate_udata_in(udata, ucmd, qp_addr);
+ if (ret)
goto err_qp;
- }
/* Userspace supports qpn and qp handles? */
if (dev->dsr_version >= PVRDMA_QPHANDLE_VERSION &&
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
index bdc2703532c6cc..d31fb692fcaafb 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
@@ -49,6 +49,7 @@
#include <rdma/ib_addr.h>
#include <rdma/ib_smi.h>
#include <rdma/ib_user_verbs.h>
+#include <rdma/uverbs_ioctl.h>
#include "pvrdma.h"
@@ -141,10 +142,9 @@ int pvrdma_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *init_attr,
dev_dbg(&dev->pdev->dev,
"create shared receive queue from user space\n");
- if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
- ret = -EFAULT;
+ ret = ib_copy_validate_udata_in(udata, ucmd, reserved);
+ if (ret)
goto err_srq;
- }
srq->umem = ib_umem_get(ibsrq->device, ucmd.buf_addr, ucmd.buf_size, 0);
if (IS_ERR(srq->umem)) {
--
2.43.0
* [PATCH 10/16] RDMA/efa: Use ib_copy_validate_udata_in_cm()
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
Add the missed check for unsupported comp_mask bits.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 8d9357e2d513bb..22993273028433 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -1918,13 +1918,13 @@ int efa_alloc_ucontext(struct ib_ucontext *ibucontext, struct ib_udata *udata)
* it's fine if the driver does not know all request fields,
* we will ack input fields in our response.
*/
-
- err = ib_copy_from_udata(&cmd, udata,
- min(sizeof(cmd), udata->inlen));
- if (err) {
- ibdev_dbg(&dev->ibdev,
- "Cannot copy udata for alloc_ucontext\n");
- goto err_out;
+ if (udata->inlen) {
+ err = ib_copy_validate_udata_in_cm(
+ udata, cmd, comp_mask,
+ EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH |
+ EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR);
+ if (err)
+ goto err_out;
}
err = efa_user_comp_handshake(ibucontext, &cmd);
--
2.43.0
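The comp_mask check mentioned in the commit message above amounts to rejecting any bit the driver does not know about. A minimal userspace sketch follows. The DEMO_* flags and demo_check_comp_mask() are hypothetical, and the -EOPNOTSUPP return value is an assumption, chosen to match what the mlx5 modify_wq path returns for a non-zero comp_mask. What ib_copy_validate_udata_in_cm() actually returns is not shown in this message.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bits, standing in for a driver's uAPI flags. */
#define DEMO_COMP_TX_BATCH	(1u << 0)
#define DEMO_COMP_MIN_SQ_WR	(1u << 1)
#define DEMO_COMP_SUPPORTED	(DEMO_COMP_TX_BATCH | DEMO_COMP_MIN_SQ_WR)

static_assert((DEMO_COMP_TX_BATCH & DEMO_COMP_MIN_SQ_WR) == 0,
	      "feature bits must not overlap");

/* Return 0 if every requested bit is supported, else a negative errno. */
static int demo_check_comp_mask(uint32_t comp_mask, uint32_t supported)
{
	/* Any bit outside the supported set is a feature we cannot honor. */
	if (comp_mask & ~supported)
		return -EOPNOTSUPP;
	return 0;
}
```

Failing the call on unknown bits lets newer userspace probe for a feature and fall back, instead of silently running with a request the kernel ignored.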
* [PATCH 16/16] RDMA/hns: Remove the duplicate calls to ib_copy_validate_udata_in()
From: Jason Gunthorpe @ 2026-03-12 0:24 UTC (permalink / raw)
To: Abhijit Gangurde, Allen Hubbe,
Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
linux-hyperv, linux-rdma, Long Li, Michal Kalderon,
Michael Margolin, Nelson Escobar, Satish Kharat, Selvin Xavier,
Yossi Leybovich, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
Yishai Hadas, Zhu Yanjun
Cc: patches
In-Reply-To: <0-v1-2b86f54cda42+7d-rdma_udata_req_jgg@nvidia.com>
A udata should be read only once per ioctl, not multiple times.
Multiple reads make it unclear what the content is since userspace can
change it between the reads.
Lift the ib_copy_validate_udata_in() out of
alloc_srq_buf()/alloc_srq_db() and into hns_roce_create_srq().
Found by AI.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/hns/hns_roce_srq.c | 35 +++++++++++-------------
1 file changed, 16 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c
index 601f8cdfce96a3..cb848e8e6bbd76 100644
--- a/drivers/infiniband/hw/hns/hns_roce_srq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
@@ -340,22 +340,16 @@ static int set_srq_param(struct hns_roce_srq *srq,
}
static int alloc_srq_buf(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
- struct ib_udata *udata)
+ struct ib_udata *udata,
+ struct hns_roce_ib_create_srq *ucmd)
{
- struct hns_roce_ib_create_srq ucmd = {};
int ret;
- if (udata) {
- ret = ib_copy_validate_udata_in(udata, ucmd, que_addr);
- if (ret)
- return ret;
- }
-
- ret = alloc_srq_idx(hr_dev, srq, udata, ucmd.que_addr);
+ ret = alloc_srq_idx(hr_dev, srq, udata, ucmd->que_addr);
if (ret)
return ret;
- ret = alloc_srq_wqe_buf(hr_dev, srq, udata, ucmd.buf_addr);
+ ret = alloc_srq_wqe_buf(hr_dev, srq, udata, ucmd->buf_addr);
if (ret)
goto err_idx;
@@ -404,22 +398,18 @@ static void free_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
static int alloc_srq_db(struct hns_roce_dev *hr_dev, struct hns_roce_srq *srq,
struct ib_udata *udata,
+ struct hns_roce_ib_create_srq *ucmd,
struct hns_roce_ib_create_srq_resp *resp)
{
- struct hns_roce_ib_create_srq ucmd;
struct hns_roce_ucontext *uctx;
int ret;
if (udata) {
- ret = ib_copy_validate_udata_in(udata, ucmd, que_addr);
- if (ret)
- return ret;
-
if ((hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_SRQ_RECORD_DB) &&
- (ucmd.req_cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB)) {
+ (ucmd->req_cap_flags & HNS_ROCE_SRQ_CAP_RECORD_DB)) {
uctx = rdma_udata_to_drv_context(udata,
struct hns_roce_ucontext, ibucontext);
- ret = hns_roce_db_map_user(uctx, ucmd.db_addr,
+ ret = hns_roce_db_map_user(uctx, ucmd->db_addr,
&srq->rdb);
if (ret)
return ret;
@@ -448,6 +438,7 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
struct hns_roce_dev *hr_dev = to_hr_dev(ib_srq->device);
struct hns_roce_ib_create_srq_resp resp = {};
struct hns_roce_srq *srq = to_hr_srq(ib_srq);
+ struct hns_roce_ib_create_srq ucmd = {};
int ret;
mutex_init(&srq->mutex);
@@ -457,11 +448,17 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
if (ret)
goto err_out;
- ret = alloc_srq_buf(hr_dev, srq, udata);
+ if (udata) {
+ ret = ib_copy_validate_udata_in(udata, ucmd, que_addr);
+ if (ret)
+ goto err_out;
+ }
+
+ ret = alloc_srq_buf(hr_dev, srq, udata, &ucmd);
if (ret)
goto err_out;
- ret = alloc_srq_db(hr_dev, srq, udata, &resp);
+ ret = alloc_srq_db(hr_dev, srq, udata, &ucmd, &resp);
if (ret)
goto err_srq_buf;
--
2.43.0
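The read-once rule from the commit message above can be illustrated in userspace. In this sketch a global stands in for caller-controlled memory and a helper simulates userspace rewriting it between two steps. All demo_* names are hypothetical; only the field names are borrowed from the diff for readability. The point is that helpers receive a pointer to a single snapshot, so a late write cannot make them disagree about the request.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request; field names borrowed from the diff. */
struct demo_create_srq {
	uint64_t buf_addr;
	uint64_t que_addr;
	uint64_t db_addr;
};

/* Stand-in for user memory, which the caller may rewrite at any time. */
static struct demo_create_srq demo_user_mem;

/* Simulates userspace racing with the kernel between two helper calls. */
static void demo_racy_writer(void)
{
	demo_user_mem.db_addr = 0xbad;
}

/* Helpers see only the snapshot; they never touch demo_user_mem. */
static uint64_t demo_alloc_buf(const struct demo_create_srq *ucmd)
{
	return ucmd->buf_addr;
}

static uint64_t demo_alloc_db(const struct demo_create_srq *ucmd)
{
	return ucmd->db_addr;
}

static void demo_create(uint64_t *buf, uint64_t *db)
{
	struct demo_create_srq ucmd;

	assert(buf && db);

	/* The only read of caller-controlled memory. */
	memcpy(&ucmd, &demo_user_mem, sizeof(ucmd));

	*buf = demo_alloc_buf(&ucmd);
	demo_racy_writer();		/* too late to affect ucmd */
	*db = demo_alloc_db(&ucmd);
}
```

Had each helper fetched the request itself, as alloc_srq_buf() and alloc_srq_db() did before this patch, the second helper would have seen the rewritten db_addr.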