Linux-HyperV List
* RE: [PATCH v5 1/2] drm/hyperv: validate resolution_count and fix WIN8 fallback
From: Michael Kelley @ 2026-05-23 15:16 UTC (permalink / raw)
  To: Berkant Koc, Saurabh Sengar, Dexuan Cui, Long Li
  Cc: linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Michael Kelley, Thomas Zimmermann, Maarten Lankhorst,
	Maxime Ripard, Deepak Rawat
In-Reply-To: <6945b22419c7d404b4954a113de2ac9c900dba93.1779542874.git.me@berkoc.com>

From: Berkant Koc <me@berkoc.com> Sent: Tuesday, May 19, 2026 1:08 PM
> 
> A SYNTHVID_RESOLUTION_RESPONSE with resolution_count > 64 walks past
> the supported_resolution[SYNTHVID_MAX_RESOLUTION_COUNT] array in the
> parse loop. Bound resolution_count against the array size, folded
> into the existing zero-check.
> 
> When the WIN10 resolution probe fails, the caller in
> hyperv_connect_vsp() left hv->screen_*_max / preferred_* unpopulated,
> which sets mode_config.max_width / max_height to 0 and makes
> drm_internal_framebuffer_create() reject every userspace framebuffer
> with -EINVAL. The pre-WIN10 branch had the same gap for
> preferred_width / preferred_height. Use a single post-probe fallback
> guarded by screen_width_max == 0 so both paths converge on the WIN8
> defaults.
> 
> Signed-off-by: Berkant Koc <me@berkoc.com>
> Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline
> Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
> Cc: stable@vger.kernel.org # 5.14+
> Reviewed-by: Michael Kelley <mhklinux@outlook.com>

I ran a basic smoke test on my local Hyper-V instance. I can
confirm that no error messages or failures were generated
in the "good" case where the resolution_count from Hyper-V is
valid. I did not simulate a bogus resolution_count to verify
that it is detected.

Tested-by: Michael Kelley <mhklinux@outlook.com>  
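
For readers who want to exercise the "bad" case outside a VM, the following
is a minimal userspace sketch, not driver code. It models the bound on
resolution_count and the single post-probe fallback described in the commit
message. The struct layouts, the helper names parse_resolutions() and
probe_with_fallback(), the 1600x1200 default, and the default-index check
are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the driver's constants; the default size is an assumption. */
#define MAX_RESOLUTION_COUNT 64
#define DEFAULT_WIDTH 1600
#define DEFAULT_HEIGHT 1200

struct screen_info {
	uint16_t width;
	uint16_t height;
};

/* Simplified, hypothetical layout of a resolution response. */
struct resolution_resp {
	uint8_t resolution_count;
	uint8_t default_resolution_index;
	struct screen_info supported_resolution[MAX_RESOLUTION_COUNT];
};

struct display_limits {
	uint32_t screen_width_max;
	uint32_t screen_height_max;
	uint32_t preferred_width;
	uint32_t preferred_height;
};

/* Returns 0 and fills *lim, or -ENODEV without touching *lim. */
static int parse_resolutions(const struct resolution_resp *resp,
			     struct display_limits *lim)
{
	const struct screen_info *pref;
	size_t i;

	assert(resp != NULL && lim != NULL);

	/* Reject a count that is zero or larger than the array. */
	if (resp->resolution_count == 0 ||
	    resp->resolution_count > MAX_RESOLUTION_COUNT)
		return -ENODEV;
	/* Sketch-only assumption: the default index must be in range too. */
	if (resp->default_resolution_index >= resp->resolution_count)
		return -ENODEV;

	for (i = 0; i < resp->resolution_count; i++) {
		const struct screen_info *r = &resp->supported_resolution[i];

		if (r->width > lim->screen_width_max)
			lim->screen_width_max = r->width;
		if (r->height > lim->screen_height_max)
			lim->screen_height_max = r->height;
	}

	pref = &resp->supported_resolution[resp->default_resolution_index];
	lim->preferred_width = pref->width;
	lim->preferred_height = pref->height;
	return 0;
}

/* One fallback, keyed on screen_width_max == 0, covers every failed probe. */
static void probe_with_fallback(const struct resolution_resp *resp,
				struct display_limits *lim)
{
	(void)parse_resolutions(resp, lim);

	if (!lim->screen_width_max) {
		lim->screen_width_max = DEFAULT_WIDTH;
		lim->screen_height_max = DEFAULT_HEIGHT;
		lim->preferred_width = DEFAULT_WIDTH;
		lim->preferred_height = DEFAULT_HEIGHT;
	}
}
```

Feeding this a response with resolution_count set to 65 makes
parse_resolutions() return -ENODEV, and probe_with_fallback() then leaves
the limits at the defaults instead of zero.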

> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> index 051ecc526..c3d0ff229 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> @@ -391,8 +391,11 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
>  		return -ETIMEDOUT;
>  	}
> 
> -	if (msg->resolution_resp.resolution_count == 0) {
> -		drm_err(dev, "No supported resolutions\n");
> +	if (msg->resolution_resp.resolution_count == 0 ||
> +	    msg->resolution_resp.resolution_count >
> +	    SYNTHVID_MAX_RESOLUTION_COUNT) {
> +		drm_err(dev, "Invalid resolution count: %d\n",
> +			msg->resolution_resp.resolution_count);
>  		return -ENODEV;
>  	}
> 
> @@ -508,9 +511,13 @@ int hyperv_connect_vsp(struct hv_device *hdev)
>  		ret = hyperv_get_supported_resolution(hdev);
>  		if (ret)
>  			drm_err(dev, "Failed to get supported resolution from host, use default\n");
> -	} else {
> +	}
> +
> +	if (!hv->screen_width_max) {
>  		hv->screen_width_max = SYNTHVID_WIDTH_WIN8;
>  		hv->screen_height_max = SYNTHVID_HEIGHT_WIN8;
> +		hv->preferred_width = SYNTHVID_WIDTH_WIN8;
> +		hv->preferred_height = SYNTHVID_HEIGHT_WIN8;
>  	}
> 
>  	hv->mmio_megabytes = hdev->channel->offermsg.offer.mmio_megabytes;
> --
> 2.47.3
> 



* RE: [PATCH v5 2/2] drm/hyperv: validate VMBus packet size in receive callback
From: Michael Kelley @ 2026-05-23 15:17 UTC (permalink / raw)
  To: Berkant Koc, Saurabh Sengar, Dexuan Cui, Long Li
  Cc: linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Michael Kelley, Thomas Zimmermann, Maarten Lankhorst,
	Maxime Ripard, Deepak Rawat
In-Reply-To: <8200dbc199c7a9b75ac7e8af6c748d2189b5ebd5.1779542874.git.me@berkoc.com>

From: Berkant Koc <me@berkoc.com> Sent: Saturday, May 23, 2026 6:28 AM
> 
> hyperv_receive_sub() reads msg->vid_hdr.type and dispatches into one
> of four message-type branches without knowing how many bytes the host
> wrote into hv->recv_buf. The completion path then runs
> memcpy(hv->init_buf, msg, VMBUS_MAX_PACKET_SIZE), so the consumer that
> wakes on wait_for_completion_timeout() can read up to 16 KiB of
> residue from a prior message as if it were the response payload.
> 
> Pass bytes_recvd into hyperv_receive_sub() and reject any packet that
> does not cover the pipe + synthvid header. A single switch on
> msg->vid_hdr.type then computes the type-specific payload size: the
> three completion-driving types (SYNTHVID_VERSION_RESPONSE,
> SYNTHVID_RESOLUTION_RESPONSE, SYNTHVID_VRAM_LOCATION_ACK) fall through
> to a shared exit that requires that size before memcpy/complete, while
> SYNTHVID_FEATURE_CHANGE validates its own payload and returns before
> reading is_dirt_needed. Unknown types are dropped.
> 
> SYNTHVID_RESOLUTION_RESPONSE is variable length: the host fills
> resolution_count entries, not the full SYNTHVID_MAX_RESOLUTION_COUNT
> array. Validate the fixed prefix first so resolution_count can be
> read, bound it against the array, then require only the count-sized
> array, so the shorter responses the host actually sends are accepted.
> 
> Only run the sub-handler when vmbus_recvpacket() returned success. The
> memcpy length is bytes_recvd, which is bounded by VMBUS_MAX_PACKET_SIZE
> only on a successful receive; on -ENOBUFS vmbus_recvpacket() instead
> reports the required length, which can exceed hv->recv_buf, so copying
> bytes_recvd would read and write past the 16 KiB buffers. Gating on the
> success return keeps the copy bounded. The nonzero-return path is itself
> a malformed-message case and is now logged rather than silently skipped;
> channel recovery is not attempted.
> 
> Rejected packets are reported via drm_err_ratelimited() rather than
> silently dropped, matching the CoCo-hardened pattern in
> hv_kvp_onchannelcallback().
> 
> Fixes: 76c56a5affeb ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
> Cc: stable@vger.kernel.org # 5.14+
> Signed-off-by: Berkant Koc <me@berkoc.com>
> Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline

This looks good now. The error checking and reporting are robust
and the code is well-structured. Thanks for putting up with my
sometimes picky feedback. :-)

I also ran a basic smoke test on my local Hyper-V instance. I
can confirm that no error messages or failures were generated
in the "good" case where the messages from Hyper-V are
properly formatted and sized. I did not simulate bad messages
to verify that they are detected.

Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>  
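
As an illustration of how badly sized messages could be simulated in
userspace, here is a minimal sketch of the variable-length check the commit
message describes for SYNTHVID_RESOLUTION_RESPONSE: header first, then the
fixed prefix, then the bound on resolution_count, then only the count-sized
array. The struct layouts and the name resolution_resp_size_ok() are
hypothetical simplifications, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESOLUTION_COUNT 64

/* Simplified, hypothetical layouts; the real structures differ in detail. */
struct msg_hdr {
	uint32_t type;
	uint32_t size;
};

struct screen_info {
	uint16_t width;
	uint16_t height;
};

struct resolution_resp {
	uint8_t edid_block;
	uint8_t resolution_count;
	uint8_t default_resolution_index;
	uint8_t is_standard;
	struct screen_info supported_resolution[MAX_RESOLUTION_COUNT];
};

struct msg {
	struct msg_hdr hdr;
	struct resolution_resp resolution_resp;
};

/*
 * True when bytes_recvd covers the header, the fixed prefix, and exactly
 * as many array entries as resolution_count announces.
 */
static bool resolution_resp_size_ok(const struct msg *m, size_t bytes_recvd)
{
	size_t need = sizeof(struct msg_hdr);

	assert(m != NULL);

	/* Step 1: the header must be present before anything is read. */
	if (bytes_recvd < need)
		return false;

	/* Step 2: the fixed prefix must be present to read the count. */
	need += offsetof(struct resolution_resp, supported_resolution);
	if (bytes_recvd < need)
		return false;

	/* Step 3: the count must fit the array. */
	if (m->resolution_resp.resolution_count > MAX_RESOLUTION_COUNT)
		return false;

	/* Step 4: require only the count-sized array, not the full one. */
	need += (size_t)m->resolution_resp.resolution_count *
		sizeof(struct screen_info);
	return bytes_recvd >= need;
}
```

With these simplified layouts, a response announcing three resolutions
needs 8 + 4 + 3 * 4 = 24 bytes, so a 23-byte packet is rejected while the
short but complete 24-byte packet is accepted.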

> ---
>  drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 100 +++++++++++++++++++---
>  1 file changed, 87 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> index c3d0ff229..4e6f703a1 100644
> --- a/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> +++ b/drivers/gpu/drm/hyperv/hyperv_drm_proto.c
> @@ -420,30 +420,92 @@ static int hyperv_get_supported_resolution(struct hv_device *hdev)
>  	return 0;
>  }
> 
> -static void hyperv_receive_sub(struct hv_device *hdev)
> +static void hyperv_receive_sub(struct hv_device *hdev, u32 bytes_recvd)
>  {
>  	struct hyperv_drm_device *hv = hv_get_drvdata(hdev);
>  	struct synthvid_msg *msg;
> +	size_t hdr_size;
> +	size_t need;
> 
>  	if (!hv)
>  		return;
> 
> -	msg = (struct synthvid_msg *)hv->recv_buf;
> -
> -	/* Complete the wait event */
> -	if (msg->vid_hdr.type == SYNTHVID_VERSION_RESPONSE ||
> -	    msg->vid_hdr.type == SYNTHVID_RESOLUTION_RESPONSE ||
> -	    msg->vid_hdr.type == SYNTHVID_VRAM_LOCATION_ACK) {
> -		memcpy(hv->init_buf, msg, VMBUS_MAX_PACKET_SIZE);
> -		complete(&hv->wait);
> +	hdr_size = sizeof(struct pipe_msg_hdr) +
> +		   sizeof(struct synthvid_msg_hdr);
> +	if (bytes_recvd < hdr_size) {
> +		drm_err_ratelimited(&hv->dev,
> +				    "synthvid packet too small for header: %u\n",
> +				    bytes_recvd);
>  		return;
>  	}
> 
> -	if (msg->vid_hdr.type == SYNTHVID_FEATURE_CHANGE) {
> +	msg = (struct synthvid_msg *)hv->recv_buf;
> +	need = hdr_size;
> +
> +	switch (msg->vid_hdr.type) {
> +	case SYNTHVID_VERSION_RESPONSE:
> +		need += sizeof(struct synthvid_version_resp);
> +		break;
> +	case SYNTHVID_RESOLUTION_RESPONSE:
> +		/*
> +		 * The resolution response is variable length: the host
> +		 * fills resolution_count entries, not the full
> +		 * SYNTHVID_MAX_RESOLUTION_COUNT array. Require the fixed
> +		 * prefix first so resolution_count can be read, then
> +		 * demand exactly the count-sized array.
> +		 */
> +		need += offsetof(struct synthvid_supported_resolution_resp,
> +				 supported_resolution);
> +		if (bytes_recvd < need)
> +			break;
> +		if (msg->resolution_resp.resolution_count >
> +		    SYNTHVID_MAX_RESOLUTION_COUNT) {
> +			drm_err_ratelimited(&hv->dev,
> +					    "synthvid resolution count too large: %u\n",
> +					    msg->resolution_resp.resolution_count);
> +			return;
> +		}
> +		need += msg->resolution_resp.resolution_count *
> +			sizeof(struct hvd_screen_info);
> +		break;
> +	case SYNTHVID_VRAM_LOCATION_ACK:
> +		need += sizeof(struct synthvid_vram_location_ack);
> +		break;
> +	case SYNTHVID_FEATURE_CHANGE:
> +		/*
> +		 * Not a completion-driving message: validate its own payload
> +		 * and consume it here rather than falling through to the
> +		 * memcpy/complete shared by the wait-event responses.
> +		 */
> +		if (bytes_recvd < need +
> +		    sizeof(struct synthvid_feature_change)) {
> +			drm_err_ratelimited(&hv->dev,
> +					    "synthvid feature change packet too small: %u\n",
> +					    bytes_recvd);
> +			return;
> +		}
>  		hv->dirt_needed = msg->feature_chg.is_dirt_needed;
>  		if (hv->dirt_needed)
>  			hyperv_hide_hw_ptr(hv->hdev);
> +		return;
> +	default:
> +		return;
> +	}
> +
> +	/*
> +	 * Shared completion path for the wait-event responses
> +	 * (VERSION_RESPONSE, RESOLUTION_RESPONSE, VRAM_LOCATION_ACK):
> +	 * require the type-specific payload before handing the buffer to
> +	 * the waiter.
> +	 */
> +	if (bytes_recvd < need) {
> +		drm_err_ratelimited(&hv->dev,
> +				    "synthvid packet too small for type %u: %u < %zu\n",
> +				    msg->vid_hdr.type, bytes_recvd, need);
> +		return;
>  	}
> +	memcpy(hv->init_buf, msg, bytes_recvd);
> +	complete(&hv->wait);
>  }
> 
>  static void hyperv_receive(void *ctx)
> @@ -464,9 +526,21 @@ static void hyperv_receive(void *ctx)
>  		ret = vmbus_recvpacket(hdev->channel, recv_buf,
>  				       VMBUS_MAX_PACKET_SIZE,
>  				       &bytes_recvd, &req_id);
> -		if (bytes_recvd > 0 &&
> -		    recv_buf->pipe_hdr.type == PIPE_MSG_DATA)
> -			hyperv_receive_sub(hdev);
> +		if (ret) {
> +			/*
> +			 * A nonzero return (e.g. -ENOBUFS for an oversized
> +			 * packet) is itself a malformed message: bytes_recvd
> +			 * then reports the required length rather than a copied
> +			 * payload, so it must not be forwarded to the
> +			 * sub-handler. Channel recovery is not attempted.
> +			 */
> +			drm_err_ratelimited(&hv->dev,
> +					    "vmbus_recvpacket failed: %d (need %u)\n",
> +					    ret, bytes_recvd);
> +		} else if (bytes_recvd > 0 &&
> +			   recv_buf->pipe_hdr.type == PIPE_MSG_DATA) {
> +			hyperv_receive_sub(hdev, bytes_recvd);
> +		}
>  	} while (bytes_recvd > 0 && ret == 0);
>  }
> 
> --
> 2.47.3
> 
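
The reason for gating the sub-handler on the return value can be shown with
a small userspace model. This is a sketch under stated assumptions:
mock_recvpacket() is a hypothetical stand-in that follows the contract
described in the commit message (on -ENOBUFS the length output holds the
required size, and nothing is copied), and BUF_SIZE stands in for the
16 KiB VMBUS_MAX_PACKET_SIZE.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 64 /* stand-in for the 16 KiB receive buffer */

/*
 * Hypothetical receive primitive: on success it copies the packet and
 * reports its length; when the packet does not fit it copies nothing,
 * reports the required length, and returns -ENOBUFS.
 */
static int mock_recvpacket(const uint8_t *pkt, uint32_t pkt_len,
			   void *buf, uint32_t buf_len, uint32_t *actual_len)
{
	*actual_len = pkt_len;
	if (pkt_len > buf_len)
		return -ENOBUFS;
	memcpy(buf, pkt, pkt_len);
	return 0;
}

/* Returns the number of bytes handed to the consumer, 0 if dropped. */
static uint32_t receive_one(const uint8_t *pkt, uint32_t pkt_len,
			    uint8_t *recv_buf, uint8_t *init_buf)
{
	uint32_t bytes_recvd = 0;
	int ret;

	assert(pkt != NULL && recv_buf != NULL && init_buf != NULL);

	ret = mock_recvpacket(pkt, pkt_len, recv_buf, BUF_SIZE, &bytes_recvd);
	if (ret) {
		/*
		 * bytes_recvd is a required length here, not a copied size,
		 * so using it as a memcpy length would overrun both buffers.
		 */
		return 0;
	}

	/* On success bytes_recvd is bounded by BUF_SIZE. */
	memcpy(init_buf, recv_buf, bytes_recvd);
	return bytes_recvd;
}
```

In this model an oversized packet never reaches the copy, which is the
invariant the success check restores.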



* Re: [PATCH 1/1] x86/hyperv: Refactor hv_smp_prepare_cpus()
From: kernel test robot @ 2026-05-25  7:37 UTC (permalink / raw)
  To: Michael Kelley, kys, haiyangz, wei.liu, decui, longli, tglx,
	mingo, bp, dave.hansen, x86, hpa, linux-hyperv, linux-kernel,
	linux-arch
  Cc: oe-kbuild-all
In-Reply-To: <20260521192336.99623-1-mhklkml@zohomail.com>

Hi Michael,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/master]
[also build test ERROR on linus/master v7.1-rc5 next-20260522]
[cannot apply to tip/x86/core arnd-asm-generic/master tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Michael-Kelley/x86-hyperv-Refactor-hv_smp_prepare_cpus/20260522-032610
base:   tip/master
patch link:    https://lore.kernel.org/r/20260521192336.99623-1-mhklkml%40zohomail.com
patch subject: [PATCH 1/1] x86/hyperv: Refactor hv_smp_prepare_cpus()
config: x86_64-buildonly-randconfig-006-20260525 (https://download.01.org/0day-ci/archive/20260525/202605251528.eVtKHPbX-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260525/202605251528.eVtKHPbX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202605251528.eVtKHPbX-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/hv/hv_proc.c: In function 'hv_smp_prep_cpus':
>> drivers/hv/hv_proc.c:303:69: error: implicit declaration of function 'cpu_physical_id' [-Wimplicit-function-declaration]
     303 |                 ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));
         |                                                                     ^~~~~~~~~~~~~~~

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for MFD_STMFX
   Depends on [n]: HAS_IOMEM [=y] && I2C [=y] && OF [=n]
   Selected by [y]:
   - PINCTRL_STMFX [=y] && PINCTRL [=y] && I2C [=y] && HAS_IOMEM [=y]


vim +/cpu_physical_id +303 drivers/hv/hv_proc.c

   290	
   291	void hv_smp_prep_cpus(void)
   292	{
   293	#ifdef CONFIG_X86_64
   294		int i, ret;
   295	
   296		/* If AP LPs exist, we are in a kexec'd kernel and VPs already exist */
   297		if (num_present_cpus() == 1 || hv_lp_exists(1))
   298			return;
   299	
   300		for_each_present_cpu(i) {
   301			if (i == 0)
   302				continue;
 > 303			ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH net v2 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure
From: Dipayaan Roy @ 2026-05-25  8:01 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
	john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
In-Reply-To: <20260522233555.1099342-2-dipayanroy@linux.microsoft.com>

On Fri, May 22, 2026 at 04:33:12PM -0700, Dipayaan Roy wrote:
> When queue allocation fails partway through, the error cleanup frees
> and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as
> mana_remove(), mana_change_mtu() recovery, and internal error handling
> in mana_alloc_queues() can subsequently call into functions that
> dereference these pointers without NULL checks:
> 
> - mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer
>   dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26).
> - mana_destroy_vport() iterates apc->rxqs without a NULL check.
> - mana_fence_rqs() iterates apc->rxqs without a NULL check.
> - mana_dealloc_queues() iterates apc->tx_qp without a NULL check.
> 
> Add NULL guards for apc->rxqs in mana_fence_rqs(),
> mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a
> NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue
> draining when TX queues were never allocated or already freed.
> 
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
> 
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Hi,

I will send a v3 to fix the Fixes tag issue highlighted in:
https://netdev-ctrl.bots.linux.dev/logs/build/1099669/14590738/verify_fixes/summary

> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 70 +++++++++++--------
>  1 file changed, 41 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9afc786b297a..0582803907a8 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1727,6 +1727,9 @@ static void mana_fence_rqs(struct mana_port_context *apc)
>  	struct mana_rxq *rxq;
>  	int err;
>  
> +	if (!apc->rxqs)
> +		return;
> +
>  	for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
>  		rxq = apc->rxqs[rxq_idx];
>  		err = mana_fence_rq(apc, rxq);
> @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc)
>  	struct mana_rxq *rxq;
>  	u32 rxq_idx;
>  
> -	for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> -		rxq = apc->rxqs[rxq_idx];
> -		if (!rxq)
> -			continue;
> +	if (apc->rxqs) {
>  
> -		mana_destroy_rxq(apc, rxq, true);
> -		apc->rxqs[rxq_idx] = NULL;
> +		for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> +			rxq = apc->rxqs[rxq_idx];
> +			if (!rxq)
> +				continue;
> +
> +			mana_destroy_rxq(apc, rxq, true);
> +			apc->rxqs[rxq_idx] = NULL;
> +		}
>  	}
>  
>  	mana_destroy_txq(apc);
> @@ -3269,7 +3275,8 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  	if (apc->port_is_up)
>  		return -EINVAL;
>  
> -	mana_chn_setxdp(apc, NULL);
> +	if (apc->rxqs)
> +		mana_chn_setxdp(apc, NULL);
>  
>  	if (gd->gdma_context->is_pf && !apc->ac->bm_hostmode)
>  		mana_pf_deregister_filter(apc);
> @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  	 * number of queues.
>  	 */
>  
> -	for (i = 0; i < apc->num_queues; i++) {
> -		txq = &apc->tx_qp[i].txq;
> -		tsleep = 1000;
> -		while (atomic_read(&txq->pending_sends) > 0 &&
> -		       time_before(jiffies, timeout)) {
> -			usleep_range(tsleep, tsleep + 1000);
> -			tsleep <<= 1;
> -		}
> -		if (atomic_read(&txq->pending_sends)) {
> -			err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
> -			if (err) {
> -				netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
> -					   err, atomic_read(&txq->pending_sends),
> -					   txq->gdma_txq_id);
> +	if (apc->tx_qp) {
> +		for (i = 0; i < apc->num_queues; i++) {
> +			txq = &apc->tx_qp[i].txq;
> +			tsleep = 1000;
> +			while (atomic_read(&txq->pending_sends) > 0 &&
> +			       time_before(jiffies, timeout)) {
> +				usleep_range(tsleep, tsleep + 1000);
> +				tsleep <<= 1;
> +			}
> +			if (atomic_read(&txq->pending_sends)) {
> +				err =
> +				    pcie_flr(to_pci_dev(gd->gdma_context->dev));
> +				if (err) {
> +					netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
> +						   err,
> +					    atomic_read(&txq->pending_sends),
> +					    txq->gdma_txq_id);
> +				}
> +				break;
>  			}
> -			break;
>  		}
> -	}
>  
> -	for (i = 0; i < apc->num_queues; i++) {
> -		txq = &apc->tx_qp[i].txq;
> -		while ((skb = skb_dequeue(&txq->pending_skbs))) {
> -			mana_unmap_skb(skb, apc);
> -			dev_kfree_skb_any(skb);
> +		for (i = 0; i < apc->num_queues; i++) {
> +			txq = &apc->tx_qp[i].txq;
> +			while ((skb = skb_dequeue(&txq->pending_skbs))) {
> +				mana_unmap_skb(skb, apc);
> +				dev_kfree_skb_any(skb);
> +			}
> +			atomic_set(&txq->pending_sends, 0);
>  		}
> -		atomic_set(&txq->pending_sends, 0);
>  	}
> +
>  	/* We're 100% sure the queues can no longer be woken up, because
>  	 * we're sure now mana_poll_tx_cq() can't be running.
>  	 */
> -- 
> 2.43.0
> 


* [PATCH net v3 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure
From: Dipayaan Roy @ 2026-05-25  8:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
	john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov,
	pavan.chebbi
In-Reply-To: <20260525081129.1230035-1-dipayanroy@linux.microsoft.com>

When queue allocation fails partway through, the error cleanup frees
and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as
mana_remove(), mana_change_mtu() recovery, and internal error handling
in mana_alloc_queues() can subsequently call into functions that
dereference these pointers without NULL checks:

- mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer
  dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26).
- mana_destroy_vport() iterates apc->rxqs without a NULL check.
- mana_fence_rqs() iterates apc->rxqs without a NULL check.
- mana_dealloc_queues() iterates apc->tx_qp without a NULL check.

Add NULL guards for apc->rxqs in mana_fence_rqs(),
mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a
NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue
draining when TX queues were never allocated or already freed.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 70 +++++++++++--------
 1 file changed, 41 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 9afc786b297a..0582803907a8 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1727,6 +1727,9 @@ static void mana_fence_rqs(struct mana_port_context *apc)
 	struct mana_rxq *rxq;
 	int err;
 
+	if (!apc->rxqs)
+		return;
+
 	for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
 		rxq = apc->rxqs[rxq_idx];
 		err = mana_fence_rq(apc, rxq);
@@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc)
 	struct mana_rxq *rxq;
 	u32 rxq_idx;
 
-	for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
-		rxq = apc->rxqs[rxq_idx];
-		if (!rxq)
-			continue;
+	if (apc->rxqs) {
 
-		mana_destroy_rxq(apc, rxq, true);
-		apc->rxqs[rxq_idx] = NULL;
+		for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
+			rxq = apc->rxqs[rxq_idx];
+			if (!rxq)
+				continue;
+
+			mana_destroy_rxq(apc, rxq, true);
+			apc->rxqs[rxq_idx] = NULL;
+		}
 	}
 
 	mana_destroy_txq(apc);
@@ -3269,7 +3275,8 @@ static int mana_dealloc_queues(struct net_device *ndev)
 	if (apc->port_is_up)
 		return -EINVAL;
 
-	mana_chn_setxdp(apc, NULL);
+	if (apc->rxqs)
+		mana_chn_setxdp(apc, NULL);
 
 	if (gd->gdma_context->is_pf && !apc->ac->bm_hostmode)
 		mana_pf_deregister_filter(apc);
@@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev)
 	 * number of queues.
 	 */
 
-	for (i = 0; i < apc->num_queues; i++) {
-		txq = &apc->tx_qp[i].txq;
-		tsleep = 1000;
-		while (atomic_read(&txq->pending_sends) > 0 &&
-		       time_before(jiffies, timeout)) {
-			usleep_range(tsleep, tsleep + 1000);
-			tsleep <<= 1;
-		}
-		if (atomic_read(&txq->pending_sends)) {
-			err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
-			if (err) {
-				netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
-					   err, atomic_read(&txq->pending_sends),
-					   txq->gdma_txq_id);
+	if (apc->tx_qp) {
+		for (i = 0; i < apc->num_queues; i++) {
+			txq = &apc->tx_qp[i].txq;
+			tsleep = 1000;
+			while (atomic_read(&txq->pending_sends) > 0 &&
+			       time_before(jiffies, timeout)) {
+				usleep_range(tsleep, tsleep + 1000);
+				tsleep <<= 1;
+			}
+			if (atomic_read(&txq->pending_sends)) {
+				err =
+				    pcie_flr(to_pci_dev(gd->gdma_context->dev));
+				if (err) {
+					netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n",
+						   err,
+					    atomic_read(&txq->pending_sends),
+					    txq->gdma_txq_id);
+				}
+				break;
 			}
-			break;
 		}
-	}
 
-	for (i = 0; i < apc->num_queues; i++) {
-		txq = &apc->tx_qp[i].txq;
-		while ((skb = skb_dequeue(&txq->pending_skbs))) {
-			mana_unmap_skb(skb, apc);
-			dev_kfree_skb_any(skb);
+		for (i = 0; i < apc->num_queues; i++) {
+			txq = &apc->tx_qp[i].txq;
+			while ((skb = skb_dequeue(&txq->pending_skbs))) {
+				mana_unmap_skb(skb, apc);
+				dev_kfree_skb_any(skb);
+			}
+			atomic_set(&txq->pending_sends, 0);
 		}
-		atomic_set(&txq->pending_sends, 0);
 	}
+
 	/* We're 100% sure the queues can no longer be woken up, because
 	 * we're sure now mana_poll_tx_cq() can't be running.
 	 */
-- 
2.43.0
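
The guard pattern in this patch can be illustrated with a small userspace
sketch. It is not the driver code: struct port, struct rxq, and
destroy_rxqs() are hypothetical names, and malloc/free stand in for the
real queue setup and teardown. The point is that a teardown helper that
first checks the array pointer is safe to call whether the queues were
never allocated, partially allocated, or already freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct rxq {
	int id;
};

struct port {
	unsigned int num_queues;
	struct rxq **rxqs; /* NULL when never allocated or already freed */
};

/* Destroys whatever queues exist; returns how many were destroyed. */
static unsigned int destroy_rxqs(struct port *p)
{
	unsigned int i, destroyed = 0;

	assert(p != NULL);

	/* Guard on the array itself before indexing into it. */
	if (!p->rxqs)
		return 0;

	for (i = 0; i < p->num_queues; i++) {
		/* Partial allocation leaves some slots empty. */
		if (!p->rxqs[i])
			continue;
		free(p->rxqs[i]);
		p->rxqs[i] = NULL;
		destroyed++;
	}
	return destroyed;
}

/* Frees the array and NULLs the pointer, as the error cleanup does. */
static void free_rxq_array(struct port *p)
{
	assert(p != NULL);
	free(p->rxqs);
	p->rxqs = NULL;
}
```

Calling destroy_rxqs() a second time, or after free_rxq_array(), is then a
no-op rather than a NULL dereference.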



* [PATCH net v3 0/2] net: mana: Fix NULL dereferences during teardown after attach failure
From: Dipayaan Roy @ 2026-05-25  8:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
	john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov,
	pavan.chebbi

When mana_attach() fails (e.g. during queue allocation), the error
cleanup frees apc->tx_qp and apc->rxqs and sets them to NULL. Multiple
subsequent teardown paths can then dereference these NULL pointers,
causing kernel panics.

Patch 1 adds NULL guards in the low-level teardown functions
(mana_fence_rqs, mana_destroy_vport, mana_dealloc_queues) so they are
safe to call regardless of queue initialization state. This covers all
callers: mana_remove(), mana_change_mtu() recovery, and internal error
paths in mana_alloc_queues().

Patch 2 adds an early exit in mana_detach() for already-detached ports,
making it safe for non-close callers. This allows the queue reset
handler to safely retry mana_attach() without redundant teardown.

Changes in v3:
  - Patch 1: Fixed the commit message and Fixes tag.
Changes in v2:
  - Patch 2: moved netif_device_present() check into mana_detach() as
    an early exit instead of using goto in the work handler.

Dipayaan Roy (2):
  net: mana: Add NULL guards in teardown path to prevent panic on attach
    failure
  net: mana: Skip redundant detach on already-detached port

 drivers/net/ethernet/microsoft/mana/mana_en.c | 76 ++++++++++++-------
 1 file changed, 47 insertions(+), 29 deletions(-)

-- 
2.43.0



* [PATCH net v3 2/2] net: mana: Skip redundant detach on already-detached port
From: Dipayaan Roy @ 2026-05-25  8:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
	john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov,
	pavan.chebbi
In-Reply-To: <20260525081129.1230035-1-dipayanroy@linux.microsoft.com>

When mana_per_port_queue_reset_work_handler() runs after a previous
detach succeeded but attach failed, the port is left in a detached
state with apc->tx_qp and apc->rxqs already freed. Calling
mana_detach() again unconditionally leads to NULL pointer dereferences
during queue teardown.

Add an early exit in mana_detach() when the port is already in
detached state (!netif_device_present) for non-close callers, making
it safe to call idempotently. This allows the queue reset handler and
other recovery paths to simply retry mana_attach() without redundant
teardown.

Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 0582803907a8..1e1ad2795c3c 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3350,6 +3350,12 @@ int mana_detach(struct net_device *ndev, bool from_close)
 
 	ASSERT_RTNL();
 
+	/* If already detached (indicates detach succeeded but attach failed
+	 * previously). Now skip mana detach and just retry mana_attach.
+	 */
+	if (!from_close && !netif_device_present(ndev))
+		return 0;
+
 	apc->port_st_save = apc->port_is_up;
 	apc->port_is_up = false;
 
-- 
2.43.0
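
The early exit added here can be modelled in a few lines of userspace C.
This is a sketch only: struct netdev_sketch and detach() are hypothetical,
the present flag stands in for what netif_device_present() reports, and
the assumption that a non-close detach clears that flag at the end is
labeled in the code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct netdev_sketch {
	bool present;       /* stand-in for netif_device_present() */
	bool port_is_up;
	bool port_st_save;
	int teardown_calls; /* counts how often queue teardown ran */
};

static int detach(struct netdev_sketch *nd, bool from_close)
{
	assert(nd != NULL);

	/* Already detached by an earlier non-close call: nothing to undo. */
	if (!from_close && !nd->present)
		return 0;

	nd->port_st_save = nd->port_is_up;
	nd->port_is_up = false;
	nd->teardown_calls++;

	/* Assumption: a non-close detach marks the device not present. */
	if (!from_close)
		nd->present = false;
	return 0;
}
```

In this sketch a second non-close detach() returns immediately, so teardown
runs once and the saved port state from the first call is not overwritten.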



* Re: [PATCH v5 0/2] drm/hyperv: harden host message parsing
From: Hamza Mahfooz @ 2026-05-25 11:36 UTC (permalink / raw)
  To: Berkant Koc
  Cc: Saurabh Sengar, Dexuan Cui, Long Li, linux-hyperv, dri-devel,
	linux-kernel, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Michael Kelley, Thomas Zimmermann, Maarten Lankhorst,
	Maxime Ripard, Deepak Rawat
In-Reply-To: <cover.1779542874.git.me@berkoc.com>

On Sat, May 23, 2026 at 03:27:54PM +0200, Berkant Koc wrote:
> Two independent issues in the synthetic video driver that both stem
> from trusting unvalidated host data.
> 
> 1/2 bounds resolution_count from SYNTHVID_RESOLUTION_RESPONSE against
> the supported_resolution[] array, and populates WIN8 defaults for
> hv->screen_*_max / hv->preferred_* in both the WIN10-probe-failure
> path and the pre-WIN10 path, so a failed or pre-WIN10 probe yields a
> usable display instead of having drm_internal_framebuffer_create()
> reject every userspace framebuffer with -EINVAL.
> 
> 2/2 forwards bytes_recvd from vmbus_recvpacket() into the sub-handler,
> rejects packets that do not cover the synthvid header, and requires
> the type-specific payload size before memcpy/complete or before
> reading the feature-change byte. Rejected packets are logged via
> drm_err_ratelimited() instead of being silently dropped, matching the
> CoCo-hardened pattern in hv_kvp_onchannelcallback().
> 
> 1/2 is unchanged from v3/v4 and carries Michael Kelley's Reviewed-by.
> 
> Changes since v4 (per review by Michael Kelley):
> 
>   2/2: collapsed the leading "if (type == ... ) { ... switch ... }"
>        plus the separate "if (type == FEATURE_CHANGE)" into a single
>        switch on msg->vid_hdr.type. The three completion-driving cases
>        compute their per-type size and break to a shared exit that does
>        the size check + memcpy(init_buf) + complete(); FEATURE_CHANGE is
>        its own case that validates its payload and returns without
>        falling through; unknown types hit default and return. No
>        functional change, fewer lines.
> 
>   2/2: the vmbus_recvpacket() nonzero-return path (e.g. -ENOBUFS for a
>        too-big packet) is itself a malformed-message case. It is now
>        logged via drm_err_ratelimited(), consistent with the
>        sub-handler's other reject paths, instead of being silently
>        skipped. No channel recovery is attempted, as that is not worth
>        the added code for this rare host-side condition.
> 
> Changes since v3 (per review by Michael Kelley):
> 
>   2/2: validate SYNTHVID_RESOLUTION_RESPONSE against its actual
>        variable length. The response carries resolution_count entries,
>        not the full SYNTHVID_MAX_RESOLUTION_COUNT array, so requiring
>        sizeof(struct synthvid_supported_resolution_resp) rejected the
>        shorter responses the host legitimately sends and broke
>        resolution probing. Require the fixed prefix, read and bound
>        resolution_count, then require only the count-sized array.
> 
>   2/2: only run hyperv_receive_sub() when vmbus_recvpacket() returned
>        success. v3 dropped the bytes_recvd upper bound as redundant,
>        which holds only on a successful receive: on -ENOBUFS
>        vmbus_recvpacket() reports the required length in bytes_recvd,
>        which can exceed the 16 KiB hv->recv_buf, and the subsequent
>        memcpy(hv->init_buf, msg, bytes_recvd) would read and write
>        past both buffers. Gating on the success return restores the
>        invariant that made the bound redundant, so an oversized host
>        packet is dropped rather than copied.
> 
> Changes since v2 (per review by Michael Kelley):
> 
>   1/2: dropped the reinit_completion() change; the stale completion can
>        only outlive its request in hyperv_vmbus_resume() after a
>        get_supported_resolution() timeout, which is a narrower fix that
>        belongs in a separate patch against the resume path. Pre-WIN10
>        branch now also populates hv->preferred_*. The else branch is
>        gone; a single screen_width_max == 0 check covers both the
>        pre-WIN10 case and a failed WIN10 probe.
> 
>   2/2: added a per-type switch for the three completion-driving message
>        types so the wait-completion path validates payload size before
>        memcpy/complete. Every reject path emits drm_err_ratelimited()
>        rather than returning silently.
> 
> Changes since v1:
> 
>   1/2: bound resolution_count check folded into the existing zero check;
>        populate WIN8 defaults when hyperv_get_supported_resolution()
>        fails.
>   2/2: forward bytes_recvd into hyperv_receive_sub(); enforce the pipe +
>        synthvid header minimum; check synthvid_feature_change payload
>        size before reading is_dirt_needed.
> 
> The shared init_buf reuse (a duplicate or late host response can
> overwrite init_buf between successive request/response cycles) and the
> related completion reinit are real but orthogonal to this size
> validation. As discussed on v2, they are queued as a separate follow-up
> against the resume/expected-type path once this series lands.
> 
> This series is verified by static analysis and code inspection against
> the synthvid protocol structures and the vmbus_recvpacket() contract. I
> do not currently have a Hyper-V test environment to exercise the receive
> and resolution-probe paths at runtime, so confirmation from someone who
> can run it in a Hyper-V VM would be welcome.
> 
> Both patches carry an Assisted-by: Claude:claude-opus-4-7 berkoc-pipeline
> trailer per the kernel coding-assistants policy. Code, analysis and
> review responses are mine; the model is used as a structured reviewer
> under human verification.
> 
> Berkant Koc (2):
>   drm/hyperv: validate resolution_count and fix WIN8 fallback
>   drm/hyperv: validate VMBus packet size in receive callback
> 
>  drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 113 +++++++++++++++++++---
>  1 file changed, 97 insertions(+), 16 deletions(-)
> 
> 

Applied, thanks!

> base-commit: 4bf5d3da79c48e1df4bab82c9680c53adeff7820
> -- 
> 2.47.3
> 
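
The variable-length check described in the quoted "Changes since v3" notes (require the fixed prefix, read and bound resolution_count, then require only the count-sized array) can be sketched in userspace C as follows. This is a minimal illustration, not the code from the patch: the struct layouts are simplified stand-ins for the synthvid structures, and resolution_resp_len_ok() is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESOLUTION_COUNT 64	/* mirrors SYNTHVID_MAX_RESOLUTION_COUNT */

/* Simplified stand-ins for the synthvid structures; layouts are assumed. */
struct screen_info {
	uint16_t width;
	uint16_t height;
};

struct resolution_resp {
	uint8_t edid_block;
	uint8_t resolution_count;
	uint8_t default_resolution_index;
	uint8_t is_standard;
	struct screen_info supported_resolution[MAX_RESOLUTION_COUNT];
};

/* Hypothetical helper: true if len bytes cover the entries the host claims. */
bool resolution_resp_len_ok(const uint8_t *payload, size_t len)
{
	const size_t prefix = offsetof(struct resolution_resp,
				       supported_resolution);
	uint8_t count;

	/* 1. The fixed prefix must be present before any field is read. */
	if (len < prefix)
		return false;

	/* 2. Read the count and bound it against the array size. */
	count = payload[offsetof(struct resolution_resp, resolution_count)];
	if (count == 0 || count > MAX_RESOLUTION_COUNT)
		return false;

	/* 3. Require only count entries, not the full 64-entry array. */
	return len >= prefix + (size_t)count * sizeof(struct screen_info);
}
```

Requiring sizeof(struct resolution_resp) in step 3 instead would reject every response that carries fewer than 64 entries, which is the regression the v3 notes describe.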

^ permalink raw reply

* Re: [PATCH 1/1] x86/hyperv: Refactor hv_smp_prepare_cpus()
From: kernel test robot @ 2026-05-25 11:42 UTC (permalink / raw)
  To: Michael Kelley, kys, haiyangz, wei.liu, decui, longli, tglx,
	mingo, bp, dave.hansen, x86, hpa, linux-hyperv, linux-kernel,
	linux-arch
  Cc: llvm, oe-kbuild-all
In-Reply-To: <20260521192336.99623-1-mhklkml@zohomail.com>

Hi Michael,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/master]
[also build test ERROR on linus/master v7.1-rc5 next-20260522]
[cannot apply to tip/x86/core arnd-asm-generic/master tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Michael-Kelley/x86-hyperv-Refactor-hv_smp_prepare_cpus/20260522-032610
base:   tip/master
patch link:    https://lore.kernel.org/r/20260521192336.99623-1-mhklkml%40zohomail.com
patch subject: [PATCH 1/1] x86/hyperv: Refactor hv_smp_prepare_cpus()
config: x86_64-randconfig-076-20260524 (https://download.01.org/0day-ci/archive/20260525/202605251945.TesslTvF-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260525/202605251945.TesslTvF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202605251945.TesslTvF-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/hv/hv_proc.c:303:55: error: call to undeclared function 'cpu_physical_id'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     303 |                 ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));
         |                                                                     ^
   1 error generated.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for MFD_STMFX
   Depends on [n]: HAS_IOMEM [=y] && I2C [=y] && OF [=n]
   Selected by [y]:
   - PINCTRL_STMFX [=y] && PINCTRL [=y] && I2C [=y] && HAS_IOMEM [=y]


vim +/cpu_physical_id +303 drivers/hv/hv_proc.c

   290	
   291	void hv_smp_prep_cpus(void)
   292	{
   293	#ifdef CONFIG_X86_64
   294		int i, ret;
   295	
   296		/* If AP LPs exist, we are in a kexec'd kernel and VPs already exist */
   297		if (num_present_cpus() == 1 || hv_lp_exists(1))
   298			return;
   299	
   300		for_each_present_cpu(i) {
   301			if (i == 0)
   302				continue;
 > 303			ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply

* [PATCH] uio_hv_generic: Bind to FCopy device by default
From: Ben Hutchings @ 2026-05-25 12:04 UTC (permalink / raw)
  To: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li
  Cc: Greg Kroah-Hartman, linux-hyperv

[-- Attachment #1: Type: text/plain, Size: 1030 bytes --]

The Hyper-V kernel-mode fcopy driver was removed in 6.10 and the new
fcopy daemon requires this uio driver to function.  However, by
default the driver does not bind to any devices, and must be
configured through the sysfs "new_id" file.

Since the FCopy device is now only usable through this driver, add its
ID to the driver's ID table so that the daemon will work "out of the
box".

Signed-off-by: Ben Hutchings <benh@debian.org>
Fixes: ec314f61e4fc ("Drivers: hv: Remove fcopy driver")
---
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -395,9 +395,15 @@ hv_uio_remove(struct hv_device *dev)
 	vmbus_free_ring(dev->channel);
 }
 
+static const struct hv_vmbus_device_id hv_uio_id_table[] = {
+	{ HV_FCOPY_GUID },
+	{}
+};
+MODULE_DEVICE_TABLE(vmbus, hv_uio_id_table);
+
 static struct hv_driver hv_uio_drv = {
 	.name = "uio_hv_generic",
-	.id_table = NULL, /* only dynamic id's */
+	.id_table = hv_uio_id_table,
 	.probe = hv_uio_probe,
 	.remove = hv_uio_remove,
 };

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* RE: [PATCH v5 0/2] drm/hyperv: harden host message parsing
From: Michael Kelley @ 2026-05-25 14:57 UTC (permalink / raw)
  To: Hamza Mahfooz, Berkant Koc
  Cc: Saurabh Sengar, Dexuan Cui, Long Li, linux-hyperv@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Michael Kelley,
	Thomas Zimmermann, Maarten Lankhorst, Maxime Ripard, Deepak Rawat
In-Reply-To: <ahQ0NS1jrfU8ms1U@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

From: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Sent: Monday, May 25, 2026 4:36 AM
> 
> On Sat, May 23, 2026 at 03:27:54PM +0200, Berkant Koc wrote:
> > Two independent issues in the synthetic video driver that both stem
> > from trusting unvalidated host data.
> >
> > 1/2 bounds resolution_count from SYNTHVID_RESOLUTION_RESPONSE against
> > the supported_resolution[] array, and populates WIN8 defaults for
> > hv->screen_*_max / hv->preferred_* in both the WIN10-probe-failure
> > path and the pre-WIN10 path, so a failed or pre-WIN10 probe yields a
> > usable display instead of having drm_internal_framebuffer_create()
> > reject every userspace framebuffer with -EINVAL.
> >
> > 2/2 forwards bytes_recvd from vmbus_recvpacket() into the sub-handler,
> > rejects packets that do not cover the synthvid header, and requires
> > the type-specific payload size before memcpy/complete or before
> > reading the feature-change byte. Rejected packets are logged via
> > drm_err_ratelimited() instead of being silently dropped, matching the
> > CoCo-hardened pattern in hv_kvp_onchannelcallback().
> >
> > 1/2 is unchanged from v3/v4 and carries Michael Kelley's Reviewed-by.
> >

[snip]

> >
> > Berkant Koc (2):
> >   drm/hyperv: validate resolution_count and fix WIN8 fallback
> >   drm/hyperv: validate VMBus packet size in receive callback
> >
> >  drivers/gpu/drm/hyperv/hyperv_drm_proto.c | 113 +++++++++++++++++++---
> >  1 file changed, 97 insertions(+), 16 deletions(-)
> >
> >
> 
> Applied, thanks!

Hamza -- which tree was this applied to?

Michael
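
For readers following the quoted summary of 2/2, the receive-side gating can be sketched as below. This is a simplified userspace model under stated assumptions, not the driver code: handle_packet() is a hypothetical name, recv_ret and bytes_recvd stand in for what vmbus_recvpacket() reports, and HDR_MIN is an assumed value for the combined pipe and synthvid header size.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define RECV_BUF_SIZE 16384	/* 16 KiB, the size given for hv->recv_buf */
#define HDR_MIN 16		/* assumed pipe + synthvid header size */

/*
 * Returns 0 when the packet was copied into init_buf, or a negative errno
 * when it was dropped. Both buffers are assumed to be RECV_BUF_SIZE bytes.
 */
int handle_packet(int recv_ret, const unsigned char *recv_buf,
		  size_t bytes_recvd, unsigned char *init_buf)
{
	/*
	 * On failure (for example -ENOBUFS) bytes_recvd holds the required
	 * length, which may exceed the buffer, so it must not be used.
	 */
	if (recv_ret != 0)
		return recv_ret;

	/* The packet must at least cover the headers read by the handler. */
	if (bytes_recvd < HDR_MIN)
		return -EINVAL;

	/*
	 * Redundant after a successful receive; kept only because this
	 * standalone sketch cannot rely on the caller's invariant.
	 */
	if (bytes_recvd > RECV_BUF_SIZE)
		return -EINVAL;

	memcpy(init_buf, recv_buf, bytes_recvd);
	return 0;
}
```

The per-type payload checks described in the cover letter would sit between the header check and the memcpy(); they are omitted here.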

^ permalink raw reply

* Re: [PATCH v5 0/2] drm/hyperv: harden host message parsing
From: Hamza Mahfooz @ 2026-05-25 15:32 UTC (permalink / raw)
  To: Michael Kelley
  Cc: Berkant Koc, Saurabh Sengar, Dexuan Cui, Long Li,
	linux-hyperv@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Thomas Zimmermann, Maarten Lankhorst, Maxime Ripard,
	Deepak Rawat
In-Reply-To: <SN6PR02MB4157F72302D2B4B86ECE553FD40A2@SN6PR02MB4157.namprd02.prod.outlook.com>

On Mon, May 25, 2026 at 02:57:24PM +0000, Michael Kelley wrote:
> From: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com> Sent: Monday, May 25, 2026 4:36 AM
> > Applied, thanks!
> 
> Hamza -- which tree was this applied to?

drm-misc-fixes

> 
> Michael

^ permalink raw reply

* Re: [EXTERNAL] Re: [PATCH rdma-next v2] RDMA/mana_ib: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Erni Sri Satya Vennela @ 2026-05-25 18:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Long Li, Leon Romanovsky, Konstantin Taranov,
	linux-rdma@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20260413134602.GL3694781@ziepe.ca>

On Mon, Apr 13, 2026 at 10:46:02AM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 10, 2026 at 10:29:45PM +0000, Long Li wrote:
> > > On Sat, Mar 21, 2026 at 12:56:39AM +0000, Long Li wrote:
> > > 
> > > > How we rephrase this in this way: the driver should not corrupt or
> > > > overflow other parts of the kernel if its device is misbehaving (or
> > > > has a bug).
> > > 
> > > If we are going to do this CC hardening stuff I think I want to see a more
> > > comprehensive approach, like if we detect an attack then the kernel instantly
> > > crashes or something. Or at least an approach in general agreed to by the CC and
> > > kernel community.
> > > 
> > > Ignoring the issue and continuing seems just wrong.
> > > 
> > > This sprinkling of random checks in this series doesn't feel comprehensive or
> > > cohesive to me.
> > > 
> > > Jason
> > 
> > Can we follow the virtio BAD_RING()/vq->broken pattern in
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/virtio/virtio_ring.c#n57.
> > 
> > Add a broken flag to mana_ib_dev. When any hardware response
> > contains out-of-range values, mark the device broken and fail the
> > operation - during probe this prevents device registration entirely,
> > at runtime all subsequent operations return -EIO.
> 
> If that's the plan I would think it should be struct device based, but
> yeah, I'm more comfortable with this sort of direction as a CC
> hardening plan.
> 
Hi Jason,

Our team is not aligned with marking the device broken, after multiple
discussions, since both the values that are received from hardware and
stored in mana_ib_gd_query_adapter_caps are u32.

I'm planning to send v3 as a non-hardening patch with only clamping the
values at mana_ib_query_device to INT_MAX when out-of-bound.

Your previous concerns:
> “I'm also not convinced clamping to such a high value has any value
> whatsoever, as it probably still triggers maths overflows elsewhere. I
> think you should clamp to reasonable limits for your device if you want
> to do this.”

We plan to clamp it to INT_MAX since it is the max in props.

> “There is no reason they should be signed, you should just fix the
> type.”

It is not allowed to change sign in props, so clamping is the best bet.

Thanks,
Vennela

^ permalink raw reply

* [PATCH rdma-next v3] RDMA/mana_ib: Clamp adapter capabilities at the ib_device_attr boundary
From: Erni Sri Satya Vennela @ 2026-05-25 19:01 UTC (permalink / raw)
  To: longli, kotaranov, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-hyperv, linux-kernel
  Cc: Erni Sri Satya Vennela

mana_ib stores its adapter capabilities internally as u32 in
struct mana_ib_adapter_caps. The IB core, however, exposes the
corresponding device attributes through struct ib_device_attr, where
fields such as max_qp, max_qp_wr, max_send_sge, max_recv_sge,
max_sge_rd, max_cq, max_cqe, max_mr, max_pd, max_qp_rd_atom,
max_res_rd_atom and max_qp_init_rd_atom are signed int.

mana_ib_query_device() is the only place that copies the cached u32
caps into these int fields. If a cap exceeds INT_MAX, the implicit
u32-to-int narrowing yields a negative value. Clamp each cap to
INT_MAX at this boundary so the values handed to the IB core are always
non-negative.

While here, fix a related overflow in the computation of
max_res_rd_atom. It is derived as max_qp_rd_atom * max_qp, both of
which are int after the assignment above; the multiplication can
overflow an int even with the new clamps in place. Widen to s64
before multiplying and clamp the result to INT_MAX.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v3:
* Drop clamping from mana_ib_gd_query_adapter_caps(). The internal u32
  caps cache does not need to be clamped.
* Move all clamping exclusively to mana_ib_query_device(), which is the
  only place the cached u32 values are narrowed into the signed int
  fields of struct ib_device_attr.
* Reframe commit message: this is a u32-to-int type boundary fix, not a
  CVM/untrusted-hardware hardening patch.
Changes in v2:
* Update patch title.
---
 drivers/infiniband/hw/mana/main.c | 33 ++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index ac5e75dd3494..ca843083140f 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -555,19 +555,28 @@ int mana_ib_query_device(struct ib_device *ibdev, struct ib_device_attr *props,
 	props->vendor_part_id = dev->gdma_dev->dev_id.type;
 	props->max_mr_size = MANA_IB_MAX_MR_SIZE;
 	props->page_size_cap = dev->adapter_caps.page_size_cap;
-	props->max_qp = dev->adapter_caps.max_qp_count;
-	props->max_qp_wr = dev->adapter_caps.max_qp_wr;
+	/*
+	 * mana_ib stores adapter capabilities internally as u32, but the
+	 * corresponding ib_device_attr fields are signed int. Clamp each
+	 * value at this boundary so a cap larger than INT_MAX is never
+	 * narrowed into a negative value visible to the IB core or
+	 * userspace.
+	 */
+	props->max_qp = min_t(u32, dev->adapter_caps.max_qp_count, INT_MAX);
+	props->max_qp_wr = min_t(u32, dev->adapter_caps.max_qp_wr, INT_MAX);
 	props->device_cap_flags = IB_DEVICE_RC_RNR_NAK_GEN;
-	props->max_send_sge = dev->adapter_caps.max_send_sge_count;
-	props->max_recv_sge = dev->adapter_caps.max_recv_sge_count;
-	props->max_sge_rd = dev->adapter_caps.max_recv_sge_count;
-	props->max_cq = dev->adapter_caps.max_cq_count;
-	props->max_cqe = dev->adapter_caps.max_qp_wr;
-	props->max_mr = dev->adapter_caps.max_mr_count;
-	props->max_pd = dev->adapter_caps.max_pd_count;
-	props->max_qp_rd_atom = dev->adapter_caps.max_inbound_read_limit;
-	props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp;
-	props->max_qp_init_rd_atom = dev->adapter_caps.max_outbound_read_limit;
+	props->max_send_sge = min_t(u32, dev->adapter_caps.max_send_sge_count, INT_MAX);
+	props->max_recv_sge = min_t(u32, dev->adapter_caps.max_recv_sge_count, INT_MAX);
+	props->max_sge_rd = min_t(u32, dev->adapter_caps.max_recv_sge_count, INT_MAX);
+	props->max_cq = min_t(u32, dev->adapter_caps.max_cq_count, INT_MAX);
+	props->max_cqe = min_t(u32, dev->adapter_caps.max_qp_wr, INT_MAX);
+	props->max_mr = min_t(u32, dev->adapter_caps.max_mr_count, INT_MAX);
+	props->max_pd = min_t(u32, dev->adapter_caps.max_pd_count, INT_MAX);
+	props->max_qp_rd_atom = min_t(u32, dev->adapter_caps.max_inbound_read_limit, INT_MAX);
+	props->max_res_rd_atom = min_t(s64,
+				       (s64)props->max_qp_rd_atom * props->max_qp,
+				       INT_MAX);
+	props->max_qp_init_rd_atom = min_t(u32, dev->adapter_caps.max_outbound_read_limit, INT_MAX);
 	props->atomic_cap = IB_ATOMIC_NONE;
 	props->masked_atomic_cap = IB_ATOMIC_NONE;
 	props->max_ah = INT_MAX;
-- 
2.34.1
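
The two conversions the commit message describes can be modelled outside the kernel as follows. This is a minimal userspace sketch: clamp_u32_to_int() and clamped_product() are hypothetical helper names that play the role of the min_t() expressions in the patch, and clamped_product() assumes both inputs are already non-negative.

```c
#include <limits.h>
#include <stdint.h>

/* Narrow a u32 capability into an int without ever going negative. */
int clamp_u32_to_int(uint32_t cap)
{
	return cap > (uint32_t)INT_MAX ? INT_MAX : (int)cap;
}

/*
 * Multiply two non-negative ints in 64 bits, then clamp to INT_MAX.
 * Widening before the multiplication is what avoids the int overflow.
 */
int clamped_product(int a, int b)
{
	int64_t wide = (int64_t)a * b;

	return wide > INT_MAX ? INT_MAX : (int)wide;
}
```

Without the first helper, a cap of 0xFFFFFFFF assigned directly to an int field would typically read back as -1. Without the widening in the second, INT_MAX * 2 would overflow even though both operands were clamped.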


^ permalink raw reply related

* Re: [EXTERNAL] Re: [PATCH rdma-next v2] RDMA/mana_ib: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Jason Gunthorpe @ 2026-05-25 23:01 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: Long Li, Leon Romanovsky, Konstantin Taranov,
	linux-rdma@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <ahSbyYcq0sgfJnmZ@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Mon, May 25, 2026 at 11:58:17AM -0700, Erni Sri Satya Vennela wrote:
> > “There is no reason they should be signed, you should just fix the
> > type.”
> 
> It is not allowed to change sign in props, so clamping is the best bet.

Why not? Fix the core code; it is just old junk that they are signed.
They should never have been.

Jason

^ permalink raw reply

* Re: [PATCH] uio_hv_generic: Bind to FCopy device by default
From: Naman Jain @ 2026-05-26  6:45 UTC (permalink / raw)
  To: Ben Hutchings, K. Y. Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li
  Cc: Greg Kroah-Hartman, linux-hyperv
In-Reply-To: <ahQ6xuhSReidmN-3@decadent.org.uk>



On 5/25/2026 5:34 PM, Ben Hutchings wrote:
> The Hyper-V kernel-mode fcopy driver was removed in 6.10 and the new
> fcopy daemon requires this uio driver to function.  However, by
> default the driver does not bind to any devices, and must be
> configured through the sysfs "new_id" file.
> 
> Since the FCopy device is now only usable through this driver, add its
> ID to the driver's ID table so that the daemon will work "out of the
> box".
> 
> Signed-off-by: Ben Hutchings <benh@debian.org>
> Fixes: ec314f61e4fc ("Drivers: hv: Remove fcopy driver")
> ---
> --- a/drivers/uio/uio_hv_generic.c
> +++ b/drivers/uio/uio_hv_generic.c
> @@ -395,9 +395,15 @@ hv_uio_remove(struct hv_device *dev)
>   	vmbus_free_ring(dev->channel);
>   }
>   
> +static const struct hv_vmbus_device_id hv_uio_id_table[] = {
> +	{ HV_FCOPY_GUID },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(vmbus, hv_uio_id_table);
> +
>   static struct hv_driver hv_uio_drv = {
>   	.name = "uio_hv_generic",
> -	.id_table = NULL, /* only dynamic id's */
> +	.id_table = hv_uio_id_table,
>   	.probe = hv_uio_probe,
>   	.remove = hv_uio_remove,
>   };

Two things worth considering before applying:

1. Please add Cc: stable@vger.kernel.org or is it that we do not want 
this to be ported to older kernels?
2. Every Hyper-V guest (with UIO_HV_GENERIC enabled) will now have an 
additional auto-bound /dev/uio0 node for FCopy. Anything that hardcodes 
/dev/uio0 (e.g. ad-hoc DPDK scripts that bind a NetVSC NIC via 
uio_hv_generic + new_id) may see its index shift, since FCopy now wins 
uio0 at boot. The fix for such consumers is the same thing DPDK and the 
in-tree daemon already do: resolve uio via 
/sys/bus/vmbus/devices/<guid>/uio/ rather than by number. This is not a 
regression in the patch, but it's a behavior change worth calling out.

Regards,
Naman
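
For consumers affected by point 2, resolving the uio index from the device's sysfs directory instead of assuming /dev/uio0 could look roughly like this. It is a sketch only: parse_uio_name() and find_uio_index() are hypothetical helpers, and the caller is assumed to pass the /sys/bus/vmbus/devices/<guid>/uio/ path for the device it cares about.

```c
#include <ctype.h>
#include <dirent.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Return N for a name of the form "uioN", or -1 for anything else. */
int parse_uio_name(const char *name)
{
	char *end;
	long idx;

	if (strncmp(name, "uio", 3) != 0 || !isdigit((unsigned char)name[3]))
		return -1;

	idx = strtol(name + 3, &end, 10);
	if (*end != '\0' || idx < 0 || idx >= INT_MAX)
		return -1;

	return (int)idx;
}

/*
 * Scan a device's "uio" sysfs directory and return the index of the first
 * uioN entry found, or -1 if the directory is missing or has no such entry.
 */
int find_uio_index(const char *uio_dir)
{
	DIR *dir = opendir(uio_dir);
	struct dirent *ent;
	int idx = -1;

	if (!dir)
		return -1;

	while (idx < 0 && (ent = readdir(dir)) != NULL)
		idx = parse_uio_name(ent->d_name);

	closedir(dir);
	return idx;
}
```

The returned index can then be used to build the /dev/uioN path, so the consumer keeps working regardless of which device was bound first.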

^ permalink raw reply

* Re: [PATCH] uio_hv_generic: Bind to FCopy device by default
From: Naman Jain @ 2026-05-26 10:10 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Greg Kroah-Hartman, linux-hyperv
In-Reply-To: <afdcb1775e7a60b7824b5c540a44f0148abe3e1c.camel@debian.org>



On 5/26/2026 1:59 PM, Ben Hutchings wrote:
> On Tue, 2026-05-26 at 12:15 +0530, Naman Jain wrote:
>>
>> On 5/25/2026 5:34 PM, Ben Hutchings wrote:
>>> The Hyper-V kernel-mode fcopy driver was removed in 6.10 and the new
>>> fcopy daemon requires this uio driver to function.  However, by
>>> default the driver does not bind to any devices, and must be
>>> configured through the sysfs "new_id" file.
>>>
>>> Since the FCopy device is now only usable through this driver, add its
>>> ID to the driver's ID table so that the daemon will work "out of the
>>> box".
>>>
>>> Signed-off-by: Ben Hutchings <benh@debian.org>
>>> Fixes: ec314f61e4fc ("Drivers: hv: Remove fcopy driver")
>>> ---
>>> --- a/drivers/uio/uio_hv_generic.c
>>> +++ b/drivers/uio/uio_hv_generic.c
>>> @@ -395,9 +395,15 @@ hv_uio_remove(struct hv_device *dev)
>>>    	vmbus_free_ring(dev->channel);
>>>    }
>>>    
>>> +static const struct hv_vmbus_device_id hv_uio_id_table[] = {
>>> +	{ HV_FCOPY_GUID },
>>> +	{}
>>> +};
>>> +MODULE_DEVICE_TABLE(vmbus, hv_uio_id_table);
>>> +
>>>    static struct hv_driver hv_uio_drv = {
>>>    	.name = "uio_hv_generic",
>>> -	.id_table = NULL, /* only dynamic id's */
>>> +	.id_table = hv_uio_id_table,
>>>    	.probe = hv_uio_probe,
>>>    	.remove = hv_uio_remove,
>>>    };
>>

++ recipients, assuming you mistakenly clicked reply instead of reply all.


>> Two things worth considering before applying:
>>
>> 1. Please add Cc: stable@vger.kernel.org or is it that we do not want
>> this to be ported to older kernels?
>>
>> 2. Every Hyper-V guest (with UIO_HV_GENERIC enabled) will now have an
>> additional auto-bound /dev/uio0 node for FCopy.
> 
> I don't think that's quite true.  I tested with a Windows 11 host and
> needed to enable "Guest services" for the VM, which was disabled by
> default.  But if that includes other features besides FCopy it might be
> enabled for other reasons.
> 

Yes, meaning that if these two conditions are satisfied (enabling guest
services is also a one-time step for a Hyper-V VM), we would see uio0 by
default for fcopy.

>> Anything that hardcodes
>> /dev/uio0 (e.g. ad-hoc DPDK scripts that bind a NetVSC NIC via
>> uio_hv_generic + new_id) may see its index shift, since FCopy now wins
>> uio0 at boot.
> 
> OK, so maybe I should implement the new_id dance in the fcopy service
> startup, to avoid that?  I did already look at doing it in a systemd
> unit, but it's hard to do right because adding the same ID twice is an
> error.  Maybe the daemon itself could do it?

Implementing it in the uio daemon can introduce race conditions with
sysfs creation. I guess it's OK then to implement it here, in the kernel.

> 
>> The fix for such consumers is the same thing DPDK and the
>> in-tree daemon already do: resolve uio via
>> /sys/bus/vmbus/devices/<guid>/uio/ rather than by number. This is not a
>> regression in the patch, but it's a behavior change worth calling out.
> 
> It would be a good reason *not* to make this change in stable.
> 
> Ben.
> 

What issues are you fixing with this patch exactly? Is there any
particular sequence of events you are targeting where the traditional
approach does not work?

Regards,
Naman

^ permalink raw reply

* Re: [PATCH net v2] net: mana: Optimize irq affinity for low vcpu configs
From: Shradha Gupta @ 2026-05-26 13:05 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: Yury Norov, Dexuan Cui, Wei Liu, Haiyang Zhang, K. Y. Srinivasan,
	Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Konstantin Taranov, Simon Horman, Dipayaan Roy,
	Shiraz Saleem, Michael Kelley, Long Li, Yury Norov, linux-hyperv,
	linux-kernel, netdev, Paul Rosswurm, Shradha Gupta,
	Saurabh Singh Sengar, stable
In-Reply-To: <agq5/8rUFp3ttOFz@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

On Mon, May 18, 2026 at 12:04:31AM -0700, Erni Sri Satya Vennela wrote:
> > > But one observation I had was that " irq_set_affinity_and_hint(*irqs++,
> > > NULL);" is essentially a no-op and we end up relying on the initial
> > > placement from pci_alloc_irq_vectors().
> > 
> > Yes you are, assuming you're not binding them before in your call chain.
> > 
> > > Even though in these tests we
> > > were not able to reproduce it, but with this distribution there is a
> > > chance we end up clustering the mana queue IRQs, while other vCPUs are
> > > not running any network load.
> > 
> > That sounds like an IRQ balancer bug which you're unable to reproduce. 
> > 
> > > It's because the placement depends on
> > > system-wide IRQ state at allocation time.
> > 
> > I don't understand this point. The 
> > 
> >         irq_set_affinity_and_hint(*irqs++, NULL);
> > 
> > simply means: I trust system IRQ balancer to pick the best CPU for my
> > IRQ at runtime. It doesn't refer any "IRQ state at allocation time".
> >   
> > > The linear approach however guarantees each queue IRQ lands on a
> > > distinct vCPU regardless of system state. Even after stressing the cpus
> > > using stress-ng, we did not observe any significant throughput drop.
> > 
> > If you just do nothing, it would lead to the same numbers, right? What
> > does that "non-significant throughput drop" mean? It sounds like the
> > linear approach is slightly worse.
> 
> The numbers are not worse; they are almost the same in both cases.
> > 
> > --
> > 
> > So, as you can't demonstrate solid benefit for the 'linear' IRQ placement,
> > I would just stick to the no-affinity logic.
> 
> Thank you, Yury.
> We are investigating more test scenarios and trying to
> capture numbers with both your proposed change and the one from this
> patch. We will keep you updated about the results.
> 
> 
> - Vennela

Hi Yury,

Vennela and I ran a number of additional tests and were able to
reproduce the clustering of MANA IRQs that we discussed earlier with the
suggested approach (setting the affinity and hint to NULL).
In these tests, additional IRQs (apart from MANA) were allocated, which
disturbed the MANA IRQ distribution.

Environment details:
Azure SKU Standard_L4als_v5, 4 vCPUs (2 cores), 5 MANA IRQs (1 HWC + 4
queue)

"Affinity set to NULL" approach
========================================
MANA IRQ distribution	vCPU
========================================
IRQ0	HWC		0
IRQ1	mana_q1		2
IRQ3	mana_q2		3
IRQ4	mana_q3		2
IRQ5	mana_q4		3


"Affinity set linearly" approach
========================================
MANA IRQ distribution	vCPU
========================================
IRQ0	HWC		0
IRQ1	mana_q1		1
IRQ3	mana_q2		2
IRQ4	mana_q3		3
IRQ5	mana_q4		0


Throughput (Gbps) with a high number of TCP connections
========================================
connections	affinity NULL	Linear
20480		5.25		13.49
10240		5.77		13.48
8192		7.16		13.48
6144		9.33		13.53
4096		13.50		13.50


Considering these results, we would like to proceed with the linear
approach that was proposed by this patch.


Regards,
Shradha
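
The difference between the two placements in the tables above can be reproduced with a small model. This is a sketch inferred from the posted distribution, not the code from the patch: linear_cpu_for_irq() and distinct_cpus() are hypothetical names, and the model assumes at most 32 vCPUs.

```c
#include <stddef.h>

/*
 * Linear placement as it appears in the table: the IRQ at position irq_idx
 * (HWC is 0, the queues are 1..4) lands on vCPU irq_idx modulo ncpus.
 */
unsigned int linear_cpu_for_irq(unsigned int irq_idx, unsigned int ncpus)
{
	return irq_idx % ncpus;
}

/* Count how many different vCPUs a set of IRQs is spread across. */
unsigned int distinct_cpus(const unsigned int *cpu_of_irq, size_t n)
{
	unsigned int mask = 0, count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		mask |= 1u << cpu_of_irq[i];	/* assumes cpu < 32 */

	for (; mask; mask >>= 1)
		count += mask & 1u;

	return count;
}
```

Fed with the queue IRQ placements from the tables, the NULL-affinity case {2, 3, 2, 3} covers two vCPUs, while the linear case {1, 2, 3, 0} covers all four.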

^ permalink raw reply

* [RFC] KVM/x86: Killing kvm_get_time_and_clockread() in favour of ktime_get_snapshot()
From: David Woodhouse @ 2026-05-26 13:57 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, John Stultz,
	Michael Kelley
  Cc: Vitaly Kuznetsov, Marcelo Tosatti, Christopher S. Hall,
	Stephen Boyd, Miroslav Lichvar, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
	Wei Liu, Dexuan Cui, Daniel Lezcano, kvm, linux-hyperv, x86,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2323 bytes --]

In 2012, as part of implementing the "master clock" mode for kvmclock,
Marcelo added kvm_get_time_and_clockread() in commit d828199e8444
("KVM: x86: implement PVCLOCK_TSC_STABLE_BIT pvclock flag").

In 2016, Christopher Hall added the generic ktime_get_snapshot() in
commit 9da0f49c8767 ("time: Add timekeeping snapshot code capturing
system time and counter"), which provides the same paired read of
{ time, counter } through the core timekeeping code.

Then in 2018, Vitaly Kuznetsov added Hyper-V TSC page support in
commit b0c39dc68e3b ("x86/kvm: Pass stable clocksource to guests when
running nested on Hyper-V"), which extended vgettsc() to handle the
HVCLOCK case.

I'd quite like to kill it all with fire and make KVM use
ktime_get_snapshot() instead.

However, to correlate with the TSC provided to guests, KVM needs the
underlying host TSC counter value, *not* the cycles count from the
hyperv_clocksource_tsc_page clocksource which is scaled to 10MHz.

If we wanted to support master clock mode while nesting under KVM and
bizarrely using the kvmclock for system timing, we'd have the same
problem with the kvmclock clocksource, which similarly scales to 1GHz.

One option is to say "Don't Do That Then™": if you want to provide a
masterclock kvmclock to guests then *don't* use the silly pvclocks for
your own kernel's timekeeping, use the damn TSC. Because if the TSC
*isn't* reliable then you can't do masterclock mode for your guests
anyway.

Perhaps that should have been the response when commit b0c39dc68e3b was
submitted, but I guess we're stuck supporting that mode now. But I
really do want to kill the KVM hacks and use ktime_get_snapshot().

Reverse-engineering the original TSC reading from the clocksource
counter value doesn't look sane, without a loss of precision and/or
128-bit division.

One simple option that occurs to me would be to add a 'cycles_raw'
value to the system_time_snapshot, for PV clocksources like hyperv and
kvmclock to populate with the original TSC reading.

That might actually let us clean up some of the PTP code that currently
has to deal with TSC vs. kvmclock in counter snapshots too. I think I
could kill the use of get_cycles() in vmclock for the kvmclock case,
which might make Thomas happy...

Any better ideas?

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
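
To illustrate why the original TSC reading cannot be recovered from the scaled clocksource value, and what a 'cycles_raw' field might carry, here is a rough userspace sketch. It assumes the usual Hyper-V TSC page form, reference = (tsc * scale) >> 64, with the offset omitted, a TSC frequency above 10 MHz, and a compiler that provides unsigned __int128. struct snapshot_sketch and its field names are hypothetical and are not the kernel's struct system_time_snapshot.

```c
#include <stdint.h>

/* Scale factor that converts TSC ticks at tsc_hz into 10 MHz ticks. */
uint64_t tsc_scale_for(uint64_t tsc_hz)
{
	return (uint64_t)(((unsigned __int128)10000000 << 64) / tsc_hz);
}

/* Scaled counter value, as a 10 MHz clocksource would report it. */
uint64_t tsc_to_ref(uint64_t tsc, uint64_t scale)
{
	return (uint64_t)(((unsigned __int128)tsc * scale) >> 64);
}

/* Hypothetical snapshot carrying both the scaled and the raw counter. */
struct snapshot_sketch {
	uint64_t cycles;	/* scaled clocksource value (10 MHz units) */
	uint64_t cycles_raw;	/* original TSC reading, kept alongside */
};

struct snapshot_sketch take_snapshot(uint64_t tsc, uint64_t scale)
{
	struct snapshot_sketch snap = {
		.cycles = tsc_to_ref(tsc, scale),
		.cycles_raw = tsc,
	};

	return snap;
}
```

With a 3 GHz TSC, roughly 300 consecutive TSC values collapse onto each 10 MHz tick, so the mapping is many-to-one. Keeping the raw reading next to the scaled one sidesteps the inversion entirely.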

^ permalink raw reply

* [PATCH v2 1/1] mshv: Add conditional VMBus dependency
From: Michael Kelley @ 2026-05-26 14:13 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, jloeser, linux-hyperv
  Cc: linux-kernel, arnd, hamzamahfooz

From: Michael Kelley <mhklinux@outlook.com>

When the VMBus driver is not part of the kernel (CONFIG_HYPERV_VMBUS=n),
the MSHV root driver fails to link:

ERROR: modpost: "hv_vmbus_exists" [drivers/hv/mshv_root.ko] undefined!

Fix this while meeting these requirements:
* It must be possible to include the MSHV root driver without the
  VMBus driver. In such case, the MSHV root driver can be built-in
  to the kernel image, or it can be built as a separate module.
* If both the MSHV root driver and the VMBus driver are present, the
  MSHV root driver and VMBus driver can both be built-in, or they can
  both be separate modules. Or the MSHV root driver can be a module
  while the VMBus driver can be built-in, but the reverse is
  disallowed. Regardless of the build choices, the VMBus driver must
  be loaded before the MSHV driver in order for the SynIC to be
  managed properly (see comments in the MSHV SynIC code).

The fix has two parts:
* Add a Kconfig entry for MSHV_ROOT to depend on HYPERV_VMBUS if
  HYPERV_VMBUS is present. The entry disallows MSHV_ROOT being
  built-in when HYPERV_VMBUS is a module, but without requiring that
  HYPERV_VMBUS be built.
* Add a stub implementation of hv_vmbus_exists() for when the
  VMBus driver is not present so that the MSHV root driver has
  no module dependency on VMBus. When the VMBus driver *is*
  present, the module dependency ensures that the VMBus driver
  loads first when both are built as modules.
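
As a rough illustration of the Kconfig part, the limit that the new
"depends on" line places on MSHV_ROOT can be tabulated with a few lines
of userspace C. This is only a sketch: it assumes the conditional
dependency evaluates like "!HYPERV_VMBUS || HYPERV_VMBUS" under the usual
Kconfig tristate rules (n < m < y, "!" maps m to m, "||" is the maximum).
The enum and function names are made up for the example and are not
kernel symbols.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig negation: !n = y, !m = m, !y = n. */
static enum tristate tri_not(enum tristate a)
{
	return (enum tristate)(TRI_Y - a);
}

/* Kconfig "||" evaluates to the larger of its two operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical helper: upper limit on MSHV_ROOT for a given HYPERV_VMBUS
 * setting, assuming "depends on HYPERV_VMBUS if HYPERV_VMBUS" behaves
 * like "!HYPERV_VMBUS || HYPERV_VMBUS".
 */
static enum tristate mshv_root_limit(enum tristate vmbus)
{
	assert(vmbus >= TRI_N && vmbus <= TRI_Y);
	return tri_or(tri_not(vmbus), vmbus);
}
```

Under that assumption, HYPERV_VMBUS=n and HYPERV_VMBUS=y leave MSHV_ROOT
free to be y or m, while HYPERV_VMBUS=m caps MSHV_ROOT at m, which is the
"reverse is disallowed" case listed in the requirements.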

Existing code ensures that the VMBus driver loads first if it is
built-in. The VMBus driver uses subsys_initcall(), which is
initcall level 4. The MSHV root driver uses module_init(), which
becomes device_initcall() when built-in, and device_initcall() is
initcall level 6.
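
The built-in ordering argument can be mimicked in userspace with a toy
model of initcall levels. This is a sketch, not kernel code: the struct
and function below are invented for the example. The only facts taken
from the kernel are the level numbers (4 for subsys_initcall(), 6 for
device_initcall()) and that lower levels run first.

```c
#include <assert.h>
#include <stddef.h>

/* Level numbers as used by the kernel's initcall macros. */
enum { SUBSYS_INITCALL_LEVEL = 4, DEVICE_INITCALL_LEVEL = 6 };

/* Toy stand-in for a registered initcall (hypothetical type). */
struct toy_initcall {
	const char *name;
	int level;
};

/*
 * Order the calls the way the boot sequence would run them: ascending
 * by level. Insertion sort keeps entries of equal level in their
 * original relative order.
 */
static void order_initcalls(struct toy_initcall *calls, size_t n)
{
	assert(calls != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		struct toy_initcall key = calls[i];
		size_t j = i;

		while (j > 0 && calls[j - 1].level > key.level) {
			calls[j] = calls[j - 1];
			j--;
		}
		calls[j] = key;
	}
}
```

Registering a "mshv_root" entry at DEVICE_INITCALL_LEVEL and a "vmbus"
entry at SUBSYS_INITCALL_LEVEL, in either order, leaves "vmbus" first
after order_initcalls().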

Reported-by: Arnd Bergmann <arnd@arndb.de>
Closes: https://lore.kernel.org/all/20260520074044.923728-1-arnd@kernel.org/
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jork Loeser <jloeser@linux.microsoft.com>
---
Changes in v2:
* Instead of putting IS_ENABLED(CONFIG_HYPERV_VMBUS) around each of
  the two calls to hv_vmbus_exists() in mshv_synic.c, provide a stub
  for hv_vmbus_exists() when CONFIG_HYPERV_VMBUS is not set. The
  effect is the same as in v1, but the code is cleaner. [Jork Loeser]

Arnd: I've kept your Ack even though I changed how hv_vmbus_exists()
is stubbed out since the effect is the same. Let me know if
you have any concerns.

 drivers/hv/Kconfig     | 1 +
 include/linux/hyperv.h | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 2d0b3fcb0ff8..aa11bcefddf2 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -74,6 +74,7 @@ config MSHV_ROOT
 	# e.g. When withdrawing memory, the hypervisor gives back 4k pages in
 	# no particular order, making it impossible to reassemble larger pages
 	depends on PAGE_SIZE_4KB
+	depends on HYPERV_VMBUS if HYPERV_VMBUS
 	select EVENTFD
 	select VIRT_XFER_TO_GUEST_WORK
 	select HMM_MIRROR
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 41a3d82f0722..734b7ef98f4d 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1304,7 +1304,11 @@ static inline void *hv_get_drvdata(struct hv_device *dev)
 
 struct device *hv_get_vmbus_root_device(void);
 
+#if IS_ENABLED(CONFIG_HYPERV_VMBUS)
 bool hv_vmbus_exists(void);
+#else
+static inline bool hv_vmbus_exists(void) { return false; }
+#endif
 
 struct hv_ring_buffer_debug_info {
 	u32 current_interrupt_mask;
-- 
2.25.1


^ permalink raw reply related

* Re: [EXTERNAL] Re: [PATCH rdma-next v2] RDMA/mana_ib: hardening: Clamp adapter capability values from MANA_IB_GET_ADAPTER_CAP
From: Erni Sri Satya Vennela @ 2026-05-26 14:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Long Li, Leon Romanovsky, Konstantin Taranov,
	linux-rdma@vger.kernel.org, linux-hyperv@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20260525230155.GB2487554@ziepe.ca>

On Mon, May 25, 2026 at 08:01:55PM -0300, Jason Gunthorpe wrote:
> On Mon, May 25, 2026 at 11:58:17AM -0700, Erni Sri Satya Vennela wrote:
> > > “There is no reason they should be signed, you should just fix the
> > > type.”
> > 
> > It is not allowed to change sign in props, so clamping is the best bet.
> 
> Why not? Fix the core code, it is just old junk they are signed, they
> shouldn't ever have been.
> 
> Jason

Thanks for the feedback, Jason.

I sent v3 before seeing your comments on v2.
I'll be sending a v4 which drops the clamping entirely.

- Vennela

^ permalink raw reply

* RE: [PATCH] uio_hv_generic: Bind to FCopy device by default
From: Michael Kelley @ 2026-05-26 15:15 UTC (permalink / raw)
  To: Naman Jain, Ben Hutchings
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Greg Kroah-Hartman, linux-hyperv@vger.kernel.org
In-Reply-To: <aa420dc1-029c-408b-aef0-f02d6bfa002c@linux.microsoft.com>

From: Naman Jain <namjain@linux.microsoft.com> Sent: Tuesday, May 26, 2026 3:10 AM
> 
> On 5/26/2026 1:59 PM, Ben Hutchings wrote:
> > On Tue, 2026-05-26 at 12:15 +0530, Naman Jain wrote:
> >>
> >> On 5/25/2026 5:34 PM, Ben Hutchings wrote:
> >>> The Hyper-V kernel-mode fcopy driver was removed in 6.10 and the new
> >>> fcopy daemon requires this uio driver to function.  However, by
> >>> default the driver does not bind to any devices, and must be
> >>> configured through the sysfs "new_id" file.
> >>>
> >>> Since the FCopy device is now only usable through this driver, add its
> >>> ID to the driver's ID table so that the daemon will work "out of the
> >>> box".
> >>>
> >>> Signed-off-by: Ben Hutchings <benh@debian.org>
> >>> Fixes: ec314f61e4fc ("Drivers: hv: Remove fcopy driver")
> >>> ---
> >>> --- a/drivers/uio/uio_hv_generic.c
> >>> +++ b/drivers/uio/uio_hv_generic.c
> >>> @@ -395,9 +395,15 @@ hv_uio_remove(struct hv_device *dev)
> >>>    	vmbus_free_ring(dev->channel);
> >>>    }
> >>>
> >>> +static const struct hv_vmbus_device_id hv_uio_id_table[] = {
> >>> +	{ HV_FCOPY_GUID },
> >>> +	{}
> >>> +};
> >>> +MODULE_DEVICE_TABLE(vmbus, hv_uio_id_table);
> >>> +
> >>>    static struct hv_driver hv_uio_drv = {
> >>>    	.name = "uio_hv_generic",
> >>> -	.id_table = NULL, /* only dynamic id's */
> >>> +	.id_table = hv_uio_id_table,
> >>>    	.probe = hv_uio_probe,
> >>>    	.remove = hv_uio_remove,
> >>>    };
> >>
> 
> ++ recipients, assuming you mistakenly clicked reply instead of reply all.

Ben --

Regarding recipients, please include the full LKML
(linux-kernel@vger.kernel.org) on the original patch posting, even
though it is about a narrow Hyper-V issue. I dabble in areas beyond
just Hyper-V so subscribe to the full LKML instead of the
linux-hyperv mailing list. I miss patches like this one unless I happen
to be looking through the lore.kernel.org archives for linux-hyperv.

Thx,

Michael

> 
> 
> >> Two things worth considering before applying:
> >>
> >> 1. Please add Cc: stable@vger.kernel.org or is it that we do not want
> >> this to be ported to older kernels?
> >>
> >> 2. Every Hyper-V guest (with UIO_HV_GENERIC enabled) will now have an
> >> additional auto-bound /dev/uio0 node for FCopy.
> >
> > I don't think that's quite true.  I tested with a Windows 11 host and
> > needed to enable "Guest services" for the VM, which was disabled by
> > default.  But if that includes other features besides FCopy it might be
> > enabled for other reasons.
> >
> 
> Yes, meaning if these two conditions are satisfied (enabling guest
> services is also one time step for a Hyper-V VM), we would see uio0 by
> default for fcopy.
> 
> >> Anything that hardcodes
> >> /dev/uio0 (e.g. ad-hoc DPDK scripts that bind a NetVSC NIC via
> >> uio_hv_generic + new_id) may see its index shift, since FCopy now wins
> >> uio0 at boot.
> >
> > OK, so maybe I should implement the new_id dance in the fcopy service
> > startup, to avoid that?  I already looked at doing it in a systemd
> > unit, but it's hard to do right because adding the same ID twice is an
> > error.  Maybe the daemon itself could do it?
> 
> Implementing it in the uio daemon can introduce race conditions with sysfs
> creation. I guess it's OK then to implement it here, in the kernel.
> 
> >
> >> The fix for such consumers is the same thing DPDK and the
> >> in-tree daemon already do: resolve uio via
> >> /sys/bus/vmbus/devices/<guid>/uio/ rather than by number. This is not a
> >> regression in the patch, but it's a behavior change worth calling out.
> >
> > It would be a good reason *not* to make this change in stable.
> >
> > Ben.
> >
> 
> What issues are you fixing with this patch exactly? Is there any
> particular sequence of events you are targeting where the traditional
> approach does not work?
> 
> Regards,
> Naman


^ permalink raw reply

* Re: [PATCH] uio_hv_generic: Bind to FCopy device by default
From: Naman Jain @ 2026-05-26 15:49 UTC (permalink / raw)
  To: Michael Kelley, Ben Hutchings
  Cc: K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Greg Kroah-Hartman, linux-hyperv@vger.kernel.org
In-Reply-To: <SN6PR02MB41574FDA377FF59597181B7BD40B2@SN6PR02MB4157.namprd02.prod.outlook.com>



On 5/26/2026 8:45 PM, Michael Kelley wrote:
> From: Naman Jain <namjain@linux.microsoft.com> Sent: Tuesday, May 26, 2026 3:10 AM
>>
>> On 5/26/2026 1:59 PM, Ben Hutchings wrote:
>>> On Tue, 2026-05-26 at 12:15 +0530, Naman Jain wrote:
>>>>
>>>> On 5/25/2026 5:34 PM, Ben Hutchings wrote:
>>>>> The Hyper-V kernel-mode fcopy driver was removed in 6.10 and the new
>>>>> fcopy daemon requires this uio driver to function.  However, by
>>>>> default the driver does not bind to any devices, and must be
>>>>> configured through the sysfs "new_id" file.
>>>>>
>>>>> Since the FCopy device is now only usable through this driver, add its
>>>>> ID to the driver's ID table so that the daemon will work "out of the
>>>>> box".
>>>>>
>>>>> Signed-off-by: Ben Hutchings <benh@debian.org>
>>>>> Fixes: ec314f61e4fc ("Drivers: hv: Remove fcopy driver")
>>>>> ---
>>>>> --- a/drivers/uio/uio_hv_generic.c
>>>>> +++ b/drivers/uio/uio_hv_generic.c
>>>>> @@ -395,9 +395,15 @@ hv_uio_remove(struct hv_device *dev)
>>>>>     	vmbus_free_ring(dev->channel);
>>>>>     }
>>>>>
>>>>> +static const struct hv_vmbus_device_id hv_uio_id_table[] = {
>>>>> +	{ HV_FCOPY_GUID },
>>>>> +	{}
>>>>> +};
>>>>> +MODULE_DEVICE_TABLE(vmbus, hv_uio_id_table);
>>>>> +
>>>>>     static struct hv_driver hv_uio_drv = {
>>>>>     	.name = "uio_hv_generic",
>>>>> -	.id_table = NULL, /* only dynamic id's */
>>>>> +	.id_table = hv_uio_id_table,
>>>>>     	.probe = hv_uio_probe,
>>>>>     	.remove = hv_uio_remove,
>>>>>     };
>>>>
>>
>> ++ recipients, assuming you mistakenly clicked reply instead of reply all.
> 
> Ben --
> 
> Regarding recipients, please include the full LKML
> (linux-kernel@vger.kernel.org) on the original patch posting, even
> though it is about a narrow Hyper-V issue. I dabble in areas beyond
> just Hyper-V so subscribe to the full LKML instead of the
> linux-hyperv mailing list. I miss patches like this one unless I happen
> to be looking through the lore.kernel.org archives for linux-hyperv.
> 
> Thx,
> 
> Michael
> 
Sashiko also misses out on such patches if linux-kernel@vger.kernel.org 
is not included. I could not find this in https://sashiko.dev/#/.

Regards,
Naman

^ permalink raw reply

* [PATCH 0/2] Remove stack ib_udata's
From: Jason Gunthorpe @ 2026-05-26 16:15 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler,
	Potnuri Bharat Teja, Bryan Tan, Cheng Xu, Dennis Dalessandro,
	Junxian Huang, Kai Shen, Kalesh AP, Konstantin Taranov,
	Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
	Long Li, Michal Kalderon, Nelson Escobar, Satish Kharat,
	Selvin Xavier, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
	Yishai Hadas, Zhu Yanjun
  Cc: Leon Romanovsky, patches

Sashiko pointed out these are dangerous, and the create_qp() one is in
fact a bug. The query_device is just ugly old code.

Remove the stack ib_udata's from both places.

Jason Gunthorpe (2):
  RDMA/core: Don't make a dummy ib_udata on the stack in create_qp
  RDMA: Update the query_device() op

 drivers/infiniband/core/core_priv.h           |  2 +-
 drivers/infiniband/core/device.c              |  3 +--
 drivers/infiniband/core/ib_core_uverbs.c      | 12 +++++++++++
 drivers/infiniband/core/rdma_core.h           |  7 +++++++
 drivers/infiniband/core/uverbs_cmd.c          | 14 +------------
 drivers/infiniband/core/uverbs_std_types_qp.c |  3 +--
 drivers/infiniband/core/verbs.c               | 20 ++++++++++---------
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      |  5 ++++-
 drivers/infiniband/hw/cxgb4/provider.c        |  8 +++++---
 drivers/infiniband/hw/erdma/erdma_verbs.c     |  9 +++++++--
 drivers/infiniband/hw/hns/hns_roce_main.c     |  7 ++++++-
 drivers/infiniband/hw/ionic/ionic_ibdev.c     |  7 ++++++-
 drivers/infiniband/hw/irdma/verbs.c           |  8 +++++---
 drivers/infiniband/hw/mana/main.c             |  7 ++++++-
 drivers/infiniband/hw/mlx4/main.c             | 13 ++++++------
 drivers/infiniband/hw/mthca/mthca_provider.c  | 13 +++++++-----
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |  8 +++++---
 drivers/infiniband/hw/qedr/verbs.c            |  7 ++++++-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c  |  8 +++++---
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.c   |  8 +++++---
 drivers/infiniband/sw/rdmavt/vt.c             |  9 ++++++---
 drivers/infiniband/sw/rxe/rxe_verbs.c         | 14 ++++---------
 drivers/infiniband/sw/siw/siw_verbs.c         |  8 +++++---
 23 files changed, 124 insertions(+), 76 deletions(-)


base-commit: fd9482545e37fb6b7e04b588ad2bd80a2779776c
-- 
2.43.0


^ permalink raw reply

* [PATCH 1/2] RDMA/core: Don't make a dummy ib_udata on the stack in create_qp
From: Jason Gunthorpe @ 2026-05-26 16:15 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler,
	Potnuri Bharat Teja, Bryan Tan, Cheng Xu, Dennis Dalessandro,
	Junxian Huang, Kai Shen, Kalesh AP, Konstantin Taranov,
	Krzysztof Czurylo, Leon Romanovsky, linux-hyperv, linux-rdma,
	Long Li, Michal Kalderon, Nelson Escobar, Satish Kharat,
	Selvin Xavier, Chengchang Tang, Tatyana Nikolova, Vishnu Dasa,
	Yishai Hadas, Zhu Yanjun
  Cc: Leon Romanovsky, patches
In-Reply-To: <0-v1-922fa8e828ba+f7-ib_udata_stack_jgg@nvidia.com>

Sashiko points out the udata for destruction has to be created using
uverbs_get_cleared_udata(). Move it to ib_core_uverbs.c so that the core
qp code can call it. Rework the call chain to pass the struct
uverbs_attr_bundle right up to the driver op callback.

Fixes a possible wild stack reference in drivers during error unwinding:
mlx5 can call rdma_udata_to_drv_context() from destroy_qp() when
destroying a QP.
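
To illustrate why the udata has to live inside the bundle, here is a
small userspace sketch with mock types. It assumes, as I understand the
helper, that rdma_udata_to_drv_context() gets back to the bundle with
container_of() on the driver_udata member. The mock_* names are
hypothetical and only mirror the shape of the real structs.

```c
#include <assert.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock stand-ins for struct ib_udata and struct uverbs_attr_bundle. */
struct mock_udata {
	size_t inlen;
	size_t outlen;
};

struct mock_bundle {
	void *context;
	struct mock_udata driver_udata;
};

/* Same idea as uverbs_get_cleared_udata(): wipe the udata in place. */
static struct mock_udata *mock_get_cleared_udata(struct mock_bundle *b)
{
	assert(b != NULL);
	b->driver_udata = (struct mock_udata){ 0 };
	return &b->driver_udata;
}

/*
 * Walk back from the udata to its enclosing bundle. This is only valid
 * when udata really is the driver_udata member of a bundle. For a
 * standalone udata on the stack the computed pointer refers to
 * unrelated stack memory.
 */
static void *mock_udata_to_context(struct mock_udata *udata)
{
	if (!udata)
		return NULL;
	return container_of(udata, struct mock_bundle, driver_udata)->context;
}
```

With the cleared udata taken from the bundle, the walk back to the
context stays valid while the driver still sees no user arguments.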

Fixes: 00a79d6b996d ("RDMA/core: Configure selinux QP during creation")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/core/core_priv.h           |  2 +-
 drivers/infiniband/core/ib_core_uverbs.c      | 12 +++++++++++
 drivers/infiniband/core/rdma_core.h           |  7 +++++++
 drivers/infiniband/core/uverbs_cmd.c          | 14 +------------
 drivers/infiniband/core/uverbs_std_types_qp.c |  3 +--
 drivers/infiniband/core/verbs.c               | 20 ++++++++++---------
 6 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index a2c36666e6fcb9..19104c542b270d 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -321,7 +321,7 @@ void nldev_exit(void);
 
 struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd,
 				struct ib_qp_init_attr *attr,
-				struct ib_udata *udata,
+				struct uverbs_attr_bundle *uattrs,
 				struct ib_uqp_object *uobj, const char *caller);
 
 void ib_qp_usecnt_inc(struct ib_qp *qp);
diff --git a/drivers/infiniband/core/ib_core_uverbs.c b/drivers/infiniband/core/ib_core_uverbs.c
index b4fc693a3bd8b7..6c3bc9ca1d58ef 100644
--- a/drivers/infiniband/core/ib_core_uverbs.c
+++ b/drivers/infiniband/core/ib_core_uverbs.c
@@ -532,6 +532,18 @@ int uverbs_destroy_def_handler(struct uverbs_attr_bundle *attrs)
 }
 EXPORT_SYMBOL(uverbs_destroy_def_handler);
 
+/*
+ * When calling a destroy function during an error unwind we need to pass in
+ * the udata that is sanitized of all user arguments. Ie from the driver
+ * perspective it looks like no udata was passed.
+ */
+struct ib_udata *uverbs_get_cleared_udata(struct uverbs_attr_bundle *attrs)
+{
+	attrs->driver_udata = (struct ib_udata){};
+	return &attrs->driver_udata;
+}
+EXPORT_SYMBOL_NS_GPL(uverbs_get_cleared_udata, "rdma_core");
+
 /**
  * _uverbs_alloc() - Quickly allocate memory for use with a bundle
  * @bundle: The bundle
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index b626d3d24d087d..56121103e9f4f5 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -71,7 +71,14 @@ int uverbs_output_written(const struct uverbs_attr_bundle *bundle, size_t idx);
 
 void setup_ufile_idr_uobject(struct ib_uverbs_file *ufile);
 
+#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
 struct ib_udata *uverbs_get_cleared_udata(struct uverbs_attr_bundle *attrs);
+#else
+static inline struct ib_udata *uverbs_get_cleared_udata(struct uverbs_attr_bundle *attrs)
+{
+	return NULL;
+}
+#endif
 
 /*
  * This is the runtime description of the uverbs API, used by the syscall
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 32914007bae66f..41ad11ae1123b7 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -163,17 +163,6 @@ static int uverbs_request_finish(struct uverbs_req_iter *iter)
 	return 0;
 }
 
-/*
- * When calling a destroy function during an error unwind we need to pass in
- * the udata that is sanitized of all user arguments. Ie from the driver
- * perspective it looks like no udata was passed.
- */
-struct ib_udata *uverbs_get_cleared_udata(struct uverbs_attr_bundle *attrs)
-{
-	attrs->driver_udata = (struct ib_udata){};
-	return &attrs->driver_udata;
-}
-
 static struct ib_uverbs_completion_event_file *
 _ib_uverbs_lookup_comp_file(s32 fd, struct uverbs_attr_bundle *attrs)
 {
@@ -1462,8 +1451,7 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
 		attr.source_qpn = cmd->source_qpn;
 	}
 
-	qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata, obj,
-			       KBUILD_MODNAME);
+	qp = ib_create_qp_user(device, pd, &attr, attrs, obj, KBUILD_MODNAME);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
 		goto err_put;
diff --git a/drivers/infiniband/core/uverbs_std_types_qp.c b/drivers/infiniband/core/uverbs_std_types_qp.c
index be0730e8509ed9..fd617903ffcf49 100644
--- a/drivers/infiniband/core/uverbs_std_types_qp.c
+++ b/drivers/infiniband/core/uverbs_std_types_qp.c
@@ -248,8 +248,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
 	set_caps(&attr, &cap, true);
 	mutex_init(&obj->mcast_lock);
 
-	qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata, obj,
-			       KBUILD_MODNAME);
+	qp = ib_create_qp_user(device, pd, &attr, attrs, obj, KBUILD_MODNAME);
 	if (IS_ERR(qp)) {
 		ret = PTR_ERR(qp);
 		goto err_put;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index bac87de9cc6735..1500bc09bdc915 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -53,6 +53,7 @@
 #include <rdma/rw.h>
 #include <rdma/lag.h>
 
+#include "rdma_core.h"
 #include "core_priv.h"
 #include <trace/events/rdma_core.h>
 
@@ -1265,10 +1266,9 @@ static struct ib_qp *create_xrc_qp_user(struct ib_qp *qp,
 
 static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
 			       struct ib_qp_init_attr *attr,
-			       struct ib_udata *udata,
+			       struct uverbs_attr_bundle *uattrs,
 			       struct ib_uqp_object *uobj, const char *caller)
 {
-	struct ib_udata dummy = {};
 	struct ib_qp *qp;
 	int ret;
 
@@ -1301,9 +1301,10 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
 	qp->recv_cq = attr->recv_cq;
 
 	rdma_restrack_new(&qp->res, RDMA_RESTRACK_QP);
-	WARN_ONCE(!udata && !caller, "Missing kernel QP owner");
-	rdma_restrack_set_name(&qp->res, udata ? NULL : caller);
-	ret = dev->ops.create_qp(qp, attr, udata);
+	WARN_ONCE(!uattrs && !caller, "Missing kernel QP owner");
+	rdma_restrack_set_name(&qp->res, uattrs ? NULL : caller);
+	ret = dev->ops.create_qp(qp, attr,
+				 uattrs ? &uattrs->driver_udata : NULL);
 	if (ret)
 		goto err_create;
 
@@ -1322,7 +1323,8 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
 	return qp;
 
 err_security:
-	qp->device->ops.destroy_qp(qp, udata ? &dummy : NULL);
+	qp->device->ops.destroy_qp(
+		qp, uattrs ? uverbs_get_cleared_udata(uattrs) : NULL);
 err_create:
 	rdma_restrack_put(&qp->res);
 	kfree(qp);
@@ -1338,13 +1340,13 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
  * @attr: A list of initial attributes required to create the
  *   QP.  If QP creation succeeds, then the attributes are updated to
  *   the actual capabilities of the created QP.
- * @udata: User data
+ * @uattrs: User ioctl attributes and udata
  * @uobj: uverbs obect
  * @caller: caller's build-time module name
  */
 struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd,
 				struct ib_qp_init_attr *attr,
-				struct ib_udata *udata,
+				struct uverbs_attr_bundle *uattrs,
 				struct ib_uqp_object *uobj, const char *caller)
 {
 	struct ib_qp *qp, *xrc_qp;
@@ -1352,7 +1354,7 @@ struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd,
 	if (attr->qp_type == IB_QPT_XRC_TGT)
 		qp = create_qp(dev, pd, attr, NULL, NULL, caller);
 	else
-		qp = create_qp(dev, pd, attr, udata, uobj, NULL);
+		qp = create_qp(dev, pd, attr, uattrs, uobj, NULL);
 	if (attr->qp_type != IB_QPT_XRC_TGT || IS_ERR(qp))
 		return qp;
 
-- 
2.43.0


^ permalink raw reply related

