Linux-HyperV List


* [PATCH v2 11/16] RDMA: Use ib_copy_validate_udata_in_cm() for zero comp_mask
From: Jason Gunthorpe @ 2026-03-25 21:26 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun
  Cc: Long Li, patches
In-Reply-To: <0-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

All of these cases require a 0 comp_mask. Consolidate them by calling
ib_copy_validate_udata_in_cm() with a valid mask of 0 and remove the
open-coded comp_mask test.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/efa/efa_verbs.c |  8 ++++----
 drivers/infiniband/hw/mlx4/main.c     |  5 +----
 drivers/infiniband/hw/mlx4/qp.c       | 13 ++++++-------
 drivers/infiniband/hw/mlx5/qp.c       |  4 ++--
 4 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 8d9357e2d513bb..064d5136ba405d 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -699,11 +699,11 @@ int efa_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
 	if (err)
 		goto err_out;
 
-	err = ib_copy_validate_udata_in(udata, cmd, driver_qp_type);
+	err = ib_copy_validate_udata_in_cm(udata, cmd, driver_qp_type, 0);
 	if (err)
 		goto err_out;
 
-	if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_98)) {
+	if (!is_reserved_cleared(cmd.reserved_98)) {
 		ibdev_dbg(&dev->ibdev,
 			  "Incompatible ABI params, unknown fields in udata\n");
 		err = -EINVAL;
@@ -1140,11 +1140,11 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		goto err_out;
 	}
 
-	err = ib_copy_validate_udata_in(udata, cmd, num_sub_cqs);
+	err = ib_copy_validate_udata_in_cm(udata, cmd, num_sub_cqs, 0);
 	if (err)
 		goto err_out;
 
-	if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_58)) {
+	if (!is_reserved_cleared(cmd.reserved_58)) {
 		ibdev_dbg(ibdev,
 			  "Incompatible ABI params, unknown fields in udata\n");
 		err = -EINVAL;
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 16e4cffbd7a84d..037f02b5f28fb5 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -446,13 +446,10 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
 	struct mlx4_clock_params clock_params;
 
 	if (uhw->inlen) {
-		err = ib_copy_validate_udata_in(uhw, cmd, reserved);
+		err = ib_copy_validate_udata_in_cm(uhw, cmd, reserved, 0);
 		if (err)
 			return err;
 
-		if (cmd.comp_mask)
-			return -EINVAL;
-
 		if (cmd.reserved)
 			return -EINVAL;
 	}
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 40ddd723d7b549..cfb54ffcaac22c 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -720,7 +720,7 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
 	if (udata->outlen)
 		return -EOPNOTSUPP;
 
-	err = ib_copy_validate_udata_in(udata, ucmd, reserved1);
+	err = ib_copy_validate_udata_in_cm(udata, ucmd, reserved1, 0);
 	if (err) {
 		pr_debug("copy failed\n");
 		return err;
@@ -729,7 +729,7 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
 	if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved)))
 		return -EOPNOTSUPP;
 
-	if (ucmd.comp_mask || ucmd.reserved1)
+	if (ucmd.reserved1)
 		return -EOPNOTSUPP;
 
 	if (init_attr->qp_type != IB_QPT_RAW_PACKET) {
@@ -866,12 +866,11 @@ static int create_rq(struct ib_pd *pd, struct ib_qp_init_attr *init_attr,
 
 	qp->state = IB_QPS_RESET;
 
-	err = ib_copy_validate_udata_in(udata, wq, comp_mask);
+	err = ib_copy_validate_udata_in_cm(udata, wq, comp_mask, 0);
 	if (err)
 		goto err;
 
-	if (wq.comp_mask || wq.reserved[0] || wq.reserved[1] ||
-	    wq.reserved[2]) {
+	if (wq.reserved[0] || wq.reserved[1] || wq.reserved[2]) {
 		pr_debug("user command isn't supported\n");
 		err = -EOPNOTSUPP;
 		goto err;
@@ -4235,11 +4234,11 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
 	enum ib_wq_state cur_state, new_state;
 	int err;
 
-	err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+	err = ib_copy_validate_udata_in_cm(udata, ucmd, reserved, 0);
 	if (err)
 		return err;
 
-	if (ucmd.comp_mask || ucmd.reserved)
+	if (ucmd.reserved)
 		return -EOPNOTSUPP;
 
 	if (wq_attr_mask & IB_WQ_FLAGS)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index d4d5e0d457a0b5..68c6e107747693 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -5611,11 +5611,11 @@ int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 	void *rqc;
 	void *in;
 
-	err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+	err = ib_copy_validate_udata_in_cm(udata, ucmd, reserved, 0);
 	if (err)
 		return err;
 
-	if (ucmd.comp_mask || ucmd.reserved)
+	if (ucmd.reserved)
 		return -EOPNOTSUPP;
 
 	inlen = MLX5_ST_SZ_BYTES(modify_rq_in);
-- 
2.43.0



* [PATCH v2 12/16] RDMA/mlx5: Pull comp_mask validation into ib_copy_validate_udata_in_cm()
From: Jason Gunthorpe @ 2026-03-25 21:26 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun
  Cc: Long Li, patches
In-Reply-To: <0-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

Directly check the supported comp_mask bitmap using
ib_copy_validate_udata_in_cm() and remove the open coding.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/mlx5/qp.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 68c6e107747693..3b602ed0a2dafc 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4707,12 +4707,12 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		return -ENOSYS;
 
 	if (udata && udata->inlen) {
-		err = ib_copy_validate_udata_in(udata, ucmd, ece_options);
+		err = ib_copy_validate_udata_in_cm(udata, ucmd, ece_options,
+						   MLX5_IB_MODIFY_QP_OOO_DP);
 		if (err)
 			return err;
 
-		if (ucmd.comp_mask & ~MLX5_IB_MODIFY_QP_OOO_DP ||
-		    memchr_inv(&ucmd.burst_info.reserved, 0,
+		if (memchr_inv(&ucmd.burst_info.reserved, 0,
 			       sizeof(ucmd.burst_info.reserved)))
 			return -EOPNOTSUPP;
 
@@ -5381,17 +5381,16 @@ static int prepare_user_rq(struct ib_pd *pd,
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct mlx5_ib_create_wq ucmd = {};
 	int err;
-	err = ib_copy_validate_udata_in(udata, ucmd,
-					single_stride_log_num_of_bytes);
+
+	err = ib_copy_validate_udata_in_cm(udata, ucmd,
+					   single_stride_log_num_of_bytes,
+					   MLX5_IB_CREATE_WQ_STRIDING_RQ);
 	if (err) {
 		mlx5_ib_dbg(dev, "copy failed\n");
 		return err;
 	}
 
-	if (ucmd.comp_mask & (~MLX5_IB_CREATE_WQ_STRIDING_RQ)) {
-		mlx5_ib_dbg(dev, "invalid comp mask\n");
-		return -EOPNOTSUPP;
-	} else if (ucmd.comp_mask & MLX5_IB_CREATE_WQ_STRIDING_RQ) {
+	if (ucmd.comp_mask & MLX5_IB_CREATE_WQ_STRIDING_RQ) {
 		if (!MLX5_CAP_GEN(dev->mdev, striding_rq)) {
 			mlx5_ib_dbg(dev, "Striding RQ is not supported\n");
 			return -EOPNOTSUPP;
-- 
2.43.0
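
The comp_mask rule that the two patches above move into
ib_copy_validate_udata_in_cm() can be pictured with a small userspace
sketch. It is only an illustration under stated assumptions:
demo_check_comp_mask() and the DEMO_FLAG_* values are hypothetical names,
and the error code is a guess modelled on the removed mlx4/mlx5 checks,
not on the real helper.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_FLAG_A (1u << 0)
#define DEMO_FLAG_B (1u << 1)

/*
 * Reject any comp_mask bit the caller did not list as supported.
 * Passing valid_mask == 0 reproduces the "comp_mask must be zero" rule.
 */
static int demo_check_comp_mask(uint32_t comp_mask, uint32_t valid_mask)
{
	if (comp_mask & ~valid_mask)
		return -EOPNOTSUPP;	/* assumed error code */
	return 0;
}
```

With this shape, a driver that supports no optional fields passes 0, and a
driver that supports one flag passes that flag, so the per-driver
"comp_mask & ~SUPPORTED" tests are no longer needed.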



* [PATCH v2 09/16] RDMA/mlx4: Use ib_copy_validate_udata_in() for QP
From: Jason Gunthorpe @ 2026-03-25 21:26 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun
  Cc: Long Li, patches
In-Reply-To: <0-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

Move the validation of the udata to the same function that copies it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/mlx4/qp.c | 25 +++----------------------
 1 file changed, 3 insertions(+), 22 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index deb1b0306aa7a1..40ddd723d7b549 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -854,7 +854,6 @@ static int create_rq(struct ib_pd *pd, struct ib_qp_init_attr *init_attr,
 	unsigned long flags;
 	int range_size;
 	struct mlx4_ib_create_wq wq;
-	size_t copy_len;
 	int shift;
 	int n;
 
@@ -867,12 +866,9 @@ static int create_rq(struct ib_pd *pd, struct ib_qp_init_attr *init_attr,
 
 	qp->state = IB_QPS_RESET;
 
-	copy_len = min(sizeof(struct mlx4_ib_create_wq), udata->inlen);
-
-	if (ib_copy_from_udata(&wq, udata, copy_len)) {
-		err = -EFAULT;
+	err = ib_copy_validate_udata_in(udata, wq, comp_mask);
+	if (err)
 		goto err;
-	}
 
 	if (wq.comp_mask || wq.reserved[0] || wq.reserved[1] ||
 	    wq.reserved[2]) {
@@ -4112,26 +4108,11 @@ struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
 	struct mlx4_dev *dev = to_mdev(pd->device)->dev;
 	struct ib_qp_init_attr ib_qp_init_attr = {};
 	struct mlx4_ib_qp *qp;
-	struct mlx4_ib_create_wq ucmd;
-	int err, required_cmd_sz;
+	int err;
 
 	if (!udata)
 		return ERR_PTR(-EINVAL);
 
-	required_cmd_sz = offsetof(typeof(ucmd), comp_mask) +
-			  sizeof(ucmd.comp_mask);
-	if (udata->inlen < required_cmd_sz) {
-		pr_debug("invalid inlen\n");
-		return ERR_PTR(-EINVAL);
-	}
-
-	if (udata->inlen > sizeof(ucmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(ucmd),
-				 udata->inlen - sizeof(ucmd))) {
-		pr_debug("inlen is not supported\n");
-		return ERR_PTR(-EOPNOTSUPP);
-	}
-
 	if (udata->outlen)
 		return ERR_PTR(-EOPNOTSUPP);
 
-- 
2.43.0



* [PATCH v2 01/16] RDMA: Consolidate patterns with offsetofend() to ib_copy_validate_udata_in()
From: Jason Gunthorpe @ 2026-03-25 21:26 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun
  Cc: Long Li, patches
In-Reply-To: <0-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

Go treewide and consolidate all existing patterns using:

* offsetofend() and variations
* ib_is_udata_cleared()
* ib_copy_from_udata()

into a direct call to the new ib_copy_validate_udata_in().

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/efa/efa_verbs.c | 47 +++---------------------
 drivers/infiniband/hw/irdma/verbs.c   | 10 +++---
 drivers/infiniband/hw/mlx4/qp.c       | 38 ++++----------------
 drivers/infiniband/hw/mlx5/qp.c       | 51 ++++++---------------------
 4 files changed, 26 insertions(+), 120 deletions(-)

diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index fc498663cd372f..8d9357e2d513bb 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -699,29 +699,9 @@ int efa_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *init_attr,
 	if (err)
 		goto err_out;
 
-	if (offsetofend(typeof(cmd), driver_qp_type) > udata->inlen) {
-		ibdev_dbg(&dev->ibdev,
-			  "Incompatible ABI params, no input udata\n");
-		err = -EINVAL;
+	err = ib_copy_validate_udata_in(udata, cmd, driver_qp_type);
+	if (err)
 		goto err_out;
-	}
-
-	if (udata->inlen > sizeof(cmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(cmd),
-				 udata->inlen - sizeof(cmd))) {
-		ibdev_dbg(&dev->ibdev,
-			  "Incompatible ABI params, unknown fields in udata\n");
-		err = -EINVAL;
-		goto err_out;
-	}
-
-	err = ib_copy_from_udata(&cmd, udata,
-				 min(sizeof(cmd), udata->inlen));
-	if (err) {
-		ibdev_dbg(&dev->ibdev,
-			  "Cannot copy udata for create_qp\n");
-		goto err_out;
-	}
 
 	if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_98)) {
 		ibdev_dbg(&dev->ibdev,
@@ -1160,28 +1140,9 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		goto err_out;
 	}
 
-	if (offsetofend(typeof(cmd), num_sub_cqs) > udata->inlen) {
-		ibdev_dbg(ibdev,
-			  "Incompatible ABI params, no input udata\n");
-		err = -EINVAL;
+	err = ib_copy_validate_udata_in(udata, cmd, num_sub_cqs);
+	if (err)
 		goto err_out;
-	}
-
-	if (udata->inlen > sizeof(cmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(cmd),
-				 udata->inlen - sizeof(cmd))) {
-		ibdev_dbg(ibdev,
-			  "Incompatible ABI params, unknown fields in udata\n");
-		err = -EINVAL;
-		goto err_out;
-	}
-
-	err = ib_copy_from_udata(&cmd, udata,
-				 min(sizeof(cmd), udata->inlen));
-	if (err) {
-		ibdev_dbg(ibdev, "Cannot copy udata for create_cq\n");
-		goto err_out;
-	}
 
 	if (cmd.comp_mask || !is_reserved_cleared(cmd.reserved_58)) {
 		ibdev_dbg(ibdev,
diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 7251cd7a21471e..b2978632241900 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -284,7 +284,6 @@ static void irdma_alloc_push_page(struct irdma_qp *iwqp)
 static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
 				struct ib_udata *udata)
 {
-#define IRDMA_ALLOC_UCTX_MIN_REQ_LEN offsetofend(struct irdma_alloc_ucontext_req, rsvd8)
 #define IRDMA_ALLOC_UCTX_MIN_RESP_LEN offsetofend(struct irdma_alloc_ucontext_resp, rsvd)
 	struct ib_device *ibdev = uctx->device;
 	struct irdma_device *iwdev = to_iwdev(ibdev);
@@ -292,13 +291,14 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
 	struct irdma_alloc_ucontext_resp uresp = {};
 	struct irdma_ucontext *ucontext = to_ucontext(uctx);
 	struct irdma_uk_attrs *uk_attrs = &iwdev->rf->sc_dev.hw_attrs.uk_attrs;
+	int ret;
 
-	if (udata->inlen < IRDMA_ALLOC_UCTX_MIN_REQ_LEN ||
-	    udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
+	if (udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
 		return -EINVAL;
 
-	if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen)))
-		return -EINVAL;
+	ret = ib_copy_validate_udata_in(udata, req, rsvd8);
+	if (ret)
+		return ret;
 
 	if (req.userspace_ver < 4 || req.userspace_ver > IRDMA_ABI_VER)
 		goto ver_error;
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 1cb890d3d93cea..b87a4b7949a3a0 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -710,7 +710,6 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
 				  struct ib_udata *udata)
 {
 	struct mlx4_ib_create_qp_rss ucmd = {};
-	size_t required_cmd_sz;
 	int err;
 
 	if (!udata) {
@@ -721,16 +720,10 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
 	if (udata->outlen)
 		return -EOPNOTSUPP;
 
-	required_cmd_sz = offsetof(typeof(ucmd), reserved1) +
-					sizeof(ucmd.reserved1);
-	if (udata->inlen < required_cmd_sz) {
-		pr_debug("invalid inlen\n");
-		return -EINVAL;
-	}
-
-	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+	err = ib_copy_validate_udata_in(udata, ucmd, reserved1);
+	if (err) {
 		pr_debug("copy failed\n");
-		return -EFAULT;
+		return err;
 	}
 
 	if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved)))
@@ -739,13 +732,6 @@ static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp,
 	if (ucmd.comp_mask || ucmd.reserved1)
 		return -EOPNOTSUPP;
 
-	if (udata->inlen > sizeof(ucmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(ucmd),
-				 udata->inlen - sizeof(ucmd))) {
-		pr_debug("inlen is not supported\n");
-		return -EOPNOTSUPP;
-	}
-
 	if (init_attr->qp_type != IB_QPT_RAW_PACKET) {
 		pr_debug("RSS QP with unsupported QP type %d\n",
 			 init_attr->qp_type);
@@ -4269,22 +4255,12 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
 {
 	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
 	struct mlx4_ib_modify_wq ucmd = {};
-	size_t required_cmd_sz;
 	enum ib_wq_state cur_state, new_state;
-	int err = 0;
+	int err;
 
-	required_cmd_sz = offsetof(typeof(ucmd), reserved) +
-				   sizeof(ucmd.reserved);
-	if (udata->inlen < required_cmd_sz)
-		return -EINVAL;
-
-	if (udata->inlen > sizeof(ucmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(ucmd),
-				 udata->inlen - sizeof(ucmd)))
-		return -EOPNOTSUPP;
-
-	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
-		return -EFAULT;
+	err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+	if (err)
+		return err;
 
 	if (ucmd.comp_mask || ucmd.reserved)
 		return -EOPNOTSUPP;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 59f9ddb35d4620..d4d5e0d457a0b5 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4707,17 +4707,9 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		return -ENOSYS;
 
 	if (udata && udata->inlen) {
-		if (udata->inlen < offsetofend(typeof(ucmd), ece_options))
-			return -EINVAL;
-
-		if (udata->inlen > sizeof(ucmd) &&
-		    !ib_is_udata_cleared(udata, sizeof(ucmd),
-					 udata->inlen - sizeof(ucmd)))
-			return -EOPNOTSUPP;
-
-		if (ib_copy_from_udata(&ucmd, udata,
-				       min(udata->inlen, sizeof(ucmd))))
-			return -EFAULT;
+		err = ib_copy_validate_udata_in(udata, ucmd, ece_options);
+		if (err)
+			return err;
 
 		if (ucmd.comp_mask & ~MLX5_IB_MODIFY_QP_OOO_DP ||
 		    memchr_inv(&ucmd.burst_info.reserved, 0,
@@ -5389,25 +5381,11 @@ static int prepare_user_rq(struct ib_pd *pd,
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct mlx5_ib_create_wq ucmd = {};
 	int err;
-	size_t required_cmd_sz;
-
-	required_cmd_sz = offsetofend(struct mlx5_ib_create_wq,
-				      single_stride_log_num_of_bytes);
-	if (udata->inlen < required_cmd_sz) {
-		mlx5_ib_dbg(dev, "invalid inlen\n");
-		return -EINVAL;
-	}
-
-	if (udata->inlen > sizeof(ucmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(ucmd),
-				 udata->inlen - sizeof(ucmd))) {
-		mlx5_ib_dbg(dev, "inlen is not supported\n");
-		return -EOPNOTSUPP;
-	}
-
-	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+	err = ib_copy_validate_udata_in(udata, ucmd,
+					single_stride_log_num_of_bytes);
+	if (err) {
 		mlx5_ib_dbg(dev, "copy failed\n");
-		return -EFAULT;
+		return err;
 	}
 
 	if (ucmd.comp_mask & (~MLX5_IB_CREATE_WQ_STRIDING_RQ)) {
@@ -5626,7 +5604,6 @@ int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 	struct mlx5_ib_dev *dev = to_mdev(wq->device);
 	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
 	struct mlx5_ib_modify_wq ucmd = {};
-	size_t required_cmd_sz;
 	int curr_wq_state;
 	int wq_state;
 	int inlen;
@@ -5634,17 +5611,9 @@ int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 	void *rqc;
 	void *in;
 
-	required_cmd_sz = offsetofend(struct mlx5_ib_modify_wq, reserved);
-	if (udata->inlen < required_cmd_sz)
-		return -EINVAL;
-
-	if (udata->inlen > sizeof(ucmd) &&
-	    !ib_is_udata_cleared(udata, sizeof(ucmd),
-				 udata->inlen - sizeof(ucmd)))
-		return -EOPNOTSUPP;
-
-	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
-		return -EFAULT;
+	err = ib_copy_validate_udata_in(udata, ucmd, reserved);
+	if (err)
+		return err;
 
 	if (ucmd.comp_mask || ucmd.reserved)
 		return -EOPNOTSUPP;
-- 
2.43.0
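
The three open-coded steps this patch consolidates (a minimum-length check
via offsetofend(), a zero check on trailing bytes, and a bounded copy) can
be modelled in plain userspace C. This is only a sketch: demo_req,
all_zero() and demo_copy_validate() are hypothetical names, memcpy() stands
in for the copy from user memory, and the error codes mirror the removed
mlx4/mlx5 blocks rather than whatever the real helper returns.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request layout, for illustration only. */
struct demo_req {
	unsigned int comp_mask;
	unsigned int reserved;
};

/* Returns 1 when every byte in buf[0..len) is zero. */
static int all_zero(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}

/*
 * Model of the removed open-coded sequence:
 *  1) inlen must reach the end of the last required member,
 *  2) any bytes beyond sizeof(*req) must be zero,
 *  3) copy min(sizeof(*req), inlen) bytes.
 * min_len plays the role of offsetofend(typeof(*req), member).
 */
static int demo_copy_validate(struct demo_req *req, const void *in,
			      size_t inlen, size_t min_len)
{
	const unsigned char *p = in;

	if (inlen < min_len)
		return -EINVAL;
	if (inlen > sizeof(*req) &&
	    !all_zero(p + sizeof(*req), inlen - sizeof(*req)))
		return -EOPNOTSUPP;

	/* Zero first so a short input leaves the tail of *req cleared. */
	memset(req, 0, sizeof(*req));
	memcpy(req, p, inlen < sizeof(*req) ? inlen : sizeof(*req));
	return 0;
}
```

Accepting a longer input whose extra bytes are all zero is what lets a
newer userspace talk to an older kernel as long as it does not use the
newer fields.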



* [PATCH v2 08/16] RDMA/mlx4: Use ib_copy_validate_udata_in()
From: Jason Gunthorpe @ 2026-03-25 21:26 UTC (permalink / raw)
  To: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun
  Cc: Long Li, patches
In-Reply-To: <0-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

For each struct, use the member that was last at the point
MLX4_IB_UVERBS_ABI_VERSION was set to 4 as the last required member.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/mlx4/cq.c  | 10 +++++-----
 drivers/infiniband/hw/mlx4/qp.c  |  8 ++------
 drivers/infiniband/hw/mlx4/srq.c |  5 +++--
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 8535fd561691d7..ed4c2e740670be 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -168,10 +168,9 @@ int mlx4_ib_create_user_cq(struct ib_cq *ibcq,
 	INIT_LIST_HEAD(&cq->send_qp_list);
 	INIT_LIST_HEAD(&cq->recv_qp_list);
 
-	if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
-		err = -EFAULT;
+	err = ib_copy_validate_udata_in(udata, ucmd, db_addr);
+	if (err)
 		goto err_cq;
-	}
 
 	buf_addr = (void *)(unsigned long)ucmd.buf_addr;
 
@@ -332,8 +331,9 @@ static int mlx4_alloc_resize_umem(struct mlx4_ib_dev *dev, struct mlx4_ib_cq *cq
 	if (cq->resize_umem)
 		return -EBUSY;
 
-	if (ib_copy_from_udata(&ucmd, udata, sizeof ucmd))
-		return -EFAULT;
+	err = ib_copy_validate_udata_in(udata, ucmd, buf_addr);
+	if (err)
+		return err;
 
 	cq->resize_buf = kmalloc_obj(*cq->resize_buf);
 	if (!cq->resize_buf)
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index b87a4b7949a3a0..deb1b0306aa7a1 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -1053,16 +1053,12 @@ static int create_qp_common(struct ib_pd *pd, struct ib_qp_init_attr *init_attr,
 
 	if (udata) {
 		struct mlx4_ib_create_qp ucmd;
-		size_t copy_len;
 		int shift;
 		int n;
 
-		copy_len = sizeof(struct mlx4_ib_create_qp);
-
-		if (ib_copy_from_udata(&ucmd, udata, copy_len)) {
-			err = -EFAULT;
+		err = ib_copy_validate_udata_in(udata, ucmd, sq_no_prefetch);
+		if (err)
 			goto err;
-		}
 
 		qp->inl_recv_sz = ucmd.inl_recv_sz;
 
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index c4cf91235eee3a..5b23e5f8b84aca 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -111,8 +111,9 @@ int mlx4_ib_create_srq(struct ib_srq *ib_srq,
 	if (udata) {
 		struct mlx4_ib_create_srq ucmd;
 
-		if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd)))
-			return -EFAULT;
+		err = ib_copy_validate_udata_in(udata, ucmd, db_addr);
+		if (err)
+			return err;
 
 		srq->umem =
 			ib_umem_get(ib_srq->device, ucmd.buf_addr, buf_size, 0);
-- 
2.43.0



* Re: [PATCH v2 14/16] RDMA/irdma: Add missing comp_mask check in alloc_ucontext
From: Jacob Moroni @ 2026-03-25 22:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Abhijit Gangurde, Allen Hubbe,
	Broadcom internal kernel review list, Bernard Metzler, Bryan Tan,
	Cheng Xu, Gal Pressman, Junxian Huang, Kai Shen,
	Konstantin Taranov, Krzysztof Czurylo, Leon Romanovsky,
	linux-hyperv, linux-rdma, Michal Kalderon, Michael Margolin,
	Nelson Escobar, Satish Kharat, Selvin Xavier, Yossi Leybovich,
	Chengchang Tang, Tatyana Nikolova, Vishnu Dasa, Yishai Hadas,
	Zhu Yanjun, Long Li, patches
In-Reply-To: <14-v2-f4ac6f418bd6+12c5-rdma_udata_req_jgg@nvidia.com>

Reviewed-by: Jacob Moroni <jmoroni@google.com>

On Wed, Mar 25, 2026 at 5:27 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> irdma has a comp_mask field that was never checked for validity, check
> it.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/hw/irdma/verbs.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
> index b2978632241900..d695130b187bdd 100644
> --- a/drivers/infiniband/hw/irdma/verbs.c
> +++ b/drivers/infiniband/hw/irdma/verbs.c
> @@ -296,7 +296,9 @@ static int irdma_alloc_ucontext(struct ib_ucontext *uctx,
>         if (udata->outlen < IRDMA_ALLOC_UCTX_MIN_RESP_LEN)
>                 return -EINVAL;
>
> -       ret = ib_copy_validate_udata_in(udata, req, rsvd8);
> +       ret = ib_copy_validate_udata_in_cm(udata, req, rsvd8,
> +                                          IRDMA_ALLOC_UCTX_USE_RAW_ATTR |
> +                                                  IRDMA_SUPPORT_WQE_FORMAT_V2);
>         if (ret)
>                 return ret;
>
> --
> 2.43.0
>
>


* Re: [PATCH] Drivers: hv: mshv: fix integer overflow in memory region overlap check
From: vdso @ 2026-03-25 22:37 UTC (permalink / raw)
  To: Junrui Luo, K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Long Li, Mukesh Rathor, Nuno Das Neves, Roman Kisel,
	Stanislav Kinsburskii
  Cc: Muminul Islam, Praveen K Paladugu, linux-hyperv, linux-kernel,
	Yuhao Jiang, stable
In-Reply-To: <SYBPR01MB7881689C0F58149DD986A6D1AF49A@SYBPR01MB7881.ausprd01.prod.outlook.com>


> On 03/24/2026 9:05 PM PDT Junrui Luo <moonafterrain@outlook.com> wrote:
> 

Hi Junrui,

I think that checking for overflow as implemented can be improved.

`guest_pfn` is a guest page frame number (GPA / page size). Hyper-V uses a
page size of 4 KiB (`HV_HYP_PAGE_SIZE`). On x86_64, GPAs are limited to
52 bits, so the max GFN = (1<<52)/(1<<12) = 1<<40. On ARM64, 52 bits is
also the limit for the bits used in a GPA. Checking for overflow is
therefore not the only thing needed here: _well_ before overflowing, the
(1<<40)-th GFN is already problematic, because using it or going above it
exceeds the arch limit on the bits used in a GPA (the processor won't be
able to map the memory through the page tables).

So we could check for (1<<40)-th GFN, too. That is, if we'd like to return
an error early instead of trying to do physically impossible things and
erroring out later anyway.

Perhaps something along the lines of

|  if (mem->guest_pfn + nr_pages > HVPFN_DOWN(1ULL << MAX_PHYSMEM_BITS))
|      return -EINVAL;

could be a meaningful improvement in addition to checking overflow, which
alone doesn't take into account the specifics outlined above.

If folks like that, we could hoist an improved check out into a function
and apply it throughout the file.
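
As a rough illustration of how the two checks could sit together in one
helper, here is a userspace sketch. All names are hypothetical
(demo_region_end() and the DEMO_* constants), the 52-bit width is simply
the figure quoted above rather than a value read from MAX_PHYSMEM_BITS,
and the compiler builtin __builtin_add_overflow() stands in for the
kernel's check_add_overflow().

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* 4 KiB hypervisor pages */
#define DEMO_PHYS_BITS	52	/* assumed GPA width, as quoted above */
#define DEMO_MAX_GFN	(1ULL << (DEMO_PHYS_BITS - DEMO_PAGE_SHIFT))

/*
 * Returns 0 and stores the exclusive end GFN when the range
 * [start_gfn, start_gfn + nr_pages) neither wraps around u64 nor
 * extends past DEMO_MAX_GFN.
 */
static int demo_region_end(uint64_t start_gfn, uint64_t nr_pages,
			   uint64_t *end)
{
	if (__builtin_add_overflow(start_gfn, nr_pages, end))
		return -EOVERFLOW;
	if (*end > DEMO_MAX_GFN)
		return -EINVAL;
	return 0;
}
```

Doing the wraparound test first keeps the limit comparison meaningful,
since a wrapped sum could otherwise look small enough to pass.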

--
Cheers,
Roman

>  
> mshv_partition_create_region() computes mem->guest_pfn + nr_pages to
> check for overlapping regions without verifying u64 wraparound. A
> sufficiently large guest_pfn can cause the addition to overflow,
> bypassing the overlap check and allowing creation of regions that wrap
> around the address space.
> 
> Fix by using check_add_overflow() to reject such regions.
> 
> Fixes: 621191d709b1 ("Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs")
> Reported-by: Yuhao Jiang <danisjiang@gmail.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
> ---
>  drivers/hv/mshv_root_main.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> index 6f42423f7faa..6ddb315fc2c2 100644
> --- a/drivers/hv/mshv_root_main.c
> +++ b/drivers/hv/mshv_root_main.c
> @@ -1174,11 +1174,16 @@ static int mshv_partition_create_region(struct mshv_partition *partition,
>  {
>  	struct mshv_mem_region *rg;
>  	u64 nr_pages = HVPFN_DOWN(mem->size);
> +	u64 new_region_end;
> +
> +	/* Reject regions whose end address would wrap around */
> +	if (check_add_overflow(mem->guest_pfn, nr_pages, &new_region_end))
> +		return -EOVERFLOW;
>  
>  	/* Reject overlapping regions */
>  	spin_lock(&partition->pt_mem_regions_lock);
>  	hlist_for_each_entry(rg, &partition->pt_mem_regions, hnode) {
> -		if (mem->guest_pfn + nr_pages <= rg->start_gfn ||
> +		if (new_region_end <= rg->start_gfn ||
>  		    rg->start_gfn + rg->nr_pages <= mem->guest_pfn)
>  			continue;
>  		spin_unlock(&partition->pt_mem_regions_lock);
> 
> ---
> base-commit: c369299895a591d96745d6492d4888259b004a9e
> change-id: 20260325-fixes-9a58895aea55
> 
> Best regards,
> -- 
> Junrui Luo <moonafterrain@outlook.com>


* Re: [PATCH 09/12] s390/cio: use generic driver_override infrastructure
From: Vineeth Vijayan @ 2026-03-26  9:43 UTC (permalink / raw)
  To: Danilo Krummrich, Russell King, Greg Kroah-Hartman,
	Rafael J. Wysocki, Ioana Ciornei, Nipun Gupta, Nikhil Agarwal,
	K. Y. Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
	Bjorn Helgaas, Armin Wolf, Bjorn Andersson, Mathieu Poirier,
	Peter Oberparleiter, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Harald Freudenberger, Holger Dengler, Mark Brown,
	Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez,
	Alex Williamson, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Christophe Leroy (CS GROUP)
  Cc: linux-kernel, driver-core, linuxppc-dev, linux-hyperv, linux-pci,
	platform-driver-x86, linux-arm-msm, linux-remoteproc, linux-s390,
	linux-spi, virtualization, kvm, xen-devel, linux-arm-kernel,
	Gui-Dong Han
In-Reply-To: <20260324005919.2408620-10-dakr@kernel.org>



On 3/24/26 01:59, Danilo Krummrich wrote:
> When a driver is probed through __driver_attach(), the bus' match()
> callback is called without the device lock held, thus accessing the
> driver_override field without a lock, which can cause a UAF.
> 
> Fix this by using the driver-core driver_override infrastructure taking
> care of proper locking internally.
> 
> Note that calling match() from __driver_attach() without the device lock
> held is intentional. [1]
> 
> Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
> Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> Fixes: ebc3d1791503 ("s390/cio: introduce driver_override on the css bus")
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---

Thank you Danilo.

Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>


* Re: [PATCH v4 20/21] mm: add mmap_action_map_kernel_pages[_full]()
From: Vlastimil Babka (SUSE) @ 2026-03-26 10:44 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle), Andrew Morton
  Cc: Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-kernel, linux-doc, linux-hyperv, linux-stm32,
	linux-arm-kernel, linux-mtd, linux-staging, linux-scsi,
	target-devel, linux-afs, linux-fsdevel, linux-mm, Ryan Roberts
In-Reply-To: <926ac961690d856e67ec847bee2370ab3c6b9046.1774045440.git.ljs@kernel.org>

On 3/20/26 23:39, Lorenzo Stoakes (Oracle) wrote:
> A user can invoke mmap_action_map_kernel_pages() to specify that the
> mapping should map a given number of kernel pages, supplied in an array,
> starting from desc->start.
> 
> In order to implement this, adjust mmap_action_prepare() to be able to
> return an error code, as it makes sense to assert that the specified
> parameters are valid as quickly as possible as well as updating the VMA
> flags to include VMA_MIXEDMAP_BIT as necessary.
> 
> This provides an mmap_prepare equivalent of vm_insert_pages().  We
> additionally update the existing vm_insert_pages() code to use
> range_in_vma() and add a new range_in_vma_desc() helper function for the
> mmap_prepare case, sharing the code between the two in range_is_subset().
> 
> We add both mmap_action_map_kernel_pages() and
> mmap_action_map_kernel_pages_full() to allow for both partial and full VMA
> mappings.
> 
> We update the documentation to reflect the new features.
> 
> Finally, we update the VMA tests accordingly to reflect the changes.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>


^ permalink raw reply

* Re: [PATCH v4 21/21] mm: on remap assert that input range within the proposed VMA
From: Vlastimil Babka (SUSE) @ 2026-03-26 10:46 UTC (permalink / raw)
  To: Lorenzo Stoakes (Oracle), Andrew Morton
  Cc: Jonathan Corbet, Clemens Ladisch, Arnd Bergmann,
	Greg Kroah-Hartman, K . Y . Srinivasan, Haiyang Zhang, Wei Liu,
	Dexuan Cui, Long Li, Alexander Shishkin, Maxime Coquelin,
	Alexandre Torgue, Miquel Raynal, Richard Weinberger,
	Vignesh Raghavendra, Bodo Stroesser, Martin K . Petersen,
	David Howells, Marc Dionne, Alexander Viro, Christian Brauner,
	Jan Kara, David Hildenbrand, Liam R . Howlett, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Jann Horn, Pedro Falcato,
	linux-kernel, linux-doc, linux-hyperv, linux-stm32,
	linux-arm-kernel, linux-mtd, linux-staging, linux-scsi,
	target-devel, linux-afs, linux-fsdevel, linux-mm, Ryan Roberts
In-Reply-To: <0fc1092f4b74f3f673a58e4e3942dc83f336dd85.1774045440.git.ljs@kernel.org>

On 3/20/26 23:39, Lorenzo Stoakes (Oracle) wrote:
> Now we have range_in_vma_desc(), update remap_pfn_range_prepare() to check
> whether the input range is contained within the specified VMA, so we can
> fail at prepare time if an invalid range is specified.
> 
> This covers the I/O remap mmap actions also which ultimately call into
> this function, and other mmap action types either already span the full
> VMA or check this already.
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>

> ---
>  mm/memory.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 53ef8ef3d04a..68cc592ff0ba 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3142,6 +3142,9 @@ int remap_pfn_range_prepare(struct vm_area_desc *desc)
>  	const bool is_cow = vma_desc_is_cow_mapping(desc);
>  	int err;
>  
> +	if (!range_in_vma_desc(desc, start, end))
> +		return -EFAULT;
> +
>  	err = get_remap_pgoff(is_cow, start, end, desc->start, desc->end, pfn,
>  			      &desc->pgoff);
>  	if (err)
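
A minimal userspace sketch of the containment test this check relies on,
assuming range_in_vma_desc() reduces to a half-open interval comparison
against desc->start and desc->end. The struct and helper below are
hypothetical stand-ins, not the kernel definitions, and rejecting empty or
inverted ranges is an extra assumption of the sketch:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two vm_area_desc fields used here. */
struct vma_desc_sketch {
	unsigned long start;	/* inclusive start address */
	unsigned long end;	/* exclusive end address */
};

/* True if [start, end) lies entirely inside [desc->start, desc->end). */
static bool range_in_desc_sketch(const struct vma_desc_sketch *desc,
				 unsigned long start, unsigned long end)
{
	return desc && start < end &&
	       desc->start <= start && end <= desc->end;
}
```

With a descriptor covering [0x1000, 0x5000), a request for
[0x4000, 0x6000) fails the test, which is the case the new hunk turns
into -EFAULT at prepare time.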


^ permalink raw reply

* Re: [PATCH net-next v2] net: mana: Set default number of queues to 16
From: patchwork-bot+netdevbpf @ 2026-03-26 14:10 UTC (permalink / raw)
  To: Long Li
  Cc: kotaranov, kuba, davem, pabeni, edumazet, andrew+netdev, jgg,
	leon, haiyangz, kys, wei.liu, decui, horms, netdev, linux-rdma,
	linux-hyperv, linux-kernel
In-Reply-To: <20260323194925.1766385-1-longli@microsoft.com>

Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 23 Mar 2026 12:49:25 -0700 you wrote:
> Set the default number of queues per vPort to MANA_DEF_NUM_QUEUES (16),
> as 16 queues can achieve optimal throughput for typical workloads. The
> actual number of queues may be lower if it exceeds the hardware reported
> limit. Users can increase the number of queues up to max_queues via
> ethtool if needed.
> 
> Signed-off-by: Long Li <longli@microsoft.com>
> 
> [...]

Here is the summary with links:
  - [net-next,v2] net: mana: Set default number of queues to 16
    https://git.kernel.org/netdev/net-next/c/45b2b84ac6fd

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* RE: [PATCH net,v2] net: mana: Fix RX skb truesize accounting
From: Haiyang Zhang @ 2026-03-26 15:04 UTC (permalink / raw)
  To: Dipayaan Roy, KY Srinivasan, wei.liu@kernel.org, Dexuan Cui,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, leon@kernel.org, Long Li,
	Konstantin Taranov, horms@kernel.org,
	shradhagupta@linux.microsoft.com, ssengar@linux.microsoft.com,
	ernis@linux.microsoft.com, Shiraz Saleem,
	linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org,
	stephen@networkplumber.org, jacob.e.keller@intel.com,
	Dipayaan Roy
In-Reply-To: <acLUhLpLum6qrD/N@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>



> -----Original Message-----
> From: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> Sent: Tuesday, March 24, 2026 2:14 PM
> To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang
> <haiyangz@microsoft.com>; wei.liu@kernel.org; Dexuan Cui
> <DECUI@microsoft.com>; andrew+netdev@lunn.ch; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org;
> Long Li <longli@microsoft.com>; Konstantin Taranov
> <kotaranov@microsoft.com>; horms@kernel.org;
> shradhagupta@linux.microsoft.com; ssengar@linux.microsoft.com;
> ernis@linux.microsoft.com; Shiraz Saleem <shirazsaleem@microsoft.com>;
> linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-rdma@vger.kernel.org;
> stephen@networkplumber.org; jacob.e.keller@intel.com; Dipayaan Roy
> <dipayanroy@microsoft.com>
> Subject: [PATCH net,v2] net: mana: Fix RX skb truesize accounting
> 
> MANA passes rxq->alloc_size to napi_build_skb() for all RX buffers.
> It is correct for fragment-backed RX buffers, where alloc_size matches
> the actual backing allocation used for each packet buffer. However, in
> the non-fragment RX path mana allocates a full page, or a higher-order
> page, per RX buffer. In that case alloc_size only reflects the usable
> packet area and not the actual backing memory.
> 
> This causes napi_build_skb() to underestimate the skb backing allocation
> in the single-buffer RX path, so skb->truesize is derived from a value
> smaller than the real RX buffer allocation.
> 
> Fix this by updating alloc_size in the non-fragment RX path to the
> actual backing allocation size before it is passed to napi_build_skb().
> 
> Fixes: 730ff06d3f5c ("net: mana: Use page pool fragments for RX buffers
> instead of full pages to improve memory efficiency.")
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>

> ---
> Changes in v2:
>  - Added maintainers missed in v1.
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index ea71de39f996..884f8e548174 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -766,6 +766,13 @@ static void mana_get_rxbuf_cfg(struct
> mana_port_context *apc,
>  		}
> 
>  		*frag_count = 1;
> +
> +		/* In the single-buffer path, napi_build_skb() must see the
> +		 * actual backing allocation size so skb->truesize reflects
> +		 * the full page (or higher-order page), not just the usable
> +		 * packet area.
> +		 */
> +		*alloc_size = PAGE_SIZE << get_order(*alloc_size);
>  		return;
>  	}
> 
> --
> 2.43.0
> 
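
As a rough illustration of the rounding in the hunk above, here is a
userspace sketch of PAGE_SIZE << get_order(size). It assumes 4 KiB pages
and a non-zero size; page_order_sketch() and backing_size_sketch() are
hypothetical helpers, not the kernel's get_order():

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size, size > 0. */
static unsigned int page_order_sketch(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Bytes actually backing a buffer with the given usable size. */
static size_t backing_size_sketch(size_t alloc_size)
{
	return SKETCH_PAGE_SIZE << page_order_sketch(alloc_size);
}
```

Under these assumptions a 2048-byte buffer is accounted as 4096 bytes and
a 9000-byte buffer as 16384 bytes, which is the difference between the
usable packet area and the backing allocation that the commit message
describes.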


^ permalink raw reply

* Re: [RFC PATCH V3] x86/VMBus: Confidential VMBus for dynamic DMA transfers
From: Easwar Hariharan @ 2026-03-26 17:05 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: kys, haiyangz, wei.liu, decui, longli, m.szyprowski, robin.murphy,
	easwar.hariharan, Tianyu Lan, iommu, linux-hyperv, linux-kernel,
	hch, vdso, Michael Kelley
In-Reply-To: <20260325075649.248241-1-tiala@microsoft.com>

On 3/25/2026 12:56 AM, Tianyu Lan wrote:
> Hyper-V provides Confidential VMBus to communicate between
> device model and device guest driver via encrypted/private
> memory in Confidential VM. The device model is in OpenHCL
> (https://openvmm.dev/guide/user_guide/openhcl.html) that
> plays the paravisor role.
> 
> For a VMBus device, there are two communication methods to
> talk with Host/Hypervisor. 1) VMBUS Ring buffer 2) Dynamic
> DMA transfer.
> 
> The Confidential VMBus Ring buffer has been upstreamed by
> Roman Kisel (commit 6802d8af47d1).
> 
> The dynamic DMA transfer of a VMBus device normally goes
> through the DMA core, which uses SWIOTLB as a bounce buffer
> in a CoCo VM.
> 
> The Confidential VMBus device can do DMA directly to
> private/encrypted memory. Because the swiotlb is decrypted
> memory, the DMA transfer must not be bounced through the
> swiotlb, so as to preserve confidentiality. This is different
> from the default for Linux CoCo VMs, so disable the VMBus
> device's use of swiotlb.
> 
> Expose swiotlb_dev_disable() from DMA Core to disable
> bounce buffer for device.
> 
> Suggested-by: Michael Kelley <mhklinux@outlook.com>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  drivers/hv/vmbus_drv.c  | 6 +++++-
>  include/linux/swiotlb.h | 5 +++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 3d1a58b667db..84e6971fc90f 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2184,11 +2184,15 @@ int vmbus_device_register(struct hv_device *child_device_obj)
>  	child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
>  	dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
>  
> +	device_initialize(&child_device_obj->device);
> +	if (child_device_obj->channel->co_external_memory)
> +		swiotlb_dev_disable(&child_device_obj->device);
> +
>  	/*
>  	 * Register with the LDM. This will kick off the driver/device
>  	 * binding...which will eventually call vmbus_match() and vmbus_probe()
>  	 */
> -	ret = device_register(&child_device_obj->device);
> +	ret = device_add(&child_device_obj->device);
>  	if (ret) {
>  		pr_err("Unable to register child device\n");
>  		put_device(&child_device_obj->device);
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 3dae0f592063..7c572570d5d9 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -169,6 +169,11 @@ static inline struct io_tlb_pool *swiotlb_find_pool(struct device *dev,
>  	return NULL;
>  }
>  
> +static inline bool swiotlb_dev_disable(struct device *dev)
> +{
> +	return dev->dma_io_tlb_mem == NULL;

Is there an extra = here?

- Easwar (he/him)
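
To make the question concrete, a small userspace sketch contrasting the
two spellings, assuming the intent of the helper is to clear the device's
swiotlb pointer. struct device_sketch and both functions are hypothetical
stand-ins, not the kernel's struct device or the posted helper:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in; only the field under discussion is modelled. */
struct device_sketch {
	void *dma_io_tlb_mem;
};

/* With '==': only reports whether the pointer is already NULL. */
static bool swiotlb_disable_compare(struct device_sketch *dev)
{
	return dev->dma_io_tlb_mem == NULL;
}

/* With '=': the pointer is actually cleared. */
static void swiotlb_disable_assign(struct device_sketch *dev)
{
	dev->dma_io_tlb_mem = NULL;
}
```

The comparison form leaves the device state untouched, so a caller that
ignores the return value, as the vmbus_drv.c hunk does, would see no
effect from it.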

^ permalink raw reply

* Re: [EXTERNAL] Re: [PATCH net-next v5 0/6] net: mana: Per-vPort EQ and MSI-X interrupt management
From: Simon Horman @ 2026-03-26 17:19 UTC (permalink / raw)
  To: Long Li
  Cc: Konstantin Taranov, Jakub Kicinski, David S . Miller, Paolo Abeni,
	Eric Dumazet, Andrew Lunn, Jason Gunthorpe, Leon Romanovsky,
	Haiyang Zhang, KY Srinivasan, Wei Liu, Dexuan Cui,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <SA1PR21MB6683FB2D67A3BBCC74B62D45CE49A@SA1PR21MB6683.namprd21.prod.outlook.com>

On Wed, Mar 25, 2026 at 08:47:35PM +0000, Long Li wrote:
> > > On Mon, Mar 23, 2026 at 12:59:46PM -0700, Long Li wrote:

...

> > Hi Simon,
> > 
> > This patch set should apply after this patch: (which is also pending net-next)
> > net: mana: Set default number of queues to 16
> > 
> > Can you apply the patch set after this patch, or should I wait for the next patch
> > merge window?
> > 
> > Thank you,
> > Long
> 
> 
> I'll send it over in the next patch merging window.

Thanks,

The way I understand things, net-next, and in particular the CI,
can only handle patches whose dependencies are already present
in net-next.


^ permalink raw reply

* [PATCH net-next] net: mana: hardening: Validate adapter_mtu from MANA_QUERY_DEV_CONFIG
From: Erni Sri Satya Vennela @ 2026-03-26 17:30 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, linux-hyperv, netdev, linux-kernel

As a part of MANA hardening for CVM, validate the adapter_mtu value
returned from the MANA_QUERY_DEV_CONFIG HWC command.

The adapter_mtu value is used to compute ndev->max_mtu via:
gc->adapter_mtu - ETH_HLEN. If hardware returns a bogus adapter_mtu
smaller than ETH_HLEN (e.g. 0), the unsigned subtraction wraps to a
huge value, silently allowing oversized MTU settings.

Add a validation check to reject adapter_mtu values below
ETH_MIN_MTU + ETH_HLEN, returning -EPROTO to fail the device
configuration early with a clear error message.
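
A small userspace sketch of the wrap described above and of the proposed
lower bound. The 32-bit unsigned types are an assumption made for
illustration (the driver's fields may be narrower), and both helper names
are hypothetical; the two constants carry the values from
<linux/if_ether.h>:

```c
#include <stdbool.h>
#include <stdint.h>

#define ETH_HLEN	14	/* Ethernet header length in bytes */
#define ETH_MIN_MTU	68	/* minimum MTU accepted by the stack */

/* Unchecked computation: wraps when adapter_mtu < ETH_HLEN. */
static uint32_t max_mtu_unchecked(uint32_t adapter_mtu)
{
	return adapter_mtu - ETH_HLEN;
}

/* The proposed lower bound on adapter_mtu. */
static bool adapter_mtu_valid(uint32_t adapter_mtu)
{
	return adapter_mtu >= ETH_MIN_MTU + ETH_HLEN;
}
```

In this model an adapter_mtu of 0 yields a max MTU of 4294967282, while
the check rejects anything below 82 bytes, so the subtraction is only
reached with values that cannot wrap.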

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index b39e8b920791..bd07d17a6017 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1207,10 +1207,16 @@ static int mana_query_device_cfg(struct mana_context *ac, u32 proto_major_ver,
 
 	*max_num_vports = resp.max_num_vports;
 
-	if (resp.hdr.response.msg_version >= GDMA_MESSAGE_V2)
+	if (resp.hdr.response.msg_version >= GDMA_MESSAGE_V2) {
+		if (resp.adapter_mtu < ETH_MIN_MTU + ETH_HLEN) {
+			dev_err(dev, "Adapter MTU too small: %u\n",
+				resp.adapter_mtu);
+			return -EPROTO;
+		}
 		gc->adapter_mtu = resp.adapter_mtu;
-	else
+	} else {
 		gc->adapter_mtu = ETH_FRAME_LEN;
+	}
 
 	if (resp.hdr.response.msg_version >= GDMA_MESSAGE_V3)
 		*bm_hostmode = resp.bm_hostmode;
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH 00/12] treewide: Convert buses to use generic driver_override
From: Danilo Krummrich @ 2026-03-26 17:38 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Russell King, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ioana Ciornei, Nipun Gupta, Nikhil Agarwal, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Bjorn Helgaas,
	Armin Wolf, Bjorn Andersson, Mathieu Poirier, Vineeth Vijayan,
	Peter Oberparleiter, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Harald Freudenberger, Holger Dengler, Mark Brown, Jason Wang,
	Xuan Zhuo, Eugenio Pérez, Alex Williamson, Juergen Gross,
	Stefano Stabellini, Oleksandr Tyshchenko,
	Christophe Leroy (CS GROUP), linux-kernel, driver-core,
	linuxppc-dev, linux-hyperv, linux-pci, platform-driver-x86,
	linux-arm-msm, linux-remoteproc, linux-s390, linux-spi,
	virtualization, kvm, xen-devel, linux-arm-kernel
In-Reply-To: <20260325052919-mutt-send-email-mst@kernel.org>

On Wed Mar 25, 2026 at 10:29 AM CET, Michael S. Tsirkin wrote:
> vdpa bits:
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> I assume it'll all be merged together?

I can take it through the driver-core tree if you prefer, but you can also pick
it up yourself.

^ permalink raw reply

* [PATCH net-next] net: mana: hardening: Reject zero max_num_queues from MANA_QUERY_VPORT_CONFIG
From: Erni Sri Satya Vennela @ 2026-03-26 17:48 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, linux-hyperv, netdev, linux-kernel

As a part of MANA hardening for CVM, validate that max_num_sq and
max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
values flow into apc->num_queues, which is used as an allocation count
and loop bound. A zero value would result in zero-size allocations and
incorrect driver behavior.

Return -EPROTO if either value is zero.

Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index b39e8b920791..a4197b4b0597 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index,
 
 	*max_sq = resp.max_num_sq;
 	*max_rq = resp.max_num_rq;
+
+	if (*max_sq == 0 || *max_rq == 0) {
+		netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
+		return -EPROTO;
+	}
+
 	if (resp.num_indirection_ent > 0 &&
 	    resp.num_indirection_ent <= MANA_INDIRECT_TABLE_MAX_SIZE &&
 	    is_power_of_2(resp.num_indirection_ent)) {
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH 05/12] PCI: use generic driver_override infrastructure
From: Bjorn Helgaas @ 2026-03-26 18:08 UTC (permalink / raw)
  To: Danilo Krummrich
  Cc: Russell King, Greg Kroah-Hartman, Rafael J. Wysocki,
	Ioana Ciornei, Nipun Gupta, Nikhil Agarwal, K. Y. Srinivasan,
	Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li, Bjorn Helgaas,
	Armin Wolf, Bjorn Andersson, Mathieu Poirier, Vineeth Vijayan,
	Peter Oberparleiter, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Harald Freudenberger, Holger Dengler, Mark Brown,
	Michael S. Tsirkin, Jason Wang, Xuan Zhuo, Eugenio Pérez,
	Alex Williamson, Juergen Gross, Stefano Stabellini,
	Oleksandr Tyshchenko, Christophe Leroy (CS GROUP), linux-kernel,
	driver-core, linuxppc-dev, linux-hyperv, linux-pci,
	platform-driver-x86, linux-arm-msm, linux-remoteproc, linux-s390,
	linux-spi, virtualization, kvm, xen-devel, linux-arm-kernel,
	Gui-Dong Han
In-Reply-To: <20260324005919.2408620-6-dakr@kernel.org>

On Tue, Mar 24, 2026 at 01:59:09AM +0100, Danilo Krummrich wrote:
> When a driver is probed through __driver_attach(), the bus' match()
> callback is called without the device lock held, thus accessing the
> driver_override field without a lock, which can cause a UAF.
> 
> Fix this by using the driver-core driver_override infrastructure taking
> care of proper locking internally.
> 
> Note that calling match() from __driver_attach() without the device lock
> held is intentional. [1]
> 
> Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
> Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override")
> Signed-off-by: Danilo Krummrich <dakr@kernel.org>
> ---
>  drivers/pci/pci-driver.c           | 11 +++++++----
>  drivers/pci/pci-sysfs.c            | 28 ----------------------------
>  drivers/pci/probe.c                |  1 -
>  include/linux/pci.h                |  6 ------

For the above:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

"driver_override" is mentioned several places in
Documentation/ABI/testing/sysfs-bus-*.  I assume this series doesn't
change the behavior documented there?  Should any of this doc be
consolidated?

>  drivers/vfio/pci/vfio_pci_core.c   |  5 ++---
>  drivers/xen/xen-pciback/pci_stub.c |  6 ++++--
>  6 files changed, 13 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index dd9075403987..d10ece0889f0 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -138,9 +138,11 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  {
>  	struct pci_dynid *dynid;
>  	const struct pci_device_id *found_id = NULL, *ids;
> +	int ret;
>  
>  	/* When driver_override is set, only bind to the matching driver */
> -	if (dev->driver_override && strcmp(dev->driver_override, drv->name))
> +	ret = device_match_driver_override(&dev->dev, &drv->driver);
> +	if (ret == 0)
>  		return NULL;
>  
>  	/* Look at the dynamic ids first, before the static ones */
> @@ -164,7 +166,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  		 * matching.
>  		 */
>  		if (found_id->override_only) {
> -			if (dev->driver_override)
> +			if (ret > 0)
>  				return found_id;
>  		} else {
>  			return found_id;
> @@ -172,7 +174,7 @@ static const struct pci_device_id *pci_match_device(struct pci_driver *drv,
>  	}
>  
>  	/* driver_override will always match, send a dummy id */
> -	if (dev->driver_override)
> +	if (ret > 0)
>  		return &pci_device_id_any;
>  	return NULL;
>  }
> @@ -452,7 +454,7 @@ static int __pci_device_probe(struct pci_driver *drv, struct pci_dev *pci_dev)
>  static inline bool pci_device_can_probe(struct pci_dev *pdev)
>  {
>  	return (!pdev->is_virtfn || pdev->physfn->sriov->drivers_autoprobe ||
> -		pdev->driver_override);
> +		device_has_driver_override(&pdev->dev));
>  }
>  #else
>  static inline bool pci_device_can_probe(struct pci_dev *pdev)
> @@ -1722,6 +1724,7 @@ static const struct cpumask *pci_device_irq_get_affinity(struct device *dev,
>  
>  const struct bus_type pci_bus_type = {
>  	.name		= "pci",
> +	.driver_override = true,
>  	.match		= pci_bus_match,
>  	.uevent		= pci_uevent,
>  	.probe		= pci_device_probe,
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 16eaaf749ba9..a9006cf4e9c8 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -615,33 +615,6 @@ static ssize_t devspec_show(struct device *dev,
>  static DEVICE_ATTR_RO(devspec);
>  #endif
>  
> -static ssize_t driver_override_store(struct device *dev,
> -				     struct device_attribute *attr,
> -				     const char *buf, size_t count)
> -{
> -	struct pci_dev *pdev = to_pci_dev(dev);
> -	int ret;
> -
> -	ret = driver_set_override(dev, &pdev->driver_override, buf, count);
> -	if (ret)
> -		return ret;
> -
> -	return count;
> -}
> -
> -static ssize_t driver_override_show(struct device *dev,
> -				    struct device_attribute *attr, char *buf)
> -{
> -	struct pci_dev *pdev = to_pci_dev(dev);
> -	ssize_t len;
> -
> -	device_lock(dev);
> -	len = sysfs_emit(buf, "%s\n", pdev->driver_override);
> -	device_unlock(dev);
> -	return len;
> -}
> -static DEVICE_ATTR_RW(driver_override);
> -
>  static struct attribute *pci_dev_attrs[] = {
>  	&dev_attr_power_state.attr,
>  	&dev_attr_resource.attr,
> @@ -669,7 +642,6 @@ static struct attribute *pci_dev_attrs[] = {
>  #ifdef CONFIG_OF
>  	&dev_attr_devspec.attr,
>  #endif
> -	&dev_attr_driver_override.attr,
>  	&dev_attr_ari_enabled.attr,
>  	NULL,
>  };
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index bccc7a4bdd79..b4707640e102 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2488,7 +2488,6 @@ static void pci_release_dev(struct device *dev)
>  	pci_release_of_node(pci_dev);
>  	pcibios_release_device(pci_dev);
>  	pci_bus_put(pci_dev->bus);
> -	kfree(pci_dev->driver_override);
>  	bitmap_free(pci_dev->dma_alias_mask);
>  	dev_dbg(dev, "device released\n");
>  	kfree(pci_dev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index d43745fe4c84..460852f79f29 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1987,9 +1987,8 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb,
>  	    pdev->is_virtfn && physfn == vdev->pdev) {
>  		pci_info(vdev->pdev, "Captured SR-IOV VF %s driver_override\n",
>  			 pci_name(pdev));
> -		pdev->driver_override = kasprintf(GFP_KERNEL, "%s",
> -						  vdev->vdev.ops->name);
> -		WARN_ON(!pdev->driver_override);
> +		WARN_ON(device_set_driver_override(&pdev->dev,
> +						   vdev->vdev.ops->name));
>  	} else if (action == BUS_NOTIFY_BOUND_DRIVER &&
>  		   pdev->is_virtfn && physfn == vdev->pdev) {
>  		struct pci_driver *drv = pci_dev_driver(pdev);
> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index e4b27aecbf05..79a2b5dfd694 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -598,6 +598,8 @@ static int pcistub_seize(struct pci_dev *dev,
>  	return err;
>  }
>  
> +static struct pci_driver xen_pcibk_pci_driver;
> +
>  /* Called when 'bind'. This means we must _NOT_ call pci_reset_function or
>   * other functions that take the sysfs lock. */
>  static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
> @@ -609,8 +611,8 @@ static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  
>  	match = pcistub_match(dev);
>  
> -	if ((dev->driver_override &&
> -	     !strcmp(dev->driver_override, PCISTUB_DRIVER_NAME)) ||
> +	if (device_match_driver_override(&dev->dev,
> +					 &xen_pcibk_pci_driver.driver) > 0 ||
>  	    match) {
>  
>  		if (dev->hdr_type != PCI_HEADER_TYPE_NORMAL
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1c270f1d5123..57e9463e4347 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -575,12 +575,6 @@ struct pci_dev {
>  	u8		supported_speeds; /* Supported Link Speeds Vector */
>  	phys_addr_t	rom;		/* Physical address if not from BAR */
>  	size_t		romlen;		/* Length if not from BAR */
> -	/*
> -	 * Driver name to force a match.  Do not set directly, because core
> -	 * frees it.  Use driver_set_override() to set or clear it.
> -	 */
> -	const char	*driver_override;
> -
>  	unsigned long	priv_flags;	/* Private flags for the PCI driver */
>  
>  	/* These methods index pci_reset_fn_methods[] */
> -- 
> 2.53.0
> 
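
For readers following the pci_match_device() hunks, a userspace sketch of
the three-way result the new code appears to rely on. The convention is
inferred only from how the diff tests "ret == 0" and "ret > 0";
match_override_sketch() is a hypothetical model, the negative value is a
placeholder, and none of the locking done by the driver core is
represented:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of the tri-state result:
 *   < 0  no override is set for the device
 *     0  an override is set but names a different driver
 *   > 0  an override is set and names this driver
 */
static int match_override_sketch(const char *override, const char *drv_name)
{
	if (!override)
		return -1;	/* placeholder; the real error code is not shown in the diff */
	return strcmp(override, drv_name) == 0;
}
```

Read this way, "ret == 0" replaces the old "override set and strcmp()
non-zero" test, and "ret > 0" replaces the later bare
"dev->driver_override" checks, which were only reached once a mismatch
had already been ruled out.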

^ permalink raw reply

* Re: [PATCH net-next v2] net: mana: Use at least SZ_4K in doorbell ID range check
From: Simon Horman @ 2026-03-26 20:07 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, shradhagupta, kotaranov, dipayanroy,
	yury.norov, kees, linux-hyperv, netdev, linux-kernel
In-Reply-To: <20260325180423.1923060-1-ernis@linux.microsoft.com>

On Wed, Mar 25, 2026 at 11:04:17AM -0700, Erni Sri Satya Vennela wrote:
> mana_gd_ring_doorbell() accesses offsets up to DOORBELL_OFFSET_EQ
> (0xFF8) + 8 bytes = 4KB within each doorbell page. A db_page_size
> smaller than SZ_4K is fundamentally incompatible with the driver:
> doorbell pages would overlap and the device cannot function correctly.
> 
> Validate db_page_size at the source and fail the
> probe early if the value is below SZ_4K. This ensures the doorbell ID
> range check in mana_gd_register_device() can rely on db_page_size
> being valid.
> 
> Fixes: 89fe91c65992 ("net: mana: hardening: Validate doorbell ID from GDMA_REGISTER_DEVICE response")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> Changes in v2:
> * Remove "db_page_sz = max_t(u64, SZ_4K, gc->db_page_size)" in
>   mana_gd_register_device and validate db_page_sz at the source
>   mana_gd_init_pf_regs and mana_gd_init_vf_regs.
> * Update commit message.

Thanks for the update.

Reviewed-by: Simon Horman <horms@kernel.org>
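
The arithmetic in the commit message can be checked with a short
userspace sketch. DOORBELL_OFFSET_EQ carries the value quoted above;
DOORBELL_REG_BYTES and both helpers are hypothetical names introduced
only for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K			0x1000u
#define DOORBELL_OFFSET_EQ	0xFF8u	/* value quoted in the commit message */
#define DOORBELL_REG_BYTES	8u	/* assumed width of one doorbell write */

/* Number of bytes of a doorbell page the driver may touch. */
static uint64_t doorbell_span(void)
{
	return DOORBELL_OFFSET_EQ + DOORBELL_REG_BYTES;
}

/* A page size below 4 KiB cannot hold the full doorbell layout. */
static bool db_page_size_ok(uint64_t db_page_size)
{
	return db_page_size >= SZ_4K;
}
```

Since 0xFF8 + 8 is exactly 0x1000, any db_page_size below SZ_4K would let
the last doorbell write of one page land in the next one.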


^ permalink raw reply

* Re: [PATCH net,v2] net: mana: Fix RX skb truesize accounting
From: patchwork-bot+netdevbpf @ 2026-03-27  2:10 UTC (permalink / raw)
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
	ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, stephen, jacob.e.keller, dipayanroy
In-Reply-To: <acLUhLpLum6qrD/N@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 24 Mar 2026 11:14:28 -0700 you wrote:
> MANA passes rxq->alloc_size to napi_build_skb() for all RX buffers.
> It is correct for fragment-backed RX buffers, where alloc_size matches
> the actual backing allocation used for each packet buffer. However, in
> the non-fragment RX path mana allocates a full page, or a higher-order
> page, per RX buffer. In that case alloc_size only reflects the usable
> packet area and not the actual backing memory.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: mana: Fix RX skb truesize accounting
    https://git.kernel.org/netdev/net/c/f73896b4197e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v2] net: mana: Set default number of queues to 16
From: Jakub Kicinski @ 2026-03-27  3:18 UTC (permalink / raw)
  To: Long Li
  Cc: Konstantin Taranov, David S . Miller, Paolo Abeni, Eric Dumazet,
	Andrew Lunn, Jason Gunthorpe, Leon Romanovsky, Haiyang Zhang,
	K . Y . Srinivasan, Wei Liu, Dexuan Cui, Simon Horman, netdev,
	linux-rdma, linux-hyperv, linux-kernel
In-Reply-To: <20260323194925.1766385-1-longli@microsoft.com>

On Mon, 23 Mar 2026 12:49:25 -0700 Long Li wrote:
> Set the default number of queues per vPort to MANA_DEF_NUM_QUEUES (16),
> as 16 queues can achieve optimal throughput for typical workloads. The
> actual number of queues may be lower if it exceeds the hardware reported
> limit. Users can increase the number of queues up to max_queues via
> ethtool if needed.

Sorry we are a bit backlogged I didn't spot this in time (read: I'm
planning to revert this unless proper explanation is provided)

Could you explain why not use netif_get_num_default_rss_queues() ?
Having local driver innovations is a major PITA for users who deal
with heterogeneous envs.

^ permalink raw reply

* RE: [EXTERNAL] Re: [PATCH net-next v2] net: mana: Set default number of queues to 16
From: Long Li @ 2026-03-27  4:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Konstantin Taranov, David S . Miller, Paolo Abeni, Eric Dumazet,
	Andrew Lunn, Jason Gunthorpe, Leon Romanovsky, Haiyang Zhang,
	KY Srinivasan, Wei Liu, Dexuan Cui, Simon Horman,
	netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260326201841.3b7e5b78@kernel.org>

> On Mon, 23 Mar 2026 12:49:25 -0700 Long Li wrote:
> > Set the default number of queues per vPort to MANA_DEF_NUM_QUEUES
> > (16), as 16 queues can achieve optimal throughput for typical
> > workloads. The actual number of queues may be lower if it exceeds the
> > hardware reported limit. Users can increase the number of queues up to
> > max_queues via ethtool if needed.
> 
> Sorry we are a bit backlogged I didn't spot this in time (read: I'm planning to
> revert this unless proper explanation is provided)
> 
> Could you explain why not use netif_get_num_default_rss_queues() ?
> Having local driver innovations is a major PITA for users who deal with
> heterogeneous envs.

Hi Jakub,

We considered netif_get_num_default_rss_queues() but chose a fixed
default based on our performance testing. On Azure VMs, typical
workloads plateau at around 16 queues - adding more queues beyond that
doesn't improve throughput but increases memory usage and interrupt
overhead.

netif_get_num_default_rss_queues() would return 32-64 on large VMs
(64-128 vCPUs), which wastes resources without benefit.

That said, I agree that completely ignoring the core-based heuristic
isn't ideal for consistency. One option is to use
netif_get_num_default_rss_queues() but clamp it to a maximum of
MANA_DEF_NUM_QUEUES (16), so small VMs still get enough queues and
large VMs don't over-allocate. Something like:

	apc->num_queues = min(netif_get_num_default_rss_queues(), MANA_DEF_NUM_QUEUES);
	apc->num_queues = min(apc->num_queues, gc->max_num_queues);

For reference, it seems mlx4 does something similar - it caps at
DEF_RX_RINGS (16) regardless of core count.

Do you want me to send a v2?

Thanks,
Long
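
The two-step clamp proposed above can be sketched in userspace as
follows. The core-based default and the hardware limit are passed in as
plain parameters, and min_uint() and default_num_queues() are
hypothetical names standing in for the kernel's min() and the driver
code:

```c
#define MANA_DEF_NUM_QUEUES 16u

/* Smaller of two unsigned values (stand-in for the kernel's min()). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Cap the core-based default at 16, then at the hardware limit. */
static unsigned int default_num_queues(unsigned int core_default,
				       unsigned int hw_max)
{
	unsigned int n = min_uint(core_default, MANA_DEF_NUM_QUEUES);

	return min_uint(n, hw_max);
}
```

Under this sketch a core-based default of 64 with a hardware limit of 32
gives 16 queues, a default of 4 stays at 4, and a hardware limit of 8
wins over both.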

^ permalink raw reply

* Re: [RFC PATCH V3] x86/VMBus: Confidential VMBus for dynamic DMA transfers
From: Tianyu Lan @ 2026-03-27  9:28 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: kys, haiyangz, wei.liu, decui, longli, m.szyprowski, robin.murphy,
	Tianyu Lan, iommu, linux-hyperv, linux-kernel, hch, vdso,
	Michael Kelley
In-Reply-To: <20260325092200.GQ814676@unreal>

On Wed, Mar 25, 2026 at 5:22 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Wed, Mar 25, 2026 at 03:56:49AM -0400, Tianyu Lan wrote:
> > Hyper-V provides Confidential VMBus to communicate between
> > device model and device guest driver via encrypted/private
> > memory in Confidential VM. The device model is in OpenHCL
> > (https://openvmm.dev/guide/user_guide/openhcl.html) that
> > plays the paravisor role.
> >
> > For a VMBus device, there are two communication methods to
> > talk with Host/Hypervisor. 1) VMBUS Ring buffer 2) Dynamic
> > DMA transfer.
> >
> > The Confidential VMBus Ring buffer has been upstreamed by
> > Roman Kisel(commit 6802d8af47d1).
> >
> > The dynamic DMA transition of VMBus device normally goes
> > through DMA core and it uses SWIOTLB as bounce buffer in
> > a CoCo VM.
> >
> > The Confidential VMBus device can do DMA directly to
> > private/encrypted memory. Because the swiotlb is decrypted
> > memory, the DMA transfer must not be bounced through the
> > swiotlb, so as to preserve confidentiality. This is different
> > from the default for Linux CoCo VMs, so disable the VMBus
> > device's use of swiotlb.
> >
> > Expose swiotlb_dev_disable() from DMA Core to disable
> > bounce buffer for device.
>
> It feels awkward and like a layering violation to let arbitrary kernel
> drivers manipulate SWIOTLB, which sits beneath the DMA core.
>

Hi Leon:
     Thanks for your review. I will try another approach, since the DMA
core currently has no standard way to disable swiotlb for a device.

-- 
Thanks
Tianyu Lan

^ permalink raw reply

* Re: [RFC PATCH V3] x86/VMBus: Confidential VMBus for dynamic DMA transfers
From: Tianyu Lan @ 2026-03-27  9:32 UTC (permalink / raw)
  To: Easwar Hariharan
  Cc: kys, haiyangz, wei.liu, decui, longli, m.szyprowski, robin.murphy,
	Tianyu Lan, iommu, linux-hyperv, linux-kernel, hch, vdso,
	Michael Kelley
In-Reply-To: <75c6dd78-bbae-4f5a-94ef-9de299720d38@linux.microsoft.com>

On Fri, Mar 27, 2026 at 1:05 AM Easwar Hariharan
<easwar.hariharan@linux.microsoft.com> wrote:
>
> On 3/25/2026 12:56 AM, Tianyu Lan wrote:
> > Hyper-V provides Confidential VMBus to communicate between
> > device model and device guest driver via encrypted/private
> > memory in Confidential VM. The device model is in OpenHCL
> > (https://openvmm.dev/guide/user_guide/openhcl.html) that
> > plays the paravisor role.
> >
> > For a VMBus device, there are two communication methods to
> > talk with Host/Hypervisor. 1) VMBUS Ring buffer 2) Dynamic
> > DMA transfer.
> >
> > The Confidential VMBus Ring buffer has been upstreamed by
> > Roman Kisel(commit 6802d8af47d1).
> >
> > The dynamic DMA transition of VMBus device normally goes
> > through DMA core and it uses SWIOTLB as bounce buffer in
> > a CoCo VM.
> >
> > The Confidential VMBus device can do DMA directly to
> > private/encrypted memory. Because the swiotlb is decrypted
> > memory, the DMA transfer must not be bounced through the
> > swiotlb, so as to preserve confidentiality. This is different
> > from the default for Linux CoCo VMs, so disable the VMBus
> > device's use of swiotlb.
> >
> > Expose swiotlb_dev_disable() from DMA Core to disable
> > bounce buffer for device.
> >
> > Suggested-by: Michael Kelley <mhklinux@outlook.com>
> > Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> > ---
> >  drivers/hv/vmbus_drv.c  | 6 +++++-
> >  include/linux/swiotlb.h | 5 +++++
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index 3d1a58b667db..84e6971fc90f 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -2184,11 +2184,15 @@ int vmbus_device_register(struct hv_device *child_device_obj)
> >       child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
> >       dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
> >
> > +     device_initialize(&child_device_obj->device);
> > +     if (child_device_obj->channel->co_external_memory)
> > +             swiotlb_dev_disable(&child_device_obj->device);
> > +
> >       /*
> >        * Register with the LDM. This will kick off the driver/device
> >        * binding...which will eventually call vmbus_match() and vmbus_probe()
> >        */
> > -     ret = device_register(&child_device_obj->device);
> > +     ret = device_add(&child_device_obj->device);
> >       if (ret) {
> >               pr_err("Unable to register child device\n");
> >               put_device(&child_device_obj->device);
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 3dae0f592063..7c572570d5d9 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -169,6 +169,11 @@ static inline struct io_tlb_pool *swiotlb_find_pool(struct device *dev,
> >       return NULL;
> >  }
> >
> > +static inline bool swiotlb_dev_disable(struct device *dev)
> > +{
> > +     return dev->dma_io_tlb_mem == NULL;
>
> Is there an extra = here?
>
> - Easwar (he/him)

Hi Easwar:
     Thanks for your review. Nice catch. I will try another way to disable
the device's bounce buffer in the next version.
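
For readers following the thread, the difference Easwar points at can be
shown with a small userspace sketch. The struct definitions below are
stand-ins, not the kernel's, and both function names are hypothetical.
Clearing the pointer is only one guess at the intent; it is not a claim
about how swiotlb should actually be disabled for a device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration; the real kernel structs differ. */
struct io_tlb_mem {
	int unused;
};

struct device {
	struct io_tlb_mem *dma_io_tlb_mem;
};

/*
 * Same expression as in the posted hunk: "==" only compares, so this
 * reports whether the pointer is already NULL and changes nothing.
 */
static bool disable_as_posted(struct device *dev)
{
	return dev->dma_io_tlb_mem == NULL;
}

/*
 * Assumed intent (hypothetical): "=" assigns, so the device no longer
 * points at any bounce-buffer pool after the call.
 */
static void disable_by_clearing(struct device *dev)
{
	dev->dma_io_tlb_mem = NULL;
}
```

Calling disable_as_posted() on a device that has a pool returns false and
leaves the pool attached, which is why the posted helper has no effect on
the device.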
-- 
Thanks
Tianyu Lan

^ permalink raw reply

* [PATCH 0/6] Hyper-V: kexec fixes for L1VH (mshv)
From: Jork Loeser @ 2026-03-27 20:19 UTC (permalink / raw)
  To: linux-hyperv
  Cc: x86, K . Y . Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui,
	Long Li, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H . Peter Anvin, Arnd Bergmann, Roman Kisel,
	Michael Kelley, linux-kernel, linux-arch, Jork Loeser

This series fixes kexec support when Linux runs as an L1 Virtual Host
(L1VH) under Hyper-V, using the MSHV driver to manage child VMs.

1. A variable shadowing bug in vmbus that hides the cpuhp state used
   for teardown.

2. Move hv_stimer_global_cleanup() from vmbus's hv_kexec_handler() to
   hv_machine_shutdown(). This ensures stimer cleanup happens before
   the vmbus unload.

3. LP/VP re-creation: after kexec, logical processors and virtual
   processors already exist in the hypervisor. Detect this and skip
   re-adding them.

4-5. SynIC cleanup: the MSHV driver manages its own SynIC resources
     separately from vmbus. Add proper teardown of MSHV-owned SINTs,
     SIMP, and SIEFP on kexec, scoped to only the resources MSHV
     owns.

6. Debugfs stats pages: unmap the VP statistics overlay pages before
   kexec to avoid stale mappings in the new kernel.
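
As background for item 1, the general shape of a variable shadowing bug
can be sketched in plain C. This is an illustration under assumptions, not
the vmbus code: cpuhp_state, fake_setup_state(), init_shadowed(),
init_fixed() and teardown_sees() are all hypothetical names, and 42 is an
arbitrary value standing in for a dynamically allocated state id.

```c
#include <assert.h>

/* File-scope state that a later teardown path is expected to read. */
static int cpuhp_state = -1;

/* Hypothetical stand-in for a setup call that returns a dynamic state id. */
static int fake_setup_state(void)
{
	return 42;
}

/* Buggy shape: the local declaration shadows the file-scope variable. */
static void init_shadowed(void)
{
	int cpuhp_state;	/* hides the file-scope cpuhp_state */

	cpuhp_state = fake_setup_state();
	(void)cpuhp_state;	/* the id is lost when the function returns */
}

/* Fixed shape: no local declaration, so the file-scope variable is set. */
static void init_fixed(void)
{
	cpuhp_state = fake_setup_state();
}

/* What a teardown path would see when it reads the file-scope state. */
static int teardown_sees(void)
{
	return cpuhp_state;
}
```

After init_shadowed() the teardown path still sees the initial -1, so it
cannot remove the state that was set up; after init_fixed() it sees the
real id.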

Jork Loeser (6):
  Drivers: hv: vmbus: fix hyperv_cpuhp_online variable shadowing
  x86/hyperv: move stimer cleanup to hv_machine_shutdown()
  x86/hyperv: Skip LP/VP creation on kexec
  mshv: limit SynIC management to MSHV-owned resources
  mshv: clean up SynIC state on kexec for L1VH
  mshv: unmap debugfs stats pages on kexec

 arch/x86/kernel/cpu/mshyperv.c |  15 +++-
 drivers/hv/hv_proc.c           |  47 +++++++++++
 drivers/hv/mshv_debugfs.c      |   7 +-
 drivers/hv/mshv_root_main.c    |  22 ++---
 drivers/hv/mshv_synic.c        | 144 ++++++++++++++++++++++-----------
 drivers/hv/vmbus_drv.c         |   2 -
 include/asm-generic/mshyperv.h |  10 +++
 include/hyperv/hvgdk_mini.h    |   1 +
 include/hyperv/hvhdk_mini.h    |  12 +++
 9 files changed, 190 insertions(+), 70 deletions(-)

-- 
2.43.0

^ permalink raw reply
