linux-rdma.vger.kernel.org archive mirror
* [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp
@ 2023-06-20 13:55 Bob Pearson
  2023-06-20 13:55 ` [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines Bob Pearson
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Bob Pearson @ 2023-06-20 13:55 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson, syzbot+2da1965168e7dbcba136

If a call to rxe_create_qp() fails in rxe_qp_from_init(),
rxe_cleanup(qp) will be called. This cleanup code currently does not
correctly handle cases where not all of the qp resources have been
allocated and can seg fault as reported below. The first two patches
clean up the cases where this happens. The third patch corrects an
error in rxe_srq.c where, if the caller requests a change in the srq
size, the correct new value is not returned to the caller.

This patch series applies cleanly to the current for-next branch.

Reported-by: syzbot+2da1965168e7dbcba136@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-rdma/00000000000012d89205fe7cfe00@google.com/raw
Fixes: 49dc9c1f0c7e ("RDMA/rxe: Cleanup reset state handling in rxe_resp.c")
Fixes: fbdeb828a21f ("RDMA/rxe: Cleanup error state handling in rxe_comp.c")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
v2: Reverted a partially implemented change in patch 3/3 which was
    incorrect.

Bob Pearson (3):
  RDMA/rxe: Move work queue code to subroutines
  RDMA/rxe: Fix unsafe drain work queue code
  RDMA/rxe: Fix rxe_modify_srq

 drivers/infiniband/sw/rxe/rxe_comp.c |   4 +
 drivers/infiniband/sw/rxe/rxe_loc.h  |   6 -
 drivers/infiniband/sw/rxe/rxe_qp.c   | 163 ++++++++++++++++++---------
 drivers/infiniband/sw/rxe/rxe_resp.c |   4 +
 drivers/infiniband/sw/rxe/rxe_srq.c  |  60 ++++++----
 5 files changed, 152 insertions(+), 85 deletions(-)


base-commit: 830f93f47068b1632cc127871fbf27e918efdf46
-- 
2.39.2



* [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines
  2023-06-20 13:55 [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Bob Pearson
@ 2023-06-20 13:55 ` Bob Pearson
  2023-06-20 14:49   ` Zhu Yanjun
  2023-06-20 13:55 ` [PATCH for-next v2 2/3] RDMA/rxe: Fix unsafe drain work queue code Bob Pearson
  2023-07-31 18:25 ` [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Jason Gunthorpe
  2 siblings, 1 reply; 7+ messages in thread
From: Bob Pearson @ 2023-06-20 13:55 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson

This patch:
	- Moves code to initialize a qp send work queue to a
	  subroutine named rxe_init_sq.
	- Moves code to initialize a qp recv work queue to a
	  subroutine named rxe_init_rq.
	- Moves initialization of qp request and response packet
	  queues ahead of work queue initialization so that cleanup
	  of a qp if it is not fully completed can successfully
	  attempt to drain the packet queues without a seg fault.
	- Makes minor whitespace cleanups.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_qp.c | 163 +++++++++++++++++++----------
 1 file changed, 108 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 95d4a6760c33..9dbb16699c36 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -180,13 +180,63 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
 	atomic_set(&qp->skb_out, 0);
 }
 
+static int rxe_init_sq(struct rxe_qp *qp, struct ib_qp_init_attr *init,
+		       struct ib_udata *udata,
+		       struct rxe_create_qp_resp __user *uresp)
+{
+	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+	int wqe_size;
+	int err;
+
+	qp->sq.max_wr = init->cap.max_send_wr;
+	wqe_size = max_t(int, init->cap.max_send_sge * sizeof(struct ib_sge),
+			 init->cap.max_inline_data);
+	qp->sq.max_sge = wqe_size / sizeof(struct ib_sge);
+	qp->sq.max_inline = wqe_size;
+	wqe_size += sizeof(struct rxe_send_wqe);
+
+	qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr, wqe_size,
+				      QUEUE_TYPE_FROM_CLIENT);
+	if (!qp->sq.queue) {
+		rxe_err_qp(qp, "Unable to allocate send queue");
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	/* prepare info for caller to mmap send queue if user space qp */
+	err = do_mmap_info(rxe, uresp ? &uresp->sq_mi : NULL, udata,
+			   qp->sq.queue->buf, qp->sq.queue->buf_size,
+			   &qp->sq.queue->ip);
+	if (err) {
+		rxe_err_qp(qp, "do_mmap_info failed, err = %d", err);
+		goto err_free;
+	}
+
+	/* return actual capabilities to caller which may be larger
+	 * than requested
+	 */
+	init->cap.max_send_wr = qp->sq.max_wr;
+	init->cap.max_send_sge = qp->sq.max_sge;
+	init->cap.max_inline_data = qp->sq.max_inline;
+
+	return 0;
+
+err_free:
+	vfree(qp->sq.queue->buf);
+	kfree(qp->sq.queue);
+	qp->sq.queue = NULL;
+err_out:
+	return err;
+}
+
 static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 			   struct ib_qp_init_attr *init, struct ib_udata *udata,
 			   struct rxe_create_qp_resp __user *uresp)
 {
 	int err;
-	int wqe_size;
-	enum queue_type type;
+
+	/* if we don't finish qp create make sure queue is valid */
+	skb_queue_head_init(&qp->req_pkts);
 
 	err = sock_create_kern(&init_net, AF_INET, SOCK_DGRAM, 0, &qp->sk);
 	if (err < 0)
@@ -201,32 +251,10 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	 * (0xc000 - 0xffff).
 	 */
 	qp->src_port = RXE_ROCE_V2_SPORT + (hash_32(qp_num(qp), 14) & 0x3fff);
-	qp->sq.max_wr		= init->cap.max_send_wr;
-
-	/* These caps are limited by rxe_qp_chk_cap() done by the caller */
-	wqe_size = max_t(int, init->cap.max_send_sge * sizeof(struct ib_sge),
-			 init->cap.max_inline_data);
-	qp->sq.max_sge = init->cap.max_send_sge =
-		wqe_size / sizeof(struct ib_sge);
-	qp->sq.max_inline = init->cap.max_inline_data = wqe_size;
-	wqe_size += sizeof(struct rxe_send_wqe);
-
-	type = QUEUE_TYPE_FROM_CLIENT;
-	qp->sq.queue = rxe_queue_init(rxe, &qp->sq.max_wr,
-				wqe_size, type);
-	if (!qp->sq.queue)
-		return -ENOMEM;
 
-	err = do_mmap_info(rxe, uresp ? &uresp->sq_mi : NULL, udata,
-			   qp->sq.queue->buf, qp->sq.queue->buf_size,
-			   &qp->sq.queue->ip);
-
-	if (err) {
-		vfree(qp->sq.queue->buf);
-		kfree(qp->sq.queue);
-		qp->sq.queue = NULL;
+	err = rxe_init_sq(qp, init, udata, uresp);
+	if (err)
 		return err;
-	}
 
 	qp->req.wqe_index = queue_get_producer(qp->sq.queue,
 					       QUEUE_TYPE_FROM_CLIENT);
@@ -234,8 +262,6 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	qp->req.opcode		= -1;
 	qp->comp.opcode		= -1;
 
-	skb_queue_head_init(&qp->req_pkts);
-
 	rxe_init_task(&qp->req.task, qp, rxe_requester);
 	rxe_init_task(&qp->comp.task, qp, rxe_completer);
 
@@ -247,40 +273,67 @@ static int rxe_qp_init_req(struct rxe_dev *rxe, struct rxe_qp *qp,
 	return 0;
 }
 
+static int rxe_init_rq(struct rxe_qp *qp, struct ib_qp_init_attr *init,
+		       struct ib_udata *udata,
+		       struct rxe_create_qp_resp __user *uresp)
+{
+	struct rxe_dev *rxe = to_rdev(qp->ibqp.device);
+	int wqe_size;
+	int err;
+
+	qp->rq.max_wr = init->cap.max_recv_wr;
+	qp->rq.max_sge = init->cap.max_recv_sge;
+	wqe_size = sizeof(struct rxe_recv_wqe) +
+				qp->rq.max_sge*sizeof(struct ib_sge);
+
+	qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr, wqe_size,
+				      QUEUE_TYPE_FROM_CLIENT);
+	if (!qp->rq.queue) {
+		rxe_err_qp(qp, "Unable to allocate recv queue");
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	/* prepare info for caller to mmap recv queue if user space qp */
+	err = do_mmap_info(rxe, uresp ? &uresp->rq_mi : NULL, udata,
+			   qp->rq.queue->buf, qp->rq.queue->buf_size,
+			   &qp->rq.queue->ip);
+	if (err) {
+		rxe_err_qp(qp, "do_mmap_info failed, err = %d", err);
+		goto err_free;
+	}
+
+	/* return actual capabilities to caller which may be larger
+	 * than requested
+	 */
+	init->cap.max_recv_wr = qp->rq.max_wr;
+
+	return 0;
+
+err_free:
+	vfree(qp->rq.queue->buf);
+	kfree(qp->rq.queue);
+	qp->rq.queue = NULL;
+err_out:
+	return err;
+}
+
 static int rxe_qp_init_resp(struct rxe_dev *rxe, struct rxe_qp *qp,
 			    struct ib_qp_init_attr *init,
 			    struct ib_udata *udata,
 			    struct rxe_create_qp_resp __user *uresp)
 {
 	int err;
-	int wqe_size;
-	enum queue_type type;
+
+	/* if we don't finish qp create make sure queue is valid */
+	skb_queue_head_init(&qp->resp_pkts);
 
 	if (!qp->srq) {
-		qp->rq.max_wr		= init->cap.max_recv_wr;
-		qp->rq.max_sge		= init->cap.max_recv_sge;
-
-		wqe_size = rcv_wqe_size(qp->rq.max_sge);
-
-		type = QUEUE_TYPE_FROM_CLIENT;
-		qp->rq.queue = rxe_queue_init(rxe, &qp->rq.max_wr,
-					wqe_size, type);
-		if (!qp->rq.queue)
-			return -ENOMEM;
-
-		err = do_mmap_info(rxe, uresp ? &uresp->rq_mi : NULL, udata,
-				   qp->rq.queue->buf, qp->rq.queue->buf_size,
-				   &qp->rq.queue->ip);
-		if (err) {
-			vfree(qp->rq.queue->buf);
-			kfree(qp->rq.queue);
-			qp->rq.queue = NULL;
+		err = rxe_init_rq(qp, init, udata, uresp);
+		if (err)
 			return err;
-		}
 	}
 
-	skb_queue_head_init(&qp->resp_pkts);
-
 	rxe_init_task(&qp->resp.task, qp, rxe_responder);
 
 	qp->resp.opcode		= OPCODE_NONE;
@@ -307,10 +360,10 @@ int rxe_qp_from_init(struct rxe_dev *rxe, struct rxe_qp *qp, struct rxe_pd *pd,
 	if (srq)
 		rxe_get(srq);
 
-	qp->pd			= pd;
-	qp->rcq			= rcq;
-	qp->scq			= scq;
-	qp->srq			= srq;
+	qp->pd = pd;
+	qp->rcq = rcq;
+	qp->scq = scq;
+	qp->srq = srq;
 
 	atomic_inc(&rcq->num_wq);
 	atomic_inc(&scq->num_wq);
-- 
2.39.2



* [PATCH for-next v2 2/3] RDMA/rxe: Fix unsafe drain work queue code
  2023-06-20 13:55 [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Bob Pearson
  2023-06-20 13:55 ` [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines Bob Pearson
@ 2023-06-20 13:55 ` Bob Pearson
  2023-07-31 18:25 ` [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Jason Gunthorpe
  2 siblings, 0 replies; 7+ messages in thread
From: Bob Pearson @ 2023-06-20 13:55 UTC (permalink / raw)
  To: jgg, zyjzyj2000, linux-rdma; +Cc: Bob Pearson, syzbot+2da1965168e7dbcba136

If create_qp does not fully succeed, it is possible for the qp cleanup
code to attempt to drain the send or recv work queues before the
queues have been created, causing a seg fault. This patch checks
whether the queues exist before attempting to drain them.

Reported-by: syzbot+2da1965168e7dbcba136@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-rdma/00000000000012d89205fe7cfe00@google.com/raw
Fixes: 49dc9c1f0c7e ("RDMA/rxe: Cleanup reset state handling in rxe_resp.c")
Fixes: fbdeb828a21f ("RDMA/rxe: Cleanup error state handling in rxe_comp.c")
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_comp.c | 4 ++++
 drivers/infiniband/sw/rxe/rxe_resp.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 0c0ae214c3a9..c2a53418a9ce 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -594,6 +594,10 @@ static void flush_send_queue(struct rxe_qp *qp, bool notify)
 	struct rxe_queue *q = qp->sq.queue;
 	int err;
 
+	/* send queue never got created. nothing to do. */
+	if (!qp->sq.queue)
+		return;
+
 	while ((wqe = queue_head(q, q->type))) {
 		if (notify) {
 			err = flush_send_wqe(qp, wqe);
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 8a3c9c2c5a2d..9b4f95820b40 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -1467,6 +1467,10 @@ static void flush_recv_queue(struct rxe_qp *qp, bool notify)
 		return;
 	}
 
+	/* recv queue not created. nothing to do. */
+	if (!qp->rq.queue)
+		return;
+
 	while ((wqe = queue_head(q, q->type))) {
 		if (notify) {
 			err = flush_recv_wqe(qp, wqe);
-- 
2.39.2



* Re: [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines
  2023-06-20 13:55 ` [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines Bob Pearson
@ 2023-06-20 14:49   ` Zhu Yanjun
  2023-06-20 15:02     ` Bob Pearson
  0 siblings, 1 reply; 7+ messages in thread
From: Zhu Yanjun @ 2023-06-20 14:49 UTC (permalink / raw)
  To: Bob Pearson; +Cc: jgg, linux-rdma

On Tue, Jun 20, 2023 at 9:55 PM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> This patch:
>         - Moves code to initialize a qp send work queue to a
>           subroutine named rxe_init_sq.
>         - Moves code to initialize a qp recv work queue to a
>           subroutine named rxe_init_rq.

This is a use-before-initialization problem. It is better to
initialize the sq/rq queues before the queues are used.
These 3 commits are complicated. It is easy to introduce some risks
just like in the first version. A compact fix is preferred.
But these commits seem to fix the problem.

Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>




* Re: [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines
  2023-06-20 14:49   ` Zhu Yanjun
@ 2023-06-20 15:02     ` Bob Pearson
  2023-06-21  6:21       ` Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: Bob Pearson @ 2023-06-20 15:02 UTC (permalink / raw)
  To: Zhu Yanjun; +Cc: jgg, linux-rdma

On 6/20/23 09:49, Zhu Yanjun wrote:
> On Tue, Jun 20, 2023 at 9:55 PM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>
>> This patch:
>>         - Moves code to initialize a qp send work queue to a
>>           subroutine named rxe_init_sq.
>>         - Moves code to initialize a qp recv work queue to a
>>           subroutine named rxe_init_rq.
> 
> This is a use-before-initialization problem. It is better to
> initialize the sq/rq queues before the queues are used.
> These 3 commits are complicated. It is easy to introduce some risks
> just like in the first version. A compact fix is preferred.
> But these commits seem to fix the problem.
> 
> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>

The fix to the reported problem is in patch 2/3, which is very simple.
Patch 1/3 mainly just cuts and pastes the code to init the queues into a
subroutine without any functional change. But it fixes another potential
use-before-set issue with the packet queues simply by initializing
them first.
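
As a rough illustration of why initializing the packet queue heads first
helps (user-space sketch with invented names, not the driver code; the
real change just moves skb_queue_head_init() to the top of
rxe_qp_init_req() and rxe_qp_init_resp()):

#include <stdio.h>

/* A list head that is initialized up front can always be drained,
 * even if the rest of object creation fails later; an uninitialized
 * head holds garbage pointers and draining it is undefined behavior.
 */
struct node {
        struct node *next;
};

struct pkt_list {
        struct node head;       /* circular: head.next == &head when empty */
};

static void pkt_list_init(struct pkt_list *l)
{
        l->head.next = &l->head;
}

static void pkt_list_drain(struct pkt_list *l)
{
        while (l->head.next != &l->head)
                l->head.next = l->head.next->next;      /* drop entries */
}

int main(void)
{
        struct pkt_list pkts;

        pkt_list_init(&pkts);   /* done first, before anything that can fail */
        /* ... pretend the rest of qp creation fails here ... */
        pkt_list_drain(&pkts);  /* safe: the list is a valid empty list */
        printf("drained an empty, initialized list safely\n");
        return 0;
}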

I am planning on spending today looking at the namespace patches.

Thanks for looking - Bob

* Re: [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines
  2023-06-20 15:02     ` Bob Pearson
@ 2023-06-21  6:21       ` Zhu Yanjun
  0 siblings, 0 replies; 7+ messages in thread
From: Zhu Yanjun @ 2023-06-21  6:21 UTC (permalink / raw)
  To: Bob Pearson, Zhu Yanjun; +Cc: jgg, linux-rdma

On 2023/6/20 23:02, Bob Pearson wrote:
> On 6/20/23 09:49, Zhu Yanjun wrote:
>> On Tue, Jun 20, 2023 at 9:55 PM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>>
>>> This patch:
>>>          - Moves code to initialize a qp send work queue to a
>>>            subroutine named rxe_init_sq.
>>>          - Moves code to initialize a qp recv work queue to a
>>>            subroutine named rxe_init_rq.
>>
>> This is a use-before-initialization problem. It is better to
>> initialize the sq/rq queues before the queues are used.
>> These 3 commits are complicated. It is easy to introduce some risks
>> just like in the first version. A compact fix is preferred.
>> But these commits seem to fix the problem.
>>
>> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
> 
> The fix to the reported problem is in patch 2/3, which is very simple.
> Patch 1/3 mainly just cuts and pastes the code to init the queues into a
> subroutine without any functional change. But it fixes another potential
> use-before-set issue with the packet queues simply by initializing
> them first.
> 
> I am planning on spending today looking at the namespace patches.

Thanks, Bob

Zhu Yanjun


* Re: [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp
  2023-06-20 13:55 [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Bob Pearson
  2023-06-20 13:55 ` [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines Bob Pearson
  2023-06-20 13:55 ` [PATCH for-next v2 2/3] RDMA/rxe: Fix unsafe drain work queue code Bob Pearson
@ 2023-07-31 18:25 ` Jason Gunthorpe
  2 siblings, 0 replies; 7+ messages in thread
From: Jason Gunthorpe @ 2023-07-31 18:25 UTC (permalink / raw)
  To: Bob Pearson; +Cc: zyjzyj2000, linux-rdma, syzbot+2da1965168e7dbcba136

On Tue, Jun 20, 2023 at 08:55:17AM -0500, Bob Pearson wrote:
> If a call to rxe_create_qp() fails in rxe_qp_from_init(),
> rxe_cleanup(qp) will be called. This cleanup code currently does not
> correctly handle cases where not all of the qp resources have been
> allocated and can seg fault as reported below. The first two patches
> clean up the cases where this happens. The third patch corrects an
> error in rxe_srq.c where, if the caller requests a change in the srq
> size, the correct new value is not returned to the caller.
> 
> This patch series applies cleanly to the current for-next branch.
> 
> Reported-by: syzbot+2da1965168e7dbcba136@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/linux-rdma/00000000000012d89205fe7cfe00@google.com/raw
> Fixes: 49dc9c1f0c7e ("RDMA/rxe: Cleanup reset state handling in rxe_resp.c")
> Fixes: fbdeb828a21f ("RDMA/rxe: Cleanup error state handling in rxe_comp.c")
> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
> ---
> v2: Reverted a partially implemented change in patch 3/3 which was
>     incorrect.
> 
> Bob Pearson (3):
>   RDMA/rxe: Move work queue code to subroutines
>   RDMA/rxe: Fix unsafe drain work queue code
>   RDMA/rxe: Fix rxe_modify_srq

Applied to for-next, thanks

Jason


end of thread

Thread overview (7+ messages):
2023-06-20 13:55 [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Bob Pearson
2023-06-20 13:55 ` [PATCH for-next v2 1/3] RDMA/rxe: Move work queue code to subroutines Bob Pearson
2023-06-20 14:49   ` Zhu Yanjun
2023-06-20 15:02     ` Bob Pearson
2023-06-21  6:21       ` Zhu Yanjun
2023-06-20 13:55 ` [PATCH for-next v2 2/3] RDMA/rxe: Fix unsafe drain work queue code Bob Pearson
2023-07-31 18:25 ` [PATCH for-next v2 0/3] RDMA/rxe: Fix error path code in rxe_create_qp Jason Gunthorpe
