* [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib
@ 2025-01-20 17:27 Konstantin Taranov
  2025-01-20 17:27 ` [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
                   ` (13 more replies)
  0 siblings, 14 replies; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

This patch series enables GSI QPs and CM (connection management) on mana_ib.
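
For context, a minimal sketch (not part of the series; ibdev, pd and the
sizing values below are placeholders) of the kernel consumer path that the
new verbs enable:

        struct ib_cq *cq = ib_alloc_cq(ibdev, NULL, 64, 0, IB_POLL_SOFTIRQ);
        struct ib_qp_init_attr init = {
                .qp_type  = IB_QPT_UD,          /* or IB_QPT_GSI for QP1 */
                .send_cq  = cq,
                .recv_cq  = cq,
                .port_num = 1,
                .cap = {
                        .max_send_wr  = 32, .max_recv_wr  = 32,
                        .max_send_sge = 2,  .max_recv_sge = 2,
                },
        };
        /* udata == NULL inside the driver, so the new kernel-level paths are taken */
        struct ib_qp *qp = ib_create_qp(pd, &init);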

Konstantin Taranov (13):
  RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
  RDMA/mana_ib: implement get_dma_mr
  RDMA/mana_ib: helpers to allocate kernel queues
  RDMA/mana_ib: create kernel-level CQs
  RDMA/mana_ib: Create and destroy UD/GSI QP
  RDMA/mana_ib: UD/GSI QP creation for kernel
  RDMA/mana_ib: create/destroy AH
  net/mana: fix warning in the writer of client oob
  RDMA/mana_ib: UD/GSI work requests
  RDMA/mana_ib: implement req_notify_cq
  RDMA/mana_ib: extend mana QP table
  RDMA/mana_ib: polling of CQs for GSI/UD
  RDMA/mana_ib: indicate CM support

 drivers/infiniband/hw/mana/Makefile           |   2 +-
 drivers/infiniband/hw/mana/ah.c               |  58 +++++
 drivers/infiniband/hw/mana/cq.c               | 227 ++++++++++++++--
 drivers/infiniband/hw/mana/device.c           |  18 +-
 drivers/infiniband/hw/mana/main.c             |  95 ++++++-
 drivers/infiniband/hw/mana/mana_ib.h          | 157 ++++++++++-
 drivers/infiniband/hw/mana/mr.c               |  36 +++
 drivers/infiniband/hw/mana/qp.c               | 245 +++++++++++++++++-
 drivers/infiniband/hw/mana/shadow_queue.h     | 115 ++++++++
 drivers/infiniband/hw/mana/wr.c               | 168 ++++++++++++
 .../net/ethernet/microsoft/mana/gdma_main.c   |   7 +-
 include/net/mana/gdma.h                       |   6 +
 12 files changed, 1096 insertions(+), 38 deletions(-)
 create mode 100644 drivers/infiniband/hw/mana/ah.c
 create mode 100644 drivers/infiniband/hw/mana/shadow_queue.h
 create mode 100644 drivers/infiniband/hw/mana/wr.c

-- 
2.43.0


* [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:17   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr Konstantin Taranov
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Allow the HW to register DMA-mapped memory for kernel-level PDs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c | 3 +++
 include/net/mana/gdma.h           | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 67c2d43..45b251b 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -82,6 +82,9 @@ int mana_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	mana_gd_init_req_hdr(&req.hdr, GDMA_CREATE_PD, sizeof(req),
 			     sizeof(resp));
 
+	if (!udata)
+		flags |= GDMA_PD_FLAG_ALLOW_GPA_MR;
+
 	req.flags = flags;
 	err = mana_gd_send_request(gc, sizeof(req), &req,
 				   sizeof(resp), &resp);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 90f5665..03e1b25 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -775,6 +775,7 @@ struct gdma_destroy_dma_region_req {
 
 enum gdma_pd_flags {
 	GDMA_PD_FLAG_INVALID = 0,
+	GDMA_PD_FLAG_ALLOW_GPA_MR = 1,
 };
 
 struct gdma_create_pd_req {
-- 
2.43.0


* [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
  2025-01-20 17:27 ` [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:18   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues Konstantin Taranov
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement allocation of DMA-mapped memory regions.
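
For context (not part of this patch): ib_core calls .get_dma_mr while
allocating a kernel PD on devices without IB_DEVICE_LOCAL_DMA_LKEY, and
kernel consumers then reference the resulting MR via pd->local_dma_lkey.
A hedged sketch, with dma_addr and len as placeholders:

        struct ib_sge sge = {
                .addr   = dma_addr,             /* bus address from the DMA API */
                .length = len,
                .lkey   = pd->local_dma_lkey,   /* backed by the GPA MR added here */
        };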

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/device.c |  1 +
 drivers/infiniband/hw/mana/mr.c     | 36 +++++++++++++++++++++++++++++
 include/net/mana/gdma.h             |  5 ++++
 3 files changed, 42 insertions(+)

diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 7ac0191..215dbce 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -32,6 +32,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table,
 	.destroy_wq = mana_ib_destroy_wq,
 	.disassociate_ucontext = mana_ib_disassociate_ucontext,
+	.get_dma_mr = mana_ib_get_dma_mr,
 	.get_link_layer = mana_ib_get_link_layer,
 	.get_port_immutable = mana_ib_get_port_immutable,
 	.mmap = mana_ib_mmap,
diff --git a/drivers/infiniband/hw/mana/mr.c b/drivers/infiniband/hw/mana/mr.c
index 887b09d..3a047f8 100644
--- a/drivers/infiniband/hw/mana/mr.c
+++ b/drivers/infiniband/hw/mana/mr.c
@@ -8,6 +8,8 @@
 #define VALID_MR_FLAGS                                                         \
 	(IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE | IB_ACCESS_REMOTE_READ)
 
+#define VALID_DMA_MR_FLAGS (IB_ACCESS_LOCAL_WRITE)
+
 static enum gdma_mr_access_flags
 mana_ib_verbs_to_gdma_access_flags(int access_flags)
 {
@@ -39,6 +41,8 @@ static int mana_ib_gd_create_mr(struct mana_ib_dev *dev, struct mana_ib_mr *mr,
 	req.mr_type = mr_params->mr_type;
 
 	switch (mr_params->mr_type) {
+	case GDMA_MR_TYPE_GPA:
+		break;
 	case GDMA_MR_TYPE_GVA:
 		req.gva.dma_region_handle = mr_params->gva.dma_region_handle;
 		req.gva.virtual_address = mr_params->gva.virtual_address;
@@ -169,6 +173,38 @@ err_free:
 	return ERR_PTR(err);
 }
 
+struct ib_mr *mana_ib_get_dma_mr(struct ib_pd *ibpd, int access_flags)
+{
+	struct mana_ib_pd *pd = container_of(ibpd, struct mana_ib_pd, ibpd);
+	struct gdma_create_mr_params mr_params = {};
+	struct ib_device *ibdev = ibpd->device;
+	struct mana_ib_dev *dev;
+	struct mana_ib_mr *mr;
+	int err;
+
+	dev = container_of(ibdev, struct mana_ib_dev, ib_dev);
+
+	if (access_flags & ~VALID_DMA_MR_FLAGS)
+		return ERR_PTR(-EINVAL);
+
+	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
+	if (!mr)
+		return ERR_PTR(-ENOMEM);
+
+	mr_params.pd_handle = pd->pd_handle;
+	mr_params.mr_type = GDMA_MR_TYPE_GPA;
+
+	err = mana_ib_gd_create_mr(dev, mr, &mr_params);
+	if (err)
+		goto err_free;
+
+	return &mr->ibmr;
+
+err_free:
+	kfree(mr);
+	return ERR_PTR(err);
+}
+
 int mana_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)
 {
 	struct mana_ib_mr *mr = container_of(ibmr, struct mana_ib_mr, ibmr);
diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h
index 03e1b25..a94b04e 100644
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -801,6 +801,11 @@ struct gdma_destory_pd_resp {
 };/* HW DATA */
 
 enum gdma_mr_type {
+	/*
+	 * Guest Physical Address - MRs of this type allow access
+	 * to any DMA-mapped memory using a bus-logical address
+	 */
+	GDMA_MR_TYPE_GPA = 1,
 	/* Guest Virtual Address - MRs of this type allow access
 	 * to memory mapped by PTEs associated with this MR using a virtual
 	 * address that is set up in the MST
-- 
2.43.0


* [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
  2025-01-20 17:27 ` [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
  2025-01-20 17:27 ` [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:25   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs Konstantin Taranov
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Introduce helpers to allocate queues for kernel-level use.
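
A hedged sketch of how later patches in the series use the helper
(mirrors patch 04; error handling trimmed):

        err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ, &cq->queue);
        if (err)
                return err;
        /* ... on teardown, the common destroy path releases the kernel queue */
        mana_ib_destroy_queue(mdev, &cq->queue);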

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c             | 23 +++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h          |  3 +++
 .../net/ethernet/microsoft/mana/gdma_main.c   |  1 +
 3 files changed, 27 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 45b251b..f2f6bb3 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -240,6 +240,27 @@ void mana_ib_dealloc_ucontext(struct ib_ucontext *ibcontext)
 		ibdev_dbg(ibdev, "Failed to destroy doorbell page %d\n", ret);
 }
 
+int mana_ib_create_kernel_queue(struct mana_ib_dev *mdev, u32 size, enum gdma_queue_type type,
+				struct mana_ib_queue *queue)
+{
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct gdma_queue_spec spec = {};
+	int err;
+
+	queue->id = INVALID_QUEUE_ID;
+	queue->gdma_region = GDMA_INVALID_DMA_REGION;
+	spec.type = type;
+	spec.monitor_avl_buf = false;
+	spec.queue_size = size;
+	err = mana_gd_create_mana_wq_cq(&gc->mana_ib, &spec, &queue->kmem);
+	if (err)
+		return err;
+	/* mana_ib takes ownership of the DMA region from mana */
+	queue->gdma_region = queue->kmem->mem_info.dma_region_handle;
+	queue->kmem->mem_info.dma_region_handle = GDMA_INVALID_DMA_REGION;
+	return 0;
+}
+
 int mana_ib_create_queue(struct mana_ib_dev *mdev, u64 addr, u32 size,
 			 struct mana_ib_queue *queue)
 {
@@ -279,6 +300,8 @@ void mana_ib_destroy_queue(struct mana_ib_dev *mdev, struct mana_ib_queue *queue
 	 */
 	mana_ib_gd_destroy_dma_region(mdev, queue->gdma_region);
 	ib_umem_release(queue->umem);
+	if (queue->kmem)
+		mana_gd_destroy_queue(mdev_to_gc(mdev), queue->kmem);
 }
 
 static int
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index b53a5b4..79ebd95 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -52,6 +52,7 @@ struct mana_ib_adapter_caps {
 
 struct mana_ib_queue {
 	struct ib_umem *umem;
+	struct gdma_queue *kmem;
 	u64 gdma_region;
 	u64 id;
 };
@@ -388,6 +389,8 @@ int mana_ib_create_dma_region(struct mana_ib_dev *dev, struct ib_umem *umem,
 int mana_ib_gd_destroy_dma_region(struct mana_ib_dev *dev,
 				  mana_handle_t gdma_region);
 
+int mana_ib_create_kernel_queue(struct mana_ib_dev *mdev, u32 size, enum gdma_queue_type type,
+				struct mana_ib_queue *queue);
 int mana_ib_create_queue(struct mana_ib_dev *mdev, u64 addr, u32 size,
 			 struct mana_ib_queue *queue);
 void mana_ib_destroy_queue(struct mana_ib_dev *mdev, struct mana_ib_queue *queue);
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index e97af7a..3cb0543 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -867,6 +867,7 @@ free_q:
 	kfree(queue);
 	return err;
 }
+EXPORT_SYMBOL_NS(mana_gd_create_mana_wq_cq, NET_MANA);
 
 void mana_gd_destroy_queue(struct gdma_context *gc, struct gdma_queue *queue)
 {
-- 
2.43.0


* [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (2 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:36   ` Long Li
  2025-01-29  0:48   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP Konstantin Taranov
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement creation of CQs for the kernel.
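
For context (not part of this patch), a hedged sketch of the kernel call
that now takes the udata == NULL branch below; ibdev is a placeholder:

        struct ib_cq *cq = ib_alloc_cq(ibdev, NULL /* private */, 64 /* cqe */,
                                       0 /* comp_vector */, IB_POLL_SOFTIRQ);
        if (IS_ERR(cq))
                return PTR_ERR(cq);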

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c | 80 +++++++++++++++++++++------------
 1 file changed, 52 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index f04a679..d26d82d 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -15,42 +15,57 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	struct ib_device *ibdev = ibcq->device;
 	struct mana_ib_create_cq ucmd = {};
 	struct mana_ib_dev *mdev;
+	struct gdma_context *gc;
 	bool is_rnic_cq;
 	u32 doorbell;
+	u32 buf_size;
 	int err;
 
 	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
+	gc = mdev_to_gc(mdev);
 
 	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
 	cq->cq_handle = INVALID_MANA_HANDLE;
 
-	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
-		return -EINVAL;
+	if (udata) {
+		if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
+			return -EINVAL;
 
-	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
-	if (err) {
-		ibdev_dbg(ibdev,
-			  "Failed to copy from udata for create cq, %d\n", err);
-		return err;
-	}
+		err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen));
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to copy from udata for create cq, %d\n", err);
+			return err;
+		}
 
-	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
+		is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
 
-	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
-		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
-		return -EINVAL;
-	}
+		if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
+			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
+			return -EINVAL;
+		}
 
-	cq->cqe = attr->cqe;
-	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe * COMP_ENTRY_SIZE, &cq->queue);
-	if (err) {
-		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n", err);
-		return err;
-	}
+		cq->cqe = attr->cqe;
+		err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe * COMP_ENTRY_SIZE,
+					   &cq->queue);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n", err);
+			return err;
+		}
 
-	mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
-						  ibucontext);
-	doorbell = mana_ucontext->doorbell;
+		mana_ucontext = rdma_udata_to_drv_context(udata, struct mana_ib_ucontext,
+							  ibucontext);
+		doorbell = mana_ucontext->doorbell;
+	} else {
+		is_rnic_cq = true;
+		buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr->cqe * COMP_ENTRY_SIZE));
+		cq->cqe = buf_size / COMP_ENTRY_SIZE;
+		err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ, &cq->queue);
+		if (err) {
+			ibdev_dbg(ibdev, "Failed to create kernel queue for create cq, %d\n", err);
+			return err;
+		}
+		doorbell = gc->mana_ib.doorbell;
+	}
 
 	if (is_rnic_cq) {
 		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
@@ -66,11 +81,13 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		}
 	}
 
-	resp.cqid = cq->queue.id;
-	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
-	if (err) {
-		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
-		goto err_remove_cq_cb;
+	if (udata) {
+		resp.cqid = cq->queue.id;
+		err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
+		if (err) {
+			ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
+			goto err_remove_cq_cb;
+		}
 	}
 
 	return 0;
@@ -122,7 +139,10 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 		return -EINVAL;
 	/* Create CQ table entry */
 	WARN_ON(gc->cq_table[cq->queue.id]);
-	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
+	if (cq->queue.kmem)
+		gdma_cq = cq->queue.kmem;
+	else
+		gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
 	if (!gdma_cq)
 		return -ENOMEM;
 
@@ -141,6 +161,10 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	if (cq->queue.id >= gc->max_num_cqs || cq->queue.id == INVALID_QUEUE_ID)
 		return;
 
+	if (cq->queue.kmem)
+		/* Kernel queues are cleaned up and removed by mana */
+		return;
+
 	kfree(gc->cq_table[cq->queue.id]);
 	gc->cq_table[cq->queue.id] = NULL;
 }
-- 
2.43.0


* [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (3 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:40   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel Konstantin Taranov
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement HW requests to create and destroy UD/GSI QPs.
A UD/GSI QP has a send queue and a receive queue.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    | 58 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/mana_ib.h | 49 +++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index f2f6bb3..b0c55cb 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -1013,3 +1013,61 @@ int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
 	}
 	return 0;
 }
+
+int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
+			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type)
+{
+	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_pd *pd = container_of(qp->ibqp.pd, struct mana_ib_pd, ibpd);
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	struct mana_rnic_create_udqp_resp resp = {};
+	struct mana_rnic_create_udqp_req req = {};
+	int err, i;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CREATE_UD_QP, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.pd_handle = pd->pd_handle;
+	req.send_cq_handle = send_cq->cq_handle;
+	req.recv_cq_handle = recv_cq->cq_handle;
+	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; i++)
+		req.dma_region[i] = qp->ud_qp.queues[i].gdma_region;
+	req.doorbell_page = doorbell;
+	req.max_send_wr = attr->cap.max_send_wr;
+	req.max_recv_wr = attr->cap.max_recv_wr;
+	req.max_send_sge = attr->cap.max_send_sge;
+	req.max_recv_sge = attr->cap.max_recv_sge;
+	req.qp_type = type;
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create ud qp err %d", err);
+		return err;
+	}
+	qp->qp_handle = resp.qp_handle;
+	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; i++) {
+		qp->ud_qp.queues[i].id = resp.queue_ids[i];
+		/* The GDMA regions are now owned by the RNIC QP handle */
+		qp->ud_qp.queues[i].gdma_region = GDMA_INVALID_DMA_REGION;
+	}
+	return 0;
+}
+
+int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	struct mana_rnic_destroy_udqp_resp resp = {0};
+	struct mana_rnic_destroy_udqp_req req = {0};
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	int err;
+
+	mana_gd_init_req_hdr(&req.hdr, MANA_IB_DESTROY_UD_QP, sizeof(req), sizeof(resp));
+	req.hdr.dev_id = gc->mana_ib.dev_id;
+	req.adapter = mdev->adapter_handle;
+	req.qp_handle = qp->qp_handle;
+	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to destroy ud qp err %d", err);
+		return err;
+	}
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 79ebd95..5e470f1 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -115,6 +115,17 @@ struct mana_ib_rc_qp {
 	struct mana_ib_queue queues[MANA_RC_QUEUE_TYPE_MAX];
 };
 
+enum mana_ud_queue_type {
+	MANA_UD_SEND_QUEUE = 0,
+	MANA_UD_RECV_QUEUE,
+	MANA_UD_QUEUE_TYPE_MAX,
+};
+
+struct mana_ib_ud_qp {
+	struct mana_ib_queue queues[MANA_UD_QUEUE_TYPE_MAX];
+	u32 sq_psn;
+};
+
 struct mana_ib_qp {
 	struct ib_qp ibqp;
 
@@ -122,6 +133,7 @@ struct mana_ib_qp {
 	union {
 		struct mana_ib_queue raw_sq;
 		struct mana_ib_rc_qp rc_qp;
+		struct mana_ib_ud_qp ud_qp;
 	};
 
 	/* The port on the IB device, starting with 1 */
@@ -146,6 +158,8 @@ enum mana_ib_command_code {
 	MANA_IB_DESTROY_ADAPTER = 0x30003,
 	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
 	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
+	MANA_IB_CREATE_UD_QP	= 0x30006,
+	MANA_IB_DESTROY_UD_QP	= 0x30007,
 	MANA_IB_CREATE_CQ       = 0x30008,
 	MANA_IB_DESTROY_CQ      = 0x30009,
 	MANA_IB_CREATE_RC_QP    = 0x3000a,
@@ -297,6 +311,37 @@ struct mana_rnic_destroy_rc_qp_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+struct mana_rnic_create_udqp_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t pd_handle;
+	mana_handle_t send_cq_handle;
+	mana_handle_t recv_cq_handle;
+	u64 dma_region[MANA_UD_QUEUE_TYPE_MAX];
+	u32 qp_type;
+	u32 doorbell_page;
+	u32 max_send_wr;
+	u32 max_recv_wr;
+	u32 max_send_sge;
+	u32 max_recv_sge;
+}; /* HW Data */
+
+struct mana_rnic_create_udqp_resp {
+	struct gdma_resp_hdr hdr;
+	mana_handle_t qp_handle;
+	u32 queue_ids[MANA_UD_QUEUE_TYPE_MAX];
+}; /* HW Data*/
+
+struct mana_rnic_destroy_udqp_req {
+	struct gdma_req_hdr hdr;
+	mana_handle_t adapter;
+	mana_handle_t qp_handle;
+}; /* HW Data */
+
+struct mana_rnic_destroy_udqp_resp {
+	struct gdma_resp_hdr hdr;
+}; /* HW Data */
+
 struct mana_ib_ah_attr {
 	u8 src_addr[16];
 	u8 dest_addr[16];
@@ -483,4 +528,8 @@ int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev, struct mana_ib_cq *cq);
 int mana_ib_gd_create_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
 			    struct ib_qp_init_attr *attr, u32 doorbell, u64 flags);
 int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
+
+int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
+			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type);
+int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
 #endif
-- 
2.43.0


* [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (4 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:45   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH Konstantin Taranov
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement UD/GSI QPs for the kernel.
Allow create/modify/destroy for such QPs.
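
For context (not part of this patch), a hedged sketch of the modify call a
kernel consumer issues once the QP exists; the switch below routes it to
mana_ib_gd_modify_qp():

        struct ib_qp_attr attr = { .qp_state = IB_QPS_RTS, .sq_psn = 0 };
        int err;

        err = ib_modify_qp(qp, &attr, IB_QP_STATE | IB_QP_SQ_PSN);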

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/qp.c | 115 ++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)

diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 73d67c8..fea45be 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -398,6 +398,52 @@ err_free_vport:
 	return err;
 }
 
+static u32 mana_ib_wqe_size(u32 sge, u32 oob_size)
+{
+	u32 wqe_size = sge * sizeof(struct gdma_sge) + sizeof(struct gdma_wqe) + oob_size;
+
+	return ALIGN(wqe_size, GDMA_WQE_BU_SIZE);
+}
+
+static u32 mana_ib_queue_size(struct ib_qp_init_attr *attr, u32 queue_type)
+{
+	u32 queue_size;
+
+	switch (attr->qp_type) {
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		if (queue_type == MANA_UD_SEND_QUEUE)
+			queue_size = attr->cap.max_send_wr *
+				mana_ib_wqe_size(attr->cap.max_send_sge, INLINE_OOB_LARGE_SIZE);
+		else
+			queue_size = attr->cap.max_recv_wr *
+				mana_ib_wqe_size(attr->cap.max_recv_sge, INLINE_OOB_SMALL_SIZE);
+		break;
+	default:
+		return 0;
+	}
+
+	return MANA_PAGE_ALIGN(roundup_pow_of_two(queue_size));
+}
+
+static enum gdma_queue_type mana_ib_queue_type(struct ib_qp_init_attr *attr, u32 queue_type)
+{
+	enum gdma_queue_type type;
+
+	switch (attr->qp_type) {
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		if (queue_type == MANA_UD_SEND_QUEUE)
+			type = GDMA_SQ;
+		else
+			type = GDMA_RQ;
+		break;
+	default:
+		type = GDMA_INVALID_QUEUE;
+	}
+	return type;
+}
+
 static int mana_table_store_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
 {
 	refcount_set(&qp->refcount, 1);
@@ -490,6 +536,51 @@ destroy_queues:
 	return err;
 }
 
+static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
+				struct ib_qp_init_attr *attr, struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev = container_of(ibpd->device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
+	struct gdma_context *gc = mdev_to_gc(mdev);
+	u32 doorbell, queue_size;
+	int i, err;
+
+	if (udata) {
+		ibdev_dbg(&mdev->ib_dev, "User-level UD QPs are not supported\n");
+		return -EINVAL;
+	}
+
+	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i) {
+		queue_size = mana_ib_queue_size(attr, i);
+		err = mana_ib_create_kernel_queue(mdev, queue_size, mana_ib_queue_type(attr, i),
+						  &qp->ud_qp.queues[i]);
+		if (err) {
+			ibdev_err(&mdev->ib_dev, "Failed to create queue %d, err %d\n",
+				  i, err);
+			goto destroy_queues;
+		}
+	}
+	doorbell = gc->mana_ib.doorbell;
+
+	err = mana_ib_gd_create_ud_qp(mdev, qp, attr, doorbell, attr->qp_type);
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create ud qp  %d\n", err);
+		goto destroy_queues;
+	}
+	qp->ibqp.qp_num = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
+	qp->port = attr->port_num;
+
+	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
+		qp->ud_qp.queues[i].kmem->id = qp->ud_qp.queues[i].id;
+
+	return 0;
+
+destroy_queues:
+	while (i-- > 0)
+		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
+	return err;
+}
+
 int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
 		      struct ib_udata *udata)
 {
@@ -503,6 +594,9 @@ int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
 		return mana_ib_create_qp_raw(ibqp, ibqp->pd, attr, udata);
 	case IB_QPT_RC:
 		return mana_ib_create_rc_qp(ibqp, ibqp->pd, attr, udata);
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		return mana_ib_create_ud_qp(ibqp, ibqp->pd, attr, udata);
 	default:
 		ibdev_dbg(ibqp->device, "Creating QP type %u not supported\n",
 			  attr->qp_type);
@@ -579,6 +673,8 @@ int mana_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 {
 	switch (ibqp->qp_type) {
 	case IB_QPT_RC:
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
 		return mana_ib_gd_modify_qp(ibqp, attr, attr_mask, udata);
 	default:
 		ibdev_dbg(ibqp->device, "Modify QP type %u not supported", ibqp->qp_type);
@@ -652,6 +748,22 @@ static int mana_ib_destroy_rc_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
 	return 0;
 }
 
+static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev =
+		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
+	int i;
+
+	/* Ignore return code as there is not much we can do about it.
+	 * The error message is printed inside.
+	 */
+	mana_ib_gd_destroy_ud_qp(mdev, qp);
+	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
+		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
+
+	return 0;
+}
+
 int mana_ib_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 {
 	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
@@ -665,6 +777,9 @@ int mana_ib_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 		return mana_ib_destroy_qp_raw(qp, udata);
 	case IB_QPT_RC:
 		return mana_ib_destroy_rc_qp(qp, udata);
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		return mana_ib_destroy_ud_qp(qp, udata);
 	default:
 		ibdev_dbg(ibqp->device, "Unexpected QP type %u\n",
 			  ibqp->qp_type);
-- 
2.43.0


* [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (5 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:53   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob Konstantin Taranov
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement create and destroy AH for the kernel.

In mana_ib, the AV (address vector) is passed as an SGE in the WQE.
Allocate DMA memory from a pool and write the AV there.
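
The AV buffer allocated here is consumed by the send path added in patch
09, which prepends it as SGE 0 of every UD/GSI send WQE:

        gdma_sgl[0].address = ah->dma_handle;              /* AV written by mana_ib_create_ah() */
        gdma_sgl[0].mem_key = qp->ibqp.pd->local_dma_lkey; /* GPA MR from patch 02 */
        gdma_sgl[0].size    = sizeof(struct mana_ib_av);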

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/Makefile  |  2 +-
 drivers/infiniband/hw/mana/ah.c      | 58 ++++++++++++++++++++++++++++
 drivers/infiniband/hw/mana/device.c  | 13 ++++++-
 drivers/infiniband/hw/mana/mana_ib.h | 30 ++++++++++++++
 4 files changed, 101 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/hw/mana/ah.c

diff --git a/drivers/infiniband/hw/mana/Makefile b/drivers/infiniband/hw/mana/Makefile
index 88655fe..6e56f77 100644
--- a/drivers/infiniband/hw/mana/Makefile
+++ b/drivers/infiniband/hw/mana/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_MANA_INFINIBAND) += mana_ib.o
 
-mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o
+mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o
diff --git a/drivers/infiniband/hw/mana/ah.c b/drivers/infiniband/hw/mana/ah.c
new file mode 100644
index 0000000..f56952e
--- /dev/null
+++ b/drivers/infiniband/hw/mana/ah.c
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
+ */
+
+#include "mana_ib.h"
+
+int mana_ib_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *attr,
+		      struct ib_udata *udata)
+{
+	struct mana_ib_dev *mdev = container_of(ibah->device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_ah *ah = container_of(ibah, struct mana_ib_ah, ibah);
+	struct rdma_ah_attr *ah_attr = attr->ah_attr;
+	const struct ib_global_route *grh;
+	enum rdma_network_type ntype;
+
+	if (ah_attr->type != RDMA_AH_ATTR_TYPE_ROCE ||
+	    !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
+		return -EINVAL;
+
+	if (udata)
+		return -EINVAL;
+
+	ah->av = dma_pool_zalloc(mdev->av_pool, GFP_ATOMIC, &ah->dma_handle);
+	if (!ah->av)
+		return -ENOMEM;
+
+	grh = rdma_ah_read_grh(ah_attr);
+	ntype = rdma_gid_attr_network_type(grh->sgid_attr);
+
+	copy_in_reverse(ah->av->dest_mac, ah_attr->roce.dmac, ETH_ALEN);
+	ah->av->udp_src_port = rdma_flow_label_to_udp_sport(grh->flow_label);
+	ah->av->hop_limit = grh->hop_limit;
+	ah->av->dscp = (grh->traffic_class >> 2) & 0x3f;
+	ah->av->is_ipv6 = (ntype == RDMA_NETWORK_IPV6);
+
+	if (ah->av->is_ipv6) {
+		copy_in_reverse(ah->av->dest_ip, grh->dgid.raw, 16);
+		copy_in_reverse(ah->av->src_ip, grh->sgid_attr->gid.raw, 16);
+	} else {
+		ah->av->dest_ip[10] = 0xFF;
+		ah->av->dest_ip[11] = 0xFF;
+		copy_in_reverse(&ah->av->dest_ip[12], &grh->dgid.raw[12], 4);
+		copy_in_reverse(&ah->av->src_ip[12], &grh->sgid_attr->gid.raw[12], 4);
+	}
+
+	return 0;
+}
+
+int mana_ib_destroy_ah(struct ib_ah *ibah, u32 flags)
+{
+	struct mana_ib_dev *mdev = container_of(ibah->device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_ah *ah = container_of(ibah, struct mana_ib_ah, ibah);
+
+	dma_pool_free(mdev->av_pool, ah->av, ah->dma_handle);
+
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 215dbce..d534ef1 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -19,6 +19,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.add_gid = mana_ib_gd_add_gid,
 	.alloc_pd = mana_ib_alloc_pd,
 	.alloc_ucontext = mana_ib_alloc_ucontext,
+	.create_ah = mana_ib_create_ah,
 	.create_cq = mana_ib_create_cq,
 	.create_qp = mana_ib_create_qp,
 	.create_rwq_ind_table = mana_ib_create_rwq_ind_table,
@@ -27,6 +28,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.dealloc_ucontext = mana_ib_dealloc_ucontext,
 	.del_gid = mana_ib_gd_del_gid,
 	.dereg_mr = mana_ib_dereg_mr,
+	.destroy_ah = mana_ib_destroy_ah,
 	.destroy_cq = mana_ib_destroy_cq,
 	.destroy_qp = mana_ib_destroy_qp,
 	.destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table,
@@ -44,6 +46,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.query_port = mana_ib_query_port,
 	.reg_user_mr = mana_ib_reg_user_mr,
 
+	INIT_RDMA_OBJ_SIZE(ib_ah, mana_ib_ah, ibah),
 	INIT_RDMA_OBJ_SIZE(ib_cq, mana_ib_cq, ibcq),
 	INIT_RDMA_OBJ_SIZE(ib_pd, mana_ib_pd, ibpd),
 	INIT_RDMA_OBJ_SIZE(ib_qp, mana_ib_qp, ibqp),
@@ -135,15 +138,22 @@ static int mana_ib_probe(struct auxiliary_device *adev,
 		goto destroy_rnic;
 	}
 
+	dev->av_pool = dma_pool_create("mana_ib_av", mdev->gdma_context->dev,
+				       MANA_AV_BUFFER_SIZE, MANA_AV_BUFFER_SIZE, 0);
+	if (!dev->av_pool)
+		goto destroy_rnic;
+
 	ret = ib_register_device(&dev->ib_dev, "mana_%d",
 				 mdev->gdma_context->dev);
 	if (ret)
-		goto destroy_rnic;
+		goto deallocate_pool;
 
 	dev_set_drvdata(&adev->dev, dev);
 
 	return 0;
 
+deallocate_pool:
+	dma_pool_destroy(dev->av_pool);
 destroy_rnic:
 	xa_destroy(&dev->qp_table_wq);
 	mana_ib_gd_destroy_rnic_adapter(dev);
@@ -161,6 +171,7 @@ static void mana_ib_remove(struct auxiliary_device *adev)
 	struct mana_ib_dev *dev = dev_get_drvdata(&adev->dev);
 
 	ib_unregister_device(&dev->ib_dev);
+	dma_pool_destroy(dev->av_pool);
 	xa_destroy(&dev->qp_table_wq);
 	mana_ib_gd_destroy_rnic_adapter(dev);
 	mana_ib_destroy_eqs(dev);
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 5e470f1..7b079d8 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -11,6 +11,7 @@
 #include <rdma/ib_umem.h>
 #include <rdma/mana-abi.h>
 #include <rdma/uverbs_ioctl.h>
+#include <linux/dmapool.h>
 
 #include <net/mana/mana.h>
 
@@ -32,6 +33,11 @@
  */
 #define MANA_CA_ACK_DELAY	16
 
+/*
+ * Size of the buffer used for writing an AV
+ */
+#define MANA_AV_BUFFER_SIZE	64
+
 struct mana_ib_adapter_caps {
 	u32 max_sq_id;
 	u32 max_rq_id;
@@ -65,6 +71,7 @@ struct mana_ib_dev {
 	struct gdma_queue **eqs;
 	struct xarray qp_table_wq;
 	struct mana_ib_adapter_caps adapter_caps;
+	struct dma_pool *av_pool;
 };
 
 struct mana_ib_wq {
@@ -88,6 +95,25 @@ struct mana_ib_pd {
 	u32 tx_vp_offset;
 };
 
+struct mana_ib_av {
+	u8 dest_ip[16];
+	u8 dest_mac[ETH_ALEN];
+	u16 udp_src_port;
+	u8 src_ip[16];
+	u32 hop_limit	: 8;
+	u32 reserved1	: 12;
+	u32 dscp	: 6;
+	u32 reserved2	: 5;
+	u32 is_ipv6	: 1;
+	u32 reserved3	: 32;
+};
+
+struct mana_ib_ah {
+	struct ib_ah ibah;
+	struct mana_ib_av *av;
+	dma_addr_t dma_handle;
+};
+
 struct mana_ib_mr {
 	struct ib_mr ibmr;
 	struct ib_umem *umem;
@@ -532,4 +558,8 @@ int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
 int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
 			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type);
 int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
+
+int mana_ib_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
+		      struct ib_udata *udata);
+int mana_ib_destroy_ah(struct ib_ah *ah, u32 flags);
 #endif
-- 
2.43.0


* [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (6 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23 22:48   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests Konstantin Taranov
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Do not warn about missing pad_data when the client OOB is passed in the SGL.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/gdma_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 3cb0543..a8a9cd7 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1042,7 +1042,7 @@ static u32 mana_gd_write_client_oob(const struct gdma_wqe_request *wqe_req,
 	header->inline_oob_size_div4 = client_oob_size / sizeof(u32);
 
 	if (oob_in_sgl) {
-		WARN_ON_ONCE(!pad_data || wqe_req->num_sge < 2);
+		WARN_ON_ONCE(wqe_req->num_sge < 2);
 
 		header->client_oob_in_sgl = 1;
 
-- 
2.43.0


* [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (7 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23 18:20   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq Konstantin Taranov
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Implement post send and post recv for UD/GSI QPs.
Add information about posted requests into shadow queues.
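
A hedged sketch of the shadow-queue life cycle introduced below: one entry
is produced per posted WR and consumed when the matching CQE is polled
(the polling side comes in patch 12):

        /* post path (this patch) */
        shadow_wqe = shadow_queue_producer_entry(&qp->shadow_sq);
        shadow_wqe->header.wr_id = wr->wr.wr_id;
        shadow_queue_advance_producer(&qp->shadow_sq);

        /* completion path (patch 12) */
        shadow_queue_advance_next_to_complete(&qp->shadow_sq); /* HW CQE seen */
        shadow_queue_advance_consumer(&qp->shadow_sq);         /* returned from poll_cq */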

Co-developed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
Signed-off-by: Shiraz Saleem <shirazsaleem@microsoft.com>
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
---
 drivers/infiniband/hw/mana/Makefile           |   2 +-
 drivers/infiniband/hw/mana/device.c           |   2 +
 drivers/infiniband/hw/mana/mana_ib.h          |  33 ++++
 drivers/infiniband/hw/mana/qp.c               |  21 ++-
 drivers/infiniband/hw/mana/shadow_queue.h     | 115 ++++++++++++
 drivers/infiniband/hw/mana/wr.c               | 168 ++++++++++++++++++
 .../net/ethernet/microsoft/mana/gdma_main.c   |   2 +
 7 files changed, 341 insertions(+), 2 deletions(-)
 create mode 100644 drivers/infiniband/hw/mana/shadow_queue.h
 create mode 100644 drivers/infiniband/hw/mana/wr.c

diff --git a/drivers/infiniband/hw/mana/Makefile b/drivers/infiniband/hw/mana/Makefile
index 6e56f77..79426e7 100644
--- a/drivers/infiniband/hw/mana/Makefile
+++ b/drivers/infiniband/hw/mana/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_MANA_INFINIBAND) += mana_ib.o
 
-mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o
+mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o wr.o
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index d534ef1..1da86c3 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -40,6 +40,8 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.mmap = mana_ib_mmap,
 	.modify_qp = mana_ib_modify_qp,
 	.modify_wq = mana_ib_modify_wq,
+	.post_recv = mana_ib_post_recv,
+	.post_send = mana_ib_post_send,
 	.query_device = mana_ib_query_device,
 	.query_gid = mana_ib_query_gid,
 	.query_pkey = mana_ib_query_pkey,
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 7b079d8..6265c39 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -14,6 +14,7 @@
 #include <linux/dmapool.h>
 
 #include <net/mana/mana.h>
+#include "shadow_queue.h"
 
 #define PAGE_SZ_BM                                                             \
 	(SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K | SZ_256K |        \
@@ -165,6 +166,9 @@ struct mana_ib_qp {
 	/* The port on the IB device, starting with 1 */
 	u32 port;
 
+	struct shadow_queue shadow_rq;
+	struct shadow_queue shadow_sq;
+
 	refcount_t		refcount;
 	struct completion	free;
 };
@@ -404,6 +408,30 @@ struct mana_rnic_set_qp_state_resp {
 	struct gdma_resp_hdr hdr;
 }; /* HW Data */
 
+enum WQE_OPCODE_TYPES {
+	WQE_TYPE_UD_SEND = 0,
+	WQE_TYPE_UD_RECV = 8,
+}; /* HW DATA */
+
+struct rdma_send_oob {
+	u32 wqe_type	: 5;
+	u32 fence	: 1;
+	u32 signaled	: 1;
+	u32 solicited	: 1;
+	u32 psn		: 24;
+
+	u32 ssn_or_rqpn	: 24;
+	u32 reserved1	: 8;
+	union {
+		struct {
+			u32 remote_qkey;
+			u32 immediate;
+			u32 reserved1;
+			u32 reserved2;
+		} ud_send;
+	};
+}; /* HW DATA */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -562,4 +590,9 @@ int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
 int mana_ib_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
 		      struct ib_udata *udata);
 int mana_ib_destroy_ah(struct ib_ah *ah, u32 flags);
+
+int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
+		      const struct ib_recv_wr **bad_wr);
+int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
+		      const struct ib_send_wr **bad_wr);
 #endif
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index fea45be..051ea03 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -562,10 +562,23 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	}
 	doorbell = gc->mana_ib.doorbell;
 
+	err = create_shadow_queue(&qp->shadow_rq, attr->cap.max_recv_wr,
+				  sizeof(struct ud_rq_shadow_wqe));
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create shadow rq err %d\n", err);
+		goto destroy_queues;
+	}
+	err = create_shadow_queue(&qp->shadow_sq, attr->cap.max_send_wr,
+				  sizeof(struct ud_sq_shadow_wqe));
+	if (err) {
+		ibdev_err(&mdev->ib_dev, "Failed to create shadow sq err %d\n", err);
+		goto destroy_shadow_queues;
+	}
+
 	err = mana_ib_gd_create_ud_qp(mdev, qp, attr, doorbell, attr->qp_type);
 	if (err) {
 		ibdev_err(&mdev->ib_dev, "Failed to create ud qp  %d\n", err);
-		goto destroy_queues;
+		goto destroy_shadow_queues;
 	}
 	qp->ibqp.qp_num = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
 	qp->port = attr->port_num;
@@ -575,6 +588,9 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 
 	return 0;
 
+destroy_shadow_queues:
+	destroy_shadow_queue(&qp->shadow_rq);
+	destroy_shadow_queue(&qp->shadow_sq);
 destroy_queues:
 	while (i-- > 0)
 		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
@@ -754,6 +770,9 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
 		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
 	int i;
 
+	destroy_shadow_queue(&qp->shadow_rq);
+	destroy_shadow_queue(&qp->shadow_sq);
+
 	/* Ignore return code as there is not much we can do about it.
 	 * The error message is printed inside.
 	 */
diff --git a/drivers/infiniband/hw/mana/shadow_queue.h b/drivers/infiniband/hw/mana/shadow_queue.h
new file mode 100644
index 0000000..d8bfb4c
--- /dev/null
+++ b/drivers/infiniband/hw/mana/shadow_queue.h
@@ -0,0 +1,115 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/*
+ * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
+ */
+
+#ifndef _MANA_SHADOW_QUEUE_H_
+#define _MANA_SHADOW_QUEUE_H_
+
+struct shadow_wqe_header {
+	u16 opcode;
+	u16 error_code;
+	u32 posted_wqe_size;
+	u64 wr_id;
+};
+
+struct ud_rq_shadow_wqe {
+	struct shadow_wqe_header header;
+	u32 byte_len;
+	u32 src_qpn;
+};
+
+struct ud_sq_shadow_wqe {
+	struct shadow_wqe_header header;
+};
+
+struct shadow_queue {
+	/* Unmasked producer index, Incremented on wqe posting */
+	u64 prod_idx;
+	/* Unmasked consumer index, Incremented on cq polling */
+	u64 cons_idx;
+	/* Unmasked index of next-to-complete (from HW) shadow WQE */
+	u64 next_to_complete_idx;
+	/* queue size in wqes */
+	u32 length;
+	/* distance between elements in bytes */
+	u32 stride;
+	/* ring buffer holding wqes */
+	void *buffer;
+};
+
+static inline int create_shadow_queue(struct shadow_queue *queue, uint32_t length, uint32_t stride)
+{
+	queue->buffer = kvmalloc(length * stride, GFP_KERNEL);
+	if (!queue->buffer)
+		return -ENOMEM;
+
+	queue->length = length;
+	queue->stride = stride;
+
+	return 0;
+}
+
+static inline void destroy_shadow_queue(struct shadow_queue *queue)
+{
+	kvfree(queue->buffer);
+}
+
+static inline bool shadow_queue_full(struct shadow_queue *queue)
+{
+	return (queue->prod_idx - queue->cons_idx) >= queue->length;
+}
+
+static inline bool shadow_queue_empty(struct shadow_queue *queue)
+{
+	return queue->prod_idx == queue->cons_idx;
+}
+
+static inline void *
+shadow_queue_get_element(const struct shadow_queue *queue, u64 unmasked_index)
+{
+	u32 index = unmasked_index % queue->length;
+
+	return ((u8 *)queue->buffer + index * queue->stride);
+}
+
+static inline void *
+shadow_queue_producer_entry(struct shadow_queue *queue)
+{
+	return shadow_queue_get_element(queue, queue->prod_idx);
+}
+
+static inline void *
+shadow_queue_get_next_to_consume(const struct shadow_queue *queue)
+{
+	if (queue->cons_idx == queue->next_to_complete_idx)
+		return NULL;
+
+	return shadow_queue_get_element(queue, queue->cons_idx);
+}
+
+static inline void *
+shadow_queue_get_next_to_complete(struct shadow_queue *queue)
+{
+	if (queue->next_to_complete_idx == queue->prod_idx)
+		return NULL;
+
+	return shadow_queue_get_element(queue, queue->next_to_complete_idx);
+}
+
+static inline void shadow_queue_advance_producer(struct shadow_queue *queue)
+{
+	queue->prod_idx++;
+}
+
+static inline void shadow_queue_advance_consumer(struct shadow_queue *queue)
+{
+	queue->cons_idx++;
+}
+
+static inline void shadow_queue_advance_next_to_complete(struct shadow_queue *queue)
+{
+	queue->next_to_complete_idx++;
+}
+
+#endif
diff --git a/drivers/infiniband/hw/mana/wr.c b/drivers/infiniband/hw/mana/wr.c
new file mode 100644
index 0000000..1813567
--- /dev/null
+++ b/drivers/infiniband/hw/mana/wr.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
+ */
+
+#include "mana_ib.h"
+
+#define MAX_WR_SGL_NUM (2)
+
+static int mana_ib_post_recv_ud(struct mana_ib_qp *qp, const struct ib_recv_wr *wr)
+{
+	struct mana_ib_dev *mdev = container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
+	struct gdma_queue *queue = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].kmem;
+	struct gdma_posted_wqe_info wqe_info = {0};
+	struct gdma_sge gdma_sgl[MAX_WR_SGL_NUM];
+	struct gdma_wqe_request wqe_req = {0};
+	struct ud_rq_shadow_wqe *shadow_wqe;
+	int err, i;
+
+	if (shadow_queue_full(&qp->shadow_rq))
+		return -EINVAL;
+
+	if (wr->num_sge > MAX_WR_SGL_NUM)
+		return -EINVAL;
+
+	for (i = 0; i < wr->num_sge; ++i) {
+		gdma_sgl[i].address = wr->sg_list[i].addr;
+		gdma_sgl[i].mem_key = wr->sg_list[i].lkey;
+		gdma_sgl[i].size = wr->sg_list[i].length;
+	}
+	wqe_req.num_sge = wr->num_sge;
+	wqe_req.sgl = gdma_sgl;
+
+	err = mana_gd_post_work_request(queue, &wqe_req, &wqe_info);
+	if (err)
+		return err;
+
+	shadow_wqe = shadow_queue_producer_entry(&qp->shadow_rq);
+	memset(shadow_wqe, 0, sizeof(*shadow_wqe));
+	shadow_wqe->header.opcode = IB_WC_RECV;
+	shadow_wqe->header.wr_id = wr->wr_id;
+	shadow_wqe->header.posted_wqe_size = wqe_info.wqe_size_in_bu;
+	shadow_queue_advance_producer(&qp->shadow_rq);
+
+	mana_gd_wq_ring_doorbell(mdev_to_gc(mdev), queue);
+	return 0;
+}
+
+int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
+		      const struct ib_recv_wr **bad_wr)
+{
+	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
+	int err = 0;
+
+	for (; wr; wr = wr->next) {
+		switch (ibqp->qp_type) {
+		case IB_QPT_UD:
+		case IB_QPT_GSI:
+			err = mana_ib_post_recv_ud(qp, wr);
+			if (unlikely(err)) {
+				*bad_wr = wr;
+				return err;
+			}
+			break;
+		default:
+			ibdev_dbg(ibqp->device, "Posting recv wr on qp type %u is not supported\n",
+				  ibqp->qp_type);
+			return -EINVAL;
+		}
+	}
+
+	return err;
+}
+
+static int mana_ib_post_send_ud(struct mana_ib_qp *qp, const struct ib_ud_wr *wr)
+{
+	struct mana_ib_dev *mdev = container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
+	struct mana_ib_ah *ah = container_of(wr->ah, struct mana_ib_ah, ibah);
+	struct net_device *ndev = mana_ib_get_netdev(&mdev->ib_dev, qp->port);
+	struct gdma_queue *queue = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].kmem;
+	struct gdma_sge gdma_sgl[MAX_WR_SGL_NUM + 1];
+	struct gdma_posted_wqe_info wqe_info = {0};
+	struct gdma_wqe_request wqe_req = {0};
+	struct rdma_send_oob send_oob = {0};
+	struct ud_sq_shadow_wqe *shadow_wqe;
+	int err, i;
+
+	if (!ndev) {
+		ibdev_dbg(&mdev->ib_dev, "Invalid port %u in QP %u\n",
+			  qp->port, qp->ibqp.qp_num);
+		return -EINVAL;
+	}
+
+	if (wr->wr.opcode != IB_WR_SEND)
+		return -EINVAL;
+
+	if (shadow_queue_full(&qp->shadow_sq))
+		return -EINVAL;
+
+	if (wr->wr.num_sge > MAX_WR_SGL_NUM)
+		return -EINVAL;
+
+	gdma_sgl[0].address = ah->dma_handle;
+	gdma_sgl[0].mem_key = qp->ibqp.pd->local_dma_lkey;
+	gdma_sgl[0].size = sizeof(struct mana_ib_av);
+	for (i = 0; i < wr->wr.num_sge; ++i) {
+		gdma_sgl[i + 1].address = wr->wr.sg_list[i].addr;
+		gdma_sgl[i + 1].mem_key = wr->wr.sg_list[i].lkey;
+		gdma_sgl[i + 1].size = wr->wr.sg_list[i].length;
+	}
+
+	wqe_req.num_sge = wr->wr.num_sge + 1;
+	wqe_req.sgl = gdma_sgl;
+	wqe_req.inline_oob_size = sizeof(struct rdma_send_oob);
+	wqe_req.inline_oob_data = &send_oob;
+	wqe_req.flags = GDMA_WR_OOB_IN_SGL;
+	wqe_req.client_data_unit = ib_mtu_enum_to_int(ib_mtu_int_to_enum(ndev->mtu));
+
+	send_oob.wqe_type = WQE_TYPE_UD_SEND;
+	send_oob.fence = !!(wr->wr.send_flags & IB_SEND_FENCE);
+	send_oob.signaled = !!(wr->wr.send_flags & IB_SEND_SIGNALED);
+	send_oob.solicited = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
+	send_oob.psn = qp->ud_qp.sq_psn;
+	send_oob.ssn_or_rqpn = wr->remote_qpn;
+	send_oob.ud_send.remote_qkey =
+		qp->ibqp.qp_type == IB_QPT_GSI ? IB_QP1_QKEY : wr->remote_qkey;
+
+	err = mana_gd_post_work_request(queue, &wqe_req, &wqe_info);
+	if (err)
+		return err;
+
+	qp->ud_qp.sq_psn++;
+	shadow_wqe = shadow_queue_producer_entry(&qp->shadow_sq);
+	memset(shadow_wqe, 0, sizeof(*shadow_wqe));
+	shadow_wqe->header.opcode = IB_WC_SEND;
+	shadow_wqe->header.wr_id = wr->wr.wr_id;
+	shadow_wqe->header.posted_wqe_size = wqe_info.wqe_size_in_bu;
+	shadow_queue_advance_producer(&qp->shadow_sq);
+
+	mana_gd_wq_ring_doorbell(mdev_to_gc(mdev), queue);
+	return 0;
+}
+
+int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
+		      const struct ib_send_wr **bad_wr)
+{
+	int err = 0;
+	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
+
+	for (; wr; wr = wr->next) {
+		switch (ibqp->qp_type) {
+		case IB_QPT_UD:
+		case IB_QPT_GSI:
+			err = mana_ib_post_send_ud(qp, ud_wr(wr));
+			if (unlikely(err)) {
+				*bad_wr = wr;
+				return err;
+			}
+			break;
+		default:
+			ibdev_dbg(ibqp->device, "Posting send wr on qp type %u is not supported\n",
+				  ibqp->qp_type);
+			return -EINVAL;
+		}
+	}
+
+	return err;
+}
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index a8a9cd7..409e4e8 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -331,6 +331,7 @@ void mana_gd_wq_ring_doorbell(struct gdma_context *gc, struct gdma_queue *queue)
 	mana_gd_ring_doorbell(gc, queue->gdma_dev->doorbell, queue->type,
 			      queue->id, queue->head * GDMA_WQE_BU_SIZE, 0);
 }
+EXPORT_SYMBOL_NS(mana_gd_wq_ring_doorbell, NET_MANA);
 
 void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
 {
@@ -1149,6 +1150,7 @@ int mana_gd_post_work_request(struct gdma_queue *wq,
 
 	return 0;
 }
+EXPORT_SYMBOL_NS(mana_gd_post_work_request, NET_MANA);
 
 int mana_gd_post_and_ring(struct gdma_queue *queue,
 			  const struct gdma_wqe_request *wqe_req,
-- 
2.43.0


* [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (8 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  5:57   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table Konstantin Taranov
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Arm a CQ when req_notify_cq is called.
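
For context (not part of this patch), the kernel verb that lands in the
new handler and rings the CQ doorbell with the arm bit set:

        ib_req_notify_cq(ibcq, IB_CQ_NEXT_COMP);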

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c                 | 12 ++++++++++++
 drivers/infiniband/hw/mana/device.c             |  1 +
 drivers/infiniband/hw/mana/mana_ib.h            |  2 ++
 drivers/net/ethernet/microsoft/mana/gdma_main.c |  1 +
 4 files changed, 16 insertions(+)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index d26d82d..82f1462 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -168,3 +168,15 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
 	kfree(gc->cq_table[cq->queue.id]);
 	gc->cq_table[cq->queue.id] = NULL;
 }
+
+int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
+{
+	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
+	struct gdma_queue *gdma_cq = cq->queue.kmem;
+
+	if (!gdma_cq)
+		return -EINVAL;
+
+	mana_gd_ring_cq(gdma_cq, SET_ARM_BIT);
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 1da86c3..63e12c3 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -47,6 +47,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.query_pkey = mana_ib_query_pkey,
 	.query_port = mana_ib_query_port,
 	.reg_user_mr = mana_ib_reg_user_mr,
+	.req_notify_cq = mana_ib_arm_cq,
 
 	INIT_RDMA_OBJ_SIZE(ib_ah, mana_ib_ah, ibah),
 	INIT_RDMA_OBJ_SIZE(ib_cq, mana_ib_cq, ibcq),
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 6265c39..bd34ad6 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -595,4 +595,6 @@ int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
 		      const struct ib_recv_wr **bad_wr);
 int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
 		      const struct ib_send_wr **bad_wr);
+
+int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 #endif
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 409e4e8..823f7e7 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -344,6 +344,7 @@ void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
 	mana_gd_ring_doorbell(gc, cq->gdma_dev->doorbell, cq->type, cq->id,
 			      head, arm_bit);
 }
+EXPORT_SYMBOL_NS(mana_gd_ring_cq, NET_MANA);
 
 static void mana_gd_process_eqe(struct gdma_queue *eq)
 {
-- 
2.43.0
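
For reference, a sketch (not from the series) of the usual kernel
arm-then-repoll idiom that ends up in mana_ib_arm_cq(); it assumes the
poll_cq support added later in the series, and example_* is a placeholder
registered as the CQ completion handler.

#include <rdma/ib_verbs.h>

static void example_cq_comp_handler(struct ib_cq *cq, void *cq_context)
{
	struct ib_wc wc;

	do {
		while (ib_poll_cq(cq, 1, &wc) > 0)
			pr_info("wr_id %llu status %d\n", wc.wr_id, wc.status);
		/* re-arm: this calls .req_notify_cq, i.e. mana_ib_arm_cq() */
	} while (ib_req_notify_cq(cq, IB_CQ_NEXT_COMP |
				  IB_CQ_REPORT_MISSED_EVENTS) > 0);
}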


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (9 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23  6:02   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD Konstantin Taranov
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Extend the mana QP table so that it can store UD/GSI QPs.
For send queues, set the most significant bit of the queue ID to one,
as send and receive WQs can have the same ID in mana.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c    |  2 +-
 drivers/infiniband/hw/mana/mana_ib.h |  8 ++-
 drivers/infiniband/hw/mana/qp.c      | 78 ++++++++++++++++++++++++++--
 3 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index b0c55cb..114e391 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -704,7 +704,7 @@ mana_ib_event_handler(void *ctx, struct gdma_queue *q, struct gdma_event *event)
 	switch (event->type) {
 	case GDMA_EQE_RNIC_QP_FATAL:
 		qpn = event->details[0];
-		qp = mana_get_qp_ref(mdev, qpn);
+		qp = mana_get_qp_ref(mdev, qpn, false);
 		if (!qp)
 			break;
 		if (qp->ibqp.event_handler) {
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index bd34ad6..5e4ca55 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -23,6 +23,9 @@
 /* MANA doesn't have any limit for MR size */
 #define MANA_IB_MAX_MR_SIZE	U64_MAX
 
+/* Send queue ID mask */
+#define MANA_SENDQ_MASK	BIT(31)
+
 /*
  * The hardware limit of number of MRs is greater than maximum number of MRs
  * that can possibly represent in 24 bits
@@ -438,11 +441,14 @@ static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 }
 
 static inline struct mana_ib_qp *mana_get_qp_ref(struct mana_ib_dev *mdev,
-						 uint32_t qid)
+						 u32 qid, bool is_sq)
 {
 	struct mana_ib_qp *qp;
 	unsigned long flag;
 
+	if (is_sq)
+		qid |= MANA_SENDQ_MASK;
+
 	xa_lock_irqsave(&mdev->qp_table_wq, flag);
 	qp = xa_load(&mdev->qp_table_wq, qid);
 	if (qp)
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 051ea03..2528046 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -444,18 +444,82 @@ static enum gdma_queue_type mana_ib_queue_type(struct ib_qp_init_attr *attr, u32
 	return type;
 }
 
+static int mana_table_store_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	return xa_insert_irq(&mdev->qp_table_wq, qp->ibqp.qp_num, qp,
+			     GFP_KERNEL);
+}
+
+static void mana_table_remove_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	xa_erase_irq(&mdev->qp_table_wq, qp->ibqp.qp_num);
+}
+
+static int mana_table_store_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	u32 qids = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].id | MANA_SENDQ_MASK;
+	u32 qidr = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
+	int err;
+
+	err = xa_insert_irq(&mdev->qp_table_wq, qids, qp, GFP_KERNEL);
+	if (err)
+		return err;
+
+	err = xa_insert_irq(&mdev->qp_table_wq, qidr, qp, GFP_KERNEL);
+	if (err)
+		goto remove_sq;
+
+	return 0;
+
+remove_sq:
+	xa_erase_irq(&mdev->qp_table_wq, qids);
+	return err;
+}
+
+static void mana_table_remove_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
+{
+	u32 qids = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].id | MANA_SENDQ_MASK;
+	u32 qidr = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
+
+	xa_erase_irq(&mdev->qp_table_wq, qids);
+	xa_erase_irq(&mdev->qp_table_wq, qidr);
+}
+
 static int mana_table_store_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
 {
 	refcount_set(&qp->refcount, 1);
 	init_completion(&qp->free);
-	return xa_insert_irq(&mdev->qp_table_wq, qp->ibqp.qp_num, qp,
-			     GFP_KERNEL);
+
+	switch (qp->ibqp.qp_type) {
+	case IB_QPT_RC:
+		return mana_table_store_rc_qp(mdev, qp);
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		return mana_table_store_ud_qp(mdev, qp);
+	default:
+		ibdev_dbg(&mdev->ib_dev, "Unknown QP type for storing in mana table, %d\n",
+			  qp->ibqp.qp_type);
+	}
+
+	return -EINVAL;
 }
 
 static void mana_table_remove_qp(struct mana_ib_dev *mdev,
 				 struct mana_ib_qp *qp)
 {
-	xa_erase_irq(&mdev->qp_table_wq, qp->ibqp.qp_num);
+	switch (qp->ibqp.qp_type) {
+	case IB_QPT_RC:
+		mana_table_remove_rc_qp(mdev, qp);
+		break;
+	case IB_QPT_UD:
+	case IB_QPT_GSI:
+		mana_table_remove_ud_qp(mdev, qp);
+		break;
+	default:
+		ibdev_dbg(&mdev->ib_dev, "Unknown QP type for removing from mana table, %d\n",
+			  qp->ibqp.qp_type);
+		return;
+	}
 	mana_put_qp_ref(qp);
 	wait_for_completion(&qp->free);
 }
@@ -586,8 +650,14 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
 		qp->ud_qp.queues[i].kmem->id = qp->ud_qp.queues[i].id;
 
+	err = mana_table_store_qp(mdev, qp);
+	if (err)
+		goto destroy_qp;
+
 	return 0;
 
+destroy_qp:
+	mana_ib_gd_destroy_ud_qp(mdev, qp);
 destroy_shadow_queues:
 	destroy_shadow_queue(&qp->shadow_rq);
 	destroy_shadow_queue(&qp->shadow_sq);
@@ -770,6 +840,8 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
 		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
 	int i;
 
+	mana_table_remove_qp(mdev, qp);
+
 	destroy_shadow_queue(&qp->shadow_rq);
 	destroy_shadow_queue(&qp->shadow_sq);
 
-- 
2.43.0
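
Illustration only (not from the patch): the xarray key scheme described in
the commit message, assuming hardware queue IDs never set bit 31, which is
the premise of MANA_SENDQ_MASK; example_* is a placeholder.

#include <linux/types.h>
#include <linux/bits.h>

static u32 example_qp_table_key(u32 wq_id, bool is_sq)
{
	/* send and receive WQs may share an id, so send keys get bit 31 */
	return is_sq ? (wq_id | BIT(31)) : wq_id;
}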


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (10 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23 19:17   ` Long Li
  2025-01-20 17:27 ` [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support Konstantin Taranov
  2025-02-03 11:56 ` [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Leon Romanovsky
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Add polling for the kernel CQs.
Process completion events for UD/GSI QPs.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/cq.c               | 135 ++++++++++++++++++
 drivers/infiniband/hw/mana/device.c           |   1 +
 drivers/infiniband/hw/mana/mana_ib.h          |  32 +++++
 drivers/infiniband/hw/mana/qp.c               |  33 +++++
 .../net/ethernet/microsoft/mana/gdma_main.c   |   1 +
 5 files changed, 202 insertions(+)

diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
index 82f1462..5c325ef 100644
--- a/drivers/infiniband/hw/mana/cq.c
+++ b/drivers/infiniband/hw/mana/cq.c
@@ -90,6 +90,10 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		}
 	}
 
+	spin_lock_init(&cq->cq_lock);
+	INIT_LIST_HEAD(&cq->list_send_qp);
+	INIT_LIST_HEAD(&cq->list_recv_qp);
+
 	return 0;
 
 err_remove_cq_cb:
@@ -180,3 +184,134 @@ int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 	mana_gd_ring_cq(gdma_cq, SET_ARM_BIT);
 	return 0;
 }
+
+static inline void handle_ud_sq_cqe(struct mana_ib_qp *qp, struct gdma_comp *cqe)
+{
+	struct mana_rdma_cqe *rdma_cqe = (struct mana_rdma_cqe *)cqe->cqe_data;
+	struct gdma_queue *wq = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].kmem;
+	struct ud_sq_shadow_wqe *shadow_wqe;
+
+	shadow_wqe = shadow_queue_get_next_to_complete(&qp->shadow_sq);
+	if (!shadow_wqe)
+		return;
+
+	shadow_wqe->header.error_code = rdma_cqe->ud_send.vendor_error;
+
+	wq->tail += shadow_wqe->header.posted_wqe_size;
+	shadow_queue_advance_next_to_complete(&qp->shadow_sq);
+}
+
+static inline void handle_ud_rq_cqe(struct mana_ib_qp *qp, struct gdma_comp *cqe)
+{
+	struct mana_rdma_cqe *rdma_cqe = (struct mana_rdma_cqe *)cqe->cqe_data;
+	struct gdma_queue *wq = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].kmem;
+	struct ud_rq_shadow_wqe *shadow_wqe;
+
+	shadow_wqe = shadow_queue_get_next_to_complete(&qp->shadow_rq);
+	if (!shadow_wqe)
+		return;
+
+	shadow_wqe->byte_len = rdma_cqe->ud_recv.msg_len;
+	shadow_wqe->src_qpn = rdma_cqe->ud_recv.src_qpn;
+	shadow_wqe->header.error_code = IB_WC_SUCCESS;
+
+	wq->tail += shadow_wqe->header.posted_wqe_size;
+	shadow_queue_advance_next_to_complete(&qp->shadow_rq);
+}
+
+static void mana_handle_cqe(struct mana_ib_dev *mdev, struct gdma_comp *cqe)
+{
+	struct mana_ib_qp *qp = mana_get_qp_ref(mdev, cqe->wq_num, cqe->is_sq);
+
+	if (!qp)
+		return;
+
+	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_UD) {
+		if (cqe->is_sq)
+			handle_ud_sq_cqe(qp, cqe);
+		else
+			handle_ud_rq_cqe(qp, cqe);
+	}
+
+	mana_put_qp_ref(qp);
+}
+
+static void fill_verbs_from_shadow_wqe(struct mana_ib_qp *qp, struct ib_wc *wc,
+				       const struct shadow_wqe_header *shadow_wqe)
+{
+	const struct ud_rq_shadow_wqe *ud_wqe = (const struct ud_rq_shadow_wqe *)shadow_wqe;
+
+	wc->wr_id = shadow_wqe->wr_id;
+	wc->status = shadow_wqe->error_code;
+	wc->opcode = shadow_wqe->opcode;
+	wc->vendor_err = shadow_wqe->error_code;
+	wc->wc_flags = 0;
+	wc->qp = &qp->ibqp;
+	wc->pkey_index = 0;
+
+	if (shadow_wqe->opcode == IB_WC_RECV) {
+		wc->byte_len = ud_wqe->byte_len;
+		wc->src_qp = ud_wqe->src_qpn;
+		wc->wc_flags |= IB_WC_GRH;
+	}
+}
+
+static int mana_process_completions(struct mana_ib_cq *cq, int nwc, struct ib_wc *wc)
+{
+	struct shadow_wqe_header *shadow_wqe;
+	struct mana_ib_qp *qp;
+	int wc_index = 0;
+
+	/* process send shadow queue completions  */
+	list_for_each_entry(qp, &cq->list_send_qp, cq_send_list) {
+		while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_sq))
+				!= NULL) {
+			if (wc_index >= nwc)
+				goto out;
+
+			fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe);
+			shadow_queue_advance_consumer(&qp->shadow_sq);
+			wc_index++;
+		}
+	}
+
+	/* process recv shadow queue completions */
+	list_for_each_entry(qp, &cq->list_recv_qp, cq_recv_list) {
+		while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_rq))
+				!= NULL) {
+			if (wc_index >= nwc)
+				goto out;
+
+			fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe);
+			shadow_queue_advance_consumer(&qp->shadow_rq);
+			wc_index++;
+		}
+	}
+
+out:
+	return wc_index;
+}
+
+int mana_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
+{
+	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
+	struct mana_ib_dev *mdev = container_of(ibcq->device, struct mana_ib_dev, ib_dev);
+	struct gdma_queue *queue = cq->queue.kmem;
+	struct gdma_comp gdma_cqe;
+	unsigned long flags;
+	int num_polled = 0;
+	int comp_read, i;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	for (i = 0; i < num_entries; i++) {
+		comp_read = mana_gd_poll_cq(queue, &gdma_cqe, 1);
+		if (comp_read < 1)
+			break;
+		mana_handle_cqe(mdev, &gdma_cqe);
+	}
+
+	num_polled = mana_process_completions(cq, num_entries, wc);
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
+
+	return num_polled;
+}
diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
index 63e12c3..97502bc 100644
--- a/drivers/infiniband/hw/mana/device.c
+++ b/drivers/infiniband/hw/mana/device.c
@@ -40,6 +40,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
 	.mmap = mana_ib_mmap,
 	.modify_qp = mana_ib_modify_qp,
 	.modify_wq = mana_ib_modify_wq,
+	.poll_cq = mana_ib_poll_cq,
 	.post_recv = mana_ib_post_recv,
 	.post_send = mana_ib_post_send,
 	.query_device = mana_ib_query_device,
diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
index 5e4ca55..cd771af 100644
--- a/drivers/infiniband/hw/mana/mana_ib.h
+++ b/drivers/infiniband/hw/mana/mana_ib.h
@@ -127,6 +127,10 @@ struct mana_ib_mr {
 struct mana_ib_cq {
 	struct ib_cq ibcq;
 	struct mana_ib_queue queue;
+	/* protects CQ polling */
+	spinlock_t cq_lock;
+	struct list_head list_send_qp;
+	struct list_head list_recv_qp;
 	int cqe;
 	u32 comp_vector;
 	mana_handle_t  cq_handle;
@@ -169,6 +173,8 @@ struct mana_ib_qp {
 	/* The port on the IB device, starting with 1 */
 	u32 port;
 
+	struct list_head cq_send_list;
+	struct list_head cq_recv_list;
 	struct shadow_queue shadow_rq;
 	struct shadow_queue shadow_sq;
 
@@ -435,6 +441,31 @@ struct rdma_send_oob {
 	};
 }; /* HW DATA */
 
+struct mana_rdma_cqe {
+	union {
+		struct {
+			u8 cqe_type;
+			u8 data[GDMA_COMP_DATA_SIZE - 1];
+		};
+		struct {
+			u32 cqe_type		: 8;
+			u32 vendor_error	: 9;
+			u32 reserved1		: 15;
+			u32 sge_offset		: 5;
+			u32 tx_wqe_offset	: 27;
+		} ud_send;
+		struct {
+			u32 cqe_type		: 8;
+			u32 reserved1		: 24;
+			u32 msg_len;
+			u32 src_qpn		: 24;
+			u32 reserved2		: 8;
+			u32 imm_data;
+			u32 rx_wqe_offset;
+		} ud_recv;
+	};
+}; /* HW DATA */
+
 static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
 {
 	return mdev->gdma_dev->gdma_context;
@@ -602,5 +633,6 @@ int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
 int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
 		      const struct ib_send_wr **bad_wr);
 
+int mana_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 #endif
diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
index 2528046..b05e64b 100644
--- a/drivers/infiniband/hw/mana/qp.c
+++ b/drivers/infiniband/hw/mana/qp.c
@@ -600,6 +600,36 @@ destroy_queues:
 	return err;
 }
 
+static void mana_add_qp_to_cqs(struct mana_ib_qp *qp)
+{
+	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
+	unsigned long flags;
+
+	spin_lock_irqsave(&send_cq->cq_lock, flags);
+	list_add_tail(&qp->cq_send_list, &send_cq->list_send_qp);
+	spin_unlock_irqrestore(&send_cq->cq_lock, flags);
+
+	spin_lock_irqsave(&recv_cq->cq_lock, flags);
+	list_add_tail(&qp->cq_recv_list, &recv_cq->list_recv_qp);
+	spin_unlock_irqrestore(&recv_cq->cq_lock, flags);
+}
+
+static void mana_remove_qp_from_cqs(struct mana_ib_qp *qp)
+{
+	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
+	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
+	unsigned long flags;
+
+	spin_lock_irqsave(&send_cq->cq_lock, flags);
+	list_del(&qp->cq_send_list);
+	spin_unlock_irqrestore(&send_cq->cq_lock, flags);
+
+	spin_lock_irqsave(&recv_cq->cq_lock, flags);
+	list_del(&qp->cq_recv_list);
+	spin_unlock_irqrestore(&recv_cq->cq_lock, flags);
+}
+
 static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 				struct ib_qp_init_attr *attr, struct ib_udata *udata)
 {
@@ -654,6 +684,8 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
 	if (err)
 		goto destroy_qp;
 
+	mana_add_qp_to_cqs(qp);
+
 	return 0;
 
 destroy_qp:
@@ -840,6 +872,7 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
 		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
 	int i;
 
+	mana_remove_qp_from_cqs(qp);
 	mana_table_remove_qp(mdev, qp);
 
 	destroy_shadow_queue(&qp->shadow_rq);
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 823f7e7..2da15d9 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1222,6 +1222,7 @@ int mana_gd_poll_cq(struct gdma_queue *cq, struct gdma_comp *comp, int num_cqe)
 
 	return cqe_idx;
 }
+EXPORT_SYMBOL_NS(mana_gd_poll_cq, NET_MANA);
 
 static irqreturn_t mana_gd_intr(int irq, void *arg)
 {
-- 
2.43.0
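
A usage sketch (not part of the patch): draining a kernel CQ in batches with
ib_poll_cq() and reading the work-completion fields that the UD/GSI path
above fills in; example_* is a placeholder.

#include <rdma/ib_verbs.h>

static void example_drain_cq(struct ib_cq *cq)
{
	struct ib_wc wc[8];
	int n, i;

	while ((n = ib_poll_cq(cq, ARRAY_SIZE(wc), wc)) > 0) {
		for (i = 0; i < n; i++) {
			if (wc[i].opcode == IB_WC_RECV)
				/* UD receives report the sender QPN and a GRH */
				pr_info("recv %u bytes from qpn 0x%x grh %d\n",
					wc[i].byte_len, wc[i].src_qp,
					!!(wc[i].wc_flags & IB_WC_GRH));
			else
				pr_info("send done, status %d\n", wc[i].status);
		}
	}
}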


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (11 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD Konstantin Taranov
@ 2025-01-20 17:27 ` Konstantin Taranov
  2025-01-23 19:18   ` Long Li
  2025-02-03 11:56 ` [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Leon Romanovsky
  13 siblings, 1 reply; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-20 17:27 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg, leon
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv

From: Konstantin Taranov <kotaranov@microsoft.com>

Set max_mad_size and the IB_PORT_CM_SUP capability
to enable the connection manager.

Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
---
 drivers/infiniband/hw/mana/main.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
index 114e391..ae1fb69 100644
--- a/drivers/infiniband/hw/mana/main.c
+++ b/drivers/infiniband/hw/mana/main.c
@@ -561,8 +561,10 @@ int mana_ib_get_port_immutable(struct ib_device *ibdev, u32 port_num,
 	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
 	immutable->core_cap_flags = RDMA_CORE_PORT_RAW_PACKET;
-	if (port_num == 1)
+	if (port_num == 1) {
 		immutable->core_cap_flags |= RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
+		immutable->max_mad_size = IB_MGMT_MAD_SIZE;
+	}
 
 	return 0;
 }
@@ -621,8 +623,11 @@ int mana_ib_query_port(struct ib_device *ibdev, u32 port,
 	props->active_width = IB_WIDTH_4X;
 	props->active_speed = IB_SPEED_EDR;
 	props->pkey_tbl_len = 1;
-	if (port == 1)
+	if (port == 1) {
 		props->gid_tbl_len = 16;
+		props->port_cap_flags = IB_PORT_CM_SUP;
+		props->ip_gids = true;
+	}
 
 	return 0;
 }
-- 
2.43.0
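
A small sketch (not from the patch) of how an in-kernel user could confirm
that port 1 now advertises CM support; example_* is a placeholder.

#include <rdma/ib_verbs.h>

static bool example_port_supports_cm(struct ib_device *ibdev)
{
	struct ib_port_attr attr;

	if (ib_query_port(ibdev, 1, &attr))
		return false;

	return !!(attr.port_cap_flags & IB_PORT_CM_SUP);
}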


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
  2025-01-20 17:27 ` [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
@ 2025-01-23  5:17   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:17 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-
> mapped memory in PDs
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Allow the HW to register DMA-mapped memory for kernel-level PDs.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/main.c | 3 +++
>  include/net/mana/gdma.h           | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/main.c
> b/drivers/infiniband/hw/mana/main.c
> index 67c2d43..45b251b 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
> @@ -82,6 +82,9 @@ int mana_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata
> *udata)
>  	mana_gd_init_req_hdr(&req.hdr, GDMA_CREATE_PD, sizeof(req),
>  			     sizeof(resp));
> 
> +	if (!udata)
> +		flags |= GDMA_PD_FLAG_ALLOW_GPA_MR;
> +
>  	req.flags = flags;
>  	err = mana_gd_send_request(gc, sizeof(req), &req,
>  				   sizeof(resp), &resp);
> diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h index
> 90f5665..03e1b25 100644
> --- a/include/net/mana/gdma.h
> +++ b/include/net/mana/gdma.h
> @@ -775,6 +775,7 @@ struct gdma_destroy_dma_region_req {
> 
>  enum gdma_pd_flags {
>  	GDMA_PD_FLAG_INVALID = 0,
> +	GDMA_PD_FLAG_ALLOW_GPA_MR = 1,
>  };
> 
>  struct gdma_create_pd_req {
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr
  2025-01-20 17:27 ` [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr Konstantin Taranov
@ 2025-01-23  5:18   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:18 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement allocation of DMA-mapped memory regions.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/device.c |  1 +
>  drivers/infiniband/hw/mana/mr.c     | 36 +++++++++++++++++++++++++++++
>  include/net/mana/gdma.h             |  5 ++++
>  3 files changed, 42 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/device.c
> b/drivers/infiniband/hw/mana/device.c
> index 7ac0191..215dbce 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -32,6 +32,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table,
>  	.destroy_wq = mana_ib_destroy_wq,
>  	.disassociate_ucontext = mana_ib_disassociate_ucontext,
> +	.get_dma_mr = mana_ib_get_dma_mr,
>  	.get_link_layer = mana_ib_get_link_layer,
>  	.get_port_immutable = mana_ib_get_port_immutable,
>  	.mmap = mana_ib_mmap,
> diff --git a/drivers/infiniband/hw/mana/mr.c b/drivers/infiniband/hw/mana/mr.c
> index 887b09d..3a047f8 100644
> --- a/drivers/infiniband/hw/mana/mr.c
> +++ b/drivers/infiniband/hw/mana/mr.c
> @@ -8,6 +8,8 @@
>  #define VALID_MR_FLAGS                                                         \
>  	(IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE |
> IB_ACCESS_REMOTE_READ)
> 
> +#define VALID_DMA_MR_FLAGS (IB_ACCESS_LOCAL_WRITE)
> +
>  static enum gdma_mr_access_flags
>  mana_ib_verbs_to_gdma_access_flags(int access_flags)  { @@ -39,6 +41,8 @@
> static int mana_ib_gd_create_mr(struct mana_ib_dev *dev, struct mana_ib_mr
> *mr,
>  	req.mr_type = mr_params->mr_type;
> 
>  	switch (mr_params->mr_type) {
> +	case GDMA_MR_TYPE_GPA:
> +		break;
>  	case GDMA_MR_TYPE_GVA:
>  		req.gva.dma_region_handle = mr_params-
> >gva.dma_region_handle;
>  		req.gva.virtual_address = mr_params->gva.virtual_address; @@
> -169,6 +173,38 @@ err_free:
>  	return ERR_PTR(err);
>  }
> 
> +struct ib_mr *mana_ib_get_dma_mr(struct ib_pd *ibpd, int access_flags)
> +{
> +	struct mana_ib_pd *pd = container_of(ibpd, struct mana_ib_pd, ibpd);
> +	struct gdma_create_mr_params mr_params = {};
> +	struct ib_device *ibdev = ibpd->device;
> +	struct mana_ib_dev *dev;
> +	struct mana_ib_mr *mr;
> +	int err;
> +
> +	dev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> +
> +	if (access_flags & ~VALID_DMA_MR_FLAGS)
> +		return ERR_PTR(-EINVAL);
> +
> +	mr = kzalloc(sizeof(*mr), GFP_KERNEL);
> +	if (!mr)
> +		return ERR_PTR(-ENOMEM);
> +
> +	mr_params.pd_handle = pd->pd_handle;
> +	mr_params.mr_type = GDMA_MR_TYPE_GPA;
> +
> +	err = mana_ib_gd_create_mr(dev, mr, &mr_params);
> +	if (err)
> +		goto err_free;
> +
> +	return &mr->ibmr;
> +
> +err_free:
> +	kfree(mr);
> +	return ERR_PTR(err);
> +}
> +
>  int mana_ib_dereg_mr(struct ib_mr *ibmr, struct ib_udata *udata)  {
>  	struct mana_ib_mr *mr = container_of(ibmr, struct mana_ib_mr, ibmr);
> diff --git a/include/net/mana/gdma.h b/include/net/mana/gdma.h index
> 03e1b25..a94b04e 100644
> --- a/include/net/mana/gdma.h
> +++ b/include/net/mana/gdma.h
> @@ -801,6 +801,11 @@ struct gdma_destory_pd_resp {
>  };/* HW DATA */
> 
>  enum gdma_mr_type {
> +	/*
> +	 * Guest Physical Address - MRs of this type allow access
> +	 * to any DMA-mapped memory using bus-logical address
> +	 */
> +	GDMA_MR_TYPE_GPA = 1,
>  	/* Guest Virtual Address - MRs of this type allow access
>  	 * to memory mapped by PTEs associated with this MR using a virtual
>  	 * address that is set up in the MST
> --
> 2.43.0
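
For context, a sketch (not from this series) of why kernel consumers need
get_dma_mr: when a device does not advertise a global local DMA lkey, the
core's ib_alloc_pd() falls back to this op to back pd->local_dma_lkey.
example_* is a placeholder.

#include <rdma/ib_verbs.h>

static int example_show_kernel_lkey(struct ib_device *ibdev)
{
	struct ib_pd *pd = ib_alloc_pd(ibdev, 0);

	if (IS_ERR(pd))
		return PTR_ERR(pd);

	/* backed by the MR created through mana_ib_get_dma_mr() */
	pr_info("kernel lkey: 0x%x\n", pd->local_dma_lkey);
	ib_dealloc_pd(pd);
	return 0;
}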


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues
  2025-01-20 17:27 ` [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues Konstantin Taranov
@ 2025-01-23  5:25   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:25 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel
> queues
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Introduce helpers to allocate queues for kernel-level use.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/main.c             | 23 +++++++++++++++++++
>  drivers/infiniband/hw/mana/mana_ib.h          |  3 +++
>  .../net/ethernet/microsoft/mana/gdma_main.c   |  1 +
>  3 files changed, 27 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/main.c
> b/drivers/infiniband/hw/mana/main.c
> index 45b251b..f2f6bb3 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
> @@ -240,6 +240,27 @@ void mana_ib_dealloc_ucontext(struct ib_ucontext
> *ibcontext)
>  		ibdev_dbg(ibdev, "Failed to destroy doorbell page %d\n", ret);  }
> 
> +int mana_ib_create_kernel_queue(struct mana_ib_dev *mdev, u32 size, enum
> gdma_queue_type type,
> +				struct mana_ib_queue *queue)
> +{
> +	struct gdma_context *gc = mdev_to_gc(mdev);
> +	struct gdma_queue_spec spec = {};
> +	int err;
> +
> +	queue->id = INVALID_QUEUE_ID;
> +	queue->gdma_region = GDMA_INVALID_DMA_REGION;
> +	spec.type = type;
> +	spec.monitor_avl_buf = false;
> +	spec.queue_size = size;
> +	err = mana_gd_create_mana_wq_cq(&gc->mana_ib, &spec, &queue-
> >kmem);
> +	if (err)
> +		return err;
> +	/* take ownership into mana_ib from mana */
> +	queue->gdma_region = queue->kmem->mem_info.dma_region_handle;
> +	queue->kmem->mem_info.dma_region_handle =
> GDMA_INVALID_DMA_REGION;
> +	return 0;
> +}
> +
>  int mana_ib_create_queue(struct mana_ib_dev *mdev, u64 addr, u32 size,
>  			 struct mana_ib_queue *queue)
>  {
> @@ -279,6 +300,8 @@ void mana_ib_destroy_queue(struct mana_ib_dev
> *mdev, struct mana_ib_queue *queue
>  	 */
>  	mana_ib_gd_destroy_dma_region(mdev, queue->gdma_region);
>  	ib_umem_release(queue->umem);
> +	if (queue->kmem)
> +		mana_gd_destroy_queue(mdev_to_gc(mdev), queue->kmem);
>  }
> 
>  static int
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h
> b/drivers/infiniband/hw/mana/mana_ib.h
> index b53a5b4..79ebd95 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -52,6 +52,7 @@ struct mana_ib_adapter_caps {
> 
>  struct mana_ib_queue {
>  	struct ib_umem *umem;
> +	struct gdma_queue *kmem;
>  	u64 gdma_region;
>  	u64 id;
>  };
> @@ -388,6 +389,8 @@ int mana_ib_create_dma_region(struct mana_ib_dev
> *dev, struct ib_umem *umem,  int mana_ib_gd_destroy_dma_region(struct
> mana_ib_dev *dev,
>  				  mana_handle_t gdma_region);
> 
> +int mana_ib_create_kernel_queue(struct mana_ib_dev *mdev, u32 size, enum
> gdma_queue_type type,
> +				struct mana_ib_queue *queue);
>  int mana_ib_create_queue(struct mana_ib_dev *mdev, u64 addr, u32 size,
>  			 struct mana_ib_queue *queue);
>  void mana_ib_destroy_queue(struct mana_ib_dev *mdev, struct
> mana_ib_queue *queue); diff --git
> a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index e97af7a..3cb0543 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -867,6 +867,7 @@ free_q:
>  	kfree(queue);
>  	return err;
>  }
> +EXPORT_SYMBOL_NS(mana_gd_create_mana_wq_cq, NET_MANA);
> 
>  void mana_gd_destroy_queue(struct gdma_context *gc, struct gdma_queue
> *queue)  {
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
  2025-01-20 17:27 ` [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs Konstantin Taranov
@ 2025-01-23  5:36   ` Long Li
  2025-01-28 12:50     ` Konstantin Taranov
  2025-01-29  0:48   ` Long Li
  1 sibling, 1 reply; 31+ messages in thread
From: Long Li @ 2025-01-23  5:36 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement creation of CQs for the kernel.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/cq.c | 80 +++++++++++++++++++++------------
>  1 file changed, 52 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index f04a679..d26d82d 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -15,42 +15,57 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_create_cq ucmd = {};
>  	struct mana_ib_dev *mdev;
> +	struct gdma_context *gc;
>  	bool is_rnic_cq;
>  	u32 doorbell;
> +	u32 buf_size;
>  	int err;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> +	gc = mdev_to_gc(mdev);
> 
>  	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
>  	cq->cq_handle = INVALID_MANA_HANDLE;
> 
> -	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> -		return -EINVAL;
> +	if (udata) {
> +		if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> +			return -EINVAL;
> 
> -	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> >inlen));
> -	if (err) {
> -		ibdev_dbg(ibdev,
> -			  "Failed to copy from udata for create cq, %d\n", err);
> -		return err;
> -	}
> +		err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd),
> udata->inlen));
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to copy from udata for create
> cq, %d\n", err);
> +			return err;
> +		}
> 
> -	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> +		is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> 
> -	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
> -		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
> -		return -EINVAL;
> -	}
> +		if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr)
> {
> +			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
> +			return -EINVAL;
> +		}
> 
> -	cq->cqe = attr->cqe;
> -	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> COMP_ENTRY_SIZE, &cq->queue);
> -	if (err) {
> -		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n",
> err);
> -		return err;
> -	}
> +		cq->cqe = attr->cqe;
> +		err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> COMP_ENTRY_SIZE,
> +					   &cq->queue);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create queue for create
> cq, %d\n", err);
> +			return err;
> +		}
> 
> -	mana_ucontext = rdma_udata_to_drv_context(udata, struct
> mana_ib_ucontext,
> -						  ibucontext);
> -	doorbell = mana_ucontext->doorbell;
> +		mana_ucontext = rdma_udata_to_drv_context(udata, struct
> mana_ib_ucontext,
> +							  ibucontext);
> +		doorbell = mana_ucontext->doorbell;
> +	} else {
> +		is_rnic_cq = true;
> +		buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr->cqe
> * COMP_ENTRY_SIZE));
> +		cq->cqe = buf_size / COMP_ENTRY_SIZE;
> +		err = mana_ib_create_kernel_queue(mdev, buf_size, GDMA_CQ,
> &cq->queue);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create kernel queue for
> create cq, %d\n", err);
> +			return err;
> +		}
> +		doorbell = gc->mana_ib.doorbell;
> +	}
> 
>  	if (is_rnic_cq) {
>  		err = mana_ib_gd_create_cq(mdev, cq, doorbell); @@ -66,11
> +81,13 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct
> ib_cq_init_attr *attr,
>  		}
>  	}
> 
> -	resp.cqid = cq->queue.id;
> -	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
> -	if (err) {
> -		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n", err);
> -		goto err_remove_cq_cb;
> +	if (udata) {
> +		resp.cqid = cq->queue.id;
> +		err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata-
> >outlen));
> +		if (err) {
> +			ibdev_dbg(&mdev->ib_dev, "Failed to copy to
> udata, %d\n", err);
> +			goto err_remove_cq_cb;
> +		}
>  	}
> 
>  	return 0;
> @@ -122,7 +139,10 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev,
> struct mana_ib_cq *cq)
>  		return -EINVAL;
>  	/* Create CQ table entry */
>  	WARN_ON(gc->cq_table[cq->queue.id]);
> -	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> +	if (cq->queue.kmem)
> +		gdma_cq = cq->queue.kmem;
> +	else
> +		gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
>  	if (!gdma_cq)
>  		return -ENOMEM;
> 
> @@ -141,6 +161,10 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev
> *mdev, struct mana_ib_cq *cq)
>  	if (cq->queue.id >= gc->max_num_cqs || cq->queue.id ==
> INVALID_QUEUE_ID)
>  		return;
> 
> +	if (cq->queue.kmem)
> +	/* Then it will be cleaned and removed by the mana */
> +		return;
> +

Do you need to call "gc->cq_table[cq->queue.id] = NULL" before return?


>  	kfree(gc->cq_table[cq->queue.id]);
>  	gc->cq_table[cq->queue.id] = NULL;
>  }
> --
> 2.43.0
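
A usage sketch (not from the patch), assuming the poll/notify support added
later in the series: a kernel ULP allocating a CQ, which now takes the
!udata path of mana_ib_create_cq(). example_* and the CQE count are
placeholders.

#include <rdma/ib_verbs.h>

static struct ib_cq *example_alloc_kernel_cq(struct ib_device *ibdev)
{
	/* 256 CQEs on completion vector 0, polled from soft-irq context */
	return ib_alloc_cq(ibdev, NULL, 256, 0, IB_POLL_SOFTIRQ);
}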


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP
  2025-01-20 17:27 ` [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP Konstantin Taranov
@ 2025-01-23  5:40   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:40 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI
> QP
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement HW requests to create and destroy UD/GSI QPs.
> An UD/GSI QP has send and receive queues.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/main.c    | 58 ++++++++++++++++++++++++++++
>  drivers/infiniband/hw/mana/mana_ib.h | 49 +++++++++++++++++++++++
>  2 files changed, 107 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/main.c
> b/drivers/infiniband/hw/mana/main.c
> index f2f6bb3..b0c55cb 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
> @@ -1013,3 +1013,61 @@ int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev
> *mdev, struct mana_ib_qp *qp)
>  	}
>  	return 0;
>  }
> +
> +int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp
> *qp,
> +			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type) {
> +	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct
> mana_ib_cq, ibcq);
> +	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct
> mana_ib_cq, ibcq);
> +	struct mana_ib_pd *pd = container_of(qp->ibqp.pd, struct mana_ib_pd,
> ibpd);
> +	struct gdma_context *gc = mdev_to_gc(mdev);
> +	struct mana_rnic_create_udqp_resp resp = {};
> +	struct mana_rnic_create_udqp_req req = {};
> +	int err, i;
> +
> +	mana_gd_init_req_hdr(&req.hdr, MANA_IB_CREATE_UD_QP, sizeof(req),
> sizeof(resp));
> +	req.hdr.dev_id = gc->mana_ib.dev_id;
> +	req.adapter = mdev->adapter_handle;
> +	req.pd_handle = pd->pd_handle;
> +	req.send_cq_handle = send_cq->cq_handle;
> +	req.recv_cq_handle = recv_cq->cq_handle;
> +	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; i++)
> +		req.dma_region[i] = qp->ud_qp.queues[i].gdma_region;
> +	req.doorbell_page = doorbell;
> +	req.max_send_wr = attr->cap.max_send_wr;
> +	req.max_recv_wr = attr->cap.max_recv_wr;
> +	req.max_send_sge = attr->cap.max_send_sge;
> +	req.max_recv_sge = attr->cap.max_recv_sge;
> +	req.qp_type = type;
> +	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
> +	if (err) {
> +		ibdev_err(&mdev->ib_dev, "Failed to create ud qp err %d", err);
> +		return err;
> +	}
> +	qp->qp_handle = resp.qp_handle;
> +	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; i++) {
> +		qp->ud_qp.queues[i].id = resp.queue_ids[i];
> +		/* The GDMA regions are now owned by the RNIC QP handle */
> +		qp->ud_qp.queues[i].gdma_region =
> GDMA_INVALID_DMA_REGION;
> +	}
> +	return 0;
> +}
> +
> +int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct
> +mana_ib_qp *qp) {
> +	struct mana_rnic_destroy_udqp_resp resp = {0};
> +	struct mana_rnic_destroy_udqp_req req = {0};
> +	struct gdma_context *gc = mdev_to_gc(mdev);
> +	int err;
> +
> +	mana_gd_init_req_hdr(&req.hdr, MANA_IB_DESTROY_UD_QP,
> sizeof(req), sizeof(resp));
> +	req.hdr.dev_id = gc->mana_ib.dev_id;
> +	req.adapter = mdev->adapter_handle;
> +	req.qp_handle = qp->qp_handle;
> +	err = mana_gd_send_request(gc, sizeof(req), &req, sizeof(resp), &resp);
> +	if (err) {
> +		ibdev_err(&mdev->ib_dev, "Failed to destroy ud qp err %d", err);
> +		return err;
> +	}
> +	return 0;
> +}
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h
> b/drivers/infiniband/hw/mana/mana_ib.h
> index 79ebd95..5e470f1 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -115,6 +115,17 @@ struct mana_ib_rc_qp {
>  	struct mana_ib_queue queues[MANA_RC_QUEUE_TYPE_MAX];  };
> 
> +enum mana_ud_queue_type {
> +	MANA_UD_SEND_QUEUE = 0,
> +	MANA_UD_RECV_QUEUE,
> +	MANA_UD_QUEUE_TYPE_MAX,
> +};
> +
> +struct mana_ib_ud_qp {
> +	struct mana_ib_queue queues[MANA_UD_QUEUE_TYPE_MAX];
> +	u32 sq_psn;
> +};
> +
>  struct mana_ib_qp {
>  	struct ib_qp ibqp;
> 
> @@ -122,6 +133,7 @@ struct mana_ib_qp {
>  	union {
>  		struct mana_ib_queue raw_sq;
>  		struct mana_ib_rc_qp rc_qp;
> +		struct mana_ib_ud_qp ud_qp;
>  	};
> 
>  	/* The port on the IB device, starting with 1 */ @@ -146,6 +158,8 @@
> enum mana_ib_command_code {
>  	MANA_IB_DESTROY_ADAPTER = 0x30003,
>  	MANA_IB_CONFIG_IP_ADDR	= 0x30004,
>  	MANA_IB_CONFIG_MAC_ADDR	= 0x30005,
> +	MANA_IB_CREATE_UD_QP	= 0x30006,
> +	MANA_IB_DESTROY_UD_QP	= 0x30007,
>  	MANA_IB_CREATE_CQ       = 0x30008,
>  	MANA_IB_DESTROY_CQ      = 0x30009,
>  	MANA_IB_CREATE_RC_QP    = 0x3000a,
> @@ -297,6 +311,37 @@ struct mana_rnic_destroy_rc_qp_resp {
>  	struct gdma_resp_hdr hdr;
>  }; /* HW Data */
> 
> +struct mana_rnic_create_udqp_req {
> +	struct gdma_req_hdr hdr;
> +	mana_handle_t adapter;
> +	mana_handle_t pd_handle;
> +	mana_handle_t send_cq_handle;
> +	mana_handle_t recv_cq_handle;
> +	u64 dma_region[MANA_UD_QUEUE_TYPE_MAX];
> +	u32 qp_type;
> +	u32 doorbell_page;
> +	u32 max_send_wr;
> +	u32 max_recv_wr;
> +	u32 max_send_sge;
> +	u32 max_recv_sge;
> +}; /* HW Data */
> +
> +struct mana_rnic_create_udqp_resp {
> +	struct gdma_resp_hdr hdr;
> +	mana_handle_t qp_handle;
> +	u32 queue_ids[MANA_UD_QUEUE_TYPE_MAX];
> +}; /* HW Data*/
> +
> +struct mana_rnic_destroy_udqp_req {
> +	struct gdma_req_hdr hdr;
> +	mana_handle_t adapter;
> +	mana_handle_t qp_handle;
> +}; /* HW Data */
> +
> +struct mana_rnic_destroy_udqp_resp {
> +	struct gdma_resp_hdr hdr;
> +}; /* HW Data */
> +
>  struct mana_ib_ah_attr {
>  	u8 src_addr[16];
>  	u8 dest_addr[16];
> @@ -483,4 +528,8 @@ int mana_ib_gd_destroy_cq(struct mana_ib_dev *mdev,
> struct mana_ib_cq *cq);  int mana_ib_gd_create_rc_qp(struct mana_ib_dev
> *mdev, struct mana_ib_qp *qp,
>  			    struct ib_qp_init_attr *attr, u32 doorbell, u64 flags);
> int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp
> *qp);
> +
> +int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp
> *qp,
> +			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type); int
> +mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp
> +*qp);
>  #endif
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel
  2025-01-20 17:27 ` [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel Konstantin Taranov
@ 2025-01-23  5:45   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:45 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for
> kernel
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement UD/GSI QPs for the kernel.
> Allow create/modify/destroy for such QPs.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/qp.c | 115 ++++++++++++++++++++++++++++++++
>  1 file changed, 115 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
> index 73d67c8..fea45be 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -398,6 +398,52 @@ err_free_vport:
>  	return err;
>  }
> 
> +static u32 mana_ib_wqe_size(u32 sge, u32 oob_size) {
> +	u32 wqe_size = sge * sizeof(struct gdma_sge) + sizeof(struct gdma_wqe)
> ++ oob_size;
> +
> +	return ALIGN(wqe_size, GDMA_WQE_BU_SIZE); }
> +
> +static u32 mana_ib_queue_size(struct ib_qp_init_attr *attr, u32
> +queue_type) {
> +	u32 queue_size;
> +
> +	switch (attr->qp_type) {
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		if (queue_type == MANA_UD_SEND_QUEUE)
> +			queue_size = attr->cap.max_send_wr *
> +				mana_ib_wqe_size(attr->cap.max_send_sge,
> INLINE_OOB_LARGE_SIZE);
> +		else
> +			queue_size = attr->cap.max_recv_wr *
> +				mana_ib_wqe_size(attr->cap.max_recv_sge,
> INLINE_OOB_SMALL_SIZE);
> +		break;
> +	default:
> +		return 0;
> +	}
> +
> +	return MANA_PAGE_ALIGN(roundup_pow_of_two(queue_size));
> +}
> +
> +static enum gdma_queue_type mana_ib_queue_type(struct ib_qp_init_attr
> +*attr, u32 queue_type) {
> +	enum gdma_queue_type type;
> +
> +	switch (attr->qp_type) {
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		if (queue_type == MANA_UD_SEND_QUEUE)
> +			type = GDMA_SQ;
> +		else
> +			type = GDMA_RQ;
> +		break;
> +	default:
> +		type = GDMA_INVALID_QUEUE;
> +	}
> +	return type;
> +}
> +
>  static int mana_table_store_qp(struct mana_ib_dev *mdev, struct mana_ib_qp
> *qp)  {
>  	refcount_set(&qp->refcount, 1);
> @@ -490,6 +536,51 @@ destroy_queues:
>  	return err;
>  }
> 
> +static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
> +				struct ib_qp_init_attr *attr, struct ib_udata
> *udata) {
> +	struct mana_ib_dev *mdev = container_of(ibpd->device, struct
> mana_ib_dev, ib_dev);
> +	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
> +	struct gdma_context *gc = mdev_to_gc(mdev);
> +	u32 doorbell, queue_size;
> +	int i, err;
> +
> +	if (udata) {
> +		ibdev_dbg(&mdev->ib_dev, "User-level UD QPs are not
> supported, %d\n", err);
> +		return -EINVAL;
> +	}
> +
> +	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i) {
> +		queue_size = mana_ib_queue_size(attr, i);
> +		err = mana_ib_create_kernel_queue(mdev, queue_size,
> mana_ib_queue_type(attr, i),
> +						  &qp->ud_qp.queues[i]);
> +		if (err) {
> +			ibdev_err(&mdev->ib_dev, "Failed to create queue %d,
> err %d\n",
> +				  i, err);
> +			goto destroy_queues;
> +		}
> +	}
> +	doorbell = gc->mana_ib.doorbell;
> +
> +	err = mana_ib_gd_create_ud_qp(mdev, qp, attr, doorbell, attr->qp_type);
> +	if (err) {
> +		ibdev_err(&mdev->ib_dev, "Failed to create ud qp  %d\n", err);
> +		goto destroy_queues;
> +	}
> +	qp->ibqp.qp_num = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
> +	qp->port = attr->port_num;
> +
> +	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
> +		qp->ud_qp.queues[i].kmem->id = qp->ud_qp.queues[i].id;
> +
> +	return 0;
> +
> +destroy_queues:
> +	while (i-- > 0)
> +		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
> +	return err;
> +}
> +
>  int mana_ib_create_qp(struct ib_qp *ibqp, struct ib_qp_init_attr *attr,
>  		      struct ib_udata *udata)
>  {
> @@ -503,6 +594,9 @@ int mana_ib_create_qp(struct ib_qp *ibqp, struct
> ib_qp_init_attr *attr,
>  		return mana_ib_create_qp_raw(ibqp, ibqp->pd, attr, udata);
>  	case IB_QPT_RC:
>  		return mana_ib_create_rc_qp(ibqp, ibqp->pd, attr, udata);
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		return mana_ib_create_ud_qp(ibqp, ibqp->pd, attr, udata);
>  	default:
>  		ibdev_dbg(ibqp->device, "Creating QP type %u not supported\n",
>  			  attr->qp_type);
> @@ -579,6 +673,8 @@ int mana_ib_modify_qp(struct ib_qp *ibqp, struct
> ib_qp_attr *attr,  {
>  	switch (ibqp->qp_type) {
>  	case IB_QPT_RC:
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
>  		return mana_ib_gd_modify_qp(ibqp, attr, attr_mask, udata);
>  	default:
>  		ibdev_dbg(ibqp->device, "Modify QP type %u not supported",
> ibqp->qp_type); @@ -652,6 +748,22 @@ static int
> mana_ib_destroy_rc_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
>  	return 0;
>  }
> 
> +static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata
> +*udata) {
> +	struct mana_ib_dev *mdev =
> +		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
> +	int i;
> +
> +	/* Ignore return code as there is not much we can do about it.
> +	 * The error message is printed inside.
> +	 */
> +	mana_ib_gd_destroy_ud_qp(mdev, qp);
> +	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
> +		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
> +
> +	return 0;
> +}
> +
>  int mana_ib_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)  {
>  	struct mana_ib_qp *qp = container_of(ibqp, struct mana_ib_qp, ibqp);
> @@ -665,6 +777,9 @@ int mana_ib_destroy_qp(struct ib_qp *ibqp, struct
> ib_udata *udata)
>  		return mana_ib_destroy_qp_raw(qp, udata);
>  	case IB_QPT_RC:
>  		return mana_ib_destroy_rc_qp(qp, udata);
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		return mana_ib_destroy_ud_qp(qp, udata);
>  	default:
>  		ibdev_dbg(ibqp->device, "Unexpected QP type %u\n",
>  			  ibqp->qp_type);
> --
> 2.43.0
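
A usage sketch (not from the patch): an in-kernel caller creating a UD QP
that lands in mana_ib_create_ud_qp(); the example_* name and the capacity
numbers are arbitrary placeholders.

#include <rdma/ib_verbs.h>

static struct ib_qp *example_create_ud_qp(struct ib_pd *pd,
					  struct ib_cq *scq, struct ib_cq *rcq)
{
	struct ib_qp_init_attr attr = {
		.qp_type     = IB_QPT_UD,
		.send_cq     = scq,
		.recv_cq     = rcq,
		.sq_sig_type = IB_SIGNAL_REQ_WR,
		.cap = {
			.max_send_wr  = 64,
			.max_recv_wr  = 64,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};

	return ib_create_qp(pd, &attr);
}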


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH
  2025-01-20 17:27 ` [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH Konstantin Taranov
@ 2025-01-23  5:53   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:53 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement create and destroy AH for kernel.
> 
> In mana_ib, AV is passed as an sge in WQE.
> Allocate DMA memory and write an AV there.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/Makefile  |  2 +-
>  drivers/infiniband/hw/mana/ah.c      | 58 ++++++++++++++++++++++++++++
>  drivers/infiniband/hw/mana/device.c  | 13 ++++++-
> drivers/infiniband/hw/mana/mana_ib.h | 30 ++++++++++++++
>  4 files changed, 101 insertions(+), 2 deletions(-)  create mode 100644
> drivers/infiniband/hw/mana/ah.c
> 
> diff --git a/drivers/infiniband/hw/mana/Makefile
> b/drivers/infiniband/hw/mana/Makefile
> index 88655fe..6e56f77 100644
> --- a/drivers/infiniband/hw/mana/Makefile
> +++ b/drivers/infiniband/hw/mana/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  obj-$(CONFIG_MANA_INFINIBAND) += mana_ib.o
> 
> -mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o
> +mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o
> diff --git a/drivers/infiniband/hw/mana/ah.c b/drivers/infiniband/hw/mana/ah.c
> new file mode 100644 index 0000000..f56952e
> --- /dev/null
> +++ b/drivers/infiniband/hw/mana/ah.c
> @@ -0,0 +1,58 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
> + */
> +
> +#include "mana_ib.h"
> +
> +int mana_ib_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *attr,
> +		      struct ib_udata *udata)
> +{
> +	struct mana_ib_dev *mdev = container_of(ibah->device, struct
> mana_ib_dev, ib_dev);
> +	struct mana_ib_ah *ah = container_of(ibah, struct mana_ib_ah, ibah);
> +	struct rdma_ah_attr *ah_attr = attr->ah_attr;
> +	const struct ib_global_route *grh;
> +	enum rdma_network_type ntype;
> +
> +	if (ah_attr->type != RDMA_AH_ATTR_TYPE_ROCE ||
> +	    !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
> +		return -EINVAL;
> +
> +	if (udata)
> +		return -EINVAL;
> +
> +	ah->av = dma_pool_zalloc(mdev->av_pool, GFP_ATOMIC, &ah-
> >dma_handle);
> +	if (!ah->av)
> +		return -ENOMEM;
> +
> +	grh = rdma_ah_read_grh(ah_attr);
> +	ntype = rdma_gid_attr_network_type(grh->sgid_attr);
> +
> +	copy_in_reverse(ah->av->dest_mac, ah_attr->roce.dmac, ETH_ALEN);
> +	ah->av->udp_src_port = rdma_flow_label_to_udp_sport(grh-
> >flow_label);
> +	ah->av->hop_limit = grh->hop_limit;
> +	ah->av->dscp = (grh->traffic_class >> 2) & 0x3f;
> +	ah->av->is_ipv6 = (ntype == RDMA_NETWORK_IPV6);
> +
> +	if (ah->av->is_ipv6) {
> +		copy_in_reverse(ah->av->dest_ip, grh->dgid.raw, 16);
> +		copy_in_reverse(ah->av->src_ip, grh->sgid_attr->gid.raw, 16);
> +	} else {
> +		ah->av->dest_ip[10] = 0xFF;
> +		ah->av->dest_ip[11] = 0xFF;
> +		copy_in_reverse(&ah->av->dest_ip[12], &grh->dgid.raw[12], 4);
> +		copy_in_reverse(&ah->av->src_ip[12], &grh->sgid_attr-
> >gid.raw[12], 4);
> +	}
> +
> +	return 0;
> +}
> +
> +int mana_ib_destroy_ah(struct ib_ah *ibah, u32 flags) {
> +	struct mana_ib_dev *mdev = container_of(ibah->device, struct
> mana_ib_dev, ib_dev);
> +	struct mana_ib_ah *ah = container_of(ibah, struct mana_ib_ah, ibah);
> +
> +	dma_pool_free(mdev->av_pool, ah->av, ah->dma_handle);
> +
> +	return 0;
> +}
> diff --git a/drivers/infiniband/hw/mana/device.c
> b/drivers/infiniband/hw/mana/device.c
> index 215dbce..d534ef1 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -19,6 +19,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.add_gid = mana_ib_gd_add_gid,
>  	.alloc_pd = mana_ib_alloc_pd,
>  	.alloc_ucontext = mana_ib_alloc_ucontext,
> +	.create_ah = mana_ib_create_ah,
>  	.create_cq = mana_ib_create_cq,
>  	.create_qp = mana_ib_create_qp,
>  	.create_rwq_ind_table = mana_ib_create_rwq_ind_table, @@ -27,6
> +28,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.dealloc_ucontext = mana_ib_dealloc_ucontext,
>  	.del_gid = mana_ib_gd_del_gid,
>  	.dereg_mr = mana_ib_dereg_mr,
> +	.destroy_ah = mana_ib_destroy_ah,
>  	.destroy_cq = mana_ib_destroy_cq,
>  	.destroy_qp = mana_ib_destroy_qp,
>  	.destroy_rwq_ind_table = mana_ib_destroy_rwq_ind_table, @@ -44,6
> +46,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.query_port = mana_ib_query_port,
>  	.reg_user_mr = mana_ib_reg_user_mr,
> 
> +	INIT_RDMA_OBJ_SIZE(ib_ah, mana_ib_ah, ibah),
>  	INIT_RDMA_OBJ_SIZE(ib_cq, mana_ib_cq, ibcq),
>  	INIT_RDMA_OBJ_SIZE(ib_pd, mana_ib_pd, ibpd),
>  	INIT_RDMA_OBJ_SIZE(ib_qp, mana_ib_qp, ibqp), @@ -135,15 +138,22
> @@ static int mana_ib_probe(struct auxiliary_device *adev,
>  		goto destroy_rnic;
>  	}
> 
> +	dev->av_pool = dma_pool_create("mana_ib_av", mdev->gdma_context->dev,
> +				       MANA_AV_BUFFER_SIZE, MANA_AV_BUFFER_SIZE, 0);
> +	if (!dev->av_pool)
> +		goto destroy_rnic;
> +
>  	ret = ib_register_device(&dev->ib_dev, "mana_%d",
>  				 mdev->gdma_context->dev);
>  	if (ret)
> -		goto destroy_rnic;
> +		goto deallocate_pool;
> 
>  	dev_set_drvdata(&adev->dev, dev);
> 
>  	return 0;
> 
> +deallocate_pool:
> +	dma_pool_destroy(dev->av_pool);
>  destroy_rnic:
>  	xa_destroy(&dev->qp_table_wq);
>  	mana_ib_gd_destroy_rnic_adapter(dev);
> @@ -161,6 +171,7 @@ static void mana_ib_remove(struct auxiliary_device *adev)
>  	struct mana_ib_dev *dev = dev_get_drvdata(&adev->dev);
> 
>  	ib_unregister_device(&dev->ib_dev);
> +	dma_pool_destroy(dev->av_pool);
>  	xa_destroy(&dev->qp_table_wq);
>  	mana_ib_gd_destroy_rnic_adapter(dev);
>  	mana_ib_destroy_eqs(dev);
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
> index 5e470f1..7b079d8 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -11,6 +11,7 @@
>  #include <rdma/ib_umem.h>
>  #include <rdma/mana-abi.h>
>  #include <rdma/uverbs_ioctl.h>
> +#include <linux/dmapool.h>
> 
>  #include <net/mana/mana.h>
> 
> @@ -32,6 +33,11 @@
>   */
>  #define MANA_CA_ACK_DELAY	16
> 
> +/*
> + * The buffer used for writing AV
> + */
> +#define MANA_AV_BUFFER_SIZE	64
> +
>  struct mana_ib_adapter_caps {
>  	u32 max_sq_id;
>  	u32 max_rq_id;
> @@ -65,6 +71,7 @@ struct mana_ib_dev {
>  	struct gdma_queue **eqs;
>  	struct xarray qp_table_wq;
>  	struct mana_ib_adapter_caps adapter_caps;
> +	struct dma_pool *av_pool;
>  };
> 
>  struct mana_ib_wq {
> @@ -88,6 +95,25 @@ struct mana_ib_pd {
>  	u32 tx_vp_offset;
>  };
> 
> +struct mana_ib_av {
> +	u8 dest_ip[16];
> +	u8 dest_mac[ETH_ALEN];
> +	u16 udp_src_port;
> +	u8 src_ip[16];
> +	u32 hop_limit	: 8;
> +	u32 reserved1	: 12;
> +	u32 dscp	: 6;
> +	u32 reserved2	: 5;
> +	u32 is_ipv6	: 1;
> +	u32 reserved3	: 32;
> +};
> +
> +struct mana_ib_ah {
> +	struct ib_ah ibah;
> +	struct mana_ib_av *av;
> +	dma_addr_t dma_handle;
> +};
> +
>  struct mana_ib_mr {
>  	struct ib_mr ibmr;
>  	struct ib_umem *umem;
> @@ -532,4 +558,8 @@ int mana_ib_gd_destroy_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
>  int mana_ib_gd_create_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp,
>  			    struct ib_qp_init_attr *attr, u32 doorbell, u32 type);
>  int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp);
> +
> +int mana_ib_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
> +		      struct ib_udata *udata);
> +int mana_ib_destroy_ah(struct ib_ah *ah, u32 flags);
>  #endif
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq
  2025-01-20 17:27 ` [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq Konstantin Taranov
@ 2025-01-23  5:57   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  5:57 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Arm a CQ when req_notify_cq is called.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/cq.c                 | 12 ++++++++++++
>  drivers/infiniband/hw/mana/device.c             |  1 +
>  drivers/infiniband/hw/mana/mana_ib.h            |  2 ++
>  drivers/net/ethernet/microsoft/mana/gdma_main.c |  1 +
>  4 files changed, 16 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index d26d82d..82f1462 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -168,3 +168,15 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
>  	kfree(gc->cq_table[cq->queue.id]);
>  	gc->cq_table[cq->queue.id] = NULL;
>  }
> +
> +int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
> +{
> +	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> +	struct gdma_queue *gdma_cq = cq->queue.kmem;
> +
> +	if (!gdma_cq)
> +		return -EINVAL;
> +
> +	mana_gd_ring_cq(gdma_cq, SET_ARM_BIT);
> +	return 0;
> +}
> diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
> index 1da86c3..63e12c3 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -47,6 +47,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.query_pkey = mana_ib_query_pkey,
>  	.query_port = mana_ib_query_port,
>  	.reg_user_mr = mana_ib_reg_user_mr,
> +	.req_notify_cq = mana_ib_arm_cq,
> 
>  	INIT_RDMA_OBJ_SIZE(ib_ah, mana_ib_ah, ibah),
>  	INIT_RDMA_OBJ_SIZE(ib_cq, mana_ib_cq, ibcq),
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
> index 6265c39..bd34ad6 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -595,4 +595,6 @@ int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
>  		      const struct ib_recv_wr **bad_wr);
>  int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
>  		      const struct ib_send_wr **bad_wr);
> +
> +int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
>  #endif
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 409e4e8..823f7e7 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -344,6 +344,7 @@ void mana_gd_ring_cq(struct gdma_queue *cq, u8 arm_bit)
>  	mana_gd_ring_doorbell(gc, cq->gdma_dev->doorbell, cq->type, cq->id,
>  			      head, arm_bit);
>  }
> +EXPORT_SYMBOL_NS(mana_gd_ring_cq, NET_MANA);
> 
>  static void mana_gd_process_eqe(struct gdma_queue *eq)  {
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table
  2025-01-20 17:27 ` [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table Konstantin Taranov
@ 2025-01-23  6:02   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23  6:02 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Enable mana QP table to store UD/GSI QPs.
> For send queues, set the most significant bit to one, as send and receive WQs can
> have the same ID in mana.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>
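
As a rough illustration of the key scheme described in the commit message above (plain C, illustrative names only, not the driver's exact code), the table key is simply the WQ ID with the top bit set for send queues:

#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_SENDQ_MASK (UINT32_C(1) << 31)	/* mirrors MANA_SENDQ_MASK = BIT(31) */

/* Send and receive WQs may share the same hardware ID, so the MSB
 * disambiguates them within a single QP table. */
static inline uint32_t example_qp_table_key(uint32_t wq_id, bool is_sq)
{
	return is_sq ? (wq_id | EXAMPLE_SENDQ_MASK) : wq_id;
}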

> ---
>  drivers/infiniband/hw/mana/main.c    |  2 +-
>  drivers/infiniband/hw/mana/mana_ib.h |  8 ++-
>  drivers/infiniband/hw/mana/qp.c      | 78 ++++++++++++++++++++++++++--
>  3 files changed, 83 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
> index b0c55cb..114e391 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
> @@ -704,7 +704,7 @@ mana_ib_event_handler(void *ctx, struct gdma_queue *q, struct gdma_event *event)
>  	switch (event->type) {
>  	case GDMA_EQE_RNIC_QP_FATAL:
>  		qpn = event->details[0];
> -		qp = mana_get_qp_ref(mdev, qpn);
> +		qp = mana_get_qp_ref(mdev, qpn, false);
>  		if (!qp)
>  			break;
>  		if (qp->ibqp.event_handler) {
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
> index bd34ad6..5e4ca55 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -23,6 +23,9 @@
>  /* MANA doesn't have any limit for MR size */
>  #define MANA_IB_MAX_MR_SIZE	U64_MAX
> 
> +/* Send queue ID mask */
> +#define MANA_SENDQ_MASK	BIT(31)
> +
>  /*
>   * The hardware limit of number of MRs is greater than maximum number of MRs
>   * that can possibly represent in 24 bits
> @@ -438,11 +441,14 @@ static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
>  }
> 
>  static inline struct mana_ib_qp *mana_get_qp_ref(struct mana_ib_dev *mdev,
> -						 uint32_t qid)
> +						 u32 qid, bool is_sq)
>  {
>  	struct mana_ib_qp *qp;
>  	unsigned long flag;
> 
> +	if (is_sq)
> +		qid |= MANA_SENDQ_MASK;
> +
>  	xa_lock_irqsave(&mdev->qp_table_wq, flag);
>  	qp = xa_load(&mdev->qp_table_wq, qid);
>  	if (qp)
> diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
> index 051ea03..2528046 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -444,18 +444,82 @@ static enum gdma_queue_type mana_ib_queue_type(struct ib_qp_init_attr *attr, u32
>  	return type;
>  }
> 
> +static int mana_table_store_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
> +{
> +	return xa_insert_irq(&mdev->qp_table_wq, qp->ibqp.qp_num, qp,
> +			     GFP_KERNEL);
> +}
> +
> +static void mana_table_remove_rc_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
> +{
> +	xa_erase_irq(&mdev->qp_table_wq, qp->ibqp.qp_num);
> +}
> +
> +static int mana_table_store_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
> +{
> +	u32 qids = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].id | MANA_SENDQ_MASK;
> +	u32 qidr = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
> +	int err;
> +
> +	err = xa_insert_irq(&mdev->qp_table_wq, qids, qp, GFP_KERNEL);
> +	if (err)
> +		return err;
> +
> +	err = xa_insert_irq(&mdev->qp_table_wq, qidr, qp, GFP_KERNEL);
> +	if (err)
> +		goto remove_sq;
> +
> +	return 0;
> +
> +remove_sq:
> +	xa_erase_irq(&mdev->qp_table_wq, qids);
> +	return err;
> +}
> +
> +static void mana_table_remove_ud_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
> +{
> +	u32 qids = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].id | MANA_SENDQ_MASK;
> +	u32 qidr = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
> +
> +	xa_erase_irq(&mdev->qp_table_wq, qids);
> +	xa_erase_irq(&mdev->qp_table_wq, qidr);
> +}
> +
>  static int mana_table_store_qp(struct mana_ib_dev *mdev, struct mana_ib_qp *qp)
>  {
>  	refcount_set(&qp->refcount, 1);
>  	init_completion(&qp->free);
> -	return xa_insert_irq(&mdev->qp_table_wq, qp->ibqp.qp_num, qp,
> -			     GFP_KERNEL);
> +
> +	switch (qp->ibqp.qp_type) {
> +	case IB_QPT_RC:
> +		return mana_table_store_rc_qp(mdev, qp);
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		return mana_table_store_ud_qp(mdev, qp);
> +	default:
> +		ibdev_dbg(&mdev->ib_dev, "Unknown QP type for storing in mana table, %d\n",
> +			  qp->ibqp.qp_type);
> +	}
> +
> +	return -EINVAL;
>  }
> 
>  static void mana_table_remove_qp(struct mana_ib_dev *mdev,
>  				 struct mana_ib_qp *qp)
>  {
> -	xa_erase_irq(&mdev->qp_table_wq, qp->ibqp.qp_num);
> +	switch (qp->ibqp.qp_type) {
> +	case IB_QPT_RC:
> +		mana_table_remove_rc_qp(mdev, qp);
> +		break;
> +	case IB_QPT_UD:
> +	case IB_QPT_GSI:
> +		mana_table_remove_ud_qp(mdev, qp);
> +		break;
> +	default:
> +		ibdev_dbg(&mdev->ib_dev, "Unknown QP type for removing from mana table, %d\n",
> +			  qp->ibqp.qp_type);
> +		return;
> +	}
>  	mana_put_qp_ref(qp);
>  	wait_for_completion(&qp->free);
>  }
> @@ -586,8 +650,14 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
>  	for (i = 0; i < MANA_UD_QUEUE_TYPE_MAX; ++i)
>  		qp->ud_qp.queues[i].kmem->id = qp->ud_qp.queues[i].id;
> 
> +	err = mana_table_store_qp(mdev, qp);
> +	if (err)
> +		goto destroy_qp;
> +
>  	return 0;
> 
> +destroy_qp:
> +	mana_ib_gd_destroy_ud_qp(mdev, qp);
>  destroy_shadow_queues:
>  	destroy_shadow_queue(&qp->shadow_rq);
>  	destroy_shadow_queue(&qp->shadow_sq);
> @@ -770,6 +840,8 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
>  		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
>  	int i;
> 
> +	mana_table_remove_qp(mdev, qp);
> +
>  	destroy_shadow_queue(&qp->shadow_rq);
>  	destroy_shadow_queue(&qp->shadow_sq);
> 
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests
  2025-01-20 17:27 ` [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests Konstantin Taranov
@ 2025-01-23 18:20   ` Long Li
  2025-01-23 19:03     ` Jason Gunthorpe
  0 siblings, 1 reply; 31+ messages in thread
From: Long Li @ 2025-01-23 18:20 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org



> -----Original Message-----
> From: Konstantin Taranov <kotaranov@linux.microsoft.com>
> Sent: Monday, January 20, 2025 9:27 AM
> To: Konstantin Taranov <kotaranov@microsoft.com>; Shiraz Saleem
> <shirazsaleem@microsoft.com>; pabeni@redhat.com; Haiyang Zhang
> <haiyangz@microsoft.com>; KY Srinivasan <kys@microsoft.com>;
> edumazet@google.com; kuba@kernel.org; davem@davemloft.net; Dexuan Cui
> <decui@microsoft.com>; wei.liu@kernel.org; sharmaajay@microsoft.com; Long
> Li <longli@microsoft.com>; jgg@ziepe.ca; leon@kernel.org
> Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org;
> netdev@vger.kernel.org; linux-hyperv@vger.kernel.org
> Subject: [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement post send and post recv for UD/GSI QPs.
> Add information about posted requests into shadow queues.
> 
> Co-developed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
> Signed-off-by: Shiraz Saleem <shirazsaleem@microsoft.com>
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> ---
>  drivers/infiniband/hw/mana/Makefile           |   2 +-
>  drivers/infiniband/hw/mana/device.c           |   2 +
>  drivers/infiniband/hw/mana/mana_ib.h          |  33 ++++
>  drivers/infiniband/hw/mana/qp.c               |  21 ++-
>  drivers/infiniband/hw/mana/shadow_queue.h     | 115 ++++++++++++
>  drivers/infiniband/hw/mana/wr.c               | 168 ++++++++++++++++++
>  .../net/ethernet/microsoft/mana/gdma_main.c   |   2 +
>  7 files changed, 341 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/infiniband/hw/mana/shadow_queue.h
>  create mode 100644 drivers/infiniband/hw/mana/wr.c
> 
> diff --git a/drivers/infiniband/hw/mana/Makefile b/drivers/infiniband/hw/mana/Makefile
> index 6e56f77..79426e7 100644
> --- a/drivers/infiniband/hw/mana/Makefile
> +++ b/drivers/infiniband/hw/mana/Makefile
> @@ -1,4 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  obj-$(CONFIG_MANA_INFINIBAND) += mana_ib.o
> 
> -mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o
> +mana_ib-y := device.o main.o wq.o qp.o cq.o mr.o ah.o wr.o
> diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
> index d534ef1..1da86c3 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -40,6 +40,8 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.mmap = mana_ib_mmap,
>  	.modify_qp = mana_ib_modify_qp,
>  	.modify_wq = mana_ib_modify_wq,
> +	.post_recv = mana_ib_post_recv,
> +	.post_send = mana_ib_post_send,
>  	.query_device = mana_ib_query_device,
>  	.query_gid = mana_ib_query_gid,
>  	.query_pkey = mana_ib_query_pkey,
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
> index 7b079d8..6265c39 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -14,6 +14,7 @@
>  #include <linux/dmapool.h>
> 
>  #include <net/mana/mana.h>
> +#include "shadow_queue.h"
> 
>  #define PAGE_SZ_BM                                                             \
>  	(SZ_4K | SZ_8K | SZ_16K | SZ_32K | SZ_64K | SZ_128K | SZ_256K |        \
> @@ -165,6 +166,9 @@ struct mana_ib_qp {
>  	/* The port on the IB device, starting with 1 */
>  	u32 port;
> 
> +	struct shadow_queue shadow_rq;
> +	struct shadow_queue shadow_sq;
> +
>  	refcount_t		refcount;
>  	struct completion	free;
>  };
> @@ -404,6 +408,30 @@ struct mana_rnic_set_qp_state_resp {
>  	struct gdma_resp_hdr hdr;
>  }; /* HW Data */
> 
> +enum WQE_OPCODE_TYPES {
> +	WQE_TYPE_UD_SEND = 0,
> +	WQE_TYPE_UD_RECV = 8,
> +}; /* HW DATA */
> +
> +struct rdma_send_oob {
> +	u32 wqe_type	: 5;
> +	u32 fence	: 1;
> +	u32 signaled	: 1;
> +	u32 solicited	: 1;
> +	u32 psn		: 24;
> +
> +	u32 ssn_or_rqpn	: 24;
> +	u32 reserved1	: 8;
> +	union {
> +		struct {
> +			u32 remote_qkey;
> +			u32 immediate;
> +			u32 reserved1;
> +			u32 reserved2;
> +		} ud_send;
> +	};
> +}; /* HW DATA */
> +
>  static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)  {
>  	return mdev->gdma_dev->gdma_context;
> @@ -562,4 +590,9 @@ int mana_ib_gd_destroy_ud_qp(struct mana_ib_dev
> *mdev, struct mana_ib_qp *qp);  int mana_ib_create_ah(struct ib_ah *ibah,
> struct rdma_ah_init_attr *init_attr,
>  		      struct ib_udata *udata);
>  int mana_ib_destroy_ah(struct ib_ah *ah, u32 flags);
> +
> +int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
> +		      const struct ib_recv_wr **bad_wr);
> +int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
> +		      const struct ib_send_wr **bad_wr);
>  #endif
> diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
> index fea45be..051ea03 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -562,10 +562,23 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
>  	}
>  	doorbell = gc->mana_ib.doorbell;
> 
> +	err = create_shadow_queue(&qp->shadow_rq, attr->cap.max_recv_wr,
> +				  sizeof(struct ud_rq_shadow_wqe));
> +	if (err) {
> +		ibdev_err(&mdev->ib_dev, "Failed to create shadow rq err %d\n", err);
> +		goto destroy_queues;
> +	}
> +	err = create_shadow_queue(&qp->shadow_sq, attr->cap.max_send_wr,
> +				  sizeof(struct ud_sq_shadow_wqe));
> +	if (err) {
> +		ibdev_err(&mdev->ib_dev, "Failed to create shadow sq err %d\n", err);
> +		goto destroy_shadow_queues;
> +	}
> +
>  	err = mana_ib_gd_create_ud_qp(mdev, qp, attr, doorbell, attr->qp_type);
>  	if (err) {
>  		ibdev_err(&mdev->ib_dev, "Failed to create ud qp  %d\n", err);
> -		goto destroy_queues;
> +		goto destroy_shadow_queues;
>  	}
>  	qp->ibqp.qp_num = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].id;
>  	qp->port = attr->port_num;
> @@ -575,6 +588,9 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
> 
>  	return 0;
> 
> +destroy_shadow_queues:
> +	destroy_shadow_queue(&qp->shadow_rq);
> +	destroy_shadow_queue(&qp->shadow_sq);
>  destroy_queues:
>  	while (i-- > 0)
>  		mana_ib_destroy_queue(mdev, &qp->ud_qp.queues[i]);
> @@ -754,6 +770,9 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
>  		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
>  	int i;
> 
> +	destroy_shadow_queue(&qp->shadow_rq);
> +	destroy_shadow_queue(&qp->shadow_sq);
> +
>  	/* Ignore return code as there is not much we can do about it.
>  	 * The error message is printed inside.
>  	 */
> diff --git a/drivers/infiniband/hw/mana/shadow_queue.h b/drivers/infiniband/hw/mana/shadow_queue.h
> new file mode 100644
> index 0000000..d8bfb4c
> --- /dev/null
> +++ b/drivers/infiniband/hw/mana/shadow_queue.h
> @@ -0,0 +1,115 @@
> +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
> +/*
> + * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
> + */
> +
> +#ifndef _MANA_SHADOW_QUEUE_H_
> +#define _MANA_SHADOW_QUEUE_H_
> +
> +struct shadow_wqe_header {
> +	u16 opcode;
> +	u16 error_code;
> +	u32 posted_wqe_size;
> +	u64 wr_id;
> +};
> +
> +struct ud_rq_shadow_wqe {
> +	struct shadow_wqe_header header;
> +	u32 byte_len;
> +	u32 src_qpn;
> +};
> +
> +struct ud_sq_shadow_wqe {
> +	struct shadow_wqe_header header;
> +};
> +
> +struct shadow_queue {
> +	/* Unmasked producer index, Incremented on wqe posting */
> +	u64 prod_idx;
> +	/* Unmasked consumer index, Incremented on cq polling */
> +	u64 cons_idx;
> +	/* Unmasked index of next-to-complete (from HW) shadow WQE */
> +	u64 next_to_complete_idx;
> +	/* queue size in wqes */
> +	u32 length;
> +	/* distance between elements in bytes */
> +	u32 stride;
> +	/* ring buffer holding wqes */
> +	void *buffer;
> +};
> +
> +static inline int create_shadow_queue(struct shadow_queue *queue, uint32_t length, uint32_t stride)
> +{
> +	queue->buffer = kvmalloc(length * stride, GFP_KERNEL);
> +	if (!queue->buffer)
> +		return -ENOMEM;
> +
> +	queue->length = length;
> +	queue->stride = stride;
> +
> +	return 0;
> +}
> +
> +static inline void destroy_shadow_queue(struct shadow_queue *queue)
> +{
> +	kvfree(queue->buffer);
> +}
> +
> +static inline bool shadow_queue_full(struct shadow_queue *queue)
> +{
> +	return (queue->prod_idx - queue->cons_idx) >= queue->length;
> +}
> +
> +static inline bool shadow_queue_empty(struct shadow_queue *queue)
> +{
> +	return queue->prod_idx == queue->cons_idx;
> +}
> +
> +static inline void *
> +shadow_queue_get_element(const struct shadow_queue *queue, u64 unmasked_index)
> +{
> +	u32 index = unmasked_index % queue->length;
> +
> +	return ((u8 *)queue->buffer + index * queue->stride);
> +}
> +
> +static inline void *
> +shadow_queue_producer_entry(struct shadow_queue *queue)
> +{
> +	return shadow_queue_get_element(queue, queue->prod_idx);
> +}
> +
> +static inline void *
> +shadow_queue_get_next_to_consume(const struct shadow_queue *queue)
> +{
> +	if (queue->cons_idx == queue->next_to_complete_idx)
> +		return NULL;
> +
> +	return shadow_queue_get_element(queue, queue->cons_idx);
> +}
> +
> +static inline void *
> +shadow_queue_get_next_to_complete(struct shadow_queue *queue)
> +{
> +	if (queue->next_to_complete_idx == queue->prod_idx)
> +		return NULL;
> +
> +	return shadow_queue_get_element(queue, queue->next_to_complete_idx);
> +}
> +
> +static inline void shadow_queue_advance_producer(struct shadow_queue *queue)
> +{
> +	queue->prod_idx++;
> +}
> +
> +static inline void shadow_queue_advance_consumer(struct shadow_queue *queue)
> +{
> +	queue->cons_idx++;
> +}
> +
> +static inline void shadow_queue_advance_next_to_complete(struct shadow_queue *queue)
> +{
> +	queue->next_to_complete_idx++;
> +}
> +
> +#endif
> diff --git a/drivers/infiniband/hw/mana/wr.c b/drivers/infiniband/hw/mana/wr.c
> new file mode 100644
> index 0000000..1813567
> --- /dev/null
> +++ b/drivers/infiniband/hw/mana/wr.c
> @@ -0,0 +1,168 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
> + */
> +
> +#include "mana_ib.h"
> +
> +#define MAX_WR_SGL_NUM (2)
> +
> +static int mana_ib_post_recv_ud(struct mana_ib_qp *qp, const struct ib_recv_wr *wr)
> +{
> +	struct mana_ib_dev *mdev = container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
> +	struct gdma_queue *queue = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].kmem;
> +	struct gdma_posted_wqe_info wqe_info = {0};
> +	struct gdma_sge gdma_sgl[MAX_WR_SGL_NUM];
> +	struct gdma_wqe_request wqe_req = {0};
> +	struct ud_rq_shadow_wqe *shadow_wqe;
> +	int err, i;
> +
> +	if (shadow_queue_full(&qp->shadow_rq))
> +		return -EINVAL;
> +
> +	if (wr->num_sge > MAX_WR_SGL_NUM)
> +		return -EINVAL;
> +
> +	for (i = 0; i < wr->num_sge; ++i) {
> +		gdma_sgl[i].address = wr->sg_list[i].addr;
> +		gdma_sgl[i].mem_key = wr->sg_list[i].lkey;
> +		gdma_sgl[i].size = wr->sg_list[i].length;
> +	}
> +	wqe_req.num_sge = wr->num_sge;
> +	wqe_req.sgl = gdma_sgl;
> +
> +	err = mana_gd_post_work_request(queue, &wqe_req, &wqe_info);
> +	if (err)
> +		return err;
> +
> +	shadow_wqe = shadow_queue_producer_entry(&qp->shadow_rq);
> +	memset(shadow_wqe, 0, sizeof(*shadow_wqe));

I would avoid using memset since this is on data path.

The patch looks good otherwise.
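
As a side note for readers of shadow_queue.h above, here is a minimal user-space model of the unmasked-index ring it implements (a sketch only; the types and names are stand-ins, not the driver's):

#include <stdbool.h>
#include <stdint.h>

struct example_ring {
	uint64_t prod_idx;	/* bumped when a WQE is posted */
	uint64_t cons_idx;	/* bumped when a completion is returned to the caller */
	uint32_t length;	/* number of slots */
	uint32_t stride;	/* bytes per slot */
	void *buffer;		/* length * stride bytes */
};

/* Indices only grow; masking happens at access time. */
static inline void *example_ring_slot(const struct example_ring *r, uint64_t unmasked)
{
	return (uint8_t *)r->buffer + (unmasked % r->length) * r->stride;
}

static inline bool example_ring_full(const struct example_ring *r)
{
	return r->prod_idx - r->cons_idx >= r->length;
}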

Long


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests
  2025-01-23 18:20   ` Long Li
@ 2025-01-23 19:03     ` Jason Gunthorpe
  0 siblings, 0 replies; 31+ messages in thread
From: Jason Gunthorpe @ 2025-01-23 19:03 UTC (permalink / raw)
  To: Long Li
  Cc: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	leon@kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	linux-hyperv@vger.kernel.org

On Thu, Jan 23, 2025 at 06:20:34PM +0000, Long Li wrote:

> > +	shadow_wqe = shadow_queue_producer_entry(&qp->shadow_rq);
> > +	memset(shadow_wqe, 0, sizeof(*shadow_wqe));
> 
> I would avoid using memset since this is on data path.

The compiler often does an amazing job with small, constant-size memsets.
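
As a rough stand-alone illustration (the struct and names below are stand-ins, not the driver's types), a constant-size memset and an explicit zero-initialization typically compile down to the same few stores:

#include <stdint.h>
#include <string.h>

struct example_shadow_wqe {	/* stand-in for a small, fixed-size shadow WQE */
	uint16_t opcode;
	uint16_t error_code;
	uint32_t posted_wqe_size;
	uint64_t wr_id;
};

static void clear_with_memset(struct example_shadow_wqe *wqe)
{
	memset(wqe, 0, sizeof(*wqe));	/* constant size: usually inlined to a few stores */
}

static void clear_with_assignment(struct example_shadow_wqe *wqe)
{
	*wqe = (struct example_shadow_wqe){ 0 };	/* equivalent zeroing without calling memset() */
}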

Jason

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD
  2025-01-20 17:27 ` [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD Konstantin Taranov
@ 2025-01-23 19:17   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23 19:17 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Add polling for the kernel CQs.
> Process completion events for UD/GSI QPs.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/cq.c               | 135 ++++++++++++++++++
>  drivers/infiniband/hw/mana/device.c           |   1 +
>  drivers/infiniband/hw/mana/mana_ib.h          |  32 +++++
>  drivers/infiniband/hw/mana/qp.c               |  33 +++++
>  .../net/ethernet/microsoft/mana/gdma_main.c   |   1 +
>  5 files changed, 202 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index 82f1462..5c325ef 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -90,6 +90,10 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  		}
>  	}
> 
> +	spin_lock_init(&cq->cq_lock);
> +	INIT_LIST_HEAD(&cq->list_send_qp);
> +	INIT_LIST_HEAD(&cq->list_recv_qp);
> +
>  	return 0;
> 
>  err_remove_cq_cb:
> @@ -180,3 +184,134 @@ int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
>  	mana_gd_ring_cq(gdma_cq, SET_ARM_BIT);
>  	return 0;
>  }
> +
> +static inline void handle_ud_sq_cqe(struct mana_ib_qp *qp, struct gdma_comp *cqe)
> +{
> +	struct mana_rdma_cqe *rdma_cqe = (struct mana_rdma_cqe *)cqe->cqe_data;
> +	struct gdma_queue *wq = qp->ud_qp.queues[MANA_UD_SEND_QUEUE].kmem;
> +	struct ud_sq_shadow_wqe *shadow_wqe;
> +
> +	shadow_wqe = shadow_queue_get_next_to_complete(&qp->shadow_sq);
> +	if (!shadow_wqe)
> +		return;
> +
> +	shadow_wqe->header.error_code = rdma_cqe->ud_send.vendor_error;
> +
> +	wq->tail += shadow_wqe->header.posted_wqe_size;
> +	shadow_queue_advance_next_to_complete(&qp->shadow_sq);
> +}
> +
> +static inline void handle_ud_rq_cqe(struct mana_ib_qp *qp, struct gdma_comp *cqe)
> +{
> +	struct mana_rdma_cqe *rdma_cqe = (struct mana_rdma_cqe *)cqe->cqe_data;
> +	struct gdma_queue *wq = qp->ud_qp.queues[MANA_UD_RECV_QUEUE].kmem;
> +	struct ud_rq_shadow_wqe *shadow_wqe;
> +
> +	shadow_wqe = shadow_queue_get_next_to_complete(&qp->shadow_rq);
> +	if (!shadow_wqe)
> +		return;
> +
> +	shadow_wqe->byte_len = rdma_cqe->ud_recv.msg_len;
> +	shadow_wqe->src_qpn = rdma_cqe->ud_recv.src_qpn;
> +	shadow_wqe->header.error_code = IB_WC_SUCCESS;
> +
> +	wq->tail += shadow_wqe->header.posted_wqe_size;
> +	shadow_queue_advance_next_to_complete(&qp->shadow_rq);
> +}
> +
> +static void mana_handle_cqe(struct mana_ib_dev *mdev, struct gdma_comp *cqe)
> +{
> +	struct mana_ib_qp *qp = mana_get_qp_ref(mdev, cqe->wq_num, cqe->is_sq);
> +
> +	if (!qp)
> +		return;
> +
> +	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_UD) {
> +		if (cqe->is_sq)
> +			handle_ud_sq_cqe(qp, cqe);
> +		else
> +			handle_ud_rq_cqe(qp, cqe);
> +	}
> +
> +	mana_put_qp_ref(qp);
> +}
> +
> +static void fill_verbs_from_shadow_wqe(struct mana_ib_qp *qp, struct ib_wc *wc,
> +				       const struct shadow_wqe_header *shadow_wqe)
> +{
> +	const struct ud_rq_shadow_wqe *ud_wqe = (const struct ud_rq_shadow_wqe *)shadow_wqe;
> +
> +	wc->wr_id = shadow_wqe->wr_id;
> +	wc->status = shadow_wqe->error_code;
> +	wc->opcode = shadow_wqe->opcode;
> +	wc->vendor_err = shadow_wqe->error_code;
> +	wc->wc_flags = 0;
> +	wc->qp = &qp->ibqp;
> +	wc->pkey_index = 0;
> +
> +	if (shadow_wqe->opcode == IB_WC_RECV) {
> +		wc->byte_len = ud_wqe->byte_len;
> +		wc->src_qp = ud_wqe->src_qpn;
> +		wc->wc_flags |= IB_WC_GRH;
> +	}
> +}
> +
> +static int mana_process_completions(struct mana_ib_cq *cq, int nwc, struct ib_wc *wc)
> +{
> +	struct shadow_wqe_header *shadow_wqe;
> +	struct mana_ib_qp *qp;
> +	int wc_index = 0;
> +
> +	/* process send shadow queue completions  */
> +	list_for_each_entry(qp, &cq->list_send_qp, cq_send_list) {
> +		while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_sq))
> +				!= NULL) {
> +			if (wc_index >= nwc)
> +				goto out;
> +
> +			fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe);
> +			shadow_queue_advance_consumer(&qp->shadow_sq);
> +			wc_index++;
> +		}
> +	}
> +
> +	/* process recv shadow queue completions */
> +	list_for_each_entry(qp, &cq->list_recv_qp, cq_recv_list) {
> +		while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_rq))
> +				!= NULL) {
> +			if (wc_index >= nwc)
> +				goto out;
> +
> +			fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe);
> +			shadow_queue_advance_consumer(&qp->shadow_rq);
> +			wc_index++;
> +		}
> +	}
> +
> +out:
> +	return wc_index;
> +}
> +
> +int mana_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
> +{
> +	struct mana_ib_cq *cq = container_of(ibcq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_dev *mdev = container_of(ibcq->device, struct mana_ib_dev, ib_dev);
> +	struct gdma_queue *queue = cq->queue.kmem;
> +	struct gdma_comp gdma_cqe;
> +	unsigned long flags;
> +	int num_polled = 0;
> +	int comp_read, i;
> +
> +	spin_lock_irqsave(&cq->cq_lock, flags);
> +	for (i = 0; i < num_entries; i++) {
> +		comp_read = mana_gd_poll_cq(queue, &gdma_cqe, 1);
> +		if (comp_read < 1)
> +			break;
> +		mana_handle_cqe(mdev, &gdma_cqe);
> +	}
> +
> +	num_polled = mana_process_completions(cq, num_entries, wc);
> +	spin_unlock_irqrestore(&cq->cq_lock, flags);
> +
> +	return num_polled;
> +}
> diff --git a/drivers/infiniband/hw/mana/device.c b/drivers/infiniband/hw/mana/device.c
> index 63e12c3..97502bc 100644
> --- a/drivers/infiniband/hw/mana/device.c
> +++ b/drivers/infiniband/hw/mana/device.c
> @@ -40,6 +40,7 @@ static const struct ib_device_ops mana_ib_dev_ops = {
>  	.mmap = mana_ib_mmap,
>  	.modify_qp = mana_ib_modify_qp,
>  	.modify_wq = mana_ib_modify_wq,
> +	.poll_cq = mana_ib_poll_cq,
>  	.post_recv = mana_ib_post_recv,
>  	.post_send = mana_ib_post_send,
>  	.query_device = mana_ib_query_device,
> diff --git a/drivers/infiniband/hw/mana/mana_ib.h b/drivers/infiniband/hw/mana/mana_ib.h
> index 5e4ca55..cd771af 100644
> --- a/drivers/infiniband/hw/mana/mana_ib.h
> +++ b/drivers/infiniband/hw/mana/mana_ib.h
> @@ -127,6 +127,10 @@ struct mana_ib_mr {
>  struct mana_ib_cq {
>  	struct ib_cq ibcq;
>  	struct mana_ib_queue queue;
> +	/* protects CQ polling */
> +	spinlock_t cq_lock;
> +	struct list_head list_send_qp;
> +	struct list_head list_recv_qp;
>  	int cqe;
>  	u32 comp_vector;
>  	mana_handle_t  cq_handle;
> @@ -169,6 +173,8 @@ struct mana_ib_qp {
>  	/* The port on the IB device, starting with 1 */
>  	u32 port;
> 
> +	struct list_head cq_send_list;
> +	struct list_head cq_recv_list;
>  	struct shadow_queue shadow_rq;
>  	struct shadow_queue shadow_sq;
> 
> @@ -435,6 +441,31 @@ struct rdma_send_oob {
>  	};
>  }; /* HW DATA */
> 
> +struct mana_rdma_cqe {
> +	union {
> +		struct {
> +			u8 cqe_type;
> +			u8 data[GDMA_COMP_DATA_SIZE - 1];
> +		};
> +		struct {
> +			u32 cqe_type		: 8;
> +			u32 vendor_error	: 9;
> +			u32 reserved1		: 15;
> +			u32 sge_offset		: 5;
> +			u32 tx_wqe_offset	: 27;
> +		} ud_send;
> +		struct {
> +			u32 cqe_type		: 8;
> +			u32 reserved1		: 24;
> +			u32 msg_len;
> +			u32 src_qpn		: 24;
> +			u32 reserved2		: 8;
> +			u32 imm_data;
> +			u32 rx_wqe_offset;
> +		} ud_recv;
> +	};
> +}; /* HW DATA */
> +
>  static inline struct gdma_context *mdev_to_gc(struct mana_ib_dev *mdev)
>  {
>  	return mdev->gdma_dev->gdma_context;
> @@ -602,5 +633,6 @@ int mana_ib_post_recv(struct ib_qp *ibqp, const struct ib_recv_wr *wr,
>  int mana_ib_post_send(struct ib_qp *ibqp, const struct ib_send_wr *wr,
>  		      const struct ib_send_wr **bad_wr);
> 
> +int mana_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
>  int mana_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
>  #endif
> diff --git a/drivers/infiniband/hw/mana/qp.c b/drivers/infiniband/hw/mana/qp.c
> index 2528046..b05e64b 100644
> --- a/drivers/infiniband/hw/mana/qp.c
> +++ b/drivers/infiniband/hw/mana/qp.c
> @@ -600,6 +600,36 @@ destroy_queues:
>  	return err;
>  }
> 
> +static void mana_add_qp_to_cqs(struct mana_ib_qp *qp)
> +{
> +	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&send_cq->cq_lock, flags);
> +	list_add_tail(&qp->cq_send_list, &send_cq->list_send_qp);
> +	spin_unlock_irqrestore(&send_cq->cq_lock, flags);
> +
> +	spin_lock_irqsave(&recv_cq->cq_lock, flags);
> +	list_add_tail(&qp->cq_recv_list, &recv_cq->list_recv_qp);
> +	spin_unlock_irqrestore(&recv_cq->cq_lock, flags);
> +}
> +
> +static void mana_remove_qp_from_cqs(struct mana_ib_qp *qp)
> +{
> +	struct mana_ib_cq *send_cq = container_of(qp->ibqp.send_cq, struct mana_ib_cq, ibcq);
> +	struct mana_ib_cq *recv_cq = container_of(qp->ibqp.recv_cq, struct mana_ib_cq, ibcq);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&send_cq->cq_lock, flags);
> +	list_del(&qp->cq_send_list);
> +	spin_unlock_irqrestore(&send_cq->cq_lock, flags);
> +
> +	spin_lock_irqsave(&recv_cq->cq_lock, flags);
> +	list_del(&qp->cq_recv_list);
> +	spin_unlock_irqrestore(&recv_cq->cq_lock, flags);
> +}
> +
>  static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
>  				struct ib_qp_init_attr *attr, struct ib_udata *udata)
>  {
> @@ -654,6 +684,8 @@ static int mana_ib_create_ud_qp(struct ib_qp *ibqp, struct ib_pd *ibpd,
>  	if (err)
>  		goto destroy_qp;
> 
> +	mana_add_qp_to_cqs(qp);
> +
>  	return 0;
> 
>  destroy_qp:
> @@ -840,6 +872,7 @@ static int mana_ib_destroy_ud_qp(struct mana_ib_qp *qp, struct ib_udata *udata)
>  		container_of(qp->ibqp.device, struct mana_ib_dev, ib_dev);
>  	int i;
> 
> +	mana_remove_qp_from_cqs(qp);
>  	mana_table_remove_qp(mdev, qp);
> 
>  	destroy_shadow_queue(&qp->shadow_rq);
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 823f7e7..2da15d9 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1222,6 +1222,7 @@ int mana_gd_poll_cq(struct gdma_queue *cq, struct gdma_comp *comp, int num_cqe)
> 
>  	return cqe_idx;
>  }
> +EXPORT_SYMBOL_NS(mana_gd_poll_cq, NET_MANA);
> 
>  static irqreturn_t mana_gd_intr(int irq, void *arg)
>  {
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support
  2025-01-20 17:27 ` [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support Konstantin Taranov
@ 2025-01-23 19:18   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23 19:18 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Set max_mad_size and IB_PORT_CM_SUP capability to enable connection
> manager.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/main.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c
> index 114e391..ae1fb69 100644
> --- a/drivers/infiniband/hw/mana/main.c
> +++ b/drivers/infiniband/hw/mana/main.c
> @@ -561,8 +561,10 @@ int mana_ib_get_port_immutable(struct ib_device *ibdev, u32 port_num,
>  	immutable->pkey_tbl_len = attr.pkey_tbl_len;
>  	immutable->gid_tbl_len = attr.gid_tbl_len;
>  	immutable->core_cap_flags = RDMA_CORE_PORT_RAW_PACKET;
> -	if (port_num == 1)
> +	if (port_num == 1) {
>  		immutable->core_cap_flags |= RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
> +		immutable->max_mad_size = IB_MGMT_MAD_SIZE;
> +	}
> 
>  	return 0;
>  }
> @@ -621,8 +623,11 @@ int mana_ib_query_port(struct ib_device *ibdev, u32 port,
>  	props->active_width = IB_WIDTH_4X;
>  	props->active_speed = IB_SPEED_EDR;
>  	props->pkey_tbl_len = 1;
> -	if (port == 1)
> +	if (port == 1) {
>  		props->gid_tbl_len = 16;
> +		props->port_cap_flags = IB_PORT_CM_SUP;
> +		props->ip_gids = true;
> +	}
> 
>  	return 0;
>  }
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob
  2025-01-20 17:27 ` [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob Konstantin Taranov
@ 2025-01-23 22:48   ` Long Li
  0 siblings, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-23 22:48 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client
> oob
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Do not warn on missing pad_data when oob is in sgl.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/net/ethernet/microsoft/mana/gdma_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index 3cb0543..a8a9cd7 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1042,7 +1042,7 @@ static u32 mana_gd_write_client_oob(const struct gdma_wqe_request *wqe_req,
>  	header->inline_oob_size_div4 = client_oob_size / sizeof(u32);
> 
>  	if (oob_in_sgl) {
> -		WARN_ON_ONCE(!pad_data || wqe_req->num_sge < 2);
> +		WARN_ON_ONCE(wqe_req->num_sge < 2);
> 
>  		header->client_oob_in_sgl = 1;
> 
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
  2025-01-23  5:36   ` Long Li
@ 2025-01-28 12:50     ` Konstantin Taranov
  0 siblings, 0 replies; 31+ messages in thread
From: Konstantin Taranov @ 2025-01-28 12:50 UTC (permalink / raw)
  To: Long Li, Konstantin Taranov, Shiraz Saleem, pabeni@redhat.com,
	Haiyang Zhang, KY Srinivasan, edumazet@google.com,
	kuba@kernel.org, davem@davemloft.net, Dexuan Cui,
	wei.liu@kernel.org, sharmaajay@microsoft.com, jgg@ziepe.ca,
	leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: RE: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level
> CQs
> 
> > Subject: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
> >
> > From: Konstantin Taranov <kotaranov@microsoft.com>
> >
> > Implement creation of CQs for the kernel.
> >
> > Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> > Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>
> > ---
> >  drivers/infiniband/hw/mana/cq.c | 80
> > +++++++++++++++++++++------------
> >  1 file changed, 52 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> > index f04a679..d26d82d 100644
> > --- a/drivers/infiniband/hw/mana/cq.c
> > +++ b/drivers/infiniband/hw/mana/cq.c
> > @@ -15,42 +15,57 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
> >  	struct ib_device *ibdev = ibcq->device;
> >  	struct mana_ib_create_cq ucmd = {};
> >  	struct mana_ib_dev *mdev;
> > +	struct gdma_context *gc;
> >  	bool is_rnic_cq;
> >  	u32 doorbell;
> > +	u32 buf_size;
> >  	int err;
> >
> >  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> > +	gc = mdev_to_gc(mdev);
> >
> >  	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
> >  	cq->cq_handle = INVALID_MANA_HANDLE;
> >
> > -	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > -		return -EINVAL;
> > +	if (udata) {
> > +		if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> > +			return -EINVAL;
> >
> > -	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> > >inlen));
> > -	if (err) {
> > -		ibdev_dbg(ibdev,
> > -			  "Failed to copy from udata for create cq, %d\n", err);
> > -		return err;
> > -	}
> > +		err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd),
> > udata->inlen));
> > +		if (err) {
> > +			ibdev_dbg(ibdev, "Failed to copy from udata for create
> > cq, %d\n", err);
> > +			return err;
> > +		}
> >
> > -	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> > +		is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> >
> > -	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
> > -		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
> > -		return -EINVAL;
> > -	}
> > +		if (!is_rnic_cq && attr->cqe > mdev-
> >adapter_caps.max_qp_wr)
> > {
> > +			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> >cqe);
> > +			return -EINVAL;
> > +		}
> >
> > -	cq->cqe = attr->cqe;
> > -	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> > COMP_ENTRY_SIZE, &cq->queue);
> > -	if (err) {
> > -		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n",
> > err);
> > -		return err;
> > -	}
> > +		cq->cqe = attr->cqe;
> > +		err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> > COMP_ENTRY_SIZE,
> > +					   &cq->queue);
> > +		if (err) {
> > +			ibdev_dbg(ibdev, "Failed to create queue for create
> > cq, %d\n", err);
> > +			return err;
> > +		}
> >
> > -	mana_ucontext = rdma_udata_to_drv_context(udata, struct
> > mana_ib_ucontext,
> > -						  ibucontext);
> > -	doorbell = mana_ucontext->doorbell;
> > +		mana_ucontext = rdma_udata_to_drv_context(udata, struct
> > mana_ib_ucontext,
> > +							  ibucontext);
> > +		doorbell = mana_ucontext->doorbell;
> > +	} else {
> > +		is_rnic_cq = true;
> > +		buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr-
> >cqe
> > * COMP_ENTRY_SIZE));
> > +		cq->cqe = buf_size / COMP_ENTRY_SIZE;
> > +		err = mana_ib_create_kernel_queue(mdev, buf_size,
> GDMA_CQ,
> > &cq->queue);
> > +		if (err) {
> > +			ibdev_dbg(ibdev, "Failed to create kernel queue for
> > create cq, %d\n", err);
> > +			return err;
> > +		}
> > +		doorbell = gc->mana_ib.doorbell;
> > +	}
> >
> >  	if (is_rnic_cq) {
> >  		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> > @@ -66,11 +81,13 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
> >  		}
> >  	}
> >
> > -	resp.cqid = cq->queue.id;
> > -	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata-
> >outlen));
> > -	if (err) {
> > -		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n",
> err);
> > -		goto err_remove_cq_cb;
> > +	if (udata) {
> > +		resp.cqid = cq->queue.id;
> > +		err = ib_copy_to_udata(udata, &resp, min(sizeof(resp),
> udata-
> > >outlen));
> > +		if (err) {
> > +			ibdev_dbg(&mdev->ib_dev, "Failed to copy to
> > udata, %d\n", err);
> > +			goto err_remove_cq_cb;
> > +		}
> >  	}
> >
> >  	return 0;
> > @@ -122,7 +139,10 @@ int mana_ib_install_cq_cb(struct mana_ib_dev
> > *mdev, struct mana_ib_cq *cq)
> >  		return -EINVAL;
> >  	/* Create CQ table entry */
> >  	WARN_ON(gc->cq_table[cq->queue.id]);
> > -	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> > +	if (cq->queue.kmem)
> > +		gdma_cq = cq->queue.kmem;
> > +	else
> > +		gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> >  	if (!gdma_cq)
> >  		return -ENOMEM;
> >
> > @@ -141,6 +161,10 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
> >  	if (cq->queue.id >= gc->max_num_cqs || cq->queue.id == INVALID_QUEUE_ID)
> >  		return;
> >
> > +	if (cq->queue.kmem)
> > +	/* Then it will be cleaned and removed by the mana */
> > +		return;
> > +
> 
> Do you need to call "gc->cq_table[cq->queue.id] = NULL" before return?

No. Ownership here is assumed by mana.ko, so the entry will be set to NULL by mana_gd_destroy_cq().
-Konstantin
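
A toy model of that ownership rule (plain C; the names are stand-ins, not the driver's): exactly one side frees and clears a table slot.

#include <stdbool.h>
#include <stdlib.h>

struct example_entry {
	bool owned_by_mana;	/* installed from a kernel-owned queue */
};

/* IB-side teardown: skip entries that the mana side will release later. */
static void example_ib_remove(struct example_entry **table, unsigned int id)
{
	if (table[id] && table[id]->owned_by_mana)
		return;		/* the mana_gd_destroy_cq() counterpart clears table[id] */

	free(table[id]);
	table[id] = NULL;
}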

> 
> 
> >  	kfree(gc->cq_table[cq->queue.id]);
> >  	gc->cq_table[cq->queue.id] = NULL;
> >  }
> > --
> > 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
  2025-01-20 17:27 ` [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs Konstantin Taranov
  2025-01-23  5:36   ` Long Li
@ 2025-01-29  0:48   ` Long Li
  1 sibling, 0 replies; 31+ messages in thread
From: Long Li @ 2025-01-29  0:48 UTC (permalink / raw)
  To: Konstantin Taranov, Konstantin Taranov, Shiraz Saleem,
	pabeni@redhat.com, Haiyang Zhang, KY Srinivasan,
	edumazet@google.com, kuba@kernel.org, davem@davemloft.net,
	Dexuan Cui, wei.liu@kernel.org, sharmaajay@microsoft.com,
	jgg@ziepe.ca, leon@kernel.org
  Cc: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, linux-hyperv@vger.kernel.org

> Subject: [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs
> 
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> Implement creation of CQs for the kernel.
> 
> Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
> Reviewed-by: Shiraz Saleem <shirazsaleem@microsoft.com>

Reviewed-by: Long Li <longli@microsoft.com>

> ---
>  drivers/infiniband/hw/mana/cq.c | 80 +++++++++++++++++++++------------
>  1 file changed, 52 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mana/cq.c b/drivers/infiniband/hw/mana/cq.c
> index f04a679..d26d82d 100644
> --- a/drivers/infiniband/hw/mana/cq.c
> +++ b/drivers/infiniband/hw/mana/cq.c
> @@ -15,42 +15,57 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  	struct ib_device *ibdev = ibcq->device;
>  	struct mana_ib_create_cq ucmd = {};
>  	struct mana_ib_dev *mdev;
> +	struct gdma_context *gc;
>  	bool is_rnic_cq;
>  	u32 doorbell;
> +	u32 buf_size;
>  	int err;
> 
>  	mdev = container_of(ibdev, struct mana_ib_dev, ib_dev);
> +	gc = mdev_to_gc(mdev);
> 
>  	cq->comp_vector = attr->comp_vector % ibdev->num_comp_vectors;
>  	cq->cq_handle = INVALID_MANA_HANDLE;
> 
> -	if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> -		return -EINVAL;
> +	if (udata) {
> +		if (udata->inlen < offsetof(struct mana_ib_create_cq, flags))
> +			return -EINVAL;
> 
> -	err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata-
> >inlen));
> -	if (err) {
> -		ibdev_dbg(ibdev,
> -			  "Failed to copy from udata for create cq, %d\n", err);
> -		return err;
> -	}
> +		err = ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd),
> udata->inlen));
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to copy from udata for create
> cq, %d\n", err);
> +			return err;
> +		}
> 
> -	is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> +		is_rnic_cq = !!(ucmd.flags & MANA_IB_CREATE_RNIC_CQ);
> 
> -	if (!is_rnic_cq && attr->cqe > mdev->adapter_caps.max_qp_wr) {
> -		ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr->cqe);
> -		return -EINVAL;
> -	}
> +		if (!is_rnic_cq && attr->cqe > mdev-
> >adapter_caps.max_qp_wr) {
> +			ibdev_dbg(ibdev, "CQE %d exceeding limit\n", attr-
> >cqe);
> +			return -EINVAL;
> +		}
> 
> -	cq->cqe = attr->cqe;
> -	err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> COMP_ENTRY_SIZE, &cq->queue);
> -	if (err) {
> -		ibdev_dbg(ibdev, "Failed to create queue for create cq, %d\n",
> err);
> -		return err;
> -	}
> +		cq->cqe = attr->cqe;
> +		err = mana_ib_create_queue(mdev, ucmd.buf_addr, cq->cqe *
> COMP_ENTRY_SIZE,
> +					   &cq->queue);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create queue for create cq,
> %d\n", err);
> +			return err;
> +		}
> 
> -	mana_ucontext = rdma_udata_to_drv_context(udata, struct
> mana_ib_ucontext,
> -						  ibucontext);
> -	doorbell = mana_ucontext->doorbell;
> +		mana_ucontext = rdma_udata_to_drv_context(udata, struct
> mana_ib_ucontext,
> +							  ibucontext);
> +		doorbell = mana_ucontext->doorbell;
> +	} else {
> +		is_rnic_cq = true;
> +		buf_size = MANA_PAGE_ALIGN(roundup_pow_of_two(attr-
> >cqe * COMP_ENTRY_SIZE));
> +		cq->cqe = buf_size / COMP_ENTRY_SIZE;
> +		err = mana_ib_create_kernel_queue(mdev, buf_size,
> GDMA_CQ, &cq->queue);
> +		if (err) {
> +			ibdev_dbg(ibdev, "Failed to create kernel queue for
> create cq, %d\n", err);
> +			return err;
> +		}
> +		doorbell = gc->mana_ib.doorbell;
> +	}
> 
>  	if (is_rnic_cq) {
>  		err = mana_ib_gd_create_cq(mdev, cq, doorbell);
> @@ -66,11 +81,13 @@ int mana_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
>  		}
>  	}
> 
> -	resp.cqid = cq->queue.id;
> -	err = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata-
> >outlen));
> -	if (err) {
> -		ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata, %d\n",
> err);
> -		goto err_remove_cq_cb;
> +	if (udata) {
> +		resp.cqid = cq->queue.id;
> +		err = ib_copy_to_udata(udata, &resp, min(sizeof(resp),
> udata->outlen));
> +		if (err) {
> +			ibdev_dbg(&mdev->ib_dev, "Failed to copy to udata,
> %d\n", err);
> +			goto err_remove_cq_cb;
> +		}
>  	}
> 
>  	return 0;
> @@ -122,7 +139,10 @@ int mana_ib_install_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
>  		return -EINVAL;
>  	/* Create CQ table entry */
>  	WARN_ON(gc->cq_table[cq->queue.id]);
> -	gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
> +	if (cq->queue.kmem)
> +		gdma_cq = cq->queue.kmem;
> +	else
> +		gdma_cq = kzalloc(sizeof(*gdma_cq), GFP_KERNEL);
>  	if (!gdma_cq)
>  		return -ENOMEM;
> 
> @@ -141,6 +161,10 @@ void mana_ib_remove_cq_cb(struct mana_ib_dev *mdev, struct mana_ib_cq *cq)
>  	if (cq->queue.id >= gc->max_num_cqs || cq->queue.id == INVALID_QUEUE_ID)
>  		return;
> 
> +	if (cq->queue.kmem)
> +	/* Then it will be cleaned and removed by the mana */
> +		return;
> +
>  	kfree(gc->cq_table[cq->queue.id]);
>  	gc->cq_table[cq->queue.id] = NULL;
>  }
> --
> 2.43.0


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib
  2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
                   ` (12 preceding siblings ...)
  2025-01-20 17:27 ` [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support Konstantin Taranov
@ 2025-02-03 11:56 ` Leon Romanovsky
  13 siblings, 0 replies; 31+ messages in thread
From: Leon Romanovsky @ 2025-02-03 11:56 UTC (permalink / raw)
  To: kotaranov, shirazsaleem, pabeni, haiyangz, kys, edumazet, kuba,
	davem, decui, wei.liu, sharmaajay, longli, jgg,
	Konstantin Taranov
  Cc: linux-rdma, linux-kernel, netdev, linux-hyperv


On Mon, 20 Jan 2025 09:27:06 -0800, Konstantin Taranov wrote:
> From: Konstantin Taranov <kotaranov@microsoft.com>
> 
> This patch series enables GSI QPs and CM on mana_ib.
> 
> Konstantin Taranov (13):
>   RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
>   RDMA/mana_ib: implement get_dma_mr
>   RDMA/mana_ib: helpers to allocate kernel queues
>   RDMA/mana_ib: create kernel-level CQs
>   RDMA/mana_ib: Create and destroy UD/GSI QP
>   RDMA/mana_ib: UD/GSI QP creation for kernel
>   RDMA/mana_ib: create/destroy AH
>   net/mana: fix warning in the writer of client oob
>   RDMA/mana_ib: UD/GSI work requests
>   RDMA/mana_ib: implement req_notify_cq
>   RDMA/mana_ib: extend mana QP table
>   RDMA/mana_ib: polling of CQs for GSI/UD
>   RDMA/mana_ib: indicate CM support
> 
> [...]

Applied, thanks!

[01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs
        https://git.kernel.org/rdma/rdma/c/78683c25c80e54
[02/13] RDMA/mana_ib: implement get_dma_mr
        https://git.kernel.org/rdma/rdma/c/6e1b8bdcd04f4e
[03/13] RDMA/mana_ib: helpers to allocate kernel queues
        https://git.kernel.org/rdma/rdma/c/f662c0f5b3396a
[04/13] RDMA/mana_ib: create kernel-level CQs
        https://git.kernel.org/rdma/rdma/c/822d4c938e0d93
[05/13] RDMA/mana_ib: Create and destroy UD/GSI QP
        https://git.kernel.org/rdma/rdma/c/392ed69a9ac45c
[06/13] RDMA/mana_ib: UD/GSI QP creation for kernel
        https://git.kernel.org/rdma/rdma/c/bf3f6576bbbd51
[07/13] RDMA/mana_ib: create/destroy AH
        https://git.kernel.org/rdma/rdma/c/09ec8a57903348
[08/13] net/mana: fix warning in the writer of client oob
        https://git.kernel.org/rdma/rdma/c/622f1fc2ca7dea
[09/13] RDMA/mana_ib: UD/GSI work requests
        https://git.kernel.org/rdma/rdma/c/8d9a5210545c27
[10/13] RDMA/mana_ib: implement req_notify_cq
        https://git.kernel.org/rdma/rdma/c/cd595cf391733c
[11/13] RDMA/mana_ib: extend mana QP table
        https://git.kernel.org/rdma/rdma/c/9fe09a01ade76d
[12/13] RDMA/mana_ib: polling of CQs for GSI/UD
        https://git.kernel.org/rdma/rdma/c/4aa5b050800325
[13/13] RDMA/mana_ib: indicate CM support
        https://git.kernel.org/rdma/rdma/c/842ee6aeddff08

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-02-03 11:56 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-01-20 17:27 [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Konstantin Taranov
2025-01-20 17:27 ` [PATCH rdma-next 01/13] RDMA/mana_ib: Allow registration of DMA-mapped memory in PDs Konstantin Taranov
2025-01-23  5:17   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 02/13] RDMA/mana_ib: implement get_dma_mr Konstantin Taranov
2025-01-23  5:18   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 03/13] RDMA/mana_ib: helpers to allocate kernel queues Konstantin Taranov
2025-01-23  5:25   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 04/13] RDMA/mana_ib: create kernel-level CQs Konstantin Taranov
2025-01-23  5:36   ` Long Li
2025-01-28 12:50     ` Konstantin Taranov
2025-01-29  0:48   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 05/13] RDMA/mana_ib: Create and destroy UD/GSI QP Konstantin Taranov
2025-01-23  5:40   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 06/13] RDMA/mana_ib: UD/GSI QP creation for kernel Konstantin Taranov
2025-01-23  5:45   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 07/13] RDMA/mana_ib: create/destroy AH Konstantin Taranov
2025-01-23  5:53   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 08/13] net/mana: fix warning in the writer of client oob Konstantin Taranov
2025-01-23 22:48   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 09/13] RDMA/mana_ib: UD/GSI work requests Konstantin Taranov
2025-01-23 18:20   ` Long Li
2025-01-23 19:03     ` Jason Gunthorpe
2025-01-20 17:27 ` [PATCH rdma-next 10/13] RDMA/mana_ib: implement req_notify_cq Konstantin Taranov
2025-01-23  5:57   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 11/13] RDMA/mana_ib: extend mana QP table Konstantin Taranov
2025-01-23  6:02   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 12/13] RDMA/mana_ib: polling of CQs for GSI/UD Konstantin Taranov
2025-01-23 19:17   ` Long Li
2025-01-20 17:27 ` [PATCH rdma-next 13/13] RDMA/mana_ib: indicate CM support Konstantin Taranov
2025-01-23 19:18   ` Long Li
2025-02-03 11:56 ` [PATCH rdma-next 00/13] RDMA/mana_ib: Enable CM for mana_ib Leon Romanovsky
