Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
* [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation
@ 2024-11-14  9:49 Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved Selvin Xavier
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Selvin Xavier @ 2024-11-14  9:49 UTC (permalink / raw)
  To: leon, jgg
  Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
	Selvin Xavier

Includes some generic improvments and code refactoring for the
Notification Queue handling in the driver. Remove the data
structures that store the NQ information out of the device
structure. Fix few issues in selecting the NQ during CQ
create. Also, fail the driver load if NIC driver can not
allocate at least two NQs for RoCE.

Please review and apply.

Thanks,
Selvin Xavier

v1 -> v2:
 - Dropped patch 5
 - Fixed error handling in patch 1 by adding bnxt_unregister_dev
 - Removed redundant comment in patch 3

Kalesh AP (4):
  RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are
    reserved
  RDMA/bnxt_re: Refactor NQ allocation
  RDMA/bnxt_re: Refurbish CQ to NQ hash calculation
  RDMA/bnxt_re: Cache MSIx info to a local structure

 drivers/infiniband/hw/bnxt_re/bnxt_re.h  |  18 +++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c |  41 +++++++----
 drivers/infiniband/hw/bnxt_re/main.c     | 114 ++++++++++++++++++++-----------
 drivers/infiniband/hw/bnxt_re/qplib_fp.c |   1 +
 drivers/infiniband/hw/bnxt_re/qplib_fp.h |   1 +
 5 files changed, 115 insertions(+), 60 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH rdma-next v2 1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved
  2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
@ 2024-11-14  9:49 ` Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 2/4] RDMA/bnxt_re: Refactor NQ allocation Selvin Xavier
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Selvin Xavier @ 2024-11-14  9:49 UTC (permalink / raw)
  To: leon, jgg
  Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
	Selvin Xavier

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

L2 driver allocates and populates the MSI-x vector details for RoCE
in the en_dev structure. RoCE driver requires minimum 2 MSIx vectors.
Hence during probe, driver has to check and bail out if there are not
enough MSI-x vectors reserved for it before proceeding further
initialization.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h |  2 ++
 drivers/infiniband/hw/bnxt_re/main.c    | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index d1b7c20..7abc37b 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -154,6 +154,8 @@ struct bnxt_re_pacing {
 
 #define BNXT_RE_GRC_FIFO_REG_BASE 0x2000
 
+#define BNXT_RE_MIN_MSIX		2
+
 #define MAX_CQ_HASH_BITS		(16)
 #define MAX_SRQ_HASH_BITS		(16)
 
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index cb61941..c262a16 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1936,6 +1936,18 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 	}
 	set_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
 
+	if (rdev->en_dev->ulp_tbl->msix_requested < BNXT_RE_MIN_MSIX) {
+		ibdev_err(&rdev->ibdev,
+			  "RoCE requires minimum 2 MSI-X vectors, but only %d reserved\n",
+			  rdev->en_dev->ulp_tbl->msix_requested);
+		bnxt_unregister_dev(rdev->en_dev);
+		clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+		return -EINVAL;
+	}
+	ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
+		  rdev->en_dev->ulp_tbl->msix_requested);
+	rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+
 	rc = bnxt_re_setup_chip_ctx(rdev);
 	if (rc) {
 		bnxt_unregister_dev(rdev->en_dev);
@@ -1947,16 +1959,6 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 	/* Check whether VF or PF */
 	bnxt_re_get_sriov_func_type(rdev);
 
-	if (!rdev->en_dev->ulp_tbl->msix_requested) {
-		ibdev_err(&rdev->ibdev,
-			  "Failed to get MSI-X vectors: %#x\n", rc);
-		rc = -EINVAL;
-		goto fail;
-	}
-	ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
-		  rdev->en_dev->ulp_tbl->msix_requested);
-	rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
-
 	bnxt_re_query_hwrm_intf_version(rdev);
 
 	/* Establish RCFW Communication Channel to initialize the context
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH rdma-next v2 2/4] RDMA/bnxt_re: Refactor NQ allocation
  2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved Selvin Xavier
@ 2024-11-14  9:49 ` Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 3/4] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation Selvin Xavier
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Selvin Xavier @ 2024-11-14  9:49 UTC (permalink / raw)
  To: leon, jgg
  Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
	Selvin Xavier

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

Move NQ related data structures from rdev to a new structure
named "struct bnxt_re_nq_record" by keeping a pointer to in
the rdev structure. Allocate the memory for it dynamically.
This change is needed for subsequent patches in the series.

Also, removed the nq_task variable from rdev structure as it
is redundant and no longer used.

This change would help to reduce the size of the driver private
structure as well.

Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  | 13 +++---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c |  6 +--
 drivers/infiniband/hw/bnxt_re/main.c     | 74 +++++++++++++++++++++-----------
 3 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 7abc37b..d1c839e 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -155,6 +155,11 @@ struct bnxt_re_pacing {
 #define BNXT_RE_GRC_FIFO_REG_BASE 0x2000
 
 #define BNXT_RE_MIN_MSIX		2
+#define BNXT_RE_MAX_MSIX		BNXT_MAX_ROCE_MSIX
+struct bnxt_re_nq_record {
+	struct bnxt_qplib_nq	nq[BNXT_RE_MAX_MSIX];
+	int			num_msix;
+};
 
 #define MAX_CQ_HASH_BITS		(16)
 #define MAX_SRQ_HASH_BITS		(16)
@@ -183,21 +188,17 @@ struct bnxt_re_dev {
 	unsigned int			version, major, minor;
 	struct bnxt_qplib_chip_ctx	*chip_ctx;
 	struct bnxt_en_dev		*en_dev;
-	int				num_msix;
 
 	int				id;
 
 	struct delayed_work		worker;
 	u8				cur_prio_map;
 
-	/* FP Notification Queue (CQ & SRQ) */
-	struct tasklet_struct		nq_task;
-
 	/* RCFW Channel */
 	struct bnxt_qplib_rcfw		rcfw;
 
-	/* NQ */
-	struct bnxt_qplib_nq		nq[BNXT_MAX_ROCE_MSIX];
+	/* NQ record */
+	struct bnxt_re_nq_record	*nqr;
 
 	/* Device Resources */
 	struct bnxt_qplib_dev_attr	dev_attr;
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 9a188cc..a9c32c0 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -1872,8 +1872,8 @@ int bnxt_re_create_srq(struct ib_srq *ib_srq,
 	srq->qplib_srq.wqe_size = bnxt_re_get_rwqe_size(dev_attr->max_srq_sges);
 	srq->qplib_srq.threshold = srq_init_attr->attr.srq_limit;
 	srq->srq_limit = srq_init_attr->attr.srq_limit;
-	srq->qplib_srq.eventq_hw_ring_id = rdev->nq[0].ring_id;
-	nq = &rdev->nq[0];
+	srq->qplib_srq.eventq_hw_ring_id = rdev->nqr->nq[0].ring_id;
+	nq = &rdev->nqr->nq[0];
 
 	if (udata) {
 		rc = bnxt_re_init_user_srq(rdev, pd, srq, udata);
@@ -3122,7 +3122,7 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	 * used for getting the NQ index.
 	 */
 	nq_alloc_cnt = atomic_inc_return(&rdev->nq_alloc_cnt);
-	nq = &rdev->nq[nq_alloc_cnt % (rdev->num_msix - 1)];
+	nq = &rdev->nqr->nq[nq_alloc_cnt % (rdev->nqr->num_msix - 1)];
 	cq->qplib_cq.max_wqe = entries;
 	cq->qplib_cq.cnq_hw_ring_id = nq->ring_id;
 	cq->qplib_cq.nq	= nq;
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index c262a16..9669def 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -326,8 +326,8 @@ static void bnxt_re_stop_irq(void *handle)
 	rdev = en_info->rdev;
 	rcfw = &rdev->rcfw;
 
-	for (indx = BNXT_RE_NQ_IDX; indx < rdev->num_msix; indx++) {
-		nq = &rdev->nq[indx - 1];
+	for (indx = BNXT_RE_NQ_IDX; indx < rdev->nqr->num_msix; indx++) {
+		nq = &rdev->nqr->nq[indx - 1];
 		bnxt_qplib_nq_stop_irq(nq, false);
 	}
 
@@ -362,7 +362,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
 	/* Vectors may change after restart, so update with new vectors
 	 * in device sctructure.
 	 */
-	for (indx = 0; indx < rdev->num_msix; indx++)
+	for (indx = 0; indx < rdev->nqr->num_msix; indx++)
 		rdev->en_dev->msix_entries[indx].vector = ent[indx].vector;
 
 	rc = bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
@@ -371,8 +371,8 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
 		ibdev_warn(&rdev->ibdev, "Failed to reinit CREQ\n");
 		return;
 	}
-	for (indx = BNXT_RE_NQ_IDX ; indx < rdev->num_msix; indx++) {
-		nq = &rdev->nq[indx - 1];
+	for (indx = BNXT_RE_NQ_IDX ; indx < rdev->nqr->num_msix; indx++) {
+		nq = &rdev->nqr->nq[indx - 1];
 		rc = bnxt_qplib_nq_start_irq(nq, indx - 1,
 					     msix_ent[indx].vector, false);
 		if (rc) {
@@ -1206,7 +1206,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 
 	addrconf_addr_eui48((u8 *)&ibdev->node_guid, rdev->netdev->dev_addr);
 
-	ibdev->num_comp_vectors	= rdev->num_msix - 1;
+	ibdev->num_comp_vectors	= rdev->nqr->num_msix - 1;
 	ibdev->dev.parent = &rdev->en_dev->pdev->dev;
 	ibdev->local_dma_lkey = BNXT_QPLIB_RSVD_LKEY;
 
@@ -1551,8 +1551,8 @@ static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
 {
 	int i;
 
-	for (i = 1; i < rdev->num_msix; i++)
-		bnxt_qplib_disable_nq(&rdev->nq[i - 1]);
+	for (i = 1; i < rdev->nqr->num_msix; i++)
+		bnxt_qplib_disable_nq(&rdev->nqr->nq[i - 1]);
 
 	if (rdev->qplib_res.rcfw)
 		bnxt_qplib_cleanup_res(&rdev->qplib_res);
@@ -1566,9 +1566,9 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 
 	bnxt_qplib_init_res(&rdev->qplib_res);
 
-	for (i = 1; i < rdev->num_msix ; i++) {
+	for (i = 1; i < rdev->nqr->num_msix ; i++) {
 		db_offt = rdev->en_dev->msix_entries[i].db_offset;
-		rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq[i - 1],
+		rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
 					  i - 1, rdev->en_dev->msix_entries[i].vector,
 					  db_offt, &bnxt_re_cqn_handler,
 					  &bnxt_re_srqn_handler);
@@ -1582,20 +1582,22 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 	return 0;
 fail:
 	for (i = num_vec_enabled; i >= 0; i--)
-		bnxt_qplib_disable_nq(&rdev->nq[i]);
+		bnxt_qplib_disable_nq(&rdev->nqr->nq[i]);
 	return rc;
 }
 
 static void bnxt_re_free_nq_res(struct bnxt_re_dev *rdev)
 {
+	struct bnxt_qplib_nq *nq;
 	u8 type;
 	int i;
 
-	for (i = 0; i < rdev->num_msix - 1; i++) {
+	for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
 		type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
-		bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type);
-		bnxt_qplib_free_nq(&rdev->nq[i]);
-		rdev->nq[i].res = NULL;
+		nq = &rdev->nqr->nq[i];
+		bnxt_re_net_ring_free(rdev, nq->ring_id, type);
+		bnxt_qplib_free_nq(nq);
+		nq->res = NULL;
 	}
 }
 
@@ -1637,12 +1639,12 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 	if (rc)
 		goto dealloc_res;
 
-	for (i = 0; i < rdev->num_msix - 1; i++) {
+	for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
 		struct bnxt_qplib_nq *nq;
 
-		nq = &rdev->nq[i];
+		nq = &rdev->nqr->nq[i];
 		nq->hwq.max_elements = BNXT_QPLIB_NQE_MAX_CNT;
-		rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, &rdev->nq[i]);
+		rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, nq);
 		if (rc) {
 			ibdev_err(&rdev->ibdev, "Alloc Failed NQ%d rc:%#x",
 				  i, rc);
@@ -1650,7 +1652,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 		}
 		type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
 		rattr.dma_arr = nq->hwq.pbl[PBL_LVL_0].pg_map_arr;
-		rattr.pages = nq->hwq.pbl[rdev->nq[i].hwq.level].pg_count;
+		rattr.pages = nq->hwq.pbl[rdev->nqr->nq[i].hwq.level].pg_count;
 		rattr.type = type;
 		rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
 		rattr.depth = BNXT_QPLIB_NQE_MAX_CNT - 1;
@@ -1660,7 +1662,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 			ibdev_err(&rdev->ibdev,
 				  "Failed to allocate NQ fw id with rc = 0x%x",
 				  rc);
-			bnxt_qplib_free_nq(&rdev->nq[i]);
+			bnxt_qplib_free_nq(nq);
 			goto free_nq;
 		}
 		num_vec_created++;
@@ -1669,8 +1671,8 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 free_nq:
 	for (i = num_vec_created - 1; i >= 0; i--) {
 		type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
-		bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type);
-		bnxt_qplib_free_nq(&rdev->nq[i]);
+		bnxt_re_net_ring_free(rdev, rdev->nqr->nq[i].ring_id, type);
+		bnxt_qplib_free_nq(&rdev->nqr->nq[i]);
 	}
 	bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
 			       &rdev->dpi_privileged);
@@ -1865,6 +1867,21 @@ static int bnxt_re_ib_init(struct bnxt_re_dev *rdev)
 	return rc;
 }
 
+static int bnxt_re_alloc_nqr_mem(struct bnxt_re_dev *rdev)
+{
+	rdev->nqr = kzalloc(sizeof(*rdev->nqr), GFP_KERNEL);
+	if (!rdev->nqr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void bnxt_re_free_nqr_mem(struct bnxt_re_dev *rdev)
+{
+	kfree(rdev->nqr);
+	rdev->nqr = NULL;
+}
+
 static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
 {
 	u8 type;
@@ -1894,11 +1911,12 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
 		bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
 	}
 
-	rdev->num_msix = 0;
+	rdev->nqr->num_msix = 0;
 
 	if (rdev->pacing.dbr_pacing)
 		bnxt_re_deinitialize_dbr_pacing(rdev);
 
+	bnxt_re_free_nqr_mem(rdev);
 	bnxt_re_destroy_chip_ctx(rdev);
 	if (op_type == BNXT_RE_COMPLETE_REMOVE) {
 		if (test_and_clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags))
@@ -1946,7 +1964,6 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 	}
 	ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
 		  rdev->en_dev->ulp_tbl->msix_requested);
-	rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
 
 	rc = bnxt_re_setup_chip_ctx(rdev);
 	if (rc) {
@@ -1956,6 +1973,15 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 		return -EINVAL;
 	}
 
+	rc = bnxt_re_alloc_nqr_mem(rdev);
+	if (rc) {
+		bnxt_re_destroy_chip_ctx(rdev);
+		bnxt_unregister_dev(rdev->en_dev);
+		clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+		return rc;
+	}
+	rdev->nqr->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+
 	/* Check whether VF or PF */
 	bnxt_re_get_sriov_func_type(rdev);
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH rdma-next v2 3/4] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation
  2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 2/4] RDMA/bnxt_re: Refactor NQ allocation Selvin Xavier
@ 2024-11-14  9:49 ` Selvin Xavier
  2024-11-14  9:49 ` [PATCH rdma-next v2 4/4] RDMA/bnxt_re: Cache MSIx info to a local structure Selvin Xavier
  2024-11-14 14:53 ` [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Leon Romanovsky
  4 siblings, 0 replies; 6+ messages in thread
From: Selvin Xavier @ 2024-11-14  9:49 UTC (permalink / raw)
  To: leon, jgg
  Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
	Selvin Xavier

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

There are few use cases where CQ create and destroy
is seen before re-creating the CQ, this kind of use
case is disturbing the RR distribution and all the
active CQ getting mapped to only 2 NQ alternatively.

Fixing the CQ to NQ hash calculation by implementing
a quick load sorting mechanism under a mutex.

Using this, if the CQ was allocated and destroyed
before using it, the nq selecting algorithm still
obtains the least loaded CQ. Thus balancing the load
on NQs.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h  |  2 ++
 drivers/infiniband/hw/bnxt_re/ib_verbs.c | 37 ++++++++++++++++++++++----------
 drivers/infiniband/hw/bnxt_re/main.c     |  2 ++
 drivers/infiniband/hw/bnxt_re/qplib_fp.c |  1 +
 drivers/infiniband/hw/bnxt_re/qplib_fp.h |  1 +
 5 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index d1c839e..5f64fc4 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -159,6 +159,8 @@ struct bnxt_re_pacing {
 struct bnxt_re_nq_record {
 	struct bnxt_qplib_nq	nq[BNXT_RE_MAX_MSIX];
 	int			num_msix;
+	/* serialize NQ access */
+	struct mutex		load_lock;
 };
 
 #define MAX_CQ_HASH_BITS		(16)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index a9c32c0..f6e9eef 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3029,6 +3029,28 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
 	return rc;
 }
 
+static struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev)
+{
+	int min, indx;
+
+	mutex_lock(&rdev->nqr->load_lock);
+	for (indx = 0, min = 0; indx < (rdev->nqr->num_msix - 1); indx++) {
+		if (rdev->nqr->nq[min].load > rdev->nqr->nq[indx].load)
+			min = indx;
+	}
+	rdev->nqr->nq[min].load++;
+	mutex_unlock(&rdev->nqr->load_lock);
+
+	return &rdev->nqr->nq[min];
+}
+
+static void bnxt_re_put_nq(struct bnxt_re_dev *rdev, struct bnxt_qplib_nq *nq)
+{
+	mutex_lock(&rdev->nqr->load_lock);
+	nq->load--;
+	mutex_unlock(&rdev->nqr->load_lock);
+}
+
 /* Completion Queues */
 int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
@@ -3047,6 +3069,8 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 		hash_del(&cq->hash_entry);
 	}
 	bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
+
+	bnxt_re_put_nq(rdev, nq);
 	ib_umem_release(cq->umem);
 
 	atomic_dec(&rdev->stats.res.cq_count);
@@ -3065,8 +3089,6 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
 	struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
 	struct bnxt_qplib_chip_ctx *cctx;
-	struct bnxt_qplib_nq *nq = NULL;
-	unsigned int nq_alloc_cnt;
 	int cqe = attr->cqe;
 	int rc, entries;
 	u32 active_cqs;
@@ -3117,16 +3139,10 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 
 		cq->qplib_cq.dpi = &rdev->dpi_privileged;
 	}
-	/*
-	 * Allocating the NQ in a round robin fashion. nq_alloc_cnt is a
-	 * used for getting the NQ index.
-	 */
-	nq_alloc_cnt = atomic_inc_return(&rdev->nq_alloc_cnt);
-	nq = &rdev->nqr->nq[nq_alloc_cnt % (rdev->nqr->num_msix - 1)];
 	cq->qplib_cq.max_wqe = entries;
-	cq->qplib_cq.cnq_hw_ring_id = nq->ring_id;
-	cq->qplib_cq.nq	= nq;
 	cq->qplib_cq.coalescing = &rdev->cq_coalescing;
+	cq->qplib_cq.nq = bnxt_re_get_nq(rdev);
+	cq->qplib_cq.cnq_hw_ring_id = cq->qplib_cq.nq->ring_id;
 
 	rc = bnxt_qplib_create_cq(&rdev->qplib_res, &cq->qplib_cq);
 	if (rc) {
@@ -3136,7 +3152,6 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 
 	cq->ib_cq.cqe = entries;
 	cq->cq_period = cq->qplib_cq.period;
-	nq->budget++;
 
 	active_cqs = atomic_inc_return(&rdev->stats.res.cq_count);
 	if (active_cqs > rdev->stats.res.cq_watermark)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 9669def..fcaf2b3 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1566,6 +1566,8 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 
 	bnxt_qplib_init_res(&rdev->qplib_res);
 
+	mutex_init(&rdev->nqr->load_lock);
+
 	for (i = 1; i < rdev->nqr->num_msix ; i++) {
 		db_offt = rdev->en_dev->msix_entries[i].db_offset;
 		rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index e2eea71..e56f42f 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -551,6 +551,7 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 	nq->pdev = pdev;
 	nq->cqn_handler = cqn_handler;
 	nq->srqn_handler = srqn_handler;
+	nq->load = 0;
 
 	/* Have a task to schedule CQ notifiers in post send case */
 	nq->cqn_wq  = create_singlethread_workqueue("bnxt_qplib_nq");
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index b5a90581..8ff56d7 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -519,6 +519,7 @@ struct bnxt_qplib_nq {
 	struct tasklet_struct		nq_tasklet;
 	bool				requested;
 	int				budget;
+	u32				load;
 
 	cqn_handler_t			cqn_handler;
 	srqn_handler_t			srqn_handler;
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH rdma-next v2 4/4] RDMA/bnxt_re: Cache MSIx info to a local structure
  2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
                   ` (2 preceding siblings ...)
  2024-11-14  9:49 ` [PATCH rdma-next v2 3/4] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation Selvin Xavier
@ 2024-11-14  9:49 ` Selvin Xavier
  2024-11-14 14:53 ` [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Leon Romanovsky
  4 siblings, 0 replies; 6+ messages in thread
From: Selvin Xavier @ 2024-11-14  9:49 UTC (permalink / raw)
  To: leon, jgg
  Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
	Selvin Xavier

From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

L2 driver allocates the vectors for RoCE and pass it through the
en_dev structure to RoCE. During probe, cache the MSIx related
info to a local structure.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h |  1 +
 drivers/infiniband/hw/bnxt_re/main.c    | 18 ++++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 5f64fc4..2975b11 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -157,6 +157,7 @@ struct bnxt_re_pacing {
 #define BNXT_RE_MIN_MSIX		2
 #define BNXT_RE_MAX_MSIX		BNXT_MAX_ROCE_MSIX
 struct bnxt_re_nq_record {
+	struct bnxt_msix_entry	msix_entries[BNXT_RE_MAX_MSIX];
 	struct bnxt_qplib_nq	nq[BNXT_RE_MAX_MSIX];
 	int			num_msix;
 	/* serialize NQ access */
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index fcaf2b3..533b9f1 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -347,7 +347,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
 		return;
 
 	rdev = en_info->rdev;
-	msix_ent = rdev->en_dev->msix_entries;
+	msix_ent = rdev->nqr->msix_entries;
 	rcfw = &rdev->rcfw;
 	if (!ent) {
 		/* Not setting the f/w timeout bit in rcfw.
@@ -363,7 +363,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
 	 * in device sctructure.
 	 */
 	for (indx = 0; indx < rdev->nqr->num_msix; indx++)
-		rdev->en_dev->msix_entries[indx].vector = ent[indx].vector;
+		rdev->nqr->msix_entries[indx].vector = ent[indx].vector;
 
 	rc = bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
 				       false);
@@ -1569,9 +1569,9 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 	mutex_init(&rdev->nqr->load_lock);
 
 	for (i = 1; i < rdev->nqr->num_msix ; i++) {
-		db_offt = rdev->en_dev->msix_entries[i].db_offset;
+		db_offt = rdev->nqr->msix_entries[i].db_offset;
 		rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
-					  i - 1, rdev->en_dev->msix_entries[i].vector,
+					  i - 1, rdev->nqr->msix_entries[i].vector,
 					  db_offt, &bnxt_re_cqn_handler,
 					  &bnxt_re_srqn_handler);
 		if (rc) {
@@ -1658,7 +1658,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 		rattr.type = type;
 		rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
 		rattr.depth = BNXT_QPLIB_NQE_MAX_CNT - 1;
-		rattr.lrid = rdev->en_dev->msix_entries[i + 1].ring_idx;
+		rattr.lrid = rdev->nqr->msix_entries[i + 1].ring_idx;
 		rc = bnxt_re_net_ring_alloc(rdev, &rattr, &nq->ring_id);
 		if (rc) {
 			ibdev_err(&rdev->ibdev,
@@ -1983,6 +1983,8 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 		return rc;
 	}
 	rdev->nqr->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+	memcpy(rdev->nqr->msix_entries, rdev->en_dev->msix_entries,
+	       sizeof(struct bnxt_msix_entry) * rdev->nqr->num_msix);
 
 	/* Check whether VF or PF */
 	bnxt_re_get_sriov_func_type(rdev);
@@ -2008,14 +2010,14 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
 	rattr.type = type;
 	rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
 	rattr.depth = BNXT_QPLIB_CREQE_MAX_CNT - 1;
-	rattr.lrid = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].ring_idx;
+	rattr.lrid = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].ring_idx;
 	rc = bnxt_re_net_ring_alloc(rdev, &rattr, &creq->ring_id);
 	if (rc) {
 		ibdev_err(&rdev->ibdev, "Failed to allocate CREQ: %#x\n", rc);
 		goto free_rcfw;
 	}
-	db_offt = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].db_offset;
-	vid = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].vector;
+	db_offt = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].db_offset;
+	vid = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].vector;
 	rc = bnxt_qplib_enable_rcfw_channel(&rdev->rcfw,
 					    vid, db_offt,
 					    &bnxt_re_aeq_handler);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation
  2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
                   ` (3 preceding siblings ...)
  2024-11-14  9:49 ` [PATCH rdma-next v2 4/4] RDMA/bnxt_re: Cache MSIx info to a local structure Selvin Xavier
@ 2024-11-14 14:53 ` Leon Romanovsky
  4 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2024-11-14 14:53 UTC (permalink / raw)
  To: jgg, Selvin Xavier; +Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil


On Thu, 14 Nov 2024 01:49:04 -0800, Selvin Xavier wrote:
> Includes some generic improvments and code refactoring for the
> Notification Queue handling in the driver. Remove the data
> structures that store the NQ information out of the device
> structure. Fix few issues in selecting the NQ during CQ
> create. Also, fail the driver load if NIC driver can not
> allocate at least two NQs for RoCE.
> 
> [...]

Applied, thanks!

[1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved
      https://git.kernel.org/rdma/rdma/c/65ecee132774e0
[2/4] RDMA/bnxt_re: Refactor NQ allocation
      https://git.kernel.org/rdma/rdma/c/30b871338c3eba
[3/4] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation
      https://git.kernel.org/rdma/rdma/c/cb97b377a13589
[4/4] RDMA/bnxt_re: Cache MSIx info to a local structure
      https://git.kernel.org/rdma/rdma/c/31bad59805c388

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-11-14 14:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-14  9:49 [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
2024-11-14  9:49 ` [PATCH rdma-next v2 1/4] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved Selvin Xavier
2024-11-14  9:49 ` [PATCH rdma-next v2 2/4] RDMA/bnxt_re: Refactor NQ allocation Selvin Xavier
2024-11-14  9:49 ` [PATCH rdma-next v2 3/4] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation Selvin Xavier
2024-11-14  9:49 ` [PATCH rdma-next v2 4/4] RDMA/bnxt_re: Cache MSIx info to a local structure Selvin Xavier
2024-11-14 14:53 ` [PATCH rdma-next v2 0/4]RDMA/bnxt_re: Refactor Notification queue allocation Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox