* [rdma-next 1/5] RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved
2024-11-08 8:42 [rdma-next 0/5] RDMA/bnxt_re: Refactor Notification queue allocation Selvin Xavier
@ 2024-11-08 8:42 ` Selvin Xavier
2024-11-08 8:42 ` [rdma-next 2/5] RDMA/bnxt_re: Refactor NQ allocation Selvin Xavier
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Selvin Xavier @ 2024-11-08 8:42 UTC
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
The L2 driver allocates and populates the MSI-X vector details
for RoCE in the en_dev structure. The RoCE driver requires a
minimum of 2 MSI-X vectors. Hence, during probe, the driver has
to check and bail out if not enough MSI-X vectors are reserved
for it, before proceeding with further initialization.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 ++
drivers/infiniband/hw/bnxt_re/main.c | 21 +++++++++++----------
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index d1b7c20..7abc37b 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -154,6 +154,8 @@ struct bnxt_re_pacing {
#define BNXT_RE_GRC_FIFO_REG_BASE 0x2000
+#define BNXT_RE_MIN_MSIX 2
+
#define MAX_CQ_HASH_BITS (16)
#define MAX_SRQ_HASH_BITS (16)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index cb61941..f9826c4 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1936,6 +1936,17 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
}
set_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+ if (rdev->en_dev->ulp_tbl->msix_requested < BNXT_RE_MIN_MSIX) {
+ ibdev_err(&rdev->ibdev,
+ "RoCE requires minimum 2 MSI-X vectors, but only %d reserved\n",
+ rdev->en_dev->ulp_tbl->msix_requested);
+ clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+ return -EINVAL;
+ }
+ ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
+ rdev->en_dev->ulp_tbl->msix_requested);
+ rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+
rc = bnxt_re_setup_chip_ctx(rdev);
if (rc) {
bnxt_unregister_dev(rdev->en_dev);
@@ -1947,16 +1958,6 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
/* Check whether VF or PF */
bnxt_re_get_sriov_func_type(rdev);
- if (!rdev->en_dev->ulp_tbl->msix_requested) {
- ibdev_err(&rdev->ibdev,
- "Failed to get MSI-X vectors: %#x\n", rc);
- rc = -EINVAL;
- goto fail;
- }
- ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
- rdev->en_dev->ulp_tbl->msix_requested);
- rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
-
bnxt_re_query_hwrm_intf_version(rdev);
/* Establish RCFW Communication Channel to initialize the context
--
2.5.5
* [rdma-next 2/5] RDMA/bnxt_re: Refactor NQ allocation
From: Selvin Xavier @ 2024-11-08 8:42 UTC
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Move the NQ related data structures from rdev to a new
structure named "struct bnxt_re_nq_record", keeping a pointer
to it in the rdev structure, and allocate the memory for it
dynamically. This change is needed for subsequent patches in
the series.
Also, remove the nq_task variable from the rdev structure as
it is redundant and no longer used.
This change also helps to reduce the size of the driver
private structure.
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 13 +++---
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 6 +--
drivers/infiniband/hw/bnxt_re/main.c | 74 +++++++++++++++++++++-----------
3 files changed, 60 insertions(+), 33 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 7abc37b..d1c839e 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -155,6 +155,11 @@ struct bnxt_re_pacing {
#define BNXT_RE_GRC_FIFO_REG_BASE 0x2000
#define BNXT_RE_MIN_MSIX 2
+#define BNXT_RE_MAX_MSIX BNXT_MAX_ROCE_MSIX
+struct bnxt_re_nq_record {
+ struct bnxt_qplib_nq nq[BNXT_RE_MAX_MSIX];
+ int num_msix;
+};
#define MAX_CQ_HASH_BITS (16)
#define MAX_SRQ_HASH_BITS (16)
@@ -183,21 +188,17 @@ struct bnxt_re_dev {
unsigned int version, major, minor;
struct bnxt_qplib_chip_ctx *chip_ctx;
struct bnxt_en_dev *en_dev;
- int num_msix;
int id;
struct delayed_work worker;
u8 cur_prio_map;
- /* FP Notification Queue (CQ & SRQ) */
- struct tasklet_struct nq_task;
-
/* RCFW Channel */
struct bnxt_qplib_rcfw rcfw;
- /* NQ */
- struct bnxt_qplib_nq nq[BNXT_MAX_ROCE_MSIX];
+ /* NQ record */
+ struct bnxt_re_nq_record *nqr;
/* Device Resources */
struct bnxt_qplib_dev_attr dev_attr;
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 9a188cc..a9c32c0 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -1872,8 +1872,8 @@ int bnxt_re_create_srq(struct ib_srq *ib_srq,
srq->qplib_srq.wqe_size = bnxt_re_get_rwqe_size(dev_attr->max_srq_sges);
srq->qplib_srq.threshold = srq_init_attr->attr.srq_limit;
srq->srq_limit = srq_init_attr->attr.srq_limit;
- srq->qplib_srq.eventq_hw_ring_id = rdev->nq[0].ring_id;
- nq = &rdev->nq[0];
+ srq->qplib_srq.eventq_hw_ring_id = rdev->nqr->nq[0].ring_id;
+ nq = &rdev->nqr->nq[0];
if (udata) {
rc = bnxt_re_init_user_srq(rdev, pd, srq, udata);
@@ -3122,7 +3122,7 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
* used for getting the NQ index.
*/
nq_alloc_cnt = atomic_inc_return(&rdev->nq_alloc_cnt);
- nq = &rdev->nq[nq_alloc_cnt % (rdev->num_msix - 1)];
+ nq = &rdev->nqr->nq[nq_alloc_cnt % (rdev->nqr->num_msix - 1)];
cq->qplib_cq.max_wqe = entries;
cq->qplib_cq.cnq_hw_ring_id = nq->ring_id;
cq->qplib_cq.nq = nq;
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index f9826c4..9acc0e2 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -326,8 +326,8 @@ static void bnxt_re_stop_irq(void *handle)
rdev = en_info->rdev;
rcfw = &rdev->rcfw;
- for (indx = BNXT_RE_NQ_IDX; indx < rdev->num_msix; indx++) {
- nq = &rdev->nq[indx - 1];
+ for (indx = BNXT_RE_NQ_IDX; indx < rdev->nqr->num_msix; indx++) {
+ nq = &rdev->nqr->nq[indx - 1];
bnxt_qplib_nq_stop_irq(nq, false);
}
@@ -362,7 +362,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
/* Vectors may change after restart, so update with new vectors
* in device sctructure.
*/
- for (indx = 0; indx < rdev->num_msix; indx++)
+ for (indx = 0; indx < rdev->nqr->num_msix; indx++)
rdev->en_dev->msix_entries[indx].vector = ent[indx].vector;
rc = bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
@@ -371,8 +371,8 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
ibdev_warn(&rdev->ibdev, "Failed to reinit CREQ\n");
return;
}
- for (indx = BNXT_RE_NQ_IDX ; indx < rdev->num_msix; indx++) {
- nq = &rdev->nq[indx - 1];
+ for (indx = BNXT_RE_NQ_IDX ; indx < rdev->nqr->num_msix; indx++) {
+ nq = &rdev->nqr->nq[indx - 1];
rc = bnxt_qplib_nq_start_irq(nq, indx - 1,
msix_ent[indx].vector, false);
if (rc) {
@@ -1206,7 +1206,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
addrconf_addr_eui48((u8 *)&ibdev->node_guid, rdev->netdev->dev_addr);
- ibdev->num_comp_vectors = rdev->num_msix - 1;
+ ibdev->num_comp_vectors = rdev->nqr->num_msix - 1;
ibdev->dev.parent = &rdev->en_dev->pdev->dev;
ibdev->local_dma_lkey = BNXT_QPLIB_RSVD_LKEY;
@@ -1551,8 +1551,8 @@ static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
{
int i;
- for (i = 1; i < rdev->num_msix; i++)
- bnxt_qplib_disable_nq(&rdev->nq[i - 1]);
+ for (i = 1; i < rdev->nqr->num_msix; i++)
+ bnxt_qplib_disable_nq(&rdev->nqr->nq[i - 1]);
if (rdev->qplib_res.rcfw)
bnxt_qplib_cleanup_res(&rdev->qplib_res);
@@ -1566,9 +1566,9 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
bnxt_qplib_init_res(&rdev->qplib_res);
- for (i = 1; i < rdev->num_msix ; i++) {
+ for (i = 1; i < rdev->nqr->num_msix ; i++) {
db_offt = rdev->en_dev->msix_entries[i].db_offset;
- rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq[i - 1],
+ rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
i - 1, rdev->en_dev->msix_entries[i].vector,
db_offt, &bnxt_re_cqn_handler,
&bnxt_re_srqn_handler);
@@ -1582,20 +1582,22 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
return 0;
fail:
for (i = num_vec_enabled; i >= 0; i--)
- bnxt_qplib_disable_nq(&rdev->nq[i]);
+ bnxt_qplib_disable_nq(&rdev->nqr->nq[i]);
return rc;
}
static void bnxt_re_free_nq_res(struct bnxt_re_dev *rdev)
{
+ struct bnxt_qplib_nq *nq;
u8 type;
int i;
- for (i = 0; i < rdev->num_msix - 1; i++) {
+ for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
- bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type);
- bnxt_qplib_free_nq(&rdev->nq[i]);
- rdev->nq[i].res = NULL;
+ nq = &rdev->nqr->nq[i];
+ bnxt_re_net_ring_free(rdev, nq->ring_id, type);
+ bnxt_qplib_free_nq(nq);
+ nq->res = NULL;
}
}
@@ -1637,12 +1639,12 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
if (rc)
goto dealloc_res;
- for (i = 0; i < rdev->num_msix - 1; i++) {
+ for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
struct bnxt_qplib_nq *nq;
- nq = &rdev->nq[i];
+ nq = &rdev->nqr->nq[i];
nq->hwq.max_elements = BNXT_QPLIB_NQE_MAX_CNT;
- rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, &rdev->nq[i]);
+ rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, nq);
if (rc) {
ibdev_err(&rdev->ibdev, "Alloc Failed NQ%d rc:%#x",
i, rc);
@@ -1650,7 +1652,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
}
type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
rattr.dma_arr = nq->hwq.pbl[PBL_LVL_0].pg_map_arr;
- rattr.pages = nq->hwq.pbl[rdev->nq[i].hwq.level].pg_count;
+ rattr.pages = nq->hwq.pbl[rdev->nqr->nq[i].hwq.level].pg_count;
rattr.type = type;
rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
rattr.depth = BNXT_QPLIB_NQE_MAX_CNT - 1;
@@ -1660,7 +1662,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
ibdev_err(&rdev->ibdev,
"Failed to allocate NQ fw id with rc = 0x%x",
rc);
- bnxt_qplib_free_nq(&rdev->nq[i]);
+ bnxt_qplib_free_nq(nq);
goto free_nq;
}
num_vec_created++;
@@ -1669,8 +1671,8 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
free_nq:
for (i = num_vec_created - 1; i >= 0; i--) {
type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
- bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, type);
- bnxt_qplib_free_nq(&rdev->nq[i]);
+ bnxt_re_net_ring_free(rdev, rdev->nqr->nq[i].ring_id, type);
+ bnxt_qplib_free_nq(&rdev->nqr->nq[i]);
}
bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
&rdev->dpi_privileged);
@@ -1865,6 +1867,21 @@ static int bnxt_re_ib_init(struct bnxt_re_dev *rdev)
return rc;
}
+static int bnxt_re_alloc_nqr_mem(struct bnxt_re_dev *rdev)
+{
+ rdev->nqr = kzalloc(sizeof(*rdev->nqr), GFP_KERNEL);
+ if (!rdev->nqr)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void bnxt_re_free_nqr_mem(struct bnxt_re_dev *rdev)
+{
+ kfree(rdev->nqr);
+ rdev->nqr = NULL;
+}
+
static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
{
u8 type;
@@ -1894,11 +1911,12 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
}
- rdev->num_msix = 0;
+ rdev->nqr->num_msix = 0;
if (rdev->pacing.dbr_pacing)
bnxt_re_deinitialize_dbr_pacing(rdev);
+ bnxt_re_free_nqr_mem(rdev);
bnxt_re_destroy_chip_ctx(rdev);
if (op_type == BNXT_RE_COMPLETE_REMOVE) {
if (test_and_clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags))
@@ -1945,7 +1963,6 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
}
ibdev_dbg(&rdev->ibdev, "Got %d MSI-X vectors\n",
rdev->en_dev->ulp_tbl->msix_requested);
- rdev->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
rc = bnxt_re_setup_chip_ctx(rdev);
if (rc) {
@@ -1955,6 +1972,15 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
return -EINVAL;
}
+ rc = bnxt_re_alloc_nqr_mem(rdev);
+ if (rc) {
+ bnxt_re_destroy_chip_ctx(rdev);
+ bnxt_unregister_dev(rdev->en_dev);
+ clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
+ return rc;
+ }
+ rdev->nqr->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+
/* Check whether VF or PF */
bnxt_re_get_sriov_func_type(rdev);
--
2.5.5
* [rdma-next 3/5] RDMA/bnxt_re: Refurbish CQ to NQ hash calculation
From: Selvin Xavier @ 2024-11-08 8:42 UTC
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
There are a few use cases where a CQ is created and destroyed
before being re-created. This pattern disturbs the round-robin
distribution, with all the active CQs getting mapped to only
2 NQs alternately.
Fix the CQ to NQ hash calculation by implementing a quick
load-sorting mechanism under a mutex. With this, even if a CQ
was allocated and destroyed before being used, the NQ selection
algorithm still picks the least loaded NQ, thus balancing the
load on the NQs.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 ++
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 33 +++++++++++++++++++++++++-------
drivers/infiniband/hw/bnxt_re/main.c | 2 ++
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 1 +
drivers/infiniband/hw/bnxt_re/qplib_fp.h | 1 +
5 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index d1c839e..5f64fc4 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -159,6 +159,8 @@ struct bnxt_re_pacing {
struct bnxt_re_nq_record {
struct bnxt_qplib_nq nq[BNXT_RE_MAX_MSIX];
int num_msix;
+ /* serialize NQ access */
+ struct mutex load_lock;
};
#define MAX_CQ_HASH_BITS (16)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index a9c32c0..9b0a435 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3029,6 +3029,28 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
return rc;
}
+static struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev)
+{
+ int min, indx;
+
+ mutex_lock(&rdev->nqr->load_lock);
+ for (indx = 0, min = 0; indx < (rdev->nqr->num_msix - 1); indx++) {
+ if (rdev->nqr->nq[min].load > rdev->nqr->nq[indx].load)
+ min = indx;
+ }
+ rdev->nqr->nq[min].load++;
+ mutex_unlock(&rdev->nqr->load_lock);
+
+ return &rdev->nqr->nq[min];
+}
+
+static void bnxt_re_put_nq(struct bnxt_re_dev *rdev, struct bnxt_qplib_nq *nq)
+{
+ mutex_lock(&rdev->nqr->load_lock);
+ nq->load--;
+ mutex_unlock(&rdev->nqr->load_lock);
+}
+
/* Completion Queues */
int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
{
@@ -3047,6 +3069,8 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
hash_del(&cq->hash_entry);
}
bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
+
+ bnxt_re_put_nq(rdev, nq);
ib_umem_release(cq->umem);
atomic_dec(&rdev->stats.res.cq_count);
@@ -3065,8 +3089,6 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
struct bnxt_qplib_dev_attr *dev_attr = &rdev->dev_attr;
struct bnxt_qplib_chip_ctx *cctx;
- struct bnxt_qplib_nq *nq = NULL;
- unsigned int nq_alloc_cnt;
int cqe = attr->cqe;
int rc, entries;
u32 active_cqs;
@@ -3121,12 +3143,10 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
* Allocating the NQ in a round robin fashion. nq_alloc_cnt is a
* used for getting the NQ index.
*/
- nq_alloc_cnt = atomic_inc_return(&rdev->nq_alloc_cnt);
- nq = &rdev->nqr->nq[nq_alloc_cnt % (rdev->nqr->num_msix - 1)];
cq->qplib_cq.max_wqe = entries;
- cq->qplib_cq.cnq_hw_ring_id = nq->ring_id;
- cq->qplib_cq.nq = nq;
cq->qplib_cq.coalescing = &rdev->cq_coalescing;
+ cq->qplib_cq.nq = bnxt_re_get_nq(rdev);
+ cq->qplib_cq.cnq_hw_ring_id = cq->qplib_cq.nq->ring_id;
rc = bnxt_qplib_create_cq(&rdev->qplib_res, &cq->qplib_cq);
if (rc) {
@@ -3136,7 +3156,6 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
cq->ib_cq.cqe = entries;
cq->cq_period = cq->qplib_cq.period;
- nq->budget++;
active_cqs = atomic_inc_return(&rdev->stats.res.cq_count);
if (active_cqs > rdev->stats.res.cq_watermark)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 9acc0e2..e556110 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1566,6 +1566,8 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
bnxt_qplib_init_res(&rdev->qplib_res);
+ mutex_init(&rdev->nqr->load_lock);
+
for (i = 1; i < rdev->nqr->num_msix ; i++) {
db_offt = rdev->en_dev->msix_entries[i].db_offset;
rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index e2eea71..e56f42f 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -551,6 +551,7 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
nq->pdev = pdev;
nq->cqn_handler = cqn_handler;
nq->srqn_handler = srqn_handler;
+ nq->load = 0;
/* Have a task to schedule CQ notifiers in post send case */
nq->cqn_wq = create_singlethread_workqueue("bnxt_qplib_nq");
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index b5a90581..8ff56d7 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -519,6 +519,7 @@ struct bnxt_qplib_nq {
struct tasklet_struct nq_tasklet;
bool requested;
int budget;
+ u32 load;
cqn_handler_t cqn_handler;
srqn_handler_t srqn_handler;
--
2.5.5
* [rdma-next 4/5] RDMA/bnxt_re: Cache MSIx info to a local structure
From: Selvin Xavier @ 2024-11-08 8:42 UTC
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
The L2 driver allocates the vectors for RoCE and passes them
to the RoCE driver through the en_dev structure. During probe,
cache the MSI-X related info in a local structure.
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 1 +
drivers/infiniband/hw/bnxt_re/main.c | 18 ++++++++++--------
2 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 5f64fc4..2975b11 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -157,6 +157,7 @@ struct bnxt_re_pacing {
#define BNXT_RE_MIN_MSIX 2
#define BNXT_RE_MAX_MSIX BNXT_MAX_ROCE_MSIX
struct bnxt_re_nq_record {
+ struct bnxt_msix_entry msix_entries[BNXT_RE_MAX_MSIX];
struct bnxt_qplib_nq nq[BNXT_RE_MAX_MSIX];
int num_msix;
/* serialize NQ access */
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index e556110..1c7171a 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -347,7 +347,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
return;
rdev = en_info->rdev;
- msix_ent = rdev->en_dev->msix_entries;
+ msix_ent = rdev->nqr->msix_entries;
rcfw = &rdev->rcfw;
if (!ent) {
/* Not setting the f/w timeout bit in rcfw.
@@ -363,7 +363,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
* in device sctructure.
*/
for (indx = 0; indx < rdev->nqr->num_msix; indx++)
- rdev->en_dev->msix_entries[indx].vector = ent[indx].vector;
+ rdev->nqr->msix_entries[indx].vector = ent[indx].vector;
rc = bnxt_qplib_rcfw_start_irq(rcfw, msix_ent[BNXT_RE_AEQ_IDX].vector,
false);
@@ -1569,9 +1569,9 @@ static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
mutex_init(&rdev->nqr->load_lock);
for (i = 1; i < rdev->nqr->num_msix ; i++) {
- db_offt = rdev->en_dev->msix_entries[i].db_offset;
+ db_offt = rdev->nqr->msix_entries[i].db_offset;
rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
- i - 1, rdev->en_dev->msix_entries[i].vector,
+ i - 1, rdev->nqr->msix_entries[i].vector,
db_offt, &bnxt_re_cqn_handler,
&bnxt_re_srqn_handler);
if (rc) {
@@ -1658,7 +1658,7 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
rattr.type = type;
rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
rattr.depth = BNXT_QPLIB_NQE_MAX_CNT - 1;
- rattr.lrid = rdev->en_dev->msix_entries[i + 1].ring_idx;
+ rattr.lrid = rdev->nqr->msix_entries[i + 1].ring_idx;
rc = bnxt_re_net_ring_alloc(rdev, &rattr, &nq->ring_id);
if (rc) {
ibdev_err(&rdev->ibdev,
@@ -1982,6 +1982,8 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
return rc;
}
rdev->nqr->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
+ memcpy(rdev->nqr->msix_entries, rdev->en_dev->msix_entries,
+ sizeof(struct bnxt_msix_entry) * rdev->nqr->num_msix);
/* Check whether VF or PF */
bnxt_re_get_sriov_func_type(rdev);
@@ -2007,14 +2009,14 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
rattr.type = type;
rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
rattr.depth = BNXT_QPLIB_CREQE_MAX_CNT - 1;
- rattr.lrid = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].ring_idx;
+ rattr.lrid = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].ring_idx;
rc = bnxt_re_net_ring_alloc(rdev, &rattr, &creq->ring_id);
if (rc) {
ibdev_err(&rdev->ibdev, "Failed to allocate CREQ: %#x\n", rc);
goto free_rcfw;
}
- db_offt = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].db_offset;
- vid = rdev->en_dev->msix_entries[BNXT_RE_AEQ_IDX].vector;
+ db_offt = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].db_offset;
+ vid = rdev->nqr->msix_entries[BNXT_RE_AEQ_IDX].vector;
rc = bnxt_qplib_enable_rcfw_channel(&rdev->rcfw,
vid, db_offt,
&bnxt_re_aeq_handler);
--
2.5.5
* [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs
2024-11-12 8:17 ` Leon Romanovsky
4 siblings, 1 reply; 10+ messages in thread
From: Selvin Xavier @ 2024-11-08 8:42 UTC
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Move the logic to set up and enable the NQs to a new function.
Similarly, move the NQ cleanup logic to a common function.
Introduce a flag to keep track of the NQ allocation status,
and add sanity checks inside bnxt_re_stop_irq() and
bnxt_re_start_irq() to avoid possible race conditions.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +
drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++-------------
2 files changed, 123 insertions(+), 83 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 2975b11..74a340f 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -160,6 +160,7 @@ struct bnxt_re_nq_record {
struct bnxt_msix_entry msix_entries[BNXT_RE_MAX_MSIX];
struct bnxt_qplib_nq nq[BNXT_RE_MAX_MSIX];
int num_msix;
+ int max_init;
/* serialize NQ access */
struct mutex load_lock;
};
@@ -178,6 +179,7 @@ struct bnxt_re_dev {
struct list_head list;
unsigned long flags;
#define BNXT_RE_FLAG_NETDEV_REGISTERED 0
+#define BNXT_RE_FLAG_SETUP_NQ 1
#define BNXT_RE_FLAG_HAVE_L2_REF 3
#define BNXT_RE_FLAG_RCFW_CHANNEL_EN 4
#define BNXT_RE_FLAG_QOS_WORK_REG 5
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 1c7171a..87a54db 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -324,13 +324,19 @@ static void bnxt_re_stop_irq(void *handle)
return;
rdev = en_info->rdev;
+ if (!rdev)
+ return;
rcfw = &rdev->rcfw;
+ if (!test_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
+ goto free_rcfw_irq;
+
for (indx = BNXT_RE_NQ_IDX; indx < rdev->nqr->num_msix; indx++) {
nq = &rdev->nqr->nq[indx - 1];
bnxt_qplib_nq_stop_irq(nq, false);
}
+free_rcfw_irq:
bnxt_qplib_rcfw_stop_irq(rcfw, false);
}
@@ -341,12 +347,18 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
struct bnxt_qplib_rcfw *rcfw;
struct bnxt_re_dev *rdev;
struct bnxt_qplib_nq *nq;
- int indx, rc;
+ int indx, rc, vec;
if (!en_info)
return;
rdev = en_info->rdev;
+ if (!rdev)
+ return;
+
+ if (!test_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
+ return;
+
msix_ent = rdev->nqr->msix_entries;
rcfw = &rdev->rcfw;
if (!ent) {
@@ -360,7 +372,7 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
}
/* Vectors may change after restart, so update with new vectors
- * in device sctructure.
+ * in device structure.
*/
for (indx = 0; indx < rdev->nqr->num_msix; indx++)
rdev->nqr->msix_entries[indx].vector = ent[indx].vector;
@@ -371,10 +383,11 @@ static void bnxt_re_start_irq(void *handle, struct bnxt_msix_entry *ent)
ibdev_warn(&rdev->ibdev, "Failed to reinit CREQ\n");
return;
}
- for (indx = BNXT_RE_NQ_IDX ; indx < rdev->nqr->num_msix; indx++) {
- nq = &rdev->nqr->nq[indx - 1];
- rc = bnxt_qplib_nq_start_irq(nq, indx - 1,
- msix_ent[indx].vector, false);
+ for (indx = 0 ; indx < rdev->nqr->max_init; indx++) {
+ nq = &rdev->nqr->nq[indx];
+ vec = indx + 1;
+ rc = bnxt_qplib_nq_start_irq(nq, indx,
+ msix_ent[vec].vector, false);
if (rc) {
ibdev_warn(&rdev->ibdev, "Failed to reinit NQ index %d\n",
indx - 1);
@@ -1549,64 +1562,39 @@ static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq,
static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
{
- int i;
-
- for (i = 1; i < rdev->nqr->num_msix; i++)
- bnxt_qplib_disable_nq(&rdev->nqr->nq[i - 1]);
-
if (rdev->qplib_res.rcfw)
bnxt_qplib_cleanup_res(&rdev->qplib_res);
}
static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
{
- int num_vec_enabled = 0;
- int rc = 0, i;
- u32 db_offt;
-
bnxt_qplib_init_res(&rdev->qplib_res);
- mutex_init(&rdev->nqr->load_lock);
-
- for (i = 1; i < rdev->nqr->num_msix ; i++) {
- db_offt = rdev->nqr->msix_entries[i].db_offset;
- rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nqr->nq[i - 1],
- i - 1, rdev->nqr->msix_entries[i].vector,
- db_offt, &bnxt_re_cqn_handler,
- &bnxt_re_srqn_handler);
- if (rc) {
- ibdev_err(&rdev->ibdev,
- "Failed to enable NQ with rc = 0x%x", rc);
- goto fail;
- }
- num_vec_enabled++;
- }
return 0;
-fail:
- for (i = num_vec_enabled; i >= 0; i--)
- bnxt_qplib_disable_nq(&rdev->nqr->nq[i]);
- return rc;
}
-static void bnxt_re_free_nq_res(struct bnxt_re_dev *rdev)
+static void bnxt_re_clean_nqs(struct bnxt_re_dev *rdev)
{
struct bnxt_qplib_nq *nq;
u8 type;
int i;
- for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
+ if (!rdev->nqr->max_init)
+ return;
+
+ for (i = (rdev->nqr->max_init - 1); i >= 0; i--) {
type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
nq = &rdev->nqr->nq[i];
+ bnxt_qplib_disable_nq(nq);
bnxt_re_net_ring_free(rdev, nq->ring_id, type);
bnxt_qplib_free_nq(nq);
nq->res = NULL;
}
+ rdev->nqr->max_init = 0;
}
static void bnxt_re_free_res(struct bnxt_re_dev *rdev)
{
- bnxt_re_free_nq_res(rdev);
-
if (rdev->qplib_res.dpi_tbl.max) {
bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
&rdev->dpi_privileged);
@@ -1619,10 +1607,7 @@ static void bnxt_re_free_res(struct bnxt_re_dev *rdev)
static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
{
- struct bnxt_re_ring_attr rattr = {};
- int num_vec_created = 0;
- int rc, i;
- u8 type;
+ int rc;
/* Configure and allocate resources for qplib */
rdev->qplib_res.rcfw = &rdev->rcfw;
@@ -1641,43 +1626,8 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
if (rc)
goto dealloc_res;
- for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
- struct bnxt_qplib_nq *nq;
-
- nq = &rdev->nqr->nq[i];
- nq->hwq.max_elements = BNXT_QPLIB_NQE_MAX_CNT;
- rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, nq);
- if (rc) {
- ibdev_err(&rdev->ibdev, "Alloc Failed NQ%d rc:%#x",
- i, rc);
- goto free_nq;
- }
- type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
- rattr.dma_arr = nq->hwq.pbl[PBL_LVL_0].pg_map_arr;
- rattr.pages = nq->hwq.pbl[rdev->nqr->nq[i].hwq.level].pg_count;
- rattr.type = type;
- rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
- rattr.depth = BNXT_QPLIB_NQE_MAX_CNT - 1;
- rattr.lrid = rdev->nqr->msix_entries[i + 1].ring_idx;
- rc = bnxt_re_net_ring_alloc(rdev, &rattr, &nq->ring_id);
- if (rc) {
- ibdev_err(&rdev->ibdev,
- "Failed to allocate NQ fw id with rc = 0x%x",
- rc);
- bnxt_qplib_free_nq(nq);
- goto free_nq;
- }
- num_vec_created++;
- }
return 0;
-free_nq:
- for (i = num_vec_created - 1; i >= 0; i--) {
- type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
- bnxt_re_net_ring_free(rdev, rdev->nqr->nq[i].ring_id, type);
- bnxt_qplib_free_nq(&rdev->nqr->nq[i]);
- }
- bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
- &rdev->dpi_privileged);
+
dealloc_res:
bnxt_qplib_free_res(&rdev->qplib_res);
@@ -1884,6 +1834,71 @@ static void bnxt_re_free_nqr_mem(struct bnxt_re_dev *rdev)
rdev->nqr = NULL;
}
+static int bnxt_re_setup_nqs(struct bnxt_re_dev *rdev)
+{
+ struct bnxt_re_ring_attr rattr = {};
+ struct bnxt_qplib_nq *nq;
+ int rc, i;
+ int depth;
+ u32 offt;
+ u16 vec;
+ u8 type;
+
+ mutex_init(&rdev->nqr->load_lock);
+
+ depth = BNXT_QPLIB_NQE_MAX_CNT;
+ for (i = 0; i < rdev->nqr->num_msix - 1; i++) {
+ nq = &rdev->nqr->nq[i];
+ vec = rdev->nqr->msix_entries[i + 1].vector;
+ offt = rdev->nqr->msix_entries[i + 1].db_offset;
+ nq->hwq.max_elements = depth;
+ rc = bnxt_qplib_alloc_nq(&rdev->qplib_res, nq);
+ if (rc) {
+ dev_err(rdev_to_dev(rdev),
+ "Failed to get mem for NQ %d, rc = 0x%x",
+ i, rc);
+ goto fail_mem;
+ }
+
+ type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
+ rattr.dma_arr = nq->hwq.pbl[PBL_LVL_0].pg_map_arr;
+ rattr.pages = nq->hwq.pbl[rdev->nqr->nq[i].hwq.level].pg_count;
+ rattr.type = type;
+ rattr.mode = RING_ALLOC_REQ_INT_MODE_MSIX;
+ rattr.depth = nq->hwq.max_elements - 1;
+ rattr.lrid = rdev->nqr->msix_entries[i + 1].ring_idx;
+
+ rc = bnxt_re_net_ring_alloc(rdev, &rattr, &nq->ring_id);
+ if (rc) {
+ nq->ring_id = 0xffff; /* Invalid ring-id */
+ dev_err(rdev_to_dev(rdev),
+ "Failed to get fw id for NQ %d, rc = 0x%x",
+ i, rc);
+ goto fail_ring;
+ }
+
+ rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, nq, i, vec, offt,
+ &bnxt_re_cqn_handler,
+ &bnxt_re_srqn_handler);
+ if (rc) {
+ dev_err(rdev_to_dev(rdev),
+ "Failed to enable NQ %d, rc = 0x%x", i, rc);
+ goto fail_en;
+ }
+ }
+
+ rdev->nqr->max_init = i;
+ return 0;
+fail_en:
+ /* *nq was i'th nq */
+ bnxt_re_net_ring_free(rdev, nq->ring_id, type);
+fail_ring:
+ bnxt_qplib_free_nq(nq);
+fail_mem:
+ rdev->nqr->max_init = i;
+ return rc;
+}
+
static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
{
u8 type;
@@ -1894,6 +1909,11 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
cancel_delayed_work_sync(&rdev->worker);
+ rtnl_lock();
+ if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
+ bnxt_re_clean_nqs(rdev);
+ rtnl_unlock();
+
if (test_and_clear_bit(BNXT_RE_FLAG_RESOURCES_INITIALIZED,
&rdev->flags))
bnxt_re_cleanup_res(rdev);
@@ -1906,10 +1926,12 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev, u8 op_type)
ibdev_warn(&rdev->ibdev,
"Failed to deinitialize RCFW: %#x", rc);
bnxt_re_net_stats_ctx_free(rdev, rdev->qplib_ctx.stats.fw_id);
+ rtnl_lock();
bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
+ rtnl_unlock();
bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
}
@@ -1974,6 +1996,11 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
return -EINVAL;
}
+ /* Check whether VF or PF */
+ bnxt_re_get_sriov_func_type(rdev);
+
+ bnxt_re_query_hwrm_intf_version(rdev);
+
rc = bnxt_re_alloc_nqr_mem(rdev);
if (rc) {
bnxt_re_destroy_chip_ctx(rdev);
@@ -1981,15 +2008,11 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
clear_bit(BNXT_RE_FLAG_NETDEV_REGISTERED, &rdev->flags);
return rc;
}
+ rtnl_lock();
rdev->nqr->num_msix = rdev->en_dev->ulp_tbl->msix_requested;
memcpy(rdev->nqr->msix_entries, rdev->en_dev->msix_entries,
sizeof(struct bnxt_msix_entry) * rdev->nqr->num_msix);
- /* Check whether VF or PF */
- bnxt_re_get_sriov_func_type(rdev);
-
- bnxt_re_query_hwrm_intf_version(rdev);
-
/* Establish RCFW Communication Channel to initialize the context
* memory for the function and all child VFs
*/
@@ -2083,6 +2106,20 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
}
set_bit(BNXT_RE_FLAG_RESOURCES_INITIALIZED, &rdev->flags);
+ rc = bnxt_re_setup_nqs(rdev);
+ if (rc) {
+ if (rdev->nqr->max_init == 0) {
+ dev_err(rdev_to_dev(rdev),
+ "Failed to setup NQs rc = %#x\n", rc);
+ goto fail;
+ }
+
+ dev_warn(rdev_to_dev(rdev),
+ "expected nqs %d available nqs %d\n",
+ rdev->nqr->num_msix, rdev->nqr->max_init);
+ }
+ set_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags);
+ rtnl_unlock();
if (!rdev->is_virtfn) {
rc = bnxt_re_setup_qos(rdev);
@@ -2116,6 +2153,7 @@ static int bnxt_re_dev_init(struct bnxt_re_dev *rdev, u8 op_type)
free_rcfw:
bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
fail:
+ rtnl_unlock();
bnxt_re_dev_uninit(rdev, BNXT_RE_COMPLETE_REMOVE);
return rc;
--
2.5.5
^ permalink raw reply related [flat|nested] 10+ messages in thread

* Re: [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs
2024-11-08 8:42 ` [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs Selvin Xavier
@ 2024-11-12 8:17 ` Leon Romanovsky
2024-11-12 9:25 ` Selvin Xavier
0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2024-11-12 8:17 UTC (permalink / raw)
To: Selvin Xavier; +Cc: jgg, linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil
On Fri, Nov 08, 2024 at 12:42:39AM -0800, Selvin Xavier wrote:
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
> Move the logic to setup and enable NQs to a new function.
> Similarly moved the NQ cleanup logic to a common function.
> Introduced a flag to keep track of NQ allocation status
> and added sanity checks inside bnxt_re_stop_irq() and
> bnxt_re_start_irq() to avoid possible race conditions.
>
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> ---
> drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +
> drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++-------------
> 2 files changed, 123 insertions(+), 83 deletions(-)
<...>
>
> + rtnl_lock();
> + if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
> + bnxt_re_clean_nqs(rdev);
> + rtnl_unlock();
<...>
> + rtnl_lock();
> bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
> bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
> type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
> bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
> + rtnl_unlock();
Please don't add rtnl_lock() to drivers in the RDMA subsystem. The BNXT
driver is managed through netdev, and it is there that all proper locking
should be done.
Please work to remove the existing rtnl_lock() from bnxt_re_update_en_info_rdev() too.
IMHO that lock is not needed after your driver's conversion to auxbus.
Thanks
* Re: [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs
2024-11-12 8:17 ` Leon Romanovsky
@ 2024-11-12 9:25 ` Selvin Xavier
2024-11-12 10:34 ` Leon Romanovsky
0 siblings, 1 reply; 10+ messages in thread
From: Selvin Xavier @ 2024-11-12 9:25 UTC (permalink / raw)
To: Leon Romanovsky, Michael Chan
Cc: jgg, linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil
+Michael Chan
On Tue, Nov 12, 2024 at 1:47 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Nov 08, 2024 at 12:42:39AM -0800, Selvin Xavier wrote:
> > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >
> > Move the logic to setup and enable NQs to a new function.
> > Similarly moved the NQ cleanup logic to a common function.
> > Introduced a flag to keep track of NQ allocation status
> > and added sanity checks inside bnxt_re_stop_irq() and
> > bnxt_re_start_irq() to avoid possible race conditions.
> >
> > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> > ---
> > drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +
> > drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++-------------
> > 2 files changed, 123 insertions(+), 83 deletions(-)
>
> <...>
>
> >
> > + rtnl_lock();
> > + if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
> > + bnxt_re_clean_nqs(rdev);
> > + rtnl_unlock();
>
> <...>
>
> > + rtnl_lock();
> > bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
> > bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
> > type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
> > bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
> > + rtnl_unlock();
>
> Please don't add rtnl_lock() to drivers in RDMA subsystem. BNXT driver
> is managed through netdev and it is there all proper locking should be
> done.
The main reason for bnxt_re to take the rtnl lock is the MSI-x resource
configuration. The NIC driver dynamically modifies the MSI-x table when
the number of rings changes or when ndo_open/ndo_close is invoked, so we
also stop and restart the RoCE interrupts with rtnl held.
>
> Please work to remove existing rtnl_lock() from bnxt_re_update_en_info_rdev() too.
> IMHO that lock is not needed after your driver conversion to auxbus.
This is also needed to synchronize the irq_stop/irq_start implementation
shared between the bnxt_en and bnxt_re drivers with RoCE driver unload.
We will review this locking and see if we can handle it, but it is a
major design change in both the L2 and RoCE drivers.
>
> Thanks
* Re: [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs
2024-11-12 9:25 ` Selvin Xavier
@ 2024-11-12 10:34 ` Leon Romanovsky
2024-11-13 5:44 ` Selvin Xavier
0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2024-11-12 10:34 UTC (permalink / raw)
To: Selvin Xavier
Cc: Michael Chan, jgg, linux-rdma, andrew.gospodarek,
kalesh-anakkur.purayil
On Tue, Nov 12, 2024 at 02:55:12PM +0530, Selvin Xavier wrote:
> +Michael Chan
>
> On Tue, Nov 12, 2024 at 1:47 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Nov 08, 2024 at 12:42:39AM -0800, Selvin Xavier wrote:
> > > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > >
> > > Move the logic to setup and enable NQs to a new function.
> > > Similarly moved the NQ cleanup logic to a common function.
> > > Introduced a flag to keep track of NQ allocation status
> > > and added sanity checks inside bnxt_re_stop_irq() and
> > > bnxt_re_start_irq() to avoid possible race conditions.
> > >
> > > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > > Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> > > ---
> > > drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +
> > > drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++-------------
> > > 2 files changed, 123 insertions(+), 83 deletions(-)
> >
> > <...>
> >
> > >
> > > + rtnl_lock();
> > > + if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
> > > + bnxt_re_clean_nqs(rdev);
> > > + rtnl_unlock();
> >
> > <...>
> >
> > > + rtnl_lock();
> > > bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
> > > bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
> > > type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
> > > bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
> > > + rtnl_unlock();
> >
> > Please don't add rtnl_lock() to drivers in RDMA subsystem. BNXT driver
> > is managed through netdev and it is there all proper locking should be
> > done.
> The main reason for bnxt_re to take the rtnl is because of the MSIx
> resource configuration.
> This is because the NIC driver is dynamically modifying the MSIx table
> when the number
> of ring change or ndo->open/close is invoked. So we stop and restart
> the interrupts of RoCE also with rtnl held.
rtnl_lock is a big kernel lock that blocks almost everything in netdev.
In your case, you are changing one device's configuration and should use
your own per-device locks. Even in a system with more than one BNXT
device, the MSI-X reconfiguration on one device will influence other
"innocent" devices.
> >
> > Please work to remove existing rtnl_lock() from bnxt_re_update_en_info_rdev() too.
> > IMHO that lock is not needed after your driver conversion to auxbus.
> This check is also to synchronize between the irq_stop and restart
> implementation between
> bnxt_en and bnxt_re driver and roce driver unload.
>
> We will review this locking and see if we can handle it. But it is a
> major design change in both L2
> and roce drivers.
You are adding a new rtnl_lock, not moving it from one place to another,
so this redesign should be done together with this new feature.
Thanks
> >
> > Thanks
* Re: [rdma-next 5/5] RDMA/bnxt_re: Add new function to setup NQs
2024-11-12 10:34 ` Leon Romanovsky
@ 2024-11-13 5:44 ` Selvin Xavier
0 siblings, 0 replies; 10+ messages in thread
From: Selvin Xavier @ 2024-11-13 5:44 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Michael Chan, jgg, linux-rdma, andrew.gospodarek,
kalesh-anakkur.purayil
On Tue, Nov 12, 2024 at 4:04 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Tue, Nov 12, 2024 at 02:55:12PM +0530, Selvin Xavier wrote:
> > +Michael Chan
> >
> > On Tue, Nov 12, 2024 at 1:47 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Fri, Nov 08, 2024 at 12:42:39AM -0800, Selvin Xavier wrote:
> > > > From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > > >
> > > > Move the logic to setup and enable NQs to a new function.
> > > > Similarly moved the NQ cleanup logic to a common function.
> > > > Introduced a flag to keep track of NQ allocation status
> > > > and added sanity checks inside bnxt_re_stop_irq() and
> > > > bnxt_re_start_irq() to avoid possible race conditions.
> > > >
> > > > Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> > > > Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> > > > ---
> > > > drivers/infiniband/hw/bnxt_re/bnxt_re.h | 2 +
> > > > drivers/infiniband/hw/bnxt_re/main.c | 204 +++++++++++++++++++-------------
> > > > 2 files changed, 123 insertions(+), 83 deletions(-)
> > >
> > > <...>
> > >
> > > >
> > > > + rtnl_lock();
> > > > + if (test_and_clear_bit(BNXT_RE_FLAG_SETUP_NQ, &rdev->flags))
> > > > + bnxt_re_clean_nqs(rdev);
> > > > + rtnl_unlock();
> > >
> > > <...>
> > >
> > > > + rtnl_lock();
> > > > bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
> > > > bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
> > > > type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
> > > > bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
> > > > + rtnl_unlock();
> > >
> > > Please don't add rtnl_lock() to drivers in RDMA subsystem. BNXT driver
> > > is managed through netdev and it is there all proper locking should be
> > > done.
> > The main reason for bnxt_re to take the rtnl is because of the MSIx
> > resource configuration.
> > This is because the NIC driver is dynamically modifying the MSIx table
> > when the number
> > of ring change or ndo->open/close is invoked. So we stop and restart
> > the interrupts of RoCE also with rtnl held.
>
> rtnl_lock is a big kernel lock, which blocks almost everything in the netdev.
> In your case, you are changing one device configuration and should use
> your per-device locks. Even in the system with more than one BNXT device,
> the MSI-X on one device will influence other "innocent" devices.
>
> > >
> > > Please work to remove existing rtnl_lock() from bnxt_re_update_en_info_rdev() too.
> > > IMHO that lock is not needed after your driver conversion to auxbus.
> > This check is also to synchronize between the irq_stop and restart
> > implementation between
> > bnxt_en and bnxt_re driver and roce driver unload.
> >
> > We will review this locking and see if we can handle it. But it is a
> > major design change in both L2
> > and roce drivers.
>
> You are adding new rtnl_lock and not moving it from one place to
> another, so this redesign should be done together with this new
> feature.
Ok. Will drop this patch and post a v2.
>
> Thanks
>
> > >
> > > Thanks
>
>