* [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes
@ 2023-05-11 7:26 Selvin Xavier
2023-05-17 19:35 ` Jason Gunthorpe
0 siblings, 1 reply; 15+ messages in thread
From: Selvin Xavier @ 2023-05-11 7:26 UTC (permalink / raw)
To: jgg, leon; +Cc: linux-rdma, andrew.gospodarek, Selvin Xavier
[-- Attachment #1: Type: text/plain, Size: 1167 bytes --]
Includes some of the generic bug fixes in bnxt_re driver.
Please review and apply.
Thanks,
Selvin Xavier
Kalesh AP (9):
RDMA/bnxt_re: Fix a possible memory leak
RDMA/bnxt_re: Fix to remove unnecessary return labels
RDMA/bnxt_re: Use unique names while registering interrupts
RDMA/bnxt_re: Remove a redundant check inside bnxt_re_update_gid
RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
RDMA/bnxt_re: Fix to remove an unnecessary log
RDMA/bnxt_re: Do not enable congestion control on VFs
RDMA/bnxt_re: Return directly without goto jumps
RDMA/bnxt_re: Remove unnecessary checks
Selvin Xavier (1):
RDMA/bnxt_re: Disable/kill tasklet only if it is enabled
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 2 +-
drivers/infiniband/hw/bnxt_re/main.c | 24 ++++---
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 111 ++++++++++++++---------------
drivers/infiniband/hw/bnxt_re/qplib_fp.h | 2 +-
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 29 ++++++--
drivers/infiniband/hw/bnxt_re/qplib_rcfw.h | 1 +
drivers/infiniband/hw/bnxt_re/qplib_sp.c | 8 ---
7 files changed, 91 insertions(+), 86 deletions(-)
--
2.5.5
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4224 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes
2023-05-11 7:26 Selvin Xavier
@ 2023-05-17 19:35 ` Jason Gunthorpe
2023-05-18 8:17 ` Selvin Xavier
0 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2023-05-17 19:35 UTC (permalink / raw)
To: Selvin Xavier; +Cc: leon, linux-rdma, andrew.gospodarek
On Thu, May 11, 2023 at 12:26:15AM -0700, Selvin Xavier wrote:
> Includes some of the generic bug fixes in bnxt_re driver.
> Please review and apply.
>
> Thanks,
> Selvin Xavier
>
> Kalesh AP (9):
> RDMA/bnxt_re: Fix a possible memory leak
> RDMA/bnxt_re: Fix to remove unnecessary return labels
> RDMA/bnxt_re: Use unique names while registering interrupts
> RDMA/bnxt_re: Remove a redundant check inside bnxt_re_update_gid
> RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
> RDMA/bnxt_re: Fix to remove an unnecessary log
> RDMA/bnxt_re: Do not enable congestion control on VFs
> RDMA/bnxt_re: Return directly without goto jumps
> RDMA/bnxt_re: Remove unnecessary checks
>
> Selvin Xavier (1):
> RDMA/bnxt_re: Disable/kill tasklet only if it is enabled
A lot of this is not for-rc material, please split it up properly
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes
2023-05-17 19:35 ` Jason Gunthorpe
@ 2023-05-18 8:17 ` Selvin Xavier
0 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2023-05-18 8:17 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: leon, linux-rdma, andrew.gospodarek
[-- Attachment #1: Type: text/plain, Size: 1057 bytes --]
On Thu, May 18, 2023 at 1:05 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Thu, May 11, 2023 at 12:26:15AM -0700, Selvin Xavier wrote:
> > Includes some of the generic bug fixes in bnxt_re driver.
> > Please review and apply.
> >
> > Thanks,
> > Selvin Xavier
> >
> > Kalesh AP (9):
> > RDMA/bnxt_re: Fix a possible memory leak
> > RDMA/bnxt_re: Fix to remove unnecessary return labels
> > RDMA/bnxt_re: Use unique names while registering interrupts
> > RDMA/bnxt_re: Remove a redundant check inside bnxt_re_update_gid
> > RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
> > RDMA/bnxt_re: Fix to remove an unnecessary log
> > RDMA/bnxt_re: Do not enable congestion control on VFs
> > RDMA/bnxt_re: Return directly without goto jumps
> > RDMA/bnxt_re: Remove unnecessary checks
> >
> > Selvin Xavier (1):
> > RDMA/bnxt_re: Disable/kill tasklet only if it is enabled
>
> A lot of this is not for-rc material, please split it up properly
sure. will post it as two series.
-Selvin
>
> Jason
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4224 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes
@ 2024-10-08 7:41 Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 01/10] RDMA/bnxt_re: Fix the max CQ WQEs for older adapters Selvin Xavier
` (10 more replies)
0 siblings, 11 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
Few bugfixes for bnxt_re driver. Please review and apply.
Thanks,
Selvin Xavier
Abhishek Mohapatra (1):
RDMA/bnxt_re: Fix the max CQ WQEs for older adapters
Bhargava Chenna Marreddy (1):
RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages
Chandramohan Akula (1):
RDMA/bnxt_re: Change the sequence of updating the CQ toggle value
Kalesh AP (5):
RDMA/bnxt_re: Fix out of bound check
RDMA/ bnxt_re: Return more meaningful error
RDMA/bnxt_re: Fix a possible NULL pointer dereference
RDMA/bnxt_re: Fix an error path in bnxt_re_add_device
RDMA/bnxt_re: Fix the GID table length
Kashyap Desai (1):
RDMA/bnxt_re: Fix incorrect dereference of srq in async event
Selvin Xavier (1):
RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
drivers/infiniband/hw/bnxt_re/hw_counters.c | 2 +-
drivers/infiniband/hw/bnxt_re/main.c | 42 ++++++++++++++---------------
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 5 ++++
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 2 +-
drivers/infiniband/hw/bnxt_re/qplib_res.c | 19 +++----------
drivers/infiniband/hw/bnxt_re/qplib_sp.c | 11 +++++++-
drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 +
7 files changed, 42 insertions(+), 40 deletions(-)
--
2.5.5
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH for-rc 01/10] RDMA/bnxt_re: Fix the max CQ WQEs for older adapters
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 02/10] RDMA/bnxt_re: Fix out of bound check Selvin Xavier
` (9 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Abhishek Mohapatra, Selvin Xavier
From: Abhishek Mohapatra <abhishek.mohapatra@broadcom.com>
Older adapters doesn't support the MAX CQ WQEs reported by
older FW. So restrict the value reported to 1M always for
older adapters.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Abhishek Mohapatra<abhishek.mohapatra@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/qplib_sp.c | 2 ++
drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 +
2 files changed, 3 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 4f75e7e..32c1cc7 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -140,6 +140,8 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
min_t(u32, sb->max_sge_var_wqe, BNXT_VAR_MAX_SGE) : 6;
attr->max_cq = le32_to_cpu(sb->max_cq);
attr->max_cq_wqes = le32_to_cpu(sb->max_cqe);
+ if (!bnxt_qplib_is_chip_gen_p7(rcfw->res->cctx))
+ attr->max_cq_wqes = min_t(u32, BNXT_QPLIB_MAX_CQ_WQES, attr->max_cq_wqes);
attr->max_cq_sges = attr->max_qp_sges;
attr->max_mr = le32_to_cpu(sb->max_mr);
attr->max_mw = le32_to_cpu(sb->max_mw);
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index acd9c14..ecf3f45 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -56,6 +56,7 @@ struct bnxt_qplib_dev_attr {
u32 max_qp_wqes;
u32 max_qp_sges;
u32 max_cq;
+#define BNXT_QPLIB_MAX_CQ_WQES 0xfffff
u32 max_cq_wqes;
u32 max_cq_sges;
u32 max_mr;
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 02/10] RDMA/bnxt_re: Fix out of bound check
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 01/10] RDMA/bnxt_re: Fix the max CQ WQEs for older adapters Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 03/10] RDMA/bnxt_re: Fix incorrect dereference of srq in async event Selvin Xavier
` (8 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Driver exports pacing stats only on GenP5 and P7 adapters. But while
parsing the pacing stats, driver has a check for "rdev->dbr_pacing".
This caused a trace when KASAN is enabled.
BUG: KASAN: slab-out-of-bounds in bnxt_re_get_hw_stats+0x2b6a/0x2e00 [bnxt_re]
Write of size 8 at addr ffff8885942a6340 by task modprobe/4809
Fixes: 8b6573ff3420 ("bnxt_re: Update the debug counters for doorbell pacing")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/hw_counters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.c b/drivers/infiniband/hw/bnxt_re/hw_counters.c
index 128651c..1e63f80 100644
--- a/drivers/infiniband/hw/bnxt_re/hw_counters.c
+++ b/drivers/infiniband/hw/bnxt_re/hw_counters.c
@@ -366,7 +366,7 @@ int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
goto done;
}
}
- if (rdev->pacing.dbr_pacing)
+ if (rdev->pacing.dbr_pacing && bnxt_qplib_is_chip_gen_p5_p7(rdev->chip_ctx))
bnxt_re_copy_db_pacing_stats(rdev, stats);
}
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 03/10] RDMA/bnxt_re: Fix incorrect dereference of srq in async event
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 01/10] RDMA/bnxt_re: Fix the max CQ WQEs for older adapters Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 02/10] RDMA/bnxt_re: Fix out of bound check Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 04/10] RDMA/ bnxt_re: Return more meaningful error Selvin Xavier
` (7 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Kashyap Desai, Selvin Xavier
From: Kashyap Desai <kashyap.desai@broadcom.com>
Currently driver is not getting correct srq. Dereference only if
qplib has a valid srq.
Fixes: b02fd3f79ec3 ("RDMA/bnxt_re: Report async events and errors")
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/main.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 686e405..dd39948 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1028,12 +1028,15 @@ static int bnxt_re_handle_unaffi_async_event(struct creq_func_event
static int bnxt_re_handle_qp_async_event(struct creq_qp_event *qp_event,
struct bnxt_re_qp *qp)
{
- struct bnxt_re_srq *srq = container_of(qp->qplib_qp.srq, struct bnxt_re_srq,
- qplib_srq);
struct creq_qp_error_notification *err_event;
+ struct bnxt_re_srq *srq = NULL;
struct ib_event event = {};
unsigned int flags;
+ if (qp->qplib_qp.srq)
+ srq = container_of(qp->qplib_qp.srq, struct bnxt_re_srq,
+ qplib_srq);
+
if (qp->qplib_qp.state == CMDQ_MODIFY_QP_NEW_STATE_ERR &&
rdma_is_kernel_res(&qp->ib_qp.res)) {
flags = bnxt_re_lock_cqs(qp);
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 04/10] RDMA/ bnxt_re: Return more meaningful error
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (2 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 03/10] RDMA/bnxt_re: Fix incorrect dereference of srq in async event Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 05/10] RDMA/bnxt_re: Fix a possible NULL pointer dereference Selvin Xavier
` (6 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
When the HWRM command fails, driver currently returns
-EFAULT(Bad address). This does not look correct.
Modified to return -EIO(I/O error).
Fixes: cc1ec769b87c ("RDMA/bnxt_re: Fixing the Control path command and response handling")
Fixes: 65288a22ddd8 ("RDMA/bnxt_re: use shadow qd while posting non blocking rcfw command")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 3ffaef0c..7294221 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -525,7 +525,7 @@ static int __bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
/* failed with status */
dev_err(&rcfw->pdev->dev, "cmdq[%#x]=%#x status %#x\n",
cookie, opcode, evnt->status);
- rc = -EFAULT;
+ rc = -EIO;
}
return rc;
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 05/10] RDMA/bnxt_re: Fix a possible NULL pointer dereference
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (3 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 04/10] RDMA/ bnxt_re: Return more meaningful error Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 06/10] RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop Selvin Xavier
` (5 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
There is a possibility of a NULL pointer dereference in the failure
path of bnxt_re_add_device().
To address that, moved the update of "rdev->adev" to bnxt_re_dev_add().
Fixes: dee3da3422d5 ("RDMA/bnxt_re: Change aux driver data to en_info to hold more information")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-rdma/CAH-L+nMCwymKGqf5pd8-FZNhxEkDD=kb6AoCaE6fAVi7b3e5Qw@mail.gmail.com/T/#t
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index dd39948..915b0d3 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -960,7 +960,7 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
return ib_register_device(ibdev, "bnxt_re%d", &rdev->en_dev->pdev->dev);
}
-static struct bnxt_re_dev *bnxt_re_dev_add(struct bnxt_aux_priv *aux_priv,
+static struct bnxt_re_dev *bnxt_re_dev_add(struct auxiliary_device *adev,
struct bnxt_en_dev *en_dev)
{
struct bnxt_re_dev *rdev;
@@ -976,6 +976,7 @@ static struct bnxt_re_dev *bnxt_re_dev_add(struct bnxt_aux_priv *aux_priv,
rdev->nb.notifier_call = NULL;
rdev->netdev = en_dev->net;
rdev->en_dev = en_dev;
+ rdev->adev = adev;
rdev->id = rdev->en_dev->pdev->devfn;
INIT_LIST_HEAD(&rdev->qp_list);
mutex_init(&rdev->qp_lock);
@@ -1829,7 +1830,6 @@ static void bnxt_re_update_en_info_rdev(struct bnxt_re_dev *rdev,
*/
rtnl_lock();
en_info->rdev = rdev;
- rdev->adev = adev;
rtnl_unlock();
}
@@ -1846,7 +1846,7 @@ static int bnxt_re_add_device(struct auxiliary_device *adev, u8 op_type)
en_dev = en_info->en_dev;
- rdev = bnxt_re_dev_add(aux_priv, en_dev);
+ rdev = bnxt_re_dev_add(adev, en_dev);
if (!rdev || !rdev_to_dev(rdev)) {
rc = -ENOMEM;
goto exit;
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 06/10] RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (4 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 05/10] RDMA/bnxt_re: Fix a possible NULL pointer dereference Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 07/10] RDMA/bnxt_re: Fix an error path in bnxt_re_add_device Selvin Xavier
` (4 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
Driver waits indefinitely for the fifo occupancy to go below
a threshold as soon as the pacing interrupt is received. This can
cause soft lockup on one of the processors, if the rate of DB is very high.
Add a loop count for FPGA and exit the __wait_for_fifo_occupancy_below_th
if the loop is taking more time. Pacing will be continuing until the
occupancy is below the threshold. This is ensured by the
checks in bnxt_re_pacing_timer_exp and further scheduling the
work for pacing based on the fifo occupancy.
Fixes: 2ad4e6303a6d ("RDMA/bnxt_re: Implement doorbell pacing algorithm")
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 915b0d3..b1dcb6b 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -534,6 +534,7 @@ static bool is_dbr_fifo_full(struct bnxt_re_dev *rdev)
static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
{
struct bnxt_qplib_db_pacing_data *pacing_data = rdev->qplib_res.pacing_data;
+ u32 retry_fifo_check = 1000;
u32 fifo_occup;
/* loop shouldn't run infintely as the occupancy usually goes
@@ -547,6 +548,14 @@ static void __wait_for_fifo_occupancy_below_th(struct bnxt_re_dev *rdev)
if (fifo_occup < pacing_data->pacing_th)
break;
+ if (!retry_fifo_check--) {
+ dev_info_once(rdev_to_dev(rdev),
+ "%s: fifo_occup = 0x%xfifo_max_depth = 0x%x pacing_th = 0x%x\n",
+ __func__, fifo_occup, pacing_data->fifo_max_depth,
+ pacing_data->pacing_th);
+ break;
+ }
+
}
}
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 07/10] RDMA/bnxt_re: Fix an error path in bnxt_re_add_device
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (5 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 06/10] RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 08/10] RDMA/bnxt_re: Change the sequence of updating the CQ toggle value Selvin Xavier
` (3 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
In bnxt_re_add_device(), when register netdev notifier fails, driver is
not unregistering the IB device in the error cleanup path.
Also, removed the duplicate cleanup in error path of bnxt_re_probe.
Fixes: 94a9dc6ac8f7 ("RDMA/bnxt_re: Group all operations under add_device and remove_device")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/main.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index b1dcb6b..63ca600 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1880,12 +1880,14 @@ static int bnxt_re_add_device(struct auxiliary_device *adev, u8 op_type)
rdev->nb.notifier_call = NULL;
pr_err("%s: Cannot register to netdevice_notifier",
ROCE_DRV_MODULE_NAME);
- return rc;
+ goto re_dev_unreg;
}
bnxt_re_setup_cc(rdev, true);
return 0;
+re_dev_unreg:
+ ib_unregister_device(&rdev->ibdev);
re_dev_uninit:
bnxt_re_update_en_info_rdev(NULL, en_info, adev);
bnxt_re_dev_uninit(rdev, BNXT_RE_COMPLETE_REMOVE);
@@ -2029,15 +2031,7 @@ static int bnxt_re_probe(struct auxiliary_device *adev,
auxiliary_set_drvdata(adev, en_info);
rc = bnxt_re_add_device(adev, BNXT_RE_COMPLETE_INIT);
- if (rc)
- goto err;
mutex_unlock(&bnxt_re_mutex);
- return 0;
-
-err:
- mutex_unlock(&bnxt_re_mutex);
- bnxt_re_remove(adev);
-
return rc;
}
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 08/10] RDMA/bnxt_re: Change the sequence of updating the CQ toggle value
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (6 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 07/10] RDMA/bnxt_re: Fix an error path in bnxt_re_add_device Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 09/10] RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages Selvin Xavier
` (2 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Chandramohan Akula, Selvin Xavier
From: Chandramohan Akula <chandramohan.akula@broadcom.com>
Currently the CQ toggle value in the shared page (read by
the userlib) is updated as part of the cqn_handler. There
is a potential race of application calling the CQ ARM
doorbell immediately and using the old toggle value.
Change the sequence of updating CQ toggle value to update
in the bnxt_qplib_service_nq function immediately after
reading the toggle value to be in sync with the HW updated
value.
Fixes: e275919d9669 ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace")
Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/main.c | 8 +-------
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 5 +++++
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 63ca600..6715c96 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1274,15 +1274,9 @@ static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq,
{
struct bnxt_re_cq *cq = container_of(handle, struct bnxt_re_cq,
qplib_cq);
- u32 *cq_ptr;
- if (cq->ib_cq.comp_handler) {
- if (cq->uctx_cq_page) {
- cq_ptr = (u32 *)cq->uctx_cq_page;
- *cq_ptr = cq->qplib_cq.toggle;
- }
+ if (cq->ib_cq.comp_handler)
(*cq->ib_cq.comp_handler)(&cq->ib_cq, cq->ib_cq.cq_context);
- }
return 0;
}
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 42e98e5..2ebcb2d 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -327,6 +327,7 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t)
case NQ_BASE_TYPE_CQ_NOTIFICATION:
{
struct nq_cn *nqcne = (struct nq_cn *)nqe;
+ struct bnxt_re_cq *cq_p;
q_handle = le32_to_cpu(nqcne->cq_handle_low);
q_handle |= (u64)le32_to_cpu(nqcne->cq_handle_high)
@@ -337,6 +338,10 @@ static void bnxt_qplib_service_nq(struct tasklet_struct *t)
cq->toggle = (le16_to_cpu(nqe->info10_type) &
NQ_CN_TOGGLE_MASK) >> NQ_CN_TOGGLE_SFT;
cq->dbinfo.toggle = cq->toggle;
+ cq_p = container_of(cq, struct bnxt_re_cq, qplib_cq);
+ if (cq_p->uctx_cq_page)
+ *((u32 *)cq_p->uctx_cq_page) = cq->toggle;
+
bnxt_qplib_armen_db(&cq->dbinfo,
DBC_DBC_TYPE_CQ_ARMENA);
spin_lock_bh(&cq->compl_lock);
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 09/10] RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (7 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 08/10] RDMA/bnxt_re: Change the sequence of updating the CQ toggle value Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 10/10] RDMA/bnxt_re: Fix the GID table length Selvin Xavier
2024-10-11 23:51 ` [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Jason Gunthorpe
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Bhargava Chenna Marreddy, Selvin Xavier
From: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Avoid memory corruption while setting up Level-2 PBL
pages for the non MR resources when num_pages > 256K.
There will be a single PDE page address (contiguous
pages in the case of > PAGE_SIZE), but, current logic
assumes multiple pages, leading to invalid memory
access after 256K PBL entries in the PDE.
Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/qplib_res.c | 19 +++----------------
1 file changed, 3 insertions(+), 16 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
index 1fdffd6..96ceec1 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
@@ -257,22 +257,9 @@ int bnxt_qplib_alloc_init_hwq(struct bnxt_qplib_hwq *hwq,
dst_virt_ptr =
(dma_addr_t **)hwq->pbl[PBL_LVL_0].pg_arr;
src_phys_ptr = hwq->pbl[PBL_LVL_1].pg_map_arr;
- if (hwq_attr->type == HWQ_TYPE_MR) {
- /* For MR it is expected that we supply only 1 contigous
- * page i.e only 1 entry in the PDL that will contain
- * all the PBLs for the user supplied memory region
- */
- for (i = 0; i < hwq->pbl[PBL_LVL_1].pg_count;
- i++)
- dst_virt_ptr[0][i] = src_phys_ptr[i] |
- flag;
- } else {
- for (i = 0; i < hwq->pbl[PBL_LVL_1].pg_count;
- i++)
- dst_virt_ptr[PTR_PG(i)][PTR_IDX(i)] =
- src_phys_ptr[i] |
- PTU_PDE_VALID;
- }
+ for (i = 0; i < hwq->pbl[PBL_LVL_1].pg_count; i++)
+ dst_virt_ptr[0][i] = src_phys_ptr[i] | flag;
+
/* Alloc or init PTEs */
rc = __alloc_pbl(res, &hwq->pbl[PBL_LVL_2],
hwq_attr->sginfo);
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH for-rc 10/10] RDMA/bnxt_re: Fix the GID table length
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (8 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 09/10] RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages Selvin Xavier
@ 2024-10-08 7:41 ` Selvin Xavier
2024-10-11 23:51 ` [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Jason Gunthorpe
10 siblings, 0 replies; 15+ messages in thread
From: Selvin Xavier @ 2024-10-08 7:41 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil,
Selvin Xavier
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
GID table length is reported by FW. The gid index
which is passed to the driver during modify_qp/create_ah
is restricted by the sgid_index field of struct ib_global_route.
sgid_index is u8 and the max sgid possible is 256.
Each GID entry in HW will have 2 GID entries in the kernel gid table.
So we can support twice the gid table size reported by FW. Also, restrict
the max GID to 256 also.
Fixes: 847b97887ed4 ("RDMA/bnxt_re: Restrict the max_gids to 256")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/qplib_sp.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 32c1cc7..e29fbbd 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -159,7 +159,14 @@ int bnxt_qplib_get_dev_attr(struct bnxt_qplib_rcfw *rcfw,
if (!bnxt_qplib_is_chip_gen_p7(rcfw->res->cctx))
attr->l2_db_size = (sb->l2_db_space_size + 1) *
(0x01 << RCFW_DBR_BASE_PAGE_SHIFT);
- attr->max_sgid = BNXT_QPLIB_NUM_GIDS_SUPPORTED;
+ /*
+ * Read the max gid supported by HW.
+ * For each entry in HW GID in HW table, we consume 2
+ * GID entries in the kernel GID table. So max_gid reported
+ * to stack can be up to twice the value reported by the HW, up to 256 gids.
+ */
+ attr->max_sgid = le32_to_cpu(sb->max_gid);
+ attr->max_sgid = min_t(u32, BNXT_QPLIB_NUM_GIDS_SUPPORTED, 2 * attr->max_sgid);
attr->dev_cap_flags = le16_to_cpu(sb->dev_cap_flags);
attr->dev_cap_flags2 = le16_to_cpu(sb->dev_cap_ext_flags_2);
--
2.5.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
` (9 preceding siblings ...)
2024-10-08 7:41 ` [PATCH for-rc 10/10] RDMA/bnxt_re: Fix the GID table length Selvin Xavier
@ 2024-10-11 23:51 ` Jason Gunthorpe
10 siblings, 0 replies; 15+ messages in thread
From: Jason Gunthorpe @ 2024-10-11 23:51 UTC (permalink / raw)
To: Selvin Xavier; +Cc: leon, linux-rdma, andrew.gospodarek, kalesh-anakkur.purayil
On Tue, Oct 08, 2024 at 12:41:32AM -0700, Selvin Xavier wrote:
> Few bugfixes for bnxt_re driver. Please review and apply.
>
> Thanks,
> Selvin Xavier
>
> Abhishek Mohapatra (1):
> RDMA/bnxt_re: Fix the max CQ WQEs for older adapters
>
> Bhargava Chenna Marreddy (1):
> RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages
>
> Chandramohan Akula (1):
> RDMA/bnxt_re: Change the sequence of updating the CQ toggle value
>
> Kalesh AP (5):
> RDMA/bnxt_re: Fix out of bound check
> RDMA/ bnxt_re: Return more meaningful error
> RDMA/bnxt_re: Fix a possible NULL pointer dereference
> RDMA/bnxt_re: Fix an error path in bnxt_re_add_device
> RDMA/bnxt_re: Fix the GID table length
>
> Kashyap Desai (1):
> RDMA/bnxt_re: Fix incorrect dereference of srq in async event
>
> Selvin Xavier (1):
> RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
Applied to for-rc, thanks
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-10-11 23:51 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-08 7:41 [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 01/10] RDMA/bnxt_re: Fix the max CQ WQEs for older adapters Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 02/10] RDMA/bnxt_re: Fix out of bound check Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 03/10] RDMA/bnxt_re: Fix incorrect dereference of srq in async event Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 04/10] RDMA/ bnxt_re: Return more meaningful error Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 05/10] RDMA/bnxt_re: Fix a possible NULL pointer dereference Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 06/10] RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 07/10] RDMA/bnxt_re: Fix an error path in bnxt_re_add_device Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 08/10] RDMA/bnxt_re: Change the sequence of updating the CQ toggle value Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 09/10] RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages Selvin Xavier
2024-10-08 7:41 ` [PATCH for-rc 10/10] RDMA/bnxt_re: Fix the GID table length Selvin Xavier
2024-10-11 23:51 ` [PATCH for-rc 00/10] RDMA/bnxt_re: Bug fixes Jason Gunthorpe
-- strict thread matches above, loose matches on Subject: below --
2023-05-11 7:26 Selvin Xavier
2023-05-17 19:35 ` Jason Gunthorpe
2023-05-18 8:17 ` Selvin Xavier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).