* [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs
@ 2026-01-27 10:31 Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
` (4 more replies)
0 siblings, 5 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
Hi,
This patchset adds Direct Verbs support to the bnxt_re driver.
Direct Verbs are required by vendor-specific applications that need
to manage the HW resources directly and implement the datapath in
the application.
To support this, the library and the driver are enhanced with Direct
Verbs through which the application can allocate and manage the HW
resources (queues, doorbells, etc.). The Direct Verbs enable the
application to implement the control path.
Patch#1 Uverbs support for user allocated QP mem
Patch#2 Move uapi methods to a separate file
Patch#3 Refactor existing bnxt_qplib_create_qp() function
Patch#4 Support dbr direct verbs
Patch#5 Support cq and qp direct verbs
Thanks,
-Harsha
******
Changes:
v9:
- Added a new uverbs patch (#1) in RDMA core.
- Supports user/app allocated memory for QP.
- Updated Patch#5 (cq/qp) to utilize umem dev op.
- Updated driver ABI file (deleted dmabuf_fd/len fields).
v8:
- Patch#3:
- Removed dpi_hash table (and lock/rcu).
- Renamed bnxt_re_alloc_dbr_obj->bnxt_re_dbr_obj.
- Added an atomic usecnt in dbr_obj.
- Patch#4:
- Registered a driver specific attribute for dbr_handle.
- Process dbr_handle during QP creation.
- Added refcnt logic to avoid dbr deletion with active QPs.
- Reverted dpi hash table lookup and related code.
- Removed dpi from req_qp ABI.
- Added ib_umem_find_best_pgsz() in umem processing.
- Added a wrapper function for dv_cq deletion.
v7:
- Patch#3:
- DBR_OFFSET attribute changed to PTR_OUT.
- Added a reserved field in struct bnxt_re_dv_db_region.
- Reordered sequence in DBR_ALLOC (hash_add -> uverbs_finalize).
- Synchronized access to dpi hash table.
- Patch#4:
- Changed dmabuf_fd type (u32->s32) in ABI.
- Changed num_dma_blocks() arg from PAGE_SIZE to SZ_4K.
- Fixed atomic read/inc race window in bnxt_re_dv_create_qplib_cq().
- Deleted bnxt_re_dv_init_ib_cq().
v6:
- Minor updates in Patch#3:
- Removed unused variables.
- Renamed & updated a uverbs method to a global.
- Minor updates in Patch#4:
- Removed unused variables, stray hunks.
v5:
- Design changes to address previous round of comments:
- Reverted changes in rdma-core (removed V4-Patch#1).
- Removed driver support for umem-reg/dereg DVs (Patch#3).
- Enhanced driver specific udata to avoid new CQ/QP ioctls (Patch#4).
- Removed additional driver functions in modify/query QP (Patch#4).
- Utilized queue-va in udata for deferred pinning (Patch#4).
v4:
- Added a new (rdma core) patch.
- Addressed code review comments in patch 5.
v3:
- Addressed code review comments in patches 1, 2 and 4.
v2:
- Fixed build warnings reported by test robot in patches 3 and 4.
v8: https://lore.kernel.org/linux-rdma/20260117080052.43279-1-sriharsha.basavapatna@broadcom.com/
v7: https://lore.kernel.org/linux-rdma/20260113170956.103779-1-sriharsha.basavapatna@broadcom.com/
v6: https://lore.kernel.org/linux-rdma/20251224042602.56255-1-sriharsha.basavapatna@broadcom.com/
v5: https://lore.kernel.org/linux-rdma/20251129165441.75274-1-sriharsha.basavapatna@broadcom.com/
v4: https://lore.kernel.org/linux-rdma/20251117061741.15752-1-sriharsha.basavapatna@broadcom.com/
v3: https://lore.kernel.org/linux-rdma/20251110145628.290296-1-sriharsha.basavapatna@broadcom.com/
v2: https://lore.kernel.org/linux-rdma/20251104072320.210596-1-sriharsha.basavapatna@broadcom.com/
v1: https://lore.kernel.org/linux-rdma/20251103105033.205586-1-sriharsha.basavapatna@broadcom.com/
******
Jiri Pirko (1):
RDMA/uverbs: Support QP creation with user allocated memory
Kalesh AP (3):
RDMA/bnxt_re: Move the UAPI methods to a dedicated file
RDMA/bnxt_re: Refactor bnxt_qplib_create_qp() function
RDMA/bnxt_re: Direct Verbs: Support DBR verbs
Sriharsha Basavapatna (1):
RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
drivers/infiniband/core/device.c | 1 +
drivers/infiniband/core/uverbs_std_types_qp.c | 157 ++-
drivers/infiniband/hw/bnxt_re/Makefile | 2 +-
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 6 +
drivers/infiniband/hw/bnxt_re/dv.c | 916 ++++++++++++++++++
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 663 ++++++-------
drivers/infiniband/hw/bnxt_re/ib_verbs.h | 30 +
drivers/infiniband/hw/bnxt_re/main.c | 2 +
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 310 ++----
drivers/infiniband/hw/bnxt_re/qplib_fp.h | 10 +-
drivers/infiniband/hw/bnxt_re/qplib_res.c | 43 +
drivers/infiniband/hw/bnxt_re/qplib_res.h | 10 +
include/rdma/ib_verbs.h | 4 +
include/uapi/rdma/bnxt_re-abi.h | 44 +
include/uapi/rdma/ib_user_ioctl_cmds.h | 8 +
15 files changed, 1628 insertions(+), 578 deletions(-)
create mode 100644 drivers/infiniband/hw/bnxt_re/dv.c
--
2.51.2.636.ga99f379adf
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
@ 2026-01-27 10:31 ` Sriharsha Basavapatna
2026-01-27 12:12 ` Jiri Pirko
2026-01-28 12:31 ` Leon Romanovsky
2026-01-27 10:31 ` [PATCH rdma-next v9 2/5] RDMA/bnxt_re: Move the UAPI methods to a dedicated file Sriharsha Basavapatna
` (3 subsequent siblings)
4 siblings, 2 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Jiri Pirko, Sriharsha Basavapatna
From: Jiri Pirko <jiri@resnulli.us>
Support creation of QPs with user allocated memory (umem). This is
similar to the existing CQ umem support and enables userspace
applications to provide pre-allocated buffers for the QP send and
receive queues.
- Add create_qp_umem device operation to the RDMA device ops.
- Implement get_qp_buffer_umem() helper function to handle both VA-based
and dmabuf-based umem allocation.
- Extend QP creation handler to support umem attributes for SQ and RQ.
- Add new uAPI attributes to specify umem buffers (VA/length or
FD/offset combinations).
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
---
drivers/infiniband/core/device.c | 1 +
drivers/infiniband/core/uverbs_std_types_qp.c | 157 +++++++++++++++++-
include/rdma/ib_verbs.h | 4 +
include/uapi/rdma/ib_user_ioctl_cmds.h | 8 +
4 files changed, 165 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 4e09f6e0995e..a9ae03f1e936 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2704,6 +2704,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, create_cq_umem);
SET_DEVICE_OP(dev_ops, create_flow);
SET_DEVICE_OP(dev_ops, create_qp);
+ SET_DEVICE_OP(dev_ops, create_qp_umem);
SET_DEVICE_OP(dev_ops, create_rwq_ind_table);
SET_DEVICE_OP(dev_ops, create_srq);
SET_DEVICE_OP(dev_ops, create_user_ah);
diff --git a/drivers/infiniband/core/uverbs_std_types_qp.c b/drivers/infiniband/core/uverbs_std_types_qp.c
index be0730e8509e..3bc7a9adbf24 100644
--- a/drivers/infiniband/core/uverbs_std_types_qp.c
+++ b/drivers/infiniband/core/uverbs_std_types_qp.c
@@ -79,6 +79,75 @@ static void set_caps(struct ib_qp_init_attr *attr,
}
}
+static int get_qp_buffer_umem(struct ib_device *ib_dev,
+ struct uverbs_attr_bundle *attrs,
+ int va_attr, int len_attr,
+ int fd_attr, int offset_attr,
+ struct ib_umem **umem_out)
+{
+ struct ib_umem_dmabuf *umem_dmabuf;
+ u64 buffer_va, buffer_length, buffer_offset;
+ int buffer_fd;
+ int ret;
+
+ *umem_out = NULL;
+
+ if (uverbs_attr_is_valid(attrs, va_attr)) {
+ /* VA mode - use regular umem */
+ ret = uverbs_copy_from(&buffer_va, attrs, va_attr);
+ if (ret)
+ return ret;
+
+ ret = uverbs_copy_from(&buffer_length, attrs, len_attr);
+ if (ret)
+ return ret;
+
+ /* VA and FD are mutually exclusive */
+ if (uverbs_attr_is_valid(attrs, fd_attr) ||
+ uverbs_attr_is_valid(attrs, offset_attr))
+ return -EINVAL;
+
+ *umem_out = ib_umem_get(ib_dev, buffer_va, buffer_length,
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(*umem_out)) {
+ ret = PTR_ERR(*umem_out);
+ *umem_out = NULL;
+ return ret;
+ }
+ } else if (uverbs_attr_is_valid(attrs, fd_attr)) {
+ /* Dmabuf mode */
+ ret = uverbs_get_raw_fd(&buffer_fd, attrs, fd_attr);
+ if (ret)
+ return ret;
+
+ ret = uverbs_copy_from(&buffer_offset, attrs, offset_attr);
+ if (ret)
+ return ret;
+
+ ret = uverbs_copy_from(&buffer_length, attrs, len_attr);
+ if (ret)
+ return ret;
+
+ /* FD and VA are mutually exclusive */
+ if (uverbs_attr_is_valid(attrs, va_attr))
+ return -EINVAL;
+
+ umem_dmabuf = ib_umem_dmabuf_get_pinned(ib_dev, buffer_offset,
+ buffer_length, buffer_fd,
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem_dmabuf))
+ return PTR_ERR(umem_dmabuf);
+
+ *umem_out = &umem_dmabuf->umem;
+ } else if (uverbs_attr_is_valid(attrs, len_attr) ||
+ uverbs_attr_is_valid(attrs, offset_attr)) {
+ /* Length or offset without VA/FD is invalid */
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
struct uverbs_attr_bundle *attrs)
{
@@ -95,6 +164,8 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
struct ib_cq *send_cq = NULL;
struct ib_xrcd *xrcd = NULL;
struct ib_uobject *xrcd_uobj = NULL;
+ struct ib_umem *sq_umem = NULL;
+ struct ib_umem *rq_umem = NULL;
struct ib_device *device;
u64 user_handle;
int ret;
@@ -248,11 +319,58 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
set_caps(&attr, &cap, true);
mutex_init(&obj->mcast_lock);
- qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata, obj,
- KBUILD_MODNAME);
- if (IS_ERR(qp)) {
- ret = PTR_ERR(qp);
+ /* Get SQ buffer umem (from VA or dmabuf FD) */
+ ret = get_qp_buffer_umem(device, attrs,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_VA,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_LENGTH,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_FD,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_OFFSET,
+ &sq_umem);
+ if (ret)
goto err_put;
+
+ /* Get RQ buffer umem (from VA or dmabuf FD) */
+ ret = get_qp_buffer_umem(device, attrs,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_VA,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_LENGTH,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_FD,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_OFFSET,
+ &rq_umem);
+ if (ret)
+ goto err_release_sq_umem;
+
+ /* Use umem-based creation if buffers are provided */
+ if (sq_umem || rq_umem) {
+ if (!device->ops.create_qp_umem) {
+ ret = -EOPNOTSUPP;
+ goto err_release_rq_umem;
+ }
+
+ qp = rdma_zalloc_drv_obj(device, ib_qp);
+ if (!qp) {
+ ret = -ENOMEM;
+ goto err_release_rq_umem;
+ }
+
+ qp->device = device;
+ qp->pd = pd;
+ qp->uobject = obj;
+ qp->real_qp = qp;
+ qp->qp_type = attr.qp_type;
+
+ ret = device->ops.create_qp_umem(qp, &attr, sq_umem, rq_umem,
+ attrs);
+ if (ret) {
+ kfree(qp);
+ goto err_release_rq_umem;
+ }
+ } else {
+ qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata,
+ obj, KBUILD_MODNAME);
+ if (IS_ERR(qp)) {
+ ret = PTR_ERR(qp);
+ goto err_put;
+ }
}
ib_qp_usecnt_inc(qp);
@@ -277,11 +395,16 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
sizeof(qp->qp_num));
return ret;
+
+err_release_rq_umem:
+ ib_umem_release(rq_umem);
+err_release_sq_umem:
+ ib_umem_release(sq_umem);
err_put:
if (obj->uevent.event_file)
uverbs_uobject_put(&obj->uevent.event_file->uobj);
return ret;
-};
+}
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_QP_CREATE,
@@ -340,6 +463,30 @@ DECLARE_UVERBS_NAMED_METHOD(
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
UVERBS_ATTR_TYPE(u32),
UA_MANDATORY),
+ /* SQ buffer attributes - use VA or FD, not both */
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_SQ_BUFFER_VA,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_SQ_BUFFER_LENGTH,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
+ UVERBS_ATTR_RAW_FD(UVERBS_ATTR_CREATE_QP_SQ_BUFFER_FD,
+ UA_OPTIONAL),
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_SQ_BUFFER_OFFSET,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
+ /* RQ buffer attributes - use VA or FD, not both */
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_RQ_BUFFER_VA,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_RQ_BUFFER_LENGTH,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
+ UVERBS_ATTR_RAW_FD(UVERBS_ATTR_CREATE_QP_RQ_BUFFER_FD,
+ UA_OPTIONAL),
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_QP_RQ_BUFFER_OFFSET,
+ UVERBS_ATTR_TYPE(u64),
+ UA_OPTIONAL),
UVERBS_ATTR_UHW());
static int UVERBS_HANDLER(UVERBS_METHOD_QP_DESTROY)(
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6c372a37c482..8bbf37b9e823 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2520,6 +2520,10 @@ struct ib_device_ops {
int (*destroy_srq)(struct ib_srq *srq, struct ib_udata *udata);
int (*create_qp)(struct ib_qp *qp, struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata);
+ int (*create_qp_umem)(struct ib_qp *qp,
+ struct ib_qp_init_attr *qp_init_attr,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem,
+ struct uverbs_attr_bundle *attrs);
int (*modify_qp)(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_udata *udata);
int (*query_qp)(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 35da4026f452..0d6b4151512d 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -157,6 +157,14 @@ enum uverbs_attrs_create_qp_cmd_attr_ids {
UVERBS_ATTR_CREATE_QP_EVENT_FD,
UVERBS_ATTR_CREATE_QP_RESP_CAP,
UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_VA,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_LENGTH,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_FD,
+ UVERBS_ATTR_CREATE_QP_SQ_BUFFER_OFFSET,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_VA,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_LENGTH,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_FD,
+ UVERBS_ATTR_CREATE_QP_RQ_BUFFER_OFFSET,
};
enum uverbs_attrs_destroy_qp_cmd_attr_ids {
--
2.51.2.636.ga99f379adf
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH rdma-next v9 2/5] RDMA/bnxt_re: Move the UAPI methods to a dedicated file
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
@ 2026-01-27 10:31 ` Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 3/5] RDMA/bnxt_re: Refactor bnxt_qplib_create_qp() function Sriharsha Basavapatna
` (2 subsequent siblings)
4 siblings, 0 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
This is in preparation for upcoming patches in the series.
The driver has to support additional UAPIs for Direct Verbs, so move
the current UAPI implementation to a new file, dv.c.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/Makefile | 2 +-
drivers/infiniband/hw/bnxt_re/dv.c | 339 +++++++++++++++++++++++
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 305 +-------------------
drivers/infiniband/hw/bnxt_re/ib_verbs.h | 3 +
4 files changed, 344 insertions(+), 305 deletions(-)
create mode 100644 drivers/infiniband/hw/bnxt_re/dv.c
diff --git a/drivers/infiniband/hw/bnxt_re/Makefile b/drivers/infiniband/hw/bnxt_re/Makefile
index f63417d2ccc6..b82d12df6269 100644
--- a/drivers/infiniband/hw/bnxt_re/Makefile
+++ b/drivers/infiniband/hw/bnxt_re/Makefile
@@ -5,4 +5,4 @@ obj-$(CONFIG_INFINIBAND_BNXT_RE) += bnxt_re.o
bnxt_re-y := main.o ib_verbs.o \
qplib_res.o qplib_rcfw.o \
qplib_sp.o qplib_fp.o hw_counters.o \
- debugfs.o
+ debugfs.o dv.o
diff --git a/drivers/infiniband/hw/bnxt_re/dv.c b/drivers/infiniband/hw/bnxt_re/dv.c
new file mode 100644
index 000000000000..5655c6176af4
--- /dev/null
+++ b/drivers/infiniband/hw/bnxt_re/dv.c
@@ -0,0 +1,339 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause
+/*
+ * Copyright (c) 2025, Broadcom. All rights reserved. The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * Description: Direct Verbs interpreter
+ */
+
+#include <rdma/ib_addr.h>
+#include <rdma/uverbs_types.h>
+#include <rdma/uverbs_std_types.h>
+#include <rdma/ib_user_ioctl_cmds.h>
+#define UVERBS_MODULE_NAME bnxt_re
+#include <rdma/uverbs_named_ioctl.h>
+#include <rdma/bnxt_re-abi.h>
+
+#include "roce_hsi.h"
+#include "qplib_res.h"
+#include "qplib_sp.h"
+#include "qplib_fp.h"
+#include "qplib_rcfw.h"
+#include "bnxt_re.h"
+#include "ib_verbs.h"
+
+static struct bnxt_re_cq *bnxt_re_search_for_cq(struct bnxt_re_dev *rdev, u32 cq_id)
+{
+ struct bnxt_re_cq *cq = NULL, *tmp_cq;
+
+ hash_for_each_possible(rdev->cq_hash, tmp_cq, hash_entry, cq_id) {
+ if (tmp_cq->qplib_cq.id == cq_id) {
+ cq = tmp_cq;
+ break;
+ }
+ }
+ return cq;
+}
+
+static struct bnxt_re_srq *bnxt_re_search_for_srq(struct bnxt_re_dev *rdev, u32 srq_id)
+{
+ struct bnxt_re_srq *srq = NULL, *tmp_srq;
+
+ hash_for_each_possible(rdev->srq_hash, tmp_srq, hash_entry, srq_id) {
+ if (tmp_srq->qplib_srq.id == srq_id) {
+ srq = tmp_srq;
+ break;
+ }
+ }
+ return srq;
+}
+
+static int UVERBS_HANDLER(BNXT_RE_METHOD_NOTIFY_DRV)(struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_ucontext *uctx;
+ struct ib_ucontext *ib_uctx;
+
+ ib_uctx = ib_uverbs_get_ucontext(attrs);
+ if (IS_ERR(ib_uctx))
+ return PTR_ERR(ib_uctx);
+
+ uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
+ if (IS_ERR(uctx))
+ return PTR_ERR(uctx);
+
+ bnxt_re_pacing_alert(uctx->rdev);
+ return 0;
+}
+
+static int UVERBS_HANDLER(BNXT_RE_METHOD_ALLOC_PAGE)(struct uverbs_attr_bundle *attrs)
+{
+ struct ib_uobject *uobj = uverbs_attr_get_uobject(attrs, BNXT_RE_ALLOC_PAGE_HANDLE);
+ enum bnxt_re_alloc_page_type alloc_type;
+ struct bnxt_re_user_mmap_entry *entry;
+ enum bnxt_re_mmap_flag mmap_flag;
+ struct bnxt_qplib_chip_ctx *cctx;
+ struct bnxt_re_ucontext *uctx;
+ struct ib_ucontext *ib_uctx;
+ struct bnxt_re_dev *rdev;
+ u64 mmap_offset;
+ u32 length;
+ u32 dpi;
+ u64 addr;
+ int err;
+
+ ib_uctx = ib_uverbs_get_ucontext(attrs);
+ if (IS_ERR(ib_uctx))
+ return PTR_ERR(ib_uctx);
+
+ uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
+ if (IS_ERR(uctx))
+ return PTR_ERR(uctx);
+
+ err = uverbs_get_const(&alloc_type, attrs, BNXT_RE_ALLOC_PAGE_TYPE);
+ if (err)
+ return err;
+
+ rdev = uctx->rdev;
+ cctx = rdev->chip_ctx;
+
+ switch (alloc_type) {
+ case BNXT_RE_ALLOC_WC_PAGE:
+ if (cctx->modes.db_push) {
+ if (bnxt_qplib_alloc_dpi(&rdev->qplib_res, &uctx->wcdpi,
+ uctx, BNXT_QPLIB_DPI_TYPE_WC))
+ return -ENOMEM;
+ length = PAGE_SIZE;
+ dpi = uctx->wcdpi.dpi;
+ addr = (u64)uctx->wcdpi.umdbr;
+ mmap_flag = BNXT_RE_MMAP_WC_DB;
+ } else {
+ return -EINVAL;
+ }
+
+ break;
+ case BNXT_RE_ALLOC_DBR_BAR_PAGE:
+ length = PAGE_SIZE;
+ addr = (u64)rdev->pacing.dbr_bar_addr;
+ mmap_flag = BNXT_RE_MMAP_DBR_BAR;
+ break;
+
+ case BNXT_RE_ALLOC_DBR_PAGE:
+ length = PAGE_SIZE;
+ addr = (u64)rdev->pacing.dbr_page;
+ mmap_flag = BNXT_RE_MMAP_DBR_PAGE;
+ break;
+
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ entry = bnxt_re_mmap_entry_insert(uctx, addr, mmap_flag, &mmap_offset);
+ if (!entry)
+ return -ENOMEM;
+
+ uobj->object = entry;
+ uverbs_finalize_uobj_create(attrs, BNXT_RE_ALLOC_PAGE_HANDLE);
+ err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_MMAP_OFFSET,
+ &mmap_offset, sizeof(mmap_offset));
+ if (err)
+ return err;
+
+ err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_MMAP_LENGTH,
+ &length, sizeof(length));
+ if (err)
+ return err;
+
+ err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_DPI,
+ &dpi, sizeof(dpi));
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int alloc_page_obj_cleanup(struct ib_uobject *uobject,
+ enum rdma_remove_reason why,
+ struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_user_mmap_entry *entry = uobject->object;
+ struct bnxt_re_ucontext *uctx = entry->uctx;
+
+ switch (entry->mmap_flag) {
+ case BNXT_RE_MMAP_WC_DB:
+ if (uctx && uctx->wcdpi.dbr) {
+ struct bnxt_re_dev *rdev = uctx->rdev;
+
+ bnxt_qplib_dealloc_dpi(&rdev->qplib_res, &uctx->wcdpi);
+ uctx->wcdpi.dbr = NULL;
+ }
+ break;
+ case BNXT_RE_MMAP_DBR_BAR:
+ case BNXT_RE_MMAP_DBR_PAGE:
+ break;
+ default:
+ goto exit;
+ }
+ rdma_user_mmap_entry_remove(&entry->rdma_entry);
+exit:
+ return 0;
+}
+
+DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_ALLOC_PAGE,
+ UVERBS_ATTR_IDR(BNXT_RE_ALLOC_PAGE_HANDLE,
+ BNXT_RE_OBJECT_ALLOC_PAGE,
+ UVERBS_ACCESS_NEW,
+ UA_MANDATORY),
+ UVERBS_ATTR_CONST_IN(BNXT_RE_ALLOC_PAGE_TYPE,
+ enum bnxt_re_alloc_page_type,
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_MMAP_OFFSET,
+ UVERBS_ATTR_TYPE(u64),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_MMAP_LENGTH,
+ UVERBS_ATTR_TYPE(u32),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_DPI,
+ UVERBS_ATTR_TYPE(u32),
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_METHOD_DESTROY(BNXT_RE_METHOD_DESTROY_PAGE,
+ UVERBS_ATTR_IDR(BNXT_RE_DESTROY_PAGE_HANDLE,
+ BNXT_RE_OBJECT_ALLOC_PAGE,
+ UVERBS_ACCESS_DESTROY,
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_ALLOC_PAGE,
+ UVERBS_TYPE_ALLOC_IDR(alloc_page_obj_cleanup),
+ &UVERBS_METHOD(BNXT_RE_METHOD_ALLOC_PAGE),
+ &UVERBS_METHOD(BNXT_RE_METHOD_DESTROY_PAGE));
+
+DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_NOTIFY_DRV);
+
+DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_NOTIFY_DRV,
+ &UVERBS_METHOD(BNXT_RE_METHOD_NOTIFY_DRV));
+
+/* Toggle MEM */
+static int UVERBS_HANDLER(BNXT_RE_METHOD_GET_TOGGLE_MEM)(struct uverbs_attr_bundle *attrs)
+{
+ struct ib_uobject *uobj = uverbs_attr_get_uobject(attrs, BNXT_RE_TOGGLE_MEM_HANDLE);
+ enum bnxt_re_mmap_flag mmap_flag = BNXT_RE_MMAP_TOGGLE_PAGE;
+ enum bnxt_re_get_toggle_mem_type res_type;
+ struct bnxt_re_user_mmap_entry *entry;
+ struct bnxt_re_ucontext *uctx;
+ struct ib_ucontext *ib_uctx;
+ struct bnxt_re_dev *rdev;
+ struct bnxt_re_srq *srq;
+ u32 length = PAGE_SIZE;
+ struct bnxt_re_cq *cq;
+ u64 mem_offset;
+ u32 offset = 0;
+ u64 addr = 0;
+ u32 res_id;
+ int err;
+
+ ib_uctx = ib_uverbs_get_ucontext(attrs);
+ if (IS_ERR(ib_uctx))
+ return PTR_ERR(ib_uctx);
+
+ err = uverbs_get_const(&res_type, attrs, BNXT_RE_TOGGLE_MEM_TYPE);
+ if (err)
+ return err;
+
+ uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
+ rdev = uctx->rdev;
+ err = uverbs_copy_from(&res_id, attrs, BNXT_RE_TOGGLE_MEM_RES_ID);
+ if (err)
+ return err;
+
+ switch (res_type) {
+ case BNXT_RE_CQ_TOGGLE_MEM:
+ cq = bnxt_re_search_for_cq(rdev, res_id);
+ if (!cq)
+ return -EINVAL;
+
+ addr = (u64)cq->uctx_cq_page;
+ break;
+ case BNXT_RE_SRQ_TOGGLE_MEM:
+ srq = bnxt_re_search_for_srq(rdev, res_id);
+ if (!srq)
+ return -EINVAL;
+
+ addr = (u64)srq->uctx_srq_page;
+ break;
+
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ entry = bnxt_re_mmap_entry_insert(uctx, addr, mmap_flag, &mem_offset);
+ if (!entry)
+ return -ENOMEM;
+
+ uobj->object = entry;
+ uverbs_finalize_uobj_create(attrs, BNXT_RE_TOGGLE_MEM_HANDLE);
+ err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_PAGE,
+ &mem_offset, sizeof(mem_offset));
+ if (err)
+ return err;
+
+ err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_LENGTH,
+ &length, sizeof(length));
+ if (err)
+ return err;
+
+ err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_OFFSET,
+ &offset, sizeof(offset));
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int get_toggle_mem_obj_cleanup(struct ib_uobject *uobject,
+ enum rdma_remove_reason why,
+ struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_user_mmap_entry *entry = uobject->object;
+
+ rdma_user_mmap_entry_remove(&entry->rdma_entry);
+ return 0;
+}
+
+DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_TOGGLE_MEM,
+ UVERBS_ATTR_IDR(BNXT_RE_TOGGLE_MEM_HANDLE,
+ BNXT_RE_OBJECT_GET_TOGGLE_MEM,
+ UVERBS_ACCESS_NEW,
+ UA_MANDATORY),
+ UVERBS_ATTR_CONST_IN(BNXT_RE_TOGGLE_MEM_TYPE,
+ enum bnxt_re_get_toggle_mem_type,
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_IN(BNXT_RE_TOGGLE_MEM_RES_ID,
+ UVERBS_ATTR_TYPE(u32),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_PAGE,
+ UVERBS_ATTR_TYPE(u64),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_OFFSET,
+ UVERBS_ATTR_TYPE(u32),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_LENGTH,
+ UVERBS_ATTR_TYPE(u32),
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_METHOD_DESTROY(BNXT_RE_METHOD_RELEASE_TOGGLE_MEM,
+ UVERBS_ATTR_IDR(BNXT_RE_RELEASE_TOGGLE_MEM_HANDLE,
+ BNXT_RE_OBJECT_GET_TOGGLE_MEM,
+ UVERBS_ACCESS_DESTROY,
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_GET_TOGGLE_MEM,
+ UVERBS_TYPE_ALLOC_IDR(get_toggle_mem_obj_cleanup),
+ &UVERBS_METHOD(BNXT_RE_METHOD_GET_TOGGLE_MEM),
+ &UVERBS_METHOD(BNXT_RE_METHOD_RELEASE_TOGGLE_MEM));
+
+const struct uapi_definition bnxt_re_uapi_defs[] = {
+ UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_ALLOC_PAGE),
+ UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_NOTIFY_DRV),
+ UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_GET_TOGGLE_MEM),
+ {}
+};
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index f19b55c13d58..f758f92ba72b 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -627,7 +627,7 @@ static int bnxt_re_create_fence_mr(struct bnxt_re_pd *pd)
return rc;
}
-static struct bnxt_re_user_mmap_entry*
+struct bnxt_re_user_mmap_entry*
bnxt_re_mmap_entry_insert(struct bnxt_re_ucontext *uctx, u64 mem_offset,
enum bnxt_re_mmap_flag mmap_flag, u64 *offset)
{
@@ -4531,32 +4531,6 @@ int bnxt_re_destroy_flow(struct ib_flow *flow_id)
return rc;
}
-static struct bnxt_re_cq *bnxt_re_search_for_cq(struct bnxt_re_dev *rdev, u32 cq_id)
-{
- struct bnxt_re_cq *cq = NULL, *tmp_cq;
-
- hash_for_each_possible(rdev->cq_hash, tmp_cq, hash_entry, cq_id) {
- if (tmp_cq->qplib_cq.id == cq_id) {
- cq = tmp_cq;
- break;
- }
- }
- return cq;
-}
-
-static struct bnxt_re_srq *bnxt_re_search_for_srq(struct bnxt_re_dev *rdev, u32 srq_id)
-{
- struct bnxt_re_srq *srq = NULL, *tmp_srq;
-
- hash_for_each_possible(rdev->srq_hash, tmp_srq, hash_entry, srq_id) {
- if (tmp_srq->qplib_srq.id == srq_id) {
- srq = tmp_srq;
- break;
- }
- }
- return srq;
-}
-
/* Helper function to mmap the virtual memory from user app */
int bnxt_re_mmap(struct ib_ucontext *ib_uctx, struct vm_area_struct *vma)
{
@@ -4659,280 +4633,3 @@ int bnxt_re_process_mad(struct ib_device *ibdev, int mad_flags,
ret |= IB_MAD_RESULT_REPLY;
return ret;
}
-
-static int UVERBS_HANDLER(BNXT_RE_METHOD_NOTIFY_DRV)(struct uverbs_attr_bundle *attrs)
-{
- struct bnxt_re_ucontext *uctx;
-
- uctx = container_of(ib_uverbs_get_ucontext(attrs), struct bnxt_re_ucontext, ib_uctx);
- bnxt_re_pacing_alert(uctx->rdev);
- return 0;
-}
-
-static int UVERBS_HANDLER(BNXT_RE_METHOD_ALLOC_PAGE)(struct uverbs_attr_bundle *attrs)
-{
- struct ib_uobject *uobj = uverbs_attr_get_uobject(attrs, BNXT_RE_ALLOC_PAGE_HANDLE);
- enum bnxt_re_alloc_page_type alloc_type;
- struct bnxt_re_user_mmap_entry *entry;
- enum bnxt_re_mmap_flag mmap_flag;
- struct bnxt_qplib_chip_ctx *cctx;
- struct bnxt_re_ucontext *uctx;
- struct bnxt_re_dev *rdev;
- u64 mmap_offset;
- u32 length;
- u32 dpi;
- u64 addr;
- int err;
-
- uctx = container_of(ib_uverbs_get_ucontext(attrs), struct bnxt_re_ucontext, ib_uctx);
- if (IS_ERR(uctx))
- return PTR_ERR(uctx);
-
- err = uverbs_get_const(&alloc_type, attrs, BNXT_RE_ALLOC_PAGE_TYPE);
- if (err)
- return err;
-
- rdev = uctx->rdev;
- cctx = rdev->chip_ctx;
-
- switch (alloc_type) {
- case BNXT_RE_ALLOC_WC_PAGE:
- if (cctx->modes.db_push) {
- if (bnxt_qplib_alloc_dpi(&rdev->qplib_res, &uctx->wcdpi,
- uctx, BNXT_QPLIB_DPI_TYPE_WC))
- return -ENOMEM;
- length = PAGE_SIZE;
- dpi = uctx->wcdpi.dpi;
- addr = (u64)uctx->wcdpi.umdbr;
- mmap_flag = BNXT_RE_MMAP_WC_DB;
- } else {
- return -EINVAL;
- }
-
- break;
- case BNXT_RE_ALLOC_DBR_BAR_PAGE:
- length = PAGE_SIZE;
- addr = (u64)rdev->pacing.dbr_bar_addr;
- mmap_flag = BNXT_RE_MMAP_DBR_BAR;
- break;
-
- case BNXT_RE_ALLOC_DBR_PAGE:
- length = PAGE_SIZE;
- addr = (u64)rdev->pacing.dbr_page;
- mmap_flag = BNXT_RE_MMAP_DBR_PAGE;
- break;
-
- default:
- return -EOPNOTSUPP;
- }
-
- entry = bnxt_re_mmap_entry_insert(uctx, addr, mmap_flag, &mmap_offset);
- if (!entry)
- return -ENOMEM;
-
- uobj->object = entry;
- uverbs_finalize_uobj_create(attrs, BNXT_RE_ALLOC_PAGE_HANDLE);
- err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_MMAP_OFFSET,
- &mmap_offset, sizeof(mmap_offset));
- if (err)
- return err;
-
- err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_MMAP_LENGTH,
- &length, sizeof(length));
- if (err)
- return err;
-
- err = uverbs_copy_to(attrs, BNXT_RE_ALLOC_PAGE_DPI,
- &dpi, sizeof(dpi));
- if (err)
- return err;
-
- return 0;
-}
-
-static int alloc_page_obj_cleanup(struct ib_uobject *uobject,
- enum rdma_remove_reason why,
- struct uverbs_attr_bundle *attrs)
-{
- struct bnxt_re_user_mmap_entry *entry = uobject->object;
- struct bnxt_re_ucontext *uctx = entry->uctx;
-
- switch (entry->mmap_flag) {
- case BNXT_RE_MMAP_WC_DB:
- if (uctx && uctx->wcdpi.dbr) {
- struct bnxt_re_dev *rdev = uctx->rdev;
-
- bnxt_qplib_dealloc_dpi(&rdev->qplib_res, &uctx->wcdpi);
- uctx->wcdpi.dbr = NULL;
- }
- break;
- case BNXT_RE_MMAP_DBR_BAR:
- case BNXT_RE_MMAP_DBR_PAGE:
- break;
- default:
- goto exit;
- }
- rdma_user_mmap_entry_remove(&entry->rdma_entry);
-exit:
- return 0;
-}
-
-DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_ALLOC_PAGE,
- UVERBS_ATTR_IDR(BNXT_RE_ALLOC_PAGE_HANDLE,
- BNXT_RE_OBJECT_ALLOC_PAGE,
- UVERBS_ACCESS_NEW,
- UA_MANDATORY),
- UVERBS_ATTR_CONST_IN(BNXT_RE_ALLOC_PAGE_TYPE,
- enum bnxt_re_alloc_page_type,
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_MMAP_OFFSET,
- UVERBS_ATTR_TYPE(u64),
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_MMAP_LENGTH,
- UVERBS_ATTR_TYPE(u32),
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_ALLOC_PAGE_DPI,
- UVERBS_ATTR_TYPE(u32),
- UA_MANDATORY));
-
-DECLARE_UVERBS_NAMED_METHOD_DESTROY(BNXT_RE_METHOD_DESTROY_PAGE,
- UVERBS_ATTR_IDR(BNXT_RE_DESTROY_PAGE_HANDLE,
- BNXT_RE_OBJECT_ALLOC_PAGE,
- UVERBS_ACCESS_DESTROY,
- UA_MANDATORY));
-
-DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_ALLOC_PAGE,
- UVERBS_TYPE_ALLOC_IDR(alloc_page_obj_cleanup),
- &UVERBS_METHOD(BNXT_RE_METHOD_ALLOC_PAGE),
- &UVERBS_METHOD(BNXT_RE_METHOD_DESTROY_PAGE));
-
-DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_NOTIFY_DRV);
-
-DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_NOTIFY_DRV,
- &UVERBS_METHOD(BNXT_RE_METHOD_NOTIFY_DRV));
-
-/* Toggle MEM */
-static int UVERBS_HANDLER(BNXT_RE_METHOD_GET_TOGGLE_MEM)(struct uverbs_attr_bundle *attrs)
-{
- struct ib_uobject *uobj = uverbs_attr_get_uobject(attrs, BNXT_RE_TOGGLE_MEM_HANDLE);
- enum bnxt_re_mmap_flag mmap_flag = BNXT_RE_MMAP_TOGGLE_PAGE;
- enum bnxt_re_get_toggle_mem_type res_type;
- struct bnxt_re_user_mmap_entry *entry;
- struct bnxt_re_ucontext *uctx;
- struct ib_ucontext *ib_uctx;
- struct bnxt_re_dev *rdev;
- struct bnxt_re_srq *srq;
- u32 length = PAGE_SIZE;
- struct bnxt_re_cq *cq;
- u64 mem_offset;
- u32 offset = 0;
- u64 addr = 0;
- u32 res_id;
- int err;
-
- ib_uctx = ib_uverbs_get_ucontext(attrs);
- if (IS_ERR(ib_uctx))
- return PTR_ERR(ib_uctx);
-
- err = uverbs_get_const(&res_type, attrs, BNXT_RE_TOGGLE_MEM_TYPE);
- if (err)
- return err;
-
- uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
- rdev = uctx->rdev;
- err = uverbs_copy_from(&res_id, attrs, BNXT_RE_TOGGLE_MEM_RES_ID);
- if (err)
- return err;
-
- switch (res_type) {
- case BNXT_RE_CQ_TOGGLE_MEM:
- cq = bnxt_re_search_for_cq(rdev, res_id);
- if (!cq)
- return -EINVAL;
-
- addr = (u64)cq->uctx_cq_page;
- break;
- case BNXT_RE_SRQ_TOGGLE_MEM:
- srq = bnxt_re_search_for_srq(rdev, res_id);
- if (!srq)
- return -EINVAL;
-
- addr = (u64)srq->uctx_srq_page;
- break;
-
- default:
- return -EOPNOTSUPP;
- }
-
- entry = bnxt_re_mmap_entry_insert(uctx, addr, mmap_flag, &mem_offset);
- if (!entry)
- return -ENOMEM;
-
- uobj->object = entry;
- uverbs_finalize_uobj_create(attrs, BNXT_RE_TOGGLE_MEM_HANDLE);
- err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_PAGE,
- &mem_offset, sizeof(mem_offset));
- if (err)
- return err;
-
- err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_LENGTH,
- &length, sizeof(length));
- if (err)
- return err;
-
- err = uverbs_copy_to(attrs, BNXT_RE_TOGGLE_MEM_MMAP_OFFSET,
- &offset, sizeof(offset));
- if (err)
- return err;
-
- return 0;
-}
-
-static int get_toggle_mem_obj_cleanup(struct ib_uobject *uobject,
- enum rdma_remove_reason why,
- struct uverbs_attr_bundle *attrs)
-{
- struct bnxt_re_user_mmap_entry *entry = uobject->object;
-
- rdma_user_mmap_entry_remove(&entry->rdma_entry);
- return 0;
-}
-
-DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_TOGGLE_MEM,
- UVERBS_ATTR_IDR(BNXT_RE_TOGGLE_MEM_HANDLE,
- BNXT_RE_OBJECT_GET_TOGGLE_MEM,
- UVERBS_ACCESS_NEW,
- UA_MANDATORY),
- UVERBS_ATTR_CONST_IN(BNXT_RE_TOGGLE_MEM_TYPE,
- enum bnxt_re_get_toggle_mem_type,
- UA_MANDATORY),
- UVERBS_ATTR_PTR_IN(BNXT_RE_TOGGLE_MEM_RES_ID,
- UVERBS_ATTR_TYPE(u32),
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_PAGE,
- UVERBS_ATTR_TYPE(u64),
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_OFFSET,
- UVERBS_ATTR_TYPE(u32),
- UA_MANDATORY),
- UVERBS_ATTR_PTR_OUT(BNXT_RE_TOGGLE_MEM_MMAP_LENGTH,
- UVERBS_ATTR_TYPE(u32),
- UA_MANDATORY));
-
-DECLARE_UVERBS_NAMED_METHOD_DESTROY(BNXT_RE_METHOD_RELEASE_TOGGLE_MEM,
- UVERBS_ATTR_IDR(BNXT_RE_RELEASE_TOGGLE_MEM_HANDLE,
- BNXT_RE_OBJECT_GET_TOGGLE_MEM,
- UVERBS_ACCESS_DESTROY,
- UA_MANDATORY));
-
-DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_GET_TOGGLE_MEM,
- UVERBS_TYPE_ALLOC_IDR(get_toggle_mem_obj_cleanup),
- &UVERBS_METHOD(BNXT_RE_METHOD_GET_TOGGLE_MEM),
- &UVERBS_METHOD(BNXT_RE_METHOD_RELEASE_TOGGLE_MEM));
-
-const struct uapi_definition bnxt_re_uapi_defs[] = {
- UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_ALLOC_PAGE),
- UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_NOTIFY_DRV),
- UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_GET_TOGGLE_MEM),
- {}
-};
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 76ba9ab04d5c..a11f56730a31 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -293,4 +293,7 @@ static inline u32 __to_ib_port_num(u16 port_id)
unsigned long bnxt_re_lock_cqs(struct bnxt_re_qp *qp);
void bnxt_re_unlock_cqs(struct bnxt_re_qp *qp, unsigned long flags);
+struct bnxt_re_user_mmap_entry*
+bnxt_re_mmap_entry_insert(struct bnxt_re_ucontext *uctx, u64 mem_offset,
+ enum bnxt_re_mmap_flag mmap_flag, u64 *offset);
#endif /* __BNXT_RE_IB_VERBS_H__ */
--
2.51.2.636.ga99f379adf
* [PATCH rdma-next v9 3/5] RDMA/bnxt_re: Refactor bnxt_qplib_create_qp() function
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 2/5] RDMA/bnxt_re: Move the UAPI methods to a dedicated file Sriharsha Basavapatna
@ 2026-01-27 10:31 ` Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs Sriharsha Basavapatna
4 siblings, 0 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Inside bnxt_qplib_create_qp(), the driver currently does many
things, such as allocating HWQ memory for the SQ/RQ/ORRQ/IRRQ and
initializing some of the qplib_qp fields.
Refactor the code so that all HWQ memory allocation is moved to
the bnxt_re_init_qp_attr() function, while bnxt_qplib_create_qp()
just initializes the request structure and issues the HWRM command
to the firmware.
Introduce two new functions, bnxt_re_setup_qp_hwqs() and
bnxt_re_setup_qp_swqs(), and move the HWQ and SWQ memory
allocation logic there.
This patch also stores the PD id directly in "struct bnxt_qplib_qp"
instead of keeping a pointer to "struct bnxt_qplib_pd". This is
needed for a subsequent change in this series: the PD id will be
used by the new DV implementation of create_qp(). There is no
functional change.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 205 ++++++++++++--
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 310 +++++++---------------
drivers/infiniband/hw/bnxt_re/qplib_fp.h | 10 +-
drivers/infiniband/hw/bnxt_re/qplib_res.h | 6 +
4 files changed, 301 insertions(+), 230 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index f758f92ba72b..0d95eaee3885 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -967,6 +967,12 @@ static void bnxt_re_del_unique_gid(struct bnxt_re_dev *rdev)
dev_err(rdev_to_dev(rdev), "Failed to delete unique GID, rc: %d\n", rc);
}
+static void bnxt_re_qp_free_umem(struct bnxt_re_qp *qp)
+{
+ ib_umem_release(qp->rumem);
+ ib_umem_release(qp->sumem);
+}
+
/* Queue Pairs */
int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
{
@@ -1009,8 +1015,7 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
if (qp->qplib_qp.type == CMDQ_CREATE_QP_TYPE_RAW_ETHERTYPE)
bnxt_re_del_unique_gid(rdev);
- ib_umem_release(qp->rumem);
- ib_umem_release(qp->sumem);
+ bnxt_re_qp_free_umem(qp);
/* Flush all the entries of notification queue associated with
* given qp.
@@ -1154,6 +1159,7 @@ static int bnxt_re_init_user_qp(struct bnxt_re_dev *rdev, struct bnxt_re_pd *pd,
}
qplib_qp->dpi = &cntx->dpi;
+ qplib_qp->is_user = true;
return 0;
rqfail:
ib_umem_release(qp->sumem);
@@ -1211,6 +1217,114 @@ static struct bnxt_re_ah *bnxt_re_create_shadow_qp_ah
return NULL;
}
+static int bnxt_re_qp_alloc_init_xrrq(struct bnxt_re_qp *qp)
+{
+ struct bnxt_qplib_res *res = &qp->rdev->qplib_res;
+ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+ struct bnxt_qplib_hwq_attr hwq_attr = {};
+ struct bnxt_qplib_sg_info sginfo = {};
+ struct bnxt_qplib_hwq *irrq, *orrq;
+ int rc, req_size;
+
+ orrq = &qplib_qp->orrq;
+ orrq->max_elements =
+ ORD_LIMIT_TO_ORRQ_SLOTS(qplib_qp->max_rd_atomic);
+ req_size = orrq->max_elements *
+ BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE + PAGE_SIZE - 1;
+ req_size &= ~(PAGE_SIZE - 1);
+ sginfo.pgsize = req_size;
+ sginfo.pgshft = PAGE_SHIFT;
+
+ hwq_attr.res = res;
+ hwq_attr.sginfo = &sginfo;
+ hwq_attr.depth = orrq->max_elements;
+ hwq_attr.stride = BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE;
+ hwq_attr.aux_stride = 0;
+ hwq_attr.aux_depth = 0;
+ hwq_attr.type = HWQ_TYPE_CTX;
+ rc = bnxt_qplib_alloc_init_hwq(orrq, &hwq_attr);
+ if (rc)
+ return rc;
+
+ irrq = &qplib_qp->irrq;
+ irrq->max_elements =
+ IRD_LIMIT_TO_IRRQ_SLOTS(qplib_qp->max_dest_rd_atomic);
+ req_size = irrq->max_elements *
+ BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE + PAGE_SIZE - 1;
+ req_size &= ~(PAGE_SIZE - 1);
+ sginfo.pgsize = req_size;
+ hwq_attr.sginfo = &sginfo;
+ hwq_attr.depth = irrq->max_elements;
+ hwq_attr.stride = BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE;
+ rc = bnxt_qplib_alloc_init_hwq(irrq, &hwq_attr);
+ if (rc)
+ goto free_orrq_hwq;
+ return 0;
+free_orrq_hwq:
+ bnxt_qplib_free_hwq(res, orrq);
+ return rc;
+}
+
+static int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp)
+{
+ struct bnxt_qplib_res *res = &qp->rdev->qplib_res;
+ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+ struct bnxt_qplib_hwq_attr hwq_attr = {};
+ struct bnxt_qplib_q *sq = &qplib_qp->sq;
+ struct bnxt_qplib_q *rq = &qplib_qp->rq;
+ u8 wqe_mode = qplib_qp->wqe_mode;
+ u8 pg_sz_lvl;
+ int rc;
+
+ hwq_attr.res = res;
+ hwq_attr.sginfo = &sq->sg_info;
+ hwq_attr.stride = bnxt_qplib_get_stride();
+ hwq_attr.depth = bnxt_qplib_get_depth(sq, wqe_mode, true);
+ hwq_attr.aux_stride = qplib_qp->psn_sz;
+ hwq_attr.aux_depth = (qplib_qp->psn_sz) ?
+ bnxt_qplib_set_sq_size(sq, wqe_mode) : 0;
+ if (qplib_qp->is_host_msn_tbl && qplib_qp->psn_sz)
+ hwq_attr.aux_depth = qplib_qp->msn_tbl_sz;
+ hwq_attr.type = HWQ_TYPE_QUEUE;
+ rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr);
+ if (rc)
+ return rc;
+
+ pg_sz_lvl = bnxt_qplib_base_pg_size(&sq->hwq) << CMDQ_CREATE_QP_SQ_PG_SIZE_SFT;
+ pg_sz_lvl |= ((sq->hwq.level & CMDQ_CREATE_QP_SQ_LVL_MASK) <<
+ CMDQ_CREATE_QP_SQ_LVL_SFT);
+ sq->hwq.pg_sz_lvl = pg_sz_lvl;
+
+ hwq_attr.res = res;
+ hwq_attr.sginfo = &rq->sg_info;
+ hwq_attr.stride = bnxt_qplib_get_stride();
+ hwq_attr.depth = bnxt_qplib_get_depth(rq, qplib_qp->wqe_mode, false);
+ hwq_attr.aux_stride = 0;
+ hwq_attr.aux_depth = 0;
+ hwq_attr.type = HWQ_TYPE_QUEUE;
+ rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr);
+ if (rc)
+ goto free_sq_hwq;
+ pg_sz_lvl = bnxt_qplib_base_pg_size(&rq->hwq) <<
+ CMDQ_CREATE_QP_RQ_PG_SIZE_SFT;
+ pg_sz_lvl |= ((rq->hwq.level & CMDQ_CREATE_QP_RQ_LVL_MASK) <<
+ CMDQ_CREATE_QP_RQ_LVL_SFT);
+ rq->hwq.pg_sz_lvl = pg_sz_lvl;
+
+ if (qplib_qp->psn_sz) {
+ rc = bnxt_re_qp_alloc_init_xrrq(qp);
+ if (rc)
+ goto free_rq_hwq;
+ }
+
+ return 0;
+free_rq_hwq:
+ bnxt_qplib_free_hwq(res, &rq->hwq);
+free_sq_hwq:
+ bnxt_qplib_free_hwq(res, &sq->hwq);
+ return rc;
+}
+
static struct bnxt_re_qp *bnxt_re_create_shadow_qp
(struct bnxt_re_pd *pd,
struct bnxt_qplib_res *qp1_res,
@@ -1229,9 +1343,10 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
/* Initialize the shadow QP structure from the QP1 values */
ether_addr_copy(qp->qplib_qp.smac, rdev->netdev->dev_addr);
- qp->qplib_qp.pd = &pd->qplib_pd;
+ qp->qplib_qp.pd_id = pd->qplib_pd.id;
qp->qplib_qp.qp_handle = (u64)(unsigned long)(&qp->qplib_qp);
qp->qplib_qp.type = IB_QPT_UD;
+ qp->qplib_qp.cctx = rdev->chip_ctx;
qp->qplib_qp.max_inline_data = 0;
qp->qplib_qp.sig_type = true;
@@ -1264,10 +1379,14 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
qp->qplib_qp.rq_hdr_buf_size = BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6;
qp->qplib_qp.dpi = &rdev->dpi_privileged;
- rc = bnxt_qplib_create_qp(qp1_res, &qp->qplib_qp);
+ rc = bnxt_re_setup_qp_hwqs(qp);
if (rc)
goto fail;
+ rc = bnxt_qplib_create_qp(qp1_res, &qp->qplib_qp);
+ if (rc)
+ goto free_hwq;
+
spin_lock_init(&qp->sq_lock);
INIT_LIST_HEAD(&qp->list);
mutex_lock(&rdev->qp_lock);
@@ -1275,6 +1394,9 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
atomic_inc(&rdev->stats.res.qp_count);
mutex_unlock(&rdev->qp_lock);
return qp;
+
+free_hwq:
+ bnxt_qplib_free_qp_res(&rdev->qplib_res, &qp->qplib_qp);
fail:
kfree(qp);
return NULL;
@@ -1445,6 +1567,39 @@ static int bnxt_re_init_qp_type(struct bnxt_re_dev *rdev,
return qptype;
}
+static void bnxt_re_qp_calculate_msn_psn_size(struct bnxt_re_qp *qp)
+{
+ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+ struct bnxt_qplib_q *sq = &qplib_qp->sq;
+ struct bnxt_re_dev *rdev = qp->rdev;
+ u8 wqe_mode = qplib_qp->wqe_mode;
+
+ if (rdev->dev_attr)
+ qplib_qp->is_host_msn_tbl =
+ _is_host_msn_table(rdev->dev_attr->dev_cap_flags2);
+
+ if (qplib_qp->type == CMDQ_CREATE_QP_TYPE_RC) {
+ qplib_qp->psn_sz = bnxt_qplib_is_chip_gen_p5_p7(rdev->chip_ctx) ?
+ sizeof(struct sq_psn_search_ext) :
+ sizeof(struct sq_psn_search);
+ if (qplib_qp->is_host_msn_tbl) {
+ qplib_qp->psn_sz = sizeof(struct sq_msn_search);
+ qplib_qp->msn = 0;
+ }
+ }
+
+ /* Update msn tbl size */
+ if (qplib_qp->is_host_msn_tbl && qplib_qp->psn_sz) {
+ if (wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC)
+ qplib_qp->msn_tbl_sz =
+ roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, wqe_mode));
+ else
+ qplib_qp->msn_tbl_sz =
+ roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, wqe_mode)) / 2;
+ qplib_qp->msn = 0;
+ }
+}
+
static int bnxt_re_init_qp_attr(struct bnxt_re_qp *qp, struct bnxt_re_pd *pd,
struct ib_qp_init_attr *init_attr,
struct bnxt_re_ucontext *uctx,
@@ -1462,17 +1617,17 @@ static int bnxt_re_init_qp_attr(struct bnxt_re_qp *qp, struct bnxt_re_pd *pd,
/* Setup misc params */
ether_addr_copy(qplqp->smac, rdev->netdev->dev_addr);
- qplqp->pd = &pd->qplib_pd;
+ qplqp->pd_id = pd->qplib_pd.id;
qplqp->qp_handle = (u64)qplqp;
qplqp->max_inline_data = init_attr->cap.max_inline_data;
qplqp->sig_type = init_attr->sq_sig_type == IB_SIGNAL_ALL_WR;
qptype = bnxt_re_init_qp_type(rdev, init_attr);
- if (qptype < 0) {
- rc = qptype;
- goto out;
- }
+ if (qptype < 0)
+ return qptype;
qplqp->type = (u8)qptype;
qplqp->wqe_mode = bnxt_re_is_var_size_supported(rdev, uctx);
+ qplqp->dev_cap_flags = dev_attr->dev_cap_flags;
+ qplqp->cctx = rdev->chip_ctx;
if (init_attr->qp_type == IB_QPT_RC) {
qplqp->max_rd_atomic = dev_attr->max_qp_rd_atom;
qplqp->max_dest_rd_atomic = dev_attr->max_qp_init_rd_atom;
@@ -1502,20 +1657,32 @@ static int bnxt_re_init_qp_attr(struct bnxt_re_qp *qp, struct bnxt_re_pd *pd,
/* Setup RQ/SRQ */
rc = bnxt_re_init_rq_attr(qp, init_attr, uctx);
if (rc)
- goto out;
+ return rc;
if (init_attr->qp_type == IB_QPT_GSI)
bnxt_re_adjust_gsi_rq_attr(qp);
/* Setup SQ */
rc = bnxt_re_init_sq_attr(qp, init_attr, uctx, ureq);
if (rc)
- goto out;
+ return rc;
if (init_attr->qp_type == IB_QPT_GSI)
bnxt_re_adjust_gsi_sq_attr(qp, init_attr, uctx);
- if (uctx) /* This will update DPI and qp_handle */
+ if (uctx) { /* This will update DPI and qp_handle */
rc = bnxt_re_init_user_qp(rdev, pd, qp, uctx, ureq);
-out:
+ if (rc)
+ return rc;
+ }
+
+ bnxt_re_qp_calculate_msn_psn_size(qp);
+
+ rc = bnxt_re_setup_qp_hwqs(qp);
+ if (rc)
+ goto free_umem;
+
+ return 0;
+free_umem:
+ bnxt_re_qp_free_umem(qp);
return rc;
}
@@ -1573,6 +1740,7 @@ static int bnxt_re_create_gsi_qp(struct bnxt_re_qp *qp, struct bnxt_re_pd *pd,
rdev = qp->rdev;
qplqp = &qp->qplib_qp;
+ qplqp->cctx = rdev->chip_ctx;
qplqp->rq_hdr_buf_size = BNXT_QPLIB_MAX_QP1_RQ_HDR_SIZE_V2;
qplqp->sq_hdr_buf_size = BNXT_QPLIB_MAX_QP1_SQ_HDR_SIZE_V2;
@@ -1676,13 +1844,14 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
if (rc == -ENODEV)
goto qp_destroy;
if (rc)
- goto fail;
+ goto free_hwq;
} else {
rc = bnxt_qplib_create_qp(&rdev->qplib_res, &qp->qplib_qp);
if (rc) {
ibdev_err(&rdev->ibdev, "Failed to create HW QP");
- goto free_umem;
+ goto free_hwq;
}
+
if (udata) {
struct bnxt_re_qp_resp resp;
@@ -1733,9 +1902,9 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
return 0;
qp_destroy:
bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
-free_umem:
- ib_umem_release(qp->rumem);
- ib_umem_release(qp->sumem);
+free_hwq:
+ bnxt_qplib_free_qp_res(&rdev->qplib_res, &qp->qplib_qp);
+ bnxt_re_qp_free_umem(qp);
fail:
return rc;
}
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index c88f049136fc..43535c6ec70b 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -793,8 +793,6 @@ int bnxt_qplib_post_srq_recv(struct bnxt_qplib_srq *srq,
return 0;
}
-/* QP */
-
static int bnxt_qplib_alloc_init_swq(struct bnxt_qplib_q *que)
{
int indx;
@@ -813,9 +811,71 @@ static int bnxt_qplib_alloc_init_swq(struct bnxt_qplib_q *que)
return 0;
}
+static int bnxt_re_setup_qp_swqs(struct bnxt_qplib_qp *qplqp)
+{
+ struct bnxt_qplib_q *sq = &qplqp->sq;
+ struct bnxt_qplib_q *rq = &qplqp->rq;
+ int rc;
+
+ if (qplqp->is_user)
+ return 0;
+
+ rc = bnxt_qplib_alloc_init_swq(sq);
+ if (rc)
+ return rc;
+
+ if (!qplqp->srq) {
+ rc = bnxt_qplib_alloc_init_swq(rq);
+ if (rc)
+ goto free_sq_swq;
+ }
+
+ return 0;
+free_sq_swq:
+ kfree(sq->swq);
+ return rc;
+}
+
+static void bnxt_qp_init_dbinfo(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
+{
+ struct bnxt_qplib_q *sq = &qp->sq;
+ struct bnxt_qplib_q *rq = &qp->rq;
+
+ sq->dbinfo.hwq = &sq->hwq;
+ sq->dbinfo.xid = qp->id;
+ sq->dbinfo.db = qp->dpi->dbr;
+ sq->dbinfo.max_slot = bnxt_qplib_set_sq_max_slot(qp->wqe_mode);
+ sq->dbinfo.flags = 0;
+ if (rq->max_wqe) {
+ rq->dbinfo.hwq = &rq->hwq;
+ rq->dbinfo.xid = qp->id;
+ rq->dbinfo.db = qp->dpi->dbr;
+ rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size);
+ rq->dbinfo.flags = 0;
+ }
+}
+
+static void bnxt_qplib_init_psn_ptr(struct bnxt_qplib_qp *qp, int size)
+{
+ struct bnxt_qplib_hwq *sq_hwq;
+ struct bnxt_qplib_q *sq;
+ u64 fpsne, psn_pg;
+ u16 indx_pad = 0;
+
+ sq = &qp->sq;
+ sq_hwq = &sq->hwq;
+ /* First psn entry */
+ fpsne = (u64)bnxt_qplib_get_qe(sq_hwq, sq_hwq->depth, &psn_pg);
+ if (!IS_ALIGNED(fpsne, PAGE_SIZE))
+ indx_pad = (fpsne & ~PAGE_MASK) / size;
+ sq_hwq->pad_pgofft = indx_pad;
+ sq_hwq->pad_pg = (u64 *)psn_pg;
+ sq_hwq->pad_stride = size;
+}
+
+/* QP */
int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
{
- struct bnxt_qplib_hwq_attr hwq_attr = {};
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
struct creq_create_qp1_resp resp = {};
struct bnxt_qplib_cmdqmsg msg = {};
@@ -824,7 +884,6 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
struct cmdq_create_qp1 req = {};
struct bnxt_qplib_pbl *pbl;
u32 qp_flags = 0;
- u8 pg_sz_lvl;
u32 tbl_indx;
int rc;
@@ -838,26 +897,12 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
req.qp_handle = cpu_to_le64(qp->qp_handle);
/* SQ */
- hwq_attr.res = res;
- hwq_attr.sginfo = &sq->sg_info;
- hwq_attr.stride = sizeof(struct sq_sge);
- hwq_attr.depth = bnxt_qplib_get_depth(sq, qp->wqe_mode, false);
- hwq_attr.type = HWQ_TYPE_QUEUE;
- rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr);
- if (rc)
- return rc;
-
- rc = bnxt_qplib_alloc_init_swq(sq);
- if (rc)
- goto fail_sq;
+ sq->max_sw_wqe = bnxt_qplib_get_depth(sq, qp->wqe_mode, true);
+ req.sq_size = cpu_to_le32(sq->max_sw_wqe);
+ req.sq_pg_size_sq_lvl = sq->hwq.pg_sz_lvl;
- req.sq_size = cpu_to_le32(bnxt_qplib_set_sq_size(sq, qp->wqe_mode));
pbl = &sq->hwq.pbl[PBL_LVL_0];
req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
- pg_sz_lvl = (bnxt_qplib_base_pg_size(&sq->hwq) <<
- CMDQ_CREATE_QP1_SQ_PG_SIZE_SFT);
- pg_sz_lvl |= (sq->hwq.level & CMDQ_CREATE_QP1_SQ_LVL_MASK);
- req.sq_pg_size_sq_lvl = pg_sz_lvl;
req.sq_fwo_sq_sge =
cpu_to_le16((sq->max_sge & CMDQ_CREATE_QP1_SQ_SGE_MASK) <<
CMDQ_CREATE_QP1_SQ_SGE_SFT);
@@ -866,24 +911,10 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
/* RQ */
if (rq->max_wqe) {
rq->dbinfo.flags = 0;
- hwq_attr.res = res;
- hwq_attr.sginfo = &rq->sg_info;
- hwq_attr.stride = sizeof(struct sq_sge);
- hwq_attr.depth = bnxt_qplib_get_depth(rq, qp->wqe_mode, false);
- hwq_attr.type = HWQ_TYPE_QUEUE;
- rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr);
- if (rc)
- goto sq_swq;
- rc = bnxt_qplib_alloc_init_swq(rq);
- if (rc)
- goto fail_rq;
req.rq_size = cpu_to_le32(rq->max_wqe);
pbl = &rq->hwq.pbl[PBL_LVL_0];
req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
- pg_sz_lvl = (bnxt_qplib_base_pg_size(&rq->hwq) <<
- CMDQ_CREATE_QP1_RQ_PG_SIZE_SFT);
- pg_sz_lvl |= (rq->hwq.level & CMDQ_CREATE_QP1_RQ_LVL_MASK);
- req.rq_pg_size_rq_lvl = pg_sz_lvl;
+ req.rq_pg_size_rq_lvl = rq->hwq.pg_sz_lvl;
req.rq_fwo_rq_sge =
cpu_to_le16((rq->max_sge &
CMDQ_CREATE_QP1_RQ_SGE_MASK) <<
@@ -894,11 +925,11 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
rc = bnxt_qplib_alloc_qp_hdr_buf(res, qp);
if (rc) {
rc = -ENOMEM;
- goto rq_rwq;
+ return rc;
}
qp_flags |= CMDQ_CREATE_QP1_QP_FLAGS_RESERVED_LKEY_ENABLE;
req.qp_flags = cpu_to_le32(qp_flags);
- req.pd_id = cpu_to_le32(qp->pd->id);
+ req.pd_id = cpu_to_le32(qp->pd_id);
bnxt_qplib_fill_cmdqmsg(&msg, &req, &resp, NULL, sizeof(req), sizeof(resp), 0);
rc = bnxt_qplib_rcfw_send_message(rcfw, &msg);
@@ -907,73 +938,39 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
qp->id = le32_to_cpu(resp.xid);
qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
- qp->cctx = res->cctx;
- sq->dbinfo.hwq = &sq->hwq;
- sq->dbinfo.xid = qp->id;
- sq->dbinfo.db = qp->dpi->dbr;
- sq->dbinfo.max_slot = bnxt_qplib_set_sq_max_slot(qp->wqe_mode);
- if (rq->max_wqe) {
- rq->dbinfo.hwq = &rq->hwq;
- rq->dbinfo.xid = qp->id;
- rq->dbinfo.db = qp->dpi->dbr;
- rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size);
- }
+
+ rc = bnxt_re_setup_qp_swqs(qp);
+ if (rc)
+ goto destroy_qp;
+ bnxt_qp_init_dbinfo(res, qp);
+
tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
rcfw->qp_tbl[tbl_indx].qp_handle = (void *)qp;
return 0;
+destroy_qp:
+ bnxt_qplib_destroy_qp(res, qp);
fail:
bnxt_qplib_free_qp_hdr_buf(res, qp);
-rq_rwq:
- kfree(rq->swq);
-fail_rq:
- bnxt_qplib_free_hwq(res, &rq->hwq);
-sq_swq:
- kfree(sq->swq);
-fail_sq:
- bnxt_qplib_free_hwq(res, &sq->hwq);
return rc;
}
-static void bnxt_qplib_init_psn_ptr(struct bnxt_qplib_qp *qp, int size)
-{
- struct bnxt_qplib_hwq *hwq;
- struct bnxt_qplib_q *sq;
- u64 fpsne, psn_pg;
- u16 indx_pad = 0;
-
- sq = &qp->sq;
- hwq = &sq->hwq;
- /* First psn entry */
- fpsne = (u64)bnxt_qplib_get_qe(hwq, hwq->depth, &psn_pg);
- if (!IS_ALIGNED(fpsne, PAGE_SIZE))
- indx_pad = (fpsne & ~PAGE_MASK) / size;
- hwq->pad_pgofft = indx_pad;
- hwq->pad_pg = (u64 *)psn_pg;
- hwq->pad_stride = size;
-}
-
int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
{
struct bnxt_qplib_rcfw *rcfw = res->rcfw;
- struct bnxt_qplib_hwq_attr hwq_attr = {};
- struct bnxt_qplib_sg_info sginfo = {};
struct creq_create_qp_resp resp = {};
struct bnxt_qplib_cmdqmsg msg = {};
struct bnxt_qplib_q *sq = &qp->sq;
struct bnxt_qplib_q *rq = &qp->rq;
struct cmdq_create_qp req = {};
- int rc, req_size, psn_sz = 0;
- struct bnxt_qplib_hwq *xrrq;
struct bnxt_qplib_pbl *pbl;
u32 qp_flags = 0;
- u8 pg_sz_lvl;
u32 tbl_indx;
u16 nsge;
+ int rc;
- qp->is_host_msn_tbl = _is_host_msn_table(res->dattr->dev_cap_flags2);
sq->dbinfo.flags = 0;
bnxt_qplib_rcfw_cmd_prep((struct cmdq_base *)&req,
CMDQ_BASE_OPCODE_CREATE_QP,
@@ -985,56 +982,10 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
req.qp_handle = cpu_to_le64(qp->qp_handle);
/* SQ */
- if (qp->type == CMDQ_CREATE_QP_TYPE_RC) {
- psn_sz = bnxt_qplib_is_chip_gen_p5_p7(res->cctx) ?
- sizeof(struct sq_psn_search_ext) :
- sizeof(struct sq_psn_search);
-
- if (qp->is_host_msn_tbl) {
- psn_sz = sizeof(struct sq_msn_search);
- qp->msn = 0;
- }
- }
-
- hwq_attr.res = res;
- hwq_attr.sginfo = &sq->sg_info;
- hwq_attr.stride = sizeof(struct sq_sge);
- hwq_attr.depth = bnxt_qplib_get_depth(sq, qp->wqe_mode, true);
- hwq_attr.aux_stride = psn_sz;
- hwq_attr.aux_depth = psn_sz ? bnxt_qplib_set_sq_size(sq, qp->wqe_mode)
- : 0;
- /* Update msn tbl size */
- if (qp->is_host_msn_tbl && psn_sz) {
- if (qp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC)
- hwq_attr.aux_depth =
- roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode));
- else
- hwq_attr.aux_depth =
- roundup_pow_of_two(bnxt_qplib_set_sq_size(sq, qp->wqe_mode)) / 2;
- qp->msn_tbl_sz = hwq_attr.aux_depth;
- qp->msn = 0;
- }
-
- hwq_attr.type = HWQ_TYPE_QUEUE;
- rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr);
- if (rc)
- return rc;
-
- if (!sq->hwq.is_user) {
- rc = bnxt_qplib_alloc_init_swq(sq);
- if (rc)
- goto fail_sq;
-
- if (psn_sz)
- bnxt_qplib_init_psn_ptr(qp, psn_sz);
- }
- req.sq_size = cpu_to_le32(bnxt_qplib_set_sq_size(sq, qp->wqe_mode));
+ req.sq_size = cpu_to_le32(sq->max_sw_wqe);
pbl = &sq->hwq.pbl[PBL_LVL_0];
req.sq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
- pg_sz_lvl = (bnxt_qplib_base_pg_size(&sq->hwq) <<
- CMDQ_CREATE_QP_SQ_PG_SIZE_SFT);
- pg_sz_lvl |= (sq->hwq.level & CMDQ_CREATE_QP_SQ_LVL_MASK);
- req.sq_pg_size_sq_lvl = pg_sz_lvl;
+ req.sq_pg_size_sq_lvl = sq->hwq.pg_sz_lvl;
req.sq_fwo_sq_sge =
cpu_to_le16(((sq->max_sge & CMDQ_CREATE_QP_SQ_SGE_MASK) <<
CMDQ_CREATE_QP_SQ_SGE_SFT) | 0);
@@ -1043,29 +994,10 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
/* RQ */
if (!qp->srq) {
rq->dbinfo.flags = 0;
- hwq_attr.res = res;
- hwq_attr.sginfo = &rq->sg_info;
- hwq_attr.stride = sizeof(struct sq_sge);
- hwq_attr.depth = bnxt_qplib_get_depth(rq, qp->wqe_mode, false);
- hwq_attr.aux_stride = 0;
- hwq_attr.aux_depth = 0;
- hwq_attr.type = HWQ_TYPE_QUEUE;
- rc = bnxt_qplib_alloc_init_hwq(&rq->hwq, &hwq_attr);
- if (rc)
- goto sq_swq;
- if (!rq->hwq.is_user) {
- rc = bnxt_qplib_alloc_init_swq(rq);
- if (rc)
- goto fail_rq;
- }
-
req.rq_size = cpu_to_le32(rq->max_wqe);
pbl = &rq->hwq.pbl[PBL_LVL_0];
req.rq_pbl = cpu_to_le64(pbl->pg_map_arr[0]);
- pg_sz_lvl = (bnxt_qplib_base_pg_size(&rq->hwq) <<
- CMDQ_CREATE_QP_RQ_PG_SIZE_SFT);
- pg_sz_lvl |= (rq->hwq.level & CMDQ_CREATE_QP_RQ_LVL_MASK);
- req.rq_pg_size_rq_lvl = pg_sz_lvl;
+ req.rq_pg_size_rq_lvl = rq->hwq.pg_sz_lvl;
nsge = (qp->wqe_mode == BNXT_QPLIB_WQE_MODE_STATIC) ?
6 : rq->max_sge;
req.rq_fwo_rq_sge =
@@ -1091,68 +1023,34 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
req.qp_flags = cpu_to_le32(qp_flags);
/* ORRQ and IRRQ */
- if (psn_sz) {
- xrrq = &qp->orrq;
- xrrq->max_elements =
- ORD_LIMIT_TO_ORRQ_SLOTS(qp->max_rd_atomic);
- req_size = xrrq->max_elements *
- BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE + PAGE_SIZE - 1;
- req_size &= ~(PAGE_SIZE - 1);
- sginfo.pgsize = req_size;
- sginfo.pgshft = PAGE_SHIFT;
-
- hwq_attr.res = res;
- hwq_attr.sginfo = &sginfo;
- hwq_attr.depth = xrrq->max_elements;
- hwq_attr.stride = BNXT_QPLIB_MAX_ORRQE_ENTRY_SIZE;
- hwq_attr.aux_stride = 0;
- hwq_attr.aux_depth = 0;
- hwq_attr.type = HWQ_TYPE_CTX;
- rc = bnxt_qplib_alloc_init_hwq(xrrq, &hwq_attr);
- if (rc)
- goto rq_swq;
- pbl = &xrrq->pbl[PBL_LVL_0];
- req.orrq_addr = cpu_to_le64(pbl->pg_map_arr[0]);
-
- xrrq = &qp->irrq;
- xrrq->max_elements = IRD_LIMIT_TO_IRRQ_SLOTS(
- qp->max_dest_rd_atomic);
- req_size = xrrq->max_elements *
- BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE + PAGE_SIZE - 1;
- req_size &= ~(PAGE_SIZE - 1);
- sginfo.pgsize = req_size;
- hwq_attr.depth = xrrq->max_elements;
- hwq_attr.stride = BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE;
- rc = bnxt_qplib_alloc_init_hwq(xrrq, &hwq_attr);
- if (rc)
- goto fail_orrq;
-
- pbl = &xrrq->pbl[PBL_LVL_0];
- req.irrq_addr = cpu_to_le64(pbl->pg_map_arr[0]);
+ if (qp->psn_sz) {
+ req.orrq_addr = cpu_to_le64(bnxt_qplib_get_base_addr(&qp->orrq));
+ req.irrq_addr = cpu_to_le64(bnxt_qplib_get_base_addr(&qp->irrq));
}
- req.pd_id = cpu_to_le32(qp->pd->id);
+
+ req.pd_id = cpu_to_le32(qp->pd_id);
bnxt_qplib_fill_cmdqmsg(&msg, &req, &resp, NULL, sizeof(req),
sizeof(resp), 0);
rc = bnxt_qplib_rcfw_send_message(rcfw, &msg);
if (rc)
- goto fail;
+ return rc;
qp->id = le32_to_cpu(resp.xid);
+
+ if (!qp->is_user) {
+ rc = bnxt_re_setup_qp_swqs(qp);
+ if (rc)
+ goto destroy_qp;
+ }
+ bnxt_qp_init_dbinfo(res, qp);
+ if (qp->psn_sz)
+ bnxt_qplib_init_psn_ptr(qp, qp->psn_sz);
+
qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
INIT_LIST_HEAD(&qp->sq_flush);
INIT_LIST_HEAD(&qp->rq_flush);
qp->cctx = res->cctx;
- sq->dbinfo.hwq = &sq->hwq;
- sq->dbinfo.xid = qp->id;
- sq->dbinfo.db = qp->dpi->dbr;
- sq->dbinfo.max_slot = bnxt_qplib_set_sq_max_slot(qp->wqe_mode);
- if (rq->max_wqe) {
- rq->dbinfo.hwq = &rq->hwq;
- rq->dbinfo.xid = qp->id;
- rq->dbinfo.db = qp->dpi->dbr;
- rq->dbinfo.max_slot = bnxt_qplib_set_rq_max_slot(rq->wqe_size);
- }
spin_lock_bh(&rcfw->tbl_lock);
tbl_indx = map_qp_id_to_tbl_indx(qp->id, rcfw);
rcfw->qp_tbl[tbl_indx].qp_id = qp->id;
@@ -1160,18 +1058,8 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
spin_unlock_bh(&rcfw->tbl_lock);
return 0;
-fail:
- bnxt_qplib_free_hwq(res, &qp->irrq);
-fail_orrq:
- bnxt_qplib_free_hwq(res, &qp->orrq);
-rq_swq:
- kfree(rq->swq);
-fail_rq:
- bnxt_qplib_free_hwq(res, &rq->hwq);
-sq_swq:
- kfree(sq->swq);
-fail_sq:
- bnxt_qplib_free_hwq(res, &sq->hwq);
+destroy_qp:
+ bnxt_qplib_destroy_qp(res, qp);
return rc;
}
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index 1b414a73b46d..c862fa7ba499 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -268,7 +268,7 @@ struct bnxt_qplib_q {
};
struct bnxt_qplib_qp {
- struct bnxt_qplib_pd *pd;
+ u32 pd_id;
struct bnxt_qplib_dpi *dpi;
struct bnxt_qplib_chip_ctx *cctx;
u64 qp_handle;
@@ -279,6 +279,7 @@ struct bnxt_qplib_qp {
u8 wqe_mode;
u8 state;
u8 cur_qp_state;
+ u8 is_user;
u64 modify_flags;
u32 max_inline_data;
u32 mtu;
@@ -343,9 +344,11 @@ struct bnxt_qplib_qp {
struct list_head rq_flush;
u32 msn;
u32 msn_tbl_sz;
+ u32 psn_sz;
bool is_host_msn_tbl;
u8 tos_dscp;
u32 ugid_index;
+ u16 dev_cap_flags;
};
#define BNXT_RE_MAX_MSG_SIZE 0x80000000
@@ -614,6 +617,11 @@ static inline void bnxt_qplib_swq_mod_start(struct bnxt_qplib_q *que, u32 idx)
que->swq_start = que->swq[idx].next_idx;
}
+static inline u32 bnxt_qplib_get_stride(void)
+{
+ return sizeof(struct sq_sge);
+}
+
static inline u32 bnxt_qplib_get_depth(struct bnxt_qplib_q *que, u8 wqe_mode, bool is_sq)
{
u32 slots;
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index 2ea3b7f232a3..ccdab938d707 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -198,6 +198,7 @@ struct bnxt_qplib_hwq {
u32 cons; /* raw */
u8 cp_bit;
u8 is_user;
+ u8 pg_sz_lvl;
u64 *pad_pg;
u32 pad_stride;
u32 pad_pgofft;
@@ -358,6 +359,11 @@ static inline u8 bnxt_qplib_get_ring_type(struct bnxt_qplib_chip_ctx *cctx)
RING_ALLOC_REQ_RING_TYPE_ROCE_CMPL;
}
+static inline u64 bnxt_qplib_get_base_addr(struct bnxt_qplib_hwq *hwq)
+{
+ return hwq->pbl[PBL_LVL_0].pg_map_arr[0];
+}
+
static inline u8 bnxt_qplib_base_pg_size(struct bnxt_qplib_hwq *hwq)
{
u8 pg_size = BNXT_QPLIB_HWRM_PG_SIZE_4K;
--
2.51.2.636.ga99f379adf
* [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
` (2 preceding siblings ...)
2026-01-27 10:31 ` [PATCH rdma-next v9 3/5] RDMA/bnxt_re: Refactor bnxt_qplib_create_qp() function Sriharsha Basavapatna
@ 2026-01-27 10:31 ` Sriharsha Basavapatna
2026-01-27 12:30 ` Jiri Pirko
2026-01-28 15:33 ` Jason Gunthorpe
2026-01-27 10:31 ` [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs Sriharsha Basavapatna
4 siblings, 2 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
The following Direct Verbs (DV) methods have been implemented in
this patch.
Doorbell Region Direct Verbs:
-----------------------------
- BNXT_RE_METHOD_DBR_ALLOC:
Allow the application to create extra doorbell regions, use the
associated doorbell page index in DV_CREATE_QP, and use the
associated DB address when ringing the doorbell.
- BNXT_RE_METHOD_DBR_FREE:
Free the allocated doorbell region.
- BNXT_RE_METHOD_GET_DEFAULT_DBR:
Return the default doorbell page index and doorbell page address
associated with the ucontext.
Co-developed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 1 +
drivers/infiniband/hw/bnxt_re/dv.c | 130 ++++++++++++++++++++++
drivers/infiniband/hw/bnxt_re/ib_verbs.h | 7 ++
drivers/infiniband/hw/bnxt_re/qplib_res.c | 43 +++++++
drivers/infiniband/hw/bnxt_re/qplib_res.h | 4 +
include/uapi/rdma/bnxt_re-abi.h | 29 +++++
6 files changed, 214 insertions(+)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 3a7ce4729fcf..0999a42c678c 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -160,6 +160,7 @@ struct bnxt_re_nq_record {
#define MAX_CQ_HASH_BITS (16)
#define MAX_SRQ_HASH_BITS (16)
+#define MAX_DPI_HASH_BITS (16)
static inline bool bnxt_re_chip_gen_p7(u16 chip_num)
{
diff --git a/drivers/infiniband/hw/bnxt_re/dv.c b/drivers/infiniband/hw/bnxt_re/dv.c
index 5655c6176af4..db69f25e294f 100644
--- a/drivers/infiniband/hw/bnxt_re/dv.c
+++ b/drivers/infiniband/hw/bnxt_re/dv.c
@@ -331,9 +331,139 @@ DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_GET_TOGGLE_MEM,
&UVERBS_METHOD(BNXT_RE_METHOD_GET_TOGGLE_MEM),
&UVERBS_METHOD(BNXT_RE_METHOD_RELEASE_TOGGLE_MEM));
+static int UVERBS_HANDLER(BNXT_RE_METHOD_DBR_ALLOC)(struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_dv_db_region dbr = {};
+ struct bnxt_re_ucontext *uctx;
+ struct bnxt_re_dbr_obj *obj;
+ struct ib_ucontext *ib_uctx;
+ struct bnxt_qplib_dpi *dpi;
+ struct bnxt_re_dev *rdev;
+ struct ib_uobject *uobj;
+ u64 mmap_offset;
+ int ret;
+
+ ib_uctx = ib_uverbs_get_ucontext(attrs);
+ if (IS_ERR(ib_uctx))
+ return PTR_ERR(ib_uctx);
+
+ uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
+ rdev = uctx->rdev;
+ uobj = uverbs_attr_get_uobject(attrs, BNXT_RE_DV_ALLOC_DBR_HANDLE);
+
+ obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+ if (!obj)
+ return -ENOMEM;
+
+ dpi = &obj->dpi;
+ ret = bnxt_qplib_alloc_uc_dpi(&rdev->qplib_res, dpi);
+ if (ret)
+ goto free_mem;
+
+ obj->entry = bnxt_re_mmap_entry_insert(uctx, dpi->umdbr,
+ BNXT_RE_MMAP_UC_DB,
+ &mmap_offset);
+ if (!obj->entry) {
+ ret = -ENOMEM;
+ goto free_dpi;
+ }
+
+ obj->rdev = rdev;
+ uobj->object = obj;
+ uverbs_finalize_uobj_create(attrs, BNXT_RE_DV_ALLOC_DBR_HANDLE);
+
+ dbr.umdbr = dpi->umdbr;
+ dbr.dpi = dpi->dpi;
+ ret = uverbs_copy_to_struct_or_zero(attrs, BNXT_RE_DV_ALLOC_DBR_ATTR,
+ &dbr, sizeof(dbr));
+ if (ret)
+ return ret;
+
+ ret = uverbs_copy_to(attrs, BNXT_RE_DV_ALLOC_DBR_OFFSET,
+ &mmap_offset, sizeof(mmap_offset));
+ if (ret)
+ return ret;
+ return 0;
+free_dpi:
+ bnxt_qplib_free_uc_dpi(&rdev->qplib_res, dpi);
+free_mem:
+ kfree(obj);
+ return ret;
+}
+
+static int bnxt_re_dv_dbr_cleanup(struct ib_uobject *uobject,
+ enum rdma_remove_reason why,
+ struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_dbr_obj *obj = uobject->object;
+ struct bnxt_re_dev *rdev = obj->rdev;
+
+ rdma_user_mmap_entry_remove(&obj->entry->rdma_entry);
+ bnxt_qplib_free_uc_dpi(&rdev->qplib_res, &obj->dpi);
+ return 0;
+}
+
+static int UVERBS_HANDLER(BNXT_RE_METHOD_GET_DEFAULT_DBR)(struct uverbs_attr_bundle *attrs)
+{
+ struct bnxt_re_dv_db_region dpi = {};
+ struct bnxt_re_ucontext *uctx;
+ struct ib_ucontext *ib_uctx;
+ int ret;
+
+ ib_uctx = ib_uverbs_get_ucontext(attrs);
+ if (IS_ERR(ib_uctx))
+ return PTR_ERR(ib_uctx);
+
+ uctx = container_of(ib_uctx, struct bnxt_re_ucontext, ib_uctx);
+ dpi.umdbr = uctx->dpi.umdbr;
+ dpi.dpi = uctx->dpi.dpi;
+
+ ret = uverbs_copy_to_struct_or_zero(attrs, BNXT_RE_DV_DEFAULT_DBR_ATTR,
+ &dpi, sizeof(dpi));
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_DBR_ALLOC,
+ UVERBS_ATTR_IDR(BNXT_RE_DV_ALLOC_DBR_HANDLE,
+ BNXT_RE_OBJECT_DBR,
+ UVERBS_ACCESS_NEW,
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_DV_ALLOC_DBR_ATTR,
+ UVERBS_ATTR_STRUCT(struct bnxt_re_dv_db_region,
+ umdbr),
+ UA_MANDATORY),
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_DV_ALLOC_DBR_OFFSET,
+ UVERBS_ATTR_TYPE(u64),
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_METHOD_DESTROY(BNXT_RE_METHOD_DBR_FREE,
+ UVERBS_ATTR_IDR(BNXT_RE_DV_FREE_DBR_HANDLE,
+ BNXT_RE_OBJECT_DBR,
+ UVERBS_ACCESS_DESTROY,
+ UA_MANDATORY));
+
+DECLARE_UVERBS_NAMED_OBJECT(BNXT_RE_OBJECT_DBR,
+ UVERBS_TYPE_ALLOC_IDR(bnxt_re_dv_dbr_cleanup),
+ &UVERBS_METHOD(BNXT_RE_METHOD_DBR_ALLOC),
+ &UVERBS_METHOD(BNXT_RE_METHOD_DBR_FREE));
+
+DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR,
+ UVERBS_ATTR_PTR_OUT(BNXT_RE_DV_DEFAULT_DBR_ATTR,
+ UVERBS_ATTR_STRUCT(struct bnxt_re_dv_db_region,
+ umdbr),
+ UA_MANDATORY));
+
+DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_DEFAULT_DBR,
+ &UVERBS_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR));
+
const struct uapi_definition bnxt_re_uapi_defs[] = {
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_ALLOC_PAGE),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_NOTIFY_DRV),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_GET_TOGGLE_MEM),
+ UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_DBR),
+ UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_DEFAULT_DBR),
{}
};
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index a11f56730a31..33e0f66b39eb 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -164,6 +164,13 @@ struct bnxt_re_user_mmap_entry {
u8 mmap_flag;
};
+struct bnxt_re_dbr_obj {
+ struct bnxt_re_dev *rdev;
+ struct bnxt_qplib_dpi dpi;
+ struct bnxt_re_user_mmap_entry *entry;
+ atomic_t usecnt; /* QPs using this dbr */
+};
+
struct bnxt_re_flow {
struct ib_flow ib_flow;
struct bnxt_re_dev *rdev;
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
index 875d7b52c06a..30cc2d64a9ae 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
@@ -685,6 +685,49 @@ static int bnxt_qplib_alloc_pd_tbl(struct bnxt_qplib_res *res,
}
/* DPIs */
+int bnxt_qplib_alloc_uc_dpi(struct bnxt_qplib_res *res, struct bnxt_qplib_dpi *dpi)
+{
+ struct bnxt_qplib_dpi_tbl *dpit = &res->dpi_tbl;
+ struct bnxt_qplib_reg_desc *reg;
+ u32 bit_num;
+ int rc = 0;
+
+ reg = &dpit->wcreg;
+ mutex_lock(&res->dpi_tbl_lock);
+ bit_num = find_first_bit(dpit->tbl, dpit->max);
+ if (bit_num >= dpit->max) {
+ rc = -ENOMEM;
+ goto unlock;
+ }
+ /* Found unused DPI */
+ clear_bit(bit_num, dpit->tbl);
+ dpi->bit = bit_num;
+ dpi->dpi = bit_num + (reg->offset - dpit->ucreg.offset) / PAGE_SIZE;
+ dpi->umdbr = reg->bar_base + reg->offset + bit_num * PAGE_SIZE;
+unlock:
+ mutex_unlock(&res->dpi_tbl_lock);
+ return rc;
+}
+
+int bnxt_qplib_free_uc_dpi(struct bnxt_qplib_res *res, struct bnxt_qplib_dpi *dpi)
+{
+ struct bnxt_qplib_dpi_tbl *dpit = &res->dpi_tbl;
+ int rc = 0;
+
+ mutex_lock(&res->dpi_tbl_lock);
+ if (dpi->bit >= dpit->max) {
+ rc = -EINVAL;
+ goto unlock;
+ }
+
+ if (test_and_set_bit(dpi->bit, dpit->tbl))
+ rc = -EINVAL;
+ memset(dpi, 0, sizeof(*dpi));
+unlock:
+ mutex_unlock(&res->dpi_tbl_lock);
+ return rc;
+}
+
int bnxt_qplib_alloc_dpi(struct bnxt_qplib_res *res,
struct bnxt_qplib_dpi *dpi,
void *app, u8 type)
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index ccdab938d707..3a8162ef4c33 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -436,6 +436,10 @@ int bnxt_qplib_alloc_dpi(struct bnxt_qplib_res *res,
void *app, u8 type);
int bnxt_qplib_dealloc_dpi(struct bnxt_qplib_res *res,
struct bnxt_qplib_dpi *dpi);
+int bnxt_qplib_alloc_uc_dpi(struct bnxt_qplib_res *res,
+ struct bnxt_qplib_dpi *dpi);
+int bnxt_qplib_free_uc_dpi(struct bnxt_qplib_res *res,
+ struct bnxt_qplib_dpi *dpi);
void bnxt_qplib_cleanup_res(struct bnxt_qplib_res *res);
int bnxt_qplib_init_res(struct bnxt_qplib_res *res);
void bnxt_qplib_free_res(struct bnxt_qplib_res *res);
diff --git a/include/uapi/rdma/bnxt_re-abi.h b/include/uapi/rdma/bnxt_re-abi.h
index faa9d62b3b30..51f8614a7c4f 100644
--- a/include/uapi/rdma/bnxt_re-abi.h
+++ b/include/uapi/rdma/bnxt_re-abi.h
@@ -162,6 +162,8 @@ enum bnxt_re_objects {
BNXT_RE_OBJECT_ALLOC_PAGE = (1U << UVERBS_ID_NS_SHIFT),
BNXT_RE_OBJECT_NOTIFY_DRV,
BNXT_RE_OBJECT_GET_TOGGLE_MEM,
+ BNXT_RE_OBJECT_DBR,
+ BNXT_RE_OBJECT_DEFAULT_DBR,
};
enum bnxt_re_alloc_page_type {
@@ -215,4 +217,31 @@ enum bnxt_re_toggle_mem_methods {
BNXT_RE_METHOD_GET_TOGGLE_MEM = (1U << UVERBS_ID_NS_SHIFT),
BNXT_RE_METHOD_RELEASE_TOGGLE_MEM,
};
+
+struct bnxt_re_dv_db_region {
+ __u32 dpi;
+ __u32 reserved;
+ __aligned_u64 umdbr;
+};
+
+enum bnxt_re_obj_dbr_alloc_attrs {
+ BNXT_RE_DV_ALLOC_DBR_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+ BNXT_RE_DV_ALLOC_DBR_ATTR,
+ BNXT_RE_DV_ALLOC_DBR_OFFSET,
+};
+
+enum bnxt_re_obj_dbr_free_attrs {
+ BNXT_RE_DV_FREE_DBR_HANDLE = (1U << UVERBS_ID_NS_SHIFT),
+};
+
+enum bnxt_re_obj_default_dbr_attrs {
+ BNXT_RE_DV_DEFAULT_DBR_ATTR = (1U << UVERBS_ID_NS_SHIFT),
+};
+
+enum bnxt_re_obj_dpi_methods {
+ BNXT_RE_METHOD_DBR_ALLOC = (1U << UVERBS_ID_NS_SHIFT),
+ BNXT_RE_METHOD_DBR_FREE,
+ BNXT_RE_METHOD_GET_DEFAULT_DBR,
+};
+
#endif /* __BNXT_RE_UVERBS_ABI_H__*/
--
2.51.2.636.ga99f379adf
* [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
` (3 preceding siblings ...)
2026-01-27 10:31 ` [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs Sriharsha Basavapatna
@ 2026-01-27 10:31 ` Sriharsha Basavapatna
2026-01-28 15:32 ` Jason Gunthorpe
2026-01-28 15:46 ` Jason Gunthorpe
4 siblings, 2 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 10:31 UTC (permalink / raw)
To: leon, jgg
Cc: linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
The following Direct Verbs have been implemented by enhancing the
driver-specific udata in the existing verbs.
CQ Direct Verbs:
----------------
- CREATE_CQ:
Create a CQ using the specified udata (struct bnxt_re_cq_req).
The driver registers a new device op 'create_cq_umem' that is
used to process CQ memory allocated by the userspace application.
The driver maps/pins the CQ user memory and registers it with the
hardware.
- DESTROY_CQ:
Unmap the user memory and destroy the CQ.
QP Direct Verbs:
----------------
- CREATE_QP:
Create a QP using the specified udata (struct bnxt_re_qp_req).
The driver registers a new device op 'create_qp_umem' that is
used to process QP memory allocated by the userspace application.
The driver maps/pins the SQ/RQ user memory and registers it
with the hardware.
- DESTROY_QP:
Unmap SQ/RQ user memory and destroy the QP.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-developed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Co-developed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
drivers/infiniband/hw/bnxt_re/bnxt_re.h | 5 +
drivers/infiniband/hw/bnxt_re/dv.c | 447 +++++++++++++++++++++++
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 171 ++++++---
drivers/infiniband/hw/bnxt_re/ib_verbs.h | 20 +
drivers/infiniband/hw/bnxt_re/main.c | 2 +
include/uapi/rdma/bnxt_re-abi.h | 15 +
6 files changed, 613 insertions(+), 47 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 0999a42c678c..f28acde3a274 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -234,6 +234,8 @@ struct bnxt_re_dev {
union ib_gid ugid;
u32 ugid_index;
u8 sniffer_flow_created : 1;
+ atomic_t dv_cq_count;
+ atomic_t dv_qp_count;
};
#define to_bnxt_re_dev(ptr, member) \
@@ -277,6 +279,9 @@ static inline int bnxt_re_read_context_allowed(struct bnxt_re_dev *rdev)
return 0;
}
+struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev);
+void bnxt_re_put_nq(struct bnxt_re_dev *rdev, struct bnxt_qplib_nq *nq);
+
#define BNXT_RE_CONTEXT_TYPE_QPC_SIZE_P5 1088
#define BNXT_RE_CONTEXT_TYPE_CQ_SIZE_P5 128
#define BNXT_RE_CONTEXT_TYPE_MRW_SIZE_P5 128
diff --git a/drivers/infiniband/hw/bnxt_re/dv.c b/drivers/infiniband/hw/bnxt_re/dv.c
index db69f25e294f..a039f1e460e0 100644
--- a/drivers/infiniband/hw/bnxt_re/dv.c
+++ b/drivers/infiniband/hw/bnxt_re/dv.c
@@ -12,6 +12,7 @@
#include <rdma/ib_user_ioctl_cmds.h>
#define UVERBS_MODULE_NAME bnxt_re
#include <rdma/uverbs_named_ioctl.h>
+#include <rdma/ib_umem.h>
#include <rdma/bnxt_re-abi.h>
#include "roce_hsi.h"
@@ -398,6 +399,9 @@ static int bnxt_re_dv_dbr_cleanup(struct ib_uobject *uobject,
struct bnxt_re_dbr_obj *obj = uobject->object;
struct bnxt_re_dev *rdev = obj->rdev;
+ if (atomic_read(&obj->usecnt))
+ return -EBUSY;
+
rdma_user_mmap_entry_remove(&obj->entry->rdma_entry);
bnxt_qplib_free_uc_dpi(&rdev->qplib_res, &obj->dpi);
return 0;
@@ -459,11 +463,454 @@ DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR,
DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_DEFAULT_DBR,
&UVERBS_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR));
+static int bnxt_re_dv_create_cq_resp(struct bnxt_re_dev *rdev,
+ struct bnxt_re_cq *cq,
+ struct bnxt_re_cq_resp *resp)
+{
+ struct bnxt_qplib_cq *qplcq = &cq->qplib_cq;
+
+ resp->cqid = qplcq->id;
+ resp->tail = qplcq->hwq.cons;
+ resp->phase = qplcq->period;
+ resp->comp_mask = BNXT_RE_CQ_DV_SUPPORT;
+ return 0;
+}
+
+static int bnxt_re_dv_setup_umem(struct bnxt_re_dev *rdev,
+ struct ib_umem *umem,
+ struct bnxt_qplib_sg_info *sginfo,
+ struct ib_umem **umem_ptr)
+{
+ unsigned long page_size;
+
+ if (!umem)
+ return -EINVAL;
+
+ page_size = ib_umem_find_best_pgsz(umem, SZ_4K, 0);
+ if (!page_size)
+ return -EINVAL;
+
+ if (umem_ptr)
+ *umem_ptr = umem;
+ sginfo->umem = umem;
+ sginfo->npages = ib_umem_num_dma_blocks(umem, SZ_4K);
+ sginfo->pgsize = SZ_4K;
+ sginfo->pgshft = __builtin_ctz(SZ_4K);
+ ibdev_dbg(&rdev->ibdev,
+ "umem: 0x%llx npages: %d page_size: %d page_shift: %d\n",
+ (u64)umem, sginfo->npages, sginfo->pgsize, sginfo->pgshft);
+
+ return 0;
+}
+
+static int bnxt_re_dv_create_qplib_cq(struct bnxt_re_dev *rdev,
+ struct bnxt_re_ucontext *re_uctx,
+ struct bnxt_re_cq *cq,
+ struct bnxt_re_cq_req *req,
+ struct ib_umem *umem)
+{
+ struct bnxt_qplib_dev_attr *dev_attr = rdev->dev_attr;
+ struct bnxt_qplib_cq *qplcq;
+ u32 cqe = req->ncqe;
+ u32 max_active_cqs;
+ int rc = -EINVAL;
+
+ if (!atomic_add_unless(&rdev->stats.res.cq_count, 1, dev_attr->max_cq)) {
+ ibdev_dbg(&rdev->ibdev, "Create CQ failed - max exceeded(CQs)");
+ return rc;
+ }
+
+ /* Validate CQ fields */
+ if (cqe < 1 || cqe > dev_attr->max_cq_wqes) {
+ ibdev_dbg(&rdev->ibdev, "Create CQ failed - max exceeded(CQ_WQs)");
+ goto fail_dec;
+ }
+
+ qplcq = &cq->qplib_cq;
+ qplcq->cq_handle = (u64)qplcq;
+
+ rc = bnxt_re_dv_setup_umem(rdev, umem, &qplcq->sg_info, &cq->umem);
+ if (rc)
+ goto fail_dec;
+
+ qplcq->dpi = &re_uctx->dpi;
+ qplcq->max_wqe = cqe;
+ qplcq->nq = bnxt_re_get_nq(rdev);
+ qplcq->cnq_hw_ring_id = qplcq->nq->ring_id;
+ qplcq->coalescing = &rdev->cq_coalescing;
+ rc = bnxt_qplib_create_cq(&rdev->qplib_res, qplcq);
+ if (rc) {
+ ibdev_err(&rdev->ibdev, "Failed to create HW CQ");
+ goto fail_qpl;
+ }
+
+ cq->ib_cq.cqe = cqe;
+ cq->cq_period = qplcq->period;
+
+ max_active_cqs = atomic_read(&rdev->stats.res.cq_count);
+ if (max_active_cqs > rdev->stats.res.cq_watermark)
+ rdev->stats.res.cq_watermark = max_active_cqs;
+ spin_lock_init(&cq->cq_lock);
+
+ return 0;
+
+fail_qpl:
+ ib_umem_release(cq->umem);
+fail_dec:
+ atomic_dec(&rdev->stats.res.cq_count);
+ return rc;
+}
+
+static void bnxt_re_dv_free_qplib_cq(struct bnxt_re_dev *rdev,
+ struct bnxt_re_cq *re_cq)
+{
+ bnxt_qplib_destroy_cq(&rdev->qplib_res, &re_cq->qplib_cq);
+ bnxt_re_put_nq(rdev, re_cq->qplib_cq.nq);
+ ib_umem_release(re_cq->umem);
+}
+
+int bnxt_re_dv_create_cq(struct bnxt_re_dev *rdev, struct ib_udata *udata,
+ struct bnxt_re_cq *re_cq, struct bnxt_re_cq_req *req,
+ struct ib_umem *umem)
+{
+ struct bnxt_re_ucontext *re_uctx =
+ rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
+ struct bnxt_re_cq_resp resp = {};
+ int ret;
+
+ ret = bnxt_re_dv_create_qplib_cq(rdev, re_uctx, re_cq, req, umem);
+ if (ret)
+ return ret;
+
+ ret = bnxt_re_dv_create_cq_resp(rdev, re_cq, &resp);
+ if (ret)
+ goto fail_resp;
+
+ ret = ib_copy_to_udata(udata, &resp, min(sizeof(resp), udata->outlen));
+ if (ret)
+ goto fail_resp;
+
+ re_cq->is_dv_cq = true;
+ atomic_inc(&rdev->dv_cq_count);
+ return 0;
+
+fail_resp:
+ bnxt_re_dv_free_qplib_cq(rdev, re_cq);
+ return ret;
+}
+
+static int bnxt_re_dv_init_qp_attr(struct bnxt_re_qp *qp,
+ struct bnxt_re_ucontext *cntx,
+ struct ib_qp_init_attr *init_attr,
+ struct bnxt_re_qp_req *req,
+ struct bnxt_re_dbr_obj *dbr_obj)
+{
+ struct bnxt_qplib_dev_attr *dev_attr;
+ struct bnxt_qplib_qp *qplqp;
+ struct bnxt_re_cq *send_cq;
+ struct bnxt_re_cq *recv_cq;
+ struct bnxt_re_dev *rdev;
+ struct bnxt_qplib_q *rq;
+ struct bnxt_qplib_q *sq;
+ u32 slot_size;
+ int qptype;
+
+ rdev = qp->rdev;
+ qplqp = &qp->qplib_qp;
+ dev_attr = rdev->dev_attr;
+
+ /* Setup misc params */
+ qplqp->is_user = true;
+ qplqp->pd_id = req->pd_id;
+ qplqp->qp_handle = (u64)qplqp;
+ qplqp->sig_type = false;
+ qptype = __from_ib_qp_type(init_attr->qp_type);
+ if (qptype < 0)
+ return qptype;
+ qplqp->type = (u8)qptype;
+ qplqp->wqe_mode = rdev->chip_ctx->modes.wqe_mode;
+ ether_addr_copy(qplqp->smac, rdev->netdev->dev_addr);
+ qplqp->dev_cap_flags = dev_attr->dev_cap_flags;
+ qplqp->cctx = rdev->chip_ctx;
+
+ if (init_attr->qp_type == IB_QPT_RC) {
+ qplqp->max_rd_atomic = dev_attr->max_qp_rd_atom;
+ qplqp->max_dest_rd_atomic = dev_attr->max_qp_init_rd_atom;
+ }
+ qplqp->mtu = ib_mtu_enum_to_int(iboe_get_mtu(rdev->netdev->mtu));
+ if (dbr_obj)
+ qplqp->dpi = &dbr_obj->dpi;
+ else
+ qplqp->dpi = &cntx->dpi;
+
+ /* Setup CQs */
+ if (!init_attr->send_cq)
+ return -EINVAL;
+ send_cq = container_of(init_attr->send_cq, struct bnxt_re_cq, ib_cq);
+ qplqp->scq = &send_cq->qplib_cq;
+ qp->scq = send_cq;
+
+ if (!init_attr->recv_cq)
+ return -EINVAL;
+ recv_cq = container_of(init_attr->recv_cq, struct bnxt_re_cq, ib_cq);
+ qplqp->rcq = &recv_cq->qplib_cq;
+ qp->rcq = recv_cq;
+
+ if (!init_attr->srq) {
+ /* Setup RQ */
+ slot_size = bnxt_qplib_get_stride();
+ rq = &qplqp->rq;
+ rq->max_sge = init_attr->cap.max_recv_sge;
+ rq->wqe_size = req->rq_wqe_sz;
+ rq->max_wqe = (req->rq_slots * slot_size) /
+ req->rq_wqe_sz;
+ rq->max_sw_wqe = rq->max_wqe;
+ rq->q_full_delta = 0;
+ rq->sg_info.pgsize = PAGE_SIZE;
+ rq->sg_info.pgshft = PAGE_SHIFT;
+ }
+
+ /* Setup SQ */
+ sq = &qplqp->sq;
+ sq->max_sge = init_attr->cap.max_send_sge;
+ sq->wqe_size = req->sq_wqe_sz;
+ sq->max_wqe = req->sq_slots; /* SQ in var-wqe mode */
+ sq->max_sw_wqe = sq->max_wqe;
+ sq->q_full_delta = 0;
+ sq->sg_info.pgsize = PAGE_SIZE;
+ sq->sg_info.pgshft = PAGE_SHIFT;
+
+ return 0;
+}
+
+static int bnxt_re_dv_init_user_qp(struct bnxt_re_dev *rdev,
+ struct bnxt_re_ucontext *cntx,
+ struct bnxt_re_qp *qp,
+ struct ib_qp_init_attr *init_attr,
+ struct bnxt_re_qp_req *req,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem)
+{
+ struct bnxt_qplib_sg_info *sginfo;
+ struct bnxt_qplib_qp *qplib_qp;
+ int rc;
+
+ qplib_qp = &qp->qplib_qp;
+ qplib_qp->qp_handle = req->qp_handle;
+
+ /* SQ */
+ sginfo = &qplib_qp->sq.sg_info;
+ rc = bnxt_re_dv_setup_umem(rdev, sq_umem, sginfo, &qp->sumem);
+ if (rc)
+ return rc;
+
+ /* SRQ */
+ if (init_attr->srq) {
+ struct bnxt_re_srq *srq;
+
+ srq = container_of(init_attr->srq, struct bnxt_re_srq, ib_srq);
+ qplib_qp->srq = &srq->qplib_srq;
+ goto done;
+ }
+
+ /* RQ */
+ sginfo = &qplib_qp->rq.sg_info;
+ rc = bnxt_re_dv_setup_umem(rdev, rq_umem, sginfo, &qp->rumem);
+ if (rc)
+ goto rqfail;
+
+done:
+ qplib_qp->is_user = true;
+ return 0;
+rqfail:
+ ib_umem_release(qp->sumem);
+ qplib_qp->sq.sg_info.umem = NULL;
+ return rc;
+}
+
+static int
+bnxt_re_dv_qp_init_msn(struct bnxt_re_dev *rdev, struct bnxt_re_qp *qp,
+ struct bnxt_re_qp_req *req)
+{
+ struct bnxt_qplib_dev_attr *dev_attr = rdev->dev_attr;
+ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+
+ if (req->sq_npsn > dev_attr->max_qp_wqes ||
+ req->sq_psn_sz > sizeof(struct sq_psn_search_ext))
+ return -EINVAL;
+
+ qplib_qp->is_host_msn_tbl = true;
+ qplib_qp->msn = 0;
+ qplib_qp->psn_sz = req->sq_psn_sz;
+ qplib_qp->msn_tbl_sz = req->sq_psn_sz * req->sq_npsn;
+ return 0;
+}
+
+static void bnxt_re_dv_init_qp(struct bnxt_re_dev *rdev,
+ struct bnxt_re_qp *qp)
+{
+ u32 active_qps, tmp_qps;
+
+ spin_lock_init(&qp->sq_lock);
+ spin_lock_init(&qp->rq_lock);
+ INIT_LIST_HEAD(&qp->list);
+ mutex_lock(&rdev->qp_lock);
+ list_add_tail(&qp->list, &rdev->qp_list);
+ mutex_unlock(&rdev->qp_lock);
+ atomic_inc(&rdev->stats.res.qp_count);
+ active_qps = atomic_read(&rdev->stats.res.qp_count);
+ if (active_qps > rdev->stats.res.qp_watermark)
+ rdev->stats.res.qp_watermark = active_qps;
+
+ /* Get the counters for RC QPs */
+ tmp_qps = atomic_inc_return(&rdev->stats.res.rc_qp_count);
+ if (tmp_qps > rdev->stats.res.rc_qp_watermark)
+ rdev->stats.res.rc_qp_watermark = tmp_qps;
+}
+
+int bnxt_re_dv_create_qp(struct bnxt_re_dev *rdev, struct ib_udata *udata,
+ struct ib_qp_init_attr *init_attr,
+ struct bnxt_re_qp *re_qp, struct bnxt_re_qp_req *req,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem)
+{
+ struct bnxt_re_dbr_obj *dbr_obj = NULL;
+ struct bnxt_re_cq *send_cq = NULL;
+ struct bnxt_re_cq *recv_cq = NULL;
+ struct bnxt_re_qp_resp resp = {};
+ struct uverbs_attr_bundle *attrs;
+ struct bnxt_re_ucontext *uctx;
+ int ret;
+
+ uctx = rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
+ if (init_attr->send_cq) {
+ send_cq = container_of(init_attr->send_cq, struct bnxt_re_cq, ib_cq);
+ re_qp->scq = send_cq;
+ }
+
+ if (init_attr->recv_cq) {
+ recv_cq = container_of(init_attr->recv_cq, struct bnxt_re_cq, ib_cq);
+ re_qp->rcq = recv_cq;
+ }
+
+ attrs = rdma_udata_to_uverbs_attr_bundle(udata);
+ if (!attrs)
+ return -EINVAL;
+
+ if (uverbs_attr_is_valid(attrs, BNXT_RE_CREATE_QP_ATTR_DBR_HANDLE)) {
+ dbr_obj = uverbs_attr_get_obj(attrs, BNXT_RE_CREATE_QP_ATTR_DBR_HANDLE);
+ if (IS_ERR(dbr_obj))
+ return PTR_ERR(dbr_obj);
+ atomic_inc(&dbr_obj->usecnt);
+ re_qp->dbr_obj = dbr_obj;
+ }
+
+ re_qp->rdev = rdev;
+ ret = bnxt_re_dv_init_qp_attr(re_qp, uctx, init_attr, req, dbr_obj);
+ if (ret)
+ goto dbr_rel;
+
+ ret = bnxt_re_dv_init_user_qp(rdev, uctx, re_qp, init_attr, req,
+ sq_umem, rq_umem);
+ if (ret)
+ goto dbr_rel;
+
+ ret = bnxt_re_dv_qp_init_msn(rdev, re_qp, req);
+ if (ret)
+ goto free_umem;
+
+ ret = bnxt_re_setup_qp_hwqs(re_qp, true);
+ if (ret)
+ goto free_umem;
+
+ ret = bnxt_qplib_create_qp(&rdev->qplib_res, &re_qp->qplib_qp);
+ if (ret) {
+ ibdev_err(&rdev->ibdev, "Failed to create HW QP");
+ goto free_hwq;
+ }
+
+ resp.qpid = re_qp->qplib_qp.id;
+ resp.comp_mask = BNXT_RE_QP_DV_SUPPORT;
+ resp.rsvd = 0;
+ ret = ib_copy_to_udata(udata, &resp, sizeof(resp));
+ if (ret)
+ goto free_qplib;
+
+ re_qp->ib_qp.qp_num = re_qp->qplib_qp.id;
+ bnxt_re_dv_init_qp(rdev, re_qp);
+ re_qp->is_dv_qp = true;
+ atomic_inc(&rdev->dv_qp_count);
+ return 0;
+
+free_qplib:
+ bnxt_qplib_destroy_qp(&rdev->qplib_res, &re_qp->qplib_qp);
+free_hwq:
+ bnxt_qplib_free_qp_res(&rdev->qplib_res, &re_qp->qplib_qp);
+free_umem:
+ bnxt_re_qp_free_umem(re_qp);
+dbr_rel:
+ if (dbr_obj)
+ atomic_dec(&dbr_obj->usecnt);
+ return ret;
+}
+
+int bnxt_re_dv_destroy_qp(struct bnxt_re_qp *qp)
+{
+ struct bnxt_re_dev *rdev = qp->rdev;
+ struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
+ struct bnxt_qplib_nq *scq_nq = NULL;
+ struct bnxt_qplib_nq *rcq_nq = NULL;
+ int rc;
+
+ mutex_lock(&rdev->qp_lock);
+ list_del(&qp->list);
+ atomic_dec(&rdev->stats.res.qp_count);
+ if (qp->qplib_qp.type == CMDQ_CREATE_QP_TYPE_RC)
+ atomic_dec(&rdev->stats.res.rc_qp_count);
+ mutex_unlock(&rdev->qp_lock);
+
+ rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, qplib_qp);
+ if (rc)
+ ibdev_err_ratelimited(&rdev->ibdev,
+ "id = %d failed rc = %d",
+ qplib_qp->id, rc);
+
+ bnxt_qplib_free_qp_res(&rdev->qplib_res, qplib_qp);
+ bnxt_re_qp_free_umem(qp);
+
+ /* Flush all the entries of notification queue associated with
+ * given qp.
+ */
+ scq_nq = qplib_qp->scq->nq;
+ rcq_nq = qplib_qp->rcq->nq;
+ bnxt_re_synchronize_nq(scq_nq);
+ if (scq_nq != rcq_nq)
+ bnxt_re_synchronize_nq(rcq_nq);
+
+ atomic_dec(&rdev->dv_qp_count);
+ if (qp->dbr_obj)
+ atomic_dec(&qp->dbr_obj->usecnt);
+ return 0;
+}
+
+ADD_UVERBS_ATTRIBUTES_SIMPLE(
+ bnxt_re_qp_create,
+ UVERBS_OBJECT_QP,
+ UVERBS_METHOD_QP_CREATE,
+ UVERBS_ATTR_IDR(BNXT_RE_CREATE_QP_ATTR_DBR_HANDLE,
+ BNXT_RE_OBJECT_DBR,
+ UVERBS_ACCESS_READ,
+ UA_OPTIONAL));
+
+const struct uapi_definition bnxt_re_create_qp_defs[] = {
+ UAPI_DEF_CHAIN_OBJ_TREE(UVERBS_OBJECT_QP, &bnxt_re_qp_create),
+ {},
+};
+
const struct uapi_definition bnxt_re_uapi_defs[] = {
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_ALLOC_PAGE),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_NOTIFY_DRV),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_GET_TOGGLE_MEM),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_DBR),
UAPI_DEF_CHAIN_OBJ_TREE_NAMED(BNXT_RE_OBJECT_DEFAULT_DBR),
+ UAPI_DEF_CHAIN(bnxt_re_create_qp_defs),
{}
};
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 0d95eaee3885..b1d93327413f 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -967,7 +967,7 @@ static void bnxt_re_del_unique_gid(struct bnxt_re_dev *rdev)
dev_err(rdev_to_dev(rdev), "Failed to delete unique GID, rc: %d\n", rc);
}
-static void bnxt_re_qp_free_umem(struct bnxt_re_qp *qp)
+void bnxt_re_qp_free_umem(struct bnxt_re_qp *qp)
{
ib_umem_release(qp->rumem);
ib_umem_release(qp->sumem);
@@ -984,6 +984,9 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
unsigned int flags;
int rc;
+ if (qp->is_dv_qp)
+ return bnxt_re_dv_destroy_qp(qp);
+
bnxt_re_debug_rem_qpinfo(rdev, qp);
bnxt_qplib_flush_cqn_wq(&qp->qplib_qp);
@@ -1029,7 +1032,7 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata)
return 0;
}
-static u8 __from_ib_qp_type(enum ib_qp_type type)
+u8 __from_ib_qp_type(enum ib_qp_type type)
{
switch (type) {
case IB_QPT_GSI:
@@ -1265,7 +1268,7 @@ static int bnxt_re_qp_alloc_init_xrrq(struct bnxt_re_qp *qp)
return rc;
}
-static int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp)
+int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp, bool is_dv_qp)
{
struct bnxt_qplib_res *res = &qp->rdev->qplib_res;
struct bnxt_qplib_qp *qplib_qp = &qp->qplib_qp;
@@ -1279,12 +1282,17 @@ static int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp)
hwq_attr.res = res;
hwq_attr.sginfo = &sq->sg_info;
hwq_attr.stride = bnxt_qplib_get_stride();
- hwq_attr.depth = bnxt_qplib_get_depth(sq, wqe_mode, true);
hwq_attr.aux_stride = qplib_qp->psn_sz;
- hwq_attr.aux_depth = (qplib_qp->psn_sz) ?
- bnxt_qplib_set_sq_size(sq, wqe_mode) : 0;
- if (qplib_qp->is_host_msn_tbl && qplib_qp->psn_sz)
+ if (!is_dv_qp) {
+ hwq_attr.depth = bnxt_qplib_get_depth(sq, wqe_mode, true);
+ hwq_attr.aux_depth = (qplib_qp->psn_sz) ?
+ bnxt_qplib_set_sq_size(sq, wqe_mode) : 0;
+ if (qplib_qp->is_host_msn_tbl && qplib_qp->psn_sz)
+ hwq_attr.aux_depth = qplib_qp->msn_tbl_sz;
+ } else {
+ hwq_attr.depth = sq->max_wqe;
hwq_attr.aux_depth = qplib_qp->msn_tbl_sz;
+ }
hwq_attr.type = HWQ_TYPE_QUEUE;
rc = bnxt_qplib_alloc_init_hwq(&sq->hwq, &hwq_attr);
if (rc)
@@ -1295,10 +1303,16 @@ static int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp)
CMDQ_CREATE_QP_SQ_LVL_SFT);
sq->hwq.pg_sz_lvl = pg_sz_lvl;
+ if (qplib_qp->srq)
+ goto done;
+
hwq_attr.res = res;
hwq_attr.sginfo = &rq->sg_info;
hwq_attr.stride = bnxt_qplib_get_stride();
- hwq_attr.depth = bnxt_qplib_get_depth(rq, qplib_qp->wqe_mode, false);
+ if (!is_dv_qp)
+ hwq_attr.depth = bnxt_qplib_get_depth(rq, qplib_qp->wqe_mode, false);
+ else
+ hwq_attr.depth = rq->max_wqe * 3;
hwq_attr.aux_stride = 0;
hwq_attr.aux_depth = 0;
hwq_attr.type = HWQ_TYPE_QUEUE;
@@ -1311,6 +1325,7 @@ static int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp)
CMDQ_CREATE_QP_RQ_LVL_SFT);
rq->hwq.pg_sz_lvl = pg_sz_lvl;
+done:
if (qplib_qp->psn_sz) {
rc = bnxt_re_qp_alloc_init_xrrq(qp);
if (rc)
@@ -1379,7 +1394,7 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
qp->qplib_qp.rq_hdr_buf_size = BNXT_QPLIB_MAX_GRH_HDR_SIZE_IPV6;
qp->qplib_qp.dpi = &rdev->dpi_privileged;
- rc = bnxt_re_setup_qp_hwqs(qp);
+ rc = bnxt_re_setup_qp_hwqs(qp, false);
if (rc)
goto fail;
@@ -1676,7 +1691,7 @@ static int bnxt_re_init_qp_attr(struct bnxt_re_qp *qp, struct bnxt_re_pd *pd,
bnxt_re_qp_calculate_msn_psn_size(qp);
- rc = bnxt_re_setup_qp_hwqs(qp);
+ rc = bnxt_re_setup_qp_hwqs(qp, false);
if (rc)
goto free_umem;
@@ -1803,15 +1818,22 @@ static int bnxt_re_add_unique_gid(struct bnxt_re_dev *rdev)
return rc;
}
-int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
- struct ib_udata *udata)
+int bnxt_re_create_qp_umem(struct ib_qp *ib_qp,
+ struct ib_qp_init_attr *qp_init_attr,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem,
+ struct uverbs_attr_bundle *attrs)
{
- struct bnxt_qplib_dev_attr *dev_attr;
- struct bnxt_re_ucontext *uctx;
- struct bnxt_re_qp_req ureq;
- struct bnxt_re_dev *rdev;
+ struct bnxt_re_qp *qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
+ struct bnxt_re_dev *rdev = to_bnxt_re_dev(ib_qp->device, ibdev);
+ struct ib_udata *udata = &attrs->driver_udata;
+ struct bnxt_re_ucontext *uctx = NULL;
+
+ /* Only get ucontext if attrs->context is valid (userspace path) */
+ if (attrs->context)
+ uctx = rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
+ struct bnxt_qplib_dev_attr *dev_attr = rdev->dev_attr;
+ struct bnxt_re_qp_req req = {};
struct bnxt_re_pd *pd;
- struct bnxt_re_qp *qp;
struct ib_pd *ib_pd;
u32 active_qps;
int rc;
@@ -1820,12 +1842,31 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
rdev = pd->rdev;
dev_attr = rdev->dev_attr;
- qp = container_of(ib_qp, struct bnxt_re_qp, ib_qp);
- uctx = rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
- if (udata)
- if (ib_copy_from_udata(&ureq, udata, min(udata->inlen, sizeof(ureq))))
+ /* At this point, udata (attrs->driver_udata) is always valid,
+ * since even for the kernel path we would have initialized it in
+ * bnxt_re_create_qp(). But in kernel path, udata->inlen will be 0,
+ * so we skip userspace data handling.
+ */
+ if (udata->inlen) {
+ if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen)))
return -EFAULT;
+ if (req.comp_mask & BNXT_RE_QP_DV_SUPPORT) {
+ /* DV QP creation requires umem */
+ if (!sq_umem)
+ return -EINVAL;
+ /* rq_umem is optional if SRQ is used */
+ if (!qp_init_attr->srq && !rq_umem)
+ return -EINVAL;
+
+ return bnxt_re_dv_create_qp(rdev, udata, qp_init_attr, qp, &req,
+ sq_umem, rq_umem);
+ }
+ }
+
+ /* Standard QP (non-DV): use req.qpsva/qprva */
+ if (sq_umem || rq_umem)
+ return -EINVAL;
rc = bnxt_re_test_qp_limits(rdev, qp_init_attr, dev_attr);
if (!rc) {
@@ -1834,7 +1875,7 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
}
qp->rdev = rdev;
- rc = bnxt_re_init_qp_attr(qp, pd, qp_init_attr, uctx, &ureq);
+ rc = bnxt_re_init_qp_attr(qp, pd, qp_init_attr, uctx, &req);
if (rc)
goto fail;
@@ -1852,7 +1893,7 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
goto free_hwq;
}
- if (udata) {
+ if (udata->outlen) {
struct bnxt_re_qp_resp resp;
resp.qpid = qp->qplib_qp.id;
@@ -1909,6 +1950,22 @@ int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
return rc;
}
+int bnxt_re_create_qp(struct ib_qp *ib_qp, struct ib_qp_init_attr *qp_init_attr,
+ struct ib_udata *udata)
+{
+ struct uverbs_attr_bundle attrs_wrapper = {};
+ struct uverbs_attr_bundle *attrs;
+
+ if (udata) {
+ attrs = rdma_udata_to_uverbs_attr_bundle(udata);
+ } else {
+ /* Kernel path: use zero-initialized wrapper */
+ attrs = &attrs_wrapper;
+ }
+
+ return bnxt_re_create_qp_umem(ib_qp, qp_init_attr, NULL, NULL, attrs);
+}
+
static u8 __from_ib_qp_state(enum ib_qp_state state)
{
switch (state) {
@@ -3241,7 +3298,7 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
return rc;
}
-static struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev)
+struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev)
{
int min, indx;
@@ -3256,7 +3313,7 @@ static struct bnxt_qplib_nq *bnxt_re_get_nq(struct bnxt_re_dev *rdev)
return &rdev->nqr->nq[min];
}
-static void bnxt_re_put_nq(struct bnxt_re_dev *rdev, struct bnxt_qplib_nq *nq)
+void bnxt_re_put_nq(struct bnxt_re_dev *rdev, struct bnxt_qplib_nq *nq)
{
mutex_lock(&rdev->nqr->load_lock);
nq->load--;
@@ -3284,6 +3341,8 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
bnxt_re_put_nq(rdev, nq);
ib_umem_release(cq->umem);
+ if (cq->is_dv_cq)
+ atomic_dec(&rdev->dv_cq_count);
atomic_dec(&rdev->stats.res.cq_count);
kfree(cq->cql);
@@ -3292,6 +3351,27 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct uverbs_attr_bundle *attrs)
+{
+ return bnxt_re_create_cq_umem(ibcq, attr, NULL, attrs);
+}
+
+static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq)
+{
+ struct bnxt_re_dev *rdev = cq->rdev;
+
+ bnxt_qplib_resize_cq_complete(&rdev->qplib_res, &cq->qplib_cq);
+
+ cq->qplib_cq.max_wqe = cq->resize_cqe;
+ if (cq->resize_umem) {
+ ib_umem_release(cq->umem);
+ cq->umem = cq->resize_umem;
+ cq->resize_umem = NULL;
+ cq->resize_cqe = 0;
+ }
+}
+
+int bnxt_re_create_cq_umem(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
+ struct ib_umem *umem, struct uverbs_attr_bundle *attrs)
{
struct bnxt_re_cq *cq = container_of(ibcq, struct bnxt_re_cq, ib_cq);
struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibcq->device, ibdev);
@@ -3299,7 +3379,9 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct bnxt_re_ucontext *uctx =
rdma_udata_to_drv_context(udata, struct bnxt_re_ucontext, ib_uctx);
struct bnxt_qplib_dev_attr *dev_attr = rdev->dev_attr;
+ struct bnxt_re_cq_resp resp = {};
struct bnxt_qplib_chip_ctx *cctx;
+ struct bnxt_re_cq_req req = {};
int cqe = attr->cqe;
int rc, entries;
u32 active_cqs;
@@ -3317,6 +3399,18 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
cctx = rdev->chip_ctx;
cq->qplib_cq.cq_handle = (u64)(unsigned long)(&cq->qplib_cq);
+ if (udata) {
+ if (ib_copy_from_udata(&req, udata, min(sizeof(req), udata->inlen)))
+ return -EFAULT;
+ if (req.comp_mask & BNXT_RE_CQ_DV_SUPPORT) {
+ /* DV CQ creation requires umem */
+ if (!umem)
+ return -EINVAL;
+
+ return bnxt_re_dv_create_cq(rdev, udata, cq, &req, umem);
+ }
+ }
+
entries = bnxt_re_init_depth(cqe + 1, uctx);
if (entries > dev_attr->max_cq_wqes + 1)
entries = dev_attr->max_cq_wqes + 1;
@@ -3324,12 +3418,11 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
cq->qplib_cq.sg_info.pgsize = PAGE_SIZE;
cq->qplib_cq.sg_info.pgshft = PAGE_SHIFT;
if (udata) {
- struct bnxt_re_cq_req req;
- if (ib_copy_from_udata(&req, udata, sizeof(req))) {
- rc = -EFAULT;
+ if (umem) {
+ /* Standard CQ (non-DV): use req.cq_va */
+ rc = -EINVAL;
goto fail;
}
-
cq->umem = ib_umem_get(&rdev->ibdev, req.cq_va,
entries * sizeof(struct cq_base),
IB_ACCESS_LOCAL_WRITE);
@@ -3370,8 +3463,6 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
spin_lock_init(&cq->cq_lock);
if (udata) {
- struct bnxt_re_cq_resp resp = {};
-
if (cctx->modes.toggle_bits & BNXT_QPLIB_CQ_TOGGLE_BIT) {
hash_add(rdev->cq_hash, &cq->hash_entry, cq->qplib_cq.id);
/* Allocate a page */
@@ -3399,27 +3490,13 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
free_mem:
free_page((unsigned long)cq->uctx_cq_page);
c2fail:
- ib_umem_release(cq->umem);
+ if (cq->umem && !umem)
+ ib_umem_release(cq->umem);
fail:
kfree(cq->cql);
return rc;
}
-static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq)
-{
- struct bnxt_re_dev *rdev = cq->rdev;
-
- bnxt_qplib_resize_cq_complete(&rdev->qplib_res, &cq->qplib_cq);
-
- cq->qplib_cq.max_wqe = cq->resize_cqe;
- if (cq->resize_umem) {
- ib_umem_release(cq->umem);
- cq->umem = cq->resize_umem;
- cq->resize_umem = NULL;
- cq->resize_cqe = 0;
- }
-}
-
int bnxt_re_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
{
struct bnxt_qplib_sg_info sg_info = {};
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 33e0f66b39eb..902135d0aa34 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -96,6 +96,8 @@ struct bnxt_re_qp {
struct bnxt_re_cq *scq;
struct bnxt_re_cq *rcq;
struct dentry *dentry;
+ bool is_dv_qp;
+ struct bnxt_re_dbr_obj *dbr_obj; /* doorbell region */
};
struct bnxt_re_cq {
@@ -113,6 +115,7 @@ struct bnxt_re_cq {
int resize_cqe;
void *uctx_cq_page;
struct hlist_node hash_entry;
+ bool is_dv_cq;
};
struct bnxt_re_mr {
@@ -243,6 +246,10 @@ int bnxt_re_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
int bnxt_re_create_qp(struct ib_qp *qp, struct ib_qp_init_attr *qp_init_attr,
struct ib_udata *udata);
+int bnxt_re_create_qp_umem(struct ib_qp *ib_qp,
+ struct ib_qp_init_attr *qp_init_attr,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem,
+ struct uverbs_attr_bundle *attrs);
int bnxt_re_modify_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
int qp_attr_mask, struct ib_udata *udata);
int bnxt_re_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
@@ -254,6 +261,8 @@ int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
const struct ib_recv_wr **bad_recv_wr);
int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct uverbs_attr_bundle *attrs);
+int bnxt_re_create_cq_umem(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
+ struct ib_umem *umem, struct uverbs_attr_bundle *attrs);
int bnxt_re_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata);
int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
@@ -303,4 +312,15 @@ void bnxt_re_unlock_cqs(struct bnxt_re_qp *qp, unsigned long flags);
struct bnxt_re_user_mmap_entry*
bnxt_re_mmap_entry_insert(struct bnxt_re_ucontext *uctx, u64 mem_offset,
enum bnxt_re_mmap_flag mmap_flag, u64 *offset);
+u8 __from_ib_qp_type(enum ib_qp_type type);
+int bnxt_re_setup_qp_hwqs(struct bnxt_re_qp *qp, bool is_dv_qp);
+void bnxt_re_qp_free_umem(struct bnxt_re_qp *qp);
+int bnxt_re_dv_create_cq(struct bnxt_re_dev *rdev, struct ib_udata *udata,
+ struct bnxt_re_cq *re_cq, struct bnxt_re_cq_req *req,
+ struct ib_umem *umem);
+int bnxt_re_dv_create_qp(struct bnxt_re_dev *rdev, struct ib_udata *udata,
+ struct ib_qp_init_attr *init_attr,
+ struct bnxt_re_qp *re_qp, struct bnxt_re_qp_req *req,
+ struct ib_umem *sq_umem, struct ib_umem *rq_umem);
+int bnxt_re_dv_destroy_qp(struct bnxt_re_qp *qp);
#endif /* __BNXT_RE_IB_VERBS_H__ */
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 73003ad25ee8..e38724812cc6 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1334,7 +1334,9 @@ static const struct ib_device_ops bnxt_re_dev_ops = {
.alloc_ucontext = bnxt_re_alloc_ucontext,
.create_ah = bnxt_re_create_ah,
.create_cq = bnxt_re_create_cq,
+ .create_cq_umem = bnxt_re_create_cq_umem,
.create_qp = bnxt_re_create_qp,
+ .create_qp_umem = bnxt_re_create_qp_umem,
.create_srq = bnxt_re_create_srq,
.create_user_ah = bnxt_re_create_ah,
.dealloc_pd = bnxt_re_dealloc_pd,
diff --git a/include/uapi/rdma/bnxt_re-abi.h b/include/uapi/rdma/bnxt_re-abi.h
index 51f8614a7c4f..b4ff16a36284 100644
--- a/include/uapi/rdma/bnxt_re-abi.h
+++ b/include/uapi/rdma/bnxt_re-abi.h
@@ -101,10 +101,13 @@ struct bnxt_re_pd_resp {
struct bnxt_re_cq_req {
__aligned_u64 cq_va;
__aligned_u64 cq_handle;
+ __aligned_u64 comp_mask;
+ __u32 ncqe;
};
enum bnxt_re_cq_mask {
BNXT_RE_CQ_TOGGLE_PAGE_SUPPORT = 0x1,
+ BNXT_RE_CQ_DV_SUPPORT = 0x2
};
struct bnxt_re_cq_resp {
@@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
enum bnxt_re_qp_mask {
BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
+ BNXT_RE_QP_DV_SUPPORT = 0x2,
};
struct bnxt_re_qp_req {
@@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
__aligned_u64 qp_handle;
__aligned_u64 comp_mask;
__u32 sq_slots;
+ __u32 pd_id;
+ __u32 sq_wqe_sz;
+ __u32 sq_psn_sz;
+ __u32 sq_npsn;
+ __u32 rq_slots;
+ __u32 rq_wqe_sz;
+};
+
+enum bnxt_re_create_qp_attrs {
+ BNXT_RE_CREATE_QP_ATTR_DBR_HANDLE = UVERBS_ID_DRIVER_NS_WITH_UHW,
};
struct bnxt_re_qp_resp {
__u32 qpid;
__u32 rsvd;
+ __aligned_u64 comp_mask;
};
struct bnxt_re_srq_req {
--
2.51.2.636.ga99f379adf
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
@ 2026-01-27 12:12 ` Jiri Pirko
2026-01-27 13:04 ` Sriharsha Basavapatna
2026-01-28 12:31 ` Leon Romanovsky
1 sibling, 1 reply; 30+ messages in thread
From: Jiri Pirko @ 2026-01-27 12:12 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, jgg, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
Tue, Jan 27, 2026 at 11:31:05AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>
>This patch supports creation of QPs with user allocated memory (umem).
>This is similar to the existing CQ umem support. This enables userspace
>applications to provide pre-allocated buffers for QP send and receive
>queues.
>
>- Add create_qp_umem device operation to the RDMA device ops.
>- Implement get_qp_buffer_umem() helper function to handle both VA-based
> and dmabuf-based umem allocation.
>- Extend QP creation handler to support umem attributes for SQ and RQ.
>- Add new uAPI attributes to specify umem buffers (VA/length or
> FD/offset combinations).
>
>Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
When you send a patch in my name and make some changes (the patch
description was not present in my draft, at least), I would expect some
off-list handshake. Is that too much to ask? It looks like you are in a big
hurry, and that never brought anything good :/
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 10:31 ` [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs Sriharsha Basavapatna
@ 2026-01-27 12:30 ` Jiri Pirko
2026-01-27 14:15 ` Jason Gunthorpe
2026-01-28 15:33 ` Jason Gunthorpe
1 sibling, 1 reply; 30+ messages in thread
From: Jiri Pirko @ 2026-01-27 12:30 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, jgg, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
Tue, Jan 27, 2026 at 11:31:08AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
>The following Direct Verbs (DV) methods have been implemented in
>this patch.
>
>Doorbell Region Direct Verbs:
>-----------------------------
>- BNXT_RE_METHOD_DBR_ALLOC:
> This will allow the application to create extra doorbell regions
> and use the associated doorbell page index in DV_CREATE_QP and
> use the associated DB address while ringing the doorbell.
>
>- BNXT_RE_METHOD_DBR_FREE:
> Free the allocated doorbell region.
>
>- BNXT_RE_METHOD_GET_DEFAULT_DBR:
> Return the default doorbell page index and doorbell page address
> associated with the ucontext.
>
Similar to CQ/QP, why this is bnxt specific? I know a little about rdma,
but I believe we use it in mlx5 too, no?
* Re: [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory
2026-01-27 12:12 ` Jiri Pirko
@ 2026-01-27 13:04 ` Sriharsha Basavapatna
2026-01-28 10:16 ` Jiri Pirko
0 siblings, 1 reply; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-27 13:04 UTC (permalink / raw)
To: Jiri Pirko
Cc: leon, jgg, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Tue, Jan 27, 2026 at 5:42 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Tue, Jan 27, 2026 at 11:31:05AM +0100, sriharsha.basavapatna@broadcom.com wrote:
> >From: Jiri Pirko <jiri@resnulli.us>
> >
> >This patch supports creation of QPs with user allocated memory (umem).
> >This is similar to the existing CQ umem support. This enables userspace
> >applications to provide pre-allocated buffers for QP send and receive
> >queues.
> >
> >- Add create_qp_umem device operation to the RDMA device ops.
> >- Implement get_qp_buffer_umem() helper function to handle both VA-based
> > and dmabuf-based umem allocation.
> >- Extend QP creation handler to support umem attributes for SQ and RQ.
> >- Add new uAPI attributes to specify umem buffers (VA/length or
> > FD/offset combinations).
> >
> >Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> >Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
>
> When you send patch in my name, and you do some changes (patch
> description was not present in my draft at least), I would expect some
> off-list handshake. Is it too much to ask? Looks like you are in a big
> hurry, that never brought anything good :/
I didn't know you expected an offline handshake, since I was just
updating the commit message (and that one other line I mentioned
earlier). But please feel free to suggest/revert changes or if you
want to push an updated revision yourself, I'm ok either way.
Thanks,
-Harsha
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 12:30 ` Jiri Pirko
@ 2026-01-27 14:15 ` Jason Gunthorpe
2026-01-27 15:07 ` Jiri Pirko
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-27 14:15 UTC (permalink / raw)
To: Jiri Pirko
Cc: Sriharsha Basavapatna, leon, linux-rdma, andrew.gospodarek,
selvin.xavier, kalesh-anakkur.purayil
On Tue, Jan 27, 2026 at 01:30:03PM +0100, Jiri Pirko wrote:
> Tue, Jan 27, 2026 at 11:31:08AM +0100, sriharsha.basavapatna@broadcom.com wrote:
> >From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >
> >The following Direct Verbs (DV) methods have been implemented in
> >this patch.
> >
> >Doorbell Region Direct Verbs:
> >-----------------------------
> >- BNXT_RE_METHOD_DBR_ALLOC:
> > This will allow the application to create extra doorbell regions
> > and use the associated doorbell page index in DV_CREATE_QP and
> > use the associated DB address while ringing the doorbell.
> >
> >- BNXT_RE_METHOD_DBR_FREE:
> > Free the allocated doorbell region.
> >
> >- BNXT_RE_METHOD_GET_DEFAULT_DBR:
> > Return the default doorbell page index and doorbell page address
> > associated with the ucontext.
> >
>
> Similar to CQ/QP, why this is bnxt specific? I know a little about rdma,
> but I believe we use it in mlx5 too, no?
mlx5 has a specific thing too, the doorbell has enough fairly hw
specific properties and never leaks outside the userspace provider.
We consolidated the internal code to manage the mmaps, beyond that
there hasn't been a big push to consolidate more.
Jason
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 14:15 ` Jason Gunthorpe
@ 2026-01-27 15:07 ` Jiri Pirko
2026-01-27 15:56 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Jiri Pirko @ 2026-01-27 15:07 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Sriharsha Basavapatna, leon, linux-rdma, andrew.gospodarek,
selvin.xavier, kalesh-anakkur.purayil
Tue, Jan 27, 2026 at 03:15:45PM +0100, jgg@ziepe.ca wrote:
>On Tue, Jan 27, 2026 at 01:30:03PM +0100, Jiri Pirko wrote:
>> Tue, Jan 27, 2026 at 11:31:08AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>> >From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>> >
>> >The following Direct Verbs (DV) methods have been implemented in
>> >this patch.
>> >
>> >Doorbell Region Direct Verbs:
>> >-----------------------------
>> >- BNXT_RE_METHOD_DBR_ALLOC:
>> > This will allow the application to create extra doorbell regions
>> > and use the associated doorbell page index in DV_CREATE_QP and
>> > use the associated DB address while ringing the doorbell.
>> >
>> >- BNXT_RE_METHOD_DBR_FREE:
>> > Free the allocated doorbell region.
>> >
>> >- BNXT_RE_METHOD_GET_DEFAULT_DBR:
>> > Return the default doorbell page index and doorbell page address
>> > associated with the ucontext.
>> >
>>
>> Similar to CQ/QP, why this is bnxt specific? I know a little about rdma,
>> but I believe we use it in mlx5 too, no?
>
>mlx5 has a specific thing too, the doorbell has enough fairly hw
>specific properties and never leaks outside the userspace provider.
>
>We consolidated the internal code to manage the mmaps, beyond that
>there hasn't been a big push to consolidate more.
I'm a bit lost about what this patchset tries to do. The cover letter
does not mention dmabuf at all. Not sure why. I understand that create
cq/qp is enabled to work with user-passed dma-buf info. So that makes me
assume the same for DBR. I guess I'm wrong.
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 15:07 ` Jiri Pirko
@ 2026-01-27 15:56 ` Jason Gunthorpe
2026-01-28 10:04 ` Jiri Pirko
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-27 15:56 UTC (permalink / raw)
To: Jiri Pirko
Cc: Sriharsha Basavapatna, leon, linux-rdma, andrew.gospodarek,
selvin.xavier, kalesh-anakkur.purayil
On Tue, Jan 27, 2026 at 04:07:41PM +0100, Jiri Pirko wrote:
> Tue, Jan 27, 2026 at 03:15:45PM +0100, jgg@ziepe.ca wrote:
> >On Tue, Jan 27, 2026 at 01:30:03PM +0100, Jiri Pirko wrote:
> >> Tue, Jan 27, 2026 at 11:31:08AM +0100, sriharsha.basavapatna@broadcom.com wrote:
> >> >From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> >> >
> >> >The following Direct Verbs (DV) methods have been implemented in
> >> >this patch.
> >> >
> >> >Doorbell Region Direct Verbs:
> >> >-----------------------------
> >> >- BNXT_RE_METHOD_DBR_ALLOC:
> >> > This will allow the application to create extra doorbell regions
> >> > and use the associated doorbell page index in DV_CREATE_QP and
> >> > use the associated DB address while ringing the doorbell.
> >> >
> >> >- BNXT_RE_METHOD_DBR_FREE:
> >> > Free the allocated doorbell region.
> >> >
> >> >- BNXT_RE_METHOD_GET_DEFAULT_DBR:
> >> > Return the default doorbell page index and doorbell page address
> >> > associated with the ucontext.
> >> >
> >>
> >> Similar to CQ/QP, why this is bnxt specific? I know a little about rdma,
> >> but I believe we use it in mlx5 too, no?
> >
> >mlx5 has a specific thing too, the doorbell has enough fairly hw
> >specific properties and never leaks outside the userspace provider.
> >
> >We consolidated the internal code to manage the mmaps, beyond that
> >there hasn't been a big push to consolidate more.
>
> I'm a bit lost about what this patchset tries to do. The cover letter
> does not mention dmabuf at all. Not sure why. I understand that create
> cq/qp is enabled to work with user-passed dma-buf info. So that makes me
> assume the same for DBR. I guess I'm wrong.
This series doesn't really clearly explain what it is actually for,
but it is almost certainly about supporting what NCCL calls "GPU
Initiated Networking (GIN)".
To do this you need a couple of components:
1) "DV" verbs to allow direct access to the underlying HW queues
under a QP/CQ. This is because you will write a "RDMA provider"
that runs on the GPU
2) DMABUF support for QP/CQ because you will use DMABUF to place
the QP/CQ inside GPU VRAM so that the "RDMA provider" running in
the GPU can access the rings at full performance
3) A doorbell ring that is compatible with the GPU, usually meaning
dedicated doorbell registers because the GPU can't do locking
coordinated with the CPU.
Of course there are other ways to use these APIs and DV was first
invented for DPDK not NCCL..
Jason
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 15:56 ` Jason Gunthorpe
@ 2026-01-28 10:04 ` Jiri Pirko
0 siblings, 0 replies; 30+ messages in thread
From: Jiri Pirko @ 2026-01-28 10:04 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Sriharsha Basavapatna, leon, linux-rdma, andrew.gospodarek,
selvin.xavier, kalesh-anakkur.purayil
Tue, Jan 27, 2026 at 04:56:03PM +0100, jgg@ziepe.ca wrote:
>On Tue, Jan 27, 2026 at 04:07:41PM +0100, Jiri Pirko wrote:
>> Tue, Jan 27, 2026 at 03:15:45PM +0100, jgg@ziepe.ca wrote:
>> >On Tue, Jan 27, 2026 at 01:30:03PM +0100, Jiri Pirko wrote:
>> >> Tue, Jan 27, 2026 at 11:31:08AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>> >> >From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>> >> >
>> >> >The following Direct Verbs (DV) methods have been implemented in
>> >> >this patch.
>> >> >
>> >> >Doorbell Region Direct Verbs:
>> >> >-----------------------------
>> >> >- BNXT_RE_METHOD_DBR_ALLOC:
>> >> > This will allow the application to create extra doorbell regions
>> >> > and use the associated doorbell page index in DV_CREATE_QP and
>> >> > use the associated DB address while ringing the doorbell.
>> >> >
>> >> >- BNXT_RE_METHOD_DBR_FREE:
>> >> > Free the allocated doorbell region.
>> >> >
>> >> >- BNXT_RE_METHOD_GET_DEFAULT_DBR:
>> >> > Return the default doorbell page index and doorbell page address
>> >> > associated with the ucontext.
>> >> >
>> >>
>> >> Similar to CQ/QP, why this is bnxt specific? I know a little about rdma,
>> >> but I believe we use it in mlx5 too, no?
>> >
>> >mlx5 has a specific thing too, the doorbell has enough fairly hw
>> >specific properties and never leaks outside the userspace provider.
>> >
>> >We consolidated the internal code to manage the mmaps, beyond that
>> >there hasn't been a big push to consolidate more.
>>
>> I'm a bit lost about what this patchset tries to do. The cover letter
>> does not mention dmabuf at all. Not sure why. I understand that create
>> cq/qp is enabled to work with user-passed dma-buf info. So that makes me
>> assume the same for DBR. I guess I'm wrong.
>
>This series doesn't really clearly explain what it is actually for,
>but it is almost certainly about supporting what NCCL calls "GPU
>Initiated Networking (GIN)".
>
>To do this you need a couple of components:
> 1) "DV" verbs to allow direct access to the underlying HW queues
> under a QP/CQ. This is because you will write a "RDMA provider"
> that runs on the GPU
> 2) DMABUF support for QP/CQ because you will use DMABUF to place
> the QP/CQ inside GPU VRAM so that the "RDMA provider" running in
> the GPU can access the rings at full performance
> 3) A doorbell ring that is compatible with the GPU, usually meaning
> dedicated doorbell registers because the GPU can't do locking
> coordinated with the CPU.
>
>Of course there are other ways to use these APIs and DV was first
>invented for DPDK not NCCL..
I see.
Isn't it common to have a clear cover letter like this, expressing the
motivation for the patchset and each individual patch? It would be much
easier to follow what the author wants to achieve.
* Re: [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory
2026-01-27 13:04 ` Sriharsha Basavapatna
@ 2026-01-28 10:16 ` Jiri Pirko
0 siblings, 0 replies; 30+ messages in thread
From: Jiri Pirko @ 2026-01-28 10:16 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, jgg, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
Tue, Jan 27, 2026 at 02:04:27PM CET, sriharsha.basavapatna@broadcom.com wrote:
>On Tue, Jan 27, 2026 at 5:42 PM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Tue, Jan 27, 2026 at 11:31:05AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>> >From: Jiri Pirko <jiri@resnulli.us>
>> >
>> >This patch supports creation of QPs with user allocated memory (umem).
>> >This is similar to the existing CQ umem support. This enables userspace
>> >applications to provide pre-allocated buffers for QP send and receive
>> >queues.
>> >
>> >- Add create_qp_umem device operation to the RDMA device ops.
>> >- Implement get_qp_buffer_umem() helper function to handle both VA-based
>> > and dmabuf-based umem allocation.
>> >- Extend QP creation handler to support umem attributes for SQ and RQ.
>> >- Add new uAPI attributes to specify umem buffers (VA/length or
>> > FD/offset combinations).
>> >
>> >Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>> >Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
>>
>> When you send patch in my name, and you do some changes (patch
>> description was not present in my draft at least), I would expect some
>> off-list handshake. Is it too much to ask? Looks like you are in a big
>> hurry, that never brought anything good :/
>
>I didn't know you expected an offline handshake, since I was just
>updating the commit message (and that one other line I mentioned
>earlier). But please feel free to suggest/revert changes or if you
>want to push an updated revision yourself, I'm ok either way.
On which planet do you find it okay to take someone's draft (untested) patch
from a discussion, add a description, and send it on his behalf, all
without consulting him?
I'm in the process of testing the patch. Give me a day or two; I will send
it myself.
Thanks!
>Thanks,
>-Harsha
* Re: [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
2026-01-27 12:12 ` Jiri Pirko
@ 2026-01-28 12:31 ` Leon Romanovsky
1 sibling, 0 replies; 30+ messages in thread
From: Leon Romanovsky @ 2026-01-28 12:31 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: jgg, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Jiri Pirko
On Tue, Jan 27, 2026 at 04:01:05PM +0530, Sriharsha Basavapatna wrote:
> From: Jiri Pirko <jiri@resnulli.us>
>
> This patch supports creation of QPs with user allocated memory (umem).
> This is similar to the existing CQ umem support. This enables userspace
> applications to provide pre-allocated buffers for QP send and receive
> queues.
>
> - Add create_qp_umem device operation to the RDMA device ops.
> - Implement get_qp_buffer_umem() helper function to handle both VA-based
> and dmabuf-based umem allocation.
> - Extend QP creation handler to support umem attributes for SQ and RQ.
> - Add new uAPI attributes to specify umem buffers (VA/length or
> FD/offset combinations).
In addition to Jiri's feedback, there is no need to copy and paste an
AI‑generated summary of mechanical changes. We can all read the code.
Your commit message should focus on the motivation and any relevant caveats.
Thanks
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-27 10:31 ` [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs Sriharsha Basavapatna
@ 2026-01-28 15:32 ` Jason Gunthorpe
2026-01-28 15:51 ` Jason Gunthorpe
2026-01-28 16:54 ` Sriharsha Basavapatna
2026-01-28 15:46 ` Jason Gunthorpe
1 sibling, 2 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 15:32 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> struct bnxt_re_cq_resp {
> @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
>
> enum bnxt_re_qp_mask {
> BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> + BNXT_RE_QP_DV_SUPPORT = 0x2,
> };
This is set on the response but there are no new response fields? That seems
backwards?
> struct bnxt_re_qp_req {
> @@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
> __aligned_u64 qp_handle;
> __aligned_u64 comp_mask;
> __u32 sq_slots;
> + __u32 pd_id;
> + __u32 sq_wqe_sz;
> + __u32 sq_psn_sz;
> + __u32 sq_npsn;
> + __u32 rq_slots;
> + __u32 rq_wqe_sz;
> +};
How does compatibility work here? Old userspace will send a short
structure, and the new kernel should effectively see 0 in all these fields;
is that OK? Sizes of 0 sound bad, don't they?
New userspace will send a long structure and old kernels will ignore
the new bits. Is that OK?
I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
old kernel then the kernel fails the creation and userspace can do
something else.
If userspace passes QP_REQ_MASK_SIZES and the ioctl succeeds, then
everything is OK. Delete the comp_mask in the resp structure.
Also, what is "pd_id"? The other users of pd_id in prior patches seem
to be actual RDMA PDs. Why is something like this being passed here?
The QP already gets a PD from the core interface, why do you need
another pd?
Also the old kernels have a bug:
struct bnxt_re_qp_req ureq;
if (ib_copy_from_udata(&ureq, udata, min(udata->inlen, sizeof(ureq))))
return -EFAULT;
It should have been "ureq = {};". Those sorts of things must be fixed,
or this compatibility handling is really broken and a security problem! Please
audit all of your ib_copy_from_udata() calls!!
Jason
* Re: [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs
2026-01-27 10:31 ` [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs Sriharsha Basavapatna
2026-01-27 12:30 ` Jiri Pirko
@ 2026-01-28 15:33 ` Jason Gunthorpe
1 sibling, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 15:33 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Tue, Jan 27, 2026 at 04:01:08PM +0530, Sriharsha Basavapatna wrote:
> From: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
>
> The following Direct Verbs (DV) methods have been implemented in
> this patch.
>
> Doorbell Region Direct Verbs:
> -----------------------------
> - BNXT_RE_METHOD_DBR_ALLOC:
> This will allow the application to create extra doorbell regions
> and use the associated doorbell page index in DV_CREATE_QP and
> use the associated DB address while ringing the doorbell.
>
> - BNXT_RE_METHOD_DBR_FREE:
> Free the allocated doorbell region.
>
> - BNXT_RE_METHOD_GET_DEFAULT_DBR:
> Return the default doorbell page index and doorbell page address
> associated with the ucontext.
>
> Co-developed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
> ---
> drivers/infiniband/hw/bnxt_re/bnxt_re.h | 1 +
> drivers/infiniband/hw/bnxt_re/dv.c | 130 ++++++++++++++++++++++
> drivers/infiniband/hw/bnxt_re/ib_verbs.h | 7 ++
> drivers/infiniband/hw/bnxt_re/qplib_res.c | 43 +++++++
> drivers/infiniband/hw/bnxt_re/qplib_res.h | 4 +
> include/uapi/rdma/bnxt_re-abi.h | 29 +++++
> 6 files changed, 214 insertions(+)
This one looks OK
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-27 10:31 ` [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs Sriharsha Basavapatna
2026-01-28 15:32 ` Jason Gunthorpe
@ 2026-01-28 15:46 ` Jason Gunthorpe
2026-02-02 14:19 ` Sriharsha Basavapatna
1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 15:46 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> index 0999a42c678c..f28acde3a274 100644
> --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> @@ -234,6 +234,8 @@ struct bnxt_re_dev {
> union ib_gid ugid;
> u32 ugid_index;
> u8 sniffer_flow_created : 1;
> + atomic_t dv_cq_count;
> + atomic_t dv_qp_count;
> };
Why? Nothing reads these? If they are stats then put them in your
stats struct and return them to userspace. I'd drop it and come later
with a user visible stat.
This patch is really big now, can you at least split it to cq/qp?
> @@ -459,11 +463,454 @@ DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR,
> DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_DEFAULT_DBR,
> &UVERBS_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR));
>
> +static int bnxt_re_dv_create_cq_resp(struct bnxt_re_dev *rdev,
> + struct bnxt_re_cq *cq,
> + struct bnxt_re_cq_resp *resp)
> +{
> + struct bnxt_qplib_cq *qplcq = &cq->qplib_cq;
> +
> + resp->cqid = qplcq->id;
> + resp->tail = qplcq->hwq.cons;
> + resp->phase = qplcq->period;
> + resp->comp_mask = BNXT_RE_CQ_DV_SUPPORT;
> + return 0;
> +}
> +
> +static int bnxt_re_dv_setup_umem(struct bnxt_re_dev *rdev,
> + struct ib_umem *umem,
> + struct bnxt_qplib_sg_info *sginfo,
> + struct ib_umem **umem_ptr)
> +{
> + unsigned long page_size;
> +
> + if (!umem)
> + return -EINVAL;
> +
> + page_size = ib_umem_find_best_pgsz(umem, SZ_4K, 0);
> + if (!page_size)
> + return -EINVAL;
> +
> + if (umem_ptr)
> + *umem_ptr = umem;
Why?? Just have the caller store to the right variable.
> @@ -3324,12 +3418,11 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
> cq->qplib_cq.sg_info.pgsize = PAGE_SIZE;
> cq->qplib_cq.sg_info.pgshft = PAGE_SHIFT;
> if (udata) {
> - struct bnxt_re_cq_req req;
> - if (ib_copy_from_udata(&req, udata, sizeof(req))) {
> - rc = -EFAULT;
> + if (umem) {
> + /* Standard CQ (non-DV): use req.cq_va */
> + rc = -EINVAL;
> goto fail;
> }
I think this should support the umem interface here, it is trivial
right, just skip the below if the umem is passed:
> cq->umem = ib_umem_get(&rdev->ibdev, req.cq_va,
> entries * sizeof(struct cq_base),
> IB_ACCESS_LOCAL_WRITE);
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 15:32 ` Jason Gunthorpe
@ 2026-01-28 15:51 ` Jason Gunthorpe
2026-01-28 18:03 ` Sriharsha Basavapatna
2026-01-28 16:54 ` Sriharsha Basavapatna
1 sibling, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 15:51 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Wed, Jan 28, 2026 at 11:32:48AM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
>
> > struct bnxt_re_cq_resp {
> > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> >
> > enum bnxt_re_qp_mask {
> > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > };
>
> This is set on the response but there are no new response fields? That seems
> backwards?
>
> > struct bnxt_re_qp_req {
> > @@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
> > __aligned_u64 qp_handle;
> > __aligned_u64 comp_mask;
> > __u32 sq_slots;
> > + __u32 pd_id;
> > + __u32 sq_wqe_sz;
> > + __u32 sq_psn_sz;
> > + __u32 sq_npsn;
> > + __u32 rq_slots;
> > + __u32 rq_wqe_sz;
> > +};
>
> How does compatibility work here? Old userspace will send a short
> structure, the new kernel should effectively see 0 at all these fields
> is that OK? Sizes of 0 sound bad don't they?
>
> New userspace will send a long structure and old kernels will ignore
> the new bits. Is that OK?
>
> I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
> old kernel then the kernel fails the creation and userspace can do
> something else.
ugh, WTF, this driver isn't doing comp_mask *at all* !?!
BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS isn't even referenced!? Why is it
there?
According to clangd nothing reads bnxt_re_qp_req.comp_mask, so nothing
today will fail if it is !0.
So you need to add a flag to the init/ucontext create path that says
"I support QP comp_mask", and user space cannot send a non-zero comp
mask without that flag, then check comp_mask for supported bits *like
it should have been done* and follow the above remarks.
CQ has the same issue, if you add comp_mask nothing in current kernels
will check it and nothing will check the sizes either. So you need the
same global flag to say the kernel supports comp_mask for cq before
userspace can inject a non-zero comp_mask.
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 15:32 ` Jason Gunthorpe
2026-01-28 15:51 ` Jason Gunthorpe
@ 2026-01-28 16:54 ` Sriharsha Basavapatna
2026-01-28 17:57 ` Sriharsha Basavapatna
2026-01-28 17:58 ` Jason Gunthorpe
1 sibling, 2 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-28 16:54 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Wed, Jan 28, 2026 at 9:02 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
>
> > struct bnxt_re_cq_resp {
> > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> >
> > enum bnxt_re_qp_mask {
> > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > };
>
> This is set on the response but there are no new response fields? That seems
> backwards?
This is set in the response so that the library can figure out
whether its request for DV QP creation (set through req->comp_mask)
was successfully executed by the kernel driver. With an older kernel,
the resp->comp_mask bit for DV would be 0, so the new library would
know its request failed.
>
> > struct bnxt_re_qp_req {
> > @@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
> > __aligned_u64 qp_handle;
> > __aligned_u64 comp_mask;
> > __u32 sq_slots;
> > + __u32 pd_id;
> > + __u32 sq_wqe_sz;
> > + __u32 sq_psn_sz;
> > + __u32 sq_npsn;
> > + __u32 rq_slots;
> > + __u32 rq_wqe_sz;
> > +};
>
> How does compatibility work here? Old userspace will send a short
> structure, the new kernel should effectively see 0 at all these fields
> is that OK? Sizes of 0 sound bad don't they?
The new kernel won't even look at the new fields if the DV bit is not
set in req->comp_mask, since bnxt_re_dv_create_qp() won't be called,
i.e. if the request comes from old userspace.
>
> New userspace will send a long structure and old kernels will ignore
> the new bits. Is that OK?
Yes, this is ok, since these new fields are added specifically for the
DV use-case and a new kernel is required for this functionality.
>
> I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
> old kernel then the kernel fails the creation and userspace can do
> something else.
>
> If the userspace passes QP_REQ_MASK_SIZES and the ioctl succeeds then
> everything is OK. Delete the comp_mask in the resp structure.
As decisions are made based on DV bit in comp_mask (explained above),
this is not needed right?
>
> Also, what is "pd_id"? The other users of pd_id in prior patches seem
> to be actual RDMA PDs. Why is something like this being passed here?
> The QP already gets a PD from the core interface, why do you need
> another pd?
Let me take a closer look at this and get back to you.
>
> Also the old kernels have a bug:
>
> struct bnxt_re_qp_req ureq;
>
> if (ib_copy_from_udata(&ureq, udata, min(udata->inlen, sizeof(ureq))))
> return -EFAULT;
>
> It should have been "ureq = {};". Those sorts of things must be fixed
> or this compatibility stuff is really broken / a security problem! Please
> audit all your ib_copy_from_udata()s!!
Sure, will audit this.
Thanks,
-Harsha
>
> Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 16:54 ` Sriharsha Basavapatna
@ 2026-01-28 17:57 ` Sriharsha Basavapatna
2026-01-28 19:42 ` Jason Gunthorpe
2026-01-28 17:58 ` Jason Gunthorpe
1 sibling, 1 reply; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-28 17:57 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Wed, Jan 28, 2026 at 10:24 PM Sriharsha Basavapatna
<sriharsha.basavapatna@broadcom.com> wrote:
>
> On Wed, Jan 28, 2026 at 9:02 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> >
> > > struct bnxt_re_cq_resp {
> > > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> > >
> > > enum bnxt_re_qp_mask {
> > > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > > };
> >
> > This is set on the response but there are no new response fields? That seems
> > backwards?
> This is set on the response field so that the library can figure out
> if its request for DV QP creation (set through req->comp_mask), was
> successfully executed by the kernel driver or not. If there is an
> older kernel, the resp->comp_mask bit for DV would be 0 and so the new
> library would know its request failed.
I will change this to have a separate bit for DV in the response
comp_mask, instead of reusing the same value from the req comp_mask.
Is that ok?
> >
> > > struct bnxt_re_qp_req {
> > > @@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
> > > __aligned_u64 qp_handle;
> > > __aligned_u64 comp_mask;
> > > __u32 sq_slots;
> > > + __u32 pd_id;
> > > + __u32 sq_wqe_sz;
> > > + __u32 sq_psn_sz;
> > > + __u32 sq_npsn;
> > > + __u32 rq_slots;
> > > + __u32 rq_wqe_sz;
> > > +};
> >
> > How does compatibility work here? Old userspace will send a short
> > structure, the new kernel should effectively see 0 at all these fields
> > is that OK? Sizes of 0 sound bad don't they?
> New kernel won't even look at the new fields if the DV bit is not set
> in req->comp_mask, since bnxt_re_dv_create_qp() won't be called; i.e
> if the request comes from old userspace.
> >
> > New userspace will send a long structure and old kernels will ignore
> > the new bits. Is that OK?
> Yes, this is ok, since these new fields are added specifically for the
> DV use-case and a new kernel is required for this functionality.
> >
> > I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
> > old kernel then the kernel fails the creation and userspace can do
> > something else.
> >
> > If the userspace passes QP_REQ_MASK_SIZES and the ioctl succeeds then
> > everything is OK. Delete the comp_mask in the resp structure.
> As decisions are made based on DV bit in comp_mask (explained above),
> this is not needed right?
> >
> > Also, what is "pd_id"? The other users of pd_id in prior patches seem
> > to be actual RDMA PDs. Why is something like this being passed here?
> > The QP already gets a PD from the core interface, why do you need
> > another pd?
> Let me take a closer look at this and get back to you.
Agreed, we had this earlier in our design, but it is not needed anymore
since we are using the standard QP-extension mechanism now.
Thanks,
-Harsha
> >
> > Also the old kernels have a bug:
> >
> > struct bnxt_re_qp_req ureq;
> >
> > if (ib_copy_from_udata(&ureq, udata, min(udata->inlen, sizeof(ureq))))
> > return -EFAULT;
> >
> > It should have been "ureq = {};". Those sorts of things must be fixed
> > or this compatability stuff is really broken/security problem! Please
> > audit all your ib_copy_from_udata()s!!
> Sure, will audit this.
> Thanks,
> -Harsha
> >
> > Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 16:54 ` Sriharsha Basavapatna
2026-01-28 17:57 ` Sriharsha Basavapatna
@ 2026-01-28 17:58 ` Jason Gunthorpe
1 sibling, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 17:58 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Wed, Jan 28, 2026 at 10:24:44PM +0530, Sriharsha Basavapatna wrote:
> On Wed, Jan 28, 2026 at 9:02 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> >
> > > struct bnxt_re_cq_resp {
> > > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> > >
> > > enum bnxt_re_qp_mask {
> > > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > > };
> >
> > This is set on the response but there are no new response fields? That seems
> > backwards?
> This is set on the response field so that the library can figure out
> if its request for DV QP creation (set through req->comp_mask), was
> successfully executed by the kernel driver or not. If there is an
> older kernel, the resp->comp_mask bit for DV would be 0 and so the new
> library would know its request failed.
It's backwards; we expect old kernels to return EOPNOTSUPP when
presented with something that only a new kernel understands. Failing
that, we expect userspace to know, via a global flag, never to send
something new to an old kernel.
> > How does compatibility work here? Old userspace will send a short
> > structure, the new kernel should effectively see 0 at all these fields
> > is that OK? Sizes of 0 sound bad don't they?
>
> New kernel won't even look at the new fields if the DV bit is not set
> in req->comp_mask, since bnxt_re_dv_create_qp() won't be called; i.e
> if the request comes from old userspace.
Ok
>
> > New userspace will send a long structure and old kernels will ignore
> > the new bits. Is that OK?
>
> Yes, this is ok, since these new fields are added specifically for the
> DV use-case and a new kernel is required for this functionality.
This does not seem OK, but I guess userspace can detect the resp comp mask
and convert it to a failure. It is not following the design pattern.
> > I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
> > old kernel then the kernel fails the creation and userspace can do
> > something else.
> >
> > If the userspace passes QP_REQ_MASK_SIZES and the ioctl succeeds then
> > everything is OK. Delete the comp_mask in the resp structure.
> As decisions are made based on DV bit in comp_mask (explained above),
> this is not needed right?
It is, the point is to have a comp mask in the input saying which
values of the input are populated and need to be processed by the
kernel.
I think it is not a big change from what you have here, just check the
comp_mask and add a global bit that this kernel checks those
comp_masks properly. Ideally audit all the structs and make sure all
comp_masks work right so the bit includes everything.
comp_mask should be checked against the list of bits this kernel
supports and if other bits are set return EOPNOTSUPP.
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 15:51 ` Jason Gunthorpe
@ 2026-01-28 18:03 ` Sriharsha Basavapatna
2026-01-28 19:41 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-01-28 18:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Wed, Jan 28, 2026 at 9:21 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Wed, Jan 28, 2026 at 11:32:48AM -0400, Jason Gunthorpe wrote:
> > On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> >
> > > struct bnxt_re_cq_resp {
> > > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> > >
> > > enum bnxt_re_qp_mask {
> > > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > > };
> >
> > This is set on the response but there are no new response fields? That seems
> > backwards?
> >
> > > struct bnxt_re_qp_req {
> > > @@ -129,11 +133,22 @@ struct bnxt_re_qp_req {
> > > __aligned_u64 qp_handle;
> > > __aligned_u64 comp_mask;
> > > __u32 sq_slots;
> > > + __u32 pd_id;
> > > + __u32 sq_wqe_sz;
> > > + __u32 sq_psn_sz;
> > > + __u32 sq_npsn;
> > > + __u32 rq_slots;
> > > + __u32 rq_wqe_sz;
> > > +};
> >
> > How does compatibility work here? Old userspace will send a short
> > structure, the new kernel should effectively see 0 at all these fields
> > is that OK? Sizes of 0 sound bad don't they?
> >
> > New userspace will send a long structure and old kernels will ignore
> > the new bits. Is that OK?
> >
> > I would expect you to set QP_REQ_MASK_SIZES in the *req* comp_mask. If
> > old kernel then the kernel fails the creation and userspace can do
> > something else.
>
> ugh, WTF, this driver isn't doing comp_mask *at all* !?!
>
> BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS isn't even referenced!? Why is it
> there?
>
> According to clangd nothing reads bnxt_re_qp_req.comp_mask, so nothing
> today will fail if it is !0.
>
> So you need to add a flag to the init/ucontext create path that says
> "I support QP comp_mask", and user space cannot send a non-zero comp
> mask without that flag, then check comp_mask for supported bits *like
> it should have been done* and follow the above remarks.
>
> CQ has the same issue, if you add comp_mask nothing in current kernels
> will check it and nothing will check the sizes either. So you need the
> same global flag to say the kernel supports comp_mask for cq before
> userspace can inject a non-zero comp_mask.
VAR_WQE_MODE was initially planned to be controlled at the per-QP
level, so we added the mask definition. But the current implementation
uses per-device variable WQE mode via the ucontext-level flag
BNXT_RE_UCNTX_CAP_VAR_WQE_ENABLED. So the
BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS comp mask can be ignored until we
support a per-QP variable WQE setting.
Thanks,
-Harsha
>
> Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 18:03 ` Sriharsha Basavapatna
@ 2026-01-28 19:41 ` Jason Gunthorpe
0 siblings, 0 replies; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 19:41 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Wed, Jan 28, 2026 at 11:33:18PM +0530, Sriharsha Basavapatna wrote:
> VAR_WQE_MODE was initially planned to be controlled per-QP level, so
> we added the mask definition. But the current implementation uses
> per-device variable WQE mode and it uses the ucontext level flag
> BNXT_RE_UCNTX_CAP_VAR_WQE_ENABLED. So
> BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS comp mask can be ignored until we
> support a per QP variable wqe setting.
You should have checked the comp_mask when it was added to the struct,
not checking it makes it useless going forward.
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 17:57 ` Sriharsha Basavapatna
@ 2026-01-28 19:42 ` Jason Gunthorpe
2026-02-02 14:19 ` Sriharsha Basavapatna
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-01-28 19:42 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Wed, Jan 28, 2026 at 11:27:14PM +0530, Sriharsha Basavapatna wrote:
> On Wed, Jan 28, 2026 at 10:24 PM Sriharsha Basavapatna
> <sriharsha.basavapatna@broadcom.com> wrote:
> >
> > On Wed, Jan 28, 2026 at 9:02 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> > >
> > > > struct bnxt_re_cq_resp {
> > > > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> > > >
> > > > enum bnxt_re_qp_mask {
> > > > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > > > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > > > };
> > >
> > > This is set on the response but there are no new response fields? That seems
> > > backwards?
> > This is set on the response field so that the library can figure out
> > if its request for DV QP creation (set through req->comp_mask), was
> > successfully executed by the kernel driver or not. If there is an
> > older kernel, the resp->comp_mask bit for DV would be 0 and so the new
> > library would know its request failed.
> I will change this to have a separate bit for DV in the response
> comp_mask, instead of reusing the same value from the req comp_mask.
> Is that ok?
No. Do not return anything in the response comp_mask, you must fail
unsupported requests. That is how comp_mask is intended to
work. Userspace uses the uctx to learn if the request can even be
sent.
> > > Also, what is "pd_id"? The other users of pd_id in prior patches seem
> > > to be actual RDMA PDs. Why is something like this being passed here?
> > > The QP already gets a PD from the core interface, why do you need
> > > another pd?
> > Let me take a closer look at this and get back to you.
> Agree, we had this earlier in our design, but it is not needed anymore
> since we are using std QP-extension mechanism now.
Ok great, because you also were not refcounting the PD properly with
that scheme, that is fixed now too.
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 15:46 ` Jason Gunthorpe
@ 2026-02-02 14:19 ` Sriharsha Basavapatna
0 siblings, 0 replies; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-02-02 14:19 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Wed, Jan 28, 2026 at 9:16 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> > diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > index 0999a42c678c..f28acde3a274 100644
> > --- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > @@ -234,6 +234,8 @@ struct bnxt_re_dev {
> > union ib_gid ugid;
> > u32 ugid_index;
> > u8 sniffer_flow_created : 1;
> > + atomic_t dv_cq_count;
> > + atomic_t dv_qp_count;
> > };
>
> Why? Nothing reads these? If they are stats then put them in your
> stats struct and return them to userspace. I'd drop it and come later
> with a user visible stat.
They are debug counters, deleted them.
>
> This patch is really big now, can you at least split it to cq/qp?
Ack.
>
> > @@ -459,11 +463,454 @@ DECLARE_UVERBS_NAMED_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR,
> > DECLARE_UVERBS_GLOBAL_METHODS(BNXT_RE_OBJECT_DEFAULT_DBR,
> > &UVERBS_METHOD(BNXT_RE_METHOD_GET_DEFAULT_DBR));
> >
> > +static int bnxt_re_dv_create_cq_resp(struct bnxt_re_dev *rdev,
> > + struct bnxt_re_cq *cq,
> > + struct bnxt_re_cq_resp *resp)
> > +{
> > + struct bnxt_qplib_cq *qplcq = &cq->qplib_cq;
> > +
> > + resp->cqid = qplcq->id;
> > + resp->tail = qplcq->hwq.cons;
> > + resp->phase = qplcq->period;
> > + resp->comp_mask = BNXT_RE_CQ_DV_SUPPORT;
> > + return 0;
> > +}
> > +
> > +static int bnxt_re_dv_setup_umem(struct bnxt_re_dev *rdev,
> > + struct ib_umem *umem,
> > + struct bnxt_qplib_sg_info *sginfo,
> > + struct ib_umem **umem_ptr)
> > +{
> > + unsigned long page_size;
> > +
> > + if (!umem)
> > + return -EINVAL;
> > +
> > + page_size = ib_umem_find_best_pgsz(umem, SZ_4K, 0);
> > + if (!page_size)
> > + return -EINVAL;
> > +
> > + if (umem_ptr)
> > + *umem_ptr = umem;
>
> Why?? Just have the caller store to the right variable.
Ack.
>
> > @@ -3324,12 +3418,11 @@ int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
> > cq->qplib_cq.sg_info.pgsize = PAGE_SIZE;
> > cq->qplib_cq.sg_info.pgshft = PAGE_SHIFT;
> > if (udata) {
> > - struct bnxt_re_cq_req req;
> > - if (ib_copy_from_udata(&req, udata, sizeof(req))) {
> > - rc = -EFAULT;
> > + if (umem) {
> > + /* Standard CQ (non-DV): use req.cq_va */
> > + rc = -EINVAL;
> > goto fail;
> > }
>
> I think this should support the umem interface here, it is trivial
> right, just skip the below if the umem is passed:
Ack.
>
> > cq->umem = ib_umem_get(&rdev->ibdev, req.cq_va,
> > entries * sizeof(struct cq_base),
> > IB_ACCESS_LOCAL_WRITE);
>
> Jason
Thanks,
-Harsha
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-01-28 19:42 ` Jason Gunthorpe
@ 2026-02-02 14:19 ` Sriharsha Basavapatna
2026-02-02 17:48 ` Jason Gunthorpe
0 siblings, 1 reply; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-02-02 14:19 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Thu, Jan 29, 2026 at 1:12 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Wed, Jan 28, 2026 at 11:27:14PM +0530, Sriharsha Basavapatna wrote:
> > On Wed, Jan 28, 2026 at 10:24 PM Sriharsha Basavapatna
> > <sriharsha.basavapatna@broadcom.com> wrote:
> > >
> > > On Wed, Jan 28, 2026 at 9:02 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > > >
> > > > On Tue, Jan 27, 2026 at 04:01:09PM +0530, Sriharsha Basavapatna wrote:
> > > >
> > > > > struct bnxt_re_cq_resp {
> > > > > @@ -121,6 +124,7 @@ struct bnxt_re_resize_cq_req {
> > > > >
> > > > > enum bnxt_re_qp_mask {
> > > > > BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1,
> > > > > + BNXT_RE_QP_DV_SUPPORT = 0x2,
> > > > > };
> > > >
> > > > This is set on the response but there are no new response fields? That seems
> > > > backwards?
> > > This is set on the response field so that the library can figure out
> > > if its request for DV QP creation (set through req->comp_mask), was
> > > successfully executed by the kernel driver or not. If there is an
> > > older kernel, the resp->comp_mask bit for DV would be 0 and so the new
> > > library would know its request failed.
> > I will change this to have a separate bit for DV in the response
> > comp_mask, instead of reusing the same value from the req comp_mask.
> > Is that ok?
>
> No. Do not return anything in the response comp_mask, you must fail
> unsupported requests. That is how comp_mask is intended to
> work. Userspace uses the uctx to learn if the request can even be
> sent.
Ack.
- Implemented driver logic to return capability in ucontext for both CQ and QP.
- The library adds comp_mask in its CQ/QP creation request only if
the driver has exported the capability in ucontext.
- The driver checks the requested comp_mask against supported bitmasks
and returns EOPNOTSUPP if invalid.
- Fixed zero initialization of req/resp structures.
>
> > > > Also, what is "pd_id"? The other users of pd_id in prior patches seem
> > > > to be actual RDMA PDs. Why is something like this being passed here?
> > > > The QP already gets a PD from the core interface, why do you need
> > > > another pd?
> > > Let me take a closer look at this and get back to you.
> > Agree, we had this earlier in our design, but it is not needed anymore
> > since we are using std QP-extension mechanism now.
>
> Ok great, because you also were not refcounting the PD properly with
> that scheme, that is fixed now too.
Ack. Reverted it back to use the PD object from the QP and deleted
pd_id in the req structure.
>
> Jason
I have the revised patchset ready. Let me know how you want to proceed
- if I should send it out (without the uverbs kernel patch for QP
umem) or if I should wait for the kernel patch and rebase it before
sending.
Thanks,
-Harsha
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-02-02 14:19 ` Sriharsha Basavapatna
@ 2026-02-02 17:48 ` Jason Gunthorpe
2026-02-03 5:05 ` Sriharsha Basavapatna
0 siblings, 1 reply; 30+ messages in thread
From: Jason Gunthorpe @ 2026-02-02 17:48 UTC (permalink / raw)
To: Sriharsha Basavapatna, Jiri Pirko
Cc: leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil
On Mon, Feb 02, 2026 at 07:49:29PM +0530, Sriharsha Basavapatna wrote:
> I have the revised patchset ready. Let me know how you want to proceed
> - if I should send it out (without the uverbs kernel patch for QP
> umem) or if I should wait for the kernel patch and rebase it before
> sending.
How about change the Author on that patch to yourself and some
co-developed & signed-off for Jiri and send it out, I'd at least like
to look at the other changes.
Jiri said he would send his series, there is still time I can stitch
something together.
Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-02-02 17:48 ` Jason Gunthorpe
@ 2026-02-03 5:05 ` Sriharsha Basavapatna
2026-02-03 8:57 ` Jiri Pirko
0 siblings, 1 reply; 30+ messages in thread
From: Sriharsha Basavapatna @ 2026-02-03 5:05 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Jiri Pirko, leon, linux-rdma, andrew.gospodarek, selvin.xavier,
kalesh-anakkur.purayil, Sriharsha Basavapatna
On Mon, Feb 2, 2026 at 11:18 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Mon, Feb 02, 2026 at 07:49:29PM +0530, Sriharsha Basavapatna wrote:
> > I have the revised patchset ready. Let me know how you want to proceed
> > - if I should send it out (without the uverbs kernel patch for QP
> > umem) or if I should wait for the kernel patch and rebase it before
> > sending.
>
> How about change the Author on that patch to yourself and some
> co-developed & signed-off for Jiri and send it out, I'd at least like
> to look at the other changes.
>
> Jiri said he would send his series, there is still time I can stitch
> something together.
Ok, I will include that patch as a placeholder for now and this patch
series can be rebased when it is ready.
Thanks,
-Harsha
>
> Jason
* Re: [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs
2026-02-03 5:05 ` Sriharsha Basavapatna
@ 2026-02-03 8:57 ` Jiri Pirko
0 siblings, 0 replies; 30+ messages in thread
From: Jiri Pirko @ 2026-02-03 8:57 UTC (permalink / raw)
To: Sriharsha Basavapatna
Cc: Jason Gunthorpe, Jiri Pirko, leon, linux-rdma, andrew.gospodarek,
selvin.xavier, kalesh-anakkur.purayil
Tue, Feb 03, 2026 at 06:05:48AM +0100, sriharsha.basavapatna@broadcom.com wrote:
>On Mon, Feb 2, 2026 at 11:18 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>>
>> On Mon, Feb 02, 2026 at 07:49:29PM +0530, Sriharsha Basavapatna wrote:
>> > I have the revised patchset ready. Let me know how you want to proceed
>> > - if I should send it out (without the uverbs kernel patch for QP
>> > umem) or if I should wait for the kernel patch and rebase it before
>> > sending.
>>
>> How about change the Author on that patch to yourself and some
>> co-developed & signed-off for Jiri and send it out, I'd at least like
>> to look at the other changes.
>>
>> Jiri said he would send his series, there is still time I can stitch
>> something together.
>Ok, I will include that patch as a placeholder for now and this patch
>series can be rebased when it is ready.
Check out:
https://lore.kernel.org/linux-rdma/20260203085003.71184-11-jiri@resnulli.us/
>Thanks,
>-Harsha
>>
>> Jason
Thread overview: 30+ messages
2026-01-27 10:31 [PATCH rdma-next v9 0/5] RDMA/bnxt_re: Support direct verbs Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 1/5] RDMA/uverbs: Support QP creation with user allocated memory Sriharsha Basavapatna
2026-01-27 12:12 ` Jiri Pirko
2026-01-27 13:04 ` Sriharsha Basavapatna
2026-01-28 10:16 ` Jiri Pirko
2026-01-28 12:31 ` Leon Romanovsky
2026-01-27 10:31 ` [PATCH rdma-next v9 2/5] RDMA/bnxt_re: Move the UAPI methods to a dedicated file Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 3/5] RDMA/bnxt_re: Refactor bnxt_qplib_create_qp() function Sriharsha Basavapatna
2026-01-27 10:31 ` [PATCH rdma-next v9 4/5] RDMA/bnxt_re: Direct Verbs: Support DBR verbs Sriharsha Basavapatna
2026-01-27 12:30 ` Jiri Pirko
2026-01-27 14:15 ` Jason Gunthorpe
2026-01-27 15:07 ` Jiri Pirko
2026-01-27 15:56 ` Jason Gunthorpe
2026-01-28 10:04 ` Jiri Pirko
2026-01-28 15:33 ` Jason Gunthorpe
2026-01-27 10:31 ` [PATCH rdma-next v9 5/5] RDMA/bnxt_re: Direct Verbs: Support CQ and QP verbs Sriharsha Basavapatna
2026-01-28 15:32 ` Jason Gunthorpe
2026-01-28 15:51 ` Jason Gunthorpe
2026-01-28 18:03 ` Sriharsha Basavapatna
2026-01-28 19:41 ` Jason Gunthorpe
2026-01-28 16:54 ` Sriharsha Basavapatna
2026-01-28 17:57 ` Sriharsha Basavapatna
2026-01-28 19:42 ` Jason Gunthorpe
2026-02-02 14:19 ` Sriharsha Basavapatna
2026-02-02 17:48 ` Jason Gunthorpe
2026-02-03 5:05 ` Sriharsha Basavapatna
2026-02-03 8:57 ` Jiri Pirko
2026-01-28 17:58 ` Jason Gunthorpe
2026-01-28 15:46 ` Jason Gunthorpe
2026-02-02 14:19 ` Sriharsha Basavapatna