* [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem
@ 2026-04-11 14:49 Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 01/15] RDMA/core: " Jiri Pirko
` (14 more replies)
0 siblings, 15 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
This patchset introduces a generic buffer descriptor infrastructure
for passing memory buffers (dma-buf or user VA) to uverbs commands,
and wires it up for CQ and QP creation in the uverbs core, efa, mlx5,
bnxt_re and mlx4 drivers.
Instead of adding per-command UAPI attributes for each new buffer
type, a single UVERBS_ATTR_BUFFERS array attribute carries all buffer
descriptors. Each descriptor specifies a buffer type and is indexed by
per-command slot enums. A consumption check ensures userspace and
driver agree on which buffers are used.
The patchset:
1. Introduces the core ib_umem_list infrastructure and UAPI.
2. Factors out CQ buffer umem processing into a helper.
3. Integrates umem_list into CQ creation, with fallback to existing
per-attribute path.
4-7. Converts efa, mlx5, bnxt_re and mlx4 to use umem_list for CQ
buffer.
8. Removes the legacy umem field from struct ib_cq, now that all
drivers use umem_list for CQ buffer management.
9. Adds a consumption check verifying all umem_list buffers were
consumed by the driver after CQ creation.
10. Integrates umem_list into QP creation.
11. Converts mlx5 QP creation to use umem_list.
12-15. Extends CQ and QP with doorbell record buffer slots and
converts mlx5 to use them.
Note: this reworks the original patchset that attempted to handle this:
https://lore.kernel.org/all/20260203085003.71184-1-jiri@resnulli.us/
The code is different enough that I'm sending it as a new patchset.
---
v1->v2:
One fix and one rebase; see the individual patches for the changelog.
Jiri Pirko (15):
RDMA/core: Introduce generic buffer descriptor infrastructure for umem
RDMA/uverbs: Push out CQ buffer umem processing into a helper
RDMA/uverbs: Integrate umem_list into CQ creation
RDMA/efa: Use umem_list for user CQ buffer
RDMA/mlx5: Use umem_list for user CQ buffer
RDMA/bnxt_re: Use umem_list for user CQ buffer
RDMA/mlx4: Use umem_list for user CQ buffer
RDMA/uverbs: Remove legacy umem field from struct ib_cq
RDMA/uverbs: Verify all umem_list buffers are consumed after CQ
creation
RDMA/uverbs: Integrate umem_list into QP creation
RDMA/mlx5: Use umem_list for QP buffers in create_qp
RDMA/uverbs: Add doorbell record buffer slot to CQ umem_list
RDMA/mlx5: Use umem_list for CQ doorbell record
RDMA/uverbs: Add doorbell record buffer slot to QP umem_list
RDMA/mlx5: Use umem_list for QP doorbell record
drivers/infiniband/core/core_priv.h | 1 +
drivers/infiniband/core/umem.c | 248 ++++++++++++++++++
drivers/infiniband/core/uverbs_cmd.c | 18 +-
drivers/infiniband/core/uverbs_std_types_cq.c | 158 ++++++-----
drivers/infiniband/core/uverbs_std_types_qp.c | 22 +-
drivers/infiniband/core/verbs.c | 27 +-
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 23 +-
drivers/infiniband/hw/efa/efa_verbs.c | 17 +-
drivers/infiniband/hw/mlx4/cq.c | 41 +--
drivers/infiniband/hw/mlx5/cq.c | 40 ++-
drivers/infiniband/hw/mlx5/doorbell.c | 41 ++-
drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +-
drivers/infiniband/hw/mlx5/qp.c | 76 ++++--
drivers/infiniband/hw/mlx5/srq.c | 2 +-
include/rdma/ib_umem.h | 54 ++++
include/rdma/ib_verbs.h | 5 +-
include/rdma/uverbs_ioctl.h | 14 +
include/uapi/rdma/ib_user_ioctl_cmds.h | 17 ++
include/uapi/rdma/ib_user_ioctl_verbs.h | 27 ++
19 files changed, 663 insertions(+), 171 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH rdma-next v2 01/15] RDMA/core: Introduce generic buffer descriptor infrastructure for umem
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-12 12:33 ` Michael Margolin
2026-04-11 14:49 ` [PATCH rdma-next v2 02/15] RDMA/uverbs: Push out CQ buffer umem processing into a helper Jiri Pirko
` (13 subsequent siblings)
14 siblings, 1 reply; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Add a unified mechanism for userspace to pass memory buffers to any
uverbs command via a single UVERBS_ATTR_BUFFERS attribute. Each
buffer is described by struct ib_uverbs_buffer_desc with a type
discriminator supporting dma-buf and user VA backed memory, extensible
for future buffer types.
The ib_umem_list API enables any uverbs command to accept multiple
buffers indexed by per-command slot enums, without requiring new UAPI
attributes for each buffer. A consumption check ensures userspace and
driver agree on which buffers are used.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/core/umem.c | 248 ++++++++++++++++++++++++
include/rdma/ib_umem.h | 54 ++++++
include/rdma/uverbs_ioctl.h | 14 ++
include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
include/uapi/rdma/ib_user_ioctl_verbs.h | 27 +++
5 files changed, 344 insertions(+)
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 786fa1aa8e55..f5b03e903b9d 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -37,6 +37,7 @@
#include <linux/dma-mapping.h>
#include <linux/sched/signal.h>
#include <linux/sched/mm.h>
+#include <linux/err.h>
#include <linux/export.h>
#include <linux/slab.h>
#include <linux/pagemap.h>
@@ -332,3 +333,250 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
return 0;
}
EXPORT_SYMBOL(ib_umem_copy_from);
+
+struct ib_umem_list {
+ unsigned int count; /* Total slots in the list. */
+ unsigned long provided; /* Bitmask of slots provided by the user. */
+ unsigned long loaded; /* Bitmask of slots loaded by the driver. */
+ struct ib_umem *umems[] __counted_by(count);
+};
+
+/**
+ * ib_umem_list_create - Create a umem list from UVERBS_ATTR_BUFFERS
+ * @device: IB device
+ * @attrs: uverbs attribute bundle
+ * @slot_max: highest buffer slot index (count = slot_max + 1)
+ *
+ * Return: umem list, or ERR_PTR on failure.
+ */
+struct ib_umem_list *ib_umem_list_create(struct ib_device *device,
+ const struct uverbs_attr_bundle *attrs,
+ unsigned int slot_max)
+{
+ const struct ib_uverbs_buffer_desc *descs;
+ struct ib_umem_dmabuf *umem_dmabuf;
+ struct ib_umem_list *list;
+ struct ib_umem *umem;
+ unsigned int count;
+ int num_descs;
+ int err;
+ int i;
+
+ if (WARN_ON_ONCE(slot_max >= BITS_PER_LONG))
+ return ERR_PTR(-EINVAL);
+ count = slot_max + 1;
+
+ num_descs = uverbs_attr_ptr_get_array_size(
+ (struct uverbs_attr_bundle *)attrs, UVERBS_ATTR_BUFFERS,
+ sizeof(*descs));
+ if (num_descs == -ENOENT) {
+ num_descs = 0;
+ descs = NULL;
+ } else if (num_descs < 0) {
+ return ERR_PTR(num_descs);
+ } else if (num_descs > count) {
+ return ERR_PTR(-EINVAL);
+ } else {
+ descs = uverbs_attr_get_alloced_ptr(attrs, UVERBS_ATTR_BUFFERS);
+ if (IS_ERR(descs))
+ return ERR_CAST(descs);
+ }
+
+ list = kzalloc(struct_size(list, umems, count), GFP_KERNEL);
+ if (!list)
+ return ERR_PTR(-ENOMEM);
+ list->count = count;
+
+ for (i = 0; i < num_descs; i++) {
+ unsigned int idx = descs[i].index;
+
+ if (descs[i].reserved) {
+ err = -EINVAL;
+ goto err_release;
+ }
+ if (idx >= count || (list->provided & BIT(idx))) {
+ err = -EINVAL;
+ goto err_release;
+ }
+
+ switch (descs[i].type) {
+ case IB_UVERBS_BUFFER_TYPE_DMABUF:
+ umem_dmabuf = ib_umem_dmabuf_get_pinned(
+ device, descs[i].addr, descs[i].length,
+ descs[i].fd, IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem_dmabuf)) {
+ err = PTR_ERR(umem_dmabuf);
+ goto err_release;
+ }
+ list->umems[idx] = &umem_dmabuf->umem;
+ break;
+ case IB_UVERBS_BUFFER_TYPE_VA:
+ umem = ib_umem_get(device, descs[i].addr,
+ descs[i].length, IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem)) {
+ err = PTR_ERR(umem);
+ goto err_release;
+ }
+ list->umems[idx] = umem;
+ break;
+ default:
+ err = -EINVAL;
+ goto err_release;
+ }
+ list->provided |= BIT(idx);
+ }
+
+ return list;
+
+err_release:
+ ib_umem_list_release(list);
+ return ERR_PTR(err);
+}
+EXPORT_SYMBOL(ib_umem_list_create);
+
+/**
+ * ib_umem_list_release - Release all umems in the list and free it
+ * @list: umem list
+ */
+void ib_umem_list_release(struct ib_umem_list *list)
+{
+ int i;
+
+ if (!list)
+ return;
+ for (i = 0; i < list->count; i++)
+ ib_umem_release(list->umems[i]);
+ kfree(list);
+}
+EXPORT_SYMBOL(ib_umem_list_release);
+
+/**
+ * ib_umem_list_check_consumed - Verify all provided umems were loaded
+ * @list: umem list
+ *
+ * Return: 0 if all provided slots were loaded, -EINVAL otherwise.
+ */
+int ib_umem_list_check_consumed(const struct ib_umem_list *list)
+{
+ return (list->provided & ~list->loaded) == 0 ? 0 : -EINVAL;
+}
+EXPORT_SYMBOL(ib_umem_list_check_consumed);
+
+/**
+ * ib_umem_list_insert - Insert a umem into the list at a given index
+ * @list: umem list
+ * @index: per-command buffer slot index
+ * @umem: umem pointer to store
+ *
+ * Stores @umem at @index, releasing any umem already stored there. For use
+ * when the buffer comes from legacy ATTRs rather than the buffer list.
+ */
+void ib_umem_list_insert(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem)
+{
+ ib_umem_list_replace(list, index, umem);
+ if (umem)
+ list->provided |= BIT(index);
+}
+EXPORT_SYMBOL(ib_umem_list_insert);
+
+/**
+ * ib_umem_list_load - Load a umem from the list by index
+ * @list: umem list (may be NULL)
+ * @index: per-command buffer slot index
+ * @size: minimum required umem length
+ *
+ * Return: umem pointer; NULL if the slot is empty or out of bounds;
+ * ERR_PTR(-EINVAL) if the umem is too small.
+ */
+struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
+ unsigned int index, size_t size)
+{
+ struct ib_umem *umem;
+
+ if (!list || index >= list->count)
+ return NULL;
+ umem = list->umems[index];
+ if (!umem)
+ return NULL;
+ if (umem->length < size)
+ return ERR_PTR(-EINVAL);
+ list->loaded |= BIT(index);
+ return umem;
+}
+EXPORT_SYMBOL(ib_umem_list_load);
+
+/**
+ * ib_umem_list_load_or_get - Umem from list or pin user memory
+ * @list: umem list (may be NULL)
+ * @index: per-command buffer slot index
+ * @device: IB device for ib_umem_get when the list slot is empty
+ * @addr: user virtual address for ib_umem_get
+ * @size: length for ib_umem_get
+ * @access: access flags for ib_umem_get
+ *
+ * If @list has a umem at @index, returns it like ib_umem_list_load() (and
+ * marks the slot loaded). Otherwise calls ib_umem_get() with the given
+ * @access flags and on success stores the result at @index when
+ * @list is non-NULL.
+ *
+ * Return: valid umem pointer, or ERR_PTR.
+ */
+struct ib_umem *ib_umem_list_load_or_get(struct ib_umem_list *list,
+ unsigned int index,
+ struct ib_device *device,
+ unsigned long addr, size_t size,
+ int access)
+{
+ struct ib_umem *umem;
+
+ umem = ib_umem_list_load(list, index, size);
+ if (IS_ERR(umem) || umem)
+ return umem;
+ umem = ib_umem_get(device, addr, size, access);
+ if (IS_ERR(umem))
+ return umem;
+ if (list && index < list->count)
+ list->umems[index] = umem;
+ return umem;
+}
+EXPORT_SYMBOL(ib_umem_list_load_or_get);
+
+/**
+ * ib_umem_list_replace - Replace umem at index, releasing the previous one
+ * @list: umem list (may be NULL)
+ * @index: per-command buffer slot index
+ * @umem: new umem pointer (may be NULL to clear the slot)
+ *
+ * Stores @umem at @index. If a different umem was already stored there, it is
+ * released. Used for CQ resize and similar.
+ */
+void ib_umem_list_replace(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem)
+{
+ struct ib_umem *old;
+
+ if (!list || index >= list->count)
+ return;
+ old = list->umems[index];
+ list->umems[index] = umem;
+ if (old && old != umem)
+ ib_umem_release(old);
+}
+EXPORT_SYMBOL(ib_umem_list_replace);
+
+/**
+ * ib_umem_release_non_listed - Release a umem that is not stored in the list
+ * @list: umem list
+ * @index: per-command buffer slot index
+ * @umem: umem pointer to release
+ *
+ * Releases @umem if it is not stored in @list.
+ */
+void ib_umem_release_non_listed(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem)
+{
+ if (!list || index >= list->count || list->umems[index] != umem)
+ ib_umem_release(umem);
+}
+EXPORT_SYMBOL(ib_umem_release_non_listed);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 2ad52cc1d52b..924acb8d08c3 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -11,6 +11,7 @@
struct ib_device;
struct dma_buf_attach_ops;
+struct uverbs_attr_bundle;
struct ib_umem {
struct ib_device *ibdev;
@@ -80,6 +81,36 @@ struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
void ib_umem_release(struct ib_umem *umem);
int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
size_t length);
+
+/**
+ * struct ib_umem_list - collection of pre-mapped umems
+ *
+ * Created from the UVERBS_ATTR_BUFFERS attribute. Each entry is indexed
+ * by a per-command buffer slot enum (e.g., IB_UMEM_CQ_BUF for CQ CREATE).
+ * Drivers use ib_umem_list_load() to retrieve a specific umem by index.
+ */
+struct ib_umem_list;
+
+struct ib_umem_list *ib_umem_list_create(struct ib_device *device,
+ const struct uverbs_attr_bundle *attrs,
+ unsigned int slot_max);
+void ib_umem_list_release(struct ib_umem_list *list);
+int ib_umem_list_check_consumed(const struct ib_umem_list *list);
+void ib_umem_list_insert(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem);
+
+struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
+ unsigned int index, size_t size);
+struct ib_umem *ib_umem_list_load_or_get(struct ib_umem_list *list,
+ unsigned int index,
+ struct ib_device *device,
+ unsigned long addr, size_t size,
+ int access);
+void ib_umem_list_replace(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem);
+void ib_umem_release_non_listed(struct ib_umem_list *list, unsigned int index,
+ struct ib_umem *umem);
+
unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
unsigned long pgsz_bitmap,
unsigned long virt);
@@ -230,5 +261,28 @@ static inline void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf
static inline void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf) {}
static inline void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf) {}
+struct ib_umem_list;
+
+static inline void ib_umem_list_release(struct ib_umem_list *list) { }
+static inline struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
+ unsigned int index,
+ size_t size)
+{
+ return ERR_PTR(-EOPNOTSUPP);
+}
+static inline struct ib_umem *
+ib_umem_list_load_or_get(struct ib_umem_list *list, unsigned int index,
+ struct ib_device *device, unsigned long addr,
+ size_t size, int access)
+{
+ return ERR_PTR(-EOPNOTSUPP);
+}
+static inline void ib_umem_list_replace(struct ib_umem_list *list,
+ unsigned int index,
+ struct ib_umem *umem) { }
+static inline void ib_umem_release_non_listed(struct ib_umem_list *list,
+ unsigned int index,
+ struct ib_umem *umem) { }
+
#endif /* CONFIG_INFINIBAND_USER_MEM */
#endif /* IB_UMEM_H */
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
index e2af17da3e32..05bcab27a87d 100644
--- a/include/rdma/uverbs_ioctl.h
+++ b/include/rdma/uverbs_ioctl.h
@@ -590,6 +590,20 @@ struct uapi_definition {
UA_OPTIONAL, \
.is_udata = 1)
+/*
+ * Optional array of struct ib_uverbs_buffer_desc describing memory regions
+ * backed by dma-buf or user virtual address. Can be added to any method
+ * that needs external buffer support.
+ * Each entry carries an index field selecting the per-command buffer slot.
+ * Use ib_umem_list_create() to map them and ib_umem_list_load() to access.
+ */
+#define UVERBS_ATTR_BUFFERS() \
+ UVERBS_ATTR_PTR_IN(UVERBS_ATTR_BUFFERS, \
+ UVERBS_ATTR_MIN_SIZE( \
+ sizeof(struct ib_uverbs_buffer_desc)), \
+ UA_OPTIONAL, \
+ UA_ALLOC_AND_COPY)
+
/* =================================================
* Parsing infrastructure
* =================================================
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 72041c1b0ea5..10aa6568abf1 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -64,6 +64,7 @@ enum {
UVERBS_ATTR_UHW_IN = UVERBS_ID_DRIVER_NS,
UVERBS_ATTR_UHW_OUT,
UVERBS_ID_DRIVER_NS_WITH_UHW,
+ UVERBS_ATTR_BUFFERS,
};
enum uverbs_methods_device {
diff --git a/include/uapi/rdma/ib_user_ioctl_verbs.h b/include/uapi/rdma/ib_user_ioctl_verbs.h
index 90c5cd8e7753..41ed9f75b4de 100644
--- a/include/uapi/rdma/ib_user_ioctl_verbs.h
+++ b/include/uapi/rdma/ib_user_ioctl_verbs.h
@@ -273,4 +273,31 @@ struct ib_uverbs_gid_entry {
__u32 netdev_ifindex; /* It is 0 if there is no netdev associated with it */
};
+enum ib_uverbs_buffer_type {
+ IB_UVERBS_BUFFER_TYPE_DMABUF,
+ IB_UVERBS_BUFFER_TYPE_VA,
+};
+
+/*
+ * Describes a single buffer backed by dma-buf or user virtual address.
+ * Passed as an array via UVERBS_ATTR_BUFFERS. Each uverbs command that
+ * accepts this attribute defines its own per-command buffer slot enum.
+ * The index field selects the buffer slot this descriptor maps to.
+ *
+ * @fd: dma-buf file descriptor (valid for IB_UVERBS_BUFFER_TYPE_DMABUF)
+ * @type: buffer type from enum ib_uverbs_buffer_type
+ * @index: per-command buffer slot index
+ * @reserved: must be zero
+ * @addr: offset within dma-buf, or user virtual address for VA
+ * @length: buffer length in bytes
+ */
+struct ib_uverbs_buffer_desc {
+ __s32 fd;
+ __u32 type;
+ __u32 index;
+ __u32 reserved;
+ __aligned_u64 addr;
+ __aligned_u64 length;
+};
+
#endif
--
2.53.0
* [PATCH rdma-next v2 02/15] RDMA/uverbs: Push out CQ buffer umem processing into a helper
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 01/15] RDMA/core: " Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 03/15] RDMA/uverbs: Integrate umem_list into CQ creation Jiri Pirko
` (12 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Extract the UVERBS_ATTR_CREATE_CQ_BUFFER_* attribute processing from
the CQ create handler into uverbs_create_cq_get_umem() and separate
buffer acquisition logic from the rest of CQ creation.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/core/uverbs_std_types_cq.c | 127 ++++++++++--------
1 file changed, 69 insertions(+), 58 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c b/drivers/infiniband/core/uverbs_std_types_cq.c
index d2c8f71f934c..4afe27fef6c9 100644
--- a/drivers/infiniband/core/uverbs_std_types_cq.c
+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
@@ -58,6 +58,72 @@ static int uverbs_free_cq(struct ib_uobject *uobject,
return 0;
}
+static struct ib_umem *uverbs_create_cq_get_umem(struct ib_device *ib_dev,
+ struct uverbs_attr_bundle *attrs)
+{
+ struct ib_umem_dmabuf *umem_dmabuf;
+ u64 buffer_length;
+ u64 buffer_offset;
+ u64 buffer_va;
+ int buffer_fd;
+ int ret;
+
+ if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_VA)) {
+ ret = uverbs_copy_from(&buffer_va, attrs,
+ UVERBS_ATTR_CREATE_CQ_BUFFER_VA);
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = uverbs_copy_from(&buffer_length, attrs,
+ UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH);
+ if (ret)
+ return ERR_PTR(ret);
+
+ if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_FD) ||
+ uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET) ||
+ !ib_dev->ops.create_user_cq)
+ return ERR_PTR(-EINVAL);
+
+ return ib_umem_get(ib_dev, buffer_va, buffer_length,
+ IB_ACCESS_LOCAL_WRITE);
+ }
+
+ if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_FD)) {
+ ret = uverbs_get_raw_fd(&buffer_fd, attrs,
+ UVERBS_ATTR_CREATE_CQ_BUFFER_FD);
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = uverbs_copy_from(&buffer_offset, attrs,
+ UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET);
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = uverbs_copy_from(&buffer_length, attrs,
+ UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH);
+ if (ret)
+ return ERR_PTR(ret);
+
+ if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_VA) ||
+ !ib_dev->ops.create_user_cq)
+ return ERR_PTR(-EINVAL);
+
+ umem_dmabuf = ib_umem_dmabuf_get_pinned(ib_dev, buffer_offset,
+ buffer_length, buffer_fd,
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem_dmabuf))
+ return ERR_CAST(umem_dmabuf);
+ return &umem_dmabuf->umem;
+ }
+
+ if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET) ||
+ uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH) ||
+ !ib_dev->ops.create_cq)
+ return ERR_PTR(-EINVAL);
+
+ return NULL;
+}
+
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
struct uverbs_attr_bundle *attrs)
{
@@ -66,16 +132,11 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
typeof(*obj), uevent.uobject);
struct ib_uverbs_completion_event_file *ev_file = NULL;
struct ib_device *ib_dev = attrs->context->device;
- struct ib_umem_dmabuf *umem_dmabuf;
struct ib_cq_init_attr attr = {};
struct ib_uobject *ev_file_uobj;
struct ib_umem *umem = NULL;
- u64 buffer_length;
- u64 buffer_offset;
struct ib_cq *cq;
u64 user_handle;
- u64 buffer_va;
- int buffer_fd;
int ret;
if ((!ib_dev->ops.create_cq && !ib_dev->ops.create_user_cq) ||
@@ -122,59 +183,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
INIT_LIST_HEAD(&obj->comp_list);
INIT_LIST_HEAD(&obj->uevent.event_list);
- if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_VA)) {
-
- ret = uverbs_copy_from(&buffer_va, attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_VA);
- if (ret)
- goto err_event_file;
-
- ret = uverbs_copy_from(&buffer_length, attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH);
- if (ret)
- goto err_event_file;
-
- if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_FD) ||
- uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET) ||
- !ib_dev->ops.create_user_cq) {
- ret = -EINVAL;
- goto err_event_file;
- }
-
- umem = ib_umem_get(ib_dev, buffer_va, buffer_length, IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(umem)) {
- ret = PTR_ERR(umem);
- goto err_event_file;
- }
- } else if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_FD)) {
-
- ret = uverbs_get_raw_fd(&buffer_fd, attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_FD);
- if (ret)
- goto err_event_file;
-
- ret = uverbs_copy_from(&buffer_offset, attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET);
- if (ret)
- goto err_event_file;
-
- ret = uverbs_copy_from(&buffer_length, attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH);
- if (ret)
- goto err_event_file;
-
- if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_VA) ||
- !ib_dev->ops.create_user_cq) {
- ret = -EINVAL;
- goto err_event_file;
- }
-
- umem_dmabuf = ib_umem_dmabuf_get_pinned(ib_dev, buffer_offset, buffer_length,
- buffer_fd, IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(umem_dmabuf)) {
- ret = PTR_ERR(umem_dmabuf);
- goto err_event_file;
- }
- umem = &umem_dmabuf->umem;
- } else if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET) ||
- uverbs_attr_is_valid(attrs, UVERBS_ATTR_CREATE_CQ_BUFFER_LENGTH) ||
- !ib_dev->ops.create_cq) {
- ret = -EINVAL;
+ umem = uverbs_create_cq_get_umem(ib_dev, attrs);
+ if (IS_ERR(umem)) {
+ ret = PTR_ERR(umem);
goto err_event_file;
}
--
2.53.0
* [PATCH rdma-next v2 03/15] RDMA/uverbs: Integrate umem_list into CQ creation
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 01/15] RDMA/core: " Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 02/15] RDMA/uverbs: Push out CQ buffer umem processing into a helper Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 04/15] RDMA/efa: Use umem_list for user CQ buffer Jiri Pirko
` (11 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Wire up the generic buffer descriptor infrastructure to the CQ create
command, with fallback to the existing per-attribute path. Add
umem_list field to struct ib_cq and define the CQ buffer slot enum.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/core/uverbs_cmd.c | 15 +++++++++++--
drivers/infiniband/core/uverbs_std_types_cq.c | 22 ++++++++++++++-----
drivers/infiniband/core/verbs.c | 9 +++++---
include/rdma/ib_verbs.h | 2 ++
include/uapi/rdma/ib_user_ioctl_cmds.h | 6 +++++
5 files changed, 44 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index a768436ba468..77874834108b 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -42,6 +42,7 @@
#include <rdma/uverbs_types.h>
#include <rdma/uverbs_std_types.h>
+#include <rdma/ib_umem.h>
#include <rdma/ib_ucaps.h>
#include "rdma_core.h"
@@ -1011,6 +1012,7 @@ static int create_cq(struct uverbs_attr_bundle *attrs,
{
struct ib_ucq_object *obj;
struct ib_uverbs_completion_event_file *ev_file = NULL;
+ struct ib_umem_list *umem_list;
struct ib_cq *cq;
int ret;
struct ib_uverbs_ex_create_cq_resp resp = {};
@@ -1044,16 +1046,23 @@ static int create_cq(struct uverbs_attr_bundle *attrs,
attr.comp_vector = cmd->comp_vector;
attr.flags = cmd->flags;
+ umem_list = ib_umem_list_create(ib_dev, attrs, UVERBS_BUF_CQ_MAX);
+ if (IS_ERR(umem_list)) {
+ ret = PTR_ERR(umem_list);
+ goto err_file;
+ }
+
cq = rdma_zalloc_drv_obj(ib_dev, ib_cq);
if (!cq) {
ret = -ENOMEM;
- goto err_file;
+ goto err_list_release;
}
cq->device = ib_dev;
cq->uobject = obj;
cq->comp_handler = ib_uverbs_comp_handler;
cq->event_handler = ib_uverbs_cq_event_handler;
cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
+ cq->umem_list = umem_list;
atomic_set(&cq->usecnt, 0);
rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ);
@@ -1079,9 +1088,11 @@ static int create_cq(struct uverbs_attr_bundle *attrs,
return uverbs_response(attrs, &resp, sizeof(resp));
err_free:
- ib_umem_release(cq->umem);
+ ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_put(&cq->res);
kfree(cq);
+err_list_release:
+ ib_umem_list_release(umem_list);
err_file:
if (ev_file)
ib_uverbs_release_ucq(ev_file, obj);
diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c b/drivers/infiniband/core/uverbs_std_types_cq.c
index 4afe27fef6c9..f87cd11470fc 100644
--- a/drivers/infiniband/core/uverbs_std_types_cq.c
+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
@@ -134,6 +134,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
struct ib_device *ib_dev = attrs->context->device;
struct ib_cq_init_attr attr = {};
struct ib_uobject *ev_file_uobj;
+ struct ib_umem_list *umem_list;
struct ib_umem *umem = NULL;
struct ib_cq *cq;
u64 user_handle;
@@ -183,17 +184,24 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
INIT_LIST_HEAD(&obj->comp_list);
INIT_LIST_HEAD(&obj->uevent.event_list);
+ umem_list = ib_umem_list_create(ib_dev, attrs, UVERBS_BUF_CQ_MAX);
+ if (IS_ERR(umem_list)) {
+ ret = PTR_ERR(umem_list);
+ goto err_event_file;
+ }
+
umem = uverbs_create_cq_get_umem(ib_dev, attrs);
if (IS_ERR(umem)) {
ret = PTR_ERR(umem);
- goto err_event_file;
+ goto err_umem_list;
}
+ if (umem)
+ ib_umem_list_insert(umem_list, UVERBS_BUF_CQ_BUF, umem);
cq = rdma_zalloc_drv_obj(ib_dev, ib_cq);
if (!cq) {
ret = -ENOMEM;
- ib_umem_release(umem);
- goto err_event_file;
+ goto err_umem_list;
}
cq->device = ib_dev;
@@ -206,6 +214,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
* CQ creation based on their internal udata.
*/
cq->umem = umem;
+ cq->umem_list = umem_list;
atomic_set(&cq->usecnt, 0);
rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ);
@@ -231,9 +240,11 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
return ret;
err_free:
- ib_umem_release(cq->umem);
+ ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_put(&cq->res);
kfree(cq);
+err_umem_list:
+ ib_umem_list_release(umem_list);
err_event_file:
if (obj->uevent.event_file)
uverbs_uobject_put(&obj->uevent.event_file->uobj);
@@ -281,7 +292,8 @@ DECLARE_UVERBS_NAMED_METHOD(
UVERBS_ATTR_PTR_IN(UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET,
UVERBS_ATTR_TYPE(u64),
UA_OPTIONAL),
- UVERBS_ATTR_UHW());
+ UVERBS_ATTR_UHW(),
+ UVERBS_ATTR_BUFFERS());
static int UVERBS_HANDLER(UVERBS_METHOD_CQ_DESTROY)(
struct uverbs_attr_bundle *attrs)
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index bac87de9cc67..ed163fc56ef8 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -50,6 +50,7 @@
#include <rdma/ib_cache.h>
#include <rdma/ib_addr.h>
#include <rdma/ib_umem.h>
+#include <rdma/ib_user_ioctl_cmds.h>
#include <rdma/rw.h>
#include <rdma/lag.h>
@@ -2223,9 +2224,9 @@ struct ib_cq *__ib_create_cq(struct ib_device *device,
}
/*
* We are in kernel verbs flow and drivers are not allowed
- * to set umem pointer, it needs to stay NULL.
+ * to set umem or umem_list pointers, they need to stay NULL.
*/
- WARN_ON_ONCE(cq->umem);
+ WARN_ON_ONCE(cq->umem || cq->umem_list);
rdma_restrack_add(&cq->res);
return cq;
@@ -2245,6 +2246,7 @@ EXPORT_SYMBOL(rdma_set_cq_moderation);
int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
{
+ struct ib_umem_list *umem_list = cq->umem_list;
int ret;
if (WARN_ON_ONCE(cq->shared))
@@ -2257,9 +2259,10 @@ int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
if (ret)
return ret;
- ib_umem_release(cq->umem);
+ ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_del(&cq->res);
kfree(cq);
+ ib_umem_list_release(umem_list);
return ret;
}
EXPORT_SYMBOL(ib_destroy_cq_user);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 9dd76f489a0b..dd6c0d68497d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1740,6 +1740,8 @@ struct ib_cq {
unsigned int comp_vector;
struct ib_umem *umem;
+ struct ib_umem_list *umem_list;
+
/*
* Implementation details of the RDMA core, don't use in drivers:
*/
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 10aa6568abf1..375e4e224f6a 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -120,6 +120,12 @@ enum uverbs_attrs_create_cq_cmd_attr_ids {
UVERBS_ATTR_CREATE_CQ_BUFFER_OFFSET,
};
+enum uverbs_buf_cq_slots {
+ UVERBS_BUF_CQ_BUF,
+ __UVERBS_BUF_CQ_MAX,
+ UVERBS_BUF_CQ_MAX = __UVERBS_BUF_CQ_MAX - 1,
+};
+
enum uverbs_attrs_destroy_cq_cmd_attr_ids {
UVERBS_ATTR_DESTROY_CQ_HANDLE,
UVERBS_ATTR_DESTROY_CQ_RESP,
--
2.53.0
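The slot enum added to the UAPI above uses the common kernel idiom of a hidden `__..._MAX` sentinel, so the public `..._MAX` value names the highest valid slot while the sentinel counts the slots. A minimal standalone sketch of that idiom (the `demo_` names are illustrative, not from the patch):

```c
#include <assert.h>

/* Mirror of the uverbs_buf_cq_slots idiom: the hidden sentinel counts
 * the slots, and the public MAX is the highest valid slot index. */
enum demo_buf_cq_slots {
	DEMO_BUF_CQ_BUF,
	__DEMO_BUF_CQ_MAX,
	DEMO_BUF_CQ_MAX = __DEMO_BUF_CQ_MAX - 1,
};

/* Number of descriptor slots a table for this enum must hold. */
static inline int demo_cq_slot_count(void)
{
	return __DEMO_BUF_CQ_MAX;
}
```

New slots appended before the sentinel automatically extend both values without touching existing userspace.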
* [PATCH rdma-next v2 04/15] RDMA/efa: Use umem_list for user CQ buffer
From: Jiri Pirko <jiri@nvidia.com>
Load the CQ buffer umem using ib_umem_list_load() instead of reading it from ibcq->umem.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/efa/efa_verbs.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 7bd0838ebc99..b3236a40b87f 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -1124,6 +1124,7 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
struct efa_ibv_create_cq cmd;
struct efa_cq *cq = to_ecq(ibcq);
int entries = attr->cqe;
+ struct ib_umem *umem;
bool set_src_addr;
int err;
@@ -1172,20 +1173,18 @@ int efa_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
cq->ucontext = ucontext;
cq->size = PAGE_ALIGN(cmd.cq_entry_size * entries * cmd.num_sub_cqs);
- if (ibcq->umem) {
- if (ibcq->umem->length < cq->size) {
- ibdev_dbg(&dev->ibdev, "External memory too small\n");
- err = -EINVAL;
- goto err_out;
- }
-
- if (!ib_umem_is_contiguous(ibcq->umem)) {
+ umem = ib_umem_list_load(ibcq->umem_list, UVERBS_BUF_CQ_BUF, cq->size);
+ if (IS_ERR(umem)) {
+ err = PTR_ERR(umem);
+ goto err_out;
+ } else if (umem) {
+ if (!ib_umem_is_contiguous(umem)) {
ibdev_dbg(&dev->ibdev, "Non contiguous CQ unsupported\n");
err = -EINVAL;
goto err_out;
}
- cq->dma_addr = ib_umem_start_dma_addr(ibcq->umem);
+ cq->dma_addr = ib_umem_start_dma_addr(umem);
} else {
cq->cpu_addr = efa_zalloc_mapped(dev, &cq->dma_addr, cq->size,
DMA_FROM_DEVICE);
--
2.53.0
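The efa conversion above relies on ib_umem_list_load() having three outcomes: an ERR_PTR() on failure (e.g. the listed buffer is too small), NULL when userspace supplied no buffer for the slot (so the driver falls back to its own allocation), and otherwise a valid umem. A standalone sketch of that assumed calling convention, with minimal userspace stand-ins for the kernel's ERR_PTR helpers (the stub load function and slot table are illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal userspace stand-ins for the kernel ERR_PTR helpers. */
#define MAX_ERRNO 4095
#define ERR_PTR(err) ((void *)(intptr_t)(err))
#define PTR_ERR(ptr) ((int)(intptr_t)(ptr))
#define IS_ERR(ptr) ((uintptr_t)(ptr) >= (uintptr_t)-MAX_ERRNO)
#define EINVAL 22

struct demo_umem { size_t length; };

/* Sketch of the assumed ib_umem_list_load() contract: error pointer if
 * the listed buffer is too small, NULL if the slot is empty, else the
 * umem itself. */
static struct demo_umem *demo_list_load(struct demo_umem **slots,
					unsigned int slot, size_t min_size)
{
	struct demo_umem *umem = slots[slot];

	if (!umem)
		return NULL;		/* no user buffer: driver allocates */
	if (umem->length < min_size)
		return ERR_PTR(-EINVAL);
	return umem;
}
```

This is why the caller must check IS_ERR() first and only then test for NULL, as the efa hunk does.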
* [PATCH rdma-next v2 05/15] RDMA/mlx5: Use umem_list for user CQ buffer
From: Jiri Pirko <jiri@nvidia.com>
Use ib_umem_list_load_or_get() and ib_umem_list_replace() to obtain and update the CQ buffer umem instead of accessing ibcq->umem directly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/mlx5/cq.c | 35 +++++++++++++++------------------
1 file changed, 16 insertions(+), 19 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index a76b7a36087d..bb9ed7caec67 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -727,6 +727,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
int ncont;
void *cqc;
int err;
+ struct ib_umem *umem;
struct mlx5_ib_ucontext *context = rdma_udata_to_drv_context(
udata, struct mlx5_ib_ucontext, ibucontext);
@@ -745,31 +746,29 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
*cqe_size = ucmd.cqe_size;
- if (!cq->ibcq.umem)
- cq->ibcq.umem = ib_umem_get(&dev->ib_dev, ucmd.buf_addr,
- entries * ucmd.cqe_size,
- IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(cq->ibcq.umem))
- return PTR_ERR(cq->ibcq.umem);
+ umem = ib_umem_list_load_or_get(cq->ibcq.umem_list, UVERBS_BUF_CQ_BUF,
+ &dev->ib_dev, ucmd.buf_addr,
+ entries * ucmd.cqe_size,
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem))
+ return PTR_ERR(umem);
page_size = mlx5_umem_find_best_cq_quantized_pgoff(
- cq->ibcq.umem, cqc, log_page_size, MLX5_ADAPTER_PAGE_SHIFT,
+ umem, cqc, log_page_size, MLX5_ADAPTER_PAGE_SHIFT,
page_offset, 64, &page_offset_quantized);
- if (!page_size) {
- err = -EINVAL;
- goto err_umem;
- }
+ if (!page_size)
+ return -EINVAL;
err = mlx5_ib_db_map_user(context, ucmd.db_addr, &cq->db);
if (err)
- goto err_umem;
+ return err;
- ncont = ib_umem_num_dma_blocks(cq->ibcq.umem, page_size);
+ ncont = ib_umem_num_dma_blocks(umem, page_size);
mlx5_ib_dbg(
dev,
"addr 0x%llx, size %u, npages %zu, page_size %lu, ncont %d\n",
ucmd.buf_addr, entries * ucmd.cqe_size,
- ib_umem_num_pages(cq->ibcq.umem), page_size, ncont);
+ ib_umem_num_pages(umem), page_size, ncont);
*inlen = MLX5_ST_SZ_BYTES(create_cq_in) +
MLX5_FLD_SZ_BYTES(create_cq_in, pas[0]) * ncont;
@@ -780,7 +779,7 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
}
pas = (__be64 *)MLX5_ADDR_OF(create_cq_in, *cqb, pas);
- mlx5_ib_populate_pas(cq->ibcq.umem, page_size, pas, 0);
+ mlx5_ib_populate_pas(umem, page_size, pas, 0);
cqc = MLX5_ADDR_OF(create_cq_in, *cqb, cq_context);
MLX5_SET(cqc, cqc, log_page_size,
@@ -851,9 +850,6 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
err_db:
mlx5_ib_db_unmap_user(context, &cq->db);
-
-err_umem:
- /* UMEM is released by ib_core */
return err;
}
@@ -1434,7 +1430,8 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, unsigned int entries,
if (udata) {
cq->ibcq.cqe = entries - 1;
- ib_umem_release(cq->ibcq.umem);
+ ib_umem_list_replace(cq->ibcq.umem_list, UVERBS_BUF_CQ_BUF,
+ cq->resize_umem);
cq->ibcq.umem = cq->resize_umem;
cq->resize_umem = NULL;
} else {
--
2.53.0
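The resize path above swaps the slot's umem for cq->resize_umem via ib_umem_list_replace(); since that call substitutes for the old ib_umem_release(), it presumably releases the previous slot entry before storing the replacement. A small sketch of that assumed ownership semantic (all `demo_` names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

struct demo_umem { int released; };

static void demo_umem_release(struct demo_umem *umem)
{
	if (umem)
		umem->released++;
}

/* Assumed ib_umem_list_replace() semantic: drop the slot's current
 * umem and take ownership of the replacement. */
static void demo_list_replace(struct demo_umem **slots, unsigned int slot,
			      struct demo_umem *umem)
{
	demo_umem_release(slots[slot]);
	slots[slot] = umem;
}
```

With the list owning the slot entry, the driver no longer releases the old CQ buffer itself on resize.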
* [PATCH rdma-next v2 06/15] RDMA/bnxt_re: Use umem_list for user CQ buffer
From: Jiri Pirko <jiri@nvidia.com>
Use ib_umem_list_load_or_get() and ib_umem_list_replace() to obtain and update the CQ buffer umem instead of accessing ibcq->umem directly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 7ed294516b7e..5c6fc81fad6a 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3379,6 +3379,7 @@ int bnxt_re_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *att
struct bnxt_re_cq_req req;
int rc;
u32 active_cqs, entries;
+ struct ib_umem *umem;
if (attr->flags)
return -EOPNOTSUPP;
@@ -3402,15 +3403,14 @@ int bnxt_re_create_user_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *att
entries = bnxt_re_init_depth(attr->cqe + 1,
dev_attr->max_cq_wqes + 1, uctx);
- if (!ibcq->umem) {
- ibcq->umem = ib_umem_get(&rdev->ibdev, req.cq_va,
- entries * sizeof(struct cq_base),
- IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(ibcq->umem))
- return PTR_ERR(ibcq->umem);
- }
+ umem = ib_umem_list_load_or_get(ibcq->umem_list, UVERBS_BUF_CQ_BUF,
+ &rdev->ibdev, req.cq_va,
+ entries * sizeof(struct cq_base),
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem))
+ return PTR_ERR(umem);
- rc = bnxt_re_setup_sginfo(rdev, ibcq->umem, &cq->qplib_cq.sg_info);
+ rc = bnxt_re_setup_sginfo(rdev, umem, &cq->qplib_cq.sg_info);
if (rc)
return rc;
@@ -3516,8 +3516,10 @@ static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq)
cq->qplib_cq.max_wqe = cq->resize_cqe;
if (cq->resize_umem) {
- ib_umem_release(cq->ib_cq.umem);
+ ib_umem_list_replace(cq->ib_cq.umem_list, UVERBS_BUF_CQ_BUF,
+ cq->resize_umem);
cq->ib_cq.umem = cq->resize_umem;
+ cq->qplib_cq.sg_info.umem = cq->resize_umem;
cq->resize_umem = NULL;
cq->resize_cqe = 0;
}
@@ -4113,7 +4115,7 @@ int bnxt_re_poll_cq(struct ib_cq *ib_cq, int num_entries, struct ib_wc *wc)
/* User CQ; the only processing we do is to
* complete any pending CQ resize operation.
*/
- if (cq->ib_cq.umem) {
+ if (ib_cq->uobject) {
if (cq->resize_umem)
bnxt_re_resize_cq_complete(cq);
return 0;
--
2.53.0
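The ib_umem_list_load_or_get() call used by mlx5 and bnxt_re collapses the two legacy branches into one: take the umem from the slot if userspace supplied one, otherwise pin the user VA from the command with ib_umem_get() and record the result in the slot so the list owns its lifetime. A hedged standalone sketch of that assumed semantic (the types and the get stub are illustrative stand-ins, not the kernel API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_umem { unsigned long long addr; size_t size; };

/* Stand-in for ib_umem_get(): pin a user VA range. */
static struct demo_umem *demo_umem_get(unsigned long long addr, size_t size)
{
	struct demo_umem *umem = malloc(sizeof(*umem));

	umem->addr = addr;
	umem->size = size;
	return umem;
}

/* Assumed load-or-get semantic: prefer the listed buffer, otherwise
 * fall back to pinning the VA from the command and store the result
 * in the slot so the list owns its lifetime. */
static struct demo_umem *demo_load_or_get(struct demo_umem **slots,
					  unsigned int slot,
					  unsigned long long addr, size_t size)
{
	if (slots[slot])
		return slots[slot];
	slots[slot] = demo_umem_get(addr, size);
	return slots[slot];
}
```

Either way the caller gets one umem pointer back, which is why the driver error paths can simply return without releasing it.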
* [PATCH rdma-next v2 07/15] RDMA/mlx4: Use umem_list for user CQ buffer
From: Jiri Pirko <jiri@nvidia.com>
Use ib_umem_list_load() and ib_umem_list_replace() to obtain and update the CQ buffer umem instead of accessing ibcq->umem directly.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
- rebase on top of Leon's fix
---
drivers/infiniband/hw/mlx4/cq.c | 40 ++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 15 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 7a6eb602d4a6..f6ef85cc37a1 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -152,6 +152,7 @@ int mlx4_ib_create_user_cq(struct ib_cq *ibcq,
int shift;
int n;
int err;
+ struct ib_umem *umem;
struct mlx4_ib_ucontext *context = rdma_udata_to_drv_context(
udata, struct mlx4_ib_ucontext, ibucontext);
@@ -172,22 +173,30 @@ int mlx4_ib_create_user_cq(struct ib_cq *ibcq,
if (err)
goto err_cq;
- if (ibcq->umem &&
- (dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SW_CQ_INIT))
- return -EOPNOTSUPP;
-
- buf_addr = (void *)(unsigned long)ucmd.buf_addr;
-
- if (!ibcq->umem)
- ibcq->umem = ib_umem_get(&dev->ib_dev, ucmd.buf_addr,
- entries * cqe_size,
- IB_ACCESS_LOCAL_WRITE);
- if (IS_ERR(ibcq->umem)) {
- err = PTR_ERR(ibcq->umem);
+ umem = ib_umem_list_load(ibcq->umem_list, UVERBS_BUF_CQ_BUF,
+ entries * cqe_size);
+ if (IS_ERR(umem)) {
+ err = PTR_ERR(umem);
goto err_cq;
}
+ if (umem) {
+ if (dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SW_CQ_INIT)
+ return -EOPNOTSUPP;
+ } else {
+ umem = ib_umem_get(&dev->ib_dev, ucmd.buf_addr,
+ entries * cqe_size,
+ IB_ACCESS_LOCAL_WRITE);
+ if (IS_ERR(umem)) {
+ err = PTR_ERR(umem);
+ goto err_cq;
+ }
+ ib_umem_list_replace(ibcq->umem_list, UVERBS_BUF_CQ_BUF,
+ umem);
+ }
+
+ buf_addr = (void *)(unsigned long)ucmd.buf_addr;
- shift = mlx4_ib_umem_calc_optimal_mtt_size(cq->ibcq.umem, 0, &n);
+ shift = mlx4_ib_umem_calc_optimal_mtt_size(umem, 0, &n);
if (shift < 0) {
err = shift;
goto err_cq;
@@ -197,7 +206,7 @@ int mlx4_ib_create_user_cq(struct ib_cq *ibcq,
if (err)
goto err_cq;
- err = mlx4_ib_umem_write_mtt(dev, &cq->buf.mtt, cq->ibcq.umem);
+ err = mlx4_ib_umem_write_mtt(dev, &cq->buf.mtt, umem);
if (err)
goto err_mtt;
@@ -471,7 +480,8 @@ int mlx4_ib_resize_cq(struct ib_cq *ibcq, unsigned int entries,
if (ibcq->uobject) {
cq->buf = cq->resize_buf->buf;
cq->ibcq.cqe = cq->resize_buf->cqe;
- ib_umem_release(cq->ibcq.umem);
+ ib_umem_list_replace(ibcq->umem_list, UVERBS_BUF_CQ_BUF,
+ cq->resize_umem);
cq->ibcq.umem = cq->resize_umem;
kfree(cq->resize_buf);
--
2.53.0
* [PATCH rdma-next v2 08/15] RDMA/uverbs: Remove legacy umem field from struct ib_cq
From: Jiri Pirko <jiri@nvidia.com>
Now that all drivers use umem_list for CQ buffer management, the legacy umem field in struct ib_cq is no longer needed. Remove it along with the associated ib_umem_release_non_listed() calls in the error and destroy paths, as buffer lifetime is now fully managed through ib_umem_list_release().
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/core/uverbs_cmd.c | 1 -
drivers/infiniband/core/uverbs_std_types_cq.c | 9 ---------
drivers/infiniband/core/verbs.c | 5 ++---
drivers/infiniband/hw/bnxt_re/ib_verbs.c | 1 -
drivers/infiniband/hw/mlx4/cq.c | 1 -
drivers/infiniband/hw/mlx5/cq.c | 1 -
include/rdma/ib_verbs.h | 2 --
7 files changed, 2 insertions(+), 18 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 77874834108b..60fafa1fb7b4 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1088,7 +1088,6 @@ static int create_cq(struct uverbs_attr_bundle *attrs,
return uverbs_response(attrs, &resp, sizeof(resp));
err_free:
- ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_put(&cq->res);
kfree(cq);
err_list_release:
diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c b/drivers/infiniband/core/uverbs_std_types_cq.c
index f87cd11470fc..c165ff5446f6 100644
--- a/drivers/infiniband/core/uverbs_std_types_cq.c
+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
@@ -209,11 +209,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
cq->comp_handler = ib_uverbs_comp_handler;
cq->event_handler = ib_uverbs_cq_event_handler;
cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
- /*
- * If UMEM is not provided here, legacy drivers will set it during
- * CQ creation based on their internal udata.
- */
- cq->umem = umem;
cq->umem_list = umem_list;
atomic_set(&cq->usecnt, 0);
@@ -227,9 +222,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
if (ret)
goto err_free;
- /* Check that driver didn't overrun existing umem */
- WARN_ON(umem && cq->umem != umem);
-
obj->uevent.uobject.object = cq;
obj->uevent.uobject.user_handle = user_handle;
rdma_restrack_add(&cq->res);
@@ -240,7 +232,6 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
return ret;
err_free:
- ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_put(&cq->res);
kfree(cq);
err_umem_list:
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index ed163fc56ef8..35700bad8310 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2224,9 +2224,9 @@ struct ib_cq *__ib_create_cq(struct ib_device *device,
}
/*
* We are in kernel verbs flow and drivers are not allowed
- * to set umem or umem_list pointers, they need to stay NULL.
+ * to set umem_list pointer, it needs to stay NULL.
*/
- WARN_ON_ONCE(cq->umem || cq->umem_list);
+ WARN_ON_ONCE(cq->umem_list);
rdma_restrack_add(&cq->res);
return cq;
@@ -2259,7 +2259,6 @@ int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
if (ret)
return ret;
- ib_umem_release_non_listed(umem_list, UVERBS_BUF_CQ_BUF, cq->umem);
rdma_restrack_del(&cq->res);
kfree(cq);
ib_umem_list_release(umem_list);
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 5c6fc81fad6a..e63780c78781 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -3518,7 +3518,6 @@ static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq)
if (cq->resize_umem) {
ib_umem_list_replace(cq->ib_cq.umem_list, UVERBS_BUF_CQ_BUF,
cq->resize_umem);
- cq->ib_cq.umem = cq->resize_umem;
cq->qplib_cq.sg_info.umem = cq->resize_umem;
cq->resize_umem = NULL;
cq->resize_cqe = 0;
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index f6ef85cc37a1..3217c5faf0d5 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -482,7 +482,6 @@ int mlx4_ib_resize_cq(struct ib_cq *ibcq, unsigned int entries,
cq->ibcq.cqe = cq->resize_buf->cqe;
ib_umem_list_replace(ibcq->umem_list, UVERBS_BUF_CQ_BUF,
cq->resize_umem);
- cq->ibcq.umem = cq->resize_umem;
kfree(cq->resize_buf);
cq->resize_buf = NULL;
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index bb9ed7caec67..6118deb5e6dc 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -1432,7 +1432,6 @@ int mlx5_ib_resize_cq(struct ib_cq *ibcq, unsigned int entries,
cq->ibcq.cqe = entries - 1;
ib_umem_list_replace(cq->ibcq.umem_list, UVERBS_BUF_CQ_BUF,
cq->resize_umem);
- cq->ibcq.umem = cq->resize_umem;
cq->resize_umem = NULL;
} else {
struct mlx5_ib_cq_buf tbuf;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index dd6c0d68497d..cf7fa69415a1 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1738,8 +1738,6 @@ struct ib_cq {
u8 interrupt:1;
u8 shared:1;
unsigned int comp_vector;
- struct ib_umem *umem;
-
struct ib_umem_list *umem_list;
/*
--
2.53.0
* [PATCH rdma-next v2 09/15] RDMA/uverbs: Verify all umem_list buffers are consumed after CQ creation
From: Jiri Pirko <jiri@nvidia.com>
After the driver creates the CQ, verify that all user-provided
umem buffers were actually consumed by the driver. This rejects
requests where userspace provides buffers that the driver does
not support.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/core/uverbs_std_types_cq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/infiniband/core/uverbs_std_types_cq.c b/drivers/infiniband/core/uverbs_std_types_cq.c
index c165ff5446f6..d3176032d0ac 100644
--- a/drivers/infiniband/core/uverbs_std_types_cq.c
+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
@@ -222,6 +222,10 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
if (ret)
goto err_free;
+ ret = ib_umem_list_check_consumed(umem_list);
+ if (ret)
+ goto err_destroy_cq;
+
obj->uevent.uobject.object = cq;
obj->uevent.uobject.user_handle = user_handle;
rdma_restrack_add(&cq->res);
@@ -231,6 +235,8 @@ static int UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE)(
sizeof(cq->cqe));
return ret;
+err_destroy_cq:
+ ib_dev->ops.destroy_cq(cq, &attrs->driver_udata);
err_free:
rdma_restrack_put(&cq->res);
kfree(cq);
--
2.53.0
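The consumption check added above presumably walks every descriptor slot and fails if userspace supplied a buffer that the driver never claimed, so unsupported buffer types are rejected rather than silently ignored. A small sketch of that assumed logic (the descriptor layout and error code are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

#define EOPNOTSUPP 95

struct demo_buf_desc {
	bool provided;	/* userspace passed a buffer for this slot */
	bool consumed;	/* driver claimed it during object creation */
};

/* Assumed semantic of ib_umem_list_check_consumed(): every provided
 * buffer must have been consumed by the driver. */
static int demo_check_consumed(const struct demo_buf_desc *descs, int count)
{
	int i;

	for (i = 0; i < count; i++)
		if (descs[i].provided && !descs[i].consumed)
			return -EOPNOTSUPP;
	return 0;
}
```

Because the check runs only after the driver's create callback succeeds, the failure path has to destroy the freshly created CQ, as the new err_destroy_cq label does.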
* [PATCH rdma-next v2 10/15] RDMA/uverbs: Integrate umem_list into QP creation
From: Jiri Pirko <jiri@nvidia.com>
Wire up the generic buffer descriptor infrastructure to the QP create command. Add a umem_list field to struct ib_qp and define the QP buffer slot enum.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2: Fix umem_list double free
---
drivers/infiniband/core/core_priv.h | 1 +
drivers/infiniband/core/uverbs_cmd.c | 4 ++--
drivers/infiniband/core/uverbs_std_types_qp.c | 22 ++++++++++++++++---
drivers/infiniband/core/verbs.c | 19 +++++++++++++---
include/rdma/ib_verbs.h | 3 +++
include/uapi/rdma/ib_user_ioctl_cmds.h | 8 +++++++
6 files changed, 49 insertions(+), 8 deletions(-)
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index a2c36666e6fc..3f7b0803f186 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -321,6 +321,7 @@ void nldev_exit(void);
struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd,
struct ib_qp_init_attr *attr,
+ struct ib_umem_list *umem_list,
struct ib_udata *udata,
struct ib_uqp_object *uobj, const char *caller);
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 60fafa1fb7b4..ce482ed047b0 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -1467,8 +1467,8 @@ static int create_qp(struct uverbs_attr_bundle *attrs,
attr.source_qpn = cmd->source_qpn;
}
- qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata, obj,
- KBUILD_MODNAME);
+ qp = ib_create_qp_user(device, pd, &attr, NULL,
+ &attrs->driver_udata, obj, KBUILD_MODNAME);
if (IS_ERR(qp)) {
ret = PTR_ERR(qp);
goto err_put;
diff --git a/drivers/infiniband/core/uverbs_std_types_qp.c b/drivers/infiniband/core/uverbs_std_types_qp.c
index be0730e8509e..5d76bfac6544 100644
--- a/drivers/infiniband/core/uverbs_std_types_qp.c
+++ b/drivers/infiniband/core/uverbs_std_types_qp.c
@@ -4,6 +4,7 @@
*/
#include <rdma/uverbs_std_types.h>
+#include <rdma/ib_umem.h>
#include "rdma_core.h"
#include "uverbs.h"
#include "core_priv.h"
@@ -96,6 +97,7 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
struct ib_xrcd *xrcd = NULL;
struct ib_uobject *xrcd_uobj = NULL;
struct ib_device *device;
+ struct ib_umem_list *umem_list;
u64 user_handle;
int ret;
@@ -248,14 +250,24 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
set_caps(&attr, &cap, true);
mutex_init(&obj->mcast_lock);
- qp = ib_create_qp_user(device, pd, &attr, &attrs->driver_udata, obj,
- KBUILD_MODNAME);
+ umem_list = ib_umem_list_create(device, attrs, UVERBS_BUF_QP_MAX);
+ if (IS_ERR(umem_list)) {
+ ret = PTR_ERR(umem_list);
+ goto err_put;
+ }
+
+ qp = ib_create_qp_user(device, pd, &attr, umem_list,
+ &attrs->driver_udata, obj, KBUILD_MODNAME);
if (IS_ERR(qp)) {
ret = PTR_ERR(qp);
goto err_put;
}
ib_qp_usecnt_inc(qp);
+ ret = ib_umem_list_check_consumed(umem_list);
+ if (ret)
+ goto err_destroy_qp;
+
if (attr.qp_type == IB_QPT_XRC_TGT) {
obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object,
uobject);
@@ -277,6 +289,9 @@ static int UVERBS_HANDLER(UVERBS_METHOD_QP_CREATE)(
sizeof(qp->qp_num));
return ret;
+
+err_destroy_qp:
+ ib_destroy_qp_user(qp, &attrs->driver_udata);
err_put:
if (obj->uevent.event_file)
uverbs_uobject_put(&obj->uevent.event_file->uobj);
@@ -340,7 +355,8 @@ DECLARE_UVERBS_NAMED_METHOD(
UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
UVERBS_ATTR_TYPE(u32),
UA_MANDATORY),
- UVERBS_ATTR_UHW());
+ UVERBS_ATTR_UHW(),
+ UVERBS_ATTR_BUFFERS());
static int UVERBS_HANDLER(UVERBS_METHOD_QP_DESTROY)(
struct uverbs_attr_bundle *attrs)
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 35700bad8310..0fe6cb1a9f07 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1266,6 +1266,7 @@ static struct ib_qp *create_xrc_qp_user(struct ib_qp *qp,
static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
struct ib_qp_init_attr *attr,
+ struct ib_umem_list *umem_list,
struct ib_udata *udata,
struct ib_uqp_object *uobj, const char *caller)
{
@@ -1292,6 +1293,7 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
qp->registered_event_handler = attr->event_handler;
qp->port = attr->port_num;
qp->qp_context = attr->qp_context;
+ qp->umem_list = umem_list;
spin_lock_init(&qp->mr_lock);
INIT_LIST_HEAD(&qp->rdma_mrs);
@@ -1326,6 +1328,7 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
qp->device->ops.destroy_qp(qp, udata ? &dummy : NULL);
err_create:
rdma_restrack_put(&qp->res);
+ ib_umem_list_release(qp->umem_list);
kfree(qp);
return ERR_PTR(ret);
@@ -1339,21 +1342,23 @@ static struct ib_qp *create_qp(struct ib_device *dev, struct ib_pd *pd,
* @attr: A list of initial attributes required to create the
* QP. If QP creation succeeds, then the attributes are updated to
* the actual capabilities of the created QP.
+ * @umem_list: pre-mapped dma-buf umem list, or NULL
* @udata: User data
* @uobj: uverbs obect
* @caller: caller's build-time module name
*/
struct ib_qp *ib_create_qp_user(struct ib_device *dev, struct ib_pd *pd,
struct ib_qp_init_attr *attr,
+ struct ib_umem_list *umem_list,
struct ib_udata *udata,
struct ib_uqp_object *uobj, const char *caller)
{
struct ib_qp *qp, *xrc_qp;
if (attr->qp_type == IB_QPT_XRC_TGT)
- qp = create_qp(dev, pd, attr, NULL, NULL, caller);
+ qp = create_qp(dev, pd, attr, umem_list, NULL, NULL, caller);
else
- qp = create_qp(dev, pd, attr, udata, uobj, NULL);
+ qp = create_qp(dev, pd, attr, umem_list, udata, uobj, NULL);
if (attr->qp_type != IB_QPT_XRC_TGT || IS_ERR(qp))
return qp;
@@ -1415,10 +1420,16 @@ struct ib_qp *ib_create_qp_kernel(struct ib_pd *pd,
if (qp_init_attr->cap.max_rdma_ctxs)
rdma_rw_init_qp(device, qp_init_attr);
- qp = create_qp(device, pd, qp_init_attr, NULL, NULL, caller);
+ qp = create_qp(device, pd, qp_init_attr, NULL, NULL, NULL, caller);
if (IS_ERR(qp))
return qp;
+ /*
+ * We are in kernel verbs flow and drivers are not allowed
+ * to set umem_list pointer, it needs to stay NULL.
+ */
+ WARN_ON_ONCE(qp->umem_list);
+
ib_qp_usecnt_inc(qp);
if (qp_init_attr->cap.max_rdma_ctxs) {
@@ -2147,6 +2158,7 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
{
const struct ib_gid_attr *alt_path_sgid_attr = qp->alt_path_sgid_attr;
const struct ib_gid_attr *av_sgid_attr = qp->av_sgid_attr;
+ struct ib_umem_list *umem_list = qp->umem_list;
struct ib_qp_security *sec;
int ret;
@@ -2184,6 +2196,7 @@ int ib_destroy_qp_user(struct ib_qp *qp, struct ib_udata *udata)
rdma_restrack_del(&qp->res);
kfree(qp);
+ ib_umem_list_release(umem_list);
return ret;
}
EXPORT_SYMBOL(ib_destroy_qp_user);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cf7fa69415a1..d78f62611a7e 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1524,6 +1524,7 @@ enum ib_mr_rereg_flags {
};
struct ib_umem;
+struct ib_umem_list;
enum rdma_remove_reason {
/*
@@ -1944,6 +1945,8 @@ struct ib_qp {
/* The counter the qp is bind to */
struct rdma_counter *counter;
+
+ struct ib_umem_list *umem_list;
};
struct ib_dm {
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 375e4e224f6a..9c5d3f989977 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -167,6 +167,14 @@ enum uverbs_attrs_create_qp_cmd_attr_ids {
UVERBS_ATTR_CREATE_QP_RESP_QP_NUM,
};
+enum uverbs_buf_qp_slots {
+ UVERBS_BUF_QP_BUF,
+ UVERBS_BUF_QP_RQ_BUF,
+ UVERBS_BUF_QP_SQ_BUF,
+ __UVERBS_BUF_QP_MAX,
+ UVERBS_BUF_QP_MAX = __UVERBS_BUF_QP_MAX - 1,
+};
+
enum uverbs_attrs_destroy_qp_cmd_attr_ids {
UVERBS_ATTR_DESTROY_QP_HANDLE,
UVERBS_ATTR_DESTROY_QP_RESP,
--
2.53.0
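The QP handler above establishes an ordering that also explains the v2 double-free fix: the umem_list is created before ib_create_qp_user(), its ownership passes to the QP (so the destroy path releases it exactly once), and a failed consumption check unwinds through the normal QP destroy path. A compact standalone sketch of that control flow under those assumptions (all `demo_` names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define EOPNOTSUPP 95

struct demo_list { int releases; bool consumed_ok; };
struct demo_qp { struct demo_list *list; };

static void demo_list_release(struct demo_list *list)
{
	if (list)
		list->releases++;
}

static int demo_destroy_qp(struct demo_qp *qp)
{
	struct demo_list *list = qp->list;

	free(qp);
	demo_list_release(list);	/* released exactly once, by the QP */
	return 0;
}

/* Handler-style flow: list ownership moves into the QP on creation, so
 * every later failure unwinds through demo_destroy_qp(). */
static int demo_create_qp_handler(struct demo_list *list,
				  struct demo_qp **out)
{
	struct demo_qp *qp = malloc(sizeof(*qp));

	qp->list = list;
	if (!list->consumed_ok) {	/* stand-in for check_consumed() */
		demo_destroy_qp(qp);
		return -EOPNOTSUPP;
	}
	*out = qp;
	return 0;
}
```

Since the QP owns the list after create_qp() succeeds, neither the handler's error path nor the caller may release it again.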
* [PATCH rdma-next v2 11/15] RDMA/mlx5: Use umem_list for QP buffers in create_qp
From: Jiri Pirko <jiri@nvidia.com>
Load the QP and SQ buffer umems from the umem_list, falling back to ib_umem_get() for the legacy path. Use ib_umem_release_non_listed() in the error and destroy paths so that umems owned by the list are not released twice.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/mlx5/qp.c | 70 +++++++++++++++++++++++----------
1 file changed, 49 insertions(+), 21 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 8f50e7342a76..ba5b41fa5ef9 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -938,6 +938,14 @@ static int adjust_bfregn(struct mlx5_ib_dev *dev,
bfregn % MLX5_NON_FP_BFREGS_PER_UAR;
}
+static unsigned int mlx5_qp_buf_slot(struct mlx5_ib_qp *qp)
+{
+ if (qp->type == IB_QPT_RAW_PACKET ||
+ qp->flags & IB_QP_CREATE_SOURCE_QPN)
+ return UVERBS_BUF_QP_RQ_BUF;
+ return UVERBS_BUF_QP_BUF;
+}
+
static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
struct mlx5_ib_qp *qp, struct ib_udata *udata,
struct ib_qp_init_attr *attr, u32 **in,
@@ -998,14 +1006,26 @@ static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
if (err)
goto err_bfreg;
- if (ucmd->buf_addr && ubuffer->buf_size) {
- ubuffer->buf_addr = ucmd->buf_addr;
- ubuffer->umem = ib_umem_get(&dev->ib_dev, ubuffer->buf_addr,
- ubuffer->buf_size, 0);
+ ubuffer->umem = NULL;
+ if (ubuffer->buf_size) {
+ ubuffer->umem = ib_umem_list_load(qp->ibqp.umem_list, mlx5_qp_buf_slot(qp),
+ ubuffer->buf_size);
if (IS_ERR(ubuffer->umem)) {
err = PTR_ERR(ubuffer->umem);
goto err_bfreg;
+ } else if (!ubuffer->umem && ucmd->buf_addr) {
+ ubuffer->buf_addr = ucmd->buf_addr;
+ ubuffer->umem = ib_umem_get(&dev->ib_dev, ubuffer->buf_addr,
+ ubuffer->buf_size, 0);
+ if (IS_ERR(ubuffer->umem)) {
+ err = PTR_ERR(ubuffer->umem);
+ goto err_bfreg;
+ }
+ ib_umem_list_replace(qp->ibqp.umem_list, mlx5_qp_buf_slot(qp),
+ ubuffer->umem);
}
+ }
+ if (ubuffer->umem) {
page_size = mlx5_umem_find_best_quantized_pgoff(
ubuffer->umem, qpc, log_page_size,
MLX5_ADAPTER_PAGE_SHIFT, page_offset, 64,
@@ -1015,8 +1035,6 @@ static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
goto err_umem;
}
ncont = ib_umem_num_dma_blocks(ubuffer->umem, page_size);
- } else {
- ubuffer->umem = NULL;
}
*inlen = MLX5_ST_SZ_BYTES(create_qp_in) +
@@ -1056,7 +1074,8 @@ static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
kvfree(*in);
err_umem:
- ib_umem_release(ubuffer->umem);
+ ib_umem_release_non_listed(qp->ibqp.umem_list, mlx5_qp_buf_slot(qp),
+ ubuffer->umem);
err_bfreg:
if (bfregn != MLX5_IB_INVALID_BFREG)
@@ -1073,7 +1092,8 @@ static void destroy_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
if (udata) {
/* User QP */
mlx5_ib_db_unmap_user(context, &qp->db);
- ib_umem_release(base->ubuffer.umem);
+ ib_umem_release_non_listed(qp->ibqp.umem_list, mlx5_qp_buf_slot(qp),
+ base->ubuffer.umem);
/*
* Free only the BFREGs which are handled by the kernel.
@@ -1334,7 +1354,8 @@ static int get_qp_ts_format(struct mlx5_ib_dev *dev, struct mlx5_ib_cq *send_cq,
static int create_raw_packet_qp_sq(struct mlx5_ib_dev *dev,
struct ib_udata *udata,
struct mlx5_ib_sq *sq, void *qpin,
- struct ib_pd *pd, struct mlx5_ib_cq *cq)
+ struct ib_pd *pd, struct mlx5_ib_cq *cq,
+ struct ib_umem_list *umem_list)
{
struct mlx5_ib_ubuffer *ubuffer = &sq->ubuffer;
__be64 *pas;
@@ -1352,10 +1373,11 @@ static int create_raw_packet_qp_sq(struct mlx5_ib_dev *dev,
if (ts_format < 0)
return ts_format;
- sq->ubuffer.umem = ib_umem_get(&dev->ib_dev, ubuffer->buf_addr,
- ubuffer->buf_size, 0);
- if (IS_ERR(sq->ubuffer.umem))
- return PTR_ERR(sq->ubuffer.umem);
+ ubuffer->umem = ib_umem_list_load_or_get(umem_list, UVERBS_BUF_QP_SQ_BUF,
+ &dev->ib_dev, ubuffer->buf_addr,
+ ubuffer->buf_size, 0);
+ if (IS_ERR(ubuffer->umem))
+ return PTR_ERR(ubuffer->umem);
page_size = mlx5_umem_find_best_quantized_pgoff(
ubuffer->umem, wq, log_wq_pg_sz, MLX5_ADAPTER_PAGE_SHIFT,
page_offset, 64, &page_offset_quantized);
@@ -1412,18 +1434,21 @@ static int create_raw_packet_qp_sq(struct mlx5_ib_dev *dev,
return 0;
err_umem:
- ib_umem_release(sq->ubuffer.umem);
+ ib_umem_release_non_listed(umem_list, UVERBS_BUF_QP_SQ_BUF,
+ sq->ubuffer.umem);
sq->ubuffer.umem = NULL;
return err;
}
static void destroy_raw_packet_qp_sq(struct mlx5_ib_dev *dev,
- struct mlx5_ib_sq *sq)
+ struct mlx5_ib_sq *sq,
+ struct ib_umem_list *umem_list)
{
destroy_flow_rule_vport_sq(sq);
mlx5_core_destroy_sq_tracked(dev, &sq->base.mqp);
- ib_umem_release(sq->ubuffer.umem);
+ ib_umem_release_non_listed(umem_list, UVERBS_BUF_QP_SQ_BUF,
+ sq->ubuffer.umem);
}
static int create_raw_packet_qp_rq(struct mlx5_ib_dev *dev,
@@ -1567,7 +1592,8 @@ static int create_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
u32 *in, size_t inlen, struct ib_pd *pd,
struct ib_udata *udata,
struct mlx5_ib_create_qp_resp *resp,
- struct ib_qp_init_attr *init_attr)
+ struct ib_qp_init_attr *init_attr,
+ struct ib_umem_list *umem_list)
{
struct mlx5_ib_raw_packet_qp *raw_packet_qp = &qp->raw_packet_qp;
struct mlx5_ib_sq *sq = &raw_packet_qp->sq;
@@ -1587,7 +1613,8 @@ static int create_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
return err;
err = create_raw_packet_qp_sq(dev, udata, sq, in, pd,
- to_mcq(init_attr->send_cq));
+ to_mcq(init_attr->send_cq),
+ umem_list);
if (err)
goto err_destroy_tis;
@@ -1651,7 +1678,7 @@ static int create_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
err_destroy_sq:
if (!qp->sq.wqe_cnt)
return err;
- destroy_raw_packet_qp_sq(dev, sq);
+ destroy_raw_packet_qp_sq(dev, sq, umem_list);
err_destroy_tis:
destroy_raw_packet_qp_tis(dev, sq, pd);
@@ -1671,7 +1698,7 @@ static void destroy_raw_packet_qp(struct mlx5_ib_dev *dev,
}
if (qp->sq.wqe_cnt) {
- destroy_raw_packet_qp_sq(dev, sq);
+ destroy_raw_packet_qp_sq(dev, sq, qp->ibqp.umem_list);
destroy_raw_packet_qp_tis(dev, sq, qp->ibqp.pd);
}
}
@@ -2393,7 +2420,8 @@ static int create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
qp->raw_packet_qp.sq.ubuffer.buf_addr = ucmd->sq_buf_addr;
raw_packet_qp_copy_info(qp, &qp->raw_packet_qp);
err = create_raw_packet_qp(dev, qp, in, inlen, pd, udata,
&params->resp, init_attr);
&params->resp, init_attr,
+ qp->ibqp.umem_list);
} else
err = mlx5_qpc_create_qp(dev, &base->mqp, in, inlen, out);
--
2.53.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH rdma-next v2 12/15] RDMA/uverbs: Add doorbell record buffer slot to CQ umem_list
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
` (10 preceding siblings ...)
2026-04-11 14:49 ` [PATCH rdma-next v2 11/15] RDMA/mlx5: Use umem_list for QP buffers in create_qp Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 13/15] RDMA/mlx5: Use umem_list for CQ doorbell record Jiri Pirko
` (2 subsequent siblings)
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Extend the CQ buffer slot enum with UVERBS_BUF_CQ_DBR, allowing
userspace to provide doorbell record memory via the generic buffer
descriptor infrastructure.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 9c5d3f989977..26c2e3b2125a 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -122,6 +122,7 @@ enum uverbs_attrs_create_cq_cmd_attr_ids {
enum uverbs_buf_cq_slots {
UVERBS_BUF_CQ_BUF,
+ UVERBS_BUF_CQ_DBR,
__UVERBS_BUF_CQ_MAX,
UVERBS_BUF_CQ_MAX = __UVERBS_BUF_CQ_MAX - 1,
};
--
2.53.0
* [PATCH rdma-next v2 13/15] RDMA/mlx5: Use umem_list for CQ doorbell record
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
` (11 preceding siblings ...)
2026-04-11 14:49 ` [PATCH rdma-next v2 12/15] RDMA/uverbs: Add doorbell record buffer slot to CQ umem_list Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 14/15] RDMA/uverbs: Add doorbell record buffer slot to QP umem_list Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 15/15] RDMA/mlx5: Use umem_list for QP doorbell record Jiri Pirko
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Load the doorbell record umem from the umem_list, falling back to
ib_umem_get() for the legacy path. Pass the umem_list and a
per-command slot index through the doorbell mapping infrastructure.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/mlx5/cq.c | 4 ++-
drivers/infiniband/hw/mlx5/doorbell.c | 41 +++++++++++++++++++++++----
drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 +-
drivers/infiniband/hw/mlx5/qp.c | 4 +--
drivers/infiniband/hw/mlx5/srq.c | 2 +-
5 files changed, 43 insertions(+), 11 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 6118deb5e6dc..ef36417a3c65 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -759,7 +759,9 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
if (!page_size)
return -EINVAL;
- err = mlx5_ib_db_map_user(context, ucmd.db_addr, &cq->db);
+ err = mlx5_ib_db_map_user(context, ucmd.db_addr,
+ cq->ibcq.umem_list, UVERBS_BUF_CQ_DBR,
+ sizeof(__be32) * 2, &cq->db);
if (err)
return err;
diff --git a/drivers/infiniband/hw/mlx5/doorbell.c b/drivers/infiniband/hw/mlx5/doorbell.c
index bd68fcf011f4..a1c5851aba10 100644
--- a/drivers/infiniband/hw/mlx5/doorbell.c
+++ b/drivers/infiniband/hw/mlx5/doorbell.c
@@ -40,25 +40,51 @@
struct mlx5_ib_user_db_page {
struct list_head list;
struct ib_umem *umem;
+ struct ib_umem_list *umem_list;
+ unsigned int dbr_index;
unsigned long user_virt;
int refcnt;
struct mm_struct *mm;
};
int mlx5_ib_db_map_user(struct mlx5_ib_ucontext *context, unsigned long virt,
- struct mlx5_db *db)
+ struct ib_umem_list *umem_list, unsigned int dbr_index,
+ size_t dbr_size, struct mlx5_db *db)
{
+ unsigned long dma_offset;
struct mlx5_ib_user_db_page *page;
+ struct ib_umem *umem;
int err = 0;
mutex_lock(&context->db_page_mutex);
+ umem = ib_umem_list_load(umem_list, dbr_index, dbr_size);
+ if (IS_ERR(umem)) {
+ err = PTR_ERR(umem);
+ goto out;
+ } else if (umem) {
+ /* External umem path - no page sharing */
+ page = kzalloc_obj(*page);
+ if (!page) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ page->umem = umem;
+ page->umem_list = umem_list;
+ page->dbr_index = dbr_index;
+ dma_offset = ib_umem_offset(umem);
+ goto add_page;
+ }
+
+ dma_offset = virt & ~PAGE_MASK;
+
list_for_each_entry(page, &context->db_page_list, list)
if ((current->mm == page->mm) &&
(page->user_virt == (virt & PAGE_MASK)))
goto found;
- page = kmalloc_obj(*page);
+ page = kzalloc_obj(*page);
if (!page) {
err = -ENOMEM;
goto out;
@@ -76,11 +102,11 @@ int mlx5_ib_db_map_user(struct mlx5_ib_ucontext *context, unsigned long virt,
mmgrab(current->mm);
page->mm = current->mm;
+add_page:
list_add(&page->list, &context->db_page_list);
found:
- db->dma = sg_dma_address(page->umem->sgt_append.sgt.sgl) +
- (virt & ~PAGE_MASK);
+ db->dma = sg_dma_address(page->umem->sgt_append.sgt.sgl) + dma_offset;
db->u.user_page = page;
++page->refcnt;
@@ -96,8 +122,11 @@ void mlx5_ib_db_unmap_user(struct mlx5_ib_ucontext *context, struct mlx5_db *db)
if (!--db->u.user_page->refcnt) {
list_del(&db->u.user_page->list);
- mmdrop(db->u.user_page->mm);
- ib_umem_release(db->u.user_page->umem);
+ if (db->u.user_page->mm)
+ mmdrop(db->u.user_page->mm);
+ ib_umem_release_non_listed(db->u.user_page->umem_list,
+ db->u.user_page->dbr_index,
+ db->u.user_page->umem);
kfree(db->u.user_page);
}
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 94d1e4f83679..f68f8466e60a 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1261,7 +1261,8 @@ to_mmmap(struct rdma_user_mmap_entry *rdma_entry)
int mlx5_ib_dev_res_cq_init(struct mlx5_ib_dev *dev);
int mlx5_ib_dev_res_srq_init(struct mlx5_ib_dev *dev);
int mlx5_ib_db_map_user(struct mlx5_ib_ucontext *context, unsigned long virt,
- struct mlx5_db *db);
+ struct ib_umem_list *umem_list, unsigned int dbr_index,
+ size_t dbr_size, struct mlx5_db *db);
void mlx5_ib_db_unmap_user(struct mlx5_ib_ucontext *context, struct mlx5_db *db);
void __mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 qpn, struct mlx5_ib_srq *srq);
void mlx5_ib_cq_clean(struct mlx5_ib_cq *cq, u32 qpn, struct mlx5_ib_srq *srq);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index ba5b41fa5ef9..3edfe44f911a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -918,7 +918,7 @@ static int create_user_rq(struct mlx5_ib_dev *dev, struct ib_pd *pd,
ib_umem_num_pages(rwq->umem), page_size, rwq->rq_num_pas,
offset);
- err = mlx5_ib_db_map_user(ucontext, ucmd->db_addr, &rwq->db);
+ err = mlx5_ib_db_map_user(ucontext, ucmd->db_addr, NULL, 0, 0, &rwq->db);
if (err) {
mlx5_ib_dbg(dev, "map failed\n");
goto err_umem;
@@ -1062,7 +1062,7 @@ static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
resp->bfreg_index = MLX5_IB_INVALID_BFREG;
qp->bfregn = bfregn;
- err = mlx5_ib_db_map_user(context, ucmd->db_addr, &qp->db);
+ err = mlx5_ib_db_map_user(context, ucmd->db_addr, NULL, 0, 0, &qp->db);
if (err) {
mlx5_ib_dbg(dev, "map failed\n");
goto err_free;
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 852f6f502d14..d4dbbd5a500f 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -74,7 +74,7 @@ static int create_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
}
in->umem = srq->umem;
- err = mlx5_ib_db_map_user(ucontext, ucmd.db_addr, &srq->db);
+ err = mlx5_ib_db_map_user(ucontext, ucmd.db_addr, NULL, 0, 0, &srq->db);
if (err) {
mlx5_ib_dbg(dev, "map doorbell failed\n");
goto err_umem;
--
2.53.0
* [PATCH rdma-next v2 14/15] RDMA/uverbs: Add doorbell record buffer slot to QP umem_list
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
` (12 preceding siblings ...)
2026-04-11 14:49 ` [PATCH rdma-next v2 13/15] RDMA/mlx5: Use umem_list for CQ doorbell record Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 15/15] RDMA/mlx5: Use umem_list for QP doorbell record Jiri Pirko
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Extend the QP buffer slot enum with UVERBS_BUF_QP_DBR_BUF, allowing
userspace to provide doorbell record memory via the generic buffer
descriptor infrastructure.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
index 26c2e3b2125a..1a47942ca1a6 100644
--- a/include/uapi/rdma/ib_user_ioctl_cmds.h
+++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
@@ -172,6 +172,7 @@ enum uverbs_buf_qp_slots {
UVERBS_BUF_QP_BUF,
UVERBS_BUF_QP_RQ_BUF,
UVERBS_BUF_QP_SQ_BUF,
+ UVERBS_BUF_QP_DBR_BUF,
__UVERBS_BUF_QP_MAX,
UVERBS_BUF_QP_MAX = __UVERBS_BUF_QP_MAX - 1,
};
--
2.53.0
* [PATCH rdma-next v2 15/15] RDMA/mlx5: Use umem_list for QP doorbell record
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
` (13 preceding siblings ...)
2026-04-11 14:49 ` [PATCH rdma-next v2 14/15] RDMA/uverbs: Add doorbell record buffer slot to QP umem_list Jiri Pirko
@ 2026-04-11 14:49 ` Jiri Pirko
14 siblings, 0 replies; 17+ messages in thread
From: Jiri Pirko @ 2026-04-11 14:49 UTC (permalink / raw)
To: linux-rdma
Cc: jgg, leon, mrgolin, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
From: Jiri Pirko <jiri@nvidia.com>
Pass the QP umem_list to the doorbell mapping infrastructure for
QP creation.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
---
drivers/infiniband/hw/mlx5/qp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 3edfe44f911a..6010fbb43d7a 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -1062,7 +1062,9 @@ static int _create_user_qp(struct mlx5_ib_dev *dev, struct ib_pd *pd,
resp->bfreg_index = MLX5_IB_INVALID_BFREG;
qp->bfregn = bfregn;
- err = mlx5_ib_db_map_user(context, ucmd->db_addr, NULL, 0, 0, &qp->db);
+ err = mlx5_ib_db_map_user(context, ucmd->db_addr,
+ qp->ibqp.umem_list, UVERBS_BUF_QP_DBR_BUF,
+ sizeof(__be32) * 2, &qp->db);
if (err) {
mlx5_ib_dbg(dev, "map failed\n");
goto err_free;
--
2.53.0
* Re: [PATCH rdma-next v2 01/15] RDMA/core: Introduce generic buffer descriptor infrastructure for umem
2026-04-11 14:49 ` [PATCH rdma-next v2 01/15] RDMA/core: " Jiri Pirko
@ 2026-04-12 12:33 ` Michael Margolin
0 siblings, 0 replies; 17+ messages in thread
From: Michael Margolin @ 2026-04-12 12:33 UTC (permalink / raw)
To: Jiri Pirko
Cc: linux-rdma, jgg, leon, gal.pressman, sleybo, parav, mbloch,
yanjun.zhu, marco.crivellari, roman.gushchin, phaddad, lirongqing,
ynachum, huangjunxian6, kalesh-anakkur.purayil, ohartoov,
michaelgur, shayd, edwards, sriharsha.basavapatna,
andrew.gospodarek, selvin.xavier
On Sat, Apr 11, 2026 at 04:49:01PM +0200, Jiri Pirko wrote:
> From: Jiri Pirko <jiri@nvidia.com>
>
> Add a unified mechanism for userspace to pass memory buffers to any
> uverbs command via a single UVERBS_ATTR_BUFFERS attribute. Each
> buffer is described by struct ib_uverbs_buffer_desc with a type
> discriminator supporting dma-buf and user VA backed memory, extensible
> for future buffer types.
>
> The ib_umem_list API enables any uverbs command to accept multiple
> buffers indexed by per-command slot enums, without requiring new UAPI
> attributes for each buffer. A consumption check ensures userspace and
> driver agree on which buffers are used.
>
> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
> ---
> drivers/infiniband/core/umem.c | 248 ++++++++++++++++++++++++
> include/rdma/ib_umem.h | 54 ++++++
> include/rdma/uverbs_ioctl.h | 14 ++
> include/uapi/rdma/ib_user_ioctl_cmds.h | 1 +
> include/uapi/rdma/ib_user_ioctl_verbs.h | 27 +++
> 5 files changed, 344 insertions(+)
>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 786fa1aa8e55..f5b03e903b9d 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -37,6 +37,7 @@
> #include <linux/dma-mapping.h>
> #include <linux/sched/signal.h>
> #include <linux/sched/mm.h>
> +#include <linux/err.h>
> #include <linux/export.h>
> #include <linux/slab.h>
> #include <linux/pagemap.h>
> @@ -332,3 +333,250 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> return 0;
> }
> EXPORT_SYMBOL(ib_umem_copy_from);
> +
> +struct ib_umem_list {
> + unsigned int count; /* Total slots in the list. */
> + unsigned long provided; /* Bitmask of slots provided by the user. */
> + unsigned long loaded; /* Bitmask of slots loaded by the driver. */
> + struct ib_umem *umems[] __counted_by(count);
> +};
> +
> +/**
> + * ib_umem_list_create - Create a umem list from UVERBS_ATTR_BUFFERS
> + * @device: IB device
> + * @attrs: uverbs attribute bundle
> + * @slot_max: highest buffer slot index (count = slot_max + 1)
> + *
> + * Return: umem list, or ERR_PTR on failure.
> + */
> +struct ib_umem_list *ib_umem_list_create(struct ib_device *device,
> + const struct uverbs_attr_bundle *attrs,
> + unsigned int slot_max)
> +{
> + const struct ib_uverbs_buffer_desc *descs;
> + struct ib_umem_dmabuf *umem_dmabuf;
> + struct ib_umem_list *list;
> + struct ib_umem *umem;
> + unsigned int count;
> + int num_descs;
> + int err;
> + int i;
> +
> + if (WARN_ON_ONCE(slot_max >= BITS_PER_LONG))
> + return ERR_PTR(-EINVAL);
> + count = slot_max + 1;
> +
> + num_descs = uverbs_attr_ptr_get_array_size(
> + (struct uverbs_attr_bundle *)attrs, UVERBS_ATTR_BUFFERS,
> + sizeof(*descs));
> + if (num_descs == -ENOENT) {
> + num_descs = 0;
> + descs = NULL;
> + } else if (num_descs < 0) {
> + return ERR_PTR(num_descs);
> + } else if (num_descs > count) {
> + return ERR_PTR(-EINVAL);
> + } else {
> + descs = uverbs_attr_get_alloced_ptr(attrs, UVERBS_ATTR_BUFFERS);
> + if (IS_ERR(descs))
> + return ERR_CAST(descs);
> + }
> +
> + list = kzalloc(struct_size(list, umems, count), GFP_KERNEL);
> + if (!list)
> + return ERR_PTR(-ENOMEM);
> + list->count = count;
> +
> + for (i = 0; i < num_descs; i++) {
While I like the idea of standardizing the way we pass buffer
information to the kernel, the list thing looks like an
over-generalization to me, especially after Leon's refactoring of CQ
creation. Maybe we can instead add a buffer as a new attribute type
that can be used for multiple parameters in a command, and have a
helper based on the code below that takes an attribute id and returns
a umem object, letting each handler store it. This would also make it
easier for drivers to pass their private buffers using this
infrastructure.
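Roughly the shape I have in mind, untested and with made-up names,
just to show the idea:

```
/* Per-attribute buffers instead of one indexed array: each method
 * declares attributes of a new buffer type and the handler pulls a
 * umem per attribute id, e.g.:
 *
 *	cq_umem = uverbs_attr_get_umem(attrs, MLX5_IB_ATTR_CREATE_CQ_BUF);
 *	db_umem = uverbs_attr_get_umem(attrs, MLX5_IB_ATTR_CREATE_CQ_DBR);
 *
 * The helper would keep the dmabuf/VA switch from
 * ib_umem_list_create(), and each handler stores the returned umem
 * itself, so a driver could also accept its private buffers without
 * touching any common slot enum.
 */
```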
Michael
> + unsigned int idx = descs[i].index;
> +
> + if (descs[i].reserved) {
> + err = -EINVAL;
> + goto err_release;
> + }
> + if (idx >= count || (list->provided & BIT(idx))) {
> + err = -EINVAL;
> + goto err_release;
> + }
> +
> + switch (descs[i].type) {
> + case IB_UVERBS_BUFFER_TYPE_DMABUF:
> + umem_dmabuf = ib_umem_dmabuf_get_pinned(
> + device, descs[i].addr, descs[i].length,
> + descs[i].fd, IB_ACCESS_LOCAL_WRITE);
> + if (IS_ERR(umem_dmabuf)) {
> + err = PTR_ERR(umem_dmabuf);
> + goto err_release;
> + }
> + list->umems[idx] = &umem_dmabuf->umem;
> + break;
> + case IB_UVERBS_BUFFER_TYPE_VA:
> + umem = ib_umem_get(device, descs[i].addr,
> + descs[i].length, IB_ACCESS_LOCAL_WRITE);
> + if (IS_ERR(umem)) {
> + err = PTR_ERR(umem);
> + goto err_release;
> + }
> + list->umems[idx] = umem;
> + break;
> + default:
> + err = -EINVAL;
> + goto err_release;
> + }
> + list->provided |= BIT(idx);
> + }
> +
> + return list;
> +
> +err_release:
> + ib_umem_list_release(list);
> + return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL(ib_umem_list_create);
> +
> +/**
> + * ib_umem_list_release - Release all umems in the list and free it
> + * @list: umem list
> + */
> +void ib_umem_list_release(struct ib_umem_list *list)
> +{
> + int i;
> +
> + if (!list)
> + return;
> + for (i = 0; i < list->count; i++)
> + ib_umem_release(list->umems[i]);
> + kfree(list);
> +}
> +EXPORT_SYMBOL(ib_umem_list_release);
> +
> +/**
> + * ib_umem_list_check_consumed - Verify all provided umems were loaded
> + * @list: umem list
> + *
> + * Return: 0 if all provided slots were loaded, -EINVAL otherwise.
> + */
> +int ib_umem_list_check_consumed(const struct ib_umem_list *list)
> +{
> + return (list->provided & ~list->loaded) == 0 ? 0 : -EINVAL;
> +}
> +EXPORT_SYMBOL(ib_umem_list_check_consumed);
> +
> +/**
> + * ib_umem_list_insert - Insert a umem into the list at a given index
> + * @list: umem list
> + * @index: per-command buffer slot index
> + * @umem: umem pointer to store
> + *
> + * Stores @umem at @index (replacing any existing). For use from create_cq
> + * when the buffer comes from legacy ATTRs rather than the buffer list.
> + */
> +void ib_umem_list_insert(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem)
> +{
> + ib_umem_list_replace(list, index, umem);
> + if (umem)
> + list->provided |= BIT(index);
> +}
> +EXPORT_SYMBOL(ib_umem_list_insert);
> +
> +/**
> + * ib_umem_list_load - Load a umem from the list by index
> + * @list: umem list (may be NULL)
> + * @index: per-command buffer slot index
> + * @size: minimum required umem length
> + *
> + * Return: umem pointer, or NULL if the slot is empty or
> + * the slot is out of bounds, or ERR_PTR(-EINVAL) if the umem is too small.
> + */
> +struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
> + unsigned int index, size_t size)
> +{
> + struct ib_umem *umem;
> +
> + if (!list || index >= list->count)
> + return NULL;
> + umem = list->umems[index];
> + if (!umem)
> + return NULL;
> + if (umem->length < size)
> + return ERR_PTR(-EINVAL);
> + list->loaded |= BIT(index);
> + return umem;
> +}
> +EXPORT_SYMBOL(ib_umem_list_load);
> +
> +/**
> + * ib_umem_list_load_or_get - Umem from list or pin user memory
> + * @list: umem list (may be NULL)
> + * @index: per-command buffer slot index
> + * @device: IB device for ib_umem_get when the list slot is empty
> + * @addr: user virtual address for ib_umem_get
> + * @size: length for ib_umem_get
> + * @access: access flags for ib_umem_get
> + *
> + * If @list has a umem at @index, returns it like ib_umem_list_load() (and
> + * marks the slot loaded). Otherwise calls ib_umem_get() with the given
> + * @access flags and on success stores the result at @index when
> + * @list is non-NULL.
> + *
> + * Return: valid umem pointer, or ERR_PTR.
> + */
> +struct ib_umem *ib_umem_list_load_or_get(struct ib_umem_list *list,
> + unsigned int index,
> + struct ib_device *device,
> + unsigned long addr, size_t size,
> + int access)
> +{
> + struct ib_umem *umem;
> +
> + umem = ib_umem_list_load(list, index, size);
> + if (IS_ERR(umem) || umem)
> + return umem;
> + umem = ib_umem_get(device, addr, size, access);
> + if (IS_ERR(umem))
> + return umem;
> + if (list && index < list->count)
> + list->umems[index] = umem;
> + return umem;
> +}
> +EXPORT_SYMBOL(ib_umem_list_load_or_get);
> +
> +/**
> + * ib_umem_list_replace - Replace umem at index, releasing the previous one
> + * @list: umem list (may be NULL)
> + * @index: per-command buffer slot index
> + * @umem: new umem pointer (may be NULL to clear the slot)
> + *
> + * Stores @umem at @index. If a different umem was already stored there, it is
> + * released. Used for CQ resize and similar.
> + */
> +void ib_umem_list_replace(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem)
> +{
> + struct ib_umem *old;
> +
> + if (!list || index >= list->count)
> + return;
> + old = list->umems[index];
> + list->umems[index] = umem;
> + if (old && old != umem)
> + ib_umem_release(old);
> +}
> +EXPORT_SYMBOL(ib_umem_list_replace);
> +
> +/**
> + * ib_umem_release_non_listed - Release a umem that is not stored in the list
> + * @list: umem list
> + * @index: per-command buffer slot index
> + * @umem: umem pointer to release
> + *
> + * Releases @umem if it is not stored in @list.
> + */
> +void ib_umem_release_non_listed(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem)
> +{
> + if (!list || index >= list->count || list->umems[index] != umem)
> + ib_umem_release(umem);
> +}
> +EXPORT_SYMBOL(ib_umem_release_non_listed);
> diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
> index 2ad52cc1d52b..924acb8d08c3 100644
> --- a/include/rdma/ib_umem.h
> +++ b/include/rdma/ib_umem.h
> @@ -11,6 +11,7 @@
>
> struct ib_device;
> struct dma_buf_attach_ops;
> +struct uverbs_attr_bundle;
>
> struct ib_umem {
> struct ib_device *ibdev;
> @@ -80,6 +81,36 @@ struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
> void ib_umem_release(struct ib_umem *umem);
> int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> size_t length);
> +
> +/**
> + * struct ib_umem_list - collection of pre-mapped umems
> + *
> + * Created from the UVERBS_ATTR_BUFFERS attribute. Each entry is indexed
> + * by a per-command buffer slot enum (e.g., IB_UMEM_CQ_BUF for CQ CREATE).
> + * Drivers use ib_umem_list_load() to retrieve a specific umem by index.
> + */
> +struct ib_umem_list;
> +
> +struct ib_umem_list *ib_umem_list_create(struct ib_device *device,
> + const struct uverbs_attr_bundle *attrs,
> + unsigned int slot_max);
> +void ib_umem_list_release(struct ib_umem_list *list);
> +int ib_umem_list_check_consumed(const struct ib_umem_list *list);
> +void ib_umem_list_insert(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem);
> +
> +struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
> + unsigned int index, size_t size);
> +struct ib_umem *ib_umem_list_load_or_get(struct ib_umem_list *list,
> + unsigned int index,
> + struct ib_device *device,
> + unsigned long addr, size_t size,
> + int access);
> +void ib_umem_list_replace(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem);
> +void ib_umem_release_non_listed(struct ib_umem_list *list, unsigned int index,
> + struct ib_umem *umem);
> +
> unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
> unsigned long pgsz_bitmap,
> unsigned long virt);
> @@ -230,5 +261,28 @@ static inline void ib_umem_dmabuf_revoke_lock(struct ib_umem_dmabuf *umem_dmabuf
> static inline void ib_umem_dmabuf_revoke_unlock(struct ib_umem_dmabuf *umem_dmabuf) {}
> static inline void ib_umem_dmabuf_revoke(struct ib_umem_dmabuf *umem_dmabuf) {}
>
> +struct ib_umem_list;
> +
> +static inline void ib_umem_list_release(struct ib_umem_list *list) { }
> +static inline struct ib_umem *ib_umem_list_load(struct ib_umem_list *list,
> + unsigned int index,
> + size_t size)
> +{
> + return ERR_PTR(-EOPNOTSUPP);
> +}
> +static inline struct ib_umem *
> +ib_umem_list_load_or_get(struct ib_umem_list *list, unsigned int index,
> + struct ib_device *device, unsigned long addr,
> + size_t size, int access)
> +{
> + return ERR_PTR(-EOPNOTSUPP);
> +}
> +static inline void ib_umem_list_replace(struct ib_umem_list *list,
> + unsigned int index,
> + struct ib_umem *umem) { }
> +static inline void ib_umem_release_non_listed(struct ib_umem_list *list,
> + unsigned int index,
> + struct ib_umem *umem) { }
> +
> #endif /* CONFIG_INFINIBAND_USER_MEM */
> #endif /* IB_UMEM_H */
> diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
> index e2af17da3e32..05bcab27a87d 100644
> --- a/include/rdma/uverbs_ioctl.h
> +++ b/include/rdma/uverbs_ioctl.h
> @@ -590,6 +590,20 @@ struct uapi_definition {
> UA_OPTIONAL, \
> .is_udata = 1)
>
> +/*
> + * Optional array of struct ib_uverbs_buffer_desc describing memory regions
> + * backed by dma-buf or user virtual address. Can be added to any method
> + * that needs external buffer support.
> + * Each entry carries an index field selecting the per-command buffer slot.
> + * Use ib_umem_list_create() to map them and ib_umem_list_load() to access.
> + */
> +#define UVERBS_ATTR_BUFFERS() \
> + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_BUFFERS, \
> + UVERBS_ATTR_MIN_SIZE( \
> + sizeof(struct ib_uverbs_buffer_desc)), \
> + UA_OPTIONAL, \
> + UA_ALLOC_AND_COPY)
> +
> /* =================================================
> * Parsing infrastructure
> * =================================================
> diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h
> index 72041c1b0ea5..10aa6568abf1 100644
> --- a/include/uapi/rdma/ib_user_ioctl_cmds.h
> +++ b/include/uapi/rdma/ib_user_ioctl_cmds.h
> @@ -64,6 +64,7 @@ enum {
> UVERBS_ATTR_UHW_IN = UVERBS_ID_DRIVER_NS,
> UVERBS_ATTR_UHW_OUT,
> UVERBS_ID_DRIVER_NS_WITH_UHW,
> + UVERBS_ATTR_BUFFERS,
> };
>
> enum uverbs_methods_device {
> diff --git a/include/uapi/rdma/ib_user_ioctl_verbs.h b/include/uapi/rdma/ib_user_ioctl_verbs.h
> index 90c5cd8e7753..41ed9f75b4de 100644
> --- a/include/uapi/rdma/ib_user_ioctl_verbs.h
> +++ b/include/uapi/rdma/ib_user_ioctl_verbs.h
> @@ -273,4 +273,31 @@ struct ib_uverbs_gid_entry {
> __u32 netdev_ifindex; /* It is 0 if there is no netdev associated with it */
> };
>
> +enum ib_uverbs_buffer_type {
> + IB_UVERBS_BUFFER_TYPE_DMABUF,
> + IB_UVERBS_BUFFER_TYPE_VA,
> +};
> +
> +/*
> + * Describes a single buffer backed by dma-buf or user virtual address.
> + * Passed as an array via UVERBS_ATTR_BUFFERS. Each uverb command that
> + * accepts this attribute defines its own per-command buffer slot enum.
> + * The index field selects the buffer slot this descriptor maps to.
> + *
> + * @fd: dma-buf file descriptor (valid for IB_UVERBS_BUFFER_TYPE_DMABUF)
> + * @type: buffer type from enum ib_uverbs_buffer_type
> + * @index: per-command buffer slot index
> + * @reserved: must be zero
> + * @addr: offset within dma-buf, or user virtual address for VA
> + * @length: buffer length in bytes
> + */
> +struct ib_uverbs_buffer_desc {
> + __s32 fd;
> + __u32 type;
> + __u32 index;
> + __u32 reserved;
> + __aligned_u64 addr;
> + __aligned_u64 length;
> +};
> +
> #endif
> --
> 2.53.0
>
Thread overview: 17+ messages
2026-04-11 14:49 [PATCH rdma-next v2 00/15] RDMA: Introduce generic buffer descriptor infrastructure for umem Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 01/15] RDMA/core: " Jiri Pirko
2026-04-12 12:33 ` Michael Margolin
2026-04-11 14:49 ` [PATCH rdma-next v2 02/15] RDMA/uverbs: Push out CQ buffer umem processing into a helper Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 03/15] RDMA/uverbs: Integrate umem_list into CQ creation Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 04/15] RDMA/efa: Use umem_list for user CQ buffer Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 05/15] RDMA/mlx5: " Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 06/15] RDMA/bnxt_re: " Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 07/15] RDMA/mlx4: " Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 08/15] RDMA/uverbs: Remove legacy umem field from struct ib_cq Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 09/15] RDMA/uverbs: Verify all umem_list buffers are consumed after CQ creation Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 10/15] RDMA/uverbs: Integrate umem_list into QP creation Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 11/15] RDMA/mlx5: Use umem_list for QP buffers in create_qp Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 12/15] RDMA/uverbs: Add doorbell record buffer slot to CQ umem_list Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 13/15] RDMA/mlx5: Use umem_list for CQ doorbell record Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 14/15] RDMA/uverbs: Add doorbell record buffer slot to QP umem_list Jiri Pirko
2026-04-11 14:49 ` [PATCH rdma-next v2 15/15] RDMA/mlx5: Use umem_list for QP doorbell record Jiri Pirko