From: Leon Romanovsky <leon@kernel.org>
To: Jiri Pirko <jiri@resnulli.us>
Cc: linux-rdma@vger.kernel.org, jgg@ziepe.ca, mrgolin@amazon.com,
gal.pressman@linux.dev, sleybo@amazon.com, parav@nvidia.com,
mbloch@nvidia.com, yanjun.zhu@linux.dev, wangliang74@huawei.com,
marco.crivellari@suse.com, roman.gushchin@linux.dev,
phaddad@nvidia.com, lirongqing@baidu.com, ynachum@amazon.com,
huangjunxian6@hisilicon.com, kalesh-anakkur.purayil@broadcom.com,
ohartoov@nvidia.com, michaelgur@nvidia.com, shayd@nvidia.com,
edwards@nvidia.com, sriharsha.basavapatna@broadcom.com,
andrew.gospodarek@broadcom.com, selvin.xavier@broadcom.com
Subject: Re: [PATCH rdma-next 01/10] RDMA/umem: Add reference counting to ib_umem
Date: Tue, 3 Feb 2026 15:03:35 +0200 [thread overview]
Message-ID: <20260203130335.GU34749@unreal> (raw)
In-Reply-To: <uixv7cu4qe5vqkl3dlsd4smbxvflo3syoseuwtf4v7xhwgziul@gqlnz4geufth>
On Tue, Feb 03, 2026 at 01:46:21PM +0100, Jiri Pirko wrote:
> Tue, Feb 03, 2026 at 01:26:18PM +0100, leon@kernel.org wrote:
> >On Tue, Feb 03, 2026 at 11:11:39AM +0100, Jiri Pirko wrote:
> >> Tue, Feb 03, 2026 at 11:03:27AM +0100, leon@kernel.org wrote:
> >> >On Tue, Feb 03, 2026 at 09:49:53AM +0100, Jiri Pirko wrote:
> >> >> From: Jiri Pirko <jiri@nvidia.com>
> >> >>
> >> >> Introduce reference counting for ib_umem objects to simplify memory
> >> >> lifecycle management when umem buffers are shared between the core
> >> >> layer and device drivers.
> >
> >The lifecycle should be confined either to the core or to the driver,
> >but it should not mix responsibilities across both.
>
> Okay, I can re-phrase. The point is, the last holding reference actually
> does the release.
>
>
> >
> >> >>
> >> >> When the core RDMA layer allocates an ib_umem and passes it to a driver
> >> >> (e.g., for CQ or QP creation with external buffers), both layers need
> >> >> to manage the umem lifecycle. Without reference counting, this requires
> >> >> complex conditional release logic to avoid double-frees on error paths
> >> >> and leaks on success paths.
> >> >
> >> >This sentence doesn't align with the proposed change.
> >>
> >> Hmm, I'm not sure why you think it does not align. It exactly describes
> >> the code I had it originally without this path in place :)
> >
> >There is no complex conditional release logic which this reference
> >counting solves. That counting allows delayed, semi-random release,
> >nothing more.
>
> Again. Without the refcount, you have to make sure the umem is consumed
> by the op. Actually, check the existing create_cq_umem. On the error
> path, the umem is released by the caller. On success path the ownership
> is transferred to the calle.
We need to fix it. Exactly right now, I'm working to make sure that umem
is managed by ib_core and drivers don't call to ib_umem_get() at all.
> Consider various error paths in the calle
> some of which are shared with destroy_cq op, some umems are not actually
> used etc, it's a nightmare. I had the code written down like this
> originally, that's why I actually know.
>
> Perhaps I'm missing your point. Is is just about the patch descriptio or
> about the code itself?
Description and the code. UMEM needs to be created by ib_core and
ib_core should destroy them.
>
>
> >
> >>
> >> >
> >> >>
> >> >> With reference counting:
> >> >> - Core allocates umem with refcount=1
> >> >> - Driver calls ib_umem_get_ref() to take a reference
> >> >> - Both layers can unconditionally call ib_umem_release()
> >> >> - The umem is freed only when the last reference is dropped
> >> >>
> >> >> Signed-off-by: Jiri Pirko <jiri@nvidia.com>
> >> >> Change-Id: Ifb1765ea3b14dab3329294633ea5df063c74420d
> >> >
> >> >Please remove the Change-Ids and write the commit message yourself,
> >> >without relying on AI. The current message provides no meaningful
> >>
> >> I'm new in RDMA. Not sure what you mean by meaningful information :)
> >
> >This part of commit message is clearly generated by Cursor:
> >With reference counting:
> >- Core allocates umem with refcount=1
> >- Driver calls ib_umem_get_ref() to take a reference
> >- Both layers can unconditionally call ib_umem_release()
> >- The umem is freed only when the last reference is dropped
> >
> >The above paragraphs have clear AI pattern as they don't explain why,
> >but only how.
>
> Why is explained above, isn't it?
> If you don't want the "how part", I can remove it, no problem.
Commit message should provide an additional information, which is not
available in the code itself. Description like "Core allocates umem with
refcount=1" has zero value.
Thanks
>
>
> >
> >I'm seeing the SAME commit messages in many internals and externals
> >patches.
> >
> >Even my AI review is agreed with me :)
> >...
> >"AI-authorship-score": "medium"
> >...
> >
> >> I'm always trying to provide it.
> >>
> >> >information, particularly the auto‑generated summary at the end.
> >>
> >> Doh, the changeIDs :) Sorry about that.
> >>
> >>
> >> >
> >> >Thanks
> >> >
> >> >> ---
> >> >> drivers/infiniband/core/umem.c | 5 +++++
> >> >> drivers/infiniband/core/umem_dmabuf.c | 1 +
> >> >> drivers/infiniband/core/umem_odp.c | 3 +++
> >> >> include/rdma/ib_umem.h | 9 +++++++++
> >> >> 4 files changed, 18 insertions(+)
> >> >>
> >> >> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> >> >> index 8137031c2a65..09ce694d66ea 100644
> >> >> --- a/drivers/infiniband/core/umem.c
> >> >> +++ b/drivers/infiniband/core/umem.c
> >> >> @@ -192,6 +192,7 @@ struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
> >> >> umem = kzalloc(sizeof(*umem), GFP_KERNEL);
> >> >> if (!umem)
> >> >> return ERR_PTR(-ENOMEM);
> >> >> + refcount_set(&umem->refcount, 1);
> >> >> umem->ibdev = device;
> >> >> umem->length = size;
> >> >> umem->address = addr;
> >> >> @@ -280,11 +281,15 @@ EXPORT_SYMBOL(ib_umem_get);
> >> >> /**
> >> >> * ib_umem_release - release memory pinned with ib_umem_get
> >> >> * @umem: umem struct to release
> >> >> + *
> >> >> + * Decrement the reference count and free the umem when it reaches zero.
> >> >> */
> >> >> void ib_umem_release(struct ib_umem *umem)
> >> >> {
> >> >> if (!umem)
> >> >> return;
> >> >> + if (!refcount_dec_and_test(&umem->refcount))
> >> >> + return;
> >> >> if (umem->is_dmabuf)
> >> >> return ib_umem_dmabuf_release(to_ib_umem_dmabuf(umem));
> >> >> if (umem->is_odp)
> >> >> diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
> >> >> index 939da49b0dcc..5c5286092fca 100644
> >> >> --- a/drivers/infiniband/core/umem_dmabuf.c
> >> >> +++ b/drivers/infiniband/core/umem_dmabuf.c
> >> >> @@ -143,6 +143,7 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
> >> >> }
> >> >>
> >> >> umem = &umem_dmabuf->umem;
> >> >> + refcount_set(&umem->refcount, 1);
> >> >> umem->ibdev = device;
> >> >> umem->length = size;
> >> >> umem->address = offset;
> >> >> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> >> >> index 572a91a62a7b..7be30fda57e3 100644
> >> >> --- a/drivers/infiniband/core/umem_odp.c
> >> >> +++ b/drivers/infiniband/core/umem_odp.c
> >> >> @@ -144,6 +144,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_device *device,
> >> >> if (!umem_odp)
> >> >> return ERR_PTR(-ENOMEM);
> >> >> umem = &umem_odp->umem;
> >> >> + refcount_set(&umem->refcount, 1);
> >> >> umem->ibdev = device;
> >> >> umem->writable = ib_access_writable(access);
> >> >> umem->owning_mm = current->mm;
> >> >> @@ -185,6 +186,7 @@ ib_umem_odp_alloc_child(struct ib_umem_odp *root, unsigned long addr,
> >> >> if (!odp_data)
> >> >> return ERR_PTR(-ENOMEM);
> >> >> umem = &odp_data->umem;
> >> >> + refcount_set(&umem->refcount, 1);
> >> >> umem->ibdev = root->umem.ibdev;
> >> >> umem->length = size;
> >> >> umem->address = addr;
> >> >> @@ -245,6 +247,7 @@ struct ib_umem_odp *ib_umem_odp_get(struct ib_device *device,
> >> >> if (!umem_odp)
> >> >> return ERR_PTR(-ENOMEM);
> >> >>
> >> >> + refcount_set(&umem_odp->umem.refcount, 1);
> >> >> umem_odp->umem.ibdev = device;
> >> >> umem_odp->umem.length = size;
> >> >> umem_odp->umem.address = addr;
> >> >> diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
> >> >> index 0a8e092c0ea8..44264f78eab3 100644
> >> >> --- a/include/rdma/ib_umem.h
> >> >> +++ b/include/rdma/ib_umem.h
> >> >> @@ -10,6 +10,7 @@
> >> >> #include <linux/list.h>
> >> >> #include <linux/scatterlist.h>
> >> >> #include <linux/workqueue.h>
> >> >> +#include <linux/refcount.h>
> >> >> #include <rdma/ib_verbs.h>
> >> >>
> >> >> struct ib_ucontext;
> >> >> @@ -19,6 +20,7 @@ struct dma_buf_attach_ops;
> >> >> struct ib_umem {
> >> >> struct ib_device *ibdev;
> >> >> struct mm_struct *owning_mm;
> >> >> + refcount_t refcount;
> >> >> u64 iova;
> >> >> size_t length;
> >> >> unsigned long address;
> >> >> @@ -110,6 +112,12 @@ static inline bool __rdma_umem_block_iter_next(struct ib_block_iter *biter)
> >> >>
> >> >> struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
> >> >> size_t size, int access);
> >> >> +
> >> >> +static inline void ib_umem_get_ref(struct ib_umem *umem)
> >> >> +{
> >> >> + refcount_inc(&umem->refcount);
> >> >> +}
> >> >> +
> >> >> void ib_umem_release(struct ib_umem *umem);
> >> >> int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> >> >> size_t length);
> >> >> @@ -188,6 +196,7 @@ static inline struct ib_umem *ib_umem_get(struct ib_device *device,
> >> >> {
> >> >> return ERR_PTR(-EOPNOTSUPP);
> >> >> }
> >> >> +static inline void ib_umem_get_ref(struct ib_umem *umem) { }
> >> >> static inline void ib_umem_release(struct ib_umem *umem) { }
> >> >> static inline int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> >> >> size_t length) {
> >> >> --
> >> >> 2.51.1
> >> >>
> >> >>
next prev parent reply other threads:[~2026-02-03 13:03 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-03 8:49 [PATCH rdma-next 00/10] RDMA: Extend uverbs umem support for QP buffers and doorbell records Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 01/10] RDMA/umem: Add reference counting to ib_umem Jiri Pirko
2026-02-03 10:03 ` Leon Romanovsky
2026-02-03 10:11 ` Jiri Pirko
2026-02-03 12:26 ` Leon Romanovsky
2026-02-03 12:46 ` Jiri Pirko
2026-02-03 13:03 ` Leon Romanovsky [this message]
2026-02-03 13:20 ` Jiri Pirko
2026-02-03 13:32 ` Leon Romanovsky
2026-02-03 14:31 ` Jiri Pirko
2026-02-03 13:49 ` Sriharsha Basavapatna
2026-02-03 14:29 ` Jiri Pirko
2026-02-03 14:49 ` Sriharsha Basavapatna
2026-02-03 14:51 ` Jason Gunthorpe
2026-02-03 15:39 ` Jiri Pirko
2026-02-03 16:59 ` Jason Gunthorpe
2026-02-04 7:01 ` Jiri Pirko
2026-02-04 15:38 ` Jiri Pirko
2026-02-04 17:46 ` Jason Gunthorpe
2026-02-04 17:54 ` Leon Romanovsky
2026-02-04 17:56 ` Jiri Pirko
2026-02-03 16:56 ` Leon Romanovsky
2026-02-03 17:01 ` Jason Gunthorpe
2026-02-03 8:49 ` [PATCH rdma-next 02/10] RDMA/uverbs: Use umem refcounting for CQ creation with external buffer Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 03/10] RDMA/mlx5: Add support for CQ creation with external umem buffer Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 04/10] RDMA/uverbs: Factor out common buffer umem parsing into helper Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 05/10] RDMA/core: Add support for QP buffer umem in QP creation Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 06/10] RDMA/mlx5: Add support for QP creation with external umem buffers Jiri Pirko
2026-02-03 8:49 ` [PATCH rdma-next 07/10] RDMA/uverbs: Add doorbell record umem support to CQ creation Jiri Pirko
2026-02-03 8:50 ` [PATCH rdma-next 08/10] RDMA/mlx5: Add external doorbell record umem support for CQ Jiri Pirko
2026-02-03 8:50 ` [PATCH rdma-next 09/10] RDMA/uverbs: Add doorbell record umem support to QP creation Jiri Pirko
2026-02-03 8:50 ` [PATCH rdma-next 10/10] RDMA/mlx5: Add external doorbell record umem support for QP Jiri Pirko
2026-02-03 9:59 ` [PATCH rdma-next 00/10] RDMA: Extend uverbs umem support for QP buffers and doorbell records Leon Romanovsky
2026-02-03 10:13 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260203130335.GU34749@unreal \
--to=leon@kernel.org \
--cc=andrew.gospodarek@broadcom.com \
--cc=edwards@nvidia.com \
--cc=gal.pressman@linux.dev \
--cc=huangjunxian6@hisilicon.com \
--cc=jgg@ziepe.ca \
--cc=jiri@resnulli.us \
--cc=kalesh-anakkur.purayil@broadcom.com \
--cc=linux-rdma@vger.kernel.org \
--cc=lirongqing@baidu.com \
--cc=marco.crivellari@suse.com \
--cc=mbloch@nvidia.com \
--cc=michaelgur@nvidia.com \
--cc=mrgolin@amazon.com \
--cc=ohartoov@nvidia.com \
--cc=parav@nvidia.com \
--cc=phaddad@nvidia.com \
--cc=roman.gushchin@linux.dev \
--cc=selvin.xavier@broadcom.com \
--cc=shayd@nvidia.com \
--cc=sleybo@amazon.com \
--cc=sriharsha.basavapatna@broadcom.com \
--cc=wangliang74@huawei.com \
--cc=yanjun.zhu@linux.dev \
--cc=ynachum@amazon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox