From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 3 Feb 2026 15:32:28 +0200
From: Leon Romanovsky
To: Jiri Pirko
Cc: linux-rdma@vger.kernel.org, jgg@ziepe.ca, mrgolin@amazon.com,
	gal.pressman@linux.dev, sleybo@amazon.com, parav@nvidia.com,
	mbloch@nvidia.com, yanjun.zhu@linux.dev, wangliang74@huawei.com,
	marco.crivellari@suse.com, roman.gushchin@linux.dev, phaddad@nvidia.com,
	lirongqing@baidu.com, ynachum@amazon.com, huangjunxian6@hisilicon.com,
	kalesh-anakkur.purayil@broadcom.com, ohartoov@nvidia.com,
	michaelgur@nvidia.com, shayd@nvidia.com, edwards@nvidia.com,
	sriharsha.basavapatna@broadcom.com, andrew.gospodarek@broadcom.com,
	selvin.xavier@broadcom.com
Subject: Re: [PATCH rdma-next 01/10] RDMA/umem: Add reference counting to ib_umem
Message-ID: <20260203133228.GV34749@unreal>
References: <20260203085003.71184-1-jiri@resnulli.us>
 <20260203085003.71184-2-jiri@resnulli.us>
 <20260203100327.GS34749@unreal>
 <20260203122618.GT34749@unreal>
 <20260203130335.GU34749@unreal>
 <56arliffs27lqgriymsnysnh672joz7ihndkeffqee32vvwxby@w6qhwezufrrc>
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <56arliffs27lqgriymsnysnh672joz7ihndkeffqee32vvwxby@w6qhwezufrrc>

On Tue, Feb 03, 2026 at 02:20:45PM +0100, Jiri Pirko wrote:
> Tue, Feb 03, 2026 at 02:03:35PM +0100, leon@kernel.org wrote:
> >On Tue, Feb 03, 2026 at 01:46:21PM +0100, Jiri Pirko wrote:
> >> Tue, Feb 03, 2026 at 01:26:18PM +0100, leon@kernel.org wrote:
> >> >On Tue, Feb 03, 2026 at 11:11:39AM +0100, Jiri Pirko wrote:
> >> >> Tue, Feb 03, 2026 at 11:03:27AM +0100, leon@kernel.org wrote:
> >> >> >On Tue, Feb 03, 2026 at 09:49:53AM +0100, Jiri Pirko wrote:
> >> >> >> From: Jiri Pirko
> >> >> >>
> >> >> >> Introduce reference counting for
> >> >> >> ib_umem objects to simplify memory
> >> >> >> lifecycle management when umem buffers are shared between the core
> >> >> >> layer and device drivers.
> >> >
> >> >The lifecycle should be confined either to the core or to the driver,
> >> >but it should not mix responsibilities across both.
> >>
> >> Okay, I can re-phrase. The point is, the last holding reference actually
> >> does the release.
> >>
> >> >> >>
> >> >> >> When the core RDMA layer allocates an ib_umem and passes it to a driver
> >> >> >> (e.g., for CQ or QP creation with external buffers), both layers need
> >> >> >> to manage the umem lifecycle. Without reference counting, this requires
> >> >> >> complex conditional release logic to avoid double-frees on error paths
> >> >> >> and leaks on success paths.
> >> >> >
> >> >> >This sentence doesn't align with the proposed change.
> >> >>
> >> >> Hmm, I'm not sure why you think it does not align. It exactly describes
> >> >> the code I had originally without this patch in place :)
> >> >
> >> >There is no complex conditional release logic which this reference
> >> >counting solves. That counting allows delayed, semi-random release,
> >> >nothing more.
> >>
> >> Again. Without the refcount, you have to make sure the umem is consumed
> >> by the op. Actually, check the existing create_cq_umem. On the error
> >> path, the umem is released by the caller. On the success path the ownership
> >> is transferred to the callee.
> >
> >We need to fix it. Exactly right now, I'm working to make sure that umem
> >is managed by ib_core and drivers don't call ib_umem_get() at all.
>
> Should I wait and rebase?

It depends on your timeline. We are in -rc8 right now, so at least for the
next 3 weeks (one week till 6.19 + two weeks of merge window) from now,
your series won't be applied.

Thanks

> >> Consider various error paths in the callee,
> >> some of which are shared with the destroy_cq op, some umems are not actually
> >> used etc., it's a nightmare.
> >> I had the code written down like this
> >> originally, that's why I actually know.
> >>
> >> Perhaps I'm missing your point. Is it just about the patch description or
> >> about the code itself?
> >
> >Description and the code. UMEM needs to be created by ib_core and
> >ib_core should destroy them.
> >
> >>
> >> >>
> >> >> >
> >> >> >>
> >> >> >> With reference counting:
> >> >> >> - Core allocates umem with refcount=1
> >> >> >> - Driver calls ib_umem_get_ref() to take a reference
> >> >> >> - Both layers can unconditionally call ib_umem_release()
> >> >> >> - The umem is freed only when the last reference is dropped
> >> >> >>
> >> >> >> Signed-off-by: Jiri Pirko
> >> >> >> Change-Id: Ifb1765ea3b14dab3329294633ea5df063c74420d
> >> >> >
> >> >> >Please remove the Change-Ids and write the commit message yourself,
> >> >> >without relying on AI. The current message provides no meaningful
> >> >>
> >> >> I'm new in RDMA. Not sure what you mean by meaningful information :)
> >> >
> >> >This part of the commit message is clearly generated by Cursor:
> >> >With reference counting:
> >> >- Core allocates umem with refcount=1
> >> >- Driver calls ib_umem_get_ref() to take a reference
> >> >- Both layers can unconditionally call ib_umem_release()
> >> >- The umem is freed only when the last reference is dropped
> >> >
> >> >The above paragraphs have a clear AI pattern, as they don't explain why,
> >> >but only how.
> >>
> >> Why is explained above, isn't it?
> >> If you don't want the "how" part, I can remove it, no problem.
> >
> >The commit message should provide additional information which is not
> >available in the code itself. A description like "Core allocates umem with
> >refcount=1" has zero value.
>
> :), sure

Thanks

> >>
> >> >I'm seeing the SAME commit messages in many internal and external
> >> >patches.
> >> >
> >> >Even my AI review agreed with me :)
> >> >...
> >> >"AI-authorship-score": "medium"
> >> >...
> >> >
> >> >> I'm always trying to provide it.
> >> >>
> >> >> >information, particularly the auto-generated summary at the end.
> >> >>
> >> >> Doh, the changeIDs :) Sorry about that.
> >> >>
> >> >> >Thanks
> >> >> >
> >> >> >> ---
> >> >> >>  drivers/infiniband/core/umem.c        | 5 +++++
> >> >> >>  drivers/infiniband/core/umem_dmabuf.c | 1 +
> >> >> >>  drivers/infiniband/core/umem_odp.c    | 3 +++
> >> >> >>  include/rdma/ib_umem.h                | 9 +++++++++
> >> >> >>  4 files changed, 18 insertions(+)
> >> >> >>
> >> >> >> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> >> >> >> index 8137031c2a65..09ce694d66ea 100644
> >> >> >> --- a/drivers/infiniband/core/umem.c
> >> >> >> +++ b/drivers/infiniband/core/umem.c
> >> >> >> @@ -192,6 +192,7 @@ struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
> >> >> >>  	umem = kzalloc(sizeof(*umem), GFP_KERNEL);
> >> >> >>  	if (!umem)
> >> >> >>  		return ERR_PTR(-ENOMEM);
> >> >> >> +	refcount_set(&umem->refcount, 1);
> >> >> >>  	umem->ibdev = device;
> >> >> >>  	umem->length = size;
> >> >> >>  	umem->address = addr;
> >> >> >> @@ -280,11 +281,15 @@ EXPORT_SYMBOL(ib_umem_get);
> >> >> >>  /**
> >> >> >>   * ib_umem_release - release memory pinned with ib_umem_get
> >> >> >>   * @umem: umem struct to release
> >> >> >> + *
> >> >> >> + * Decrement the reference count and free the umem when it reaches zero.
> >> >> >>   */
> >> >> >>  void ib_umem_release(struct ib_umem *umem)
> >> >> >>  {
> >> >> >>  	if (!umem)
> >> >> >>  		return;
> >> >> >> +	if (!refcount_dec_and_test(&umem->refcount))
> >> >> >> +		return;
> >> >> >>  	if (umem->is_dmabuf)
> >> >> >>  		return ib_umem_dmabuf_release(to_ib_umem_dmabuf(umem));
> >> >> >>  	if (umem->is_odp)
> >> >> >> diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
> >> >> >> index 939da49b0dcc..5c5286092fca 100644
> >> >> >> --- a/drivers/infiniband/core/umem_dmabuf.c
> >> >> >> +++ b/drivers/infiniband/core/umem_dmabuf.c
> >> >> >> @@ -143,6 +143,7 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device,
> >> >> >>  	}
> >> >> >>
> >> >> >>  	umem = &umem_dmabuf->umem;
> >> >> >> +	refcount_set(&umem->refcount, 1);
> >> >> >>  	umem->ibdev = device;
> >> >> >>  	umem->length = size;
> >> >> >>  	umem->address = offset;
> >> >> >> diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
> >> >> >> index 572a91a62a7b..7be30fda57e3 100644
> >> >> >> --- a/drivers/infiniband/core/umem_odp.c
> >> >> >> +++ b/drivers/infiniband/core/umem_odp.c
> >> >> >> @@ -144,6 +144,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_device *device,
> >> >> >>  	if (!umem_odp)
> >> >> >>  		return ERR_PTR(-ENOMEM);
> >> >> >>  	umem = &umem_odp->umem;
> >> >> >> +	refcount_set(&umem->refcount, 1);
> >> >> >>  	umem->ibdev = device;
> >> >> >>  	umem->writable = ib_access_writable(access);
> >> >> >>  	umem->owning_mm = current->mm;
> >> >> >> @@ -185,6 +186,7 @@ ib_umem_odp_alloc_child(struct ib_umem_odp *root, unsigned long addr,
> >> >> >>  	if (!odp_data)
> >> >> >>  		return ERR_PTR(-ENOMEM);
> >> >> >>  	umem = &odp_data->umem;
> >> >> >> +	refcount_set(&umem->refcount, 1);
> >> >> >>  	umem->ibdev = root->umem.ibdev;
> >> >> >>  	umem->length = size;
> >> >> >>  	umem->address = addr;
> >> >> >> @@ -245,6 +247,7 @@ struct ib_umem_odp *ib_umem_odp_get(struct ib_device *device,
> >> >> >>  	if (!umem_odp)
> >> >> >>  		return ERR_PTR(-ENOMEM);
> >> >> >>
> >> >> >> +	refcount_set(&umem_odp->umem.refcount, 1);
> >> >> >>  	umem_odp->umem.ibdev = device;
> >> >> >>  	umem_odp->umem.length = size;
> >> >> >>  	umem_odp->umem.address = addr;
> >> >> >> diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
> >> >> >> index 0a8e092c0ea8..44264f78eab3 100644
> >> >> >> --- a/include/rdma/ib_umem.h
> >> >> >> +++ b/include/rdma/ib_umem.h
> >> >> >> @@ -10,6 +10,7 @@
> >> >> >>  #include
> >> >> >>  #include
> >> >> >>  #include
> >> >> >> +#include
> >> >> >>  #include
> >> >> >>
> >> >> >>  struct ib_ucontext;
> >> >> >> @@ -19,6 +20,7 @@ struct dma_buf_attach_ops;
> >> >> >>  struct ib_umem {
> >> >> >>  	struct ib_device *ibdev;
> >> >> >>  	struct mm_struct *owning_mm;
> >> >> >> +	refcount_t refcount;
> >> >> >>  	u64 iova;
> >> >> >>  	size_t length;
> >> >> >>  	unsigned long address;
> >> >> >> @@ -110,6 +112,12 @@ static inline bool __rdma_umem_block_iter_next(struct ib_block_iter *biter)
> >> >> >>
> >> >> >>  struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr,
> >> >> >>  			    size_t size, int access);
> >> >> >> +
> >> >> >> +static inline void ib_umem_get_ref(struct ib_umem *umem)
> >> >> >> +{
> >> >> >> +	refcount_inc(&umem->refcount);
> >> >> >> +}
> >> >> >> +
> >> >> >>  void ib_umem_release(struct ib_umem *umem);
> >> >> >>  int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> >> >> >>  		      size_t length);
> >> >> >> @@ -188,6 +196,7 @@ static inline struct ib_umem *ib_umem_get(struct ib_device *device,
> >> >> >>  {
> >> >> >>  	return ERR_PTR(-EOPNOTSUPP);
> >> >> >>  }
> >> >> >> +static inline void ib_umem_get_ref(struct ib_umem *umem) { }
> >> >> >> +static inline void ib_umem_release(struct ib_umem *umem) { }
> >> >> >>  static inline int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
> >> >> >>  				    size_t length) {
> >> >> >> --
> >> >> >> 2.51.1
> >> >> >>
> >> >> >>