From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Mark Zhang <markzhang@nvidia.com>,
linux-rdma@vger.kernel.org,
Michael Guralnik <michaelgur@nvidia.com>,
Or Har-Toov <ohartoov@nvidia.com>
Subject: Re: [PATCH rdma-next 2/4] RDMA/restrack: Release MR restrack when delete
Date: Thu, 10 Nov 2022 21:39:51 +0200
Message-ID: <Y21Th3bG8gaARGuZ@unreal>
In-Reply-To: <Y21RJc2NIiUZw7A5@nvidia.com>
On Thu, Nov 10, 2022 at 03:29:41PM -0400, Jason Gunthorpe wrote:
> On Thu, Nov 10, 2022 at 11:35:37AM +0200, Leon Romanovsky wrote:
> > On Wed, Nov 09, 2022 at 03:08:33PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 07, 2022 at 10:51:34AM +0200, Leon Romanovsky wrote:
> > > > From: Mark Zhang <markzhang@nvidia.com>
> > > >
> > > > > The MR restrack also needs to be released when it is deleted, otherwise
> > > > > it causes a memory leak as the task struct won't be released.
> > > >
> > > > Fixes: 13ef5539def7 ("RDMA/restrack: Count references to the verbs objects")
> > > > Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> > > > Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > ---
> > > > drivers/infiniband/core/restrack.c | 2 --
> > > > 1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> > > > index 1f935d9f6178..01a499a8b88d 100644
> > > > --- a/drivers/infiniband/core/restrack.c
> > > > +++ b/drivers/infiniband/core/restrack.c
> > > > @@ -343,8 +343,6 @@ void rdma_restrack_del(struct rdma_restrack_entry *res)
> > > > rt = &dev->res[res->type];
> > > >
> > > > old = xa_erase(&rt->xa, res->id);
> > > > - if (res->type == RDMA_RESTRACK_MR)
> > > > - return;
> > >
> > > This needs more explanation, there was some good reason we needed to
> > > avoid the wait_for_completion() for the driver allocated objects, but I
> > > can't remember it anymore.
> > >
> > > You added this code in the v2 of the original series, maybe it had
> > > something to do with mlx4?
> >
> > I can't remember either, but if you want even more magic in your life,
> > see this hilarious thread:
> > https://lore.kernel.org/linux-rdma/9ba5a611ceac86774d3d0fda12704cecc30606f9.1618753038.git.leonro@nvidia.com/
>
> Oh, that clears it up
>
> The issue is that dereg can fail for MR:
>
> 	rdma_restrack_del(&mr->res);
> 	ret = mr->device->ops.dereg_mr(mr, udata);
> 	if (!ret) {
> 		atomic_dec(&pd->usecnt);
>
> Because the driver management of the object puts it in the wrong
> order.
>
> The above if is necessary because if we trigger this failure path
> without it, then the next attempt to free the MR will trigger a
> WARN_ON.
Not really. After the first call to rdma_restrack_del(), res->valid is
set to false, so any subsequent call to rdma_restrack_del() returns
early and at most drops a leftover task reference:
void rdma_restrack_del(struct rdma_restrack_entry *res)
{
	struct rdma_restrack_entry *old;
	struct rdma_restrack_root *rt;
	struct ib_device *dev;

	if (!res->valid) {
		if (res->task) {
			put_task_struct(res->task);
			res->task = NULL;
		}
		return;		<------- exit
	}
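To make the early-exit argument concrete, here is a minimal userspace sketch of the same guard. It is not the kernel code: struct model_entry, struct model_task, model_put_task(), model_drop_task() and model_restrack_del() are hypothetical stand-ins, the teardowns counter exists only to observe how often the full path runs, and dropping the task on the first call is a simplification of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task reference; not struct task_struct. */
struct model_task {
	int refs;
};

/* Hypothetical stand-in for struct rdma_restrack_entry. */
struct model_entry {
	bool valid;               /* true while the entry is tracked */
	struct model_task *task;  /* optional task reference */
	int teardowns;            /* counts full teardown passes */
};

static void model_put_task(struct model_task *task)
{
	task->refs--;
}

/* Drop the task reference, if any, exactly once. */
static void model_drop_task(struct model_entry *res)
{
	if (res->task) {
		model_put_task(res->task);
		res->task = NULL;
	}
}

static void model_restrack_del(struct model_entry *res)
{
	assert(res != NULL);

	if (!res->valid) {
		/* Already deleted: only clean up a leftover task reference. */
		model_drop_task(res);
		return;
	}

	/* First call: run the teardown and mark the entry as gone. */
	res->teardowns++;
	res->valid = false;
	model_drop_task(res); /* simplification: the model drops the task here */
}
```

Calling model_restrack_del() twice on the same entry runs the teardown path once; the second call hits the !valid branch and returns without touching the entry again.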
Thanks