public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Jason Gunthorpe <jgg-uk2M96/98Pc@public.gmane.org>
Cc: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	RDMA mailing list
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Mark Bloch <markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	Steve Wise
	<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Subject: Re: [PATCH rdma-next v6 3/8] RDMA/restrack: Add general infrastructure to track RDMA resources
Date: Thu, 25 Jan 2018 22:27:10 +0200	[thread overview]
Message-ID: <20180125202710.GS1393@mtr-leonro.local> (raw)
In-Reply-To: <20180125201023.GA9162-uk2M96/98Pc@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 3338 bytes --]

On Thu, Jan 25, 2018 at 01:10:23PM -0700, Jason Gunthorpe wrote:
> On Thu, Jan 25, 2018 at 09:44:36PM +0200, Leon Romanovsky wrote:
> > > > +	/*
> > > > +	 * @kref: Protect destroy of the resource
> > > > +	 */
> > > > +	struct kref		kref;
> > >
> > > Sticking a kref in a random structure that is itself not malloc'd is a
> > > big red flag:
> >
> > It is not "rand", but embedded part of ib_qp,ib_pd and ib_cq. It is
> > malloced as part of their creation.
>
> I mean the kref isn't in any way linked the lifetime of the malloc
> that contains it. So it isn't really a kref, it is just a refcount.

For now, yes, in the future no. IMHO it is the direction to manage RDMA
objects.

>
> > > @@ -1746,6 +1756,11 @@ struct ib_qp {
> > > +       struct rdma_restrack_root     res;
> > >
> > >
> > > Not seeing how this works at all.
> > >
> > > qp does this:
> > >
> > > +       rdma_restrack_del(&qp->res);
> > >         ret = qp->device->destroy_qp(qp);
> > >
> > > And for instance the mlx5 implementation of destroy_qp() does:
> > >
> > > int mlx5_ib_destroy_qp(struct ib_qp *qp)
> > > {
> > >         struct mlx5_ib_qp *mqp = to_mqp(qp);
> > > [..]
> > >
> > >         kfree(mqp);
> > >
> > > So we kfree the kref containing memory outside the kref's release
> > > function. Super big red flag.
> >
> > It is worth to look on the actual implementation, the
> > rdma_restrack_del() doesn't do a lot: removes from the list and marks
> > the resource as not valid.
>
> It doesn't even do that if the kref count is elevated.
>
> Which is the whole point, the rdma_restrack_del *doesn't* prevent the
> use-after-free on the read side.

I understand it and agree that it is the bug, which is very easy to fix.

>
> > > The best idea I've had to make this locking work is what I emailed you
> > > a few days ago..
> >
> > It is more or less the same idea, but needs to move
> > down_write(&dev->res.rwsem) to be part of rdma_restrack_del and not
> > release routine. It will cause to block the progress of release.
>
> That doesn't help - the kref is still totally non-functional as a kref:
>
>  CPU0                                CPU1
>
>                                  down_read()
>                                  rdma_restrack_get()
> 				 up_read()
> down_write()
> restrack_release()
> up_write()
> kfree(qp)
>                                  fill_res_qp_entry(qp)
> 				 // boom, use after free
>
> A kref is not a lock, and a kref that *doesn't* put kfree in the
> release function is probably not a kref but a refcount.
>
> What I sent you wasn't remotely like this, it had two nested locks,
> the outer mutex and a very special open coded inner read/write lock
> that doesn't deadlock when nested..

I still didn't like your approach, because it has three words which i don't want
to see here: "special open-coded variant". I believe that the way to go is to add
completion structure and convert _del to be something like that:

void rdma_restrack_del(struct rdma_restrack_entry *res)
{
	if (!rdma_restrack_put(res))
		/* it is pseudo-code */
		wait_for_completion(!res->valid);
	return;
}


>
> Jason
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2018-01-25 20:27 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-25 15:12 [PATCH rdma-next v6 0/8] RDMA resource tracking Leon Romanovsky
     [not found] ` <20180125151227.28202-1-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-01-25 15:12   ` [PATCH rdma-next v6 1/8] RDMA/core: Print caller name instead of function name Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 2/8] RDMA/core: Save kernel caller name in PD and CQ objects Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 3/8] RDMA/restrack: Add general infrastructure to track RDMA resources Leon Romanovsky
     [not found]     ` <20180125151227.28202-4-leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2018-01-25 16:39       ` Doug Ledford
     [not found]         ` <1516898399.27592.139.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2018-01-25 19:35           ` Leon Romanovsky
2018-01-25 17:45       ` Jason Gunthorpe
     [not found]         ` <20180125174529.GB3739-uk2M96/98Pc@public.gmane.org>
2018-01-25 19:44           ` Leon Romanovsky
     [not found]             ` <20180125194436.GR1393-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-25 20:10               ` Jason Gunthorpe
     [not found]                 ` <20180125201023.GA9162-uk2M96/98Pc@public.gmane.org>
2018-01-25 20:27                   ` Leon Romanovsky [this message]
     [not found]                     ` <20180125202710.GS1393-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-25 21:00                       ` Jason Gunthorpe
     [not found]                         ` <20180125210026.GA10719-uk2M96/98Pc@public.gmane.org>
2018-01-25 21:18                           ` Leon Romanovsky
     [not found]                             ` <20180125211831.GT1393-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-25 21:57                               ` Jason Gunthorpe
     [not found]                                 ` <20180125215716.GA11537-uk2M96/98Pc@public.gmane.org>
2018-01-26  6:24                                   ` Leon Romanovsky
     [not found]                                     ` <20180126062423.GV1393-U/DQcQFIOTAAJjI8aNfphQ@public.gmane.org>
2018-01-26 16:05                                       ` Jason Gunthorpe
2018-01-25 15:12   ` [PATCH rdma-next v6 4/8] RDMA/core: Add resource tracking for create and destroy QPs Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 5/8] RDMA/core: Add resource tracking for create and destroy CQs Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 6/8] RDMA/core: Add resource tracking for create and destroy PDs Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 7/8] RDMA/nldev: Provide global resource utilization Leon Romanovsky
2018-01-25 15:12   ` [PATCH rdma-next v6 8/8] RDMA/nldev: Provide detailed QP information Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180125202710.GS1393@mtr-leonro.local \
    --to=leon-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=jgg-uk2M96/98Pc@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=markb-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    --cc=swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox