All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-next v11 10/13] RDMA/rxe: Stop lookup of partially built objects
Date: Wed, 16 Mar 2022 10:42:27 -0300	[thread overview]
Message-ID: <20220316134227.GD11336@nvidia.com> (raw)
In-Reply-To: <8aa69cff-2871-3015-fd2d-7279d8f3ddac@gmail.com>

On Tue, Mar 15, 2022 at 10:55:01PM -0500, Bob Pearson wrote:
> On 3/15/22 19:16, Jason Gunthorpe wrote:
> > On Thu, Mar 03, 2022 at 06:08:06PM -0600, Bob Pearson wrote:
> >> Currently the rdma_rxe driver has a security weakness due to giving
> >> objects which are partially initialized indices allowing external
> >> actors to gain access to them by sending packets which refer to
> >> their index (e.g. qpn, rkey, etc) causing unpredictable results.
> >>
> >> This patch adds two new APIs rxe_show(obj) and rxe_hide(obj) which
> >> enable or disable looking up pool objects from indices using
> >> rxe_pool_get_index(). By default objects are disabled. These APIs
> >> are used to enable looking up objects which have indices:
> >> AH, SRQ, QP, MR, and MW. They are added in create verbs after the
> >> objects are fully initialized and as soon as possible in destroy
> >> verbs.
> > 
> > In other parts of rdma we use the word 'finalize' where you used show
> > 
> > So rxe_pool_finalize() or something
> > 
> > I'm not clear on what hide is supposed to be for, if the object is
> > being destroyed why do we need a period when it is NULL in the xarray
> > before just erasing it?
> The problem I am trying to solve is that when a destroy verb is called I want
> to stop looking up rdma objects from their numbers (rkey, qpn, etc.) which
> happens when a new packet arrives that refers to the object. So we start the
> object destroy flow but we may hit a long delay if there are already
> references taken on the object. In the next patch we are going to add a complete
> wait_for_completion so that we won't return to rdma_core until all the refs
> are dropped. While we are waiting for some past event to complete and drop its
> reference new packets may arrive and take a reference on the object while looking it
> up. Potentially this could happen many times. I just want to stop accepting any
> new packets as soon as the destroy verb gets called. Meanwhile we will allow
> past packets to complete. That is what hide() does.

But why not just do xa_erase? Why try to preserve the index during
this time?

> Show() deals with the opposite problem. The way the object library worked as soon as
> the object was created or 'added to the pool' it becomes searchable. But some of the
> verbs have a lot of code to execute after the object gets its number assigned so by
> setting the link to NULL in the xa_alloc call we get the index assigned but the
> object is not searchable. show turns it on at the end for the create verb call after
> all the init code is done.

Show/finalize is a pretty common pattern when working with xarrays, it
is fine

> >> @@ -491,6 +497,7 @@ static int rxe_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
> >>  	struct rxe_qp *qp = to_rqp(ibqp);
> >>  	int ret;
> >>  
> >> +	rxe_hide(qp);
> >>  	ret = rxe_qp_chk_destroy(qp);
> >>  	if (ret)
> >>  		return ret;
> > 
> > So we decided not to destroy the QP but wrecked it in the xarray?
> > 
> > Not convinced about the hide at all..

> This particular example is pretty light weight since rxe_qp_chk_destroy only makes one
> memory reference. But dereg_mr actually can do a lot of work and in either case
> the idea as explained above is not to wreck the object but prevent rxe_pool_get_index(pool, index)
> from succeeding and taking a new ref on the object. Once the verbs client has called a destroy
> verb I don't see any reason why we should continue to process newly arriving packets forever which
> is the only place in the driver where we convert object numbers to objects.

I think once you commit to destroying the object then just take it out
of the xarray immediately and go on and do the other stuff.
 
> There is another issue which is not being dealt with here and that
> is partially torn down objects. After this patch the flow for a
> destroy verb for an indexed object is
> 
> hide() =	"disable new lookups, e.g. xa_store(NULL)"
> 		"misc tear down code"
> rxe_put() = 	"drop a reference, will be last one if the object is quiescent"
> 		"if necessary wait until last ref dropped"
> 		"object cleanup code, includes xa_erase()"
> 
> If objects are still holding references you have to be careful to
> make sure that nothing in misc tear down code matters for an
> outstanding reference. Currently this all works but if any change
> breaks things we have had to defer the change to the cleanup phase.
> It doesn't work by design but just debugging in the past.

I'd say this is not good, the normal pattern I would expect is to
remove it from the xarray and then drive the references to zero before
starting any destruction. Does the wait patch deal with it?

Jason

  reply	other threads:[~2022-03-16 13:42 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-04  0:07 [PATCH for-next v11 00/13] Fix race conditions in rxe_pool Bob Pearson
2022-03-04  0:07 ` [PATCH for-next v11 01/13] RDMA/rxe: Fix ref error in rxe_av.c Bob Pearson
2022-03-04  0:07 ` [PATCH for-next v11 02/13] RDMA/rxe: Replace mr by rkey in responder resources Bob Pearson
2022-03-04  0:07 ` [PATCH for-next v11 03/13] RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 04/13] RDMA/rxe: Delete _locked() APIs for pool objects Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 05/13] RDMA/rxe: Replace obj by elem in declaration Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 06/13] RDMA/rxe: Move max_elem into rxe_type_info Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 07/13] RDMA/rxe: Shorten pool names in rxe_pool.c Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 08/13] RDMA/rxe: Replace red-black trees by xarrays Bob Pearson
2022-03-15 23:45   ` Jason Gunthorpe
2022-03-16  3:05     ` Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 09/13] RDMA/rxe: Use standard names for ref counting Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 10/13] RDMA/rxe: Stop lookup of partially built objects Bob Pearson
2022-03-16  0:16   ` Jason Gunthorpe
2022-03-16  3:55     ` Bob Pearson
2022-03-16 13:42       ` Jason Gunthorpe [this message]
2022-03-04  0:08 ` [PATCH for-next v11 11/13] RDMA/rxe: Add wait_for_completion to pool objects Bob Pearson
2022-03-16  0:17   ` Jason Gunthorpe
2022-03-16  3:57     ` Bob Pearson
2022-03-16 13:43       ` Jason Gunthorpe
2022-03-04  0:08 ` [PATCH for-next v11 12/13] RDMA/rxe: Convert read side locking to rcu Bob Pearson
2022-03-16  0:18   ` Jason Gunthorpe
2022-03-16  4:05     ` Bob Pearson
2022-03-04  0:08 ` [PATCH for-next v11 13/13] RDMA/rxe: Cleanup rxe_pool.c Bob Pearson
2022-03-16  0:25 ` [PATCH for-next v11 00/13] Fix race conditions in rxe_pool Jason Gunthorpe
2022-03-16  4:05   ` Bob Pearson
2022-03-16 16:08     ` Jason Gunthorpe
2022-03-16 16:09       ` Pearson, Robert B

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220316134227.GD11336@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.