From: Jason Gunthorpe <jgg@nvidia.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: lizhijian@fujitsu.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-next] RDMA/rxe: Guard mr state with spin lock
Date: Fri, 29 Jul 2022 14:43:59 -0300 [thread overview]
Message-ID: <YuQcX9JF1l2ILT1M@nvidia.com> (raw)
In-Reply-To: <04e87d78-29a2-735a-b984-d2321a8edc9d@gmail.com>
On Thu, Jul 28, 2022 at 12:54:34PM -0500, Bob Pearson wrote:
> On 7/28/22 11:42, Jason Gunthorpe wrote:
> > On Mon, Jul 25, 2022 at 03:01:15PM -0500, Bob Pearson wrote:
> >> Currently the rxe driver does not guard changes to the mr state
> >> against race conditions which can arise from races between
> >> local operations and remote invalidate operations. This patch
> >> adds a spinlock to the mr object and makes the state changes
> >> atomic.
> >
> > This doesn't make it atomic..
> >
> >> + state = smp_load_acquire(&mr->state);
> >> +
> >> if (unlikely((type == RXE_LOOKUP_LOCAL && mr->lkey != key) ||
> >> (type == RXE_LOOKUP_REMOTE && mr->rkey != key) ||
> >> mr_pd(mr) != pd || (access && !(access & mr->access)) ||
> >> - mr->state != RXE_MR_STATE_VALID)) {
> >> + state != RXE_MR_STATE_VALID)) {
> >> rxe_put(mr);
> >
> > This is still just differently racy
> >
> > The whole point of invalidate is to say that when the invalidate
> > completion occurs there is absolutely no touching of the memory that
> > MR points to.
> >
> > I don't see how this acheives this like this. You need a proper lock
> > spanning from the lookup here until all the "dma" is completed.
> >
> > Jason
>
> Interesting. Then things are in a bit of a mess. Before this patch of course there
> was nothing. And, rxe_resp.c currently looks up an mr from the rkey and saves it
> in the qp and then uses it for additional packets as required for e.g. rdma write
> operations. A local invalidate before a multipacket write finishes will have the wrong
> effect. It will continue to use the mr to perform the data copies. And the data copy
> routine does not validate the mr state. We would have to save the rkey instead and
> re-lookup the mr for each packet.
>
> For a single packet we complete the dma in a single tasklet call. We would have a choice
> of holding a spinlock (for a fairly long time) or marking the mr as busy and deferring a
> local invalidate. A remote invalidate would fall between the packets of an rdma op.
Some kind of "lock" that delays the invalidate completion until the MR
is finished being used is also fairly sensible.
In this case you'd probably convert the 'valid' into some kind of
refcount and when refcount_put reaches 0 then you'd complete any
pending invalidation WR.
So on the tasklet side it becomes two atomics - which is about the
same overhead as a spinlock.
Jason
next prev parent reply other threads:[~2022-07-29 17:44 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-25 20:01 [PATCH for-next] RDMA/rxe: Guard mr state with spin lock Bob Pearson
2022-07-28 16:42 ` Jason Gunthorpe
2022-07-28 17:54 ` Bob Pearson
2022-07-29 17:43 ` Jason Gunthorpe [this message]
2022-07-29 17:50 ` Bob Pearson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YuQcX9JF1l2ILT1M@nvidia.com \
--to=jgg@nvidia.com \
--cc=linux-rdma@vger.kernel.org \
--cc=lizhijian@fujitsu.com \
--cc=rpearsonhpe@gmail.com \
--cc=zyjzyj2000@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox