From: Bob Pearson <rpearsonhpe@gmail.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-next 0/6] RDMA/rxe: Fix potential races
Date: Tue, 19 Oct 2021 11:35:30 -0500
Message-ID: <ccdf6ffa-dc14-7b50-7a17-c0d01d9305bf@gmail.com>
In-Reply-To: <YW7DGrG04eJwbf7d@unreal>

On 10/19/21 8:07 AM, Leon Romanovsky wrote:
> On Tue, Oct 12, 2021 at 03:19:46PM -0500, Bob Pearson wrote:
>> On 10/12/21 1:34 AM, Leon Romanovsky wrote:
>>> On Sun, Oct 10, 2021 at 06:59:25PM -0500, Bob Pearson wrote:
>>>> There are possible race conditions related to attempting to access
>>>> rxe pool objects at the same time as the pools or elements are being
>>>> freed. This series of patches addresses these races.
>>>
>>> Can we get rid of this pool?
>>>
>>> Thanks
>>>
>>>>
>>>> Bob Pearson (6):
>>>>   RDMA/rxe: Make rxe_alloc() take pool lock
>>>>   RDMA/rxe: Copy setup parameters into rxe_pool
>>>>   RDMA/rxe: Save object pointer in pool element
>>>>   RDMA/rxe: Combine rxe_add_index with rxe_alloc
>>>>   RDMA/rxe: Combine rxe_add_key with rxe_alloc
>>>>   RDMA/rxe: Fix potential race condition in rxe_pool
>>>>
>>>>  drivers/infiniband/sw/rxe/rxe_mcast.c |   5 +-
>>>>  drivers/infiniband/sw/rxe/rxe_mr.c    |   1 -
>>>>  drivers/infiniband/sw/rxe/rxe_mw.c    |   5 +-
>>>>  drivers/infiniband/sw/rxe/rxe_pool.c  | 235 +++++++++++++-------------
>>>>  drivers/infiniband/sw/rxe/rxe_pool.h  |  67 +++-----
>>>>  drivers/infiniband/sw/rxe/rxe_verbs.c |  10 --
>>>>  6 files changed, 140 insertions(+), 183 deletions(-)
>>>>
>>>> -- 
>>>> 2.30.2
>>>>
>>
>> Not sure which 'this' you mean? This set of patches is motivated by someone at HPE
>> running into very infrequent seg faults caused by rdma packets trying to copy data
>> to or from an MR. Other than a plain bug, which doesn't seem to be the case here, this
>> can only happen when a late packet arrives after the MR is de-registered. The root
>> cause is the way rxe currently defers cleaning up objects with krefs, and potential
>> races between cleanup and new packets looking up rkeys. I found a lot of potential
>> race conditions and tried to close them off. There are another couple of patches
>> coming as well.
> 
> I have no doubt that this series fixes RXE, but my request was more general.
> Is there a way/path to remove everything declared in rxe_pool.c|h?
> 
> Thanks
> 

Take a look at the note I copied you on more recently. There is some progress, but not
complete elimination of rxe_pool. Another project, suggested by Jason, is replacing the
red-black trees with xarrays as an alternative approach to indexing rdma objects.
This would still duplicate the indexing done by rdma-core. A while back I looked at
reusing the rdma-core indexing, but no effort was made to make that easy: all of the
APIs are private to rdma-core. These indices are managed by the rxe driver for use as
lkeys/rkeys, qpns, srqns, and, more recently, address handles. xarrays seem to be
more efficient when the indices are fairly compact. However, there is a suggestion that
IB and RoCE should attempt to make indices that are visible on the network more sparse.
Nothing will make them secure, but they could be a lot more secure than they are
currently. I believe mlx5 is now using random keys for this reason.
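
To make the lookup race from the quoted text above concrete (a late packet looking
up an rkey while the MR is being cleaned up), here is a minimal user-space sketch.
It is not rxe code and not what the series does: every name in it (struct obj,
table_lookup, obj_get_unless_zero, and so on) is hypothetical, the index is a toy
fixed-size array rather than a red-black tree or xarray, and it assumes the caller
already serializes lookup against removal (the kernel would need a lock or RCU for
that). It only shows the idea that a lookup must refuse an object whose reference
count has already dropped to zero.

```c
/* User-space model only; all names are hypothetical, this is not rxe code. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16

struct obj {
	atomic_int refcnt;	/* 0 means the object is being torn down */
	uint32_t key;		/* stands in for an rkey/lkey/qpn */
};

/* Toy index: one slot per (key % TABLE_SIZE), no collision handling. */
static struct obj *table[TABLE_SIZE];

/* Take a reference only if the count is still non-zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcnt, 1) == 1;
}

/* Publish the object in the index with one initial reference. */
static void table_insert(struct obj *o, uint32_t key)
{
	o->key = key;
	atomic_init(&o->refcnt, 1);
	table[key % TABLE_SIZE] = o;
}

/* Unpublish the key so later lookups cannot find the object. */
static void table_remove(uint32_t key)
{
	table[key % TABLE_SIZE] = NULL;
}

/* Return a referenced object, or NULL if absent or already dying. */
static struct obj *table_lookup(uint32_t key)
{
	struct obj *o = table[key % TABLE_SIZE];

	if (!o || o->key != key)
		return NULL;
	return obj_get_unless_zero(o) ? o : NULL;
}
```

In this sketch a packet handler would call table_lookup(), use the object, and
then call obj_put(). Once the owner has dropped the last reference, a late lookup
gets NULL instead of a pointer to an object that is about to be freed, even if the
key has not yet been removed from the index.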

Bob 
