From: Jason Gunthorpe <jgg@nvidia.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Doug Ledford <dledford@redhat.com>,
	Artemy Kovalyov <artemyko@mellanox.com>,
	<linux-rdma@vger.kernel.org>
Subject: Re: [PATCH rdma-rc] RDMA/mlx5: Prevent prefetch from racing with implicit destruction
Date: Tue, 21 Jul 2020 13:59:13 -0300
Message-ID: <20200721165913.GA3171161@nvidia.com>
In-Reply-To: <20200719065435.130722-1-leon@kernel.org>

On Sun, Jul 19, 2020 at 09:54:35AM +0300, Leon Romanovsky wrote:
> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> Prefetch work in mlx5_ib_prefetch_mr_work() can be queued and run
> concurrently with destruction of the implicit MR. The num_deferred_work
> counter was intended to serialize this, but there is a race:
> 
>        CPU0                                          CPU1
> 
>     mlx5_ib_free_implicit_mr()
>       xa_erase(odp_mkeys)
>       synchronize_srcu()
>       __xa_erase(implicit_children)
>                                       mlx5_ib_prefetch_mr_work()
>                                         pagefault_mr()
>                                          pagefault_implicit_mr()
>                                           implicit_get_child_mr()
>                                            xa_cmpxchg()
>                                         atomic_dec_and_test(num_deferred_work)
>       wait_event(imr->q_deferred_work)
>       ib_umem_odp_release(odp_imr)
>         kfree(odp_imr)
> 
> At this point in mlx5_ib_free_implicit_mr() the implicit_children list is
> supposed to be empty forever so that destroy_unused_implicit_child_mr()
> and related are not and will not be running.
> 
> Since it is not empty the destroy_unused_implicit_child_mr() flow ends up
> touching deallocated memory as mlx5_ib_free_implicit_mr() already tore down the
> imr parent.
> 
> The solution is to flush out the prefetch wq by driving num_deferred_work
> to zero after creation of new prefetch work is blocked.
> 
> Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/odp.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
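
For anyone reading along in the archive, the ordering described above
(block new prefetch work, drive num_deferred_work to zero, and only then
touch implicit_children) can be sketched in userspace. This is not the
odp.c code: struct imr_sketch, work_get(), prefetch_work() and teardown()
are hypothetical names, a mutex and condition variable stand in for the
xarray lock, SRCU and the q_deferred_work wait queue, and a plain counter
stands in for the implicit_children xarray.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the implicit MR parent. */
struct imr_sketch {
	pthread_mutex_t lock;
	pthread_cond_t q_deferred_work;	/* stands in for the wait queue */
	int num_deferred_work;		/* queued or running deferred work */
	int children;			/* stands in for implicit_children */
	bool dying;			/* set once new work is blocked */
};

static void imr_init(struct imr_sketch *imr)
{
	pthread_mutex_init(&imr->lock, NULL);
	pthread_cond_init(&imr->q_deferred_work, NULL);
	imr->num_deferred_work = 0;
	imr->children = 0;
	imr->dying = false;
}

/* Queue side: take a reference only while the parent can still be found. */
static bool work_get(struct imr_sketch *imr)
{
	bool ok;

	pthread_mutex_lock(&imr->lock);
	ok = !imr->dying;
	if (ok)
		imr->num_deferred_work++;
	pthread_mutex_unlock(&imr->lock);
	return ok;
}

/* Worker: adds a child, then drops its reference and wakes the waiter. */
static void *prefetch_work(void *arg)
{
	struct imr_sketch *imr = arg;

	pthread_mutex_lock(&imr->lock);
	imr->children++;
	if (--imr->num_deferred_work == 0)
		pthread_cond_broadcast(&imr->q_deferred_work);
	pthread_mutex_unlock(&imr->lock);
	return NULL;
}

/* Teardown in the safe order; returns the number of children reclaimed. */
static int teardown(struct imr_sketch *imr)
{
	int reclaimed;

	pthread_mutex_lock(&imr->lock);
	imr->dying = true;			/* 1. block new work */
	while (imr->num_deferred_work > 0)	/* 2. flush work already queued */
		pthread_cond_wait(&imr->q_deferred_work, &imr->lock);
	reclaimed = imr->children;		/* 3. only now drain the children */
	imr->children = 0;
	pthread_mutex_unlock(&imr->lock);
	return reclaimed;
}
```

Draining the children before step 2, as in the trace quoted above, would
let a late worker add a child that teardown never sees. With the order
shown here, every child created by in-flight work exists by the time the
drain runs.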

Applied to for-rc, thanks

Jason
