From: Jason Gunthorpe <jgg@nvidia.com>
To: Leon Romanovsky <leon@kernel.org>, linux-rdma@vger.kernel.org
Cc: Doug Ledford <dledford@redhat.com>,
Edward Srouji <edwards@nvidia.com>,
Leon Romanovsky <leonro@mellanox.com>,
Leon Romanovsky <leonro@nvidia.com>,
Matan Barak <matanb@mellanox.com>,
Michael Guralnik <michaelgur@nvidia.com>,
Noa Osherovich <noaos@mellanox.com>,
patches@lists.linux.dev, Steve Wise <swise@opengridcomputing.com>
Subject: Re: [PATCH 00/10] Fix races around IB_MR_REREG_PD and mr->pd
Date: Mon, 8 Jun 2026 14:52:49 -0300
Message-ID: <20260608175249.GA66752@nvidia.com>
In-Reply-To: <0-v1-29ebd2c229b5+fd5-ib_mr_pd_jgg@nvidia.com>
On Wed, Jun 03, 2026 at 10:27:39PM -0300, Jason Gunthorpe wrote:
> Sashiko pointed out an existing bug related to mr->pd: when IB_MR_REREG_PD
> is used the mr->pd is changed while only holding the write side of the
> MR's uobject lock.
>
> Effectively, because IB_MR_REREG_PD is usually implemented by changing the
> MR in-place, the mr->pd becomes unreadable outside an MR-specific system
> call that holds the uobject lock. All the readers in this series could
> race with an IB_MR_REREG_PD and potentially UAF the mr->pd.
>
> https://sashiko.dev/#/patchset/20260427-security-bug-fixes-v3-0-4621fa52de0e%40nvidia.com?part=4
>
> This was presented as a simple 'oh look it can race with nldev' which is
> correct. However, asking AI to fully audit mr->pd touches also revealed a
> much more convoluted issue inside mlx5 ODP that is also using mr->pd from
> the page fault work queue, advise mr work queue and advise mr system call
> without any locking.
>
> It turns out this mlx5 problem is entirely unnecessary since outside
> implicit mr there are only three cases where the UMR actually flags the
> PDN to be read by HW, umr_rereg_pas(), mlx5_ib_init_odp_mr() and
> mlx5_ib_init_dmabuf_mr(). umr_rereg_pas() is particularly tricky because
> it illegally updates mr->pd inside the driver. Reorganize the giant call
> chain from mlx5r_umr_set_update_xlt_mkey_seg() upward so that pdn is
> passed down from those three functions instead of unconditionally picked
> out at the bottom.
>
> nldev however is trickier to fix. To avoid disturbing the happy paths, build
> a synchronize barrier by removing the mr from the xarray and then putting
> it right back. The kref completion acts as a positive signal that the
> mr->pd is no longer being used.
>
> Jason Gunthorpe (10):
> IB/mlx5: Don't take the rereg_mr fallback without a new translation
> RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs
> IB/mlx5: Properly support implicit ODP rereg_mr
> RDMA/nldev: Fix locking when accessing mr->pd
> IB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift()
> IB/mlx5: Pull the pdn out of the depths of the umr machinery
> IB/mlx5: Don't mangle the mr->pd inside the rereg callback
> IB/mlx5: Push pdn above mlx5r_umr_update_xlt()
> IB/mlx5: Push pdn above pagefault_real_mr()
> IB/mlx5: Push pdn above pagefault_dmabuf_mr()
Applied to for-next
Thanks,
Jason
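
For anyone following the nldev part of the cover letter, here is a minimal
userspace sketch of the "remove, wait for the refcount to drain, put back"
idea. It is not the code from the series. Every name in it (demo_mr,
demo_registry, demo_barrier_begin() and so on) is hypothetical, a one-slot
registry stands in for the xarray, a plain int stands in for the kref, and a
bool stands in for the completion. It is single-threaded, so "waiting" is
modelled as polling the done flag. Changing the pd before republishing is
also an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's types. */
struct demo_pd {
	int pdn;
};

struct demo_mr {
	struct demo_pd *pd;	/* only stable while a reference is held */
	int refs;		/* stands in for a kref */
	bool done;		/* stands in for a completion; set when refs hits 0 */
};

/* One-slot registry standing in for the xarray that lookups go through. */
struct demo_registry {
	struct demo_mr *slot;
};

static void demo_mr_init(struct demo_registry *reg, struct demo_mr *mr,
			 struct demo_pd *pd)
{
	mr->pd = pd;
	mr->refs = 1;		/* reference owned by the registry */
	mr->done = false;
	reg->slot = mr;
}

/* Reader side: a reference can only be taken while the MR is published. */
static struct demo_mr *demo_mr_get(struct demo_registry *reg)
{
	struct demo_mr *mr = reg->slot;

	if (!mr)
		return NULL;
	mr->refs++;
	return mr;
}

/* Dropping the last reference signals the "completion". */
static void demo_mr_put(struct demo_mr *mr)
{
	assert(mr->refs > 0);
	if (--mr->refs == 0)
		mr->done = true;
}

/*
 * Writer step 1: unpublish the MR and drop the registry's reference.
 * Returns true once no reader can still be looking at mr->pd.
 */
static bool demo_barrier_begin(struct demo_registry *reg, struct demo_mr *mr)
{
	reg->slot = NULL;
	demo_mr_put(mr);
	return mr->done;
}

/* Writer step 2: only valid after done; swap the pd and republish. */
static void demo_barrier_finish(struct demo_registry *reg, struct demo_mr *mr,
				struct demo_pd *new_pd)
{
	assert(mr->done);
	mr->pd = new_pd;
	mr->refs = 1;
	mr->done = false;
	reg->slot = mr;
}
```

The point of the shape is that readers never take a lock on the happy path:
they either find the MR and pin it, or find nothing. The writer pays the cost
by unpublishing and waiting for the pinned readers to finish before touching
the pd.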
Thread overview: 12+ messages
2026-06-04 1:27 [PATCH 00/10] Fix races around IB_MR_REREG_PD and mr->pd Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 01/10] IB/mlx5: Don't take the rereg_mr fallback without a new translation Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 02/10] RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 03/10] IB/mlx5: Properly support implicit ODP rereg_mr Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 04/10] RDMA/nldev: Fix locking when accessing mr->pd Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 05/10] IB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift() Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 06/10] IB/mlx5: Pull the pdn out of the depths of the umr machinery Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 07/10] IB/mlx5: Don't mangle the mr->pd inside the rereg callback Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 08/10] IB/mlx5: Push pdn above mlx5r_umr_update_xlt() Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 09/10] IB/mlx5: Push pdn above pagefault_real_mr() Jason Gunthorpe
2026-06-04 1:27 ` [PATCH 10/10] IB/mlx5: Push pdn above pagefault_dmabuf_mr() Jason Gunthorpe
2026-06-08 17:52 ` Jason Gunthorpe [this message]