Linux RDMA and InfiniBand development
* [PATCH 00/10] Fix races around IB_MR_REREG_PD and mr->pd
@ 2026-06-04  1:27 Jason Gunthorpe
From: Jason Gunthorpe @ 2026-06-04  1:27 UTC
  To: Leon Romanovsky, linux-rdma
  Cc: Doug Ledford, Edward Srouji, Leon Romanovsky, Leon Romanovsky,
	Matan Barak, Michael Guralnik, Noa Osherovich, patches,
	Steve Wise

Sashiko pointed out an existing bug related to mr->pd: when IB_MR_REREG_PD
is used, mr->pd is changed while holding only the write side of the MR's
uobject lock.

Effectively, because IB_MR_REREG_PD is usually implemented by changing the
MR in-place, mr->pd cannot be safely read outside an MR-specific system
call that holds the uobject lock. All the readers in this series could
race with an IB_MR_REREG_PD and potentially UAF the mr->pd.

 https://sashiko.dev/#/patchset/20260427-security-bug-fixes-v3-0-4621fa52de0e%40nvidia.com?part=4

This was presented as a simple 'oh look it can race with nldev', which is
correct. However, asking AI to fully audit every use of mr->pd also
revealed a much more convoluted issue inside mlx5 ODP, which also uses
mr->pd from the page fault work queue, the advise MR work queue and the
advise MR system call without any locking.

It turns out this mlx5 problem is entirely unnecessary, since outside of
implicit MRs there are only three cases where the UMR actually flags the
PDN to be read by HW: umr_rereg_pas(), mlx5_ib_init_odp_mr() and
mlx5_ib_init_dmabuf_mr(). umr_rereg_pas() is particularly tricky because
it illegally updates mr->pd inside the driver. Reorganize the giant call
chain from mlx5r_umr_set_update_xlt_mkey_seg() upward so that the pdn is
passed down from those three functions instead of being unconditionally
picked out at the bottom.
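
As a rough illustration of that reorganization, here is a minimal
userspace sketch. It is not the code from the patches: every struct and
function name with a sketch_ prefix is hypothetical, and the real mlx5
signatures are not reproduced here. It only contrasts a leaf helper that
dereferences mr->pd itself with one that receives the pdn from a caller
that is in a position to read mr->pd safely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real ib_pd / mlx5_ib_mr / mkey segment. */
struct sketch_pd { uint32_t pdn; };
struct sketch_mr { struct sketch_pd *pd; };
struct sketch_mkey_seg { uint32_t pdn; int pdn_valid; };

/*
 * Old shape: the bottom of the call chain picks the pdn out of mr->pd
 * unconditionally, so every caller inherits an unlocked mr->pd read.
 */
void sketch_set_mkey_seg_old(struct sketch_mkey_seg *seg,
			     const struct sketch_mr *mr)
{
	seg->pdn = mr->pd->pdn;
	seg->pdn_valid = 1;
}

/*
 * New shape: the leaf only copies a value that was chosen further up,
 * and never touches mr->pd.
 */
void sketch_set_mkey_seg_new(struct sketch_mkey_seg *seg, uint32_t pdn)
{
	seg->pdn = pdn;
	seg->pdn_valid = 1;
}

/*
 * Assumed caller: one of the few entry points that really needs the HW
 * to read the PDN. In this sketch it is the only place that reads mr->pd,
 * and it is assumed to run where that read is safe.
 */
void sketch_init_odp_mr(struct sketch_mkey_seg *seg,
			const struct sketch_mr *mr)
{
	sketch_set_mkey_seg_new(seg, mr->pd->pdn);
}
```

With this shape, paths such as the page fault work queue can call the
leaf without ever looking at mr->pd, which is the property the series is
after.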

nldev, however, is trickier to fix. To avoid disturbing the happy paths,
build a synchronization barrier by removing the MR from the xarray and
then putting it right back. The kref completion acts as a positive signal
that mr->pd is no longer being used.
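
The barrier idea can be modelled in a few lines of single-threaded
userspace C. This is only a sketch under stated assumptions: the names
are hypothetical, one pointer stands in for an xarray entry, a plain
counter stands in for the kref, and a flag stands in for the completion
that a real writer would sleep on. Updating the pd between hiding and
re-publishing the entry is also an assumption of this sketch, not
something taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tracked resource; not the kernel's structures. */
struct sketch_res {
	int refs;	/* models the kref; the table itself holds one ref */
	bool done;	/* models the completion fired by the final put */
	int pd;		/* models mr->pd */
};

static struct sketch_res *sketch_slot;	/* models a single xarray entry */

/* Publish the entry; the table owns the initial reference. */
void sketch_add(struct sketch_res *res)
{
	res->refs = 1;
	res->done = false;
	sketch_slot = res;
}

/* Reader side: a lookup takes a reference, or fails if the entry is hidden. */
struct sketch_res *sketch_get(void)
{
	if (!sketch_slot)
		return NULL;
	sketch_slot->refs++;
	return sketch_slot;
}

/* Dropping the last reference signals the completion. */
void sketch_put(struct sketch_res *res)
{
	if (--res->refs == 0)
		res->done = true;
}

/* Writer, step 1: hide the entry from new readers and drop the table's ref. */
void sketch_barrier_begin(struct sketch_res *res)
{
	sketch_slot = NULL;
	sketch_put(res);
}

/*
 * Writer, step 2: once every earlier reader has dropped its reference,
 * change the pd and put the entry right back. Real code would block on
 * the completion; this model returns false instead of sleeping.
 */
bool sketch_barrier_finish(struct sketch_res *res, int new_pd)
{
	if (!res->done)
		return false;
	res->pd = new_pd;
	sketch_add(res);
	return true;
}
```

Readers on the happy path pay nothing extra here: they still only do a
lookup, a get and a put. The cost of waiting lands entirely on the writer
that changes the pd.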

Jason Gunthorpe (10):
  IB/mlx5: Don't take the rereg_mr fallback without a new translation
  RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs
  IB/mlx5: Properly support implicit ODP rereg_mr
  RDMA/nldev: Fix locking when accessing mr->pd
  IB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift()
  IB/mlx5: Pull the pdn out of the depths of the umr machinery
  IB/mlx5: Don't mangle the mr->pd inside the rereg callback
  IB/mlx5: Push pdn above mlx5r_umr_update_xlt()
  IB/mlx5: Push pdn above pagfault_real_mr()
  IB/mlx5: Push pdn above pagefault_dmabuf_mr()

 drivers/infiniband/core/nldev.c      | 15 +++--
 drivers/infiniband/core/restrack.c   | 49 +++++++++++++++
 drivers/infiniband/core/restrack.h   |  1 +
 drivers/infiniband/core/uverbs_cmd.c | 10 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 12 ++--
 drivers/infiniband/hw/mlx5/mr.c      | 37 ++++++++---
 drivers/infiniband/hw/mlx5/odp.c     | 82 +++++++++++++++----------
 drivers/infiniband/hw/mlx5/umr.c     | 92 +++++++++++++---------------
 drivers/infiniband/hw/mlx5/umr.h     | 11 ++--
 include/rdma/ib_verbs.h              |  5 ++
 10 files changed, 203 insertions(+), 111 deletions(-)


base-commit: e43ffb69e0438cddd72aaa30898b4dc446f664f8
-- 
2.43.0


Thread overview: 12+ messages
2026-06-04  1:27 [PATCH 00/10] Fix races around IB_MR_REREG_PD and mr->pd Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 01/10] IB/mlx5: Don't take the rereg_mr fallback without a new translation Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 02/10] RDMA/mlx5: Create ODP EQ for non-pinned dmabuf MRs Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 03/10] IB/mlx5: Properly support implicit ODP rereg_mr Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 04/10] RDMA/nldev: Fix locking when accessing mr->pd Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 05/10] IB/mlx5: Remove unused mkc bits in mlx5r_umr_update_mr_page_shift() Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 06/10] IB/mlx5: Pull the pdn out of the depths of the umr machinery Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 07/10] IB/mlx5: Don't mangle the mr->pd inside the rereg callback Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 08/10] IB/mlx5: Push pdn above mlx5r_umr_update_xlt() Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 09/10] IB/mlx5: Push pdn above pagfault_real_mr() Jason Gunthorpe
2026-06-04  1:27 ` [PATCH 10/10] IB/mlx5: Push pdn above pagefault_dmabuf_mr() Jason Gunthorpe
2026-06-08 17:52 ` [PATCH 00/10] Fix races around IB_MR_REREG_PD and mr->pd Jason Gunthorpe
