From: Jason Gunthorpe <jgg@nvidia.com>
To: Edward Srouji <edwards@nvidia.com>
Cc: Leon Romanovsky <leon@kernel.org>,
Chiara Meiohas <cmeiohas@nvidia.com>,
Maor Gottlieb <maorg@mellanox.com>,
Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>,
Gal Pressman <galpress@amazon.com>,
Steve Wise <larrystevenwise@gmail.com>,
Mark Bloch <markb@mellanox.com>,
Mark Zhang <markzhang@nvidia.com>,
Neta Ostrovsky <netao@nvidia.com>,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
Patrisious Haddad <phaddad@nvidia.com>,
Michael Guralnik <michaelgur@nvidia.com>
Subject: Re: [PATCH rdma-next 0/6] RDMA: Fix restrack UAF in QP/CQ/SRQ destroy
Date: Thu, 11 Jun 2026 16:11:04 -0300 [thread overview]
Message-ID: <20260611191104.GA1501742@nvidia.com> (raw)
In-Reply-To: <20260607-restrack-uaf-fix-v1-0-d72e45eb76c2@nvidia.com>
On Sun, Jun 07, 2026 at 09:18:07PM +0300, Edward Srouji wrote:
> The resource-tracking (restrack) database is the back-end for the netlink
> "rdma resource show" interface which pins objects with
> rdma_restrack_get().
> The QP/CQ/SRQ destroy flows call rdma_restrack_del() at the end of
> ib_destroy_*_user(), after device->ops.destroy_*() had already freed the
> vendor object. Therefore, a concurrent netlink dump could look the
> object up and touch freed memory, causing a use-after-free via
> ib_query_qp() for instance.
>
> Fix this by splitting the delete into a begin/commit/abort sequence:
> begin_del() parks the entry as XA_ZERO_ENTRY (so lookups return NULL),
> drops the birth reference and waits for in-flight readers to drain,
> while keeping the index reserved. The destroy paths run begin_del()
> first, then commit_del() on success or abort_del() on error.
> abort_del() re-inserts into the reserved slot, so it needs no allocation
> and cannot fail.
>
> The first two patches remove DCT and raw RSS QP restrack tracking as
> they have never worked (their ID is unset/reserved at create time).
>
> Signed-off-by: Edward Srouji <edwards@nvidia.com>
> ---
> Patrisious Haddad (6):
> RDMA/mlx5: Remove DCT restrack tracking
> RDMA/mlx5: Remove raw RSS QP restrack tracking
> RDMA/core: Add rdma_restrack_begin/abort/commit_del() operations
> RDMA/core: Fix use after free in ib_query_qp()
> RDMA/core: Fix potential use after free in ib_destroy_cq_user()
> RDMA/core: Fix potential use after free in ib_destroy_srq_user()
The pre-existing sashiko issues look real too, can you fix them also:
https://sashiko.dev/#/patchset/20260607-restrack-uaf-fix-v1-0-d72e45eb76c2%40nvidia.com
The sashiko notes about XA_ZERO_ENTRY seems to be really obviously
wrong:
void *__xa_cmpxchg(struct xarray *xa, unsigned long index,
void *old, void *entry, gfp_t gfp)
{
return xa_zero_to_null(__xa_cmpxchg_raw(xa, index, old, entry, gfp));
}
EXPORT_SYMBOL(__xa_cmpxchg);
This looks legit:
For instance, in drivers/infiniband/core/cq.c:ib_free_cq():
ret = cq->device->ops.destroy_cq(cq, NULL);
WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail");
rdma_restrack_del(&cq->res);
and so on
Please send a series switching more/all places to commit/abort,
probably there should be very few/no calls to a naked del left.
This doesn't apply on top of the restrack_sync addition, please rebase
it.
You should probably be refactoring rdma_restrack_sync() and using its
parts in this implementation since it does the same things.
I don't think this should NULL the task on abort either, it doesn't
seem necessary.
Jason
next prev parent reply other threads:[~2026-06-11 19:11 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-07 18:18 [PATCH rdma-next 0/6] RDMA: Fix restrack UAF in QP/CQ/SRQ destroy Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 1/6] RDMA/mlx5: Remove DCT restrack tracking Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 2/6] RDMA/mlx5: Remove raw RSS QP " Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 3/6] RDMA/core: Add rdma_restrack_begin/abort/commit_del() operations Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 4/6] RDMA/core: Fix use after free in ib_query_qp() Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 5/6] RDMA/core: Fix potential use after free in ib_destroy_cq_user() Edward Srouji
2026-06-07 18:18 ` [PATCH rdma-next 6/6] RDMA/core: Fix potential use after free in ib_destroy_srq_user() Edward Srouji
2026-06-11 19:11 ` Jason Gunthorpe [this message]
2026-06-11 19:14 ` [PATCH rdma-next 0/6] RDMA: Fix restrack UAF in QP/CQ/SRQ destroy Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260611191104.GA1501742@nvidia.com \
--to=jgg@nvidia.com \
--cc=cmeiohas@nvidia.com \
--cc=dennis.dalessandro@cornelisnetworks.com \
--cc=edwards@nvidia.com \
--cc=galpress@amazon.com \
--cc=larrystevenwise@gmail.com \
--cc=leon@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=maorg@mellanox.com \
--cc=markb@mellanox.com \
--cc=markzhang@nvidia.com \
--cc=michaelgur@nvidia.com \
--cc=netao@nvidia.com \
--cc=phaddad@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox