From: Jason Gunthorpe <jgg@nvidia.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@broadcom.com>,
Bryan Tan <bryan-bt.tan@broadcom.com>,
Eric Dumazet <edumazet@google.com>,
Junxian Huang <huangjunxian6@hisilicon.com>,
Konstantin Taranov <kotaranov@microsoft.com>,
Jakub Kicinski <kuba@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
linux-hyperv@vger.kernel.org, linux-rdma@vger.kernel.org,
netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
Selvin Xavier <selvin.xavier@broadcom.com>,
Chengchang Tang <tangchengchang@huawei.com>,
Tariq Toukan <tariqt@nvidia.com>,
Vishnu Dasa <vishnu.dasa@broadcom.com>,
Yishai Hadas <yishaih@nvidia.com>
Cc: Abhijit Gangurde <abhijit.gangurde@amd.com>,
Adit Ranadive <aditr@vmware.com>,
Allen Hubbe <allen.hubbe@amd.com>,
Andrew Boyer <andrew.boyer@amd.com>,
Aditya Sarwade <asarwade@vmware.com>,
Brad Spengler <brad.spengler@opensrcsec.com>,
Bryan Tan <bryantan@vmware.com>,
"David S. Miller" <davem@davemloft.net>,
Dexuan Cui <decui@microsoft.com>,
Doug Ledford <dledford@redhat.com>,
George Zhang <georgezhang@vmware.com>,
Jorgen Hansen <jhansen@vmware.com>,
Jianbo Liu <jianbol@nvidia.com>,
Kai Aizen <kai.aizen.dev@gmail.com>,
Leon Romanovsky <leonro@mellanox.com>,
Leon Romanovsky <leonro@nvidia.com>,
Yixian Liu <liuyixian@huawei.com>, Long Li <longli@microsoft.com>,
Lijun Ou <oulijun@huawei.com>,
Parav Pandit <parav.pandit@emulex.com>,
patches@lists.linux.dev, Roland Dreier <roland@purestorage.com>,
Roland Dreier <rolandd@cisco.com>,
Sagi Grimberg <sagi@grimberg.me>,
Ajay Sharma <sharmaajay@microsoft.com>,
stable@vger.kernel.org, Tariq Toukan <tariqt@mellanox.com>,
"Wei Hu (Xavier)" <xavier.huwei@huawei.com>,
Shaobo Xu <xushaobo2@huawei.com>,
Nenglong Zhao <zhaonenglong@hisilicon.com>
Subject: [PATCH rc 12/15] RDMA/mlx4: Fix mis-use of RCU in mlx4_srq_event()
Date: Tue, 28 Apr 2026 13:17:45 -0300
Message-ID: <12-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>
In-Reply-To: <0-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>
Sashiko points out that although the radix_tree itself is RCU safe,
nothing ever frees the mlx4_srq struct with RCU, and the struct isn't even
accessed within the RCU critical section. It will also crash if an event
is delivered before the srq object has finished initializing.

Since it isn't easy to make RCU work here, use the spinlock instead. Use
refcount_inc_not_zero() to protect against partially initialized objects,
and order the refcount_set() after the srq is fully initialized.
Cc: stable@vger.kernel.org
Fixes: 30353bfc43a1 ("net/mlx4_core: Use RCU to perform radix tree lookup for SRQ")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
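Note, not part of the commit: the lookup and publish ordering described
in the commit message can be sketched in userspace C. This is a minimal
illustration under stated assumptions, not the driver code. A pthread
mutex stands in for srq_table->lock, a fixed array for the radix tree, and
a C11 atomic counter for refcount_t. Every name in it (struct obj,
obj_get(), obj_insert(), obj_publish(), ref_inc_not_zero(), sketch_demo())
is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

struct obj {
	atomic_uint refcount;	/* 0 means "not yet published" */
	int payload;		/* must be written before refcount != 0 */
};

/* Hypothetical stand-ins for srq_table->lock and srq_table->tree. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *table[TABLE_SIZE];

/* Take a reference only if the count is already non-zero. */
static bool ref_inc_not_zero(atomic_uint *ref)
{
	unsigned int old = atomic_load_explicit(ref, memory_order_relaxed);

	while (old != 0) {
		/* On failure, 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak_explicit(ref, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Lookup: the object is only dereferenced while the table lock is held. */
static struct obj *obj_get(unsigned int idx)
{
	struct obj *o;

	pthread_mutex_lock(&table_lock);
	o = table[idx % TABLE_SIZE];
	if (o && !ref_inc_not_zero(&o->refcount))
		o = NULL;	/* present in the table but not published */
	pthread_mutex_unlock(&table_lock);
	return o;
}

/* Make the object visible in the table while its refcount is still 0. */
static void obj_insert(struct obj *o, unsigned int idx)
{
	atomic_init(&o->refcount, 0);
	pthread_mutex_lock(&table_lock);
	table[idx % TABLE_SIZE] = o;
	pthread_mutex_unlock(&table_lock);
}

/* Finish initialization first, then publish with a release store. */
static void obj_publish(struct obj *o, int payload)
{
	o->payload = payload;
	atomic_store_explicit(&o->refcount, 1, memory_order_release);
}

/* Usage: a lookup fails until the object has been published. */
void sketch_demo(void)
{
	static struct obj o;

	obj_insert(&o, 3);
	assert(obj_get(3) == NULL);
	obj_publish(&o, 42);
	assert(obj_get(3) == &o);
}
```

The acquire/release pairing is a choice made for this sketch. The kernel
refcount helpers have their own documented ordering rules, which the patch
relies on through refcount_set_release().
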
 drivers/net/ethernet/mellanox/mlx4/srq.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/srq.c b/drivers/net/ethernet/mellanox/mlx4/srq.c
index dd890f5d7b725c..8711689120f302 100644
--- a/drivers/net/ethernet/mellanox/mlx4/srq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/srq.c
@@ -44,13 +44,14 @@ void mlx4_srq_event(struct mlx4_dev *dev, u32 srqn, int event_type)
 {
 	struct mlx4_srq_table *srq_table = &mlx4_priv(dev)->srq_table;
 	struct mlx4_srq *srq;
+	unsigned long flags;
 
-	rcu_read_lock();
+	spin_lock_irqsave(&srq_table->lock, flags);
 	srq = radix_tree_lookup(&srq_table->tree, srqn & (dev->caps.num_srqs - 1));
-	rcu_read_unlock();
-	if (srq)
-		refcount_inc(&srq->refcount);
-	else {
+	if (!srq || !refcount_inc_not_zero(&srq->refcount))
+		srq = NULL;
+	spin_unlock_irqrestore(&srq_table->lock, flags);
+	if (!srq) {
 		mlx4_warn(dev, "Async event for bogus SRQ %08x\n", srqn);
 		return;
 	}
@@ -203,8 +204,8 @@ int mlx4_srq_alloc(struct mlx4_dev *dev, u32 pdn, u32 cqn, u16 xrcd,
 	if (err)
 		goto err_radix;
 
-	refcount_set(&srq->refcount, 1);
 	init_completion(&srq->free);
+	refcount_set_release(&srq->refcount, 1);
 
 	return 0;
--
2.43.0