From: Jason Gunthorpe <jgg@nvidia.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@broadcom.com>,
Bryan Tan <bryan-bt.tan@broadcom.com>,
Eric Dumazet <edumazet@google.com>,
Junxian Huang <huangjunxian6@hisilicon.com>,
Konstantin Taranov <kotaranov@microsoft.com>,
Jakub Kicinski <kuba@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
linux-hyperv@vger.kernel.org, linux-rdma@vger.kernel.org,
netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
Selvin Xavier <selvin.xavier@broadcom.com>,
Chengchang Tang <tangchengchang@huawei.com>,
Tariq Toukan <tariqt@nvidia.com>,
Vishnu Dasa <vishnu.dasa@broadcom.com>,
Yishai Hadas <yishaih@nvidia.com>
Cc: Abhijit Gangurde <abhijit.gangurde@amd.com>,
Adit Ranadive <aditr@vmware.com>,
Allen Hubbe <allen.hubbe@amd.com>,
Andrew Boyer <andrew.boyer@amd.com>,
Aditya Sarwade <asarwade@vmware.com>,
Brad Spengler <brad.spengler@opensrcsec.com>,
Bryan Tan <bryantan@vmware.com>,
"David S. Miller" <davem@davemloft.net>,
Dexuan Cui <decui@microsoft.com>,
Doug Ledford <dledford@redhat.com>,
George Zhang <georgezhang@vmware.com>,
Jorgen Hansen <jhansen@vmware.com>,
Jianbo Liu <jianbol@nvidia.com>,
Kai Aizen <kai.aizen.dev@gmail.com>,
Leon Romanovsky <leonro@mellanox.com>,
Leon Romanovsky <leonro@nvidia.com>,
Yixian Liu <liuyixian@huawei.com>, Long Li <longli@microsoft.com>,
Lijun Ou <oulijun@huawei.com>,
Parav Pandit <parav.pandit@emulex.com>,
patches@lists.linux.dev, Roland Dreier <roland@purestorage.com>,
Roland Dreier <rolandd@cisco.com>,
Sagi Grimberg <sagi@grimberg.me>,
Ajay Sharma <sharmaajay@microsoft.com>,
stable@vger.kernel.org, Tariq Toukan <tariqt@mellanox.com>,
"Wei Hu (Xavier)" <xavier.huwei@huawei.com>,
Shaobo Xu <xushaobo2@huawei.com>,
Nenglong Zhao <zhaonenglong@hisilicon.com>
Subject: [PATCH rc 13/15] RDMA/hns: Fix xarray race in hns_roce_create_srq()
Date: Tue, 28 Apr 2026 13:17:46 -0300
Message-ID: <13-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>
In-Reply-To: <0-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>

Sashiko points out that once the srq memory is stored into the xarray by
alloc_srqc() it can immediately be looked up by the event handler:

    xa_lock(&srq_table->xa);
    srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1));
    if (srq)
        refcount_inc(&srq->refcount);
    xa_unlock(&srq_table->xa);

This trips the refcount debug check because the refcount is still 0, and
the handler then crashes at:

    srq->event(srq, event_type);

because srq->event is still NULL.

Use refcount_inc_not_zero() instead to ensure a partially prepared srq is
never retrieved from the event handler, and fix the ordering of the
initialization so the refcount becomes 1 only after the srq is fully ready.

All of the initialization must be done before calling free_srqc(), since
free_srqc() depends on the completion and the refcount.

Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Link: https://sashiko.dev/#/patchset/0-v1-e911b76a94d1%2B65d95-rdma_udata_rep_jgg%40nvidia.com?part=3
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/hns/hns_roce_srq.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c
index cb848e8e6bbd76..8b94cbdfa54dfa 100644
--- a/drivers/infiniband/hw/hns/hns_roce_srq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
@@ -16,8 +16,8 @@ void hns_roce_srq_event(struct hns_roce_dev *hr_dev, u32 srqn, int event_type)
 
 	xa_lock(&srq_table->xa);
 	srq = xa_load(&srq_table->xa, srqn & (hr_dev->caps.num_srqs - 1));
-	if (srq)
-		refcount_inc(&srq->refcount);
+	if (srq && !refcount_inc_not_zero(&srq->refcount))
+		srq = NULL;
 	xa_unlock(&srq_table->xa);
 
 	if (!srq) {
@@ -470,6 +470,10 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
 	if (ret)
 		goto err_srqn;
 
+	srq->event = hns_roce_ib_srq_event;
+	init_completion(&srq->free);
+	refcount_set_release(&srq->refcount, 1);
+
 	if (udata) {
 		resp.cap_flags = srq->cap_flags;
 		resp.srqn = srq->srqn;
@@ -480,10 +484,6 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
 		}
 	}
 
-	srq->event = hns_roce_ib_srq_event;
-	refcount_set(&srq->refcount, 1);
-	init_completion(&srq->free);
-
 	return 0;
 
 err_srqc:
--
2.43.0
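
For readers less familiar with the pattern, the ordering described in the commit message can be sketched in userspace C11. This is only an illustration under stated assumptions, not the driver code: `struct demo_obj` and every `demo_*` function are hypothetical names, C11 atomics stand in for the kernel's `refcount_t` helpers, and the xarray and its lock are left out entirely. The point it tries to show is that a lookup which refuses to take a reference on a zero count cannot observe a half-initialized object, provided the count is set to 1 last and with release ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's srq object (not the real struct). */
struct demo_obj {
	atomic_uint refcount;	/* 0 means "not ready yet" or "being freed" */
	void (*event)(struct demo_obj *obj, int event_type);
	int last_event;
};

/*
 * Userspace analogue of refcount_inc_not_zero(): take a reference only
 * if the count is currently non-zero.
 */
static bool demo_ref_get_not_zero(struct demo_obj *obj)
{
	unsigned int old = atomic_load_explicit(&obj->refcount,
						memory_order_relaxed);

	while (old != 0) {
		/* On failure the CAS reloads 'old', so the loop re-checks it. */
		if (atomic_compare_exchange_weak_explicit(&obj->refcount,
							  &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Hypothetical event callback: just records what it was given. */
static void demo_record_event(struct demo_obj *obj, int event_type)
{
	obj->last_event = event_type;
}

/*
 * Creation side: finish every field first, then publish a count of 1
 * with release ordering so a reader that sees 1 also sees 'event'.
 */
static void demo_obj_make_ready(struct demo_obj *obj)
{
	obj->event = demo_record_event;
	obj->last_event = -1;
	atomic_store_explicit(&obj->refcount, 1, memory_order_release);
}

/* Event-handler side: returns true only if the event was delivered. */
static bool demo_dispatch_event(struct demo_obj *obj, int event_type)
{
	if (!obj || !demo_ref_get_not_zero(obj))
		return false;	/* treated the same as "not found" */

	assert(obj->event != NULL);
	obj->event(obj, event_type);
	atomic_fetch_sub_explicit(&obj->refcount, 1, memory_order_release);
	return true;
}
```

Calling `demo_dispatch_event()` on an object whose count is still 0 returns false without touching `event`; after `demo_obj_make_ready()` the same call delivers the event and drops the count back to 1.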

Thread overview (22+ messages):
2026-04-28 16:17 [PATCH rc 00/15] Various bug fixes for RDMA drivers in the uapi functions Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 01/15] RDMA/ionic: Fix typo in format string Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 02/15] RDMA/mlx5: Restore zero-init to mlx5_ib_modify_qp() ucmd Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 03/15] RDMA/mlx5: Add missing store/release for lock elision pattern Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 04/15] RDMA/mana: Validate rx_hash_key_len Jason Gunthorpe
2026-04-28 17:50 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 05/15] RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss() Jason Gunthorpe
2026-04-28 17:43 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 06/15] RDMA/mana: Fix mana_destroy_wq_obj() cleanup " Jason Gunthorpe
2026-04-28 17:55 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 07/15] RDMA/mana: Fix error unwind " Jason Gunthorpe
2026-04-28 17:53 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 08/15] RDMA/ocrdma: Clarify the mm_head searching Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 09/15] RDMA/ocrdma: Don't NULL deref uctx on errors in ocrdma_copy_pd_uresp() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 10/15] RDMA/vmw_pvrdma: Fix double free on pvrdma_alloc_ucontext() error path Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 11/15] RDMA/mlx4: Fix resource leak on error in mlx4_ib_create_srq() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 12/15] RDMA/mlx4: Fix mis-use of RCU in mlx4_srq_event() Jason Gunthorpe
2026-04-28 16:17 ` Jason Gunthorpe [this message]
2026-04-28 16:17 ` [PATCH rc 14/15] RDMA/hns: Fix xarray race in hns_roce_create_qp_common() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 15/15] RDMA/hns: Fix unlocked call to hns_roce_qp_remove() Jason Gunthorpe
2026-04-29 7:55 ` [PATCH rc 00/15] Various bug fixes for RDMA drivers in the uapi functions Junxian Huang
2026-05-02 18:39 ` Jason Gunthorpe