From: Jason Gunthorpe <jgg@nvidia.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
Broadcom internal kernel review list
<bcm-kernel-feedback-list@broadcom.com>,
Bryan Tan <bryan-bt.tan@broadcom.com>,
Eric Dumazet <edumazet@google.com>,
Junxian Huang <huangjunxian6@hisilicon.com>,
Konstantin Taranov <kotaranov@microsoft.com>,
Jakub Kicinski <kuba@kernel.org>,
Leon Romanovsky <leon@kernel.org>,
linux-hyperv@vger.kernel.org, linux-rdma@vger.kernel.org,
netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
Selvin Xavier <selvin.xavier@broadcom.com>,
Chengchang Tang <tangchengchang@huawei.com>,
Tariq Toukan <tariqt@nvidia.com>,
Vishnu Dasa <vishnu.dasa@broadcom.com>,
Yishai Hadas <yishaih@nvidia.com>
Cc: Abhijit Gangurde <abhijit.gangurde@amd.com>,
Adit Ranadive <aditr@vmware.com>,
Allen Hubbe <allen.hubbe@amd.com>,
Andrew Boyer <andrew.boyer@amd.com>,
Aditya Sarwade <asarwade@vmware.com>,
Brad Spengler <brad.spengler@opensrcsec.com>,
Bryan Tan <bryantan@vmware.com>,
"David S. Miller" <davem@davemloft.net>,
Dexuan Cui <decui@microsoft.com>,
Doug Ledford <dledford@redhat.com>,
George Zhang <georgezhang@vmware.com>,
Jorgen Hansen <jhansen@vmware.com>,
Jianbo Liu <jianbol@nvidia.com>,
Kai Aizen <kai.aizen.dev@gmail.com>,
Leon Romanovsky <leonro@mellanox.com>,
Leon Romanovsky <leonro@nvidia.com>,
Yixian Liu <liuyixian@huawei.com>, Long Li <longli@microsoft.com>,
Lijun Ou <oulijun@huawei.com>,
Parav Pandit <parav.pandit@emulex.com>,
patches@lists.linux.dev, Roland Dreier <roland@purestorage.com>,
Roland Dreier <rolandd@cisco.com>,
Sagi Grimberg <sagi@grimberg.me>,
Ajay Sharma <sharmaajay@microsoft.com>,
stable@vger.kernel.org, Tariq Toukan <tariqt@mellanox.com>,
"Wei Hu (Xavier)" <xavier.huwei@huawei.com>,
Shaobo Xu <xushaobo2@huawei.com>,
Nenglong Zhao <zhaonenglong@hisilicon.com>
Subject: [PATCH rc 03/15] RDMA/mlx5: Add missing store/release for lock elision pattern
Date: Tue, 28 Apr 2026 13:17:36 -0300
Message-ID: <3-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>
In-Reply-To: <0-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com>
mlx5 has a common pattern for device-global singleton resources: it
checks the resource pointer for !NULL and, if it is already set, skips
obtaining the lock.
This is not ordered properly. When the lock is not taken, observing !NULL
doesn't mean that all the data under that pointer is also visible on this
CPU.
Use a release/acquire pairing to explicitly manage this.
Pointed out by sashiko; Codex found more cases.
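For illustration only, here is a minimal userspace sketch of the same
pattern using C11 atomics in place of the kernel's smp_load_acquire() and
smp_store_release(). It is not the mlx5 code: struct demo_dev, struct
demo_res and demo_res_init() are hypothetical names, and a pthread mutex
stands in for the kernel mutex.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical singleton payload; stands in for the CQ/SRQ/QP objects. */
struct demo_res {
	int a;
	int b;
};

/* Hypothetical device holding the lazily created singleton. */
struct demo_dev {
	pthread_mutex_t lock;
	_Atomic(struct demo_res *) res;
};

/* Create the singleton on first use; later callers skip the lock. */
static int demo_res_init(struct demo_dev *dev)
{
	struct demo_res *res;
	int ret = 0;

	/*
	 * Lockless fast path. The acquire load pairs with the release
	 * store below, so a caller that sees a non-NULL pointer also
	 * sees the fully initialized object behind it.
	 */
	if (atomic_load_explicit(&dev->res, memory_order_acquire))
		return 0;

	pthread_mutex_lock(&dev->lock);

	/* Recheck under the lock; the mutex orders this load. */
	if (atomic_load_explicit(&dev->res, memory_order_relaxed))
		goto unlock;

	res = malloc(sizeof(*res));
	if (!res) {
		ret = -1;
		goto unlock;
	}
	res->a = 1;
	res->b = 2;

	/* Publish last: all stores above are ordered before this one. */
	atomic_store_explicit(&dev->res, res, memory_order_release);
unlock:
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

With a plain load and store instead, a second CPU could observe the
non-NULL pointer before the stores to the object's fields, which is the
problem described above.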
Fixes: 5895e70f2e6e ("IB/mlx5: Allocate resources just before first QP/SRQ is created")
Fixes: 638420115cc4 ("IB/mlx5: Create UMR QP just before first reg_mr occurs")
Link: https://sashiko.dev/#/patchset/SYBPR01MB7881E1E0970268BD69C0BA75AF2B2%40SYBPR01MB7881.ausprd01.prod.outlook.com
Assisted-by: Codex:GPT-5.5
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/hw/mlx5/main.c | 8 ++++----
drivers/infiniband/hw/mlx5/umr.c | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 109661c2ac12b0..73fab8a376933d 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3310,7 +3310,7 @@ int mlx5_ib_dev_res_cq_init(struct mlx5_ib_dev *dev)
* devr->c0 is set once, never changed until device unload.
* Avoid taking the mutex if initialization is already done.
*/
- if (devr->c0)
+ if (smp_load_acquire(&devr->c0))
return 0;
mutex_lock(&devr->cq_lock);
@@ -3336,7 +3336,7 @@ int mlx5_ib_dev_res_cq_init(struct mlx5_ib_dev *dev)
}
devr->p0 = pd;
- devr->c0 = cq;
+ smp_store_release(&devr->c0, cq);
unlock:
mutex_unlock(&devr->cq_lock);
@@ -3354,7 +3354,7 @@ int mlx5_ib_dev_res_srq_init(struct mlx5_ib_dev *dev)
* devr->s1 is set once, never changed until device unload.
* Avoid taking the mutex if initialization is already done.
*/
- if (devr->s1)
+ if (smp_load_acquire(&devr->s1))
return 0;
mutex_lock(&devr->srq_lock);
@@ -3395,7 +3395,7 @@ int mlx5_ib_dev_res_srq_init(struct mlx5_ib_dev *dev)
}
devr->s0 = s0;
- devr->s1 = s1;
+ smp_store_release(&devr->s1, s1);
unlock:
mutex_unlock(&devr->srq_lock);
diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c
index 29488fba21a034..f2139474be3751 100644
--- a/drivers/infiniband/hw/mlx5/umr.c
+++ b/drivers/infiniband/hw/mlx5/umr.c
@@ -147,7 +147,7 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev)
* UMR qp is set once, never changed until device unload.
* Avoid taking the mutex if initialization is already done.
*/
- if (dev->umrc.qp)
+ if (smp_load_acquire(&dev->umrc.qp))
return 0;
mutex_lock(&dev->umrc.init_lock);
@@ -185,7 +185,7 @@ int mlx5r_umr_resource_init(struct mlx5_ib_dev *dev)
sema_init(&dev->umrc.sem, MAX_UMR_WR);
mutex_init(&dev->umrc.lock);
dev->umrc.state = MLX5_UMR_STATE_ACTIVE;
- dev->umrc.qp = qp;
+ smp_store_release(&dev->umrc.qp, qp);
mutex_unlock(&dev->umrc.init_lock);
return 0;
--
2.43.0
2026-04-28 16:17 [PATCH rc 00/15] Various bug fixes for RDMA drivers in the uapi functions Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 01/15] RDMA/ionic: Fix typo in format string Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 02/15] RDMA/mlx5: Restore zero-init to mlx5_ib_modify_qp() ucmd Jason Gunthorpe
2026-04-28 16:17 ` Jason Gunthorpe [this message]
2026-04-28 16:17 ` [PATCH rc 04/15] RDMA/mana: Validate rx_hash_key_len Jason Gunthorpe
2026-04-28 17:50 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 05/15] RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss() Jason Gunthorpe
2026-04-28 17:43 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 06/15] RDMA/mana: Fix mana_destroy_wq_obj() cleanup " Jason Gunthorpe
2026-04-28 17:55 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 07/15] RDMA/mana: Fix error unwind " Jason Gunthorpe
2026-04-28 17:53 ` [EXTERNAL] " Long Li
2026-04-28 16:17 ` [PATCH rc 08/15] RDMA/ocrdma: Clarify the mm_head searching Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 09/15] RDMA/ocrdma: Don't NULL deref uctx on errors in ocrdma_copy_pd_uresp() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 10/15] RDMA/vmw_pvrdma: Fix double free on pvrdma_alloc_ucontext() error path Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 11/15] RDMA/mlx4: Fix resource leak on error in mlx4_ib_create_srq() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 12/15] RDMA/mlx4: Fix mis-use of RCU in mlx4_srq_event() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 13/15] RDMA/hns: Fix xarray race in hns_roce_create_srq() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 14/15] RDMA/hns: Fix xarray race in hns_roce_create_qp_common() Jason Gunthorpe
2026-04-28 16:17 ` [PATCH rc 15/15] RDMA/hns: Fix unlocked call to hns_roce_qp_remove() Jason Gunthorpe
2026-04-29 7:55 ` [PATCH rc 00/15] Various bug fixes for RDMA drivers in the uapi functions Junxian Huang
2026-05-02 18:39 ` Jason Gunthorpe