Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
* [PATCH] IB/cm: Fix av cm device leak on an error path in cm_init_av_by_path()
@ 2026-06-02 19:37 Jason Gunthorpe
  2026-06-05 16:07 ` Jason Gunthorpe
  0 siblings, 1 reply; 2+ messages in thread
From: Jason Gunthorpe @ 2026-06-02 19:37 UTC (permalink / raw)
  To: Leon Romanovsky, linux-rdma; +Cc: Leon Romanovsky, Mark Zhang, patches

Codex pointed out that cm_init_av_by_path() can call cm_set_av_port()
which takes a reference on the cm device, but then can immediately return
error if ib_init_ah_attr_from_path() fails.

Since callers like ib_send_cm_req() put the av on the stack this leaks
that cm device reference.

Re-order cm_init_av_by_path() so it doesn't touch the av until it has done
all its failable work, and then update the av in one shot so it is either
left alone or fully init'd.

Fixes: 76039ac9095f ("IB/cm: Protect cm_dev, cm_ports and mad_agent with kref and lock")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/core/cm.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 6ab9a0aee1ec60..ee966a25141b12 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -530,6 +530,7 @@ static int cm_init_av_by_path(struct sa_path_rec *path,
 	struct rdma_ah_attr new_ah_attr;
 	struct cm_device *cm_dev;
 	struct cm_port *port;
+	u16 pkey_index;
 	int ret;
 
 	port = get_cm_port_from_path(path, sgid_attr);
@@ -538,12 +539,10 @@ static int cm_init_av_by_path(struct sa_path_rec *path,
 	cm_dev = port->cm_dev;
 
 	ret = ib_find_cached_pkey(cm_dev->ib_device, port->port_num,
-				  be16_to_cpu(path->pkey), &av->pkey_index);
+				  be16_to_cpu(path->pkey), &pkey_index);
 	if (ret)
 		return ret;
 
-	cm_set_av_port(av, port);
-
 	/*
 	 * av->ah_attr might be initialized based on wc or during
 	 * request processing time which might have reference to sgid_attr.
@@ -558,6 +557,8 @@ static int cm_init_av_by_path(struct sa_path_rec *path,
 	if (ret)
 		return ret;
 
+	av->pkey_index = pkey_index;
+	cm_set_av_port(av, port);
 	av->timeout = path->packet_life_time + 1;
 	rdma_move_ah_attr(&av->ah_attr, &new_ah_attr);
 	return 0;

base-commit: d6ab440240a04b8737ee4c7bb21af9182e451733
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] IB/cm: Fix av cm device leak on an error path in cm_init_av_by_path()
  2026-06-02 19:37 [PATCH] IB/cm: Fix av cm device leak on an error path in cm_init_av_by_path() Jason Gunthorpe
@ 2026-06-05 16:07 ` Jason Gunthorpe
  0 siblings, 0 replies; 2+ messages in thread
From: Jason Gunthorpe @ 2026-06-05 16:07 UTC (permalink / raw)
  To: Leon Romanovsky, linux-rdma; +Cc: Leon Romanovsky, Mark Zhang, patches

On Tue, Jun 02, 2026 at 04:37:28PM -0300, Jason Gunthorpe wrote:
> Codex pointed out that cm_init_av_by_path() can call cm_set_av_port()
> which takes a reference on the cm device, but then can immediately return
> error if ib_init_ah_attr_from_path() fails.
> 
> Since callers like ib_send_cm_req() put the av on the stack this leaks
> that cm device reference.
> 
> Re-order cm_init_av_by_path() so it doesn't touch the av until it has done
> all its failable work, and then update the av in one shot so it is either
> left alone or fully init'd.
> 
> Fixes: 76039ac9095f ("IB/cm: Protect cm_dev, cm_ports and mad_agent with kref and lock")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/core/cm.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Sashiko also pointed out that the cm_destroy_av() prior to
cm_init_av_by_path() is harmful as it leaves the AV broken in the error
case and thus the REJ won't send. Since cm_init_av_by_path() is now atomic
it is safe to delete the cm_destroy_av(). On succees the av from
cm_init_av_for_response() is cleaned up by cm_init_av_by_path(), on
failure the 'goto rejected' guarentees the av is destroyed during
ib_destroy_cm_id().

So I folded this in:

-       /* This destroy call is needed to pair with cm_init_av_for_response */
-       cm_destroy_av(&cm_id_priv->av);
+       /*
+        * cm_init_av_by_path() will internally pair with the above
+        * cm_init_av_for_response() if it succeeds.
+        */
        ret = cm_init_av_by_path(&work->path[0], gid_attr, &cm_id_priv->av);
        if (ret) {
                int err;

applied

Jason

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-05 16:08 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-02 19:37 [PATCH] IB/cm: Fix av cm device leak on an error path in cm_init_av_by_path() Jason Gunthorpe
2026-06-05 16:07 ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox