* [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received
@ 2026-05-07 15:47 Praveen Kumar Kannoju
  2026-05-11  7:50 ` Praveen Kannoju
  2026-06-03  0:26 ` Jason Gunthorpe
  0 siblings, 2 replies; 4+ messages in thread
From: Praveen Kumar Kannoju @ 2026-05-07 15:47 UTC (permalink / raw)
  To: yishaih, jgg, leon, linux-rdma, linux-kernel
  Cc: anand.a.khoje, manjunath.b.patil, Praveen Kumar Kannoju

mlx4_ib_multiplex_cm_handler() allocates an id_map_entry for CM
transactions, but the entry is only released on DREQ or REJ flows.

In the duplicate REP handling scenario, cm_dup_rep_handler() may get
invoked when the remote side receives a REP for which no matching
cm_id_priv exists. In such cases the CM handshake never reaches RTU,
and the sender side may never receive either DREQ or REJ cleanup events.

As a result, the allocated id_map_entry remains indefinitely, resulting in
a stale mapping leak.

Fix this by scheduling delayed cleanup immediately after allocating the
id_map_entry. The delayed work is cancelled once CM_RTU_ATTR_ID is
received, indicating that the CM handshake completed successfully.

This ensures abandoned mappings are eventually reclaimed even when RTU is
never received.

Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
---
 drivers/infiniband/hw/mlx4/cm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
index 63a868a3822f..700a840d491d 100644
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -299,6 +299,7 @@ static void schedule_delayed(struct ib_device *ibdev, struct id_map_entry *id)
 }
 
 #define REJ_REASON(m) be16_to_cpu(((struct cm_generic_msg *)(m))->rej_reason)
+#define RTU_RECEIVE_TIMEOUT  (60 * HZ)
 int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id,
 		struct ib_mad *mad)
 {
@@ -321,6 +322,9 @@ int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id
 				__func__, slave_id, sl_cm_id);
 			return PTR_ERR(id);
 		}
+
+		schedule_delayed_work(&id->timeout, RTU_RECEIVE_TIMEOUT);
+
 	} else if (mad->mad_hdr.attr_id == CM_REJ_ATTR_ID ||
 		   mad->mad_hdr.attr_id == CM_SIDR_REP_ATTR_ID) {
 		return 0;
@@ -335,6 +339,9 @@ int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id
 		return -EINVAL;
 	}
 
+	if (mad->mad_hdr.attr_id == CM_RTU_ATTR_ID)
+		cancel_delayed_work_sync(&id->timeout);
+
 cont:
 	set_local_comm_id(mad, id->pv_cm_id);
 
@@ -479,6 +486,9 @@ int mlx4_ib_demux_cm_handler(struct ib_device *ibdev, int port, int *slave,
 	    mad->mad_hdr.attr_id == CM_REJ_ATTR_ID)
 		schedule_delayed(ibdev, id);
 
+	if (mad->mad_hdr.attr_id == CM_RTU_ATTR_ID)
+		cancel_delayed_work_sync(&id->timeout);
+
 	return 0;
 }
 
-- 
2.43.7



* RE: [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received
  2026-05-07 15:47 [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received Praveen Kumar Kannoju
@ 2026-05-11  7:50 ` Praveen Kannoju
  2026-06-03  0:26 ` Jason Gunthorpe
  1 sibling, 0 replies; 4+ messages in thread
From: Praveen Kannoju @ 2026-05-11  7:50 UTC (permalink / raw)
  To: Praveen Kannoju, yishaih@nvidia.com, jgg@ziepe.ca,
	leon@kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org
  Cc: Anand Khoje, Manjunath Patil

Confidential - Oracle Restricted \Including External Recipients

Gentle reminder for reviewing the patch.

-
Praveen.





* Re: [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received
  2026-05-07 15:47 [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received Praveen Kumar Kannoju
  2026-05-11  7:50 ` Praveen Kannoju
@ 2026-06-03  0:26 ` Jason Gunthorpe
  2026-06-08 12:25   ` Praveen Kannoju
  1 sibling, 1 reply; 4+ messages in thread
From: Jason Gunthorpe @ 2026-06-03  0:26 UTC (permalink / raw)
  To: Praveen Kumar Kannoju
  Cc: yishaih, leon, linux-rdma, linux-kernel, anand.a.khoje,
	manjunath.b.patil

On Thu, May 07, 2026 at 03:47:55PM +0000, Praveen Kumar Kannoju wrote:
> mlx4_ib_multiplex_cm_handler() allocates an id_map_entry for CM
> transactions, but the entry is only released on DREQ or REJ flows.
> 
> In the duplicate REP handling scenario, cm_dup_rep_handler() may get
> invoked when the remote side receives a REP for which no matching
> cm_id_priv exists. In such cases the CM handshake never reaches RTU,
> and the sender side may never receive either DREQ or REJ cleanup events.
> 
> As a result, the allocated id_map_entry remains indefinitely, resulting in
> a stale mapping leak.
> 
> Fix this by scheduling delayed cleanup immediately after allocating the
> id_map_entry. The delayed work is cancelled once CM_RTU_ATTR_ID is
> received, indicating that the CM handshake completed successfully.
> 
> This ensures abandoned mappings are eventually reclaimed even when RTU is
> never received.
> 
> Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> ---
>  drivers/infiniband/hw/mlx4/cm.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> index 63a868a3822f..700a840d491d 100644
> --- a/drivers/infiniband/hw/mlx4/cm.c
> +++ b/drivers/infiniband/hw/mlx4/cm.c
> @@ -299,6 +299,7 @@ static void schedule_delayed(struct ib_device *ibdev, struct id_map_entry *id)
>  }
>  
>  #define REJ_REASON(m) be16_to_cpu(((struct cm_generic_msg *)(m))->rej_reason)
> +#define RTU_RECEIVE_TIMEOUT  (60 * HZ)
>  int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id,
>  		struct ib_mad *mad)
>  {
> @@ -321,6 +322,9 @@ int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id
>  				__func__, slave_id, sl_cm_id);
>  			return PTR_ERR(id);
>  		}
> +
> +		schedule_delayed_work(&id->timeout, RTU_RECEIVE_TIMEOUT);

So this is a distinct problem from the other one? Can you put all
these mlx4 bugs into one series?

Why does this open code schedule_delayed() and remove all the locking?

Sashiko even points out this might create a UAF:

https://sashiko.dev/#/patchset/20260507154755.452008-1-praveen.kannoju%40oracle.com

Jason


* RE: [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received
  2026-06-03  0:26 ` Jason Gunthorpe
@ 2026-06-08 12:25   ` Praveen Kannoju
  0 siblings, 0 replies; 4+ messages in thread
From: Praveen Kannoju @ 2026-06-08 12:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: yishaih@nvidia.com, leon@kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, Anand Khoje, Manjunath Patil


Yes, this is a separate issue from the earlier REJ handling.

In this case, when the remote node drops the reply as a duplicate, the source side can retain the `id_map_entry` indefinitely, which leaves a stale mapping behind.

Thank you for pointing me to the UAF concern and the review link. I will evaluate the locking and lifetime handling carefully, fix the patch as needed, and resend an updated version.
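
To make the locking and lifetime question concrete, here is a minimal userspace sketch of the lifecycle under discussion: a cleanup timer is armed when the entry is created, disarmed when RTU arrives, and fired on timeout, with every transition made under a single lock. This is not kernel code and not the updated patch. The names `id_map_model`, `entry_model`, `arm_cleanup`, `disarm_cleanup`, `run_cleanup`, the tick-based deadline, and the `atomic_flag` spinlock are all hypothetical stand-ins; the real driver uses delayed work and its own locks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the id map; one lock guards all state below. */
struct id_map_model {
	atomic_flag lock;   /* stand-in for the driver's real locking */
	bool going_down;    /* teardown in progress: no new timers allowed */
};

/* Hypothetical model of one id_map_entry. */
struct entry_model {
	bool scheduled_delete;   /* a cleanup timer is currently armed */
	unsigned long deadline;  /* tick at which the armed timer fires */
	bool freed;              /* entry has been reclaimed */
};

enum { RTU_RECEIVE_TIMEOUT_TICKS = 60 }; /* assumed unit: abstract ticks */

static void model_lock(struct id_map_model *map)
{
	while (atomic_flag_test_and_set(&map->lock))
		; /* spin until the flag is released */
}

static void model_unlock(struct id_map_model *map)
{
	atomic_flag_clear(&map->lock);
}

/* Arm (or re-arm) the cleanup timer. Refuses during teardown or after free. */
static bool arm_cleanup(struct id_map_model *map, struct entry_model *e,
			unsigned long now, unsigned long timeout)
{
	bool armed = false;

	model_lock(map);
	if (!map->going_down && !e->freed) {
		e->scheduled_delete = true;
		e->deadline = now + timeout;
		armed = true;
	}
	model_unlock(map);
	return armed;
}

/* RTU seen: disarm the timer. Returns false if nothing was armed. */
static bool disarm_cleanup(struct id_map_model *map, struct entry_model *e)
{
	bool was_armed;

	model_lock(map);
	was_armed = e->scheduled_delete && !e->freed;
	if (was_armed)
		e->scheduled_delete = false;
	model_unlock(map);
	return was_armed;
}

/* Timer tick: reclaim the entry only if it is still armed and overdue. */
static bool run_cleanup(struct id_map_model *map, struct entry_model *e,
			unsigned long now)
{
	bool reclaimed = false;

	model_lock(map);
	if (e->scheduled_delete && !e->freed && now >= e->deadline) {
		e->freed = true;
		e->scheduled_delete = false;
		reclaimed = true;
	}
	model_unlock(map);
	return reclaimed;
}
```

In this model the arm, disarm, and reclaim paths all check `freed` and `scheduled_delete` under the same lock, so a late RTU cannot touch an entry that the timeout path has already reclaimed. Whether the driver can achieve the same by reusing schedule_delayed() is what I will evaluate for the updated version.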


