From: Jason Gunthorpe <jgg@nvidia.com>
To: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Cc: yishaih@nvidia.com, leon@kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, anand.a.khoje@oracle.com,
	manjunath.b.patil@oracle.com
Subject: Re: [PATCH] IB/mlx4: Fix stale CM id_map entries when RTU is never received
Date: Tue, 2 Jun 2026 21:26:14 -0300
Message-ID: <20260603002614.GA1080033@nvidia.com>
In-Reply-To: <20260507154755.452008-1-praveen.kannoju@oracle.com>

On Thu, May 07, 2026 at 03:47:55PM +0000, Praveen Kumar Kannoju wrote:
> mlx4_ib_multiplex_cm_handler() allocates an id_map_entry for CM
> transactions, but the entry is only released on DREQ or REJ flows.
> 
> In the duplicate REP handling scenario, cm_dup_rep_handler() may get
> invoked when the remote side receives a REP for which no matching
> cm_id_priv exists. In such cases the CM handshake never reaches RTU,
> and the sender side may never receive either DREQ or REJ cleanup events.
> 
> As a result, the allocated id_map_entry remains indefinitely, resulting in
> a stale mapping leak.
> 
> Fix this by scheduling delayed cleanup immediately after allocating the
> id_map_entry. The delayed work is cancelled once CM_RTU_ATTR_ID is
> received, indicating that the CM handshake completed successfully.
> 
> This ensures abandoned mappings are eventually reclaimed even when RTU is
> never received.
> 
> Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> ---
>  drivers/infiniband/hw/mlx4/cm.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/mlx4/cm.c b/drivers/infiniband/hw/mlx4/cm.c
> index 63a868a3822f..700a840d491d 100644
> --- a/drivers/infiniband/hw/mlx4/cm.c
> +++ b/drivers/infiniband/hw/mlx4/cm.c
> @@ -299,6 +299,7 @@ static void schedule_delayed(struct ib_device *ibdev, struct id_map_entry *id)
>  }
>  
>  #define REJ_REASON(m) be16_to_cpu(((struct cm_generic_msg *)(m))->rej_reason)
> +#define RTU_RECEIVE_TIMEOUT  (60 * HZ)
>  int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id,
>  		struct ib_mad *mad)
>  {
> @@ -321,6 +322,9 @@ int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id
>  				__func__, slave_id, sl_cm_id);
>  			return PTR_ERR(id);
>  		}
> +
> +		schedule_delayed_work(&id->timeout, RTU_RECEIVE_TIMEOUT);

So this is a distinct problem from the other one? Can you put all
these mlx4 bugs into one series?

Why does this open code schedule_delayed() and remove all the locking?
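
To illustrate the concern, here is a minimal user-space sketch of guarded
scheduling. It assumes the existing helper serializes its scheduling
decision with a lock and checks a teardown flag before queueing work. All
names here (demo_dev, demo_entry, demo_schedule_cleanup,
demo_begin_teardown) are hypothetical stand-ins, not the mlx4 code, and a
pthread mutex stands in for whatever locking the driver actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the mlx4 structures. */
struct demo_dev {
	pthread_mutex_t lock;	/* stands in for the driver's locking */
	bool going_down;	/* teardown has started */
};

struct demo_entry {
	bool scheduled_delete;	/* cleanup work is already queued */
	int queued;		/* how many times work was queued */
};

/*
 * Queue cleanup only if teardown has not started and nothing is queued
 * yet. Returns true if this call queued the work.
 */
static bool demo_schedule_cleanup(struct demo_dev *dev, struct demo_entry *e)
{
	bool did_queue = false;

	assert(dev != NULL && e != NULL);

	pthread_mutex_lock(&dev->lock);
	if (!dev->going_down && !e->scheduled_delete) {
		e->scheduled_delete = true;
		e->queued++;	/* a driver would queue its delayed work here */
		did_queue = true;
	}
	pthread_mutex_unlock(&dev->lock);

	return did_queue;
}

/* Mark teardown as started; later scheduling attempts are refused. */
static void demo_begin_teardown(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->lock);
	dev->going_down = true;
	pthread_mutex_unlock(&dev->lock);
}
```

Calling the scheduling primitive directly skips both checks in this model:
work could be queued twice for the same entry, or queued after teardown has
begun and the entry is about to be freed.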

Sashiko even points out this might create a UAF:

https://sashiko.dev/#/patchset/20260507154755.452008-1-praveen.kannoju%40oracle.com

Jason

