Linux RDMA and InfiniBand development
From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Vlad Dumitrescu <vdumitrescu@nvidia.com>,
	linux-rdma@vger.kernel.org, Sean Hefty <shefty@nvidia.com>
Subject: [PATCH rdma-next 6/9] IB/cm: Set deadline when sending MADs
Date: Thu,  5 Dec 2024 15:49:36 +0200
Message-ID: <94e82976688780ac43f5719d86c6630228c2e590.1733405453.git.leon@kernel.org>
In-Reply-To: <cover.1733405453.git.leon@kernel.org>

From: Vlad Dumitrescu <vdumitrescu@nvidia.com>

With the current MAD retry algorithm, the expected total timeout is
roughly (retries + 1) * timeout_ms.  This is an approximation because
scheduling and completion delays are not strictly accounted for.

For CM, the number of retries is typically CMA_MAX_CM_RETRIES (15),
unless the peer sets REQ:Max CM Retries [1] to a different value.
In theory, the timeout could vary, being based on
CMA_CM_RESPONSE_TIMEOUT + Packet Life Time, as well as the peer's MRA
messages.  In practice, for RoCE, the formula above results in 65536ms.
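As a worked check of the arithmetic, the sketch below evaluates the
formula from the first paragraph.  The helper name is hypothetical, and
the 4096ms per-attempt timeout is only inferred from 65536ms / (15 + 1);
the patch does not state it.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: expected total timeout
 * under the current retry algorithm, i.e. one initial send plus
 * `retries` resends, each waiting timeout_ms for a response.
 */
static inline uint64_t mad_expected_total_ms(uint32_t retries,
					     uint32_t timeout_ms)
{
	return ((uint64_t)retries + 1) * timeout_ms;
}
```

With retries = 15 and an assumed timeout_ms = 4096, this gives
16 * 4096 = 65536ms, which matches the RoCE figure quoted above.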

Based on the above, set a constant deadline of a round 70s for all
cases.  Note that MRAs will end up calling ib_modify_mad, which will
extend the deadline accordingly.

This allows changes to the MAD layer's internal retry algorithm without
affecting the total timeout experienced by CM.
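To illustrate why a fixed deadline decouples the total timeout from the
retry schedule, here is a toy model of a retry loop bounded by a
deadline.  It is only a sketch under stated assumptions: the function,
its parameters and the doubling schedule are hypothetical and do not
reproduce the MAD layer's actual implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model, not kernel code: returns the time (ms) at which a
 * send would be declared timed out when no response ever arrives.
 * Each attempt waits `wait` ms; with `exponential` set, the wait doubles
 * after every attempt.  The deadline caps the total in either case.
 */
static uint64_t sim_total_wait_ms(uint32_t retries, uint32_t timeout_ms,
				  uint64_t deadline_ms, bool exponential)
{
	uint64_t elapsed = 0;
	uint64_t wait = timeout_ms;
	uint32_t attempt;

	for (attempt = 0; attempt <= retries; attempt++) {
		/* Never wait past the deadline. */
		if (elapsed + wait >= deadline_ms)
			return deadline_ms;
		elapsed += wait;
		if (exponential)
			wait *= 2;
	}
	return elapsed;
}
```

In this model, a linear schedule with 15 retries of 4096ms finishes at
65536ms, while a doubling schedule that starts at 1000ms is cut off at
the 70000ms deadline.  Both stay within the same 70s bound.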

[1] IBTA v1.7 - Section 12.7.27 - Max CM Retries

Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Reviewed-by: Sean Hefty <shefty@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/cm.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 142170473e75..36649faf9842 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -36,6 +36,7 @@ MODULE_LICENSE("Dual BSD/GPL");
 
 #define CM_DESTROY_ID_WAIT_TIMEOUT 10000 /* msecs */
 #define CM_DIRECT_RETRY_CTX ((void *) 1UL)
+#define CM_MAD_TOTAL_TIMEOUT 70000 /* msecs */
 
 static const char * const ibcm_rej_reason_strs[] = {
 	[IB_CM_REJ_NO_QP]			= "no QP",
@@ -279,6 +280,7 @@ static struct ib_mad_send_buf *cm_alloc_msg(struct cm_id_private *cm_id_priv)
 	struct ib_mad_agent *mad_agent;
 	struct ib_mad_send_buf *m;
 	struct ib_ah *ah;
+	int ret;
 
 	lockdep_assert_held(&cm_id_priv->lock);
 
@@ -309,6 +311,17 @@ static struct ib_mad_send_buf *cm_alloc_msg(struct cm_id_private *cm_id_priv)
 	}
 
 	m->ah = ah;
+	m->retries = cm_id_priv->max_cm_retries;
+	ret = ib_set_mad_deadline(m, CM_MAD_TOTAL_TIMEOUT);
+	if (ret) {
+		ib_free_send_mad(m);
+		m = ERR_PTR(ret);
+		rdma_destroy_ah(ah, 0);
+		goto out;
+	}
+
+	refcount_inc(&cm_id_priv->refcount);
+	m->context[0] = cm_id_priv;
 
 out:
 	spin_unlock(&cm_id_priv->av.port->cm_dev->mad_agent_lock);
-- 
2.47.0


Thread overview: 10+ messages
2024-12-05 13:49 [PATCH rdma-next 0/9] Rework retry algorithm used when sending MADs Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 1/9] IB/mad: Apply timeout modification (CM MRA) only once Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 2/9] IB/mad: Add deadline for send MADs Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 3/9] RDMA/sa_query: Enforce min retry interval and deadline Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 4/9] RDMA/nldev: Add sa-min-timeout management attribute Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 5/9] IB/umad: Set deadline when sending non-RMPP MADs Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 6/9] IB/cm: Set deadline when sending MADs Leon Romanovsky [this message]
2024-12-05 13:49 ` [PATCH rdma-next 7/9] IB/mad: Exponential backoff when retrying sends Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 8/9] RDMA/nldev: Add mad-linear-timeouts management attribute Leon Romanovsky
2024-12-05 13:49 ` [PATCH rdma-next 9/9] IB/cma: Lower response timeout to roughly 1s Leon Romanovsky
