From: Mark Zhang <markzhang@nvidia.com>
To: Etienne AUJAMES <eaujames@ddn.com>, jgg@ziepe.ca, leon@kernel.org
Cc: linux-rdma@vger.kernel.org, Gael.DELBARY@cea.fr,
guillaume.courrier@cea.fr
Subject: Re: [PATCH rdma-next] IB/cma: Define options to set CM timeouts and retries
Date: Mon, 8 Apr 2024 11:26:57 +0800 [thread overview]
Message-ID: <adc18a96-661f-4c60-808f-3a1167a20e1b@nvidia.com> (raw)
In-Reply-To: <ZgxeQbxfKXHkUlQG@eaujamesDDN>
On 4/3/2024 3:36 AM, Etienne AUJAMES wrote:
> External email: Use caution opening links or attachments
>
>
> Define new options in 'rdma_set_option' to override default CM retries
> ("Max CM retries") and timeouts ("Local CM Response Timeout" and "Remote
> CM Response Timeout").
>
Sometimes user-facting tunable is not preferred but let's see:
https://lore.kernel.org/linux-rdma/EC1346C3-3888-4FFB-B302-1DB200AAE00D@oracle.com/
> These options can be useful for RoCE networks (no SM) to decrease the
> overall connection timeout with an unreachable node (by default, it can
> take several minutes).
>
This patch is not only for RoCE, right?
> Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
> ---
> drivers/infiniband/core/cma.c | 92 ++++++++++++++++++++++++++++--
> drivers/infiniband/core/cma_priv.h | 4 ++
> drivers/infiniband/core/ucma.c | 14 +++++
> include/rdma/rdma_cm.h | 5 ++
> include/uapi/rdma/rdma_user_cm.h | 4 +-
> 5 files changed, 113 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 1e2cd7c8716e..cc73b9708862 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1002,6 +1002,8 @@ __rdma_create_id(struct net *net, rdma_cm_event_handler event_handler,
> id_priv->tos_set = false;
> id_priv->timeout_set = false;
> id_priv->min_rnr_timer_set = false;
> + id_priv->max_cm_retries = false;
> + id_priv->cm_timeout = false;
> id_priv->gid_type = IB_GID_TYPE_IB;
> spin_lock_init(&id_priv->lock);
> mutex_init(&id_priv->qp_mutex);
> @@ -2845,6 +2847,80 @@ int rdma_set_min_rnr_timer(struct rdma_cm_id *id, u8 min_rnr_timer)
> }
> EXPORT_SYMBOL(rdma_set_min_rnr_timer);
>
> +/**
> + * rdma_set_cm_retries() - Set the maximum of CM retries of the QP associated
> + * with a connection identifier.
This comment (and the one below) seems not accuruate, as it's not for
the QP. This is different from the rdma_set_ack_timeout().
> + * @id: Communication identifier associated with service type.
> + * @max_cm_retries: 4-bit value definied as "Max CM Retries" REQ field
typo: definied -> defined
> + * (Table 99 "REQ Message Contents" in the IBTA specification).
> + *
> + * This function should be called before rdma_connect() on active side.
> + * The value will determine the maximum number of times the CM should
> + * resend a connection request or reply (REQ/REP) before giving up (UNREACHABLE
> + * event).
> + *
> + * Return: 0 for success
> + */
> +int rdma_set_cm_retries(struct rdma_cm_id *id, u8 max_cm_retries)
> +{
> + struct rdma_id_private *id_priv;
> +
> + /* It is a 4-bit value */
> + if (max_cm_retries & 0xf0)
> + return -EINVAL;
> +
> + if (WARN_ON(id->qp_type != IB_QPT_RC && id->qp_type != IB_QPT_XRC_TGT))
> + return -EINVAL;
> +
Maybe we don't need a warning here.
I think UD also need these 2 parameters, as UD also has Resp. see
cma_resolve_ib_udp() below.
> + id_priv = container_of(id, struct rdma_id_private, id);
> + mutex_lock(&id_priv->qp_mutex);
> + id_priv->max_cm_retries = max_cm_retries;
> + id_priv->max_cm_retries_set = true;
> + mutex_unlock(&id_priv->qp_mutex);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(rdma_set_cm_retries);
> +
> +/**
> + * rdma_set_cm_timeout() - Set the CM timeouts of the QP associated with a
> + * connection identifier.
> + * @id: Communication identifier associated with service type.
> + * @cm_timeout: 5-bit value, expressed as 4.096 * 2^(timeout) usec.
> + * This value should be superior than 0.
> + *
> + * This function should be called before rdma_connect() on active side.
> + * The value will affect the "Remote CM Response Timeout" and the
> + * "Local CM Response Timeout" timeouts to respond to a connection request (REQ)
> + * and to wait for connection reply (REP) ack on the remote node.
> + *
> + * Round-trip timeouts for a REQ and REP requests:
> + * REQ_timeout_ms = remote_cm_response_timeout_ms + 2* PacketLifeTime_ms
> + * REP_timeout_ms = local_cm_response_timeout_ms
> + *
> + * Return: 0 for success
> + */
> +int rdma_set_cm_timeout(struct rdma_cm_id *id, u8 cm_timeout)
> +{
> + struct rdma_id_private *id_priv;
> +
> + /* It is a 5-bit value */
> + if (!cm_timeout || (cm_timeout & 0xe0))
> + return -EINVAL;
> +
> + if (WARN_ON(id->qp_type != IB_QPT_RC && id->qp_type != IB_QPT_XRC_TGT))
> + return -EINVAL;
likewise
> +
> + id_priv = container_of(id, struct rdma_id_private, id);
> + mutex_lock(&id_priv->qp_mutex);
> + id_priv->cm_timeout = cm_timeout;
> + id_priv->cm_timeout_set = true;
> + mutex_unlock(&id_priv->qp_mutex);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(rdma_set_cm_timeout);
> +
> static int route_set_path_rec_inbound(struct cma_work *work,
> struct sa_path_rec *path_rec)
> {
> @@ -4295,8 +4371,11 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
> req.path = id_priv->id.route.path_rec;
> req.sgid_attr = id_priv->id.route.addr.dev_addr.sgid_attr;
> req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
> - req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8);
> - req.max_cm_retries = CMA_MAX_CM_RETRIES;
> + req.timeout_ms = id_priv->cm_timeout_set ?
> + id_priv->cm_timeout : CMA_CM_RESPONSE_TIMEOUT;
> + req.timeout_ms = 1 << (req.timeout_ms - 8);
So req.timeout_ms must be greater than 8? Maybe we need to check in
rdma_set_cm_timeout().
next prev parent reply other threads:[~2024-04-08 3:27 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-02 19:36 [PATCH rdma-next] IB/cma: Define options to set CM timeouts and retries Etienne AUJAMES
2024-04-08 3:26 ` Mark Zhang [this message]
2024-04-09 16:11 ` Etienne AUJAMES
2024-04-08 16:10 ` Sean Hefty
2024-04-09 13:07 ` Etienne AUJAMES
2024-04-09 14:44 ` Sean Hefty
2024-04-11 16:04 ` Etienne AUJAMES
2024-04-11 17:15 ` Sean Hefty
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adc18a96-661f-4c60-808f-3a1167a20e1b@nvidia.com \
--to=markzhang@nvidia.com \
--cc=Gael.DELBARY@cea.fr \
--cc=eaujames@ddn.com \
--cc=guillaume.courrier@cea.fr \
--cc=jgg@ziepe.ca \
--cc=leon@kernel.org \
--cc=linux-rdma@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox