From: Jason Gunthorpe <jgg@nvidia.com>
To: Jacob Moroni <jmoroni@google.com>
Cc: leon@kernel.org, markzhang@nvidia.com,
linux-rdma@vger.kernel.org, edumazet@google.com
Subject: Re: [PATCH] IB/cm: use rwlock for MAD agent lock
Date: Fri, 21 Feb 2025 13:00:51 -0400
Message-ID: <20250221170051.GA311389@nvidia.com>
In-Reply-To: <20250220175612.2763122-1-jmoroni@google.com>

On Thu, Feb 20, 2025 at 05:56:12PM +0000, Jacob Moroni wrote:
> In workloads where there are many processes establishing
> connections using RDMA CM in parallel (large scale MPI),
> there can be heavy contention for mad_agent_lock in
> cm_alloc_msg.
>
> This contention can occur while inside of a spin_lock_irq
> region, leading to interrupts being disabled for extended
> durations on many cores. Furthermore, it leads to the
> serialization of rdma_create_ah calls, which has negative
> performance impacts for NICs which are capable of processing
> multiple address handle creations in parallel.
>
> The end result is the machine becoming unresponsive, hung
> task warnings, netdev TX timeouts, etc.

While the patch and fix seem reasonable, I'm somewhat surprised to
see it.

If you are running at such a high workload then I'm shocked you don't
hit all the other nasty problems with RDMA CM scalability?

Is the issue that the AH creation is very slow for some reason? It has
been a longstanding peeve of mine that this is done under a spinlock
context. I've long felt that should be reworked and some of those
spinlocks converted to mutexes.

Jason