From: Jens Domke <jens.domke@riken.jp>
To: Christopher Lameter <cl@linux.com>
Cc: linux-rdma@vger.kernel.org
Subject: Re: Is there a working cache for path record and lids etc for librdmacm?
Date: Tue, 17 Nov 2020 17:46:21 +0900
Message-ID: <bbaa9827-fed4-492b-5c22-e543e8c69fbf@riken.jp>
In-Reply-To: <alpine.DEB.2.22.394.2011170253150.206345@www.lameter.com>
Hi Christopher,
On 11/17/20 11:57 AM, Christopher Lameter wrote:
> We have a large number of apps running on the same host that are all
> sending to the same set of hosts. Lots of requests for address resolution
> are going to the SM and for a large set of hosts this can become too much
> for the SM.
I used ibacm successfully years ago (somewhere in the 2013-2015
timeframe) but abandoned the approach because some measurements
indicated that OpenMPI with rdmacm had a big runtime overhead compared
to OpenMPI+oob. (Mellanox was informed, but I'm unsure how much has
changed since then.)
> Is there something that can locally cache the results of the SM queries to
> avoid additional requests?
Not that I know of, but others might know better. Maybe try contacting
Sean Hefty (the driving force behind ibacm) directly, in case he missed
your email here on the list.
> We have tried IBACM but the address resolution does not work on it. It is
> unable to complete a request for any address resolution and leaves kernel
> threads that never terminate instead.
Setting up ibacm was (and is) painful. Maybe you could first verify that
it works on a test bed, using the low-level rdmacm tools (ping-pong,
etc.) to debug.
Another thing I learned the hard way is that a cold cache can overwhelm
opensm as well. So, if you deploy ibacm, you have to make sure that not
too many requests go to the local ibacm on too many nodes simultaneously
right after the ibacm service starts. Otherwise, all nodes send numerous
requests to opensm at once and those requests can time out, which could
be the reason for your stalled kernel threads.
(Another explanation is obviously a bug in ibacm and/or an
incompatibility with newer versions of librdmacm, opensm, or other IB
libs.)
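To illustrate the cold-cache point, here is a minimal sketch of one way
to spread the ibacm startup across nodes so they do not all hit opensm
at the same moment. This is not part of ibacm or opensm. The function
name `startup_delay` and the 300-second window are assumptions made up
for this example, and the right window depends on fabric size and SM
capacity.

```python
import hashlib
import socket


def startup_delay(hostname: str, window_s: float = 300.0) -> float:
    """Map a hostname to a stable delay in [0, window_s) seconds.

    Hypothetical helper: hashing the hostname gives each node a
    different but reproducible offset, so nodes do not all start
    their local cache (and query the SM) simultaneously.
    """
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use 48 bits so the fraction is exactly representable and < 1.0.
    fraction = int.from_bytes(digest[:6], "big") / 2**48
    return fraction * window_s


if __name__ == "__main__":
    # Print this node's delay; a wrapper script could sleep for this
    # long before starting the local cache service.
    print(f"{startup_delay(socket.gethostname()):.1f}")
```

A wrapper could sleep for the printed number of seconds before starting
the service. A hash-based offset is used instead of a random one so that
each node gets the same slot on every boot, which makes it easier to see
afterwards which nodes started when.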
Sorry that I cannot provide more specific and direct help, but maybe my
pointers can help you solve the issue.
Best,
Jens