From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
OpenFabrics EWG <ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org>,
Jeff Squyres <jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Subject: Re: bug 1918 - openmpi broken due to rdma-cm changes
Date: Thu, 04 Feb 2010 17:04:23 -0600
Message-ID: <4B6B5277.80307@opengridcomputing.com>
In-Reply-To: <DB361897EE7C40C5AB494AAC2D8625AC-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
Sean Hefty wrote:
>> Well then the rdma-cm needs to know which devices support hw loopback.
>> Cuz on a T3-only system, no hwloop...
>>
>
> The problem sounds like it's more than just whether 127.0.0.1 is usable. That
> check may fix openmpi, but it sounds more like the app needs to know whether the
> device can actually support loopback, regardless of what addresses are used. Is
> this correct?
>
> What would openmpi do if there were two addresses assigned to the T3 device?
>
It would use them and might even create two connections.
> Does openmpi simply bypass RDMA for all connections on the local machine?
>
>
OpenMPI can be run to use hw loopback if it's available. For T3
clusters, OMPI is run in a mode to use shared memory for intra-node
communications.
> Basically, I'm not sure that this is *just* an rdma_cm issue. Although it
> definitely appears that some sort of change needs to be made to the rdma_cm.
>
>
I think the OpenMPI rdmacm code needs to skip 127.0.0.1 in this
particular case. Prior to ofed-1.5.1, however, the bind would fail and
thus OpenMPI would not advertise 127.0.0.1 to its peer. I will work to
get that OpenMPI change done.
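A minimal sketch of the kind of skip meant here, not the actual OpenMPI
rdmacm code. The helper names is_ipv4_loopback() and drop_loopback() are
hypothetical, and treating all of 127.0.0.0/8 as loopback (rather than
only 127.0.0.1) is an assumption:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the OpenMPI source): true when sa is an
 * IPv4 address inside 127.0.0.0/8. Assumption: the whole /8 is treated
 * as loopback, not just 127.0.0.1. */
static bool is_ipv4_loopback(const struct sockaddr *sa)
{
    if (sa == NULL || sa->sa_family != AF_INET)
        return false;
    const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
    /* s_addr is in network byte order; the top octet is the first one. */
    return (ntohl(sin->sin_addr.s_addr) >> 24) == 127;
}

/* Hypothetical helper: compacts addrs[0..n) in place so that only
 * non-loopback entries remain, and returns how many were kept. The
 * caller would advertise only those entries to the peer. */
static size_t drop_loopback(struct sockaddr_in *addrs, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_ipv4_loopback((const struct sockaddr *)&addrs[i]))
            addrs[kept++] = addrs[i];
    }
    return kept;
}
```

With a filter like this applied before the bind step, 127.0.0.1 would
never be advertised, regardless of whether the rdma-cm accepts the bind.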
But let's also add a device attribute so the rdmacm can know if a device
supports loopback. Clearly, if the rdma-cm allows binds to T3,
loopback connections will fail at connect time.
Hey Roland, are you ok with a device attribute to indicate hw-loopback
support?
Steve.