public inbox for linux-rdma@vger.kernel.org
From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	OpenFabrics EWG <ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org>,
	Jeff Squyres <jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
	Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Subject: Re: bug 1918 - openmpi broken due to rdma-cm changes
Date: Thu, 04 Feb 2010 17:04:23 -0600
Message-ID: <4B6B5277.80307@opengridcomputing.com>
In-Reply-To: <DB361897EE7C40C5AB494AAC2D8625AC-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>

Sean Hefty wrote:
>> Well then the rdma-cm needs to know which devices support hw loopback.
>> Cuz on a T3-only system, no hwloop...
>>     
>
> The problem sounds like it's more than just whether 127.0.0.1 is usable.  That
> check may fix openmpi, but it sounds more like the app needs to know whether the
> device can actually support loopback, regardless of what addresses are used.  Is
> this correct?
>
> What would openmpi do if there were two addresses assigned to the T3 device?
>   

It would use them and might even create two connections.

> Does openmpi simply bypass RDMA for all connections on the local machine?
>
>   

OpenMPI can be run to use hw loopback if it's available.  For T3 
clusters, OMPI is run in a mode that uses shared memory for intra-node 
communications.


> Basically, I'm not sure that this is *just* an rdma_cm issue.  Although it
> definitely appears that some sort of change needs to be made to the rdma_cm.
>
>   

I think the OpenMPI rdmacm code needs to skip 127.0.0.1 in this 
particular case.  Prior to ofed-1.5.1 this was not needed: the bind 
would fail, and thus OpenMPI would not advertise 127.0.0.1 to its peer.  
I will work to get that OpenMPI change done.
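
To illustrate the "skip 127.0.0.1" idea, here is a minimal sketch of 
filtering loopback addresses out of a candidate list before anything is 
advertised to a peer.  It is written in Python for brevity and is not 
the OpenMPI code; the function name advertisable_addresses and the 
sample addresses are assumptions made up for this example.

```python
import ipaddress


def advertisable_addresses(candidates):
    """Return the candidate IP strings that are not loopback addresses.

    Hypothetical helper: the name and the list-of-strings interface are
    assumptions for illustration, not part of OpenMPI or librdmacm.
    """
    usable = []
    for text in candidates:
        addr = ipaddress.ip_address(text)
        if addr.is_loopback:
            # 127.0.0.0/8 and ::1 are dropped so that a peer is never
            # told to connect to an address that only exists locally.
            continue
        usable.append(text)
    return usable


if __name__ == "__main__":
    print(advertisable_addresses(["127.0.0.1", "192.168.1.10", "::1"]))
```

With a filter like this, the outcome no longer depends on whether the 
bind to 127.0.0.1 happens to fail.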

But let's also add a device attribute so the rdmacm can know if a device 
supports loopback.  Clearly, if the rdma-cm allows binds to T3, 
loopback connections will fail at connect time.
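
As a rough sketch of how such an attribute might be consulted, the 
Python model below rejects a loopback bind up front when the device 
reports no hw loopback, instead of letting the connection fail later.  
Everything here is hypothetical: the Device class, the hw_loopback 
field, and may_bind are invented for illustration and do not correspond 
to any existing rdma-cm or verbs interface.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Toy stand-in for an RDMA device (hypothetical, illustration only)."""
    name: str
    hw_loopback: bool  # assumed attribute: True if the hardware can loop back


def may_bind(device, address):
    """Decide whether a bind of `address` to `device` should be allowed.

    A loopback address is refused on a device without hw loopback, so the
    failure surfaces at bind time rather than at connect time.
    """
    addr = ipaddress.ip_address(address)
    if addr.is_loopback and not device.hw_loopback:
        return False
    return True


if __name__ == "__main__":
    no_loop = Device(name="example_no_loopback", hw_loopback=False)
    with_loop = Device(name="example_with_loopback", hw_loopback=True)
    print(may_bind(no_loop, "127.0.0.1"))
    print(may_bind(with_loop, "127.0.0.1"))
    print(may_bind(no_loop, "192.168.1.10"))
```

This only covers the 127.0.0.1 case; whether a device can loop back 
traffic sent to its own non-loopback addresses is a separate question 
that the same attribute could also answer.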

Hey Roland, are you ok with a device attribute to indicate hw-loopback 
support?


Steve.



