public inbox for linux-rdma@vger.kernel.org
* Re: bug 1918 - openmpi broken due to rdma-cm changes
@ 2010-02-05 11:32 Jeff Squyres (jsquyres)
       [not found] ` <58D723FE08DC6A4398E6596E38F3FA170566DA-2KNrN6/GZtCAsgjym8flbKBKnGwkPULj@public.gmane.org>
  0 siblings, 1 reply; 72+ messages in thread
From: Jeff Squyres (jsquyres) @ 2010-02-05 11:32 UTC (permalink / raw)
  To: swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW,
	sean.hefty-ral2JQCrhuEAvxtiuMwx3w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, ewg-G2znmakfqn7U1rindQTSdQ,
	Roland Dreier (rdreier)



Note that it is highly unlikely that we will release Open MPI 1.4.2 in time for OFED 1.5.1.

Also note that trying to bind the RDMA CM to every interface IP address was the method OpenFabrics advised us to use to figure out which devices are RDMA-capable.

As such, it is highly desirable to get the fix in transparently in rdmacm and preserve the old semantics. More specifically, it seems undesirable to change these semantics in a minor OFED point release.

-jms
Sent from my PDA.  No type good.

----- Original Message -----
From: Steve Wise <swise@opengridcomputing.com>
To: Sean Hefty <sean.hefty@intel.com>
Cc: linux-rdma <linux-rdma@vger.kernel.org>; OpenFabrics EWG <ewg@openfabrics.org>; Jeff Squyres (jsquyres); Roland Dreier (rdreier)
Sent: Thu Feb 04 18:04:23 2010
Subject: Re: bug 1918 - openmpi broken due to rdma-cm changes

Sean Hefty wrote:
>> Well then the rdma-cm needs to know which devices support hw loopback.
>> Cuz on a T3-only system, no hwloop...
>>     
>
> The problem sounds like it's more than just whether 127.0.0.1 is usable.  That
> check may fix openmpi, but it sounds more like the app needs to know whether the
> device can actually support loopback, regardless of what addresses are used.  Is
> this correct?
>
> What would openmpi do if there were two addresses assigned to the T3 device?
>   

It would use them and might even create two connections.

> Does openmpi simply bypass RDMA for all connections on the local machine?
>
>   

OpenMPI can be run to use hw loopback if it's available.  For T3 
clusters, OMPI is run in a mode to use shared memory for intra-node 
communications.


> Basically, I'm not sure that this is *just* an rdma_cm issue.  Although it
> definitely appears that some sort of change needs to be made to the rdma_cm.
>
>   

I think the OpenMPI rdmacm code needs to skip 127.0.0.1, in this 
particular case.  Prior to ofed-1.5.1, however, the bind would fail and 
thus OpenMPI would not advertise 127.0.0.1 to its peer.  I will work to 
get that change done.
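
For illustration only, the kind of loopback filter described above might look roughly like the sketch below. This is not the Open MPI code; the helper names addr_is_loopback() and ipv4_text_is_loopback() are hypothetical, and the sketch only classifies an address. It assumes the caller walks its interface addresses and decides per address whether to attempt a bind.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not taken from Open MPI): returns true when the
 * address is an IPv4 address in 127.0.0.0/8 or the IPv6 loopback ::1.
 */
static bool addr_is_loopback(const struct sockaddr *sa)
{
    if (sa == NULL)
        return false;

    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
        /* The top octet of the host-order address is 127 for loopback. */
        return (ntohl(sin->sin_addr.s_addr) >> 24) == 127;
    }

    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;
        return IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr);
    }

    return false;
}

/*
 * Hypothetical convenience wrapper: parse a dotted-quad string and test it.
 * Strings that do not parse as IPv4 are reported as "not loopback".
 */
static bool ipv4_text_is_loopback(const char *text)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, text, &sin.sin_addr) != 1)
        return false;

    return addr_is_loopback((const struct sockaddr *)&sin);
}
```

A caller iterating over interface addresses could check addr_is_loopback() first and simply not call rdma_bind_addr() (and not advertise the address to the peer) when it returns true.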

But let's also add a device attribute so the rdmacm can know if a device 
supports loopback.   Clearly, if the rdma-cm allows binds to T3, 
loopback connections will fail at connect time.

Hey Roland, are you ok with a device attribute to indicate hw-loopback 
support?


Steve.





end of thread, other threads:[~2010-02-10 19:13 UTC | newest]

Thread overview: 72+ messages
2010-02-05 11:32 bug 1918 - openmpi broken due to rdma-cm changes Jeff Squyres (jsquyres)
     [not found] ` <58D723FE08DC6A4398E6596E38F3FA170566DA-2KNrN6/GZtCAsgjym8flbKBKnGwkPULj@public.gmane.org>
2010-02-05 16:16   ` Steve Wise
     [not found]     ` <4B6C4460.3050908-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-05 16:45       ` Steve Wise
2010-02-05 17:51       ` Roland Dreier
     [not found]         ` <ada4olvefl4.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-02-05 17:58           ` Jeff Squyres
     [not found]             ` <324EFA68-12F6-46E9-B876-7F4847B53224-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-05 18:32               ` Steve Wise
     [not found]                 ` <4B6C6453.9090706-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-05 18:49                   ` Roland Dreier
2010-02-05 18:56                   ` Jason Gunthorpe
     [not found]                     ` <20100205185616.GS16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-02-05 20:08                       ` Jeff Squyres
     [not found]                         ` <E8FF8BD1-80AC-4AA7-BC2A-CE7547FB9ABA-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-05 21:14                           ` Jason Gunthorpe
     [not found]                             ` <20100205211455.GT16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-02-05 21:40                               ` Jeff Squyres
     [not found]                                 ` <697C6107-13A9-48E3-B451-02529305100D-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-05 21:53                                   ` Steve Wise
     [not found]                                     ` <4B6C9369.1070208-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-05 22:15                                       ` Sean Hefty
     [not found]                                         ` <77E29960440B4806B112A7158F4FA1C4-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-05 22:21                                           ` Steve Wise
2010-02-06 16:18                                           ` Steve Wise
2010-02-05 22:20                                       ` Jeff Squyres
2010-02-06  0:54                                   ` Roland Dreier
2010-02-05 18:42           ` Sean Hefty
     [not found]             ` <3762D25FD9474444A4B3E2240EFB8D0E-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-05 19:01               ` Steve Wise
     [not found]                 ` <4B6C6B23.4010704-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-05 19:24                   ` Roland Dreier
2010-02-05 17:57       ` Jeff Squyres
2010-02-05 16:22   ` Sean Hefty
     [not found]     ` <0D5487526204477AA2ABED06E46768E2-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-05 16:38       ` Steve Wise
     [not found]         ` <4B6C498F.3060708-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-05 16:52           ` Sean Hefty
     [not found]             ` <F6DF49B759AD49EEB44BECD99FE26DCF-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-05 17:08               ` Steve Wise
2010-02-07 21:44               ` [ewg] " Tziporet Koren
     [not found]                 ` <4B6F3451.2070304-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
2010-02-08  5:38                   ` Steve Wise
2010-02-05 20:09           ` Sean Hefty
     [not found]             ` <38B735478FE94F40BBA3E8BFD794B10F-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-06 16:31               ` Steve Wise
     [not found]                 ` <4B6D9948.6040007-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-06 16:45                   ` Steve Wise
2010-02-07  0:12                   ` Sean Hefty
     [not found]                     ` <B41CA82E76BB439B892B4874D38EA652-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-07  1:22                       ` Steve Wise
     [not found]                         ` <4B6E15C4.9020703-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-07 11:56                           ` [ewg] " Tziporet Koren
     [not found]                             ` <4B6EAA5F.1000208-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
2010-02-07 16:39                               ` Steve Wise
     [not found]                                 ` <4B6EECBE.6020509-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-07 16:48                                   ` Roland Dreier
     [not found]                                     ` <ada4oltxa8j.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-02-07 17:42                                       ` Steve Wise
2010-02-08  5:27                                       ` [ewg] " Sean Hefty
2010-02-08 11:52                                   ` Tziporet Koren
     [not found]                                     ` <4B6FFB07.1070701-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
2010-02-08 14:29                                       ` Steve Wise
2010-02-08  6:02                       ` [PATCH] [for-2.6.33] rdma/cm: disallow loopback address for iwarp devices Sean Hefty
     [not found]                         ` <79BAA34231304F1E84C5A5A53C50A207-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-08 11:52                           ` [ewg] " Tziporet Koren
     [not found]                             ` <4B6FFB1B.40905-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
2010-02-08 14:29                               ` Steve Wise
     [not found]                                 ` <4B701FE6.60302-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-08 16:52                                   ` [ewg] " Roland Dreier
     [not found]                                     ` <adawrynwtz9.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-02-08 19:19                                       ` Jason Gunthorpe
     [not found]                                         ` <20100208191927.GU16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-02-08 20:02                                           ` Steve Wise
     [not found]                                             ` <4B706DED.9080403-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-08 20:33                                               ` Sean Hefty
     [not found]                                                 ` <C8A2C57AD5FA4141860DBFF60BFDE2DC-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-08 21:16                                                   ` Steve Wise
     [not found]                                                     ` <4B707F2D.3030508-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-08 21:56                                                       ` [ewg] " Jeff Squyres
     [not found]                                                         ` <41CC15C4-0200-4C9E-9E10-3D2A9B76D16B-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-08 22:09                                                           ` Jason Gunthorpe
     [not found]                                                             ` <20100208220903.GW16490-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2010-02-08 22:11                                                               ` Jeff Squyres
2010-02-08 22:13                                                           ` Sean Hefty
     [not found]                                                             ` <7CC17592BE414EFCA00A1CB6033D047A-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-08 22:17                                                               ` [ewg] " Jeff Squyres
     [not found]                                                             ` <44864D85-03D1-412E-906C-D6FF904157C8@cisco.com>
     [not found]                                                               ` <44864D85-03D1-412E-906C-D6FF904157C8-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-08 22:26                                                                 ` Sean Hefty
     [not found]                                                                   ` <F533284C543140B0994C54C83C4AFF2B-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-08 22:28                                                                     ` Steve Wise
     [not found]                                                                       ` <4B709008.9020902-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-08 23:48                                                                         ` Sean Hefty
     [not found]                                                                           ` <1966FBDAD40C4EAC8611372D2B15AE84-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-09  0:28                                                                             ` Jeff Squyres
2010-02-09  0:30                                                                         ` Pradeep Satyanarayana
     [not found]                                                                           ` <4B70ACB6.5070008-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2010-02-09  0:45                                                                             ` Jeff Squyres
     [not found]                                                                               ` <FE273021-D385-45EE-9376-6479A92211AF-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-09  0:50                                                                                 ` Pradeep Satyanarayana
     [not found]                                                                                   ` <4B70B152.4080308-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2010-02-09  1:02                                                                                     ` Jeff Squyres
2010-02-09  0:41                                       ` [PATCH] [for-2.6.33] rdma/cm: revert associating an RDMA device when binding to loopback Sean Hefty
     [not found]                                         ` <421D3D6710E847C5B7CAC00EB73117C4-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
2010-02-09 15:29                                           ` Steve Wise
     [not found]                                             ` <4B717F5D.8020403-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-09 16:15                                               ` Pradeep Satyanarayana
     [not found]                                                 ` <4B718A2C.2030602-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2010-02-09 16:18                                                   ` Steve Wise
     [not found]                                                     ` <4B718ADB.5020602-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2010-02-09 16:23                                                       ` Sean Hefty
2010-02-09 22:01                                                   ` Jeff Squyres
     [not found]                                                     ` <4FA7F42E-308A-4A4D-82D8-87794CB8C4DE-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-02-09 22:17                                                       ` [ewg] " Jason Gunthorpe
2010-02-09 22:20                                                       ` Sean Hefty
2010-02-10 18:10                                           ` [PATCH] [for-2.6.33] " Roland Dreier
     [not found]                                             ` <ada6365hsgm.fsf-BjVyx320WGW9gfZ95n9DRSW4+XlvGpQz@public.gmane.org>
2010-02-10 18:18                                               ` Steve Wise
2010-02-10 19:13                                               ` Sean Hefty
2010-02-09 16:32                           ` [PATCH] [for-2.6.33] rdma/cm: disallow loopback address for iwarp devices Sean Hefty
