From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Sean Hefty <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: "'Jeff Squyres (jsquyres)'"
<jsquyres-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
ewg-G2znmakfqn7U1rindQTSdQ@public.gmane.org,
"Roland Dreier (rdreier)"
<rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
Subject: Re: bug 1918 - openmpi broken due to rdma-cm changes
Date: Fri, 05 Feb 2010 10:38:39 -0600
Message-ID: <4B6C498F.3060708@opengridcomputing.com>
In-Reply-To: <0D5487526204477AA2ABED06E46768E2-Zpru7NauK7drdx17CPfAsdBPR1lH4CV8@public.gmane.org>
Sean Hefty wrote:
>> Also note that trying to bind rdma cm to all interface ip addresses was the way
>> that we were advised by openfabrics to figure out which devices are rdma-
>> capable.
>>
>> As such, it is highly desirable to get the fix transparently in rdmacm and
>> preserve the old semantic. More specifically, it seems undesirable to change
>> this semantic in a minor ofed point release.
>>
>
> I think the issue is larger than just the rdma_cm.
>
> First, it sounds like openmpi tries to bind to 127.0.0.1, which now works. If
> openmpi uses shared memory for connections on the same machine, I'm not sure why
> this is a problem, unless it is passing that address to another machine to use
> for a connection. If this is the case, then that is a bug in openmpi.
>
Yes, OpenMPI incorrectly advertises 127.0.0.1 as a valid address
to which the peer can connect. This needs to be fixed.
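For illustration only, here is a minimal sketch of the kind of filtering this implies on the OpenMPI side: drop loopback addresses before advertising anything to a peer. The helper names is_loopback_addr() and filter_advertisable() are hypothetical and are not taken from the OpenMPI sources; only standard socket headers are assumed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: true for IPv4 127.0.0.0/8 and for IPv6 ::1. */
static bool is_loopback_addr(const struct sockaddr *sa)
{
    if (sa == NULL)
        return false;
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
        /* The top octet of the host-order address is 127 for loopback. */
        return (ntohl(sin->sin_addr.s_addr) >> 24) == 127;
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;
        return IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr);
    }
    return false;
}

/*
 * Hypothetical helper: copy the non-loopback entries of in[0..n) to out[]
 * and return how many were kept. Only the kept addresses would be
 * advertised to remote peers.
 */
static size_t filter_advertisable(const struct sockaddr_in *in, size_t n,
                                  struct sockaddr_in *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (!is_loopback_addr((const struct sockaddr *)&in[i]))
            out[kept++] = in[i];
    }
    return kept;
}
```

With a candidate list of 127.0.0.1 and 192.168.0.1, only 192.168.0.1 would survive the filter and be sent to the peer.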
> Second, I still don't understand whether iwarp is limited to 'loopback'
> connections that are not bound to 127.0.0.1. For instance, if the RDMA device
> is associated with 192.168.0.1, then can it handle a connection from 192.168.0.1
> <-> 192.168.0.1? If it can't, then the rdma_cm can't help in this case when
> bind is called. The failure has to come during connect, which sounds like the
> behavior that's seen today with 127.0.0.1.
>
It's not iWARP-specific. A device may or may not support HW loopback.
The IB spec mandates this support, but the iWARP spec doesn't.
Ammasso and Chelsio T3 RNICs do not support HW loopback; they will fail
if you try to connect to a local address. The rdma-cm shouldn't allow
binds to 127.0.0.1 for these devices, since such a bind necessarily
implies that the connection will require HW loopback on that device.
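To make the proposed rule concrete, here is a rough sketch of the bind-time decision. It is not the rdma-cm code: struct dev_caps, the hw_loopback flag, and check_bind() are all made up for illustration, and the real check would have to get the capability from wherever the driver exposes it.

```c
#include <stdbool.h>

/* Hypothetical per-device capability record; not an rdma-cm structure. */
struct dev_caps {
    bool hw_loopback; /* true if the device can loop traffic back in HW */
};

enum bind_result { BIND_OK, BIND_REJECT };

/*
 * Sketch of the rule: a bind to a loopback address is only acceptable on
 * a device that supports HW loopback. Binds to any other address are not
 * affected by this check.
 */
static enum bind_result check_bind(const struct dev_caps *dev,
                                   bool addr_is_loopback)
{
    if (addr_is_loopback && !dev->hw_loopback)
        return BIND_REJECT;
    return BIND_OK;
}
```

Under these assumptions a T3-like device (hw_loopback false) rejects a bind to 127.0.0.1 but still accepts a bind to its own interface address, while an IB-like device accepts both.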
> So, while the rdma_cm can fail binds to 127.0.0.1 if the RDMA device doesn't
> support loopback, I'm still not sure how much of a fix this is.
>
My concern is breaking an existing, working OpenMPI because we changed
the semantics of the rdma-cm in an OFED point release.
BTW: Was this change an artifact of rebasing ofed-1.5.1 on a new kernel
version?
Steve.
> - Sean
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>