* rdma_bw fails
@ 2011-05-17 16:47 Greg I Kerr
[not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Greg I Kerr @ 2011-05-17 16:47 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
After finally fully comprehending libibverbs, I am now trying to
expand my understanding to librdma_cm, but it seems I am having some
problems getting connected.
If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
fails with the error: "4390:pp_client_connect: unexpected CM event 1."
event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
it should work if ib0 is configured.
Thanks in advance for any help,
Greg Kerr
[kerrg@compute-0-3 rdma]$ rdma_bw -c
4292: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 |
iters=1000 | duplex=0 | cma=1 |
[kerrg@compute-0-2 rdma]$ rdma_bw -c 10.1.1.30
4390: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 |
iters=1000 | duplex=0 | cma=1 |
4390:pp_client_connect: unexpected CM event 1
Here is the output of /sbin/ifconfig:
[kerrg@compute-0-3 rdma]$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0
inet6 addr: fe80::230:48ff:febe:d984/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:55999 errors:0 dropped:0 overruns:0 frame:0
TX packets:12608 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6023763 (5.7 MiB) TX bytes:2459277 (2.3 MiB)
Memory:febe0000-fec00000
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::230:48be:d984:1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:42 errors:0 dropped:5 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:560 (560.0 b) TX bytes:3696 (3.6 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1253 errors:0 dropped:0 overruns:0 frame:0
TX packets:1253 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2093229 (1.9 MiB) TX bytes:2093229 (1.9 MiB)
[kerrg@compute-0-2 rdma]$ /sbin/ifconfig
eth0 Link encap:Ethernet HWaddr 00:30:48:BE:DA:D4
inet addr:10.1.255.252 Bcast:10.1.255.255 Mask:255.255.0.0
inet6 addr: fe80::230:48ff:febe:dad4/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:66833 errors:0 dropped:0 overruns:0 frame:0
TX packets:17905 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:6323062 (6.0 MiB) TX bytes:2166485 (2.0 MiB)
Memory:febe0000-fec00000
ib0 Link encap:InfiniBand HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
inet addr:10.1.1.31 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::230:48be:dad4:1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
RX packets:52 errors:0 dropped:0 overruns:0 frame:0
TX packets:4 errors:0 dropped:5 overruns:0 carrier:0
collisions:0 txqueuelen:256
RX bytes:4088 (3.9 KiB) TX bytes:352 (352.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:1230 errors:0 dropped:0 overruns:0 frame:0
TX packets:1230 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2055348 (1.9 MiB) TX bytes:2055348 (1.9 MiB)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* RE: rdma_bw fails
[not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-05-17 16:56 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Hefty, Sean @ 2011-05-17 16:56 UTC (permalink / raw)
To: Greg I Kerr, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
> it should work if ib0 is configured.

It looks like your eth0 and ib0 are configured as the same IP subnet.
The routing tables are likely resulting in the rdma_cm trying to use
the wrong device.

> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
> eth0 Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
> inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0

10.1.x.x subnet

> ib0 Link encap:InfiniBand HWaddr
> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0

10.x.x.x subnet

You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.

- Sean
^ permalink raw reply [flat|nested] 4+ messages in thread
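
[Editor's sketch] The diagnosis above can be illustrated with a few lines of Python. This is only a model, not what rdma_cm or the kernel actually executes: it assumes the client (compute-0-2) has just the two connected routes implied by its ifconfig output and that route selection follows longest-prefix match. The helper name pick_interface and the 10.2.1.x host addresses in the "fixed" layout are hypothetical, chosen to match the suggested 10.2.x.x/255.255.0.0 change.

```python
import ipaddress

# Addresses and masks as reported by ifconfig on compute-0-2 (the client).
INTERFACES = {
    "eth0": ipaddress.ip_interface("10.1.255.252/255.255.0.0"),
    "ib0": ipaddress.ip_interface("10.1.1.31/255.0.0.0"),
}


def pick_interface(destination, interfaces):
    """Return the interface whose connected network matches `destination`
    with the longest prefix, or None if no network contains it.

    Assumption: only directly connected routes exist and the most
    specific one wins (a simplified longest-prefix-match model).
    """
    dest = ipaddress.ip_address(destination)
    candidates = [
        (iface.network.prefixlen, name)
        for name, iface in interfaces.items()
        if dest in iface.network
    ]
    if not candidates:
        return None
    # max() compares prefix length first, so the /16 beats the /8.
    return max(candidates)[1]


# The two networks overlap: 10.1.0.0/16 (eth0) lies inside 10.0.0.0/8 (ib0).
print(INTERFACES["eth0"].network.overlaps(INTERFACES["ib0"].network))  # True

# The server's ib0 address is matched by eth0's more specific network.
print(pick_interface("10.1.1.30", INTERFACES))  # eth0

# Hypothetical layout after moving IB to 10.2.x.x with mask 255.255.0.0.
FIXED = {
    "eth0": ipaddress.ip_interface("10.1.255.252/255.255.0.0"),
    "ib0": ipaddress.ip_interface("10.2.1.31/255.255.0.0"),
}
print(pick_interface("10.2.1.30", FIXED))  # ib0
```

Under these assumptions, traffic to 10.1.1.30 leaves through eth0, which would be consistent with the address resolution failing on a non-RDMA device. Comparing against the output of `ip route` on the real nodes would confirm or refute this.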
[parent not found: <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>]
* RE: rdma_bw fails
[not found] ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2011-05-17 17:16 ` ib-x2spCj9RiN0z5UmgcLIfJQ
[not found] ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: ib-x2spCj9RiN0z5UmgcLIfJQ @ 2011-05-17 17:16 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

So... what is the correct way to configure ib? I *think* I have it
configured correctly, but want to verify... Where is this information
online?

I currently have my eth0 addr set, for example, to 100.100.100.1 and ib0
set to 100.100.100.2. I assume that is ok, but I think my bcast and
masks are same.

E

Quoting "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>:

>> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
>> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
>> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
>> it should work if ib0 is configured.
>
> It looks like your eth0 and ib0 are configured as the same IP
> subnet. The routing tables are likely resulting in the rdma_cm
> trying to use the wrong device.
>
>> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
>> eth0 Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
>> inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0
>
> 10.1.x.x subnet
>
>> ib0 Link encap:InfiniBand HWaddr
>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>> inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0
>
> 10.x.x.x subnet
>
> You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.
>
> - Sean
^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>]
* Re: rdma_bw fails
[not found] ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
@ 2011-05-17 17:17 ` Joe Landman
0 siblings, 0 replies; 4+ messages in thread
From: Joe Landman @ 2011-05-17 17:17 UTC (permalink / raw)
To: ib-x2spCj9RiN0z5UmgcLIfJQ; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 05/17/2011 01:16 PM, ib-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org wrote:
> So... what is the correct way to configure ib? I *think* I have it
> configured correctly, but want to verify... Where is this information
> online?
>
> I currently have my eth0 addr set, for example, to 100.100.100.1 and ib0
> set to 100.100.100.2. I assume that is ok, but I think my bcast and
> masks are same.

They shouldn't be on the same subnet in most cases. Keep IB on its own
subnet. What are your subnet masks?

What we usually do when we have a set of mixed nets is something like this:

eth: 10.100.0.0/16
IB: 10.200.0.0/16

this allows us to keep the traffic easily and obviously separate, and
not set up strange routing bits.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman-nyOC7EYE20mM0MU9lROt9DlRY1/6cnIP@public.gmane.org
web : http://scalableinformatics.com
      http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
^ permalink raw reply [flat|nested] 4+ messages in thread
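
[Editor's sketch] One way to sanity-check an addressing plan like the one above before applying it is to test every pair of networks for overlap. The following is a minimal sketch; the function name overlapping_pairs and the plan dictionaries are illustrative assumptions, with the "original" entry derived from the masks shown in the first message's ifconfig output.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(plan):
    """Return the sorted pairs of network names whose address ranges overlap.

    `plan` maps a label (for example "eth" or "ib") to a network in CIDR
    notation. An empty result means every network is disjoint from the rest.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# The layout suggested in the reply above: two disjoint /16 networks.
SUGGESTED = {"eth": "10.100.0.0/16", "ib": "10.200.0.0/16"}

# The layout from the original report: eth0 mask 255.255.0.0, ib0 mask 255.0.0.0.
ORIGINAL = {"eth": "10.1.0.0/16", "ib": "10.0.0.0/8"}

print(overlapping_pairs(SUGGESTED))  # []
print(overlapping_pairs(ORIGINAL))   # [('eth', 'ib')]
```

The check says nothing about routing by itself; it only flags plans where two interfaces could claim the same destination address, which is the situation the replies in this thread advise avoiding.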