* rdma_bw fails
@ 2011-05-17 16:47 Greg I Kerr
[not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Greg I Kerr @ 2011-05-17 16:47 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
After finally fully comprehending libibverbs, I am now trying to
expand my understanding to librdmacm, but it seems I am having some
problems getting connected.
If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
fails with the error: "4390:pp_client_connect: unexpected CM event 1."
event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
it should work if ib0 is configured.
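For readers decoding the number in that message, here is a minimal Python sketch of the lookup. It assumes the first four values of `enum rdma_cm_event_type` as declared in librdmacm's `rdma/rdma_cma.h`; only the address and route resolution events are listed, and the class and function names are hypothetical. In C, librdmacm's `rdma_event_str()` gives a readable name for an event directly.

```python
from enum import IntEnum


class RdmaCmEvent(IntEnum):
    """Partial mirror of enum rdma_cm_event_type (first four values only)."""
    ADDR_RESOLVED = 0
    ADDR_ERROR = 1
    ROUTE_RESOLVED = 2
    ROUTE_ERROR = 3


def describe_event(code: int) -> str:
    """Translate a numeric CM event code into its RDMA_CM_EVENT_* name."""
    try:
        return f"RDMA_CM_EVENT_{RdmaCmEvent(code).name}"
    except ValueError:
        # Codes beyond this partial table are reported as-is.
        return f"unknown CM event {code}"


print(describe_event(1))
```

With the event number from the error above, this prints RDMA_CM_EVENT_ADDR_ERROR, meaning address resolution failed before any connection was attempted.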
Thanks in advance for any help,
Greg Kerr
[kerrg@compute-0-3 rdma]$ rdma_bw -c
4292: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 | iters=1000 | duplex=0 | cma=1 |

[kerrg@compute-0-2 rdma]$ rdma_bw -c 10.1.1.30
4390: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 | iters=1000 | duplex=0 | cma=1 |
4390:pp_client_connect: unexpected CM event 1
Here is the output of /sbin/ifconfig:
[kerrg@compute-0-3 rdma]$ /sbin/ifconfig
eth0      Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
          inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0
          inet6 addr: fe80::230:48ff:febe:d984/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:55999 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12608 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6023763 (5.7 MiB) TX bytes:2459277 (2.3 MiB)
          Memory:febe0000-fec00000

ib0       Link encap:InfiniBand HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0
          inet6 addr: fe80::230:48be:d984:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:42 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:560 (560.0 b) TX bytes:3696 (3.6 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:1253 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1253 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2093229 (1.9 MiB) TX bytes:2093229 (1.9 MiB)
[kerrg@compute-0-2 rdma]$ /sbin/ifconfig
eth0      Link encap:Ethernet HWaddr 00:30:48:BE:DA:D4
          inet addr:10.1.255.252 Bcast:10.1.255.255 Mask:255.255.0.0
          inet6 addr: fe80::230:48ff:febe:dad4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:66833 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17905 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6323062 (6.0 MiB) TX bytes:2166485 (2.0 MiB)
          Memory:febe0000-fec00000

ib0       Link encap:InfiniBand HWaddr 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:10.1.1.31 Bcast:10.255.255.255 Mask:255.0.0.0
          inet6 addr: fe80::230:48be:dad4:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
          RX packets:52 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:4088 (3.9 KiB) TX bytes:352 (352.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:1230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1230 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2055348 (1.9 MiB) TX bytes:2055348 (1.9 MiB)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* RE: rdma_bw fails
[not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-05-17 16:56 ` Hefty, Sean
[not found] ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: Hefty, Sean @ 2011-05-17 16:56 UTC (permalink / raw)
To: Greg I Kerr, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
> it should work if ib0 is configured.
It looks like your eth0 and ib0 are configured on overlapping IP subnets. The routing tables are likely causing the rdma_cm to try to use the wrong device.
> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
> eth0 Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
> inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0
10.1.x.x subnet
> ib0 Link encap:InfiniBand HWaddr
> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0
10.x.x.x subnet
You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.
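To illustrate why overlapping subnets can send the lookup to the wrong device, here is a minimal Python sketch of longest-prefix matching over the connected routes. It is a simplified model, not the kernel's actual route selection (metrics, policy rules and the rdma_cm's own address resolution are ignored). The addresses in `reported` come from the compute-0-2 output above; the 10.2.x.x addresses in `suggested` and the function names are hypothetical.

```python
import ipaddress


def connected_routes(interfaces):
    """Map each device name to the network implied by its address and mask."""
    return {dev: ipaddress.ip_interface(cidr).network
            for dev, cidr in interfaces.items()}


def pick_device(routes, dest):
    """Return the device whose network contains dest with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, dev) for dev, net in routes.items() if addr in net]
    return max(matches)[1] if matches else None


# Client node (compute-0-2) as reported in the thread.
reported = connected_routes({
    "eth0": "10.1.255.252/255.255.0.0",
    "ib0": "10.1.1.31/255.0.0.0",
})

# Hypothetical renumbering of ib0 into 10.2.x.x with mask 255.255.0.0.
suggested = connected_routes({
    "eth0": "10.1.255.252/255.255.0.0",
    "ib0": "10.2.1.31/255.255.0.0",
})

print(reported["eth0"].overlaps(reported["ib0"]))
print(pick_device(reported, "10.1.1.30"))
print(pick_device(suggested, "10.2.1.30"))
```

Under this model the reported setup overlaps and the server's IB address 10.1.1.30 resolves to eth0, because 10.1.0.0/16 is a more specific match than 10.0.0.0/8. With ib0 moved to its own /16, the IB address resolves to ib0.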
- Sean
* RE: rdma_bw fails
[not found] ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2011-05-17 17:16 ` ib-x2spCj9RiN0z5UmgcLIfJQ
[not found] ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
0 siblings, 1 reply; 4+ messages in thread
From: ib-x2spCj9RiN0z5UmgcLIfJQ @ 2011-05-17 17:16 UTC (permalink / raw)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
So... what is the correct way to configure ib? I *think* I have it
configured correctly, but want to verify... Where is this information
online?
I currently have my eth0 addr set, for example, to 100.100.100.1 and
ib0 set to 100.100.100.2. I assume that is ok, but I think my bcast
and masks are the same.
E
Quoting "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>:
>> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
>> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
>> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
>> it should work if ib0 is configured.
>
> It looks like your eth0 and ib0 are configured as the same IP
> subnet. The routing tables are likely resulting in the rdma_cm
> trying to use the wrong device.
>
>> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
>> eth0 Link encap:Ethernet HWaddr 00:30:48:BE:D9:84
>> inet addr:10.1.255.251 Bcast:10.1.255.255 Mask:255.255.0.0
>
> 10.1.x.x subnet
>
>> ib0 Link encap:InfiniBand HWaddr
>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>> inet addr:10.1.1.30 Bcast:10.255.255.255 Mask:255.0.0.0
>
> 10.x.x.x subnet
>
> You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.
>
> - Sean
* Re: rdma_bw fails
[not found] ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
@ 2011-05-17 17:17 ` Joe Landman
0 siblings, 0 replies; 4+ messages in thread
From: Joe Landman @ 2011-05-17 17:17 UTC (permalink / raw)
To: ib-x2spCj9RiN0z5UmgcLIfJQ; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA
On 05/17/2011 01:16 PM, ib-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org wrote:
> So... what is the correct way to configure ib? I *think* I have it
> configured correctly, but want to verify... Where is this information
> online?
>
> I currently have my eth0 addr set, for example, to 100.100.100.1 and ib0
> set to 100.100.100.2. I assume that is ok, but I think my bcast and
> masks are same.
They shouldn't be on the same subnet in most cases. Keep IB on its own
subnet. What are your subnet masks?
What we usually do when we have a set of mixed nets is something like this:

eth: 10.100.0.0/16
IB: 10.200.0.0/16

This allows us to keep the traffic easily and obviously separate,
without setting up strange routing bits.
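An address plan like this can be checked for accidental overlap before it is deployed. The following is a minimal Python sketch using the standard `ipaddress` module; the function name `find_overlaps` and the dictionary layout are hypothetical, and the second plan reuses the addresses from the original report in this thread.

```python
import ipaddress
from itertools import combinations


def find_overlaps(plan):
    """Return every pair of names in the plan whose networks overlap."""
    # strict=False lets an interface address such as 10.1.1.30/8 stand in
    # for its containing network.
    nets = {name: ipaddress.ip_network(cidr, strict=False)
            for name, cidr in plan.items()}
    return [(a, b) for a, b in combinations(nets, 2)
            if nets[a].overlaps(nets[b])]


mixed_plan = {"eth": "10.100.0.0/16", "IB": "10.200.0.0/16"}
original_report = {
    "eth0": "10.1.255.251/255.255.0.0",
    "ib0": "10.1.1.30/255.0.0.0",
}

print(find_overlaps(mixed_plan))
print(find_overlaps(original_report))
```

The separated plan yields an empty list, while the originally reported configuration yields the pair eth0 and ib0.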
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman-nyOC7EYE20mM0MU9lROt9DlRY1/6cnIP@public.gmane.org
web : http://scalableinformatics.com
http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615