public inbox for linux-rdma@vger.kernel.org
* rdma_bw fails
@ 2011-05-17 16:47 Greg I Kerr
       [not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Greg I Kerr @ 2011-05-17 16:47 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

After finally fully comprehending libibverbs, I am now trying to
expand my understanding to librdma_cm, but it seems I am having some
problems getting connected.

If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
fails with the error: "4390:pp_client_connect: unexpected CM event 1."
event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
it should work if ib0 is configured.
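
For reference, the event number in the error message can be mapped back to its name. The following is a minimal Python sketch and is not part of rdma_bw: `RdmaCmEvent` and `describe_event` are hypothetical names, and only the first four values of `enum rdma_cm_event_type` from `<rdma/rdma_cma.h>` are listed.

```python
from enum import IntEnum


class RdmaCmEvent(IntEnum):
    # Partial table: first four values of enum rdma_cm_event_type.
    # Later values are intentionally omitted from this sketch.
    ADDR_RESOLVED = 0
    ADDR_ERROR = 1
    ROUTE_RESOLVED = 2
    ROUTE_ERROR = 3


def describe_event(code: int) -> str:
    """Return the C-style event name, or a note if the code is not in the table."""
    try:
        return f"RDMA_CM_EVENT_{RdmaCmEvent(code).name}"
    except ValueError:
        return f"event {code} (not in this partial table)"


print(describe_event(1))  # RDMA_CM_EVENT_ADDR_ERROR
```

An address error at this stage means the failure happened while resolving the destination IP address to an RDMA device, before any connection was attempted.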

Thanks in advance for any help,

Greg Kerr

[kerrg@compute-0-3 rdma]$ rdma_bw -c
4292: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 |
iters=1000 | duplex=0 | cma=1 |

[kerrg@compute-0-2 rdma]$ rdma_bw -c 10.1.1.30
4390: | port=18515 | ib_port=1 | size=65536 | tx_depth=100 | sl=0 |
iters=1000 | duplex=0 | cma=1 |
4390:pp_client_connect: unexpected CM event 1

Here is the output of /sbin/ifconfig:

[kerrg@compute-0-3 rdma]$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:30:48:BE:D9:84
          inet addr:10.1.255.251  Bcast:10.1.255.255  Mask:255.255.0.0
          inet6 addr: fe80::230:48ff:febe:d984/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:55999 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12608 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6023763 (5.7 MiB)  TX bytes:2459277 (2.3 MiB)
          Memory:febe0000-fec00000

ib0       Link encap:InfiniBand  HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:10.1.1.30  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::230:48be:d984:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:42 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:560 (560.0 b)  TX bytes:3696 (3.6 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1253 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1253 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2093229 (1.9 MiB)  TX bytes:2093229 (1.9 MiB)

[kerrg@compute-0-2 rdma]$ /sbin/ifconfig
eth0      Link encap:Ethernet  HWaddr 00:30:48:BE:DA:D4
          inet addr:10.1.255.252  Bcast:10.1.255.255  Mask:255.255.0.0
          inet6 addr: fe80::230:48ff:febe:dad4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:66833 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17905 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6323062 (6.0 MiB)  TX bytes:2166485 (2.0 MiB)
          Memory:febe0000-fec00000

ib0       Link encap:InfiniBand  HWaddr
80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:10.1.1.31  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::230:48be:dad4:1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:52 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:4088 (3.9 KiB)  TX bytes:352 (352.0 b)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1230 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1230 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2055348 (1.9 MiB)  TX bytes:2055348 (1.9 MiB)
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: rdma_bw fails
       [not found] ` <BANLkTik_42DBsd5SdyDcMvpTaw9+dXTr_w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2011-05-17 16:56   ` Hefty, Sean
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Hefty, Sean @ 2011-05-17 16:56 UTC (permalink / raw)
  To: Greg I Kerr, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
> it should work if ib0 is configured.

It looks like your eth0 and ib0 are configured on overlapping IP subnets.  The routing tables are likely causing rdma_cm to try to use the wrong device.
 
> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:30:48:BE:D9:84
>           inet addr:10.1.255.251  Bcast:10.1.255.255  Mask:255.255.0.0

10.1.x.x subnet

> ib0       Link encap:InfiniBand  HWaddr
> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>           inet addr:10.1.1.30  Bcast:10.255.255.255  Mask:255.0.0.0

10.x.x.x subnet

You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.
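
One way to see why the lookup can pick the wrong device: with these masks, the destination 10.1.1.30 falls inside both the eth0 network and the ib0 network, and the more specific prefix normally wins. The sketch below models only longest-prefix matching on the connected routes of compute-0-2. It ignores metrics and policy routing, so treat it as an illustration rather than an exact description of the kernel's behavior. `connected_routes` and `pick_interface` are hypothetical helpers, and 10.2.1.30 and 10.2.1.31 are made-up addresses in the suggested 10.2.x.x subnet.

```python
import ipaddress


def connected_routes(interfaces):
    """Map interface name -> network derived from 'address/netmask'."""
    return {name: ipaddress.ip_interface(cidr).network
            for name, cidr in interfaces.items()}


def pick_interface(dest, routes):
    """Return the interface whose network matches dest with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [(net.prefixlen, name)
               for name, net in routes.items() if addr in net]
    return max(matches)[1] if matches else None


# Addresses and masks as shown by ifconfig on compute-0-2.
current = connected_routes({
    "eth0": "10.1.255.252/255.255.0.0",
    "ib0": "10.1.1.31/255.0.0.0",
})
print(pick_interface("10.1.1.30", current))    # eth0

# Assumed layout after moving IB to 10.2.x.x with mask 255.255.0.0.
suggested = connected_routes({
    "eth0": "10.1.255.252/255.255.0.0",
    "ib0": "10.2.1.31/255.255.0.0",
})
print(pick_interface("10.2.1.30", suggested))  # ib0
```

In the current layout the /16 on eth0 is more specific than the /8 on ib0, so traffic to 10.1.1.30 goes out over Ethernet. In the suggested layout only ib0 matches.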

- Sean

* RE: rdma_bw fails
       [not found]     ` <1828884A29C6694DAF28B7E6B8A82373F8C7-P5GAC/sN6hmkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2011-05-17 17:16       ` ib-x2spCj9RiN0z5UmgcLIfJQ
       [not found]         ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: ib-x2spCj9RiN0z5UmgcLIfJQ @ 2011-05-17 17:16 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

So... what is the correct way to configure IB? I *think* I have it
configured correctly, but want to verify... Where is this information
online?

I currently have my eth0 addr set, for example, to 100.100.100.1 and
ib0 set to 100.100.100.2. I assume that is OK, but I think my bcast
and masks are the same.

E

Quoting "Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>:

>> If I run rdma_bw on two nodes with the -c option (use rdma_cm) it
>> fails with the error: "4390:pp_client_connect: unexpected CM event 1."
>> event 1 is RDMA_CM_EVENT_ADDR_ERROR. I was under the impression that
>> it should work if ib0 is configured.
>
> It looks like your eth0 and ib0 are configured on overlapping IP
> subnets.  The routing tables are likely causing rdma_cm to try to
> use the wrong device.
>
>> [kerrg@compute-0-3 rdma]$ /sbin/ifconfig
>> eth0      Link encap:Ethernet  HWaddr 00:30:48:BE:D9:84
>>           inet addr:10.1.255.251  Bcast:10.1.255.255  Mask:255.255.0.0
>
> 10.1.x.x subnet
>
>> ib0       Link encap:InfiniBand  HWaddr
>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>>           inet addr:10.1.1.30  Bcast:10.255.255.255  Mask:255.0.0.0
>
> 10.x.x.x subnet
>
> You can try changing IB to a 10.2.x.x subnet with mask 255.255.0.0.
>
> - Sean

* Re: rdma_bw fails
       [not found]         ` <20110517111606.3ohp6l6z34gckcoo-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org>
@ 2011-05-17 17:17           ` Joe Landman
  0 siblings, 0 replies; 4+ messages in thread
From: Joe Landman @ 2011-05-17 17:17 UTC (permalink / raw)
  To: ib-x2spCj9RiN0z5UmgcLIfJQ; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 05/17/2011 01:16 PM, ib-x2spCj9RiN0z5UmgcLIfJQ@public.gmane.org wrote:
> So... what is the correct way to configure IB? I *think* I have it
> configured correctly, but want to verify... Where is this information
> online?
>
> I currently have my eth0 addr set, for example, to 100.100.100.1 and ib0
> set to 100.100.100.2. I assume that is OK, but I think my bcast and
> masks are the same.

They shouldn't be on the same subnet in most cases.  Keep IB on its own 
subnet.  What are your subnet masks?

What we usually do when we have a set of mixed nets is something like this:

eth:  10.100.0.0/16
IB:   10.200.0.0/16

This lets us keep the traffic easily and obviously separate, and
avoids setting up strange routing bits.
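
A quick way to check a planned layout is to test whether the two interface networks overlap. This is a small Python sketch using the standard `ipaddress` module. `networks_overlap` is a hypothetical helper, and the host parts (.0.1) in the first call are made up for illustration.

```python
import ipaddress


def networks_overlap(if_a, if_b):
    """True if the networks implied by two 'address/mask' strings overlap."""
    net_a = ipaddress.ip_interface(if_a).network
    net_b = ipaddress.ip_interface(if_b).network
    return net_a.overlaps(net_b)


# Separate /16 networks for Ethernet and IB, as in the scheme above.
print(networks_overlap("10.100.0.1/16", "10.200.0.1/16"))                  # False

# eth0 and ib0 from the original report on compute-0-3.
print(networks_overlap("10.1.255.251/255.255.0.0", "10.1.1.30/255.0.0.0"))  # True
```

The same check applies to any pair of addresses once the real masks are known. Two addresses such as 100.100.100.1 and 100.100.100.2 will overlap under any common mask.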


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman-nyOC7EYE20mM0MU9lROt9DlRY1/6cnIP@public.gmane.org
web  : http://scalableinformatics.com
        http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
