public inbox for linux-rdma@vger.kernel.org
* some dapl assistance
@ 2010-07-07  9:10 Or Gerlitz
       [not found] ` <4C344493.2030600-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Or Gerlitz @ 2010-07-07  9:10 UTC (permalink / raw)
  To: Davis, Arlin R; +Cc: Itay Berman, linux-rdma

Itay Berman wrote:
> The Intel MPI forum admin says that we used the proper way to invoke the dapl debug.
> He suggested that there might be something wrong with the dapl build (though I tried
> running dapltest on other servers with other dapl versions and got the same error).

hi Arlin,

While assisting a colleague who is working with Intel MPI / uDAPL, using dapl-2.0.29-1 
and Intel MPI 4.0.0p-027, I couldn't get either of 

1. seeing dapl debug prints when running under Intel MPI
2. any basic dapltest run

to work. Could you help here? See more details below.

Or.

1. dapl prints under mpi

> # /opt/intel/impi/4.0.0.027/intel64/bin/mpiexec -ppn 1 -n 2 -env DAPL_DBG_TYPE 0xff -env DAPL_DBG_DEST 0x3  -env I_MPI_DEBUG 3 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env I_MPI_FABRICS dapl:dapl /tmp/osu
> dodly4:10887: dapl_init: dbg_type=0xff,dbg_dest=0x3
> dodly4:10887:  open_hca: device mthca0 not found
> dodly4:10887:  open_hca: device mthca0 not found
> dodly0:11583: dapl_init: dbg_type=0xff,dbg_dest=0x3
> [1] MPI startup(): DAPL provider OpenIB-mlx4_0-1
> [1] MPI startup(): dapl data transfer mode
> [0] MPI startup(): DAPL provider OpenIB-mthca0-1
> [0] MPI startup(): dapl data transfer mode
> [0] MPI startup(): static connections storm algo
> [0] Rank    Pid      Node name
> [0] 0       11583    dodly0
> [0] 1       10887    dodly4
> # OSU MPI Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> 1                         0.42
> 2                         0.85

What needs to be done so that the dapl debug prints are seen either in the system log or in the standard output/error of the MPI rank?


You can see here that on this node (dodly0) the "OpenIB-mthca0-1" provider is used, but later, when I try it with dapltest (next bullet), I can't get DAT to open/work with it.

2. dapltest

> # DAT_DBG_TYPE=0x3 dapltest -T S -D OpenIB-mthca0-1
> DAT Registry: Started (dat_init)
> DAT Registry: using config file /etc/dat.conf
> DT_cs_Server: Could not open OpenIB-mthca0-1 (DAT_PROVIDER_NOT_FOUND DAT_NAME_NOT_REGISTERED)
> DT_cs_Server (OpenIB-mthca0-1):  Exiting.
> DAT Registry: Stopped (dat_fini)
> # DAT_DBG_TYPE=0x3 dapltest -T S -D OpenIB-mthca0-1u
> DAT Registry: Started (dat_init)
> DAT Registry: using config file /etc/dat.conf
> DT_cs_Server: Could not open OpenIB-mthca0-1u (DAT_PROVIDER_NOT_FOUND DAT_NAME_NOT_REGISTERED)
> DT_cs_Server (OpenIB-mthca0-1u):  Exiting.
> DAT Registry: Stopped (dat_fini)
> # ibv_devinfo
> hca_id: mthca0
>         transport:                      InfiniBand (0)
>         fw_ver:                         5.0.1
>         node_guid:                      0002:c902:0020:13d0
>         sys_image_guid:                 0002:c902:0020:13d3
>         vendor_id:                      0x02c9
>         vendor_part_id:                 25218
>         hw_ver:                         0xA0
>         board_id:                       MT_0150000001
>         phys_port_cnt:                  2
>                 port:   1
>                         state:                  PORT_ACTIVE (4)
>                         max_mtu:                2048 (4)
[...]

> # rpm -qav | grep -E "intel-mpi|dapl"
> intel-mpi-rt-em64t-4.0.0p-027
> dapl-utils-2.0.29-1
> intel-mpi-em64t-4.0.0p-027
> dapl-devel-2.0.29-1
> compat-dapl-devel-1.2.15-1
> compat-dapl-1.2.15-1
> dapl-debuginfo-2.0.29-1
> dapl-2.0.29-1


I don't think the problem is with the compat-dapl package, as it doesn't ship any dat.conf file.
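
For what it's worth, the provider names dapltest accepts are exactly the first fields of the non-comment lines in /etc/dat.conf. A quick way to list them (my illustration, using an echoed sample line in the OFED dat.conf format as a stand-in for the real file):

```shell
# Print provider name (field 1) and provider library (field 5) from
# dat.conf; the echoed sample line stands in for /etc/dat.conf here.
echo 'OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "mthca0 1" ""' \
    | awk '!/^#/ {print $1, $5}'
# On a real system: awk '!/^#/ {print $1, $5}' /etc/dat.conf
```

If "OpenIB-mthca0-1" does not show up in that list, the DAT registry can only return DAT_NAME_NOT_REGISTERED for it.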
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: some dapl assistance
       [not found] ` <4C344493.2030600-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-07-07 17:00   ` Davis, Arlin R
       [not found]     ` <E3280858FA94444CA49D2BA02341C983010435A1A8-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-07 17:00 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Itay Berman, linux-rdma

Or, 

>What needs to be done such that the dapl debug prints be seen 
>either in the system log or the standard output/error of the mpi rank?

There is limited debug in the non-debug builds. If you 
want full debugging capabilities you can install the
source RPM and configure and make as follows (OFED target example):

./configure --enable-debug --prefix /usr --sysconf=/etc --libdir /usr/lib64 LDFLAGS=-L/usr/lib64 CPPFLAGS="-I/usr/include"
make install

Debug log categories can be set with the environment variable DAPL_DBG_TYPE (default=1):

typedef enum
{
    DAPL_DBG_TYPE_ERR		= 0x0001,
    DAPL_DBG_TYPE_WARN	  	= 0x0002,
    DAPL_DBG_TYPE_EVD	  	= 0x0004,
    DAPL_DBG_TYPE_CM		= 0x0008,
    DAPL_DBG_TYPE_EP		= 0x0010,
    DAPL_DBG_TYPE_UTIL	  	= 0x0020,
    DAPL_DBG_TYPE_CALLBACK	= 0x0040,
    DAPL_DBG_TYPE_DTO_COMP_ERR = 0x0080,
    DAPL_DBG_TYPE_API	  	= 0x0100,
    DAPL_DBG_TYPE_RTN	  	= 0x0200,
    DAPL_DBG_TYPE_EXCEPTION	= 0x0400,
    DAPL_DBG_TYPE_SRQ		= 0x0800,
    DAPL_DBG_TYPE_CNTR  	= 0x1000,
    DAPL_DBG_TYPE_CM_LIST  	= 0x2000,
    DAPL_DBG_TYPE_THREAD  	= 0x4000

} DAPL_DBG_TYPE;

The output location can be set with DAPL_DBG_DEST as follows (default=1):

typedef enum
{
    DAPL_DBG_DEST_STDOUT  	= 0x0001,
    DAPL_DBG_DEST_SYSLOG  	= 0x0002,
} DAPL_DBG_DEST;
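
Since both values are bit masks, categories and destinations combine by OR-ing the flags; for example (my arithmetic, not from the original mail):

```shell
# Combine debug categories: ERR (0x0001) | WARN (0x0002) | CM (0x0008)
printf 'DAPL_DBG_TYPE=0x%04x\n' $(( 0x0001 | 0x0002 | 0x0008 ))   # 0x000b
# Send output to both stdout (0x0001) and syslog (0x0002)
printf 'DAPL_DBG_DEST=0x%x\n' $(( 0x0001 | 0x0002 ))              # 0x3
```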

Log messages are prefixed with hostname:process_id as follows,
and by default are sent to the stdout of the mpiexec node:

cstnh-9:4834:  query_hca: mlx4_0 192.168.0.109
cstnh-9:4834:  query_hca: port.link_layer = 0x1
cstnh-9:4834:  query_hca: (b0.0) eps 260032, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 0 p_idx 0 sl 1

>
>You can see here that on this node (dodly0), the 
>"OpenIB-mthca0-1" is used, but later when I try it with 
>dapltest (next bullet), I can't get dat to open/work with it.
>
>2. dapltest
>
>> # DAT_DBG_TYPE=0x3 dapltest -T S -D OpenIB-mthca0-1

Intel MPI will pick up the appropriate v1.2 or v2.0 libdat
and libdapl provider libraries depending on your device
selection. However, when using dapltest you have to use 
the appropriate binary that links to the v1.2 library. 

If you are using v1.2 compat library providers (OpenIB-*)
you need to use the compat-dapl tests (dapltest1, dtest1, etc)
that come with the v1.2 package. 

[root@cstnh-10]# rpm -qpl compat-dapl-utils-1.2.16-1.x86_64.rpm
/usr/bin/dapltest1
/usr/bin/dtest1
/usr/share/man/man1/dapltest1.1.gz
/usr/share/man/man1/dtest1.1.gz
/usr/share/man/man5/dat.conf.5.gz

Try the following:

# dapltest1 -T S -D OpenIB-mthca0-1
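
One quick sanity check (my suggestion, not part of Arlin's instructions) is to confirm which DAT library each test binary actually links:

```shell
# dapltest (2.0 package) should show libdat2.so.2, while dapltest1
# (compat-dapl-utils) should show the v1.2 libdat.so.1.
for t in dapltest dapltest1; do
    bin=$(command -v "$t" 2>/dev/null) || { echo "$t: not installed"; continue; }
    printf '%s: ' "$t"
    ldd "$bin" | grep -o 'libdat[^ ]*' | sort -u | tr '\n' ' '
    echo
done
```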

Sorry for any confusion.

-arlin









* Re: some dapl assistance
       [not found]     ` <E3280858FA94444CA49D2BA02341C983010435A1A8-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-07-13 11:41       ` Or Gerlitz
       [not found]         ` <4C3C50CC.7000508-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Or Gerlitz @ 2010-07-13 11:41 UTC (permalink / raw)
  To: Davis, Arlin R; +Cc: Itay Berman, linux-rdma

Davis, Arlin R wrote:
> There is limited debug in the non-debug builds. If you want full debugging capabilities
> you can install the source RPM and configure and make as follows [..] (OFED target example):

Okay, got that. Once I built the sources by hand as you suggested I could see the debug prints,
but things didn't really work, so I stepped back and installed the latest rpms - dapl-2.0.29-1
and compat-dapl-1.2.18-1. Now I can't get Intel MPI to run:

> [root@dodly0 ~]# rpm -qav | grep dapl
> dapl-utils-2.0.29-1
> dapl-2.0.29-1
> compat-dapl-1.2.18-1

> [root@dodly0 ~]# ldconfig -p | grep libdat
>         libdat2.so.2 (libc6,x86-64) => /usr/lib64/libdat2.so.2
>         libdat.so.1 (libc6,x86-64) => /usr/lib64/libdat.so.1

> [root@dodly0 ~]# rpm -qf /usr/lib64/libdat.so.1
> compat-dapl-1.2.18-1
> [root@dodly0 ~]# rpm -qf /usr/lib64/libdat2.so.2
> dapl-2.0.29-1

> [root@dodly0 ~]# /opt/intel/impi/4.0.0.027/intel64/bin/mpiexec -ppn 1 -n 2  -env DAPL_IB_PKEY 0x8002 -env DAPL_DBG_TYPE 0xff -env DAPL_DBG_DEST 0x3  -env I_MPI_DEBUG 3 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env I_MPI_FABRICS dapl:dapl /tmp/osu
> [0] MPI startup(): cannot open dynamic library libdat.so
> [1] MPI startup(): cannot open dynamic library libdat.so
> [0] MPI startup(): cannot open dynamic library libdat2.so
> [0] dapl fabric is not available and fallback fabric is not enabled
> [1] MPI startup(): cannot open dynamic library libdat2.so
> [1] dapl fabric is not available and fallback fabric is not enabled
> rank 1 in job 5  dodly0_54941   caused collective abort of all ranks
>   exit status of rank 1: return code 254
> rank 0 in job 5  dodly0_54941   caused collective abort of all ranks
>   exit status of rank 0: return code 254

Any idea what we're doing wrong?

BTW - before things stopped working, exporting LD_DEBUG=libs to the MPI rank 
showed that it used the compat-1.2 rpm ...
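
For the record, that LD_DEBUG check looks like this (my illustration; APP is a stand-in for the rank binary, e.g. /tmp/osu):

```shell
# LD_DEBUG=libs makes the glibc dynamic loader print every library it
# searches for and loads; filtering on libdat shows whether the process
# picked up libdat.so.1 (compat v1.2) or libdat2.so.2 (v2.0).
APP=${APP:-/bin/true}    # stand-in; point at the MPI rank binary
LD_DEBUG=libs "$APP" 2>&1 | grep libdat || echo "no libdat loaded"
```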

Now I can run dapltest fine:
> [root@dodly0 ~]# dapltest -T S -D ofa-v2-mthca0-1
> Dapltest: Service Point Ready - ofa-v2-mthca0-1
> Dapltest: Service Point Ready - ofa-v2-mthca0-1
> Server: Transaction Test Finished for this client

> [root@dodly4 ~]# dapltest -T T -D ofa-v2-mlx4_0-1 -s dodly0 -i 1000 server SR 65536 4 client SR 65536 4
> Server Name: dodly0
> Server Net Address: 172.30.3.230
> DT_cs_Client: Starting Test ...
> ----- Stats ---- : 1 threads, 1 EPs
> Total WQE        :    2919.70 WQE/Sec
> Total Time       :       0.68 sec
> Total Send       :     262.14 MB -     382.69 MB/Sec
> Total Recv       :     262.14 MB -     382.69 MB/Sec
> Total RDMA Read  :       0.00 MB -       0.00 MB/Sec
> Total RDMA Write :       0.00 MB -       0.00 MB/Sec
> DT_cs_Client: ========== End of Work -- Client Exiting

I also noted that dapl-utils and compat-dapl-utils are mutually exclusive, as both 
attempt to install the same man page for dat.conf:
> # rpm -Uvh /usr/src/redhat/RPMS/x86_64/compat-dapl-utils-1.2.18-1.x86_64.rpm
> Preparing...                ########################################### [100%]
>         file /usr/share/man/man5/dat.conf.5.gz from install of compat-dapl-utils-1.2.18-1.x86_64 conflicts with file from package dapl-utils-2.0.29-1.x86_64

Or.
Or.


* RE: some dapl assistance
       [not found]         ` <4C3C50CC.7000508-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-07-13 16:18           ` Davis, Arlin R
       [not found]             ` <E3280858FA94444CA49D2BA02341C9830104493030-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-13 16:18 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Itay Berman, linux-rdma

Sorry, Intel MPI requires the development packages, which include libdat.so and libdat2.so.

Please see the install instructions on http://www.openfabrics.org/downloads/dapl/

---

For 1.2 and 2.0 support on the same system, including development, install the RPM packages as follows: 

dapl-2.0.29-1 
dapl-utils-2.0.29-1 
dapl-devel-2.0.29-1      <<<<
dapl-debuginfo-2.0.29-1 
compat-dapl-1.2.18-1 
compat-dapl-devel-1.2.18-1  <<<<

---

Thanks for the heads-up on the dat.conf man page. I will fix the conflict in the next release.

-arlin



* RE: some dapl assistance
       [not found]             ` <E3280858FA94444CA49D2BA02341C9830104493030-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-07-15 14:29               ` Itay Berman
       [not found]                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9894-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Itay Berman @ 2010-07-15 14:29 UTC (permalink / raw)
  To: Davis, Arlin R, Or Gerlitz; +Cc: linux-rdma

[-- Attachment #1: Type: text/plain, Size: 4800 bytes --]

Hello Arlin,

I am Or's colleague, whom he is assisting with this matter.

OK, we got Intel MPI to run. To test the pkey handling we configured it to run over a pkey that is not configured on the node. In this case MPI should have failed, but it didn't.
The dapl debug output reports the given pkey (0x8001 = 32769).
How can that be?

See attached the output of the different MPI runs. I believe the devices are the correct ones (ofa-v2*).
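
As a side note (my arithmetic, not from the original mail): 32769 is simply 0x8001 in decimal, and bit 15 (0x8000) of a pkey is only the membership bit, so 0x8001 and the configured 0x8002 really are different partition keys:

```shell
# The pkey dapl prints, in decimal:
echo $(( 0x8001 ))                                    # 32769
# Strip the membership bit (0x8000) to compare the base keys:
printf '0x%04x vs 0x%04x\n' $(( 0x8001 & 0x7fff )) $(( 0x8002 & 0x7fff ))
```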

Itay
   


[-- Attachment #2: impi.txt --]
[-- Type: text/plain, Size: 12703 bytes --]


[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 2 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 0xffff /tmp/osu
dodly0:47ba: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly4:e32: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:47ba:  open_hca: device mlx4_0 not found
dodly0:47ba:  open_hca: device mlx4_0 not found
dodly0:47ba:  query_hca: port.link_layer = 0x1
dodly0:47ba:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly0:47ba:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:47ba:  query_hca: port.link_layer = 0x1
dodly0:47ba:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly0:47ba:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:47ba:  query_hca: port.link_layer = 0x1
dodly0:47ba:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly0:47ba:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:47ba:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[0] MPI startup(): dapl data transfer mode
dodly4:e32:  query_hca: port.link_layer = 0x1
dodly4:e32:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly4:e32:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:e32:  query_hca: port.link_layer = 0x1
dodly4:e32:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly4:e32:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:e32:  query_hca: port.link_layer = 0x1
dodly4:e32:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 0 p_idx 0 sl 0
dodly4:e32:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:e32:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): dapl data transfer mode
[0] MPI startup(): static connections storm algo
dodly0:47ba:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=19 ret=1, evnts=0x4
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=19 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=17 ret=1, evnts=0x1
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
1                         0.42
2                         0.85
4                         1.70
8                         3.38
16                        6.75
32                       13.45
64                       26.66
128                      52.43
256                     102.41
512                     196.05
1024                    350.80
2048                    559.92
4096                    682.33
8192                    748.72
16384                   786.83
32768                   674.08
65536                   795.84
131072                  878.78
262144                  927.75
524288                  949.61
1048576                 965.51
2097152                 974.14
4194304                 978.64
dodly0:47ba:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:47ba:  CM FREE: 0x13c939c0 ep=0x13c80b60 st=CM_FREE sck=19 refs=4
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba: dapl_ep_free: Free CM: EP=0x13c80b60 CM=0x13c939c0
dodly4:e32:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:e32:  CM FREE: 0x2f08f70 ep=0x2f09370 st=CM_FREE sck=17 refs=4
dodly0:47ba:  cm_free: cm 0x13c939c0 CM_FREE ep 0x13c80b60 refs=1
dodly4:e32: dapl_ep_free: Free CM: EP=0x2f09370 CM=0x2f08f70
dodly4:e32:  cm_free: cm 0x2f08f70 CM_FREE ep 0x2f09370 refs=1
dodly4:e32:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:e32:  CM FREE: 0x2f19ce0 ep=(nil) st=CM_FREE sck=13 refs=3
dodly0:47ba: dodly4:e32:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:e32:  dapl_poll: fd=15 ret=0, evnts=0x0
 dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:47ba:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:47ba:  CM FREE: 0x13c7f940 ep=(nil) st=CM_FREE sck=14 refs=3



[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 2 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 0xffff /tmp/osu
dodly0:3b37: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly4:237: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:3b37:  open_hca: device mlx4_0 not found
dodly0:3b37:  open_hca: device mlx4_0 not found
dodly0:3b37:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly0:3b37:  query_hca: port.link_layer = 0x1
dodly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly0:3b37:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:3b37:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly0:3b37:  query_hca: port.link_layer = 0x1
dodly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly0:3b37:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:3b37:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly0:3b37:  query_hca: port.link_layer = 0x1
dodly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly0:3b37:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly4:237:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly4:237:  query_hca: port.link_layer = 0x1
dodly4:237:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly4:237:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[0] MPI startup(): dapl data transfer mode
dodly4:237:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly4:237:  query_hca: port.link_layer = 0x1
dodly4:237:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly4:237:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:237:  Warning: new pkey(32769), query (Success) err or key !found, using defaults
dodly4:237:  query_hca: port.link_layer = 0x1
dodly4:237:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32769 p_idx 0 sl 0
dodly4:237:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): dapl data transfer mode
[0] MPI startup(): static connections storm algo
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=19 ret=1, evnts=0x4
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=19 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=17 ret=1, evnts=0x1
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
1                         0.42
2                         0.85
4                         1.69
8                         3.37
16                        6.74
32                       13.45
64                       26.70
128                      52.45
256                     102.12
512                     195.68
1024                    349.75
2048                    555.98
4096                    681.94
8192                    747.29
16384                   785.72
32768                   675.27
65536                   797.38
131072                  879.17
262144                  928.16
524288                  949.20
1048576                 965.38
2097152                 974.11
4194304                 978.56
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:3b37:  CM FREE: 0x1f2d4b40 ep=0x1f2c1b90 st=CM_FREE sck=19 refs=4
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:237:  CM FREE: 0x170f3f70 ep=0x170f4370 st=CM_FREE sck=17 refs=4
dodly0:3b37: dapl_ep_free: Free CM: EP=0x1f2c1b90 CM=0x1f2d4b40
dodly0:3b37:  cm_free: cm 0x1f2d4b40 CM_FREE ep 0x1f2c1b90 refs=1
dodly4:237: dapl_ep_free: Free CM: EP=0x170f4370 CM=0x170f3f70
dodly4:237:  cm_free: cm 0x170f3f70 CM_FREE ep 0x170f4370 refs=1
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:237:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:237:  CM FREE: 0x17104d90 ep=(nil) st=CM_FREE sck=13 refs=3
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:3b37:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:3b37:  CM FREE: 0x1f2c09f0 ep=(nil) st=CM_FREE sck=14 refs=3


[root@dodly0 compat-dapl-1.2.18]# smpquery PKeyTable 4
   0: 0xffff 0x8002 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
   8: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  16: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  24: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  32: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  40: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  48: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
  56: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
64 pkeys capacity for this port
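(Aside: in a pkey, the top bit encodes full/limited membership and the low 15 bits the base key, so the 0x8002 shown above is full-membership base key 0x0002, and the 0x8001 = 32769 reported earlier would be base key 0x0001. A quick bash check of that split, plain bit arithmetic only:)

```shell
#!/bin/bash
# Split a pkey into its 15-bit base key and its membership bit.
pkey=0x8002
printf 'base=0x%04x full_member=%d\n' $(( pkey & 0x7fff )) $(( (pkey >> 15) & 1 ))
# prints: base=0x0002 full_member=1
```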



^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: some dapl assistance
       [not found]                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9894-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
@ 2010-07-15 15:56                   ` Davis, Arlin R
       [not found]                     ` <E3280858FA94444CA49D2BA02341C983010458ECD5-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-15 15:56 UTC (permalink / raw)
  To: Itay Berman, Or Gerlitz; +Cc: linux-rdma


>OK, we got Intel MPI to run. To test the pkey usage we 
>configured it to run over pkey that is not configured on the 
>node. In this case the MPI should have failed, but it didn't.
>The dapl debug reports the given pkey (0x8001 = 32769).
>How can that be?
>

Itay,

If the pkey override is not valid, it uses the default index of 0 and ignores the given pkey value.

Notice the Warning message:

dodly0:3b37:  Warning: new pkey(32769), query (Success) err or key !found, using defaults 
dodly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0

When you override with the correct value of 0x8002, does it move to p_idx=1 and work?
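(For reference, the table scan this fallback amounts to can be sketched against sysfs; the paths and the fall-back-to-index-0 behavior are assumptions inferred from this thread, not taken from the dapl source:)

```shell
#!/bin/bash
# Sketch: find the pkey table index for a requested pkey, falling back to
# index 0 when it is not in the table -- mirroring the "using defaults"
# warning above. Not dapl code; behavior assumed from the thread.
dev=${1:-mthca0} port=${2:-1} want=${3:-0x8002}
idx=0
for f in /sys/class/infiniband/"$dev"/ports/"$port"/pkeys/*; do
    [ -e "$f" ] || continue        # glob may not expand on hosts without IB
    if [ $(( $(cat "$f") )) -eq $(( want )) ]; then
        idx=$(basename "$f")
        break
    fi
done
echo "using p_idx=$idx"
```

On a host without the device (or without the key in its table) this prints "using p_idx=0", which is exactly the silent fallback being discussed.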

-arlin



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* RE: some dapl assistance
       [not found]                     ` <E3280858FA94444CA49D2BA02341C983010458ECD5-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-07-15 17:18                       ` Itay Berman
       [not found]                         ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B98F3-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Itay Berman @ 2010-07-15 17:18 UTC (permalink / raw)
  To: Davis, Arlin R, Or Gerlitz; +Cc: linux-rdma

No. Same warning:

[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 2 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 0xffff -env DAPL_IB_PKEY 0x8002 /tmp/osu
dodly4:625b: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:2c17: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:2c17:  open_hca: device mlx4_0 not found
dodly0:2c17:  open_hca: device mlx4_0 not found
dodly4:625b:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly4:625b:  query_hca: port.link_layer = 0x1
dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly0:2c17:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly0:2c17:  query_hca: port.link_layer = 0x1
dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:2c17:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly0:2c17:  query_hca: port.link_layer = 0x1
dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly4:625b:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly4:625b:  query_hca: port.link_layer = 0x1
dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly0:2c17:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly0:2c17:  query_hca: port.link_layer = 0x1
dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:2c17:  dapl_poll: fd=17 ret=1, evnts=0x1
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
dodly0:2c17:  dapl_poll: fd=17 ret=0, evnts=0x0
[0] MPI startup(): dapl data transfer mode
dodly0:2c17:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly4:625b:  Warning: new pkey(32770), query (Success) err or key !found, using defaults
dodly4:625b:  query_hca: port.link_layer = 0x1
dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295

[root@dodly0 compat-dapl-1.2.18]# cat /sys/class/infiniband/mthca0/ports/1/pkeys/1
0x8002

-----Original Message-----
From: Davis, Arlin R [mailto:arlin.r.davis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org] 
Sent: Thursday, July 15, 2010 18:56
To: Itay Berman; Or Gerlitz
Cc: linux-rdma
Subject: RE: some dapl assistance


>OK, we got Intel MPI to run. To test the pkey usage we 
>configured it to run over pkey that is not configured on the 
>node. In this case the MPI should have failed, but it didn't.
>The dapl debug reports the given pkey (0x8001 = 32769).
>How can that be?
>

Itay,

If the pkey override is not valid it uses default idx of 0 and ignores pkey value given. 

Notice the Warning message:

dodly0:3b37:  Warning: new pkey(32769), query (Success) err or key !found, using defaults 
dodly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0

When you override with a correct value of 8002 does it move to p_idx=1 and work?

-arlin





* RE: some dapl assistance
       [not found]                         ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B98F3-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
@ 2010-07-15 18:16                           ` Davis, Arlin R
  2010-07-15 18:38                           ` Davis, Arlin R
  1 sibling, 0 replies; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-15 18:16 UTC (permalink / raw)
  To: Itay Berman, Or Gerlitz; +Cc: linux-rdma

 
>No:
>No. Same warning:
>
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>
>[root@dodly0 compat-dapl-1.2.18]# cat 
>/sys/class/infiniband/mthca0/ports/1/pkeys/1
>0x8002

Sorry, I only have mlx4 adapters. 
Can you check ibv_devinfo -v and look for max_pkeys?
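(Besides ibv_devinfo, the populated table entries can be listed straight from sysfs on each node; this uses the sysfs layout already shown in this thread and simply prints a zero count on hosts without an IB device:)

```shell
#!/bin/bash
# List every non-zero pkey table entry for every local IB device/port.
n=0
for f in /sys/class/infiniband/*/ports/*/pkeys/*; do
    [ -e "$f" ] || continue        # glob may not expand on hosts without IB
    v=$(cat "$f")
    if [ "$v" != "0x0000" ]; then
        echo "${f#/sys/class/infiniband/} = $v"
        n=$((n+1))
    fi
done
echo "$n non-default pkey entries found"
```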

Thanks,

-arlin





* RE: some dapl assistance
       [not found]                         ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B98F3-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
  2010-07-15 18:16                           ` Davis, Arlin R
@ 2010-07-15 18:38                           ` Davis, Arlin R
       [not found]                             ` <E3280858FA94444CA49D2BA02341C983010458EF5E-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  1 sibling, 1 reply; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-15 18:38 UTC (permalink / raw)
  To: Itay Berman, Or Gerlitz; +Cc: linux-rdma

Itay,

Can you add "-env I_MPI_DAPL_PROVIDER ofa-v2-mthca0-1" to your mpiexec options
to make sure we pick up the correct v2 provider with pkey support? Also bump
up I_MPI_DEBUG to 5 so I can see the provider selection from MPI output.

Thanks,

-arlin

>-----Original Message-----
>From: Itay Berman [mailto:itayb-smomgflXvOZWk0Htik3J/w@public.gmane.org] 
>Sent: Thursday, July 15, 2010 10:19 AM
>To: Davis, Arlin R; Or Gerlitz
>Cc: linux-rdma
>Subject: RE: some dapl assistance
>
>No:
>No. Same warning:
>
>[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env 
>I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 2 -env 
>I_MPI_CHECK_DAPL_PROVIDER_MISMADAPL_DBG_TYPE 0xffff -env 
>DAPL_IB_PKEY 0x8002 /tmp/osu
>dodly4:625b: dapl_init: dbg_type=0xffff,dbg_dest=0x1
>dodly0:2c17: dapl_init: dbg_type=0xffff,dbg_dest=0x1
>dodly0:2c17:  open_hca: device mlx4_0 not found
>dodly0:2c17:  open_hca: device mlx4_0 not found
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  dapl_poll: fd=17 ret=1, evnts=0x1
>[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
>dodly0:2c17:  dapl_poll: fd=17 ret=0, evnts=0x0
>[0] MPI startup(): dapl data transfer mode
>dodly0:2c17:  dapl_poll: fd=14 ret=0, evnts=0x0
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>
>[root@dodly0 compat-dapl-1.2.18]# cat 
>/sys/class/infiniband/mthca0/ports/1/pkeys/1
>0x8002
>
>-----Original Message-----
>From: Davis, Arlin R [mailto:arlin.r.davis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org] 
>Sent: Thursday, July 15, 2010 18:56
>To: Itay Berman; Or Gerlitz
>Cc: linux-rdma
>Subject: RE: some dapl assistance
>
>
>>OK, we got Intel MPI to run. To test the pkey usage we 
>>configured it to run over pkey that is not configured on the 
>>node. In this case the MPI should have failed, but it didn't.
>>The dapl debug reports the given pkey (0x8001 = 32769).
>>How can that be?
>>
>
>Itay,
>
>If the pkey override is not valid it uses default idx of 0 and 
>ignores pkey value given. 
>
>Notice the Warning message:
>
>odly0:3b37:  Warning: new pkey(32769), query (Success) err or 
>key !found, using defaults 
>odly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, 
>sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0
>
>When you override with a correct value of 8002 does it move to 
>p_idx=1 and work?
>
>-arlin
>
>
>
>--


* RE: some dapl assistance
       [not found]                             ` <E3280858FA94444CA49D2BA02341C983010458EF5E-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-07-18  8:55                               ` Itay Berman
       [not found]                                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9BD0-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Itay Berman @ 2010-07-18  8:55 UTC (permalink / raw)
  To: Davis, Arlin R; +Cc: linux-rdma, Or Gerlitz

[-- Attachment #1: Type: text/plain, Size: 5063 bytes --]

I can't use "-env I_MPI_DAPL_PROVIDER ofa-v2-mthca0-1" since my nodes have different providers.

From what I can see, Intel MPI declares the correct providers:
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1

(See the full output and the ibv_devinfo -v for both nodes attached)
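(If your mpiexec supports MPMD-style colon-separated argument sets, as MPICH-derived launchers such as Intel MPI's generally do, the provider could in principle be pinned per node; the exact flags below are an assumption to verify against the Intel MPI documentation:)

```shell
mpiexec -ppn 1 \
  -n 1 -host dodly0 -env I_MPI_DAPL_PROVIDER ofa-v2-mthca0-1 /tmp/osu : \
  -n 1 -host dodly4 -env I_MPI_DAPL_PROVIDER ofa-v2-mlx4_0-1 /tmp/osu
```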

Itay.

-----Original Message-----
From: Davis, Arlin R [mailto:arlin.r.davis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org] 
Sent: Thursday, July 15, 2010 21:38
To: Itay Berman; Or Gerlitz
Cc: linux-rdma
Subject: RE: some dapl assistance

Itay,

Can you add "-env I_MPI_DAPL_PROVIDER ofa-v2-mthca0-1" to your mpiexec options
to make sure we pick up the correct v2 provider with pkey support? Also bump
up I_MPI_DEBUG to 5 so I can see the provider selection from MPI output.

Thanks,

-arlin

>-----Original Message-----
>From: Itay Berman [mailto:itayb-smomgflXvOZWk0Htik3J/w@public.gmane.org] 
>Sent: Thursday, July 15, 2010 10:19 AM
>To: Davis, Arlin R; Or Gerlitz
>Cc: linux-rdma
>Subject: RE: some dapl assistance
>
>No:
>No. Same warning:
>
>[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env 
>I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 2 -env 
>I_MPI_CHECK_DAPL_PROVIDER_MISMADAPL_DBG_TYPE 0xffff -env 
>DAPL_IB_PKEY 0x8002 /tmp/osu
>dodly4:625b: dapl_init: dbg_type=0xffff,dbg_dest=0x1
>dodly0:2c17: dapl_init: dbg_type=0xffff,dbg_dest=0x1
>dodly0:2c17:  open_hca: device mlx4_0 not found
>dodly0:2c17:  open_hca: device mlx4_0 not found
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly0:2c17:  query_hca: port.link_layer = 0x1
>dodly0:2c17:  query_hca: (a0.0) eps 64512, sz 16384 evds 
>65408, sz 131071 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly0:2c17:  query_hca: msg 2147483648 rdma 2147483648 iov 27 
>lmr 131056 rmr 0 ack_time 16 mr 4294967295
>dodly0:2c17:  dapl_poll: fd=17 ret=1, evnts=0x1
>[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
>dodly0:2c17:  dapl_poll: fd=17 ret=0, evnts=0x0
>[0] MPI startup(): dapl data transfer mode
>dodly0:2c17:  dapl_poll: fd=14 ret=0, evnts=0x0
>dodly4:625b:  Warning: new pkey(32770), query (Success) err or 
>key !found, using defaults
>dodly4:625b:  query_hca: port.link_layer = 0x1
>dodly4:625b:  query_hca: (a0.0) eps 262076, sz 16351 evds 
>65408, sz 4194303 mtu 2048 - pkey 32770 p_idx 0 sl 0
>dodly4:625b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 
>lmr 524272 rmr 0 ack_time 16 mr 4294967295
>
>[root@dodly0 compat-dapl-1.2.18]# cat 
>/sys/class/infiniband/mthca0/ports/1/pkeys/1
>0x8002
>
>-----Original Message-----
>From: Davis, Arlin R [mailto:arlin.r.davis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org] 
>Sent: Thursday, July 15, 2010 18:56
>To: Itay Berman; Or Gerlitz
>Cc: linux-rdma
>Subject: RE: some dapl assistance
>
>
>>OK, we got Intel MPI to run. To test the pkey usage we 
>>configured it to run over pkey that is not configured on the 
>>node. In this case the MPI should have failed, but it didn't.
>>The dapl debug reports the given pkey (0x8001 = 32769).
>>How can that be?
>>
>
>Itay,
>
>If the pkey override is not valid it uses default idx of 0 and 
>ignores pkey value given. 
>
>Notice the Warning message:
>
>odly0:3b37:  Warning: new pkey(32769), query (Success) err or 
>key !found, using defaults 
>odly0:3b37:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, 
>sz 131071 mtu 2048 - pkey 32769 p_idx 0 sl 0
>
>When you override with a correct value of 8002 does it move to 
>p_idx=1 and work?
>
>-arlin
>
>
>
>

[-- Attachment #2: impi.txt --]
[-- Type: text/plain, Size: 15797 bytes --]

[root@dodly0 compat-dapl-1.2.18]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 5 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 0xffff -env DAPL_IB_PKEY 0x8003 /tmp/osu
dodly0:7573: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly4:187b: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:7573:  open_hca: device mlx4_0 not found
dodly0:7573:  open_hca: device mlx4_0 not found
dodly0:7573:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly0:7573:  query_hca: port.link_layer = 0x1
dodly0:7573:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly0:7573:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:7573:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly0:7573:  query_hca: port.link_layer = 0x1
dodly0:7573:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly0:7573:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly4:187b:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly4:187b:  query_hca: port.link_layer = 0x1
dodly4:187b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly4:187b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly0:7573:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly0:7573:  query_hca: port.link_layer = 0x1
dodly0:7573:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly0:7573:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:7573:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[0] MPI startup(): dapl data transfer mode
dodly4:187b:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly4:187b:  query_hca: port.link_layer = 0x1
dodly4:187b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly4:187b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:187b:  Warning: new pkey(32771), query (Success) err or key !found, using defaults
dodly4:187b:  query_hca: port.link_layer = 0x1
dodly4:187b:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 32771 p_idx 0 sl 0
dodly4:187b:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): dapl data transfer mode
[0] MPI startup(): static connections storm algo
dodly0:7573:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=19 ret=1, evnts=0x4
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=19 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=17 ret=1, evnts=0x1
[0] MPI startup(): I_MPI_CHECK_DAPL_PROVIDER_MISMATCH=none
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_FABRICS=dapl:dapl
[1] MPI startup(): set domain to {0,1,2,3} on node dodly4
[0] MPI startup(): set domain to {0,1,2,3} on node dodly0
[0] Rank    Pid      Node name  Pin cpu
[0] 0       30067    dodly0     {0,1,2,3}
[0] 1       6267     dodly4     {0,1,2,3}
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
1                         0.41
2                         0.83
4                         1.65
8                         3.28
16                        6.56
32                       13.06
64                       25.92
128                      51.12
256                      99.39
512                     192.90
1024                    345.51
2048                    551.05
4096                    681.88
8192                    745.24
16384                   785.84
32768                   671.52
65536                   795.62
131072                  877.93
262144                  927.23
524288                  948.57
1048576                 965.20
2097152                 974.00
4194304                 978.59
dodly0:7573:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:7573:  CM FREE: 0x2c8da40 ep=0x2c8d420 st=CM_FREE sck=19 refs=4
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573: dapl_ep_free: Free CM: EP=0x2c8d420 CM=0x2c8da40
dodly0:7573:  cm_free: cm 0x2c8da40 CM_FREE ep 0x2c8d420 refs=1
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:187b:  CM FREE: 0x6b2c400 ep=0x6b2c800 st=CM_FREE sck=17 refs=4
dodly4:187b: dapl_ep_free: Free CM: EP=0x6b2c800 CM=0x6b2c400
dodly4:187b:  cm_free: cm 0x6b2c400 CM_FREE ep 0x6b2c800 refs=1
dodly0:7573:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:7573:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:7573:  CM FREE: 0x2c79d00 ep=(nil) st=CM_FREE sck=14 refs=3
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:187b:  CM FREE: 0x6b400b0 ep=(nil) st=CM_FREE sck=13 refs=3
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:187b:  dapl_poll: fd=15 ret=0, evnts=0x0

[root@dodly0 compat-dapl-1.2.18]# cat /sys/class/infiniband/mthca0/ports/1/pkeys/1
0x8003

[root@dodly0 compat-dapl-1.2.18]# ibv_devinfo -v
hca_id: mthca0
        transport:                      InfiniBand (0)
        fw_ver:                         5.0.1
        node_guid:                      0002:c902:0020:13d0
        sys_image_guid:                 0002:c902:0020:13d3
        vendor_id:                      0x02c9
        vendor_part_id:                 25218
        hw_ver:                         0xA0
        board_id:                       MT_0150000001
        phys_port_cnt:                  2
        max_mr_size:                    0xffffffffffffffff
        page_size_cap:                  0xfffff000
        max_qp:                         64512
        max_qp_wr:                      16384
        device_cap_flags:               0x00001c76
        max_sge:                        27
        max_sge_rd:                     0
        max_cq:                         65408
        max_cqe:                        131071
        max_mr:                         131056
        max_pd:                         32764
        max_qp_rd_atom:                 4
        max_ee_rd_atom:                 0
        max_res_rd_atom:                258048
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_HCA (1)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  8192
        max_mcast_qp_attach:            56
        max_total_mcast_qp_attach:      458752
        max_ah:                         0
        max_fmr:                        0
        max_srq:                        960
        max_srq_wr:                     16384
        max_srq_sge:                    27
        max_pkeys:                      64
        local_ca_ack_delay:             15
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 8
                        port_lid:               4
                        port_lmc:               0x00
                        link_layer:             IB
                        max_msg_sz:             0x80000000
                        port_cap_flags:         0x00510a68
                        max_vl_num:             8 (4)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           64
                        gid_tbl_len:            32
                        subnet_timeout:         18
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               fe80:0000:0000:0000:0002:c902:0020:13d1

                port:   2
                        state:                  PORT_DOWN (1)
                        max_mtu:                2048 (4)
                        active_mtu:             512 (2)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             IB
                        max_msg_sz:             0x80000000
                        port_cap_flags:         0x00510a68
                        max_vl_num:             8 (4)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           64
                        gid_tbl_len:            32
                        subnet_timeout:         0
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             POLLING (2)
                        GID[  0]:               fe80:0000:0000:0000:0002:c902:0020:13d2


[root@dodly4 tmp]# cat /sys/class/infiniband/mlx4_0/ports/1/pkeys/1
0x8003

[root@dodly4 tmp]# ibv_devinfo -v
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.3.000
        node_guid:                      0002:c903:0000:1d94
        sys_image_guid:                 0002:c903:0000:1d97
        vendor_id:                      0x08f1
        vendor_part_id:                 25408
        hw_ver:                         0xA0
        board_id:                       VLT0120010001
        phys_port_cnt:                  2
        max_mr_size:                    0xffffffffffffffff
        page_size_cap:                  0xfffff000
        max_qp:                         262076
        max_qp_wr:                      16351
        device_cap_flags:               0x00541c66
        max_sge:                        32
        max_sge_rd:                     0
        max_cq:                         65408
        max_cqe:                        4194303
        max_mr:                         524272
        max_pd:                         32764
        max_qp_rd_atom:                 16
        max_ee_rd_atom:                 0
        max_res_rd_atom:                4193216
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_HCA (1)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  8192
        max_mcast_qp_attach:            56
        max_total_mcast_qp_attach:      458752
        max_ah:                         0
        max_fmr:                        0
        max_srq:                        65472
        max_srq_wr:                     16383
        max_srq_sge:                    31
        max_pkeys:                      128
        local_ca_ack_delay:             15
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 8
                        port_lid:               3
                        port_lmc:               0x00
                        link_layer:             IB
                        max_msg_sz:             0x40000000
                        port_cap_flags:         0x02510868
                        max_vl_num:             8 (4)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           128
                        gid_tbl_len:            128
                        subnet_timeout:         18
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               fe80:0000:0000:0000:0002:c903:0000:1d95

                port:   2
                        state:                  PORT_DOWN (1)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             IB
                        max_msg_sz:             0x40000000
                        port_cap_flags:         0x02510868
                        max_vl_num:             8 (4)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           128
                        gid_tbl_len:            128
                        subnet_timeout:         0
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           2.5 Gbps (1)
                        phys_state:             POLLING (2)
                        GID[  0]:               fe80:0000:0000:0000:0002:c903:0000:1d96

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: some dapl assistance - [PATCH] dapl-2.0 improperly handles pkey check/query in host order
       [not found]                                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9BD0-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
@ 2010-07-19 19:04                                   ` Davis, Arlin R
       [not found]                                     ` <E3280858FA94444CA49D2BA02341C983010D14039A-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Davis, Arlin R @ 2010-07-19 19:04 UTC (permalink / raw)
  To: Itay Berman; +Cc: linux-rdma, Or Gerlitz

 
Itay,

>>>OK, we got Intel MPI to run. To test the pkey usage we 
>>>configured it to run over pkey that is not configured on the 
>>>node. In this case the MPI should have failed, but it didn't.
>>>The dapl debug reports the given pkey (0x8001 = 32769).
>>>How can that be?
>>
>>If the pkey override is not valid it uses default idx of 0 and 
>>ignores pkey value given. 

Sorry, the verbs pkey query returns values in network order, while the
consumer variable is assumed to be in host order. Please try the
following v2.0 patch (or use 0x0280 without the patch):

---

scm, ucm: improperly handles pkey check/query in host order

Convert consumer input to network order before verbs
query pkey check.

Signed-off-by: Arlin Davis <arlin.r.davis-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

diff --git a/dapl/openib_common/util.c b/dapl/openib_common/util.c
index a69261f..73730ef 100644
--- a/dapl/openib_common/util.c
+++ b/dapl/openib_common/util.c
@@ -326,7 +326,7 @@ DAT_RETURN dapls_ib_query_hca(IN DAPL_HCA * hca_ptr,
 
                /* set SL, PKEY values, defaults = 0 */
                hca_ptr->ib_trans.pkey_idx = 0;
-               hca_ptr->ib_trans.pkey = dapl_os_get_env_val("DAPL_IB_PKEY", 0);
+               hca_ptr->ib_trans.pkey = htons(dapl_os_get_env_val("DAPL_IB_PKEY", 0));
                hca_ptr->ib_trans.sl = dapl_os_get_env_val("DAPL_IB_SL", 0);
 
 		/* index provided, get pkey; pkey provided, get index */
@@ -345,10 +345,10 @@ DAT_RETURN dapls_ib_query_hca(IN DAPL_HCA * hca_ptr,
 				}
 			}
 			if (i == dev_attr.max_pkeys) {
-				dapl_log(DAPL_DBG_TYPE_WARN,
-					 " Warning: new pkey(%d), query (%s)"
-					 " err or key !found, using defaults\n",
-					 hca_ptr->ib_trans.pkey, strerror(errno));
+				dapl_log(DAPL_DBG_TYPE_ERR,
+					 " ERR: new pkey(0x%x), query (%s)"
+					 " err or key !found, using default pkey_idx=0\n",
+					 ntohs(hca_ptr->ib_trans.pkey), strerror(errno));
 			}
 		}
 skip_ib:


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* RE: some dapl assistance - [PATCH] dapl-2.0 improperly handles pkeycheck/query in host order
       [not found]                                     ` <E3280858FA94444CA49D2BA02341C983010D14039A-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-07-20  7:27                                       ` Itay Berman
  0 siblings, 0 replies; 12+ messages in thread
From: Itay Berman @ 2010-07-20  7:27 UTC (permalink / raw)
  To: Davis, Arlin R; +Cc: Or Gerlitz, linux-rdma

Works!

[root@dodly0 OMB-3.1.1]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env I_MPI_DEBUG 5 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 0xffff -env DAPL_IB_PKEY 0x0280 -env DAPL_IB_SL 4 /tmp/osu_long
dodly0:5bc3: dapl_init: dbg_type=0xffff,dbg_dest=0x1
dodly0:5bc3:  open_hca: device mlx4_0 not found
dodly0:5bc3:  open_hca: device mlx4_0 not found
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 ack_time 16 mr 4294967295
dodly0:5bc3:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly4:1e8d: dapl_init: dbg_type=0xffff,dbg_dest=0x1
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[0] MPI startup(): dapl data transfer mode
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 ack_time 16 mr 4294967295
dodly4:1e8d:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): dapl data transfer mode
[0] MPI startup(): static connections storm algo
dodly0:5bc3:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=1, evnts=0x4
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=1, evnts=0x1
[0] MPI startup(): I_MPI_CHECK_DAPL_PROVIDER_MISMATCH=none
[0] MPI startup(): I_MPI_DEBUG=5
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=17 ret=1, evnts=0x1
[0] MPI startup(): I_MPI_FABRICS=dapl:dapl
[0] MPI startup(): set domain to {0,1,2,3} on node dodly0
[1] MPI startup(): set domain to {0,1,2,3} on node dodly4
[0] Rank    Pid      Node name  Pin cpu
[0] 0       23491    dodly0     {0,1,2,3}
[0] 1       7821     dodly4     {0,1,2,3}
# OSU MPI Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
4194304                 978.30
4194304                 978.45
4194304                 978.69
4194304                 978.24
dodly0:5bc3: dapl async_event: DEV ERR 12
dodly4:1e8d: dapl async_event: DEV ERR 12
dodly4:1e8d:  DTO completion ERROR: 12: op 0xff
dodly4:1e8d: DTO completion ERR: status 12, op OP_RDMA_READ, vendor_err 0x81 - 172.30.3.230
[1:dodly4][../../dapl_module_poll.c:3972] Intel MPI fatal error: ofa-v2-mlx4_0-1 DTO operation posted for [0:dodly0] completed with error. status=0x8. cookie=0x40000
Assertion failed in file ../../dapl_module_poll.c at line 3973: 0
internal ABORT - process 1
rank 1 in job 41  dodly0_54941   caused collective abort of all ranks
  exit status of rank 1: killed by signal 9

dapl now reports p_idx 1. The output above is from an osu test during which I removed the configured pkey; at that point the MPI died, so it was indeed running over that pkey.

To test the sl I will have to change my configuration a bit.

We will be happy to get a new build of dapl if possible.

Thanks,

Itay.  


^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2010-07-20  7:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-07  9:10 some dapl assistance Or Gerlitz
     [not found] ` <4C344493.2030600-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-07-07 17:00   ` Davis, Arlin R
     [not found]     ` <E3280858FA94444CA49D2BA02341C983010435A1A8-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-07-13 11:41       ` Or Gerlitz
     [not found]         ` <4C3C50CC.7000508-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-07-13 16:18           ` Davis, Arlin R
     [not found]             ` <E3280858FA94444CA49D2BA02341C9830104493030-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-07-15 14:29               ` Itay Berman
     [not found]                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9894-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
2010-07-15 15:56                   ` Davis, Arlin R
     [not found]                     ` <E3280858FA94444CA49D2BA02341C983010458ECD5-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-07-15 17:18                       ` Itay Berman
     [not found]                         ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B98F3-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
2010-07-15 18:16                           ` Davis, Arlin R
2010-07-15 18:38                           ` Davis, Arlin R
     [not found]                             ` <E3280858FA94444CA49D2BA02341C983010458EF5E-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-07-18  8:55                               ` Itay Berman
     [not found]                                 ` <6D1AA8ED7402FF49AFAB26F0C948ACF5014B9BD0-QfUkFaTmzUSUvQqKE/ONIwC/G2K4zDHf@public.gmane.org>
2010-07-19 19:04                                   ` some dapl assistance - [PATCH] dapl-2.0 improperly handles pkey check/query in host order Davis, Arlin R
     [not found]                                     ` <E3280858FA94444CA49D2BA02341C983010D14039A-osO9UTpF0URZtRGVdHMbwrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-07-20  7:27                                       ` some dapl assistance - [PATCH] dapl-2.0 improperly handles pkeycheck/query " Itay Berman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox