public inbox for linux-rdma@vger.kernel.org
* Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
@ 2011-03-28 18:01 Alan Cook
       [not found] ` <loom.20110328T194956-774-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cook @ 2011-03-28 18:01 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

In trying to support various speeds with my application, I attempted to change
the speed and width of an InfiniBand card using ibportstate.  At first the
setting changes did not take effect reliably.  I found that I needed to set
"force_link_speed 0" in /etc/ofed/opensm.conf.  Once that change was made and
the service restarted, I was able to reliably change settings on the cards.

All was well until I changed the card to 2.5 x1.  At this point, I was unable to
ping the other machine.  Both PCs show that the link is up and active, but I am
unable to get any communication between the two.  I tried both ping and ibping
with no success.  I brought the interfaces down and back up, but that did not
solve the issue.

I have been using the following commands (as root) to adjust the speed and width:

ibportstate -C mlx4_0 -D 0 1 speed [1-7]
ibportstate -C mlx4_0 -D 0 1 width [1|2]
ibportstate -C mlx4_0 -D 0 1 reset
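
For readers following along: the numeric arguments above are bit masks for the PortInfo LinkSpeedEnabled and LinkWidthEnabled fields. The sketch below decodes them, assuming the usual InfiniBand bit assignments (speed: 1 = 2.5 Gbps, 2 = 5.0 Gbps, 4 = 10.0 Gbps; width: 1 = 1X, 2 = 4X, 4 = 8X, 8 = 12X). The helper names are hypothetical and are not part of ibportstate.

```python
# Hypothetical helpers. The bit assignments are an assumption based on the
# InfiniBand PortInfo encoding; they are not stated in this thread.
SPEED_BITS = {1: "2.5 Gbps", 2: "5.0 Gbps", 4: "10.0 Gbps"}
WIDTH_BITS = {1: "1X", 2: "4X", 4: "8X", 8: "12X"}


def decode_mask(mask, bits):
    """Return the names of all bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(bits.items()) if mask & bit]


def describe(speed_mask, width_mask):
    """Format a speed/width pair the way smpquery joins options with 'or'."""
    speeds = " or ".join(decode_mask(speed_mask, SPEED_BITS))
    widths = " or ".join(decode_mask(width_mask, WIDTH_BITS))
    return f"speed: {speeds}; width: {widths}"


if __name__ == "__main__":
    # speed 1, width 1 is the 2.5 Gbps x1 case discussed in this thread.
    print(describe(1, 1))
    # speed 7, width 3 enables every SDR/DDR/QDR speed and both 1X and 4X.
    print(describe(7, 3))
```

Under these assumptions, "speed 1" with "width 1" requests 2.5 Gbps on a single lane, which is the configuration that fails here.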

I was able to both ping and ibping when using the following speed and width
configurations:

 2.5 Gbps x4
 5.0 Gbps x1
 5.0 Gbps x4
10.0 Gbps x1
10.0 Gbps x4

The only configuration that does not work is 2.5 Gbps x1.

Is there something that I am missing?

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
       [not found] ` <loom.20110328T194956-774-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
@ 2011-03-28 18:55   ` Hal Rosenstock
  2011-03-28 19:07     ` Alan Cook
  0 siblings, 1 reply; 10+ messages in thread
From: Hal Rosenstock @ 2011-03-28 18:55 UTC (permalink / raw)
  To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 2:01 PM, Alan Cook wrote:
> In trying to support various speeds with my application, I attempted to change
> the speed and width of an InfiniBand card using ibportstate.  At first the
> setting changes did not take effect reliably.  I found that I needed to set
> "force_link_speed 0" in /etc/ofed/opensm.conf.  Once that change was made and
> the service restarted, I was able to reliably change settings on the cards.
>
> All was well until I changed the card to 2.5 x1.  At this point, I was unable to
> ping the other machine.  Both PCs show that the link is up and active, but I am
> unable to get any communication between the two.  I tried both ping and ibping
> with no success.  I brought the interfaces down and back up, but that did not
> solve the issue.
>
> I have been using the following commands (as root) to adjust the speed and width:
>
> ibportstate -C mlx4_0 -D 0 1 speed [1-7]
> ibportstate -C mlx4_0 -D 0 1 width [1|2]
> ibportstate -C mlx4_0 -D 0 1 reset
>
> I was able to both ping and ibping when using the following speed and width
> configurations:
>
>   2.5 Gbps x4
>   5.0 Gbps x1
>   5.0 Gbps x4
> 10.0 Gbps x1
> 10.0 Gbps x4
>
> The only configuration that does not work is 2.5 Gbps x1.
>
> Is there something that I am missing?

What does smpquery portinfo say for both sides of that link ?

-- Hal


* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
  2011-03-28 18:55   ` Hal Rosenstock
@ 2011-03-28 19:07     ` Alan Cook
       [not found]       ` <loom.20110328T210350-244-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cook @ 2011-03-28 19:07 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
> 
> What does smpquery portinfo say for both sides of that link ?
> 

What specifically would I be looking for in this data to determine if there
is an issue?

For card 1 I have:

# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0xfe80000000000000
Lid:.............................2
SMLid:...........................1
CapMask:.........................0x2510868
                                IsTrapSupported
                                IsAutomaticMigrationSupported
                                IsSLMappingSupported
                                IsSystemImageGUIDsupported
                                IsCommunicatonManagementSupported
                                IsVendorClassSupported
                                IsCapabilityMaskNoticeSupported
                                IsClientRegistrationSupported
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................1X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
NeighborMTU:.....................2048
SMSL:............................0
VLCap:...........................VL0-7
InitType:........................0x00
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
InitReply:.......................0x00
MtuCap:..........................2048
VLStallCount:....................0
HoqLife:.........................31
OperVLs:.........................VL0-7
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................128
ClientReregister:................0
SubnetTimeout:...................18
RespTimeVal:.....................16
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................0
RoundTrip:.......................0


For card 2 I have:

# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0xfe80000000000000
Lid:.............................1
SMLid:...........................1
CapMask:.........................0x251086a
                                IsSM
                                IsTrapSupported
                                IsAutomaticMigrationSupported
                                IsSLMappingSupported
                                IsSystemImageGUIDsupported
                                IsCommunicatonManagementSupported
                                IsVendorClassSupported
                                IsCapabilityMaskNoticeSupported
                                IsClientRegistrationSupported
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................1X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
NeighborMTU:.....................2048
SMSL:............................0
VLCap:...........................VL0-7
InitType:........................0x00
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
InitReply:.......................0x00
MtuCap:..........................2048
VLStallCount:....................0
HoqLife:.........................31
OperVLs:.........................VL0-7
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................128
ClientReregister:................0
SubnetTimeout:...................18
RespTimeVal:.....................16
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................0
RoundTrip:.......................0




* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
       [not found]       ` <loom.20110328T210350-244-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
@ 2011-03-28 19:19         ` Hal Rosenstock
  2011-03-28 19:26           ` Alan Cook
  0 siblings, 1 reply; 10+ messages in thread
From: Hal Rosenstock @ 2011-03-28 19:19 UTC (permalink / raw)
  To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 3:07 PM, Alan Cook wrote:
> Hal Rosenstock<hal@...>  writes:
>>
>> What does smpquery portinfo say for both sides of that link ?
>>
>
> What specifically would I be looking for in this data to determine if there
> is an issue?

The various LinkSpeed/Width components to see if they make sense.

> For card 1 I have:
>
> # Port info: DR path slid 65535; dlid 65535; 0 port 1
> Mkey:............................0x0000000000000000
> GidPrefix:.......................0xfe80000000000000
> Lid:.............................2
> SMLid:...........................1
> CapMask:.........................0x2510868
>                                  IsTrapSupported
>                                  IsAutomaticMigrationSupported
>                                  IsSLMappingSupported
>                                  IsSystemImageGUIDsupported
>                                  IsCommunicatonManagementSupported
>                                  IsVendorClassSupported
>                                  IsCapabilityMaskNoticeSupported
>                                  IsClientRegistrationSupported
> DiagCode:........................0x0000
> MkeyLeasePeriod:.................0
> LocalPort:.......................1
> LinkWidthEnabled:................1X or 4X
> LinkWidthSupported:..............1X or 4X
> LinkWidthActive:.................1X
> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> LinkState:.......................Active
> PhysLinkState:...................LinkUp
> LinkDownDefState:................Polling
> ProtectBits:.....................0
> LMC:.............................0
> LinkSpeedActive:.................2.5 Gbps
> LinkSpeedEnabled:................2.5 Gbps
> NeighborMTU:.....................2048
> SMSL:............................0
> VLCap:...........................VL0-7
> InitType:........................0x00
> VLHighLimit:.....................4
> VLArbHighCap:....................8
> VLArbLowCap:.....................8
> InitReply:.......................0x00
> MtuCap:..........................2048
> VLStallCount:....................0
> HoqLife:.........................31
> OperVLs:.........................VL0-7
> PartEnforceInb:..................0
> PartEnforceOutb:.................0
> FilterRawInb:....................0
> FilterRawOutb:...................0
> MkeyViolations:..................0
> PkeyViolations:..................0
> QkeyViolations:..................0
> GuidCap:.........................128
> ClientReregister:................0
> SubnetTimeout:...................18
> RespTimeVal:.....................16
> LocalPhysErr:....................8
> OverrunErr:......................8
> MaxCreditHint:...................0
> RoundTrip:.......................0
>
>
> For card 2 I have:
>
> # Port info: DR path slid 65535; dlid 65535; 0 port 1
> Mkey:............................0x0000000000000000
> GidPrefix:.......................0xfe80000000000000
> Lid:.............................1
> SMLid:...........................1
> CapMask:.........................0x251086a
>                                  IsSM
>                                  IsTrapSupported
>                                  IsAutomaticMigrationSupported
>                                  IsSLMappingSupported
>                                  IsSystemImageGUIDsupported
>                                  IsCommunicatonManagementSupported
>                                  IsVendorClassSupported
>                                  IsCapabilityMaskNoticeSupported
>                                  IsClientRegistrationSupported
> DiagCode:........................0x0000
> MkeyLeasePeriod:.................0
> LocalPort:.......................1
> LinkWidthEnabled:................1X or 4X
> LinkWidthSupported:..............1X or 4X
> LinkWidthActive:.................1X
> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> LinkState:.......................Active
> PhysLinkState:...................LinkUp
> LinkDownDefState:................Polling
> ProtectBits:.....................0
> LMC:.............................0
> LinkSpeedActive:.................2.5 Gbps
> LinkSpeedEnabled:................2.5 Gbps
> NeighborMTU:.....................2048
> SMSL:............................0
> VLCap:...........................VL0-7
> InitType:........................0x00
> VLHighLimit:.....................4
> VLArbHighCap:....................8
> VLArbLowCap:.....................8
> InitReply:.......................0x00
> MtuCap:..........................2048
> VLStallCount:....................0
> HoqLife:.........................31
> OperVLs:.........................VL0-7
> PartEnforceInb:..................0
> PartEnforceOutb:.................0
> FilterRawInb:....................0
> FilterRawOutb:...................0
> MkeyViolations:..................0
> PkeyViolations:..................0
> QkeyViolations:..................0
> GuidCap:.........................128
> ClientReregister:................0
> SubnetTimeout:...................18
> RespTimeVal:.....................16
> LocalPhysErr:....................8
> OverrunErr:......................8
> MaxCreditHint:...................0
> RoundTrip:.......................0

LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no 
force_link_width option in OpenSM so I think the width you set is 
getting overwritten much like the speed was earlier until you discovered 
force_link_speed 0.
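
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from the thread: it assumes the "Name:....value" layout shown in the smpquery dumps, and the function name and regular expression are hypothetical.

```python
import re

# Matches lines such as "LinkWidthActive:.................1X".
# The layout is assumed from the smpquery portinfo output quoted above.
FIELD = re.compile(r"^(\w+):\.+(.*)$")


def parse_portinfo(text):
    """Collect 'Name:....value' lines into a dict; other lines are skipped."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


sample = """\
# Port info: DR path slid 65535; dlid 65535; 0 port 1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................1X
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
"""

info = parse_portinfo(sample)

# The width was set to 1X only, so an enabled value other than "1X"
# suggests that something rewrote the field after it was set.
width_overwritten = info["LinkWidthEnabled"] != "1X"

if __name__ == "__main__":
    for name in ("LinkWidthEnabled", "LinkWidthActive",
                 "LinkSpeedEnabled", "LinkSpeedActive"):
        print(f"{name}: {info[name]}")
    print("width overwritten:", width_overwritten)
```

Running the same parse on both ends of the link and comparing the four LinkSpeed/LinkWidth fields is one way to check whether the values "make sense".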

Also, is this b2b HCAs or is there a switch in the topology ?

-- Hal

* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
  2011-03-28 19:19         ` Hal Rosenstock
@ 2011-03-28 19:26           ` Alan Cook
       [not found]             ` <loom.20110328T212117-950-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cook @ 2011-03-28 19:26 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
> 
> LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no 
> force_link_width option in OpenSM so I think the width you set is 
> getting overwritten much like the speed was earlier until you discovered 
> force_link_speed 0.

But I see LinkWidthActive set to 1X, which as I understood denoted the width
currently being used.  If I set LinkWidthActive to 4X, I am able to transfer
just fine.

Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
likely by OpenSM as you suggest), but only after the link is up and active.

I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

> 
> Also, is this b2b HCAs or is there a switch in the topology ?
> 

It's b2b HCAs, no switch.


* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
       [not found]             ` <loom.20110328T212117-950-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
@ 2011-03-28 19:41               ` Jason Gunthorpe
  2011-03-28 20:41                 ` Alan Cook
  2011-03-28 20:18               ` Hal Rosenstock
  1 sibling, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2011-03-28 19:41 UTC (permalink / raw)
  To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 28, 2011 at 07:26:46PM +0000, Alan Cook wrote:
> Hal Rosenstock <hal@...> writes:
> > 
> > LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no 
> > force_link_width option in OpenSM so I think the width you set is 
> > getting overwritten much like the speed was earlier until you discovered 
> > force_link_speed 0.
> 
> But I see LinkWidthActive set to 1X, which as I understood denoted the width
> currently being used.  If I set LinkWidthActive to 4X, I am able to transfer
> just fine.
> 
> Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
> likely by OpenSM as you suggest), but only after the link is up and active.
> 
> I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

Are you using IPoIB to test your ping? You might be having multicast
group membership join problems due to a rate mismatch? Try one of the
low level verbs only diags.

Jason

* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
       [not found]             ` <loom.20110328T212117-950-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
  2011-03-28 19:41               ` Jason Gunthorpe
@ 2011-03-28 20:18               ` Hal Rosenstock
  1 sibling, 0 replies; 10+ messages in thread
From: Hal Rosenstock @ 2011-03-28 20:18 UTC (permalink / raw)
  To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 3:26 PM, Alan Cook wrote:
> Hal Rosenstock<hal@...>  writes:
>>
>> LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no
>> force_link_width option in OpenSM so I think the width you set is
>> getting overwritten much like the speed was earlier until you discovered
>> force_link_speed 0.
>
> But I see LinkWidthActive set to 1X, which as I understood denoted the width
> currently being used.

Yes as long as a renegotiation occurred after the enabled width was changed.

> If I set LinkWidthActive to 4X, I am able to transfer
> just fine.
>
> Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
> likely by OpenSM as you suggest), but only after the link is up and active.

If it's changed after the link is up and no renegotiation occurs, then 
this is a don't care as you indicate.

> I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

I don't have a theory for why only that combo would fail. Do you have 
any other HCA types to try ?

>>
>> Also, is this b2b HCAs or is there a switch in the topology ?
>>
>
> It's b2b HCAs, no switch.

I think ibportstate uses combined routing, which is problematic if 
directed at an HCA.



* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
  2011-03-28 19:41               ` Jason Gunthorpe
@ 2011-03-28 20:41                 ` Alan Cook
       [not found]                   ` <loom.20110328T220853-101-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Alan Cook @ 2011-03-28 20:41 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Jason Gunthorpe <jgunthorpe@...> writes:

> 
> Are you using IPoIB to test your ping? You might be having multicast
> group membership join problems due to a rate mismatch? Try one of the
> low level verbs only diags.
> 

I found the following issue when running ibdiagnet (which I had yet to run).

-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Port raid1/P1 lid=0x0002 guid=0x0002c903000aa6d7 dev=26428 can not join due
    to rate:2.5Gbps < group:10Gbps
-W- Port raid2/P1 lid=0x0001 guid=0x0002c903000aa6cf dev=26428 can not join due
    to rate:2.5Gbps < group:10Gbps
-W- Suboptimal rate for group. Lowest member rate:120Gbps > group-rate:10Gbps

I also changed the speed to 5 Gbps (which I thought had worked earlier) and I am
unable to transfer using a width 1X.  I receive the same warnings as above.

After doing some searching, I created a partitions.conf (which did not exist at
the location configured in opensm.conf).  I added the following line:

Test,ipoib,rate=1,defmember=full:0x0002c903000aa6d7,0x0002c903000aa6cf;

After restarting opensm, I find that the above warnings in ibdiagnet no
longer appear, but I am still unable to communicate between PCs.

Did I miss something or do something incorrectly?


* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
       [not found]                   ` <loom.20110328T220853-101-eS7Uydv5nfjZ+VzJOa5vwg@public.gmane.org>
@ 2011-03-29 17:37                     ` Hal Rosenstock
  2011-03-29 18:01                       ` Alan Cook
  0 siblings, 1 reply; 10+ messages in thread
From: Hal Rosenstock @ 2011-03-29 17:37 UTC (permalink / raw)
  To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 4:41 PM, Alan Cook wrote:
> Jason Gunthorpe<jgunthorpe@...>  writes:
>
>>
>> Are you using IPoIB to test your ping? You might be having multicast
>> group membership join problems due to a rate mismatch? Try one of the
>> low level verbs only diags.
>>
>
> I found the following issue when running ibdiagnet (which I had yet to run).
>
> -I---------------------------------------------------
> -I- IPoIB Subnets Check
> -I---------------------------------------------------
> -I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
> -W- Port raid1/P1 lid=0x0002 guid=0x0002c903000aa6d7 dev=26428 can not join due
>      to rate:2.5Gbps<  group:10Gbps
> -W- Port raid2/P1 lid=0x0001 guid=0x0002c903000aa6cf dev=26428 can not join due
>      to rate:2.5Gbps<  group:10Gbps
> -W- Suboptimal rate for group. Lowest member rate:120Gbps>  group-rate:10Gbps
>
> I also changed the speed to 5 Gbps (which I thought had worked earlier) and I am
> unable to transfer using a width 1X.  I receive the same warnings as above.

The default is rate 3 (10 Gbps) so I would think anything less than that 
wouldn't work without reconfiguration.
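
To make that concrete, here is a hedged Python sketch of the check implied by the ibdiagnet warnings ("can not join due to rate:2.5Gbps < group:10Gbps"). It assumes the port rate is the per-lane speed multiplied by the width, and that a port can join only when its rate is at least the group rate. The function names are hypothetical.

```python
def link_rate_gbps(lane_speed_gbps, width):
    """Port rate, assumed here to be per-lane speed times lane count."""
    return lane_speed_gbps * width


def can_join(port_rate_gbps, group_rate_gbps=10.0):
    """A port below the group rate is refused, as in the ibdiagnet output."""
    return port_rate_gbps >= group_rate_gbps


# The speed/width combinations mentioned in this thread.
CONFIGS = [(2.5, 1), (2.5, 4), (5.0, 1), (5.0, 4), (10.0, 1), (10.0, 4)]

if __name__ == "__main__":
    for speed, width in CONFIGS:
        rate = link_rate_gbps(speed, width)
        verdict = "can join" if can_join(rate) else "cannot join"
        print(f"{speed} Gbps x{width}: {rate} Gbps, {verdict}")
```

With the group at 10 Gbps, only 2.5 Gbps x1 and 5.0 Gbps x1 fall below the threshold, which is consistent with the report that 5 Gbps at 1X also stopped working.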

> After doing some searching, I created a partitions.conf (which did not exist at
> the location configured in opensm.conf).  I added the following line:
>
> Test,ipoib,rate=1,defmember=full:0x0002c903000aa6d7,0x0002c903000aa6cf;

How about if rather than Test, you change it to Default=0x7fff ?

> After restarting opensm, I find that the above warnings in ibdiagnet no
> longer appear, but I am still unable to communicate between PCs.
>
> Did I miss something or do something incorrectly?

The above is pertinent to IPoIB ping but couldn't explain ibping lack of 
connectivity.


* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
  2011-03-29 17:37                     ` Hal Rosenstock
@ 2011-03-29 18:01                       ` Alan Cook
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Cook @ 2011-03-29 18:01 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
> 
> How about if rather than Test, you change it to Default=0x7fff ?
> 

I tried using the PKey since posting, but it alone did not solve the issue.

> 
> The above is pertinent to IPoIB ping but couldn't explain ibping lack of 
> connectivity.
> 

Turns out I was using the wrong port GUID (bad copy and paste) when testing with
ibping. It has been working all along.

I have just now solved the issue.

The partitions.conf file took me in the right direction.  The solution was to
use rate=2 instead of rate=1 (which apparently is a reserved value).  I was
unable to find documentation for the rate-value pairs, so I was assuming that
rate=1 was 2.5 Gbps.  I finally found documentation for the rates in a post on the
old ofa-general mailing list:

#define IB_PATH_RECORD_RATE_2_5_GBS             2
#define IB_PATH_RECORD_RATE_10_GBS              3
#define IB_PATH_RECORD_RATE_30_GBS              4
#define IB_PATH_RECORD_RATE_5_GBS               5
#define IB_PATH_RECORD_RATE_20_GBS              6
#define IB_PATH_RECORD_RATE_40_GBS              7
#define IB_PATH_RECORD_RATE_60_GBS              8
#define IB_PATH_RECORD_RATE_80_GBS              9
#define IB_PATH_RECORD_RATE_120_GBS             10

#define IB_MIN_RATE    IB_PATH_RECORD_RATE_2_5_GBS
#define IB_MAX_RATE    IB_PATH_RECORD_RATE_120_GBS

Setting my partitions.conf as follows has solved the issue:

Default=0x7fff,ipoib,rate=2:ALL=full;
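
For anyone repeating this with other speed and width settings, the following Python sketch picks the rate code from the table above and formats the matching partitions.conf line. It assumes the rate is the per-lane speed multiplied by the width; the function names are hypothetical, and the line format simply copies the one that worked here.

```python
# Rate codes copied from the IB_PATH_RECORD_RATE_* defines quoted above.
RATE_CODE_TO_GBPS = {
    2: 2.5, 3: 10.0, 4: 30.0, 5: 5.0, 6: 20.0,
    7: 40.0, 8: 60.0, 9: 80.0, 10: 120.0,
}


def rate_code_for(lane_speed_gbps, width):
    """Return the rate code whose value equals lane speed times width."""
    target = lane_speed_gbps * width
    for code, gbps in RATE_CODE_TO_GBPS.items():
        if gbps == target:
            return code
    raise ValueError(f"no rate code for {target} Gbps")


def partition_line(code, pkey=0x7FFF):
    """Format a partition entry in the same shape as the working one."""
    return f"Default={pkey:#06x},ipoib,rate={code}:ALL=full;"


if __name__ == "__main__":
    # 2.5 Gbps x1 needs code 2; note that 5 Gbps is code 5, not a value
    # between 2 and 3, so the codes cannot be guessed by counting upward.
    print(partition_line(rate_code_for(2.5, 1)))
    print(partition_line(rate_code_for(5.0, 1)))
```

The codes are not in ascending order of speed, which is why guessing rate=1 for the slowest rate does not work.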

Thanks everyone for the help.
