* Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
@ 2011-03-28 18:01 Alan Cook
From: Alan Cook @ 2011-03-28 18:01 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
In trying to support various speeds with my application, I attempted to change
the speed and width of the InfiniBand card using ibportstate. I had difficulty
getting the setting changes to take effect reliably. I found that I needed to set
"force_link_speed 0" in /etc/ofed/opensm.conf. Once that change was made and the
service restarted, I was able to reliably change settings on the cards.
All was well until I changed the card to 2.5 x1. At this point, I was unable to
ping the other machine. Both PCs show that the link is up and active, but I am
unable to get any communication between the two. I tried both ping and ibping
with no success. I brought the interfaces down and back up, but that did not
solve the issue.
I have been using the following commands (as root) to adjust the speed and width:
ibportstate -C mlx4_0 -D 0 1 speed [1-7]
ibportstate -C mlx4_0 -D 0 1 width [1|2]
ibportstate -C mlx4_0 -D 0 1 reset
I was able to both ping and ibping when using the following speed and width
configurations:
2.5 Gbps x4
5.0 Gbps x1
5.0 Gbps x4
10.0 Gbps x1
10.0 Gbps x4
The only setting configuration that does not work is the 2.5 x1.
Is there something that I am missing?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
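The ibportstate arguments above (speed [1-7], width [1|2]) read like bitmasks for the PortInfo LinkSpeedEnabled and LinkWidthEnabled fields. The sketch below decodes them under that assumption; the bit assignments (1 = 2.5 Gbps, 2 = 5.0 Gbps, 4 = 10.0 Gbps; 1 = 1X, 2 = 4X) and the helper names are illustrative and not taken from the thread.

```python
# Assumed bit assignments for the LinkSpeedEnabled / LinkWidthEnabled masks.
SPEED_BITS = {1: 2.5, 2: 5.0, 4: 10.0}  # Gbps per lane
WIDTH_BITS = {1: 1, 2: 4}               # number of lanes (1X, 4X)


def decode_mask(mask, table):
    """Return the values whose bit is set in mask, lowest bit first."""
    return [value for bit, value in sorted(table.items()) if mask & bit]


def link_rates(speed_mask, width_mask):
    """Return every aggregate rate (Gbps) the two masks allow, ascending."""
    speeds = decode_mask(speed_mask, SPEED_BITS)
    widths = decode_mask(width_mask, WIDTH_BITS)
    return sorted({s * w for s in speeds for w in widths})


if __name__ == "__main__":
    # "speed 1" with "width 1" is the 2.5 Gbps x1 case from the post.
    print(link_rates(1, 1))
    # "speed 7" with "width 2" enables all three speeds at 4X.
    print(link_rates(7, 2))
```

Under these assumptions, "speed 1" with "width 1" allows a single aggregate rate of 2.5 Gbps, while "speed 1" with "width 2" gives 10 Gbps.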
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Hal Rosenstock @ 2011-03-28 18:55 UTC
To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 2:01 PM, Alan Cook wrote:
> In trying to support various speeds with my application, I attempted to change
> the speed and width of Infiniband card using ibportstate. I had difficulty with
> the reliability of the setting changes working. I found that I needed to set
> "force_link_speed 0" in /etc/ofed/opensm.conf. Once that change as made the
> service restarted, I was able to reliably change settings on the cards.
>
> All was well until I changed the card to 2.5 x1. At this point, I was unable to
> ping the other machine. Both PCs show that the link is up and active, but I am
> unable to get any communication between the two. I tried both ping and ibping
> with no success. I brought the interfaces down and back up, but that did not
> solve the issue.
>
> I have been using the following commands (as root) to adjust the speed and width:
>
> ibportstate -C mlx4_0 -D 0 1 speed [1-7]
> ibportstate -C mlx4_0 -D 0 1 width [1|2]
> ibportstate -C mlx4_0 -D 0 1 reset
>
> I was able to both ping and ibping when using the following speed and width
> configurations:
>
> 2.5 Gbps x4
> 5.0 Gbps x1
> 5.0 Gbps x4
> 10.0 Gbps x1
> 10.0 Gbps x4
>
> The only setting configuration that does not work is the 2.5 x1.
>
> Is there something that I am missing?

What does smpquery portinfo say for both sides of that link ?

-- Hal
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Alan Cook @ 2011-03-28 19:07 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
>
> What does smpquery portinfo say for both sides of that link ?
>

What specifically would I be looking for in this data to determine if there
is an issue?

For card 1 I have:

# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0xfe80000000000000
Lid:.............................2
SMLid:...........................1
CapMask:.........................0x2510868
                                IsTrapSupported
                                IsAutomaticMigrationSupported
                                IsSLMappingSupported
                                IsSystemImageGUIDsupported
                                IsCommunicatonManagementSupported
                                IsVendorClassSupported
                                IsCapabilityMaskNoticeSupported
                                IsClientRegistrationSupported
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................1X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
NeighborMTU:.....................2048
SMSL:............................0
VLCap:...........................VL0-7
InitType:........................0x00
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
InitReply:.......................0x00
MtuCap:..........................2048
VLStallCount:....................0
HoqLife:.........................31
OperVLs:.........................VL0-7
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................128
ClientReregister:................0
SubnetTimeout:...................18
RespTimeVal:.....................16
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................0
RoundTrip:.......................0

For card 2 I have:

# Port info: DR path slid 65535; dlid 65535; 0 port 1
Mkey:............................0x0000000000000000
GidPrefix:.......................0xfe80000000000000
Lid:.............................1
SMLid:...........................1
CapMask:.........................0x251086a
                                IsSM
                                IsTrapSupported
                                IsAutomaticMigrationSupported
                                IsSLMappingSupported
                                IsSystemImageGUIDsupported
                                IsCommunicatonManagementSupported
                                IsVendorClassSupported
                                IsCapabilityMaskNoticeSupported
                                IsClientRegistrationSupported
DiagCode:........................0x0000
MkeyLeasePeriod:.................0
LocalPort:.......................1
LinkWidthEnabled:................1X or 4X
LinkWidthSupported:..............1X or 4X
LinkWidthActive:.................1X
LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
LinkState:.......................Active
PhysLinkState:...................LinkUp
LinkDownDefState:................Polling
ProtectBits:.....................0
LMC:.............................0
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
NeighborMTU:.....................2048
SMSL:............................0
VLCap:...........................VL0-7
InitType:........................0x00
VLHighLimit:.....................4
VLArbHighCap:....................8
VLArbLowCap:.....................8
InitReply:.......................0x00
MtuCap:..........................2048
VLStallCount:....................0
HoqLife:.........................31
OperVLs:.........................VL0-7
PartEnforceInb:..................0
PartEnforceOutb:.................0
FilterRawInb:....................0
FilterRawOutb:...................0
MkeyViolations:..................0
PkeyViolations:..................0
QkeyViolations:..................0
GuidCap:.........................128
ClientReregister:................0
SubnetTimeout:...................18
RespTimeVal:.....................16
LocalPhysErr:....................8
OverrunErr:......................8
MaxCreditHint:...................0
RoundTrip:.......................0
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Hal Rosenstock @ 2011-03-28 19:19 UTC
To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 3:07 PM, Alan Cook wrote:
> Hal Rosenstock<hal@...> writes:
>>
>> What does smpquery portinfo say for both sides of that link ?
>>
>
> What specifically would be I be looking for in this data to determine if their
> is an issue?

The various LinkSpeed/Width components to see if they make sense.

> For card 1 I have:
>
> # Port info: DR path slid 65535; dlid 65535; 0 port 1
> [...]
> LinkWidthEnabled:................1X or 4X
> LinkWidthSupported:..............1X or 4X
> LinkWidthActive:.................1X
> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> LinkState:.......................Active
> PhysLinkState:...................LinkUp
> [...]
> LinkSpeedActive:.................2.5 Gbps
> LinkSpeedEnabled:................2.5 Gbps
> [...]
>
> For card 2 I have:
>
> # Port info: DR path slid 65535; dlid 65535; 0 port 1
> [...]
> LinkWidthEnabled:................1X or 4X
> LinkWidthSupported:..............1X or 4X
> LinkWidthActive:.................1X
> LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps or 10.0 Gbps
> LinkState:.......................Active
> PhysLinkState:...................LinkUp
> [...]
> LinkSpeedActive:.................2.5 Gbps
> LinkSpeedEnabled:................2.5 Gbps
> [...]

LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no
force_link_width option in OpenSM so I think the width you set is
getting overwritten much like the speed was earlier until you discovered
force_link_speed 0.

Also, is this b2b HCAs or is there a switch in the topology ?

-- Hal
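Comparing the LinkSpeed/Width fields on both sides by eye is error-prone with this much output. The sketch below pulls just those fields out of saved `smpquery portinfo` text. It assumes the "Name:....value" layout shown in the thread; the function names are hypothetical, and it reads captured text rather than invoking any tool.

```python
import re

# Matches lines such as "LinkWidthActive:.................1X",
# optionally prefixed by an email quote marker.
FIELD_RE = re.compile(r"^\s*>?\s*(\w+):\.+(.*)$")

LINK_FIELDS = (
    "LinkWidthEnabled", "LinkWidthSupported", "LinkWidthActive",
    "LinkSpeedEnabled", "LinkSpeedSupported", "LinkSpeedActive",
    "LinkState", "PhysLinkState",
)


def parse_portinfo(text):
    """Return a dict of field name -> value for every dotted line."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def link_summary(text):
    """Keep only the link speed/width/state fields."""
    fields = parse_portinfo(text)
    return {name: fields[name] for name in LINK_FIELDS if name in fields}


def width_overwritten(text):
    """True when the enabled width is broader than the active width."""
    fields = parse_portinfo(text)
    return fields.get("LinkWidthEnabled") != fields.get("LinkWidthActive")


SAMPLE = """\
# Port info: DR path slid 65535; dlid 65535; 0 port 1
LinkWidthEnabled:................1X or 4X
LinkWidthActive:.................1X
LinkSpeedActive:.................2.5 Gbps
LinkSpeedEnabled:................2.5 Gbps
"""

if __name__ == "__main__":
    for name, value in link_summary(SAMPLE).items():
        print(f"{name:20s} {value}")
    print("width overwritten:", width_overwritten(SAMPLE))
```

Run against the excerpt from card 1, this reports LinkWidthEnabled as "1X or 4X" while LinkWidthActive is "1X", which is the mismatch pointed out above.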
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Alan Cook @ 2011-03-28 19:26 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
>
> LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no
> force_link_width option in OpenSM so I think the width you set is
> getting overwritten much like the speed was earlier until you discovered
> force_link_speed 0.

But I see LinkWidthActive set to 1X, which as I understood denoted the width
currently being used. If I set LinkWidthActive to 4X, I am able to transfer
just fine.

Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
likely by OpenSM as you suggest), but only after the link is up and active.

I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

>
> Also, is this b2b HCAs or is there a switch in the topology ?
>

It's b2b HCAs, no switch.
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Jason Gunthorpe @ 2011-03-28 19:41 UTC
To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On Mon, Mar 28, 2011 at 07:26:46PM +0000, Alan Cook wrote:
> Hal Rosenstock <hal@...> writes:
> >
> > LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no
> > force_link_width option in OpenSM so I think the width you set is
> > getting overwritten much like the speed was earlier until you discovered
> > force_link_speed 0.
>
> But I see LinkWidthActive set to 1X, which as I understood denoted the width
> currently being used. If I set LinkWidthActive to 4X, I am able to transfer
> just fine.
>
> Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
> likely by OpenSM as you suggest), but only after the link is up and active.
>
> I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

Are you using IPoIB to test your ping? You might be having multicast
group membership join problems due to a rate mismatch? Try one of the
low level verbs only diags.

Jason
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Alan Cook @ 2011-03-28 20:41 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Jason Gunthorpe <jgunthorpe@...> writes:
>
> Are you using IPoIB to test your ping? You might be having multicast
> group membership join problems due to a rate mismatch? Try one of the
> low level verbs only diags.
>

I found the following issue when running ibdiagnet (which I had yet to run).

-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Port raid1/P1 lid=0x0002 guid=0x0002c903000aa6d7 dev=26428 can not join due to rate:2.5Gbps < group:10Gbps
-W- Port raid2/P1 lid=0x0001 guid=0x0002c903000aa6cf dev=26428 can not join due to rate:2.5Gbps < group:10Gbps
-W- Suboptimal rate for group. Lowest member rate:120Gbps > group-rate:10Gbps

I also changed the speed to 5 Gbps (which I thought had worked earlier) and I am
unable to transfer using a width of 1X. I receive the same warnings as above.

After doing some searching, I created a partitions.conf (which did not exist at
the location configured in opensm.conf). I added the following line:

Test,ipoib,rate=1,defmember=full:0x0002c903000aa6d7,0x0002c903000aa6cf;

After restarting opensm, I find that the above warnings in ibdiagnet no
longer appear, but I am still unable to communicate between PCs.

Did I miss something or do something incorrectly?
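The ibdiagnet warnings suggest a simple rule: a port cannot join the IPoIB multicast group when its own rate is below the group rate. A minimal sketch of that check follows, assuming the port rate is lane speed times width and the group rate is the 10 Gbps reported above; the function names are made up for illustration.

```python
GROUP_RATE_GBPS = 10.0  # "rate:10Gbps" from the ibdiagnet subnet line


def port_rate(speed_gbps, width):
    """Aggregate port rate in Gbps: per-lane speed times lane count."""
    return speed_gbps * width


def can_join(speed_gbps, width, group_rate=GROUP_RATE_GBPS):
    """Assumed join rule inferred from the warning text: port >= group."""
    return port_rate(speed_gbps, width) >= group_rate


if __name__ == "__main__":
    for speed, width in [(2.5, 1), (2.5, 4), (5.0, 1), (5.0, 4), (10.0, 1)]:
        verdict = "joins" if can_join(speed, width) else "can not join"
        print(f"{speed} Gbps x{width}: {verdict}")
```

Under this rule both 2.5 Gbps x1 and 5.0 Gbps x1 fall below the default group rate, which matches the same warning appearing after the change to 5 Gbps at 1X.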
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Hal Rosenstock @ 2011-03-29 17:37 UTC
To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 4:41 PM, Alan Cook wrote:
> Jason Gunthorpe<jgunthorpe@...> writes:
>
>>
>> Are you using IPoIB to test your ping? You might be having multicast
>> group membership join problems due to a rate mismatch? Try one of the
>> low level verbs only diags.
>>
>
> I found the following issue when running ibdiagnet (which I had yet to run).
>
> -I---------------------------------------------------
> -I- IPoIB Subnets Check
> -I---------------------------------------------------
> -I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
> -W- Port raid1/P1 lid=0x0002 guid=0x0002c903000aa6d7 dev=26428 can not join due
> to rate:2.5Gbps< group:10Gbps
> -W- Port raid2/P1 lid=0x0001 guid=0x0002c903000aa6cf dev=26428 can not join due
> to rate:2.5Gbps< group:10Gbps
> -W- Suboptimal rate for group. Lowest member rate:120Gbps> group-rate:10Gbps
>
> I also changed the speed to 5 Gbps (which I thought had worked earlier) and I am
> unable to transfer using a width 1X. I receive the same warnings as above.

The default is rate 3 (10 Gbps) so I would think anything less than that
wouldn't work without reconfiguration.

> After doing some searching, I created a partitions.conf (which did not exist at
> configured location in opensm.conf). I added the following line:
>
> Test,ipoib,rate=1,defmember=full:0x0002c903000aa6d7,0x0002c903000aa6cf;

How about if rather than Test, you change it to Default=0x7fff ?

> After restarting opensm.conf, I find that the above warnings in ibdiagnet no
> longer appear, but I am still unable to communicate between PCs.
>
> Did I miss something or do something incorrectly?

The above is pertinent to IPoIB ping but couldn't explain ibping lack of
connectivity.
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Alan Cook @ 2011-03-29 18:01 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hal Rosenstock <hal@...> writes:
>
> How about if rather than Test, you change it to Default=0x7fff ?
>

I tried using the PKey since posting, but it alone did not solve the issue.

>
> The above is pertinent to IPoIB ping but couldn't explain ibping lack of
> connectivity.
>

Turns out I was using the wrong port GUID (bad copy and paste) when testing with
ibping. It has been working all along.

I have just now solved the issue. The partitions.conf file took me in the right
direction. The solution was to use rate=2 instead of rate=1 (which
apparently is a reserved value). I was unable to find documentation for the
rate-value pairs, so I was assuming that rate=1 was 2.5 Gbps. I finally found
documentation for the rates in a post on the old ofa-general mailing list:

#define IB_PATH_RECORD_RATE_2_5_GBS 2
#define IB_PATH_RECORD_RATE_10_GBS 3
#define IB_PATH_RECORD_RATE_30_GBS 4
#define IB_PATH_RECORD_RATE_5_GBS 5
#define IB_PATH_RECORD_RATE_20_GBS 6
#define IB_PATH_RECORD_RATE_40_GBS 7
#define IB_PATH_RECORD_RATE_60_GBS 8
#define IB_PATH_RECORD_RATE_80_GBS 9
#define IB_PATH_RECORD_RATE_120_GBS 10
#define IB_MIN_RATE IB_PATH_RECORD_RATE_2_5_GBS
#define IB_MAX_RATE IB_PATH_RECORD_RATE_120_GBS

Setting my partitions.conf as follows has solved the issue:

Default=0x7fff,ipoib,rate=2:ALL=full;

Thanks everyone for the help.
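Because the rate codes are not in numeric order (5 Gbps is code 5, between 30 and 20 Gbps), looking them up is safer than guessing. The sketch below copies the table from the #define list in this message and builds a partitions.conf line in the same shape as the working one; the helper names are hypothetical, and the idea that the code should match speed times width is an assumption drawn from this thread.

```python
# Rate in Gbps -> path record rate code, copied from the #define list.
RATE_CODES = {
    2.5: 2, 10: 3, 30: 4, 5: 5, 20: 6,
    40: 7, 60: 8, 80: 9, 120: 10,
}


def rate_code(speed_gbps, width):
    """Look up the code for a link running at speed_gbps per lane x width."""
    rate = speed_gbps * width
    if rate not in RATE_CODES:
        raise ValueError(f"no rate code listed for {rate} Gbps")
    return RATE_CODES[rate]


def partition_line(name, pkey, code):
    """Format an IPoIB partition entry that admits all ports as full members."""
    return f"{name}={pkey:#06x},ipoib,rate={code}:ALL=full;"


if __name__ == "__main__":
    code = rate_code(2.5, 1)  # the 2.5 Gbps x1 configuration
    print(partition_line("Default", 0x7FFF, code))
```

For 2.5 Gbps x1 this yields rate=2, and the generated line is identical to the partitions.conf entry that resolved the problem.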
* Re: Unable to Connect Using Active Speed 2.5 Gbps and Active Width of 1
From: Hal Rosenstock @ 2011-03-28 20:18 UTC
To: Alan Cook; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 3/28/2011 3:26 PM, Alan Cook wrote:
> Hal Rosenstock<hal@...> writes:
>>
>> LinkWidthEnabled is 1X or 4X and not just 1X. That's because there's no
>> force_link_width option in OpenSM so I think the width you set is
>> getting overwritten much like the speed was earlier until you discovered
>> force_link_speed 0.
>
> But I see LinkWidthActive set to 1X, which as I understood denoted the width
> currently being used.

Yes as long as a renegotiation occurred after the enabled width was changed.

> If I set LinkWidthActive to 4X, I am able to transfer
> just fine.
>
> Also, I do see LinkWidthEnabled being changed from "1X" to "1X or 4X" (most
> likely by OpenSM as you suggest), but only after the link is up and active.

If it's changed after the link is up and no renegotiation occurs, then
this is a don't care as you indicate.

> I am able to transfer at 5 Gbps 1X and 10 Gbps 1X, just not 2.5 Gbps 1X.

I don't have a theory for why only that combo would fail. Do you have any
other HCA types to try ?

>>
>> Also, is this b2b HCAs or is there a switch in the topology ?
>>
>
> It's b2b HCAs, no switch.

I think ibportstate uses combined routing which is problematic if directed
at HCA.