* Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
@ 2011-12-15 23:17 Gerben Roest
From: Gerben Roest @ 2011-12-15 23:17 UTC
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
Hi,
Starting opensm from OFED 1.5.1, 1.5.3.2, 1.5.4 on a Scientific Linux 5
machine, directly linked to its neighbour (a twin 1U setup) gives me no
connection but lots of errors in /var/log/opensm.log, like these:
Dec 15 22:38:35 685651 [45AFD940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
from port 0x001e8c0000b90641 (vespasianus HCA-1), sending
IB_SA_MAD_STATUS_REQ_INVALID
Dec 15 22:38:35 686174 [464FE940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
from port 0x001e8c0000c84b62 (titus HCA-1), sending
IB_SA_MAD_STATUS_REQ_INVALID
Does anyone know what happens here? Another twin node has no problems,
that one uses OFED-1.5.1.
I can send a "-V" log of opensm or any config files if you like,
thanks,
Gerben
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
@ 2011-12-16 0:06 Ira Weiny
From: Ira Weiny @ 2011-12-16 0:06 UTC
To: Gerben Roest; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On Thu, 15 Dec 2011 15:17:24 -0800
Gerben Roest <g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org> wrote:
> Does anyone know what happens here? Another twin node has no problems,
> that one uses OFED-1.5.1.
>
> I can send a "-V" log of opensm or any config files if you like,
Just set -D 0x7 which adds VERBOSE and send the snippet around the above errors.
Ira
--
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
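For readers unfamiliar with the -D mask, here is a small Python sketch that decodes it. The bit names are the ones listed in the opensm(8) man page; the table below is a convenience copy, and the helper name decode_log_flags is made up for this illustration, so check it against your installed version.

```python
# Log-level bits for "opensm -D <mask>", copied from the opensm(8) man page.
# Treat this as an illustration; your OpenSM build is the authority.
OSM_LOG_FLAGS = {
    0x01: "ERROR",
    0x02: "INFO",
    0x04: "VERBOSE",
    0x08: "DEBUG",
    0x10: "FUNCS",
    0x20: "FRAMES",
    0x40: "ROUTING",
}


def decode_log_flags(mask):
    """Return the names of the log levels enabled by a -D mask (hypothetical helper)."""
    return [name for bit, name in sorted(OSM_LOG_FLAGS.items()) if mask & bit]


if __name__ == "__main__":
    # 0x7 = ERROR + INFO + VERBOSE, i.e. VERBOSE on top of the two basic levels.
    print(decode_log_flags(0x07))
```

The same bit values show up as the third column of every opensm.log entry (0x01, 0x04, 0x08, 0x10, 0x20), which makes it easy to filter a log by level afterwards.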
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
@ 2011-12-16 8:56 Gerben Roest
From: Gerben Roest @ 2011-12-16 8:56 UTC
To: Ira Weiny; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 16-12-2011 1:06, Ira Weiny wrote:
> Just set -D 0x7 which adds VERBOSE and send the snippet around the above errors.
Dec 15 23:35:05 791001 [4399A940] 0x10 -> osm_vendor_send: [
Dec 15 23:35:05 791008 [4399A940] 0x04 -> osm_vendor_send: RMPP 0 length 256
Dec 15 23:35:05 791021 [4399A940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 791028 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD 0x3dd9290
Dec 15 23:35:05 791034 [4399A940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 791040 [4399A940] 0x08 -> osm_vendor_send: Completed sending response or unsolicited p_madw = 0x3ddf5c0
Dec 15 23:35:05 791046 [4399A940] 0x10 -> osm_vendor_send: ]
Dec 15 23:35:05 791051 [4399A940] 0x10 -> osm_sa_send_error: ]
Dec 15 23:35:05 791057 [4399A940] 0x10 -> mcmr_rcv_join_mgrp: ]
Dec 15 23:35:05 791062 [4399A940] 0x10 -> osm_mcmr_rcv_process: ]
Dec 15 23:35:05 791068 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: [
Dec 15 23:35:05 791073 [4399A940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 791079 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD 0x3dd7290
Dec 15 23:35:05 791084 [4399A940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 791090 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: ]
Dec 15 23:35:05 792086 [4B1A6940] 0x10 -> osm_vendor_get: [
Dec 15 23:35:05 792106 [4B1A6940] 0x08 -> osm_vendor_get: Acquiring UMAD for p_madw = 0x3ddf5d8, size = 256
Dec 15 23:35:05 792117 [4B1A6940] 0x08 -> osm_vendor_get: Acquired UMAD 0x3dd7290, size = 256
Dec 15 23:35:05 792126 [4B1A6940] 0x10 -> osm_vendor_get: ]
Dec 15 23:35:05 792132 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: [
Dec 15 23:35:05 792139 [4B1A6940] 0x08 -> sa_mad_ctrl_rcv_callback: 4 SA MADs received
Dec 15 23:35:05 792152 [4B1A6940] 0x20 -> SA MAD dump:
    base_ver................0x1
    mgmt_class..............0x3
    class_ver...............0x2
    method..................0x2 (SubnAdmSet)
    status..................0x0
    resv....................0x0
    trans_id................0x53bf6d21e
    attr_id.................0x38 (MCMemberRecord)
    resv1...................0x0
    attr_mod................0x0
    rmpp_version............0x0
    rmpp_type...............0x0
    rmpp_flags..............0x0
    rmpp_status.............0x0
    seg_num.................0x0
    payload_len/new_win.....0x0
    sm_key..................0x0000000000000000
    attr_offset.............0x0
    resv2...................0x0
    comp_mask...............0x0000000000010083
Dec 15 23:35:05 792158 [4B1A6940] 0x10 -> sa_mad_ctrl_process: [
Dec 15 23:35:05 792165 [4B1A6940] 0x08 -> sa_mad_ctrl_process: Posting Dispatcher message OSM_MSG_MAD_MCMEMBER_RECORD
Dec 15 23:35:05 792187 [4B1A6940] 0x10 -> sa_mad_ctrl_process: ]
Dec 15 23:35:05 792194 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: ]
Dec 15 23:35:05 792204 [46B9F940] 0x10 -> osm_mcmr_rcv_process: [
Dec 15 23:35:05 792211 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: [
Dec 15 23:35:05 792216 [46B9F940] 0x08 -> mcmr_rcv_join_mgrp: Dump of incoming record
Dec 15 23:35:05 792228 [46B9F940] 0x08 -> MCMember Record dump:
    MGID....................ff12:401b:ffff::ffff:ffff
    PortGid.................fe80::1e:8c00:b9:641
    qkey....................0x0
    mlid....................0x0
    mtu.....................0x0
    TClass..................0x0
    pkey....................0xFFFF
    rate....................0x0
    pkt_life................0x0
    SLFlowLabelHopLimit.....0x0
    ScopeState..............0x1
    ProxyJoin...............0x0
Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
Dec 15 23:35:05 792243 [46B9F940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed from port 0x001e8c0000b90641 (vespasianus HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
Dec 15 23:35:05 792253 [46B9F940] 0x10 -> osm_sa_send_error: [
Dec 15 23:35:05 792260 [46B9F940] 0x10 -> osm_vendor_get: [
Dec 15 23:35:05 792266 [46B9F940] 0x08 -> osm_vendor_get: Acquiring UMAD for p_madw = 0x3dd73f8, size = 256
Dec 15 23:35:05 792273 [46B9F940] 0x08 -> osm_vendor_get: Acquired UMAD 0x3dd9290, size = 256
Dec 15 23:35:05 792279 [46B9F940] 0x10 -> osm_vendor_get: ]
Dec 15 23:35:05 792291 [46B9F940] 0x20 -> SA MAD dump:
    base_ver................0x1
    mgmt_class..............0x3
    class_ver...............0x2
    method..................0x81 (SubnAdmGetResp)
    status..................0x200
    resv....................0x0
    trans_id................0x53bf6d21e
    attr_id.................0x38 (MCMemberRecord)
    resv1...................0x0
    attr_mod................0x0
    rmpp_version............0x0
    rmpp_type...............0x0
    rmpp_flags..............0x0
    rmpp_status.............0x0
    seg_num.................0x0
    payload_len/new_win.....0x0
    sm_key..................0x0000000000000000
    attr_offset.............0x0
    resv2...................0x0
    comp_mask...............0x0000000000010083
Dec 15 23:35:05 792298 [46B9F940] 0x10 -> osm_vendor_send: [
Dec 15 23:35:05 792304 [46B9F940] 0x04 -> osm_vendor_send: RMPP 0 length 256
Dec 15 23:35:05 792318 [46B9F940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 792325 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD 0x3dd9290
Dec 15 23:35:05 792331 [46B9F940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 792337 [46B9F940] 0x08 -> osm_vendor_send: Completed sending response or unsolicited p_madw = 0x3dd73e0
Dec 15 23:35:05 792343 [46B9F940] 0x10 -> osm_vendor_send: ]
Dec 15 23:35:05 792360 [46B9F940] 0x10 -> osm_sa_send_error: ]
Dec 15 23:35:05 792366 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: ]
Dec 15 23:35:05 792371 [46B9F940] 0x10 -> osm_mcmr_rcv_process: ]
Dec 15 23:35:05 792377 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: [
Dec 15 23:35:05 792383 [46B9F940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 792388 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD 0x3dd7e40
Dec 15 23:35:05 792394 [46B9F940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 792400 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: ]
Dec 15 23:35:09 759207 [4A7A5940] 0x08 -> sm_sweeper: Off schedule sweep signalled
Dec 15 23:35:09 759229 [4A7A5940] 0x10 -> osm_state_mgr_process: [
Dec 15 23:35:09 759240 [4A7A5940] 0x08 -> osm_state_mgr_process: Received signal OSM_SIGNAL_SWEEP in state MASTER
Dec 15 23:35:09 759249 [4A7A5940] 0x10 -> state_mgr_sweep_hop_0: [
Dec 15 23:35:09 759258 [4A7A5940] 0x04 -> state_mgr_sweep_hop_0:
thanks,
Gerben
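In a dump this verbose, the one line that matters is easy to miss. The following Python sketch pulls the rate complaints out of an opensm.log. It assumes that each entry in the real file sits on a single line in the "timestamp [thread] level -> function: message" shape seen in this log; the function name rate_rejections is invented for the example.

```python
import re

# One opensm.log entry, in the shape seen in the excerpt from this thread:
#   Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
ENTRY = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d\d:\d\d:\d\d \d+) "
    r"\[(?P<thread>[0-9A-Fa-f]+)\] "
    r"(?P<level>0x[0-9A-Fa-f]+) -> "
    r"(?P<func>\w+): (?P<msg>.*)$"
)
RATE = re.compile(r"Port's RATE (\d+) is less than (\d+)")


def rate_rejections(lines):
    """Yield (timestamp, port_rate_code, required_rate_code) per rate complaint."""
    for line in lines:
        entry = ENTRY.match(line.strip())
        if not entry or entry.group("func") != "validate_port_caps":
            continue  # dump bodies and other functions are skipped
        rate = RATE.search(entry.group("msg"))
        if rate:
            yield entry.group("stamp"), int(rate.group(1)), int(rate.group(2))


if __name__ == "__main__":
    sample = [
        "Dec 15 23:35:05 792228 [46B9F940] 0x08 -> MCMember Record dump:",
        "Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: "
        "Port's RATE 2 is less than 3",
    ]
    print(list(rate_rejections(sample)))
```

To run it on a real file, pass an open file object instead of the sample list, for example rate_rejections(open("/var/log/opensm.log")).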
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
@ 2011-12-16 9:14 Alex Netes
From: Alex Netes @ 2011-12-16 9:14 UTC
To: Gerben Roest
Cc: Ira Weiny, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Hi Gerben,
It's complaining about the link rate:
Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
Probably, the host that is trying to join is connected via 1x cable.
The rate is defined by the capabilities of the host that opened a group, so
you see this problem only when the host with higher rate created the MC group.
-- Alex
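The two numbers in the validate_port_caps message are InfiniBand rate codes, not Gb/s figures. As a rough Python sketch, the table below uses the rate encoding from the InfiniBand specification (the PathRecord and MCMemberRecord Rate field); the helper names explain and can_join are made up here, and the comparison only illustrates the idea; it is not OpenSM's actual code.

```python
# Rate codes from the InfiniBand specification, mapped to Gb/s.
# Note that the codes are not in speed order (5 means 5 Gb/s, 4 means 30 Gb/s).
IB_RATE_GBPS = {2: 2.5, 3: 10, 4: 30, 5: 5, 6: 20, 7: 40, 8: 60, 9: 80, 10: 120}


def explain(port_code, required_code):
    """Turn the two codes from the log message into readable speeds."""
    port = IB_RATE_GBPS[port_code]
    need = IB_RATE_GBPS[required_code]
    return f"port runs at {port:g} Gb/s, group requires {need:g} Gb/s"


def can_join(port_code, group_code):
    """Illustrative check: compare real speeds, never the raw codes."""
    return IB_RATE_GBPS[port_code] >= IB_RATE_GBPS[group_code]


if __name__ == "__main__":
    # "Port's RATE 2 is less than 3" from the log in this thread
    print(explain(2, 3))
    print(can_join(2, 3))
```

Read this way, the message says the port came up at 2.5 Gb/s (a 1x SDR link) while the multicast group expects 10 Gb/s (typically 4x SDR).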
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
@ 2011-12-16 10:46 Gerben Roest
From: Gerben Roest @ 2011-12-16 10:46 UTC
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 16-12-2011 10:14, Alex Netes wrote:
> Hi Gerben,
>
> It's complaining about the link rate:
>
> Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
>
> Probably, the host that is trying to join is connected via 1x cable.
> The rate is defined by the capabilities of the host that opened a group, so
> you see this problem only when the host with higher rate created the MC group.
Is it possible to force them to some specified speed?
The strange thing is that both hosts show this problem if they start
opensm, they have the same errors in /var/log/opensm.log. This is what
both hosts have:
[root@titus ~]# lspci -v |grep Infini
0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR / 10GigE] (rev a0)
[root@vespasianus ~]# lspci -v |grep Infini
0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR / 10GigE] (rev a0)
The hosts are connected to each other's single port via one IB cable.
[root@vespasianus ~]# grep -A1 -B1 INVALID /var/log/opensm.log| tail
Dec 16 11:35:10 041359 [483D2940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed from port 0x001e8c0000c84b62 (titus HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
Dec 16 11:35:10 041365 [483D2940] 0x10 -> osm_sa_send_error: [
--
Dec 16 11:35:17 351591 [429C9940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
Dec 16 11:35:17 351598 [429C9940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed from port 0x001e8c0000b90641 (vespasianus HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
Dec 16 11:35:17 351604 [429C9940] 0x10 -> osm_sa_send_error: [
--
Dec 16 11:35:18 042907 [43DCB940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
Dec 16 11:35:18 042914 [43DCB940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed from port 0x001e8c0000c84b62 (titus HCA-1), sending IB_SA_MAD_STATUS_REQ_INVALID
Dec 16 11:35:18 042920 [43DCB940] 0x10 -> osm_sa_send_error: [
Gerben
--
Grep IT                      tel: 0252-769005
Egelantier 3                 fax: 0252-769006
2211 NN Noordwijkerhout      g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org
The Netherlands
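The lspci line names the adapter model and what it is capable of; it does not show the rate the link actually negotiated. A minimal Python sketch for reading the negotiated rate follows. It assumes the kernel exposes it under /sys/class/infiniband/<device>/ports/<n>/rate as text like "20 Gb/sec (4X DDR)"; the exact wording varies between kernel versions, and the function names are invented for this example.

```python
import glob
import re

# Assumed sysfs format: "<speed> Gb/sec (<width>X[ <generation>])"
RATE_RE = re.compile(r"^\s*([\d.]+)\s*Gb/sec\s*\((\d+)X\s*(\w*)\)")


def parse_rate(text):
    """Parse a rate string into (gbps, lane_width, generation)."""
    match = RATE_RE.match(text)
    if not match:
        raise ValueError(f"unrecognised rate string: {text!r}")
    # Assumption: kernels that print no generation suffix mean SDR.
    return float(match.group(1)), int(match.group(2)), match.group(3) or "SDR"


def local_port_rates(pattern="/sys/class/infiniband/*/ports/*/rate"):
    """Return {sysfs_path: parsed_rate} for every local InfiniBand port."""
    rates = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            rates[path] = parse_rate(handle.read())
    return rates


if __name__ == "__main__":
    for path, (gbps, width, generation) in local_port_rates().items():
        print(f"{path}: {gbps:g} Gb/s over {width}X {generation}")
```

On two DDR adapters joined by a healthy 4x cable one would expect a 4X DDR reading on both sides; a 1X reading would point at the cable or connector rather than at OpenSM.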
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID [not found] ` <4EEB216D.2010407-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org> @ 2011-12-16 12:30 ` Hal Rosenstock [not found] ` <4EEB39E8.5030601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> 0 siblings, 1 reply; 11+ messages in thread From: Hal Rosenstock @ 2011-12-16 12:30 UTC (permalink / raw) To: Gerben Roest; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 12/16/2011 5:46 AM, Gerben Roest wrote: > On 16-12-2011 10:14, Alex Netes wrote: >> Hi Gerben, >> >> It's complaining about the link rate: >> >> Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3 >> >> Probably, the host that is trying to join is connected via 1x cable. >> The rate is defined by the capabilities of the host that opened a group, so >> you see this problem only when the host with higher rate created the MC group. > > Is it possible to force them to some specified speed? The easiest way to fix this is to specify rate=2 in the partition file for the default partition as documented in the man page under PARTITION CONFIGURATION SECTION as follows: Default=0x7fff,ipoib,rate=2:ALL=full; > The strange thing is that both hosts show this problem if they start > opensm, What OpenSM version is this ? > they have the same errors in /var/log/opensm.log. This is what > both hosts have: > > [root@titus ~]# lspci -v |grep Infini > 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 > 5GT/s - IB DDR / 10GigE] (rev a0) > > [root@vespasianus ~]# lspci -v |grep Infini > 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 > 5GT/s - IB DDR / 10GigE] (rev a0) What (rate) is shown in ibstat or ibstatus for each port ? > The hosts are connected to each other's single port via one IB cable. I hope they have the same rate on both ports then. 
-- Hal
[parent not found: <4EEB39E8.5030601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
[not found] ` <4EEB39E8.5030601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2011-12-16 12:55 ` Gerben Roest
[not found] ` <4EEB3FD3.3080409-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Gerben Roest @ 2011-12-16 12:55 UTC (permalink / raw)
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Alex, Hal,

On 16-12-2011 13:30, Hal Rosenstock wrote:
> On 12/16/2011 5:46 AM, Gerben Roest wrote:
>> On 16-12-2011 10:14, Alex Netes wrote:
>>> Hi Gerben,
>>>
>>> It's complaining about the link rate:
>>>
>>> Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
>>>
>>> Probably, the host that is trying to join is connected via 1x cable.
>>> The rate is defined by the capabilities of the host that opened a group, so
>>> you see this problem only when the host with higher rate created the MC group.
>>
>> Is it possible to force them to some specified speed?
>
> The easiest way to fix this is to specify rate=2 in the partition file
> for the default partition as documented in the man page under PARTITION
> CONFIGURATION SECTION as follows:
>
> Default=0x7fff,ipoib,rate=2:ALL=full;

This does the trick! Thanks!

>
>> The strange thing is that both hosts show this problem if they start
>> opensm,
>
> What OpenSM version is this ?

opensm-3.3.9-1.x86_64

But opensm from OFED-1.5.4 gave the same error.

>
>> they have the same errors in /var/log/opensm.log. This is what
>> both hosts have:
>>
>> [root@titus ~]# lspci -v |grep Infini
>> 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0
>> 5GT/s - IB DDR / 10GigE] (rev a0)
>>
>> [root@vespasianus ~]# lspci -v |grep Infini
>> 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0
>> 5GT/s - IB DDR / 10GigE] (rev a0)
>
> What (rate) is shown in ibstat or ibstatus for each port ?

Both machines have one port each.
Both machines give Rate=2, before and
after the opensm partitions.conf edit.

>
>> The hosts are connected to each other's single port via one IB cable.
>
> I hope they have the same rate on both ports then.

yes, they had, and have. They should be identical on-board "cards".

Could this be a cable problem?
They should be DDR cards. Does Rate=2 mean DDR?

thanks,

Gerben

--
Grep IT                    tel: 0252-769005
Egelantier 3               fax: 0252-769006
2211 NN Noordwijkerhout    g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org
The Netherlands
[parent not found: <4EEB3FD3.3080409-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>]
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
[not found] ` <4EEB3FD3.3080409-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
@ 2011-12-16 13:10 ` Hal Rosenstock
[not found] ` <4EEB4362.1050505-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Hal Rosenstock @ 2011-12-16 13:10 UTC (permalink / raw)
To: Gerben Roest; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hi Gerben,

On 12/16/2011 7:55 AM, Gerben Roest wrote:
> Hi Alex, Hal,
>
> On 16-12-2011 13:30, Hal Rosenstock wrote:
>> On 12/16/2011 5:46 AM, Gerben Roest wrote:
>>> On 16-12-2011 10:14, Alex Netes wrote:
>>>> Hi Gerben,
>>>>
>>>> It's complaining about the link rate:
>>>>
>>>> Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's RATE 2 is less than 3
>>>>
>>>> Probably, the host that is trying to join is connected via 1x cable.
>>>> The rate is defined by the capabilities of the host that opened a group, so
>>>> you see this problem only when the host with higher rate created the MC group.
>>>
>>> Is it possible to force them to some specified speed?
>>
>> The easiest way to fix this is to specify rate=2 in the partition file
>> for the default partition as documented in the man page under PARTITION
>> CONFIGURATION SECTION as follows:
>>
>> Default=0x7fff,ipoib,rate=2:ALL=full;
>
> This does the trick! Thanks!
>
>>
>>> The strange thing is that both hosts show this problem if they start
>>> opensm,
>>
>> What OpenSM version is this ?
>
> opensm-3.3.9-1.x86_64
>
> But opensm from OFED-1.5.4 gave the same error.
>
>>
>>> they have the same errors in /var/log/opensm.log. This is what
>>> both hosts have:
>>>
>>> [root@titus ~]# lspci -v |grep Infini
>>> 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0
>>> 5GT/s - IB DDR / 10GigE] (rev a0)
>>>
>>> [root@vespasianus ~]# lspci -v |grep Infini
>>> 0a:00.0 InfiniBand: Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0
>>> 5GT/s - IB DDR / 10GigE] (rev a0)
>>
>> What (rate) is shown in ibstat or ibstatus for each port ?
>
> Both machines have one port each. Both machines give Rate=2, before and
> after the opensm partitions.conf edit.
>
>>
>>> The hosts are connected to each other's single port via one IB cable.
>>
>> I hope they have the same rate on both ports then.
>
> yes, they had, and have. They should be identical on-board "cards".
>
> Could this be a cable problem?

Yes; do you have another cable to try ? If that increases the active
port rate to the full port rate (4x DDR) then you should be able to
either remove the partition config you just added (and use rate=3) or
make the group rate=6 (see below).

> They should be DDR cards. Does Rate=2 mean DDR?

No; it means 1x SDR (lowest speed/width). 4x DDR would be rate 6 (20
Gbps). See IBA 1.2.1 vol 1 PathRecord SA attribute Rate component.

By default, when the rate is not explicitly specified, OpenSM sets the
rate for the IPoIB broadcast groups to 3 (10 Gbps), which is 4x SDR.
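The rate encodings mentioned above can be put in a small lookup table, which also makes the "Port's RATE 2 is less than 3" log line easier to read. This is a rough sketch, not OpenSM code: the table follows the IBA PathRecord Rate encodings, while rate_gbps and port_can_join are hypothetical names, and comparing by bandwidth is an assumption about what the join check is meant to express.

```python
# IBA PathRecord "Rate" encodings mapped to link bandwidth in Gbps.
# The width/speed notes cover only the values discussed in this thread.
IB_RATE_GBPS = {
    2: 2.5,    # 1x SDR
    3: 10.0,   # 4x SDR (default IPoIB broadcast group rate per the thread)
    4: 30.0,
    5: 5.0,
    6: 20.0,   # 4x DDR
    7: 40.0,
    8: 60.0,
    9: 80.0,
    10: 120.0,
}


def rate_gbps(code):
    """Return the bandwidth in Gbps for a rate encoding (hypothetical helper)."""
    return IB_RATE_GBPS[code]


def port_can_join(port_rate, group_rate):
    """Assumed join rule: the port must be at least as fast as the group.

    Compares real bandwidth rather than raw encodings, because the
    encodings are not in speed order (5 is slower than 3, for example).
    """
    return rate_gbps(port_rate) >= rate_gbps(group_rate)


if __name__ == "__main__":
    # The situation in the log: a rate-2 port joining the default rate-3 group.
    print(port_can_join(2, 3))
    # After setting rate=2 in the partition file, the same port is accepted.
    print(port_can_join(2, 2))
```

With the group rate lowered to 2, a port running at 1x SDR passes the check, which matches the behaviour reported after the partition file edit.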
-- Hal > > thanks, > > Gerben > >>> [root@vespasianus ~]# grep -A1 -B1 INVALID /var/log/opensm.log| tail >>> >>> Dec 16 11:35:10 041359 [483D2940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: >>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed >>> from port 0x001e8c0000c84b62 (titus HCA-1), sending >>> IB_SA_MAD_STATUS_REQ_INVALID >>> Dec 16 11:35:10 041365 [483D2940] 0x10 -> osm_sa_send_error: [ >>> -- >>> Dec 16 11:35:17 351591 [429C9940] 0x04 -> validate_port_caps: Port's >>> RATE 2 is less than 3 >>> Dec 16 11:35:17 351598 [429C9940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: >>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed >>> from port 0x001e8c0000b90641 (vespasianus HCA-1), sending >>> IB_SA_MAD_STATUS_REQ_INVALID >>> Dec 16 11:35:17 351604 [429C9940] 0x10 -> osm_sa_send_error: [ >>> -- >>> Dec 16 11:35:18 042907 [43DCB940] 0x04 -> validate_port_caps: Port's >>> RATE 2 is less than 3 >>> Dec 16 11:35:18 042914 [43DCB940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: >>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed >>> from port 0x001e8c0000c84b62 (titus HCA-1), sending >>> IB_SA_MAD_STATUS_REQ_INVALID >>> Dec 16 11:35:18 042920 [43DCB940] 0x10 -> osm_sa_send_error: [ >>> >>> Gerben >>> >>> >>>> >>>> On 09:56 Fri 16 Dec , Gerben Roest wrote: >>>>> On 16-12-2011 1:06, Ira Weiny wrote: >>>>>> On Thu, 15 Dec 2011 15:17:24 -0800 >>>>>> Gerben Roest <g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Starting opensm from OFED 1.5.1, 1.5.3.2, 1.5.4 on a Scientific Linux 5 >>>>>>> machine, directly linked to its neighbour (a twin 1U setup) gives me no >>>>>>> connection but lots of errors in /var/log/opensm.log, like these: >>>>>>> >>>>>>> Dec 15 22:38:35 685651 [45AFD940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: >>>>>>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed >>>>>>> from port 0x001e8c0000b90641 (vespasianus HCA-1), sending >>>>>>> 
IB_SA_MAD_STATUS_REQ_INVALID >>>>>>> Dec 15 22:38:35 686174 [464FE940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12: >>>>>>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed >>>>>>> from port 0x001e8c0000c84b62 (titus HCA-1), sending >>>>>>> IB_SA_MAD_STATUS_REQ_INVALID >>>>>>> >>>>>>> Does anyone know what happens here? Another twin node has no problems, >>>>>>> that one uses OFED-1.5.1. >>>>>>> >>>>>>> I can send a "-V" log of opensm or any config files if you like, >>>>>> >>>>>> Just set -D 0x7 which adds VERBOSE and send the snippet around the above errors. >>>>> >>>>> Dec 15 23:35:05 791001 [4399A940] 0x10 -> osm_vendor_send: [ >>>>> Dec 15 23:35:05 791008 [4399A940] 0x04 -> osm_vendor_send: RMPP 0 length 256 >>>>> Dec 15 23:35:05 791021 [4399A940] 0x10 -> osm_vendor_put: [ >>>>> Dec 15 23:35:05 791028 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD >>>>> 0x3dd9290 >>>>> Dec 15 23:35:05 791034 [4399A940] 0x10 -> osm_vendor_put: ] >>>>> Dec 15 23:35:05 791040 [4399A940] 0x08 -> osm_vendor_send: Completed >>>>> sending response or unsolicited p_madw = 0x3ddf5c0 >>>>> Dec 15 23:35:05 791046 [4399A940] 0x10 -> osm_vendor_send: ] >>>>> Dec 15 23:35:05 791051 [4399A940] 0x10 -> osm_sa_send_error: ] >>>>> Dec 15 23:35:05 791057 [4399A940] 0x10 -> mcmr_rcv_join_mgrp: ] >>>>> Dec 15 23:35:05 791062 [4399A940] 0x10 -> osm_mcmr_rcv_process: ] >>>>> Dec 15 23:35:05 791068 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: [ >>>>> Dec 15 23:35:05 791073 [4399A940] 0x10 -> osm_vendor_put: [ >>>>> Dec 15 23:35:05 791079 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD >>>>> 0x3dd7290 >>>>> Dec 15 23:35:05 791084 [4399A940] 0x10 -> osm_vendor_put: ] >>>>> Dec 15 23:35:05 791090 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: ] >>>>> Dec 15 23:35:05 792086 [4B1A6940] 0x10 -> osm_vendor_get: [ >>>>> Dec 15 23:35:05 792106 [4B1A6940] 0x08 -> osm_vendor_get: Acquiring UMAD >>>>> for p_madw = 0x3ddf5d8, size = 256 >>>>> Dec 15 23:35:05 792117 [4B1A6940] 0x08 -> 
osm_vendor_get: Acquired UMAD
>>>>> 0x3dd7290, size = 256
>>>>> Dec 15 23:35:05 792126 [4B1A6940] 0x10 -> osm_vendor_get: ]
>>>>> Dec 15 23:35:05 792132 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: [
>>>>> Dec 15 23:35:05 792139 [4B1A6940] 0x08 -> sa_mad_ctrl_rcv_callback: 4 SA
>>>>> MADs received
>>>>> Dec 15 23:35:05 792152 [4B1A6940] 0x20 -> SA MAD dump:
>>>>> base_ver................0x1
>>>>> mgmt_class..............0x3
>>>>> class_ver...............0x2
>>>>> method..................0x2 (SubnAdmSet)
>>>>> status..................0x0
>>>>> resv....................0x0
>>>>> trans_id................0x53bf6d21e
>>>>> attr_id.................0x38
>>>>> (MCMemberRecord)
>>>>> resv1...................0x0
>>>>> attr_mod................0x0
>>>>> rmpp_version............0x0
>>>>> rmpp_type...............0x0
>>>>> rmpp_flags..............0x0
>>>>> rmpp_status.............0x0
>>>>> seg_num.................0x0
>>>>> payload_len/new_win.....0x0
>>>>> sm_key..................0x0000000000000000
>>>>> attr_offset.............0x0
>>>>> resv2...................0x0
>>>>> comp_mask...............0x0000000000010083
>>>>>
>>>>>
>>>>> Dec 15 23:35:05 792158 [4B1A6940] 0x10 -> sa_mad_ctrl_process: [
>>>>> Dec 15 23:35:05 792165 [4B1A6940] 0x08 -> sa_mad_ctrl_process: Posting
>>>>> Dispatcher message OSM_MSG_MAD_MCMEMBER_RECORD
>>>>> Dec 15 23:35:05 792187 [4B1A6940] 0x10 -> sa_mad_ctrl_process: ]
>>>>> Dec 15 23:35:05 792194 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: ]
>>>>> Dec 15 23:35:05 792204 [46B9F940] 0x10 -> osm_mcmr_rcv_process: [
>>>>> Dec 15 23:35:05 792211 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: [
>>>>> Dec 15 23:35:05 792216 [46B9F940] 0x08 -> mcmr_rcv_join_mgrp: Dump of
>>>>> incoming record
>>>>> Dec 15 23:35:05 792228 [46B9F940] 0x08 -> MCMember Record dump:
>>>>>
>>>>> MGID....................ff12:401b:ffff::ffff:ffff
>>>>> PortGid.................fe80::1e:8c00:b9:641
>>>>> qkey....................0x0
>>>>> mlid....................0x0
>>>>> mtu.....................0x0
>>>>> TClass..................0x0
>>>>> pkey....................0xFFFF
>>>>> rate....................0x0
>>>>> pkt_life................0x0
>>>>> SLFlowLabelHopLimit.....0x0
>>>>> ScopeState..............0x1
>>>>> ProxyJoin...............0x0
>>>>> Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's
>>>>> RATE 2 is less than 3
>>>>> Dec 15 23:35:05 792243 [46B9F940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
>>>>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
>>>>> from port 0x001e8c0000b90641 (vespasianus HCA-1), sending
>>>>> IB_SA_MAD_STATUS_REQ_INVALID
>>>>> Dec 15 23:35:05 792253 [46B9F940] 0x10 -> osm_sa_send_error: [
>>>>> Dec 15 23:35:05 792260 [46B9F940] 0x10 -> osm_vendor_get: [
>>>>> Dec 15 23:35:05 792266 [46B9F940] 0x08 -> osm_vendor_get: Acquiring UMAD
>>>>> for p_madw = 0x3dd73f8, size = 256
>>>>> Dec 15 23:35:05 792273 [46B9F940] 0x08 -> osm_vendor_get: Acquired UMAD
>>>>> 0x3dd9290, size = 256
>>>>> Dec 15 23:35:05 792279 [46B9F940] 0x10 -> osm_vendor_get: ]
>>>>> Dec 15 23:35:05 792291 [46B9F940] 0x20 -> SA MAD dump:
>>>>> base_ver................0x1
>>>>> mgmt_class..............0x3
>>>>> class_ver...............0x2
>>>>> method..................0x81
>>>>> (SubnAdmGetResp)
>>>>> status..................0x200
>>>>> resv....................0x0
>>>>> trans_id................0x53bf6d21e
>>>>> attr_id.................0x38
>>>>> (MCMemberRecord)
>>>>> resv1...................0x0
>>>>> attr_mod................0x0
>>>>> rmpp_version............0x0
>>>>> rmpp_type...............0x0
>>>>> rmpp_flags..............0x0
>>>>> rmpp_status.............0x0
>>>>> seg_num.................0x0
>>>>> payload_len/new_win.....0x0
>>>>> sm_key..................0x0000000000000000
>>>>> attr_offset.............0x0
>>>>> resv2...................0x0
>>>>> comp_mask...............0x0000000000010083
>>>>>
>>>>>
>>>>> Dec 15 23:35:05 792298 [46B9F940] 0x10 -> osm_vendor_send: [
>>>>> Dec 15 23:35:05 792304 [46B9F940] 0x04 -> osm_vendor_send: RMPP 0 length 256
>>>>> Dec 15 23:35:05 792318 [46B9F940] 0x10 -> osm_vendor_put: [
>>>>> Dec 15 23:35:05 792325 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD
>>>>> 0x3dd9290
>>>>> Dec 15 23:35:05 792331 [46B9F940] 0x10 -> osm_vendor_put: ]
>>>>> Dec 15 23:35:05 792337 [46B9F940] 0x08 -> osm_vendor_send: Completed
>>>>> sending response or unsolicited p_madw = 0x3dd73e0
>>>>> Dec 15 23:35:05 792343 [46B9F940] 0x10 -> osm_vendor_send: ]
>>>>> Dec 15 23:35:05 792360 [46B9F940] 0x10 -> osm_sa_send_error: ]
>>>>> Dec 15 23:35:05 792366 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: ]
>>>>> Dec 15 23:35:05 792371 [46B9F940] 0x10 -> osm_mcmr_rcv_process: ]
>>>>> Dec 15 23:35:05 792377 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: [
>>>>> Dec 15 23:35:05 792383 [46B9F940] 0x10 -> osm_vendor_put: [
>>>>> Dec 15 23:35:05 792388 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD
>>>>> 0x3dd7e40
>>>>> Dec 15 23:35:05 792394 [46B9F940] 0x10 -> osm_vendor_put: ]
>>>>> Dec 15 23:35:05 792400 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: ]
>>>>> Dec 15 23:35:09 759207 [4A7A5940] 0x08 -> sm_sweeper: Off schedule sweep
>>>>> signalled
>>>>> Dec 15 23:35:09 759229 [4A7A5940] 0x10 -> osm_state_mgr_process: [
>>>>> Dec 15 23:35:09 759240 [4A7A5940] 0x08 -> osm_state_mgr_process:
>>>>> Received signal OSM_SIGNAL_SWEEP in state MASTER
>>>>> Dec 15 23:35:09 759249 [4A7A5940] 0x10 -> state_mgr_sweep_hop_0: [
>>>>> Dec 15 23:35:09 759258 [4A7A5940] 0x04 -> state_mgr_sweep_hop_0:
>>>>>
>>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> Gerben
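For readers following the dump above: the comp_mask value 0x0000000000010083 tells which MCMemberRecord fields the joining port actually filled in. A minimal Python sketch of the decoding follows. It assumes the bit order of the IBA MCMemberRecord component mask (as mirrored, to the best of my knowledge, by the IB_MCR_COMPMASK_* constants in OpenSM's ib_types.h); the table and function names are illustrative and are not OpenSM code.

```python
from typing import List

# Component-mask bit positions for the SA MCMemberRecord attribute.
# Assumption: bit N corresponds to the N-th entry below, following the
# field order of the IBA MCMemberRecord definition.
MCR_COMPMASK_BITS = [
    "MGID", "PortGID", "QKey", "MLID", "MTUSelector", "MTU", "TClass",
    "PKey", "RateSelector", "Rate", "PacketLifeTimeSelector",
    "PacketLifeTime", "SL", "FlowLabel", "HopLimit", "Scope",
    "JoinState", "ProxyJoin",
]


def decode_comp_mask(mask: int) -> List[str]:
    """Return the names of the components whose bit is set in mask."""
    return [name for bit, name in enumerate(MCR_COMPMASK_BITS)
            if mask & (1 << bit)]


if __name__ == "__main__":
    # Value taken from the SA MAD dump in the log.
    print(decode_comp_mask(0x0000000000010083))
```

Under that assumption the request carries only MGID, PortGID, PKey and JoinState. Rate is not among them, which fits a plain join to an existing group, where the port is checked against the group's rate rather than one it asked for.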
[parent not found: <4EEB4362.1050505-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
[not found] ` <4EEB4362.1050505-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2011-12-16 15:37 ` Gerben Roest
[not found] ` <4EEB65D0.8040802-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Gerben Roest @ 2011-12-16 15:37 UTC (permalink / raw)
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 16-12-2011 14:10, Hal Rosenstock wrote:

>> They should be DDR cards. Does Rate=2 mean DDR?
>
> No; it means 1x SDR (lowest speed/width). 4x DDR would be rate 6 (20
> Gbps). See IBA 1.2.1 vol 1 PathRecord SA attribute Rate component.
>
> By default, OpenSM sets the rate for the IPoIB broadcast groups when not
> explicitly specified is rate 3 (10 Gbps) which is 4x SDR.

I have a similar twin node that does work correctly (has DDR IB) and it
says at ibstat: "Rate: 20"
whereas the two that are having problems say "Rate: 2".

Testing with openmpi osu_bw show:

Rate=2: max bw: 245 MB/s
Rate=20: max bw: 1970 MB/s

greetings,

Gerben
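The rate values Hal mentions are encodings, not Gbps figures. Below is a small Python sketch of the mapping. Values 2, 3 and 6 come from this thread; the other entries are filled in from the IBA 1.2.1 PathRecord Rate table as I recall it and should be treated as an assumption. `port_meets_group_rate` is a hypothetical helper that mimics the comparison behind the "Port's RATE 2 is less than 3" log line; it is not the OpenSM function itself.

```python
# IBA rate encoding -> signalling rate in Gbps.
# 2, 3 and 6 are confirmed in the thread; the rest are assumed from the
# IBA 1.2.1 PathRecord Rate component table.
IB_RATE_GBPS = {
    2: 2.5,    # 1x SDR
    3: 10.0,   # 4x SDR (OpenSM default for the IPoIB broadcast group)
    4: 30.0,
    5: 5.0,
    6: 20.0,   # 4x DDR
    7: 40.0,
    8: 60.0,
    9: 80.0,
    10: 120.0,
}


def port_meets_group_rate(port_rate: int, group_rate: int) -> bool:
    """True if a port with encoding port_rate can join a group that
    requires group_rate. Compare by Gbps, not by the raw encoding:
    encoding 5 (5 Gbps) is slower than encoding 3 (10 Gbps)."""
    return IB_RATE_GBPS[port_rate] >= IB_RATE_GBPS[group_rate]


if __name__ == "__main__":
    # The failing ports: 1x SDR against the default 4x SDR group.
    print(port_meets_group_rate(2, 3))
    # A healthy 4x DDR port against the same group.
    print(port_meets_group_rate(6, 3))
```

Read this way, a port that trained at 1x SDR cannot satisfy the default broadcast group rate, while a 4x DDR port can, which matches the difference between the failing and the working twin node.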
[parent not found: <4EEB65D0.8040802-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>]
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
[not found] ` <4EEB65D0.8040802-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
@ 2011-12-16 15:43 ` Hal Rosenstock
[not found] ` <4EEB6729.8070600-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Hal Rosenstock @ 2011-12-16 15:43 UTC (permalink / raw)
To: Gerben Roest; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 12/16/2011 10:37 AM, Gerben Roest wrote:
> On 16-12-2011 14:10, Hal Rosenstock wrote:
>
>>> They should be DDR cards. Does Rate=2 mean DDR?
>>
>> No; it means 1x SDR (lowest speed/width). 4x DDR would be rate 6 (20
>> Gbps). See IBA 1.2.1 vol 1 PathRecord SA attribute Rate component.
>>
>> By default, OpenSM sets the rate for the IPoIB broadcast groups when not
>> explicitly specified is rate 3 (10 Gbps) which is 4x SDR.
>
> I have a similar twin node that does work correctly (has DDR IB) and it
> says at ibstat: "Rate: 20"
> whereas the two that are having problems say "Rate: 2".
>
> Testing with openmpi osu_bw show:
>
> Rate=2: max bw: 245 MB/s
> Rate=20: max bw: 1970 MB/s

Yes, that's consistent.

Can you temporarily try the cable that is known to work (for rate 20)
between the ports that come up at rate 2 and see if they come up
properly (at rate 20) ?

-- Hal

>
> greetings,
>
> Gerben
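A back-of-the-envelope check of why the osu_bw numbers are consistent with those link rates, sketched in Python. It assumes that ibstat's "Rate: 2" stands for a 1x SDR link signalling at 2.5 Gbps (as Hal's reading suggests), that "Rate: 20" is 4x DDR at 20 Gbps, and that both use 8b/10b encoding so only 80% of the signalling rate carries data. `expected_mb_per_s` is an illustrative helper, not part of any tool mentioned in the thread.

```python
def expected_mb_per_s(signal_gbps: float, encoding_efficiency: float = 0.8) -> float:
    """Theoretical payload bandwidth in MB/s (10^6 bytes per second)
    for a link with the given signalling rate.

    encoding_efficiency=0.8 models 8b/10b encoding (SDR/DDR links)."""
    data_gbps = signal_gbps * encoding_efficiency
    return data_gbps * 1000 / 8


if __name__ == "__main__":
    # (label, signalling rate in Gbps, max bandwidth measured in the thread)
    for label, gbps, measured in [("1x SDR", 2.5, 245), ("4x DDR", 20.0, 1970)]:
        limit = expected_mb_per_s(gbps)
        print(f"{label}: limit {limit:.0f} MB/s, measured {measured} MB/s "
              f"({measured / limit:.1%} of limit)")
```

Under these assumptions both measurements sit just below the theoretical limit of their link, so the slow nodes look like healthy links that trained at the lowest width and speed, which is why a cable swap is a sensible next test.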
[parent not found: <4EEB6729.8070600-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>]
* Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
[not found] ` <4EEB6729.8070600-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2011-12-16 15:56 ` Gerben Roest
0 siblings, 0 replies; 11+ messages in thread
From: Gerben Roest @ 2011-12-16 15:56 UTC (permalink / raw)
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
On 16-12-2011 16:43, Hal Rosenstock wrote:
> On 12/16/2011 10:37 AM, Gerben Roest wrote:
>> On 16-12-2011 14:10, Hal Rosenstock wrote:
>>
>>>> They should be DDR cards. Does Rate=2 mean DDR?
>>>
>>> No; it means 1x SDR (lowest speed/width). 4x DDR would be rate 6 (20
>>> Gbps). See IBA 1.2.1 vol 1 PathRecord SA attribute Rate component.
>>>
>>> By default, OpenSM sets the rate for the IPoIB broadcast groups when not
>>> explicitly specified is rate 3 (10 Gbps) which is 4x SDR.
>>
>> I have a similar twin node that does work correctly (has DDR IB) and it
>> says at ibstat: "Rate: 20"
>> whereas the two that are having problems say "Rate: 2".
>>
>> Testing with openmpi osu_bw show:
>>
>> Rate=2: max bw: 245 MB/s
>> Rate=20: max bw: 1970 MB/s
>
> Yes, that's consistent.
>
> Can you temporarily try the cable that is known to work (for rate 20)
> between the ports that come up at rate 2 and see if they come up
> properly (at rate 20) ?

Yes, I'll try that but that will be next week or so. I'll get back to
you on that.

thanks for your help,

Gerben
Thread overview: 11+ messages
2011-12-15 23:17 Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID Gerben Roest
[not found] ` <4EEA8004.4060103-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
2011-12-16 0:06 ` Ira Weiny
[not found] ` <20111215160600.ebccb033.weiny2-i2BcT+NCU+M@public.gmane.org>
2011-12-16 8:56 ` Gerben Roest
[not found] ` <4EEB07C3.90803-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
2011-12-16 9:14 ` Alex Netes
2011-12-16 10:46 ` Gerben Roest
[not found] ` <4EEB216D.2010407-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
2011-12-16 12:30 ` Hal Rosenstock
[not found] ` <4EEB39E8.5030601-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2011-12-16 12:55 ` Gerben Roest
[not found] ` <4EEB3FD3.3080409-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
2011-12-16 13:10 ` Hal Rosenstock
[not found] ` <4EEB4362.1050505-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2011-12-16 15:37 ` Gerben Roest
[not found] ` <4EEB65D0.8040802-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
2011-12-16 15:43 ` Hal Rosenstock
[not found] ` <4EEB6729.8070600-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2011-12-16 15:56 ` Gerben Roest