From: Gerben Roest <g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org>
To: Ira Weiny <weiny2-i2BcT+NCU+M@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: Problems with link, opensm complains IB_SA_MAD_STATUS_REQ_INVALID
Date: Fri, 16 Dec 2011 09:56:35 +0100
Message-ID: <4EEB07C3.90803@grepit.nl>
In-Reply-To: <20111215160600.ebccb033.weiny2-i2BcT+NCU+M@public.gmane.org>

On 16-12-2011 1:06, Ira Weiny wrote:
> On Thu, 15 Dec 2011 15:17:24 -0800
> Gerben Roest <g.roest-99SnrGqf+M9mR6Xm/wNWPw@public.gmane.org> wrote:
> 
>> Hi,
>>
>> Starting opensm from OFED 1.5.1, 1.5.3.2, 1.5.4 on a Scientific Linux 5
>> machine, directly linked to its neighbour (a twin 1U setup) gives me no
>> connection but lots of errors in /var/log/opensm.log, like these:
>>
>> Dec 15 22:38:35 685651 [45AFD940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
>> from port 0x001e8c0000b90641 (vespasianus HCA-1), sending
>> IB_SA_MAD_STATUS_REQ_INVALID
>> Dec 15 22:38:35 686174 [464FE940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
>> validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
>> from port 0x001e8c0000c84b62 (titus HCA-1), sending
>> IB_SA_MAD_STATUS_REQ_INVALID
>>
>> Does anyone know what happens here? Another twin node has no problems,
>> that one uses OFED-1.5.1.
>>
>> I can send a "-V" log of opensm or any config files if you like,
> 
> Just set -D 0x7 which adds VERBOSE and send the snippet around the above errors.

Dec 15 23:35:05 791001 [4399A940] 0x10 -> osm_vendor_send: [
Dec 15 23:35:05 791008 [4399A940] 0x04 -> osm_vendor_send: RMPP 0 length 256
Dec 15 23:35:05 791021 [4399A940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 791028 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD
0x3dd9290
Dec 15 23:35:05 791034 [4399A940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 791040 [4399A940] 0x08 -> osm_vendor_send: Completed
sending response or unsolicited p_madw = 0x3ddf5c0
Dec 15 23:35:05 791046 [4399A940] 0x10 -> osm_vendor_send: ]
Dec 15 23:35:05 791051 [4399A940] 0x10 -> osm_sa_send_error: ]
Dec 15 23:35:05 791057 [4399A940] 0x10 -> mcmr_rcv_join_mgrp: ]
Dec 15 23:35:05 791062 [4399A940] 0x10 -> osm_mcmr_rcv_process: ]
Dec 15 23:35:05 791068 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: [
Dec 15 23:35:05 791073 [4399A940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 791079 [4399A940] 0x08 -> osm_vendor_put: Retiring UMAD
0x3dd7290
Dec 15 23:35:05 791084 [4399A940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 791090 [4399A940] 0x10 -> sa_mad_ctrl_disp_done_callback: ]
Dec 15 23:35:05 792086 [4B1A6940] 0x10 -> osm_vendor_get: [
Dec 15 23:35:05 792106 [4B1A6940] 0x08 -> osm_vendor_get: Acquiring UMAD
for p_madw = 0x3ddf5d8, size = 256
Dec 15 23:35:05 792117 [4B1A6940] 0x08 -> osm_vendor_get: Acquired UMAD
0x3dd7290, size = 256
Dec 15 23:35:05 792126 [4B1A6940] 0x10 -> osm_vendor_get: ]
Dec 15 23:35:05 792132 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: [
Dec 15 23:35:05 792139 [4B1A6940] 0x08 -> sa_mad_ctrl_rcv_callback: 4 SA
MADs received
Dec 15 23:35:05 792152 [4B1A6940] 0x20 -> SA MAD dump:
                                base_ver................0x1
                                mgmt_class..............0x3
                                class_ver...............0x2
                                method..................0x2 (SubnAdmSet)
                                status..................0x0
                                resv....................0x0
                                trans_id................0x53bf6d21e
                                attr_id.................0x38 (MCMemberRecord)
                                resv1...................0x0
                                attr_mod................0x0
                                rmpp_version............0x0
                                rmpp_type...............0x0
                                rmpp_flags..............0x0
                                rmpp_status.............0x0
                                seg_num.................0x0
                                payload_len/new_win.....0x0
                                sm_key..................0x0000000000000000
                                attr_offset.............0x0
                                resv2...................0x0
                                comp_mask...............0x0000000000010083


Dec 15 23:35:05 792158 [4B1A6940] 0x10 -> sa_mad_ctrl_process: [
Dec 15 23:35:05 792165 [4B1A6940] 0x08 -> sa_mad_ctrl_process: Posting
Dispatcher message OSM_MSG_MAD_MCMEMBER_RECORD
Dec 15 23:35:05 792187 [4B1A6940] 0x10 -> sa_mad_ctrl_process: ]
Dec 15 23:35:05 792194 [4B1A6940] 0x10 -> sa_mad_ctrl_rcv_callback: ]
Dec 15 23:35:05 792204 [46B9F940] 0x10 -> osm_mcmr_rcv_process: [
Dec 15 23:35:05 792211 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: [
Dec 15 23:35:05 792216 [46B9F940] 0x08 -> mcmr_rcv_join_mgrp: Dump of
incoming record
Dec 15 23:35:05 792228 [46B9F940] 0x08 -> MCMember Record dump:
                                MGID....................ff12:401b:ffff::ffff:ffff
                                PortGid.................fe80::1e:8c00:b9:641
                                qkey....................0x0
                                mlid....................0x0
                                mtu.....................0x0
                                TClass..................0x0
                                pkey....................0xFFFF
                                rate....................0x0
                                pkt_life................0x0
                                SLFlowLabelHopLimit.....0x0
                                ScopeState..............0x1
                                ProxyJoin...............0x0
Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's
RATE 2 is less than 3
Dec 15 23:35:05 792243 [46B9F940] 0x01 -> mcmr_rcv_join_mgrp: ERR 1B12:
validate_more_comp_fields, validate_port_caps, or JoinState = 0 failed
from port 0x001e8c0000b90641 (vespasianus HCA-1), sending
IB_SA_MAD_STATUS_REQ_INVALID
Dec 15 23:35:05 792253 [46B9F940] 0x10 -> osm_sa_send_error: [
Dec 15 23:35:05 792260 [46B9F940] 0x10 -> osm_vendor_get: [
Dec 15 23:35:05 792266 [46B9F940] 0x08 -> osm_vendor_get: Acquiring UMAD
for p_madw = 0x3dd73f8, size = 256
Dec 15 23:35:05 792273 [46B9F940] 0x08 -> osm_vendor_get: Acquired UMAD
0x3dd9290, size = 256
Dec 15 23:35:05 792279 [46B9F940] 0x10 -> osm_vendor_get: ]
Dec 15 23:35:05 792291 [46B9F940] 0x20 -> SA MAD dump:
                                base_ver................0x1
                                mgmt_class..............0x3
                                class_ver...............0x2
                                method..................0x81 (SubnAdmGetResp)
                                status..................0x200
                                resv....................0x0
                                trans_id................0x53bf6d21e
                                attr_id.................0x38 (MCMemberRecord)
                                resv1...................0x0
                                attr_mod................0x0
                                rmpp_version............0x0
                                rmpp_type...............0x0
                                rmpp_flags..............0x0
                                rmpp_status.............0x0
                                seg_num.................0x0
                                payload_len/new_win.....0x0
                                sm_key..................0x0000000000000000
                                attr_offset.............0x0
                                resv2...................0x0
                                comp_mask...............0x0000000000010083


Dec 15 23:35:05 792298 [46B9F940] 0x10 -> osm_vendor_send: [
Dec 15 23:35:05 792304 [46B9F940] 0x04 -> osm_vendor_send: RMPP 0 length 256
Dec 15 23:35:05 792318 [46B9F940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 792325 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD
0x3dd9290
Dec 15 23:35:05 792331 [46B9F940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 792337 [46B9F940] 0x08 -> osm_vendor_send: Completed
sending response or unsolicited p_madw = 0x3dd73e0
Dec 15 23:35:05 792343 [46B9F940] 0x10 -> osm_vendor_send: ]
Dec 15 23:35:05 792360 [46B9F940] 0x10 -> osm_sa_send_error: ]
Dec 15 23:35:05 792366 [46B9F940] 0x10 -> mcmr_rcv_join_mgrp: ]
Dec 15 23:35:05 792371 [46B9F940] 0x10 -> osm_mcmr_rcv_process: ]
Dec 15 23:35:05 792377 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: [
Dec 15 23:35:05 792383 [46B9F940] 0x10 -> osm_vendor_put: [
Dec 15 23:35:05 792388 [46B9F940] 0x08 -> osm_vendor_put: Retiring UMAD
0x3dd7e40
Dec 15 23:35:05 792394 [46B9F940] 0x10 -> osm_vendor_put: ]
Dec 15 23:35:05 792400 [46B9F940] 0x10 -> sa_mad_ctrl_disp_done_callback: ]
Dec 15 23:35:09 759207 [4A7A5940] 0x08 -> sm_sweeper: Off schedule sweep
signalled
Dec 15 23:35:09 759229 [4A7A5940] 0x10 -> osm_state_mgr_process: [
Dec 15 23:35:09 759240 [4A7A5940] 0x08 -> osm_state_mgr_process:
Received signal OSM_SIGNAL_SWEEP in state MASTER
Dec 15 23:35:09 759249 [4A7A5940] 0x10 -> state_mgr_sweep_hop_0: [
Dec 15 23:35:09 759258 [4A7A5940] 0x04 -> state_mgr_sweep_hop_0:
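
The line that stands out in the snippet above is the validate_port_caps
message just before ERR 1B12: the port's rate code (2) is lower than the
rate code (3) required for the join. Assuming these codes follow the usual
InfiniBand rate encoding (2 = 2.5 Gb/s, 3 = 10 Gb/s), that would mean the
port is running slower than the multicast group requires. A minimal Python
sketch for pulling such lines out of an opensm log might look like the
following; the function name and the rate table are illustrative
assumptions, not part of opensm.

```python
import re

# Assumed mapping of InfiniBand rate codes to Gb/s (not taken from the log).
IB_RATE_GBPS = {
    2: 2.5, 3: 10.0, 4: 30.0, 5: 5.0, 6: 20.0,
    7: 40.0, 8: 60.0, 9: 80.0, 10: 120.0,
}

# \s+ tolerates the line wrap that mail clients insert after "Port's".
RATE_RE = re.compile(r"validate_port_caps: Port's\s+RATE (\d+) is less than (\d+)")


def find_rate_mismatches(log_text):
    """Return (port_gbps, required_gbps) for each rate failure in log_text."""
    results = []
    for match in RATE_RE.finditer(log_text):
        port_code, required_code = int(match.group(1)), int(match.group(2))
        # Unknown codes map to None rather than raising.
        results.append((IB_RATE_GBPS.get(port_code),
                        IB_RATE_GBPS.get(required_code)))
    return results


sample = (
    "Dec 15 23:35:05 792236 [46B9F940] 0x04 -> validate_port_caps: Port's\n"
    "RATE 2 is less than 3\n"
)

for port_gbps, required_gbps in find_rate_mismatches(sample):
    print(f"port rate {port_gbps} Gb/s < required rate {required_gbps} Gb/s")
```

Run against the full /var/log/opensm.log, this would list every join
rejected for a rate mismatch, which can then be compared with the link
width and speed that the two HCAs actually negotiated.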



thanks,

Gerben

