public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
* help trying to track down issue with opensm port
@ 2010-06-11 18:29 Hefty, Sean
       [not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F643455-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Hefty, Sean @ 2010-06-11 18:29 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

The following opensm log is from a windows port of opensm 3.3.6.  Can anyone familiar with the error messages explain whether the errors that are reported should ever happen?  Error 9 is a timeout, so I'm not concerned about that.  I am concerned about the 'failed to obtain request madw' errors.

- Sean

[Jun-10-2010 15:45:37:486][0C2C] 0x03 -> OpenSM 3.3.6 UMAD
[Jun-10-2010 15:45:37:486][0C2C] 0x80 -> OpenSM 3.3.6 UMAD
[Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_init: 1000 pending umads specified
[Jun-10-2010 15:45:37:501][0C2C] 0x80 -> Entering DISCOVERING state
[Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
[Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
[Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9020000409d
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
			Method 0x1, Attr 0x20, TID 0x12a5, Hop Ptr: 0x0
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a5
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
			Method 0x1, Attr 0x20, TID 0x12a8, Hop Ptr: 0x0
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
[Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a8
[Jun-10-2010 15:45:38:375][0CCC] 0x80 -> Entering MASTER state
[Jun-10-2010 15:45:38:375][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
[Jun-10-2010 15:45:38:375][0A1C] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:7 TID:0x0000000000000010
[Jun-10-2010 15:45:38:375][0A1C] 0x01 -> osm_get_port_by_mad_addr: ERR 7504: Lid is out of range: 0
[Jun-10-2010 15:45:38:375][0A1C] 0x01 -> trap_rcv_process_request: ERR 3809: Failed to find source physical port for trap
[Jun-10-2010 15:45:38:375][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:7 GID:fe80::2:c902:0:409d

[Jun-10-2010 15:45:38:375][0CCC] 0x80 -> SUBNET UP

[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x10001b90b, LID 17
[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10001b90b) -- dropping
[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x35, TID 0x10001b90c, LID 17
[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x10001b90c) -- dropping
[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x10000012a, LID 1
[Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10000012a) -- dropping
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x35, TID 0x100016375, LID 4
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100016375) -- dropping
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x100016374, LID 4
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100016374) -- dropping
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x35, TID 0x100000d38, LID 6
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100000d38) -- dropping
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x100000d37, LID 6
[Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100000d37) -- dropping
[Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::ffff:ffff
[Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::1
[Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::7f:fffa
[Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:1
[Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:3
[Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb7:99a9
[Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fc
[Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb8:a9ed
[Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:c
[Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffe9:b65e
[Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x38, TID 0x1000003cd, LID 7
[Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x1000003cd) -- dropping
[Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffac:bad6
[Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x38, TID 0x100000ddb, LID 15
[Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x100000ddb) -- dropping
[Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff4a:4b54
[Jun-10-2010 15:45:38:890][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::180:c200:3
[Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffc0:77dc
[Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff90:fd62
[Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::1 for PortGID: fe80::2:c902:0:25a5
[Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:1405:ffff::180:c200:3 for PortGID: fe80::2:c902:0:25a5
[Jun-10-2010 15:45:38:890][0EF0] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::ffff:ffff for PortGID: fe80::2:c902:0:25a5
[Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::16
[Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:16
[Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffeb:5867
[Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff70:431e
[Jun-10-2010 15:45:38:937][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:2
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
			Method 0x1, Attr 0x20, TID 0x130a, Hop Ptr: 0x0
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130a
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
			Method 0x1, Attr 0x20, TID 0x130b, Hop Ptr: 0x0
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
[Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130b
[Jun-10-2010 15:45:39:202][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
[Jun-10-2010 15:45:39:217][0CCC] 0x02 -> SUBNET UP
[Jun-10-2010 15:45:39:342][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:2
[Jun-10-2010 15:45:40:372][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fb
[Jun-10-2010 15:45:45:364][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::fb
[Jun-10-2010 15:45:45:379][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::1:ff00:525d
[Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
[Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
[Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
[Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
[Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
[Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
[Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
			Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
[Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
[Jun-10-2010 15:45:59:856][0C2C] 0x80 -> Exiting SM
\x1a
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: help trying to track down issue with opensm port
       [not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F643455-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2010-06-11 19:15   ` Hal Rosenstock
       [not found]     ` <AANLkTilmg1NoBZkhzNU6kRZN0xTa89mCPtYGuzIQxWuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2010-06-11 21:41   ` Ira Weiny
  1 sibling, 1 reply; 4+ messages in thread
From: Hal Rosenstock @ 2010-06-11 19:15 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Fri, Jun 11, 2010 at 2:29 PM, Hefty, Sean <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> The following opensm log is from a windows port of opensm 3.3.6.  Can anyone familiar with the error messages explain whether the errors that are reported should ever happen?  Error 9 is a timeout, so I'm not concerned about that.  I am concerned about the 'failed to obtain request madw' errors.

It means that the umad vendor layer was unable to find a transaction
ID match in the match table. I'm not sure why since there were no
error messages relating to match table overflow. Looks like they are
on GetTableResp methods (RMPP). Not sure how that works in Windows.

-- Hal

>
> - Sean
>
> [Jun-10-2010 15:45:37:486][0C2C] 0x03 -> OpenSM 3.3.6 UMAD
> [Jun-10-2010 15:45:37:486][0C2C] 0x80 -> OpenSM 3.3.6 UMAD
> [Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_init: 1000 pending umads specified
> [Jun-10-2010 15:45:37:501][0C2C] 0x80 -> Entering DISCOVERING state
> [Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
> [Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
> [Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9020000409d
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                        Method 0x1, Attr 0x20, TID 0x12a5, Hop Ptr: 0x0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a5
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                        Method 0x1, Attr 0x20, TID 0x12a8, Hop Ptr: 0x0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a8
> [Jun-10-2010 15:45:38:375][0CCC] 0x80 -> Entering MASTER state
> [Jun-10-2010 15:45:38:375][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:7 TID:0x0000000000000010
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> osm_get_port_by_mad_addr: ERR 7504: Lid is out of range: 0
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> trap_rcv_process_request: ERR 3809: Failed to find source physical port for trap
> [Jun-10-2010 15:45:38:375][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:7 GID:fe80::2:c902:0:409d
>
> [Jun-10-2010 15:45:38:375][0CCC] 0x80 -> SUBNET UP
>
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x10001b90b, LID 17
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10001b90b) -- dropping
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x35, TID 0x10001b90c, LID 17
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x10001b90c) -- dropping
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x10000012a, LID 1
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10000012a) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x35, TID 0x100016375, LID 4
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100016375) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x100016374, LID 4
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100016374) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x35, TID 0x100000d38, LID 6
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100000d38) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x100000d37, LID 6
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100000d37) -- dropping
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::ffff:ffff
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::1
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::7f:fffa
> [Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:1
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:3
> [Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb7:99a9
> [Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fc
> [Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb8:a9ed
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:c
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffe9:b65e
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x38, TID 0x1000003cd, LID 7
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x1000003cd) -- dropping
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffac:bad6
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x38, TID 0x100000ddb, LID 15
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x100000ddb) -- dropping
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff4a:4b54
> [Jun-10-2010 15:45:38:890][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::180:c200:3
> [Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffc0:77dc
> [Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff90:fd62
> [Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::1 for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:1405:ffff::180:c200:3 for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0EF0] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::ffff:ffff for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::16
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:16
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffeb:5867
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff70:431e
> [Jun-10-2010 15:45:38:937][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:2
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                        Method 0x1, Attr 0x20, TID 0x130a, Hop Ptr: 0x0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130a
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                        Method 0x1, Attr 0x20, TID 0x130b, Hop Ptr: 0x0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130b
> [Jun-10-2010 15:45:39:202][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
> [Jun-10-2010 15:45:39:217][0CCC] 0x02 -> SUBNET UP
> [Jun-10-2010 15:45:39:342][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:2
> [Jun-10-2010 15:45:40:372][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fb
> [Jun-10-2010 15:45:45:364][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::fb
> [Jun-10-2010 15:45:45:379][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::1:ff00:525d
> [Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                        Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:59:856][0C2C] 0x80 -> Exiting SM
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: help trying to track down issue with opensm port
       [not found]     ` <AANLkTilmg1NoBZkhzNU6kRZN0xTa89mCPtYGuzIQxWuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-06-11 19:22       ` Hefty, Sean
  0 siblings, 0 replies; 4+ messages in thread
From: Hefty, Sean @ 2010-06-11 19:22 UTC (permalink / raw)
  To: Hal Rosenstock; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 586 bytes --]

> It means that the umad vendor layer was unable to find a transaction
> ID match in the match table. I'm not sure why since there were no
> error messages relating to match table overflow. Looks like they are
> on GetTableResp methods (RMPP). Not sure how that works in Windows.

Thanks - that helps.  My guess is that the code is either incorrectly marking the MADs as needing a response, when they do not, or an RMPP failure is being reported.
N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±­ÙšŠ{ayº\x1dʇڙë,j\a­¢f£¢·hš‹»öì\x17/oSc¾™Ú³9˜uÀ¦æå‰È&jw¨®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿïêäz¹Þ–Šàþf£¢·hšˆ§~ˆmš

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: help trying to track down issue with opensm port
       [not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F643455-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2010-06-11 19:15   ` Hal Rosenstock
@ 2010-06-11 21:41   ` Ira Weiny
  1 sibling, 0 replies; 4+ messages in thread
From: Ira Weiny @ 2010-06-11 21:41 UTC (permalink / raw)
  To: Hefty, Sean; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

On Fri, 11 Jun 2010 11:29:24 -0700
"Hefty, Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:

> The following opensm log is from a windows port of opensm 3.3.6.  Can anyone
> familiar with the error messages explain whether the errors that are
> reported should ever happen?  Error 9 is a timeout, so I'm not concerned
> about that.  I am concerned about the 'failed to obtain request madw'
> errors.
> 
> - Sean
> 
> [Jun-10-2010 15:45:37:486][0C2C] 0x03 -> OpenSM 3.3.6 UMAD
> [Jun-10-2010 15:45:37:486][0C2C] 0x80 -> OpenSM 3.3.6 UMAD
> [Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_init: 1000 pending umads specified
> [Jun-10-2010 15:45:37:501][0C2C] 0x80 -> Entering DISCOVERING state
> [Jun-10-2010 15:45:37:501][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
> [Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_vendor_bind: Binding to port 0x2c9020000409d
> [Jun-10-2010 15:45:37:548][0C2C] 0x02 -> osm_opensm_bind: Setting IS_SM on port 0x0002c9020000409d
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                         Method 0x1, Attr 0x20, TID 0x12a5, Hop Ptr: 0x0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a5
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                         Method 0x1, Attr 0x20, TID 0x12a8, Hop Ptr: 0x0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
> [Jun-10-2010 15:45:38:375][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x12a8
> [Jun-10-2010 15:45:38:375][0CCC] 0x80 -> Entering MASTER state
> [Jun-10-2010 15:45:38:375][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> log_trap_info: Received Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) Producer:1 (Channel Adapter) from LID:7 TID:0x0000000000000010
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> osm_get_port_by_mad_addr: ERR 7504: Lid is out of range: 0
> [Jun-10-2010 15:45:38:375][0A1C] 0x01 -> trap_rcv_process_request: ERR 3809: Failed to find source physical port for trap
> [Jun-10-2010 15:45:38:375][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:4 num:144 (CapabilityMask, NodeDescription, Link [Width|Speed] Enabled, SM priority changed) from LID:7 GID:fe80::2:c902:0:409d
> 
> [Jun-10-2010 15:45:38:375][0CCC] 0x80 -> SUBNET UP
> 
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x10001b90b, LID 17
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10001b90b) -- dropping

I think this might be an issue with RMPP.  SubnAdmGetTableResp uses RMPP and
umad_receiver does a second umad_recv for RMPP.  My guess is that this second
umad_recv is not being done.  The timeout non-RMPP packet causes the request
madw to be "free'ed".  The next time umad_recv is called the same TID is
returned?  This causes the request mad to not be found because it was
"free'ed" by the previous umad_recv.

What does a read from the umad device on Windows return when an RMPP packet is
being read from the kernel?  I guess there could be an incompatibility in
libibumad when a RMPP packet times out.

Ira


> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x35, TID 0x10001b90c, LID 17
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x10001b90c) -- dropping
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x10000012a, LID 1
> [Jun-10-2010 15:45:38:609][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x10000012a) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x35, TID 0x100016375, LID 4
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100016375) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x100016374, LID 4
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100016374) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x35, TID 0x100000d38, LID 6
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x35 tid=0x100000d38) -- dropping
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x100000d37, LID 6
> [Jun-10-2010 15:45:38:625][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x100000d37) -- dropping
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::ffff:ffff
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::1
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::7f:fffa
> [Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:1
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:3
> [Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb7:99a9
> [Jun-10-2010 15:45:38:874][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fc
> [Jun-10-2010 15:45:38:874][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffb8:a9ed
> [Jun-10-2010 15:45:38:874][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:c
> [Jun-10-2010 15:45:38:874][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffe9:b65e
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x38, TID 0x1000003cd, LID 7
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x1000003cd) -- dropping
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffac:bad6
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x38, TID 0x100000ddb, LID 15
> [Jun-10-2010 15:45:38:890][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x38 tid=0x100000ddb) -- dropping
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff4a:4b54
> [Jun-10-2010 15:45:38:890][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::180:c200:3
> [Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffc0:77dc
> [Jun-10-2010 15:45:38:890][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff90:fd62
> [Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::1 for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0A1C] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:1405:ffff::180:c200:3 for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0EF0] 0x01 -> mcmr_rcv_leave_mgrp: ERR 1B25: Received an invalid delete request for MGID: ff12:401b:ffff::ffff:ffff for PortGID: fe80::2:c902:0:25a5
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::16
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:16
> [Jun-10-2010 15:45:38:890][0D5C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ffeb:5867
> [Jun-10-2010 15:45:38:890][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:ff70:431e
> [Jun-10-2010 15:45:38:937][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:1:2
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                         Method 0x1, Attr 0x20, TID 0x130a, Hop Ptr: 0x0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,14, Return path  = 0,0,0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130a
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> umad_receiver: ERR 5411: DR SMP Send completed with error(9) -- dropping
>                         Method 0x1, Attr 0x20, TID 0x130b, Hop Ptr: 0x0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> Received SMP on a 2 hop path: Initial path = 0,1,10, Return path  = 0,0,0
> [Jun-10-2010 15:45:39:202][08E0] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(SMInfo), attr_mod 0x0, TID 0x130b
> [Jun-10-2010 15:45:39:202][0CCC] 0x02 -> osm_ucast_mgr_process: minhop tables configured on all switches
> [Jun-10-2010 15:45:39:217][0CCC] 0x02 -> SUBNET UP
> [Jun-10-2010 15:45:39:342][0630] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:1405:ffff::3333:0:2
> [Jun-10-2010 15:45:40:372][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:401b:ffff::fb
> [Jun-10-2010 15:45:45:364][0EF0] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::fb
> [Jun-10-2010 15:45:45:379][0A1C] 0x02 -> log_notice: Reporting Generic Notice type:3 num:66 (New mcast group created) from LID:7 GID:ff12:601b:ffff::1:ff00:525d
> [Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:49:482][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:50:481][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:51:479][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5410: Send completed with error(9) -- dropping
>                         Class 0x3, Method 0x92, Attr 0x11, TID 0x400000002, LID 3
> [Jun-10-2010 15:45:52:477][08E0] 0x01 -> umad_receiver: ERR 5412: Failed to obtain request madw for timed out MAD (method=0x92 attr=0x11 tid=0x400000002) -- dropping
> [Jun-10-2010 15:45:59:856][0C2C] 0x80 -> Exiting SM
> \x1a
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://*vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-06-11 21:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-11 18:29 help trying to track down issue with opensm port Hefty, Sean
     [not found] ` <CF9C39F99A89134C9CF9C4CCB68B8DDF255F643455-osO9UTpF0USkrb+BlOpmy7fspsVTdybXVpNB7YpNyf8@public.gmane.org>
2010-06-11 19:15   ` Hal Rosenstock
     [not found]     ` <AANLkTilmg1NoBZkhzNU6kRZN0xTa89mCPtYGuzIQxWuw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-06-11 19:22       ` Hefty, Sean
2010-06-11 21:41   ` Ira Weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox