From: Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
To: Konstantin Boyanov <konstantin.boyanov-T5F83Mi6MZE@public.gmane.org>
Cc: Hal Rosenstock
	<hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Konstantin Boyanov <boyanov-/DgOsau14IQ@public.gmane.org>,
	Jason Gunthorpe
	<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: InfiniBand HCA loopback on a single host (subnet manager needed?)
Date: Thu, 31 Mar 2011 09:53:09 -0400
Message-ID: <4D948745.2080002@dev.mellanox.co.il>
In-Reply-To: <4D94411F.2080008-T5F83Mi6MZE@public.gmane.org>

On 3/31/2011 4:53 AM, Konstantin Boyanov wrote:
> Hello,
>
> Thanks for the advice! I have gotten my hands on a QSFP loopback plug,
> and yesterday inserted it in the machine (single-slot IB card).
>
> Unfortunately, I am having problems starting the Subnet Manager. I
> believe I have installed and loaded all the necessary kernel
> modules.
>
> # lsmod | grep ib
> ib_ipoib 78893 0
> ib_ucm 12567 0
> ib_uverbs 31293 6 rdma_ucm,ib_ucm
> ib_umad 12147 4
> ib_cm 36419 3 ib_ipoib,ib_ucm,rdma_cm
> ib_addr 6089 1 rdma_cm
> ib_sa 22820 4 ib_ipoib,rdma_ucm,rdma_cm,ib_cm
> mlx4_ib 52866 1
> ib_mad 40542 4 ib_umad,ib_cm,ib_sa,mlx4_ib
> ib_core 66295 11 ib_ipoib,rdma_ucm,ib_ucm,ib_uverbs,ib_umad,rdma_cm,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad
> ipv6 321509 72 ib_ipoib,ib_addr
> mlx4_core 93453 2 mlx4_ib,mlx4_en
>
>
> But when I start the opensm via:
>
> # /etc/init.d/opensm start
>
> I see a lot of error messages at the end of /var/log/opensm.log:
>
> Mar 30 12:50:05 622171 [1795B700] 0x80 -> SM port is down
> Mar 30 12:50:05 622184 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR
> 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
> SM port is down
>
> Mar 30 12:50:15 622345 [1795B700] 0x80 -> SM port is down
> Mar 30 12:50:15 622356 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR
> 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
> Errors on subnet. Duplicate GUID found by link from a port to itself.
> See verbose opensm.log for more details
>
> Mar 30 12:50:25 622645 [1C963700] 0x80 -> Errors on subnet. Duplicate
> GUID found by link from a port to itself. See verbose opensm.log for
> more details

My bad; can you cable this to some other IB port (either a switch port
or another HCA port)? If this is a 2-port HCA, then it's simple.

> After that, the port state is changed to PORT_INIT, but none of my test
> programs for the loopback (as well as those in the OFED examples) can
> find a valid LID and operate properly.
>
> # ibv_devinfo
> hca_id: mlx4_0
> transport: InfiniBand (0)
> fw_ver: 2.7.626
> node_guid: 0002:c903:000b:e242
> sys_image_guid: 0002:c903:000b:e245
> vendor_id: 0x02c9
> vendor_part_id: 26428
> hw_ver: 0xB0
> board_id: MT_0D90110009
> phys_port_cnt: 1
> port: 1
> state: PORT_INIT (2)
> max_mtu: 2048 (4)
> active_mtu: 2048 (4)
> sm_lid: 0
> port_lid: 0
> port_lmc: 0x00
>
> I am using OFED drivers version 1.4 and the machine is as follows:
>
> # uname -a
> Linux myhost.domain.de 2.6.32-71.18.2.el6.x86_64 #1 SMP Tue Mar 8
> 15:00:52 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
>
> It seems to me that the loopback connector is somehow tricking
> OpenSM into thinking that there is something wrong with the ports. Am I right?

It's making OpenSM think that the remote end of the port has a
duplicate GUID; OpenSM doesn't handle this case :-(
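
To tell this condition apart from the plain "SM port is down" messages,
you can scan the log for the self-loop error string quoted above. This is
a minimal sketch, not part of OpenSM: the function name
count_self_loop_errors is hypothetical, and the sample text is copied from
the log excerpt in this thread (including the mailer's line wrapping).

```python
# Message text taken from the opensm.log excerpt quoted in this thread.
SELF_LOOP = "Duplicate GUID found by link from a port to itself"

# Sample copied from the quoted log; the second message is wrapped mid-phrase.
SAMPLE = """\
Mar 30 12:50:15 622356 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR
3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
Errors on subnet. Duplicate GUID found by link from a port to itself.
See verbose opensm.log for more details

Mar 30 12:50:25 622645 [1C963700] 0x80 -> Errors on subnet. Duplicate
GUID found by link from a port to itself. See verbose opensm.log for
more details
"""


def count_self_loop_errors(log_text: str) -> int:
    """Count self-loop reports in log_text (hypothetical helper)."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return flat.count(SELF_LOOP)


if __name__ == "__main__":
    print(count_self_loop_errors(SAMPLE))
```

To run it against a real log, pass the contents of /var/log/opensm.log
instead of SAMPLE.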

> Another thing: if I try to force the port into the ACTIVE state with
> ibportstate, I get the following error:
>
> # ibportstate -G 0x0002c903000be243 1 enable
> ibwarn: [4824] mad_rpc_open_port: can't open UMAD port ((null):0)
> ibportstate: iberror: failed: Failed to open '(null)' port '0'

Let's fix the problems one at a time. You shouldn't need to do this.
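
Once the port is cabled to a different port and the SM has swept the
subnet, I would expect ibv_devinfo to report PORT_ACTIVE and a nonzero
port_lid rather than the PORT_INIT and LID 0 shown above. A minimal sketch
for checking that from the tool's text output follows. The helper name
port_is_usable is hypothetical, it only looks at the first port listed,
and the "active" sample is made up for illustration (LID 1 is an assumed
value, not taken from your machine).

```python
import re

# Fields copied from the ibv_devinfo output quoted in this thread.
INIT_SAMPLE = """\
hca_id: mlx4_0
port: 1
state: PORT_INIT (2)
sm_lid: 0
port_lid: 0
"""

# Hypothetical output after a successful SM sweep (values are assumptions).
ACTIVE_SAMPLE = """\
hca_id: mlx4_0
port: 1
state: PORT_ACTIVE (4)
sm_lid: 1
port_lid: 1
"""


def port_is_usable(devinfo_text: str) -> bool:
    """True if the first listed port is PORT_ACTIVE with a nonzero LID."""
    state = re.search(r"^\s*state:\s*(\S+)", devinfo_text, re.MULTILINE)
    lid = re.search(r"^\s*port_lid:\s*(\d+)", devinfo_text, re.MULTILINE)
    if state is None or lid is None:
        return False  # output did not contain the expected fields
    return state.group(1) == "PORT_ACTIVE" and int(lid.group(1)) != 0


if __name__ == "__main__":
    print(port_is_usable(INIT_SAMPLE), port_is_usable(ACTIVE_SAMPLE))
```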

-- Hal

>
> I am really a greenhorn with all this InfiniBand stuff, so please can
> someone decrypt the above error messages in the opensm.log? What should
> I do in order to have a running OpenSM and a port configured the right
> way, so I can loop back messages? Is there any documentation out there
> which describes the setup of a loopback on a single port, or at least
> the initial setup of an InfiniBand network?
>
> Thanks in advance for your time and sorry if I am bothering you too much
> with my lame questions.
>
> Best regards,
> Konstantin Boyanov
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
