From: Konstantin Boyanov <konstantin.boyanov-T5F83Mi6MZE@public.gmane.org>
To: Hal Rosenstock <hal.rosenstock-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Konstantin Boyanov <boyanov-/DgOsau14IQ@public.gmane.org>,
Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: InfiniBand HCA loopback on a single host (subnet manager needed?)
Date: Thu, 31 Mar 2011 10:53:51 +0200
Message-ID: <4D94411F.2080008@desy.de>
In-Reply-To: <AANLkTimTm2UgUr4A_XJYGJpGBnRAFE74Eu6At+n9Xnfd-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Hello,
Thanks for the advice! I have gotten my hands on a QSFP loopback plug,
and yesterday inserted it in the machine (single slot IB card).
Unfortunately I am having problems when starting the Subnet Manager. I
believe I have installed and loaded all the necessary kernel modules.
# lsmod | grep ib
ib_ipoib 78893 0
ib_ucm 12567 0
ib_uverbs 31293 6 rdma_ucm,ib_ucm
ib_umad 12147 4
ib_cm 36419 3 ib_ipoib,ib_ucm,rdma_cm
ib_addr 6089 1 rdma_cm
ib_sa 22820 4 ib_ipoib,rdma_ucm,rdma_cm,ib_cm
mlx4_ib 52866 1
ib_mad 40542 4 ib_umad,ib_cm,ib_sa,mlx4_ib
ib_core 66295 11 ib_ipoib,rdma_ucm,ib_ucm,ib_uverbs,ib_umad,rdma_cm,ib_cm,iw_cm,ib_sa,mlx4_ib,ib_mad
ipv6 321509 72 ib_ipoib,ib_addr
mlx4_core 93453 2 mlx4_ib,mlx4_en
But when I start opensm via:
# /etc/init.d/opensm start
I see a lot of error messages at the end of /var/log/opensm.log:
Mar 30 12:50:05 622171 [1795B700] 0x80 -> SM port is down
Mar 30 12:50:05 622184 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
SM port is down
Mar 30 12:50:15 622345 [1795B700] 0x80 -> SM port is down
Mar 30 12:50:15 622356 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
Errors on subnet. Duplicate GUID found by link from a port to itself. See verbose opensm.log for more details
Mar 30 12:50:25 622645 [1C963700] 0x80 -> Errors on subnet. Duplicate GUID found by link from a port to itself. See verbose opensm.log for more details
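As a rough aid for reading a log like this, the repeated messages can be collapsed into counts. The following is only a sketch: the line layout in the regular expression is inferred from the excerpt above and is an assumption, not a documented OpenSM format, and the name summarize is made up for illustration. Lines without the timestamp prefix are skipped.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt above (not from OpenSM docs):
#   "Mon DD HH:MM:SS usec [thread] 0xLEVEL -> message"
LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\d+\s+\[[0-9A-Fa-f]+\]\s+"
    r"(0x[0-9A-Fa-f]+)\s+->\s+(.*)$"
)

def summarize(log_text):
    """Count how often each (level, message) pair occurs in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m:  # lines without the timestamp prefix are ignored
            counts[(m.group(1), m.group(2))] += 1
    return counts

sample = """\
Mar 30 12:50:05 622171 [1795B700] 0x80 -> SM port is down
Mar 30 12:50:05 622184 [1795B700] 0x01 -> sm_state_mgr_signal_error: ERR 3207: Invalid signal OSM_SM_SIGNAL_DISCOVER in state DISCOVERING
Mar 30 12:50:15 622345 [1795B700] 0x80 -> SM port is down
"""

# Print the most frequent messages first.
for (level, msg), n in summarize(sample).most_common():
    print(n, level, msg)
```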
After that, the port state changes to PORT_INIT, but none of my test
programs for the loopback (nor those in the OFED examples) can
find a valid LID and operate properly.
# ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.7.626
node_guid: 0002:c903:000b:e242
sys_image_guid: 0002:c903:000b:e245
vendor_id: 0x02c9
vendor_part_id: 26428
hw_ver: 0xB0
board_id: MT_0D90110009
phys_port_cnt: 1
port: 1
state: PORT_INIT (2)
max_mtu: 2048 (4)
active_mtu: 2048 (4)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
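The check done by eye on the output above (port state and LID) could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the function name unusable_ports is hypothetical, and the rule "state must be PORT_ACTIVE and port_lid must be non-zero" is an assumption about what a usable port looks like, not something taken from OFED. It only parses text in the layout shown above.

```python
import re

def unusable_ports(devinfo_text):
    """Return (port, state, lid) for every port that looks unusable.

    Assumption: a port is usable only if its state is PORT_ACTIVE
    and its port_lid is non-zero.
    """
    problems = []
    port = state = None
    for line in devinfo_text.splitlines():
        line = line.strip()
        m = re.match(r"port:\s*(\d+)$", line)
        if m:  # start of a new per-port section
            port, state = int(m.group(1)), None
            continue
        m = re.match(r"state:\s*(\S+)", line)
        if m and port is not None:
            state = m.group(1)
            continue
        m = re.match(r"port_lid:\s*(\d+)$", line)
        if m and port is not None:
            lid = int(m.group(1))
            if state != "PORT_ACTIVE" or lid == 0:
                problems.append((port, state, lid))
    return problems

sample = """\
hca_id: mlx4_0
    phys_port_cnt: 1
    port: 1
        state: PORT_INIT (2)
        sm_lid: 0
        port_lid: 0
        port_lmc: 0x00
"""

print(unusable_ports(sample))  # prints [(1, 'PORT_INIT', 0)]
```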
I am using OFED drivers version 1.4 and the machine is as follows:
# uname -a
Linux myhost.domain.de 2.6.32-71.18.2.el6.x86_64 #1 SMP Tue Mar 8 15:00:52 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
It seems to me that the loopback connector is somehow tricking
OpenSM into thinking that there is something wrong with the ports. Am I right?
Another thing: if I try to force the port into the ACTIVE state with
ibportstate, I get the following error:
# ibportstate -G 0x0002c903000be243 1 enable
ibwarn: [4824] mad_rpc_open_port: can't open UMAD port ((null):0)
ibportstate: iberror: failed: Failed to open '(null)' port '0'
I am really a greenhorn with all this InfiniBand stuff, so could
someone please decipher the above error messages in opensm.log? What should
I do in order to have a running OpenSM and a correctly configured
port, so that I can loop back messages? Is there any documentation out there
which describes the setup of a loopback on a single port, or at least
the initial setup of an InfiniBand network?
Thanks in advance for your time, and sorry if I am bothering you too much
with my lame questions.
Best regards,
Konstantin Boyanov