public inbox for linux-rdma@vger.kernel.org
From: jeff-ruUnomVL5WBWk0Htik3J/w@public.gmane.org (Jeff Haferman)
To: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: rdma problems on Sun / ConnectX hardware
Date: Thu, 31 Dec 2009 11:08:59 -0800 (PST)
Message-ID: <20091231190859.041281D90009@adint.net>


Linux kernel = 2.6.18-92.1.26.el5_lustre.1.6.7.2smp
OS = Centos 5.2

We have a Sun Blade system with Sun IB products:
(switch = Sun part number X2821A-Z, 36-port QDR switch)
(HCAs = Sun part number X4216A-Z, dual-port DDR PCI-E)

I can SOMETIMES run MVAPICH or Open MPI over IB and it works, but generally I get
a "CQ polling error".  So I went back to the RDMA tests, and I see problems there too.

We have installed OFED 1.4.1-4, and because I was having problems I upgraded the firmware on the HCAs:

lspci | grep -i infin
0b:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0 2.5GT/s] (rev a0)

mstflint -d 0b:00.0 q 
Image type:      ConnectX
FW Version:      2.6.0
Device ID:       25418
Chip Revision:   A0
Description:     Node             Port1            Port2            Sys image
GUIDs:           0003ba000100d770 0003ba000100d771 0003ba000100d772 0003ba000100d773
MACs:                             0003ba00d771     0003ba00d772
Board ID:         (SUN0060000001)
VSD:             
PSID:            SUN0060000001

An rping from the client to the server gives:
verbose
client
created cm_id 0x10ca7c70
cma_event type RDMA_CM_EVENT_ADDR_RESOLVED cma_id 0x10ca7c70 (parent)
cma_event type RDMA_CM_EVENT_ROUTE_RESOLVED cma_id 0x10ca7c70 (parent)
rdma_resolve_addr - rdma_resolve_route successful
created pd 0x10caa3d0
created channel 0x10caa3f0
created cq 0x10caa410
created qp 0x10caa550
rping_setup_buffers called on cb 0x10ca5010
allocated & registered buffers...
cq_thread started.
cma_event type RDMA_CM_EVENT_ESTABLISHED cma_id 0x10ca7c70 (parent)
ESTABLISHED
rmda_connect successful
RDMA addr 10caaa90 rkey 2002800 len 100
send completion
cma_event type RDMA_CM_EVENT_DISCONNECTED cma_id 0x10ca7c70 (parent)
client DISCONNECT EVENT...
wait for RDMA_WRITE_ADV state 6
cq completion failed status 5
rping_free_buffers called on cb 0x10ca5010
destroy cm_id 0x10ca7c70
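
For reference, the "status 5" in the rping output is the numeric value of a work completion status from libibverbs (`enum ibv_wc_status` in `<infiniband/verbs.h>`), where 5 corresponds to `IBV_WC_WR_FLUSH_ERR`. A flush error usually means the QP had already moved to the error state when the work request completed; since it appears here right after the DISCONNECTED event, it may be a consequence of the disconnect rather than the root cause. A minimal sketch for decoding such lines follows; the helper names are hypothetical and the table covers only part of the enum.

```python
import re

# Numeric values of enum ibv_wc_status as declared in libibverbs'
# <infiniband/verbs.h>. Only a subset is listed here for illustration.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    6: "IBV_WC_MW_BIND_ERR",
    7: "IBV_WC_BAD_RESP_ERR",
    8: "IBV_WC_LOC_ACCESS_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
    10: "IBV_WC_REM_ACCESS_ERR",
    11: "IBV_WC_REM_OP_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}


def decode_wc_status(code):
    """Hypothetical helper: map a numeric status to its enum name."""
    return IBV_WC_STATUS.get(code, "unknown (%d)" % code)


def status_from_rping_line(line):
    """Hypothetical helper: extract and decode the status from an rping log line.

    Returns None if the line is not a 'cq completion failed' message.
    """
    match = re.search(r"cq completion failed status (\d+)", line)
    if match is None:
        return None
    return decode_wc_status(int(match.group(1)))
```

Calling `status_from_rping_line("cq completion failed status 5")` on the line from the log above yields the enum name for that code.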


Running ib_rdma_lat gives:
   local address: LID 0x16 QPN 0x004f PSN 0x743778 RKey 0x002500 VAddr 0x00000007c72001
  remote address: LID 0x01 QPN 0x004f PSN 0x6497a1 RKey 0x002500 VAddr 0x00000018780001
Conflicting CPU frequency values detected: 2336.000000 != 2003.000000
Latency typical: inf usec
Latency best   : inf usec
Latency worst  : inf usec
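
The "inf usec" figures may be unrelated to the RDMA failure itself. The latency tests convert cycle counts to microseconds using the CPU frequency, and the "Conflicting CPU frequency values" warning suggests the tool could not settle on one frequency, which is typical when cores report different `cpu MHz` values in /proc/cpuinfo (for example with frequency scaling active). The sketch below checks a cpuinfo dump for that condition; the function names and the 1% tolerance are assumptions for illustration, not the test's actual logic.

```python
import re


def cpu_mhz_values(cpuinfo_text):
    """Return every 'cpu MHz' value found in /proc/cpuinfo-style text."""
    found = re.findall(r"^cpu MHz\s*:\s*([0-9.]+)", cpuinfo_text, re.MULTILINE)
    return [float(value) for value in found]


def has_conflicting_mhz(values, tolerance=0.01):
    """True if any core differs from the first by more than `tolerance`.

    The tolerance is a relative difference; 0.01 (1%) is an assumed threshold.
    """
    if not values:
        return False
    first = values[0]
    return any(abs(v - first) / first > tolerance for v in values[1:])


# Sample text using the two frequencies reported in the warning above.
SAMPLE = (
    "processor\t: 0\n"
    "cpu MHz\t\t: 2336.000\n"
    "processor\t: 1\n"
    "cpu MHz\t\t: 2003.000\n"
)

sample_values = cpu_mhz_values(SAMPLE)
sample_conflict = has_conflicting_mhz(sample_values)
```

On a real node the same functions can be fed the contents of /proc/cpuinfo; if they report a conflict, pinning the CPU frequency before rerunning the latency test would be a reasonable next step.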


Any ideas?

