Linux-NVME Archive on lore.kernel.org
From: swise@opengridcomputing.com (Steve Wise)
Subject: nvme-fabrics: crash at nvme connect-all
Date: Thu, 9 Jun 2016 10:40:32 -0500
Message-ID: <009201d1c265$41fead30$c5fc0790$@opengridcomputing.com>
In-Reply-To: <1118189510.33005805.1465484651257.JavaMail.zimbra@kalray.eu>

> > I don't see this on my 16 core/64GB memory node. I successfully did a
> > discover/connect-all with the target/host on the same node with 7 target devices
> > w/o any errors.   Note I'm using the nvmf-all.2 branch Christoph set up
> > yesterday.
> >
> > Marta, I need to learn more about your T5 setup and the "stats" file output.
> > Thanks!
> >
> > Steve.
> 
> Steve, it seems to me that there's a PBLMEM exhaustion because my card has fewer
> resources than yours (224 MRs if I repeat your calculations):
> # cat /sys/kernel/debug/iw_cxgb4/0000\:09\:00.4/stats
>    Object:      Total    Current        Max       Fail
>      PDID:      65536          1          2          0
>       QID:       1024          0          0          0
>    TPTMEM:      91136          0          0          0
>    PBLMEM:     227840          0          0          0
>    RQTMEM:     318976          0          0          0
>   OCQPMEM:          0          0          0          0
>   DB FULL:          0
>  DB EMPTY:          0
>   DB DROP:          0
>  DB State: NORMAL Transitions 0 FC Interruptions 0
> TCAM_FULL:          0
> ACT_OFLD_CONN_FAILS:          0
> PAS_OFLD_CONN_FAILS:          0
> NEG_ADV_RCVD:          0
> AVAILABLE IRD:       1024
> 
> For a more exact reference, it's:
> [   18.651764] cxgb4 0000:09:00.4 eth1: eth1: Chelsio T580-LP-SO (0000:09:00.4)
> 40GBASE-R QSFP
> [   18.651979] cxgb4 0000:09:00.4 eth2: eth2: Chelsio T580-LP-SO (0000:09:00.4)
> 40GBASE-R QSFP
> [   18.652025] cxgb4 0000:09:00.4: Chelsio T580-LP-SO rev 0
> 
> No config file in the firmware directory.
> 


Thanks Marta.  That card has less memory than the T580-CR.  I'm checking with Chelsio on the details.  The "-SO" might mean a memory-free card.
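
For comparing adapters, the four-column rows of the stats file quoted above can be tabulated with a short script. This is a minimal sketch, not part of the driver or any tool: the function names `parse_stats` and `exhausted` are hypothetical, and treating a non-zero Fail count as the sign of an exhausted pool is an assumption about how to read the table.

```python
import re

# Sample rows copied from the stats output quoted in this message.
SAMPLE = """\
   Object:      Total    Current        Max       Fail
     PDID:      65536          1          2          0
      QID:       1024          0          0          0
   TPTMEM:      91136          0          0          0
   PBLMEM:     227840          0          0          0
   RQTMEM:     318976          0          0          0
  OCQPMEM:          0          0          0          0
  DB FULL:          0
"""

# Matches "NAME: total current max fail"; the header row and the
# single-value rows (e.g. "DB FULL") do not match and are skipped.
ROW = re.compile(r"^\s*([A-Z_ ]+?):\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_stats(text):
    """Return {name: {"total", "current", "max", "fail"}} for pool rows."""
    pools = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            total, current, peak, fail = (int(v) for v in m.groups()[1:])
            pools[m.group(1).strip()] = {
                "total": total, "current": current, "max": peak, "fail": fail,
            }
    return pools


def exhausted(pools):
    """Assumption: a pool with Fail > 0 had at least one failed allocation."""
    return [name for name, p in pools.items() if p["fail"] > 0]


if __name__ == "__main__":
    pools = parse_stats(SAMPLE)
    for name, p in pools.items():
        print(f"{name:>8}: total={p['total']} max={p['max']} fail={p['fail']}")
    print("pools with failures:", exhausted(pools))
```

Running the same parser on the output from two cards makes the difference in the Total column (for example PBLMEM) easy to compare side by side.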

Also, can you email me the output of 'cat /sys/kernel/debug/cxgb4/blah/meminfo'?

So to make it work given the adapter resources, you need to make the queues shallower and have fewer of them.  If I can get you a config file that increases the available RDMA memory, I'll send it to you.  But perhaps this card is just a low/no-memory card tailored more for NIC-only use than for RDMA. (I'll confirm this soon.)
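
As a rough illustration of "shallower and fewer" queues, the sketch below builds an `nvme connect` command line with explicit queue limits. It assumes an nvme-cli build that supports the `--nr-io-queues` and `--queue-size` options; whether the version used in this thread had them is not confirmed here. The helper name `connect_argv`, the address, the NQN, and the values 4 and 32 are all hypothetical placeholders, not recommendations for this card.

```python
import shlex


def connect_argv(traddr, nqn, nr_io_queues, queue_size, trsvcid="4420"):
    """Build an `nvme connect` argv with reduced queue count and depth.

    Assumes nvme-cli accepts --nr-io-queues and --queue-size.
    """
    if nr_io_queues < 1:
        raise ValueError("need at least one I/O queue")
    if queue_size < 2:
        raise ValueError("queue size must be at least 2")
    return [
        "nvme", "connect",
        "--transport=rdma",
        f"--traddr={traddr}",
        f"--trsvcid={trsvcid}",
        f"--nqn={nqn}",
        f"--nr-io-queues={nr_io_queues}",  # fewer queues
        f"--queue-size={queue_size}",      # shallower queues
    ]


if __name__ == "__main__":
    # Hypothetical target address and subsystem name, for illustration only.
    argv = connect_argv("192.168.1.10", "nqn.2016-06.example:testnqn", 4, 32)
    print(" ".join(shlex.quote(a) for a in argv))
```

The script only prints the command; it does not run it, so the values can be reviewed before trying them on the host.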

Steve
