From: swise@opengridcomputing.com (Steve Wise)
Subject: host/target keep alive timeout loop
Date: Tue, 8 Nov 2016 08:56:50 -0600
Message-ID: <007901d239d0$561eb700$025c2500$@opengridcomputing.com>
In-Reply-To: <ff2610d1-164f-d85e-839e-b91344192041@grimberg.me>

 
> > Hey Sagi/Christoph,
> >
> > While running the same kato/recovery tests I've logged in a few other
> > threads, occasionally I get some controllers on the host that will not
> > reconnect. Even after I quiesce the test and have the interfaces up and
> > everything is pingable. When it gets in this state, some of the 10
> > controllers are up and ok, and others are stuck in this reconnect/fail
> > loop.
> >
> > The host is stuck continually logging this for one or more controllers:
> >
> > [ 7885.617176] nvme nvme10: failed nvme_keep_alive_end_io error=16385
> > [ 7886.837087] nvme nvme10: rdma_resolve_addr wait failed (-110).
> > [ 7890.183979] nvme nvme10: failed to initialize i/o queue: -110
> > [ 7890.247538] nvme nvme10: Failed reconnect attempt, requeueing...
> 
> This looks like an underlying problem causing the host rdma_connect
> to timeout. Did it happen before or is it a new thing?

I can't say for sure, since until Christoph's recent fix, it was crashing in
other ways. I don't think the connection is failing at the driver level, but
I'll look into the stats for cxgb4 when it is in this state to see what is
happening at that level.
