nvme: controller resets - Stephan Günther

From: guenther@tum.de (Stephan Günther)
Subject: nvme: controller resets
Date: Wed, 11 Nov 2015 23:09:57 +0100
Message-ID: <e3daa6a6ab157228a1f5606a776b01ea@localhost>
In-Reply-To: <CAKajsGNGvWw1+B7X8HxdvP0MyvAawb_PStu_adOC8wogOF-Fbg@mail.gmail.com>

On 2015/November/12 03:26, Vedant Lath wrote:
> On Wed, Nov 11, 2015 at 3:58 AM, Vedant Lath <vedant@lath.in> wrote:
> > On Tue, Nov 10, 2015 at 9:21 PM, Keith Busch <keith.busch@intel.com> wrote:
> >> Not sure really. Normally I file a f/w bug for this kind of thing. :)
> >>
> >> But I'll throw out some potential ideas. Try throttling driver capabilities
> >> and see if anything improves: reduce queue count to 1 and depth to 2
> >> (requires code change).
> >>
> >> If you're able to recreate with reduced settings, then your controller's
> >> failure can be caused by a single command, and it's hopefully just a
> >> matter of finding that command.
> >>
> >> If the problem is not reproducible with reduced settings, then perhaps
> >> it's related to concurrent queue usage or high depth, and you can play
> >> with either to see if you discover anything interesting.
> >>
> >> Of course, I could be way off...
> >
> > Is there any way to monitor all the commands going through the wire?
> > Wouldn't that help? That would at least tell us which NVMe command
> > results in a reset, and the flow of the commands leading up to the
> > reset can give us more context into the error.
> 
> Reducing I/O queue depth to 2 fixes the crash. Increasing I/O queue
> depth to 3 again results in a crash.

The device fails to initialize with those settings for me. However, I
think I found the problem:

@@ -2273,7 +2276,7 @@ static void nvme_alloc_ns(struct nvme_dev *dev, unsigned nsid)
        if (dev->stripe_size)
                blk_queue_chunk_sectors(ns->queue, dev->stripe_size >> 9);
        if (dev->vwc & NVME_CTRL_VWC_PRESENT)
-               blk_queue_flush(ns->queue, REQ_FLUSH | REQ_FUA);
+               blk_queue_flush(ns->queue, REQ_FUA);
        blk_queue_virt_boundary(ns->queue, dev->page_size - 1);

        disk->major = nvme_major;

With these changes I was able to create a btrfs, copy several GiB of 
data, umount, remount, scrub, and balance.
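
For reference, a minimal userspace sketch of what the one-line change
does to the queue's flush capabilities. It assumes blk_queue_flush() in
kernels of that era drops REQ_FUA when REQ_FLUSH is not also set, and
keeps only those two bits; please check block/blk-settings.c in your
tree before relying on it. The MODEL_* names and their bit values are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical bit values, for illustration only. The kernel's real
 * REQ_FLUSH and REQ_FUA values are different and not reproduced here. */
#define MODEL_REQ_FLUSH (1u << 0)
#define MODEL_REQ_FUA   (1u << 1)

/* Model of the assumed flag handling: FUA without FLUSH is dropped
 * (the kernel would also emit a warning), and only the two known
 * capability bits are stored in the queue. */
unsigned int model_blk_queue_flush(unsigned int flush)
{
	if (!(flush & MODEL_REQ_FLUSH) && (flush & MODEL_REQ_FUA))
		flush &= ~MODEL_REQ_FUA;

	return flush & (MODEL_REQ_FLUSH | MODEL_REQ_FUA);
}

/* Print which capabilities the queue would end up advertising. */
void print_caps(const char *label, unsigned int requested)
{
	unsigned int flags = model_blk_queue_flush(requested);

	assert(label != NULL);
	printf("%-24s flush=%d fua=%d\n", label,
	       !!(flags & MODEL_REQ_FLUSH), !!(flags & MODEL_REQ_FUA));
}
```

Calling print_caps("before", MODEL_REQ_FLUSH | MODEL_REQ_FUA) and
print_caps("after", MODEL_REQ_FUA) shows the difference. If the
assumption holds, the patched driver advertises neither capability, so
the block layer would stop sending both flush and FUA requests to the
device, not only flushes.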

The problem is *not* the flush itself (issuing the ioctl does not
provoke the error). It is either a combination of flush with other
commands or some flags issued together with a flush.
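
For anyone who wants to repeat the standalone flush test, here is a
rough sketch of sending a single Flush command through the passthrough
ioctl. It is not necessarily the method used above. It assumes the
definitions from <linux/nvme_ioctl.h> in current kernel headers
(struct nvme_passthru_cmd, NVME_IOCTL_IO_CMD); build_flush_cmd(),
issue_flush() and NVME_OPC_FLUSH are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/nvme_ioctl.h>

/* Opcode of Flush in the NVM command set (hypothetical macro name). */
#define NVME_OPC_FLUSH 0x00

/* Build a passthrough command describing a Flush for one namespace.
 * Flush carries no data, so every other field stays zero. */
struct nvme_passthru_cmd build_flush_cmd(uint32_t nsid)
{
	struct nvme_passthru_cmd cmd;

	assert(nsid != 0); /* namespace ID 0 is not a valid target */
	memset(&cmd, 0, sizeof(cmd));
	cmd.opcode = NVME_OPC_FLUSH;
	cmd.nsid = nsid;
	return cmd;
}

/* Submit the Flush on an already opened NVMe block device.
 * Returns -1 with errno set if the ioctl fails, 0 on success,
 * or a positive NVMe status code reported by the controller. */
int issue_flush(int fd, uint32_t nsid)
{
	struct nvme_passthru_cmd cmd = build_flush_cmd(nsid);

	return ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);
}
```

Usage would be along the lines of opening the namespace's block device
node read-only as root and calling issue_flush(fd, 1). The namespace ID
1 is an assumption here; use whatever your device reports.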

Thread overview: 15+ messages
2015-11-10 14:30 nvme: controller resets Stephan Günther
2015-11-10 15:51 ` Keith Busch
2015-11-10 20:45   ` Stephan Günther
2015-11-10 21:16     ` Vedant Lath
2015-11-10 21:34       ` Stephan Günther
2015-11-10 21:43         ` Vedant Lath
2015-11-10 22:02           ` Stephan Günther
2015-11-10 22:28   ` Vedant Lath
2015-11-11 21:56     ` Vedant Lath
2015-11-11 22:09       ` Stephan Günther [this message]
2015-11-12 14:02         ` Vedant Lath
2015-11-11 22:14       ` Keith Busch
2015-11-12  9:45         ` Vedant Lath
2015-11-12 11:26           ` Vedant Lath
2015-11-16 21:33           ` Stephan Günther