linux-nvme.lists.infradead.org archive mirror
From: keith.busch@intel.com (Keith Busch)
Subject: nvme: controller resets
Date: Tue, 10 Nov 2015 15:51:10 +0000
Message-ID: <20151110155110.GA31697@localhost.localdomain>
In-Reply-To: <33aa688b8da3f41960d36e66aa1703d8@localhost>

On Tue, Nov 10, 2015 at 03:30:43PM +0100, Stephan Günther wrote:
> Hello,
> 
> recently we submitted a small patch that enabled support for the Apple
> NVMe controller. More testing revealed some interesting behavior we
> cannot explain:
> 
> 1) Formatting a partition as vfat or ext2 works fine and so far,
> arbitrary loads are handled correctly by the controller.
> 
> 2) ext3/4 fails, but may be not immediately.
> 
> 3) mkfs.btrfs fails immediately.
> 
> The error is the same every time:
> | nvme 0000:03:00.0: Failed status: 3, reset controller
> | nvme 0000:03:00.0: Cancelling I/O 38 QID 1
> | nvme 0000:03:00.0: Cancelling I/O 39 QID 1
> | nvme 0000:03:00.0: Device not ready; aborting reset
> | nvme 0000:03:00.0: Device failed to resume
> | blk_update_request: I/O error, dev nvme0n1, sector 0
> | blk_update_request: I/O error, dev nvme0n1, sector 977104768
> | Buffer I/O error on dev nvme0n1p3, logical block 120827120, async page read

It says the controller asserted an internal failure status, then failed
the reset recovery. Sounds like there are other quirks to this device
you may have to reverse engineer.
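
For reference, here is a rough sketch of how that "status: 3" can be read,
assuming the value the driver prints is the raw Controller Status (CSTS)
register in hex. The bit positions follow the NVMe specification; the
decode_csts helper is made up for illustration and is not part of the driver.

```python
# Bit layout of the NVMe Controller Status (CSTS) register.
CSTS_RDY = 1 << 0            # controller ready
CSTS_CFS = 1 << 1            # controller fatal status
CSTS_SHST_MASK = 0b11 << 2   # shutdown status field (bits 3:2)


def decode_csts(value: int) -> dict:
    """Split a raw CSTS value into its ready, fatal, and shutdown fields.

    Hypothetical helper; assumes `value` is the number from the log line.
    """
    return {
        "ready": bool(value & CSTS_RDY),
        "fatal": bool(value & CSTS_CFS),
        "shutdown_status": (value & CSTS_SHST_MASK) >> 2,
    }


if __name__ == "__main__":
    print(decode_csts(0x3))
```

Under that assumption, 3 means both the ready bit and the fatal status bit
are set, which matches the "internal failure" reading above.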

> While trying to isolate the problem we found that running 'partprobe -d'
> also causes the problem.
> 
> So we attached strace to determine the failing ioctl/syscall. However,
> running 'strace -f partprobe -d' suddenly worked fine. Similar to that
> 'strace -f mkfs.btrfs' worked. However, mounting the file system caused
> the problem again.
> 
> Due to the different behavior with and without strace we assume there
> could be some kind of race condition.
> 
> Any ideas how we can track the problem further?

Not sure really. Normally I file a f/w bug for this kind of thing. :)

But I'll throw out some potential ideas. Try throttling driver capabilities
and see if anything improves: reduce queue count to 1 and depth to 2
(requires code change).

If you're able to recreate with reduced settings, then your controller's
failure can be caused by a single command, and it's hopefully just a
matter of finding that command.

If the problem is not reproducible with reduced settings, then perhaps
it's related to concurrent queue usage or high depth, and you can play
with either to see if you discover anything interesting.
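
The triage above could be sketched roughly like this. Everything here is
hypothetical: reproduces() stands in for "rebuild the driver with this queue
count and depth, then rerun the failing workload", and the candidate values
are arbitrary.

```python
from typing import Callable, Iterable, Optional, Tuple


def smallest_failing_config(
    reproduces: Callable[[int, int], bool],
    queue_counts: Iterable[int],
    depths: Iterable[int],
) -> Optional[Tuple[int, int]]:
    """Return the first (queue_count, depth) that fails, smallest first."""
    for queues in sorted(queue_counts):
        for depth in sorted(depths):
            if reproduces(queues, depth):
                return (queues, depth)
    return None


def interpret(config: Optional[Tuple[int, int]]) -> str:
    """Map the search result onto the two cases described above."""
    if config is None:
        return "not reproduced with the tested settings"
    if config == (1, 2):
        # One queue of depth 2 allows only one outstanding command.
        return "a single command can trigger it; find that command"
    return "likely related to concurrent queue usage or high depth"


if __name__ == "__main__":
    # Fake device that only fails once the depth reaches 16 (illustration only).
    fake = lambda queues, depth: depth >= 16
    found = smallest_failing_config(fake, [1, 2, 4], [2, 16, 64])
    print(found, "->", interpret(found))
```

The scan goes smallest-first so that a failure at one queue and depth 2 is
found before anything else is tried.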

Of course, I could be way off...
