From: Keith Busch <keith.busch@linux.intel.com>
To: Laurence Oberman <loberman@redhat.com>
Cc: Ming Lei <ming.lei@redhat.com>,
Keith Busch <keith.busch@intel.com>, Jens Axboe <axboe@kernel.dk>,
Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Jianchao Wang <jianchao.w.wang@oracle.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH V4 0/7] nvme: pci: fix & improve timeout handling
Date: Tue, 8 May 2018 09:09:33 -0600
Message-ID: <20180508150933.GA30736@localhost.localdomain>
In-Reply-To: <1525564282.4007.3.camel@redhat.com>
On Sat, May 05, 2018 at 07:51:22PM -0400, Laurence Oberman wrote:
> 3rd and 4th attempts slightly better, but clearly not dependable
>
> [root@segstorage1 blktests]# ./check block/011
> block/011 => nvme0n1 (disable PCI device while doing I/O)    [failed]
>     runtime    ...  81.188s
>     --- tests/block/011.out 2018-05-05 18:01:14.268414752 -0400
>     +++ results/nvme0n1/block/011.out.bad 2018-05-05 19:44:48.848568687 -0400
>     @@ -1,2 +1,3 @@
>      Running block/011
>     +tests/block/011: line 47: echo: write error: Input/output error
>      Test complete
>
> This one passed
> [root@segstorage1 blktests]# ./check block/011
> block/011 => nvme0n1 (disable PCI device while doing I/O)    [passed]
>     runtime  81.188s  ...  43.400s
>
> I will capture a vmcore next time it panics and give some information
> after analyzing the core
We definitely should never panic, but I am not sure this blktest can be
reliable on IO errors: the test is disabling memory space enabling and
bus master without the driver's knowledge, and it does this repeatedly
in a tight loop. If the test happens to disable the device while the
driver is trying to recover from the previous iteration, the recovery
will surely fail, so I think IO errors may be expected.
As far as I can tell, the only way you'll actually get it to succeed is
if the test's subsequent "enable" happens to hit in conjunction with the
driver's reset pci_enable_device_mem(), such that the pci_dev's enable_cnt
is > 1, which prevents the disabling for the remainder of the test's
looping.
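
To make the enable_cnt interaction above concrete, here is a rough
user-space sketch in Python. It is only a toy model and not kernel code:
the class PciDevModel, its hw_enabled flag, and the function
overlapping_enable_demo() are hypothetical names made up for
illustration. It assumes the simplified rule that only the first enable
turns the device on and only the last disable turns it off.

```python
class PciDevModel:
    """Toy model of a pci_dev enable count (hypothetical, not kernel code)."""

    def __init__(self, enable_cnt=1):
        # Assumption: the driver has already enabled the device once at probe.
        self.enable_cnt = enable_cnt
        self.hw_enabled = enable_cnt > 0

    def enable(self):
        # Only the first enabler switches the hardware on;
        # later callers just bump the count.
        self.enable_cnt += 1
        if self.enable_cnt == 1:
            self.hw_enabled = True

    def disable(self):
        # Only the last disabler switches the hardware off.
        if self.enable_cnt == 0:
            return  # nothing to disable in this model
        self.enable_cnt -= 1
        if self.enable_cnt == 0:
            self.hw_enabled = False


def overlapping_enable_demo():
    """Replay the ordering described above and record (count, hw_enabled)."""
    dev = PciDevModel()          # driver probed: count is 1
    trace = []
    steps = [
        dev.disable,             # test disables behind the driver's back
        dev.enable,              # driver reset re-enables the device
        dev.enable,              # test's "enable" lands on top of that
        dev.disable,             # test's next "disable"
        dev.enable,              # test's next "enable"
        dev.disable,             # ...and so on for the rest of the loop
    ]
    for step in steps:
        step()
        trace.append((dev.enable_cnt, dev.hw_enabled))
    return trace
```

In this model, once the count reaches 2 the test's later disables only
drop it back to 1, so the device stays on for the rest of the loop.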
I still think this is a very good test, but we might be able to make it
more deterministic on what actually happens to the pci device.