From: loberman@redhat.com (Laurence Oberman)
Subject: [PATCH V4 0/7] nvme: pci: fix & improve timeout handling
Date: Sat, 05 May 2018 19:11:33 -0400
Message-ID: <1525561893.3082.1.camel@redhat.com>
In-Reply-To: <20180505135905.18815-1-ming.lei@redhat.com>
On Sat, 2018-05-05 at 21:58 +0800, Ming Lei wrote:
> Hi,
>
> The 1st patch introduces blk_quiesce_timeout() and
> blk_unquiesce_timeout() for NVMe, and also fixes blk_sync_queue().
>
> The 2nd patch covers timeout for admin commands used to recover the
> controller, in order to avoid a possible deadlock.
>
> The 3rd and 4th patches avoid calling wait_freeze() on queues which
> aren't frozen.
>
> The last 4 patches fix several races wrt. the NVMe timeout handler,
> and finally make blktests block/011 pass. Meanwhile the NVMe PCI
> timeout mechanism becomes much more robust than before.
>
> gitweb:
> https://github.com/ming1/linux/commits/v4.17-rc-nvme-timeout.V4
>
> V4:
> - fix nvme_init_set_host_mem_cmd()
> - use nested EH model, and run both nvme_dev_disable() and
>   resetting in the same context
>
> V3:
> - fix one new race related to freezing in patch 4;
>   nvme_reset_work() may hang forever without this patch
> - rewrite the last 3 patches, and avoid breaking
>   nvme_reset_ctrl*()
>
> V2:
> - fix draining timeout work, so there is no need to change the
>   return value of .timeout()
> - fix race between nvme_start_freeze() and nvme_unfreeze()
> - cover timeout for admin commands running in EH
>
> Ming Lei (7):
>   block: introduce blk_quiesce_timeout() and blk_unquiesce_timeout()
>   nvme: pci: cover timeout for admin commands running in EH
>   nvme: pci: only wait freezing if queue is frozen
>   nvme: pci: freeze queue in nvme_dev_disable() in case of error
>     recovery
>   nvme: core: introduce 'reset_lock' for sync reset state and reset
>     activities
>   nvme: pci: prepare for supporting error recovery from resetting
>     context
>   nvme: pci: support nested EH
>
>  block/blk-core.c         |  21 +++-
>  block/blk-mq.c           |   9 ++
>  block/blk-timeout.c      |   5 +-
>  drivers/nvme/host/core.c |  46 ++++++-
>  drivers/nvme/host/nvme.h |   5 +
>  drivers/nvme/host/pci.c  | 304 ++++++++++++++++++++++++++++++++++++++++-------
>  include/linux/blkdev.h   |  13 ++
>  7 files changed, 356 insertions(+), 47 deletions(-)
>
> Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: linux-nvme@lists.infradead.org
> Cc: Laurence Oberman <loberman@redhat.com>
Hello Ming,

I have a two-node NUMA system here running your kernel tree
(4.17.0-rc3.ming.nvme+):
[root@segstorage1 ~]# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 3 5 6 8 11 13 14
node 0 size: 63922 MB
node 0 free: 61310 MB
node 1 cpus: 1 2 4 7 9 10 12 15
node 1 size: 64422 MB
node 1 free: 62372 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
I ran block/011
[root@segstorage1 blktests]# ./check block/011
block/011 => nvme0n1 (disable PCI device while doing I/O)    [failed]
    runtime    ...  106.936s
    --- tests/block/011.out 2018-05-05 18:01:14.268414752 -0400
    +++ results/nvme0n1/block/011.out.bad 2018-05-05 19:07:21.028634858 -0400
    @@ -1,2 +1,36 @@
     Running block/011
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & IO_U_F_FLIGHT) == 0' failed.
    ...
    (Run 'diff -u tests/block/011.out results/nvme0n1/block/011.out.bad' to see the entire diff)
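For reference, the test device comes from the standard blktests config
file at the top of the blktests checkout; mine is roughly just the
following (assumption: everything else left at defaults):

    # blktests ./config (sketch of my setup; nothing else set)
    TEST_DEVS=(/dev/nvme0n1)

The kernel log from the run: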
[ 1421.738551] run blktests block/011 at 2018-05-05 19:05:34
[ 1452.676351] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.718221] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.718239] nvme nvme0: EH 0: before shutdown
[ 1452.760890] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760894] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760897] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760900] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760903] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760906] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760909] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760912] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760915] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760918] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760921] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760923] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1452.760926] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1453.330251] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1453.391713] nvme nvme0: EH 0: after shutdown
[ 1456.804695] device-mapper: multipath: Failing path 259:0.
[ 1526.721196] nvme nvme0: I/O 15 QID 0 timeout, disable controller
[ 1526.754335] nvme nvme0: EH 1: before shutdown
[ 1526.793257] nvme nvme0: EH 1: after shutdown
[ 1526.793327] nvme nvme0: Identify Controller failed (-4)
[ 1526.847869] nvme nvme0: Removing after probe failure status: -5
[ 1526.888206] nvme nvme0: EH 0: after recovery
[ 1526.888212] nvme0n1: detected capacity change from 400088457216 to 0
[ 1526.947520] print_req_error: 1 callbacks suppressed
[ 1526.947522] print_req_error: I/O error, dev nvme0n1, sector 794920
[ 1526.947534] print_req_error: I/O error, dev nvme0n1, sector 569328
[ 1526.947540] print_req_error: I/O error, dev nvme0n1, sector 1234608
[ 1526.947556] print_req_error: I/O error, dev nvme0n1, sector 389296
[ 1526.947564] print_req_error: I/O error, dev nvme0n1, sector 712432
[ 1526.947566] print_req_error: I/O error, dev nvme0n1, sector 889304
[ 1526.947572] print_req_error: I/O error, dev nvme0n1, sector 205776
[ 1526.947574] print_req_error: I/O error, dev nvme0n1, sector 126480
[ 1526.947575] print_req_error: I/O error, dev nvme0n1, sector 1601232
[ 1526.947580] print_req_error: I/O error, dev nvme0n1, sector 1234360
[ 1526.947745] Pid 683(fio) over core_pipe_limit
[ 1526.947746] Skipping core dump
[ 1526.947747] Pid 675(fio) over core_pipe_limit
[ 1526.947748] Skipping core dump
[ 1526.947863] Pid 672(fio) over core_pipe_limit
[ 1526.947863] Skipping core dump
[ 1526.947865] Pid 674(fio) over core_pipe_limit
[ 1526.947866] Skipping core dump
[ 1526.947870] Pid 676(fio) over core_pipe_limit
[ 1526.947871] Pid 679(fio) over core_pipe_limit
[ 1526.947872] Skipping core dump
[ 1526.947872] Skipping core dump
[ 1526.948197] Pid 677(fio) over core_pipe_limit
[ 1526.948197] Skipping core dump
[ 1526.948245] Pid 686(fio) over core_pipe_limit
[ 1526.948245] Skipping core dump
[ 1526.974610] Pid 680(fio) over core_pipe_limit
[ 1526.974611] Pid 684(fio) over core_pipe_limit
[ 1526.974611] Skipping core dump
[ 1526.980370] nvme nvme0: failed to mark controller CONNECTING
[ 1526.980373] nvme nvme0: Removing after probe failure status: -19
[ 1526.980385] nvme nvme0: EH 1: after recovery
[ 1526.980477] Pid 687(fio) over core_pipe_limit
[ 1526.980478] Skipping core dump
[ 1527.858207] Skipping core dump
And it leaves me looping here:
[ 1721.272276] INFO: task kworker/u66:0:24214 blocked for more than 120 seconds.
[ 1721.311263]       Tainted: G          I        4.17.0-rc3.ming.nvme+ #1
[ 1721.348027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1721.392957] kworker/u66:0   D    0 24214      2 0x80000080
[ 1721.424425] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]
[ 1721.458568] Call Trace:
[ 1721.472499]  ? __schedule+0x290/0x870
[ 1721.493515]  schedule+0x32/0x80
[ 1721.511656]  blk_mq_freeze_queue_wait+0x46/0xb0
[ 1721.537609]  ? remove_wait_queue+0x60/0x60
[ 1721.561081]  blk_cleanup_queue+0x7e/0x180
[ 1721.584637]  nvme_ns_remove+0x106/0x140 [nvme_core]
[ 1721.612589]  nvme_remove_namespaces+0x8e/0xd0 [nvme_core]
[ 1721.643163]  nvme_remove+0x80/0x120 [nvme]
[ 1721.666188]  pci_device_remove+0x3b/0xc0
[ 1721.688553]  device_release_driver_internal+0x148/0x220
[ 1721.719332]  nvme_remove_dead_ctrl_work+0x29/0x40 [nvme]
[ 1721.750474]  process_one_work+0x158/0x360
[ 1721.772632]  worker_thread+0x47/0x3e0
[ 1721.792471]  kthread+0xf8/0x130
[ 1721.810354]  ? max_active_store+0x80/0x80
[ 1721.832459]  ? kthread_bind+0x10/0x10
[ 1721.852845]  ret_from_fork+0x35/0x40
Did I do something wrong?
I never set anything else; the nvme0n1 was not mounted, etc.
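If more state would help, on the next run I can dump the stacks of all
blocked tasks once it gets stuck, along these lines (assuming sysrq is
enabled on this box):

    echo 1 > /proc/sys/kernel/sysrq   # make sure sysrq is available
    echo w > /proc/sysrq-trigger      # dump all blocked (uninterruptible) tasks
    dmesg > blocked-tasks.log         # capture the resulting traces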
Thanks
Laurence