diff for duplicates of <1525564282.4007.3.camel@redhat.com> diff --git a/a/1.txt b/N1/1.txt index 3374714..caeccdf 100644 --- a/a/1.txt +++ b/N1/1.txt @@ -1,6 +1,6 @@ -On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: -> On Sat, 2018-05-05 at 19:11 -0400, Laurence Oberman wrote: -> > On Sat, 2018-05-05 at 21:58 +0800, Ming Lei wrote: +On Sat, 2018-05-05@19:31 -0400, Laurence Oberman wrote: +> On Sat, 2018-05-05@19:11 -0400, Laurence Oberman wrote: +> > On Sat, 2018-05-05@21:58 +0800, Ming Lei wrote: > > > Hi, > > > > > > The 1st patch introduces blk_quiesce_timeout() and @@ -44,41 +44,41 @@ On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: > > > - cover timeout for admin commands running in EH > > > > > > Ming Lei (7): -> > > block: introduce blk_quiesce_timeout() and +> > > ? block: introduce blk_quiesce_timeout() and > > > blk_unquiesce_timeout() -> > > nvme: pci: cover timeout for admin commands running in EH -> > > nvme: pci: only wait freezing if queue is frozen -> > > nvme: pci: freeze queue in nvme_dev_disable() in case of error -> > > recovery -> > > nvme: core: introduce 'reset_lock' for sync reset state and +> > > ? nvme: pci: cover timeout for admin commands running in EH +> > > ? nvme: pci: only wait freezing if queue is frozen +> > > ? nvme: pci: freeze queue in nvme_dev_disable() in case of error +> > > ????recovery +> > > ? nvme: core: introduce 'reset_lock' for sync reset state and > > > reset -> > > activities -> > > nvme: pci: prepare for supporting error recovery from resetting -> > > context -> > > nvme: pci: support nested EH +> > > ????activities +> > > ? nvme: pci: prepare for supporting error recovery from resetting +> > > ????context +> > > ? 
nvme: pci: support nested EH > > > -> > > block/blk-core.c | 21 +++- -> > > block/blk-mq.c | 9 ++ -> > > block/blk-timeout.c | 5 +- -> > > drivers/nvme/host/core.c | 46 ++++++- -> > > drivers/nvme/host/nvme.h | 5 + -> > > drivers/nvme/host/pci.c | 304 +> > > ?block/blk-core.c?????????|??21 +++- +> > > ?block/blk-mq.c???????????|???9 ++ +> > > ?block/blk-timeout.c??????|???5 +- +> > > ?drivers/nvme/host/core.c |??46 ++++++- +> > > ?drivers/nvme/host/nvme.h |???5 + +> > > ?drivers/nvme/host/pci.c??| 304 > > > ++++++++++++++++++++++++++++++++++++++++------- -> > > include/linux/blkdev.h | 13 ++ -> > > 7 files changed, 356 insertions(+), 47 deletions(-) +> > > ?include/linux/blkdev.h???|??13 ++ +> > > ?7 files changed, 356 insertions(+), 47 deletions(-) > > > -> > > Cc: Jianchao Wang <jianchao.w.wang@oracle.com> -> > > Cc: Christoph Hellwig <hch@lst.de> -> > > Cc: Sagi Grimberg <sagi@grimberg.me> -> > > Cc: linux-nvme@lists.infradead.org -> > > Cc: Laurence Oberman <loberman@redhat.com> +> > > Cc: Jianchao Wang <jianchao.w.wang at oracle.com> +> > > Cc: Christoph Hellwig <hch at lst.de> +> > > Cc: Sagi Grimberg <sagi at grimberg.me> +> > > Cc: linux-nvme at lists.infradead.org +> > > Cc: Laurence Oberman <loberman at redhat.com> > > > > Hello Ming > > > > I have a two node NUMA system here running your kernel tree > > 4.17.0-rc3.ming.nvme+ > > -> > [root@segstorage1 ~]# numactl --hardware +> > [root at segstorage1 ~]# numactl --hardware > > available: 2 nodes (0-1) > > node 0 cpus: 0 3 5 6 8 11 13 14 > > node 0 size: 63922 MB @@ -87,36 +87,36 @@ On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: > > node 1 size: 64422 MB > > node 1 free: 62372 MB > > node distances: -> > node 0 1 -> > 0: 10 20 -> > 1: 20 10 +> > node???0???1? +> > ? 0:??10??20? +> > ? 1:??20??10? 
> > > > I ran block/011 > > -> > [root@segstorage1 blktests]# ./check block/011 +> > [root at segstorage1 blktests]# ./check block/011 > > block/011 => nvme0n1 (disable PCI device while doing -> > I/O) [failed] -> > runtime ... 106.936s -> > --- tests/block/011.out 2018-05-05 18:01:14.268414752 +> > I/O)????[failed] +> > ????runtime????...??106.936s +> > ????--- tests/block/011.out 2018-05-05 18:01:14.268414752 > > -0400 -> > +++ results/nvme0n1/block/011.out.bad 2018-05-05 +> > ????+++ results/nvme0n1/block/011.out.bad 2018-05-05 > > 19:07:21.028634858 -0400 -> > @@ -1,2 +1,36 @@ -> > Running block/011 -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????@@ -1,2 +1,36 @@ +> > ?????Running block/011 +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > > IO_U_F_FLIGHT) == 0' failed. -> > ... -> > (Run 'diff -u tests/block/011.out +> > ????... 
+> > ????(Run 'diff -u tests/block/011.out > > results/nvme0n1/block/011.out.bad' to see the entire diff) > > > > [ 1421.738551] run blktests block/011 at 2018-05-05 19:05:34 @@ -232,31 +232,31 @@ On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: > > [ 1721.272276] INFO: task kworker/u66:0:24214 blocked for more than > > 120 > > seconds. -> > [ 1721.311263] Tainted: G I 4.17.0- +> > [ 1721.311263]???????Tainted: G??????????I???????4.17.0- > > rc3.ming.nvme+ > > #1 > > [ 1721.348027] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > disables this message. -> > [ 1721.392957] kworker/u66:0 D 0 24214 2 0x80000080 +> > [ 1721.392957] kworker/u66:0???D????0 24214??????2 0x80000080 > > [ 1721.424425] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme] > > [ 1721.458568] Call Trace: -> > [ 1721.472499] ? __schedule+0x290/0x870 -> > [ 1721.493515] schedule+0x32/0x80 -> > [ 1721.511656] blk_mq_freeze_queue_wait+0x46/0xb0 -> > [ 1721.537609] ? remove_wait_queue+0x60/0x60 -> > [ 1721.561081] blk_cleanup_queue+0x7e/0x180 -> > [ 1721.584637] nvme_ns_remove+0x106/0x140 [nvme_core] -> > [ 1721.612589] nvme_remove_namespaces+0x8e/0xd0 [nvme_core] -> > [ 1721.643163] nvme_remove+0x80/0x120 [nvme] -> > [ 1721.666188] pci_device_remove+0x3b/0xc0 -> > [ 1721.688553] device_release_driver_internal+0x148/0x220 -> > [ 1721.719332] nvme_remove_dead_ctrl_work+0x29/0x40 [nvme] -> > [ 1721.750474] process_one_work+0x158/0x360 -> > [ 1721.772632] worker_thread+0x47/0x3e0 -> > [ 1721.792471] kthread+0xf8/0x130 -> > [ 1721.810354] ? max_active_store+0x80/0x80 -> > [ 1721.832459] ? kthread_bind+0x10/0x10 -> > [ 1721.852845] ret_from_fork+0x35/0x40 +> > [ 1721.472499]??? __schedule+0x290/0x870 +> > [ 1721.493515]??schedule+0x32/0x80 +> > [ 1721.511656]??blk_mq_freeze_queue_wait+0x46/0xb0 +> > [ 1721.537609]??? 
remove_wait_queue+0x60/0x60 +> > [ 1721.561081]??blk_cleanup_queue+0x7e/0x180 +> > [ 1721.584637]??nvme_ns_remove+0x106/0x140 [nvme_core] +> > [ 1721.612589]??nvme_remove_namespaces+0x8e/0xd0 [nvme_core] +> > [ 1721.643163]??nvme_remove+0x80/0x120 [nvme] +> > [ 1721.666188]??pci_device_remove+0x3b/0xc0 +> > [ 1721.688553]??device_release_driver_internal+0x148/0x220 +> > [ 1721.719332]??nvme_remove_dead_ctrl_work+0x29/0x40 [nvme] +> > [ 1721.750474]??process_one_work+0x158/0x360 +> > [ 1721.772632]??worker_thread+0x47/0x3e0 +> > [ 1721.792471]??kthread+0xf8/0x130 +> > [ 1721.810354]??? max_active_store+0x80/0x80 +> > [ 1721.832459]??? kthread_bind+0x10/0x10 +> > [ 1721.852845]??ret_from_fork+0x35/0x40 > > > > Did I di something wrong > > @@ -267,117 +267,117 @@ On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: > > Second attempt same issue and this time we panicked > -> [root@segstorage1 blktests]# ./check block/011 -> block/011 => nvme0n1 (disable PCI device while doing I/O) [failed] -> runtime ... 106.936s -> --- tests/block/011.out 2018-05-05 18:01:14.268414752 +> [root at segstorage1 blktests]# ./check block/011 +> block/011 => nvme0n1 (disable PCI device while doing I/O)????[failed] +> ????runtime????...??106.936s +> ????--- tests/block/011.out 2018-05-05 18:01:14.268414752 > -0400 -> +++ results/nvme0n1/block/011.out.bad 2018-05-05 +> ????+++ results/nvme0n1/block/011.out.bad 2018-05-05 > 19:07:21.028634858 -0400 -> @@ -1,2 +1,36 @@ -> Running block/011 -> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????@@ -1,2 +1,36 @@ +> ?????Running block/011 +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. -> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. 
-> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. -> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. -> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. -> +fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & +> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags & > IO_U_F_FLIGHT) == 0' failed. -> ... -> (Run 'diff -u tests/block/011.out +> ????... +> ????(Run 'diff -u tests/block/011.out > results/nvme0n1/block/011.out.bad' to see the entire diff) > -> [ 387.483279] run blktests block/011 at 2018-05-05 19:27:33 -> [ 418.076690] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??387.483279] run blktests block/011 at 2018-05-05 19:27:33 +> [??418.076690] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.117901] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.117901] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.117929] nvme nvme0: EH 0: before shutdown -> [ 418.158827] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.117929] nvme nvme0: EH 0: before shutdown +> [??418.158827] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158830] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158830] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158833] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158833] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158836] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158836] nvme nvme0: controller is down; will reset: CSTS=0x3, > 
PCI_STATUS=0x10 -> [ 418.158838] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158838] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158841] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158841] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158844] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158844] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158847] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158847] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158849] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158849] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158852] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158852] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158855] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158855] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158858] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158858] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158861] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158861] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.158863] nvme nvme0: controller is down; will reset: CSTS=0x3, +> [??418.158863] nvme nvme0: controller is down; will reset: CSTS=0x3, > PCI_STATUS=0x10 -> [ 418.785063] nvme nvme0: EH 0: after shutdown -> [ 420.708723] device-mapper: multipath: Failing path 259:0. 
-> [ 486.106834] nvme nvme0: I/O 6 QID 0 timeout, disable controller -> [ 486.140306] nvme nvme0: EH 1: before shutdown -> [ 486.179884] nvme nvme0: EH 1: after shutdown -> [ 486.179961] nvme nvme0: Identify Controller failed (-4) -> [ 486.232868] nvme nvme0: Removing after probe failure status: -5 -> [ 486.273935] nvme nvme0: EH 0: after recovery -> [ 486.274230] nvme0n1: detected capacity change from 400088457216 to +> [??418.785063] nvme nvme0: EH 0: after shutdown +> [??420.708723] device-mapper: multipath: Failing path 259:0. +> [??486.106834] nvme nvme0: I/O 6 QID 0 timeout, disable controller +> [??486.140306] nvme nvme0: EH 1: before shutdown +> [??486.179884] nvme nvme0: EH 1: after shutdown +> [??486.179961] nvme nvme0: Identify Controller failed (-4) +> [??486.232868] nvme nvme0: Removing after probe failure status: -5 +> [??486.273935] nvme nvme0: EH 0: after recovery +> [??486.274230] nvme0n1: detected capacity change from 400088457216 to > 0 -> [ 486.334575] print_req_error: I/O error, dev nvme0n1, sector +> [??486.334575] print_req_error: I/O error, dev nvme0n1, sector > 1234608 -> [ 486.334582] print_req_error: I/O error, dev nvme0n1, sector +> [??486.334582] print_req_error: I/O error, dev nvme0n1, sector > 1755840 -> [ 486.334598] print_req_error: I/O error, dev nvme0n1, sector 569328 -> [ 486.334600] print_req_error: I/O error, dev nvme0n1, sector 183296 -> [ 486.334614] print_req_error: I/O error, dev nvme0n1, sector 174576 -> [ 486.334616] print_req_error: I/O error, dev nvme0n1, sector +> [??486.334598] print_req_error: I/O error, dev nvme0n1, sector 569328 +> [??486.334600] print_req_error: I/O error, dev nvme0n1, sector 183296 +> [??486.334614] print_req_error: I/O error, dev nvme0n1, sector 174576 +> [??486.334616] print_req_error: I/O error, dev nvme0n1, sector > 1234360 -> [ 486.334621] print_req_error: I/O error, dev nvme0n1, sector 786336 -> [ 486.334622] print_req_error: I/O error, dev nvme0n1, sector 205776 -> [ 486.334624] 
print_req_error: I/O error, dev nvme0n1, sector 534320 -> [ 486.334628] print_req_error: I/O error, dev nvme0n1, sector 712432 -> [ 486.334856] Pid 7792(fio) over core_pipe_limit -> [ 486.334857] Pid 7799(fio) over core_pipe_limit -> [ 486.334857] Skipping core dump -> [ 486.334857] Skipping core dump -> [ 486.334918] Pid 7784(fio) over core_pipe_limit -> [ 486.334919] Pid 7797(fio) over core_pipe_limit -> [ 486.334920] Pid 7798(fio) over core_pipe_limit -> [ 486.334921] Pid 7791(fio) over core_pipe_limit -> [ 486.334922] Skipping core dump -> [ 486.334922] Skipping core dump -> [ 486.334922] Skipping core dump -> [ 486.334923] Skipping core dump -> [ 486.335060] Pid 7789(fio) over core_pipe_limit -> [ 486.335061] Skipping core dump -> [ 486.335290] Pid 7785(fio) over core_pipe_limit -> [ 486.335291] Skipping core dump -> [ 486.335292] Pid 7796(fio) over core_pipe_limit -> [ 486.335293] Skipping core dump -> [ 486.335316] Pid 7786(fio) over core_pipe_limit -> [ 486.335317] Skipping core dump -> [ 487.110906] nvme nvme0: failed to mark controller CONNECTING -> [ 487.141743] nvme nvme0: Removing after probe failure status: -19 -> [ 487.176341] nvme nvme0: EH 1: after recovery -> [ 487.232034] BUG: unable to handle kernel NULL pointer dereference +> [??486.334621] print_req_error: I/O error, dev nvme0n1, sector 786336 +> [??486.334622] print_req_error: I/O error, dev nvme0n1, sector 205776 +> [??486.334624] print_req_error: I/O error, dev nvme0n1, sector 534320 +> [??486.334628] print_req_error: I/O error, dev nvme0n1, sector 712432 +> [??486.334856] Pid 7792(fio) over core_pipe_limit +> [??486.334857] Pid 7799(fio) over core_pipe_limit +> [??486.334857] Skipping core dump +> [??486.334857] Skipping core dump +> [??486.334918] Pid 7784(fio) over core_pipe_limit +> [??486.334919] Pid 7797(fio) over core_pipe_limit +> [??486.334920] Pid 7798(fio) over core_pipe_limit +> [??486.334921] Pid 7791(fio) over core_pipe_limit +> [??486.334922] Skipping core dump +> 
[??486.334922] Skipping core dump +> [??486.334922] Skipping core dump +> [??486.334923] Skipping core dump +> [??486.335060] Pid 7789(fio) over core_pipe_limit +> [??486.335061] Skipping core dump +> [??486.335290] Pid 7785(fio) over core_pipe_limit +> [??486.335291] Skipping core dump +> [??486.335292] Pid 7796(fio) over core_pipe_limit +> [??486.335293] Skipping core dump +> [??486.335316] Pid 7786(fio) over core_pipe_limit +> [??486.335317] Skipping core dump +> [??487.110906] nvme nvme0: failed to mark controller CONNECTING +> [??487.141743] nvme nvme0: Removing after probe failure status: -19 +> [??487.176341] nvme nvme0: EH 1: after recovery +> [??487.232034] BUG: unable to handle kernel NULL pointer dereference > at > 0000000000000000 -> [ 487.276604] PGD 0 P4D 0 -> [ 487.290548] Oops: 0000 [#1] SMP PTI -> [ 487.310135] Modules linked in: macsec tcp_diag udp_diag inet_diag +> [??487.276604] PGD 0 P4D 0? +> [??487.290548] Oops: 0000 [#1] SMP PTI +> [??487.310135] Modules linked in: macsec tcp_diag udp_diag inet_diag > unix_diag af_packet_diag netlink_diag binfmt_misc ebtable_filter > ebtables ip6table_filter ip6_tables devlink xt_physdev br_netfilter > bridge stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 @@ -391,71 +391,71 @@ On Sat, 2018-05-05 at 19:31 -0400, Laurence Oberman wrote: > ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper > syscopyarea > sysfillrect sysimgblt -> [ 487.719632] fb_sys_fops ttm sd_mod qla2xxx drm nvme_fc +> [??487.719632]??fb_sys_fops ttm sd_mod qla2xxx drm nvme_fc > nvme_fabrics > i2c_core crc32c_intel nvme serio_raw hpsa bnx2 nvme_core > scsi_transport_fc scsi_transport_sas dm_mirror dm_region_hash dm_log > dm_mod -> [ 487.817595] CPU: 4 PID: 763 Comm: kworker/u66:8 Kdump: loaded -> Tainted: G I 4.17.0-rc3.ming.nvme+ #1 -> [ 487.876571] Hardware name: HP ProLiant DL380 G7, BIOS P67 +> [??487.817595] CPU: 4 PID: 763 Comm: kworker/u66:8 Kdump: loaded +> Tainted: G??????????I???????4.17.0-rc3.ming.nvme+ #1 +> 
[??487.876571] Hardware name: HP ProLiant DL380 G7, BIOS P67 > 08/16/2015 -> [ 487.913158] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme] -> [ 487.946586] RIP: 0010:sbitmap_any_bit_set+0xb/0x30 -> [ 487.973172] RSP: 0018:ffffb19e47fdfe00 EFLAGS: 00010202 -> [ 488.003255] RAX: ffff8f0457931408 RBX: ffff8f0457931400 RCX: +> [??487.913158] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme] +> [??487.946586] RIP: 0010:sbitmap_any_bit_set+0xb/0x30 +> [??487.973172] RSP: 0018:ffffb19e47fdfe00 EFLAGS: 00010202 +> [??488.003255] RAX: ffff8f0457931408 RBX: ffff8f0457931400 RCX: > 0000000000000004 -> [ 488.044199] RDX: 0000000000000000 RSI: 0000000000000000 RDI: +> [??488.044199] RDX: 0000000000000000 RSI: 0000000000000000 RDI: > ffff8f04579314d0 -> [ 488.085253] RBP: ffff8f04570b8000 R08: 00000000000271a0 R09: +> [??488.085253] RBP: ffff8f04570b8000 R08: 00000000000271a0 R09: > ffffffffacda1b44 -> [ 488.126295] R10: ffffd6ee3f4ed000 R11: 0000000000000000 R12: +> [??488.126295] R10: ffffd6ee3f4ed000 R11: 0000000000000000 R12: > 0000000000000001 -> [ 488.166746] R13: 0000000000000001 R14: 0000000000000000 R15: +> [??488.166746] R13: 0000000000000001 R14: 0000000000000000 R15: > ffff8f0457821138 -> [ 488.207076] FS: 0000000000000000(0000) GS:ffff8f145b280000(0000) +> [??488.207076] FS:??0000000000000000(0000) GS:ffff8f145b280000(0000) > knlGS:0000000000000000 -> [ 488.252727] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 -> [ 488.284395] CR2: 0000000000000000 CR3: 0000001e4aa0a006 CR4: +> [??488.252727] CS:??0010 DS: 0000 ES: 0000 CR0: 0000000080050033 +> [??488.284395] CR2: 0000000000000000 CR3: 0000001e4aa0a006 CR4: > 00000000000206e0 -> [ 488.324505] Call Trace: -> [ 488.337945] blk_mq_run_hw_queue+0xad/0xf0 -> [ 488.361057] blk_mq_run_hw_queues+0x4b/0x60 -> [ 488.384507] nvme_kill_queues+0x26/0x80 [nvme_core] -> [ 488.411528] nvme_remove_dead_ctrl_work+0x17/0x40 [nvme] -> [ 488.441602] process_one_work+0x158/0x360 -> [ 488.464568] worker_thread+0x1fa/0x3e0 -> [ 
488.486044] kthread+0xf8/0x130 -> [ 488.504022] ? max_active_store+0x80/0x80 -> [ 488.527034] ? kthread_bind+0x10/0x10 -> [ 488.548026] ret_from_fork+0x35/0x40 -> [ 488.569062] Code: c6 44 0f 46 ce 83 c2 01 45 89 ca 4c 89 54 01 08 +> [??488.324505] Call Trace: +> [??488.337945]??blk_mq_run_hw_queue+0xad/0xf0 +> [??488.361057]??blk_mq_run_hw_queues+0x4b/0x60 +> [??488.384507]??nvme_kill_queues+0x26/0x80 [nvme_core] +> [??488.411528]??nvme_remove_dead_ctrl_work+0x17/0x40 [nvme] +> [??488.441602]??process_one_work+0x158/0x360 +> [??488.464568]??worker_thread+0x1fa/0x3e0 +> [??488.486044]??kthread+0xf8/0x130 +> [??488.504022]??? max_active_store+0x80/0x80 +> [??488.527034]??? kthread_bind+0x10/0x10 +> [??488.548026]??ret_from_fork+0x35/0x40 +> [??488.569062] Code: c6 44 0f 46 ce 83 c2 01 45 89 ca 4c 89 54 01 08 > 48 > 8b 4f 10 2b 74 01 08 39 57 08 77 d8 f3 c3 90 8b 4f 08 85 c9 74 1f 48 > 8b > 57 10 <48> 83 3a 00 75 18 31 c0 eb 0a 48 83 c2 40 48 83 3a 00 75 0a -> 83 -> [ 488.676148] RIP: sbitmap_any_bit_set+0xb/0x30 RSP: +> 83? +> [??488.676148] RIP: sbitmap_any_bit_set+0xb/0x30 RSP: > ffffb19e47fdfe00 -> [ 488.711006] CR2: 0000000000000000 +> [??488.711006] CR2: 0000000000000000 3rd and 4th attempts slightly better, but clearly not dependable -[root@segstorage1 blktests]# ./check block/011 -block/011 => nvme0n1 (disable PCI device while doing I/O) [failed] - runtime ... 
81.188s - --- tests/block/011.out 2018-05-05 18:01:14.268414752 -0400 - +++ results/nvme0n1/block/011.out.bad 2018-05-05 +[root at segstorage1 blktests]# ./check block/011 +block/011 => nvme0n1 (disable PCI device while doing I/O)????[failed] +????runtime????...??81.188s +????--- tests/block/011.out 2018-05-05 18:01:14.268414752 -0400 +????+++ results/nvme0n1/block/011.out.bad 2018-05-05 19:44:48.848568687 -0400 - @@ -1,2 +1,3 @@ - Running block/011 - +tests/block/011: line 47: echo: write error: Input/output error - Test complete +????@@ -1,2 +1,3 @@ +?????Running block/011 +????+tests/block/011: line 47: echo: write error: Input/output error +?????Test complete This one passed -[root@segstorage1 blktests]# ./check block/011 -block/011 => nvme0n1 (disable PCI device while doing I/O) [passed] - runtime 81.188s ... 43.400s +[root at segstorage1 blktests]# ./check block/011 +block/011 => nvme0n1 (disable PCI device while doing I/O)????[passed] +????runtime??81.188s??...??43.400s I will capture a vmcore next time it panics and give some information after analyzing the core diff --git a/a/content_digest b/N1/content_digest index 5f2f296..bdd3b8f 100644 --- a/a/content_digest +++ b/N1/content_digest @@ -1,22 +1,14 @@ "ref\020180505135905.18815-1-ming.lei@redhat.com\0" "ref\01525561893.3082.1.camel@redhat.com\0" "ref\01525563110.4007.1.camel@redhat.com\0" - "From\0Laurence Oberman <loberman@redhat.com>\0" - "Subject\0Re: [PATCH V4 0/7] nvme: pci: fix & improve timeout handling\0" + "From\0loberman@redhat.com (Laurence Oberman)\0" + "Subject\0[PATCH V4 0/7] nvme: pci: fix & improve timeout handling\0" "Date\0Sat, 05 May 2018 19:51:22 -0400\0" - "To\0Ming Lei <ming.lei@redhat.com>" - " Keith Busch <keith.busch@intel.com>\0" - "Cc\0Jens Axboe <axboe@kernel.dk>" - linux-block@vger.kernel.org - Jianchao Wang <jianchao.w.wang@oracle.com> - Christoph Hellwig <hch@lst.de> - Sagi Grimberg <sagi@grimberg.me> - " linux-nvme@lists.infradead.org\0" "\00:1\0" "b\0" - "On Sat, 
2018-05-05 at 19:31 -0400, Laurence Oberman wrote:\n" - "> On Sat, 2018-05-05 at 19:11 -0400, Laurence Oberman wrote:\n" - "> > On Sat, 2018-05-05 at 21:58 +0800, Ming Lei wrote:\n" + "On Sat, 2018-05-05@19:31 -0400, Laurence Oberman wrote:\n" + "> On Sat, 2018-05-05@19:11 -0400, Laurence Oberman wrote:\n" + "> > On Sat, 2018-05-05@21:58 +0800, Ming Lei wrote:\n" "> > > Hi,\n" "> > > \n" "> > > The 1st patch introduces blk_quiesce_timeout() and\n" @@ -60,41 +52,41 @@ "> > > \t- cover timeout for admin commands running in EH\n" "> > > \n" "> > > Ming Lei (7):\n" - "> > > \302\240 block: introduce blk_quiesce_timeout() and\n" + "> > > ? block: introduce blk_quiesce_timeout() and\n" "> > > blk_unquiesce_timeout()\n" - "> > > \302\240 nvme: pci: cover timeout for admin commands running in EH\n" - "> > > \302\240 nvme: pci: only wait freezing if queue is frozen\n" - "> > > \302\240 nvme: pci: freeze queue in nvme_dev_disable() in case of error\n" - "> > > \302\240\302\240\302\240\302\240recovery\n" - "> > > \302\240 nvme: core: introduce 'reset_lock' for sync reset state and\n" + "> > > ? nvme: pci: cover timeout for admin commands running in EH\n" + "> > > ? nvme: pci: only wait freezing if queue is frozen\n" + "> > > ? nvme: pci: freeze queue in nvme_dev_disable() in case of error\n" + "> > > ????recovery\n" + "> > > ? nvme: core: introduce 'reset_lock' for sync reset state and\n" "> > > reset\n" - "> > > \302\240\302\240\302\240\302\240activities\n" - "> > > \302\240 nvme: pci: prepare for supporting error recovery from resetting\n" - "> > > \302\240\302\240\302\240\302\240context\n" - "> > > \302\240 nvme: pci: support nested EH\n" + "> > > ????activities\n" + "> > > ? nvme: pci: prepare for supporting error recovery from resetting\n" + "> > > ????context\n" + "> > > ? 
nvme: pci: support nested EH\n" "> > > \n" - "> > > \302\240block/blk-core.c\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240|\302\240\302\24021 +++-\n" - "> > > \302\240block/blk-mq.c\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240|\302\240\302\240\302\2409 ++\n" - "> > > \302\240block/blk-timeout.c\302\240\302\240\302\240\302\240\302\240\302\240|\302\240\302\240\302\2405 +-\n" - "> > > \302\240drivers/nvme/host/core.c |\302\240\302\24046 ++++++-\n" - "> > > \302\240drivers/nvme/host/nvme.h |\302\240\302\240\302\2405 +\n" - "> > > \302\240drivers/nvme/host/pci.c\302\240\302\240| 304\n" + "> > > ?block/blk-core.c?????????|??21 +++-\n" + "> > > ?block/blk-mq.c???????????|???9 ++\n" + "> > > ?block/blk-timeout.c??????|???5 +-\n" + "> > > ?drivers/nvme/host/core.c |??46 ++++++-\n" + "> > > ?drivers/nvme/host/nvme.h |???5 +\n" + "> > > ?drivers/nvme/host/pci.c??| 304\n" "> > > ++++++++++++++++++++++++++++++++++++++++-------\n" - "> > > \302\240include/linux/blkdev.h\302\240\302\240\302\240|\302\240\302\24013 ++\n" - "> > > \302\2407 files changed, 356 insertions(+), 47 deletions(-)\n" + "> > > ?include/linux/blkdev.h???|??13 ++\n" + "> > > ?7 files changed, 356 insertions(+), 47 deletions(-)\n" "> > > \n" - "> > > Cc: Jianchao Wang <jianchao.w.wang@oracle.com>\n" - "> > > Cc: Christoph Hellwig <hch@lst.de>\n" - "> > > Cc: Sagi Grimberg <sagi@grimberg.me>\n" - "> > > Cc: linux-nvme@lists.infradead.org\n" - "> > > Cc: Laurence Oberman <loberman@redhat.com>\n" + "> > > Cc: Jianchao Wang <jianchao.w.wang at oracle.com>\n" + "> > > Cc: Christoph Hellwig <hch at lst.de>\n" + "> > > Cc: Sagi Grimberg <sagi at grimberg.me>\n" + "> > > Cc: linux-nvme at lists.infradead.org\n" + "> > > Cc: Laurence Oberman <loberman at redhat.com>\n" "> > \n" "> > Hello Ming\n" "> > \n" "> > I have a two node NUMA system here running your kernel tree\n" "> > 4.17.0-rc3.ming.nvme+\n" "> > \n" - "> > [root@segstorage1 ~]# numactl 
--hardware\n" + "> > [root at segstorage1 ~]# numactl --hardware\n" "> > available: 2 nodes (0-1)\n" "> > node 0 cpus: 0 3 5 6 8 11 13 14\n" "> > node 0 size: 63922 MB\n" @@ -103,36 +95,36 @@ "> > node 1 size: 64422 MB\n" "> > node 1 free: 62372 MB\n" "> > node distances:\n" - "> > node\302\240\302\240\302\2400\302\240\302\240\302\2401\302\240\n" - "> > \302\240 0:\302\240\302\24010\302\240\302\24020\302\240\n" - "> > \302\240 1:\302\240\302\24020\302\240\302\24010\302\240\n" + "> > node???0???1?\n" + "> > ? 0:??10??20?\n" + "> > ? 1:??20??10?\n" "> > \n" "> > I ran block/011\n" "> > \n" - "> > [root@segstorage1 blktests]# ./check block/011\n" + "> > [root at segstorage1 blktests]# ./check block/011\n" "> > block/011 => nvme0n1 (disable PCI device while doing\n" - "> > I/O)\302\240\302\240\302\240\302\240[failed]\n" - "> > \302\240\302\240\302\240\302\240runtime\302\240\302\240\302\240\302\240...\302\240\302\240106.936s\n" - "> > \302\240\302\240\302\240\302\240--- tests/block/011.out\t2018-05-05 18:01:14.268414752\n" + "> > I/O)????[failed]\n" + "> > ????runtime????...??106.936s\n" + "> > ????--- tests/block/011.out\t2018-05-05 18:01:14.268414752\n" "> > -0400\n" - "> > \302\240\302\240\302\240\302\240+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" + "> > ????+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" "> > 19:07:21.028634858 -0400\n" - "> > \302\240\302\240\302\240\302\240@@ -1,2 +1,36 @@\n" - "> > \302\240\302\240\302\240\302\240\302\240Running block/011\n" - "> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????@@ -1,2 +1,36 @@\n" + "> > ?????Running block/011\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - "> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - 
"> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - "> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - "> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - "> > \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> > ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> > IO_U_F_FLIGHT) == 0' failed.\n" - "> > \302\240\302\240\302\240\302\240...\n" - "> > \302\240\302\240\302\240\302\240(Run 'diff -u tests/block/011.out\n" + "> > ????...\n" + "> > ????(Run 'diff -u tests/block/011.out\n" "> > results/nvme0n1/block/011.out.bad' to see the entire diff)\n" "> > \n" "> > [ 1421.738551] run blktests block/011 at 2018-05-05 19:05:34\n" @@ -248,31 +240,31 @@ "> > [ 1721.272276] INFO: task kworker/u66:0:24214 blocked for more than\n" "> > 120\n" "> > seconds.\n" - "> > [ 1721.311263]\302\240\302\240\302\240\302\240\302\240\302\240\302\240Tainted: G\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240I\302\240\302\240\302\240\302\240\302\240\302\240\302\2404.17.0-\n" + "> > [ 1721.311263]???????Tainted: G??????????I???????4.17.0-\n" "> > rc3.ming.nvme+\n" "> > #1\n" "> > [ 1721.348027] \"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\"\n" "> > disables this message.\n" - "> > [ 1721.392957] kworker/u66:0\302\240\302\240\302\240D\302\240\302\240\302\240\302\2400 24214\302\240\302\240\302\240\302\240\302\240\302\2402 0x80000080\n" + "> > [ 1721.392957] kworker/u66:0???D????0 24214??????2 
0x80000080\n" "> > [ 1721.424425] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]\n" "> > [ 1721.458568] Call Trace:\n" - "> > [ 1721.472499]\302\240\302\240? __schedule+0x290/0x870\n" - "> > [ 1721.493515]\302\240\302\240schedule+0x32/0x80\n" - "> > [ 1721.511656]\302\240\302\240blk_mq_freeze_queue_wait+0x46/0xb0\n" - "> > [ 1721.537609]\302\240\302\240? remove_wait_queue+0x60/0x60\n" - "> > [ 1721.561081]\302\240\302\240blk_cleanup_queue+0x7e/0x180\n" - "> > [ 1721.584637]\302\240\302\240nvme_ns_remove+0x106/0x140 [nvme_core]\n" - "> > [ 1721.612589]\302\240\302\240nvme_remove_namespaces+0x8e/0xd0 [nvme_core]\n" - "> > [ 1721.643163]\302\240\302\240nvme_remove+0x80/0x120 [nvme]\n" - "> > [ 1721.666188]\302\240\302\240pci_device_remove+0x3b/0xc0\n" - "> > [ 1721.688553]\302\240\302\240device_release_driver_internal+0x148/0x220\n" - "> > [ 1721.719332]\302\240\302\240nvme_remove_dead_ctrl_work+0x29/0x40 [nvme]\n" - "> > [ 1721.750474]\302\240\302\240process_one_work+0x158/0x360\n" - "> > [ 1721.772632]\302\240\302\240worker_thread+0x47/0x3e0\n" - "> > [ 1721.792471]\302\240\302\240kthread+0xf8/0x130\n" - "> > [ 1721.810354]\302\240\302\240? max_active_store+0x80/0x80\n" - "> > [ 1721.832459]\302\240\302\240? kthread_bind+0x10/0x10\n" - "> > [ 1721.852845]\302\240\302\240ret_from_fork+0x35/0x40\n" + "> > [ 1721.472499]??? __schedule+0x290/0x870\n" + "> > [ 1721.493515]??schedule+0x32/0x80\n" + "> > [ 1721.511656]??blk_mq_freeze_queue_wait+0x46/0xb0\n" + "> > [ 1721.537609]??? 
remove_wait_queue+0x60/0x60\n" + "> > [ 1721.561081]??blk_cleanup_queue+0x7e/0x180\n" + "> > [ 1721.584637]??nvme_ns_remove+0x106/0x140 [nvme_core]\n" + "> > [ 1721.612589]??nvme_remove_namespaces+0x8e/0xd0 [nvme_core]\n" + "> > [ 1721.643163]??nvme_remove+0x80/0x120 [nvme]\n" + "> > [ 1721.666188]??pci_device_remove+0x3b/0xc0\n" + "> > [ 1721.688553]??device_release_driver_internal+0x148/0x220\n" + "> > [ 1721.719332]??nvme_remove_dead_ctrl_work+0x29/0x40 [nvme]\n" + "> > [ 1721.750474]??process_one_work+0x158/0x360\n" + "> > [ 1721.772632]??worker_thread+0x47/0x3e0\n" + "> > [ 1721.792471]??kthread+0xf8/0x130\n" + "> > [ 1721.810354]??? max_active_store+0x80/0x80\n" + "> > [ 1721.832459]??? kthread_bind+0x10/0x10\n" + "> > [ 1721.852845]??ret_from_fork+0x35/0x40\n" "> > \n" "> > Did I di something wrong\n" "> > \n" @@ -283,117 +275,117 @@ "> \n" "> Second attempt same issue and this time we panicked\n" "> \n" - "> [root@segstorage1 blktests]# ./check block/011\n" - "> block/011 => nvme0n1 (disable PCI device while doing I/O)\302\240\302\240\302\240\302\240[failed]\n" - "> \302\240\302\240\302\240\302\240runtime\302\240\302\240\302\240\302\240...\302\240\302\240106.936s\n" - "> \302\240\302\240\302\240\302\240--- tests/block/011.out\t2018-05-05 18:01:14.268414752\n" + "> [root at segstorage1 blktests]# ./check block/011\n" + "> block/011 => nvme0n1 (disable PCI device while doing I/O)????[failed]\n" + "> ????runtime????...??106.936s\n" + "> ????--- tests/block/011.out\t2018-05-05 18:01:14.268414752\n" "> -0400\n" - "> \302\240\302\240\302\240\302\240+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" + "> ????+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" "> 19:07:21.028634858 -0400\n" - "> \302\240\302\240\302\240\302\240@@ -1,2 +1,36 @@\n" - "> \302\240\302\240\302\240\302\240\302\240Running block/011\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????@@ -1,2 +1,36 @@\n" + "> ?????Running 
block/011\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" + "> ????+fio: ioengines.c:289: td_io_queue: Assertion `(io_u->flags &\n" "> IO_U_F_FLIGHT) == 0' failed.\n" - "> \302\240\302\240\302\240\302\240...\n" - "> \302\240\302\240\302\240\302\240(Run 'diff -u tests/block/011.out\n" + "> ????...\n" + "> ????(Run 'diff -u tests/block/011.out\n" "> results/nvme0n1/block/011.out.bad' to see the entire diff)\n" "> \n" - "> [\302\240\302\240387.483279] run blktests block/011 at 2018-05-05 19:27:33\n" - "> [\302\240\302\240418.076690] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??387.483279] run blktests block/011 at 2018-05-05 19:27:33\n" + "> [??418.076690] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.117901] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.117901] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.117929] nvme 
nvme0: EH 0: before shutdown\n" - "> [\302\240\302\240418.158827] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.117929] nvme nvme0: EH 0: before shutdown\n" + "> [??418.158827] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158830] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158830] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158833] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158833] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158836] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158836] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158838] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158838] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158841] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158841] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158844] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158844] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158847] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158847] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158849] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158849] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158852] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158852] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> 
[\302\240\302\240418.158855] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158855] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158858] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158858] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158861] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158861] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.158863] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" + "> [??418.158863] nvme nvme0: controller is down; will reset: CSTS=0x3,\n" "> PCI_STATUS=0x10\n" - "> [\302\240\302\240418.785063] nvme nvme0: EH 0: after shutdown\n" - "> [\302\240\302\240420.708723] device-mapper: multipath: Failing path 259:0.\n" - "> [\302\240\302\240486.106834] nvme nvme0: I/O 6 QID 0 timeout, disable controller\n" - "> [\302\240\302\240486.140306] nvme nvme0: EH 1: before shutdown\n" - "> [\302\240\302\240486.179884] nvme nvme0: EH 1: after shutdown\n" - "> [\302\240\302\240486.179961] nvme nvme0: Identify Controller failed (-4)\n" - "> [\302\240\302\240486.232868] nvme nvme0: Removing after probe failure status: -5\n" - "> [\302\240\302\240486.273935] nvme nvme0: EH 0: after recovery\n" - "> [\302\240\302\240486.274230] nvme0n1: detected capacity change from 400088457216 to\n" + "> [??418.785063] nvme nvme0: EH 0: after shutdown\n" + "> [??420.708723] device-mapper: multipath: Failing path 259:0.\n" + "> [??486.106834] nvme nvme0: I/O 6 QID 0 timeout, disable controller\n" + "> [??486.140306] nvme nvme0: EH 1: before shutdown\n" + "> [??486.179884] nvme nvme0: EH 1: after shutdown\n" + "> [??486.179961] nvme nvme0: Identify Controller failed (-4)\n" + "> [??486.232868] nvme nvme0: Removing after probe failure status: -5\n" + "> [??486.273935] nvme nvme0: EH 0: after recovery\n" + "> 
[??486.274230] nvme0n1: detected capacity change from 400088457216 to\n" "> 0\n" - "> [\302\240\302\240486.334575] print_req_error: I/O error, dev nvme0n1, sector\n" + "> [??486.334575] print_req_error: I/O error, dev nvme0n1, sector\n" "> 1234608\n" - "> [\302\240\302\240486.334582] print_req_error: I/O error, dev nvme0n1, sector\n" + "> [??486.334582] print_req_error: I/O error, dev nvme0n1, sector\n" "> 1755840\n" - "> [\302\240\302\240486.334598] print_req_error: I/O error, dev nvme0n1, sector 569328\n" - "> [\302\240\302\240486.334600] print_req_error: I/O error, dev nvme0n1, sector 183296\n" - "> [\302\240\302\240486.334614] print_req_error: I/O error, dev nvme0n1, sector 174576\n" - "> [\302\240\302\240486.334616] print_req_error: I/O error, dev nvme0n1, sector\n" + "> [??486.334598] print_req_error: I/O error, dev nvme0n1, sector 569328\n" + "> [??486.334600] print_req_error: I/O error, dev nvme0n1, sector 183296\n" + "> [??486.334614] print_req_error: I/O error, dev nvme0n1, sector 174576\n" + "> [??486.334616] print_req_error: I/O error, dev nvme0n1, sector\n" "> 1234360\n" - "> [\302\240\302\240486.334621] print_req_error: I/O error, dev nvme0n1, sector 786336\n" - "> [\302\240\302\240486.334622] print_req_error: I/O error, dev nvme0n1, sector 205776\n" - "> [\302\240\302\240486.334624] print_req_error: I/O error, dev nvme0n1, sector 534320\n" - "> [\302\240\302\240486.334628] print_req_error: I/O error, dev nvme0n1, sector 712432\n" - "> [\302\240\302\240486.334856] Pid 7792(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.334857] Pid 7799(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.334857] Skipping core dump\n" - "> [\302\240\302\240486.334857] Skipping core dump\n" - "> [\302\240\302\240486.334918] Pid 7784(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.334919] Pid 7797(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.334920] Pid 7798(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.334921] Pid 7791(fio) over 
core_pipe_limit\n" - "> [\302\240\302\240486.334922] Skipping core dump\n" - "> [\302\240\302\240486.334922] Skipping core dump\n" - "> [\302\240\302\240486.334922] Skipping core dump\n" - "> [\302\240\302\240486.334923] Skipping core dump\n" - "> [\302\240\302\240486.335060] Pid 7789(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.335061] Skipping core dump\n" - "> [\302\240\302\240486.335290] Pid 7785(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.335291] Skipping core dump\n" - "> [\302\240\302\240486.335292] Pid 7796(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.335293] Skipping core dump\n" - "> [\302\240\302\240486.335316] Pid 7786(fio) over core_pipe_limit\n" - "> [\302\240\302\240486.335317] Skipping core dump\n" - "> [\302\240\302\240487.110906] nvme nvme0: failed to mark controller CONNECTING\n" - "> [\302\240\302\240487.141743] nvme nvme0: Removing after probe failure status: -19\n" - "> [\302\240\302\240487.176341] nvme nvme0: EH 1: after recovery\n" - "> [\302\240\302\240487.232034] BUG: unable to handle kernel NULL pointer dereference\n" + "> [??486.334621] print_req_error: I/O error, dev nvme0n1, sector 786336\n" + "> [??486.334622] print_req_error: I/O error, dev nvme0n1, sector 205776\n" + "> [??486.334624] print_req_error: I/O error, dev nvme0n1, sector 534320\n" + "> [??486.334628] print_req_error: I/O error, dev nvme0n1, sector 712432\n" + "> [??486.334856] Pid 7792(fio) over core_pipe_limit\n" + "> [??486.334857] Pid 7799(fio) over core_pipe_limit\n" + "> [??486.334857] Skipping core dump\n" + "> [??486.334857] Skipping core dump\n" + "> [??486.334918] Pid 7784(fio) over core_pipe_limit\n" + "> [??486.334919] Pid 7797(fio) over core_pipe_limit\n" + "> [??486.334920] Pid 7798(fio) over core_pipe_limit\n" + "> [??486.334921] Pid 7791(fio) over core_pipe_limit\n" + "> [??486.334922] Skipping core dump\n" + "> [??486.334922] Skipping core dump\n" + "> [??486.334922] Skipping core dump\n" + "> [??486.334923] Skipping core 
dump\n" + "> [??486.335060] Pid 7789(fio) over core_pipe_limit\n" + "> [??486.335061] Skipping core dump\n" + "> [??486.335290] Pid 7785(fio) over core_pipe_limit\n" + "> [??486.335291] Skipping core dump\n" + "> [??486.335292] Pid 7796(fio) over core_pipe_limit\n" + "> [??486.335293] Skipping core dump\n" + "> [??486.335316] Pid 7786(fio) over core_pipe_limit\n" + "> [??486.335317] Skipping core dump\n" + "> [??487.110906] nvme nvme0: failed to mark controller CONNECTING\n" + "> [??487.141743] nvme nvme0: Removing after probe failure status: -19\n" + "> [??487.176341] nvme nvme0: EH 1: after recovery\n" + "> [??487.232034] BUG: unable to handle kernel NULL pointer dereference\n" "> at\n" "> 0000000000000000\n" - "> [\302\240\302\240487.276604] PGD 0 P4D 0\302\240\n" - "> [\302\240\302\240487.290548] Oops: 0000 [#1] SMP PTI\n" - "> [\302\240\302\240487.310135] Modules linked in: macsec tcp_diag udp_diag inet_diag\n" + "> [??487.276604] PGD 0 P4D 0?\n" + "> [??487.290548] Oops: 0000 [#1] SMP PTI\n" + "> [??487.310135] Modules linked in: macsec tcp_diag udp_diag inet_diag\n" "> unix_diag af_packet_diag netlink_diag binfmt_misc ebtable_filter\n" "> ebtables ip6table_filter ip6_tables devlink xt_physdev br_netfilter\n" "> bridge stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4\n" @@ -407,73 +399,73 @@ "> ip_tables xfs libcrc32c radeon i2c_algo_bit drm_kms_helper\n" "> syscopyarea\n" "> sysfillrect sysimgblt\n" - "> [\302\240\302\240487.719632]\302\240\302\240fb_sys_fops ttm sd_mod qla2xxx drm nvme_fc\n" + "> [??487.719632]??fb_sys_fops ttm sd_mod qla2xxx drm nvme_fc\n" "> nvme_fabrics\n" "> i2c_core crc32c_intel nvme serio_raw hpsa bnx2 nvme_core\n" "> scsi_transport_fc scsi_transport_sas dm_mirror dm_region_hash dm_log\n" "> dm_mod\n" - "> [\302\240\302\240487.817595] CPU: 4 PID: 763 Comm: kworker/u66:8 Kdump: loaded\n" - "> Tainted: 
G\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240\302\240I\302\240\302\240\302\240\302\240\302\240\302\240\302\2404.17.0-rc3.ming.nvme+ #1\n" - "> [\302\240\302\240487.876571] Hardware name: HP ProLiant DL380 G7, BIOS P67\n" + "> [??487.817595] CPU: 4 PID: 763 Comm: kworker/u66:8 Kdump: loaded\n" + "> Tainted: G??????????I???????4.17.0-rc3.ming.nvme+ #1\n" + "> [??487.876571] Hardware name: HP ProLiant DL380 G7, BIOS P67\n" "> 08/16/2015\n" - "> [\302\240\302\240487.913158] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]\n" - "> [\302\240\302\240487.946586] RIP: 0010:sbitmap_any_bit_set+0xb/0x30\n" - "> [\302\240\302\240487.973172] RSP: 0018:ffffb19e47fdfe00 EFLAGS: 00010202\n" - "> [\302\240\302\240488.003255] RAX: ffff8f0457931408 RBX: ffff8f0457931400 RCX:\n" + "> [??487.913158] Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]\n" + "> [??487.946586] RIP: 0010:sbitmap_any_bit_set+0xb/0x30\n" + "> [??487.973172] RSP: 0018:ffffb19e47fdfe00 EFLAGS: 00010202\n" + "> [??488.003255] RAX: ffff8f0457931408 RBX: ffff8f0457931400 RCX:\n" "> 0000000000000004\n" - "> [\302\240\302\240488.044199] RDX: 0000000000000000 RSI: 0000000000000000 RDI:\n" + "> [??488.044199] RDX: 0000000000000000 RSI: 0000000000000000 RDI:\n" "> ffff8f04579314d0\n" - "> [\302\240\302\240488.085253] RBP: ffff8f04570b8000 R08: 00000000000271a0 R09:\n" + "> [??488.085253] RBP: ffff8f04570b8000 R08: 00000000000271a0 R09:\n" "> ffffffffacda1b44\n" - "> [\302\240\302\240488.126295] R10: ffffd6ee3f4ed000 R11: 0000000000000000 R12:\n" + "> [??488.126295] R10: ffffd6ee3f4ed000 R11: 0000000000000000 R12:\n" "> 0000000000000001\n" - "> [\302\240\302\240488.166746] R13: 0000000000000001 R14: 0000000000000000 R15:\n" + "> [??488.166746] R13: 0000000000000001 R14: 0000000000000000 R15:\n" "> ffff8f0457821138\n" - "> [\302\240\302\240488.207076] FS:\302\240\302\2400000000000000000(0000) GS:ffff8f145b280000(0000)\n" + "> [??488.207076] FS:??0000000000000000(0000) 
GS:ffff8f145b280000(0000)\n" "> knlGS:0000000000000000\n" - "> [\302\240\302\240488.252727] CS:\302\240\302\2400010 DS: 0000 ES: 0000 CR0: 0000000080050033\n" - "> [\302\240\302\240488.284395] CR2: 0000000000000000 CR3: 0000001e4aa0a006 CR4:\n" + "> [??488.252727] CS:??0010 DS: 0000 ES: 0000 CR0: 0000000080050033\n" + "> [??488.284395] CR2: 0000000000000000 CR3: 0000001e4aa0a006 CR4:\n" "> 00000000000206e0\n" - "> [\302\240\302\240488.324505] Call Trace:\n" - "> [\302\240\302\240488.337945]\302\240\302\240blk_mq_run_hw_queue+0xad/0xf0\n" - "> [\302\240\302\240488.361057]\302\240\302\240blk_mq_run_hw_queues+0x4b/0x60\n" - "> [\302\240\302\240488.384507]\302\240\302\240nvme_kill_queues+0x26/0x80 [nvme_core]\n" - "> [\302\240\302\240488.411528]\302\240\302\240nvme_remove_dead_ctrl_work+0x17/0x40 [nvme]\n" - "> [\302\240\302\240488.441602]\302\240\302\240process_one_work+0x158/0x360\n" - "> [\302\240\302\240488.464568]\302\240\302\240worker_thread+0x1fa/0x3e0\n" - "> [\302\240\302\240488.486044]\302\240\302\240kthread+0xf8/0x130\n" - "> [\302\240\302\240488.504022]\302\240\302\240? max_active_store+0x80/0x80\n" - "> [\302\240\302\240488.527034]\302\240\302\240? kthread_bind+0x10/0x10\n" - "> [\302\240\302\240488.548026]\302\240\302\240ret_from_fork+0x35/0x40\n" - "> [\302\240\302\240488.569062] Code: c6 44 0f 46 ce 83 c2 01 45 89 ca 4c 89 54 01 08\n" + "> [??488.324505] Call Trace:\n" + "> [??488.337945]??blk_mq_run_hw_queue+0xad/0xf0\n" + "> [??488.361057]??blk_mq_run_hw_queues+0x4b/0x60\n" + "> [??488.384507]??nvme_kill_queues+0x26/0x80 [nvme_core]\n" + "> [??488.411528]??nvme_remove_dead_ctrl_work+0x17/0x40 [nvme]\n" + "> [??488.441602]??process_one_work+0x158/0x360\n" + "> [??488.464568]??worker_thread+0x1fa/0x3e0\n" + "> [??488.486044]??kthread+0xf8/0x130\n" + "> [??488.504022]??? max_active_store+0x80/0x80\n" + "> [??488.527034]??? 
kthread_bind+0x10/0x10\n" + "> [??488.548026]??ret_from_fork+0x35/0x40\n" + "> [??488.569062] Code: c6 44 0f 46 ce 83 c2 01 45 89 ca 4c 89 54 01 08\n" "> 48\n" "> 8b 4f 10 2b 74 01 08 39 57 08 77 d8 f3 c3 90 8b 4f 08 85 c9 74 1f 48\n" "> 8b\n" "> 57 10 <48> 83 3a 00 75 18 31 c0 eb 0a 48 83 c2 40 48 83 3a 00 75 0a\n" - "> 83\302\240\n" - "> [\302\240\302\240488.676148] RIP: sbitmap_any_bit_set+0xb/0x30 RSP:\n" + "> 83?\n" + "> [??488.676148] RIP: sbitmap_any_bit_set+0xb/0x30 RSP:\n" "> ffffb19e47fdfe00\n" - "> [\302\240\302\240488.711006] CR2: 0000000000000000\n" + "> [??488.711006] CR2: 0000000000000000\n" "\n" "3rd and 4th attempts slightly better, but clearly not dependable\n" "\n" - "[root@segstorage1 blktests]# ./check block/011\n" - "block/011 => nvme0n1 (disable PCI device while doing I/O)\302\240\302\240\302\240\302\240[failed]\n" - "\302\240\302\240\302\240\302\240runtime\302\240\302\240\302\240\302\240...\302\240\302\24081.188s\n" - "\302\240\302\240\302\240\302\240--- tests/block/011.out\t2018-05-05 18:01:14.268414752 -0400\n" - "\302\240\302\240\302\240\302\240+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" + "[root at segstorage1 blktests]# ./check block/011\n" + "block/011 => nvme0n1 (disable PCI device while doing I/O)????[failed]\n" + "????runtime????...??81.188s\n" + "????--- tests/block/011.out\t2018-05-05 18:01:14.268414752 -0400\n" + "????+++ results/nvme0n1/block/011.out.bad\t2018-05-05\n" "19:44:48.848568687 -0400\n" - "\302\240\302\240\302\240\302\240@@ -1,2 +1,3 @@\n" - "\302\240\302\240\302\240\302\240\302\240Running block/011\n" - "\302\240\302\240\302\240\302\240+tests/block/011: line 47: echo: write error: Input/output error\n" - "\302\240\302\240\302\240\302\240\302\240Test complete\n" + "????@@ -1,2 +1,3 @@\n" + "?????Running block/011\n" + "????+tests/block/011: line 47: echo: write error: Input/output error\n" + "?????Test complete\n" "\n" "This one passed \n" - "[root@segstorage1 blktests]# ./check block/011\n" - "block/011 => 
nvme0n1 (disable PCI device while doing I/O)\302\240\302\240\302\240\302\240[passed]\n" - "\302\240\302\240\302\240\302\240runtime\302\240\302\24081.188s\302\240\302\240...\302\240\302\24043.400s\n" + "[root at segstorage1 blktests]# ./check block/011\n" + "block/011 => nvme0n1 (disable PCI device while doing I/O)????[passed]\n" + "????runtime??81.188s??...??43.400s\n" "\n" "I will capture a vmcore next time it panics and give some information\n" after analyzing the core -fe9a026c40ef22b4f26a56752b5e0297b4386c368e45ec92ddd96dc9cf5ad8fe +093858a099604aafa294dd8b1ea5448b7dec7dc8db654d944f542bb18f4de3ae