* NVME timeout causing system hangs
@ 2019-08-19 23:33 Ashton Holmes
2019-08-22 17:28 ` Keith Busch
0 siblings, 1 reply; 3+ messages in thread
From: Ashton Holmes @ 2019-08-19 23:33 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 841 bytes --]
When playing certain games on my PC dmesg will start spitting out NVME
timeout messages, this eventually results in BTRFS throwing errors and
remounting itself as read only. The drive passes smart's health check
and works fine when not playing games. The really weird part is this
will happen even if the game I'm playing isn't installed on that drive.
I wanted to bisect this but it happens on every kernel version I've
tried. I've attached my dmesg log. This was originally reported here
https://bugzilla.kernel.org/show_bug.cgi?id=202633 but no response was
ever given. In that report I state that 4.19.24 for whatever reason
doesn't trigger this however that doesn't seem to be the case anymore.
I've updated my UEFI since then, I wouldn't expect that to make a
difference but I'm not sure what else would have changed that.
[-- Attachment #2: dmesg.txt --]
[-- Type: text/plain, Size: 17574 bytes --]
[ 170.678837] nvme nvme0: I/O 128 QID 2 timeout, aborting
[ 170.678845] nvme nvme0: I/O 129 QID 2 timeout, aborting
[ 170.678850] nvme nvme0: I/O 167 QID 2 timeout, aborting
[ 170.678853] nvme nvme0: I/O 168 QID 2 timeout, aborting
[ 170.678856] nvme nvme0: I/O 169 QID 2 timeout, aborting
[ 201.657527] nvme nvme0: I/O 128 QID 2 timeout, reset controller
[ 232.372876] nvme nvme0: I/O 8 QID 0 timeout, reset controller
[ 323.643688] nvme nvme0: Device not ready; aborting reset
[ 323.675893] print_req_error: I/O error, dev nvme0n1, sector 1088653384 flags 80700
[ 323.675902] print_req_error: I/O error, dev nvme0n1, sector 1001346664 flags 80700
[ 323.675915] print_req_error: I/O error, dev nvme0n1, sector 1088646984 flags 84700
[ 323.675920] print_req_error: I/O error, dev nvme0n1, sector 1088647240 flags 84700
[ 323.675923] print_req_error: I/O error, dev nvme0n1, sector 1088647496 flags 84700
[ 323.675927] print_req_error: I/O error, dev nvme0n1, sector 1088647752 flags 84700
[ 323.675931] print_req_error: I/O error, dev nvme0n1, sector 1088648008 flags 84700
[ 323.675935] print_req_error: I/O error, dev nvme0n1, sector 1088648264 flags 84700
[ 323.675938] print_req_error: I/O error, dev nvme0n1, sector 1088648520 flags 84700
[ 323.675942] print_req_error: I/O error, dev nvme0n1, sector 1088648776 flags 84700
[ 323.675993] nvme nvme0: Abort status: 0x7
[ 323.675995] nvme nvme0: Abort status: 0x7
[ 323.675996] nvme nvme0: Abort status: 0x7
[ 323.675998] nvme nvme0: Abort status: 0x7
[ 323.675999] nvme nvme0: Abort status: 0x7
[ 363.442595] INFO: task kworker/u64:4:279 blocked for more than 120 seconds.
[ 363.442599] Not tainted 5.2.7 #7
[ 363.442601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.442602] kworker/u64:4 D 0 279 2 0x80004000
[ 363.442637] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.442639] Call Trace:
[ 363.442646] ? __schedule+0x299/0x660
[ 363.442649] schedule+0x29/0x90
[ 363.442674] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.442678] ? finish_wait+0x80/0x80
[ 363.442703] start_transaction+0x2cd/0x470 [btrfs]
[ 363.442729] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.442732] ? __switch_to_asm+0x40/0x70
[ 363.442759] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.442762] process_one_work+0x19b/0x3a0
[ 363.442764] worker_thread+0x30/0x370
[ 363.442766] ? process_one_work+0x3a0/0x3a0
[ 363.442768] kthread+0x112/0x130
[ 363.442771] ? kthread_create_on_node+0x60/0x60
[ 363.442772] ret_from_fork+0x22/0x40
[ 363.442776] INFO: task kworker/u64:6:283 blocked for more than 120 seconds.
[ 363.442777] Not tainted 5.2.7 #7
[ 363.442778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.442779] kworker/u64:6 D 0 283 2 0x80004000
[ 363.442808] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.442809] Call Trace:
[ 363.442811] ? __schedule+0x299/0x660
[ 363.442814] schedule+0x29/0x90
[ 363.442837] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.442840] ? finish_wait+0x80/0x80
[ 363.442864] start_transaction+0x2cd/0x470 [btrfs]
[ 363.442889] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.442916] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.442918] process_one_work+0x19b/0x3a0
[ 363.442920] worker_thread+0x30/0x370
[ 363.442922] ? process_one_work+0x3a0/0x3a0
[ 363.442923] kthread+0x112/0x130
[ 363.442925] ? kthread_create_on_node+0x60/0x60
[ 363.442927] ret_from_fork+0x22/0x40
[ 363.442930] INFO: task kworker/u64:10:288 blocked for more than 120 seconds.
[ 363.442931] Not tainted 5.2.7 #7
[ 363.442932] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.442933] kworker/u64:10 D 0 288 2 0x80004000
[ 363.442960] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.442961] Call Trace:
[ 363.442964] ? __schedule+0x299/0x660
[ 363.442966] schedule+0x29/0x90
[ 363.442990] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.442992] ? finish_wait+0x80/0x80
[ 363.443016] start_transaction+0x2cd/0x470 [btrfs]
[ 363.443041] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.443043] ? __switch_to_asm+0x40/0x70
[ 363.443070] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.443072] process_one_work+0x19b/0x3a0
[ 363.443074] worker_thread+0x30/0x370
[ 363.443075] ? process_one_work+0x3a0/0x3a0
[ 363.443077] kthread+0x112/0x130
[ 363.443079] ? kthread_create_on_node+0x60/0x60
[ 363.443081] ret_from_fork+0x22/0x40
[ 363.443083] INFO: task kworker/u64:11:289 blocked for more than 120 seconds.
[ 363.443084] Not tainted 5.2.7 #7
[ 363.443085] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.443086] kworker/u64:11 D 0 289 2 0x80004000
[ 363.443114] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.443114] Call Trace:
[ 363.443117] ? __schedule+0x299/0x660
[ 363.443119] schedule+0x29/0x90
[ 363.443143] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.443146] ? finish_wait+0x80/0x80
[ 363.443169] start_transaction+0x2cd/0x470 [btrfs]
[ 363.443192] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.443194] ? __switch_to_asm+0x40/0x70
[ 363.443217] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.443218] process_one_work+0x19b/0x3a0
[ 363.443220] worker_thread+0x30/0x370
[ 363.443221] ? process_one_work+0x3a0/0x3a0
[ 363.443223] kthread+0x112/0x130
[ 363.443224] ? kthread_create_on_node+0x60/0x60
[ 363.443226] ret_from_fork+0x22/0x40
[ 363.443241] INFO: task btrfs-transacti:897 blocked for more than 120 seconds.
[ 363.443242] Not tainted 5.2.7 #7
[ 363.443243] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.443244] btrfs-transacti D 0 897 2 0x80004000
[ 363.443245] Call Trace:
[ 363.443247] ? __schedule+0x299/0x660
[ 363.443249] schedule+0x29/0x90
[ 363.443251] io_schedule+0x12/0x40
[ 363.443254] wait_on_page_bit+0x12e/0x1f0
[ 363.443256] ? file_fdatawait_range+0x20/0x20
[ 363.443279] read_extent_buffer_pages+0x27a/0x320 [btrfs]
[ 363.443300] btree_read_extent_buffer_pages+0x9b/0x110 [btrfs]
[ 363.443320] read_tree_block+0x47/0x60 [btrfs]
[ 363.443337] read_block_for_search.isra.36+0x1b6/0x350 [btrfs]
[ 363.443356] btrfs_search_slot+0x263/0x990 [btrfs]
[ 363.443374] lookup_inline_extent_backref+0x181/0x630 [btrfs]
[ 363.443392] ? leaf_space_used+0xdc/0x100 [btrfs]
[ 363.443410] __btrfs_free_extent.isra.79+0xf3/0x900 [btrfs]
[ 363.443428] ? pin_down_extent+0xd1/0x140 [btrfs]
[ 363.443446] ? block_rsv_release_bytes+0xfb/0x130 [btrfs]
[ 363.443468] ? btrfs_merge_delayed_refs+0x2be/0x340 [btrfs]
[ 363.443487] __btrfs_run_delayed_refs+0x6d8/0xf40 [btrfs]
[ 363.443506] ? btrfs_create_pending_block_groups+0x145/0x230 [btrfs]
[ 363.443508] ? _cond_resched+0x15/0x30
[ 363.443527] btrfs_run_delayed_refs+0xd3/0x180 [btrfs]
[ 363.443548] btrfs_commit_transaction+0x445/0x9b0 [btrfs]
[ 363.443568] ? start_transaction+0x9e/0x470 [btrfs]
[ 363.443588] transaction_kthread+0x152/0x170 [btrfs]
[ 363.443609] ? btrfs_cleanup_transaction+0x560/0x560 [btrfs]
[ 363.443610] kthread+0x112/0x130
[ 363.443612] ? kthread_create_on_node+0x60/0x60
[ 363.443613] ret_from_fork+0x22/0x40
[ 363.443645] INFO: task kworker/u64:14:1347 blocked for more than 120 seconds.
[ 363.443646] Not tainted 5.2.7 #7
[ 363.443647] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.443648] kworker/u64:14 D 0 1347 2 0x80004000
[ 363.443671] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.443672] Call Trace:
[ 363.443674] ? __schedule+0x299/0x660
[ 363.443676] schedule+0x29/0x90
[ 363.443696] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.443698] ? finish_wait+0x80/0x80
[ 363.443719] start_transaction+0x2cd/0x470 [btrfs]
[ 363.443740] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.443742] ? __switch_to_asm+0x40/0x70
[ 363.443764] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.443766] process_one_work+0x19b/0x3a0
[ 363.443768] worker_thread+0x30/0x370
[ 363.443769] ? process_one_work+0x3a0/0x3a0
[ 363.443770] kthread+0x112/0x130
[ 363.443772] ? kthread_create_on_node+0x60/0x60
[ 363.443773] ret_from_fork+0x22/0x40
[ 363.443776] INFO: task kworker/u64:16:1349 blocked for more than 120 seconds.
[ 363.443777] Not tainted 5.2.7 #7
[ 363.443778] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.443778] kworker/u64:16 D 0 1349 2 0x80004000
[ 363.443802] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 363.443802] Call Trace:
[ 363.443805] ? __schedule+0x299/0x660
[ 363.443807] schedule+0x29/0x90
[ 363.443827] wait_current_trans+0xb2/0xe0 [btrfs]
[ 363.443829] ? finish_wait+0x80/0x80
[ 363.443849] start_transaction+0x2cd/0x470 [btrfs]
[ 363.443870] btrfs_finish_ordered_io+0x233/0x830 [btrfs]
[ 363.443896] normal_work_helper+0xae/0x2f0 [btrfs]
[ 363.443898] process_one_work+0x19b/0x3a0
[ 363.443900] worker_thread+0x30/0x370
[ 363.443901] ? process_one_work+0x3a0/0x3a0
[ 363.443903] kthread+0x112/0x130
[ 363.443905] ? kthread_create_on_node+0x60/0x60
[ 363.443907] ret_from_fork+0x22/0x40
[ 363.443912] INFO: task CIPCServer::Thr:1617 blocked for more than 120 seconds.
[ 363.443914] Not tainted 5.2.7 #7
[ 363.443915] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.443916] CIPCServer::Thr D 0 1617 1352 0x20020004
[ 363.443918] Call Trace:
[ 363.443920] ? __schedule+0x299/0x660
[ 363.443924] ? blk_flush_plug_list+0xd7/0x100
[ 363.443926] schedule+0x29/0x90
[ 363.443928] io_schedule+0x12/0x40
[ 363.443930] wait_on_page_bit+0x12e/0x1f0
[ 363.443933] ? file_fdatawait_range+0x20/0x20
[ 363.443962] read_extent_buffer_pages+0x27a/0x320 [btrfs]
[ 363.443983] btree_read_extent_buffer_pages+0x9b/0x110 [btrfs]
[ 363.444008] read_tree_block+0x47/0x60 [btrfs]
[ 363.444030] read_block_for_search.isra.36+0x1b6/0x350 [btrfs]
[ 363.444054] btrfs_search_slot+0x263/0x990 [btrfs]
[ 363.444080] btrfs_lookup_csum+0x52/0x150 [btrfs]
[ 363.444084] ? kmem_cache_alloc+0x150/0x1c0
[ 363.444103] __btrfs_lookup_bio_sums+0x228/0x660 [btrfs]
[ 363.444106] ? _cond_resched+0x15/0x30
[ 363.444126] btrfs_submit_bio_hook+0x103/0x170 [btrfs]
[ 363.444149] submit_one_bio+0x31/0x50 [btrfs]
[ 363.444173] extent_readpages+0x2aa/0x370 [btrfs]
[ 363.444178] ? tomoyo_check_open_permission+0xa6/0x190
[ 363.444180] read_pages+0x6b/0x190
[ 363.444183] __do_page_cache_readahead+0x1c1/0x1e0
[ 363.444186] ondemand_readahead+0x182/0x2f0
[ 363.444189] generic_file_read_iter+0x818/0xc80
[ 363.444193] new_sync_read+0x12a/0x1c0
[ 363.444195] vfs_read+0x91/0x140
[ 363.444197] ksys_read+0x59/0xd0
[ 363.444199] do_fast_syscall_32+0x94/0x204
[ 363.444202] entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[ 363.444205] INFO: task CJobMgr::m_Work:1632 blocked for more than 120 seconds.
[ 363.444207] Not tainted 5.2.7 #7
[ 363.444207] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.444209] CJobMgr::m_Work D 0 1632 1352 0x20020000
[ 363.444210] Call Trace:
[ 363.444212] ? __schedule+0x299/0x660
[ 363.444214] schedule+0x29/0x90
[ 363.444216] io_schedule+0x12/0x40
[ 363.444217] wait_on_page_bit+0x12e/0x1f0
[ 363.444219] ? file_fdatawait_range+0x20/0x20
[ 363.444243] read_extent_buffer_pages+0x27a/0x320 [btrfs]
[ 363.444263] btree_read_extent_buffer_pages+0x9b/0x110 [btrfs]
[ 363.444283] read_tree_block+0x47/0x60 [btrfs]
[ 363.444300] read_block_for_search.isra.36+0x1b6/0x350 [btrfs]
[ 363.444318] btrfs_search_slot+0x263/0x990 [btrfs]
[ 363.444339] ? do_async_commit+0x30/0x30 [btrfs]
[ 363.444358] btrfs_lookup_inode+0x3a/0xc0 [btrfs]
[ 363.444361] ? kmem_cache_alloc+0x150/0x1c0
[ 363.444382] btrfs_iget_path+0x537/0x6e0 [btrfs]
[ 363.444404] btrfs_lookup_dentry+0x469/0x570 [btrfs]
[ 363.444426] btrfs_lookup+0xe/0x30 [btrfs]
[ 363.444428] __lookup_slow+0x97/0x150
[ 363.444430] lookup_slow+0x35/0x50
[ 363.444432] walk_component+0x1bf/0x330
[ 363.444435] path_lookupat.isra.46+0x6d/0x220
[ 363.444436] ? mutex_lock+0xe/0x30
[ 363.444438] filename_lookup.part.62+0xa0/0x170
[ 363.444441] ? strncpy_from_user+0x4f/0x1a0
[ 363.444443] ? getname_flags+0x6a/0x1e0
[ 363.444444] do_faccessat+0xac/0x250
[ 363.444446] do_fast_syscall_32+0x94/0x204
[ 363.444448] entry_SYSCALL_compat_after_hwframe+0x45/0x4d
[ 363.444461] INFO: task steamerrorrepor:2016 blocked for more than 120 seconds.
[ 363.444462] Not tainted 5.2.7 #7
[ 363.444462] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.444463] steamerrorrepor D 0 2016 1 0x20020000
[ 363.444464] Call Trace:
[ 363.444467] ? __schedule+0x299/0x660
[ 363.444468] ? blk_flush_plug_list+0xd7/0x100
[ 363.444470] schedule+0x29/0x90
[ 363.444472] io_schedule+0x12/0x40
[ 363.444473] wait_on_page_bit+0x12e/0x1f0
[ 363.444475] ? file_fdatawait_range+0x20/0x20
[ 363.444498] read_extent_buffer_pages+0x27a/0x320 [btrfs]
[ 363.444518] btree_read_extent_buffer_pages+0x9b/0x110 [btrfs]
[ 363.444539] read_tree_block+0x47/0x60 [btrfs]
[ 363.444556] read_block_for_search.isra.36+0x1b6/0x350 [btrfs]
[ 363.444574] btrfs_search_slot+0x263/0x990 [btrfs]
[ 363.444593] btrfs_lookup_csum+0x52/0x150 [btrfs]
[ 363.444596] ? kmem_cache_alloc+0x150/0x1c0
[ 363.444615] __btrfs_lookup_bio_sums+0x228/0x660 [btrfs]
[ 363.444617] ? _cond_resched+0x15/0x30
[ 363.444638] btrfs_submit_bio_hook+0x103/0x170 [btrfs]
[ 363.444660] submit_one_bio+0x31/0x50 [btrfs]
[ 363.444683] extent_readpages+0x2aa/0x370 [btrfs]
[ 363.444685] ? tomoyo_check_open_permission+0xa6/0x190
[ 363.444687] read_pages+0x6b/0x190
[ 363.444690] __do_page_cache_readahead+0x1c1/0x1e0
[ 363.444692] ondemand_readahead+0x182/0x2f0
[ 363.444693] ? pagecache_get_page+0x30/0x2d0
[ 363.444695] generic_file_read_iter+0x818/0xc80
[ 363.444697] ? do_filp_open+0xa7/0x100
[ 363.444699] new_sync_read+0x12a/0x1c0
[ 363.444701] vfs_read+0x91/0x140
[ 363.444702] ksys_read+0x59/0xd0
[ 363.444704] do_int80_syscall_32+0x50/0xf0
[ 363.444706] entry_INT80_compat+0x82/0x90
[ 384.214291] nvme nvme0: Device not ready; aborting reset
[ 384.214297] nvme nvme0: Removing after probe failure status: -19
[ 444.833351] nvme nvme0: Device not ready; aborting reset
[ 444.857550] print_req_error: 17 callbacks suppressed
[ 444.857553] print_req_error: I/O error, dev nvme0n1, sector 2126920 flags 1000
[ 444.857556] print_req_error: I/O error, dev nvme0n1, sector 1583912 flags 1000
[ 444.857559] print_req_error: I/O error, dev nvme0n1, sector 481082728 flags 1000
[ 444.857563] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 444.857567] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 1, rd 1, flush 0, corrupt 0, gen 0
[ 444.857569] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 1, rd 2, flush 0, corrupt 0, gen 0
[ 444.857572] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 1, rd 3, flush 0, corrupt 0, gen 0
[ 444.857577] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 3, flush 0, corrupt 0, gen 0
[ 444.857592] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 4, flush 0, corrupt 0, gen 0
[ 444.857609] BTRFS info (device dm-1): no csum found for inode 27605208 start 5177344
[ 444.857618] BTRFS info (device dm-1): no csum found for inode 29805279 start 0
[ 444.857632] BTRFS: error (device dm-1) in __btrfs_free_extent:6983: errno=-5 IO failure
[ 444.857635] BTRFS info (device dm-1): forced readonly
[ 444.857638] BTRFS: error (device dm-1) in btrfs_run_delayed_refs:2907: errno=-5 IO failure
[ 444.857642] BTRFS warning (device dm-1): Skipping commit of aborted transaction.
[ 444.857643] BTRFS: error (device dm-1) in cleanup_transaction:1794: errno=-5 IO failure
[ 444.857650] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 5, flush 0, corrupt 0, gen 0
[ 444.857689] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 6, flush 0, corrupt 0, gen 0
[ 444.857694] BTRFS info (device dm-1): no csum found for inode 29805279 start 4096
[ 444.857707] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 7, flush 0, corrupt 0, gen 0
[ 444.857726] BTRFS info (device dm-1): no csum found for inode 29805279 start 8192
[ 444.857737] BTRFS error (device dm-1): bdev /dev/mapper/Storage errs: wr 2, rd 8, flush 0, corrupt 0, gen 0
[ 444.857753] BTRFS info (device dm-1): no csum found for inode 29805279 start 12288
[ 444.857813] BTRFS info (device dm-1): no csum found for inode 29805279 start 0
[ 444.857819] BTRFS info (device dm-1): no csum found for inode 27605208 start 5177344
[ 444.857860] BTRFS info (device dm-1): no csum found for inode 29805279 start 0
[ 444.858292] BTRFS info (device dm-1): no csum found for inode 29805279 start 0
[ 444.858328] BTRFS info (device dm-1): no csum found for inode 29805279 start 0
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: NVME timeout causing system hangs
2019-08-19 23:33 NVME timeout causing system hangs Ashton Holmes
@ 2019-08-22 17:28 ` Keith Busch
2019-08-27 3:16 ` Ashton Holmes
0 siblings, 1 reply; 3+ messages in thread
From: Keith Busch @ 2019-08-22 17:28 UTC (permalink / raw)
To: Ashton Holmes; +Cc: linux-kernel
On Mon, Aug 19, 2019 at 04:33:45PM -0700, Ashton Holmes wrote:
> When playing certain games on my PC dmesg will start spitting out NVME
> timeout messages, this eventually results in BTRFS throwing errors and
> remounting itself as read only. The drive passes smart's health check and
> works fine when not playing games. The really weird part is this will happen
> even if the game I'm playing isn't installed on that drive. I wanted to
> bisect this but it happens on every kernel version I've tried. I've attached
> my dmesg log. This was originally reported here
> https://bugzilla.kernel.org/show_bug.cgi?id=202633 but no response was ever
> given. In that report I state that 4.19.24 for whatever reason doesn't
> trigger this however that doesn't seem to be the case anymore. I've updated
> my UEFI since then, I wouldn't expect that to make a difference but I'm not
> sure what else would have changed that.
This really looks like your nvme controller has gotten itself in an
unresponsive state: it is not responding to IO, admin, or reset
requests.
The only recommendation I have at the moment is to verify you have the
most current firmware from your vendor installed on this controller,
and update if not.
> [ 170.678837] nvme nvme0: I/O 128 QID 2 timeout, aborting
> [ 170.678845] nvme nvme0: I/O 129 QID 2 timeout, aborting
> [ 170.678850] nvme nvme0: I/O 167 QID 2 timeout, aborting
> [ 170.678853] nvme nvme0: I/O 168 QID 2 timeout, aborting
> [ 170.678856] nvme nvme0: I/O 169 QID 2 timeout, aborting
> [ 201.657527] nvme nvme0: I/O 128 QID 2 timeout, reset controller
> [ 232.372876] nvme nvme0: I/O 8 QID 0 timeout, reset controller
> [ 323.643688] nvme nvme0: Device not ready; aborting reset
> [ 323.675893] print_req_error: I/O error, dev nvme0n1, sector 1088653384 flags 80700
> [ 323.675902] print_req_error: I/O error, dev nvme0n1, sector 1001346664 flags 80700
> [ 323.675915] print_req_error: I/O error, dev nvme0n1, sector 1088646984 flags 84700
> [ 323.675920] print_req_error: I/O error, dev nvme0n1, sector 1088647240 flags 84700
> [ 323.675923] print_req_error: I/O error, dev nvme0n1, sector 1088647496 flags 84700
> [ 323.675927] print_req_error: I/O error, dev nvme0n1, sector 1088647752 flags 84700
> [ 323.675931] print_req_error: I/O error, dev nvme0n1, sector 1088648008 flags 84700
> [ 323.675935] print_req_error: I/O error, dev nvme0n1, sector 1088648264 flags 84700
> [ 323.675938] print_req_error: I/O error, dev nvme0n1, sector 1088648520 flags 84700
> [ 323.675942] print_req_error: I/O error, dev nvme0n1, sector 1088648776 flags 84700
> [ 323.675993] nvme nvme0: Abort status: 0x7
> [ 323.675995] nvme nvme0: Abort status: 0x7
> [ 323.675996] nvme nvme0: Abort status: 0x7
> [ 323.675998] nvme nvme0: Abort status: 0x7
> [ 323.675999] nvme nvme0: Abort status: 0x7
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: NVME timeout causing system hangs
2019-08-22 17:28 ` Keith Busch
@ 2019-08-27 3:16 ` Ashton Holmes
0 siblings, 0 replies; 3+ messages in thread
From: Ashton Holmes @ 2019-08-27 3:16 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-kernel
I tried looking around for the firmware online but couldn't find it,
contacting adata ended up dead ending me, they said I can't find it
because there hasn't ever been an firmware updates for this drive and am
recommending I RMA it. Is there anything I can do to further
troubleshoot this and make sure it is/isn't a hardware issue before I
RMA it?
On 8/22/19 10:28 AM, Keith Busch wrote:
> On Mon, Aug 19, 2019 at 04:33:45PM -0700, Ashton Holmes wrote:
>> When playing certain games on my PC dmesg will start spitting out NVME
>> timeout messages, this eventually results in BTRFS throwing errors and
>> remounting itself as read only. The drive passes smart's health check and
>> works fine when not playing games. The really weird part is this will happen
>> even if the game I'm playing isn't installed on that drive. I wanted to
>> bisect this but it happens on every kernel version I've tried. I've attached
>> my dmesg log. This was originally reported here
>> https://bugzilla.kernel.org/show_bug.cgi?id=202633 but no response was ever
>> given. In that report I state that 4.19.24 for whatever reason doesn't
>> trigger this however that doesn't seem to be the case anymore. I've updated
>> my UEFI since then, I wouldn't expect that to make a difference but I'm not
>> sure what else would have changed that.
> This really looks like your nvme controller has gotten itself in an
> unresponsive state: it is not responding to IO, admin, or reset
> requests.
>
> The only recommendation I have at the moment is to verify you have the
> most current firmware from your vendor installed on this controller,
> and update if not.
>
>
>
>> [ 170.678837] nvme nvme0: I/O 128 QID 2 timeout, aborting
>> [ 170.678845] nvme nvme0: I/O 129 QID 2 timeout, aborting
>> [ 170.678850] nvme nvme0: I/O 167 QID 2 timeout, aborting
>> [ 170.678853] nvme nvme0: I/O 168 QID 2 timeout, aborting
>> [ 170.678856] nvme nvme0: I/O 169 QID 2 timeout, aborting
>> [ 201.657527] nvme nvme0: I/O 128 QID 2 timeout, reset controller
>> [ 232.372876] nvme nvme0: I/O 8 QID 0 timeout, reset controller
>> [ 323.643688] nvme nvme0: Device not ready; aborting reset
>> [ 323.675893] print_req_error: I/O error, dev nvme0n1, sector 1088653384 flags 80700
>> [ 323.675902] print_req_error: I/O error, dev nvme0n1, sector 1001346664 flags 80700
>> [ 323.675915] print_req_error: I/O error, dev nvme0n1, sector 1088646984 flags 84700
>> [ 323.675920] print_req_error: I/O error, dev nvme0n1, sector 1088647240 flags 84700
>> [ 323.675923] print_req_error: I/O error, dev nvme0n1, sector 1088647496 flags 84700
>> [ 323.675927] print_req_error: I/O error, dev nvme0n1, sector 1088647752 flags 84700
>> [ 323.675931] print_req_error: I/O error, dev nvme0n1, sector 1088648008 flags 84700
>> [ 323.675935] print_req_error: I/O error, dev nvme0n1, sector 1088648264 flags 84700
>> [ 323.675938] print_req_error: I/O error, dev nvme0n1, sector 1088648520 flags 84700
>> [ 323.675942] print_req_error: I/O error, dev nvme0n1, sector 1088648776 flags 84700
>> [ 323.675993] nvme nvme0: Abort status: 0x7
>> [ 323.675995] nvme nvme0: Abort status: 0x7
>> [ 323.675996] nvme nvme0: Abort status: 0x7
>> [ 323.675998] nvme nvme0: Abort status: 0x7
>> [ 323.675999] nvme nvme0: Abort status: 0x7
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-08-27 3:16 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-19 23:33 NVME timeout causing system hangs Ashton Holmes
2019-08-22 17:28 ` Keith Busch
2019-08-27 3:16 ` Ashton Holmes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox