From: "brookxu.cn" <brookxu.cn@gmail.com>
To: kbusch@kernel.org, axboe@kernel.dk, hch@lst.de, sagi@grimberg.me
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 0/4] nvme-tcp: fix hung issues for deleting
Date: Mon, 29 May 2023 18:59:22 +0800
Message-ID: <cover.1685350577.git.chunguang.xu@shopee.com>
From: Chunguang Xu <chunguang.xu@shopee.com>
We found that nvme_remove_namespaces() may hang in flush_work(&ctrl->scan_work)
while removing a ctrl. The root cause is that the ctrl state changes to
NVME_CTRL_DELETING during removal, which interrupts
nvme_tcp_error_recovery_work()/nvme_reset_ctrl_work()/nvme_tcp_reconnect_or_remove().
At this point the ctrl is still frozen and its queues are still quiesced. Any
IO that scan_work issues to load the partition table therefore blocks, and
nvme_remove_namespaces() hangs in flush_work(&ctrl->scan_work).
After analysis, we found there are mainly two cases:
1. Since the ctrl is frozen, scan_work hangs in __bio_queue_enter() when it
issues new IO to load the partition table (see the sketch after this list).
2. Since the queues are quiesced, timed-out IO that gets requeued sits on the
hctx->dispatch list, leaving scan_work waiting for the IO to complete.
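For reference, here is a rough sketch of the freeze/quiesce pairing involved
(simplified and not part of this series; the helper names follow the current
mainline driver, exact call sites vary by version):

    /* teardown side, run from error recovery / reset */
    nvme_start_freeze(ctrl);        /* new bios block in __bio_queue_enter() */
    nvme_quiesce_io_queues(ctrl);   /* hctxs stop dispatching, requeues park */

    /* the matching undo, normally run once reconnect/reset succeeds */
    nvme_unquiesce_io_queues(ctrl); /* hctxs resume, dispatch lists drain */
    nvme_unfreeze(ctrl);            /* wakes __bio_queue_enter() waiters */

If the ctrl state flips to NVME_CTRL_DELETING in between, the undo half never
runs, so scan_work blocks forever and the delete path hangs in
flush_work(&ctrl->scan_work).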
Call traces:
Removing nvme ctrl:
[<0>] __flush_work+0x14c/0x280
[<0>] flush_work+0x14/0x20
[<0>] nvme_remove_namespaces+0x45/0x100
[<0>] nvme_do_delete_ctrl+0x79/0xa0
[<0>] nvme_sysfs_delete+0x6b/0x80
[<0>] dev_attr_store+0x18/0x30
[<0>] sysfs_kf_write+0x3f/0x50
[<0>] kernfs_fop_write_iter+0x141/0x1d0
[<0>] vfs_write+0x25b/0x3d0
[<0>] ksys_write+0x6b/0xf0
[<0>] __x64_sys_write+0x1e/0x30
[<0>] do_syscall_64+0x5d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
scan_work:
Stack 0
[<0>] __bio_queue_enter+0x15a/0x210
[<0>] blk_mq_submit_bio+0x260/0x5e0
[<0>] __submit_bio+0xa6/0x1a0
[<0>] submit_bio_noacct_nocheck+0x2e5/0x390
[<0>] submit_bio_noacct+0x1cd/0x560
[<0>] submit_bio+0x3b/0x60
[<0>] submit_bh_wbc+0x137/0x160
[<0>] block_read_full_folio+0x24d/0x470
[<0>] blkdev_read_folio+0x1c/0x30
[<0>] filemap_read_folio+0x44/0x2a0
[<0>] do_read_cache_folio+0x135/0x390
[<0>] read_cache_folio+0x16/0x20
[<0>] read_part_sector+0x3e/0xd0
[<0>] sgi_partition+0x35/0x1d0
[<0>] bdev_disk_changed+0x1f6/0x650
[<0>] blkdev_get_whole+0x7e/0x90
[<0>] blkdev_get_by_dev+0x19c/0x2e0
[<0>] disk_scan_partitions+0x72/0x100
[<0>] device_add_disk+0x415/0x420
[<0>] nvme_scan_ns+0x636/0xcd0
[<0>] nvme_scan_work+0x26f/0x450
[<0>] process_one_work+0x21c/0x430
[<0>] worker_thread+0x4e/0x3c0
[<0>] kthread+0xfb/0x130
[<0>] ret_from_fork+0x29/0x50
Stack 1
[<0>] filemap_read_folio+0x195/0x2a0
[<0>] do_read_cache_folio+0x135/0x390
[<0>] read_cache_folio+0x16/0x20
[<0>] read_part_sector+0x3e/0xd0
[<0>] read_lba+0xcc/0x1b0
[<0>] efi_partition+0xec/0x7f0
[<0>] bdev_disk_changed+0x1f6/0x650
[<0>] blkdev_get_whole+0x7e/0x90
[<0>] blkdev_get_by_dev+0x19c/0x2e0
[<0>] disk_scan_partitions+0x72/0x100
[<0>] device_add_disk+0x433/0x440
[<0>] nvme_scan_ns+0x636/0xcd0
[<0>] nvme_scan_work+0x26f/0x450
[<0>] process_one_work+0x21c/0x430
[<0>] worker_thread+0x4e/0x3c0
[<0>] kthread+0xfb/0x130
[<0>] ret_from_fork+0x29/0x50
This series tries to fix the issue by making sure the ctrl is unfrozen and its
queues are unquiesced when exiting error recovery or reset, as sketched below.
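As an illustration only, a hypothetical sketch of the direction the series
takes (simplified from the description above, not the literal patch; the
placement of the unquiesce/unfreeze calls is our assumption):

    static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
    {
            /* If we are resetting/deleting then do nothing */
            if (ctrl->state != NVME_CTRL_CONNECTING) {
                    WARN_ON_ONCE(ctrl->state == NVME_CTRL_NEW ||
                                 ctrl->state == NVME_CTRL_LIVE);
                    /*
                     * Assumed fix: we will never reconnect, so undo the
                     * freeze/quiesce here instead of leaving it to a
                     * reconnect that never happens.
                     */
                    nvme_unquiesce_io_queues(ctrl);
                    nvme_unfreeze(ctrl);
                    return;
            }

            /* ... existing reconnect/remove logic unchanged ... */
    }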
Chunguang Xu (4):
nvme: unfreeze while exit from recovery or resetting
nvme: donot retry request for NVME_CTRL_DELETING_NOIO
nvme: optimize nvme_check_ready() for NVME_CTRL_DELETING_NOIO
nvme-tcp: remove admin_q quiescing from nvme_tcp_teardown_io_queues
drivers/nvme/host/core.c | 5 ++++-
drivers/nvme/host/nvme.h | 3 ++-
drivers/nvme/host/tcp.c | 25 ++++++++++++++++---------
3 files changed, 22 insertions(+), 11 deletions(-)
--
2.25.1
Thread overview: 20+ messages
2023-05-29 10:59 brookxu.cn [this message]
2023-05-29 10:59 ` [RFC PATCH 1/4] nvme: unfreeze while exit from recovery or resetting brookxu.cn
2023-05-29 10:59 ` [RFC PATCH 2/4] nvme: donot retry request for NVME_CTRL_DELETING_NOIO brookxu.cn
2023-05-29 10:59 ` [RFC PATCH 3/4] nvme: optimize nvme_check_ready() " brookxu.cn
2023-05-29 10:59 ` [RFC PATCH 4/4] nvme-tcp: remove admin_q quiescing from nvme_tcp_teardown_io_queues brookxu.cn
2023-06-05 23:09 ` [RFC PATCH 0/4] nvme-tcp: fix hung issues for deleting Sagi Grimberg
2023-06-06 14:32 ` 许春光
2023-06-06 14:41 ` 许春光
2023-06-06 15:14 ` Ming Lei
2023-06-07 4:09 ` 许春光
2023-06-08 0:56 ` Ming Lei
2023-06-08 2:48 ` 许春光
2023-06-08 13:51 ` Ming Lei
2023-06-09 3:17 ` 许春光
2023-06-09 3:23 ` 许春光
2023-06-11 8:11 ` Sagi Grimberg
2023-06-12 1:33 ` Ming Lei
2023-06-12 6:36 ` Sagi Grimberg
2023-06-13 1:01 ` Ming Lei
2023-06-12 8:24 ` 许春光