From: "brookxu.cn" <brookxu.cn@gmail.com>
To: kbusch@kernel.org, axboe@kernel.dk, hch@lst.de, sagi@grimberg.me
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 0/4] nvme-tcp: fix hung issues for deleting
Date: Mon, 29 May 2023 18:59:22 +0800	[thread overview]
Message-ID: <cover.1685350577.git.chunguang.xu@shopee.com> (raw)

From: Chunguang Xu <chunguang.xu@shopee.com>

We found that nvme_remove_namespaces() may hang in flush_work(&ctrl->scan_work)
while removing a ctrl. The root cause appears to be that the ctrl state changes
to NVME_CTRL_DELETING while the ctrl is being removed, which interrupts
nvme_tcp_error_recovery_work()/nvme_reset_ctrl_work()/nvme_tcp_reconnect_or_remove().
At that point the ctrl is still frozen and its queues are still quiesced. scan_work
may continue to issue I/O to load the partition table; that I/O blocks, and
nvme_remove_namespaces() then hangs in flush_work(&ctrl->scan_work).
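
Roughly, the sequence looks like this (a heavily simplified sketch of the
recovery/reset pattern, not the exact upstream code; the helper names follow
drivers/nvme/host/core.c and tcp.c, and the _sketch suffix marks it as
illustrative only):

static void nvme_tcp_error_recovery_sketch(struct nvme_ctrl *ctrl)
{
	/* the teardown path freezes the ctrl and quiesces the I/O queues */
	nvme_start_freeze(ctrl);
	nvme_quiesce_io_queues(ctrl);

	/* ... tear down the TCP queues, cancel inflight requests ... */

	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
		/*
		 * A concurrent delete moved the ctrl to NVME_CTRL_DELETING,
		 * so we bail out here without ever unfreezing the ctrl.
		 * scan_work, and the delete path flushing it, is now stuck.
		 */
		return;
	}

	/* reconnect would normally wait_freeze + unfreeze the ctrl later */
	nvme_tcp_reconnect_or_remove(ctrl);
}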

After analysis, we found that there are mainly two cases (see the simplified
sketch after this list):
1. Since the ctrl is frozen, scan_work hangs in __bio_queue_enter() when it
   issues new I/O to load the partition table.
2. Since the queues are quiesced, requeued timed-out I/O may sit in the
   hctx->dispatch list, leaving scan_work waiting for I/O completion.
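
Heavily simplified, the blocking mechanism behind the two cases looks like
this (it mirrors the logic of __bio_queue_enter() in block/blk-core.c and of
the blk-mq quiesce handling; it is not the real code):

static void bio_queue_enter_sketch(struct request_queue *q)
{
	/*
	 * Case 1: while the queue is frozen (mq_freeze_depth > 0), a new
	 * bio cannot enter the queue and the submitter sleeps until
	 * something unfreezes it.
	 */
	wait_event(q->mq_freeze_wq, !q->mq_freeze_depth);

	/*
	 * Case 2: a quiesced queue keeps requeued (e.g. timed out and
	 * requeued) requests parked on hctx->dispatch; nothing dispatches
	 * them until blk_mq_unquiesce_queue() is called, so the I/O that
	 * scan_work is waiting on never completes.
	 */
}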

Call traces:

Removing nvme_ctrl:
[<0>] __flush_work+0x14c/0x280
[<0>] flush_work+0x14/0x20
[<0>] nvme_remove_namespaces+0x45/0x100
[<0>] nvme_do_delete_ctrl+0x79/0xa0
[<0>] nvme_sysfs_delete+0x6b/0x80
[<0>] dev_attr_store+0x18/0x30
[<0>] sysfs_kf_write+0x3f/0x50
[<0>] kernfs_fop_write_iter+0x141/0x1d0
[<0>] vfs_write+0x25b/0x3d0
[<0>] ksys_write+0x6b/0xf0
[<0>] __x64_sys_write+0x1e/0x30
[<0>] do_syscall_64+0x5d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

Scan_work:
Stack 0
[<0>] __bio_queue_enter+0x15a/0x210
[<0>] blk_mq_submit_bio+0x260/0x5e0
[<0>] __submit_bio+0xa6/0x1a0
[<0>] submit_bio_noacct_nocheck+0x2e5/0x390
[<0>] submit_bio_noacct+0x1cd/0x560
[<0>] submit_bio+0x3b/0x60
[<0>] submit_bh_wbc+0x137/0x160
[<0>] block_read_full_folio+0x24d/0x470
[<0>] blkdev_read_folio+0x1c/0x30
[<0>] filemap_read_folio+0x44/0x2a0
[<0>] do_read_cache_folio+0x135/0x390
[<0>] read_cache_folio+0x16/0x20
[<0>] read_part_sector+0x3e/0xd0
[<0>] sgi_partition+0x35/0x1d0
[<0>] bdev_disk_changed+0x1f6/0x650
[<0>] blkdev_get_whole+0x7e/0x90
[<0>] blkdev_get_by_dev+0x19c/0x2e0
[<0>] disk_scan_partitions+0x72/0x100
[<0>] device_add_disk+0x415/0x420
[<0>] nvme_scan_ns+0x636/0xcd0
[<0>] nvme_scan_work+0x26f/0x450
[<0>] process_one_work+0x21c/0x430
[<0>] worker_thread+0x4e/0x3c0
[<0>] kthread+0xfb/0x130
[<0>] ret_from_fork+0x29/0x50

Stack 1
[<0>] filemap_read_folio+0x195/0x2a0
[<0>] do_read_cache_folio+0x135/0x390
[<0>] read_cache_folio+0x16/0x20
[<0>] read_part_sector+0x3e/0xd0
[<0>] read_lba+0xcc/0x1b0
[<0>] efi_partition+0xec/0x7f0
[<0>] bdev_disk_changed+0x1f6/0x650
[<0>] blkdev_get_whole+0x7e/0x90
[<0>] blkdev_get_by_dev+0x19c/0x2e0
[<0>] disk_scan_partitions+0x72/0x100
[<0>] device_add_disk+0x433/0x440
[<0>] nvme_scan_ns+0x636/0xcd0
[<0>] nvme_scan_work+0x26f/0x450
[<0>] process_one_work+0x21c/0x430
[<0>] worker_thread+0x4e/0x3c0
[<0>] kthread+0xfb/0x130
[<0>] ret_from_fork+0x29/0x50

This series tries to fix the issue by making sure the ctrl is unfrozen and the
queues are unquiesced when exiting from error recovery or reset.
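
The rough idea, sketched here only to illustrate the intent (the actual
patches may place and shape this differently): when error recovery or reset
bails out because the ctrl has moved to a deleting state, undo the
freeze/quiesce before returning so that submitters are not left blocked:

	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
		/* state change failure means a ctrl delete is in progress */
		WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
			     ctrl->state != NVME_CTRL_DELETING_NOIO);
		nvme_unquiesce_io_queues(ctrl);	/* run/fail parked requests */
		nvme_unfreeze(ctrl);		/* wake __bio_queue_enter() */
		return;
	}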

Chunguang Xu (4):
  nvme: unfreeze while exit from recovery or resetting
  nvme: donot retry request for NVME_CTRL_DELETING_NOIO
  nvme: optimize nvme_check_ready() for NVME_CTRL_DELETING_NOIO
  nvme-tcp: remove admin_q quiescing from nvme_tcp_teardown_io_queues

 drivers/nvme/host/core.c |  5 ++++-
 drivers/nvme/host/nvme.h |  3 ++-
 drivers/nvme/host/tcp.c  | 25 ++++++++++++++++---------
 3 files changed, 22 insertions(+), 11 deletions(-)

-- 
2.25.1


