From: Ming Lei <ming.lei@redhat.com>
To: huang-jl <huang-jl@deepseek.com>
Cc: linux-block@vger.kernel.org
Subject: Re: [BUG] ublk: ublk server hangs in D state during STOP_DEV
Date: Sat, 17 Jan 2026 15:44:22 +0800
Message-ID: <aWs91n3yzPX9mZaV@fedora>
In-Reply-To: <20260116171613.46312-1-huang-jl@deepseek.com>

On Sat, Jan 17, 2026 at 01:16:13AM +0800, huang-jl wrote:
> > I'd understand why ublk server is stuck in io_wq_put_and_exit() first, so
> > far it is very likely caused by your ublk target logic...
> 
> I think the io-wq worker is stuck executing the STOP_DEV uring cmd,
> and it is not our target I/O logic that causes the issue. Let me explain:
> 
> Looking at the iou-wrk thread (348911) stack trace, this iou-wrk is a thread
> in my D-state ublk server, its stack is as follows:
> 
> $ cat /proc/348910/task/348911/stack 
> [<0>] folio_wait_bit_common+0x136/0x330
> [<0>] __folio_lock+0x17/0x30
> [<0>] write_cache_pages+0x1cd/0x430
> [<0>] blkdev_writepages+0x6f/0xb0
> [<0>] do_writepages+0xcd/0x1f0
> [<0>] filemap_fdatawrite_wbc+0x75/0xb0
> [<0>] __filemap_fdatawrite_range+0x58/0x80
> [<0>] filemap_write_and_wait_range+0x59/0xc0
> [<0>] bdev_mark_dead+0x85/0xd0
> [<0>] blk_report_disk_dead+0x87/0xf0
> [<0>] del_gendisk+0x37f/0x3b0
> [<0>] ublk_stop_dev+0x89/0x100 [ublk_drv]
> [<0>] ublk_ctrl_uring_cmd+0x51a/0x750 [ublk_drv]
> [<0>] io_uring_cmd+0x9f/0x140
> [<0>] io_issue_sqe+0x193/0x410
> [<0>] io_wq_submit_work+0xe2/0x380
> [<0>] io_worker_handle_work+0xdf/0x340
> [<0>] io_wq_worker+0xf9/0x350
> [<0>] ret_from_fork+0x44/0x70
> [<0>] ret_from_fork_asm+0x1b/0x30
> 
> This shows:
> 
> - The STOP_DEV command is being executed by an io-wq worker thread
> - ublk_stop_dev() called del_gendisk()
> - del_gendisk() is trying to flush dirty pages via bdev_mark_dead()
> - The writeback is stuck waiting for a folio lock

> - Upon receiving SIGINT, our ublk server will send UBLK_U_CMD_STOP_DEV to the
>  driver.

Can you share how your server sends STOP_DEV when receiving SIGINT?

If sending it prevents normal IO command handling, ublk_stop_dev() will
deadlock.

For example, the preferred IO handling in a ublk server looks like this:

prepare UBLK_IO_FETCH_REQ uring_cmds;
while (1) {
	io_uring_enter(submission & wait event);
}

If you send the STOP_DEV command from inside the above loop, you will get
a deadlock, because inflight and new IOs can no longer be handled.

So you should send the STOP_DEV command from the signal handler or from
another pthread to avoid the issue.

> But I do not understand why it gets stuck waiting for the folio lock.

It just shows that normal ublk block IOs can't be completed.

> 
> I traced the code path and understand why STOP_DEV runs in io-wq:
> 
> 1. The ublk server calls io_uring_enter() to submit the STOP_DEV uring cmd.
> 2. The kernel will call io_submit_sqes() -> io_submit_sqe() -> io_queue_sqe().
> 3. io_queue_sqe() first tries io_issue_sqe() with IO_URING_F_NONBLOCK
> 4. ublk_ctrl_uring_cmd() returns -EAGAIN when it sees IO_URING_F_NONBLOCK
> 5. io_uring then queues the work to io-wq via io_queue_iowq()
> 
> > If your system supports drgn and it is still ready to collect log, it
> > should be pretty easy to figure out the reason by writing one drgn script
> > to dump ublk queue/ublk io of driver.
> 
> The D-state process is still present on the system. I can install drgn and
> collect information.
> Could you tell me what specific data would be most helpful? For example:
> 
> - ublk_device state and flags?
> - ublk_queue state for each queue (force_abort, nr_io_ready, etc.)?
> - Individual ublk_io flags for inflight I/Os?

Yes, all of the above info is helpful.

 
Thanks,
Ming

