From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org,
Caleb Sander Mateos <csander@purestorage.com>,
Uday Shankar <ushankar@purestorage.com>,
Ming Lei <ming.lei@redhat.com>
Subject: Re: [PATCH V6 14/24] ublk: fix batch I/O recovery -ENODEV error
Date: Thu, 22 Jan 2026 20:33:39 -0700
Message-ID: <716ab266-d2fe-45ff-8ec5-10c4d9a1ebcc@kernel.dk>
In-Reply-To: <20260116141859.719929-15-ming.lei@redhat.com>

Claude says:

ublk: fix batch I/O recovery -ENODEV error

This commit changes the recovery flow to track per-queue readiness and
clear ubq->canceling when individual queues become ready, rather than
waiting for all queues. This fixes UBLK_U_IO_FETCH_IO_CMDS failing with
-ENODEV during batch I/O recovery.

> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index e6e746cd369e..346596c1e319 100644
[ ... ]
> @@ -2791,37 +2797,52 @@ static void ublk_stop_dev(struct ublk_device *ub)
>  	ublk_cancel_dev(ub);
>  }
>
> -/* reset ublk io_uring queue & io flags */
> -static void ublk_reset_io_flags(struct ublk_device *ub)
> +/* reset per-queue io flags */
> +static void ublk_queue_reset_io_flags(struct ublk_queue *ubq)
>  {
> -	int i, j;
> +	int j;
>
> -	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> -		struct ublk_queue *ubq = ublk_get_queue(ub, i);
> -
> -		/* UBLK_IO_FLAG_CANCELED can be cleared now */
> -		spin_lock(&ubq->cancel_lock);
> -		for (j = 0; j < ubq->q_depth; j++)
> -			ubq->ios[j].flags &= ~UBLK_IO_FLAG_CANCELED;
> -		spin_unlock(&ubq->cancel_lock);
> -		ubq->fail_io = false;
> -	}
> -	mutex_lock(&ub->cancel_mutex);
> -	ublk_set_canceling(ub, false);
> -	mutex_unlock(&ub->cancel_mutex);
> +	/* UBLK_IO_FLAG_CANCELED can be cleared now */
> +	spin_lock(&ubq->cancel_lock);
> +	for (j = 0; j < ubq->q_depth; j++)
> +		ubq->ios[j].flags &= ~UBLK_IO_FLAG_CANCELED;
> +	spin_unlock(&ubq->cancel_lock);
> +	ubq->fail_io = false;
> +	ubq->canceling = false;
                ^^^^

The comment on ublk_set_canceling() says "Do not set these flags directly"
and requires cancel_mutex to be held. Is it intentional to set
ubq->canceling directly here without the mutex? The old code used
ublk_set_canceling() with proper locking.

[ ... ]

> @@ -3411,7 +3432,14 @@ static int ublk_batch_unprep_io(struct ublk_queue *ubq,
>  {
>  	struct ublk_io *io = &ubq->ios[elem->tag];
>
> -	data->ub->nr_io_ready--;
> +	/*
> +	 * If queue was ready before this decrement, it won't be anymore,
> +	 * so we need to decrement the queue ready count too.
> +	 */
> +	if (ublk_queue_ready(ubq))
> +		data->ub->nr_queue_ready--;
> +	ubq->nr_io_ready--;
> +
>  	ublk_io_lock(io);
>  	io->flags = 0;
>  	ublk_io_unlock(io);

When ublk_batch_unprep_io() is used for rollback after a batch prep
failure, the counters are correctly decremented, but if the queue
became ready during prep (triggering ublk_queue_reset_io_flags() which
set ubq->canceling = false), the canceling flag is not restored.

Consider this scenario during recovery:

  1. Batch prep command with tags 0,1,2,3 (valid) and tag 5 (invalid)
  2. Tags 0-3 prep succeeds, queue becomes ready, canceling set to false
  3. Tag 5 validation fails (tag >= q_depth)
  4. Rollback via ublk_batch_revert_prep_cmd() calls ublk_batch_unprep_io()

After rollback: nr_io_ready=0, nr_queue_ready=0, but canceling=false.

With canceling incorrectly false, new requests in __ublk_queue_rq_common()
would not be aborted and could reach ublk_queue_cmd() which accesses
ubq->ios[tag].cmd - but the IO slots were cleared by rollback.

Should ublk_batch_unprep_io() restore ubq->canceling = true when the
queue transitions from ready to not-ready?
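
The scenario above can be reproduced with a small userspace model. This is
only a sketch, not driver code: the model_* names and Q_DEPTH are
hypothetical, the queue is assumed to count as ready when nr_io_ready
equals the queue depth, and all locking (cancel_lock, cancel_mutex) is
left out. The restore_canceling argument switches on the behavior asked
about in the question above.

```c
#include <assert.h>
#include <stdbool.h>

#define Q_DEPTH 4	/* hypothetical queue depth for this model */

/* Userspace stand-ins for the fields discussed in the review. */
struct model_queue {
	int nr_io_ready;
	bool canceling;
};

struct model_dev {
	int nr_queue_ready;
	struct model_queue q;
};

/* Assumption: a queue is ready once every tag has been prepped. */
static bool model_queue_ready(const struct model_queue *q)
{
	return q->nr_io_ready == Q_DEPTH;
}

/* Prep one tag; clears canceling when the queue becomes ready. */
static int model_prep_io(struct model_dev *d, int tag)
{
	if (tag < 0 || tag >= Q_DEPTH)
		return -1;
	d->q.nr_io_ready++;
	if (model_queue_ready(&d->q)) {
		d->nr_queue_ready++;
		d->q.canceling = false;
	}
	return 0;
}

/* Roll back one prepped tag, mirroring the counter logic in the patch. */
static void model_unprep_io(struct model_dev *d, bool restore_canceling)
{
	if (model_queue_ready(&d->q)) {
		d->nr_queue_ready--;
		/* Suggested addition: queue goes ready -> not ready. */
		if (restore_canceling)
			d->q.canceling = true;
	}
	d->q.nr_io_ready--;
}

/*
 * Prep tags 0,1,2,3,5 during recovery (canceling starts true), fail on
 * tag 5, roll back, and return the final value of canceling.
 */
static bool model_failed_batch(bool restore_canceling)
{
	static const int tags[] = { 0, 1, 2, 3, 5 };
	struct model_dev d = {
		.nr_queue_ready = 0,
		.q = { .nr_io_ready = 0, .canceling = true },
	};
	int i, done = 0;

	for (i = 0; i < 5; i++) {
		if (model_prep_io(&d, tags[i]) < 0)
			break;
		done++;
	}
	assert(done == 4);	/* tags 0-3 succeeded, tag 5 was rejected */

	while (done-- > 0)
		model_unprep_io(&d, restore_canceling);

	/* Counters are back to zero either way. */
	assert(d.q.nr_io_ready == 0 && d.nr_queue_ready == 0);
	return d.q.canceling;
}
```

In this model, model_failed_batch(false) returns false, matching the
"canceling=false after rollback" state described above, while
model_failed_batch(true) returns true. Whether the real fix can set the
flag at that point without cancel_mutex is a separate question.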
--
Jens Axboe