From: Jens Axboe <axboe@kernel.dk>
To: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org,
	Caleb Sander Mateos <csander@purestorage.com>,
	Uday Shankar <ushankar@purestorage.com>,
	Ming Lei <ming.lei@redhat.com>
Subject: Re: [PATCH V6 14/24] ublk: fix batch I/O recovery -ENODEV error
Date: Thu, 22 Jan 2026 20:33:39 -0700
Message-ID: <716ab266-d2fe-45ff-8ec5-10c4d9a1ebcc@kernel.dk>
In-Reply-To: <20260116141859.719929-15-ming.lei@redhat.com>

Claude says:

ublk: fix batch I/O recovery -ENODEV error

This commit changes the recovery flow to track per-queue readiness and
clear ubq->canceling when individual queues become ready, rather than
waiting for all queues. This fixes UBLK_U_IO_FETCH_IO_CMDS failing with
-ENODEV during batch I/O recovery.

> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index e6e746cd369e..346596c1e319 100644

[ ... ]

> @@ -2791,37 +2797,52 @@ static void ublk_stop_dev(struct ublk_device *ub)
>  	ublk_cancel_dev(ub);
>  }
>
> -/* reset ublk io_uring queue & io flags */
> -static void ublk_reset_io_flags(struct ublk_device *ub)
> +/* reset per-queue io flags */
> +static void ublk_queue_reset_io_flags(struct ublk_queue *ubq)
>  {
> -	int i, j;
> +	int j;
>
> -	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
> -		struct ublk_queue *ubq = ublk_get_queue(ub, i);
> -
> -		/* UBLK_IO_FLAG_CANCELED can be cleared now */
> -		spin_lock(&ubq->cancel_lock);
> -		for (j = 0; j < ubq->q_depth; j++)
> -			ubq->ios[j].flags &= ~UBLK_IO_FLAG_CANCELED;
> -		spin_unlock(&ubq->cancel_lock);
> -		ubq->fail_io = false;
> -	}
> -	mutex_lock(&ub->cancel_mutex);
> -	ublk_set_canceling(ub, false);
> -	mutex_unlock(&ub->cancel_mutex);
> +	/* UBLK_IO_FLAG_CANCELED can be cleared now */
> +	spin_lock(&ubq->cancel_lock);
> +	for (j = 0; j < ubq->q_depth; j++)
> +		ubq->ios[j].flags &= ~UBLK_IO_FLAG_CANCELED;
> +	spin_unlock(&ubq->cancel_lock);
> +	ubq->fail_io = false;
> +	ubq->canceling = false;
                         ^^^^

The comment on ublk_set_canceling() says "Do not set these flags directly"
and requires cancel_mutex to be held. Is it intentional to set
ubq->canceling directly here without the mutex? The old code used
ublk_set_canceling() with proper locking.
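
To make the locking question concrete, here is a small userspace model
(not kernel code) of routing the per-queue write through a setter that
checks the lock. Everything prefixed lm_ is hypothetical: the bool
cancel_mutex_held stands in for ub->cancel_mutex, and the assert() plays
the role that lockdep_assert_held() would in the driver. Whether a mutex
may actually be taken at the real call site is not something this sketch
establishes.

```c
#include <assert.h>
#include <stdbool.h>

#define LM_Q_DEPTH 4	/* assumed queue depth for the model */

/* Userspace stand-in for struct ublk_queue; only the fields discussed. */
struct lm_queue {
	bool canceling;
	bool fail_io;
	bool io_canceled[LM_Q_DEPTH];	/* models UBLK_IO_FLAG_CANCELED */
};

/* Userspace stand-in for struct ublk_device. */
struct lm_device {
	bool cancel_mutex_held;		/* models ub->cancel_mutex */
	struct lm_queue queue;
};

static void lm_cancel_lock(struct lm_device *ub)
{
	assert(!ub->cancel_mutex_held);
	ub->cancel_mutex_held = true;
}

static void lm_cancel_unlock(struct lm_device *ub)
{
	assert(ub->cancel_mutex_held);
	ub->cancel_mutex_held = false;
}

/* Hypothetical per-queue setter: the only place that writes ->canceling. */
static void lm_queue_set_canceling(struct lm_device *ub, bool val)
{
	assert(ub->cancel_mutex_held);	/* caller must hold the lock */
	ub->queue.canceling = val;
}

/* Per-queue reset that goes through the setter instead of a bare store. */
static void lm_queue_reset_io_flags(struct lm_device *ub)
{
	int j;

	for (j = 0; j < LM_Q_DEPTH; j++)
		ub->queue.io_canceled[j] = false;
	ub->queue.fail_io = false;

	lm_cancel_lock(ub);
	lm_queue_set_canceling(ub, false);
	lm_cancel_unlock(ub);
}

/* Returns true if the reset cleared both flags and released the lock. */
bool lm_reset_demo(void)
{
	struct lm_device ub = {
		.cancel_mutex_held = false,
		.queue = { .canceling = true, .fail_io = true },
	};

	lm_queue_reset_io_flags(&ub);
	return !ub.queue.canceling && !ub.queue.fail_io &&
	       !ub.cancel_mutex_held;
}
```

Calling lm_queue_set_canceling() without lm_cancel_lock() first trips the
assert, which is the property the "Do not set these flags directly"
comment appears to be protecting.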

[ ... ]

> @@ -3411,7 +3432,14 @@ static int ublk_batch_unprep_io(struct ublk_queue *ubq,
>  {
>  	struct ublk_io *io = &ubq->ios[elem->tag];
>
> -	data->ub->nr_io_ready--;
> +	/*
> +	 * If queue was ready before this decrement, it won't be anymore,
> +	 * so we need to decrement the queue ready count too.
> +	 */
> +	if (ublk_queue_ready(ubq))
> +		data->ub->nr_queue_ready--;
> +	ubq->nr_io_ready--;
> +
>  	ublk_io_lock(io);
>  	io->flags = 0;
>  	ublk_io_unlock(io);

When ublk_batch_unprep_io() is used for rollback after a batch prep
failure, the counters are correctly decremented, but if the queue
became ready during prep (triggering ublk_queue_reset_io_flags() which
set ubq->canceling = false), the canceling flag is not restored.

Consider this scenario during recovery:
1. A batch prep command carries tags 0,1,2,3 (valid) and tag 5 (invalid)
2. Prep succeeds for tags 0-3, the queue becomes ready, and canceling is
   set to false
3. Validation of tag 5 fails (tag >= q_depth)
4. Rollback via ublk_batch_revert_prep_cmd() calls ublk_batch_unprep_io()

After rollback: nr_io_ready=0, nr_queue_ready=0, but canceling=false.

With canceling incorrectly false, new requests in __ublk_queue_rq_common()
would not be aborted and could reach ublk_queue_cmd() which accesses
ubq->ios[tag].cmd - but the IO slots were cleared by rollback.

Should ublk_batch_unprep_io() restore ubq->canceling = true when the
queue transitions from ready to not-ready?
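
The scenario can be replayed in a small userspace model (not kernel
code). All rb_ names are hypothetical, a q_depth of 4 is assumed so that
tags 0-3 make the queue ready, and locking is left out entirely. The
restore parameter switches between the behaviour of the patch as posted
and the suggested restore on the ready to not-ready transition.

```c
#include <assert.h>
#include <stdbool.h>

#define RB_Q_DEPTH 4	/* assumed queue depth for the model */

/* Userspace stand-in for struct ublk_queue. */
struct rb_queue {
	unsigned int nr_io_ready;
	bool canceling;
	bool io_prepped[RB_Q_DEPTH];
};

/* Userspace stand-in for struct ublk_device. */
struct rb_device {
	unsigned int nr_queue_ready;
	struct rb_queue queue;
};

static bool rb_queue_ready(const struct rb_queue *ubq)
{
	return ubq->nr_io_ready == RB_Q_DEPTH;
}

/* Prep one tag; returns -1 when the tag is out of range (tag >= q_depth). */
static int rb_prep_io(struct rb_device *ub, unsigned int tag)
{
	struct rb_queue *ubq = &ub->queue;

	if (tag >= RB_Q_DEPTH)
		return -1;
	ubq->io_prepped[tag] = true;
	ubq->nr_io_ready++;
	if (rb_queue_ready(ubq)) {
		ub->nr_queue_ready++;
		ubq->canceling = false;	/* effect of the per-queue reset */
	}
	return 0;
}

/* Undo one prepped tag; optionally restore canceling on ready -> not-ready. */
static void rb_unprep_io(struct rb_device *ub, unsigned int tag, bool restore)
{
	struct rb_queue *ubq = &ub->queue;

	if (rb_queue_ready(ubq)) {
		ub->nr_queue_ready--;
		if (restore)
			ubq->canceling = true;
	}
	ubq->nr_io_ready--;
	ubq->io_prepped[tag] = false;
}

/* Runs the batch {0,1,2,3,5} and returns canceling after the rollback. */
bool rb_canceling_after_failed_batch(bool restore)
{
	static const unsigned int tags[] = { 0, 1, 2, 3, 5 };
	const unsigned int n = sizeof(tags) / sizeof(tags[0]);
	struct rb_device ub = { .queue = { .canceling = true } };
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (rb_prep_io(&ub, tags[i]) < 0)
			break;
	}
	if (i < n) {
		/* Roll back every tag that was prepped before the failure. */
		while (i-- > 0)
			rb_unprep_io(&ub, tags[i], restore);
	}
	assert(ub.queue.nr_io_ready == 0 && ub.nr_queue_ready == 0);
	return ub.queue.canceling;
}
```

In this model, rb_canceling_after_failed_batch(false) leaves canceling
false with both counters back at zero, while passing true leaves it set
again after the rollback.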

-- 
Jens Axboe
