public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd
@ 2025-05-05 16:06 Jared Holzman
  2025-05-06  2:13 ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Jared Holzman @ 2025-05-05 16:06 UTC (permalink / raw)
  To: linux-block@vger.kernel.org; +Cc: Ming Lei, Jens Axboe

Hi Ming,

I'm attempting to back port the fix for this issue to the 6.14-stable branch.

Greg Kroah-Hartman has already applied d6aa0c178bf8 - "ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA",
but was unable to apply f40139fde527 cleanly.

I created the patch below. It applies and compiles, but when I rerun the scenario I get several hung tasks waiting on the ub->mutex
which is being held by the following task:

jared@nvme195:~$ sudo cat /proc/230/stack
[<0>] msleep+0x2b/0x50
[<0>] ublk_wait_tagset_rqs_idle+0x3d/0xa0 [ublk_drv]
[<0>] ublk_nosrv_work+0x189/0x1b0 [ublk_drv]
[<0>] process_one_work+0x17b/0x3d0
[<0>] worker_thread+0x2de/0x410
[<0>] kthread+0xfe/0x230
[<0>] ret_from_fork+0x47/0x70
[<0>] ret_from_fork_asm+0x1a/0x30

I presume that 6.14-stable is missing a commit that calls __blk_mq_end_request for the reqs that are not cancelled by ublk_cancel_cmd

Can see a small tweak that would solve this without having to back-port all the ublk changes from 6.15 to 6.14-stable?

Thanks for the help.

Regards,

Jared

---
 drivers/block/ublk_drv.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ab06a7a064fb..7d937168b245 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1595,14 +1595,31 @@ static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq)
 	return !was_canceled;
 }
 
-static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io,
+static void ublk_cancel_cmd(struct ublk_queue *ubq, unsigned tag,
 		unsigned int issue_flags)
 {
+	struct ublk_io *io = &ubq->ios[tag];
+	struct ublk_device *ub = ubq->dev;
+	struct request *req;
 	bool done;
 
 	if (!(io->flags & UBLK_IO_FLAG_ACTIVE))
 		return;
 
+	/*
+	 * Don't try to cancel this command if the request is started for
+	 * avoiding race between io_uring_cmd_done() and
+	 * io_uring_cmd_complete_in_task().
+	 *
+	 * Either the started request will be aborted via __ublk_abort_rq(),
+	 * then this uring_cmd is canceled next time, or it will be done in
+	 * task work function ublk_dispatch_req() because io_uring guarantees
+	 * that ublk_dispatch_req() is always called
+	 */
+	req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
+	if (req && blk_mq_request_started(req))
+		return;
+
 	spin_lock(&ubq->cancel_lock);
 	done = !!(io->flags & UBLK_IO_FLAG_CANCELED);
 	if (!done)
@@ -1625,7 +1642,6 @@ static void ublk_uring_cmd_cancel_fn(struct io_uring_cmd *cmd,
 	struct task_struct *task;
 	struct ublk_device *ub;
 	bool need_schedule;
-	struct ublk_io *io;
 
 	if (WARN_ON_ONCE(!ubq))
 		return;
@@ -1640,9 +1656,8 @@ static void ublk_uring_cmd_cancel_fn(struct io_uring_cmd *cmd,
 	ub = ubq->dev;
 	need_schedule = ublk_abort_requests(ub, ubq);
 
-	io = &ubq->ios[pdu->tag];
-	WARN_ON_ONCE(io->cmd != cmd);
-	ublk_cancel_cmd(ubq, io, issue_flags);
+	WARN_ON_ONCE(ubq->ios[pdu->tag].cmd != cmd);
+	ublk_cancel_cmd(ubq, pdu->tag, issue_flags);
 
 	if (need_schedule) {
 		schedule_work(&ub->nosrv_work);
@@ -1659,7 +1674,7 @@ static void ublk_cancel_queue(struct ublk_queue *ubq)
 	int i;
 
 	for (i = 0; i < ubq->q_depth; i++)
-		ublk_cancel_cmd(ubq, &ubq->ios[i], IO_URING_F_UNLOCKED);
+		ublk_cancel_cmd(ubq, i, IO_URING_F_UNLOCKED);
 }
 
 /* Cancel all pending commands, must be called after del_gendisk() returns */
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd
  2025-05-05 16:06 Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd Jared Holzman
@ 2025-05-06  2:13 ` Ming Lei
  2025-05-06 13:16   ` Jared Holzman
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2025-05-06  2:13 UTC (permalink / raw)
  To: Jared Holzman; +Cc: linux-block@vger.kernel.org, Jens Axboe

On Mon, May 05, 2025 at 07:06:37PM +0300, Jared Holzman wrote:
> Hi Ming,
> 
> I'm attempting to back port the fix for this issue to the 6.14-stable branch.
> 
> Greg Kroah-Hartman has already applied d6aa0c178bf8 - "ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA",
> but was unable to apply f40139fde527 cleanly.
> 
> I created the patch below. It applies and compiles, but when I rerun the scenario I get several hung tasks waiting on the ub->mutex
> which is being held by the following task:

Hi Jared,

You need to pull in the following patchset too:

https://lore.kernel.org/linux-block/20250416035444.99569-1-ming.lei@redhat.com/

which avoids ub->mutex in error handling code path.

I just picked them in the following tree:

https://github.com/ming1/linux/commits/linux-6.14.y/

Please test and see if they work for you.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd
  2025-05-06  2:13 ` Ming Lei
@ 2025-05-06 13:16   ` Jared Holzman
  2025-05-06 13:40     ` Ming Lei
  0 siblings, 1 reply; 5+ messages in thread
From: Jared Holzman @ 2025-05-06 13:16 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block@vger.kernel.org, Jens Axboe

On 06/05/2025 5:13, Ming Lei wrote:
> On Mon, May 05, 2025 at 07:06:37PM +0300, Jared Holzman wrote:
>> Hi Ming,
>>
>> I'm attempting to back port the fix for this issue to the 6.14-stable branch.
>>
>> Greg Kroah-Hartman has already applied d6aa0c178bf8 - "ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA",
>> but was unable to apply f40139fde527 cleanly.
>>
>> I created the patch below. It applies and compiles, but when I rerun the scenario I get several hung tasks waiting on the ub->mutex
>> which is being held by the following task:
> 
> Hi Jared,
> 
> You need to pull in the following patchset too:
> 
> https://lore.kernel.org/linux-block/20250416035444.99569-1-ming.lei@redhat.com/
> 
> which avoids ub->mutex in error handling code path.
> 
> I just picked them in the following tree:
> 
> https://github.com/ming1/linux/commits/linux-6.14.y/
> 
> Please test and see if they work for you.
> 
> 
> Thanks,
> Ming
> 

Hi Ming,

Tested. It works great!

Will you be sending a pull request to Greg or should I send him the patches?

Thanks,

Jared

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd
  2025-05-06 13:16   ` Jared Holzman
@ 2025-05-06 13:40     ` Ming Lei
  2025-05-06 13:57       ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2025-05-06 13:40 UTC (permalink / raw)
  To: Jared Holzman; +Cc: linux-block@vger.kernel.org, Jens Axboe

On Tue, May 06, 2025 at 04:16:25PM +0300, Jared Holzman wrote:
> On 06/05/2025 5:13, Ming Lei wrote:
> > On Mon, May 05, 2025 at 07:06:37PM +0300, Jared Holzman wrote:
> >> Hi Ming,
> >>
> >> I'm attempting to back port the fix for this issue to the 6.14-stable branch.
> >>
> >> Greg Kroah-Hartman has already applied d6aa0c178bf8 - "ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA",
> >> but was unable to apply f40139fde527 cleanly.
> >>
> >> I created the patch below. It applies and compiles, but when I rerun the scenario I get several hung tasks waiting on the ub->mutex
> >> which is being held by the following task:
> > 
> > Hi Jared,
> > 
> > You need to pull in the following patchset too:
> > 
> > https://lore.kernel.org/linux-block/20250416035444.99569-1-ming.lei@redhat.com/
> > 
> > which avoids ub->mutex in error handling code path.
> > 
> > I just picked them in the following tree:
> > 
> > https://github.com/ming1/linux/commits/linux-6.14.y/
> > 
> > Please test and see if they work for you.
> > 
> > 
> > Thanks,
> > Ming
> > 
> 
> Hi Ming,
> 
> Tested. It works great!
> 
> Will you be sending a pull request to Greg or should I send him the patches?

Hi Jared,

Please make a PR and send to Greg since I didn't test & verify it on 6.14-y
tree yet.

And please Cc me, I will give one double review.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd
  2025-05-06 13:40     ` Ming Lei
@ 2025-05-06 13:57       ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2025-05-06 13:57 UTC (permalink / raw)
  To: Ming Lei, Jared Holzman; +Cc: linux-block@vger.kernel.org

On 5/6/25 7:40 AM, Ming Lei wrote:
> On Tue, May 06, 2025 at 04:16:25PM +0300, Jared Holzman wrote:
>> On 06/05/2025 5:13, Ming Lei wrote:
>>> On Mon, May 05, 2025 at 07:06:37PM +0300, Jared Holzman wrote:
>>>> Hi Ming,
>>>>
>>>> I'm attempting to back port the fix for this issue to the 6.14-stable branch.
>>>>
>>>> Greg Kroah-Hartman has already applied d6aa0c178bf8 - "ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA",
>>>> but was unable to apply f40139fde527 cleanly.
>>>>
>>>> I created the patch below. It applies and compiles, but when I rerun the scenario I get several hung tasks waiting on the ub->mutex
>>>> which is being held by the following task:
>>>
>>> Hi Jared,
>>>
>>> You need to pull in the following patchset too:
>>>
>>> https://lore.kernel.org/linux-block/20250416035444.99569-1-ming.lei@redhat.com/
>>>
>>> which avoids ub->mutex in error handling code path.
>>>
>>> I just picked them in the following tree:
>>>
>>> https://github.com/ming1/linux/commits/linux-6.14.y/
>>>
>>> Please test and see if they work for you.
>>>
>>>
>>> Thanks,
>>> Ming
>>>
>>
>> Hi Ming,
>>
>> Tested. It works great!
>>
>> Will you be sending a pull request to Greg or should I send him the patches?
> 
> Hi Jared,
> 
> Please make a PR and send to Greg since I didn't test & verify it on 6.14-y
> tree yet.
> 
> And please Cc me, I will give one double review.

Don't do a PR for stable, send a patch series instead.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-05-06 13:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-05 16:06 Port to 6.14-stable - ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd Jared Holzman
2025-05-06  2:13 ` Ming Lei
2025-05-06 13:16   ` Jared Holzman
2025-05-06 13:40     ` Ming Lei
2025-05-06 13:57       ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox