* [PATCH] ublk: remove io_cmds list in ublk_queue
@ 2025-03-18 18:14 Uday Shankar
2025-03-18 18:22 ` Jens Axboe
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Uday Shankar @ 2025-03-18 18:14 UTC (permalink / raw)
To: Ming Lei, Jens Axboe; +Cc: linux-block, Uday Shankar
The current I/O dispatch mechanism - queueing I/O by adding it to the
io_cmds list (and poking task_work as needed), then dispatching it in
ublk server task context by reversing io_cmds and completing the
io_uring command associated to each one - was introduced by commit
7d4a93176e014 ("ublk_drv: don't forward io commands in reserve order")
to ensure that the ublk server received I/O in the same order that the
block layer submitted it to ublk_drv. This mechanism was only needed for
the "raw" task_work submission mechanism, since the io_uring task work
wrapper maintains FIFO ordering (using quite a similar mechanism in
fact). The "raw" task_work submission mechanism is no longer supported
in ublk_drv as of commit 29dc5d06613f2 ("ublk: kill queuing request by
task_work_add"), so the explicit llist/reversal is no longer needed - it
just duplicates logic already present in the underlying io_uring APIs.
Remove it.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
---
drivers/block/ublk_drv.c | 46 +++++++++++-----------------------------------
1 file changed, 11 insertions(+), 35 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2955900ee713c5d8f3cbc2a69f6f6058348e5253..82c9d3d22f0ea5a0fad3f33837fa16146b5af7a9 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -77,8 +77,6 @@
UBLK_PARAM_TYPE_DMA_ALIGN)
struct ublk_rq_data {
- struct llist_node node;
-
struct kref ref;
};
@@ -145,8 +143,6 @@ struct ublk_queue {
struct task_struct *ubq_daemon;
char *io_cmd_buf;
- struct llist_head io_cmds;
-
unsigned long io_addr; /* mapped vm address */
unsigned int max_io_sz;
bool force_abort;
@@ -1113,7 +1109,7 @@ static void ublk_complete_rq(struct kref *ref)
}
/*
- * Since __ublk_rq_task_work always fails requests immediately during
+ * Since ublk_rq_task_work_cb always fails requests immediately during
* exiting, __ublk_fail_req() is only called from abort context during
* exiting. So lock is unnecessary.
*
@@ -1159,11 +1155,14 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq,
blk_mq_end_request(rq, BLK_STS_IOERR);
}
-static inline void __ublk_rq_task_work(struct request *req,
- unsigned issue_flags)
+static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd,
+ unsigned int issue_flags)
{
- struct ublk_queue *ubq = req->mq_hctx->driver_data;
- int tag = req->tag;
+ struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
+ struct ublk_queue *ubq = pdu->ubq;
+ int tag = pdu->tag;
+ struct request *req = blk_mq_tag_to_rq(
+ ubq->dev->tag_set.tags[ubq->q_id], tag);
struct ublk_io *io = &ubq->ios[tag];
unsigned int mapped_bytes;
@@ -1238,34 +1237,11 @@ static inline void __ublk_rq_task_work(struct request *req,
ubq_complete_io_cmd(io, UBLK_IO_RES_OK, issue_flags);
}
-static inline void ublk_forward_io_cmds(struct ublk_queue *ubq,
- unsigned issue_flags)
-{
- struct llist_node *io_cmds = llist_del_all(&ubq->io_cmds);
- struct ublk_rq_data *data, *tmp;
-
- io_cmds = llist_reverse_order(io_cmds);
- llist_for_each_entry_safe(data, tmp, io_cmds, node)
- __ublk_rq_task_work(blk_mq_rq_from_pdu(data), issue_flags);
-}
-
-static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
-{
- struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
- struct ublk_queue *ubq = pdu->ubq;
-
- ublk_forward_io_cmds(ubq, issue_flags);
-}
-
static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
{
- struct ublk_rq_data *data = blk_mq_rq_to_pdu(rq);
+ struct ublk_io *io = &ubq->ios[rq->tag];
- if (llist_add(&data->node, &ubq->io_cmds)) {
- struct ublk_io *io = &ubq->ios[rq->tag];
-
- io_uring_cmd_complete_in_task(io->cmd, ublk_rq_task_work_cb);
- }
+ io_uring_cmd_complete_in_task(io->cmd, ublk_rq_task_work_cb);
}
static enum blk_eh_timer_return ublk_timeout(struct request *rq)
@@ -1458,7 +1434,7 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
struct request *rq;
/*
- * Either we fail the request or ublk_rq_task_work_fn
+ * Either we fail the request or ublk_rq_task_work_cb
* will do it
*/
rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
---
base-commit: 37f0c10318273bec83f373bbdad14fdc4e8ce79e
change-id: 20250318-ublk_io_cmds-a421653cefc2
Best regards,
--
Uday Shankar <ushankar@purestorage.com>
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:14 [PATCH] ublk: remove io_cmds list in ublk_queue Uday Shankar
@ 2025-03-18 18:22 ` Jens Axboe
2025-03-18 18:43 ` Uday Shankar
2025-03-19 2:14 ` Ming Lei
2025-03-19 12:32 ` Jens Axboe
2 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2025-03-18 18:22 UTC (permalink / raw)
To: Uday Shankar, Ming Lei; +Cc: linux-block
On 3/18/25 12:14 PM, Uday Shankar wrote:
> The current I/O dispatch mechanism - queueing I/O by adding it to the
> io_cmds list (and poking task_work as needed), then dispatching it in
> ublk server task context by reversing io_cmds and completing the
> io_uring command associated to each one - was introduced by commit
> 7d4a93176e014 ("ublk_drv: don't forward io commands in reserve order")
> to ensure that the ublk server received I/O in the same order that the
> block layer submitted it to ublk_drv. This mechanism was only needed for
> the "raw" task_work submission mechanism, since the io_uring task work
> wrapper maintains FIFO ordering (using quite a similar mechanism in
> fact). The "raw" task_work submission mechanism is no longer supported
> in ublk_drv as of commit 29dc5d06613f2 ("ublk: kill queuing request by
> task_work_add"), so the explicit llist/reversal is no longer needed - it
> just duplicates logic already present in the underlying io_uring APIs.
> Remove it.
Patch looks good to me, just one followup:
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 2955900ee713c5d8f3cbc2a69f6f6058348e5253..82c9d3d22f0ea5a0fad3f33837fa16146b5af7a9 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -77,8 +77,6 @@
> UBLK_PARAM_TYPE_DMA_ALIGN)
>
> struct ublk_rq_data {
> - struct llist_node node;
> -
> struct kref ref;
> };
Can we get rid of ublk_rq_data then? If it's just a ref thing, I'm sure
we can find an atomic_t of space in struct request and avoid it. Not a
pressing thing, just tossing it out there...
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:22 ` Jens Axboe
@ 2025-03-18 18:43 ` Uday Shankar
2025-03-18 18:48 ` Jens Axboe
0 siblings, 1 reply; 10+ messages in thread
From: Uday Shankar @ 2025-03-18 18:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, linux-block
On Tue, Mar 18, 2025 at 12:22:57PM -0600, Jens Axboe wrote:
> > struct ublk_rq_data {
> > - struct llist_node node;
> > -
> > struct kref ref;
> > };
>
> Can we get rid of ublk_rq_data then? If it's just a ref thing, I'm sure
> we can find an atomic_t of space in struct request and avoid it. Not a
> pressing thing, just tossing it out there...
Yeah probably - we do need a ref since one could complete a request
concurrently with another code path which references it (user copy and
zero copy). I see that struct request has a refcount in it already,
though I don't see any examples of drivers using it. Would it be a bad
idea to try and reuse that?
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:43 ` Uday Shankar
@ 2025-03-18 18:48 ` Jens Axboe
2025-03-18 21:58 ` Uday Shankar
2025-03-19 1:04 ` Ming Lei
0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2025-03-18 18:48 UTC (permalink / raw)
To: Uday Shankar; +Cc: Ming Lei, linux-block
On 3/18/25 12:43 PM, Uday Shankar wrote:
> On Tue, Mar 18, 2025 at 12:22:57PM -0600, Jens Axboe wrote:
>>> struct ublk_rq_data {
>>> - struct llist_node node;
>>> -
>>> struct kref ref;
>>> };
>>
>> Can we get rid of ublk_rq_data then? If it's just a ref thing, I'm sure
>> we can find an atomic_t of space in struct request and avoid it. Not a
>> pressing thing, just tossing it out there...
>
> Yeah probably - we do need a ref since one could complete a request
> concurrently with another code path which references it (user copy and
> zero copy). I see that struct request has a refcount in it already,
Right, at least with the current usage, we still do need that kref, or
something similar. I would've probably made it just use refcount_t
though, rather than rely on the indirect calls. kref doesn't really
bring us anything here in terms of API.
> though I don't see any examples of drivers using it. Would it be a bad
> idea to try and reuse that?
We can't reuse that one, and it's not for driver use - purely internal.
But I _think_ you could easily grab space in the union that has the hash
and ipi_list for it. And then you could dump needing this extra data per
request.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:48 ` Jens Axboe
@ 2025-03-18 21:58 ` Uday Shankar
2025-03-19 1:54 ` Jens Axboe
2025-03-19 1:04 ` Ming Lei
1 sibling, 1 reply; 10+ messages in thread
From: Uday Shankar @ 2025-03-18 21:58 UTC (permalink / raw)
To: Jens Axboe; +Cc: Ming Lei, linux-block
On Tue, Mar 18, 2025 at 12:48:44PM -0600, Jens Axboe wrote:
> > though I don't see any examples of drivers using it. Would it be a bad
> > idea to try and reuse that?
>
> We can't reuse that one, and it's not for driver use - purely internal.
> But I _think_ you could easily grab space in the union that has the hash
> and ipi_list for it. And then you could dump needing this extra data per
> request.
Another idea is to union the refcount with end_io_data, since that's
purely for the driver. But it might be a bit weird to tell drivers that
they can use either end_io/end_io_data or the refcount but not both...
not to mention it could be a nice exploit vector if drivers get it
wrong.
Either way, I think this work is follow-on material and shouldn't be
rolled into this patch.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 21:58 ` Uday Shankar
@ 2025-03-19 1:54 ` Jens Axboe
0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2025-03-19 1:54 UTC (permalink / raw)
To: Uday Shankar; +Cc: Ming Lei, linux-block
On 3/18/25 3:58 PM, Uday Shankar wrote:
> On Tue, Mar 18, 2025 at 12:48:44PM -0600, Jens Axboe wrote:
>>> though I don't see any examples of drivers using it. Would it be a bad
>>> idea to try and reuse that?
>>
>> We can't reuse that one, and it's not for driver use - purely internal.
>> But I _think_ you could easily grab space in the union that has the hash
>> and ipi_list for it. And then you could dump needing this extra data per
>> request.
>
> Another idea is to union the refcount with end_io_data, since that's
> purely for the driver. But it might be a bit weird to tell drivers that
> they can use either end_io/end_io_data or the refcount but not both...
> not to mention it could be a nice exploit vector if drivers get it
> wrong.
>
> Either way, I think this work is follow-on material and shouldn't be
> rolled into this patch.
For sure, that's what I said in my first reply too. It's just an idea
for a further improvement, not gating this patch at all.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:48 ` Jens Axboe
2025-03-18 21:58 ` Uday Shankar
@ 2025-03-19 1:04 ` Ming Lei
2025-03-19 1:57 ` Jens Axboe
1 sibling, 1 reply; 10+ messages in thread
From: Ming Lei @ 2025-03-19 1:04 UTC (permalink / raw)
To: Jens Axboe; +Cc: Uday Shankar, linux-block
On Tue, Mar 18, 2025 at 12:48:44PM -0600, Jens Axboe wrote:
> On 3/18/25 12:43 PM, Uday Shankar wrote:
> > On Tue, Mar 18, 2025 at 12:22:57PM -0600, Jens Axboe wrote:
> >>> struct ublk_rq_data {
> >>> - struct llist_node node;
> >>> -
> >>> struct kref ref;
> >>> };
> >>
> >> Can we get rid of ublk_rq_data then? If it's just a ref thing, I'm sure
> >> we can find an atomic_t of space in struct request and avoid it. Not a
> >> pressing thing, just tossing it out there...
> >
> > Yeah probably - we do need a ref since one could complete a request
> > concurrently with another code path which references it (user copy and
> > zero copy). I see that struct request has a refcount in it already,
>
> Right, at least with the current usage, we still do need that kref, or
> something similar. I would've probably made it just use refcount_t
> though, rather than rely on the indirect calls. kref doesn't really
> bring us anything here in terms of API.
>
> > though I don't see any examples of drivers using it. Would it be a bad
> > idea to try and reuse that?
>
> We can't reuse that one, and it's not for driver use - purely internal.
> But I _think_ you could easily grab space in the union that has the hash
> and ipi_list for it. And then you could dump needing this extra data per
> request.
It should be fine to reuse request->ref, since the payload shares same
lifetime with request.
But if it is exported, the interface is likely to be misused...
thanks,
Ming
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-19 1:04 ` Ming Lei
@ 2025-03-19 1:57 ` Jens Axboe
0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2025-03-19 1:57 UTC (permalink / raw)
To: Ming Lei; +Cc: Uday Shankar, linux-block
On 3/18/25 7:04 PM, Ming Lei wrote:
> On Tue, Mar 18, 2025 at 12:48:44PM -0600, Jens Axboe wrote:
>> On 3/18/25 12:43 PM, Uday Shankar wrote:
>>> On Tue, Mar 18, 2025 at 12:22:57PM -0600, Jens Axboe wrote:
>>>>> struct ublk_rq_data {
>>>>> - struct llist_node node;
>>>>> -
>>>>> struct kref ref;
>>>>> };
>>>>
>>>> Can we get rid of ublk_rq_data then? If it's just a ref thing, I'm sure
>>>> we can find an atomic_t of space in struct request and avoid it. Not a
>>>> pressing thing, just tossing it out there...
>>>
>>> Yeah probably - we do need a ref since one could complete a request
>>> concurrently with another code path which references it (user copy and
>>> zero copy). I see that struct request has a refcount in it already,
>>
>> Right, at least with the current usage, we still do need that kref, or
>> something similar. I would've probably made it just use refcount_t
>> though, rather than rely on the indirect calls. kref doesn't really
>> bring us anything here in terms of API.
>>
>>> though I don't see any examples of drivers using it. Would it be a bad
>>> idea to try and reuse that?
>>
>> We can't reuse that one, and it's not for driver use - purely internal.
>> But I _think_ you could easily grab space in the union that has the hash
>> and ipi_list for it. And then you could dump needing this extra data per
>> request.
>
> It should be fine to reuse request->ref, since the payload shares same
> lifetime with request.
>
> But if it is exported, the interface is likely to be misused...
Exactly, that's why I don't want to have drivers mess with this.
And following up on the location, as I forgot to do that in the reply to
Uday - the end_io_data is not a bad idea either, drivers get to use that
as they wish. Then you can't use it with an end_io handler, but it's not
like we've had a need to do that before anyway... It is a bit odd,
however.
But the ipi_list is only used completion side, at which point the driver
is by definition done with the ref. And hash is just for io scheduling,
which we'd never do with requests like this. So still think that's the
best option.
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:14 [PATCH] ublk: remove io_cmds list in ublk_queue Uday Shankar
2025-03-18 18:22 ` Jens Axboe
@ 2025-03-19 2:14 ` Ming Lei
2025-03-19 12:32 ` Jens Axboe
2 siblings, 0 replies; 10+ messages in thread
From: Ming Lei @ 2025-03-19 2:14 UTC (permalink / raw)
To: Uday Shankar; +Cc: Jens Axboe, linux-block
On Tue, Mar 18, 2025 at 12:14:17PM -0600, Uday Shankar wrote:
> The current I/O dispatch mechanism - queueing I/O by adding it to the
> io_cmds list (and poking task_work as needed), then dispatching it in
> ublk server task context by reversing io_cmds and completing the
> io_uring command associated to each one - was introduced by commit
> 7d4a93176e014 ("ublk_drv: don't forward io commands in reserve order")
> to ensure that the ublk server received I/O in the same order that the
> block layer submitted it to ublk_drv. This mechanism was only needed for
> the "raw" task_work submission mechanism, since the io_uring task work
> wrapper maintains FIFO ordering (using quite a similar mechanism in
> fact). The "raw" task_work submission mechanism is no longer supported
> in ublk_drv as of commit 29dc5d06613f2 ("ublk: kill queuing request by
> task_work_add"), so the explicit llist/reversal is no longer needed - it
> just duplicates logic already present in the underlying io_uring APIs.
> Remove it.
>
> Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Thanks,
Ming
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH] ublk: remove io_cmds list in ublk_queue
2025-03-18 18:14 [PATCH] ublk: remove io_cmds list in ublk_queue Uday Shankar
2025-03-18 18:22 ` Jens Axboe
2025-03-19 2:14 ` Ming Lei
@ 2025-03-19 12:32 ` Jens Axboe
2 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2025-03-19 12:32 UTC (permalink / raw)
To: Ming Lei, Uday Shankar; +Cc: linux-block
On Tue, 18 Mar 2025 12:14:17 -0600, Uday Shankar wrote:
> The current I/O dispatch mechanism - queueing I/O by adding it to the
> io_cmds list (and poking task_work as needed), then dispatching it in
> ublk server task context by reversing io_cmds and completing the
> io_uring command associated to each one - was introduced by commit
> 7d4a93176e014 ("ublk_drv: don't forward io commands in reserve order")
> to ensure that the ublk server received I/O in the same order that the
> block layer submitted it to ublk_drv. This mechanism was only needed for
> the "raw" task_work submission mechanism, since the io_uring task work
> wrapper maintains FIFO ordering (using quite a similar mechanism in
> fact). The "raw" task_work submission mechanism is no longer supported
> in ublk_drv as of commit 29dc5d06613f2 ("ublk: kill queuing request by
> task_work_add"), so the explicit llist/reversal is no longer needed - it
> just duplicates logic already present in the underlying io_uring APIs.
> Remove it.
>
> [...]
Applied, thanks!
[1/1] ublk: remove io_cmds list in ublk_queue
commit: 989bcd623a8b0c32b76d9258767d8b37e53419e6
Best regards,
--
Jens Axboe
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-03-19 12:32 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-18 18:14 [PATCH] ublk: remove io_cmds list in ublk_queue Uday Shankar
2025-03-18 18:22 ` Jens Axboe
2025-03-18 18:43 ` Uday Shankar
2025-03-18 18:48 ` Jens Axboe
2025-03-18 21:58 ` Uday Shankar
2025-03-19 1:54 ` Jens Axboe
2025-03-19 1:04 ` Ming Lei
2025-03-19 1:57 ` Jens Axboe
2025-03-19 2:14 ` Ming Lei
2025-03-19 12:32 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox