* [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd()
@ 2026-05-05 7:37 Jens Axboe
2026-05-05 15:32 ` Caleb Sander Mateos
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2026-05-05 7:37 UTC (permalink / raw)
To: linux-block@vger.kernel.org; +Cc: Caleb Sander Mateos
This passthrough helper currently only supports discards. Part of that
command is the start and length, which is read from the SQE. It does
so on every invocation, where it really should just make it stable
on the first invocation. This avoids needing to copy the SQE upfront,
as we only really need those two 8b values stored in our per-req
payload.
Cc: stable@vger.kernel.org # 6.17+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
Changes since v1:
- Use IORING_URING_CMD_REISSUE flag (Caleb)
diff --git a/block/ioctl.c b/block/ioctl.c
index fc3be0549aa7..ab2c9ed79946 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -857,6 +857,8 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
#endif
struct blk_iou_cmd {
+ u64 start;
+ u64 len;
int res;
bool nowait;
};
@@ -946,23 +948,27 @@ int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
struct block_device *bdev = I_BDEV(cmd->file->f_mapping->host);
struct blk_iou_cmd *bic = io_uring_cmd_to_pdu(cmd, struct blk_iou_cmd);
- const struct io_uring_sqe *sqe = cmd->sqe;
u32 cmd_op = cmd->cmd_op;
- uint64_t start, len;
- if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
- sqe->rw_flags || sqe->file_index))
- return -EINVAL;
+ /* Read what we need from the SQE on the first issue */
+ if (!(issue_flags & IORING_URING_CMD_REISSUE)) {
+ const struct io_uring_sqe *sqe = cmd->sqe;
+
+ if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
+ sqe->rw_flags || sqe->file_index))
+ return -EINVAL;
+
+ bic->start = READ_ONCE(sqe->addr);
+ bic->len = READ_ONCE(sqe->addr3);
+ }
bic->res = 0;
bic->nowait = issue_flags & IO_URING_F_NONBLOCK;
- start = READ_ONCE(sqe->addr);
- len = READ_ONCE(sqe->addr3);
-
switch (cmd_op) {
case BLOCK_URING_CMD_DISCARD:
- return blkdev_cmd_discard(cmd, bdev, start, len, bic->nowait);
+ return blkdev_cmd_discard(cmd, bdev, bic->start, bic->len,
+ bic->nowait);
}
return -EINVAL;
}
--
Jens Axboe
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd()
2026-05-05 7:37 [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd() Jens Axboe
@ 2026-05-05 15:32 ` Caleb Sander Mateos
2026-05-06 10:43 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Caleb Sander Mateos @ 2026-05-05 15:32 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block@vger.kernel.org
On Tue, May 5, 2026 at 12:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> This passthrough helper currently only supports discards. Part of that
> command is the start and length, which is read from the SQE. It does
> so on every invocation, where it really should just make it stable
> on the first invocation. This avoids needing to copy the SQE upfront,
> as we only really need those two 8b values stored in our per-req
> payload.
Maybe I'm missing something, but how does this avoid the SQE copy?
Won't io_uring_cmd_sqe_copy() still copy the SQE if the command issue
is delayed by links/drain/forced async or if the initial issue returns
EAGAIN? The code change looks fine to me, but I'm not sure I
understand the motivation.
Best,
Caleb
>
> Cc: stable@vger.kernel.org # 6.17+
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> Changes since v1:
> - Use IORING_URING_CMD_REISSUE flag (Caleb)
>
> diff --git a/block/ioctl.c b/block/ioctl.c
> index fc3be0549aa7..ab2c9ed79946 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -857,6 +857,8 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> #endif
>
> struct blk_iou_cmd {
> + u64 start;
> + u64 len;
> int res;
> bool nowait;
> };
> @@ -946,23 +948,27 @@ int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
> {
> struct block_device *bdev = I_BDEV(cmd->file->f_mapping->host);
> struct blk_iou_cmd *bic = io_uring_cmd_to_pdu(cmd, struct blk_iou_cmd);
> - const struct io_uring_sqe *sqe = cmd->sqe;
> u32 cmd_op = cmd->cmd_op;
> - uint64_t start, len;
>
> - if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
> - sqe->rw_flags || sqe->file_index))
> - return -EINVAL;
> + /* Read what we need from the SQE on the first issue */
> + if (!(issue_flags & IORING_URING_CMD_REISSUE)) {
> + const struct io_uring_sqe *sqe = cmd->sqe;
> +
> + if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
> + sqe->rw_flags || sqe->file_index))
> + return -EINVAL;
> +
> + bic->start = READ_ONCE(sqe->addr);
> + bic->len = READ_ONCE(sqe->addr3);
> + }
>
> bic->res = 0;
> bic->nowait = issue_flags & IO_URING_F_NONBLOCK;
>
> - start = READ_ONCE(sqe->addr);
> - len = READ_ONCE(sqe->addr3);
> -
> switch (cmd_op) {
> case BLOCK_URING_CMD_DISCARD:
> - return blkdev_cmd_discard(cmd, bdev, start, len, bic->nowait);
> + return blkdev_cmd_discard(cmd, bdev, bic->start, bic->len,
> + bic->nowait);
> }
> return -EINVAL;
> }
>
> --
> Jens Axboe
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd()
2026-05-05 15:32 ` Caleb Sander Mateos
@ 2026-05-06 10:43 ` Jens Axboe
2026-05-06 16:07 ` Caleb Sander Mateos
0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2026-05-06 10:43 UTC (permalink / raw)
To: Caleb Sander Mateos; +Cc: linux-block@vger.kernel.org
On 5/5/26 9:32 AM, Caleb Sander Mateos wrote:
> On Tue, May 5, 2026 at 12:38?AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> This passthrough helper currently only supports discards. Part of that
>> command is the start and length, which is read from the SQE. It does
>> so on every invocation, where it really should just make it stable
>> on the first invocation. This avoids needing to copy the SQE upfront,
>> as we only really need those two 8b values stored in our per-req
>> payload.
>
> Maybe I'm missing something, but how does this avoid the SQE copy?
> Won't io_uring_cmd_sqe_copy() still copy the SQE if the command issue
> is delayed by links/drain/forced async or if the initial issue returns
> EAGAIN? The code change looks fine to me, but I'm not sure I
> understand the motivation.
Because the alternative would be to copy it upfront and then using ->sqe
would always be fine. But we can just ensure start/len are stable. The
patch is about the latter, not adding any further copy avoidance. Yes if
you use async or it's a link or drained, prep will always copy it.
That's fine.
--
Jens Axboe
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd()
2026-05-06 10:43 ` Jens Axboe
@ 2026-05-06 16:07 ` Caleb Sander Mateos
2026-05-06 16:09 ` Jens Axboe
0 siblings, 1 reply; 5+ messages in thread
From: Caleb Sander Mateos @ 2026-05-06 16:07 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block@vger.kernel.org
On Wed, May 6, 2026 at 3:43 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 5/5/26 9:32 AM, Caleb Sander Mateos wrote:
> > On Tue, May 5, 2026 at 12:38?AM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> This passthrough helper currently only supports discards. Part of that
> >> command is the start and length, which is read from the SQE. It does
> >> so on every invocation, where it really should just make it stable
> >> on the first invocation. This avoids needing to copy the SQE upfront,
> >> as we only really need those two 8b values stored in our per-req
> >> payload.
> >
> > Maybe I'm missing something, but how does this avoid the SQE copy?
> > Won't io_uring_cmd_sqe_copy() still copy the SQE if the command issue
> > is delayed by links/drain/forced async or if the initial issue returns
> > EAGAIN? The code change looks fine to me, but I'm not sure I
> > understand the motivation.
>
> Because the alternative would be to copy it upfront and then using ->sqe
> would always be fine. But we can just ensure start/len are stable. The
> patch is about the latter, not adding any further copy avoidance. Yes if
> you use async or it's a link or drained, prep will always copy it.
> That's fine.
I see what you mean now, makes sense. As for making start/len stable
in the first place, is this just hardening? Currently, it doesn't look
like calling blkdev_cmd_discard() again with different start/len
values after it returned EAGAIN would cause any issues.
Best,
Caleb
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd()
2026-05-06 16:07 ` Caleb Sander Mateos
@ 2026-05-06 16:09 ` Jens Axboe
0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2026-05-06 16:09 UTC (permalink / raw)
To: Caleb Sander Mateos; +Cc: linux-block@vger.kernel.org
On 5/6/26 10:07 AM, Caleb Sander Mateos wrote:
> On Wed, May 6, 2026 at 3:43 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 5/5/26 9:32 AM, Caleb Sander Mateos wrote:
>>> On Tue, May 5, 2026 at 12:38?AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> This passthrough helper currently only supports discards. Part of that
>>>> command is the start and length, which is read from the SQE. It does
>>>> so on every invocation, where it really should just make it stable
>>>> on the first invocation. This avoids needing to copy the SQE upfront,
>>>> as we only really need those two 8b values stored in our per-req
>>>> payload.
>>>
>>> Maybe I'm missing something, but how does this avoid the SQE copy?
>>> Won't io_uring_cmd_sqe_copy() still copy the SQE if the command issue
>>> is delayed by links/drain/forced async or if the initial issue returns
>>> EAGAIN? The code change looks fine to me, but I'm not sure I
>>> understand the motivation.
>>
>> Because the alternative would be to copy it upfront and then using ->sqe
>> would always be fine. But we can just ensure start/len are stable. The
>> patch is about the latter, not adding any further copy avoidance. Yes if
>> you use async or it's a link or drained, prep will always copy it.
>> That's fine.
>
> I see what you mean now, makes sense. As for making start/len stable
> in the first place, is this just hardening? Currently, it doesn't look
> like calling blkdev_cmd_discard() again with different start/len
> values after it returned EAGAIN would cause any issues.
Yeah it doesn't cause any issues, just prudent.
--
Jens Axboe
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-06 16:09 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-05 7:37 [PATCH v2] block: only read from sqe on initial invocation of blkdev_uring_cmd() Jens Axboe
2026-05-05 15:32 ` Caleb Sander Mateos
2026-05-06 10:43 ` Jens Axboe
2026-05-06 16:07 ` Caleb Sander Mateos
2026-05-06 16:09 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox