From: Jens Axboe <axboe@kernel.dk>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: io-uring <io-uring@vger.kernel.org>
Subject: Re: [PATCH] io_uring/uring_cmd: be smarter about SQE copying
Date: Tue, 3 Jun 2025 09:11:14 -0600 [thread overview]
Message-ID: <9dd8a125-5d15-4480-a86e-87ee96e238a4@kernel.dk> (raw)
In-Reply-To: <CADUfDZo=mbiz=0wxKSihhw9cxRdj5Uojh=XO0aPxKOZKtEc22A@mail.gmail.com>
On 6/3/25 9:05 AM, Caleb Sander Mateos wrote:
> On Sat, May 31, 2025 at 1:52?PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> uring_cmd currently copies the SQE unconditionally, which was introduced
>> as a work-around in commit:
>>
>> d6211ebbdaa5 ("io_uring/uring_cmd: unconditionally copy SQEs at prep time")
>>
>> because the checking for whether or not this command may have ->issue()
>> called from io-wq wasn't complete. Rectify that, ensuring that if the
>> request is marked explicitly async via REQ_F_FORCE_ASYNC or if it's
>> part of a link chain, then the SQE is copied upfront.
>>
>> Always copying can be costly, particularly when dealing with SQE128
>> rings. But even a normal 64b SQE copy is noticeable at high enough
>> rates.
>>
>> Reported-by: Caleb Sander Mateos <csander@purestorage.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>
>> ---
>>
>> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
>> index 929cad6ee326..cb4b867a2656 100644
>> --- a/io_uring/uring_cmd.c
>> +++ b/io_uring/uring_cmd.c
>> @@ -181,29 +181,42 @@ void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret, u64 res2,
>> }
>> EXPORT_SYMBOL_GPL(io_uring_cmd_done);
>>
>> +static void io_uring_sqe_copy(struct io_kiocb *req, struct io_uring_cmd *ioucmd)
>> +{
>> + struct io_async_cmd *ac = req->async_data;
>> +
>> + if (ioucmd->sqe != ac->sqes) {
>> + memcpy(ac->sqes, ioucmd->sqe, uring_sqe_size(req->ctx));
>> + ioucmd->sqe = ac->sqes;
>> + }
>> +}
>> +
>> static int io_uring_cmd_prep_setup(struct io_kiocb *req,
>> const struct io_uring_sqe *sqe)
>> {
>> struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
>> + struct io_ring_ctx *ctx = req->ctx;
>> struct io_async_cmd *ac;
>>
>> /* see io_uring_cmd_get_async_data() */
>> BUILD_BUG_ON(offsetof(struct io_async_cmd, data) != 0);
>>
>> - ac = io_uring_alloc_async_data(&req->ctx->cmd_cache, req);
>> + ac = io_uring_alloc_async_data(&ctx->cmd_cache, req);
>> if (!ac)
>> return -ENOMEM;
>> ac->data.op_data = NULL;
>>
>> /*
>> - * Unconditionally cache the SQE for now - this is only needed for
>> - * requests that go async, but prep handlers must ensure that any
>> - * sqe data is stable beyond prep. Since uring_cmd is special in
>> - * that it doesn't read in per-op data, play it safe and ensure that
>> - * any SQE data is stable beyond prep. This can later get relaxed.
>> + * Copy SQE now, if we know we're going async. Drain will set
>> + * FORCE_ASYNC, and assume links may cause it to go async. If not,
>> + * copy is deferred until issue time, if the request doesn't issue
>> + * or queue inline.
>> */
>> - memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
>> - ioucmd->sqe = ac->sqes;
>> + ioucmd->sqe = sqe;
>> + if (req->flags & (REQ_F_FORCE_ASYNC| REQ_F_LINK | REQ_F_HARDLINK) ||
>> + ctx->submit_state.link.head)
>
> To check my understanding, io_init_req() will set REQ_F_FORCE_ASYNC on
> any request with IOSQE_IO_DRAIN as well as all subsequent requests
> until the IOSQE_IO_DRAIN request completes? Looks like this condition
Correct
> should work then. I think you can drop REQ_F_LINK | REQ_F_HARDLINK,
> though; the initial request of a linked chain will be issued
> synchronously, and ctx->submit_state.link.head will be set for the
> subsequent requests.
>
> I do share Pavel's concern that whether or not a request will be
> initially issued asynchronously is up to the core io_uring code, so it
> seems a bit fragile to make these assumptions in the uring_cmd layer.
> I think I would prefer either passing a bool issue_async to the
> ->prep() handler, or adding an optional ->prep_async() hook called if
> the initial issue may happen asynchronously.
Yes I do too, which is why I suggested we add a specific cold handler
for this kind of case. That leaves the core of io_uring calling it
appropriately, and eliminates the need for storing an sqe pointer. Which
would've been fine before resizing, but not a great idea to do now. I'll
do a v2 of this with that in mind, just haven't gotten around to it yet.
>> + io_uring_sqe_copy(req, ioucmd);
>> +
>> return 0;
>> }
>>
>> @@ -259,6 +272,12 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>> }
>>
>> ret = file->f_op->uring_cmd(ioucmd, issue_flags);
>> + if (ret == -EAGAIN) {
>> + io_uring_sqe_copy(req, ioucmd);
>> + return ret;
>> + } else if (ret == -EIOCBQUEUED) {
>> + return ret;
>> + }
>> if (ret == -EAGAIN || ret == -EIOCBQUEUED)
>> return ret;
>
> This if condition is always false now, remove it?
Heh yes, just forgot to remove those lines when adding the above change.
--
Jens Axboe
prev parent reply other threads:[~2025-06-03 15:11 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-31 20:52 [PATCH] io_uring/uring_cmd: be smarter about SQE copying Jens Axboe
2025-06-02 10:24 ` Pavel Begunkov
2025-06-02 11:55 ` Jens Axboe
2025-06-02 12:10 ` Pavel Begunkov
2025-06-02 13:04 ` Jens Axboe
2025-06-03 15:05 ` Caleb Sander Mateos
2025-06-03 15:11 ` Jens Axboe [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9dd8a125-5d15-4480-a86e-87ee96e238a4@kernel.dk \
--to=axboe@kernel.dk \
--cc=csander@purestorage.com \
--cc=io-uring@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.