From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>, Miklos Szeredi <miklos@szeredi.hu>,
Keith Busch <kbusch@kernel.org>, Christoph Hellwig <hch@lst.de>,
Sagi Grimberg <sagi@grimberg.me>, Chris Mason <clm@fb.com>,
David Sterba <dsterba@suse.com>,
io-uring@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/3] io_uring/uring_cmd: avoid double indirect call in task work dispatch
Date: Fri, 24 Oct 2025 12:21:43 +0800 [thread overview]
Message-ID: <aPr-11nDqcz4z_V-@fedora> (raw)
In-Reply-To: <CADUfDZp21icTKrWHcgRTfmsxtdab85b6R75wAYXW2dA+dzXmoA@mail.gmail.com>
On Thu, Oct 23, 2025 at 08:49:40PM -0700, Caleb Sander Mateos wrote:
> On Thu, Oct 23, 2025 at 8:42 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Thu, Oct 23, 2025 at 02:18:30PM -0600, Caleb Sander Mateos wrote:
> > > io_uring task work dispatch makes an indirect call to struct io_kiocb's
> > > io_task_work.func field to allow running arbitrary task work functions.
> > > In the uring_cmd case, this calls io_uring_cmd_work(), which immediately
> > > makes another indirect call to struct io_uring_cmd's task_work_cb field.
> > > Define the uring_cmd task work callbacks as functions whose signatures
> > > match io_req_tw_func_t. Define a IO_URING_CMD_TASK_WORK_ISSUE_FLAGS
> > > constant in io_uring/cmd.h to avoid manufacturing issue_flags in the
> > > uring_cmd task work callbacks. Now uring_cmd task work dispatch makes a
> > > single indirect call to the uring_cmd implementation's callback. This
> > > also allows removing the task_work_cb field from struct io_uring_cmd,
> > > freeing up some additional storage space.
> >
> > The idea looks good.
> >
> > >
> > > Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
> > > ---
> > > block/ioctl.c | 4 +++-
> > > drivers/block/ublk_drv.c | 15 +++++++++------
> > > drivers/nvme/host/ioctl.c | 5 +++--
> > > fs/btrfs/ioctl.c | 4 +++-
> > > fs/fuse/dev_uring.c | 5 +++--
> > > include/linux/io_uring/cmd.h | 16 +++++++---------
> > > io_uring/uring_cmd.c | 13 ++-----------
> > > 7 files changed, 30 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/block/ioctl.c b/block/ioctl.c
> > > index d7489a56b33c..5c10d48fab27 100644
> > > --- a/block/ioctl.c
> > > +++ b/block/ioctl.c
> > > @@ -767,13 +767,15 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> > > struct blk_iou_cmd {
> > > int res;
> > > bool nowait;
> > > };
> > >
> > > -static void blk_cmd_complete(struct io_uring_cmd *cmd, unsigned int issue_flags)
> > > +static void blk_cmd_complete(struct io_kiocb *req, io_tw_token_t tw)
> > > {
> > > + struct io_uring_cmd *cmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> > > struct blk_iou_cmd *bic = io_uring_cmd_to_pdu(cmd, struct blk_iou_cmd);
> > > + unsigned int issue_flags = IO_URING_CMD_TASK_WORK_ISSUE_FLAGS;
> >
> > Now `io_kiocb` is exposed to driver, it could be perfect if 'io_uring_cmd'
> > is kept in kernel API interface, IMO.
>
> You mean change the io_req_tw_func_t signature to pass struct
> io_uring_cmd * instead of struct io_kiocb *? I don't think that would
> make sense because task work is a more general concept, not just for
> uring_cmd. I agree it's a bit ugly exposing struct io_kiocb * outside
> of the io_uring core, but I don't see a way to encapsulate it without
> other downsides (the additional indirect call or the gross macro from
> v1). Treating it as an opaque pointer type seems like the least bad
> option...
If switching to `struct io_kiocb *` can't be accepted, `opaque pointer type`
might not be too bad:
- share the callback storage for both `io_uring_cmd_tw_t` and
`io_req_tw_func_t` via union
- add one request flag for deciding to dispatch which one & prepare `io_kiocb *`
or `io_uring_cmd *`.
>
> >
> > ...
> >
> > > diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
> > > index b84b97c21b43..3efad93404f9 100644
> > > --- a/include/linux/io_uring/cmd.h
> > > +++ b/include/linux/io_uring/cmd.h
> > > @@ -9,18 +9,13 @@
> > > /* only top 8 bits of sqe->uring_cmd_flags for kernel internal use */
> > > #define IORING_URING_CMD_CANCELABLE (1U << 30)
> > > /* io_uring_cmd is being issued again */
> > > #define IORING_URING_CMD_REISSUE (1U << 31)
> > >
> > > -typedef void (*io_uring_cmd_tw_t)(struct io_uring_cmd *cmd,
> > > - unsigned issue_flags);
> > > -
> > > struct io_uring_cmd {
> > > struct file *file;
> > > const struct io_uring_sqe *sqe;
> > > - /* callback to defer completions to task context */
> > > - io_uring_cmd_tw_t task_work_cb;
> > > u32 cmd_op;
> > > u32 flags;
> > > u8 pdu[32]; /* available inline for free use */
> >
> > pdu[40]
>
> I considered that, but wondered if we might want to reuse the 8 bytes
> for something internal to uring_cmd rather than providing it to the
> driver's uring_cmd implementation. If we increase pdu and a driver
> starts using more than 32 bytes, it will be difficult to claw back. It
> seems reasonable to reserve half the space for the io_uring/uring_cmd
> layer and half for the driver.
Fair enough, but I think the 8bytes need to define as reserved, at least
with document benefit.
Thanks,
Ming
prev parent reply other threads:[~2025-10-24 4:22 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-23 20:18 [PATCH v2 0/3] io_uring/uring_cmd: avoid double indirect call in task work dispatch Caleb Sander Mateos
2025-10-23 20:18 ` [PATCH v2 1/3] io_uring: expose io_should_terminate_tw() Caleb Sander Mateos
2025-10-24 3:26 ` Ming Lei
2025-10-23 20:18 ` [PATCH v2 2/3] io_uring/uring_cmd: call io_should_terminate_tw() when needed Caleb Sander Mateos
2025-10-24 3:27 ` Ming Lei
2025-10-23 20:18 ` [PATCH v2 3/3] io_uring/uring_cmd: avoid double indirect call in task work dispatch Caleb Sander Mateos
2025-10-24 3:42 ` Ming Lei
2025-10-24 3:49 ` Caleb Sander Mateos
2025-10-24 4:21 ` Ming Lei [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aPr-11nDqcz4z_V-@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=clm@fb.com \
--cc=csander@purestorage.com \
--cc=dsterba@suse.com \
--cc=hch@lst.de \
--cc=io-uring@vger.kernel.org \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=miklos@szeredi.hu \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.