From: Ming Lei <ming.lei@redhat.com>
To: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>,
io-uring@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [PATCH V6 7/8] io_uring/uring_cmd: support provide group kernel buffer
Date: Thu, 10 Oct 2024 11:28:46 +0800 [thread overview]
Message-ID: <ZwdJ7sDuHhWT61FR@fedora> (raw)
In-Reply-To: <38ad4c05-6ee3-4839-8d61-f8e1b5219556@gmail.com>
On Wed, Oct 09, 2024 at 04:14:33PM +0100, Pavel Begunkov wrote:
> On 10/6/24 09:46, Ming Lei wrote:
> > On Fri, Oct 04, 2024 at 04:44:54PM +0100, Pavel Begunkov wrote:
> > > On 9/12/24 11:49, Ming Lei wrote:
> > > > Allow uring command to be group leader for providing kernel buffer,
> > > > and this way can support generic device zero copy over device buffer.
> > > >
> > > > The following patch will use the way to support zero copy for ublk.
> > > >
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > > include/linux/io_uring/cmd.h | 7 +++++++
> > > > include/uapi/linux/io_uring.h | 7 ++++++-
> > > > io_uring/uring_cmd.c | 28 ++++++++++++++++++++++++++++
> > > > 3 files changed, 41 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
> > > > index 447fbfd32215..fde3a2ec7d9a 100644
> > > > --- a/include/linux/io_uring/cmd.h
> > > > +++ b/include/linux/io_uring/cmd.h
> > > > @@ -48,6 +48,8 @@ void __io_uring_cmd_do_in_task(struct io_uring_cmd *ioucmd,
> > > > void io_uring_cmd_mark_cancelable(struct io_uring_cmd *cmd,
> > > > unsigned int issue_flags);
> > > > +int io_uring_cmd_provide_kbuf(struct io_uring_cmd *ioucmd,
> > > > + const struct io_uring_kernel_buf *grp_kbuf);
> > > > #else
> > > > static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
> > > > struct iov_iter *iter, void *ioucmd)
> > > > @@ -67,6 +69,11 @@ static inline void io_uring_cmd_mark_cancelable(struct io_uring_cmd *cmd,
> > > > unsigned int issue_flags)
> > > > {
> > > > }
> > > > +static inline int io_uring_cmd_provide_kbuf(struct io_uring_cmd *ioucmd,
> > > > + const struct io_uring_kernel_buf *grp_kbuf)
> > > > +{
> > > > + return -EOPNOTSUPP;
> > > > +}
> > > > #endif
> > > > /*
> > > > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > > > index 2af32745ebd3..11985eeac10e 100644
> > > > --- a/include/uapi/linux/io_uring.h
> > > > +++ b/include/uapi/linux/io_uring.h
> > > > @@ -271,9 +271,14 @@ enum io_uring_op {
> > > > * sqe->uring_cmd_flags top 8bits aren't available for userspace
> > > > * IORING_URING_CMD_FIXED use registered buffer; pass this flag
> > > > * along with setting sqe->buf_index.
> > > > + * IORING_PROVIDE_GROUP_KBUF this command provides group kernel buffer
> > > > + * for member requests which can retrieve
> > > > + * any sub-buffer with offset(sqe->addr) and
> > > > + * len(sqe->len)
> > >
> > > Is there a good reason it needs to be a cmd generic flag instead of
> > > ublk specific?
> >
> > io_uring request isn't visible for drivers, so driver can't know if the
> > uring command is one group leader.
>
> btw, does it have to be in a group at all? Sure, nobody would be
> able to consume the buffer, but otherwise should be fine.
>
> > Another way is to add new API of io_uring_cmd_may_provide_buffer(ioucmd)
>
> The checks can be done inside of io_uring_cmd_provide_kbuf()
Yeah.
Now the difference is just that:
- user may know it explicitly(UAPI flag) or implicitly(driver's ->cmd_op),
- if driver knows this uring_cmd is one group leader
I am fine with either way.
>
> > so driver can check if device buffer can be provided with this uring_cmd,
> > but I prefer to the new uring_cmd flag:
> >
> > - IORING_PROVIDE_GROUP_KBUF can provide device buffer in generic way.
>
> Ok, could be.
>
> > - ->prep() can fail fast in case that it isn't one group request
>
> I don't believe that matters, a behaving user should never
> see that kind of failure.
>
>
> > > 1. Extra overhead for files / cmds that don't even care about the
> > > feature.
> >
> > It is just checking ioucmd->flags in ->prep(), and basically zero cost.
>
> It's not if we add extra code for each every feature, at
> which point it becomes a maze of such "ifs".
Yeah, I guess it can't be avoided in current uring_cmd design, which
serves for different subsystems now, and more in future.
And the situation is similar with ioctl.
>
> > > 2. As it stands with this patch, the flag is ignored by all other
> > > cmd implementations, which might be quite confusing as an api,
> > > especially so since if we don't set that REQ_F_GROUP_KBUF memeber
> > > requests will silently try to import a buffer the "normal way",
> >
> > The usage is same with buffer select or fixed buffer, and consumer
> > has to check the flag.
>
> We fails requests when it's asked to use the feature but
> those are not supported, at least non-cmd requests.
>
> > And same with IORING_URING_CMD_FIXED which is ignored by other
> > implementations except for nvme, :-)
>
> Oh, that's bad. If you'd try to implement the flag in the
> future it might break the uapi. It might be worth to patch it
> up on the ublk side, i.e. reject the flag, + backport, and hope
> nobody tried to use them together, hmm?
>
> > I can understand the concern, but it exits since uring cmd is born.
> >
> > > i.e. interpret sqe->addr or such as the target buffer.
> >
> > > 3. We can't even put some nice semantics on top since it's
> > > still cmd specific and not generic to all other io_uring
> > > requests.
> > >
> > > I'd even think that it'd make sense to implement it as a
> > > new cmd opcode, but that's the business of the file implementing
> > > it, i.e. ublk.
> > >
> > > > */
> > > > #define IORING_URING_CMD_FIXED (1U << 0)
> > > > -#define IORING_URING_CMD_MASK IORING_URING_CMD_FIXED
> > > > +#define IORING_PROVIDE_GROUP_KBUF (1U << 1)
> > > > +#define IORING_URING_CMD_MASK (IORING_URING_CMD_FIXED | IORING_PROVIDE_GROUP_KBUF)
> >
> > It needs one new file operation, and we shouldn't work toward
> > this way.
>
> Not a new io_uring request, I rather meant sqe->cmd_op,
> like UBLK_U_IO_FETCH_REQ_PROVIDER_BUFFER.
`cmd_op` is supposed to be defined by subsystems, but maybe we can
reserve some for generic uring_cmd. Anyway this shouldn't be one big
deal, we can do that in future if there are more such uses.
Thanks,
Ming
next prev parent reply other threads:[~2024-10-10 3:29 UTC|newest]
Thread overview: 47+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-12 10:49 [PATCH V6 0/8] io_uring: support sqe group and provide group kbuf Ming Lei
2024-09-12 10:49 ` [PATCH V6 1/8] io_uring: add io_link_req() helper Ming Lei
2024-09-12 10:49 ` [PATCH V6 2/8] io_uring: add io_submit_fail_link() helper Ming Lei
2024-09-12 10:49 ` [PATCH V6 3/8] io_uring: add helper of io_req_commit_cqe() Ming Lei
2024-09-12 10:49 ` [PATCH V6 4/8] io_uring: support SQE group Ming Lei
2024-10-04 13:12 ` Pavel Begunkov
2024-10-06 3:54 ` Ming Lei
2024-10-09 11:53 ` Pavel Begunkov
2024-10-09 12:14 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 5/8] io_uring: support sqe group with members depending on leader Ming Lei
2024-10-04 13:18 ` Pavel Begunkov
2024-10-06 3:54 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 6/8] io_uring: support providing sqe group buffer Ming Lei
2024-10-04 15:32 ` Pavel Begunkov
2024-10-06 8:20 ` Ming Lei
2024-10-09 14:25 ` Pavel Begunkov
2024-10-10 3:00 ` Ming Lei
2024-10-10 18:51 ` Pavel Begunkov
2024-10-11 2:00 ` Ming Lei
2024-10-11 4:06 ` Ming Lei
2024-10-06 9:47 ` Ming Lei
2024-10-09 11:57 ` Pavel Begunkov
2024-10-09 12:21 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 7/8] io_uring/uring_cmd: support provide group kernel buffer Ming Lei
2024-10-04 15:44 ` Pavel Begunkov
2024-10-06 8:46 ` Ming Lei
2024-10-09 15:14 ` Pavel Begunkov
2024-10-10 3:28 ` Ming Lei [this message]
2024-10-10 15:48 ` Pavel Begunkov
2024-10-10 19:31 ` Jens Axboe
2024-10-11 2:30 ` Ming Lei
2024-10-11 2:39 ` Jens Axboe
2024-10-11 3:07 ` Ming Lei
2024-10-11 13:24 ` Jens Axboe
2024-10-11 14:20 ` Ming Lei
2024-10-11 14:41 ` Jens Axboe
2024-10-11 15:45 ` Ming Lei
2024-10-11 16:49 ` Jens Axboe
2024-10-12 3:35 ` Ming Lei
2024-10-14 18:40 ` Pavel Begunkov
2024-10-15 11:05 ` Ming Lei
2024-09-12 10:49 ` [PATCH V6 8/8] ublk: support provide io buffer Ming Lei
2024-10-17 22:31 ` Uday Shankar
2024-10-18 0:45 ` Ming Lei
2024-09-26 10:27 ` [PATCH V6 0/8] io_uring: support sqe group and provide group kbuf Ming Lei
2024-09-26 12:18 ` Jens Axboe
2024-09-26 19:46 ` Pavel Begunkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZwdJ7sDuHhWT61FR@fedora \
--to=ming.lei@redhat.com \
--cc=asml.silence@gmail.com \
--cc=axboe@kernel.dk \
--cc=io-uring@vger.kernel.org \
--cc=linux-block@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.