From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Uday Shankar <ushankar@purestorage.com>, linux-block@vger.kernel.org
Subject: Re: [PATCH 6/8] ublk: allow UBLK_IO_(UN)REGISTER_IO_BUF on any task
Date: Tue, 10 Jun 2025 09:34:40 +0800 [thread overview]
Message-ID: <aEeLsEHxZV_Xs7yC@fedora> (raw)
In-Reply-To: <CADUfDZoCCFs6ZKwxDrgsHGTiCgyehS1uODu+iDmz1+5_k7tLbw@mail.gmail.com>
On Mon, Jun 09, 2025 at 10:49:09AM -0700, Caleb Sander Mateos wrote:
> On Mon, Jun 9, 2025 at 5:34 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Jun 06, 2025 at 03:40:09PM -0600, Caleb Sander Mateos wrote:
> > > Currently, UBLK_IO_REGISTER_IO_BUF and UBLK_IO_UNREGISTER_IO_BUF are
> > > only permitted on the ublk_io's daemon task. But this restriction is
> > > unnecessary. ublk_register_io_buf() calls __ublk_check_and_get_req() to
> > > look up the request from the tagset and atomically take a reference on
> > > the request without accessing the ublk_io. ublk_unregister_io_buf()
> > > doesn't use the q_id or tag at all.
> > >
> > > So allow these opcodes even on tasks other than io->task.
> > >
> > > Handle UBLK_IO_UNREGISTER_IO_BUF before obtaining the ubq and io since
> > > the buffer index being unregistered is not necessarily related to the
> > > specified q_id and tag.
> > >
> > > Add a feature flag UBLK_F_BUF_REG_OFF_DAEMON that userspace can use to
> > > determine whether the kernel supports off-daemon buffer registration.
> > >
> > > Suggested-by: Ming Lei <ming.lei@redhat.com>
> > > Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
> > > ---
> > > drivers/block/ublk_drv.c | 37 +++++++++++++++++++++--------------
> > > include/uapi/linux/ublk_cmd.h | 8 ++++++++
> > > 2 files changed, 30 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > index a8030818f74a..2084bbdd2cbb 100644
> > > --- a/drivers/block/ublk_drv.c
> > > +++ b/drivers/block/ublk_drv.c
> > > @@ -68,11 +68,12 @@
> > > | UBLK_F_ZONED \
> > > | UBLK_F_USER_RECOVERY_FAIL_IO \
> > > | UBLK_F_UPDATE_SIZE \
> > > | UBLK_F_AUTO_BUF_REG \
> > > | UBLK_F_QUIESCE \
> > > - | UBLK_F_PER_IO_DAEMON)
> > > + | UBLK_F_PER_IO_DAEMON \
> > > + | UBLK_F_BUF_REG_OFF_DAEMON)
> > >
> > > #define UBLK_F_ALL_RECOVERY_FLAGS (UBLK_F_USER_RECOVERY \
> > > | UBLK_F_USER_RECOVERY_REISSUE \
> > > | UBLK_F_USER_RECOVERY_FAIL_IO)
> > >
> > > @@ -2018,20 +2019,10 @@ static int ublk_register_io_buf(struct io_uring_cmd *cmd,
> > > }
> > >
> > > return 0;
> > > }
> > >
> > > -static int ublk_unregister_io_buf(struct io_uring_cmd *cmd,
> > > - const struct ublk_queue *ubq,
> > > - unsigned int index, unsigned int issue_flags)
> > > -{
> > > - if (!ublk_support_zero_copy(ubq))
> > > - return -EINVAL;
> > > -
> > > - return io_buffer_unregister_bvec(cmd, index, issue_flags);
> > > -}
> > > -
> > > static int ublk_fetch(struct io_uring_cmd *cmd, struct ublk_queue *ubq,
> > > struct ublk_io *io, __u64 buf_addr)
> > > {
> > > struct ublk_device *ub = ubq->dev;
> > > int ret = 0;
> > > @@ -2184,10 +2175,18 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
> > >
> > > ret = ublk_check_cmd_op(cmd_op);
> > > if (ret)
> > > goto out;
> > >
> > > + /*
> > > + * io_buffer_unregister_bvec() doesn't access the ubq or io,
> > > + * so no need to validate the q_id, tag, or task
> > > + */
> > > + if (_IOC_NR(cmd_op) == UBLK_IO_UNREGISTER_IO_BUF)
> > > + return io_buffer_unregister_bvec(cmd, ub_cmd->addr,
> > > + issue_flags);
> > > +
> > > ret = -EINVAL;
> > > if (ub_cmd->q_id >= ub->dev_info.nr_hw_queues)
> > > goto out;
> > >
> > > ubq = ublk_get_queue(ub, ub_cmd->q_id);
> > > @@ -2204,12 +2203,21 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
> > >
> > > ublk_prep_cancel(cmd, issue_flags, ubq, tag);
> > > return -EIOCBQUEUED;
> > > }
> > >
> > > - if (READ_ONCE(io->task) != current)
> > > + if (READ_ONCE(io->task) != current) {
> > > + /*
> > > + * ublk_register_io_buf() accesses only the request, not io,
> > > + * so can be handled on any task
> > > + */
> > > + if (_IOC_NR(cmd_op) == UBLK_IO_REGISTER_IO_BUF)
> > > + return ublk_register_io_buf(cmd, ubq, tag, ub_cmd->addr,
> > > + issue_flags);
> > > +
> > > goto out;
> > > + }
> > >
> > > /* there is pending io cmd, something must be wrong */
> > > if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV)) {
> > > ret = -EBUSY;
> >
> > It also skips check on UBLK_IO_FLAG_OWNED_BY_SRV for both UBLK_IO_REGISTER_IO_BUF
> > and UBLK_IO_UNREGISTER_IO_BUF, :-(
>
> As we've discussed before[1], accessing io->flags on tasks other than
> the io's daemon would be a race condition. So I don't see how it's
> possible to keep this check for off-daemon
> UBLK_IO_(UN)REGISTER_IO_BUF. What value do you see in checking for
> UBLK_IO_FLAG_OWNED_BY_SRV? My understanding is that the
> refcount_inc_not_zero() already ensures the ublk I/O has been
> dispatched to the ublk server and either hasn't been completed or has
> other registered buffers still in use, which is pretty similar to
> UBLK_IO_FLAG_OWNED_BY_SRV.
request can't be trusted any more for UBLK_F_BUF_REG_OFF_DAEMON because it
may be freed from elevator switch code or stopping dev code path, so we
can't rely on refcount_inc_not_zero(pdu(req)) only.
However the existing per-io-task and checking UBLK_IO_FLAG_OWNED_BY_SRV can
guarantee that the request is valid.
> For UBLK_IO_UNREGISTER_IO_BUF, I don't think checking io->flags &
> UBLK_IO_FLAG_OWNED_BY_SRV is sufficient to prevent misuse, since
> there's no requirement that the buffer index (addr) being unregistered
> matches the q_id, tag, or even ublk device specified in the command.
It should be fine to skip the check for UBLK_IO_UNREGISTER_IO_BUF because it
doesn't touch io & request.
However I think it is still correct to validate ZERO_COPY flag for
UBLK_IO_UNREGISTER_IO_BUF cause ZERO_COPY is only allowed for privileged
userspace.
Thanks,
Ming
next prev parent reply other threads:[~2025-06-10 1:34 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-06 21:40 [PATCH 0/8] ublk: allow off-daemon zero-copy buffer registration Caleb Sander Mateos
2025-06-06 21:40 ` [PATCH 1/8] ublk: check cmd_op first Caleb Sander Mateos
2025-06-09 6:57 ` Ming Lei
2025-06-06 21:40 ` [PATCH 2/8] ublk: handle UBLK_IO_FETCH_REQ first Caleb Sander Mateos
2025-06-09 7:01 ` Ming Lei
2025-06-06 21:40 ` [PATCH 3/8] ublk: remove task variable from __ublk_ch_uring_cmd() Caleb Sander Mateos
2025-06-09 7:02 ` Ming Lei
2025-06-06 21:40 ` [PATCH 4/8] ublk: consolidate UBLK_IO_FLAG_{ACTIVE,OWNED_BY_SRV} checks Caleb Sander Mateos
2025-06-09 7:19 ` Ming Lei
2025-06-06 21:40 ` [PATCH 5/8] ublk: move ublk_prep_cancel() to case UBLK_IO_COMMIT_AND_FETCH_REQ Caleb Sander Mateos
2025-06-09 7:21 ` Ming Lei
2025-06-06 21:40 ` [PATCH 6/8] ublk: allow UBLK_IO_(UN)REGISTER_IO_BUF on any task Caleb Sander Mateos
2025-06-09 7:34 ` Ming Lei
2025-06-09 17:39 ` Caleb Sander Mateos
2025-06-09 12:34 ` Ming Lei
2025-06-09 17:49 ` Caleb Sander Mateos
2025-06-10 1:34 ` Ming Lei [this message]
2025-06-11 15:47 ` Caleb Sander Mateos
2025-06-12 2:31 ` Ming Lei
2025-06-06 21:40 ` [PATCH 7/8] ublk: optimize UBLK_IO_REGISTER_IO_BUF on daemon task Caleb Sander Mateos
2025-06-09 9:02 ` Ming Lei
2025-06-09 17:14 ` Caleb Sander Mateos
2025-06-10 1:22 ` Ming Lei
2025-06-11 15:36 ` Caleb Sander Mateos
2025-06-12 4:25 ` Ming Lei
2025-06-06 21:40 ` [PATCH 8/8] ublk: remove ubq checks from ublk_{get,put}_req_ref() Caleb Sander Mateos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aEeLsEHxZV_Xs7yC@fedora \
--to=ming.lei@redhat.com \
--cc=csander@purestorage.com \
--cc=linux-block@vger.kernel.org \
--cc=ushankar@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox