All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 2/2] ublk: run auto buf unregisgering in same io_ring_ctx with register
Date: Thu, 22 May 2025 10:37:43 +0800	[thread overview]
Message-ID: <aC6N9w4ijVEkHN0l@fedora> (raw)
In-Reply-To: <CADUfDZoY7rC=SxpFnN6bqBg1SiBccSyYTsKAVe2Rx0wAxBdD6Q@mail.gmail.com>

On Wed, May 21, 2025 at 06:05:14PM -0700, Caleb Sander Mateos wrote:
> On Wed, May 21, 2025 at 5:42 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Wed, May 21, 2025 at 08:58:20AM -0700, Caleb Sander Mateos wrote:
> > > On Tue, May 20, 2025 at 7:55 PM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > UBLK_F_AUTO_BUF_REG requires that the buffer registered automatically
> > > > is unregistered in same `io_ring_ctx`, so check it explicitly.
> > > >
> > > > Meantime return the failure code if io_buffer_unregister_bvec() fails,
> > > > then ublk server can handle the failure in consistent way.
> > > >
> > > > Also force to clear UBLK_IO_FLAG_AUTO_BUF_REG in ublk_io_release()
> > > > because ublk_io_release() may be triggered not from handling
> > > > UBLK_IO_COMMIT_AND_FETCH_REQ, and from releasing the `io_ring_ctx`
> > > > for registering the buffer.
> > > >
> > > > Fixes: 99c1e4eb6a3f ("ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG")
> > > > Reported-by: Caleb Sander Mateos <csander@purestorage.com>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >  drivers/block/ublk_drv.c      | 35 +++++++++++++++++++++++++++++++----
> > > >  include/uapi/linux/ublk_cmd.h |  3 ++-
> > > >  2 files changed, 33 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > > index fcf568b89370..2af6422d6a89 100644
> > > > --- a/drivers/block/ublk_drv.c
> > > > +++ b/drivers/block/ublk_drv.c
> > > > @@ -84,6 +84,7 @@ struct ublk_rq_data {
> > > >
> > > >         /* for auto-unregister buffer in case of UBLK_F_AUTO_BUF_REG */
> > > >         u16 buf_index;
> > > > +       unsigned long buf_ctx_id;
> > > >  };
> > > >
> > > >  struct ublk_uring_cmd_pdu {
> > > > @@ -1192,6 +1193,11 @@ static void ublk_auto_buf_reg_fallback(struct request *req, struct ublk_io *io)
> > > >         refcount_set(&data->ref, 1);
> > > >  }
> > > >
> > > > +static unsigned long ublk_uring_cmd_ctx_id(struct io_uring_cmd *cmd)
> > > > +{
> > > > +       return (unsigned long)(cmd_to_io_kiocb(cmd)->ctx);
> > >
> > > Is the fact that a struct io_uring_cmd * can be passed to
> > > cmd_to_io_kiocb() an io_uring internal implementation detail? Maybe it
> > > would be good to add a helper in include/linux/io_uring/cmd.h so ublk
> > > isn't depending on io_uring internals.
> >
> > All this definition is defined in kernel public header, not sure if it is
> > big deal to add the helper, especially there is just single user.
> >
> > But I will do it.
> >
> > >
> > > Also, storing buf_ctx_id as a void * instead would allow this cast to
> > > be avoided, but not a big deal.
> > >
> > > > +}
> > > > +
> > > >  static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io,
> > > >                               unsigned int issue_flags)
> > > >  {
> > > > @@ -1211,6 +1217,8 @@ static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io,
> > > >         }
> > > >         /* one extra reference is dropped by ublk_io_release */
> > > >         refcount_set(&data->ref, 2);
> > > > +
> > > > +       data->buf_ctx_id = ublk_uring_cmd_ctx_id(io->cmd);
> > > >         /* store buffer index in request payload */
> > > >         data->buf_index = pdu->buf.index;
> > > >         io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG;
> > > > @@ -1994,6 +2002,21 @@ static void ublk_io_release(void *priv)
> > > >  {
> > > >         struct request *rq = priv;
> > > >         struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> > > > +       struct ublk_io *io = &ubq->ios[rq->tag];
> > > > +
> > > > +       /*
> > > > +        * In case of UBLK_F_AUTO_BUF_REG, the `io_uring_ctx` for registering
> > > > +        * this buffer may be released, so we reach here not from handling
> > > > +        * `UBLK_IO_COMMIT_AND_FETCH_REQ`.
> > >
> > > What do you mean by this? That the io_uring was closed while a ublk
> > > I/O owned by the server still had a registered buffer?
> >
> > The buffer is registered to `io_ring_ctx A`, which is closed and the buffer
> > is used up and un-registered on `io_ring_ctx A`, so this callback is
> > triggered, but the io command isn't completed yet, which can be run from
> > `io_ring_ctx B`
> >
> > >
> > > > +        *
> > > > +        * Force to clear UBLK_IO_FLAG_AUTO_BUF_REG, so that ublk server
> > > > +        * still may complete this IO request by issuing uring_cmd from
> > > > +        * another `io_uring_ctx` in case that the `io_ring_ctx` for
> > > > +        * registering the buffer is gone
> > > > +        */
> > > > +       if (ublk_support_auto_buf_reg(ubq) &&
> > > > +                       (io->flags & UBLK_IO_FLAG_AUTO_BUF_REG))
> > > > +               io->flags &= ~UBLK_IO_FLAG_AUTO_BUF_REG;
> > >
> > > This is racy, since ublk_io_release() can be called on a thread other
> > > than the ubq_daemon.
> >
> > Yeah, it can be true.
> >
> > > Could we avoid touching io->flags here and
> > > instead have ublk_commit_and_fetch() check whether the reference count
> > > is already 1?
> >
> > It is still a little racy because the buffer unregister from another thread
> > can happen just after the check immediately.
> 
> True. I think it might be better to just skip the unregister if the
> contexts don't match rather than returning -EINVAL. Then there is no
> race. If userspace has already closed the old io_uring context,
> skipping the unregister is the desired behavior. If userspace hasn't
> closed the old io_uring, then that's a userspace bug and they get what
> they deserve (a buffer stuck registered). If userspace wants to submit
> the UBLK_IO_COMMIT_AND_FETCH_REQ on a different io_uring for some
> reason, they can always issue an explicit UBLK_IO_UNREGISTER_IO_BUF on
> the old io_uring to unregister the buffer.

Sounds good, will add the above words on ublk_commit_and_fetch().

Follows two invariants:

- ublk_io_release() is always called once no matter if it is called
from any thread context, request can't be completed until ublk_io_release()
is called

- new UBLK_IO_COMMIT_AND_FETCH_REQ can't be issued until old request
is completed & new request comes

The merged code should work just fine.

But we need to document the feature only works on same io_ring_context.


Thanks, 
Ming


  reply	other threads:[~2025-05-22  2:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-21  2:54 [PATCH 0/2] ublk: two fixes on UBLK_F_AUTO_BUF_REG Ming Lei
2025-05-21  2:54 ` [PATCH 1/2] ublk: handle ublk_set_auto_buf_reg() failure correctly in ublk_fetch() Ming Lei
2025-05-21 14:53   ` Caleb Sander Mateos
2025-05-21  2:55 ` [PATCH 2/2] ublk: run auto buf unregisgering in same io_ring_ctx with register Ming Lei
2025-05-21 15:58   ` Caleb Sander Mateos
2025-05-22  0:42     ` Ming Lei
2025-05-22  1:05       ` Caleb Sander Mateos
2025-05-22  2:37         ` Ming Lei [this message]
2025-05-21 19:19   ` Caleb Sander Mateos
2025-05-21 13:11 ` [PATCH 0/2] ublk: two fixes on UBLK_F_AUTO_BUF_REG Jens Axboe
2025-05-21 20:31   ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aC6N9w4ijVEkHN0l@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=csander@purestorage.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ushankar@purestorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.