public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 2/2] ublk: run auto buf unregisgering in same io_ring_ctx with register
Date: Thu, 22 May 2025 10:37:43 +0800	[thread overview]
Message-ID: <aC6N9w4ijVEkHN0l@fedora> (raw)
In-Reply-To: <CADUfDZoY7rC=SxpFnN6bqBg1SiBccSyYTsKAVe2Rx0wAxBdD6Q@mail.gmail.com>

On Wed, May 21, 2025 at 06:05:14PM -0700, Caleb Sander Mateos wrote:
> On Wed, May 21, 2025 at 5:42 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Wed, May 21, 2025 at 08:58:20AM -0700, Caleb Sander Mateos wrote:
> > > On Tue, May 20, 2025 at 7:55 PM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > UBLK_F_AUTO_BUF_REG requires that the buffer registered automatically
> > > > is unregistered in same `io_ring_ctx`, so check it explicitly.
> > > >
> > > > Meantime return the failure code if io_buffer_unregister_bvec() fails,
> > > > then ublk server can handle the failure in consistent way.
> > > >
> > > > Also force to clear UBLK_IO_FLAG_AUTO_BUF_REG in ublk_io_release()
> > > > because ublk_io_release() may be triggered not from handling
> > > > UBLK_IO_COMMIT_AND_FETCH_REQ, and from releasing the `io_ring_ctx`
> > > > for registering the buffer.
> > > >
> > > > Fixes: 99c1e4eb6a3f ("ublk: register buffer to local io_uring with provided buf index via UBLK_F_AUTO_BUF_REG")
> > > > Reported-by: Caleb Sander Mateos <csander@purestorage.com>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >  drivers/block/ublk_drv.c      | 35 +++++++++++++++++++++++++++++++----
> > > >  include/uapi/linux/ublk_cmd.h |  3 ++-
> > > >  2 files changed, 33 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > > index fcf568b89370..2af6422d6a89 100644
> > > > --- a/drivers/block/ublk_drv.c
> > > > +++ b/drivers/block/ublk_drv.c
> > > > @@ -84,6 +84,7 @@ struct ublk_rq_data {
> > > >
> > > >         /* for auto-unregister buffer in case of UBLK_F_AUTO_BUF_REG */
> > > >         u16 buf_index;
> > > > +       unsigned long buf_ctx_id;
> > > >  };
> > > >
> > > >  struct ublk_uring_cmd_pdu {
> > > > @@ -1192,6 +1193,11 @@ static void ublk_auto_buf_reg_fallback(struct request *req, struct ublk_io *io)
> > > >         refcount_set(&data->ref, 1);
> > > >  }
> > > >
> > > > +static unsigned long ublk_uring_cmd_ctx_id(struct io_uring_cmd *cmd)
> > > > +{
> > > > +       return (unsigned long)(cmd_to_io_kiocb(cmd)->ctx);
> > >
> > > Is the fact that a struct io_uring_cmd * can be passed to
> > > cmd_to_io_kiocb() an io_uring internal implementation detail? Maybe it
> > > would be good to add a helper in include/linux/io_uring/cmd.h so ublk
> > > isn't depending on io_uring internals.
> >
> > All this definition is defined in kernel public header, not sure if it is
> > big deal to add the helper, especially there is just single user.
> >
> > But I will do it.
> >
> > >
> > > Also, storing buf_ctx_id as a void * instead would allow this cast to
> > > be avoided, but not a big deal.
> > >
> > > > +}
> > > > +
> > > >  static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io,
> > > >                               unsigned int issue_flags)
> > > >  {
> > > > @@ -1211,6 +1217,8 @@ static bool ublk_auto_buf_reg(struct request *req, struct ublk_io *io,
> > > >         }
> > > >         /* one extra reference is dropped by ublk_io_release */
> > > >         refcount_set(&data->ref, 2);
> > > > +
> > > > +       data->buf_ctx_id = ublk_uring_cmd_ctx_id(io->cmd);
> > > >         /* store buffer index in request payload */
> > > >         data->buf_index = pdu->buf.index;
> > > >         io->flags |= UBLK_IO_FLAG_AUTO_BUF_REG;
> > > > @@ -1994,6 +2002,21 @@ static void ublk_io_release(void *priv)
> > > >  {
> > > >         struct request *rq = priv;
> > > >         struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> > > > +       struct ublk_io *io = &ubq->ios[rq->tag];
> > > > +
> > > > +       /*
> > > > +        * In case of UBLK_F_AUTO_BUF_REG, the `io_uring_ctx` for registering
> > > > +        * this buffer may be released, so we reach here not from handling
> > > > +        * `UBLK_IO_COMMIT_AND_FETCH_REQ`.
> > >
> > > What do you mean by this? That the io_uring was closed while a ublk
> > > I/O owned by the server still had a registered buffer?
> >
> > The buffer is registered to `io_ring_ctx A`, which is closed and the buffer
> > is used up and un-registered on `io_ring_ctx A`, so this callback is
> > triggered, but the io command isn't completed yet, which can be run from
> > `io_ring_ctx B`
> >
> > >
> > > > +        *
> > > > +        * Force to clear UBLK_IO_FLAG_AUTO_BUF_REG, so that ublk server
> > > > +        * still may complete this IO request by issuing uring_cmd from
> > > > +        * another `io_uring_ctx` in case that the `io_ring_ctx` for
> > > > +        * registering the buffer is gone
> > > > +        */
> > > > +       if (ublk_support_auto_buf_reg(ubq) &&
> > > > +                       (io->flags & UBLK_IO_FLAG_AUTO_BUF_REG))
> > > > +               io->flags &= ~UBLK_IO_FLAG_AUTO_BUF_REG;
> > >
> > > This is racy, since ublk_io_release() can be called on a thread other
> > > than the ubq_daemon.
> >
> > Yeah, it can be true.
> >
> > > Could we avoid touching io->flags here and
> > > instead have ublk_commit_and_fetch() check whether the reference count
> > > is already 1?
> >
> > It is still a little racy because the buffer unregister from another thread
> > can happen just after the check immediately.
> 
> True. I think it might be better to just skip the unregister if the
> contexts don't match rather than returning -EINVAL. Then there is no
> race. If userspace has already closed the old io_uring context,
> skipping the unregister is the desired behavior. If userspace hasn't
> closed the old io_uring, then that's a userspace bug and they get what
> they deserve (a buffer stuck registered). If userspace wants to submit
> the UBLK_IO_COMMIT_AND_FETCH_REQ on a different io_uring for some
> reason, they can always issue an explicit UBLK_IO_UNREGISTER_IO_BUF on
> the old io_uring to unregister the buffer.

Sounds good, will add the above words on ublk_commit_and_fetch().

Follows two invariants:

- ublk_io_release() is always called once no matter if it is called
from any thread context, request can't be completed until ublk_io_release()
is called

- new UBLK_IO_COMMIT_AND_FETCH_REQ can't be issued until old request
is completed & new request comes

The merged code should work just fine.

But we need to document the feature only works on same io_ring_context.


Thanks, 
Ming


  reply	other threads:[~2025-05-22  2:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-21  2:54 [PATCH 0/2] ublk: two fixes on UBLK_F_AUTO_BUF_REG Ming Lei
2025-05-21  2:54 ` [PATCH 1/2] ublk: handle ublk_set_auto_buf_reg() failure correctly in ublk_fetch() Ming Lei
2025-05-21 14:53   ` Caleb Sander Mateos
2025-05-21  2:55 ` [PATCH 2/2] ublk: run auto buf unregisgering in same io_ring_ctx with register Ming Lei
2025-05-21 15:58   ` Caleb Sander Mateos
2025-05-22  0:42     ` Ming Lei
2025-05-22  1:05       ` Caleb Sander Mateos
2025-05-22  2:37         ` Ming Lei [this message]
2025-05-21 19:19   ` Caleb Sander Mateos
2025-05-21 13:11 ` [PATCH 0/2] ublk: two fixes on UBLK_F_AUTO_BUF_REG Jens Axboe
2025-05-21 20:31   ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aC6N9w4ijVEkHN0l@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=csander@purestorage.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ushankar@purestorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox