public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 2/2] ublk: document IO reference counting design
Date: Sat, 24 Jan 2026 09:30:28 +0800	[thread overview]
Message-ID: <aXQgtDq2ffpAwnUW@fedora> (raw)
In-Reply-To: <CADUfDZog7YhUz0RXfyR=guO3-aKo_FnSF6cSVY1uVv0=m-xB1A@mail.gmail.com>

On Fri, Jan 23, 2026 at 12:37:07PM -0800, Caleb Sander Mateos wrote:
> On Fri, Jan 23, 2026 at 5:53 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > Add comprehensive documentation for ublk's split reference counting
> > model (io->ref + io->task_registered_buffers) above ublk_init_req_ref()
> > given this model isn't very straightforward.
> >
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >  drivers/block/ublk_drv.c | 64 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 64 insertions(+)
> >
> > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > index 7981decd1cee..91218b78e711 100644
> > --- a/drivers/block/ublk_drv.c
> > +++ b/drivers/block/ublk_drv.c
> > @@ -985,6 +985,70 @@ static inline bool ublk_dev_need_req_ref(const struct ublk_device *ub)
> >                ublk_dev_support_auto_buf_reg(ub);
> >  }
> >
> > +/*
> > + * ublk IO Reference Counting Design
> > + * ==================================
> > + *
> > + * For user-copy and zero-copy modes, ublk uses a split reference model with
> > + * two counters that together track IO lifetime:
> > + *
> > + *   - io->ref: refcount for off-task buffer registrations and user-copy ops
> > + *   - io->task_registered_buffers: count of buffers registered on the IO task
> > + *
> > + * Key Invariant:
> > + * --------------
> > + * The sum (io->ref + io->task_registered_buffers) must equal UBLK_REFCOUNT_INIT
> 
> This is only true when UBLK_IO_FLAG_OWNED_BY_SRV is set. If the
> ublk_io isn't currently dispatched to the ublk server, ref and
> task_registered_buffers would both be 0.

Yeah, it also means the reference counter is initialized, which is implied
in the whole doc.

> 
> > + * when no active references exist. This invariant is checked by
> > + * ublk_check_and_reset_active_ref() during daemon exit to determine if all
> > + * references have been released.
> > + *
> > + * Why Split Counters:
> > + * -------------------
> > + * Buffers registered on the IO daemon task can use the lightweight
> > + * task_registered_buffers counter (simple increment/decrement) instead of
> > + * atomic refcount operations. The ublk_io_release() callback checks if
> > + * current == io->task to decide which counter to update.
> 
> However, this optimization can only be used before the I/O is
> completed, since task_registered_buffers is collapsed into the atomic
> ref at that point. All subsequent buffer unregistrations need to use
> the atomic ref since they may be releasing the last reference.

The point is ublk_need_complete_req()/ublk_sub_req_ref(), we can make it
explicit.

> 
> > + *
> > + * Reference Lifecycle:
> > + * --------------------
> > + * 1. ublk_init_req_ref(): Sets io->ref = UBLK_REFCOUNT_INIT at IO dispatch
> > + *
> > + * 2. During IO processing:
> > + *    - On-task buffer reg: task_registered_buffers++ (no ref change)
> > + *    - Off-task buffer reg: ref++ via ublk_get_req_ref()
> > + *    - Buffer unregister callback (ublk_io_release):
> > + *      * If on-task: task_registered_buffers--
> > + *      * If off-task: ref-- via ublk_put_req_ref()
> > + *
> > + * 3. ublk_sub_req_ref() at IO completion:
> > + *    - Computes: sub_refs = UBLK_REFCOUNT_INIT - task_registered_buffers
> > + *    - Subtracts sub_refs from ref
> > + *    - This accounts for the initial UBLK_REFCOUNT_INIT minus any on-task
> > + *      buffers that were already counted in task_registered_buffers
> 
> I would note that task_registered_buffers is also zeroed during
> ublk_sub_req_ref(). Effectively, it's transferring/collapsing
> task_registered_buffers into the atomic ref.

OK.

> 
> > + *
> > + * Example (zero-copy, register on-task, unregister off-task):
> > + *   - Dispatch: ref = UBLK_REFCOUNT_INIT, task_registered_buffers = 0
> > + *   - Register buffer on-task: task_registered_buffers = 1
> > + *   - Unregister off-task: ref-- (UBLK_REFCOUNT_INIT - 1), task_registered_buffers stays 1
> > + *   - Completion via ublk_sub_req_ref():
> > + *     sub_refs = UBLK_REFCOUNT_INIT - 1, ref = (UBLK_REFCOUNT_INIT - 1) - (UBLK_REFCOUNT_INIT - 1) = 0
> > + *
> > + * Example (auto buffer registration):
> > + *   Auto buffer registration sets task_registered_buffers = 1 at dispatch.
> > + *
> > + *   - Dispatch: ref = UBLK_REFCOUNT_INIT, task_registered_buffers = 1
> > + *   - Buffer unregister: task_registered_buffers-- (becomes 0)
> > + *   - Completion via ublk_sub_req_ref(): sub_refs = UBLK_REFCOUNT_INIT - 0, ref becomes 0
> 
> I think the order here was actually reversed in commit b749965edda8
> "ublk: remove ublk_commit_and_fetch()". ublk_need_complete_req() will
> call ublk_sub_req_ref(), and io_buffer_unregister_bvec() is called
> *afterwards*, which results in a call to ublk_io_release(). So the

Right.

> buffer unregister will end up performing an atomic decrement of ref.
> 
> > + *   - Daemon exit check: sum = ref + task_registered_buffers = UBLK_REFCOUNT_INIT
> 
> Both ref  and task_registered_buffers are 0 at this point, no?

It can be UBLK_REFCOUNT_INIT or zero, please see ublk_check_and_reset_active_ref(),
the doc mentions `Daemon exit check`. But yes, both reset finally when we confirm
that the io hasn't active references.


Thanks, 
Ming


  reply	other threads:[~2026-01-24  1:30 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-23 13:51 [PATCH 0/2] ublk: auto buffer helpers rename & document io reference Ming Lei
2026-01-23 13:51 ` [PATCH 1/2] ublk: rename auto buffer registration helpers Ming Lei
2026-01-23 13:51 ` [PATCH 2/2] ublk: document IO reference counting design Ming Lei
2026-01-23 20:37   ` Caleb Sander Mateos
2026-01-24  1:30     ` Ming Lei [this message]
2026-01-24  1:38       ` Caleb Sander Mateos
2026-01-24  2:55         ` Ming Lei
2026-01-23 19:13 ` [PATCH 0/2] ublk: auto buffer helpers rename & document io reference Jens Axboe
2026-01-24  1:03   ` Ming Lei
2026-01-24  3:18     ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXQgtDq2ffpAwnUW@fedora \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=csander@purestorage.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ushankar@purestorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox