From: Ming Lei <ming.lei@redhat.com>
To: Caleb Sander Mateos <csander@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
Uday Shankar <ushankar@purestorage.com>
Subject: Re: [PATCH 2/2] ublk: document IO reference counting design
Date: Sat, 24 Jan 2026 10:55:24 +0800 [thread overview]
Message-ID: <aXQ0nN-E8uv1oHpW@fedora> (raw)
In-Reply-To: <CADUfDZrCCGY90oNo6qdtBrZ=xtZgZDTKndrAwqUjnA0Ou-jr=Q@mail.gmail.com>
On Fri, Jan 23, 2026 at 05:38:37PM -0800, Caleb Sander Mateos wrote:
> On Fri, Jan 23, 2026 at 5:30 PM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Jan 23, 2026 at 12:37:07PM -0800, Caleb Sander Mateos wrote:
> > > On Fri, Jan 23, 2026 at 5:53 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > Add comprehensive documentation for ublk's split reference counting
> > > > model (io->ref + io->task_registered_buffers) above ublk_init_req_ref()
> > > > given this model isn't very straightforward.
> > > >
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > > drivers/block/ublk_drv.c | 64 ++++++++++++++++++++++++++++++++++++++++
> > > > 1 file changed, 64 insertions(+)
> > > >
> > > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > > index 7981decd1cee..91218b78e711 100644
> > > > --- a/drivers/block/ublk_drv.c
> > > > +++ b/drivers/block/ublk_drv.c
> > > > @@ -985,6 +985,70 @@ static inline bool ublk_dev_need_req_ref(const struct ublk_device *ub)
> > > > ublk_dev_support_auto_buf_reg(ub);
> > > > }
> > > >
> > > > +/*
> > > > + * ublk IO Reference Counting Design
> > > > + * ==================================
> > > > + *
> > > > + * For user-copy and zero-copy modes, ublk uses a split reference model with
> > > > + * two counters that together track IO lifetime:
> > > > + *
> > > > + * - io->ref: refcount for off-task buffer registrations and user-copy ops
> > > > + * - io->task_registered_buffers: count of buffers registered on the IO task
> > > > + *
> > > > + * Key Invariant:
> > > > + * --------------
> > > > + * The sum (io->ref + io->task_registered_buffers) must equal UBLK_REFCOUNT_INIT
> > >
> > > This is only true when UBLK_IO_FLAG_OWNED_BY_SRV is set. If the
> > > ublk_io isn't currently dispatched to the ublk server, ref and
> > > task_registered_buffers would both be 0.
> >
> > Yeah, it also means the reference counter is initialized, which is implied
> > in the whole doc.
>
> Would you mind mentioning explicitly that both reference counts are 0
> for I/Os not currently dispatched to the ublk server?
Sure.
>
> >
> > >
> > > > + * when no active references exist. This invariant is checked by
> > > > + * ublk_check_and_reset_active_ref() during daemon exit to determine if all
> > > > + * references have been released.
> > > > + *
> > > > + * Why Split Counters:
> > > > + * -------------------
> > > > + * Buffers registered on the IO daemon task can use the lightweight
> > > > + * task_registered_buffers counter (simple increment/decrement) instead of
> > > > + * atomic refcount operations. The ublk_io_release() callback checks if
> > > > + * current == io->task to decide which counter to update.
> > >
> > > However, this optimization can only be used before the I/O is
> > > completed, since task_registered_buffers is collapsed into the atomic
> > > ref at that point. All subsequent buffer unregistrations need to use
> > > the atomic ref since they may be releasing the last reference.
> >
> > The point is ublk_need_complete_req()/ublk_sub_req_ref(), we can make it
> > explicit.
> >
> > >
> > > > + *
> > > > + * Reference Lifecycle:
> > > > + * --------------------
> > > > + * 1. ublk_init_req_ref(): Sets io->ref = UBLK_REFCOUNT_INIT at IO dispatch
> > > > + *
> > > > + * 2. During IO processing:
> > > > + * - On-task buffer reg: task_registered_buffers++ (no ref change)
> > > > + * - Off-task buffer reg: ref++ via ublk_get_req_ref()
> > > > + * - Buffer unregister callback (ublk_io_release):
> > > > + * * If on-task: task_registered_buffers--
> > > > + * * If off-task: ref-- via ublk_put_req_ref()
> > > > + *
> > > > + * 3. ublk_sub_req_ref() at IO completion:
> > > > + * - Computes: sub_refs = UBLK_REFCOUNT_INIT - task_registered_buffers
> > > > + * - Subtracts sub_refs from ref
> > > > + * - This accounts for the initial UBLK_REFCOUNT_INIT minus any on-task
> > > > + * buffers that were already counted in task_registered_buffers
> > >
> > > I would note that task_registered_buffers is also zeroed during
> > > ublk_sub_req_ref(). Effectively, it's transferring/collapsing
> > > task_registered_buffers into the atomic ref.
> >
> > OK.
> >
> > >
> > > > + *
> > > > + * Example (zero-copy, register on-task, unregister off-task):
> > > > + * - Dispatch: ref = UBLK_REFCOUNT_INIT, task_registered_buffers = 0
> > > > + * - Register buffer on-task: task_registered_buffers = 1
> > > > + * - Unregister off-task: ref-- (UBLK_REFCOUNT_INIT - 1), task_registered_buffers stays 1
> > > > + * - Completion via ublk_sub_req_ref():
> > > > + * sub_refs = UBLK_REFCOUNT_INIT - 1, ref = (UBLK_REFCOUNT_INIT - 1) - (UBLK_REFCOUNT_INIT - 1) = 0
> > > > + *
> > > > + * Example (auto buffer registration):
> > > > + * Auto buffer registration sets task_registered_buffers = 1 at dispatch.
> > > > + *
> > > > + * - Dispatch: ref = UBLK_REFCOUNT_INIT, task_registered_buffers = 1
> > > > + * - Buffer unregister: task_registered_buffers-- (becomes 0)
> > > > + * - Completion via ublk_sub_req_ref(): sub_refs = UBLK_REFCOUNT_INIT - 0, ref becomes 0
> > >
> > > I think the order here was actually reversed in commit b749965edda8
> > > "ublk: remove ublk_commit_and_fetch()". ublk_need_complete_req() will
> > > call ublk_sub_req_ref(), and io_buffer_unregister_bvec() is called
> > > *afterwards*, which results in a call to ublk_io_release(). So the
> >
> > Right.
>
> Should I send out a patch to restore the original order to recover the
> non-atomic refcount optimization?
Yeah, I think it is good to make the optimization back.
>
> >
> > > buffer unregister will end up performing an atomic decrement of ref.
> > >
> > > > + * - Daemon exit check: sum = ref + task_registered_buffers = UBLK_REFCOUNT_INIT
> > >
> > > Both ref and task_registered_buffers are 0 at this point, no?
> >
> > It can be UBLK_REFCOUNT_INIT or zero, please see ublk_check_and_reset_active_ref(),
> > the doc mentions `Daemon exit check`. But yes, both reset finally when we confirm
> > that the io hasn't active references.
>
> I mainly found it confusing that "Daemon exit check" is listed under
> "Example (auto buffer registration)". That makes it sounds like it's a
> continuation of the steps above, in which case ref and
> task_registered_buffers should be 0 at this point. Consider separating
> the part about the daemon exit check into its own section?
OK.
Also both may not be zero in Daemon exit check when the server is killed,
which should be added as example too.
Thanks,
Ming
next prev parent reply other threads:[~2026-01-24 2:55 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-23 13:51 [PATCH 0/2] ublk: auto buffer helpers rename & document io reference Ming Lei
2026-01-23 13:51 ` [PATCH 1/2] ublk: rename auto buffer registration helpers Ming Lei
2026-01-23 13:51 ` [PATCH 2/2] ublk: document IO reference counting design Ming Lei
2026-01-23 20:37 ` Caleb Sander Mateos
2026-01-24 1:30 ` Ming Lei
2026-01-24 1:38 ` Caleb Sander Mateos
2026-01-24 2:55 ` Ming Lei [this message]
2026-01-23 19:13 ` [PATCH 0/2] ublk: auto buffer helpers rename & document io reference Jens Axboe
2026-01-24 1:03 ` Ming Lei
2026-01-24 3:18 ` Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aXQ0nN-E8uv1oHpW@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=csander@purestorage.com \
--cc=linux-block@vger.kernel.org \
--cc=ushankar@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox