From: Ming Lei <tom.leiming@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Yonghong Song <yonghong.song@linux.dev>,
	Ming Lei <tom.leiming@gmail.com>
Subject: [RFC PATCH 22/22] ublk: document ublk-bpf & bpf-aio
Date: Tue,  7 Jan 2025 20:04:13 +0800
Message-ID: <20250107120417.1237392-23-tom.leiming@gmail.com>
In-Reply-To: <20250107120417.1237392-1-tom.leiming@gmail.com>

Document ublk-bpf motivation and implementation.

Document bpf-aio implementation.

Document ublk-bpf selftests.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
 Documentation/block/ublk.rst | 170 +++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index 51665a3e6a50..bf7a3df48036 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -309,6 +309,176 @@ with specified IO tag in the command data:
   ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
   the server buffer (pages) read to the IO request pages.
 
+
+UBLK-BPF support
+================
+
+Motivation
+----------
+
+- support stacking ublk
+
+  There are many third-party volume managers, and building ublk over a ublk
+  device can simplify their implementation. However, the multiple
+  userspace-kernel context switches needed to handle a single IO are
+  unacceptable from a performance point of view.
+
+  ublk-bpf can avoid the user-kernel context switch in most of the fast IO
+  path, so ublk over ublk becomes practical.
+
+- complicated virtual block devices
+
+  Many complicated virtual block devices have an admin & meta code path and a
+  normal IO fast path. Meta & admin IO handling is usually complicated, so it
+  can be moved to the ublk server to relieve development burden; meanwhile the
+  IO fast path can be kept in kernel space for the sake of high performance.
+
+  BPF provides rich map types, which help a lot with communication between
+  userspace and a prog, or between progs.
+
+  One typical example is qcow2, whose meta IO handling can be kept in the
+  ublk server while the fast IO path is moved to a bpf prog. An efficient bpf
+  map can be looked up first to see whether the virtual LBA to host LBA
+  mapping is present. If yes, handle the IO with ublk-bpf directly; otherwise
+  forward it to the ublk server to populate the mapping first.
+
+- some simple high performance virtual devices
+
+  For devices such as null & loop, the whole implementation can be moved
+  into a bpf prog.
+
+- provides a chance to get performance similar to a kernel driver
+
+  One round of kernel/user context switch is avoided, and one extra IO data
+  copy is saved.
+
+bpf aio
+-------
+
+bpf aio exports kfuncs for a bpf prog to submit & complete IO asynchronously.
+The IO completion handler is provided by the bpf aio user, and is defined in
+a bpf prog (such as a ublk bpf prog) as `struct bpf_aio_complete_ops`
+of bpf struct_ops.
+
+bpf aio is designed as a generic interface, which can be used by any bpf prog
+in theory, and it may be moved to `/lib/` in future if the interface becomes
+mature and stable enough.
+
+- bpf_aio_alloc()
+
+  Allocate one bpf aio instance of `struct bpf_aio`
+
+- bpf_aio_release()
+
+  Free one bpf aio instance of `struct bpf_aio`
+
+- bpf_aio_submit()
+
+  Submit one bpf aio instance of `struct bpf_aio` asynchronously.
+
+- `struct bpf_aio_complete_ops`
+
+  Define the bpf aio completion callback, implemented as bpf struct_ops; it
+  is called when the submitted bpf aio is completed.
+
+
+ublk bpf implementation
+-----------------------
+
+Export `struct ublk_bpf_ops` as bpf struct_ops, so that a ublk IO command
+can be queued or handled in the callbacks defined in the ublk bpf struct_ops;
+see the whole logic in `ublk_run_bpf_handler`:
+
+- `UBLK_BPF_IO_QUEUED`
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns `UBLK_BPF_IO_QUEUED`,
+  this IO command has been queued by the bpf prog, so it won't be forwarded
+  to the ublk server.
+
+- `UBLK_BPF_IO_REDIRECT`
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns `UBLK_BPF_IO_REDIRECT`,
+  this IO command will be forwarded to the ublk server.
+
+- `UBLK_BPF_IO_CONTINUE`
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns `UBLK_BPF_IO_CONTINUE`,
+  part of this IO command has been queued, and `ublk_bpf_return_t` carries how
+  many bytes were queued. The ublk driver then calls the callback again to
+  queue the remaining bytes of this IO command. This helps implement
+  stacking devices by allowing an IO command to be split.
+
+ublk bpf provides kfuncs for ublk bpf progs to queue and handle ublk IO commands:
+
+- ublk_bpf_complete_io()
+
+  Complete this ublk IO command
+
+- ublk_bpf_get_io_tag()
+
+  Get tag of this ublk IO command
+
+- ublk_bpf_get_queue_id()
+
+  Get queue id of this ublk IO command
+
+- ublk_bpf_get_dev_id()
+
+  Get device id of this ublk IO command
+
+- ublk_bpf_attach_and_prep_aio()
+
+  Attach & prepare a bpf aio for this ublk IO command: the bpf aio buffer is
+  prepared and the aio's completion callback is set up, so the user prog can
+  get notified when the bpf aio is completed.
+
+- ublk_bpf_dettach_and_complete_aio()
+
+  Detach the bpf aio from this IO command; this is usually called from the
+  bpf aio's completion callback.
+
+- ublk_bpf_acquire_io_from_aio()
+
+  Acquire the ublk IO command from the aio; one typical use is calling
+  ublk_bpf_complete_io() to complete the ublk IO command.
+
+- ublk_bpf_release_io_from_aio()
+
+  Release the ublk IO command acquired via `ublk_bpf_acquire_io_from_aio`
+
+
+Test
+----
+
+- Build kernel & install kernel headers & reboot & test
+
+  enable CONFIG_BLK_DEV_UBLK & CONFIG_UBLK_BPF
+
+  make
+
+  make headers_install INSTALL_HDR_PATH=/usr
+
+  reboot
+
+  make -C tools/testing/selftests TARGETS=ublk run_tests
+
+The ublk selftests implement null, loop and stripe targets to cover all
+bpf features:
+
+- complete bpf IO handling
+
+- complete ublk server IO handling
+
+- mixed bpf prog and ublk server IO handling
+
+- bpf aio for loop & stripe
+
+- IO split via `UBLK_BPF_IO_CONTINUE` for implementing ublk-stripe
+
+Write & read verification, as well as mkfs.ext4 & mount & umount, are run in
+the selftest.
+
+
 Future development
 ==================
 
-- 
2.47.0
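
The `UBLK_BPF_IO_QUEUED` / `UBLK_BPF_IO_REDIRECT` / `UBLK_BPF_IO_CONTINUE`
protocol documented in the patch can be illustrated with a small userspace
simulation. This is only a sketch under stated assumptions, not the driver
code: every `sim_*` / `SIM_*` name is hypothetical, the (code, bytes) pair is
a stand-in for whatever `ublk_bpf_return_t` really encodes, and the policy of
splitting at a 4096-byte chunk boundary and redirecting zero-length commands
is invented for illustration, loosely modelled on a stripe target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the UBLK_BPF_IO_* return codes. */
enum sim_rc { SIM_IO_QUEUED, SIM_IO_REDIRECT, SIM_IO_CONTINUE };

/* Hypothetical result: a return code plus the number of bytes queued. */
struct sim_ret {
	enum sim_rc rc;
	uint32_t bytes;
};

/* Hypothetical IO command: byte offset on the device and total length. */
struct sim_io {
	uint64_t start;
	uint32_t nr_bytes;
};

#define SIM_CHUNK 4096u	/* assumed stripe chunk size */

/*
 * Stand-in for ->queue_io_cmd(): queue at most up to the next chunk
 * boundary. Zero-length commands are redirected to the server.
 */
static struct sim_ret sim_queue_io_cmd(const struct sim_io *io, uint32_t offset)
{
	struct sim_ret r = { SIM_IO_REDIRECT, 0 };
	uint32_t left, room;

	if (io->nr_bytes == 0)
		return r;

	assert(offset < io->nr_bytes);
	left = io->nr_bytes - offset;
	room = SIM_CHUNK - (uint32_t)((io->start + offset) % SIM_CHUNK);

	r.bytes = left <= room ? left : room;
	r.rc = left <= room ? SIM_IO_QUEUED : SIM_IO_CONTINUE;
	return r;
}

/*
 * Stand-in for the driver-side loop: keep calling the callback while it
 * returns CONTINUE. Returns the number of callback invocations, or -1
 * if the command has to be forwarded to the ublk server.
 */
static int sim_run_handler(const struct sim_io *io)
{
	uint32_t offset = 0;
	int calls = 0;

	for (;;) {
		struct sim_ret r = sim_queue_io_cmd(io, offset);

		calls++;
		if (r.rc == SIM_IO_REDIRECT)
			return -1;
		offset += r.bytes;
		if (r.rc == SIM_IO_QUEUED)
			return calls;
		/* SIM_IO_CONTINUE: loop again for the remaining bytes */
	}
}
```

With these assumptions, a 10240-byte command starting at byte 1024 is queued
in three callback invocations (3072, 4096 and 3072 bytes), a 512-byte command
starting at byte 0 is queued in one, and a zero-length command is redirected.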


Thread overview: 28+ messages
2025-01-07 12:03 [RFC PATCH 00/22] ublk: support bpf Ming Lei
2025-01-07 12:03 ` [RFC PATCH 01/22] ublk: remove two unused fields from 'struct ublk_queue' Ming Lei
2025-01-07 12:03 ` [RFC PATCH 02/22] ublk: convert several bool type fields into bitfield of `ublk_queue` Ming Lei
2025-01-07 12:03 ` [RFC PATCH 03/22] ublk: add helper of ublk_need_map_io() Ming Lei
2025-01-07 12:03 ` [RFC PATCH 04/22] ublk: move ublk into one standalone directory Ming Lei
2025-01-07 12:03 ` [RFC PATCH 05/22] ublk: move private definitions into private header Ming Lei
2025-01-07 12:03 ` [RFC PATCH 06/22] ublk: move several helpers to " Ming Lei
2025-01-07 12:03 ` [RFC PATCH 07/22] ublk: bpf: add bpf prog attach helpers Ming Lei
2025-01-07 12:03 ` [RFC PATCH 08/22] ublk: bpf: add bpf struct_ops Ming Lei
2025-01-10  1:43   ` Alexei Starovoitov
2025-01-13  4:08     ` Ming Lei
2025-01-13 21:30       ` Alexei Starovoitov
2025-01-15 11:58         ` Ming Lei
2025-01-15 20:11           ` Amery Hung
2025-01-07 12:04 ` [RFC PATCH 09/22] ublk: bpf: attach bpf prog to ublk device Ming Lei
2025-01-07 12:04 ` [RFC PATCH 10/22] ublk: bpf: add kfunc for ublk bpf prog Ming Lei
2025-01-07 12:04 ` [RFC PATCH 11/22] ublk: bpf: enable ublk-bpf Ming Lei
2025-01-07 12:04 ` [RFC PATCH 12/22] selftests: ublk: add tests for the ublk-bpf initial implementation Ming Lei
2025-01-07 12:04 ` [RFC PATCH 13/22] selftests: ublk: add tests for covering io split Ming Lei
2025-01-07 12:04 ` [RFC PATCH 14/22] selftests: ublk: add tests for covering redirecting to userspace Ming Lei
2025-01-07 12:04 ` [RFC PATCH 15/22] ublk: bpf: add bpf aio kfunc Ming Lei
2025-01-07 12:04 ` [RFC PATCH 16/22] ublk: bpf: add bpf aio struct_ops Ming Lei
2025-01-07 12:04 ` [RFC PATCH 17/22] ublk: bpf: attach bpf aio prog to ublk device Ming Lei
2025-01-07 12:04 ` [RFC PATCH 18/22] ublk: bpf: add several ublk bpf aio kfuncs Ming Lei
2025-01-07 12:04 ` [RFC PATCH 19/22] ublk: bpf: wire bpf aio with ublk io handling Ming Lei
2025-01-07 12:04 ` [RFC PATCH 20/22] selftests: add tests for ublk bpf aio Ming Lei
2025-01-07 12:04 ` [RFC PATCH 21/22] selftests: add tests for covering both bpf aio and split Ming Lei
2025-01-07 12:04 ` Ming Lei [this message]
