From: Ming Lei <tom.leiming@gmail.com>
To: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org
Cc: bpf@vger.kernel.org, Alexei Starovoitov <ast@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Yonghong Song <yonghong.song@linux.dev>,
	Ming Lei <tom.leiming@gmail.com>
Subject: [RFC PATCH 22/22] ublk: document ublk-bpf & bpf-aio
Date: Tue,  7 Jan 2025 20:04:13 +0800
Message-ID: <20250107120417.1237392-23-tom.leiming@gmail.com>
In-Reply-To: <20250107120417.1237392-1-tom.leiming@gmail.com>

Document ublk-bpf motivation and implementation.

Document bpf-aio implementation.

Document ublk-bpf selftests.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
---
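The qcow2-style map lookup and the ``UBLK_BPF_IO_*`` return codes documented
in this patch can be modelled in plain userspace C. What follows is only a
sketch under stated assumptions, not the driver code: every ``model_*`` and
``MODEL_*`` name, the layout chosen for the return type, the 4096-byte cluster
size and the array standing in for a bpf map are hypothetical. It also does
not cover what the real driver does when a redirect follows a partial queue.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model only. Names mirror the documentation; the values, the
 * layout of the return type and the lookup table are assumptions.
 */
enum model_code {
	MODEL_IO_QUEUED,	/* stands in for UBLK_BPF_IO_QUEUED */
	MODEL_IO_REDIRECT,	/* stands in for UBLK_BPF_IO_REDIRECT */
	MODEL_IO_CONTINUE,	/* stands in for UBLK_BPF_IO_CONTINUE */
};

struct model_ret {		/* stands in for ublk_bpf_return_t */
	enum model_code code;
	uint32_t bytes;		/* bytes queued by this call */
};

struct model_io {
	uint64_t offset;	/* virtual start offset, in bytes */
	uint32_t len;		/* total length, in bytes */
	uint32_t done;		/* bytes queued so far */
};

#define MODEL_CLUSTER	4096u	/* hypothetical mapping granularity */
#define MODEL_NR	4u	/* clusters covered by the table */

/* Stands in for a bpf map: virtual cluster -> host cluster, -1 = unmapped. */
static int64_t model_map[MODEL_NR] = { 7, 3, 9, -1 };

/* Plays the role of ->queue_io_cmd(): handle at most one cluster per call. */
static struct model_ret model_queue_io_cmd(const struct model_io *io)
{
	uint64_t pos = io->offset + io->done;
	uint64_t idx = pos / MODEL_CLUSTER;
	uint32_t left = io->len - io->done;
	uint32_t room = MODEL_CLUSTER - (uint32_t)(pos % MODEL_CLUSTER);
	struct model_ret ret = { MODEL_IO_REDIRECT, 0 };

	/* Mapping miss: let the ublk server populate the mapping. */
	if (idx >= MODEL_NR || model_map[idx] < 0)
		return ret;

	ret.code = left <= room ? MODEL_IO_QUEUED : MODEL_IO_CONTINUE;
	ret.bytes = left <= room ? left : room;
	return ret;
}

/*
 * Plays the role of the driver loop: returns the number of handler calls
 * when the prog queued the whole command, or -1 when it was redirected.
 */
static int model_run_handler(struct model_io *io)
{
	int calls = 0;

	for (;;) {
		struct model_ret ret = model_queue_io_cmd(io);

		calls++;
		if (ret.code == MODEL_IO_REDIRECT)
			return -1;
		io->done += ret.bytes;
		if (ret.code == MODEL_IO_QUEUED)
			return calls;
		/* CONTINUE must make progress and leave bytes to queue. */
		assert(ret.bytes > 0 && io->done < io->len);
	}
}
```

In this model, a 10240-byte command at offset 1024 takes three handler calls
(3072 + 4096 + 3072 bytes), while a command that starts in the unmapped fourth
cluster is redirected on the first call.
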
 Documentation/block/ublk.rst | 170 +++++++++++++++++++++++++++++++++++
 1 file changed, 170 insertions(+)

diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index 51665a3e6a50..bf7a3df48036 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -309,6 +309,176 @@ with specified IO tag in the command data:
   ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
   the server buffer (pages) read to the IO request pages.
 
+
+UBLK-BPF support
+================
+
+Motivation
+----------
+
+- support stacking ublk
+
+  There are many third-party volume managers, and a ublk device may be built
+  on top of another ublk device to simplify implementation. However, multiple
+  userspace-kernel context switches for handling a single IO are unacceptable
+  from a performance point of view.
+
+  ublk-bpf avoids the user-kernel context switch in most of the fast IO path,
+  so ublk over ublk becomes possible.
+
+- complicated virtual block device
+
+  Many complicated virtual block devices have an admin & meta code path and a
+  normal IO fast path. Meta & admin IO handling is usually complicated, so it
+  can be moved to the ublk server to relieve the development burden, while the
+  IO fast path is kept in kernel space for the sake of high performance.
+
+  BPF provides a rich set of maps, which helps a lot for communication between
+  userspace and a prog, or between one prog and another.
+
+  One typical example is qcow2: meta IO handling can be kept in the ublk
+  server, and the fast IO path is moved to a bpf prog. An efficient bpf map is
+  looked up first to see whether the virtual LBA to host LBA mapping is
+  present. If it is, the IO is handled by ublk-bpf directly; otherwise it is
+  forwarded to the ublk server, which populates the mapping first.
+
+- some simple high performance virtual devices
+
+  For devices such as null & loop, the whole implementation can be moved into
+  a bpf prog.
+
+- provides a chance to get performance similar to that of a kernel driver
+
+  One round of kernel/user context switch is avoided, and one extra IO data
+  copy is saved.
+
+bpf aio
+-------
+
+bpf aio exports kfuncs that let a bpf prog submit & complete IO asynchronously.
+The IO completion handler is provided by the bpf aio user; it is also defined
+in a bpf prog (such as the ublk bpf prog), as a ``struct bpf_aio_complete_ops``
+bpf struct_ops.
+
+bpf aio is designed as a generic interface that can, in theory, be used by any
+bpf prog, and it may be moved to ``/lib/`` in the future if the interface
+becomes mature and stable enough.
+
+- bpf_aio_alloc()
+
+  Allocate one bpf aio instance of ``struct bpf_aio``.
+
+- bpf_aio_release()
+
+  Free one bpf aio instance of ``struct bpf_aio``.
+
+- bpf_aio_submit()
+
+  Submit one bpf aio instance of ``struct bpf_aio`` asynchronously.
+
+- ``struct bpf_aio_complete_ops``
+
+  Defines the bpf aio completion callback, implemented as a bpf struct_ops;
+  it is called when the submitted bpf aio completes.
+
+
+ublk bpf implementation
+-----------------------
+
+``struct ublk_bpf_ops`` is exported as a bpf struct_ops, so that a ublk IO
+command can be queued or handled in the callbacks defined in that struct_ops.
+The whole logic is in ``ublk_run_bpf_handler``:
+
+- ``UBLK_BPF_IO_QUEUED``
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns
+  ``UBLK_BPF_IO_QUEUED``, this IO command has been queued by the bpf prog, so
+  it won't be forwarded to the ublk server.
+
+- ``UBLK_BPF_IO_REDIRECT``
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns
+  ``UBLK_BPF_IO_REDIRECT``, this IO command is forwarded to the ublk server.
+
+- ``UBLK_BPF_IO_CONTINUE``
+
+  If ->queue_io_cmd() or ->queue_io_cmd_daemon() returns
+  ``UBLK_BPF_IO_CONTINUE``, only part of this IO command has been queued, and
+  ``ublk_bpf_return_t`` carries the number of bytes queued. The ublk driver
+  keeps calling the callback to queue the remaining bytes. This helps to
+  implement stacking devices by allowing an IO command to be split.
+
+ublk bpf provides the following kfuncs to queue and handle ublk IO commands:
+
+- ublk_bpf_complete_io()
+
+  Complete this ublk IO command
+
+- ublk_bpf_get_io_tag()
+
+  Get tag of this ublk IO command
+
+- ublk_bpf_get_queue_id()
+
+  Get queue id of this ublk IO command
+
+- ublk_bpf_get_dev_id()
+
+  Get device id of this ublk IO command
+
+- ublk_bpf_attach_and_prep_aio()
+
+  Attach a bpf aio to this ublk IO command and prepare it: the bpf aio buffer
+  is prepared and the aio's completion callback is set up, so the user prog
+  is notified when the bpf aio completes.
+
+- ublk_bpf_dettach_and_complete_aio()
+
+  Detach the bpf aio from this IO command. It is usually called from the bpf
+  aio's completion callback.
+
+- ublk_bpf_acquire_io_from_aio()
+
+  Acquire the ublk IO command from the aio. One typical use is to call
+  ublk_bpf_complete_io() to complete the ublk IO command.
+
+- ublk_bpf_release_io_from_aio()
+
+  Release the ublk IO command acquired by ublk_bpf_acquire_io_from_aio().
+
+
+Test
+----
+
+- Build the kernel, install the kernel headers, reboot and run the tests
+
+  Enable ``CONFIG_BLK_DEV_UBLK`` & ``CONFIG_UBLK_BPF``
+
+  ``make``
+
+  ``make headers_install INSTALL_HDR_PATH=/usr``
+
+  ``reboot``
+
+  ``make -C tools/testing/selftests TARGETS=ublk run_tests``
+
+The ublk selftests implement null, loop and stripe targets to cover all
+bpf features:
+
+- complete bpf IO handling
+
+- complete ublk server IO handling
+
+- mixed bpf prog and ublk server IO handling
+
+- bpf aio for loop & stripe
+
+- IO split via ``UBLK_BPF_IO_CONTINUE`` for implementing ublk-stripe
+
+Write & read verification, as well as mkfs.ext4, mount and umount, are run in
+the selftest.
+
+
 Future development
 ==================
 
-- 
2.47.0

