From: Kaitao cheng <kaitao.cheng@linux.dev>
To: axboe@kernel.dk, ast@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, martin.lau@linux.dev, eddyz87@gmail.com,
	memxor@gmail.com, song@kernel.org, yonghong.song@linux.dev,
	jolsa@kernel.org, john.fastabend@gmail.com
Cc: bpf@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Kaitao cheng <kaitao.cheng@linux.dev>
Subject: [RFC v2 0/3] block: Introduce a BPF-based I/O scheduler
Date: Sun,  3 May 2026 11:56:20 +0800
Message-ID: <20260503035623.28771-1-kaitao.cheng@linux.dev>

I have been working on a new BPF-based I/O scheduler. It has both kernel
and user-space parts. In kernel space, I implemented a simple per-ctx
based elevator that exposes a set of BPF hooks. The goal is to move the
policy side of I/O scheduling out of the kernel and into user space, which
should make it considerably more flexible and more widely applicable. To
verify that the whole stack works end to end, I wrote a simple BPF example
program. I am calling this feature the UFQ (User-programmable Flexible
Queueing) I/O scheduler.
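
As a rough illustration of the mechanism/policy split described above, here
is a minimal user-space sketch in plain C. It is not the UFQ code: every
name in it (struct req, struct policy_ops, elv_insert, elv_dispatch,
sorted_policy) is hypothetical, and plain function pointers stand in for
the BPF hooks. The elevator only owns the queue and calls the hooks; the
ordering decision lives entirely in the pluggable policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request: only the fields this sketch needs. */
struct req {
	unsigned long sector;
	struct req *next;
};

/*
 * Hypothetical policy hooks. In this sketch they are plain function
 * pointers; they stand in for hooks a BPF program would implement.
 */
struct policy_ops {
	void (*insert)(struct req **head, struct req *rq);
	struct req *(*dispatch)(struct req **head);
};

/* Mechanism side: owns the queue and delegates every decision. */
struct elevator {
	struct req *head;
	const struct policy_ops *ops;
};

static void elv_insert(struct elevator *e, struct req *rq)
{
	assert(e->ops && e->ops->insert);
	e->ops->insert(&e->head, rq);
}

static struct req *elv_dispatch(struct elevator *e)
{
	assert(e->ops && e->ops->dispatch);
	return e->ops->dispatch(&e->head);
}

/* Example policy: keep the list sorted by ascending sector. */
static void sorted_insert(struct req **head, struct req *rq)
{
	struct req **pos = head;

	while (*pos && (*pos)->sector <= rq->sector)
		pos = &(*pos)->next;
	rq->next = *pos;
	*pos = rq;
}

/* Example policy: dispatch the request at the front of the list. */
static struct req *pop_front(struct req **head)
{
	struct req *rq = *head;

	if (rq) {
		*head = rq->next;
		rq->next = NULL;
	}
	return rq;
}

static const struct policy_ops sorted_policy = {
	.insert = sorted_insert,
	.dispatch = pop_front,
};
```

Swapping sorted_policy for another struct policy_ops changes the
scheduling behaviour without touching elv_insert() or elv_dispatch(),
which is the kind of separation the hooks are meant to provide.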

This series depends on new BPF functionality that I have already posted to
the BPF community but that is not yet in mainline. Details are in this
thread:

https://lore.kernel.org/bpf/20260427165906.84420-1-kaitao.cheng@linux.dev/

To try it, you need to apply the patches from that series first.

I am looking for community feedback on whether this direction and the
implementation approach make sense, and what else we should consider.

TODO:
1. More thorough testing
2. Split the code into multiple sub-patches
3. Identify concrete use cases

Changes in v2:
- Remove bpf_request_put (Alexei Starovoitov)
- Update the UFQ README (Miguel Ojeda)
- Add bio merge support
- Fix synchronization issues during UFQ scheduler transitions

Link to v1:
https://lore.kernel.org/bpf/20260327114741.91500-1-pilgrimtao@gmail.com/

Kaitao Cheng (3):
  bpf: Add KF_SPIN_LOCK flag for kfuncs under bpf_spin_lock
  block: Introduce the UFQ I/O scheduler
  tools/ufq_iosched: add BPF example scheduler and build scaffolding

 block/Kconfig.iosched                         |   8 +
 block/Makefile                                |   1 +
 block/blk-merge.c                             |  28 +-
 block/blk-mq.c                                |   8 +-
 block/blk-mq.h                                |   2 +-
 block/blk.h                                   |   5 +
 block/ufq-bpfops.c                            | 241 +++++++
 block/ufq-iosched.c                           | 640 ++++++++++++++++++
 block/ufq-iosched.h                           |  64 ++
 block/ufq-kfunc.c                             | 131 ++++
 include/linux/btf.h                           |   1 +
 kernel/bpf/verifier.c                         |  20 +-
 tools/ufq_iosched/.gitignore                  |   2 +
 tools/ufq_iosched/Makefile                    | 262 +++++++
 tools/ufq_iosched/README.md                   | 145 ++++
 .../include/bpf-compat/gnu/stubs.h            |  12 +
 tools/ufq_iosched/include/ufq/common.bpf.h    |  75 ++
 tools/ufq_iosched/include/ufq/common.h        |  90 +++
 tools/ufq_iosched/include/ufq/simple_stat.h   |  23 +
 tools/ufq_iosched/ufq_simple.bpf.c            | 604 +++++++++++++++++
 tools/ufq_iosched/ufq_simple.c                | 120 ++++
 21 files changed, 2464 insertions(+), 18 deletions(-)
 create mode 100644 block/ufq-bpfops.c
 create mode 100644 block/ufq-iosched.c
 create mode 100644 block/ufq-iosched.h
 create mode 100644 block/ufq-kfunc.c
 create mode 100644 tools/ufq_iosched/.gitignore
 create mode 100644 tools/ufq_iosched/Makefile
 create mode 100644 tools/ufq_iosched/README.md
 create mode 100644 tools/ufq_iosched/include/bpf-compat/gnu/stubs.h
 create mode 100644 tools/ufq_iosched/include/ufq/common.bpf.h
 create mode 100644 tools/ufq_iosched/include/ufq/common.h
 create mode 100644 tools/ufq_iosched/include/ufq/simple_stat.h
 create mode 100644 tools/ufq_iosched/ufq_simple.bpf.c
 create mode 100644 tools/ufq_iosched/ufq_simple.c

-- 
2.50.1 (Apple Git-155)

