Linux Documentation
 help / color / mirror / Atom feed
From: Daniel Gomez <da.gomez@kernel.org>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>,
	linux-block@vger.kernel.org, linux-doc@vger.kernel.org,
	bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
	Luis Chamberlain <mcgrof@kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Brendan Gregg <brendan.d.gregg@gmail.com>,
	GOST <gost.dev@samsung.com>,
	Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Subject: Re: configurable block error injection
Date: Thu, 4 Jun 2026 13:06:13 +0200	[thread overview]
Message-ID: <d9a5b911-7692-4df2-b67c-c33bc80b61aa@kernel.org> (raw)
In-Reply-To: <20260602150503.GA6887@lst.de>

On 02/06/2026 17.05, Christoph Hellwig wrote:
> On Tue, Jun 02, 2026 at 11:58:25AM +0200, Daniel Gomez wrote:
>> I wonder if the block layer would be interested in moving block error
>> injection off the should_fail() fault injection framework and extending
>> the ALLOW_ERROR_INJECTION annotation instead and offloading all the
>> debugfs configuration logic (block/error-injection.c) into eBPF?
> 
> I've looked into plain ALLOW_ERROR_INJECTION-based injection and it
> is not very useful.  I didn't even now eBPF could use it, 

For context: Josef Bacik introduced it first for BPF only
(BPF_ALLOW_ERROR_INJECTION). Masami Hiramatsu then generalized it into
the error injection framework, renaming it to ALLOW_ERROR_INJECTION
and adding the fail_function debugfs interface (which calls
should_fail()). So annotating a function with ALLOW_ERROR_INJECTION
gives you both backends at once: debugfs (fail_function) and eBPF
(bpf_override_return()).

> but I
> looked into other eBPF injections and at least for my uses cases
> it was a bit of a mess.  

Agreed, and I had not looked closely enough at the series
before proposing the wrong primitives. ALLOW_ERROR_INJECTION /
bpf_override_return() are not sufficient here: bpf_override_return()
cannot set bio->bi_status or call bio_endio(), which I think is the
key operation here.

IIU the series correctly, and oversimplifying: when
injection is enabled and a bio matches, the block layer completes the
bio inline with the chosen blk_status_t (the status= rule from debugfs)
via bio_endio_status(). The submission path returns to the caller
immediately, with the bio already in the error state. Nothing is ever
sent to the device, but the completion path sees the injected error.

submit_bio() -> submit_bio_noacct() -> submit_bio_noacct_nocheck()
      +-- Path 1: no match -> continue normal IO submission
      +-- Path 2: match (diskN) -> blk_error_inject()
                     -> bio_endio_status(bio, inj->status)
                          -> bio_endio()
                     // error injected. bio completed

So this is bio mutating + a bio_endio() call, not a return override.
That can't be solved with ALLOW_ERROR_INJECTION. But we can use
BPF_PROG_TYPE_STRUCT_OPS instead: the kernel keeps ownership of the
bio_endio_status() call, and the eBPF program only drives the policy,
ie. which blk_status_t to return for a given bio, based on whatever 
heuristics it implements.

> I'd have to allow access to certain bio

With struct_ops the bio is passed to the ebpf side as read-only, bio 
fields can be read to decide the policy but cannot write them. Is 
read-only access to bio fields itself a concern?

> fields and would have create a stable UAPI for commands and status
> using the fake BTF struct access which really would not be a good
> idea here as we need to be able to change internals.  

That should not be a problem at all. With CO-RE (compile once, run
everywhere) the program resolves the bio field offsets against the
BTF of the kernel it loads on, so it adapts dynamically if the layout
changes. The contract is just the struct_ops callback signature: a
struct bio * argument and a blk_status_t return. And that doesn't imply
any UAPI commitment AFAIK.

> Additionally
> having fully BTF-enabled toolchains in test VMs is not great either.

Are you referring to the old BCC toolchain requirements [1]? This is
solved in CO-RE [2]. The toolchain (Clang/LLVM, pahole) stays on the
build host; the test VM only needs the prebuilt BPF object, libbpf at
runtime, and the kernel's own BTF (CONFIG_DEBUG_INFO_BTF). No compiler
or BTF toolchain is required inside the VM. Clang/LLVM 10+ is enough to
build CO-RE libbpf tools [3].

Link: https://ebpf.io/what-is-ebpf/#how-are-ebpf-programs-written [1]
Link: https://nakryiko.com/posts/bpf-portability-and-co-re/ [2]
Link: https://github.com/libbpf/libbpf#bpf-co-re-compile-once--run-everywhere [1]

> 
> I've also not actually found any good map type for range lookups,
> which is kinda essential here.

Are you referring to the bio sector range comparison in
__blk_error_inject()? I don't think that needs to be delegated to a
BPF map (Documentation/bpf/maps.rst). The ebpf side has direct access
to the bio fields, so it can apply the same sector/op filtering
__blk_error_inject() does today. That match is already a linear list
walk, so the ebpf program just runs the same [start, end] condition
check.

In summary: what do you think of evolving this series
into eBPF, but BPF_PROG_TYPE_STRUCT_OPS instead of
ALLOW_ERROR_INJECTION/bpf_override_return()?. The configurable debugfs
injection is a nice enhancement over what we had, but the matching is
static (op/start/nr_sectors/status/chance) per gendisk. A struct_ops
hook would expand the failure model to anything derivable from the bio
(plus any state the program keeps), with the kernel still owning the
bio_endio_status() mutation and only the policy moving out of the tree.
The benefit is the flexibility to express the policy without hard-coding
the model in the kernel.

The open question is whether programmable injection is something we want
to support in-tree, or whether more debugfs knobs suffice. Thoughts?

  reply	other threads:[~2026-06-04 11:06 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-02  5:45 configurable block error injection Christoph Hellwig
2026-06-02  5:45 ` [PATCH 1/9] block: remove ALLOW_ERROR_INJECTION for should_fail_bio Christoph Hellwig
2026-06-02  5:45 ` [PATCH 2/9] block: consolidate the calls to should_fail_bio Christoph Hellwig
2026-06-02  5:45 ` [PATCH 3/9] block: refactor should_fail_bio and should_fail_request Christoph Hellwig
2026-06-02  5:45 ` [PATCH 4/9] block: move the FAIL_MAKE_REQUEST symbol from lib/ to block/ Christoph Hellwig
2026-06-02  5:45 ` [PATCH 5/9] block: add a macro to initialize the status table Christoph Hellwig
2026-06-02  5:45 ` [PATCH 6/9] block: add a "tag" for block status codes Christoph Hellwig
2026-06-02  5:45 ` [PATCH 7/9] block: add a str_to_blk_op helper Christoph Hellwig
2026-06-02  5:45 ` [PATCH 8/9] block: add configurable error injection Christoph Hellwig
2026-06-02  9:42   ` Keith Busch
2026-06-02 14:46     ` Christoph Hellwig
2026-06-02 17:56   ` Randy Dunlap
2026-06-02  5:45 ` [PATCH 9/9] block: move the fail request code Christoph Hellwig
2026-06-02  9:43 ` configurable block error injection Keith Busch
2026-06-02  9:58 ` Daniel Gomez
2026-06-02 15:05   ` Christoph Hellwig
2026-06-04 11:06     ` Daniel Gomez [this message]
2026-06-05  7:23       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d9a5b911-7692-4df2-b67c-c33bc80b61aa@kernel.org \
    --to=da.gomez@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bpf@vger.kernel.org \
    --cc=brendan.d.gregg@gmail.com \
    --cc=corbet@lwn.net \
    --cc=gost.dev@samsung.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox