* configurable block error injection v3
@ 2026-06-08 5:14 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-08 5:14 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc

Hi all,

this series adds a new configurable block error injection facility.

We already have a few ways to inject block errors, but unfortunately most
of them are either not very useful or hard to use, or both:

 - The fail_make_request failure injection point can't distinguish
   different commands or different ranges in the file, and can only inject
   plain I/O errors.
 - The should_fail_bio 'dynamic' failure injection has all the same issues
   as fail_make_request.
 - dm-error can only fail all commands in the table using BLK_STS_IOERR
   and requires setting up a new block device.
 - dm-flakey and dm-dust allow all kinds of configurability, but still
   don't have good error selection or good support for non-read/write
   commands, and are limited to the dm table alignment requirements,
   which for zoned devices enforces setting them up for an entire zone.
   They also once again require setting up a stacked block device,
   which is really annoying in harnesses like xfstests.

This series adds a new debugfs-based block layer error injection
that allows configuring what operations and ranges the injection
applies to, and what status to return.  It also allows configuring a
failure ratio similar to the xfs errortag injection.

Changes since v2:
 - improve the documentation a bit
 - fix a spelling mistake in a comment

Changes since v1:
 - drop the should_fail_bio removal and cleanup depending on it, as it's
   used by eBPF programs and thus a hidden UABI.
 - as a result split the code out to its own Kconfig symbol
 - various error handling fixes pointed out by Keith
 - documentation spelling fixes pointed out by Randy

Diffstat:
 Documentation/block/error-injection.rst |   59 ++++++
 Documentation/block/index.rst           |    1
 block/Kconfig                           |    7
 block/Makefile                          |    1
 block/blk-core.c                        |   86 ++++++--
 block/blk-sysfs.c                       |    4
 block/blk.h                             |   15 +
 block/error-injection.c                 |  308 ++++++++++++++++++++++++++++++++
 block/genhd.c                           |    4
 include/linux/blkdev.h                  |    6
 10 files changed, 471 insertions(+), 20 deletions(-)
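
The range selection and failure ratio described in the cover letter can be illustrated with a small userspace sketch. This is only a model under stated assumptions: `struct demo_rule` and both functions are hypothetical names, the random number is passed in by the caller so the behaviour is deterministic, and the overlap test mirrors the comparison used in patch 4 (inclusive `end`, exclusive I/O end sector).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one injection rule. */
struct demo_rule {
	uint64_t start;		/* first sector covered by the rule */
	uint64_t end;		/* last sector covered, inclusive */
	unsigned int chance;	/* fail one in "chance" matching I/Os */
};

/* True if the I/O [io_start, io_start + io_sectors) touches the rule's range. */
bool demo_rule_overlaps(const struct demo_rule *r, uint64_t io_start,
			uint64_t io_sectors)
{
	return io_start <= r->end && io_start + io_sectors > r->start;
}

/*
 * Decide whether an I/O should be failed.  "rnd" is a caller-supplied
 * random number (assumption: the kernel would draw it itself).
 */
bool demo_rule_should_fail(const struct demo_rule *r, uint64_t io_start,
			   uint64_t io_sectors, uint32_t rnd)
{
	if (!demo_rule_overlaps(r, io_start, io_sectors))
		return false;
	/* chance <= 1 means "always"; otherwise fail only when rnd hits 0 */
	if (r->chance > 1 && rnd % r->chance != 0)
		return false;
	return true;
}
```

With a rule covering sectors 8 to 15 and `chance = 10`, an I/O ending exactly at sector 8 does not match, while one that reaches into sector 8 matches on roughly every tenth draw.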

* [PATCH 1/4] block: add a macro to initialize the status table
@ 2026-06-08 5:14 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-08 5:14 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

Prepare for adding a new value to the error table by adding a macro
to fill it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
 block/blk-core.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index b0f0a304ea0b..1614323282f1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,39 +132,44 @@ inline const char *blk_op_str(enum req_op op)
 }
 EXPORT_SYMBOL_GPL(blk_op_str);
 
+#define ENT(_tag, _errno, _desc) \
+[BLK_STS_##_tag] = { \
+	.errno = _errno, \
+	.name = _desc, \
+}
 static const struct {
 	int errno;
 	const char *name;
 } blk_errors[] = {
-	[BLK_STS_OK] = { 0, "" },
-	[BLK_STS_NOTSUPP] = { -EOPNOTSUPP, "operation not supported" },
-	[BLK_STS_TIMEOUT] = { -ETIMEDOUT, "timeout" },
-	[BLK_STS_NOSPC] = { -ENOSPC, "critical space allocation" },
-	[BLK_STS_TRANSPORT] = { -ENOLINK, "recoverable transport" },
-	[BLK_STS_TARGET] = { -EREMOTEIO, "critical target" },
-	[BLK_STS_RESV_CONFLICT] = { -EBADE, "reservation conflict" },
-	[BLK_STS_MEDIUM] = { -ENODATA, "critical medium" },
-	[BLK_STS_PROTECTION] = { -EILSEQ, "protection" },
-	[BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
-	[BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
-	[BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
-	[BLK_STS_OFFLINE] = { -ENODEV, "device offline" },
+	ENT(OK, 0, ""),
+	ENT(NOTSUPP, -EOPNOTSUPP, "operation not supported"),
+	ENT(TIMEOUT, -ETIMEDOUT, "timeout"),
+	ENT(NOSPC, -ENOSPC, "critical space allocation"),
+	ENT(TRANSPORT, -ENOLINK, "recoverable transport"),
+	ENT(TARGET, -EREMOTEIO, "critical target"),
+	ENT(RESV_CONFLICT, -EBADE, "reservation conflict"),
+	ENT(MEDIUM, -ENODATA, "critical medium"),
+	ENT(PROTECTION, -EILSEQ, "protection"),
+	ENT(RESOURCE, -ENOMEM, "kernel resource"),
+	ENT(DEV_RESOURCE, -EBUSY, "device resource"),
+	ENT(AGAIN, -EAGAIN, "nonblocking retry"),
+	ENT(OFFLINE, -ENODEV, "device offline"),
 
 	/* device mapper special case, should not leak out: */
-	[BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
+	ENT(DM_REQUEUE, -EREMCHG, "dm internal retry"),
 
 	/* zone device specific errors */
-	[BLK_STS_ZONE_OPEN_RESOURCE] = { -ETOOMANYREFS, "open zones exceeded" },
-	[BLK_STS_ZONE_ACTIVE_RESOURCE] = { -EOVERFLOW, "active zones exceeded" },
+	ENT(ZONE_OPEN_RESOURCE, -ETOOMANYREFS, "open zones exceeded"),
+	ENT(ZONE_ACTIVE_RESOURCE, -EOVERFLOW, "active zones exceeded"),
 
 	/* Command duration limit device-side timeout */
-	[BLK_STS_DURATION_LIMIT] = { -ETIME, "duration limit exceeded" },
-
-	[BLK_STS_INVAL] = { -EINVAL, "invalid" },
+	ENT(DURATION_LIMIT, -ETIME, "duration limit exceeded"),
+	ENT(INVAL, -EINVAL, "invalid"),
 
 	/* everything else not covered above: */
-	[BLK_STS_IOERR] = { -EIO, "I/O" },
+	ENT(IOERR, -EIO, "I/O"),
 };
+#undef ENT
 
 blk_status_t errno_to_blk_status(int errno)
 {
-- 
2.53.0

* Re: [PATCH 1/4] block: add a macro to initialize the status table
@ 2026-06-08 21:51 Bart Van Assche
From: Bart Van Assche @ 2026-06-08 21:51 UTC
To: Christoph Hellwig, Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

On 6/7/26 10:14 PM, Christoph Hellwig wrote:
> Prepare for adding a new value to the error table by adding a macro
> to fill it.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
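
For readers unfamiliar with the pattern in patch 1, here is a minimal userspace sketch of a table filled through a designated-initializer macro. The `DEMO_STS_*` values and all `demo_*` names are hypothetical and deliberately sparse; the field is called `err` rather than `errno` because `errno` is a macro in userspace C.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical status codes standing in for the kernel's BLK_STS_* values. */
enum demo_status {
	DEMO_STS_OK = 0,
	DEMO_STS_NOSPC = 3,
	DEMO_STS_IOERR = 10,
};

struct demo_error {
	int err;		/* negative errno value */
	const char *name;	/* human-readable description */
};

/* One macro invocation fills one slot, indexed by the status value. */
#define ENT(_tag, _errno, _desc)	\
[DEMO_STS_##_tag] = {			\
	.err	= _errno,		\
	.name	= _desc,		\
}
static const struct demo_error demo_errors[] = {
	ENT(OK,		0,		""),
	ENT(NOSPC,	-ENOSPC,	"critical space allocation"),
	ENT(IOERR,	-EIO,		"I/O"),
};
#undef ENT

#define DEMO_NR_ERRORS (sizeof(demo_errors) / sizeof(demo_errors[0]))

/* Out-of-range values fall back to -EIO in this sketch. */
int demo_status_to_errno(int status)
{
	if (status < 0 || (size_t)status >= DEMO_NR_ERRORS)
		return -EIO;
	return demo_errors[status].err;
}
```

Adding a field later only requires touching the macro and the struct, not every table entry, which is the preparation step the commit message refers to. Slots that no `ENT()` line names are zero-initialized, so their `name` is NULL.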

* [PATCH 2/4] block: add a "tag" for block status codes
@ 2026-06-08 5:14 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-08 5:14 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

The full name of the status codes is not good for user interfaces as it
can contain white spaces.  Add the name of the status code without the
BLK_STS_ prefix as a tag so that it can be used for user interfaces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
 block/blk-core.c | 28 ++++++++++++++++++++++++++++
 block/blk.h      |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1614323282f1..7aa9cd110bdd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -135,10 +135,12 @@ EXPORT_SYMBOL_GPL(blk_op_str);
 #define ENT(_tag, _errno, _desc) \
 [BLK_STS_##_tag] = { \
 	.errno = _errno, \
+	.tag = __stringify(_tag), \
 	.name = _desc, \
 }
 static const struct {
 	int errno;
+	const char *tag;
 	const char *name;
 } blk_errors[] = {
 	ENT(OK, 0, ""),
@@ -203,6 +205,32 @@ const char *blk_status_to_str(blk_status_t status)
 	return blk_errors[idx].name;
 }
 
+const char *blk_status_to_tag(blk_status_t status)
+{
+	int idx = (__force int)status;
+
+	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors)))
+		return "<null>";
+	return blk_errors[idx].tag;
+}
+
+blk_status_t tag_to_blk_status(const char *tag)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(blk_errors); i++) {
+		if (blk_errors[i].tag &&
+		    !strcmp(blk_errors[i].tag, tag))
+			return (__force blk_status_t)i;
+	}
+
+	/*
+	 * Return BLK_STS_OK for mismatches as this function is intended to
+	 * parse error status values.
+	 */
+	return BLK_STS_OK;
+}
+
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
  * @q: the queue
diff --git a/block/blk.h b/block/blk.h
index 1a2d9101bba0..0eb8e932ec66 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -50,6 +50,8 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 void blk_free_flush_queue(struct blk_flush_queue *q);
 
 const char *blk_status_to_str(blk_status_t status);
+const char *blk_status_to_tag(blk_status_t status);
+blk_status_t tag_to_blk_status(const char *tag);
 
 bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 bool blk_queue_start_drain(struct request_queue *q);
-- 
2.53.0

* Re: [PATCH 2/4] block: add a "tag" for block status codes
@ 2026-06-08 21:55 Bart Van Assche
From: Bart Van Assche @ 2026-06-08 21:55 UTC
To: Christoph Hellwig, Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

On 6/7/26 10:14 PM, Christoph Hellwig wrote:
> +const char *blk_status_to_tag(blk_status_t status)
> +{
> +	int idx = (__force int)status;
> +
> +	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors)))
> +		return "<null>";
> +	return blk_errors[idx].tag;
> +}

Since designated initializers are used to initialize blk_errors[], it's
probably a good idea to check the value of blk_errors[idx].tag, e.g. as
follows:

	return blk_errors[idx].tag ?: "<null>";

Thanks,

Bart.

* Re: [PATCH 2/4] block: add a "tag" for block status codes
@ 2026-06-09 7:43 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-09 7:43 UTC
To: Bart Van Assche
Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
Hannes Reinecke, Keith Busch, linux-block, linux-doc, Hannes Reinecke

On Mon, Jun 08, 2026 at 02:55:20PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +const char *blk_status_to_tag(blk_status_t status)
>> +{
>> +	int idx = (__force int)status;
>> +
>> +	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors)))
>> +		return "<null>";
>> +	return blk_errors[idx].tag;
>> +}
>
> Since designated initializers are used to initialize blk_errors[], it's
> probably a good idea to check the value of blk_errors[idx].tag, e.g. as
> follows:
>
> 	return blk_errors[idx].tag ?: "<null>";

I'd go for the good old and readable if statement, but yes, I can add
extra error checking here.
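
The tag lookup and the extra check discussed in this sub-thread can be sketched in userspace as follows. This is an illustration only: the `DEMO_STS_*` values and `demo_*` functions are hypothetical, and the sketch uses the preprocessor's `#` operator directly where the patch uses the kernel's `__stringify()`. The forward lookup uses a plain if statement for the hole check, as suggested in the reply above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately sparse status values (not the kernel's). */
enum demo_status {
	DEMO_STS_OK = 0,
	DEMO_STS_MEDIUM = 7,
	DEMO_STS_IOERR = 10,
};

/* #_tag turns the macro argument into the string used as the tag. */
#define ENT(_tag, _desc)		\
[DEMO_STS_##_tag] = {			\
	.tag	= #_tag,		\
	.name	= _desc,		\
}
static const struct {
	const char *tag;
	const char *name;
} demo_errors[] = {
	ENT(OK,		""),
	ENT(MEDIUM,	"critical medium"),
	ENT(IOERR,	"I/O"),
};
#undef ENT

#define DEMO_NR_ERRORS (sizeof(demo_errors) / sizeof(demo_errors[0]))

/* Forward lookup: guard both the array bound and unset (NULL) slots. */
const char *demo_status_to_tag(int status)
{
	if (status < 0 || (size_t)status >= DEMO_NR_ERRORS)
		return "<null>";
	if (!demo_errors[status].tag)
		return "<null>";
	return demo_errors[status].tag;
}

/* Reverse lookup: skip holes, return the OK value when nothing matches. */
int demo_tag_to_status(const char *tag)
{
	size_t i;

	for (i = 0; i < DEMO_NR_ERRORS; i++) {
		if (demo_errors[i].tag && !strcmp(demo_errors[i].tag, tag))
			return (int)i;
	}
	return DEMO_STS_OK;
}
```

Because a mismatch and the literal tag `OK` both yield the zero value, a caller that only accepts error statuses can treat zero as "invalid", which is how patch 4 uses the helper.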

* [PATCH 3/4] block: add a str_to_blk_op helper
@ 2026-06-08 5:14 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-08 5:14 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

Add a helper to find the REQ_OP_XYZ constant from the "XYZ" string.
This will be used for the error injection debugfs interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
 block/blk-core.c | 10 ++++++++++
 block/blk.h      |  1 +
 2 files changed, 11 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7aa9cd110bdd..aa90aad6da13 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,6 +132,16 @@ inline const char *blk_op_str(enum req_op op)
 }
 EXPORT_SYMBOL_GPL(blk_op_str);
 
+enum req_op str_to_blk_op(const char *op)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
+		if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
+			return (enum req_op)i;
+	return REQ_OP_LAST;
+}
+
 #define ENT(_tag, _errno, _desc) \
 [BLK_STS_##_tag] = { \
 	.errno = _errno, \
diff --git a/block/blk.h b/block/blk.h
index 0eb8e932ec66..e8b7d5517086 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -52,6 +52,7 @@ void blk_free_flush_queue(struct blk_flush_queue *q);
 const char *blk_status_to_str(blk_status_t status);
 const char *blk_status_to_tag(blk_status_t status);
 blk_status_t tag_to_blk_status(const char *tag);
+enum req_op str_to_blk_op(const char *op);
 
 bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 bool blk_queue_start_drain(struct request_queue *q);
-- 
2.53.0

* Re: [PATCH 3/4] block: add a str_to_blk_op helper
@ 2026-06-08 21:57 Bart Van Assche
From: Bart Van Assche @ 2026-06-08 21:57 UTC
To: Christoph Hellwig, Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

On 6/7/26 10:14 PM, Christoph Hellwig wrote:
> +enum req_op str_to_blk_op(const char *op)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
> +		if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
> +			return (enum req_op)i;
> +	return REQ_OP_LAST;
> +}

The above function is similar but not identical to
__sysfs_match_string(). Is __sysfs_match_string() good enough in this
context?

Thanks,

Bart.

* Re: [PATCH 3/4] block: add a str_to_blk_op helper
@ 2026-06-09 7:45 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-09 7:45 UTC
To: Bart Van Assche
Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
Hannes Reinecke, Keith Busch, linux-block, linux-doc, Hannes Reinecke

On Mon, Jun 08, 2026 at 02:57:40PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +enum req_op str_to_blk_op(const char *op)
>> +{
>> +	int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
>> +		if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
>> +			return (enum req_op)i;
>> +	return REQ_OP_LAST;
>> +}
> The above function is similar but not identical to
> __sysfs_match_string(). Is __sysfs_match_string() good enough in this
> context?

__sysfs_match_string exits as soon as an array entry is NULL, but
blk_status values are not fully contiguous, so no.
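
The difference raised in this exchange can be shown with a small userspace sketch. Both functions and the table are hypothetical; `demo_match_stop_at_null()` only models a helper that treats the first NULL entry as the end of the array (the behaviour attributed to `__sysfs_match_string()` in the reply above), and the indices do not claim to match real REQ_OP_* values.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sparse name table: indices 2 and 4 are holes (NULL). */
static const char *const demo_op_name[] = {
	[0] = "READ",
	[1] = "WRITE",
	[3] = "DISCARD",
	[5] = "WRITE_ZEROES",
};

#define DEMO_NR_OPS (sizeof(demo_op_name) / sizeof(demo_op_name[0]))

/* Models a matcher that gives up at the first NULL entry. */
int demo_match_stop_at_null(const char *str)
{
	size_t i;

	for (i = 0; i < DEMO_NR_OPS; i++) {
		if (!demo_op_name[i])
			break;
		if (!strcmp(demo_op_name[i], str))
			return (int)i;
	}
	return -1;
}

/* Models the patch's approach: skip holes and keep scanning. */
int demo_match_skip_null(const char *str)
{
	size_t i;

	for (i = 0; i < DEMO_NR_OPS; i++) {
		if (demo_op_name[i] && !strcmp(demo_op_name[i], str))
			return (int)i;
	}
	return -1;
}
```

Entries before the first hole are found by both variants; anything after it is only found by the variant that skips NULL slots.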

* [PATCH 4/4] block: add configurable error injection
@ 2026-06-08 5:14 Christoph Hellwig
From: Christoph Hellwig @ 2026-06-08 5:14 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

Add a new block error injection interface that allows to inject specific
status code for specific ranges.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
 Documentation/block/error-injection.rst |  59 +++++
 Documentation/block/index.rst           |   1 +
 block/Kconfig                           |   7 +
 block/Makefile                          |   1 +
 block/blk-core.c                        |   3 +
 block/blk-sysfs.c                       |   4 +
 block/blk.h                             |  12 +
 block/error-injection.c                 | 308 ++++++++++++++++++++++++
 block/genhd.c                           |   4 +
 include/linux/blkdev.h                  |   6 +
 10 files changed, 405 insertions(+)
 create mode 100644 Documentation/block/error-injection.rst
 create mode 100644 block/error-injection.c

diff --git a/Documentation/block/error-injection.rst b/Documentation/block/error-injection.rst
new file mode 100644
index 000000000000..a96b7af362c5
--- /dev/null
+++ b/Documentation/block/error-injection.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Configurable Error Injection
+============================
+
+Overview
+--------
+
+Configurable error injection allows injecting specific block layer status codes
+for ranges of a block device. Errors can be injected unconditionally, or with a
+given probability.
+
+To use configurable error injection, CONFIG_BLK_ERROR_INJECTION must be enabled.
+
+The only interface is the error_injection debugfs file, which is created for
+each registered gendisk. Writes to this file are used to create or delete rules
+and reads return a list of the current error injection sites.
+
+Options
+-------
+
+The following options specify the operations:
+
+=================== =======================================================
+add                 add a new rule
+removeall           remove all existing rules
+=================== =======================================================
+
+The following options specify the details of the rule for the add operation:
+
+=================== =======================================================
+op=<string>         block layer operation this rule applies to. This uses
+                    the XYZ for each REQ_OP_XYZ operation, e.g. READ, WRITE
+                    or DISCARD. Mandatory.
+status=<string>     Status to return. This uses XYZ for each BLK_STS_XYZ
+                    code, e.g. IOERR or MEDIUM. Mandatory.
+start=<number>      First block layer sector the rule applies to.
+                    Optional, defaults to 0.
+nr_sectors=<number> Number of sectors this rule applies.
+                    Optional, defaults to the remainder of the device.
+chance=<number>     Only return a failure with a likelihood of 1/chance.
+                    Optional, defaults to 1 (always).
+=================== =======================================================
+
+Example
+-------
+
+Return BLK_STS_IOERR for one in 10 reads of sector 0 of /dev/nvme0n1:
+
+    $ echo 'add,op=READ,start=0,status=IOERR,chance=10' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Return BLK_STS_MEDIUM for every write to /dev/nvme0n1:
+
+    $ echo 'add,op=WRITE,start=0,status=MEDIUM' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Remove all rules for /dev/nvme0n1:
+
+    $ echo 'removeall' > /sys/kernel/debug/block/nvme0n1/error_injection
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index 9fea696f9daa..bfa1bbd31ddf 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -22,3 +22,4 @@ Block
    switching-sched
    writeback_cache_control
    ublk
+   error-injection
diff --git a/block/Kconfig b/block/Kconfig
index 15027963472d..7651b86eed56 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -221,6 +221,13 @@ config BLOCK_HOLDER_DEPRECATED
 config BLK_MQ_STACKING
 	bool
 
+config BLK_ERROR_INJECTION
+	bool "Enable block layer error injection"
+	help
+	  Enable inserting arbitrary block errors through a debugfs interface.
+
+	  See Documentation/block/error-injection.rst for details.
+
 source "block/Kconfig.iosched"
 
 endif # BLOCK
diff --git a/block/Makefile b/block/Makefile
index 54130faacc21..e7bd320e3d69 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -13,6 +13,7 @@ obj-y := bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \
 			genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \
 			disk-events.o blk-ia-ranges.o early-lookup.o
 
+obj-$(CONFIG_BLK_ERROR_INJECTION) += error-injection.o
 obj-$(CONFIG_BLK_DEV_BSG_COMMON) += bsg.o
 obj-$(CONFIG_BLK_DEV_BSGLIB) += bsg-lib.o
 obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o
diff --git a/block/blk-core.c b/block/blk-core.c
index aa90aad6da13..268735582ef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -767,6 +767,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
 
 void submit_bio_noacct_nocheck(struct bio *bio, bool split)
 {
+	if (unlikely(blk_error_inject(bio)))
+		return;
+
 	blk_cgroup_bio_start(bio);
 
 	if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f22c1f253eb3..8a0c2be48a31 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -933,6 +933,8 @@ static void blk_debugfs_remove(struct gendisk *disk)
 
 	blk_debugfs_lock_nomemsave(q);
 	blk_trace_shutdown(q);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_exit(disk);
 	debugfs_remove_recursive(q->debugfs_dir);
 	q->debugfs_dir = NULL;
 	q->sched_debugfs_dir = NULL;
@@ -963,6 +965,8 @@ int blk_register_queue(struct gendisk *disk)
 
 	memflags = blk_debugfs_lock(q);
 	q->debugfs_dir = debugfs_create_dir(disk->disk_name, blk_debugfs_root);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_init(disk);
 	if (queue_is_mq(q))
 		blk_mq_debugfs_register(q);
 	blk_debugfs_unlock(q, memflags);
diff --git a/block/blk.h b/block/blk.h
index e8b7d5517086..10df23b2cb90 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -660,6 +660,18 @@ static inline bool should_fail_request(struct block_device *part,
 }
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+void blk_error_injection_init(struct gendisk *disk);
+void blk_error_injection_exit(struct gendisk *disk);
+bool __blk_error_inject(struct bio *bio);
+static inline bool blk_error_inject(struct bio *bio)
+{
+	if (!IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		return false;
+	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
+		return false;
+	return __blk_error_inject(bio);
+}
+
 /*
  * Optimized request reference counting. Ideally we'd make timeouts be more
  * clever, as that's the only reason we need references at all... But until
diff --git a/block/error-injection.c b/block/error-injection.c
new file mode 100644
index 000000000000..3ca4ad297683
--- /dev/null
+++ b/block/error-injection.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2026 Christoph Hellwig.
+ */
+#include <linux/debugfs.h>
+#include <linux/blkdev.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include "blk.h"
+
+struct blk_error_inject {
+	struct list_head entry;
+	sector_t start;
+	sector_t end;
+	enum req_op op;
+	blk_status_t status;
+
+	/* only inject every 1 / chance times */
+	unsigned int chance;
+};
+
+bool __blk_error_inject(struct bio *bio)
+{
+	struct gendisk *disk = bio->bi_bdev->bd_disk;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		if (bio->bi_iter.bi_sector <= inj->end &&
+		    bio_end_sector(bio) > inj->start &&
+		    bio_op(bio) == inj->op) {
+			blk_status_t status = inj->status;
+
+			if (inj->chance > 1 &&
+			    (get_random_u32() % inj->chance) != 0)
+				continue;
+
+			pr_info_ratelimited("%pg: injecting %s error for %s at sector %llu:%u\n",
+				disk->part0,
+				blk_status_to_str(status),
+				blk_op_str(inj->op),
+				bio->bi_iter.bi_sector,
+				bio_sectors(bio));
+			rcu_read_unlock();
+			bio_endio_status(bio, status);
+			return true;
+		}
+	}
+	rcu_read_unlock();
+	return false;
+}
+
+static int error_inject_add(struct gendisk *disk, enum req_op op,
+		sector_t start, u64 nr_sectors, blk_status_t status,
+		unsigned int chance)
+{
+	struct blk_error_inject *inj;
+	int error = -EINVAL;
+
+	if (op == REQ_OP_LAST)
+		return -EINVAL;
+	if (status == BLK_STS_OK)
+		return -EINVAL;
+
+	inj = kzalloc_obj(*inj);
+	if (!inj)
+		return -ENOMEM;
+
+	if (nr_sectors) {
+		if (U64_MAX - nr_sectors < start)
+			goto out_free_inj;
+		inj->end = start + nr_sectors - 1;
+	} else {
+		inj->end = U64_MAX;
+	}
+
+	inj->op = op;
+	inj->start = start;
+	inj->status = status;
+	inj->chance = chance;
+
+	pr_debug_ratelimited("%pg: adding %s injection for %s at sector %llu:%llu\n",
+		disk->part0, blk_status_to_str(status),
+		blk_op_str(op),
+		start, nr_sectors);
+
+	/*
+	 * Add to the front of the list so that newer entries can partially
+	 * override other entries. This also intentionally allows duplicate
+	 * entries as there is no real reason to reject them.
+	 */
+	mutex_lock(&disk->error_injection_lock);
+	if (!disk_live(disk)) {
+		mutex_unlock(&disk->error_injection_lock);
+		error = -ENODEV;
+		goto out_free_inj;
+	}
+	list_add_rcu(&inj->entry, &disk->error_injection_list);
+	set_bit(GD_ERROR_INJECT, &disk->state);
+	mutex_unlock(&disk->error_injection_lock);
+	return 0;
+
+out_free_inj:
+	kfree(inj);
+	return error;
+}
+
+static void error_inject_removall(struct gendisk *disk)
+{
+	struct blk_error_inject *inj;
+
+	mutex_lock(&disk->error_injection_lock);
+	clear_bit(GD_ERROR_INJECT, &disk->state);
+	while ((inj = list_first_entry_or_null(&disk->error_injection_list,
+			struct blk_error_inject, entry))) {
+		list_del_rcu(&inj->entry);
+		mutex_unlock(&disk->error_injection_lock);
+
+		kfree_rcu_mightsleep(inj);
+
+		mutex_lock(&disk->error_injection_lock);
+	}
+	mutex_unlock(&disk->error_injection_lock);
+}
+
+enum options {
+	Opt_add = (1u << 0),
+	Opt_removeall = (1u << 1),
+
+	Opt_op = (1u << 16),
+	Opt_start = (1u << 17),
+	Opt_nr_sectors = (1u << 18),
+	Opt_status = (1u << 19),
+	Opt_chance = (1u << 20),
+
+	Opt_invalid,
+};
+
+static const match_table_t opt_tokens = {
+	{ Opt_add, "add", },
+	{ Opt_removeall, "removeall", },
+	{ Opt_op, "op=%s", },
+	{ Opt_start, "start=%u" },
+	{ Opt_nr_sectors, "nr_sectors=%u" },
+	{ Opt_status, "status=%s" },
+	{ Opt_chance, "chance=%u" },
+	{ Opt_invalid, NULL, },
+};
+
+static int match_op(substring_t *args, enum req_op *op)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*op = str_to_blk_op(tag);
+	if (*op == REQ_OP_LAST)
+		pr_warn("invalid op '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static int match_status(substring_t *args, blk_status_t *status)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*status = tag_to_blk_status(tag);
+	if (!*status)
+		pr_warn("invalid status '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static ssize_t blk_error_injection_parse_options(struct gendisk *disk,
+		char *options)
+{
+	enum { Unset, Add, Removeall } action = Unset;
+	unsigned int option_mask = 0, chance = 1;
+	enum req_op op = REQ_OP_LAST;
+	u64 start = 0, nr_sectors = 0;
+	blk_status_t status = BLK_STS_OK;
+	substring_t args[MAX_OPT_ARGS];
+	char *p;
+
+	while ((p = strsep(&options, ",\n")) != NULL) {
+		int error = 0;
+		ssize_t token;
+
+		if (!*p)
+			continue;
+		token = match_token(p, opt_tokens, args);
+		option_mask |= token;
+		switch (token) {
+		case Opt_add:
+			if (action != Unset)
+				return -EINVAL;
+			action = Add;
+			break;
+		case Opt_removeall:
+			if (action != Unset)
+				return -EINVAL;
+			action = Removeall;
+			break;
+		case Opt_op:
+			error = match_op(args, &op);
+			break;
+		case Opt_start:
+			error = match_u64(args, &start);
+			break;
+		case Opt_nr_sectors:
+			error = match_u64(args, &nr_sectors);
+			break;
+		case Opt_status:
+			error = match_status(args, &status);
+			break;
+		case Opt_chance:
+			error = match_uint(args, &chance);
+			if (!error && chance == 0)
+				error = -EINVAL;
+			break;
+		default:
+			pr_warn("unknown parameter or missing value '%s'\n", p);
+			error = -EINVAL;
+		}
+		if (error)
+			return error;
+	}
+
+	switch (action) {
+	case Add:
+		return error_inject_add(disk, op, start, nr_sectors, status,
+				chance);
+	case Removeall:
+		if (option_mask & ~Opt_removeall)
+			return -EINVAL;
+		error_inject_removall(disk);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t blk_error_injection_write(struct file *file,
+		const char __user *ubuf, size_t count, loff_t *pos)
+{
+	struct gendisk *disk = file_inode(file)->i_private;
+	char *options;
+	int error;
+
+	options = memdup_user_nul(ubuf, count);
+	if (IS_ERR(options))
+		return PTR_ERR(options);
+	error = blk_error_injection_parse_options(disk, options);
+	kfree(options);
+
+	if (error)
+		return error;
+	return count;
+}
+
+static int blk_error_injection_show(struct seq_file *s, void *private)
+{
+	struct gendisk *disk = s->private;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		seq_printf(s, "%llu:%llu status=%s,chance=%u",
+			inj->start, inj->end,
+			blk_status_to_tag(inj->status), inj->chance);
+		seq_putc(s, '\n');
+	}
+	rcu_read_unlock();
+	return 0;
+}
+
+static int blk_error_injection_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, blk_error_injection_show, inode->i_private);
+}
+
+static int blk_error_injection_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations blk_error_injection_fops = {
+	.owner = THIS_MODULE,
+	.write = blk_error_injection_write,
+	.read = seq_read,
+	.open = blk_error_injection_open,
+	.release = blk_error_injection_release,
+};
+
+void blk_error_injection_init(struct gendisk *disk)
+{
+	debugfs_create_file("error_injection", 0600, disk->queue->debugfs_dir,
+			disk, &blk_error_injection_fops);
+}
+
+void blk_error_injection_exit(struct gendisk *disk)
+{
+	error_inject_removall(disk);
+}
diff --git a/block/genhd.c b/block/genhd.c
index 7d6854fd28e9..f84b6a355b57 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1485,6 +1485,10 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 	lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
 #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 	INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
+#ifdef CONFIG_BLK_ERROR_INJECTION
+	mutex_init(&disk->error_injection_lock);
+	INIT_LIST_HEAD(&disk->error_injection_list);
 #endif
 	mutex_init(&disk->rqos_state_mutex);
 	kobject_init(&disk->queue_kobj, &blk_queue_ktype);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 57e84d59a642..5070851cf924 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -176,6 +176,7 @@ struct gendisk {
 #define GD_SUPPRESS_PART_SCAN		5
 #define GD_OWNS_QUEUE			6
 #define GD_ZONE_APPEND_USED		7
+#define GD_ERROR_INJECT			8
 
 	struct mutex open_mutex;	/* open/close mutex */
 	unsigned open_partitions;	/* number of open partitions */
@@ -227,6 +228,11 @@ struct gendisk {
 	 */
 	struct blk_independent_access_ranges *ia_ranges;
 
+#ifdef CONFIG_BLK_ERROR_INJECTION
+	struct mutex error_injection_lock;
+	struct list_head error_injection_list;
+#endif
+
 	struct mutex rqos_state_mutex;	/* rqos state change mutex */
 };
 
-- 
2.53.0
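
The comma-separated rule format documented in the patch above (for example `add,op=READ,start=0,status=IOERR,chance=10`) can be tokenized in userspace roughly as sketched below. This is not the kernel parser: `struct demo_rule`, `demo_parse_rule()` and the helpers are hypothetical, the sketch uses plain ISO C string functions instead of `strsep()` and `match_token()`, it does not validate the op or status names, and it only handles the `add` action.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of one "add" rule. */
struct demo_rule {
	bool add;
	char op[16];
	char status[16];
	unsigned long long start;
	unsigned long long nr_sectors;	/* 0 means "rest of the device" */
	unsigned int chance;		/* defaults to 1 (always) */
};

/* Parse a complete unsigned number; reject empty or trailing garbage. */
static int parse_ull(const char *s, unsigned long long *out)
{
	char *end;

	if (!*s)
		return -1;
	*out = strtoull(s, &end, 0);
	return *end ? -1 : 0;
}

/* Copy a value into a fixed buffer; fail if it is empty or does not fit. */
static int copy_value(char *dst, size_t dst_size, const char *val)
{
	if (!*val || strlen(val) >= dst_size)
		return -1;
	strcpy(dst, val);
	return 0;
}

/* Splits "options" in place on ',' and '\n'.  Returns 0 on success. */
int demo_parse_rule(char *options, struct demo_rule *rule)
{
	unsigned long long val;
	char *p, *next;

	memset(rule, 0, sizeof(*rule));
	rule->chance = 1;

	for (p = options; *p; p = next) {
		size_t len = strcspn(p, ",\n");

		next = p + len;
		if (*next)
			*next++ = '\0';	/* terminate this token */
		if (!len)
			continue;	/* skip empty tokens */

		if (!strcmp(p, "add")) {
			rule->add = true;
		} else if (!strncmp(p, "op=", 3)) {
			if (copy_value(rule->op, sizeof(rule->op), p + 3))
				return -1;
		} else if (!strncmp(p, "status=", 7)) {
			if (copy_value(rule->status, sizeof(rule->status), p + 7))
				return -1;
		} else if (!strncmp(p, "start=", 6)) {
			if (parse_ull(p + 6, &rule->start))
				return -1;
		} else if (!strncmp(p, "nr_sectors=", 11)) {
			if (parse_ull(p + 11, &rule->nr_sectors))
				return -1;
		} else if (!strncmp(p, "chance=", 7)) {
			if (parse_ull(p + 7, &val) || val == 0 || val > UINT_MAX)
				return -1;
			rule->chance = (unsigned int)val;
		} else {
			return -1;	/* unknown parameter */
		}
	}

	/* op and status are mandatory for an add rule */
	if (!rule->add || !rule->op[0] || !rule->status[0])
		return -1;
	return 0;
}
```

As in the documented interface, `op` and `status` are mandatory while `start`, `nr_sectors` and `chance` fall back to defaults. The input buffer must be writable because tokens are terminated in place.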

* Re: [PATCH 4/4] block: add configurable error injection
@ 2026-06-08 14:53 Jens Axboe
From: Jens Axboe @ 2026-06-08 14:53 UTC
To: Christoph Hellwig
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

On 6/7/26 11:14 PM, Christoph Hellwig wrote:
> diff --git a/block/blk.h b/block/blk.h
> index e8b7d5517086..10df23b2cb90 100644
> --- a/block/blk.h
> +++ b/block/blk.h
> @@ -660,6 +660,18 @@ static inline bool should_fail_request(struct block_device *part,
>  }
>  #endif /* CONFIG_FAIL_MAKE_REQUEST */
>
> +void blk_error_injection_init(struct gendisk *disk);
> +void blk_error_injection_exit(struct gendisk *disk);
> +bool __blk_error_inject(struct bio *bio);
> +static inline bool blk_error_inject(struct bio *bio)
> +{
> +	if (!IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
> +		return false;
> +	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
> +		return false;
> +	return __blk_error_inject(bio);
> +}

I really hate this part, that's a pretty deep set of pointer chasings to
figure out if injection is enabled or not, when in practice error
injection is only ever enabled for specific test cases and distros
invariably will set CONFIG_BLK_ERROR_INJECTION because they turn on
every damn thing under the sun.

IOW, that won't fly for the hot path. Maybe a static key would be useful
here?

-- 
Jens Axboe
* Re: [PATCH 4/4] block: add configurable error injection
2026-06-08 14:53 ` Jens Axboe
@ 2026-06-09 7:41 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2026-06-09 7:41 UTC (permalink / raw)
To: Jens Axboe
Cc: Christoph Hellwig, Jonathan Corbet, Damien Le Moal, Hannes Reinecke,
Keith Busch, linux-block, linux-doc, Hannes Reinecke

On Mon, Jun 08, 2026 at 08:53:22AM -0600, Jens Axboe wrote:
> > +	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
> > +		return false;
> > +	return __blk_error_inject(bio);
> > +}
>
> I really hate this part, that's a pretty deep set of pointer chasings to
> figure out if injection is enabled or not,

It's to the bdev we use everywhere, and then to the disk which we use in
a lot of places in the submission path. The only easy way to reduce it
would be to move the state to the block_device. We currently don't do
partitions in debugfs, but maybe we should?

> when in practice error
> injection is only ever enabled for specific test cases and distros
> invariably will set CONFIG_BLK_ERROR_INJECTION because they turn on
> every damn thing under the sun.
>
> IOW, that won't fly for the hot path. Maybe a static key would be useful
> here?

a static_key makes sense here, probably including the legacy error
injection.

^ permalink raw reply [flat|nested] 17+ messages in thread
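The static-key idea discussed in this sub-thread can be approximated in userspace as follows. This is only a sketch under stated assumptions: a kernel version would presumably declare the key with DEFINE_STATIC_KEY_FALSE(), test it with static_branch_unlikely() in the submission path, and toggle it with static_branch_inc()/static_branch_dec() as rules come and go. The plain global counter, struct fake_disk, and all function names below are hypothetical stand-ins and are not part of the posted series.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a static key: number of disks with rules. */
static unsigned int inject_enabled_count;

/* Hypothetical miniature of struct gendisk. */
struct fake_disk {
	bool has_rules;		/* models the GD_ERROR_INJECT bit */
};

/* Models static_branch_inc(): called when a disk gets its first rule. */
static void inject_enable(struct fake_disk *disk)
{
	if (!disk->has_rules) {
		disk->has_rules = true;
		inject_enabled_count++;
	}
}

/* Models static_branch_dec(): called when a disk drops all of its rules. */
static void inject_disable(struct fake_disk *disk)
{
	if (disk->has_rules) {
		disk->has_rules = false;
		inject_enabled_count--;
	}
}

/*
 * Fast-path check. The global test comes first, so the per-disk state is
 * only dereferenced when at least one disk has injection rules.
 */
static bool should_check_rules(const struct fake_disk *disk)
{
	if (inject_enabled_count == 0)
		return false;
	return disk->has_rules;
}
```

The point of the ordering is that the common case (no rules anywhere) never touches the bio, bdev, or disk. A real static key goes further than this counter: while the key is disabled the branch is patched out of the instruction stream, so the fast path does not even load a global.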
* Re: [PATCH 4/4] block: add configurable error injection
2026-06-08 5:14 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
2026-06-08 14:53 ` Jens Axboe
@ 2026-06-08 22:08 ` Bart Van Assche
2026-06-09 7:47 ` Christoph Hellwig
1 sibling, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2026-06-08 22:08 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke

On 6/7/26 10:14 PM, Christoph Hellwig wrote:
> +Configurable error injection allows injecting specific block layer status codes
> +for ranges of a block device. Errors can be injected unconditionally, or with a

ranges -> sector ranges?

> +static void error_inject_removall(struct gendisk *disk)
> +{

Is a letter "e" perhaps missing from the above function name? (remov ->
remove)

Thanks,

Bart.

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] block: add configurable error injection
2026-06-08 22:08 ` Bart Van Assche
@ 2026-06-09 7:47 ` Christoph Hellwig
0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2026-06-09 7:47 UTC (permalink / raw)
To: Bart Van Assche
Cc: Christoph Hellwig, Jens Axboe, Jonathan Corbet, Damien Le Moal,
Hannes Reinecke, Keith Busch, linux-block, linux-doc, Hannes Reinecke

On Mon, Jun 08, 2026 at 03:08:47PM -0700, Bart Van Assche wrote:
> On 6/7/26 10:14 PM, Christoph Hellwig wrote:
>> +Configurable error injection allows injecting specific block layer status codes
>> +for ranges of a block device. Errors can be injected unconditionally, or with a
>
> ranges -> sector ranges?
>
>> +static void error_inject_removall(struct gendisk *disk)
> > +{
>
> Is a letter "e" perhaps missing from the above function name? (remov ->
> remove)

Sure, fixed.

^ permalink raw reply [flat|nested] 17+ messages in thread
* configurable block error injection v2
@ 2026-06-05 18:44 Christoph Hellwig
2026-06-05 18:44 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc

Hi all,

this series adds a new configurable block error injection facility.

We already have a few ways to inject block errors, but unfortunately most
of them are either not very useful or hard to use, or both:

- The fail_make_request failure injection point can't distinguish
  different commands, different ranges in the file and can only inject
  plain I/O errors.
- the should_fail_bio 'dynamic' failure injection has all the same issues
  as fail_make_request
- dm-error can only fail all commands in the table using BLK_STS_IOERR
  and requires setting up a new block device
- dm-flakey and dm-dust allow all kinds of configurability, but still
  don't have good error selection, no good support for non-read/write
  commands and are limited to the dm table alignment requirements,
  which for zoned devices enforces setting them up for an entire zone.
  They also once again require setting up a stacked block device,
  which is really annoying in harnesses like xfstests

This series adds a new debugfs-based block layer error injection
that allows to configure what operations and ranges the injection
applies to, and what status to return. It also allows to configure a
failure ratio similar to the xfs errortag injection.

Changes since v1:
- drop the should_fail_bio removal and cleanup depending on it, as it's
  used by eBPF programs and thus a hidden UABI.
- as a result split the code out to its own Kconfig symbol
- various error handling fixes pointed out by Keith
- documentation spelling fixes pointed out by Randy

Diffstat:
 Documentation/block/error-injection.rst |   59 ++++++
 Documentation/block/index.rst           |    1
 block/Kconfig                           |    7
 block/Makefile                          |    1
 block/blk-core.c                        |   86 ++++++--
 block/blk-sysfs.c                       |    4
 block/blk.h                             |   15 +
 block/error-injection.c                 |  308 ++++++++++++++++++++++++++++++++
 block/genhd.c                           |    4
 include/linux/blkdev.h                  |    6
 10 files changed, 471 insertions(+), 20 deletions(-)

^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 4/4] block: add configurable error injection
2026-06-05 18:44 configurable block error injection v2 Christoph Hellwig
@ 2026-06-05 18:44 ` Christoph Hellwig
2026-06-06 7:28 ` Hannes Reinecke
2026-06-06 7:33 ` Damien Le Moal
0 siblings, 2 replies; 17+ messages in thread
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc

Add a new block error injection interface that allows to inject specific
status code for specific ranges.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/block/error-injection.rst |  59 +++++
 Documentation/block/index.rst           |   1 +
 block/Kconfig                           |   7 +
 block/Makefile                          |   1 +
 block/blk-core.c                        |   3 +
 block/blk-sysfs.c                       |   4 +
 block/blk.h                             |  12 +
 block/error-injection.c                 | 308 ++++++++++++++++++++++++
 block/genhd.c                           |   4 +
 include/linux/blkdev.h                  |   6 +
 10 files changed, 405 insertions(+)
 create mode 100644 Documentation/block/error-injection.rst
 create mode 100644 block/error-injection.c

diff --git a/Documentation/block/error-injection.rst b/Documentation/block/error-injection.rst
new file mode 100644
index 000000000000..b2e2ab6add70
--- /dev/null
+++ b/Documentation/block/error-injection.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Configurable Error Injection
+============================
+
+Overview
+--------
+
+Configurable error injection allows injecting specific block layer status codes
+for ranges of a block device. Errors can be injected unconditionally, or with a
+given probability.
+
+To use configurable error injection, CONFIG_BLK_ERROR_INJECTION must be enabled.
+
+The only interface is the error_injection debugfs file, which is created for
+each registered gendisk. Writes to this file are used to create or delete rules
+and reads return a list of the current error injection sites.
+
+Options
+-------
+
+The following options specify the operations:
+
+=================== =======================================================
+add                 add a new rule
+removeall           remove all existing rules
+=================== =======================================================
+
+The following options specify the details of the rule for the add operation:
+
+=================== =======================================================
+op=%s               block layer operation this rule applies to, e.g. READ
+                    or WRITE.
+                    Mandatory.
+start=%u            First block layer sector the rule applies to.
+                    Optional, defaults to 0.
+nr_sectors=%u       Number of sectors this rule applies.
+                    Optional, defaults to the remainder of the device.
+status=%s           Status to return.
+                    Mandatory.
+chance=%u           Only return a failure with a likelihood of 1/chance.
+                    Optional, defaults to 1 (always).
+=================== =======================================================
+
+Example
+-------
+
+Return BLK_STS_IOERR for one in 10 reads of sector 0 of /dev/nvme0n1:
+
+  $ echo 'add,op=READ,start=0,status=IOERR,chance=10' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Return BLK_STS_MEDIUM for every write to /dev/nvme0n1:
+
+  $ echo 'add,op=WRITE,start=0,status=MEDIUM' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Remove all rules for /dev/nvme0n1:
+
+  $ echo 'removeall' > /sys/kernel/debug/block/nvme0n1/error_injection
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index 9fea696f9daa..bfa1bbd31ddf 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -22,3 +22,4 @@ Block
    switching-sched
    writeback_cache_control
    ublk
+   error-injection
diff --git a/block/Kconfig b/block/Kconfig
index 15027963472d..7651b86eed56 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -221,6 +221,13 @@ config BLOCK_HOLDER_DEPRECATED
 config BLK_MQ_STACKING
 	bool

+config BLK_ERROR_INJECTION
+	bool "Enable block layer error injection"
+	help
+	  Enable inserting arbitrary block errors through a debugfs interface.
+
+	  See Documentation/block/error-injection.rst for details.
+
 source "block/Kconfig.iosched"

 endif # BLOCK
diff --git a/block/Makefile b/block/Makefile
index 7dce2e44276c..d0bb3e15a347 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -11,6 +11,7 @@ obj-y		:= bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \
 			genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \
 			disk-events.o blk-ia-ranges.o early-lookup.o

+obj-$(CONFIG_BLK_ERROR_INJECTION) += error-injection.o
 obj-$(CONFIG_BLK_DEV_BSG_COMMON) += bsg.o
 obj-$(CONFIG_BLK_DEV_BSGLIB)	+= bsg-lib.o
 obj-$(CONFIG_BLK_CGROUP)	+= blk-cgroup.o
diff --git a/block/blk-core.c b/block/blk-core.c
index aa90aad6da13..268735582ef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -767,6 +767,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)

 void submit_bio_noacct_nocheck(struct bio *bio, bool split)
 {
+	if (unlikely(blk_error_inject(bio)))
+		return;
+
 	blk_cgroup_bio_start(bio);

 	if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f22c1f253eb3..8a0c2be48a31 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -933,6 +933,8 @@ static void blk_debugfs_remove(struct gendisk *disk)

 	blk_debugfs_lock_nomemsave(q);
 	blk_trace_shutdown(q);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_exit(disk);
 	debugfs_remove_recursive(q->debugfs_dir);
 	q->debugfs_dir = NULL;
 	q->sched_debugfs_dir = NULL;
@@ -963,6 +965,8 @@ int blk_register_queue(struct gendisk *disk)

 	memflags = blk_debugfs_lock(q);
 	q->debugfs_dir = debugfs_create_dir(disk->disk_name, blk_debugfs_root);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_init(disk);
 	if (queue_is_mq(q))
 		blk_mq_debugfs_register(q);
 	blk_debugfs_unlock(q, memflags);
diff --git a/block/blk.h b/block/blk.h
index 93b30d1d0ec6..ec2639752e07 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -660,6 +660,18 @@ static inline bool should_fail_request(struct block_device *part,
 }
 #endif /* CONFIG_FAIL_MAKE_REQUEST */

+void blk_error_injection_init(struct gendisk *disk);
+void blk_error_injection_exit(struct gendisk *disk);
+bool __blk_error_inject(struct bio *bio);
+static inline bool blk_error_inject(struct bio *bio)
+{
+	if (!IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		return false;
+	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
+		return false;
+	return __blk_error_inject(bio);
+}
+
 /*
  * Optimized request reference counting. Ideally we'd make timeouts be more
  * clever, as that's the only reason we need references at all... But until
diff --git a/block/error-injection.c b/block/error-injection.c
new file mode 100644
index 000000000000..f35bce1d25cc
--- /dev/null
+++ b/block/error-injection.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2026 Christoph Hellwig.
+ */
+#include <linux/debugfs.h>
+#include <linux/blkdev.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include "blk.h"
+
+struct blk_error_inject {
+	struct list_head	entry;
+	sector_t		start;
+	sector_t		end;
+	enum req_op		op;
+	blk_status_t		status;
+
+	/* only inject every 1 / chance times */
+	unsigned int		chance;
+};
+
+bool __blk_error_inject(struct bio *bio)
+{
+	struct gendisk *disk = bio->bi_bdev->bd_disk;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		if (bio->bi_iter.bi_sector <= inj->end &&
+		    bio_end_sector(bio) > inj->start &&
+		    bio_op(bio) == inj->op) {
+			blk_status_t status = inj->status;
+
+			if (inj->chance > 1 &&
+			    (get_random_u32() % inj->chance) != 0)
+				continue;
+
+			pr_info_ratelimited("%pg: injecting %s error for %s at sector %llu:%u\n",
+				disk->part0,
+				blk_status_to_str(status),
+				blk_op_str(inj->op),
+				bio->bi_iter.bi_sector,
+				bio_sectors(bio));
+			rcu_read_unlock();
+			bio_endio_status(bio, status);
+			return true;
+		}
+	}
+	rcu_read_unlock();
+	return false;
+}
+
+static int error_inject_add(struct gendisk *disk, enum req_op op,
+		sector_t start, u64 nr_sectors, blk_status_t status,
+		unsigned int chance)
+{
+	struct blk_error_inject *inj;
+	int error = -EINVAL;
+
+	if (op == REQ_OP_LAST)
+		return -EINVAL;
+	if (status == BLK_STS_OK)
+		return -EINVAL;
+
+	inj = kzalloc_obj(*inj);
+	if (!inj)
+		return -ENOMEM;
+
+	if (nr_sectors) {
+		if (U64_MAX - nr_sectors < start)
+			goto out_free_inj;
+		inj->end = start + nr_sectors - 1;
+	} else {
+		inj->end = U64_MAX;
+	}
+
+	inj->op = op;
+	inj->start = start;
+	inj->status = status;
+	inj->chance = chance;
+
+	pr_debug_ratelimited("%pg: adding %s injection for %s at sector %llu:%llu\n",
+		disk->part0, blk_status_to_str(status),
+		blk_op_str(op),
+		start, nr_sectors);
+
+	/*
+	 * Add to the front of the list so that newer entries can partially
+	 * override other entries. This also intentional allows duplicate
+	 * entries as there is no real reason to reject them.
+	 */
+	mutex_lock(&disk->error_injection_lock);
+	if (!disk_live(disk)) {
+		mutex_unlock(&disk->error_injection_lock);
+		error = -ENODEV;
+		goto out_free_inj;
+	}
+	list_add_rcu(&inj->entry, &disk->error_injection_list);
+	set_bit(GD_ERROR_INJECT, &disk->state);
+	mutex_unlock(&disk->error_injection_lock);
+	return 0;
+
+out_free_inj:
+	kfree(inj);
+	return error;
+}
+
+static void error_inject_removall(struct gendisk *disk)
+{
+	struct blk_error_inject *inj;
+
+	mutex_lock(&disk->error_injection_lock);
+	clear_bit(GD_ERROR_INJECT, &disk->state);
+	while ((inj = list_first_entry_or_null(&disk->error_injection_list,
+			struct blk_error_inject, entry))) {
+		list_del_rcu(&inj->entry);
+		mutex_unlock(&disk->error_injection_lock);
+
+		kfree_rcu_mightsleep(inj);
+
+		mutex_lock(&disk->error_injection_lock);
+	}
+	mutex_unlock(&disk->error_injection_lock);
+}
+
+enum options {
+	Opt_add			= (1u << 0),
+	Opt_removeall		= (1u << 1),
+
+	Opt_op			= (1u << 16),
+	Opt_start		= (1u << 17),
+	Opt_nr_sectors		= (1u << 18),
+	Opt_status		= (1u << 19),
+	Opt_chance		= (1u << 20),
+
+	Opt_invalid,
+};
+
+static const match_table_t opt_tokens = {
+	{ Opt_add,		"add", },
+	{ Opt_removeall,	"removeall", },
+	{ Opt_op,		"op=%s", },
+	{ Opt_start,		"start=%u" },
+	{ Opt_nr_sectors,	"nr_sectors=%u" },
+	{ Opt_status,		"status=%s" },
+	{ Opt_chance,		"chance=%u" },
+	{ Opt_invalid,		NULL, },
+};
+
+static int match_op(substring_t *args, enum req_op *op)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*op = str_to_blk_op(tag);
+	if (*op == REQ_OP_LAST)
+		pr_warn("invalid op '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static int match_status(substring_t *args, blk_status_t *status)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*status = tag_to_blk_status(tag);
+	if (!*status)
+		pr_warn("invalid status '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static ssize_t blk_error_injection_parse_options(struct gendisk *disk,
+		char *options)
+{
+	enum { Unset, Add, Removeall } action = Unset;
+	unsigned int option_mask = 0, chance = 1;
+	enum req_op op = REQ_OP_LAST;
+	u64 start = 0, nr_sectors = 0;
+	blk_status_t status = BLK_STS_OK;
+	substring_t args[MAX_OPT_ARGS];
+	char *p;
+
+	while ((p = strsep(&options, ",\n")) != NULL) {
+		int error = 0;
+		ssize_t token;
+
+		if (!*p)
+			continue;
+		token = match_token(p, opt_tokens, args);
+		option_mask |= token;
+		switch (token) {
+		case Opt_add:
+			if (action != Unset)
+				return -EINVAL;
+			action = Add;
+			break;
+		case Opt_removeall:
+			if (action != Unset)
+				return -EINVAL;
+			action = Removeall;
+			break;
+		case Opt_op:
+			error = match_op(args, &op);
+			break;
+		case Opt_start:
+			error = match_u64(args, &start);
+			break;
+		case Opt_nr_sectors:
+			error = match_u64(args, &nr_sectors);
+			break;
+		case Opt_status:
+			error = match_status(args, &status);
+			break;
+		case Opt_chance:
+			error = match_uint(args, &chance);
+			if (!error && chance == 0)
+				error = -EINVAL;
+			break;
+		default:
+			pr_warn("unknown parameter or missing value '%s'\n", p);
+			error = -EINVAL;
+		}
+		if (error)
+			return error;
+	}
+
+	switch (action) {
+	case Add:
+		return error_inject_add(disk, op, start, nr_sectors, status,
+				chance);
+	case Removeall:
+		if (option_mask & ~Opt_removeall)
+			return -EINVAL;
+		error_inject_removall(disk);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t blk_error_injection_write(struct file *file,
+		const char __user *ubuf, size_t count, loff_t *pos)
+{
+	struct gendisk *disk = file_inode(file)->i_private;
+	char *options;
+	int error;
+
+	options = memdup_user_nul(ubuf, count);
+	if (IS_ERR(options))
+		return PTR_ERR(options);
+	error = blk_error_injection_parse_options(disk, options);
+	kfree(options);
+
+	if (error)
+		return error;
+	return count;
+}
+
+static int blk_error_injection_show(struct seq_file *s, void *private)
+{
+	struct gendisk *disk = s->private;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		seq_printf(s, "%llu:%llu status=%s,chance=%u",
+			inj->start, inj->end,
+			blk_status_to_tag(inj->status), inj->chance);
+		seq_putc(s, '\n');
+	}
+	rcu_read_unlock();
+	return 0;
+}
+
+static int blk_error_injection_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, blk_error_injection_show, inode->i_private);
+}
+
+static int blk_error_injection_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations blk_error_injection_fops = {
+	.owner		= THIS_MODULE,
+	.write		= blk_error_injection_write,
+	.read		= seq_read,
+	.open		= blk_error_injection_open,
+	.release	= blk_error_injection_release,
+};
+
+void blk_error_injection_init(struct gendisk *disk)
+{
+	debugfs_create_file("error_injection", 0600, disk->queue->debugfs_dir,
+			disk, &blk_error_injection_fops);
+}
+
+void blk_error_injection_exit(struct gendisk *disk)
+{
+	error_inject_removall(disk);
+}
diff --git a/block/genhd.c b/block/genhd.c
index 7d6854fd28e9..f84b6a355b57 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1485,6 +1485,10 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 	lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
 #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 	INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
+#ifdef CONFIG_BLK_ERROR_INJECTION
+	mutex_init(&disk->error_injection_lock);
+	INIT_LIST_HEAD(&disk->error_injection_list);
 #endif
 	mutex_init(&disk->rqos_state_mutex);
 	kobject_init(&disk->queue_kobj, &blk_queue_ktype);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 17270a28c66d..d2adf2775920 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -176,6 +176,7 @@ struct gendisk {
 #define GD_SUPPRESS_PART_SCAN		5
 #define GD_OWNS_QUEUE			6
 #define GD_ZONE_APPEND_USED		7
+#define GD_ERROR_INJECT			8

 	struct mutex open_mutex;	/* open/close mutex */
 	unsigned open_partitions;	/* number of open partitions */
@@ -227,6 +228,11 @@ struct gendisk {
 	 */
 	struct blk_independent_access_ranges *ia_ranges;

+#ifdef CONFIG_BLK_ERROR_INJECTION
+	struct mutex error_injection_lock;
+	struct list_head error_injection_list;
+#endif
+
 	struct mutex rqos_state_mutex;	/* rqos state change mutex */
 };

--
2.53.0

^ permalink raw reply related [flat|nested] 17+ messages in thread
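The matching logic of __blk_error_inject() in the patch above can be modeled in plain userspace C. This is a minimal sketch, assuming the semantics read from the patch: an inclusive end sector, the newest rule first, and the first match winning. struct rule, rule_end(), and find_rule() are hypothetical names, op and status are plain integers rather than the kernel types, and the chance-based skipping is left out to keep the result deterministic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace copy of the fields the patch matches on. */
struct rule {
	uint64_t start;		/* first sector covered */
	uint64_t end;		/* last sector covered (inclusive) */
	int op;			/* operation code, e.g. 0 = READ, 1 = WRITE */
	int status;		/* status to return on a match */
};

/*
 * Compute the inclusive end sector the way error_inject_add() does:
 * nr_sectors == 0 means "rest of the device". Returns false on overflow.
 */
static bool rule_end(uint64_t start, uint64_t nr_sectors, uint64_t *end)
{
	if (!nr_sectors) {
		*end = UINT64_MAX;
		return true;
	}
	if (UINT64_MAX - nr_sectors < start)
		return false;
	*end = start + nr_sectors - 1;
	return true;
}

/*
 * Return the first rule overlapping [sector, sector + nr_sectors) for the
 * given op, or NULL. rules[0] is the most recently added rule, mirroring
 * the list_add_rcu() at the head of the list.
 */
static const struct rule *find_rule(const struct rule *rules, size_t n,
		int op, uint64_t sector, uint64_t nr_sectors)
{
	for (size_t i = 0; i < n; i++) {
		const struct rule *r = &rules[i];

		if (sector <= r->end && sector + nr_sectors > r->start &&
		    r->op == op)
			return r;
	}
	return NULL;
}
```

Two consequences follow from this reading and are worth checking against the real code. First, a rule added without nr_sectors covers every sector from start onward, so restricting a rule to sector 0 alone would seem to need nr_sectors=1. Second, because the patch uses continue when the random chance check does not fire, an older overlapping rule can still match the same bio.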
* Re: [PATCH 4/4] block: add configurable error injection
2026-06-05 18:44 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
@ 2026-06-06 7:28 ` Hannes Reinecke
2026-06-06 7:33 ` Damien Le Moal
1 sibling, 0 replies; 17+ messages in thread
From: Hannes Reinecke @ 2026-06-06 7:28 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc

On 6/5/26 20:44, Christoph Hellwig wrote:
> Add a new block error injection interface that allows to inject specific
> status code for specific ranges.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  Documentation/block/error-injection.rst |  59 +++++
>  Documentation/block/index.rst           |   1 +
>  block/Kconfig                           |   7 +
>  block/Makefile                          |   1 +
>  block/blk-core.c                        |   3 +
>  block/blk-sysfs.c                       |   4 +
>  block/blk.h                             |  12 +
>  block/error-injection.c                 | 308 ++++++++++++++++++++++++
>  block/genhd.c                           |   4 +
>  include/linux/blkdev.h                  |   6 +
>  10 files changed, 405 insertions(+)
>  create mode 100644 Documentation/block/error-injection.rst
>  create mode 100644 block/error-injection.c
>
Reviewed-by: Hannes Reinecke <hare@kernel.org>

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] block: add configurable error injection
2026-06-05 18:44 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
2026-06-06 7:28 ` Hannes Reinecke
@ 2026-06-06 7:33 ` Damien Le Moal
1 sibling, 0 replies; 17+ messages in thread
From: Damien Le Moal @ 2026-06-06 7:33 UTC (permalink / raw)
To: Christoph Hellwig, Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc

On 2026/06/06 2:44, Christoph Hellwig wrote:
> Add a new block error injection interface that allows to inject specific
> status code for specific ranges.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

[...]

> +=================== =======================================================
> +op=%s               block layer operation this rule applies to, e.g. READ
> +                    or WRITE.

Like you did in the commit message of patch 3, maybe mention that this should
match "XYZ" of one of the defined REQ_OP_XYZ operations?

> +                    Mandatory.
> +start=%u            First block layer sector the rule applies to.
> +                    Optional, defaults to 0.
> +nr_sectors=%u       Number of sectors this rule applies.
> +                    Optional, defaults to the remainder of the device.
> +status=%s           Status to return.

Maybe mention that this should match XYZ for one of the defined BLK_STS_XYZ?

> +                    Mandatory.
> +chance=%u           Only return a failure with a likelihood of 1/chance.
> +                    Optional, defaults to 1 (always).
> +=================== =======================================================

[...]

> +	/*
> +	 * Add to the front of the list so that newer entries can partially
> +	 * override other entries. This also intentional allows duplicate

s/intentional/intentionally

> +	 * entries as there is no real reason to reject them.
> +	 */

Beside these nits, looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

--
Damien Le Moal
Western Digital Research

^ permalink raw reply [flat|nested] 17+ messages in thread
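The rule syntax quoted in this review can be illustrated with a small userspace parser. This is a sketch only: the kernel code uses match_token() from <linux/parser.h>, whereas parse_rule(), copy_tag(), and struct parsed_rule below are hypothetical and built on ISO C string functions. op and status are kept as the bare XYZ suffix of REQ_OP_XYZ and BLK_STS_XYZ, following the suggestion above, and are not validated against any table here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of an "add" rule. */
struct parsed_rule {
	char op[16];		/* XYZ of REQ_OP_XYZ, e.g. "READ" */
	char status[16];	/* XYZ of BLK_STS_XYZ, e.g. "IOERR" */
	uint64_t start;		/* defaults to 0 */
	uint64_t nr_sectors;	/* 0 = remainder of the device */
	unsigned int chance;	/* defaults to 1 (always) */
};

/* Copy a value if it is non-empty and fits; returns false otherwise. */
static bool copy_tag(char *dst, size_t size, const char *val)
{
	size_t len = strlen(val);

	if (len == 0 || len >= size)
		return false;
	memcpy(dst, val, len + 1);
	return true;
}

/*
 * Parse "add,op=...,start=...,nr_sectors=...,status=...,chance=...".
 * The input buffer is modified in place. Returns false on an unknown
 * option, a missing "add", chance=0, or a missing mandatory op/status.
 * Number parsing is simplified: trailing garbage is not rejected.
 */
static bool parse_rule(char *options, struct parsed_rule *r)
{
	bool add = false;
	char *p = options;

	memset(r, 0, sizeof(*r));
	r->chance = 1;

	while (*p) {
		size_t len = strcspn(p, ",\n");
		char *next = p[len] ? p + len + 1 : p + len;

		p[len] = '\0';
		if (len == 0) {
			/* skip empty tokens, like the kernel parser does */
		} else if (!strcmp(p, "add")) {
			add = true;
		} else if (!strncmp(p, "op=", 3)) {
			if (!copy_tag(r->op, sizeof(r->op), p + 3))
				return false;
		} else if (!strncmp(p, "status=", 7)) {
			if (!copy_tag(r->status, sizeof(r->status), p + 7))
				return false;
		} else if (!strncmp(p, "start=", 6)) {
			r->start = strtoull(p + 6, NULL, 10);
		} else if (!strncmp(p, "nr_sectors=", 11)) {
			r->nr_sectors = strtoull(p + 11, NULL, 10);
		} else if (!strncmp(p, "chance=", 7)) {
			r->chance = (unsigned int)strtoul(p + 7, NULL, 10);
			if (r->chance == 0)
				return false;
		} else {
			return false;
		}
		p = next;
	}
	return add && r->op[0] && r->status[0];
}
```

Feeding it the documented example string 'add,op=READ,start=0,status=IOERR,chance=10' yields op "READ", status "IOERR", start 0, nr_sectors 0 (remainder of the device), and chance 10. Unlike this sketch, the posted kernel code rejects a missing op or status later, in error_inject_add(), rather than in the parser.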
end of thread, other threads:[~2026-06-09 7:47 UTC | newest]

Thread overview: 17+ messages
2026-06-08 5:14 configurable block error injection v3 Christoph Hellwig
2026-06-08 5:14 ` [PATCH 1/4] block: add a macro to initialize the status table Christoph Hellwig
2026-06-08 21:51 ` Bart Van Assche
2026-06-08 5:14 ` [PATCH 2/4] block: add a "tag" for block status codes Christoph Hellwig
2026-06-08 21:55 ` Bart Van Assche
2026-06-09 7:43 ` Christoph Hellwig
2026-06-08 5:14 ` [PATCH 3/4] block: add a str_to_blk_op helper Christoph Hellwig
2026-06-08 21:57 ` Bart Van Assche
2026-06-09 7:45 ` Christoph Hellwig
2026-06-08 5:14 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
2026-06-08 14:53 ` Jens Axboe
2026-06-09 7:41 ` Christoph Hellwig
2026-06-08 22:08 ` Bart Van Assche
2026-06-09 7:47 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2026-06-05 18:44 configurable block error injection v2 Christoph Hellwig
2026-06-05 18:44 ` [PATCH 4/4] block: add configurable error injection Christoph Hellwig
2026-06-06 7:28 ` Hannes Reinecke
2026-06-06 7:33 ` Damien Le Moal