Linux block layer
* [PATCH v2] virtio-blk: clamp zone report to the report buffer capacity
From: Michael Bommarito @ 2026-06-07 12:48 UTC
  To: Michael S . Tsirkin, Jason Wang, Stefan Hajnoczi, Jens Axboe
  Cc: Xuan Zhuo, virtualization, linux-block, linux-kernel

virtblk_report_zones() trusts the device-reported number of zones when
walking the report buffer:

	nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
		   nr_zones);
	...
	for (i = 0; i < nz && zone_idx < nr_zones; i++) {
		ret = virtblk_parse_zone(vblk, &report->zones[i], ...);

The buffer is allocated by virtblk_alloc_report_buffer(), whose size is
capped by the queue's max hardware sectors and max segments and can
therefore hold fewer descriptors than nr_zones. nz is bounded only by
the device-supplied report->nr_zones and the requested nr_zones, never
by the buffer's descriptor capacity. At probe time the request count is
unbounded (blk_revalidate_disk_zones() calls report_zones() with
nr_zones == UINT_MAX), so the device-supplied report->nr_zones is the
sole gate: a device that reports more zones than fit in the buffer
drives the loop to read report->zones[i] past the end of the allocation.

A malicious or buggy virtio-blk device that reports an inflated nr_zones
triggers this during zone revalidation at probe. KASAN reports a
vmalloc-out-of-bounds read in virtblk_report_zones() against the report
buffer allocated a few lines earlier.

Clamp nz to the number of descriptors that actually fit in the report
buffer.

Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
v2: drop the explanatory comment per Michael S. Tsirkin's review; the
    clamp itself is unchanged.

 drivers/block/virtio_blk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index b1c9a27..32bf3ba 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -689,6 +689,8 @@ static int virtblk_report_zones(struct gendisk *disk, sector_t sector,
 
 		nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
 			   nr_zones);
+		nz = min_t(u64, nz,
+			   (buflen - sizeof(*report)) / sizeof(report->zones[0]));
 		if (!nz)
 			break;
 

base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
-- 
2.53.0
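
The clamp described in the patch above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: the `demo_*` names and the 64-byte header and descriptor sizes are assumptions made for illustration, and the guard against a buffer smaller than the header is an addition that the patch itself does not contain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the report layout; sizes are illustrative. */
struct demo_zone_descriptor {
	uint8_t bytes[64];
};

struct demo_zone_report {
	uint64_t nr_zones;		/* count claimed by the device */
	uint8_t reserved[56];
	struct demo_zone_descriptor zones[];
};

/* Number of descriptors that fit in a report buffer of buflen bytes. */
static uint64_t demo_report_capacity(size_t buflen)
{
	/* Extra guard for the sketch: avoid unsigned underflow. */
	if (buflen < sizeof(struct demo_zone_report))
		return 0;
	return (buflen - sizeof(struct demo_zone_report)) /
	       sizeof(struct demo_zone_descriptor);
}

/* Clamp the device-reported count to the request and to the buffer. */
static uint64_t demo_clamp_nz(uint64_t device_nr_zones, uint64_t requested,
			      size_t buflen)
{
	uint64_t nz = device_nr_zones < requested ? device_nr_zones : requested;
	uint64_t cap = demo_report_capacity(buflen);

	return nz < cap ? nz : cap;
}
```

With these assumed sizes, a 4096-byte buffer has room for 63 descriptors, so an inflated device count combined with an effectively unbounded request is reduced to 63 before any descriptor is parsed.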


* [PATCH v4] loop: Fix NULL pointer dereference in lo_rw_aio()
From: Tetsuo Handa @ 2026-06-07 10:54 UTC
  To: Jens Axboe
  Cc: Bart Van Assche, Christoph Hellwig, Damien Le Moal, Ming Lei,
	linux-block, LKML, Andrew Morton, Linus Torvalds, linux-btrfs,
	David Sterba, linux-fsdevel, Christian Brauner, Hillf Danton
In-Reply-To: <b27609f0-59f0-403d-90af-274c55df817e@I-love.SAKURA.ne.jp>

syzbot is reporting a NULL pointer dereference in lo_rw_aio() [1][2].
An analysis by the Gemini AI collaborator [3] suggests that this problem
is caused by a timing shift primarily exposed by commit 65565ca5f99b
("block: unify the synchronous bi_end_io callbacks"), along with helper
refactorings like commit 92c3737a2473 ("block: add a bio_submit_or_kill
helper").

Because this race is difficult to reproduce, discussion about what is
happening and how to fix it has stalled. We also have not identified how
many filesystems are affected.

Therefore, this patch introduces a grace period for flushing pending I/O
requests (a reasonable defensive-programming measure in its own right)
so that we do not hit the NULL pointer dereference. It also emits a BUG:
message to help filesystem developers identify the caller that issued an
I/O request without waiting for its completion, so that the caller can be
fixed to wait.

Note that the BUG: message is emitted only if CONFIG_KCOV=y, because the
check wastes computation resources for almost all users.

Link: https://syzkaller.appspot.com/bug?extid=cd8a9a308e879a4e2c28 [1]
Link: https://syzkaller.appspot.com/bug?extid=bc273027d5643e48e5b3 [2]
Link: https://lkml.kernel.org/r/fbb3edda-f108-4e5b-acf2-266f043f8125@I-love.SAKURA.ne.jp [3]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/block/loop.c | 82 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 0000913f7efc..4ff254d8b623 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -85,8 +85,26 @@ struct loop_cmd {
 	struct bio_vec *bvec;
 	struct cgroup_subsys_state *blkcg_css;
 	struct cgroup_subsys_state *memcg_css;
+#ifdef CONFIG_KCOV
+	unsigned long stack_entries[30];
+	int stack_nr;
+	pid_t pid;
+	char comm[TASK_COMM_LEN];
+#endif
 };
 
+static void loop_check_io_race(struct loop_device *lo, struct loop_cmd *cmd)
+{
+#ifdef CONFIG_KCOV
+	if (unlikely(data_race(READ_ONCE(lo->lo_state)) == Lo_rundown)) {
+		pr_err("BUG: %s/%u is doing I/O request on loop%d in Lo_rundown state.\n",
+		       cmd->comm, cmd->pid, lo->lo_number);
+		printk("Call trace:\n");
+		stack_trace_print(cmd->stack_entries, cmd->stack_nr, 4);
+	}
+#endif
+}
+
 #define LOOP_IDLE_WORKER_TIMEOUT (60 * HZ)
 #define LOOP_DEFAULT_HW_Q_DEPTH 128
 
@@ -1747,8 +1765,59 @@ static void lo_release(struct gendisk *disk)
 	need_clear = (lo->lo_state == Lo_rundown);
 	mutex_unlock(&lo->lo_mutex);
 
-	if (need_clear)
+	if (need_clear) {
+		/*
+		 * Temporarily release disk->open_mutex in order to flush pending I/O
+		 * requests before clearing the backing device.
+		 *
+		 * This is a layering violation. But since bdev->bd_disk->fops->release()
+		 * (which is mapped to lo_release()) is the final function which
+		 * blkdev_put_whole() from bdev_release() calls immediately before
+		 * releasing disk->open_mutex, this changes nothing except opens a new
+		 * race window for allowing disk->fops->open() (which is mapped to
+		 * lo_open()) to be called.
+		 *
+		 * Even if lo_open() is called from blkdev_get_whole() due to this race,
+		 * the Lo_rundown state guarantees that lo_open() will fail with -ENXIO.
+		 * Thus, there will be effectively no change caused by this violation.
+		 */
+		mutex_unlock(&lo->lo_disk->open_mutex);
+		/*
+		 * Now that loop_queue_rq() sees lo->lo_state != Lo_bound,
+		 * wait for already started loop_queue_rq() to complete.
+		 */
+		synchronize_rcu();
+		/*
+		 * Now that no more works are scheduled by loop_queue_rq(),
+		 * wait for already scheduled works to complete.
+		 */
+		drain_workqueue(lo->workqueue);
+		/*
+		 * Now that no more AIO requests are scheduled by lo_rw_aio(),
+		 * wait for already started AIO to complete.
+		 *
+		 * Due to synchronize_rcu() + drain_workqueue() sequence above,
+		 * calling blk_mq_unfreeze_queue() immediately after blk_mq_freeze_queue()
+		 * returns has to be safe, for loop_queue_rq() no longer schedules new
+		 * lo_rw_aio() works and lo_rw_aio() no longer submits new AIO requests.
+		 *
+		 * Deferring blk_mq_unfreeze_queue() does not help because we are about
+		 * to clear the backing device and drop the refcount for the backing device.
+		 * There is nothing we can do if blk_mq_freeze_queue() fails to flush.
+		 */
+		blk_mq_unfreeze_queue(lo->lo_queue, blk_mq_freeze_queue(lo->lo_queue));
+		/*
+		 * Perform remaining cleanup, with disk->open_mutex held.
+		 *
+		 * The lo->lo_state should remain Lo_rundown despite we temporarily
+		 * released disk->open_mutex, for I am the only and the last user of
+		 * this loop device because lo_open() cannot succeed.
+		 */
+		mutex_lock(&lo->lo_disk->open_mutex);
+		if (WARN_ON(data_race(READ_ONCE(lo->lo_state)) != Lo_rundown))
+			return;
 		__loop_clr_fd(lo);
+	}
 }
 
 static void lo_free_disk(struct gendisk *disk)
@@ -1855,10 +1924,18 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 	struct loop_cmd *cmd = blk_mq_rq_to_pdu(rq);
 	struct loop_device *lo = rq->q->queuedata;
 
+#ifdef CONFIG_KCOV
+	cmd->stack_nr = stack_trace_save(cmd->stack_entries, ARRAY_SIZE(cmd->stack_entries), 0);
+	cmd->pid = current->pid;
+	get_task_comm(cmd->comm, current);
+#endif
+
 	blk_mq_start_request(rq);
 
-	if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound)
+	if (data_race(READ_ONCE(lo->lo_state)) != Lo_bound) {
+		loop_check_io_race(lo, cmd);
 		return BLK_STS_IOERR;
+	}
 
 	switch (req_op(rq)) {
 	case REQ_OP_FLUSH:
@@ -1901,6 +1978,7 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
 	int ret = 0;
 	struct mem_cgroup *old_memcg = NULL;
 
+	loop_check_io_race(lo, cmd);
 	if (write && (lo->lo_flags & LO_FLAGS_READ_ONLY)) {
 		ret = -EIO;
 		goto failed;
-- 
2.47.3




* [PATCH] partitions: aix: bound the pp_count scan to the ppe array
From: Bryam Vargas @ 2026-06-07  6:41 UTC
  To: Jens Axboe
  Cc: Philippe De Muyter, Kees Cook, Michael Bommarito, linux-block,
	linux-kernel

aix_partition() reads the physical volume descriptor into a fixed-size
struct pvd and then scans its physical-partition-extent array:

	int numpps = be16_to_cpu(pvd->pp_count);
	...
	for (i = 0; i < numpps; i += 1) {
		struct ppe *p = pvd->ppe + i;
		...
		lp_ix = be16_to_cpu(p->lp_ix);

pvd points at a single kmalloc()'d struct pvd whose ppe[] member holds a
fixed ARRAY_SIZE(pvd->ppe) (1016) entries, but the loop runs up to the
on-disk pp_count.  pp_count is an unvalidated __be16 read straight from
the descriptor, so a crafted AIX image with pp_count larger than 1016
drives the loop to read pvd->ppe[i] past the end of the allocation (up to
65535 entries, ~2 MB out of bounds).

The partition scan runs without mounting anything, when a block device
with a crafted AIX/IBM partition table appears (an attacker-supplied
image attached with losetup -P, or a device auto-scanned by udev), via
msdos_partition() -> aix_partition().

Clamp the scan to the number of entries the ppe[] array can hold.

Fixes: 6ceea22bbbc8 ("partitions: add aix lvm partition support files")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
Reproduced on v7.1-rc6 with KASAN (CONFIG_PARTITION_ADVANCED +
CONFIG_AIX_PARTITION).  A crafted disk image whose AIX/IBM partition table
sets pp_count to 0xffff, attached with `losetup -fP image.img` (in-kernel
partition scan, no mount), is reported by KASAN:

  BUG: KASAN: slab-out-of-bounds in aix_partition+0xb6e/0xee0
  Read of size 2 at addr ... by task losetup
   aix_partition
   msdos_partition
   bdev_disk_changed
   loop_reread_partitions
   loop_configure
   lo_ioctl
   __x64_sys_ioctl

i.e. a read past the end of the kmalloc(sizeof(struct pvd)) object.  A control
image with pp_count == 1016 (== ARRAY_SIZE(pvd->ppe)) is clean.  With this
patch the crafted image is parsed with no out-of-bounds access.

This is the read-loop sibling of the lvd scan bounded by Michael Bommarito's
"partitions: aix: bound the lvd scan to one sector"; that change does not
touch the pp_count/ppe[] loop, so the two are complementary (separate hunks).

 block/partitions/aix.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/block/partitions/aix.c b/block/partitions/aix.c
index 29b8f4cebb63..f3c4174e003e 100644
--- a/block/partitions/aix.c
+++ b/block/partitions/aix.c
@@ -226,6 +226,15 @@ int aix_partition(struct parsed_partitions *state)
 		int next_lp_ix = 1;
 		int lp_ix;
 
+		/*
+		 * pvd was read into a fixed-size struct pvd whose ppe[] array
+		 * holds ARRAY_SIZE(pvd->ppe) entries.  pp_count is an
+		 * unvalidated on-disk __be16, so clamp the scan to the array
+		 * size to avoid walking past the allocation.
+		 */
+		if (numpps > ARRAY_SIZE(pvd->ppe))
+			numpps = ARRAY_SIZE(pvd->ppe);
+
 		for (i = 0; i < numpps; i += 1) {
 			struct ppe *p = pvd->ppe + i;
 			unsigned int lv_ix;
-- 
2.43.0
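
The bounded scan described in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_ppe`, `demo_pvd`, `DEMO_ARRAY_SIZE` and `demo_count_mapped` are hypothetical names, the structures are simplified, and the count is treated as host-endian instead of going through be16_to_cpu().

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-ins for struct ppe and struct pvd. */
struct demo_ppe {
	uint16_t lv_ix;
	uint16_t lp_ix;
};

struct demo_pvd {
	uint16_t pp_count;		/* untrusted, as if read from disk */
	struct demo_ppe ppe[1016];	/* fixed-size array, per the patch text */
};

/* Count entries with a non-zero lp_ix without indexing past ppe[]. */
static size_t demo_count_mapped(const struct demo_pvd *pvd)
{
	size_t numpps = pvd->pp_count;	/* host-endian in this sketch */
	size_t i, mapped = 0;

	/* Clamp the untrusted count to what the array can hold. */
	if (numpps > DEMO_ARRAY_SIZE(pvd->ppe))
		numpps = DEMO_ARRAY_SIZE(pvd->ppe);

	for (i = 0; i < numpps; i++)
		if (pvd->ppe[i].lp_ix)
			mapped++;
	return mapped;
}
```

Even with pp_count set to 0xffff, the loop in this sketch visits at most 1016 entries, which mirrors the intent of the clamp in the hunk above.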




* [PATCH] block: clear zone write plugging flag before failing rejected BIOs
From: Jackie Liu @ 2026-06-07  3:18 UTC
  To: dlemoal, axboe; +Cc: linux-block

From: Jackie Liu <liuyun01@kylinos.cn>

Commit fe0418eb9bd6 ("block: Prevent potential deadlocks in zone write plug
error recovery") changed blk_zone_wplug_handle_write() to fail BIOs
directly when blk_zone_wplug_prepare_bio() rejects them, for example
because the write is not aligned to the cached write pointer or the plug
needs a write pointer update. However, the BIO is already marked with
BIO_ZONE_WRITE_PLUGGING at that point even though it is not issued.

Completing such a BIO with bio_io_error() makes bio_endio() call
blk_zone_write_plug_bio_endio(), which treats the completion as a failed
device write and may poison the cached zone write pointer state by setting
BLK_ZONE_WPLUG_NEED_WP_UPDATE.

Clear BIO_ZONE_WRITE_PLUGGING and drop the zone write plug reference before
failing the rejected BIO.

Fixes: fe0418eb9bd6 ("block: Prevent potential deadlocks in zone write plug error recovery")
Cc: stable@vger.kernel.org # 6.13+
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
---
 block/blk-zoned.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 6a221c180889..855767d8bfc1 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -1502,7 +1502,9 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs)
 		goto queue_bio;
 
 	if (!blk_zone_wplug_prepare_bio(zwplug, bio)) {
+		bio_clear_flag(bio, BIO_ZONE_WRITE_PLUGGING);
 		spin_unlock_irqrestore(&zwplug->lock, flags);
+		disk_put_zone_wplug(zwplug);
 		bio_io_error(bio);
 		return true;
 	}
-- 
2.54.0



* Re: [PATCH] virtio-blk: clamp zone report to the report buffer capacity
From: Michael S. Tsirkin @ 2026-06-07  2:23 UTC
  To: Michael Bommarito
  Cc: Jason Wang, Stefan Hajnoczi, Jens Axboe, Xuan Zhuo,
	virtualization, linux-block, linux-kernel
In-Reply-To: <20260606170415.1523660-1-michael.bommarito@gmail.com>

On Sat, Jun 06, 2026 at 01:04:15PM -0400, Michael Bommarito wrote:
> virtblk_report_zones() trusts the device-reported number of zones when
> walking the report buffer:
> 
> 	nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
> 		   nr_zones);
> 	...
> 	for (i = 0; i < nz && zone_idx < nr_zones; i++) {
> 		ret = virtblk_parse_zone(vblk, &report->zones[i], ...);
> 
> The buffer is allocated by virtblk_alloc_report_buffer(), whose size is
> capped by the queue's max hardware sectors and max segments and can
> therefore hold fewer descriptors than nr_zones. nz is bounded only by
> the device-supplied report->nr_zones and the requested nr_zones, never
> by the buffer's descriptor capacity. At probe time the request count is
> unbounded (blk_revalidate_disk_zones() calls report_zones() with
> nr_zones == UINT_MAX), so the device-supplied report->nr_zones is the
> sole gate: a device that reports more zones than fit in the buffer
> drives the loop to read report->zones[i] past the end of the allocation.
> 
> A malicious or buggy virtio-blk device that reports an inflated nr_zones
> triggers this during zone revalidation at probe. KASAN reports a
> vmalloc-out-of-bounds read in virtblk_report_zones() against the report
> buffer allocated a few lines earlier.
> 
> Clamp nz to the number of descriptors that actually fit in the report
> buffer.
> 
> Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
> Assisted-by: Claude:claude-opus-4-8
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
>  drivers/block/virtio_blk.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index b1c9a27fe00f3..d50aaf956d558 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -689,6 +689,14 @@ static int virtblk_report_zones(struct gendisk *disk, sector_t sector,
>  
>  		nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),

I think nr_zones should have been le64, not virtio64.


>  			   nr_zones);
> +		/*
> +		 * The device-reported nr_zones is untrusted;

this part depends on the config. just drop it.

> clamp it to the
> +		 * number of descriptors that actually fit in the report buffer
> +		 * so a malicious or buggy device cannot drive the parse loop
> +		 * past the allocation.
> +		 */
> +		nz = min_t(u64, nz,
> +			   (buflen - sizeof(*report)) / sizeof(report->zones[0]));
>  		if (!nz)
>  			break;
>  
> 
> base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
> -- 
> 2.53.0



* Re: [PATCH next] drivers/block/rbd: Use strscpy() to copy strings into arrays
From: Alex Elder @ 2026-06-06 23:55 UTC
  To: david.laight.linux, Kees Cook, linux-hardening, Arnd Bergmann,
	ceph-devel, linux-block, linux-kernel
  Cc: Ilya Dryomov, Jens Axboe
In-Reply-To: <20260606202744.5113-5-david.laight.linux@gmail.com>

On 6/6/26 3:27 PM, david.laight.linux@gmail.com wrote:
> From: David Laight <david.laight.linux@gmail.com>
> 
> Replacing strcpy() with strscpy() ensures that overflow of the target
> buffer cannot happen.
> 
> Signed-off-by: David Laight <david.laight.linux@gmail.com>
> ---
> This is one of a group of patches that remove potentially unbounded
> strcpy() calls.
> 
> They are mostly replaced by strscpy() or, when strlen() has just been
> called, with memcpy() (usually including the '\0').
> 
> Calls which copy string literals into arrays are left unchanged.
> They are safe and easily detected as such.
> 
> The changes were made by getting the compiler to detect the calls and
> then fixing the code by hand.
> 
> Note that all the changes are only compile tested.
> 
> Some Makefiles were changed to allow files to contain strcpy().
> As well as 'difficult to fix' files, this included 'show' functions
> as they really need to use sysfs_emit() or seq_printf().
> 
> All the patches are being sent individually to avoid very long cc lists.
> Apologies for the terse commit messages and likely unexpected tags.
> (There are about 100 patches in total.)
> 
>   drivers/block/rbd.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 4065336ebd1f..632fa2d56ea0 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -3672,7 +3672,7 @@ static void __rbd_lock(struct rbd_device *rbd_dev, const char *cookie)
>   	struct rbd_client_id cid = rbd_get_cid(rbd_dev);
>   
>   	rbd_dev->lock_state = RBD_LOCK_STATE_LOCKED;
> -	strcpy(rbd_dev->lock_cookie, cookie);
> +	strscpy(rbd_dev->lock_cookie, cookie);

This looks good.  The rbd_device->lock_cookie is a 32 byte
array.

Cookies passed in are always a 32 byte array (despite the
function only requiring a string pointer).

Reviewed-by: Alex Elder <elder@riscstar.com>

>   	rbd_set_owner_cid(rbd_dev, &cid);
>   	queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work);
>   }



* [PATCH next] drivers/block/rbd: Use strscpy() to copy strings into arrays
From: david.laight.linux @ 2026-06-06 20:27 UTC
  To: Kees Cook, linux-hardening, Arnd Bergmann, ceph-devel,
	linux-block, linux-kernel
  Cc: Ilya Dryomov, Jens Axboe, David Laight

From: David Laight <david.laight.linux@gmail.com>

Replacing strcpy() with strscpy() ensures that overflow of the target
buffer cannot happen.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
This is one of a group of patches that remove potentially unbounded
strcpy() calls.

They are mostly replaced by strscpy() or, when strlen() has just been
called, with memcpy() (usually including the '\0').

Calls which copy string literals into arrays are left unchanged.
They are safe and easily detected as such.

The changes were made by getting the compiler to detect the calls and
then fixing the code by hand.

Note that all the changes are only compile tested.

Some Makefiles were changed to allow files to contain strcpy().
As well as 'difficult to fix' files, this included 'show' functions
as they really need to use sysfs_emit() or seq_printf().

All the patches are being sent individually to avoid very long cc lists.
Apologies for the terse commit messages and likely unexpected tags.
(There are about 100 patches in total.)

 drivers/block/rbd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4065336ebd1f..632fa2d56ea0 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3672,7 +3672,7 @@ static void __rbd_lock(struct rbd_device *rbd_dev, const char *cookie)
 	struct rbd_client_id cid = rbd_get_cid(rbd_dev);
 
 	rbd_dev->lock_state = RBD_LOCK_STATE_LOCKED;
-	strcpy(rbd_dev->lock_cookie, cookie);
+	strscpy(rbd_dev->lock_cookie, cookie);
 	rbd_set_owner_cid(rbd_dev, &cid);
 	queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work);
 }
-- 
2.39.5



* [PATCH next] block/early-lookup: Replace strlen() strcpy() pair with strscpy()
From: david.laight.linux @ 2026-06-06 20:26 UTC
  To: Kees Cook, linux-hardening, Arnd Bergmann, linux-block,
	linux-kernel
  Cc: Jens Axboe, David Laight

From: David Laight <david.laight.linux@gmail.com>

Use the result of strscpy() for the overflow check.

Signed-off-by: David Laight <david.laight.linux@gmail.com>
---
This is one of a group of patches that remove potentially unbounded
strcpy() calls.

They are mostly replaced by strscpy() or, when strlen() has just been
called, with memcpy() (usually including the '\0').

Calls which copy string literals into arrays are left unchanged.
They are safe and easily detected as such.

The changes were made by getting the compiler to detect the calls and
then fixing the code by hand.

Note that all the changes are only compile tested.

Some Makefiles were changed to allow files to contain strcpy().
As well as 'difficult to fix' files, this included 'show' functions
as they really need to use sysfs_emit() or seq_printf().

All the patches are being sent individually to avoid very long cc lists.
Apologies for the terse commit messages and likely unexpected tags.
(There are about 100 patches in total.)

 block/early-lookup.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/early-lookup.c b/block/early-lookup.c
index 3fb57f7d2b12..18ab4a2da676 100644
--- a/block/early-lookup.c
+++ b/block/early-lookup.c
@@ -156,9 +156,8 @@ static int __init devt_from_devname(const char *name, dev_t *devt)
 	char s[32];
 	char *p;
 
-	if (strlen(name) > 31)
+	if (strscpy(s, name) < 0)
 		return -EINVAL;
-	strcpy(s, name);
 	for (p = s; *p; p++) {
 		if (*p == '/')
 			*p = '!';
-- 
2.39.5
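
Why the return value of a bounded copy can replace the separate strlen() check may be easier to see in a userspace sketch. `demo_strscpy()` below is a hypothetical stand-in written for this illustration, not the kernel's strscpy(); it assumes the same convention of returning the copied length on success and -E2BIG on truncation. `demo_copy_devname()` is likewise an invented helper that mimics the shape of the patched code.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a bounded string copy: always
 * NUL-terminates when size > 0, returns the number of characters copied,
 * or -E2BIG if src did not fit in size bytes.
 */
static long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -E2BIG;
}

/* Copy a device name into a 32-byte buffer and map '/' to '!'. */
static int demo_copy_devname(char s[32], const char *name)
{
	char *p;

	/* One call both copies and detects names longer than 31 chars. */
	if (demo_strscpy(s, name, 32) < 0)
		return -EINVAL;
	for (p = s; *p; p++)
		if (*p == '/')
			*p = '!';
	return 0;
}
```

With a 32-byte destination, a 31-character name fits and a 32-character name is rejected, which is the same boundary as the original `strlen(name) > 31` test.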



* [PATCH] partitions: aix: bound the lvd scan to one sector
From: Michael Bommarito @ 2026-06-06 17:07 UTC
  To: Jens Axboe; +Cc: Kees Cook, linux-block, linux-kernel

aix_partition() reads the logical-volume descriptor array as a single
sector and then scans it:

	if (numlvs && (d = read_part_sector(state, vgda_sector + 1, &sect))) {
		struct lvd *p = (struct lvd *)d;
		...
		for (i = 0; foundlvs < numlvs && i < state->limit; i += 1) {
			lvip[i].pps_per_lv = be16_to_cpu(p[i].num_lps);

p points at a single 512-byte sector, which holds 512 / sizeof(struct
lvd) = 16 entries, but the loop runs until foundlvs reaches the on-disk
numlvs or i reaches state->limit (DISK_MAX_PARTS, 256). numlvs is an
on-disk __be16 read straight from the volume group descriptor and is not
validated, so a crafted AIX image with numlvs larger than 16 and lvd
entries whose num_lps fields are zero (so foundlvs never advances) drives
the loop to read p[i] well past the end of the read sector buffer.

The 2014 off-by-one fix d97a86c170b4 hardened the matching write of
lvip[lv_ix] but left this read loop unbounded.

Bound the scan to the number of struct lvd entries that fit in the
sector that was actually read.

Fixes: 6ceea22bbbc8 ("partitions: add aix lvm partition support files")
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
 block/partitions/aix.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/partitions/aix.c b/block/partitions/aix.c
index 29b8f4cebb63d..6679e825ba329 100644
--- a/block/partitions/aix.c
+++ b/block/partitions/aix.c
@@ -208,7 +208,14 @@ int aix_partition(struct parsed_partitions *state)
 		if (n) {
 			int foundlvs = 0;
 
-			for (i = 0; foundlvs < numlvs && i < state->limit; i += 1) {
+			/*
+			 * The lvd array was read as a single sector; only the
+			 * struct lvd entries that fit in it are valid.  Bound the
+			 * scan so an on-disk numlvs larger than that cannot walk
+			 * the read buffer out of bounds.
+			 */
+			for (i = 0; foundlvs < numlvs && i < state->limit &&
+				    i < 512 / (int)sizeof(struct lvd); i += 1) {
 				lvip[i].pps_per_lv = be16_to_cpu(p[i].num_lps);
 				if (lvip[i].pps_per_lv)
 					foundlvs += 1;

base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
-- 
2.53.0
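
The three-way loop bound from the patch above can be tried in userspace. The sketch below is an assumption-laden illustration: `demo_lvd` is a hypothetical 32-byte layout chosen only so that 16 entries fit in a 512-byte sector (matching the arithmetic in the commit message), and `demo_scan_lvds()` is not a kernel function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SECTOR_SIZE 512

/* Hypothetical 32-byte descriptor; only num_lps matters for the sketch. */
struct demo_lvd {
	uint16_t lv_ix;
	uint16_t res1;
	uint32_t res2;
	uint16_t num_lps;
	uint8_t pad[22];
};

_Static_assert(sizeof(struct demo_lvd) == 32, "sketch assumes 32-byte entries");

/*
 * Scan one sector of descriptors. Stops when enough non-empty entries
 * were found, when the caller's limit is reached, or when the sector is
 * exhausted. Returns the number of entries visited.
 */
static int demo_scan_lvds(const uint8_t *sector, int numlvs, int limit,
			  int *foundlvs_out)
{
	int max_entries = DEMO_SECTOR_SIZE / (int)sizeof(struct demo_lvd);
	int foundlvs = 0;
	int i;

	for (i = 0; foundlvs < numlvs && i < limit && i < max_entries; i++) {
		struct demo_lvd e;

		/* memcpy avoids alignment assumptions about the raw buffer. */
		memcpy(&e, sector + (size_t)i * sizeof(e), sizeof(e));
		if (e.num_lps)
			foundlvs++;
	}
	*foundlvs_out = foundlvs;
	return i;
}
```

For an all-zero sector with numlvs set to 100 and a limit of 256, foundlvs never advances, and only the sector bound stops the scan, after 16 entries.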



* [PATCH] virtio-blk: clamp zone report to the report buffer capacity
From: Michael Bommarito @ 2026-06-06 17:04 UTC
  To: Michael S . Tsirkin, Jason Wang, Stefan Hajnoczi, Jens Axboe
  Cc: Xuan Zhuo, virtualization, linux-block, linux-kernel

virtblk_report_zones() trusts the device-reported number of zones when
walking the report buffer:

	nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
		   nr_zones);
	...
	for (i = 0; i < nz && zone_idx < nr_zones; i++) {
		ret = virtblk_parse_zone(vblk, &report->zones[i], ...);

The buffer is allocated by virtblk_alloc_report_buffer(), whose size is
capped by the queue's max hardware sectors and max segments and can
therefore hold fewer descriptors than nr_zones. nz is bounded only by
the device-supplied report->nr_zones and the requested nr_zones, never
by the buffer's descriptor capacity. At probe time the request count is
unbounded (blk_revalidate_disk_zones() calls report_zones() with
nr_zones == UINT_MAX), so the device-supplied report->nr_zones is the
sole gate: a device that reports more zones than fit in the buffer
drives the loop to read report->zones[i] past the end of the allocation.

A malicious or buggy virtio-blk device that reports an inflated nr_zones
triggers this during zone revalidation at probe. KASAN reports a
vmalloc-out-of-bounds read in virtblk_report_zones() against the report
buffer allocated a few lines earlier.

Clamp nz to the number of descriptors that actually fit in the report
buffer.

Fixes: 95bfec41bd3d ("virtio-blk: add support for zoned block devices")
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
 drivers/block/virtio_blk.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index b1c9a27fe00f3..d50aaf956d558 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -689,6 +689,14 @@ static int virtblk_report_zones(struct gendisk *disk, sector_t sector,
 
 		nz = min_t(u64, virtio64_to_cpu(vblk->vdev, report->nr_zones),
 			   nr_zones);
+		/*
+		 * The device-reported nr_zones is untrusted; clamp it to the
+		 * number of descriptors that actually fit in the report buffer
+		 * so a malicious or buggy device cannot drive the parse loop
+		 * past the allocation.
+		 */
+		nz = min_t(u64, nz,
+			   (buflen - sizeof(*report)) / sizeof(report->zones[0]));
 		if (!nz)
 			break;
 

base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
-- 
2.53.0



* Re: [PATCH 4/4] block: add configurable error injection
From: Damien Le Moal @ 2026-06-06  7:33 UTC
  To: Christoph Hellwig, Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc
In-Reply-To: <20260605184441.590927-5-hch@lst.de>

On 2026/06/06 2:44, Christoph Hellwig wrote:
> Add a new block error injection interface that allows injecting specific
> status codes for specific ranges.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

[...]

> +===================	=======================================================
> +op=%s			block layer operation this rule applies to, e.g. READ
> +			or WRITE.

Like you did in the commit message of patch 3, maybe mention that this should
match the "XYZ" of one of the defined REQ_OP_XYZ operations?

> +			Mandatory.
> +start=%u		First block layer sector the rule applies to.
> +			Optional, defaults to 0.
> +nr_sectors=%u		Number of sectors this rule applies.
> +			Optional, defaults to the remainder of the device.
> +status=%s		Status to return.

Maybe mention that this should match the "XYZ" of one of the defined BLK_STS_XYZ codes?

> +			Mandatory.
> +chance=%u		Only return a failure with a likelihood of 1/chance.
> +			Optional, defaults to 1 (always).
> +===================	=======================================================

[...]

> +	/*
> +	 * Add to the front of the list so that newer entries can partially
> +	 * override other entries.  This also intentional allows duplicate

s/intentional/intentionally

> +	 * entries as there is no real reason to reject them.
> +	 */

Besides these nits, looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH 4/4] block: add configurable error injection
From: Hannes Reinecke @ 2026-06-06  7:28 UTC
  To: Christoph Hellwig, Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc
In-Reply-To: <20260605184441.590927-5-hch@lst.de>

On 6/5/26 20:44, Christoph Hellwig wrote:
> Add a new block error injection interface that allows injecting specific
> status codes for specific ranges.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   Documentation/block/error-injection.rst |  59 +++++
>   Documentation/block/index.rst           |   1 +
>   block/Kconfig                           |   7 +
>   block/Makefile                          |   1 +
>   block/blk-core.c                        |   3 +
>   block/blk-sysfs.c                       |   4 +
>   block/blk.h                             |  12 +
>   block/error-injection.c                 | 308 ++++++++++++++++++++++++
>   block/genhd.c                           |   4 +
>   include/linux/blkdev.h                  |   6 +
>   10 files changed, 405 insertions(+)
>   create mode 100644 Documentation/block/error-injection.rst
>   create mode 100644 block/error-injection.c
> 
Reviewed-by: Hannes Reinecke <hare@kernel.org>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [PATCH 3/4] block: add a str_to_blk_op helper
From: Hannes Reinecke @ 2026-06-06  7:27 UTC
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-4-hch@lst.de>

On 6/5/26 20:44, Christoph Hellwig wrote:
> Add a helper to find the REQ_OP_XYZ constant from the "XYZ" string.
> This will be used for the error injection debugfs interface.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/blk-core.c | 10 ++++++++++
>   block/blk.h      |  1 +
>   2 files changed, 11 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@kernel.org>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich


* Re: [PATCH 2/4] block: add a "tag" for block status codes
From: Hannes Reinecke @ 2026-06-06  7:25 UTC
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-3-hch@lst.de>

On 6/5/26 20:44, Christoph Hellwig wrote:
> The full name of the status codes is not good for user interfaces as it
> can contain white spaces.  Add the name of the status code without the
> BLK_STS_ prefix as a tag so that it can be used for user interfaces.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/blk-core.c | 28 ++++++++++++++++++++++++++++
>   block/blk.h      |  2 ++
>   2 files changed, 30 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@kernel.org>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply

* Re: [PATCH 1/4] block: add a macro to initialize the status table
From: Hannes Reinecke @ 2026-06-06  7:24 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-2-hch@lst.de>

On 6/5/26 20:44, Christoph Hellwig wrote:
> Prepare for adding a new value to the error table by adding a macro
> to fill it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>
> ---
>   block/blk-core.c | 45 +++++++++++++++++++++++++--------------------
>   1 file changed, 25 insertions(+), 20 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@kernel.org>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

^ permalink raw reply

* Re: [PATCH 3/4] block: add a str_to_blk_op helper
From: Damien Le Moal @ 2026-06-06  7:20 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-4-hch@lst.de>

On 2026/06/06 2:44, Christoph Hellwig wrote:
> Add a helper to find the REQ_OP_XYZ constant from the "XYZ" string.
> This will be used for the error injection debugfs interface.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>

Looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply

* Re: [PATCH 2/4] block: add a "tag" for block status codes
From: Damien Le Moal @ 2026-06-06  7:20 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-3-hch@lst.de>

On 2026/06/06 2:44, Christoph Hellwig wrote:
> The full name of the status codes is not good for user interfaces as it
> can contain white spaces.  Add the name of the status code without the
> BLK_STS_ prefix as a tag so that it can be used for user interfaces.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>

Looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply

* Re: [PATCH 1/4] block: add a macro to initialize the status table
From: Damien Le Moal @ 2026-06-06  7:14 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-2-hch@lst.de>

On 2026/06/06 2:44, Christoph Hellwig wrote:
> Prepare for adding a new value to the error table by adding a macro
> to fill it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Keith Busch <kbusch@kernel.org>

Looks good to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply

* [PATCH] block: optimize I/O merge hot path with unlikely() hints
From: Steven Feng @ 2026-06-06  2:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-kernel, Steven Feng

Remove redundant '== false' comparisons and add unlikely() branch
prediction hints in block I/O merge path functions.

These functions (ll_new_hw_segment, ll_merge_requests_fn, and
blk_rq_merge_ok) are executed on every I/O request merge attempt,
making them critical hot paths. Data integrity check failures are
rare events, so marking these conditions as unlikely() lets the
compiler keep the common case on the straight-line path.

Changes:
- Replace 'func() == false' with 'unlikely(!func())' for better
  code style and branch prediction

This micro-optimization reduces branch misprediction penalties in
high-frequency I/O merge paths.
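
For readers unfamiliar with the hint, a minimal userspace sketch follows.
It assumes a GCC- or Clang-compatible compiler; the macros are stand-ins
for the kernel's own definitions, and integrity_ok(), can_merge() and
check_hint_is_transparent() are hypothetical names, not block layer
functions. The hint never changes the result of the test, only how the
compiler is encouraged to lay out the two branches.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's likely()/unlikely() macros. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical check that succeeds for almost every caller. */
static bool integrity_ok(int nr_segments)
{
	return nr_segments >= 0;
}

/*
 * Same shape as the patched call sites: the rare failure branch is marked
 * cold. The hint affects code layout only, never the result of the test.
 */
static bool can_merge(int nr_segments)
{
	if (unlikely(!integrity_ok(nr_segments)))
		return false;
	return true;
}

/* Both macros normalise their argument to 0 or 1 and pass it through. */
static void check_hint_is_transparent(int x)
{
	assert(likely(x) == !!x);
	assert(unlikely(x) == !!x);
}
```

With these definitions can_merge(4) is true and can_merge(-1) is false,
with or without the hint. Any benefit shows up only in the generated code
and would have to be measured on the workload in question.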

Signed-off-by: Steven Feng <steven@joint-cloud.com>
---
 block/blk-merge.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index fcf09325b22e..65347d1646a1 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -547,7 +547,7 @@ static inline int ll_new_hw_segment(struct request *req, struct bio *bio,
 	if (!blk_cgroup_mergeable(req, bio))
 		goto no_merge;
 
-	if (blk_integrity_merge_bio(req->q, req, bio) == false)
+	if (unlikely(!blk_integrity_merge_bio(req->q, req, bio)))
 		goto no_merge;
 
 	/* discard request merge won't add new segment */
@@ -649,7 +649,7 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
 	if (!blk_cgroup_mergeable(req, next->bio))
 		return 0;
 
-	if (blk_integrity_merge_rq(q, req, next) == false)
+	if (unlikely(!blk_integrity_merge_rq(q, req, next)))
 		return 0;
 
 	if (!bio_crypt_ctx_merge_rq(req, next))
@@ -905,7 +905,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 
 	if (!blk_cgroup_mergeable(rq, bio))
 		return false;
-	if (blk_integrity_merge_bio(rq->q, rq, bio) == false)
+	if (unlikely(!blk_integrity_merge_bio(rq->q, rq, bio)))
 		return false;
 	if (!bio_crypt_rq_ctx_compatible(rq, bio))
 		return false;
@@ -915,7 +915,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
 		return false;
 	if (rq->bio->bi_ioprio != bio->bi_ioprio)
 		return false;
-	if (blk_atomic_write_mergeable_rq_bio(rq, bio) == false)
+	if (unlikely(!blk_atomic_write_mergeable_rq_bio(rq, bio)))
 		return false;
 
 	return true;
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH v7 00/14] Enable lock context analysis for the block layer core
From: Jens Axboe @ 2026-06-05 19:41 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-block, Christoph Hellwig, Hannes Reinecke, Damien Le Moal
In-Reply-To: <cover.1780682325.git.bvanassche@acm.org>


On Fri, 05 Jun 2026 11:00:53 -0700, Bart Van Assche wrote:
> Recently the following patch series has been merged: [PATCH v5 00/36]
> Compiler-Based Context- and Locking-Analysis
> (https://lore.kernel.org/lkml/20251219154418.3592607-1-elver@google.com/). That
> patch series drops support for verifying lock context annotations with sparse
> and introduces support for verifying lock context annotations with Clang. The
> support in Clang for lock context annotation and verification is better than
> that in sparse. As an example, __cond_acquires() and __guarded_by() are
> supported by Clang but not by sparse. Hence this patch series that enables lock
> context analysis for the block layer core.
> 
> [...]

Applied, thanks!

[01/14] block: Annotate the queue limits functions
        commit: 08d912bc44dab63f2637677712d2a0b86922389a
[02/14] block/bdev: Annotate the blk_holder_ops callback functions
        commit: 3033c86fa1a8bb31d0a13738fe8c5f9e5bbaf98a
[03/14] block/cgroup: Split blkg_conf_prep()
        commit: ea4f575e72df0fa9e4b3f57a6f48c1ae81fac7b4
[04/14] block/cgroup: Split blkg_conf_exit()
        commit: c574c3cc368d68fff465e5fc811f874d9235b940
[05/14] block/cgroup: Improve lock context annotations
        commit: 9865e416644292124865dfc8a4ffd2b8e6764242
[06/14] block/blk-iocost: Combine two error paths in ioc_qos_write()
        commit: 6a7717a2df6c01b2158979f311ddf4cb35b8987f
[07/14] block/cgroup: Inline blkg_conf_{open,close}_bdev_frozen()
        commit: 998cda78d4e364f75e576ba715a2533462990aee
[08/14] block/crypto: Annotate the crypto functions
        commit: 73bb2480e3eccc6cb2419691c9e60dea9dc6d719
[09/14] block/blk-iocost: Split ioc_rqos_throttle()
        commit: 1ff85a387947890938c05cfe22041dfeef3098dd
[10/14] block/blk-iocost: Inline iocg_lock() and iocg_unlock()
        commit: a255026594e9b7eea24c12d2bd4acae0c11eea94
[11/14] block/blk-mq-debugfs: Improve lock context annotations
        commit: 131f14125a1840d393c3dec8910483e3fc3daf18
[12/14] block/Kyber: Make the lock context annotations compatible with Clang
        commit: b4591b91526ef53eedefc124221ec1a060bfbe54
[13/14] block/mq-deadline: Make the lock context annotations compatible with Clang
        commit: f10b2de2af28f90c9d1a0774a474e5c4e4d222da
[14/14] block: Enable lock context analysis
        commit: 5f0777166e3eaefc02ec0e381658f510f4d068ce

Best regards,
-- 
Jens Axboe




^ permalink raw reply

* Re: [PATCH 1/4] block: add a macro to initialize the status table
From: Matthew Wilcox @ 2026-06-05 18:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-2-hch@lst.de>

On Fri, Jun 05, 2026 at 08:44:27PM +0200, Christoph Hellwig wrote:
> Prepare for adding a new value to the error table by adding a macro
> to fill it.

> +#define ENT(_tag, _errno, _desc)	\
> +[BLK_STS_##_tag] = {				\
> +	.errno		= _errno,		\
> +	.name		= _desc,		\

Bleh.  I hate this.  Before, I can grep for BLK_STS_NOSPC and find it.
After, I can't.  Yes, I know we have a lot of such things already, but
I don't like adding more.
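
To make the objection concrete, here is a small userspace sketch of what
the macro does. The enum values, the three-entry table and the
status_errno()/status_name() helpers are stand-ins chosen for
illustration, not the kernel definitions, and the field is called err
because errno is a macro in userspace. ENT(NOSPC, ...) pastes the
designator BLK_STS_NOSPC together in the preprocessor, so the full
identifier never appears in the table source and a plain grep for it
finds only the other users.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in status values; the kernel uses blk_status_t constants instead. */
enum { BLK_STS_OK, BLK_STS_NOSPC, BLK_STS_IOERR };

/* _tag is pasted onto the BLK_STS_ prefix to form the array designator. */
#define ENT(_tag, _errno, _desc)		\
[BLK_STS_##_tag] = {				\
	.err	= _errno,			\
	.name	= _desc,			\
}

static const struct {
	int		err;
	const char	*name;
} blk_errors[] = {
	ENT(OK,		0,		""),
	ENT(NOSPC,	-ENOSPC,	"critical space allocation"),
	ENT(IOERR,	-EIO,		"I/O"),
};
#undef ENT

/* Return the errno for a status index; idx must be inside the table. */
static int status_errno(unsigned int idx)
{
	assert(idx < sizeof(blk_errors) / sizeof(blk_errors[0]));
	return blk_errors[idx].err;
}

/* Return the description for a status index; idx must be inside the table. */
static const char *status_name(unsigned int idx)
{
	assert(idx < sizeof(blk_errors) / sizeof(blk_errors[0]));
	return blk_errors[idx].name;
}
```

The table behaves exactly like the open-coded one: looking up
BLK_STS_NOSPC still yields -ENOSPC. The cost is purely in searchability,
since only the suffix NOSPC appears where the entry is defined.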

^ permalink raw reply

* [PATCH 4/4] block: add configurable error injection
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc
In-Reply-To: <20260605184441.590927-1-hch@lst.de>

Add a new block error injection interface that allows injecting specific
status codes for specific sector ranges of a block device.
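
The matching rule at the heart of the interface can be sketched in plain
userspace C. This is an illustration under assumptions, not the code in
this patch: struct rule, rule_set_range() and rule_matches() are
hypothetical names, sectors are plain uint64_t, and the 1/chance roll is
passed in by the caller instead of being drawn from get_random_u32(). A
rule stores an inclusive end sector, an I/O covers the half-open range
[sector, sector + nr), and the two overlap when each one starts before
the other ends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct rule {
	uint64_t	start;	/* first sector covered */
	uint64_t	end;	/* last sector covered, inclusive */
	unsigned int	chance;	/* fire on average once per 'chance' hits */
};

/* nr_sectors == 0 means "to the end of the device"; false on overflow. */
static bool rule_set_range(struct rule *r, uint64_t start, uint64_t nr_sectors)
{
	if (nr_sectors && UINT64_MAX - nr_sectors < start)
		return false;
	r->start = start;
	r->end = nr_sectors ? start + nr_sectors - 1 : UINT64_MAX;
	return true;
}

/*
 * An I/O covering [sector, sector + nr) hits the rule when the ranges
 * overlap and the caller-supplied roll lands on 0 modulo chance.
 * Assumes sector + nr does not wrap around.
 */
static bool rule_matches(const struct rule *r, uint64_t sector, uint64_t nr,
		uint32_t roll)
{
	assert(r->chance > 0);	/* a chance of 0 would divide by zero */

	if (sector > r->end || sector + nr <= r->start)
		return false;
	return r->chance == 1 || roll % r->chance == 0;
}
```

For example, a rule set up with start 8 and nr_sectors 8 ends at sector
15, so a one-sector read at sector 15 matches while a read starting at
sector 16 does not. With chance set to 10, only rolls that are a multiple
of 10 fire.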

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/block/error-injection.rst |  59 +++++
 Documentation/block/index.rst           |   1 +
 block/Kconfig                           |   7 +
 block/Makefile                          |   1 +
 block/blk-core.c                        |   3 +
 block/blk-sysfs.c                       |   4 +
 block/blk.h                             |  12 +
 block/error-injection.c                 | 308 ++++++++++++++++++++++++
 block/genhd.c                           |   4 +
 include/linux/blkdev.h                  |   6 +
 10 files changed, 405 insertions(+)
 create mode 100644 Documentation/block/error-injection.rst
 create mode 100644 block/error-injection.c

diff --git a/Documentation/block/error-injection.rst b/Documentation/block/error-injection.rst
new file mode 100644
index 000000000000..b2e2ab6add70
--- /dev/null
+++ b/Documentation/block/error-injection.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Configurable Error Injection
+============================
+
+Overview
+--------
+
+Configurable error injection allows injecting specific block layer status codes
+for ranges of a block device.  Errors can be injected unconditionally, or with a
+given probability.
+
+To use configurable error injection, CONFIG_BLK_ERROR_INJECTION must be enabled.
+
+The only interface is the error_injection debugfs file, which is created for
+each registered gendisk.  Writes to this file are used to create or delete rules
+and reads return a list of the current error injection sites.
+
+Options
+-------
+
+The following options specify the operations:
+
+===================	=======================================================
+add			add a new rule
+removeall		remove all existing rules
+===================	=======================================================
+
+The following options specify the details of the rule for the add operation:
+
+===================	=======================================================
+op=%s			block layer operation this rule applies to, e.g. READ
+			or WRITE.
+			Mandatory.
+start=%u		First block layer sector the rule applies to.
+			Optional, defaults to 0.
+nr_sectors=%u		Number of sectors this rule applies to.
+			Optional, defaults to the remainder of the device.
+status=%s		Status to return.
+			Mandatory.
+chance=%u		Only return a failure with a likelihood of 1/chance.
+			Optional, defaults to 1 (always).
+===================	=======================================================
+
+Example
+-------
+
+Return BLK_STS_IOERR for one in 10 reads of sector 0 of /dev/nvme0n1:
+
+	$ echo 'add,op=READ,start=0,nr_sectors=1,status=IOERR,chance=10' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Return BLK_STS_MEDIUM for every write to /dev/nvme0n1:
+
+	$ echo 'add,op=WRITE,start=0,status=MEDIUM' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Remove all rules for /dev/nvme0n1:
+
+	$ echo 'removeall' > /sys/kernel/debug/block/nvme0n1/error_injection
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index 9fea696f9daa..bfa1bbd31ddf 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -22,3 +22,4 @@ Block
    switching-sched
    writeback_cache_control
    ublk
+   error-injection
diff --git a/block/Kconfig b/block/Kconfig
index 15027963472d..7651b86eed56 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -221,6 +221,13 @@ config BLOCK_HOLDER_DEPRECATED
 config BLK_MQ_STACKING
 	bool
 
+config BLK_ERROR_INJECTION
+	bool "Enable block layer error injection"
+	help
+	  Enable inserting arbitrary block errors through a debugfs interface.
+
+	  See Documentation/block/error-injection.rst for details.
+
 source "block/Kconfig.iosched"
 
 endif # BLOCK
diff --git a/block/Makefile b/block/Makefile
index 7dce2e44276c..d0bb3e15a347 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -11,6 +11,7 @@ obj-y		:= bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \
 			genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \
 			disk-events.o blk-ia-ranges.o early-lookup.o
 
+obj-$(CONFIG_BLK_ERROR_INJECTION) += error-injection.o
 obj-$(CONFIG_BLK_DEV_BSG_COMMON) += bsg.o
 obj-$(CONFIG_BLK_DEV_BSGLIB)	+= bsg-lib.o
 obj-$(CONFIG_BLK_CGROUP)	+= blk-cgroup.o
diff --git a/block/blk-core.c b/block/blk-core.c
index aa90aad6da13..268735582ef1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -767,6 +767,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
 
 void submit_bio_noacct_nocheck(struct bio *bio, bool split)
 {
+	if (unlikely(blk_error_inject(bio)))
+		return;
+
 	blk_cgroup_bio_start(bio);
 
 	if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f22c1f253eb3..8a0c2be48a31 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -933,6 +933,8 @@ static void blk_debugfs_remove(struct gendisk *disk)
 
 	blk_debugfs_lock_nomemsave(q);
 	blk_trace_shutdown(q);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_exit(disk);
 	debugfs_remove_recursive(q->debugfs_dir);
 	q->debugfs_dir = NULL;
 	q->sched_debugfs_dir = NULL;
@@ -963,6 +965,8 @@ int blk_register_queue(struct gendisk *disk)
 
 	memflags = blk_debugfs_lock(q);
 	q->debugfs_dir = debugfs_create_dir(disk->disk_name, blk_debugfs_root);
+	if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		blk_error_injection_init(disk);
 	if (queue_is_mq(q))
 		blk_mq_debugfs_register(q);
 	blk_debugfs_unlock(q, memflags);
diff --git a/block/blk.h b/block/blk.h
index 93b30d1d0ec6..ec2639752e07 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -660,6 +660,18 @@ static inline bool should_fail_request(struct block_device *part,
 }
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+void blk_error_injection_init(struct gendisk *disk);
+void blk_error_injection_exit(struct gendisk *disk);
+bool __blk_error_inject(struct bio *bio);
+static inline bool blk_error_inject(struct bio *bio)
+{
+	if (!IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+		return false;
+	if (!test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
+		return false;
+	return __blk_error_inject(bio);
+}
+
 /*
  * Optimized request reference counting. Ideally we'd make timeouts be more
  * clever, as that's the only reason we need references at all... But until
diff --git a/block/error-injection.c b/block/error-injection.c
new file mode 100644
index 000000000000..f35bce1d25cc
--- /dev/null
+++ b/block/error-injection.c
@@ -0,0 +1,308 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2026 Christoph Hellwig.
+ */
+#include <linux/debugfs.h>
+#include <linux/blkdev.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include "blk.h"
+
+struct blk_error_inject {
+	struct list_head		entry;
+	sector_t			start;
+	sector_t			end;
+	enum req_op			op;
+	blk_status_t			status;
+
+	/* inject on average once per 'chance' matching bios */
+	unsigned int			chance;
+};
+
+bool __blk_error_inject(struct bio *bio)
+{
+	struct gendisk *disk = bio->bi_bdev->bd_disk;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		if (bio->bi_iter.bi_sector <= inj->end &&
+		    bio_end_sector(bio) > inj->start &&
+		    bio_op(bio) == inj->op) {
+			blk_status_t status = inj->status;
+
+			if (inj->chance > 1 &&
+			    (get_random_u32() % inj->chance) != 0)
+				continue;
+
+			pr_info_ratelimited("%pg: injecting %s error for %s at sector %llu:%u\n",
+					disk->part0,
+					blk_status_to_str(status),
+					blk_op_str(inj->op),
+					bio->bi_iter.bi_sector,
+					bio_sectors(bio));
+			rcu_read_unlock();
+			bio_endio_status(bio, status);
+			return true;
+		}
+	}
+	rcu_read_unlock();
+	return false;
+}
+
+static int error_inject_add(struct gendisk *disk, enum req_op op,
+		sector_t start, u64 nr_sectors, blk_status_t status,
+		unsigned int chance)
+{
+	struct blk_error_inject *inj;
+	int error = -EINVAL;
+
+	if (op == REQ_OP_LAST)
+		return -EINVAL;
+	if (status == BLK_STS_OK)
+		return -EINVAL;
+
+	inj = kzalloc_obj(*inj);
+	if (!inj)
+		return -ENOMEM;
+
+	if (nr_sectors) {
+		if (U64_MAX - nr_sectors < start)
+			goto out_free_inj;
+		inj->end = start + nr_sectors - 1;
+	} else {
+		inj->end = U64_MAX;
+	}
+
+	inj->op = op;
+	inj->start = start;
+	inj->status = status;
+	inj->chance = chance;
+
+	pr_debug_ratelimited("%pg: adding %s injection for %s at sector %llu:%llu\n",
+			disk->part0, blk_status_to_str(status),
+			blk_op_str(op),
+			start, nr_sectors);
+
+	/*
+	 * Add to the front of the list so that newer entries can partially
+	 * override other entries.  This also intentionally allows duplicate
+	 * entries as there is no real reason to reject them.
+	 */
+	mutex_lock(&disk->error_injection_lock);
+	if (!disk_live(disk)) {
+		mutex_unlock(&disk->error_injection_lock);
+		error = -ENODEV;
+		goto out_free_inj;
+	}
+	list_add_rcu(&inj->entry, &disk->error_injection_list);
+	set_bit(GD_ERROR_INJECT, &disk->state);
+	mutex_unlock(&disk->error_injection_lock);
+	return 0;
+
+out_free_inj:
+	kfree(inj);
+	return error;
+}
+
+static void error_inject_removall(struct gendisk *disk)
+{
+	struct blk_error_inject *inj;
+
+	mutex_lock(&disk->error_injection_lock);
+	clear_bit(GD_ERROR_INJECT, &disk->state);
+	while ((inj = list_first_entry_or_null(&disk->error_injection_list,
+			struct blk_error_inject, entry))) {
+		list_del_rcu(&inj->entry);
+		mutex_unlock(&disk->error_injection_lock);
+
+		kfree_rcu_mightsleep(inj);
+
+		mutex_lock(&disk->error_injection_lock);
+	}
+	mutex_unlock(&disk->error_injection_lock);
+}
+
+enum options {
+	Opt_add			= (1u << 0),
+	Opt_removeall		= (1u << 1),
+
+	Opt_op			= (1u << 16),
+	Opt_start		= (1u << 17),
+	Opt_nr_sectors		= (1u << 18),
+	Opt_status		= (1u << 19),
+	Opt_chance		= (1u << 20),
+
+	Opt_invalid,
+};
+
+static const match_table_t opt_tokens = {
+	{ Opt_add,			"add",			},
+	{ Opt_removeall,		"removeall",		},
+	{ Opt_op,			"op=%s",		},
+	{ Opt_start,			"start=%u"		},
+	{ Opt_nr_sectors,		"nr_sectors=%u"		},
+	{ Opt_status,			"status=%s"		},
+	{ Opt_chance,			"chance=%u"		},
+	{ Opt_invalid,			NULL,			},
+};
+
+static int match_op(substring_t *args, enum req_op *op)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*op = str_to_blk_op(tag);
+	if (*op == REQ_OP_LAST)
+		pr_warn("invalid op '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static int match_status(substring_t *args, blk_status_t *status)
+{
+	const char *tag;
+
+	tag = match_strdup(args);
+	if (!tag)
+		return -ENOMEM;
+	*status = tag_to_blk_status(tag);
+	if (!*status)
+		pr_warn("invalid status '%s'\n", tag);
+	kfree(tag);
+	return 0;
+}
+
+static ssize_t blk_error_injection_parse_options(struct gendisk *disk,
+		char *options)
+{
+	enum { Unset, Add, Removeall } action = Unset;
+	unsigned int option_mask = 0, chance = 1;
+	enum req_op op = REQ_OP_LAST;
+	u64 start = 0, nr_sectors = 0;
+	blk_status_t status = BLK_STS_OK;
+	substring_t args[MAX_OPT_ARGS];
+	char *p;
+
+	while ((p = strsep(&options, ",\n")) != NULL) {
+		int error = 0;
+		ssize_t token;
+
+		if (!*p)
+			continue;
+		token = match_token(p, opt_tokens, args);
+		option_mask |= token;
+		switch (token) {
+		case Opt_add:
+			if (action != Unset)
+				return -EINVAL;
+			action = Add;
+			break;
+		case Opt_removeall:
+			if (action != Unset)
+				return -EINVAL;
+			action = Removeall;
+			break;
+		case Opt_op:
+			error = match_op(args, &op);
+			break;
+		case Opt_start:
+			error = match_u64(args, &start);
+			break;
+		case Opt_nr_sectors:
+			error = match_u64(args, &nr_sectors);
+			break;
+		case Opt_status:
+			error = match_status(args, &status);
+			break;
+		case Opt_chance:
+			error = match_uint(args, &chance);
+			if (!error && chance == 0)
+				error = -EINVAL;
+			break;
+		default:
+			pr_warn("unknown parameter or missing value '%s'\n", p);
+			error = -EINVAL;
+		}
+		if (error)
+			return error;
+	}
+
+	switch (action) {
+	case Add:
+		return error_inject_add(disk, op, start, nr_sectors, status,
+				chance);
+	case Removeall:
+		if (option_mask & ~Opt_removeall)
+			return -EINVAL;
+		error_inject_removall(disk);
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
+static ssize_t blk_error_injection_write(struct file *file,
+		const char __user *ubuf, size_t count, loff_t *pos)
+{
+	struct gendisk *disk = file_inode(file)->i_private;
+	char *options;
+	int error;
+
+	options = memdup_user_nul(ubuf, count);
+	if (IS_ERR(options))
+		return PTR_ERR(options);
+	error = blk_error_injection_parse_options(disk, options);
+	kfree(options);
+
+	if (error)
+		return error;
+	return count;
+}
+
+static int blk_error_injection_show(struct seq_file *s, void *private)
+{
+	struct gendisk *disk = s->private;
+	struct blk_error_inject *inj;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+		seq_printf(s, "%llu:%llu status=%s,chance=%u",
+			inj->start, inj->end,
+			blk_status_to_tag(inj->status), inj->chance);
+		seq_putc(s, '\n');
+	}
+	rcu_read_unlock();
+	return 0;
+}
+
+static int blk_error_injection_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, blk_error_injection_show, inode->i_private);
+}
+
+static int blk_error_injection_release(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+static const struct file_operations blk_error_injection_fops = {
+	.owner		= THIS_MODULE,
+	.write		= blk_error_injection_write,
+	.read		= seq_read,
+	.open		= blk_error_injection_open,
+	.release	= blk_error_injection_release,
+};
+
+void blk_error_injection_init(struct gendisk *disk)
+{
+	debugfs_create_file("error_injection", 0600, disk->queue->debugfs_dir,
+			disk, &blk_error_injection_fops);
+}
+
+void blk_error_injection_exit(struct gendisk *disk)
+{
+	error_inject_removall(disk);
+}
diff --git a/block/genhd.c b/block/genhd.c
index 7d6854fd28e9..f84b6a355b57 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1485,6 +1485,10 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
 	lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
 #ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
 	INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
+#ifdef CONFIG_BLK_ERROR_INJECTION
+	mutex_init(&disk->error_injection_lock);
+	INIT_LIST_HEAD(&disk->error_injection_list);
 #endif
 	mutex_init(&disk->rqos_state_mutex);
 	kobject_init(&disk->queue_kobj, &blk_queue_ktype);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 17270a28c66d..d2adf2775920 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -176,6 +176,7 @@ struct gendisk {
 #define GD_SUPPRESS_PART_SCAN		5
 #define GD_OWNS_QUEUE			6
 #define GD_ZONE_APPEND_USED		7
+#define GD_ERROR_INJECT			8
 
 	struct mutex open_mutex;	/* open/close mutex */
 	unsigned open_partitions;	/* number of open partitions */
@@ -227,6 +228,11 @@ struct gendisk {
 	 */
 	struct blk_independent_access_ranges *ia_ranges;
 
+#ifdef CONFIG_BLK_ERROR_INJECTION
+	struct mutex		error_injection_lock;
+	struct list_head	error_injection_list;
+#endif
+
 	struct mutex rqos_state_mutex;	/* rqos state change mutex */
 };
 
-- 
2.53.0


^ permalink raw reply related

* [PATCH 3/4] block: add a str_to_blk_op helper
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-1-hch@lst.de>

Add a helper to find the REQ_OP_XYZ constant from the "XYZ" string.
This will be used for the error injection debugfs interface.
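
A userspace sketch of the lookup, under assumptions: the small op_name
table and the OP_* enum are stand-ins for the kernel's blk_op_name[] and
REQ_OP_* values, with OP_LAST playing the role of REQ_OP_LAST as the
not-found sentinel, and str_to_op() is a hypothetical name. The table is
sparse, so NULL slots have to be skipped before calling strcmp().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in opcodes; the gap at 2 mimics an unnamed slot in the table. */
enum op { OP_READ = 0, OP_WRITE = 1, OP_DISCARD = 3, OP_LAST = 4 };

static const char *const op_name[OP_LAST] = {
	[OP_READ]	= "READ",
	[OP_WRITE]	= "WRITE",
	[OP_DISCARD]	= "DISCARD",
};

/* Map a name such as "WRITE" to its opcode, or OP_LAST if it is unknown. */
static enum op str_to_op(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(op_name) / sizeof(op_name[0]); i++)
		if (op_name[i] && !strcmp(op_name[i], name))
			return (enum op)i;
	return OP_LAST;
}
```

The comparison is case-sensitive, so "read" or an empty string yields the
sentinel, and the caller is expected to check for it before using the
result.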

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-core.c | 10 ++++++++++
 block/blk.h      |  1 +
 2 files changed, 11 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7aa9cd110bdd..aa90aad6da13 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,6 +132,16 @@ inline const char *blk_op_str(enum req_op op)
 }
 EXPORT_SYMBOL_GPL(blk_op_str);
 
+enum req_op str_to_blk_op(const char *op)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
+		if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
+			return (enum req_op)i;
+	return REQ_OP_LAST;
+}
+
 #define ENT(_tag, _errno, _desc)	\
 [BLK_STS_##_tag] = {				\
 	.errno		= _errno,		\
diff --git a/block/blk.h b/block/blk.h
index 10426d23f662..93b30d1d0ec6 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -52,6 +52,7 @@ void blk_free_flush_queue(struct blk_flush_queue *q);
 const char *blk_status_to_str(blk_status_t status);
 const char *blk_status_to_tag(blk_status_t status);
 blk_status_t tag_to_blk_status(const char *tag);
+enum req_op str_to_blk_op(const char *op);
 
 bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 bool blk_queue_start_drain(struct request_queue *q);
-- 
2.53.0


^ permalink raw reply related

* [PATCH 2/4] block: add a "tag" for block status codes
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-1-hch@lst.de>

The full name of the status codes is not good for user interfaces as it
can contain white spaces.  Add the name of the status code without the
BLK_STS_ prefix as a tag so that it can be used for user interfaces.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-core.c | 28 ++++++++++++++++++++++++++++
 block/blk.h      |  2 ++
 2 files changed, 30 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1614323282f1..7aa9cd110bdd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -135,10 +135,12 @@ EXPORT_SYMBOL_GPL(blk_op_str);
 #define ENT(_tag, _errno, _desc)	\
 [BLK_STS_##_tag] = {				\
 	.errno		= _errno,		\
+	.tag		= __stringify(_tag),	\
 	.name		= _desc,		\
 }
 static const struct {
 	int		errno;
+	const char	*tag;
 	const char	*name;
 } blk_errors[] = {
 	ENT(OK,			0,		""),
@@ -203,6 +205,32 @@ const char *blk_status_to_str(blk_status_t status)
 	return blk_errors[idx].name;
 }
 
+const char *blk_status_to_tag(blk_status_t status)
+{
+	int idx = (__force int)status;
+
+	if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors)))
+		return "<null>";
+	return blk_errors[idx].tag;
+}
+
+blk_status_t tag_to_blk_status(const char *tag)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(blk_errors); i++) {
+		if (blk_errors[i].tag &&
+		    !strcmp(blk_errors[i].tag, tag))
+			return (__force blk_status_t)i;
+	}
+
+	/*
+	 * Return BLK_STS_OK for mismatches as this function is intended to
+	 * parse error status values.
+	 */
+	return BLK_STS_OK;
+}
+
 /**
  * blk_sync_queue - cancel any pending callbacks on a queue
  * @q: the queue
diff --git a/block/blk.h b/block/blk.h
index bf1a80493ff1..10426d23f662 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -50,6 +50,8 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
 void blk_free_flush_queue(struct blk_flush_queue *q);
 
 const char *blk_status_to_str(blk_status_t status);
+const char *blk_status_to_tag(blk_status_t status);
+blk_status_t tag_to_blk_status(const char *tag);
 
 bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
 bool blk_queue_start_drain(struct request_queue *q);
-- 
2.53.0


^ permalink raw reply related

* [PATCH 1/4] block: add a macro to initialize the status table
From: Christoph Hellwig @ 2026-06-05 18:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jonathan Corbet, linux-block, linux-doc, Keith Busch
In-Reply-To: <20260605184441.590927-1-hch@lst.de>

Prepare for adding a new value to the error table by adding a macro
to fill it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
---
 block/blk-core.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index b0f0a304ea0b..1614323282f1 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,39 +132,44 @@ inline const char *blk_op_str(enum req_op op)
 }
 EXPORT_SYMBOL_GPL(blk_op_str);
 
+#define ENT(_tag, _errno, _desc)	\
+[BLK_STS_##_tag] = {				\
+	.errno		= _errno,		\
+	.name		= _desc,		\
+}
 static const struct {
 	int		errno;
 	const char	*name;
 } blk_errors[] = {
-	[BLK_STS_OK]		= { 0,		"" },
-	[BLK_STS_NOTSUPP]	= { -EOPNOTSUPP, "operation not supported" },
-	[BLK_STS_TIMEOUT]	= { -ETIMEDOUT,	"timeout" },
-	[BLK_STS_NOSPC]		= { -ENOSPC,	"critical space allocation" },
-	[BLK_STS_TRANSPORT]	= { -ENOLINK,	"recoverable transport" },
-	[BLK_STS_TARGET]	= { -EREMOTEIO,	"critical target" },
-	[BLK_STS_RESV_CONFLICT]	= { -EBADE,	"reservation conflict" },
-	[BLK_STS_MEDIUM]	= { -ENODATA,	"critical medium" },
-	[BLK_STS_PROTECTION]	= { -EILSEQ,	"protection" },
-	[BLK_STS_RESOURCE]	= { -ENOMEM,	"kernel resource" },
-	[BLK_STS_DEV_RESOURCE]	= { -EBUSY,	"device resource" },
-	[BLK_STS_AGAIN]		= { -EAGAIN,	"nonblocking retry" },
-	[BLK_STS_OFFLINE]	= { -ENODEV,	"device offline" },
+	ENT(OK,			0,		""),
+	ENT(NOTSUPP,		-EOPNOTSUPP,	"operation not supported"),
+	ENT(TIMEOUT,		-ETIMEDOUT,	"timeout"),
+	ENT(NOSPC,		-ENOSPC,	"critical space allocation"),
+	ENT(TRANSPORT,		-ENOLINK,	"recoverable transport"),
+	ENT(TARGET,		-EREMOTEIO,	"critical target"),
+	ENT(RESV_CONFLICT,	-EBADE,		"reservation conflict"),
+	ENT(MEDIUM,		-ENODATA,	"critical medium"),
+	ENT(PROTECTION,		-EILSEQ,	"protection"),
+	ENT(RESOURCE,		-ENOMEM,	"kernel resource"),
+	ENT(DEV_RESOURCE,	-EBUSY,		"device resource"),
+	ENT(AGAIN,		-EAGAIN,	"nonblocking retry"),
+	ENT(OFFLINE,		-ENODEV,	"device offline"),
 
 	/* device mapper special case, should not leak out: */
-	[BLK_STS_DM_REQUEUE]	= { -EREMCHG, "dm internal retry" },
+	ENT(DM_REQUEUE,		-EREMCHG,	"dm internal retry"),
 
 	/* zone device specific errors */
-	[BLK_STS_ZONE_OPEN_RESOURCE]	= { -ETOOMANYREFS, "open zones exceeded" },
-	[BLK_STS_ZONE_ACTIVE_RESOURCE]	= { -EOVERFLOW, "active zones exceeded" },
+	ENT(ZONE_OPEN_RESOURCE, -ETOOMANYREFS,	"open zones exceeded"),
+	ENT(ZONE_ACTIVE_RESOURCE, -EOVERFLOW,	"active zones exceeded"),
 
 	/* Command duration limit device-side timeout */
-	[BLK_STS_DURATION_LIMIT]	= { -ETIME, "duration limit exceeded" },
-
-	[BLK_STS_INVAL]		= { -EINVAL,	"invalid" },
+	ENT(DURATION_LIMIT,	-ETIME,		"duration limit exceeded"),
+	ENT(INVAL,		-EINVAL,	"invalid"),
 
 	/* everything else not covered above: */
-	[BLK_STS_IOERR]		= { -EIO,	"I/O" },
+	ENT(IOERR,		-EIO,		"I/O"),
 };
+#undef ENT
 
 blk_status_t errno_to_blk_status(int errno)
 {
-- 
2.53.0


^ permalink raw reply related

