* Re: [PATCH v3 2/4] scsi: host: allocate struct Scsi_Host on the NUMA node of the host adapter
From: Hannes Reinecke @ 2026-06-10 5:59 UTC
To: Sumit Saxena, Martin K . Petersen, Jens Axboe
Cc: James E . J . Bottomley, linux-scsi, linux-block, Adam Radford,
Khalid Aziz, Adaptec OEM Raid Solutions, Matthew Wilcox,
Juergen E . Fischer, Russell King, linux-arm-kernel, Finn Thain,
Michael Schmitz, Anil Gurumurthy, Sudarsana Kalluru,
Oliver Neukum, Ali Akcaagac, Jamie Lenehan, Ram Vegesna,
target-devel, Bradley Grove, Satish Kharat, Sesidhar Baddela,
Karan Tilak Kumar, Yihang Li, Don Brace, storagedev,
HighPoint Linux Team, Tyrel Datwyler, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Christophe Leroy, linuxppc-dev,
Brian King, Lee Duncan, Chris Leech, Mike Christie, open-iscsi,
Justin Tee, Paul Ely, Kashyap Desai, Shivasharan S,
Chandrakanth Patil, megaraidlinux.pdl, Sathya Prakash Veerichetty,
Sreekanth Reddy, mpi3mr-linuxdrv.pdl, Suganath Prabu Subramani,
Ranjan Kumar, MPT-FusionLinux.pdl, Daniel Palmer, GOTO Masanori,
YOKOTA Hiroshi, Jack Wang, Geoff Levand, Michael Reed,
Nilesh Javali, GR-QLogic-Storage-Upstream, Narsimhulu Musini,
K . Y . Srinivasan, Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
linux-hyperv, Michael S . Tsirkin, Jason Wang, Paolo Bonzini,
Stefan Hajnoczi, Eugenio Perez, virtualization, Vishal Bhakta,
bcm-kernel-feedback-list, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, xen-devel, John Garry
In-Reply-To: <20260609121806.2121755-3-sumit.saxena@broadcom.com>
On 6/9/26 14:18, Sumit Saxena wrote:
> scsi_host_alloc() used kzalloc(), which always picks an arbitrary node.
> Extend the function to accept a 'struct device *dev' parameter and use
> kzalloc_node() with dev_to_node(dev) so the Scsi_Host struct lands on
> the same NUMA node as the HBA, mirroring the treatment already applied
> to struct scsi_device, struct scsi_target, and shost_data.
>
> When dev is NULL (legacy ISA/platform drivers without a dma_dev) the
> allocation falls back to NUMA_NO_NODE, preserving existing behaviour.
>
> Update all in-tree callers:
> - PCI-based HBA drivers pass &pdev->dev (or the equivalent struct
> member such as &phba->pcidev->dev, &h->pdev->dev, &ha->pdev->dev)
> so their host struct is placed on the adapter's node.
> - Non-PCI drivers (ISA, Amiga, ARM PCMCIA, virtio, Hyper-V, PS3, …)
> pass NULL.
> - libfc's libfc_host_alloc() inline helper passes NULL; FC drivers
> that want NUMA awareness can open-code the call with their pdev.
>
> Suggested-by: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
> ---
> drivers/scsi/3w-9xxx.c | 2 +-
> drivers/scsi/3w-sas.c | 2 +-
> drivers/scsi/3w-xxxx.c | 2 +-
> drivers/scsi/53c700.c | 2 +-
> drivers/scsi/BusLogic.c | 2 +-
> drivers/scsi/a100u2w.c | 2 +-
> drivers/scsi/a2091.c | 2 +-
> drivers/scsi/a3000.c | 2 +-
> drivers/scsi/aacraid/linit.c | 2 +-
> drivers/scsi/advansys.c | 6 +++---
> drivers/scsi/aha152x.c | 2 +-
> drivers/scsi/aha1542.c | 2 +-
> drivers/scsi/aha1740.c | 2 +-
> drivers/scsi/aic7xxx/aic79xx_osm.c | 2 +-
> drivers/scsi/aic7xxx/aic7xxx_osm.c | 2 +-
> drivers/scsi/aic94xx/aic94xx_init.c | 2 +-
> drivers/scsi/am53c974.c | 2 +-
> drivers/scsi/arcmsr/arcmsr_hba.c | 3 ++-
> drivers/scsi/arm/acornscsi.c | 2 +-
> drivers/scsi/arm/arxescsi.c | 2 +-
> drivers/scsi/arm/cumana_1.c | 2 +-
> drivers/scsi/arm/cumana_2.c | 2 +-
> drivers/scsi/arm/eesox.c | 2 +-
> drivers/scsi/arm/oak.c | 2 +-
> drivers/scsi/arm/powertec.c | 2 +-
> drivers/scsi/atari_scsi.c | 2 +-
> drivers/scsi/atp870u.c | 2 +-
> drivers/scsi/bfa/bfad_im.c | 2 +-
> drivers/scsi/csiostor/csio_init.c | 4 ++--
> drivers/scsi/dc395x.c | 2 +-
> drivers/scsi/dmx3191d.c | 2 +-
> drivers/scsi/elx/efct/efct_xport.c | 4 ++--
> drivers/scsi/esas2r/esas2r_main.c | 2 +-
> drivers/scsi/fdomain.c | 2 +-
> drivers/scsi/fnic/fnic_main.c | 2 +-
> drivers/scsi/g_NCR5380.c | 2 +-
> drivers/scsi/gvp11.c | 2 +-
> drivers/scsi/hisi_sas/hisi_sas_main.c | 2 +-
> drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 2 +-
> drivers/scsi/hosts.c | 6 ++++--
> drivers/scsi/hpsa.c | 2 +-
> drivers/scsi/hptiop.c | 2 +-
> drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
> drivers/scsi/ibmvscsi/ibmvscsi.c | 2 +-
> drivers/scsi/imm.c | 2 +-
> drivers/scsi/initio.c | 2 +-
> drivers/scsi/ipr.c | 2 +-
> drivers/scsi/ips.c | 2 +-
> drivers/scsi/isci/init.c | 2 +-
> drivers/scsi/jazz_esp.c | 2 +-
> drivers/scsi/libiscsi.c | 2 +-
> drivers/scsi/lpfc/lpfc_init.c | 2 +-
> drivers/scsi/mac53c94.c | 2 +-
> drivers/scsi/mac_esp.c | 2 +-
> drivers/scsi/mac_scsi.c | 2 +-
> drivers/scsi/megaraid.c | 2 +-
> drivers/scsi/megaraid/megaraid_mbox.c | 2 +-
> drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
> drivers/scsi/mesh.c | 2 +-
> drivers/scsi/mpi3mr/mpi3mr_os.c | 2 +-
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 4 ++--
> drivers/scsi/mvme147.c | 2 +-
> drivers/scsi/mvsas/mv_init.c | 2 +-
> drivers/scsi/mvumi.c | 2 +-
> drivers/scsi/myrb.c | 2 +-
> drivers/scsi/myrs.c | 2 +-
> drivers/scsi/ncr53c8xx.c | 2 +-
> drivers/scsi/nsp32.c | 2 +-
> drivers/scsi/pcmcia/nsp_cs.c | 2 +-
> drivers/scsi/pcmcia/qlogic_stub.c | 2 +-
> drivers/scsi/pcmcia/sym53c500_cs.c | 2 +-
> drivers/scsi/pm8001/pm8001_init.c | 2 +-
> drivers/scsi/pmcraid.c | 2 +-
> drivers/scsi/ppa.c | 2 +-
> drivers/scsi/ps3rom.c | 2 +-
> drivers/scsi/qla1280.c | 2 +-
> drivers/scsi/qla2xxx/qla_mid.c | 2 +-
> drivers/scsi/qla2xxx/qla_os.c | 2 +-
> drivers/scsi/qlogicfas.c | 2 +-
> drivers/scsi/qlogicpti.c | 2 +-
> drivers/scsi/scsi_debug.c | 2 +-
> drivers/scsi/sgiwd93.c | 2 +-
> drivers/scsi/smartpqi/smartpqi_init.c | 2 +-
> drivers/scsi/snic/snic_main.c | 2 +-
> drivers/scsi/stex.c | 2 +-
> drivers/scsi/storvsc_drv.c | 2 +-
> drivers/scsi/sun3_scsi.c | 2 +-
> drivers/scsi/sun3x_esp.c | 2 +-
> drivers/scsi/sun_esp.c | 2 +-
> drivers/scsi/sym53c8xx_2/sym_glue.c | 2 +-
> drivers/scsi/virtio_scsi.c | 2 +-
> drivers/scsi/vmw_pvscsi.c | 2 +-
> drivers/scsi/wd719x.c | 2 +-
> drivers/scsi/xen-scsifront.c | 2 +-
> drivers/scsi/zorro_esp.c | 2 +-
> include/scsi/libfc.h | 2 +-
> include/scsi/scsi_host.h | 3 ++-
> 97 files changed, 107 insertions(+), 103 deletions(-)
>
Quite a lot of churn for such a (relatively) simple change.
I think it might be better to introduce a new function
(scsi_host_alloc_node() ?) with the additional parameter,
and make scsi_host_alloc() a wrapper around that.
That will reduce the size of this patch immensely.
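The wrapper idea can be sketched in userspace C as follows. This is only a rough model, not the SCSI midlayer code: struct device and struct host are hypothetical stand-ins reduced to the fields needed here, model_dev_to_node() imitates dev_to_node(), and calloc() takes the place of kzalloc_node(). The name host_alloc_node() follows the suggested scsi_host_alloc_node(), which does not exist yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct device {
	int numa_node;
};

struct host {
	int node;		/* node the allocation was requested on */
	size_t privsize;	/* size of the driver-private area */
};

/* Userspace model of dev_to_node(): a NULL device means "no preference". */
static int model_dev_to_node(const struct device *dev)
{
	return dev ? dev->numa_node : NUMA_NO_NODE;
}

/* New entry point that carries the additional parameter. */
static struct host *host_alloc_node(size_t privsize, const struct device *dev)
{
	struct host *h = calloc(1, sizeof(*h) + privsize);

	if (!h)
		return NULL;
	/* The kernel version would pass this node to kzalloc_node(). */
	h->node = model_dev_to_node(dev);
	h->privsize = privsize;
	return h;
}

/* Existing entry point keeps its signature, so no caller has to change. */
static struct host *host_alloc(size_t privsize)
{
	return host_alloc_node(privsize, NULL);
}
```

With this shape only the drivers that want NUMA placement switch to the new function; everything else keeps calling the old one and gets NUMA_NO_NODE as before.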
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.com +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
* Re: [PATCH 00/27] Enable lock context analysis in drivers/block/
From: Christoph Hellwig @ 2026-06-10 5:34 UTC
To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Marco Elver
In-Reply-To: <cover.1781042470.git.bvanassche@acm.org>
Can you please split your series up in a saner way? A bunch of the
patches for drivers with their own subdirectories could be merged ASAP,
but you bundle them with lots of unrelated bits. Similarly having
a standalone series for drbd probably makes life easier as you don't
have to CC random other driver authors on it.
Marco: is there a way to enable the context analysis for parts of
a directory? I'm not sure that always converting a whole directory
is going to be feasible in the long run.
* Re: [PATCH 22/27] rbd: Enable lock context analysis
From: Christoph Hellwig @ 2026-06-10 5:32 UTC
To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Marco Elver, Ilya Dryomov
In-Reply-To: <e239f06e82bb4e6f527a9be87d87dd58ffa265d1.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:05:09PM -0700, Bart Van Assche wrote:
> Add lock context annotations and enable lock context analysis.
It does part one, but it does not do part two described here.
* Re: [PATCH 21/27] null_blk: Enable lock context analysis
From: Christoph Hellwig @ 2026-06-10 5:31 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Marco Elver,
Keith Busch, Damien Le Moal, Chaitanya Kulkarni,
Johannes Thumshirn, Nilay Shroff, Genjian Zhang, Kees Cook
In-Reply-To: <b70422c6f243b9275e336430071b52eb92dd40c4.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:05:08PM -0700, Bart Van Assche wrote:
> Add __must_hold() annotations where these are missing. Annotate two
> functions that use conditional locking with __context_unsafe().
Please explain why that is needed and why there is no better way to have
proper annotations. Both here and in a comment in the code.
* Re: [PATCH 15/27] loop: Split loop_change_fd()
From: Christoph Hellwig @ 2026-06-10 5:30 UTC
To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Christoph Hellwig, Marco Elver
In-Reply-To: <f835863049bc0364d80c69a7eeb9514e2a698ae2.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:05:02PM -0700, Bart Van Assche wrote:
> Prepare for adding a second call of __loop_change_fd().
The commit message is a bit confusing. Yes, you later add another call,
but it's from the same caller to sort out the locking mess, and that is
directly relevant to the scope of the factored out helper. So please state
that. Same for the next patch. All in all, merging the first three loop
patches would probably be easier to follow.
* Re: [PATCH 13/27] drbd: Annotate drbd_bm_{lock,unlock}()
From: Christoph Hellwig @ 2026-06-10 5:26 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <b5346ca2d22de25c6a03a7b6477d4110c263e5c6.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 12/27] drbd: Convert drbd_req_state() to unconditional locking
From: Christoph Hellwig @ 2026-06-10 5:25 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <f3f53e2c6d01b92091bfce447c9010b03a7debef.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:04:59PM -0700, Bart Van Assche wrote:
> Prepare for enabling lock context analysis.
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/block/drbd/drbd_state.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
> index 5aa4b1889144..171c6488283d 100644
> --- a/drivers/block/drbd/drbd_state.c
> +++ b/drivers/block/drbd/drbd_state.c
> @@ -639,11 +639,12 @@ static enum drbd_state_rv drbd_req_state(struct drbd_device *device,
> {
> enum drbd_state_rv rv;
>
> - if (f & CS_SERIALIZE)
> - mutex_lock(device->state_mutex);
> + if (!(f & CS_SERIALIZE))
> + return __drbd_req_state(device, mask, val, f);
> +
> + mutex_lock(device->state_mutex);
> rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
> - if (f & CS_SERIALIZE)
> - mutex_unlock(device->state_mutex);
> + mutex_unlock(device->state_mutex);
It's not really unconditional, just different conditions. And it really
belongs in the previous patch.
* Re: [PATCH 11/27] drbd: Split drbd_req_state()
From: Christoph Hellwig @ 2026-06-10 5:24 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Marco Elver,
Philipp Reisner, Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <83d144f376580f491b42d7533e004aced7d26d7a.1781042470.git.bvanassche@acm.org>
> +{
> + enum drbd_state_rv rv;
> +
> + if (f & CS_SERIALIZE)
> + mutex_lock(device->state_mutex);
> + rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
> if (f & CS_SERIALIZE)
> mutex_unlock(device->state_mutex);
Wouldn't something like:
	if (f & CS_SERIALIZE) {
		mutex_lock(device->state_mutex);
		rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
		mutex_unlock(device->state_mutex);
	} else {
		rv = __drbd_req_state(device, mask, val, f & ~CS_SERIALIZE);
	}
be easier to follow? Is there much of a point in the CS_SERIALIZE
clearing here?
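To make the question about the clearing concrete, here is a small userspace sketch. The flag values are hypothetical placeholders (the real ones are defined in the drbd headers), and inner_flags() is an illustrative helper, not a drbd function. It only shows which flags the inner helper would receive in each branch: in the branch where CS_SERIALIZE is not set, masking it out changes nothing.

```c
#include <assert.h>

/* Hypothetical flag values, chosen only for illustration. */
enum chg_state_flags {
	CS_VERBOSE   = 1 << 1,
	CS_SERIALIZE = 1 << 3,
};

/* Returns the flags the inner helper would be called with. */
static unsigned int inner_flags(unsigned int f)
{
	if (f & CS_SERIALIZE)
		return f & ~CS_SERIALIZE;	/* bit is set, masking removes it */
	return f;				/* bit already clear, masking is a no-op */
}
```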
* Re: [PATCH 10/27] drbd: Make a mutex_unlock() call unconditional
From: Christoph Hellwig @ 2026-06-10 5:22 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <803dcc7259a57e513e7ec06a82307e75acd187a6.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 09/27] drbd: Split drbd_nl_get_connections_dumpit()
From: Christoph Hellwig @ 2026-06-10 5:22 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Marco Elver,
Philipp Reisner, Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <167a1fe9f633fe1d1a0d89a50f10c3185d9ddddd.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:04:56PM -0700, Bart Van Assche wrote:
> Move the code between the 'put_result' and 'out' labels into a new
> function. Prepare for making the following code unconditional:
>
> if (resource)
> mutex_unlock(&resource->conf_update);
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/block/drbd/drbd_nl.c | 98 ++++++++++++++++++++----------------
> 1 file changed, 55 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
> index c44b92856da3..57ab1f1fb1bc 100644
> --- a/drivers/block/drbd/drbd_nl.c
> +++ b/drivers/block/drbd/drbd_nl.c
> @@ -3529,15 +3529,56 @@ int drbd_adm_dump_connections_done(struct netlink_callback *cb)
>
> enum { SINGLE_RESOURCE, ITERATE_RESOURCES };
>
> +static int drbd_nl_put_dump_connections_result(
> + struct sk_buff *skb, struct netlink_callback *cb,
> + struct drbd_resource *resource, struct drbd_connection *connection,
> + enum drbd_ret_code retcode)
Weird indentation here. The usual style is either lining up after the
opening parenthesis or two tabs.
* Re: [PATCH 08/27] drbd: Pass 'resource' directly to complete_conflicting_writes()
From: Christoph Hellwig @ 2026-06-10 5:21 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <77d0fd75811d7f604fa80b5c93172b5653b52880.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 07/27] drbd: Move two declarations
From: Christoph Hellwig @ 2026-06-10 5:20 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <7e40961a092b149aa2e3b6fc3c510da650ee0401.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 06/27] drbd: Simplify the bitmap locking functions.
From: Christoph Hellwig @ 2026-06-10 5:20 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <b5a5c65d4f10572a112071033c105d54c9ad0974.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:04:53PM -0700, Bart Van Assche wrote:
> Call mutex_lock() instead of mutex_trylock() followed by mutex_lock().
> Remove the code that depends on device->bitmap == NULL because this
> can't happen. All drbd_bm_{lock,unlock}() callers guarantee that the
> device->bitmap pointer is valid.
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 05/27] drbd: Remove the 'local' lock context
From: Christoph Hellwig @ 2026-06-10 5:19 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <0d220690717c802c4d2b684bf924d134dcd0a929.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:04:52PM -0700, Bart Van Assche wrote:
> The 'local' lock context represents code sections between get_ldev() and
> put_ldev() calls. Between these two calls the device->ldev pointer is
> valid. Since there are multiple functions with unpaired get_ldev() /
> put_ldev() calls, remove this lock context.
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 04/27] drbd: Remove the get_ldev_if_state() macro
From: Christoph Hellwig @ 2026-06-10 5:19 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <0f1141908524e2361a6c4921c668190c8f9a5831.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
> - ({ __acquire(x); true; }) : false)
> #define get_ldev(_device) get_ldev_if_state(_device, D_INCONSISTENT)
>
> static inline void put_ldev(struct drbd_device *device)
> @@ -2033,8 +2030,8 @@ static inline void put_ldev(struct drbd_device *device)
> }
> }
>
> -static inline int _get_ldev_if_state(struct drbd_device *device,
> - enum drbd_disk_state mins)
> +static inline int get_ldev_if_state(struct drbd_device *device,
> + enum drbd_disk_state mins)
> {
> int io_allowed;
>
---end quoted text---
* Re: [PATCH 03/27] drbd: Retain one _get_ldev_if_state() implementation
From: Christoph Hellwig @ 2026-06-10 5:18 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder, Nathan Chancellor
In-Reply-To: <3777c00c7a6e8bf807ac9133cc9b89540c19bac1.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH 02/27] drbd: Remove "extern" from function declarations
From: Christoph Hellwig @ 2026-06-10 5:17 UTC
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Marco Elver, Philipp Reisner,
Lars Ellenberg, Christoph Böhmwalder
In-Reply-To: <d9c2b6f320770bc0cb543d24f39fe5c9d28bcd99.1781042470.git.bvanassche@acm.org>
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* [PATCH 4/4] block: add configurable error injection
From: Christoph Hellwig @ 2026-06-10 5:08 UTC
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke
In-Reply-To: <20260610051015.1906799-1-hch@lst.de>
Add a new block error injection interface that allows injecting specific
status codes for specific sector ranges.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
Documentation/block/error-injection.rst | 59 +++++
Documentation/block/index.rst | 1 +
block/Kconfig | 8 +
block/Makefile | 1 +
block/blk-core.c | 4 +
block/blk-sysfs.c | 5 +
block/error-injection.c | 314 ++++++++++++++++++++++++
block/error-injection.h | 21 ++
block/genhd.c | 4 +
include/linux/blkdev.h | 6 +
10 files changed, 423 insertions(+)
create mode 100644 Documentation/block/error-injection.rst
create mode 100644 block/error-injection.c
create mode 100644 block/error-injection.h
diff --git a/Documentation/block/error-injection.rst b/Documentation/block/error-injection.rst
new file mode 100644
index 000000000000..81f31af82e65
--- /dev/null
+++ b/Documentation/block/error-injection.rst
@@ -0,0 +1,59 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+Configurable Error Injection
+============================
+
+Overview
+--------
+
+Configurable error injection allows injecting specific block layer status codes
+for sector ranges of a block device. Errors can be injected unconditionally, or
+with a given probability.
+
+To use configurable error injection, CONFIG_BLK_ERROR_INJECTION must be enabled.
+
+The only interface is the error_injection debugfs file, which is created for
+each registered gendisk. Writes to this file are used to create or delete rules
+and reads return a list of the current error injection sites.
+
+Options
+-------
+
+The following options specify the operations:
+
+=================== =======================================================
+add add a new rule
+removeall remove all existing rules
+=================== =======================================================
+
+The following options specify the details of the rule for the add operation:
+
+=================== =======================================================
+op=<string> block layer operation this rule applies to. This uses
+ the XYZ for each REQ_OP_XYZ operation, e.g. READ, WRITE
+ or DISCARD. Mandatory.
+status=<string> Status to return. This uses XYZ for each BLK_STS_XYZ
+ code, e.g. IOERR or MEDIUM. Mandatory.
+start=<number> First block layer sector the rule applies to.
+ Optional, defaults to 0.
+nr_sectors=<number> Number of sectors this rule applies to.
+ Optional, defaults to the remainder of the device.
+chance=<number> Only return a failure with a likelihood of 1/chance.
+ Optional, defaults to 1 (always).
+=================== =======================================================
+
+Example
+-------
+
+Return BLK_STS_IOERR for one in 10 reads of sector 0 of /dev/nvme0n1:
+
+ $ echo 'add,op=READ,start=0,nr_sectors=1,status=IOERR,chance=10' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Return BLK_STS_MEDIUM for every write to /dev/nvme0n1:
+
+ $ echo 'add,op=WRITE,start=0,status=MEDIUM' > /sys/kernel/debug/block/nvme0n1/error_injection
+
+Remove all rules for /dev/nvme0n1:
+
+ $ echo 'removeall' > /sys/kernel/debug/block/nvme0n1/error_injection
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index 9fea696f9daa..bfa1bbd31ddf 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -22,3 +22,4 @@ Block
switching-sched
writeback_cache_control
ublk
+ error-injection
diff --git a/block/Kconfig b/block/Kconfig
index 15027963472d..70e4a66d941f 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -221,6 +221,14 @@ config BLOCK_HOLDER_DEPRECATED
config BLK_MQ_STACKING
bool
+config BLK_ERROR_INJECTION
+ bool "Enable block layer error injection"
+ select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
+ help
+ Enable inserting arbitrary block errors through a debugfs interface.
+
+ See Documentation/block/error-injection.rst for details.
+
source "block/Kconfig.iosched"
endif # BLOCK
diff --git a/block/Makefile b/block/Makefile
index 54130faacc21..e7bd320e3d69 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -13,6 +13,7 @@ obj-y := bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \
genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \
disk-events.o blk-ia-ranges.o early-lookup.o
+obj-$(CONFIG_BLK_ERROR_INJECTION) += error-injection.o
obj-$(CONFIG_BLK_DEV_BSG_COMMON) += bsg.o
obj-$(CONFIG_BLK_DEV_BSGLIB) += bsg-lib.o
obj-$(CONFIG_BLK_CGROUP) += blk-cgroup.o
diff --git a/block/blk-core.c b/block/blk-core.c
index beaab7a71fba..73a41df98c9a 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -50,6 +50,7 @@
#include "blk-cgroup.h"
#include "blk-throttle.h"
#include "blk-ioprio.h"
+#include "error-injection.h"
struct dentry *blk_debugfs_root;
@@ -767,6 +768,9 @@ static void __submit_bio_noacct_mq(struct bio *bio)
void submit_bio_noacct_nocheck(struct bio *bio, bool split)
{
+ if (unlikely(blk_error_inject(bio)))
+ return;
+
blk_cgroup_bio_start(bio);
if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f22c1f253eb3..520972676ab4 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -19,6 +19,7 @@
#include "blk-wbt.h"
#include "blk-cgroup.h"
#include "blk-throttle.h"
+#include "error-injection.h"
struct queue_sysfs_entry {
struct attribute attr;
@@ -933,6 +934,8 @@ static void blk_debugfs_remove(struct gendisk *disk)
blk_debugfs_lock_nomemsave(q);
blk_trace_shutdown(q);
+ if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+ blk_error_injection_exit(disk);
debugfs_remove_recursive(q->debugfs_dir);
q->debugfs_dir = NULL;
q->sched_debugfs_dir = NULL;
@@ -963,6 +966,8 @@ int blk_register_queue(struct gendisk *disk)
memflags = blk_debugfs_lock(q);
q->debugfs_dir = debugfs_create_dir(disk->disk_name, blk_debugfs_root);
+ if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION))
+ blk_error_injection_init(disk);
if (queue_is_mq(q))
blk_mq_debugfs_register(q);
blk_debugfs_unlock(q, memflags);
diff --git a/block/error-injection.c b/block/error-injection.c
new file mode 100644
index 000000000000..7f7f0d3327bc
--- /dev/null
+++ b/block/error-injection.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2026 Christoph Hellwig.
+ */
+#include <linux/debugfs.h>
+#include <linux/blkdev.h>
+#include <linux/parser.h>
+#include <linux/seq_file.h>
+#include "blk.h"
+#include "error-injection.h"
+
+struct blk_error_inject {
+ struct list_head entry;
+ sector_t start;
+ sector_t end;
+ enum req_op op;
+ blk_status_t status;
+
+ /* only inject every 1 / chance times */
+ unsigned int chance;
+};
+
+DEFINE_STATIC_KEY_FALSE(blk_error_injection_enabled);
+
+bool __blk_error_inject(struct bio *bio)
+{
+ struct gendisk *disk = bio->bi_bdev->bd_disk;
+ struct blk_error_inject *inj;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+ if (bio->bi_iter.bi_sector <= inj->end &&
+ bio_end_sector(bio) > inj->start &&
+ bio_op(bio) == inj->op) {
+ blk_status_t status = inj->status;
+
+ if (inj->chance > 1 &&
+ (get_random_u32() % inj->chance) != 0)
+ continue;
+
+ pr_info_ratelimited("%pg: injecting %s error for %s at sector %llu:%u\n",
+ disk->part0,
+ blk_status_to_str(status),
+ blk_op_str(inj->op),
+ bio->bi_iter.bi_sector,
+ bio_sectors(bio));
+ rcu_read_unlock();
+ bio_endio_status(bio, status);
+ return true;
+ }
+ }
+ rcu_read_unlock();
+ return false;
+}
+
+static int error_inject_add(struct gendisk *disk, enum req_op op,
+ sector_t start, u64 nr_sectors, blk_status_t status,
+ unsigned int chance)
+{
+ struct blk_error_inject *inj;
+ int error = -EINVAL;
+
+ if (op == REQ_OP_LAST)
+ return -EINVAL;
+ if (status == BLK_STS_OK)
+ return -EINVAL;
+
+ inj = kzalloc_obj(*inj);
+ if (!inj)
+ return -ENOMEM;
+
+ if (nr_sectors) {
+ if (U64_MAX - nr_sectors < start)
+ goto out_free_inj;
+ inj->end = start + nr_sectors - 1;
+ } else {
+ inj->end = U64_MAX;
+ }
+
+ inj->op = op;
+ inj->start = start;
+ inj->status = status;
+ inj->chance = chance;
+
+ pr_debug_ratelimited("%pg: adding %s injection for %s at sector %llu:%llu\n",
+ disk->part0, blk_status_to_str(status),
+ blk_op_str(op),
+ start, nr_sectors);
+
+ /*
+ * Add to the front of the list so that newer entries can partially
+ * override other entries. This also intentionally allows duplicate
+ * entries as there is no real reason to reject them.
+ */
+ mutex_lock(&disk->error_injection_lock);
+ if (!disk_live(disk)) {
+ mutex_unlock(&disk->error_injection_lock);
+ error = -ENODEV;
+ goto out_free_inj;
+ }
+ if (list_empty(&disk->error_injection_list))
+ static_branch_inc(&blk_error_injection_enabled);
+ list_add_rcu(&inj->entry, &disk->error_injection_list);
+ set_bit(GD_ERROR_INJECT, &disk->state);
+ mutex_unlock(&disk->error_injection_lock);
+ return 0;
+
+out_free_inj:
+ kfree(inj);
+ return error;
+}
+
+static void error_inject_removeall(struct gendisk *disk)
+{
+ struct blk_error_inject *inj;
+
+ mutex_lock(&disk->error_injection_lock);
+ clear_bit(GD_ERROR_INJECT, &disk->state);
+ while ((inj = list_first_entry_or_null(&disk->error_injection_list,
+ struct blk_error_inject, entry))) {
+ list_del_rcu(&inj->entry);
+ mutex_unlock(&disk->error_injection_lock);
+
+ kfree_rcu_mightsleep(inj);
+
+ mutex_lock(&disk->error_injection_lock);
+ }
+ static_branch_dec(&blk_error_injection_enabled);
+ mutex_unlock(&disk->error_injection_lock);
+}
+
+enum options {
+ Opt_add = (1u << 0),
+ Opt_removeall = (1u << 1),
+
+ Opt_op = (1u << 16),
+ Opt_start = (1u << 17),
+ Opt_nr_sectors = (1u << 18),
+ Opt_status = (1u << 19),
+ Opt_chance = (1u << 20),
+
+ Opt_invalid,
+};
+
+static const match_table_t opt_tokens = {
+ { Opt_add, "add", },
+ { Opt_removeall, "removeall", },
+ { Opt_op, "op=%s", },
+ { Opt_start, "start=%u" },
+ { Opt_nr_sectors, "nr_sectors=%u" },
+ { Opt_status, "status=%s" },
+ { Opt_chance, "chance=%u" },
+ { Opt_invalid, NULL, },
+};
+
+static int match_op(substring_t *args, enum req_op *op)
+{
+ const char *tag;
+
+ tag = match_strdup(args);
+ if (!tag)
+ return -ENOMEM;
+ *op = str_to_blk_op(tag);
+ if (*op == REQ_OP_LAST)
+ pr_warn("invalid op '%s'\n", tag);
+ kfree(tag);
+ return 0;
+}
+
+static int match_status(substring_t *args, blk_status_t *status)
+{
+ const char *tag;
+
+ tag = match_strdup(args);
+ if (!tag)
+ return -ENOMEM;
+ *status = tag_to_blk_status(tag);
+ if (!*status)
+ pr_warn("invalid status '%s'\n", tag);
+ kfree(tag);
+ return 0;
+}
+
+static ssize_t blk_error_injection_parse_options(struct gendisk *disk,
+ char *options)
+{
+ enum { Unset, Add, Removeall } action = Unset;
+ unsigned int option_mask = 0, chance = 1;
+ enum req_op op = REQ_OP_LAST;
+ u64 start = 0, nr_sectors = 0;
+ blk_status_t status = BLK_STS_OK;
+ substring_t args[MAX_OPT_ARGS];
+ char *p;
+
+ while ((p = strsep(&options, ",\n")) != NULL) {
+ int error = 0;
+ ssize_t token;
+
+ if (!*p)
+ continue;
+ token = match_token(p, opt_tokens, args);
+ option_mask |= token;
+ switch (token) {
+ case Opt_add:
+ if (action != Unset)
+ return -EINVAL;
+ action = Add;
+ break;
+ case Opt_removeall:
+ if (action != Unset)
+ return -EINVAL;
+ action = Removeall;
+ break;
+ case Opt_op:
+ error = match_op(args, &op);
+ break;
+ case Opt_start:
+ error = match_u64(args, &start);
+ break;
+ case Opt_nr_sectors:
+ error = match_u64(args, &nr_sectors);
+ break;
+ case Opt_status:
+ error = match_status(args, &status);
+ break;
+ case Opt_chance:
+ error = match_uint(args, &chance);
+ if (!error && chance == 0)
+ error = -EINVAL;
+ break;
+ default:
+ pr_warn("unknown parameter or missing value '%s'\n", p);
+ error = -EINVAL;
+ }
+ if (error)
+ return error;
+ }
+
+ switch (action) {
+ case Add:
+ return error_inject_add(disk, op, start, nr_sectors, status,
+ chance);
+ case Removeall:
+ if (option_mask & ~Opt_removeall)
+ return -EINVAL;
+ error_inject_removeall(disk);
+ return 0;
+ default:
+ return -EINVAL;
+ }
+}
+
+static ssize_t blk_error_injection_write(struct file *file,
+ const char __user *ubuf, size_t count, loff_t *pos)
+{
+ struct gendisk *disk = file_inode(file)->i_private;
+ char *options;
+ int error;
+
+ options = memdup_user_nul(ubuf, count);
+ if (IS_ERR(options))
+ return PTR_ERR(options);
+ error = blk_error_injection_parse_options(disk, options);
+ kfree(options);
+
+ if (error)
+ return error;
+ return count;
+}
+
+static int blk_error_injection_show(struct seq_file *s, void *private)
+{
+ struct gendisk *disk = s->private;
+ struct blk_error_inject *inj;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(inj, &disk->error_injection_list, entry) {
+ seq_printf(s, "%llu:%llu status=%s,chance=%u",
+ inj->start, inj->end,
+ blk_status_to_tag(inj->status), inj->chance);
+ seq_putc(s, '\n');
+ }
+ rcu_read_unlock();
+ return 0;
+}
+
+static int blk_error_injection_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, blk_error_injection_show, inode->i_private);
+}
+
+static int blk_error_injection_release(struct inode *inode, struct file *file)
+{
+ return single_release(inode, file);
+}
+
+static const struct file_operations blk_error_injection_fops = {
+ .owner = THIS_MODULE,
+ .write = blk_error_injection_write,
+ .read = seq_read,
+ .open = blk_error_injection_open,
+ .release = blk_error_injection_release,
+};
+
+void blk_error_injection_init(struct gendisk *disk)
+{
+ debugfs_create_file("error_injection", 0600, disk->queue->debugfs_dir,
+ disk, &blk_error_injection_fops);
+}
+
+void blk_error_injection_exit(struct gendisk *disk)
+{
+ error_inject_removeall(disk);
+}
diff --git a/block/error-injection.h b/block/error-injection.h
new file mode 100644
index 000000000000..9821d773abab
--- /dev/null
+++ b/block/error-injection.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _BLK_ERROR_INJECTION_H
+#define _BLK_ERROR_INJECTION_H 1
+
+#include <linux/jump_label.h>
+
+DECLARE_STATIC_KEY_FALSE(blk_error_injection_enabled);
+
+void blk_error_injection_init(struct gendisk *disk);
+void blk_error_injection_exit(struct gendisk *disk);
+bool __blk_error_inject(struct bio *bio);
+static inline bool blk_error_inject(struct bio *bio)
+{
+ if (IS_ENABLED(CONFIG_BLK_ERROR_INJECTION) &&
+ static_branch_unlikely(&blk_error_injection_enabled) &&
+ test_bit(GD_ERROR_INJECT, &bio->bi_bdev->bd_disk->state))
+ return __blk_error_inject(bio);
+ return false;
+}
+
+#endif /* _BLK_ERROR_INJECTION_H */
diff --git a/block/genhd.c b/block/genhd.c
index 7d6854fd28e9..f84b6a355b57 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1485,6 +1485,10 @@ struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
+#ifdef CONFIG_BLK_ERROR_INJECTION
+ mutex_init(&disk->error_injection_lock);
+ INIT_LIST_HEAD(&disk->error_injection_list);
#endif
mutex_init(&disk->rqos_state_mutex);
kobject_init(&disk->queue_kobj, &blk_queue_ktype);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 57e84d59a642..5070851cf924 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -176,6 +176,7 @@ struct gendisk {
#define GD_SUPPRESS_PART_SCAN 5
#define GD_OWNS_QUEUE 6
#define GD_ZONE_APPEND_USED 7
+#define GD_ERROR_INJECT 8
struct mutex open_mutex; /* open/close mutex */
unsigned open_partitions; /* number of open partitions */
@@ -227,6 +228,11 @@ struct gendisk {
*/
struct blk_independent_access_ranges *ia_ranges;
+#ifdef CONFIG_BLK_ERROR_INJECTION
+ struct mutex error_injection_lock;
+ struct list_head error_injection_list;
+#endif
+
struct mutex rqos_state_mutex; /* rqos state change mutex */
};
--
2.53.0
^ permalink raw reply related
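For anyone scripting against the debugfs file, the line format emitted by blk_error_injection_show() above ("start:end status=TAG,chance=N") can be read back with a plain sscanf(). The following is only a userspace sketch: struct demo_rule and demo_parse_rule() are hypothetical names that are not part of the patch, and the code does not interpret what the range or the chance value mean.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mirror of one line of the "error_injection" file. */
struct demo_rule {
	unsigned long long start;
	unsigned long long end;
	char status[32];	/* status tag, e.g. "IOERR" */
	unsigned int chance;
};

/*
 * Parse one line in the "%llu:%llu status=%s,chance=%u" layout used by the
 * show callback.  Returns 0 on success and -1 if the line does not match.
 */
static int demo_parse_rule(const char *line, struct demo_rule *r)
{
	assert(line != NULL && r != NULL);

	/* %31[^,] reads the tag up to the comma and bounds it to the buffer. */
	if (sscanf(line, "%llu:%llu status=%31[^,],chance=%u",
		   &r->start, &r->end, r->status, &r->chance) != 4)
		return -1;
	return 0;
}
```

A test harness could call demo_parse_rule() on each line read from the file and compare the result with the rule it configured earlier.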
* [PATCH 3/4] block: add a str_to_blk_op helper
From: Christoph Hellwig @ 2026-06-10 5:08 UTC (permalink / raw)
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke
In-Reply-To: <20260610051015.1906799-1-hch@lst.de>
Add a helper to find the REQ_OP_XYZ constant from the "XYZ" string.
This will be used for the error injection debugfs interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
block/blk-core.c | 10 ++++++++++
block/blk.h | 1 +
2 files changed, 11 insertions(+)
diff --git a/block/blk-core.c b/block/blk-core.c
index 842b5c6f2fb4..beaab7a71fba 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,6 +132,16 @@ inline const char *blk_op_str(enum req_op op)
}
EXPORT_SYMBOL_GPL(blk_op_str);
+enum req_op str_to_blk_op(const char *op)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(blk_op_name); i++)
+ if (blk_op_name[i] && !strcmp(blk_op_name[i], op))
+ return (enum req_op)i;
+ return REQ_OP_LAST;
+}
+
#define ENT(_tag, _errno, _desc) \
[BLK_STS_##_tag] = { \
.errno = _errno, \
diff --git a/block/blk.h b/block/blk.h
index 3ab2cdd6ed12..507ab34a6e90 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -53,6 +53,7 @@ void blk_free_flush_queue(struct blk_flush_queue *q);
const char *blk_status_to_str(blk_status_t status);
const char *blk_status_to_tag(blk_status_t status);
blk_status_t tag_to_blk_status(const char *tag);
+enum req_op str_to_blk_op(const char *op);
bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
bool blk_queue_start_drain(struct request_queue *q);
--
2.53.0
^ permalink raw reply related
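The lookup added by this patch is a linear scan over a sparse name table that skips unset slots and returns a sentinel when nothing matches. A minimal userspace sketch of that pattern follows; the demo_op enum, its values and the demo_op_name table are hypothetical stand-ins, not the kernel's REQ_OP_* definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a few operation codes; not the kernel enum. */
enum demo_op {
	DEMO_OP_READ = 0,
	DEMO_OP_WRITE = 1,
	DEMO_OP_FLUSH = 2,
	DEMO_OP_DISCARD = 3,
	DEMO_OP_ZONE_APPEND = 7,
	DEMO_OP_LAST = 8,	/* sentinel meaning "no such operation" */
};

/* Sparse table: slots 4..6 stay NULL, so the lookup has to skip holes. */
static const char *const demo_op_name[DEMO_OP_LAST] = {
	[DEMO_OP_READ]		= "READ",
	[DEMO_OP_WRITE]		= "WRITE",
	[DEMO_OP_FLUSH]		= "FLUSH",
	[DEMO_OP_DISCARD]	= "DISCARD",
	[DEMO_OP_ZONE_APPEND]	= "ZONE_APPEND",
};

/* Exact, case-sensitive match; returns DEMO_OP_LAST when nothing matches. */
static enum demo_op demo_str_to_op(const char *op)
{
	size_t i;

	assert(op != NULL);
	for (i = 0; i < sizeof(demo_op_name) / sizeof(demo_op_name[0]); i++)
		if (demo_op_name[i] && !strcmp(demo_op_name[i], op))
			return (enum demo_op)i;
	return DEMO_OP_LAST;
}
```

Callers are expected to compare the result against the sentinel before using it, which is what makes a separate error return unnecessary.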
* [PATCH 2/4] block: add a "tag" for block status codes
From: Christoph Hellwig @ 2026-06-10 5:08 UTC (permalink / raw)
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Hannes Reinecke
In-Reply-To: <20260610051015.1906799-1-hch@lst.de>
The full name of the status codes is not good for user interfaces as it
can contain white spaces. Add the name of the status code without the
BLK_STS_ prefix as a tag so that it can be used for user interfaces.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
block/blk-core.c | 28 ++++++++++++++++++++++++++++
block/blk.h | 2 ++
2 files changed, 30 insertions(+)
diff --git a/block/blk-core.c b/block/blk-core.c
index 43121a9f99f0..842b5c6f2fb4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -135,10 +135,12 @@ EXPORT_SYMBOL_GPL(blk_op_str);
#define ENT(_tag, _errno, _desc) \
[BLK_STS_##_tag] = { \
.errno = _errno, \
+ .tag = __stringify(_tag), \
.name = _desc, \
}
static const struct {
int errno;
+ const char *tag;
const char *name;
} blk_errors[] = {
ENT(OK, 0, ""),
@@ -203,6 +205,32 @@ const char *blk_status_to_str(blk_status_t status)
return blk_errors[idx].name;
}
+const char *blk_status_to_tag(blk_status_t status)
+{
+ int idx = (__force int)status;
+
+ if (WARN_ON_ONCE(idx >= ARRAY_SIZE(blk_errors) || !blk_errors[idx].tag))
+ return "<null>";
+ return blk_errors[idx].tag;
+}
+
+blk_status_t tag_to_blk_status(const char *tag)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(blk_errors); i++) {
+ if (blk_errors[i].tag &&
+ !strcmp(blk_errors[i].tag, tag))
+ return (__force blk_status_t)i;
+ }
+
+ /*
+ * Return BLK_STS_OK for mismatches as this function is intended to
+ * parse error status values.
+ */
+ return BLK_STS_OK;
+}
+
/**
* blk_sync_queue - cancel any pending callbacks on a queue
* @q: the queue
diff --git a/block/blk.h b/block/blk.h
index 7fdfb9012ce1..3ab2cdd6ed12 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -51,6 +51,8 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
void blk_free_flush_queue(struct blk_flush_queue *q);
const char *blk_status_to_str(blk_status_t status);
+const char *blk_status_to_tag(blk_status_t status);
+blk_status_t tag_to_blk_status(const char *tag);
bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic);
bool blk_queue_start_drain(struct request_queue *q);
--
2.53.0
^ permalink raw reply related
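The tag scheme from this patch can be tried out in userspace: a designated-initializer macro stringifies the status name into a tag, and two helpers convert in both directions. This is only a sketch under assumptions. The demo_status enum and its values are hypothetical, the errno field is renamed to err because errno is a macro in userspace, and DEMO_STRINGIFY is a single-level stand-in for the kernel's __stringify().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Single-level stringification; stand-in for the kernel's __stringify(). */
#define DEMO_STRINGIFY(x) #x

/* Hypothetical status codes; the gap before IOERR leaves empty table slots. */
enum demo_status {
	DEMO_STS_OK = 0,
	DEMO_STS_NOTSUPP = 1,
	DEMO_STS_TIMEOUT = 2,
	DEMO_STS_IOERR = 10,
};

#define ENT(_tag, _errno, _desc)			\
	[DEMO_STS_##_tag] = {				\
		.err	= _errno,			\
		.tag	= DEMO_STRINGIFY(_tag),		\
		.name	= _desc,			\
	}

static const struct {
	int err;		/* called "errno" in the kernel table */
	const char *tag;
	const char *name;
} demo_errors[] = {
	ENT(OK,		0,		""),
	ENT(NOTSUPP,	-EOPNOTSUPP,	"operation not supported"),
	ENT(TIMEOUT,	-ETIMEDOUT,	"timeout"),
	ENT(IOERR,	-EIO,		"I/O"),
};
#undef ENT

#define DEMO_NR_ERRORS (sizeof(demo_errors) / sizeof(demo_errors[0]))

/* Returns "<null>" for out-of-range values and for unset table slots. */
static const char *demo_status_to_tag(enum demo_status status)
{
	size_t idx = (size_t)status;

	if (idx >= DEMO_NR_ERRORS || !demo_errors[idx].tag)
		return "<null>";
	return demo_errors[idx].tag;
}

/* Returns DEMO_STS_OK on a mismatch, as the patch does with BLK_STS_OK. */
static enum demo_status demo_tag_to_status(const char *tag)
{
	size_t i;

	assert(tag != NULL);
	for (i = 0; i < DEMO_NR_ERRORS; i++)
		if (demo_errors[i].tag && !strcmp(demo_errors[i].tag, tag))
			return (enum demo_status)i;
	return DEMO_STS_OK;
}
```

Returning the OK value for unknown tags works here because the parser only ever wants error statuses, so a caller can treat OK as "not found".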
* [PATCH 1/4] block: add a macro to initialize the status table
From: Christoph Hellwig @ 2026-06-10 5:08 UTC (permalink / raw)
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc, Bart Van Assche, Hannes Reinecke
In-Reply-To: <20260610051015.1906799-1-hch@lst.de>
Prepare for adding a new value to the error table by adding a macro
to fill it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
---
block/blk-core.c | 45 +++++++++++++++++++++++++--------------------
1 file changed, 25 insertions(+), 20 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 1c637db79e59..43121a9f99f0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -132,39 +132,44 @@ inline const char *blk_op_str(enum req_op op)
}
EXPORT_SYMBOL_GPL(blk_op_str);
+#define ENT(_tag, _errno, _desc) \
+[BLK_STS_##_tag] = { \
+ .errno = _errno, \
+ .name = _desc, \
+}
static const struct {
int errno;
const char *name;
} blk_errors[] = {
- [BLK_STS_OK] = { 0, "" },
- [BLK_STS_NOTSUPP] = { -EOPNOTSUPP, "operation not supported" },
- [BLK_STS_TIMEOUT] = { -ETIMEDOUT, "timeout" },
- [BLK_STS_NOSPC] = { -ENOSPC, "critical space allocation" },
- [BLK_STS_TRANSPORT] = { -ENOLINK, "recoverable transport" },
- [BLK_STS_TARGET] = { -EREMOTEIO, "critical target" },
- [BLK_STS_RESV_CONFLICT] = { -EBADE, "reservation conflict" },
- [BLK_STS_MEDIUM] = { -ENODATA, "critical medium" },
- [BLK_STS_PROTECTION] = { -EILSEQ, "protection" },
- [BLK_STS_RESOURCE] = { -ENOMEM, "kernel resource" },
- [BLK_STS_DEV_RESOURCE] = { -EBUSY, "device resource" },
- [BLK_STS_AGAIN] = { -EAGAIN, "nonblocking retry" },
- [BLK_STS_OFFLINE] = { -ENODEV, "device offline" },
+ ENT(OK, 0, ""),
+ ENT(NOTSUPP, -EOPNOTSUPP, "operation not supported"),
+ ENT(TIMEOUT, -ETIMEDOUT, "timeout"),
+ ENT(NOSPC, -ENOSPC, "critical space allocation"),
+ ENT(TRANSPORT, -ENOLINK, "recoverable transport"),
+ ENT(TARGET, -EREMOTEIO, "critical target"),
+ ENT(RESV_CONFLICT, -EBADE, "reservation conflict"),
+ ENT(MEDIUM, -ENODATA, "critical medium"),
+ ENT(PROTECTION, -EILSEQ, "protection"),
+ ENT(RESOURCE, -ENOMEM, "kernel resource"),
+ ENT(DEV_RESOURCE, -EBUSY, "device resource"),
+ ENT(AGAIN, -EAGAIN, "nonblocking retry"),
+ ENT(OFFLINE, -ENODEV, "device offline"),
/* device mapper special case, should not leak out: */
- [BLK_STS_DM_REQUEUE] = { -EREMCHG, "dm internal retry" },
+ ENT(DM_REQUEUE, -EREMCHG, "dm internal retry"),
/* zone device specific errors */
- [BLK_STS_ZONE_OPEN_RESOURCE] = { -ETOOMANYREFS, "open zones exceeded" },
- [BLK_STS_ZONE_ACTIVE_RESOURCE] = { -EOVERFLOW, "active zones exceeded" },
+ ENT(ZONE_OPEN_RESOURCE, -ETOOMANYREFS, "open zones exceeded"),
+ ENT(ZONE_ACTIVE_RESOURCE, -EOVERFLOW, "active zones exceeded"),
/* Command duration limit device-side timeout */
- [BLK_STS_DURATION_LIMIT] = { -ETIME, "duration limit exceeded" },
-
- [BLK_STS_INVAL] = { -EINVAL, "invalid" },
+ ENT(DURATION_LIMIT, -ETIME, "duration limit exceeded"),
+ ENT(INVAL, -EINVAL, "invalid"),
/* everything else not covered above: */
- [BLK_STS_IOERR] = { -EIO, "I/O" },
+ ENT(IOERR, -EIO, "I/O"),
};
+#undef ENT
blk_status_t errno_to_blk_status(int errno)
{
--
2.53.0
^ permalink raw reply related
* configurable block error injection v4
From: Christoph Hellwig @ 2026-06-10 5:08 UTC (permalink / raw)
To: Jens Axboe
Cc: Jonathan Corbet, Damien Le Moal, Hannes Reinecke, Keith Busch,
linux-block, linux-doc
Hi all,
this series adds a new configurable block error injection facility.
We already have a few ways to inject block errors, but unfortunately most
of them are either not very useful or hard to use, or both:
- The fail_make_request failure injection point can't distinguish
different commands or different ranges in the file, and can only inject
plain I/O errors.
- the should_fail_bio 'dynamic' failure injection has all the same issues
as fail_make_request
- dm-error can only fail all commands in the table using BLK_STS_IOERR
and requires setting up a new block device
- dm-flakey and dm-dust allow all kinds of configurability, but still
don't have good error selection, no good support for non-read/write
commands and are limited to the dm table alignment requirements,
which for zoned devices enforces setting them up for an entire zone.
They also once again require setting up a stacked block device,
which is really annoying in harnesses like xfstests
This series adds a new debugfs-based block layer error injection
facility that allows configuring which operations and ranges the
injection applies to, and what status to return. It also allows
configuring a failure ratio similar to the xfs errortag injection.
Changes since v3:
- use a static branch to guard the new condition
- split out a new header so that jump_label.h doesn't get pulled into
blk.h
- more checking for impossible conditions in blk_status_to_tag
- more spelling fixes
Changes since v2:
- improve the documentation a bit
- fix a spelling mistake in a comment
Changes since v1:
- drop the should_fail_bio removal and cleanup depending on it, as it's
used by eBPF programs and thus a hidden UABI.
- as a result split the code out to its own Kconfig symbol
- various error handling fixes pointed out by Keith
- documentation spelling fixes pointed out by Randy
Diffstat:
Documentation/block/error-injection.rst | 59 ++++++
Documentation/block/index.rst | 1
block/Kconfig | 8
block/Makefile | 1
block/blk-core.c | 87 ++++++--
block/blk-sysfs.c | 5
block/blk.h | 3
block/error-injection.c | 314 ++++++++++++++++++++++++++++++++
block/error-injection.h | 21 ++
block/genhd.c | 4
include/linux/blkdev.h | 6
11 files changed, 489 insertions(+), 20 deletions(-)
^ permalink raw reply
* [linus:master] [block] 3179a5f7f8: blktests.loop/010.fail
From: kernel test robot @ 2026-06-10 2:57 UTC (permalink / raw)
To: Li Chen
Cc: oe-lkp, lkp, linux-kernel, Jens Axboe, Chaitanya Kulkarni,
Bart Van Assche, linux-block, oliver.sang
Hello,
kernel test robot noticed "blktests.loop/010.fail" on:
commit: 3179a5f7f86bcc3acd5d6fb2a29f891ef5615852 ("block: rate-limit capacity change info log")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master c10130c234c81f4a7a143edbf413080235f8d8ce]
[test failed on linux-next/master 6e845bcb78c95af935094040bd4edc3c2b6dd784]
in testcase: blktests
version: blktests-x86_64-9131687-1_20260529
with following parameters:
test: loop-010
config: x86_64-rhel-9.4-func
compiler: gcc-14
test machine: 16 threads Intel(R) Core(TM) i7-13620H (Raptor Lake) with 32G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202606100416.99599c18-lkp@intel.com
2026-06-05 21:40:15 echo loop/010
2026-06-05 21:40:15 ./check loop/010
loop/010 (check stale loop partition)
loop/010 (check stale loop partition) [failed]
runtime ... 92.093s
something found in dmesg:
[ 59.348633] [ T1202] run blktests loop/010 at 2026-06-05 21:40:15
[ 59.382173] [ T1223] loop0: detected capacity change from 0 to 2097152
[ 59.382922] [ T1005] loop0:
[ 59.385325] [ T1223] loop0:
[ 59.417927] [ T274] loop0: p1
[ 59.793846] [ T1267] loop0: detected capacity change from 0 to 2097152
[ 59.794835] [ T1268] loop0: p1
[ 59.796265] [ T1267] loop0: p1
[ 59.802753] [ T1272] loop0: detected capacity change from 0 to 2097152
[ 59.803800] [ T1272] loop0: p1
...
(See '/lkp/benchmarks/blktests/results/nodev/loop/010.dmesg' for the entire message)
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260610/202606100416.99599c18-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply
* Re: [PATCH 23/27] ublk: Enable lock context analysis
From: Ming Lei @ 2026-06-10 2:24 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Marco Elver,
Nathan Chancellor
In-Reply-To: <0a98c4c6a776f88e8e300ef9b205c190a3857609.1781042470.git.bvanassche@acm.org>
On Tue, Jun 09, 2026 at 03:05:10PM -0700, Bart Van Assche wrote:
> Add the lock context annotations that are required by Clang.
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Thanks,
Ming
^ permalink raw reply
* Re: [PATCH 21/27] null_blk: Enable lock context analysis
From: Damien Le Moal @ 2026-06-09 23:42 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe
Cc: linux-block, Christoph Hellwig, Marco Elver, Keith Busch,
Chaitanya Kulkarni, Johannes Thumshirn, Nilay Shroff,
Genjian Zhang, Kees Cook
In-Reply-To: <b70422c6f243b9275e336430071b52eb92dd40c4.1781042470.git.bvanassche@acm.org>
On 2026/06/10 6:05, Bart Van Assche wrote:
> Add __must_hold() annotations where these are missing. Annotate two
> functions that use conditional locking with __context_unsafe(). Enable lock
> context analysis in the Makefile.
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Looks OK to me.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
--
Damien Le Moal
Western Digital Research
^ permalink raw reply