Linux-NVME Archive on lore.kernel.org
* [PATCH RFC 0/5] block: validate bios against queue limits in the entered context
@ 2026-05-19 17:23 Keith Busch
From: Keith Busch @ 2026-05-19 17:23 UTC
  To: linux-block, linux-nvme
  Cc: axboe, hch, tom.leiming, coshi036, Igor.Achkinazi, dlemoal,
	Keith Busch

From: Keith Busch <kbusch@kernel.org>

The block layer validates bios against various queue limits and device
capacity in submit_bio_noacct(), but these checks run before
bio_queue_enter(). This means they are not serialized against drivers
that update queue limits inside a freeze window. A bio that passes
validation under old limits can enter the queue after the update and
reach the driver with an invalid configuration.

This series moves all limit-dependent validation into
__bio_split_to_limits(), which runs after the queue usage reference has
been acquired. This ensures proper serialization against limit updates.
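
As a rough illustration of the ordering problem (a userspace model, not
kernel code), the sketch below contrasts "check, then enter" with
"enter, then check". Every sim_* name is hypothetical and exists only
for this example; the racing_update callback stands in for a driver
that changes limits inside a freeze window, and a simple bounds test
stands in for the real limit checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a request queue and a bio. */
struct sim_queue {
	unsigned long long capacity;	/* in sectors */
};

struct sim_bio {
	unsigned long long sector;
	unsigned int nr_sectors;
};

/* Models a limit update performed inside a freeze window. */
typedef void (*sim_update_fn)(struct sim_queue *q);

/* Bounds check against the capacity visible at the time of the call. */
static bool sim_in_bounds(const struct sim_queue *q, const struct sim_bio *bio)
{
	assert(q && bio);
	return bio->nr_sectors <= q->capacity &&
	       bio->sector <= q->capacity - bio->nr_sectors;
}

/*
 * Old ordering: validate first, enter the queue afterwards. An update
 * that lands between the two steps is never observed by the check.
 * Returns true if the bio is dispatched to the driver.
 */
static bool sim_submit_check_first(struct sim_queue *q,
				   const struct sim_bio *bio,
				   sim_update_fn racing_update)
{
	if (!sim_in_bounds(q, bio))
		return false;
	if (racing_update)
		racing_update(q);	/* freeze, update limits, unfreeze */
	/* Queue entered here; nothing re-validates the bio. */
	return true;
}

/*
 * New ordering: enter first, then validate. A concurrent update has to
 * finish before the queue can be entered, so the check sees its result.
 */
static bool sim_submit_enter_first(struct sim_queue *q,
				   const struct sim_bio *bio,
				   sim_update_fn racing_update)
{
	if (racing_update)
		racing_update(q);	/* completes before we get in */
	/* Queue entered: limits are stable until the reference is dropped. */
	return sim_in_bounds(q, bio);
}

/* Example update: the driver sets the capacity to 0. */
static void sim_set_capacity_zero(struct sim_queue *q)
{
	q->capacity = 0;
}
```

With sim_set_capacity_zero as the racing update, the check-first path
dispatches a bio that is out of bounds by the time the driver sees it,
while the enter-first path rejects it.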

This change was motivated by a few recent reports:

  https://lore.kernel.org/linux-nvme/MW5PR19MB548483D1FAE4F322E4C97352FD032@MW5PR19MB5484.namprd19.prod.outlook.com/
  https://lore.kernel.org/linux-nvme/20260517053635.2282446-1-coshi036@gmail.com/

When an NVMe namespace is reformatted to use extended metadata that the
controller can't generate or strip, the driver sets the capacity to 0
inside a freeze window because the block layer is not able to form a
viable request for this format. But a bio that passed bio_check_eod()
before the freeze can still reach nvme_setup_rw() after the update,
triggering a WARN that was not expected to be reachable.

For NVMe multipath, moving these checks into the entered context
exacerbates a different problem: a bio targeting a path being torn down
(capacity set to 0) would be failed by the block layer before the driver
gets a chance to redirect it to another path. This is a pre-existing
problem, but the initial changes in this series make it easier to hit.
The series addresses this by introducing a new callback to
block_device_operations, called from bio_io_error(), that lets the
driver intercept and redirect failing bios before they are completed.
NVMe multipath uses this to requeue bios back to the head device for
path re-selection when the path is no longer ready.
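
The interception idea can be modeled in userspace roughly as follows.
This is only a sketch: the mp_* names, the bool return convention
("true means the driver took ownership of the bio") and the requeue
list are assumptions made for illustration, and the actual callback
signature is defined by patch 5/5, not by this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MP_STATUS_IOERR (-5)	/* illustrative error value */

struct mp_bio;

/* Hypothetical ops table with an optional failure hook. */
struct mp_disk_ops {
	/* Return true if the driver took ownership of the failing bio. */
	bool (*failed_bio)(struct mp_bio *bio);
};

/* Hypothetical multipath head device holding bios awaiting re-selection. */
struct mp_head {
	struct mp_bio *requeue_list;
};

struct mp_bio {
	const struct mp_disk_ops *ops;
	struct mp_head *head;	/* NULL when not submitted via a head */
	bool path_ready;	/* state of the path this bio targeted */
	bool completed;
	int status;
	struct mp_bio *next;
};

/* Driver side: redirect only when the path is no longer ready. */
static bool mp_failed_bio(struct mp_bio *bio)
{
	if (bio->path_ready || !bio->head)
		return false;	/* genuine failure: let it complete */
	bio->next = bio->head->requeue_list;
	bio->head->requeue_list = bio;
	return true;
}

/* Block layer side: offer the bio to the driver before failing it. */
static void mp_bio_io_error(struct mp_bio *bio)
{
	assert(bio);
	if (bio->ops && bio->ops->failed_bio && bio->ops->failed_bio(bio))
		return;		/* intercepted, not completed */
	bio->status = MP_STATUS_IOERR;
	bio->completed = true;
}

static const struct mp_disk_ops mp_ops = {
	.failed_bio = mp_failed_bio,
};
```

In this model a bio aimed at a path that is not ready ends up on the
head's requeue list instead of completing with an error, while a bio on
a ready path, or on a device without the hook, fails as before.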

Note that callers of submit_bio_noacct_nocheck() (bio split, throttle
dispatch) will now hit the validation checks in __bio_split_to_limits()
that they previously bypassed. This is intentional: these checks must
run in the entered context to be properly serialized, so they cannot be
skipped and the added performance cost is unavoidable.

Keith Busch (5):
  blk-mq: fix status for unaligned bio
  block: fix invalid zone append status codes
  block: validate bio bounds in the queue entered context
  block: move bio operation validation into __bio_split_to_limits
  block, nvme: add failed_bio callback for multipath bio failover

 block/blk-core.c              | 144 ----------------------------------
 block/blk-mq.c                |   3 +-
 block/blk.h                   |  89 ++++++++++++++++++++-
 drivers/nvme/host/core.c      |   1 +
 drivers/nvme/host/multipath.c |  26 ++++++
 drivers/nvme/host/nvme.h      |   2 +
 include/linux/bio.h           |   6 --
 include/linux/blkdev.h        |  16 ++++
 8 files changed, 133 insertions(+), 154 deletions(-)

-- 
2.53.0-Meta



Thread overview: 18+ messages
2026-05-19 17:23 [PATCH RFC 0/5] block: validate bios against queue limits in the entered context Keith Busch
2026-05-19 17:23 ` [PATCH RFC 1/5] blk-mq: fix status for unaligned bio Keith Busch
2026-05-20  6:56   ` Damien Le Moal
2026-05-20  7:22   ` Christoph Hellwig
2026-05-20 19:54     ` Chao S
2026-05-19 17:23 ` [PATCH RFC 2/5] block: fix invalid zone append status Keith Busch
2026-05-20  6:58   ` Damien Le Moal
2026-05-20  7:23   ` Christoph Hellwig
2026-05-19 17:23 ` [PATCH RFC 3/5] block: validate bio bounds in the queue entered context Keith Busch
2026-05-20  7:22   ` Damien Le Moal
2026-05-20  7:25     ` Damien Le Moal
2026-05-19 17:23 ` [PATCH RFC 4/5] block: move bio validation into __bio_split_to_limits Keith Busch
2026-05-20  7:25   ` Christoph Hellwig
2026-05-20 15:19     ` Keith Busch
2026-05-19 17:23 ` [PATCH RFC 5/5] block, nvme: add failed_bio callback for multipath bio failover Keith Busch
2026-05-20  7:27   ` Christoph Hellwig
2026-05-20 15:07     ` Keith Busch
2026-05-20 15:26       ` Keith Busch