From: "Tomáš Trnka" <trnka@scm.com>
To: linux-raid@vger.kernel.org, song@kernel.org, yukuai@fnnas.com,
Keith Busch <kbusch@meta.com>
Cc: linan122@huawei.com, axboe@kernel.dk, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH] md/raid1,raid10: don't fail devices for invalid IO errors
Date: Fri, 17 Apr 2026 10:01:19 +0200
Message-ID: <2528293.RxA6XjA2Nv@electra>
In-Reply-To: <20260416140345.3872265-1-kbusch@meta.com>
On Thursday, 16 April 2026 16:03:45, CEST Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> BLK_STS_INVAL indicates the IO request itself was invalid, not that the
> device has failed. When raid1 treats this as a device error, it retries
> on alternate mirrors which fail the same way, eventually exceeding the
> read error threshold and removing the device from the array.
>
> This happens when stacking configurations bypass bio_split_to_limits()
> in the IO path: dm-raid calls md_handle_request() directly without going
> through md_submit_bio(), skipping the alignment validation that would
> otherwise reject invalid bios early. The invalid bio reaches the
> lower block layers, which fail the bio with BLK_STS_INVAL, and raid1
> wrongly interprets this as a device failure.
>
> Add BLK_STS_INVAL to raid1_should_handle_error() so that invalid IO
> errors are propagated back to the caller rather than triggering device
> removal. This is consistent with the previous kernel behavior when
> alignment checks were done earlier in the direct-io path.
>
> Fixes: 5ff3f74e145adc7 ("block: simplify direct io validity check")
> Link: https://lore.kernel.org/linux-block/2982107.4sosBPzcNG@electra/
> Reported-by: Tomáš Trnka <trnka@scm.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>

Tested-by: Tomáš Trnka <trnka@scm.com>

> ---
> drivers/md/raid1-10.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> index c33099925f230..56a56a4da4f83 100644
> --- a/drivers/md/raid1-10.c
> +++ b/drivers/md/raid1-10.c
> @@ -293,8 +293,13 @@ static inline bool raid1_should_read_first(struct mddev *mddev,
>   * bio with REQ_RAHEAD or REQ_NOWAIT can fail at anytime, before such IO is
>   * submitted to the underlying disks, hence don't record badblocks or retry
>   * in this case.
> + *
> + * BLK_STS_INVAL means the bio was not valid for the underlying device. This
> + * is a user error, not a device failure, so retrying or recording bad blocks
> + * would be wrong.
>   */
>  static inline bool raid1_should_handle_error(struct bio *bio)
>  {
> -	return !(bio->bi_opf & (REQ_RAHEAD | REQ_NOWAIT));
> +	return !(bio->bi_opf & (REQ_RAHEAD | REQ_NOWAIT)) &&
> +	       bio->bi_status != BLK_STS_INVAL;
>  }
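
For readers who want to see the effect of the changed predicate in isolation, here is a rough userspace model of it. This is only a sketch, not kernel code: struct model_bio, the MODEL_* constants and their numeric values are hypothetical stand-ins for struct bio, the REQ_* flags and the BLK_STS_* codes, and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these values are NOT the kernel's definitions. */
enum model_status {
	MODEL_STS_OK = 0,
	MODEL_STS_IOERR = 1,	/* models a genuine device/media error */
	MODEL_STS_INVAL = 2,	/* models BLK_STS_INVAL: the request was invalid */
};

#define MODEL_REQ_RAHEAD (1u << 0)
#define MODEL_REQ_NOWAIT (1u << 1)

/* Minimal model of the two struct bio fields the predicate looks at. */
struct model_bio {
	unsigned int opf;
	enum model_status status;
};

/*
 * Mirrors the patched condition: treat the failure as a device error only
 * if the bio was neither readahead/nowait nor rejected as invalid.
 */
static bool model_should_handle_error(const struct model_bio *bio)
{
	return !(bio->opf & (MODEL_REQ_RAHEAD | MODEL_REQ_NOWAIT)) &&
	       bio->status != MODEL_STS_INVAL;
}

/* Walks through the three cases discussed in the commit message. */
static void model_self_check(void)
{
	struct model_bio media_error = { 0, MODEL_STS_IOERR };
	struct model_bio invalid_io = { 0, MODEL_STS_INVAL };
	struct model_bio nowait_io = { MODEL_REQ_NOWAIT, MODEL_STS_IOERR };

	/* A real I/O error is still handled (retry, bad block recording). */
	assert(model_should_handle_error(&media_error));
	/* An invalid request is passed back to the caller instead. */
	assert(!model_should_handle_error(&invalid_io));
	/* REQ_NOWAIT failures were already skipped before the patch. */
	assert(!model_should_handle_error(&nowait_io));
}
```

In this model only the first case returns true, which corresponds to the intent stated above: an invalid bio no longer counts against the mirror it was submitted to.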