* [PATCH 1/2] dm-io: clone the source bio instead of copying its biovec
@ 2026-06-16 15:05 Keith Busch
2026-06-16 15:05 ` [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors Keith Busch
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Keith Busch @ 2026-06-16 15:05 UTC (permalink / raw)
To: dm-devel
Cc: linux-block, mpatocka, Keith Busch, Dr. David Alan Gilbert,
Vjaceslavs Klimovs
From: Keith Busch <kbusch@kernel.org>
For DM_IO_BIO requests, do_region() built each destination bio by walking
the source bio's biovec and re-adding the pages one at a time, tracking
the remaining transfer in sectors. The vector lengths are byte granular
and need not be sector aligned (e.g. a misaligned O_DIRECT buffer split
across pages), so the sector-based accounting could lose a sub-sector
fragment: to_sector() truncated the remainder and the outer loop spun
forever submitting empty bios, hanging the I/O.
There is no need to rebuild the biovec at all. The destination reads into
(or writes from) exactly the same pages as the source bio, so the bio can
simply clone the source's biovec with bio_alloc_clone() and remap it to
the target device. The clone inherits the source's iterator and alignment,
and the block layer splits it to the target's limits on submission, so the
whole region maps to a single cloned bio with no manual page copying or
sector accounting.
This removes the per-page copy path (and its open-coded bvec dpages
helpers) for bio-backed I/O and fixes the hang on misaligned direct I/O to
a dm-mirror device. Page-list, vma and kmem sources keep the existing copy
path.
Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Fixes: 5ff3f74e145a ("block: simplify direct io validity check")
Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Reported-by: Vjaceslavs Klimovs <vklimovs@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/md/dm-io.c | 67 +++++++++++++++++-----------------------------
1 file changed, 24 insertions(+), 43 deletions(-)
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 1db565b376200..28adfeb58f240 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -170,12 +170,11 @@ struct dpages {
struct page **p, unsigned long *len, unsigned int *offset);
void (*next_page)(struct dpages *dp);
- union {
- unsigned int context_u;
- struct bvec_iter context_bi;
- };
+ unsigned int context_u;
void *context_ptr;
+ struct bio *orig_bio;
+
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
};
@@ -210,44 +209,6 @@ static void list_dp_init(struct dpages *dp, struct page_list *pl, unsigned int o
dp->context_ptr = pl;
}
-/*
- * Functions for getting the pages from a bvec.
- */
-static void bio_get_page(struct dpages *dp, struct page **p,
- unsigned long *len, unsigned int *offset)
-{
- struct bio_vec bvec = bvec_iter_bvec((struct bio_vec *)dp->context_ptr,
- dp->context_bi);
-
- *p = bvec.bv_page;
- *len = bvec.bv_len;
- *offset = bvec.bv_offset;
-
- /* avoid figuring it out again in bio_next_page() */
- dp->context_bi.bi_sector = (sector_t)bvec.bv_len;
-}
-
-static void bio_next_page(struct dpages *dp)
-{
- unsigned int len = (unsigned int)dp->context_bi.bi_sector;
-
- bvec_iter_advance((struct bio_vec *)dp->context_ptr,
- &dp->context_bi, len);
-}
-
-static void bio_dp_init(struct dpages *dp, struct bio *bio)
-{
- dp->get_page = bio_get_page;
- dp->next_page = bio_next_page;
-
- /*
- * We just use bvec iterator to retrieve pages, so it is ok to
- * access the bvec table directly here
- */
- dp->context_ptr = bio->bi_io_vec;
- dp->context_bi = bio->bi_iter;
-}
-
/*
* Functions for getting the pages from a VMA.
*/
@@ -332,6 +293,21 @@ static void do_region(const blk_opf_t opf, unsigned int region,
return;
}
+ if (dp->orig_bio) {
+ bio = bio_alloc_clone(where->bdev, dp->orig_bio, GFP_NOIO,
+ &io->client->bios);
+ bio->bi_iter.bi_sector = where->sector;
+ bio->bi_iter.bi_size = where->count << SECTOR_SHIFT;
+ bio->bi_opf = opf;
+ bio->bi_end_io = endio;
+ bio->bi_ioprio = ioprio;
+ store_io_and_region_in_bio(bio, io, region);
+
+ atomic_inc(&io->count);
+ submit_bio(bio);
+ return;
+ }
+
/*
* where->count may be zero if op holds a flush and we need to
* send a zero-sized flush.
@@ -468,6 +444,7 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
dp->vma_invalidate_address = NULL;
dp->vma_invalidate_size = 0;
+ dp->orig_bio = NULL;
switch (io_req->mem.type) {
case DM_IO_PAGE_LIST:
@@ -475,7 +452,11 @@ static int dp_init(struct dm_io_request *io_req, struct dpages *dp,
break;
case DM_IO_BIO:
- bio_dp_init(dp, io_req->mem.ptr.bio);
+ /*
+ * The destination bios clone this bio's biovec directly, so
+ * there are no per-page accessors to set up here.
+ */
+ dp->orig_bio = io_req->mem.ptr.bio;
break;
case DM_IO_VMA:
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors
2026-06-16 15:05 [PATCH 1/2] dm-io: clone the source bio instead of copying its biovec Keith Busch
@ 2026-06-16 15:05 ` Keith Busch
2026-06-16 17:54 ` Dr. David Alan Gilbert
2026-06-16 15:40 ` Keith Busch
2026-06-16 15:58 ` Keith Busch
2 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2026-06-16 15:05 UTC (permalink / raw)
To: dm-devel
Cc: linux-block, mpatocka, Keith Busch, Dr. David Alan Gilbert,
Vjaceslavs Klimovs
From: Keith Busch <kbusch@kernel.org>
BLK_STS_INVAL indicates the I/O request itself was invalid (for example a
misaligned direct I/O), not that the device has failed. dm-raid1 treated
any read or write completion error as a device failure: it failed the
mirror leg, retried on the alternatives - which fail identically - and
eventually returned EIO while spuriously degrading the array.
Since commit 5ff3f74e145a ("block: simplify direct io validity check") the
direct I/O path no longer rejects misaligned buffers up front, so an
invalid bio now reaches the lower block layers, which fail it with
BLK_STS_INVAL. dm-io collapses the block status into a per-region error
bit before invoking the completion callback, so record BLK_STS_INVAL on
the originating bio and have the dm-raid1 read, write and end_io paths
propagate it instead of failing the device.
This mirrors the raid1/raid10 fix in commit f7b24c7b41f23
("md/raid1,raid10: don't fail devices for invalid IO errors") for the
device-mapper mirror target.
Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Fixes: 5ff3f74e145a ("block: simplify direct io validity check")
Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Reported-by: Vjaceslavs Klimovs <vklimovs@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/md/dm-io.c | 14 +++++++++++++-
drivers/md/dm-raid1.c | 28 +++++++++++++++++++++++++++-
2 files changed, 40 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 28adfeb58f240..f382e9f9be059 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -37,6 +37,7 @@ struct io {
struct dm_io_client *client;
io_notify_fn callback;
void *context;
+ struct bio *orig_bio;
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
} __aligned(DM_IO_MAX_REGIONS);
@@ -132,8 +133,18 @@ static void complete_io(struct io *io)
static void dec_count(struct io *io, unsigned int region, blk_status_t error)
{
- if (error)
+ if (error) {
set_bit(region, &io->error_bits);
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying
+ * device (e.g. a misaligned direct I/O), which is a caller error
+ * rather than a device failure. Record it on the original bio so
+ * bio-based targets can propagate it instead of treating it as a
+ * media error and failing the device.
+ */
+ if (error == BLK_STS_INVAL && io->orig_bio)
+ io->orig_bio->bi_status = error;
+ }
if (atomic_dec_and_test(&io->count))
complete_io(io);
@@ -398,6 +409,7 @@ static void async_io(struct dm_io_client *client, unsigned int num_regions,
io->client = client;
io->callback = fn;
io->context = context;
+ io->orig_bio = dp->orig_bio;
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index de5c00704e69c..022ad791c2957 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -524,6 +524,17 @@ static void read_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. That is a caller error, not a device
+ * failure, so propagate it rather than failing the mirror and retrying
+ * on the other legs, which would fail the same way.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
fail_mirror(m, DM_RAID1_READ_ERROR);
if (likely(default_ok(m)) || mirror_available(m->ms, bio)) {
@@ -622,6 +633,16 @@ static void write_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate the error without degrading
+ * the array.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
/*
* If the bio is discard, return an error, but do not
* degrade the array.
@@ -1262,7 +1283,12 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio,
return DM_ENDIO_DONE;
}
- if (*error == BLK_STS_NOTSUPP)
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate it rather than failing the
+ * mirror and retrying, which would fail the same way on every leg.
+ */
+ if (*error == BLK_STS_NOTSUPP || *error == BLK_STS_INVAL)
goto out;
if (bio->bi_opf & REQ_RAHEAD)
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors
2026-06-16 15:05 [PATCH 1/2] dm-io: clone the source bio instead of copying its biovec Keith Busch
2026-06-16 15:05 ` [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors Keith Busch
@ 2026-06-16 15:40 ` Keith Busch
2026-06-16 15:58 ` Keith Busch
2 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2026-06-16 15:40 UTC (permalink / raw)
To: dm-devel; +Cc: linux-block, mpatocka, kbusch, linux, vklimovs
From: Keith Busch <kbusch@kernel.org>
BLK_STS_INVAL indicates the I/O request itself was invalid (for example a
misaligned direct I/O), not that the device has failed. dm-raid1 treated
any read or write completion error as a device failure: it failed the
mirror leg, retried on the alternatives - which fail identically - and
eventually returned EIO while spuriously degrading the array.
Since commit 5ff3f74e145a ("block: simplify direct io validity check") the
direct I/O path no longer rejects misaligned buffers up front, so an
invalid bio now reaches the lower block layers, which fail it with
BLK_STS_INVAL. dm-io collapses the block status into a per-region error
bit before invoking the completion callback, so record BLK_STS_INVAL on
the originating bio and have the dm-raid1 read, write and end_io paths
propagate it instead of failing the device.
This mirrors the raid1/raid10 fix in commit f7b24c7b41f23
("md/raid1,raid10: don't fail devices for invalid IO errors") for the
device-mapper mirror target.
Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Fixes: 5ff3f74e145a ("block: simplify direct io validity check")
Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Reported-by: Vjaceslavs Klimovs <vklimovs@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/md/dm-io.c | 14 +++++++++++++-
drivers/md/dm-raid1.c | 28 +++++++++++++++++++++++++++-
2 files changed, 40 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 28adfeb58f240..f382e9f9be059 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -37,6 +37,7 @@ struct io {
struct dm_io_client *client;
io_notify_fn callback;
void *context;
+ struct bio *orig_bio;
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
} __aligned(DM_IO_MAX_REGIONS);
@@ -132,8 +133,18 @@ static void complete_io(struct io *io)
static void dec_count(struct io *io, unsigned int region, blk_status_t error)
{
- if (error)
+ if (error) {
set_bit(region, &io->error_bits);
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying
+ * device (e.g. a misaligned direct I/O), which is a caller error
+ * rather than a device failure. Record it on the original bio so
+ * bio-based targets can propagate it instead of treating it as a
+ * media error and failing the device.
+ */
+ if (error == BLK_STS_INVAL && io->orig_bio)
+ io->orig_bio->bi_status = error;
+ }
if (atomic_dec_and_test(&io->count))
complete_io(io);
@@ -398,6 +409,7 @@ static void async_io(struct dm_io_client *client, unsigned int num_regions,
io->client = client;
io->callback = fn;
io->context = context;
+ io->orig_bio = dp->orig_bio;
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index de5c00704e69c..022ad791c2957 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -524,6 +524,17 @@ static void read_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. That is a caller error, not a device
+ * failure, so propagate it rather than failing the mirror and retrying
+ * on the other legs, which would fail the same way.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
fail_mirror(m, DM_RAID1_READ_ERROR);
if (likely(default_ok(m)) || mirror_available(m->ms, bio)) {
@@ -622,6 +633,16 @@ static void write_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate the error without degrading
+ * the array.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
/*
* If the bio is discard, return an error, but do not
* degrade the array.
@@ -1262,7 +1283,12 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio,
return DM_ENDIO_DONE;
}
- if (*error == BLK_STS_NOTSUPP)
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate it rather than failing the
+ * mirror and retrying, which would fail the same way on every leg.
+ */
+ if (*error == BLK_STS_NOTSUPP || *error == BLK_STS_INVAL)
goto out;
if (bio->bi_opf & REQ_RAHEAD)
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors
2026-06-16 15:05 [PATCH 1/2] dm-io: clone the source bio instead of copying its biovec Keith Busch
2026-06-16 15:05 ` [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors Keith Busch
2026-06-16 15:40 ` Keith Busch
@ 2026-06-16 15:58 ` Keith Busch
2 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2026-06-16 15:58 UTC (permalink / raw)
To: Keith Busch
Cc: dm-devel, linux-block, mpatocka, Dr. David Alan Gilbert,
Vjaceslavs Klimovs
BLK_STS_INVAL indicates the I/O request itself was invalid (for example a
misaligned direct I/O), not that the device has failed. dm-raid1 treated
any read or write completion error as a device failure: it failed the
mirror leg, retried on the alternatives - which fail identically - and
eventually returned EIO while spuriously degrading the array.
Since commit 5ff3f74e145a ("block: simplify direct io validity check") the
direct I/O path no longer rejects misaligned buffers up front, so an
invalid bio now reaches the lower block layers, which fail it with
BLK_STS_INVAL. dm-io collapses the block status into a per-region error
bit before invoking the completion callback, so record BLK_STS_INVAL on
the originating bio and have the dm-raid1 read, write and end_io paths
propagate it instead of failing the device.
This mirrors the raid1/raid10 fix in commit f7b24c7b41f23
("md/raid1,raid10: don't fail devices for invalid IO errors") for the
device-mapper mirror target.
Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
Fixes: 5ff3f74e145a ("block: simplify direct io validity check")
Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
Reported-by: Vjaceslavs Klimovs <vklimovs@gmail.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
Resending patch 2/2 from a different machine. For some reason, only 1/2
is getting through with git-send-email, so manually replying to the
thread with the missing second patch.
drivers/md/dm-io.c | 14 +++++++++++++-
drivers/md/dm-raid1.c | 28 +++++++++++++++++++++++++++-
2 files changed, 40 insertions(+), 2 deletions(-)
diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 28adfeb58f240..f382e9f9be059 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -37,6 +37,7 @@ struct io {
struct dm_io_client *client;
io_notify_fn callback;
void *context;
+ struct bio *orig_bio;
void *vma_invalidate_address;
unsigned long vma_invalidate_size;
} __aligned(DM_IO_MAX_REGIONS);
@@ -132,8 +133,18 @@ static void complete_io(struct io *io)
static void dec_count(struct io *io, unsigned int region, blk_status_t error)
{
- if (error)
+ if (error) {
set_bit(region, &io->error_bits);
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying
+ * device (e.g. a misaligned direct I/O), which is a caller error
+ * rather than a device failure. Record it on the original bio so
+ * bio-based targets can propagate it instead of treating it as a
+ * media error and failing the device.
+ */
+ if (error == BLK_STS_INVAL && io->orig_bio)
+ io->orig_bio->bi_status = error;
+ }
if (atomic_dec_and_test(&io->count))
complete_io(io);
@@ -398,6 +409,7 @@ static void async_io(struct dm_io_client *client, unsigned int num_regions,
io->client = client;
io->callback = fn;
io->context = context;
+ io->orig_bio = dp->orig_bio;
io->vma_invalidate_address = dp->vma_invalidate_address;
io->vma_invalidate_size = dp->vma_invalidate_size;
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index de5c00704e69c..022ad791c2957 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -524,6 +524,17 @@ static void read_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. That is a caller error, not a device
+ * failure, so propagate it rather than failing the mirror and retrying
+ * on the other legs, which would fail the same way.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
fail_mirror(m, DM_RAID1_READ_ERROR);
if (likely(default_ok(m)) || mirror_available(m->ms, bio)) {
@@ -622,6 +633,16 @@ static void write_callback(unsigned long error, void *context)
return;
}
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate the error without degrading
+ * the array.
+ */
+ if (bio->bi_status == BLK_STS_INVAL) {
+ bio_endio(bio);
+ return;
+ }
+
/*
* If the bio is discard, return an error, but do not
* degrade the array.
@@ -1262,7 +1283,12 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio,
return DM_ENDIO_DONE;
}
- if (*error == BLK_STS_NOTSUPP)
+ /*
+ * BLK_STS_INVAL means the bio was not valid for the underlying device,
+ * e.g. a misaligned direct I/O. Propagate it rather than failing the
+ * mirror and retrying, which would fail the same way on every leg.
+ */
+ if (*error == BLK_STS_NOTSUPP || *error == BLK_STS_INVAL)
goto out;
if (bio->bi_opf & REQ_RAHEAD)
--
2.52.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors
2026-06-16 15:05 ` [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors Keith Busch
@ 2026-06-16 17:54 ` Dr. David Alan Gilbert
2026-06-16 18:48 ` Keith Busch
0 siblings, 1 reply; 6+ messages in thread
From: Dr. David Alan Gilbert @ 2026-06-16 17:54 UTC (permalink / raw)
To: Keith Busch
Cc: dm-devel, linux-block, mpatocka, Keith Busch, Vjaceslavs Klimovs
* Keith Busch (kbusch@meta.com) wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> BLK_STS_INVAL indicates the I/O request itself was invalid (for example a
> misaligned direct I/O), not that the device has failed. dm-raid1 treated
> any read or write completion error as a device failure: it failed the
> mirror leg, retried on the alternatives - which fail identically - and
> eventually returned EIO while spuriously degrading the array.
>
> Since commit 5ff3f74e145a ("block: simplify direct io validity check") the
> direct I/O path no longer rejects misaligned buffers up front, so an
> invalid bio now reaches the lower block layers, which fail it with
> BLK_STS_INVAL. dm-io collapses the block status into a per-region error
> bit before invoking the completion callback, so record BLK_STS_INVAL on
> the originating bio and have the dm-raid1 read, write and end_io paths
> propagate it instead of failing the device.
>
> This mirrors the raid1/raid10 fix in commit f7b24c7b41f23
> ("md/raid1,raid10: don't fail devices for invalid IO errors") for the
> device-mapper mirror target.
>
> Fixes: 7eac33186957 ("iomap: simplify direct io validity check")
> Fixes: 5ff3f74e145a ("block: simplify direct io validity check")
> Reported-by: Dr. David Alan Gilbert <linux@treblig.org>
> Reported-by: Vjaceslavs Klimovs <vklimovs@gmail.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
OK, for this pair I think would be fair for a Tested-by me as well;
they certainly resolve the hang and the WARN/BUGs.
I still see the errors as EIO on my tests, and on the older mirror type
get the stuck resync on write, and on the newer mirror I see the write
apparently succeed (did it really?)
I suggest given the BUG/WARN and the number of people who've tripped over
this, and it's triggerable as a normal user, that it's a candidate for stable.
Thanks again,
Dave
> ---
> drivers/md/dm-io.c | 14 +++++++++++++-
> drivers/md/dm-raid1.c | 28 +++++++++++++++++++++++++++-
> 2 files changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
> index 28adfeb58f240..f382e9f9be059 100644
> --- a/drivers/md/dm-io.c
> +++ b/drivers/md/dm-io.c
> @@ -37,6 +37,7 @@ struct io {
> struct dm_io_client *client;
> io_notify_fn callback;
> void *context;
> + struct bio *orig_bio;
> void *vma_invalidate_address;
> unsigned long vma_invalidate_size;
> } __aligned(DM_IO_MAX_REGIONS);
> @@ -132,8 +133,18 @@ static void complete_io(struct io *io)
>
> static void dec_count(struct io *io, unsigned int region, blk_status_t error)
> {
> - if (error)
> + if (error) {
> set_bit(region, &io->error_bits);
> + /*
> + * BLK_STS_INVAL means the bio was not valid for the underlying
> + * device (e.g. a misaligned direct I/O), which is a caller error
> + * rather than a device failure. Record it on the original bio so
> + * bio-based targets can propagate it instead of treating it as a
> + * media error and failing the device.
> + */
> + if (error == BLK_STS_INVAL && io->orig_bio)
> + io->orig_bio->bi_status = error;
> + }
>
> if (atomic_dec_and_test(&io->count))
> complete_io(io);
> @@ -398,6 +409,7 @@ static void async_io(struct dm_io_client *client, unsigned int num_regions,
> io->client = client;
> io->callback = fn;
> io->context = context;
> + io->orig_bio = dp->orig_bio;
>
> io->vma_invalidate_address = dp->vma_invalidate_address;
> io->vma_invalidate_size = dp->vma_invalidate_size;
> diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
> index de5c00704e69c..022ad791c2957 100644
> --- a/drivers/md/dm-raid1.c
> +++ b/drivers/md/dm-raid1.c
> @@ -524,6 +524,17 @@ static void read_callback(unsigned long error, void *context)
> return;
> }
>
> + /*
> + * BLK_STS_INVAL means the bio was not valid for the underlying device,
> + * e.g. a misaligned direct I/O. That is a caller error, not a device
> + * failure, so propagate it rather than failing the mirror and retrying
> + * on the other legs, which would fail the same way.
> + */
> + if (bio->bi_status == BLK_STS_INVAL) {
> + bio_endio(bio);
> + return;
> + }
> +
> fail_mirror(m, DM_RAID1_READ_ERROR);
>
> if (likely(default_ok(m)) || mirror_available(m->ms, bio)) {
> @@ -622,6 +633,16 @@ static void write_callback(unsigned long error, void *context)
> return;
> }
>
> + /*
> + * BLK_STS_INVAL means the bio was not valid for the underlying device,
> + * e.g. a misaligned direct I/O. Propagate the error without degrading
> + * the array.
> + */
> + if (bio->bi_status == BLK_STS_INVAL) {
> + bio_endio(bio);
> + return;
> + }
> +
> /*
> * If the bio is discard, return an error, but do not
> * degrade the array.
> @@ -1262,7 +1283,12 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio,
> return DM_ENDIO_DONE;
> }
>
> - if (*error == BLK_STS_NOTSUPP)
> + /*
> + * BLK_STS_INVAL means the bio was not valid for the underlying device,
> + * e.g. a misaligned direct I/O. Propagate it rather than failing the
> + * mirror and retrying, which would fail the same way on every leg.
> + */
> + if (*error == BLK_STS_NOTSUPP || *error == BLK_STS_INVAL)
> goto out;
>
> if (bio->bi_opf & REQ_RAHEAD)
> --
> 2.52.0
>
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux | Happy \
\ dave @ treblig.org | | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors
2026-06-16 17:54 ` Dr. David Alan Gilbert
@ 2026-06-16 18:48 ` Keith Busch
0 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2026-06-16 18:48 UTC (permalink / raw)
To: Dr. David Alan Gilbert
Cc: Keith Busch, dm-devel, linux-block, mpatocka, Vjaceslavs Klimovs
On Tue, Jun 16, 2026 at 05:54:28PM +0000, Dr. David Alan Gilbert wrote:
> OK, for this pair I think would be fair for a Tested-by me as well;
> they certainly resolve the hang and the WARN/BUGs.
> I still see the errors as EIO on my tests, and on the older mirror type
Could you share your reproducer? I'm just using the original recipe you
sent here:
https://lore.kernel.org/linux-block/ai7rnH20IYeSmY8s@gallifrey/
And I'm seeing EINVAL instead EIO.
> get the stuck resync on write, and on the newer mirror I see the write
> apparently succeed (did it really?)
There was a time when ext4 used to fallback to buffered io for writes
but not for reads, but looks like that was fixed since 6.18, so should
be returning error.
I tried testing it with a modification to your original read test, and
it is still failing with EINVAL for me:
pread of 4096 said: -1 (Invalid argument)
pwrite of 4096 said: -1 (Invalid argument)
---
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
const char* path="/mnt/tmp/testfile";
static char buf[8192];
int main()
{
int fd=open(path, O_RDWR|O_DIRECT|O_CLOEXEC);
errno=0;
int res3=pread(fd, buf, 4096, 0);
printf("pread of 4096 said: %d (%m)\n", res3);
res3=pwrite(fd, buf, 4096, 0);
printf("pwrite of 4096 said: %d (%m)\n", res3);
}
--
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-06-16 18:48 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-16 15:05 [PATCH 1/2] dm-io: clone the source bio instead of copying its biovec Keith Busch
2026-06-16 15:05 ` [PATCH 2/2] dm-raid1: don't fail the mirror for invalid I/O errors Keith Busch
2026-06-16 17:54 ` Dr. David Alan Gilbert
2026-06-16 18:48 ` Keith Busch
2026-06-16 15:40 ` Keith Busch
2026-06-16 15:58 ` Keith Busch
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox