* [PATCH v2 0/7] bio_split() error handling rework
@ 2024-10-28 15:27 John Garry
2024-10-28 15:27 ` [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init() John Garry
` (7 more replies)
0 siblings, 8 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
bio_split() error handling could be improved as follows:
- Instead of returning NULL for an error - which is vague - return an
ERR_PTR(), which hints at what went wrong.
- Remove the BUG_ON() calls - which are generally not preferred - and
instead WARN and pass an error code back to the caller. Many callers of
bio_split() don't check the return code, so for an error we would still
get a crash from an invalid pointer dereference.
Most bio_split() callers don't check the return value. However, it could
be argued that these bio_split() calls should not fail anyway. So far I
have just fixed up the md RAID code to handle these errors, as that is
my current interest.
The motivation for this series was the initial md RAID atomic write support in
https://lore.kernel.org/linux-block/21f19b4b-4b83-4ca2-a93b-0a433741fd26@oracle.com/
There I wanted to ensure that we don't split an atomic write bio, and it
made more sense to handle this in bio_split() (instead of the bio_split()
caller).
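The NULL-vs-ERR_PTR distinction described above can be sketched in plain
userspace C. The helpers below are a simplified model of the kernel's
include/linux/err.h macros, and split_sketch() is a purely hypothetical
stand-in for bio_split(), not the kernel code itself:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified userspace model of the kernel's ERR_PTR helpers: an errno
 * value is encoded into the top 4095 bytes of the pointer space, so a
 * single return value can carry either a valid pointer or a reason for
 * failure, instead of an uninformative NULL. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(uintptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(uintptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a bio_split()-style function: invalid
 * arguments now yield a distinguishable error code, not a bare NULL. */
static void *split_sketch(int sectors, int total_sectors)
{
	static int fake_bio; /* stands in for the freshly cloned bio */

	if (sectors <= 0 || sectors >= total_sectors)
		return ERR_PTR(-EINVAL);
	return &fake_bio;
}
```

A caller can then distinguish "why it failed" via PTR_ERR() rather than
only "that it failed".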
Based on f1be1788a32e (block/for-6.13/block) block: model freeze & enter
queue as lock for supporting lockdep
Changes since RFC:
- proper handling to end the raid bio in all cases, and also pass back
proper error code (Kuai)
- Add WARN_ON_ONCE() in bio_split() (Johannes, Christoph)
- Add small patch to use BLK_STS_OK in bio_init()
- Change bio_submit_split() error path (Christoph)
John Garry (7):
block: Use BLK_STS_OK in bio_init()
block: Rework bio_split() return value
block: Error an attempt to split an atomic write in bio_split()
block: Handle bio_split() errors in bio_submit_split()
md/raid0: Handle bio_split() errors
md/raid1: Handle bio_split() errors
md/raid10: Handle bio_split() errors
block/bio.c | 16 +++++++++----
block/blk-crypto-fallback.c | 2 +-
block/blk-merge.c | 15 ++++++++----
drivers/md/raid0.c | 12 ++++++++++
drivers/md/raid1.c | 32 +++++++++++++++++++++++--
drivers/md/raid10.c | 47 ++++++++++++++++++++++++++++++++++++-
6 files changed, 110 insertions(+), 14 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init()
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-28 16:11 ` Christoph Hellwig
2024-10-28 15:27 ` [PATCH v2 2/7] block: Rework bio_split() return value John Garry
` (6 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Use the proper blk_status_t value to init the bi_status.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/bio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c
index 95e2ee14cea2..4cc33ee68640 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -251,7 +251,7 @@ void bio_init(struct bio *bio, struct block_device *bdev, struct bio_vec *table,
bio->bi_flags = 0;
bio->bi_ioprio = 0;
bio->bi_write_hint = 0;
- bio->bi_status = 0;
+ bio->bi_status = BLK_STS_OK;
bio->bi_iter.bi_sector = 0;
bio->bi_iter.bi_size = 0;
bio->bi_iter.bi_idx = 0;
--
2.31.1
* [PATCH v2 2/7] block: Rework bio_split() return value
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
2024-10-28 15:27 ` [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init() John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split() John Garry
` (5 subsequent siblings)
7 siblings, 2 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Instead of returning the inconclusive value of NULL for an error from
bio_split(), always return an ERR_PTR().
Also remove the BUG_ON() calls and use WARN_ON_ONCE() instead. Indeed,
since almost all callers don't check the return code from bio_split(),
we'll crash anyway (for those failures).
Fix up the only user which checks bio_split() return code today (directly
or indirectly), blk_crypto_fallback_split_bio_if_needed(). The md/bcache
code does check the return code in cached_dev_cache_miss() ->
bio_next_split() -> bio_split(), but only to see if there was a split, so
there would be no change in behaviour here (when returning an ERR_PTR()).
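The bcache observation can be modelled in userspace: a caller that only
compares the return value against the original pointer (to ask "did a
split happen?") behaves the same whether a failure is NULL or an
ERR_PTR(), since neither compares equal to the original. The names below
are illustrative stand-ins, not the bcache code:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(uintptr_t)error;
}

/* bio_next_split()-style helper: return the original object when no
 * split is needed, otherwise the result of the (fallible) split. The
 * 'fail' flag injects a split failure for demonstration. */
static void *next_split_sketch(void *orig, int sectors, int total, int fail)
{
	static int child; /* stands in for the split-off bio */

	if (sectors >= total)
		return orig; /* no split required */
	if (fail)
		return ERR_PTR(-ENOMEM); /* previously: NULL */
	return &child;
}

/* The identity-only test such callers use: "was there a split?" */
static int was_split(void *orig, void *ret)
{
	return ret != orig;
}
```

Both NULL and an ERR_PTR() differ from the original pointer, so the
identity check's answer is unchanged by this rework.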
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/bio.c | 10 ++++++----
block/blk-crypto-fallback.c | 2 +-
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 4cc33ee68640..42cac7c46e55 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1740,16 +1740,18 @@ struct bio *bio_split(struct bio *bio, int sectors,
{
struct bio *split;
- BUG_ON(sectors <= 0);
- BUG_ON(sectors >= bio_sectors(bio));
+ if (WARN_ON_ONCE(sectors <= 0))
+ return ERR_PTR(-EINVAL);
+ if (WARN_ON_ONCE(sectors >= bio_sectors(bio)))
+ return ERR_PTR(-EINVAL);
/* Zone append commands cannot be split */
if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))
- return NULL;
+ return ERR_PTR(-EINVAL);
split = bio_alloc_clone(bio->bi_bdev, bio, gfp, bs);
if (!split)
- return NULL;
+ return ERR_PTR(-ENOMEM);
split->bi_iter.bi_size = sectors << 9;
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index b1e7415f8439..29a205482617 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -226,7 +226,7 @@ static bool blk_crypto_fallback_split_bio_if_needed(struct bio **bio_ptr)
split_bio = bio_split(bio, num_sectors, GFP_NOIO,
&crypto_bio_split);
- if (!split_bio) {
+ if (IS_ERR(split_bio)) {
bio->bi_status = BLK_STS_RESOURCE;
return false;
}
--
2.31.1
* [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split()
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
2024-10-28 15:27 ` [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init() John Garry
2024-10-28 15:27 ` [PATCH v2 2/7] block: Rework bio_split() return value John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split() John Garry
` (4 subsequent siblings)
7 siblings, 2 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Splitting an atomic write bio is not allowed, so error any attempt to do
so.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/bio.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/bio.c b/block/bio.c
index 42cac7c46e55..ac2b11b164af 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1749,6 +1749,10 @@ struct bio *bio_split(struct bio *bio, int sectors,
if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))
return ERR_PTR(-EINVAL);
+ /* atomic writes cannot be split */
+ if (bio->bi_opf & REQ_ATOMIC)
+ return ERR_PTR(-EINVAL);
+
split = bio_alloc_clone(bio->bi_bdev, bio, gfp, bs);
if (!split)
return ERR_PTR(-ENOMEM);
--
2.31.1
* [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split()
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
` (2 preceding siblings ...)
2024-10-28 15:27 ` [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split() John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:13 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 5/7] md/raid0: Handle bio_split() errors John Garry
` (3 subsequent siblings)
7 siblings, 2 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
bio_split() may error, so check this.
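As a rough userspace model of the error path this patch adds (all
types and names here are illustrative stand-ins, not the kernel API):
an ERR_PTR-style failure from the split is converted into a status on
the parent request, which is then completed, and NULL is returned so
the caller submits nothing.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(uintptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(uintptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_bio {
	int status; /* 0 == OK, else a negative errno in this sketch */
	int ended;  /* set once the request has been completed */
};

static void fake_endio(struct fake_bio *bio)
{
	bio->ended = 1;
}

/* Mirrors the shape of the patched bio_submit_split(): the split
 * result is checked, and errors terminate the parent request. The
 * errno-to-blk_status conversion is reduced to a plain assignment. */
static struct fake_bio *submit_split_sketch(struct fake_bio *bio, void *split)
{
	if (IS_ERR(split)) {
		bio->status = (int)PTR_ERR(split);
		fake_endio(bio);
		return NULL;
	}
	return split;
}
```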
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
block/blk-merge.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 1c73fd37cbee..b640477c190d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -107,11 +107,8 @@ static unsigned int bio_allowed_max_sectors(const struct queue_limits *lim)
static struct bio *bio_submit_split(struct bio *bio, int split_sectors)
{
- if (unlikely(split_sectors < 0)) {
- bio->bi_status = errno_to_blk_status(split_sectors);
- bio_endio(bio);
- return NULL;
- }
+ if (unlikely(split_sectors < 0))
+ goto error;
if (unlikely((bio_op(bio) == REQ_OP_DISCARD)))
@@ -123,6 +120,10 @@ static struct bio *bio_submit_split(struct bio *bio, int split_sectors)
split = bio_split(bio, split_sectors, GFP_NOIO,
&bio->bi_bdev->bd_disk->bio_split);
+ if (IS_ERR(split)) {
+ split_sectors = PTR_ERR(split);
+ goto error;
+ }
split->bi_opf |= REQ_NOMERGE;
blkcg_bio_issue_init(split);
bio_chain(split, bio);
@@ -133,6 +134,10 @@ static struct bio *bio_submit_split(struct bio *bio, int split_sectors)
}
return bio;
+error:
+ bio->bi_status = errno_to_blk_status(split_sectors);
+ bio_endio(bio);
+ return NULL;
}
struct bio *bio_split_discard(struct bio *bio, const struct queue_limits *lim,
--
2.31.1
* [PATCH v2 5/7] md/raid0: Handle bio_split() errors
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
` (3 preceding siblings ...)
2024-10-28 15:27 ` [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split() John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-29 3:52 ` Yu Kuai
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
` (2 subsequent siblings)
7 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Add proper bio_split() error handling. For any error, set bi_status, end
the bio, and return.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/raid0.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 32d587524778..baaf5f8b80ae 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -466,6 +466,12 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
struct bio *split = bio_split(bio,
zone->zone_end - bio->bi_iter.bi_sector, GFP_NOIO,
&mddev->bio_set);
+
+ if (IS_ERR(split)) {
+ bio->bi_status = errno_to_blk_status(PTR_ERR(split));
+ bio_endio(bio);
+ return;
+ }
bio_chain(split, bio);
submit_bio_noacct(bio);
bio = split;
@@ -608,6 +614,12 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
if (sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, sectors, GFP_NOIO,
&mddev->bio_set);
+
+ if (IS_ERR(split)) {
+ bio->bi_status = errno_to_blk_status(PTR_ERR(split));
+ bio_endio(bio);
+ return true;
+ }
bio_chain(split, bio);
raid0_map_submit_bio(mddev, bio);
bio = split;
--
2.31.1
* [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
` (4 preceding siblings ...)
2024-10-28 15:27 ` [PATCH v2 5/7] md/raid0: Handle bio_split() errors John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-29 3:48 ` Yu Kuai
` (2 more replies)
2024-10-28 15:27 ` [PATCH v2 7/7] md/raid10: " John Garry
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
7 siblings, 3 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Add proper bio_split() error handling. For any error, call
raid_end_bio_io() and return.
For an error in the write path, we need to undo the increment of the
rdev pending count and NULL the r1_bio->bios[] pointers.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6c9d24203f39..a10018282629 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
const enum req_op op = bio_op(bio);
const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
int max_sectors;
- int rdisk;
+ int rdisk, error;
bool r1bio_existed = !!r1_bio;
/*
@@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
if (max_sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, max_sectors,
gfp, &conf->bio_split);
+
+ if (IS_ERR(split)) {
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
bio_chain(split, bio);
submit_bio_noacct(bio);
bio = split;
@@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
read_bio->bi_private = r1_bio;
mddev_trace_remap(mddev, read_bio, r1_bio->sector);
submit_bio_noacct(read_bio);
+ return;
+
+err_handle:
+ bio->bi_status = errno_to_blk_status(error);
+ set_bit(R1BIO_Uptodate, &r1_bio->state);
+ raid_end_bio_io(r1_bio);
}
static void raid1_write_request(struct mddev *mddev, struct bio *bio,
@@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
{
struct r1conf *conf = mddev->private;
struct r1bio *r1_bio;
- int i, disks;
+ int i, disks, k, error;
unsigned long flags;
struct md_rdev *blocked_rdev;
int first_clone;
@@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
if (max_sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, max_sectors,
GFP_NOIO, &conf->bio_split);
+
+ if (IS_ERR(split)) {
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
bio_chain(split, bio);
submit_bio_noacct(bio);
bio = split;
@@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
/* In case raid1d snuck in to freeze_array */
wake_up_barrier(conf);
+ return;
+err_handle:
+ for (k = 0; k < i; k++) {
+ if (r1_bio->bios[k]) {
+ rdev_dec_pending(conf->mirrors[k].rdev, mddev);
+ r1_bio->bios[k] = NULL;
+ }
+ }
+
+ bio->bi_status = errno_to_blk_status(error);
+ set_bit(R1BIO_Uptodate, &r1_bio->state);
+ raid_end_bio_io(r1_bio);
}
static bool raid1_make_request(struct mddev *mddev, struct bio *bio)
--
2.31.1
* [PATCH v2 7/7] md/raid10: Handle bio_split() errors
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
` (5 preceding siblings ...)
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
@ 2024-10-28 15:27 ` John Garry
2024-10-29 11:55 ` Yu Kuai
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
7 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-28 15:27 UTC (permalink / raw)
To: axboe, song, yukuai3, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, John Garry
Add proper bio_split() error handling. For any error, call
raid_end_bio_io() and return, except for discard, where we end the bio
directly.
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
drivers/md/raid10.c | 47 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 46 insertions(+), 1 deletion(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index f3bf1116794a..9c56b27b754a 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1159,6 +1159,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
int slot = r10_bio->read_slot;
struct md_rdev *err_rdev = NULL;
gfp_t gfp = GFP_NOIO;
+ int error;
if (slot >= 0 && r10_bio->devs[slot].rdev) {
/*
@@ -1206,6 +1207,10 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
if (max_sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, max_sectors,
gfp, &conf->bio_split);
+ if (IS_ERR(split)) {
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
bio_chain(split, bio);
allow_barrier(conf);
submit_bio_noacct(bio);
@@ -1236,6 +1241,12 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
mddev_trace_remap(mddev, read_bio, r10_bio->sector);
submit_bio_noacct(read_bio);
return;
+err_handle:
+ atomic_dec(&rdev->nr_pending);
+
+ bio->bi_status = errno_to_blk_status(error);
+ set_bit(R10BIO_Uptodate, &r10_bio->state);
+ raid_end_bio_io(r10_bio);
}
static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
@@ -1347,9 +1358,10 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
struct r10bio *r10_bio)
{
struct r10conf *conf = mddev->private;
- int i;
+ int i, k;
sector_t sectors;
int max_sectors;
+ int error;
if ((mddev_is_clustered(mddev) &&
md_cluster_ops->area_resyncing(mddev, WRITE,
@@ -1482,6 +1494,10 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
if (r10_bio->sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, r10_bio->sectors,
GFP_NOIO, &conf->bio_split);
+ if (IS_ERR(split)) {
+ error = PTR_ERR(split);
+ goto err_handle;
+ }
bio_chain(split, bio);
allow_barrier(conf);
submit_bio_noacct(bio);
@@ -1503,6 +1519,25 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
raid10_write_one_disk(mddev, r10_bio, bio, true, i);
}
one_write_done(r10_bio);
+ return;
+err_handle:
+ for (k = 0; k < i; k++) {
+ struct md_rdev *rdev, *rrdev;
+
+ rdev = conf->mirrors[k].rdev;
+ rrdev = conf->mirrors[k].replacement;
+
+ if (rdev)
+ rdev_dec_pending(rdev, mddev);
+ if (rrdev)
+ rdev_dec_pending(rrdev, mddev);
+ r10_bio->devs[k].bio = NULL;
+ r10_bio->devs[k].repl_bio = NULL;
+ }
+
+ bio->bi_status = errno_to_blk_status(error);
+ set_bit(R10BIO_Uptodate, &r10_bio->state);
+ raid_end_bio_io(r10_bio);
}
static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
@@ -1644,6 +1679,11 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
if (remainder) {
split_size = stripe_size - remainder;
split = bio_split(bio, split_size, GFP_NOIO, &conf->bio_split);
+ if (IS_ERR(split)) {
+ bio->bi_status = errno_to_blk_status(PTR_ERR(split));
+ bio_endio(bio);
+ return 0;
+ }
bio_chain(split, bio);
allow_barrier(conf);
/* Resend the fist split part */
@@ -1654,6 +1694,11 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
if (remainder) {
split_size = bio_sectors(bio) - remainder;
split = bio_split(bio, split_size, GFP_NOIO, &conf->bio_split);
+ if (IS_ERR(split)) {
+ bio->bi_status = errno_to_blk_status(PTR_ERR(split));
+ bio_endio(bio);
+ return 0;
+ }
bio_chain(split, bio);
allow_barrier(conf);
/* Resend the second split part */
--
2.31.1
* Re: [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init()
2024-10-28 15:27 ` [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init() John Garry
@ 2024-10-28 16:11 ` Christoph Hellwig
2024-10-28 16:32 ` John Garry
0 siblings, 1 reply; 35+ messages in thread
From: Christoph Hellwig @ 2024-10-28 16:11 UTC (permalink / raw)
To: John Garry
Cc: axboe, song, yukuai3, hch, martin.petersen, linux-block,
linux-kernel, linux-raid, hare, Johannes.Thumshirn
On Mon, Oct 28, 2024 at 03:27:24PM +0000, John Garry wrote:
> Use the proper blk_status_t value to init the bi_status.
I think 0 as OK is a very widespread assumption and intentional.
Personally I think 0 is fine, as it is also special-cased by
__bitwise types, but if Jens prefers it this way I'm fine with that
too.
* Re: [PATCH v2 2/7] block: Rework bio_split() return value
2024-10-28 15:27 ` [PATCH v2 2/7] block: Rework bio_split() return value John Garry
@ 2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2024-10-28 16:12 UTC (permalink / raw)
To: John Garry
Cc: axboe, song, yukuai3, hch, martin.petersen, linux-block,
linux-kernel, linux-raid, hare, Johannes.Thumshirn
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
(hopefully I'll remember the change when I forward port my new code
using it :))
* Re: [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split()
2024-10-28 15:27 ` [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split() John Garry
@ 2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2024-10-28 16:12 UTC (permalink / raw)
To: John Garry
Cc: axboe, song, yukuai3, hch, martin.petersen, linux-block,
linux-kernel, linux-raid, hare, Johannes.Thumshirn
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split()
2024-10-28 15:27 ` [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split() John Garry
@ 2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:13 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Christoph Hellwig @ 2024-10-28 16:12 UTC (permalink / raw)
To: John Garry
Cc: axboe, song, yukuai3, hch, martin.petersen, linux-block,
linux-kernel, linux-raid, hare, Johannes.Thumshirn
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init()
2024-10-28 16:11 ` Christoph Hellwig
@ 2024-10-28 16:32 ` John Garry
0 siblings, 0 replies; 35+ messages in thread
From: John Garry @ 2024-10-28 16:32 UTC (permalink / raw)
To: Christoph Hellwig
Cc: axboe, song, yukuai3, martin.petersen, linux-block, linux-kernel,
linux-raid, hare, Johannes.Thumshirn
On 28/10/2024 16:11, Christoph Hellwig wrote:
> On Mon, Oct 28, 2024 at 03:27:24PM +0000, John Garry wrote:
>> Use the proper blk_status_t value to init the bi_status.
> I think 0 as OK is a very widespread assumption and intentional.
Sure
> Personally I think 0 is fine, as it is also special-cased by
> __bitwise types, but if Jens prefers it this way I'm fine with that
> too.
I just found it easier to grep for BLK_STS_OK (rather than 0). It also
seems more proper to init to the declared macro, but I don't feel
strongly about this.
Cheers
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
@ 2024-10-29 3:48 ` Yu Kuai
2024-10-29 8:45 ` John Garry
2024-10-29 11:32 ` Yu Kuai
2024-10-29 12:12 ` Yu Kuai
2 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 3:48 UTC (permalink / raw)
To: John Garry, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
Hi,
在 2024/10/28 23:27, John Garry 写道:
> Add proper bio_split() error handling. For any error, call
> raid_end_bio_io() and return.
>
> For an error in the write path, we need to undo the increment of the
> rdev pending count and NULL the r1_bio->bios[] pointers.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6c9d24203f39..a10018282629 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> const enum req_op op = bio_op(bio);
> const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
> int max_sectors;
> - int rdisk;
> + int rdisk, error;
> bool r1bio_existed = !!r1_bio;
>
> /*
> @@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> gfp, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> read_bio->bi_private = r1_bio;
> mddev_trace_remap(mddev, read_bio, r1_bio->sector);
> submit_bio_noacct(read_bio);
> + return;
> +
> +err_handle:
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
> }
>
> static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> @@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> {
> struct r1conf *conf = mddev->private;
> struct r1bio *r1_bio;
> - int i, disks;
> + int i, disks, k, error;
> unsigned long flags;
> struct md_rdev *blocked_rdev;
> int first_clone;
> @@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> GFP_NOIO, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>
> /* In case raid1d snuck in to freeze_array */
> wake_up_barrier(conf);
> + return;
> +err_handle:
> + for (k = 0; k < i; k++) {
> + if (r1_bio->bios[k]) {
> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
> + r1_bio->bios[k] = NULL;
> + }
> + }
> +
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
Looks good that the error code is passed to the orig bio. However,
I really think badblocks should be handled somehow; it just doesn't make
sense to return an IO error to filesystems or userspace if one
underlying disk contains a BB while the others are good.
Or is it guaranteed that an IO error from an atomic write won't hurt
anyone, and the user will handle this error and retry with a non-atomic
write?
Thanks,
Kuai
> }
>
> static bool raid1_make_request(struct mddev *mddev, struct bio *bio)
>
* Re: [PATCH v2 5/7] md/raid0: Handle bio_split() errors
2024-10-28 15:27 ` [PATCH v2 5/7] md/raid0: Handle bio_split() errors John Garry
@ 2024-10-29 3:52 ` Yu Kuai
0 siblings, 0 replies; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 3:52 UTC (permalink / raw)
To: John Garry, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
在 2024/10/28 23:27, John Garry 写道:
> Add proper bio_split() error handling. For any error, set bi_status, end
> the bio, and return.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> drivers/md/raid0.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
LGTM
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index 32d587524778..baaf5f8b80ae 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -466,6 +466,12 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
> struct bio *split = bio_split(bio,
> zone->zone_end - bio->bi_iter.bi_sector, GFP_NOIO,
> &mddev->bio_set);
> +
> + if (IS_ERR(split)) {
> + bio->bi_status = errno_to_blk_status(PTR_ERR(split));
> + bio_endio(bio);
> + return;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -608,6 +614,12 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
> if (sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, sectors, GFP_NOIO,
> &mddev->bio_set);
> +
> + if (IS_ERR(split)) {
> + bio->bi_status = errno_to_blk_status(PTR_ERR(split));
> + bio_endio(bio);
> + return true;
> + }
> bio_chain(split, bio);
> raid0_map_submit_bio(mddev, bio);
> bio = split;
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-29 3:48 ` Yu Kuai
@ 2024-10-29 8:45 ` John Garry
2024-10-29 11:30 ` Yu Kuai
0 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-29 8:45 UTC (permalink / raw)
To: Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
On 29/10/2024 03:48, Yu Kuai wrote:
> Hi,
>
> 在 2024/10/28 23:27, John Garry 写道:
>> Add proper bio_split() error handling. For any error, call
>> raid_end_bio_io() and return.
>>
>> For an error in the write path, we need to undo the increment of the
>> rdev pending count and NULL the r1_bio->bios[] pointers.
>>
>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>> ---
>> drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index 6c9d24203f39..a10018282629 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev
>> *mddev, struct bio *bio,
>> const enum req_op op = bio_op(bio);
>> const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
>> int max_sectors;
>> - int rdisk;
>> + int rdisk, error;
>> bool r1bio_existed = !!r1_bio;
>> /*
>> @@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev
>> *mddev, struct bio *bio,
>> if (max_sectors < bio_sectors(bio)) {
>> struct bio *split = bio_split(bio, max_sectors,
>> gfp, &conf->bio_split);
>> +
>> + if (IS_ERR(split)) {
>> + error = PTR_ERR(split);
>> + goto err_handle;
>> + }
>> bio_chain(split, bio);
>> submit_bio_noacct(bio);
>> bio = split;
>> @@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev
>> *mddev, struct bio *bio,
>> read_bio->bi_private = r1_bio;
>> mddev_trace_remap(mddev, read_bio, r1_bio->sector);
>> submit_bio_noacct(read_bio);
>> + return;
>> +
>> +err_handle:
>> + bio->bi_status = errno_to_blk_status(error);
>> + set_bit(R1BIO_Uptodate, &r1_bio->state);
>> + raid_end_bio_io(r1_bio);
>> }
>> static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>> @@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev
>> *mddev, struct bio *bio,
>> {
>> struct r1conf *conf = mddev->private;
>> struct r1bio *r1_bio;
>> - int i, disks;
>> + int i, disks, k, error;
>> unsigned long flags;
>> struct md_rdev *blocked_rdev;
>> int first_clone;
>> @@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev
>> *mddev, struct bio *bio,
>> if (max_sectors < bio_sectors(bio)) {
>> struct bio *split = bio_split(bio, max_sectors,
>> GFP_NOIO, &conf->bio_split);
>> +
>> + if (IS_ERR(split)) {
>> + error = PTR_ERR(split);
>> + goto err_handle;
>> + }
>> bio_chain(split, bio);
>> submit_bio_noacct(bio);
>> bio = split;
>> @@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev
>> *mddev, struct bio *bio,
>> /* In case raid1d snuck in to freeze_array */
>> wake_up_barrier(conf);
>> + return;
>> +err_handle:
>> + for (k = 0; k < i; k++) {
>> + if (r1_bio->bios[k]) {
>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>> + r1_bio->bios[k] = NULL;
>> + }
>> + }
>> +
>> + bio->bi_status = errno_to_blk_status(error);
>> + set_bit(R1BIO_Uptodate, &r1_bio->state);
>> + raid_end_bio_io(r1_bio);
Hi Kuai,
>
> Looks good that error code is passed to orig bio. However,
> I really think badblocks should be handled somehow, it just doesn't make
> sense to return an IO error to filesystems or users if one underlying disk
> contains BBs, while the others are good.
Please be aware that this change is not for handling splits in atomic
writes. It is for the situation when a split fails for whatever reason -
likely a software bug.
For when atomic writes are supported for raid1, my plan is that an
atomic write over a region which covers a BB will error, i.e. goto
err_handle, like:
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1514,6 +1514,12 @@ static void raid1_write_request(struct mddev
*mddev, struct bio *bio,
break;
}
+ if (is_bad && bio->bi_opf & REQ_ATOMIC) {
+ /* We just cannot atomically write this ... */
+ err = -EIO;
+ goto err_handle;
+ }
+
if (is_bad && first_bad <= r1_bio->sector) {
I just think that if we try to write a region atomically which contains
BBs then we should error. Indeed, as I mentioned previously, I really
don't expect BBs on devices which support atomic writes. But we should
still handle it.
OTOH, if we did want to handle atomic writes to regions with BBs, we
could make a bigger effort and write the disks which don't have BBs
atomically (so that we don't split for those good disks). But this is
too complicated and does not achieve much.
>
> Or is it guaranteed that an IO error from an atomic write won't hurt anyone,
> and the user will handle this error and retry with a non-atomic write?
Yes, I think that the user could retry non-atomically for the same
write. Maybe returning a special error code could be useful for this.
Thanks,
John
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH] btrfs: handle bio_split() error
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
` (6 preceding siblings ...)
2024-10-28 15:27 ` [PATCH v2 7/7] md/raid10: " John Garry
@ 2024-10-29 9:11 ` Johannes Thumshirn
2024-10-29 10:33 ` John Garry
` (2 more replies)
7 siblings, 3 replies; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-29 9:11 UTC (permalink / raw)
To: John Garry
Cc: linux-block, linux-btrfs, linux-kernel, linux-raid, axboe, song,
yukuai3, hch, martin.petersen, hare, Johannes Thumshirn
From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Now that bio_split() can return errors, add error handling for it in
btrfs_split_bio() and ultimately btrfs_submit_chunk().
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
John,
in case you have to send a v3, can you please include this one in your series?
fs/btrfs/bio.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 1f216d07eff6..96cacd5c03a5 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -81,6 +81,9 @@ static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info,
bio = bio_split(&orig_bbio->bio, map_length >> SECTOR_SHIFT, GFP_NOFS,
&btrfs_clone_bioset);
+ if (IS_ERR(bio))
+ return ERR_CAST(bio);
+
bbio = btrfs_bio(bio);
btrfs_bio_init(bbio, fs_info, NULL, orig_bbio);
bbio->inode = orig_bbio->inode;
@@ -687,6 +690,10 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
if (map_length < length) {
bbio = btrfs_split_bio(fs_info, bbio, map_length);
+ if (IS_ERR(bbio)) {
+ ret = PTR_ERR(bbio);
+ goto fail;
+ }
bio = &bbio->bio;
}
--
2.43.0
* Re: [PATCH v2 2/7] block: Rework bio_split() return value
2024-10-28 15:27 ` [PATCH v2 2/7] block: Rework bio_split() return value John Garry
2024-10-28 16:12 ` Christoph Hellwig
@ 2024-10-29 9:12 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-29 9:12 UTC (permalink / raw)
To: John Garry, axboe@kernel.dk, song@kernel.org, yukuai3@huawei.com,
hch
Cc: martin.petersen@oracle.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
hare@suse.de
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
* Re: [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split()
2024-10-28 15:27 ` [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split() John Garry
2024-10-28 16:12 ` Christoph Hellwig
@ 2024-10-29 9:12 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-29 9:12 UTC (permalink / raw)
To: John Garry, axboe@kernel.dk, song@kernel.org, yukuai3@huawei.com,
hch
Cc: martin.petersen@oracle.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
hare@suse.de
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
* Re: [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split()
2024-10-28 15:27 ` [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split() John Garry
2024-10-28 16:12 ` Christoph Hellwig
@ 2024-10-29 9:13 ` Johannes Thumshirn
1 sibling, 0 replies; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-29 9:13 UTC (permalink / raw)
To: John Garry, axboe@kernel.dk, song@kernel.org, yukuai3@huawei.com,
hch
Cc: martin.petersen@oracle.com, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
hare@suse.de
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
@ 2024-10-29 10:33 ` John Garry
2024-10-30 14:00 ` kernel test robot
2024-10-30 20:05 ` Dan Carpenter
2 siblings, 0 replies; 35+ messages in thread
From: John Garry @ 2024-10-29 10:33 UTC (permalink / raw)
To: Johannes Thumshirn
Cc: linux-block, linux-btrfs, linux-kernel, linux-raid, axboe, song,
yukuai3, hch, martin.petersen, hare, Johannes Thumshirn
On 29/10/2024 09:11, Johannes Thumshirn wrote:
> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> Now that bio_split() can return errors, add error handling for it in
> btrfs_split_bio() and ultimately btrfs_submit_chunk().
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> ---
>
> John,
> in case you have to send a v3, can you please include this one in your series?
sure, and I think that I will be sending a v3
Cheers
>
> fs/btrfs/bio.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
> index 1f216d07eff6..96cacd5c03a5 100644
> --- a/fs/btrfs/bio.c
> +++ b/fs/btrfs/bio.c
> @@ -81,6 +81,9 @@ static struct btrfs_bio *btrfs_split_bio(struct btrfs_fs_info *fs_info,
>
> bio = bio_split(&orig_bbio->bio, map_length >> SECTOR_SHIFT, GFP_NOFS,
> &btrfs_clone_bioset);
> + if (IS_ERR(bio))
> + return ERR_CAST(bio);
> +
> bbio = btrfs_bio(bio);
> btrfs_bio_init(bbio, fs_info, NULL, orig_bbio);
> bbio->inode = orig_bbio->inode;
> @@ -687,6 +690,10 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
>
> if (map_length < length) {
> bbio = btrfs_split_bio(fs_info, bbio, map_length);
> + if (IS_ERR(bbio)) {
> + ret = PTR_ERR(bbio);
> + goto fail;
> + }
> bio = &bbio->bio;
> }
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-29 8:45 ` John Garry
@ 2024-10-29 11:30 ` Yu Kuai
2024-10-29 11:36 ` John Garry
0 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 11:30 UTC (permalink / raw)
To: John Garry, Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
Hi,
在 2024/10/29 16:45, John Garry 写道:
> On 29/10/2024 03:48, Yu Kuai wrote:
>> Hi,
>>
>> 在 2024/10/28 23:27, John Garry 写道:
>>> Add proper bio_split() error handling. For any error, call
>>> raid_end_bio_io() and return.
>>>
>>> For the case of an error in the write path, we need to undo the increment in
>>> the rdev pending count and NULLify the r1_bio->bios[] pointers.
>>>
>>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>>> ---
>>> drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
>>> 1 file changed, 30 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>>> index 6c9d24203f39..a10018282629 100644
>>> --- a/drivers/md/raid1.c
>>> +++ b/drivers/md/raid1.c
>>> @@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> const enum req_op op = bio_op(bio);
>>> const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
>>> int max_sectors;
>>> - int rdisk;
>>> + int rdisk, error;
>>> bool r1bio_existed = !!r1_bio;
>>> /*
>>> @@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> if (max_sectors < bio_sectors(bio)) {
>>> struct bio *split = bio_split(bio, max_sectors,
>>> gfp, &conf->bio_split);
>>> +
>>> + if (IS_ERR(split)) {
>>> + error = PTR_ERR(split);
>>> + goto err_handle;
>>> + }
>>> bio_chain(split, bio);
>>> submit_bio_noacct(bio);
>>> bio = split;
>>> @@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> read_bio->bi_private = r1_bio;
>>> mddev_trace_remap(mddev, read_bio, r1_bio->sector);
>>> submit_bio_noacct(read_bio);
>>> + return;
>>> +
>>> +err_handle:
>>> + bio->bi_status = errno_to_blk_status(error);
>>> + set_bit(R1BIO_Uptodate, &r1_bio->state);
>>> + raid_end_bio_io(r1_bio);
>>> }
>>> static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>>> @@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> {
>>> struct r1conf *conf = mddev->private;
>>> struct r1bio *r1_bio;
>>> - int i, disks;
>>> + int i, disks, k, error;
>>> unsigned long flags;
>>> struct md_rdev *blocked_rdev;
>>> int first_clone;
>>> @@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> if (max_sectors < bio_sectors(bio)) {
>>> struct bio *split = bio_split(bio, max_sectors,
>>> GFP_NOIO, &conf->bio_split);
>>> +
>>> + if (IS_ERR(split)) {
>>> + error = PTR_ERR(split);
>>> + goto err_handle;
>>> + }
>>> bio_chain(split, bio);
>>> submit_bio_noacct(bio);
>>> bio = split;
>>> @@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> /* In case raid1d snuck in to freeze_array */
>>> wake_up_barrier(conf);
>>> + return;
>>> +err_handle:
>>> + for (k = 0; k < i; k++) {
>>> + if (r1_bio->bios[k]) {
>>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>>> + r1_bio->bios[k] = NULL;
>>> + }
>>> + }
>>> +
>>> + bio->bi_status = errno_to_blk_status(error);
>>> + set_bit(R1BIO_Uptodate, &r1_bio->state);
>>> + raid_end_bio_io(r1_bio);
>
> Hi Kuai,
>
>>
>> Looks good that error code is passed to orig bio. However,
>> I really think badblocks should be handled somehow, it just doesn't make
>> sense to return an IO error to filesystems or users if one underlying disk
>> contains BBs, while the others are good.
>
> Please be aware that this change is not for handling splits in atomic
> writes. It is for the situation when a split fails for whatever reason -
> likely a software bug.
>
> For when atomic writes are supported for raid1, my plan is that an
> atomic write over a region which covers a BB will error, i.e. goto
> err_handle, like:
>
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1514,6 +1514,12 @@ static void raid1_write_request(struct mddev
> *mddev, struct bio *bio,
> break;
> }
>
> + if (is_bad && bio->bi_opf & REQ_ATOMIC) {
> + /* We just cannot atomically write this ... */
> + err = -EIO;
> + goto err_handle;
> + }
> +
> if (is_bad && first_bad <= r1_bio->sector) {
>
>
> I just think that if we try to write a region atomically which contains
> BBs then we should error. Indeed, as I mentioned previously, I really
> don't expect BBs on devices which support atomic writes. But we should
> still handle it.
>
Agreed.
> OTOH, if we did want to handle atomic writes to regions with BBs, we
> could make a bigger effort and write the disks which don't have BBs
> atomically (so that we don't split for those good disks). But this is
> too complicated and does not achieve much.
Agreed.
>
>>
>> Or is it guaranteed that IO error by atomic write won't hurt anyone,
>> user will handle this error and retry with non atomic write?
>
> Yes, I think that the user could retry non-atomically for the same
> write. Maybe returning a special error code could be useful for this.
And can you update the above error path comment when you support raid1
and raid10?
Thanks,
Kuai
>
> Thanks,
> John
>
> .
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
2024-10-29 3:48 ` Yu Kuai
@ 2024-10-29 11:32 ` Yu Kuai
2024-10-29 12:12 ` Yu Kuai
2 siblings, 0 replies; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 11:32 UTC (permalink / raw)
To: John Garry, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
在 2024/10/28 23:27, John Garry 写道:
> Add proper bio_split() error handling. For any error, call
> raid_end_bio_io() and return.
>
> For the case of an error in the write path, we need to undo the increment in
> the rdev pending count and NULLify the r1_bio->bios[] pointers.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
LGTM
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> ---
> drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6c9d24203f39..a10018282629 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> const enum req_op op = bio_op(bio);
> const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
> int max_sectors;
> - int rdisk;
> + int rdisk, error;
> bool r1bio_existed = !!r1_bio;
>
> /*
> @@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> gfp, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> read_bio->bi_private = r1_bio;
> mddev_trace_remap(mddev, read_bio, r1_bio->sector);
> submit_bio_noacct(read_bio);
> + return;
> +
> +err_handle:
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
> }
>
> static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> @@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> {
> struct r1conf *conf = mddev->private;
> struct r1bio *r1_bio;
> - int i, disks;
> + int i, disks, k, error;
> unsigned long flags;
> struct md_rdev *blocked_rdev;
> int first_clone;
> @@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> GFP_NOIO, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>
> /* In case raid1d snuck in to freeze_array */
> wake_up_barrier(conf);
> + return;
> +err_handle:
> + for (k = 0; k < i; k++) {
> + if (r1_bio->bios[k]) {
> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
> + r1_bio->bios[k] = NULL;
> + }
> + }
> +
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
> }
>
> static bool raid1_make_request(struct mddev *mddev, struct bio *bio)
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-29 11:30 ` Yu Kuai
@ 2024-10-29 11:36 ` John Garry
0 siblings, 0 replies; 35+ messages in thread
From: John Garry @ 2024-10-29 11:36 UTC (permalink / raw)
To: Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
On 29/10/2024 11:30, Yu Kuai wrote:
>>>
>>> Or is it guaranteed that IO error by atomic write won't hurt anyone,
>>> user will handle this error and retry with non atomic write?
>>
>> Yes, I think that the user could retry non-atomically for the same
>> write. Maybe returning a special error code could be useful for this.
>
> And can you update the above error path comment when you support raid1
> and raid10?
Sure, can do. I am not sure about a special error code value. I will think
about it.
And I will send update for
https://lore.kernel.org/linux-raid/20240903150748.2179966-1-john.g.garry@oracle.com/T/#m5daa8d32d825d74422bbff272c9b25b6c4fc2788
soon with this suggestion.
Thanks,
John
* Re: [PATCH v2 7/7] md/raid10: Handle bio_split() errors
2024-10-28 15:27 ` [PATCH v2 7/7] md/raid10: " John Garry
@ 2024-10-29 11:55 ` Yu Kuai
2024-10-29 12:05 ` John Garry
0 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 11:55 UTC (permalink / raw)
To: John Garry, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
Hi,
在 2024/10/28 23:27, John Garry 写道:
> Add proper bio_split() error handling. For any error, call
> raid_end_bio_io() and return. Except for discard, where we end the bio
> directly.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> drivers/md/raid10.c | 47 ++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index f3bf1116794a..9c56b27b754a 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1159,6 +1159,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> int slot = r10_bio->read_slot;
> struct md_rdev *err_rdev = NULL;
> gfp_t gfp = GFP_NOIO;
> + int error;
>
> if (slot >= 0 && r10_bio->devs[slot].rdev) {
> /*
> @@ -1206,6 +1207,10 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> gfp, &conf->bio_split);
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> allow_barrier(conf);
> submit_bio_noacct(bio);
> @@ -1236,6 +1241,12 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> mddev_trace_remap(mddev, read_bio, r10_bio->sector);
> submit_bio_noacct(read_bio);
> return;
> +err_handle:
> + atomic_dec(&rdev->nr_pending);
I just realized that for the raid1 patch, this is missed. read_balance()
from raid1 will increase nr_pending as well. :(
> +
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R10BIO_Uptodate, &r10_bio->state);
> + raid_end_bio_io(r10_bio);
> }
>
> static void raid10_write_one_disk(struct mddev *mddev, struct r10bio *r10_bio,
> @@ -1347,9 +1358,10 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
> struct r10bio *r10_bio)
> {
> struct r10conf *conf = mddev->private;
> - int i;
> + int i, k;
> sector_t sectors;
> int max_sectors;
> + int error;
>
> if ((mddev_is_clustered(mddev) &&
> md_cluster_ops->area_resyncing(mddev, WRITE,
> @@ -1482,6 +1494,10 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
> if (r10_bio->sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, r10_bio->sectors,
> GFP_NOIO, &conf->bio_split);
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> allow_barrier(conf);
> submit_bio_noacct(bio);
> @@ -1503,6 +1519,25 @@ static void raid10_write_request(struct mddev *mddev, struct bio *bio,
> raid10_write_one_disk(mddev, r10_bio, bio, true, i);
> }
> one_write_done(r10_bio);
> + return;
> +err_handle:
> + for (k = 0; k < i; k++) {
> + struct md_rdev *rdev, *rrdev;
> +
> + rdev = conf->mirrors[k].rdev;
> + rrdev = conf->mirrors[k].replacement;
This looks wrong, r10_bio->devs[k].devnum should be used to dereference
rdev from mirrors.
> +
> + if (rdev)
> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
> + if (rrdev)
> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
This is not correct for now: for the case where the rdev is all BB in the
write range, continue will be reached in the loop and rrdev is skipped
(it doesn't look correct to skip rrdev). However, I'd suggest using:
int d = r10_bio->devs[k].devnum;
if (r10_bio->devs[k].bio == NULL)
rdev_dec_pending(conf->mirrors[d].rdev, mddev);
if (r10_bio->devs[k].repl_bio == NULL)
rdev_dec_pending(conf->mirrors[d].replacement, mddev);
Thanks,
Kuai
> + r10_bio->devs[k].bio = NULL;
> + r10_bio->devs[k].repl_bio = NULL;
> + }
> +
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R10BIO_Uptodate, &r10_bio->state);
> + raid_end_bio_io(r10_bio);
> }
>
> static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
> @@ -1644,6 +1679,11 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
> if (remainder) {
> split_size = stripe_size - remainder;
> split = bio_split(bio, split_size, GFP_NOIO, &conf->bio_split);
> + if (IS_ERR(split)) {
> + bio->bi_status = errno_to_blk_status(PTR_ERR(split));
> + bio_endio(bio);
> + return 0;
> + }
> bio_chain(split, bio);
> allow_barrier(conf);
> /* Resend the first split part */
> @@ -1654,6 +1694,11 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
> if (remainder) {
> split_size = bio_sectors(bio) - remainder;
> split = bio_split(bio, split_size, GFP_NOIO, &conf->bio_split);
> + if (IS_ERR(split)) {
> + bio->bi_status = errno_to_blk_status(PTR_ERR(split));
> + bio_endio(bio);
> + return 0;
> + }
> bio_chain(split, bio);
> allow_barrier(conf);
> /* Resend the second split part */
>
* Re: [PATCH v2 7/7] md/raid10: Handle bio_split() errors
2024-10-29 11:55 ` Yu Kuai
@ 2024-10-29 12:05 ` John Garry
2024-10-29 12:10 ` Yu Kuai
0 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-29 12:05 UTC (permalink / raw)
To: Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
On 29/10/2024 11:55, Yu Kuai wrote:
> Hi,
>
> 在 2024/10/28 23:27, John Garry 写道:
>> Add proper bio_split() error handling. For any error, call
>> raid_end_bio_io() and return. Except for discard, where we end the bio
>> directly.
>>
>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>> ---
>> drivers/md/raid10.c | 47 ++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 46 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index f3bf1116794a..9c56b27b754a 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1159,6 +1159,7 @@ static void raid10_read_request(struct mddev
>> *mddev, struct bio *bio,
>> int slot = r10_bio->read_slot;
>> struct md_rdev *err_rdev = NULL;
>> gfp_t gfp = GFP_NOIO;
>> + int error;
>> if (slot >= 0 && r10_bio->devs[slot].rdev) {
>> /*
>> @@ -1206,6 +1207,10 @@ static void raid10_read_request(struct mddev
>> *mddev, struct bio *bio,
>> if (max_sectors < bio_sectors(bio)) {
>> struct bio *split = bio_split(bio, max_sectors,
>> gfp, &conf->bio_split);
>> + if (IS_ERR(split)) {
>> + error = PTR_ERR(split);
>> + goto err_handle;
>> + }
>> bio_chain(split, bio);
>> allow_barrier(conf);
>> submit_bio_noacct(bio);
>> @@ -1236,6 +1241,12 @@ static void raid10_read_request(struct mddev
>> *mddev, struct bio *bio,
>> mddev_trace_remap(mddev, read_bio, r10_bio->sector);
>> submit_bio_noacct(read_bio);
>> return;
>> +err_handle:
>> + atomic_dec(&rdev->nr_pending);
>
> I just realized that for the raid1 patch, this is missed. read_balance()
> from raid1 will increase nr_pending as well. :(
hmmm... I have the rdev_dec_pending() call for raid1 at the error label,
which does the appropriate nr_pending dec, right? Or not?
>
>> +
>> + bio->bi_status = errno_to_blk_status(error);
>> + set_bit(R10BIO_Uptodate, &r10_bio->state);
>> + raid_end_bio_io(r10_bio);
>> }
>> static void raid10_write_one_disk(struct mddev *mddev, struct r10bio
>> *r10_bio,
>> @@ -1347,9 +1358,10 @@ static void raid10_write_request(struct mddev
>> *mddev, struct bio *bio,
>> struct r10bio *r10_bio)
>> {
>> struct r10conf *conf = mddev->private;
>> - int i;
>> + int i, k;
>> sector_t sectors;
>> int max_sectors;
>> + int error;
>> if ((mddev_is_clustered(mddev) &&
>> md_cluster_ops->area_resyncing(mddev, WRITE,
>> @@ -1482,6 +1494,10 @@ static void raid10_write_request(struct mddev
>> *mddev, struct bio *bio,
>> if (r10_bio->sectors < bio_sectors(bio)) {
>> struct bio *split = bio_split(bio, r10_bio->sectors,
>> GFP_NOIO, &conf->bio_split);
>> + if (IS_ERR(split)) {
>> + error = PTR_ERR(split);
>> + goto err_handle;
>> + }
>> bio_chain(split, bio);
>> allow_barrier(conf);
>> submit_bio_noacct(bio);
>> @@ -1503,6 +1519,25 @@ static void raid10_write_request(struct mddev
>> *mddev, struct bio *bio,
>> raid10_write_one_disk(mddev, r10_bio, bio, true, i);
>> }
>> one_write_done(r10_bio);
>> + return;
>> +err_handle:
>> + for (k = 0; k < i; k++) {
>> + struct md_rdev *rdev, *rrdev;
>> +
>> + rdev = conf->mirrors[k].rdev;
>> + rrdev = conf->mirrors[k].replacement;
>
> This looks wrong, r10_bio->devs[k].devnum should be used to deference
> rdev from mirrors.
ok
>> +
>> + if (rdev)
>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>> + if (rrdev)
>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>
> This is not correct for now, for the case that rdev is all BB in the
> write range, continue will be reached in the loop and rrdev is skipped(
> This doesn't look correct to skip rrdev). However, I'll suggest to use:
>
> int d = r10_bio->devs[k].devnum;
> if (r10_bio->devs[k].bio == NULL)
eh, should this be:
if (r10_bio->devs[k].bio != NULL)
> rdev_dec_pending(conf->mirrors[d].rdev);
> if (r10_bio->devs[k].repl_bio == NULL)
> rdev_dec_pending(conf->mirrors[d].replacement);
>
>
>> + r10_bio->devs[k].bio = NULL;
>> + r10_bio->devs[k].repl_bio = NULL;
>> + }
>> +
>> + bio->bi_status = errno_to_blk_status(error);
>> + set_bit(R10BIO_Uptodate, &r10_bio->state);
>> + raid_end_bio_io(r10_bio);
Thanks,
John
* Re: [PATCH v2 7/7] md/raid10: Handle bio_split() errors
2024-10-29 12:05 ` John Garry
@ 2024-10-29 12:10 ` Yu Kuai
0 siblings, 0 replies; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 12:10 UTC (permalink / raw)
To: John Garry, Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
Hi,
在 2024/10/29 20:05, John Garry 写道:
> On 29/10/2024 11:55, Yu Kuai wrote:
>> Hi,
>>
>> 在 2024/10/28 23:27, John Garry 写道:
>>> Add proper bio_split() error handling. For any error, call
>>> raid_end_bio_io() and return. Except for discard, where we end the bio
>>> directly.
>>>
>>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>>> ---
>>> drivers/md/raid10.c | 47 ++++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 46 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index f3bf1116794a..9c56b27b754a 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -1159,6 +1159,7 @@ static void raid10_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> int slot = r10_bio->read_slot;
>>> struct md_rdev *err_rdev = NULL;
>>> gfp_t gfp = GFP_NOIO;
>>> + int error;
>>> if (slot >= 0 && r10_bio->devs[slot].rdev) {
>>> /*
>>> @@ -1206,6 +1207,10 @@ static void raid10_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> if (max_sectors < bio_sectors(bio)) {
>>> struct bio *split = bio_split(bio, max_sectors,
>>> gfp, &conf->bio_split);
>>> + if (IS_ERR(split)) {
>>> + error = PTR_ERR(split);
>>> + goto err_handle;
>>> + }
>>> bio_chain(split, bio);
>>> allow_barrier(conf);
>>> submit_bio_noacct(bio);
>>> @@ -1236,6 +1241,12 @@ static void raid10_read_request(struct mddev
>>> *mddev, struct bio *bio,
>>> mddev_trace_remap(mddev, read_bio, r10_bio->sector);
>>> submit_bio_noacct(read_bio);
>>> return;
>>> +err_handle:
>>> + atomic_dec(&rdev->nr_pending);
>>
>> I just realized that for the raid1 patch, this is missed. read_balance()
>> from raid1 will increase nr_pending as well. :(
>
> hmmm... I have the rdev_dec_pending() call for raid1 at the error label,
> which does the appropriate nr_pending dec, right? Or not?
Looks not, I'll reply here. :)
>
>>
>>> +
>>> + bio->bi_status = errno_to_blk_status(error);
>>> + set_bit(R10BIO_Uptodate, &r10_bio->state);
>>> + raid_end_bio_io(r10_bio);
>>> }
>>> static void raid10_write_one_disk(struct mddev *mddev, struct
>>> r10bio *r10_bio,
>>> @@ -1347,9 +1358,10 @@ static void raid10_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> struct r10bio *r10_bio)
>>> {
>>> struct r10conf *conf = mddev->private;
>>> - int i;
>>> + int i, k;
>>> sector_t sectors;
>>> int max_sectors;
>>> + int error;
>>> if ((mddev_is_clustered(mddev) &&
>>> md_cluster_ops->area_resyncing(mddev, WRITE,
>>> @@ -1482,6 +1494,10 @@ static void raid10_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> if (r10_bio->sectors < bio_sectors(bio)) {
>>> struct bio *split = bio_split(bio, r10_bio->sectors,
>>> GFP_NOIO, &conf->bio_split);
>>> + if (IS_ERR(split)) {
>>> + error = PTR_ERR(split);
>>> + goto err_handle;
>>> + }
>>> bio_chain(split, bio);
>>> allow_barrier(conf);
>>> submit_bio_noacct(bio);
>>> @@ -1503,6 +1519,25 @@ static void raid10_write_request(struct mddev
>>> *mddev, struct bio *bio,
>>> raid10_write_one_disk(mddev, r10_bio, bio, true, i);
>>> }
>>> one_write_done(r10_bio);
>>> + return;
>>> +err_handle:
>>> + for (k = 0; k < i; k++) {
>>> + struct md_rdev *rdev, *rrdev;
>>> +
>>> + rdev = conf->mirrors[k].rdev;
>>> + rrdev = conf->mirrors[k].replacement;
>>
>> This looks wrong, r10_bio->devs[k].devnum should be used to dereference
>> rdev from mirrors.
>
> ok
>
>>> +
>>> + if (rdev)
>>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>>> + if (rrdev)
>>> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
>>
>> This is not correct for now, for the case that rdev is all BB in the
>> write range, continue will be reached in the loop and rrdev is skipped(
>> This doesn't look correct to skip rrdev). However, I'll suggest to use:
>>
>> int d = r10_bio->devs[k].devnum;
>> if (r10_bio->devs[k].bio == NULL)
>
> eh, should this be:
> if (r10_bio->devs[k].bio != NULL)
Of course, sorry about the typo.
Thanks,
Kuai
>
>> rdev_dec_pending(conf->mirrors[d].rdev);
>> if (r10_bio->devs[k].repl_bio == NULL)
>> rdev_dec_pending(conf->mirrors[d].replacement);
>>
>
>
>
>>
>>> + r10_bio->devs[k].bio = NULL;
>>> + r10_bio->devs[k].repl_bio = NULL;
>>> + }
>>> +
>>> + bio->bi_status = errno_to_blk_status(error);
>>> + set_bit(R10BIO_Uptodate, &r10_bio->state);
>>> + raid_end_bio_io(r10_bio);
>
> Thanks,
> John
>
> .
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
2024-10-29 3:48 ` Yu Kuai
2024-10-29 11:32 ` Yu Kuai
@ 2024-10-29 12:12 ` Yu Kuai
2024-10-29 12:21 ` John Garry
2 siblings, 1 reply; 35+ messages in thread
From: Yu Kuai @ 2024-10-29 12:12 UTC (permalink / raw)
To: John Garry, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
Hi,
在 2024/10/28 23:27, John Garry 写道:
> Add proper bio_split() error handling. For any error, call
> raid_end_bio_io() and return.
>
> For the case of an error in the write path, we need to undo the increment in
> the rdev pending count and NULLify the r1_bio->bios[] pointers.
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
> drivers/md/raid1.c | 32 ++++++++++++++++++++++++++++++--
> 1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 6c9d24203f39..a10018282629 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1322,7 +1322,7 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> const enum req_op op = bio_op(bio);
> const blk_opf_t do_sync = bio->bi_opf & REQ_SYNC;
> int max_sectors;
> - int rdisk;
> + int rdisk, error;
> bool r1bio_existed = !!r1_bio;
>
> /*
> @@ -1383,6 +1383,11 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> gfp, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1410,6 +1415,12 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
> read_bio->bi_private = r1_bio;
> mddev_trace_remap(mddev, read_bio, r1_bio->sector);
> submit_bio_noacct(read_bio);
> + return;
> +
> +err_handle:
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
rdev_dec_pending() is missed here. :)
Thanks,
Kuai
> }
>
> static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> @@ -1417,7 +1428,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> {
> struct r1conf *conf = mddev->private;
> struct r1bio *r1_bio;
> - int i, disks;
> + int i, disks, k, error;
> unsigned long flags;
> struct md_rdev *blocked_rdev;
> int first_clone;
> @@ -1576,6 +1587,11 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (max_sectors < bio_sectors(bio)) {
> struct bio *split = bio_split(bio, max_sectors,
> GFP_NOIO, &conf->bio_split);
> +
> + if (IS_ERR(split)) {
> + error = PTR_ERR(split);
> + goto err_handle;
> + }
> bio_chain(split, bio);
> submit_bio_noacct(bio);
> bio = split;
> @@ -1660,6 +1676,18 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>
> /* In case raid1d snuck in to freeze_array */
> wake_up_barrier(conf);
> + return;
> +err_handle:
> + for (k = 0; k < i; k++) {
> + if (r1_bio->bios[k]) {
> + rdev_dec_pending(conf->mirrors[k].rdev, mddev);
> + r1_bio->bios[k] = NULL;
> + }
> + }
> +
> + bio->bi_status = errno_to_blk_status(error);
> + set_bit(R1BIO_Uptodate, &r1_bio->state);
> + raid_end_bio_io(r1_bio);
> }
>
> static bool raid1_make_request(struct mddev *mddev, struct bio *bio)
>
* Re: [PATCH v2 6/7] md/raid1: Handle bio_split() errors
2024-10-29 12:12 ` Yu Kuai
@ 2024-10-29 12:21 ` John Garry
0 siblings, 0 replies; 35+ messages in thread
From: John Garry @ 2024-10-29 12:21 UTC (permalink / raw)
To: Yu Kuai, axboe, song, hch
Cc: martin.petersen, linux-block, linux-kernel, linux-raid, hare,
Johannes.Thumshirn, yukuai (C)
On 29/10/2024 12:12, Yu Kuai wrote:
>> +err_handle:
>> + bio->bi_status = errno_to_blk_status(error);
>> + set_bit(R1BIO_Uptodate, &r1_bio->state);
>> + raid_end_bio_io(r1_bio);
>
> rdev_dec_pending() is missed here. 🙂
ok, I will fix. I am not sure how I missed this...
And I will drop your RB tag.
Cheers!
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
2024-10-29 10:33 ` John Garry
@ 2024-10-30 14:00 ` kernel test robot
2024-10-30 14:06 ` Johannes Thumshirn
2024-10-30 20:05 ` Dan Carpenter
2 siblings, 1 reply; 35+ messages in thread
From: kernel test robot @ 2024-10-30 14:00 UTC (permalink / raw)
To: Johannes Thumshirn, John Garry
Cc: oe-kbuild-all, linux-block, linux-btrfs, linux-kernel, linux-raid,
axboe, song, yukuai3, hch, martin.petersen, hare,
Johannes Thumshirn
Hi Johannes,
kernel test robot noticed the following build warnings:
[auto build test WARNING on kdave/for-next]
[also build test WARNING on linus/master v6.12-rc5 next-20241030]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Johannes-Thumshirn/btrfs-handle-bio_split-error/20241029-171227
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/20241029091121.16281-1-jth%40kernel.org
patch subject: [PATCH] btrfs: handle bio_split() error
config: x86_64-randconfig-121-20241030 (https://download.01.org/0day-ci/archive/20241030/202410302118.6igPxwWv-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241030/202410302118.6igPxwWv-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410302118.6igPxwWv-lkp@intel.com/
sparse warnings: (new ones prefixed by >>)
>> fs/btrfs/bio.c:694:29: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted blk_status_t [assigned] [usertype] ret @@ got long @@
fs/btrfs/bio.c:694:29: sparse: expected restricted blk_status_t [assigned] [usertype] ret
fs/btrfs/bio.c:694:29: sparse: got long
fs/btrfs/bio.c: note: in included file (through include/linux/mmzone.h, include/linux/gfp.h, include/linux/xarray.h, ...):
include/linux/page-flags.h:237:46: sparse: sparse: self-comparison always evaluates to false
include/linux/page-flags.h:237:46: sparse: sparse: self-comparison always evaluates to false
vim +694 fs/btrfs/bio.c
659
660 static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
661 {
662 struct btrfs_inode *inode = bbio->inode;
663 struct btrfs_fs_info *fs_info = bbio->fs_info;
664 struct bio *bio = &bbio->bio;
665 u64 logical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
666 u64 length = bio->bi_iter.bi_size;
667 u64 map_length = length;
668 bool use_append = btrfs_use_zone_append(bbio);
669 struct btrfs_io_context *bioc = NULL;
670 struct btrfs_io_stripe smap;
671 blk_status_t ret;
672 int error;
673
674 if (!bbio->inode || btrfs_is_data_reloc_root(inode->root))
675 smap.rst_search_commit_root = true;
676 else
677 smap.rst_search_commit_root = false;
678
679 btrfs_bio_counter_inc_blocked(fs_info);
680 error = btrfs_map_block(fs_info, btrfs_op(bio), logical, &map_length,
681 &bioc, &smap, &mirror_num);
682 if (error) {
683 ret = errno_to_blk_status(error);
684 goto fail;
685 }
686
687 map_length = min(map_length, length);
688 if (use_append)
689 map_length = btrfs_append_map_length(bbio, map_length);
690
691 if (map_length < length) {
692 bbio = btrfs_split_bio(fs_info, bbio, map_length);
693 if (IS_ERR(bbio)) {
> 694 ret = PTR_ERR(bbio);
695 goto fail;
696 }
697 bio = &bbio->bio;
698 }
699
700 /*
701 * Save the iter for the end_io handler and preload the checksums for
702 * data reads.
703 */
704 if (bio_op(bio) == REQ_OP_READ && is_data_bbio(bbio)) {
705 bbio->saved_iter = bio->bi_iter;
706 ret = btrfs_lookup_bio_sums(bbio);
707 if (ret)
708 goto fail;
709 }
710
711 if (btrfs_op(bio) == BTRFS_MAP_WRITE) {
712 if (use_append) {
713 bio->bi_opf &= ~REQ_OP_WRITE;
714 bio->bi_opf |= REQ_OP_ZONE_APPEND;
715 }
716
717 if (is_data_bbio(bbio) && bioc &&
718 btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) {
719 /*
720 * No locking for the list update, as we only add to
721 * the list in the I/O submission path, and list
722 * iteration only happens in the completion path, which
723 * can't happen until after the last submission.
724 */
725 btrfs_get_bioc(bioc);
726 list_add_tail(&bioc->rst_ordered_entry, &bbio->ordered->bioc_list);
727 }
728
729 /*
730 * Csum items for reloc roots have already been cloned at this
731 * point, so they are handled as part of the no-checksum case.
732 */
733 if (inode && !(inode->flags & BTRFS_INODE_NODATASUM) &&
734 !test_bit(BTRFS_FS_STATE_NO_DATA_CSUMS, &fs_info->fs_state) &&
735 !btrfs_is_data_reloc_root(inode->root)) {
736 if (should_async_write(bbio) &&
737 btrfs_wq_submit_bio(bbio, bioc, &smap, mirror_num))
738 goto done;
739
740 ret = btrfs_bio_csum(bbio);
741 if (ret)
742 goto fail;
743 } else if (use_append ||
744 (btrfs_is_zoned(fs_info) && inode &&
745 inode->flags & BTRFS_INODE_NODATASUM)) {
746 ret = btrfs_alloc_dummy_sum(bbio);
747 if (ret)
748 goto fail;
749 }
750 }
751
752 btrfs_submit_bio(bio, bioc, &smap, mirror_num);
753 done:
754 return map_length == length;
755
756 fail:
757 btrfs_bio_counter_dec(fs_info);
758 /*
759 * We have split the original bbio, now we have to end both the current
760 * @bbio and remaining one, as the remaining one will never be submitted.
761 */
762 if (map_length < length) {
763 struct btrfs_bio *remaining = bbio->private;
764
765 ASSERT(bbio->bio.bi_pool == &btrfs_clone_bioset);
766 ASSERT(remaining);
767
768 btrfs_bio_end_io(remaining, ret);
769 }
770 btrfs_bio_end_io(bbio, ret);
771 /* Do not submit another chunk */
772 return true;
773 }
774
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-30 14:00 ` kernel test robot
@ 2024-10-30 14:06 ` Johannes Thumshirn
2024-10-30 14:20 ` John Garry
0 siblings, 1 reply; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-30 14:06 UTC (permalink / raw)
To: kernel test robot, Johannes Thumshirn, John Garry
Cc: oe-kbuild-all@lists.linux.dev, linux-block@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-raid@vger.kernel.org, axboe@kernel.dk, song@kernel.org,
yukuai3@huawei.com, hch, martin.petersen@oracle.com, hare@suse.de
On 30.10.24 15:01, kernel test robot wrote:
> sparse warnings: (new ones prefixed by >>)
>>> fs/btrfs/bio.c:694:29: sparse: sparse: incorrect type in assignment (different base types) @@ expected restricted blk_status_t [assigned] [usertype] ret @@ got long @@
Yep that definitely needs to be:
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index 96cacd5c03a5..847af28ecff9 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -691,7 +691,7 @@ static bool btrfs_submit_chunk(struct btrfs_bio
*bbio, int mirror_num)
if (map_length < length) {
bbio = btrfs_split_bio(fs_info, bbio, map_length);
if (IS_ERR(bbio)) {
- ret = PTR_ERR(bbio);
+ ret = errno_to_blk_status(PTR_ERR(bbio));
goto fail;
}
bio = &bbio->bio;
Can you fold that in John or do you want me to send a new version?
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-30 14:06 ` Johannes Thumshirn
@ 2024-10-30 14:20 ` John Garry
2024-10-30 14:29 ` Johannes Thumshirn
0 siblings, 1 reply; 35+ messages in thread
From: John Garry @ 2024-10-30 14:20 UTC (permalink / raw)
To: Johannes Thumshirn, kernel test robot, Johannes Thumshirn
Cc: oe-kbuild-all@lists.linux.dev, linux-block@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-raid@vger.kernel.org, axboe@kernel.dk, song@kernel.org,
yukuai3@huawei.com, hch, martin.petersen@oracle.com, hare@suse.de
On 30/10/2024 14:06, Johannes Thumshirn wrote:
>>>> ret @@ got long @@
> Yep that definitively needs to be:
>
> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
> index 96cacd5c03a5..847af28ecff9 100644
> --- a/fs/btrfs/bio.c
> +++ b/fs/btrfs/bio.c
> @@ -691,7 +691,7 @@ static bool btrfs_submit_chunk(struct btrfs_bio
> *bbio, int mirror_num)
> if (map_length < length) {
> bbio = btrfs_split_bio(fs_info, bbio, map_length);
> if (IS_ERR(bbio)) {
> - ret = PTR_ERR(bbio);
> + ret = errno_to_blk_status(PTR_ERR(bbio));
> goto fail;
> }
> bio = &bbio->bio;
>
> Can you fold that in John or do you want me to send a new version?
Sure, no problem.
But I would have suggested not using the variable name "ret" for holding a
blk_status_t in the original code, especially when it is mixed with
PTR_ERR() usage ...
Thanks,
John
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-30 14:20 ` John Garry
@ 2024-10-30 14:29 ` Johannes Thumshirn
0 siblings, 0 replies; 35+ messages in thread
From: Johannes Thumshirn @ 2024-10-30 14:29 UTC (permalink / raw)
To: John Garry, kernel test robot, Johannes Thumshirn
Cc: oe-kbuild-all@lists.linux.dev, linux-block@vger.kernel.org,
linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-raid@vger.kernel.org, axboe@kernel.dk, song@kernel.org,
yukuai3@huawei.com, hch, martin.petersen@oracle.com, hare@suse.de
On 30.10.24 15:21, John Garry wrote:
> On 30/10/2024 14:06, Johannes Thumshirn wrote:
>>>>> ret @@ got long @@
>> Yep that definitively needs to be:
>>
>> diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
>> index 96cacd5c03a5..847af28ecff9 100644
>> --- a/fs/btrfs/bio.c
>> +++ b/fs/btrfs/bio.c
>> @@ -691,7 +691,7 @@ static bool btrfs_submit_chunk(struct btrfs_bio
>> *bbio, int mirror_num)
>> if (map_length < length) {
>> bbio = btrfs_split_bio(fs_info, bbio, map_length);
>> if (IS_ERR(bbio)) {
>> - ret = PTR_ERR(bbio);
>> + ret = errno_to_blk_status(PTR_ERR(bbio));
>> goto fail;
>> }
>> bio = &bbio->bio;
>>
>> Can you fold that in John or do you want me to send a new version?
>
> Sure, no problem.
>
> But I would have suggested to not use variable name "ret" for holding a
> blk_status_t in original code, especially when it is mixed with
> PTR_ERR() usage ...
Yeah but the original code is using "blk_status_t ret", otherwise it
would be sth. like:
error = PTR_ERR(bbio);
ret = errno_to_blk_status(error);
goto fail;
[...]
fail:
btrfs_bio_end_io(bbio, ret);
Or changing the original code, which we could do, but there are way more
urgent problems :)
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
2024-10-29 10:33 ` John Garry
2024-10-30 14:00 ` kernel test robot
@ 2024-10-30 20:05 ` Dan Carpenter
2024-10-31 9:29 ` John Garry
2 siblings, 1 reply; 35+ messages in thread
From: Dan Carpenter @ 2024-10-30 20:05 UTC (permalink / raw)
To: oe-kbuild, Johannes Thumshirn, John Garry
Cc: lkp, oe-kbuild-all, linux-block, linux-btrfs, linux-kernel,
linux-raid, axboe, song, yukuai3, hch, martin.petersen, hare,
Johannes Thumshirn
Hi Johannes,
kernel test robot noticed the following build warnings:
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Johannes-Thumshirn/btrfs-handle-bio_split-error/20241029-171227
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/20241029091121.16281-1-jth%40kernel.org
patch subject: [PATCH] btrfs: handle bio_split() error
config: openrisc-randconfig-r072-20241030 (https://download.01.org/0day-ci/archive/20241031/202410310231.WMcRwBhG-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 14.1.0
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202410310231.WMcRwBhG-lkp@intel.com/
smatch warnings:
fs/btrfs/bio.c:763 btrfs_submit_chunk() error: 'bbio' dereferencing possible ERR_PTR()
vim +/bbio +763 fs/btrfs/bio.c
ae42a154ca8972 Christoph Hellwig 2023-03-07 660 static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num)
103c19723c80bf Christoph Hellwig 2022-11-15 661 {
d5e4377d505189 Christoph Hellwig 2023-01-21 662 struct btrfs_inode *inode = bbio->inode;
4317ff0056bedf Qu Wenruo 2023-03-23 663 struct btrfs_fs_info *fs_info = bbio->fs_info;
ae42a154ca8972 Christoph Hellwig 2023-03-07 664 struct bio *bio = &bbio->bio;
adbe7e388e4239 Anand Jain 2023-04-15 665 u64 logical = bio->bi_iter.bi_sector << SECTOR_SHIFT;
103c19723c80bf Christoph Hellwig 2022-11-15 666 u64 length = bio->bi_iter.bi_size;
103c19723c80bf Christoph Hellwig 2022-11-15 667 u64 map_length = length;
921603c76246a7 Christoph Hellwig 2022-12-12 668 bool use_append = btrfs_use_zone_append(bbio);
103c19723c80bf Christoph Hellwig 2022-11-15 669 struct btrfs_io_context *bioc = NULL;
103c19723c80bf Christoph Hellwig 2022-11-15 670 struct btrfs_io_stripe smap;
9ba0004bd95e05 Christoph Hellwig 2023-01-21 671 blk_status_t ret;
9ba0004bd95e05 Christoph Hellwig 2023-01-21 672 int error;
103c19723c80bf Christoph Hellwig 2022-11-15 673
f4d39cf1cebfb8 Johannes Thumshirn 2024-07-31 674 if (!bbio->inode || btrfs_is_data_reloc_root(inode->root))
f4d39cf1cebfb8 Johannes Thumshirn 2024-07-31 675 smap.rst_search_commit_root = true;
f4d39cf1cebfb8 Johannes Thumshirn 2024-07-31 676 else
f4d39cf1cebfb8 Johannes Thumshirn 2024-07-31 677 smap.rst_search_commit_root = false;
9acaa64187f9b4 Johannes Thumshirn 2023-09-14 678
103c19723c80bf Christoph Hellwig 2022-11-15 679 btrfs_bio_counter_inc_blocked(fs_info);
cd4efd210edfb3 Christoph Hellwig 2023-05-31 680 error = btrfs_map_block(fs_info, btrfs_op(bio), logical, &map_length,
9fb2acc2fe07f1 Qu Wenruo 2023-09-17 681 &bioc, &smap, &mirror_num);
9ba0004bd95e05 Christoph Hellwig 2023-01-21 682 if (error) {
9ba0004bd95e05 Christoph Hellwig 2023-01-21 683 ret = errno_to_blk_status(error);
9ba0004bd95e05 Christoph Hellwig 2023-01-21 684 goto fail;
103c19723c80bf Christoph Hellwig 2022-11-15 685 }
103c19723c80bf Christoph Hellwig 2022-11-15 686
852eee62d31abd Christoph Hellwig 2023-01-21 687 map_length = min(map_length, length);
d5e4377d505189 Christoph Hellwig 2023-01-21 688 if (use_append)
b35243a447b9fe Christoph Hellwig 2024-08-26 689 map_length = btrfs_append_map_length(bbio, map_length);
d5e4377d505189 Christoph Hellwig 2023-01-21 690
103c19723c80bf Christoph Hellwig 2022-11-15 691 if (map_length < length) {
b35243a447b9fe Christoph Hellwig 2024-08-26 692 bbio = btrfs_split_bio(fs_info, bbio, map_length);
28c02a018d50ae Johannes Thumshirn 2024-10-29 693 if (IS_ERR(bbio)) {
28c02a018d50ae Johannes Thumshirn 2024-10-29 694 ret = PTR_ERR(bbio);
28c02a018d50ae Johannes Thumshirn 2024-10-29 695 goto fail;
We hit this goto. We know from the if statement that map_length < length.
28c02a018d50ae Johannes Thumshirn 2024-10-29 696 }
2cef0c79bb81d8 Christoph Hellwig 2023-03-07 697 bio = &bbio->bio;
103c19723c80bf Christoph Hellwig 2022-11-15 698 }
103c19723c80bf Christoph Hellwig 2022-11-15 699
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 700 /*
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 701 * Save the iter for the end_io handler and preload the checksums for
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 702 * data reads.
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 703 */
fbe960877b6f43 Christoph Hellwig 2023-05-31 704 if (bio_op(bio) == REQ_OP_READ && is_data_bbio(bbio)) {
0d3acb25e70d5f Christoph Hellwig 2023-01-21 705 bbio->saved_iter = bio->bi_iter;
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 706 ret = btrfs_lookup_bio_sums(bbio);
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 707 if (ret)
10d9d8c3512f16 Qu Wenruo 2024-08-17 708 goto fail;
1c2b3ee3b0ec4b Christoph Hellwig 2023-01-21 709 }
7276aa7d38255b Christoph Hellwig 2023-01-21 710
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 711 if (btrfs_op(bio) == BTRFS_MAP_WRITE) {
d5e4377d505189 Christoph Hellwig 2023-01-21 712 if (use_append) {
d5e4377d505189 Christoph Hellwig 2023-01-21 713 bio->bi_opf &= ~REQ_OP_WRITE;
d5e4377d505189 Christoph Hellwig 2023-01-21 714 bio->bi_opf |= REQ_OP_ZONE_APPEND;
69ccf3f4244abc Christoph Hellwig 2023-01-21 715 }
69ccf3f4244abc Christoph Hellwig 2023-01-21 716
02c372e1f016e5 Johannes Thumshirn 2023-09-14 717 if (is_data_bbio(bbio) && bioc &&
02c372e1f016e5 Johannes Thumshirn 2023-09-14 718 btrfs_need_stripe_tree_update(bioc->fs_info, bioc->map_type)) {
02c372e1f016e5 Johannes Thumshirn 2023-09-14 719 /*
02c372e1f016e5 Johannes Thumshirn 2023-09-14 720 * No locking for the list update, as we only add to
02c372e1f016e5 Johannes Thumshirn 2023-09-14 721 * the list in the I/O submission path, and list
02c372e1f016e5 Johannes Thumshirn 2023-09-14 722 * iteration only happens in the completion path, which
02c372e1f016e5 Johannes Thumshirn 2023-09-14 723 * can't happen until after the last submission.
02c372e1f016e5 Johannes Thumshirn 2023-09-14 724 */
02c372e1f016e5 Johannes Thumshirn 2023-09-14 725 btrfs_get_bioc(bioc);
02c372e1f016e5 Johannes Thumshirn 2023-09-14 726 list_add_tail(&bioc->rst_ordered_entry, &bbio->ordered->bioc_list);
02c372e1f016e5 Johannes Thumshirn 2023-09-14 727 }
02c372e1f016e5 Johannes Thumshirn 2023-09-14 728
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 729 /*
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 730 * Csum items for reloc roots have already been cloned at this
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 731 * point, so they are handled as part of the no-checksum case.
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 732 */
4317ff0056bedf Qu Wenruo 2023-03-23 733 if (inode && !(inode->flags & BTRFS_INODE_NODATASUM) &&
169aaaf2e0be61 Qu Wenruo 2024-06-14 734 !test_bit(BTRFS_FS_STATE_NO_DATA_CSUMS, &fs_info->fs_state) &&
d5e4377d505189 Christoph Hellwig 2023-01-21 735 !btrfs_is_data_reloc_root(inode->root)) {
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 736 if (should_async_write(bbio) &&
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 737 btrfs_wq_submit_bio(bbio, bioc, &smap, mirror_num))
852eee62d31abd Christoph Hellwig 2023-01-21 738 goto done;
103c19723c80bf Christoph Hellwig 2022-11-15 739
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 740 ret = btrfs_bio_csum(bbio);
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 741 if (ret)
10d9d8c3512f16 Qu Wenruo 2024-08-17 742 goto fail;
cebae292e0c32a Johannes Thumshirn 2024-06-07 743 } else if (use_append ||
cebae292e0c32a Johannes Thumshirn 2024-06-07 744 (btrfs_is_zoned(fs_info) && inode &&
cebae292e0c32a Johannes Thumshirn 2024-06-07 745 inode->flags & BTRFS_INODE_NODATASUM)) {
cbfce4c7fbde23 Christoph Hellwig 2023-05-24 746 ret = btrfs_alloc_dummy_sum(bbio);
cbfce4c7fbde23 Christoph Hellwig 2023-05-24 747 if (ret)
10d9d8c3512f16 Qu Wenruo 2024-08-17 748 goto fail;
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 749 }
103c19723c80bf Christoph Hellwig 2022-11-15 750 }
f8a53bb58ec7e2 Christoph Hellwig 2023-01-21 751
22b4ef50dc1d11 David Sterba 2024-08-27 752 btrfs_submit_bio(bio, bioc, &smap, mirror_num);
852eee62d31abd Christoph Hellwig 2023-01-21 753 done:
852eee62d31abd Christoph Hellwig 2023-01-21 754 return map_length == length;
9ba0004bd95e05 Christoph Hellwig 2023-01-21 755
9ba0004bd95e05 Christoph Hellwig 2023-01-21 756 fail:
9ba0004bd95e05 Christoph Hellwig 2023-01-21 757 btrfs_bio_counter_dec(fs_info);
10d9d8c3512f16 Qu Wenruo 2024-08-17 758 /*
10d9d8c3512f16 Qu Wenruo 2024-08-17 759 * We have split the original bbio, now we have to end both the current
10d9d8c3512f16 Qu Wenruo 2024-08-17 760 * @bbio and remaining one, as the remaining one will never be submitted.
10d9d8c3512f16 Qu Wenruo 2024-08-17 761 */
10d9d8c3512f16 Qu Wenruo 2024-08-17 762 if (map_length < length) {
10d9d8c3512f16 Qu Wenruo 2024-08-17 @763 struct btrfs_bio *remaining = bbio->private;
^^^^^^^^^^^^^
Error pointer dereference
10d9d8c3512f16 Qu Wenruo 2024-08-17 764
10d9d8c3512f16 Qu Wenruo 2024-08-17 765 ASSERT(bbio->bio.bi_pool == &btrfs_clone_bioset);
10d9d8c3512f16 Qu Wenruo 2024-08-17 766 ASSERT(remaining);
10d9d8c3512f16 Qu Wenruo 2024-08-17 767
9ca0e58cb752b0 Qu Wenruo 2024-08-24 768 btrfs_bio_end_io(remaining, ret);
10d9d8c3512f16 Qu Wenruo 2024-08-17 769 }
9ca0e58cb752b0 Qu Wenruo 2024-08-24 770 btrfs_bio_end_io(bbio, ret);
852eee62d31abd Christoph Hellwig 2023-01-21 771 /* Do not submit another chunk */
852eee62d31abd Christoph Hellwig 2023-01-21 772 return true;
852eee62d31abd Christoph Hellwig 2023-01-21 773 }
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH] btrfs: handle bio_split() error
2024-10-30 20:05 ` Dan Carpenter
@ 2024-10-31 9:29 ` John Garry
0 siblings, 0 replies; 35+ messages in thread
From: John Garry @ 2024-10-31 9:29 UTC (permalink / raw)
To: Dan Carpenter, oe-kbuild, Johannes Thumshirn
Cc: lkp, oe-kbuild-all, linux-block, linux-btrfs, linux-kernel,
linux-raid, axboe, song, yukuai3, hch, martin.petersen, hare,
Johannes Thumshirn
> 852eee62d31abd Christoph Hellwig 2023-01-21 687 map_length = min(map_length, length);
> d5e4377d505189 Christoph Hellwig 2023-01-21 688 if (use_append)
> b35243a447b9fe Christoph Hellwig 2024-08-26 689 map_length = btrfs_append_map_length(bbio, map_length);
> d5e4377d505189 Christoph Hellwig 2023-01-21 690
> 103c19723c80bf Christoph Hellwig 2022-11-15 691 if (map_length < length) {
> b35243a447b9fe Christoph Hellwig 2024-08-26 692 bbio = btrfs_split_bio(fs_info, bbio, map_length);
> 28c02a018d50ae Johannes Thumshirn 2024-10-29 693 if (IS_ERR(bbio)) {
> 28c02a018d50ae Johannes Thumshirn 2024-10-29 694 ret = PTR_ERR(bbio);
> 28c02a018d50ae Johannes Thumshirn 2024-10-29 695 goto fail;
>
> We hit this goto. We know from the if statement that map_length < length.
>
> 28c02a018d50ae Johannes Thumshirn 2024-10-29 696 }
> 2cef0c79bb81d8 Christoph Hellwig 2023-03-07 697 bio = &bbio->bio;
> 103c19723c80bf Christoph Hellwig 2022-11-15 698 }
> 103c19723c80bf Christoph Hellwig 2022-11-15 699
...
> 852eee62d31abd Christoph Hellwig 2023-01-21 753 done:
> 852eee62d31abd Christoph Hellwig 2023-01-21 754 return map_length == length;
> 9ba0004bd95e05 Christoph Hellwig 2023-01-21 755
> 9ba0004bd95e05 Christoph Hellwig 2023-01-21 756 fail:
> 9ba0004bd95e05 Christoph Hellwig 2023-01-21 757 btrfs_bio_counter_dec(fs_info);
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 758 /*
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 759 * We have split the original bbio, now we have to end both the current
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 760 * @bbio and remaining one, as the remaining one will never be submitted.
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 761 */
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 762 if (map_length < length) {
> 10d9d8c3512f16 Qu Wenruo 2024-08-17 @763 struct btrfs_bio *remaining = bbio->private;
> ^^^^^^^^^^^^^
> Error pointer dereference
This analysis looks correct.
I want to send a new version of the bio_split() rework series this
morning, so I will not pick up this change for now.
John
end of thread, other threads:[~2024-10-31 9:30 UTC | newest]
Thread overview: 35+ messages
2024-10-28 15:27 [PATCH v2 0/7] bio_split() error handling rework John Garry
2024-10-28 15:27 ` [PATCH v2 1/7] block: Use BLK_STS_OK in bio_init() John Garry
2024-10-28 16:11 ` Christoph Hellwig
2024-10-28 16:32 ` John Garry
2024-10-28 15:27 ` [PATCH v2 2/7] block: Rework bio_split() return value John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 3/7] block: Error an attempt to split an atomic write in bio_split() John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:12 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 4/7] block: Handle bio_split() errors in bio_submit_split() John Garry
2024-10-28 16:12 ` Christoph Hellwig
2024-10-29 9:13 ` Johannes Thumshirn
2024-10-28 15:27 ` [PATCH v2 5/7] md/raid0: Handle bio_split() errors John Garry
2024-10-29 3:52 ` Yu Kuai
2024-10-28 15:27 ` [PATCH v2 6/7] md/raid1: " John Garry
2024-10-29 3:48 ` Yu Kuai
2024-10-29 8:45 ` John Garry
2024-10-29 11:30 ` Yu Kuai
2024-10-29 11:36 ` John Garry
2024-10-29 11:32 ` Yu Kuai
2024-10-29 12:12 ` Yu Kuai
2024-10-29 12:21 ` John Garry
2024-10-28 15:27 ` [PATCH v2 7/7] md/raid10: " John Garry
2024-10-29 11:55 ` Yu Kuai
2024-10-29 12:05 ` John Garry
2024-10-29 12:10 ` Yu Kuai
2024-10-29 9:11 ` [PATCH] btrfs: handle bio_split() error Johannes Thumshirn
2024-10-29 10:33 ` John Garry
2024-10-30 14:00 ` kernel test robot
2024-10-30 14:06 ` Johannes Thumshirn
2024-10-30 14:20 ` John Garry
2024-10-30 14:29 ` Johannes Thumshirn
2024-10-30 20:05 ` Dan Carpenter
2024-10-31 9:29 ` John Garry