* [PATCH] md/raid5: Fix bio retry on interrupted reshape
From: Nigel Croxon @ 2026-04-29 11:10 UTC
To: song, yukuai, linux-raid

When a bio encounters LOC_INSIDE_RESHAPE during a reshape that is
interrupted (stopped or unable to progress), the code sets
bi->bi_status = BLK_STS_RESOURCE to signal the block layer for retry.
However, bio_endio() is never called, so the block layer never
receives the completion notification and the retry never happens.

This causes I/O to hang when a filesystem is layered over RAID5 and
reshape gets stuck.

Fix this by calling bio_endio(bi) before md_free_cloned_bio(bi) so
the block layer is properly notified of the BLK_STS_RESOURCE status
and can retry the request.

Tested stripes and stripe size conversions under load, comparing
files multiple times during each conversion (i.e. MD reshape) on
ext4 after dropping caches and degrading the RaidLV each time, and
saw no data corruption.

Fixes: https://lwn.net/Articles/757123/
Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
---
drivers/md/raid5.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6e79829c5acb..9a3475429ef4 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6217,6 +6217,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 
 	mempool_free(ctx, conf->ctx_pool);
 	if (res == STRIPE_WAIT_RESHAPE) {
+		bio_endio(bi);
 		md_free_cloned_bio(bi);
 		return false;
 	}
--
2.47.3
* Re: [PATCH] md/raid5: Fix bio retry on interrupted reshape
From: Paul Menzel @ 2026-04-29 13:07 UTC
To: Nigel Croxon; +Cc: song, yukuai, linux-raid

Dear Nigel,

Thank you for your patch.

On 29.04.26 at 13:10, Nigel Croxon wrote:
> When a bio encounters LOC_INSIDE_RESHAPE during a reshape that is
> interrupted (stopped or unable to progress), the code sets
> bi->bi_status = BLK_STS_RESOURCE to signal the block layer for retry.
> However, bio_endio() is never called, so the block layer never
> receives the completion notification and the retry never happens.
>
> This causes I/O to hang when a filesystem is layered over RAID5 and
> reshape gets stuck.
>
> Fix this by calling bio_endio(bi) before md_free_cloned_bio(bi) so
> the block layer is properly notified of the BLK_STS_RESOURCE status
> and can retry the request.
>
> Tested stripes and stripe size conversions under load comparing
> files multiple times during each conversion (i.e. MD reshape) on
> ext4 after dropping caches degrading the RaidLV each time and

I first read RaidLV as a misspelling of Raid V (RAID 5), so should you
resend, maybe write it as RAID LV.

> no data corruption.
>
> Fixes: https://lwn.net/Articles/757123/
Which paragraph/comment exactly?
> Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
> ---
> drivers/md/raid5.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 6e79829c5acb..9a3475429ef4 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6217,6 +6217,7 @@ static bool raid5_make_request(struct mddev
> *mddev, struct bio * bi)
>
> mempool_free(ctx, conf->ctx_pool);
> if (res == STRIPE_WAIT_RESHAPE) {
> + bio_endio(bi);
> md_free_cloned_bio(bi);
> return false;
> }
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Kind regards,
Paul
* Re: [PATCH] md/raid5: Fix bio retry on interrupted reshape
From: Nigel Croxon @ 2026-04-30 11:53 UTC
To: Paul Menzel; +Cc: song, yukuai, linux-raid

Looks like this was just fixed by Benjamin Marzinski's commit
418b3e64e4459 ("md/raid5: Fix UAF on IO across the reshape position"),
so my patch is no longer needed.

-Nigel

On 4/29/26 9:07 AM, Paul Menzel wrote:
> Dear Nigel,
>
>
> Thank you for your patch.
>
> On 29.04.26 at 13:10, Nigel Croxon wrote:
>> When a bio encounters LOC_INSIDE_RESHAPE during a reshape that is
>> interrupted (stopped or unable to progress), the code sets
>> bi->bi_status = BLK_STS_RESOURCE to signal the block layer for retry.
>> However, bio_endio() is never called, so the block layer never
>> receives the completion notification and the retry never happens.
>>
>> This causes I/O to hang when a filesystem is layered over RAID5 and
>> reshape gets stuck.
>>
>> Fix this by calling bio_endio(bi) before md_free_cloned_bio(bi) so
>> the block layer is properly notified of the BLK_STS_RESOURCE status
>> and can retry the request.
>>
>> Tested stripes and stripe size conversions under load comparing
>> files multiple times during each conversion (i.e. MD reshape) on
>> ext4 after dropping caches degrading the RaidLV each time and
>
> I thought RaidLV misspelled Raid V (Raid 5), so should you resend,
> maybe write it as RAID LV.
>
>> no data corruption.
>>
>> Fixes: https://lwn.net/Articles/757123/
>
> Which paragraph/comment exactly?
>
>> Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
>> ---
>> drivers/md/raid5.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 6e79829c5acb..9a3475429ef4 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -6217,6 +6217,7 @@ static bool raid5_make_request(struct mddev
>> *mddev, struct bio * bi)
>>
>> mempool_free(ctx, conf->ctx_pool);
>> if (res == STRIPE_WAIT_RESHAPE) {
>> + bio_endio(bi);
>> md_free_cloned_bio(bi);
>> return false;
>> }
>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
>
>
> Kind regards,
>
> Paul
>