Re: [PATCH] raid10: badblock-aware reshape write error handling

Linux RAID subsystem development
 help / color / mirror / Atom feed

From: caogh <ghuicao@163.com>
To: Li Nan <magiclinan@foxmail.com>
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	song@kernel.org, xiao@kernel.org, yukuai@fnnas.com
Subject: Re: [PATCH] raid10: badblock-aware reshape write error handling
Date: Fri, 12 Jun 2026 16:05:04 +0800	[thread overview]
Message-ID: <8da26fc1-d582-4e93-8c00-d545b8558fb2@163.com> (raw)
In-Reply-To: <tencent_1A4057A928F0CDF70F24A070B03ABFB34A05@qq.com>

Nan, thanks for the review.

You're right that the current code chains several mechanisms (badblock 
recording, WantReplacement, md_error) together, which hurts readability. 
I'll refactor in the next version to make the logic clearer.

However, regarding your suggested logic:

> The correct logic might be:
>    1. If there is a replacement, fail directly with md_error
>    2. If not, mark the badblock and set WantReplacement

I have some questions.

If a replacement device hits a write error, calling md_error directly 
will mark it Faulty and kick it out of the array. But it may just be a 
few bad sectors — recording the badblock and continuing the reshape is 
sufficient. Is it really necessary to call md_error directly in this case?

The current patch does:
     1、Try to record the badblock first
     2、On success:
          a) member device: set WantReplacement to trigger a replacement
          b) replacement device: skip WantReplacement to avoid a 
replacement loop
     3、On failure: rdev_set_badblocks has already called md_error 
internally, no further action needed

Does this approach seem reasonable?


在 2026/6/12 11:42, Li Nan 写道:
> On Mon Jun 1, 2026 at 1:46 PM CST, ghuicao wrote:
>> From: Cao Guanghui<caoguanghui@kylinos.cn>
>>
>> Replace the FIXME in end_reshape_write(). Instead of failing the device
>> immediately on write errors during reshape, attempt to record badblocks
>> using new_data_offset with is_new=1.
>>
>> rdev_set_badblocks() returns true on success. On failure (e.g., badblocks
>> table full), it has already called md_error() internally to degrade the
>> device. Queue WantReplacement for member devices regardless of badblock
>> recording success, but skip this for replacement devices to avoid
>> replacement loops.
>>
>> On successful write, clear stale badblock records at the new location
>> since data has migrated.
>>
>> Signed-off-by: Cao Guanghui<caoguanghui@kylinos.cn>
>> ---
>>   drivers/md/raid10.c | 27 ++++++++++++++++++++++++---
>>   1 file changed, 24 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 4901ebe45c87..08d58a1c680e 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -4991,9 +4991,30 @@ static void end_reshape_write(struct bio *bio)
>>   		      conf->mirrors[d].rdev;
>>   
>>   	if (bio->bi_status) {
>> -		/* FIXME should record badblock */
>> -		md_error(mddev, rdev);
>> -	}
>> +		set_bit(WriteErrorSeen, &rdev->flags);
>> +
>> +		/* rdev_set_badblocks returns true on success.
>> +		 * On failure, it has already called md_error() internally.
>> +		 * Use is_new=1 as reshape writes target the new layout
>> +		 * (new_data_offset).
>> +		 */
>> +		if (rdev_set_badblocks(rdev, r10_bio->devs[slot].addr,
>> +				       r10_bio->sectors, 1)) {
>> +			/* Queue async replacement for member devices
>> +			 * For replacement devices, do not trigger WantReplacement
>> +			 * to avoid circular replacement storms.
>> +			 */
>> +			if (!repl) {
>> +				if (!test_and_set_bit(WantReplacement, &rdev->flags))
>> +					set_bit(MD_RECOVERY_NEEDED,
>> +						&rdev->mddev->recovery);
> The logic here seems a bit odd — several mechanisms are chained together.
>
> The correct logic might be:
>    1. If there is a replacement, fail directly with md_error
>    2. If not, mark the badblock and set WantReplacement
>
>> +			}
>> +		}
>> +	} else {
>> +		/* Write succeeded, clear stale badblock records */
>> +		rdev_clear_badblocks(rdev, r10_bio->devs[slot].addr,
>> +				     r10_bio->sectors, 1);
>> +	}
>>   
>>   	rdev_dec_pending(rdev, mddev);
>>   	end_reshape_request(r10_bio);
>>

Thanks, Cao Guanghui

     prev parent reply	other threads:[~2026-06-12  8:05 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-01  5:46 [PATCH] raid10: badblock-aware reshape write error handling ghuicao
2026-06-12  3:42 ` Li Nan
2026-06-12  8:05   ` caogh [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8da26fc1-d582-4e93-8c00-d545b8558fb2@163.com \
    --to=ghuicao@163.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=magiclinan@foxmail.com \
    --cc=song@kernel.org \
    --cc=xiao@kernel.org \
    --cc=yukuai@fnnas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox