From: Li Nan <linan666@huaweicloud.com>
To: Kenta Akagi <k@mgml.me>, Song Liu <song@kernel.org>,
	Yu Kuai <yukuai3@huawei.com>,
	Mariusz Tkaczyk <mtkaczyk@kernel.org>,
	Guoqing Jiang <jgq516@gmail.com>
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] md/raid1,raid10: don't broken array on failfast metadata write fails
Date: Sat, 23 Aug 2025 09:54:11 +0800
Message-ID: <725043ad-2d50-be78-7cc3-8c565ab364e0@huaweicloud.com>
In-Reply-To: <20250817172710.4892-2-k@mgml.me>



On 2025/8/18 1:27, Kenta Akagi wrote:
> A super_write IO failure with MD_FAILFAST must not cause the array
> to fail.
> 
> A failfast bio may fail even when the rdev is not broken, so the IO
> must be retried rather than failing the array when a metadata
> write with MD_FAILFAST fails on the last rdev.
> 
> A metadata write with MD_FAILFAST is retried after failure as
> follows:
> 
> 1. In super_written, MD_SB_NEED_REWRITE is set in sb_flags.
> 
> 2. In md_super_wait, which is called by the function that
> executed md_super_write and waits for completion,
> -EAGAIN is returned because MD_SB_NEED_REWRITE is set.
> 
> 3. The caller of md_super_wait (such as md_update_sb)
> receives a negative return value and then retries md_super_write.
> 
> 4. The md_super_write function, which is called to perform
> the same metadata write, issues a write bio without MD_FAILFAST
> this time.
> 
> When a write from super_written without MD_FAILFAST fails,
> the array may be broken, and MD_BROKEN should be set.
> 
> After commit 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10"),
> calling md_error on the last rdev in RAID1/10 always sets
> the MD_BROKEN flag on the array.
> As a result, when failfast IO fails on the last rdev, the array
> immediately becomes failed.
> 
> This commit prevents MD_BROKEN from being set when a super_write with
> MD_FAILFAST fails on the last rdev, ensuring that the array does
> not become failed due to failfast IO failures.
> 
> Failfast IO failures on any rdev except the last one are not retried;
> the rdev is marked as Faulty immediately. This minimizes array IO
> latency when an rdev fails.
> 
> Fixes: 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10")
> Signed-off-by: Kenta Akagi <k@mgml.me>
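
For reference while reviewing, the retry sequence in steps 1-4 of the quoted
commit message can be modeled in user space roughly as follows. This is only
a sketch: struct sb_model and the model_* functions are hypothetical stand-ins
and not the kernel's md_super_write/md_super_wait, and the model assumes that
a write issued without failfast always succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model, not the kernel's data structures. */
struct sb_model {
	bool need_rewrite;	/* stands in for MD_SB_NEED_REWRITE */
	bool failfast_fails;	/* failfast writes fail, device is healthy */
	int writes_issued;	/* total metadata writes issued */
	int failfast_writes;	/* how many of them used failfast */
};

/* Models md_super_write plus its completion handler (step 1 and step 4). */
static void model_super_write(struct sb_model *sb, bool use_failfast)
{
	assert(sb != NULL);
	sb->writes_issued++;
	if (use_failfast) {
		sb->failfast_writes++;
		if (sb->failfast_fails)
			sb->need_rewrite = true;	/* step 1 */
	}
	/* Assumption: a write without failfast always succeeds here. */
}

/* Models md_super_wait (step 2): -EAGAIN if a rewrite was requested. */
static int model_super_wait(struct sb_model *sb)
{
	assert(sb != NULL);
	if (sb->need_rewrite) {
		sb->need_rewrite = false;
		return -EAGAIN;
	}
	return 0;
}

/* Models the caller (step 3): retry without failfast on a negative return. */
static int model_update_sb(struct sb_model *sb)
{
	bool use_failfast = true;

	for (;;) {
		model_super_write(sb, use_failfast);
		if (model_super_wait(sb) == 0)
			return sb->writes_issued;
		use_failfast = false;	/* step 4: retry without failfast */
	}
}
```

In this model a failfast failure costs exactly one extra write: calling
model_update_sb() with failfast_fails set returns 2, with one failfast write
followed by one plain write.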


[...]

> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1746,8 +1746,12 @@ static void raid1_status(struct seq_file *seq, struct mddev *mddev)
>    *	- recovery is interrupted.
>    *	- &mddev->degraded is bumped.
>    *
> - * @rdev is marked as &Faulty excluding case when array is failed and
> - * &mddev->fail_last_dev is off.
> + * If @rdev is marked with &FailfastIOFailure, it means that super_write
> + * failed in failfast and will be retried, so the @mddev did not fail.
> + *
> + * @rdev is marked as &Faulty excluding any cases:
> + *	- when @mddev is failed and &mddev->fail_last_dev is off
> + *	- when @rdev is last device and &FailfastIOFailure flag is set
>    */
>   static void raid1_error(struct mddev *mddev, struct md_rdev *rdev)
>   {
> @@ -1758,6 +1762,10 @@ static void raid1_error(struct mddev *mddev, struct md_rdev *rdev)
>   
>   	if (test_bit(In_sync, &rdev->flags) &&
>   	    (conf->raid_disks - mddev->degraded) == 1) {
> +		if (test_bit(FailfastIOFailure, &rdev->flags)) {
> +			spin_unlock_irqrestore(&conf->device_lock, flags);
> +			return;
> +		}
>   		set_bit(MD_BROKEN, &mddev->flags);
>   
>   		if (!mddev->fail_last_dev) {

At this point, a user who tries to fail this rdev gets a successful return,
but the rdev is not marked Faulty. Should we handle this case?
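
To make the concern concrete, here is a small user-space model of the hunk
above. It is a sketch under assumptions, not kernel code: struct array_model,
struct rdev_model and both functions are hypothetical, and model_user_fail()
assumes that the user-facing path reports success whenever the array is not
marked broken after the error handler has run.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mddev/conf and rdev state involved. */
struct array_model {
	int raid_disks;
	int degraded;
	bool broken;		/* stands in for MD_BROKEN */
	bool fail_last_dev;
};

struct rdev_model {
	bool in_sync;			/* stands in for In_sync */
	bool failfast_io_failure;	/* stands in for FailfastIOFailure */
	bool faulty;			/* stands in for Faulty */
};

/* Simplified model of raid1_error() with the patch applied. */
static void model_raid1_error(struct array_model *a, struct rdev_model *r)
{
	assert(a != NULL && r != NULL);
	if (r->in_sync && (a->raid_disks - a->degraded) == 1) {
		if (r->failfast_io_failure)
			return;	/* early return added by the patch */
		a->broken = true;
		if (!a->fail_last_dev)
			return;	/* last device is kept, not marked Faulty */
	}
	if (r->in_sync) {
		r->in_sync = false;
		a->degraded++;
	}
	r->faulty = true;
}

/*
 * Assumed user-facing path: run the error handler, then report -EBUSY
 * only if the array ended up broken, otherwise report success.
 */
static int model_user_fail(struct array_model *a, struct rdev_model *r)
{
	model_raid1_error(a, r);
	return a->broken ? -EBUSY : 0;
}
```

With raid_disks = 2, degraded = 1 and the failfast flag set on the last
in-sync rdev, model_user_fail() returns 0 while faulty stays false, which is
the case asked about above. Without the flag, the same call returns -EBUSY.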

-- 
Thanks,
Nan



Thread overview: 11+ messages
2025-08-17 17:27 [PATCH v2 0/3] md/raid1,raid10: don't broken array on failfast metadata write fails Kenta Akagi
2025-08-17 17:27 ` [PATCH v2 1/3] " Kenta Akagi
2025-08-18  2:05   ` Yu Kuai
2025-08-18  2:48     ` Yu Kuai
2025-08-18 12:48       ` Kenta Akagi
2025-08-18 15:45         ` Yu Kuai
2025-08-20 17:09           ` Kenta Akagi
2025-08-23  1:54   ` Li Nan [this message]
2025-08-27 17:31     ` Kenta Akagi
2025-08-17 17:27 ` [PATCH v2 2/3] md/raid1,raid10: Add error message when setting MD_BROKEN Kenta Akagi
2025-08-17 17:27 ` [PATCH v2 3/3] md/raid1,raid10: Fix: Operation continuing on 0 devices Kenta Akagi
