public inbox for linux-raid@vger.kernel.org
From: Kenta Akagi <k@mgml.me>
To: Song Liu <song@kernel.org>, Yu Kuai <yukuai@fnnas.com>,
	Shaohua Li <shli@fb.com>, Mariusz Tkaczyk <mtkaczyk@kernel.org>,
	Xiao Ni <xni@redhat.com>
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	Kenta Akagi <k@mgml.me>
Subject: [PATCH v6 1/2] md: Don't set MD_BROKEN for RAID1 and RAID10 when using FailFast
Date: Mon,  5 Jan 2026 23:40:24 +0900
Message-ID: <20260105144025.12478-2-k@mgml.me>
In-Reply-To: <20260105144025.12478-1-k@mgml.me>

After commit 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10"),
if the error handler is called on the last rdev in RAID1 or RAID10,
the MD_BROKEN flag will be set on that mddev.
When MD_BROKEN is set, write bios to the md will result in an I/O error.

This causes a problem when using FailFast.
The current implementation of FailFast expects the array to continue
functioning without issues even after calling md_error for the last
rdev.  Furthermore, due to the nature of its functionality, FailFast may
call md_error on all rdevs of the md. Even if retrying I/O on an rdev
would succeed, it first calls md_error before retrying.

To fix this issue, this commit ensures that for RAID1 and RAID10, if the
last In_sync rdev has the FailFast flag set and the mddev's fail_last_dev
is off, the MD_BROKEN flag will not be set on that mddev.
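The resulting decision can be sketched as a small standalone model.
This is only an illustration of the condition added to raid1_error()
and raid10_error() in the diff; struct last_dev_state and
would_set_md_broken() are hypothetical names that do not exist in the
kernel, and the model assumes the handler has already determined that
the failing rdev is the last In_sync one.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel state consulted by the error
 * handlers once the failing rdev is known to be the last In_sync one.
 */
struct last_dev_state {
	bool rdev_failfast;	/* models test_bit(FailFast, &rdev->flags) */
	bool fail_last_dev;	/* models mddev->fail_last_dev */
};

/*
 * Returns true when the patched handlers would set MD_BROKEN.
 * Mirrors: if (!test_bit(FailFast, &rdev->flags) || mddev->fail_last_dev)
 */
static bool would_set_md_broken(struct last_dev_state s)
{
	return !s.rdev_failfast || s.fail_last_dev;
}
```

Only the combination of a FailFast rdev and fail_last_dev being off
leaves MD_BROKEN unset; the other three combinations behave as before.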

This change impacts userspace. After this commit, if the last In_sync
rdev has the FailFast flag set and fail_last_dev is off, the mddev is
never marked broken, even if the failing bio is not a FailFast bio.
However, it is unlikely that any setup using FailFast expects the array
to halt when md_error is called on the last rdev.
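One userspace-visible detail is the result of a request to mark a
device faulty. The md.c hunks in this patch make state_store() and
set_disk_faulty() report -EBUSY not only when MD_BROKEN is set but
also when the rdev did not end up Faulty after md_error(). A
simplified model of that check is sketched below;
faulty_request_result() is a hypothetical helper, not a kernel
function, and it ignores the earlier error paths such as -ENODEV.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Models the check made after md_error() in the patched md.c:
 *   md_broken   stands for test_bit(MD_BROKEN, &mddev->flags)
 *   rdev_faulty stands for test_bit(Faulty, &rdev->flags)
 * Returns -EBUSY when the array is broken or the rdev was not
 * actually marked Faulty, and 0 otherwise.
 */
static int faulty_request_result(bool md_broken, bool rdev_faulty)
{
	if (md_broken || !rdev_faulty)
		return -EBUSY;
	return 0;
}
```

With this check, a request against the last FailFast rdev still
returns -EBUSY when the device is left in place, even though
MD_BROKEN is no longer set in that case.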

Since FailFast is only implemented for RAID1 and RAID10, no changes are
needed for other personalities.

Fixes: 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10")
Suggested-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Kenta Akagi <k@mgml.me>
---
 drivers/md/md.c     | 6 ++++--
 drivers/md/raid1.c  | 8 +++++++-
 drivers/md/raid10.c | 8 +++++++-
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6062e0deb616..f1745f8921fc 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3050,7 +3050,8 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 	if (cmd_match(buf, "faulty") && rdev->mddev->pers) {
 		md_error(rdev->mddev, rdev);
 
-		if (test_bit(MD_BROKEN, &rdev->mddev->flags))
+		if (test_bit(MD_BROKEN, &rdev->mddev->flags) ||
+		    !test_bit(Faulty, &rdev->flags))
 			err = -EBUSY;
 		else
 			err = 0;
@@ -7915,7 +7916,8 @@ static int set_disk_faulty(struct mddev *mddev, dev_t dev)
 		err =  -ENODEV;
 	else {
 		md_error(mddev, rdev);
-		if (test_bit(MD_BROKEN, &mddev->flags))
+		if (test_bit(MD_BROKEN, &mddev->flags) ||
+		    !test_bit(Faulty, &rdev->flags))
 			err = -EBUSY;
 	}
 	rcu_read_unlock();
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 592a40233004..459b34cd358b 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1745,6 +1745,10 @@ static void raid1_status(struct seq_file *seq, struct mddev *mddev)
  *	- recovery is interrupted.
  *	- &mddev->degraded is bumped.
  *
+ * If the following conditions are met, @mddev never fails:
+ *	- The last In_sync @rdev has the &FailFast flag set.
+ *	- &mddev->fail_last_dev is off.
+ *
  * @rdev is marked as &Faulty excluding case when array is failed and
  * &mddev->fail_last_dev is off.
  */
@@ -1757,7 +1761,9 @@ static void raid1_error(struct mddev *mddev, struct md_rdev *rdev)
 
 	if (test_bit(In_sync, &rdev->flags) &&
 	    (conf->raid_disks - mddev->degraded) == 1) {
-		set_bit(MD_BROKEN, &mddev->flags);
+		if (!test_bit(FailFast, &rdev->flags) ||
+		    mddev->fail_last_dev)
+			set_bit(MD_BROKEN, &mddev->flags);
 
 		if (!mddev->fail_last_dev) {
 			conf->recovery_disabled = mddev->recovery_disabled;
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 14dcd5142eb4..b33149aa5b29 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1989,6 +1989,10 @@ static int enough(struct r10conf *conf, int ignore)
  *	- recovery is interrupted.
  *	- &mddev->degraded is bumped.
  *
+ * If the following conditions are met, @mddev never fails:
+ *	- The last In_sync @rdev has the &FailFast flag set.
+ *	- &mddev->fail_last_dev is off.
+ *
  * @rdev is marked as &Faulty excluding case when array is failed and
  * &mddev->fail_last_dev is off.
  */
@@ -2000,7 +2004,9 @@ static void raid10_error(struct mddev *mddev, struct md_rdev *rdev)
 	spin_lock_irqsave(&conf->device_lock, flags);
 
 	if (test_bit(In_sync, &rdev->flags) && !enough(conf, rdev->raid_disk)) {
-		set_bit(MD_BROKEN, &mddev->flags);
+		if (!test_bit(FailFast, &rdev->flags) ||
+		    mddev->fail_last_dev)
+			set_bit(MD_BROKEN, &mddev->flags);
 
 		if (!mddev->fail_last_dev) {
 			spin_unlock_irqrestore(&conf->device_lock, flags);
-- 
2.50.1

