[PATCH v3 0/2] md: fix sync

* [PATCH v3 0/2] md: fix sync_action show
@ 2025-08-14  1:57 Zheng Qixing
  2025-08-14  1:57 ` [PATCH v3 1/2] md: add helper rdev_needs_recovery() Zheng Qixing
  2025-08-14  1:57 ` [PATCH v3 2/2] md: fix sync_action incorrect display during resync Zheng Qixing
  0 siblings, 2 replies; 5+ messages in thread
From: Zheng Qixing @ 2025-08-14  1:57 UTC (permalink / raw)
  To: song, yukuai3, linan122
  Cc: linux-raid, linux-kernel, pmenzel, yi.zhang, yangerkun, houtao1,
	zhengqixing

From: Zheng Qixing <zhengqixing@huawei.com>

Changes in v3:
  Code style modification in patch 1.

Fix incorrect display of sync_action when raid is in resync.

Zheng Qixing (2):
  md: add helper rdev_needs_recovery()
  md: fix sync_action incorrect display during resync

 drivers/md/md.c | 56 ++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 46 insertions(+), 10 deletions(-)

-- 
2.39.2



* [PATCH v3 1/2] md: add helper rdev_needs_recovery()
  2025-08-14  1:57 [PATCH v3 0/2] md: fix sync_action show Zheng Qixing
@ 2025-08-14  1:57 ` Zheng Qixing
  2025-08-14  7:09   ` Li Nan
  2025-08-14  1:57 ` [PATCH v3 2/2] md: fix sync_action incorrect display during resync Zheng Qixing
  1 sibling, 1 reply; 5+ messages in thread
From: Zheng Qixing @ 2025-08-14  1:57 UTC (permalink / raw)
  To: song, yukuai3, linan122
  Cc: linux-raid, linux-kernel, pmenzel, yi.zhang, yangerkun, houtao1,
	zhengqixing

From: Zheng Qixing <zhengqixing@huawei.com>

Add a helper for checking if an rdev needs recovery.

Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index ac85ec73a409..4663e172864e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4835,6 +4835,14 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
 static struct md_sysfs_entry md_metadata =
 __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
 
+static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors)
+{
+	return !test_bit(Journal, &rdev->flags) &&
+	       !test_bit(Faulty, &rdev->flags) &&
+	       !test_bit(In_sync, &rdev->flags) &&
+	       rdev->recovery_offset < sectors;
+}
+
 enum sync_action md_sync_action(struct mddev *mddev)
 {
 	unsigned long recovery = mddev->recovery;
@@ -8969,10 +8977,7 @@ static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
 		rcu_read_lock();
 		rdev_for_each_rcu(rdev, mddev)
 			if (rdev->raid_disk >= 0 &&
-			    !test_bit(Journal, &rdev->flags) &&
-			    !test_bit(Faulty, &rdev->flags) &&
-			    !test_bit(In_sync, &rdev->flags) &&
-			    rdev->recovery_offset < start)
+			    rdev_needs_recovery(rdev, start))
 				start = rdev->recovery_offset;
 		rcu_read_unlock();
 
@@ -9333,10 +9338,7 @@ void md_do_sync(struct md_thread *thread)
 				rdev_for_each_rcu(rdev, mddev)
 					if (rdev->raid_disk >= 0 &&
 					    mddev->delta_disks >= 0 &&
-					    !test_bit(Journal, &rdev->flags) &&
-					    !test_bit(Faulty, &rdev->flags) &&
-					    !test_bit(In_sync, &rdev->flags) &&
-					    rdev->recovery_offset < mddev->curr_resync)
+					    rdev_needs_recovery(rdev, mddev->curr_resync))
 						rdev->recovery_offset = mddev->curr_resync;
 				rcu_read_unlock();
 			}
-- 
2.39.2
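
For readers without a kernel tree at hand, the refactoring above can be exercised in userspace. The following is only a sketch under stated assumptions: `sector_t`, `struct md_rdev`, the flag bit numbers, and `test_bit()` are simplified stand-ins rather than the kernel definitions, and `lowest_recovery_offset()` is a hypothetical name for the loop that md_sync_position() runs under RCU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
typedef uint64_t sector_t;

/* Bit numbers are arbitrary here; only the names mirror md.h. */
enum flag_bits { In_sync, Faulty, Journal };

struct md_rdev {
	unsigned long flags;
	int raid_disk;
	sector_t recovery_offset;
};

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/* Same predicate as the helper added by the patch. */
static bool rdev_needs_recovery(const struct md_rdev *rdev, sector_t sectors)
{
	return !test_bit(Journal, &rdev->flags) &&
	       !test_bit(Faulty, &rdev->flags) &&
	       !test_bit(In_sync, &rdev->flags) &&
	       rdev->recovery_offset < sectors;
}

/*
 * Hypothetical model of the md_sync_position() loop: start from 'start'
 * and lower it to the smallest recovery_offset of any array member that
 * still needs recovery. The kernel walks an RCU list; an array is used
 * here to keep the sketch self-contained.
 */
static sector_t lowest_recovery_offset(const struct md_rdev *rdevs, size_t n,
				       sector_t start)
{
	for (size_t i = 0; i < n; i++)
		if (rdevs[i].raid_disk >= 0 &&
		    rdev_needs_recovery(&rdevs[i], start))
			start = rdevs[i].recovery_offset;
	return start;
}
```

Passing the comparison bound as the `sectors` argument is what lets one helper serve both call sites: md_sync_position() passes the running minimum `start`, while md_do_sync() passes `mddev->curr_resync`.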



* [PATCH v3 2/2] md: fix sync_action incorrect display during resync
  2025-08-14  1:57 [PATCH v3 0/2] md: fix sync_action show Zheng Qixing
  2025-08-14  1:57 ` [PATCH v3 1/2] md: add helper rdev_needs_recovery() Zheng Qixing
@ 2025-08-14  1:57 ` Zheng Qixing
  1 sibling, 0 replies; 5+ messages in thread
From: Zheng Qixing @ 2025-08-14  1:57 UTC (permalink / raw)
  To: song, yukuai3, linan122
  Cc: linux-raid, linux-kernel, pmenzel, yi.zhang, yangerkun, houtao1,
	zhengqixing

From: Zheng Qixing <zhengqixing@huawei.com>

During raid resync, if a disk becomes faulty, the operation is
briefly interrupted. The MD_RECOVERY_RECOVER flag triggered by
the disk failure causes sync_action to incorrectly show "recover"
instead of "resync". The same issue affects reshape operations.

Reproduction steps:
  mdadm -Cv /dev/md1 -l1 -n4 -e1.2 /dev/sd{a..d} // -> resync happened
  mdadm -f /dev/md1 /dev/sda                     // -> resync interrupted
  cat sync_action
  -> recover

Add progress checks in md_sync_action() for resync/recover/reshape
to ensure the interface correctly reports the actual operation type.

Fixes: 4b10a3bc67c1 ("md: ensure resync is prioritized over recovery")
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
---
 drivers/md/md.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4663e172864e..5b6ab4ef042e 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4843,9 +4843,34 @@ static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors)
 	       rdev->recovery_offset < sectors;
 }
 
+static enum sync_action md_get_active_sync_action(struct mddev *mddev)
+{
+	struct md_rdev *rdev;
+	bool is_recover = false;
+
+	if (mddev->resync_offset < MaxSector)
+		return ACTION_RESYNC;
+
+	if (mddev->reshape_position != MaxSector)
+		return ACTION_RESHAPE;
+
+	rcu_read_lock();
+	rdev_for_each_rcu(rdev, mddev) {
+		if (rdev->raid_disk >= 0 &&
+		    rdev_needs_recovery(rdev, MaxSector)) {
+			is_recover = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return is_recover ? ACTION_RECOVER : ACTION_IDLE;
+}
+
 enum sync_action md_sync_action(struct mddev *mddev)
 {
 	unsigned long recovery = mddev->recovery;
+	enum sync_action active_action;
 
 	/*
 	 * frozen has the highest priority, means running sync_thread will be
@@ -4869,8 +4894,17 @@ enum sync_action md_sync_action(struct mddev *mddev)
 	    !test_bit(MD_RECOVERY_NEEDED, &recovery))
 		return ACTION_IDLE;
 
-	if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
-	    mddev->reshape_position != MaxSector)
+	/*
+	 * Check if any sync operation (resync/recover/reshape) is
+	 * currently active. This ensures that only one sync operation
+	 * can run at a time. Returns the type of active operation, or
+	 * ACTION_IDLE if none are active.
+	 */
+	active_action = md_get_active_sync_action(mddev);
+	if (active_action != ACTION_IDLE)
+		return active_action;
+
+	if (test_bit(MD_RECOVERY_RESHAPE, &recovery))
 		return ACTION_RESHAPE;
 
 	if (test_bit(MD_RECOVERY_RECOVER, &recovery))
-- 
2.39.2
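
The priority order that md_get_active_sync_action() introduces (resync, then reshape, then recover, else idle) can be checked outside the kernel. This is a rough userspace model, not the kernel code: the types, the flag bit numbers, the reduced `enum sync_action`, and the array-based device list are all assumptions made for illustration, and `active_sync_action()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
typedef uint64_t sector_t;
#define MaxSector (~(sector_t)0)

/* Only the values this sketch needs; the kernel enum has more. */
enum sync_action { ACTION_RESYNC, ACTION_RECOVER, ACTION_RESHAPE, ACTION_IDLE };

/* Bit numbers are arbitrary here; only the names mirror md.h. */
enum flag_bits { In_sync, Faulty, Journal };

struct md_rdev {
	unsigned long flags;
	int raid_disk;
	sector_t recovery_offset;
};

struct mddev {
	sector_t resync_offset;
	sector_t reshape_position;
	const struct md_rdev *rdevs;	/* array instead of the kernel's RCU list */
	size_t nr_rdevs;
};

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static bool rdev_needs_recovery(const struct md_rdev *rdev, sector_t sectors)
{
	return !test_bit(Journal, &rdev->flags) &&
	       !test_bit(Faulty, &rdev->flags) &&
	       !test_bit(In_sync, &rdev->flags) &&
	       rdev->recovery_offset < sectors;
}

/*
 * Hypothetical model of md_get_active_sync_action(): an unfinished
 * resync wins over a reshape, which wins over a pending recovery.
 */
static enum sync_action active_sync_action(const struct mddev *mddev)
{
	if (mddev->resync_offset < MaxSector)
		return ACTION_RESYNC;

	if (mddev->reshape_position != MaxSector)
		return ACTION_RESHAPE;

	for (size_t i = 0; i < mddev->nr_rdevs; i++)
		if (mddev->rdevs[i].raid_disk >= 0 &&
		    rdev_needs_recovery(&mddev->rdevs[i], MaxSector))
			return ACTION_RECOVER;

	return ACTION_IDLE;
}
```

In this model, an array whose `resync_offset` is still below `MaxSector` reports ACTION_RESYNC even when one member is faulty, which corresponds to the situation in the reproduction steps.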



* Re: [PATCH v3 1/2] md: add helper rdev_needs_recovery()
  2025-08-14  1:57 ` [PATCH v3 1/2] md: add helper rdev_needs_recovery() Zheng Qixing
@ 2025-08-14  7:09   ` Li Nan
  2025-08-14 11:09     ` Zheng Qixing
  0 siblings, 1 reply; 5+ messages in thread
From: Li Nan @ 2025-08-14  7:09 UTC (permalink / raw)
  To: Zheng Qixing, song, yukuai3
  Cc: linux-raid, linux-kernel, pmenzel, yi.zhang, yangerkun, houtao1,
	zhengqixing



On 2025/8/14 9:57, Zheng Qixing wrote:
> From: Zheng Qixing <zhengqixing@huawei.com>
> 
> Add a helper for checking if an rdev needs recovery.
> 
> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
> ---
>   drivers/md/md.c | 18 ++++++++++--------
>   1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac85ec73a409..4663e172864e 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -4835,6 +4835,14 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
>   static struct md_sysfs_entry md_metadata =
>   __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
>   
> +static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors)
> +{
> +	return !test_bit(Journal, &rdev->flags) &&
> +	       !test_bit(Faulty, &rdev->flags) &&
> +	       !test_bit(In_sync, &rdev->flags) &&
> +	       rdev->recovery_offset < sectors;
> +}
> +

Every caller is already checking 'rdev->raid_disk >= 0'. Should we move it
into rdev_needs_recovery()?

>   enum sync_action md_sync_action(struct mddev *mddev)
>   {
>   	unsigned long recovery = mddev->recovery;
> @@ -8969,10 +8977,7 @@ static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
>   		rcu_read_lock();
>   		rdev_for_each_rcu(rdev, mddev)
>   			if (rdev->raid_disk >= 0 &&
> -			    !test_bit(Journal, &rdev->flags) &&
> -			    !test_bit(Faulty, &rdev->flags) &&
> -			    !test_bit(In_sync, &rdev->flags) &&
> -			    rdev->recovery_offset < start)
> +			    rdev_needs_recovery(rdev, start))
>   				start = rdev->recovery_offset;
>   		rcu_read_unlock();
>   
> @@ -9333,10 +9338,7 @@ void md_do_sync(struct md_thread *thread)
>   				rdev_for_each_rcu(rdev, mddev)
>   					if (rdev->raid_disk >= 0 &&
>   					    mddev->delta_disks >= 0 &&
> -					    !test_bit(Journal, &rdev->flags) &&
> -					    !test_bit(Faulty, &rdev->flags) &&
> -					    !test_bit(In_sync, &rdev->flags) &&
> -					    rdev->recovery_offset < mddev->curr_resync)
> +					    rdev_needs_recovery(rdev, mddev->curr_resync))
>   						rdev->recovery_offset = mddev->curr_resync;
>   				rcu_read_unlock();
>   			}

-- 
Thanks,
Nan
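
One way to read the suggestion above is to fold the `raid_disk` test into the helper so that callers no longer repeat it. The sketch below is an illustration only and not taken from any posted revision of the patch; the types, flag bit numbers, and `test_bit()` are simplified userspace stand-ins for the kernel definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-ins; these are NOT the kernel definitions. */
typedef uint64_t sector_t;

/* Bit numbers are arbitrary here; only the names mirror md.h. */
enum flag_bits { In_sync, Faulty, Journal };

struct md_rdev {
	unsigned long flags;
	int raid_disk;
	sector_t recovery_offset;
};

/* Non-atomic stand-in for the kernel's test_bit(). */
static bool test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/*
 * Hypothetical variant of the helper with the raid_disk check moved
 * inside, so a device that is not an active array member
 * (raid_disk < 0) never counts as needing recovery.
 */
static bool rdev_needs_recovery(const struct md_rdev *rdev, sector_t sectors)
{
	return rdev->raid_disk >= 0 &&
	       !test_bit(Journal, &rdev->flags) &&
	       !test_bit(Faulty, &rdev->flags) &&
	       !test_bit(In_sync, &rdev->flags) &&
	       rdev->recovery_offset < sectors;
}
```

With this shape, a call site such as the one in md_sync_position() would reduce to a single `rdev_needs_recovery(rdev, start)` condition.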



* Re: [PATCH v3 1/2] md: add helper rdev_needs_recovery()
  2025-08-14  7:09   ` Li Nan
@ 2025-08-14 11:09     ` Zheng Qixing
  0 siblings, 0 replies; 5+ messages in thread
From: Zheng Qixing @ 2025-08-14 11:09 UTC (permalink / raw)
  To: Li Nan, song, yukuai3
  Cc: linux-raid, linux-kernel, pmenzel, yi.zhang, yangerkun, houtao1,
	zhengqixing

Hi,


On 2025/8/14 15:09, Li Nan wrote:
>
>
> On 2025/8/14 9:57, Zheng Qixing wrote:
>> From: Zheng Qixing <zhengqixing@huawei.com>
>>
>> Add a helper for checking if an rdev needs recovery.
>>
>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
>> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
>> Reviewed-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.c | 18 ++++++++++--------
>>   1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index ac85ec73a409..4663e172864e 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -4835,6 +4835,14 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
>>   static struct md_sysfs_entry md_metadata =
>>   __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
>>
>> +static bool rdev_needs_recovery(struct md_rdev *rdev, sector_t sectors)
>> +{
>> +    return !test_bit(Journal, &rdev->flags) &&
>> +           !test_bit(Faulty, &rdev->flags) &&
>> +           !test_bit(In_sync, &rdev->flags) &&
>> +           rdev->recovery_offset < sectors;
>> +}
>> +
>
> Every caller is already checking 'rdev->raid_disk >= 0'. Should we move it
> into rdev_needs_recovery()?
>

Good point, thanks.


Qixing


>>   enum sync_action md_sync_action(struct mddev *mddev)
>>   {
>>       unsigned long recovery = mddev->recovery;
>> @@ -8969,10 +8977,7 @@ static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
>>           rcu_read_lock();
>>           rdev_for_each_rcu(rdev, mddev)
>>               if (rdev->raid_disk >= 0 &&
>> -                !test_bit(Journal, &rdev->flags) &&
>> -                !test_bit(Faulty, &rdev->flags) &&
>> -                !test_bit(In_sync, &rdev->flags) &&
>> -                rdev->recovery_offset < start)
>> +                rdev_needs_recovery(rdev, start))
>>                   start = rdev->recovery_offset;
>>           rcu_read_unlock();
>>
>> @@ -9333,10 +9338,7 @@ void md_do_sync(struct md_thread *thread)
>>                   rdev_for_each_rcu(rdev, mddev)
>>                       if (rdev->raid_disk >= 0 &&
>>                           mddev->delta_disks >= 0 &&
>> -                        !test_bit(Journal, &rdev->flags) &&
>> -                        !test_bit(Faulty, &rdev->flags) &&
>> -                        !test_bit(In_sync, &rdev->flags) &&
>> -                        rdev->recovery_offset < mddev->curr_resync)
>> +                        rdev_needs_recovery(rdev, mddev->curr_resync))
>>                           rdev->recovery_offset = mddev->curr_resync;
>>                   rcu_read_unlock();
>>               }
>


