* [PATCH] md/md: replace confusing resync progress 99.9% with new wording
@ 2017-12-27 5:50 bingjingc
2017-12-28 8:11 ` Guoqing Jiang
0 siblings, 1 reply; 3+ messages in thread
From: bingjingc @ 2017-12-27 5:50 UTC (permalink / raw)
To: linux-raid; +Cc: shli
When any sync action is finishing or interrupted, the progress will
show 99.9% before the sync thread is reaped. Many reporters has asked
what happened to the last blocks. It might be a confusing meaning for
users because the progress will be backward after the interrupted task
is restarted.
Take a raid5 reshape for example:
mdadm -C --assume-clean /dev/md0 -l5 -n3 /dev/loop[012]
echo 2000 > /proc/sys/dev/raid/speed_limit_max
echo 1000 > /proc/sys/dev/raid/speed_limit_min # slow down the speed
mdadm /dev/md0 -a /dev/loop3
mdadm /dev/md0 --grow -n4
while true
do
mdadm -S /dev/md0
sleep 3
mdadm -A /dev/md0 /dev/loop[0123]
done
And you can see the fake 99.9% progress by the following command:
> while true; do cat /proc/mdstat | grep reshape; done
This confusing state can be fixed by exposing the real state to users.
And I also correct the sync action type for display.
Reported-by: Edwin Lin <edwinlin@synology.com>
Reviewed-by: Allen Peng <allenpeng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
---
drivers/md/md.c | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e4dee0..74106c7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7593,6 +7593,14 @@ static int status_resync(struct seq_file *seq,
struct mddev *mddev)
sector_t rt;
int scale;
unsigned int per_milli;
+ char *sync_action;
+
+ sync_action = (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) ?
+ "reshape" :
+ (test_bit(MD_RECOVERY_CHECK, &mddev->recovery) ?
+ "check" :
+ (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
+ "resync" : "recovery")));
if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
@@ -7602,9 +7610,11 @@ static int status_resync(struct seq_file *seq,
struct mddev *mddev)
resync = mddev->curr_resync;
if (resync <= 3) {
- if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
+ if (test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
/* Still cleaning up */
- resync = max_sectors;
+ seq_printf(seq, "\t%s=CLEANING UP", sync_action);
+ return 1;
+ }
} else if (resync > max_sectors)
resync = max_sectors;
else
@@ -7612,13 +7622,13 @@ static int status_resync(struct seq_file *seq,
struct mddev *mddev)
if (resync == 0) {
if (mddev->recovery_cp < MaxSector) {
- seq_printf(seq, "\tresync=PENDING");
+ seq_printf(seq, "\t%s=PENDING", sync_action);
return 1;
}
return 0;
}
if (resync < 3) {
- seq_printf(seq, "\tresync=DELAYED");
+ seq_printf(seq, "\t%s=DELAYED", sync_action);
return 1;
}
@@ -7648,12 +7658,7 @@ static int status_resync(struct seq_file *seq,
struct mddev *mddev)
seq_printf(seq, "] ");
}
seq_printf(seq, " %s =%3u.%u%% (%llu/%llu)",
- (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)?
- "reshape" :
- (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)?
- "check" :
- (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
- "resync" : "recovery"))),
+ sync_action,
per_milli/10, per_milli % 10,
(unsigned long long) resync/2,
(unsigned long long) max_sectors/2);
--
2.7.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] md/md: replace confusing resync progress 99.9% with new wording
2017-12-27 5:50 [PATCH] md/md: replace confusing resync progress 99.9% with new wording bingjingc
@ 2017-12-28 8:11 ` Guoqing Jiang
2017-12-29 6:15 ` bingjingc
0 siblings, 1 reply; 3+ messages in thread
From: Guoqing Jiang @ 2017-12-28 8:11 UTC (permalink / raw)
To: bingjingc, linux-raid; +Cc: shli
On 12/27/2017 01:50 PM, bingjingc wrote:
>
> When any sync action is finishing or interrupted, the progress will
> show 99.9% before the sync thread is reaped. Many reporters has asked
> what happened to the last blocks. It might be a confusing meaning for
> users because the progress will be backward after the interrupted task
> is restarted.
>
> Take a raid5 reshape for example:
> mdadm -C --assume-clean /dev/md0 -l5 -n3 /dev/loop[012]
> echo 2000 > /proc/sys/dev/raid/speed_limit_max
> echo 1000 > /proc/sys/dev/raid/speed_limit_min # slow down the speed
> mdadm /dev/md0 -a /dev/loop3
> mdadm /dev/md0 --grow -n4
> while true
> do
> mdadm -S /dev/md0
> sleep 3
> mdadm -A /dev/md0 /dev/loop[0123]
> done
>
> And you can see the fake 99.9% progress by the following command:
>> while true; do cat /proc/mdstat | grep reshape; done
Seems it happened when the reshaping is interrupted by issue stop array cmd,
if so, why not just check MD_RECOVERY_INTR in status_resync()?
Thanks,
Guoqing
>>
>
> This confusing state can be fixed by exposing the real state to users.
> And I also correct the sync action type for display.
>
> Reported-by: Edwin Lin <edwinlin@synology.com>
> Reviewed-by: Allen Peng <allenpeng@synology.com>
> Signed-off-by: BingJing Chang <bingjingc@synology.com>
> ---
> drivers/md/md.c | 25 +++++++++++++++----------
> 1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 4e4dee0..74106c7 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7593,6 +7593,14 @@ static int status_resync(struct seq_file *seq,
> struct mddev *mddev)
> sector_t rt;
> int scale;
> unsigned int per_milli;
> + char *sync_action;
> +
> + sync_action = (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) ?
> + "reshape" :
> + (test_bit(MD_RECOVERY_CHECK, &mddev->recovery) ?
> + "check" :
> + (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
> + "resync" : "recovery")));
>
> if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
> test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
> @@ -7602,9 +7610,11 @@ static int status_resync(struct seq_file *seq,
> struct mddev *mddev)
>
> resync = mddev->curr_resync;
> if (resync <= 3) {
> - if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
> + if (test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
> /* Still cleaning up */
> - resync = max_sectors;
> + seq_printf(seq, "\t%s=CLEANING UP", sync_action);
> + return 1;
> + }
> } else if (resync > max_sectors)
> resync = max_sectors;
> else
> @@ -7612,13 +7622,13 @@ static int status_resync(struct seq_file *seq,
> struct mddev *mddev)
>
> if (resync == 0) {
> if (mddev->recovery_cp < MaxSector) {
> - seq_printf(seq, "\tresync=PENDING");
> + seq_printf(seq, "\t%s=PENDING", sync_action);
> return 1;
> }
> return 0;
> }
> if (resync < 3) {
> - seq_printf(seq, "\tresync=DELAYED");
> + seq_printf(seq, "\t%s=DELAYED", sync_action);
> return 1;
> }
>
> @@ -7648,12 +7658,7 @@ static int status_resync(struct seq_file *seq,
> struct mddev *mddev)
> seq_printf(seq, "] ");
> }
> seq_printf(seq, " %s =%3u.%u%% (%llu/%llu)",
> - (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)?
> - "reshape" :
> - (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)?
> - "check" :
> - (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
> - "resync" : "recovery"))),
> + sync_action,
> per_milli/10, per_milli % 10,
> (unsigned long long) resync/2,
> (unsigned long long) max_sectors/2);
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] md/md: replace confusing resync progress 99.9% with new wording
2017-12-28 8:11 ` Guoqing Jiang
@ 2017-12-29 6:15 ` bingjingc
0 siblings, 0 replies; 3+ messages in thread
From: bingjingc @ 2017-12-29 6:15 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: linux-raid, shli, linux-raid-owner
Guoqing Jiang 於 2017-12-28 16:11 寫到:
> On 12/27/2017 01:50 PM, bingjingc wrote:
>>
>> When any sync action is finishing or interrupted, the progress will
>> show 99.9% before the sync thread is reaped. Many reporters has asked
>> what happened to the last blocks. It might be a confusing meaning for
>> users because the progress will be backward after the interrupted task
>> is restarted.
>>
>> Take a raid5 reshape for example:
>> mdadm -C --assume-clean /dev/md0 -l5 -n3 /dev/loop[012]
>> echo 2000 > /proc/sys/dev/raid/speed_limit_max
>> echo 1000 > /proc/sys/dev/raid/speed_limit_min # slow down the speed
>> mdadm /dev/md0 -a /dev/loop3
>> mdadm /dev/md0 --grow -n4
>> while true
>> do
>> mdadm -S /dev/md0
>> sleep 3
>> mdadm -A /dev/md0 /dev/loop[0123]
>> done
>>
>> And you can see the fake 99.9% progress by the following command:
>>> while true; do cat /proc/mdstat | grep reshape; done
>
> Seems it happened when the reshaping is interrupted by issue stop array
> cmd,
> if so, why not just check MD_RECOVERY_INTR in status_resync()?
>
Actually, 99.9% is also not a clear statement for a finishing case. The
last
block in array has already been processed. No blocks left are not
processed.
The array just waits for the daemon to reap it. Maybe it can be
displayed
as 100% (CLEANING UP). However, I just let it shows "CLEANING UP" here.
Thanks,
BingJing
> Thanks,
> Guoqing
>
>>>
>>
>> This confusing state can be fixed by exposing the real state to users.
>> And I also correct the sync action type for display.
>>
>> Reported-by: Edwin Lin <edwinlin@synology.com>
>> Reviewed-by: Allen Peng <allenpeng@synology.com>
>> Signed-off-by: BingJing Chang <bingjingc@synology.com>
>> ---
>> drivers/md/md.c | 25 +++++++++++++++----------
>> 1 file changed, 15 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 4e4dee0..74106c7 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -7593,6 +7593,14 @@ static int status_resync(struct seq_file *seq,
>> struct mddev *mddev)
>> sector_t rt;
>> int scale;
>> unsigned int per_milli;
>> + char *sync_action;
>> +
>> + sync_action = (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) ?
>> + "reshape" :
>> + (test_bit(MD_RECOVERY_CHECK, &mddev->recovery) ?
>> + "check" :
>> + (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
>> + "resync" : "recovery")));
>>
>> if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
>> test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
>> @@ -7602,9 +7610,11 @@ static int status_resync(struct seq_file *seq,
>> struct mddev *mddev)
>>
>> resync = mddev->curr_resync;
>> if (resync <= 3) {
>> - if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
>> + if (test_bit(MD_RECOVERY_DONE, &mddev->recovery)) {
>> /* Still cleaning up */
>> - resync = max_sectors;
>> + seq_printf(seq, "\t%s=CLEANING UP", sync_action);
>> + return 1;
>> + }
>> } else if (resync > max_sectors)
>> resync = max_sectors;
>> else
>> @@ -7612,13 +7622,13 @@ static int status_resync(struct seq_file *seq,
>> struct mddev *mddev)
>>
>> if (resync == 0) {
>> if (mddev->recovery_cp < MaxSector) {
>> - seq_printf(seq, "\tresync=PENDING");
>> + seq_printf(seq, "\t%s=PENDING", sync_action);
>> return 1;
>> }
>> return 0;
>> }
>> if (resync < 3) {
>> - seq_printf(seq, "\tresync=DELAYED");
>> + seq_printf(seq, "\t%s=DELAYED", sync_action);
>> return 1;
>> }
>>
>> @@ -7648,12 +7658,7 @@ static int status_resync(struct seq_file *seq,
>> struct mddev *mddev)
>> seq_printf(seq, "] ");
>> }
>> seq_printf(seq, " %s =%3u.%u%% (%llu/%llu)",
>> - (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)?
>> - "reshape" :
>> - (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)?
>> - "check" :
>> - (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?
>> - "resync" : "recovery"))),
>> + sync_action,
>> per_milli/10, per_milli % 10,
>> (unsigned long long) resync/2,
>> (unsigned long long) max_sectors/2);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid"
> in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-12-29 6:15 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-12-27 5:50 [PATCH] md/md: replace confusing resync progress 99.9% with new wording bingjingc
2017-12-28 8:11 ` Guoqing Jiang
2017-12-29 6:15 ` bingjingc
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).