From: NeilBrown <neil@brown.name>
To: linux-raid@vger.kernel.org
Cc: Zhao Heming <heming.zhao@suse.com>, neilb@suse.com, song@kernel.org
Subject: Re: [PATCH v2] md-cluster: fix safemode_delay value when converting to clustered bitmap
Date: Fri, 17 Jul 2020 09:36:15 +1000
Message-ID: <87wo33vxqo.fsf@notabene.neil.brown.name>
In-Reply-To: <1594911043-16956-1-git-send-email-heming.zhao@suse.com>
On Thu, Jul 16 2020, Zhao Heming wrote:
> When an array is converted to a clustered bitmap, safe_mode_delay is not cleared; when it is converted back, the value is not restored.
> /sys/block/mdX/md/safe_mode_delay keeps its original value after the bitmap type changes.
> In safe_delay_store(), the code forbids setting mddev->safemode_delay when the array is clustered,
> so in a cluster-md environment the expected safemode_delay value is 0.
>
> reproduction steps:
> ```
> node1 # mdadm --zero-superblock /dev/sd{b,c,d}
> node1 # mdadm -C /dev/md0 -b internal -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc
> node1 # cat /sys/block/md0/md/safe_mode_delay
> 0.204
> node1 # mdadm -G /dev/md0 -b none
> node1 # mdadm --grow /dev/md0 --bitmap=clustered
> node1 # cat /sys/block/md0/md/safe_mode_delay
> 0.204 <== doesn't change, should be zero for cluster-md
>
> node1 # mdadm --zero-superblock /dev/sd{b,c,d}
> node1 # mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sdb /dev/sdc
> node1 # cat /sys/block/md0/md/safe_mode_delay
> 0.000
> node1 # mdadm -G /dev/md0 -b none
> node1 # cat /sys/block/md0/md/safe_mode_delay
> 0.000 <== doesn't change, should be the default value
> ```
>
> Neil said md_setup_cluster()/md_cluster_stop() are good places for the fix.
> After investigation, md_setup_cluster() is a good place to clear the value,
> but md_cluster_stop() is not its counterpart for restoring it.
> See the flow below:
> (user space)
> mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
> mdadm --grow /dev/md0 -b none
> (kernel space)
> SET_ARRAY_INFO
> update_array_info
> + mddev->bitmap_info.nodes = 0;
> + md_cluster_ops->leave(mddev)
> + md_bitmap_destroy
> md_bitmap_free // won't trigger md_cluster_stop() because bitmap_info.nodes was set to 0 above
>
> Signed-off-by: Zhao Heming <heming.zhao@suse.com>
> ---
> v2:
> - change setting path from location_store to md_setup_cluster
> - add restoring path
>
> ---
> drivers/md/md.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index f567f53..f082f5c 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -101,6 +101,8 @@ static int remove_and_add_spares(struct mddev *mddev,
> * count by 2 for every hour elapsed between read errors.
> */
> #define MD_DEFAULT_MAX_CORRECTED_READ_ERRORS 20
> +/* Default safemode delay: 200 msec */
> +#define DEFAULT_SAFEMODE_DELAY ((200 * HZ)/1000 +1)
> /*
> * Current RAID-1,4,5 parallel reconstruction 'guaranteed speed limit'
> * is 1000 KB/sec, so the extra system load does not show up that much.
> @@ -5982,7 +5984,7 @@ int md_run(struct mddev *mddev)
> if (mddev_is_clustered(mddev))
> mddev->safemode_delay = 0;
> else
> - mddev->safemode_delay = (200 * HZ)/1000 +1; /* 200 msec delay */
> + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
> mddev->in_sync = 1;
> smp_wmb();
> spin_lock(&mddev->lock);
> @@ -7361,6 +7363,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
>
> mddev->bitmap_info.nodes = 0;
> md_cluster_ops->leave(mddev);
> + mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
> }
> mddev_suspend(mddev);
> md_bitmap_destroy(mddev);
> @@ -8366,6 +8369,7 @@ int md_setup_cluster(struct mddev *mddev, int nodes)
> }
> spin_unlock(&pers_lock);
>
> + mddev->safemode_delay = 0;
> return md_cluster_ops->join(mddev, nodes);
->join can fail.
I'd rather you checked the error there, and only clear safemode_delay if
the return value is zero.
NeilBrown
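
A minimal sketch of the check suggested above, written as a standalone program fragment rather than as a patch. The struct definitions and the two mock_join_* callbacks are simplified stand-ins invented for illustration; only the names md_cluster_ops, join, and safemode_delay come from the quoted diff, and the function is named md_setup_cluster_sketch to make clear it is not the kernel's md_setup_cluster().

```c
/* Stand-in types: simplified mock-ups, NOT the kernel's real definitions. */
struct mddev {
	unsigned long safemode_delay;
};

struct md_cluster_operations {
	int (*join)(struct mddev *mddev, int nodes);
};

/* Hypothetical join callbacks, present only to exercise both paths. */
static int mock_join_ok(struct mddev *mddev, int nodes)
{
	(void)mddev;
	(void)nodes;
	return 0;	/* success */
}

static int mock_join_fail(struct mddev *mddev, int nodes)
{
	(void)mddev;
	(void)nodes;
	return -1;	/* any non-zero value stands for an error */
}

static struct md_cluster_operations *md_cluster_ops;

/*
 * Call ->join first, and clear safemode_delay only when it returned
 * zero, so a failed join leaves the previous value untouched.
 */
static int md_setup_cluster_sketch(struct mddev *mddev, int nodes)
{
	int ret = md_cluster_ops->join(mddev, nodes);

	if (!ret)
		mddev->safemode_delay = 0;
	return ret;
}
```

With md_cluster_ops pointing at an ops struct that uses mock_join_fail, the delay stays at whatever it was before the call; with mock_join_ok it becomes 0.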
> }
>
> --
> 1.8.3.1
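
For reference, the arithmetic behind DEFAULT_SAFEMODE_DELAY in the quoted patch can be worked through as below. This is a sketch under two assumptions: HZ is passed as a parameter (in the kernel it is a build-time constant), and the sysfs file reports the stored jiffies value converted back to milliseconds as delay * 1000 / HZ. Both helper names are hypothetical.

```c
/* Same expression as the macro in the patch, with HZ as an argument. */
static unsigned long default_safemode_delay(unsigned long hz)
{
	return (200 * hz) / 1000 + 1;
}

/* Assumed conversion from jiffies back to milliseconds for display. */
static unsigned long safemode_delay_msec(unsigned long delay, unsigned long hz)
{
	return (delay * 1000) / hz;
}
```

Under those assumptions, HZ=250 gives 51 jiffies, which converts back to 204 ms and would be consistent with the 0.204 shown in the reproduction steps; HZ=1000 gives 201 jiffies (201 ms) and HZ=100 gives 21 jiffies (210 ms).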
Thread overview: 3+ messages
2020-07-16 14:50 [PATCH v2] md-cluster: fix safemode_delay value when converting to clustered bitmap Zhao Heming
2020-07-16 23:36 ` NeilBrown [this message]
2020-07-17 3:56 ` heming.zhao