From: Coly Li <colyli@suse.de>
To: jes@trained-monkey.org
Cc: linux-raid@vger.kernel.org, Heming Zhao <heming.zhao@suse.com>,
Coly Li <colyli@suse.de>
Subject: [PATCH 3/6] mdadm/super1: restore commit 45a87c2f31335 to fix clustered slot issue
Date: Tue, 21 Jun 2022 00:10:40 +0800
Message-ID: <20220620161043.3661-4-colyli@suse.de>
In-Reply-To: <20220620161043.3661-1-colyli@suse.de>
From: Heming Zhao <heming.zhao@suse.com>
Commit 9d67f6496c71 ("mdadm:check the nodes when operate clustered
array") modified the assignment logic for st->nodes in write_bitmap1(),
which introduced a bitmap slot issue:
load_super1() does not set up supertype.nodes, so a spare disk ends up
with info for only one slot. The kernel's md_bitmap_load_sb() then
reads wrong bitmap slot data.
There are two ways to fix this issue:
1> Revert the related code of commit 9d67f6496c71 and restore the code
from the earlier commit 45a87c2f31335 ("super1: add more checks for
NodeNumUpdate option").
Under the current code logic, st->nodes can be 0 or 1. For example,
when adding a spare disk, nothing initializes st->nodes, so its
value is zero.
2> Keep 9d67f6496c71 and add ->nodes handling to load_super1(), so that
load_super1() sets st->nodes when the bitmap version is
BITMAP_MAJOR_CLUSTERED.
Under the current mdadm code logic, load_super1() is called many
times, so any new code in it makes mdadm take longer to run.
I would also prefer to keep clustered code from spreading into every
corner as far as possible.
So I used method <1> to fix this issue.
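For illustration only, the order of the checks that method <1> restores can be sketched as a standalone function. This is not the code in super1.c: the enum, its member names, and classify_nodes() are hypothetical, and both counts are assumed to be in host byte order, whereas the patch compares __cpu_to_le32(st->nodes) against the on-disk bms->nodes.

```c
#include <stdint.h>

/* Hypothetical outcome names, used only in this sketch. */
enum nodes_check {
	NODES_INVALID,     /* requested == 1: cluster-md needs at least two nodes */
	NODES_UNSPECIFIED, /* requested == 0: "--nodes" not given, e.g. adding a disk */
	NODES_NOT_GROWN,   /* requested is below the count already on disk */
	NODES_OTHER        /* anything else; handled by code outside this hunk */
};

/*
 * Mirror the branch order of the restored checks.
 * Assumption: both values are plain host-endian integers.
 */
static enum nodes_check classify_nodes(uint32_t requested, uint32_t on_disk)
{
	if (requested == 1)
		return NODES_INVALID;
	if (requested == 0)
		return NODES_UNSPECIFIED;
	if (requested < on_disk)
		return NODES_NOT_GROWN;
	return NODES_OTHER;
}
```

With an on-disk count of 4, classify_nodes(0, 4) yields NODES_UNSPECIFIED, which corresponds to the spare-disk case described above where st->nodes is never initialized.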
How to trigger:
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdc
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -a /dev/md0 /dev/sdc
mdadm /dev/md0 --fail /dev/sda
mdadm /dev/md0 --remove /dev/sda
mdadm -Ss
mdadm -A /dev/md0 /dev/sdb /dev/sdc
The current output of "mdadm -X /dev/sdc"
(correct output should show info for 4 slots, the default):
```
Filename : /dev/sdc
Magic : 6d746962
Version : 5
UUID : a74642f8:a6b1fba8:58e1f8db:cfe7b082
Events : 29
Events Cleared : 0
State : OK
Chunksize : 64 MB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 306176 (299.00 MiB 313.52 MB)
Bitmap : 5 bits (chunks), 5 dirty (100.0%)
```
Later mdadm operations then make the kernel log these error messages
(triggered by "mdadm -A /dev/md0 /dev/sdb /dev/sdc"):
```
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 1
kernel: md-cluster: Could not gather bitmaps from slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 3
kernel: md-cluster: Could not gather bitmaps from slot 3
kernel: md-cluster: failed to gather all resyn infos
kernel: md0: detected capacity change from 0 to 612352
```
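The log above is consistent with only slot 0 holding a valid bitmap superblock. The toy model below is not kernel code: count_bad_slots() and the in-memory array of per-slot magic values are assumptions made for illustration. It only shows how a per-slot magic check behaves when one slot out of four has been written. The magic value is the one printed by "mdadm -X" above.

```c
#include <stddef.h>
#include <stdint.h>

/* Value shown in the "Magic" field of the mdadm -X output. */
#define BITMAP_MAGIC 0x6d746962u

/*
 * Hypothetical helper: count the slots whose stored magic does not
 * match BITMAP_MAGIC. Each array element stands in for the magic
 * field of one slot's bitmap superblock.
 */
static size_t count_bad_slots(const uint32_t *slot_magic, size_t nslots)
{
	size_t bad = 0;

	for (size_t i = 0; i < nslots; i++) {
		if (slot_magic[i] != BITMAP_MAGIC)
			bad++;
	}
	return bad;
}
```

For four slots where only the first carries the magic, the helper reports three bad slots, matching the three "bad magic" messages for slots 1 to 3.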
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
super1.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/super1.c b/super1.c
index e3e2f954..3a0c69fd 100644
--- a/super1.c
+++ b/super1.c
@@ -2674,7 +2674,17 @@ static int write_bitmap1(struct supertype *st, int fd, enum bitmap_update update
 		}
 
 		if (bms->version == BITMAP_MAJOR_CLUSTERED) {
-			if (__cpu_to_le32(st->nodes) < bms->nodes) {
+			if (st->nodes == 1) {
+				/* the parameter for nodes is not valid */
+				pr_err("Warning: cluster-md at least needs two nodes\n");
+				return -EINVAL;
+			} else if (st->nodes == 0) {
+				/*
+				 * parameter "--nodes" is not specified, (eg, add a disk to
+				 * clustered raid)
+				 */
+				break;
+			} else if (__cpu_to_le32(st->nodes) < bms->nodes) {
 				/*
 				 * Since the nodes num is not increased, no
 				 * need to check the space enough or not,
--
2.35.3
Thread overview:
2022-06-20 16:10 [PATCH 0/6] mdadm-CI for-jes/20220620: patches for merge Coly Li
2022-06-20 16:10 ` [PATCH 1/6] Revert "mdadm: fix coredump of mdadm --monitor -r" Coly Li
2022-06-20 16:10 ` [PATCH 2/6] util: replace ioctl use with function Coly Li
2022-06-20 16:10 ` Coly Li [this message]
2022-06-20 16:10 ` [PATCH 4/6] imsm: introduce get_disk_slot_in_dev() Coly Li
2022-06-20 16:10 ` [PATCH 5/6] imsm: use same slot across container Coly Li
2022-06-20 16:10 ` [PATCH 6/6] imsm: block changing slots during creation Coly Li