From: Chris Webb <chris@arachsys.com>
To: linux-raid@vger.kernel.org
Subject: Synchronous vs asynchronous mdadm operations
Date: Fri, 28 Nov 2008 16:27:03 +0000
Message-ID: <20081128162703.GA22404@arachsys.com>
I notice that some mdadm operations appear to be asynchronous. For instance,
mdadm --fail /dev/md/shelf.51000 /dev/mapper/slot.51000.1
mdadm --remove /dev/md/shelf.51000 /dev/mapper/slot.51000.1
will always fail at the --remove stage with
mdadm: hot remove failed for /dev/mapper/slot.51000.1: Device or resource busy
whereas adding a short sleep in between will make it successful.
Is there a 'standard' way to wait for this operation to complete or to
perform both steps in one go, other than something horrible like:
mdadm --fail /dev/md/shelf.51000 /dev/mapper/slot.51000.1
# md minor number, and major:minor of the failed slot (hex converted to decimal)
MD=$((`stat -c '%#T' -L /dev/md/shelf.51000`))
MAJOR=$((`stat -c '%#t' -L /dev/mapper/slot.51000.1`))
MINOR=$((`stat -c '%#T' -L /dev/mapper/slot.51000.1`))
for RD in /sys/block/md$MD/md/rd*; do
  [ -f $RD/block/dev ] || continue
  # skip array members which aren't the slot we just failed
  [ "`<$RD/block/dev`" = "$MAJOR:$MINOR" ] || continue
  # poll until md reports the member as faulty
  while [ "`<$RD/state`" != "faulty" ]; do sleep 0.1; done
done
mdadm --remove /dev/md/shelf.51000 /dev/mapper/slot.51000.1
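A blunter alternative to polling sysfs is simply to retry the --remove until it stops returning busy. The following is only a sketch: retry is a hypothetical helper name, not anything mdadm provides, and the 50 attempts and 0.1s delay in the usage comment are arbitrary assumptions. It also cannot tell "still busy" apart from any other failure.

```shell
#!/bin/sh
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, otherwise
# the exit status of the last failed attempt.
retry() {
  tries=$1 delay=$2
  shift 2
  while :; do
    "$@" && return 0
    status=$?
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return "$status"
    sleep "$delay"
  done
}

# Hypothetical usage with the devices from this message:
#   mdadm --fail /dev/md/shelf.51000 /dev/mapper/slot.51000.1
#   retry 50 0.1 mdadm --remove /dev/md/shelf.51000 /dev/mapper/slot.51000.1
```

This hides the race rather than explaining it, so it doesn't answer whether there is a proper way to wait.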
Also, is mdadm --stop asynchronous in the same way? If mdadm --stop succeeds
on one host and I immediately run mdadm --assemble on another host which is
able to access the same slots, am I at risk of corrupting the array?
The reason for the question is that I'm seeing occasional cases of arrays which
won't reassemble following such an operation. dmesg alleges there is an invalid
superblock for all of the six slots which were originally part of the array:
md: md126 stopped.
md: etherd/e24.1 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
md: etherd/e24.4 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
md: etherd/e24.5 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
md: etherd/e24.2 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
md: etherd/e24.3 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
md: etherd/e24.0 does not have a valid v1.1 superblock, not importing!
md: md_import_device returned -22
This array had been grown from 258MB slots to 13GB slots on the old host
shortly before it was stopped and reassembly was attempted on a new host, and
mdadm --examine on each of the slots shows a superblock reflecting the old
array size rather than the new one. Presumably there is other corruption too,
which I can't see.
# mdadm --examine /dev/etherd/e24.3
/dev/etherd/e24.3:
Magic : a92b4efc
Version : 1.1
Feature Map : 0x0
Array UUID : 94de9400:e0cb45f4:36e50a70:184a6875
Name : 3:shelf.24
Creation Time : Fri Nov 21 18:22:38 2008
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 27789808 (13.25 GiB 14.23 GB)
Array Size : 2107392 (1029.17 MiB 1078.98 MB)
Used Dev Size : 526848 (257.29 MiB 269.75 MB)
Data Offset : 16 sectors
Super Offset : 0 sectors
State : clean
Device UUID : d51aaa04:d51a524b:77b766d1:10eb7ec6
Update Time : Fri Nov 28 13:18:19 2008
Checksum : 9644dd7f - correct
Events : 22
Chunk Size : 4K
Array Slot : 5 (0, 1, 2, 3, 4, 5)
Array State : uuuuuU
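As a sanity check on the sizes above, the arithmetic can be sketched as below. The figures are copied from this --examine output and treated as 512-byte sectors, which is consistent with the MB values mdadm prints next to them. The "after a full grow" number is only an upper bound under the assumption that md would use the whole of each slot, ignoring any rounding.

```shell
#!/bin/sh
# Figures copied from the --examine output (all in 512-byte sectors).
raid_devices=6
used_dev_size=526848
avail_dev_size=27789808
array_size=2107392

# raid6 gives up two devices' worth of space to parity.
data_devices=$((raid_devices - 2))

echo "array size implied by used dev size: $((data_devices * used_dev_size)) sectors"
echo "array size recorded in superblock:   $array_size sectors"
# Upper bound only: assumes every sector of each slot is usable.
echo "upper bound after a full grow:       $((data_devices * avail_dev_size)) sectors"
```

The first two figures agree, so the superblock looks internally consistent at the old size rather than half-updated.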
The event count shown by mdadm --examine matches across all the slots.
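For anyone wanting to repeat that comparison, a rough sketch follows. distinct_events is a hypothetical helper which reads concatenated mdadm --examine output on stdin and prints how many different Events values it saw; it assumes the "Events : N" line format shown above.

```shell
#!/bin/sh
# Print the number of distinct "Events" values found on stdin.
# 1 means every examined device agrees; 0 means no Events line was seen.
distinct_events() {
  awk -F ' *: *' '
    $1 ~ /^ *Events$/ { if (!($2 in seen)) { seen[$2] = 1; n++ } }
    END { print n + 0 }
  '
}

# Hypothetical usage with the slots from this message:
#   mdadm --examine /dev/etherd/e24.[0-5] | distinct_events
```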
For what it's worth, the underlying aoe devices through which remote slots are
made visible to the old and new hosts should correctly handle synchronous
writes/fsync(). If the sync returns as completed, the written data should
genuinely be visible and consistent from every host which can see the device,
whether locally or remotely. (Obviously if I weren't respecting fsync()
behaviour at the network block device level, I'd expect all sorts of
consistency problems in moving an array from host to host like this.)
Cheers,
Chris.