From: Flynn <flynn@kodachi.com>
To: linux-lvm@redhat.com
Subject: [linux-lvm] PV that's present marked as missing?
Date: Mon, 19 Aug 2013 00:55:31 -0400
Message-ID: <5211A543.9040603@kodachi.com>
I have a fairly complex LVM2/mdadm setup that I'm in the middle of
turning into a simpler setup. I made a mistake along the way, though,
and have landed in a confusing place.
This is kind of long, and I apologize for that -- trying to describe
completely how I got here. The complex setup I started with:
    /dev/md5 is a RAID5 of /dev/sd{b,d,e,f}5
    /dev/md6 is a RAID5 of /dev/sd{b,d,e,f}6
    etc on up to /dev/md14
    /dev/md99 is a RAID1 of /dev/sdg and /dev/sdh
/dev/md{5..14} plus /dev/md99 are all assembled into a volume group
(creatively called vglinux), which has three logical volumes. Only one,
lvstore, is relevant: the other two are getting destroyed as part of the
simplification.
The goal is to end with a RAID6 of /dev/sd{b,d,e,f,g,h}, and no
multiple-partition madness (it's there from the days of old, when mdadm
couldn't reshape arrays). The next step was to free up /dev/sdf,
starting with
    pvmove /dev/md5
    reshape md5 as a RAID5 of /dev/sd{b,d,e}5 (freeing /dev/sdf5)
    lather, rinse, and repeat for the other mds.
The VG has plenty of free space for this; it's slow, but that's OK.
The problem: while md{5,6,7} went fine, I botched the pvmove for md8 and
ended up starting to reshape the array _before the pvmove happened._
Specifically, I did all of these:
    mdadm --grow /dev/md8 --array-size 292730880   # it was 439489920
    pvresize /dev/md8
    mdadm --grow /dev/md8 --raid-devices 3 --backup-file ~/backup
_without_ having moved data off. Once I figured out what was going on,
I did
    umount (all the filesystems in the VG)
    vgchange -a n vglinux
    mdadm --stop /dev/md8
which halted the reshape about 5% of the way done. Then (with some help
from NeilBrown and a buncha experiments with loopback devices) I used
the most recent mdadm snapshot to revert the reshape.
    mdadm --assemble --update=revert-reshape /dev/md8 /dev/sd{b,d,e,f}8
NOTE WELL: I KNOW THAT THIS HAS DESTROYED SOME DATA. That's not the
question. [ :) ] There will be damage, yes, I know that, and I should
be able to detect that and correct it.
At this point /dev/md8 is back to 4 devices, array-size 439489920, and
can be started. Next step is to fsck lvstore to get a handle on the
damage before proceeding -- but vgchange -a y vglinux doesn't start lvstore:
    # vgchange -a y vglinux
      Incorrect metadata area header checksum
      Refusing activation of partial LV lvstore. Use --partial to override.
      2 logical volume(s) in volume group "vglinux" now active
(The two LVs that it did start are the irrelevant ones.)
So things are confusing:
First, it'd be awesome to know where exactly that "Incorrect metadata area
header checksum" is coming from. Maybe, y'know, a device to look at, or
some further hint of where to start tracking things down? [ :) ]
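One way to narrow this down might be to check each PV on its own and see which one produces the message. The sketch below does that. It assumes that pvck (LVM2's per-PV metadata check) emits the same "Incorrect metadata area header checksum" text that vgchange does, which is not confirmed here. The names find_bad_checksum and CHECK_CMD are made up for illustration.

```shell
# Print every device whose per-device check mentions the checksum error.
# CHECK_CMD defaults to pvck; it is overridable so the loop can be tried
# with a different checker (or a stub) without touching real devices.
find_bad_checksum() {
    check_cmd=${CHECK_CMD:-pvck}
    for dev in "$@"; do
        # Fold stderr into stdout: LVM prints this kind of complaint on stderr.
        if "$check_cmd" "$dev" 2>&1 |
            grep 'Incorrect metadata area header checksum' >/dev/null; then
            printf '%s\n' "$dev"
        fi
    done
}

# Possible use, as root (bash brace expansion):
#   find_bad_checksum /dev/md{5..14} /dev/md99
```

If no device is printed, the message is probably coming from somewhere other than the listed PVs, for example from a raw partition that LVM also scans.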
Second, if I look in /etc/lvm/archive for vglinux's latest, I find this
bit buried in there:
    pv2 {
        id = "4F3rcV-sS8p-E6t2-hjGm-gLVB-C6wl-4McUhc"
        device = "/dev/md8"     # Hint only
        status = ["ALLOCATABLE"]
        flags = ["MISSING"]
        dev_size = 878979840    # 419.13 Gigabytes
        pe_start = 384
        pe_count = 107297       # 419.129 Gigabytes
    }
which seems to be why it's complaining about 'partial LV lvstore'. But,
uh, 4F3rcV-sS8p-E6t2-hjGm-gLVB-C6wl-4McUhc _is_ the UUID of /dev/md8:
    # pvs -o +uuid --unit=4m
      Incorrect metadata area header checksum
      Unable to find "/dev/sdb5" in volume group "vglinux"
      PV         VG      Fmt  Attr PSize      PFree      PV UUID
      /dev/md10  vglinux lvm2 a-   107297.00U 0U         LO5KoK-1AjU-iXb0-fkLo-lUKR-Yo9P-wDZQPP
      /dev/md11  vglinux lvm2 a-   107297.00U 0U         gBGcjz-DmIb-pAj9-CWnb-jopW-Wd19-iIs1ur
      /dev/md125 vglinux lvm2 a-   107297.00U 8607.00U   5JlNTx-yT14-271r-NMAm-a17W-FKe4-pXoOW4
      /dev/md13  vglinux lvm2 a-   107297.00U 0U         MJlTQO-lCyE-bP80-FlvE-m1nM-DD2x-qhlIQK
      /dev/md14  vglinux lvm2 a-   107297.00U 0U         XDpA1D-kxbq-SEck-ozTl-rP4Y-bMws-MBwNNf
      /dev/md5           lvm2 a-   71467.50U  71467.50U  39oFQs-9tlf-ywT4-YgtX-nfcm-rAEq-pAPsdR
      /dev/md6   vglinux lvm2 a-   71531.00U  35856.00U  ufKOpM-02YG-12rJ-mt1r-DbEm-xoJu-onzEtr
      /dev/md7   vglinux lvm2 a-   71531.00U  71531.00U  NpAKLQ-4Irn-wDA4-0ZDI-ydW6-eY9n-rDp50e
      /dev/md8   vglinux lvm2 a-   107297.00U 0U         4F3rcV-sS8p-E6t2-hjGm-gLVB-C6wl-4McUhc
      /dev/md9   vglinux lvm2 a-   107297.00U 0U         hRmTMN-Mx17-uUEX-rF1Z-hQ1J-8iDd-S7S2t7
      /dev/md99  vglinux lvm2 a-   357667.00U 178748.00U jUgxoF-mvwR-6C8A-wzjP-K0Xu-MPf8-XewqUE
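As a sanity check, the md8 sizes quoted in this message can be cross-checked with a little shell arithmetic. This is a sketch under two assumptions: that mdadm's --array-size is in KiB (so one KiB is two 512-byte sectors), and that the VG uses 4 MiB extents, inferred from the --unit=4m column showing whole numbers. The helper names are made up.

```shell
# 1 KiB = two 512-byte sectors.
kib_to_sectors() { echo $(( $1 * 2 )); }

# Whole extents that fit on a device.
# Args: device size in KiB, pe_start in sectors.
# Assumes 4 MiB extents, i.e. 8192 sectors each.
extents_in() { echo $(( ($1 * 2 - $2) / 8192 )); }

kib_to_sectors 439489920   # 878979840, matches dev_size in the archive
extents_in 439489920 384   # 107297, matches pe_count and the pvs PSize
extents_in 292730880 384   # 71467, what the shrunken array could hold
```

If those assumptions hold, the full-size numbers are self-consistent, and the shrunken array had room for only 71467 of md8's 107297 extents, all of which are allocated (PFree is 0U).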
Finally, note that "Unable to find /dev/sdb5 in vglinux" complaint, and
note that /dev/md5 is _not_ listed as part of vglinux. md5 shouldn't be
part of vglinux right now, and sdb5 has never been a PV on its own (it's
only ever been a part of the md5 PV). WTFO? As it happens, I didn't
actually reshape /dev/md5: after the pvmove, I shredded the md and
recreated it instead. I suppose it's possible that I forgot to vgreduce
before doing that?
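One way to test the forgotten-vgreduce theory is to list every PV the archived metadata still records, with its UUID, device hint, and flags, and compare that against the pvs output above. The awk sketch below does this. It assumes each PV section looks like the pv2 excerpt quoted earlier (one key per line, section closed by a lone "}"), and list_archived_pvs is a made-up name.

```shell
# Print "name uuid device-hint flags" for each pvN { ... } section of an
# LVM metadata text file. A "-" in the flags column means no flags line.
list_archived_pvs() {
    awk '
        $1 ~ /^pv[0-9]+$/ && $2 == "{" {
            name = $1; id = ""; dev = ""; flags = "-"; in_pv = 1; next
        }
        in_pv && $1 == "id"     { id = $3;  gsub(/"/, "", id) }
        in_pv && $1 == "device" { dev = $3; gsub(/"/, "", dev) }
        in_pv && $1 == "flags"  {
            flags = $0
            sub(/^[^=]*=[ \t]*/, "", flags)   # keep only the value
            sub(/[ \t]*#.*$/, "", flags)      # drop any trailing comment
        }
        in_pv && $1 == "}" { print name, id, dev, flags; in_pv = 0 }
    ' "$1"
}

# Possible use, where "$archive" stands for the latest file under
# /etc/lvm/archive:
#   list_archived_pvs "$archive"
```

A UUID that appears in this list but not in the pvs output (or that pvs shows without a VG) would point at a PV that was wiped without being removed from the VG first.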
Googling and reading indicates that I need to clear that MISSING flag,
and that vgcfgrestore is the only tool for that job -- but editing that
archive file to remove the MISSING flag and trying vgcfgrestore with
that doesn't work:
    # vgcfgrestore --debug --verbose --test --file wtfvglinux vglinux
      Test mode: Metadata will NOT be updated.
      Incorrect metadata area header checksum
      Incorrect metadata area header checksum
      Restore failed.
      Test mode: Wiping internal cache
      Wiping internal VG cache
so, at this point, some guidance would be most welcome.
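For reference, the edit to the archive file amounts to something like the sketch below. It is illustrative, not necessarily the exact edit: it assumes the flags line holds only MISSING, as in the pv2 excerpt, and simply drops that line. The function name strip_missing_flag is made up, and the original archive file is left untouched.

```shell
# Print a copy of an LVM metadata text file with any line of the form
#     flags = ["MISSING"]
# removed. Lines carrying other flags as well are left alone.
strip_missing_flag() {
    sed -e '/^[[:space:]]*flags[[:space:]]*=[[:space:]]*\["MISSING"\][[:space:]]*$/d' "$1"
}

# Possible use, where "$archive" stands for the latest file under
# /etc/lvm/archive:
#   strip_missing_flag "$archive" > wtfvglinux
#   vgcfgrestore --debug --verbose --test --file wtfvglinux vglinux
```

Writing to a separate file keeps the archive copy available for comparison if the restore misbehaves.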
(Also note that before I did the revert-reshape, I dd'd
/dev/sd{b,d,e,f}8 to spare partitions as a backup. It may be relevant
that there are two copies of the metadata for md8's devices?)
Thanks very much,
Flynn
--
The trick is to keep breathing. (Garbage, from _Version 2.0_)