* Re: [linux-lvm] progress, but... - re. fixing LVM/md snafu
2009-04-05 18:32 ` [linux-lvm] " Jayson Vantuyl
@ 2009-04-05 21:12 ` Miles Fidelman
2009-04-06 14:17 ` Miles Fidelman
1 sibling, 0 replies; 6+ messages in thread
From: Miles Fidelman @ 2009-04-05 21:12 UTC
To: LVM general discussion and development
Jayson,
This is VERY helpful. Thanks!
Miles
Jayson Vantuyl wrote:
> Miles,
>
> It seems like what's probably happened is that LVM detected the raw
> device instead of the MD device at some point early in the boot
> process. This may be because the MD detection happened after LVM
> setup. I'm unsure if it's possible for LVM to "steal" the device from MD.
>
> Depending on your distribution, this may require different things to
> fix. Stop worrying about downtime. If the data is important, just
> don't worry about downtime. If downtime is really important, build a
> second machine, get it working right, and transfer the data. Being in
> a hurry and attempting to "optimize" the recovery process is a really
> good way to lose the data.
>
> Assuming that you're going to try to fix this setup, I'd start out
> with a backup. This is critical. Everybody always says to do a backup.
> Nobody ever does it. Really, do one. Get an S3 account, use an S3
> backup utility. There's just not an excuse these days. Your data is
> one-MD-mistake away from oblivion.
>
> So, right now MD should have sda/sdb but only has sda. sdb is now
> newer than sda and may have important data if this server stores
> anything like that. The challenge is that, according to MD, sda is
> newer. Since MD isn't handling writes to sdb, it won't be updating its
> metadata to know that it's newer. There are three options that I can
> think of, all ugly in some way. Pick one of:
>
> 1. Destroy the MD. Create a new one with the same UUID and sdb3 as the
> source. (which you listed, the UUID part can trip you up)
> 2. Sync the updated data from sdb3 onto md2. Wipe sdb3. Add it back
> into md2. (might be less downtime depending on data size, doesn't nuke MD)
> 3. Build another machine. Get it working right. Transfer data with
> Rsync. (least downtime, most expensive)
>
> In the first two cases, this only sets you up for it to break again.
> The core problem is figuring out what happened during boot. In a
> perfect world, you would just tell LVM to only consider MD devices.
> That's not hard, but it's complicated by the fact that you have LVM on
> /. This means that the configuration that's used is likely not the
> version on / but a copy of it that is made when you set up your boot
> ramdisk (a.k.a. initrd, or possibly an initramfs). Even if we get LVM
> locked down to use just MDs and get that config used at boot time,
> there's the possibility that the MD won't get assembled (since it
> already may not have been when LVM was first activated) and the system
> won't boot. Again, fraught with peril.
>
> If you want to fix the MD, first steps will be using a rescue LiveCD
> to boot up and do all of this. With that LiveCD, you can also adjust
> the LVM configuration and update the initrd (or whatever is used for
> boot). You may need to chroot into the system and/or trick the initrd
> into seeing the right devices. I don't really think I can walk you
> through this via an e-mail.
>
> The LVM part is pretty easy. Just set a filter line (you only get one,
> so disable any other filter lines) in <root of system>/etc/lvm/lvm.conf to:
>
>> filter = [ "a|^/dev/md.*$|", "r/.*/" ]
>
> That will prevent you from using anything but the MD.
>
> How you update the initrd with this information depends on the distro
> (and distro version). It's usually either some invocation of "mkinitrd" or
> some script that wraps it. It will get the LVM configuration available
> at boot-time. This *MIGHT* sort out the MD problem. It might not. If
> it doesn't, I'm not sure where to tell you to start. If mdadm is being
> used by your initrd, you'll need to tweak its configuration. If it's
> relying on MD autodetection, you might have turned that off in your
> kernel. If you have an IDE controller that takes too long to
> initialize, that can also cause this sort of thing (although that's
> REALLY unlikely these days).
>
> I hope that some of this helps. Although, it will be hard for anyone
> to give you really solid advice without a little more insight into why
> the MD isn't getting assembled prior to LVM's scan.
>
> On Apr 5, 2009, at 10:05 AM, Miles Fidelman wrote:
>
>> Hello again Folks,
>>
>> So.. I'm getting closer to fixing this messed up machine.
>>
>> Where things stand:
>>
>> I have root defined as an LVM2 LV, that should use /dev/md2 as its PV.
>> /dev/md2 in turn is a RAID1 array built from /dev/sda3 /dev/sdb3 and
>> /dev/sdc3
>>
>> Instead, LVM is reporting: "Found duplicate PV
>> 2ppSS2q0kO3t0tuf8t6S19qY3ypWBOxF: using /dev/sdb3 not /dev/sda3"
>> and the /dev/md2 is reporting itself as inactive (cat /proc/mdstat)
>> and active,degraded (mdadm --detail)
>>
>> ---
>> I'm guessing that, during boot:
>>
>> - the raid array failed to start
>> - LVM found both copies of the PV, and picked one (/dev/sdb3)
>> - everything then came up and my server is humming away
>>
>> but: the md array can't rebuild because the most current device in it
>> is already in use
>>
>> so... I'm looking for the right sequence of events, with the minimum
>> downtime to:
>>
>> 1. stop changes to /dev/sdb3 (actually, to / - which complicates things)
>> 2. rebuild the RAID1 array, making sure to use /dev/sdb3 as the
>> starting point for current data
>> 3. restart in such a way that LVM finds /dev/md2 as the right PV
>> instead of one of its components
>>
>> Each of these is just tricky enough that I'm sure there are lots of
>> gotchas to watch out for.
>>
>> So.. any suggestions?
>>
>> Thanks very much,
>>
>> Miles Fidelman
>>
>>
>>
>>
>> _______________________________________________
>> linux-lvm mailing list
>> linux-lvm@redhat.com <mailto:linux-lvm@redhat.com>
>> https://www.redhat.com/mailman/listinfo/linux-lvm
>> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
>
> --
> Jayson Vantuyl
> Founder and Architect
> *Engine Yard <http://www.engineyard.com>*
> jvantuyl@engineyard.com <mailto:jvantuyl@engineyard.com>
> 1 866 518 9275 ext 204
> IRC (freenode): kagato
>
--
Miles R. Fidelman, Director of Government Programs
Traverse Technologies
145 Tremont Street, 3rd Floor
Boston, MA 02111
mfidelman@traversetechnologies.com
857-362-8314
www.traversetechnologies.com
* Re: [linux-lvm] progress, but... - re. fixing LVM/md snafu
2009-04-05 18:32 ` [linux-lvm] " Jayson Vantuyl
2009-04-05 21:12 ` Miles Fidelman
@ 2009-04-06 14:17 ` Miles Fidelman
1 sibling, 0 replies; 6+ messages in thread
From: Miles Fidelman @ 2009-04-06 14:17 UTC
To: LVM general discussion and development
Hi Jayson,
Thanks for all the detailed information yesterday. I've done some more
digging into my system, and I wonder if you'd be willing to comment on
what I found, and the recovery procedure I'm considering.
Quick summary of situation:
- machine comes up, but LVM builds / on top of /dev/sdb3 instead of
/dev/md2 of which /dev/sdb3 is a part
- looks like md2 isn't starting, so I need to fix it (presumably
offline, using a LiveCD), then reboot and get LVM to use the mirror device
What's confusing is that the raid isn't starting at boot time, and
different tools report a different status for it. So first, I have
to get the raid working again and make sure it has the up-to-date data.
Here are some more details, broken into four sections: RAID, LVM, boot
process, and recovery procedure. The RAID section has a summary at the
front, followed by the command listings; the other sections are
much shorter :-)
Comments on the recovery procedure, please!
---------- re. the RAID array --------
RE. the raid array:
summary:
- /proc/mdstat thinks the array is inactive, containing sdb3 and sdd3
- mdadm thinks it's active, degraded, also containing sdb3 and sdd3
(mdadm -D /dev/md2)
- looking at superblocks, mdadm seems to think it's active, degraded
(mdadm -E /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3)
-- containing sda3, only (mdadm -E /dev/sda3)
-- containing sda3, with sdb3 spare (mdadm -E /dev/sdb3)
-- containing sda3 and sdb3, with sdc3 spare (mdadm -E /dev/sdc3) - with
the same Magic #, different UUID from above
-- no superblock on /dev/sdd3 (mdadm -E /dev/sdd3)
details:
more /proc/mdstat:
md2 : inactive sdd3[0] sdb3[2]
195318016 blocks
<looking at RAID>
mdadm -D /dev/md2:
/dev/md2:
Version : 00.90.01
Creation Time : Thu Jul 20 06:15:18 2006
Raid Level : raid1
Device Size : 97659008 (93.13 GiB 100.00 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Fri Apr 3 10:06:41 2009
State : active, degraded
Active Devices : 0
Working Devices : 2
Failed Devices : 0
Spare Devices : 2
Number Major Minor RaidDevice State
0 8 51 0 spare rebuilding /dev/sdd3
1 0 0 - removed
2 8 19 - spare /dev/sdb3
<looking at component devices>
server1:/etc/lvm# mdadm -E /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
/dev/sda3:
Magic : a92b4efc
Version : 00.90.00
UUID : 3a32acee:8a132ab9:545792a8:0df49d99
Creation Time : Thu Jul 20 06:15:18 2006
Raid Level : raid1
Raid Devices : 2
Total Devices : 1
Preferred Minor : 2
Update Time : Fri Apr 3 22:40:39 2009
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : 71d21f34 - correct
Events : 0.114704240
Number Major Minor RaidDevice State
this 0 8 3 0 active sync /dev/sda3
0 0 8 3 0 active sync /dev/sda3
1 1 0 0 1 faulty removed
/dev/sdb3:
Magic : a92b4efc
Version : 00.90.00
UUID : 3a32acee:8a132ab9:545792a8:0df49d99
Creation Time : Thu Jul 20 06:15:18 2006
Raid Level : raid1
Raid Devices : 2
Total Devices : 2
Preferred Minor : 2
Update Time : Fri Apr 3 10:06:41 2009
State : clean
Active Devices : 1
Working Devices : 2
Failed Devices : 1
Spare Devices : 1
Checksum : 71d1d1fa - correct
Events : 0.114716950
Number Major Minor RaidDevice State
this 2 8 19 2 spare /dev/sdb3
0 0 8 3 0 active sync /dev/sda3
1 1 0 0 1 faulty removed
2 2 8 19 2 spare /dev/sdb3
/dev/sdc3:
Magic : a92b4efc
Version : 00.90.00
UUID : 635fb32e:6a83a5be:12735af4:74016e66
Creation Time : Wed Jul 2 12:48:36 2008
Raid Level : raid1
Raid Devices : 2
Total Devices : 3
Preferred Minor : 2
Update Time : Fri Apr 3 06:42:50 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1
Checksum : 95973481 - correct
Events : 0.26
Number Major Minor RaidDevice State
this 2 8 35 2 spare /dev/sdc3
0 0 8 3 0 active sync /dev/sda3
1 1 8 19 1 active sync /dev/sdb3
2 2 8 35 2 spare /dev/sdc3
mdadm: No super block found on /dev/sdd3 (Expected magic a92b4efc, got
00000000)
<looking at devices with --scan>
server1:/etc/lvm# mdadm -E --scan /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
ARRAY /dev/md2 level=raid1 num-devices=2
UUID=635fb32e:6a83a5be:12735af4:74016e66
devices=/dev/sdc3
ARRAY /dev/md2 level=raid1 num-devices=2
UUID=3a32acee:8a132ab9:545792a8:0df49d99
devices=/dev/sda3,/dev/sdb3
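To make listings like the above easier to compare, here is a minimal
sketch that boils `mdadm -E` output down to one line per member (device,
array UUID, event counter). The function name summarize_examine is made
up for illustration, and the field layout is assumed to match the 0.90
superblock listings shown here. In the listing above, sdb3 carries the
higher event counter even though sda3 has the later Update Time, and
sdc3 belongs to an array with a different UUID.

```shell
#!/bin/sh
# Summarise `mdadm -E` output read from stdin: one line per member device
# with its array UUID and event counter (hypothetical helper, not part of mdadm).
summarize_examine() {
    awk '
        /^\/dev\//     { dev = $1; sub(/:$/, "", dev) }   # "/dev/sda3:" header line
        $1 == "UUID"   { uuid[dev] = $3 }                  # "UUID : <value>"
        $1 == "Events" { events[dev] = $3; order[++n] = dev }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %s %s\n", order[i], uuid[order[i]], events[order[i]]
        }
    '
}

# Intended use (needs root, so left as a comment):
#   mdadm -E /dev/sda3 /dev/sdb3 /dev/sdc3 | summarize_examine
```

The counters are printed as-is rather than compared numerically, since
the 0.90 format shows them as two dot-separated parts.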
-------- re. LVM ---------
/etc/lvm/lvm.conf contains the line:
md_component_detection = 0
I expect that if I set it to 1 that would tell LVM to look for RAIDs first.
Also, /etc/lvm/backup/rootvolume contains:
pv0 {
id = "2ppSS2-q0kO-3t0t-uf8t-6S19-qY3y-pWBOxF"
device = "/dev/md2" # Hint only
which suggests that if the RAID is running, lvm will do the right thing
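A small sketch for checking what the LVM configuration actually says
before changing anything. It prints the uncommented filter and
md_component_detection lines from an lvm.conf-style file and warns when
more than one filter line is active, since only one is meant to be in
effect. The function name lvm_conf_report is an assumption for
illustration; the file path is passed in because the copy that matters
may be the one inside the initrd rather than the one usually found at
/etc/lvm/lvm.conf. As far as I know, md_component_detection = 1 makes
LVM skip devices that carry an md superblock, which is not quite the
same as "look for RAIDs first".

```shell
#!/bin/sh
# Report the active (uncommented) filter and md_component_detection
# settings of the lvm.conf-style file given as $1 (hypothetical helper).
lvm_conf_report() {
    awk '
        { sub(/^[ \t]+/, "") }                      # drop indentation
        /^#/ { next }                               # skip commented-out lines
        /^filter[ \t]*=/ { nfilter++; print }
        /^md_component_detection[ \t]*=/ { print }
        END {
            if (nfilter > 1)
                print "warning: " nfilter " active filter lines"
        }
    ' "$1"
}

# Intended use:
#   lvm_conf_report /etc/lvm/lvm.conf
```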
---------- re. boot process ------------
looks like detailed events are:
- MBR loads grub
- grub knows about md and lvm, mounts read-only
-- kernel /vmlinuz-2.6.8-3-686 root=/dev/mapper/rootvolume-rootlv
ro mem=4
- during main boot md comes up first, then lvm
-- from rcS.d/S25mdadm-raid: if not already running ... mdadm -A -s -a
---- I'm guessing this fails for /dev/md2
-- from rcS.d/S26lvm:
-- creates lvm device
-- creates dm device
-- does a vgscan
---- which is where this happens:
Found duplicate PV 2ppSS2q0kO3t0tuf8t6S19qY3ypWBOxF: using /dev/sdb3
not /dev/sda3
Found volume group "backupvolume" using metadata type lvm2
Found volume group "rootvolume" using metadata type lvm2
-- does a vgchange -a y
---- which looks like it's picking up on sdb3
-- I'm guessing that if the mirror were active, and based on /dev/sdb3
- lvm would pick that up as the volume group
** is this where setting md_component_detection = 1 would be helpful?
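One way to test the guess about boot ordering is to check, just before
the LVM scan, whether the array is really listed as active. The sketch
below reads mdstat-format text and succeeds only if the named array is
marked "active". The function name md_is_active is hypothetical, and
wiring it into an init script such as rcS.d/S26lvm is left out because
that depends on the distribution.

```shell
#!/bin/sh
# Succeed only if array $1 (e.g. md2) is listed as "active" in the
# mdstat-format file $2 (defaults to /proc/mdstat). Hypothetical helper.
md_is_active() {
    awk -v name="$1" '
        $1 == name && $2 == ":" { found = 1; ok = ($3 == "active") }
        END { exit !(found && ok) }     # exit status 0 means active
    ' "${2:-/proc/mdstat}"
}

# Intended use before a vgscan:
#   md_is_active md2 || echo "md2 is not active, LVM may grab a component" >&2
```

With the /proc/mdstat contents shown earlier ("md2 : inactive ..."),
this check fails, which matches the array not being usable when LVM
scans for PVs.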
------------ recovery procedure ------------
here's what I'm thinking of doing - comments please!
1. turn logging on in lvm.conf, reboot, examine logs to confirm above
guesses (or find out what's really happening)
-- based on the logging, maybe set md_component_detection = 1 in lvm.conf
2. shutdown, boot from LiveCD (I'm using systemrescuecd - great tool by
the way)
3. backup /dev/sdb3 using partimage (just in case!)
4. try to fix /dev/md2
if it's not running - start it, with only /dev/sdb3; then add in other
devices
- mdadm -A /dev/md2 --add /dev/sdb3 --run (**is this the right way to do
this?**)
- add each device back (mdadm -a /dev/sda3; mdadm -a /dev/sdb3; mdadm -a
/dev/sdd3)
- grow to 3 active devices: mdadm --grow -n 3 /dev/md2
if it's running:
- fail all except /dev/sdb3 (mdadm -f /dev/sda3; mdadm -f /dev/sdb3;
mdadm -f /dev/sdd3)
- remove all except /dev/sdb3 (mdadm -r /dev/sda3; mdadm -r /dev/sdb3;
mdadm -r /dev/sdd3)
- add each device back (mdadm -a /dev/sda3; mdadm -a /dev/sdb3; mdadm -a
/dev/sdd3)
- grow to 3 active devices: mdadm --grow -n 3 /dev/md2
question: do I need to update mdadm.conf?
question: do I need to do anything to get rid of the superblock containing
a different UUID?
5. reboot the system
- it may just come up
- if it comes up and lvm is still operating off a single partition,
repeat the above, but first add a filter to lvm.conf (wash, rinse,
repeat as necessary)
*** does this seem like a reasonable game plan? ***
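For the fallback of adding a filter to lvm.conf, here is a rough
emulation of how the filter quoted earlier in this thread
([ "a|^/dev/md.*$|", "r/.*/" ]) would treat a given device path. It
assumes the usual rule that patterns are tried in order, the first match
decides, and a device matching nothing is accepted. The function name
lvm_filter_verdict is made up, and the sketch only looks at the single
path it is given, whereas LVM may also consider other names for the same
device.

```shell
#!/bin/sh
# Emulate the two-pattern filter from this thread for one device path.
# Prints "accept" or "reject" (hypothetical helper, not LVM's own code).
lvm_filter_verdict() {
    dev=$1
    if printf '%s\n' "$dev" | grep -Eq '^/dev/md.*$'; then
        echo accept          # matched "a|^/dev/md.*$|"
    elif printf '%s\n' "$dev" | grep -Eq '.*'; then
        echo reject          # matched "r/.*/", which catches everything else
    else
        echo accept          # nothing matched: default is to accept
    fi
}

# Show the verdict for the devices discussed in this thread.
for d in /dev/md2 /dev/sda3 /dev/sdb3; do
    printf '%s %s\n' "$d" "$(lvm_filter_verdict "$d")"
done
```

With that filter, /dev/md2 is accepted and the raw partitions are
rejected, so LVM cannot pick /dev/sdb3 by itself. The flip side is that
if md2 fails to assemble, no PV for the root volume is visible at all.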
Thanks again for your help!
Miles