From: NeilBrown <neilb@suse.de>
To: Haakon Alstadheim <hakon.alstadheim@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Interrupted reshape -- mangled backup ?
Date: Thu, 18 Oct 2012 09:33:38 +1100 [thread overview]
Message-ID: <20121018093338.3026c803@notabene.brown> (raw)
In-Reply-To: <507F2462.2050409@gmail.com>
On Wed, 17 Oct 2012 23:34:26 +0200 Haakon Alstadheim
<hakon.alstadheim@gmail.com> wrote:
> I have a RAID5 array with 4 devices that I wanted to see if I could get
> better performance out of, so I tried changing the chunk size from 64K
> to something bigger (famous last words). I got into some other
> trouble and thought I needed a reboot. On reboot I several times managed
> to mount and specify the device with my backup file during initramfs,
> but the reshape stopped every time once the system had finished initializing.
So worst-case you can do that again, but insert a "sleep 365d" immediately
after the "mdadm --assemble" command, so the system never completely
initialises. Then just wait for the reshape to finish.
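Concretely, that could be a one-line change to the local-top script quoted
further down (a sketch only; device names and backup-file path are taken from
that script, adjust as needed):

```shell
# Same assemble line as before, then park the boot so the background
# reshape monitor forked by mdadm is never killed by later init stages.
/sbin/mdadm --assemble -f --backup-file=/dev/bak/md1-backup /dev/md1 \
    --run --auto=yes /dev/sdh /dev/sde /dev/sdc /dev/sdd
# Watch /proc/mdstat from another console; reboot normally once the
# reshape has completed.
sleep 365d
```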
When mdadm assembles an array that needs to keep growing, it will fork a
background process to continue monitoring the reshape. Presumably
that background process is getting killed. I don't know why.
>
> This is under Debian squeeze with a 3.2.0-0.bpo.3-686-pae kernel from
> backports. I installed mdadm from backports to get the latest version of
> that as well, and tried rebooting with --freeze-reshape. Suspect that I
> mixed up my initrd.img-files and started without --freeze-reshape the
> first time after installing the new mdadm. Now mdadm says it cannot
> find a backup in my backup file. Opening the backup in emacs, it
> seems to contain only NULs. That can't be right, can it? I have been mounting
> the backup under a directory under /dev/, on the assumption that the
> mount would survive past the initramfs stage.
The backup file could certainly contain lots of NULs, but it shouldn't be
*all* NULs. At the very least there should be a header at the start which
describes which area of the device is contained in the backup.
You can continue without a backup. You still need to specify a backup file,
but if you add "--invalid-backup", it will continue even if the backup file
doesn't contain anything useful.
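A sketch of what the assemble line might look like with "--invalid-backup"
added (same devices and backup-file path as the script quoted below; a backup
file must still be named even though its contents will be ignored):

```shell
/sbin/mdadm --assemble -f --invalid-backup \
    --backup-file=/dev/bak/md1-backup /dev/md1 \
    --run --auto=yes /dev/sdh /dev/sde /dev/sdc /dev/sdd
```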
If the machine was shut down by a crash during reshape you might suffer
corruption. If it was a clean shutdown you won't.
--freeze-reshape is intended to be the way to handle this, with
--grow --continue
once you are fully up and running, but I don't think that works correctly for
'native' metadata yet - it was implemented with IMSM metadata in mind.
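For reference, the intended two-step flow would look roughly like this
(untested for native metadata, per the caveat above; the backup-file path is
the one from the script quoted below):

```shell
# In the initramfs: assemble without resuming the reshape.
mdadm --assemble --freeze-reshape /dev/md1 \
    /dev/sdh /dev/sde /dev/sdc /dev/sdd
# Later, once the system is fully up: resume the frozen reshape.
mdadm --grow --continue /dev/md1 --backup-file=/dev/bak/md1-backup
```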
NeilBrown
>
> My bumbling has been happening with a current, correct,
> /etc/mdadm/mdadm.conf containing:
> --------
> DEVICE /dev/sdh /dev/sde /dev/sdc /dev/sdd
> CREATE owner=root group=disk mode=0660 auto=yes
> HOMEHOST <system>
> ARRAY /dev/md1 level=raid5 num-devices=4
> UUID=583001c4:650dcf0c:404aaa6f:7fc38959 spare-group=main
> -------
> The show-stopper happened with an initramfs and a script in
> /scripts/local-top/mdadm along the lines of:
> -------
> /sbin/mdadm --assemble -f --backup-file=/dev/bak/md1-backup /dev/md1
> --run --auto=yes /dev/sdh /dev/sde /dev/sdc /dev/sdd
> -------
>
> At times I have also had to use the env-variable MDADM_GROW_ALLOW_OLD=1
>
> Below is the output of mdadm -Evvvvs:
> --------
>
>
> /dev/sdh:
> Magic : a92b4efc
> Version : 0.91.00
> UUID : 583001c4:650dcf0c:404aaa6f:7fc38959
> Creation Time : Wed Dec 3 19:45:33 2008
> Raid Level : raid5
> Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
> Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 1
>
> Reshape pos'n : 2368561152 (2258.84 GiB 2425.41 GB)
> New Chunksize : 131072
>
> Update Time : Wed Oct 17 02:15:53 2012
> State : active
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 14da0760 - correct
> Events : 778795
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 0 8 112 0 active sync /dev/sdh
>
> 0 0 8 112 0 active sync /dev/sdh
> 1 1 8 48 1 active sync /dev/sdd
> 2 2 8 32 2 active sync /dev/sdc
> 3 3 8 64 3 active sync /dev/sde
> /dev/sde:
> Magic : a92b4efc
> Version : 0.91.00
> UUID : 583001c4:650dcf0c:404aaa6f:7fc38959
> Creation Time : Wed Dec 3 19:45:33 2008
> Raid Level : raid5
> Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
> Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 1
>
> Reshape pos'n : 2368561152 (2258.84 GiB 2425.41 GB)
> New Chunksize : 131072
>
> Update Time : Wed Oct 17 02:15:53 2012
> State : active
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 14da0736 - correct
> Events : 778795
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 3 8 64 3 active sync /dev/sde
>
> 0 0 8 112 0 active sync /dev/sdh
> 1 1 8 48 1 active sync /dev/sdd
> 2 2 8 32 2 active sync /dev/sdc
> 3 3 8 64 3 active sync /dev/sde
> /dev/sdc:
> Magic : a92b4efc
> Version : 0.91.00
> UUID : 583001c4:650dcf0c:404aaa6f:7fc38959
> Creation Time : Wed Dec 3 19:45:33 2008
> Raid Level : raid5
> Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
> Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 1
>
> Reshape pos'n : 2368561152 (2258.84 GiB 2425.41 GB)
> New Chunksize : 131072
>
> Update Time : Wed Oct 17 02:15:53 2012
> State : active
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 14da0714 - correct
> Events : 778795
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 32 2 active sync /dev/sdc
>
> 0 0 8 112 0 active sync /dev/sdh
> 1 1 8 48 1 active sync /dev/sdd
> 2 2 8 32 2 active sync /dev/sdc
> 3 3 8 64 3 active sync /dev/sde
> /dev/sdd:
> Magic : a92b4efc
> Version : 0.91.00
> UUID : 583001c4:650dcf0c:404aaa6f:7fc38959
> Creation Time : Wed Dec 3 19:45:33 2008
> Raid Level : raid5
> Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
> Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 1
>
> Reshape pos'n : 2368561152 (2258.84 GiB 2425.41 GB)
> New Chunksize : 131072
>
> Update Time : Wed Oct 17 02:15:53 2012
> State : active
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 14da0722 - correct
> Events : 778795
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 1 8 48 1 active sync /dev/sdd
>
> 0 0 8 112 0 active sync /dev/sdh
> 1 1 8 48 1 active sync /dev/sdd
> 2 2 8 32 2 active sync /dev/sdc
> 3 3 8 64 3 active sync /dev/sde
> ---------------------------
>
> I guess the moral of all this is that if you want to use mdadm you
> should pay attention and not be in too much of a hurry :-/ .
> I'm just hoping that I can get my system back. This raid contains my
> entire system, and will take a LOT of work to recreate. Mail, calendars
> ... . Backups are a couple of weeks old ...