linux-raid.vger.kernel.org archive mirror
From: Andrew Burgess <aab@cichlid.com>
To: linux raid mailing list <linux-raid@vger.kernel.org>
Subject: Re: reshape changing chunk size won't restart
Date: Tue, 21 Dec 2010 18:09:46 -0800	[thread overview]
Message-ID: <1292983786.5543.1@athlon> (raw)
In-Reply-To: <20101222120810.5bba5304@notabene.brown> (from neilb@suse.de on Tue Dec 21 17:08:10 2010)

On 12/21/2010 05:08:10 PM, Neil Brown wrote:
> On Tue, 21 Dec 2010 16:09:59 -0800 Andrew Burgess <aab@cichlid.com> wrote:
> 
> > On 12/21/2010 02:16:19 PM, Neil Brown wrote:
> >
> > > > I started a reshape changing chunk size and after it ran
> > > > for a while i realized the disk i used for the
> > > > backup file was slow so I killed the mdadm
> > >
> > > That was a mistake.
> >
> > It's looking to be a bad one
> >
> > > > running in the background and tried to restart
> > > > with the new location (i moved the file just in case)
> > > >
> > > > mdadm /dev/md5 --grow --chunk=8 --backup-file=/my/raid/RAID_BACKUP_FILE
> > >
> > > As you discovered, that doesn't work.  I'd like to make it possible to do
> > > something like that, but time is not something I have a lot of.
> >
> > Understand 100%
> >
> > > > I didn't try rebooting as the filesystem is mounted and
> > > > the data seems ok. Didn't want to make things worse...
> > >
> > > It shouldn't make things worse.
> >
> > I had to because umount wouldn't and neither fuser nor lsof
> > could find the guilty party
> >
> > > You don't need to reboot, unless md5 has your root filesystem.
> > > Just unmount, 'mdadm -S /dev/md5', and assemble:
> > >   mdadm -A /dev/md5 --backup-file=/whereever-you-copied-the-file-to \
> > >       /dev/sd[dfcbhljgk]1
> > >
> > > should do it.
> >
> > After rebooting something happened to sdg1:
> >
> > mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE \
> >       /dev/sd[dfcbhljgk]1
> > mdadm: cannot open device /dev/sdg1: No such device or address
> > mdadm: /dev/sdg1 has no superblock - assembly aborted
> >
> > so i tried it with sdg1 missing
> >
> > mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE \
> >       /dev/sd[dfcbhljk]1
> > mdadm: Failed to restore critical section for reshape, sorry.
> >
> > so i rebooted and power cycled hoping to get sdg1 back but it was
> > still unhappy with the superblock
> >
> > I even tried letting it scan for devices:
> >
> > mdadm -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE
> > mdadm: WARNING /dev/sdg1 and /dev/sdg appear to have very similar
> > superblocks.
> >        If they are really different, please --zero the superblock on one
> >        If they are the same or overlap, please remove one from the
> >        DEVICE list in mdadm.conf.
> >
> > so repeating with all but sdg1 specified it results in:
> >
> > mdadm: Failed to restore critical section for reshape, sorry.
> >
> > Anything else I can try? We do have the sector it was on in the original
> > email when it stopped: (2715648/1953511936)
> 
> 
> The business with sdg1 is a bit odd... I would use "--examine" to check each
> device and make sure they have good matching superblocks.  It would be a lot
> better if you can make sure all devices get included when you start the array.

all the working devices have the same Reshape pos'n value in the superblock.
sdg1 though:

mdadm -E /dev/sdg1
mdadm: cannot open /dev/sdg1: No such device or address

even though:

ls -l /dev/sdg*
brw-rw---- 1 root disk 8, 96 Dec 21 15:53 /dev/sdg
brw-rw---- 1 root disk 8, 97 Dec 21 15:55 /dev/sdg1

and the partition table looks ok.
sdg is brand new but there are no i/o errors in the log
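
For anyone repeating the --examine comparison above, a minimal sketch of how it could be scripted. The helper names reshape_pos and show_reshape_positions are made up for illustration, and the sed pattern assumes the field is printed as "Reshape pos'n : <value>", so adjust it if your mdadm version formats the line differently.

```shell
# Hypothetical helpers, not part of mdadm.

reshape_pos() {
    # Reads `mdadm --examine` output on stdin and prints whatever follows
    # "Reshape pos'n :" (prints nothing if the field is absent).
    sed -n "s/^ *Reshape pos'n *: *//p"
}

show_reshape_positions() {
    # Prints one "device: position" line per readable block device.
    for dev in "$@"; do
        [ -b "$dev" ] || continue   # skip names that are not block devices
        printf '%s: %s\n' "$dev" "$(mdadm --examine "$dev" 2>/dev/null | reshape_pos)"
    done
}
```

Run as root with something like: show_reshape_positions /dev/sd[dfcbhljk]1
Any device whose line differs from the rest, or is empty, is the one to look at first.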

> Also, try starting with '--verbose', it might give some useful information,
> but I don't hold out a lot of hope.

Not much, unless the too-old timestamp message is helpful:

mdadm --verbose -A /dev/md5 --backup-file=/my/raid/RAID_BACKUP_FILE \
      /dev/sd[dfcbhljk]1
mdadm: looking for devices for /dev/md5
mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot 0.
mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 1.
mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 2.
mdadm: /dev/sdf1 is identified as a member of /dev/md5, slot 8.
mdadm: /dev/sdh1 is identified as a member of /dev/md5, slot 4.
mdadm: /dev/sdj1 is identified as a member of /dev/md5, slot 3.
mdadm: /dev/sdk1 is identified as a member of /dev/md5, slot 6.
mdadm: /dev/sdl1 is identified as a member of /dev/md5, slot 5.
mdadm:/dev/md5 has an active reshape - checking if critical section needs to be restored
mdadm: too-old timestamp on backup-metadata on /my/raid/RAID_BACKUP_FILE
mdadm: Failed to find backup of critical section
mdadm: Failed to restore critical section for reshape, sorry.
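
To get a feel for how stale the backup looks, one rough check is to compare the array's "Update Time" from --examine with the backup file's modification time. This is only a sketch: the file mtime is a stand-in for the timestamp mdadm keeps inside the backup metadata (an assumption, the two need not match), the helper names are hypothetical, and it relies on GNU date and stat.

```shell
# Hypothetical helpers, not part of mdadm.

update_epoch() {
    # stdin: `mdadm --examine` output.
    # Prints the first "Update Time" value as seconds since the epoch.
    t=$(sed -n 's/^ *Update Time *: *//p' | head -n 1)
    [ -n "$t" ] && date -d "$t" +%s
}

gap_seconds() {
    # $1, $2: timestamps in epoch seconds; prints the absolute difference.
    if [ "$1" -ge "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo $(( $2 - $1 ))
    fi
}
```

For example:
  gap_seconds "$(mdadm --examine /dev/sdb1 | update_epoch)" "$(stat -c %Y /my/raid/RAID_BACKUP_FILE)"
A gap of hours rather than seconds would at least be consistent with the "too-old timestamp" complaint.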

> Finally, you will probably end up having to modify mdadm so that it ignores a
> failure from Grow_restart.  As you had a reasonably clean shutdown rather
> than a crash, there is a good chance that the backup file isn't actually
> needed.

If the timestamp info above doesn't change your mind then I'll
try that.

> The next release of mdadm will have a --invalid-backup option to --assemble
> to tell it to just continue even though the backup file looks wrong.

Hope to send you a patch for that.
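
Once an mdadm with that option is available, the assemble line would presumably look like the earlier one with --invalid-backup added. A small sketch that only prints the command for review rather than running it; build_assemble_cmd is a hypothetical helper and the option placement is an assumption based on the description above.

```shell
# Hypothetical dry-run helper: prints the command, never executes mdadm.
build_assemble_cmd() {
    md=$1          # array device, e.g. /dev/md5
    backup=$2      # path to the backup file
    shift 2        # remaining arguments are the member devices
    printf 'mdadm --assemble %s --backup-file=%s --invalid-backup' "$md" "$backup"
    for dev in "$@"; do
        printf ' %s' "$dev"
    done
    printf '\n'
}
```

For example:
  build_assemble_cmd /dev/md5 /my/raid/RAID_BACKUP_FILE /dev/sd[dfcbhljk]1
Check the printed line against the man page of the installed mdadm before running it by hand.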

Thanks for your time!

Thread overview: 7+ messages
2010-12-21 20:01 reshape changing chunk size won't restart Andrew Burgess
2010-12-21 22:16 ` Neil Brown
2010-12-22  0:09   ` Andrew Burgess
2010-12-22  1:08     ` Neil Brown
2010-12-22  2:09       ` Andrew Burgess [this message]
2010-12-22  2:29         ` Neil Brown
2010-12-22  2:59           ` Andrew Burgess
