From: Neil Brown <neilb@suse.de>
To: "Simon SÉHIER" <simon@sehier.fr>
Cc: linux-raid@vger.kernel.org
Subject: Re: reboot before reshape from raid 5 to raid 6 (was in state resync=DELAYED). Doesn't assemble anymore.
Date: Wed, 13 Oct 2010 19:37:59 +1100
Message-ID: <20101013193759.4678186e@notabene>
In-Reply-To: <20101013081833.GA25675@leontine.pompomgali.com>

On Wed, 13 Oct 2010 10:18:33 +0200
Simon SÉHIER <simon@sehier.fr> wrote:

> On Wed, Oct 13, 2010 at 11:08:23AM +1100, Neil Brown wrote:
> > On Wed, 13 Oct 2010 00:59:52 +0200
> > Simon SÉHIER <simon@sehier.fr> wrote:
> > 
> > > On 12 oct. 2010 22:46:12, Neil Brown wrote :
> > > > On Tue, 12 Oct 2010 16:27:53 +0200
> > > > 
> > > > Simon S <simon@sehier.fr> wrote:
> > > > > Hi all,
> > > > > 
> > > > > I had a config with 5 disks and 3 raid 5 arrays:
> > > > > 
> > > > > md2 : system root
> > > > > md3 : swap
> > > > > md4 : data
> > > > > 
> > > > > I added a 6th disk with the intention of growing my raid5 into raid6.
> > > > > 
> > > > > The steps I used were:
> > > > > 
> > > > > # mdadm /dev/mdX -a  /dev/newdiskX
> > > > > # mdadm -G --level 6 -n 6 /dev/mdX --backup-file /mdXbackup
> > > > > 
> > > > >  (yes, with backup file on root partition md2...)
> > > > 
> > > > Bad idea..  Very bad idea.
> > > > 
> > > > > The md3 array reshaped without any problem.
> > > > > md2 seemed to reshape well until it reached 50.4%, then the rebuild speed
> > > > > stalled at 14Kb/s.
> > > > 
> > > > This is the expected consequence of that bad idea.  Unfortunately it would
> > > > be hard to reliably get mdadm to complain about that, though I guess the
> > > > common cases are easy to protect against ... added to 'todo' list
> > > > 
> > > > > md4 was still in the state "resync=DELAYED" then.
> > > > > 
> > > > > As the rebuild process seemed hung, I restarted the machine ... bad idea.
> > > > 
> > > > Not really, nothing else would have worked.
> > > > 
> > > > > Now mdadm refuses to assemble md2 and md4, and displays this message :
> > > > >   mdadm: Failed to restore critical section for reshape, sorry.
> > > > >       Possibly you needed to specify the --backup-file
> > > > > 
> > > > > md2 is my linux installation, not very bad if I lose this one.
> > > > > 
> > > > > md4 however contains valuable data.
> > > > > 
> > > > > Since md4 was still in the state resync=DELAYED before the shutdown, I
> > > > > expect it has not been modified (too much) and can be recovered.
> > > > 
> > > > Very true.
> > > > 
> > > > > Any idea on how I could safely do it ?
> > > > > 
> > > > > Should I try the hack "Get 'Grow_restart' to always return 0."
> > > > > mentioned by Neil Brown on 22 April 2010 on this mailing list?
> > > > 
> > > > That is your best bet.  I plan to make that easier to do in mdadm-3.2 (no
> > > > recompile necessary).
> > > > 
> > > > Before you do, check "mdadm -E /dev/newdiskX" and make sure the "Reshape
> > > > position" is 0.  If it is, you should be fine.
> > > > 
> > > > It won't be for md2 of course.  So md2 will quite possibly have some
> > > > corruption.  Run fsck on it and it will probably be mostly OK, but there is
> > > > a reasonable chance that some files will be corrupted.  Whether and when
> > > > you will notice is impossible to guess.
> > > 
> > > Thanks for your answer Neil, 
> > > 
> > > I recompiled mdadm 3.1.4 with "return 0" at the beginning of the function
> > > Grow_restart (mistake was made with 3.1.2). I have one more question:
> > > 
> > > I first tried assembling the least valuable array, md2. It started reshaping
> > > from where it had stopped, at around 1300 K/s for the first seconds, then
> > > rapidly above 10K/s.
> > > 
> > > My backup file for md4 (the array I care about) was also on md2. Should I
> > > expect a problem assembling md4 with the modified version of mdadm, or
> > > can I go ahead without worrying that md2 (rootfs) isn't assembled?
> > 
> > The backup file for md4 would have been essentially empty.  It can be created
> > anew elsewhere.  I probably wouldn't risk using the original backup file
> > even if you can access it, as it could be corrupted.
> > So when you assemble md4, give it a fresh backup file in some stable location,
> > and use the hacked mdadm.
> > 
> > NeilBrown
> > 
> 
> I tried 
> 
>  # mdadm -A --backup-file=/new-empty-md4backup-file /dev/md4
> 
> but the array is now in "inactive" state with 6 spares :
> 
> md4 : inactive sdc4[0](S) sdh4[6](S) sdg4[5](S) sdf4[3](S) sde4[2](S) sdd4[1](S)   
>       1411288041 blocks super 1.2
> 
> I'm a bit confused about what I can do now.

That surprises me a little.
Try:
  mdadm -S /dev/md4
  mdadm -Avv --backup-file=/new-empty-md4backup-file /dev/md4
  dmesg | tail -100
  mdadm -E /dev/sd[cd]4

and send all of the output.

NeilBrown


> 
> # mdadm -E /dev/sd?4 | grep 'Role\|Stat\|pos\|dev.sd\|Lev\|Time\|Even'
> /dev/sdc4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : Active device 0
>    Array State : AAAAA. ('A' == active, '.' == missing)
> /dev/sdd4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : Active device 1
>    Array State : AAAAA. ('A' == active, '.' == missing)
> /dev/sde4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : Active device 2
>    Array State : AAAAA. ('A' == active, '.' == missing)
> /dev/sdf4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : Active device 3
>    Array State : AAAAA. ('A' == active, '.' == missing)
> /dev/sdg4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : Active device 4
>    Array State : AAAAA. ('A' == active, '.' == missing)
> /dev/sdh4:
>      Raid Level : raid6
>           State : active
>   Reshape pos'n : 0
>          Events : 97
>    Device Role : spare
>    Array State : AAAAA. ('A' == active, '.' == missing)
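
The -E output quoted above carries the facts that matter before forcing
assembly: every member reports a Reshape pos'n of 0, the event counts agree,
and one device is still listed as a spare. As a minimal sketch only (this is
not part of mdadm or of the original thread; the names parse_examine and
reshape_not_started are made up for illustration, and SAMPLE is a trimmed
copy of two of the six members shown above), the same check could be
scripted roughly like this:

```python
import re

# Trimmed copy of the quoted "mdadm -E" output (two of the six members).
SAMPLE = """\
/dev/sdc4:
     Raid Level : raid6
          State : active
  Reshape pos'n : 0
         Events : 97
   Device Role : Active device 0
/dev/sdh4:
     Raid Level : raid6
          State : active
  Reshape pos'n : 0
         Events : 97
   Device Role : spare
"""


def parse_examine(text):
    """Split examine-style text into {device: {field: value}}."""
    devices = {}
    current = None
    for raw in text.splitlines():
        # Drop leading mail-quote markers and indentation.
        line = raw.lstrip("> ").rstrip()
        if not line:
            continue
        header = re.match(r"^(/dev/\S+):$", line)
        if header:
            current = devices.setdefault(header.group(1), {})
            continue
        if current is not None and " : " in line:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return devices


def reshape_not_started(devices):
    """True if at least one member reports a reshape position and all are 0."""
    positions = [
        int(fields["Reshape pos'n"].split()[0])  # first token is the number
        for fields in devices.values()
        if "Reshape pos'n" in fields
    ]
    return bool(positions) and all(p == 0 for p in positions)


if __name__ == "__main__":
    devs = parse_examine(SAMPLE)
    print("reshape not started:", reshape_not_started(devs))
    print("event counts:", sorted({f["Events"] for f in devs.values()}))
    print("spares:", [d for d, f in devs.items() if f.get("Device Role") == "spare"])
```

The parser works on saved text rather than calling mdadm itself, so it can be
run against output pasted from a mail like this one. A reshape position of 0
on every member is the condition Neil gave earlier in the thread for the
array being safe to assemble with the patched mdadm.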
> 
> 


Thread overview: 10+ messages
2010-10-12 14:27 reboot before reshape from raid 5 to raid 6 (was in state resync=DELAYED). Doesn't assemble anymore Simon S
2010-10-12 20:46 ` Neil Brown
2010-10-12 22:59   ` Simon SÉHIER
2010-10-12 23:06     ` Simon SEHIER
2010-10-13  0:08     ` Neil Brown
2010-10-13  8:18       ` Simon SÉHIER
2010-10-13  8:37         ` Neil Brown [this message]
2010-10-13 17:32           ` Simon SÉHIER
2010-10-13 20:24             ` Neil Brown
2010-10-14  8:35               ` [resolved] " Simon SÉHIER
