From: Michael Busby <michael.a.busby@gmail.com>
To: "Alexander Kühn" <alexander.kuehn@nagilum.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Unable to restart reshape
Date: Sun, 30 Oct 2011 22:15:43 +0000
Message-ID: <CAFsPQ__+ofSpEy57Ly_u_NkCf94BJ49+_mcqWb7SPMnaSdQyMA@mail.gmail.com>
In-Reply-To: <20111030230215.Horde.zF2Mdpk8pphOrclnrmpTWiA@cakebox.homeunix.net>

>>>>>> I have a system that was doing a reshape from RAID5 to RAID6; the
>>>>>> system had to be powered off this morning and moved. Upon restarting
>>>>>> the server I issued the following command to continue the reshape:
>>>>>>
>>>>>>  mdadm -A /dev/md0 --backup-file=/home/md.backup
>>>>>>
>>>>>> I get back the following error:
>>>>>>
>>>>>> mdadm: Failed to restore critical section for reshape, sorry.
>>>>>>
>>>>>> Any idea why?
>>>>>>
>>>>>> Before shutting down, cat /proc/mdstat showed:
>>>>>>
>>>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>>>> [raid4] [raid10]
>>>>>> md0 : active raid6 sdf[0] sdb[6](S) sda[4] sdc[3] sde[2] sdd[1]
>>>>>>     7814055936 blocks super 1.0 level 6, 512k chunk, algorithm 18
>>>>>> [6/5] [UUUUU_]
>>>>>>     [==============>......]  reshape = 70.8% (1384415232/1953513984)
>>>>>> finish=3658.6min speed=2592K/sec
>>>>>>
>>>>>> but now it shows:
>>>>>>
>>>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>>>> [raid4] [raid10]
>>>>>> md0 : inactive sdc[3] sdb[6](S) sde[2] sdd[1] sdf[0]
>>>>>>      9767572240 blocks super 1.0
>>>>>>
>>>>>> I am totally confused; it seems to have lost a drive from the RAID,
>>>>>> and the number of blocks is incorrect.
>>>>>>
>>>>>
>>>>> Issuing the following:
>>>>>
>>>>>  mdadm -Avv --backup-file=/home/md.backup /dev/md0
>>>>>
>>>>> returns:
>>>>>
>>>>>
>>>>> mdadm: looking for devices for /dev/md0
>>>>> mdadm: cannot open device /dev/sda5: Device or resource busy
>>>>> mdadm: /dev/sda5 has wrong uuid.
>>>>> mdadm: no RAID superblock on /dev/sda2
>>>>> mdadm: /dev/sda2 has wrong uuid.
>>>>> mdadm: cannot open device /dev/sda1: Device or resource busy
>>>>> mdadm: /dev/sda1 has wrong uuid.
>>>>> mdadm: cannot open device /dev/sda: Device or resource busy
>>>>> mdadm: /dev/sda has wrong uuid.
>>>>> mdadm: /dev/sdg is identified as a member of /dev/md0, slot -1.
>>>>> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 4.
>>>>> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
>>>>> mdadm: /dev/sde is identified as a member of /dev/md0, slot 0.
>>>>> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 1.
>>>>> mdadm: /dev/sdb is identified as a member of /dev/md0, slot 3.
>>>>> mdadm:/dev/md0 has an active reshape - checking if critical section
>>>>> needs to be restored
>>>>> mdadm: backup-metadata found on /home/md.backup but is not needed
>>>>> mdadm: Failed to find backup of critical section
>>>>> mdadm: Failed to restore critical section for reshape, sorry.
>>>>>
>>>>
>>>> It seems the above was trying to use the wrong disks to assemble, so
>>>> I used the following:
>>>>
>>>> mdadm -Avv /dev/md0 --backup-file=/home/md.backup /dev/sd[abcdef]
>>>>
>>>>  mdadm: looking for devices for /dev/md0
>>>> mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
>>>> mdadm: /dev/sdb is identified as a member of /dev/md0, slot -1.
>>>> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
>>>> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
>>>> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
>>>> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 0.
>>>> mdadm:/dev/md0 has an active reshape - checking if critical section
>>>> needs to be restored
>>>> mdadm: backup-metadata found on /home/md.backup but is not needed
>>>> mdadm: Failed to find backup of critical section
>>>> mdadm: Failed to restore critical section for reshape, sorry.
>>>>
>>>
>>> I have now upgraded to mdadm 3.2.2
>>>
>>> and get a little more info:
>>>
>>> mdadm -Avv /dev/md0 --backup-file=/home/md.backup /dev/sd[abcdef]
>>>
>>> mdadm: looking for devices for /dev/md0
>>> mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
>>> mdadm: /dev/sdb is identified as a member of /dev/md0, slot -1.
>>> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
>>> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
>>> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
>>> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 0.
>>> mdadm: device 6 in /dev/md0 has wrong state in superblock, but /dev/sdb
>>> seems ok
>>> mdadm:/dev/md0 has an active reshape - checking if critical section
>>> needs to be restored
>>> mdadm: backup-metadata found on /home/md.backup but is not needed
>>> mdadm: Failed to find backup of critical section
>>> mdadm: Failed to restore critical section for reshape, sorry.
>>>
>>
>>
>> OK, I don't know if this is the right thing to have done:
>>
>> ~# mdadm -Avv --force /dev/md0 --backup-file=/home/md.backup
>> /dev/sd[abcdef]
>>
>> mdadm: looking for devices for /dev/md0
>> mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
>> mdadm: /dev/sdb is identified as a member of /dev/md0, slot -1.
>> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
>> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
>> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
>> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 0.
>> mdadm: clearing FAULTY flag for device 1 in /dev/md0 for /dev/sdb
>> mdadm: Marking array /dev/md0 as 'clean'
>> mdadm:/dev/md0 has an active reshape - checking if critical section
>> needs to be restored
>> mdadm: backup-metadata found on /home/md.backup but is not needed
>> mdadm: Failed to find backup of critical section
>> mdadm: Failed to restore critical section for reshape, sorry.
>>
>>
>> ~# mdadm -Avv /dev/md0 --backup-file=/home/md.backup /dev/sd[abcdef]
>>
>> mdadm: looking for devices for /dev/md0
>> mdadm: /dev/sda is identified as a member of /dev/md0, slot 4.
>> mdadm: /dev/sdb is identified as a member of /dev/md0, slot -1.
>> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 3.
>> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 1.
>> mdadm: /dev/sde is identified as a member of /dev/md0, slot 2.
>> mdadm: /dev/sdf is identified as a member of /dev/md0, slot 0.
>> mdadm:/dev/md0 has an active reshape - checking if critical section
>> needs to be restored
>> mdadm: restoring critical section
>> mdadm: added /dev/sdd to /dev/md0 as 1
>> mdadm: added /dev/sde to /dev/md0 as 2
>> mdadm: added /dev/sdc to /dev/md0 as 3
>> mdadm: added /dev/sda to /dev/md0 as 4
>> mdadm: no uptodate device for slot 5 of /dev/md0
>> mdadm: added /dev/sdb to /dev/md0 as -1
>> mdadm: added /dev/sdf to /dev/md0 as 0
>> mdadm: /dev/md0 has been started with 4 drives (out of 6) and 1 spare.
>>
>> ~# cat /proc/mdstat
>>
>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>> [raid4] [raid10]
>> md0 : active raid6 sdf[0] sdb[6](S) sdc[3] sde[2] sdd[1]
>>      7814055936 blocks super 1.0 level 6, 512k chunk, algorithm 18
>> [6/4] [UUUU__]
>>      [==============>......]  reshape = 74.3% (1452929024/1953513984)
>> finish=2545.2min speed=3276K/sec
>>
>> unused devices: <none>
>>
>> So it looks like it's carrying on now, but with 4 disks and a spare;
>> maybe I can add the other disk once the reshape has finished.
>
> It generally helps to include/examine the output of "mdadm -E /dev/sdX"
> for all devices involved in your mail(s), and also "mdadm -Q --detail
> /dev/md0". After the reshape is done it will automatically rebuild using
> the spare. Then you can have a close look at which of your devices aren't
> used, clear the metadata from that device and add it as well to regain
> full redundancy. You'll have plenty of hours of fun watching
> /proc/mdstat. ;)
> Alex.
>

Thanks for the response, Alex. The reshape has about 2400 minutes left to
run, and I have no idea how long the rebuild will take.
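
(As a rough sanity check against the mdstat snapshot above: 1953513984 -
1452929024 = 500584960 KiB of the reshape remained, and at ~3276 KiB/sec
that works out to about 152800 seconds, i.e. roughly 2546 minutes or 42
hours, which matches the finish= estimate shown there.)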

I will check out those commands once I am back up and running. I am fairly
new to mdadm, so I am still learning the useful commands for
troubleshooting issues; thanks for pointing these out to me.
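
For my own notes, I think the recovery sequence you describe would look
roughly like this; /dev/sdX below is just a placeholder for whichever disk
turns out to be unused, so treat this as a sketch rather than exact
commands:

  # examine each member's superblock to compare roles and event counts
  mdadm -E /dev/sd[abcdef]

  # overall array state: which slots are active, spare, or missing
  mdadm --detail /dev/md0

  # once the reshape and the automatic rebuild onto the spare finish,
  # wipe the stale metadata from the unused disk and add it back
  mdadm --zero-superblock /dev/sdX
  mdadm /dev/md0 --add /dev/sdX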
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

