* Raid 5 to 6 grow stalled
@ 2011-04-09 16:23 Edward Siefker
2011-04-09 16:55 ` Edward Siefker
2011-04-09 21:30 ` NeilBrown
0 siblings, 2 replies; 6+ messages in thread
From: Edward Siefker @ 2011-04-09 16:23 UTC (permalink / raw)
To: linux-raid
I have a 3 disk raid-5 array I want to convert to a 4 disk raid-6 array.
I added the 4th disk as a spare, and thought that it would get used
automatically when I converted it to 6, but instead it created a degraded
raid 6 array. So I removed the spare and added it back to the array to
get it to rebuild. It did start rebuilding, but stalled out.
Here's some console output showing what I did:
root@iblis:/home/hatta# mdadm --grow /dev/md2 --level 6 --raid-devices 4
--backup-file=/mnt/sda6/yo
mdadm level of /dev/md2 changed to raid6
root@iblis:/home/hatta# cat /proc/mdstat
Personalities : [raid10] [raid6] [raid5] [raid4]
md2 : active (auto-read-only) raid6 sdd1[0] sde1[3](S) sdf1[2] sdc1[1]
2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
root@iblis:/home/hatta# mdadm /dev/md2 --remove /dev/sde1
mdadm: hot removed /dev/sde1 from /dev/md2
root@iblis:/home/hatta# mdadm /dev/md2 --add /dev/sde1
mdadm: re-added /dev/sde1
root@iblis:/home/hatta# cat /proc/mdstat
Personalities : [raid10] [raid6] [raid5] [raid4]
md2 : active raid6 sde1[4] sdd1[0] sdf1[2] sdc1[1]
2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
[>....................] recovery = 0.0% (8192/1465135936)
finish=11899.9min speed=2048K/sec
...
...
...
md2 : active raid6 sde1[4] sdd1[0] sdf1[2] sdc1[1]
2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
[>....................] recovery = 0.0% (8192/1465135936)
finish=2954174.4min speed=8K/sec
What's going on here?
--
hatta00@fastmail.fm
--
http://www.fastmail.fm - Does exactly what it says on the tin
* Re: Raid 5 to 6 grow stalled
2011-04-09 16:23 Raid 5 to 6 grow stalled Edward Siefker
@ 2011-04-09 16:55 ` Edward Siefker
2011-04-09 21:30 ` NeilBrown
1 sibling, 0 replies; 6+ messages in thread
From: Edward Siefker @ 2011-04-09 16:55 UTC (permalink / raw)
To: linux-raid
Here's a little more info about my array.
mdadm -D /dev/md2:
/dev/md2:
Version : 0.90
Creation Time : Wed Aug 5 22:07:37 2009
Raid Level : raid6
Array Size : 2930271872 (2794.53 GiB 3000.60 GB)
Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Sat Apr 9 11:49:59 2011
State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric-6
Chunk Size : 64K
Rebuild Status : 0% complete
UUID : 2a150668:a0e6267d:a6dd356e:c26679ce (local to host iblis)
Events : 0.613692
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 8 33 1 active sync /dev/sdc1
2 8 81 2 active sync /dev/sdf1
4 8 65 3 spare rebuilding /dev/sde1
mdadm -E /dev/sde1:
/dev/sde1:
Magic : a92b4efc
Version : 0.90.00
UUID : 2a150668:a0e6267d:a6dd356e:c26679ce (local to host iblis)
Creation Time : Wed Aug 5 22:07:37 2009
Raid Level : raid6
Used Dev Size : 1465135936 (1397.26 GiB 1500.30 GB)
Array Size : 2930271872 (2794.53 GiB 3000.60 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2
Update Time : Sat Apr 9 11:50:41 2011
State : clean
Active Devices : 3
Working Devices : 4
Failed Devices : 1
Spare Devices : 1
Checksum : ccecf240 - correct
Events : 613694
Layout : left-symmetric-6
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 65 4 spare /dev/sde1
0 0 8 49 0 active sync /dev/sdd1
1 1 8 33 1 active sync /dev/sdc1
2 2 8 81 2 active sync /dev/sdf1
3 3 0 0 3 faulty removed
4 4 8 65 4 spare /dev/sde1
--
hatta00@fastmail.fm
--
http://www.fastmail.fm - IMAP accessible web-mail
* Re: Raid 5 to 6 grow stalled
2011-04-09 16:23 Raid 5 to 6 grow stalled Edward Siefker
2011-04-09 16:55 ` Edward Siefker
@ 2011-04-09 21:30 ` NeilBrown
2011-04-10 15:02 ` Edward Siefker
1 sibling, 1 reply; 6+ messages in thread
From: NeilBrown @ 2011-04-09 21:30 UTC (permalink / raw)
To: Edward Siefker; +Cc: linux-raid
On Sat, 09 Apr 2011 09:23:52 -0700 "Edward Siefker" <hatta00@fastmail.fm>
wrote:
> I have a 3 disk raid-5 array I want to convert to a 4 disk raid-6 array.
> I added the 4th disk as a spare, and thought that it would get used
> automatically when I converted it to 6, but instead it created a degraded
> raid 6 array. So I removed the spare and added it back to the array to
> get it to rebuild. It did start rebuilding, but stalled out.
>
> Here's some console output showing what I did:
>
> root@iblis:/home/hatta# mdadm --grow /dev/md2 --level 6 --raid-devices 4
> --backup-file=/mnt/sda6/yo
> mdadm level of /dev/md2 changed to raid6
> root@iblis:/home/hatta# cat /proc/mdstat
> Personalities : [raid10] [raid6] [raid5] [raid4]
> md2 : active (auto-read-only) raid6 sdd1[0] sde1[3](S) sdf1[2] sdc1[1]
> 2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
>
> root@iblis:/home/hatta# mdadm /dev/md2 --remove /dev/sde1
> mdadm: hot removed /dev/sde1 from /dev/md2
> root@iblis:/home/hatta# mdadm /dev/md2 --add /dev/sde1
> mdadm: re-added /dev/sde1
> root@iblis:/home/hatta# cat /proc/mdstat
> Personalities : [raid10] [raid6] [raid5] [raid4]
> md2 : active raid6 sde1[4] sdd1[0] sdf1[2] sdc1[1]
> 2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
> [>....................] recovery = 0.0% (8192/1465135936)
> finish=11899.9min speed=2048K/sec
> ...
> ...
> ...
> md2 : active raid6 sde1[4] sdd1[0] sdf1[2] sdc1[1]
> 2930271872 blocks level 6, 64k chunk, algorithm 18 [4/3] [UUU_]
> [>....................] recovery = 0.0% (8192/1465135936)
> finish=2954174.4min speed=8K/sec
>
>
> What's going on here?
Doesn't look good..
What version of mdadm? What kernel?
NeilBrown
* Re: Raid 5 to 6 grow stalled
2011-04-09 21:30 ` NeilBrown
@ 2011-04-10 15:02 ` Edward Siefker
2011-04-10 15:09 ` Mathias Burén
2011-04-10 21:52 ` NeilBrown
0 siblings, 2 replies; 6+ messages in thread
From: Edward Siefker @ 2011-04-10 15:02 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
>
> Doesn't look good..
> What version of mdadm? What kernel?
>
> NeilBrown
>
root@iblis:~# mdadm --version
mdadm - v3.1.4 - 31st August 2010
root@iblis:~# uname -a
Linux iblis 2.6.38-2-amd64 #1 SMP Tue Mar 29 16:45:36 UTC 2011 x86_64
GNU/Linux
Did you see the other post I made? The array still
reports as clean, but I haven't tried to mount it yet.
This array started out as a RAID1, which I changed to
RAID5, added a disk, reshaped, and added a spare to.
This worked great and I used it for a couple weeks
before deciding to go to RAID6.
Looking at the output of 'mdadm -E' and 'mdadm -D'
(again in the other post), it looks like there's
some inconsistency in the raid device for /dev/sde1.
-E reports it as number 4 and raid device 4. But,
-D says /dev/sde1 is number 4 and raid device 3.
Don't know if that means anything, but it's the
only thing I see that looks unusual.
Since the array is clean, is it safe to mount it?
It's actually a luks volume, fwiw. Thanks
--
hatta00@fastmail.fm
--
http://www.fastmail.fm - mmm... Fastmail...
* Re: Raid 5 to 6 grow stalled
2011-04-10 15:02 ` Edward Siefker
@ 2011-04-10 15:09 ` Mathias Burén
2011-04-10 21:52 ` NeilBrown
1 sibling, 0 replies; 6+ messages in thread
From: Mathias Burén @ 2011-04-10 15:09 UTC (permalink / raw)
To: Edward Siefker; +Cc: NeilBrown, linux-raid
On 10 April 2011 16:02, Edward Siefker <hatta00@fastmail.fm> wrote:
>
>>
>> Doesn't look good..
>> What version of mdadm? What kernel?
>>
>> NeilBrown
>>
>
>
>
> root@iblis:~# mdadm --version
> mdadm - v3.1.4 - 31st August 2010
> root@iblis:~# uname -a
> Linux iblis 2.6.38-2-amd64 #1 SMP Tue Mar 29 16:45:36 UTC 2011 x86_64
> GNU/Linux
>
> Did you see the other post I made? The array still
> reports as clean, but I haven't tried to mount it yet.
> This array started out as a RAID1, which I changed to
> RAID5, added a disk, reshaped, and added a spare to.
> This worked great and I used it for a couple weeks
> before deciding to go to RAID6.
>
>
> Looking at the output of 'mdadm -E' and 'mdadm -D'
> (again in the other post), it looks like there's
> some inconsistency in the raid device for /dev/sde1.
> -E reports it as number 4 and raid device 4. But,
> -D says /dev/sde1 is number 4 and raid device 3.
> Don't know if that means anything, but it's the
> only thing I see that looks unusual.
>
> Since the array is clean, is it safe to mount it?
> It's actually a luks volume, fwiw. Thanks
> --
>
> hatta00@fastmail.fm
>
> --
> http://www.fastmail.fm - mmm... Fastmail...
>
>
Just try a fsck -n ?
// Mathias
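Since the array contents are a LUKS volume, a read-only check along the lines Mathias suggests could look like this (a sketch, not from the thread; the mapping name "md2check" is invented, and /dev/md2 is the array being discussed):

# open the LUKS container read-only so nothing is written to the array
cryptsetup luksOpen --readonly /dev/md2 md2check
# check the filesystem inside without making any changes
fsck -n /dev/mapper/md2check
# tear the mapping down again
cryptsetup luksClose md2check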
* Re: Raid 5 to 6 grow stalled
2011-04-10 15:02 ` Edward Siefker
2011-04-10 15:09 ` Mathias Burén
@ 2011-04-10 21:52 ` NeilBrown
1 sibling, 0 replies; 6+ messages in thread
From: NeilBrown @ 2011-04-10 21:52 UTC (permalink / raw)
To: Edward Siefker; +Cc: linux-raid
On Sun, 10 Apr 2011 08:02:43 -0700 "Edward Siefker" <hatta00@fastmail.fm>
wrote:
>
> >
> > Doesn't look good..
> > What version of mdadm? What kernel?
> >
> > NeilBrown
> >
>
>
>
> root@iblis:~# mdadm --version
> mdadm - v3.1.4 - 31st August 2010
> root@iblis:~# uname -a
> Linux iblis 2.6.38-2-amd64 #1 SMP Tue Mar 29 16:45:36 UTC 2011 x86_64
> GNU/Linux
>
> Did you see the other post I made? The array still
> reports as clean, but I haven't tried to mount it yet.
> This array started out as a RAID1, which I changed to
> RAID5, added a disk, reshaped, and added a spare to.
> This worked great and I used it for a couple weeks
> before deciding to go to RAID6.
>
>
> Looking at the output of 'mdadm -E' and 'mdadm -D'
> (again in the other post), it looks like there's
> some inconsistency in the raid device for /dev/sde1.
> -E reports it as number 4 and raid device 4. But,
> -D says /dev/sde1 is number 4 and raid device 3.
> Don't know if that means anything, but it's the
> only thing I see that looks unusual.
>
> Since the array is clean, is it safe to mount it?
> It's actually a luks volume, fwiw. Thanks
That inconsistency is expected. The v0.90 metadata isn't able to represent
some state information that the running kernel can represent. So when the
metadata is written out it looks a bit different just as you noticed.
Your data is safe and it is easy to make it all happy again.
There are a couple of options...
The state of your array is that it has been converted to RAID6 in a special
layout where the Q blocks (the second parity block) are all on the last drive
instead of spread among all the drives.
mdadm then tried to start the process of converting this to a more normal
layout but because the array was "auto-read-only" (which means you hadn't
even tried to mount it or anything) it got confused and aborted.
This left some sysfs settings in an unusual state.
In particular:
sync_max is 16385 (sectors), so when the rebuild started it paused at 8192K
suspend_hi is 65536 so any IO lower than this address will block.
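These values can be read back directly before changing anything (a sketch; /dev/md2 is the array from this thread, while the commands below from Neil use md0 as a generic device name):

# current recovery ceiling (16385 sectors here, per the explanation above)
cat /sys/block/md2/md/sync_max
# current suspend window; IO below this address blocks
cat /sys/block/md2/md/suspend_hi
# what the array is currently doing (recover / frozen / idle ...)
cat /sys/block/md2/md/sync_action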
The simplest thing to do is:
echo max > /sys/block/md0/md/sync_max
echo 0 > /sys/block/md0/md/suspend_hi
this will allow the recovery of sde1 to complete and you will have full access
to your data.
This will result in the array being in the unusual layout with non-rotated Q.
This can then be fixed with
mdadm --grow /dev/md0 --layout=normalise --backup=/whatever
If you don't want to wait for both the recovery and the subsequent
reshape you could do them both at once:
- issue the above two 'echo' commands.
- fail and remove sde1
mdadm /dev/md0 -f /dev/sde1
mdadm /dev/md0 -r /dev/sde1
- then freeze recovery in the array, add the device back in, and start the
grow, so:
echo frozen > /sys/block/md0/md/sync_action
mdadm /dev/md0 --add /dev/sde1
mdadm --grow /dev/md0 --layout=normalise --backup=/whereever
You don't need to use the same backup file as before, but make sure you
give the name of a file which doesn't exist. Otherwise mdadm will complain,
unfreeze the array, and it will start recovery; that isn't a big problem,
just not part of the plan.
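For reference, the combined sequence with the device names from this thread substituted in (md2 rather than the generic md0; the backup file path is made up and must not already exist) would look roughly like:

# clear the limits left behind by the aborted reshape
echo max > /sys/block/md2/md/sync_max
echo 0 > /sys/block/md2/md/suspend_hi
# take the partially rebuilt spare back out
mdadm /dev/md2 -f /dev/sde1
mdadm /dev/md2 -r /dev/sde1
# freeze recovery, re-add the disk, then reshape to the normal RAID6 layout
echo frozen > /sys/block/md2/md/sync_action
mdadm /dev/md2 --add /dev/sde1
mdadm --grow /dev/md2 --layout=normalise --backup-file=/mnt/sda6/reshape-backup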
And you did exactly the right thing to ask instead of fiddling with the
array!! It was in quite an unusual state, and while it is unlikely you would
have corrupted any data, waiting for a definitive answer is safest!
The next version of mdadm will check for arrays that are 'auto-readonly' and
not get confused by them.
Thanks,
NeilBrown