* spare not becoming active
@ 2007-06-27 20:34 Simon
2007-06-28 16:06 ` Simon
0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-06-27 20:34 UTC (permalink / raw)
To: linux-raid
Hi,
not sure if this is the place to ask, as I'm trying to use and learn RAID, so I
lack some knowledge of general practice and am still having trouble figuring
out mdadm...
Anyways, the story is simple: I had a RAID-5 going on 3 USB keys. The keys are
partitioned with fdisk to have a first ext2 partition and a second raid-autodetect
partition (I didn't know what to choose here; I believe it doesn't matter if I
don't boot from it...?)
All 3 devices were active and fine. I decided to make an experiment and pulled
one out (I wasn't sure if it was safe to do, but) and I immediately saw it was
missing. I removed it, replugged the drive, mounted partition 1, fine, added it
to the array, fine. The array started reconstructing here, I think.
Then another drive, which I was suspecting of being faulty, died. I did the same
remove/add procedure, and it showed up as a spare.
Below are some outputs you'll recognize. My question is: how do you get the
spare devices to become active again? And I currently have only one active
device, while I'm supposed to have at least two to survive on RAID-5. So, what
about the data (I don't care, as it was just a test, but): is it completely
lost, or is there a chance?
Thanks for any info/pointers!
Simon
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
[multipath] [faulty]
md0 : active raid5 sdf2[3](S) sde2[4](S) sdd2[5](F) sdc2[1] sdb2[6](F)
1952128 blocks level 5, 64k chunk, algorithm 2 [3/1] [_U_]
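(As an aside, the `[3/1] [_U_]` fields in that /proc/mdstat line encode the array
state: 3 raid devices wanted, 1 in sync, with `U` marking an up slot and `_` a
down one. A small Python sketch, not part of mdadm, just decoding those fields:)

```python
import re

def parse_mdstat_status(line):
    """Decode the '[n/m] [flags]' status fields from a /proc/mdstat line.

    n = raid devices the array wants, m = devices currently in sync,
    'U' = slot up, '_' = slot down/missing.
    """
    m = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
    if not m:
        raise ValueError("no md status fields found")
    want, in_sync, flags = int(m.group(1)), int(m.group(2)), m.group(3)
    return {
        "raid_devices": want,
        "in_sync": in_sync,
        "degraded": in_sync < want,
        "slots": ["up" if c == "U" else "down" for c in flags],
    }

status = parse_mdstat_status(
    "1952128 blocks level 5, 64k chunk, algorithm 2 [3/1] [_U_]")
print(status)  # → 1 of 3 in sync, slots down/up/down
```

With `[3/1]`, only one member of a 3-device RAID-5 is in sync; single-parity
RAID-5 needs at least two of three, which is why the data question matters here.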
============================================================================
/dev/md0:
Version : 00.90.03
Creation Time : Sun Jun 24 09:30:24 2007
Raid Level : raid5
Array Size : 1952128 (1906.70 MiB 1998.98 MB)
Used Dev Size : 976064 (953.35 MiB 999.49 MB)
Raid Devices : 3
Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Wed Jun 27 16:35:44 2007
State : clean, degraded
Active Devices : 1
Working Devices : 3
Failed Devices : 2
Spare Devices : 2
Layout : left-symmetric
Chunk Size : 64K
UUID : 170691d8:28aaa115:628a7a6d:3715a011
Events : 0.834
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 34 1 active sync /dev/sdc2
2 0 0 2 removed
3 8 82 - spare /dev/sdf2
4 8 66 - spare /dev/sde2
5 8 50 - faulty spare
6 8 18 - faulty spare
============================================================================
brw-rw---- 1 root disk 8, 1 Jun 22 23:24 /dev/sda1
brw-rw---- 1 root disk 8, 33 Jun 22 23:24 /dev/sdc1
brw-rw---- 1 root disk 8, 65 Jun 27 14:26 /dev/sde1
brw-rw---- 1 root disk 8, 81 Jun 27 15:14 /dev/sdf1
============================================================================
[dev 9, 0] /dev/md0 170691D8.28AAA115.628A7A6D.3715A011 online
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 8, 34] /dev/sdc2 170691D8.28AAA115.628A7A6D.3715A011 good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 8, 82] /dev/sdf2 170691D8.28AAA115.628A7A6D.3715A011 spare
[dev 8, 66] /dev/sde2 170691D8.28AAA115.628A7A6D.3715A011 spare
* Re: spare not becoming active
2007-06-27 20:34 spare not becoming active Simon
@ 2007-06-28 16:06 ` Simon
2007-07-03 20:53 ` Simon
0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-06-28 16:06 UTC (permalink / raw)
To: linux-raid
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 1 8 34 1 active sync /dev/sdc2
> 2 0 0 2 removed
>
> 3 8 82 - spare /dev/sdf2
> 4 8 66 - spare /dev/sde2
> 5 8 50 - faulty spare
> 6 8 18 - faulty spare
I was trying a couple of things, but never got this status to change.
At one point I stopped the array and restarted it, and it didn't work (I had
done it before, so I don't see why not...):
# /sbin/mdadm -R /dev/md0
mdadm: failed to run array /dev/md0: Invalid argument
I'm starting to think the documentation I read was very outdated, or only
touched the surface of the subject. Can you guys recommend a good read, like a
companion to the man page?
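(For the record, the usual way out of this state, where md refuses to promote
spares because it still counts the old members as failed, is to stop the array
and force-assemble it from the members that hold data. A sketch only, using the
device names from the first post; adapt them to your own listing, and note that
--force tells mdadm to accept members with stale event counts, so only use it
if you trust the data on them:)

```shell
# Stop the degraded array first; md won't reshuffle a running one.
mdadm --stop /dev/md0

# Force-assemble from the members that were last in sync.
mdadm --assemble --force /dev/md0 /dev/sdc2 /dev/sdd2

# Re-add the remaining device; it joins as a spare and rebuilds to active.
mdadm /dev/md0 --add /dev/sdf2
```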
Thanks,
Simon
* Re: spare not becoming active
2007-06-28 16:06 ` Simon
@ 2007-07-03 20:53 ` Simon
2007-07-03 22:20 ` Nix
0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-07-03 20:53 UTC (permalink / raw)
To: linux-raid
I've got a ghost in my system!
Ok, I'm still having weird issues, and as nobody replied, I'm still
unsure how to proceed...
This time I tried reformatting (dd'ed zeros over the whole drives) and
repartitioning them:
Device Boot Start End Blocks Id System
/dev/sdb1 1 1 961 83 Linux
/dev/sdb2 2 1015 1005888 fd Linux raid autodetect
I have 3 identical drives, freshly made, and I issue the command:
mdadm --create --verbose /dev/md1 --level=raid5 --raid-devices=3
/dev/sdb2 /dev/sdc2 /dev/sdd2
My system is a fresh install (nothing left over in /etc, etc.), so I really
don't get what's going on. Check the output of several programs:
/dev/md1:
Version : 00.90.03
Creation Time : Tue Jul 3 16:29:38 2007
Raid Level : raid5
Array Size : 2011648 (1964.83 MiB 2059.93 MB)
Used Dev Size : 1005824 (982.41 MiB 1029.96 MB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1
Persistence : Superblock is persistent
Update Time : Tue Jul 3 16:31:05 2007
State : clean, degraded, recovering
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 39% complete
UUID : 84aa4aaf:8b2c555e:3f9ae70d:2eedd5b3
Events : 0.4
Number Major Minor RaidDevice State
0 8 18 0 active sync /dev/sdb2
1 8 34 1 active sync /dev/sdc2
3 8 50 2 spare rebuilding /dev/sdd2
================================================================================
[dev 9, 1] /dev/md1 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
[dev 8, 18] /dev/sdb2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
[dev 8, 34] /dev/sdc2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 8, 50] /dev/sdd2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare
* Re: spare not becoming active
2007-07-03 20:53 ` Simon
@ 2007-07-03 22:20 ` Nix
2007-07-03 22:52 ` Simon
0 siblings, 1 reply; 6+ messages in thread
From: Nix @ 2007-07-03 22:20 UTC (permalink / raw)
To: Simon; +Cc: linux-raid
On 3 Jul 2007, Simon spake thusly:
> I have 3 identical drives, fresh made, i issue the command:
> mdadm --create --verbose /dev/md1 --level=raid5 --raid-devices=3
> /dev/sdb2 /dev/sdc2 /dev/sdd2
OK...
> /dev/md1:
> Version : 00.90.03
> Creation Time : Tue Jul 3 16:29:38 2007
> Raid Level : raid5
> Array Size : 2011648 (1964.83 MiB 2059.93 MB)
> Used Dev Size : 1005824 (982.41 MiB 1029.96 MB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 1
> Persistence : Superblock is persistent
>
> Update Time : Tue Jul 3 16:31:05 2007
> State : clean, degraded, recovering
OK...
> Active Devices : 2
> Working Devices : 3
> Failed Devices : 0
> Spare Devices : 1
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Rebuild Status : 39% complete
>
> UUID : 84aa4aaf:8b2c555e:3f9ae70d:2eedd5b3
> Events : 0.4
>
> Number Major Minor RaidDevice State
> 0 8 18 0 active sync /dev/sdb2
> 1 8 34 1 active sync /dev/sdc2
> 3 8 50 2 spare rebuilding /dev/sdd2
This is normal for RAID-5 array construction. Rather than force you to
wait for ages until the RAID parity has been written, mdadm creates a
degraded two-element array with a single spare and fails over to it; the
rebuild involved in the failover automatically constructs the parity.
(Perhaps mdadm --detail could note that the first reconstruction hasn't
finished and tell the user what's going on.)
Anyway, you're at 39%: wait, and eventually the reconstruction will be
done. Obviously if you shut down before then it'll come up degraded and
you'll have to wait while it reconstructs again. So, um, don't do
that. :)
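(The point that the initial rebuild is what writes the parity can be shown with
the XOR parity RAID-5 uses; a toy Python sketch, ignoring chunking and the
left-symmetric rotation:)

```python
def xor_blocks(*blocks):
    """XOR equal-length byte strings, as RAID-5 parity does per stripe."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

d0 = b"hello---"
d1 = b"raid5---"
parity = xor_blocks(d0, d1)   # what the initial "rebuild" computes

# Losing any single member is recoverable: XOR the survivors.
assert xor_blocks(d1, parity) == d0

# Losing two of three members (the first-post situation) is not:
# with only d1 left there is nothing to XOR against, so d0 is gone.
```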
> ================================================================================
> [dev 9, 1] /dev/md1 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
> [dev 8, 18] /dev/sdb2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> [dev 8, 34] /dev/sdc2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> [dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
> [dev 8, 50] /dev/sdd2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare
That output is rather strange though, mainly because of the mystic
missing drive with no name. mdadm bug?
--
`... in the sense that dragons logically follow evolution so they would
be able to wield metal.' --- Kenneth Eng's colourless green ideas sleep
furiously
* Re: spare not becoming active
2007-07-03 22:20 ` Nix
@ 2007-07-03 22:52 ` Simon
2007-07-03 22:54 ` Simon
0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-07-03 22:52 UTC (permalink / raw)
To: linux-raid
> This is normal for a RAID-5 array construction. Rather than force you to
> wait for ages until the RAID parity has been written, mdadm creates a
> degraded two-element array with a single spare and fails over to it; the
> rebuild involved in the failover automatically constructs the parity.
Makes sense. And I was aware that it was reconstructing...
> > [dev 9, 1] /dev/md1 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
> > [dev 8, 18] /dev/sdb2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> > [dev 8, 34] /dev/sdc2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> > [dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
> > [dev 8, 50] /dev/sdd2 84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare
>
> That output is rather strange though, mainly because of the mystic
> missing drive with no name. mdadm bug?
That missing drive is the story of my life with RAID!!
You must have missed the previous posts I sent in this thread, where I
was getting one active device, two spares (not becoming active), and 2
missing drives. As if the missing drives were taking the spots of
active ones, and the waiting spares could not take those spots.
Well, that's how I understood it, with my noobish mind! ;)
The worst part is that even though no trace is left on my system after a
reboot (everything is written to a tmpfs), and even when I was formatting
the devices (e.g. /dev/sdb2) or using `mdadm --zero-superblock ...`, the
arrays were still being created with a missing device.
Anyway, good news! After reconstruction the output looks perfect; I
tried soft-failing one device and it came back just fine. The problem was
with a real failure (see first post).
Thanks,
Simon
* Re: spare not becoming active
2007-07-03 22:52 ` Simon
@ 2007-07-03 22:54 ` Simon
0 siblings, 0 replies; 6+ messages in thread
From: Simon @ 2007-07-03 22:54 UTC (permalink / raw)
To: linux-raid
Oops, sorry for the double post...