linux-raid.vger.kernel.org archive mirror
* spare not becoming active
@ 2007-06-27 20:34 Simon
  2007-06-28 16:06 ` Simon
  0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-06-27 20:34 UTC (permalink / raw)
  To: linux-raid

Hi,
  not sure if this is the right place to ask; I'm trying to use and learn RAID, so
I lack some knowledge of general practice and am still having trouble figuring
out mdadm...

Anyway, the story is simple: I had a raid5 running on 3 USB keys.  The keys are
partitioned with fdisk into a first ext2 partition and a second raid-autodetect
partition (I didn't know what to choose here; I believe it doesn't matter as long
as I don't boot from it...?)

All 3 devices were active and fine.  I decided to run an experiment and pulled
one out (I wasn't sure it was safe to do), and I immediately saw it was missing.
I removed it from the array, replugged the drive, mounted partition 1 (fine),
and added it back to the array (fine).  The array started reconstructing at that
point, I think.

Then another drive, which I had suspected of being faulty, died.  I did the same
remove/add procedure, and it showed up as a spare.
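The remove/add procedure, sketched for one member (device names are examples
from my setup and may differ):

```shell
# Sketch of the remove/add cycle, assuming the array is /dev/md0 and the
# pulled device came back as /dev/sdd2:
mdadm /dev/md0 --fail /dev/sdd2      # mark it faulty, if md hasn't already
mdadm /dev/md0 --remove /dev/sdd2    # drop it from the array
mdadm /dev/md0 --add /dev/sdd2       # plain --add brings it back as a spare
                                     # first; it only turns active once a
                                     # rebuild onto it can actually run
```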

Below are some outputs you'll recognize.  My question is: how do you get the
spare devices to become active again?  Also, I currently have only one active
device, while I'm supposed to have at least two to survive on raid5.  So what
about the data (I don't care, as it was just a test, but): is it completely
lost, or is there a chance of recovery?
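With only one of three members active, a raid5 has no redundancy left for a
spare to rebuild from.  If the "failed" devices are in fact healthy (say, a
transient USB disconnect), one common recovery is a forced re-assembly from
the original members.  This is only a sketch, using the device names from the
output below:

```shell
# Hedged sketch: force md to accept members whose event counts went stale
# when they dropped out. Device names are from the output below and may
# differ on your system.
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sdb2 /dev/sdc2 /dev/sdd2
cat /proc/mdstat          # check how many members came back active
```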

Thanks for any info/pointers!
   Simon

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
[multipath] [faulty]
md0 : active raid5 sdf2[3](S) sde2[4](S) sdd2[5](F) sdc2[1] sdb2[6](F)
       1952128 blocks level 5, 64k chunk, algorithm 2 [3/1] [_U_]
============================================================================
/dev/md0:
         Version : 00.90.03
   Creation Time : Sun Jun 24 09:30:24 2007
      Raid Level : raid5
      Array Size : 1952128 (1906.70 MiB 1998.98 MB)
   Used Dev Size : 976064 (953.35 MiB 999.49 MB)
    Raid Devices : 3
   Total Devices : 5
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Wed Jun 27 16:35:44 2007
           State : clean, degraded
  Active Devices : 1
Working Devices : 3
  Failed Devices : 2
   Spare Devices : 2

          Layout : left-symmetric
      Chunk Size : 64K

            UUID : 170691d8:28aaa115:628a7a6d:3715a011
          Events : 0.834

     Number   Major   Minor   RaidDevice State
        0       0        0        0      removed
        1       8       34        1      active sync   /dev/sdc2
        2       0        0        2      removed

        3       8       82        -      spare   /dev/sdf2
        4       8       66        -      spare   /dev/sde2
        5       8       50        -      faulty spare
        6       8       18        -      faulty spare
============================================================================
brw-rw---- 1 root disk 8,  1 Jun 22 23:24 /dev/sda1
brw-rw---- 1 root disk 8, 33 Jun 22 23:24 /dev/sdc1
brw-rw---- 1 root disk 8, 65 Jun 27 14:26 /dev/sde1
brw-rw---- 1 root disk 8, 81 Jun 27 15:14 /dev/sdf1
============================================================================
[dev   9,   0] /dev/md0         170691D8.28AAA115.628A7A6D.3715A011 online
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   8,  34] /dev/sdc2        170691D8.28AAA115.628A7A6D.3715A011 good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   8,  82] /dev/sdf2        170691D8.28AAA115.628A7A6D.3715A011 spare
[dev   8,  66] /dev/sde2        170691D8.28AAA115.628A7A6D.3715A011 spare

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: spare not becoming active
  2007-06-27 20:34 spare not becoming active Simon
@ 2007-06-28 16:06 ` Simon
  2007-07-03 20:53   ` Simon
  0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-06-28 16:06 UTC (permalink / raw)
  To: linux-raid

>     Number   Major   Minor   RaidDevice State
>        0       0        0        0      removed
>        1       8       34        1      active sync   /dev/sdc2
>        2       0        0        2      removed
> 
>        3       8       82        -      spare   /dev/sdf2
>        4       8       66        -      spare   /dev/sde2
>        5       8       50        -      faulty spare
>        6       8       18        -      faulty spare

I tried a couple of things, but never managed to change this status.
At one point I stopped the array and restarted it, and it didn't work. (I had
done that before, so I don't see why not...)

# /sbin/mdadm -R /dev/md0
mdadm: failed to run array /dev/md0: Invalid argument
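--run refuses to start an array that lacks enough working members, hence the
"Invalid argument".  One thing I can check is what each member's superblock
believes about the array (a sketch, using my device names):

```shell
# Each member carries its own copy of the superblock; the Events counter
# shows which copies went stale (they stopped updating when the device
# dropped out of the array):
mdadm --examine /dev/sd[b-f]2 | grep -E '^/dev/|Events|State'
```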

I'm starting to think the documentation I read was very outdated, or only
scratched the surface of the subject.  Can you guys recommend some good reading,
like a companion to the man page?

Thanks,
   Simon


* Re: spare not becoming active
  2007-06-28 16:06 ` Simon
@ 2007-07-03 20:53   ` Simon
  2007-07-03 22:20     ` Nix
  0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-07-03 20:53 UTC (permalink / raw)
  To: linux-raid

I've got a ghost in my system!

OK, I'm still having weird issues, and since nobody replied, I'm still
unsure how to proceed...
This time I tried reformatting the drives (dd'ed zeros over the whole devices)
and repartitioning them.

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1           1         961   83  Linux
/dev/sdb2               2        1015     1005888   fd  Linux raid autodetect

I have 3 identical, freshly made drives, and I issue the command:
mdadm --create --verbose /dev/md1 --level=raid5 --raid-devices=3
/dev/sdb2 /dev/sdc2 /dev/sdd2

My system is a fresh install (nothing left over in /etc, etc.), so I really
don't understand what's going on.  Check the output of several programs:

/dev/md1:
        Version : 00.90.03
  Creation Time : Tue Jul  3 16:29:38 2007
     Raid Level : raid5
     Array Size : 2011648 (1964.83 MiB 2059.93 MB)
  Used Dev Size : 1005824 (982.41 MiB 1029.96 MB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jul  3 16:31:05 2007
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 39% complete

           UUID : 84aa4aaf:8b2c555e:3f9ae70d:2eedd5b3
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8       34        1      active sync   /dev/sdc2
       3       8       50        2      spare rebuilding   /dev/sdd2
================================================================================
[dev   9,   1] /dev/md1         84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
[dev   8,  18] /dev/sdb2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
[dev   8,  34] /dev/sdc2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
[dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
[dev   8,  50] /dev/sdd2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare


* Re: spare not becoming active
  2007-07-03 20:53   ` Simon
@ 2007-07-03 22:20     ` Nix
  2007-07-03 22:52       ` Simon
  0 siblings, 1 reply; 6+ messages in thread
From: Nix @ 2007-07-03 22:20 UTC (permalink / raw)
  To: Simon; +Cc: linux-raid

On 3 Jul 2007, Simon spake thusly:
> I have 3 identical drives, fresh made, i issue the command:
> mdadm --create --verbose /dev/md1 --level=raid5 --raid-devices=3
> /dev/sdb2 /dev/sdc2 /dev/sdd2

OK...

> /dev/md1:
>        Version : 00.90.03
>  Creation Time : Tue Jul  3 16:29:38 2007
>     Raid Level : raid5
>     Array Size : 2011648 (1964.83 MiB 2059.93 MB)
>  Used Dev Size : 1005824 (982.41 MiB 1029.96 MB)
>   Raid Devices : 3
>  Total Devices : 3
> Preferred Minor : 1
>    Persistence : Superblock is persistent
>
>    Update Time : Tue Jul  3 16:31:05 2007
>          State : clean, degraded, recovering

OK...

> Active Devices : 2
> Working Devices : 3
> Failed Devices : 0
>  Spare Devices : 1
>
>         Layout : left-symmetric
>     Chunk Size : 64K
>
> Rebuild Status : 39% complete
>
>           UUID : 84aa4aaf:8b2c555e:3f9ae70d:2eedd5b3
>         Events : 0.4
>
>    Number   Major   Minor   RaidDevice State
>       0       8       18        0      active sync   /dev/sdb2
>       1       8       34        1      active sync   /dev/sdc2
>       3       8       50        2      spare rebuilding   /dev/sdd2

This is normal for a RAID-5 array construction. Rather than force you to
wait for ages until the RAID parity has been written, mdadm creates a
degraded two-element array with a single spare and fails over to it; the
rebuild involved in the failover automatically constructs the parity.

(Perhaps mdadm --detail could note that the first reconstruction hasn't
finished and tell the user what's going on.)

Anyway, you're at 39%: wait, and eventually the reconstruction will be
done.  Obviously, if you shut down before then it'll come up degraded and
you'll have to wait while it reconstructs again. So, um, don't do
that. :)
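One minimal way to follow the rebuild and block until it's done (assuming a
reasonably recent mdadm and /dev/md1 as above):

```shell
cat /proc/mdstat        # shows a "recovery = NN%" progress line
mdadm --wait /dev/md1   # returns once the resync/recovery has finished
```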

> ================================================================================
> [dev   9,   1] /dev/md1         84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
> [dev   8,  18] /dev/sdb2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> [dev   8,  34] /dev/sdc2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> [dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
> [dev   8,  50] /dev/sdd2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare

That output is rather strange though, mainly because of the mystic
missing drive with no name. mdadm bug?

-- 
`... in the sense that dragons logically follow evolution so they would
 be able to wield metal.' --- Kenneth Eng's colourless green ideas sleep
 furiously


* Re: spare not becoming active
  2007-07-03 22:20     ` Nix
@ 2007-07-03 22:52       ` Simon
  2007-07-03 22:54         ` Simon
  0 siblings, 1 reply; 6+ messages in thread
From: Simon @ 2007-07-03 22:52 UTC (permalink / raw)
  To: linux-raid

> This is normal for a RAID-5 array construction. Rather than force you to
> wait for ages until the RAID parity has been written, mdadm creates a
> degraded two-element array with a single spare and fails over to it; the
> rebuild involved in the failover automatically constructs the parity.

Makes sense.  And I was aware that it was reconstructing...

> > [dev   9,   1] /dev/md1         84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 online
> > [dev   8,  18] /dev/sdb2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> > [dev   8,  34] /dev/sdc2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 good
> > [dev   ?,   ?] (unknown)        00000000.00000000.00000000.00000000 missing
> > [dev   8,  50] /dev/sdd2        84AA4AAF.8B2C555E.3F9AE70D.2EEDD5B3 spare
>
> That output is rather strange though, mainly because of the mystic
> missing drive with no name. mdadm bug?

That missing drive is the story of my life with RAID!!
You must have missed the previous posts I sent in this thread, where I
was getting one active device, two spares (not becoming active), and two
missing drives.  It's as if the missing drives were taking the spots of
active devices, and the waiting spares could not take those spots.

Well, that's how I understood it, with my noobish mind! ;)

The worst part is that, even though no trace is left on my system after a
reboot (everything is written to a tmpfs), and even though I was formatting
the devices (e.g. /dev/sdb2) or using `mdadm --zero-superblock ...`, they
were still being created with a missing slot.

Anyway, good news!  After the reconstruction the output looks perfect; I
tried soft-failing one device and it came back just fine.  The problem only
occurred with a real failure (see my first post).

Thanks,
 Simon


* Re: spare not becoming active
  2007-07-03 22:52       ` Simon
@ 2007-07-03 22:54         ` Simon
  0 siblings, 0 replies; 6+ messages in thread
From: Simon @ 2007-07-03 22:54 UTC (permalink / raw)
  To: linux-raid

Oups, sorry for the double posts...

