* strange RAID5 problem
@ 2006-05-09  5:30 Maurice Hilarius
From: Maurice Hilarius @ 2006-05-09  5:30 UTC
  To: linux-raid; +Cc: neilb

Good evening.

I am having a bit of a problem with a largish RAID5 set.
Now it is looking more and more like I am about to lose all the data on
it, so I am asking (begging?) to see if anyone can help me sort this out.


Here is the scenario: 16 SATA disks connected to a pair of AMCC (3Ware)
9550SX-12 controllers.

RAID 5, 15 disks, plus 1 hot spare.

SMART started reporting errors on a disk, so it was retired with the
3Ware CLI, then removed and replaced.
The new disk had a JBOD signature added with the 3Ware CLI, then a
single large partition was created with fdisk.

At this point I would expect to be able to add the disk back to the
array by:
[root@box ~]# mdadm /dev/md3 -a /dev/sdw1

But, I get this error message:
mdadm: hot add failed for /dev/sdw1: No such device

What? We just made the partition on sdw a moment ago in fdisk. It IS there!

So, we look around a bit:
# cat /proc/mdstat

md3 : inactive sdq1[0] sdaf1[15] sdae1[14] sdad1[13] sdac1[12] sdab1[11]
sdaa1[10] sdz1[9] sdy1[8] sdx1[7] sdv1[5] sdu1[4] sdt1[3] sds1[2]
sdr1[1]
      5860631040 blocks

Yup, that looks correct, missing sdw1[6]
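
As a rough cross-check of that reading, here is a small shell sketch that
prints which slot numbers are absent from an md member list. The function
name missing_slots is made up for illustration (it is not part of mdadm),
and it assumes the member list uses the name[slot] form shown above:

```shell
# Hypothetical helper: print slot numbers missing from an md member list.
# $1    = expected number of raid devices (16 for md3 here)
# stdin = the mdstat lines for one array, e.g. "md3 : inactive sdq1[0] ..."
missing_slots() {
    total=$1
    # Pull out every "[N]" token and keep just the number, one per line.
    present=$(grep -o '\[[0-9][0-9]*\]' | tr -d '[]' | sort -n)
    for i in $(seq 0 $((total - 1))); do
        # Print the slot if no line matches it exactly.
        echo "$present" | grep -qx "$i" || echo "$i"
    done
}
```

It could be fed with something like
grep -A2 '^md3' /proc/mdstat | missing_slots 16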

Looking more:
# mdadm -D /dev/md3

/dev/md3:
        Version : 00.90.01
  Creation Time : Tue Jan 10 19:21:23 2006
     Raid Level : raid5
    Device Size : 390708736 (372.61 GiB 400.09 GB)
   Raid Devices : 16
  Total Devices : 15
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Mon May  8 19:33:36 2006
          State : active, degraded
 Active Devices : 15
Working Devices : 15
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 256K

           UUID : 771aa4c0:48d9b467:44c847e2:9bc81c43
         Events : 0.1818687

    Number   Major   Minor   RaidDevice State
       0      65        1        0      active sync   /dev/sdq1
       1      65       17        1      active sync   /dev/sdr1
       2      65       33        2      active sync   /dev/sds1
       3      65       49        3      active sync   /dev/sdt1
       4      65       65        4      active sync   /dev/sdu1
       5      65       81        5      active sync   /dev/sdv1
     609       0        0        0      removed
       7      65      113        7      active sync   /dev/sdx1
       8      65      129        8      active sync   /dev/sdy1
       9      65      145        9      active sync   /dev/sdz1
      10      65      161       10      active sync   /dev/sdaa1
      11      65      177       11      active sync   /dev/sdab1
      12      65      193       12      active sync   /dev/sdac1
      13      65      209       13      active sync   /dev/sdad1
      14      65      225       14      active sync   /dev/sdae1
      15      65      241       15      active sync   /dev/sdaf1

That also looks to be as expected.

So, let's try to assemble it again and force sdw1 into it:

[root@box ~]# mdadm --assemble /dev/md3 \
    /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 \
    /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 \
    /dev/sdac1 /dev/sdad1 /dev/sdae1 /dev/sdaf1
mdadm: superblock on /dev/sdw1 doesn't match others - assembly aborted

[root@box ~]# mdadm --assemble /dev/md3 \
    /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 \
    /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 \
    /dev/sdad1 /dev/sdae1 /dev/sdaf1
mdadm: failed to RUN_ARRAY /dev/md3: Invalid argument

[root@box ~]# mdadm -A /dev/md3 \
    /dev/sdq1 /dev/sdr1 /dev/sds1 /dev/sdt1 /dev/sdu1 /dev/sdv1 \
    /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 \
    /dev/sdad1 /dev/sdae1 /dev/sdaf1
mdadm: device /dev/md3 already active - cannot assemble it

[root@box ~]# cat /proc/mdstat
Personalities : [raid1] [raid5]
md1 : active raid1 hdb3[1] hda3[0]
      115105600 blocks [2/2] [UU]

md2 : active raid5 sdp1[15] sdo1[14] sdn1[13] sdm1[12] sdl1[11] sdk1[10]
sdj1[9] sdi1[8] sdh1[7] sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1]
sda1[0]
      5860631040 blocks level 5, 256k chunk, algorithm 2 [16/16]
[UUUUUUUUUUUUUUUU]

md3 : inactive sdq1[0] sdaf1[15] sdae1[14] sdad1[13] sdac1[12] sdab1[11]
sdaa1[10] sdz1[9] sdy1[8] sdx1[7] sdv1[5] sdu1[4] sdt1[3] sds1[2]
sdr1[1]
      5860631040 blocks
md0 : active raid1 hdb1[1] hda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>

[root@box ~]# mdadm /dev/md3 -a /dev/sdw1
mdadm: hot add failed for /dev/sdw1: No such device

OK, let's mount the degraded RAID and try to copy the files to somewhere
else, so we can make it from scratch:

[root@box ~]# mount /dev/md3 /all/boxw16/
/dev/md3: Invalid argument
mount: /dev/md3: can't read superblock

[root@box ~]# fsck /dev/md3
fsck 1.35 (28-Feb-2004)
e2fsck 1.35 (28-Feb-2004)
fsck.ext2: Invalid argument while trying to open /dev/md3

The superblock could not be read..

[root@box ~]# mke2fs -n /dev/md3
mke2fs 1.35 (28-Feb-2004)
mke2fs: Device size reported to be zero.  Invalid partition specified,
or partition table wasn't reread after running fdisk, due to
a modified partition being busy and in use.  You may need to
reboot to re-read your partition table.


So, now what to do?

Any ideas would be DEEPLY appreciated!


-- 

Regards,
	Maurice


