From: hgichon
Subject: Re: rescue an alien md raid5
Date: Tue, 24 Feb 2009 09:56:28 +0900
Message-ID: <49A345BC.1010206@gluesys.com>
References: <200902231013.46082.harry.mangalam@uci.edu>
In-Reply-To: <200902231013.46082.harry.mangalam@uci.edu>
To: Harry Mangalam
Cc: linux-raid
List-Id: linux-raid.ids

I read through the USRobotics 8700 feature list; two items stand out:

# Supports linear storage use (all drives as one logical drive) or use multiple drives as primary and back-up storage
# Dynamic logical drive sizing - Add an additional drive, when needed, and it will be integrated into the logical drive with no effect on data

That "dynamic logical drive sizing" feature makes me think there may be an LVM layer sitting between the md array and the filesystem. Maybe that is what you are missing? Try vgscan / vgchange -ay and see whether a volume group appears (a rough command sketch is at the bottom of this mail, below your quoted message).

Best regards.

-kpkim

Harry Mangalam wrote:
> Here's an unusual (long) tale of woe.
>
> We had a USRobotics 8700 NAS appliance with 4 SATA disks in RAID5:
>
> which was a fine (if crude) ARM-based Linux NAS until it stroked out at some point, leaving us with a degraded RAID5 and comatose NAS device.
>
> We'd like to get the files back of course and I've moved the disks to a Linux PC, hooked them up to a cheap Silicon Image 4x SATA controller and brought up the whole frankenmess with mdadm. It reported a clean but degraded array:
>
> ===============================================================
>
> root@pnh-rcs:/# mdadm --detail /dev/md0
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Wed Feb 14 16:30:17 2007
>      Raid Level : raid5
>      Array Size : 1464370176 (1396.53 GiB 1499.52 GB)
>   Used Dev Size : 488123392 (465.51 GiB 499.84 GB)
>    Raid Devices : 4
>   Total Devices : 3
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Fri Dec 12 20:26:27 2008
>           State : clean, degraded
>  Active Devices : 3
> Working Devices : 3
>  Failed Devices : 0
>   Spare Devices : 0
>
>          Layout : left-symmetric
>      Chunk Size : 64K
>
>            UUID : 7a60cd58:ad85ebdc:3b55d79a:a33c7fe6
>          Events : 0.264294
>
>     Number   Major   Minor   RaidDevice State
>        0       0        0        0      removed
>        1       8       35        1      active sync   /dev/sdc3
>        2       8       51        2      active sync   /dev/sdd3
>        3       8       67        3      active sync   /dev/sde3
> ===============================================================
>
> The original 500G Maxtor disks were formatted in 3 partitions as follows:
>
> (for /dev/sd[bcde])
> disk sdb was bad so I had to replace it.
>
> ===============================================================
> Disk /dev/sdc: 500.1 GB, 500107862016 bytes
> 16 heads, 63 sectors/track, 969021 cylinders
> Units = cylinders of 1008 * 512 = 516096 bytes
> Disk identifier: 0x00000000
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdc1               1         261      131543+  83  Linux
> /dev/sdc2             262         522      131544   82  Linux swap / Solaris
> /dev/sdc3             523      969022   488123496+  89  Unknown
> ===============================================================
>
> I formatted the replacement (different make/layout - Seagate) as a single partition:
> /dev/sdb1:
> ===============================================================
> Disk /dev/sdb: 500.1 GB, 500107862016 bytes
> 255 heads, 63 sectors/track, 60801 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
> Disk identifier: 0x21d01216
>
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1               1       60801   488384001   83  Linux
> ===============================================================
>
> and tried to rebuild the raid by stopping the raid, removing the bad disk, adding the new disk.
> It came up and reported that it was rebuilding. After several hours, it rebuilt and reported itself clean (although during a reboot, it became /dev/md1 instead of md0)
>
> ===============================================================
> $ cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> md1 : active raid5 sdb1[0] sde3[3] sdd3[2] sdc3[1]
>       1464370176 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
> ===============================================================
>
> ===============================================================
> $ mdadm --detail /dev/md1
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Wed Feb 14 16:30:17 2007
>      Raid Level : raid5
>      Array Size : 1464370176 (1396.53 GiB 1499.52 GB)
>   Used Dev Size : 488123392 (465.51 GiB 499.84 GB)
>    Raid Devices : 4
>   Total Devices : 4
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Feb 23 09:06:27 2009
>           State : clean
>  Active Devices : 4
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 0
>
>          Layout : left-symmetric
>      Chunk Size : 64K
>
>            UUID : 7a60cd58:ad85ebdc:3b55d79a:a33c7fe6
>          Events : 0.265494
>
>     Number   Major   Minor   RaidDevice State
>        0       8       17        0      active sync   /dev/sdb1
>        1       8       35        1      active sync   /dev/sdc3
>        2       8       51        2      active sync   /dev/sdd3
>        3       8       67        3      active sync   /dev/sde3
> ===============================================================
>
> The docs and files on the USR web site imply that the native filesystem was originally XFS, but when I try to mount it as such, I can't:
>
> mount -vvv -t xfs /dev/md1 /mnt
> mount: fstab path: "/etc/fstab"
> mount: lock path: "/etc/mtab~"
> mount: temp path: "/etc/mtab.tmp"
> mount: no LABEL=, no UUID=, going to mount /dev/md1 by path
> mount: spec: "/dev/md1"
> mount: node: "/mnt"
> mount: types: "xfs"
> mount: opts: "(null)"
> mount: mount(2) syscall: source: "/dev/md1", target: "/mnt", filesystemtype: "xfs", mountflags: -1058209792, data: (null)
> mount: wrong fs type, bad option, bad superblock on /dev/md1,
>        missing codepage or helper program, or other error
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so
>
> and when I check dmesg:
> [ 245.008000] SGI XFS with ACLs, security attributes, realtime, large block numbers, no debug enabled
> [ 245.020000] SGI XFS Quota Management subsystem
> [ 245.020000] XFS: SB read failed
> [ 327.696000] md: md0 stopped.
> [ 327.696000] md: unbind<sdc1>
> [ 327.696000] md: export_rdev(sdc1)
> [ 327.696000] md: unbind<sde1>
> [ 327.696000] md: export_rdev(sde1)
> [ 327.696000] md: unbind<sdd1>
> [ 327.696000] md: export_rdev(sdd1)
> [ 439.660000] XFS: bad magic number
> [ 439.660000] XFS: SB validate failed
>
> repeated attempts repeat the last 2 lines above. This implies that the superblock is bad, and xfs_repair also reports that:
>
> xfs_repair /dev/md1
>         - creating 2 worker thread(s)
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
>
> attempting to find secondary superblock...
> ...
> ...found candidate secondary superblock...
> unable to verify superblock, continuing...
> ...
> ...found candidate secondary superblock...
> unable to verify superblock, continuing...
> ...
>
> So my question is what should I do now? Were those 1st 2 partitions (that I didn't create on the replacement disk) important? Should I try to remove the replaced disk, create 3 partitions, and try again, or am I just well and truly hosed?
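
To expand on the vgscan / vgchange suggestion at the top: if the 8700 really does put LVM between the md array and XFS, something along these lines should show it. This is only a sketch based on that guess - I don't have an 8700 to check against, and "vg0"/"lv0" below are placeholder names, not names I know the appliance to use:

  pvs                              # does /dev/md1 show up as an LVM physical volume?
  vgscan                           # scan for volume groups on top of it
  vgchange -ay                     # activate any volume group that turns up
  lvs                              # list the logical volumes inside the VG
  mount -o ro /dev/vg0/lv0 /mnt    # mount the LV read-only, not /dev/md1 itself

If pvs and vgscan find nothing, then there is no LVM layer and the filesystem really should start at the beginning of /dev/md1.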
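
Either way, before doing anything destructive (re-partitioning the new disk, or letting xfs_repair write to the array), it may be worth looking at what actually sits at the start of /dev/md1. These are read-only checks; again just a sketch, nothing 8700-specific:

  blkid /dev/md1                   # report any filesystem or LVM signature blkid recognises
  dd if=/dev/md1 bs=4096 count=1 2>/dev/null | hexdump -C | head -20
                                   # an XFS superblock begins with the magic "XFSB";
                                   # an LVM2 physical volume shows "LABELONE" in the second 512-byte sector
  xfs_repair -n /dev/md1           # -n = no-modify mode: report problems without touching the device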