* fsck problems. Can't restore raid
@ 2009-12-26  1:33 Rick Bragg
  2009-12-26  2:47 ` Rick Bragg
  0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26  1:33 UTC (permalink / raw)
  To: Linux RAID

Hi,

I have a RAID 10 array, and for some reason the system went down and I
can't get it back.

During reboot, I get the following error:

  The superblock could not be read or does not describe a correct ext2
  filesystem. If the device is valid and it really contains an ext2
  filesystem (and not swap or ufs or something else), then the superblock
  is corrupt, and you might try running e2fsck with an alternate
  superblock:
      e2fsck -b 8193 <device>

I have tried everything I can think of, and I can't seem to run fsck or
repair the file system.

What can I do?

Thanks
Rick

^ permalink raw reply	[flat|nested] 12+ messages in thread
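The error quoted above already points at the standard first step when only the ext2/ext3 superblock is damaged: retry e2fsck against one of the backup superblocks. A minimal sketch, assuming the filesystem sits on /dev/md0 and was created with default parameters (mke2fs with -n is a dry run that only prints where the backups would be and writes nothing):

  # mke2fs -n /dev/md0             # dry run: lists backup superblock locations, changes nothing
  # e2fsck -b 32768 /dev/md0       # retry the check against one of the listed backups

As the rest of the thread shows, none of this helps while the md array underneath /dev/md0 is not assembled; the array has to come back first.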
* Re: fsck problems. Can't restore raid
  2009-12-26  1:33 fsck problems. Can't restore raid Rick Bragg
@ 2009-12-26  2:47 ` Rick Bragg
  2009-12-26  3:12   ` Rick Bragg
  0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26  2:47 UTC (permalink / raw)
  To: Linux RAID

On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> Hi,
>
> I have a raid 10 array and for some reason the system went down and I
> can't get it back.
>
> during re-boot, I get the following error:
>
> The superblock could not be read or does not describe a correct ext2
> filesystem. If the device is valid and it really contains an ext2
> filesystem (and not swap or ufs or something else), then the superblock
> is corrupt, and you might try running e2fsck with an alternate
> superblock:
>     e2fsck -b 8193 <device>
>
> I have tried everything I can think of and I can't seem to do an fsck or
> repair the file system.
>
> what can I do?
>
> Thanks
> Rick
>

More info:

My array is made up of /dev/sda, sdb, sdc, and sdd.  However they are
not mounted right now.  My OS is booted off of /dev/sde.  I am running
Ubuntu 9.04.

  mdadm -Q --detail /dev/md0
  mdadm: md device /dev/md0 does not appear to be active.

Where do I take it from here?  I'm not up on this as much as I should be
at all.  In fact I am quite a newbie at this...  Any help would be greatly
appreciated.

Thanks
Rick

^ permalink raw reply	[flat|nested] 12+ messages in thread
* Re: fsck problems. Can't restore raid
  2009-12-26  2:47 ` Rick Bragg
@ 2009-12-26  3:12   ` Rick Bragg
  2009-12-26 18:47     ` Leslie Rhorer
  0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26  3:12 UTC (permalink / raw)
  To: Linux RAID

On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > Hi,
> >
> > I have a raid 10 array and for some reason the system went down and I
> > can't get it back.
> >
> > during re-boot, I get the following error:
> >
> > The superblock could not be read or does not describe a correct ext2
> > filesystem. If the device is valid and it really contains an ext2
> > filesystem (and not swap or ufs or something else), then the superblock
> > is corrupt, and you might try running e2fsck with an alternate
> > superblock:
> >     e2fsck -b 8193 <device>
> >
> > I have tried everything I can think of and I can't seem to do an fsck or
> > repair the file system.
> >
> > what can I do?
> >
> > Thanks
> > Rick
> >
>
> More info:
>
> My array is made up of /dev/sda, sdb, sdc, and sdd.  However they are
> not mounted right now.  My OS is booted off of /dev/sde.  I am running
> ubuntu 9.04
>
> mdadm -Q --detail /dev/md0
> mdadm: md device /dev/md0 does not appear to be active.
>
> Where do I take if from here? I'm not up on this as much as I should be
> at all. In fact I am quite a newbe to this... Any help would be greatly
> appreciated.
>
> Thanks
> Rick
>

Here is even more info:

# mdadm --assemble --scan
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.

# mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has no superblock - assembly aborted

Is my array toast?
What can I do?

Thanks
Rick

^ permalink raw reply	[flat|nested] 12+ messages in thread
* RE: fsck problems. Can't restore raid 2009-12-26 3:12 ` Rick Bragg @ 2009-12-26 18:47 ` Leslie Rhorer 2009-12-26 19:44 ` Rick Bragg 0 siblings, 1 reply; 12+ messages in thread From: Leslie Rhorer @ 2009-12-26 18:47 UTC (permalink / raw) To: 'Rick Bragg', 'Linux RAID' I take it from your post the drives are not partitioned, and the RAID array consists of raw disk members? First, check the superblocks of the md devices: `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks are corrupt, then that's your problem. If not, then it should be possible to get the array mounted one way or the other. Once you get the array assembled again, then you can repair it, if need be. Once that is done, you can repair the file system if it is corrupted. Once everything is clean, you can mount the file system, and if necessary attempt to recover any lost files. > -----Original Message----- > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid- > owner@vger.kernel.org] On Behalf Of Rick Bragg > Sent: Friday, December 25, 2009 9:13 PM > To: Linux RAID > Subject: Re: fsck problems. Can't restore raid > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote: > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote: > > > Hi, > > > > > > I have a raid 10 array and for some reason the system went down and I > > > can't get it back. > > > > > > during re-boot, I get the following error: > > > > > > The superblock could not be read or does not describe a correct ext2 > > > filesystem. If the device is valid and it really contains an ext2 > > > filesystem (and not swap or ufs or something else), then the > superblock > > > is corrupt, and you might try running e2fsck with an alternate > > > superblock: > > > e2fsck -b 8193 <device> > > > > > > I have tried everything I can think of and I can't seem to do an fsck > or > > > repair the file system. > > > > > > what can I do? > > > > > > Thanks > > > Rick > > > > > > > > > More info: > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they are > > not mounted right now. My OS is booted off of /dev/sde. I am running > > ubuntu 9.04 > > > > mdadm -Q --detail /dev/md0 > > mdadm: md device /dev/md0 does not appear to be active. > > > > Where do I take if from here? I'm not up on this as much as I should be > > at all. In fact I am quite a newbe to this... Any help would be greatly > > appreciated. > > > > Thanks > > Rick > > > > > Here is even more info: > > # mdadm --assemble --scan > mdadm: /dev/md0 assembled from 2 drives - not enough to start the array. > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd > mdadm: cannot open device /dev/sdb: Device or resource busy > mdadm: /dev/sdb has no superblock - assembly aborted > > Is my array toast? > What can I do? > > Thanks > Rick > > ^ permalink raw reply [flat|nested] 12+ messages in thread
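The sequence Leslie outlines above (check each member's md superblock, reassemble, repair the array, then check and mount the filesystem) maps onto roughly the following commands. This is only a sketch under the same assumption he makes of whole-disk members /dev/sda through /dev/sdd and an ext2/ext3 filesystem directly on /dev/md0; substitute the real member names and mount point:

  # for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do mdadm --examine "$d"; done
  # mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd
  # mdadm --detail /dev/md0               # confirm the array is up and which members are missing
  # e2fsck -n /dev/md0                    # read-only filesystem check; no repairs yet
  # mount -o ro /dev/md0 /mnt/recovery    # mount read-only and copy data off before repairing

Running e2fsck without -n (or with -b for a backup superblock) only makes sense once the array itself is assembled and healthy.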
* RE: fsck problems. Can't restore raid 2009-12-26 18:47 ` Leslie Rhorer @ 2009-12-26 19:44 ` Rick Bragg 2009-12-26 21:14 ` Leslie Rhorer 0 siblings, 1 reply; 12+ messages in thread From: Rick Bragg @ 2009-12-26 19:44 UTC (permalink / raw) To: Leslie Rhorer; +Cc: 'Linux RAID' Hi Thanks, it was an array that was up and running for a long time, and all of a sudden, this happened. so there were formatted and up and running fine. If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the following error on all disks: mdadm: No md superblock detected on /dev/sda. thats on all disks... (sda, sdb, sdc, and sdd) When I run fdisk on /dev/sda I get the following error: Unable to read /dev/sda However, running fdisk on all other disks shows that they are up and formatted with "raid" file type. Not sure what I can do next... Thanks Rick On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote: > I take it from your post the drives are not partitioned, and the > RAID array consists of raw disk members? First, check the superblocks of > the md devices: > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks > are corrupt, then that's your problem. If not, then it should be possible > to get the array mounted one way or the other. Once you get the array > assembled again, then you can repair it, if need be. Once that is done, you > can repair the file system if it is corrupted. Once everything is clean, > you can mount the file system, and if necessary attempt to recover any lost > files. > > > -----Original Message----- > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid- > > owner@vger.kernel.org] On Behalf Of Rick Bragg > > Sent: Friday, December 25, 2009 9:13 PM > > To: Linux RAID > > Subject: Re: fsck problems. Can't restore raid > > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote: > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote: > > > > Hi, > > > > > > > > I have a raid 10 array and for some reason the system went down and I > > > > can't get it back. > > > > > > > > during re-boot, I get the following error: > > > > > > > > The superblock could not be read or does not describe a correct ext2 > > > > filesystem. If the device is valid and it really contains an ext2 > > > > filesystem (and not swap or ufs or something else), then the > > superblock > > > > is corrupt, and you might try running e2fsck with an alternate > > > > superblock: > > > > e2fsck -b 8193 <device> > > > > > > > > I have tried everything I can think of and I can't seem to do an fsck > > or > > > > repair the file system. > > > > > > > > what can I do? > > > > > > > > Thanks > > > > Rick > > > > > > > > > > > > > More info: > > > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they are > > > not mounted right now. My OS is booted off of /dev/sde. I am running > > > ubuntu 9.04 > > > > > > mdadm -Q --detail /dev/md0 > > > mdadm: md device /dev/md0 does not appear to be active. > > > > > > Where do I take if from here? I'm not up on this as much as I should be > > > at all. In fact I am quite a newbe to this... Any help would be greatly > > > appreciated. > > > > > > Thanks > > > Rick > > > > > > > > > Here is even more info: > > > > # mdadm --assemble --scan > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the array. > > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd > > mdadm: cannot open device /dev/sdb: Device or resource busy > > mdadm: /dev/sdb has no superblock - assembly aborted > > > > Is my array toast? > > What can I do? 
> > > > Thanks > > Rick > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: fsck problems. Can't restore raid 2009-12-26 19:44 ` Rick Bragg @ 2009-12-26 21:14 ` Leslie Rhorer 2009-12-26 21:59 ` Green Mountain Network Info 2009-12-27 1:01 ` Rick Bragg 0 siblings, 2 replies; 12+ messages in thread From: Leslie Rhorer @ 2009-12-26 21:14 UTC (permalink / raw) To: 'Rick Bragg'; +Cc: 'Linux RAID' > Thanks, it was an array that was up and running for a long time, and all > of a sudden, this happened. so there were formatted and up and running > fine. Well, you didn't quite answer my question. Are the drives partitioned, or not? > If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the > following error on all disks: > mdadm: No md superblock detected on /dev/sda. > thats on all disks... (sda, sdb, sdc, and sdd) Well, we know the array is at least partially assembling, so it is finding at least some of the superblocks. It sounds to me like perhaps the drives are partitioned. > When I run fdisk on /dev/sda I get the following error: > Unable to read /dev/sda That sounds like a dead drive. I suggest running SMART tests on it. You might try changing the cable or moving it to another controller port. > However, running fdisk on all other disks shows that they are up and > formatted with "raid" file type. Formatted, or partitioned? You might post the output from fdisk when you type "p" at the FDISK prompt. At a guess, I would say perhaps your drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4, ...), /dev/sdb1, etc. Try typing `ls /dev/sd*` and see what pops up. There should be references to sda, sdb, etc. If there are also references to sdb1, sdb2, etc, then your drives have valid partitions, and it is those (or some of them, at least) which are targets for md. Once you have determined which drives have which partitions, then issue the commands ` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is "1", "2", "3", etc., including only those values returned by the ls command. For example, on one of my systems: RAID-Server:/etc/ssmtp# ls /dev/sd* /dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj /dev/sdl /dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk RAID-Server:/etc/ssmtp# ls /dev/hd* /dev/hda /dev/hda1 /dev/hda2 /dev/hda3 This result tells us I have one PATA drive (hda) with three partitions on it. It also tells us I have one (probably) SATA or SCSI drive (sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no partitions on them. If I issue the examine command on /dev/sda, I get an error: RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda mdadm: No md superblock detected on /dev/sda. That's because in this case the DRIVE does not have an md superblock. 
It is the PARTITIONS which have superblocks: RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1 /dev/sda1: Magic : a92b4efc Version : 1.0 Feature Map : 0x1 Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d Name : 'RAID-Server':1 Creation Time : Wed Dec 23 23:46:28 2009 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 803160 (392.23 MiB 411.22 MB) Array Size : 803160 (392.23 MiB 411.22 MB) Super Offset : 803168 sectors State : clean Device UUID : 28212297:1d982d5d:ce41b6fe:03720159 Internal Bitmap : 2 sectors from superblock Update Time : Sat Dec 26 13:00:32 2009 Checksum : af8f04b1 - correct Events : 204 Array Slot : 1 (failed, 1, 0) Array State : uU 1 failed The other 11 drives, however, are un-partitioned, and the md superblock rests on the drive device itself: RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb /dev/sdb: Magic : a92b4efc Version : 1.2 Feature Map : 0x1 Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca Name : 'RAID-Server':0 Creation Time : Sat Apr 25 01:17:12 2009 Raid Level : raid6 Raid Devices : 11 Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB) Array Size : 17581722624 (8383.62 GiB 9001.84 GB) Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB) Data Offset : 272 sectors Super Offset : 8 sectors State : clean Device UUID : d40c9255:cef0739f:966d448d:e549ada8 Internal Bitmap : 2 sectors from superblock Update Time : Sat Dec 26 15:09:44 2009 Checksum : e290ec2f - correct Events : 1193460 Chunk Size : 256K Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7) Array State : Uuuuuuuuuuu > > Not sure what I can do next... > > Thanks > Rick > > > > > > > On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote: > > I take it from your post the drives are not partitioned, and the > > RAID array consists of raw disk members? First, check the superblocks > of > > the md devices: > > > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks > > are corrupt, then that's your problem. If not, then it should be > possible > > to get the array mounted one way or the other. Once you get the array > > assembled again, then you can repair it, if need be. Once that is done, > you > > can repair the file system if it is corrupted. Once everything is > clean, > > you can mount the file system, and if necessary attempt to recover any > lost > > files. > > > > > -----Original Message----- > > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid- > > > owner@vger.kernel.org] On Behalf Of Rick Bragg > > > Sent: Friday, December 25, 2009 9:13 PM > > > To: Linux RAID > > > Subject: Re: fsck problems. Can't restore raid > > > > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote: > > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote: > > > > > Hi, > > > > > > > > > > I have a raid 10 array and for some reason the system went down > and I > > > > > can't get it back. > > > > > > > > > > during re-boot, I get the following error: > > > > > > > > > > The superblock could not be read or does not describe a correct > ext2 > > > > > filesystem. If the device is valid and it really contains an ext2 > > > > > filesystem (and not swap or ufs or something else), then the > > > superblock > > > > > is corrupt, and you might try running e2fsck with an alternate > > > > > superblock: > > > > > e2fsck -b 8193 <device> > > > > > > > > > > I have tried everything I can think of and I can't seem to do an > fsck > > > or > > > > > repair the file system. > > > > > > > > > > what can I do? 
> > > > > > > > > > Thanks > > > > > Rick > > > > > > > > > > > > > > > > > More info: > > > > > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they > are > > > > not mounted right now. My OS is booted off of /dev/sde. I am > running > > > > ubuntu 9.04 > > > > > > > > mdadm -Q --detail /dev/md0 > > > > mdadm: md device /dev/md0 does not appear to be active. > > > > > > > > Where do I take if from here? I'm not up on this as much as I > should be > > > > at all. In fact I am quite a newbe to this... Any help would be > greatly > > > > appreciated. > > > > > > > > Thanks > > > > Rick > > > > > > > > > > > > > Here is even more info: > > > > > > # mdadm --assemble --scan > > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the > array. > > > > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd > > > mdadm: cannot open device /dev/sdb: Device or resource busy > > > mdadm: /dev/sdb has no superblock - assembly aborted > > > > > > Is my array toast? > > > What can I do? > > > > > > Thanks > > > Rick > > > > > > > > > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > ^ permalink raw reply [flat|nested] 12+ messages in thread
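A compact way to apply the check Leslie describes above — look at both the raw disks and any partitions on them, and see which ones actually carry an md superblock — is a small loop like the one below. It is a sketch only; it reads metadata and writes nothing:

  # ls /dev/sd*
  # for dev in /dev/sd[a-e] /dev/sd[a-e][0-9]; do
  >     [ -b "$dev" ] || continue
  >     echo "== $dev =="
  >     mdadm --examine "$dev" 2>&1 | head -n 6
  > done

Whichever devices report a valid superblock (Magic : a92b4efc) are the real array members to name in any assemble command.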
* RE: fsck problems. Can't restore raid 2009-12-26 21:14 ` Leslie Rhorer @ 2009-12-26 21:59 ` Green Mountain Network Info 2009-12-27 1:01 ` Rick Bragg 1 sibling, 0 replies; 12+ messages in thread From: Green Mountain Network Info @ 2009-12-26 21:59 UTC (permalink / raw) To: Leslie Rhorer; +Cc: 'Linux RAID' Hi, Thanks again Leslie, they are partitioned as linux raid, and the format was Ext3. They are all sdx1 (where x is a, b, c, d) The raid array is not a bootable system at all, but a mounted drive. I am running the system off of a totally different drive. (/dev/sde) Also, I am running a smart test now on all the drives # smartctl -t /dev/sdx... I didn't know this existed, so I will await the output and hope to make sense of it. I will try to change up the cables and ports once the smartctl results are in. Following is some more info: fdisk info: # fdisk /dev/sda Unable to read /dev/sda # fdisk /dev/sdb Disk /dev/sdb: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00000000 Device Boot Start End Blocks Id System /dev/sdb1 1 60801 488384001 fd Linux raid autodetect # fdisk /dev/sdc Disk /dev/sdc: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00045567 Device Boot Start End Blocks Id System /dev/sdc1 1 60801 488384001 fd Linux raid autodetect # fdisk /dev/sdd Disk /dev/sdd: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00000000 Device Boot Start End Blocks Id System /dev/sdd1 1 60801 488384001 fd Linux raid autodetect # mdadm --examine /dev/sda1 mdadm: No md superblock detected on /dev/sda1. (no surprise there...) # mdadm --examine /dev/sdb1 mdadm: No md superblock detected on /dev/sdb1. (Does this mean that sdb1 is bad? or is that OK?) # mdadm --examine /dev/sdc1 /dev/sdc1: Magic : a92b4efc Version : 00.90.00 UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke) Creation Time : Wed Jan 28 14:58:49 2009 Raid Level : raid10 Used Dev Size : 488383936 (465.76 GiB 500.11 GB) Array Size : 976767872 (931.52 GiB 1000.21 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Update Time : Thu Dec 24 19:25:58 2009 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 1 Spare Devices : 0 Checksum : eddec3ad - correct Events : 1131438 Layout : near=2, far=1 Chunk Size : 64K Number Major Minor RaidDevice State this 2 8 33 2 active sync /dev/sdc1 0 0 0 0 0 removed 1 1 0 0 1 faulty removed 2 2 8 33 2 active sync /dev/sdc1 3 3 8 49 3 active sync /dev/sdd1 # mdadm --examine /dev/sdd1 /dev/sdd1: Magic : a92b4efc Version : 00.90.00 UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke) Creation Time : Wed Jan 28 14:58:49 2009 Raid Level : raid10 Used Dev Size : 488383936 (465.76 GiB 500.11 GB) Array Size : 976767872 (931.52 GiB 1000.21 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Update Time : Thu Dec 24 19:25:58 2009 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 1 Spare Devices : 0 Checksum : eddec3bf - correct Events : 1131438 Layout : near=2, far=1 Chunk Size : 64K Number Major Minor RaidDevice State this 3 8 49 3 active sync /dev/sdd1 0 0 0 0 0 removed 1 1 0 0 1 faulty removed 2 2 8 33 2 active sync /dev/sdc1 3 3 8 49 3 active sync /dev/sdd1 Also, for the controlers, I am using a couple of Promise cards: # lspci ... 
01:02.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02) ... 01:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02) If any of this stands out as something really wrong, please let me know. Thanks again so much for your help! rick On Sat, 2009-12-26 at 15:14 -0600, Leslie Rhorer wrote: > > Thanks, it was an array that was up and running for a long time, and all > > of a sudden, this happened. so there were formatted and up and running > > fine. > > Well, you didn't quite answer my question. Are the drives > partitioned, or not? > > > If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the > > following error on all disks: > > mdadm: No md superblock detected on /dev/sda. > > thats on all disks... (sda, sdb, sdc, and sdd) > > Well, we know the array is at least partially assembling, so it is > finding at least some of the superblocks. It sounds to me like perhaps the > drives are partitioned. > > > When I run fdisk on /dev/sda I get the following error: > > Unable to read /dev/sda > > That sounds like a dead drive. I suggest running SMART tests on it. > You might try changing the cable or moving it to another controller port. > > > However, running fdisk on all other disks shows that they are up and > > formatted with "raid" file type. > > Formatted, or partitioned? You might post the output from fdisk > when you type "p" at the FDISK prompt. At a guess, I would say perhaps your > drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4, > ...), /dev/sdb1, etc. Try typing > > `ls /dev/sd*` > > and see what pops up. There should be references to sda, sdb, etc. If > there are also references to sdb1, sdb2, etc, then your drives have valid > partitions, and it is those (or some of them, at least) which are targets > for md. Once you have determined which drives have which partitions, then > issue the commands > > ` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is > "1", "2", "3", etc., including only those values returned by the ls command. > > For example, on one of my systems: > > RAID-Server:/etc/ssmtp# ls /dev/sd* > /dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj > /dev/sdl > /dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk > RAID-Server:/etc/ssmtp# ls /dev/hd* > /dev/hda /dev/hda1 /dev/hda2 /dev/hda3 > > This result tells us I have one PATA drive (hda) with three > partitions on it. It also tells us I have one (probably) SATA or SCSI drive > (sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no > partitions on them. > > If I issue the examine command on /dev/sda, I get an error: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda > mdadm: No md superblock detected on /dev/sda. > > That's because in this case the DRIVE does not have an md > superblock. 
It is the PARTITIONS which have superblocks: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1 > /dev/sda1: > Magic : a92b4efc > Version : 1.0 > Feature Map : 0x1 > Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d > Name : 'RAID-Server':1 > Creation Time : Wed Dec 23 23:46:28 2009 > Raid Level : raid1 > Raid Devices : 2 > > Avail Dev Size : 803160 (392.23 MiB 411.22 MB) > Array Size : 803160 (392.23 MiB 411.22 MB) > Super Offset : 803168 sectors > State : clean > Device UUID : 28212297:1d982d5d:ce41b6fe:03720159 > > Internal Bitmap : 2 sectors from superblock > Update Time : Sat Dec 26 13:00:32 2009 > Checksum : af8f04b1 - correct > Events : 204 > > > Array Slot : 1 (failed, 1, 0) > Array State : uU 1 failed > > The other 11 drives, however, are un-partitioned, and the md > superblock rests on the drive device itself: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb > /dev/sdb: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x1 > Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca > Name : 'RAID-Server':0 > Creation Time : Sat Apr 25 01:17:12 2009 > Raid Level : raid6 > Raid Devices : 11 > > Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB) > Array Size : 17581722624 (8383.62 GiB 9001.84 GB) > Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB) > Data Offset : 272 sectors > Super Offset : 8 sectors > State : clean > Device UUID : d40c9255:cef0739f:966d448d:e549ada8 > > Internal Bitmap : 2 sectors from superblock > Update Time : Sat Dec 26 15:09:44 2009 > Checksum : e290ec2f - correct > Events : 1193460 > > Chunk Size : 256K > > Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7) > Array State : Uuuuuuuuuuu > > > > > Not sure what I can do next... > > > > Thanks > > Rick > > > > > > > > > > > > > > On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote: > > > I take it from your post the drives are not partitioned, and the > > > RAID array consists of raw disk members? First, check the superblocks > > of > > > the md devices: > > > > > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks > > > are corrupt, then that's your problem. If not, then it should be > > possible > > > to get the array mounted one way or the other. Once you get the array > > > assembled again, then you can repair it, if need be. Once that is done, > > you > > > can repair the file system if it is corrupted. Once everything is > > clean, > > > you can mount the file system, and if necessary attempt to recover any > > lost > > > files. > > > > > > > -----Original Message----- > > > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid- > > > > owner@vger.kernel.org] On Behalf Of Rick Bragg > > > > Sent: Friday, December 25, 2009 9:13 PM > > > > To: Linux RAID > > > > Subject: Re: fsck problems. Can't restore raid > > > > > > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote: > > > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote: > > > > > > Hi, > > > > > > > > > > > > I have a raid 10 array and for some reason the system went down > > and I > > > > > > can't get it back. > > > > > > > > > > > > during re-boot, I get the following error: > > > > > > > > > > > > The superblock could not be read or does not describe a correct > > ext2 > > > > > > filesystem. 
If the device is valid and it really contains an ext2 > > > > > > filesystem (and not swap or ufs or something else), then the > > > > superblock > > > > > > is corrupt, and you might try running e2fsck with an alternate > > > > > > superblock: > > > > > > e2fsck -b 8193 <device> > > > > > > > > > > > > I have tried everything I can think of and I can't seem to do an > > fsck > > > > or > > > > > > repair the file system. > > > > > > > > > > > > what can I do? > > > > > > > > > > > > Thanks > > > > > > Rick > > > > > > > > > > > > > > > > > > > > > More info: > > > > > > > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they > > are > > > > > not mounted right now. My OS is booted off of /dev/sde. I am > > running > > > > > ubuntu 9.04 > > > > > > > > > > mdadm -Q --detail /dev/md0 > > > > > mdadm: md device /dev/md0 does not appear to be active. > > > > > > > > > > Where do I take if from here? I'm not up on this as much as I > > should be > > > > > at all. In fact I am quite a newbe to this... Any help would be > > greatly > > > > > appreciated. > > > > > > > > > > Thanks > > > > > Rick > > > > > > > > > > > > > > > > > Here is even more info: > > > > > > > > # mdadm --assemble --scan > > > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the > > array. > > > > > > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd > > > > mdadm: cannot open device /dev/sdb: Device or resource busy > > > > mdadm: /dev/sdb has no superblock - assembly aborted > > > > > > > > Is my array toast? > > > > What can I do? > > > > > > > > Thanks > > > > Rick > > > > > > > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Green Mountain Network PO Box 1462 Burlington, VT 05402 802.264.4851 info@gmnet.net ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: fsck problems. Can't restore raid 2009-12-26 21:14 ` Leslie Rhorer 2009-12-26 21:59 ` Green Mountain Network Info @ 2009-12-27 1:01 ` Rick Bragg 2009-12-27 6:13 ` Leslie Rhorer 1 sibling, 1 reply; 12+ messages in thread From: Rick Bragg @ 2009-12-27 1:01 UTC (permalink / raw) To: 'Linux RAID' Hi, Thanks again Leslie, they are partitioned as linux raid, and the format was Ext3. They are all sdx1 (where x is a, b, c, d) The raid array is not a bootable system at all, but a mounted drive. I am running the system off of a totally different drive. (/dev/sde) Also, I am running a smart test now on all the drives # smartctl -t /dev/sdx... I didn't know this existed, so I will await the output and hope to make sense of it. I will try to change up the cables and ports once the smartctl results are in. Following is some more info: fdisk info: # fdisk /dev/sda Unable to read /dev/sda # fdisk /dev/sdb Disk /dev/sdb: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00000000 Device Boot Start End Blocks Id System /dev/sdb1 1 60801 488384001 fd Linux raid autodetect # fdisk /dev/sdc Disk /dev/sdc: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00045567 Device Boot Start End Blocks Id System /dev/sdc1 1 60801 488384001 fd Linux raid autodetect # fdisk /dev/sdd Disk /dev/sdd: 500.1 GB, 500107862016 bytes 255 heads, 63 sectors/track, 60801 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Disk identifier: 0x00000000 Device Boot Start End Blocks Id System /dev/sdd1 1 60801 488384001 fd Linux raid autodetect # mdadm --examine /dev/sda1 mdadm: No md superblock detected on /dev/sda1. (no surprise there...) # mdadm --examine /dev/sdb1 mdadm: No md superblock detected on /dev/sdb1. (Does this mean that sdb1 is bad? or is that OK?) # mdadm --examine /dev/sdc1 /dev/sdc1: Magic : a92b4efc Version : 00.90.00 UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke) Creation Time : Wed Jan 28 14:58:49 2009 Raid Level : raid10 Used Dev Size : 488383936 (465.76 GiB 500.11 GB) Array Size : 976767872 (931.52 GiB 1000.21 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Update Time : Thu Dec 24 19:25:58 2009 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 1 Spare Devices : 0 Checksum : eddec3ad - correct Events : 1131438 Layout : near=2, far=1 Chunk Size : 64K Number Major Minor RaidDevice State this 2 8 33 2 active sync /dev/sdc1 0 0 0 0 0 removed 1 1 0 0 1 faulty removed 2 2 8 33 2 active sync /dev/sdc1 3 3 8 49 3 active sync /dev/sdd1 # mdadm --examine /dev/sdd1 /dev/sdd1: Magic : a92b4efc Version : 00.90.00 UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke) Creation Time : Wed Jan 28 14:58:49 2009 Raid Level : raid10 Used Dev Size : 488383936 (465.76 GiB 500.11 GB) Array Size : 976767872 (931.52 GiB 1000.21 GB) Raid Devices : 4 Total Devices : 4 Preferred Minor : 0 Update Time : Thu Dec 24 19:25:58 2009 State : clean Active Devices : 2 Working Devices : 2 Failed Devices : 1 Spare Devices : 0 Checksum : eddec3bf - correct Events : 1131438 Layout : near=2, far=1 Chunk Size : 64K Number Major Minor RaidDevice State this 3 8 49 3 active sync /dev/sdd1 0 0 0 0 0 removed 1 1 0 0 1 faulty removed 2 2 8 33 2 active sync /dev/sdc1 3 3 8 49 3 active sync /dev/sdd1 Also, for the controlers, I am using a couple of Promise cards: # lspci ... 
01:02.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02) ... 01:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA 300 TX4) (rev 02) If any of this stands out as something really wrong, please let me know. Thanks again so much for your help! rick On Sat, 2009-12-26 at 15:14 -0600, Leslie Rhorer wrote: > > Thanks, it was an array that was up and running for a long time, and all > > of a sudden, this happened. so there were formatted and up and running > > fine. > > Well, you didn't quite answer my question. Are the drives > partitioned, or not? > > > If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the > > following error on all disks: > > mdadm: No md superblock detected on /dev/sda. > > thats on all disks... (sda, sdb, sdc, and sdd) > > Well, we know the array is at least partially assembling, so it is > finding at least some of the superblocks. It sounds to me like perhaps the > drives are partitioned. > > > When I run fdisk on /dev/sda I get the following error: > > Unable to read /dev/sda > > That sounds like a dead drive. I suggest running SMART tests on it. > You might try changing the cable or moving it to another controller port. > > > However, running fdisk on all other disks shows that they are up and > > formatted with "raid" file type. > > Formatted, or partitioned? You might post the output from fdisk > when you type "p" at the FDISK prompt. At a guess, I would say perhaps your > drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4, > ...), /dev/sdb1, etc. Try typing > > `ls /dev/sd*` > > and see what pops up. There should be references to sda, sdb, etc. If > there are also references to sdb1, sdb2, etc, then your drives have valid > partitions, and it is those (or some of them, at least) which are targets > for md. Once you have determined which drives have which partitions, then > issue the commands > > ` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is > "1", "2", "3", etc., including only those values returned by the ls command. > > For example, on one of my systems: > > RAID-Server:/etc/ssmtp# ls /dev/sd* > /dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj > /dev/sdl > /dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk > RAID-Server:/etc/ssmtp# ls /dev/hd* > /dev/hda /dev/hda1 /dev/hda2 /dev/hda3 > > This result tells us I have one PATA drive (hda) with three > partitions on it. It also tells us I have one (probably) SATA or SCSI drive > (sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no > partitions on them. > > If I issue the examine command on /dev/sda, I get an error: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda > mdadm: No md superblock detected on /dev/sda. > > That's because in this case the DRIVE does not have an md > superblock. 
It is the PARTITIONS which have superblocks: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1 > /dev/sda1: > Magic : a92b4efc > Version : 1.0 > Feature Map : 0x1 > Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d > Name : 'RAID-Server':1 > Creation Time : Wed Dec 23 23:46:28 2009 > Raid Level : raid1 > Raid Devices : 2 > > Avail Dev Size : 803160 (392.23 MiB 411.22 MB) > Array Size : 803160 (392.23 MiB 411.22 MB) > Super Offset : 803168 sectors > State : clean > Device UUID : 28212297:1d982d5d:ce41b6fe:03720159 > > Internal Bitmap : 2 sectors from superblock > Update Time : Sat Dec 26 13:00:32 2009 > Checksum : af8f04b1 - correct > Events : 204 > > > Array Slot : 1 (failed, 1, 0) > Array State : uU 1 failed > > The other 11 drives, however, are un-partitioned, and the md > superblock rests on the drive device itself: > > RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb > /dev/sdb: > Magic : a92b4efc > Version : 1.2 > Feature Map : 0x1 > Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca > Name : 'RAID-Server':0 > Creation Time : Sat Apr 25 01:17:12 2009 > Raid Level : raid6 > Raid Devices : 11 > > Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB) > Array Size : 17581722624 (8383.62 GiB 9001.84 GB) > Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB) > Data Offset : 272 sectors > Super Offset : 8 sectors > State : clean > Device UUID : d40c9255:cef0739f:966d448d:e549ada8 > > Internal Bitmap : 2 sectors from superblock > Update Time : Sat Dec 26 15:09:44 2009 > Checksum : e290ec2f - correct > Events : 1193460 > > Chunk Size : 256K > > Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7) > Array State : Uuuuuuuuuuu > > > > > Not sure what I can do next... > > > > Thanks > > Rick > > > > > > > > > > > > > > On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote: > > > I take it from your post the drives are not partitioned, and the > > > RAID array consists of raw disk members? First, check the superblocks > > of > > > the md devices: > > > > > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks > > > are corrupt, then that's your problem. If not, then it should be > > possible > > > to get the array mounted one way or the other. Once you get the array > > > assembled again, then you can repair it, if need be. Once that is done, > > you > > > can repair the file system if it is corrupted. Once everything is > > clean, > > > you can mount the file system, and if necessary attempt to recover any > > lost > > > files. > > > > > > > -----Original Message----- > > > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid- > > > > owner@vger.kernel.org] On Behalf Of Rick Bragg > > > > Sent: Friday, December 25, 2009 9:13 PM > > > > To: Linux RAID > > > > Subject: Re: fsck problems. Can't restore raid > > > > > > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote: > > > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote: > > > > > > Hi, > > > > > > > > > > > > I have a raid 10 array and for some reason the system went down > > and I > > > > > > can't get it back. > > > > > > > > > > > > during re-boot, I get the following error: > > > > > > > > > > > > The superblock could not be read or does not describe a correct > > ext2 > > > > > > filesystem. 
If the device is valid and it really contains an ext2 > > > > > > filesystem (and not swap or ufs or something else), then the > > > > superblock > > > > > > is corrupt, and you might try running e2fsck with an alternate > > > > > > superblock: > > > > > > e2fsck -b 8193 <device> > > > > > > > > > > > > I have tried everything I can think of and I can't seem to do an > > fsck > > > > or > > > > > > repair the file system. > > > > > > > > > > > > what can I do? > > > > > > > > > > > > Thanks > > > > > > Rick > > > > > > > > > > > > > > > > > > > > > More info: > > > > > > > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they > > are > > > > > not mounted right now. My OS is booted off of /dev/sde. I am > > running > > > > > ubuntu 9.04 > > > > > > > > > > mdadm -Q --detail /dev/md0 > > > > > mdadm: md device /dev/md0 does not appear to be active. > > > > > > > > > > Where do I take if from here? I'm not up on this as much as I > > should be > > > > > at all. In fact I am quite a newbe to this... Any help would be > > greatly > > > > > appreciated. > > > > > > > > > > Thanks > > > > > Rick > > > > > > > > > > > > > > > > > Here is even more info: > > > > > > > > # mdadm --assemble --scan > > > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the > > array. > > > > > > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd > > > > mdadm: cannot open device /dev/sdb: Device or resource busy > > > > mdadm: /dev/sdb has no superblock - assembly aborted > > > > > > > > Is my array toast? > > > > What can I do? > > > > > > > > Thanks > > > > Rick > > > > > > > > > > > > > > > > > -- > > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > > > the body of a message to majordomo@vger.kernel.org > > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: fsck problems. Can't restore raid 2009-12-27 1:01 ` Rick Bragg @ 2009-12-27 6:13 ` Leslie Rhorer 2009-12-27 18:41 ` Rick Bragg 0 siblings, 1 reply; 12+ messages in thread From: Leslie Rhorer @ 2009-12-27 6:13 UTC (permalink / raw) To: 'Rick Bragg', 'Linux RAID' > # mdadm --examine /dev/sdb1 > mdadm: No md superblock detected on /dev/sdb1. > > (Does this mean that sdb1 is bad? or is that OK?) It doesn't necessarily mean the drive is bad, but the superblock is gone. Are you having mdadm monitor your array(s) and send informational messages to you upon RAID events? If not, then what may have happened is you lost the superblock on sdb1 and at some other time - before or after - lost the sda drive. Once both events had taken place, your array is toast. All may not be lost, however. First of all, take care when re-arranging not to lose track of which drive was which at the outset. In fact, other than the sda drive, you might be best served not to move anything. Take special care if the system re-assigns drive letters, as it can easily do. When you created your array, one of the steps you should have taken was to put the drive configuration into /etc/mdadm.conf. In particular, you may need to attempt to re-create the array by mimicking the original command used to create the array, basically zeroing out the superblocks and starting again from scratch. If you do it correctly, and are careful, it may be possible to recover the array. Note, however, the array parameters must match the original configuration exactly, including the role each drive plays in the array. If you get them out of line, you can really destroy all the data. What are the contents of /etc/mdadm.conf? ^ permalink raw reply [flat|nested] 12+ messages in thread
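For context on the "re-create" option mentioned above: rewriting superblocks over an existing array is a last resort, done with the same level, layout, chunk size, metadata version, and drive order as the original, plus --assume-clean so that no resync overwrites data. A hypothetical example matching the parameters shown by --examine earlier in this thread (raid10, near=2 layout, 64K chunks, 0.90 metadata, members sda1 through sdd1) would look like the command below — but the device order must match the RaidDevice numbers reported by --examine, and getting any parameter wrong can destroy the data:

  # mdadm --create /dev/md0 --assume-clean --level=raid10 --layout=n2 \
  >        --chunk=64 --metadata=0.90 --raid-devices=4 \
  >        /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

Nothing should be written to the filesystem afterwards until a read-only fsck and a test mount confirm the data is intact.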
* RE: fsck problems. Can't restore raid
  2009-12-27  6:13 ` Leslie Rhorer
@ 2009-12-27 18:41   ` Rick Bragg
  2009-12-27 22:47     ` Leslie Rhorer
  0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-27 18:41 UTC (permalink / raw)
  To: Leslie Rhorer; +Cc: 'Linux RAID'

Thanks again Leslie, I threaded my responses below.

On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote:
> > # mdadm --examine /dev/sdb1
> > mdadm: No md superblock detected on /dev/sdb1.
> >
> > (Does this mean that sdb1 is bad? or is that OK?)
>
> It doesn't necessarily mean the drive is bad, but the superblock is
> gone.  Are you having mdadm monitor your array(s) and send informational
> messages to you upon RAID events?  If not, then what may have happened is
> you lost the superblock on sdb1 and at some other time - before or after -
> lost the sda drive.  Once both events had taken place, your array is toast.

Right, I need to set up monitoring...

> All may not be lost, however.  First of all, take care when
> re-arranging not to lose track of which drive was which at the outset.  In
> fact, other than the sda drive, you might be best served not to move
> anything.  Take special care if the system re-assigns drive letters, as it
> can easily do.

So should I just "move" the A drive and try to fire it back up?

> When you created your array, one of the steps you should have taken
> was to put the drive configuration into /etc/mdadm.conf.  In particular, you
> may need to attempt to re-create the array by mimicking the original command
> used to create the array, basically zeroing out the superblocks and starting
> again from scratch.  If you do it correctly, and are careful, it may be
> possible to recover the array.  Note, however, the array parameters must
> match the original configuration exactly, including the role each drive
> plays in the array.  If you get them out of line, you can really destroy all
> the data.
>
> What are the contents of /etc/mdadm.conf?

mdadm.conf contains this:

ARRAY /dev/md0 level=raid10 num-devices=4 UUID=3d93e545:c8d5baec:24e6b15c:676eb40f

So, by re-creating, do you mean I should try to run the "mdadm --create"
command again the same way I did back when I created the array
originally?  Will that wipe out my data?

Also, here is my output from the SMART tests.  Does this shed any more
light on anything?  I'm not sure what the "Remaining" and "LifeTime"
fields mean...

# smartctl -l selftest /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Standard Inquiry (36 bytes) failed [No such device]
Retrying with a 64 byte Standard Inquiry
Standard Inquiry (64 bytes) failed [No such device]
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
# smartctl -l selftest /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure      90%        7963          543357

# smartctl -l selftest /dev/sdc
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error      00%        7964          -

# smartctl -l selftest /dev/sdd
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error      00%        7965          -

This is strange, now I am getting info from mdadm --examine that is
different than before...

# mdadm --examine /dev/sda
mdadm: No md superblock detected on /dev/sda.

# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke)
  Creation Time : Wed Jan 28 14:58:49 2009
     Raid Level : raid10
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Thu Dec 24 19:25:40 2009
          State : active
 Active Devices : 3
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 0
       Checksum : edcd7fb8 - correct
         Events : 1131433

         Layout : near=2, far=1
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       0        0        1      faulty removed
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1

# mdadm --examine /dev/sdb
mdadm: No md superblock detected on /dev/sdb.

# mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke)
  Creation Time : Wed Jan 28 14:58:49 2009
     Raid Level : raid10
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Wed Dec 16 15:33:13 2009
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
       Checksum : edc67464 - correct
         Events : 687460

         Layout : near=2, far=1
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       17        1      active sync   /dev/sdb1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       17        1      active sync   /dev/sdb1
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1

# mdadm --examine /dev/sdc
mdadm: No md superblock detected on /dev/sdc.
# mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke)
  Creation Time : Wed Jan 28 14:58:49 2009
     Raid Level : raid10
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Thu Dec 24 19:25:58 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : eddec3ad - correct
         Events : 1131438

         Layout : near=2, far=1
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8       33        2      active sync   /dev/sdc1

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1

# mdadm --examine /dev/sdd
mdadm: No md superblock detected on /dev/sdd.

# mdadm --examine /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host smoke)
  Creation Time : Wed Jan 28 14:58:49 2009
     Raid Level : raid10
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Thu Dec 24 19:25:58 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : eddec3bf - correct
         Events : 1131438

         Layout : near=2, far=1
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       49        3      active sync   /dev/sdd1

   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       8       33        2      active sync   /dev/sdc1
   3     3       8       49        3      active sync   /dev/sdd1

Thanks again for all your help!
Rick

^ permalink raw reply	[flat|nested] 12+ messages in thread
* RE: fsck problems. Can't restore raid 2009-12-27 18:41 ` Rick Bragg @ 2009-12-27 22:47 ` Leslie Rhorer 2009-12-29 2:46 ` Michael Evans 0 siblings, 1 reply; 12+ messages in thread From: Leslie Rhorer @ 2009-12-27 22:47 UTC (permalink / raw) To: 'Rick Bragg'; +Cc: 'Linux RAID' > On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote: > > > # mdadm --examine /dev/sdb1 > > > mdadm: No md superblock detected on /dev/sdb1. > > > > > > (Does this mean that sdb1 is bad? or is that OK?) > > > > It doesn't necessarily mean the drive is bad, but the superblock is > > gone. Are you having mdadm monitor your array(s) and send informational > > messages to you upon RAID events? If not, then what may have happened > is > > you lost the superblock on sdb1 and at some other time - before or after > - > > lost the sda drive. Once both events had taken place, your array is > toast. > Right, I need to set up monitoring... Um, yeah. A RAID array won't prevent drives from going up in smoke, and if you don't know a drive has failed, you won't know you need to fix something - until a second drive fails. > > All may not be lost, however. First of all, take care when > > re-arranging not to lose track of which drive was which at the outset. > In > > fact, other than the sda drive, you might be best served not to move > > anything. Take special care if the system re-assigns drive letters, as > it > > can easily do. > So should I just "move" the A drive? and try to fire it back up? At this point, yeah. Don't lose track of from where and to where it has been moved, though. > > What are the contents of /etc/mdadm.conf? > > > > mdadm.conf contains this: > ARRAY /dev/md0 level=raid10 num-devices=4 > UUID=3d93e545:c8d5baec:24e6b15c:676eb40f Yeah, that doesn't help much. > So, by re-creating, do you mean I should try to run the "mdadm --create" > command again the same way I did back when I created the array > originally? Will that wipe out my data? Not in and of itself, no. If you get the drive order wrong (different than when it was first created) and resync or write to the array, then it will munge the data, but all creating the array does is create the superblocks. > # smartctl -l selftest /dev/sda > smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen > Home page is http://smartmontools.sourceforge.net/ > > Standard Inquiry (36 bytes) failed [No such device] > Retrying with a 64 byte Standard Inquiry > Standard Inquiry (64 bytes) failed [No such device] > A mandatory SMART command failed: exiting. To continue, add one or more '- > T permissive' options. Well, we kind of knew that. Either the drive is dead, or there is a hardware problem in the controller path. Hope for the latter, although a drive with a frozen platter can sometimes be resurrected, and if the drive electronics are bad but the servo assemblies are OK, replacing the electronics is not difficult. Otherwise, it's a goner. > # smartctl -l selftest /dev/sdb > smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen > Home page is http://smartmontools.sourceforge.net/ > > === START OF READ SMART DATA SECTION === > SMART Self-test log structure revision number 1 > Num Test_Description Status Remaining > LifeTime(hours) LBA_of_first_error > # 1 Extended offline Completed: read failure 90% 7963 > 543357 Oooh! That's bad. Really bad. Your earlier post showed the superblock is a 0.90 version. The 0.90 superblock is stored near the end of the partition. 
Your drive is suffering a heart attack when it gets near the end of the drive. If you can't get your sda drive working again, then I'm afraid you've lost some data, maybe all of it. Trying to rebuild a partition from scratch when part of it is corrupted is not for the feint of heart. If you are lucky, you might be able to dd part of the sdb drive onto a healthy one and manually restore the superblock. That, or since the sda drive does appear in /dev, you might have some luck copying some of it to a new drive. Beyond that, you are either going to need the advice of someone who knows much more about md and Linux than I do, or else the services of a professional drive recovery expert. They don't come cheap. > This is strange, now I am getting info from mdadm --examine that is > different than before... It looks like sda may be responding for the time being. I suggest you try to assemble the array, and if successful, copy whatever data you can to a backup device. Do not mount the array as read-write until you have recovered everything you can. If some data is orphaned, it might be in the lost+found directory. If that's successful, I suggest you find out why you had two failures and start over. I wouldn't use a 0.90 superblock, though, and you definitely want to have monitoring enabled. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: fsck problems. Can't restore raid 2009-12-27 22:47 ` Leslie Rhorer @ 2009-12-29 2:46 ` Michael Evans 0 siblings, 0 replies; 12+ messages in thread From: Michael Evans @ 2009-12-29 2:46 UTC (permalink / raw) To: Leslie Rhorer; +Cc: Rick Bragg, Linux RAID On Sun, Dec 27, 2009 at 2:47 PM, Leslie Rhorer <lrhorer@satx.rr.com> wrote: >> On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote: >> > > # mdadm --examine /dev/sdb1 >> > > mdadm: No md superblock detected on /dev/sdb1. >> > > >> > > (Does this mean that sdb1 is bad? or is that OK?) >> > >> > It doesn't necessarily mean the drive is bad, but the superblock is >> > gone. Are you having mdadm monitor your array(s) and send informational >> > messages to you upon RAID events? If not, then what may have happened >> is >> > you lost the superblock on sdb1 and at some other time - before or after >> - >> > lost the sda drive. Once both events had taken place, your array is >> toast. >> Right, I need to set up monitoring... > > Um, yeah. A RAID array won't prevent drives from going up in smoke, > and if you don't know a drive has failed, you won't know you need to fix > something - until a second drive fails. > >> > All may not be lost, however. First of all, take care when >> > re-arranging not to lose track of which drive was which at the outset. >> In >> > fact, other than the sda drive, you might be best served not to move >> > anything. Take special care if the system re-assigns drive letters, as >> it >> > can easily do. >> So should I just "move" the A drive? and try to fire it back up? > > At this point, yeah. Don't lose track of from where and to where it > has been moved, though. > >> > What are the contents of /etc/mdadm.conf? >> > >> >> mdadm.conf contains this: >> ARRAY /dev/md0 level=raid10 num-devices=4 >> UUID=3d93e545:c8d5baec:24e6b15c:676eb40f > > Yeah, that doesn't help much. > >> So, by re-creating, do you mean I should try to run the "mdadm --create" >> command again the same way I did back when I created the array >> originally? Will that wipe out my data? > > Not in and of itself, no. If you get the drive order wrong > (different than when it was first created) and resync or write to the array, > then it will munge the data, but all creating the array does is create the > superblocks. > > >> # smartctl -l selftest /dev/sda >> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen >> Home page is http://smartmontools.sourceforge.net/ >> >> Standard Inquiry (36 bytes) failed [No such device] >> Retrying with a 64 byte Standard Inquiry >> Standard Inquiry (64 bytes) failed [No such device] >> A mandatory SMART command failed: exiting. To continue, add one or more '- >> T permissive' options. > > Well, we kind of knew that. Either the drive is dead, or there is a > hardware problem in the controller path. Hope for the latter, although a > drive with a frozen platter can sometimes be resurrected, and if the drive > electronics are bad but the servo assemblies are OK, replacing the > electronics is not difficult. Otherwise, it's a goner. > >> # smartctl -l selftest /dev/sdb >> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen >> Home page is http://smartmontools.sourceforge.net/ >> >> === START OF READ SMART DATA SECTION === >> SMART Self-test log structure revision number 1 >> Num Test_Description Status Remaining >> LifeTime(hours) LBA_of_first_error >> # 1 Extended offline Completed: read failure 90% 7963 >> 543357 > > Oooh! That's bad. Really bad. 
Your earlier post showed the > superblock is a 0.90 version. The 0.90 superblock is stored near the end of > the partition. Your drive is suffering a heart attack when it gets near the > end of the drive. If you can't get your sda drive working again, then I'm > afraid you've lost some data, maybe all of it. Trying to rebuild a > partition from scratch when part of it is corrupted is not for the feint of > heart. If you are lucky, you might be able to dd part of the sdb drive onto > a healthy one and manually restore the superblock. That, or since the sda > drive does appear in /dev, you might have some luck copying some of it to a > new drive. > > Beyond that, you are either going to need the advice of someone who > knows much more about md and Linux than I do, or else the services of a > professional drive recovery expert. They don't come cheap. > >> This is strange, now I am getting info from mdadm --examine that is >> different than before... > > It looks like sda may be responding for the time being. I suggest > you try to assemble the array, and if successful, copy whatever data you can > to a backup device. Do not mount the array as read-write until you have > recovered everything you can. If some data is orphaned, it might be in the > lost+found directory. If that's successful, I suggest you find out why you > had two failures and start over. I wouldn't use a 0.90 superblock, though, > and you definitely want to have monitoring enabled. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > If you have the spare drives/space I -highly- recommend dd_rescue / ddrescue copying the suspected-bad drives contents to clean drives. http://www.linuxfoundation.org/collaborate/workgroups/linux-raid/raid_recovery has a script to try out the combinations so you can see where the least data is lost. -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
Thread overview: 12+ messages
2009-12-26  1:33 fsck problems. Can't restore raid Rick Bragg
2009-12-26  2:47 ` Rick Bragg
2009-12-26  3:12   ` Rick Bragg
2009-12-26 18:47     ` Leslie Rhorer
2009-12-26 19:44       ` Rick Bragg
2009-12-26 21:14         ` Leslie Rhorer
2009-12-26 21:59           ` Green Mountain Network Info
2009-12-27  1:01           ` Rick Bragg
2009-12-27  6:13             ` Leslie Rhorer
2009-12-27 18:41               ` Rick Bragg
2009-12-27 22:47                 ` Leslie Rhorer
2009-12-29  2:46                   ` Michael Evans