* fsck problems. Can't restore raid
@ 2009-12-26 1:33 Rick Bragg
2009-12-26 2:47 ` Rick Bragg
0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26 1:33 UTC (permalink / raw)
To: Linux RAID
Hi,
I have a raid 10 array and for some reason the system went down and I
can't get it back.
During reboot, I get the following error:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate
superblock:
e2fsck -b 8193 <device>
I have tried everything I can think of and I can't seem to do an fsck or
repair the file system.
what can I do?
Thanks
Rick
* Re: fsck problems. Can't restore raid
2009-12-26 1:33 fsck problems. Can't restore raid Rick Bragg
@ 2009-12-26 2:47 ` Rick Bragg
2009-12-26 3:12 ` Rick Bragg
0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26 2:47 UTC (permalink / raw)
To: Linux RAID
On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> Hi,
>
> I have a raid 10 array and for some reason the system went down and I
> can't get it back.
>
> during re-boot, I get the following error:
>
> The superblock could not be read or does not describe a correct ext2
> filesystem. If the device is valid and it really contains an ext2
> filesystem (and not swap or ufs or something else), then the superblock
> is corrupt, and you might try running e2fsck with an alternate
> superblock:
> e2fsck -b 8193 <device>
>
> I have tried everything I can think of and I can't seem to do an fsck or
> repair the file system.
>
> what can I do?
>
> Thanks
> Rick
>
More info:
My array is made up of /dev/sda, sdb, sdc, and sdd. However they are
not mounted right now. My OS is booted off of /dev/sde. I am running
Ubuntu 9.04.
mdadm -Q --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
Where do I take it from here? I'm not as up on this as I should be;
in fact, I am quite a newbie to this... Any help would be greatly
appreciated.
Thanks
Rick
* Re: fsck problems. Can't restore raid
2009-12-26 2:47 ` Rick Bragg
@ 2009-12-26 3:12 ` Rick Bragg
2009-12-26 18:47 ` Leslie Rhorer
0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26 3:12 UTC (permalink / raw)
To: Linux RAID
On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > Hi,
> >
> > I have a raid 10 array and for some reason the system went down and I
> > can't get it back.
> >
> > during re-boot, I get the following error:
> >
> > The superblock could not be read or does not describe a correct ext2
> > filesystem. If the device is valid and it really contains an ext2
> > filesystem (and not swap or ufs or something else), then the superblock
> > is corrupt, and you might try running e2fsck with an alternate
> > superblock:
> > e2fsck -b 8193 <device>
> >
> > I have tried everything I can think of and I can't seem to do an fsck or
> > repair the file system.
> >
> > what can I do?
> >
> > Thanks
> > Rick
> >
>
>
> More info:
>
> My array is made up of /dev/sda, sdb, sdc, and sdd. However they are
> not mounted right now. My OS is booted off of /dev/sde. I am running
> ubuntu 9.04
>
> mdadm -Q --detail /dev/md0
> mdadm: md device /dev/md0 does not appear to be active.
>
> Where do I take if from here? I'm not up on this as much as I should be
> at all. In fact I am quite a newbe to this... Any help would be greatly
> appreciated.
>
> Thanks
> Rick
>
Here is even more info:
# mdadm --assemble --scan
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
# mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
mdadm: cannot open device /dev/sdb: Device or resource busy
mdadm: /dev/sdb has no superblock - assembly aborted
Is my array toast?
What can I do?
Thanks
Rick
* RE: fsck problems. Can't restore raid
2009-12-26 3:12 ` Rick Bragg
@ 2009-12-26 18:47 ` Leslie Rhorer
2009-12-26 19:44 ` Rick Bragg
0 siblings, 1 reply; 12+ messages in thread
From: Leslie Rhorer @ 2009-12-26 18:47 UTC (permalink / raw)
To: 'Rick Bragg', 'Linux RAID'
I take it from your post the drives are not partitioned, and the
RAID array consists of raw disk members? First, check the superblocks of
the md devices:
`mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks
are corrupt, then that's your problem. If not, then it should be possible
to get the array mounted one way or the other. Once you get the array
assembled again, then you can repair it, if need be. Once that is done, you
can repair the file system if it is corrupted. Once everything is clean,
you can mount the file system, and if necessary attempt to recover any lost
files.
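For what it's worth, a rough sketch of that sequence -- assuming the array is
/dev/md0, the members are the bare disks rather than partitions, and the
ext2/ext3 filesystem sits directly on the array (none of which is confirmed
yet):
# cat /proc/mdstat                     (see what md has already grabbed)
# mdadm --stop /dev/md0                (the "Device or resource busy" on sdb is most
                                        likely just the half-assembled md0 holding it)
# mdadm --examine /dev/sda /dev/sdb /dev/sdc /dev/sdd
# mdadm --assemble /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd
# mdadm --detail /dev/md0              (is it active, and which member is missing?)
# e2fsck -n /dev/md0                   (read-only check first)
# mke2fs -n /dev/md0                   (dry run only: prints backup superblock locations,
                                        writes nothing, but is only right if given the
                                        same options as the original mkfs)
# e2fsck -b 32768 /dev/md0             (retry against one of the listed backups; 32768 is
                                        just a common location, not a known-good value here)
If the filesystem actually lives somewhere else (a partition on md0, or LVM
on top of it), substitute that device in the e2fsck lines.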
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Rick Bragg
> Sent: Friday, December 25, 2009 9:13 PM
> To: Linux RAID
> Subject: Re: fsck problems. Can't restore raid
>
> On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > > Hi,
> > >
> > > I have a raid 10 array and for some reason the system went down and I
> > > can't get it back.
> > >
> > > during re-boot, I get the following error:
> > >
> > > The superblock could not be read or does not describe a correct ext2
> > > filesystem. If the device is valid and it really contains an ext2
> > > filesystem (and not swap or ufs or something else), then the
> superblock
> > > is corrupt, and you might try running e2fsck with an alternate
> > > superblock:
> > > e2fsck -b 8193 <device>
> > >
> > > I have tried everything I can think of and I can't seem to do an fsck
> or
> > > repair the file system.
> > >
> > > what can I do?
> > >
> > > Thanks
> > > Rick
> > >
> >
> >
> > More info:
> >
> > My array is made up of /dev/sda, sdb, sdc, and sdd. However they are
> > not mounted right now. My OS is booted off of /dev/sde. I am running
> > ubuntu 9.04
> >
> > mdadm -Q --detail /dev/md0
> > mdadm: md device /dev/md0 does not appear to be active.
> >
> > Where do I take if from here? I'm not up on this as much as I should be
> > at all. In fact I am quite a newbe to this... Any help would be greatly
> > appreciated.
> >
> > Thanks
> > Rick
> >
>
>
> Here is even more info:
>
> # mdadm --assemble --scan
> mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
>
> # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
> mdadm: cannot open device /dev/sdb: Device or resource busy
> mdadm: /dev/sdb has no superblock - assembly aborted
>
> Is my array toast?
> What can I do?
>
> Thanks
> Rick
>
>
* RE: fsck problems. Can't restore raid
2009-12-26 18:47 ` Leslie Rhorer
@ 2009-12-26 19:44 ` Rick Bragg
2009-12-26 21:14 ` Leslie Rhorer
0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-26 19:44 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: 'Linux RAID'
Hi
Thanks. It was an array that had been up and running for a long time, and all
of a sudden this happened, so the drives were formatted and running fine
before.
If I run `mdadm --examine /dev/sda`, etc. on my disks, I get the
following error:
mdadm: No md superblock detected on /dev/sda.
That's on all disks (sda, sdb, sdc, and sdd).
When I run fdisk on /dev/sda I get the following error:
Unable to read /dev/sda
However, running fdisk on all other disks shows that they are up and
formatted with "raid" file type.
Not sure what I can do next...
Thanks
Rick
On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote:
> I take it from your post the drives are not partitioned, and the
> RAID array consists of raw disk members? First, check the superblocks of
> the md devices:
>
> `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks
> are corrupt, then that's your problem. If not, then it should be possible
> to get the array mounted one way or the other. Once you get the array
> assembled again, then you can repair it, if need be. Once that is done, you
> can repair the file system if it is corrupted. Once everything is clean,
> you can mount the file system, and if necessary attempt to recover any lost
> files.
>
> > -----Original Message-----
> > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > owner@vger.kernel.org] On Behalf Of Rick Bragg
> > Sent: Friday, December 25, 2009 9:13 PM
> > To: Linux RAID
> > Subject: Re: fsck problems. Can't restore raid
> >
> > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > > > Hi,
> > > >
> > > > I have a raid 10 array and for some reason the system went down and I
> > > > can't get it back.
> > > >
> > > > during re-boot, I get the following error:
> > > >
> > > > The superblock could not be read or does not describe a correct ext2
> > > > filesystem. If the device is valid and it really contains an ext2
> > > > filesystem (and not swap or ufs or something else), then the
> > superblock
> > > > is corrupt, and you might try running e2fsck with an alternate
> > > > superblock:
> > > > e2fsck -b 8193 <device>
> > > >
> > > > I have tried everything I can think of and I can't seem to do an fsck
> > or
> > > > repair the file system.
> > > >
> > > > what can I do?
> > > >
> > > > Thanks
> > > > Rick
> > > >
> > >
> > >
> > > More info:
> > >
> > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they are
> > > not mounted right now. My OS is booted off of /dev/sde. I am running
> > > ubuntu 9.04
> > >
> > > mdadm -Q --detail /dev/md0
> > > mdadm: md device /dev/md0 does not appear to be active.
> > >
> > > Where do I take if from here? I'm not up on this as much as I should be
> > > at all. In fact I am quite a newbe to this... Any help would be greatly
> > > appreciated.
> > >
> > > Thanks
> > > Rick
> > >
> >
> >
> > Here is even more info:
> >
> > # mdadm --assemble --scan
> > mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
> >
> > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
> > mdadm: cannot open device /dev/sdb: Device or resource busy
> > mdadm: /dev/sdb has no superblock - assembly aborted
> >
> > Is my array toast?
> > What can I do?
> >
> > Thanks
> > Rick
> >
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* RE: fsck problems. Can't restore raid
2009-12-26 19:44 ` Rick Bragg
@ 2009-12-26 21:14 ` Leslie Rhorer
2009-12-26 21:59 ` Green Mountain Network Info
2009-12-27 1:01 ` Rick Bragg
0 siblings, 2 replies; 12+ messages in thread
From: Leslie Rhorer @ 2009-12-26 21:14 UTC (permalink / raw)
To: 'Rick Bragg'; +Cc: 'Linux RAID'
> Thanks, it was an array that was up and running for a long time, and all
> of a sudden, this happened. so there were formatted and up and running
> fine.
Well, you didn't quite answer my question. Are the drives
partitioned, or not?
> If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the
> following error on all disks:
> mdadm: No md superblock detected on /dev/sda.
> thats on all disks... (sda, sdb, sdc, and sdd)
Well, we know the array is at least partially assembling, so it is
finding at least some of the superblocks. It sounds to me like perhaps the
drives are partitioned.
> When I run fdisk on /dev/sda I get the following error:
> Unable to read /dev/sda
That sounds like a dead drive. I suggest running SMART tests on it.
You might try changing the cable or moving it to another controller port.
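Something like the following, per drive (the long test runs in the background
on the drive itself and can take a couple of hours):
# smartctl -a /dev/sda             (identity, health attributes and the error log in one shot)
# smartctl -t long /dev/sda        (start the extended offline self-test)
# smartctl -l selftest /dev/sda    (read the results once it finishes)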
> However, running fdisk on all other disks shows that they are up and
> formatted with "raid" file type.
Formatted, or partitioned? You might post the output from fdisk
when you type "p" at the FDISK prompt. At a guess, I would say perhaps your
drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4,
...), /dev/sdb1, etc. Try typing
`ls /dev/sd*`
and see what pops up. There should be references to sda, sdb, etc. If
there are also references to sdb1, sdb2, etc, then your drives have valid
partitions, and it is those (or some of them, at least) which are targets
for md. Once you have determined which drives have which partitions, then
issue the commands
` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is
"1", "2", "3", etc., including only those values returned by the ls command.
For example, on one of my systems:
RAID-Server:/etc/ssmtp# ls /dev/sd*
/dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj
/dev/sdl
/dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk
RAID-Server:/etc/ssmtp# ls /dev/hd*
/dev/hda /dev/hda1 /dev/hda2 /dev/hda3
This result tells us I have one PATA drive (hda) with three
partitions on it. It also tells us I have one (probably) SATA or SCSI drive
(sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no
partitions on them.
If I issue the examine command on /dev/sda, I get an error:
RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda
mdadm: No md superblock detected on /dev/sda.
That's because in this case the DRIVE does not have an md
superblock. It is the PARTITIONS which have superblocks:
RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 1.0
Feature Map : 0x1
Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d
Name : 'RAID-Server':1
Creation Time : Wed Dec 23 23:46:28 2009
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 803160 (392.23 MiB 411.22 MB)
Array Size : 803160 (392.23 MiB 411.22 MB)
Super Offset : 803168 sectors
State : clean
Device UUID : 28212297:1d982d5d:ce41b6fe:03720159
Internal Bitmap : 2 sectors from superblock
Update Time : Sat Dec 26 13:00:32 2009
Checksum : af8f04b1 - correct
Events : 204
Array Slot : 1 (failed, 1, 0)
Array State : uU 1 failed
The other 11 drives, however, are un-partitioned, and the md
superblock rests on the drive device itself:
RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca
Name : 'RAID-Server':0
Creation Time : Sat Apr 25 01:17:12 2009
Raid Level : raid6
Raid Devices : 11
Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
Array Size : 17581722624 (8383.62 GiB 9001.84 GB)
Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : d40c9255:cef0739f:966d448d:e549ada8
Internal Bitmap : 2 sectors from superblock
Update Time : Sat Dec 26 15:09:44 2009
Checksum : e290ec2f - correct
Events : 1193460
Chunk Size : 256K
Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7)
Array State : Uuuuuuuuuuu
>
> Not sure what I can do next...
>
> Thanks
> Rick
>
>
>
>
>
>
> On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote:
> > I take it from your post the drives are not partitioned, and the
> > RAID array consists of raw disk members? First, check the superblocks
> of
> > the md devices:
> >
> > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks
> > are corrupt, then that's your problem. If not, then it should be
> possible
> > to get the array mounted one way or the other. Once you get the array
> > assembled again, then you can repair it, if need be. Once that is done,
> you
> > can repair the file system if it is corrupted. Once everything is
> clean,
> > you can mount the file system, and if necessary attempt to recover any
> lost
> > files.
> >
> > > -----Original Message-----
> > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > > owner@vger.kernel.org] On Behalf Of Rick Bragg
> > > Sent: Friday, December 25, 2009 9:13 PM
> > > To: Linux RAID
> > > Subject: Re: fsck problems. Can't restore raid
> > >
> > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > > > > Hi,
> > > > >
> > > > > I have a raid 10 array and for some reason the system went down
> and I
> > > > > can't get it back.
> > > > >
> > > > > during re-boot, I get the following error:
> > > > >
> > > > > The superblock could not be read or does not describe a correct
> ext2
> > > > > filesystem. If the device is valid and it really contains an ext2
> > > > > filesystem (and not swap or ufs or something else), then the
> > > superblock
> > > > > is corrupt, and you might try running e2fsck with an alternate
> > > > > superblock:
> > > > > e2fsck -b 8193 <device>
> > > > >
> > > > > I have tried everything I can think of and I can't seem to do an
> fsck
> > > or
> > > > > repair the file system.
> > > > >
> > > > > what can I do?
> > > > >
> > > > > Thanks
> > > > > Rick
> > > > >
> > > >
> > > >
> > > > More info:
> > > >
> > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they
> are
> > > > not mounted right now. My OS is booted off of /dev/sde. I am
> running
> > > > ubuntu 9.04
> > > >
> > > > mdadm -Q --detail /dev/md0
> > > > mdadm: md device /dev/md0 does not appear to be active.
> > > >
> > > > Where do I take if from here? I'm not up on this as much as I
> should be
> > > > at all. In fact I am quite a newbe to this... Any help would be
> greatly
> > > > appreciated.
> > > >
> > > > Thanks
> > > > Rick
> > > >
> > >
> > >
> > > Here is even more info:
> > >
> > > # mdadm --assemble --scan
> > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the
> array.
> > >
> > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
> > > mdadm: cannot open device /dev/sdb: Device or resource busy
> > > mdadm: /dev/sdb has no superblock - assembly aborted
> > >
> > > Is my array toast?
> > > What can I do?
> > >
> > > Thanks
> > > Rick
> > >
> > >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
>
* RE: fsck problems. Can't restore raid
2009-12-26 21:14 ` Leslie Rhorer
@ 2009-12-26 21:59 ` Green Mountain Network Info
2009-12-27 1:01 ` Rick Bragg
1 sibling, 0 replies; 12+ messages in thread
From: Green Mountain Network Info @ 2009-12-26 21:59 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: 'Linux RAID'
Hi,
Thanks again Leslie. They are partitioned as Linux RAID, and the filesystem
was ext3. They are all sdX1 (where X is a, b, c, or d). The RAID array is
not a bootable system at all, just a mounted data volume; I am running the
system off of a totally different drive (/dev/sde). Also, I am running
a SMART test now on all the drives:
# smartctl -t /dev/sdx... I didn't know this existed, so I will await
the output and hope to make sense of it. I will try to change up the
cables and ports once the smartctl results are in.
Following is some more info:
fdisk info:
# fdisk /dev/sda
Unable to read /dev/sda
# fdisk /dev/sdb
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdb1 1 60801 488384001 fd Linux raid
autodetect
# fdisk /dev/sdc
Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00045567
Device Boot Start End Blocks Id System
/dev/sdc1 1 60801 488384001 fd Linux raid
autodetect
# fdisk /dev/sdd
Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdd1 1 60801 488384001 fd Linux raid
autodetect
# mdadm --examine /dev/sda1
mdadm: No md superblock detected on /dev/sda1.
(no surprise there...)
# mdadm --examine /dev/sdb1
mdadm: No md superblock detected on /dev/sdb1.
(Does this mean that sdb1 is bad? or is that OK?)
# mdadm --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3ad - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
# mdadm --examine /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3bf - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 49 3 active sync /dev/sdd1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
Also, for the controllers, I am using a couple of Promise cards:
# lspci
...
01:02.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
...
01:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
If any of this stands out as something really wrong, please let me know.
Thanks again so much for your help!
rick
On Sat, 2009-12-26 at 15:14 -0600, Leslie Rhorer wrote:
> > Thanks, it was an array that was up and running for a long time, and all
> > of a sudden, this happened. so there were formatted and up and running
> > fine.
>
> Well, you didn't quite answer my question. Are the drives
> partitioned, or not?
>
> > If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the
> > following error on all disks:
> > mdadm: No md superblock detected on /dev/sda.
> > thats on all disks... (sda, sdb, sdc, and sdd)
>
> Well, we know the array is at least partially assembling, so it is
> finding at least some of the superblocks. It sounds to me like perhaps the
> drives are partitioned.
>
> > When I run fdisk on /dev/sda I get the following error:
> > Unable to read /dev/sda
>
> That sounds like a dead drive. I suggest running SMART tests on it.
> You might try changing the cable or moving it to another controller port.
>
> > However, running fdisk on all other disks shows that they are up and
> > formatted with "raid" file type.
>
> Formatted, or partitioned? You might post the output from fdisk
> when you type "p" at the FDISK prompt. At a guess, I would say perhaps your
> drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4,
> ...), /dev/sdb1, etc. Try typing
>
> `ls /dev/sd*`
>
> and see what pops up. There should be references to sda, sdb, etc. If
> there are also references to sdb1, sdb2, etc, then your drives have valid
> partitions, and it is those (or some of them, at least) which are targets
> for md. Once you have determined which drives have which partitions, then
> issue the commands
>
> ` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is
> "1", "2", "3", etc., including only those values returned by the ls command.
>
> For example, on one of my systems:
>
> RAID-Server:/etc/ssmtp# ls /dev/sd*
> /dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj
> /dev/sdl
> /dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk
> RAID-Server:/etc/ssmtp# ls /dev/hd*
> /dev/hda /dev/hda1 /dev/hda2 /dev/hda3
>
> This result tells us I have one PATA drive (hda) with three
> partitions on it. It also tells us I have one (probably) SATA or SCSI drive
> (sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no
> partitions on them.
>
> If I issue the examine command on /dev/sda, I get an error:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda
> mdadm: No md superblock detected on /dev/sda.
>
> That's because in this case the DRIVE does not have an md
> superblock. It is the PARTITIONS which have superblocks:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1
> /dev/sda1:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d
> Name : 'RAID-Server':1
> Creation Time : Wed Dec 23 23:46:28 2009
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 803160 (392.23 MiB 411.22 MB)
> Array Size : 803160 (392.23 MiB 411.22 MB)
> Super Offset : 803168 sectors
> State : clean
> Device UUID : 28212297:1d982d5d:ce41b6fe:03720159
>
> Internal Bitmap : 2 sectors from superblock
> Update Time : Sat Dec 26 13:00:32 2009
> Checksum : af8f04b1 - correct
> Events : 204
>
>
> Array Slot : 1 (failed, 1, 0)
> Array State : uU 1 failed
>
> The other 11 drives, however, are un-partitioned, and the md
> superblock rests on the drive device itself:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb
> /dev/sdb:
> Magic : a92b4efc
> Version : 1.2
> Feature Map : 0x1
> Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca
> Name : 'RAID-Server':0
> Creation Time : Sat Apr 25 01:17:12 2009
> Raid Level : raid6
> Raid Devices : 11
>
> Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
> Array Size : 17581722624 (8383.62 GiB 9001.84 GB)
> Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
> Data Offset : 272 sectors
> Super Offset : 8 sectors
> State : clean
> Device UUID : d40c9255:cef0739f:966d448d:e549ada8
>
> Internal Bitmap : 2 sectors from superblock
> Update Time : Sat Dec 26 15:09:44 2009
> Checksum : e290ec2f - correct
> Events : 1193460
>
> Chunk Size : 256K
>
> Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7)
> Array State : Uuuuuuuuuuu
>
> >
> > Not sure what I can do next...
> >
> > Thanks
> > Rick
> >
> >
> >
> >
> >
> >
> > On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote:
> > > I take it from your post the drives are not partitioned, and the
> > > RAID array consists of raw disk members? First, check the superblocks
> > of
> > > the md devices:
> > >
> > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks
> > > are corrupt, then that's your problem. If not, then it should be
> > possible
> > > to get the array mounted one way or the other. Once you get the array
> > > assembled again, then you can repair it, if need be. Once that is done,
> > you
> > > can repair the file system if it is corrupted. Once everything is
> > clean,
> > > you can mount the file system, and if necessary attempt to recover any
> > lost
> > > files.
> > >
> > > > -----Original Message-----
> > > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > > > owner@vger.kernel.org] On Behalf Of Rick Bragg
> > > > Sent: Friday, December 25, 2009 9:13 PM
> > > > To: Linux RAID
> > > > Subject: Re: fsck problems. Can't restore raid
> > > >
> > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> > > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have a raid 10 array and for some reason the system went down
> > and I
> > > > > > can't get it back.
> > > > > >
> > > > > > during re-boot, I get the following error:
> > > > > >
> > > > > > The superblock could not be read or does not describe a correct
> > ext2
> > > > > > filesystem. If the device is valid and it really contains an ext2
> > > > > > filesystem (and not swap or ufs or something else), then the
> > > > superblock
> > > > > > is corrupt, and you might try running e2fsck with an alternate
> > > > > > superblock:
> > > > > > e2fsck -b 8193 <device>
> > > > > >
> > > > > > I have tried everything I can think of and I can't seem to do an
> > fsck
> > > > or
> > > > > > repair the file system.
> > > > > >
> > > > > > what can I do?
> > > > > >
> > > > > > Thanks
> > > > > > Rick
> > > > > >
> > > > >
> > > > >
> > > > > More info:
> > > > >
> > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they
> > are
> > > > > not mounted right now. My OS is booted off of /dev/sde. I am
> > running
> > > > > ubuntu 9.04
> > > > >
> > > > > mdadm -Q --detail /dev/md0
> > > > > mdadm: md device /dev/md0 does not appear to be active.
> > > > >
> > > > > Where do I take if from here? I'm not up on this as much as I
> > should be
> > > > > at all. In fact I am quite a newbe to this... Any help would be
> > greatly
> > > > > appreciated.
> > > > >
> > > > > Thanks
> > > > > Rick
> > > > >
> > > >
> > > >
> > > > Here is even more info:
> > > >
> > > > # mdadm --assemble --scan
> > > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the
> > array.
> > > >
> > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
> > > > mdadm: cannot open device /dev/sdb: Device or resource busy
> > > > mdadm: /dev/sdb has no superblock - assembly aborted
> > > >
> > > > Is my array toast?
> > > > What can I do?
> > > >
> > > > Thanks
> > > > Rick
> > > >
> > > >
> > >
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >
> >
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Green Mountain Network
PO Box 1462
Burlington, VT 05402
802.264.4851
info@gmnet.net
* RE: fsck problems. Can't restore raid
2009-12-26 21:14 ` Leslie Rhorer
2009-12-26 21:59 ` Green Mountain Network Info
@ 2009-12-27 1:01 ` Rick Bragg
2009-12-27 6:13 ` Leslie Rhorer
1 sibling, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-27 1:01 UTC (permalink / raw)
To: 'Linux RAID'
Hi,
Thanks again Leslie. They are partitioned as Linux RAID, and the filesystem
was ext3. They are all sdX1 (where X is a, b, c, or d). The RAID array is
not a bootable system at all, just a mounted data volume; I am running the
system off of a totally different drive (/dev/sde). Also, I am running
a SMART test now on all the drives:
# smartctl -t /dev/sdx... I didn't know this existed, so I will await
the output and hope to make sense of it. I will try to change up the
cables and ports once the smartctl results are in.
Following is some more info:
fdisk info:
# fdisk /dev/sda
Unable to read /dev/sda
# fdisk /dev/sdb
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdb1 1 60801 488384001 fd Linux raid
autodetect
# fdisk /dev/sdc
Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00045567
Device Boot Start End Blocks Id System
/dev/sdc1 1 60801 488384001 fd Linux raid
autodetect
# fdisk /dev/sdd
Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Device Boot Start End Blocks Id System
/dev/sdd1 1 60801 488384001 fd Linux raid
autodetect
# mdadm --examine /dev/sda1
mdadm: No md superblock detected on /dev/sda1.
(no surprise there...)
# mdadm --examine /dev/sdb1
mdadm: No md superblock detected on /dev/sdb1.
(Does this mean that sdb1 is bad? or is that OK?)
# mdadm --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3ad - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
# mdadm --examine /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3bf - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 49 3 active sync /dev/sdd1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
Also, for the controllers, I am using a couple of Promise cards:
# lspci
...
01:02.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
...
01:06.0 Mass storage controller: Promise Technology, Inc. PDC40718 (SATA
300 TX4) (rev 02)
If any of this stands out as something really wrong, please let me know.
Thanks again so much for your help!
rick
On Sat, 2009-12-26 at 15:14 -0600, Leslie Rhorer wrote:
> > Thanks, it was an array that was up and running for a long time, and all
> > of a sudden, this happened. so there were formatted and up and running
> > fine.
>
> Well, you didn't quite answer my question. Are the drives
> partitioned, or not?
>
> > If I run, `mdadm --examine /dev/sda` etc. on all my disks, I get the
> > following error on all disks:
> > mdadm: No md superblock detected on /dev/sda.
> > thats on all disks... (sda, sdb, sdc, and sdd)
>
> Well, we know the array is at least partially assembling, so it is
> finding at least some of the superblocks. It sounds to me like perhaps the
> drives are partitioned.
>
> > When I run fdisk on /dev/sda I get the following error:
> > Unable to read /dev/sda
>
> That sounds like a dead drive. I suggest running SMART tests on it.
> You might try changing the cable or moving it to another controller port.
>
> > However, running fdisk on all other disks shows that they are up and
> > formatted with "raid" file type.
>
> Formatted, or partitioned? You might post the output from fdisk
> when you type "p" at the FDISK prompt. At a guess, I would say perhaps your
> drives are partitioned and the devices are /dev/sda1 (or 2, or 3, or 4,
> ...), /dev/sdb1, etc. Try typing
>
> `ls /dev/sd*`
>
> and see what pops up. There should be references to sda, sdb, etc. If
> there are also references to sdb1, sdb2, etc, then your drives have valid
> partitions, and it is those (or some of them, at least) which are targets
> for md. Once you have determined which drives have which partitions, then
> issue the commands
>
> ` mdadm --examine /dev/sdxy`, where x is "a", "b", "c", "d", etc., and y is
> "1", "2", "3", etc., including only those values returned by the ls command.
>
> For example, on one of my systems:
>
> RAID-Server:/etc/ssmtp# ls /dev/sd*
> /dev/sda /dev/sda2 /dev/sdb /dev/sdd /dev/sdf /dev/sdh /dev/sdj
> /dev/sdl
> /dev/sda1 /dev/sda3 /dev/sdc /dev/sde /dev/sdg /dev/sdi /dev/sdk
> RAID-Server:/etc/ssmtp# ls /dev/hd*
> /dev/hda /dev/hda1 /dev/hda2 /dev/hda3
>
> This result tells us I have one PATA drive (hda) with three
> partitions on it. It also tells us I have one (probably) SATA or SCSI drive
> (sda) with three partitions, and 11 SATA or SCSI drives (sdb - sdl) with no
> partitions on them.
>
> If I issue the examine command on /dev/sda, I get an error:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda
> mdadm: No md superblock detected on /dev/sda.
>
> That's because in this case the DRIVE does not have an md
> superblock. It is the PARTITIONS which have superblocks:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sda1
> /dev/sda1:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x1
> Array UUID : 76e8e11d:e0183c3c:404cb86a:19a7cb3d
> Name : 'RAID-Server':1
> Creation Time : Wed Dec 23 23:46:28 2009
> Raid Level : raid1
> Raid Devices : 2
>
> Avail Dev Size : 803160 (392.23 MiB 411.22 MB)
> Array Size : 803160 (392.23 MiB 411.22 MB)
> Super Offset : 803168 sectors
> State : clean
> Device UUID : 28212297:1d982d5d:ce41b6fe:03720159
>
> Internal Bitmap : 2 sectors from superblock
> Update Time : Sat Dec 26 13:00:32 2009
> Checksum : af8f04b1 - correct
> Events : 204
>
>
> Array Slot : 1 (failed, 1, 0)
> Array State : uU 1 failed
>
> The other 11 drives, however, are un-partitioned, and the md
> superblock rests on the drive device itself:
>
> RAID-Server:/etc/ssmtp# mdadm --examine /dev/sdb
> /dev/sdb:
> Magic : a92b4efc
> Version : 1.2
> Feature Map : 0x1
> Array UUID : 5ff10d73:a096195f:7a646bba:a68986ca
> Name : 'RAID-Server':0
> Creation Time : Sat Apr 25 01:17:12 2009
> Raid Level : raid6
> Raid Devices : 11
>
> Avail Dev Size : 1953524896 (931.51 GiB 1000.20 GB)
> Array Size : 17581722624 (8383.62 GiB 9001.84 GB)
> Used Dev Size : 1953524736 (931.51 GiB 1000.20 GB)
> Data Offset : 272 sectors
> Super Offset : 8 sectors
> State : clean
> Device UUID : d40c9255:cef0739f:966d448d:e549ada8
>
> Internal Bitmap : 2 sectors from superblock
> Update Time : Sat Dec 26 15:09:44 2009
> Checksum : e290ec2f - correct
> Events : 1193460
>
> Chunk Size : 256K
>
> Array Slot : 0 (0, 1, 2, 3, 4, 5, 6, 10, 8, 9, 7)
> Array State : Uuuuuuuuuuu
>
> >
> > Not sure what I can do next...
> >
> > Thanks
> > Rick
> >
> >
> >
> >
> >
> >
> > On Sat, 2009-12-26 at 12:47 -0600, Leslie Rhorer wrote:
> > > I take it from your post the drives are not partitioned, and the
> > > RAID array consists of raw disk members? First, check the superblocks
> > of
> > > the md devices:
> > >
> > > `mdadm --examine /dev/sda`, etc. If 2 or more of the superblocks
> > > are corrupt, then that's your problem. If not, then it should be
> > possible
> > > to get the array mounted one way or the other. Once you get the array
> > > assembled again, then you can repair it, if need be. Once that is done,
> > you
> > > can repair the file system if it is corrupted. Once everything is
> > clean,
> > > you can mount the file system, and if necessary attempt to recover any
> > lost
> > > files.
> > >
> > > > -----Original Message-----
> > > > From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> > > > owner@vger.kernel.org] On Behalf Of Rick Bragg
> > > > Sent: Friday, December 25, 2009 9:13 PM
> > > > To: Linux RAID
> > > > Subject: Re: fsck problems. Can't restore raid
> > > >
> > > > On Fri, 2009-12-25 at 21:47 -0500, Rick Bragg wrote:
> > > > > On Fri, 2009-12-25 at 20:33 -0500, Rick Bragg wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I have a raid 10 array and for some reason the system went down
> > and I
> > > > > > can't get it back.
> > > > > >
> > > > > > during re-boot, I get the following error:
> > > > > >
> > > > > > The superblock could not be read or does not describe a correct
> > ext2
> > > > > > filesystem. If the device is valid and it really contains an ext2
> > > > > > filesystem (and not swap or ufs or something else), then the
> > > > superblock
> > > > > > is corrupt, and you might try running e2fsck with an alternate
> > > > > > superblock:
> > > > > > e2fsck -b 8193 <device>
> > > > > >
> > > > > > I have tried everything I can think of and I can't seem to do an
> > fsck
> > > > or
> > > > > > repair the file system.
> > > > > >
> > > > > > what can I do?
> > > > > >
> > > > > > Thanks
> > > > > > Rick
> > > > > >
> > > > >
> > > > >
> > > > > More info:
> > > > >
> > > > > My array is made up of /dev/sda, sdb, sdc, and sdd. However they
> > are
> > > > > not mounted right now. My OS is booted off of /dev/sde. I am
> > running
> > > > > ubuntu 9.04
> > > > >
> > > > > mdadm -Q --detail /dev/md0
> > > > > mdadm: md device /dev/md0 does not appear to be active.
> > > > >
> > > > > Where do I take if from here? I'm not up on this as much as I
> > should be
> > > > > at all. In fact I am quite a newbe to this... Any help would be
> > greatly
> > > > > appreciated.
> > > > >
> > > > > Thanks
> > > > > Rick
> > > > >
> > > >
> > > >
> > > > Here is even more info:
> > > >
> > > > # mdadm --assemble --scan
> > > > mdadm: /dev/md0 assembled from 2 drives - not enough to start the
> > array.
> > > >
> > > > # mdadm --assemble /dev/sda /dev/sdb /dev/sdc /dev/sdd
> > > > mdadm: cannot open device /dev/sdb: Device or resource busy
> > > > mdadm: /dev/sdb has no superblock - assembly aborted
> > > >
> > > > Is my array toast?
> > > > What can I do?
> > > >
> > > > Thanks
> > > > Rick
> > > >
> > > >
> > >
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >
> >
> >
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
* RE: fsck problems. Can't restore raid
2009-12-27 1:01 ` Rick Bragg
@ 2009-12-27 6:13 ` Leslie Rhorer
2009-12-27 18:41 ` Rick Bragg
0 siblings, 1 reply; 12+ messages in thread
From: Leslie Rhorer @ 2009-12-27 6:13 UTC (permalink / raw)
To: 'Rick Bragg', 'Linux RAID'
> # mdadm --examine /dev/sdb1
> mdadm: No md superblock detected on /dev/sdb1.
>
> (Does this mean that sdb1 is bad? or is that OK?)
It doesn't necessarily mean the drive is bad, but the superblock is
gone. Are you having mdadm monitor your array(s) and send informational
messages to you upon RAID events? If not, then what may have happened is
you lost the superblock on sdb1 and at some other time - before or after -
lost the sda drive. Once both events had taken place, your array is toast.
All may not be lost, however. First of all, take care when
re-arranging not to lose track of which drive was which at the outset. In
fact, other than the sda drive, you might be best served not to move
anything. Take special care if the system re-assigns drive letters, as it
can easily do.
When you created your array, one of the steps you should have taken
was to put the drive configuration into /etc/mdadm.conf. In particular, you
may need to attempt to re-create the array by mimicking the original command
used to create the array, basically zeroing out the superblocks and starting
again from scratch. If you do it correctly, and are careful, it may be
possible to recover the array. Note, however, the array parameters must
match the original configuration exactly, including the role each drive
plays in the array. If you get them out of line, you can really destroy all
the data.
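If it does come to that, the command would look something like the following --
purely a sketch based on the --examine output posted earlier (raid10, 4 devices,
near=2 layout, 64K chunk, 0.90 superblocks) and on my guess that sda1 and sdb1
held slots 0 and 1; verify every parameter and the device order first, and use
--assume-clean so nothing gets resynced:
# mdadm --create /dev/md0 --assume-clean --metadata=0.90 \
        --level=raid10 --raid-devices=4 --layout=n2 --chunk=64 \
        /dev/sda1 missing /dev/sdc1 /dev/sdd1
Even then, mount it read-only and check the data before anything writes to the
array.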
What are the contents of /etc/mdadm.conf?
* RE: fsck problems. Can't restore raid
2009-12-27 6:13 ` Leslie Rhorer
@ 2009-12-27 18:41 ` Rick Bragg
2009-12-27 22:47 ` Leslie Rhorer
0 siblings, 1 reply; 12+ messages in thread
From: Rick Bragg @ 2009-12-27 18:41 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: 'Linux RAID'
Thanks again Leslie, I threaded my responses below.
On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote:
> > # mdadm --examine /dev/sdb1
> > mdadm: No md superblock detected on /dev/sdb1.
> >
> > (Does this mean that sdb1 is bad? or is that OK?)
>
> It doesn't necessarily mean the drive is bad, but the superblock is
> gone. Are you having mdadm monitor your array(s) and send informational
> messages to you upon RAID events? If not, then what may have happened is
> you lost the superblock on sdb1 and at some other time - before or after -
> lost the sda drive. Once both events had taken place, your array is toast.
Right, I need to set up monitoring...
>
> All may not be lost, however. First of all, take care when
> re-arranging not to lose track of which drive was which at the outset. In
> fact, other than the sda drive, you might be best served not to move
> anything. Take special care if the system re-assigns drive letters, as it
> can easily do.
So should I just "move" the A drive? and try to fire it back up?
>
> When you created your array, one of the steps you should have taken
> was to put the drive configuration into /etc/mdadm.conf. In particular, you
> may need to attempt to re-create the array by mimicking the original command
> used to create the array, basically zeroing out the superblocks and starting
> again from scratch. If you do it correctly, and are careful, it may be
> possible to recover the array. Note, however, the array parameters must
> match the original configuration exactly, including the role each drive
> plays in the array. If you get them out of line, you can really destroy all
> the data.
>
> What are the contents of /etc/mdadm.conf?
>
mdadm.conf contains this:
ARRAY /dev/md0 level=raid10 num-devices=4 UUID=3d93e545:c8d5baec:24e6b15c:676eb40f
So, by re-creating, do you mean I should try to run the "mdadm --create"
command again the same way I did back when I created the array
originally? Will that wipe out my data?
Also, here is my output from the SMART tests. Does this shed any more light
on anything? I'm not sure what the "Remaining" and "LifeTime" fields
mean...
# smartctl -l selftest /dev/sda
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Standard Inquiry (36 bytes) failed [No such device]
Retrying with a 64 byte Standard Inquiry
Standard Inquiry (64 bytes) failed [No such device]
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
# smartctl -l selftest /dev/sdb
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 7963 543357
# smartctl -l selftest /dev/sdc
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 7964 -
# smartctl -l selftest /dev/sdd
smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 7965 -
This is strange: now I am getting info from mdadm --examine that is
different from before...
# mdadm --examine /dev/sda
mdadm: No md superblock detected on /dev/sda.
# mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:40 2009
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : edcd7fb8 - correct
Events : 1131433
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 1 0 active sync /dev/sda1
0 0 8 1 0 active sync /dev/sda1
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
# mdadm --examine /dev/sdb
mdadm: No md superblock detected on /dev/sdb.
# mdadm --examine /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Dec 16 15:33:13 2009
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : edc67464 - correct
Events : 687460
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 17 1 active sync /dev/sdb1
0 0 8 1 0 active sync /dev/sda1
1 1 8 17 1 active sync /dev/sdb1
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
# mdadm --examine /dev/sdc
mdadm: No md superblock detected on /dev/sdc.
# mdadm --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3ad - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
# mdadm --examine /dev/sdd
mdadm: No md superblock detected on /dev/sdd.
# mdadm --examine /dev/sdd1
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : 3d93e545:c8d5baec:24e6b15c:676eb40f (local to host
smoke)
Creation Time : Wed Jan 28 14:58:49 2009
Raid Level : raid10
Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
Array Size : 976767872 (931.52 GiB 1000.21 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Thu Dec 24 19:25:58 2009
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : eddec3bf - correct
Events : 1131438
Layout : near=2, far=1
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 49 3 active sync /dev/sdd1
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
3 3 8 49 3 active sync /dev/sdd1
Thanks again for all your help!
Rick
* RE: fsck problems. Can't restore raid
2009-12-27 18:41 ` Rick Bragg
@ 2009-12-27 22:47 ` Leslie Rhorer
2009-12-29 2:46 ` Michael Evans
0 siblings, 1 reply; 12+ messages in thread
From: Leslie Rhorer @ 2009-12-27 22:47 UTC (permalink / raw)
To: 'Rick Bragg'; +Cc: 'Linux RAID'
> On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote:
> > > # mdadm --examine /dev/sdb1
> > > mdadm: No md superblock detected on /dev/sdb1.
> > >
> > > (Does this mean that sdb1 is bad? or is that OK?)
> >
> > It doesn't necessarily mean the drive is bad, but the superblock is
> > gone. Are you having mdadm monitor your array(s) and send informational
> > messages to you upon RAID events? If not, then what may have happened
> is
> > you lost the superblock on sdb1 and at some other time - before or after
> -
> > lost the sda drive. Once both events had taken place, your array is
> toast.
> Right, I need to set up monitoring...
Um, yeah. A RAID array won't prevent drives from going up in smoke,
and if you don't know a drive has failed, you won't know you need to fix
something - until a second drive fails.
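Getting the alerts going is just a couple of lines. In /etc/mdadm/mdadm.conf
(that's where Ubuntu keeps it; plain /etc/mdadm.conf elsewhere) add a mail
address -- the one below is obviously a placeholder:
MAILADDR you@example.com
then make sure the monitor is running (the Ubuntu mdadm package normally
starts it at boot):
# mdadm --monitor --scan --daemonise --delay=300
and send yourself a test alert to prove the mail path works:
# mdadm --monitor --scan --oneshot --test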
> > All may not be lost, however. First of all, take care when
> > re-arranging not to lose track of which drive was which at the outset.
> In
> > fact, other than the sda drive, you might be best served not to move
> > anything. Take special care if the system re-assigns drive letters, as
> it
> > can easily do.
> So should I just "move" the A drive? and try to fire it back up?
At this point, yeah. Don't lose track of from where and to where it
has been moved, though.
> > What are the contents of /etc/mdadm.conf?
> >
>
> mdadm.conf contains this:
> ARRAY /dev/md0 level=raid10 num-devices=4
> UUID=3d93e545:c8d5baec:24e6b15c:676eb40f
Yeah, that doesn't help much.
> So, by re-creating, do you mean I should try to run the "mdadm --create"
> command again the same way I did back when I created the array
> originally? Will that wipe out my data?
Not in and of itself, no. If you get the drive order wrong
(different than when it was first created) and resync or write to the array,
then it will munge the data, but all creating the array does is create the
superblocks.
> # smartctl -l selftest /dev/sda
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> Standard Inquiry (36 bytes) failed [No such device]
> Retrying with a 64 byte Standard Inquiry
> Standard Inquiry (64 bytes) failed [No such device]
> A mandatory SMART command failed: exiting. To continue, add one or more '-
> T permissive' options.
Well, we kind of knew that. Either the drive is dead, or there is a
hardware problem in the controller path. Hope for the latter, although a
drive with a frozen platter can sometimes be resurrected, and if the drive
electronics are bad but the servo assemblies are OK, replacing the
electronics is not difficult. Otherwise, it's a goner.
> # smartctl -l selftest /dev/sdb
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF READ SMART DATA SECTION ===
> SMART Self-test log structure revision number 1
> Num Test_Description Status Remaining
> LifeTime(hours) LBA_of_first_error
> # 1 Extended offline Completed: read failure 90% 7963
> 543357
Oooh! That's bad. Really bad. Your earlier post showed the
superblock is a 0.90 version. The 0.90 superblock is stored near the end of
the partition. Your drive is suffering a heart attack when it gets near the
end of the drive. If you can't get your sda drive working again, then I'm
afraid you've lost some data, maybe all of it. Trying to rebuild a
partition from scratch when part of it is corrupted is not for the faint of
heart. If you are lucky, you might be able to dd part of the sdb drive onto
a healthy one and manually restore the superblock. That, or since the sda
drive does appear in /dev, you might have some luck copying some of it to a
new drive.
Beyond that, you are either going to need the advice of someone who
knows much more about md and Linux than I do, or else the services of a
professional drive recovery expert. They don't come cheap.
> This is strange, now I am getting info from mdadm --examine that is
> different than before...
It looks like sda may be responding for the time being. I suggest
you try to assemble the array, and if successful, copy whatever data you can
to a backup device. Do not mount the array as read-write until you have
recovered everything you can. If some data is orphaned, it might be in the
lost+found directory. If that's successful, I suggest you find out why you
had two failures and start over. I wouldn't use a 0.90 superblock, though,
and you definitely want to have monitoring enabled.
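Concretely, something along these lines -- assuming the member names from your
--examine output, that --force can bump the stale sda1 back in, and that
/mnt/backup is wherever your backup space lives:
# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1
# cat /proc/mdstat                      (confirm md0 came up, even if degraded)
# mkdir -p /mnt/raid
# mount -o ro /dev/md0 /mnt/raid        (read-only; add ,noload on ext3 to skip
                                         journal replay as well)
# rsync -a /mnt/raid/ /mnt/backup/      (pull everything off before touching the
                                         array again)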
* Re: fsck problems. Can't restore raid
2009-12-27 22:47 ` Leslie Rhorer
@ 2009-12-29 2:46 ` Michael Evans
0 siblings, 0 replies; 12+ messages in thread
From: Michael Evans @ 2009-12-29 2:46 UTC (permalink / raw)
To: Leslie Rhorer; +Cc: Rick Bragg, Linux RAID
On Sun, Dec 27, 2009 at 2:47 PM, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
>> On Sun, 2009-12-27 at 00:13 -0600, Leslie Rhorer wrote:
>> > > # mdadm --examine /dev/sdb1
>> > > mdadm: No md superblock detected on /dev/sdb1.
>> > >
>> > > (Does this mean that sdb1 is bad? or is that OK?)
>> >
>> > It doesn't necessarily mean the drive is bad, but the superblock is
>> > gone. Are you having mdadm monitor your array(s) and send informational
>> > messages to you upon RAID events? If not, then what may have happened
>> is
>> > you lost the superblock on sdb1 and at some other time - before or after
>> -
>> > lost the sda drive. Once both events had taken place, your array is
>> toast.
>> Right, I need to set up monitoring...
>
> Um, yeah. A RAID array won't prevent drives from going up in smoke,
> and if you don't know a drive has failed, you won't know you need to fix
> something - until a second drive fails.
>
>> > All may not be lost, however. First of all, take care when
>> > re-arranging not to lose track of which drive was which at the outset.
>> In
>> > fact, other than the sda drive, you might be best served not to move
>> > anything. Take special care if the system re-assigns drive letters, as
>> it
>> > can easily do.
>> So should I just "move" the A drive? and try to fire it back up?
>
> At this point, yeah. Don't lose track of from where and to where it
> has been moved, though.
>
>> > What are the contents of /etc/mdadm.conf?
>> >
>>
>> mdadm.conf contains this:
>> ARRAY /dev/md0 level=raid10 num-devices=4
>> UUID=3d93e545:c8d5baec:24e6b15c:676eb40f
>
> Yeah, that doesn't help much.
>
>> So, by re-creating, do you mean I should try to run the "mdadm --create"
>> command again the same way I did back when I created the array
>> originally? Will that wipe out my data?
>
> Not in and of itself, no. If you get the drive order wrong
> (different than when it was first created) and resync or write to the array,
> then it will munge the data, but all creating the array does is create the
> superblocks.
>
>
>> # smartctl -l selftest /dev/sda
>> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
>> Home page is http://smartmontools.sourceforge.net/
>>
>> Standard Inquiry (36 bytes) failed [No such device]
>> Retrying with a 64 byte Standard Inquiry
>> Standard Inquiry (64 bytes) failed [No such device]
>> A mandatory SMART command failed: exiting. To continue, add one or more '-
>> T permissive' options.
>
> Well, we kind of knew that. Either the drive is dead, or there is a
> hardware problem in the controller path. Hope for the latter, although a
> drive with a frozen platter can sometimes be resurrected, and if the drive
> electronics are bad but the servo assemblies are OK, replacing the
> electronics is not difficult. Otherwise, it's a goner.
>
>> # smartctl -l selftest /dev/sdb
>> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
>> Home page is http://smartmontools.sourceforge.net/
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART Self-test log structure revision number 1
>> Num Test_Description Status Remaining
>> LifeTime(hours) LBA_of_first_error
>> # 1 Extended offline Completed: read failure 90% 7963
>> 543357
>
> Oooh! That's bad. Really bad. Your earlier post showed the
> superblock is a 0.90 version. The 0.90 superblock is stored near the end of
> the partition. Your drive is suffering a heart attack when it gets near the
> end of the drive. If you can't get your sda drive working again, then I'm
> afraid you've lost some data, maybe all of it. Trying to rebuild a
> partition from scratch when part of it is corrupted is not for the feint of
> heart. If you are lucky, you might be able to dd part of the sdb drive onto
> a healthy one and manually restore the superblock. That, or since the sda
> drive does appear in /dev, you might have some luck copying some of it to a
> new drive.
>
> Beyond that, you are either going to need the advice of someone who
> knows much more about md and Linux than I do, or else the services of a
> professional drive recovery expert. They don't come cheap.
>
>> This is strange, now I am getting info from mdadm --examine that is
>> different than before...
>
> It looks like sda may be responding for the time being. I suggest
> you try to assemble the array, and if successful, copy whatever data you can
> to a backup device. Do not mount the array as read-write until you have
> recovered everything you can. If some data is orphaned, it might be in the
> lost+found directory. If that's successful, I suggest you find out why you
> had two failures and start over. I wouldn't use a 0.90 superblock, though,
> and you definitely want to have monitoring enabled.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
If you have the spare drives/space, I -highly- recommend dd_rescue /
ddrescue copying the suspected-bad drives' contents to clean drives.
http://www.linuxfoundation.org/collaborate/workgroups/linux-raid/raid_recovery
has a script to try out the combinations so you can see where the
least data is lost.
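For example, with GNU ddrescue (the target disk and map-file path below are
placeholders; the map file is what lets you stop, resume, and then go back for
the bad spots):
# ddrescue -f -n /dev/sdb /dev/sdX /root/sdb.map     (fast first pass, skipping the bad areas)
# ddrescue -f -r3 /dev/sdb /dev/sdX /root/sdb.map    (second pass: retry the remaining bad sectors)
Then point mdadm/fsck at the copy on /dev/sdX instead of the dying drive.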
Thread overview: 12+ messages
2009-12-26 1:33 fsck problems. Can't restore raid Rick Bragg
2009-12-26 2:47 ` Rick Bragg
2009-12-26 3:12 ` Rick Bragg
2009-12-26 18:47 ` Leslie Rhorer
2009-12-26 19:44 ` Rick Bragg
2009-12-26 21:14 ` Leslie Rhorer
2009-12-26 21:59 ` Green Mountain Network Info
2009-12-27 1:01 ` Rick Bragg
2009-12-27 6:13 ` Leslie Rhorer
2009-12-27 18:41 ` Rick Bragg
2009-12-27 22:47 ` Leslie Rhorer
2009-12-29 2:46 ` Michael Evans