From: NeilBrown <neilb@suse.de>
To: John Valarti <mdadmuser@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: Server down-failed RAID5-asking for some assistance
Date: Fri, 22 Apr 2011 12:57:34 +1000
Message-ID: <20110422125734.1a68a736@notabene.brown>
In-Reply-To: <BANLkTimQXUvU68op8C-W4qPUQzBRzqgP+A@mail.gmail.com>
On Thu, 21 Apr 2011 20:32:57 -0600 John Valarti <mdadmuser@gmail.com> wrote:
> On Thu, Apr 21, 2011 at 1:59 PM, David Brown <david.brown@hesbynett.no> wrote:
> .
> > My first thought would be to get /all/ the disks, not just the "failed"
> > ones, out of the machine. You want to make full images of them (with
> > ddrescue or something similar) to files on another disk, and then work with
> > those images. ..
> > Once you've got some (hopefully most) of your data recovered from the
> > images, buy four /new/ disks to put in the machine, and work on your
> > restore. You don't want to reuse the failing disks, and probably the other
> > two equally old and worn disks will be high risk too.
>
> OK, I think I understand.
> Does that mean I need to buy 8 disks, all the same size or bigger?
> The originals are 250GB SATA so that should be OK, I guess.
>
> I read some more and found out I should run mdadm --examine.
>
> Should I not be able to just add the one disk partition sdc2 back to the RAID?
Possibly.
It looks like sdb2 failed in October 2009 !!!! and nobody noticed. So your
array has been running degraded since then.
If you run

  mdadm -A /dev/md1 --force /dev/sd[acd]2

then you will have your array back, though there could be a small amount of
data corruption if the array was in the middle of writing when the system
crashed/died/lost-power/whatever happened.
This will give you access to your data.
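Once it is assembled you can check its state and mount it read-only while you
copy data off. A rough sketch (the mount point and filesystem are just
assumptions - adjust to your setup):

  cat /proc/mdstat                # should show md1 active with 3 of 4 devices
  mdadm --detail /dev/md1         # more detail, including the degraded state
  mount -o ro /dev/md1 /mnt/md1   # read-only mount while you take a copy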
How much you trust your drives to continue to give access to your data is up
to you. But you would be wise to at least buy a 1TB drive to copy all the
data onto before you put too much stress on your old drives.
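Something like this would do for the copy, assuming the array is mounted
read-only at /mnt/md1 and the new 1TB drive at /mnt/backup (the paths are
only placeholders):

  rsync -aH --progress /mnt/md1/ /mnt/backup/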
Once you have a safe copy, you could run

  mdadm /dev/md1 --add /dev/sdb2

This will add sdb2 to the array and recover the data for sdb2 from the data
and parity on the other drives. If this works - great. However, there is a
reasonable chance you will hit a read error, in which case the recovery will
abort and you will still have your data on the degraded array.
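You can watch the recovery while it runs, e.g.:

  cat /proc/mdstat          # shows rebuild progress and an estimated finish time
  mdadm --detail /dev/md1   # sdb2 shows as rebuilding until it completes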
You could possibly run a bad-blocks test on each drive (which will be
destructive - but you have a backup on the 1TB drive) and then decide whether
you want to throw them out or keep using them.
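For example (a destructive write-mode test, so only after the array is
stopped and your backup is safe - sdb here is just an example device):

  badblocks -wsv /dev/sdb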
Whatever you do, once you have a working array again that you feel happy to
trust, make sure a 'check' run happens regularly. Some distros provide a
cron job to do this for you. It involves simply

  echo check > /sys/block/md1/md/sync_action

This will read every block on every device to make sure there are no sleeping
bad blocks. Once a month is probably a reasonable frequency to run it.
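If your distro doesn't provide one, a minimal monthly cron entry would look
something like this (the array name md1 and the schedule are just examples):

  # /etc/cron.d/mdcheck - RAID consistency check at 3am on the 1st of each month
  0 3 1 * * root echo check > /sys/block/md1/md/sync_action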
Also run "mdadm --monitor" configured to send you email if there is a drive
failure, and run "mdadm --monitor --oneshot" from a crontab every day so
that if you have a degraded array it will nag you about it every day.
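A minimal setup for that is a MAILADDR line in mdadm.conf plus a daily cron
entry - the address and paths below are only placeholders:

  # /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on some distros)
  MAILADDR you@example.com

  # continuous monitoring (many distros run this as a service already)
  mdadm --monitor --scan --daemonise

  # daily reminder about any degraded array, e.g. from /etc/cron.daily/mdadm-oneshot
  mdadm --monitor --scan --oneshot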
Good luck,
NeilBrown
>
>
> Here is the result of --examine
>
> /dev/sda2:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : ddf4d448:36afa319:f0917855:03f8bbe8
> Creation Time : Mon May 15 16:38:05 2006
> Raid Level : raid5
> Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
> Array Size : 734925312 (700.88 GiB 752.56 GB)
> Raid Devices : 4
> Total Devices : 3
> Preferred Minor : 1
>
> Update Time : Mon Apr 18 07:48:54 2011
> State : clean
> Active Devices : 3
> Working Devices : 3
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 5674ce60 - correct
> Events : 28580020
>
> Layout : left-symmetric
> Chunk Size : 256K
>
> Number Major Minor RaidDevice State
> this 1 8 18 1 active sync /dev/sdb2
>
> 0 0 8 2 0 active sync /dev/sda2
> 1 1 8 18 1 active sync /dev/sdb2
> 2 2 8 34 2 active sync /dev/sdc2
> 3 3 0 0 3 faulty removed
> /dev/sdb2:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : ddf4d448:36afa319:f0917855:03f8bbe8
> Creation Time : Mon May 15 16:38:05 2006
> Raid Level : raid5
> Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
> Array Size : 734925312 (700.88 GiB 752.56 GB)
> Raid Devices : 4
> Total Devices : 4
> Preferred Minor : 1
>
> Update Time : Sun Oct 18 10:04:06 2009
> State : active
> Active Devices : 4
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 5171dcb2 - correct
> Events : 20333614
>
> Layout : left-symmetric
> Chunk Size : 256K
>
> Number Major Minor RaidDevice State
> this 3 8 50 3 active sync /dev/sdd2
>
> 0 0 8 2 0 active sync /dev/sda2
> 1 1 8 18 1 active sync /dev/sdb2
> 2 2 8 34 2 active sync /dev/sdc2
> 3 3 8 50 3 active sync /dev/sdd2
> /dev/sdc2:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : ddf4d448:36afa319:f0917855:03f8bbe8
> Creation Time : Mon May 15 16:38:05 2006
> Raid Level : raid5
> Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
> Array Size : 734925312 (700.88 GiB 752.56 GB)
> Raid Devices : 4
> Total Devices : 3
> Preferred Minor : 1
>
> Update Time : Mon Apr 18 07:48:51 2011
> State : clean
> Active Devices : 3
> Working Devices : 3
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 5674ce6b - correct
> Events : 28580018
>
> Layout : left-symmetric
> Chunk Size : 256K
>
> Number Major Minor RaidDevice State
> this 2 8 34 2 active sync /dev/sdc2
>
> 0 0 8 2 0 active sync /dev/sda2
> 1 1 8 18 1 active sync /dev/sdb2
> 2 2 8 34 2 active sync /dev/sdc2
> 3 3 0 0 3 faulty removed
> /dev/sdd2:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : ddf4d448:36afa319:f0917855:03f8bbe8
> Creation Time : Mon May 15 16:38:05 2006
> Raid Level : raid5
> Used Dev Size : 244975104 (233.63 GiB 250.85 GB)
> Array Size : 734925312 (700.88 GiB 752.56 GB)
> Raid Devices : 4
> Total Devices : 3
> Preferred Minor : 1
>
> Update Time : Mon Apr 18 07:48:54 2011
> State : clean
> Active Devices : 3
> Working Devices : 3
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 5674ce4e - correct
> Events : 28580020
>
> Layout : left-symmetric
> Chunk Size : 256K
>
> Number Major Minor RaidDevice State
> this 0 8 2 0 active sync /dev/sda2
>
> 0 0 8 2 0 active sync /dev/sda2
> 1 1 8 18 1 active sync /dev/sdb2
> 2 2 8 34 2 active sync /dev/sdc2
> 3 3 0 0 3 faulty removed