From: Bill Davidsen <davidsen@tmr.com>
To: Alex <mysqlstudent@gmail.com>,
	Linux RAID <linux-raid@vger.kernel.org>,
	Neil Brown <neilb@suse.de>
Subject: Re: Need to remove failed disk from RAID5 array
Date: Wed, 18 Jul 2012 16:26:50 -0400
Message-ID: <50071C0A.8080307@tmr.com>
In-Reply-To: <CAB1R3shxBWebm13ie5gR0h++-GBTyfZrranHr8tjGkbPipV32w@mail.gmail.com>

Alex wrote:
> Hi,
>
> I have a degraded RAID5 array on an fc15 box due to sda failing:
>
> Personalities : [raid6] [raid5] [raid4]
> md1 : active raid5 sda3[5](F) sdd2[4] sdc2[2] sdb2[1]
>        2890747392 blocks super 1.1 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
>        bitmap: 8/8 pages [32KB], 65536KB chunk
>
> md0 : active raid5 sda2[5] sdd1[4] sdc1[2] sdb1[1]
>        30715392 blocks super 1.1 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
>
> There's a ton of messages like these:
>
> end_request: I/O error, dev sda, sector 1668467332
> md/raid:md1: read error NOT corrected!! (sector 1646961280 on sda3).
> md/raid:md1: Disk failure on sda3, disabling device.
> md/raid:md1: Operation continuing on 3 devices.
> md/raid:md1: read error not correctable (sector 1646961288 on sda3).
>
> What is the proper procedure to remove the disk from the array,
> shutdown the server, and reboot with a new sda?
>
> # mdadm --version
> mdadm - v3.2.5 - 18th May 2012
>
> # mdadm -Es
> ARRAY /dev/md/0 metadata=1.1 UUID=4b5a3704:c681f663:99e744e4:254ebe3e
> name=pixie.example.com:0
> ARRAY /dev/md/1 metadata=1.1 UUID=d5032866:15381f0b:e725e8ae:26f9a971
> name=pixie.example.com:1
>
> # mdadm --detail /dev/md1
> /dev/md1:
>          Version : 1.1
>    Creation Time : Sun Aug  7 12:52:18 2011
>       Raid Level : raid5
>       Array Size : 2890747392 (2756.83 GiB 2960.13 GB)
>    Used Dev Size : 963582464 (918.94 GiB 986.71 GB)
>     Raid Devices : 4
>    Total Devices : 4
>      Persistence : Superblock is persistent
>
>    Intent Bitmap : Internal
>
>      Update Time : Mon Jul 16 19:14:11 2012
>            State : active, degraded
>   Active Devices : 3
> Working Devices : 3
>   Failed Devices : 1
>    Spare Devices : 0
>
>           Layout : left-symmetric
>       Chunk Size : 512K
>
>             Name : pixie.example.com:1  (local to host pixie.example.com)
>             UUID : d5032866:15381f0b:e725e8ae:26f9a971
>           Events : 162567
>
>      Number   Major   Minor   RaidDevice State
>         0       0        0        0      removed
>         1       8       18        1      active sync   /dev/sdb2
>         2       8       34        2      active sync   /dev/sdc2
>         4       8       50        3      active sync   /dev/sdd2
>
>         5       8        3        -      faulty spare   /dev/sda3
>
> I'd appreciate a pointer to any existing documentation, or some
> general guidance on the proper procedure.
>

Once the drive has failed, about all you can do is add another drive as a spare,
wait until the rebuild completes, then remove the old drive from the array. If
you have a new kernel, 3.3 or newer, you might have been able to use the
undocumented but amazing "want_replacement" action to speed up your rebuild, but
once the drive is bad enough to get kicked out of the array I think it's too late.

Neil might have a thought on this; the option makes the rebuild vastly faster
and safer.
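
To make the two paths above concrete, here is a rough dry-run sketch, not a
tested procedure. It only prints the commands so they can be reviewed first.
/dev/sde3 is a made-up name for a partition on the new disk, and the sysfs
path is my assumption of where the "want_replacement" state is written for
md1; check both against your own system before running anything.

```shell
#!/bin/sh
# Dry-run sketch: each function only PRINTS the commands, it runs nothing.
# /dev/sde3 is a hypothetical partition on the new disk, not from the thread.

# Member already failed: add a spare, wait for recovery, drop the old one.
plan_rebuild() {
    array=$1 failed=$2 spare=$3
    echo "mdadm $array --add $spare"
    echo "mdadm --wait $array"
    echo "mdadm $array --remove $failed"
}

# Member still active (kernel 3.3+): add a spare, then ask md to replace it.
# The sysfs path is an assumption built from the array and member names.
plan_hot_replace() {
    array=$1 member=$2 spare=$3
    echo "mdadm $array --add $spare"
    echo "echo want_replacement > /sys/block/${array##*/}/md/dev-${member##*/}/state"
}

plan_rebuild /dev/md1 /dev/sda3 /dev/sde3
```

Note that sda2 is still an active member of md0 in the mdstat output above, so
pulling the whole disk would affect that array as well.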


-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


