linux-raid.vger.kernel.org archive mirror
From: Bill Davidsen <davidsen@tmr.com>
To: NeilBrown <neilb@suse.de>
Cc: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Removing a failing drive from multiple arrays
Date: Tue, 24 Apr 2012 20:07:18 -0400
Message-ID: <4F974036.60000@tmr.com>
In-Reply-To: <20120420075212.4574111a@notabene.brown>

NeilBrown wrote:
> On Thu, 19 Apr 2012 14:54:30 -0400 Bill Davidsen <davidsen@tmr.com> wrote:
>
>> I have a failing drive, and partitions are in multiple arrays. I'm
>> looking for the least painful and most reliable way to replace it. It's
>> internal, I have a twin in an external box, and can create all the parts
>> now and then swap the drive physically. The layout is complex, here's
>> what blkdevtra tells me about this device, the full trace is attached.
>>
>> Block device sdd, logical device 8:48
>> Model Family:     Seagate Barracuda 7200.10
>> Device Model:     ST3750640AS
>> Serial Number:    5QD330ZW
>>       Device size   732.575 GB
>>              sdd1     0.201 GB
>>              sdd2     3.912 GB
>>              sdd3    24.419 GB
>>              sdd4     0.000 GB
>>              sdd5    48.838 GB [md123] /mnt/workspace
>>              sdd6     0.498 GB
>>              sdd7    19.543 GB [md125]
>>              sdd8    29.303 GB [md126]
>>              sdd9   605.859 GB [md127] /exports/common
>>     Unpartitioned     0.003 GB
>>
>> I think what I want to do is to partition the new drive, then one array
>> at a time fail and remove the partition on the bad drive, and add a
>> partition on the new good drive. Then repeat for each array until all
>> are complete and on a new drive. Then I should be able to power off,
>> remove the failed drive, put the good drive in the case, and the arrays
>> should reassemble by UUID.
>>
>> Does that sound right? Is there an easier way?
>>
>
> I would add the new partition before failing the old, but that isn't a big
> issue.
>
> If you were running a really new kernel, used 1.x metadata, and were happy to
> try out code that hasn't had a lot of real-life testing you could (after
> adding the new partition) do
>     echo want_replacement > /sys/block/md123/md/dev-sdd5/state
> (for example).
>
> Then it would build the spare before failing the original.
> You need linux 3.3 for this to have any chance of working.
>
Well, it does work: it has on the first bunch of partitions, and it is
now doing the big ~TB one. And because I'm nervous about power cycling
sick disks (been there, done that) I am doing the whole rebuild onto
drives attached by USB and eSATA connections. On the last one now.
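
For reference, the sequence Neil describes might be sketched roughly as
below. This is only a sketch under stated assumptions: sde5 is a
hypothetical name for the matching partition on the new drive,
SYSFS_ROOT and mark_for_replacement are names introduced here so the
step can be dry-run against a scratch directory, and on a real system it
needs root, Linux 3.3 or later, and 1.x metadata.

```shell
#!/bin/sh
# Sketch: mark one array member for hot replacement via sysfs.
# SYSFS_ROOT is an assumption for dry runs; on a real system it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# mark_for_replacement MD MEMBER
#   e.g. mark_for_replacement md123 sdd5
mark_for_replacement() {
    md="$1"
    member="$2"
    state="$SYSFS_ROOT/block/$md/md/dev-$member/state"
    if [ ! -e "$state" ]; then
        echo "no such member: $state" >&2
        return 1
    fi
    # Writing want_replacement asks md to rebuild onto an available
    # spare before the original member is failed.
    echo want_replacement > "$state"
}

# On a live system (as root), the spare is added first:
#   mdadm /dev/md123 --add /dev/sde5   # sde5: hypothetical new partition
#   mark_for_replacement md123 sdd5
```

The same two steps would then be repeated for each array that has a
member on the failing drive.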

I did them all live and running, although I did "swapoff" the one for
swap. It isn't really needed, and it just seems like a bad thing to be
diddling while the system is using it.

Good news: it has worked perfectly. Bad news: it doesn't do what I
thought it did. For RAID-[56] it does what I expected and pulls data off
the partition marked for replacement, but with the RAID-10 f2 layout the
"take the best copy" logic seems to take over and data comes from all
active drives. I would have expected it to come from the failing drive
first and be taken elsewhere only if the failing drive couldn't provide
the data. I have seen cases where migration failed due to a bad sector
on another drive, so reading from the other drives is unexpected. I
don't claim it's wrong, just "not what I expected."
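
One rough way to see which drives a rebuild is actually reading from is
to sample the sectors-read counter (third field of each block device's
sysfs stat file) twice and compare. The sketch below assumes that
approach; sectors_read, read_delta and SYSFS_ROOT are names introduced
here for illustration, and any device names other than sdd are
hypothetical.

```shell
#!/bin/sh
# Sketch: measure how many sectors are read from a device over an interval.
# SYSFS_ROOT is an assumption for dry runs; on a real system it is /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# sectors_read DEV: print the cumulative sectors-read counter for DEV
# (third field of the block device's stat file).
sectors_read() {
    awk '{ print $3 }' "$SYSFS_ROOT/block/$1/stat"
}

# read_delta DEV SECONDS: print sectors read from DEV during the interval.
read_delta() {
    before=$(sectors_read "$1") || return 1
    sleep "$2"
    after=$(sectors_read "$1") || return 1
    echo $((after - before))
}

# Example (sda..sdc are hypothetical names for the other members):
#   for d in sda sdb sdc sdd; do echo "$d $(read_delta "$d" 10)"; done
```

If only the member marked for replacement shows a large delta, the copy
is coming from that drive; large deltas on every member suggest the data
is being read from all of them.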

I think in a perfect world (where you have infinite time to diddle 
stuff), it would be useful to have three options:
  - favor the failing drive, recover what you must
  - reconstruct all data possible, don't use the failing drive
  - build the new copy the fastest way possible, taking the data from
    wherever it's available.

In any case this feature worked just fine, and I put my thoughts on the
method out for comment. By morning the last rebuild will be done, and I
can actually pull the bad drives by serial number. I hope the UUID means
the new drive can go anywhere; then I can add another eSATA card and
Blu-Ray burner, and be up solid.
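
Matching a serial number to a device node before pulling a drive can be
sketched as below, assuming udev has populated /dev/disk/by-id with
whole-disk links whose names end in the serial number (for example
ata-ST3750640AS_5QD330ZW). BY_ID_DIR and device_for_serial are names
introduced here for illustration.

```shell
#!/bin/sh
# Sketch: map a drive serial number to its device node via by-id links.
# BY_ID_DIR is an assumption so the lookup can be tried on a scratch dir.
BY_ID_DIR="${BY_ID_DIR:-/dev/disk/by-id}"

# device_for_serial SERIAL: print the device node(s) whose by-id link
# name ends in SERIAL. Partition links end in -partN, so they are skipped.
device_for_serial() {
    for link in "$BY_ID_DIR"/*"$1"; do
        [ -L "$link" ] || continue   # unmatched glob or non-link: skip
        readlink -f "$link"
    done | sort -u
}

# Example:
#   device_for_serial 5QD330ZW
# The array UUID itself can be checked with:
#   mdadm --detail /dev/md127 | grep UUID
```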


-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
