linux-raid.vger.kernel.org archive mirror
From: Asdo <asdo@shiftmail.org>
To: NeilBrown <neilb@suse.de>
Cc: Oliver Martin <oliver@volatilevoid.net>,
	patrik@dsl.sk, David Brown <david.brown@hesbynett.no>,
	linux-raid@vger.kernel.org
Subject: Re: Hot-replace for RAID5
Date: Mon, 21 May 2012 11:54:43 +0200
Message-ID: <4FBA10E3.7080202@shiftmail.org>
In-Reply-To: <20120518134555.3a9ce08b@notabene.brown>

On 05/18/12 05:45, NeilBrown wrote:
> On Thu, 17 May 2012 01:34:15 +0200 Oliver Martin<oliver@volatilevoid.net>
> wrote:
>
>> Hi Neil,
>>
>> On 11.05.2012 02:50, NeilBrown wrote:
>>> Doing an in-place reshape with the new 3.3 code should work, though with a
>>> softer "should" than above.  We will only know that it is "stable" when enough
>>> people (such as yourself) try it and report success.  If anything does go
>>> wrong I would of course help you to put the array back together but I can
>>> never guarantee no data loss.  You wouldn't be the first to test the code on
>>> live data, but you would be the second that I have heard of.
>> I guess I'll be taking 2nd place then. I just used it on three live
>> raid6 arrays, and it worked perfectly.
> 3 arrays - so you are 2nd, 3rd, and 4th :-)

Good to know that when all is good, hot-replace works.

I wonder if all "error paths" were considered and implemented (and maybe 
even tested, but we users could help with testing if we understand the 
intended behaviour), i.e.

What happens when the disk being hot-replaced shows read errors in 
locations not yet recorded in the bad-block list? Does it
- immediately fall back to fail+rebuild, or
- first try a recompute + rewrite of the sector, then fall back to 
fail+rebuild if the rewrite fails, or
- first try a recompute + rewrite of the sector, then add the block to 
the bad-block list if the rewrite fails, and fall back to fail+rebuild 
only if the list is out of space
?
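
The three candidate behaviours above can be written down as a toy decision
model. This is only a sketch to make the question precise; it is not md code,
and every name in it (Policy, handle_read_error, the outcome strings) is
made up for illustration.

```python
from enum import Enum


class Policy(Enum):
    """The three hypothetical behaviours listed in the question."""
    FAIL_IMMEDIATELY = 1        # option 1
    REWRITE_THEN_FAIL = 2       # option 2
    REWRITE_THEN_BADBLOCK = 3   # option 3


def handle_read_error(policy, rewrite_ok, badblock_list_full):
    """Outcome for a new read error on the disk being hot-replaced.

    rewrite_ok: whether recompute + rewrite of the sector succeeded.
    badblock_list_full: whether the bad-block list has no room left.
    """
    if policy is Policy.FAIL_IMMEDIATELY:
        return "fail+rebuild"
    # Options 2 and 3 both try recompute + rewrite first.
    if rewrite_ok:
        return "continue"
    if policy is Policy.REWRITE_THEN_FAIL:
        return "fail+rebuild"
    # Option 3: record the block unless the list is out of space.
    if badblock_list_full:
        return "fail+rebuild"
    return "record-bad-block"
```

The question is simply which of these three rows (if any) matches the
intended behaviour.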

What happens if the destination of the hot-replace has *one* write 
error? And *lots* of write errors?

What happens if a hot-replace hits a sector for which both the disk 
being replaced and another one have an entry in the bad-block list, so 
that there is not enough parity information to recompute it? Does it 
proceed anyway, marking the corresponding sector in the bad-block list of 
the destination device (= invalid strip), or does it fail the 
hot-replace, or what?
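
The "not enough parity information" case can be stated as a small check: a
sector can only be recomputed if the number of devices that cannot supply it
does not exceed the number of parity blocks per stripe (1 for RAID5, 2 for
RAID6). The sketch below is an illustration of that arithmetic only; the
function name and device names are hypothetical and nothing here reflects how
md actually tracks bad blocks.

```python
def can_recompute(unreadable_devices, parity_count):
    """True if the sector can be rebuilt from the remaining devices.

    unreadable_devices: devices whose copy of this sector is unavailable
                        (e.g. listed in their bad-block lists).
    parity_count: parity blocks per stripe (RAID5: 1, RAID6: 2).
    """
    return len(set(unreadable_devices)) <= parity_count


# The disk being replaced and one other disk both have a bad block here.
bad_here = ["sda", "sdc"]
raid5_ok = can_recompute(bad_here, parity_count=1)  # False: two unknowns, one parity
raid6_ok = can_recompute(bad_here, parity_count=2)  # True: two unknowns, two parities
```

So on RAID5 the scenario above leaves nothing valid to write to the
destination for that sector, which is why the question matters.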

(this is actually more about bad-block lists)
What happens if a *different* disk shows bad sectors due to concurrent 
reads (simultaneous with, but not caused by, the hot-replace)? Does it 
first recompute and rewrite, then add the sector to the bad-block list 
if the rewrite fails, and get failed if the list is full? Or can another 
hot-replace be started while one is already running?

Thank you

