linux-raid.vger.kernel.org archive mirror
From: Neil Brown <neilb@suse.de>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Maybe crazy idea for reshaping: Instant/On-Demand reshaping
Date: Sat, 20 Feb 2010 19:40:20 +1100
Message-ID: <20100220194020.5d1689e6@notabene.brown>
In-Reply-To: <87zl34xyqt.fsf@frosties.localdomain>

On Sat, 20 Feb 2010 06:27:54 +0100
Goswin von Brederlow <goswin-v-b@web.de> wrote:

> Hi,
> 
> last night I started a reshape of a raid5 array. Now I got up and it is
> still over 15 hours till the reshape is done. That got me thinking.
> Since I haven't had my coffee yet let me use you as a sounding board.
> 
> 
> 1) Wouldn't it be great if during reshape the size of the raid would
> gradually increase?
> 
> As the reshape progresses and data moves from X to X+1 disks it creates
> free space. So why can't one increase the device size gradually to
> include that space?
> 
> Unfortunately the space it frees is needed to reshape the later
> stripes. As it reshapes a window of free space moves from the start of
> the disks to the end of the disks. For the device size to grow the place
> where the new stripe would land after reshaping needs to be free and
> that means the window of free space must have moved far enough to
> include that place. That means X/(X+1) of the data has already been
> copied. Only while copying the last 1/(X+1) of data could the size
> increase. That would still be a plus.
> 
> Note: After all the data has been copied, when the window of free space
> has reached the end of the disks, there is still work to do. The window
> of free space contains random data and needs to be resynced or zeroed so
> the parity of the future stripes is correct. For the size to increase
> gradually that resync/zeroing would have to be interleaved with copying
> the remaining data.
> 
> 
> 2) With the existing reshape a gradual size increase is impossible
> until late in the reshape. Could we do better?
> 
> The problem with increasing the size before the reshape is done is that
> there is existing data where our new free space is going to be. Maybe we
> could move the data away as needed. Whenever something writes to a new
> stripe that still contains old data we move the old stripe to its new
> place. That would require information where old data is. Something like
> a bitmap. We might not get stripe granularity but that is ok.
> 
> It gets a bit more complex. Moving a chunk of old data means writing data
> to new stripes. They can contain old data as well requiring a recursion.
> But old data always gets copied to lower blocks. Assuming we finished
> some reshaping at the start of the disks (at least some critical section
> must be done) then eventually we hit a region that was already reshaped.
> As the reshape progresses it will take fewer and fewer recursions.
> 
> Note: reads from stripes with old data would return all 0.
> 
> Note 2: writing to a stripe can write to the old stripe if that wasn't
> reshaped yet.
> 
> Note 3: there would still be a normal reshape process that goes from
> start to end on each disk, it would just run in parallel with the
> on-demand copying.
> 
> Writing to a new stripe that hasn't yet been reshaped will be horribly
> slow at first and gradually become faster as the reshape progresses.
> Also as more new stripes get written there will be more and more chunks
> in the middle of the disks that have been reshaped so the recursion will
> not have to go to the fully reshaped region at the start of the disk
> every time.
> 
> 
> So what do you think? Is this lack of caffeine speaking?

I think your cost-benefit trade-off is way off balance on the cost side.

This sort of complexity really belongs in a filesystem, not in a block
device.

NeilBrown
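
Note (not part of the original message): the arithmetic in points 1 and 2
of the quoted proposal can be sketched in a few lines of Python. This is a
rough model under stated assumptions, not how md implements reshaping: X is
taken as the number of data-bearing disks before the reshape (parity is
ignored), exactly one disk is added, positions are counted in whole
per-disk stripes, and each old stripe is treated as landing in a single new
stripe. All function and parameter names are hypothetical.

```python
def growth_boundary(stripes_per_disk, old_data_disks):
    """Per-disk stripe index at which the fully reshaped data ends.

    Assumption: one disk is added, so the existing data shrinks to
    old/(old + 1) of the per-disk stripes in the new layout.
    """
    new_data_disks = old_data_disks + 1
    # Ceiling division: the last reshaped stripe may be only partly filled.
    return -(-stripes_per_disk * old_data_disks // new_data_disks)


def exposable_stripes(stripes_per_disk, old_data_disks, read_pos):
    """Stripes of new capacity that are free once the reshape has copied
    old stripes [0, read_pos).

    Stripes at or above the boundary are never written by the reshape, so
    they become free as soon as the old data in them has been read. They
    would still need a resync or zeroing before use, as the quoted
    message points out.
    """
    boundary = growth_boundary(stripes_per_disk, old_data_disks)
    return max(0, read_pos - boundary)


def relocation_hops(stripe, old_data_disks, reshaped_upto):
    """Number of recursive moves needed before a write to `stripe` can go
    ahead under the on-demand scheme of point 2.

    Each hop moves old data to a lower stripe (scaled by old/(old + 1))
    until the destination falls inside the already reshaped region
    [0, reshaped_upto).
    """
    if reshaped_upto < 1:
        # The critical section at the start must be done, or we never stop.
        raise ValueError("at least one stripe must already be reshaped")
    new_data_disks = old_data_disks + 1
    hops = 0
    while stripe >= reshaped_upto:
        stripe = stripe * old_data_disks // new_data_disks
        hops += 1
    return hops
```

With 1000 stripes per disk and X = 3, growth_boundary() gives 750, so
nothing can be exposed until 3/4 = X/(X+1) of the data has been copied.
A write to stripe 900 needs 8 hops when only the first 100 stripes have
been reshaped, and 3 hops once the first 500 have been.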

