From: NeilBrown <neilb@suse.de>
To: patrik@dsl.sk
Cc: David Brown <david.brown@hesbynett.no>, linux-raid@vger.kernel.org
Subject: Re: Hot-replace for RAID5
Date: Wed, 16 May 2012 08:47:30 +1000
Message-ID: <20120516084730.0b30fe31@notabene.brown>
In-Reply-To: <CAAOsTSn1+jtViRE-9f7YpMduZ9avBfWMC72vHF_NFC0kjF-hRg@mail.gmail.com>
On Tue, 15 May 2012 21:39:10 +0200 Patrik Horník <patrik@dsl.sk> wrote:
> BTW thank you very much for the fix for layout=preserve. As soon as
> the current reshape finishes, I am going to move on to the other arrays.
>
> Are the regressions in 2.3.4 serious, and if so, to which version should
> I apply the patch? Or, now that you have looked at the code, should
> layout=left-symmetric-6 work in 2.3.2?
The regression isn't dangerous, just inconvenient (--add often doesn't work).
--layout=left-symmetric-6 will work on 2.3.2, provided the current layout
of the array is "left symmetric", which I think is the default, but you should
check.
NeilBrown
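
One way to do the check suggested above is to read the "Layout" line that `mdadm --detail` prints for the array. The following is a minimal Python sketch, assuming that output format; the sample text and the helper name `parse_layout` are illustrative and not taken from this thread.

```python
import re

# Illustrative excerpt in the style of `mdadm --detail` output (assumed values).
SAMPLE_DETAIL = """\
     Raid Level : raid5
   Raid Devices : 6
         Layout : left-symmetric
     Chunk Size : 64K
"""


def parse_layout(detail_text):
    """Return the value of the 'Layout' line, or None if there is none."""
    match = re.search(r"^\s*Layout\s*:\s*(\S+)\s*$", detail_text, re.MULTILINE)
    return match.group(1) if match else None


print(parse_layout(SAMPLE_DETAIL))
```

On a live system the text could come from `subprocess.run(["mdadm", "--detail", "/dev/mdX"], capture_output=True, text=True).stdout`, with `/dev/mdX` standing in for the real array device.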
>
> Regarding reshaping speed: an estimate that assumes much more
> sequential I/O gives much higher speeds. Let's say a 48 MB backup and
> 6 drives with 80 MB/s sequential speed. If you do the reshape like this:
> - Read 8 MB sequentially from each drive in parallel: 0.1 s
> - Then write it to the backup: 48/80 = 0.6 s
> - Calculate Q for something like 48 MB (guessing 0.05 s) and write
> it back to the different drives in parallel in 0.1 s. Because the data
> is in the cache and you are only writing in this phase (?), there is no
> back and forth seeking, and rotational latency applies only a couple of
> times altogether, let's say 0.02 s.
> - Update the superblock and move the head back, two worst-case seeks:
> 0.03 s (I don't know how often you update superblocks?)
>
> you process 8 MB in approx. 0.9 s, so the speed in this scenario should
> be approx. 9 MB/s.
>
> I guess the main real difference when you do it logically in
> stripes is that when you wait for the chunk writes to complete
> (are you waiting for real completion of the writes?), the gap
> between the first and the last drive is often long enough that you have
> to wait one or more rotations before writing the next stripe. If that is
> the case, you need to add approx. 128 * let's say 1.5 * 0.005 s = 0.96 s,
> and so we are down to approx. 4.3 MB/s theoretically.
>
> Patrik
>
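
The estimate in the quoted message above can be reproduced with a few lines of arithmetic. This is a sketch of Patrik's back-of-the-envelope model only, not a measurement and not how md actually schedules I/O. All numbers are the guesses from the message, the function and parameter names are made up for illustration, and progress is counted as the 8 MB advanced on each drive, as the message does.

```python
def reshape_estimate(mb_per_drive=8.0, drives=6, seq_mb_s=80.0,
                     q_calc_s=0.05, latency_s=0.02, metadata_s=0.03,
                     extra_wait_s=0.0):
    """Return (seconds per batch, MB/s) for the sequential model."""
    read_s = mb_per_drive / seq_mb_s             # parallel read from all drives
    backup_s = mb_per_drive * drives / seq_mb_s  # one sequential backup write
    write_s = mb_per_drive / seq_mb_s            # parallel write-back
    total_s = (read_s + backup_s + q_calc_s + write_s
               + latency_s + metadata_s + extra_wait_s)
    return total_s, mb_per_drive / total_s


# Purely sequential case.
total, speed = reshape_estimate()
print(f"{total:.2f} s per batch, {speed:.1f} MB/s")

# With the guessed per-stripe rotational waits: 128 waits of 1.5 rotations
# at 0.005 s each (all three factors are the message's assumptions).
total, speed = reshape_estimate(extra_wait_s=128 * 1.5 * 0.005)
print(f"{total:.2f} s per batch, {speed:.1f} MB/s")
```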
> On Tue, May 15, 2012 at 2:13 PM, NeilBrown <neilb@suse.de> wrote:
> > On Tue, 15 May 2012 13:56:58 +0200 Patrik Horník <patrik@dsl.sk> wrote:
> >
> >> Anyway, increasing it to 5K did not help, and the drives don't seem
> >> to be fully utilized.
> >>
> >> Does the reshape work something like this:
> >> - Read about X = (50M / (N - 1) / stripe size) stripes from the drives
> >> and write them to the backup-file
> >> - Reshape the X stripes one after another, sequentially
> >> - Reshape each stripe by reading chunks from all drives, calculating Q,
> >> writing all chunks back, and doing I/O for the next stripe only after
> >> finishing the previous one?
> >>
> >> So after increasing stripe_cache_size, the cache should hold the
> >> stripes after they are backed up, and so the reshape should not need
> >> to read them from the drives again?
> >>
> >> Can't the slow speed be caused by some synchronization issues? How are
> >> the stripes read for writing to the backup-file? Is it done one by
> >> one, so that I/Os for the next stripe are issued only after the
> >> previous stripe has been read completely? Or are they issued in the
> >> most parallel way possible?
> >
> > There is as much parallelism as I could manage.
> > The backup file is divided into 2 sections.
> > Write to one, then the other, then invalidate the first and write to it etc.
> > So while one half is being written, the data in the other half is being
> > reshaped in the array.
> > Also the stripe reads are scheduled asynchronously and as soon as a stripe is
> > fully available, the Q is calculated and they are scheduled for write.
> >
> > The slowness is due to continually having to seek back a little way to
> > overwrite what has just been read, and also having to update the
> > metadata each time to record where we are up to.
> >
> > NeilBrown
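
The alternation between the two sections of the backup file described above can be pictured as a simple two-slot schedule. The sketch below only illustrates that ordering under simplifying assumptions (fixed-size batches, one batch per section); it is not mdadm's code, and `backup_schedule` is a hypothetical name.

```python
def backup_schedule(num_batches):
    """List (step, section_written, batch_backed_up, batch_reshaped) tuples.

    At each step one section of the backup file receives a new batch while
    the batch saved in the other section during the previous step is
    reshaped in the array. None means that nothing happens in that role.
    """
    steps = []
    for step in range(num_batches + 1):
        backing = step if step < num_batches else None
        reshaping = step - 1 if step >= 1 else None
        section = step % 2 if backing is not None else None
        steps.append((step, section, backing, reshaping))
    return steps


for step, section, backing, reshaping in backup_schedule(3):
    print(f"step {step}: write batch {backing} to section {section}, "
          f"reshape batch {reshaping}")
```

A section is reused only two steps later, after the batch it protected has been reshaped, which corresponds to the "invalidate the first and write to it" step in the message.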
> >
> >
> >>
> >> Patrik
> >>
> >>
> >> On Tue, May 15, 2012 at 1:28 PM, NeilBrown <neilb@suse.de> wrote:
> >> > On Tue, 15 May 2012 13:16:42 +0200 Patrik Horník <patrik@dsl.sk> wrote:
> >> >
> >> >> Can I increase it during reshape by echo N >
> >> >> /sys/block/mdX/md/stripe_cache_size?
> >> >
> >> > Yes.
> >> >
> >> >
> >> >>
> >> >> How is the size determined? I have only 1027 while having 8 GB system memory...
> >> >
> >> > Not very well.
> >> >
> >> > It is set to 256, or the minimum size needed to allow the reshape to proceed
> >> > (which means about 4 chunks worth). I should probably add some auto-sizing
> >> > but that sort of stuff is hard :-(
> >> >
> >> > NeilBrown
> >> >
> >
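
For a sense of scale when tuning stripe_cache_size, the kernel's md documentation gives the memory cost as page size times number of disks times the number of cache entries. The sketch below applies that formula. The 6-drive count is borrowed from the estimate earlier in the thread, "5K" is read as 5120, and the 4096-byte page size is an assumption.

```python
def stripe_cache_bytes(stripe_cache_size, nr_disks, page_size=4096):
    """Memory used by the stripe cache: page_size * nr_disks * entries."""
    return page_size * nr_disks * stripe_cache_size


for entries in (256, 1027, 5120):
    mib = stripe_cache_bytes(entries, nr_disks=6) / (1024 * 1024)
    print(f"stripe_cache_size={entries}: {mib:.1f} MiB")
```

Under these assumptions even the raised value stays small next to 8 GB of system memory.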