* Software RAID 6 initial sync very slow
@ 2008-06-01 11:26 thomas62186218
2008-06-01 20:37 ` Richard Scobie
2008-06-02 12:35 ` Bill Davidsen
0 siblings, 2 replies; 7+ messages in thread
From: thomas62186218 @ 2008-06-01 11:26 UTC
To: linux-raid
Hi all,
I am debating between hardware and software RAID for a high-performance
Linux storage server. One concern from my initial mdadm tests is sync
speed. My setup is as follows:
Linux Fedora 7 x64
12 x 1TB Hitachi SATAII drives connected via LSI SAS3801E PCIe HBA
Dual 2.5GHz quad-core Xeon
8GB RAM
My first test was to create a RAID 6 md device, 128K chunk, with all 12
drives. The sync is currently underway at a very slow 30MB/sec, with
mdadm estimating this as a 9hr process. With an Adaptec 5805 RAID card
tested, I performed the RAID 6 initial sync with the same 12 drives in
just under 5 hours. Meanwhile, our total CPU usage during mdadm syncing
is at 5%, so there's plenty of CPU not being used.
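As a rough cross-check of these figures, here is a back-of-the-envelope sketch. It assumes the rate md reports is per member drive, so the sync time is about one drive's capacity divided by that rate, and it uses decimal units (1TB = 1,000,000MB). The function names are made up for illustration.

```shell
#!/bin/sh
# sync_hours DRIVE_GB RATE_MB_S: hours to pass over one member drive at a given rate.
sync_hours() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / rate / 3600 }'
}

# sync_rate DRIVE_GB HOURS: average MB/sec implied by finishing one drive in HOURS.
sync_rate() {
    awk -v gb="$1" -v hours="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / (hours * 3600) }'
}

sync_hours 1000 30   # md at 30MB/sec on 1TB drives
sync_rate 1000 5     # rate implied by a 5-hour hardware RAID sync
```

Under those assumptions, 30MB/sec works out to roughly 9.3 hours, in line with mdadm's estimate, and a 5-hour sync implies a little over 55MB/sec per drive.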
Why is the sync so slow? This concerns me not only because initial sync
time is important, but because this may be indicative of slow sync
times in the event of a drive failure.
Thanks for your input,
Thomas
* Re: Software RAID 6 initial sync very slow
From: Richard Scobie @ 2008-06-01 20:37 UTC
To: thomas62186218; +Cc: linux-raid
thomas62186218@aol.com wrote:
> My first test was to create a RAID 6 md device, 128K chunk, with all 12
> drives. The sync is currently underway at a very slow 30MB/sec, with
> mdadm estimating this as a 9hr process. With an Adaptec 5805 RAID card
> tested, I performed the RAID 6 initial sync with the same 12 drives in
> just under 5 hours. Meanwhile, our total CPU usage during mdadm syncing
> is at 5%, so there's plenty of CPU not being used.
I have a very similar system, except with a single quad-core CPU and
16 x 750GB drives. It syncs (md RAID6) in 3 h 40 min, which is roughly
proportional to the time you see with your hardware RAID sync, so you
should be able to match it with md RAID.
I cannot remember whether stripe_cache_size affected the sync time -
currently I have this set to 16384.
Regards,
Richard
* Re: Software RAID 6 initial sync very slow
From: Bill Davidsen @ 2008-06-02 12:35 UTC
To: thomas62186218; +Cc: linux-raid
thomas62186218@aol.com wrote:
> Hi all,
>
> I am debating between hardware and software RAID for a high-performance
> Linux storage server. One concern from my initial mdadm tests is sync
> speed. My setup is as follows:
>
> Linux Fedora 7 x64
> 12 x 1TB Hitachi SATAII drives connected via LSI SAS3801E PCIe HBA
> Dual 2.5GHz quad-core Xeon
> 8GB RAM
>
> My first test was to create a RAID 6 md device, 128K chunk, with all
> 12 drives. The sync is currently underway at a very slow 30MB/sec,
> with mdadm estimating this as a 9hr process. With an Adaptec 5805 RAID
> card tested, I performed the RAID 6 initial sync with the same 12
> drives in just under 5 hours. Meanwhile, our total CPU usage during
> mdadm syncing is at 5%, so there's plenty of CPU not being used.
>
> Why is the sync so slow? This concerns me not only because initial
> sync time is important, but because this may be indicative of slow
> sync times in the event of a drive failure.
Two things will limit this: the chunk size and the maximum speed set in
/sys/block/mdNN/md/sync_speed_max. What you are seeing does sound slow.
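A small sketch for inspecting those limits follows. The helper name and the default array name md0 are assumptions for illustration; the sysfs files sync_speed_min, sync_speed_max and sync_speed hold values in KB/sec.

```shell
#!/bin/sh
# show_sync_limits [MD_DIR]: print md's resync throttle settings (values are in KB/sec).
# MD_DIR defaults to /sys/block/md0/md; "md0" is an assumed array name.
show_sync_limits() {
    md_dir="${1:-/sys/block/md0/md}"
    for name in sync_speed_min sync_speed_max sync_speed; do
        if [ -r "$md_dir/$name" ]; then
            printf '%s: %s\n' "$name" "$(cat "$md_dir/$name")"
        else
            printf '%s: not available\n' "$name"
        fi
    done
}

show_sync_limits
```

If sync_speed sits well below sync_speed_max while the array is otherwise idle, the throttle is probably not what is holding the sync back.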
--
Bill Davidsen <davidsen@tmr.com>
"Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismark
* Re: Software RAID 6 initial sync very slow
From: thomas62186218 @ 2008-06-02 19:18 UTC
To: davidsen, linux-raid
Thank you Bill and Richard for your responses.
In sync_speed_max, I had already set it to 250000 (250MB/sec). For
sync_speed_min, I have 249900 set. My rationale behind doing this was to
"force" it to go as fast as it can. Any problem with this?
However, adjusting stripe_cache_size did improve performance. It was
256 at first, and my sync rate was 28MB/sec. When I increased it to
4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384,
and it jumped again to 40MB/sec. Increasing stripe_cache_size above
that did not seem to have any effect.
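The kind of test described above could be scripted along these lines. This is only a sketch: the function name, the settle time and the array name md0 are assumptions, and it needs root and a resync in progress to produce meaningful numbers.

```shell
#!/bin/sh
# sweep_stripe_cache MD_DIR SETTLE_SECONDS SIZE...
# For each SIZE, write it to stripe_cache_size, wait, then report sync_speed (KB/sec).
sweep_stripe_cache() {
    md_dir="$1"; settle="$2"; shift 2
    for size in "$@"; do
        echo "$size" > "$md_dir/stripe_cache_size" || return 1
        sleep "$settle"
        printf '%s %s\n' "$size" "$(cat "$md_dir/sync_speed")"
    done
}

# Example (md0 is an assumed name):
# sweep_stripe_cache /sys/block/md0/md 60 256 4096 16384
```

Since sync_speed is an average over recent activity, a settle time that is too short would mostly report the previous setting's rate.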
My question then is, how do I set the stripe_cache_size at the time of
md creation? I would rather set it then, as opposed to having to echo
stripe_cache_size variable with a new setting. In other words, where is
this default value of 256 coming from? Thanks all!!
-Thomas
* Re: Software RAID 6 initial sync very slow
From: Richard Scobie @ 2008-06-03 1:08 UTC
To: thomas62186218; +Cc: davidsen, linux-raid
thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
>
> In sync_speed_max, I had already set it to 250000 (250MB/sec). For
> sync_speed_min, I have 249900 set. My rationale behind doing this was to
> "force" it to go as fast as it can. Any problem with this?
You cannot force it to go faster, as for a RAID6 resync, the limiting
speed is going to be the maximum write speed of one drive and in
practice, I found that it self limited to around 60MB/s.
Regards,
Richard
* Re: Software RAID 6 initial sync very slow
From: Neil Brown @ 2008-06-03 5:11 UTC
To: thomas62186218; +Cc: davidsen, linux-raid
On Monday June 2, thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
>
> In sync_speed_max, I had already set it to 250000 (250MB/sec). For
> sync_speed_min, I have 249900 set. My rationale behind doing this was to
> "force" it to go as fast as it can. Any problem with this?
>
> However, adjusting stripe_cache_size did improve performance. It was
> 256 at first, and my sync rate was 28MB/sec. When I increased it to
> 4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384,
> and it jumped again to 40MB/sec. Increasing stripe_cache_size above
> that did not seem to have any effect.
>
> My question then is, how do I set the stripe_cache_size at the time of
> md creation? I would rather set it then, as opposed to having to echo
> stripe_cache_size variable with a new setting. In other words, where is
> this default value of 256 coming from? Thanks all!!
256 is the default hard coded into the kernel.
Why do you have a problem with echoing a number into the sysfs
variable? I guess I could teach mdadm to do that for you, but it
would just open the file and write to it, just like you do.
raid6 resync (like raid5) is optimised for an array that is already in
sync. It reads everything and checks the P and Q blocks. When it
finds a P or Q that is wrong, it calculates the correct value and goes
back to write it out.
On a fresh drive, this will involve lots of writing which means
seeking back to write something.
With a larger stripe_cache, the writes can presumably be done in
larger slabs so there are fewer seeks.
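To make the read-check-rewrite pattern concrete, here is a toy model for the P block only. It is not md's code: it treats P as the XOR of one byte from each data block (the usual definition of P parity) and leaves out Q, which needs Galois-field arithmetic. The function names and byte values are invented for illustration.

```shell
#!/bin/sh
# xor_parity BYTE...: P parity of one byte position across the data blocks.
xor_parity() {
    p=0
    for b in "$@"; do
        p=$(( p ^ b ))
    done
    echo "$p"
}

# check_stripe STORED_P BYTE...: "ok" if the stored parity matches,
# otherwise report the value that a resync would have to go back and write.
check_stripe() {
    stored="$1"; shift
    want=$(xor_parity "$@")
    if [ "$stored" -eq "$want" ]; then
        echo "ok"
    else
        echo "rewrite $want"
    fi
}

check_stripe 6 1 2 5   # stored parity already correct: read-only pass
check_stripe 0 1 2 5   # stale parity: mismatch, so an extra write
```

On an array that is already in sync every stripe takes the first path, which is the case the resync is optimised for; on an array with stale contents most stripes take the second path and pay for a seek back.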
You might get a better result by creating the array with two missing
devices and two spares. It will then read the good devices completely
linearly, and write the spares completely linearly and so should get
full hardware speed with normal stripe_cache size.
For raid5, mdadm makes this arrangement automatically. It doesn't for
raid6.
I'd be interested to discover what speed you get if you:
mdadm -C /dev/mdXX -l 6 -n ... -x 2 /dev/0 /dev/1 ... /dev/n-3 missing missing /dev/n-2 /dev/n-1
(if you get my drift).
Of course only do this if you don't have valuable data on the array
already (though it should survive).
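Spelled out for a concrete drive list, the suggestion might look like the sketch below. It only prints the command rather than running it; the helper name, /dev/md0 and the /dev/sdX names are hypothetical, and the result is untested. The long options used are standard mdadm create options.

```shell
#!/bin/sh
# build_create_cmd MD_DEV DEVICE...
# Print (not run) an mdadm command that leaves two member slots "missing"
# and uses the last two devices as spares.
build_create_cmd() {
    md_dev="$1"; shift
    total=$#
    [ "$total" -ge 4 ] || { echo "need at least 4 devices" >&2; return 1; }
    active=$(( total - 2 ))
    members=""; spares=""; i=0
    for dev in "$@"; do
        i=$(( i + 1 ))
        if [ "$i" -le "$active" ]; then
            members="$members $dev"
        else
            spares="$spares $dev"
        fi
    done
    echo "mdadm --create $md_dev --level=6 --raid-devices=$total --spare-devices=2$members missing missing$spares"
}

# Twelve hypothetical drives: ten members, two missing slots, two spares.
build_create_cmd /dev/md0 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg \
    /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm
```

Printing first and pasting by hand leaves a chance to check the device list before anything is written to the drives.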
NeilBrown
* Re: Software RAID 6 initial sync very slow
From: Bill Davidsen @ 2008-06-03 19:47 UTC
To: thomas62186218; +Cc: linux-raid
thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
>
> In sync_speed_max, I had already set it to 250000 (250MB/sec). For
> sync_speed_min, I have 249900 set. My rationale behind doing this was
> to "force" it to go as fast as it can. Any problem with this?
>
Other than possibly totally killing your response if you use some other
raid levels, no. I usually set the min speed to something I can tolerate
in the background, and still have a useful system.
> However, adjusting stripe_cache_size did improve performance. It was
> 256 at first, and my sync rate was 28MB/sec. When I increased it to
> 4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384,
> and it jumped again to 40MB/sec. Increasing stripe_cache_size above
> that did not seem to have any effect.
>
You definitely hit diminishing returns on cache_size. There are people
on this list who swear by 32k or 64k, but even when I can spare the
memory I don't see any benefit to looking for that last tiny gain.
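For a sense of what "sparing the memory" means, the kernel's md documentation gives the stripe cache footprint as page size times number of disks times stripe_cache_size. A quick sketch, assuming 4096-byte pages, a 12-drive array, and that "32k" and "64k" mean 32768 and 65536 entries:

```shell
#!/bin/sh
# stripe_cache_mib ENTRIES NR_DISKS [PAGE_BYTES]: approximate cache memory in MiB.
stripe_cache_mib() {
    entries="$1"; disks="$2"; page="${3:-4096}"
    echo $(( entries * disks * page / 1024 / 1024 ))
}

for entries in 256 4096 16384 32768 65536; do
    printf '%6s entries: %s MiB\n' "$entries" "$(stripe_cache_mib "$entries" 12)"
done
```

With those assumptions the default of 256 costs about 12 MiB, 16384 costs 768 MiB, and 65536 would take 3 GiB of an 8GB machine.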
> My question then is, how do I set the stripe_cache_size at the time of
> md creation? I would rather set it then, as opposed to having to echo
> stripe_cache_size variable with a new setting. In other words, where
> is this default value of 256 coming from? Thanks all!!
rc.local is my usual choice for things like that. If you want to
really spend a lot of time chasing performance, be aware that you can
use blockdev to play with readahead on the device and filesystem, and
generally spend days trying to create a test which will show benefits in
a statistically valid manner. Add chunk size and knock yourself out. I
am convinced that the answer to better performance is "it depends," so
just boosting readahead and cache_size a bit usually gets 80-90% of
anything you can get with tons of playing.
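An rc.local fragment along those lines might look like this. It is a sketch only: the array name md0, the cache size of 4096 and the readahead of 8192 sectors are assumed example values, not recommendations.

```shell
#!/bin/sh
# Fragment for rc.local. MD_NAME, the cache size, and the readahead value
# are assumed examples, not recommendations.
MD_NAME="${MD_NAME:-md0}"
CACHE_FILE="/sys/block/$MD_NAME/md/stripe_cache_size"

# The stripe cache setting does not survive a reboot, so reapply it on every boot.
if [ -w "$CACHE_FILE" ]; then
    echo 4096 > "$CACHE_FILE"
fi

# Readahead is given in 512-byte sectors; 8192 sectors is 4MiB.
if [ -b "/dev/$MD_NAME" ] && command -v blockdev >/dev/null 2>&1; then
    blockdev --setra 8192 "/dev/$MD_NAME"
fi
```

The guards let the fragment run harmlessly on a boot where the array has not been assembled.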
--
Bill Davidsen <davidsen@tmr.com>
"Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismark