linux-raid.vger.kernel.org archive mirror
* Software RAID 6 initial sync very slow
@ 2008-06-01 11:26 thomas62186218
  2008-06-01 20:37 ` Richard Scobie
  2008-06-02 12:35 ` Bill Davidsen
  0 siblings, 2 replies; 7+ messages in thread
From: thomas62186218 @ 2008-06-01 11:26 UTC (permalink / raw)
  To: linux-raid

Hi all,

I am debating between hardware and software RAID for a high-performance 
Linux storage server. One area of concern in my initial mdadm tests 
is sync speed. My setup is as follows:

Linux Fedora 7 x64
12 x Hitachi 1TB SATA II drives connected via an LSI SAS3801E PCIe HBA
Dual 2.5GHz quad-core Xeon
8GB RAM

My first test was to create a RAID 6 md device, 128K chunk, with all 12 
drives. The sync is currently underway at a very slow 30MB/sec, with 
mdadm estimating it will take about 9 hours. With an Adaptec 5805 RAID 
card I tested, the RAID 6 initial sync of the same 12 drives completed 
in just under 5 hours. Meanwhile, total CPU usage during the md sync 
is at 5%, so there is plenty of CPU not being used.
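
For reference, the array was created along these lines (the device names
are just placeholders for my 12 drives):

  # create a 12-drive RAID 6 array with a 128K chunk size
  mdadm --create /dev/md0 --level=6 --raid-devices=12 --chunk=128 /dev/sd[b-m]

  # watch the resync progress and current speed
  cat /proc/mdstat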

Why is the sync so slow? This concerns me not only because initial sync 
time is important, but because this may be indicative of slow sync 
times in the event of a drive failure.

Thanks for your input,
Thomas

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-01 11:26 Software RAID 6 initial sync very slow thomas62186218
@ 2008-06-01 20:37 ` Richard Scobie
  2008-06-02 12:35 ` Bill Davidsen
  1 sibling, 0 replies; 7+ messages in thread
From: Richard Scobie @ 2008-06-01 20:37 UTC (permalink / raw)
  To: thomas62186218; +Cc: linux-raid

thomas62186218@aol.com wrote:

> My first test was to create a RAID 6 md device, 128K chunk, with all 12 
> drives. The sync is currently underway at a very slow 30MB/sec, with 
> mdadm estimating it will take about 9 hours. With an Adaptec 5805 RAID 
> card I tested, the RAID 6 initial sync of the same 12 drives completed 
> in just under 5 hours. Meanwhile, total CPU usage during the md sync 
> is at 5%, so there is plenty of CPU not being used.

I have a very similar system, except with a single quad-core CPU and 
16 x 750GB drives.

It syncs (md RAID6) in 3 hours 40 minutes, which is roughly 
proportional to the time you see with your hardware RAID sync, so you 
should be able to match it with md RAID.

I cannot remember whether stripe_cache_size affected the sync time - 
currently I have this set to 16384.
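
If it helps, stripe_cache_size can be read and changed on the fly via
sysfs, along these lines (md0 is only an example name):

  # number of stripe cache entries; each entry uses one page per member
  cat /sys/block/md0/md/stripe_cache_size

  # enlarge it - takes effect immediately, no restart needed
  echo 16384 > /sys/block/md0/md/stripe_cache_size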

Regards,

Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-01 11:26 Software RAID 6 initial sync very slow thomas62186218
  2008-06-01 20:37 ` Richard Scobie
@ 2008-06-02 12:35 ` Bill Davidsen
  2008-06-02 19:18   ` thomas62186218
  1 sibling, 1 reply; 7+ messages in thread
From: Bill Davidsen @ 2008-06-02 12:35 UTC (permalink / raw)
  To: thomas62186218; +Cc: linux-raid

thomas62186218@aol.com wrote:
> Hi all,
>
> I am debating between hardware and software RAID for a high-performance 
> Linux storage server. One area of concern in my initial mdadm tests 
> is sync speed. My setup is as follows:
>
> Linux Fedora 7 x64
> 12 x Hitachi 1TB SATA II drives connected via an LSI SAS3801E PCIe HBA
> Dual 2.5GHz quad-core Xeon
> 8GB RAM
>
> My first test was to create a RAID 6 md device, 128K chunk, with all 
> 12 drives. The sync is currently underway at a very slow 30MB/sec, 
> with mdadm estimating it will take about 9 hours. With an Adaptec 5805 
> RAID card I tested, the RAID 6 initial sync of the same 12 drives 
> completed in just under 5 hours. Meanwhile, total CPU usage during 
> the md sync is at 5%, so there is plenty of CPU not being used.
>
> Why is the sync so slow? This concerns me not only because initial 
> sync time is important, but because this may be indicative of slow 
> sync times in the event of a drive failure.

Two things will limit this: the chunk size and the maximum speed set in 
/sys/block/mdNN/md/sync_speed_max. What you are seeing does sound slow.
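
Something along these lines will show and raise that ceiling (md0 is
only used as an example here):

  # current ceiling, in KB/sec
  cat /sys/block/md0/md/sync_speed_max

  # raise it to roughly 200MB/sec
  echo 200000 > /sys/block/md0/md/sync_speed_max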

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-02 12:35 ` Bill Davidsen
@ 2008-06-02 19:18   ` thomas62186218
  2008-06-03  1:08     ` Richard Scobie
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: thomas62186218 @ 2008-06-02 19:18 UTC (permalink / raw)
  To: davidsen, linux-raid

Thank you Bill and Richard for your responses.

I had already set sync_speed_max to 250000 (250MB/sec), and 
sync_speed_min to 249900. My rationale behind doing this was to 
"force" the sync to go as fast as it can. Any problem with this?

However, adjusting stripe_cache_size did improve performance. It was 
256 at first, and my sync rate was 28MB/sec. When I increased it to 
4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384, 
and it jumped again to 40MB/sec. Increasing stripe_cache_size above 
that did not seem to have any effect.

My question then is, how do I set the stripe_cache_size at the time of 
md creation? I would rather set it then, as opposed to having to echo 
a new value into the stripe_cache_size variable afterwards. In other 
words, where is this default value of 256 coming from? Thanks all!!

-Thomas

-----Original Message-----
From: Bill Davidsen <davidsen@tmr.com>
To: thomas62186218@aol.com
Cc: linux-raid@vger.kernel.org
Sent: Mon, 2 Jun 2008 5:35 am
Subject: Re: Software RAID 6 initial sync very slow

thomas62186218@aol.com wrote:

> Hi all,
>
> I am debating between hardware and software RAID for a high-performance
> Linux storage server. One area of concern in my initial mdadm tests
> is sync speed. My setup is as follows:
>
> Linux Fedora 7 x64
> 12 x Hitachi 1TB SATA II drives connected via an LSI SAS3801E PCIe HBA
> Dual 2.5GHz quad-core Xeon
> 8GB RAM
>
> My first test was to create a RAID 6 md device, 128K chunk, with all
> 12 drives. The sync is currently underway at a very slow 30MB/sec,
> with mdadm estimating it will take about 9 hours. With an Adaptec 5805
> RAID card I tested, the RAID 6 initial sync of the same 12 drives
> completed in just under 5 hours. Meanwhile, total CPU usage during
> the md sync is at 5%, so there is plenty of CPU not being used.
>
> Why is the sync so slow? This concerns me not only because initial
> sync time is important, but because this may be indicative of slow
> sync times in the event of a drive failure.

Two things will limit this: the chunk size and the maximum speed set in
/sys/block/mdNN/md/sync_speed_max. What you are seeing does sound slow.

--
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-02 19:18   ` thomas62186218
@ 2008-06-03  1:08     ` Richard Scobie
  2008-06-03  5:11     ` Neil Brown
  2008-06-03 19:47     ` Bill Davidsen
  2 siblings, 0 replies; 7+ messages in thread
From: Richard Scobie @ 2008-06-03  1:08 UTC (permalink / raw)
  To: thomas62186218; +Cc: davidsen, linux-raid

thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
> 
> I had already set sync_speed_max to 250000 (250MB/sec), and 
> sync_speed_min to 249900. My rationale behind doing this was to 
> "force" the sync to go as fast as it can. Any problem with this?

You cannot force it to go faster: for a RAID 6 resync, the limiting 
speed is going to be the maximum write speed of a single drive, and in 
practice I found that it self-limited to around 60MB/s.
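
If you want a rough idea of that ceiling on your hardware, measuring
one member's raw sequential write speed gives a reasonable estimate.
Note that this example overwrites the start of /dev/sdX (a
placeholder), so only run it on a drive that holds no data:

  # sequential write test bypassing the page cache; dd reports MB/s
  dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct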

Regards,

Richard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-02 19:18   ` thomas62186218
  2008-06-03  1:08     ` Richard Scobie
@ 2008-06-03  5:11     ` Neil Brown
  2008-06-03 19:47     ` Bill Davidsen
  2 siblings, 0 replies; 7+ messages in thread
From: Neil Brown @ 2008-06-03  5:11 UTC (permalink / raw)
  To: thomas62186218; +Cc: davidsen, linux-raid

On Monday June 2, thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
> 
> I had already set sync_speed_max to 250000 (250MB/sec), and 
> sync_speed_min to 249900. My rationale behind doing this was to 
> "force" the sync to go as fast as it can. Any problem with this?
> 
> However, adjusting stripe_cache_size did improve performance. It was 
> 256 at first, and my sync rate was 28MB/sec. When I increased it to 
> 4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384, 
> and it jumped again to 40MB/sec. Increasing stripe_cache_size above 
> that did not seem to have any effect.
> 
> My question then is, how do I set the stripe_cache_size at the time of 
> md creation? I would rather set it then, as opposed to having to echo 
> a new value into the stripe_cache_size variable afterwards. In other 
> words, where is this default value of 256 coming from? Thanks all!!

256 is the default hard-coded into the kernel.
Why do you have a problem with echoing a number into the sysfs
variable?  I guess I could teach mdadm to do that for you, but it
would just open the file and write to it, just like you do.

raid6 resync (like raid5) is optimised for an array that is already in
sync.  It reads everything and checks the P and Q blocks.  When it
finds P or Q blocks that are wrong, it calculates the correct values
and goes back to write them out.
On a fresh array, this will involve lots of writing, which means
seeking back to write something.
With a larger stripe_cache, the writes can presumably be done in
larger slabs so there are fewer seeks.

You might get a better result by creating the array with two missing
devices and two spares.  It will then read the good devices completely
linearly and write the spares completely linearly, and so should get
full hardware speed with the normal stripe_cache size.

For raid5, mdadm makes this arrangement automatically.  It doesn't for
raid6.

I'd be interested to discover what speed you get if you:

  mdadm -C /dev/mdXX -l 6 -n ... -x 2  /dev/0 /dev/1 ... /dev/n-3 missing missing /dev/n-2 /dev/n-1

(if you get my drift).
Of course only do this if you don't have valuable data on the array
already (though it should survive).
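
For a 12 drive array like yours that might look something like this
(sdb through sdm are only placeholders for your actual devices):

  mdadm -C /dev/md0 -l 6 -n 12 -x 2 \
        /dev/sd[b-k] missing missing /dev/sdl /dev/sdm

i.e. ten real members plus two "missing" slots, with the last two
drives given as spares so md rebuilds onto them with linear reads and
linear writes.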

NeilBrown

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Software RAID 6 initial sync very slow
  2008-06-02 19:18   ` thomas62186218
  2008-06-03  1:08     ` Richard Scobie
  2008-06-03  5:11     ` Neil Brown
@ 2008-06-03 19:47     ` Bill Davidsen
  2 siblings, 0 replies; 7+ messages in thread
From: Bill Davidsen @ 2008-06-03 19:47 UTC (permalink / raw)
  To: thomas62186218; +Cc: linux-raid

thomas62186218@aol.com wrote:
> Thank you Bill and Richard for your responses.
>
> I had already set sync_speed_max to 250000 (250MB/sec), and 
> sync_speed_min to 249900. My rationale behind doing this was to 
> "force" the sync to go as fast as it can. Any problem with this?
>
Other than possibly killing your responsiveness if you are also using 
some other RAID levels, no. I usually set the min speed to something I 
can tolerate in the background and still have a useful system.

> However, adjusting stripe_cache_size did improve performance. It was 
> 256 at first, and my sync rate was 28MB/sec. When I increased it to 
> 4096, my sync rate jumped to 38MB/sec. Then I increased it to 16384, 
> and it jumped again to 40MB/sec. Increasing stripe_cache_size above 
> that did not seem to have any effect.
>
You definitely hit diminishing returns on cache_size. There are people 
on this list who swear by 32k or 64k, but even when I can spare the 
memory I don't see any benefit to looking for that last tiny gain.

> My question then is, how do I set the stripe_cache_size at the time of 
> md creation? I would rather set it then, as opposed to having to echo 
> a new value into the stripe_cache_size variable afterwards. In other 
> words, where is this default value of 256 coming from? Thanks all!!

rc.local is my usual choice for things like that. If you want to 
really spend a lot of time chasing performance, be aware that you can 
use blockdev to play with readahead on the device and filesystem, and 
generally spend days trying to create a test which will show benefits 
in a statistically valid manner. Add chunk size and knock yourself 
out. I am convinced that the answer to better performance is "it 
depends," so just boosting readahead and stripe_cache_size a bit 
usually gets 80-90% of anything you can get with tons of playing.
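
A minimal rc.local sketch, with numbers that are only starting points
to tune from:

  # /etc/rc.local additions - md0 and the values are examples
  echo 4096 > /sys/block/md0/md/stripe_cache_size
  blockdev --setra 8192 /dev/md0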

-- 
Bill Davidsen <davidsen@tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismarck



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-06-03 19:47 UTC | newest]

Thread overview: 7+ messages
2008-06-01 11:26 Software RAID 6 initial sync very slow thomas62186218
2008-06-01 20:37 ` Richard Scobie
2008-06-02 12:35 ` Bill Davidsen
2008-06-02 19:18   ` thomas62186218
2008-06-03  1:08     ` Richard Scobie
2008-06-03  5:11     ` Neil Brown
2008-06-03 19:47     ` Bill Davidsen
