linux-raid.vger.kernel.org archive mirror
* Raid10 multi core scaling
@ 2013-11-26 10:58 Pedro Teixeira
  2013-11-26 11:19 ` Adam Goryachev
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Pedro Teixeira @ 2013-11-26 10:58 UTC
  To: linux-raid

   I created a RAID10 array with 16 SATA 1TB disks, and the array
performance seems to be limited by the md0_raid10 thread taking 99% of one
core and not scaling to other cores. I tried overclocking the CPU cores
and this led to a small increase in performance (but md0_raid10 keeps
eating 99% of one core).

   I'm using:
    - a Phenom X6 at 3600MHz
    - 16 Seagate SSHDs (SATA3, 7200RPM, with 8GB SSD cache)
    - 4x Marvell 9230 SATA3 controllers (4 ports each), PCIe 2.0 x2
    - 8GB RAM
    - custom 3.12 kernel and mdadm compiled from the latest source

   What I did to test performance was to force a check on the array, and
this leads to mdadm reporting a speed of about 990000K/sec. The hard disks
report 54% utilization. (Overclocking the CPU by 200MHz increases the
resync speed a bit and the disks' utilization to about 58%.)

   If I do the same with a RAID5 array instead of RAID10, the resync
speed will be almost double that of RAID10, the reported hard disk
utilization will be 98-100%, and I can see at least two cores being used.
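
For anyone wanting to repeat this measurement, here is a minimal Python
sketch that is not part of the original report. It assumes the array is
md0, that the check is started by writing to the md sysfs file
sync_action, and that progress is read from /proc/mdstat. The sample
progress line is made up apart from the 990000K/sec figure, and the linear
extrapolation from 54% disk utilization is only a rough guess at what the
disks might sustain if the CPU were not the limit.

```python
import re
from pathlib import Path
from typing import Optional

# Matches the "speed=990000K/sec" field that md prints in /proc/mdstat
# while a check, resync or recovery is running.
SPEED_RE = re.compile(r"speed=(\d+)K/sec")


def request_check(md_name: str) -> None:
    """Ask md to start a check pass on e.g. 'md0' (needs root)."""
    Path(f"/sys/block/{md_name}/md/sync_action").write_text("check\n")


def parse_speed_k(mdstat_text: str) -> Optional[int]:
    """Return the first reported sync speed in K/sec, or None if none is shown."""
    match = SPEED_RE.search(mdstat_text)
    return int(match.group(1)) if match else None


def extrapolate_to_full_utilization(speed_k: int, utilization: float) -> float:
    """Naive linear scaling of the speed to 100% disk utilization (assumption)."""
    return speed_k / utilization


# Illustrative progress line; only the speed value comes from the report.
SAMPLE = ("      [==>..................]  check = 12.6% "
          "(123456789/976630336) finish=14.3min speed=990000K/sec")

speed = parse_speed_k(SAMPLE)
print(speed, "K/sec reported")
print(round(extrapolate_to_full_utilization(speed, 0.54)),
      "K/sec if speed scaled linearly to 100% utilization")
```

On a live system one would call request_check("md0") as root and then
feed the contents of /proc/mdstat to parse_speed_k().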

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Raid10 multi core scaling
  2013-11-26 10:58 Raid10 multi core scaling Pedro Teixeira
@ 2013-11-26 11:19 ` Adam Goryachev
  2013-11-27  6:52 ` Stan Hoeppner
  2013-12-02  6:22 ` NeilBrown
  2 siblings, 0 replies; 6+ messages in thread
From: Adam Goryachev @ 2013-11-26 11:19 UTC
  To: Pedro Teixeira, linux-raid



AFAIK, the only way to make RAID10 use multiple cores is to create
8 RAID1 arrays and then combine those into a linear (concatenated) or
RAID0 (striped) array. Each array will then create its own thread (a total
of 9 threads), and these will spread across all available cores.

There is ongoing work happening to improve this, but I don't think it is
available in any released kernel yet.
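
As a rough illustration of that layout, the following Python sketch only
builds the mdadm command lines and prints them; it does not run anything.
The device names (/dev/sda to /dev/sdp for the disks, /dev/md1 to /dev/md8
for the mirrors, /dev/md0 for the stripe) and the pairing of neighbouring
disks are assumptions made for the example, not details from this thread.

```python
from typing import List


def raid1_plus_0_commands(disks: List[str], stripe_dev: str = "/dev/md0") -> List[str]:
    """Build mdadm commands for RAID1 pairs plus one RAID0 stripe over them."""
    if len(disks) < 4 or len(disks) % 2:
        raise ValueError("need an even number of disks, at least 4")

    commands = []
    mirrors = []
    # Pair neighbouring disks into two-way mirrors (pairing is an assumption).
    for i in range(0, len(disks), 2):
        mirror = f"/dev/md{i // 2 + 1}"
        mirrors.append(mirror)
        commands.append(
            f"mdadm --create {mirror} --level=1 --raid-devices=2 "
            f"{disks[i]} {disks[i + 1]}"
        )
    # Stripe across all of the mirrors.
    commands.append(
        f"mdadm --create {stripe_dev} --level=0 "
        f"--raid-devices={len(mirrors)} " + " ".join(mirrors)
    )
    return commands


# Hypothetical names for the 16 disks: /dev/sda ... /dev/sdp.
disks = [f"/dev/sd{chr(ord('a') + n)}" for n in range(16)]
for command in raid1_plus_0_commands(disks):
    print(command)
```

Printing the commands first makes it easy to review the pairing before
running anything as root.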

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au



* Re: Raid10 multi core scaling
  2013-11-26 10:58 Raid10 multi core scaling Pedro Teixeira
  2013-11-26 11:19 ` Adam Goryachev
@ 2013-11-27  6:52 ` Stan Hoeppner
  2013-12-02  6:22 ` NeilBrown
  2 siblings, 0 replies; 6+ messages in thread
From: Stan Hoeppner @ 2013-11-27  6:52 UTC
  To: Pedro Teixeira, linux-raid

On 11/26/2013 4:58 AM, Pedro Teixeira wrote:
>    I created a RAID10 array with 16 SATA 1TB disks, and the array
> performance seems to be limited by the md0_raid10 thread taking 99% of one
> core and not scaling to other cores.

The md RAID 5/6/10 drivers have a single write thread.  If you push
enough write IO you will peak one CPU core and hit a wall.  An effort is
currently underway to make use of multiple write threads, but this code
is not ready yet.

> I tried overclocking the CPU cores and this led to
> a small increase in performance (but md0_raid10 keeps eating 99% of one
> core).
>
>    I'm using:
>     - a Phenom X6 at 3600MHz
>     - 16 Seagate SSHDs (SATA3, 7200RPM, with 8GB SSD cache)

So with this hardware you'll peak one CPU core until you've written
somewhere around 64GB, at which point you will have saturated the flash
cache on the drives.  After this point you should see a change from
being CPU bound to being disk bound, as you're writing at spindle speed.
4x Marvell 88SE9230 based HBAs w/PCIe 2.0 x2 interfaces limit you to
4GB/s read/write throughput to flash cache.  The drives' spindle
performance limits you to 2GB/s.  So somewhere between 2 and 4GB/s your
3.6GHz Phenom core is running out of juice.
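
The arithmetic behind those figures can be sketched roughly as follows.
This is a back-of-the-envelope Python calculation, not a measurement. It
assumes about 500 MB/s of usable bandwidth per PCIe 2.0 lane, two copies
of every block in the RAID10 array, and about 125 MB/s of streaming
throughput per spindle; the last value is simply back-derived from the
2GB/s figure above.

```python
# Hardware counts taken from the original post.
HBAS = 4
LANES_PER_HBA = 2
DRIVES = 16
FLASH_GB_PER_DRIVE = 8

# Assumptions for the estimate (not stated in the thread).
PCIE2_MB_PER_LANE = 500      # 5 GT/s with 8b/10b encoding, roughly 500 MB/s usable
COPIES = 2                   # two copies of each block in the RAID10 array
SPINDLE_MB_PER_DRIVE = 125   # back-derived from the 2GB/s aggregate figure


def hba_limit_gb_per_s() -> float:
    """Aggregate bus bandwidth across all four controllers."""
    return HBAS * LANES_PER_HBA * PCIE2_MB_PER_LANE / 1000


def flash_capacity_user_gb() -> float:
    """User data that fits in the drives' flash caches before they fill."""
    return DRIVES * FLASH_GB_PER_DRIVE / COPIES


def spindle_limit_gb_per_s() -> float:
    """Raw aggregate streaming rate of all spindles."""
    return DRIVES * SPINDLE_MB_PER_DRIVE / 1000


print("HBA limit:      ", hba_limit_gb_per_s(), "GB/s")
print("Flash absorbs:  ", flash_capacity_user_gb(), "GB of user data")
print("Spindle limit:  ", spindle_limit_gb_per_s(), "GB/s")
```

Changing any of the assumed constants shifts the results proportionally,
so the numbers are only as good as those assumptions.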

You should not be CPU/thread limited while reading, as reads are not
limited to a single thread.  With a pure streaming read you should be
able to get close to 4GB/s throughput, and you'll see multiple cores in
play, but the work is being done by other kernel IO threads, not the md
thread.

>    What I did to test performance was to force a check on the array, and
> this

This only tells you the behavior of resync, not a normal workload.

> leads to mdadm reporting a speed of about 990000K/sec. The hard disks
> report 54% utilization. (Overclocking the CPU by 200MHz increases the
> resync speed a bit and the disks' utilization to about 58%.)
> 
>    If I do the same with a RAID5 array instead of RAID10, the resync
> speed will be almost double that of RAID10, the reported hard disk
> utilization will be 98-100%, and I can see at least two cores being used.

This is an apples to oranges comparison, so saying resync speed of RAID5
is double that of RAID10 doesn't mean anything.  Also, the RAID5 core
utilization you see is due to RAID5 using a second core for parity
calculations.

If you want RAID10 and you're hitting a wall at one core, your best
option currently is to build 8 RAID1 devices and build a RAID0 device of
these.  If resync is your preferred test method then you'd fire up 8
resyncs of the 8 RAID1 devices, in parallel, then sum the run times.
You can't resync a RAID0 device.  The total run time should be
significantly lower than using md/RAID10 or md/RAID5.  And you'll see
multiple cores in play, all of them actually, because you'll have 8
RAID1 devices and 6 cores.  But the utilization per core will be quite low.

There are other options to get around the core saturation problem.  You
could create multiple md/RAID10 arrays and lay a stripe over them or
concatenate them, such as a 2x8 or 4x4.  But you really must know what
you're doing to get the nested striping right, or to properly lay out XFS
AGs (allocation groups) on a concatenation.  If not done properly,
performance could be worse than what you have now.

Given the stripe over mirrors gets all cores in play, and doesn't have
such pitfalls, it's the better option by far.

-- 
Stan


* Re: Raid10 multi core scaling
  2013-11-26 10:58 Raid10 multi core scaling Pedro Teixeira
  2013-11-26 11:19 ` Adam Goryachev
  2013-11-27  6:52 ` Stan Hoeppner
@ 2013-12-02  6:22 ` NeilBrown
  2013-12-02  8:19   ` David Brown
  2 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2013-12-02  6:22 UTC
  To: Pedro Teixeira; +Cc: linux-raid


On Tue, 26 Nov 2013 10:58:59 +0000 Pedro Teixeira <finas@aeiou.pt> wrote:

>    I created a RAID10 array with 16 SATA 1TB disks, and the array
> performance seems to be limited by the md0_raid10 thread taking 99% of one
> core and not scaling to other cores. I tried overclocking the CPU cores
> and this led to a small increase in performance (but md0_raid10 keeps
> eating 99% of one core).

Are you really talking about general array performance, or just resync
performance?

md0_raid10 doesn't do much work for normal IO, so that should scale to
multiple processors.
I'm not sure that optimising resync to use more than one processor is really
much of a priority - is it?

NeilBrown




* Re: Raid10 multi core scaling
  2013-12-02  6:22 ` NeilBrown
@ 2013-12-02  8:19   ` David Brown
  2013-12-02  9:04     ` NeilBrown
  0 siblings, 1 reply; 6+ messages in thread
From: David Brown @ 2013-12-02  8:19 UTC
  To: NeilBrown, Pedro Teixeira; +Cc: linux-raid

On 02/12/13 07:22, NeilBrown wrote:
> Are you really talking about general array performance, or just resync
> performance?
> 
> Because md0_raid10 doesn't do much work for normal IO so that should scale to
> multiple processors.
> I'm not sure that optimising resync to use more than one processor is really
> much of a priority - is it?
> 
> NeilBrown

I think that depends a bit on whether we are talking about md raid10, or
raid1+0.  For md raid10, resyncs and rebuilds are always going to take a
bit of time - data has to be copied back and forth across the same
disks, with a lot of head movement (especially for "far" layout).  I
don't see that it would take much cpu time, however.  And this sort of
layout is mostly for small arrays - it gives great performance when you
have two drives, and perhaps up to about 4 drives.  But after that,
people probably want raid1+0 layouts (i.e., raid0 stripes of raid1 pairs).

For raid1 + 0, resync time /is/ important - it is one of the reasons
people pick it, especially the time to resync a single replaced drive.
But again, I can't see how that would take cpu time.

I suppose that when doing the initial sync of this 16 disk array, you
have to read in all data from 8 disks and write it out to the other 8
disks.  That's a lot of IO passing through the CPUs, even if there are no
calculations going on.  But I thought each raid1 pair already had its
own thread, so this would be scaled across 8 cores?  Is md raid10
limited to just one thread?

Best regards,

David



* Re: Raid10 multi core scaling
  2013-12-02  8:19   ` David Brown
@ 2013-12-02  9:04     ` NeilBrown
  0 siblings, 0 replies; 6+ messages in thread
From: NeilBrown @ 2013-12-02  9:04 UTC
  To: David Brown; +Cc: Pedro Teixeira, linux-raid


On Mon, 02 Dec 2013 09:19:45 +0100 David Brown <david.brown@hesbynett.no>
wrote:

> I think that depends a bit on whether we are talking about md raid10, or
> raid1+0.  For md raid10, resyncs and rebuilds are always going to take a
> bit of time - data has to be copied back and forth across the same
> disks, with a lot of head movement (especially for "far" layout).  I
> don't see that it would take much cpu time, however.  And this sort of
> layout is mostly for small arrays - it gives great performance when you
> have two drives, and perhaps up to about 4 drives.  But after that,
> people probably want raid1+0 layouts (i.e., raid0 stripes of raid1 pairs).
> 
> For raid1 + 0, resync time /is/ important - it is one of the reasons
> people pick it, especially the time to resync a single replaced drive.
> But again, I can't see how that would take cpu time.
> 
> I suppose that when doing the initial sync of this 16 disk array, you
> have to read in all data from 8 disks and write it out to the other 8
> disks.  That's a lot of IO passing through the CPUs, even if there are no
> calculations going on.  But I thought each raid1 pair already had its
> own thread, so this would be scaled across 8 cores?  Is md raid10
> limited to just one thread?

For recovery the data should not pass through the CPU.  But it still has to
handle thousands of IO requests.
For resync, we read both copies and compare, only writing if they differ.
This can be more efficient when the device is a lot slower than the CPU.
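
The compare-then-write idea can be illustrated with a small in-memory toy
model in Python. This is only a sketch of the logic described above, not
the kernel's implementation; the function name resync_pass, the tiny chunk
size and the use of byte arrays in place of disks are all assumptions made
for the example.

```python
from typing import List


def resync_pass(primary: bytearray, mirror: bytearray, chunk: int = 4) -> List[int]:
    """Compare two copies chunk by chunk and rewrite the mirror only where
    the copies differ. Returns the offsets of the chunks that were rewritten."""
    if len(primary) != len(mirror):
        raise ValueError("both copies must be the same size")

    rewritten = []
    for offset in range(0, len(primary), chunk):
        good = primary[offset:offset + chunk]
        # Both copies are always read; a write is issued only on a mismatch.
        if good != mirror[offset:offset + chunk]:
            mirror[offset:offset + chunk] = good
            rewritten.append(offset)
    return rewritten


primary = bytearray(b"AAAABBBBCCCC")
mirror = bytearray(b"AAAAxxxxCCCC")   # the middle chunk is out of sync
print(resync_pass(primary, mirror))
print(mirror == primary)
```

When the copies already match, the pass issues reads only, which is where
the saving comes from on devices that are much slower than the CPU.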

md/raid10 is currently limited to just one thread.

NeilBrown

