From: Stan Hoeppner
Subject: Re: md RAID with enterprise-class SATA or SAS drives
Date: Mon, 21 May 2012 14:05:01 -0500
Message-ID: <4FBA91DD.7010307@hardwarefreak.com>
References: <4FAAE8F1.8000600@pocock.com.au> <4FABC7C6.4030107@turmel.org> <4FAC2FF2.5060305@hardwarefreak.com> <4FAC40BC.1060300@hesbynett.no> <4FACBB68.2080304@hesbynett.no> <4FACCAC8.4020206@pocock.com.au> <4FAD9283.7020809@hardwarefreak.com> <4FBA8EA9.40203@hardwarefreak.com>
Reply-To: stan@hardwarefreak.com
To: Roberto Spadim
Cc: CoolCold, Daniel Pocock, David Brown, Phil Turmel, Marcus Sorensen, linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 5/21/2012 1:54 PM, Roberto Spadim wrote:
> hum, could anyone explain how a 'multi thread' version of raid1 would
> be implemented? for example, how would it scale, and why would such an
> implementation scale better?

I just did, below.  You layer a stripe over many RAID 1 pairs.  A single
md RAID 1 pair isn't enough to saturate a single core, so there is no
gain to be had from threading the RAID 1 code itself.  (A rough mdadm
sketch of the layered setup is at the bottom of this mail.)

--
Stan

> 2012/5/21 Stan Hoeppner:
>> On 5/21/2012 10:20 AM, CoolCold wrote:
>>> On Sat, May 12, 2012 at 2:28 AM, Stan Hoeppner wrote:
>>>> On 5/11/2012 3:16 AM, Daniel Pocock wrote:
>>>>
>>> [snip]
>>>> That's the one scenario where I abhor using md raid, as I mentioned.
>>>> At least, a boot raid 1 pair.  Using layered md raid 1 + 0, or
>>>> 1 + linear, is a great solution for many workloads.  Ask me why I
>>>> say raid 1 + 0 instead of raid 10.
>>> So, I'm asking - why?
>>
>> Neil pointed out quite some time ago that the md RAID 1/5/6/10 code
>> runs as a single kernel thread.  Thus, when running heavy IO workloads
>> across many spinning-rust disks or a few SSDs, the md thread becomes
>> CPU bound: it can only execute on a single core, just as with any
>> other single thread.
>>
>> This issue is becoming more relevant as folks move to the latest
>> generation of server CPUs that trade clock speed for higher core
>> count.  Imagine the surprise of the OP who buys a dual socket box with
>> 2x 16-core AMD Interlagos 2.0GHz CPUs, 256GB RAM, and 32 SSDs in md
>> RAID 10, only to find he can get only a tiny fraction of the SSD
>> throughput.  Upon investigation he finds a single md thread pegging
>> one core while the rest sit relatively idle but for the application
>> itself.
>>
>> As I understand Neil's explanation, the md RAID 0 and linear code
>> don't run as separate kernel threads, but merely pass offsets to the
>> block layer, which is fully threaded.  Thus, by layering md RAID 0
>> over md RAID 1 pairs, the striping load is spread over all cores.
>> Same with linear, avoiding the single-thread bottleneck.
>>
>> This layering can be done with any md RAID level, creating RAID 50s
>> and RAID 60s, or concatenations of RAID 5/6, as well as RAID 1+0.
>>
>> And it shouldn't take anywhere near 32 modern SSDs to saturate a
>> single 2GHz core with md RAID 10.  It's likely fewer than 8 SSDs,
>> which yield ~400K IOPS, but I haven't done verification testing
>> myself at this point.
>>
>> --
>> Stan
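
P.S. For anyone who wants to try the layered approach, here is roughly
what it looks like with mdadm.  This is an untested sketch: the device
names, pair count, and array names are placeholders for illustration,
so adjust them to your own hardware and partitioning.

  # Build the RAID 1 pairs; each pair gets its own md kernel thread.
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sde /dev/sdf
  mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sdg /dev/sdh

  # Stripe (RAID 0) over the pairs.  RAID 0 has no dedicated md thread;
  # it just passes offsets to the block layer, so the striping work is
  # spread across cores instead of bottlenecking on one.
  mdadm --create /dev/md10 --level=0 --raid-devices=4 \
      /dev/md0 /dev/md1 /dev/md2 /dev/md3

  # Or concatenate the pairs instead of striping them:
  # mdadm --create /dev/md10 --level=linear --raid-devices=4 \
  #     /dev/md0 /dev/md1 /dev/md2 /dev/md3

Then put your filesystem on /dev/md10 as usual.  The same pattern gives
you RAID 50/60: build the RAID 5/6 arrays first, then stripe or
concatenate over them.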