From: Bill Davidsen
Subject: Re: Successful RAID 6 setup
Date: Mon, 09 Nov 2009 12:37:57 -0500
Message-ID: <4AF85375.9@tmr.com>
References: <20091104184049356.ZIDJ2725@cdptpa-omta04.mail.rr.com> <4AF5BDF3.8020907@redhat.com>
To: Beolach
Cc: Doug Ledford, Leslie Rhorer, linux-raid@vger.kernel.org

Beolach wrote:
> On Sat, Nov 7, 2009 at 11:35, Doug Ledford wrote:
>
>> On 11/04/2009 01:40 PM, Leslie Rhorer wrote:
>>
>>> I would recommend a larger chunk size.  I'm using 256K, and even
>>> 512K or 1024K probably would not be excessive.
>>>
>> OK, I've got some data that I'm not quite ready to send out yet, but
>> it maps out the relationship between max_sectors_kb (the largest
>> request size a disk can process, which varies with the SCSI host
>> adapter in question, but for SATA adapters is capped at and defaults
>> to 512KB per request) and chunk size for a raid0 array across 4 or 5
>> disks (I could run other array sizes too, and that's part of what I'm
>> waiting on before sending the data out).  The point here is that a
>> raid0 array will show up more of the md/lower-layer block device
>> interactions, whereas raid5/6 would muddy the waters with other
>> stuff.  The results of the tests I ran were pretty conclusive that
>> the sweet spot is when the chunk size equals max_sectors_kb, and
>> since SATA is the predominant thing today and it defaults to 512K,
>> that gives a 512K chunk as the sweet spot.  Given that the chunk size
>> is generally about optimizing block device operations at the
>> command/queue level, it should transfer directly to raid5/6 as well.
>>
>
> This only really applies to large sequential I/O loads, right?  I seem
> to recall smaller chunk sizes being more effective for smaller random
> I/O loads.
>
Not true now (if it ever was).  The operative limit here is seek time,
not transfer time.  Back in the day of old and slow drives hanging off
old and slow connections, the time to transfer the data was somewhat of
an issue.  Current SATA drives and controllers have higher transfer
rates, and until SSDs make seek times smaller, bigger is better, within
reason.

That said, a related question: why is a six-drive raid6 slower than a
four-drive one?  On a small write all the data chunks have to be read,
but that can be done in parallel, so the limit should stay at the seek
time of the slowest drive.  In practice it behaves as if the data chunks
were being read one at a time.  Is that real, or just fallout from a
test that wasn't long enough to smooth out the data?

-- 
Bill Davidsen
  "We can't solve today's problems by using the same thinking we used in
   creating them." - Einstein
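
For anyone who wants to check the chunk-size/max_sectors_kb match Doug
describes against their own disks, here is a minimal sketch.  It is not
part of Doug's test harness; the member device names and the mdadm
command it prints are placeholders I've assumed.  It just reads
max_sectors_kb from sysfs for each member and suggests a matching
--chunk value (mdadm takes --chunk in KB).

#!/usr/bin/env python3
# Rough sketch: read the block layer's max request size for each member
# disk and suggest an mdadm chunk size that matches the smallest one.
# Device names below are placeholders, not from the thread.

from pathlib import Path

def max_sectors_kb(disk: str) -> int:
    """Return the max request size (KB) the block layer reports for one disk."""
    return int(Path(f"/sys/block/{disk}/queue/max_sectors_kb").read_text())

def suggested_chunk_kb(disks: list[str]) -> int:
    """Use the smallest max_sectors_kb among the members as the chunk size."""
    return min(max_sectors_kb(d) for d in disks)

if __name__ == "__main__":
    members = ["sdb", "sdc", "sdd", "sde"]   # placeholder member disks
    for d in members:
        print(f"{d}: max_sectors_kb = {max_sectors_kb(d)}")
    chunk = suggested_chunk_kb(members)
    print("suggested: mdadm --create /dev/md0 --level=6 "
          f"--raid-devices={len(members)} --chunk={chunk} "
          + " ".join(f"/dev/{d}" for d in members))

Compare the printed suggestion with what mdadm --detail reports for an
existing array before deciding whether a rebuild with a different chunk
size is worth it.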