* Small chunk size read performance penalty
@ 2013-08-18 22:05 Ian Pilcher
2013-08-18 22:16 ` Roberto Spadim
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Ian Pilcher @ 2013-08-18 22:05 UTC (permalink / raw)
To: linux-raid
Can anyone point me to a good explanation of the read performance impact
of small (RAID-5 and RAID-6) chunk sizes?
I understand why large chunks hurt write performance, but I haven't been
able to reason through the small-chunk/read case, and my Interweb
searches haven't really turned anything up.
The "read penalty" is definitely there; I can see it in the test data
from my NAS. I just don't understand *why* it's there.
Thanks!
--
========================================================================
Ian Pilcher arequipeno@gmail.com
Sometimes there's nothing left to do but crash and burn...or die trying.
========================================================================
* Re: Small chunk size read performance penalty
From: Roberto Spadim @ 2013-08-18 22:16 UTC (permalink / raw)
To: Ian Pilcher; +Cc: Linux-RAID
I'm not sure, but maybe the problem is the same as with a small chunk
size on RAID-0: the head moves a lot for small pieces of work, when it
could move less and do more work per move. But I'm not sure about it.

2013/8/18 Ian Pilcher <arequipeno@gmail.com>:
> Can anyone point me to a good explanation of the read performance impact
> of small (RAID-5 and RAID-6) chunk sizes?
>
> I understand why large chunks hurt write performance, but I haven't been
> able to reason through the small-chunk/read case, and my Interweb
> searches haven't really turned anything up.
>
> The "read penalty" is definitely there; I can see it in the test data
> from my NAS. I just don't understand *why* it's there.
>
> Thanks!

--
Roberto Spadim
* Re: Small chunk size read performance penalty
From: Stan Hoeppner @ 2013-08-19 1:40 UTC (permalink / raw)
To: Ian Pilcher; +Cc: linux-raid
On 8/18/2013 5:05 PM, Ian Pilcher wrote:
> Can anyone point me to a good explanation of the read performance impact
> of small (RAID-5 and RAID-6) chunk sizes?

Can you elaborate on your workload that demonstrates this? Different
workloads behave differently with different chunk sizes.

> I understand why large chunks hurt write performance...

Again this is workload dependent. Large chunks increase write and read
performance for large streaming workloads.

> The "read penalty" is definitely there; I can see it in the test data
> from my NAS. I just don't understand *why* it's there.

If you can see it, then please demonstrate this read penalty with
numbers. You obviously have test data from the same set of disks with
two different RAID5s of different chunk sizes. This is required to see
such a difference in performance. Please share this data with us.

--
Stan
* Re: Small chunk size read performance penalty
From: Ian Pilcher @ 2013-08-19 5:49 UTC (permalink / raw)
To: linux-raid
On 08/18/2013 08:40 PM, Stan Hoeppner wrote:
> Can you elaborate on your workload that demonstrates this? Different
> workloads behave differently with different chunk sizes.

dd ... at block sizes between 4KiB and 1MiB, on RAID-5 and -6 arrays
with chunk sizes in the same range.

Hardware is five 7200 RPM SATA drives in a NAS (Thecus N5550) with an
Atom D2550 processor and an ICH10R chipset. The drives are all connected
to the chipset's built-in AHCI controller.

> If you can see it, then please demonstrate this read penalty with
> numbers. You obviously have test data from the same set of disks with
> two different RAID5s of different chunk sizes. This is required to see
> such a difference in performance. Please share this data with us.

I've uploaded the data (in OpenDocument spreadsheet form) to Dropbox. I
think that it's accessible at this link:

https://www.dropbox.com/s/4dq93th4wu5rr2y/nas_benchmarks.ods

(This is my first attempt at sharing anything via Dropbox, so let me
know if it doesn't work.)

I actually find your response really interesting. From my Interweb
searching, the "small stripe size read penalty" seems to be pretty
widely accepted, much as the "large stripe size write penalty" is. It
certainly does show up in my data; as the chunk size increases, reads of
even small blocks get faster.

--
========================================================================
Ian Pilcher arequipeno@gmail.com
Sometimes there's nothing left to do but crash and burn...or die trying.
========================================================================
* Re: Small chunk size read performance penalty
From: Stan Hoeppner @ 2013-08-20 2:28 UTC (permalink / raw)
To: Ian Pilcher; +Cc: linux-raid
On 8/19/2013 12:49 AM, Ian Pilcher wrote:
> I actually find your response really interesting. From my Interweb
> searching, the "small stripe size read penalty" seems to be pretty
> widely accepted, much as the "large stripe size write penalty" is. It
> certainly does show up in my data; as the chunk size increases reads of
> even small blocks get faster.

Everything in the world of storage performance depends on the workload.
The statements above assume an unstated workload, and are so general as
to not be worth repeating, and certainly not worth putting any stock in.
The former (the "small stripe size read penalty") is true of large
streaming workloads.
If your workload deals with small IO reads, such as mail serving, then a
small stripe is not detrimental, as the mail file you're reading is
almost always smaller than the stripe size, and often smaller than the
chunk size. Using a large chunk/stripe with such a workload can create
hotspots on some disks in the array, increasing latency and decreasing
throughput.

However, in this scenario, the big win is in write latency. A large
chunk/stripe size will generate a huge amount of unnecessary read IO
during read-modify-write (RMW) cycles to recalculate parity when you
write a new mail message into an existing stripe. With an optimal
chunk/stripe for this workload, you read few extra sectors during RMW.

It's often very difficult to get this balance right. And even if you do,
mail workloads are still many times slower on parity RAID than on
mirrors or striped mirrors (RAID10).

This obviously depends on load. Even "low end" modern server hardware
with md RAID6 and a handful of disks can easily handle a few hundred
active mail users. Once you get into the thousands you'll need mirror
based RAID, as RMW latency will grind you to a halt. The same hardware
is plenty. You simply change the RAID level. You'll need a couple more
disks to maintain total capacity, but simply changing to mirror based
RAID will increase throughput 5-15 fold, and decrease latency
substantially.

Any "large stripe size write penalty" will be a function of mismatching
the workload to the RAID stripe and/or array/drive hardware. Using a
large stripe with a mail workload will yield poor performance indeed due
to large RMW bandwidth/latency. Large stripe with this workload
typically means anything above 32-64KB. Yes, that's stripe, not chunk.
For this workload using a 6 drive RAID6 you'd want an 8-16KB chunk for a
32-64KB stripe. This is the opposite of the meme you quote above. Again,
workload dependent.
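[Illustration, not part of the original message.] The chunk/stripe
arithmetic above can be checked with a small Python sketch. It assumes
that "stripe" here means the data portion only, i.e. chunk size times
the number of non-parity drives, which is what makes an 8-16KB chunk on
a 6 drive RAID6 come out at 32-64KB. The function name and parameters
are hypothetical and do not correspond to any md or mdadm interface.

```python
def data_stripe_kb(chunk_kb: int, n_drives: int, n_parity: int) -> int:
    """Data stripe width in KB: chunk size times the number of data drives.

    Assumption: parity chunks are not counted as part of the stripe width.
    """
    data_drives = n_drives - n_parity
    if data_drives < 1:
        raise ValueError("need at least one data drive")
    return chunk_kb * data_drives


# A 6 drive RAID6 carries 2 parity chunks per stripe, leaving 4 data chunks.
for chunk_kb in (8, 16):
    stripe_kb = data_stripe_kb(chunk_kb, n_drives=6, n_parity=2)
    print(f"{chunk_kb}KB chunk -> {stripe_kb}KB data stripe")
```

The same function with n_parity=1 gives the RAID5 case for comparison.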
If your workload is HPC file serving, where user files are 10s to 100s
of GB, even TBs in size, then you'd want the largest chunk/strip/stripe
your hardware can perform well with. This may be as low as 512KB or it
may be as large as 2MB. And it will likely be hardware based RAID, not
Linux md.

--
Stan
* Re: Small chunk size read performance penalty
From: Roberto Spadim @ 2013-08-19 3:01 UTC (permalink / raw)
To: Ian Pilcher; +Cc: Linux-RAID
Hi Ian,

Just some points that you should consider. This is only an idea about
the theory; don't treat it as a definitive guide to linux-raid or
anything else.

Some basics: a disk has one head arm. In other words, it can read/write
a sequence of bits, move, read/write more bits, move, and so on.

When you use striping (RAID-0, 5, 6, RAID-10) you change how data is
written. Instead of a continuous byte stream, you divide it into chunks:
chunk 1 at disk 1 position 0, chunk 2 at disk 2 position 0, chunk 3 at
disk 1 position 1, and so on. This keeps each disk positioned close to
its own chunks.

Example, considering a read from position 0 to the last position of the
array: the disk used = chunk id % number_of_disks. With a chunk of 128MB
and two disks, reading 1GB, you will read 0-128MB from disk 1,
128-256MB from disk 2, 256-384MB from disk 1, and so on.

When you read more than one chunk, you use two head arms (two disks),
and here you get a speed boost. You must check what the best chunk size
is for your workload (and chunking should only be used when needed; on
some workloads RAID-1 is better). Using two heads you can read/write
faster, but this is only nice for a continuous stream.

When you need parallel work (many threads) you can use RAID-1 to read in
parallel. Since it doesn't have chunks, it can read data using only one
disk; in other words, the workload can (when possible) be shared at one
disk per thread. If you have 10 disks using RAID-1, you can get nice
performance for 10 threads without problems. For writes, the slowest
disk will stall write performance, but you should consider what you
need.
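[Illustration, not part of the original message.] The "chunk id %
number_of_disks" rule can be sketched in Python as follows. This is a
simplified RAID-0 style mapping only; it ignores parity placement, so it
is not how md lays out RAID-5 or RAID-6. The function names are
hypothetical, and disks are indexed from 0 internally and printed from 1
to match the example above.

```python
MIB = 1024 * 1024


def disk_for_offset(offset: int, chunk_size: int, n_disks: int) -> int:
    """Return the 0-based disk index holding the byte at `offset`."""
    chunk_id = offset // chunk_size
    return chunk_id % n_disks


def layout(total: int, chunk_size: int, n_disks: int):
    """Yield (start, end, disk) for each chunk in the first `total` bytes."""
    for start in range(0, total, chunk_size):
        end = min(start + chunk_size, total)
        yield start, end, disk_for_offset(start, chunk_size, n_disks)


# 1GB read, 128MB chunks, two disks: the chunks alternate between disks.
for start, end, disk in layout(1024 * MIB, 128 * MIB, 2):
    print(f"{start // MIB:4d}-{end // MIB:4d} MB -> disk {disk + 1}")
```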
That's a superficial explanation; the implementation can be a bit
different, but it explains the idea behind the use of chunks.

The workload tells you what's better. In my system sometimes RAID-1 is
better than RAID-10 because I have many threads reading different parts
of the disk. In this case I can add many head arms (disks), one for each
important thread, and I get good performance (considering a high-load
system). But if you need fast continuous reads, chunking is VERY good.

For example, if you need a read of 1GB and have 10 disks, you can get
10x performance with a good chunk size, since each disk will be used
when a chunk read is requested, and each disk can read in parallel and
continuously (without much head movement). If you set a chunk size of
1GB and you need a read of 1GB, you don't get any performance boost from
chunks.

The best thing to do is TEST with your workload.

I think this can help; if not, delete it from your mailbox :)
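[Illustration, not part of the original message.] The point about a 1GB
read on 10 disks can be made concrete by counting how many disks a
single read spans for a given chunk size. This Python sketch assumes the
same simplified RAID-0 style mapping (consecutive chunks on consecutive
disks, no parity), and the function name is hypothetical. The result is
only an upper bound on read parallelism, not a throughput prediction.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def disks_touched(offset: int, length: int, chunk_size: int, n_disks: int) -> int:
    """Count the distinct disks a read of `length` bytes at `offset` spans."""
    if length <= 0:
        return 0
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    n_chunks = last_chunk - first_chunk + 1
    # Consecutive chunks sit on consecutive disks, so the count saturates
    # at the number of disks in the array.
    return min(n_chunks, n_disks)


# A 1GB read on 10 disks with different chunk sizes.
for chunk in (64 * MIB, 512 * MIB, 1 * GIB):
    n = disks_touched(0, 1 * GIB, chunk, n_disks=10)
    print(f"chunk {chunk // MIB:5d} MB -> {n} disk(s) involved")
```

With a 1GB chunk the aligned read stays on one disk, which matches the
"no boost from chunks" case described above.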