* Fastest Chunk Size w/XFS For MD Software RAID = 1024k
@ 2007-06-27 23:20 Justin Piszcz
2007-06-27 23:20 ` Justin Piszcz
[not found] ` <46832E60.9000006@rabbit.us>
0 siblings, 2 replies; 11+ messages in thread
From: Justin Piszcz @ 2007-06-27 23:20 UTC (permalink / raw)
To: linux-raid, xfs; +Cc: Alan Piszcz
The results speak for themselves:
http://home.comcast.net/~jpiszcz/chunk/index.html
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-27 23:20 Fastest Chunk Size w/XFS For MD Software RAID = 1024k Justin Piszcz
@ 2007-06-27 23:20 ` Justin Piszcz
2007-06-27 23:24 ` Justin Piszcz
2007-06-28 5:08 ` David Chinner
[not found] ` <46832E60.9000006@rabbit.us>
1 sibling, 2 replies; 11+ messages in thread
From: Justin Piszcz @ 2007-06-27 23:20 UTC (permalink / raw)
To: linux-raid, xfs; +Cc: Alan Piszcz

For drives with 16MB of cache (in this case, raptors).

Justin.

On Wed, 27 Jun 2007, Justin Piszcz wrote:

> The results speak for themselves:
>
> http://home.comcast.net/~jpiszcz/chunk/index.html
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-27 23:20 ` Justin Piszcz
@ 2007-06-27 23:24 ` Justin Piszcz
2007-06-28 5:08 ` David Chinner
1 sibling, 0 replies; 11+ messages in thread
From: Justin Piszcz @ 2007-06-27 23:24 UTC (permalink / raw)
To: linux-raid, xfs; +Cc: Alan Piszcz

For the e-mail archives:

p34-128k-chunk,15696M,77236.3,99,445653,86.3333,192267,34.3333,78773.7,99,524463,41,594.9,0,16:100000:16/64,1298.67,10.6667,5964.33,17.3333,3035.67,18.3333,1512,13.6667,5334.33,16,2634.67,19
p34-512k-chunk,15696M,78383,99,436842,86,162969,27,79624,99,486892,38,583.0,0,16:100000:16/64,2019,17,9715,29,4272,23,2250,22,17095,45,3691,30
p34-1024k-chunk,15696M,77672.3,99,455267,87.3333,183772,29.6667,79601.3,99,578225,43.3333,595.933,0,16:100000:16/64,2085.67,18,12953,39,3908.33,23.3333,2375.33,23.3333,18492,51.6667,3388.33,27
p34-4096k-chunk,15696M,33791.1,43.5556,176630,37.3333,72235.1,11.5556,34424.9,44,247925,18.2222,271.644,0,16:100000:16/64,560,4.88889,2928,8.88889,1039.56,5.77778,571.556,5.33333,1729.78,5.33333,1289.33,9.33333

On Wed, 27 Jun 2007, Justin Piszcz wrote:

> For drives with 16MB of cache (in this case, raptors).
>
> Justin.
>
> On Wed, 27 Jun 2007, Justin Piszcz wrote:
>
>> The results speak for themselves:
>>
>> http://home.comcast.net/~jpiszcz/chunk/index.html
>>

^ permalink raw reply [flat|nested] 11+ messages in thread
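Editor's note: the four result rows above look like bonnie++ CSV output. The sketch below pulls out the sequential block write and block read columns so the chunk sizes can be compared side by side. The column positions (0-based fields 4 and 10) are an assumption based on the bonnie++ 1.03 CSV layout; the thread only confirms that the 578225 value in the 1024k row is the block read rate. The names `RESULTS`, `load` and the constants are made up for illustration.

```python
import csv
import io

# The four result rows exactly as posted in the thread.
RESULTS = """\
p34-128k-chunk,15696M,77236.3,99,445653,86.3333,192267,34.3333,78773.7,99,524463,41,594.9,0,16:100000:16/64,1298.67,10.6667,5964.33,17.3333,3035.67,18.3333,1512,13.6667,5334.33,16,2634.67,19
p34-512k-chunk,15696M,78383,99,436842,86,162969,27,79624,99,486892,38,583.0,0,16:100000:16/64,2019,17,9715,29,4272,23,2250,22,17095,45,3691,30
p34-1024k-chunk,15696M,77672.3,99,455267,87.3333,183772,29.6667,79601.3,99,578225,43.3333,595.933,0,16:100000:16/64,2085.67,18,12953,39,3908.33,23.3333,2375.33,23.3333,18492,51.6667,3388.33,27
p34-4096k-chunk,15696M,33791.1,43.5556,176630,37.3333,72235.1,11.5556,34424.9,44,247925,18.2222,271.644,0,16:100000:16/64,560,4.88889,2928,8.88889,1039.56,5.77778,571.556,5.33333,1729.78,5.33333,1289.33,9.33333
"""

# ASSUMPTION: 0-based field positions in the bonnie++ 1.03 CSV layout.
BLOCK_WRITE_FIELD = 4   # sequential block output, K/sec
BLOCK_READ_FIELD = 10   # sequential block input, K/sec


def load(text):
    """Return {run name: (block write K/s, block read K/s)}."""
    rows = {}
    for fields in csv.reader(io.StringIO(text)):
        if fields:
            rows[fields[0]] = (
                float(fields[BLOCK_WRITE_FIELD]),
                float(fields[BLOCK_READ_FIELD]),
            )
    return rows


rows = load(RESULTS)
best_read = max(rows, key=lambda name: rows[name][1])
best_write = max(rows, key=lambda name: rows[name][0])

for name, (write_kps, read_kps) in rows.items():
    # Divide by 1000 to match the "578MB/s" rounding used later in the thread.
    print(f"{name:16s} write {write_kps / 1000:6.1f} MB/s  read {read_kps / 1000:6.1f} MB/s")
print("fastest block read: ", best_read)
print("fastest block write:", best_write)
```

Under that column assumption, the 1024k run leads on both block write and block read, and the 4096k run is far behind on every column.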
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-27 23:20 ` Justin Piszcz
2007-06-27 23:24 ` Justin Piszcz
@ 2007-06-28 5:08 ` David Chinner
2007-06-28 7:53 ` David Greaves
2007-06-28 8:07 ` Justin Piszcz
1 sibling, 2 replies; 11+ messages in thread
From: David Chinner @ 2007-06-28 5:08 UTC (permalink / raw)
To: Justin Piszcz; +Cc: linux-raid, xfs, Alan Piszcz

On Wed, Jun 27, 2007 at 07:20:42PM -0400, Justin Piszcz wrote:
> For drives with 16MB of cache (in this case, raptors).

That's four (4) drives, right?

If so, how do you get a block read rate of 578MB/s from
4 drives? That's 145MB/s per drive....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-28 5:08 ` David Chinner
@ 2007-06-28 7:53 ` David Greaves
2007-06-28 8:07 ` Justin Piszcz
1 sibling, 0 replies; 11+ messages in thread
From: David Greaves @ 2007-06-28 7:53 UTC (permalink / raw)
To: David Chinner; +Cc: Justin Piszcz, linux-raid, xfs, Alan Piszcz

David Chinner wrote:
> On Wed, Jun 27, 2007 at 07:20:42PM -0400, Justin Piszcz wrote:
>> For drives with 16MB of cache (in this case, raptors).
>
> That's four (4) drives, right?

I'm pretty sure he's using 10 - email a few days back...

>>>>>> Justin Piszcz wrote:
>>>>> Running test with 10 RAPTOR 150 hard drives, expect it to take
>>>>> awhile until I get the results, avg them etc. :)

> If so, how do you get a block read rate of 578MB/s from
> 4 drives? That's 145MB/s per drive....

Which gives a far more reasonable 60MB/s per drive...

David

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-28 5:08 ` David Chinner
2007-06-28 7:53 ` David Greaves
@ 2007-06-28 8:07 ` Justin Piszcz
1 sibling, 0 replies; 11+ messages in thread
From: Justin Piszcz @ 2007-06-28 8:07 UTC (permalink / raw)
To: David Chinner; +Cc: linux-raid, xfs, Alan Piszcz

10 disks total.

Justin.

On Thu, 28 Jun 2007, David Chinner wrote:

> On Wed, Jun 27, 2007 at 07:20:42PM -0400, Justin Piszcz wrote:
>> For drives with 16MB of cache (in this case, raptors).
>
> That's four (4) drives, right?
>
> If so, how do you get a block read rate of 578MB/s from
> 4 drives? That's 145MB/s per drive....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
>

^ permalink raw reply [flat|nested] 11+ messages in thread
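Editor's note: the per-drive figures in this sub-thread are plain division of the array's block read rate by the number of member disks. A minimal sketch follows, assuming sequential reads are spread evenly over every member (reasonable for RAID5 with rotating parity, but a simplification). The function name `per_drive_mb_s` is illustrative only.

```python
def per_drive_mb_s(array_mb_s, drives):
    """Average throughput each member disk must sustain.

    ASSUMPTION: the load is spread evenly across all member disks.
    """
    if drives <= 0:
        raise ValueError("drives must be positive")
    return array_mb_s / drives


ARRAY_READ_MB_S = 578  # block read rate quoted in the thread

four = per_drive_mb_s(ARRAY_READ_MB_S, 4)    # the guess of 4 drives
ten = per_drive_mb_s(ARRAY_READ_MB_S, 10)    # the actual 10-disk array

print(f"4 drives:  {four:.1f} MB/s per drive")
print(f"10 drives: {ten:.1f} MB/s per drive")
```

The 4-drive case works out to 144.5 MB/s, which the thread rounds to 145; the 10-drive case is 57.8 MB/s, the "far more reasonable 60MB/s" mentioned above.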
[parent not found: <46832E60.9000006@rabbit.us>]
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
[not found] ` <46832E60.9000006@rabbit.us>
@ 2007-06-28 8:07 ` Justin Piszcz
[not found] ` <46837056.4050306@rabbit.us>
0 siblings, 1 reply; 11+ messages in thread
From: Justin Piszcz @ 2007-06-28 8:07 UTC (permalink / raw)
To: Peter Rabbitson; +Cc: linux-raid, xfs, Alan Piszcz

mdadm --create \
      --verbose /dev/md3 \
      --level=5 \
      --raid-devices=10 \
      --chunk=1024 \
      --force \
      --run /dev/sd[cdefghijkl]1

Justin.

On Thu, 28 Jun 2007, Peter Rabbitson wrote:

> Justin Piszcz wrote:
>> The results speak for themselves:
>>
>> http://home.comcast.net/~jpiszcz/chunk/index.html
>>
>
>
> What is the array layout (-l ? -n ? -p ?)
>

^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <46837056.4050306@rabbit.us>]
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
[not found] ` <46837056.4050306@rabbit.us>
@ 2007-06-28 8:27 ` Justin Piszcz
2007-06-28 22:05 ` David Chinner
2007-06-28 9:05 ` Matti Aarnio
1 sibling, 1 reply; 11+ messages in thread
From: Justin Piszcz @ 2007-06-28 8:27 UTC (permalink / raw)
To: Peter Rabbitson; +Cc: linux-raid, xfs, Alan Piszcz

On Thu, 28 Jun 2007, Peter Rabbitson wrote:

> Justin Piszcz wrote:
>> mdadm --create \
>>       --verbose /dev/md3 \
>>       --level=5 \
>>       --raid-devices=10 \
>>       --chunk=1024 \
>>       --force \
>>       --run
>>       /dev/sd[cdefghijkl]1
>>
>> Justin.
>
> Interesting, I came up with the same results (1M chunk being superior) with a
> completely different raid set with XFS on top:
>
> mdadm --create \
>       --level=10 \
>       --chunk=1024 \
>       --raid-devices=4 \
>       --layout=f3 \
>       ...
>
> Could it be attributed to XFS itself?
>
> Peter
>

Good question, by the way how much cache do the drives have that you are
testing with?

Justin.

^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-28 8:27 ` Justin Piszcz
@ 2007-06-28 22:05 ` David Chinner
0 siblings, 0 replies; 11+ messages in thread
From: David Chinner @ 2007-06-28 22:05 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Peter Rabbitson, linux-raid, xfs, Alan Piszcz

On Thu, Jun 28, 2007 at 04:27:15AM -0400, Justin Piszcz wrote:
>
>
> On Thu, 28 Jun 2007, Peter Rabbitson wrote:
>
> >Justin Piszcz wrote:
> >>mdadm --create \
> >>      --verbose /dev/md3 \
> >>      --level=5 \
> >>      --raid-devices=10 \
> >>      --chunk=1024 \
> >>      --force \
> >>      --run
> >>      /dev/sd[cdefghijkl]1
> >>
> >>Justin.
> >
> >Interesting, I came up with the same results (1M chunk being superior)
> >with a completely different raid set with XFS on top:
> >
> >mdadm --create \
> >      --level=10 \
> >      --chunk=1024 \
> >      --raid-devices=4 \
> >      --layout=f3 \
> >      ...
> >
> >Could it be attributed to XFS itself?

More likely it's related to the I/O size being sent to the disks. The larger
the chunk size, the larger the I/O hitting each disk. I think the maximum I/O
size is 512k ATM on x86(_64), so a chunk of 1MB will guarantee that there are
maximally sized I/Os being sent to the disk....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply [flat|nested] 11+ messages in thread
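Editor's note: the relationship between chunk size and per-disk I/O size described above can be sketched with a toy model. It splits one contiguous request at chunk boundaries, the way a striped array hands consecutive chunks to different member disks. This is a simplified illustration, not how the md driver is implemented: parity placement and request merging are ignored, and `split_request` is a hypothetical helper. The 512k maximum request size is the figure quoted in the message above.

```python
KIB = 1024


def split_request(offset, length, chunk):
    """Split the byte range [offset, offset + length) at chunk boundaries.

    Returns a list of (chunk_index, piece_length_in_bytes). In a striped
    array, consecutive chunk indexes live on different member disks, so
    each piece is the largest I/O a single disk sees for this request.
    """
    pieces = []
    end = offset + length
    while offset < end:
        index = offset // chunk
        chunk_end = (index + 1) * chunk
        piece = min(end, chunk_end) - offset
        pieces.append((index, piece))
        offset += piece
    return pieces


MAX_IO = 512 * KIB  # maximum request size mentioned in the thread

for chunk_kib in (128, 512, 1024):
    pieces = split_request(0, MAX_IO, chunk_kib * KIB)
    largest_kib = max(size for _, size in pieces) // KIB
    print(f"chunk {chunk_kib:4d}k: {len(pieces)} piece(s), largest {largest_kib}k")
```

With a 128k chunk, an aligned 512k request is cut into four 128k pieces; with a 1024k chunk it reaches one disk whole. Requests that straddle a chunk boundary are still split in this model, so the sketch shows the tendency rather than a strict guarantee.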
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
[not found] ` <46837056.4050306@rabbit.us>
2007-06-28 8:27 ` Justin Piszcz
@ 2007-06-28 9:05 ` Matti Aarnio
2007-06-28 13:27 ` Jon Nelson
1 sibling, 1 reply; 11+ messages in thread
From: Matti Aarnio @ 2007-06-28 9:05 UTC (permalink / raw)
To: Peter Rabbitson; +Cc: Justin Piszcz, linux-raid, xfs, Alan Piszcz

On Thu, Jun 28, 2007 at 10:24:54AM +0200, Peter Rabbitson wrote:
> Interesting, I came up with the same results (1M chunk being superior)
> with a completely different raid set with XFS on top:
>
> mdadm --create \
>       --level=10 \
>       --chunk=1024 \
>       --raid-devices=4 \
>       --layout=f3 \
>       ...
>
> Could it be attributed to XFS itself?

Sort of..

/dev/md4:
        Version : 00.90.03
     Raid Level : raid5
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 4
 Active Devices : 4
Working Devices : 4
         Layout : left-symmetric
     Chunk Size : 256K

This means there are 3x 256k for the user data..
Now I had to carefully tune the XFS bsize/sunit/swidth to match that:

meta-data=/dev/DataDisk/lvol0    isize=256    agcount=32, agsize=7325824 blks
         =                       sectsz=512   attr=1
data     =                       bsize=4096   blocks=234426368, imaxpct=25
         =                       sunit=64     swidth=192 blks, unwritten=1
...

That is, 4k * 64 = 256k, and 64 * 3 = 192.
With that, bulk writing on the file system runs without the need to read
back blocks of disk space to calculate RAID5 parity data, which is what
happens when the filesystem's idea of a block does not align with the
RAID5 surface.

I do have LVM in between the MD-RAID5 and XFS, so I did also align
the LVM to that 3 * 256k.

Doing this alignment thing did boost write performance by
nearly a factor of 2 from mkfs.xfs with default parameters.

With very wide RAID5, like the original question... I would
find it very surprising if the alignment of upper layers to
MD-RAID level would not be important there as well.

Very small continuous writing does not make good use of the disk
mechanism (seek time, rotation delay), so something in the order of
128k-1024k will speed things up -- presuming that when you
are writing, you are doing it many MB at a time. Database
transactions are a lot smaller, and are indeed harmed by such
large megachunk-IO oriented surfaces.

RAID levels 0 and 1 (and 10) do not have the need of reading
back parts of the surface because a subset of it was not
altered by an incoming write.

Some DB application on top of the filesystem would benefit
if we had a way for it to ask about these alignment boundary
issues, so it could read a whole alignment block even though
it writes out only a subset of it. (Theory being that those
same blocks would also exist in memory cache and thus be
available for write-back parity calculation.)

> Peter

/Matti Aarnio

^ permalink raw reply [flat|nested] 11+ messages in thread
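Editor's note: the sunit/swidth arithmetic in the message above (4k * 64 = 256k, 64 * 3 = 192) generalizes to other array shapes. A minimal sketch follows, assuming a parity RAID where one member's worth of each stripe holds parity (RAID5) and a 4096-byte filesystem block. The values are in filesystem blocks, as in the xfs_info output quoted above. The function name and its parameters are illustrative and not part of any tool.

```python
def xfs_stripe_geometry(chunk_kib, raid_devices, parity_devices=1, block_size=4096):
    """Return (sunit, swidth) in filesystem blocks for a striped array.

    sunit  = one chunk expressed in filesystem blocks
    swidth = sunit times the number of data-bearing members per stripe
    ASSUMPTION: RAID5-style layout with `parity_devices` members' worth
    of parity per stripe; pass 0 for plain striping.
    """
    data_disks = raid_devices - parity_devices
    if data_disks <= 0:
        raise ValueError("need at least one data disk")
    chunk_bytes = chunk_kib * 1024
    if chunk_bytes % block_size:
        raise ValueError("chunk size must be a multiple of the block size")
    sunit = chunk_bytes // block_size
    return sunit, sunit * data_disks


# The 4-disk RAID5 with 256k chunks from the message above.
small = xfs_stripe_geometry(256, 4)
# The 10-disk RAID5 with 1024k chunks from earlier in the thread.
wide = xfs_stripe_geometry(1024, 10)

print("4 x 256k RAID5:   sunit=%d swidth=%d blks" % small)
print("10 x 1024k RAID5: sunit=%d swidth=%d blks" % wide)
```

The first call reproduces the sunit=64, swidth=192 shown in the xfs_info output. For the 10-disk array the same arithmetic gives sunit=256 and swidth=2304 blocks, that is, a 9 MiB full stripe. When handing such values to mkfs.xfs, check its manual page for the units each option expects, since they differ from the block counts printed by xfs_info.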
* Re: Fastest Chunk Size w/XFS For MD Software RAID = 1024k
2007-06-28 9:05 ` Matti Aarnio
@ 2007-06-28 13:27 ` Jon Nelson
0 siblings, 0 replies; 11+ messages in thread
From: Jon Nelson @ 2007-06-28 13:27 UTC (permalink / raw)
To: Matti Aarnio; +Cc: Peter Rabbitson, Justin Piszcz, linux-raid, xfs, Alan Piszcz

On Thu, 28 Jun 2007, Matti Aarnio wrote:

> I do have LVM in between the MD-RAID5 and XFS, so I did also align
> the LVM to that 3 * 256k.

How did you align the LVM ?

--
Jon Nelson <jnelson-linux-raid@jamponi.net>

^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-06-29 5:38 UTC | newest]
Thread overview: 11+ messages
2007-06-27 23:20 Fastest Chunk Size w/XFS For MD Software RAID = 1024k Justin Piszcz
2007-06-27 23:20 ` Justin Piszcz
2007-06-27 23:24 ` Justin Piszcz
2007-06-28 5:08 ` David Chinner
2007-06-28 7:53 ` David Greaves
2007-06-28 8:07 ` Justin Piszcz
[not found] ` <46832E60.9000006@rabbit.us>
2007-06-28 8:07 ` Justin Piszcz
[not found] ` <46837056.4050306@rabbit.us>
2007-06-28 8:27 ` Justin Piszcz
2007-06-28 22:05 ` David Chinner
2007-06-28 9:05 ` Matti Aarnio
2007-06-28 13:27 ` Jon Nelson