* specify agsize?
From: aurfalien @ 2013-07-14 0:11 UTC
To: xfs
Hello again,
I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
So I do;
mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
And I get;
meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=6701716480, imaxpct=5
         =                       sunit=32     swidth=448 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=131072, version=2
         =                       sectsz=512   sunit=32 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
All is fine but I was recently made aware of tweaking agsize. So I would like to mess around and iozone any diffs between the above agcount of 32 and whatever agcount changes I may do.
I didn't see any mention of agsize/agcount on the XFS FAQ and would like to know, based on the above, why does XFS think I have 32 allocation groups with the corresponding size? And are these optimal numbers?
Thanks in advance,
- aurf
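
As a side note for readers, the numbers in the mkfs.xfs output above can be cross-checked with a little arithmetic. The following is a minimal sketch in Python that uses only values printed in the output and on the command line; the variable names are illustrative and are not anything mkfs.xfs itself exposes.

```python
# Values copied from the mkfs.xfs output above.
BSIZE = 4096              # data block size in bytes (bsize=4096)
AGCOUNT = 32              # agcount=32
AGSIZE_BLKS = 209428640   # agsize=209428640 blks
DBLOCKS = 6701716480      # blocks=6701716480
SUNIT_BLKS = 32           # sunit=32 blks
SWIDTH_BLKS = 448         # swidth=448 blks

# Values passed on the mkfs.xfs command line: -d su=128k,sw=14
SU_BYTES = 128 * 1024
SW = 14

TIB = 2 ** 40
GIB = 2 ** 30

# sunit/swidth are reported in filesystem blocks, while su is given in bytes.
sunit_bytes = SUNIT_BLKS * BSIZE
swidth_bytes = SWIDTH_BLKS * BSIZE

fs_bytes = DBLOCKS * BSIZE
ag_bytes = AGSIZE_BLKS * BSIZE

print("sunit matches su:", sunit_bytes == SU_BYTES)
print("swidth matches su*sw:", swidth_bytes == SU_BYTES * SW)
print("agcount * agsize == blocks:", AGCOUNT * AGSIZE_BLKS == DBLOCKS)
print("filesystem size: %.2f TiB" % (fs_bytes / TIB))
print("AG size: %.1f GiB" % (ag_bytes / GIB))
```

All three consistency checks hold for this output: the stripe unit and width reported in blocks are the same geometry given as su=128k,sw=14, and the 32 allocation groups divide the data blocks exactly.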
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: specify agsize?
From: Eric Sandeen @ 2013-07-14 2:13 UTC
To: aurfalien; +Cc: xfs

On 7/13/13 7:11 PM, aurfalien wrote:
> Hello again,
>
> I have a Raid 6 x16 disk array with 128k stripe size and a 512 byte block size.
>
> So I do;
>
> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
>
> And I get;
>
> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>          =                       sectsz=512   attr=2, projid32bit=0
> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>          =                       sunit=32     swidth=448 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal log           bsize=4096   blocks=131072, version=2
>          =                       sectsz=512   sunit=32 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
>
>
> All is fine but I was recently made aware of tweaking agsize.

Made aware by what? For what reason?

> So I would like to mess around and iozone any diffs between the above
> agcount of 32 and whatever agcount changes I may do.

Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.

> I didn't see any mention of agsize/agcount on the XFS FAQ and would
> like to know, based on the above, why does XFS think I have 32
> allocation groups with the corresponding size?

It doesn't think so, it _knows_ so, because it made them itself. ;)

> And are these optimal
> numbers?

How high is up?

Here's the appropriate faq entry:

http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E

-Eric

> Thanks in advance,
>
> - aurf

* Re: specify agsize?
From: aurfalien @ 2013-07-14 4:20 UTC
To: Eric Sandeen; +Cc: xfs

On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
> On 7/13/13 7:11 PM, aurfalien wrote:
>> All is fine but I was recently made aware of tweaking agsize.
>
> Made aware by what? For what reason?

Autodesk has this software called Flame which requires very very fast local storage using XFS. They have an entire write up on how to calc proper agsize for optimal performance.

I never mess with agsize but it is required when creating the XFS file system for use with Flame. I realize it's tailored for their app's particular IO characteristics, so I'm curious about it.

>> So I would like to mess around and iozone any diffs between the above
>> agcount of 32 and whatever agcount changes I may do.
>
> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.

Well, it will give me a baseline comparison of non tweaked agsize vs tweaked agsize.

>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>> like to know, based on the above, why does XFS think I have 32
>> allocation groups with the corresponding size?
>
> It doesn't think so, it _knows_ so, because it made them itself. ;)

Yea but based on what?

Why 32 at their current size?

>> And are these optimal
>> numbers?
>
> How high is up?
>
> Here's the appropriate faq entry:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E

Problem is I run CentOS so the line;

"As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS."

... doesn't really apply.

- aurf

* Re: specify agsize?
From: Stan Hoeppner @ 2013-07-14 7:06 UTC
To: aurfalien; +Cc: Eric Sandeen, xfs

On 7/13/2013 11:20 PM, aurfalien wrote:
...
>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 /dev/mapper/vg_doofus_data-lv_data
...
>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256    agcount=32, agsize=209428640 blks
>>>          =                       sectsz=512   attr=2, projid32bit=0
>>> data     =                       bsize=4096   blocks=6701716480, imaxpct=5
>>>          =                       sunit=32     swidth=448 blks
>>> naming   =version 2              bsize=4096   ascii-ci=0
>>> log      =internal log           bsize=4096   blocks=131072, version=2
>>>          =                       sectsz=512   sunit=32 blks, lazy-count=1
>>> realtime =none                   extsz=4096   blocks=0, rtextents=0
...
> Autodesk has this software called Flame which requires very very fast local storage using XFS.

If "Flame" does any random writes then you probably shouldn't be using RAID6.

> They have an entire write up on how to calc proper agsize for optimal performance.

I think you're confused. Maximum agsize is 1TB. Making your AGs smaller than that won't decrease application performance, so it's literally impossible to tune agsize to increase performance. agcount on the other hand can potentially have an effect if the application is sufficiently threaded. But agcount doesn't mean anything in isolation. It's tied directly to the characteristics of the RAID level and hardware. For example, mkfs.xfs gave you 32 AGs for this 14 spindle array. One could make 32 AGs on a single 4TB SATA disk and the performance of the two will be radically different.

...
> Well, it will give me a baseline comparison of non tweaked agsize vs tweaked agsize.

No, it won't. See above.

> Yea but based on what?

Based on the fact that your XFS is ~26TB.

mkfs.xfs could have given you 26 AGs of ~1TB each. But it chose to give you 32 AGs of ~815GB each. Whether you run bonnie, iozone, or your Flame application, you won't be able to measure a meaningful difference, if any difference, between 26 and 32 AGs.

...
> Problem is I run CentOS so the line;
>
> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS."
>
> ... doesn't really apply.

This makes no sense. What doesn't apply?

You can change to noop or deadline with a single echo command in a startup script:

echo noop > /sys/block/sdX/queue/scheduler

where sdX is the name of your RAID device.

--
Stan
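
For readers who want to see which scheduler a device is using before changing it: on Linux the same sysfs file lists the available schedulers and marks the active one in square brackets. The following is a minimal sketch in Python. The function names are illustrative, `sdX` remains a placeholder for the real device, and the sketch assumes the file has the usual `noop deadline [cfq]` shape.

```python
import re

def active_scheduler(text):
    """Return the scheduler marked with [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None

def read_scheduler(device):
    """Read /sys/block/<device>/queue/scheduler and return the active scheduler."""
    path = "/sys/block/%s/queue/scheduler" % device
    with open(path) as f:
        return active_scheduler(f.read())

if __name__ == "__main__":
    # Example string in the format the sysfs file uses; no device is touched here.
    print(active_scheduler("noop anticipatory deadline [cfq]"))
```

Calling `read_scheduler("sdX")` with the real device name before and after the echo command above is one way to confirm that the change took effect.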

* Re: specify agsize?
From: aurfalien @ 2013-07-14 16:56 UTC
To: stan; +Cc: Eric Sandeen, xfs

On Jul 14, 2013, at 12:06 AM, Stan Hoeppner wrote:
>> Problem is I run CentOS so the line;
>>
>> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS."
>>
>> ... doesn't really apply.
>
> This makes no sense. What doesn't apply?

Well, I had assumed it meant Linux kernel version 3.2.12, whereas CentOS is at whatever RHEL is at, being 2.6.32.

At any rate, what I'm getting from you all is to leave the agcount alone, as agsize will max out at 1TB and agcount will adjust depending on volume size.

This volume will encounter a lot of random IO so 32 AGs will suffice at any rate. Unsure if increasing it to Autodesk's 128 will really help my env. I'm assuming they want a lot of parallelism, which again doesn't apply in my case.

- aurf

* Re: specify agsize?
From: Dave Chinner @ 2013-07-15 1:07 UTC
To: Stan Hoeppner; +Cc: xfs, Eric Sandeen, aurfalien

On Sun, Jul 14, 2013 at 02:06:43AM -0500, Stan Hoeppner wrote:
> On 7/13/2013 11:20 PM, aurfalien wrote:
> > Autodesk has this software called Flame which requires very very fast local storage using XFS.
>
> If "Flame" does any random writes then you probably shouldn't be using
> RAID6.

Oh, we are talking about flame/smoke/lustre rendering environments here. Go back 5 years, a renderwall compositing effects via smoke was one of the nastiest small random write workloads you could throw at a filesystem. It was often used to benchmark file server performance for renderwalls and still may be.

Think of a workload that reads lots of shared texture files across thousands of machines, each crunching a single video frame to add an effect and all doing small random writes to the video frame as it modifies a small section of each line of the video frame....

Translation: tuning for AG size is a waste of time.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

* Re: specify agsize?
From: Eric Sandeen @ 2013-07-14 16:14 UTC
To: aurfalien; +Cc: xfs

On 7/13/13 11:20 PM, aurfalien wrote:
> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote:
>> On 7/13/13 7:11 PM, aurfalien wrote:
>>> All is fine but I was recently made aware of tweaking agsize.
>>
>> Made aware by what? For what reason?
>
> Autodesk has this software called Flame which requires very very fast
> local storage using XFS. They have an entire write up on how to calc
> proper agsize for optimal performance.

http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199

I guess?

That's quite a procedure! And I have to say, a slightly strange one at first glance.

It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.

In the end, I think they are trying to create 128 AGs and maybe work around some mkfs corner case or other.

> I never mess with agsize but it is required when creating the XFS
> file system for use with Flame. I realize it's tailored for their
> app's particular IO characteristics, so I'm curious about it.

In general more AGs allow more concurrency for some operations; it also will generally change how/where files in multiple directories get allocated.

>>> So I would like to mess around and iozone any diffs between the above
>>> agcount of 32 and whatever agcount changes I may do.
>>
>> Unless iozone is your machine's normal workload, that will probably prove to be uninteresting.
>
> Well, it will give me a baseline comparison of non tweaked agsize vs tweaked agsize.

Not necessarily, see above; I'm not sure what iozone invocation would show any effects from more or fewer AGs. Anyway, iozone != flame, not by a long shot! :)

>>> I didn't see any mention of agsize/agcount on the XFS FAQ and would
>>> like to know, based on the above, why does XFS think I have 32
>>> allocation groups with the corresponding size?
>>
>> It doesn't think so, it _knows_ so, because it made them itself. ;)
>
> Yea but based on what?
>
> Why 32 at their current size?

see calc_default_ag_geometry()

Since you are in multidisk mode (you have stripe geometry) it uses more AGs, since it knows you have more spindles:

	} else if (dblocks > GIGABYTES(512, blocklog))
		shift = 5;

2^5 = 32

If you hadn't been in multidisk mode you would have gotten 25 AGs due to the max AG size of 1T.

>>> And are these optimal
>>> numbers?
>>
>> How high is up?
>>
>> Here's the appropriate faq entry:
>>
>> http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E
>
> Problem is I run CentOS so the line;
>
> "As of kernel 3.2.12, the default i/o scheduler, CFQ, will defeat much of the parallelization in XFS."
>
> ... doesn't really apply.

Well, my point was that your original question, "are these optimal numbers?" included absolutely no context of your workload, so the best answer is yes - the default mkfs behavior is optimal for a generic, unspecified workload.

I don't have access to Autodesk Flame so I really don't know how it behaves or what an optimal tuning might be.

Anyway, I think the calc_default_ag_geometry() info above answered your original question of "why does XFS think I have 32 allocation groups with the corresponding size?" - that's simply the default mkfs algorithm when in multidisk mode, for a disk of this size.

-Eric
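
The rule Eric describes can be sketched in a few lines. This is a simplified illustration in Python, not the xfsprogs code: it covers only the two cases mentioned in the message above (multidisk mode on a filesystem larger than 512 GiB, and the non-multidisk case limited by the 1 TiB maximum AG size). The function and constant names are made up for the example, and the round-up step and the "maximum-size AGs" choice are assumptions of the sketch.

```python
BLOCKLOG = 12                 # log2 of the 4096-byte block size
DBLOCKS = 6701716480          # blocks=6701716480 from the mkfs output
MAX_AG_BYTES = 2 ** 40        # "max AG size of 1T" from the thread

def gigabytes_in_blocks(count, blocklog):
    """Number of filesystem blocks in `count` GiB."""
    return (count << 30) >> blocklog

def sketch_agcount(dblocks, blocklog, multidisk):
    """Simplified (agcount, agsize) for the two cases discussed in the thread."""
    if multidisk:
        if dblocks <= gigabytes_in_blocks(512, blocklog):
            raise ValueError("sketch only covers multidisk filesystems > 512 GiB")
        shift = 5                       # 2^5 = 32 allocation groups
        agsize = dblocks >> shift
        # Assumption: round the AG size up if the division left a remainder.
        if dblocks & ((1 << shift) - 1):
            agsize += 1
    else:
        # Assumption: a large single-device filesystem gets maximum-size AGs.
        agsize = MAX_AG_BYTES >> blocklog
    agcount = -(-dblocks // agsize)     # ceiling division
    return agcount, agsize

print(sketch_agcount(DBLOCKS, BLOCKLOG, multidisk=True))
print(sketch_agcount(DBLOCKS, BLOCKLOG, multidisk=False))
```

With the block count from the original mkfs output, the multidisk branch gives 32 AGs of 209428640 blocks, which matches what mkfs.xfs printed, and the non-multidisk branch gives the 25 AGs mentioned above.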

* Re: specify agsize?
From: aurfalien @ 2013-07-14 16:46 UTC
To: Eric Sandeen; +Cc: xfs

Sorry to top post.

But this was exactly the kind of info I was hoping for.

Thanks Eric.

- aurf

* Re: specify agsize?
From: aurfalien @ 2013-07-14 17:14 UTC
To: Eric Sandeen; +Cc: xfs

On Jul 14, 2013, at 9:14 AM, Eric Sandeen wrote:
> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
>
> I guess?
>
> That's quite a procedure! And I have to say, a slightly strange one at first glance.
>
> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.

Sorry to double reply to the same thread.

But the volume in question (regarding the Autodesk article) is used for very fast playback of image files. So realtime performance for files of 2048x1556 resolution. These files are being touched/retouched throughout the day by the person driving the Flame.

The fragmentation on these systems on a heavy day, meaning one where they are running at 98% full, is about 5% on avg. On any given day, the systems are about 80% full.
* Re: specify agsize? 2013-07-14 17:14 ` aurfalien @ 2013-07-15 1:22 ` Dave Chinner 0 siblings, 0 replies; 15+ messages in thread From: Dave Chinner @ 2013-07-15 1:22 UTC (permalink / raw) To: aurfalien; +Cc: Eric Sandeen, xfs On Sun, Jul 14, 2013 at 10:14:15AM -0700, aurfalien wrote: > On Jul 14, 2013, at 9:14 AM, Eric Sandeen wrote: > > On 7/13/13 11:20 PM, aurfalien wrote: > >> On Jul 13, 2013, at 7:13 PM, Eric Sandeen wrote: > >>> On 7/13/13 7:11 PM, aurfalien wrote: > >>>> Hello again, > >>>> > >>>> I have a Raid 6 x16 disk array with 128k stripe size and a > >>>> 512 byte block size. > >>>> > >>>> So I do; > >>>> > >>>> mkfs.xfs -f -l size=512m -d su=128k,sw=14 > >>>> /dev/mapper/vg_doofus_data-lv_data > >>>> > >>>> And I get; > >>>> > >>>> meta-data=/dev/mapper/vg_doofus_data-lv_data isize=256 > >>>> agcount=32, agsize=209428640 blks = > >>>> sectsz=512 attr=2, projid32bit=0 data = > >>>> bsize=4096 blocks=6701716480, imaxpct=5 = > >>>> sunit=32 swidth=448 blks naming =version 2 > >>>> bsize=4096 ascii-ci=0 log =internal log > >>>> bsize=4096 blocks=131072, version=2 = > >>>> sectsz=512 sunit=32 blks, lazy-count=1 realtime =none > >>>> extsz=4096 blocks=0, rtextents=0 > >>>> > >>>> > >>>> All is fine but I was recently made aware of tweaking agsize. > >>> > >>> Made aware by what? For what reason? > >> > >> Autodesk has this software called Flame which requires very > >> very fast local storage using XFS. They have an entire write up > >> on how to calc proper agsize for optimal performance. > > > > http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199 > > > > I guess? > > > > That's quite a procedure! And I have to say, a slightly strange > > one at first glance. > > > > It'd be nice if they said what they were trying to accomplish > > rather than just giving you a long recipe. 
> > > Sorry to double reply to the same thread. > > But the volume in question (regarding the Autodesk article) is > used for very fast playback of image files. So realtime > performance for files of 2048x1556 resolution. These files are > being touched/retouched throughout the day by the person driving > the Flame. Sure - it's file per frame video that is being used here, and 2k resolution is generally around 12.5MB per frame. If you are concerned about playback rates, then it is far more important that the frames are laid out sequentially on disk than anything else. Tuning the number of AGs doesn't acheive that - increasing the number of AGs is more likely to cause them to be written all over the place, especially as the filesystem ages and AGs fill up. > The fragmentation on these systems on a heavy day, meaning one > were they are running at 98% full is about 5% on avg. On any > given day, the systems are about 80% full. If they are running their filesystems to 98% full, they they have already given up any hope they have of getting reliable layout of their video files. If you are concerned about low latency, high throughput playback, then it's far more important to get the stripe width set up correctly for the size of the file so each frame is stripe width aligned and each frame takes a single physical IO to read from disk and there is minimal seek between the two frames. The only reason I can see for increasing the number of AGs here is that they are trying to limit the number of video directories that share the same AGs as they are specifying the inode64 mount option. i.e. the assumption is that each video clip is sufficiently large that with 128AGs it is unlikely that two video clips will end up in the same AG and hence potentially interleave as they are modified.... Cheers, Dave. 
--
Dave Chinner
david@fromorbit.com
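To put rough numbers on the stripe-width point above, here is a minimal sketch that derives the stripe geometry from the sunit/swidth values in the quoted mkfs.xfs output and compares it with the size of one 2048x1556 frame. The figure of 4 bytes per pixel is an assumption for illustration (the thread only says "around 12.5MB per frame"), and the helper names are hypothetical.

```python
# Geometry copied from the mkfs.xfs output quoted earlier in this thread.
BLOCK_SIZE = 4096      # bsize, in bytes
SUNIT_BLOCKS = 32      # sunit, in filesystem blocks
SWIDTH_BLOCKS = 448    # swidth, in filesystem blocks

# Assumption, not stated in the thread: 4 bytes per pixel.
FRAME_WIDTH = 2048
FRAME_HEIGHT = 1556
BYTES_PER_PIXEL = 4


def stripe_geometry(block_size, sunit_blocks, swidth_blocks):
    """Return (stripe unit in bytes, stripe width in bytes, data disks)."""
    su_bytes = sunit_blocks * block_size
    sw_bytes = swidth_blocks * block_size
    return su_bytes, sw_bytes, sw_bytes // su_bytes


def stripes_per_frame(frame_bytes, sw_bytes):
    """Return (full stripes, leftover bytes) for a frame starting on a stripe boundary."""
    return divmod(frame_bytes, sw_bytes)


su_bytes, sw_bytes, data_disks = stripe_geometry(BLOCK_SIZE, SUNIT_BLOCKS, SWIDTH_BLOCKS)
frame_bytes = FRAME_WIDTH * FRAME_HEIGHT * BYTES_PER_PIXEL
full_stripes, leftover_bytes = stripes_per_frame(frame_bytes, sw_bytes)

print(f"stripe unit:  {su_bytes // 1024} KiB")
print(f"stripe width: {sw_bytes // 1024} KiB across {data_disks} data disks")
print(f"frame size:   {frame_bytes} bytes ({frame_bytes / 1e6:.2f} MB)")
print(f"one frame = {full_stripes} full stripes + {leftover_bytes // 1024} KiB")
```

Under these assumptions the geometry works out to a 128 KiB stripe unit and a 1792 KiB stripe width over 14 data disks, matching su=128k,sw=14 on the mkfs.xfs command line, and one frame covers 6 full stripes plus 1696 KiB, a little under 7 stripe widths. Whether that translates into a single physical IO per frame depends on the application's IO pattern and the array, which this sketch does not model.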
* Re: specify agsize?
2013-07-14 16:14 ` Eric Sandeen
2013-07-14 16:46 ` aurfalien
2013-07-14 17:14 ` aurfalien
@ 2013-07-14 22:08 ` Stan Hoeppner
2013-07-14 22:42 ` aurfalien
2 siblings, 1 reply; 15+ messages in thread
From: Stan Hoeppner @ 2013-07-14 22:08 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs, aurfalien
On 7/14/2013 11:14 AM, Eric Sandeen wrote:
> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
>
> I guess?
>
> That's quite a procedure! And I have to say, a slightly strange one at first glance.

Agreed.

> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.

Again.

> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.

Or it's just as likely they are laying out these image frames in a specific manner across 128 directories, assuming 128 AGs exist, to achieve some specific "on disk" organization of the files. It's simply not possible to know without more information.

Interestingly, on a 14+2 RAID6 array of 7.2K drives, normally 128 AGs will decrease parallel performance due to a huge increase in head seek latency. Thus I'd assume this isn't a parallel workload. Either that or Autodesk doesn't know XFS as well as they believe.

--
Stan
* Re: specify agsize?
2013-07-14 22:08 ` Stan Hoeppner
@ 2013-07-14 22:42 ` aurfalien
2013-07-14 23:43 ` Stan Hoeppner
0 siblings, 1 reply; 15+ messages in thread
From: aurfalien @ 2013-07-14 22:42 UTC (permalink / raw)
To: stan; +Cc: Eric Sandeen, xfs
On Jul 14, 2013, at 3:08 PM, Stan Hoeppner wrote:
> On 7/14/2013 11:14 AM, Eric Sandeen wrote:
>
>> http://wikihelp.autodesk.com/Creative_Finishing/enu/2012/Help/05_Installation_Guides/Installation_and_Configuration_Guide_for_Linux_Workstations/0118-Advanced118/0194-Manually194/0199-Creating199
>>
>> I guess?
>>
>> That's quite a procedure! And I have to say, a slightly strange one at first glance.
>
> Agreed.
>
>> It'd be nice if they said what they were trying to accomplish rather than just giving you a long recipe.
>
> Again.
>
>> In the end, I think they are trying to create 128AGs and maybe work around some mkfs corner case or other.
>
> Or it's just as likely they are laying out these image frames in a
> specific manner across 128 directories, assuming 128 AGs exist, to
> achieve some specific "on disk" organization of the files. It's simply
> not possible to know without more information.
>
> Interestingly, on a 14+2 RAID6 array of 7.2K drives, normally 128 AGs
> will decrease parallel performance due to a huge increase in head seek
> latency. Thus I'd assume this isn't a parallel workload. Either that
> or Autodesk doesn't know XFS as well as they believe.

Now hold on a minute here, Stan.

While I don't really like Autodesk, as they pretty much atrophy software, the fact is that they, the finishing suite division, know XFS, and realtime 2K performance is realized all day long as long as one follows their guidelines.

After all, SGI developed XFS as well as visual computing stations, and back in the day you had SGIs running Flame, versus Linux today.

Flame is a visual computing app after all, albeit with a front end or GUI tuned to artists.
My initial post on this was to try and understand if their mods make sense to the general XFS community and whether I could benefit from applying those mods to general purpose storage.

- aurf
* Re: specify agsize?
2013-07-14 22:42 ` aurfalien
@ 2013-07-14 23:43 ` Stan Hoeppner
0 siblings, 0 replies; 15+ messages in thread
From: Stan Hoeppner @ 2013-07-14 23:43 UTC (permalink / raw)
To: aurfalien; +Cc: Eric Sandeen, xfs
On 7/14/2013 5:42 PM, aurfalien wrote:
> My initial post on this was to try and understand if their mods make sense to the general XFS community

They do not.

> and whether I could benefit from applying those mods to general purpose storage.

You may or may not. There's simply not enough information available in that guide. Obviously Autodesk has a reason for recommending 128 AGs, but no such reasoning is provided. I already explained why, in the general case, agcount has no relevance in isolation. Setting agcount properly for the general XFS case requires knowledge of the underlying storage device size, geometry, spindle speed, etc.

The Autodesk instructions Eric linked are specific to a select group of Autodesk certified HP workstation models, Autodesk's own storage arrays, or unspecified FC SAN storage. Nowhere in the "storage configuration" chapter does it mention the number of disks or RAID level required or recommended backing the LUNs. Thus, given what I've explained of the relationship between array capacity, spindle count, RAID level, etc., it simply doesn't make sense to arbitrarily specify 128 allocation groups, especially when the storage hardware characteristics are completely ignored.

So if Autodesk is ignoring these critical factors when telling you to use 128 allocation groups, then they either have some application specific file layout that benefits from 128 AGs, or, as I said, they don't know XFS as well as they think they do. I'm not disparaging Autodesk here. There are plenty of vendors who do things with XFS that aren't necessarily wise, sometimes flat out bad.

Taking a quick glance at the data directory layout on a current Flame system may get us closer to understanding why they want 128 AGs.
For instance, if they've created exactly 128 directories on the XFS volume that would fully answer the question as to why they want 128 AGs.

--
Stan
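For reference, the agcount/agsize arithmetic behind the numbers in this thread can be sketched as below. It uses the block count and sunit from the mkfs.xfs output quoted in the first message and simply divides the filesystem evenly; the function name is hypothetical, and the sketch makes no attempt to model whatever rounding or alignment mkfs.xfs itself applies when it picks an agsize.

```python
# Values copied from the mkfs.xfs output quoted in the first message.
BLOCK_SIZE = 4096           # bsize, in bytes
TOTAL_BLOCKS = 6701716480   # data blocks
SUNIT_BLOCKS = 32           # sunit, in filesystem blocks


def ag_layout(total_blocks, agcount, block_size=BLOCK_SIZE):
    """Return (agsize in blocks, agsize in GiB) for an even split.

    Uses ceiling division so that agcount groups always cover the device.
    This is plain arithmetic, not a model of mkfs.xfs behaviour.
    """
    agsize_blocks = -(-total_blocks // agcount)
    return agsize_blocks, agsize_blocks * block_size / 2**30


for agcount in (32, 128):
    agsize_blocks, agsize_gib = ag_layout(TOTAL_BLOCKS, agcount)
    # Check whether the naive agsize is a whole number of stripe units.
    remainder = agsize_blocks % SUNIT_BLOCKS
    print(f"agcount={agcount}: agsize={agsize_blocks} blks "
          f"(~{agsize_gib:.1f} GiB), remainder vs sunit={remainder} blks")
```

With agcount=32 the even split reproduces the agsize=209428640 blks reported by mkfs.xfs, roughly 799 GiB per AG, and that value is a whole multiple of sunit. A naive split into 128 AGs gives 52357160 blks per AG, about 200 GiB, which is not a multiple of sunit=32, so an agsize chosen by hand would presumably need adjusting to keep AG boundaries on stripe units.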
* Re: specify agsize?
@ 2013-07-14 19:45 Richard Scobie
2013-07-14 22:18 ` aurfalien
0 siblings, 1 reply; 15+ messages in thread
From: Richard Scobie @ 2013-07-14 19:45 UTC (permalink / raw)
To: xfs
aurfalien wrote:

.............

So I would like to mess around and iozone any diffs between the above agcount of 32 and whatever agcount changes I may do.

.............

There is an Autodesk tool to do this work, sw_io_perf_tool, which will give a much more realistic evaluation than iozone.

Check out:

http://usa.autodesk.com/adsk/servlet/ps/dl/item?siteID=123112&id=15486735&linkID=9242618

Regards,

Richard
* Re: specify agsize?
2013-07-14 19:45 Richard Scobie
@ 2013-07-14 22:18 ` aurfalien
0 siblings, 0 replies; 15+ messages in thread
From: aurfalien @ 2013-07-14 22:18 UTC (permalink / raw)
To: Richard Scobie; +Cc: xfs
On Jul 14, 2013, at 12:45 PM, Richard Scobie wrote:
> aurfalien wrote:
>
> .............
>
> So I would like to mess around and iozone any diffs between the above agcount of 32 and whatever agcount changes I may do.
>
> .............
>
> There is an Autodesk tool to do this work, sw_io_perf_tool which will give a much more realistic evaluation than iozone.
>
> Checkout:
>
> http://usa.autodesk.com/adsk/servlet/ps/dl/item?siteID=123112&id=15486735&linkID=9242618

Brilliant, many thanks!

- aurf