* XFS: performance
@ 2010-11-28 22:51 Yclept Nemo
2010-11-29 0:11 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Yclept Nemo @ 2010-11-28 22:51 UTC (permalink / raw)
To: xfs
After 3-4 years of using one XFS partition for every mount point
(/,/usr,/etc,/home,/tmp...) I started noticing a rapid performance
degradation. Subjectively I now feel my XFS partition is 5-10x slower
... while other partitions (ntfs,ext3) remain the same.
Now I just purchased a new hard-drive and I'm going to be copying all
my original files over onto a *new* XFS partition. When using mkfs.xfs
I'd like to optimize and avoid whatever it was that made my old XFS
partition slower than a snail.
in all cases, bsize=4096

Original hard drive (98GB Seagate, 39.8GB XFS partition)
# xfs_info /dev/sdb5
meta-data=/dev/sdb5        isize=256    agcount=4, agsize=2595896 blks (9.9 GB)
         =                 sectsz=512   attr=2
data     =                 bsize=4096   blocks=10383582, imaxpct=25 (39.6 GB)
         =                 sunit=0      swidth=0 blks
naming   =version 2        bsize=4096   ascii-ci=0
log      =internal         bsize=4096   blocks=5070, version=2 (19.8 MB)
         =                 sectsz=512   sunit=0 blks, lazy-count=0
realtime =none             extsz=4096   blocks=0, rtextents=0
New hard drive (500GB Samsung MP4, 100GB XFS partition)
# mkfs.xfs /dev/sda5
meta-data=/dev/sda5        isize=256    agcount=4, agsize=6553600 blks (25.0 GB)
         =                 sectsz=512   attr=2
data     =                 bsize=4096   blocks=26214400, imaxpct=25 (100.0 GB)
         =                 sunit=0      swidth=0 blks
naming   =version 2        bsize=4096   ascii-ci=0
log      =internal         bsize=4096   blocks=12800, version=2 (50 MB)
         =                 sectsz=512   sunit=0 blks, lazy-count=1
realtime =none             extsz=4096   blocks=0, rtextents=0
I was considering running "mkfs.xfs -d agcount=32 -i attr=2 -l
version=2,lazy-count=1,size=256m /dev/sda5".
Yes, I know that in xfsprogs 3.1.3 "-i attr=2 -l
version=2,lazy-count=1" are already default options. However I think I
should tweak the log size, blocksize, and allocation group count
beyond the default values, and I'm looking for some recommendations or
input.
I assume mkfs.xfs automatically selects optimal values, but I *have*
space to spare for a larger log section... and perhaps my old XFS
partition became sluggish when the log section had filled up, if this
is even possible.
Similarly a larger agcount should always give better performance,
right? Some resources claim that agcount should never fall below
eight.
I'm also hesitant about reducing the blocksize from the maximum of 4096
bytes, but since XFS manages my entire file-system tree, a blocksize
of 512, 1024, or even 2048 bytes might squeeze out some extra
performance. [I assume] the performance w.r.t. blocksize is:
. a larger blocksize dramatically increases large-file performance,
but also increases space usage when dealing with small files.
. a smaller blocksize dramatically decreases performance for large
files, and somewhat increases performance for small files, while also
slightly increasing space usage with extra inodes(?)
I want to make it clear that I prefer performance over space efficiency.
Many thanks,
orbisvicis
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: XFS: performance
  2010-11-28 22:51 XFS: performance Yclept Nemo
@ 2010-11-29  0:11 ` Dave Chinner
  2010-11-29  1:21   ` Yclept Nemo
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2010-11-29 0:11 UTC (permalink / raw)
To: Yclept Nemo; +Cc: xfs

On Sun, Nov 28, 2010 at 10:51:04PM +0000, Yclept Nemo wrote:
> After 3-4 years of using one XFS partition for every mount point
> (/,/usr,/etc,/home,/tmp...) I started noticing a rapid performance
> degradation. Subjectively I now feel my XFS partition is 5-10x slower
> ... while other partitions (ntfs,ext3) remain the same.

Can you run some benchmarks to show this non-subjectively? Aged
filesystems will be slower than new filesystems, and it should be
measurable. Also, knowing what your filesystem contains (number of
files, used capacity, whether you have run it near ENOSPC for
extended periods of time, etc.) would help us understand the way the
filesystem has aged as well.

> Now I just purchased a new hard-drive and I'm going to be copying all
> my original files over onto a *new* XFS partition. When using mkfs.xfs
> I'd like to optimize and avoid whatever it was that made my old XFS
> partition slower than a snail.

Nobody can give definite advice without first quantifying and then
understanding the aging slowdown you've been seeing.

....

> I was considering running "mkfs.xfs -d agcount=32 -i attr=2 -l
> version=2,lazy-count=1,size=256m /dev/sda5".
>
> Yes, I know that in xfsprogs 3.1.3 "-i attr=2 -l
> version=2,lazy-count=1" are already default options. However I think I
> should tweak the log size, blocksize, and allocation group count
> beyond the default values and I'm looking for some recommendations or
> input.

Why do you think you should tweak them?

> I assume mkfs.xfs automatically selects optimal values, but I *have*
> space to spare for a larger log section... and perhaps my old XFS
> partition became sluggish when the log section had filled up, if this
> is even possible.

Well, you had a very small log (20MB) on the original filesystem,
and so as the filesystem ages (e.g. free space fragments), each
allocation/free transaction would be larger than on a new filesystem
because of the larger btrees that need to be manipulated. With such
a small log, that could be part of the reason for the slowdown you
were seeing. However, without knowing what your filesystem looks
like physically, this is only speculation.

That being said, the larger log (50MB) that the new filesystem has
shouldn't have the same degree of degradation under the same aging
characteristics. It's probably not necessary to go larger than ~100MB
for a 100GB partition on a single spindle...

> Similarly a larger agcount should always give better performance,
> right?

No.

> Some resources claim that agcount should never fall below
> eight.

If those resources are right, then why would we default to 4 AGs for
filesystems on single spindles?

> I'm also hesitant about reducing the blocksize from the maximum of 4096
> bytes, but since XFS manages my entire file-system tree, a blocksize
> of 512, 1024, or even 2048 bytes might squeeze out some extra
> performance. [I assume] the performance w.r.t. blocksize is:
> . a larger blocksize dramatically increases large-file performance,

No, it doesn't - maybe a few percent difference when you've got
multiple GB/s throughput, but it's mostly noise for single spindles.
Extent based allocation makes block size pretty much irrelevant for
sequential write performance...

> but also increases space usage when dealing with small files.

Not significantly enough to matter for modern disks.

> . a smaller blocksize dramatically decreases performance for large
> files,

See above.

> and somewhat increases performance for small files,

Not really.

> while also
> slightly increasing space usage with extra inodes(?)

????

> I want to make it clear that I prefer performance over space efficiency.

That's what the defaults are biased towards.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: XFS: performance
  2010-11-29  0:11 ` Dave Chinner
@ 2010-11-29  1:21   ` Yclept Nemo
  2010-11-29  1:59     ` Dave Chinner
  0 siblings, 1 reply; 11+ messages in thread
From: Yclept Nemo @ 2010-11-29 1:21 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

On Mon, Nov 29, 2010 at 12:11 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Sun, Nov 28, 2010 at 10:51:04PM +0000, Yclept Nemo wrote:
>> After 3-4 years of using one XFS partition for every mount point
>> (/,/usr,/etc,/home,/tmp...) I started noticing a rapid performance
>> degradation. Subjectively I now feel my XFS partition is 5-10x slower
>> ... while other partitions (ntfs,ext3) remain the same.
>
> Can you run some benchmarks to show this non-subjectively? Aged
> filesystems will be slower than new filesystems, and it should be
> measurable. Also, knowing what your filesystem contains (number of
> files, used capacity, whether you have run it near ENOSPC for
> extended periods of time, etc.) would help us understand the way the
> filesystem has aged as well.

Certainly, if you are interested I can run either dbench or bonnie++
tests comparing an XFS partition (with default values from xfsprogs
3.1.3) on the new hard-drive to the existing partition on the old. As
I'm not sure what you're looking for, what command parameters should I
profile against?

The XFS partition in question is 39.61GB in size, of which 30.71GB are
in use (8.90GB free). It contains a typical Arch Linux installation
with many programs and many personal files. Usage pattern as follows:
. equal runtime split between (near ENOSPC) and (approximately 10.0GB free)
. mostly small files, one or two exceptions
. often reach ENOSPC through carelessness
. run xfs_fsr very often

Breakdown of space:
/home: 14.6GB
/usr: 12.3GB
/var: 1005.1 MB
/etc: 58.8 MB

>> Now I just purchased a new hard-drive and I'm going to be copying all
>> my original files over onto a *new* XFS partition. When using mkfs.xfs
>> I'd like to optimize and avoid whatever it was that made my old XFS
>> partition slower than a snail.
>
> Nobody can give definite advice without first quantifying and then
> understanding the aging slowdown you've been seeing.
>
> ....

(see above)

>> I was considering running "mkfs.xfs -d agcount=32 -i attr=2 -l
>> version=2,lazy-count=1,size=256m /dev/sda5".
>>
>> Yes, I know that in xfsprogs 3.1.3 "-i attr=2 -l
>> version=2,lazy-count=1" are already default options. However I think I
>> should tweak the log size, blocksize, and allocation group count
>> beyond the default values and I'm looking for some recommendations or
>> input.
>
> Why do you think you should tweak them?

To avoid the aging slowdown as well as to increase read/write/metadata
performance with small files.

>> I assume mkfs.xfs automatically selects optimal values, but I *have*
>> space to spare for a larger log section... and perhaps my old XFS
>> partition became sluggish when the log section had filled up, if this
>> is even possible.
>
> Well, you had a very small log (20MB) on the original filesystem,
> and so as the filesystem ages (e.g. free space fragments), each
> allocation/free transaction would be larger than on a new filesystem
> because of the larger btrees that need to be manipulated. With such
> a small log, that could be part of the reason for the slowdown you
> were seeing. However, without knowing what your filesystem looks
> like physically, this is only speculation.
>
> That being said, the larger log (50MB) that the new filesystem has
> shouldn't have the same degree of degradation under the same aging
> characteristics. It's probably not necessary to go larger than ~100MB
> for a 100GB partition on a single spindle...

In this case I'll aim for a large log section, probably 256 or 512MB,
unless it will impede performance. That way there will be no problems
when I resize the partition to 200GB ... 300GB ... up to a maximum of
450GB. In fact the manual page of xfs_growfs - which might be outdated
- warns that log resizing is not implemented, so it would probably be
prudent to create an overly-large log section.

>> Similarly a larger agcount should always give better performance,
>> right?
>
> No.
>
>> Some resources claim that agcount should never fall below
>> eight.
>
> If those resources are right, then why would we default to 4 AGs for
> filesystems on single spindles?

Obviously you are against modifying the agcount - I won't touch it :)

>> I'm also hesitant about reducing the blocksize from the maximum of 4096
>> bytes, but since XFS manages my entire file-system tree, a blocksize
>> of 512, 1024, or even 2048 bytes might squeeze out some extra
>> performance. [I assume] the performance w.r.t. blocksize is:
>> . a larger blocksize dramatically increases large-file performance,
>
> No, it doesn't - maybe a few percent difference when you've got
> multiple GB/s throughput, but it's mostly noise for single spindles.
> Extent based allocation makes block size pretty much irrelevant for
> sequential write performance...
>
>> but also increases space usage when dealing with small files.
>
> Not significantly enough to matter for modern disks.
>
>> . a smaller blocksize dramatically decreases performance for large
>> files,
>
> See above.
>
>> and somewhat increases performance for small files,
>
> Not really.
>
>> while also
>> slightly increasing space usage with extra inodes(?)
>
> ????

Not actually sure what I intended. My knowledge of file-systems
depends on Google and that statement was only a shot in the dark.
However, you've convinced me not to change the blocksize (keep in mind
I'm running an entire Linux installation from this one XFS partition,
small files included). If the blocksize option is so
performance-independent, why does it even exist?

>> I want to make it clear that I prefer performance over space efficiency.
>
> That's what the defaults are biased towards.

Good to know the defaults are sane - yet another reason not to modify
the blocksize and allocation group count.

Thanks,

orbisvicis
* Re: XFS: performance
  2010-11-29  1:21 ` Yclept Nemo
@ 2010-11-29  1:59   ` Dave Chinner
  [not found]           ` <AANLkTikw086Z_66cz_U-EdFQx14TXP6XmiG-KyLN4BLo@mail.gmail.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2010-11-29 1:59 UTC (permalink / raw)
To: Yclept Nemo; +Cc: xfs

On Mon, Nov 29, 2010 at 01:21:11AM +0000, Yclept Nemo wrote:
> On Mon, Nov 29, 2010 at 12:11 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Sun, Nov 28, 2010 at 10:51:04PM +0000, Yclept Nemo wrote:
> >> After 3-4 years of using one XFS partition for every mount point
> >> (/,/usr,/etc,/home,/tmp...) I started noticing a rapid performance
> >> degradation. Subjectively I now feel my XFS partition is 5-10x slower
> >> ... while other partitions (ntfs,ext3) remain the same.
> >
> > Can you run some benchmarks to show this non-subjectively? Aged
> > filesystems will be slower than new filesystems, and it should be
> > measurable. Also, knowing what your filesystem contains (number of
> > files, used capacity, whether you have run it near ENOSPC for
> > extended periods of time, etc.) would help us understand the way the
> > filesystem has aged as well.
>
> Certainly, if you are interested I can run either dbench or bonnie++
> tests comparing an XFS partition (with default values from xfsprogs
> 3.1.3) on the new hard-drive to the existing partition on the old. As
> I'm not sure what you're looking for, what command parameters should I
> profile against?
>
> The XFS partition in question is 39.61GB in size, of which 30.71GB are
> in use (8.90GB free). It contains a typical Arch Linux installation
> with many programs and many personal files. Usage pattern as follows:
> . equal runtime split between (near ENOSPC) and (approximately 10.0GB free)

There's your problem - it's a well-known fact that running XFS at
more than 85-90% capacity for extended periods of time causes free
space fragmentation and that results in performance degradation.

> . mostly small files, one or two exceptions
> . often reach ENOSPC through carelessness
> . run xfs_fsr very often

And xfs_fsr is also known to cause free space fragmentation when run
on filesystems with not much space available...

> >> I was considering running "mkfs.xfs -d agcount=32 -i attr=2 -l
> >> version=2,lazy-count=1,size=256m /dev/sda5".
> >>
> >> Yes, I know that in xfsprogs 3.1.3 "-i attr=2 -l
> >> version=2,lazy-count=1" are already default options. However I think I
> >> should tweak the log size, blocksize, and allocation group count
> >> beyond the default values and I'm looking for some recommendations or
> >> input.
> >
> > Why do you think you should tweak them?
>
> To avoid the aging slowdown as well as to increase read/write/metadata
> performance with small files.

According to your description above, tweaking these will not help you
at all.

> >> I assume mkfs.xfs automatically selects optimal values, but I *have*
> >> space to spare for a larger log section... and perhaps my old XFS
> >> partition became sluggish when the log section had filled up, if this
> >> is even possible.
> >
> > Well, you had a very small log (20MB) on the original filesystem,
> > and so as the filesystem ages (e.g. free space fragments), each
> > allocation/free transaction would be larger than on a new filesystem
> > because of the larger btrees that need to be manipulated. With such
> > a small log, that could be part of the reason for the slowdown you
> > were seeing. However, without knowing what your filesystem looks
> > like physically, this is only speculation.
> >
> > That being said, the larger log (50MB) that the new filesystem has
> > shouldn't have the same degree of degradation under the same aging
> > characteristics. It's probably not necessary to go larger than ~100MB
> > for a 100GB partition on a single spindle...
>
> In this case I'll aim for a large log section, probably 256 or 512MB,
> unless it will impede performance.

It can, because once you get into a tail-pushing situation it'll
trigger lots and lots of metadata IO. That's probably not ideal for a
laptop drive. You'd do better to keep the log at around 100MB and use
delayed logging....

> That way there will be no problems
> when I resize the partition to 200GB ... 300GB ... up to a maximum of
> 450GB. In fact the manual page of xfs_growfs - which might be outdated
> - warns that log resizing is not implemented, so it would probably be
> prudent to create an overly-large log section.
>
> >> Similarly a larger agcount should always give better performance,
> >> right?
> >
> > No.
> >
> >> Some resources claim that agcount should never fall below
> >> eight.
> >
> > If those resources are right, then why would we default to 4 AGs for
> > filesystems on single spindles?
>
> Obviously you are against modifying the agcount - I won't touch it :)

No, what I'm pointing out is that <some random web reference> is not
a good guide for tuning an XFS filesystem. You need to _understand_
what changing that knob does before you change it. If you don't
understand what it does, then don't change it...

> Not actually sure what I intended. My knowledge of file-systems
> depends on Google and that statement was only a shot in the dark.
> However, you've convinced me not to change the blocksize (keep in mind
> I'm running an entire Linux installation from this one XFS partition,
> small files included).

Sure, I do that too. My workstation has a 220GB root partition that
contains all my kernel trees, build areas, etc. It has agcount=16
because I'm running on an 8c machine and do 8-way parallel builds, a
log of 105MB and I'm using delaylog....

> If the blocksize option is so
> performance-independent, why does it even exist?

Because there are situations where it makes sense to change the
block size. That isn't really a general use root filesystem,
though...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
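[Editor's aside: the delayed logging Dave recommends was a mount option
on kernels 2.6.35-2.6.38 before becoming the default in 2.6.39. A
hypothetical /etc/fstab line for the root filesystem discussed in the
thread, assuming /dev/sda5 as above:]

```
/dev/sda5   /   xfs   defaults,delaylog   0   1
```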
[parent not found: <AANLkTikw086Z_66cz_U-EdFQx14TXP6XmiG-KyLN4BLo@mail.gmail.com>]
* Re: XFS: performance
  [not found] ` <AANLkTikw086Z_66cz_U-EdFQx14TXP6XmiG-KyLN4BLo@mail.gmail.com>
@ 2010-11-29  3:57   ` Yclept Nemo
  2010-11-29  5:41     ` Stan Hoeppner
  2010-11-29  8:38     ` Michael Monnerie
  0 siblings, 2 replies; 11+ messages in thread
From: Yclept Nemo @ 2010-11-29 3:57 UTC (permalink / raw)
To: xfs

If only I had a dollar for every time I forgot to Cc the mailing list
within Gmail:

On Mon, Nov 29, 2010 at 1:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Nov 29, 2010 at 01:21:11AM +0000, Yclept Nemo wrote:
>> On Mon, Nov 29, 2010 at 12:11 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Sun, Nov 28, 2010 at 10:51:04PM +0000, Yclept Nemo wrote:
>> >> After 3-4 years of using one XFS partition for every mount point
>> >> (/,/usr,/etc,/home,/tmp...) I started noticing a rapid performance
>> >> degradation. Subjectively I now feel my XFS partition is 5-10x slower
>> >> ... while other partitions (ntfs,ext3) remain the same.
>> >
>> > Can you run some benchmarks to show this non-subjectively? Aged
>> > filesystems will be slower than new filesystems, and it should be
>> > measurable. Also, knowing what your filesystem contains (number of
>> > files, used capacity, whether you have run it near ENOSPC for
>> > extended periods of time, etc.) would help us understand the way the
>> > filesystem has aged as well.
>>
>> Certainly, if you are interested I can run either dbench or bonnie++
>> tests comparing an XFS partition (with default values from xfsprogs
>> 3.1.3) on the new hard-drive to the existing partition on the old. As
>> I'm not sure what you're looking for, what command parameters should I
>> profile against?
>>
>> The XFS partition in question is 39.61GB in size, of which 30.71GB are
>> in use (8.90GB free). It contains a typical Arch Linux installation
>> with many programs and many personal files. Usage pattern as follows:
>> . equal runtime split between (near ENOSPC) and (approximately 10.0GB free)
>
> There's your problem - it's a well-known fact that running XFS at
> more than 85-90% capacity for extended periods of time causes free
> space fragmentation and that results in performance degradation.
>
>> . mostly small files, one or two exceptions
>> . often reach ENOSPC through carelessness
>> . run xfs_fsr very often
>
> And xfs_fsr is also known to cause free space fragmentation when run
> on filesystems with not much space available...

Pheww... I'm relieved to learn that my performance degradation will be
alleviated by this hard-drive upgrade. Which also means I need no
longer be so obsessive-compulsive when tweaking the second incarnation
of my XFS file-system.

>> >> Similarly a larger agcount should always give better performance,
>> >> right?
>> >
>> > No.
>> >
>> >> Some resources claim that agcount should never fall below
>> >> eight.
>> >
>> > If those resources are right, then why would we default to 4 AGs for
>> > filesystems on single spindles?
>>
>> Obviously you are against modifying the agcount - I won't touch it :)
>
> No, what I'm pointing out is that <some random web reference> is not
> a good guide for tuning an XFS filesystem. You need to _understand_
> what changing that knob does before you change it. If you don't
> understand what it does, then don't change it...

As I understand it, an XFS file-system is demarcated into several
allocation groups, each containing a superblock as well as private
inode and free-space btrees, and thus increasing the AG count
increases parallelization. I simply assumed the process was CPU-bound,
not disk-bound. Though by mentioning spindles, I realized it makes
sense to limit the amount of parallel access to a single hard drive;
I've often noticed XFS come to a crawl when I simultaneously launch
multiple IO-intensive operations.

>> Not actually sure what I intended. My knowledge of file-systems
>> depends on Google and that statement was only a shot in the dark.
>> However, you've convinced me not to change the blocksize (keep in mind
>> I'm running an entire Linux installation from this one XFS partition,
>> small files included).
>
> Sure, I do that too. My workstation has a 220GB root partition that
> contains all my kernel trees, build areas, etc. It has agcount=16
> because I'm running on an 8c machine and do 8-way parallel builds, a
> log of 105MB and I'm using delaylog....

You mention an eight-core machine (8c?). Since I operate a dual-core
system, would it make sense to increase my AG count slightly, to five
or six?

Hm.. I was going to wait till 2.6.39 but I think I'll enable delayed
logging right now!

>> If the blocksize option is so
>> performance-independent, why does it even exist?
>
> Because there are situations where it makes sense to change the
> block size. That isn't really a general use root filesystem,
> though...

Now for the implementation of transferring the data to the new XFS
partition/hard-drive... I was originally going to use "rsync -avxAHX"
until I stumbled across this list's thread, "ENOSPC at 90% with plenty
of inodes", which mentioned xfsdump and xfsrestore. I now have three
questions:

. Since xfsdump and xfsrestore access the base file-system structure,
  will these tools be able to copy everything?
  - files
  - special files (sockets/fifos)
  - permissions
  - attributes
  - acls (it is in the man page, but I list it here for completion)
  - symlinks
  - hard links
  - extended attributes
  - character/block devices
  - modification times
  - etc... everything: anything rsync could copy and more

. Since xfsdump and xfsrestore access the base file-system structure,
  will they be able to:
  - update creation-time XFS parameters (from the original file-system)
    to adapt to the new XFS file-system, in order to benefit from
    performance and capability improvements? For example:
    . adapt the log section from version 1 to version 2
    . modify the agcount
    . update the metadata attributes to version 2
    . enable lazy-count for all files
    . etc
  - reduce fragmentation and free-space fragmentation upon xfsrestore?
    (Or does xfsrestore simply copy bit-by-bit the old XFS structure
    into the new file-system?)

. If not, are there any reasons to nonetheless prefer
  xfsdump/xfsrestore over rsync?

If any of this is already mentioned in the xfsdump/restore man pages, I
apologize; I simply don't want to wait till tomorrow to begin the
backup/restore process.

Sincerely,

orbisvicis
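[Editor's aside: the usual migration pattern pipes a level-0 xfsdump of
the mounted source straight into xfsrestore on the new mount point.
Because xfsrestore recreates files through the normal allocator rather
than copying the old on-disk layout, the new filesystem's mkfs-time
parameters apply to the restored data. A sketch with hypothetical mount
points; this requires root and two mounted XFS filesystems:]

```shell
# Level-0 (full) dump of the old filesystem to stdout ("-"), restored
# into the new one; mount points are placeholders for this thread's
# old and new partitions.
xfsdump -l 0 - /mnt/old | xfsrestore - /mnt/new
```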
* Re: XFS: performance
  2010-11-29  3:57 ` Yclept Nemo
@ 2010-11-29  5:41   ` Stan Hoeppner
  2010-11-30  4:29     ` Dave Chinner
  2010-11-29  8:38   ` Michael Monnerie
  1 sibling, 1 reply; 11+ messages in thread
From: Stan Hoeppner @ 2010-11-29 5:41 UTC (permalink / raw)
To: xfs

Yclept Nemo put forth on 11/28/2010 9:57 PM:

> Pheww... I'm relieved to learn that my performance degradation will be
> alleviated by this hard-drive upgrade. Which also means I need no
> longer be so obsessive-compulsive when tweaking the second incarnation
> of my XFS file-system.

You can also alleviate this problem with careful planning. Have a
/boot and / filesystem that are painfully small, making / just large
enough to hold the OS files, log files, etc. Make one or more XFS
filesystems to hold your data. If one of these becomes slow due to
hitting the 85%+ mark, you can simply copy all the data to an external
device or NFS mounted directory, delete and then remake the
filesystem, preferably larger if you have free space on the drive.
When you copy all the files back to the new filesystem, your
performance will be restored. The catch is you need to make the new
one bigger by at least 15-20%, more if you have the space, to prevent
arriving at the same situation soon after doing all this.

> As I understand it, an XFS file-system is demarcated into several
> allocation groups, each containing a superblock as well as private
> inode and free-space btrees, and thus increasing the AG count
> increases parallelization. I simply assumed the process was CPU-bound,
> not disk-bound. Though by mentioning spindles, I realized it makes
> sense to limit the amount of parallel access to a single hard drive;
> I've often noticed XFS come to a crawl when I simultaneously launch
> multiple IO-intensive operations.

You didn't notice XFS come to a crawl. You noticed your single disk
come to a crawl. Even 15k rpm SAS drives max out at ~300 seeks/sec. A
5.4k rpm laptop SATA drive will be lucky to sustain 100 seeks/sec. A
cheap SATA SSD will do multiple thousands of seeks/sec, as will a RAID
array with 8 or more 15k SAS drives and a decent sized write cache,
say 256MB or more. Multiple AGs really shine with large RAID arrays
with many spindles. They are far less relevant WRT performance of a
single disk.

> You mention an eight-core machine (8c?). Since I operate a dual-core
> system, would it make sense to increase my AG count slightly, to five
> or six?

Dave didn't mention the disk configuration of his "workstation". I'm
guessing he's got a local RAID setup with 8-16 drives. AG count has a
direct relationship to the storage hardware, not the number of CPUs
(cores) in the system. If you have a 24 core system (2x Magny Cours)
and a single disk, creating an FS with 24 AGs will give you nothing,
and may actually impede performance due to all the extra head seeking
across those 24 AGs.

> Hm.. I was going to wait till 2.6.39 but I think I'll enable delayed
> logging right now!

Delayed logging can definitely help increase write throughput to a
single disk. It pushes some of the I/O bottleneck into CPU/memory
territory for a short period of time. Keep in mind that data must
eventually be written to disk, so you are merely delaying the physical
disk bottleneck for a while, as the name implies. Also note that it
will do absolutely nothing for read performance.

> Now for the implementation of transferring the data to the new XFS
> partition/hard-drive... I was originally going to use "rsync -avxAHX"
> until I stumbled across this list's thread, "ENOSPC at 90% with plenty
> of inodes", which mentioned xfsdump and xfsrestore. I now have three
> questions:

If xfsdump/xfsrestore don't turn out to be a viable solution... Just
create a new big XFS filesystem on the new disk, such as you have now,
with the defaults. Enter runlevel 2, stop every daemon you can without
blowing things up, and "cp -a" everything over to the new filesystem.
Modify /etc/lilo or /etc/grub accordingly on the new disk to use the
new disk, and burn an MBR to the new disk. Reboot the machine, enter
BIOS, set the new disk as boot, and boot. It should be that simple. If
it doesn't work, change the boot disk in the BIOS, boot the old disk,
and troubleshoot.

This is pretty much exactly what I did when I replaced the drive in my
home MX/Samba/etc server about a year ago, although I had many
partitions instead of one, all EXT2 not XFS, and probably many more
daemons running than your workstation. I copied each FS separately,
and avoided /proc, which turned out to be a mistake. There are
apparently a few things in /proc that the kernel doesn't create anew
on each boot, so I had to go back and copy those individually; I can't
recall now exactly what they were. Anyway, cp _everything_ over,
ignore any errors, and you should be ok.

-- 
Stan
* Re: XFS: performance
  2010-11-29  5:41 ` Stan Hoeppner
@ 2010-11-30  4:29   ` Dave Chinner
  2010-11-30  4:50     ` Stan Hoeppner
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2010-11-30 4:29 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs

On Sun, Nov 28, 2010 at 11:41:35PM -0600, Stan Hoeppner wrote:
> Yclept Nemo put forth on 11/28/2010 9:57 PM:
> > You mention an eight-core machine (8c?). Since I operate a dual-core
> > system, would it make sense to increase my AG count slightly, to five
> > or six?
>
> Dave didn't mention the disk configuration of his "workstation". I'm
> guessing he's got a local RAID setup with 8-16 drives.

2 SSDs in RAID0.

> AG count has a
> direct relationship to the storage hardware, not the number of CPUs
> (cores) in the system.

Actually, I used 16 AGs because it's twice the number of CPU cores
and I want to make sure that CPU parallel workloads (e.g. make -j 8)
don't serialise on AG locks during allocation. IOWs, I laid it out
that way precisely because of the number of CPUs in the system...

And to point out the not-so-obvious, this is the _default layout_
that mkfs.xfs in the debian squeeze installer came up with. IOWs,
mkfs.xfs did exactly what I wanted without me having to tweak
_anything_.

> If you have a 24 core system (2x Magny Cours)
> and a single disk, creating an FS with 24 AGs will give you nothing, and
> may actually impede performance due to all the extra head seeking across
> those 24 AGs.

In that case, you are right. Single spindle SRDs go backwards in
performance pretty quickly once you go over 4 AGs...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: XFS: performance
  2010-11-30  4:29 ` Dave Chinner
@ 2010-11-30  4:50 ` Stan Hoeppner
  2010-11-30  7:51 ` Dave Chinner
  0 siblings, 1 reply; 11+ messages in thread
From: Stan Hoeppner @ 2010-11-30 4:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs

Dave Chinner put forth on 11/29/2010 10:29 PM:
> On Sun, Nov 28, 2010 at 11:41:35PM -0600, Stan Hoeppner wrote:
>> Yclept Nemo put forth on 11/28/2010 9:57 PM:
>>> You mention an eight-core machine (8c?). Since I operate a dual-core
>>> system, would it make sense to increase my AG count slightly, to five
>>> or six?
>>
>> Dave didn't mention the disk configuration of his "workstation". I'm
>> guessing he's got a local RAID setup with 8-16 drives.
>
> 2 SSDs in RAID0.

From an IOPS and throughput perspective, very similar to my guess.
Curious, are those Intel, OCZ, or other SSDs? Which model,
specifically? Benchmark data? I ask as all the results I find on the
web for SSDs are from Windows 7 machines. :( I'd like to see some
Linux results.

>> AG count has a direct relationship to the storage hardware, not the
>> number of CPUs (cores) in the system.
>
> Actually, I used 16 AGs because it's twice the number of CPU cores
> and I want to make sure that CPU-parallel workloads (e.g. make -j 8)
> don't serialise on AG locks during allocation. IOWs, I laid it out
> that way precisely because of the number of CPUs in the system...

And that makes perfect sense, assuming you have a sufficiently speedy
storage device, which you do.

> And to point out the not-so-obvious, this is the _default layout_
> that mkfs.xfs in the debian squeeze installer came up with. IOWs,
> mkfs.xfs did exactly what I wanted without me having to tweak
> _anything_.

Forgive me, for I've not looked at the code. How exactly does mkfs.xfs
determine the AG count? If you'd had a single 7.2k SATA drive instead
of 2 RAID0 SSDs, would it have still given you 16 AGs? If so, I'd say
that's a bug.

>> If you have a 24 core system (2x Magny Cours) and a single disk,
>> creating an FS with 24 AGs will give you nothing, and may actually
>> impede performance due to all the extra head seeking across those
>> 24 AGs.
>
> In that case, you are right. Single-spindle SRDs go backwards in
> performance pretty quickly once you go over 4 AGs...

That was the point I was making originally. AG count should be balanced
between storage device performance and number of cores, not strictly
one or the other. True? How does mkfs.xfs strike that balance? Or does
it, if using defaults?

--
Stan
* Re: XFS: performance
  2010-11-30  4:50 ` Stan Hoeppner
@ 2010-11-30  7:51 ` Dave Chinner
  2010-12-01  0:47 ` Stan Hoeppner
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2010-11-30 7:51 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs

On Mon, Nov 29, 2010 at 10:50:33PM -0600, Stan Hoeppner wrote:
> Dave Chinner put forth on 11/29/2010 10:29 PM:
> > On Sun, Nov 28, 2010 at 11:41:35PM -0600, Stan Hoeppner wrote:
> >> Yclept Nemo put forth on 11/28/2010 9:57 PM:
> >>> You mention an eight-core machine (8c?). Since I operate a dual-core
> >>> system, would it make sense to increase my AG count slightly, to five
> >>> or six?
> >>
> >> Dave didn't mention the disk configuration of his "workstation". I'm
> >> guessing he's got a local RAID setup with 8-16 drives.
> >
> > 2 SSDs in RAID0.
>
> From an IOPS and throughput perspective, very similar to my guess.
> Curious, are those Intel, OCZ, or other SSDs? Which model,
> specifically? Benchmark data? I ask as all the results I find on the
> web for SSDs are from Windows 7 machines. :( I'd like to see some
> Linux results.

Cheap-as-it-gets 120GB Sandforce 1200 drives. In RAID0, I'm getting
about 450MB/s sequential write, a little more for read. I'm seeing up
to 12-14k random 4k writes per drive through XFS. Other than that I
didn't bother with any more benchmarks because it was clearly Fast
Enough.

> > And to point out the not-so-obvious, this is the _default layout_
> > that mkfs.xfs in the debian squeeze installer came up with. IOWs,
> > mkfs.xfs did exactly what I wanted without me having to tweak
> > _anything_.
>
> Forgive me, for I've not looked at the code. How exactly does mkfs.xfs
> determine the AG count? If you'd had a single 7.2k SATA drive instead
> of 2 RAID0 SSDs, would it have still given you 16 AGs? If so, I'd say
> that's a bug.

No, it detected the RAID configuration. 16 AGs is the default for a
RAID device; 4 AGs are used if RAID is not detected.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS: performance
  2010-11-30  7:51 ` Dave Chinner
@ 2010-12-01  0:47 ` Stan Hoeppner
  0 siblings, 0 replies; 11+ messages in thread
From: Stan Hoeppner @ 2010-12-01 0:47 UTC (permalink / raw)
To: xfs

Dave Chinner put forth on 11/30/2010 1:51 AM:
> No, it detected the RAID configuration. 16 AGs is the default for a
> RAID device; 4 AGs are used if RAID is not detected.

So, as a general rule of thumb, one should have as many AGs as the
maximum number of potential writers, assuming the underlying storage
device can handle the I/O? For instance, on a quad-socket 12-core
Magny Cours system with 48 total cores, if one has an application with
48 writing threads, one should have at least 48 AGs?

And if the default is 4 AGs per physical disk, does this mean one
should have at least 12 drives in the stripe width? I'm thinking of
SRDs in my example, not SSDs. And obviously, if these are heavy writing
threads, a single 2 GHz core can easily outrun a single disk, so one
would probably want at least 48 SRDs, if not 96, in the stripe. Yes?

Obviously there are potentially many other factors that may come into
play in this equation. I'm just looking for a general formula one
should use as a baseline template, as 16 AGs doesn't seem to be enough
for every application, core/thread count, or SRD RAID configuration.

--
Stan
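[Editor's note: the balance Stan is asking about can be written out as
a toy calculation. This is only an illustration of the heuristic being
discussed in this thread (one AG per concurrent writer, capped at
roughly 4 AGs per rotating spindle, with mkfs's non-RAID floor of 4),
not an official XFS or mkfs.xfs formula:]

```shell
# Toy heuristic, NOT an official formula: AG count bounded both by
# writer concurrency and by what the spindles can seek across.
suggest_agcount() {
    threads=$1; spindles=$2
    cap=$((spindles * 4))                 # ~4 AGs per rotating spindle
    if [ "$threads" -lt "$cap" ]; then ag=$threads; else ag=$cap; fi
    if [ "$ag" -lt 4 ]; then ag=4; fi     # mkfs's non-RAID floor of 4
    echo "$ag"
}

suggest_agcount 48 12   # 48 writer threads across 12 spindles -> 48
suggest_agcount 48 1    # a single disk caps it -> 4
```

On Stan's 48-thread example this gives 48 AGs only when the stripe is
at least 12 drives wide, matching the arithmetic in his message.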
* Re: XFS: performance
  2010-11-29  3:57 ` Yclept Nemo
  2010-11-29  5:41 ` Stan Hoeppner
@ 2010-11-29  8:38 ` Michael Monnerie
  1 sibling, 0 replies; 11+ messages in thread
From: Michael Monnerie @ 2010-11-29 8:38 UTC (permalink / raw)
To: xfs; +Cc: Yclept Nemo

On Monday, 29 November 2010, Yclept Nemo wrote:
> -If not, are there any reasons to nonetheless prefer
> xfsdump/xfsrestore over rsync?

Just do an rsync. As this copies the data afresh, XFS can arrange it in
order anyway. No need for special tools.

--
With kind regards,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [pronounced: Prot-e-schee]
Tel: +43 660 / 415 6531
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-28 22:51 XFS: performance Yclept Nemo
2010-11-29 0:11 ` Dave Chinner
2010-11-29 1:21 ` Yclept Nemo
2010-11-29 1:59 ` Dave Chinner
[not found] ` <AANLkTikw086Z_66cz_U-EdFQx14TXP6XmiG-KyLN4BLo@mail.gmail.com>
2010-11-29 3:57 ` Yclept Nemo
2010-11-29 5:41 ` Stan Hoeppner
2010-11-30 4:29 ` Dave Chinner
2010-11-30 4:50 ` Stan Hoeppner
2010-11-30 7:51 ` Dave Chinner
2010-12-01 0:47 ` Stan Hoeppner
2010-11-29 8:38 ` Michael Monnerie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox