* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23  2:18 UTC
To: Speedy Milan; +Cc: xfs, linux-kernel, Ivan Pantovic

[cc xfs@oss.sgi.com]

On Mon, Apr 21, 2014 at 10:58:53PM +0200, Speedy Milan wrote:
> I want to report very slow deletion of 24 50GB files (in total 12 TB),
> all present in the same folder.

total = 1.2TB?

> OS is CentOS 6.4, with upgraded kernel 3.13.1.
>
> The hardware is a Supermicro server with 15x 4TB WD Se drives in MD
> RAID 6, totalling 52TB of free space.
>
> XFS is formatted directly on the RAID volume, without LVM layers.
>
> Deletion was done with rm -f * command, and it took upwards of 1 hour
> to delete the files.
>
> File system was filled completely prior to deletion.

Oh, that's bad. It's likely you fragmented the files into
millions of extents?

> rm was mostly waiting (D state), probably for kworker threads, and

No, waiting for IO.

> iostat was showing big HDD utilization numbers and very low throughput,
> so it looked like a random HDD workload was in effect.

Yup, smells like file fragmentation. Non-fragmented 50GB files
should be removed in a few milliseconds, but if you've badly
fragmented the files, there could be 10 million extents in a 50GB
file. A few milliseconds per extent removal gives you....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

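[A quick way to sanity-check this diagnosis before deleting is to count the extents
in one of the files. The commands below are standard xfsprogs / e2fsprogs tools; the
file path is only an example, not taken from the report:

    # one line of xfs_bmap output per extent (plus a header line)
    xfs_bmap /backups/full-volume-0001 | wc -l

    # or, filesystem-independent:
    filefrag /backups/full-volume-0001

For scale, with assumed numbers: 10 million extents at ~2 ms of random IO per extent
removal is roughly 20,000 seconds, i.e. over five hours for a single file, so even a
fraction of that fragmentation explains an hour-long rm of the whole set.]
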
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Ivan Pantovic @ 2014-04-23  7:23 UTC
To: Dave Chinner; +Cc: linux-kernel, Speedy Milan, xfs

> [root@drive-b ~]# xfs_db -r /dev/md0
> xfs_db> frag
> actual 11157932, ideal 11015175, fragmentation factor 1.28%
> xfs_db>

This is the current level of fragmentation ... is it bad? Some say over 1% is a
candidate for defrag? ... We can leave it like this, wait for the next full backup,
and then check the fragmentation of that file.

On 04/23/2014 04:18 AM, Dave Chinner wrote:
> [cc xfs@oss.sgi.com]
>
> On Mon, Apr 21, 2014 at 10:58:53PM +0200, Speedy Milan wrote:
>> I want to report very slow deletion of 24 50GB files (in total 12 TB),
>> all present in the same folder.
> total = 1.2TB?
>
>> OS is CentOS 6.4, with upgraded kernel 3.13.1.
>>
>> The hardware is a Supermicro server with 15x 4TB WD Se drives in MD
>> RAID 6, totalling 52TB of free space.
>>
>> XFS is formatted directly on the RAID volume, without LVM layers.
>>
>> Deletion was done with rm -f * command, and it took upwards of 1 hour
>> to delete the files.
>>
>> File system was filled completely prior to deletion.
> Oh, that's bad. It's likely you fragmented the files into
> millions of extents?
>
>> rm was mostly waiting (D state), probably for kworker threads, and
> No, waiting for IO.
>
>> iostat was showing big HDD utilization numbers and very low throughput,
>> so it looked like a random HDD workload was in effect.
> Yup, smells like file fragmentation. Non-fragmented 50GB files
> should be removed in a few milliseconds, but if you've badly
> fragmented the files, there could be 10 million extents in a 50GB
> file. A few milliseconds per extent removal gives you....
>
> Cheers,
>
> Dave.

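[For reference, the factor reported by frag is derived from the two extent counts it
prints: (actual - ideal) / actual. With the numbers above, (11157932 - 11015175) /
11157932 ~= 0.0128, i.e. 1.28% -- only about 143k excess extents spread across
roughly 11 million inodes, which is why, as the next reply explains, the number says
very little on its own.]
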
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23  8:25 UTC
To: Ivan Pantovic; +Cc: linux-kernel, Speedy Milan, xfs

On Wed, Apr 23, 2014 at 09:23:41AM +0200, Ivan Pantovic wrote:
>
> >[root@drive-b ~]# xfs_db -r /dev/md0
> >xfs_db> frag
> >actual 11157932, ideal 11015175, fragmentation factor 1.28%
> >xfs_db>
>
> This is the current level of fragmentation ... is it bad?

http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25._Is_that_bad.3F

> Some say over 1% is a candidate for defrag? ...

Some say that over 70% is usually not a problem:

http://www.mythtv.org/wiki/XFS_Filesystem#Defragmenting_XFS_Partitions

i.e. the level that becomes a problem is highly workload specific.
So you can't read *anything* into that number without knowing exactly
what is in your filesystem, how the application(s) interact with it,
and so on.

Besides, I was asking specifically about the files you removed, not
the files that remain in the filesystem. Given that you have 11
million inodes in the filesystem, you probably removed the only
significantly large files in the filesystem....

So, the files you removed are now free space, and free space
fragmentation is what we need to look at. i.e. use the freesp
command to dump the histogram and summary of the free space...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

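[For anyone following along, frag and freesp can also be run non-interactively via
xfs_db's -c option; the device path is the one from earlier in the thread:

    # histogram plus summary of free space extent sizes, read-only
    xfs_db -r -c "freesp -s" /dev/md0

The from/to/blocks columns in the freesp output are in filesystem blocks (typically
4 KiB), so a 1-block free extent is 4 KiB of free space.]
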
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Ivan Pantovic @ 2014-04-23  9:21 UTC
To: Dave Chinner; +Cc: linux-kernel, Speedy Milan, xfs

Hi Dave,

> xfs_db> freesp
>      from        to extents      blocks   pct
>         1         1   52463       52463  0.00
>         2         3   73270      181394  0.01
>         4         7  134526      739592  0.03
>         8        15  250469     2870193  0.12
>        16        31  581572    13465403  0.58
>        32        63  692386    32096932  1.37
>        64       127 1234204   119157757  5.09
>       128       255   91015    16690243  0.71
>       256       511   18977     6703895  0.29
>       512      1023   12821     8611576  0.37
>      1024      2047   23209    33177541  1.42
>      2048      4095   43282   101126831  4.32
>      4096      8191   12726    55814285  2.39
>      8192     16383    2138    22750157  0.97
>     16384     32767    1033    21790120  0.93
>     32768     65535     433    19852497  0.85
>     65536    131071     254    23052185  0.99
>    131072    262143     204    37833000  1.62
>    262144    524287     229    89970969  3.85
>    524288   1048575     164   124210580  5.31
>   1048576   2097151     130   173193687  7.40
>   2097152   4194303      22    61297862  2.62
>   4194304   8388607      16    97070435  4.15
>   8388608  16777215      26   320475332 13.70
>  16777216  33554431       6   133282461  5.70
>  33554432  67108863      12   616939026 26.37
> 134217728 268435328       1   207504563  8.87
> xfs_db>

Well, now it is quite obvious that file fragmentation was actually the issue.

This is what munin has to say about the time frame when the files were deleted:

http://picpaste.com/df_inode-pinpoint_1397768678_1397876678-kpwd9loR.png
http://picpaste.com/df-pinpoint_1397768678_1397876678-pQ7ZCTPu.png

The drives were "only" 50% busy while deleting all those inodes, though.

It's quite interesting how we got there in the first place, thanks to Bacula
backups and some other hardware failure not related to the backup server.

On 04/23/2014 10:25 AM, Dave Chinner wrote:
> On Wed, Apr 23, 2014 at 09:23:41AM +0200, Ivan Pantovic wrote:
>>> [root@drive-b ~]# xfs_db -r /dev/md0
>>> xfs_db> frag
>>> actual 11157932, ideal 11015175, fragmentation factor 1.28%
>>> xfs_db>
>> This is the current level of fragmentation ... is it bad?
> http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25._Is_that_bad.3F
>
>> Some say over 1% is a candidate for defrag? ...
> Some say that over 70% is usually not a problem:
>
> http://www.mythtv.org/wiki/XFS_Filesystem#Defragmenting_XFS_Partitions
>
> i.e. the level that becomes a problem is highly workload specific.
> So you can't read *anything* into that number without knowing exactly
> what is in your filesystem, how the application(s) interact with it,
> and so on.
>
> Besides, I was asking specifically about the files you removed, not
> the files that remain in the filesystem. Given that you have 11
> million inodes in the filesystem, you probably removed the only
> significantly large files in the filesystem....
>
> So, the files you removed are now free space, and free space
> fragmentation is what we need to look at. i.e. use the freesp
> command to dump the histogram and summary of the free space...
>
> Cheers,
>
> Dave.

* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23 22:12 UTC
To: Ivan Pantovic; +Cc: linux-kernel, Speedy Milan, xfs

On Wed, Apr 23, 2014 at 11:21:54AM +0200, Ivan Pantovic wrote:
> Hi Dave,
>
> > xfs_db> freesp
> >      from        to extents      blocks   pct
> >         1         1   52463       52463  0.00
> >         2         3   73270      181394  0.01
> >         4         7  134526      739592  0.03
> >         8        15  250469     2870193  0.12
> >        16        31  581572    13465403  0.58
> >        32        63  692386    32096932  1.37
> >        64       127 1234204   119157757  5.09

So these are the small free spaces that lead to problems. There are
around 3 million small free space extents in the filesystem,
totalling 7% of the free space. That's quite a lot, and it means that
there is a good chance that small allocations will find these small
free spaces rather than find a large extent and start from there.

> >       128       255   91015    16690243  0.71
> >       256       511   18977     6703895  0.29
> >       512      1023   12821     8611576  0.37
> >      1024      2047   23209    33177541  1.42
> >      2048      4095   43282   101126831  4.32
> >      4096      8191   12726    55814285  2.39
> >      8192     16383    2138    22750157  0.97
> >     16384     32767    1033    21790120  0.93
> >     32768     65535     433    19852497  0.85
> >     65536    131071     254    23052185  0.99
> >    131072    262143     204    37833000  1.62
> >    262144    524287     229    89970969  3.85
> >    524288   1048575     164   124210580  5.31
> >   1048576   2097151     130   173193687  7.40
> >   2097152   4194303      22    61297862  2.62
> >   4194304   8388607      16    97070435  4.15
> >   8388608  16777215      26   320475332 13.70
> >  16777216  33554431       6   133282461  5.70
> >  33554432  67108863      12   616939026 26.37
> > 134217728 268435328       1   207504563  8.87

There are some large free spaces still, so your filesystem is still
in fairly good shape from that perspective.

You can get a better idea of whether the fragmentation is isolated to
specific AGs by using the freesp -a <agno> command to dump each
individual free space index. You can then use the xfs_bmap command to
find files that are located in those fragmented AGs.

The only way to fix free space fragmentation right now is to remove
the extents that are chopping up the free space. Moving data around
on a per-directory basis (e.g. cp the regular files to a temp
directory, rename them back over the originals) is one way of
achieving this, though you have to carefully control the destination
AG and make sure it is an AG that is made up mostly of contiguous
free space to begin with....

But you only really need to do this if you are seeing ongoing
problems. Often just freeing up space in the filesystem will fix the
problem...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

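[A rough sketch of the two steps described above, for anyone wanting to try them;
the mount point, directory names, and AG count are illustrative, not from this
system:

    # how many allocation groups the filesystem has (agcount= in the output)
    xfs_info /backup

    # per-AG free space histogram; repeat for each AG number
    for ag in $(seq 0 31); do
        echo "=== AG $ag ==="
        xfs_db -r -c "freesp -s -a $ag" /dev/md0
    done

    # xfs_bmap -v includes an AG column, showing which allocation groups
    # a given file's extents land in
    xfs_bmap -v /backup/some-large-file

    # crude per-directory rewrite: copy out, then rename back over the originals
    # (only worthwhile if the copies land in an AG with large contiguous free space)
    mkdir /backup/.rewrite
    cp -p /backup/somedir/* /backup/.rewrite/
    mv -f /backup/.rewrite/* /backup/somedir/

Since the destination AG matters, it is worth checking where the rewritten files
actually landed with xfs_bmap -v before repeating this on other directories.]
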