* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23  2:18 UTC
To: Speedy Milan; +Cc: xfs, linux-kernel, Ivan Pantovic

[cc xfs@oss.sgi.com]

On Mon, Apr 21, 2014 at 10:58:53PM +0200, Speedy Milan wrote:
> I want to report very slow deletion of 24 50GB files (in total 12 TB),
> all present in the same folder.

total = 1.2TB?

> OS is CentOS 6.4, with upgraded kernel 3.13.1.
>
> The hardware is a Supermicro server with 15x 4TB WD Se drives in MD
> RAID 6, totalling 52TB of free space.
>
> XFS is formatted directly on the RAID volume, without LVM layers.
>
> Deletion was done with rm -f * command, and it took upwards of 1 hour
> to delete the files.
>
> File system was filled completely prior to deletion.

Oh, that's bad. It's likely you fragmented the files into
millions of extents?

> rm was mostly waiting (D state), probably for kworker threads, and

No, waiting for IO.

> iostat was showing big HDD utilization numbers and very low throughput,
> so it looked like a random HDD workload was in effect.

Yup, smells like file fragmentation. Non-fragmented 50GB files
should be removed in a few milliseconds, but if you've badly
fragmented the files, there could be 10 million extents in a 50GB
file. A few milliseconds per extent removal gives you....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

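[A quick way to sanity-check this diagnosis before deleting is to count the extents
in one of the files. The commands below are standard xfsprogs / e2fsprogs tools; the
file path is only an example, not taken from the report:

    # one line of xfs_bmap output per extent (plus a header line)
    xfs_bmap /backups/full-volume-0001 | wc -l

    # or, filesystem-independent:
    filefrag /backups/full-volume-0001

For scale, with assumed numbers: 10 million extents at ~2 ms of random IO per extent
removal is roughly 20,000 seconds, i.e. over five hours for a single file, so even a
fraction of that fragmentation explains an hour-long rm of the whole set.]
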
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Ivan Pantovic @ 2014-04-23  7:23 UTC
To: Dave Chinner; +Cc: linux-kernel, Speedy Milan, xfs

> [root@drive-b ~]# xfs_db -r /dev/md0
> xfs_db> frag
> actual 11157932, ideal 11015175, fragmentation factor 1.28%
> xfs_db>

This is the current level of fragmentation ... is it bad? Some say over 1% is a
candidate for defrag? ... We can leave it like this, wait for the next full backup,
and then check the fragmentation of that file.

On 04/23/2014 04:18 AM, Dave Chinner wrote:
> [cc xfs@oss.sgi.com]
>
> On Mon, Apr 21, 2014 at 10:58:53PM +0200, Speedy Milan wrote:
>> I want to report very slow deletion of 24 50GB files (in total 12 TB),
>> all present in the same folder.
> total = 1.2TB?
>
>> OS is CentOS 6.4, with upgraded kernel 3.13.1.
>>
>> The hardware is a Supermicro server with 15x 4TB WD Se drives in MD
>> RAID 6, totalling 52TB of free space.
>>
>> XFS is formatted directly on the RAID volume, without LVM layers.
>>
>> Deletion was done with rm -f * command, and it took upwards of 1 hour
>> to delete the files.
>>
>> File system was filled completely prior to deletion.
> Oh, that's bad. It's likely you fragmented the files into
> millions of extents?
>
>> rm was mostly waiting (D state), probably for kworker threads, and
> No, waiting for IO.
>
>> iostat was showing big HDD utilization numbers and very low throughput,
>> so it looked like a random HDD workload was in effect.
> Yup, smells like file fragmentation. Non-fragmented 50GB files
> should be removed in a few milliseconds, but if you've badly
> fragmented the files, there could be 10 million extents in a 50GB
> file. A few milliseconds per extent removal gives you....
>
> Cheers,
>
> Dave.

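[For reference, the factor reported by frag is derived from the two extent counts it
prints: (actual - ideal) / actual. With the numbers above, (11157932 - 11015175) /
11157932 ~= 0.0128, i.e. 1.28% -- only about 143k excess extents spread across
roughly 11 million inodes, which is why, as the next reply explains, the number says
very little on its own.]
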
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23  8:25 UTC
To: Ivan Pantovic; +Cc: linux-kernel, Speedy Milan, xfs

On Wed, Apr 23, 2014 at 09:23:41AM +0200, Ivan Pantovic wrote:
>
> >[root@drive-b ~]# xfs_db -r /dev/md0
> >xfs_db> frag
> >actual 11157932, ideal 11015175, fragmentation factor 1.28%
> >xfs_db>
>
> This is the current level of fragmentation ... is it bad?

http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25._Is_that_bad.3F

> Some say over 1% is a candidate for defrag? ...

Some say that over 70% is usually not a problem:

http://www.mythtv.org/wiki/XFS_Filesystem#Defragmenting_XFS_Partitions

i.e. the level that becomes a problem is highly workload specific.
So you can't read *anything* into that number without knowing exactly
what is in your filesystem, how the application(s) interact with it,
and so on.

Besides, I was asking specifically about the files you removed, not
the files that remain in the filesystem. Given that you have 11
million inodes in the filesystem, you probably removed the only
significantly large files in the filesystem....

So, the files you removed are now free space, and free space
fragmentation is what we need to look at. i.e. use the freesp
command to dump the histogram and summary of the free space...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

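[For anyone following along, frag and freesp can also be run non-interactively via
xfs_db's -c option; the device path is the one from earlier in the thread:

    # histogram plus summary of free space extent sizes, read-only
    xfs_db -r -c "freesp -s" /dev/md0

The from/to/blocks columns in the freesp output are in filesystem blocks (typically
4 KiB), so a 1-block free extent is 4 KiB of free space.]
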
* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Ivan Pantovic @ 2014-04-23  9:21 UTC
To: Dave Chinner; +Cc: linux-kernel, Speedy Milan, xfs

Hi Dave,

> xfs_db> freesp
>      from        to extents      blocks   pct
>         1         1   52463       52463  0.00
>         2         3   73270      181394  0.01
>         4         7  134526      739592  0.03
>         8        15  250469     2870193  0.12
>        16        31  581572    13465403  0.58
>        32        63  692386    32096932  1.37
>        64       127 1234204   119157757  5.09
>       128       255   91015    16690243  0.71
>       256       511   18977     6703895  0.29
>       512      1023   12821     8611576  0.37
>      1024      2047   23209    33177541  1.42
>      2048      4095   43282   101126831  4.32
>      4096      8191   12726    55814285  2.39
>      8192     16383    2138    22750157  0.97
>     16384     32767    1033    21790120  0.93
>     32768     65535     433    19852497  0.85
>     65536    131071     254    23052185  0.99
>    131072    262143     204    37833000  1.62
>    262144    524287     229    89970969  3.85
>    524288   1048575     164   124210580  5.31
>   1048576   2097151     130   173193687  7.40
>   2097152   4194303      22    61297862  2.62
>   4194304   8388607      16    97070435  4.15
>   8388608  16777215      26   320475332 13.70
>  16777216  33554431       6   133282461  5.70
>  33554432  67108863      12   616939026 26.37
> 134217728 268435328       1   207504563  8.87
> xfs_db>

Well, now it is quite obvious that file fragmentation was actually the issue.

This is what munin has to say about the time frame when the files were deleted:

http://picpaste.com/df_inode-pinpoint_1397768678_1397876678-kpwd9loR.png
http://picpaste.com/df-pinpoint_1397768678_1397876678-pQ7ZCTPu.png

The drives were "only" 50% busy while deleting all those inodes, though.

It's quite interesting how we got there in the first place, thanks to Bacula
backups and some other hardware failure not related to the backup server.

On 04/23/2014 10:25 AM, Dave Chinner wrote:
> On Wed, Apr 23, 2014 at 09:23:41AM +0200, Ivan Pantovic wrote:
>>> [root@drive-b ~]# xfs_db -r /dev/md0
>>> xfs_db> frag
>>> actual 11157932, ideal 11015175, fragmentation factor 1.28%
>>> xfs_db>
>> This is the current level of fragmentation ... is it bad?
> http://xfs.org/index.php/XFS_FAQ#Q:_The_xfs_db_.22frag.22_command_says_I.27m_over_50.25._Is_that_bad.3F
>
>> Some say over 1% is a candidate for defrag? ...
> Some say that over 70% is usually not a problem:
>
> http://www.mythtv.org/wiki/XFS_Filesystem#Defragmenting_XFS_Partitions
>
> i.e. the level that becomes a problem is highly workload specific.
> So you can't read *anything* into that number without knowing exactly
> what is in your filesystem, how the application(s) interact with it,
> and so on.
>
> Besides, I was asking specifically about the files you removed, not
> the files that remain in the filesystem. Given that you have 11
> million inodes in the filesystem, you probably removed the only
> significantly large files in the filesystem....
>
> So, the files you removed are now free space, and free space
> fragmentation is what we need to look at. i.e. use the freesp
> command to dump the histogram and summary of the free space...
>
> Cheers,
>
> Dave.

* Re: rm -f * on large files very slow on XFS + MD RAID 6 volume of 15x 4TB of HDDs (52TB)

From: Dave Chinner @ 2014-04-23 22:12 UTC
To: Ivan Pantovic; +Cc: linux-kernel, Speedy Milan, xfs

On Wed, Apr 23, 2014 at 11:21:54AM +0200, Ivan Pantovic wrote:
> Hi Dave,
>
> > xfs_db> freesp
> >      from        to extents      blocks   pct
> >         1         1   52463       52463  0.00
> >         2         3   73270      181394  0.01
> >         4         7  134526      739592  0.03
> >         8        15  250469     2870193  0.12
> >        16        31  581572    13465403  0.58
> >        32        63  692386    32096932  1.37
> >        64       127 1234204   119157757  5.09

So these are the small free spaces that lead to problems. There are
around 3 million small free space extents in the filesystem,
totalling 7% of the free space. That's quite a lot, and it means that
there is a good chance that small allocations will find these small
free spaces rather than find a large extent and start from there.

> >       128       255   91015    16690243  0.71
> >       256       511   18977     6703895  0.29
> >       512      1023   12821     8611576  0.37
> >      1024      2047   23209    33177541  1.42
> >      2048      4095   43282   101126831  4.32
> >      4096      8191   12726    55814285  2.39
> >      8192     16383    2138    22750157  0.97
> >     16384     32767    1033    21790120  0.93
> >     32768     65535     433    19852497  0.85
> >     65536    131071     254    23052185  0.99
> >    131072    262143     204    37833000  1.62
> >    262144    524287     229    89970969  3.85
> >    524288   1048575     164   124210580  5.31
> >   1048576   2097151     130   173193687  7.40
> >   2097152   4194303      22    61297862  2.62
> >   4194304   8388607      16    97070435  4.15
> >   8388608  16777215      26   320475332 13.70
> >  16777216  33554431       6   133282461  5.70
> >  33554432  67108863      12   616939026 26.37
> > 134217728 268435328       1   207504563  8.87

There are some large free spaces still, so your filesystem is still
in fairly good shape from that perspective.

You can get a better idea of whether the fragmentation is isolated to
specific AGs by using the freesp -a <agno> command to dump each
individual free space index. You can then use the xfs_bmap command to
find files that are located in those fragmented AGs.

The only way to fix free space fragmentation right now is to remove
the extents that are chopping up the free space. Moving data around
on a per-directory basis (e.g. cp the regular files to a temp
directory, rename them back over the originals) is one way of
achieving this, though you have to carefully control the destination
AG and make sure it is an AG that is made up mostly of contiguous
free space to begin with....

But you only really need to do this if you are seeing ongoing
problems. Often just freeing up space in the filesystem will fix the
problem...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

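[A rough sketch of the two steps described above, for anyone wanting to try them;
the mount point, directory names, and AG count are illustrative, not from this
system:

    # how many allocation groups the filesystem has (agcount= in the output)
    xfs_info /backup

    # per-AG free space histogram; repeat for each AG number
    for ag in $(seq 0 31); do
        echo "=== AG $ag ==="
        xfs_db -r -c "freesp -s -a $ag" /dev/md0
    done

    # xfs_bmap -v includes an AG column, showing which allocation groups
    # a given file's extents land in
    xfs_bmap -v /backup/some-large-file

    # crude per-directory rewrite: copy out, then rename back over the originals
    # (only worthwhile if the copies land in an AG with large contiguous free space)
    mkdir /backup/.rewrite
    cp -p /backup/somedir/* /backup/.rewrite/
    mv -f /backup/.rewrite/* /backup/somedir/

Since the destination AG matters, it is worth checking where the rewritten files
actually landed with xfs_bmap -v before repeating this on other directories.]
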