public inbox for linux-xfs@vger.kernel.org
From: Richard Ems <richard.ems@cape-horn-eng.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com
Subject: Re: XFS unlink still slow on 3.1.9 kernel ?
Date: Tue, 14 Feb 2012 13:32:00 +0100	[thread overview]
Message-ID: <4F3A5440.409@cape-horn-eng.com> (raw)
In-Reply-To: <20120214000924.GF14132@dastard>

Hi Dave, hi list,

First, thanks for the very detailed reply. Please find my comments
and questions below.

On 02/14/2012 01:09 AM, Dave Chinner wrote:
> On Mon, Feb 13, 2012 at 05:57:58PM +0100, Richard Ems wrote:
>> I am running openSUSE 12.1, kernel 3.1.9-1.4-default. The 20 TB XFS
>> partition is 100% full
> 
> Running filesystems to 100% full is always a bad idea - it causes
> significant increases in fragmentation of both data and metadata
> compared to a filesystem that doesn't get past ~90% full.

Yes, true, I know. But I have no other free space for these backups. I
am waiting for a new system, already ordered, which will have 4 times
this space. So later I will open a new thread asking whether my plans
for creating this new 80 TB XFS partition are right.



>> I am asking because I am seeing very long times while removing big
>> directory trees. I thought on kernels above 3.0 removing dirs and files
>> had improved a lot, but I don't see that improvement.
> 
> You won't if the directory traversal is seek bound and that is the
> limiting factor for performance.

*Seek bound*? *When* is the directory traversal *seek bound*?


>> This is a backup system running dirvish, so most files in the dirs I am
>> removing are hard links. Almost all of the files do have ACLs set.
> 
> The unlink will have an extra IO to read per inode - the out-of-line
> attribute block, so you've just added 11 million IOs to the 800,000
> the traversal already takes to the unlink overhead. So it's going to
> take roughly ten hours because the unlink is going to be read-IO
> seek bound....

It took 110 minutes and not 10 hours. All files and dirs there had ACLs set.
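
For reference, Dave's ten-hour figure is consistent with a simple
seek-bound estimate; the ~3 ms average per random read is my
assumption, not a number from this thread:

```shell
# Back-of-envelope: 800,000 traversal IOs plus ~11 million extra
# out-of-line attribute reads, at an assumed ~3 ms per random IO.
total_ios=$((800000 + 11000000))
echo "$total_ios" | awk '{ printf "%.1f hours\n", $1 * 0.003 / 3600 }'
```

That it finished in 110 minutes instead suggests the IOs were cheaper
than a worst-case random seek (cache hits, readahead, or shorter
seeks).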


> Christoph's suggestion to use larger inodes to keep the attribute
> data inline is a very good one - whenever you have a workload that
> is attribute heavy you should use larger inodes to try to keep the
> attributes in-line if possible. The down side is that increasing the
> inode size increases the amount of IO required to read/write inodes,
> though this typically isn't a huge penalty compared to the penalty
> of out-of-line attributes.

I will always use larger inodes from now on, since we make heavy use
of ACLs on our XFS partitions.
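
The inode size is fixed at mkfs time, so this only helps newly created
filesystems. A sketch of what that looks like - /dev/sdX is a
placeholder, and 512 bytes is just one common choice for ACL-heavy
workloads, not a value recommended in this thread:

```shell
# Create an XFS filesystem with 512-byte inodes instead of the
# default 256 bytes, leaving more room for in-line attributes.
# WARNING: this destroys all data on the device.
mkfs.xfs -i size=512 /dev/sdX

# The inode size of an existing filesystem shows up as "isize"
# in the xfs_info output:
xfs_info /mount/point
```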


> Also, for large directories like this (millions of entries) you
> should also consider using a larger directory block size (mkfs -n
> size=xxxx option) as that can be scaled independently to the
> filesystem block size. This will significantly decrease the amount
> of IO and fragmentation large directories cause. Peak modification
> performance of small directories will be reduced because larger
> block size directories consume more CPU to process, but for large
> directories performance will be significantly better as they will
> spend much less time waiting for IO.

This was not ONE directory with that many files, but a directory tree
containing 834591 subdirectories (deeply nested, not all in the same
dir!) and 10539154 files.
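
The directory block size Dave refers to is likewise an mkfs-time
option; a hedged example combining it with the larger inodes (8 KiB is
illustrative, /dev/sdX is again a placeholder):

```shell
# Use 8 KiB directory blocks (by default the directory block size
# equals the 4 KiB filesystem block size). Larger directory blocks
# reduce IO and fragmentation for huge directories, at some CPU
# cost when modifying small ones.
mkfs.xfs -n size=8192 -i size=512 /dev/sdX
```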

Many thanks,
Richard


-- 
Richard Ems       mail: Richard.Ems@Cape-Horn-Eng.com

Cape Horn Engineering S.L.
C/ Dr. J.J. Dómine 1, 5º piso
46011 Valencia
Tel : +34 96 3242923 / Fax 924
http://www.cape-horn-eng.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
