public inbox for linux-xfs@vger.kernel.org
From: Richard Ems <richard.ems@cape-horn-eng.com>
To: xfs@oss.sgi.com
Subject: Re: XFS unlink still slow on 3.1.9 kernel ?
Date: Wed, 15 Feb 2012 13:07:01 +0100
Message-ID: <4F3B9FE5.9070407@cape-horn-eng.com>
In-Reply-To: <20120215012753.GJ14132@dastard>

Hi Dave, hi list,

On 02/15/2012 02:27 AM, Dave Chinner wrote:
> On Tue, Feb 14, 2012 at 01:32:00PM +0100, Richard Ems wrote:
>> On 02/14/2012 01:09 AM, Dave Chinner wrote:
>>>> I am asking because I am seeing very long times while removing big
>>>> directory trees. I thought on kernels above 3.0 removing dirs and files
>>>> had improved a lot, but I don't see that improvement.
>>>
>>> You won't if the directory traversal is seek bound and that is the
>>> limiting factor for performance.
>>
>> *Seek bound*? *When* is the directory traversal *seek bound*?
> 
> Whenever you are traversing a directory structure that is not already
> hot in the cache. IOWs, almost always.

Ok, got that.
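
For anyone wanting to see that cold-cache effect on their own system,
something like this should show it (the path is just an example, and
dropping caches needs root):

```sh
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null  # empty page/dentry/inode caches
time find /backup/tree > /dev/null   # cold run: one read IO (seek) per uncached metadata block
time find /backup/tree > /dev/null   # warm run: served from cache, no disk seeks
```

The warm run is typically orders of magnitude faster, which is exactly
the seek-bound gap Dave is describing.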

> 
>>>> This is a backup system running dirvish, so most files in the dirs I am
>>>> removing are hard links. Almost all of the files do have ACLs set.
>>>
>>> The unlink will have an extra IO to read per inode - the out-of-line
> attribute block, so you've just added 11 million IOs of unlink
> overhead on top of the 800,000 the traversal already takes. So it's
> going to take roughly ten hours because the unlink is going to be
> read IO seek bound....
>>
>> It took 110 minutes and not 10 hours. All files and dirs there had ACLs set.
> 
> I was basing that on your "find dir" time of 100 minutes, which was
> the only number you gave, and making the assumption it didn't read
> the attribute blocks and that it was seeing worst-case seek times
> (i.e. avg seek times) for every IO.
> 
> Given the way locality works in XFS, I'd suggest that the typical
> seek time will be much less (a few blocks, not half the disk
> platter) and not necessarily on the same disk (due to RAID) so the
> average seek time for your workload is likely to be much lower. If
> it's at 1ms (closer to track-to-track seek times) instead of the
> 5ms, then that 10hrs becomes 2hrs for that many IOs....

Many thanks for the clarification!!!
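
Just to make sure I follow the back-of-the-envelope math: seek-bound
unlink time is simply the number of read IOs times the average seek
time. The IO count below is my own assumption, roughly what a 10-hour
estimate at 5 ms per seek would imply:

```sh
# time = n_ios * avg_seek_time
# 7.2 million IOs is an assumed figure, for illustration only.
ios=7200000
for seek_ms in 5 1; do
    awk -v n="$ios" -v ms="$seek_ms" \
        'BEGIN { printf "%d IOs at %dms/seek: %.0f hours\n", n, ms, n * ms / 1000 / 3600 }'
done
```

That prints 10 hours at 5 ms and 2 hours at 1 ms, matching the figures
above, so the whole estimate really does hinge on the effective seek
time.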


>>> Also, for large directories like this (millions of entries) you
>>> should also consider using a larger directory block size (mkfs -n
>>> size=xxxx option) as that can be scaled independently to the
>>> filesystem block size. This will significantly decrease the amount
>>> of IO and fragmentation large directories cause. Peak modification
>>> performance of small directories will be reduced because larger
>>> block size directories consume more CPU to process, but for large
>>> directories performance will be significantly better as they will
>>> spend much less time waiting for IO.
>>
>> This was not ONE directory with that many files, but a directory
>> containing 834591 subdirectories (deeply nested, not all in the same
>> dir!) and 10539154 files.
> 
> So you've got a directory *tree* that indexes 11 million inodes, not
> "one directory with 11 million files and dirs in it" as you
> originally described.  Both Christoph and I have interpreted your
> original description as "one large directory", but there's no need
> to shout at us because it's difficult to understand any given
> configuration from just a few lines of text.  IOWs, details like "one
> directory" vs "one directory tree" might seem insignificant to you,
> but they mean an awful lot to us developers and can easily lead us
> the wrong path.

Sorry, I didn't mean to shout at anyone. I just wanted to clarify my
original description, since I noticed I had worded it wrong. Now I know
I should have used ** and not uppercase.
As you suggested, I should have written *directory tree* and not just
*directory*; sorry, my fault.
I really didn't mean to shout, and I am very happy about the fast and
extensive responses from both you and Christoph! Thanks again!
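
And if I ever re-create this backup filesystem I will try the larger
directory block size you suggested. If I read the mkfs.xfs man page
correctly, that would be something along these lines (the device name
is just a placeholder):

```sh
# 8 KiB directory blocks on a 4 KiB filesystem block size; -n size can
# be set independently of -b size, up to 64 KiB.
mkfs.xfs -b size=4096 -n size=8192 /dev/sdX

# xfs_info on the mounted filesystem should then report the directory
# block size in its "naming" line, e.g. bsize=8192 instead of 4096.
```

Please correct me if that is not the right way to set it.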


> 
> FWIW, directory tree traversal is even more read IO latency
> sensitive than a single large directory traversal because we can't
> do readahead across directory boundaries to hide seek latencies as
> much as possible and the locality on individual directories can be
> very different depending on the allocation policy the filesystem is
> using. As it is, large directory blocks can also reduce the amount
> of IO needed in this sort of situation and speed up traversals....
> 
> Cheers,
> 
> Dave.
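
On the traversal order: if I read the find(1) man page right, -delete
implies -depth, so the tree is walked bottom-up and every entry must be
read (seek by seek, when cold) before its parent directory can finally
be removed. A quick sanity check on a throwaway tree:

```sh
# Build a small temporary tree, then remove it depth-first.
# -delete implies -depth: files and subdirectories are unlinked before
# the directory that contains them.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b/c"
touch "$tmp/a/b/c/file1" "$tmp/a/file2"
find "$tmp" -depth -delete
```

After this the whole tree, including "$tmp" itself, is gone, which is
consistent with the bottom-up order Dave describes.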


Many thanks!
Richard

-- 
Richard Ems       mail: Richard.Ems@Cape-Horn-Eng.com

Cape Horn Engineering S.L.
C/ Dr. J.J. Dómine 1, 5º piso
46011 Valencia
Tel : +34 96 3242923 / Fax 924
http://www.cape-horn-eng.com


Thread overview: 31+ messages
2012-02-13 16:57 XFS unlink still slow on 3.1.9 kernel ? Richard Ems
2012-02-13 17:08 ` Christoph Hellwig
2012-02-13 17:11   ` Richard Ems
2012-02-13 17:15     ` Christoph Hellwig
2012-02-13 17:26       ` Richard Ems
2012-02-13 17:29         ` Christoph Hellwig
2012-02-13 17:53           ` Richard Ems
2012-02-13 18:02             ` Christoph Hellwig
2012-02-13 18:06               ` Richard Ems
2012-02-13 18:10                 ` Christoph Hellwig
2012-02-13 18:18                   ` Richard Ems
2012-02-13 18:48                   ` Richard Ems
2012-02-13 21:16                     ` Christoph Hellwig
2012-02-14  5:31                       ` Stan Hoeppner
2012-02-14  9:48                       ` Richard Ems
2012-02-14 19:43                         ` Christoph Hellwig
2012-02-14  9:49                       ` Richard Ems
2012-02-14 10:54                       ` Richard Ems
2012-02-14 11:44                       ` Richard Ems
2012-02-14  0:09 ` Dave Chinner
2012-02-14 12:32   ` Richard Ems
2012-02-14 19:45     ` Christoph Hellwig
2012-02-15 12:07       ` Richard Ems
2012-02-15  1:27     ` Dave Chinner
2012-02-15 12:07       ` Richard Ems [this message]
  -- strict thread matches above, loose matches on Subject: below --
2012-02-14 13:02 Richard Ems
     [not found] ` <4F3AA191.9030606@mnsu.edu>
2012-02-14 18:12   ` Richard Ems
2012-02-14 19:07     ` Christoph Hellwig
2012-02-15 12:48       ` Richard Ems
2012-02-14 23:10 ` Stan Hoeppner
2012-02-15 15:54   ` Richard Ems
