From: "Ragnar Kjørstad" <reiserfs@ragnark.vestdata.no>
To: Jonathan Briggs <jbriggs@esoft.com>
Cc: Ed Walker <ewalker@surfcity.net>, reiserfs-list@namesys.com
Subject: Re: Fastest way to "find / -mtime +7".....
Date: Tue, 19 Jul 2005 22:09:02 +0200
Message-ID: <20050719200902.GM1508@vestdata.no>
In-Reply-To: <1121798933.15596.13.camel@localhost>

On Tue, Jul 19, 2005 at 12:48:53PM -0600, Jonathan Briggs wrote:
> > this is pretty slow on reiser, at least compared with ext2/3, and I  
> > understand that it may be because the find command returns the names  
> > in a non-optimal order (ie readdir order?).
> 
> I think Reiser3 is slow more because with mtime, find has to stat each
> file. 

The two issues are related.

Readdir will return the filenames in hash order. Find will then go and
stat each file, still in hash order.

Problem is, the inodes are not sorted in directory hash order on the
disk. They tend to be in approximate creation order. So, the disk access
pattern is nearly random access, meaning every stat is likely to touch a
new block and readahead is completely useless.
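
As a rough illustration of working around this from userspace, the sketch below reads each directory, sorts the entries by inode number, and only then stats them. It assumes that inode numbers roughly track where the inodes sit on disk, which depends on the filesystem and is not guaranteed. The function name older_than is made up for this example, and the age test only approximates what "find -mtime +N" does (whole 24-hour periods, fraction dropped).

```python
import os
import stat
import time


def older_than(root, days):
    """Yield paths below root that were modified more than `days`
    whole 24-hour periods ago (roughly what find -mtime +days selects).

    Entries are stat'ed in inode-number order instead of readdir order.
    Assumption: inode numbers approximate on-disk inode placement.
    """
    now = time.time()
    pending = [root]
    while pending:
        directory = pending.pop()
        try:
            with os.scandir(directory) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable directory: skip it
        # On POSIX the inode number comes from the directory entry
        # itself, so sorting does not cost any extra stat calls.
        entries.sort(key=lambda e: e.inode())
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
            except OSError:
                continue  # entry vanished or is not accessible
            if stat.S_ISDIR(st.st_mode):
                pending.append(entry.path)
            # Age in whole days, fractional part discarded.
            if (now - st.st_mtime) // 86400 > days:
                yield entry.path
```

Calling list(older_than("/var/spool", 7)) would then be the rough equivalent of "find /var/spool -mtime +7", except that the starting directory itself is not reported.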



I once wrote a new hash for reiserfs3 specifically for maildir. This
hash caused files to be ordered in approximate creation order, matching
the inode order much more closely.
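
To put a number on why the ordering matters, here is a toy model. It is not the reiserfs3 hash and not the patch; it just assumes that inodes are laid out in creation order with a fixed, made-up number of inodes per block, and uses SHA-256 of an invented filename as a stand-in for "hash order". It counts how often two consecutive stats land in different blocks.

```python
import hashlib

INODES_PER_BLOCK = 16  # hypothetical packing density, for illustration only


def block_switches(order):
    """Count how often consecutive stats hit a different block.

    `order` is a sequence of creation indices; index i is assumed to
    live in block i // INODES_PER_BLOCK.
    """
    blocks = [i // INODES_PER_BLOCK for i in order]
    return sum(1 for a, b in zip(blocks, blocks[1:]) if a != b)


def hash_order(n):
    """Creation indices 0..n-1, sorted by a hash of a made-up filename."""
    def key(i):
        return hashlib.sha256(f"msg{i}".encode()).digest()
    return sorted(range(n), key=key)


n = 10000
print("creation order:", block_switches(range(n)))   # 624 block switches
print("hash order:    ", block_switches(hash_order(n)))
```

In creation order each block is visited once, so there are 624 switches for 10000 files. In hash order nearly every stat moves to a different block, which is the access pattern described above.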


You will find both the patch and some benchmark results if you search
the archive (message-id 20010628030201.E380@vestdata.no); it sped up
my testcase by a factor of 6. (My testcase read all the data too, though.
I don't think I ever tested just "find . -ls".)


In reiserfs3 the usefulness of the hash is limited, as the hash is a
per-filesystem setting. (So it is only useful if you have a dedicated
maildir filesystem.) I've lost track of reiserfs4 features - maybe you
can select hashes per directory now? Or maybe the whole thing is made
obsolete by putting the inodes with the directory entries?




-- 
Ragnar Kjørstad
Software Engineer
Scali - http://www.scali.com
Scaling the Linux Datacenter
