public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Investigating poor du performance on ReiserFS
@ 2006-10-06 14:56 Phillip Susi
  0 siblings, 0 replies; only message in thread
From: Phillip Susi @ 2006-10-06 14:56 UTC (permalink / raw)
  To: Linux-kernel

I've been looking into a performance problem lately on ReiserFS 
executing du on a rather large directory.  For reference this is a 
Maildir of lkml since Jan 1 of this year which currently contains around 
90,000 messages/files.  Executing a du in this directory with cold 
caches takes a horribly long time.  A find completes rather quickly but 
all the stat()s that du performs seems to take a very long time to read 
in the required data ( orders of magnitude longer than it should take 
for the disks to read the amount of data transfered ).

I believe this is due to a massive seek storm caused in the process of 
reading all of the leaf nodes to fetch the stat blocks.  I have surmised 
that this is due to the fact that the directory entries are sorted by 
their hash value, and that is the order they are returned to du in, 
which then performs a stat() on each one in sequence.  The problem is 
that the hash sort order has no relationship to the order of the leaf 
nodes that hold the stat info.  While the leaf nodes generally have 
keys, and thus block locations, close to the parent directory, the order 
they are accessed in is essentially random, which causes the seek storm.

Does this theory sound plausible?  How hard would it be to sort the 
directory listing by key before returning it?  Would doing that likely 
fix the problem by causing du to stat the files in the order of the leaf 
node keys, and thus, quite likely in the order that the blocks appear on 
disk?


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2006-10-06 14:56 UTC | newest]

Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-10-06 14:56 Investigating poor du performance on ReiserFS Phillip Susi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox