From: tytso@mit.edu (Theodore Ts'o)
To: kernelnewbies@lists.kernelnewbies.org
Subject: Work (really slow directory access on ext4)
Date: Wed, 6 Aug 2014 10:49:37 -0400
Message-ID: <20140806144937.GA22325@thunk.org>
I don't subscribe to kernelnewbies, but I came across this thread in
the mail archive while researching an unrelated issue.
Valdis' observations are on the mark here. It's almost certain that
you are getting overwhelmed with other disk traffic, because your
directory isn't *that* big.
That being said, there are certainly issues with really really big
directories, and solving this is certainly not going to be a newbie
project (if it was easy to solve, it would have been addressed a long
time ago). See:
http://en.it-usenet.org/thread/11916/10367/
for the background. It's a little bit dated, in that we do use a
64-bit hash on 64-bit systems, but the fundamental issues are still
there.
If you sort the readdir files by inode order, this can help
significantly. Some userspace programs, such as mutt, do this.
Unfortunately "ls" does not. (That might be a good newbie project,
since it's a userspace-only project. However, I'm pretty sure the
coreutils maintainers will also react negatively if they are sent
patches which don't compile. :-)
A proof of concept of how this can be a win can be found here:
http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/contrib/spd_readdir.c
LD_PRELOAD libraries aren't guaranteed to work with all programs, so this is much
more of a hack than something I'd recommend for extended production
use. But it shows that if you have a readdir+stat workload, sorting
by inode makes a huge difference.
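To make the readdir+stat idea concrete, here is a minimal Python sketch of
the same technique (it is not a port of spd_readdir.c, and the function
name stat_in_inode_order is made up for illustration). It assumes
Python 3.6 or later, where os.scandir() exposes the inode number straight
from the directory entry, so no stat() is needed just to build the sort key.

```python
import os

def stat_in_inode_order(path):
    """Return (name, stat_result) pairs, calling lstat() in inode order."""
    # scandir() reports the inode number from the directory entry itself,
    # so collecting (inode, name) pairs costs no per-file stat() call.
    with os.scandir(path) as it:
        entries = [(entry.inode(), entry.name) for entry in it]

    # readdir on a hashed directory returns names in hash order, which is
    # effectively random with respect to the inode table. Sorting by inode
    # number makes the following stat() calls touch it roughly sequentially.
    entries.sort()

    results = []
    for _inode, name in entries:
        # lstat() so that symlinks are reported rather than followed.
        results.append((name, os.lstat(os.path.join(path, name))))
    return results
```

Calling stat_in_inode_order("/some/big/dir") returns the entries in inode
order; a caller that wants alphabetical output, as "ls" produces, can sort
the result by name afterwards, once the expensive stat() calls are done.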
As far as getting traces to better understand problems, I strongly
suggest that you try things like vmstat, iostat, and blktrace; system
call traces like strace aren't going to get you very far. (See
http://brooker.co.za/blog/2013/07/14/io-performance.html for a nice
introduction to blktrace). Use the scientific method; collect
baseline statistics using vmstat, iostat, and sar before you run your
test workload, so you know how much I/O is going on before you start
your test. If you can run your test on a quiescent system, that's a
really good idea. Then collect statistics as you run your workload,
and then only tweak one variable at a time, and record everything in a
systematic way.
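The baseline-then-measure procedure can be sketched in Python. This is only
an illustration under stated assumptions: instead of running vmstat, iostat
or sar, it reads the Linux /proc/diskstats counters directly, and the
function names (parse_diskstats, diskstats_delta, measure) are hypothetical.
The counters are system-wide, so the delta includes any other disk traffic
that happened during the run, which is exactly why an idle baseline is
worth recording first.

```python
import time

def parse_diskstats(text):
    """Parse /proc/diskstats text into {device: counters}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        # Layout: major, minor, device name, then cumulative counters.
        stats[fields[2]] = {
            "reads": int(fields[3]),
            "sectors_read": int(fields[5]),
            "writes": int(fields[7]),
            "sectors_written": int(fields[9]),
        }
    return stats

def diskstats_delta(before, after):
    """Counters accumulated between two parse_diskstats() snapshots."""
    return {
        dev: {key: after[dev][key] - before[dev][key] for key in after[dev]}
        for dev in after
        if dev in before
    }

def measure(workload, path="/proc/diskstats"):
    """Run workload() and return (elapsed seconds, per-device deltas)."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    start = time.monotonic()
    workload()
    elapsed = time.monotonic() - start
    with open(path) as f:
        after = parse_diskstats(f.read())
    return elapsed, diskstats_delta(before, after)
```

One way to use it: first call measure(lambda: time.sleep(30)) to see how
much I/O the machine does on its own, then call measure() with the real
workload, change a single variable, and repeat, keeping every result.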
Finally, if you have more problems of a technical nature with respect
to ext4, there is the ext3-users at redhat.com list, or the
developers' list at linux-ext4 at vger.kernel.org. It would be nice if
you tried ext3-users or kernelnewbies first, or tried googling to
see if anyone else has come across the problem and figured out the
solution already, but if you can't figure things out any other way, do
feel free to ask the linux-ext4 list. We won't bite. :-)
Cheers,
- Ted
P.S. If you have a large number of directories which are much larger
than you expect, and you don't want to do the "mkdir foo.new; mv foo/*
foo.new ; rmdir foo; mv foo.new foo" trick on a large number of
directories, you can also schedule downtime and while the file system
is unmounted, use "e2fsck -fD". See the man page for more details.
It won't solve all of your problems, and it might not solve any of
your problems, but it will probably make the performance of large
directories somewhat better.
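For a handful of directories, the mkdir/mv/rmdir/mv trick can be scripted.
The following Python sketch is one possible rendering, not a tested
production tool: the function name rebuild_directory and the ".new" suffix
are illustrative, it assumes nothing else is using the directory while it
runs (the sequence is not atomic), and it restores only the permission
bits, not ownership, timestamps or extended attributes.

```python
import os
import stat

def rebuild_directory(path):
    """Recreate `path` as a fresh directory holding the same entries."""
    path = os.path.normpath(path)
    new_path = path + ".new"      # temporary name, as in the shell version
    old_mode = stat.S_IMODE(os.stat(path).st_mode)

    os.mkdir(new_path)            # raises if foo.new already exists

    # Unlike the shell glob foo/*, scandir() also returns dotfiles, so the
    # rmdir() below does not fail because of hidden entries left behind.
    with os.scandir(path) as it:
        names = [entry.name for entry in it]

    # rename() within one file system only relinks the entries; file
    # contents are not copied.
    for name in names:
        os.rename(os.path.join(path, name), os.path.join(new_path, name))

    os.rmdir(path)                # succeeds only if the old directory is empty
    os.rename(new_path, path)
    os.chmod(path, old_mode)      # mkdir() applied the umask; restore the mode
```

If any step fails partway through, some entries will be in foo and some in
foo.new, so it is worth trying this on a scratch directory first.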