From: Theodore Tso <tytso@mit.edu>
To: Timo Sirainen <tss@iki.fi>
Cc: Josef Bacik <josef@toxicpanda.com>, linux-kernel@vger.kernel.org
Subject: Re: ext3/ext4 directories don't shrink after deleting lots of files
Date: Fri, 15 May 2009 14:25:41 -0400
Message-ID: <20090515182541.GB6641@mit.edu>
In-Reply-To: <1242408544.6933.687.camel@timo-desktop>
On Fri, May 15, 2009 at 01:29:04PM -0400, Timo Sirainen wrote:
> On Fri, 2009-05-15 at 06:58 -0400, Theodore Tso wrote:
> > > I was rather thinking something that I could run while the system was
> > > fully operational. Otherwise just moving the files to a temp directory +
> > > rmdir() + rename() would have been fine too.
> > >
> > > I just tested that xfs, jfs and reiserfs all shrink the directories
> > > immediately. Is it more difficult to implement for ext* or has no one
> > > else found this to be a problem?
> >
> > It's probably fairest to say no one has thought it worth the effort.
>
> My problem is with mail servers and Maildir format where it's possible
> that a user has tons of emails and wants to delete them. The mailbox
> maybe slowly grows back to the huge size, but in the meantime it's
> slower than necessary.
The problem is that unless the user is deleting a *huge* number of
files, it's rare that a directory block goes completely empty. Say
you shrink from 15,000 messages to 12,000 messages: because we use a
hashed b-tree as our data structure, the leaf blocks in the b-tree
generally still contain some directory entries.
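To put rough numbers on that, here is a toy simulation, not the actual
ext3/ext4 htree code. It assumes entries are spread uniformly at random
over 150 leaf blocks (about 100 entries per block, a made-up figure) and
that the deleted messages are a random subset. The function name and
parameters are hypothetical.

```python
import random


def empty_leaf_blocks(total=15000, deleted=3000, n_blocks=150, seed=0):
    """Count leaf blocks left with zero entries after random deletions.

    Toy model only: each entry lands in a uniformly random block, which
    stands in for hash-ordered placement. Real htree leaves split and
    fill differently.
    """
    rng = random.Random(seed)
    # Block index that each directory entry lives in.
    owner = [rng.randrange(n_blocks) for _ in range(total)]
    remaining = [0] * n_blocks
    for block in owner:
        remaining[block] += 1
    # Delete a random subset of entries.
    for entry in rng.sample(range(total), deleted):
        remaining[owner[entry]] -= 1
    return sum(1 for count in remaining if count == 0)


if __name__ == "__main__":
    print("empty blocks after deleting 3,000 of 15,000:", empty_leaf_blocks())
```

Under these assumptions each block loses roughly a fifth of its entries
and essentially none of them empties out, so there is nothing to free
unless partially filled leaves are merged.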
So to fix this we need to actually coalesce directory leaf blocks on
the fly, on top of everything else that I had mentioned. It's
certainly doable, but again, someone would have to submit a patch. We
might get around to it one of these days, but the plates of those of
us who are doing ext4 are pretty full with higher-priority items at
present.
There is an off-line fix that works quite well -- e2fsck -fD, but
obviously that requires scheduling downtime.
How big of a deal is this for you? I use a local maildir myself, and
they can get quite large:
% ls /home/tytso/isync/mit
total 2132
1412 cur/ 716 new/ 4 tmp/
But once they are in cache, it's no longer a major problem. I suppose
on a mail server where you have a very large number of users, caching
2 megs of directory data per user could get ugly; and it does take
time the first time you pull their directory entry into the cache.
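One way to see how much directory data a given maildir carries is to
compare the size of the directory itself with the number of entries it
currently holds. The following is a minimal sketch; the function name
is hypothetical, and it assumes a filesystem such as ext3/ext4 where
st_size of a directory reflects its allocated directory blocks (other
filesystems report something different here).

```python
import os
import sys


def dir_size_and_entries(path):
    """Return (directory size in bytes, number of entries) for path."""
    # On ext3/ext4 this is the space taken by the directory blocks,
    # which does not go down when files are deleted.
    size = os.stat(path).st_size
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return size, count


if __name__ == "__main__":
    for path in sys.argv[1:]:
        size, count = dir_size_and_entries(path)
        print(f"{path}: {size} bytes of directory data, {count} entries")
```

A directory that reports megabytes of directory data but only a handful
of entries is one that grew large and was later emptied out.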
What sort of performance degradation are you measuring, and what are
the impacts operationally at the moment for you? Is this just a
theoretical concern, or are you measuring a significant slowdown as a result?
- Ted