From: Theodore Tso <tytso@mit.edu>
To: Eric Sandeen <sandeen@redhat.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: how to scale root-reserved space going forward...
Date: Sun, 1 Mar 2009 21:47:54 -0500
Message-ID: <20090302024754.GF6973@mit.edu>
In-Reply-To: <49AB0ABE.1030009@redhat.com>
On Sun, Mar 01, 2009 at 04:22:54PM -0600, Eric Sandeen wrote:
> 5% of a 16T filesystem is getting a little crazy from the point of view
> of "root-reserved" - 800G!
>
> But I think the original reason for this reserved space was actually as
> an allocator cushion; letting root gain access to it was just a
> safety-valve for that.
Yep, that's correct. Historically, this came from the BSD Fast
Filesystem, which used a default reserve of 10%. To quote from
the FreeBSD sources, in ufs/ffs/fs.h:
/*
* MINFREE gives the minimum acceptable percentage of filesystem
* blocks which may be free. If the freelist drops below this level
* only the superuser may continue to allocate blocks. This may
* be set to 0 if no reserve of free blocks is deemed necessary,
* however throughput drops by fifty percent if the filesystem
* is run at between 95% and 100% full; thus the minimum default
* value of fs_minfree is 5%. However, to get good clustering
* performance, 10% is a better choice. hence we use 10% as our
* default value. With 10% free space, fragmentation is not a
* problem, so we choose to optimize for time.
*/
#define MINFREE 8
The interesting thing is that FreeBSD has decided to push things down to
8%. A quick survey shows that NetBSD is using a MINFREE of 5%, like
Linux. (Fortunately, http://fxr.watson.org/ makes it easy to make
these comparisons.)
And like Linux, it looks like the *BSD's have the same tendency not to
update the comments when they update the code. :-)
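
For a sense of scale, the reserve percentages mentioned above can be turned into absolute sizes. This is only a rough sketch: the helper name `reserved_bytes` is made up for illustration, and "16T" is read as 16 TiB in binary units, which is an assumption.

```python
# Reserved space for a given filesystem size and reserve percentage.
# Assumption: "16T" means 16 TiB and sizes are reported in GiB (binary units).

TIB = 1024 ** 4
GIB = 1024 ** 3


def reserved_bytes(fs_bytes: int, percent: int) -> int:
    """Bytes held back when `percent` of the filesystem is reserved."""
    return fs_bytes * percent // 100


if __name__ == "__main__":
    # 5% (Linux, NetBSD), 8% (FreeBSD's MINFREE), 10% (the old FFS default)
    for percent in (5, 8, 10):
        gib = reserved_bytes(16 * TIB, percent) / GIB
        print(f"{percent:2d}% of 16 TiB = {gib:.1f} GiB")
```

At 5% this comes to roughly 819 GiB, in line with the "800G" figure quoted at the top of the thread.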
> Now that we have a completely different allocator in ext4, and
> potentially much larger filesystems, I think we need to revisit how much
> is held back, and for what reason.
>
> Any thoughts on a reasonable way to scale this reservation (or, just for
> discussion - if it's even needed at all today for ext4?)
This is a reasonable question. What would be great is if we could get
a benchmarking team to fill an ext4 filesystem with files. The simple
thing would be if we did something fixed --- say, 50 files per
directory, each file 100k, and say 10 subdirectories in each
directory, to some fixed depth, and with a filesystem size of at least
8 gigabytes (which would give us at least 16 flex groups with the
default flex size of 16) --- and then filled each filesystem from
0% to 90% in increments of 10%, and from 90% to 99% in increments of
1%, and then ran some throughput benchmark like bonnie on the mostly
filled filesystem.
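
The fixed fill plan described above can be sketched as follows. This is a planning sketch only, not an actual filler: the function names are hypothetical, "100k" is read as 100 KiB, and metadata overhead is ignored, so real fill percentages would come out slightly higher than computed here.

```python
# Planning helpers for the fixed-layout filler described in the text.
# Assumptions: "100k" = 100 KiB; filesystem metadata overhead is ignored.

FILE_SIZE = 100 * 1024      # bytes per file
FILES_PER_DIR = 50
SUBDIRS_PER_DIR = 10


def fill_levels():
    """0%..90% in steps of 10, then 90%..99% in steps of 1 (90 listed once)."""
    return list(range(0, 90, 10)) + list(range(90, 100))


def dirs_in_tree(depth):
    """Directory count for a full tree; depth 0 is just the top directory."""
    return sum(SUBDIRS_PER_DIR ** level for level in range(depth + 1))


def tree_bytes(depth):
    """File data held by a full tree of the given depth."""
    return dirs_in_tree(depth) * FILES_PER_DIR * FILE_SIZE


def files_for_fill(fs_bytes, percent):
    """How many 100 KiB files are needed to reach `percent` full."""
    return fs_bytes * percent // 100 // FILE_SIZE


if __name__ == "__main__":
    fs_bytes = 8 * 1024 ** 3   # the 8 gigabyte case
    for percent in fill_levels():
        print(percent, files_for_fill(fs_bytes, percent))
```

Under these assumptions a complete tree of depth 3 holds 1111 directories and about 5.3 GiB of file data, roughly two thirds of an 8 GiB filesystem, so a filler would have to go partway into a fourth level to reach the higher fill levels.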
A better filler would probably use random file sizes with an average
size of say 64k, but with outliers from 4k to 128 megs, and a similar
random distribution of number of files per directory, and number of
subdirectories and depth of subdirectories.
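
One way such a size distribution could be generated is sketched below. The choice of a clamped log-normal distribution and the spread parameter `SIGMA` are assumptions made for illustration; the text only asks for an average near 64k with outliers from 4k to 128 megs. Clamping at the low end nudges the sample mean slightly above the target, so the average is approximate.

```python
import math
import random

# Sketch of a random file-size generator: clamped log-normal (an assumed
# choice of distribution) with a mean of roughly 64 KiB.

MIN_SIZE = 4 * 1024              # 4k lower bound
MAX_SIZE = 128 * 1024 * 1024     # 128 meg upper bound
TARGET_MEAN = 64 * 1024          # desired average size
SIGMA = 1.5                      # assumed spread; larger means more outliers

# For a log-normal, mean = exp(mu + sigma^2 / 2), so solve for mu.
MU = math.log(TARGET_MEAN) - SIGMA ** 2 / 2


def random_file_size(rng):
    """Draw one file size in bytes, clamped to [MIN_SIZE, MAX_SIZE]."""
    size = int(rng.lognormvariate(MU, SIGMA))
    return max(MIN_SIZE, min(MAX_SIZE, size))


if __name__ == "__main__":
    rng = random.Random(2009)    # fixed seed so fills are reproducible
    sizes = [random_file_size(rng) for _ in range(100000)]
    print("mean:", sum(sizes) / len(sizes))
    print("min:", min(sizes), "max:", max(sizes))
```

Seeding the generator keeps the fill reproducible between benchmark runs. The same approach could be reused for the number of files per directory and the subdirectory counts, with different parameters.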
I suppose it would be good to do one set of charts with a filesystem
size of 8 gigs, and another at 80 gigs and 800 gigs, and see if the
shape of the filesystem curve changes at scale. Once we have that, we
would be in a position to make a reasonable set of defaults.
Or we could just guess and come up with some percentage figure that
sounds good. :-)
- Ted