linux-ext4.vger.kernel.org archive mirror
* [PATCH] mke2fs reserved_ratio default value is nonsensical
@ 2011-03-28 18:02 Oren Elrad
  2011-03-28 18:06 ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Oren Elrad @ 2011-03-28 18:02 UTC (permalink / raw)
  To: linux-ext4

Undesired behavior: mke2fs defaults to reserving 5% of the volume for
the root user. 5% of a 2TB volume is 100GB. The rationale for root
reservation (syslogd, etc.) does not require 100GB. As volumes get
larger, this default makes less and less sense.

Proposal: If the user does not specify their preferred reserved_ratio
on the command line (-m), use the lesser of 5% or MAX_RSRV_SIZE. I
propose 10GiB as a sensible maximum default reservation for root.

Patch: Follows and http://capsid.brandeis.edu/~elrad/e2fsprog.gitdiff

Tested on the latest git+patch, RHEL5 (2.6.18-194.17.1.el5) with a
12TB volume (which would reserve 600GB under the default!):

# /root/e2fsprogs/misc/mke2fs -T ext4 -L scratch /dev/sdd1
[...]
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
732422144 inodes, 2929671159 blocks
2621440 blocks (0.09%) reserved for the super user
[...]
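
As a check on the figures above, the proposed default can be sketched in
a few lines of C. This is a minimal illustration, not the patch itself:
default_reserved_blocks() is a hypothetical helper name, the 10 GiB cap
is the MAX_RSRV_SIZE value proposed here, and the block count and block
size are taken from the mke2fs output above.

```c
#include <assert.h>
#include <stdint.h>

/* Cap proposed in this thread: 10 GiB, expressed in bytes. */
#define MAX_RSRV_SIZE (10ULL << 30)

/*
 * Default reservation under the proposal: the lesser of 5% of the
 * volume and MAX_RSRV_SIZE, both expressed in filesystem blocks.
 * Hypothetical helper for illustration; not part of e2fsprogs.
 */
static uint64_t default_reserved_blocks(uint64_t total_blocks,
					uint32_t block_size)
{
	assert(block_size > 0);

	uint64_t five_percent = total_blocks / 20;	/* 5% of the volume */
	uint64_t cap = MAX_RSRV_SIZE / block_size;	/* 10 GiB in blocks */

	return five_percent < cap ? five_percent : cap;
}
```

For the 12TB test volume, default_reserved_blocks(2929671159, 4096)
gives 2621440 blocks (the 10 GiB cap), where a flat 5% would be
146483557 blocks, roughly 600GB.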

Oren Elrad
Dept. of Physics
Brandeis University

---- Patch follows ----

diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index 9798b88..0ff3785 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -108,6 +108,8 @@ profile_t	profile;
 int sys_page_size = 4096;
 int linux_version_code = 0;

+static const unsigned long long MAX_RSRV_SIZE = 10ULL * (1 << 30); // 10 GiB
+
 static void usage(void)
 {
 	fprintf(stderr, _("Usage: %s [-c|-l filename] [-b block-size] "
@@ -1154,7 +1156,7 @@ static void PRS(int argc, char *argv[])
 	int		inode_ratio = 0;
 	int		inode_size = 0;
 	unsigned long	flex_bg_size = 0;
-	double		reserved_ratio = 5.0;
+	double		reserved_ratio = -1.0; // Default: lesser of 5%, MAX_RSRV_SIZE
 	int		lsector_size = 0, psector_size = 0;
 	int		show_version_only = 0;
 	unsigned long long num_inodes = 0; /* unsigned long long to catch too-large input */
@@ -1893,9 +1895,17 @@ profile_error:

 	/*
 	 * Calculate number of blocks to reserve
+	 * If reserved_ratio >= 0.0, it was passed as an argument, use it as-is
+	 * If reserved_ratio < 0.0, no argument was passed, choose the lesser of 5%, MAX_RSRV_SIZE
 	 */
-	ext2fs_r_blocks_count_set(&fs_param, reserved_ratio *
-				  ext2fs_blocks_count(&fs_param) / 100.0);
+	if ( reserved_ratio >= 0.0 ) {
+		ext2fs_r_blocks_count_set(&fs_param, reserved_ratio *
+					  ext2fs_blocks_count(&fs_param) / 100.0);
+	} else {
+		const blk64_t r_blk_count = ext2fs_blocks_count(&fs_param) / 20.0;
+		const blk64_t max_r_blk_count = MAX_RSRV_SIZE / blocksize;
+		ext2fs_r_blocks_count_set(&fs_param, (r_blk_count < max_r_blk_count ? r_blk_count : max_r_blk_count));
+	}
 }

 static int should_do_undo(const char *name)

By making a contribution to this project, I certify that:

	(a) The contribution was created in whole or in part by me and I
            have the right to submit it under the open source license
            indicated in the file;

	(d) I understand and agree that this project and the contribution
	    are public and that a record of the contribution (including all
	    personal information I submit with it, including my sign-off) is
	    maintained indefinitely and may be redistributed consistent with
	    this project or the open source license(s) involved.

Signed-off-by: Oren M Elrad <elrad@brandeis.edu>


* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-28 18:02 [PATCH] mke2fs reserved_ratio default value is nonsensical Oren Elrad
@ 2011-03-28 18:06 ` Eric Sandeen
  2011-03-28 18:27   ` Oren Elrad
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Eric Sandeen @ 2011-03-28 18:06 UTC (permalink / raw)
  To: Oren Elrad; +Cc: linux-ext4

On 3/28/11 1:02 PM, Oren Elrad wrote:
> Undesired behavior: mke2fs defaults to reserving 5% of the volume for
> the root user. 5% of a 2TB volume is 100GB. The rationale for root
> reservation (syslogd, etc.) does not require 100GB. As volumes get
> larger, this default makes less and less sense.
>
> Proposal: If the user does not specify their preferred reserved_ratio
> on the command line (-m), use the lesser of 5% or MAX_RSRV_SIZE. I
> propose 10GiB as a sensible maximum default reservation for root.
> 
> Patch: Follows and http://capsid.brandeis.edu/~elrad/e2fsprog.gitdiff
> 
> Tested on the latest git+patch, RHEL5 (2.6.18-194.17.1.el5) with a
> 12TB volume (which would reserve 600GB under the default!):

There's been a bit of debate about this; is the space really saved
for root, or is it to stop the allocator from going off the rails
when the fs nears capacity?  Both, really.

I don't really have a horse in the race, but the complaint has certainly
come up before... it's just important to realize that the space isn't 
only there for root's eventual use.

No other fs that I know of enforces this "don't fill the fs to capacity"
common sense programmatically, though.

-Eric




* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-28 18:06 ` Eric Sandeen
@ 2011-03-28 18:27   ` Oren Elrad
  2011-03-28 18:30     ` Eric Sandeen
  2011-03-29  6:41   ` Rogier Wolff
  2011-03-29 14:05   ` Theodore Tso
  2 siblings, 1 reply; 9+ messages in thread
From: Oren Elrad @ 2011-03-28 18:27 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Mon, Mar 28, 2011 at 2:06 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 3/28/11 1:02 PM, Oren Elrad wrote:
>> Undesired behavior: mke2fs defaults to reserving 5% of the volume for
>> the root user. 5% of a 2TB volume is 100GB. The rationale for root
>> reservation (syslogd, etc.) does not require 100GB. As volumes get
>> larger, this default makes less and less sense.
>>
>> Proposal: If the user does not specify their preferred reserved_ratio
>> on the command line (-m), use the lesser of 5% or MAX_RSRV_SIZE. I
>> propose 10GiB as a sensible maximum default reservation for root.
>>
>> Patch: Follows and http://capsid.brandeis.edu/~elrad/e2fsprog.gitdiff
>>
>> Tested on the latest git+patch, RHEL5 (2.6.18-194.17.1.el5) with a
>> 12TB volume (which would reserve 600GB under the default!):
>
> There's been a bit of debate about this; is the space really saved
> for root, or is it to stop the allocator from going off the rails
> when the fs nears capacity?  Both, really.
>
> I don't really have a horse in the race, but the complaint has certainly
> come up before... it's just important to realize that the space isn't
> only there for root's eventual use.
>
> No other fs that I know of enforces this "don't fill the fs to capacity"
> common sense programmatically, though.
>
> -Eric
>

[SNIP]

Well, in my version you still get some reservation to prevent whatever
woes (fragmentation, allocator slow-down) that accompany a nearly-full
disk. If you think 25 or 50GiB is a more appropriate maximum default,
I have no objections.

Whatever the reason for reservation, more than 100GB is totally
nonsensical IMHO.

Oren Elrad
Dept. of Physics
Brandeis University
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-28 18:27   ` Oren Elrad
@ 2011-03-28 18:30     ` Eric Sandeen
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Sandeen @ 2011-03-28 18:30 UTC (permalink / raw)
  To: Oren Elrad; +Cc: linux-ext4

On 3/28/11 1:27 PM, Oren Elrad wrote:
> On Mon, Mar 28, 2011 at 2:06 PM, Eric Sandeen <sandeen@redhat.com> wrote:
>> On 3/28/11 1:02 PM, Oren Elrad wrote:
>>> Undesired behavior: mke2fs defaults to reserving 5% of the volume for
>>> the root user. 5% of a 2TB volume is 100GB. The rationale for root
>>> reservation (syslogd, etc.) does not require 100GB. As volumes get
>>> larger, this default makes less and less sense.
>>>
>>> Proposal: If the user does not specify their preferred reserved_ratio
>>> on the command line (-m), use the lesser of 5% or MAX_RSRV_SIZE. I
>>> propose 10GiB as a sensible maximum default reservation for root.
>>>
>>> Patch: Follows and http://capsid.brandeis.edu/~elrad/e2fsprog.gitdiff
>>>
>>> Tested on the latest git+patch, RHEL5 (2.6.18-194.17.1.el5) with a
>>> 12TB volume (which would reserve 600GB under the default!):
>>
>> There's been a bit of debate about this; is the space really saved
>> for root, or is it to stop the allocator from going off the rails
>> when the fs nears capacity?  Both, really.
>>
>> I don't really have a horse in the race, but the complaint has certainly
>> come up before... it's just important to realize that the space isn't
>> only there for root's eventual use.
>>
>> No other fs that I know of enforces this "don't fill the fs to capacity"
>> common sense programmatically, though.
>>
>> -Eric
>>
> 
> [SNIP]
> 
> Well, in my version you still get some reservation to prevent whatever
> woes (fragmentation, allocator slow-down) that accompany a nearly-full
> disk. If you think 25 or 50GiB is a more appropriate maximum default,
> I have no objections.

The question is, how much is enough?  (Isn't that always the question?) :)
What constitutes "nearly full"?

> Whatever the reason for reservation, more than 100GB is totally
> nonsensical IMHO.

That depends; 1% sounds small, until the total is a petabyte.

For overall performance, it may well be the % that matters, not the 
absolute number.  It really could probably use more real investigation,
and less hand-waving (of which I am also guilty).  :)
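
The percentage-versus-absolute point can be made concrete with a little
arithmetic. The sketch below is only illustrative: reserved_bytes() is a
hypothetical helper, and the sizes are assumed to be decimal
(1 TB = 10^12 bytes), which may not be how every figure in this thread
was meant.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes set aside when `percent` of a volume of `volume_bytes` is
 * reserved. Dividing first avoids overflow on very large volumes, at
 * the cost of truncating sizes that are not a multiple of 100 bytes.
 */
static uint64_t reserved_bytes(uint64_t volume_bytes, unsigned percent)
{
	assert(percent <= 100);
	return volume_bytes / 100 * percent;
}
```

Under these assumptions, 1% of a petabyte is 10 TB, and 5% of a 2TB
volume is the 100GB quoted at the top of the thread.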

-Eric

> Oren Elrad
> Dept. of Physics
> Brandeis University



* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-28 18:06 ` Eric Sandeen
  2011-03-28 18:27   ` Oren Elrad
@ 2011-03-29  6:41   ` Rogier Wolff
  2011-03-29 14:05   ` Theodore Tso
  2 siblings, 0 replies; 9+ messages in thread
From: Rogier Wolff @ 2011-03-29  6:41 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Oren Elrad, linux-ext4

On Mon, Mar 28, 2011 at 01:06:57PM -0500, Eric Sandeen wrote:
> On 3/28/11 1:02 PM, Oren Elrad wrote:
> > Undesired behavior: mke2fs defaults to reserving 5% of the volume for
> > the root user. 5% of a 2TB volume is 100GB. The rationale for root
> > reservation (syslogd, etc.) does not require 100GB. As volumes get
> > larger, this default makes less and less sense.
> >
> > Proposal: If the user does not specify their preferred reserved_ratio
> > on the command line (-m), use the lesser of 5% or MAX_RSRV_SIZE. I
> > propose 10GiB as a sensible maximum default reservation for root.
> > 
> > Patch: Follows and http://capsid.brandeis.edu/~elrad/e2fsprog.gitdiff
> > 
> > Tested on the latest git+patch, RHEL5 (2.6.18-194.17.1.el5) with a
> > 12TB volume (which would reserve 600GB under the default!):
> 
> There's been a bit of debate about this; is the space really saved
> for root, or is it to stop the allocator from going off the rails
> when the fs nears capacity?  Both, really.
> 
> I don't really have a horse in the race, but the complaint has certainly
> come up before... it's just important to realize that the space isn't 
> only there for root's eventual use.
> 
> No other fs that I know of enforces this "don't fill the fs to capacity"
> common sense programmatically, though.

That could very well be because other filesystems don't suffer as much
from the big penalty that Unix-like filesystems pay when the fs fills
to capacity.

The effect shows, I think, on my work filesystem: there, mkdir now
takes tens of milliseconds instead of microseconds. That sort of
performance degradation should be prevented by a 5 or 10% free-space
buffer.

The idea is that if most block groups are filled to 95%, you'll have a
block group with free space nearby, so searching for free blocks will
always be fast.

On the other extreme: if 95% of the block groups are completely full,
the other 5% of block groups will be completely empty. So you'll have
no trouble finding free space there.
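
The two extremes above can be illustrated with a toy model. The sketch
below is NOT the ext4 allocator; groups_scanned() is a hypothetical
function that simply walks an array of per-group free-block counts,
starting at a goal group, and reports how many groups it had to look at
before finding enough free blocks.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model: starting at group `goal`, scan forward (wrapping around)
 * and return how many groups were examined before one with at least
 * `needed` free blocks was found, or -1 if no group qualifies.
 */
static long groups_scanned(const unsigned *free_blocks, size_t ngroups,
			   size_t goal, unsigned needed)
{
	assert(ngroups > 0 && goal < ngroups);

	for (size_t i = 0; i < ngroups; i++) {
		size_t g = (goal + i) % ngroups;

		if (free_blocks[g] >= needed)
			return (long)(i + 1);
	}
	return -1;
}
```

With every group about 5% free (say 1638 of 32768 blocks), a small
request is satisfied by the goal group itself. With the free space
concentrated in a few empty groups, the scan may pass over full groups
first, but the group it lands on is entirely free.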

Eric has a patch to fix my mkdir troubles (hopefully) which I can't
test because I have production data there. And it's too much to run
backups. I have therefore been forced to choose "use RAID5" as the
data security policy for that data. (which is an improvement over "use
RAID0" which we used until a year ago or so).

It is a conscious choice, because besides wanting to keep the data,
we NEED it to be fast as well (and on the other hand, we can't invest
lots of money).

	Roger. 




-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ


* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-28 18:06 ` Eric Sandeen
  2011-03-28 18:27   ` Oren Elrad
  2011-03-29  6:41   ` Rogier Wolff
@ 2011-03-29 14:05   ` Theodore Tso
  2011-03-29 15:26     ` Eric Sandeen
  2 siblings, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2011-03-29 14:05 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Oren Elrad, linux-ext4


On Mar 28, 2011, at 2:06 PM, Eric Sandeen wrote:
> 
> No other fs that I know of enforces this "don't fill the fs to capacity"
> common sense programmatically, though.

Actually, we (ext2) copied this from the BSD Fast File System (FFS) which used a default MINFREE of 10%.   For ext2 we decided to bring it down to 5%.   FreeBSD currently uses 8% as their default free ratio.

The decrease does seem to be relative to the percentage of free space, from empirical experience, although no one I know of has done a formal analysis of the slowdown.   A lot depends on your workload, how much memory pressure you place on your system, etc.  I've actually started seeing slowdowns starting as early as 80% full when you're trying to allocate large chunks (1M to 8M) at a time, although this isn't something where I've gathered hard data; just what I've noticed from looking at different systems and their performance characteristics.

Fortunately disks are cheap, and lots of people end up buying far more disk space than they need, and so they naturally keep their file systems well under 75-80% full.

If someone wants to add some tuning parameters to mke2fs.conf, so they can set their own personal default free ratios, or even min_reserved_blocks and max_reserved_blocks settings, that's probably a reasonable patch to e2fsprogs that I'd be willing to accept.   I don't think changing the global defaults that we give to users makes sense at this point; I don't think we have enough data (and given what I've seen, the burden of proof should be on those who want to decrease or even eliminate this free ratio --- my fear is that the BSD FFS folks were right, and 8% or 10% is really more appropriate than 5%).
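
A rough sketch of how such bounds might combine with the ratio follows.
Everything here is an assumption for illustration: the names
min_reserved and max_reserved merely echo the min_reserved_blocks and
max_reserved_blocks settings floated above, clamp_reserved_blocks() is a
hypothetical helper, and nothing below reflects how e2fsprogs actually
reads mke2fs.conf.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical: apply a percentage, then clamp the result between a
 * site-configured minimum and maximum number of reserved blocks.
 * The result never exceeds the size of the filesystem.
 */
static uint64_t clamp_reserved_blocks(uint64_t total_blocks,
				      double ratio_percent,
				      uint64_t min_reserved,
				      uint64_t max_reserved)
{
	assert(ratio_percent >= 0.0 && ratio_percent <= 100.0);
	assert(min_reserved <= max_reserved);

	uint64_t r = (uint64_t)(ratio_percent * (double)total_blocks / 100.0);

	if (r < min_reserved)
		r = min_reserved;
	if (r > max_reserved)
		r = max_reserved;
	if (r > total_blocks)
		r = total_blocks;
	return r;
}
```

With a 5% ratio and a maximum of 2621440 blocks (10 GiB at a 4096-byte
block size), the 12TB volume from the start of the thread would reserve
2621440 blocks, while a 1000000-block filesystem would keep its full 5%
(50000 blocks).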

I'd encourage people to run some benchmarks at different levels of fullness, and with different levels of fragmentation.  The FS Impressions tools, which won the 2009 best paper award at FAST (http://www.usenix.org/events/fast09/tech/slides/agrawal.pdf) might be a good place to begin.

-- Ted



* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-29 14:05   ` Theodore Tso
@ 2011-03-29 15:26     ` Eric Sandeen
  2011-03-29 16:00       ` Rogier Wolff
  2011-03-29 16:57       ` Oren Elrad
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Sandeen @ 2011-03-29 15:26 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Oren Elrad, linux-ext4

On 3/29/11 9:05 AM, Theodore Tso wrote:
> 
> On Mar 28, 2011, at 2:06 PM, Eric Sandeen wrote:
>> 
>> No other fs that I know of enforces this "don't fill the fs to
>> capacity" common sense programmatically, though.
> 
> Actually, we (ext2) copied this from the BSD Fast File System (FFS)
> which used a default MINFREE of 10%.   For ext2 we decided to bring
> it down to 5%.   FreeBSD currently uses 8% as their default free
> ratio.

Clearly I don't know enough filesystems, I guess ;)

Should have said "linux filesystem" perhaps.

> The decrease does seem to be relative to the percentage of free
> space, from empirical experience, although no one I know of has done
> a formal analysis of the slowdown.   A lot depends on your workload,
> how much memory pressure you place on your system, etc.  I've
> actually started seeing slowdowns starting as early as 80% full when
> you're trying to allocate large chunks (1M to 8M) at a time, although
> this isn't something where I've gathered hard data; just what I've
> noticed from looking at different systems and their performance
> characteristics.
> 
> Fortunately disks are cheap, and lots of people end up buying far
> more disk space than they need, and so they naturally keep their file
> systems well under 75-80% full.
> 
> If someone wants to add some tuning parameters to mke2fs.conf, so
> they can set their own personal default free ratios, or even
> min_reserved_blocks and max_reserved_blocks settings, that's probably
> a reasonable patch to e2fsprogs that I'd be willing to accept. 

Hm I thought I had sent that, but it was only for the other two
semi-controversial behaviors.  :)

I agree, it seems like at least a decent first step to make it
more site/admin-configurable. 

-Eric



* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-29 15:26     ` Eric Sandeen
@ 2011-03-29 16:00       ` Rogier Wolff
  2011-03-29 16:57       ` Oren Elrad
  1 sibling, 0 replies; 9+ messages in thread
From: Rogier Wolff @ 2011-03-29 16:00 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Tso, Oren Elrad, linux-ext4

On Tue, Mar 29, 2011 at 10:26:29AM -0500, Eric Sandeen wrote:
> I agree, it seems like at least a decent first step to make it
> more site/admin-configurable. 

If the filesystem developers (that's us on this mailing list) decide
that 5% is a good tradeoff between "wasted space" and "performance
when the FS fills up", I think we should leave it as it is. If 
users really want to fill up their fs to the rim, they can do so 
"as root". Or they can tune the fs. 

If you make it configurable, it becomes too easy to create a badly
performing filesystem. 

People who don't understand the reasons behind the "reserved for root"
percentage will then be tempted to change the default for their system
to zero, and later complain about the bad performance they are getting. 

	Roger. 



* Re: [PATCH] mke2fs reserved_ratio default value is nonsensical
  2011-03-29 15:26     ` Eric Sandeen
  2011-03-29 16:00       ` Rogier Wolff
@ 2011-03-29 16:57       ` Oren Elrad
  1 sibling, 0 replies; 9+ messages in thread
From: Oren Elrad @ 2011-03-29 16:57 UTC (permalink / raw)
  Cc: linux-ext4

On Tue, Mar 29, 2011 at 11:26 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>
> On 3/29/11 9:05 AM, Theodore Tso wrote:
> >
> > On Mar 28, 2011, at 2:06 PM, Eric Sandeen wrote:
> >>
> >> No other fs that I know of enforces this "don't fill the fs to
> >> capacity" common sense programmatically, though.
> >
> > Actually, we (ext2) copied this from the BSD Fast File System (FFS)
> > which used a default MINFREE of 10%.   For ext2 we decided to bring
> > it down to 5%.   FreeBSD currently uses 8% as their default free
> > ratio.
>
> Clearly I don't know enough filesystems, I guess ;)
>
> Should have said "linux filesystem" perhaps.
>
> > The decrease does seem to be relative to the percentage of free
> > space, from empirical experience, although no one I know of has done
> > a formal analysis of the slowdown.   A lot depends on your workload,
> > how much memory pressure you place on your system, etc.  I've
> > actually started seeing slowdowns starting as early as 80% full when
> > you're trying to allocate large chunks (1M to 8M) at a time, although
> > this isn't something where I've gathered hard data; just what I've
> > noticed from looking at different systems and their performance
> > characteristics.
> >
> > Fortunately disks are cheap, and lots of people end up buying far
> > more disk space than they need, and so they naturally keep their file
> > systems well under 75-80% full.
> >
> > If someone wants to add some tuning parameters to mke2fs.conf, so
> > they can set their own personal default free ratios, or even
> > min_reserved_blocks and max_reserved_blocks settings, that's probably
> > a reasonable patch to e2fsprogs that I'd be willing to accept.
>
> Hm I thought I had sent that, but it was only for the other two
> semi-controversial behaviors.  :)
>
> I agree, it seems like at least a decent first step to make it
> more site/admin-configurable.
>
> -Eric
>

Thanks all for the informative replies.

I never meant to suggest that 5% is a bad default as a general matter
(but I dropped the ball communicating that); I agree entirely that
non-root users should not be able to degrade performance by filling up
the disk and that 5% is a very good default choice.

My only (admittedly unsupported) claim was that past some **fixed**
value there is little gain from reserving more space independently of
the size of the volume. When I get a chance, I will try to design a
benchmark to test that claim (having a few spare 12TB volumes helps)
but I fear the results will depend heavily on the usage pattern.

Oren Elrad
Dept. of Physics
Brandeis University


Thread overview: 9+ messages
2011-03-28 18:02 [PATCH] mke2fs reserved_ratio default value is nonsensical Oren Elrad
2011-03-28 18:06 ` Eric Sandeen
2011-03-28 18:27   ` Oren Elrad
2011-03-28 18:30     ` Eric Sandeen
2011-03-29  6:41   ` Rogier Wolff
2011-03-29 14:05   ` Theodore Tso
2011-03-29 15:26     ` Eric Sandeen
2011-03-29 16:00       ` Rogier Wolff
2011-03-29 16:57       ` Oren Elrad
