Re: The flex_bg inode allocator


From: Theodore Tso <tytso@mit.edu>
To: Xiang Wang <xiangw@google.com>
Cc: ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: The flex_bg inode allocator
Date: Sat, 18 Jul 2009 08:36:08 -0400
Message-ID: <20090718123608.GD12744@mit.edu>
In-Reply-To: <d5ca277e0907172038x3dce5f8dx11d6ec4f7b0f3c52@mail.gmail.com>

On Fri, Jul 17, 2009 at 08:38:18PM -0700, Xiang Wang wrote:
> 
> Recently I've found out that the flex_bg inode allocator (the
> find_group_flex function called by ext4_new_inode) is actually not in
> use unless we specify the "oldalloc" option on mount as well as
> setting the flex_bg size to be > 1.
> Currently, the default option on mount is "orlov".
> 

The "flex_bg inode allocator" is actually the older allocator.  The
newer allocator is still flex_bg based, but it uses the orlov
algorithms as well.  It has resulted in significant fsck speedups.
See:

http://thunk.org/tytso/blog/2009/02/26/fast-ext4-fsck-times-revisited/
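
To make the selection rule discussed above concrete, here is a minimal
sketch in Python.  It models only what this thread states: find_group_flex
is reached only with "oldalloc" and a flex_bg size > 1, and "orlov" is the
default.  The function name and return strings are hypothetical, and the
branch for "oldalloc" without flex groups is an assumption, not something
the thread confirms.

```python
def select_inode_allocator(oldalloc: bool, flex_bg_size: int) -> str:
    """Toy model of which inode allocator path is taken.

    Illustrative only: this is not kernel code, and the returned labels
    are made up for this sketch.
    """
    if oldalloc and flex_bg_size > 1:
        # The only combination that reaches find_group_flex, per the report.
        return "find_group_flex"
    if oldalloc:
        # Assumption: with no flex groups, oldalloc falls back to an
        # older per-block-group path.
        return "legacy non-flex allocator"
    # Default mount option: orlov-based, and still flex_bg aware.
    return "orlov"
```

With the default mount options the model never returns find_group_flex,
whatever the flex_bg size is, which matches the observation in the
original question.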

> 1) What's the current status of the flex_bg inode allocator? Will it
> be set as a default soon?

It will probably be removed soon, actually...

> 2) If not, are there any particular reasons that it is held back? Is
> it all because of the worse performance numbers shown in the two
> metrics ("read tree total" and "read compiled tree total") in
> Compilebench?

I kept it in case there were performance regressions with the orlov
allocator.  At least in theory for some workloads, the fact that we
are more aggressively spreading inodes from different directories into
different flex_bg's could potentially degrade performance; the reason
why we needed to do this, though, was to make the filesystem more
resistant to aging.
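
The trade-off described above can be illustrated with a toy placement
model.  This is a sketch under stated assumptions, not the kernel's
algorithm: each directory is assumed to reserve a fixed number of inodes,
"packing" takes the first flex group with room, and "spreading" takes the
least-used flex group.  All names and parameters are hypothetical.

```python
def place_directories(num_dirs, num_flex_groups, capacity,
                      inodes_per_dir, spread):
    """Return the flex group chosen for each new directory.

    spread=False: first flex group with enough free inodes (packing).
    spread=True:  flex group with the fewest inodes in use (spreading).
    """
    used = [0] * num_flex_groups
    placement = []
    for _ in range(num_dirs):
        candidates = [g for g in range(num_flex_groups)
                      if used[g] + inodes_per_dir <= capacity]
        if not candidates:
            raise RuntimeError("no flex group has enough free inodes")
        if spread:
            # min() keeps the lowest-numbered group when usage is tied.
            group = min(candidates, key=lambda g: used[g])
        else:
            group = candidates[0]
        used[group] += inodes_per_dir
        placement.append(group)
    return placement
```

In this model, packing four small directories puts them all in flex group
0, so reading the whole tree touches one region of the disk.  Spreading
puts each in its own flex group, which costs more seeking on a fresh
filesystem but leaves each directory room to grow in place as the
filesystem ages.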

> 3) Are there any ongoing efforts and/or future plans to improve it? Or
> is there any work in similar directions?

Nothing at the moment.  I could imagine in the future wanting to play
with algorithms that are based on the filename (i.e., separating .o
files from .c files in build directories, etc. --- there's a Usenix
paper that talks about other ideas along these lines), but in the
short-term, improving the block allocator, especially in the face of
heavy filesystem free space fragmentation, is probably the much higher
priority.  Nothing is immediately planned though.
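
As a rough illustration of the filename-based idea, the sketch below
derives a placement hint from a file's extension.  Nothing like this is
described as existing in ext4; the extension set, the function name, and
the "next group over" rule are all assumptions made for the example.

```python
import os

# Hypothetical class of short-lived build outputs.  The mail only
# mentions separating .o files from .c files.
OBJECT_EXTS = {".o"}


def group_hint(filename, dir_group, num_groups):
    """Return a suggested group for a new file in a directory.

    Build outputs are steered to the neighbouring group (wrapping
    around); everything else stays in the directory's own group.
    """
    ext = os.path.splitext(filename)[1]
    if ext in OBJECT_EXTS:
        return (dir_group + 1) % num_groups
    return dir_group
```

The point of such a split would be that repeatedly rewritten object files
do not fragment the area holding the long-lived sources.  Whether that
helps in practice is exactly the kind of question that would need
benchmarks.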

If you're interested in trying to play with things along these lines,
I'd suggest starting with some set of benchmarks that test changes in
the inode and block allocators, both for pristine filesystems and
filesystems that have undergone significant aging.
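
A benchmark of that kind could start from something like the following
sketch: age a directory with random creates and deletes, then time a full
read of the tree.  The helper names, the 40% delete ratio, and the file
sizes are arbitrary assumptions.  A real run would point root at a scratch
ext4 mount and drop the page cache before the read, otherwise cached data
dominates the timing.

```python
import os
import random
import tempfile
import time


def age_directory(root, rounds, seed=0, max_size=4096):
    """Randomly create and delete files under root; return surviving paths."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    live = []
    for i in range(rounds):
        if live and rng.random() < 0.4:
            # Delete a random survivor to punch holes in the layout.
            os.remove(live.pop(rng.randrange(len(live))))
        else:
            path = os.path.join(root, f"f{i:06d}")
            with open(path, "wb") as fh:
                fh.write(b"x" * rng.randint(1, max_size))
            live.append(path)
    return live


def time_tree_read(root):
    """Read every file under root; return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            with open(os.path.join(dirpath, name), "rb") as fh:
                total += len(fh.read())
    return total, time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        survivors = age_directory(root, rounds=1000)
        nbytes, secs = time_tree_read(root)
        print(f"{len(survivors)} files, {nbytes} bytes read in {secs:.4f}s")
```

Running the same script against a freshly made filesystem and against one
that has been through many more rounds gives the pristine versus aged
comparison suggested above.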

Regards,

						- Ted

