linux-fsdevel.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@clusterfs.com>
To: coly <colyli@gmail.com>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [RFC 4/5] inode reservation v0.1 (benchmark result)
Date: Thu, 24 May 2007 14:21:44 -0600
Message-ID: <20070524202144.GS5181@schatzie.adilger.int>
In-Reply-To: <1179943697.4179.55.camel@coly-t43.site>

On May 24, 2007  02:08 +0800, coly wrote:
> Because of the poor design of the magic inode and its on-disk layout,
> there is no performance advantage when 30 files are created alternately
> in each directory. When 50 files are created alternately in each
> directory, the patched ext4 takes twice as long to remove all the files
> and directories.

I don't think the use of magic inodes is the right approach.  One possibility
to avoid changing the on-disk format at all is to only do the reservation in
memory, scaling the reservation with the size of the directory.
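
To make "scaling the reservation with the size of the directory" concrete,
here is a minimal sketch. The function name, the power-of-two rounding and
both limits are assumptions chosen for illustration; none of them come from
the patch or from ext4.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables for this sketch only. */
#define RESV_MIN_INODES 8u
#define RESV_MAX_INODES 1024u

/*
 * Pick an in-memory reservation window size for a directory that
 * currently holds nr_entries entries: the smallest power of two that
 * covers the entry count, clamped to [RESV_MIN_INODES, RESV_MAX_INODES].
 * Nothing here touches the on-disk format.
 */
static uint32_t resv_window_size(uint32_t nr_entries)
{
    uint32_t size = RESV_MIN_INODES;

    assert(RESV_MIN_INODES <= RESV_MAX_INODES);
    while (size < nr_entries && size < RESV_MAX_INODES)
        size *= 2;
    return size;
}
```

With these assumed limits a directory of 100 entries would get a window of
128 inodes, and very large directories are capped at 1024.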

The only issue that arises is how to regenerate the same reservation
after a remount.  This might be possible to do by looking into the leaf
block at create time to see which inode numbers are already in use for
that leaf and checking whether there are free inodes in each group.

One way to get the "best" mapping might be to check groups in order of
decreasing number of that leaf's inodes in each group and, once a suitable
group has been found, run a few name->hash->inode number mappings to get
the old mapping back.  Once this leaf->group mapping has been established,
it can be reused for a given leaf block until that window is full.
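
The group selection could look roughly like the sketch below. It assumes
the inode numbers of the leaf have already been collected into an array and
that a per-group free inode count is available; pick_group_for_leaf(),
MAX_GROUPS and the parameter names are hypothetical. The only ext4 fact
relied on is that inode numbers start at 1 and map to a group as
(ino - 1) / inodes_per_group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GROUPS 64u  /* hypothetical cap to keep the sketch simple */

/* Inode numbers start at 1, so inode 1 lives in group 0. */
static uint32_t ino_to_group(uint32_t ino, uint32_t inodes_per_group)
{
    assert(ino >= 1 && inodes_per_group > 0);
    return (ino - 1) / inodes_per_group;
}

/*
 * Return the group that already holds the most inodes of this leaf and
 * still has free inodes, or -1 if no such group exists.  Taking the
 * maximum over the eligible groups is the same as walking the groups in
 * order of decreasing count and stopping at the first suitable one.
 * Ties go to the lowest group number.
 */
static int pick_group_for_leaf(const uint32_t *leaf_inos, size_t nr_inos,
                               const uint32_t *free_inodes,
                               uint32_t nr_groups,
                               uint32_t inodes_per_group)
{
    uint32_t count[MAX_GROUPS] = {0};
    int best = -1;

    assert(nr_groups <= MAX_GROUPS);
    for (size_t i = 0; i < nr_inos; i++) {
        uint32_t g = ino_to_group(leaf_inos[i], inodes_per_group);

        if (g < nr_groups)
            count[g]++;
    }
    for (uint32_t g = 0; g < nr_groups; g++) {
        if (count[g] == 0 || free_inodes[g] == 0)
            continue;   /* leaf has nothing here, or the group is full */
        if (best < 0 || count[g] > count[best])
            best = (int)g;
    }
    return best;
}
```

A return value of -1 would mean falling back to whatever the allocator
does for a brand new window.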

Since you need to scan all of the dir entries in a leaf (hash) block at
insert time to look for duplicate names, and the inode numbers are in the
dir entries, this shouldn't introduce any additional disk IO.
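
As an illustration of why no extra IO is needed, the sketch below does the
duplicate-name check and the inode number collection in a single pass over
entries that are already in memory. struct sketch_dirent is a simplified
stand-in and not the ext4 on-disk directory entry; scan_leaf() and its
parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a directory entry (not the on-disk layout). */
struct sketch_dirent {
    uint32_t ino;       /* 0 marks an unused entry */
    const char *name;
};

/*
 * One pass over a leaf block that is already in memory:
 *  - returns 1 if `name` is already present, 0 otherwise;
 *  - copies the inode numbers of in-use entries into inos_out
 *    (at most max_out of them) and stores how many in *nr_out.
 */
static int scan_leaf(const struct sketch_dirent *ents, size_t nr_ents,
                     const char *name,
                     uint32_t *inos_out, size_t max_out, size_t *nr_out)
{
    int dup = 0;
    size_t n = 0;

    assert(name != NULL && nr_out != NULL);
    for (size_t i = 0; i < nr_ents; i++) {
        if (ents[i].ino == 0)
            continue;               /* skip unused entries */
        if (strcmp(ents[i].name, name) == 0)
            dup = 1;                /* the duplicate check we do anyway */
        if (n < max_out)
            inos_out[n++] = ents[i].ino;
    }
    *nr_out = n;
    return dup;
}
```

The collected inode numbers are what a group-selection step would consume.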

Also, regardless of what the mapping turns out to be, the goal is to give
names with similar hashes nearby inode numbers, and this heuristic works
relatively well for that.  Once the given leaf block's inode range is full,
new inodes can be allocated from a new window, as is done for a newly
created directory.
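
The "window is full, move to a new window" step might be sketched as
follows. struct leaf_window and window_alloc() are hypothetical, and the
bump counter behind next_free is only a placeholder for the real search
for a free inode range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-leaf reservation window, kept in memory only. */
struct leaf_window {
    uint32_t start;  /* first inode number in the window */
    uint32_t len;    /* number of inodes in the window */
    uint32_t used;   /* inodes already handed out */
};

/*
 * Hand out the next inode number for a leaf block.  When the current
 * window is exhausted, switch to a fresh window of the same length.
 * *next_free is a simple bump counter standing in for the allocator's
 * search for a free range.
 */
static uint32_t window_alloc(struct leaf_window *w, uint32_t *next_free)
{
    assert(w->len > 0 && w->used <= w->len);
    if (w->used == w->len) {
        w->start = *next_free;
        *next_free += w->len;
        w->used = 0;
    }
    return w->start + w->used++;
}
```

Names that land in the same leaf block keep getting consecutive inode
numbers until the window runs out, which is the locality the heuristic is
after.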

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.