public inbox for linux-kernel@vger.kernel.org
From: Valerie Henson <val_henson@linux.intel.com>
To: Theodore Tso <tytso@mit.edu>, David Chinner <dgc@sgi.com>,
	"Cabot, Mason B" <mason.b.cabot@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: Ext3 vs NTFS performance
Date: Fri, 4 May 2007 12:40:42 -0700
Message-ID: <20070504194036.GE3869@nifty>
In-Reply-To: <20070504122307.GA25339@thunk.org>

On Fri, May 04, 2007 at 08:23:08AM -0400, Theodore Tso wrote:
> On Thu, May 03, 2007 at 02:14:52PM -0700, Valerie Henson wrote:
> 
> > I'd really like to see a generic VFS-level detection of
> > read()/write()/creat()/mkdir()/etc. patterns which could detect things
> > like "Oh, this file is likely to be deleted immediately, wait and see
> > if it goes away and don't bother sending it on to the FS immediately"
> > or "Looks like this file will grow pretty big, let's go pre-allocate
> > some space for it."  This is probably best done as a set of helper
> > functions in the usual way.
> 
> What patterns do you think means things like "this file is likely to
> be deleted immediate", or "this file will grow pretty big"?  I don't
> think there are any that would be generally valid.

I wouldn't have guessed that either, but it turns out there are:

http://www.eecs.harvard.edu/~ellard/pubs/able-usenix04.pdf

    We present evidence that attributes that are known to
    the file system when a file is created, such as its name,
    permission mode, and owner, are often strongly related
    to future properties of the file such as its ultimate size,
    lifespan, and access pattern. More importantly, we show
    that we can exploit these relationships to automatically
    generate predictive models for these properties, and that
    these predictions are sufficiently accurate to enable
    optimizations.

For example, lock files have predictable names and permissions, and
live for a fraction of a second in most cases.  Files that are appended
to a few hundred bytes at a time are probably log files and will
continue to grow in this manner.  Some of their predictions were 98%
accurate!
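
To make the two examples above concrete, here is a rough userspace
sketch in Python, not kernel code and not the method from the paper.
Every name in it (FileHints, record_append, advise), the ".lock"
suffix rule, and the thresholds are assumptions made up for
illustration; the paper builds its models automatically from traces
rather than from hand-written rules like these.

```python
from dataclasses import dataclass, field


@dataclass
class FileHints:
    """Hypothetical per-file state that a shared helper might keep."""
    name: str
    write_sizes: list = field(default_factory=list)  # bytes per append seen so far


def record_append(hints, nbytes):
    """Remember the size of one append to the file."""
    hints.write_sizes.append(nbytes)


def looks_short_lived(hints):
    # Assumed rule: a lock-style name suggests the file will be deleted soon.
    return hints.name.endswith(".lock")


def looks_like_log(hints, min_writes=4, max_append=1024):
    # Assumed rule: several small appends in a row suggest a growing log file.
    sizes = hints.write_sizes
    return len(sizes) >= min_writes and all(0 < n <= max_append for n in sizes)


def advise(hints):
    """Return a coarse hint that a file system could choose to act on."""
    if looks_short_lived(hints):
        return "delay-writeback"   # wait and see whether the file goes away
    if looks_like_log(hints):
        return "preallocate"       # reserve space ahead of further appends
    return "default"
```

A caller would feed in the file name at create time, call
record_append() on each append, and ask advise() before deciding
whether to delay writeback or pre-allocate.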

In any case, any predictive algorithms we already run at the file
system level can be done at the VFS level and shared between file
systems, instead of being reimplemented over and over again.  Just
food for thought.

-VAL

