linux-ext4.vger.kernel.org archive mirror
From: Theodore Tso <tytso@mit.edu>
To: Viji V Nair <viji@fedoraproject.org>
Cc: Eric Sandeen <sandeen@redhat.com>,
	ext3-users@redhat.com, linux-ext4@vger.kernel.org
Subject: Re: optimising filesystem for many small files
Date: Sat, 17 Oct 2009 18:26:19 -0400
Message-ID: <20091017222619.GA10074@mit.edu>
In-Reply-To: <84c89ac10910171056i773dfb93wc2e917a086dd8ef0@mail.gmail.com>

On Sat, Oct 17, 2009 at 11:26:04PM +0530, Viji V Nair wrote:
> these files are not in a single directory, this is a pyramid
> structure. There are total 15 pyramids and coming down from top to
> bottom the sub directories and files  are multiplied by a factor of 4.
> 
> The IO is scattered all over!!!! and this is a single disk file system.
> 
> Since the python application is creating files, it is creating
> multiple files to multiple sub directories at a time.

What is the application trying to do, at a high level?  Sometimes it's
not possible to optimize a filesystem against a badly designed
application.  :-(

It sounds like it is generating files distributed across subdirectories
in a completely random order.  How are the files going to be read
afterwards?  In the order they were created, or in some other order
different from the order in which they were created?

With a sufficiently bad access pattern, there may not be a lot you
can do, other than (a) throw hardware at the problem, or (b) fix or
redesign the application to be more intelligent (if possible).
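
As a rough illustration of what option (b) might look like, here is a
minimal sketch that assumes the application can buffer a batch of
pending files before writing them. It groups the batch by parent
directory and writes one directory at a time, rather than scattering
creates across the whole tree. The function name and the
(relative_path, data) input format are hypothetical and not taken from
the application in question; whether this helps in practice depends on
how the files are read back later.

```python
import os
from collections import defaultdict


def write_grouped_by_directory(root, files):
    """Write (relative_path, data) pairs one directory at a time.

    `files` may arrive in any order. Returns the relative paths in the
    order they were actually written.
    """
    # Bucket the pending files by their parent directory.
    by_dir = defaultdict(list)
    for rel_path, data in files:
        by_dir[os.path.dirname(rel_path)].append((rel_path, data))

    written = []
    # Visit directories in sorted order so sibling directories are
    # handled together, then write each directory's files in name order.
    for directory in sorted(by_dir):
        os.makedirs(os.path.join(root, directory), exist_ok=True)
        for rel_path, data in sorted(by_dir[directory], key=lambda item: item[0]):
            with open(os.path.join(root, rel_path), "wb") as f:
                f.write(data)
            written.append(rel_path)
    return written
```

The idea is only that files created together in the same directory
tend to be allocated near each other, so batching creates per
directory may reduce seeking on a single disk.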

	     		       	    		    - Ted
