From: Matija Nalis <mnalis-ml@voyager.hr>
To: Viji V Nair <viji@fedoraproject.org>
Cc: linux-ext4@vger.kernel.org, ext3-users@redhat.com
Subject: Re: optimising filesystem for many small files
Date: Sun, 18 Oct 2009 13:41:00 +0200
Message-ID: <20091018114100.GA26721@eagle102.home.lan>
In-Reply-To: <84c89ac10910180231p202fb5f1r2e192e9ac0b51509@mail.gmail.com>
On Sun, Oct 18, 2009 at 03:01:46PM +0530, Viji V Nair wrote:
> The application which we are using are modified versions of mapnik and
> tilecache, these are single threaded so we are running 4 process at a
How does it scale if you reduce the number of processes, especially if you
run just one of them? As this is just a single disk, 4 simultaneous
readers/writers would probably *totally* kill it with seeks.
I suspect it might even run faster with just 1 process than with 4 of
them...
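
A rough way to check this is to write the same number of small files with 1, 2 and 4 processes and compare wall-clock times. The sketch below is only an illustration, not your actual mapnik/tilecache workload: the function names, the 10 KB default size and the fsync() after every file are my assumptions (without the fsync a short test mostly measures the page cache), and it relies on fork, so it assumes a Linux/Unix host.

```python
import multiprocessing
import os
import time


def write_files(directory, worker_id, count, size):
    """Write `count` files of `size` bytes each, fsyncing every one."""
    payload = b"x" * size
    for i in range(count):
        path = os.path.join(directory, f"w{worker_id}_{i}.bin")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # assumption: push data to disk, not just to cache


def timed_run(directory, workers, total_files, size=10 * 1024):
    """Return wall-clock seconds for `workers` processes to write the files.

    Each worker writes total_files // workers files, so pick a total that
    divides evenly by every worker count you compare.
    """
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like host
    per_worker = total_files // workers
    procs = [
        ctx.Process(target=write_files, args=(directory, w, per_worker, size))
        for w in range(workers)
    ]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    elapsed = time.perf_counter() - start
    if any(p.exitcode != 0 for p in procs):
        raise RuntimeError("a worker process failed")
    return elapsed
```

Calling timed_run() with workers set to 1, 2 and 4 against an empty directory on the disk under test, with the same total_files each time, should show whether the extra processes help or hurt.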
> time. We can say only four images are created at a single point of
> time. Some times a single image is taking around 20 sec to create. I
Is that 20 secs just the write time for a precomputed 10k file?
Or does it also include reading, processing and writing?
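
To answer that, it may help to time the phases separately. The following is a minimal sketch, assuming the work can be split into read, process and write steps; render_tile() and its byte-reversing "processing" are hypothetical placeholders, not the real rendering code.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def phase(timings, name):
    """Add the wall-clock seconds spent inside the block to timings[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start


def render_tile(src_path, dst_path, timings):
    """Hypothetical stand-in for one image: read, process, write."""
    with phase(timings, "read"):
        with open(src_path, "rb") as f:
            data = f.read()
    with phase(timings, "process"):
        out = data[::-1]  # placeholder for the actual rendering work
    with phase(timings, "write"):
        with open(dst_path, "wb") as f:
            f.write(out)
            f.flush()
            os.fsync(f.fileno())  # include the time to reach the disk
    return timings
```

Passing the same timings dict across many images accumulates a total per phase, which shows where the 20 seconds actually go.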
> can see lots of system resources are free, memory, processors etc
> (these are 4G, 2 x 5420 XEON)
I do not see how "lots of memory" could be free, especially with such a
large number of inodes. The dentry and inode caches alone should consume it
pretty fast as the number of files grows, not to mention (dirty and
otherwise) buffers...
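
One way to see where the memory really goes is to look at /proc/meminfo rather than only at the "free" figure. The sketch below parses that file and picks out the fields relevant here; the dentry and inode caches are slab allocations, so they show up under Slab/SReclaimable. The function names and the choice of fields are mine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: value}.

    Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])
    return info


def cache_summary(info):
    """Keep only the fields that matter for cache and write-back usage."""
    fields = (
        "MemTotal", "MemFree", "Buffers", "Cached",
        "Dirty", "Writeback", "Slab", "SReclaimable",
    )
    return {name: info[name] for name in fields if name in info}
```

On a Linux host, cache_summary(parse_meminfo(open("/proc/meminfo").read())) gives the current figures; sampling it while the tile generation runs shows how quickly the caches fill.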
You may want to tune the following sysctls to allow more stuff to remain in
the write-back cache (but then again, you will probably need more memory):
vm.vfs_cache_pressure
vm.dirty_writeback_centisecs
vm.dirty_expire_centisecs
vm.dirty_background_ratio
vm.dirty_ratio
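
Before changing anything, it is worth recording the current values. This small sketch reads the five sysctls listed above from /proc/sys/vm; the function name and the base parameter are my own, and it deliberately suggests no values, since the right ones depend on the RAM and workload.

```python
import os

VM_SYSCTLS = (
    "vfs_cache_pressure",
    "dirty_writeback_centisecs",
    "dirty_expire_centisecs",
    "dirty_background_ratio",
    "dirty_ratio",
)


def read_vm_sysctls(base="/proc/sys/vm"):
    """Return {"vm.<name>": int or None} for the sysctls listed above."""
    values = {}
    for name in VM_SYSCTLS:
        try:
            with open(os.path.join(base, name)) as f:
                values["vm." + name] = int(f.read().strip())
        except (OSError, ValueError):
            values["vm." + name] = None  # missing or unreadable on this kernel
    return values
```

As a general direction only: a lower vm.vfs_cache_pressure makes the kernel prefer to keep dentries and inodes cached, and higher dirty ratios and expiry times let more dirty data sit in memory before it is written back.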
> The file system is created with "-i 1024 -b 1024" for larger inode
> number, 50% of the total images are less than 10KB. I have disabled
> access time and given a large value to the commit also. Do you have
> any other recommendation of the file system creation?
For ext3, a larger journal on an external journal device (if that is an
option) should probably help, as it would reduce some of the seeks that are
most probably slowing this down immensely.
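
For illustration, the external-journal setup takes two mke2fs invocations: one to format the journal device, one to create the filesystem attached to it, both with the same block size. The sketch below only builds the command lines and runs nothing; the function name and the device names in the usage note are hypothetical placeholders, and the -i/-b defaults simply mirror the options you quoted.

```python
def external_journal_commands(journal_dev, data_dev,
                              block_size=1024, bytes_per_inode=1024):
    """Build (but do not run) mke2fs commands for ext3 with an external journal.

    The journal device must be created first, with the same block size
    as the filesystem that will use it.
    """
    make_journal = [
        "mke2fs", "-O", "journal_dev", "-b", str(block_size), journal_dev,
    ]
    make_fs = [
        "mke2fs", "-j", "-J", f"device={journal_dev}",
        "-b", str(block_size), "-i", str(bytes_per_inode), data_dev,
    ]
    return [make_journal, make_fs]
```

For example, external_journal_commands("/dev/sdX1", "/dev/sdY1") returns the two argument lists, which can be printed with " ".join(cmd) and reviewed before being run by hand. Both commands destroy whatever is on the named devices.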
If you can modify the hardware setup, RAID10 (better with many smaller disks
than with fewer bigger ones) should help *very* much. Flash-disk-thingies of
appropriate size are an even better option (as the seek issues are a few
orders of magnitude smaller problem). Also probably more RAM (unless your
full dataset is much smaller than 2 GB, which I doubt).
On the other hand, have you tried testing some other filesystems?
I've had much better performance with lots of small files on XFS (but that
was on a big RAID5, so YMMV), for example.
--
Opinions above are GNU-copylefted.