public inbox for linux-ext4@vger.kernel.org
From: "Lukáš Czerner" <lczerner@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Radek Pazdera <rpazdera@redhat.com>,
	linux-ext4@vger.kernel.org, kasparek@fit.vutbr.cz
Subject: Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index
Date: Mon, 17 Jun 2013 10:58:35 +0200 (CEST)
Message-ID: <alpine.LFD.2.00.1306171057050.3270@localhost.localdomain>
In-Reply-To: <20130616005533.GF29338@dastard>

On Sun, 16 Jun 2013, Dave Chinner wrote:

> Date: Sun, 16 Jun 2013 10:55:33 +1000
> From: Dave Chinner <david@fromorbit.com>
> To: Radek Pazdera <rpazdera@redhat.com>
> Cc: linux-ext4@vger.kernel.org, lczerner@redhat.com, kasparek@fit.vutbr.cz
> Subject: Re: [RFC 0/9] ext4: An Auxiliary Tree for the Directory Index
> 
> On Sat, May 04, 2013 at 11:28:33PM +0200, Radek Pazdera wrote:
> > Hello everyone,
> > 
> > I am a university student from Brno /CZE/. I decided to try to optimise
> > the readdir/stat scenario in ext4 as my final school project. I
> > posted some test results a few months ago [1].
> > 
> > I tried to implement an additional tree for ext4's directory index
> > that would be sorted by inode numbers. The tree then would be used
> > by ext4_readdir(), which should lead to a substantial increase in the
> > performance of operations that manipulate a whole directory at once.
> > 
> > The performance increase should be visible especially with large
> > directories or in case of low memory or cache pressure.
> > 
> > This patch series is what I've got so far. I must say, I originally
> > thought it would be *much* simpler :).
> ....
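
For readers unfamiliar with the readdir/stat scenario mentioned above, here
is a minimal userspace sketch of the access pattern: list a directory, sort
the entries by inode number, then stat them in that order. It only
illustrates why inode ordering matters; it is not the itree patch set and
says nothing about how the patches work inside ext4. The function name
stat_in_inode_order is hypothetical.

```python
import os


def stat_in_inode_order(path):
    """List a directory, then stat its entries sorted by inode number.

    Userspace illustration only (hypothetical helper, not part of the
    patches): it mimics the order an inode-sorted index would return.
    """
    # os.scandir() exposes the inode number stored in the directory
    # entry itself, so sorting needs no extra stat() call per file.
    with os.scandir(path) as it:
        entries = sorted(it, key=lambda e: e.inode())
    # Stat in inode order, so inode-table reads tend to be sequential
    # rather than scattered in hash order.
    return [(e.name, os.stat(e.path, follow_symlinks=False)) for e in entries]
```

Calling stat_in_inode_order("/some/large/dir") returns (name, stat_result)
pairs in ascending inode order. Tools that sort in userspace pay for the
sort themselves, which is the cost an on-disk inode-sorted tree is meant to
avoid.
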
> > BENCHMARKS
> > ==========
> > 
> > I did some benchmarks and compared the performance with ext4/htree,
> > XFS, and btrfs with up to 5 000 000 files in a single directory. Not
> > all of them are done though (they run for days).
> 
> Just a note that for users that have this sort of workload on XFS,
> it is generally recommended that they increase the directory block
> size to 8-16k (from the default of 4k). The saddle point where 8-16k
> directory blocks tend to perform better than 4k directory blocks is
> around the 2-3 million file point....
> 
> Further, if you are doing random operations on such directories,
> then increasing it to the maximum of 64k is recommended. This
> greatly reduces the IO overhead of directory manipulations by making
> the trees wider and shallower. i.e. we recommend trading off CPU
> and memory for lower IO overhead and better layout on disk as it's
> layout and IO that are the performance limiting factors for large
> directories. :)
> 
> > Full results are available here:
> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/ext4-5M/
> 
> Can you publish the scripts you used so we can try to reproduce
> your results?

Hi Dave,

IIRC the tests used to generate the results can be found here:

https://github.com/astro-/dir-index-test

However, I am not entirely sure whether the GitHub repository is kept
up to date. Radek, can you confirm?

-Lukas

> 
> > I also did some tests on an aged file system (I used a simple 0.8
> > chance to create, 0.2 chance to delete a file) where the results of
> > ext4 with itree are much better even than xfs, which gets fragmented:
> > 
> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/cp.png
> >     http://www.stud.fit.vutbr.cz/~xpazde00/soubory/5M-dirty/readdir-stat.png
> 
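
The aging scheme quoted above (0.8 chance to create, 0.2 chance to delete a
file) can be sketched roughly as follows. This is an assumption about what
such a workload looks like, not the script that produced the linked graphs.
The function name age_directory, the file naming, and the fixed seed are all
hypothetical.

```python
import os
import random


def age_directory(path, operations, p_create=0.8, seed=0):
    """Apply a random create/delete mix to `path`; return the live names.

    Hypothetical sketch: each step creates a new empty file with
    probability p_create, otherwise deletes one existing file.
    """
    rng = random.Random(seed)  # fixed seed keeps runs repeatable
    live = []                  # names of files that currently exist
    counter = 0
    for _ in range(operations):
        # Always create when the directory is empty.
        if not live or rng.random() < p_create:
            name = f"file-{counter:08d}"
            counter += 1
            with open(os.path.join(path, name), "w"):
                pass
            live.append(name)
        else:
            # Delete a uniformly chosen existing file (swap-remove).
            i = rng.randrange(len(live))
            live[i], live[-1] = live[-1], live[i]
            os.unlink(os.path.join(path, live.pop()))
    return live
```

With the default 0.8/0.2 split the directory grows by roughly 0.6 files per
operation on average, while the deletions leave holes behind, which is what
makes the file system "aged" for the subsequent readdir/stat and cp runs.
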
> This XFS result is of interest to me here - it shouldn't degrade
> like that, so having the script to be able to reproduce it locally
> would be helpful to me. Indeed, I posted a simple patch yesterday
> that significantly improves XFS performance on a similar small file
> create workload:
> 
> http://marc.info/?l=linux-fsdevel&m=137126465712701&w=2
> 
> That writeback plugging change should benefit ext4 as well in these
> workloads....
> 
> Cheers,
> 
> Dave.
> 
