linux-btrfs.vger.kernel.org archive mirror
From: Chris Mason <chris.mason@oracle.com>
To: Miao Xie <miaox@cn.fujitsu.com>
Cc: Yan Zheng <zheng.yan@oracle.com>,
	Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Poor creat/delete files performance
Date: Thu, 26 Aug 2010 19:15:39 -0400
Message-ID: <20100826231539.GB14190@think>
In-Reply-To: <4C763CFB.6010600@cn.fujitsu.com>

On Thu, Aug 26, 2010 at 06:07:55PM +0800, Miao Xie wrote:
> On Wed, 18 Aug 2010 20:57:43 -0400, Chris Mason wrote:
> >Since the files are empty, and we aren't doing enough files to trigger
> >IO, it is really benchmarking the cost of the btree insertions/removals
> >in comparison with ext4.  I do expect this to be higher because btrfs is
> >indexing the directories twice (once by name and once by sequence number
> >for faster backups).
> >
> >On my machine:
> >
> >Btrfs defaults:
> >
> >Create files:
> >	Total files: 50000
> >	Total time: 0.916680
> >	Average time: 0.000018
> >Delete files:
> >	Total files: 50000
> >	Total time: 1.329892
> >	Average time: 0.000027
> >
> >Ext4:
> >
> >creat_unlink 50000
> >Create files:
> >	Total files: 50000
> >	Total time: 0.718190
> >	Average time: 0.000014
> >Delete files:
> >	Total files: 50000
> >	Total time: 0.308815
> >	Average time: 0.000006
> >
> >We're definitely slower than ext4, but as Ric's benchmarks show things
> >tend to tilt in our favor once IO is actually done.
> >
> >There are two big things that would help fix this performance gap:
> >Switching the extent buffer rbtree into a radix tree (esp a lockless
> >radix tree), and delaying insertion of the inode so that we can do more
> >btree operations in bulk.
> >
> >The radix tree is a much easier and more contained project.
> 
> The type of the radix tree's key is "unsigned long", but the type of the
> extent buffer's key is "u64". That is, we can't use a radix tree in place of
> the rbtree on 32-bit boxes, so we can't switch the extent buffer rbtree
> to a radix tree.

Right, but the key is just the byte number offset from 0.  The extent
buffers are backed by pages, and the pages are allocated off the
metadata inode's address space, which is backed by a radix tree.

You can try using (byte offset >> PAGE_CACHE_SHIFT) as the key.  The
problem you might hit is that the radix tree is tuned pretty hard for the
page cache now.
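
A minimal userspace sketch of the index conversion suggested above, not
kernel code. It assumes 4K pages (a PAGE_CACHE_SHIFT of 12) and models a
32-bit "unsigned long" with a plain upper bound; the helper name
eb_radix_index is hypothetical.

```python
PAGE_CACHE_SHIFT = 12          # assumption: 4K pages
ULONG_MAX_32 = (1 << 32) - 1   # largest "unsigned long" on a 32-bit box
U64_MAX = (1 << 64) - 1


def eb_radix_index(bytenr):
    """Hypothetical helper: map a u64 byte offset to a 32-bit radix tree key.

    Returns the page index, or None if the index would not fit in 32 bits.
    """
    if not 0 <= bytenr <= U64_MAX:
        raise ValueError("bytenr must fit in a u64")
    index = bytenr >> PAGE_CACHE_SHIFT
    return index if index <= ULONG_MAX_32 else None


# Largest byte offset whose page index still fits in a 32-bit key.
max_bytenr = ((ULONG_MAX_32 + 1) << PAGE_CACHE_SHIFT) - 1
```

Under these assumptions the shifted key only overflows for byte offsets of
16 TiB and above, which is the same bound a 32-bit page cache index has
with 4K pages.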

Another option is to attach the extent buffers to page->private, and use
the page cache's radix tree (remove the rbtree completely).  For
blocksize > pagesize, we could put only the first page of each block
into the page cache, and just tie the rest of the pages off the extent buffer.
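
A rough userspace model of that second option, as a sketch only. A dict
stands in for the metadata inode's radix tree, the `private` attribute
stands in for page->private, and every class and function name here is
hypothetical. It assumes page-aligned blocks with blocksize >= pagesize.

```python
PAGE_SIZE = 4096  # assumption: 4K pages


class Page:
    def __init__(self, index):
        self.index = index
        self.private = None  # stands in for page->private


class ExtentBuffer:
    def __init__(self, start, length):
        self.start = start    # byte offset of the block
        self.length = length  # block size in bytes
        self.pages = []       # all pages backing this block


# Stand-in for the page cache radix tree, keyed by page index.
page_cache = {}


def alloc_extent_buffer(start, blocksize):
    """Create a buffer and publish only its first page in the page cache."""
    if start % PAGE_SIZE or blocksize % PAGE_SIZE or blocksize < PAGE_SIZE:
        raise ValueError("sketch assumes page-aligned blocks >= PAGE_SIZE")
    eb = ExtentBuffer(start, blocksize)
    first = start // PAGE_SIZE
    for i in range(blocksize // PAGE_SIZE):
        eb.pages.append(Page(first + i))
    head = eb.pages[0]
    head.private = eb          # the first page points back at the buffer
    page_cache[first] = head   # the remaining pages hang off eb.pages only
    return eb


def find_extent_buffer(start):
    """Look a buffer up through the page cache, with no separate rbtree."""
    page = page_cache.get(start // PAGE_SIZE)
    if page is None or page.private is None:
        return None
    eb = page.private
    return eb if eb.start == start else None
```

With a 16K block, for example, only one of the four pages gets a page
cache entry, and a lookup by the block's starting byte offset goes through
that page to reach the buffer.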

But, if you get the 4K metadata block size part working, I can cram in
the larger block sizes pretty easily now.

-chris



Thread overview: 14+ messages
2010-08-18 10:12 Poor creat/delete files performance Miao Xie
2010-08-18 10:49 ` Morten P.D. Stevens
2010-08-18 14:25   ` Chris Mason
2010-08-18 15:28     ` Morten P.D. Stevens
2010-08-18 15:39       ` Chris Mason
2010-08-18 16:26         ` Morten P.D. Stevens
2010-08-18 10:49 ` Leonidas Spyropoulos
2010-08-18 11:00   ` Miao Xie
2010-08-18 12:09 ` Chris Mason
2010-08-19  0:35   ` Miao Xie
2010-08-19  0:57     ` Chris Mason
2010-08-19  1:38       ` Miao Xie
2010-08-26 10:07       ` Miao Xie
2010-08-26 23:15         ` Chris Mason [this message]
