From: Nick Piggin <npiggin@suse.de>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Ravikiran G Thirumalai <kiran@scalex86.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Latest vfs scalability patch
Date: Tue, 6 Oct 2009 14:26:24 +0200	[thread overview]
Message-ID: <20091006122623.GE30316@wotan.suse.de> (raw)
In-Reply-To: <20091006101414.GM5216@kernel.dk>

On Tue, Oct 06, 2009 at 12:14:14PM +0200, Jens Axboe wrote:
> On Tue, Oct 06 2009, Nick Piggin wrote:
> > Hi,
> > 
> > Several people have been interested to test my vfs patches, so rather
> > than resend patches I have uploaded a rollup against Linus's current
> > head.
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/npiggin/patches/fs-scale/
> > 
> > I have used ext2,ext3,autofs4,nfs as well as in-memory filesystems
> > OK (although this doesn't mean there are no bugs!). Otherwise, if your
> > filesystem compiles, then there is a reasonable chance of it working,
> > or ask me and I can try updating it for the new locking.
> > 
> > I would be interested in seeing any numbers people might come up with,
> > including single-threaded performance.
> 
> I gave this a quick spin on the 64-thread nehalem. Just a simple dbench
> with 64 clients on tmpfs. The results are below. While running perf top
> -a in mainline, the top 5 entries are:
> 
>           2086691.00 - 96.6% : _spin_lock
>             14866.00 -  0.7% : copy_user_generic_string
>              5710.00 -  0.3% : mutex_spin_on_owner
>              2837.00 -  0.1% : _atomic_dec_and_lock
>              2274.00 -  0.1% : __d_lookup
> 
> Uhm, ouch... It doesn't look much prettier for the patched kernel, though:
> 
>           9396422.00 - 95.7% : _spin_lock
>             66978.00 -  0.7% : copy_user_generic_string
>             43775.00 -  0.4% : dput
>             23946.00 -  0.2% : __link_path_walk
>             17699.00 -  0.2% : path_init
>             15046.00 -  0.2% : do_lookup

Yep, this is the common-path lookup problem. Every dentry in the path
has its d_lock taken for every path lookup, so the cwd dentry's lock
bounces between CPUs a lot under dbench.
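
Roughly, each component of the walk does something like this (a
simplified illustration only, not code lifted from either tree --
d_hash_lookup() below is a made-up stand-in for the real hash-chain
walk done by __d_lookup()):

#include <linux/dcache.h>
#include <linux/spinlock.h>

/*
 * Every component of every lookup takes that dentry's d_lock and
 * bumps its refcount, so dentries shared by many walks (cwd, or "/"
 * for dbench) get their cache lines written by every CPU doing
 * lookups beneath them.
 */
static struct dentry *lookup_one_component(struct dentry *parent,
					   struct qstr *name)
{
	/* stand-in for the hash-chain walk */
	struct dentry *dentry = d_hash_lookup(parent, name);

	if (!dentry)
		return NULL;

	spin_lock(&dentry->d_lock);		/* the contended store */
	if (dentry->d_parent != parent) {
		spin_unlock(&dentry->d_lock);
		return NULL;
	}
	atomic_inc(&dentry->d_count);		/* another shared-line store */
	spin_unlock(&dentry->d_lock);

	return dentry;
}

With 64 clients all walking through the same few prefix dentries,
those spin_lock()s and refcount bumps are writes to shared cache
lines, which is where the remaining _spin_lock time in the patched
kernel goes.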

I'm working on doing path traversal without any locks or stores to
the dentries in the common case; that should basically be the last
piece of the puzzle for vfs locking. (It can be considered a
different type of problem than the global lock removal, but RCU-freed
struct inode is important for the approach I'm taking, so I'm basing
it on top of these patches.)
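
To give an idea of the direction (purely illustrative -- helper and
field names like __d_lookup_rcu() and d_seq are assumptions about
what the interface might look like, not what the patches will end up
using):

#include <linux/dcache.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/rcupdate.h>
#include <linux/seqlock.h>

/*
 * Walk one component under rcu_read_lock() with no locks taken and
 * no stores to the dentry at all.  A per-dentry sequence count is
 * used to detect a concurrent rename/unlink; on any doubt we return
 * -ECHILD and the caller falls back to the traditional refcounted,
 * locked walk.  RCU-freed struct inode matters because we may look
 * at d_inode here without holding any reference.
 */
static struct dentry *walk_component_rcu(struct dentry *parent,
					 struct qstr *name)
{
	struct dentry *dentry;
	unsigned seq;

	dentry = __d_lookup_rcu(parent, name, &seq); /* no d_lock, no d_count++ */
	if (!dentry)
		return ERR_PTR(-ECHILD);

	/* did a rename race with us while we looked at its fields? */
	if (read_seqcount_retry(&dentry->d_seq, seq))
		return ERR_PTR(-ECHILD);

	return dentry;
}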
 
It's a cop-out, but you could try running multiple dbench instances
under different working directories (or actually, IIRC dbench does
root-based path lookups, so maybe that won't help you much).


> Anyway, below are the results. Seem very stable.
> 
>                                 throughput
> ------------------------------------------------
> 2.6.32-rc3-git          |      561.218 MB/sec
> 2.6.32-rc3-git+patch    |      627.022 MB/sec

Well it's good to see you got some improvement.

