linux-fsdevel.vger.kernel.org archive mirror
From: Nick Piggin <npiggin@kernel.dk>
To: Dave Chinner <david@fromorbit.com>
Cc: Nick Piggin <npiggin@gmail.com>, Nick Piggin <npiggin@kernel.dk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [patch 1/6] fs: icache RCU free inodes
Date: Mon, 15 Nov 2010 15:21:00 +1100
Message-ID: <20101115042059.GB3320@amd>
In-Reply-To: <20101115010027.GC22876@dastard>

On Mon, Nov 15, 2010 at 12:00:27PM +1100, Dave Chinner wrote:
> On Fri, Nov 12, 2010 at 12:24:21PM +1100, Nick Piggin wrote:
> > On Wed, Nov 10, 2010 at 9:05 AM, Nick Piggin <npiggin@kernel.dk> wrote:
> > > On Tue, Nov 09, 2010 at 09:08:17AM -0800, Linus Torvalds wrote:
> > >> On Tue, Nov 9, 2010 at 8:21 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >> >
> > >> > You can see problems using this fancy thing :
> > >> >
> > >> > - Need to use slab ctor() to not overwrite some sensitive fields of
> > >> > reused inodes.
> > >> >  (spinlock, next pointer)
> > >>
> > >> Yes, the downside of using SLAB_DESTROY_BY_RCU is that you really
> > >> cannot initialize some fields in the allocation path, because they may
> > >> end up being still used while allocating a new (well, re-used) entry.
> > >>
> > >> However, I think that in the long run we pretty much _have_ to do that
> > >> anyway, because the "free each inode separately with RCU" is a real
> > >> overhead (Nick reports 10-20% cost). So it just makes my skin crawl to
> > >> go that way.
> > >
> > > This is a creat/unlink loop on a tmpfs filesystem. Any real filesystem
> > > is going to be *much* heavier in creat/unlink (so that 10-20% cost would
> > > look more like a few %), and any real workload is going to have much
> > > less intensive pattern.
> > 
> > So to get some more precise numbers, on a new kernel, and on a nehalem
> > class CPU, creat/unlink busy loop on ramfs (worst possible case for inode
> > RCU), then inode RCU costs 12% more time.
> > 
> > If we go to ext4 over ramdisk, it's 4.2% slower. Btrfs is 4.3% slower, XFS
> > is about 4.9% slower.
> 
> That is actually significant because in the current XFS performance
> using delayed logging for pure metadata operations is not that far
> off ramdisk results.  Indeed, the simple test:
> 
>         while (i++ < 1000 * 1000) {
>                 int fd = open("foo", O_CREAT|O_RDWR, 0777);
>                 unlink("foo");
>                 close(fd);
>         }
> 
> Running 8 instances of the above on XFS, each in their own
> directory, on a single sata drive with delayed logging enabled with
> my current working XFS tree (includes SLAB_DESTROY_BY_RCU inode
> cache and XFS inode cache, and numerous other XFS scalability
> enhancements) currently runs at ~250k files/s. It took ~33s for 8 of
> those loops above to complete in parallel, and was 100% CPU bound...

David,

This is 30K inodes per second per CPU, versus the nearly 800K per second
number that I measured the 12% slowdown with. About 25x slower. How you
are trying to FUD this as doing anything but confirming my hypothesis, I
don't know and honestly I don't want to know so don't try to tell me.

That you are still at this campaign of negative and destructive crap
baffles me. All the effort you've put into negativity and obstruction,
you could have gone and got some *actual* numbers. But no, you're
obviously more interested in FUD.

I don't know what is funnier, that I keep responding to you, or that you
keep expecting me to reply when you ignore all _my_ comments about your
patches and ignore all the reasons I have given to want to merge things
my way (eg. my response to SLAB_DESTROY_BY_RCU patch where you ignored
all my feedback, and you ignore this entire thread about how and why I
want to approach rcu-walk in the way I do).

But that's it. I have explained my position, offered reasonable answers
to all questions and objections, shown good numbers, and given
strategies that regressions can be solved with. That's all I need to do.

I acknowledge the very small potential for regressions with inode-RCU
for a very small number of users. I also weigh that against complexity
and reviewability, and against the very large speedups for very many
users that rcu-walk can give. I have also offered approaches by which
future work can resolve any regressions. You ignored all that.

You show me no respect or courtesy and seem to take me as a big joke. So
at this point I'm not interested in your handwaving or opinions. Is that
clear? Until you 1) treat me the way you expect to be treated, and 2)
actually have something constructive, do not cc me. I do not care.
