From: Nick Piggin <npiggin@kernel.dk>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Inode Lock Scalability V4
Date: Sun, 17 Oct 2010 04:55:15 +1100 [thread overview]
Message-ID: <20101016175515.GA6868@amd> (raw)
In-Reply-To: <1287216853-17634-1-git-send-email-david@fromorbit.com>
On Sat, Oct 16, 2010 at 07:13:54PM +1100, Dave Chinner wrote:
> Version 2 of this series is a complete rework of the original patch
> series. Nick's original code nested list locks inside the the
> inode->i_lock, resulting in a large mess of trylock operations to
> get locks out of order all over the place. In many cases, the reason
> fo this lock ordering is removed later on in Nick's series as
> cleanups are introduced.
I think you must misunderstand how my patchset is structured. It is not
that it was simply thrown together with cleanups piled on top of crap as
I discovered new ways to fix trylocks.
It is structured so that for the first steps, each (semi-)independent
aspect of inode_lock's concurrency protection is taken over by a new
lock. I've just made these locks dumb globals for simplicity at this
point.
Often, the required locking of these different aspects of the icache
overlap. Also inode_lock remains in place until the end. This makes lock
ordering get ugly.
But the point is that once we get past the trylocks and actually have
the locks held, it is relatively easy to demonstrate that the protection
provided is exactly what inode_lock provided.
After inode_lock is removed, I start to scale the locks properly and
introduce more complicated transforms to the code to improve the
locking. I really like how I split it up.
In your patch set you've basically pulled all these steps into the
inode_lock splitting itself. This is my first big objection.
Second I actually prefer how I cover all the icache state of an inode
with i_lock. You have avoided some coverage in the interest of reducing
lock ordering complexity, but as I have demonstrated RCU tends to work
as well for this too (and is mostly all solved in the second half of my
series).
> This patch set is just the basic inode_lock breakup patches plus a
> few more simple changes to the inode code. It stops short of
> introducing RCU inode freeing because those changes are not
> completely baked yet.
It also doesn't contain per-zone locking and lrus, or scalability of
superblock list locking.
And while the rcu-walk path walking is not fully baked, it has been
reviewed by Linus and is in pretty good shape. So I prefer to utilise
RCU locking here too, seeing as we know it will go in.
I much prefer to fix all of the problematic global locks in icache up
front and do the icache merge in a single release so we don't have
locking changes there stringing out over several releases.
next prev parent reply other threads:[~2010-10-16 17:55 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-10-16 8:13 Inode Lock Scalability V4 Dave Chinner
2010-10-16 8:13 ` [PATCH 01/19] fs: switch bdev inode bdi's correctly Dave Chinner
2010-10-16 9:30 ` Nick Piggin
2010-10-16 16:31 ` Christoph Hellwig
2010-10-16 8:13 ` [PATCH 02/19] kernel: add bl_list Dave Chinner
2010-10-16 9:51 ` Nick Piggin
2010-10-16 16:32 ` Christoph Hellwig
2010-10-16 8:13 ` [PATCH 03/19] fs: Convert nr_inodes and nr_unused to per-cpu counters Dave Chinner
2010-10-16 8:29 ` Eric Dumazet
2010-10-16 10:04 ` Dave Chinner
2010-10-16 10:27 ` Eric Dumazet
2010-10-16 17:26 ` Peter Zijlstra
2010-10-17 1:09 ` Dave Chinner
2010-10-17 1:12 ` Christoph Hellwig
2010-10-17 2:16 ` Dave Chinner
2010-10-16 8:13 ` [PATCH 04/19] fs: Implement lazy LRU updates for inodes Dave Chinner
2010-10-16 9:29 ` Nick Piggin
2010-10-16 16:59 ` Christoph Hellwig
2010-10-16 17:29 ` Nick Piggin
2010-10-16 17:34 ` Nick Piggin
2010-10-17 0:47 ` Christoph Hellwig
2010-10-17 0:47 ` Christoph Hellwig
2010-10-17 2:09 ` Nick Piggin
2010-10-17 1:53 ` Dave Chinner
2010-10-16 8:13 ` [PATCH 05/19] fs: inode split IO and LRU lists Dave Chinner
2010-10-16 8:14 ` [PATCH 06/19] fs: Clean up inode reference counting Dave Chinner
2010-10-16 8:14 ` [PATCH 07/19] exofs: use iput() for inode reference count decrements Dave Chinner
2010-10-16 8:14 ` [PATCH 08/19] fs: rework icount to be a locked variable Dave Chinner
2010-10-16 8:14 ` [PATCH 09/19] fs: Factor inode hash operations into functions Dave Chinner
2010-10-16 8:14 ` [PATCH 10/19] fs: Introduce per-bucket inode hash locks Dave Chinner
2010-10-16 8:14 ` [PATCH 11/19] fs: add a per-superblock lock for the inode list Dave Chinner
2010-10-16 8:14 ` [PATCH 12/19] fs: split locking of inode writeback and LRU lists Dave Chinner
2010-10-16 8:14 ` [PATCH 13/19] fs: Protect inode->i_state with the inode->i_lock Dave Chinner
2010-10-16 8:14 ` [PATCH 14/19] fs: introduce a per-cpu last_ino allocator Dave Chinner
2010-10-16 8:14 ` [PATCH 15/19] fs: Make iunique independent of inode_lock Dave Chinner
2010-10-16 8:14 ` [PATCH 16/19] fs: icache remove inode_lock Dave Chinner
2010-10-16 8:14 ` [PATCH 17/19] fs: Reduce inode I_FREEING and factor inode disposal Dave Chinner
2010-10-17 1:30 ` Christoph Hellwig
2010-10-17 2:49 ` Nick Piggin
2010-10-17 4:13 ` Dave Chinner
2010-10-17 4:35 ` Nick Piggin
2010-10-17 5:13 ` Nick Piggin
2010-10-17 6:52 ` Dave Chinner
2010-10-17 7:05 ` Nick Piggin
2010-10-17 23:39 ` Dave Chinner
2010-10-18 21:27 ` Sage Weil
2010-10-19 3:54 ` Nick Piggin
2010-10-16 8:14 ` [PATCH 18/19] fs: split __inode_add_to_list Dave Chinner
2010-10-16 8:14 ` [PATCH 19/19] fs: do not assign default i_ino in new_inode Dave Chinner
2010-10-16 9:09 ` Eric Dumazet
2010-10-16 16:35 ` Christoph Hellwig
2010-10-18 9:11 ` Eric Dumazet
2010-10-18 14:48 ` Christoph Hellwig
2010-10-16 17:55 ` Nick Piggin [this message]
2010-10-17 2:47 ` Inode Lock Scalability V4 Dave Chinner
2010-10-17 2:55 ` Nick Piggin
2010-10-17 2:57 ` Nick Piggin
2010-10-17 6:10 ` Dave Chinner
2010-10-17 6:34 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20101016175515.GA6868@amd \
--to=npiggin@kernel.dk \
--cc=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).