From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 2/4] XFS: Use the inode tree for finding dirty inodes
Date: Wed, 23 Jul 2008 10:05:48 +1000 [thread overview]
Message-ID: <20080723000548.GG5947@disturbed> (raw)
In-Reply-To: <20080722072733.GA15376@infradead.org>
On Tue, Jul 22, 2008 at 03:27:33AM -0400, Christoph Hellwig wrote:
> On Tue, Jul 22, 2008 at 03:30:19PM +1000, Dave Chinner wrote:
> > Until the method we'll use for referencing VFS inodes in core XFS
> > code is clear, I'd rather leave it consistent with the rest of the code.
>
> I would expect your cleanups in this area to go in before this patch,
> or not?
The cleanups are on top of these patches. I'll have to reorder and
redo them if they are to go in first...
> > As it is, I expect that this code will be revisited several times as
> > new functionality is added (e.g. combining the VFS and XFS inodes)
> > because that will change the way we interface with VFS and XFS inodes.
> > Much of this vnode specific code will change at that time; I'd
> > rather just change it once when the time is appropriate than have
> > to change it multiple times....
>
> Well, it's not vnode specific code. bhv_vnode_t is just a typedef for
> struct inode and the various functions dealing with it are trivial
> wrapper, many of them in fact expanding to no code at all.
Right, but we are still using bhv_vnode_t in core XFS code, not
struct inode. There haven't been patches to decide that one way or
another.
> But maybe
> killing this syntactic sugar should be a separate patch.
I think so - do it once and once only.
> I only fear
> we'll never get it in with the current review and commit latencies
> for XFS :(
I can see this being a big issue in the not-too-distant future.....
> > > This needs a big comment on why you're using the gang lookup for
> > > a single lookup. I guess that's because the gang lookup skips to
> > > the next actually existing entry instead of returning NULL, but that's
> > > not really obvious to the reader.
> >
> > Fair enough. This was later in the original series where this was
> > a dirty tag lookup and didn't need commenting ;)
>
> Actually even then a comment describing why we want a gang lookup
> when just looking up a single element might be a good idea.
Sure - I added that.
> On the
> other hand I wonder whether we might actually want to use the gang
> lookups, the unconditional igrab is probably cheaper than lots of
> radix tree lookups.
Yes, agreed, and that is the way I'm heading. I want to get
everything working against the radix tree before attempting
to optimise the lookups.
The next steps are:
- combine the XFS + VFS inodes into one structure so that we
can simplify inode lookup.
- bypass the linux icache so all lookups run from the radix
tree. This involves adding our own dirty VFS inode
tracking. This initially just uses the VFS superblock
dirty list like the generic code
- expand the dirty inode tracking to setting a tag in the
radix tree.
- changing radix tree walkers looking for dirty inodes to
use tag lookups - this requires adding interfaces to the
radix tree code
- Use a tag in the radix tree to mark inodes for reclaim.
- change reclaim to use a radix tree walk.
- changing radix tree walkers to all use gang and gang-tag
lookups - this requires adding interfaces to the radix
tree code
- factor the individual radix tree walk implementations once
commonalities are obvious.
This is just basic groundwork. The interesting stuff starts after
this - separating data and inode flushing, avoiding RMW cycles in
xfs_iflush, writeback aggregation beyond a single inode cluster,
multi-level caching and cache compression, RCU locking, thread pools
to parallelise tree walks, and so on.
There's a bunch of stuff in the pipeline - this is the basic
groundwork needed to catch it all effectively...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2008-07-23 0:05 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-20 12:19 [PATCH 0/4] XFS: replace the mount inode list with radix tree traversals Dave Chinner
2008-07-20 12:19 ` [PATCH 1/4] XFS: Remove xfs_iflush_all and clean up xfs_finish_reclaim_all() Dave Chinner
2008-07-21 7:58 ` Christoph Hellwig
2008-07-21 11:33 ` Dave Chinner
2008-07-22 4:24 ` Christoph Hellwig
2008-07-20 12:19 ` [PATCH 2/4] XFS: Use the inode tree for finding dirty inodes Dave Chinner
2008-07-22 4:28 ` Christoph Hellwig
2008-07-22 5:30 ` Dave Chinner
2008-07-22 7:27 ` Christoph Hellwig
2008-07-23 0:05 ` Dave Chinner [this message]
2008-07-23 2:10 ` Mark Goodwin
2008-07-23 3:46 ` Eric Sandeen
2008-07-23 4:04 ` Mark Goodwin
2008-07-23 4:09 ` Eric Sandeen
2008-07-23 5:00 ` Mark Goodwin
2008-07-23 4:27 ` Dave Chinner
2008-07-23 4:18 ` Dave Chinner
2008-07-23 15:37 ` Russell Cattelan
2008-07-24 6:02 ` Mark Goodwin
2008-07-25 3:55 ` Russell Cattelan
2008-07-25 4:08 ` Mark Goodwin
2008-07-25 5:40 ` Russell Cattelan
2008-07-25 6:55 ` Niv Sardi
2008-07-23 20:49 ` Christoph Hellwig
2008-07-24 11:46 ` Dave Chinner
2008-07-20 12:19 ` [PATCH 3/4] XFS: Traverse inode trees when releasing dquots Dave Chinner
2008-07-20 12:19 ` [PATCH 4/4] XFS: remove the mount inode list Dave Chinner
2008-07-22 4:29 ` Christoph Hellwig
2008-07-22 5:42 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080723000548.GG5947@disturbed \
--to=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox