From: Christoph Hellwig <hch@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>, xfs@oss.sgi.com
Subject: Re: XFS metadata flushing design - current and future
Date: Mon, 29 Aug 2011 08:33:18 -0400
Message-ID: <20110829123318.GA12928@infradead.org>
In-Reply-To: <20110829010149.GE3162@dastard>
On Mon, Aug 29, 2011 at 11:01:49AM +1000, Dave Chinner wrote:
> Another thing I've noticed is that AIL pushing of dirty inodes can
> be quite inefficient from a CPU usage perspective. Inodes that have
> already been flushed to their backing buffer result in an
> IOP_PUSHBUF call when the AIL tries to push them. Pushing the buffer
> requires a buffer cache search, followed by a delwri list promotion.
> However, the initial xfs_iflush() call on a dirty inode also
> clusters all the other remaining dirty inodes sharing the backing
> buffer into that buffer. When the AIL hits those other dirty inodes,
> they are already flush-locked and so we do an IOP_PUSHBUF call on
> every other dirty inode.
> So on a completely dirty inode cluster, we do ~30 needless buffer
> cache searches and buffer delwri promotions all for the same buffer.
> That's a lot of extra work we don't need to be doing - ~10% of the
> buffer cache lookups come from IOP_PUSHBUF under inode intensive
> metadata workloads:
One really stupid thing we do in that area is that the xfs_iflush from
xfs_inode_item_push puts the buffer at the end of the delwri list and
expects it to be aged, just so that the first xfs_inode_item_pushbuf
can promote it to the front of the list. Now that we mostly write
metadata from AIL pushing we should not do an additional pass of aging
on that - that's what we already use the AIL for. Once we do that we
should be able to remove the buffer promotion and make the pushbuf a
no-op. The only thing this might interact with in a not-so-nice way
would be inode reclaim if it still did delwri writes with the delay
period, but we might be able to get away without that one as well.
> Also, larger inode buffers to reduce the amount of IO we do to both
> read and write inodes might also provide significant benefits by
> reducing the amount of IO and number of buffers we need to track in
> the cache...
We could try to go for large in-core clusters. That is, try to always
allocate N aligned inode clusters together, and always read/write
clusters in that alignment together if possible.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 9+ messages
2011-08-27 8:03 XFS metadata flushing design - current and future Christoph Hellwig
2011-08-29 1:01 ` Dave Chinner
2011-08-29 6:33 ` Christoph Hellwig
2011-08-29 12:33 ` Christoph Hellwig [this message]
2011-08-30 1:28 ` Dave Chinner
2011-08-30 5:09 ` Christoph Hellwig
2011-08-30 7:06 ` Dave Chinner
2011-08-30 7:10 ` Christoph Hellwig
2011-09-09 22:31 ` Stewart Smith