public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 0/9] Delayed write metadata writeback V5
Date: Tue,  9 Feb 2010 14:56:33 +1100
Message-ID: <1265687802-23043-1-git-send-email-david@fromorbit.com>

While I started with killing async inode writeback, the series has
grown. It's not really limited to inode writeback - it touches dquot
flushing, changes the way the AIL pushes on buffers, adds xfsbufd
sorting for delayed write buffers, adds a real non-blocking mode to
inode reclaim and avoids physical inode writeback from the VFS while
fixing bugs in handling delayed write inodes.  Hence this is more
about enabling efficient delayed write metadata than it is about
killing async inode writeback.

The idea behind this series is to make metadata buffers get
written from xfsbufd via the delayed write queue rather than being
issued asynchronously from all over the place. To do this, async
buffer writeback is almost entirely removed from XFS, replaced
instead by delayed writes and a method to expedite flushing of
delayed write buffers when required.

The result of funnelling all the buffer IO into a single place
is that we can more tightly control and therefore optimise the
submission of metadata IO. Aggregating the buffers before dispatch
allows much better sort efficiency of the buffers as the sort window
is not limited to the size of the elevator congestion hysteresis
limit. Hence we can approach 100% merge efficiency on large numbers
of buffers when dispatched for IO and greatly reduce the amount
of seeking metadata writeback causes.
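
As a rough userspace illustration of why sorting before dispatch helps
(this is not code from the series; struct delwri_buf, count_ios() and
sort_and_count_ios() are hypothetical names, and the merge model is a
simplification that only merges neighbouring, physically contiguous
extents):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a delayed write buffer: one disk extent. */
struct delwri_buf {
	uint64_t blkno;	/* starting block number */
	uint32_t nblks;	/* length in blocks */
};

/* Order buffers by ascending starting block. */
static int delwri_buf_cmp(const void *a, const void *b)
{
	const struct delwri_buf *x = a;
	const struct delwri_buf *y = b;

	if (x->blkno < y->blkno)
		return -1;
	if (x->blkno > y->blkno)
		return 1;
	return 0;
}

/*
 * Count the IOs issued when buffers are dispatched in array order and
 * a buffer is merged only if it starts where the previous one ended.
 */
static size_t count_ios(const struct delwri_buf *bufs, size_t n)
{
	size_t ios = 0;

	for (size_t i = 0; i < n; i++) {
		if (i == 0 ||
		    bufs[i - 1].blkno + bufs[i - 1].nblks != bufs[i].blkno)
			ios++;
	}
	return ios;
}

/* Sort the whole aggregated queue first, then count the IOs. */
static size_t sort_and_count_ios(struct delwri_buf *bufs, size_t n)
{
	assert(bufs != NULL || n == 0);
	if (n == 0)
		return 0;
	qsort(bufs, n, sizeof(*bufs), delwri_buf_cmp);
	return count_ios(bufs, n);
}
```

With five 8-block buffers queued at blocks 24, 0, 16, 8 and 100,
count_ios() on the unsorted array reports five separate IOs, while
sort_and_count_ios() reports two: one for the contiguous run 0-31 and
one for the buffer at block 100. The larger the queue that is sorted in
one go, the more such runs can be found.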

The major change is to the inode flushing and reclaim code. Delayed
write inodes hold the flush lock for much longer than for async
writeback, and hence blocking on the flush lock can cause extremely
long latencies without other mechanisms to expedite the release of
the flush locks. To avoid having to flush inodes immediately,
all operations are non-blocking unless explicitly synchronous. This
required a significant rework of the inode reclaim code, but it
greatly simplified other pieces of code (e.g. log item pushing).
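
The non-blocking behaviour can be sketched in userspace roughly as
follows. This is an illustration only, not the XFS implementation:
struct demo_inode, demo_flush() and the FLUSH_* results are
hypothetical, the flush lock is modelled with a C11 atomic_flag, and
the lock is dropped as soon as the inode is marked clean, whereas in
the series it stays held until the delayed write buffer is written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode with a flush lock and a dirty flag. */
struct demo_inode {
	atomic_flag flush_lock;
	bool dirty;
};

enum flush_result { FLUSH_DONE, FLUSH_SKIPPED, FLUSH_CLEAN };

/* Try to take the flush lock; returns true on success, never waits. */
static bool flush_trylock(struct demo_inode *ip)
{
	return !atomic_flag_test_and_set(&ip->flush_lock);
}

static void flush_unlock(struct demo_inode *ip)
{
	atomic_flag_clear(&ip->flush_lock);
}

/*
 * Flush one inode. Background callers pass sync == false and skip any
 * inode whose flush lock is already held; only synchronous callers
 * wait for the lock.
 */
static enum flush_result demo_flush(struct demo_inode *ip, bool sync)
{
	assert(ip != NULL);

	if (!flush_trylock(ip)) {
		if (!sync)
			return FLUSH_SKIPPED;	/* come back to it later */
		while (!flush_trylock(ip))
			;	/* spin; real code would sleep instead */
	}

	if (!ip->dirty) {
		flush_unlock(ip);
		return FLUSH_CLEAN;
	}

	ip->dirty = false;	/* stands in for queueing the delayed write */
	flush_unlock(ip);
	return FLUSH_DONE;
}
```

A background pass calling demo_flush(ip, false) never stalls behind an
inode that is already being flushed; it gets FLUSH_SKIPPED and moves on
to the next inode.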

Version 5
- drop the fsync changes to xfs_fs_write_inode() and the associated
  locking changes, replace them with a targeted inode logging
  function from Christoph Hellwig to fix a performance regression on
  fs_mark -S4 workloads on an SSD.

Version 4
- rework inode reclaim checks for better legibility
- add warning to reclaim code when delwri flush errors occur
- kill XFS_ITEM_FLUSHING now it is not used
- clean up sync_mode flags being pushed into xfs_iflush()
- kill the now unused xfs_bawrite() function
- include Christoph's fsync cache flush fix
- rework the inode locking and call to xfs_fsync() when doing
  synchronous inode writes to close races between the fsync and
  the background delwri flush afterwards.

Version 3
- rework inode reclaim to:
	- separate it from xfs_iflush return values
	- provide a non-blocking mode for background operation
- apply delwri buffer promotion tricks to dquot flushing
- kill unneeded dquot flushing flags, similar to inode flushing flag
  removal
- fix sync inode flush bug when trying to flush delwri inodes

Version 2
- use generic list sort function
- when unmounting, push the delwri buffers first, then do sync inode
  reclaim so that reclaim doesn't block for 15 seconds waiting for
  delwri inode buffers to be aged and written before the inodes can
  be reclaimed.

Alex, the patch series is available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/dgc/xfs for-2.6.34

Christoph Hellwig (2):
      xfs: remove invalid barrier optimization from xfs_fsync
      xfs: log changed inodes instead of writing them synchronously

Dave Chinner (7):
      xfs: Make inode reclaim states explicit
      xfs: Use delayed write for inodes rather than async V2
      xfs: Don't issue buffer IO direct from AIL push V2
      xfs: Sort delayed write buffers before dispatch
      xfs: Use delay write promotion for dquot flushing
      xfs: kill the unused XFS_QMOPT_* flush flags V2
      xfs: kill xfs_bawrite

 fs/xfs/linux-2.6/xfs_buf.c    |  135 ++++++++++++++++++++++++++--------------
 fs/xfs/linux-2.6/xfs_buf.h    |    3 +-
 fs/xfs/linux-2.6/xfs_super.c  |  111 ++++++++++++++++++++++++---------
 fs/xfs/linux-2.6/xfs_sync.c   |  138 +++++++++++++++++++++++++++++++++-------
 fs/xfs/linux-2.6/xfs_trace.h  |    1 +
 fs/xfs/quota/xfs_dquot.c      |   38 +++++-------
 fs/xfs/quota/xfs_dquot_item.c |   87 ++++----------------------
 fs/xfs/quota/xfs_dquot_item.h |    4 -
 fs/xfs/quota/xfs_qm.c         |   14 ++---
 fs/xfs/xfs_buf_item.c         |   64 ++++++++++---------
 fs/xfs/xfs_inode.c            |   86 ++------------------------
 fs/xfs/xfs_inode.h            |   11 +---
 fs/xfs/xfs_inode_item.c       |  108 +++++++-------------------------
 fs/xfs/xfs_inode_item.h       |    6 --
 fs/xfs/xfs_mount.c            |   13 ++++-
 fs/xfs/xfs_quota.h            |    8 +--
 fs/xfs/xfs_trans.h            |    3 +-
 fs/xfs/xfs_trans_ail.c        |   13 ++--
 fs/xfs/xfs_vnodeops.c         |   12 +---
 19 files changed, 410 insertions(+), 445 deletions(-)

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
