From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 0/3] Kill async inode writeback V2
Date: Tue, 5 Jan 2010 11:04:18 +1100 [thread overview]
Message-ID: <1262649861-28530-1-git-send-email-david@fromorbit.com> (raw)
Currently we do background inode writeback on demand from many
different places - xfssyncd, xfsbufd, xfsaild and the bdi writeback
threads. The result is that inodes can be pushed at any time and
there is little to no locality in the IO patterns resulting from
such writeback. Indeed, we can have competing writeback occurring
simultaneously, which only serves to slow writeback down.
The idea behind this series is to have metadata buffers written
from xfsbufd via the delayed write queue rather than from all these
other places. All the other call sites now simply mark the buffers
for delayed write so that the xfsbufd can issue them.
This means that inode flushes can no longer happen asynchronously,
but we still need a method for ensuring timely dispatch of buffers
that we may be waiting for IO completion on. To do this, we allow
delayed write buffers to be "promoted" in the delayed write queue.
This effectively short-cuts the aging of the buffers, and combined
with an on-demand flush of the xfsbufd, all aged and promoted
buffers are pushed out at the same time.
Combine this with sorting the delayed write buffers into disk
offset order before dispatch, and we vastly improve the IO patterns
for metadata writeback. IO is issued from one place and in a
disk/elevator friendly order.
Version 2:
- use generic list sort function
- when unmounting, push the delwri buffers first, then do sync inode
reclaim so that reclaim doesn't block for 15 seconds waiting for
delwri inode buffers to be aged and written before the inodes can
be reclaimed.
Perf results (average of 3 runs) on a debug XFS build (meaning
allocation patterns are randomly varied, so runtimes are also a bit
variable):
Untar 2.6.32 kernel tarball, sync, then remove:

                        Untar+sync      rm -rf
    xfs-dev:                 25.2s       13.0s
    xfs-dev-delwri-1:        22.5s        9.1s
    xfs-dev-delwri-2:        21.9s        8.4s
4 processes each creating 100,000 five-byte files in separate
directories concurrently, then 4 processes removing a directory each
concurrently:

                        create          rm -rf
    xfs-dev:                 8m32s       4m10s
    xfs-dev-delwri-1:        4m55s       3m42s
    xfs-dev-delwri-2:        4m56s       3m33s
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 14+ messages
2010-01-05 0:04 Dave Chinner [this message]
2010-01-05 0:04 ` [PATCH 1/3] xfs: Use delayed write for inodes rather than async Dave Chinner
2010-01-08 10:36 ` Christoph Hellwig
2010-01-08 11:05 ` Dave Chinner
2010-01-08 11:14 ` Christoph Hellwig
2010-01-05 0:04 ` [PATCH 2/3] xfs: Don't issue buffer IO direct from AIL push Dave Chinner
2010-01-08 11:07 ` Christoph Hellwig
2010-01-08 11:15 ` Dave Chinner
2010-01-05 0:04 ` [PATCH 3/3] xfs: Sort delayed write buffers before dispatch Dave Chinner
2010-01-08 11:11 ` Christoph Hellwig
2010-01-08 11:17 ` Dave Chinner
2010-01-06 18:08 ` [PATCH 0/3] Kill async inode writeback V2 Christoph Hellwig
2010-01-06 22:49 ` Dave Chinner
2010-01-08 10:14 ` Christoph Hellwig