From: Chris Mason <chris.mason@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
ext4 <linux-ext4@vger.kernel.org>,
Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] Improve buffered streaming write ordering
Date: Thu, 09 Oct 2008 11:11:20 -0400
Message-ID: <1223565080.14090.28.camel@think.oraclecorp.com>
In-Reply-To: <20081002234309.GH30001@disturbed>

On Fri, 2008-10-03 at 09:43 +1000, Dave Chinner wrote:
> On Thu, Oct 02, 2008 at 11:48:56PM +0530, Aneesh Kumar K.V wrote:
> > On Thu, Oct 02, 2008 at 08:20:54AM -0400, Chris Mason wrote:
> > > On Wed, 2008-10-01 at 21:52 -0700, Andrew Morton wrote:
> > > For a 4.5GB streaming buffered write, this printk inside
> > > ext4_da_writepage shows up 37,2429 times in /var/log/messages.
> > >
> >
> > Part of that can happen due to shrink_page_list -> pageout -> writepage
> > call back with lots of unallocated buffer_heads(blocks).
>
> Quite frankly, a simple streaming buffered write should *never*
> trigger writeback from the LRU in memory reclaim. That indicates
> that some feedback loop has broken down and we are not cleaning
> pages fast enough or perhaps in the correct order. Page reclaim in
> this case should be reclaiming clean pages (those that have already
> been written back), not writing back random dirty pages.

Here are some go faster stripes for the XFS buffered writeback. This
patch has a lot of debatable features to it, but the idea is to show
which knobs are slowing us down today.

The first change is to avoid calling balance_dirty_pages_ratelimited on
every page. When we know we're doing a largeish write it makes more
sense to balance things less often. This might just mean our
ratelimit_pages magic value is too small.
The second change makes XFS bump wbc->nr_to_write (suggested by
Christoph), which probably makes delalloc go in bigger chunks.

On unpatched kernels, XFS does streaming writes to my 4-drive array at
around 205MB/s. With the patch below, I come in at 326MB/s. O_DIRECT
runs at 330MB/s, so that's pretty good.

With just the nr_to_write change, I get around 315MB/s.

With just the balance_dirty_pages_ratelimited_nr change, I get around 240MB/s.
-chris
diff --git a/fs/xfs/linux-2.6/xfs_aops.c b/fs/xfs/linux-2.6/xfs_aops.c
index a44d68e..c72bd54 100644
--- a/fs/xfs/linux-2.6/xfs_aops.c
+++ b/fs/xfs/linux-2.6/xfs_aops.c
@@ -944,6 +944,9 @@ xfs_page_state_convert(
 	int trylock = 0;
 	int all_bh = unmapped;
+
+	wbc->nr_to_write *= 4;
+
 	if (startio) {
 		if (wbc->sync_mode == WB_SYNC_NONE && wbc->nonblocking)
 			trylock |= BMAPI_TRYLOCK;
diff --git a/mm/filemap.c b/mm/filemap.c
index 876bc59..b6c26e3 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2389,6 +2389,7 @@ static ssize_t generic_perform_write(struct file *file,
 	long status = 0;
 	ssize_t written = 0;
 	unsigned int flags = 0;
+	unsigned long nr = 0;
 	/*
 	 * Copies from kernel address space cannot fail (NFSD is a big user).
@@ -2460,11 +2461,17 @@ again:
 		}
 		pos += copied;
 		written += copied;
-
-		balance_dirty_pages_ratelimited(mapping);
+		nr++;
+		if (nr > 256) {
+			balance_dirty_pages_ratelimited_nr(mapping, nr);
+			nr = 0;
+		}
 	} while (iov_iter_count(i));
+	if (nr)
+		balance_dirty_pages_ratelimited_nr(mapping, nr);
+
 	return written ? written : status;
 }