From: Christoph Hellwig <hch@infradead.org>
To: Mel Gorman <mgorman@suse.de>
Cc: "xfs@oss.sgi.com" <xfs@oss.sgi.com>,
Christoph Hellwig <hch@infradead.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Wu Fengguang <fengguang.wu@intel.com>,
Johannes Weiner <jweiner@redhat.com>
Subject: Re: [PATCH 03/27] xfs: use write_cache_pages for writeback clustering
Date: Mon, 11 Jul 2011 07:10:39 -0400
Message-ID: <20110711111039.GA3139@infradead.org>
In-Reply-To: <20110705143409.GB15285@suse.de>

On Tue, Jul 05, 2011 at 03:34:10PM +0100, Mel Gorman wrote:
> > However, what I'm questioning is whether we should even care what
> > page memory reclaim wants to write - it seems to make fundamentally
> > bad decisions from an IO perspective.
> >
>
> It sucks from an IO perspective but from the perspective of the VM that
> needs memory to be free in a particular zone or node, it's a reasonable
> request.

It might appear reasonable, but it's not.

What the VM wants underneath is generally (1):

 - free N pages in zone Z

and it then goes on to free the pages one by one through kswapd,
which leads to freeing those N pages, but unless they were already
clean it will take a very long time to get there and bog down the
whole system.

So we need a better way to actually perform that underlying request.

Dave's suggestion of keeping different lists for clean vs dirty pages
in the VM and preferably reclaiming from the clean ones when under
zone pressure is a first step.  The second one will be to tell the
writeback threads to preferably write out pages from a given zone.
I'm actually not sure how to do that yet, as we could have memory from
different zones on a single inode.  Taking an inode that has memory
from the right zone and then writing that out will probably work fine
for different zones on 64-bit NUMA systems, where zones more or less
equal nodes.  It probably won't work very well if we need to free
up memory in the various low memory zones, as those will be spread
over random inodes.
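
A toy model of that first step, purely illustrative and not kernel
code: `Page`, `reclaim_lru_order` and `reclaim_clean_first` are
made-up names, and a zone is reduced to a plain list of pages.  It
only counts how many single-page writes each policy has to issue to
free the same number of pages.

```python
from collections import namedtuple

# Hypothetical stand-in for a page on a zone's LRU list.
Page = namedtuple("Page", ["inode", "index", "dirty"])


def reclaim_lru_order(lru, nr_to_free):
    """Free pages strictly in LRU order; each dirty page costs one write."""
    freed = lru[:nr_to_free]
    writes = sum(1 for page in freed if page.dirty)
    return freed, writes


def reclaim_clean_first(lru, nr_to_free):
    """Free clean pages first; write dirty pages only if clean ones run out."""
    clean = [page for page in lru if not page.dirty]
    dirty = [page for page in lru if page.dirty]
    freed = clean[:nr_to_free]
    written = dirty[:nr_to_free - len(freed)]
    return freed + written, len(written)


if __name__ == "__main__":
    lru = [
        Page(inode=1, index=0, dirty=True),
        Page(inode=2, index=5, dirty=False),
        Page(inode=1, index=1, dirty=True),
        Page(inode=3, index=0, dirty=False),
        Page(inode=2, index=6, dirty=False),
    ]
    print(reclaim_lru_order(lru, 3)[1])    # writes needed in LRU order
    print(reclaim_clean_first(lru, 3)[1])  # writes needed clean-first
```

With enough clean pages in the zone the clean-first policy issues no
writes at all, which is the whole point; it says nothing about which
dirty pages to write once the clean ones are gone.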

> It doesn't check how many pages are under writeback. Direct reclaim
> will check if the block device is congested but that is about
> it. Otherwise the expectation was the elevator would handle the
> merging of requests into a sensible pattern.

It can't.  The elevator has a relatively small window it can operate
on, and can never fix up a bad large scale writeback pattern.
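
To make the window argument concrete, here is a deliberately crude
sketch.  It is not how any real I/O scheduler works: the function name
`dispatched_requests` and the fixed, non-overlapping batches are
assumptions made for illustration.  The "elevator" may sort and merge
only `window` queued blocks at a time, and block numbers are assumed
to be distinct.

```python
def dispatched_requests(blocks, window):
    """Count I/Os issued when only `window` blocks can be sorted and merged at once."""
    total = 0
    for start in range(0, len(blocks), window):
        batch = sorted(blocks[start:start + window])
        # One request per batch, plus one more for every gap between neighbours.
        total += 1 + sum(1 for a, b in zip(batch, batch[1:]) if b != a + 1)
    return total


if __name__ == "__main__":
    # 64 contiguous blocks submitted with a stride of 8, as a stand-in
    # for pages picked off the LRU without regard to file offset.
    scattered = [i * 8 + j for j in range(8) for i in range(8)]
    print(dispatched_requests(scattered, window=8))
    print(dispatched_requests(sorted(scattered), window=8))
    print(dispatched_requests(scattered, window=64))
```

In this model the same 64 blocks cost 64 requests when submitted in
the scattered order and 8 when submitted in offset order; only a
window covering the whole range collapses the scattered stream into a
single request.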

> Also, while filesystem
> pages are getting cleaned by flushers, that does not cover anonymous
> pages being written to swap.

At least for now we will have to keep kswapd writeback for swap.  It
is just as inefficient as on a filesystem, but given that people don't
rely on swap performance we can probably live with it.  Note that we
can't simply use background flushing for swap, as that would mean
we'd need backing space allocated for all main memory, which isn't
very practical with today's memory sizes.  The whole concept of demand
paging anonymous memory leads to pretty bad I/O patterns.  If you're
actually making heavy use of it, the old-school Unix full process
paging would be a lot faster.

(1) modulo things like compaction