From: Wu Fengguang <fengguang.wu@intel.com>
To: Nikita Danilov <danilov@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Theodore Tso <tytso@mit.edu>,
Christoph Hellwig <hch@infradead.org>,
Dave Chinner <david@fromorbit.com>,
Chris Mason <chris.mason@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
"Li, Shaohua" <shaohua.li@intel.com>,
Myklebust Trond <Trond.Myklebust@netapp.com>,
"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
Jan Kara <jack@suse.cz>, Nick Piggin <npiggin@suse.de>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 30/45] vmscan: lumpy pageout
Date: Wed, 7 Oct 2009 23:00:47 +0800
Message-ID: <20091007150047.GA9848@localhost>
In-Reply-To: <8acda98c0910070750x6428b96fgdeee5946d1408888@mail.gmail.com>
On Wed, Oct 07, 2009 at 10:50:17PM +0800, Nikita Danilov wrote:
> 2009/10/7 Wu Fengguang <fengguang.wu@intel.com>:
>
> [...]
>
> > + if (bdi_cap_writeback_dirty(mapping->backing_dev_info) &&
> > + !mapping->a_ops->writepages) {
> > + wbc.range_start = (page->index + 1) << PAGE_CACHE_SHIFT;
> > + wbc.nr_to_write = LUMPY_PAGEOUT_PAGES - 1;
> > + generic_writepages(mapping, &wbc);
> > + iput(inode);
>
> I am afraid calling iput() from within pageout code is not generally
> safe: it can trigger a lot of file system activity (think
> open-unlinked file) that is not supposed to happen in the
> direct-reclaim context. Limiting pageout clustering to kswapd might be
> a better idea---this would improve direct-reclaim latency too, but
> still, file systems are not designed for re-entrant calls to iput().
Good point, thanks!
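
To make the reference-counting concern concrete, here is a minimal userspace sketch of the pairing discussed above: the inode is pinned only in kswapd context, and the reference is dropped on every exit path. Everything prefixed with toy_ is hypothetical and exists only for illustration; it is not kernel code. The one kernel fact it relies on is that the real iput() accepts a NULL pointer and does nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inode: only the reference count matters here. */
struct toy_inode {
	int refcount;
};

/* Take one reference, loosely modelling igrab(). */
static struct toy_inode *toy_igrab(struct toy_inode *inode)
{
	inode->refcount++;
	return inode;
}

/* Drop one reference, loosely modelling iput(); NULL is a no-op. */
static void toy_iput(struct toy_inode *inode)
{
	if (inode)
		inode->refcount--;
}

/*
 * Model of one pageout call. Returns 1 if extra pages would have been
 * clustered, 0 otherwise. The reference count must be unchanged on return.
 */
static int toy_pageout(struct toy_inode *host, int is_kswapd, int can_cluster)
{
	struct toy_inode *pinned = NULL;
	int clustered = 0;

	if (is_kswapd)
		pinned = toy_igrab(host);	/* direct reclaim never pins */

	/* ... the single-page write would happen here ... */

	if (is_kswapd && can_cluster)
		clustered = 1;			/* extra writeback would run here */

	toy_iput(pinned);			/* runs on every path */
	return clustered;
}
```

In this model a direct-reclaim caller (is_kswapd == 0) never reaches toy_igrab(), so it can never be the one to drop the last reference.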
---
vmscan: lumpy pageout
When paging out a dirty page, try to piggyback more consecutive dirty
pages (up to 512KB in total) to improve IO efficiency.
Only ext3/reiserfs, which don't have their own aops->writepages, are
supported in this initial version.
CC: Dave Chinner <david@fromorbit.com>
CC: Nikita Danilov <danilov@gmail.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
mm/page-writeback.c | 12 ++++++++++++
mm/vmscan.c | 20 ++++++++++++++++++++
2 files changed, 32 insertions(+)
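
For a sense of scale, the 512KB limit can be worked through in userspace. The sketch below assumes 4KB pages (a shift of 12); the SKETCH_ names and both helper functions are hypothetical stand-ins for PAGE_CACHE_SHIFT, PAGE_CACHE_SIZE, and LUMPY_PAGEOUT_PAGES, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page geometry: 4KB pages. The kernel uses PAGE_CACHE_SHIFT/SIZE. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE (1u << SKETCH_PAGE_SHIFT)

/* Same formula as LUMPY_PAGEOUT_PAGES in the patch: 512KB worth of pages. */
#define SKETCH_LUMPY_PAGES (512u * 1024u / SKETCH_PAGE_SIZE)

/*
 * Byte offset where the extra writeback starts: the page right after the
 * one that was just handed to ->writepage().
 */
static int64_t sketch_range_start(uint64_t page_index)
{
	/* Guard against overflowing the signed 64-bit byte offset. */
	assert(page_index < (uint64_t)(INT64_MAX >> SKETCH_PAGE_SHIFT));
	return (int64_t)(page_index + 1) << SKETCH_PAGE_SHIFT;
}

/* Extra pages to request; the first page has already been written. */
static long sketch_nr_to_write(void)
{
	return (long)SKETCH_LUMPY_PAGES - 1;
}
```

Under that assumption the cluster is 128 pages, so at most 127 extra pages are requested, and for a page at index 9 the extra writeback starts at byte offset 40960.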
--- linux.orig/mm/vmscan.c 2009-10-07 21:39:13.000000000 +0800
+++ linux/mm/vmscan.c 2009-10-07 22:59:57.000000000 +0800
@@ -344,6 +344,8 @@ typedef enum {
PAGE_CLEAN,
} pageout_t;
+#define LUMPY_PAGEOUT_PAGES (512 * 1024 / PAGE_CACHE_SIZE)
+
/*
* pageout is called by shrink_page_list() for each dirty page.
* Calls ->writepage().
@@ -398,6 +400,10 @@ static pageout_t pageout(struct page *pa
.nonblocking = 1,
.for_reclaim = 1,
};
+ struct inode *inode = NULL;
+
+ if (current_is_kswapd())
+ inode = igrab(mapping->host);
SetPageReclaim(page);
res = mapping->a_ops->writepage(page, &wbc);
@@ -405,10 +411,24 @@ static pageout_t pageout(struct page *pa
handle_write_error(mapping, page, res);
if (res == AOP_WRITEPAGE_ACTIVATE) {
ClearPageReclaim(page);
+ iput(inode);
return PAGE_ACTIVATE;
}
/*
+ * Only write_cache_pages() supports for_reclaim for now.
+ * Ignore shmem for now, thanks to Nikita.
+ */
+ if (current_is_kswapd() &&
+ bdi_cap_writeback_dirty(mapping->backing_dev_info) &&
+ !mapping->a_ops->writepages) {
+ wbc.range_start = (loff_t)(page->index + 1) << PAGE_CACHE_SHIFT;
+ wbc.nr_to_write = LUMPY_PAGEOUT_PAGES - 1;
+ generic_writepages(mapping, &wbc);
+ }
+ iput(inode);
+
+ /*
* Wait on writeback if requested to. This happens when
* direct reclaiming a large contiguous area and the
* first attempt to free a range of pages fails.
--- linux.orig/mm/page-writeback.c 2009-10-07 21:39:13.000000000 +0800
+++ linux/mm/page-writeback.c 2009-10-07 22:57:55.000000000 +0800
@@ -805,6 +805,11 @@ int write_cache_pages(struct address_spa
break;
}
+ if (wbc->for_reclaim && done_index != page->index) {
+ done = 1;
+ break;
+ }
+
if (nr_to_write != wbc->nr_to_write &&
done_index + WB_SEGMENT_DIST < page->index &&
--wbc->nr_segments <= 0) {
@@ -846,6 +851,13 @@ continue_unlock:
if (!clear_page_dirty_for_io(page))
goto continue_unlock;
+ /*
+ * active and unevictable pages will be checked at
+ * rotate time
+ */
+ if (wbc->for_reclaim)
+ SetPageReclaim(page);
+
ret = (*writepage)(page, wbc, data);
if (unlikely(ret)) {
if (ret == AOP_WRITEPAGE_ACTIVATE) {
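
As I read the write_cache_pages() hunk above, a for_reclaim walk stops at the first dirty page that is not at the expected next index, so only one contiguous run is written. The following userspace model sketches that stopping rule over a sorted array of dirty page indices. The function name and the array representation are hypothetical; the real code walks the page cache with pagevec lookups.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not kernel code: dirty[] holds dirty page indices in
 * ascending order. Starting at 'start', count the pages that would be
 * written before the budget runs out or a gap in the indices is seen.
 */
static size_t sketch_contiguous_run(const unsigned long *dirty, size_t n,
				    unsigned long start, size_t nr_to_write)
{
	unsigned long expected = start;	/* plays the role of done_index */
	size_t written = 0;
	size_t i;

	for (i = 0; i < n && written < nr_to_write; i++) {
		if (dirty[i] < start)
			continue;	/* before the requested range */
		if (dirty[i] != expected)
			break;		/* gap: stop rather than seek */
		written++;
		expected = dirty[i] + 1;
	}
	return written;
}
```

With dirty indices 3, 4, 5, 7, 8 and a start of 4, the model writes pages 4 and 5 and stops at the gap before 7, however large the budget is.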