From: Theodore Ts'o <tytso@mit.edu>
To: stable@vger.kernel.org
Cc: Ext4 Developers List <linux-ext4@vger.kernel.org>,
Theodore Ts'o <tytso@mit.edu>, Dave Chinner <david@fromorbit.com>
Subject: [PATCH v2.6.34.y 21/28] ext4: Use our own write_cache_pages()
Date: Tue, 1 Jun 2010 12:13:08 -0400
Message-ID: <1275408795-17487-21-git-send-email-tytso@mit.edu>
In-Reply-To: <1275408795-17487-1-git-send-email-tytso@mit.edu>

Make a copy of write_cache_pages() for the benefit of
ext4_da_writepages(). This allows us to simplify the code somewhat, and
will allow us to further customize the code in future patches.

There are some nasty hacks in write_cache_pages(), which Linus has
(correctly) characterized as vile. I've just copied it into
write_cache_pages_da(), without trying to clean those bits up, lest I
break something in ext4's delalloc implementation, which is a bit
fragile right now. This will allow Dave Chinner to clean up
write_cache_pages() in mm/page-writeback.c without worrying about
breaking ext4. Eventually write_cache_pages_da() will go away when I
rewrite ext4's delayed allocation and create a general
ext4_writepages() which is used for all of ext4's writeback. Until
then, this is the lowest-risk way to clean up the core
write_cache_pages() function.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>
---
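For readers skimming the series, the shape of the change can be sketched
in userspace C. This is only an illustration under stated assumptions:
every toy_* name below is hypothetical and none of it is kernel code. It
contrasts a generic walker that reaches the per-page routine through an
opaque callback with a forked walker that calls the routine directly
with a typed argument, which is roughly what replacing
write_cache_pages() with write_cache_pages_da() does.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct page and struct mpage_da_data. */
struct toy_page { int dirty; };
struct toy_mpd { int pages_seen; };

typedef int (*toy_writepage_t)(struct toy_page *page, void *data);

/* Per-page routine: here it only counts the pages handed to it. */
static int toy_da_writepage(struct toy_page *page, struct toy_mpd *mpd)
{
	(void)page;
	mpd->pages_seen++;
	return 0;
}

/* Adapter needed only by the generic walker's void * interface. */
static int toy_da_writepage_cb(struct toy_page *page, void *data)
{
	return toy_da_writepage(page, data);
}

/* Generic walker: per-page work goes through an opaque callback. */
static int toy_walk_generic(struct toy_page *pages, size_t n,
			    toy_writepage_t writepage, void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret;

		if (!pages[i].dirty)
			continue;	/* someone wrote it for us */
		pages[i].dirty = 0;
		ret = writepage(&pages[i], data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Forked walker: same loop, but typed and free to diverge later. */
static int toy_walk_da(struct toy_page *pages, size_t n,
		       struct toy_mpd *mpd)
{
	for (size_t i = 0; i < n; i++) {
		int ret;

		if (!pages[i].dirty)
			continue;
		pages[i].dirty = 0;
		ret = toy_da_writepage(&pages[i], mpd);
		if (ret)
			return ret;
	}
	return 0;
}
```

Both walkers visit the same pages; the fork simply removes the adapter
and lets the filesystem-specific copy change without touching the
generic one.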
fs/ext4/inode.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++---------
1 files changed, 119 insertions(+), 22 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 6aa0442..830336d 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2426,17 +2426,6 @@ static int __mpage_da_writepage(struct page *page,
struct buffer_head *bh, *head;
sector_t logical;
- if (mpd->io_done) {
- /*
- * Rest of the page in the page_vec
- * redirty then and skip then. We will
- * try to write them again after
- * starting a new transaction
- */
- redirty_page_for_writepage(wbc, page);
- unlock_page(page);
- return MPAGE_DA_EXTENT_TAIL;
- }
/*
* Can we merge this page to current extent?
*/
@@ -2831,6 +2820,124 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
return ext4_chunk_trans_blocks(inode, max_blocks);
}
+/*
+ * write_cache_pages_da - walk the list of dirty pages of the given
+ * address space and call the callback function (which usually writes
+ * the pages).
+ *
+ * This is a forked version of write_cache_pages(). Differences:
+ * Range cyclic is ignored.
+ * no_nrwrite_index_update is always presumed true
+ */
+static int write_cache_pages_da(struct address_space *mapping,
+ struct writeback_control *wbc,
+ struct mpage_da_data *mpd)
+{
+ int ret = 0;
+ int done = 0;
+ struct pagevec pvec;
+ int nr_pages;
+ pgoff_t index;
+ pgoff_t end; /* Inclusive */
+ long nr_to_write = wbc->nr_to_write;
+
+ pagevec_init(&pvec, 0);
+ index = wbc->range_start >> PAGE_CACHE_SHIFT;
+ end = wbc->range_end >> PAGE_CACHE_SHIFT;
+
+ while (!done && (index <= end)) {
+ int i;
+
+ nr_pages = pagevec_lookup_tag(&pvec, mapping, &index,
+ PAGECACHE_TAG_DIRTY,
+ min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1);
+ if (nr_pages == 0)
+ break;
+
+ for (i = 0; i < nr_pages; i++) {
+ struct page *page = pvec.pages[i];
+
+ /*
+ * At this point, the page may be truncated or
+ * invalidated (changing page->mapping to NULL), or
+ * even swizzled back from swapper_space to tmpfs file
+ * mapping. However, page->index will not change
+ * because we have a reference on the page.
+ */
+ if (page->index > end) {
+ done = 1;
+ break;
+ }
+
+ lock_page(page);
+
+ /*
+ * Page truncated or invalidated. We can freely skip it
+ * then, even for data integrity operations: the page
+ * has disappeared concurrently, so there could be no
+ * real expectation of this data integrity operation
+ * even if there is now a new, dirty page at the same
+ * pagecache address.
+ */
+ if (unlikely(page->mapping != mapping)) {
+continue_unlock:
+ unlock_page(page);
+ continue;
+ }
+
+ if (!PageDirty(page)) {
+ /* someone wrote it for us */
+ goto continue_unlock;
+ }
+
+ if (PageWriteback(page)) {
+ if (wbc->sync_mode != WB_SYNC_NONE)
+ wait_on_page_writeback(page);
+ else
+ goto continue_unlock;
+ }
+
+ BUG_ON(PageWriteback(page));
+ if (!clear_page_dirty_for_io(page))
+ goto continue_unlock;
+
+ ret = __mpage_da_writepage(page, wbc, mpd);
+ if (unlikely(ret)) {
+ if (ret == AOP_WRITEPAGE_ACTIVATE) {
+ unlock_page(page);
+ ret = 0;
+ } else {
+ done = 1;
+ break;
+ }
+ }
+
+ if (nr_to_write > 0) {
+ nr_to_write--;
+ if (nr_to_write == 0 &&
+ wbc->sync_mode == WB_SYNC_NONE) {
+ /*
+ * We stop writing back only if we are
+ * not doing integrity sync. In case of
+ * integrity sync we have to keep going
+ * because someone may be concurrently
+ * dirtying pages, and we might have
+ * synced a lot of newly appeared dirty
+ * pages, but have not synced all of the
+ * old dirty pages.
+ */
+ done = 1;
+ break;
+ }
+ }
+ }
+ pagevec_release(&pvec);
+ cond_resched();
+ }
+ return ret;
+}
+
+
static int ext4_da_writepages(struct address_space *mapping,
struct writeback_control *wbc)
{
@@ -2839,7 +2946,6 @@ static int ext4_da_writepages(struct address_space *mapping,
handle_t *handle = NULL;
struct mpage_da_data mpd;
struct inode *inode = mapping->host;
- int no_nrwrite_index_update;
int pages_written = 0;
long pages_skipped;
unsigned int max_pages;
@@ -2919,12 +3025,6 @@ static int ext4_da_writepages(struct address_space *mapping,
mpd.wbc = wbc;
mpd.inode = mapping->host;
- /*
- * we don't want write_cache_pages to update
- * nr_to_write and writeback_index
- */
- no_nrwrite_index_update = wbc->no_nrwrite_index_update;
- wbc->no_nrwrite_index_update = 1;
pages_skipped = wbc->pages_skipped;
retry:
@@ -2966,8 +3066,7 @@ retry:
mpd.io_done = 0;
mpd.pages_written = 0;
mpd.retval = 0;
- ret = write_cache_pages(mapping, wbc, __mpage_da_writepage,
- &mpd);
+ ret = write_cache_pages_da(mapping, wbc, &mpd);
/*
* If we have a contiguous extent of pages and we
* haven't done the I/O yet, map the blocks and submit
@@ -3033,8 +3132,6 @@ retry:
mapping->writeback_index = index;
out_writepages:
- if (!no_nrwrite_index_update)
- wbc->no_nrwrite_index_update = 0;
wbc->nr_to_write -= nr_to_writebump;
wbc->range_start = range_start;
trace_ext4_da_writepages_result(inode, wbc, ret, pages_written);
--
1.6.6.1.1.g974db.dirty
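
Two small pieces of arithmetic in the write_cache_pages_da() loop are
easy to misread, so here is a userspace model of them. It is a sketch,
not kernel code: the toy_* names and the sync-mode enum are hypothetical
stand-ins, and the pagevec size is passed as a parameter rather than
assumed. The first helper mirrors the lookup count
min(end - index, PAGEVEC_SIZE-1) + 1, which, as far as I can tell, is
written that way so that the "+ 1" cannot wrap when end is the largest
page index. The second mirrors the nr_to_write check: the local budget
is decremented per page, and the walk stops at zero only when the caller
is not doing an integrity sync.

```c
#include <stdbool.h>

/* Hypothetical stand-in for WB_SYNC_NONE vs. an integrity sync. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

/*
 * Number of pages to request for the inclusive range [index, end],
 * capped at pagevec_size.  Clamping before adding 1 keeps the result
 * from wrapping to 0 when end - index is the maximum value.
 */
static unsigned long toy_lookup_count(unsigned long index,
				      unsigned long end,
				      unsigned long pagevec_size)
{
	unsigned long span = end - index;
	unsigned long cap = pagevec_size - 1;

	return (span < cap ? span : cap) + 1;
}

/*
 * Charge one page against the local budget.  Returns true when the
 * walk should stop: the budget just reached zero and this is not an
 * integrity sync.  A budget that is already <= 0 is left untouched.
 */
static bool toy_budget_exhausted(long *nr_to_write,
				 enum toy_sync_mode mode)
{
	if (*nr_to_write > 0) {
		(*nr_to_write)--;
		if (*nr_to_write == 0 && mode == TOY_SYNC_NONE)
			return true;
	}
	return false;
}
```

Note that in the patch the budget is a local copy of wbc->nr_to_write;
the helper takes a pointer only so the caller can observe the decrement.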