From: Jan Kara <jack@suse.cz>
To: LKML <linux-kernel@vger.kernel.org>
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
npiggin@suse.de, Jan Kara <jack@suse.cz>
Subject: [PATCH 08/11] fs: Don't clear dirty bits in block_write_full_page()
Date: Mon, 15 Jun 2009 19:59:55 +0200
Message-ID: <1245088797-29533-9-git-send-email-jack@suse.cz>
In-Reply-To: <1245088797-29533-1-git-send-email-jack@suse.cz>
If get_block() fails in block_write_full_page(), we don't want to clear
dirty bits on buffers. Actually, we even want to redirty the page. This
way we won't silently discard user data (written e.g. through mmap)
in case of ENOSPC, EDQUOT, EIO or another write error (which may be just
transient, e.g. because we have to commit a transaction to free up some space).
The downside of this approach is that if the error is persistent, the
page stays pinned in memory forever, and if there are lots of such pages, we can
drive the machine OOM.
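
To illustrate the difference in behaviour, here is a minimal user-space sketch, not kernel code. Everything prefixed with toy_ is a hypothetical stand-in invented for this note (toy_buffer for a buffer head, toy_page for a page, toy_get_block() for an allocator that runs out of space), and the redirty_on_error flag only exists to compare the old and new handling side by side.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BUFS_PER_PAGE 4	/* assumption: 4 buffers per page */

/* Toy stand-in for a buffer head; not the kernel's struct buffer_head. */
struct toy_buffer {
	bool mapped;	/* has an on-disk block assigned */
	bool dirty;	/* holds data that is not on disk yet */
	bool submitted;	/* was queued for write-out */
};

/* Toy stand-in for a page with attached buffers. */
struct toy_page {
	bool dirty;
	bool error;
	struct toy_buffer bufs[TOY_BUFS_PER_PAGE];
};

/* Hypothetical allocator: fails with -ENOSPC once free_blocks is used up. */
static int toy_get_block(struct toy_buffer *b, int *free_blocks)
{
	if (*free_blocks <= 0)
		return -ENOSPC;
	(*free_blocks)--;
	b->mapped = true;
	return 0;
}

/*
 * Model of writing out one page. With redirty_on_error set, unwritable
 * buffers keep their dirty bit and the page is redirtied; without it,
 * the dirty bits are dropped and the data is lost.
 */
static int toy_write_full_page(struct toy_page *p, int free_blocks,
			       bool redirty_on_error)
{
	size_t i;
	int err = 0;

	/* Writeback clears the page dirty bit before calling ->writepage(). */
	p->dirty = false;

	/* Pass 1: try to map every dirty buffer that has no block yet. */
	for (i = 0; i < TOY_BUFS_PER_PAGE && !err; i++) {
		struct toy_buffer *b = &p->bufs[i];

		if (b->dirty && !b->mapped)
			err = toy_get_block(b, &free_blocks);
	}

	if (err) {
		p->error = true;
		if (redirty_on_error)
			p->dirty = true;	/* writeback will retry later */
	}

	/* Pass 2: submit what is mapped; decide the fate of the rest. */
	for (i = 0; i < TOY_BUFS_PER_PAGE; i++) {
		struct toy_buffer *b = &p->bufs[i];

		if (b->mapped && b->dirty) {
			b->dirty = false;
			b->submitted = true;
		} else if (err && !redirty_on_error) {
			b->dirty = false;	/* old behaviour: data dropped */
		}
	}
	return err;
}
```

With four dirty, unmapped buffers and only two free blocks, both variants return -ENOSPC and submit the first two buffers. Only the redirtying variant leaves the page and the last two buffers dirty so that a later writeback attempt can still succeed.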
We also don't want to clear dirty bits from buffers above i_size because that
is generally the business of invalidatepage, where the filesystem might want to do
some additional work. If we clear the dirty bits already in block_write_full_page(),
memory reclaim can reap the page before invalidatepage is called on it,
thus confusing the filesystem.
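
As a worked example of which buffers count as "above i_size", the following user-space sketch mirrors the block arithmetic used in the write-out loops. The toy_ helpers and TOY_PAGE_SHIFT (4 KiB pages) are assumptions made for this note. The first-block formula follows the line added by the patch; the last_block formula, (i_size - 1) >> blkbits, is how I recall fs/buffer.c computing it and is not part of this diff. It requires i_size >= 1.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * First file block backed by page 'index', mirroring
 * (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits).
 */
static uint64_t toy_first_block(uint64_t index, unsigned int blkbits)
{
	return index << (TOY_PAGE_SHIFT - blkbits);
}

/* Last block that still contains file data; i_size must be >= 1. */
static uint64_t toy_last_block(uint64_t i_size, unsigned int blkbits)
{
	return (i_size - 1) >> blkbits;
}

/*
 * Count the buffers of page 'index' that lie past last_block. These are
 * the buffers whose dirty bits the write-out path should leave alone.
 */
static unsigned int toy_buffers_beyond_eof(uint64_t index, uint64_t i_size,
					   unsigned int blkbits)
{
	unsigned int per_page = 1u << (TOY_PAGE_SHIFT - blkbits);
	uint64_t block = toy_first_block(index, blkbits);
	uint64_t last = toy_last_block(i_size, blkbits);
	unsigned int i, beyond = 0;

	for (i = 0; i < per_page; i++, block++) {
		if (block > last)
			beyond++;
	}
	return beyond;
}
```

For a 1500-byte file with 1 KiB blocks (blkbits = 10), last_block is 1, so page 0 covers blocks 0 to 3 and its last two buffers are past EOF. In the patched loops such buffers are skipped rather than cleaned, which leaves them for invalidatepage to handle.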
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/buffer.c | 40 +++++++++++++++++-----------------------
1 files changed, 17 insertions(+), 23 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index 7eb1710..21a8cb9 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1662,19 +1662,14 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
* handle any aliases from the underlying blockdev's mapping.
*/
do {
- if (block > last_block) {
- /*
- * mapped buffers outside i_size will occur, because
- * this page can be outside i_size when there is a
- * truncate in progress.
- */
- /*
- * The buffer was zeroed by block_write_full_page()
- */
- clear_buffer_dirty(bh);
- set_buffer_uptodate(bh);
- } else if ((!buffer_mapped(bh) || buffer_delay(bh)) &&
- buffer_dirty(bh)) {
+ /*
+ * Mapped buffers outside i_size will occur, because
+ * this page can be outside i_size when there is a
+ * truncate in progress.
+ */
+ if (block <= last_block &&
+ (!buffer_mapped(bh) || buffer_delay(bh)) &&
+ buffer_dirty(bh)) {
WARN_ON(bh->b_size != blocksize);
err = get_block(inode, block, bh, 1);
if (err)
@@ -1692,9 +1687,10 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
block++;
} while (bh != head);
+ block = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
do {
- if (!buffer_mapped(bh))
- continue;
+ if (!buffer_mapped(bh) || block > last_block)
+ goto next;
/*
* If it's a fully non-blocking write attempt and we cannot
* lock the buffer then redirty the page. Note that this can
@@ -1706,13 +1702,15 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
lock_buffer(bh);
} else if (!trylock_buffer(bh)) {
redirty_page_for_writepage(wbc, page);
- continue;
+ goto next;
}
if (test_clear_buffer_dirty(bh)) {
mark_buffer_async_write_endio(bh, handler);
} else {
unlock_buffer(bh);
}
+next:
+ block++;
} while ((bh = bh->b_this_page) != head);
/*
@@ -1753,9 +1751,11 @@ recover:
/*
* ENOSPC, or some other error. We may already have added some
* blocks to the file, so we need to write these out to avoid
- * exposing stale data.
+ * exposing stale data. We redirty the page so that we don't
+ * lose data we are unable to write.
* The page is currently locked and not marked for writeback
*/
+ redirty_page_for_writepage(wbc, page);
bh = head;
/* Recovery: lock and submit the mapped buffers */
do {
@@ -1763,12 +1763,6 @@ recover:
!buffer_delay(bh)) {
lock_buffer(bh);
mark_buffer_async_write_endio(bh, handler);
- } else {
- /*
- * The buffer may have been set dirty during
- * attachment to a dirty page.
- */
- clear_buffer_dirty(bh);
}
} while ((bh = bh->b_this_page) != head);
SetPageError(page);
--
1.6.0.2