From: npiggin@suse.de
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: [patch 5/8] mm: write_cache_pages integrity fix
Date: Fri, 10 Oct 2008 02:50:44 +1100
Message-ID: <20081009174822.621353840@suse.de>
In-Reply-To: 20081009155039.139856823@suse.de
[-- Attachment #1: mm-wcp-integrity-fix.patch --]
[-- Type: text/plain, Size: 2811 bytes --]
In write_cache_pages, nr_to_write is heeded even for data-integrity syncs, so
the function will return success after writing out nr_to_write pages, even if
that was not sufficient to guarantee data integrity.
The callers tend to set it to values that could easily break data integrity
semantics in practice. For example, nr_to_write can be set to
mapping->nrpages * 2. However, if a file has a single dirty page and fsync is
called, subsequent pages might be concurrently added and dirtied, and
write_cache_pages might then write out two of these newly dirty pages while
not writing out the old page that should have been written out.
Fix this by ignoring nr_to_write if it is a data integrity sync.
This is a data integrity bug.
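The scenario above can be replayed with a small userspace model. This is only
a sketch, not the kernel code: every toy_* name, the 8-page file and the page
indices are made up for illustration, and the "concurrent" dirtier is
modelled as simply dirtying two lower-index pages before the scan starts.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_NPAGES 8	/* size of the toy file, in pages (arbitrary) */

/* Toy stand-ins for WB_SYNC_NONE / WB_SYNC_ALL; not the kernel definitions. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

/*
 * Hypothetical model of the write_cache_pages() loop: visit pages in index
 * order, clean each dirty one, and charge it against nr_to_write.
 * With prepatch == true the budget is enforced in every sync mode; with
 * prepatch == false it is enforced only for TOY_SYNC_NONE, as in the patch.
 * Returns the number of pages "written".
 */
static long toy_write_cache_pages(bool dirty[TOY_NPAGES],
				  enum toy_sync_mode mode,
				  long nr_to_write, bool prepatch)
{
	long written = 0;

	assert(nr_to_write > 0);
	for (int i = 0; i < TOY_NPAGES; i++) {
		if (!dirty[i])
			continue;
		dirty[i] = false;	/* pretend the page reached disk */
		written++;
		if (prepatch || mode == TOY_SYNC_NONE) {
			if (--nr_to_write <= 0)
				break;
		}
	}
	return written;
}

/*
 * Replay the changelog scenario: one cached, dirty page (index 5); the
 * pre-patch caller picks nr_to_write = nrpages * 2 = 2; then two pages at
 * lower indices are dirtied before the scan starts. Returns true if the
 * old page is still dirty after the integrity sync "succeeds".
 */
static bool toy_old_page_left_dirty(bool prepatch)
{
	bool dirty[TOY_NPAGES] = { false };
	long nrpages = 1;
	long nr_to_write;

	dirty[5] = true;		/* the page fsync must write */
	nr_to_write = prepatch ? nrpages * 2 : LONG_MAX;
	dirty[0] = dirty[1] = true;	/* the dirtier gets in first */

	toy_write_cache_pages(dirty, TOY_SYNC_ALL, nr_to_write, prepatch);
	return dirty[5];
}
```

In this model toy_old_page_left_dirty(true) returns true (the budget of 2 is
spent on pages 0 and 1, so page 5 is never reached), while
toy_old_page_left_dirty(false) returns false.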
Signed-off-by: Nick Piggin <npiggin@suse.de>
---
The reason this has been done in the past is to avoid stalling sync operations
behind page dirtiers.
"If a file has one dirty page at offset 1000000000000000 then someone
does an fsync() and someone else gets in first and starts madly writing
pages at offset 0, we want to write that page at 1000000000000000.
Somehow."
What we do today is return success after an arbitrary number of pages are
written, whether or not we have provided the data-integrity semantics that
the caller has asked for. Even this doesn't actually fix all stall cases
completely: in the above situation, if the file has a huge number of pages
in pagecache (but not dirty), then mapping->nrpages is going to be huge,
even if pages are being dirtied.
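To put a rough number on that: since the old budget is derived from every
cached page, clean or dirty, the bound it gives grows with the size of the
cache. A minimal sketch of the arithmetic follows, with hypothetical toy_*
helpers and under the assumption that a dirtier always stays ahead of the
scan, so the whole budget is spent on its pages.

```c
#include <assert.h>

/*
 * Pre-patch write budget, mirroring the removed
 * ".nr_to_write = mapping->nrpages * 2" line. nrpages counts every cached
 * page of the file, clean or dirty.
 */
static long toy_old_budget(unsigned long nrpages)
{
	assert(nrpages <= 1000000000UL);	/* keep the toy free of overflow */
	return (long)(nrpages * 2);
}

/*
 * Hypothetical worst case: how many pages of someone else's data the
 * pre-patch loop may write before giving up, assuming the dirtier always
 * stays ahead of the scan. That is simply the whole budget.
 */
static long toy_worst_case_foreign_writes(unsigned long clean_cached,
					  unsigned long dirty_wanted)
{
	return toy_old_budget(clean_cached + dirty_wanted);
}
```

With 1,000,000 clean cached pages and one dirty page this gives 2,000,002
page writes before the sync returns; with no clean pages it gives 2. The
old limit therefore bounds the stall only loosely on large cached files.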
This change does indeed make the possibility of long stalls larger, and that's
not a good thing, but lying about data integrity is even worse. We have to
either perform the sync, or return -ELINUXISLAME so at least the caller knows
what has happened.
There are subsequent competing approaches in the works to solve the stall
problems properly, without compromising data integrity.
Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -951,8 +951,10 @@ again:
 				done = 1;
 				break;
 			}
-			if (--(wbc->nr_to_write) <= 0)
-				done = 1;
+			if (wbc->sync_mode == WB_SYNC_NONE) {
+				if (--(wbc->nr_to_write) <= 0)
+					done = 1;
+			}
 			if (wbc->nonblocking && bdi_write_congested(bdi)) {
 				wbc->encountered_congestion = 1;
 				done = 1;
Index: linux-2.6/mm/filemap.c
===================================================================
--- linux-2.6.orig/mm/filemap.c
+++ linux-2.6/mm/filemap.c
@@ -209,7 +209,7 @@ int __filemap_fdatawrite_range(struct ad
 	int ret;
 	struct writeback_control wbc = {
 		.sync_mode = sync_mode,
-		.nr_to_write = mapping->nrpages * 2,
+		.nr_to_write = LONG_MAX,
 		.range_start = start,
 		.range_end = end,
 	};
--
Thread overview: 36+ messages
2008-10-09 15:50 [patch 0/8] write_cache_pages fixes npiggin
2008-10-09 15:50 ` [patch 1/8] mm: write_cache_pages cyclic fix npiggin
2008-10-09 15:50 ` [patch 2/8] mm: write_cache_pages AOP_WRITEPAGE_ACTIVATE fix npiggin
2008-10-10 16:00 ` Miklos Szeredi
2008-10-10 18:29 ` Hugh Dickins
2008-10-11 4:05 ` Nick Piggin
2008-10-09 15:50 ` [patch 3/8] mm: write_cache_pages writepage error fix npiggin
2008-10-09 15:50 ` [patch 4/8] mm: write_cache_pages type overflow fix npiggin
2008-10-09 8:23 ` Christoph Hellwig
2008-10-09 8:33 ` Nick Piggin
2008-10-10 13:10 ` Theodore Tso
2008-10-10 13:13 ` Christoph Hellwig
2008-10-10 13:37 ` Theodore Tso
2008-10-10 13:48 ` Steven Whitehouse
2008-10-10 14:05 ` Theodore Tso
2008-10-10 14:08 ` Christoph Hellwig
2008-10-10 15:54 ` Aneesh Kumar K.V
2008-10-10 15:59 ` Chris Mason
2008-10-10 16:10 ` Theodore Tso
2008-10-10 16:34 ` Christoph Hellwig
2008-10-10 13:56 ` Chris Mason
2008-10-09 15:50 ` npiggin [this message]
2008-10-09 12:52 ` [patch 5/8] mm: write_cache_pages integrity fix Chris Mason
2008-10-09 13:27 ` Nick Piggin
2008-10-09 13:35 ` Chris Mason
2008-10-09 13:55 ` Nick Piggin
2008-10-09 14:12 ` Chris Mason
2008-10-09 14:21 ` Nick Piggin
2008-10-09 14:39 ` Chris Mason
2008-10-09 14:50 ` Nick Piggin
2008-10-09 15:16 ` Chris Mason
2008-10-10 2:40 ` Nick Piggin
2008-10-09 15:50 ` [patch 6/8] mm: write_cache_pages cleanups npiggin
2008-10-09 14:37 ` Artem Bityutskiy
2008-10-09 15:50 ` [patch 7/8] mm: write_cache_pages optimise page cleaning npiggin
2008-10-09 15:50 ` [patch 8/8] mm: write_cache_pages terminate quickly npiggin