From: Eric Sandeen <sandeen@redhat.com>
To: linux-kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michael Halcrow <mhalcrow@us.ibm.com>,
mike@halcrow.us
Subject: [PATCH] ecryptfs: fix fsx data corruption problems
Date: Sun, 16 Dec 2007 23:32:33 -0600 [thread overview]
Message-ID: <476609F1.6040408@redhat.com> (raw)
ecryptfs in 2.6.24-rc3 wasn't surviving fsx for me at all,
dying after 4 ops. Generally, encountering problems with stale
data and improperly zeroed pages. An extending truncate + write
for example would expose stale data.
With the changes below I got to a million ops and beyond with all
mmap ops disabled - mmap still needs work. (A version of this
patch on a RHEL5 kernel ran for over 110 million fsx ops)
I added a few comments as well, to the best of my understanding
as I read through the code.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
Index: linux-2.6.24-rc3/fs/ecryptfs/mmap.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/ecryptfs/mmap.c
+++ linux-2.6.24-rc3/fs/ecryptfs/mmap.c
@@ -263,14 +263,13 @@ out:
return 0;
}
+/* This function must zero any hole we create */
static int ecryptfs_prepare_write(struct file *file, struct page *page,
unsigned from, unsigned to)
{
int rc = 0;
+ loff_t prev_page_end_size;
- if (from == 0 && to == PAGE_CACHE_SIZE)
- goto out; /* If we are writing a full page, it will be
- up to date. */
if (!PageUptodate(page)) {
rc = ecryptfs_read_lower_page_segment(page, page->index, 0,
PAGE_CACHE_SIZE,
@@ -283,22 +282,32 @@ static int ecryptfs_prepare_write(struct
} else
SetPageUptodate(page);
}
- if (page->index != 0) {
- loff_t end_of_prev_pg_pos =
- (((loff_t)page->index << PAGE_CACHE_SHIFT) - 1);
- if (end_of_prev_pg_pos > i_size_read(page->mapping->host)) {
+ prev_page_end_size = ((loff_t)page->index << PAGE_CACHE_SHIFT);
+
+ /*
+ * If creating a page or more of holes, zero them out via truncate.
+ * Note, this will increase i_size.
+ */
+ if (page->index != 0) {
+ if (prev_page_end_size > i_size_read(page->mapping->host)) {
rc = ecryptfs_truncate(file->f_path.dentry,
- end_of_prev_pg_pos);
+ prev_page_end_size);
if (rc) {
printk(KERN_ERR "Error on attempt to "
"truncate to (higher) offset [%lld];"
- " rc = [%d]\n", end_of_prev_pg_pos, rc);
+ " rc = [%d]\n", prev_page_end_size, rc);
goto out;
}
}
- if (end_of_prev_pg_pos + 1 > i_size_read(page->mapping->host))
- zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+ }
+ /*
+ * Writing to a new page, and creating a small hole from start of page?
+ * Zero it out.
+ */
+ if ((i_size_read(page->mapping->host) == prev_page_end_size) &&
+ (from != 0)) {
+ zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
}
out:
return rc;
Index: linux-2.6.24-rc3/fs/ecryptfs/read_write.c
===================================================================
--- linux-2.6.24-rc3.orig/fs/ecryptfs/read_write.c
+++ linux-2.6.24-rc3/fs/ecryptfs/read_write.c
@@ -124,6 +124,10 @@ int ecryptfs_write(struct file *ecryptfs
loff_t pos;
int rc = 0;
+ /*
+ * if we are writing beyond current size, then start pos
+ * at the current size - we'll fill in zeros from there.
+ */
if (offset > ecryptfs_file_size)
pos = ecryptfs_file_size;
else
@@ -137,6 +141,7 @@ int ecryptfs_write(struct file *ecryptfs
if (num_bytes > total_remaining_bytes)
num_bytes = total_remaining_bytes;
if (pos < offset) {
+ /* remaining zeros to write, up to destination offset */
size_t total_remaining_zeros = (offset - pos);
if (num_bytes > total_remaining_zeros)
@@ -167,17 +172,27 @@ int ecryptfs_write(struct file *ecryptfs
}
}
ecryptfs_page_virt = kmap_atomic(ecryptfs_page, KM_USER0);
+
+ /*
+ * pos: where we're now writing, offset: where the request was
+ * If current pos is before request, we are filling zeros
+ * If we are at or beyond request, we are writing the *data*
+ * If we're in a fresh page beyond eof, zero it in either case
+ */
+ if (pos < offset || !start_offset_in_page) {
+ /* We are extending past the previous end of the file.
+ * Fill in zero values to the end of the page */
+ memset(((char *)ecryptfs_page_virt
+ + start_offset_in_page), 0,
+ PAGE_CACHE_SIZE - start_offset_in_page);
+ }
+
+ /* pos >= offset, we are now writing the data request */
if (pos >= offset) {
memcpy(((char *)ecryptfs_page_virt
+ start_offset_in_page),
(data + data_offset), num_bytes);
data_offset += num_bytes;
- } else {
- /* We are extending past the previous end of the file.
- * Fill in zero values up to the start of where we
- * will be writing data. */
- memset(((char *)ecryptfs_page_virt
- + start_offset_in_page), 0, num_bytes);
}
kunmap_atomic(ecryptfs_page_virt, KM_USER0);
flush_dcache_page(ecryptfs_page);
next reply other threads:[~2007-12-17 5:32 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-12-17 5:32 Eric Sandeen [this message]
2007-12-17 15:58 ` [PATCH] ecryptfs: fix fsx data corruption problems Michael Halcrow
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=476609F1.6040408@redhat.com \
--to=sandeen@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhalcrow@us.ibm.com \
--cc=mike@halcrow.us \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox