linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nick Piggin <npiggin@suse.de>
To: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Linux Filesystems <linux-fsdevel@vger.kernel.org>,
	Steven Whitehouse <swhiteho@redhat.com>
Subject: Re: Announce: new-aops-1 for 2.6.21-rc3
Date: Sun, 1 Apr 2007 07:42:18 +0200	[thread overview]
Message-ID: <20070401054218.GA6896@wotan.suse.de> (raw)
In-Reply-To: <20070315234713.GH21942@ca-server1.us.oracle.com>

Hi,

Progress is coming along: as well as the several fixes Mark has posted, we've
got a GFS2 patch from Steve as well, which also enables us to get rid of the
AOPS_TRUNCATED_PAGE handling in the legacy ->prepare_write paths. And I've
been auditing code and doing stress and code coverage tests for various
error paths...

I was hoping to have a cont_prepare_write filesystem converted before doing
a next patchset, but there are enough fixes and new stuff now that I'll
probably do one tomorrow.

Anyway my recent testing involved things like poisioning the memory of new
buffers, and injecting failures into various places (eg. the atomic
usercopies), then running fsx-linux and stuff on top of ext2 with various
block sizes. This found one subtle problem (the first, below), and
inspection found a couple of others.

With these fixes, the stress tests now survive as long as I've cared to wait.
Any comments always appreciated.

Thanks,
Nick

--

- The first hunk fixes a case where new buffers that were only partially
  cleared (because the write partially overlaps a buffer, so that region is
not cleared), and then the write fails before filling up this portion of the
buffer. Because we have already cleared buffer_new, page_zero_new_buffers
does not clear it properly. This results in uninitialised data going onto
disk (and in pagecache).  The solution is to only clear_buffer_new when we
know everything has been initialised, which makes sense.

- The second hunk fixes a thinko.

- The last hunk fixes a problem where we turn any short write to a !uptodate
page into a zero-length write, for the reason explained in comments. However
it was not taking this contraction into account in page_zero_new_buffers and
__block_commit_write, which is inconsistent.

Index: linux-2.6/fs/buffer.c
===================================================================
--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -1852,17 +1852,8 @@ static int __block_prepare_write(struct 
 		if (!buffer_uptodate(*wait_bh))
 			err = -EIO;
 	}
-	if (!err) {
-		bh = head;
-		do {
-			if (buffer_new(bh))
-				clear_buffer_new(bh);
-		} while ((bh = bh->b_this_page) != head);
-		return 0;
-	}
-
-	/* Error case: */
-	page_zero_new_buffers(page, from, to);
+	if (unlikely(err))
+		page_zero_new_buffers(page, from, to);
 	return err;
 }
 
@@ -1895,7 +1886,7 @@ void page_zero_new_buffers(struct page *
 					end = min(to, block_end);
 
 					kaddr = kmap_atomic(page, KM_USER0);
-					memset(kaddr+start, 0, block_end-end);
+					memset(kaddr+start, 0, end - start);
 					flush_dcache_page(page);
 					kunmap_atomic(kaddr, KM_USER0);
 					set_buffer_uptodate(bh);
@@ -2015,30 +2006,29 @@ int block_write_end(struct file *file, s
 
 	start = pos & (PAGE_CACHE_SIZE - 1);
 
-	if (unlikely(copied < len))
+	if (unlikely(copied < len)) {
+		/*
+		 * The buffers that were written will now be uptodate, so we
+		 * don't have to worry about a readpage reading them and
+		 * overwriting a partial write. However if we have encountered
+		 * a short write and only partially written into a buffer, it
+		 * will not be marked uptodate, so a readpage might come in and
+		 * destroy our partial write.
+		 *
+		 * Do the simplest thing, and just treat any short write to a
+		 * non uptodate page as a zero-length write, and force the
+		 * caller to redo the whole thing.
+		 */
+		if (!PageUptodate(page))
+			copied = 0;
+
 		page_zero_new_buffers(page, start+copied, start+len);
+	}
 	flush_dcache_page(page);
 
 	/* This could be a short (even 0-length) commit */
 	__block_commit_write(inode, page, start, start+copied);
 
-	/*
-	 * The buffers that were written will now be uptodate, so we don't
-	 * have to worry about a readpage reading them and overwriting
-	 * a partial write. However if we have encountered a short write
-	 * and only partially written into a buffer, it will not be marked
-	 * uptodate, so a readpage might come in and destroy our partial
-	 * write.
-	 *
-	 * Do the simplest thing, and just treat any short write to a non
-	 * uptodate page as a zero-length write, and force the caller to
-	 * redo the whole thing.
-	 */
-	if (unlikely(copied < len)) {
-		if (!PageUptodate(page))
-			copied = 0;
-	}
-
 	unlock_page(page);
 	mark_page_accessed(page); /* XXX: put this in caller? */
 	page_cache_release(page);

  parent reply	other threads:[~2007-04-01  5:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-15 16:17 Announce: new-aops-1 for 2.6.21-rc3 Nick Piggin
2007-03-15 19:32 ` Joel Becker
2007-03-15 19:57   ` Nick Piggin
2007-03-15 19:53 ` Mark Fasheh
2007-03-15 19:57   ` Nick Piggin
2007-03-15 21:08 ` Mark Fasheh
2007-03-15 23:47 ` Mark Fasheh
2007-03-20  5:36   ` Nick Piggin
2007-04-01  5:42   ` Nick Piggin [this message]
2007-04-02  2:14     ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070401054218.GA6896@wotan.suse.de \
    --to=npiggin@suse.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=mark.fasheh@oracle.com \
    --cc=swhiteho@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).