linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
To: linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Andreas Dilger <adilger@sun.com>, "Theodore Ts'o" <tytso@mit.edu>,
	Jan Kara <jack@suse.cz>
Cc: Nick Piggin <npiggin@suse.de>,
	dle-develop@lists.sourceforge.net,
	Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>
Subject: [PATCH] ext3: prevent reread after write IO error v2
Date: Thu, 14 Jan 2010 19:14:30 +0900	[thread overview]
Message-ID: <4B4EEE86.7080807@hitachi.com> (raw)
In-Reply-To: <4B4EDE5C.8040600@hitachi.com>

This patch fixes the similar bug fixed by commit 95450f5a.

If a directory is modified, its data block is journaled as metadata
and finally written back to the right place.  Now, we assume a
transient write erorr happens on that writeback.  Uptodate flag of
the buffer is cleared by write error, so next access on the buffer
causes a reread from disk.  This means it breaks the filesystems
consistency.

To prevent old directory data from being reread, this patch set
uptodate flag again in the case of after write error before issuing
the read operation.  The write error on the directory's data block
is detected at the time of journal checkpointing or discarded if a
rewrite by another modification succeeds, so no problem.

Similarly, this kind of consistency breakage can be caused by
a transient write error on a bitmap block.

I tested this patch by using fault injection approach.

By the way, I think the right fix is to keep uptodate flag on write
error, but it gives a big impact.  We have to confirm whether
over 200 buffer_uptodate's are used for real uptodate check or write
error check.  For now, I adopt the quick-fix solution.

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
---
 fs/ext3/balloc.c |   12 ++++++++++++
 fs/ext3/inode.c  |   13 +++++++++++++
 fs/ext3/namei.c  |   15 ++++++++++++++-
 3 files changed, 39 insertions(+), 1 deletions(-)

diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index 27967f9..5dc5ccf 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -156,6 +156,18 @@ read_block_bitmap(struct super_block *sb, unsigned int block_group)
 	if (likely(bh_uptodate_or_lock(bh)))
 		return bh;
 
+	/*
+	 * uptodate flag may have been cleared by a previous (transient)
+	 * write IO error.  In this case, we don't want to reread its
+	 * old on-disk data.  Actually the buffer has the latest data,
+	 * so set uptodate flag again.
+	 */
+	if (buffer_write_io_error(bh)) {
+		set_buffer_uptodate(bh);
+		unlock_buffer(bh);
+		return bh;
+	}
+
 	if (bh_submit_read(bh) < 0) {
 		brelse(bh);
 		ext3_error(sb, __func__,
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 455e6e6..67d7849 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1077,10 +1077,23 @@ struct buffer_head *ext3_bread(handle_t *handle, struct inode *inode,
 		return bh;
 	if (buffer_uptodate(bh))
 		return bh;
+
+	/*
+	 * uptodate flag may have been cleared by a previous (transient)
+	 * write IO error.  In this case, we don't want to reread its
+	 * old on-disk data.  Actually the buffer has the latest data,
+	 * so set uptodate flag again.
+	 */
+	if (buffer_write_io_error(bh)) {
+		set_buffer_uptodate(bh);
+		return bh;
+	}
+
 	ll_rw_block(READ_META, 1, &bh);
 	wait_on_buffer(bh);
 	if (buffer_uptodate(bh))
 		return bh;
+
 	put_bh(bh);
 	*err = -EIO;
 	return NULL;
diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index 7b0e44f..7ed8e45 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -909,7 +909,20 @@ restart:
 				num++;
 				bh = ext3_getblk(NULL, dir, b++, 0, &err);
 				bh_use[ra_max] = bh;
-				if (bh)
+				if (!bh || buffer_uptodate(bh))
+					continue;
+
+				/*
+				 * uptodate flag may have been cleared by a
+				 * previous (transient) write IO error.  In
+				 * this case, we don't want to reread its
+				 * old on-disk data.  Actually the buffer
+				 * has the latest data, so set uptodate flag
+				 * again.
+				 */
+				if (buffer_write_io_error(bh))
+					set_buffer_uptodate(bh);
+				else
 					ll_rw_block(READ_META, 1, &bh);
 			}
 		}
-- 
1.6.0.3



-- 
Hidehiro Kawai
Hitachi, Systems Development Laboratory
Linux Technology Center


  reply	other threads:[~2010-01-14 10:14 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-14  6:12 [PATCH] ext3: prevent reread after write IO error Hidehiro Kawai
2010-01-14  9:05 ` Hidehiro Kawai
2010-01-14 10:14   ` Hidehiro Kawai [this message]
2010-01-14 14:18     ` [PATCH] ext3: prevent reread after write IO error v2 Jan Kara
2010-01-15 10:38       ` Hidehiro Kawai
2010-01-18  5:18       ` Nick Piggin
2010-01-18  6:05         ` IO error semantics Nick Piggin
2010-01-18 12:24           ` Dave Chinner
2010-01-18 14:00             ` Nick Piggin
2010-01-18 22:51               ` Dave Chinner
2010-01-18 23:33               ` Anton Altaparmakov
2010-01-25 15:23                 ` Ric Wheeler
2010-01-25 16:15                   ` Greg Freemyer
2010-01-25 17:47                   ` tytso
2010-01-25 17:50                     ` Ric Wheeler
2010-01-25 17:59                       ` Nick Piggin
     [not found]                     ` <20100125175529.GB2018@laptop>
2010-01-26  6:19                       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B4EEE86.7080807@hitachi.com \
    --to=hidehiro.kawai.ez@hitachi.com \
    --cc=adilger@sun.com \
    --cc=akpm@linux-foundation.org \
    --cc=dle-develop@lists.sourceforge.net \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=satoshi.oshima.fk@hitachi.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).