Re: [PATCH 4/5] jbd: fix error handling for checkpoint io

From: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
To: Jan Kara <jack@suse.cz>
Cc: akpm@linux-foundation.org, sct@redhat.com, adilger@clusterfs.com,
	linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org,
	jbacik@redhat.com, cmm@us.ibm.com, tytso@mit.edu,
	sugita <yumiko.sugita.yf@hitachi.com>,
	Satoshi OSHIMA <satoshi.oshima.fk@hitachi.com>
Subject: Re: [PATCH 4/5] jbd: fix error handling for checkpoint io
Date: Mon, 23 Jun 2008 20:14:54 +0900
Message-ID: <485F85AE.1010704@hitachi.com>
In-Reply-To: <20080603080219.GA17936@duck.suse.cz>

Hi,

I noticed a problem with this patch.  Please see below.

Jan Kara wrote:

> On Tue 03-06-08 13:40:25, Hidehiro Kawai wrote:
> 
>>Subject: [PATCH 4/5] jbd: fix error handling for checkpoint io
>>
>>When a checkpointing IO fails, current JBD code doesn't check the
>>error and continue journaling.  This means latest metadata can be
>>lost from both the journal and filesystem.
>>
>>This patch leaves the failed metadata blocks in the journal space
>>and aborts journaling in the case of log_do_checkpoint().
>>To achieve this, we need to do:
>>
>>1. don't remove the failed buffer from the checkpoint list where in
>>   the case of __try_to_free_cp_buf() because it may be released or
>>   overwritten by a later transaction
>>2. log_do_checkpoint() is the last chance, remove the failed buffer
>>   from the checkpoint list and abort the journal
>>3. when checkpointing fails, don't update the journal super block to
>>   prevent the journaled contents from being cleaned.  For safety,
>>   don't update j_tail and j_tail_sequence either

Item 3 is implemented as follows:
  (1) if log_do_checkpoint() detects an I/O error during
      checkpointing, it calls journal_abort() to abort the journal
  (2) if the journal has aborted, don't update s_start and s_sequence
      in the on-disk journal superblock

So, if the journal aborts, journaled data will be replayed on the
next mount.
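
To make the two steps above concrete, here is a rough toy model in C.
It is only a sketch: the struct and function names (toy_journal,
toy_checkpoint, and so on) are hypothetical and are not the JBD code;
only s_start and s_sequence echo the on-disk field names mentioned
above.  It assumes that a non-zero s_start is what makes the next
mount replay the journal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the on-disk journal superblock. */
struct toy_sb {
	unsigned int s_start;     /* first log block; 0 = nothing to replay */
	unsigned int s_sequence;  /* first transaction ID expected in the log */
};

/* Hypothetical stand-in for the in-memory journal state. */
struct toy_journal {
	bool aborted;
	unsigned int tail_block;     /* value we would store in s_start */
	unsigned int tail_sequence;  /* value we would store in s_sequence */
	struct toy_sb sb;            /* what is currently on disk */
};

/* Step (1): an I/O error during checkpointing aborts the journal. */
static void toy_checkpoint(struct toy_journal *j, bool io_error)
{
	assert(j != NULL);
	if (io_error)
		j->aborted = true;
}

/* Step (2): once aborted, leave s_start and s_sequence untouched.
 * Returns true if the on-disk superblock was updated. */
static bool toy_update_superblock(struct toy_journal *j)
{
	assert(j != NULL);
	if (j->aborted)
		return false;
	j->sb.s_start = j->tail_block;
	j->sb.s_sequence = j->tail_sequence;
	return true;
}

/* Assumption of this sketch: the next mount replays iff s_start != 0. */
static bool toy_needs_replay(const struct toy_journal *j)
{
	assert(j != NULL);
	return j->sb.s_start != 0;
}
```

In this model, a failed checkpoint followed by a superblock update
leaves s_start as it was, so toy_needs_replay() still reports that the
log must be replayed.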

Now, please remember that if the journal has aborted, some dirty
metadata buffers are written back to the filesystem without being
journaled.  If all of the dirty metadata buffers reach the disk, the
integrity of the filesystem is kept.  However, replaying the journaled
data on the next mount can then overwrite some of the latest on-disk
metadata blocks with old data, which would break the filesystem.

My idea for resolving this problem is to not write out metadata
buffers that belong to uncommitted transactions once the journal has
aborted.  Although the latest filesystem updates will be lost,
we can ensure integrity.  This would also help when the kernel
panics in the middle of writing metadata buffers without journaling
(which would occur in the `mount -o errors=panic' case).
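
The trade-off can be illustrated with a small toy simulation in C.
This is a sketch under stated assumptions, not the JBD code: all names
are hypothetical, the "filesystem" is two integer blocks, and it is
called consistent only if the disk matches one of three snapshots
(initial state, after committed transaction T1, after uncommitted
transaction T2).  T1 touched block 0 only, its checkpoint write
failed, and its copy of block 0 is still in the journal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NBLOCKS 2
#define TOY_NO_COPY (-1)   /* block has no copy in the journal */

/* The only valid on-disk states: initial, after T1, after T2. */
static const int toy_snapshots[3][TOY_NBLOCKS] = {
	{ 0, 0 }, { 1, 0 }, { 2, 2 },
};

struct toy_fs {
	int disk[TOY_NBLOCKS];     /* on-disk metadata blocks */
	int journal[TOY_NBLOCKS];  /* committed copies still in the log */
	bool aborted;              /* journal has aborted */
};

/* Write an uncommitted buffer straight to disk, unless the policy says
 * to drop such writes once the journal has aborted. */
static void toy_writeback(struct toy_fs *fs, int blk, int value,
			  bool skip_when_aborted)
{
	assert(blk >= 0 && blk < TOY_NBLOCKS);
	if (fs->aborted && skip_when_aborted)
		return;
	fs->disk[blk] = value;
}

/* Next mount: copy every journaled block over the on-disk block. */
static void toy_replay(struct toy_fs *fs)
{
	for (int i = 0; i < TOY_NBLOCKS; i++)
		if (fs->journal[i] != TOY_NO_COPY)
			fs->disk[i] = fs->journal[i];
}

/* Consistent means the disk equals one of the valid snapshots. */
static bool toy_consistent(const struct toy_fs *fs)
{
	for (size_t s = 0; s < 3; s++)
		if (memcmp(fs->disk, toy_snapshots[s], sizeof(fs->disk)) == 0)
			return true;
	return false;
}

/* Run the scenario and report whether the disk is consistent after
 * the replay on the next mount. */
static bool toy_scenario(bool skip_when_aborted)
{
	struct toy_fs fs = {
		.disk = { 0, 0 },               /* T1's checkpoint write failed */
		.journal = { 1, TOY_NO_COPY },  /* T1's block 0 is still logged */
		.aborted = true,
	};

	/* T2 (uncommitted) dirtied both blocks. */
	toy_writeback(&fs, 0, 2, skip_when_aborted);
	toy_writeback(&fs, 1, 2, skip_when_aborted);
	toy_replay(&fs);
	return toy_consistent(&fs);
}
```

With toy_scenario(false) the disk ends up as { 1, 2 }, a mix of old
and new blocks that matches no snapshot.  With toy_scenario(true) the
T2 updates are lost, but the replay leaves { 1, 0 }, the state after
T1.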

Which should we choose: integrity or the latest state?

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
---
 fs/jbd/commit.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Index: linux-2.6.26-rc5-mm3/fs/jbd/commit.c
===================================================================
--- linux-2.6.26-rc5-mm3.orig/fs/jbd/commit.c
+++ linux-2.6.26-rc5-mm3/fs/jbd/commit.c
@@ -486,9 +486,10 @@ void journal_commit_transaction(journal_
 		jh = commit_transaction->t_buffers;
 
 		/* If we're in abort mode, we just un-journal the buffer and
-		   release it for background writing. */
+		   release it. */
 
 		if (is_journal_aborted(journal)) {
+			clear_buffer_jbddirty(jh2bh(jh));
 			JBUFFER_TRACE(jh, "journal is aborting: refile");
 			journal_refile_buffer(journal, jh);
 			/* If that was the last one, we need to clean up
@@ -823,6 +824,8 @@ restart_loop:
 		if (buffer_jbddirty(bh)) {
 			JBUFFER_TRACE(jh, "add to new checkpointing trans");
 			__journal_insert_checkpoint(jh, commit_transaction);
+			if (is_journal_aborted(journal))
+				clear_buffer_jbddirty(bh);
 			JBUFFER_TRACE(jh, "refile for checkpoint writeback");
 			__journal_refile_buffer(jh);
 			jbd_unlock_bh_state(bh);
