jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch

linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch
@ 2008-09-22 21:08 Andrew Morton
  2008-09-23 16:56 ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Duane Griffin
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2008-09-22 21:08 UTC (permalink / raw)
  To: Theodore Ts'o, Stephen C. Tweedie; +Cc: linux-ext4


Guys, I have a note here that this might be needed in 2.6.27.

I also have a note that Stephen had issues with it, but I
don't recall what they were.

Can we get this sorted out please?



From: "Duane Griffin" <duaneg@dghda.com>

The __jbd2_log_wait_for_space function sits in a loop checkpointing
transactions until there is sufficient space free in the journal. 
However, if there are no transactions to be processed (e.g.  because the
free space calculation is wrong due to a corrupted filesystem) it will
never progress.

Check for space being required when no transactions are outstanding and
abort the journal instead of endlessly looping.

This patch fixes the bug reported by Sami Liedes at:
http://bugzilla.kernel.org/show_bug.cgi?id=10976

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: Sami Liedes <sliedes@cc.hut.fi>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/jbd2/checkpoint.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff -puN fs/jbd2/checkpoint.c~jbd2-abort-instead-of-waiting-for-nonexistent-transactions fs/jbd2/checkpoint.c
--- a/fs/jbd2/checkpoint.c~jbd2-abort-instead-of-waiting-for-nonexistent-transactions
+++ a/fs/jbd2/checkpoint.c
@@ -126,14 +126,29 @@ void __jbd2_log_wait_for_space(journal_t
 
 		/*
 		 * Test again, another process may have checkpointed while we
-		 * were waiting for the checkpoint lock
+		 * were waiting for the checkpoint lock. If there are no
+		 * outstanding transactions there is nothing to checkpoint and
+		 * we can't make progress. Abort the journal in this case.
 		 */
 		spin_lock(&journal->j_state_lock);
+		spin_lock(&journal->j_list_lock);
 		nblocks = jbd_space_needed(journal);
 		if (__jbd2_log_space_left(journal) < nblocks) {
+			int chkpt = journal->j_checkpoint_transactions != NULL;
+
+			spin_unlock(&journal->j_list_lock);
 			spin_unlock(&journal->j_state_lock);
-			jbd2_log_do_checkpoint(journal);
+			if (chkpt) {
+				jbd2_log_do_checkpoint(journal);
+			} else {
+				printk(KERN_ERR "%s: no transactions\n",
+				       __func__);
+				jbd2_journal_abort(journal, 0);
+			}
+
 			spin_lock(&journal->j_state_lock);
+		} else {
+			spin_unlock(&journal->j_list_lock);
 		}
 		mutex_unlock(&journal->j_checkpoint_mutex);
 	}
_


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch
  2008-09-22 21:08 jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Andrew Morton
@ 2008-09-23 16:56 ` Duane Griffin
  2008-09-29  2:24   ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Theodore Tso
  0 siblings, 1 reply; 4+ messages in thread
From: Duane Griffin @ 2008-09-23 16:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Ts'o, Stephen C. Tweedie, linux-ext4

2008/9/22 Andrew Morton <akpm@linux-foundation.org>:
> Guys, I have a note here that this might be needed in 2.6.27.
>
> I also have a note that Stephen had issues with it, but I
> don't recall what they were.

Stephen suggested that it would be better to sanity check the journal
start/end pointers on mount, rather than catching the error later like
this. I never quite convinced myself I'd worked out the right way to
do that, sorry. Perhaps someone would like to confirm (or otherwise)
whether or not the following is correct:

In journal_reset (?) check that:

journal->j_first == 1 (this seems to be the only valid value)

and

journal->j_last >= JFS_MIN_JOURNAL_BLOCKS

Additionally, it should be possible to check the journal->j_last more
precisely. For internal journals it seems straight-forward, we can
just check that journal->j_last == inode->i_size >>
inode->i_sb->s_blocksize_bits. For external journals we'd need to load
the device's superblock and check journal->j_last == s_blocks_count.

> Can we get this sorted out please?

If the above is confirmed I'll send a patch to that effect for jdb,
jdb2 and for e2fsprogs (fsck doesn't check j_first/j_last either).

Regardless, I think the original patch may be a good idea. It improves
robustness and matches the other locations where we call
jbd2_log_do_checkpoint. They are all in loops that test that
journal->j_checkpoint_transactions != NULL.

Cheers,
Duane.

-- 
"I never could learn to drink that blood and call it wine" - Bob Dylan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch
  2008-09-23 16:56 ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Duane Griffin
@ 2008-09-29  2:24   ` Theodore Tso
  2008-09-29 16:51     ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Duane Griffin
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Tso @ 2008-09-29  2:24 UTC (permalink / raw)
  To: Duane Griffin; +Cc: Andrew Morton, Stephen C. Tweedie, linux-ext4

On Tue, Sep 23, 2008 at 05:56:27PM +0100, Duane Griffin wrote:
> Stephen suggested that it would be better to sanity check the journal
> start/end pointers on mount, rather than catching the error later like
> this. I never quite convinced myself I'd worked out the right way to
> do that, sorry. Perhaps someone would like to confirm (or otherwise)
> whether or not the following is correct:
> 
> In journal_reset (?) check that:
> 
> journal->j_first == 1 (this seems to be the only valid value)
> 
> and
> 
> journal->j_last >= JFS_MIN_JOURNAL_BLOCKS

Yes, for all existing currently created, j_first will be 1.  I can't
think of a good reason for why we might want to reserve some space at
the beginning of the journal, but the safest check would be:

    (journal->j_last - journal->j_first +1) >= JFS_MIN_JOURNAL_BLOCKS

> Additionally, it should be possible to check the journal->j_last more
> precisely. For internal journals it seems straight-forward, we can
> just check that journal->j_last == inode->i_size >>
> inode->i_sb->s_blocksize_bits. For external journals we'd need to load
> the device's superblock and check journal->j_last == s_blocks_count.

Yep, agreed.

> Regardless, I think the original patch may be a good idea. It improves
> robustness and matches the other locations where we call
> jbd2_log_do_checkpoint. They are all in loops that test that
> journal->j_checkpoint_transactions != NULL.

Agreed.  I've included it in the ext4 patch queue, and will be soon
putting out a new ext4 patchset consisting of the patches I plan to
push during the next merge window.

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch
  2008-09-29  2:24   ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Theodore Tso
@ 2008-09-29 16:51     ` Duane Griffin
  0 siblings, 0 replies; 4+ messages in thread
From: Duane Griffin @ 2008-09-29 16:51 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Duane Griffin, Andrew Morton, Stephen C. Tweedie, linux-ext4

On Sun, Sep 28, 2008 at 10:24:26PM -0400, Theodore Tso wrote:
> On Tue, Sep 23, 2008 at 05:56:27PM +0100, Duane Griffin wrote:
> > Stephen suggested that it would be better to sanity check the journal
> > start/end pointers on mount, rather than catching the error later like
> > this. I never quite convinced myself I'd worked out the right way to
> > do that, sorry. Perhaps someone would like to confirm (or otherwise)
> > whether or not the following is correct:
> > 
> > In journal_reset (?) check that:
> > 
> > journal->j_first == 1 (this seems to be the only valid value)
> > 
> > and
> > 
> > journal->j_last >= JFS_MIN_JOURNAL_BLOCKS
> 
> Yes, for all existing currently created, j_first will be 1.  I can't
> think of a good reason for why we might want to reserve some space at
> the beginning of the journal, but the safest check would be:
> 
>     (journal->j_last - journal->j_first +1) >= JFS_MIN_JOURNAL_BLOCKS

Fair enough.

> > Additionally, it should be possible to check the journal->j_last more
> > precisely. For internal journals it seems straight-forward, we can
> > just check that journal->j_last == inode->i_size >>
> > inode->i_sb->s_blocksize_bits. For external journals we'd need to load
> > the device's superblock and check journal->j_last == s_blocks_count.
> 
> Yep, agreed.

OK, great. See patch below. I'll send the ext3/jbd version once you're
happy with it.

> > Regardless, I think the original patch may be a good idea. It improves
> > robustness and matches the other locations where we call
> > jbd2_log_do_checkpoint. They are all in loops that test that
> > journal->j_checkpoint_transactions != NULL.
> 
> Agreed.  I've included it in the ext4 patch queue, and will be soon
> putting out a new ext4 patchset consisting of the patches I plan to
> push during the next merge window.

Great, thanks. The original patch was for ext3/jbd patch, but I'm not
sure whether that has been accepted anywhere or not. I'll check after
the ext3 patches are merged and resend it if needed.

> 						- Ted

Cheers,
Duane.

-- 

Subject: [PATCH] jbd2: sanity check block range

Invalid journal start/end block values can cause BUGs. Do some sanity
checking on them when we load the journal.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
---
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 8207a01..bb926e4 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -41,6 +41,8 @@
 #include <asm/uaccess.h>
 #include <asm/page.h>
 
+#include "../ext4/ext4.h"
+
 EXPORT_SYMBOL(jbd2_journal_start);
 EXPORT_SYMBOL(jbd2_journal_restart);
 EXPORT_SYMBOL(jbd2_journal_extend);
@@ -1120,6 +1122,34 @@ static void journal_fail_superblock (journal_t *journal)
 	journal->j_sb_buffer = NULL;
 }
 
+static int validate_last_block(journal_t *journal, unsigned long last)
+{
+	if (journal->j_inode) {
+		return last == journal->j_inode->i_size >>
+			journal->j_inode->i_sb->s_blocksize_bits;
+	} else {
+		struct buffer_head *bh;
+		struct ext4_super_block *es;
+		ext4_fsblk_t sb_block;
+		ext4_fsblk_t count;
+		unsigned long offset;
+
+		sb_block = EXT4_MIN_BLOCK_SIZE / journal->j_blocksize;
+		offset = EXT4_MIN_BLOCK_SIZE % journal->j_blocksize;
+		bh = __getblk(journal->j_dev, sb_block, journal->j_blocksize);
+		if (bh) {
+			es = (struct ext4_super_block *) bh->b_data + offset;
+			count = ext4_blocks_count(es);
+			brelse(bh);
+			return count == last;
+		} else {
+			printk(KERN_WARNING
+				"JBD2: IO error reading journal's ext3 sb\n");
+			return 0;
+		}
+	}
+}
+
 /*
  * Given a journal_t structure, initialise the various fields for
  * startup of a new journaling session.  We use this both when creating
@@ -1134,6 +1164,16 @@ static int journal_reset(journal_t *journal)
 
 	first = be32_to_cpu(sb->s_first);
 	last = be32_to_cpu(sb->s_maxlen);
+	if (last - first + 1 < JBD2_MIN_JOURNAL_BLOCKS) {
+		printk(KERN_ERR "JBD2: Bad journal block range: %llu-%llu\n",
+			first, last);
+		return -EIO;
+	}
+
+	if (!validate_last_block(journal, last)) {
+		printk(KERN_ERR "JBD2: Bad last journal block: %llu\n", last);
+		return -EIO;
+	}
 
 	journal->j_first = first;
 	journal->j_last = last;

^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2008-09-29 16:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-22 21:08 jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Andrew Morton
2008-09-23 16:56 ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Duane Griffin
2008-09-29  2:24   ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Theodore Tso
2008-09-29 16:51     ` jbd2-abort-instead-of-waiting-for-nonexistent-transactions.patch Duane Griffin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).