linux-ext4.vger.kernel.org archive mirror
* [PATCH] e2fsck shouldn't consider superblock summaries as fatal
@ 2008-08-26 10:45 Andreas Dilger
  2008-08-26 14:58 ` Eric Sandeen
  2008-08-26 17:04 ` Theodore Tso
  0 siblings, 2 replies; 7+ messages in thread
From: Andreas Dilger @ 2008-08-26 10:45 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-ext4

Running e2fsck on a quiescent (but mounted) filesystem fails in the
common case where the superblock inode and block count summaries are
wrong.  The kernel doesn't update these values except at unmount time.
If there are other errors in the filesystem then they will already
cause e2fsck to consider the filesystem invalid, so these minor errors
should not.

Don't mark the filesystem invalid when the only error is in the
superblock summary.  The kernel does not update these fields except at
unmount time.  Any other unfixed errors will themselves mark the
filesystem invalid.

Signed-off-by: Andreas Dilger <adilger@sun.com>

--- ./e2fsck/pass5.c.orig	2008-07-09 12:36:03.000000000 -0600
+++ ./e2fsck/pass5.c	2008-08-26 04:10:40.000000000 -0600
@@ -347,8 +347,7 @@ redo_counts:
 		if (fix_problem(ctx, PR_5_FREE_BLOCK_COUNT, &pctx)) {
 			fs->super->s_free_blocks_count = free_blocks;
 			ext2fs_mark_super_dirty(fs);
-		} else
-			ext2fs_unmark_valid(fs);
+		}
 	}
 errout:
 	ext2fs_free_mem(&free_array);
@@ -566,8 +565,7 @@ do_counts:
 		if (fix_problem(ctx, PR_5_FREE_INODE_COUNT, &pctx)) {
 			fs->super->s_free_inodes_count = free_inodes;
 			ext2fs_mark_super_dirty(fs);
-		} else
-			ext2fs_unmark_valid(fs);
+		}
 	}
 errout:
 	ext2fs_free_mem(&free_array);
--- ./e2fsck/problem.c.orig	2008-07-09 12:36:03.000000000 -0600
+++ ./e2fsck/problem.c	2008-08-26 03:58:05.000000000 -0600
@@ -1577,7 +1587,7 @@ static struct e2fsck_problem problem_tab
 	/* Free inodes count wrong */
 	{ PR_5_FREE_INODE_COUNT,
 	  N_("Free @is count wrong (%i, counted=%j).\n"),
-	  PROMPT_FIX, PR_PREEN_OK | PR_PREEN_NOMSG },
+	  PROMPT_FIX, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
 
 	/* Free blocks count for group wrong */
 	{ PR_5_FREE_BLOCK_COUNT_GROUP,
@@ -1587,7 +1597,7 @@ static struct e2fsck_problem problem_tab
 	/* Free blocks count wrong */
 	{ PR_5_FREE_BLOCK_COUNT,
 	  N_("Free @bs count wrong (%b, counted=%c).\n"),
-	  PROMPT_FIX, PR_PREEN_OK | PR_PREEN_NOMSG },
+	  PROMPT_FIX, PR_PREEN_OK | PR_NO_OK | PR_PREEN_NOMSG },
 
 	/* Programming error: bitmap endpoints don't match */
 	{ PR_5_BMAP_ENDPOINTS,

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
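
A minimal sketch of what the patch changes, assuming (as the patch
relies on) that fix_problem() clears the filesystem's valid flag on a
"no" answer unless the problem is flagged PR_NO_OK.  The flag values,
struct model_fs, and the model_* helpers are hypothetical stand-ins for
illustration, not the e2fsprogs definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real ones are defined in e2fsprogs. */
#define PR_PREEN_OK    0x0001
#define PR_NO_OK       0x0002
#define PR_PREEN_NOMSG 0x0004

/* Stand-in for the state that ext2fs_unmark_valid() clears. */
struct model_fs {
	bool valid;
};

/*
 * Simplified stand-in for fix_problem(): returns the user's answer and
 * clears the valid flag on "no", unless the problem carries PR_NO_OK.
 */
static bool model_fix_problem(struct model_fs *fs, unsigned flags,
			      bool answer)
{
	if (!answer && !(flags & PR_NO_OK))
		fs->valid = false;
	return answer;
}

/* Exit status per e2fsck(8): 0 = no errors, 4 = errors left uncorrected. */
static int model_exit_status(const struct model_fs *fs)
{
	return fs->valid ? 0 : 4;
}

/* Report a wrong summary count under "-n", where every answer is "no". */
static int model_check_summary(unsigned flags)
{
	struct model_fs fs = { .valid = true };

	model_fix_problem(&fs, flags, false);
	return model_exit_status(&fs);
}
```

With the old flags, model_check_summary(PR_PREEN_OK | PR_PREEN_NOMSG)
yields 4; with PR_NO_OK added it yields 0, which is the exit status
change the patch is after.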



* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-26 10:45 [PATCH] e2fsck shouldn't consider superblock summaries as fatal Andreas Dilger
@ 2008-08-26 14:58 ` Eric Sandeen
  2008-08-26 17:04 ` Theodore Tso
  1 sibling, 0 replies; 7+ messages in thread
From: Eric Sandeen @ 2008-08-26 14:58 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Theodore Ts'o, linux-ext4

Andreas Dilger wrote:
> Running e2fsck on a quiescent (but mounted) filesystem fails in the
> common case where the superblock inode and block count summaries are
> wrong.  The kernel doesn't update these values except at unmount time.
> If there are other errors in the filesystem then they will already
> cause e2fsck to consider the filesystem invalid, so these minor errors
> should not.

If by quiescent you mean ->write_super_lockfs, shouldn't that path
be indistinguishable from an unmount?  Why wouldn't write_super_lockfs
also update these counts, rather than working around it in fsck?

-Eric

> Don't consider only an error in the superblock summary as incorrect.
> The kernel does not update this field except at unmount time.  Any
> other unfixed errors will themselves mark the filesystem invalid.
> 
> Signed-off-by: Andreas Dilger <adilger@sun.com>


* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-26 10:45 [PATCH] e2fsck shouldn't consider superblock summaries as fatal Andreas Dilger
  2008-08-26 14:58 ` Eric Sandeen
@ 2008-08-26 17:04 ` Theodore Tso
  2008-08-26 21:27   ` Andreas Dilger
  1 sibling, 1 reply; 7+ messages in thread
From: Theodore Tso @ 2008-08-26 17:04 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

On Tue, Aug 26, 2008 at 04:45:02AM -0600, Andreas Dilger wrote:
> Running e2fsck on a quiescent (but mounted) filesystem fails in the
> common case where the superblock inode and block count summaries are
> wrong.  The kernel doesn't update these values except at unmount time.
> If there are other errors in the filesystem then they will already
> cause e2fsck to consider the filesystem invalid, so these minor errors
> should not.

Sure, but *when* would it ever be safe to run e2fsck without -n on a
mounted filesystem?  What's the scenario where this would matter?  And
on an unmounted filesystem, if the block counts are wrong, and the
user refuses to fix them the filesystem technically really isn't 100%
valid.

					- Ted


* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-26 17:04 ` Theodore Tso
@ 2008-08-26 21:27   ` Andreas Dilger
  2008-08-27  0:25     ` Theodore Tso
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2008-08-26 21:27 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

On Aug 26, 2008  13:04 -0400, Theodore Ts'o wrote:
> On Tue, Aug 26, 2008 at 04:45:02AM -0600, Andreas Dilger wrote:
> > Running e2fsck on a quiescent (but mounted) filesystem fails in the
> > common case where the superblock inode and block count summaries are
> > wrong.  The kernel doesn't update these values except at unmount time.
> > If there are other errors in the filesystem then they will already
> > cause e2fsck to consider the filesystem invalid, so these minor errors
> > should not.
> 
> Sure, but *when* would it ever be safe to run e2fsck without -n on a
> mounted filesystem?  What's the scenario where this would matter?  And
> on an unmounted filesystem, if the block counts are wrong, and the
> user refuses to fix them the filesystem technically really isn't 100%
> valid.

I mean that this is for "e2fsck -fn".  In that case the filesystem isn't
changed, and is often completely clean except the superblock counters.
Until we have block-device freeze ioctl widely available (or convince
users to use LVM), the best we can do is quiesce Lustre IO without
unmounting the filesystem.


Without patch:

# e2fsck -fn /dev/hda3
e2fsck 1.40.7.sun1 (28-Feb-2008)
Warning!  /dev/hda3 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (213908, counted=213298).
Fix? no

Free inodes count wrong (249992, counted=249282).
Fix? no

lustre-MDT0000: ********** WARNING: Filesystem still has errors **********

lustre-MDT0000: 56/250048 files (17.9% non-contiguous), 36092/250000 blocks

# echo $?
4

With patch:

# e2fsck -fn /dev/hda3
e2fsck 1.40.11.sun1 (17-June-2008)
Warning!  /dev/hda3 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (213908, counted=213298).
Fix? no

Free inodes count wrong (249992, counted=249282).
Fix? no

lustre-MDT0000: 56/250048 files (17.9% non-contiguous), 36092/250000 blocks
# echo $?
0

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
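
The difference between the two runs above is only the exit status
(4 versus 0).  As a rough sketch of how a wrapper might interpret that
status, the following decodes the bit values documented in e2fsck(8).
The table and the fsck_describe() helper are hypothetical names used
for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Exit status bits as documented in the e2fsck(8) manual page. */
static const struct {
	int bit;
	const char *text;
} fsck_bits[] = {
	{   1, "errors corrected" },
	{   2, "errors corrected, reboot needed" },
	{   4, "errors left uncorrected" },
	{   8, "operational error" },
	{  16, "usage or syntax error" },
	{  32, "canceled by user request" },
	{ 128, "shared library error" },
};

/* Write a "; "-separated description of an e2fsck exit status to buf. */
static void fsck_describe(int status, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';
	if (status == 0) {
		snprintf(buf, len, "no errors");
		return;
	}
	for (i = 0; i < sizeof(fsck_bits) / sizeof(fsck_bits[0]); i++) {
		size_t used = strlen(buf);

		if (!(status & fsck_bits[i].bit))
			continue;
		/* snprintf truncates safely if buf is too small. */
		snprintf(buf + used, len - used, "%s%s",
			 used ? "; " : "", fsck_bits[i].text);
	}
}
```

A monitoring script that treats any non-zero status as a failure will
flag the unpatched run, since status 4 decodes to "errors left
uncorrected" even though only the summary counters differ.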



* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-26 21:27   ` Andreas Dilger
@ 2008-08-27  0:25     ` Theodore Tso
  2008-08-27  7:32       ` Andreas Dilger
  0 siblings, 1 reply; 7+ messages in thread
From: Theodore Tso @ 2008-08-27  0:25 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

On Tue, Aug 26, 2008 at 03:27:43PM -0600, Andreas Dilger wrote:
> I mean that this is for "e2fsck -fn".  In that case the filesystem isn't
> changed, and is often completely clean except the superblock counters.
> Until we have block-device freeze ioctl widely available (or convince
> users to use LVM), the best we can do is quiesce Lustre IO without
> unmounting the filesystem.

Ah, I see.  So the main thing that you are trying to achieve with the
patch is avoid the non-zero exit from fsck, right?

I guess I'm really not that happy with letting the filesystem get
marked as "valid" if the user refuses to fix the free blocks/inode
count summary when the -n flag isn't set.  And technically, if
the summary statistics are wrong, the filesystem is not actually
valid, which is what an exit code of 4 means, right?

It seems like the much more "correct" solution, which would actually
be more code, but would also be useful when a user wants to check a
filesystem without actually changing *anything*, including running the
journal, would be to create an I/O manager which reads the journal
into memory and creates an "override map" data structure, such that
when e2fsck tries to read a block which is in the journal, the
(read-only) I/O manager reads the block from the journal instead of
from the disk.  (Of course it will need to respect the revoke records,
too!)

Once we have this I/O manager, I think e2fsck should use it by default
with the -n option, so that we can correctly check the filesystem, and
**not** modify the device at all.  This would also give you the exit
status of 0 for quiescent filesystems, as you would wish.  Debugfs
could also have an option to use this I/O manager to read in the
journal.

						- Ted
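
The "override map" idea above can be sketched as a small in-memory
model.  This is only an illustration under simplifying assumptions:
the struct and function names are hypothetical (they are not the
libext2fs io_manager interface), blocks are 16 bytes, and a revoke
simply drops any earlier journaled copy of the block, whereas a real
journal orders revokes by transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define OV_BLOCK_SIZE    16	/* toy block size for the sketch */
#define OV_MAX_OVERRIDES 8

/* One journaled block that should shadow its on-disk copy. */
struct override_entry {
	unsigned long blk;
	unsigned char data[OV_BLOCK_SIZE];
};

struct override_map {
	struct override_entry entries[OV_MAX_OVERRIDES];
	size_t count;
};

static struct override_entry *map_find(struct override_map *map,
				       unsigned long blk)
{
	for (size_t i = 0; i < map->count; i++)
		if (map->entries[i].blk == blk)
			return &map->entries[i];
	return NULL;
}

/* Record a journaled copy of blk; a later copy replaces an earlier one. */
static int map_log_block(struct override_map *map, unsigned long blk,
			 const unsigned char *data)
{
	struct override_entry *e = map_find(map, blk);

	if (!e) {
		if (map->count == OV_MAX_OVERRIDES)
			return -1;	/* map full */
		e = &map->entries[map->count++];
		e->blk = blk;
	}
	memcpy(e->data, data, OV_BLOCK_SIZE);
	return 0;
}

/* Revoke record: forget any earlier journaled copy of blk. */
static void map_revoke(struct override_map *map, unsigned long blk)
{
	struct override_entry *e = map_find(map, blk);

	if (e)
		*e = map->entries[--map->count];
}

/* Read-only read: prefer the journal copy, never write to the disk. */
static void overlay_read(const unsigned char *disk,
			 struct override_map *map,
			 unsigned long blk, unsigned char *out)
{
	const struct override_entry *e = map_find(map, blk);

	memcpy(out, e ? e->data : disk + blk * OV_BLOCK_SIZE,
	       OV_BLOCK_SIZE);
}
```

Reads of a journaled block return the journal's copy, reads of any
other block fall through to the device, and the device buffer is only
ever accessed through a const pointer, which matches the goal of not
modifying the device at all.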


* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-27  0:25     ` Theodore Tso
@ 2008-08-27  7:32       ` Andreas Dilger
  2008-08-27 13:44         ` Theodore Tso
  0 siblings, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2008-08-27  7:32 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

On Aug 26, 2008  20:25 -0400, Theodore Ts'o wrote:
> On Tue, Aug 26, 2008 at 03:27:43PM -0600, Andreas Dilger wrote:
> > I mean that this is for "e2fsck -fn".  In that case the filesystem isn't
> > changed, and is often completely clean except the superblock counters.
> > Until we have block-device freeze ioctl widely available (or convince
> > users to use LVM), the best we can do is quiesce Lustre IO without
> > unmounting the filesystem.
> 
> Ah, I see.  So the main thing that you are trying to achieve with the
> patch is avoid the non-zero exit from fsck, right?

Yes, the non-zero exit is the main issue.

> I guess I'm really not that happy with letting the filesystem get
> marked as "valid" if the user refuses to fix the free blocks/inode
> count summary when the -n flag isn't set.  And technically, if
> the summary statistics are wrong, the filesystem is not actually
> valid, which is what an exit code of 4 means, right?

Sure, but the summary statistics are _always_ wrong these days.  I
think there was even a hack somewhere in e2fsck that you wrote to fix
up the summaries silently, when e2fsck was always reporting errors after
we turned off the superblock updates...

> It seems like the much more "correct" solution, which would actually
> be more code, but would also be useful when a user wants to check a
> filesystem without actually changing *anything*, including running the
> journal, would be to create an I/O manager which reads in the journal
> into memory, and creates a "override map" data structure such that
> when e2fsck tries to read from a block which is in the journal, that
> the (read-only) I/O manager read the block in the journal instead of
> from the disk.  (Of course it will need to respect the revoke records,
> too!)

I don't think this is the issue at all.  It isn't that the journal has the
right summary values either, otherwise waiting 1 commit interval would
be enough.  The issue is that the kernel NEVER updates the summaries
by itself, so the effort to replay the journal in memory would be cool,
but wouldn't help at all.


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



* Re: [PATCH] e2fsck shouldn't consider superblock summaries as fatal
  2008-08-27  7:32       ` Andreas Dilger
@ 2008-08-27 13:44         ` Theodore Tso
  0 siblings, 0 replies; 7+ messages in thread
From: Theodore Tso @ 2008-08-27 13:44 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

On Wed, Aug 27, 2008 at 01:32:42AM -0600, Andreas Dilger wrote:
> I don't think this is the issue at all.  It isn't that the journal has the
> right summary values either, otherwise waiting 1 commit interval would
> be enough.  The issue is that the kernel NEVER updates the summaries
> by itself, so the effort to replay the journal in memory would be cool,
> but wouldn't help at all.

If you reboot and replay the journal, the summary values are right.
So the correct values are indeed in the journal.  The problem is that
we *never* write the superblock to disk, but only to the journal.
Consider:

/*
 * Ext3 always journals updates to the superblock itself, so we don't
 * have to propagate any other updates to the superblock on disk at this
 * point.  Just start an async writeback to get the buffers on their way
 * to the disk.
 *
 * This implicitly triggers the writebehind on sync().
 */

static void ext3_write_super (struct super_block * sb)
{
	if (mutex_trylock(&sb->s_lock) != 0)
		BUG();
	sb->s_dirt = 0;
}

The comment is a little out of date, since we don't even start an
async writeback these days.  All ext3_write_super does is mark the
superblock as non-dirty, so we are 100% dependent on the summary
values getting written to the journal.  Even if we call fsync on the
filesystem, we don't actually write the superblock to its permanent
location on disk, but only to the journal.

static int ext3_sync_fs(struct super_block *sb, int wait)
{
	tid_t target;

	sb->s_dirt = 0;
	if (journal_start_commit(EXT3_SB(sb)->s_journal, &target)) {
		if (wait)
			log_wait_commit(EXT3_SB(sb)->s_journal, target);
	}
	return 0;
}

It's only if we unmount or freeze the filesystem that we call
ext3_commit_super().

						- Ted
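
The behaviour described above can be summarised in a toy model with
three copies of the counters: the live in-memory values, the copy in
the journal, and the superblock at its permanent on-disk location.
This is a sketch of the description in this message, not kernel code;
struct model_mount and the model_* functions are hypothetical, and the
numbers are taken from the e2fsck output earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

struct sb_counts {
	unsigned long free_blocks;
	unsigned long free_inodes;
};

struct model_mount {
	struct sb_counts in_memory;	/* live counters in the kernel */
	struct sb_counts in_journal;	/* last copy committed to the journal */
	struct sb_counts on_disk;	/* superblock at its permanent location */
	bool dirty;
};

/* Like ext3_write_super() as quoted: only clears the dirty flag. */
static void model_write_super(struct model_mount *m)
{
	m->dirty = false;
}

/*
 * Like ext3_sync_fs() as described: the journal commit carries the
 * current values, but the on-disk superblock is left untouched.
 */
static void model_sync_fs(struct model_mount *m)
{
	m->dirty = false;
	m->in_journal = m->in_memory;
}

/* Stand-in for ext3_commit_super(), reached on unmount or freeze. */
static void model_commit_super(struct model_mount *m)
{
	m->on_disk = m->in_memory;
}
```

In this model a read-only check that skips journal recovery compares
its own count against on_disk, so it keeps reporting 213908 free
blocks after any number of model_sync_fs() calls, and only sees 213298
once model_commit_super() has run.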

