(resend) extent header problems following shrink with resize2fs

* (resend) extent header problems following shrink with resize2fs
@ 2008-12-23  5:49 Paul Collins
  2008-12-23  6:18 ` Theodore Tso
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Collins @ 2008-12-23  5:49 UTC (permalink / raw)
  To: linux-ext4

(resending without gzipped attachment)

Running Linux 2.6.28-rc9 as of ab65387243f47a7bc11725f733c86bf27248b326.
e2fsprogs 1.41.3-1 from Debian.

Yesterday I created a ~464GB ext4 volume and copied about 107GB of music
files onto it.  Then I decided that I wanted to use half of the disk for
something else, so last night I resized the ext4 filesystem to ~232GB
and recreated the partitions to suit.  This morning I wrote some new
files to the ext4 filesystem, which went fine.  Then I installed a new
music player, which wanted to scan all of the files on the disk.  It
reported being unable to read some files, and there's rather a lot of
this sort of thing in dmesg (see also http://ondioline.org/~paul/e4dmesg.gz):

EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39565: invalid magic - magic 24e, entries 28338, max 21313(0), depth 28712(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39555: invalid magic - magic cd8c, entries 59560, max 57082(0), depth 5425(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39563: invalid magic - magic 976d, entries 52325, max 49256(0), depth 50316(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39556: invalid magic - magic 61c6, entries 47990, max 4668(0), depth 32768(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #44888: invalid magic - magic 42a, entries 5388, max 32960(0), depth 1872(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39844: invalid magic - magic 6ae8, entries 44073, max 20807(0), depth 10869(0)
EXT4-fs error (device sdb1): ext4_ext_find_extent: bad header in inode #39843: invalid magic - magic 2200, entries 38282, max 17931(0), depth 0(0)
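
For readers decoding the messages above: an ext4 extent node begins with a 12-byte little-endian header whose first field is the magic number 0xF30A, so values such as "magic 24e" suggest the block no longer holds an extent header at all. The following is a minimal sketch of that kind of sanity check, not the kernel's or libext2fs's actual code; the names `extent_header`, `check_extent_header`, and the return codes are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_MAGIC 0xF30A   /* ext4 extent header magic */

/* Decoded copy of the 12-byte on-disk header (hypothetical struct name). */
struct extent_header {
    uint16_t magic;        /* must be EXT_MAGIC */
    uint16_t entries;      /* entries in use in this node */
    uint16_t max;          /* capacity of this node */
    uint16_t depth;        /* 0 for a leaf node */
    uint32_t generation;
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode the header at buf and apply two basic checks.
 * Returns 0 if plausible, -1 if the buffer is too short,
 * -2 on a bad magic number, -3 if entries exceeds max.
 */
int check_extent_header(const unsigned char *buf, size_t len,
                        struct extent_header *out)
{
    if (len < 12)
        return -1;
    out->magic = le16(buf);
    out->entries = le16(buf + 2);
    out->max = le16(buf + 4);
    out->depth = le16(buf + 6);
    out->generation = (uint32_t)buf[8] | ((uint32_t)buf[9] << 8) |
                      ((uint32_t)buf[10] << 16) | ((uint32_t)buf[11] << 24);
    if (out->magic != EXT_MAGIC)
        return -2;
    if (out->entries > out->max)
        return -3;
    return 0;
}
```

Fed the first reported header (magic 0x24e, entries 28338, max 21313), this sketch rejects it at the magic check, before the entry counts are even considered.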

There are no "access beyond end of partition" messages, so I don't think
I screwed up the resize procedure.  The argument I gave resize2fs was
"244192000K"; here's the partition table:

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       30401   244196001   83  Linux
/dev/sdb2           30402       60801   244188000   83  Linux
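
The size claim can be checked with a little arithmetic, assuming the fdisk "Blocks" column is in 1 KiB units (the traditional fdisk convention) and, for the block count, a 4096-byte filesystem block size (an assumption; the thread does not state it). The helper names below are hypothetical.

```c
#include <stdint.h>

/*
 * Spare space, in KiB, left in the partition after the filesystem.
 * A negative result would mean the filesystem overruns the partition.
 */
int64_t slack_kib(int64_t fs_kib, int64_t part_kib)
{
    return part_kib - fs_kib;
}

/* Filesystem size in blocks for a given block size in bytes. */
int64_t fs_blocks(int64_t fs_kib, int64_t block_size)
{
    return fs_kib * 1024 / block_size;
}
```

With the figures above, slack_kib(244192000, 244196001) is 4001, so the resized filesystem ends about 4 MB short of the end of /dev/sdb1, which is consistent with the absence of "access beyond end of partition" messages.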


e2fsck aborts when I try to use it to fix the filesystem:

        /dev/sdb1 contains a file system with errors, check forced.
        Pass 1: Checking inodes, blocks, and sizes
        Error1: Corrupt extent header on inode 38979
        [New Thread 0x7fe15e066740 (LWP 24166)]

        Program received signal SIGABRT, Aborted.
        [Switching to Thread 0x7fe15e066740 (LWP 24166)]
        0x00007fe15d0fbed5 in raise () from /lib/libc.so.6
        (gdb) bt
        #0  0x00007fe15d0fbed5 in raise () from /lib/libc.so.6
        #1  0x00007fe15d0fd3f3 in abort () from /lib/libc.so.6
        #2  0x000000000040bdae in scan_extent_node (ctx=0x24c6f70, 
            pctx=0x7fff6607c7a0, pb=0x7fff6607c5f0, start_block=0, ehandle=0x2ed94d0)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1700
        #3  0x000000000040cc1d in check_blocks (ctx=0x24c6f70, pctx=0x7fff6607c7a0, 
            block_buf=0x2ec11a0 "�002")
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1773
        #4  0x000000000040e063 in e2fsck_pass1 (ctx=0x24c6f70)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1030
        #5  0x00000000004089e8 in e2fsck_run (ctx=0x24c6f70)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/e2fsck.c:215
        #6  0x00000000004074a3 in main (argc=<value optimized out>, 
            argv=<value optimized out>)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/unix.c:1278

and e2image exits with:

        e2image: Corrupt extent header while iterating over inode 38979

In the matter of inode 38979, debugfs says:

        debugfs:  bmap <38979> 0
        argv[0]: Corrupt extent header while mapping logical block 0


I can keep the FS around so let me know if you need any more
information.

-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: (resend) extent header problems following shrink with resize2fs
  2008-12-23  5:49 (resend) extent header problems following shrink with resize2fs Paul Collins
@ 2008-12-23  6:18 ` Theodore Tso
  2008-12-20 18:14   ` Massive filesystem corruption Matteo Croce
  2008-12-25  7:18   ` (resend) extent header problems following shrink with resize2fs Paul Collins
  0 siblings, 2 replies; 11+ messages in thread
From: Theodore Tso @ 2008-12-23  6:18 UTC (permalink / raw)
  To: Paul Collins; +Cc: linux-ext4

On Tue, Dec 23, 2008 at 06:49:15PM +1300, Paul Collins wrote:
> (resending without gzipped attachment)
> 
> Yesterday I created a ~464GB ext4 volume and copied about 107GB of music
> files onto it.  Then I decided that I wanted to use half of the disk for
> something else, so last night I resized the ext4 filesystem to ~232GB
> and recreated the partitions to suit.  This morning I wrote some new
> files to the ext4 filesystem, which went fine.  Then I installed a new
> music player, which wanted to scan all of the files on the disk.  It
> reported being unable to read some files, and there's rather a lot of
> this sort of thing in dmesg (see also http://ondioline.org/~paul/e4dmesg.gz):

Yeah, resize2fs needs to be fixed to handle extents correctly.  At the
moment it can screw them up pretty badly.  I'll log this as a bug to
resize2fs; thanks for reporting it, and I hope you didn't suffer any
permanent data loss.

Regards,

							- Ted

* Massive filesystem corruption
@ 2008-12-20 18:14   ` Matteo Croce
  2008-12-20 19:27     ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: Matteo Croce @ 2008-12-20 18:14 UTC (permalink / raw)
  To: linux-ext4

Hi,
I've lost my ext4 partition with a 2.6.27 vanilla kernel:

root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sda1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

root@ubuntu:~# dmesg | tail -1
[ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.
root@ubuntu:~# e2fsck /dev/sda1
e2fsck 1.41.3 (12-Oct-2008)
e2fsck: Superblock invalid, trying backup blocks...
/dev/sda1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error1: Corrupt extent header on inode 107192
Aborted (core dumped)
root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
(gdb) run
Starting program: /sbin/e2fsck /dev/sda1
[Thread debugging using libthread_db enabled]
e2fsck 1.41.3 (12-Oct-2008)
/sbin/e2fsck: Superblock invalid, trying backup blocks...
/dev/sda1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error1: Corrupt extent header on inode 107192
[New Thread 0xb7e46700 (LWP 12878)]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7e46700 (LWP 12878)]
0xb8031430 in __kernel_vsyscall ()
(gdb) backtrace
#0  0xb8031430 in __kernel_vsyscall ()
#1  0xb7e8c880 in raise () from /lib/tls/i686/cmov/libc.so.6
#2  0xb7e8e248 in abort () from /lib/tls/i686/cmov/libc.so.6
#3  0x0805397b in scan_extent_node (ctx=0x9193038, pctx=0xbf830d7c, 
pb=0xbf830c5c, start_block=0, ehandle=0x91b8170) at 
/build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1700
#4  0x08054c02 in check_blocks (ctx=0x9193038, pctx=0xbf830d7c, 
block_buf=0x91acff0 "\225\"\005") at 
/build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1773
#5  0x080565ca in e2fsck_pass1 (ctx=0x9193038) at 
/build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1030
#6  0x08050063 in e2fsck_run (ctx=0x9193038) at 
/build/buildd/e2fsprogs-1.41.3/e2fsck/e2fsck.c:215
#7  0x0804e4b8 in main (argc=Cannot access memory at address 0x324e
) at /build/buildd/e2fsprogs-1.41.3/e2fsck/unix.c:1278
(gdb)

If you know how I can read, fix, or debug it, please answer soon:
I need that disk space and will reformat it in a few days.

* Re: Massive filesystem corruption
  2008-12-20 18:14   ` Massive filesystem corruption Matteo Croce
@ 2008-12-20 19:27     ` Eric Sandeen
  2008-12-21  2:05       ` Matteo Croce
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2008-12-20 19:27 UTC (permalink / raw)
  To: Matteo Croce; +Cc: linux-ext4

Matteo Croce wrote:
> Hi,
> I've lost my ext4 partition with a 2.6.27 vanilla kernel:
> 
> root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
> mount: wrong fs type, bad option, bad superblock on /dev/sda1,
>        missing codepage or helper program, or other error
>        In some cases useful info is found in syslog - try
>        dmesg | tail  or so

What happened between the last successful mount and this failure?

> root@ubuntu:~# dmesg | tail -1
> [ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.

Was there anything before that?  (i.e. check tail -n 10?)

What does the beginning of the fs look like?  Maybe you can put the first
16k or so of a dd somewhere, or run it through hexdump -C, to see if
something else stomped on this partition.
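
One thing such a dump would show is whether the primary superblock is still in place. The primary ext2/3/4 superblock starts 1024 bytes into the device, and its 16-bit magic field (0xEF53, little-endian) sits 56 bytes into the superblock. A minimal sketch of that check over an in-memory copy of the first bytes of the partition follows; the function name `has_ext_superblock` is hypothetical, and how the buffer is read from the device is left out.

```c
#include <stddef.h>

#define SB_OFFSET    1024     /* primary superblock starts 1024 bytes in */
#define MAGIC_OFFSET 56       /* offset of s_magic within the superblock */
#define EXT_SB_MAGIC 0xEF53   /* ext2/ext3/ext4 superblock magic */

/*
 * Returns 1 if buf (the first len bytes of the partition) carries the
 * ext superblock magic at the expected offset, 0 otherwise.
 */
int has_ext_superblock(const unsigned char *buf, size_t len)
{
    size_t off = SB_OFFSET + MAGIC_OFFSET;

    if (len < off + 2)
        return 0;             /* buffer too short to contain the field */
    return (buf[off] | (buf[off + 1] << 8)) == EXT_SB_MAGIC;
}
```

In a hexdump -C listing, the same two bytes would appear as "53 ef" at offset 0x438. A missing magic there would match the "Can't find ext4 filesystem" message and e2fsck falling back to backup superblocks.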

> root@ubuntu:~# e2fsck /dev/sda1
> e2fsck 1.41.3 (12-Oct-2008)
> e2fsck: Superblock invalid, trying backup blocks...
> /dev/sda1 was not cleanly unmounted, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Error1: Corrupt extent header on inode 107192
> Aborted (core dumped)
> root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
> (gdb) run
> Starting program: /sbin/e2fsck /dev/sda1
> [Thread debugging using libthread_db enabled]
> e2fsck 1.41.3 (12-Oct-2008)
> /sbin/e2fsck: Superblock invalid, trying backup blocks...
> /dev/sda1 was not cleanly unmounted, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Error1: Corrupt extent header on inode 107192
> [New Thread 0xb7e46700 (LWP 12878)]
> 
> Program received signal SIGABRT, Aborted.

well, this was an explicit abort():

        if (pctx->errcode) {
                printf("Error1: %s on inode %u\n",
                        error_message(pctx->errcode), pctx->ino);
                abort();
        }

... I guess that error is not handled yet.
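
As an illustration of what handling the error could look like, here is a minimal sketch in which a recursive tree scan returns an error code to its caller instead of calling abort(), so that the caller can give up on one file and carry on with the rest. This is a toy model, not e2fsck code: the `node` struct, `scan_node`, and `check_file` are all hypothetical.

```c
#include <stddef.h>

enum { NODE_OK = 0, NODE_CORRUPT = 1 };

/* Toy tree node; "corrupt" simulates a bad header in that node. */
struct node {
    int corrupt;
    size_t nchildren;
    struct node **children;
};

/*
 * Walk the tree depth-first, counting valid nodes in *visited.
 * On the first corrupt node, stop and hand the error to the caller.
 */
int scan_node(const struct node *n, unsigned *visited)
{
    if (n->corrupt)
        return NODE_CORRUPT;
    (*visited)++;
    for (size_t i = 0; i < n->nchildren; i++) {
        int err = scan_node(n->children[i], visited);
        if (err)
            return err;       /* propagate rather than abort() */
    }
    return NODE_OK;
}

/*
 * Caller policy: returns 1 if the file's tree is intact (with its
 * node count in *blocks), or 0 if this one file has to be cleared.
 */
int check_file(const struct node *root, unsigned *blocks)
{
    *blocks = 0;
    if (scan_node(root, blocks) != NODE_OK) {
        *blocks = 0;          /* discard the partial count */
        return 0;
    }
    return 1;
}
```

The point of the design is that the decision about what to do with a damaged file lives in the caller, so one bad node costs one file rather than the whole check.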

can you open the fs with debugfs, and try

debugfs> stat <107192>

and/or

debugfs> dump <107192> /some/path/to/dumpfile

and maybe we can see what's wrong with this inode.  If it's the only one
then perhaps it can be nuked w/ debugfs and fsck will continue.

-Eric

* Re: Massive filesystem corruption
  2008-12-20 19:27     ` Eric Sandeen
@ 2008-12-21  2:05       ` Matteo Croce
  2008-12-21  3:23         ` Eric Sandeen
                           ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Matteo Croce @ 2008-12-21  2:05 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: linux-ext4

On Saturday 20 December 2008 20:27:24 Eric Sandeen wrote:
> Matteo Croce wrote:
> > Hi,
> > I've lost my ext4 partition with a 2.6.27 vanilla kernel:
> >
> > root@ubuntu:~# mount -t ext4dev /dev/sda1 /mnt
> > mount: wrong fs type, bad option, bad superblock on /dev/sda1,
> >        missing codepage or helper program, or other error
> >        In some cases useful info is found in syslog - try
> >        dmesg | tail  or so
>
> What happened between the last successful mount and this failure?

A system freeze (the mouse hung, etc.)

> > root@ubuntu:~# dmesg | tail -1
> > [ 4874.514703] VFS: Can't find ext4 filesystem on dev sda1.
>
> Was there anything before that?  (i.e. check tail -n 10?)

Nothing relevant: USB and other drivers loading.

> What does the beginning of the fs look like?  Maybe you can put the first
> 16k or so of a dd somewhere, or run it through hexdump -C, to see if
> something else stomped on this partition.

I'll check it

> > root@ubuntu:~# e2fsck /dev/sda1
> > e2fsck 1.41.3 (12-Oct-2008)
> > e2fsck: Superblock invalid, trying backup blocks...
> > /dev/sda1 was not cleanly unmounted, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> > Error1: Corrupt extent header on inode 107192
> > Aborted (core dumped)
> > root@ubuntu:~# gdb -q --args e2fsck /dev/sda1
> > (gdb) run
> > Starting program: /sbin/e2fsck /dev/sda1
> > [Thread debugging using libthread_db enabled]
> > e2fsck 1.41.3 (12-Oct-2008)
> > /sbin/e2fsck: Superblock invalid, trying backup blocks...
> > /dev/sda1 was not cleanly unmounted, check forced.
> > Pass 1: Checking inodes, blocks, and sizes
> > Error1: Corrupt extent header on inode 107192
> > [New Thread 0xb7e46700 (LWP 12878)]
> >
> > Program received signal SIGABRT, Aborted.
>
> well, this was an explicit abort():
>
>         if (pctx->errcode) {
>                 printf("Error1: %s on inode %u\n",
>                         error_message(pctx->errcode), pctx->ino);
>                 abort();
>         }
>
> ... I guess that error is not handled yet.
>
> can you open the fs with debugfs, and try
>
> debugfs> stat <107192>
>
> and/or
>
> debugfs> dump <107192> /some/path/to/dumpfile
>
> and maybe we can see what's wrong with this inode.  If it's the only one
> then perhaps it can be nuked w/ debugfs and fsck will continue.
>
> -Eric

debugfs is new to me, have you some docs for me to read?

* Re: Massive filesystem corruption
  2008-12-21  2:05       ` Matteo Croce
@ 2008-12-21  3:23         ` Eric Sandeen
  2008-12-21  5:09         ` Nick Dokos
  2008-12-26  3:57         ` Theodore Tso
  2 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2008-12-21  3:23 UTC (permalink / raw)
  To: Matteo Croce; +Cc: linux-ext4

Matteo Croce wrote:

> debugfs is new to me, have you some docs for me to read?

sure, "man debugfs"

-Eric

* Re: Massive filesystem corruption
  2008-12-21  2:05       ` Matteo Croce
  2008-12-21  3:23         ` Eric Sandeen
@ 2008-12-21  5:09         ` Nick Dokos
  2008-12-26  3:57         ` Theodore Tso
  2 siblings, 0 replies; 11+ messages in thread
From: Nick Dokos @ 2008-12-21  5:09 UTC (permalink / raw)
  To: Matteo Croce; +Cc: Eric Sandeen, linux-ext4

Matteo Croce <technoboy85@gmail.com> wrote:

> ...
> debugfs is new to me, have you some docs for me to read?
> 

The debugfs man page gives summary descriptions of all the commands. I
am not aware of any other documentation. If for some reason the man page
is not installed locally, you can try e.g.

    http://linux.die.net/man/8/debugfs

* Re: Massive filesystem corruption
  2008-12-21  2:05       ` Matteo Croce
  2008-12-21  3:23         ` Eric Sandeen
  2008-12-21  5:09         ` Nick Dokos
@ 2008-12-26  3:57         ` Theodore Tso
  2 siblings, 0 replies; 11+ messages in thread
From: Theodore Tso @ 2008-12-26  3:57 UTC (permalink / raw)
  To: Matteo Croce, Paul Collins; +Cc: linux-ext4

On Sun, Dec 21, 2008 at 03:05:49AM +0100, Matteo Croce wrote:
> > > Pass 1: Checking inodes, blocks, and sizes
> > > Error1: Corrupt extent header on inode 107192
> > > [New Thread 0xb7e46700 (LWP 12878)]

The following patch to e2fsprogs will fix e2fsck's inability to deal
with a corrupted interior node in the extent tree.  It will be in the
next maintenance release of e2fsprogs, and it should address the
problem you've pointed out.

Regards,

						- Ted


commit 7518c176867099eb529502103106501861a71280
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Thu Dec 25 22:42:38 2008 -0500

    e2fsck: Fix an unhandled corruption case in scan_extent_node()
    
    A corrupted interior node in an extent tree would cause e2fsck to
    crash with the error message:
    
    Error1: Corrupt extent header on inode 107192
    Aborted (core dumped)
    
    Handle this and related failures when scanning an inode's extent tree
    more robustly.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c
index 2619272..04aeb26 100644
--- a/e2fsck/pass1.c
+++ b/e2fsck/pass1.c
@@ -1655,6 +1655,7 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 			problem = PR_1_EXTENT_ENDS_BEYOND;
 
 		if (problem) {
+		report_problem:
 			pctx->blk = extent.e_pblk;
 			pctx->blk2 = extent.e_lblk;
 			pctx->num = extent.e_len;
@@ -1662,11 +1663,7 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 				pctx->errcode =
 					ext2fs_extent_delete(ehandle, 0);
 				if (pctx->errcode) {
-					fix_problem(ctx,
-						    PR_1_EXTENT_DELETE_FAIL,
-						    pctx);
-					/* Should never get here */
-					ctx->flags |= E2F_FLAG_ABORT;
+					pctx->str = "ext2fs_extent_delete";
 					return;
 				}
 				pctx->errcode = ext2fs_extent_get(ehandle,
@@ -1682,23 +1679,27 @@ static void scan_extent_node(e2fsck_t ctx, struct problem_context *pctx,
 		}
 
 		if (!is_leaf) {
-			mark_block_used(ctx, extent.e_pblk);
-			pb->num_blocks++;
+			blk = extent.e_pblk;
 			pctx->errcode = ext2fs_extent_get(ehandle,
 						  EXT2_EXTENT_DOWN, &extent);
 			if (pctx->errcode) {
-				printf("Error1: %s on inode %u\n",
-					error_message(pctx->errcode), pctx->ino);
-				abort();
+				pctx->str = "EXT2_EXTENT_DOWN";
+				problem = PR_1_EXTENT_HEADER_INVALID;
+				if (pctx->errcode == EXT2_ET_EXTENT_HEADER_BAD)
+					goto report_problem;
+				return;
 			}
 			scan_extent_node(ctx, pctx, pb, extent.e_lblk, ehandle);
+			if (pctx->errcode)
+				return;
 			pctx->errcode = ext2fs_extent_get(ehandle,
 						  EXT2_EXTENT_UP, &extent);
 			if (pctx->errcode) {
-				printf("Error1: %s on inode %u\n",
-					error_message(pctx->errcode), pctx->ino);
-				abort();
+				pctx->str = "EXT2_EXTENT_UP";
+				return;
 			}
+			mark_block_used(ctx, blk);
+			pb->num_blocks++;
 			goto next;
 		}
 
@@ -1780,7 +1781,14 @@ static void check_blocks_extents(e2fsck_t ctx, struct problem_context *pctx,
 	}
 
 	scan_extent_node(ctx, pctx, pb, 0, ehandle);
-
+	if (pctx->errcode &&
+	    fix_problem(ctx, PR_1_EXTENT_ITERATE_FAILURE, pctx)) {
+		pb->num_blocks = 0;
+		inode->i_blocks = 0;
+		e2fsck_clear_inode(ctx, ino, inode, E2F_FLAG_RESTART,
+				   "check_blocks_extents");
+		pctx->errcode = 0;
+	}
 	ext2fs_extent_free(ehandle);
 }
 
diff --git a/e2fsck/problem.c b/e2fsck/problem.c
index 19e8719..9cb3094 100644
--- a/e2fsck/problem.c
+++ b/e2fsck/problem.c
@@ -823,10 +823,11 @@ static struct e2fsck_problem problem_table[] = {
 	  N_("Error while reading over @x tree in @i %i: %m\n"),
 	  PROMPT_CLEAR_INODE, 0 },
 
-	/* Error deleting a bogus extent */
-	{ PR_1_EXTENT_DELETE_FAIL,
-	  N_("Error while deleting extent: %m\n"),
-	  PROMPT_ABORT, 0 },
+	/* Failure to iterate extents */
+	{ PR_1_EXTENT_ITERATE_FAILURE,
+	  N_("Failed to iterate extents in @i %i\n"
+	     "\t(op %s, blk %b, lblk %c): %m\n"),
+	  PROMPT_CLEAR_INODE, 0 },
 
 	/* Bad starting block in extent */
 	{ PR_1_EXTENT_BAD_START_BLK,
@@ -863,6 +864,10 @@ static struct e2fsck_problem problem_table[] = {
 	  N_("@i %i has out of order extents\n\t(@n logical @b %c, physical @b %b, len %N)\n"),
 	  PROMPT_CLEAR, 0 },
 
+	{ PR_1_EXTENT_HEADER_INVALID,
+	  N_("@i %i has an invalid extent node (blk %b, lblk %c)\n"),
+	  PROMPT_CLEAR, 0 },
+
 	/* Pass 1b errors */
 
 	/* Pass 1B: Rescan for duplicate/bad blocks */
diff --git a/e2fsck/problem.h b/e2fsck/problem.h
index 815b37c..1cb054c 100644
--- a/e2fsck/problem.h
+++ b/e2fsck/problem.h
@@ -479,8 +479,8 @@ struct problem_context {
 /* Error while reading extent tree */
 #define PR_1_READ_EXTENT		0x010056
 
-/* Error deleting a bogus extent */
-#define PR_1_EXTENT_DELETE_FAIL		0x010057
+/* Failure to iterate extents */
+#define PR_1_EXTENT_ITERATE_FAILURE	0x010057
 
 /* Bad starting block in extent */
 #define PR_1_EXTENT_BAD_START_BLK	0x010058
@@ -503,6 +503,9 @@ struct problem_context {
 /* Extents are out of order */
 #define PR_1_OUT_OF_ORDER_EXTENTS	0x01005E
 
+/* Extent node header invalid */
+#define PR_1_EXTENT_HEADER_INVALID	0x01005F
+
 /*
  * Pass 1b errors
  */
diff --git a/lib/ext2fs/extent.c b/lib/ext2fs/extent.c
index 929e5cd..5545a94 100644
--- a/lib/ext2fs/extent.c
+++ b/lib/ext2fs/extent.c
@@ -441,8 +441,10 @@ retry:
 		eh = (struct ext3_extent_header *) newpath->buf;
 
 		retval = ext2fs_extent_header_verify(eh, handle->fs->blocksize);
-		if (retval)
+		if (retval) {
+			handle->level--;
 			return retval;
+		}
 
 		newpath->left = newpath->entries =
 			ext2fs_le16_to_cpu(eh->eh_entries);

* Re: (resend) extent header problems following shrink with resize2fs
  2008-12-23  6:18 ` Theodore Tso
  2008-12-20 18:14   ` Massive filesystem corruption Matteo Croce
@ 2008-12-25  7:18   ` Paul Collins
  2008-12-25 13:09     ` Theodore Tso
  2008-12-26  4:14     ` Theodore Tso
  1 sibling, 2 replies; 11+ messages in thread
From: Paul Collins @ 2008-12-25  7:18 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-ext4

Theodore Tso <tytso@MIT.EDU> writes:
> Yeah, resize2fs needs to be fixed to handle extents correctly.  At the
> moment it can screw them up pretty badly.

In the meantime, perhaps something like the patch below is appropriate?

> I'll log this as a bug to resize2fs; thanks for reporting it, and I
> hope you didn't suffer any permanent data loss.

No worries there, that was replica N+1 of those particular files.

My real concern, which I didn't highlight well and buried way down in my
original report to boot, was e2fsck blowing up like it did.  Hardware
being what it is, I imagine at some point extent headers will get
corrupted, and losing one file is of course preferable to losing the
entire filesystem.

For reference, here's that backtrace again.

        /dev/sdb1 contains a file system with errors, check forced.
        Pass 1: Checking inodes, blocks, and sizes
        Error1: Corrupt extent header on inode 38979
        [New Thread 0x7fe15e066740 (LWP 24166)]

        Program received signal SIGABRT, Aborted.
        [Switching to Thread 0x7fe15e066740 (LWP 24166)]
        0x00007fe15d0fbed5 in raise () from /lib/libc.so.6
        (gdb) bt
        #0  0x00007fe15d0fbed5 in raise () from /lib/libc.so.6
        #1  0x00007fe15d0fd3f3 in abort () from /lib/libc.so.6
        #2  0x000000000040bdae in scan_extent_node (ctx=0x24c6f70, 
            pctx=0x7fff6607c7a0, pb=0x7fff6607c5f0, start_block=0, ehandle=0x2ed94d0)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1700
        #3  0x000000000040cc1d in check_blocks (ctx=0x24c6f70, pctx=0x7fff6607c7a0, 
            block_buf=0x2ec11a0 "�002")
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1773
        #4  0x000000000040e063 in e2fsck_pass1 (ctx=0x24c6f70)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/pass1.c:1030
        #5  0x00000000004089e8 in e2fsck_run (ctx=0x24c6f70)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/e2fsck.c:215
        #6  0x00000000004074a3 in main (argc=<value optimized out>, 
            argv=<value optimized out>)
            at /build/buildd/e2fsprogs-1.41.3/e2fsck/unix.c:1278


diff --git a/resize/main.c b/resize/main.c
index 3de333e..fb4fa99 100644
--- a/resize/main.c
+++ b/resize/main.c
@@ -426,6 +426,13 @@ int main (int argc, char ** argv)
 			"long.  Nothing to do!\n\n"), new_size);
 		exit(0);
 	}
+	if ((new_size < fs->super->s_blocks_count) &&
+	    (fs->super->s_feature_incompat & EXT3_FEATURE_INCOMPAT_EXTENTS)) {
+		fprintf(stderr, _("Reducing the size of a "
+				  "filesystem with extents enabled\n"
+				  "is currently not supported.\n"));
+		exit(1);
+	}
 	if (mount_flags & EXT2_MF_MOUNTED) {
 		retval = online_resize_fs(fs, mtpt, &new_size, flags);
 	} else {


-- 
Paul Collins
Wellington, New Zealand

Dag vijandelijk luchtschip de huismeester is dood

* Re: (resend) extent header problems following shrink with resize2fs
  2008-12-25  7:18   ` (resend) extent header problems following shrink with resize2fs Paul Collins
@ 2008-12-25 13:09     ` Theodore Tso
  2008-12-26  4:14     ` Theodore Tso
  1 sibling, 0 replies; 11+ messages in thread
From: Theodore Tso @ 2008-12-25 13:09 UTC (permalink / raw)
  To: Paul Collins; +Cc: linux-ext4

On Thu, Dec 25, 2008 at 08:18:48PM +1300, Paul Collins wrote:
> My real concern, which I didn't highlight well and buried way down in my
> original report to boot, was e2fsck blowing up like it did.  Hardware
> being what it is, I imagine at some point extent headers will get
> corrupted, and losing one file is of course preferable to losing the
> entire filesystem.

Yeah, I know about that problem.  It was highlighted recently but what
with the end of the year coming up I haven't had a chance to fix it
yet.  It's an embarrassing oversight on my part; I didn't notice that I
failed to handle this case because it happens relatively rarely that
an extent tree has a depth >= 2 in the first place, since this error
only happens when a non-leaf interior node gets corrupted.  I had
left it as a "we'll handle this later" case, and I never got back to
it.  The short-term workaround is simply to use debugfs and use the
clri function:

debugfs -w /dev/sdb1
debugfs: clri <38979>
debugfs: quit

... and then run e2fsck.  I'll get this fixed in the next maintenance
release of e2fsprogs, though, which will be out soon.  We have a few
ext4 related problems that I really need to get fixed and out the
door.

	   	      	      	    	    	- Ted

* Re: (resend) extent header problems following shrink with resize2fs
  2008-12-25  7:18   ` (resend) extent header problems following shrink with resize2fs Paul Collins
  2008-12-25 13:09     ` Theodore Tso
@ 2008-12-26  4:14     ` Theodore Tso
  1 sibling, 0 replies; 11+ messages in thread
From: Theodore Tso @ 2008-12-26  4:14 UTC (permalink / raw)
  To: Paul Collins; +Cc: linux-ext4

On Thu, Dec 25, 2008 at 08:18:48PM +1300, Paul Collins wrote:
> Theodore Tso <tytso@MIT.EDU> writes:
> > Yeah, resize2fs needs to be fixed to handle extents correctly.  At the
> > moment it can screw them up pretty badly.
> 
> In the meantime, perhaps something like the patch below is appropriate?

Actually, I think the following patch should fix things up nicely.  I
need to create a test case so I can be sure this fixes the problem,
but I think this should address the root cause of the problem you
reported.

						- Ted

diff --git a/resize/resize2fs.c b/resize/resize2fs.c
index abe05f5..65398a6 100644
--- a/resize/resize2fs.c
+++ b/resize/resize2fs.c
@@ -1188,6 +1188,16 @@ static int process_block(ext2_filsys fs, blk_t	*block_nr,
 	return ret;
 }
 
+static int process_block_ind(ext2_filsys fs, blk_t *block_nr,
+			     e2_blkcnt_t blockcnt, blk_t ref_block, 
+			     int ref_offset, void *priv_data)
+{
+	if (blockcnt >= 0)
+		return 0;
+	return process_block(fs, block_nr, blockcnt, ref_block, ref_offset,
+			     priv_data);
+}
+
 /*
  * Progress callback
  */
@@ -1302,6 +1312,18 @@ static errcode_t inode_scan_and_fix(ext2_resize_t rfs)
 		if (ext2fs_inode_has_valid_blocks(inode) &&
 		    (rfs->bmap || pb.is_dir)) {
 			pb.ino = ino;
+			if (inode->i_flags & EXT4_EXTENTS_FL) {
+				/*
+				 * With extent-based files, we have
+				 * to translate all of the interior
+				 * node blocks first.
+				 */
+				retval = ext2fs_block_iterate2(rfs->old_fs,
+						ino, 0, block_buf,
+						process_block_ind, &pb);
+				if (retval)
+					goto errout;
+			}
 			retval = ext2fs_block_iterate2(rfs->old_fs,
 						       ino, 0, block_buf,
 						       process_block, &pb);
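
Reading the patch, the idea appears to be a two-pass walk: the wrapper returns early for any block whose count is >= 0, so the first pass touches only blocks reported with a negative count (the interior nodes, per the patch's own comment), and the existing second pass then handles everything. The toy model below sketches that filtering pattern outside libext2fs. Everything in it is hypothetical: the `visit` array stands in for the block iterator, and `relocate` stands in for the real relocation callback.

```c
#include <stddef.h>

typedef long long blkcnt_t;   /* stand-in for e2_blkcnt_t */
typedef int (*block_fn)(unsigned *blk, blkcnt_t blockcnt, void *priv);

struct visit {
    blkcnt_t blockcnt;        /* negative: metadata; >= 0: data block */
    unsigned blk;             /* physical block number */
};

/* Toy iterator: call fn on each block, return how many it changed. */
int iterate(struct visit *v, size_t n, block_fn fn, void *priv)
{
    int changed = 0;

    for (size_t i = 0; i < n; i++)
        changed += fn(&v[i].blk, v[i].blockcnt, priv);
    return changed;
}

/* Toy relocation: move any block at or above the limit down by it. */
int relocate(unsigned *blk, blkcnt_t blockcnt, void *priv)
{
    unsigned limit = *(unsigned *)priv;

    (void)blockcnt;
    if (*blk < limit)
        return 0;
    *blk -= limit;
    return 1;
}

/* Same shape as the patch's wrapper: skip data, forward metadata. */
int relocate_metadata_only(unsigned *blk, blkcnt_t blockcnt, void *priv)
{
    if (blockcnt >= 0)
        return 0;
    return relocate(blk, blockcnt, priv);
}
```

Calling iterate() first with relocate_metadata_only and then with relocate mirrors the order in the patch: metadata locations are settled before the data blocks that hang off them are processed.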
