Once more: Recovering a damaged ext4 fs?

linux-ext4.vger.kernel.org archive mirror

* Once more: Recovering a damaged ext4 fs?
@ 2009-03-27 20:41 J.D. Bakker
  2009-03-27 22:46 ` Theodore Tso
  0 siblings, 1 reply; 9+ messages in thread
From: J.D. Bakker @ 2009-03-27 20:41 UTC
  To: linux-ext4

Hi all,

My 4TB ext4 RAID-6 has been damaged again. Symptoms leading up to it 
were very similar to the last time (see 
http://article.gmane.org/gmane.comp.file-systems.ext4/11418 ): a 
process attempted to delete a large (~2GB) file, resulting in a soft 
lockup with the following call trace:

  [<ffffffff80526dd7>] ? _spin_lock+0x16/0x19
  [<ffffffff80317b49>] ? ext4_mb_init_cache+0x81c/0xa58
  [<ffffffff80281249>] ? __lru_cache_add+0x8e/0xb6
  [<ffffffff80279d37>] ? find_or_create_page+0x62/0x88
  [<ffffffff80317ec2>] ? ext4_mb_load_buddy+0x13d/0x326
  [<ffffffff80318385>] ? ext4_mb_free_blocks+0x2da/0x75e
  [<ffffffff802c02d7>] ? __find_get_block+0xc6/0x1bc
  [<ffffffff802feebb>] ? ext4_free_blocks+0x7f/0xb2
  [<ffffffff8031294b>] ? ext4_ext_truncate+0x3e3/0x854
  [<ffffffff80306e38>] ? ext4_truncate+0x67/0x5bd
  [<ffffffff8032594e>] ? jbd2_journal_dirty_metadata+0x124/0x146
  [<ffffffff80314d44>] ? __ext4_handle_dirty_metadata+0xac/0xb7
  [<ffffffff803024c1>] ? ext4_mark_iloc_dirty+0x432/0x4a9
  [<ffffffff80303177>] ? ext4_mark_inode_dirty+0x135/0x166
  [<ffffffff803074e0>] ? ext4_delete_inode+0x152/0x22e
  [<ffffffff8030738e>] ? ext4_delete_inode+0x0/0x22e
  [<ffffffff802b44ac>] ? generic_delete_inode+0x82/0x109
  [<ffffffff802acd44>] ? do_unlinkat+0xf7/0x150
  [<ffffffff802a380c>] ? vfs_read+0x11e/0x133
  [<ffffffff80527545>] ? page_fault+0x25/0x30
  [<ffffffff8020c0ea>] ? system_call_fastpath+0x16/0x1

Kernel is 2.6.29-rc6. Machine is still responsive to anything that 
doesn't touch the ext4 file system, but fails to halt. Upon power 
cycling fsck fails with:

  newraidfs: Superblock has an invalid ext3 journal (inode 8).
  CLEARED.
  *** ext3 journal has been deleted - filesystem is now ext2 only ***

  newraidfs: Note: if several inode or block bitmap blocks or part
  of the inode table require relocation, you may wish to try
  running e2fsck with the '-b 32768' option first.  The problem
  may lie only with the primary block group descriptors, and
  the backup block group descriptors may be OK.

  newraidfs: Block bitmap for group 0 is not in group.  (block 3273617603)

  newraidfs: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
  	(i.e., without -a or -p options)

A manual e2fsck -nv /dev/md0 reported:

  e2fsck 1.41.4 (27-Jan-2009)
  ./e2fsck/e2fsck: Group descriptors look bad... trying backup blocks...
  Block bitmap for group 0 is not in group.  (block 3273617603)
  Relocate? no
  Inode bitmap for group 0 is not in group.  (block 3067860682)
  Relocate? no
  Inode table for group 0 is not in group.  (block 3051956899)
  WARNING: SEVERE DATA LOSS POSSIBLE.
  Relocate? no
  Group descriptor 0 checksum is invalid.  Fix? no
  Inode table for group 1 is not in group.  (block 1842273247)
  WARNING: SEVERE DATA LOSS POSSIBLE.
  Relocate? no
  Group descriptor 1 checksum is invalid.  Fix? no
  Inode bitmap for group 2 is not in group.  (block 3148026909)
  Relocate? no
  Inode table for group 2 is not in group.  (block 1321535690)
  WARNING: SEVERE DATA LOSS POSSIBLE.
  Relocate? no
  Group descriptor 2 checksum is invalid.  Fix? no
  [...]
  ./e2fsck/e2fsck: Invalid argument while reading bad blocks inode
  This doesn't bode well, but we'll try to go on...
  Pass 1: Checking inodes, blocks, and sizes
  Illegal block number passed to ext2fs_test_block_bitmap #3051956899 for in-use block map
  Illegal block number passed to ext2fs_mark_block_bitmap #3051956899 for in-use block map
  Illegal block number passed to ext2fs_test_block_bitmap #3051956900 for in-use block map
  Illegal block number passed to ext2fs_mark_block_bitmap #3051956900 for in-use block map
  [...]

Full logs available at:
   http://lartmaker.nl/ext4/e2fsck-md0-20090327.txt
   http://lartmaker.nl/ext4/e2fsck-md0-32768-20090327.txt
   http://lartmaker.nl/ext4/e2fsck-md0-98304-20090327.txt

I've run dumpe2fs:
   http://lartmaker.nl/ext4/dumpe2fs-md0-20090327.txt
   http://lartmaker.nl/ext4/dumpe2fs-md0-32768-20090327.txt
   http://lartmaker.nl/ext4/dumpe2fs-md0-98304-20090327.txt
...but it worries me that all three start with "ext2fs_read_bb_inode: 
Invalid argument".

This is linux-2.6.29-rc6 (x86_64) running on an Intel Core i7 920 
processor (quad core plus hyperthreading). Kernel config is 
http://lartmaker.nl/ext4/kernel-config-20090327.txt ; dmesg is at 
http://lartmaker.nl/ext4/dmesg-20090327.txt

So,
- is there a way to recover my file system? I do have backups of most 
data, but as my remote weeklies run on Saturdays I'd still lose a lot 
of work
- is ext4 on software raid-6 on x86_64 considered production stable? 
I have been getting these hangs almost monthly, which is a lot worse 
than my old ext3 software RAID.

Thanks,

JDB.

-- 
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-27 20:41 Once more: Recovering a damaged ext4 fs? J.D. Bakker
@ 2009-03-27 22:46 ` Theodore Tso
  2009-03-27 23:47   ` J.D. Bakker
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2009-03-27 22:46 UTC
  To: J.D. Bakker; +Cc: linux-ext4

On Fri, Mar 27, 2009 at 09:41:21PM +0100, J.D. Bakker wrote:
> Hi all,
>
> My 4TB ext4 RAID-6 has been damaged again. Symptoms leading up to it  
> were very similar to the last time (see  
> http://article.gmane.org/gmane.comp.file-systems.ext4/11418 ): a process 
> attempted to delete a large (~2GB) file, resulting in a soft lockup with 
> the following call trace:
>
>  [<ffffffff80526dd7>] ? _spin_lock+0x16/0x19
>  [<ffffffff80317b49>] ? ext4_mb_init_cache+0x81c/0xa58
>  [<ffffffff80281249>] ? __lru_cache_add+0x8e/0xb6
>  [<ffffffff80279d37>] ? find_or_create_page+0x62/0x88
>  [<ffffffff80317ec2>] ? ext4_mb_load_buddy+0x13d/0x326
>  [<ffffffff80318385>] ? ext4_mb_free_blocks+0x2da/0x75e

Thanks, we've been trying to track this down.  The hint that you were
trying to delete a large (~2 GB) file may be what I need to reproduce
it locally.

If it happens again, could you try doing this:

   echo w > /proc/sysrq-trigger
   dmesg > /tmp/dmesg.txt

And send the output of dmesg.txt to us?  

> Kernel is 2.6.29-rc6. Machine is still responsive to anything that  
> doesn't touch the ext4 file system, but fails to halt. Upon power  
> cycling fsck fails with:
>
>  newraidfs: Superblock has an invalid ext3 journal (inode 8).
>  CLEARED.
>  *** ext3 journal has been deleted - filesystem is now ext2 only ***
>
>  newraidfs: Note: if several inode or block bitmap blocks or part
>  of the inode table require relocation, you may wish to try
>  running e2fsck with the '-b 32768' option first.  The problem
>  may lie only with the primary block group descriptors, and
>  the backup block group descriptors may be OK.
>
>  newraidfs: Block bitmap for group 0 is not in group.  (block 3273617603)

It's rather disturbing that there was this much damage done from what
looks like a deadlock condition.  Others who have reported this soft
lockup condition haven't reported this kind of filesystem damage.  I
wonder if it might be caused by power-cycling the box; if possible, I
do recommend that people use the reset button rather than power
cycling the box; it tends to be much safer and gentler on the machine.

>  e2fsck 1.41.4 (27-Jan-2009)
>  ./e2fsck/e2fsck: Group descriptors look bad... trying backup blocks...
>  Block bitmap for group 0 is not in group.  (block 3273617603)
>  Relocate? no
>  Inode bitmap for group 0 is not in group.  (block 3067860682)
>  Relocate? no
>  Inode table for group 0 is not in group.  (block 3051956899)
>  WARNING: SEVERE DATA LOSS POSSIBLE.

I really don't know how to explain the fact that your primary and
backup superblocks are getting corrupted.  This is a real puzzler for
me.  As I think I've told you before, the kernel simply doesn't know
how to write to the backup superblocks.

> - is there a way to recover my file system? I do have backups of most  
> data, but as my remote weeklies run on Saturdays I'd still lose a lot of 
> work

Well, probably the best bet at this point is to use "mke2fs -S"; see
the man pages for more details.  You need to make sure you give
exactly the same arguments to mke2fs that you used when you first
created the filesystem.  The mke2fs.conf also needs to be exactly the
same as when the filesystem was originally created.

Given that your system seems to have this predilection to wipe out the
first part of your block group descriptors, what I would recommend is
backing up your block group descriptors like this:

	dd if=/dev/XXXX of=backup-bg.img bs=4k count=234

This will back up just your block group descriptors, and will allow you
to restore them later (although you will have to run e2fsck restoring
them).
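
For reference, count=234 is consistent with a filesystem of roughly
4 TB: one block that carries the superblock, followed by the primary
group descriptor table.  The sketch below reproduces that arithmetic.
It assumes 4k blocks, 32768 blocks per group and 32-byte group
descriptors; the descriptor size and the helper name are assumptions
made for illustration and are not stated in this thread, so the real
values should be read from dumpe2fs rather than taken from here.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of e2fsprogs): number of leading
 * filesystem blocks that hold the superblock plus the primary group
 * descriptor table.
 */
uint64_t gdt_backup_blocks(uint64_t fs_bytes, uint32_t block_size,
                           uint32_t blocks_per_group, uint32_t desc_size)
{
    /* Total blocks in the filesystem. */
    uint64_t blocks = fs_bytes / block_size;

    /* Number of block groups, rounding up for a partial last group. */
    uint64_t groups = (blocks + blocks_per_group - 1) / blocks_per_group;

    /* Descriptors that fit into one block (assumed 32 bytes each). */
    uint64_t descs_per_block = block_size / desc_size;

    /* Blocks needed for the descriptor table, rounding up. */
    uint64_t gdt_blocks = (groups + descs_per_block - 1) / descs_per_block;

    /* Block 0 carries the superblock when the block size is 4k. */
    return 1 + gdt_blocks;
}
```

Taking "4TB" as 4000000000000 bytes, the call
gdt_backup_blocks(4000000000000ULL, 4096, 32768, 32) works out to
29803 groups, 233 descriptor blocks, and a result of 234.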

The bigger question is how 16 4k blocks between block numbers 1 and 17
are getting overwritten by garbage.  As I mentioned, I haven't seen
anything like this except from your system.  Some others have reported
a soft lockup when doing an "rm -rf" of a large hierarchy, but they
haven't reported this kind of filesystem corruption.  I haven't been
able to replicate it yet myself.  
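
One way to see that the descriptor contents are garbage rather than
merely stale: with 4k blocks, a filesystem of about 4 TB has fewer
than a billion blocks, so a bitmap location such as 3273617603 points
far past the end of the device.  A minimal sketch of that kind of
range test follows.  The function name and the exact size of
4000000000000 bytes are assumptions for illustration, and this is not
e2fsck's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can block number blk exist at all on a
 * filesystem of fs_bytes bytes with the given block size?
 */
bool block_within_fs(uint64_t blk, uint64_t fs_bytes, uint32_t block_size)
{
    uint64_t total_blocks = fs_bytes / block_size;

    /* Valid block numbers run from 0 to total_blocks - 1. */
    return blk < total_blocks;
}
```

For 4000000000000 bytes and 4096-byte blocks the limit is 976562500,
so 3273617603, 3067860682 and 3051956899 from the e2fsck output all
fail this test.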

> - is ext4 on software raid-6 on x86_64 considered production stable? I 
> have been getting these hangs almost monthly, which is a lot worse than 
> my old ext3 software RAID.

Well, the softlockup bug you're seeing is a real one.  A lot of people
aren't seeing it, but you clearly are seeing it, and so we need to
track it down.  I guess by definition, the fact that you're seeing
this bug means it's not "production stable".  On the other hand, a lot
of people have been using ext4 without seeing this bug, some of them
in production situations.  The criteria for "production stable" are a
little grey; certainly no enterprise distribution is calling ext4
"production stable" yet, although it's been released as a technology
preview by some distros.  The problem is that a lot of these problems
can only be found when it starts getting tested by a large userbase,
so this kind of early testing is critical.  

That being said, I don't want to see early testers losing data, since
that tends to scare them off from providing the testing that we so
critically need.  Hence my suggestion of using dd to back up the block
group descriptor blocks. 

And if you're not willing to take the risk, I'll completely understand
your deciding that you need to switch back to ext3.  But if you are
willing to continue testing, and helping us find the root cause
of the problem, we will be very grateful.

Best regards,

						- Ted

P.S.  You were using a completely stock kernel, correct?  No other
patches installed?

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-27 22:46 ` Theodore Tso
@ 2009-03-27 23:47   ` J.D. Bakker
  2009-03-28  4:06     ` Theodore Tso
  2009-03-28 12:30     ` Theodore Tso
  0 siblings, 2 replies; 9+ messages in thread
From: J.D. Bakker @ 2009-03-27 23:47 UTC
  To: Theodore Tso; +Cc: linux-ext4

At 18:46 -0400 27-03-2009, Theodore Tso wrote:
>Thanks, we've been trying to track this down.  The hint that you were
>trying to delete a large (~2 GB) file may be what I need to reproduce
>it locally.

For the record, I've had it (=the soft lockup) happen to me six times 
now since I built the box last December. Three times the machine 
would reboot/fsck without any issues, one time it had errors which 
were fixable with fsck -y, and this is the second time I've gotten 
"WARNING: SEVERE DATA LOSS POSSIBLE". Kernels involved were 2.6.28, 
2.6.28.4 and 2.6.29-rc6, no patches, almost identical .config (ie: 
upgraded through 'make oldconfig').

All six times the process experiencing the lockup was trying to 
delete a file no smaller than 700MB which had just been read from. 
This time the process was mythtranscode (which had obviously just 
read the entire file), the last time it was an rm on a movie I'd just 
finished watching.

>If it happens again, could you try doing this:
>
>    echo w > /proc/sysrq-trigger
>    dmesg > /tmp/dmesg.txt
>
>And send the output of dmesg.txt to us?

Will do.

>It's rather disturbing that there was this much damage done from what
>looks like a deadlock condition.  Others who have reported this soft
>lockup condition haven't reported this kind of filesystem damage.  I
>wonder if it might be caused by power-cycling the box; if possible, I
>do recommend that people use the reset button rather than power
>cycling the box; it tends to be much safer and gentler on the machine.

ACK. I have this nagging feeling that this time the damage was more 
extensive because I waited only a few minutes before power cycling; 
my last soft lockup was last Tuesday, and then I waited about half an 
hour before reaching for the power button.

[I have gotten into the habit of power cycling vs resets, as my two 
ivtv TV grabber cards sometimes fail to re-init their firmware after 
anything other than a cold boot following a minute of power-off]

>Given that your system seems to have this predilection to wipe out the
>first part of your block group descriptors, what I would recommend is
>backing up your block group descriptors like this:
>
>	dd if=/dev/XXXX of=backup-bg.img bs=4k count=234
>
>This will backup just your block group descriptors, and will allow you
>to restore them later (although you will have to run e2fsck restoring
>them).

Thanks, will add that to my nightly backup. The last sentence should 
read "...run e2fsck *after* restoring them", right?

>The bigger question is how 16 4k blocks between block numbers 1 and 17
>are getting overwritten by garbage.  As I mentioned, I haven't seen
>anything like this except from your system.  Some others have reported
>a soft lockup when doing an "rm -rf" of a large hierarchy, but they
>haven't reported this kind of filesystem corruption.  I haven't been
>able to replicate it yet myself.

I have a few suspects, but no hard evidence beyond that. Two of the 
six drives in my (linux) software RAID-6 hang off a Marvell SATA/SAS 
RAID controller. Support for that chip (mvsas) is very recent, and 
I'll Google around to see if the BIOS has a habit of scribbling over 
data blocks. I pretty much never reboot the machine other than to get 
out of hangs, so it's not impossible that the soft lockups are a red 
herring.

As I mentioned before I am running the (closed) NVidia X drivers, but 
during none of the hangs have I done anything more challenging than 
watching xterms under fvwm. Other than that the entire setup (CPU, MB 
et al) is reasonably bleeding edge, but I don't see why that should 
manifest itself in this particular way (as opposed to, say, video 
glitches or compiler SIG11s).

>And if you're not willing to take the risk, I'll completely understand
>your deciding that you need to switch back to ext3.  But if you are
>willing to continue testing, and helping us find the root cause
>of the problem, we will be very grateful.

I'd prefer to stay with ext4, as its benefits make sense for the 
simulations I'm running. The downside is that this is my main 
home/office server and MythTV backend; not only is restoring from 
backup tedious, but I'll also have to explain to my SO that the RAID 
ate her shows.

>P.S.  You were using a completely stock kernel, correct?  No other
>patches installed?

Yes.

Thanks,

JDB.
[off to read up on mke2fs -S]
-- 
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-27 23:47   ` J.D. Bakker
@ 2009-03-28  4:06     ` Theodore Tso
  2009-03-28 12:30     ` Theodore Tso
  1 sibling, 0 replies; 9+ messages in thread
From: Theodore Tso @ 2009-03-28  4:06 UTC
  To: J.D. Bakker; +Cc: linux-ext4

This patch *might* solve your problem.  I can't be sure because I
haven't been able to reproduce the soft lockup problem when rm'ing a
file yet.  But if it's happening fairly often, it might be worth a
try; it definitely fixes a real bug in ext4 --- I'm just not sure it's
*your* bug.  :-)

						- Ted

commit 73cda61b58a060b6691791a44c01c16155617451
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Fri Mar 27 19:43:21 2009 -0400

    ext4: fix locking typo in mballoc which could cause soft lockup hangs
    
    Smatch (http://repo.or.cz/w/smatch.git/) complains about the locking in
    ext4_mb_add_n_trim() from fs/ext4/mballoc.c
    
      4438          list_for_each_entry_rcu(tmp_pa, &lg->lg_prealloc_list[order],
      4439                                                  pa_inode_list) {
      4440                  spin_lock(&tmp_pa->pa_lock);
      4441                  if (tmp_pa->pa_deleted) {
      4442                          spin_unlock(&pa->pa_lock);
      4443                          continue;
      4444                  }
    
    Brown paper bag time...
    
    Reported-by: Dan Carpenter <error27@gmail.com>
    Reviewed-by: Eric Sandeen <sandeen@redhat.com>
    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: stable@kernel.org

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 4f2f476..12d1081 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4389,7 +4389,7 @@ static void ext4_mb_add_n_trim(struct ext4_allocation_context *ac)
 						pa_inode_list) {
 		spin_lock(&tmp_pa->pa_lock);
 		if (tmp_pa->pa_deleted) {
-			spin_unlock(&pa->pa_lock);
+			spin_unlock(&tmp_pa->pa_lock);
 			continue;
 		}
 		if (!added && pa->pa_free < tmp_pa->pa_free) {
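
To make the one-word fix easier to see, here is a small user-space
model of the loop.  It is only a sketch: the toy_lock and toy_pa types
and all function names are invented for illustration, the "lock" is a
plain flag rather than a kernel spinlock, and the list is an array.
What it shows is that releasing pa's lock instead of tmp_pa's lock on
the skip path leaves tmp_pa locked after the continue, so the next
attempt to take that lock would spin.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; not the real ext4 structures. */
struct toy_lock {
    bool held;
};

struct toy_pa {
    struct toy_lock pa_lock;
    bool pa_deleted;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/*
 * Walk the list and skip deleted entries, loosely following the shape
 * of the loop in the patch.  With buggy == true the skip path releases
 * pa's lock (the typo); otherwise it releases tmp_pa's lock (the fix).
 * Returns how many list entries are still locked afterwards.
 */
size_t toy_scan(struct toy_pa *pa, struct toy_pa *list, size_t n, bool buggy)
{
    size_t leaked = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_pa *tmp_pa = &list[i];

        toy_lock_acquire(&tmp_pa->pa_lock);
        if (tmp_pa->pa_deleted) {
            toy_lock_release(buggy ? &pa->pa_lock : &tmp_pa->pa_lock);
            continue;
        }
        toy_lock_release(&tmp_pa->pa_lock);
    }

    /* Count the locks that were never released. */
    for (size_t i = 0; i < n; i++)
        if (list[i].pa_lock.held)
            leaked++;

    return leaked;
}
```

With a three-entry list whose middle entry has pa_deleted set, the
fixed variant leaves no lock held, while the buggy variant leaves the
middle entry's lock held.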

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-27 23:47   ` J.D. Bakker
  2009-03-28  4:06     ` Theodore Tso
@ 2009-03-28 12:30     ` Theodore Tso
  2009-03-28 12:53       ` J.D. Bakker
  1 sibling, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2009-03-28 12:30 UTC
  To: J.D. Bakker; +Cc: linux-ext4

On Sat, Mar 28, 2009 at 12:47:19AM +0100, J.D. Bakker wrote:
>
>> P.S.  You were using a completely stock kernel, correct?  No other
>> patches installed?
>
> Yes.
>

Hi J.D.,

Can you verify exactly what kernel version you are using?  This issue
is being discussed at:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/330824

... and one person has said it was fixed in 2.6.28-rc8, and at least
one, possibly two people have reported that it has been fixed in
2.6.29.  I think you're running some version of 2.6.28 or 2.6.28.y,
correct?

If these reports are correct, the problem may be fixed already in
mainline, but I'd like to figure out *which* patch made the problem go
away, so we can get it backported into various distribution kernels
and into the 2.6.28.* and possibly the 2.6.27.* stable series.

Thanks,

						- Ted

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-28 12:30     ` Theodore Tso
@ 2009-03-28 12:53       ` J.D. Bakker
  2009-03-28 13:09         ` Theodore Tso
  0 siblings, 1 reply; 9+ messages in thread
From: J.D. Bakker @ 2009-03-28 12:53 UTC
  To: Theodore Tso; +Cc: linux-ext4

At 08:30 -0400 28-03-2009, Theodore Tso wrote:
>Can you verify exactly what kernel version you are using? This issue
>is being discussed at:
>
>https://bugs.launchpad.net/ubuntu/+source/linux/+bug/330824
>
>... and one person has said it was fixed in 2.6.28-rc8, and at least
>one, possibly two people have reported that it has been fixed in
>2.6.29.  I think you're running some version of 2.6.28 or 2.6.28.y,
>correct?

Had the problem with 2.6.28, 2.6.28.4 and 2.6.29-rc6. After 
yesterday's crash I've upgraded to 2.6.29. For completeness' sake, my 
ext4 RAID had been created from scratch, not upgraded from ext3.

Should I still apply the oneliner you posted yesterday on 2.6.29?

In the meantime I've tried mkfs -S, which complained about "File 
exists while trying to create journal". fsck -y is running (has been 
for a few hours) and appears to cycle through

  Group xx inode table at yy conflicts with some other fs block. Relocate?
  [repeated enough times to overflow my xterm's scrollback buffer]
  Root inode is not a directory. Clear?

We'll see what I can fish out of the lost+found once it's done.

JDB.
-- 
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-28 12:53       ` J.D. Bakker
@ 2009-03-28 13:09         ` Theodore Tso
  2009-03-29 22:01           ` J.D. Bakker
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Tso @ 2009-03-28 13:09 UTC
  To: J.D. Bakker; +Cc: linux-ext4

On Sat, Mar 28, 2009 at 01:53:35PM +0100, J.D. Bakker wrote:
> At 08:30 -0400 28-03-2009, Theodore Tso wrote:
>> Can you verify exactly what kernel version you are using? This issue
>> is being discussed at:
>>
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/330824
>>...
>
> Had the problem with 2.6.28, 2.6.28.4 and 2.6.29-rc6. After yesterday's 
> crash I've upgraded to 2.6.29. For completeness' sake, my ext4 RAID had 
> been created from scratch, not upgraded from ext3.

Thanks, I've updated your information on the Launchpad bug site.  If
we can get independent confirmation, it appears the bug was fixed
sometime between 2.6.28-rc6 and 2.6.28-rc8.

> Should I still apply the oneliner you posted yesterday on 2.6.29?

Yeah, it's definitely a good fix to have in hand.

> In the meantime I've tried mkfs -S, this complained about "File exists 
> while trying to create journal". fsck -y is running (has been for a few 
> hours) and appears to cycle through

Oops, I need to fix mke2fs to handle this case better.  (I assume you
were doing "mke2fs -S -t ext4 /dev/XXX", or something like that,
right?)

You should be able to work around the "File exists..." error via this
command:

	debugfs -w /dev/XXXX -R "clri <8>"

... and then retrying the mke2fs -S command.

Regards,

					- Ted

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-28 13:09         ` Theodore Tso
@ 2009-03-29 22:01           ` J.D. Bakker
  2009-03-31 12:42             ` Theodore Tso
  0 siblings, 1 reply; 9+ messages in thread
From: J.D. Bakker @ 2009-03-29 22:01 UTC
  To: Theodore Tso; +Cc: linux-ext4

At 09:09 -0400 28-03-2009, Theodore Tso wrote:
>On Sat, Mar 28, 2009 at 01:53:35PM +0100, J.D. Bakker wrote:
>  > In the meantime I've tried mkfs -S, this complained about "File exists
>  > while trying to create journal". fsck -y is running (has been for a few
>  > hours) and appears to cycle through
>
>You should be able to work around the "File exists..." error via this
>command:
>
>	debugfs -w /dev/XXXX -R "clri <8>"
>
>... and then retrying the mke2fs -S command.

Tried that, gave somewhat unexpected results. I cancelled the running 
fsck, and issued 'debugfs -w /dev/md0 -R "clri <8>"'. This appeared 
to work, but when I retried the mkfs -S, I still got the "File exists 
while trying to create journal" error. I re-issued the debugfs 
command, which then failed with

   debugfs 1.41.4 (27-Jan-2009)
   /dev/md0: Bad magic number in super-block while opening filesystem

I have restarted the fsck (e2fsck -yv /dev/md0), but it appears to be 
stuck in a loop:

  e2fsck 1.41.4 (27-Jan-2009)
  ./e2fsck/e2fsck: Superblock invalid, trying backup blocks...
  Group descriptor 1 checksum is invalid.  Fix? yes
  Group descriptor 2 checksum is invalid.  Fix? yes
  [...]
  Group descriptor 27775 checksum is invalid.  Fix? yes
  Group descriptor 27941 checksum is invalid.  Fix? yes
  newraidfs contains a file system with errors, check forced.
  Pass 1: Checking inodes, blocks, and sizes
  Group 859's inode table at 3080346 conflicts with some other fs block.
  Relocate? yes
  Group 860's block bitmap at 33161701 conflicts with some other fs block.
  Relocate? yes
  [...]
  Group 25840's inode table at 846725656 conflicts with some other fs block.
  Relocate? yes
  Group 25840's inode table at 846725657 conflicts with some other fs block.
  Relocate? yes
  Root inode is not a directory.  Clear? yes
  [no output for a few minutes]
  Error allocating 1 contiguous block(s) in block group 175 for block bitmap: Could not allocate block in ext2 filesystem
  Error allocating 512 contiguous block(s) in block group 175 for inode table: Could not allocate block in ext2 filesystem
  Error allocating 1 contiguous block(s) in block group 769 for inode bitmap: Could not allocate block in ext2 filesystem
  [...]
  Error allocating 512 contiguous block(s) in block group 16353 for inode table: Could not allocate block in ext2 filesystem
  Error allocating 512 contiguous block(s) in block group 25840 for inode table: Could not allocate block in ext2 filesystem
  Restarting e2fsck from the beginning...
  ./e2fsck/e2fsck: Group descriptors look bad... trying backup blocks...
  Group descriptor 1 checksum is invalid.  Fix? yes

...and it starts all over again. I had left it running overnight; in 
the morning it had produced the exact same output 97 times. Over 
those runs the e2fsck process grew from a few hundred MB to 3GB (all 
of the RAM installed in the machine), and had pushed all other 
processes out to swap. Full log file is available at 
http://lartmaker.nl/ext4/e2fsck-md0-20090327-yv-2.txt . I have since 
killed e2fsck in the belief that if 97 passes weren't going to do it, 
number 98 would be unlikely to help much.

Is there anything else I can do? Before the crash the fs was ~66% 
full, so I'm not sure why e2fsck fails to allocate blocks.

Thanks,

JDB.
-- 
LART. 250 MIPS under one Watt. Free hardware design files.
http://www.lartmaker.nl/

* Re: Once more: Recovering a damaged ext4 fs?
  2009-03-29 22:01           ` J.D. Bakker
@ 2009-03-31 12:42             ` Theodore Tso
  0 siblings, 0 replies; 9+ messages in thread
From: Theodore Tso @ 2009-03-31 12:42 UTC
  To: J.D. Bakker; +Cc: linux-ext4

OK, here's a patch that should allow mke2fs -S to work.  It should be
applied against the e2fsprogs 1.41.4.

Sorry for the delay in getting this to you; things have been crazy
busy on my end.

						- Ted

commit a620baddee647faf42c49ee2e04ee3f667149d68
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Tue Mar 31 07:42:24 2009 -0400

    mke2fs: Don't try to create the journal in super-only mode
    
    Since we aren't initializing the inode table, creating the journal
    will just fail.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

diff --git a/misc/mke2fs.c b/misc/mke2fs.c
index e69e5ce..4f50ffa 100644
--- a/misc/mke2fs.c
+++ b/misc/mke2fs.c
@@ -2079,6 +2079,12 @@ int main (int argc, char *argv[])
 		    EXT3_FEATURE_COMPAT_HAS_JOURNAL)) {
 		journal_blocks = figure_journal_size(journal_size, fs);
 
+		if (super_only) {
+			printf(_("Skipping journal creation in super-only mode\n"));
+			fs->super->s_journal_inum = EXT2_JOURNAL_INO;
+			goto no_journal;
+		}
+
 		if (!journal_blocks) {
 			fs->super->s_feature_compat &=
 				~EXT3_FEATURE_COMPAT_HAS_JOURNAL;
