Re: xfs_repair 3.2.0 cannot (?) fix fs

From: "Arkadiusz Miśkiewicz" <arekm@maven.pl>
To: Dave Chinner <david@fromorbit.com>
Cc: Alex Elder <elder@kernel.org>, xfs@oss.sgi.com
Subject: Re: xfs_repair 3.2.0 cannot (?) fix fs
Date: Mon, 30 Jun 2014 07:36:24 +0200
Message-ID: <201406300736.24291.arekm@maven.pl>
In-Reply-To: <20140630031810.GC4453@dastard>

On Monday 30 of June 2014, Dave Chinner wrote:
> [Compendium reply to all 3 emails]
> 
> On Sat, Jun 28, 2014 at 01:41:54AM +0200, Arkadiusz Miśkiewicz wrote:
> > Hello.
> > 
> > I have a fs (metadump of it
> > http://ixion.pld-linux.org/~arekm/p2/x1/web2-home.metadump.gz)
> > that xfs_repair 3.2.0 is unable to fix properly.
> > 
> > Running xfs_repair a few times shows the same errors repeating:
> > http://ixion.pld-linux.org/~arekm/p2/x1/repair2.txt
> > http://ixion.pld-linux.org/~arekm/p2/x1/repair3.txt
> > http://ixion.pld-linux.org/~arekm/p2/x1/repair4.txt
> > http://ixion.pld-linux.org/~arekm/p2/x1/repair5.txt
> > 
> > (repair1.txt also exists - it was the initial, very big/long repair)
> > 
> > Note that fs mounts fine (and was mounting fine before and after repair)
> > but xfs_repair indicates that not everything got fixed.
> > 
> > 
> > Unfortunately there looks to be a problem with the metadump image.
> > xfs_repair is able to finish fixing a restored image but is not able to
> > on real devices (see repairX.txt above). Huh?
> > 
> > Examples of problems repeating each time xfs_repair is run:
> > 
> > 1)
> > reset bad sb for ag 5
> >
> > non-null group quota inode field in superblock 7
> 
> OK, so this is indicative of something screwed up a long time ago.
> Firstly, the primary superblock shows:
> 
> uquotino = 4077961
> gquotino = 0
> qflags = 0
> 
> i.e. user quota @ inode 4077961, no group quota. The secondary
> superblocks that are being warned about show:
> 
> uquotino = 0
> gquotino = 4077962
> qflags = 0
> 
> Which is clearly wrong. They should have been overwritten during the
> growfs operation to match the primary superblock.
> 
> The similarity in inode number leads me to believe at some point
> both user and group/project quotas were enabled on this filesystem,

Both user and project quotas were enabled on this fs for the last few years.

> but right now only user quotas are enabled.  It's only AGs 1-15 that
> show this, so this seems to me that it is likely that this
> filesystem was originally only 16 AGs and it's been grown many times
> since?

The quotas were running fine until some repair run (i.e. mounting with quota
succeeded before and after the first repair) - some later xfs_repair run broke
this.

> 
> Oh, this all occurred because you had a growfs operation on 3.10
> fail because of garbage in the sb of AG 16 (i.e. this from IRC:
> http://sprunge.us/UJFE)? IOWs, this commit:
> 
> 9802182 xfs: verify superblocks as they are read from disk
> 
> tripped up on sb 16. That means sb 16 was not modified by the
> growfs operation, and so should have the pre-growfs information in
> it:
> 
> uquotino = 4077961
> gquotino = 4077962
> qflags = 0x77
> 
> Yeah, that's what I thought - the previous grow operation had both
> quotas enabled. OK, that explains why the growfs operation had
> issues, but it doesn't explain exactly how the quota inodes got
> screwed up like that.

The fs had working quota when it already had a 3 digit number of AGs. I don't
think the growfs failure is related to the quota breakage. IMO some repair run
broke this (or tried fixing it and broke it).
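
As an aside on the qflags = 0x77 value quoted above, here is a minimal sketch
of decoding it. The bit values are my recollection of the on-disk quota flag
definitions in the kernel's XFS headers of that era, so treat them as an
assumption and check them against the source. The name decode_qflags is made
up for illustration.

```python
# Assumed on-disk quota flag bits for a v4 superblock.
# These are reproduced from memory; verify against the kernel headers.
QFLAGS = {
    0x0001: "XFS_UQUOTA_ACCT",  # user quota accounting on
    0x0002: "XFS_UQUOTA_ENFD",  # user quota limits enforced
    0x0004: "XFS_UQUOTA_CHKD",  # quotacheck done for user quotas
    0x0008: "XFS_PQUOTA_ACCT",  # project quota accounting on
    0x0010: "XFS_OQUOTA_ENFD",  # "other" (group/project) limits enforced
    0x0020: "XFS_OQUOTA_CHKD",  # quotacheck done for "other" quotas
    0x0040: "XFS_GQUOTA_ACCT",  # group quota accounting on
}


def decode_qflags(value):
    """Return the names of all known flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(QFLAGS.items()) if value & bit]


if __name__ == "__main__":
    for name in decode_qflags(0x77):
        print(name)
```

A qflags of 0 decodes to an empty list, which matches the "no quota state
recorded" situation in the current primary superblock.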

> Anyway, the growfs issues were solved by:
> 
> 10e6e65 xfs: be more forgiving of a v4 secondary sb w/ junk in v5 fields
> 
> which landed in 3.13.

Ok.

> 
> > 2)
> > correcting nblocks for inode 965195858, was 19 - counted 20
> > correcting nextents for inode 965195858, was 16 - counted 17
> 
> Which is preceded by:
> 
> data fork in ino 965195858 claims free block 60323539
> data fork in ino 965195858 claims free block 60323532
> 
> and when combined with the later:
> 
> entry "dsc0945153ac18d4d4f1a-150x150.jpg" (ino 967349800) in dir 965195858
> is a duplicate name, marking entry to be junked
> 
> errors from that directory, it looks like the space was freed but
> the directory btree not correctly updated. No idea what might have
> caused that, but it is a classic symptom of volatile write caches...
> 
> Hmmm, and when it goes to junk them on my local testing:
> 
> rebuilding directory inode 965195858
> name create failed in ino 965195858 (117), filesystem may be out of space
> 
> Which is an EFSCORRUPTED error trying to rebuild that directory.
> The second error pass did not throw an error, but it did not fix
> the errors as a 3rd pass still reported this. I'll look into why.
> 
> > 3) clearing some entries; moving to lost+found (the same files)
> > 
> > 
> > 4)
> > Phase 7 - verify and correct link counts...
> > Invalid inode number 0xfeffffffffffffff
> > xfs_dir_ino_validate: XFS_ERROR_REPORT
> > Metadata corruption detected at block 0x11fbb698/0x1000
> > libxfs_writebufr: write verifer failed on bno 0x11fbb698/0x1000
> > Invalid inode number 0xfeffffffffffffff
> > xfs_dir_ino_validate: XFS_ERROR_REPORT
> > Metadata corruption detected at block 0x11fbb698/0x1000
> > libxfs_writebufr: write verifer failed on bno 0x11fbb698/0x1000
> > done
> 
> Not sure what that is yet, but it looks like writing a directory
> block found entries with invalid inode numbers in it. i.e. it's
> telling me that there's something not been fixed up.
> 
> I'm actually seeing this in phase4:
> 
>         - agno = 148
> Invalid inode number 0xfeffffffffffffff
> xfs_dir_ino_validate: XFS_ERROR_REPORT
> Metadata corruption detected at block 0x11fbb698/0x1000
> libxfs_writebufr: write verifer failed on bno 0x11fbb698/0x1000
> 
> Second time around, this does not happen, so the error has been
> corrected in a later phase of the first pass.
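
To locate the block these messages refer to, a rough sketch follows. It
assumes that the first number in "block 0x.../0x..." is a disk address in
512-byte basic blocks and the second is the buffer length in bytes; that is my
reading of the libxfs messages, not something confirmed in this thread. The
names parse_block and SECTOR_BYTES are hypothetical.

```python
import re

# Assumption: the disk address is counted in 512-byte basic blocks.
SECTOR_BYTES = 512

BLOCK_RE = re.compile(r"at block (0x[0-9a-fA-F]+)/(0x[0-9a-fA-F]+)")


def parse_block(line):
    """Return (byte_offset, length_in_bytes) for a verifier message, or None."""
    match = BLOCK_RE.search(line)
    if match is None:
        return None
    daddr = int(match.group(1), 16)
    length = int(match.group(2), 16)
    return daddr * SECTOR_BYTES, length


if __name__ == "__main__":
    msg = "Metadata corruption detected at block 0x11fbb698/0x1000"
    offset, length = parse_block(msg)
    print("byte offset %d, length %d bytes" % (offset, length))
```

Under the same assumption, the "block 0x0/0x200" message from item 5 would be
a 512-byte buffer at offset 0, i.e. the primary superblock sector.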

Here on two runs I got exactly the same report:

Phase 7 - verify and correct link counts...

Invalid inode number 0xfeffffffffffffff
xfs_dir_ino_validate: XFS_ERROR_REPORT
Metadata corruption detected at block 0x11fbb698/0x1000
libxfs_writebufr: write verifer failed on bno 0x11fbb698/0x1000
Invalid inode number 0xfeffffffffffffff
xfs_dir_ino_validate: XFS_ERROR_REPORT
Metadata corruption detected at block 0x11fbb698/0x1000
libxfs_writebufr: write verifer failed on bno 0x11fbb698/0x1000

but there were more errors like this earlier, so repair fixed some but left
these two.
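
To pick out what repeats between runs, something like the sketch below can be
used on the saved repair output. It is only an illustration: the function name
repeating_lines is made up, and it compares whole lines after stripping
whitespace, so progress lines carrying timestamps will naturally drop out.

```python
import sys


def repeating_lines(logs):
    """Return the lines present in every log, in the order of the first log."""
    if not logs:
        return []
    # Build a set of non-empty, stripped lines for every log after the first.
    others = [
        {line.strip() for line in log.splitlines() if line.strip()}
        for log in logs[1:]
    ]
    seen = set()
    result = []
    for line in logs[0].splitlines():
        line = line.strip()
        if line and line not in seen and all(line in s for s in others):
            seen.add(line)
            result.append(line)
    return result


if __name__ == "__main__":
    # Usage: python repeating.py repair2.txt repair3.txt repair4.txt ...
    texts = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            texts.append(handle.read())
    for line in repeating_lines(texts):
        print(line)
```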

> 
> > 5)Metadata CRC error detected at block 0x0/0x200
> > but it is not CRC enabled fs
> 
> That's typically caused by junk in the superblock beyond the end
> of the v4 superblock structure. It should be followed by "zeroing
> junk ..."

Shouldn't repair fix the superblocks when it notices a v4 fs?

I mean 3.2.0 repair reports:

$ xfs_repair -v ./1t-image 
Phase 1 - find and verify superblock...
        - reporting progress in intervals of 15 minutes
        - block cache size set to 748144 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2 tail block 2
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at block 0x0/0x200
zeroing unused portion of primary superblock (AG #0)
        - 07:20:11: scanning filesystem freespace - 391 of 391 allocation 
groups done
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - 07:20:11: scanning agi unlinked lists - 391 of 391 allocation groups 
done
        - process known inodes and perform inode discovery...
        - agno = 0
[...]

but if I run 3.1.11 after running 3.2.0 then superblocks get fixed:

$ ./xfsprogs/repair/xfs_repair -v ./1t-image 
Phase 1 - find and verify superblock...
        - block cache size set to 748144 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2 tail block 2
        - scan filesystem freespace and inode maps...
zeroing unused portion of primary superblock (AG #0)
zeroing unused portion of secondary superblock (AG #3)
zeroing unused portion of secondary superblock (AG #1)
zeroing unused portion of secondary superblock (AG #8)
zeroing unused portion of secondary superblock (AG #2)
zeroing unused portion of secondary superblock (AG #5)
zeroing unused portion of secondary superblock (AG #6)
zeroing unused portion of secondary superblock (AG #20)
zeroing unused portion of secondary superblock (AG #9)
zeroing unused portion of secondary superblock (AG #7)
zeroing unused portion of secondary superblock (AG #12)
zeroing unused portion of secondary superblock (AG #10)
zeroing unused portion of secondary superblock (AG #13)
zeroing unused portion of secondary superblock (AG #14)
[...]
zeroing unused portion of secondary superblock (AG #388)
zeroing unused portion of secondary superblock (AG #363)
        - found root inode chunk
Phase 3 - for each AG...

Shouldn't these be "unused" for 3.2.0, too (since this is a v4 fs)?
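
To compare the two versions more easily, the AG numbers can be pulled out of
the "zeroing unused portion" lines with a small script. This is a sketch that
assumes the message format is exactly as shown in the output above; the name
zeroed_ags is made up.

```python
import re

# Matches the messages shown above, e.g.
# "zeroing unused portion of secondary superblock (AG #3)"
ZERO_RE = re.compile(
    r"zeroing unused portion of (?:primary|secondary) superblock \(AG #(\d+)\)"
)


def zeroed_ags(output):
    """Return the sorted AG numbers whose superblock was reported as zeroed."""
    return sorted({int(m.group(1)) for m in ZERO_RE.finditer(output)})


if __name__ == "__main__":
    sample = (
        "zeroing unused portion of primary superblock (AG #0)\n"
        "zeroing unused portion of secondary superblock (AG #3)\n"
        "zeroing unused portion of secondary superblock (AG #1)\n"
    )
    print(zeroed_ags(sample))
```

Running it over the output of both versions and comparing the two lists shows
which superblocks one version touched and the other did not.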

> > I made an xfs metadump without file obfuscation and I'm able to reproduce
> > the problem reliably on the image (if some xfs developer wants the metadump
> > image then please mail me - I don't want to put it up for everyone for
> > obvious reasons).
> > 
> > So there is an additional bug in xfs_metadump, where file obfuscation
> > "fixes" some issues. Does it obfuscate but keep invalid conditions (like
> > keeping "/" in a file name)? I guess it is not doing that.
> 
> I doubt it handles a "/" in a file name properly - that's rather
> illegal, and the obfuscation code probably doesn't handle it at all.

It would be nice to keep these bad conditions. The obfuscated metadump behaves
differently from the non-obfuscated one with xfs_repair here (fewer issues
with obfuscated than non-obfuscated), so obfuscation simply hides problems.

I assume that you are testing on the non-obfuscated dump I gave on IRC?

> FWIW, xfs_repair will trash those files anyway:
> 
> entry at block 22 offset 560 in directory inode 419558142 has illegal name
> "/_198.jpg": clearing entry
> 
> So regardless of whether metadump handles them or not, it is not going to
> change the fact that filenames with "/" in them are broken....
> 
> But the real question here is how did you get "/" characters in
> filenames?

No idea. It could have been corrupted many months/years ago. This fs has not
seen a repair for a very long time (since there were no visible issues with it).

> > [3571367.717167] XFS (loop0): Mounting Filesystem
> > [3571367.883958] XFS (loop0): Ending clean mount
> > [3571367.900733] XFS (loop0): Failed to initialize disk quotas.
> > 
> > Files are accessible etc. Just no quota. Unfortunately no information why
> > initialization failed.
> 
> I can't tell why that's happening yet. I'm not sure what the correct
> state is supposed to be yet (mount options will tell me)

noatime,nodiratime,nodev,nosuid,usrquota,prjquota

> so I'm not
> sure what went wrong. As it is, you probably should be upgrading to
> a more recent kernel....

I can try to mount metadump image on newer kernel - will check and report 
back.

> > So xfs_repair wasn't able to fix that, too.
> 
> xfs_repair isn't detecting there is a problem because the uquotino
> is not corrupt and the qflags is zero. Hence it doesn't do anything.
> 
> More as I find it.
> 
> Cheers,
> 
> Dave.

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl
