recovering corrupt filesystem after raid failure

* recovering corrupt filesystem after raid failure
@ 2016-02-22  1:29 David Lechner
  2016-02-22  2:24 ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: David Lechner @ 2016-02-22  1:29 UTC (permalink / raw)
  To: xfs

Long story short, I had a dual disk failure in a raid 5. I've managed to
get the raid back up and salvaged what I could. However, the xfs is
seriously damaged. I've tried running xfs_repair, but it is failing and
it recommended to send a message to this mailing list. This is an Ubuntu
12.04 machine, so xfs_repair version 3.1.7.

The file system won't mount. Fails with "mount: Structure needs
cleaning". So I tried xfs_repair. I had to resort to xfs_repair -L
because the first 500MB or so of the filesystem was wiped out. Now,
xfs_repair /dev/md127 gets stuck, so I am running xfs_repair -P
/dev/md127. This gets much farther, but it is failing too. It gives an
error message like this:

...
disconnected inode 2101958, moving to lost+found
corrupt dinode 2101958, extent total = 1, nblocks = 0.  This is a bug.
Please capture the filesystem metadata with xfs_metadump and
report it to xfs@oss.sgi.com.
cache_node_purge: refcount was 1, not zero (node=0x7f2c57e1b120)

fatal error -- 117 - couldn't iget disconnected inode

However, nblocks = 0 does not seem to be true...

xfs_db -x /dev/md127
cache_node_purge: refcount was 1, not zero (node=0x219c9e0)
xfs_db: cannot read root inode (117)
cache_node_purge: refcount was 1, not zero (node=0x21a0620)
xfs_db: cannot read realtime bitmap inode (117)
xfs_db> inode 2101958
xfs_db> print
core.magic = 0x494e
core.mode = 0100664
core.version = 2
core.format = 2 (extents)
core.nlinkv2 = 1
core.onlink = 0
core.projid_lo = 0
core.projid_hi = 0
core.uid = 119
core.gid = 133
core.flushiter = 5
core.atime.sec = Sun Apr 26 02:30:54 2015
core.atime.nsec = 000000000
core.mtime.sec = Fri Nov  7 14:54:27 2014
core.mtime.nsec = 000000000
core.ctime.sec = Sun Apr 26 02:30:54 2015
core.ctime.nsec = 941028318
core.size = 279864
core.nblocks = 69
core.extsize = 0
core.nextents = 1
core.naextents = 0
core.forkoff = 0
core.aformat = 2 (extents)
core.dmevmask = 0
core.dmstate = 0
core.newrtbm = 0
core.prealloc = 0
core.realtime = 0
core.immutable = 0
core.append = 0
core.sync = 0
core.noatime = 0
core.nodump = 0
core.rtinherit = 0
core.projinherit = 0
core.nosymlinks = 0
core.extsz = 0
core.extszinherit = 0
core.nodefrag = 0
core.filestream = 0
core.gen = 3320313054
next_unlinked = null
u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,147322885,69,0]

If I re-run xfs_repair -P /dev/md127, it will fail on different
seemingly random inode with the same error message.

I've uploaded the output of xfs_metadump to dropbox if anyone would like
to have a look. It is 22MB compressed, 2.2GB uncompressed.

https://www.dropbox.com/s/o18cxapu7o75sor/xfs_metadump.xz?dl=0

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread