public inbox for linux-xfs@vger.kernel.org
* bad fs - xfs_repair 3.01 crashes on it
@ 2009-07-03 11:20 Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Michael Monnerie @ 2009-07-03 11:20 UTC (permalink / raw)
  To: xfs mailing list


[-- Attachment #1.1.1: Type: text/plain, Size: 3470 bytes --]

Tonight our server rebooted, and I found in /var/log/warn that it had already been 
complaining a lot about XFS since June 7:

Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jun  7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G          2.6.27.21-0.1-xen #1
Jun  7 03:06:31 orion.i.zmi.at kernel:
Jun  7 03:06:31 orion.i.zmi.at kernel: Call Trace:
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff804635e0>] dump_stack+0x69/0x6f
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033bbcc>] xfs_iformat_extents+0xc9/0x1c5 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033c129>] xfs_iformat+0x2b0/0x3f6 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033c356>] xfs_iread+0xe7/0x1ed [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0337920>] xfs_iget_core+0x3a5/0x63a [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0337c97>] xfs_iget+0xe2/0x187 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0359302>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa03593bb>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0359f6a>] xfs_ioctl+0x3ca/0x680 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0357ff6>] xfs_file_ioctl+0x25/0x69 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aa8cd>] vfs_ioctl+0x21/0x6c
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aab3a>] do_vfs_ioctl+0x222/0x231
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aab9a>] sys_ioctl+0x51/0x73
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<00007f7231d6cb77>] 0x7f7231d6cb77

But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
since then. It would have been nice if xfs_fsr had printed a message, so we would
have received the cron mail. (But it got killed by the kernel, so that's a fair
excuse.)

Anyway, I ran xfs_repair (3.01) and got this:

Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
[snip]
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
[snip]
Phase 4 - check for duplicate blocks...
[snip]
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

And then xfs_repair crashes without having repaired anything. I have attached the 
full xfs_repair log here; the metadump is at
http://zmi.at/x/xfs.metadump.data1.bz2

I'll be away for a week now; I hope the problem is not too serious.

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.1.2: xfsrepair.data1 --]
[-- Type: text/plain, Size: 1930 bytes --]

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
@ 2009-07-03 18:34 ` Eric Sandeen
  2009-07-04  5:43 ` Eric Sandeen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-03 18:34 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:
> 
> Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.

...

> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
> since then. It would have been nice if xfs_fsr had printed a message, so we would
> have received the cron mail. (But it got killed by the kernel, so that's a fair
> excuse.)

I'll have to think about why this didn't shut down the fs.  There are
just a few that don't.

> Anyway, I ran xfs_repair (3.01) and got this:
> 
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
> 
> And then xfs_repair crashes without having repaired anything. I have attached the 
> full xfs_repair log here; the metadump is at
> http://zmi.at/x/xfs.metadump.data1.bz2

Thanks for the metadump image, I'll try to take a look.

-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
@ 2009-07-04  5:43 ` Eric Sandeen
  2009-07-12 17:02   ` Michael Monnerie
  2009-07-12 18:52 ` Eric Sandeen
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-07-04  5:43 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:

...

> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
> since then. It would have been nice if xfs_fsr had printed a message, so we would
> have received the cron mail. (But it got killed by the kernel, so that's a fair
> excuse.)

ok yeah we should see why fsr didn't print anything ...

> Anyway, I ran xfs_repair (3.01) and got this:
> 
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

OK, so this is essentially code which first does a scan; if it finds an error,
it bails out and clears the inode, but if not, it calls essentially the same
function again (the comments say "set bitmaps this time") - but on the 2nd call
it finds an error, which isn't handled well.  The ASSERT(err == 0) is presumably
there because if the first scan didn't find anything, the 2nd call shouldn't
either, but ... that's not the case here :(  There are more checks that can go
wrong -after- the scan-only portion.

So either the caller needs to cope with the error at this point, or the
scan-only pass needs to do all the checks, I think.

Where's Barry when you need him ....

Also I need to look at when the ASSERTs are active and when they should
be; the Fedora packaged xfsprogs doesn't have the ASSERT active, and so
this doesn't trip.  After 2 calls to xfs_repair on Fedora, w/o the
ASSERTs active, it checks clean on the 3rd (!).  Not great.  Not sure
how much was cleared out in the process either...

-Eric



* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-04  5:43 ` Eric Sandeen
@ 2009-07-12 17:02   ` Michael Monnerie
  2009-07-12 18:09     ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Monnerie @ 2009-07-12 17:02 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 960 bytes --]

On Saturday 04 July 2009 Eric Sandeen wrote:
> Where's Barry when you need him ....

Who's that?

> Also I need to look at when the ASSERTs are active and when they
> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
> active, and so this doesn't trip.  After 2 calls to xfs_repair on
> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!).  Not
> great.  Not sure how much was cleared out in the process either...

Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems to 
only hit one file, but I don't dare delete it - might that make things 
worse?

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-12 17:02   ` Michael Monnerie
@ 2009-07-12 18:09     ` Eric Sandeen
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-12 18:09 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

Michael Monnerie wrote:
> On Saturday 04 July 2009 Eric Sandeen wrote:
>> Where's Barry when you need him ....
> 
> Who's that?

The ex-SGI xfs_repair maintainer :)

>> Also I need to look at when the ASSERTs are active and when they
>> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
>> active, and so this doesn't trip.  After 2 calls to xfs_repair on
>> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!).  Not
>> great.  Not sure how much was cleared out in the process either...
> 
> Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems to 
> only hit one file, but I don't dare delete it - might that make things 
> worse?
> 
> mfg zmi

Sorry, I will get back to this soon - today I hope.  I seem to be
getting more and more familiar w/ xfs_repair these days.  :)

If you do want to try deleting that one file or other such tricks, you
can do it on a sparse metadata image of the fs as a dry run:

# xfs_metadump -o /dev/whatever metadump.img
# xfs_mdrestore metadump.img filesystem.img
# mount -o loop filesystem.img mnt/
# <fiddle as you please>
# umount mnt/
# xfs_repair filesystem.img
# mount -o loop filesystem.img mnt/

and see what happens...

-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
  2009-07-04  5:43 ` Eric Sandeen
@ 2009-07-12 18:52 ` Eric Sandeen
  2009-07-12 22:08   ` Michael Monnerie
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-07-12 18:52 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:
> 
> Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
> Jun  7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G          2.6.27.21-0.1-xen #1
> Jun  7 03:06:31 orion.i.zmi.at kernel:

Hm, the other sort of interesting thing here is that a recently-reported
RH bug:

[Bug 510823] "Structure needs cleaning" when reading files from an XFS
partition (extent count for ino XYZ data fork too low (6) for file format)

also seems to -possibly- be related to an xfs_fsr run, and is also
related to extents in the wrong format.  In that case it was the
opposite; an inode was found in btree format with few enough
extents that it should have been in extents format inside the inode; in
your case, it looks like there were too many extents to fit in the
format it had...

Just out of curiosity, it looks like you have rather a lot of extended
attributes on at least the inode above, is that accurate?  Or maybe
that's part of the corruption?

I'll focus on getting xfs_repair to cope first, but I wonder what
happened here...

Thanks,
-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-12 18:52 ` Eric Sandeen
@ 2009-07-12 22:08   ` Michael Monnerie
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Monnerie @ 2009-07-12 22:08 UTC (permalink / raw)
  To: xfs

On Sunday 12 July 2009 Eric Sandeen wrote:
> Just out of curiosity, it looks like you have rather a lot of
> extended attributes on at least the inode above, is that accurate?
>  Or maybe that's part of the corruption?

# find . -inum 3857051697
find: "./samba/tmp/BettyPC.tib": Die Struktur muss bereinigt werden
(means: structure needs cleaning)

I'm not sure whether that message means the file has the corresponding inode
number? If it does, it's a backup of a PC made with Acronis.
Normally I only use xattr's to set one or two extra rights

> I'll focus on getting xfs_repair to cope first, but I wonder what
> happened here...

No idea. We didn't have a crash on that server, IIRC. I tried some "ls" 
and "getfacl" and got these crashes:

Jul 13 00:01:10 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 13 00:01:10 orion.i.zmi.at kernel: Pid: 17213, comm: find Tainted: G          2.6.27.23-0.1-xen #1
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:01:10 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<00007f802a89f4ce>] 0x7f802a89f4ce
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 13 00:02:35 orion.i.zmi.at kernel: Pid: 17232, comm: getfacl Tainted: G          2.6.27.23-0.1-xen #1
Jul 13 00:02:35 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a1baa>] sys_newlstat+0x19/0x31
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<00007f9a8d911225>] 0x7f9a8d911225

Jul 11 03:02:53 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 11 03:02:53 orion.i.zmi.at kernel: Pid: 2881, comm: xfs_fsr Tainted: G          2.6.27.23-0.1-xen #1
Jul 11 03:02:53 orion.i.zmi.at kernel:
Jul 11 03:02:53 orion.i.zmi.at kernel: Call Trace:
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0358336>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa03583ef>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0358f9e>] xfs_ioctl+0x3ca/0x680 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa035702a>] xfs_file_ioctl+0x25/0x69 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aab39>] vfs_ioctl+0x21/0x6c
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aada6>] do_vfs_ioctl+0x222/0x231
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aae06>] sys_ioctl+0x51/0x73
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<00007fc76ba0bb77>] 0x7fc76ba0bb77

I also found this one:
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 12 00:01:29 orion.i.zmi.at kernel: 00000000: 49 4e 81 ff 02 02 00 00 00 00 03 e8 00 00 00 64  IN.............d
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": XFS internal error xfs_iformat_extents(1) at line 565 of file fs/xfs/xfs_inode.c.  Caller 0xffffffffa033b153
Jul 12 00:01:29 orion.i.zmi.at kernel: Pid: 9592, comm: find Tainted: G          2.6.27.23-0.1-xen #1
Jul 12 00:01:29 orion.i.zmi.at kernel:
Jul 12 00:01:29 orion.i.zmi.at kernel: Call Trace:
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<00007fdc209084ce>] 0x7fdc209084ce


mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


* [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
                   ` (2 preceding siblings ...)
  2009-07-12 18:52 ` Eric Sandeen
@ 2009-07-14  4:13 ` Eric Sandeen
  2009-07-14  5:42   ` Josef 'Jeff' Sipek
  2009-07-14  6:05   ` Michael Monnerie
  3 siblings, 2 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-14  4:13 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

As reported in "bad fs - xfs_repair 3.01 crashes on it" ...

xfs_repair encountered a bad attribute fork, which it cleared:

local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes

and then later this inode failed an assertion:

data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

The ASSERT is there because process_inode_data_fork() calls 
process_exinode() twice: once with check_dups == 1, and again with 
check_dups == 0.  The assertion is that both should return the
same answer about whether the inode contained duplicate blocks.

However, they are tested in different ways; with check_dups set,
process_exinode() simply does search_dup_extent() when it gets
to process_bmbt_reclist_int(); without check_dups set, it utilizes
the ba_bmap[][] array of bitmaps, compared against the current
extent record.

Long story short(er): when we cleared the bad attribute fork in
clear_dinode_attr(), it used XFS_DFORK_APTR() to get to the
shortform attribute header, and set some fields.  However,
di_forkoff must have been corrupt as well, because setting these
fields corrupted the extent list, and the 4th extent on the inode
had its physical block modified from:
431241822 / 0x19B43A5E
to:
537147998 / 0x20043A5E

and this new (corrupt) physical block matched another inode's
block, triggering the dup & return 1, triggering the ASSERT.

Whew.

Anyway, simply setting di_forkoff to 0 should be enough to flag
the inode as having no attr fork, and messing with where we
think the shortform attribute header might be has now been shown to be
dangerous.  Simply not touching the header seems to fix
the problem, based on testing with the metadump image.

Almost.

process_inode_attr_fork() calls clear_dinode_attr(), which puts
the inode into the XFS_DINODE_FMT_EXTENTS state, but upon return resets
that to XFS_DINODE_FMT_LOCAL.  Later, a check requires that if 
!XFS_DFORK_Q(), the format be XFS_DINODE_FMT_EXTENTS (!),
and it gets reset.

So drop the setting to XFS_DINODE_FMT_LOCAL; for whatever reason,
"no attributes" seems to expect _EXTENTS format, see for example
xfs_attr_shortform_remove(), clear_dinode_core(), and 
xfs_attr_fork_reset() in the kernel, which all set it to _EXTENTS
in this circumstance.

Fix this up after both calls to clear_dinode_attr().

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
---

diff --git a/repair/dinode.c b/repair/dinode.c
index 84e1d05..23de0a8 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -103,23 +103,8 @@ clear_dinode_attr(xfs_mount_t *mp, xfs_dinode_t *dino, xfs_ino_t ino_num)
 	}
 
 	/* get rid of the fork by clearing forkoff */
-
-	/* Originally, when the attr repair code was added, the fork was cleared
-	 * by turning it into shortform status.  This meant clearing the
-	 * hdr.totsize/count fields and also changing aformat to LOCAL
-	 * (vs EXTENTS).  Over various fixes, the aformat and forkoff have
-	 * been updated to not show an attribute fork at all, however.
-	 * It could be possible that resetting totsize/count are not needed,
-	 * but just to be safe, leave it in for now.
-	 */
-
-	if (!no_modify) {
-		xfs_attr_shortform_t *asf = (xfs_attr_shortform_t *)
-				XFS_DFORK_APTR(dino);
-		asf->hdr.totsize = cpu_to_be16(sizeof(xfs_attr_sf_hdr_t));
-		asf->hdr.count = 0;
-		dinoc->di_forkoff = 0;  /* got to do this after asf is set */
-	}
+	if (!no_modify)
+		dinoc->di_forkoff = 0;
 
 	/*
 	 * always returns 1 since the fork gets zapped
@@ -2195,7 +2180,6 @@ process_inode_attr_fork(
 			if (delete_attr_ok)  {
 				do_warn(_(", clearing attr fork\n"));
 				*dirty += clear_dinode_attr(mp, dino, lino);
-				dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
 			} else  {
 				do_warn("\n");
 				*dirty += clear_dinode(mp, dino, lino);
@@ -2253,12 +2237,10 @@ process_inode_attr_fork(
 			lino);
 		if (!repair) {
 			/* clear attributes if not done already */
-			if (!no_modify)  {
+			if (!no_modify)
 				*dirty += clear_dinode_attr(mp, dino, lino);
-				dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
-			} else  {
+			else
 				do_warn(_("would clear attr fork\n"));
-			}
 			*atotblocks = 0;
 			*anextents = 0;
 		}



* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
@ 2009-07-14  5:42   ` Josef 'Jeff' Sipek
  2009-07-14  6:05   ` Michael Monnerie
  1 sibling, 0 replies; 11+ messages in thread
From: Josef 'Jeff' Sipek @ 2009-07-14  5:42 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Michael Monnerie, xfs mailing list

On Mon, Jul 13, 2009 at 11:13:58PM -0500, Eric Sandeen wrote:
...
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Nice!

Josef 'Jeff' Sipek.

-- 
Humans were created by water to transport it upward.


* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  2009-07-14  5:42   ` Josef 'Jeff' Sipek
@ 2009-07-14  6:05   ` Michael Monnerie
  2009-07-14  6:16     ` Eric Sandeen
  1 sibling, 1 reply; 11+ messages in thread
From: Michael Monnerie @ 2009-07-14  6:05 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 651 bytes --]

On Tuesday 14 July 2009 Eric Sandeen wrote:
> Whew.

It's people like you I'm afraid of ;-) Is there still blood in your 
veins or was it replaced with silicone at some point?

To ask in a simple way: will this version fix the problem on my disk? If 
yes, where could I download it? (git ...?)

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]


* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  6:05   ` Michael Monnerie
@ 2009-07-14  6:16     ` Eric Sandeen
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-14  6:16 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs@oss.sgi.com

On Jul 14, 2009, at 1:05 AM, Michael Monnerie <michael.monnerie@is.it-management.at> wrote:

> On Tuesday 14 July 2009 Eric Sandeen wrote:
>> Whew.
>
> It's people like you I'm afraid of ;-) Is there still blood in your
> veins or was it replaced with silicone at some point?

Heh.. :)

>
> To ask in a simple way: will this version fix the problem on my disk?
> If yes, where could I download it? (git ...?)
>
It's not committed yet; you could apply the patch yourself now, or wait
until it's reviewed and committed...

-Eric
> mfg zmi
> -- 
> // Michael Monnerie, Ing.BSc    -----      http://it-management.at
> // Tel: 0660 / 415 65 31                      .network.your.ideas.
> // PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
> // Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
> // Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4
>


end of thread, other threads:[~2009-07-14  6:16 UTC | newest]

Thread overview: 11+ messages
2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
2009-07-03 18:34 ` Eric Sandeen
2009-07-04  5:43 ` Eric Sandeen
2009-07-12 17:02   ` Michael Monnerie
2009-07-12 18:09     ` Eric Sandeen
2009-07-12 18:52 ` Eric Sandeen
2009-07-12 22:08   ` Michael Monnerie
2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
2009-07-14  5:42   ` Josef 'Jeff' Sipek
2009-07-14  6:05   ` Michael Monnerie
2009-07-14  6:16     ` Eric Sandeen
