* bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-07-03 11:20 UTC
To: xfs mailing list
Tonight our server rebooted, and I found in /var/log/warn that it has been
complaining a lot about XFS since June 7 already:
Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jun 7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G 2.6.27.21-0.1-xen #1
Jun 7 03:06:31 orion.i.zmi.at kernel:
Jun 7 03:06:31 orion.i.zmi.at kernel: Call Trace:
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff804635e0>] dump_stack+0x69/0x6f
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033bbcc>] xfs_iformat_extents+0xc9/0x1c5 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033c129>] xfs_iformat+0x2b0/0x3f6 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033c356>] xfs_iread+0xe7/0x1ed [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0337920>] xfs_iget_core+0x3a5/0x63a [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0337c97>] xfs_iget+0xe2/0x187 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0359302>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa03593bb>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0359f6a>] xfs_ioctl+0x3ca/0x680 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0357ff6>] xfs_file_ioctl+0x25/0x69 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aa8cd>] vfs_ioctl+0x21/0x6c
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aab3a>] do_vfs_ioctl+0x222/0x231
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aab9a>] sys_ioctl+0x51/0x73
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jun 7 03:06:31 orion.i.zmi.at kernel: [<00007f7231d6cb77>] 0x7f7231d6cb77
But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
They are obviously generated by the nightly "xfs_fsr -v -t 7200", which we have been
running since then. It would have been nice if xfs_fsr had displayed
a message, so that we would have received the cron mail. (But it got killed
by the kernel, that's a good excuse.)
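A minimal sketch of how such messages could be surfaced through the cron mail, assuming the kernel log lines keep the format quoted above. The function name `corrupt_inodes` and the sample input are illustrative only; they are not part of xfs_fsr or any XFS tool.

```python
import re

# Pattern for the kernel message quoted above; the named groups are the
# device name, the inode number and the reported extent count.
CORRUPT_RE = re.compile(
    r'Filesystem "(?P<dev>[^"]+)": corrupt inode (?P<ino>\d+) '
    r'\(\(a\)extents = (?P<nex>\d+)\)'
)


def corrupt_inodes(lines):
    """Return {(device, inode): count} for every 'corrupt inode' line."""
    hits = {}
    for line in lines:
        m = CORRUPT_RE.search(line)
        if m:
            key = (m.group("dev"), int(m.group("ino")))
            hits[key] = hits.get(key, 0) + 1
    return hits


# Two lines copied from the log above, used as sample input.
sample = [
    'Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode '
    '3857051697 ((a)extents = 5). Unmount and run xfs_repair.',
    'Jun 7 03:06:31 orion.i.zmi.at kernel: Call Trace:',
]

for (dev, ino), n in sorted(corrupt_inodes(sample).items()):
    # Anything a cron job prints ends up in the cron mail.
    print(f"{dev}: inode {ino} reported corrupt {n} time(s)")
```

Run from cron against the real log file, a script like this would have produced a mail on the first night the corruption was reported.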
Anyway, so I went to xfs_repair (3.01) and got this:
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
[snip]
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
[snip]
Phase 4 - check for duplicate blocks...
[snip]
- agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
And then xfs_repair aborts without having repaired anything. I attached the full
xfs_repair log here, and the metadump is at http://zmi.at/x/xfs.metadump.data1.bz2
I'll be away for a week now; I hope the problem is not very serious.
mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
[Attachment: xfsrepair.data1 (text/plain)]
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-03 18:34 UTC
To: Michael Monnerie; +Cc: xfs mailing list
Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it has been
> complaining a lot about XFS since June 7 already:
>
> Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
...
> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200", which we have been
> running since then. It would have been nice if xfs_fsr had displayed
> a message, so that we would have received the cron mail. (But it got killed
> by the kernel, that's a good excuse.)
I'll have to think about why this didn't shut down the fs. There are
just a few that don't.
> Anyway, so I went to xfs_repair (3.01) and got this:
>
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> [snip]
> - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
> - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
>
> And then xfs_repair aborts without having repaired anything. I attached the full
> xfs_repair log here, and the metadump is at http://zmi.at/x/xfs.metadump.data1.bz2
Thanks for the metadump image, I'll try to take a look.
-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-04 5:43 UTC
To: Michael Monnerie; +Cc: xfs mailing list
Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it has been
> complaining a lot about XFS since June 7 already:
...
> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200", which we have been
> running since then. It would have been nice if xfs_fsr had displayed
> a message, so that we would have received the cron mail. (But it got killed
> by the kernel, that's a good excuse.)
ok yeah we should see why fsr didn't print anything ...
> Anyway, so I went to xfs_repair (3.01) and got this:
>
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> [snip]
> - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
> - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
Ok, so this is essentially some code which first does a scan; if it
finds an error it bails out and clears the inode, but if not, it calls
essentially the same function again, comments say "set bitmaps this
time" - but on the 2nd call it finds an error, which isn't handled well.
The ASSERT(err == 0) bit is presumably because if the first scan didn't
find anything, the 2nd call shouldn't either, but ... not the case here
:( There are more checks that can go wrong -after- the scan-only portion.
So either the caller needs to cope w/ the error at this point, or the
scan-only business needs to do all the checks, I think.
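The two-pass pattern described above can be sketched roughly as follows. This is a simplified illustration, not the actual xfs_repair code: `process_fork`, `process_inode` and the block bookkeeping are hypothetical names. It shows why a check-only pass can succeed while the "set bitmaps" pass fails, and what coping with that error in the caller (rather than asserting) might look like.

```python
def process_fork(extents, used_blocks, check_only):
    """Return 0 if the fork looks sane, 1 otherwise.

    With check_only=True nothing is recorded. With check_only=False every
    block is claimed in used_blocks, and a block that is already claimed
    is an error that only this second pass can see.
    """
    for start, length in extents:
        if length <= 0:
            return 1  # structural problem, visible to both passes
        if not check_only:
            for block in range(start, start + length):
                if block in used_blocks:
                    return 1  # duplicate claim, found only when setting bitmaps
                used_blocks.add(block)
    return 0


def process_inode(extents, used_blocks):
    """Return 'ok' or 'cleared' instead of asserting on the second pass."""
    if process_fork(extents, used_blocks, check_only=True):
        return "cleared"
    # An assert on the second result would abort right here; handling the
    # error lets the caller clear the inode and carry on.
    if process_fork(extents, used_blocks, check_only=False):
        return "cleared"
    return "ok"


used = set()
first = process_inode([(100, 4)], used)   # claims blocks 100..103
second = process_inode([(102, 2)], used)  # overlaps blocks 102..103
print(first, second)
```

The other option mentioned above, making the scan-only pass perform every check, would need the first pass to consult the block bookkeeping as well.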
Where's Barry when you need him ....
Also I need to look at when the ASSERTs are active and when they should
be; the Fedora packaged xfsprogs doesn't have the ASSERT active, and so
this doesn't trip. After 2 calls to xfs_repair on Fedora, w/o the
ASSERTs active, it checks clean on the 3rd (!). Not great. Not sure
how much was cleared out in the process either...
-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-07-12 17:02 UTC
To: xfs
On Saturday 04 July 2009 Eric Sandeen wrote:
> Where's Barry when you need him ....
Who's that?
> Also I need to look at when the ASSERTs are active and when they
> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
> active, and so this doesn't trip. After 2 calls to xfs_repair on
> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!). Not
> great. Not sure how much was cleared out in the process either...
Any ideas/news on this? I'd like to xfs_repair that stuff. It seems to
only hit one file, but I don't dare delete it, maybe it makes things
worse?
mfg zmi
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-12 18:09 UTC
To: Michael Monnerie; +Cc: xfs
Michael Monnerie wrote:
> On Saturday 04 July 2009 Eric Sandeen wrote:
>> Where's Barry when you need him ....
>
> Who's that?
The ex-sgi xfs_repair maintainer :)
>> Also I need to look at when the ASSERTs are active and when they
>> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
>> active, and so this doesn't trip. After 2 calls to xfs_repair on
>> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!). Not
>> great. Not sure how much was cleared out in the process either...
>
> Any ideas/news on this? I'd like to xfs_repair that stuff. It seems to
> only hit one file, but I don't dare delete it, maybe it makes things
> worse?
>
> mfg zmi
Sorry, I will get back to this soon - today I hope. I seem to be
getting more and more familiar w/ xfs_repair these days. :)
If you do want to try deleting that one file or other such tricks, you
can do it on a sparse metadata image of the fs as a dry run:
# xfs_metadump -o /dev/whatever metadump.img
# xfs_mdrestore metadump.img filesystem.img
# mount -o loop filesystem.img mnt/
# <fiddle as you please>
# umount mnt/
# xfs_repair filesystem.img
# mount -o loop filesystem.img mnt/
and see what happens...
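For anyone who wants to script the dry run, here is a small sketch that only builds and prints the command sequence from the recipe above without executing anything. The helper name `dry_run_plan`, the working directory and the file names are assumptions for illustration; the commands and options are exactly those quoted above.

```python
import os
import shlex


def dry_run_plan(device, workdir="."):
    """Return the commands from the recipe above as argument lists."""
    metadump = os.path.join(workdir, "metadump.img")
    image = os.path.join(workdir, "filesystem.img")
    mnt = os.path.join(workdir, "mnt")
    return [
        ["xfs_metadump", "-o", device, metadump],
        ["xfs_mdrestore", metadump, image],
        ["mount", "-o", "loop", image, mnt],
        # Manual experiments (for example deleting the suspect file) go here.
        ["umount", mnt],
        ["xfs_repair", image],
        ["mount", "-o", "loop", image, mnt],
    ]


plan = dry_run_plan("/dev/whatever", workdir="/tmp/xfs-dry-run")
for cmd in plan:
    # Print a shell-safe rendering of each step for review.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the plan first makes it easy to confirm that xfs_repair will run on the restored image and not on the real device.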
-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-12 18:52 UTC
To: Michael Monnerie; +Cc: xfs mailing list
Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it has been
> complaining a lot about XFS since June 7 already:
>
> Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
> Jun 7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G 2.6.27.21-0.1-xen #1
> Jun 7 03:06:31 orion.i.zmi.at kernel:
Hm, the other sort of interesting thing here is that a recently-reported
RH bug:
[Bug 510823] "Structure needs cleaning" when reading files from an XFS
partition (extent count for ino XYZ data fork too low (6) for file format)
also seems to -possibly- be related to an xfs_fsr run, and also is
related to extents in the wrong format. In that case it was the
opposite; an inode was found in btree format which had few enough
extents that it should have been in the extents format in the inode; in
your case, it looks like there were too many extents to fit in the
format it had...
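As a rough illustration of the "too many extents to fit" condition: in extents format the records are stored inline in the inode fork, and each on-disk extent record takes 16 bytes, so the record count times 16 must not exceed the space available to that fork. The sketch below assumes that this is essentially what the kernel check behind the "corrupt inode" message verifies; the fork sizes are made-up example values, not numbers taken from the metadump.

```python
BMBT_REC_SIZE = 16  # bytes per on-disk extent record (two 64-bit words)


def extents_fit(nextents, fork_size):
    """True if nextents inline records fit in a fork of fork_size bytes."""
    return 0 <= nextents * BMBT_REC_SIZE <= fork_size


# Hypothetical fork sizes, for illustration only.
print(extents_fit(5, 96))  # 5 * 16 = 80 bytes, fits in 96
print(extents_fit(5, 64))  # 80 bytes do not fit in 64
```

The opposite case from the RH bug corresponds to an inode kept in btree format although its records would have fit inline.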
Just out of curiosity, it looks like you have rather a lot of extended
attributes on at least the inode above, is that accurate? Or maybe
that's part of the corruption?
I'll focus on getting xfs_repair to cope first, but I wonder what
happened here...
Thanks,
-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-07-12 22:08 UTC
To: xfs
On Sunday 12 July 2009 Eric Sandeen wrote:
> Just out of curiosity, it looks like you have rather a lot of
> extended attributes on at least the inode above, is that accurate?
> Or maybe that's part of the corruption?
# find . -inum 3857051697
find: "./samba/tmp/BettyPC.tib": Structure needs cleaning
(translated from the German locale output "Die Struktur muss bereinigt werden")
I'm not sure whether that message means this file has that inode
number. If it does, it's a backup of a PC made with Acronis.
Normally I only use xattrs to set one or two extra permissions.
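On the question of whether the error identifies the file: find prints the path it failed to stat, so the message alone does not prove that this path carries the inode number being searched for, although a kernel "corrupt inode 3857051697" message logged by the same find run points that way. A minimal sketch of the same search that keeps matches and stat failures apart (the function name `find_by_inode` and the temporary demo tree are illustrative assumptions):

```python
import os
import tempfile


def find_by_inode(root, inum):
    """Return (matches, errors) for a walk below root.

    matches: paths whose inode number equals inum.
    errors:  (path, OSError) pairs for entries that could not be stat'ed.
    """
    matches, errors = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inum:
                    matches.append(path)
            except OSError as exc:
                errors.append((path, exc))
    return matches, errors


# Demo on a throwaway directory; the file name is borrowed from the output above.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "BettyPC.tib")
    open(target, "w").close()
    matches, errors = find_by_inode(root, os.lstat(target).st_ino)
    print(matches, errors)
```

On the damaged filesystem, the suspect file would be expected to show up in `errors` rather than in `matches`, because the stat call itself fails.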
> I'll focus on getting xfs_repair to cope first, but I wonder what
> happened here...
No idea. We didn't have a crash on that server, IIRC. I tried some "ls"
and "getfacl" and got these kernel traces:
Jul 13 00:01:10 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 13 00:01:10 orion.i.zmi.at kernel: Pid: 17213, comm: find Tainted: G 2.6.27.23-0.1-xen #1
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:01:10 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:01:10 orion.i.zmi.at kernel: [<00007f802a89f4ce>] 0x7f802a89f4ce
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 13 00:02:35 orion.i.zmi.at kernel: Pid: 17232, comm: getfacl Tainted: G 2.6.27.23-0.1-xen #1
Jul 13 00:02:35 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a1baa>] sys_newlstat+0x19/0x31
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:02:35 orion.i.zmi.at kernel: [<00007f9a8d911225>] 0x7f9a8d911225
Jul 11 03:02:53 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 11 03:02:53 orion.i.zmi.at kernel: Pid: 2881, comm: xfs_fsr Tainted: G 2.6.27.23-0.1-xen #1
Jul 11 03:02:53 orion.i.zmi.at kernel:
Jul 11 03:02:53 orion.i.zmi.at kernel: Call Trace:
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0358336>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa03583ef>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0358f9e>] xfs_ioctl+0x3ca/0x680 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa035702a>] xfs_file_ioctl+0x25/0x69 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aab39>] vfs_ioctl+0x21/0x6c
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aada6>] do_vfs_ioctl+0x222/0x231
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aae06>] sys_ioctl+0x51/0x73
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 11 03:02:53 orion.i.zmi.at kernel: [<00007fc76ba0bb77>] 0x7fc76ba0bb77
I also found this one:
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 12 00:01:29 orion.i.zmi.at kernel: 00000000: 49 4e 81 ff 02 02 00 00 00 00 03 e8 00 00 00 64 IN.............d
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": XFS internal error xfs_iformat_extents(1) at line 565 of file fs/xfs/xfs_inode.c. Caller 0xffffffffa033b153
Jul 12 00:01:29 orion.i.zmi.at kernel: Pid: 9592, comm: find Tainted: G 2.6.27.23-0.1-xen #1
Jul 12 00:01:29 orion.i.zmi.at kernel:
Jul 12 00:01:29 orion.i.zmi.at kernel: Call Trace:
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 12 00:01:29 orion.i.zmi.at kernel: [<00007fdc209084ce>] 0x7fdc209084ce
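The 16-byte hexdump in the last trace is the start of the on-disk inode. A minimal sketch of decoding it, assuming the usual layout of the XFS inode core (big-endian magic, mode, version, format, onlink, uid, gid) and the usual numbering of the fork formats; the variable names are illustrative:

```python
import stat
import struct

# The 16 bytes from the kernel hexdump above.
dump = "49 4e 81 ff 02 02 00 00 00 00 03 e8 00 00 00 64"
raw = bytes.fromhex(dump.replace(" ", ""))

# Assumed field order: magic(2) mode(2) version(1) format(1) onlink(2)
# uid(4) gid(4), all big-endian.
magic, mode, version, fmt, onlink, uid, gid = struct.unpack(">2sHbbHII", raw)

# Assumed numbering of the data fork formats.
FORMATS = {0: "dev", 1: "local", 2: "extents", 3: "btree", 4: "uuid"}

print("magic:", magic)
print("regular file:", stat.S_ISREG(mode), "perms:", oct(stat.S_IMODE(mode)))
print("inode version:", version, "data fork format:", FORMATS.get(fmt, fmt))
print("uid:", uid, "gid:", gid)
```

Under these assumptions the dump describes a regular file whose data fork is in extents format, which matches the function named in the trace, xfs_iformat_extents.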
mfg zmi
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-08-31 6:40 UTC
To: xfs
On Sunday 12 July 2009 Eric Sandeen wrote:
> If you do want to try deleting that one file or other such tricks,
> you can do it on a sparse metadata image of the fs as a dry run:
>
> # xfs_metadump -o /dev/whatever metadump.img
> # xfs_mdrestore metadump.img filesystem.img
> # mount -o loop filesystem.img mnt/
> # <fiddle as you please>
> # umount mnt/
> # xfs_repair filesystem.img
> # mount -o loop filesystem.img mnt/
>
> and see what happens...
To revive this old thread, here is what I did now:
* make metadump
* mount it
* remove unneeded files/dirs
The removal already produced lots of errors, as some files/dirs couldn't be
deleted. I made another metadump of the result; it's at
http://zmi.at/xfs.metadump-brokenonly.bz2
Then I tried xfs_repair v3.0.1:
# xfs_repair xfs.img
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
data fork in regular inode 3857051697 claims used block 537546384
- agno = 19
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
Aborted
So I patched out the ASSERT at dinode.c:2108, which gave:
# xfs_repair xfs.img
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
data fork in regular inode 3857051697 claims used block 537546384
bad attribute format 1 in inode 3857051697, resetting value
correcting nblocks for inode 3857051697, was 10135251 - counted 8388604
- agno = 21
- agno = 22
- agno = 23
- agno = 24
data fork in regular inode 6174936063 claims used block 537240415
correcting nblocks for inode 6174936063, was 1 - counted 0
data fork in regular inode 6180186880 claims used block 537242879
correcting nblocks for inode 6180186880, was 1 - counted 0
- agno = 25
- agno = 26
- agno = 27
data fork in regular inode 7257143306 claims used block 537251790
correcting nblocks for inode 7257143306, was 1 - counted 0
data fork in regular inode 7257143307 claims used block 537257951
correcting nblocks for inode 7257143307, was 1 - counted 0
data fork in regular inode 6720520457 claims used block 537246687
correcting nblocks for inode 6720520457, was 1 - counted 0
data fork in regular inode 6720520458 claims used block 537247327
correcting nblocks for inode 6720520458, was 1 - counted 0
- agno = 28
- agno = 29
- agno = 30
- agno = 31
data fork in regular inode 8326467385 claims used block 537198367
correcting nblocks for inode 8326467385, was 1 - counted 0
- agno = 32
inode block 537201328 multiply claimed, state was 3
inode block 537201329 multiply claimed, state was 3
inode block 537201330 multiply claimed, state was 3
data fork in regular inode 8595221283 claims used block 537201683
correcting nblocks for inode 8595221283, was 1 - counted 0
data fork in regular inode 8595221284 claims used block 537201684
- agno = 33
correcting nblocks for inode 8595221284, was 5 - counted 0
data fork in regular inode 8595221285 claims used block 537201689
correcting nblocks for inode 8595221285, was 1 - counted 0
data fork in regular inode 8595221286 claims used block 537201690
correcting nblocks for inode 8595221286, was 6 - counted 0
data fork in regular inode 8326763299 claims used block 537270223
correcting nblocks for inode 8326763299, was 1 - counted 0
data fork in regular inode 8326763300 claims used block 537271439
correcting nblocks for inode 8326763300, was 1 - counted 0
data fork in regular inode 8595221287 claims used block 537201696
correcting nblocks for inode 8595221287, was 1 - counted 0
data fork in regular inode 8595221288 claims used block 537201699
attr fork in regular inode 8595221288 claims used block 537201698
xfs_repair: dinode.c:2241: process_inode_attr_fork: Assertion `err == 0' failed.
data fork in regular inode 8058708772 claims used block 537258543
correcting nblocks for inode 8058708772, was 1 - counted 0
data fork in regular inode 8058708773 claims used block 537260719
correcting nblocks for inode 8058708773, was 1 - counted 0
Aborted
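With output this long it can help to reduce it to a per-inode summary. The following is a minimal sketch that extracts the "claims used block" and "correcting nblocks" lines, assuming they keep the wording shown above; `summarize` and the sample lines are illustrative and not part of xfs_repair.

```python
import re

CLAIM_RE = re.compile(
    r"(data|attr) fork in regular inode (\d+) claims used block (\d+)"
)
NBLOCKS_RE = re.compile(
    r"correcting nblocks for inode (\d+), was (\d+) - counted (\d+)"
)


def summarize(lines):
    """Return (claims, fixes) keyed by inode number.

    claims: inode -> list of (fork, block) duplicate-block claims
    fixes:  inode -> (old nblocks, counted nblocks)
    """
    claims, fixes = {}, {}
    for line in lines:
        m = CLAIM_RE.search(line)
        if m:
            claims.setdefault(int(m.group(2)), []).append(
                (m.group(1), int(m.group(3)))
            )
        m = NBLOCKS_RE.search(line)
        if m:
            fixes[int(m.group(1))] = (int(m.group(2)), int(m.group(3)))
    return claims, fixes


# Lines copied from the output above, used as sample input.
sample = [
    "data fork in regular inode 3857051697 claims used block 537546384",
    "correcting nblocks for inode 3857051697, was 10135251 - counted 8388604",
    "data fork in regular inode 8595221288 claims used block 537201699",
    "attr fork in regular inode 8595221288 claims used block 537201698",
]
claims, fixes = summarize(sample)
for ino in sorted(claims):
    print(ino, claims[ino], fixes.get(ino))
```

A summary like this shows at a glance how many inodes beyond 3857051697 are affected and how much their block counts changed.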
So I patched out the ASSERT at dinode.c:2241 as well:
# xfs_repair xfs.img
(about 550KB output, see http://zmi.at/xfs_repair.txt )
corrupt dinode 8326467385, extent total = 1, nblocks = 0. This is a
bug.
Please capture the filesystem metadata with xfs_metadump and
report it to xfs@oss.sgi.com.
cache_node_purge: refcount was 1, not zero (node=0x7f90c4de34e0)
fatal error -- couldn't map inode 8326467385, err = 117
Now, should I PANIC? None of this looks good...
I mounted this image and ran "rm -r *", so all files/dirs that were still
good got deleted. A lot remain that can't be deleted. Can
someone help me fix it, please?
I made another metadump image; it's at
http://zmi.at/xfs.metadump-brokenonly2.bz2
mfg zmi