* bad fs - xfs_repair 3.01 crashes on it
@ 2009-07-03 11:20 Michael Monnerie
From: Michael Monnerie @ 2009-07-03 11:20 UTC (permalink / raw)
To: xfs mailing list
Tonight our server rebooted, and I found in /var/log/warn that it had been
complaining a lot about XFS since June 7 already:
Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jun 7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G 2.6.27.21-0.1-xen #1
Jun 7 03:06:31 orion.i.zmi.at kernel:
Jun 7 03:06:31 orion.i.zmi.at kernel: Call Trace:
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff804635e0>] dump_stack+0x69/0x6f
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033bbcc>] xfs_iformat_extents+0xc9/0x1c5 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033c129>] xfs_iformat+0x2b0/0x3f6 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa033c356>] xfs_iread+0xe7/0x1ed [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0337920>] xfs_iget_core+0x3a5/0x63a [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0337c97>] xfs_iget+0xe2/0x187 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0359302>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa03593bb>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0359f6a>] xfs_ioctl+0x3ca/0x680 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffffa0357ff6>] xfs_file_ioctl+0x25/0x69 [xfs]
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aa8cd>] vfs_ioctl+0x21/0x6c
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aab3a>] do_vfs_ioctl+0x222/0x231
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff802aab9a>] sys_ioctl+0x51/0x73
Jun 7 03:06:31 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jun 7 03:06:31 orion.i.zmi.at kernel: [<00007f7231d6cb77>] 0x7f7231d6cb77
But XFS didn't go offline, so nobody noticed these messages, and there are
a lot of them. They are obviously generated by the nightly
"xfs_fsr -v -t 7200" run. It would have been nice if xfs_fsr had printed
a message, so we would have received the cron mail. (But it got killed
by the kernel, so that's a good excuse.)
Anyway, I then ran xfs_repair (3.01) and got this:
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
[snip]
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
[snip]
Phase 4 - check for duplicate blocks...
[snip]
- agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
And then xfs_repair crashes out without having repaired anything. I have
attached the full xfs_repair log here; the metadump is at
http://zmi.at/x/xfs.metadump.data1.bz2
I'll be away for a week now; I hope the problem is not very serious.
mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
[-- Attachment #1.1.2: xfsrepair.data1 --]
[-- Type: text/plain, Size: 1930 bytes --]
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
- agno = 15
- agno = 16
- agno = 17
- agno = 18
- agno = 19
- agno = 20
- agno = 21
- agno = 22
- agno = 23
- agno = 24
- agno = 25
- agno = 26
- agno = 27
- agno = 28
- agno = 29
- agno = 30
- agno = 31
- agno = 32
- agno = 33
- agno = 34
- agno = 35
- agno = 36
- agno = 37
- agno = 38
- agno = 39
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-03 18:34 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had been
> complaining a lot about XFS since June 7 already:
>
> Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.

...

> But XFS didn't go offline, so nobody noticed these messages, and there are
> a lot of them. They are obviously generated by the nightly
> "xfs_fsr -v -t 7200" run. It would have been nice if xfs_fsr had printed
> a message, so we would have received the cron mail. (But it got killed
> by the kernel, so that's a good excuse.)

I'll have to think about why this didn't shut down the fs.  There are
just a few that don't.

> Anyway, I then ran xfs_repair (3.01) and got this:
>
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
>
> And then xfs_repair crashes out without having repaired anything. I have
> attached the full xfs_repair log here; the metadump is at
> http://zmi.at/x/xfs.metadump.data1.bz2

Thanks for the metadump image, I'll try to take a look.

-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-04  5:43 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had been
> complaining a lot about XFS since June 7 already:

...

> But XFS didn't go offline, so nobody noticed these messages, and there are
> a lot of them. They are obviously generated by the nightly
> "xfs_fsr -v -t 7200" run. It would have been nice if xfs_fsr had printed
> a message, so we would have received the cron mail. (But it got killed
> by the kernel, so that's a good excuse.)

ok yeah we should see why fsr didn't print anything

...

> Anyway, I then ran xfs_repair (3.01) and got this:
>
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

Ok, so this is essentially some code which first does a scan; if it
finds an error it bails out and clears the inode, but if not, it calls
essentially the same function again - the comments say "set bitmaps this
time" - but on the 2nd call it finds an error, which isn't handled well.

The ASSERT(err == 0) bit is presumably there because if the first scan
didn't find anything, the 2nd call shouldn't either, but ... that's not
the case here :(  There are more checks that can go wrong -after- the
scan-only portion.  So either the caller needs to cope with the error at
this point, or the scan-only pass needs to do all the checks, I think.

Where's Barry when you need him ....

Also I need to look at when the ASSERTs are active and when they should
be; the Fedora-packaged xfsprogs doesn't have the ASSERT active, and so
this doesn't trip.  After 2 calls to xfs_repair on Fedora, w/o the
ASSERTs active, it checks clean on the 3rd (!).  Not great.  Not sure
how much was cleared out in the process either...

-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-07-12 17:02 UTC (permalink / raw)
To: xfs

On Saturday 04 July 2009 Eric Sandeen wrote:
> Where's Barry when you need him ....

Who's that?

> Also I need to look at when the ASSERTs are active and when they
> should be; the Fedora-packaged xfsprogs doesn't have the ASSERT
> active, and so this doesn't trip. After 2 calls to xfs_repair on
> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!). Not
> great. Not sure how much was cleared out in the process either...

Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems
to only hit one file, but I don't dare delete it; might that make things
worse?

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-12 18:09 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs

Michael Monnerie wrote:
> On Saturday 04 July 2009 Eric Sandeen wrote:
>> Where's Barry when you need him ....
>
> Who's that?

The ex-SGI xfs_repair maintainer :)

>> Also I need to look at when the ASSERTs are active and when they
>> should be; the Fedora-packaged xfsprogs doesn't have the ASSERT
>> active, and so this doesn't trip. After 2 calls to xfs_repair on
>> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!). Not
>> great. Not sure how much was cleared out in the process either...
>
> Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems
> to only hit one file, but I don't dare delete it; might that make things
> worse?

Sorry, I will get back to this soon - today I hope.  I seem to be
getting more and more familiar w/ xfs_repair these days.  :)

If you do want to try deleting that one file or other such tricks, you
can do it on a sparse metadata image of the fs as a dry run:

# xfs_metadump -o /dev/whatever metadump.img
# xfs_mdrestore metadump.img filesystem.img
# mount -o loop filesystem.img mnt/
# <fiddle as you please>
# umount mnt/
# xfs_repair filesystem.img
# mount -o loop filesystem.img mnt/

and see what happens...

-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Eric Sandeen @ 2009-07-12 18:52 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had been
> complaining a lot about XFS since June 7 already:
>
> Jun 7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
> Jun 7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G 2.6.27.21-0.1-xen #1
> Jun 7 03:06:31 orion.i.zmi.at kernel:

Hm, the other sort of interesting thing here is that a recently-reported
RH bug:

[Bug 510823] "Structure needs cleaning" when reading files from an XFS
partition (extent count for ino XYZ data fork too low (6) for file format)

also seems to -possibly- be related to an xfs_fsr run, and is also
related to extents in the wrong format.

In that case it was the opposite; an inode was found in btree format
which had few enough extents that it should have been in the extents
format in the inode; in your case, it looks like there were too many
extents to fit in the format it had...

Just out of curiosity, it looks like you have rather a lot of extended
attributes on at least the inode above, is that accurate?  Or maybe
that's part of the corruption?

I'll focus on getting xfs_repair to cope first, but I wonder what
happened here...

Thanks,
-Eric
* Re: bad fs - xfs_repair 3.01 crashes on it
From: Michael Monnerie @ 2009-07-12 22:08 UTC (permalink / raw)
To: xfs

On Sunday 12 July 2009 Eric Sandeen wrote:
> Just out of curiosity, it looks like you have rather a lot of
> extended attributes on at least the inode above, is that accurate?
> Or maybe that's part of the corruption?

# find . -inum 3857051697
find: "./samba/tmp/BettyPC.tib": Die Struktur muss bereinigt werden
(means: structure needs cleaning)

I'm not sure whether that message means this file has the corresponding
inode number? If it does, the file is a backup of a PC made with
Acronis. Normally I only use xattrs to set one or two extra rights.

> I'll focus on getting xfs_repair to cope first, but I wonder what
> happened here...

No idea. We didn't have a crash on that server, IIRC. I tried some "ls"
and "getfacl" and got these crashes:

Jul 13 00:01:10 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 13 00:01:10 orion.i.zmi.at kernel: Pid: 17213, comm: find Tainted: G 2.6.27.23-0.1-xen #1
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:01:10 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 13 00:01:10 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:01:10 orion.i.zmi.at kernel: [<00007f802a89f4ce>] 0x7f802a89f4ce
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 13 00:02:35 orion.i.zmi.at kernel: Pid: 17232, comm: getfacl Tainted: G 2.6.27.23-0.1-xen #1
Jul 13 00:02:35 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff802a1baa>] sys_newlstat+0x19/0x31
Jul 13 00:02:35 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:02:35 orion.i.zmi.at kernel: [<00007f9a8d911225>] 0x7f9a8d911225

Jul 11 03:02:53 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 11 03:02:53 orion.i.zmi.at kernel: Pid: 2881, comm: xfs_fsr Tainted: G 2.6.27.23-0.1-xen #1
Jul 11 03:02:53 orion.i.zmi.at kernel:
Jul 11 03:02:53 orion.i.zmi.at kernel: Call Trace:
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0358336>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa03583ef>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa0358f9e>] xfs_ioctl+0x3ca/0x680 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffffa035702a>] xfs_file_ioctl+0x25/0x69 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aab39>] vfs_ioctl+0x21/0x6c
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aada6>] do_vfs_ioctl+0x222/0x231
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff802aae06>] sys_ioctl+0x51/0x73
Jul 11 03:02:53 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 11 03:02:53 orion.i.zmi.at kernel: [<00007fc76ba0bb77>] 0x7fc76ba0bb77

I also found this one:

Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jul 12 00:01:29 orion.i.zmi.at kernel: 00000000: 49 4e 81 ff 02 02 00 00 00 00 03 e8 00 00 00 64  IN.............d
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": XFS internal error xfs_iformat_extents(1) at line 565 of file fs/xfs/xfs_inode.c.  Caller 0xffffffffa033b153
Jul 12 00:01:29 orion.i.zmi.at kernel: Pid: 9592, comm: find Tainted: G 2.6.27.23-0.1-xen #1
Jul 12 00:01:29 orion.i.zmi.at kernel:
Jul 12 00:01:29 orion.i.zmi.at kernel: Call Trace:
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 12 00:01:29 orion.i.zmi.at kernel: [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 12 00:01:29 orion.i.zmi.at kernel: [<00007fdc209084ce>] 0x7fdc209084ce

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
* [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
From: Eric Sandeen @ 2009-07-14  4:13 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs mailing list

As reported in "bad fs - xfs_repair 3.01 crashes on it" ...

The filesystem encountered a bad attribute fork which is cleared:

local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes

and then later this inode failed an assertion:

data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

The ASSERT is there because process_inode_data_fork() calls
process_exinode() twice; once with check_dups == 1, and again with
check_dups == 0.  The assertion is that they should both return the
same answer about whether the inode contained duplicate blocks.

However, they are tested in different ways; with check_dups set,
process_exinode() simply does search_dup_extent() when it gets to
process_bmbt_reclist_int(); without check_dups set, it utilizes the
ba_bmap[][] array of bitmaps, compared against the current extent
record.

Long story short(er), when we cleared the bad attribute in
clear_dinode_attr(), it used XFS_DFORK_APTR() to get to the shortform
attribute header, and set some fields.  However, di_forkoff must have
been corrupt as well, because setting these fields corrupted the extent
list, and the 4th extent on the inode got its physical block modified
from:

431241822 / 0x19B43A5E

to:

537147998 / 0x20043A5E

and this new (corrupt) physical block matched another inode's block,
triggering the dup & return 1, triggering the ASSERT.  Whew.

Anyway, simply setting di_forkoff to 0 should be enough to flag the
inode as having no attr fork, and messing with where we think the
shortform attribute header may be is now shown to be dangerous.  Simply
not mucking with the header seems to fix the problem, based on testing
with the metadump image.

Almost.  process_inode_attr_fork() calls clear_dinode_attr(), which puts
the inode into the XFS_DINODE_FMT_EXTENTS state, but upon return resets
that to XFS_DINODE_FMT_LOCAL.  Later, it's checked that if
!XFS_DFORK_Q(), the format is XFS_DINODE_FMT_EXTENTS (!) and it gets
reset.  So drop the setting to XFS_DINODE_FMT_LOCAL; for whatever
reason, "no attributes" seems to expect _EXTENTS format, see for example
xfs_attr_shortform_remove(), clear_dinode_core(), and
xfs_attr_fork_reset() in the kernel, which all set it to _EXTENTS in
this circumstance.  Fix this up after both calls to clear_dinode_attr().

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
---

diff --git a/repair/dinode.c b/repair/dinode.c
index 84e1d05..23de0a8 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -103,23 +103,8 @@ clear_dinode_attr(xfs_mount_t *mp, xfs_dinode_t *dino, xfs_ino_t ino_num)
 	}
 
 	/* get rid of the fork by clearing forkoff */
-
-	/* Originally, when the attr repair code was added, the fork was cleared
-	 * by turning it into shortform status.  This meant clearing the
-	 * hdr.totsize/count fields and also changing aformat to LOCAL
-	 * (vs EXTENTS).  Over various fixes, the aformat and forkoff have
-	 * been updated to not show an attribute fork at all, however.
-	 * It could be possible that resetting totsize/count are not needed,
-	 * but just to be safe, leave it in for now.
-	 */
-
-	if (!no_modify)  {
-		xfs_attr_shortform_t *asf = (xfs_attr_shortform_t *)
-				XFS_DFORK_APTR(dino);
-		asf->hdr.totsize = cpu_to_be16(sizeof(xfs_attr_sf_hdr_t));
-		asf->hdr.count = 0;
-		dinoc->di_forkoff = 0;  /* got to do this after asf is set */
-	}
+	if (!no_modify)
+		dinoc->di_forkoff = 0;
 
 	/*
 	 * always returns 1 since the fork gets zapped
@@ -2195,7 +2180,6 @@ process_inode_attr_fork(
 		if (delete_attr_ok)  {
 			do_warn(_(", clearing attr fork\n"));
 			*dirty += clear_dinode_attr(mp, dino, lino);
-			dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
 		} else  {
 			do_warn("\n");
 			*dirty += clear_dinode(mp, dino, lino);
@@ -2253,12 +2237,10 @@ process_inode_attr_fork(
 				lino);
 			if (!repair)  {
 				/* clear attributes if not done already */
-				if (!no_modify)  {
+				if (!no_modify)
 					*dirty += clear_dinode_attr(mp, dino, lino);
-					dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
-				} else  {
+				else
 					do_warn(_("would clear attr fork\n"));
-				}
 				*atotblocks = 0;
 				*anextents = 0;
 			}
* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
From: Josef 'Jeff' Sipek @ 2009-07-14  5:42 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Michael Monnerie, xfs mailing list

On Mon, Jul 13, 2009 at 11:13:58PM -0500, Eric Sandeen wrote:
...
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Nice!

Josef 'Jeff' Sipek.

--
Humans were created by water to transport it upward.
* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
From: Michael Monnerie @ 2009-07-14  6:05 UTC (permalink / raw)
To: xfs

On Tuesday 14 July 2009 Eric Sandeen wrote:
> Whew.

It's people like you I'm afraid of ;-) Is there still blood in your
veins, or was it replaced with silicon at some point?

To ask in a simple way: will this version fix the problem on my disk? If
yes, where can I download it? (git ..?)

mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
From: Eric Sandeen @ 2009-07-14  6:16 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs@oss.sgi.com

On Jul 14, 2009, at 1:05 AM, Michael Monnerie
<michael.monnerie@is.it-management.at> wrote:

> On Tuesday 14 July 2009 Eric Sandeen wrote:
>> Whew.
>
> It's people like you I'm afraid of ;-) Is there still blood in your
> veins, or was it replaced with silicon at some point?

Heh.. :)

> To ask in a simple way: will this version fix the problem on my disk?
> If yes, where can I download it? (git ..?)

It's not committed yet; you could apply the patch yourself now, or wait
until it's reviewed and committed...

-Eric
Thread overview: 11+ messages (newest: 2009-07-14  6:16 UTC)
-- links below jump to the message on this page --
2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it  Michael Monnerie
2009-07-03 18:34 ` Eric Sandeen
2009-07-04  5:43 ` Eric Sandeen
2009-07-12 17:02   ` Michael Monnerie
2009-07-12 18:09     ` Eric Sandeen
2009-07-12 18:52 ` Eric Sandeen
2009-07-12 22:08   ` Michael Monnerie
2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing  Eric Sandeen
2009-07-14  5:42   ` Josef 'Jeff' Sipek
2009-07-14  6:05   ` Michael Monnerie
2009-07-14  6:16     ` Eric Sandeen