public inbox for linux-xfs@vger.kernel.org
* bad fs - xfs_repair 3.01 crashes on it
@ 2009-07-03 11:20 Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Michael Monnerie @ 2009-07-03 11:20 UTC (permalink / raw)
  To: xfs mailing list


[-- Attachment #1.1.1: Type: text/plain, Size: 3470 bytes --]

Tonight our server rebooted, and I found in /var/log/warn that it had already been 
complaining a lot about XFS since June 7:

Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jun  7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G          2.6.27.21-0.1-xen #1
Jun  7 03:06:31 orion.i.zmi.at kernel:
Jun  7 03:06:31 orion.i.zmi.at kernel: Call Trace:
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff804635e0>] dump_stack+0x69/0x6f
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033bbcc>] xfs_iformat_extents+0xc9/0x1c5 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033c129>] xfs_iformat+0x2b0/0x3f6 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa033c356>] xfs_iread+0xe7/0x1ed [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0337920>] xfs_iget_core+0x3a5/0x63a [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0337c97>] xfs_iget+0xe2/0x187 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0359302>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa03593bb>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0359f6a>] xfs_ioctl+0x3ca/0x680 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffffa0357ff6>] xfs_file_ioctl+0x25/0x69 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aa8cd>] vfs_ioctl+0x21/0x6c
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aab3a>] do_vfs_ioctl+0x222/0x231
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff802aab9a>] sys_ioctl+0x51/0x73
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<00007f7231d6cb77>] 0x7f7231d6cb77

But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
since then. It would have been nice if xfs_fsr had printed a message, so we would
have received the cron mail. (But it got killed by the kernel, so that's a fair
excuse.)

Anyway, I ran xfs_repair (3.01) and got this:

Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
[snip]
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
[snip]
Phase 4 - check for duplicate blocks...
[snip]
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

And then xfs_repair crashes without having repaired anything. I have attached the 
full xfs_repair log here; the metadump is at
http://zmi.at/x/xfs.metadump.data1.bz2

I'll be away for a week now; I hope the problem is not too serious.

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.1.2: xfsrepair.data1 --]
[-- Type: text/plain, Size: 1930 bytes --]

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
@ 2009-07-03 18:34 ` Eric Sandeen
  2009-07-04  5:43 ` Eric Sandeen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-03 18:34 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:
> 
> Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.

...

> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
> since then. It would have been nice if xfs_fsr had printed a message, so we would
> have received the cron mail. (But it got killed by the kernel, so that's a fair
> excuse.)

I'll have to think about why this didn't shut down the fs.  There are
just a few that don't.

> Anyway, I ran xfs_repair (3.01) and got this:
> 
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
> 
> And then xfs_repair crashes without having repaired anything. I have attached the 
> full xfs_repair log here; the metadump is at
> http://zmi.at/x/xfs.metadump.data1.bz2

Thanks for the metadump image, I'll try to take a look.

-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
@ 2009-07-04  5:43 ` Eric Sandeen
  2009-07-12 17:02   ` Michael Monnerie
  2009-07-12 18:52 ` Eric Sandeen
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-07-04  5:43 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:

...

> But XFS didn't go offline, so nobody noticed these messages. There are a lot of them.
> They are obviously generated by the nightly "xfs_fsr -v -t 7200" which we have run
> since then. It would have been nice if xfs_fsr had printed a message, so we would
> have received the cron mail. (But it got killed by the kernel, so that's a fair
> excuse.)

ok yeah we should see why fsr didn't print anything ...

> Anyway, I ran xfs_repair (3.01) and got this:
> 
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
> [snip]
>         - agno = 14
> local inode 3857051697 attr too small (size = 3, min size = 4)
> bad attribute fork in inode 3857051697, clearing attr fork
> clearing inode 3857051697 attributes
> cleared inode 3857051697
> [snip]
> Phase 4 - check for duplicate blocks...
> [snip]
>         - agno = 15
> data fork in regular inode 3857051697 claims used block 537147998
> xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

OK, so this is essentially code which first does a scan; if it finds an error,
it bails out and clears the inode, but if not, it calls essentially the same
function again (the comments say "set bitmaps this time") - but on the 2nd call
it finds an error, which isn't handled well.  The ASSERT(err == 0) is presumably
there because if the first scan didn't find anything, the 2nd call shouldn't
either, but ... that's not the case here :(  There are more checks that can go
wrong -after- the scan-only portion.

So either the caller needs to cope with the error at this point, or the
scan-only pass needs to do all the checks, I think.

Where's Barry when you need him ....

Also I need to look at when the ASSERTs are active and when they should
be; the Fedora packaged xfsprogs doesn't have the ASSERT active, and so
this doesn't trip.  After 2 calls to xfs_repair on Fedora, w/o the
ASSERTs active, it checks clean on the 3rd (!).  Not great.  Not sure
how much was cleared out in the process either...

-Eric



* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-04  5:43 ` Eric Sandeen
@ 2009-07-12 17:02   ` Michael Monnerie
  2009-07-12 18:09     ` Eric Sandeen
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Monnerie @ 2009-07-12 17:02 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 960 bytes --]

On Saturday 04 July 2009 Eric Sandeen wrote:
> Where's Barry when you need him ....

Who's that?

> Also I need to look at when the ASSERTs are active and when they
> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
> active, and so this doesn't trip.  After 2 calls to xfs_repair on
> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!).  Not
> great.  Not sure how much was cleared out in the process either...

Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems to 
only hit one file, but I don't dare delete it - might that make things 
worse?

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-12 17:02   ` Michael Monnerie
@ 2009-07-12 18:09     ` Eric Sandeen
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-12 18:09 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs

Michael Monnerie wrote:
> On Saturday 04 July 2009 Eric Sandeen wrote:
>> Where's Barry when you need him ....
> 
> Who's that?

The ex-SGI xfs_repair maintainer :)

>> Also I need to look at when the ASSERTs are active and when they
>> should be; the Fedora packaged xfsprogs doesn't have the ASSERT
>> active, and so this doesn't trip.  After 2 calls to xfs_repair on
>> Fedora, w/o the ASSERTs active, it checks clean on the 3rd (!).  Not
>> great.  Not sure how much was cleared out in the process either...
> 
> Any ideas/news on this? I'd like to xfs_repair that filesystem. It seems to 
> only hit one file, but I don't dare delete it - might that make things 
> worse?
> 
> mfg zmi

Sorry, I will get back to this soon - today I hope.  I seem to be
getting more and more familiar w/ xfs_repair these days.  :)

If you do want to try deleting that one file or other such tricks, you
can do it on a sparse metadata image of the fs as a dry run:

# xfs_metadump -o /dev/whatever metadump.img
# xfs_mdrestore metadump.img filesystem.img
# mount -o loop filesystem.img mnt/
# <fiddle as you please>
# umount mnt/
# xfs_repair filesystem.img
# mount -o loop filesystem.img mnt/

and see what happens...

-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
  2009-07-03 18:34 ` Eric Sandeen
  2009-07-04  5:43 ` Eric Sandeen
@ 2009-07-12 18:52 ` Eric Sandeen
  2009-07-12 22:08   ` Michael Monnerie
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  3 siblings, 1 reply; 11+ messages in thread
From: Eric Sandeen @ 2009-07-12 18:52 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

Michael Monnerie wrote:
> Tonight our server rebooted, and I found in /var/log/warn that it had already been 
> complaining a lot about XFS since June 7:
> 
> Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
> Jun  7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G          2.6.27.21-0.1-xen #1
> Jun  7 03:06:31 orion.i.zmi.at kernel:

Hm, the other sort of interesting thing here is that a recently-reported
RH bug:

[Bug 510823] "Structure needs cleaning" when reading files from an XFS
partition (extent count for ino XYZ data fork too low (6) for file format)

also seems to -possibly- be related to an xfs_fsr run, and is also
related to extents in the wrong format.  In that case it was the
opposite; an inode was found in btree format with few enough
extents that it should have been in extents format inside the inode; in
your case, it looks like there were too many extents to fit in the
format it had...

Just out of curiosity, it looks like you have rather a lot of extended
attributes on at least the inode above, is that accurate?  Or maybe
that's part of the corruption?

I'll focus on getting xfs_repair to cope first, but I wonder what
happened here...

Thanks,
-Eric


* Re: bad fs - xfs_repair 3.01 crashes on it
  2009-07-12 18:52 ` Eric Sandeen
@ 2009-07-12 22:08   ` Michael Monnerie
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Monnerie @ 2009-07-12 22:08 UTC (permalink / raw)
  To: xfs

On Sunday 12 July 2009 Eric Sandeen wrote:
> Just out of curiosity, it looks like you have rather a lot of
> extended attributes on at least the inode above, is that accurate?
>  Or maybe that's part of the corruption?

# find . -inum 3857051697
find: "./samba/tmp/BettyPC.tib": Die Struktur muss bereinigt werden
(means: structure needs cleaning)

I'm not sure whether that message means the file has the corresponding inode
number? If it does, it's a backup of a PC made with Acronis.
Normally I only use xattr's to set one or two extra rights

> I'll focus on getting xfs_repair to cope first, but I wonder what
> happened here...

No idea. We didn't have a crash on that server, IIRC. I tried some "ls" 
and "getfacl" and got these crashes:

Jul 13 00:01:10 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 13 00:01:10 orion.i.zmi.at kernel: Pid: 17213, comm: find Tainted: G          2.6.27.23-0.1-xen #1
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:01:10 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:01:10 orion.i.zmi.at kernel:  [<00007f802a89f4ce>] 0x7f802a89f4ce
Jul 13 00:01:10 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 13 00:02:35 orion.i.zmi.at kernel: Pid: 17232, comm: getfacl Tainted: G          2.6.27.23-0.1-xen #1
Jul 13 00:02:35 orion.i.zmi.at kernel:
Jul 13 00:02:35 orion.i.zmi.at kernel: Call Trace:
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff802a1baa>] sys_newlstat+0x19/0x31
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 13 00:02:35 orion.i.zmi.at kernel:  [<00007f9a8d911225>] 0x7f9a8d911225

Jul 11 03:02:53 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 11 03:02:53 orion.i.zmi.at kernel: Pid: 2881, comm: xfs_fsr Tainted: G          2.6.27.23-0.1-xen #1
Jul 11 03:02:53 orion.i.zmi.at kernel:
Jul 11 03:02:53 orion.i.zmi.at kernel: Call Trace:
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0358336>] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa03583ef>] xfs_open_by_handle+0x60/0x1cb [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa0358f9e>] xfs_ioctl+0x3ca/0x680 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffffa035702a>] xfs_file_ioctl+0x25/0x69 [xfs]
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aab39>] vfs_ioctl+0x21/0x6c
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aada6>] do_vfs_ioctl+0x222/0x231
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff802aae06>] sys_ioctl+0x51/0x73
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 11 03:02:53 orion.i.zmi.at kernel:  [<00007fc76ba0bb77>] 0x7fc76ba0bb77

I also found this one:
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5).  Unmount and run xfs_repair.
Jul 12 00:01:29 orion.i.zmi.at kernel: 00000000: 49 4e 81 ff 02 02 00 00 00 00 03 e8 00 00 00 64  IN.............d
Jul 12 00:01:29 orion.i.zmi.at kernel: Filesystem "dm-0": XFS internal error xfs_iformat_extents(1) at line 565 of file fs/xfs/xfs_inode.c.  Caller 0xffffffffa033b153
Jul 12 00:01:29 orion.i.zmi.at kernel: Pid: 9592, comm: find Tainted: G          2.6.27.23-0.1-xen #1
Jul 12 00:01:29 orion.i.zmi.at kernel:
Jul 12 00:01:29 orion.i.zmi.at kernel: Call Trace:
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff8020c597>] show_trace_log_lvl+0x41/0x58
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff80463a33>] dump_stack+0x69/0x6f
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033abf8>] xfs_iformat_extents+0xc9/0x1c4 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033b153>] xfs_iformat+0x2b0/0x3f7 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa033b381>] xfs_iread+0xe7/0x1ee [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0336928>] xfs_iget_core+0x3a5/0x63a [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0336c9f>] xfs_iget+0xe2/0x187 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa0350941>] xfs_lookup+0x79/0xa5 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffffa035955b>] xfs_vn_lookup+0x3c/0x78 [xfs]
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a5d27>] real_lookup+0x7e/0x10f
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a5e1b>] do_lookup+0x63/0xb6
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a84a8>] __link_path_walk+0x9f4/0xe58
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a8ad7>] path_walk+0x5e/0xb9
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a8c94>] do_path_lookup+0x162/0x1b9
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a962e>] user_path_at+0x48/0x79
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a1b65>] vfs_lstat_fd+0x15/0x41
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff802a1c68>] sys_newfstatat+0x22/0x43
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<ffffffff8020b3b8>] system_call_fastpath+0x16/0x1b
Jul 12 00:01:29 orion.i.zmi.at kernel:  [<00007fdc209084ce>] 0x7fdc209084ce


mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


* [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
                   ` (2 preceding siblings ...)
  2009-07-12 18:52 ` Eric Sandeen
@ 2009-07-14  4:13 ` Eric Sandeen
  2009-07-14  5:42   ` Josef 'Jeff' Sipek
  2009-07-14  6:05   ` Michael Monnerie
  3 siblings, 2 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-14  4:13 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs mailing list

As reported in "bad fs - xfs_repair 3.01 crashes on it" ...

xfs_repair encountered a bad attribute fork, which it cleared:

local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes

and then later this inode failed an assertion:

data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

The ASSERT is there because process_inode_data_fork() calls 
process_exinode() twice: once with check_dups == 1, and again with 
check_dups == 0.  The assertion is that both should return the
same answer about whether the inode contained duplicate blocks.

However, they are tested in different ways; with check_dups set,
process_exinode() simply does search_dup_extent() when it gets
to process_bmbt_reclist_int(); without check_dups set, it utilizes
the ba_bmap[][] array of bitmaps, compared against the current
extent record.

Long story short(er): when we cleared the bad attribute fork in
clear_dinode_attr(), it used XFS_DFORK_APTR() to get to the
shortform attribute header, and set some fields.  However,
di_forkoff must have been corrupt as well, because setting these
fields corrupted the extent list, and the 4th extent on the inode
had its physical block modified from:
431241822 / 0x19B43A5E
to:
537147998 / 0x20043A5E

and this new (corrupt) physical block matched another inode's
block, triggering the dup & return 1, triggering the ASSERT.

Whew.

Anyway, simply setting di_forkoff to 0 should be enough to flag
the inode as having no attr fork, and messing with where we
think the shortform attribute header might be has now been shown to be
dangerous.  Simply not touching the header seems to fix
the problem, based on testing with the metadump image.

Almost.

process_inode_attr_fork() calls clear_dinode_attr(), which puts
the inode into the XFS_DINODE_FMT_EXTENTS state, but upon return resets
that to XFS_DINODE_FMT_LOCAL.  Later, a check requires that if 
!XFS_DFORK_Q(), the format be XFS_DINODE_FMT_EXTENTS (!),
and it gets reset.

So drop the setting to XFS_DINODE_FMT_LOCAL; for whatever reason,
"no attributes" seems to expect _EXTENTS format, see for example
xfs_attr_shortform_remove(), clear_dinode_core(), and 
xfs_attr_fork_reset() in the kernel, which all set it to _EXTENTS
in this circumstance.

Fix this up after both calls to clear_dinode_attr().

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
---

diff --git a/repair/dinode.c b/repair/dinode.c
index 84e1d05..23de0a8 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -103,23 +103,8 @@ clear_dinode_attr(xfs_mount_t *mp, xfs_dinode_t *dino, xfs_ino_t ino_num)
 	}
 
 	/* get rid of the fork by clearing forkoff */
-
-	/* Originally, when the attr repair code was added, the fork was cleared
-	 * by turning it into shortform status.  This meant clearing the
-	 * hdr.totsize/count fields and also changing aformat to LOCAL
-	 * (vs EXTENTS).  Over various fixes, the aformat and forkoff have
-	 * been updated to not show an attribute fork at all, however.
-	 * It could be possible that resetting totsize/count are not needed,
-	 * but just to be safe, leave it in for now.
-	 */
-
-	if (!no_modify) {
-		xfs_attr_shortform_t *asf = (xfs_attr_shortform_t *)
-				XFS_DFORK_APTR(dino);
-		asf->hdr.totsize = cpu_to_be16(sizeof(xfs_attr_sf_hdr_t));
-		asf->hdr.count = 0;
-		dinoc->di_forkoff = 0;  /* got to do this after asf is set */
-	}
+	if (!no_modify)
+		dinoc->di_forkoff = 0;
 
 	/*
 	 * always returns 1 since the fork gets zapped
@@ -2195,7 +2180,6 @@ process_inode_attr_fork(
 			if (delete_attr_ok)  {
 				do_warn(_(", clearing attr fork\n"));
 				*dirty += clear_dinode_attr(mp, dino, lino);
-				dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
 			} else  {
 				do_warn("\n");
 				*dirty += clear_dinode(mp, dino, lino);
@@ -2253,12 +2237,10 @@ process_inode_attr_fork(
 			lino);
 		if (!repair) {
 			/* clear attributes if not done already */
-			if (!no_modify)  {
+			if (!no_modify)
 				*dirty += clear_dinode_attr(mp, dino, lino);
-				dinoc->di_aformat = XFS_DINODE_FMT_LOCAL;
-			} else  {
+			else
 				do_warn(_("would clear attr fork\n"));
-			}
 			*atotblocks = 0;
 			*anextents = 0;
 		}



* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
@ 2009-07-14  5:42   ` Josef 'Jeff' Sipek
  2009-07-14  6:05   ` Michael Monnerie
  1 sibling, 0 replies; 11+ messages in thread
From: Josef 'Jeff' Sipek @ 2009-07-14  5:42 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Michael Monnerie, xfs mailing list

On Mon, Jul 13, 2009 at 11:13:58PM -0500, Eric Sandeen wrote:
...
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>

Nice!

Josef 'Jeff' Sipek.

-- 
Humans were created by water to transport it upward.


* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
  2009-07-14  5:42   ` Josef 'Jeff' Sipek
@ 2009-07-14  6:05   ` Michael Monnerie
  2009-07-14  6:16     ` Eric Sandeen
  1 sibling, 1 reply; 11+ messages in thread
From: Michael Monnerie @ 2009-07-14  6:05 UTC (permalink / raw)
  To: xfs


[-- Attachment #1.1: Type: text/plain, Size: 651 bytes --]

On Tuesday 14 July 2009 Eric Sandeen wrote:
> Whew.

It's people like you I'm afraid of ;-) Is there still blood in your 
veins or was it replaced with silicone at some point?

To ask in a simple way: will this version fix the problem on my disk? If 
yes, where could I download it? (git ...?)

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4


[-- Attachment #1.2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]


* Re: [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing
  2009-07-14  6:05   ` Michael Monnerie
@ 2009-07-14  6:16     ` Eric Sandeen
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Sandeen @ 2009-07-14  6:16 UTC (permalink / raw)
  To: Michael Monnerie; +Cc: xfs@oss.sgi.com

On Jul 14, 2009, at 1:05 AM, Michael Monnerie <michael.monnerie@is.it-management.at> wrote:

> On Tuesday 14 July 2009 Eric Sandeen wrote:
>> Whew.
>
> It's people like you I'm afraid of ;-) Is there still blood in your
> veins or was it replaced with silicone at some point?

Heh.. :)

>
> To ask in a simple way: will this version fix the problem on my disk?
> If yes, where could I download it? (git ...?)
>
It's not committed yet; you could apply the patch yourself now, or wait
until it's reviewed and committed...

-Eric
> mfg zmi
> -- 
> // Michael Monnerie, Ing.BSc    -----      http://it-management.at
> // Tel: 0660 / 415 65 31                      .network.your.ideas.
> // PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
> // Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
> // Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4
>


end of thread, other threads:[~2009-07-14  6:16 UTC | newest]

Thread overview: 11+ messages
2009-07-03 11:20 bad fs - xfs_repair 3.01 crashes on it Michael Monnerie
2009-07-03 18:34 ` Eric Sandeen
2009-07-04  5:43 ` Eric Sandeen
2009-07-12 17:02   ` Michael Monnerie
2009-07-12 18:09     ` Eric Sandeen
2009-07-12 18:52 ` Eric Sandeen
2009-07-12 22:08   ` Michael Monnerie
2009-07-14  4:13 ` [PATCH] xfs_repair - do not attempt to set shortform attr header when clearing Eric Sandeen
2009-07-14  5:42   ` Josef 'Jeff' Sipek
2009-07-14  6:05   ` Michael Monnerie
2009-07-14  6:16     ` Eric Sandeen
