public inbox for linux-xfs@vger.kernel.org
* xfs_repair misses an fs error?
@ 2013-04-15 23:47 Keith Keller
  2013-04-16 16:25 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Keller @ 2013-04-15 23:47 UTC
  To: linux-xfs

Hi all,

I recently had a filesystem stop, and went through the usual umount,
mount, umount, xfs_repair path.  xfs_repair didn't find any errors, but
I noticed that I was still getting some strange issues where a file in a
directory didn't have an inode and was resulting in IO errors.  Thinking
to address the problem later, I moved the directory to a different
location on the filesystem so that future backups could proceed
normally.

Later, in an attempt to collect log errors, I re-ran xfs_repair on the
filesystem.  It then found errors and corrected them, and on remount, I
found that the directory in question was repaired and no longer had a
file entry with no inode.  (So I can't reproduce that output; I didn't
save it, thinking that the second xfs_repair wouldn't fix the issue, and
I'd generate it again before posting.)

Has anyone else seen a situation where xfs_repair misses a filesystem
problem, but then finds it if a file or directory is moved?  And more
generally, is there a way to use xfs_db or similar to try to find other
inodes that might be causing similar problems?

In case it's helpful, the last xfs_repair stderr is below.
Unfortunately I didn't save the initial xfs_repair logs, but from what I
remember (which could be inaccurate, I admit) I did not see any errors.
FWIW, 8495309 was the inode of the directory that was reporting the file
with no inode before the successful xfs_repair.

--keith

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
correcting nblocks for inode 14653773, was 18446744073709486338 - counted 1
data fork in ino 14653803 claims free block 1654114652
data fork in ino 14653833 claims free block 1654114653
        - agno = 1
correcting nblocks for inode 2161878797, was 72057594037927936 - counted 0
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 8495309 (hash value mismatch): rebuilding
rebuilding directory inode 8495309
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
resetting inode 9161151 nlinks from 10 to 9
resetting inode 9161407 nlinks from 8 to 9
Note - quota info will be regenerated on next quota mount.
done
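
As an illustration only (not part of the original message), the inode numbers that a log like the one above mentions can be pulled out mechanically, which gives a list to inspect further. The sketch below is Python; the function names `flagged_inodes` and `as_signed64` are hypothetical, the message patterns are copied from this log and may be worded differently by other xfs_repair versions, and reading the "was" values as 64-bit quantities is an assumption.

```python
import re

# Message shapes copied from the log above; other versions may differ.
INODE_PATTERNS = [
    re.compile(r"correcting nblocks for inode (\d+)"),
    re.compile(r"data fork in ino (\d+) claims free block"),
    re.compile(r"bad hash table for directory inode (\d+)"),
    re.compile(r"rebuilding directory inode (\d+)"),
    re.compile(r"resetting inode (\d+) nlinks"),
]


def flagged_inodes(log_text):
    """Return the sorted, de-duplicated inode numbers named in repair messages."""
    found = set()
    for line in log_text.splitlines():
        for pattern in INODE_PATTERNS:
            match = pattern.search(line)
            if match:
                found.add(int(match.group(1)))
    return sorted(found)


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed.

    Assumption: the value printed by xfs_repair is a 64-bit counter.
    """
    return value - (1 << 64) if value >= (1 << 63) else value
```

Under that 64-bit reading, `as_signed64(18446744073709486338)` is -65278 and 72057594037927936 equals `1 << 56`, so both "was" values are far outside any plausible block count; the log itself does not say what caused either. Each number returned by `flagged_inodes` could then be examined read-only, for example with `xfs_db -r -c "inode 8495309" -c "print"` followed by the device path.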

-- 
kkeller@wombat.san-francisco.ca.us


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: xfs_repair misses an fs error?
  2013-04-15 23:47 xfs_repair misses an fs error? Keith Keller
@ 2013-04-16 16:25 ` Dave Chinner
  2013-04-16 18:44   ` Keith Keller
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2013-04-16 16:25 UTC
  To: Keith Keller; +Cc: linux-xfs

On Mon, Apr 15, 2013 at 04:47:32PM -0700, Keith Keller wrote:
> Hi all,
> 
> I recently had a filesystem stop, and went through the usual umount,
> mount, umount, xfs_repair path.  xfs_repair didn't find any errors, but
> I noticed that I was still getting some strange issues where a file in a
> directory didn't have an inode and was resulting in IO errors.  Thinking
> to address the problem later, I moved the directory to a different
> location on the filesystem so that future backups could proceed
> normally.
> 
> Later, in an attempt to collect log errors, I re-ran xfs_repair on the
> filesystem.  It then found errors and corrected them, and on remount, I
> found that the directory in question was repaired and no longer had a
> file entry with no inode.  (So I can't reproduce that output; I didn't
> save it, thinking that the second xfs_repair wouldn't fix the issue, and
> I'd generate it again before posting.)
> 
> Has anyone else seen a situation where xfs_repair misses a filesystem
> problem, but then finds it if a file or directory is moved?  And more

Not recently. What version of xfs_repair are you using?

> generally, is there a way to use xfs_db or similar to try to find other
> inodes that might be causing similar problems?

xfs_repair -n is the usual way to find broken stuff without
modifying anything....
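
As an illustration only (not part of the original reply), a dry run along those lines could be scripted so that its output is always kept for later posting. This is a minimal Python sketch; `build_dry_run_command`, `dry_run`, the device path and the log path are hypothetical names, and it assumes the filesystem is unmounted and that `xfs_repair` is on the PATH.

```python
import subprocess


def build_dry_run_command(device):
    """Command line for a no-modify check; -n makes xfs_repair report only."""
    return ["xfs_repair", "-n", device]


def dry_run(device, log_path):
    """Run the check and save everything it prints to log_path.

    Returns the exit status of xfs_repair. The device must be unmounted.
    """
    result = subprocess.run(
        build_dry_run_command(device),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stdout and stderr in one log
        text=True,
    )
    with open(log_path, "w") as log_file:
        log_file.write(result.stdout)
    return result.returncode
```

The xfs_repair man page documents that in no-modify mode the exit status is 1 when corruption was detected and 0 when it was not, so the return value of `dry_run` can be checked without reading the log.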

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: xfs_repair misses an fs error?
  2013-04-16 16:25 ` Dave Chinner
@ 2013-04-16 18:44   ` Keith Keller
  2013-04-16 19:19     ` Roger Willcocks
  2013-04-16 22:05     ` Dave Chinner
  0 siblings, 2 replies; 5+ messages in thread
From: Keith Keller @ 2013-04-16 18:44 UTC
  To: linux-xfs

On 2013-04-16, Dave Chinner <david@fromorbit.com> wrote:
>
> Not recently. What version of xfs_repair are you using?

Hmm, perhaps this is a difference.  I believe (though, again I did very
poor logging, and I apologize) that the initial repair used 3.1.1.  The
recent successful repair definitely used 3.1.10.  Is it possible 3.1.1
is old enough that it might not have caught the issues I reported from
the 3.1.10 log?
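
As an illustration only (not part of the original message), version strings such as 3.1.1 and 3.1.10 are easy to misorder when compared as text, so a check for "is this xfs_repair new enough" is better done on numeric components. In this Python sketch, `parse_version` is a hypothetical helper, and the `xfs_repair version 3.1.10` wording of the `xfs_repair -V` output is an assumption.

```python
import re


def parse_version(text):
    """Extract a dotted version such as '3.1.10' as a tuple of integers."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_at_least(version_text, minimum_text):
    """True if version_text is the same as or newer than minimum_text."""
    return parse_version(version_text) >= parse_version(minimum_text)
```

For example, `is_at_least("xfs_repair version 3.1.10", "3.1.8")` compares (3, 1, 10) against (3, 1, 8), whereas a plain string comparison would put "3.1.10" before "3.1.8".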

--keith


-- 
kkeller@wombat.san-francisco.ca.us



* Re: xfs_repair misses an fs error?
  2013-04-16 18:44   ` Keith Keller
@ 2013-04-16 19:19     ` Roger Willcocks
  2013-04-16 22:05     ` Dave Chinner
  1 sibling, 0 replies; 5+ messages in thread
From: Roger Willcocks @ 2013-04-16 19:19 UTC
  To: Keith Keller; +Cc: linux-xfs

On Tue, 2013-04-16 at 11:44 -0700, Keith Keller wrote:
> On 2013-04-16, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Not recently. What version of xfs_repair are you using?
> 
> Hmm, perhaps this is a difference.  I believe (though, again I did very
> poor logging, and I apologize) that the initial repair used 3.1.1.  The
> recent successful repair definitely used 3.1.10.  Is it possible 3.1.1
> is old enough that it might not have caught the issues I reported from
> the 3.1.10 log?
> 

Yes, we had a system here for which xfs_repair 3.1.6 reported for 30 or
so files:

   data fork in regular inode 3238731555 claims used block 1080914355
   correcting nblocks for inode 3238731555, was 304 - counted 0

in phase three, and 

   correcting nblocks for inode 3238731555, was 0 - counted 304

in phase four, so the filesystem ended up back where it started. Version
3.1.8 fixed this, reporting instead e.g.:

   data fork in regular inode 3238731617 claims used block 1080933203
   correcting nextents for inode 3238731617
   correcting nblocks for inode 3238731617, was 304 - counted 0
   correcting nextents for inode 3238731617, was 1 - counted 0
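
As an illustration only (not part of the original message), the "back where it started" pattern above can be spotted in a saved log by looking for an inode whose nblocks is corrected and then set back to its original value. In this Python sketch, `undone_corrections` is a hypothetical name and the message wording is taken from the excerpts above, so other versions may need a different pattern.

```python
import re

# Tolerates the mail-wrapped form where "counted N" lands on the next line.
NBLOCKS = re.compile(
    r"correcting nblocks for inode (\d+), was (\d+)\s*-\s*counted\s+(\d+)"
)


def undone_corrections(log_text):
    """Return inodes whose nblocks correction was later reverted."""
    original = {}  # inode -> 'was' value from its first correction
    undone = []
    for match in NBLOCKS.finditer(log_text):
        inode, was, counted = (int(group) for group in match.groups())
        if inode not in original:
            original[inode] = was
        elif counted == original[inode] and inode not in undone:
            # A later phase counted the value the first phase replaced.
            undone.append(inode)
    return undone
```

Run on the 3.1.6 excerpt this reports inode 3238731555; run on the 3.1.8 excerpt it reports nothing, since nblocks is corrected only once there.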

-- 
Roger Willcocks <roger@filmlight.ltd.uk>


* Re: xfs_repair misses an fs error?
  2013-04-16 18:44   ` Keith Keller
  2013-04-16 19:19     ` Roger Willcocks
@ 2013-04-16 22:05     ` Dave Chinner
  1 sibling, 0 replies; 5+ messages in thread
From: Dave Chinner @ 2013-04-16 22:05 UTC
  To: Keith Keller; +Cc: linux-xfs

On Tue, Apr 16, 2013 at 11:44:54AM -0700, Keith Keller wrote:
> On 2013-04-16, Dave Chinner <david@fromorbit.com> wrote:
> >
> > Not recently. What version of xfs_repair are you using?
> 
> Hmm, perhaps this is a difference.  I believe (though, again I did very
> poor logging, and I apologize) that the initial repair used 3.1.1.  The
> recent successful repair definitely used 3.1.10.  Is it possible 3.1.1
> is old enough that it might not have caught the issues I reported from
> the 3.1.10 log?

Yes - there have been problems like this fixed since 3.1.1...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

