public inbox for linux-ext4@vger.kernel.org
* fsck.ext4 returning false positives
@ 2013-02-27 21:16 Bryan Mesich
  2013-02-27 21:47 ` Theodore Ts'o
  0 siblings, 1 reply; 3+ messages in thread
From: Bryan Mesich @ 2013-02-27 21:16 UTC (permalink / raw)
  To: linux-ext4, tytso

We have a semi-large NFS file server (in terms of storage) that is
responsible for delivering storage to our Learning Management System (LMS).
About 6 months ago, we ran into file system corruption on said server
(at the time, we were using ext3).  After fixing the corruption, I decided
it would be a good idea to run a weekly fsck on the large file system in
hopes of heading off a situation where the file system gets re-mounted
read-only due to corruption.

The file system in question is 1.8TB in size, which took a _very_ long time
to check when using ext3 (thus the move to ext4).  Taking the system down
weekly to run a file system check was not feasible, so I used lvm/dm to
take a read-write snapshot of the volume.  I could then run fsck on the
snapshot volume without taking the system down.  I made sure to mount the
snap volume before running fsck so that the journal could do recovery.  The
steps I'm using are as follows:

- Snapshot volume (read-write)
- Mount snap volume (replay journal)
- Umount snap volume
- Run fsck on snap volume
- Remove snap volume
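
The steps above can be sketched as a small Python wrapper. This is only an
illustration under assumptions, not the script actually used: the origin LV
name (bbcontent), the snapshot size (20G) and the mount point
(/mnt/snapcheck) are hypothetical, and "e2fsck -f -n" is assumed because the
output further down shows every "Fix?" question answered with "no".

```python
import subprocess


def snapshot_fsck_commands(vg="sanvg2", origin="bbcontent",
                           snap="bbcontent_snap", snap_size="20G",
                           mountpoint="/mnt/snapcheck"):
    """Return the five steps as argv lists.

    origin, snap_size and mountpoint are hypothetical placeholders.
    """
    origin_dev = f"/dev/{vg}/{origin}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # 1. Snapshot volume (LVM snapshots are read-write by default)
        ["lvcreate", "-s", "-n", snap, "-L", snap_size, origin_dev],
        # 2. Mount snap volume so the kernel replays the journal
        ["mount", snap_dev, mountpoint],
        # 3. Umount snap volume
        ["umount", mountpoint],
        # 4. Forced, read-only check (answers "no" to every question)
        ["e2fsck", "-f", "-n", snap_dev],
        # 5. Remove snap volume
        ["lvremove", "-f", snap_dev],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Returns the e2fsck exit status (None in a dry run). A non-zero
    status from e2fsck is not treated as fatal, so the snapshot is
    still removed afterwards.
    """
    fsck_status = None
    for cmd in commands:
        print(" ".join(cmd))
        if dry_run:
            continue
        result = subprocess.run(cmd)
        if cmd[0] == "e2fsck":
            fsck_status = result.returncode
        elif result.returncode != 0:
            raise RuntimeError(f"{cmd[0]} failed with status {result.returncode}")
    return fsck_status


if __name__ == "__main__":
    run(snapshot_fsck_commands())  # dry run: only prints the commands
```

With dry_run=True nothing is executed, so the command list can be reviewed
before it is run as root. If a step other than e2fsck fails, the sketch stops
and leaves the snapshot in place for manual cleanup.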

I migrated the file system to ext4 in December 2012 by copying the files
from the old file system to the new one (I didn't go the "upgrade" route).
I continued performing the weekly file system checks after migrating to
ext4 and started seeing strange behavior when running fsck on a snapshot
volume.  Here is the output from this morning's fsck:

e2fsck 1.42.6 (21-Sep-2012)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (133413770, counted=133413835).
Fix? no

Free inodes count wrong (118244509, counted=118244510).
Fix? no

/dev/sanvg2/bbcontent_snap: 2554723/120799232 files (0.5% non-contiguous),
349770870/483184640 blocks

This is the 3rd time fsck has indicated problems with the free block and inode
counts since migrating to ext4 in December 2012.  And each time I take the
server down to umount and fsck the file system, nothing is fixed or found
wrong with the file system.  I ran the check again this morning (with an updated
e2fsprogs) and got the same results:

e2fsck 1.42.7 (21-Jan-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong (133197192, counted=133197331).
Fix? no

Free inodes count wrong (118242252, counted=118242254).
Fix? no

/dev/sanvg2/bbcontent_snap: 2556980/120799232 files (0.5% non-contiguous),
349987448/483184640 blocks
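
To put a number on the mismatches in the two runs above, here is a minimal
Python sketch that pulls the recorded and counted values out of the Pass 5
messages. The helper name count_deltas is made up for illustration; the
figures are copied from the output above.

```python
import re

# Matches e.g. "Free blocks count wrong (133413770, counted=133413835)."
PATTERN = re.compile(
    r"Free (blocks|inodes) count wrong \((\d+), counted=(\d+)\)")


def count_deltas(fsck_output):
    """Return {kind: counted - recorded} for each Pass 5 mismatch."""
    return {kind: int(counted) - int(recorded)
            for kind, recorded, counted in PATTERN.findall(fsck_output)}


RUN_1_42_6 = """\
Free blocks count wrong (133413770, counted=133413835).
Free inodes count wrong (118244509, counted=118244510).
"""

RUN_1_42_7 = """\
Free blocks count wrong (133197192, counted=133197331).
Free inodes count wrong (118242252, counted=118242254).
"""

print(count_deltas(RUN_1_42_6))  # {'blocks': 65, 'inodes': 1}
print(count_deltas(RUN_1_42_7))  # {'blocks': 139, 'inodes': 2}

# The summary lines agree with the recorded (not the counted) values:
# used + recorded free equals the total in both runs.
assert 349770870 + 133413770 == 483184640
assert 349987448 + 133197192 == 483184640
assert 2554723 + 118244509 == 120799232
assert 2556980 + 118242252 == 120799232
```

In both runs fsck counted slightly more free blocks and inodes than were
recorded: 65 blocks and 1 inode in the first run, 139 blocks and 2 inodes in
the second, out of 483184640 blocks and 120799232 inodes in total.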

I'm not sure what's to blame for this problem. Any help would be
appreciated.  Server is running the following:

RHEL 5.9 x86_64
Kernel 3.4.29
e2fsprogs 1.42.7

Storage stack has the following:

[MD RAID1] -> [LVM - 2 LVs] -> [EXT4]

Thanks in advance,

Bryan


end of thread, other threads:[~2013-02-28 15:49 UTC | newest]

Thread overview: 3+ messages
2013-02-27 21:16 fsck.ext4 returning false positives Bryan Mesich
2013-02-27 21:47 ` Theodore Ts'o
2013-02-28 15:49   ` Bryan Mesich
