Message-ID: <4ECBBA61.4050704@sandeen.net>
Date: Tue, 22 Nov 2011 09:06:09 -0600
From: Eric Sandeen
Subject: Re: EFSCORRUPTED on mount?
List-Id: XFS Filesystem from SGI
To: Gregory Farnum
Cc: xfs@oss.sgi.com

On 11/21/11 12:06 PM, Gregory Farnum wrote:
> While working with a Ceph node running XFS we somehow managed to
> corrupt our filesystem. I don't think there were any hard powercycles
> on this node, but while starting up after a kernel upgrade (it's
> running 3.1) the daemon was performing its usual startup sequence (a
> lot of file truncates, mostly) when it got an error out of the
> filesystem:

Others have had good comments but also:

> 2011-11-17 16:00:37.294876 7f83f3eef720 filestore(/mnt/osd.17)
> truncate meta/pginfo_12.7c8/0 size 0
> 2011-11-17 16:00:37.483407 7f83f3eef720 filestore(/mnt/osd.17)
> truncate meta/pginfo_12.7c8/0 size 0 = -117
> 2011-11-17 16:00:37.483476 7f83f3eef720 filestore(/mnt/osd.17) error
> error 117: Structure needs cleaning not handled

was there anything in dmesg/system logs right at this point? XFS should
have said something about this original error.

-Eric

> When I tried to look at the filesystem, it failed with EIO. When I
> tried to mount the filesystem after a remount, it gave me an internal
> error:
>
> root@cephstore6358:~# mount /dev/sdg1 /mnt/osd.17
> 2011 Nov 18 14:52:47 cephstore6358 [82374.729383] XFS: Internal error
> XFS_WANT_CORRUPTED_GOTO at line 1664 of file fs/xfs/xfs_alloc.c.
> Caller 0xffffffff811d6b71
> 2011 Nov 18 14:52:47 cephstore6358 [82374.729386]
> 2011 Nov 18 14:52:47 cephstore6358 [82374.758262] XFS (sdg1): Internal
> error xfs_trans_cancel at line 1928 of file fs/xfs/xfs_trans.c.
> Caller 0xffffffff811fa463
> 2011 Nov 18 14:52:47 cephstore6358 [82374.758265]
> 2011 Nov 18 14:52:47 cephstore6358 [82374.758352] XFS (sdg1):
> Corruption of in-memory data detected. Shutting down filesystem
> 2011 Nov 18 14:52:47 cephstore6358 [82374.758356] XFS (sdg1): Please
> umount the filesystem and rectify the problem(s)
> 2011 Nov 18 14:52:47 cephstore6358 [82374.758364] XFS (sdg1): Failed
> to recover EFIs
> mount: Structure needs cleaning
>
> dmesg had a little more output:
>
> dmesg says:
> [82373.779312] XFS (sdg1): Mounting Filesystem
> [82373.930531] XFS (sdg1): Starting recovery (logdev: internal)
> [82374.729383] XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line
> 1664 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff811d6b71
> [82374.729386]
> [82374.741959] Pid: 30648, comm: mount Not tainted
> 3.1.0-dho-00004-g1ffcb5c-dirty #1
> [82374.749543] Call Trace:
> [82374.751994] [] ? xfs_free_ag_extent+0x4e3/0x698
> [82374.758157] [] ? xfs_setup_devices+0x84/0x84
> [82374.758163] [] ? xfs_setup_devices+0x84/0x84
> [82374.758167] [] ? xfs_free_extent+0xb6/0xf9
> [82374.758171] [] ? kmem_zone_alloc+0x58/0x9e
> [82374.758179] [] ? xfs_trans_get_efd+0x21/0x2a
> [82374.758185] [] ? xlog_recover_process_efi+0x113/0x172
> [82374.758190] [] ? xlog_recover_process_efis+0x4e/0x8e
> [82374.758194] [] ? xlog_recover_finish+0x14/0x88
> [82374.758199] [] ? xfs_mountfs+0x46c/0x56a
> [82374.758204] [] ? xfs_fs_fill_super+0x16d/0x244
> [82374.758213] [] ? mount_bdev+0x13d/0x198
> [82374.758218] [] ? mount_fs+0xc/0xa6
> [82374.758225] [] ? vfs_kern_mount+0x61/0x97
> [82374.758230] [] ? do_kern_mount+0x49/0xd6
> [82374.758234] [] ? do_mount+0x6f6/0x75d
> [82374.758241] [] ? memdup_user+0x3a/0x56
> [82374.758246] [] ? sys_mount+0x88/0xc4
> [82374.758254] [] ? system_call_fastpath+0x16/0x1b
> [82374.758262] XFS (sdg1): Internal error xfs_trans_cancel at line
> 1928 of file fs/xfs/xfs_trans.c. Caller 0xffffffff811fa463
>
> [82374.758265]
> [82374.758268] Pid: 30648, comm: mount Not tainted
> 3.1.0-dho-00004-g1ffcb5c-dirty #1
> [82374.758270] Call Trace:
> [82374.758275] [] ? xfs_trans_cancel+0x56/0xcf
> [82374.758279] [] ? xlog_recover_process_efi+0x163/0x172
> [82374.758284] [] ? xlog_recover_process_efis+0x4e/0x8e
> [82374.758288] [] ? xlog_recover_finish+0x14/0x88
> [82374.758293] [] ? xfs_mountfs+0x46c/0x56a
> [82374.758298] [] ? xfs_fs_fill_super+0x16d/0x244
> [82374.758303] [] ? mount_bdev+0x13d/0x198
> [82374.758307] [] ? mount_fs+0xc/0xa6
> [82374.758312] [] ? vfs_kern_mount+0x61/0x97
> [82374.758317] [] ? do_kern_mount+0x49/0xd6
> [82374.758321] [] ? do_mount+0x6f6/0x75d
> [82374.758325] [] ? memdup_user+0x3a/0x56
> [82374.758330] [] ? sys_mount+0x88/0xc4
> [82374.758335] [] ? system_call_fastpath+0x16/0x1b
> [82374.758341] XFS (sdg1): xfs_do_force_shutdown(0x8) called from line
> 1929 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff81201ee6
> [82374.758352] XFS (sdg1): Corruption of in-memory data detected.
> Shutting down filesystem
> [82374.758356] XFS (sdg1): Please umount the filesystem and rectify
> the problem(s)
> [82374.758364] XFS (sdg1): Failed to recover EFIs
> [82374.758367] XFS (sdg1): log mount finish failed
>
> xfs_check doesn't give me much either, since I assume the errors above
> are involved in log replay:
> root@cephstore6358:~# xfs_check -v /dev/sdg1
> ERROR: The filesystem has valuable metadata changes in a log which needs to
> be replayed. Mount the filesystem to replay the log, and unmount it before
> re-running xfs_check. If you are unable to mount the filesystem, then use
> the xfs_repair -L option to destroy the log and attempt a repair.
> Note that destroying the log may cause corruption -- please attempt a mount
> of the filesystem before doing this.
>
> Is there something useful I can do about this? Data I can provide to
> help track down what broke?
> -Greg
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
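
For anyone matching the filestore log against the kernel side: the `= -117` return and the "Structure needs cleaning" text both correspond to errno 117, which is EUCLEAN on Linux and the value XFS uses for EFSCORRUPTED. A minimal sketch, assuming failed calls are logged with a trailing `= -N` as in the excerpt above; `failed_errno` and `EUCLEAN_LINUX` are hypothetical names used only for illustration:

```python
import os
import re

# Linux value of EUCLEAN; XFS reports EFSCORRUPTED to userspace as this errno.
# Hard-coded here (an assumption for illustration) so the sketch runs anywhere.
EUCLEAN_LINUX = 117

# Matches the trailing "= -<errno>" appended to a failed call in the log excerpt.
_RESULT_RE = re.compile(r"=\s*-(\d+)\s*$")


def failed_errno(log_line):
    """Return the positive errno from a line ending in '= -N', else None."""
    match = _RESULT_RE.search(log_line)
    return int(match.group(1)) if match else None


line = ("2011-11-17 16:00:37.483407 7f83f3eef720 filestore(/mnt/osd.17) "
        "truncate meta/pginfo_12.7c8/0 size 0 = -117")
code = failed_errno(line)
if code is not None:
    # The strerror text is platform dependent, so it is printed, not compared.
    print(code, os.strerror(code), code == EUCLEAN_LINUX)
```

This only decodes the return value; it says nothing about what caused the corruption, which still needs the dmesg output from the time of the first failure.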