From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: with ECARTIS (v1.0.0; list xfs); Mon, 29 Jan 2007 16:12:10 -0800 (PST)
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l0U0C3qw031856
	for ; Mon, 29 Jan 2007 16:12:05 -0800
Message-Id: <200701300010.LAA00558@larry.melbourne.sgi.com>
From: "Barry Naujok"
Subject: RE: xfs_repair leaves things un-repaired.
Date: Tue, 30 Jan 2007 11:14:58 +1100
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1170114096.12767.9.camel@tmolus.apparatus.net>
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
List-Id: xfs
To: ajones@apparatus.net, xfs@oss.sgi.com

Hi Andrew,

> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com]
> On Behalf Of Andrew Jones
> Sent: Tuesday, 30 January 2007 10:42 AM
> To: xfs@oss.sgi.com
> Subject: xfs_repair leaves things un-repaired.
>
> I have a filesystem which I cannot repair with xfs_repair. Running
> xfs_repair results in its finding and fixing the same errors, over and
> over and over.
> Whenever I attempt to manipulate certain directories,
> the filesystem shuts itself down:
>
> Jan 29 17:59:02 amnesiac kernel: [] xfs_btree_check_sblock+0x9c/0xab [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_alloc_lookup+0x134/0x35c [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_alloc_lookup+0x134/0x35c [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_free_ag_extent+0x48/0x5fd [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_free_extent+0xb7/0xd4 [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_bmap_finish+0xe6/0x167 [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_itruncate_finish+0x1af/0x2ff [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_inactive+0x254/0x92c [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] iput+0x3d/0x66
> Jan 29 17:59:02 amnesiac kernel: [] xfs_remove+0x322/0x3a9 [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_validate_fields+0x1e/0x7d [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] xfs_vn_unlink+0x2f/0x3b [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] inotify_inode_is_dead+0x18/0x6c
> Jan 29 17:59:02 amnesiac kernel: [] xfs_fs_clear_inode+0x6d/0xa3 [xfs]
> Jan 29 17:59:02 amnesiac kernel: [] clear_inode+0xab/0xd8
> Jan 29 17:59:02 amnesiac kernel: [] generic_delete_inode+0xbd/0x10f
> Jan 29 17:59:02 amnesiac kernel: [] iput+0x64/0x66
> Jan 29 17:59:02 amnesiac kernel: [] do_unlinkat+0xa7/0x113
> Jan 29 17:59:02 amnesiac kernel: [] vfs_readdir+0x7d/0x8d
> Jan 29 17:59:02 amnesiac kernel: [] filldir64+0x0/0xc3
> Jan 29 17:59:02 amnesiac kernel: [] sys_getdents64+0x9b/0xa5
> Jan 29 17:59:02 amnesiac kernel: [] sysenter_past_esp+0x56/0x79
> Jan 29 17:59:02 amnesiac kernel: xfs_force_shutdown(dm-0,0x8) called
> from line 4267 of file fs/xfs/xfs_bmap.c. Return address = 0xf94e46f0
> Jan 29 17:59:15 amnesiac kernel: xfs_force_shutdown(dm-0,0x1) called
> from line 424 of file fs/xfs/xfs_rw.c.
> Return address = 0xf94e46f0
> Jan 29 17:59:15 amnesiac kernel: xfs_force_shutdown(dm-0,0x1) called
> from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xf94e46f0
>
> I think the second and third "xfs_force_shutdown" calls came after I
> unmounted, remounted, and attempted to repeat the "rm" that had failed
> with the first one, without an xfs_repair attempt in the interregnum.
>
> I tried copying it from one filesystem to a new one, using tar. It
> worked fine for a while, but then I had an "unplanned" shutdown due to a
> failure in the RAID devices. Since then, the same problems have arisen.
>
> Is this a normal problem? Should I just give up and copy to a new
> filesystem?

The xfs_repair output is valid. All the inodes that are reporting
errors are orphaned inodes that were moved into lost+found. At the
start of phase 4, the lost+found directory is deleted which causes
all the inodes in lost+found to be re-orphaned.

The current solution to this problem is to rename lost+found after
an xfs_repair run and then unmount and try xfs_repair again.

Regarding the shutdown, that is not normal and I personally don't
know what the problem is from the trace. If it's a corrupt lost+found
that xfs_repair is generating (I gather you are rm'ing lost+found),
the second xfs_repair run after a rename should identify the problem
with the directory.

You can also try running xfs_check on the device as it may pick up
something xfs_repair is missing.

Regards,
Barry.
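
The rename-and-rerun sequence described above can be sketched as a small dry-run shell script. This is only an illustration, not a tested recovery procedure: it prints the commands rather than running them. The device /dev/vg0/home is taken from the report; the mount point /mnt/home, the name lost+found.old, and the plan_repair helper are assumptions made for the example. The initial umount is there because xfs_repair has to be run on an unmounted filesystem.

```shell
#!/bin/sh
# Dry-run sketch: prints the command sequence instead of executing it.
# DEV is the device named in the report; MNT and the ".old" suffix are
# assumptions for illustration only.
DEV=${DEV:-/dev/vg0/home}
MNT=${MNT:-/mnt/home}

plan_repair() {
    echo "umount $MNT"        # xfs_repair must run on an unmounted filesystem
    echo "xfs_repair $DEV"    # first pass: orphaned inodes end up in lost+found
    echo "mount $DEV $MNT"
    # rename lost+found so the next run's phase 4 does not delete it
    echo "mv $MNT/lost+found $MNT/lost+found.old"
    echo "umount $MNT"
    echo "xfs_repair $DEV"    # second pass, after the rename
    echo "xfs_check $DEV"     # optional cross-check suggested in the reply
}

plan_repair
```

Review the printed commands and adjust the device and mount point before running any of them by hand.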

> root@amnesiac#xfs_info /dev/vg0/home
> meta-data=/dev/vg0/home          isize=256    agcount=65, agsize=7325792 blks
>          =                       sectsz=512   attr=0
> data     =                       bsize=4096   blocks=468855808, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=1
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=32768, version=1
>          =                       sectsz=512   sunit=0 blks
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> root@amnesiac#uname -a
> Linux amnesiac 2.6.18-3-686 #1 SMP Sun Dec 10 19:37:06 UTC 2006 i686 GNU/Linux
> root@amnesiac#xfs_repair -V
> xfs_repair version 2.8.18
>
> The xfs_repair -v output is attached to this message.
>