From: Michael Monnerie
Subject: bad fs - xfs_repair 3.01 crashes on it
Date: Fri, 3 Jul 2009 13:20:43 +0200
Message-Id: <200907031320.48358@zmi.at>
List-Id: XFS Filesystem from SGI
To: xfs mailing list
Tonight our server rebooted, and I found in /var/log/warn that it had been complaining a lot about XFS since June 7 already:

Jun  7 03:06:31 orion.i.zmi.at kernel: Filesystem "dm-0": corrupt inode 3857051697 ((a)extents = 5). Unmount and run xfs_repair.
Jun  7 03:06:31 orion.i.zmi.at kernel: Pid: 23230, comm: xfs_fsr Tainted: G 2.6.27.21-0.1-xen #1
Jun  7 03:06:31 orion.i.zmi.at kernel:
Jun  7 03:06:31 orion.i.zmi.at kernel: Call Trace:
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] show_trace_log_lvl+0x41/0x58
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] dump_stack+0x69/0x6f
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_iformat_extents+0xc9/0x1c5 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_iformat+0x2b0/0x3f6 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_iread+0xe7/0x1ed [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_iget_core+0x3a5/0x63a [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_iget+0xe2/0x187 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_vget_fsop_handlereq+0xc2/0x11b [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_open_by_handle+0x60/0x1cb [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_ioctl+0x3ca/0x680 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] xfs_file_ioctl+0x25/0x69 [xfs]
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] vfs_ioctl+0x21/0x6c
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] do_vfs_ioctl+0x222/0x231
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] sys_ioctl+0x51/0x73
Jun  7 03:06:31 orion.i.zmi.at kernel:  [] system_call_fastpath+0x16/0x1b
Jun  7 03:06:31 orion.i.zmi.at kernel:  [<00007f7231d6cb77>] 0x7f7231d6cb77

But XFS didn't go offline, so nobody found these messages. There are a lot of them. They are obviously generated by the nightly "xfs_fsr -v -t 7200" we have been running since then. It would have been nice if xfs_fsr had printed a message, so we would have received the cron mail.
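For context, the nightly defragmentation job mentioned above can be driven from cron along these lines. This is only a sketch: the 03:00 start time, the binary path, and the log path are assumptions for illustration, not our actual crontab (only the "xfs_fsr -v -t 7200" invocation is from the report; the 03:06 timestamps in the log fit a run starting around 03:00).

```shell
# /etc/crontab fragment (sketch, paths and schedule assumed):
# run xfs_fsr verbosely for at most 2 hours (7200 s) each night,
# capturing output so cron can mail it or it can be reviewed later.
0 3 * * * root /usr/sbin/xfs_fsr -v -t 7200 >> /var/log/xfs_fsr.log 2>&1
```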
(But it got killed by the kernel, so that's a good excuse.) Anyway, I then ran xfs_repair (3.01) and got this:

Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
[snip]
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
[snip]
Phase 4 - check for duplicate blocks...
[snip]
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.

And then xfs_repair crashes out, without having repaired anything. I have attached the full xfs_repair log here, and the metadump is at http://zmi.at/x/xfs.metadump.data1.bz2. I'll not be here for a week now; I hope the problem is not very serious.

mfg zmi
-- 
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4

[Attachment: xfsrepair.data1]

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
local inode 3857051697 attr too small (size = 3, min size = 4)
bad attribute fork in inode 3857051697, clearing attr fork
clearing inode 3857051697 attributes
cleared inode 3857051697
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - agno = 32
        - agno = 33
        - agno = 34
        - agno = 35
        - agno = 36
        - agno = 37
        - agno = 38
        - agno = 39
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
data fork in regular inode 3857051697 claims used block 537147998
xfs_repair: dinode.c:2108: process_inode_data_fork: Assertion `err == 0' failed.
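For anyone who wants to examine this from the metadump linked above, the usual approach is to restore it into a regular image file with xfs_mdrestore and point xfs_repair at that. A sketch, assuming the downloaded file name and a local image name data1.img (a metadump contains only metadata, so it is suitable for reproducing repair bugs, not for recovering file data):

```shell
# Decompress the metadump and restore it into an image file.
bunzip2 xfs.metadump.data1.bz2
xfs_mdrestore xfs.metadump.data1 data1.img

# -f tells xfs_repair the target is a regular image file rather than
# a block device; this is expected to hit the same assertion in
# process_inode_data_fork reported above.
xfs_repair -f data1.img
```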