From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: with ECARTIS (v1.0.0; list xfs); Tue, 03 Apr 2007 17:42:42 -0700 (PDT)
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130])
	by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l340gbfB023972
	for ; Tue, 3 Apr 2007 17:42:39 -0700
Message-Id: <200704040042.KAA01820@larry.melbourne.sgi.com>
From: "Barry Naujok"
Subject: RE: xfs_repair segfault
Date: Wed, 4 Apr 2007 10:45:47 +1000
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
In-Reply-To:
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
List-Id: xfs
To: "'James W. Abendschan'" , xfs@oss.sgi.com

Hi James,

Would it be possible for you to apply the patch I posted to xfs@oss in Feb

  http://oss.sgi.com/archives/xfs/2007-02/msg00072.html

to the latest xfsprogs source, make and install it, and run:

  # xfs_metadump /dev/md1 - | bzip2 > /tmp/bad_xfs.bz2

and then make the image available for me to download and analyse?

Regards,
Barry.

> -----Original Message-----
> From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com]
> On Behalf Of James W. Abendschan
> Sent: Wednesday, 4 April 2007 5:12 AM
> To: xfs@oss.sgi.com
> Subject: xfs_repair segfault
>
> Hi there -- I have a 6.9TB XFS volume that is acting up
> after a power failure (I understand XFS + no UPS + PC
> hardware == badness. Not my decision.)
>
> The machine is a dual proc x86 (intel xeon 5130) w/ 8GB RAM
> running a custom 2.6.18 kernel on top of Ubuntu 6.06.
>
> Since xfs_check can't repair volumes of this size without
> scads of memory, I've been using xfs_repair to correct
> power-related problems before.
>
> Unfortunately, for some reason xfs_repair is segfaulting:
>
> # ulimit -c unlimited
> # xfs_repair -v /dev/md1
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>         - zero log...
> zero_log: head block 8 tail block 8
>         - scan filesystem freespace and inode maps...
>         - found root inode chunk
> Phase 3 - for each AG...
>         - scan and clear agi unlinked lists...
>         - process known inodes and perform inode discovery...
>         - agno = 0
>         - agno = 1
>         - agno = 2
>         - agno = 3
>         - agno = 4
>         - agno = 5
>         - agno = 6
>         - agno = 7
>         - agno = 8
>         - agno = 9
>         - agno = 10
>         - agno = 11
>         - agno = 12
>         - agno = 13
>         - agno = 14
>         - agno = 15
>         - agno = 16
>         - agno = 17
>         - agno = 18
>         - agno = 19
>         - agno = 20
>         - agno = 21
>         - agno = 22
>         - agno = 23
>         - agno = 24
>         - agno = 25
>         - agno = 26
>         - agno = 27
>         - agno = 28
>         - agno = 29
>         - agno = 30
>         - agno = 31
>         - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
>         - setting up duplicate extent list...
>         - clear lost+found (if it exists) ...
>         - clearing existing "lost+found" inode
> Segmentation fault (core dumped)
>
>
> gdb doesn't show anything useful (I don't know how to interpret
> the I/O error) :
>
>
> # gdb /sbin/xfs_repair core
> GNU gdb 6.4-debian
> Copyright 2005 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public
> License, and you are
> welcome to change it and/or distribute copies of it under
> certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show
> warranty" for details.
> This GDB was configured as "i486-linux-gnu"...(no debugging
> symbols found)
> Using host libthread_db library
> "/lib/tls/i686/cmov/libthread_db.so.1".
>
> (no debugging symbols found)
> Core was generated by `xfs_repair -v /dev/md1'.
> Program terminated with signal 11, Segmentation fault.
>
> warning: Can't read pathname for load map: Input/output error.
> Reading symbols from /lib/libuuid.so.1...(no debugging
> symbols found)...done.
> Loaded symbols for /lib/libuuid.so.1
> Reading symbols from /lib/tls/i686/cmov/libc.so.6...(no
> debugging symbols found)
> Loaded symbols for /lib/tls/i686/cmov/libc.so.6
> Reading symbols from /lib/ld-linux.so.2...(no debugging
> symbols found)...done.
> Loaded symbols for /lib/ld-linux.so.2
>
> #0 0x08052f42 in ?? ()
> (gdb) bt
> #0 0x08052f42 in ?? ()
> #1 0x000088e9 in ?? ()
> #2 0x00000800 in ?? ()
> #3 0x00000080 in ?? ()
> #4 0x00000000 in ?? ()
>
>
> What's the next step?
>
> Thanks,
> James
>
>