Re: Segmentation fault during xfs_repair

From: Eric Sandeen <sandeen@sandeen.net>
To: Richard Kolkovich <richard@intrameta.com>
Cc: xfs@oss.sgi.com
Subject: Re: Segmentation fault during xfs_repair
Date: Fri, 05 Jun 2009 23:44:05 -0500
Message-ID: <4A29F415.5020203@sandeen.net>
In-Reply-To: <20090605222236.GA39825@magus.portal.sigil.org>

Richard Kolkovich wrote:
> We have a corrupted XFS partition on a storage server.  Attempting to run xfs_repair the first time yielded the message about a corrupt log file, so I have run xfs_repair with -L to clear that.  Now, xfs_repair segfaults in Phase 3.  I have tried -P and a huge -m to no avail.  It always seems to segfault at the same point:
> 
> bad directory block magic # 0 in block 11 for directory inode 341521797
> corrupt block 11 in directory inode 341521797
>         will junk block
> Segmentation fault (core dumped)

...

> I can provide the full core file, if need be (956M).  The xfs_metadump can be found at:
> 
> http://files.intrameta.com/metadump.gz (735M)
> 
> Any suggestions/ideas on how to proceed are welcome.  Please Reply-All, as I'm not subscribed to the ML.

Ok, on a -g (not -O2) build:

Program terminated with signal 11, Segmentation fault.
#0  0x0000000000418d05 in traverse_int_dir2block (mp=0x7ffff4c4f150,
da_cursor=0x7ffff4c4eb30, rbno=0x7ffff4c4ebdc) at dir2.c:356
356			da_cursor->level[i].hashval =
(gdb) p i
$1 = 46501

i is set from

i = da_cursor->active = be16_to_cpu(node->hdr.level);

(gdb) p node->hdr.level // note this is big endian
$2 = 42421

that's a crazily deep btree, well beyond anything sane:

#define XFS_DA_NODE_MAXDEPTH    5       /* max depth of Btree */
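
To spell out the byte order: the two gdb values above are the same 16 bits
read in opposite byte orders. A minimal sketch, assuming a little-endian host
(which the backtrace addresses suggest); swab16 and check_gdb_values are
illustrative names, not xfsprogs functions:

```c
#include <assert.h>
#include <stdint.h>

#define XFS_DA_NODE_MAXDEPTH 5	/* max depth of Btree, as quoted above */

/* Swap the two bytes of a 16-bit value (what be16_to_cpu amounts to on a
 * little-endian host; illustrative helper, not the xfsprogs macro). */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Self-check using the two values printed by gdb above. */
void check_gdb_values(void)
{
	/* raw on-disk field 42421 == 0xA5B5; byte-swapped 0xB5A5 == 46501 */
	assert(swab16(42421) == 46501);

	/* either way it is far past the deepest legal btree */
	assert(swab16(42421) > XFS_DA_NODE_MAXDEPTH);
}
```

So the raw field 0xA5B5 and the index 0xB5A5 are both garbage; no byte order
makes this a plausible depth.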

So repair really should be checking for this before it goes off and
indexes it:

356			da_cursor->level[i].hashval =

because the cursor only has this much in the array:

        dir2_level_state_t      level[XFS_DA_NODE_MAXDEPTH];

I'll have to ponder what repair should do in this case ... and I'll see
if there's something we can do in xfs_db to just whack out this problem
and let repair continue for now.
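
One possible shape for such a check, as a sketch only. The toy types and
toy_set_hashval below are hypothetical stand-ins, not the real xfs_repair
structures, and it assumes the level is used directly as the array index, as
on line 356. What repair should then do with the bad block is left open:

```c
#include <assert.h>
#include <stdint.h>

#define XFS_DA_NODE_MAXDEPTH 5	/* max depth of Btree, as quoted above */

/* Hypothetical, cut-down stand-ins for the repair cursor types. */
typedef struct toy_level_state {
	uint32_t	hashval;
} toy_level_state_t;

typedef struct toy_cursor {
	int			active;
	toy_level_state_t	level[XFS_DA_NODE_MAXDEPTH];
} toy_cursor_t;

/*
 * Validate an already byte-swapped on-disk level before indexing with it.
 * Returns 0 on success, -1 if the level cannot index level[].
 */
int toy_set_hashval(toy_cursor_t *cursor, uint16_t level, uint32_t hashval)
{
	if (level >= XFS_DA_NODE_MAXDEPTH)
		return -1;	/* corrupt header: caller decides how to recover */

	cursor->active = level;
	cursor->level[level].hashval = hashval;
	return 0;
}

/* Self-check: the level from the core file is rejected, a sane one works. */
void toy_selftest(void)
{
	toy_cursor_t c = { 0 };

	assert(toy_set_hashval(&c, 46501, 1) == -1);
	assert(c.active == 0);		/* cursor untouched on failure */
	assert(toy_set_hashval(&c, 2, 7) == 0);
	assert(c.level[2].hashval == 7);
}
```

The point is only that the range test has to happen before the value is
stored in the cursor or used as an index.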

-Eric
