From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <xfs-bounces@oss.sgi.com>
Received: from relay.sgi.com (relay1.corp.sgi.com [137.38.102.111])
	by oss.sgi.com (Postfix) with ESMTP id 707227F54
	for <xfs@oss.sgi.com>; Wed, 11 Feb 2015 09:51:59 -0600 (CST)
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25])
	by relay1.corp.sgi.com (Postfix) with ESMTP id 4D74A8F8073
	for <xfs@oss.sgi.com>; Wed, 11 Feb 2015 07:51:59 -0800 (PST)
Received: from sandeen.net (sandeen.net [63.231.237.45]) by cuda.sgi.com with
	ESMTP id kooJPcxdnPsF5I4t for <xfs@oss.sgi.com>;
	Wed, 11 Feb 2015 07:51:57 -0800 (PST)
Message-ID: <54DB7A9A.2070109@sandeen.net>
Date: Wed, 11 Feb 2015 09:51:54 -0600
From: Eric Sandeen <sandeen@sandeen.net>
MIME-Version: 1.0
Subject: Re: xfs_logprint segfault with external log
References: <54DB5E70.80607@oracle.com>
In-Reply-To: <54DB5E70.80607@oracle.com>
List-Id: XFS Filesystem from SGI <xfs.oss.sgi.com>
List-Unsubscribe: <http://oss.sgi.com/mailman/options/xfs>,
	<mailto:xfs-request@oss.sgi.com?subject=unsubscribe>
List-Archive: <http://oss.sgi.com/pipermail/xfs>
List-Post: <mailto:xfs@oss.sgi.com>
List-Help: <mailto:xfs-request@oss.sgi.com?subject=help>
List-Subscribe: <http://oss.sgi.com/mailman/listinfo/xfs>,
	<mailto:xfs-request@oss.sgi.com?subject=subscribe>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>, xfs@oss.sgi.com

On 2/11/15 7:51 AM, Alexander Tsvetkov wrote:
> Hello,
> 
> I've obtained a corrupted xfs log after some xfs sanity testing:
> 
> "log=logfile
> log_size=855
> 
> dd if=/dev/zero "of=$log" bs=4096 count=$log_size
> loopdev=$(losetup -f)
> losetup $loopdev $log
> 
> mkfs.xfs -f -m crc=1 -llogdev=$loopdev,size=${log_size}b $SCRATCH_DEV
> mount -t xfs -ologdev=$loopdev $SCRATCH_DEV $SCRATCH_MNT
> ./fdtree.sh  -l 4 -d 4 -C -o $SCRATCH_MNT
> sync
> umount $SCRATCH_MNT
> 
> xfs_logprint -l $loopdev $SCRATCH_DEV"
> 
> The test makes a crc-enabled xfs filesystem with an external log of the minimal allowed size, and then creates on this fs a small directory tree
> with subdirectories and files of fixed depth and size, using the fdtree utility: https://computing.llnl.gov/?set=code&page=sio_downloads
> 
> After that xfs_logprint consistently reports bad data in the log:

TBH, xfs_logprint has always been a little buggy in corner cases.  It's
a diagnostic/developer tool, and as such has not been made as robust
as tools that users need to use every day.  Still, we'd hope for
no segfaults or errors.  ;)

> "Oper (307): tid: eec9b0c7  len: 16  clientid: TRANS  flags: none
> EXTENTS inode data
> Oper (308): tid: 41000000  len: 805306368  clientid: ERROR  flags: none
> LOCAL attr data
> 
> ============================================================================
> cycle: 1        version: 2              lsn: 1,3138     tail_lsn: 1,2
> length of Log Record: 32256     prev offset: 3074               num ops: 375
> uuid: 39a962b7-4c0d-4e0e-8bcd-39471f93bc1d   format: little endian linux
> h_size: 32768
> ----------------------------------------------------------------------------
> Oper (0): tid: eec9b0c7  len: 48  clientid: TRANS  flags: none
> **********************************************************************
> * ERROR: data block=3138                                              *
> **********************************************************************
> 
> xfs_logprint: unknown log operation type (2e00)
> Bad data in log"

It's probably just mis-parsing something, i.e. it's more likely a logprint
bug than an xfs bug.
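
One small hint that fits the mis-parsing theory: the bogus "len: 805306368"
in Oper (308) is 0x30000000, and with its bytes reversed that is 48, the same
length the TRANS oper at the top of the next record reports.  That proves
nothing by itself; it only suggests a small field read at the wrong offset or
in the wrong byte order.  A minimal bash sketch of the arithmetic (the helper
name bswap32 is made up for illustration):

```shell
#!/bin/bash
# Reverse the byte order of a 32-bit value and print it in decimal.
bswap32() {
    local v=$1
    echo $(( ((v & 0xff) << 24) | ((v & 0xff00) << 8) |
             ((v >> 8) & 0xff00) | ((v >> 24) & 0xff) ))
}

len=805306368                          # "len" printed for Oper (308)
printf 'as hex:       0x%08x\n' "$len"
printf 'byte-swapped: %d\n' "$(bswap32 "$len")"
```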

If you could provide an xfs_metadump of the filesystem at this point, that
would probably be the simplest reproducer for us.  Fixing it may not be the
very highest priority, but I have dug into and fixed logprint bugs in the
past.  It's not very fun.  ;)
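
A minimal sketch of capturing that metadump, assuming the $loopdev and
$SCRATCH_DEV variables from the test script are still set and the filesystem
is unmounted.  The output name fs.metadump and the fallback device names are
placeholders, not anything from the original report.  The helper only prints
the command, so nothing runs until it is pasted into a root shell:

```shell
#!/bin/bash
# Print (do not run) the xfs_metadump command for a filesystem with an
# external log.  Arguments: log device, data device, output file.
metadump_cmd() {
    local logdev=$1 datadev=$2 out=$3
    # -l tells xfs_metadump where the external log device is.
    printf 'xfs_metadump -l %s %s %s\n' "$logdev" "$datadev" "$out"
}

# The fallbacks are placeholders for when the test variables are not set.
metadump_cmd "${loopdev:-/dev/loop0}" "${SCRATCH_DEV:-/dev/sdX}" fs.metadump
```

If memory of the xfs_metadump man page serves, an external log is not copied
into the image, so for a log-parsing problem like this one a copy of the small
log file behind the loop device would probably be needed as well.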


> A subsequent call to "xfs_repair -n -l $loopdev $SCRATCH_DEV" passes and the filesystem mounts without errors.
> I suspected an inappropriate log size, so I updated log_size to the default mkfs.xfs value for this device: "log_size=2560".
> After that xfs_logprint core dumped with a segfault:
> 
> "Feb 11 13:55:42 fedora.fedora kernel: xfs_logprint[14007]: segfault at 29f16768 ip 00000000004028ed sp 00007fff61b46850 error 4 in xfs_logprint[400000+4e000]"

A metadump of this filesystem would be useful as well, assuming it reproduces the
bug.
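
In the meantime, the quoted kernel line can be decoded a little.  The sketch
below assumes the usual x86 meaning of the page-fault error bits (bit 0:
protection violation vs. page not present, bit 1: write vs. read, bit 2: user
vs. kernel mode) and that the bracketed pair is the mapping's start address
and size.  The helper name decode_segv is made up for illustration:

```shell
#!/bin/bash
# Decode the ip, mapping base and error code from a kernel segfault line.
decode_segv() {
    local ip=$1 base=$2 err=$3
    local cause access mode
    # Offset of the faulting instruction from the start of the mapping.
    printf 'offset into mapping: 0x%x\n' $(( ip - base ))
    if (( err & 1 )); then cause='protection violation'; else cause='page not present'; fi
    if (( err & 2 )); then access='write'; else access='read'; fi
    if (( err & 4 )); then mode='user'; else mode='kernel'; fi
    printf 'fault: %s-mode %s, %s\n' "$mode" "$access" "$cause"
}

# Values copied from the quoted message: ip, mapping start, error code.
decode_segv 0x4028ed 0x400000 4
# prints:
#   offset into mapping: 0x28ed
#   fault: user-mode read, page not present
```

So logprint read through a pointer into unmapped memory (0x29f16768).  With an
xfs_logprint binary that carries debug info, the ip value could likely be fed
to addr2line -e to find the source line.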

Thanks,
-Eric
