From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from relay.sgi.com (relay1.corp.sgi.com [137.38.102.111])
	by oss.sgi.com (Postfix) with ESMTP id 707227F54
	for ; Wed, 11 Feb 2015 09:51:59 -0600 (CST)
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25])
	by relay1.corp.sgi.com (Postfix) with ESMTP id 4D74A8F8073
	for ; Wed, 11 Feb 2015 07:51:59 -0800 (PST)
Received: from sandeen.net (sandeen.net [63.231.237.45])
	by cuda.sgi.com with ESMTP id kooJPcxdnPsF5I4t
	for ; Wed, 11 Feb 2015 07:51:57 -0800 (PST)
Message-ID: <54DB7A9A.2070109@sandeen.net>
Date: Wed, 11 Feb 2015 09:51:54 -0600
From: Eric Sandeen
MIME-Version: 1.0
Subject: Re: xfs_logprint segfault with external log
References: <54DB5E70.80607@oracle.com>
In-Reply-To: <54DB5E70.80607@oracle.com>
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: Alexander Tsvetkov , xfs@oss.sgi.com

On 2/11/15 7:51 AM, Alexander Tsvetkov wrote:
> Hello,
>
> I've obtained corrupted xfs log after some sanity xfs testing:
>
> "log=logfile
> log_size=855
>
> dd if=/dev/zero "of=$log" bs=4096 count=$log_size
> loopdev=$(losetup -f)
> losetup $loopdev $log
>
> mkfs.xfs -f -m crc=1 -llogdev=$loopdev,size=${log_size}b $SCRATCH_DEV
> mount -t xfs -ologdev=$loopdev $SCRATCH_DEV $SCRATCH_MNT
> ./fdtree.sh -l 4 -d 4 -C -o $SCRATCH_MNT
> sync
> umount $SCRATCH_MNT
>
> xfs_logprint -l $loopdev $SCRATCH_DEV"
>
> Test makes crc enabled xfs filesystem with the external log of minimal allowed size and then creates on this fs the small directory tree
> with sub directories and files of fixed depth and size with help of fdtree utility: https://computing.llnl.gov/?set=code&page=sio_downloads
>
> After that xfs_logprint stably reports bad data in log:

TBH, xfs_logprint has always been a little buggy in corners.
It's a diagnostic/developer tool, and as such has not been made
as robust as tools that users need to use every day.

Still, we'd hope for no segfaults or errors.  ;)

> "Oper (307): tid: eec9b0c7 len: 16 clientid: TRANS flags: none
> EXTENTS inode data
> Oper (308): tid: 41000000 len: 805306368 clientid: ERROR flags: none
> LOCAL attr data
>
> ============================================================================
> cycle: 1 version: 2 lsn: 1,3138 tail_lsn: 1,2
> length of Log Record: 32256 prev offset: 3074 num ops: 375
> uuid: 39a962b7-4c0d-4e0e-8bcd-39471f93bc1d format: little endian linux
> h_size: 32768
> ----------------------------------------------------------------------------
> Oper (0): tid: eec9b0c7 len: 48 clientid: TRANS flags: none
> **********************************************************************
> * ERROR: data block=3138 *
> **********************************************************************
>
> xfs_logprint: unknown log operation type (2e00)
> Bad data in log"

It's probably just mis-parsing something, i.e. more likely a logprint
bug than an xfs bug.

If you could provide an xfs_metadump of the filesystem at this point,
that would probably be the simplest reproducer for us.

Fixing it may not be the very highest priority, but I have dug into and
fixed logprint bugs in the past.  It's not very fun.  ;)

> Subsequent call to "xfs_repair -n -l $loopdev $SCRATCH_DEV" passes and filesystem is mounted without errors.
> I've supposed the using of inappropriate log size so updated log_size to default mkfs.xfs value for this device: "log_size=2560".
> After that xfs_logprint core dumped with segfault (race condition):
>
> "Feb 11 13:55:42 fedora.fedora kernel: xfs_logprint[14007]: segfault at 29f16768 ip 00000000004028ed sp 00007fff61b46850 error 4 in xfs_logprint[400000+4e000]"

A metadump of this filesystem would be useful as well, assuming it
reproduces the bug.
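
For reference, capturing such a metadump might look roughly like the
sketch below.  It assumes the filesystem is unmounted and reuses the
$loopdev and $SCRATCH_DEV variables from the test script; the run and
capture_metadump helper names and the output path are made up for
illustration.  The -l option tells xfs_metadump where the external log
lives.

```shell
#!/bin/sh
# Sketch only: capture a metadump of a filesystem that uses an external log.
# Assumptions: the filesystem is unmounted, and the helper names and any
# output file name used here are illustrative placeholders.

# Run a command, or just print it when DRY_RUN is set (handy for review).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# capture_metadump LOGDEV DATADEV OUTFILE
# -l names the external log device, as in the xfs_logprint call in the test.
capture_metadump() {
    run xfs_metadump -l "$1" "$2" "$3"
}

# Example (hypothetical output path), reusing the test script's variables:
#   capture_metadump "$loopdev" "$SCRATCH_DEV" /tmp/logprint-repro.metadump
```

Setting DRY_RUN=1 prints the command instead of running it.  Whether the
dump carries the contents of an external log may depend on the xfsprogs
version, so the xfs_metadump man page is worth a look before sending one.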

Thanks,
-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs