From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from relay.sgi.com (relay3.corp.sgi.com [198.149.34.15])
	by oss.sgi.com (Postfix) with ESMTP id 366F77F50
	for ; Sun, 17 Aug 2014 21:41:21 -0500 (CDT)
Received: from cuda.sgi.com (cuda1.sgi.com [192.48.157.11])
	by relay3.corp.sgi.com (Postfix) with ESMTP id C4C97AC001
	for ; Sun, 17 Aug 2014 19:41:20 -0700 (PDT)
Received: from ipmail06.adl2.internode.on.net (ipmail06.adl2.internode.on.net [150.101.137.129])
	by cuda.sgi.com with ESMTP id NMCQQYGcT1kKIm5G
	for ; Sun, 17 Aug 2014 19:41:18 -0700 (PDT)
Date: Mon, 18 Aug 2014 12:41:14 +1000
From: Dave Chinner
Subject: Re: xfsdump completes very prematurely in low RAM, commit found
Message-ID: <20140818024114.GK20518@dastard>
References: <53F15EEF.4090308@gmail.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <53F15EEF.4090308@gmail.com>
List-Id: XFS Filesystem from SGI
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: xfs-bounces@oss.sgi.com
Sender: xfs-bounces@oss.sgi.com
To: "Michael L. Semon"
Cc: "xfs@oss.sgi.com"

On Sun, Aug 17, 2014 at 10:03:27PM -0400, Michael L. Semon wrote:
> Hi! I had some phantom issues that are chasing me through this 3.17
> merge window period. While chasing those issues, I decided to do an
> xfsdump of a v5/finobt XFS system rescued from PEBKAC issues. The
> xfsdump completed rather prematurely, ending like this test case
> output...
>
> xfsdump: dumping special file ino 4194523 mode 0x21b0
> xfsdump: dumping special file ino 4194524 mode 0x21b0
> xfsdump: dumping special file ino 4194525 mode 0x21b0
> xfsdump: dumping special file ino 4194526 mode 0x21b0
> xfsdump: dumping special file ino 4194527 mode 0x21b0
> xfsdump: ending media file
> xfsdump: media file size 4512992 bytes
> xfsdump: ending stream: 23 seconds elapsed
> xfsdump: dump size (non-dir files) : 4452088 bytes
> xfsdump: dump complete: 23 seconds elapsed
> xfsdump: Dump Summary:
> xfsdump: stream 0 /mnt/xfstests-scratch/blah.0.dump OK (success)
> xfsdump: Dump Status: SUCCESS
>
> That looks fine for a lack of obvious error messages. However, it
> should end like this:
>
> xfsdump: dumping regular file ino 13653551 offset 0 to offset 12154 (size 12154)
> xfsdump: dumping regular file ino 13653555 offset 0 to offset 16554 (size 16554)
> xfsdump: dumping regular file ino 13653556 offset 0 to offset 185 (size 185)
> xfsdump: dumping regular file ino 13653557 offset 0 to offset 471 (size 471)
> xfsdump: dumping special file ino 13653558 mode 0xa1ff
> xfsdump: ending media file
> xfsdump: media file size 1999127056 bytes
> xfsdump: ending stream: 465 seconds elapsed
> xfsdump: dump size (non-dir files) : 1963549104 bytes
> xfsdump: dump complete: 465 seconds elapsed
> xfsdump: Dump Summary:
> xfsdump: stream 0 /mnt/xfstests-scratch/blah.0.dump OK (success)
> xfsdump: Dump Status: SUCCESS

What's the inode number progression of a successful dump at the point
at which the incomplete dump ends? i.e. around inode 4194527?

That number is one inode chunk short of 2^22, which implies that
there is a failure of some kind moving from one AG to the next. The
progression of inode numbers will tell me whether this is the case or
not...
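
The proximity to 2^22 can be checked with a little arithmetic. The
sketch below is illustrative only: it assumes that the AG-relative part
of an inode number is 22 bits wide on this filesystem (the real width is
sb_agblklog + sb_inopblog from the superblock, which this thread does
not give), and split_ino is a hypothetical helper name. Under that
assumption, plain arithmetic puts 4194527 at 2^22 + 223, i.e. in the
fourth 64-inode chunk of AG 1, so the truncated dump stops within a few
chunks of the AG 0 to AG 1 boundary, while the complete dump runs on
into AG 3.

```python
# Assumption: 22 AG-relative inode bits (sb_agblklog + sb_inopblog).
# The real value must be read from the filesystem's superblock.
AGINO_BITS = 22
INODES_PER_CHUNK = 64  # XFS allocates inodes in chunks of 64


def split_ino(ino, agino_bits=AGINO_BITS):
    """Split an inode number into (AG number, AG-relative inode, chunk index)."""
    agno = ino >> agino_bits
    agino = ino & ((1 << agino_bits) - 1)
    return agno, agino, agino // INODES_PER_CHUNK


last_short = 4194527   # last inode reported by the truncated dump
last_full = 13653558   # last inode reported by the complete dump

for label, ino in (("truncated", last_short), ("complete", last_full)):
    agno, agino, chunk = split_ino(ino)
    print(f"{label}: ino {ino} -> AG {agno}, agino {agino}, chunk {chunk}")
```

Running the same split over the inode numbers logged by a successful
dump around 4194527 would show whether they continue inside the same AG
or jump to the next one.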

> Bisect brought me here:
>
> root@oldsvrhw:/usr/src/kernel-git/linux# git bisect bad
> c7cb51dcb0a38624d42eeabb38502fa54a4d774b is the first bad commit
> commit c7cb51dcb0a38624d42eeabb38502fa54a4d774b
> Author: Jie Liu
> Date: Thu Jul 24 12:18:47 2014 +1000
>
> xfs: fix error handling at xfs_inumbers
> From: Jie Liu
> To fetch the file system number tables, we currently just ignore the
> errors and proceed to loop over the next AG or bump agino to the next
> chunk in case of btree operations failed, that is not properly because
> those errors might hint us potential file system problems.
> This patch rework xfs_inumbers() to handle the btree operation errors
> as well as the loop conditions.
> Signed-off-by: Jie Liu
> Reviewed-by: Dave Chinner
> Signed-off-by: Dave Chinner
>
> :040000 040000 ec78dc86468ee00df7a63bba97a135b8c6a84a95 2e447774a8f85b1b8d43ffa9fd28cbea3402d717 M fs
>
> Maybe Jeff's patch is doing its job. After all, on several successful
> test runs, the kernel was sending messages like (paraphrased) "BUG: bad
> state in page table" to remote syslog. The Pentium III PC has too
> little memory (512 MB) to do this job. However, I think that the
> xfsdump should last more than 23 seconds before causing issues.

Memory should not matter for counting the number of inodes or
extracting them from the kernel.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs