From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ric Wheeler
Subject: Re: large fs testing
Date: Thu, 28 May 2009 06:52:10 -0400
Message-ID: <4A1E6CDA.3090500@redhat.com>
References: <4A17FFD8.80401@redhat.com>
 <5971.1243359565@gamaville.dokosmarshall.org>
 <4A1C2B40.30102@redhat.com>
 <20090526212132.GE3218@webber.adilger.int>
 <4A1C6A71.7010300@redhat.com>
 <20090528063003.GC3218@webber.adilger.int>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: nicholas.dokos@hp.com, linux-fsdevel@vger.kernel.org,
 Christoph Hellwig, Douglas Shakshober, Joshua Giles, Valerie Aurora,
 Eric Sandeen, Steven Whitehouse, Edward Shishkin, Josef Bacik,
 Jeff Moyer, Chris Mason, "Whitney, Eric", Theodore Tso
To: Andreas Dilger
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:48810 "EHLO mx2.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1754200AbZE1Kxg (ORCPT ); Thu, 28 May 2009 06:53:36 -0400
In-Reply-To: <20090528063003.GC3218@webber.adilger.int>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On 05/28/2009 02:30 AM, Andreas Dilger wrote:
> On May 26, 2009 18:17 -0400, Ric Wheeler wrote:
>> What I did get was the following from the fsck run:
>>
>> root@l82bi250:/home/redhatYou have new mail in /var/spool/mail/root
>> [root@l82bi250 redhat]# time /sbin/fsck.ext4 -tt -y /dev/mapper/Big_boy-Big_boy
>> e2fsck 1.41.4 (27-Jan-2009)
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 1: Memory used: 1596k/1177752k (1447k/150k), time: 1184.73/514.16/344.38
>> Pass 1: I/O read: 50655MB, write: 0MB, rate: 42.76MB/s
>> Pass 2: Checking directory structure
>> Entry '4a1590dc~~~~~~~~O4A0SMJ1VC34YQ1PD3B5DL9Q' in /da (188378)
>> references inode 196988 in group 30 where _INODE_UNINIT is set.
>> Fix? yes
>>
>> Restarting e2fsck from the beginning...
>> Group descriptor 15 checksum is invalid. Fix? yes
>>
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 1: Memory used: 120396k/-1389015k (120134k/263k), time: 1134.71/522.48/323.65
>> Pass 1: I/O read: 50656MB, write: 0MB, rate: 44.64MB/s
>> Pass 2: Checking directory structure
>> Entry '4a15910c~~~~~~~~H8099TRM701Q29CSTCWBVIHJ' in /0b (404925)
>> references inode 413100 in group 62 where _INODE_UNINIT is set.
>> Fix? yes
>>
>> Restarting e2fsck from the beginning...
>> Group descriptor 31 checksum is invalid. Fix? yes
>
> This looks like there is a patch of ours missing from the upstream e2fsprogs.
> We have a patch that will restart e2fsck only a single time for inodes
> beyond the high watermark. On a large filesystem like yours this would
> have cut 30 minutes off the e2fsck time. I'll submit that separately.
>
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 1: Memory used: 231360k/246272k (231083k/278k), time: 1140.48/521.00/334.74
>> Pass 1: I/O read: 50658MB, write: 0MB, rate: 44.42MB/s
>> Pass 2: Checking directory structure
>> Pass 2: Memory used: 231360k/1290436k (231083k/278k), time: 538.22/264.56/83.49
>> Pass 2: I/O read: 13749MB, write: 0MB, rate: 25.55MB/s
>> Pass 3: Checking directory connectivity
>> Peak memory: Memory used: 231360k/1789000k (231083k/278k), time:
>> 4221.57/1947.37/1116.21
>> Pass 3A: Memory used: 231360k/1789000k (231083k/278k), time: 0.00/ 0.00/ 0.00
>> Pass 3A: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
>> Pass 3: Memory used: 231360k/1290436k (231083k/278k), time: 9.99/ 0.26/ 1.37
>> Pass 3: I/O read: 1MB, write: 0MB, rate: 0.10MB/s
>> Pass 4: Checking reference counts
>> Pass 4: Memory used: 231360k/-1481575k (231082k/279k), time: 147.16/139.87/ 1.94
>
> Sign overflow here... Looks like we exceed 2.5GB of memory here. Still,
> not too bad considering this is an 80TB filesystem.

The fsck had a virtual size of around 10GB (5.4GB resident in the 6GB of
DRAM) when I checked...

I wonder if it would have been significantly faster without the
excessive swap use (i.e., on a box with more memory)?

>
>> Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
>> Pass 5: Checking group summary information
>> Inode bitmap differences: -(98404--98405)
>>
>> Note that it got truncated in Pass 5 - just after writing out some values
>> that look like they sign wrapped?
>>
>> -(103650--103655) -(103659--103660) -103663 -103665 -103667
>> -(103669--103670) -(103673--103676) -103679 -103684 -103687 -10
>
> No, this is what gets printed when there are inodes (or blocks) marked
> in the bitmap that are not in use. It shouldn't be truncated however.
> You said the node crashed at this point?
>
> Cheers, Andreas

Yes - unfortunately, there are no logs or any other indication of why it
crashed, and we did not have a serial console set up either, so we don't
have anything to go on.

I am going to push harder to get some large storage configurations that
we can use for testing internally, so hopefully we will have something
to test on in a couple of months....

Ric
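
For reference, the negative "Memory used" figures quoted above can be turned back into plausible sizes with a rough sketch like the one below. It assumes (this is not confirmed anywhere in the thread) that the figure comes from a 32-bit signed byte counter, such as the int fields of glibc's mallinfo, and that it wrapped exactly once. The helper name unwrap_kib and the wraps parameter are made up for illustration, and rounding in the byte-to-k conversion is ignored.

```python
# Assumption: the reported "k" value is a 32-bit signed byte count divided
# by 1024, so one wraparound corresponds to 2**32 bytes = 4194304 KiB.
KIB_PER_WRAP = 2**32 // 1024


def unwrap_kib(reported_kib, wraps=1):
    """Undo `wraps` 32-bit wraparounds on a negative KiB figure.

    Non-negative values are returned unchanged. `wraps` is a guess:
    the counter could have wrapped more than once.
    """
    if reported_kib >= 0:
        return reported_kib
    return reported_kib + wraps * KIB_PER_WRAP


if __name__ == "__main__":
    # The two negative values from the quoted Pass 1 and Pass 4 lines.
    for reported in (-1389015, -1481575):
        fixed = unwrap_kib(reported)
        print(f"{reported}k -> {fixed}k ({fixed / 2**20:.2f} GiB)")
```

Under the single-wrap assumption the Pass 4 value lands a little above 2.5GB, in line with Andreas's estimate. With a virtual size near 10GB, more than one wrap cannot be ruled out, which is why wraps is left as a parameter.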