Return-Path: <sandeen@redhat.com>
Received: from po14.mit.edu ([unix socket])
	by po14.mit.edu (Cyrus v2.1.5) with LMTP; Thu, 10 Jul 2008 12:37:30 -0400
X-Sieve: CMU Sieve 2.2
Received: from pacific-carrier-annex.mit.edu by po14.mit.edu (8.13.6/4.7) id m6AGbTgx015684; Thu, 10 Jul 2008 12:37:29 -0400 (EDT)
Received: from mit.edu (W92-130-BARRACUDA-1.MIT.EDU [18.7.21.220])
	by pacific-carrier-annex.mit.edu (8.13.6/8.9.2) with ESMTP id m6AGbJFS008456
	for <tytso@mit.edu>; Thu, 10 Jul 2008 12:37:19 -0400 (EDT)
X-ASG-Whitelist: Barracuda Reputation
Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31])
	by mit.edu (Spam Firewall) with ESMTP id 2FD13A12783
	for <tytso@mit.edu>; Thu, 10 Jul 2008 12:37:19 -0400 (EDT)
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
	by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m6AGbIaI006717;
	Thu, 10 Jul 2008 12:37:18 -0400
Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15])
	by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m6AGEdmB014184;
	Thu, 10 Jul 2008 12:14:39 -0400
Received: from liberator.sandeen.net (sebastian-int.corp.redhat.com [172.16.52.221])
	by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m6AGEa7N024663;
	Thu, 10 Jul 2008 12:14:37 -0400
Message-ID: <48763564.2090505@redhat.com>
Date: Thu, 10 Jul 2008 11:14:28 -0500
From: Eric Sandeen <sandeen@redhat.com>
User-Agent: Thunderbird 2.0.0.14 (Macintosh/20080421)
MIME-Version: 1.0
To: rwheeler@redhat.com
CC: Theodore Tso <tytso@mit.edu>, linux-ext4-owner@vger.kernel.org
Subject: Re: suspiciously good fsck times?
References: <4876025A.80909@gmail.com> <20080710151822.GA25939@mit.edu> <48762F9F.5070308@redhat.com>
In-Reply-To: <48762F9F.5070308@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.42
X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254
X-Spam-Score: -2.464
X-Spam-Flag: NO

Ric Wheeler wrote:
> Theodore Tso wrote:
>> On Thu, Jul 10, 2008 at 08:36:42AM -0400, Ric Wheeler wrote:
>>   
>>> Just to be mean, I have been trying to test the fsck speed of ext4 with  
>>> lots of small files.  The test I ran uses fs_mark to fill a 1TB Seagate  
>>> drive with 45.6 million 20k files (distributed between 256 
>>> subdirectories).
>>>
>>> Running on ext3, "fsck -f" takes about one hour.
>>>
>>> Running on ext4, with uninit_bg, the same fsck is finished in a bit over  
>>> 5 minutes - more than 10x faster.  (Without uninit_bg, the fsck takes  
>>> about 10 minutes).
>>>
>>> Is this too good to be true? Below is the fsck run itself, the tree is  
>>> Ted's latest git tree and his 1.41 WIP tools,
>>>     
>> Wow.  My guess is that flex_bg is making the difference.  What we
>> would want to compare is the I/O read statistics line:

I thought we actually had flex_bg off, at least on the first run, and it
still looked good.  (Ric just made the fs with mkfs.ext3 -j -I 256 -E
test_fs initially, I think.)

Val & I talked about this a little, and came to the conclusion that
directory fragmentation might be a pretty big part of it.

I did a similar workload on a much smaller fs, and the largest dir
(~11MB) looked like this on ext3:

BLOCKS:
(0-4):3950592-3950596, (5):3950604, (6-7):3950606-3950607, (8):3950630,
(9):3950871, (10-11):3950875-3950876, (IND):3950899, (12):3950900,
(13):3950934, (14):3950937, (15-16):3950943-3950944, (17):3951390,
(18):3951396, (19):3951402, (20):3951406, (21):3951408, (22):3951410,
(23):3951581, (24):3951684, (25):3951985, (26):3952031, (27):3952156,
(28):3952322, (29):3952418, (30):3952599, (31):3952626, (32):3954038,
(33):3954693, (34):3954698, (35):3954874, (36):3955108, (37):3955708,
(38):3955711, (39):3956034, (40):3956598, (41):3957173, (42):3957179,
(43):3957622, (44):3957763, (45):3957824, (46):3957910, (47):3958190,
(48):3958302, (49):3958488, (50):3958834, (51):3959173, (52):3959468,
(53):3959842, (54):3959903, (55):3960029, (56):3960245, (57):3960446
..... ad nauseam ...
(4032):4893557, (4033):4894194, (4034):4894719, (4035):4937580,
(4036):4937887, (4037):4939087, (4038):4939233, (4039):4939502,
(4040):4939508, (4041):4940473, (4042-4043):4940939-4940940,
(4044):4941191, (4045):4941402, (4046-4048):4941409-4941411,
(4049):4943061, (4050):4943307, (4051-4052):4943314-4943315
TOTAL: 4058

compared to ext4:

BLOCKS:
(0):1900544, (1-5070):1900546-1905615
TOTAL: 5071
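
To put a rough number on that difference, here is a minimal sketch that
counts the physical discontiguities in a block listing of the form shown
above.  The helper names (physical_runs, count_discontiguities) are made up
for illustration, the ext3 input is only the first few entries of the listing,
and treating each discontiguity as a likely seek is an assumption, not a
measurement.

```python
import re

# Matches one entry of a block listing like the ones above, e.g.
# "(0-4):3950592-3950596", "(5):3950604", or "(IND):3950899".
ENTRY = re.compile(r"\(([^)]+)\):(\d+)(?:-(\d+))?")


def physical_runs(blocks_text):
    """Return (first, last) physical block ranges in the order listed."""
    runs = []
    for _label, start, end in ENTRY.findall(blocks_text):
        first = int(start)
        last = int(end) if end else first  # single-block entries have no end
        runs.append((first, last))
    return runs


def count_discontiguities(runs):
    """Count runs that do not begin right after the previous run ends."""
    return sum(
        1
        for (_, prev_last), (next_first, _) in zip(runs, runs[1:])
        if next_first != prev_last + 1
    )


# Hypothetical inputs: the first four ext3 entries, and the whole ext4 listing.
ext3_head = "(0-4):3950592-3950596, (5):3950604, (6-7):3950606-3950607, (8):3950630"
ext4_all = "(0):1900544, (1-5070):1900546-1905615"

ext3_runs = physical_runs(ext3_head)
ext4_runs = physical_runs(ext4_all)
ext4_total_blocks = sum(last - first + 1 for first, last in ext4_runs)

print("ext3 (first 4 entries) discontiguities:", count_discontiguities(ext3_runs))
print("ext4 (whole dir) discontiguities:", count_discontiguities(ext4_runs))
print("ext4 total blocks:", ext4_total_blocks)
```

On these inputs the ext3 fragment already has a break after nearly every
entry, while the entire ext4 directory has a single one.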


> We did run fsck through seekwatcher & saw a significant reduction in 
> seeks/sec for ext4. Eric has the pretty pictures that he can share.

sure do (AFAIK these were with neither flex_bg nor uninit_bg):

http://people.redhat.com/esandeen/ext4/e4fsck-1T.png
http://people.redhat.com/esandeen/ext4/e3fsck-1T.png
http://people.redhat.com/esandeen/ext4/ext3-ext4-fsck-1T.png

I'm still working out what's what.  But that hockey-stick-shaped red
line for ext4 is intriguing; I think it's very densely packed $SOMETHING
that ext3 had to seek all over for, and I'm guessing it's the directories.
That said, it strikes me as an odd place for the root-level directories
to land.

I need to check: does ext3 use reservation windows for directories?
Looks like maybe it should... :)

-Eric


