From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755759Ab2CIOey (ORCPT ); Fri, 9 Mar 2012 09:34:54 -0500
Received: from rcsinet15.oracle.com ([148.87.113.117]:18946 "EHLO
	rcsinet15.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753738Ab2CIOew (ORCPT );
	Fri, 9 Mar 2012 09:34:52 -0500
Date: Fri, 9 Mar 2012 09:34:46 -0500
From: Chris Mason
To: Lukas Czerner
Cc: Jacek Luczak, linux-ext4@vger.kernel.org, linux-fsdevel, LKML,
	linux-btrfs@vger.kernel.org
Subject: Re: getdents - ext4 vs btrfs performance
Message-ID: <20120309143446.GO29510@shiny>
Mail-Followup-To: Chris Mason, Lukas Czerner, Jacek Luczak,
	linux-ext4@vger.kernel.org, linux-fsdevel, LKML,
	linux-btrfs@vger.kernel.org
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Source-IP: ucsinet21.oracle.com [156.151.31.93]
X-CT-RefId: str=0001.0A090206.4F5A150A.002A,ss=1,re=0.000,fgs=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 09, 2012 at 12:29:29PM +0100, Lukas Czerner wrote:
> Hi,
>
> I have created a simple script which creates a bunch of files with
> random names in the directory and then performs operations like list,
> tar, find, copy and remove. I have run it for ext4, xfs and btrfs with
> 4k files. The result is that ext4 pretty much dominates the
> create times, tar times and find times. However, copy times are a whole
> different story, unfortunately - it sucks badly.
>
> Once we cross the mark of 320000 files in the directory (on my system),
> ext4 becomes significantly worse in copy times. And that is where
> the hash tree order of the directory entries really hits.
>
> Here is a simple graph:
>
> http://people.redhat.com/lczerner/files/copy_benchmark.pdf
>
> Here is the data, where you can play with it:
>
> https://www.google.com/fusiontables/DataSource?snapid=S425803zyTE
>
> and here is the txt file for convenience:
>
> http://people.redhat.com/lczerner/files/copy_data.txt
>
> I have also run the correlation.py from Phillip Susi on a directory with
> 100000 4k files and indeed the name to block correlation in ext4 is pretty
> much random :)
>
> _ext4_
> Name to inode correlation: 0.50002499975
> Name to block correlation: 0.50002499975
> Inode to block correlation: 0.9999900001
>
> _xfs_
> Name to inode correlation: 0.969660303397
> Name to block correlation: 0.969660303397
> Inode to block correlation: 1.0
>
>
> So there definitely is a lot of room for improvement in ext4.

Thanks Lukas, this is great data. There is definitely room for btrfs to
speed up in the other phases as well.

-chris
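
For readers who want to reproduce the kind of ordering figures quoted above, here is a minimal sketch. It is not Phillip Susi's correlation.py, whose exact metric is not shown in this thread. It assumes a simple stand-in measure: the fraction of adjacent entries whose inode numbers are ascending when the directory is walked in a given order. Under that assumption, about 0.5 means the order tells you nothing about inode placement and 1.0 means the two orders agree. The function names, the 16-character random names and the file count are all hypothetical choices for illustration.

```python
import os
import random
import string
import tempfile


def ascending_fraction(values):
    """Fraction of adjacent pairs (a, b) in values with a < b."""
    pairs = list(zip(values, values[1:]))
    if not pairs:
        return 1.0  # nothing is out of order in a list of 0 or 1 items
    return sum(a < b for a, b in pairs) / len(pairs)


def make_files(directory, count, rng):
    """Create `count` 4k files with random lowercase names."""
    for _ in range(count):
        name = "".join(rng.choices(string.ascii_lowercase, k=16))
        with open(os.path.join(directory, name), "wb") as f:
            f.write(b"\0" * 4096)


def name_to_inode_order(directory):
    """Ordering measure when entries are visited sorted by name."""
    with os.scandir(directory) as it:
        entries = sorted((e.name, e.inode()) for e in it)
    return ascending_fraction([ino for _, ino in entries])


def readdir_to_inode_order(directory):
    """Ordering measure in the order the filesystem returns entries."""
    with os.scandir(directory) as it:
        return ascending_fraction([e.inode() for e in it])


if __name__ == "__main__":
    # Small file count so the sketch runs quickly; the thread used 100000.
    with tempfile.TemporaryDirectory() as d:
        make_files(d, 1000, random.Random(0))
        print("name order vs inode:   ", name_to_inode_order(d))
        print("readdir order vs inode:", readdir_to_inode_order(d))
```

The readdir figure is the one that matters for copy and tar style workloads, since those tools typically process entries in the order the directory hands them back. The numbers will differ by filesystem and will not match the values quoted above, because the metric here is only an assumed approximation.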