From: Ted Ts'o
Subject: Re: Bad performance of ext4 with kernel 3.0.17
Date: Thu, 1 Mar 2012 21:45:31 -0500
Message-ID: <20120302024531.GJ32588@thunk.org>
References: <20120301194735.GD32588@thunk.org>
Cc: Ext4 development
To: Xupeng Yun

Hmm, it sounds like we're hitting some kind of scaling problem. How
many CPUs/cores do you have on your server? And it would be
interesting to try varying the --numjobs parameter and see how the
various file systems behave with 1, 2, 4, 8, and 16 threads.

The other thing that's worth checking is to try using filefrag -v on
the test file after the benchmark has finished, just to make sure the
file layout is sane. It should be, but I just want to double check...

- Ted

On Fri, Mar 02, 2012 at 08:50:55AM +0800, Xupeng Yun wrote:
> On Fri, Mar 2, 2012 at 03:47, Ted Ts'o wrote:
> > Two things I'd try:
> >
> > #1) If this is a freshly created file system, the kernel may be
> > initializing the inode table in the background, and this could be
> > interfering with your benchmark workload.  To address this, you can
> > either (a) add the mount option noinit_itable, (b) add the mke2fs
> > option "-E lazy_itable_init=0" --- but this will cause the mke2fs to
> > take a lot longer, or (c) mount the file system and wait until
> > "dumpe2fs /dev/md3 | tail" shows that the last block group has the
> > ITABLE_ZEROED flag set.  For benchmarking purposes on a scratch
> > workload, option (a) above is the fast thing to do.
> >
>
> Thank you Ted, I followed this and got the same result (read IOPS ~950
> / write IOPS ~100)
>
> > #2) It could be that the file system is choosing blocks farther away
> > from the beginning of the disk, which is slower, whereas the fio on
> > the raw disk will use the blocks closest to the beginning of the disk,
> > which are the fastest ones.  You could try creating the file system so
> > it is only 10GB, and then try running fio on that small, truncated
> > file system, and see if that makes a difference.
>
> I created LVM on top of the RAID10 device, and then created a smaller
> LV (20GB). After that I took benchmarks against the very same LV with
> different filesystems. The results are interesting:
>
> xfs (read IOPS ~1700 / write IOPS ~200)
> ext4 (read IOPS ~950 / write IOPS ~100)
> ext3 (read IOPS ~900 / write IOPS ~100)
> reiserfs (read IOPS ~930 / write IOPS ~100)
> btrfs (read IOPS ~1200 / write IOPS ~120)
>
> I got very bad performance from XFS
> (http://www.spinics.net/lists/xfs/msg08688.html) about two months
> ago, which was caused by known bugs in XFS. Then I tried ext4 on some
> of my servers, and it worked very well until I got a new server set up
> with soft RAID10.
>
> What should I learn to understand what's happening? Any suggestion is
> appreciated.
>
> --
> Xupeng Yun
> http://about.me/xupeng
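
The --numjobs sweep and the filefrag check suggested in the reply could be scripted roughly as follows. This is only a sketch: the fio job used in the original benchmark is not shown in this thread, so the test file path, the job name, and every workload option other than --numjobs (random read/write mix, 4k blocks, 1g size, libaio with direct I/O) are placeholder assumptions to be replaced with the real ones. By default the script only prints the command lines and runs nothing.

```python
import shlex

# Thread counts suggested in the reply: 1, 2, 4, 8, and 16.
NUMJOBS_VALUES = [1, 2, 4, 8, 16]

# Hypothetical location of the benchmark file; not taken from the thread.
DEFAULT_TEST_FILE = "/mnt/test/fio.dat"


def fio_command(numjobs, test_file=DEFAULT_TEST_FILE):
    """Build one fio command line for a given --numjobs value.

    Only --numjobs comes from the thread. The remaining options are
    assumed placeholders for whatever the real benchmark uses.
    """
    return [
        "fio",
        "--name=scaling-test",      # assumed job name
        f"--filename={test_file}",
        "--rw=randrw",              # assumed mixed random read/write
        "--bs=4k",                  # assumed block size
        "--size=1g",                # assumed file size
        "--direct=1",               # assumed: bypass the page cache
        "--ioengine=libaio",        # assumed I/O engine
        f"--numjobs={numjobs}",
        "--group_reporting",        # one aggregate report per run
    ]


def filefrag_command(test_file=DEFAULT_TEST_FILE):
    """Build the filefrag -v call used to inspect the file layout."""
    return ["filefrag", "-v", test_file]


def sweep_commands(test_file=DEFAULT_TEST_FILE):
    """Return the fio runs in order, followed by the layout check."""
    commands = [fio_command(n, test_file) for n in NUMJOBS_VALUES]
    commands.append(filefrag_command(test_file))
    return commands


if __name__ == "__main__":
    # Dry run: print each command so it can be reviewed before use.
    for command in sweep_commands():
        print(shlex.join(command))
```

Repeating the same sweep on each file system (xfs, ext4, ext3, reiserfs, btrfs) on the same LV would show whether the gap between them changes with the thread count.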