From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Darrick J. Wong"
Subject: Re: [PATCH 11/14] libext2fs: use fallocate for creating journals and hugefiles
Date: Mon, 18 May 2015 12:24:52 -0700
Message-ID: <20150518192452.GI30577@birch.djwong.org>
References: <20150514002108.10785.85860.stgit@birch.djwong.org>
	<20150514002219.10785.76994.stgit@birch.djwong.org>
	<20150517033925.GH4489@thunk.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-ext4@vger.kernel.org
To: "Theodore Ts'o"
Return-path:
Received: from aserp1040.oracle.com ([141.146.126.69]:20368 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932068AbbERTY5 (ORCPT );
	Mon, 18 May 2015 15:24:57 -0400
Content-Disposition: inline
In-Reply-To: <20150517033925.GH4489@thunk.org>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On Sat, May 16, 2015 at 11:39:25PM -0400, Theodore Ts'o wrote:
> On Wed, May 13, 2015 at 05:22:19PM -0700, Darrick J. Wong wrote:
> > Use the new fallocate API for creating the journal and the mk_hugefile
> > feature.
> >
> > Signed-off-by: Darrick J. Wong
>
> I tried applying patches 9-11, and I found a regression.  If you add
> the following stanza to /etc/mke2fs.conf:
>
>     hugefile = {
>         features = extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,^resize_inode,sparse_super2
>         hash_alg = half_md4
>         num_backup_sb = 0
>         packed_meta_blocks = 1
>         make_hugefiles = 1
>         inode_ratio = 4194304
>         hugefiles_dir = /store
>         hugefiles_name = big-data
>         hugefiles_digits = 0
>         hugefiles_size = 0
>         hugefiles_align = 256M
>         num_hugefiles = 1
>         zero_hugefiles = false
>         flex_bg_size = 262144
>     }
>
> ... then "mke2fs -Fq -T hugefile /dev/sdXX" should create a file
> system with a single file /store/big-data that starts at offset 256M
> and consumes the rest of the space.
> For example, try the commands
>
>     % time mke2fs -Fq -T hugefile /tmp/foo.img 8T
>     % debugfs -R "extents /store/big-data" /tmp/foo.img
>
> With this patch applied, the file /store/big-data is a zero-length
> file, instead of a very big file consuming the whole disk.

Oops.  I missed that subtlety; it's a pretty quick fix to make it fallocate
all the way to the end.  I also found a small bookkeeping error; fixing it
eliminates the churn in the test case expect files.

> Arguably there should have been a test so that this regression would
> be detected automatically.

I'll take care of adding it.

> (BTW, note how quickly the file /store/big-data is created using the
> mk_hugefile code.  Although I understand the new fallocate code is
> more general, hopefully this generality doesn't cause performance
> regression in terms of the file system layout or CPU time required to
> create the big-data file.)

A lot of the complexity deals with figuring out whether, for a given hole, we
should merely try to extend (or merge) the left and right extents.  For empty
files, it figures out that there is no left/right extent and simply cuts to
the alloc-range-and-map loop.

I noticed that it seemed to slow down by maybe a tenth of a second (out of 5)
for a 4TB file; is that too much of a regression?

--D

> > --- a/tests/r_32to64bit_meta/expect
> > +++ b/tests/r_32to64bit_meta/expect
> > @@ -35,8 +35,8 @@ Change in FS metadata:
> >   Inode count: 65536
> >   Block count: 524288
> >   Reserved block count: 26214
> > --Free blocks: 858
> > -+Free blocks: 852
> > +-Free blocks: 857
> > ++Free blocks: 851
> >   Free inodes: 65046
> >   First block: 1
> >   Block size: 1024
>
> Why these changes?  This implies the new fallocate code isn't creating
> an extent tree that isn't quite as efficient as the original code?
>
>     - Ted
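
For reference, here is a rough sketch of the hole-filling decision described
above (extend or merge the neighbouring extents, otherwise go straight to
alloc-and-map).  It is a toy model under stated assumptions, not the libext2fs
code: struct toy_extent, enum hole_plan and plan_hole() are made-up names, and
it only checks logical adjacency.  The real code would presumably also have to
check whether the adjacent physical blocks are actually free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy extent: maps logical blocks [lblk, lblk + len) to
 * physical blocks starting at pblk.  Not the libext2fs structure. */
struct toy_extent {
	unsigned long long lblk;
	unsigned long long pblk;
	unsigned long long len;
};

/* Hypothetical outcomes of looking at one hole. */
enum hole_plan {
	PLAN_ALLOC_AND_MAP,	/* no neighbour touches the hole */
	PLAN_EXTEND_LEFT,	/* grow the extent that ends at the hole */
	PLAN_EXTEND_RIGHT,	/* grow the extent that starts after the hole */
	PLAN_MERGE_BOTH		/* the hole touches both neighbours */
};

/*
 * Decide how to fill the hole [start, start + len).  Assumes ext[] is
 * sorted by lblk and that no extent overlaps the hole.
 */
enum hole_plan plan_hole(const struct toy_extent *ext, size_t n,
			 unsigned long long start, unsigned long long len)
{
	const struct toy_extent *left = NULL, *right = NULL;
	size_t i;

	assert(len > 0);

	for (i = 0; i < n; i++) {
		if (ext[i].lblk + ext[i].len <= start)
			left = &ext[i];		/* last extent before the hole */
		else if (ext[i].lblk >= start + len && right == NULL)
			right = &ext[i];	/* first extent after the hole */
	}

	/* Only neighbours that touch the hole are candidates for extension. */
	if (left && left->lblk + left->len != start)
		left = NULL;
	if (right && right->lblk != start + len)
		right = NULL;

	if (left && right)
		return PLAN_MERGE_BOTH;
	if (left)
		return PLAN_EXTEND_LEFT;
	if (right)
		return PLAN_EXTEND_RIGHT;

	/* Empty file, or an isolated hole: nothing to extend. */
	return PLAN_ALLOC_AND_MAP;
}
```

For an empty file (n == 0) the loop never runs, so plan_hole() returns
PLAN_ALLOC_AND_MAP immediately, which is the cheap path taken in the hugefile
case.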