public inbox for linux-ext4@vger.kernel.org
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Valerie Clement <valerie.clement@bull.net>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>, sandeen@redhat.com
Subject: Re: [PATCH] ext4: mballoc: fix mb_normalize_request algorithm for 1KB block size filesystems
Date: Thu, 1 May 2008 22:44:10 +0530
Message-ID: <20080501171410.GC7005@skywalker>
In-Reply-To: <1209562870.5307.12.camel@ext1.frec.bull.fr>

On Wed, Apr 30, 2008 at 03:41:10PM +0200, Valerie Clement wrote:
> mballoc: fix mb_normalize_request algorithm for 1KB block size filesystems
> 
> From: Valerie Clement <valerie.clement@bull.net>
> 
> In case of inode preallocation, the number of blocks to allocate depends
> on the file size and it is calculated in ext4_mb_normalize_group_request().
> Each group in the filesystem is then checked to find one that can be used
> for allocation; this is done in ext4_mb_good_group().
> 
> When a file bigger than 4MB is created, the requested number of blocks to
> preallocate, calculated by ext4_mb_normalize_group_request is 4096.
> However for a filesystem with 1KB block size, the maximum size of the
> block buddies used by the multiblock allocator is 2048, so none of
> groups in the filesystem satisfies the search criteria in
> ext4_mb_good_group(). Scanning all the filesystem groups impacts
> performance.

s/ext4_mb_normalize_group_request/ext4_mb_normalize_request/


That's true; the max order is block_size_bits + 1.
Can you update the commit message with the above information?
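
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code). It assumes the largest buddy chunk is 2^(bsbits + 1) blocks,
following the max order stated above, and that bsbits lies between 10 and
16. The helper names are made up for illustration and do not exist in
mballoc.c.

```c
#include <assert.h>

/*
 * Hypothetical helper: largest free chunk, in blocks, that the buddy
 * data can describe when the max order is bsbits + 1.
 */
static unsigned long max_chunk_blocks(unsigned int bsbits)
{
	/* Assumption: block sizes from 1KB (10 bits) to 64KB (16 bits). */
	assert(bsbits >= 10 && bsbits <= 16);
	return 1UL << (bsbits + 1);	/* same value as 2 << bsbits */
}

/*
 * Hypothetical helper: number of blocks in 'bytes', for sizes that
 * are a multiple of the block size.
 */
static unsigned long bytes_to_blocks(unsigned long long bytes,
				     unsigned int bsbits)
{
	return (unsigned long)(bytes >> bsbits);
}

/* Nonzero if a request of 'bytes' fits in a single buddy chunk. */
static int request_fits(unsigned long long bytes, unsigned int bsbits)
{
	return bytes_to_blocks(bytes, bsbits) <= max_chunk_blocks(bsbits);
}
```

With bsbits = 10 (1KB blocks), max_chunk_blocks() gives 2048, while a 4MB
request is 4096 blocks, so request_fits(4ULL << 20, 10) is 0. The same 4MB
request is only 1024 blocks at bsbits = 12 and fits easily.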

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


> 
> The following numbers show that:
> - on an ext4 FS with 1KB block size mounted with nodelalloc option:
> # dd if=/dev/zero of=/mnt/test/foo bs=8k count=1k conv=fsync
> 1024+0 records in
> 1024+0 records out
> 8388608 bytes (8.4 MB) copied, 35.5091 seconds, 236 kB/s
> 
> - on an ext4 FS with 1KB block size mounted with nodelalloc and nomballoc
> options:
> # dd if=/dev/zero of=/mnt/test/foo bs=8k count=1k conv=fsync
> 1024+0 records in
> 1024+0 records out
> 8388608 bytes (8.4 MB) copied, 0.233754 seconds, 35.9 MB/s
> 
> In the two cases, dd is done after creating the FS with -b1024 option,
> mounting the FS with the options specified before and flushing all caches
> using echo 3 > /proc/sys/vm/drop_caches.
> The partition size is 70GB.
> I did the same test on a 1TB partition, it took several minutes to write
> 8MB!
> 
> This patch modifies the algorithm in ext4_mb_normalize_group_request to
> calculate the number of blocks to allocate by taking into account the
> maximum size of free blocks chunks handled by the multiblock allocator.
> 
> It has also been tested for filesystems with 2KB and 4KB block sizes to
> ensure that those cases don't regress.
> 
> Signed-off-by: Valerie Clement <valerie.clement@bull.net>
> 
> ---
> 
>  mballoc.c |   19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> Index: linux-2.6.25/fs/ext4/mballoc.c
> ===================================================================
> --- linux-2.6.25.orig/fs/ext4/mballoc.c	2008-04-25 16:19:32.000000000 +0200
> +++ linux-2.6.25/fs/ext4/mballoc.c	2008-04-25 16:49:34.000000000 +0200
> @@ -2905,12 +2905,11 @@ ext4_mb_normalize_request(struct ext4_al
>  	if (size < i_size_read(ac->ac_inode))
>  		size = i_size_read(ac->ac_inode);
> 
> -	/* max available blocks in a free group */
> -	max = EXT4_BLOCKS_PER_GROUP(ac->ac_sb) - 1 - 1 -
> -				EXT4_SB(ac->ac_sb)->s_itb_per_group;
> +	/* max size of free chunks */
> +	max = 2 << bsbits;
> 
> -#define NRL_CHECK_SIZE(req, size, max,bits)	\
> -		(req <= (size) || max <= ((size) >> bits))
> +#define NRL_CHECK_SIZE(req, size, max, chunk_size)	\
> +		(req <= (size) || max <= (chunk_size))
> 
>  	/* first, try to predict filesize */
>  	/* XXX: should this table be tunable? */
> @@ -2929,16 +2928,16 @@ ext4_mb_normalize_request(struct ext4_al
>  		size = 512 * 1024;
>  	} else if (size <= 1024 * 1024) {
>  		size = 1024 * 1024;
> -	} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, bsbits)) {
> +	} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, 2 * 1024)) {
>  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
> -						(20 - bsbits)) << 20;
> -		size = 1024 * 1024;
> -	} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, bsbits)) {
> +						(21 - bsbits)) << 21;
> +		size = 2* 1024 * 1024;
> +	} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, 4 * 1024)) {
>  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
>  							(22 - bsbits)) << 22;
>  		size = 4 * 1024 * 1024;
>  	} else if (NRL_CHECK_SIZE(ac->ac_o_ex.fe_len,
> -					(8<<20)>>bsbits, max, bsbits)) {
> +					(8<<20)>>bsbits, max, 8 * 1024)) {
>  		start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
>  							(23 - bsbits)) << 23;
>  		size = 8 * 1024 * 1024;
> 
> 
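
The three branches touched by the patch can be exercised in userspace. The
following is a rough sketch only: it models the patched NRL_CHECK_SIZE
chain for file sizes above 1MB, leaves out the smaller size steps and the
start_off computation, and uses a made-up function name (normalize_large)
with a return value of 0 standing in for "none of these branches matched".

```c
#include <assert.h>

/* Patched form of the macro, with arguments parenthesized. */
#define NRL_CHECK_SIZE(req, size, max, chunk_size) \
	((req) <= (size) || (max) <= (chunk_size))

/*
 * Hypothetical model of the patched branches.
 * size:           predicted file size in bytes (must be above 1MB)
 * req_len_blocks: requested length in blocks (ac_o_ex.fe_len in the patch)
 * Returns the normalized size in bytes, or 0 if no branch matches.
 */
static unsigned long long normalize_large(unsigned long long size,
					  unsigned long long req_len_blocks,
					  unsigned int bsbits)
{
	unsigned long long max = 2ULL << bsbits;	/* max size of free chunks */

	assert(size > 1024 * 1024);

	if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, 2 * 1024))
		return 2ULL * 1024 * 1024;
	if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, 4 * 1024))
		return 4ULL * 1024 * 1024;
	if (NRL_CHECK_SIZE(req_len_blocks, (8ULL << 20) >> bsbits,
			   max, 8 * 1024))
		return 8ULL * 1024 * 1024;
	return 0;
}
```

Under this model an 8MB file on a 1KB-block filesystem (bsbits = 10)
normalizes to 2MB, which is 2048 blocks and so does not exceed 2 << bsbits.
With 2KB blocks the same file normalizes to 4MB (2048 blocks, limit 4096),
and with 4KB blocks a 16MB file normalizes to 8MB (2048 blocks, limit 8192).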
