From: Mingming Cao <cmm@us.ibm.com>
To: "Amit K. Arora" <aarora@linux.vnet.ibm.com>
Cc: linux-ext4@vger.kernel.org, suparna@in.ibm.com,
	suzuki@in.ibm.com, alex@clusterfs.com
Subject: Re: [RFC][Patch 1/2] Persistent preallocation in ext4
Date: Wed, 27 Dec 2006 15:30:44 -0800
Message-ID: <1167262245.3792.20.camel@dyn9047017103.beaverton.ibm.com>
In-Reply-To: <20061215123528.GA24572@amitarora.in.ibm.com>

On Fri, 2006-12-15 at 18:05 +0530, Amit K. Arora wrote:
> This is the first patch in the set of two.
> 
> It implements the ioctl which will be used for persistent preallocation. It is a respun of the previous patch which was posted earlier, and includes following changes:
> * Takes care of review comments by Mingming
> * The declaration of extent related macros are now moved to ext4_fs_extent.h (from ext4_fs.h)
> * Updated the logic to calculate block and max_blocks in ext4/ioctl.c, which is used to call get_blocks.
> 
> It does _not_ take care of implementing persistent preallocation for non-extent based files. It is because of the following reasons:
> * It is being considered as a rare case
> * Users can/should convert their file(s) to extent format to use this feature
> * Moreover, posix_fallocate() can be used for this purpose, if the user does not want to convert the file(s) to the extent based format.
> 
> 
> Signed-off-by: Amit Arora (aarora@in.ibm.com)
> 
Hi Amit, 

Looks good to me. A few comments :)
.....
> Index: linux-2.6.19.prealloc/fs/ext4/ioctl.c
> ===================================================================
> --- linux-2.6.19.prealloc.orig/fs/ext4/ioctl.c	2006-12-15 16:44:35.000000000 +0530
> +++ linux-2.6.19.prealloc/fs/ext4/ioctl.c	2006-12-15 17:47:00.000000000 +0530
> @@ -248,6 +248,65 @@
>  		return err;
>  	}
> 
> +	case EXT4_IOC_PREALLOCATE: {
> +		struct ext4_falloc_input input;
> +		handle_t *handle;
> +		ext4_fsblk_t block, max_blocks;
> +		int ret, ret2, nblocks = 0, retries = 0;
> +		struct buffer_head map_bh;
> +		unsigned int blkbits = inode->i_blkbits;
> +
> +		if (IS_RDONLY(inode))
> +			return -EROFS;
> +
> +		if (copy_from_user(&input,
> +			(struct ext4_falloc_input __user *) arg, sizeof(input)))
> +			return -EFAULT;
> +
> +		if (input.len == 0)
> +			return -EINVAL;
> +
> +		if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
> +			return -ENOTTY;
> +
> +		block = input.offset >> blkbits;
> +		max_blocks = (EXT4_BLOCK_ALIGN(input.len + input.offset,
> +						blkbits) >> blkbits) - block;
> +		handle=ext4_journal_start(inode,
> +				EXT4_DATA_TRANS_BLOCKS(inode->i_sb)+max_blocks);
> +		if (IS_ERR(handle))
> +			return PTR_ERR(handle);
> +retry:
> +		ret = 0;
> +		while(ret>=0 && ret<max_blocks)
> +		{
> +			block = block + ret;
> +			max_blocks = max_blocks - ret;
> +	  		ret = ext4_ext_get_blocks(handle, inode, block,
> +					max_blocks, &map_bh,
> +					EXT4_CREATE_UNINITIALIZED_EXT, 0);
> +			if(ret > 0 && test_bit(BH_New, &map_bh.b_state))
> +				nblocks = nblocks + ret;
> +		}


ext4_ext_get_blocks() returns 0 when it is mapping (not allocating) a
hole. In our case we are allocating, so it should not be possible for
ext4_ext_get_blocks() to return 0 here. I think we should quit the loop
and BUG_ON() if ret == 0.
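
To make the suggestion concrete, here is a rough userspace sketch of the
loop shape, not kernel code. fake_get_blocks() and prealloc_range() are
made-up names, the "chunk" parameter is only there to simulate an allocator
that maps a few blocks per call, and assert() stands in for the kernel's
BUG_ON().

```c
#include <assert.h>

/* Hypothetical stand-in for ext4_ext_get_blocks(): pretends the allocator
 * maps at most `chunk` blocks per call. Not the kernel function. */
static long fake_get_blocks(unsigned long block, unsigned long max_blocks,
                            unsigned long chunk)
{
    (void)block;
    return (long)(max_blocks < chunk ? max_blocks : chunk);
}

/* Walk the range the way the patch's loop does, but treat ret == 0 as a
 * bug instead of going around again. Returns the number of blocks mapped,
 * or a negative error code. */
static long prealloc_range(unsigned long block, unsigned long max_blocks,
                           unsigned long chunk)
{
    long total = 0;

    while (max_blocks > 0) {
        long ret = fake_get_blocks(block, max_blocks, chunk);

        if (ret < 0)
            return ret;       /* propagate errors such as -ENOSPC */
        assert(ret != 0);     /* kernel code would use BUG_ON(ret == 0) */

        block += (unsigned long)ret;
        max_blocks -= (unsigned long)ret;
        total += ret;
    }
    return total;
}
```

With this shape, prealloc_range(10, 25, 8) makes four calls (8, 8, 8 and 1
blocks) and returns 25. A zero return from the allocator stops the loop
immediately instead of repeating the same request.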

> +		if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb,
> +						&retries))
> +			goto retry;
> +
> +		if(nblocks) {
> +			mutex_lock(&inode->i_mutex);
> +			inode->i_size = inode->i_size + (nblocks >> blkbits);
> +			EXT4_I(inode)->i_disksize = inode->i_size;
> +			mutex_unlock(&inode->i_mutex);
> +		}

Hmm... We should not need to worry about inode->i_size if we are
preallocating blocks for holes.

Also, looking at other places in the kernel that call ext4_*_get_blocks(),
it seems not all of them are protected by the i_mutex lock. I think it is
probably okay not to hold i_mutex while calling ext4_ext_get_blocks().
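
One possible reading of the i_size comment, sketched in userspace C: a
range that lies entirely inside the file (a hole) leaves the size alone.
Growing the size when the range runs past the current end of file is an
assumption of this sketch, not something the patch or this review settles.
size_after_prealloc() is a hypothetical helper name.

```c
#include <assert.h>

/* Hypothetical helper: file size after preallocating [offset, offset+len).
 * A range fully inside the file leaves i_size untouched; a range that runs
 * past EOF extends it (assumed behaviour, see the note above). */
static long long size_after_prealloc(long long i_size, long long offset,
                                     long long len)
{
    long long end;

    assert(offset >= 0 && len > 0);   /* the patch rejects len == 0 */
    end = offset + len;
    return end > i_size ? end : i_size;
}
```

For an 8192-byte file, filling a hole at offset 0 with len 4096 keeps the
size at 8192, while offset 4096 with len 8192 would give 12288.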

> +
> +		ext4_mark_inode_dirty(handle, inode);
> +		ret2 = ext4_journal_stop(handle);
> +		if(ret > 0)
> +			ret = ret2;
> +
> +		return ret > 0 ? nblocks : ret;
> +	}
> +

Since the API takes the number of bytes to preallocate, should we convert
the blocks back to bytes when returning to the user?

Right now it returns the number of allocated blocks to the user. Do we need
to worry about a range that is partly a hole and partly already-allocated
blocks? In that case nblocks (the newly preallocated blocks) will be less
than max_blocks (the number of blocks the application asked for). I wonder
what other filesystems like xfs do. Maybe we should do the same thing.
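
For reference, the byte/block arithmetic in question can be sketched in
userspace C as below. byte_range_to_blocks() mirrors the block and
max_blocks calculation in the patch (round the end of the range up to a
block boundary), and blocks_to_bytes() is the conversion being asked about.
Both names and the struct are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

struct blk_range {
    uint64_t first;   /* first block touched by the byte range */
    uint64_t count;   /* number of blocks covering the byte range */
};

/* Convert a byte range to the block range that covers it. */
static struct blk_range byte_range_to_blocks(uint64_t offset, uint64_t len,
                                             unsigned int blkbits)
{
    uint64_t blksize, first, end;
    struct blk_range r;

    assert(len > 0 && blkbits < 64);
    blksize = (uint64_t)1 << blkbits;
    first = offset >> blkbits;
    end = (offset + len + blksize - 1) >> blkbits;   /* round up */

    r.first = first;
    r.count = end - first;
    return r;
}

/* Convert a block count back to bytes for the return value. */
static uint64_t blocks_to_bytes(uint64_t nblocks, unsigned int blkbits)
{
    return nblocks << blkbits;
}
```

With 4096-byte blocks (blkbits = 12), offset 1000 and len 5000 cover blocks
0 and 1, so count is 2 and the byte figure is 8192. Note that a byte count
derived from blocks can be larger than the len the caller passed in, because
the range is rounded out to block boundaries, and smaller when part of the
range was already allocated.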

Thanks,
Mingming

