linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Amit K. Arora" <aarora@linux.vnet.ibm.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: linux-ext4@vger.kernel.org, suparna@in.ibm.com, suzuki@in.ibm.com
Subject: Re: [RFC][Patch 1/1] Persistent preallocation in ext4
Date: Wed, 13 Dec 2006 15:31:58 +0530	[thread overview]
Message-ID: <20061213100158.GA6517@amitarora.in.ibm.com> (raw)
In-Reply-To: <1165969238.3771.34.camel@dyn9047017103.beaverton.ibm.com>

On Tue, Dec 12, 2006 at 04:20:38PM -0800, Mingming Cao wrote:
> On Tue, 2006-12-12 at 11:53 +0530, Amit K. Arora wrote:
> > +
> > +		if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
> > +			return -ENOTTY;
> >
> 
> Supporting preallocation for extent based files seems fairly
> straightforward.  I agree we should look at this first.  After get this
> done, it probably worth re-consider whether to support preallocation for
> non-extent based files on ext4. I could imagine user upgrade from ext3
> to ext4, and expecting to use preallocation on those existing files....

I gave a thought on this initially. But, I was not sure how we should
implement preallocation in a non-extent based file. Using extents we can
mark a set of blocks as unitialized, but how will we do this for
non-extent based files ? If we do not have a way to mark blocks
uninitialized, when someone will try to read from a preallocated block,
it will return junk/stale data instead of zeroes.

But, if we can think of a solution here then it will be as simple as
removing the check above and replacing "ext4_ext_get_blocks()" with
"ext4_get_blocks_wrap()" in the while() loop.

> 
> > +
> > +		block = EXT4_BLOCK_ALIGN(input.offset, blkbits) >> blkbits;
> > +		max_blocks = EXT4_BLOCK_ALIGN(input.len, blkbits) >> blkbits;

I was wondering if I should change above lines to this :

+		block = input.offset >> blkbits;
+		max_blocks = (EXT4_BLOCK_ALIGN(input.len+input.offset,
					blkbits) >> blkbits) - block;

Reason is that the block which contains the offset, should also be
preallocated. And the max_blocks should be calculated accordingly.

> > +		while(ret>=0 && ret<max_blocks)
> > +		{
> > +			block = block + ret;
> > +			max_blocks = max_blocks - ret;
> > +	  		ret = ext4_ext_get_blocks(handle, inode, block,
> > +					max_blocks, &map_bh,
> > +					EXT4_CREATE_UNINITIALIZED_EXT, 1);
> 
> Since the interface takes offset and number of blocks to allocate, I
> assuming we are going to handle holes in preallocation, thus, we cannot
> not mark the extend_size flag to 1 when calling ext4_ext_get_blocks.
> 
> I think we should update i_size and i_disksize after preallocation. Oh,
> to protect parallel updating i_size, we have to take i_mutex down.

Okay. So, is this what you want to be done here :

+retry:
+                ret = 0;
+                while(ret>=0 && ret<max_blocks)
+                {
+                        block = block + ret;
+                        max_blocks = max_blocks - ret;
+                        ret = ext4_ext_get_blocks(handle, inode, block,
+                                        max_blocks, &map_bh,
+                                        EXT4_CREATE_UNINITIALIZED_EXT,0);
+                        if(ret > 0 && test_bit(BH_New, &map_bh.b_state))
+                                nblocks = nblocks + ret;
+                }
+                if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb,
+                                                &retries))
+                        goto retry;
+
+                if(nblocks) {
+                        mutex_lock(&inode->i_mutex);
+                        inode->i_size = inode->i_size + (nblocks >> blkbits);
+                        EXT4_I(inode)->i_disksize = inode->i_size;
+                        mutex_unlock(&inode->i_mutex);
+                }

> 
> > +		}
> > +		ext4_mark_inode_dirty(handle, inode);
> > +		ext4_journal_stop(handle);
> > +
> 
> Error code returned by ext4_journal_stop() is being ignored here, is
> this right?
> Well, there are other places in ext34/ioctl.c which ignore the return
> returned by ext4_journal_stop(), maybe should fix this in a separate
> patch.

Agreed. I think following should take care of it:

+               ext4_mark_inode_dirty(handle, inode);
+               ret2 = ext4_journal_stop(handle);
+               if(ret > 0)
+                       ret = ret2;
+               return ret > 0 ? nblocks : ret;

> > +		return ret>0?0:ret;
> > +	}
> >
> >
> Oh, what if we failed to allocate the full amount of blocks? i.e, the
> ext4_ext_get_blocks() returns -ENOSPC error and exit the loop early. Are
> we going to return error, or try something like
> 
> if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
> 	goto retry
> 
> I wonder it might be useful to return the actual number of blocks
> preallocated back to the application.

Ok. Yes, makes sense. We can return the number of "new" blocks like
this:
+               return ret > 0 ? nblocks : ret;



Please let me know if you agree with the above set of changes, and any
further comments you have. I will then update and test the new patch and
post it again. Thanks!

Regards,
Amit Arora

  reply	other threads:[~2006-12-13 10:02 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-05 13:43 [RFC][Patch 1/1] Persistent preallocation in ext4 Amit K. Arora
2006-12-06  5:58 ` Amit K. Arora
2006-12-12  1:28   ` Mingming Cao
2006-12-12  6:23     ` Amit K. Arora
2006-12-13  0:20       ` Mingming Cao
2006-12-13 10:01         ` Amit K. Arora [this message]
2006-12-13 13:36           ` Dave Kleikamp
2006-12-13 15:38             ` Suparna Bhattacharya
2006-12-13 15:54             ` Mingming Cao
2006-12-15 12:35   ` [RFC][Patch 1/2] " Amit K. Arora
2006-12-19 11:05     ` Amit K. Arora
     [not found]       ` <20061219211206.GO5937@schatzie.adilger.int>
2006-12-20  6:28         ` Amit K. Arora
2006-12-27 23:30     ` Mingming Cao
2007-01-02 11:04       ` Amit K. Arora
2007-01-02 22:47         ` Mingming Cao
2007-01-09  9:05         ` Amit K. Arora
2006-12-15 12:39 ` [RFC][Patch 2/2] " Amit K. Arora
2006-12-15 23:02   ` Andreas Dilger
2006-12-16  4:30     ` Amit K. Arora
2006-12-19 11:42   ` Amit K. Arora
2006-12-19 11:54     ` Amit K. Arora
2006-12-19 21:14     ` Andreas Dilger
2006-12-19 21:23       ` Eric Sandeen
2006-12-20  8:19         ` Amit K. Arora
2006-12-22 15:16       ` Amit K. Arora
2006-12-22 15:31         ` Alex Tomas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20061213100158.GA6517@amitarora.in.ibm.com \
    --to=aarora@linux.vnet.ibm.com \
    --cc=cmm@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=suparna@in.ibm.com \
    --cc=suzuki@in.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).