From: Tao Ma <tao.ma@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH 17/41] ocfs2: Add CoW support.
Date: Fri, 21 Aug 2009 10:04:18 +0800
Message-ID: <4A8E00A2.1050902@oracle.com>
In-Reply-To: <20090821005932.GE10558@mail.oracle.com>



Joel Becker wrote:
> On Tue, Aug 18, 2009 at 02:19:18PM +0800, Tao Ma wrote:
>> +		if (*cow_len + leaf_clusters >= max_clusters) {
>> +			if (*cow_len == 0) {
>> +				/*
>> +				 * cpos is in a very large extent record.
>> +				 * So just split max_clusters from the
>> +				 * extent record.
>> +				 */
>> +				if ((rec_end - cpos) <= max_clusters) {
>> +					/*
>> +					 * We can take max_clusters off
>> +					 * the end and cover all of our
>> +					 * write.
>> +					 */
>> +					*cow_start = rec_end - max_clusters;
> 
> 	I just had a thought.  What if leaf_clusters is more than
> max_clusters, but our cpos is near the end of the extent?  This only
> works for the first pass (*cow_len == 0).  Imagine we have an extent of
> 1.5MB, and our cpos is at 1MB, and we want to write 1MB.  Clustersize of
> 256K.
> 
>   |<-- rec->e_cpos == 0                 leaf_clusters == 6 -->|
>   |--<256>--|--<256>--|--<256>--|--<256>--|--<256>--|--<256>--|
>                                           `cpos
> 
> When we get here, *cow_len is 0, *cow_start is 0 (rec->e_cpos), rec_end
> is 6 (leaf_clusters), cpos is 4 because the user wants to write at 1MB
> into the file, max_clusters is 4 (1MB) and write_len is 4 (1MB write).
> 
> 1. *cow_len + leaf_clusters >= max_clusters
>    0        + 6             >= 4  -> TRUE
> 
> 2. (rec_end - cpos) <= max_clusters
>    6        - 4     <= 4  -> TRUE
> 
> 3. *cow_start = rec_end - max_clusters
>               = 6       - 4  -> 2
> 
> 4. *cow_len = max_clusters
>             = 4
> 
> 	This leaves us with a *cow_start of 2 and a *cow_len of 4.  This
> correctly gets us the last MB of the extent.  That's good.
> However, our write was 1MB from cpos 4.  We're returning to the caller
> 1MB from cpos 2.  We're going to do a short write.
Oh, yes, this really is a bug. I guess Tristan's test case didn't catch 
it because we are currently called from ocfs2_write_begin. Normally 
write_len is at most PAGE_SIZE, and the start is also aligned to 
PAGE_SIZE. On our x86_64 boxes, where PAGE_SIZE=4096, we always CoW the 
right position, but with PAGE_SIZE=64k the bug will be exposed.
So how about changing the first check to:
			if (((rec_end - cpos) <= max_clusters) &&
			    (cpos + write_len <= rec_end)) {
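
To make the effect of the extra condition concrete, here is a minimal
stand-alone sketch of the first-pass (*cow_len == 0) branch selection. It
is not the kernel function: pick_first_pass(), struct cow_range and the
use_extra_check flag are hypothetical names for illustration only, and all
values are in clusters.

```c
/*
 * Hypothetical model of the first-pass branch from the quoted patch.
 * rec_cpos/rec_end delimit the extent record, cpos/write_len describe
 * the write, and use_extra_check toggles the additional
 * (cpos + write_len <= rec_end) condition proposed in this reply.
 */
struct cow_range {
	unsigned int start;
	unsigned int len;
};

static struct cow_range pick_first_pass(unsigned int rec_cpos,
					unsigned int rec_end,
					unsigned int cpos,
					unsigned int write_len,
					unsigned int max_clusters,
					int use_extra_check)
{
	struct cow_range r = { rec_cpos, 0 };
	int fits_at_end = (rec_end - cpos) <= max_clusters;

	if (use_extra_check)
		fits_at_end = fits_at_end && (cpos + write_len <= rec_end);

	if (fits_at_end) {
		/* Take max_clusters off the end of the extent. */
		r.start = rec_end - max_clusters;
	} else if (r.start + max_clusters > cpos + write_len) {
		/* Take max_clusters off the front: start is already set. */
	} else {
		/* Try a max_clusters-aligned start, else fall back to cpos. */
		r.start += (cpos - r.start) & ~(max_clusters - 1);
		if (r.start + max_clusters < cpos + write_len)
			r.start = cpos;
	}
	r.len = max_clusters;
	return r;
}
```

With Joel's numbers (rec_cpos 0, rec_end 6, cpos 4, write_len 4,
max_clusters 4), pick_first_pass(0, 6, 4, 4, 4, 0) picks start 2, while
pick_first_pass(0, 6, 4, 4, 4, 1) picks start 4, so the range begins at
cpos. The sketch only models which start is chosen; it does not clamp the
length to the end of the extent.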

> 
>> +				} else if ((*cow_start + max_clusters) >
>> +					   (cpos + write_len)) {
> 
> 	Should this be >=?  I think it should be, and I think it's my
> fault.  But check to make sure.
Yeah, ">=" will be more accurate. Actually, with ">" we still survive, 
just less efficiently, since the "else" branch will cover this case and 
CoW a different part (starting from another position).
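
The boundary case can be checked with a small sketch that models only the
"front" test and the fallback branch, assuming the end-of-extent check has
already failed. front_or_fallback() and its inclusive/took_front
parameters are hypothetical names, not part of the patch; values are in
clusters.

```c
/*
 * Hypothetical model of the "take max_clusters off the front" test and
 * the fallback that follows it. inclusive selects ">=" instead of ">".
 * Returns the chosen start; *took_front reports which branch ran.
 */
static unsigned int front_or_fallback(unsigned int cow_start,
				      unsigned int cpos,
				      unsigned int write_len,
				      unsigned int max_clusters,
				      int inclusive,
				      int *took_front)
{
	unsigned int limit = cow_start + max_clusters;
	unsigned int end = cpos + write_len;

	*took_front = inclusive ? (limit >= end) : (limit > end);
	if (*took_front)
		return cow_start;	/* NOOP branch: start is already set */

	/* Fallback: align to a multiple of max_clusters, else use cpos. */
	cow_start += (cpos - cow_start) & ~(max_clusters - 1);
	if (cow_start + max_clusters < end)
		cow_start = cpos;
	return cow_start;
}
```

For a write that ends exactly at cow_start + max_clusters (cow_start 0,
cpos 2, write_len 2, max_clusters 4), ">" takes the fallback and ">="
takes the front branch, yet both return start 0 because the mask clears
the offset. With a max_clusters that is not a power of two (an assumed
value of 6, with cpos 2 and write_len 4), the ">" path returns start 2
instead of 0.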

Regards,
Tao
> 
>> +					/*
>> +					 * We can take max_clusters off
>> +					 * the front and cover all of
>> +					 * our write.
>> +					 */
>> +					/* NOOP, *cow_start is already set */
>> +				} else {
>> +					/*
>> +					 * We're CoWing more data than
>> +					 * write_len for contiguousness,
>> +					 * but it doesn't fit at the
>> +					 * front or end of this extent.
>> +					 * Let's try to slice the extent
>> +					 * up nicely.  Optimally, our
>> +					 * CoW region starts at a
>> +					 * multiple of max_clusters.  If
>> +					 * that doesn't fit, we give up
>> +					 * and just CoW at cpos.
>> +					 */
>> +					*cow_start +=
>> +						(cpos - *cow_start) &
>> +							~(max_clusters - 1);
>> +					if ((*cow_start + max_clusters) <
>> +					    (cpos + write_len))
>> +						*cow_start = cpos;
>> +				}
>> +			}
>> +			*cow_len = max_clusters;
>> +			break;
>> +		} else
>> +			*cow_len += leaf_clusters;
>> +
>> +		/*
>> +		 * If we reach the end of the extent block and don't get enough
>> +		 * clusters, continue with the next extent block if possible.
>> +		 */
>> +		if (i + 1 == le16_to_cpu(el->l_next_free_rec) &&
>> +		    eb && eb->h_next_leaf_blk) {
>> +			brelse(eb_bh);
>> +			eb_bh = NULL;
>> +
>> +			ret = ocfs2_read_extent_block(INODE_CACHE(inode),
>> +					       le64_to_cpu(eb->h_next_leaf_blk),
>> +					       &eb_bh);
>> +			if (ret) {
>> +				mlog_errno(ret);
>> +				goto out;
>> +			}
>> +
>> +			eb = (struct ocfs2_extent_block *) eb_bh->b_data;
>> +			el = &eb->h_list;
>> +			i = -1;
>> +		}
>> +	}
>> +
>> +out:
>> +	brelse(eb_bh);
>> +	return ret;
>> +}
> 
> Joel
> 
