linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Goffredo Baroncelli <kreijack@libero.it>
To: WeiFeng Liu <weifeng.liu@hushmail.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC PATCH] Decrease meta fragments by using a caterpillar band Method (Ver. 2)
Date: Tue, 29 May 2012 19:32:50 +0200	[thread overview]
Message-ID: <4FC50842.8080304@libero.it> (raw)
In-Reply-To: <20120529162442.7D3ABA6E45@smtp.hushmail.com>

Hi Liu,

On 05/29/2012 06:24 PM, WeiFeng Liu wrote:
> This is a several bugs fixed version since my first patch commit, and added
> patch of btrfs-prog
> 
> 
> Introduction and brief speculate of values and penalties:
> 
> 	When a tree block need to be created, we offer, say, 2 or 3 blocks for it, 
> then pick one from the continuous blocks. If this tree block needs a cow,
> another free block from these continuous blocks can be grabbed, and the old one
> is freed for next cow.


What happens if the block is not COW-ed *and freed* but COW-ed only
(think about a snapshot updated) ? I.e, what happens if the user makes
5-6 snapshot and the caterpillar-size is 3 ?

> 
> 	In the most ideal condition only 2 continuous blocks are kept for any times
> of cowing a tree block -- image a caterpillar band by which my method is named.

Apart my doubt above, I am very interested on the performances. However
I have some doubts about the space efficiency.
I have the impression that today BTRFS consumes a lot of space for the
meta-data.

On my linux box:

	ghigo@venice:~$ sudo btrfs fi df /
	Data: total=19.01GB, used=14.10GB
	System, DUP: total=8.00MB, used=4.00KB
	System: total=4.00MB, used=0.00
	Metadata, DUP: total=2.00GB, used=933.55MB
	Metadata: total=8.00MB, used=0.00

So basically the metadata are bout the 6% of the data (0.9GB / 14.1GB ).

But with your idea, BTRFS should reserve 3 blocks for every metadata block.
This means that the BTRFS ratio metadata/data will increase to 18%
(6%*3). Which is not a so negligible value.

GB

> 
> 	Given a scene that there are two file systems: one with COW and the other
> NOCOW, each has 1GB space for metadata with it's first 100MB has been used, and
> let me focus only on ops of modifying metadata and ignore deleting metadata for
> simple reason. 
> 
> 	As we can image that the NOCOW fs would likely keep the layout of its 100MB
> meta to most neat: changes can be overwritten in the original places and leave
> the rest 900MB untouched. But it is lack of the excellent feature to assure data
> integrity which owned only by COW fs.
> 
> 	However, only modifying metadata though, the COW fs would make holes in 
> the first 100MB and write COWed meta into the rest 900MB space, in the extreme
> condition, the whole 1GB space would be fragmented finally and scattered by that
> 100MB metadata. I don't think btrfs will be easily trap into such bad state, as
> I understood we have extent, chunk, cluster and maybe other methods(tell me
> please) to slow fragmenting, but it seems that there are no dedicate method
> (fix me) to help COW getting rid of this type of fragments which NOCOW fs does
> not incline to make.
> 
> 	I introduce the caterpillar band method as a compromise. It use 300MB for
> meta to avoid such fragments and without lost of COW feature, in the instance,
> that means three countinues blocks are used for a single tree block, the tree
> block will be circularly updated(cowed) within its three countinues blocks.
> 
> 	Penalties? Yes there are, thanks to Arne Jansen and Liu Bo. As Arne Jansen
> indicated, the most disadvantage of the method will be that this will basically
> limit us to 1/3th of platter speed to write meta when using spinning drives and
> to 1/4th if using four countinues blocks for a tree block.
> 
> 	About readahead, which will be also down to the 1/3th of NOCOW fs rate, but
> I would discreetly think it as an advantage rather than penalty comparing worse
> condition which COW would get -- nearly since the first COW, the new tree blocks
> cowed would be 50MB far away from their original neighbor blocks normally, and
> after frequent random modify ops, would the worst conditition be that every
> dozen of tree blocks newly cowed are more than 50MB far away from their original
> neighbor blocks if in equal probability?
> 
> 	So permit me to think readahead is only usefull for NOCOW fs in this
> scenario, because it always keeps its original 100MB continued, and my way would
> keep 1/3 readahead rate vs maybe-zero by pure COW if worstly.
> 
> 	Of course, both penalties and values are only for metadata and will not
> affect user date read/write, my patch is only applied for cow tree blocks. 
> 	But if there are large number of small files(size<4k), values and penalties
> will also affect those small user data R/W.
> 
> 	I have not made tests for my patch by now, it still need some time to get
> more check for both strategy and code in patch and fix possible bugs before
> test, any comments are welcome.
> 
> 
> Thanks
> 
> 
> signed-off-by WeiFeng Liu
> 523f28f9b3d9c710cacc31dbba644efb1678cf62
> 
> ---
> 
> diff -uprN a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> --- a/fs/btrfs/ctree.c	2012-05-21 18:42:51.000000000 +0000
> +++ b/fs/btrfs/ctree.c	2012-05-29 23:08:19.000000000 +0000
> @@ -444,9 +444,21 @@ static noinline int __btrfs_cow_block(st
>  	} else
>  		parent_start = 0;
>  
> -	cow = btrfs_alloc_free_block(trans, root, buf->len, parent_start,
> -				     root->root_key.objectid, &disk_key,
> -				     level, search_start, empty_size, 1);
> + 	if (root->fs_info->cater_factor > 1) {	
> +		if (btrfs_cater_factor(btrfs_header_cater(buf)) > 1)			
> +			cow = btrfs_grab_cater_block(trans, root, buf, parent_start,
> +							root->root_key.objectid, &disk_key,
> +							level, search_start, empty_size, 1);
> +		else			
> +			cow = btrfs_alloc_free_block_cater(trans, root, buf->len, 
> +							parent_start,
> +							root->root_key.objectid, &disk_key,
> +							level, search_start, empty_size, 1);
> +	} else {
> +		cow = btrfs_alloc_free_block(trans, root, buf->len, parent_start,
> + 				     root->root_key.objectid, &disk_key,
> + 				     level, search_start, empty_size, 1);
> +	}
>  	if (IS_ERR(cow))
>  		return PTR_ERR(cow);
>  
> diff -uprN a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> --- a/fs/btrfs/ctree.h	2012-05-21 18:42:51.000000000 +0000
> +++ b/fs/btrfs/ctree.h	2012-05-29 23:08:19.000000000 +0000
> @@ -341,6 +341,7 @@ struct btrfs_header {
>  	__le64 owner;
>  	__le32 nritems;
>  	u8 level;
> +	u8 cater_index_factor;
>  } __attribute__ ((__packed__));
>  
>  #define BTRFS_NODEPTRS_PER_BLOCK(r) (((r)->nodesize - \
> @@ -544,6 +545,7 @@ struct btrfs_extent_item {
>  	__le64 refs;
>  	__le64 generation;
>  	__le64 flags;
> +	u8 cater_index_factor;
>  } __attribute__ ((__packed__));
>  
>  struct btrfs_extent_item_v0 {
> @@ -1222,6 +1224,7 @@ struct btrfs_fs_info {
>  
>  	unsigned data_chunk_allocations;
>  	unsigned metadata_ratio;
> +	unsigned cater_factor;
>  
>  	void *bdev_holder;
>  
> @@ -1761,6 +1764,8 @@ BTRFS_SETGET_FUNCS(extent_refs, struct b
>  BTRFS_SETGET_FUNCS(extent_generation, struct btrfs_extent_item,
>  		   generation, 64);
>  BTRFS_SETGET_FUNCS(extent_flags, struct btrfs_extent_item, flags, 64);
> +BTRFS_SETGET_FUNCS(extent_cater, struct btrfs_extent_item,
> +			cater_index_factor, 8);
>  
>  BTRFS_SETGET_FUNCS(extent_refs_v0, struct btrfs_extent_item_v0, refs, 32);
>  
> @@ -1781,6 +1786,16 @@ static inline void btrfs_set_tree_block_
>  	write_eb_member(eb, item, struct btrfs_tree_block_info, key, key);
>  }
>  
> +static inline u8 btrfs_cater_factor(u8 cater_index_factor)
> +{
> +	return (cater_index_factor & 15);
> +}
> +
> +static inline u8 btrfs_cater_index(u8 cater_index_factor)
> +{
> +	return (cater_index_factor >> 4);
> +}
> +
>  BTRFS_SETGET_FUNCS(extent_data_ref_root, struct btrfs_extent_data_ref,
>  		   root, 64);
>  BTRFS_SETGET_FUNCS(extent_data_ref_objectid, struct btrfs_extent_data_ref,
> @@ -2042,6 +2057,8 @@ BTRFS_SETGET_HEADER_FUNCS(header_owner,
>  BTRFS_SETGET_HEADER_FUNCS(header_nritems, struct btrfs_header, nritems, 32);
>  BTRFS_SETGET_HEADER_FUNCS(header_flags, struct btrfs_header, flags, 64);
>  BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
> +BTRFS_SETGET_HEADER_FUNCS(header_cater, struct btrfs_header,
> +			cater_index_factor, 8);
>  
>  static inline int btrfs_header_flag(struct extent_buffer *eb, u64 flag)
>  {
> @@ -2445,7 +2462,20 @@ struct extent_buffer *btrfs_alloc_free_b
>  					struct btrfs_root *root, u32 blocksize,
>  					u64 parent, u64 root_objectid,
>  					struct btrfs_disk_key *key, int level,
> +					u64 hint, u64 empty_size, int for_cow);					
> +int btrfs_cater_blocks_free(struct btrfs_root *root, u64 start, u64 len,
> +					u8 factor, u8 index);
> +struct extent_buffer *btrfs_grab_cater_block(struct btrfs_trans_handle *trans,
> +					struct btrfs_root *root, struct extent_buffer *buf,
> +					u64 parent, u64 root_objectid,
> +					struct btrfs_disk_key *key, int level,
>  					u64 hint, u64 empty_size, int for_cow);
> +struct extent_buffer *btrfs_alloc_free_block_cater(
> +					struct btrfs_trans_handle *trans,
> +					struct btrfs_root *root, u32 blocksize,
> +					u64 parent, u64 root_objectid,
> +					struct btrfs_disk_key *key, int level,
> + 					u64 hint, u64 empty_size, int for_cow);
>  void btrfs_free_tree_block(struct btrfs_trans_handle *trans,
>  			   struct btrfs_root *root,
>  			   struct extent_buffer *buf,
> @@ -2453,7 +2483,7 @@ void btrfs_free_tree_block(struct btrfs_
>  struct extent_buffer *btrfs_init_new_buffer(struct btrfs_trans_handle *trans,
>  					    struct btrfs_root *root,
>  					    u64 bytenr, u32 blocksize,
> -					    int level);
> +					    int level, u8 cater);
>  int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
>  				     struct btrfs_root *root,
>  				     u64 root_objectid, u64 owner,
> diff -uprN a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
> --- a/fs/btrfs/delayed-ref.h	2012-05-21 18:42:51.000000000 +0000
> +++ b/fs/btrfs/delayed-ref.h	2012-05-29 23:08:19.000000000 +0000
> @@ -63,6 +63,7 @@ struct btrfs_delayed_extent_op {
>  	unsigned int update_key:1;
>  	unsigned int update_flags:1;
>  	unsigned int is_data:1;
> +	u8 cater_index_factor;
>  };
>  
>  /*
> diff -uprN a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> --- a/fs/btrfs/extent-tree.c	2012-05-21 18:42:51.000000000 +0000
> +++ b/fs/btrfs/extent-tree.c	2012-05-29 23:08:19.000000000 +0000
> @@ -87,10 +87,15 @@ static int alloc_reserved_file_extent(st
>  				      u64 flags, u64 owner, u64 offset,
>  				      struct btrfs_key *ins, int ref_mod);
>  static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
> -				     struct btrfs_root *root,
> -				     u64 parent, u64 root_objectid,
> -				     u64 flags, struct btrfs_disk_key *key,
> -				     int level, struct btrfs_key *ins);
> +						struct btrfs_root *root,
> +						u64 parent, u64 root_objectid,
> +						u64 flags, struct btrfs_disk_key *key,
> +						int level, struct btrfs_key *ins);
> +static int alloc_reserved_tree_block_cater(struct btrfs_trans_handle *trans,
> +						struct btrfs_root *root,
> +						u64 parent, u64 root_objectid,
> +						u64 flags, struct btrfs_disk_key *key,
> +						int level, struct btrfs_key *ins, u8 cater);
>  static int do_chunk_alloc(struct btrfs_trans_handle *trans,
>  			  struct btrfs_root *extent_root, u64 alloc_bytes,
>  			  u64 flags, int force);
> @@ -1979,9 +1984,9 @@ static int run_delayed_data_ref(struct b
>  			flags |= extent_op->flags_to_set;
>  		}
>  		ret = alloc_reserved_file_extent(trans, root,
> -						 parent, ref_root, flags,
> -						 ref->objectid, ref->offset,
> -						 &ins, node->ref_mod);
> +						parent, ref_root, flags,
> +						ref->objectid, ref->offset,
> +						&ins, node->ref_mod);
>  	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
>  		ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
>  					     node->num_bytes, parent,
> @@ -2101,12 +2106,19 @@ static int run_delayed_tree_ref(struct b
>  	BUG_ON(node->ref_mod != 1);
>  	if (node->action == BTRFS_ADD_DELAYED_REF && insert_reserved) {
>  		BUG_ON(!extent_op || !extent_op->update_flags ||
> -		       !extent_op->update_key);
> -		ret = alloc_reserved_tree_block(trans, root,
> +		       !extent_op->update_key);						
> +		if (btrfs_cater_factor(extent_op->cater_index_factor < 2))
> +			ret = alloc_reserved_tree_block(trans, root,
> + 						parent, ref_root,
> + 						extent_op->flags_to_set,
> + 						&extent_op->key,
> + 						ref->level, &ins);
> +		else
> +			ret = alloc_reserved_tree_block_cater(trans, root,
>  						parent, ref_root,
>  						extent_op->flags_to_set,
>  						&extent_op->key,
> -						ref->level, &ins);
> +						ref->level, &ins, extent_op->cater_index_factor);
>  	} else if (node->action == BTRFS_ADD_DELAYED_REF) {
>  		ret = __btrfs_inc_extent_ref(trans, root, node->bytenr,
>  					     node->num_bytes, parent, ref_root,
> @@ -4860,6 +4872,8 @@ static int __btrfs_free_extent(struct bt
>  	int num_to_del = 1;
>  	u32 item_size;
>  	u64 refs;
> +	u8 factor = btrfs_cater_factor(extent_op->cater_index_factor);
> +	u8 index = btrfs_cater_index(extent_op->cater_index_factor);
>  
>  	path = btrfs_alloc_path();
>  	if (!path)
> @@ -5024,6 +5038,10 @@ static int __btrfs_free_extent(struct bt
>  			     (bytenr + num_bytes - 1) >> PAGE_CACHE_SHIFT);
>  		}
>  
> +		if (btrfs_cater_blocks_free(root, bytenr, num_bytes, factor, index)) {
> +			bytenr = bytenr - num_bytes * index;
> +			num_bytes = num_bytes * index;
> +		}
>  		ret = update_block_group(trans, root, bytenr, num_bytes, 0);
>  		BUG_ON(ret);
>  	}
> @@ -5926,6 +5944,7 @@ static int alloc_reserved_file_extent(st
>  	btrfs_set_extent_generation(leaf, extent_item, trans->transid);
>  	btrfs_set_extent_flags(leaf, extent_item,
>  			       flags | BTRFS_EXTENT_FLAG_DATA);
> +	btrfs_set_extent_cater(leaf, extent_item, 0);
>  
>  	iref = (struct btrfs_extent_inline_ref *)(extent_item + 1);
>  	btrfs_set_extent_inline_ref_type(leaf, iref, type);
> @@ -5987,6 +6006,7 @@ static int alloc_reserved_tree_block(str
>  	btrfs_set_extent_generation(leaf, extent_item, trans->transid);
>  	btrfs_set_extent_flags(leaf, extent_item,
>  			       flags | BTRFS_EXTENT_FLAG_TREE_BLOCK);
> +	btrfs_set_extent_cater(leaf, extent_item, 0);
>  	block_info = (struct btrfs_tree_block_info *)(extent_item + 1);
>  
>  	btrfs_set_tree_block_key(leaf, block_info, key);
> @@ -6017,6 +6037,77 @@ static int alloc_reserved_tree_block(str
>  	return ret;
>  }
>  
> +static int alloc_reserved_tree_block_cater(struct btrfs_trans_handle *trans,
> +				     struct btrfs_root *root,
> +				     u64 parent, u64 root_objectid,
> +				     u64 flags, struct btrfs_disk_key *key,
> +				     int level, struct btrfs_key *ins, u8 cater)
> +{
> +	int ret;
> +	struct btrfs_fs_info *fs_info = root->fs_info;
> +	struct btrfs_extent_item *extent_item;
> +	struct btrfs_tree_block_info *block_info;
> +	struct btrfs_extent_inline_ref *iref;
> +	struct btrfs_path *path;
> +	struct extent_buffer *leaf;
> +	u32 size = sizeof(*extent_item) + sizeof(*block_info) + sizeof(*iref);
> +	u8 factor = btrfs_cater_factor(cater);
> +	u8 index = btrfs_cater_index(cater);
> +	u64 bytenr = ins->objectid;
> +	u64 num_bytes = ins->offset;
> +
> +	path = btrfs_alloc_path();
> +	if (!path)
> +		return -ENOMEM;
> +
> +	path->leave_spinning = 1;
> +	ret = btrfs_insert_empty_item(trans, fs_info->extent_root, path,
> +				      ins, size);
> +	BUG_ON(ret);
> +
> +	leaf = path->nodes[0];
> +	extent_item = btrfs_item_ptr(leaf, path->slots[0],
> +				     struct btrfs_extent_item);
> +	btrfs_set_extent_refs(leaf, extent_item, 1);
> +	btrfs_set_extent_generation(leaf, extent_item, trans->transid);
> +	btrfs_set_extent_flags(leaf, extent_item,
> +			       flags | BTRFS_EXTENT_FLAG_TREE_BLOCK);
> +	btrfs_set_extent_cater(leaf, extent_item, cater);
> +	
> +	block_info = (struct btrfs_tree_block_info *)(extent_item + 1);
> +
> +	btrfs_set_tree_block_key(leaf, block_info, key);
> +	btrfs_set_tree_block_level(leaf, block_info, level);
> +
> +	iref = (struct btrfs_extent_inline_ref *)(block_info + 1);
> +	if (parent > 0) {
> +		BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
> +		btrfs_set_extent_inline_ref_type(leaf, iref,
> +						 BTRFS_SHARED_BLOCK_REF_KEY);
> +		btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
> +	} else {
> +		btrfs_set_extent_inline_ref_type(leaf, iref,
> +						 BTRFS_TREE_BLOCK_REF_KEY);
> +		btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid);
> +	}
> +
> +	btrfs_mark_buffer_dirty(leaf);
> +	btrfs_free_path(path);
> +
> +	if (btrfs_cater_blocks_free(root, bytenr, num_bytes, factor, index)) {
> +		bytenr = bytenr - num_bytes * index;
> +		num_bytes = num_bytes * index;
> +	}
> +	ret = update_block_group(trans, root, bytenr, num_bytes, 1);
> +	if (ret) {
> +		printk(KERN_ERR "btrfs update block group failed for %llu "
> +		       "%llu\n", (unsigned long long)ins->objectid,
> +		       (unsigned long long)ins->offset);
> +		BUG();
> +	}
> +	return ret;
> +}
> +
>  int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
>  				     struct btrfs_root *root,
>  				     u64 root_objectid, u64 owner,
> @@ -6096,7 +6187,7 @@ int btrfs_alloc_logged_file_extent(struc
>  struct extent_buffer *btrfs_init_new_buffer(struct btrfs_trans_handle *trans,
>  					    struct btrfs_root *root,
>  					    u64 bytenr, u32 blocksize,
> -					    int level)
> +					    int level, u8 cater)
>  {
>  	struct extent_buffer *buf;
>  
> @@ -6104,6 +6195,7 @@ struct extent_buffer *btrfs_init_new_buf
>  	if (!buf)
>  		return ERR_PTR(-ENOMEM);
>  	btrfs_set_header_generation(buf, trans->transid);
> +	btrfs_set_header_cater(buf, cater);
>  	btrfs_set_buffer_lockdep_class(root->root_key.objectid, buf, level);
>  	btrfs_tree_lock(buf);
>  	clean_tree_block(trans, root, buf);
> @@ -6221,7 +6313,201 @@ struct extent_buffer *btrfs_alloc_free_b
>  	}
>  
>  	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
> -				    blocksize, level);
> +				    blocksize, level, 0);
> +	BUG_ON(IS_ERR(buf));
> +
> +	if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
> +		if (parent == 0)
> +			parent = ins.objectid;
> +		flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
> +	} else
> +		BUG_ON(parent > 0);
> +
> +	if (root_objectid != BTRFS_TREE_LOG_OBJECTID) {
> +		struct btrfs_delayed_extent_op *extent_op;
> +		extent_op = kmalloc(sizeof(*extent_op), GFP_NOFS);
> +		BUG_ON(!extent_op);
> +		if (key)
> +			memcpy(&extent_op->key, key, sizeof(extent_op->key));
> +		else
> +			memset(&extent_op->key, 0, sizeof(extent_op->key));
> +		extent_op->flags_to_set = flags;
> +		extent_op->update_key = 1;
> +		extent_op->update_flags = 1;
> +		extent_op->is_data = 0;
> +		extent_op->cater_index_factor = 0;
> +
> +		ret = btrfs_add_delayed_tree_ref(root->fs_info, trans,
> +					ins.objectid,
> +					ins.offset, parent, root_objectid,
> +					level, BTRFS_ADD_DELAYED_EXTENT,
> +					extent_op, for_cow);
> +		BUG_ON(ret);
> +	}
> +	return buf;
> +}
> +
> +int btrfs_cater_blocks_free(struct btrfs_root *root, u64 start, u64 len,
> +				u8 factor, u8 index)
> +{
> +	u8 i;
> +	int ret;
> +	u64 bytenr, cur;
> +	struct extent_io_tree *tree;
> +	struct extent_buffer *eb;
> +
> +	if (factor < 2 )
> +		return 0;
> +
> +	bytenr = start - len * index;
> +	tree = &((BTRFS_I(root->fs_info->btree_inode))->io_tree);
> +
> +	for (i = 0; i < factor; i++) {
> +		if (i == index)
> +			continue;
> +		cur = bytenr + len * index;
> +		ret = btrfs_lookup_extent(root, cur, len);
> +		BUG_ON(ret);
> +		if (ret == 0)
> +			return 0;
> +
> +		rcu_read_lock();
> +		eb = radix_tree_lookup(&tree->buffer, cur >> PAGE_CACHE_SHIFT);
> +		if (eb)
> +			return 0;
> +		rcu_read_unlock();
> +	}
> +	
> +	return 1;
> +}
> +
> +static noinline struct extent_buffer *__btrfs_grab_cater_block(
> +					struct btrfs_trans_handle *trans,
> +					struct btrfs_root *root,
> +					u64 blockprt, u32 blocksize,
> +					u64 parent, u64 root_objectid,
> +					struct btrfs_disk_key *key, int level,
> +					int for_cow, u8 index, u8 factor)
> +{
> +	struct btrfs_key ins;
> +	struct btrfs_block_rsv *block_rsv;
> +	struct extent_buffer *buf;
> +	u64 flags = 0;
> +	int ret;
> +	u8 cater;
> +
> +	if (factor < 2)
> +		return NULL;
> +
> +	cater = (index << 4 | factor);
> +	ins.objectid = blockprt - blocksize * index;
> +	ins.offset = blocksize;
> +	
> +	/* 
> +	 * Here reserving metadata is needed, and unreserving somewhere after 
> +	 * commit transaction is also needed, it should be fixed in the future
> +	 * - WeiFeng Liu. 
> +	 */
> +	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
> +				    blocksize, level, cater);
> +	BUG_ON(IS_ERR(buf));
> +
> +	if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
> +		if (parent == 0)
> +			parent = ins.objectid;
> +		flags |= BTRFS_BLOCK_FLAG_FULL_BACKREF;
> +	} else
> +		BUG_ON(parent > 0);
> +
> +	if (root_objectid != BTRFS_TREE_LOG_OBJECTID) {
> +		struct btrfs_delayed_extent_op *extent_op;
> +		extent_op = kmalloc(sizeof(*extent_op), GFP_NOFS);
> +		BUG_ON(!extent_op);
> +		if (key)
> +			memcpy(&extent_op->key, key, sizeof(extent_op->key));
> +		else
> +			memset(&extent_op->key, 0, sizeof(extent_op->key));
> +		extent_op->flags_to_set = flags;
> +		extent_op->update_key = 1;
> +		extent_op->update_flags = 1;
> +		extent_op->is_data = 0;
> +		extent_op->cater_index_factor = ((index << 4) | factor);
> +
> +		ret = btrfs_add_delayed_tree_ref(root->fs_info, trans,
> +					ins.objectid,
> +					ins.offset, parent, root_objectid,
> +					level, BTRFS_ADD_DELAYED_EXTENT,
> +					extent_op, for_cow);
> +		BUG_ON(ret);
> +	}
> +	return buf;
> +}
> +
> +struct extent_buffer *btrfs_grab_cater_block(struct btrfs_trans_handle *trans,
> +					struct btrfs_root *root, struct extent_buffer *buf,
> +					u64 parent, u64 root_objectid,
> +					struct btrfs_disk_key *key, int level,
> +					u64 hint, u64 empty_size, int for_cow)
> +{
> +	u8 factor, index, i;
> +	int ret;
> +	u64 bytenr, cur;
> +	struct extent_buffer *eb = NULL;
> +	
> +	index = btrfs_cater_index(btrfs_header_cater(buf));
> +	factor = btrfs_cater_factor(btrfs_header_cater(buf));
> +	if (factor < 2)
> +		return NULL;
> +
> +	bytenr = buf->start - buf->len * index;
> +	for (i = 0; i < factor; i++) {
> +		if (i == index)
> +			continue;
> +	cur = bytenr + buf->len * i;
> +	ret = btrfs_lookup_extent(root, cur, buf->len);	
> +	BUG_ON(ret);
> +	if (ret == 0)
> +			continue;
> +	eb = __btrfs_grab_cater_block(trans, root, cur, buf->len,
> +					parent, root_objectid,
> +					key, level, for_cow, i, factor);
> +	if (eb)
> +		break;
> +	}
> +	
> +	return eb;
> +}
> +
> +struct extent_buffer *btrfs_alloc_free_block_cater(struct btrfs_trans_handle *trans,
> +					struct btrfs_root *root, u32 blocksize,
> +					u64 parent, u64 root_objectid,
> +					struct btrfs_disk_key *key, int level,
> +					u64 hint, u64 empty_size, int for_cow)
> +{
> +	struct btrfs_key ins;
> +	struct btrfs_block_rsv *block_rsv;
> +	struct extent_buffer *buf;
> +	u64 flags = 0;
> +	int ret;
> +	u8 factor = root->fs_info->cater_factor;
> +	
> +
> +	if (factor > 1)
> +		blocksize = blocksize * factor;
> +
> +	block_rsv = use_block_rsv(trans, root, blocksize);
> +	if (IS_ERR(block_rsv))
> +		return ERR_CAST(block_rsv);
> +
> +	ret = btrfs_reserve_extent(trans, root, blocksize, blocksize,
> +				   empty_size, hint, (u64)-1, &ins, 0);
> +	if (ret) {
> +		unuse_block_rsv(root->fs_info, block_rsv, blocksize);
> +		return ERR_PTR(ret);
> +	}
> +
> +	buf = btrfs_init_new_buffer(trans, root, ins.objectid,
> +				    blocksize, level, factor);
>  	BUG_ON(IS_ERR(buf));
>  
>  	if (root_objectid == BTRFS_TREE_RELOC_OBJECTID) {
> @@ -6243,6 +6529,7 @@ struct extent_buffer *btrfs_alloc_free_b
>  		extent_op->update_key = 1;
>  		extent_op->update_flags = 1;
>  		extent_op->is_data = 0;
> +		extent_op->cater_index_factor = factor;
>  
>  		ret = btrfs_add_delayed_tree_ref(root->fs_info, trans,
>  					ins.objectid,
> diff -uprN a/fs/btrfs/super.c b/fs/btrfs/super.c
> --- a/fs/btrfs/super.c	2012-05-21 18:42:51.000000000 +0000
> +++ b/fs/btrfs/super.c	2012-05-29 23:08:19.000000000 +0000
> @@ -166,7 +166,7 @@ enum {
>  	Opt_enospc_debug, Opt_subvolrootid, Opt_defrag, Opt_inode_cache,
>  	Opt_no_space_cache, Opt_recovery, Opt_skip_balance,
>  	Opt_check_integrity, Opt_check_integrity_including_extent_data,
> -	Opt_check_integrity_print_mask,
> +	Opt_check_integrity_print_mask, Opt_cater_factor,
>  	Opt_err,
>  };
>  
> @@ -206,6 +206,7 @@ static match_table_t tokens = {
>  	{Opt_check_integrity, "check_int"},
>  	{Opt_check_integrity_including_extent_data, "check_int_data"},
>  	{Opt_check_integrity_print_mask, "check_int_print_mask=%d"},
> +	{Opt_cater_factor, "cater_factor=%d"},
>  	{Opt_err, NULL},
>  };
>  
> @@ -407,6 +408,15 @@ int btrfs_parse_options(struct btrfs_roo
>  		case Opt_skip_balance:
>  			btrfs_set_opt(info->mount_opt, SKIP_BALANCE);
>  			break;
> +		case Opt_cater_factor:
> +			match_int(&args[0], &intarg);
> +			if (intarg > 16 || intarg < 1)
> +				info->cater_factor = 1;
> +			else
> +				info->cater_factor = intarg;
> +			printk(KERN_INFO "btrfs: cater_factor is %d\n",
> +						(unsigned char)info->cater_factor);
> +			break;
>  #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
>  		case Opt_check_integrity_including_extent_data:
>  			printk(KERN_INFO "btrfs: enabling check integrity"
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> .
> 


      reply	other threads:[~2012-05-29 17:32 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-29 16:24 [RFC PATCH] Decrease meta fragments by using a caterpillar band Method (Ver. 2) WeiFeng Liu
2012-05-29 17:32 ` Goffredo Baroncelli [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FC50842.8080304@libero.it \
    --to=kreijack@libero.it \
    --cc=kreijack@inwind.it \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=weifeng.liu@hushmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).