From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Josef Bacik <josef@toxicpanda.com>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH 8/8] btrfs: add support for multiple global roots
Date: Sat, 6 Nov 2021 09:18:02 +0800
Message-ID: <0595f1b5-2d5c-c5ca-cfad-efb753afec1b@gmx.com>
In-Reply-To: <a6f403691bdec22e8e052f699ae52f18875cb870.1636145221.git.josef@toxicpanda.com>



On 2021/11/6 04:49, Josef Bacik wrote:
> With extent tree v2 you will be able to create multiple csum, extent,
> and free space trees.  They will be used based on the block group, which
> will now use the block_group_item->chunk_objectid to point to the set of
> global roots that it will use.  When allocating new block groups we'll
> simply mod the gigabyte offset of the block group against the number of
> global roots we have, and that will be the block group's global id.
>
> From there we can take the bytenr that we're modifying in the respective
> tree, look up the block group, and get that block group's corresponding
> global root id.  With that id we can get to the appropriate global root
> for that bytenr.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/block-group.c     | 11 +++++++--
>   fs/btrfs/block-group.h     |  1 +
>   fs/btrfs/ctree.h           |  2 ++
>   fs/btrfs/disk-io.c         | 49 +++++++++++++++++++++++++++++++-------
>   fs/btrfs/free-space-tree.c |  2 ++
>   fs/btrfs/transaction.c     | 15 ++++++++++++
>   fs/btrfs/tree-checker.c    | 21 ++++++++++++++--
>   7 files changed, 88 insertions(+), 13 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 7eb0a8632a01..85516f2fd5da 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2002,6 +2002,7 @@ static int read_one_block_group(struct btrfs_fs_info *info,
>   	cache->length = key->offset;
>   	cache->used = btrfs_stack_block_group_used(bgi);
>   	cache->flags = btrfs_stack_block_group_flags(bgi);
> +	cache->global_root_id = btrfs_stack_block_group_chunk_objectid(bgi);
>
>   	set_free_space_tree_thresholds(cache);
>
> @@ -2284,7 +2285,7 @@ static int insert_block_group_item(struct btrfs_trans_handle *trans,
>   	spin_lock(&block_group->lock);
>   	btrfs_set_stack_block_group_used(&bgi, block_group->used);
>   	btrfs_set_stack_block_group_chunk_objectid(&bgi,
> -				BTRFS_FIRST_CHUNK_TREE_OBJECTID);
> +						   block_group->global_root_id);
>   	btrfs_set_stack_block_group_flags(&bgi, block_group->flags);
>   	key.objectid = block_group->start;
>   	key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> @@ -2460,6 +2461,12 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
>   	cache->flags = type;
>   	cache->last_byte_to_unpin = (u64)-1;
>   	cache->cached = BTRFS_CACHE_FINISHED;
> +	cache->global_root_id = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
> +
> +	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
> +		cache->global_root_id = div64_u64(cache->start, SZ_1G) %
> +			fs_info->nr_global_roots;
> +

Is there any special reason for this complex global_root_id calculation?

My initial assumption for global trees was pretty simple: something
like (CSUM_TREE, ROOT_ITEM, bg bytenr) or (EXTENT_TREE, ROOT_ITEM, bg
bytenr) as their root item keys.

But that is definitely not the case here.

Thus I'm wondering why we're not using something simpler.
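
To make the comparison concrete, here is a minimal userspace sketch of
the two id schemes. The function names are made up for illustration and
are not part of the patch. The first helper mirrors the calculation in
the hunk above, (start / 1GiB) % nr_global_roots; the second is only my
reading of the simpler alternative, where the root key offset is the
block group bytenr itself.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)

/*
 * Scheme from the patch: the block group's gigabyte offset, modulo the
 * number of global roots. Many block groups share one set of roots.
 * (Hypothetical helper name; the patch open-codes this.)
 */
static uint64_t sketch_global_root_id_mod(uint64_t bg_start,
					  uint64_t nr_global_roots)
{
	/* The modulo is undefined for zero roots, so refuse it here. */
	assert(nr_global_roots != 0);
	return (bg_start / SZ_1G) % nr_global_roots;
}

/*
 * Assumed alternative: one set of roots per block group, keyed directly
 * by the block group start bytenr.
 */
static uint64_t sketch_global_root_id_per_bg(uint64_t bg_start)
{
	return bg_start;
}
```

With 4 global roots, a block group starting at 5GiB gets id 1 under the
mod scheme and shares its roots with the block groups at 1GiB, 9GiB and
so on, while the per-bg scheme gives every block group its own key.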

Thanks,
Qu

>   	if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE))
>   		cache->needs_free_space = 1;
>
> @@ -2676,7 +2683,7 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
>   	bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
>   	btrfs_set_stack_block_group_used(&bgi, cache->used);
>   	btrfs_set_stack_block_group_chunk_objectid(&bgi,
> -			BTRFS_FIRST_CHUNK_TREE_OBJECTID);
> +						   cache->global_root_id);
>   	btrfs_set_stack_block_group_flags(&bgi, cache->flags);
>   	write_extent_buffer(leaf, &bgi, bi, sizeof(bgi));
>   	btrfs_mark_buffer_dirty(leaf);
> diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
> index 5878b7ce3b78..93aabc68bb6a 100644
> --- a/fs/btrfs/block-group.h
> +++ b/fs/btrfs/block-group.h
> @@ -68,6 +68,7 @@ struct btrfs_block_group {
>   	u64 bytes_super;
>   	u64 flags;
>   	u64 cache_generation;
> +	u64 global_root_id;
>
>   	/*
>   	 * If the free space extent count exceeds this number, convert the block
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index b57367141b95..7de0cd2b87ec 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -1057,6 +1057,8 @@ struct btrfs_fs_info {
>   	spinlock_t relocation_bg_lock;
>   	u64 data_reloc_bg;
>
> +	u64 nr_global_roots;
> +
>   	spinlock_t zone_active_bgs_lock;
>   	struct list_head zone_active_bgs;
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 45b2bde43150..a8bc00d17b26 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1295,13 +1295,33 @@ struct btrfs_root *btrfs_global_root(struct btrfs_fs_info *fs_info,
>   	return root;
>   }
>
> +static u64 btrfs_global_root_id(struct btrfs_fs_info *fs_info, u64 bytenr)
> +{
> +	struct btrfs_block_group *block_group;
> +	u64 ret;
> +
> +	if (!btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
> +		return 0;
> +
> +	if (likely(bytenr))
> +		block_group = btrfs_lookup_block_group(fs_info, bytenr);
> +	else
> +		block_group = btrfs_lookup_first_block_group(fs_info, bytenr);
> +	ASSERT(block_group);
> +	if (!block_group)
> +		return 0;
> +	ret = block_group->global_root_id;
> +	btrfs_put_block_group(block_group);
> +	return ret;
> +}
> +
>   struct btrfs_root *btrfs_csum_root(struct btrfs_fs_info *fs_info,
>   				   u64 bytenr)
>   {
>   	struct btrfs_key key = {
>   		.objectid = BTRFS_CSUM_TREE_OBJECTID,
>   		.type = BTRFS_ROOT_ITEM_KEY,
> -		.offset = 0,
> +		.offset = btrfs_global_root_id(fs_info, bytenr),
>   	};
>
>   	return btrfs_global_root(fs_info, &key);
> @@ -1313,7 +1333,7 @@ struct btrfs_root *btrfs_extent_root(struct btrfs_fs_info *fs_info,
>   	struct btrfs_key key = {
>   		.objectid = BTRFS_EXTENT_TREE_OBJECTID,
>   		.type = BTRFS_ROOT_ITEM_KEY,
> -		.offset = 0,
> +		.offset = btrfs_global_root_id(fs_info, bytenr),
>   	};
>
>   	return btrfs_global_root(fs_info, &key);
> @@ -2094,7 +2114,6 @@ static void backup_super_roots(struct btrfs_fs_info *info)
>   {
>   	const int next_backup = info->backup_root_index;
>   	struct btrfs_root_backup *root_backup;
> -	struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
>
>   	root_backup = info->super_for_commit->super_roots + next_backup;
>
> @@ -2128,6 +2147,7 @@ static void backup_super_roots(struct btrfs_fs_info *info)
>   			btrfs_header_level(info->block_group_root->node));
>   	} else {
>   		struct btrfs_root *extent_root = btrfs_extent_root(info, 0);
> +		struct btrfs_root *csum_root = btrfs_csum_root(info, 0);
>
>   		btrfs_set_backup_extent_root(root_backup,
>   					     extent_root->node->start);
> @@ -2135,6 +2155,12 @@ static void backup_super_roots(struct btrfs_fs_info *info)
>   				btrfs_header_generation(extent_root->node));
>   		btrfs_set_backup_extent_root_level(root_backup,
>   					btrfs_header_level(extent_root->node));
> +
> +		btrfs_set_backup_csum_root(root_backup, csum_root->node->start);
> +		btrfs_set_backup_csum_root_gen(root_backup,
> +					       btrfs_header_generation(csum_root->node));
> +		btrfs_set_backup_csum_root_level(root_backup,
> +						 btrfs_header_level(csum_root->node));
>   	}
>
>   	/*
> @@ -2156,12 +2182,6 @@ static void backup_super_roots(struct btrfs_fs_info *info)
>   	btrfs_set_backup_dev_root_level(root_backup,
>   				       btrfs_header_level(info->dev_root->node));
>
> -	btrfs_set_backup_csum_root(root_backup, csum_root->node->start);
> -	btrfs_set_backup_csum_root_gen(root_backup,
> -				       btrfs_header_generation(csum_root->node));
> -	btrfs_set_backup_csum_root_level(root_backup,
> -					 btrfs_header_level(csum_root->node));
> -
>   	btrfs_set_backup_total_bytes(root_backup,
>   			     btrfs_super_total_bytes(info->super_copy));
>   	btrfs_set_backup_bytes_used(root_backup,
> @@ -2550,6 +2570,7 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
>   {
>   	struct btrfs_fs_info *fs_info = tree_root->fs_info;
>   	struct btrfs_root *root;
> +	u64 max_global_id = 0;
>   	int ret;
>   	struct btrfs_key key = {
>   		.objectid = objectid,
> @@ -2586,6 +2607,13 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
>   			break;
>   		btrfs_release_path(path);
>
> +		/*
> +		 * Just worry about this for extent tree, it'll be the same for
> +		 * everybody.
> +		 */
> +		if (objectid == BTRFS_EXTENT_TREE_OBJECTID)
> +			max_global_id = max(max_global_id, key.offset);
> +
>   		found = true;
>   		root = read_tree_root_path(tree_root, path, &key);
>   		if (IS_ERR(root)) {
> @@ -2603,6 +2631,9 @@ static int load_global_roots_objectid(struct btrfs_root *tree_root,
>   	}
>   	btrfs_release_path(path);
>
> +	if (objectid == BTRFS_EXTENT_TREE_OBJECTID)
> +		fs_info->nr_global_roots = max_global_id + 1;
> +
>   	if (!found || ret) {
>   		if (objectid == BTRFS_CSUM_TREE_OBJECTID)
>   			set_bit(BTRFS_FS_STATE_NO_CSUMS, &fs_info->fs_state);
> diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
> index cf227450f356..60a73bcffaf1 100644
> --- a/fs/btrfs/free-space-tree.c
> +++ b/fs/btrfs/free-space-tree.c
> @@ -24,6 +24,8 @@ static struct btrfs_root *btrfs_free_space_root(
>   		.type = BTRFS_ROOT_ITEM_KEY,
>   		.offset = 0,
>   	};
> +	if (btrfs_fs_incompat(block_group->fs_info, EXTENT_TREE_V2))
> +		key.offset = block_group->global_root_id;
>   	return btrfs_global_root(block_group->fs_info, &key);
>   }
>
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index ba8dd90ac3ce..e343ff8db05d 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -1827,6 +1827,14 @@ static void update_super_roots(struct btrfs_fs_info *fs_info)
>   		super->cache_generation = 0;
>   	if (test_bit(BTRFS_FS_UPDATE_UUID_TREE_GEN, &fs_info->flags))
>   		super->uuid_tree_generation = root_item->generation;
> +
> +	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
> +		root_item = &fs_info->block_group_root->root_item;
> +
> +		super->block_group_root = root_item->bytenr;
> +		super->block_group_root_generation = root_item->generation;
> +		super->block_group_root_level = root_item->level;
> +	}
>   }
>
>   int btrfs_transaction_in_commit(struct btrfs_fs_info *info)
> @@ -2261,6 +2269,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
>   	list_add_tail(&fs_info->chunk_root->dirty_list,
>   		      &cur_trans->switch_commits);
>
> +	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
> +		btrfs_set_root_node(&fs_info->block_group_root->root_item,
> +				    fs_info->block_group_root->node);
> +		list_add_tail(&fs_info->block_group_root->dirty_list,
> +			      &cur_trans->switch_commits);
> +	}
> +
>   	switch_commit_roots(trans);
>
>   	ASSERT(list_empty(&cur_trans->dirty_bgs));
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 1c33dd0e4afc..572f52d78297 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -639,8 +639,10 @@ static void block_group_err(const struct extent_buffer *eb, int slot,
>   static int check_block_group_item(struct extent_buffer *leaf,
>   				  struct btrfs_key *key, int slot)
>   {
> +	struct btrfs_fs_info *fs_info = leaf->fs_info;
>   	struct btrfs_block_group_item bgi;
>   	u32 item_size = btrfs_item_size_nr(leaf, slot);
> +	u64 chunk_objectid;
>   	u64 flags;
>   	u64 type;
>
> @@ -663,8 +665,23 @@ static int check_block_group_item(struct extent_buffer *leaf,
>
>   	read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot),
>   			   sizeof(bgi));
> -	if (unlikely(btrfs_stack_block_group_chunk_objectid(&bgi) !=
> -		     BTRFS_FIRST_CHUNK_TREE_OBJECTID)) {
> +	chunk_objectid = btrfs_stack_block_group_chunk_objectid(&bgi);
> +	if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2)) {
> +		/*
> +		 * We don't init the nr_global_roots until we load the global
> +		 * roots, so this could be 0 at mount time.  If it's 0 we'll
> +		 * just assume we're fine, and later we'll check against our
> +		 * actual value.
> +		 */
> +		if (unlikely(fs_info->nr_global_roots &&
> +			     chunk_objectid >= fs_info->nr_global_roots)) {
> +			block_group_err(leaf, slot,
> +	"invalid block group global root id, have %llu, needs to be < %llu",
> +					chunk_objectid,
> +					fs_info->nr_global_roots);
> +			return -EUCLEAN;
> +		}
> +	} else if (unlikely(chunk_objectid != BTRFS_FIRST_CHUNK_TREE_OBJECTID)) {
>   		block_group_err(leaf, slot,
>   		"invalid block group chunk objectid, have %llu expect %llu",
>   				btrfs_stack_block_group_chunk_objectid(&bgi),
>
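
For reference, the lookup chain the patch describes (bytenr -> block
group -> global root id -> root key) and the tree-checker range check
can be sketched in userspace as below. Every sketch_* name is
hypothetical, the linear array stands in for the kernel's block group
cache, and the constants only mirror the on-disk values as I remember
them, so treat this as an illustration rather than the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_EXTENT_TREE_OBJECTID 2ULL
#define SKETCH_CSUM_TREE_OBJECTID   7ULL
#define SKETCH_ROOT_ITEM_KEY        132

struct sketch_block_group {
	uint64_t start;
	uint64_t length;
	uint64_t global_root_id;
};

struct sketch_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Find the block group covering bytenr; fall back to id 0 if none does. */
static uint64_t sketch_lookup_global_root_id(const struct sketch_block_group *bgs,
					     size_t nr, uint64_t bytenr)
{
	assert(bgs != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (bytenr >= bgs[i].start &&
		    bytenr - bgs[i].start < bgs[i].length)
			return bgs[i].global_root_id;
	}
	return 0;
}

/* Build the (objectid, ROOT_ITEM, global root id) key used to find a root. */
static struct sketch_key sketch_root_key(uint64_t objectid, uint64_t id)
{
	struct sketch_key key = {
		.objectid = objectid,
		.type = SKETCH_ROOT_ITEM_KEY,
		.offset = id,
	};
	return key;
}

/*
 * Range check as in the tree-checker hunk: a zero nr_global_roots means
 * "not loaded yet", so anything passes; otherwise the id must be below it.
 */
static bool sketch_global_root_id_valid(uint64_t id, uint64_t nr_global_roots)
{
	return nr_global_roots == 0 || id < nr_global_roots;
}
```

Note that valid ids run from 0 to nr_global_roots - 1, so an id equal to
nr_global_roots is rejected.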
