From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Mark Harmstone <maharmstone@fb.com>, linux-btrfs@vger.kernel.org
Subject: Re: [RFC PATCH 04/10] btrfs: add extended version of struct block_group_item
Date: Fri, 23 May 2025 19:23:41 +0930 [thread overview]
Message-ID: <1b511b2f-0738-42a1-95af-546bd7b4e51e@gmx.com> (raw)
In-Reply-To: <20250515163641.3449017-5-maharmstone@fb.com>
在 2025/5/16 02:06, Mark Harmstone 写道:
> Add a struct btrfs_block_group_item_v2, which is used in the block group
> tree if the remap-tree incompat flag is set.
>
> This adds two new fields to the block group item: `remap_bytes` and
> `identity_remap_count`.
>
> `remap_bytes` records the amount of data that's physically within this
> block group, but nominally in another, remapped block group. This is
> necessary because this data will need to be moved first if this block
> group is itself relocated. If `remap_bytes` > 0, this is an indicator to
> the relocation thread that it will need to search the remap-tree for
> backrefs. A block group must also have `remap_bytes` == 0 before it can
> be dropped.
>
> `identity_remap_count` records how many identity remap items are located
> in the remap tree for this block group. When relocation is begun for
> this block group, this is set to the number of holes in the free-space
> tree for this range. As identity remaps are converted into actual remaps
> by the relocation process, this number is decreased. Once it reaches 0,
> either because of relocation or because extents have been deleted, the
> block group has been fully remapped and its chunk's device extents are
> removed.
Can we add those two items into a new item other than a completely new
v2 block group item?
I mean for regular block groups they do not need those members, and all
block groups starts from regular ones at mkfs time.
We can add a regular block group flag to indicate if the bg has the
extra members.
Thanks,
Qu
>
> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
> ---
> fs/btrfs/accessors.h | 20 +++++++
> fs/btrfs/block-group.c | 101 ++++++++++++++++++++++++--------
> fs/btrfs/block-group.h | 14 ++++-
> fs/btrfs/tree-checker.c | 10 +++-
> include/uapi/linux/btrfs_tree.h | 8 +++
> 5 files changed, 126 insertions(+), 27 deletions(-)
>
> diff --git a/fs/btrfs/accessors.h b/fs/btrfs/accessors.h
> index 5f5eda8d6f9e..6e6dd664217b 100644
> --- a/fs/btrfs/accessors.h
> +++ b/fs/btrfs/accessors.h
> @@ -264,6 +264,26 @@ BTRFS_SETGET_FUNCS(block_group_flags, struct btrfs_block_group_item, flags, 64);
> BTRFS_SETGET_STACK_FUNCS(stack_block_group_flags,
> struct btrfs_block_group_item, flags, 64);
>
> +/* struct btrfs_block_group_item_v2 */
> +BTRFS_SETGET_STACK_FUNCS(stack_block_group_v2_used, struct btrfs_block_group_item_v2,
> + used, 64);
> +BTRFS_SETGET_FUNCS(block_group_v2_used, struct btrfs_block_group_item_v2, used, 64);
> +BTRFS_SETGET_STACK_FUNCS(stack_block_group_v2_chunk_objectid,
> + struct btrfs_block_group_item_v2, chunk_objectid, 64);
> +BTRFS_SETGET_FUNCS(block_group_v2_chunk_objectid,
> + struct btrfs_block_group_item_v2, chunk_objectid, 64);
> +BTRFS_SETGET_STACK_FUNCS(stack_block_group_v2_flags,
> + struct btrfs_block_group_item_v2, flags, 64);
> +BTRFS_SETGET_FUNCS(block_group_v2_flags, struct btrfs_block_group_item_v2, flags, 64);
> +BTRFS_SETGET_STACK_FUNCS(stack_block_group_v2_remap_bytes,
> + struct btrfs_block_group_item_v2, remap_bytes, 64);
> +BTRFS_SETGET_FUNCS(block_group_v2_remap_bytes, struct btrfs_block_group_item_v2,
> + remap_bytes, 64);
> +BTRFS_SETGET_STACK_FUNCS(stack_block_group_v2_identity_remap_count,
> + struct btrfs_block_group_item_v2, identity_remap_count, 32);
> +BTRFS_SETGET_FUNCS(block_group_v2_identity_remap_count, struct btrfs_block_group_item_v2,
> + identity_remap_count, 32);
> +
> /* struct btrfs_free_space_info */
> BTRFS_SETGET_FUNCS(free_space_extent_count, struct btrfs_free_space_info,
> extent_count, 32);
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 5b0cb04b2b93..6a2aa792ccb2 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2351,7 +2351,7 @@ static int check_chunk_block_group_mappings(struct btrfs_fs_info *fs_info)
> }
>
> static int read_one_block_group(struct btrfs_fs_info *info,
> - struct btrfs_block_group_item *bgi,
> + struct btrfs_block_group_item_v2 *bgi,
> const struct btrfs_key *key,
> int need_clear)
> {
> @@ -2366,11 +2366,16 @@ static int read_one_block_group(struct btrfs_fs_info *info,
> return -ENOMEM;
>
> cache->length = key->offset;
> - cache->used = btrfs_stack_block_group_used(bgi);
> + cache->used = btrfs_stack_block_group_v2_used(bgi);
> cache->commit_used = cache->used;
> - cache->flags = btrfs_stack_block_group_flags(bgi);
> - cache->global_root_id = btrfs_stack_block_group_chunk_objectid(bgi);
> + cache->flags = btrfs_stack_block_group_v2_flags(bgi);
> + cache->global_root_id = btrfs_stack_block_group_v2_chunk_objectid(bgi);
> cache->space_info = btrfs_find_space_info(info, cache->flags);
> + cache->remap_bytes = btrfs_stack_block_group_v2_remap_bytes(bgi);
> + cache->commit_remap_bytes = cache->remap_bytes;
> + cache->identity_remap_count =
> + btrfs_stack_block_group_v2_identity_remap_count(bgi);
> + cache->commit_identity_remap_count = cache->identity_remap_count;
>
> set_free_space_tree_thresholds(cache);
>
> @@ -2435,7 +2440,7 @@ static int read_one_block_group(struct btrfs_fs_info *info,
> } else if (cache->length == cache->used) {
> cache->cached = BTRFS_CACHE_FINISHED;
> btrfs_free_excluded_extents(cache);
> - } else if (cache->used == 0) {
> + } else if (cache->used == 0 && cache->remap_bytes == 0) {
> cache->cached = BTRFS_CACHE_FINISHED;
> ret = btrfs_add_new_free_space(cache, cache->start,
> cache->start + cache->length, NULL);
> @@ -2455,7 +2460,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
>
> set_avail_alloc_bits(info, cache->flags);
> if (btrfs_chunk_writeable(info, cache->start)) {
> - if (cache->used == 0) {
> + if (cache->used == 0 && cache->identity_remap_count == 0 &&
> + cache->remap_bytes == 0) {
> ASSERT(list_empty(&cache->bg_list));
> if (btrfs_test_opt(info, DISCARD_ASYNC))
> btrfs_discard_queue_work(&info->discard_ctl, cache);
> @@ -2559,9 +2565,10 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
> need_clear = 1;
>
> while (1) {
> - struct btrfs_block_group_item bgi;
> + struct btrfs_block_group_item_v2 bgi;
> struct extent_buffer *leaf;
> int slot;
> + size_t size;
>
> ret = find_first_block_group(info, path, &key);
> if (ret > 0)
> @@ -2572,8 +2579,16 @@ int btrfs_read_block_groups(struct btrfs_fs_info *info)
> leaf = path->nodes[0];
> slot = path->slots[0];
>
> + if (btrfs_fs_incompat(info, REMAP_TREE)) {
> + size = sizeof(struct btrfs_block_group_item_v2);
> + } else {
> + size = sizeof(struct btrfs_block_group_item);
> + btrfs_set_stack_block_group_v2_remap_bytes(&bgi, 0);
> + btrfs_set_stack_block_group_v2_identity_remap_count(&bgi, 0);
> + }
> +
> read_extent_buffer(leaf, &bgi, btrfs_item_ptr_offset(leaf, slot),
> - sizeof(bgi));
> + size);
>
> btrfs_item_key_to_cpu(leaf, &key, slot);
> btrfs_release_path(path);
> @@ -2643,25 +2658,38 @@ static int insert_block_group_item(struct btrfs_trans_handle *trans,
> struct btrfs_block_group *block_group)
> {
> struct btrfs_fs_info *fs_info = trans->fs_info;
> - struct btrfs_block_group_item bgi;
> + struct btrfs_block_group_item_v2 bgi;
> struct btrfs_root *root = btrfs_block_group_root(fs_info);
> struct btrfs_key key;
> u64 old_commit_used;
> + size_t size;
> int ret;
>
> spin_lock(&block_group->lock);
> - btrfs_set_stack_block_group_used(&bgi, block_group->used);
> - btrfs_set_stack_block_group_chunk_objectid(&bgi,
> - block_group->global_root_id);
> - btrfs_set_stack_block_group_flags(&bgi, block_group->flags);
> + btrfs_set_stack_block_group_v2_used(&bgi, block_group->used);
> + btrfs_set_stack_block_group_v2_chunk_objectid(&bgi,
> + block_group->global_root_id);
> + btrfs_set_stack_block_group_v2_flags(&bgi, block_group->flags);
> + btrfs_set_stack_block_group_v2_remap_bytes(&bgi,
> + block_group->remap_bytes);
> + btrfs_set_stack_block_group_v2_identity_remap_count(&bgi,
> + block_group->identity_remap_count);
> old_commit_used = block_group->commit_used;
> block_group->commit_used = block_group->used;
> + block_group->commit_remap_bytes = block_group->remap_bytes;
> + block_group->commit_identity_remap_count =
> + block_group->identity_remap_count;
> key.objectid = block_group->start;
> key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
> key.offset = block_group->length;
> spin_unlock(&block_group->lock);
>
> - ret = btrfs_insert_item(trans, root, &key, &bgi, sizeof(bgi));
> + if (btrfs_fs_incompat(fs_info, REMAP_TREE))
> + size = sizeof(struct btrfs_block_group_item_v2);
> + else
> + size = sizeof(struct btrfs_block_group_item);
> +
> + ret = btrfs_insert_item(trans, root, &key, &bgi, size);
> if (ret < 0) {
> spin_lock(&block_group->lock);
> block_group->commit_used = old_commit_used;
> @@ -3116,10 +3144,12 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
> struct btrfs_root *root = btrfs_block_group_root(fs_info);
> unsigned long bi;
> struct extent_buffer *leaf;
> - struct btrfs_block_group_item bgi;
> + struct btrfs_block_group_item_v2 bgi;
> struct btrfs_key key;
> - u64 old_commit_used;
> - u64 used;
> + u64 old_commit_used, old_commit_remap_bytes;
> + u32 old_commit_identity_remap_count;
> + u64 used, remap_bytes;
> + u32 identity_remap_count;
>
> /*
> * Block group items update can be triggered out of commit transaction
> @@ -3129,13 +3159,21 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
> */
> spin_lock(&cache->lock);
> old_commit_used = cache->commit_used;
> + old_commit_remap_bytes = cache->commit_remap_bytes;
> + old_commit_identity_remap_count = cache->commit_identity_remap_count;
> used = cache->used;
> - /* No change in used bytes, can safely skip it. */
> - if (cache->commit_used == used) {
> + remap_bytes = cache->remap_bytes;
> + identity_remap_count = cache->identity_remap_count;
> + /* No change in values, can safely skip it. */
> + if (cache->commit_used == used &&
> + cache->commit_remap_bytes == remap_bytes &&
> + cache->commit_identity_remap_count == identity_remap_count) {
> spin_unlock(&cache->lock);
> return 0;
> }
> cache->commit_used = used;
> + cache->commit_remap_bytes = remap_bytes;
> + cache->commit_identity_remap_count = identity_remap_count;
> spin_unlock(&cache->lock);
>
> key.objectid = cache->start;
> @@ -3151,11 +3189,23 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
>
> leaf = path->nodes[0];
> bi = btrfs_item_ptr_offset(leaf, path->slots[0]);
> - btrfs_set_stack_block_group_used(&bgi, used);
> - btrfs_set_stack_block_group_chunk_objectid(&bgi,
> - cache->global_root_id);
> - btrfs_set_stack_block_group_flags(&bgi, cache->flags);
> - write_extent_buffer(leaf, &bgi, bi, sizeof(bgi));
> + btrfs_set_stack_block_group_v2_used(&bgi, used);
> + btrfs_set_stack_block_group_v2_chunk_objectid(&bgi,
> + cache->global_root_id);
> + btrfs_set_stack_block_group_v2_flags(&bgi, cache->flags);
> +
> + if (btrfs_fs_incompat(fs_info, REMAP_TREE)) {
> + btrfs_set_stack_block_group_v2_remap_bytes(&bgi,
> + cache->remap_bytes);
> + btrfs_set_stack_block_group_v2_identity_remap_count(&bgi,
> + cache->identity_remap_count);
> + write_extent_buffer(leaf, &bgi, bi,
> + sizeof(struct btrfs_block_group_item_v2));
> + } else {
> + write_extent_buffer(leaf, &bgi, bi,
> + sizeof(struct btrfs_block_group_item));
> + }
> +
> fail:
> btrfs_release_path(path);
> /*
> @@ -3170,6 +3220,9 @@ static int update_block_group_item(struct btrfs_trans_handle *trans,
> if (ret < 0 && ret != -ENOENT) {
> spin_lock(&cache->lock);
> cache->commit_used = old_commit_used;
> + cache->commit_remap_bytes = old_commit_remap_bytes;
> + cache->commit_identity_remap_count =
> + old_commit_identity_remap_count;
> spin_unlock(&cache->lock);
> }
> return ret;
> diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
> index 9de356bcb411..c484118b8b8d 100644
> --- a/fs/btrfs/block-group.h
> +++ b/fs/btrfs/block-group.h
> @@ -127,6 +127,8 @@ struct btrfs_block_group {
> u64 flags;
> u64 cache_generation;
> u64 global_root_id;
> + u64 remap_bytes;
> + u32 identity_remap_count;
>
> /*
> * The last committed used bytes of this block group, if the above @used
> @@ -134,6 +136,15 @@ struct btrfs_block_group {
> * group item of this block group.
> */
> u64 commit_used;
> + /*
> + * The last committed remap_bytes value of this block group.
> + */
> + u64 commit_remap_bytes;
> + /*
> + * The last commited identity_remap_count value of this block group.
> + */
> + u32 commit_identity_remap_count;
> +
> /*
> * If the free space extent count exceeds this number, convert the block
> * group to bitmaps.
> @@ -275,7 +286,8 @@ static inline bool btrfs_is_block_group_used(const struct btrfs_block_group *bg)
> {
> lockdep_assert_held(&bg->lock);
>
> - return (bg->used > 0 || bg->reserved > 0 || bg->pinned > 0);
> + return (bg->used > 0 || bg->reserved > 0 || bg->pinned > 0 ||
> + bg->remap_bytes > 0);
> }
>
> static inline bool btrfs_is_block_group_data_only(const struct btrfs_block_group *block_group)
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index fd83df06e3fb..25311576fab6 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -687,6 +687,7 @@ static int check_block_group_item(struct extent_buffer *leaf,
> u64 chunk_objectid;
> u64 flags;
> u64 type;
> + size_t exp_size;
>
> /*
> * Here we don't really care about alignment since extent allocator can
> @@ -698,10 +699,15 @@ static int check_block_group_item(struct extent_buffer *leaf,
> return -EUCLEAN;
> }
>
> - if (unlikely(item_size != sizeof(bgi))) {
> + if (btrfs_fs_incompat(fs_info, REMAP_TREE))
> + exp_size = sizeof(struct btrfs_block_group_item_v2);
> + else
> + exp_size = sizeof(struct btrfs_block_group_item);
> +
> + if (unlikely(item_size != exp_size)) {
> block_group_err(leaf, slot,
> "invalid item size, have %u expect %zu",
> - item_size, sizeof(bgi));
> + item_size, exp_size);
> return -EUCLEAN;
> }
>
> diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
> index 9a36f0206d90..500e3a7df90b 100644
> --- a/include/uapi/linux/btrfs_tree.h
> +++ b/include/uapi/linux/btrfs_tree.h
> @@ -1229,6 +1229,14 @@ struct btrfs_block_group_item {
> __le64 flags;
> } __attribute__ ((__packed__));
>
> +struct btrfs_block_group_item_v2 {
> + __le64 used;
> + __le64 chunk_objectid;
> + __le64 flags;
> + __le64 remap_bytes;
> + __le32 identity_remap_count;
> +} __attribute__ ((__packed__));
> +
> struct btrfs_free_space_info {
> __le32 extent_count;
> __le32 flags;
next prev parent reply other threads:[~2025-05-23 9:53 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-15 16:36 [RFC PATCH 00/10] Remap tree Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 01/10] btrfs: add definitions and constants for remap-tree Mark Harmstone
2025-05-21 12:43 ` Johannes Thumshirn
2025-05-23 13:06 ` Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 02/10] btrfs: add REMAP chunk type Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 03/10] btrfs: allow remapped chunks to have zero stripes Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 04/10] btrfs: add extended version of struct block_group_item Mark Harmstone
2025-05-23 9:53 ` Qu Wenruo [this message]
2025-05-23 12:00 ` Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 05/10] btrfs: allow mounting filesystems with remap-tree incompat flag Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 06/10] btrfs: redirect I/O for remapped block groups Mark Harmstone
2025-05-23 10:09 ` Qu Wenruo
2025-05-23 11:53 ` Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 07/10] btrfs: handle deletions from remapped block group Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 08/10] btrfs: handle setting up relocation of block group with remap-tree Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 09/10] btrfs: move existing remaps before relocating block group Mark Harmstone
[not found] ` <202505161726.w1lqCZxG-lkp@intel.com>
2025-05-16 11:43 ` Mark Harmstone
2025-05-15 16:36 ` [RFC PATCH 10/10] btrfs: replace identity maps with actual remaps when doing relocations Mark Harmstone
2025-05-21 0:04 ` Boris Burkov
2025-05-23 14:54 ` Mark Harmstone
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1b511b2f-0738-42a1-95af-546bd7b4e51e@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=maharmstone@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox