From: Mark Harmstone <maharmstone@fb.com>
To: <linux-btrfs@vger.kernel.org>
Cc: Mark Harmstone <maharmstone@fb.com>
Subject: [PATCH 05/12] btrfs: don't add metadata items for the remap tree to the extent tree
Date: Thu, 5 Jun 2025 17:23:35 +0100
Message-ID: <20250605162345.2561026-6-maharmstone@fb.com>
In-Reply-To: <20250605162345.2561026-1-maharmstone@fb.com>
The remap tree and delayed refs can interact badly in the following way:
* A remapped extent is freed in a delayed ref, which removes an entry from
the remap tree
* The remap tree is now small enough to fit in a single leaf
* Corruption results, as we now have a level-0 block with a level-1
metadata item in the extent tree
One solution to this would be to rework the remap tree code so that it operates
via delayed refs. But as we're hoping to remove cow-only metadata items in the
future anyway, change things so that the remap tree doesn't have any entries in
the extent tree. This also has the benefit of reducing write amplification.
Also make the clear_cache mount option a no-op, as with extent tree v2,
because the free-space tree can no longer be recreated from the extent tree.
Finally, disable relocating the remap tree itself for the time being: rather
than walking the extent tree, this will need to be changed so that the remap
tree gets walked, and any nodes within the specified block groups get COWed.
This code will also cover the future cases when we remove the metadata items
for the SYSTEM block groups, i.e. the chunk and root trees.
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
---
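Illustration only, not part of the patch: a minimal userspace sketch of the
accounting rule described above, assuming a toy model in which a tree block
owned by the remap tree gets no extent-tree item on allocation or free but
still moves between free and used space. Every name here (toy_fs,
toy_needs_extent_item, the TOY_* root ids and their values) is hypothetical
and does not correspond to kernel symbols or on-disk constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical root ids for the toy model; the values are arbitrary. */
enum toy_root {
	TOY_ROOT_EXTENT_TREE = 1,
	TOY_ROOT_REMAP_TREE = 2,
};

/* Toy filesystem state: extent-tree item count plus space accounting. */
struct toy_fs {
	unsigned int extent_items;	/* metadata items in the toy extent tree */
	uint64_t free_bytes;		/* space tracked as free */
	uint64_t used_bytes;		/* space tracked as used */
};

/* Blocks owned by the remap tree are not tracked in the extent tree. */
bool toy_needs_extent_item(enum toy_root ref_root)
{
	return ref_root != TOY_ROOT_REMAP_TREE;
}

/* Allocate one tree block: space accounting always changes, the item may not. */
void toy_alloc_tree_block(struct toy_fs *fs, enum toy_root ref_root,
			  uint64_t nodesize)
{
	assert(fs->free_bytes >= nodesize);

	if (toy_needs_extent_item(ref_root))
		fs->extent_items++;

	fs->free_bytes -= nodesize;
	fs->used_bytes += nodesize;
}

/* Free one tree block: mirror of the allocation path. */
void toy_free_tree_block(struct toy_fs *fs, enum toy_root ref_root,
			 uint64_t nodesize)
{
	assert(fs->used_bytes >= nodesize);

	if (toy_needs_extent_item(ref_root)) {
		assert(fs->extent_items > 0);
		fs->extent_items--;
	}

	fs->free_bytes += nodesize;
	fs->used_bytes -= nodesize;
}
```

In this model, allocating or freeing a remap-tree block never touches
extent_items, so there is no per-block level record that could go stale when
the tree shrinks to a single leaf.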
fs/btrfs/disk-io.c | 3 ++
fs/btrfs/extent-tree.c | 114 ++++++++++++++++++++++++-----------------
fs/btrfs/volumes.c | 3 ++
3 files changed, 73 insertions(+), 47 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 60cce96a9ec4..324116c3566c 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3064,6 +3064,9 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
if (btrfs_fs_incompat(fs_info, EXTENT_TREE_V2))
btrfs_warn(fs_info,
"'clear_cache' option is ignored with extent tree v2");
+ else if (btrfs_fs_incompat(fs_info, REMAP_TREE))
+ btrfs_warn(fs_info,
+ "'clear_cache' option is ignored with remap tree");
else
rebuild_free_space_tree = true;
} else if (btrfs_fs_compat_ro(fs_info, FREE_SPACE_TREE) &&
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 46d4963a8241..205692fc1c7e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3106,6 +3106,24 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
bool skinny_metadata = btrfs_fs_incompat(info, SKINNY_METADATA);
u64 delayed_ref_root = href->owning_root;
+ is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
+
+ if (!is_data && node->ref_root == BTRFS_REMAP_TREE_OBJECTID) {
+ ret = add_to_free_space_tree(trans, bytenr, num_bytes);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+ return ret;
+ }
+
+ ret = btrfs_update_block_group(trans, bytenr, num_bytes, false);
+ if (ret) {
+ btrfs_abort_transaction(trans, ret);
+ return ret;
+ }
+
+ return 0;
+ }
+
extent_root = btrfs_extent_root(info, bytenr);
ASSERT(extent_root);
@@ -3113,8 +3131,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
if (!path)
return -ENOMEM;
- is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID;
-
if (!is_data && refs_to_drop != 1) {
btrfs_crit(info,
"invalid refs_to_drop, dropping more than 1 refs for tree block %llu refs_to_drop %u",
@@ -4893,57 +4909,61 @@ static int alloc_reserved_tree_block(struct btrfs_trans_handle *trans,
int level = btrfs_delayed_ref_owner(node);
bool skinny_metadata = btrfs_fs_incompat(fs_info, SKINNY_METADATA);
- extent_key.objectid = node->bytenr;
- if (skinny_metadata) {
- /* The owner of a tree block is the level. */
- extent_key.offset = level;
- extent_key.type = BTRFS_METADATA_ITEM_KEY;
- } else {
- extent_key.offset = node->num_bytes;
- extent_key.type = BTRFS_EXTENT_ITEM_KEY;
- size += sizeof(*block_info);
- }
+ if (node->ref_root != BTRFS_REMAP_TREE_OBJECTID) {
+ extent_key.objectid = node->bytenr;
+ if (skinny_metadata) {
+ /* The owner of a tree block is the level. */
+ extent_key.offset = level;
+ extent_key.type = BTRFS_METADATA_ITEM_KEY;
+ } else {
+ extent_key.offset = node->num_bytes;
+ extent_key.type = BTRFS_EXTENT_ITEM_KEY;
+ size += sizeof(*block_info);
+ }
- path = btrfs_alloc_path();
- if (!path)
- return -ENOMEM;
+ path = btrfs_alloc_path();
+ if (!path)
+ return -ENOMEM;
- extent_root = btrfs_extent_root(fs_info, extent_key.objectid);
- ret = btrfs_insert_empty_item(trans, extent_root, path, &extent_key,
- size);
- if (ret) {
- btrfs_free_path(path);
- return ret;
- }
+ extent_root = btrfs_extent_root(fs_info, extent_key.objectid);
+ ret = btrfs_insert_empty_item(trans, extent_root, path,
+ &extent_key, size);
+ if (ret) {
+ btrfs_free_path(path);
+ return ret;
+ }
- leaf = path->nodes[0];
- extent_item = btrfs_item_ptr(leaf, path->slots[0],
- struct btrfs_extent_item);
- btrfs_set_extent_refs(leaf, extent_item, 1);
- btrfs_set_extent_generation(leaf, extent_item, trans->transid);
- btrfs_set_extent_flags(leaf, extent_item,
- flags | BTRFS_EXTENT_FLAG_TREE_BLOCK);
+ leaf = path->nodes[0];
+ extent_item = btrfs_item_ptr(leaf, path->slots[0],
+ struct btrfs_extent_item);
+ btrfs_set_extent_refs(leaf, extent_item, 1);
+ btrfs_set_extent_generation(leaf, extent_item, trans->transid);
+ btrfs_set_extent_flags(leaf, extent_item,
+ flags | BTRFS_EXTENT_FLAG_TREE_BLOCK);
- if (skinny_metadata) {
- iref = (struct btrfs_extent_inline_ref *)(extent_item + 1);
- } else {
- block_info = (struct btrfs_tree_block_info *)(extent_item + 1);
- btrfs_set_tree_block_key(leaf, block_info, &extent_op->key);
- btrfs_set_tree_block_level(leaf, block_info, level);
- iref = (struct btrfs_extent_inline_ref *)(block_info + 1);
- }
+ if (skinny_metadata) {
+ iref = (struct btrfs_extent_inline_ref *)(extent_item + 1);
+ } else {
+ block_info = (struct btrfs_tree_block_info *)(extent_item + 1);
+ btrfs_set_tree_block_key(leaf, block_info, &extent_op->key);
+ btrfs_set_tree_block_level(leaf, block_info, level);
+ iref = (struct btrfs_extent_inline_ref *)(block_info + 1);
+ }
- if (node->type == BTRFS_SHARED_BLOCK_REF_KEY) {
- btrfs_set_extent_inline_ref_type(leaf, iref,
- BTRFS_SHARED_BLOCK_REF_KEY);
- btrfs_set_extent_inline_ref_offset(leaf, iref, node->parent);
- } else {
- btrfs_set_extent_inline_ref_type(leaf, iref,
- BTRFS_TREE_BLOCK_REF_KEY);
- btrfs_set_extent_inline_ref_offset(leaf, iref, node->ref_root);
- }
+ if (node->type == BTRFS_SHARED_BLOCK_REF_KEY) {
+ btrfs_set_extent_inline_ref_type(leaf, iref,
+ BTRFS_SHARED_BLOCK_REF_KEY);
+ btrfs_set_extent_inline_ref_offset(leaf, iref,
+ node->parent);
+ } else {
+ btrfs_set_extent_inline_ref_type(leaf, iref,
+ BTRFS_TREE_BLOCK_REF_KEY);
+ btrfs_set_extent_inline_ref_offset(leaf, iref,
+ node->ref_root);
+ }
- btrfs_free_path(path);
+ btrfs_free_path(path);
+ }
return alloc_reserved_extent(trans, node->bytenr, fs_info->nodesize);
}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9159d11cb143..0f4954f998cd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3981,6 +3981,9 @@ static bool should_balance_chunk(struct extent_buffer *leaf, struct btrfs_chunk
struct btrfs_balance_args *bargs = NULL;
u64 chunk_type = btrfs_chunk_type(leaf, chunk);
+ if (chunk_type & BTRFS_BLOCK_GROUP_REMAP)
+ return false;
+
/* type filter */
if (!((chunk_type & BTRFS_BLOCK_GROUP_TYPE_MASK) &
(bctl->flags & BTRFS_BALANCE_TYPE_MASK))) {
--
2.49.0