From: Mark Harmstone <maharmstone@meta.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
Mark Harmstone <maharmstone@meta.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC PATCH 06/10] btrfs: redirect I/O for remapped block groups
Date: Fri, 23 May 2025 11:53:54 +0000
Message-ID: <76aee017-6488-4185-92cf-c9442f1a36e1@meta.com>
In-Reply-To: <d5aeaff6-3e04-4525-861d-36dfa358eb45@gmx.com>
On 23/5/25 11:09, Qu Wenruo wrote:
> On 2025/5/16 02:06, Mark Harmstone wrote:
>> Change btrfs_map_block() so that if the block group has the REMAPPED
>> flag set, we call btrfs_translate_remap() to obtain a new address.
>
> I'm wondering if we can do it a little simpler:
>
> - Delete the chunk item for a fully relocated/remapped chunk
> So that future read/write into that logical range will not find a chunk.
>
> - If chunk map lookup failed, search remap tree instead
>
> By this we do not need the REMAPPED flag at all.
>
> Thanks,
> Qu
You would still need the REMAPPED flag, as that's also set on
partially-remapped block groups.
The life cycle is:
* Normal block group
* Block group with the REMAPPED flag set and identity remaps covering its
  data. The REMAPPED flag is an instruction to search the remap tree for
  this BG, and also means that no new allocations can be made from it
* Block group with a mixture of identity remaps and actual remaps
* Fully-remapped block group, with no chunk stripes and no identity
  remaps left
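
To make the stages above concrete, here is a rough userspace sketch that
classifies a block group from a few summary fields. All of the names in it
(bg_remap_state, bg_summary, bg_classify, bg_can_allocate) are made up for
illustration and do not appear in the patch set; the kernel tracks this
state differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these exist in the patch set. */
enum bg_remap_state {
	BG_NORMAL,		/* no REMAPPED flag, ordinary chunk mapping */
	BG_IDENTITY_ONLY,	/* REMAPPED set, only identity remaps so far */
	BG_PARTIALLY_REMAPPED,	/* mixture of identity and actual remaps */
	BG_FULLY_REMAPPED,	/* no chunk stripes, no identity remaps left */
};

/* Illustrative summary of one block group, not an on-disk structure. */
struct bg_summary {
	bool remapped_flag;	/* stands in for BTRFS_BLOCK_GROUP_REMAPPED */
	uint32_t num_stripes;	/* stripes left in the chunk item */
	uint64_t identity_remaps; /* identity remap items in this range */
	uint64_t actual_remaps;	/* actual remap items in this range */
};

static enum bg_remap_state bg_classify(const struct bg_summary *bg)
{
	assert(bg != NULL);

	if (!bg->remapped_flag)
		return BG_NORMAL;
	/* Nothing lives here any more: every lookup goes via the remap tree. */
	if (bg->identity_remaps == 0 && bg->num_stripes == 0)
		return BG_FULLY_REMAPPED;
	/* Flag is set but no data has actually been moved yet. */
	if (bg->actual_remaps == 0)
		return BG_IDENTITY_ONLY;
	return BG_PARTIALLY_REMAPPED;
}

/* Once the flag is set, no new allocations may come from the block group. */
static bool bg_can_allocate(const struct bg_summary *bg)
{
	return bg_classify(bg) == BG_NORMAL;
}
```

The point of the sketch is that the flag, not the absence of a chunk item,
is what separates the last three states from the first.
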
My concern with making fully-remapped block groups implicit is that it
makes it harder to diagnose corruption. If we see an address that's
outside of any block group and has no remap entry, is it a bit-flip error
or a bug in the remap tree code?
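
As a minimal sketch of why the explicit flag helps, the toy lookup below
keeps the two failure cases apart: an address that no block group covers,
and an address inside a REMAPPED block group that has no remap entry. It
also shows the translation arithmetic from the patch description (subtract
the start of the remap range, add the destination, clamp the length to the
end of the range). The structures, the linear searches and the toy_* names
are assumptions for illustration only; the patch itself searches the remap
tree and returns -ENOENT when no entry is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes, chosen to keep the two failures distinct. */
enum toy_result {
	TOY_OK,
	TOY_NO_BLOCK_GROUP,	/* address is not inside any block group */
	TOY_NO_REMAP_ENTRY,	/* REMAPPED block group, but no remap item */
};

struct toy_bg {
	uint64_t start;
	uint64_t len;
	bool remapped;		/* stands in for the REMAPPED flag */
};

struct toy_remap {
	uint64_t start;
	uint64_t len;
	bool identity;		/* identity remap: data has not moved yet */
	uint64_t dest;		/* new address, used for actual remaps only */
};

static enum toy_result toy_translate(const struct toy_bg *bgs, size_t nr_bgs,
				     const struct toy_remap *remaps,
				     size_t nr_remaps,
				     uint64_t *logical, uint64_t *length)
{
	const struct toy_bg *bg = NULL;

	for (size_t i = 0; i < nr_bgs; i++) {
		if (*logical >= bgs[i].start &&
		    *logical - bgs[i].start < bgs[i].len) {
			bg = &bgs[i];
			break;
		}
	}
	if (!bg)
		return TOY_NO_BLOCK_GROUP;
	if (!bg->remapped)
		return TOY_OK;	/* ordinary block group, address unchanged */

	for (size_t i = 0; i < nr_remaps; i++) {
		const struct toy_remap *r = &remaps[i];

		if (*logical < r->start || *logical - r->start >= r->len)
			continue;

		/* Do not let the I/O run past the end of this remap range. */
		if (*length > r->start + r->len - *logical)
			*length = r->start + r->len - *logical;

		if (!r->identity)
			*logical = *logical - r->start + r->dest;
		return TOY_OK;
	}

	/* The flag promised an entry, so this points at the remap tree. */
	return TOY_NO_REMAP_ENTRY;
}
```

With the flag in place, TOY_NO_REMAP_ENTRY can only mean the remap tree is
inconsistent with the block group, whereas without it both failures would
collapse into "no chunk found".
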
Mark
>
>>
>> btrfs_translate_remap() searches the remap tree for a range
>> corresponding to the logical address passed to btrfs_map_block(). If it
>> is within an identity remap, this part of the block group hasn't yet
>> been relocated, and so we use the existing address.
>>
>> If it is within an actual remap, we subtract the start of the remap
>> range and add the address of its destination, contained in the item's
>> payload.
>>
>> Signed-off-by: Mark Harmstone <maharmstone@fb.com>
>> ---
>> fs/btrfs/ctree.c | 11 ++++---
>> fs/btrfs/ctree.h | 3 ++
>> fs/btrfs/relocation.c | 75 +++++++++++++++++++++++++++++++++++++++++++
>> fs/btrfs/relocation.h | 2 ++
>> fs/btrfs/volumes.c | 19 +++++++++++
>> 5 files changed, 105 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index a2e7979372cc..7808f7bc2303 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -2331,7 +2331,8 @@ int btrfs_search_old_slot(struct btrfs_root
>> *root, const struct btrfs_key *key,
>> * This may release the path, and so you may lose any locks held at the
>> * time you call it.
>> */
>> -static int btrfs_prev_leaf(struct btrfs_root *root, struct btrfs_path
>> *path)
>> +int btrfs_prev_leaf(struct btrfs_trans_handle *trans, struct
>> btrfs_root *root,
>> + struct btrfs_path *path, int ins_len, int cow)
>> {
>> struct btrfs_key key;
>> struct btrfs_key orig_key;
>> @@ -2355,7 +2356,7 @@ static int btrfs_prev_leaf(struct btrfs_root
>> *root, struct btrfs_path *path)
>> }
>> btrfs_release_path(path);
>> - ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
>> + ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
>> if (ret <= 0)
>> return ret;
>> @@ -2454,7 +2455,7 @@ int btrfs_search_slot_for_read(struct btrfs_root
>> *root,
>> }
>> } else {
>> if (p->slots[0] == 0) {
>> - ret = btrfs_prev_leaf(root, p);
>> + ret = btrfs_prev_leaf(NULL, root, p, 0, 0);
>> if (ret < 0)
>> return ret;
>> if (!ret) {
>> @@ -5003,7 +5004,7 @@ int btrfs_previous_item(struct btrfs_root *root,
>> while (1) {
>> if (path->slots[0] == 0) {
>> - ret = btrfs_prev_leaf(root, path);
>> + ret = btrfs_prev_leaf(NULL, root, path, 0, 0);
>> if (ret != 0)
>> return ret;
>> } else {
>> @@ -5044,7 +5045,7 @@ int btrfs_previous_extent_item(struct btrfs_root
>> *root,
>> while (1) {
>> if (path->slots[0] == 0) {
>> - ret = btrfs_prev_leaf(root, path);
>> + ret = btrfs_prev_leaf(NULL, root, path, 0, 0);
>> if (ret != 0)
>> return ret;
>> } else {
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 075a06db43a1..90a0d38a31c9 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -721,6 +721,9 @@ static inline int btrfs_next_leaf(struct
>> btrfs_root *root, struct btrfs_path *pa
>> return btrfs_next_old_leaf(root, path, 0);
>> }
>> +int btrfs_prev_leaf(struct btrfs_trans_handle *trans, struct
>> btrfs_root *root,
>> + struct btrfs_path *path, int ins_len, int cow);
>> +
>> static inline int btrfs_next_item(struct btrfs_root *root, struct
>> btrfs_path *p)
>> {
>> return btrfs_next_old_item(root, p, 0);
>> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
>> index 02086191630d..e5571c897906 100644
>> --- a/fs/btrfs/relocation.c
>> +++ b/fs/btrfs/relocation.c
>> @@ -3897,6 +3897,81 @@ static const char *stage_to_string(enum
>> reloc_stage stage)
>> return "unknown";
>> }
>> +int btrfs_translate_remap(struct btrfs_fs_info *fs_info, u64 *logical,
>> + u64 *length)
>> +{
>> + int ret;
>> + struct btrfs_key key, found_key;
>> + struct extent_buffer *leaf;
>> + struct btrfs_remap *remap;
>> + BTRFS_PATH_AUTO_FREE(path);
>> +
>> + path = btrfs_alloc_path();
>> + if (!path)
>> + return -ENOMEM;
>> +
>> + key.objectid = *logical;
>> + key.type = BTRFS_IDENTITY_REMAP_KEY;
>> + key.offset = 0;
>> +
>> + ret = btrfs_search_slot(NULL, fs_info->remap_root, &key, path,
>> + 0, 0);
>> + if (ret < 0)
>> + return ret;
>> +
>> + leaf = path->nodes[0];
>> +
>> + if (path->slots[0] >= btrfs_header_nritems(leaf)) {
>> + ret = btrfs_next_leaf(fs_info->remap_root, path);
>> + if (ret < 0)
>> + return ret;
>> +
>> + leaf = path->nodes[0];
>> + }
>> +
>> + btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> +
>> + if (found_key.objectid > *logical) {
>> + if (path->slots[0] == 0) {
>> + ret = btrfs_prev_leaf(NULL, fs_info->remap_root, path,
>> + 0, 0);
>> + if (ret) {
>> + if (ret == 1)
>> + ret = -ENOENT;
>> + return ret;
>> + }
>> +
>> + leaf = path->nodes[0];
>> + } else {
>> + path->slots[0]--;
>> + }
>> +
>> + btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> + }
>> +
>> + if (found_key.type != BTRFS_REMAP_KEY &&
>> + found_key.type != BTRFS_IDENTITY_REMAP_KEY) {
>> + return -ENOENT;
>> + }
>> +
>> + if (found_key.objectid > *logical ||
>> + found_key.objectid + found_key.offset <= *logical) {
>> + return -ENOENT;
>> + }
>> +
>> + if (*logical + *length > found_key.objectid + found_key.offset)
>> + *length = found_key.objectid + found_key.offset - *logical;
>> +
>> + if (found_key.type == BTRFS_IDENTITY_REMAP_KEY)
>> + return 0;
>> +
>> + remap = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_remap);
>> +
>> + *logical = *logical - found_key.objectid +
>> btrfs_remap_address(leaf, remap);
>> +
>> + return 0;
>> +}
>> +
>> /*
>> * function to relocate all extents in a block group.
>> */
>> diff --git a/fs/btrfs/relocation.h b/fs/btrfs/relocation.h
>> index 788c86d8633a..f07dbd9a89c6 100644
>> --- a/fs/btrfs/relocation.h
>> +++ b/fs/btrfs/relocation.h
>> @@ -30,5 +30,7 @@ int btrfs_should_cancel_balance(const struct
>> btrfs_fs_info *fs_info);
>> struct btrfs_root *find_reloc_root(struct btrfs_fs_info *fs_info,
>> u64 bytenr);
>> bool btrfs_should_ignore_reloc_root(const struct btrfs_root *root);
>> u64 btrfs_get_reloc_bg_bytenr(const struct btrfs_fs_info *fs_info);
>> +int btrfs_translate_remap(struct btrfs_fs_info *fs_info, u64 *logical,
>> + u64 *length);
>> #endif
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 77194bb46b40..4777926213c0 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -6620,6 +6620,25 @@ int btrfs_map_block(struct btrfs_fs_info
>> *fs_info, enum btrfs_map_op op,
>> if (IS_ERR(map))
>> return PTR_ERR(map);
>> + if (map->type & BTRFS_BLOCK_GROUP_REMAPPED) {
>> + u64 new_logical = logical;
>> +
>> + ret = btrfs_translate_remap(fs_info, &new_logical, length);
>> + if (ret)
>> + return ret;
>> +
>> + if (new_logical != logical) {
>> + btrfs_free_chunk_map(map);
>> +
>> + map = btrfs_get_chunk_map(fs_info, new_logical,
>> + *length);
>> + if (IS_ERR(map))
>> + return PTR_ERR(map);
>> +
>> + logical = new_logical;
>> + }
>> + }
>> +
>> num_copies = btrfs_chunk_map_num_copies(map);
>> if (io_geom.mirror_num > num_copies)
>> return -EINVAL;
>