* [PATCH 0/3] btrfs: add parent-child ownership check
@ 2022-02-22 7:41 Qu Wenruo
2022-02-22 7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Qu Wenruo @ 2022-02-22 7:41 UTC (permalink / raw)
To: linux-btrfs
Tree-checker doesn't really check owner except for empty leaf.
This allows parent-child ownership mismatch to sneak in.
Enhance the checks again tree block owner by:
- Add owner check when reading child tree blocks
- Make sure the tree root owner matches the root key
Unfortunately the check still has some cases missing, mostly for
- Log/reloc trees
Need full root key to check, which is not really possible for
some backref call sites.
The first 2 patches are just cleanup to unify the error handling
patterns, there are some "creative" way checking the errors, which is
not really reader friendly.
The last patch is doing the real work.
Qu Wenruo (3):
btrfs: unify the error handling pattern for read_tree_block()
btrfs: unify the error handling of btrfs_read_buffer()
btrfs: check extent buffer owner against the owner rootid
fs/btrfs/backref.c | 7 +++--
fs/btrfs/ctree.c | 48 +++++++++++++++++++---------------
fs/btrfs/disk-io.c | 37 +++++++++++++++++++++-----
fs/btrfs/print-tree.c | 4 +--
fs/btrfs/relocation.c | 3 ++-
fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
fs/btrfs/tree-checker.h | 1 +
7 files changed, 125 insertions(+), 32 deletions(-)
--
2.35.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block()
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
@ 2022-02-22 7:41 ` Qu Wenruo
2022-02-22 7:41 ` [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer() Qu Wenruo
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2022-02-22 7:41 UTC (permalink / raw)
To: linux-btrfs
We had an error handling pattern for read_tree_block() like this:
eb = read_tree_block();
if (IS_ERR(eb)) {
/*
* Handling error here
* Normally ended up with return or goto out.
*/
} else if (!extent_buffer_uptodate(eb)) {
/*
* Different error handling here
* Normally also ended up with return or goto out;
*/
}
This is fine, but if we want to add extra check for each
read_tree_block(), the existing if-else-if is not that expandable and
will take reader some seconds to figure out there is no extra branch.
Here we change it to a more common way, without the extra else:
eb = read_tree_block();
if (IS_ERR(eb)) {
/*
* Handling error here
*/
return eb or goto out;
}
if (!extent_buffer_uptodate(eb)) {
/*
* Different error handling here
*/
return eb or goto out;
}
This also removes some oddball call sites which uses some creative way to
check error.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/backref.c | 7 +++++--
fs/btrfs/ctree.c | 30 ++++++++++++++++--------------
fs/btrfs/disk-io.c | 16 +++++++++-------
fs/btrfs/print-tree.c | 4 ++--
fs/btrfs/relocation.c | 3 ++-
5 files changed, 34 insertions(+), 26 deletions(-)
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index c9ee579bc5a6..ebc392ea1d74 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -789,11 +789,13 @@ static int add_missing_keys(struct btrfs_fs_info *fs_info,
if (IS_ERR(eb)) {
free_pref(ref);
return PTR_ERR(eb);
- } else if (!extent_buffer_uptodate(eb)) {
+ }
+ if (!extent_buffer_uptodate(eb)) {
free_pref(ref);
free_extent_buffer(eb);
return -EIO;
}
+
if (lock)
btrfs_tree_read_lock(eb);
if (btrfs_header_level(eb) == 0)
@@ -1335,7 +1337,8 @@ static int find_parent_nodes(struct btrfs_trans_handle *trans,
if (IS_ERR(eb)) {
ret = PTR_ERR(eb);
goto out;
- } else if (!extent_buffer_uptodate(eb)) {
+ }
+ if (!extent_buffer_uptodate(eb)) {
free_extent_buffer(eb);
ret = -EIO;
goto out;
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 1bb5335c6d75..9b2d9cd41676 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -846,9 +846,11 @@ struct extent_buffer *btrfs_read_node_slot(struct extent_buffer *parent,
btrfs_header_owner(parent),
btrfs_node_ptr_generation(parent, slot),
level - 1, &first_key);
- if (!IS_ERR(eb) && !extent_buffer_uptodate(eb)) {
+ if (IS_ERR(eb))
+ return eb;
+ if (!extent_buffer_uptodate(eb)) {
free_extent_buffer(eb);
- eb = ERR_PTR(-EIO);
+ return ERR_PTR(-EIO);
}
return eb;
@@ -1460,19 +1462,19 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
ret = -EAGAIN;
tmp = read_tree_block(fs_info, blocknr, root->root_key.objectid,
gen, parent_level - 1, &first_key);
- if (!IS_ERR(tmp)) {
- /*
- * If the read above didn't mark this buffer up to date,
- * it will never end up being up to date. Set ret to EIO now
- * and give up so that our caller doesn't loop forever
- * on our EAGAINs.
- */
- if (!extent_buffer_uptodate(tmp))
- ret = -EIO;
- free_extent_buffer(tmp);
- } else {
- ret = PTR_ERR(tmp);
+ if (IS_ERR(tmp)) {
+ btrfs_release_path(p);
+ return PTR_ERR(tmp);
}
+ /*
+ * If the read above didn't mark this buffer up to date,
+ * it will never end up being up to date. Set ret to EIO now
+ * and give up so that our caller doesn't loop forever
+ * on our EAGAINs.
+ */
+ if (!extent_buffer_uptodate(tmp))
+ ret = -EIO;
+ free_extent_buffer(tmp);
btrfs_release_path(p);
return ret;
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index b6a81c39d5f4..8165ee3ae8a5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1543,7 +1543,8 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
ret = PTR_ERR(root->node);
root->node = NULL;
goto fail;
- } else if (!btrfs_buffer_uptodate(root->node, generation, 0)) {
+ }
+ if (!btrfs_buffer_uptodate(root->node, generation, 0)) {
ret = -EIO;
goto fail;
}
@@ -2538,11 +2539,13 @@ static int btrfs_replay_log(struct btrfs_fs_info *fs_info,
log_tree_root->node = NULL;
btrfs_put_root(log_tree_root);
return ret;
- } else if (!extent_buffer_uptodate(log_tree_root->node)) {
+ }
+ if (!extent_buffer_uptodate(log_tree_root->node)) {
btrfs_err(fs_info, "failed to read log tree");
btrfs_put_root(log_tree_root);
return -EIO;
}
+
/* returns with log_tree_root freed on success */
ret = btrfs_recover_log_trees(log_tree_root);
if (ret) {
@@ -2984,15 +2987,14 @@ static int load_super_root(struct btrfs_root *root, u64 bytenr, u64 gen, int lev
if (IS_ERR(root->node)) {
ret = PTR_ERR(root->node);
root->node = NULL;
- } else if (!extent_buffer_uptodate(root->node)) {
+ return ret;
+ }
+ if (!extent_buffer_uptodate(root->node)) {
free_extent_buffer(root->node);
root->node = NULL;
- ret = -EIO;
+ return -EIO;
}
- if (ret)
- return ret;
-
btrfs_set_root_node(&root->root_item, root->node);
root->commit_root = btrfs_root_node(root);
btrfs_set_root_refs(&root->root_item, 1);
diff --git a/fs/btrfs/print-tree.c b/fs/btrfs/print-tree.c
index 524fdb0ddd74..dd8777872143 100644
--- a/fs/btrfs/print-tree.c
+++ b/fs/btrfs/print-tree.c
@@ -392,9 +392,9 @@ void btrfs_print_tree(struct extent_buffer *c, bool follow)
btrfs_header_owner(c),
btrfs_node_ptr_generation(c, i),
level - 1, &first_key);
- if (IS_ERR(next)) {
+ if (IS_ERR(next))
continue;
- } else if (!extent_buffer_uptodate(next)) {
+ if (!extent_buffer_uptodate(next)) {
free_extent_buffer(next);
continue;
}
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index f5465197996d..1ac36d9bfaf4 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2601,7 +2601,8 @@ static int get_tree_block_key(struct btrfs_fs_info *fs_info,
block->key.offset, block->level, NULL);
if (IS_ERR(eb)) {
return PTR_ERR(eb);
- } else if (!extent_buffer_uptodate(eb)) {
+ }
+ if (!extent_buffer_uptodate(eb)) {
free_extent_buffer(eb);
return -EIO;
}
--
2.35.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer()
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
2022-02-22 7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
@ 2022-02-22 7:41 ` Qu Wenruo
2022-02-22 7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
2022-03-01 21:45 ` [PATCH 0/3] btrfs: add parent-child ownership check David Sterba
3 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2022-02-22 7:41 UTC (permalink / raw)
To: linux-btrfs
There is one oddball error handling of btrfs_read_buffer():
ret = btrfs_read_buffer(tmp, gen, parent_level - 1, &first_key);
if (!ret) {
*eb_ret = tmp;
return 0;
}
free_extent_buffer(tmp);
btrfs_release_path(p);
return -EIO;
While all other call sites check the error first.
Unify the behavior.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/ctree.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 9b2d9cd41676..0eecf98d0abb 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1438,13 +1438,13 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
/* now we're allowed to do a blocking uptodate check */
ret = btrfs_read_buffer(tmp, gen, parent_level - 1, &first_key);
- if (!ret) {
- *eb_ret = tmp;
- return 0;
+ if (ret) {
+ free_extent_buffer(tmp);
+ btrfs_release_path(p);
+ return -EIO;
}
- free_extent_buffer(tmp);
- btrfs_release_path(p);
- return -EIO;
+ *eb_ret = tmp;
+ return 0;
}
/*
--
2.35.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
2022-02-22 7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
2022-02-22 7:41 ` [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer() Qu Wenruo
@ 2022-02-22 7:41 ` Qu Wenruo
2022-03-04 16:25 ` Filipe Manana
2022-03-01 21:45 ` [PATCH 0/3] btrfs: add parent-child ownership check David Sterba
3 siblings, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2022-02-22 7:41 UTC (permalink / raw)
To: linux-btrfs
Btrfs doesn't really check whether the tree block really respects the
root owner.
This means, if a tree block referred by a parent in extent tree, but has
owner of 5, btrfs can still continue reading the tree block, as long as
it doesn't trigger other sanity checks.
Normally this is fine, but combined with the empty tree check in
check_leaf(), if we hit an empty extent tree, but the root node has
csum tree owner, we can let such extent buffer to sneak in.
Shrink the hole by:
- Do extra eb owner check at tree read time
- Make sure the root owner extent buffer exactly match the root id.
Unfortunately we can't yet completely patch the hole, there are several
call sites can't pass all info we need:
- For reloc/log trees
Their owner is key::offset, not key::objectid.
We need the full root key to do that accurate check.
For now, we just skip the ownership check for those trees.
- For add_data_references() of relocation
That call site doesn't have any parent/ownership info, as all the
bytenrs are all from btrfs_find_all_leafs().
Thankfully, btrfs_find_all_leafs() still do the ownership check,
and even for the read_tree_block() caller inside
add_data_references(), we know that all tree blocks there are
subvolume tree blocks.
Just manually convert root_owner 0 to FS_TREE to continue the check.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/ctree.c | 6 +++++
fs/btrfs/disk-io.c | 21 +++++++++++++++
fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
fs/btrfs/tree-checker.h | 1 +
4 files changed, 85 insertions(+)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 0eecf98d0abb..d904fe0973bd 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -16,6 +16,7 @@
#include "volumes.h"
#include "qgroup.h"
#include "tree-mod-log.h"
+#include "tree-checker.h"
static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
*root, struct btrfs_path *path, int level);
@@ -1443,6 +1444,11 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
btrfs_release_path(p);
return -EIO;
}
+ if (btrfs_check_eb_owner(tmp, root->root_key.objectid)) {
+ free_extent_buffer(tmp);
+ btrfs_release_path(p);
+ return -EUCLEAN;
+ }
*eb_ret = tmp;
return 0;
}
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8165ee3ae8a5..018a230efca5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1109,6 +1109,10 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
free_extent_buffer_stale(buf);
return ERR_PTR(ret);
}
+ if (btrfs_check_eb_owner(buf, owner_root)) {
+ free_extent_buffer_stale(buf);
+ return ERR_PTR(-EUCLEAN);
+ }
return buf;
}
@@ -1548,6 +1552,23 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
ret = -EIO;
goto fail;
}
+
+ /*
+ * For real fs, and not log/reloc trees, root owner must
+ * match its root node owner
+ */
+ if (!test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state) &&
+ root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
+ root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
+ root->root_key.objectid != btrfs_header_owner(root->node)) {
+ btrfs_crit(fs_info,
+"root=%llu block=%llu, tree root owner mismatch, have %llu expect %llu",
+ root->root_key.objectid, root->node->start,
+ btrfs_header_owner(root->node),
+ root->root_key.objectid);
+ ret = -EUCLEAN;
+ goto fail;
+ }
root->commit_root = btrfs_root_node(root);
return root;
fail:
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 2c1072923590..f50bde52f466 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -1855,3 +1855,60 @@ int btrfs_check_node(struct extent_buffer *node)
return ret;
}
ALLOW_ERROR_INJECTION(btrfs_check_node, ERRNO);
+
+int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner)
+{
+ bool is_subvol = is_fstree(root_owner);
+ const u64 eb_owner = btrfs_header_owner(eb);
+
+ /*
+ * Skip dummy fs, as selftest doesn't bother to create unique ebs for
+ * each dummy root.
+ */
+ if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &eb->fs_info->fs_state))
+ return 0;
+
+ /*
+ * Those trees uses key.offset as their owner, our callers don't
+ * have the extra capacity to pass key.offset here.
+ * So we just skip those trees.
+ */
+ if (root_owner == BTRFS_TREE_LOG_OBJECTID ||
+ root_owner == BTRFS_TREE_RELOC_OBJECTID)
+ return 0;
+
+ /*
+ * This happens for add_data_references() of balance, at that call site
+ * we don't have owner info.
+ * But we know all tree blocks there are subvolume tree blocks.
+ */
+ if (root_owner == 0)
+ is_subvol = true;
+
+ if (!is_subvol) {
+ /* For non-subvolume trees, the eb owner should match root owner */
+ if (root_owner != eb_owner) {
+ btrfs_crit(eb->fs_info,
+"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
+ btrfs_header_level(eb) == 0 ? "leaf" : "node",
+ root_owner, btrfs_header_bytenr(eb), eb_owner,
+ root_owner);
+ return -EUCLEAN;
+ }
+ return 0;
+ }
+
+ /*
+ * For subvolume trees, owner can mismatch, but they should all belong
+ * to subvolume trees.
+ */
+ if (is_subvol != is_fstree(eb_owner)) {
+ btrfs_crit(eb->fs_info,
+"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
+ btrfs_header_level(eb) == 0 ? "leaf" : "node",
+ root_owner, btrfs_header_bytenr(eb), eb_owner,
+ BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
+ return -EUCLEAN;
+ }
+ return 0;
+}
diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
index 32fecc9dc1dd..c48719485687 100644
--- a/fs/btrfs/tree-checker.h
+++ b/fs/btrfs/tree-checker.h
@@ -25,5 +25,6 @@ int btrfs_check_node(struct extent_buffer *node);
int btrfs_check_chunk_valid(struct extent_buffer *leaf,
struct btrfs_chunk *chunk, u64 logical);
+int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner);
#endif
--
2.35.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 0/3] btrfs: add parent-child ownership check
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
` (2 preceding siblings ...)
2022-02-22 7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
@ 2022-03-01 21:45 ` David Sterba
3 siblings, 0 replies; 7+ messages in thread
From: David Sterba @ 2022-03-01 21:45 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Feb 22, 2022 at 03:41:18PM +0800, Qu Wenruo wrote:
> Tree-checker doesn't really check owner except for empty leaf.
>
> This allows parent-child ownership mismatch to sneak in.
>
> Enhance the checks again tree block owner by:
>
> - Add owner check when reading child tree blocks
>
> - Make sure the tree root owner matches the root key
>
> Unfortunately the check still has some cases missing, mostly for
>
> - Log/reloc trees
> Need full root key to check, which is not really possible for
> some backref call sites.
>
>
> The first 2 patches are just cleanup to unify the error handling
> patterns, there are some "creative" way checking the errors, which is
> not really reader friendly.
>
> The last patch is doing the real work.
>
> Qu Wenruo (3):
> btrfs: unify the error handling pattern for read_tree_block()
> btrfs: unify the error handling of btrfs_read_buffer()
> btrfs: check extent buffer owner against the owner rootid
Added to misc-next, thanks.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid
2022-02-22 7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
@ 2022-03-04 16:25 ` Filipe Manana
2022-03-04 23:11 ` Qu Wenruo
0 siblings, 1 reply; 7+ messages in thread
From: Filipe Manana @ 2022-03-04 16:25 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Tue, Feb 22, 2022 at 7:42 AM Qu Wenruo <wqu@suse.com> wrote:
>
> Btrfs doesn't really check whether the tree block really respects the
> root owner.
>
> This means, if a tree block referred by a parent in extent tree, but has
> owner of 5, btrfs can still continue reading the tree block, as long as
> it doesn't trigger other sanity checks.
>
> Normally this is fine, but combined with the empty tree check in
> check_leaf(), if we hit an empty extent tree, but the root node has
> csum tree owner, we can let such extent buffer to sneak in.
>
> Shrink the hole by:
>
> - Do extra eb owner check at tree read time
>
> - Make sure the root owner extent buffer exactly match the root id.
>
> Unfortunately we can't yet completely patch the hole, there are several
> call sites can't pass all info we need:
>
> - For reloc/log trees
> Their owner is key::offset, not key::objectid.
> We need the full root key to do that accurate check.
>
> For now, we just skip the ownership check for those trees.
>
> - For add_data_references() of relocation
> That call site doesn't have any parent/ownership info, as all the
> bytenrs are all from btrfs_find_all_leafs().
>
> Thankfully, btrfs_find_all_leafs() still do the ownership check,
> and even for the read_tree_block() caller inside
> add_data_references(), we know that all tree blocks there are
> subvolume tree blocks.
> Just manually convert root_owner 0 to FS_TREE to continue the check.
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/ctree.c | 6 +++++
> fs/btrfs/disk-io.c | 21 +++++++++++++++
> fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
> fs/btrfs/tree-checker.h | 1 +
> 4 files changed, 85 insertions(+)
>
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 0eecf98d0abb..d904fe0973bd 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -16,6 +16,7 @@
> #include "volumes.h"
> #include "qgroup.h"
> #include "tree-mod-log.h"
> +#include "tree-checker.h"
>
> static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
> *root, struct btrfs_path *path, int level);
> @@ -1443,6 +1444,11 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
> btrfs_release_path(p);
> return -EIO;
> }
> + if (btrfs_check_eb_owner(tmp, root->root_key.objectid)) {
> + free_extent_buffer(tmp);
> + btrfs_release_path(p);
> + return -EUCLEAN;
> + }
> *eb_ret = tmp;
> return 0;
> }
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 8165ee3ae8a5..018a230efca5 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1109,6 +1109,10 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
> free_extent_buffer_stale(buf);
> return ERR_PTR(ret);
> }
> + if (btrfs_check_eb_owner(buf, owner_root)) {
> + free_extent_buffer_stale(buf);
> + return ERR_PTR(-EUCLEAN);
> + }
> return buf;
>
> }
> @@ -1548,6 +1552,23 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
> ret = -EIO;
> goto fail;
> }
> +
> + /*
> + * For real fs, and not log/reloc trees, root owner must
> + * match its root node owner
> + */
> + if (!test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state) &&
> + root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
> + root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
> + root->root_key.objectid != btrfs_header_owner(root->node)) {
> + btrfs_crit(fs_info,
> +"root=%llu block=%llu, tree root owner mismatch, have %llu expect %llu",
> + root->root_key.objectid, root->node->start,
> + btrfs_header_owner(root->node),
> + root->root_key.objectid);
> + ret = -EUCLEAN;
> + goto fail;
> + }
> root->commit_root = btrfs_root_node(root);
> return root;
> fail:
> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
> index 2c1072923590..f50bde52f466 100644
> --- a/fs/btrfs/tree-checker.c
> +++ b/fs/btrfs/tree-checker.c
> @@ -1855,3 +1855,60 @@ int btrfs_check_node(struct extent_buffer *node)
> return ret;
> }
> ALLOW_ERROR_INJECTION(btrfs_check_node, ERRNO);
> +
> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner)
> +{
> + bool is_subvol = is_fstree(root_owner);
> + const u64 eb_owner = btrfs_header_owner(eb);
> +
> + /*
> + * Skip dummy fs, as selftest doesn't bother to create unique ebs for
> + * each dummy root.
> + */
> + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &eb->fs_info->fs_state))
> + return 0;
> +
> + /*
> + * Those trees uses key.offset as their owner, our callers don't
> + * have the extra capacity to pass key.offset here.
> + * So we just skip those trees.
> + */
> + if (root_owner == BTRFS_TREE_LOG_OBJECTID ||
> + root_owner == BTRFS_TREE_RELOC_OBJECTID)
> + return 0;
> +
> + /*
> + * This happens for add_data_references() of balance, at that call site
> + * we don't have owner info.
> + * But we know all tree blocks there are subvolume tree blocks.
> + */
> + if (root_owner == 0)
> + is_subvol = true;
> +
> + if (!is_subvol) {
> + /* For non-subvolume trees, the eb owner should match root owner */
> + if (root_owner != eb_owner) {
> + btrfs_crit(eb->fs_info,
> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
> + root_owner, btrfs_header_bytenr(eb), eb_owner,
> + root_owner);
> + return -EUCLEAN;
> + }
> + return 0;
> + }
> +
> + /*
> + * For subvolume trees, owner can mismatch, but they should all belong
> + * to subvolume trees.
> + */
> + if (is_subvol != is_fstree(eb_owner)) {
> + btrfs_crit(eb->fs_info,
> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
> + root_owner, btrfs_header_bytenr(eb), eb_owner,
> + BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
> + return -EUCLEAN;
This causes a failure when using free space cache v1 and doing balance:
BTRFS critical (device sdb): corrupted leaf, root=0 block=12320768
owner mismatch, have 1 expect [256, 18446744073709551360]
It's triggered from add_data_references() (relocation), for root tree
leaves that contain data extents for v1 space caches.
In that case the header owner is 1 (root tree), root_owner is 0, so
is_subvol is set to true, and is_fstree(1) returns false, triggering
the false corruption error here.
Thanks.
> + }
> + return 0;
> +}
> diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
> index 32fecc9dc1dd..c48719485687 100644
> --- a/fs/btrfs/tree-checker.h
> +++ b/fs/btrfs/tree-checker.h
> @@ -25,5 +25,6 @@ int btrfs_check_node(struct extent_buffer *node);
>
> int btrfs_check_chunk_valid(struct extent_buffer *leaf,
> struct btrfs_chunk *chunk, u64 logical);
> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner);
>
> #endif
> --
> 2.35.1
>
--
Filipe David Manana,
“Whether you think you can, or you think you can't — you're right.”
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid
2022-03-04 16:25 ` Filipe Manana
@ 2022-03-04 23:11 ` Qu Wenruo
0 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2022-03-04 23:11 UTC (permalink / raw)
To: fdmanana, Qu Wenruo; +Cc: linux-btrfs
On 2022/3/5 00:25, Filipe Manana wrote:
> On Tue, Feb 22, 2022 at 7:42 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> Btrfs doesn't really check whether the tree block really respects the
>> root owner.
>>
>> This means, if a tree block referred by a parent in extent tree, but has
>> owner of 5, btrfs can still continue reading the tree block, as long as
>> it doesn't trigger other sanity checks.
>>
>> Normally this is fine, but combined with the empty tree check in
>> check_leaf(), if we hit an empty extent tree, but the root node has
>> csum tree owner, we can let such extent buffer to sneak in.
>>
>> Shrink the hole by:
>>
>> - Do extra eb owner check at tree read time
>>
>> - Make sure the root owner extent buffer exactly match the root id.
>>
>> Unfortunately we can't yet completely patch the hole, there are several
>> call sites can't pass all info we need:
>>
>> - For reloc/log trees
>> Their owner is key::offset, not key::objectid.
>> We need the full root key to do that accurate check.
>>
>> For now, we just skip the ownership check for those trees.
>>
>> - For add_data_references() of relocation
>> That call site doesn't have any parent/ownership info, as all the
>> bytenrs are all from btrfs_find_all_leafs().
>>
>> Thankfully, btrfs_find_all_leafs() still do the ownership check,
>> and even for the read_tree_block() caller inside
>> add_data_references(), we know that all tree blocks there are
>> subvolume tree blocks.
>> Just manually convert root_owner 0 to FS_TREE to continue the check.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/ctree.c | 6 +++++
>> fs/btrfs/disk-io.c | 21 +++++++++++++++
>> fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
>> fs/btrfs/tree-checker.h | 1 +
>> 4 files changed, 85 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 0eecf98d0abb..d904fe0973bd 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -16,6 +16,7 @@
>> #include "volumes.h"
>> #include "qgroup.h"
>> #include "tree-mod-log.h"
>> +#include "tree-checker.h"
>>
>> static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
>> *root, struct btrfs_path *path, int level);
>> @@ -1443,6 +1444,11 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
>> btrfs_release_path(p);
>> return -EIO;
>> }
>> + if (btrfs_check_eb_owner(tmp, root->root_key.objectid)) {
>> + free_extent_buffer(tmp);
>> + btrfs_release_path(p);
>> + return -EUCLEAN;
>> + }
>> *eb_ret = tmp;
>> return 0;
>> }
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 8165ee3ae8a5..018a230efca5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1109,6 +1109,10 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
>> free_extent_buffer_stale(buf);
>> return ERR_PTR(ret);
>> }
>> + if (btrfs_check_eb_owner(buf, owner_root)) {
>> + free_extent_buffer_stale(buf);
>> + return ERR_PTR(-EUCLEAN);
>> + }
>> return buf;
>>
>> }
>> @@ -1548,6 +1552,23 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
>> ret = -EIO;
>> goto fail;
>> }
>> +
>> + /*
>> + * For real fs, and not log/reloc trees, root owner must
>> + * match its root node owner
>> + */
>> + if (!test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state) &&
>> + root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
>> + root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
>> + root->root_key.objectid != btrfs_header_owner(root->node)) {
>> + btrfs_crit(fs_info,
>> +"root=%llu block=%llu, tree root owner mismatch, have %llu expect %llu",
>> + root->root_key.objectid, root->node->start,
>> + btrfs_header_owner(root->node),
>> + root->root_key.objectid);
>> + ret = -EUCLEAN;
>> + goto fail;
>> + }
>> root->commit_root = btrfs_root_node(root);
>> return root;
>> fail:
>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>> index 2c1072923590..f50bde52f466 100644
>> --- a/fs/btrfs/tree-checker.c
>> +++ b/fs/btrfs/tree-checker.c
>> @@ -1855,3 +1855,60 @@ int btrfs_check_node(struct extent_buffer *node)
>> return ret;
>> }
>> ALLOW_ERROR_INJECTION(btrfs_check_node, ERRNO);
>> +
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner)
>> +{
>> + bool is_subvol = is_fstree(root_owner);
>> + const u64 eb_owner = btrfs_header_owner(eb);
>> +
>> + /*
>> + * Skip dummy fs, as selftest doesn't bother to create unique ebs for
>> + * each dummy root.
>> + */
>> + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &eb->fs_info->fs_state))
>> + return 0;
>> +
>> + /*
>> + * Those trees uses key.offset as their owner, our callers don't
>> + * have the extra capacity to pass key.offset here.
>> + * So we just skip those trees.
>> + */
>> + if (root_owner == BTRFS_TREE_LOG_OBJECTID ||
>> + root_owner == BTRFS_TREE_RELOC_OBJECTID)
>> + return 0;
>> +
>> + /*
>> + * This happens for add_data_references() of balance, at that call site
>> + * we don't have owner info.
>> + * But we know all tree blocks there are subvolume tree blocks.
>> + */
>> + if (root_owner == 0)
>> + is_subvol = true;
>> +
>> + if (!is_subvol) {
>> + /* For non-subvolume trees, the eb owner should match root owner */
>> + if (root_owner != eb_owner) {
>> + btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
>> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> + root_owner, btrfs_header_bytenr(eb), eb_owner,
>> + root_owner);
>> + return -EUCLEAN;
>> + }
>> + return 0;
>> + }
>> +
>> + /*
>> + * For subvolume trees, owner can mismatch, but they should all belong
>> + * to subvolume trees.
>> + */
>> + if (is_subvol != is_fstree(eb_owner)) {
>> + btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
>> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> + root_owner, btrfs_header_bytenr(eb), eb_owner,
>> + BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
>> + return -EUCLEAN;
>
> This causes a failure when using free space cache v1 and doing balance:
Oh, no wonder why my default fstests run passed.
>
> BTRFS critical (device sdb): corrupted leaf, root=0 block=12320768
> owner mismatch, have 1 expect [256, 18446744073709551360]
>
> It's triggered from add_data_references() (relocation), for root tree
> leaves that contain data extents for v1 space caches.
> In that case the header owner is 1 (root tree), root_owner is 0, so
> is_subvol is set to true, and is_fstree(1) returns false, triggering
> the false corruption error here.
Right, the root_owner 0 is from relocation call site which doesn't have
the owner info, and since it's for data, we default to is_subvol = true,
and caused the problem.
It's much harder to distinguish data used by v1 cache in that context.
So please drop the patch for now.
Thanks for the report,
Qu
>
> Thanks.
>
>> + }
>> + return 0;
>> +}
>> diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
>> index 32fecc9dc1dd..c48719485687 100644
>> --- a/fs/btrfs/tree-checker.h
>> +++ b/fs/btrfs/tree-checker.h
>> @@ -25,5 +25,6 @@ int btrfs_check_node(struct extent_buffer *node);
>>
>> int btrfs_check_chunk_valid(struct extent_buffer *leaf,
>> struct btrfs_chunk *chunk, u64 logical);
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner);
>>
>> #endif
>> --
>> 2.35.1
>>
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-03-04 23:12 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
2022-02-22 7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
2022-02-22 7:41 ` [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer() Qu Wenruo
2022-02-22 7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
2022-03-04 16:25 ` Filipe Manana
2022-03-04 23:11 ` Qu Wenruo
2022-03-01 21:45 ` [PATCH 0/3] btrfs: add parent-child ownership check David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox