From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: fdmanana@gmail.com, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid
Date: Sat, 5 Mar 2022 07:11:55 +0800 [thread overview]
Message-ID: <068c1331-e7b1-dce2-2cda-a8d2df4603ce@gmx.com> (raw)
In-Reply-To: <CAL3q7H5CsrCN61zUY1-Vf5aEb6uJu9u2MdNSZtP1aEmyYi1TQA@mail.gmail.com>
On 2022/3/5 00:25, Filipe Manana wrote:
> On Tue, Feb 22, 2022 at 7:42 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> Btrfs doesn't really check whether the tree block really respects the
>> root owner.
>>
>> This means, if a tree block referred by a parent in extent tree, but has
>> owner of 5, btrfs can still continue reading the tree block, as long as
>> it doesn't trigger other sanity checks.
>>
>> Normally this is fine, but combined with the empty tree check in
>> check_leaf(), if we hit an empty extent tree, but the root node has
>> csum tree owner, we can let such extent buffer to sneak in.
>>
>> Shrink the hole by:
>>
>> - Do extra eb owner check at tree read time
>>
>> - Make sure the root owner extent buffer exactly match the root id.
>>
>> Unfortunately we can't yet completely patch the hole, there are several
>> call sites can't pass all info we need:
>>
>> - For reloc/log trees
>> Their owner is key::offset, not key::objectid.
>> We need the full root key to do that accurate check.
>>
>> For now, we just skip the ownership check for those trees.
>>
>> - For add_data_references() of relocation
>> That call site doesn't have any parent/ownership info, as all the
>> bytenrs are all from btrfs_find_all_leafs().
>>
>> Thankfully, btrfs_find_all_leafs() still do the ownership check,
>> and even for the read_tree_block() caller inside
>> add_data_references(), we know that all tree blocks there are
>> subvolume tree blocks.
>> Just manually convert root_owner 0 to FS_TREE to continue the check.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/ctree.c | 6 +++++
>> fs/btrfs/disk-io.c | 21 +++++++++++++++
>> fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
>> fs/btrfs/tree-checker.h | 1 +
>> 4 files changed, 85 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 0eecf98d0abb..d904fe0973bd 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -16,6 +16,7 @@
>> #include "volumes.h"
>> #include "qgroup.h"
>> #include "tree-mod-log.h"
>> +#include "tree-checker.h"
>>
>> static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
>> *root, struct btrfs_path *path, int level);
>> @@ -1443,6 +1444,11 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
>> btrfs_release_path(p);
>> return -EIO;
>> }
>> + if (btrfs_check_eb_owner(tmp, root->root_key.objectid)) {
>> + free_extent_buffer(tmp);
>> + btrfs_release_path(p);
>> + return -EUCLEAN;
>> + }
>> *eb_ret = tmp;
>> return 0;
>> }
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 8165ee3ae8a5..018a230efca5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1109,6 +1109,10 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
>> free_extent_buffer_stale(buf);
>> return ERR_PTR(ret);
>> }
>> + if (btrfs_check_eb_owner(buf, owner_root)) {
>> + free_extent_buffer_stale(buf);
>> + return ERR_PTR(-EUCLEAN);
>> + }
>> return buf;
>>
>> }
>> @@ -1548,6 +1552,23 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
>> ret = -EIO;
>> goto fail;
>> }
>> +
>> + /*
>> + * For real fs, and not log/reloc trees, root owner must
>> + * match its root node owner
>> + */
>> + if (!test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state) &&
>> + root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
>> + root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
>> + root->root_key.objectid != btrfs_header_owner(root->node)) {
>> + btrfs_crit(fs_info,
>> +"root=%llu block=%llu, tree root owner mismatch, have %llu expect %llu",
>> + root->root_key.objectid, root->node->start,
>> + btrfs_header_owner(root->node),
>> + root->root_key.objectid);
>> + ret = -EUCLEAN;
>> + goto fail;
>> + }
>> root->commit_root = btrfs_root_node(root);
>> return root;
>> fail:
>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>> index 2c1072923590..f50bde52f466 100644
>> --- a/fs/btrfs/tree-checker.c
>> +++ b/fs/btrfs/tree-checker.c
>> @@ -1855,3 +1855,60 @@ int btrfs_check_node(struct extent_buffer *node)
>> return ret;
>> }
>> ALLOW_ERROR_INJECTION(btrfs_check_node, ERRNO);
>> +
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner)
>> +{
>> + bool is_subvol = is_fstree(root_owner);
>> + const u64 eb_owner = btrfs_header_owner(eb);
>> +
>> + /*
>> + * Skip dummy fs, as selftest doesn't bother to create unique ebs for
>> + * each dummy root.
>> + */
>> + if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &eb->fs_info->fs_state))
>> + return 0;
>> +
>> + /*
>> + * Those trees uses key.offset as their owner, our callers don't
>> + * have the extra capacity to pass key.offset here.
>> + * So we just skip those trees.
>> + */
>> + if (root_owner == BTRFS_TREE_LOG_OBJECTID ||
>> + root_owner == BTRFS_TREE_RELOC_OBJECTID)
>> + return 0;
>> +
>> + /*
>> + * This happens for add_data_references() of balance, at that call site
>> + * we don't have owner info.
>> + * But we know all tree blocks there are subvolume tree blocks.
>> + */
>> + if (root_owner == 0)
>> + is_subvol = true;
>> +
>> + if (!is_subvol) {
>> + /* For non-subvolume trees, the eb owner should match root owner */
>> + if (root_owner != eb_owner) {
>> + btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
>> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> + root_owner, btrfs_header_bytenr(eb), eb_owner,
>> + root_owner);
>> + return -EUCLEAN;
>> + }
>> + return 0;
>> + }
>> +
>> + /*
>> + * For subvolume trees, owner can mismatch, but they should all belong
>> + * to subvolume trees.
>> + */
>> + if (is_subvol != is_fstree(eb_owner)) {
>> + btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
>> + btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> + root_owner, btrfs_header_bytenr(eb), eb_owner,
>> + BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
>> + return -EUCLEAN;
>
> This causes a failure when using free space cache v1 and doing balance:
Oh, no wonder why my default fstests run passed.
>
> BTRFS critical (device sdb): corrupted leaf, root=0 block=12320768
> owner mismatch, have 1 expect [256, 18446744073709551360]
>
> It's triggered from add_data_references() (relocation), for root tree
> leaves that contain data extents for v1 space caches.
> In that case the header owner is 1 (root tree), root_owner is 0, so
> is_subvol is set to true, and is_fstree(1) returns false, triggering
> the false corruption error here.
Right, the root_owner 0 is from relocation call site which doesn't have
the owner info, and since it's for data, we default to is_subvol = true,
and caused the problem.
It's much harder to distinguish data used by v1 cache in that context.
So please drop the patch for now.
Thanks for the report,
Qu
>
> Thanks.
>
>> + }
>> + return 0;
>> +}
>> diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
>> index 32fecc9dc1dd..c48719485687 100644
>> --- a/fs/btrfs/tree-checker.h
>> +++ b/fs/btrfs/tree-checker.h
>> @@ -25,5 +25,6 @@ int btrfs_check_node(struct extent_buffer *node);
>>
>> int btrfs_check_chunk_valid(struct extent_buffer *leaf,
>> struct btrfs_chunk *chunk, u64 logical);
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner);
>>
>> #endif
>> --
>> 2.35.1
>>
>
>
next prev parent reply other threads:[~2022-03-04 23:12 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-22 7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
2022-02-22 7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
2022-02-22 7:41 ` [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer() Qu Wenruo
2022-02-22 7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
2022-03-04 16:25 ` Filipe Manana
2022-03-04 23:11 ` Qu Wenruo [this message]
2022-03-01 21:45 ` [PATCH 0/3] btrfs: add parent-child ownership check David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=068c1331-e7b1-dce2-2cda-a8d2df4603ce@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=fdmanana@gmail.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox