public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: fdmanana@gmail.com, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid
Date: Sat, 5 Mar 2022 07:11:55 +0800	[thread overview]
Message-ID: <068c1331-e7b1-dce2-2cda-a8d2df4603ce@gmx.com> (raw)
In-Reply-To: <CAL3q7H5CsrCN61zUY1-Vf5aEb6uJu9u2MdNSZtP1aEmyYi1TQA@mail.gmail.com>



On 2022/3/5 00:25, Filipe Manana wrote:
> On Tue, Feb 22, 2022 at 7:42 AM Qu Wenruo <wqu@suse.com> wrote:
>>
>> Btrfs doesn't really check whether the tree block really respects the
>> root owner.
>>
>> This means, if a tree block referred by a parent in extent tree, but has
>> owner of 5, btrfs can still continue reading the tree block, as long as
>> it doesn't trigger other sanity checks.
>>
>> Normally this is fine, but combined with the empty tree check in
>> check_leaf(), if we hit an empty extent tree, but the root node has
>> csum tree owner, we can let such extent buffer to sneak in.
>>
>> Shrink the hole by:
>>
>> - Do extra eb owner check at tree read time
>>
>> - Make sure the root owner extent buffer exactly match the root id.
>>
>> Unfortunately we can't yet completely patch the hole, there are several
>> call sites can't pass all info we need:
>>
>> - For reloc/log trees
>>    Their owner is key::offset, not key::objectid.
>>    We need the full root key to do that accurate check.
>>
>>    For now, we just skip the ownership check for those trees.
>>
>> - For add_data_references() of relocation
>>    That call site doesn't have any parent/ownership info, as all the
>>    bytenrs are all from btrfs_find_all_leafs().
>>
>>    Thankfully, btrfs_find_all_leafs() still do the ownership check,
>>    and even for the read_tree_block() caller inside
>>    add_data_references(), we know that all tree blocks there are
>>    subvolume tree blocks.
>>    Just manually convert root_owner 0 to FS_TREE to continue the check.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/ctree.c        |  6 +++++
>>   fs/btrfs/disk-io.c      | 21 +++++++++++++++
>>   fs/btrfs/tree-checker.c | 57 +++++++++++++++++++++++++++++++++++++++++
>>   fs/btrfs/tree-checker.h |  1 +
>>   4 files changed, 85 insertions(+)
>>
>> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
>> index 0eecf98d0abb..d904fe0973bd 100644
>> --- a/fs/btrfs/ctree.c
>> +++ b/fs/btrfs/ctree.c
>> @@ -16,6 +16,7 @@
>>   #include "volumes.h"
>>   #include "qgroup.h"
>>   #include "tree-mod-log.h"
>> +#include "tree-checker.h"
>>
>>   static int split_node(struct btrfs_trans_handle *trans, struct btrfs_root
>>                        *root, struct btrfs_path *path, int level);
>> @@ -1443,6 +1444,11 @@ read_block_for_search(struct btrfs_root *root, struct btrfs_path *p,
>>                          btrfs_release_path(p);
>>                          return -EIO;
>>                  }
>> +               if (btrfs_check_eb_owner(tmp, root->root_key.objectid)) {
>> +                       free_extent_buffer(tmp);
>> +                       btrfs_release_path(p);
>> +                       return -EUCLEAN;
>> +               }
>>                  *eb_ret = tmp;
>>                  return 0;
>>          }
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 8165ee3ae8a5..018a230efca5 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -1109,6 +1109,10 @@ struct extent_buffer *read_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr,
>>                  free_extent_buffer_stale(buf);
>>                  return ERR_PTR(ret);
>>          }
>> +       if (btrfs_check_eb_owner(buf, owner_root)) {
>> +               free_extent_buffer_stale(buf);
>> +               return ERR_PTR(-EUCLEAN);
>> +       }
>>          return buf;
>>
>>   }
>> @@ -1548,6 +1552,23 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
>>                  ret = -EIO;
>>                  goto fail;
>>          }
>> +
>> +       /*
>> +        * For real fs, and not log/reloc trees, root owner must
>> +        * match its root node owner
>> +        */
>> +       if (!test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state) &&
>> +           root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID &&
>> +           root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID &&
>> +           root->root_key.objectid != btrfs_header_owner(root->node)) {
>> +               btrfs_crit(fs_info,
>> +"root=%llu block=%llu, tree root owner mismatch, have %llu expect %llu",
>> +                          root->root_key.objectid, root->node->start,
>> +                          btrfs_header_owner(root->node),
>> +                          root->root_key.objectid);
>> +               ret = -EUCLEAN;
>> +               goto fail;
>> +       }
>>          root->commit_root = btrfs_root_node(root);
>>          return root;
>>   fail:
>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>> index 2c1072923590..f50bde52f466 100644
>> --- a/fs/btrfs/tree-checker.c
>> +++ b/fs/btrfs/tree-checker.c
>> @@ -1855,3 +1855,60 @@ int btrfs_check_node(struct extent_buffer *node)
>>          return ret;
>>   }
>>   ALLOW_ERROR_INJECTION(btrfs_check_node, ERRNO);
>> +
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner)
>> +{
>> +       bool is_subvol = is_fstree(root_owner);
>> +       const u64 eb_owner = btrfs_header_owner(eb);
>> +
>> +       /*
>> +        * Skip dummy fs, as selftest doesn't bother to create unique ebs for
>> +        * each dummy root.
>> +        */
>> +       if (test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &eb->fs_info->fs_state))
>> +               return 0;
>> +
>> +       /*
>> +        * Those trees uses key.offset as their owner, our callers don't
>> +        * have the extra capacity to pass key.offset here.
>> +        * So we just skip those trees.
>> +        */
>> +       if (root_owner == BTRFS_TREE_LOG_OBJECTID ||
>> +           root_owner == BTRFS_TREE_RELOC_OBJECTID)
>> +               return 0;
>> +
>> +       /*
>> +        * This happens for add_data_references() of balance, at that call site
>> +        * we don't have owner info.
>> +        * But we know all tree blocks there are subvolume tree blocks.
>> +        */
>> +       if (root_owner == 0)
>> +               is_subvol = true;
>> +
>> +       if (!is_subvol) {
>> +               /* For non-subvolume trees, the eb owner should match root owner */
>> +               if (root_owner != eb_owner) {
>> +                       btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect %llu",
>> +                               btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> +                               root_owner, btrfs_header_bytenr(eb), eb_owner,
>> +                               root_owner);
>> +                       return -EUCLEAN;
>> +               }
>> +               return 0;
>> +       }
>> +
>> +       /*
>> +        * For subvolume trees, owner can mismatch, but they should all belong
>> +        * to subvolume trees.
>> +        */
>> +       if (is_subvol != is_fstree(eb_owner)) {
>> +               btrfs_crit(eb->fs_info,
>> +"corrupted %s, root=%llu block=%llu owner mismatch, have %llu expect [%llu, %llu]",
>> +                       btrfs_header_level(eb) == 0 ? "leaf" : "node",
>> +                       root_owner, btrfs_header_bytenr(eb), eb_owner,
>> +                       BTRFS_FIRST_FREE_OBJECTID, BTRFS_LAST_FREE_OBJECTID);
>> +               return -EUCLEAN;
>
> This causes a failure when using free space cache v1 and doing balance:

Oh, no wonder why my default fstests run passed.

>
> BTRFS critical (device sdb): corrupted leaf, root=0 block=12320768
> owner mismatch, have 1 expect [256, 18446744073709551360]
>
> It's triggered from add_data_references() (relocation), for root tree
> leaves that contain data extents for v1 space caches.
> In that case the header owner is 1 (root tree), root_owner is 0, so
> is_subvol is set to true, and is_fstree(1) returns false, triggering
> the false corruption error here.

Right, the root_owner 0 is from relocation call site which doesn't have
the owner info, and since it's for data, we default to is_subvol = true,
and caused the problem.

It's much harder to distinguish data used by v1 cache in that context.

So please drop the patch for now.

Thanks for the report,
Qu

>
> Thanks.
>
>> +       }
>> +       return 0;
>> +}
>> diff --git a/fs/btrfs/tree-checker.h b/fs/btrfs/tree-checker.h
>> index 32fecc9dc1dd..c48719485687 100644
>> --- a/fs/btrfs/tree-checker.h
>> +++ b/fs/btrfs/tree-checker.h
>> @@ -25,5 +25,6 @@ int btrfs_check_node(struct extent_buffer *node);
>>
>>   int btrfs_check_chunk_valid(struct extent_buffer *leaf,
>>                              struct btrfs_chunk *chunk, u64 logical);
>> +int btrfs_check_eb_owner(struct extent_buffer *eb, u64 root_owner);
>>
>>   #endif
>> --
>> 2.35.1
>>
>
>

  reply	other threads:[~2022-03-04 23:12 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-22  7:41 [PATCH 0/3] btrfs: add parent-child ownership check Qu Wenruo
2022-02-22  7:41 ` [PATCH 1/3] btrfs: unify the error handling pattern for read_tree_block() Qu Wenruo
2022-02-22  7:41 ` [PATCH 2/3] btrfs: unify the error handling of btrfs_read_buffer() Qu Wenruo
2022-02-22  7:41 ` [PATCH 3/3] btrfs: check extent buffer owner against the owner rootid Qu Wenruo
2022-03-04 16:25   ` Filipe Manana
2022-03-04 23:11     ` Qu Wenruo [this message]
2022-03-01 21:45 ` [PATCH 0/3] btrfs: add parent-child ownership check David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=068c1331-e7b1-dce2-2cda-a8d2df4603ce@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=fdmanana@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox