linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Konstantinos Skarlatos <k.skarlatos@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs-progs: fix failed resume due to bad search
Date: Sat, 30 Sep 2023 16:01:39 +0300	[thread overview]
Message-ID: <35bc07b9-2e6c-0848-661c-8e991cb5479a@gmail.com> (raw)
In-Reply-To: <cd390b15-7d42-41be-afc6-22c6fa22671d@gmx.com>


On 30/9/2023 1:25 π.μ., Qu Wenruo wrote:
>
>
> On 2023/9/29 21:01, Konstantinos Skarlatos wrote:
>> On 29/09/2023 1:54 μ.μ., Qu Wenruo wrote:
>>>
>>>
>>> On 2023/9/29 19:03, Konstantinos Skarlatos wrote:
>>>> On 29/09/2023 12:56 π.μ., Qu Wenruo wrote:
>>>>>
>>>>>
>>>>> On 2023/9/29 00:23, Konstantinos Skarlatos wrote:
>>>>>> Hi Qu, thanks for your patch. I just tried it on a clean btrfs-progs
>>>>>> tree with my filesystem:
>>>>>>
>>>>>> ❯ ./btrfstune --convert-to-block-group-tree /dev/sda
>>>>>> [1]    796483 segmentation fault (core dumped) ./btrfstune
>>>>>> --convert-to-block-group-tree /dev/sda
>>>>>
>>>>> Mind to enable debug build by "make D=1" for btrfs-progs, and go with
>>>>> gdb to show the crash callback?
>>>>>
>>>>> I assume it could be the same crash from your initial report.
>>>>>
>>>>> Thanks,
>>>>> Qu
>>>>>
>>>>
>>>> Hi Qu, i hope i am doing this correctly...
>>>>
>>>>
>>>> ❯ gdb -ex=r --args ./btrfstune --convert-to-block-group-tree /dev/sda
>>>> GNU gdb (GDB) 13.2
>>>> Copyright (C) 2023 Free Software Foundation, Inc.
>>>> License GPLv3+: GNU GPL version 3 or later
>>>> <http://gnu.org/licenses/gpl.html>
>>>> This is free software: you are free to change and redistribute it.
>>>> There is NO WARRANTY, to the extent permitted by law.
>>>> Type "show copying" and "show warranty" for details.
>>>> This GDB was configured as "x86_64-pc-linux-gnu".
>>>> Type "show configuration" for configuration details.
>>>> For bug reporting instructions, please see:
>>>> <https://www.gnu.org/software/gdb/bugs/>.
>>>> Find the GDB manual and other documentation resources online at:
>>>> <http://www.gnu.org/software/gdb/documentation/>.
>>>>
>>>> For help, type "help".
>>>> Type "apropos word" to search for commands related to "word"...
>>>> Reading symbols from ./btrfstune...
>>>> Starting program: /root/btrfs-progs/btrfstune
>>>> --convert-to-block-group-tree /dev/sda
>>>>
>>>> This GDB supports auto-downloading debuginfo from the following URLs:
>>>>     <https://debuginfod.archlinux.org>
>>>> Enable debuginfod for this session? (y or [n]) y
>>>> Debuginfod has been enabled.
>>>> To make this setting permanent, add 'set debuginfod enabled on' to
>>>> .gdbinit.
>>>> Downloading separate debug info for /lib64/ld-linux-x86-64.so.2
>>>> Downloading separate debug info for system-supplied DSO at
>>>> 0x7ffff7fc8000
>>>> Downloading separate debug info for /usr/lib/libuuid.so.1
>>>> Downloading separate debug info for /usr/lib/libblkid.so.1
>>>> Downloading separate debug info for /usr/lib/libudev.so.1
>>>> Downloading separate debug info for /usr/lib/libc.so.6
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/usr/lib/libthread_db.so.1".
>>>> Downloading separate debug info for /usr/lib/libcap.so.2
>>>>
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>> 0x00005555555c6600 in cache_tree_comp_range (node=0xffffffffff000f0f,
>>>> data=0x7fffffffd780) at common/extent-cache.c:40
>>>> 40              if (entry->start + entry->size <= range->start)
>>>> (gdb)
>>>
>>> That's great, what about the output from "bt" command?
>>>
>>
>> Hi Qu, this is what i get:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00005555555c6600 in cache_tree_comp_range (node=0xffffffffff000f0f,
>> data=0x7fffffffd780) at common/extent-cache.c:40
>> 40              if (entry->start + entry->size <= range->start)
>> (gdb) bt
>> #0  0x00005555555c6600 in cache_tree_comp_range
>> (node=0xffffffffff000f0f, data=0x7fffffffd780) at 
>> common/extent-cache.c:40
>> #1  0x00005555555ce458 in rb_search (root=0x555555667358,
>> key=0x7fffffffd780, comp=0x5555555c65d8 <cache_tree_comp_range>,
>> next_ret=0x0) at common/rbtree-utils.c:59
>
> Indeed we're back to the initial crash you reported.
>
> The root cause is the @node pointer, which is not valid.
>
> I'm not sure why, but we got an invalid node inside the rb tree, which
> leads to the crash.
>
> Considering this is a rare one, I'll need more time to dig into the
> original crash.
> I'm afraid with this bug still here, we will not be able to do anything,
> neither reverting back or continuing the conversion.
>
> Thanks,
> Qu

Hello Qu,

does that mean that the filesystem is not recoverable?

I can keep it for a while if you need it for further debugging.

Kind regards,


>> #2  0x00005555555c6964 in lookup_cache_extent (tree=0x555555667358,
>> start=17820099854336, size=16384) at common/extent-cache.c:145
>> #3  0x0000555555597cac in alloc_extent_buffer (fs_info=0x5555556672f0,
>> bytenr=17820099854336, blocksize=16384) at kernel-shared/extent_io.c:262
>> #4  0x000055555558253b in btrfs_find_create_tree_block
>> (fs_info=0x5555556672f0, bytenr=17820099854336) at
>> kernel-shared/disk-io.c:232
>> #5  0x0000555555582cdb in read_tree_block (fs_info=0x5555556672f0,
>> bytenr=17820099854336, owner_root=10, parent_transid=1833250, level=0,
>> first_key=0x0)
>>      at kernel-shared/disk-io.c:439
>> #6  0x00005555555776bd in read_node_slot (fs_info=0x5555556672f0,
>> parent=0x55557571ba60, slot=154) at kernel-shared/ctree.c:850
>> #7  0x000055555557a644 in push_leaf_right (trans=0x555567b3d140,
>> root=0x5555558da740, path=0x5555757f0d30, data_size=25, empty=0) at
>> kernel-shared/ctree.c:1965
>> #8  0x000055555557b7d1 in split_leaf (trans=0x555567b3d140,
>> root=0x5555558da740, ins_key=0x7fffffffdc50, path=0x5555757f0d30,
>> data_size=25, extend=0)
>>      at kernel-shared/ctree.c:2338
>> #9  0x0000555555578fc0 in btrfs_search_slot (trans=0x555567b3d140,
>> root=0x5555558da740, key=0x7fffffffdc50, p=0x5555757f0d30, ins_len=25,
>> cow=1) at kernel-shared/ctree.c:1399
>> #10 0x000055555557ca7d in btrfs_insert_empty_items
>> (trans=0x555567b3d140, root=0x5555558da740, path=0x5555757f0d30,
>> cpu_key=0x7fffffffdc50, data_size=0x7fffffffdb6c, nr=1)
>>      at kernel-shared/ctree.c:2758
>> #11 0x000055555559eb70 in btrfs_insert_empty_item (trans=0x555567b3d140,
>> root=0x5555558da740, path=0x5555757f0d30, key=0x7fffffffdc50, 
>> data_size=0)
>>      at ./kernel-shared/ctree.h:1019
>> #12 0x000055555559ff9a in convert_free_space_to_extents
>> (trans=0x555567b3d140, block_group=0x55555fcc7780, path=0x5555757f0d30)
>> at kernel-shared/free-space-tree.c:465
>> #13 0x00005555555a01ed in update_free_space_extent_count
>> (trans=0x555567b3d140, block_group=0x55555fcc7780, path=0x5555757f0d30,
>> new_extents=1)
>>      at kernel-shared/free-space-tree.c:525
>> #14 0x00005555555a07ae in modify_free_space_bitmap
>> (trans=0x555567b3d140, block_group=0x55555fcc7780, path=0x5555757f0d30,
>> start=17592699748352, size=16384, remove=0)
>>      at kernel-shared/free-space-tree.c:707
>> #15 0x00005555555a0ff2 in __add_to_free_space_tree
>> (trans=0x555567b3d140, block_group=0x55555fcc7780, path=0x5555757f0d30,
>> start=17592699748352, size=16384)
>>      at kernel-shared/free-space-tree.c:997
>> #16 0x00005555555a10ad in add_to_free_space_tree (trans=0x555567b3d140,
>> start=17592699748352, size=16384) at 
>> kernel-shared/free-space-tree.c:1028
>> #17 0x0000555555591db8 in __free_extent (trans=0x555567b3d140,
>> bytenr=17592699748352, num_bytes=16384, parent=0, root_objectid=2,
>> owner_objectid=0, owner_offset=0,
>>      refs_to_drop=1) at kernel-shared/extent-tree.c:2130
>> #18 0x000055555559651c in run_delayed_tree_ref (trans=0x555567b3d140,
>> fs_info=0x5555556672f0, node=0x55556bf11090, extent_op=0x0,
>> insert_reserved=0)
>>      at kernel-shared/extent-tree.c:3906
>> #19 0x00005555555965b1 in run_one_delayed_ref (trans=0x555567b3d140,
>> fs_info=0x5555556672f0, node=0x55556bf11090, extent_op=0x0,
>> insert_reserved=0)
>>      at kernel-shared/extent-tree.c:3926
>> #20 0x00005555555967de in btrfs_run_delayed_refs (trans=0x555567b3d140,
>> nr=18446744073709551615) at kernel-shared/extent-tree.c:4010
>> #21 0x00005555555afbd1 in btrfs_commit_transaction
>> (trans=0x555567b3d140, root=0x55555567bd70) at
>> kernel-shared/transaction.c:210
>> #22 0x0000555555566b19 in convert_to_bg_tree (fs_info=0x5555556672f0) at
>> tune/convert-bgt.c:112
>> #23 0x00005555555647bb in main (argc=3, argv=0x7fffffffe298) at
>> tune/main.c:393
>> (gdb)
>>
>>
>>
>>> Thanks,
>>> Qu
>>>>
>>>>
>>>>
>>>>
>>>>>>
>>>>>> Sep 28 17:46:17 elsinki kernel: btrfstune[796483]: segfault at
>>>>>> ffffffffff000f2f ip 0000564b6c2107aa sp 00007ffdc8ad25c8 error 5 in
>>>>>> btrfstune[564b6c1d1000+5b000] likely on CPU 3 (core 2, socket 0)
>>>>>> Sep 28 17:46:17 elsinki kernel: Code: ff 48 8b 34 24 48 8d 3d 5a 
>>>>>> d8 01
>>>>>> 00 b8 00 00 00 00 e8 5a 37 00 00 48 8b 33 bf 0a 00 00 00 e8 1d 0c
>>>>>> fc ff
>>>>>> eb a8 e8 86 0a fc ff <48> 8b 4f 20 48 8b 56 08 48 89 c8 48 03 47 
>>>>>> 28 48
>>>>>> 89 c7 b8 01 00 00
>>>>>> Sep 28 17:46:17 elsinki systemd[1]: Created slice Slice
>>>>>> /system/systemd-coredump.
>>>>>> Sep 28 17:46:17 elsinki systemd[1]: Started Process Core Dump (PID
>>>>>> 796493/UID 0).
>>>>>> Sep 28 17:46:21 elsinki systemd-coredump[796494]: [🡕] Process 
>>>>>> 796483
>>>>>> (btrfstune) of user 0 dumped core.
>>>>>>
>>>>>> Stack trace of
>>>>>> thread
>>>>>> 796483:
>>>>>>                                                      #0
>>>>>> 0x0000564b6c2107aa
>>>>>> n/a (/root/btrfs-progs/btrfstune + 0x4d7aa)
>>>>>>                                                      ELF object 
>>>>>> binary
>>>>>> architecture: AMD x86-64
>>>>>> Sep 28 17:46:21 elsinki systemd[1]:
>>>>>> systemd-coredump@0-796493-0.service:
>>>>>> Deactivated successfully.
>>>>>> Sep 28 17:46:21 elsinki systemd[1]:
>>>>>> systemd-coredump@0-796493-0.service:
>>>>>> Consumed 4.248s CPU time.
>>>>>>
>>>>>>
>>>>>> On 28/9/2023 1:50 π.μ., Qu Wenruo wrote:
>>>>>>> [BUG]
>>>>>>> There is a bug report that when converting to bg tree crashed, the
>>>>>>> resulted fs is unable to be resumed.
>>>>>>>
>>>>>>> This problems comes with the following error messages:
>>>>>>>
>>>>>>>      ./btrfstune --convert-to-block-group-tree /dev/sda
>>>>>>>      ERROR: Corrupted fs, no valid METADATA block group found
>>>>>>>      ERROR: failed to delete block group item from the old root: 
>>>>>>> -117
>>>>>>>      ERROR: failed to convert the filesystem to block group tree
>>>>>>> feature
>>>>>>>      ERROR: btrfstune failed
>>>>>>>      extent buffer leak: start 17825576632320 len 16384
>>>>>>>
>>>>>>> [CAUSE]
>>>>>>> When resuming a interrupted conversion, we go through
>>>>>>> read_converting_block_groups() to handle block group items in both
>>>>>>> extent and block group trees.
>>>>>>>
>>>>>>> However for the block group items in the extent tree, there are
>>>>>>> several
>>>>>>> problems involved:
>>>>>>>
>>>>>>> - Uninitialized @key inside read_old_block_groups_from_root()
>>>>>>>      Here we only set @key.type, not setting @key.objectid for the
>>>>>>> initial
>>>>>>>      search.
>>>>>>>
>>>>>>>      Thus if we're unlukcy, we can got (u64)-1 as key.objectid, and
>>>>>>> exit
>>>>>>>      the search immediately.
>>>>>>>
>>>>>>> - Wrong search direction
>>>>>>>      The conversion is converting block groups in descending order,
>>>>>>> but the
>>>>>>>      block groups read is in ascending order.
>>>>>>>      Meaning if we start from the last converted block group, we
>>>>>>> would at
>>>>>>>      most read one block group.
>>>>>>>
>>>>>>> [FIX]
>>>>>>> To fix the problems, this patch would just remove
>>>>>>> read_old_block_groups_from_root() function completely.
>>>>>>>
>>>>>>> As for the conversion, we ensured the block group item is either
>>>>>>> in the
>>>>>>> old extent or the new block group tree.
>>>>>>> Thus there is no special handling needed reading block groups.
>>>>>>>
>>>>>>> We only need to read all block groups from both trees, using the 
>>>>>>> same
>>>>>>> read_old_block_groups_from_root() function.
>>>>>>>
>>>>>>> Reported-by: Konstantinos Skarlatos <k.skarlatos@gmail.com>
>>>>>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>>>>>> ---
>>>>>>> To Konstantinos:
>>>>>>>
>>>>>>> The bug I fixed here can explain all the failures you hit (the
>>>>>>> initial
>>>>>>> one and the one after the quick diff).
>>>>>>>
>>>>>>> Please verify if this helps and report back (without the quick
>>>>>>> diff in
>>>>>>> the original thread).
>>>>>>>
>>>>>>> We may have other corner cases to go, but I believe the patch
>>>>>>> itself is
>>>>>>> necessary no matter what, as the deleted code is really
>>>>>>> over-engineered and buggy.
>>>>>>> ---
>>>>>>>     kernel-shared/extent-tree.c | 79
>>>>>>> +------------------------------------
>>>>>>>     1 file changed, 1 insertion(+), 78 deletions(-)
>>>>>>>
>>>>>>> diff --git a/kernel-shared/extent-tree.c
>>>>>>> b/kernel-shared/extent-tree.c
>>>>>>> index 7022643a9843..4d6bf2b228e9 100644
>>>>>>> --- a/kernel-shared/extent-tree.c
>>>>>>> +++ b/kernel-shared/extent-tree.c
>>>>>>> @@ -2852,83 +2852,6 @@ out:
>>>>>>>         return ret;
>>>>>>>     }
>>>>>>> -/*
>>>>>>> - * Helper to read old block groups items from specified root.
>>>>>>> - *
>>>>>>> - * The difference between this and 
>>>>>>> read_block_groups_from_root() is,
>>>>>>> - * we will exit if we have already read the last bg in the old 
>>>>>>> root.
>>>>>>> - *
>>>>>>> - * This is to avoid wasting time finding bg items which should be
>>>>>>> in the
>>>>>>> - * new root.
>>>>>>> - */
>>>>>>> -static int read_old_block_groups_from_root(struct btrfs_fs_info
>>>>>>> *fs_info,
>>>>>>> -                       struct btrfs_root *root)
>>>>>>> -{
>>>>>>> -    struct btrfs_path path = {0};
>>>>>>> -    struct btrfs_key key;
>>>>>>> -    struct cache_extent *ce;
>>>>>>> -    /* The last block group bytenr in the old root. */
>>>>>>> -    u64 last_bg_in_old_root;
>>>>>>> -    int ret;
>>>>>>> -
>>>>>>> -    if (fs_info->last_converted_bg_bytenr != (u64)-1) {
>>>>>>> -        /*
>>>>>>> -         * We know the last converted bg in the other tree, 
>>>>>>> load the
>>>>>>> chunk
>>>>>>> -         * before that last converted as our last bg in the tree.
>>>>>>> -         */
>>>>>>> -        ce = 
>>>>>>> search_cache_extent(&fs_info->mapping_tree.cache_tree,
>>>>>>> - fs_info->last_converted_bg_bytenr);
>>>>>>> -        if (!ce || ce->start != 
>>>>>>> fs_info->last_converted_bg_bytenr) {
>>>>>>> -            error("no chunk found for bytenr %llu",
>>>>>>> - fs_info->last_converted_bg_bytenr);
>>>>>>> -            return -ENOENT;
>>>>>>> -        }
>>>>>>> -        ce = prev_cache_extent(ce);
>>>>>>> -        /*
>>>>>>> -         * We should have previous unconverted chunk, or we have
>>>>>>> -         * already finished the convert.
>>>>>>> -         */
>>>>>>> -        ASSERT(ce);
>>>>>>> -
>>>>>>> -        last_bg_in_old_root = ce->start;
>>>>>>> -    } else {
>>>>>>> -        last_bg_in_old_root = (u64)-1;
>>>>>>> -    }
>>>>>>> -
>>>>>>> -    key.type = BTRFS_BLOCK_GROUP_ITEM_KEY;
>>>>>>> -
>>>>>>> -    while (true) {
>>>>>>> -        ret = find_first_block_group(root, &path, &key);
>>>>>>> -        if (ret > 0) {
>>>>>>> -            ret = 0;
>>>>>>> -            goto out;
>>>>>>> -        }
>>>>>>> -        if (ret != 0) {
>>>>>>> -            goto out;
>>>>>>> -        }
>>>>>>> -        btrfs_item_key_to_cpu(path.nodes[0], &key, path.slots[0]);
>>>>>>> -
>>>>>>> -        ret = read_one_block_group(fs_info, &path);
>>>>>>> -        if (ret < 0 && ret != -ENOENT)
>>>>>>> -            goto out;
>>>>>>> -
>>>>>>> -        /* We have reached last bg in the old root, no need to
>>>>>>> continue */
>>>>>>> -        if (key.objectid >= last_bg_in_old_root)
>>>>>>> -            break;
>>>>>>> -
>>>>>>> -        if (key.offset == 0)
>>>>>>> -            key.objectid++;
>>>>>>> -        else
>>>>>>> -            key.objectid = key.objectid + key.offset;
>>>>>>> -        key.offset = 0;
>>>>>>> -        btrfs_release_path(&path);
>>>>>>> -    }
>>>>>>> -    ret = 0;
>>>>>>> -out:
>>>>>>> -    btrfs_release_path(&path);
>>>>>>> -    return ret;
>>>>>>> -}
>>>>>>> -
>>>>>>>     /* Helper to read all block groups items from specified 
>>>>>>> root. */
>>>>>>>     static int read_block_groups_from_root(struct btrfs_fs_info
>>>>>>> *fs_info,
>>>>>>>                            struct btrfs_root *root)
>>>>>>> @@ -2989,7 +2912,7 @@ static int 
>>>>>>> read_converting_block_groups(struct
>>>>>>> btrfs_fs_info *fs_info)
>>>>>>>             return ret;
>>>>>>>         }
>>>>>>> -    ret = read_old_block_groups_from_root(fs_info, old_root);
>>>>>>> +    ret = read_block_groups_from_root(fs_info, old_root);
>>>>>>>         if (ret < 0) {
>>>>>>>             error("failed to load block groups from the old 
>>>>>>> root: %d",
>>>>>>> ret);
>>>>>>>             return ret;
>>>>
>>

  reply	other threads:[~2023-09-30 13:01 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27 22:50 [PATCH] btrfs-progs: fix failed resume due to bad search Qu Wenruo
2023-09-28 14:53 ` Konstantinos Skarlatos
2023-09-28 21:56   ` Qu Wenruo
2023-09-29  9:33     ` Konstantinos Skarlatos
2023-09-29 10:54       ` Qu Wenruo
2023-09-29 11:31         ` Konstantinos Skarlatos
2023-09-29 22:25           ` Qu Wenruo
2023-09-30 13:01             ` Konstantinos Skarlatos [this message]
2023-09-30 21:22               ` Qu Wenruo
2023-10-01 13:09                 ` Konstantinos Skarlatos
2023-10-01 23:33                   ` Qu Wenruo
2023-10-02 10:32                     ` Konstantinos Skarlatos
2023-10-03  0:01                       ` Qu Wenruo
2023-10-03 10:21                         ` Konstantinos Skarlatos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35bc07b9-2e6c-0848-661c-8e991cb5479a@gmail.com \
    --to=k.skarlatos@gmail.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).