Date: Tue, 31 May 2022 18:29:19 -0700
From: Marc MERLIN
To: Josef Bacik
Cc: linux-btrfs
Subject: Re: Rebuilding 24TB Raid5 array (was btrfs corruption: parent transid verify failed + open_ctree failed)
Message-ID: <20220601012919.GE22722@merlins.org>
References: <20220530003701.GJ24951@merlins.org>
 <20220530191834.GK24951@merlins.org>
 <20220531011224.GA1745079@merlins.org>
 <20220531224951.GC22722@merlins.org>
 <20220601002552.GD22722@merlins.org>
X-Mailing-List: linux-btrfs@vger.kernel.org

On Tue, May 31, 2022 at 09:26:03PM -0400, Josef Bacik wrote:
> On Tue, May 31, 2022 at 8:25 PM Marc MERLIN wrote:
> >
> > On Tue, May 31, 2022 at 08:14:27PM -0400, Josef Bacik wrote:
> > > Wtf, we're clearly writing the chunk root properly because I have to
> > > re-open it to recover the tree root, and that's where it fails, but
> > > then the chunk restore can't open the root, despite it being correctly
> > > read in the tree recover. I've pushed new code, try tree-recover and
> > > then recover-chunks again and capture the output please. Thanks,
> >
> > gargamel:/var/local/src/btrfs-progs-josefbacik# ./btrfs rescue recover-chunks /dev/mapper/dshelf1
> > checksum verify failed on 21135360 wanted 0x00000000 found 0x3533f3b5
> > checksum verify failed on 21135360 wanted 0x00000000 found 0x3533f3b5
> > checksum verify failed on 21135360 wanted 0x00000000 found 0x3533f3b5
>
> Ah ok, I wasn't actually updating the pointer, fixed that, lets try
> the same sequence again. Thanks,

gargamel:/var/local/src/btrfs-progs-josefbacik# gdb ./btrfs rescue tree-recover /dev/mapper/dshelf1
Excess command line arguments ignored. (tree-recover ...)
GNU gdb (Debian 7.12-6+b2) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./btrfs...done.
/var/local/src/btrfs-progs-josefbacik/rescue: No such file or directory.
(gdb) rn rescue tree-recover /dev/mapper/dshelf1
Target exec does not support this command.
(gdb) run rescue tree-recover /dev/mapper/dshelf1
Starting program: /var/local/src/btrfs-progs-josefbacik/btrfs rescue tree-recover /dev/mapper/dshelf1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
WARNING: cannot read chunk root, continue anyway
none of our backups was sufficient, scanning for a root
scanning, best has 0 found 1 bad
ret is 0 offset 20971520 len 8388608
ret is -2 offset 20971520 len 8388608
checking block 22495232 generation 1572124 fs info generation 2582703
trying bytenr 22495232 got 1 blocks 0 bad
checking block 22462464 generation 1479229 fs info generation 2582703
trying bytenr 22462464 got 1 blocks 0 bad
checking block 22528000 generation 1572115 fs info generation 2582703
trying bytenr 22528000 got 1 blocks 0 bad
checking block 22446080 generation 1571791 fs info generation 2582703
trying bytenr 22446080 got 1 blocks 0 bad
checking block 22544384 generation 1556078 fs info generation 2582703
trying bytenr 22544384 got 1 blocks 0 bad
checking block 22511616 generation 1555799 fs info generation 2582703
trying bytenr 22511616 got 1 blocks 0 bad
checking block 22577152 generation 1586277 fs info generation 2582703
trying bytenr 22577152 got 1 blocks 0 bad
checking block 22478848 generation 1561557 fs info generation 2582703
trying bytenr 22478848 got 1 blocks 0 bad
checking block 22593536 generation 1590219 fs info generation 2582703
trying bytenr 22593536 got 1 blocks 0 bad
checking block 22609920 generation 1551635 fs info generation 2582703
trying bytenr 22609920 got 1 blocks 0 bad
checking block 22560768 generation 1590217 fs info generation 2582703
trying bytenr 22560768 got 1 blocks 0 bad
ret is 0 offset 20971520 len 8388608
ret is -2 offset 20971520 len 8388608
setting chunk root to 22593536

Program received signal SIGSEGV, Segmentation fault.
0x000055555557b352 in backup_super_roots (info=0x55555564fbc0) at ./kernel-shared/ctree.h:2270
2270 BTRFS_SETGET_STACK_FUNCS(backup_tree_root_gen, struct btrfs_root_backup,
(gdb) bt
#0  0x000055555557b352 in backup_super_roots (info=0x55555564fbc0) at ./kernel-shared/ctree.h:2270
#1  write_all_supers (fs_info=fs_info@entry=0x55555564fbc0) at kernel-shared/disk-io.c:2102
#2  0x00005555555df49e in repair_super_root (fs_info_ptr=fs_info_ptr@entry=0x7fffffffdce8, ocf=ocf@entry=0x7fffffffdcc0, objectid=objectid@entry=3) at cmds/rescue-tree-recover.c:958
#3  0x00005555555df62c in btrfs_recover_trees (path=path@entry=0x7fffffffe1ce "/dev/mapper/dshelf1") at cmds/rescue-tree-recover.c:1194
#4  0x00005555555d808e in cmd_rescue_tree_recover (cmd=, argc=, argv=) at cmds/rescue.c:176
#5  0x000055555556c17b in cmd_execute (argv=0x7fffffffdeb8, argc=2, cmd=0x555555645c20 ) at cmds/commands.h:125
#6  handle_command_group (cmd=, argc=2, argv=0x7fffffffdeb8) at btrfs.c:152
#7  0x000055555556c275 in cmd_execute (argv=0x7fffffffdeb0, argc=3, cmd=0x555555646cc0 ) at cmds/commands.h:125
#8  main (argc=3, argv=0x7fffffffdeb0) at btrfs.c:405

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/