Subject: Re: Btrfs Raid5 issue.
To: Robert LeBlanc, linux-btrfs@vger.kernel.org
From: Qu Wenruo
Message-ID: <09e13dfc-20cf-57f9-5dff-b22013f5e77b@gmx.com>
Date: Mon, 21 Aug 2017 14:58:39 +0800

On 2017-08-21 12:33, Robert LeBlanc wrote:
> I've been running btrfs in a raid5 for about a year now with bcache
> in front of it. Yesterday, one of my drives was acting really slow,
> so I was going to move it to a different port. I guess I get too
> comfortable hot plugging drives in at work and didn't think twice
> about what could go wrong; hey, I set it up in RAID5, so it will be
> fine. Well, it wasn't...

Well, Btrfs RAID5 is not that safe. I would recommend using at least
RAID1 for the metadata.
(And in your case it is the metadata that got damaged, so I really
recommend a better profile for your metadata.)

> I was aware of the write hole issue, and thought it was committed to
> the 4.12 branch, so I was running 4.12.5 at the time. I have two SSDs
> that are in an md RAID1 that is the cache for the three backing
> devices in bcache (bcache{0..2} or bcache{0,16,32} depending on the
> kernel booted). I have all my critical data saved off on btrfs
> snapshots on a different host, but I don't transfer my MythTV subs
> that often, so I'd like to try to recover some of that if possible.
>
> What is really interesting is that I could not boot the first time
> (root on the btrfs volume), but I rebooted again and the fs was in
> read-only mode, but only one of the three disks was in read-only. I
> tried to reboot again and it never mounted again after that. I see
> some messages in dmesg like this:
>
> [ 151.201637] BTRFS info (device bcache0): disk space caching is enabled
> [ 151.201640] BTRFS info (device bcache0): has skinny extents
> [ 151.215697] BTRFS info (device bcache0): bdev /dev/bcache16 errs: wr 309, rd 319, flush 39, corrupt 0, gen 0
> [ 151.931764] BTRFS info (device bcache0): detected SSD devices, enabling SSD mode
> [ 152.058915] BTRFS error (device bcache0): parent transid verify failed on 5309837426688 wanted 1620383 found 1619473
> [ 152.059944] BTRFS error (device bcache0): parent transid verify failed on 5309837426688 wanted 1620383 found 1619473

Normally a transid error indicates a bigger problem, and it is
normally hard to trace down.

> [ 152.060018] BTRFS: error (device bcache0) in __btrfs_free_extent:6989: errno=-5 IO failure
> [ 152.060060] BTRFS: error (device bcache0) in btrfs_run_delayed_refs:3009: errno=-5 IO failure
> [ 152.071613] BTRFS info (device bcache0): delayed_refs has NO entry
> [ 152.074126] BTRFS: error (device bcache0) in btrfs_replay_log:2475: errno=-5 IO failure (Failed to recover log tree)
> [ 152.074244] BTRFS error (device bcache0): cleaner transaction attach returned -30
> [ 152.148993] BTRFS error (device bcache0): open_ctree failed
>
> So, I thought that the log was corrupted, and since I could live
> without the last 30 seconds or so, I tried `btrfs rescue zero-log
> /dev/bcache0` and got a backtrace.

Yes, your idea about the log is correct: it is log replay that is
causing the problem. But the root cause seems to be a corrupted extent
tree, which is not easy to fix.
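As a side note, if the immediate goal is just to copy data off, a
read-only mount that skips log replay entirely might be worth a try.
This is only a guess on my side; it assumes your kernel has the
nologreplay mount option (4.12 and 4.13 both do), and the mount point
/mnt is only an example:

# mount -o ro,nologreplay /dev/bcache0 /mnt

It will still fail if the trees needed for reading are themselves
corrupted, but it is a cheap thing to try before anything that writes
to the disks.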
> I ran `btrfs rescue chunk-recover /dev/bcache0` and it spent hours
> scanning the three disks and at the end tried to fix the logs (or
> tree, I can't remember exactly) and then I got another backtrace.
>
> Today, I compiled 4.13-rc6 to see if some of the latest fixes would
> help; no dice (the dmesg above is from 4.13-rc6). I compiled the
> latest master of btrfs-progs; no progress.
>
> Things I've tried:
> mount
> mount -o degraded
> mount -o degraded,ro
> mount -o degraded (with each drive disconnected in turn to see if it
> would start without one of the drives)
> btrfs rescue chunk-recover
> btrfs rescue super-recover (all drives report the superblocks are fine)
> btrfs rescue zero-log (always has a backtrace)

I think the backtrace is caused by some other problem, normally extent
tree corruption or a transid error.

> btrfs check
>
> I know that bcache complicates things, but I'm hoping for two things.
> 1. Try to get what I can off the volume. 2. Provide some information
> that can help make btrfs/bcache better for the future.
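For the first point, `btrfs restore` might also be worth a try. It
reads the filesystem offline and copies files out, so it does not
depend on a successful mount and never writes to the array. A rough
sketch (the destination /mnt/recovery is just an example, and -D only
does a dry run):

# btrfs restore -D /dev/bcache0 /mnt/recovery
# btrfs restore -v /dev/bcache0 /mnt/recovery

If the current tree root is too damaged for that, restore can also be
pointed at another tree root bytenr with -t, e.g. one of the
backup_tree_root values from the dump-super output below.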
> Here is what `btrfs rescue zero-log` outputs:
>
> # ./btrfs rescue zero-log /dev/bcache0
> Clearing log on /dev/bcache0, previous log_root 2876047507456, level 0
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> btrfs unable to find ref byte nr 5310039638016 parent 0 root 2 owner 2 offset 0
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> checksum verify failed on 5309275930624 found A2FDBB6A wanted 461E06DC
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> Ignoring transid failure
> bad key ordering 67 68

Besides the transid and bytenr mismatches (which are already a big
problem), we even have bad key ordering. That's definitely not a good
sign. I think the extent tree (and maybe more) got heavily damaged.
And considering how we update the extent tree (delaying it as long as
possible), that's not that strange.

I would recommend trying the backup roots manually to see which one
can pass btrfsck. But the log tree will be a blocker, as its content
is bound to a certain transid.

Would you please try the following commands?

# btrfs inspect dump-super -f /dev/bcache0

Check the output for the part that looks like:

    backup_roots[4]:
            backup 0:
                    backup_tree_root:  29392896  gen: 6  level: 0
                    backup_chunk_root: 20987904  gen: 5  level: 0

Record the number of backup_tree_root, and then run:

# btrfs check -r 29392896 /dev/bcache0

If you're lucky enough, you should not see a backtrace.

BTW, the newer the backup, the higher the chance of recovery. If
backups 0 and 1 don't give a good result, there is not much left we
can do.

Thanks,
Qu

> btrfs unable to find ref byte nr 5310039867392 parent 0 root 2 owner 1 offset 0
> bad key ordering 67 68
> extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered, value -1
> ./btrfs(+0x1c624)[0x562fde546624]
> ./btrfs(+0x1d91a)[0x562fde54791a]
> ./btrfs(+0x1da2b)[0x562fde547a2b]
> ./btrfs(+0x1f3a5)[0x562fde5493a5]
> ./btrfs(+0x1f91f)[0x562fde54991f]
> ./btrfs(btrfs_alloc_free_block+0xd2)[0x562fde54c20c]
> ./btrfs(__btrfs_cow_block+0x182)[0x562fde53c778]
> ./btrfs(btrfs_cow_block+0xea)[0x562fde53d0ea]
> ./btrfs(+0x185a3)[0x562fde5425a3]
> ./btrfs(btrfs_commit_transaction+0x96)[0x562fde54411c]
> ./btrfs(+0x6a702)[0x562fde594702]
> ./btrfs(handle_command_group+0x44)[0x562fde53b40c]
> ./btrfs(cmd_rescue+0x15)[0x562fde59486d]
> ./btrfs(main+0x85)[0x562fde53b5c3]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd3931692b1]
> ./btrfs(_start+0x2a)[0x562fde53b13a]
> Aborted
>
> Please let me know if there is any other information I can provide
> that would be helpful.
>
> Thank you,
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html