From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Robert LeBlanc <robert@leblancnet.us>, linux-btrfs@vger.kernel.org
Subject: Re: Btrfs Raid5 issue.
Date: Mon, 21 Aug 2017 14:58:39 +0800 [thread overview]
Message-ID: <09e13dfc-20cf-57f9-5dff-b22013f5e77b@gmx.com> (raw)
In-Reply-To: <CAANLjFoLZ_CtO1XQNRfSCu0p2OSN7HECscAPy5ru6dTt3-qffQ@mail.gmail.com>
On 2017-08-21 12:33, Robert LeBlanc wrote:
> I've been running btrfs in a raid5 for about a year now with bcache in
> front of it. Yesterday, one of my drives was acting really slow, so I
> was going to move it to a different port. I guess I get too
> comfortable hot plugging drives in at work and didn't think twice
> about what could go wrong, hey I set it up in RAID5 so it will be
> fine. Well, it wasn't...
Well, Btrfs RAID5 is not that safe.
I would recommend using RAID1 at least for metadata.
(And in your case the metadata is damaged, so I really recommend a more
robust profile for your metadata.)
>
> I was aware of the write hole issue, and thought it was committed to
> the 4.12 branch, so I was running 4.12.5 at the time. I have two SSDs
> that are in an md RAID1 that is the cache for the three backing
> devices in bcache (bcache{0..2} or bcache{0,16,32} depending on the
> kernel booted). I have all my critical data saved off on btrfs
> snapshots on a different host, but I don't transfer my MythTV subs
> that often, so I'd like to try to recover some of that if possible.
>
> What is really interesting is that I could not boot the first time
> (root on the btrfs volume), but I rebooted again and the fs was in
> read-only mode, but only one of the three disks was in read-only. I
> tried to reboot again and it never mounted again after that. I see
> some messages in dmesg like this:
>
> [ 151.201637] BTRFS info (device bcache0): disk space caching is enabled
> [ 151.201640] BTRFS info (device bcache0): has skinny extents
> [ 151.215697] BTRFS info (device bcache0): bdev /dev/bcache16 errs:
> wr 309, rd 319, flush 39, corrupt 0, gen 0
> [ 151.931764] BTRFS info (device bcache0): detected SSD devices,
> enabling SSD mode
> [ 152.058915] BTRFS error (device bcache0): parent transid verify
> failed on 5309837426688 wanted 1620383 found 1619473
> [ 152.059944] BTRFS error (device bcache0): parent transid verify
> failed on 5309837426688 wanted 1620383 found 1619473
A transid error normally indicates a bigger problem, and such problems are usually hard to trace.
> [ 152.060018] BTRFS: error (device bcache0) in
> __btrfs_free_extent:6989: errno=-5 IO failure
> [ 152.060060] BTRFS: error (device bcache0) in
> btrfs_run_delayed_refs:3009: errno=-5 IO failure
> [ 152.071613] BTRFS info (device bcache0): delayed_refs has NO entry
> [ 152.074126] BTRFS: error (device bcache0) in btrfs_replay_log:2475:
> errno=-5 IO failure (Failed to recover log tree)
> [ 152.074244] BTRFS error (device bcache0): cleaner transaction
> attach returned -30
> [ 152.148993] BTRFS error (device bcache0): open_ctree failed
>
> So, I thought that the log was corrupted, I could live without the
> last 30 seconds or so, I tried `btrfs rescue zero-log /dev/bcache0`
> and I get a backtrace.
Yes, your idea about the log is correct: it's the log replay that is
causing the problem.
But the root cause seems to be a corrupted extent tree, which is not
easy to fix.
> I ran `btrfs rescue chunk-recover /dev/bcache0`
> and it spent hours scanning the three disks and at the end tried to
> fix the logs (or tree, I can't remember exactly) and then I got
> another backtrace.
>
> Today, I compiled 4.13-rc6 to see if some of the latest fixes would
> help, no dice (the dmesg above is from 4.13-rc6). I compiled the
> latest master of btrfs-progs, no progress.
>
> Things I've tried:
> mount
> mount -o degraded
> mount -o degraded,ro
> mount -o degraded (with each drive disconnected in turn to see if it
> would start without one of the drives)
> btrfs rescue chunk-recover
> btrfs rescue super-recover (all drives report the superblocks are fine)
> btrfs rescue zero-log (always has a backtrace)
I think the backtrace is caused by some other problem, normally extent
tree corruption or a transid error.
> btrfs check
>
> I know that bcache complicates things, but I'm hoping for two things.
> 1. Try to get what I can off the volume. 2. Provide some information
> that can help make btrfs/bcache better for the future.
>
> Here is what `btrfs rescue zero-log` outputs:
>
> # ./btrfs rescue zero-log /dev/bcache0
> Clearing log on /dev/bcache0, previous log_root 2876047507456, level 0
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> parent transid verify failed on 5309233872896 wanted 1620381 found 1619462
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> checksum verify failed on 5309233872896 found 6A103358 wanted 8EF38EEE
> bytenr mismatch, want=5309233872896, have=65536
> btrfs unable to find ref byte nr 5310039638016 parent 0 root 2 owner 2 offset 0
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> checksum verify failed on 5309275930624 found A2FDBB6A wanted 461E06DC
> parent transid verify failed on 5309275930624 wanted 1620381 found 1619462
> Ignoring transid failure
> bad key ordering 67 68
Besides the transid and bytenr mismatches (already a big problem), we
even have bad key ordering.
That's definitely not a good sign.
I think the extent tree (and maybe more) is heavily damaged.
And considering how we update the extent tree (delaying it as long as
possible), that's not too surprising.
I would recommend trying the backup roots manually to see which one can
pass btrfsck.
But the log tree will be a blocker, as its content is bound to a certain
transid.
Would you please try the following commands?
# btrfs inspect dump-super -f /dev/bcache0
Check the output for a part like:
backup_roots[4]:
backup 0:
backup_tree_root: 29392896 gen: 6 level: 0
backup_chunk_root: 20987904 gen: 5 level: 0
Record the backup_tree_root number.
And then
# btrfs check -r 29392896 /dev/bcache0
If you're lucky enough, you should not see a backtrace.
BTW, the newer the backup, the higher the chance of recovery.
If backup 0 and 1 don't give a good result, then there is not much left
we can do.
Thanks,
Qu
> btrfs unable to find ref byte nr 5310039867392 parent 0 root 2 owner 1 offset 0
> bad key ordering 67 68
> extent-tree.c:2725: alloc_reserved_tree_block: BUG_ON `ret` triggered, value -1
> ./btrfs(+0x1c624)[0x562fde546624]
> ./btrfs(+0x1d91a)[0x562fde54791a]
> ./btrfs(+0x1da2b)[0x562fde547a2b]
> ./btrfs(+0x1f3a5)[0x562fde5493a5]
> ./btrfs(+0x1f91f)[0x562fde54991f]
> ./btrfs(btrfs_alloc_free_block+0xd2)[0x562fde54c20c]
> ./btrfs(__btrfs_cow_block+0x182)[0x562fde53c778]
> ./btrfs(btrfs_cow_block+0xea)[0x562fde53d0ea]
> ./btrfs(+0x185a3)[0x562fde5425a3]
> ./btrfs(btrfs_commit_transaction+0x96)[0x562fde54411c]
> ./btrfs(+0x6a702)[0x562fde594702]
> ./btrfs(handle_command_group+0x44)[0x562fde53b40c]
> ./btrfs(cmd_rescue+0x15)[0x562fde59486d]
> ./btrfs(main+0x85)[0x562fde53b5c3]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd3931692b1]
> ./btrfs(_start+0x2a)[0x562fde53b13a]
> Aborted
>
> Please let me know if there is any other information I can provide
> that would be helpful.
>
> Thank you,
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1