From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: fs unreadable after powercycle: BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
Date: Sun, 9 Aug 2015 02:56:49 +0000 (UTC)
Message-ID: <pan$24874$5c494beb$ed9709e6$399a5c96@cox.net>
In-Reply-To: CABL_Pd-Z6-raB3ngEDbj=-d+QVMohGQ9DHXqv0fVvVNy-HVxUA@mail.gmail.com
Martin Tippmann posted on Sat, 08 Aug 2015 20:43:34 +0200 as excerpted:
> Hi, after a hard reboot (powercycle) a btrfs volume did not come up
> again:
>
> It's a single 4TB disk - only btrfs with lzo - data=single,metadata=dup
>
> [ 121.831814] BTRFS info (device sda): disk space caching is enabled
> [ 121.857820] BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
> [ 121.861607] BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
> [ 121.861715] BTRFS: failed to read tree root on sda
> [ 121.878111] BTRFS: open_ctree failed
>
> btrfs-progs v4.0 Kernel: 4.1.4
>
> I'm quite sure that the HDD is fine (no SMART Problems, Disk Errorlog is
> empty, It's a new Enterprise-Drive that worked well in the past
> days/weeks).
>
> So I'm kind at loss what to do:
>
> How can I recover from that problem? I've found just a note in the
> FAQ[1] but no solution to the problem.
[The FAQ reference was to the wiki problem faq, transid failure
explanation, but it didn't say what to do about it.]
Did you try the recovery mount option suggested earlier in the problem-faq
under mount problems?
https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_can.27t_mount_my_filesystem.2
For transid failures, that's what I'd try first, since it scans backward
through previous tree-roots and tries to use the first one it can read.
Since the transid it wants (390924) is only two ahead of what it finds
(390922), a slightly older tree-root may well still be readable, so the
recovery mount option could well solve the problem.
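To make the "only a couple ahead" reasoning concrete, here is a minimal
Python sketch that pulls the numbers out of the kernel message quoted above
and prints the mount command I'd try. The function names and the /mnt mount
point are hypothetical, and adding ro is my own assumption so that nothing
gets written during the attempt. It only prints the command, it does not run
it.

```python
import re

# Message format copied from the kernel log quoted above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gap(log_line):
    """Return (bytenr, wanted, found, gap), or None if the line doesn't match."""
    m = TRANSID_RE.search(log_line)
    if m is None:
        return None
    bytenr, wanted, found = (int(g) for g in m.groups())
    return bytenr, wanted, found, wanted - found


def recovery_mount_argv(device, mountpoint):
    """Build (not run) a read-only mount command using the recovery option."""
    # "ro" is an assumption of this sketch; "recovery" is the option
    # discussed above.
    return ["mount", "-o", "ro,recovery", device, mountpoint]


line = ("BTRFS (device sda): parent transid verify failed on "
        "427084513280 wanted 390924 found 390922")
print(transid_gap(line))
print(" ".join(recovery_mount_argv("/dev/sda", "/mnt")))
```

A gap of 2 means the tree root being read is only two generations behind
the one the filesystem expects, which is why stepping back through recent
tree-roots is worth a try.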
If not, as Hugo mentions, given that the btrfs-find-root output looks
good, btrfs restore has a good chance of working. I've used that myself
to good effect a couple of times when a btrfs refused to mount (I have
backups if I have to use 'em, but recovery or restore, when they work,
will normally leave me with more current copies, since I tend to let my
backups get somewhat stale). There's a page on the wiki for using it
with find-root if necessary, but the wiki page is a bit dated. The
btrfs-restore manpage should be current, but doesn't have the detail
about using it with find-root that the wiki page has.
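As a rough sketch of how the two tools fit together, assuming the usual
btrfs-progs command names and the -t (tree root location), -D (dry run)
and -v (verbose) options of btrfs restore: the Python below only builds
the command lines so they can be reviewed before running anything. The
helper names, the /mnt/rescue destination and the bytenr value are
hypothetical placeholders. A real bytenr would come from the
btrfs-find-root output, and the destination has to be on a different,
working filesystem.

```python
def find_root_argv(device):
    """Command that lists candidate tree roots on an unmountable btrfs."""
    return ["btrfs-find-root", device]


def restore_argv(device, dest, tree_root_bytenr=None, dry_run=True):
    """Build (not run) a btrfs restore command line."""
    argv = ["btrfs", "restore", "-v"]
    if dry_run:
        argv.append("-D")  # only list what would be restored, write nothing
    if tree_root_bytenr is not None:
        # Point restore at an older tree root instead of the current one.
        argv += ["-t", str(tree_root_bytenr)]
    return argv + [device, dest]


# Hypothetical bytenr, standing in for a value picked from find-root output.
EXAMPLE_BYTENR = 123456789

print(" ".join(find_root_argv("/dev/sda")))
print(" ".join(restore_argv("/dev/sda", "/mnt/rescue")))
print(" ".join(restore_argv("/dev/sda", "/mnt/rescue",
                            tree_root_bytenr=EXAMPLE_BYTENR, dry_run=False)))
```

Doing the dry run first, without -t, shows whether restore can read the
current tree root at all. Only if that fails is it worth going through the
find-root candidates.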
> Maybe someone can give some clues why does this happen in the first
> place?
> Is it unfortunate timing due to the abrupt power cycle?
> Shouldn't CoW protect against this somewhat?
As Hugo says, in theory cow should protect against this, but the
combination of possible bugs in a still not yet fully stable and mature
btrfs, and possibly buggy hardware, means theory and practice don't
always line up as well as they should, in theory. (How's that for an
ouroboros, aka snake-eating-its-tail circular-reference, explanation?
=:^)
But the recovery mount option is a reasonable first recovery (now
ouroboroi =:^) option, and btrfs restore not too bad to work with if that
fails.
Referencing the hardware write-caching option you mentioned later, yes,
turning that off can help... in theory... but it also tends to have a
DRAMATICALLY bad effect on spinning rust write performance (I don't know
enough about SSD write caching to venture a guess), and in some cases
voids warranties due to the additional thrashing it's likely to cause,
so do your research before turning it off. In general, it's not a good
idea. Both Linux at the generic IO level and the various filesystem
stacks are designed to work around all but the worst hardware IO barrier
failures, so in most cases the write slowdown and increased disk
thrashing simply aren't worth it. If the hardware is actually bad enough
that it is worth it, I'd strongly consider different hardware.
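As part of doing that research, it's possible to check the drive's current
write-cache setting without changing anything. The Python sketch below
builds the hdparm query command (-W with no value only reports the
setting) and parses its answer. The helper names are hypothetical, and the
sample output is the format I'd expect from hdparm, written from memory
rather than captured from this drive, so treat it as an assumption.

```python
import re


def hdparm_query_argv(device):
    """Build (not run) the command that reports the write-cache setting."""
    # -W without a value only queries; "-W0" would turn the cache off.
    return ["hdparm", "-W", device]


def parse_write_caching(output):
    """Return True/False for cache on/off, or None if the line isn't found."""
    m = re.search(r"write-caching\s*=\s*(\d+)", output)
    if m is None:
        return None
    return m.group(1) != "0"


# Assumed sample of hdparm output, not taken from the drive in this thread.
sample = "/dev/sda:\n write-caching =  1 (on)\n"

print(" ".join(hdparm_query_argv("/dev/sda")))
print(parse_write_caching(sample))
```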
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman