Re: fs unreadable after powercycle: BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922

linux-btrfs.vger.kernel.org archive mirror

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: fs unreadable after powercycle: BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
Date: Sun, 9 Aug 2015 02:56:49 +0000 (UTC)
Message-ID: <pan$24874$5c494beb$ed9709e6$399a5c96@cox.net>
In-Reply-To: CABL_Pd-Z6-raB3ngEDbj=-d+QVMohGQ9DHXqv0fVvVNy-HVxUA@mail.gmail.com

Martin Tippmann posted on Sat, 08 Aug 2015 20:43:34 +0200 as excerpted:

> Hi, after a hard reboot (powercycle) a btrfs volume did not come up
> again:
> 
> It's a single 4TB disk - only btrfs with lzo - data=single,metadata=dup
> 
> [  121.831814] BTRFS info (device sda): disk space caching is enabled
> [  121.857820] BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
> [  121.861607] BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922
> [  121.861715] BTRFS: failed to read tree root on sda
> [  121.878111] BTRFS: open_ctree failed
> 
> btrfs-progs v4.0 Kernel: 4.1.4
> 
> I'm quite sure that the HDD is fine (no SMART Problems, Disk Errorlog is
> empty, It's a new Enterprise-Drive that worked well in the past
> days/weeks).
> 
> So I'm kind at loss what to do:
> 
> How can I recover from that problem? I've found just a note in the
> FAQ[1] but no solution to the problem.

[The FAQ reference was to the wiki problem faq, transid failure 
explanation, but it didn't say what to do about it.]

Did you try the recovery mount option suggested earlier in the problem-faq 
under mount problems?

https://btrfs.wiki.kernel.org/index.php/Problem_FAQ#I_can.27t_mount_my_filesystem.2

For transid failures, that's what I'd try first.  The recovery mount
option scans backward through the tree-root history and uses the first
root it can read.  Since the transid it wants (390924) is only two ahead
of what it finds (390922), that could well solve the problem.
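
A minimal sketch of that first attempt, under a few assumptions of mine:
the mount point /mnt/recovery is hypothetical, and adding "ro" is my own
precaution so that nothing gets written while the kernel falls back to an
older root.  The run helper only prints the command unless DRY_RUN=0 is
set.  (On kernels 4.6 and later the option is spelled usebackuproot; on
the 4.1.4 kernel here it's still recovery.)

```shell
#!/bin/sh
# Sketch only: prints the mount command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# How many generations lie between the transid the parent expects
# and the one actually found on disk.
transid_gap() {
    wanted=$(printf '%s\n' "$1" | sed -n 's/.* wanted \([0-9][0-9]*\) found .*/\1/p')
    found=$(printf '%s\n' "$1" | sed -n 's/.* found \([0-9][0-9]*\).*/\1/p')
    echo $((wanted - found))
}

# Read-only mount that lets the kernel try older tree roots.
# "ro" is an added precaution, not something the thread prescribes.
recovery_mount() {
    run mount -t btrfs -o ro,recovery "$1" "$2"
}

transid_gap 'parent transid verify failed on 427084513280 wanted 390924 found 390922'
recovery_mount /dev/sda /mnt/recovery   # hypothetical mount point
```

If the read-only mount succeeds, copy off anything that isn't already in
a backup before trying anything else.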

If not, as Hugo mentions, given that the btrfs-find-root output looks
good, btrfs restore has a good chance of working.  I've used that myself
to good effect a couple times when a btrfs refused to mount.  (I have
backups if I have to use 'em, but recovery or restore, when they work,
will normally leave me with more current copies, since I tend to let my
backups get somewhat stale.)  There's a page on the wiki for using
restore with find-root if necessary, but the wiki page is a bit dated.
The btrfs-restore manpage should be current, but doesn't have the detail
about using it with find-root that the wiki page has.
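
Roughly, the find-root plus restore sequence looks like the sketch below.
The destination /mnt/rescue (which should be on a different, healthy
filesystem) and the BYTENR_FROM_FIND_ROOT placeholder are hypothetical;
the real value has to come from what btrfs-find-root prints on your
device.  As written it only prints the commands; set DRY_RUN=0 to
actually run them.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# 1. List candidate tree roots on the unmounted device.
find_roots() {
    run btrfs-find-root "$1"
}

# 2. Preview what restore would copy; -D is restore's own dry run,
#    so nothing is written to the destination.
restore_preview() {
    run btrfs restore -D -v "$1" "$2"
}

# 3. If the default root is unusable, point restore at an older
#    tree root with -t <bytenr>.
restore_from_root() {
    run btrfs restore -t "$3" -v "$1" "$2"
}

DEV=/dev/sda
DEST=/mnt/rescue                    # hypothetical destination directory
ROOT_BYTENR=BYTENR_FROM_FIND_ROOT   # placeholder, not a real value

find_roots "$DEV"
restore_preview "$DEV" "$DEST"
restore_from_root "$DEV" "$DEST" "$ROOT_BYTENR"
```

Restore never writes to the damaged device, so it's safe to try several
candidate roots until one gives a sane-looking file list.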

> Maybe someone can give some clues why does this happen in the first
> place?
> Is it unfortunate timing due to the abrupt power cycle?
> Shouldn't CoW protect against this somewhat?

As Hugo says, in theory CoW should protect against this, but the
combination of possible bugs in a not yet fully stable and mature btrfs,
and possibly buggy hardware, means theory and practice don't always line
up as well as they should, in theory. (How's that for an ouroboros, aka
snake eating its tail circular-reference, explanation? =:^)

But the recovery mount option is a reasonable first recovery (now
ouroboroi =:^) option, and btrfs restore is not too bad to work with if
that fails.

Regarding the hardware write-caching option you mentioned later: yes,
turning that off can help... in theory.  But it also tends to have a
DRAMATICALLY bad effect on spinning rust write performance (I don't know
enough about SSD write caching to venture a guess), and in some cases it
voids warranties due to the additional thrashing it's likely to cause, so
do your research before turning it off.  In general, it's not a good
idea.  Both Linux at the generic IO level and the various filesystem
stacks are designed to work around all but the worst hardware IO barrier
failures, so in most cases the write slowdown and increased disk
thrashing are simply not worth it.  If the hardware is actually bad
enough that it is worth it, I'd strongly consider different hardware.
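
If you just want to see what the drive is currently set to, something
like the sketch below should do.  Using hdparm is my assumption (the
thread doesn't name a tool), and it only queries the setting; as before,
it prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the command unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Query only: with no value after -W, hdparm reports the drive's
# current write-caching setting and changes nothing.
show_write_cache() {
    run hdparm -W "$1"
}

show_write_cache /dev/sda

# Disabling would be "hdparm -W0 /dev/sda" (and -W1 to re-enable),
# left out of the sketch on purpose given the drawbacks above.
```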

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

Thread overview:
2015-08-08 18:43 fs unreadable after powercycle: BTRFS (device sda): parent transid verify failed on 427084513280 wanted 390924 found 390922 Martin Tippmann
2015-08-08 19:05 ` Hugo Mills
2015-08-08 19:32   ` Martin Tippmann
2015-08-09  2:56 ` Duncan [this message]
2015-08-17  2:27 ` Qu Wenruo
