Fixing recursive fault and parent transid verify failed

Linux Btrfs filesystem development

* Fixing recursive fault and parent transid verify failed
@ 2015-12-07  1:57 Alistair Grant
  2015-12-07  2:09 ` Lukas Pirl
  2015-12-07  8:25 ` Duncan
  0 siblings, 2 replies; 10+ messages in thread
From: Alistair Grant @ 2015-12-07  1:57 UTC (permalink / raw)
  To: linux-btrfs

Hi,

(Resending as it looks like the first attempt didn't get through,
probably too large, so logs are now in dropbox)

I have a btrfs volume which is raid1 across two spinning rust disks,
each 2TB.

When trying to access some files from another machine using sshfs, the
server machine has crashed twice resulting in a hard lock up, i.e. power
off required to restart the machine.

There are no crash dumps in /var/log/syslog, or anything that looks like
an associated error message to me, however on the second occasion I was
able to see the following message flash up the console (in addition to
some stack dumps):

Fixing recursive fault, but reboot is needed

I've run btrfs scrub and btrfsck on the drives, with the output
included below.  Based on what I've found on the web, I assume that a
btrfs-zero-log is required.

* Is this the recommended path?
* Is there a way to find out which files will be affected by the loss of
  the transactions?

I do have a backup of the drive (which I believe is completely up to
date, the btrfs volume is used for archiving media and documents, and
single person use of git repositories, i.e. only very light writing and
reading).

Some basic details:

OS: Ubuntu 15.10
Kernel: Ubuntu 4.2.0-19-generic (which is based on mainline 4.2.6)

> sudo btrfs fi df /srv/d2root
==============================

Data, RAID1: total=250.00GiB, used=248.86GiB
Data, single: total=8.00MiB, used=0.00B
System, RAID1: total=8.00MiB, used=64.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=466.77MiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=160.00MiB, used=0.00B

> sudo btrfs fi usage /srv/d2root
=================================

Overall:
    Device size:		   3.64TiB
    Device allocated:		 502.04GiB
    Device unallocated:		   3.15TiB
    Device missing:		     0.00B
    Used:			 498.62GiB
    Free (estimated):		   1.58TiB	(min: 1.58TiB)
    Data ratio:			      2.00
    Metadata ratio:		      1.99
    Global reserve:		 160.00MiB	(used: 0.00B)

Data,single: Size:8.00MiB, Used:0.00B
   /dev/sdc	   8.00MiB

Data,RAID1: Size:250.00GiB, Used:248.86GiB
   /dev/sdb	 250.00GiB
   /dev/sdc	 250.00GiB

Metadata,single: Size:8.00MiB, Used:0.00B
   /dev/sdc	   8.00MiB

Metadata,RAID1: Size:1.00GiB, Used:466.77MiB
   /dev/sdb	   1.00GiB
   /dev/sdc	   1.00GiB

System,single: Size:4.00MiB, Used:0.00B
   /dev/sdc	   4.00MiB

System,RAID1: Size:8.00MiB, Used:64.00KiB
   /dev/sdb	   8.00MiB
   /dev/sdc	   8.00MiB

Unallocated:
   /dev/sdb	   1.57TiB
   /dev/sdc	   1.57TiB


btrfs scrub output:
https://www.dropbox.com/s/blqvopa1lhkghe5/scrub.log?dl=0


btrfsck sdb output:
https://www.dropbox.com/s/hw6w6cupuu1rny4/btrfsck.sdb.log?dl=0


btrfsck sdc output:
https://www.dropbox.com/s/mijz492mjr76p8z/btrfsck.sdc.log?dl=0



Thanks very much,
Alistair


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07  1:57 Fixing recursive fault and parent transid verify failed Alistair Grant
@ 2015-12-07  2:09 ` Lukas Pirl
  2015-12-07  8:25 ` Duncan
  1 sibling, 0 replies; 10+ messages in thread
From: Lukas Pirl @ 2015-12-07  2:09 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Alistair Grant

On 12/07/2015 02:57 PM, Alistair Grant wrote as excerpted:
> Fixing recursive fault, but reboot is needed

For the record:

I saw the same message (incl. hard lockup) when doing a balance on a
single-disk btrfs.

Besides that, the fs works flawlessly (~60GB, usage: no snapshots, ~15
lxc containers, low-load databases, few mails, a couple of Web servers).

As this is a production machine, I rebooted it rather than
investigating, but the error is reproducible if that would be of
great interest.

> I've run btrfs scrub and btrfsck on the drives, with the output
> included below.  Based on what I've found on the web, I assume that a
> btrfs-zero-log is required.
> 
> * Is this the recommended path?
> * Is there a way to find out which files will be affected by the loss of
>   the transactions?

> Kernel: Ubuntu 4.2.0-19-generic (which is based on mainline 4.2.6)

I used Debian Backports 4.2.6.

Cheers,

Lukas


* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07  1:57 Fixing recursive fault and parent transid verify failed Alistair Grant
  2015-12-07  2:09 ` Lukas Pirl
@ 2015-12-07  8:25 ` Duncan
  2015-12-07 10:02   ` Alistair Grant
  1 sibling, 1 reply; 10+ messages in thread
From: Duncan @ 2015-12-07  8:25 UTC (permalink / raw)
  To: linux-btrfs

Alistair Grant posted on Mon, 07 Dec 2015 12:57:15 +1100 as excerpted:

> I've run btrfs scrub and btrfsck on the drives, with the output included
> below.  Based on what I've found on the web, I assume that a
> btrfs-zero-log is required.
> 
> * Is this the recommended path?

[Just replying to a couple more minor points, here.]

Absolutely not.  btrfs-zero-log isn't the tool you need here.

About the btrfs log...

Unlike most journaling filesystems, btrfs is designed to be atomic and 
consistent at commit time (every 30 seconds by default) and doesn't log 
normal filesystem activity at all.  The only thing logged is fsyncs, 
allowing them to deliver on their file-written-to-hardware guarantees, 
without forcing the entire atomic filesystem sync, which would trigger a 
normal atomic commit and thus is a far heavier weight process.  IOW, all 
it does is log and speed up fsyncs.  The filesystem is designed to be 
atomically consistent at commit time, with or without the log, with the 
only thing missing if the log isn't replayed being the last few seconds 
of fsyncs since the last atomic commit.

So the btrfs log is very limited in scope and will in many cases be 
entirely empty, if there were no fsyncs after the last atomic filesystem 
commit, again, every 30 seconds by default, so in human terms at least, 
not a lot of time.

About btrfs log replay...

The kernel, meanwhile, is designed to replay the log automatically at 
mount time.  If the mount is successful, the log has by definition been 
replayed successfully and zeroing it wouldn't have done much of anything 
but possibly lose you a few seconds worth of fsyncs.

Since you are able to run scrub, which requires a writable mount, the 
mount is definitely successful, which means btrfs-zero-log is the wrong 
tool for the job, since it addresses a problem you obviously don't have.

> * Is there a way to find out which files will be affected by the loss of
>   the transactions?

I'm interpreting that question in the context of the transid wanted/found 
listings in your linked logs, since it no longer makes sense in the 
context of btrfs-zero-log, given the information above.

I believe so, but the most direct method requires manual use of 
btrfs-debug and similar tools, looking up addresses and tracing down the 
files to which they belong.  That assumes the addresses trace to actual 
files at all.  If they trace to metadata instead of data, then what's 
affected isn't normally a file, but the metadata about files (including 
checksums and very small files of only a few KiB).  And if it's metadata, 
the problem's worse, as a single bad metadata block can affect multiple 
actual files.

The more indirect way would be to use btrfs restore with the -t option, 
feeding it the root address associated with the transid found (with that 
association traced via btrfs-find-root), to restore the files from the 
filesystem as it existed at that point to some other mounted filesystem, 
also using the restore metadata option.  You could then diff the file 
listings (or possibly per-file checksums, say md5sum, of both versions) 
between your current backup (or the current mounted filesystem, since you 
can still mount it) and the restored version, which would be the files as 
of that transaction-id, and see which ones changed.  Those of course 
would be the affected files. =:^]
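
The comparison step described above could be sketched roughly as follows. 
This is only an illustration: the helper names tree_sums and changed_files 
are made up, and the two directory arguments (for instance a restored tree 
and a backup) are placeholders for wherever those trees happen to live.

```shell
#!/bin/sh
# Sketch: report files whose content differs between two directory trees.
# Helper names are hypothetical; both paths are supplied by the caller.

# Print "<md5>  ./relative/path" for every regular file under $1,
# sorted by path so two listings can be compared line by line.
tree_sums() {
    ( cd "$1" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# Diff the checksum listings of two trees.
# Returns 0 if they match, 1 if any file differs or exists on one side only.
changed_files() {
    a=$(mktemp) && b=$(mktemp) || return 1
    tree_sums "$1" > "$a"
    tree_sums "$2" > "$b"
    rc=0
    diff "$a" "$b" || rc=$?
    rm -f "$a" "$b"
    return "$rc"
}
```

Calling changed_files with the restored tree and the backup as its two 
arguments prints one diff line per file that changed or is missing on one 
side.  Only content is compared, so ownership and timestamps would need a 
separate check.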

> I do have a backup of the drive (which I believe is completely up to
> date, the btrfs volume is used for archiving media and documents, and
> single person use of git repositories, i.e. only very light writing and
> reading).

Of course either one of the above is going to be quite some work, and if 
you have a current backup, simply restoring it is likely to be far 
easier, unless of course you're interested in practicing your recovery 
technique or the like, certainly not a valueless endeavor, if you have 
the time and patience for it.

The *GOOD* thing is that you *DO* have a current backup.  Far *FAR* too 
many people we see posting here, are unfortunately finding out the hard 
way, that their actions, or more precisely, lack thereof, in failing to 
do backups, put the lie to any claims that they actually valued the 
data.  As any good sysadmin can tell you, often from unhappy lessons such 
as this, if it's not backed up, by definition, your actions are placing 
its value at less than the time and resources necessary to do that backup 
(modified of course by the risk factor of actually needing it, thus 
taking care of the Nth level backup, some of which are off-site, if the 
data is really /that/ valuable, while also covering the throw-away data 
that's so trivial as to not justify even the effort of a single level of 
backup).

So hurray for you! =:^)

(FWIW, I personally have backups of most stuff here, often several 
levels, tho I don't always keep them current.  But should I be forced to 
resort to them, I'm prepared to lose the intervening updates, as I 
recognize that by failing to keep those backups current I really am 
defining the intervening data at risk as worth less than the hassle and 
resources to more regularly update the backups.  It wouldn't be pleasant 
having to resort to them, and fortunately, the twice I might have since I 
started running btrfs, btrfs restore was able to restore very close to 
the latest copies, but if it comes to it, I'm prepared to live with loss 
of the data since those somewhat dated backups, as for me, the most 
important stuff is in my head anyway, and if I end up losing /that/ 
backup, I won't be caring much about the others, will I? =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07  8:25 ` Duncan
@ 2015-12-07 10:02   ` Alistair Grant
  2015-12-07 13:48     ` Duncan
  0 siblings, 1 reply; 10+ messages in thread
From: Alistair Grant @ 2015-12-07 10:02 UTC (permalink / raw)
  To: linux-btrfs

On Mon, Dec 07, 2015 at 08:25:01AM +0000, Duncan wrote:
> Alistair Grant posted on Mon, 07 Dec 2015 12:57:15 +1100 as excerpted:
> 
> > > I've run btrfs scrub and btrfsck on the drives, with the output included
> > below.  Based on what I've found on the web, I assume that a
> > btrfs-zero-log is required.
> > 
> > * Is this the recommended path?
> 
> [Just replying to a couple more minor points, here.]
> 
> Absolutely not.  btrfs-zero-log isn't the tool you need here.
> 
> About the btrfs log...
> 
> Unlike most journaling filesystems, btrfs is designed to be atomic and 
> consistent at commit time (every 30 seconds by default) and doesn't log 
> normal filesystem activity at all.  The only thing logged is fsyncs, 
> allowing them to deliver on their file-written-to-hardware guarantees, 
> without forcing the entire atomic filesystem sync, which would trigger a 
> normal atomic commit and thus is a far heavier weight process.  IOW, all 
> it does is log and speed up fsyncs.  The filesystem is designed to be 
> atomically consistent at commit time, with or without the log, with the 
> only thing missing if the log isn't replayed being the last few seconds 
> of fsyncs since the last atomic commit.
> 
> So the btrfs log is very limited in scope and will in many cases be 
> entirely empty, if there were no fsyncs after the last atomic filesystem 
> commit, again, every 30 seconds by default, so in human terms at least, 
> not a lot of time.
> 
> About btrfs log replay...
> 
> The kernel, meanwhile, is designed to replay the log automatically at 
> mount time.  If the mount is successful, the log has by definition been 
> replayed successfully and zeroing it wouldn't have done much of anything 
> but possibly lose you a few seconds worth of fsyncs.
> 
> Since you are able to run scrub, which requires a writable mount, the 
> mount is definitely successful, which means btrfs-zero-log is the wrong 
> tool for the job, since it addresses a problem you obviously don't have.

OK, thanks for the detailed explanation (here and below, so I don't have
to repeat myself).

The reason I thought it might be required was that the parent transid
failed errors were found even after a reboot (and obviously remounting
the filesystem) and without any user activity.

> 
> > * Is there a way to find out which files will be affected by the loss of
> >   the transactions?
> 
> I'm interpreting that question in the context of the transid wanted/found 
> listings in your linked logs, since it no longer makes sense in the 
> context of btrfs-zero-log, given the information above.
> 
> I believe so, but the most direct method requires manual use of 
> btrfs-debug and similar tools, looking up addresses and tracing down the 
> files to which they belong.  That assumes the addresses trace to actual 
> files at all.  If they trace to metadata instead of data, then what's 
> affected isn't normally a file, but the metadata about files (including 
> checksums and very small files of only a few KiB).  And if it's metadata, 
> the problem's worse, as a single bad metadata block can affect multiple 
> actual files.
> 
> The more indirect way would be to use btrfs restore with the -t option, 
> feeding it the root address associated with the transid found (with that 
> association traced via btrfs-find-root), to restore the files from the 
> filesystem as it existed at that point to some other mounted filesystem, 
> also using the restore metadata option.  You could then diff the file 
> listings (or possibly per-file checksums, say md5sum, of both versions) 
> between your current backup (or the current mounted filesystem, since you 
> can still mount it) and the restored version, which would be the files as 
> of that transaction-id, and see which ones changed.  Those of course 
> would be the affected files. =:^]
> 

I think I'll try the btrfs restore as a learning exercise, and to check
the contents of my backup (I don't trust my memory, so something could
have changed since the last backup).

Does btrfs restore require the path to be on a btrfs filesystem?  I've
got an existing ext4 drive with enough free space to do the restore, so
would prefer to use it than have to buy another drive.

My plan is:

* btrfs restore /dev/sdX /path/to/ext4/restorepoint
** Where /dev/sdX is one of the two drives that were part of the raid1
   filesystem
* hashdeep audit the restored drive and backup
* delete the existing corrupted btrfs filesystem and recreate
* rsync the merged filesystem (from backup and restore) on to the new
  filesystem

Any comments or suggestions are welcome.


> > I do have a backup of the drive (which I believe is completely up to
> > date, the btrfs volume is used for archiving media and documents, and
> > single person use of git repositories, i.e. only very light writing and
> > reading).
> 
> Of course either one of the above is going to be quite some work, and if 
> you have a current backup, simply restoring it is likely to be far 
> easier, unless of course you're interested in practicing your recovery 
> technique or the like, certainly not a valueless endeavor, if you have 
> the time and patience for it.
> 
> The *GOOD* thing is that you *DO* have a current backup.  Far *FAR* too 
> many people we see posting here, are unfortunately finding out the hard 
> way, that their actions, or more precisely, lack thereof, in failing to 
> do backups, put the lie to any claims that they actually valued the 
> data.  As any good sysadmin can tell you, often from unhappy lessons such 
> as this, if it's not backed up, by definition, your actions are placing 
> its value at less than the time and resources necessary to do that backup 
> (modified of course by the risk factor of actually needing it, thus 
> taking care of the Nth level backup, some of which are off-site, if the 
> data is really /that/ valuable, while also covering the throw-away data 
> that's so trivial as to not justify even the effort of a single level of 
> backup).
> 
> So hurray for you! =:^)
> 
> (FWIW, I personally have backups of most stuff here, often several 
> levels, tho I don't always keep them current.  But should I be forced to 
> resort to them, I'm prepared to lose the intervening updates, as I 
> recognize that by failing to keep those backups current I really am 
> defining the intervening data at risk as worth less than the hassle and 
> resources to more regularly update the backups.  It wouldn't be pleasant 
> having to resort to them, and fortunately, the twice I might have since I 
> started running btrfs, btrfs restore was able to restore very close to 
> the latest copies, but if it comes to it, I'm prepared to live with loss 
> of the data since those somewhat dated backups, as for me, the most 
> important stuff is in my head anyway, and if I end up losing /that/ 
> backup, I won't be caring much about the others, will I? =:^)
> 
> -- 
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman

Thanks again for all your help, Duncan.

Cheers,
Alistair



* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07 10:02   ` Alistair Grant
@ 2015-12-07 13:48     ` Duncan
  2015-12-07 19:55       ` Alistair Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Duncan @ 2015-12-07 13:48 UTC (permalink / raw)
  To: linux-btrfs

Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:

> I think I'll try the btrfs restore as a learning exercise, and to check
> the contents of my backup (I don't trust my memory, so something could
> have changed since the last backup).

Trying btrfs restore is an excellent idea.  It'll make things far easier 
if you have to use it for real some day.

Note that while I see your kernel is reasonably current (4.2 series), I 
don't know what btrfs-progs Ubuntu ships.  There have been some marked 
improvements to restore somewhat recently, checking the wiki btrfs-progs 
release-changelog list says 4.0 brought optional metadata restore, 4.0.1 
added --symlinks, and 4.2.3 fixed a symlink path check off-by-one error.  
(And don't use 4.1.1 as its mkfs.btrfs is broken and produces invalid 
filesystems.)  So you'll want at least progs 4.0 to get the optional 
metadata restoration, and 4.2.3 to get full symlinks restoration support.

> Does btrfs restore require the path to be on a btrfs filesystem?  I've
> got an existing ext4 drive with enough free space to do the restore, so
> would prefer to use it than have to buy another drive.

Restoring to ext4 should be fine.

Btrfs restore writes files just as an ordinary application would, which 
is why metadata restoration is optional (without it, files get normal 
change and mod times and are written as the running user, root, with 
umask-based file perms, exactly as if a normal file-writing application 
had created them), so it will restore to any normal filesystem.  The 
filesystem it's restoring /from/ must of course be btrfs, and unmounted, 
since restore is designed to be used when mounting is broken, but it 
writes files normally, so it can write them to any filesystem.

FWIW, I restored to my reiserfs based media partition (still on spinning 
rust, my btrfs are all on ssd) here, since that's where I had the room to 
work with.

> My plan is:
> 
> * btrfs restore /dev/sdX /path/to/ext4/restorepoint
> ** Where /dev/sdX is one of the two drives that were part of the raid1
>    filesystem
> * hashdeep audit the restored drive and backup
> * delete the existing corrupted btrfs filesystem and recreate
> * rsync the merged filesystem (from backup and restore)
>   on to the new filesystem
> 
> Any comments or suggestions are welcome.

Looks very reasonable, here.  There's a restore page on the wiki with 
more information than the btrfs-restore manpage, describing how to use it 
with btrfs-find-root if necessary, etc.

https://btrfs.wiki.kernel.org/index.php/Restore

Some details on the page are a bit dated; it doesn't cover the dryrun, 
list-roots, metadata and symlink options, for instance, and these can be 
very helpful, but the general idea remains the same.

The general idea is to use btrfs-find-root to get a listing of available 
root generations (if restore can't find a working root from the 
superblocks or you want to try restoring an earlier root), then feed the 
corresponding bytenr to restore's -t option.

Note that generation and transid refer to the same thing, a normally 
increasing number, so higher generations are newer.  The wiki page makes 
this much clearer than it used to, but the old wording anyway was 
confusing to me until I figured that out.

Where the wiki page talks about root object-ids, those are the various 
subtrees, low numbers are the base trees, 256+ are subvolumes/snapshots.  
Note that restore's list-roots option lists these for the given bytenr as 
well.

So you try restore with list-roots (-l) to see what it gives you, try 
btrfs-find-root if not satisfied, to find older generations and get their 
bytenrs to plug into restore with -t, and then confirm specific 
generation bytenrs with list-roots again.

Once you have a good generation/bytenr candidate, try a dry-run (-D) to 
see if you get a list of files it's trying to restore that looks 
reasonable.

If the dry-run goes well, you can try the full restore, not forgetting 
the metadata and symlinks options (-m, -S, respectively), if desired.
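
The sequence described above can be written down as a small set of 
helpers.  This is a sketch under stated assumptions, not a tested 
recovery script: the cmd_* function names are hypothetical, the device, 
bytenr and target directory are placeholders, and each helper only 
prints the command line so it can be reviewed before anything is run.  
Only the options named in the text (-l, -t, -D, -m, -S) are used.

```shell
#!/bin/sh
# Sketch only: each helper PRINTS a command for review instead of running it.
# DEVICE, BYTENR and TARGET are placeholders supplied by the caller, and the
# helper names are hypothetical.

# List the tree roots restore can see, optionally for a specific root bytenr.
cmd_list_roots() {      # cmd_list_roots DEVICE [BYTENR]
    if [ -n "${2:-}" ]; then
        echo "btrfs restore -l -t $2 $1"
    else
        echo "btrfs restore -l $1"
    fi
}

# Search for older root generations and their bytenrs.
cmd_find_root() {       # cmd_find_root DEVICE
    echo "btrfs-find-root $1"
}

# Dry run against a chosen root; nothing is written to TARGET.
cmd_dry_run() {         # cmd_dry_run DEVICE BYTENR TARGET
    echo "btrfs restore -D -t $2 $1 $3"
}

# Full restore, including metadata (-m) and symlinks (-S).
cmd_full_restore() {    # cmd_full_restore DEVICE BYTENR TARGET
    echo "btrfs restore -m -S -t $2 $1 $3"
}
```

For example, cmd_dry_run /dev/sdX 123456 /path/to/ext4/restorepoint 
prints "btrfs restore -D -t 123456 /dev/sdX /path/to/ext4/restorepoint", 
where the bytenr 123456 is made up and would in practice come from the 
btrfs-find-root or list-roots output.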

From there you can continue with your plan as above.

One more bonus hint.  Since you'll be doing a new mkfs.btrfs, it's a good 
time to review active features and decide which ones you might wish to 
activate (or not, if you're concerned about old-kernel compatibility).  
Additionally, before repopulating your new filesystem, you may want to 
review mount options, particularly autodefrag if appropriate, and 
compression if desired, so they take effect from the very first file 
created on the new filesystem. =:^)

FWIW in the past I usually did an immediate post-mkfs.btrfs mount and 
balance with -dusage=0 -musage=0 to get rid of the single-mode chunk 
artifacts from the mkfs.btrfs as well, but with a new enough mkfs.btrfs 
you may be able to avoid that now, as -progs 4.2 was supposed to 
eliminate those single-mode mkfs.btrfs artifacts on multi-device 
filesystems.  I've just not done any fresh mkfs.btrfs since then so 
haven't had a chance to play with it and see it personally, just yet.
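
Whether a filesystem still carries those empty single-mode chunks can be 
read straight from the btrfs fi df output.  The filter below is a 
minimal sketch: the function name empty_single_chunks is hypothetical, 
and it assumes the line format shown in the fi df listing earlier in 
this thread.

```shell
#!/bin/sh
# Sketch: read `btrfs fi df` output on stdin and print the unused
# single-profile Data/System/Metadata lines (the mkfs.btrfs artifacts).
# GlobalReserve is not matched, since it is reported as single regardless.
# Returns non-zero when no such line is found.
empty_single_chunks() {
    grep -E '^(Data|System|Metadata), single:' | grep -F 'used=0.00B'
}
```

Assuming the new filesystem is mounted at a hypothetical /mnt/new, 
"btrfs fi df /mnt/new | empty_single_chunks" would list any leftovers, 
and "btrfs balance start -dusage=0 -musage=0 /mnt/new" is the balance 
described above that removes them.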

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07 13:48     ` Duncan
@ 2015-12-07 19:55       ` Alistair Grant
  2015-12-08 15:25         ` Duncan
  0 siblings, 1 reply; 10+ messages in thread
From: Alistair Grant @ 2015-12-07 19:55 UTC (permalink / raw)
  To: linux-btrfs

On Mon, Dec 07, 2015 at 01:48:47PM +0000, Duncan wrote:
> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:
> 
> > I think I'll try the btrfs restore as a learning exercise, and to check
> > the contents of my backup (I don't trust my memory, so something could
> > have changed since the last backup).
> 
> Trying btrfs restore is an excellent idea.  It'll make things far easier 
> if you have to use it for real some day.
> 
> Note that while I see your kernel is reasonably current (4.2 series), I 
> don't know what btrfs-progs Ubuntu ships.  There have been some marked 
> improvements to restore somewhat recently, checking the wiki btrfs-progs 
> release-changelog list says 4.0 brought optional metadata restore, 4.0.1 
> added --symlinks, and 4.2.3 fixed a symlink path check off-by-one error.  
> (And don't use 4.1.1 as its mkfs.btrfs is broken and produces invalid 
> filesystems.)  So you'll want at least progs 4.0 to get the optional 
> metadata restoration, and 4.2.3 to get full symlinks restoration support.
> 

Ubuntu 15.10 comes with btrfs-progs v4.0.  It looks like it is easy
enough to compile and install the latest version from
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git so
I'll do that.

Should I stick to 4.2.3 or use the latest 4.3.1?


> > Does btrfs restore require the path to be on a btrfs filesystem?  I've
> > got an existing ext4 drive with enough free space to do the restore, so
> > would prefer to use it than have to buy another drive.
> 
> Restoring to ext4 should be fine.
> 
> Btrfs restore writes files just as an ordinary application would, which 
> is why metadata restoration is optional (without it, files get normal 
> change and mod times and are written as the running user, root, with 
> umask-based file perms, exactly as if a normal file-writing application 
> had created them), so it will restore to any normal filesystem.  The 
> filesystem it's restoring /from/ must of course be btrfs, and unmounted, 
> since restore is designed to be used when mounting is broken, but it 
> writes files normally, so it can write them to any filesystem.
> 
> FWIW, I restored to my reiserfs based media partition (still on spinning 
> rust, my btrfs are all on ssd) here, since that's where I had the room to 
> work with.
>

Thanks for the confirmation.

 
> > My plan is:
> > 
> > * btrfs restore /dev/sdX /path/to/ext4/restorepoint
> > ** Where /dev/sdX is one of the two drives that were part of the raid1
> >    filesystem
> > * hashdeep audit the restored drive and backup
> > * delete the existing corrupted btrfs filesystem and recreate
> > * rsync the merged filesystem (from backup and restore)
> >   on to the new filesystem
> > 
> > Any comments or suggestions are welcome.
> 
> 
> Looks very reasonable, here.  There's a restore page on the wiki with 
> more information than the btrfs-restore manpage, describing how to use it 
> with btrfs-find-root if necessary, etc.
> 
> https://btrfs.wiki.kernel.org/index.php/Restore
> 

I'd seen this, but it isn't explicit about the target filesystem
support.  I should try and update the page a bit.


> Some details on the page are a bit dated; it doesn't cover the dryrun, 
> list-roots, metadata and symlink options, for instance, and these can be 
> very helpful, but the general idea remains the same.
> 
> The general idea is to use btrfs-find-root to get a listing of available 
> root generations (if restore can't find a working root from the 
> superblocks or you want to try restoring an earlier root), then feed the 
> corresponding bytenr to restore's -t option.
> 
> Note that generation and transid refer to the same thing, a normally 
> increasing number, so higher generations are newer.  The wiki page makes 
> this much clearer than it used to, but the old wording anyway was 
> confusing to me until I figured that out.
> 
> Where the wiki page talks about root object-ids, those are the various 
> subtrees, low numbers are the base trees, 256+ are subvolumes/snapshots.  
> Note that restore's list-roots option lists these for the given bytenr as 
> well.
> 
> So you try restore with list-roots (-l) to see what it gives you, try 
> btrfs-find-root if not satisfied, to find older generations and get their 
> bytenrs to plug into restore with -t, and then confirm specific 
> generation bytenrs with list-roots again.
> 
> Once you have a good generation/bytenr candidate, try a dry-run (-D) to 
> see if you get a list of files it's trying to restore that looks 
> reasonable.
> 
> If the dry-run goes well, you can try the full restore, not forgetting 
> the metadata and symlinks options (-m, -S, respectively), if desired.
> 
> From there you can continue with your plan as above.
> 
> One more bonus hint.  Since you'll be doing a new mkfs.btrfs, it's a good 
> time to review active features and decide which ones you might wish to 
> activate (or not, if you're concerned about old-kernel compatibility).  
> Additionally, before repopulating your new filesystem, you may want to 
> review mount options, particularly autodefrag if appropriate, and 
> compression if desired, so they take effect from the very first file 
> created on the new filesystem. =:^)
> 
> FWIW in the past I usually did an immediate post-mkfs.btrfs mount and 
> balance with -dusage=0 -musage=0 to get rid of the single-mode chunk 
> artifacts from the mkfs.btrfs as well, but with a new enough mkfs.btrfs 
> you may be able to avoid that now, as -progs 4.2 was supposed to 
> eliminate those single-mode mkfs.btrfs artifacts on multi-device 
> filesystems.  I've just not done any fresh mkfs.btrfs since then so 
> haven't had a chance to play with it and see it personally, just yet.
> 
> -- 
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman


Thanks!
Alistair



* Re: Fixing recursive fault and parent transid verify failed
  2015-12-07 19:55       ` Alistair Grant
@ 2015-12-08 15:25         ` Duncan
  2015-12-08 22:38           ` Alistair Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Duncan @ 2015-12-08 15:25 UTC (permalink / raw)
  To: linux-btrfs

Alistair Grant posted on Tue, 08 Dec 2015 06:55:04 +1100 as excerpted:

> On Mon, Dec 07, 2015 at 01:48:47PM +0000, Duncan wrote:
>> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:
>> 
>> > I think I'll try the btrfs restore as a learning exercise, and to
>> > check the contents of my backup (I don't trust my memory, so
>> > something could have changed since the last backup).
>> 
>> Trying btrfs restore is an excellent idea.  It'll make things far
>> easier if you have to use it for real some day.
>> 
>> Note that while I see your kernel is reasonably current (4.2 series), I
>> don't know what btrfs-progs Ubuntu ships.  There have been some marked
>> improvements to restore somewhat recently, checking the wiki
>> btrfs-progs release-changelog list says 4.0 brought optional metadata
>> restore, 4.0.1 added --symlinks, and 4.2.3 fixed a symlink path check
>> off-by-one error. (And don't use 4.1.1 as its mkfs.btrfs is broken and
>> produces invalid filesystems.)  So you'll want at least progs 4.0 to
>> get the optional metadata restoration, and 4.2.3 to get full symlinks
>> restoration support.
>> 
>> 
> Ubuntu 15.10 comes with btrfs-progs v4.0.  It looks like it is easy
> enough to compile and install the latest version from
> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git so
> I'll do that.
> 
> Should I stick to 4.2.3 or use the latest 4.3.1?

I generally use the latest myself, but as a general guideline I recommend
at minimum a userspace version series matching that of your kernel.  If
you follow the usual kernel recommendations (within two kernel series of
either current or LTS, so presently 4.2 or 4.3 for current, or 3.18 or
4.1 for LTS), that keeps userspace reasonably current as well.  And since
the userspace of a particular version was developed concurrently with the
kernel of the same series, the two are relatively in sync.

So with a 4.2 kernel, I'd suggest at least a 4.2 userspace.  If you want
the latest, as I generally do, and are willing to put up with occasional
bleeding-edge bugs like that broken mkfs.btrfs in 4.1.1, by all means use
the latest.  Otherwise, the general guideline of matching your kernel
series is quite acceptable.
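
As a rough illustration of that guideline, the check below compares the
major.minor series of a kernel version string with that of a btrfs-progs
version string.  The function names series and progs_matches_kernel are
made up for this sketch, and the version strings are the ones mentioned in
this thread; in practice the inputs would come from uname -r and
btrfs version.

```shell
# series VERSION: print "MAJOR MINOR" for strings such as "v4.3.1"
# or "4.2.0-19-generic" (hypothetical helper, not part of btrfs-progs).
series() {
    printf '%s\n' "$1" | sed -e 's/^[^0-9]*//' |
        awk -F'[^0-9]+' '{ print $1, $2 + 0 }'
}

# progs_matches_kernel KERNEL PROGS: succeed if the progs series is at
# least as new as the kernel series.
progs_matches_kernel() {
    # After this, $1 $2 are the kernel major/minor, $3 $4 the progs ones.
    set -- $(series "$1") $(series "$2")
    [ "$3" -gt "$1" ] || { [ "$3" -eq "$1" ] && [ "$4" -ge "$2" ]; }
}

if progs_matches_kernel "4.2.0-19-generic" "v4.0"; then
    echo "progs series is current enough"
else
    echo "progs series is older than the kernel series"
fi
```

This only encodes the minimum-version guideline; it says nothing about
known-bad releases such as the 4.1.1 mkfs.btrfs mentioned above.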

The exception would be if you're trying to fix or recover from a broken 
filesystem, in which case the very latest tends to have the best chance 
at fixing things, since it has fixes for (or lacking that, at least 
detection of) the latest round of discovered bugs, that older versions 
will lack.

While btrfs restore does fall into the recover-from-broken category, we
know from the changelogs that nothing specific has gone into it since the
mentioned 4.2.3 symlink off-by-one fix.  So while I would recommend at
least 4.2.3 since you are going to be working with restore, there's no
urgent need for 4.3.0 or 4.3.1 if you're more comfortable with the older
version.  (In fact, while I knew I was on 4.3.something, I just had to
run btrfs version to check whether it was 4.3 or 4.3.1, myself.  FWIW,
it was 4.3.1.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: Fixing recursive fault and parent transid verify failed
  2015-12-08 15:25         ` Duncan
@ 2015-12-08 22:38           ` Alistair Grant
  2015-12-09 10:19             ` Duncan
  0 siblings, 1 reply; 10+ messages in thread
From: Alistair Grant @ 2015-12-08 22:38 UTC (permalink / raw)
  To: linux-btrfs

On Tue, Dec 08, 2015 at 03:25:14PM +0000, Duncan wrote:
> Alistair Grant posted on Tue, 08 Dec 2015 06:55:04 +1100 as excerpted:
> 
> > On Mon, Dec 07, 2015 at 01:48:47PM +0000, Duncan wrote:
> >> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as excerpted:
> >> 
> >> > I think I'll try the btrfs restore as a learning exercise, and to
> >> > check the contents of my backup (I don't trust my memory, so
> >> > something could have changed since the last backup).
> >> 
> >> Trying btrfs restore is an excellent idea.  It'll make things far
> >> easier if you have to use it for real some day.
> >> 
> >> Note that while I see your kernel is reasonably current (4.2 series), I
> >> don't know what btrfs-progs ubuntu ships.  There have been some marked
> >> improvements to restore somewhat recently, checking the wiki
> >> btrfs-progs release-changelog list says 4.0 brought optional metadata
> >> restore, 4.0.1 added --symlinks, and 4.2.3 fixed a symlink path check
> >> off-by-one error. (And don't use 4.1.1 as its mkfs.btrfs is broken and
> >> produces invalid filesystems.)  So you'll want at least progs 4.0 to
> >> get the optional metadata restoration, and 4.2.3 to get full symlinks
> >> restoration support.
> >> 
> >> ...

Thanks again Duncan for your assistance.

I plugged the ext4 drive I planned to use for the recovery into the
machine and immediately got a couple of errors, which makes me wonder
whether there isn't a hardware problem with the machine somewhere.  So I
decided to move to another machine to do the recovery.

So I'm now recovering on Arch Linux 4.1.13-1 with btrfs-progs v4.3.1
(the latest version from archlinuxarm.org).

Attempting:

sudo btrfs restore -S -m -v /dev/sdb /mnt/btrfs-recover/ ^&1 | tee btrfs-recover.log

only recovered 53 of the more than 106,000 files that should be available.

The log is available at: 

https://www.dropbox.com/s/p8bi6b8b27s9mhv/btrfs-recover.log?dl=0

I did attempt btrfs-find-root, but couldn't make sense of the output:

https://www.dropbox.com/s/qm3h2f7c6puvd4j/btrfs-find-root.log?dl=0

Simply mounting the drive, then re-mounting it read only, and rsync'ing
the files to the backup drive recovered 97,974 files before crashing.
If anyone is interested, I've uploaded a photo of the console to:

https://www.dropbox.com/s/xbrp6hiah9y6i7s/rsync%20crash.jpg?dl=0

I'm currently running a hashdeep audit between the recovered files and
the backup to see how the recovery went.

If you'd like me to try any other tests, I'll keep the damaged file
system for at least the next day or so.

Thanks again for all your assistance,
Alistair

* Re: Fixing recursive fault and parent transid verify failed
  2015-12-08 22:38           ` Alistair Grant
@ 2015-12-09 10:19             ` Duncan
  2015-12-12 22:12               ` Alistair Grant
  0 siblings, 1 reply; 10+ messages in thread
From: Duncan @ 2015-12-09 10:19 UTC (permalink / raw)
  To: linux-btrfs

Alistair Grant posted on Wed, 09 Dec 2015 09:38:47 +1100 as excerpted:

> On Tue, Dec 08, 2015 at 03:25:14PM +0000, Duncan wrote:
>> Alistair Grant posted on Tue, 08 Dec 2015 06:55:04 +1100 as excerpted:
>> 
>> > On Mon, Dec 07, 2015 at 01:48:47PM +0000, Duncan wrote:
>> >> Alistair Grant posted on Mon, 07 Dec 2015 21:02:56 +1100 as
>> >> excerpted:
>> >> 
>> >> > I think I'll try the btrfs restore as a learning exercise
>> >> 
>> >> Trying btrfs restore is an excellent idea.  It'll make things far
>> >> easier if you have to use it for real some day.
> 
> Thanks again Duncan for your assistance.
> 
> I plugged the ext4 drive I planned to use for the recovery in to the
> machine and immediately got a couple of errors, which makes me wonder
> whether there isn't a hardware problem with the machine somewhere.
> 
> So decided to move to another machine to do the recovery.

Ouch!  That can happen, and if you moved the ext4 drive to a different 
machine and it was fine there, then it's not the drive.

But you didn't say what kind of errors or if you checked SMART, or even
how it was plugged in (USB or SATA-direct or...).  So I guess you have
that side of things under control.  (If not, there are some here who know
quite a bit about that sort of thing...)

> So I'm now recovering on Arch Linux 4.1.13-1 with btrfs-progs v4.3.1
> (the latest version from archlinuxarm.org).
> 
> Attempting:
> 
> sudo btrfs restore -S -m -v /dev/sdb /mnt/btrfs-recover/ ^&1 | tee
> btrfs-recover.log
> 
> only recovered 53 of the more than 106,000 files that should be
> available.
> 
> The log is available at:
> 
> https://www.dropbox.com/s/p8bi6b8b27s9mhv/btrfs-recover.log?dl=0
> 
> I did attempt btrfs-find-root, but couldn't make sense of the output:
> 
> https://www.dropbox.com/s/qm3h2f7c6puvd4j/btrfs-find-root.log?dl=0

Yeah, deciphering btrfs-find-root's output takes a bit of knowledge.
Between what I had said and the wiki, I was hoping you could make sense
of things without further help, but...

Well, at least this gets you some practice before you are desperate. =:^)

FWIW, I was really hoping that it would find generation/transid 2308, 
since that's what it was finding on those errors, but that seems to be 
too far back.

OK, here's the thing about transaction IDs aka transids aka generations.  
Normally, it's a monotonically increasing number, representing the 
transaction/commit count at that point.

Taking a step back, btrfs organizes things as a tree of trees, with each 
change cascading up (down?) the tree to its root, and then to the master 
tree's root.  Between this and btrfs' copy-on-write nature, this means 
the filesystem is atomic.  If the system crashes at any point, either the 
latest changes are committed and the master root reflects them, or the 
master root points to the previous consistent state of all the subtrees, 
which is still in place due to copy-on-write and the fact that the 
changes hadn't cascaded all the way up the trees to the master root, yet.

And each time the master root is updated, the generation aka transid is
incremented by one.  So 3503 is the current generation (see the
"Superblock thinks the generation is..." line in your find-root log),
3502 the one before that, 3501 the one before that...

The superblocks record the current transid and point (by address, aka 
bytenr) to that master root.
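
To make the commit model a bit more concrete, here is a toy model in
shell.  It is only an analogy under loud assumptions: each "root" is a
plain file named after its generation, the "superblock" is a file holding
the current generation, and the names commit and current are invented for
this sketch.  Nothing here touches a real btrfs.

```shell
# Toy model only: roots are files root.N, the superblock is the file
# "super" holding the generation of the current root.
fs=$(mktemp -d)
trap 'rm -rf "$fs"' EXIT

# commit CONTENT [crash]: write a new root, then update the superblock.
commit() {
    cur=$(cat "$fs/super" 2>/dev/null || echo 0)
    new=$((cur + 1))
    # Copy-on-write: the new root goes to a new place, root.$cur is untouched.
    printf '%s\n' "$1" > "$fs/root.$new"
    # Simulated crash before the superblock update: the commit never happened.
    [ "${2:-}" = crash ] && return 0
    # The commit point: the superblock now names the new generation.
    printf '%s\n' "$new" > "$fs/super"
}

# current: print the content of the root the superblock points to.
current() {
    cat "$fs/root.$(cat "$fs/super")"
}

commit "state A"
commit "state B"
commit "state C" crash
current    # prints "state B", the last fully committed state
```

The leftover root.1 file in this model plays the role of the old roots
that btrfs-find-root goes looking for.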

But, because btrfs is copy-on-write, older copies of the master root (and
the other roots it points to) tend to hang around for a while.  Which is
where btrfs-find-root comes along, as it's designed to find all those old
roots, listing them by bytenr and generation/transid.

In your case, while generation 3503 is current, there's a list going back
to generation 2497 with only a few (just eyeballing it) missing, then
2326, and pretty much nothing before that but the REALLY early generations
2 and 3, which are likely a nearly empty filesystem.

OK, that explains the generations/transids.  There are also levels, which
I don't clearly understand myself; definitely not well enough to try to
explain, tho I could make some WAGs but that'd just confuse things if
they're equally wildly wrong.  But it turns out that levels aren't in
practice something you normally need to worry much about anyway, so
ignoring them seems to work fine.

Then, there are bytenrs, the block addresses.  These are more or less
random large numbers, from an admin perspective, but they're very
important numbers, because this is the number you feed to restore's -t
option, that tells it which tree root to use.

Put a different way, humans read the generation aka transid numbers; 
btrfs reads the block numbers.  So what we do is find a generation number 
that looks reasonable, and get its corresponding block number, to feed to 
restore -t.
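
A minimal sketch of that lookup, assuming the find-root log uses the
"Well block <bytenr>(gen: <generation> level: <level>) ..." line format;
check that against your own log before relying on it.  The helper names
roots, bytenr_for and sample_find_root are invented here, and apart from
the 3480 / 564199424 pair (taken from this thread) the sample bytenrs are
placeholders.

```shell
# roots: read btrfs-find-root output on stdin and print
# "GENERATION BYTENR" pairs, newest generation first.
roots() {
    sed -n 's/^Well block \([0-9][0-9]*\)(gen: \([0-9][0-9]*\) level: [0-9][0-9]*).*/\2 \1/p' |
        sort -k1,1nr
}

# bytenr_for GENERATION: print the bytenr(s) listed for that generation.
bytenr_for() {
    roots | awk -v gen="$1" '$1 == gen { print $2 }'
}

# Stand-in for a real log; only the gen 3480 line reflects the thread.
sample_find_root() {
    cat <<'EOF'
Superblock thinks the generation is 3503
Superblock thinks the level is 1
Well block 500000000(gen: 3400 level: 1) seems good, but generation/level doesn't match, want gen: 3503 level: 1
Well block 564199424(gen: 3480 level: 1) seems good, but generation/level doesn't match, want gen: 3503 level: 1
Well block 400000000(gen: 2497 level: 0) seems good, but generation/level doesn't match, want gen: 3503 level: 1
EOF
}

sample_find_root | bytenr_for 3480    # prints 564199424
```

With a real log, something like roots < btrfs-find-root.log gives the
human-readable generation list, and bytenr_for turns a chosen generation
into the number for restore -t.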

OK, knowing that, you can perhaps make a bit more sense of what those 
transid verify failed messages are all about.  As I said, the current 
generation is 3503.  Apparently, there's a problem in a subtree, however, 
where the last update was gen 3361, but some intervening updates got lost 
and it can only find 2308 (with a couple updated to 2309).

Which is why I had hoped btrfs-find-root would find history going back to 
2308, since while it'll be old, that would bring that subtree back 
online.  But it wasn't to be.  However, it's still possible that 
something a bit newer references that old 2308, and if we're lucky, one 
of those newer generations will indeed give us access to that subtree 
once again, even if it doesn't have the latest changes to it.

OK, so normally, you'd try with the newest generation you can find that 
works, but here, we know we may have to use an older one to get access to 
that out of sync subtree.

So here's what you do.  Pick a generation from the list.  We'll start 
fairly new (high generation number), but go back a few commits, since 
it's obvious the current generation is screwed up.  Somewhat at random, 
I'll say 3480, 20-some generations back.

For generation 3480, find-root says the bytenr aka block is 564199424.

So now feed that into btrfs restore using -t.  We'll also want to use -l, 
to list the subtrees.

btrfs restore -l -t 564199424 /dev/sdb

What you're looking for there is a complete list of trees.  You want 
(based on the listing I have for a btrfs here) the extent, dev, fs, csum, 
uuid, and data-reloc trees.  If you have subvolumes I think they should 
be listed as well.  (I don't use subvolumes or snapshots here, so I can't 
say for sure.)

If you have a complete list of at least the base trees (forgetting about 
snapshots/subvols for the moment), it's a reasonable candidate to try 
restoring.  If not, try picking a different generation from find-root and 
feeding its block number to restore.
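
The completeness check can be sketched as below.  It assumes that
restore -l prints one "tree key (NAME ROOT_ITEM 0) <bytenr> level <n>"
line per tree and that the base trees show up under the names EXTENT_TREE,
DEV_TREE, FS_TREE, CSUM_TREE, UUID_TREE and DATA_RELOC_TREE; verify both
against real output.  The helper names missing_trees and sample_listing
and the sample bytenrs are placeholders.

```shell
# missing_trees: read `btrfs restore -l` output on stdin and print each
# expected base tree that does not appear in it (no output = complete).
missing_trees() {
    listing=$(cat)
    for tree in EXTENT_TREE DEV_TREE FS_TREE CSUM_TREE UUID_TREE DATA_RELOC_TREE; do
        printf '%s\n' "$listing" | grep -qF "($tree ROOT_ITEM" || echo "$tree"
    done
}

# Made-up listing that lacks the csum tree, so it is a poor candidate.
sample_listing() {
    cat <<'EOF'
 tree key (EXTENT_TREE ROOT_ITEM 0) 564203520 level 2
 tree key (DEV_TREE ROOT_ITEM 0) 564207616 level 1
 tree key (FS_TREE ROOT_ITEM 0) 564211712 level 2
 tree key (UUID_TREE ROOT_ITEM 0) 564215808 level 0
 tree key (DATA_RELOC_TREE ROOT_ITEM 0) 564219904 level 0
EOF
}

sample_listing | missing_trees    # prints CSUM_TREE
```

Against a real device the input would come from piping
btrfs restore -l -t <block> <device> into missing_trees.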

Once you have a good generation candidate based on restore -l, you can
use restore -S -m -D -t <block> to do a dry run, and see if it seems to
give you a reasonable number of files or not.  Note that the last time I
actually used it for real, the number of files restored in the real run
was somewhat more than in the dry run; I think there's some stuff the dry
run doesn't or possibly can't try that a run for real does.  But clearly,
only a hundred files or so when you're expecting tens of thousands isn't
a reasonable candidate.  Again, if not, start over with another
generation pick from find-root, while if the number of files in the
dry-run seems reasonable, try the real restore without the -D.
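
Put together, the sequence for one candidate looks roughly like the sketch
below.  It only prints the commands rather than running them.  The device
and target path are the ones used earlier in this thread, and
restore_steps and count_restored are invented helper names.  The count
helper assumes that restore -v logs one "Restoring <path>" line per
restored item, which is worth checking against a real log.

```shell
dev=/dev/sdb                 # device used earlier in this thread
dest=/mnt/btrfs-recover/     # restore target used earlier in this thread

# restore_steps BYTENR: print, without running them, the commands to try
# in order: list the trees, dry run, then the real restore.
restore_steps() {
    echo "btrfs restore -l -t $1 $dev"
    echo "btrfs restore -S -m -D -v -t $1 $dev $dest"
    echo "btrfs restore -S -m -v -t $1 $dev $dest"
}

# count_restored: count "Restoring ..." lines in a saved log on stdin.
count_restored() {
    grep -c '^Restoring '
}

restore_steps 564199424
```

Saving the dry-run output with tee and feeding it to count_restored gives
a number to compare against the file count you expect.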

Meanwhile, in terms of picking candidate generations, the example one I 
used was reasonably current, but hopefully before whatever was lost in 
the crash.  If that didn't turn out to be the case, I'd try going back 
further, say to generation 3400, then to 3361 and something shortly 
before that, say 3360 and 3350, since 3361 was one of the wanted 
generations in the errors, then 3300, 3000, 2497, and try both of the 
2326s, one of which I'm guessing is an error and thus very bad.  (I 
didn't actually check that your find-root listing listed all of those in 
the middle, I'm simply using them as examples.)

Of course the further back you go (beyond whatever immediate damage), the 
more likely some of the subtree roots have been overwritten, so the more 
likely you'll fail at the restore -l step.

If you go back quite a way and hit a good one, then try coming forward
again, using the standard bisect method.  If the first set of picks
doesn't yield anything reasonable, try some others.
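
The bisect step can be sketched as follows, assuming that within the range
being searched every generation up to some point is usable and every later
one is not (real filesystems need not be that tidy).  The names
newest_good and is_good are invented; the is_good below is a stand-in that
pretends everything up to 3400 works, whereas a real one would run the
restore -l and dry-run checks for that generation's bytenr.

```shell
# is_good GENERATION: demo stand-in only; replace with a real check.
is_good() {
    [ "$1" -le 3400 ]
}

# newest_good: read generations (ascending, one per line) on stdin and
# print the newest one for which is_good succeeds.
newest_good() {
    gens=$(cat)
    lo=1
    hi=$(( $(printf '%s\n' "$gens" | wc -l) ))
    best=
    while [ "$lo" -le "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        gen=$(printf '%s\n' "$gens" | sed -n "${mid}p")
        if is_good "$gen"; then
            best=$gen            # usable: look for something newer
            lo=$((mid + 1))
        else
            hi=$((mid - 1))      # not usable: go further back
        fi
    done
    [ -n "$best" ] && echo "$best"
}

printf '%s\n' 3000 3300 3350 3360 3361 3400 3480 | newest_good    # prints 3400
```

With seven candidates this needs three checks instead of seven, which
matters when each check is a slow dry run.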

If you try quite a few and nothing seems to be coming up good, then it 
might be time to try using the check and rescue tools to start restoring 
bad trees as necessary.  But I've never gotten to this point and would be 
feeling my own way along as well, so I'm not going to try being the blind 
leading the blind into that.

But given a reasonable backup in any case, by this point you will have at
least covered the find-root and restore basics.  So even if you weren't
particularly successful with it this time, you have the backup and don't
need to be, and you've already accomplished your goal of getting some
practice in case you do end up needing it at some point.  Hopefully, when
you actually do need it, the problem will be different and restore will
work better for you.

But of course, with good backups, you _should_ never find yourself 
actually needing to use this knowledge.  But it's still nice to have 
actually gotten a bit of experience with it, just in case. =:^)

> Simply mounting the drive, then re-mounting it read only, and rsync'ing
> the files to the backup drive recovered 97,974 files before crashing.
> If anyone is interested, I've uploaded a photo of the console to:
> 
> https://www.dropbox.com/s/xbrp6hiah9y6i7s/rsync%20crash.jpg?dl=0
> 
> I'm currently running a hashdeep audit between the recovered files and
> the backup to see how the recovery went.
> 
> If you'd like me to try any other tests, I'll keep the damaged file
> system for at least the next day or so.
> 
> Thanks again for all your assistance,
> Alistair

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Re: Fixing recursive fault and parent transid verify failed
  2015-12-09 10:19             ` Duncan
@ 2015-12-12 22:12               ` Alistair Grant
  0 siblings, 0 replies; 10+ messages in thread
From: Alistair Grant @ 2015-12-12 22:12 UTC (permalink / raw)
  To: linux-btrfs

On Wed, Dec 09, 2015 at 10:19:41AM +0000, Duncan wrote:
> Alistair Grant posted on Wed, 09 Dec 2015 09:38:47 +1100 as excerpted:
> 
> > On Tue, Dec 08, 2015 at 03:25:14PM +0000, Duncan wrote:
> > Thanks again Duncan for your assistance.
> > 
> > I plugged the ext4 drive I planned to use for the recovery in to the
> > machine and immediately got a couple of errors, which makes me wonder
> > whether there isn't a hardware problem with the machine somewhere.
> > 
> > So decided to move to another machine to do the recovery.
> 
> Ouch!  That can happen, and if you moved the ext4 drive to a different 
> machine and it was fine there, then it's not the drive.
> 
> But you didn't say what kind of errors or if you checked SMART, or even 
> how it was plugged in (USB or SATA-direct or...).  So I guess you have 
> that side of things under control.  (If not, there's some here who know 
> quite a bit about that sort of thing...)

Yep, I'm familiar enough with smartmontools, etc. to (hopefully) figure
this out on my own.

> 
> > So I'm now recovering on Arch Linux 4.1.13-1 with btrfs-progs v4.3.1
> > (the latest version from archlinuxarm.org).
> > 
> > Attempting:
> > 
> > sudo btrfs restore -S -m -v /dev/sdb /mnt/btrfs-recover/ ^&1 | tee
> > btrfs-recover.log
> > 
> > only recovered 53 of the more than 106,000 files that should be
> > available.
> > 
> > The log is available at:
> > 
> > https://www.dropbox.com/s/p8bi6b8b27s9mhv/btrfs-recover.log?dl=0
> > 
> > I did attempt btrfs-find-root, but couldn't make sense of the output:
> > 
> > https://www.dropbox.com/s/qm3h2f7c6puvd4j/btrfs-find-root.log?dl=0
> 
> Yeah, btrfs-find-root's output deciphering takes a bit of knowledge.  
> Between what I had said and the wiki, I was hoping you could make sense 
> of things without further help, but...
>
> ...

It turns out that a drive from a separate filesystem was dying and
causing all the weird behaviour on the original machine.

Having two failures at the same time (drive physical failure and btrfs
filesystem corruption) was a bit too much for me, so I aborted the btrfs
restore attempts, bought a replacement drive and just went back to the
backups (for both failures).

Unfortunately, I now won't be able to determine whether there was any
connection between the failures or not.

So while I didn't get to practice my restore skills, the good news is
that it is all back up and running without any problems (yet :-)).

Thank you very much for the description and detailed set of steps for
using btrfs-find-root and restore.  While I didn't get to use them this
time, I've added links to the mailing list archive in my btrfs wiki user
page so I can find my way back (and if others search for restore and
find root they may also benefit from your effort).

Thanks again,
Alistair
