From: Kai Krakow <hurikhan77@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Data recovery from a linear multi-disk btrfs file system
Date: Thu, 21 Jul 2016 00:30:52 +0200
Message-ID: <20160721003052.146a1caf@jupiter.sol.kaishome.de>
In-Reply-To: <20160721001941.0352d42f@jupiter.sol.kaishome.de>
On Thu, 21 Jul 2016 00:19:41 +0200, Kai Krakow
<hurikhan77@gmail.com> wrote:
> On Fri, 15 Jul 2016 20:45:32 +0200, Matt <langelino@gmx.net> wrote:
>
> > > On 15 Jul 2016, at 14:10, Austin S. Hemmelgarn
> > > <ahferroin7@gmail.com> wrote:
> > >
> > > On 2016-07-15 05:51, Matt wrote:
> [...]
> > > The tool you want is `btrfs restore`. You'll need somewhere to
> > > put the files from this too of course. That said, given that you
> > > had data in raid0 mode, you're not likely to get much other than
> > > very small files back out of this, and given other factors,
> > > you're not likely to get what you would consider reasonable
> > > performance out of this either.
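
For the record, a minimal btrfs-restore run could look like the
sketch below - device and target path are placeholders, adjust them
to your setup:

  # dry run first: list what would be restored without writing
  btrfs restore -D -v /dev/sdb /mnt/rescue
  # then restore for real; -i ignores errors on the damaged fs
  btrfs restore -i -v /dev/sdb /mnt/rescue
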
> >
> > Thanks so much for pointing me towards btrfs-restore. I will
> > surely give it a try. Note that the FS is not in a RAID0 but in a
> > linear ("JBOD") configuration, which is why it somehow did not
> > occur to me to try btrfs-restore. The good news is that in this
> > configuration the files are *not* distributed across disks, and we
> > can read most of the files just fine. The failed disk was actually
> > smaller than the other five, so we should be able to recover more
> > than 5/6 of the data, shouldn't we? My trouble is that the IO
> > errors due to the missing disk cripple the transfer speed of both
> > rsync and dd_rescue.
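
If the IO errors are what kills your throughput, GNU ddrescue (a
different tool than dd_rescue, but built for exactly this case) keeps
a mapfile of bad areas so that re-runs skip them instead of hammering
the same sectors again. A rough sketch, with device and file names as
placeholders:

  # first pass: skip slow scraping, grab the easily readable data
  ddrescue -n /dev/sdc sdc.img sdc.map
  # second pass: retry the remaining bad areas up to three times
  ddrescue -r3 /dev/sdc sdc.img sdc.map
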
> >
> > > Your best bet to get a working filesystem again would be to just
> > > recreate it from scratch, there's not much else that can be done
> > > when you've got a raid0 profile and have lost a disk.
> >
> > This is what I plan to do if btrfs-restore turns out to be too
> > slow and nobody on this list has a better idea. It will, however,
> > require transferring >15TB across the Atlantic (this is where the
> > backups reside). That would be tedious, which is why I would love
> > to avoid it.
>
> Depending on the importance of the data, it may be cheaper to
> transfer the data physically on hard disks...
>
> However, if your backup potentially includes a lot of duplicate
> blocks, you may have a better experience using borgbackup to
> transfer the data - it's a free, deduplicating and compressing
> backup tool. If your data isn't already compressed and doesn't
> contain a lot of images, you may end up with 8TB or less to
> transfer. I'm using borg to compress a 300GB server down to a
> 50-60GB backup (and this already includes 4 weeks worth of
> retention). My home machine compresses down to 1.2TB from 1.8TB of
> data with around 1 week of retention - though I have a lot of
> non-duplicated binary data (images, videos, games).
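
As a minimal sketch of such a borg setup - repository path,
compression choice and source directories are only examples:

  # create the repository once
  borg init --encryption=repokey /mnt/backup/repo
  # take a deduplicated, compressed backup
  borg create --compression lz4 --stats \
      /mnt/backup/repo::server-{now} /etc /home /srv
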
>
> When backing up across a long or slow network link, you may want to
> work with a local cache of the backup and rely on deduplication. My
> strategy is to use borgbackup to create backups locally, then rsync
> the result to the remote location.
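
In practice that could be as simple as the following - paths are
invented for illustration:

  # back up locally first ...
  borg create --compression lz4 /mnt/backup/repo::home-{now} /home
  # ... then mirror the finished repository to the remote site;
  # --partial keeps interrupted transfers resumable
  rsync -a --partial /mnt/backup/repo/ remote:/backups/repo/
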
BTW: You should start transferring the backup to your local location
in parallel with recovering your local storage. Another option would
be to recover what's possible, take a borg backup of it, then use
borg to back up the remote location into the same repository - thanks
to its deduplication it would only transfer blocks not known locally.
You would then be able to recover data easily from either copy in the
repository. This, however, will only work properly if your remote
backup has been built using rsync (so it has the same file structure
and is not some archive format) or is extracted temporarily at the
remote location. Extra tip: borg recovers from partial transfers, so
you could still parallelize both options. ;-) You just cannot access
a single backup repository concurrently.
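
A sketch of that combined approach - repository location and paths
are made up for illustration, and the two runs must happen one after
the other because of the no-concurrent-access limitation:

  # 1) archive whatever btrfs-restore could recover locally
  borg create /srv/repo::recovered-{now} /mnt/rescue
  # 2) then, running on the remote machine, archive the remote copy
  #    into the same repository over ssh (borg must be installed on
  #    both ends) - deduplication means only chunks unknown to the
  #    repository actually cross the wire
  borg create ssh://user@local-host/srv/repo::remote-{now} /backup/data
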
--
Regards,
Kai
Replies to list-only preferred.