From: Matt Brown <shadowfax@gmx.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: Copy/move btrfs volume
Date: Thu, 01 Jul 2010 16:21:14 -0600
Message-ID: <4C2D14DA.3040301@gmx.com>
In-Reply-To: <i0huf7$98a$1@dough.gmane.org>

On 07/01/2010 05:33 AM, Lubos Kolouch wrote:
> Daniel J Blueman, Thu, 01 Jul 2010 12:26:10 +0100:
>>> What is the correct way to do this?
>>
>> The only way to do this preserving duplication is to use hardlinks
>> between duplicated files (which reference counts the inode), and use
>> 'rsync -H'.
>>
>> Dan

Hello,

For backups that consist mostly of hard links, I usually use dd to copy
the file system at the block level:

# dd if=/dev/sda of=/dev/sdb bs=20M

and then expand the file system. This is because I found that tools like
rsync, while usually fast, are extremely slow when dealing with millions
of hard linked files.

This could also be used for btrfs to keep its snapshots.
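
A rough sketch of the dd-then-expand sequence described above. Only the
dd line comes from my actual usage; the function name clone_and_grow,
the DRY_RUN switch and the default /mnt mount point are made up for
illustration. It assumes the file system sits directly on the device
(no partition table) and that current e2fsprogs and btrfs-progs command
names are available.

```shell
#!/bin/sh
# Sketch only. Device names and the mount point are hypothetical
# placeholders; dd overwrites the target without asking.
# Set DRY_RUN=1 to print the commands instead of running them.
clone_and_grow() {
    src=$1; dst=$2; fstype=$3; mnt=${4:-/mnt}
    run=${DRY_RUN:+echo}    # empty unless DRY_RUN is set and non-empty

    # 1. Block-level copy: hard links and snapshots come along as-is.
    $run dd if="$src" of="$dst" bs=20M || return 1

    # 2. Grow the copied file system to fill the larger target.
    case $fstype in
        ext4)
            # Stops on any non-zero e2fsck status, including
            # "errors corrected".
            $run e2fsck -f "$dst" || return 1
            $run resize2fs "$dst"
            ;;
        btrfs)
            # btrfs is resized while mounted.
            $run mount "$dst" "$mnt" || return 1
            $run btrfs filesystem resize max "$mnt"
            ;;
        *)
            echo "unsupported file system: $fstype" >&2
            return 1
            ;;
    esac
}
```

Running "DRY_RUN=1 clone_and_grow /dev/sda /dev/sdb btrfs" prints the
three commands without touching any disk. One caveat I am fairly sure
of: a dd clone of a btrfs file system carries the same file system UUID
as the original, so the source and the copy should not both be attached
when the copy is mounted.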

> A scenario - I have raid5 of say, 1TB HDDs. It contains many snapshots.
> Then, few years later, new machine is bought and there are, say, 5TB
> discs.
> ...
> Lubos

In my case, I had to copy BackupPC hard-linked files from a full disk to
a smaller disk, both using ext4, so I could not use dd. What should
normally have taken an hour took almost a week instead. (Yes, I wanted
to use btrfs, but it had a hard link limit of 255; I don't know if it
still does.)
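
To get a feel for whether a file-level copy will run into this, one
option is to count the hard-linked files before choosing between dd and
rsync. A minimal sketch; the function name count_hardlinked is made up
for illustration, and file names containing newlines would be
miscounted.

```shell
#!/bin/sh
# Count regular files under a directory that have more than one link.
# Each name of a multiply-linked inode is counted separately.
count_hardlinked() {
    # The arithmetic expansion strips any padding that wc may add.
    echo $(( $(find "$1" -type f -links +1 | wc -l) ))
}
```

For example, "count_hardlinked /var/lib/backuppc" (the path is only an
illustration) prints a single number. If that number is in the
millions, a hard-link-aware copy such as 'rsync -H' is likely to be the
slow path described above.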

It would be nice to have a btrfs command that could rapidly copy over
the file system, snapshots, and all other file system info.

But what benefit would having a native btrfs 'copy/rsync' command have
over the dd/resize option?

Pros
- Files will be immediately checksummed on the new disks, but this may
not be as important since a checksum/verify command will be implemented.
- Great 'feature' for copying files to new drives, and keeping
snapshots. Could even be used to export snapshots.
- I believe compressed files will have to be uncompressed and
recompressed, depending on when the file is checksummed. (I may be wrong
on this one.) This will actually be a con for slow and/or heavily loaded
machines.
- One command instead of many (dd -> resize -> verify).

Cons
- File system would still have to be unmounted, or at least read-only,
as I doubt the command will have rsync's update or delete abilities.
But, maybe it could.

Questionable
- May be faster than dd/resize, or it may be just as slow as rsync is
with hard links. And I am talking about dozens to thousands of
snapshots, and millions to billions of files.

Matt
