From: Bill Davidsen <davidsen@tmr.com>
To: Jon Nelson <jnelson@jamponi.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: Backups w/ rsync
Date: Fri, 28 Sep 2007 12:25:23 -0400
Message-ID: <46FD2AF3.4040501@tmr.com>
In-Reply-To: <cccedfc60709280811j44d6b1c8w60e209d7c6defca9@mail.gmail.com>

Jon Nelson wrote:
> Please note: I'm having trouble w/gmail's formatting... so please
> forgive this if it looks horrible. :-|
>
> On 9/28/07, Bill Davidsen <davidsen@tmr.com> wrote:
>
>> Dean S. Messing wrote:
>>
>>> It has been some time since I read the rsync man page. I see that
>>> there is (among the bazillion and one switches) a "--link-dest=DIR"
>>> switch which I suppose does what you describe. I'll have to
>>> experiment with this and think things through. Thanks, Michal.
>>>
>>>
>> Be aware that rsync is useful for making a *copy* of your files, which
>> isn't always the best backup. If the goal is to preserve data and be
>> able to recover in time of disaster, it's probably not optimal, while if
>> you need frequent access to old or deleted files it's fine.
>>
>
>
> You are absolutely right when you say it isn't always the best backup. There
> IS no 'best' backup.
>
>> For example, full and incremental backup methods such as dump and
>> restore are usually faster to take and restore than a copy, and allow
>> easy incremental backups.
>>
>
>
> If "copy" meant "full data copy" and not "hard link where possible", I'd
> agree with you. However...
>
> I use a nightly rsync (with --link-dest) to back up more than 40 GiB to a
> drbd-backed drive. I'll explain why I use drbd in just a moment.
>
> Technically, I have a 3 disk raid5 (Linux Software Raid) which is the
> primary store for the data. Then I have a second drive (non-raid) that is
> used as a drbd backing store, which I rsync *to* from filesystems built off
> of the raid. I keep *30 days* of nightly backups on the drbd volume. The
> average difference between nightly backups is about 45MB, or a bit less than
> 10%. The total disk usage is (on average) about 10% more than a single
> backup. On an AMD x86-64 dual core (3600 de-clocked to run at 1GHz) the
> entire process takes between 1 and 2 minutes, from start to finish.
>
> Using hard links means I can snapshot ~175,000 files, about 40GiB, in under
> 2 minutes - something I'd have a hard time doing with dump+restore. I could
> easily make incremental or differential copies, and maybe even in that time
> frame, but I'm not sure I see much advantage in that. Furthermore, as you state,
> dump+restore does *not* include the removal of files which for some
> scenarios is a huge deal.
>

What I don't understand is how you use hard links. A hard link has to
live in the same filesystem as its target, and it is just another
directory entry pointing at the same inode; it doesn't make a physical
copy of the data on another device, or anywhere else.

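
To make the point concrete, here is a minimal Python sketch of what a
hard link is. It is not anything from Jon's setup; the file names and the
function name are made up for illustration. It creates a file, adds a
second name with os.link(), and reports whether both names share one
inode and whether the data survives removing the first name. Calling
os.link() across two filesystems would normally fail with EXDEV, which is
the same-filesystem restriction mentioned above.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file plus a hard link and report what the two names share."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")  # hypothetical names
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("payload")

        # A hard link is a second directory entry for the same inode.
        os.link(original, alias)
        st_a, st_b = os.stat(original), os.stat(alias)
        same_inode = (st_a.st_dev, st_a.st_ino) == (st_b.st_dev, st_b.st_ino)
        link_count = st_a.st_nlink

        # Removing one name does not remove the data; it stays until the
        # last link is gone. No second copy of the data ever existed.
        os.remove(original)
        with open(alias) as f:
            survived = f.read()
        return same_inode, link_count, survived


if __name__ == "__main__":
    print(hard_link_demo())
```
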
> The long and short of it is this: using hard links (via rsync or cp or
> whatever) to do snapshot backups can be really, really fast and have
> significant advantages but there are, as with all things, some downsides.
> Those downsides are fairly easily mitigated, however. In my case, I can lose
> 1 drive of the raid and I'm OK. If I lose 2, then the other drive (not part
> of the raid) has the data I care about. If I lose the entire machine, the
> *other* machine (the other end of the drbd, only woken up every other day or
> so) has the data. Going back 30 days. And a bare-metal "restore" is as fast
> as your I/O is. I back my /really/ important stuff up on DLT.
>
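
For readers following along, the hard-link snapshot idea can be sketched
roughly as follows. This is not rsync's implementation of --link-dest,
only an illustration in Python under simplifying assumptions: regular
files only, "unchanged" decided by filecmp's stat comparison (file type,
size, mtime), and everything on one filesystem. The function name
snapshot() and its arguments are hypothetical.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy `source` into `dest`, hard-linking files unchanged since `previous`.

    Returns a (linked, copied) pair of file counts.
    """
    linked = copied = 0
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=True):
                # Unchanged: add another name for the old snapshot's inode.
                os.link(old, dst)
                linked += 1
            else:
                # New or changed: make a real copy, preserving mtime so the
                # next run can compare against it.
                shutil.copy2(src, dst)
                copied += 1
    # Files deleted from `source` simply do not appear in `dest`.
    return linked, copied
```

Each snapshot directory looks like a full copy, but only new or changed
files take additional space, which is why a month of nightlies can cost
little more than one backup.
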
> Thanks again to drbd, when the secondary comes up it communicates with the
> primary, figures out which blocks have changed, and copies only those.
> On a nightly basis that is usually a couple of hundred megabytes, and at
> 12MiB/s that doesn't take terribly long to take care of.
>
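
As a back-of-envelope check on the resync time: assume "a couple of
hundred megabytes" means about 200 MB (my assumption, not a measured
figure) and take the quoted 12MiB/s as the sustained rate. The helper
name is made up for illustration.

```python
MIB = 2**20  # bytes per MiB


def resync_seconds(changed_bytes, rate_mib_per_s):
    """Time to ship `changed_bytes` at a steady rate given in MiB/s."""
    return changed_bytes / (rate_mib_per_s * MIB)


if __name__ == "__main__":
    # 200 MB of changed blocks at 12 MiB/s (both figures are assumptions).
    print(f"{resync_seconds(200e6, 12):.1f} s")
```

Under those assumptions the transfer itself is roughly 16 seconds,
ignoring protocol overhead and the time drbd needs to work out which
blocks changed.
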
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979