linux-btrfs.vger.kernel.org archive mirror
* Various Questions
@ 2011-01-07 17:15 Carl Cook
  2011-01-07 17:37 ` C Anthony Risinger
  2011-01-07 17:41 ` Freddie Cash
  0 siblings, 2 replies; 10+ messages in thread
From: Carl Cook @ 2011-01-07 17:15 UTC (permalink / raw)
  To: linux-btrfs


On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
> I'd suggest at least 
> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
> if you really want raid0

I don't fully understand -m or -d.  Why would this make a truer raid0 than with no options?


Is it necessary to use fdisk on new drives in creating a BTRFS multi-drive array?  Or is this all that's needed:
# mkfs.btrfs /dev/sdb /dev/sdc
# btrfs filesystem show

Is this related to 'subvolumes'?  The FAQ implies that a subvolume is like a directory, but also like a partition.  What's the rationale for being able to create a subvolume under a subvolume, as Hubert says so he can "use the shadow_copy module for samba to publish the snapshots  to windows clients."  I don't have any windows clients, but what difference does his structure make?

I know that when using SATA+LVM you should turn off the writeback cache on the drive, as it doesn't do cache flushing, and ensure NCQ is on.  But does this also apply to a BTRFS array?  If so, is this done in rc.local with 
hdparm -I /dev/sdb
hdparm -I /dev/sdc


How do you know what options to rsync are on by default?  I can't find this anywhere.  For example, it seems to me that --perms -ogE  --hard-links and --delete-excluded should be on by default, for a true sync?

If using the  --numeric-ids switch for rsync, do you just have to manually make sure the IDs and usernames are the same on source and destination machines?

For files that fail to transfer, wouldn't it be wise to use  --partial-dir=DIR to at least recover part of lost files?

The rsync man page says that rsync uses ssh by default, but is that the case?  I think -e may be related to engaging ssh, but don't understand the explanation.

So for my system where there is a backup server, I guess I run the rsync daemon on the backup server which presents a port, then when the other systems decide it's time for a backup (cron) they:
- stop mysql, dump the database somewhere, start mysql;
- connect to the backup server's rsync port and dump their data to (hopefully) some specific place there.
Right?





^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Various Questions
  2011-01-07 17:15 Various Questions Carl Cook
@ 2011-01-07 17:37 ` C Anthony Risinger
  2011-01-07 17:41 ` Freddie Cash
  1 sibling, 0 replies; 10+ messages in thread
From: C Anthony Risinger @ 2011-01-07 17:37 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-btrfs

On Fri, Jan 7, 2011 at 11:15 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
>> I'd suggest at least
>> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
>> if you really want raid0
>
> I don't fully understand -m or -d.  Why would this make a truer raid0 than with no options?

this will give you RAID0 for your data, but RAID1 for your metadata,
making it less likely that the FS itself gets corrupted, even though
you will lose some data in crash cases, if i understand correctly.
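
For illustration, a minimal dry-run sketch of the command being discussed, plus a way to check the result. The device names are the ones from Hubert's example; the mount point /mnt/array and the DRY_RUN switch are assumptions added here. Which profiles mkfs.btrfs picks when no options are given depends on the btrfs-progs version, so check its man page rather than relying on this.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them unless DRY_RUN=0.
# /mnt/array is a hypothetical mount point; /dev/sdc and /dev/sdd are from the thread.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# -m selects the metadata profile, -d selects the data profile.
mkfs_cmd=$(run mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd)
# Once the filesystem is mounted, this reports the profile used by each
# block group type (Data, Metadata, System).
check_cmd=$(run btrfs filesystem df /mnt/array)

printf '%s\n%s\n' "$mkfs_cmd" "$check_cmd"
```

Comparing the "btrfs filesystem df" output of a filesystem made with and without the options is probably the quickest way to see whether they change anything on a given system.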

> Is it necessary to use fdisk on new drives in creating a BTRFS multi-drive array?  Or is this all that's needed:
> # mkfs.btrfs /dev/sdb /dev/sdc
> # btrfs filesystem show

depends on whether you need /boot partitions or other partitions.
what you have works fine though.

> Is this related to 'subvolumes'?  The FAQ implies that a subvolume is like a directory, but also like a partition.  What's the rationale for being able to create a subvolume under a subvolume, as Hubert says so he can "use the shadow_copy module for samba to publish the snapshots  to windows clients."  I don't have any windows clients, but what difference does his structure make?

just his preference to put it there... the snapshot of a snapshot can
go anywhere.  it doesn't have to reside under its "parent", the
parent was just used as a base, it's not bound to it in any way AFAIK.

> How do you know what options to rsync are on by default?  I can't find this anywhere.  For example, it seems to me that --perms -ogE  --hard-links and --delete-excluded should be on by default, for a true sync?

the links and command Freddie Cash posted are a really good base to work from.

> So for my system where there is a backup server, I guess I run the rsync daemon on the backup server which presents a port, then when the other systems decide it's time for a backup (cron) they:
> - stop mysql, dump the database somewhere, start mysql;
> - connect to the backup server's rsync port and dump their data to (hopefully) some specific place there.
> Right?

you don't have to stop mysql, you just need to "freeze" any new,
incoming writes, and flush (ie. let finish) whatever is happening
right now.  this ensures mysql is _internally_ consistent on the disk.

see comment by Lloyd Standish here:

http://dev.mysql.com/doc/refman/5.1/en/backup-methods.html
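
One hedged way to get the "freeze and flush" behaviour without stopping the server is to let mysqldump hold the global read lock itself while it dumps. The sketch below only prints the command. The output path /var/backups/mysql-all.sql is an assumption, and credentials are expected to come from an option file such as ~/.my.cnf.

```shell
#!/bin/sh
# Dry-run sketch: prints the command unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
DUMP_FILE=/var/backups/mysql-all.sql   # hypothetical output path

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# --lock-all-tables takes a global read lock for the duration of the dump,
# so new writes wait while the server itself keeps running.
# For InnoDB-only databases, --single-transaction is a commonly used
# alternative that avoids blocking writers.
dump_cmd=$(run mysqldump --all-databases --lock-all-tables --result-file="$DUMP_FILE")

printf '%s\n' "$dump_cmd"
```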

C Anthony
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Various Questions
  2011-01-07 17:15 Various Questions Carl Cook
  2011-01-07 17:37 ` C Anthony Risinger
@ 2011-01-07 17:41 ` Freddie Cash
  2011-01-07 18:55   ` Carl Cook
  1 sibling, 1 reply; 10+ messages in thread
From: Freddie Cash @ 2011-01-07 17:41 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-btrfs

On Fri, Jan 7, 2011 at 9:15 AM, Carl Cook <CACook@quantum-sci.com> wrote:
> How do you know what options to rsync are on by default?  I can't find this anywhere.  For example, it seems to me that --perms -ogE  --hard-links and --delete-excluded should be on by default, for a true sync?

Who cares which ones are on by default?  List the ones you want to use
on the command line, every time.  That way, if the defaults change,
your setup won't.

> If using the  --numeric-ids switch for rsync, do you just have to manually make sure the IDs and usernames are the same on source and destination machines?

You use the --numeric-ids switch so that it *doesn't* matter if the
IDs/usernames are the same.  It just sends the ID number on the wire.
Sure, if you do an ls on the backup box, the username will appear to
be messed up.  But if you compare the user ID assigned to the file
with the user ID in the backed-up etc/passwd file, they match.
Then, if you ever need to restore the HTPC from backups, the
etc/passwd file is transferred over, the user IDs are transferred
over, and when you do an ls on the HTPC, everything matches up
correctly.
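
A small local demonstration of the point above: a file carries only a numeric owner ID, and the name shown by ls is looked up in whatever passwd database the local machine has. This is a sketch assuming GNU coreutils (stat -c); the temporary file is just a stand-in for a backed-up file.

```shell
#!/bin/sh
# Assumes GNU coreutils stat (the -c option).
tmpfile=$(mktemp)

# The inode records only the numeric owner ID ...
file_uid=$(stat -c '%u' "$tmpfile")
# ... and the name is resolved through the local passwd database
# (GNU stat prints UNKNOWN if the ID has no entry there).
file_user=$(stat -c '%U' "$tmpfile")
my_uid=$(id -u)

printf 'uid stored with the file: %s  name resolved locally: %s\n' "$file_uid" "$file_user"
ls -ln "$tmpfile"   # numeric view, independent of the local passwd file
rm -f "$tmpfile"
```

On the backup box, "ls -ln" is therefore the view to trust, since "ls -l" may show whichever local user happens to share the ID.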

> For files that fail to transfer, wouldn't it be wise to use  --partial-dir=DIR to at least recover part of lost files?

Or, just run rsync again, if the connection is dropped.

> The rsync man page says that rsync uses ssh by default, but is that the case?  I think -e may be related to engaging ssh, but don't understand the explanation.

Does it matter what the default is, if you specify exactly how you
want it to work on the command-line?

> So for my system where there is a backup server, I guess I run the rsync daemon on the backup server which presents a port, then when the other systems decide it's time for a backup (cron) they:
> - stop mysql, dump the database somewhere, start mysql;
> - connect to the backup server's rsync port and dump their data to (hopefully) some specific place there.
> Right?

That's one way (push backups).  It works ok for small numbers of
systems being backed up.  But get above a handful of machines, and it
gets very hard to time everything so that you don't hammer the disks
on the backup server.

Pull backups (the backups server does everything) work better, in my
experience.  Then you just script things up once, run 1 script, worry
about 1 schedule, and everything is stored on the backups server.  No
need to run rsync daemons everywhere, just run the rsync client, using
-e ssh, and let it do everything.

If you need it to run a script on the remote machine first, that's
easy enough to do:
  - ssh to remote system, run script to stop DBs, dump DBs, snapshot
    FS, whatever
  - then run rsync
  - ssh to remote system run script to start DBs, delete snapshot, whatever
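
The three steps above can be sketched as a single function run from the backups server. This is only a dry-run outline under stated assumptions: the host name "hex", the remote scripts /usr/local/bin/pre-backup and /usr/local/bin/post-backup, and the destination /media/backups/hex/ are placeholders, not part of any script posted in this thread.

```shell
#!/bin/sh
# Dry-run sketch of one pull backup; prints the commands unless DRY_RUN=0.
# Host, remote script paths and destination are hypothetical placeholders.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

pull_backup() {
    host=$1
    dest=$2
    # 1. prepare the remote side (stop or lock DBs, dump them, snapshot)
    run ssh "$host" /usr/local/bin/pre-backup || return 1
    # 2. pull the data; the trailing slash on the source copies its contents
    run rsync --archive --hard-links --numeric-ids --delete -e ssh "$host:/home/" "$dest"
    status=$?
    # 3. always undo step 1, even if rsync failed
    run ssh "$host" /usr/local/bin/post-backup
    return "$status"
}

plan=$(pull_backup hex /media/backups/hex/)
printf '%s\n' "$plan"
```

Looping over a list of hosts and calling pull_backup for each keeps everything on one schedule, which is the point of doing pulls.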

You're starting to over-think things.  Keep it simple, don't worry
about defaults, specify everything you want to do, and do it all from
the backups box.

-- 
Freddie Cash
fjwcash@gmail.com

* Re: Various Questions
  2011-01-07 17:41 ` Freddie Cash
@ 2011-01-07 18:55   ` Carl Cook
  2011-01-08 13:25     ` Carl Cook
  0 siblings, 1 reply; 10+ messages in thread
From: Carl Cook @ 2011-01-07 18:55 UTC (permalink / raw)
  To: linux-btrfs


Wow, this rsync and backup system is pretty amazing.  I've always just tarred each directory manually, but now find I can RELIABLY automate backups, and have SOLID versioning to boot.  Thanks to everyone who advised, especially Freddie and Anthony.

I am still waiting for hardware for my backup server, but have been preparing.  On the backup server I'll be doing pull backups for everything except my phone (which is connected intermittently).  I'm going to set up a cron script on the backup server to pull backups once a week (as opposed to once/mo which I've done for 12 years).  I am at a loss how to lock the database on the HTPC while exporting the dump, as per Lloyd Standish, but will study it.  (Freddie gave a nice script, but it doesn't seem to lock/flush first)  Also don't know how to email results/success/fail on completion, as I'm not a very good coder.

But here is my proposed cron:
btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
rsync --archive --hard-links --delete-during --delete-excluded --inplace --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex hex:///home /media/backups/hex
btrfs subvolume snapshot droog:///home /media/backups/snapshots/droog-{DATE}
rsync --archive --hard-links --delete-during --delete-excluded --inplace --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog droog:///home /media/backups/droog

My root filesystems are ext4, so I guess they cannot be snapshotted before backup.  My home directories are/will be BTRFS though.
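
A note on the proposed cron lines: btrfs subvolume snapshot works on locally mounted btrfs paths, so it cannot take a host:/path source, and {DATE} has to be expanded by the shell. A hedged sketch of one alternative ordering follows: rsync into a subvolume on the backup server first, then snapshot that subvolume locally. It assumes /media/backups/hex was created as a btrfs subvolume, and it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

today=$(date +%F)                            # ISO date, e.g. 2011-01-07
host=hex
target=/media/backups/$host                  # assumed to be a btrfs subvolume
snap=/media/backups/snapshots/$host-$today

# Pull the remote home directories into the local subvolume ...
sync_cmd=$(run rsync --archive --hard-links --delete-during --delete-excluded \
    --inplace --numeric-ids -e ssh \
    "--exclude-from=/media/backups/exclude-$host" "$host:/home/" "$target/")
# ... then freeze the result as a dated snapshot on the backup server.
snap_cmd=$(run btrfs subvolume snapshot "$target" "$snap")

printf '%s\n%s\n' "$sync_cmd" "$snap_cmd"
```

With this ordering the sources can be ext4 or anything else, because only the backup server needs btrfs.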


On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
>> I'd suggest at least 
>> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
>> if you really want raid0
>
> I don't fully understand -m or -d.  Why would this make a truer raid0 than with no options?

I am beginning to suspect that this is the -default- behavior, as described in the wiki:
"# Create a filesystem across four drives (metadata mirrored, data striped)"

Should I turn off the writeback cache on each drive when running BTRFS?



* Re: Various Questions
  2011-01-07 18:55   ` Carl Cook
@ 2011-01-08 13:25     ` Carl Cook
  2011-01-08 15:40       ` Ian! D. Allen
  2011-01-09  1:26       ` Freddie Cash
  0 siblings, 2 replies; 10+ messages in thread
From: Carl Cook @ 2011-01-08 13:25 UTC (permalink / raw)
  To: linux-btrfs


In addition to the questions below, if anyone has a chance could you advise on why my destination drive has more data  than the source after this command:
# rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
sending incremental file list
sent 658660 bytes  received 2433 bytes  1322186.00 bytes/sec
total size is 1355368091626  speedup is 2050192.77

# df /media/disk
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md2             1868468340 1315408384 553059956  71% /media/disk
# df /home
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb             3907029168 1325491836 2581537332  34% /home




On Fri 07 January 2011 10:55:43 Carl Cook wrote:
> 
> Wow, this rsync and backup system is pretty amazing.  I've always just tarred each directory manually, but now find I can RELIABLY automate backups, and have SOLID versioning to boot.  Thanks to everyone who advised, especially Freddie and Anthony.
> 
> I am still waiting for hardware for my backup server, but have been preparing.  On the backup server I'll be doing pull backups for everything except my phone (which is connected intermittently).  I'm going to set up a cron script on the backup server to pull backups once a week (as opposed to once/mo which I've done for 12 years).  I am at a loss how to lock the database on the HTPC while exporting the dump, as per Lloyd Standish, but will study it.  (Freddie gave a nice script, but it doesn't seem to lock/flush first)  Also don't know how to email results/success/fail on completion, as I'm not a very good coder.
> 
> But here is my proposed cron:
> btrfs subvolume snapshot hex:///home /media/backups/snapshots/hex-{DATE}
> rsync --archive --hard-links --delete-during --delete-excluded --inplace --numeric-ids -e ssh --exclude-from=/media/backups/exclude-hex hex:///home /media/backups/hex
> btrfs subvolume snapshot droog:///home /media/backups/snapshots/droog-{DATE}
> rsync --archive --hard-links --delete-during --delete-excluded --inplace --numeric-ids -e ssh --exclude-from=/media/backups/exclude-droog droog:///home /media/backups/droog
> 
> My root filesystems are ext4, so I guess they cannot be snapshotted before backup.  My home directories are/will be BTRFS though.
> 
> 
> On Fri 07 January 2011 08:14:17 Hubert Kario wrote:
> >> I'd suggest at least 
> >> mkfs.btrfs -m raid1 -d raid0 /dev/sdc /dev/sdd
> >> if you really want raid0
> >
> > I don't fully understand -m or -d.  Why would this make a truer raid0 than with no options?
> 
> I am beginning to suspect that this is the -default- behavior, as described in the wiki:
> "# Create a filesystem across four drives (metadata mirrored, data striped)"
> 
> Should I turn off the writeback cache on each drive when running BTRFS?

* Re: Various Questions
  2011-01-08 13:25     ` Carl Cook
@ 2011-01-08 15:40       ` Ian! D. Allen
  2011-01-09  1:26       ` Freddie Cash
  1 sibling, 0 replies; 10+ messages in thread
From: Ian! D. Allen @ 2011-01-08 15:40 UTC (permalink / raw)
  To: linux-btrfs

On Sat, Jan 08, 2011 at 05:25:19AM -0800, Carl Cook wrote:
> In addition to the questions below, if anyone has a chance could you
> advise on why my destination drive has more data than the source after
> this command:
> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> sending incremental file list
> sent 658660 bytes  received 2433 bytes  1322186.00 bytes/sec
> total size is 1355368091626  speedup is 2050192.77
> 
> # df /media/disk
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/md2             1868468340 1315408384 553059956  71% /media/disk
> # df /home
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb             3907029168 1325491836 2581537332  34% /home

This has little to do with btrfs; it happens with many file systems due
to file system infrastructure details such as directory sizes, sparse
file handling, file fragmentation, etc.
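
To put the two df figures in proportion, here is the arithmetic as a small sketch. The numbers are the "Used" columns quoted above; nothing else is assumed. The gap works out to roughly 9.6 GiB, which is under 1% of the data copied.

```shell
#!/bin/sh
# "Used" columns (1K blocks) copied from the df output quoted above.
src_used_kib=1315408384   # /media/disk
dst_used_kib=1325491836   # /home

diff_kib=$((dst_used_kib - src_used_kib))
diff_mib=$((diff_kib / 1024))
# Difference relative to the source, in hundredths of a percent (integer math).
diff_pct_x100=$((diff_kib * 10000 / src_used_kib))

printf 'extra on destination: %s KiB (about %s MiB, %s.%02d%%)\n' \
    "$diff_kib" "$diff_mib" "$((diff_pct_x100 / 100))" "$((diff_pct_x100 % 100))"
```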

For example: If you have a directory with a huge number of file names
in it, the actual directory disk space used will be large and will not
be reclaimed when you delete all the file names from the directory.
You would have to remove the directory itself and recreate it to reclaim
that space.  Also, using rsync without --sparse (which can't work with
--inplace), sparse files on the source may get expanded to take real
disk blocks on the destination.
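
The sparse-file effect is easy to see locally. The sketch below creates a file that is all hole and compares its logical length with the space actually allocated; it assumes GNU coreutils (truncate, stat -c, du --apparent-size) and a filesystem that supports holes.

```shell
#!/bin/sh
# Assumes GNU coreutils and a filesystem that supports sparse files.
workdir=$(mktemp -d)
sparse=$workdir/sparse.img

truncate -s 10M "$sparse"                 # 10 MiB long, no data written

apparent_bytes=$(stat -c '%s' "$sparse")  # logical length of the file
alloc_blocks=$(stat -c '%b' "$sparse")    # blocks actually allocated
block_size=$(stat -c '%B' "$sparse")      # size of each block counted by %b
alloc_bytes=$((alloc_blocks * block_size))

printf 'apparent: %s bytes, allocated: %s bytes\n' "$apparent_bytes" "$alloc_bytes"
du -k --apparent-size "$sparse"           # what a reader of the file sees
du -k "$sparse"                           # what df ends up counting
rm -rf "$workdir"
```

A copy made without sparse handling writes the zeros out for real, so the allocated figure on the destination grows to match the apparent one.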

Unless you use "dd" to copy a partition exactly, including all the file
system infrastructure details, any copy you make will be subject to the
vagaries of how the file system decides to lay out the data.

-- 
| Ian! D. Allen  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/


* Re: Various Questions
  2011-01-08 13:25     ` Carl Cook
  2011-01-08 15:40       ` Ian! D. Allen
@ 2011-01-09  1:26       ` Freddie Cash
  2011-01-09 13:16         ` Carl Cook
  1 sibling, 1 reply; 10+ messages in thread
From: Freddie Cash @ 2011-01-09  1:26 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-btrfs

On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> In addition to the questions below, if anyone has a chance could you advise on why my destination drive has more data  than the source after this command:
> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> sending incremental file list

What happens if you delete /home, then run the command again, but
without the *?  You generally don't use wildcards for the source or
destination when using rsync.  You just tell it which directory to
start in.

If you do an "ls /home" and "ls /media/disk" are they different?

-- 
Freddie Cash
fjwcash@gmail.com

* Re: Various Questions
  2011-01-09  1:26       ` Freddie Cash
@ 2011-01-09 13:16         ` Carl Cook
  2011-01-09 13:37           ` Fajar A. Nugraha
  0 siblings, 1 reply; 10+ messages in thread
From: Carl Cook @ 2011-01-09 13:16 UTC (permalink / raw)
  To: linux-btrfs


I'd rather not do the copy again unless necessary, as it took a day.

Directories look identical, but who knows?  I'm going to try and figure out how to do a file-by-file crc check, for peace of mind.
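
A file-by-file check along these lines might look like the sketch below: checksum every regular file in each tree, then diff the two lists. It assumes GNU findutils and coreutils (sort -z, xargs -r, sha256sum), and the two throwaway demo trees stand in for /media/disk and /home.

```shell
#!/bin/sh
# Sketch: compare two directory trees by SHA-256 checksum.
checksum_tree() {
    # Print "hash  ./relative/path" for every regular file, in a stable order.
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

# Demo trees; replace with the real source and destination paths.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub" "$workdir/dst/sub"
printf 'hello\n' > "$workdir/src/sub/a.txt"
printf 'hello\n' > "$workdir/dst/sub/a.txt"
printf 'one\n'   > "$workdir/src/b.txt"
printf 'two\n'   > "$workdir/dst/b.txt"      # deliberately different

checksum_tree "$workdir/src" > "$workdir/src.sums"
checksum_tree "$workdir/dst" > "$workdir/dst.sums"

# Lines starting with < or > are files that differ or exist on one side only.
mismatches=$(diff "$workdir/src.sums" "$workdir/dst.sums" | grep -c '^[<>]')
printf 'differing checksum lines: %s\n' "$mismatches"
rm -rf "$workdir"
```

rsync can do a similar comparison itself with --checksum --dry-run --itemize-changes, which reads both sides and lists what it would transfer without copying anything.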


On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
> >
> > In addition to the questions below, if anyone has a chance could you advise on why my destination drive has more data  than the source after this command:
> > # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
> > sending incremental file list
> 
> What happens if you delete /home, then run the command again, but
> without the *?  You generally don't use wildcards for the source or
> destination when using rsync.  You just tell it which directory to
> start in.
> 
> If you do an "ls /home" and "ls /media/disk" are they different?
> 
> 


* Re: Various Questions
  2011-01-09 13:16         ` Carl Cook
@ 2011-01-09 13:37           ` Fajar A. Nugraha
  2011-01-09 13:58             ` Alan Chandler
  0 siblings, 1 reply; 10+ messages in thread
From: Fajar A. Nugraha @ 2011-01-09 13:37 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-btrfs

On Sun, Jan 9, 2011 at 8:16 PM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> I'd rather not do the copy again unless necessary, as it took a day.
>
> Directories look identical, but who knows?  I'm going to try and figure out how to do a file-by-file crc check, for peace of mind.

try "du --apparent-size -slh"
It should rule out any differences caused by sparse files and hardlinks.

>
>
> On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
>> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook <CACook@quantum-sci.com> wrote:
>> >
>> > In addition to the questions below, if anyone has a chance could you advise on why my destination drive has more data  than the source after this command:
>> > # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home

Are you SURE you don't get the command mixed up? The last argument to
rsync should be the destination. Your command looks like you're
copying things to /home.

-- 
Fajar

* Re: Various Questions
  2011-01-09 13:37           ` Fajar A. Nugraha
@ 2011-01-09 13:58             ` Alan Chandler
  0 siblings, 0 replies; 10+ messages in thread
From: Alan Chandler @ 2011-01-09 13:58 UTC (permalink / raw)
  To: linux-btrfs



On 09/01/11 13:37, Fajar A. Nugraha wrote:
> On Sun, Jan 9, 2011 at 8:16 PM, Carl Cook<CACook@quantum-sci.com>  wrote:
>    
>> I'd rather not do the copy again unless necessary, as it took a day.
>>
>> Directories look identical, but who knows?  I'm going to try and figure out how to do a file-by-file crc check, for peace of mind.
>>      
> try "du --apparent-size -slh"
> It should rule out any differences caused by sparse files and hardlinks.
>
>    
>>
>> On Sat 08 January 2011 17:26:25 Freddie Cash wrote:
>>      
>>> On Sat, Jan 8, 2011 at 5:25 AM, Carl Cook<CACook@quantum-sci.com>  wrote:
>>>        
>>>> In addition to the questions below, if anyone has a chance could you advise on why my destination drive has more data  than the source after this command:
>>>> # rsync --hard-links --delete --inplace --archive --numeric-ids /media/disk/* /home
>>>>          
> Are you SURE you don't get the command mixed up? The last argument to
> rsync should be the destination. Your command looks like you're
> copying things to /home.
>    

What is also important is the use of *: it means all the . files at 
the top level are NOT being copied

rsync looks for a / at the end of the source to decide whether you 
want the directory itself put into the destination or only the 
contents of the directory.  The / at the end of the source means 
copy the contents.

This could be (I am not sure of the exact scope of --delete) the reason 
why the destination has more data than the source: --delete may not be 
deleting /home/.* files (if there are any there).
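
The wildcard behaviour can be checked without rsync at all, since it is the shell that expands the *. A small sketch with a throwaway directory (the file names are made up for the demo):

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/src"
touch "$workdir/src/visible" "$workdir/src/.hidden"

# What the shell would hand to rsync for "src/*": dotfiles are skipped.
set -- "$workdir"/src/*
glob_count=$#

# What is really in the directory, which is what a "src/" source would cover.
all_count=$(find "$workdir/src" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' ')

printf 'matched by *: %s, actually present: %s\n' "$glob_count" "$all_count"
rm -rf "$workdir"
```

The rsync man page also notes that --delete only applies to directories rsync was asked to send as a whole, not to files named individually through a shell wildcard, so a source of /media/disk/ with a destination of /home/ is probably the form to use here.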

-- 
Alan Chandler
http://www.chandlerfamily.org.uk



Thread overview: 10+ messages
2011-01-07 17:15 Various Questions Carl Cook
2011-01-07 17:37 ` C Anthony Risinger
2011-01-07 17:41 ` Freddie Cash
2011-01-07 18:55   ` Carl Cook
2011-01-08 13:25     ` Carl Cook
2011-01-08 15:40       ` Ian! D. Allen
2011-01-09  1:26       ` Freddie Cash
2011-01-09 13:16         ` Carl Cook
2011-01-09 13:37           ` Fajar A. Nugraha
2011-01-09 13:58             ` Alan Chandler
