linux-btrfs.vger.kernel.org archive mirror
* Is it safe to use snapshot without data as 'btrfs send' parent?
From: Nemcev Aleksey @ 2022-10-22 21:13 UTC
  To: linux-btrfs

Hello Btrfs developers.

Thank you for your great product, Btrfs!

I want to use Btrfs snapshots and the 'btrfs send | btrfs receive' feature
to make incremental backups of my PC to an external drive.
I can do this using the commands:
btrfs subvolume snapshot -r subvolume snapshot
btrfs send snapshot -p previous_snapshot | btrfs receive backup_drive

But Btrfs snapshots on my PC consume space even for deleted files.
So I can't just remove unused files to free space on my PC if I keep
parent snapshots for incremental backups on this PC.
I need to do another backup, then remove the snapshot left over from the
previous backup from my PC to free up space.

I want to use metadata-only snapshots to overcome this issue.

Can I safely use the following chain of commands to keep a metadata-only
snapshot on my PC and full snapshots on the backup drive?

# Initial full backup:
# Create temporary snapshot
btrfs subvolume snapshot -r source/@ source/@_backup
# Send temporary snapshot to the backup drive
btrfs send source/@_backup | btrfs receive backup_drive
# Delete temporary snapshot
btrfs subvolume delete source/@_backup
# Rename the snapshot received on the backup drive to its final name
mv backup_drive/@_backup backup_drive/@_backup1
# Send back metadata-only snapshot to source FS
btrfs send --no-data backup_drive/@_backup1 |
    btrfs receive source/skinny_snapshots

# Incremental backups:
# Create temporary snapshot
btrfs subvolume snapshot -r source/@ source/@_backup
# Send temporary snapshot to the backup drive using the metadata-only
# snapshot as parent
btrfs send source/@_backup -p source/skinny_snapshots/@_backup1 |
    btrfs receive backup_drive
# Delete temporary snapshot
btrfs subvolume delete source/@_backup
# Rename the snapshot received on the backup drive to its final name
mv backup_drive/@_backup backup_drive/@_backup2
# Send back metadata-only snapshot to source FS
btrfs send --no-data backup_drive/@_backup2 -p backup_drive/@_backup1 | 
btrfs receive source/skinny_snapshots

I tested this sequence, and it seems to work fine on small test filesystems.
The backups appear to be correct: they contain all the files they should,
with correct checksums.
The source FS frees up space after files are deleted, even with the
metadata-only snapshots on it.
It is possible to use such "skinny" snapshots as the parent for the
btrfs send command.

But I'd like to get confirmation from Btrfs developers:
Is this approach safe?
Can I use it daily and be sure my backups will be consistent?

Thank you.


* Re: Is it safe to use snapshot without data as 'btrfs send' parent?
From: Andrei Borzenkov @ 2022-10-23  7:01 UTC
  To: Nemcev Aleksey, linux-btrfs

On 23.10.2022 00:13, Nemcev Aleksey wrote:
> Hello Btrfs developers.
> 
> Thank you for your great product, Btrfs!
> 
> I want to use Btrfs snapshots and 'btrfs send | btrfs receive' features
> to incremental backup my PC to an external drive.
> I can do this using commands:
> btrfs subvolume snapshot -r subvolume snapshot; btrfs send snapshot -p
> previous_snapshot | btrfs receive backup_drive
> 
> But Btrfs snapshots on my PC consume space even for deleted files.

Snapshots by definition preserve the content of filesystem at the time 
snapshots were created. So of course data captured by snapshots continue 
to consume space on filesystem. What use would snapshots be if data were 
immediately deleted everywhere?

> So I can't just remove unused files to free space on my PC if I keep
> parent snapshots for incremental backups on this PC).

You only need to keep the single latest snapshot to implement
incremental-forever backups.
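
For illustration, one such cycle could look roughly like the sketch
below. It is only a sketch under assumptions: the /mnt/source and
/mnt/backup_drive paths, the snapshot names, and the run and
incremental_backup helpers are hypothetical, and a received copy of the
old snapshot is assumed to exist on the backup drive already. By default
it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: one incremental backup cycle that leaves a single snapshot
# on the source. Every path and snapshot name here is a hypothetical example,
# and paths must not contain spaces or shell metacharacters.

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$1"
    else
        sh -c "$1"
    fi
}

# incremental_backup SRC DST OLD NEW
#   SRC: source filesystem containing subvolume @ and snapshot OLD
#   DST: backup filesystem that already holds a received copy of OLD
incremental_backup() {
    src=$1 dst=$2 old=$3 new=$4
    # 1. Freeze the current state in a new read-only snapshot.
    run "btrfs subvolume snapshot -r $src/@ $src/$new" || return 1
    # 2. Send only the difference between OLD and NEW.
    run "btrfs send -p $src/$old $src/$new | btrfs receive $dst" || return 1
    # 3. OLD is no longer needed on the source; NEW is the next parent.
    run "btrfs subvolume delete $src/$old"
}

incremental_backup /mnt/source /mnt/backup_drive @_backup1 @_backup2
```

After a run, only the newest snapshot remains on the source, and it
serves as the parent for the next cycle.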

> I need to do another backup, then remove snapshot left from previous
> backup from my PC to free up space.
> 

Yes, that is what every backup software that supports snapshots does.

> I want to use metadata-only snapshots to overcome this issue.
> 

There is no issue here.

> Can I safely use the following chain of commands to keep metadata-only
> snapshot on my PC and keep full snapshots on the
> backup drive?
> 

There is no such thing as "metadata-only snapshot". I am not sure what 
gave you this idea.

> # Initial full backup:
> # Create temporary snapshot
> btrfs subvolume snapshot -r source/@ source/@_backup
> # Send temporary snapshot to the backup drive
> btrfs send source/@_backup | btrfs receive backup_drive
> # Delete temporary snapshot
> btrfs subvolume delete source/@_backup
> # Move received on backup drive snapshot to its final name
> mv backup_drive/@_backup backup_drive/@_backup1
> # Send back metadata-only snapshot to source FS
> btrfs send --no-data backup_drive/@_backup1 | btrfs receive
> source/skinny_snapshots
> 

This creates new subvolume under source/skinny_snapshots with empty 
files. This subvolume is completely unrelated to the original source 
subvolume source/@.

> # Incremental backups:
> # Create temporary snapshot
> btrfs subvolume snapshot -r source/@ source/@_backup
> # Send temporary snapshot to the backup drive using metadata-only
> snapshot as parent
> btrfs send source/@_backup -p source/skinny_snapshots/@_backup1 | btrfs
> receive backup_drive

This sends the difference between the "skinny snapshot" with empty files
and your current filesystem, which means it sends the full content of all
files currently present on the filesystem, effectively converting the
incremental send stream into a full send stream.
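
A rough way to check this on a test filesystem is to compare the byte
counts of the two streams. The sketch below is only an illustration: the
/mnt/source paths and the run and stream_sizes helpers are hypothetical,
and by default it prints the commands rather than running them (set
DRY_RUN=0 to execute).

```shell
#!/bin/sh
# Sketch only: compare the size of an incremental send stream with the
# size of a full one. Paths are hypothetical examples.

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$1"
    else
        sh -c "$1"
    fi
}

# stream_sizes PARENT SNAPSHOT
#   Both arguments must be read-only snapshots on the same filesystem.
stream_sizes() {
    parent=$1 snap=$2
    # Size in bytes of the stream sent relative to PARENT.
    run "btrfs send -p $parent $snap | wc -c" || return 1
    # Size in bytes of a full stream of the same snapshot.
    run "btrfs send $snap | wc -c"
}

stream_sizes /mnt/source/skinny_snapshots/@_backup1 /mnt/source/@_backup
```

If the two numbers are close, the "incremental" stream carries
essentially the full file content.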

> # Delete temporary snapshot
> btrfs subvolume delete source/@_backup
> # Move received on backup drive snapshot to its final name
> mv backup_drive/@_backup backup_drive/@_backup2
> # Send back metadata-only snapshot to source FS
> btrfs send --no-data backup_drive/@_backup2 -p backup_drive/@_backup1 |
> btrfs receive source/skinny_snapshots
> 
> I tested this sequence, and it seems to work fine on small test filesystems.

This is a very convoluted way to simply create a sequence of full snapshots.

> Backups seem to be correct and seem to have all files they should have
> with correct checksums after all.
> Source FS frees up space after deleting files while having metadata-only
> snapshots on it.
> It's possible to use such "skinny" snapshots as parent for btrfs send
> command.
> 

It is possible to use *any* snapshot as parent. btrfs will compute the
sequence of operations needed to get from the parent to the snapshot being
sent. In the worst case that means deleting all existing data and
recreating it from the source. With a full send you start with an empty
target and avoid the deletion.

> But I'd like to get confirmation from Btrfs developers:

I am not a developer, but ...

> Is this approach safe?

If "safe" means "filesystem structure and the *content* of files will be 
replicated on receive side" - yes.

> Can I use it daily and be sure my backups will be consistent?
> 

I do not know what "consistent" means for you. If "consistent" refers to 
filesystem state during "btrfs send" - yes, it is consistent (snapshots 
are atomic).

Again - you could replace all this voodoo dance with a simple sequence of
full "btrfs send" runs. Your procedure loses all advantages of btrfs data
sharing between snapshots both during send (you always send full
content) and on backup storage (because the receive side is unaware that
identical files share the same data).


* Re: Is it safe to use snapshot without data as 'btrfs send' parent?
From: Nemcev Aleksey @ 2022-10-23  9:32 UTC
  To: Andrei Borzenkov, linux-btrfs

23.10.2022 10:01, Andrei Borzenkov wrote:
 > On 23.10.2022 00:13, Nemcev Aleksey wrote:
 >> So I can't just remove unused files to free space on my PC if I keep
 >> parent snapshots for incremental backups on this PC).
 >
 > You only need to keep one latest snapshot to implement incremental
 > forever backup.
Yes, I'm already keeping one latest snapshot on original FS for each 
subvolume.
 >
 >> I need to do another backup, then remove snapshot left from previous
 >> backup from my PC to free up space.
 >>
 >
 > Yes, that is what every backup software that supports snapshots does.
 >
 >
 >> # Initial full backup:
 >> # Create temporary snapshot
 >> btrfs subvolume snapshot -r source/@ source/@_backup
 >> # Send temporary snapshot to the backup drive
 >> btrfs send source/@_backup | btrfs receive backup_drive
 >> # Delete temporary snapshot
 >> btrfs subvolume delete source/@_backup
 >> # Move received on backup drive snapshot to its final name
 >> mv backup_drive/@_backup backup_drive/@_backup1
 >> # Send back metadata-only snapshot to source FS
 >> btrfs send --no-data backup_drive/@_backup1 | btrfs receive
 >> source/skinny_snapshots
 >>
 >
 > This creates new subvolume under source/skinny_snapshots with empty
 > files. This subvolume is completely unrelated to the original source
 > subvolume source/@.
Thank you, got this.
 >
 >> # Incremental backups:
 >> # Create temporary snapshot
 >> btrfs subvolume snapshot -r source/@ source/@_backup
 >> # Send temporary snapshot to the backup drive using metadata-only
 >> snapshot as parent
 >> btrfs send source/@_backup -p source/skinny_snapshots/@_backup1 | btrfs
 >> receive backup_drive
 >
 > This sends difference between "skinny snapshot" with empty files and
 > your current filesystem. Which means it sends full content of all files
 > currently present on filesystem effectively converting incremental send
 > stream into full send stream.
 >
 > Again - you could replace all this voodoo dance with simple sequence
 > of full "btrfs send". Your procedure loses all advantages of btrfs data
 > sharing between snapshots both during send (you always send full
 > content) and on backup storage (because receive side is unaware that
 > identical files share the same data).


I thought it would be possible to somehow track changes for ongoing
incremental backups while still being able to delete files on the original
FS and immediately reclaim free space, without losing the ability to do
incremental backups. I don't use the snapshot on the original FS as a
backup, only as a parent reference for incremental send, and I would be
happy if there were any way to avoid this snapshot consuming space for
deleted files (keeping only "this file should be deleted or overwritten"
references).
But it seems that incremental send | receive requires keeping a normal
snapshot on the original FS as the parent.

Thank you.


* Re: Is it safe to use snapshot without data as 'btrfs send' parent?
From: Andrei Borzenkov @ 2022-10-23 13:54 UTC
  To: Nemcev Aleksey, linux-btrfs

On 23.10.2022 12:32, Nemcev Aleksey wrote:
> 
> 
> I thought it will be possible to somehow track changes for ongoing
> incremental backup but also be able to delete files on original FS and
> immediately reclaim free space without losing incremental backup
> ability. I don't use snapshot on original FS as backup, only as parent
> reference for incremental send and would be happy if it will be any way
> to avoid space consumption by this snapshot for deleted files (just to
> keep track "this file should be deleted or overwritten" references).

btrfs data sharing is extent based, not file based, so it is more 
complicated.

> But it seems the only way to use incremental send|receive requires to
> keep normal snapshot on original FS as parent.
> 

btrfs send works by computing difference between two subvolumes, so yes, 
you must have two subvolumes to be able to compute difference. It may 
theoretically be possible to use some metadata dump (like btrfs-image) 
instead of source subvolume, but it does not sound like a trivial task.

If the data change rate is so high that even keeping a snapshot for one
day is a problem, you can achieve a similar effect by combining btrfs
snapshots and rsync:

- on source create read-only snapshot (simply to have consistent image)
- on destination clone previous backup to a new volume
- run "rsync --inplace" from source to target
- remove snapshot on source

This will effectively do the same as incremental btrfs send by sending 
only changed blocks. It does mean increased IO load on both source and 
destination though. If you have mostly small files, using rsync 
--whole-file may mitigate it.
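
The steps above could be sketched roughly as follows. This is only an
illustration under assumptions: the paths, the snapshot names, and the run
and rsync_backup helpers are hypothetical, and the -a and --delete rsync
options are additions that were not part of the steps listed above. By
default the sketch only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: snapshot + rsync backup cycle. Every path and name is a
# hypothetical example; paths must not contain spaces or metacharacters.

# Print commands by default; set DRY_RUN=0 to really execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$1"
    else
        sh -c "$1"
    fi
}

# rsync_backup SRC DST PREV NEW
#   SRC:  source filesystem containing subvolume @
#   DST:  backup filesystem holding the previous backup subvolume PREV
rsync_backup() {
    src=$1 dst=$2 prev=$3 new=$4
    # 1. Read-only snapshot on the source, simply to have a consistent image.
    run "btrfs subvolume snapshot -r $src/@ $src/$new" || return 1
    # 2. Clone the previous backup into a new writable subvolume.
    run "btrfs subvolume snapshot $dst/$prev $dst/$new" || return 1
    # 3. Update the clone in place. Assumption: -a preserves attributes and
    #    --delete removes files that no longer exist on the source.
    run "rsync -a --inplace --delete $src/$new/ $dst/$new/" || return 1
    # 4. The source snapshot is no longer needed.
    run "btrfs subvolume delete $src/$new"
}

rsync_backup /mnt/source /mnt/backup_drive @_backup1 @_backup2
```

One caveat, as far as I know: when both paths are local, rsync defaults
to --whole-file, so --no-whole-file would be needed to make it rewrite
only the changed blocks of a file.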

