public inbox for linux-btrfs@vger.kernel.org
* Btrfs send bloat
@ 2019-05-19  8:11 Newbugreport
  2019-05-19 20:06 ` Andrei Borzenkov
From: Newbugreport @ 2019-05-19  8:11 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org

I have 3-4 years' worth of snapshots that I use for backups. I keep read-only (R-O) live snapshots, two local backups, and a copy in AWS Glacier Deep Freeze. I use both send | receive and send > file. This works well, but I get massive deltas when files are moved around in a GUI via Samba: reorganize a bunch of files and the next snapshot is 50 or 100 GB. Perhaps mv or cp --reflink=always would avoid the problem, but that's just not usable enough for my family.

I'd like a solution to the massive delta problem. If someone already has one, that would be great. If not, I need advice on a few ideas.

A realistic solution seems to be deduplicating the subvolume before each snapshot is taken, and in theory I could write a small program to do that. However, I don't know whether that would work. Will Btrfs let me deduplicate between a file on the live subvolume and a file on the R-O snapshot (really the same file, but at a different path)? If so, will btrfs send with -p then produce a small delta?
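
To make the "small program" idea concrete, here is a minimal sketch of its first half only: finding files with identical content across the live subvolume and the previous snapshot. It assumes plain Python with only the standard library; the function names and any paths are hypothetical, and it does not touch Btrfs at all. Actually sharing the extents would still need a separate dedupe step, and whether that is allowed against an R-O snapshot is exactly the open question above.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(*roots):
    """Group regular files under the given roots by identical content.

    Files are bucketed by size first, so only same-sized files are hashed.
    Returns a sorted list of sorted path lists, each with two or more paths.
    """
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.path.getsize(path)
                if size > 0:  # empty files have no data to share
                    by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Calling something like find_duplicate_groups("/mnt/data/live", "/mnt/data/snapshots/latest") (both paths made up) would return the groups of moved-but-identical files that are candidates for deduplication before the next snapshot. This still hashes every same-sized file, which is why the question below about reusing existing hash values matters.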

Failing that, I could probably make changes to the send data stream, but that's suboptimal for the live volume and for any backup volumes where data has already been received.

Also, is it possible to access the Btrfs hash values for files so I don't have to recalculate file hashes for the whole volume myself?

Thanks in advance for any advice.



Thread overview: 10+ messages
2019-05-19  8:11 Btrfs send bloat Newbugreport
2019-05-19 20:06 ` Andrei Borzenkov
2019-05-20  9:20   ` David Disseldorp
2019-05-20 10:34   ` Patrik Lundquist
2019-05-20 11:15     ` Newbugreport
2019-05-20 11:58       ` Austin S. Hemmelgarn
2019-05-20 12:14         ` Patrik Lundquist
2019-05-20 12:40           ` Btrfs remote reflink with Samba David Disseldorp
2019-05-20 20:33             ` Patrik Lundquist
2019-05-20 22:50               ` Chris Murphy
