From: Lionel Bouton <lionel-subscription@bouton.name>
To: Claudius Heine <ch@denx.de>,
Andrei Borzenkov <arvidjaar@gmail.com>,
linux-btrfs@vger.kernel.org
Cc: Henning Schild <henning.schild@siemens.com>
Subject: Re: btrfs-send format that contains binary diffs
Date: Mon, 29 Mar 2021 21:53:44 +0200
Message-ID: <04d8b3c2-a5a7-abc2-b157-b6a39f6d435c@bouton.name>
In-Reply-To: <5ba46b04-f3ba-03ef-6ad5-38fd44f8c67e@denx.de>
Hi Claudius,
On 29/03/2021 at 21:14, Claudius Heine wrote:
> [...]
> Are you sure?
>
> I did a test with a 32MiB random file. I created one snapshot, then
> changed (not deleted or added) one byte in that file and then created
> a snapshot again. `btrfs send` created a >32MiB `btrfs-stream` file.
> If it would be only block based, then I would have expected that it
> would just contain the changed block, not the whole file.
I suspect there is another possible explanation: the tool you used to
change one byte actually rewrote the whole file.
You can test this by appending data to your file (for example with "cat
otherfile >> originalfile" or "dd if=/dev/urandom of=originalfile bs=1M
count=4 conv=notrunc oflag=append") and checking the size of `btrfs
send`'s output.
When I append data with dd as described above to a 32M file originally
created with "dd if=/dev/urandom of=originalfile bs=1M count=32", I get a
file with only 1 extent in each snapshot, both marked shared, and a little
over 4M in `btrfs send`'s output.
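The append step can be tried on any filesystem first, just to confirm that dd grows the file in place instead of replacing it. This is only a sketch: it assumes GNU coreutils (status=none, iflag=fullblock, stat -c), uses a temporary directory, and leaves the btrfs steps as comments with placeholder snapshot names.

```shell
#!/bin/sh
# Sketch: append 4 MiB to a 32 MiB random file without rewriting it.
# Assumes GNU coreutils; iflag=fullblock is added so sizes are exact.
set -eu
workdir=$(mktemp -d)
f="$workdir/originalfile"

dd if=/dev/urandom of="$f" bs=1M count=32 iflag=fullblock status=none
size_before=$(stat -c %s "$f")
inode_before=$(stat -c %i "$f")
# On btrfs, a read-only snapshot would be taken here, e.g.
#   btrfs subvolume snapshot -r <subvol> <snap1>

# conv=notrunc keeps the existing data, oflag=append writes at the end.
dd if=/dev/urandom of="$f" bs=1M count=4 conv=notrunc oflag=append \
    iflag=fullblock status=none
size_after=$(stat -c %s "$f")
inode_after=$(stat -c %i "$f")
# On btrfs: snapshot again, then measure the incremental stream with
#   btrfs send -p <snap1> <snap2> | wc -c

echo "grew by $((size_after - size_before)) bytes"
rm -rf "$workdir"
```

The inode number staying the same is a hint that the file was modified in place and not replaced by a new one.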
filefrag -v should tell you if the extents in your file are shared.
Note that if you use compression and your files compress well, they will
use small extents (128kB, from memory). This can be bad when you try to
avoid fragmentation, but it could help COW find more data to share, if I
understand correctly how COW works with respect to extents.
Finally, using "dd if=/dev/urandom of=originalfile bs=1M count=1
conv=notrunc seek=12" to write in the middle of my now 36M file results
in a little over 1M with `btrfs send` using -p <previous snapshot>.
And filefrag -v shows 3 extents for this file. 2 of them share the same
logical offsets as the file in the previous snapshot; the last one uses a
new range, confirming the allocation of a new extent and the reuse of the
previous ones.
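For reference, dd's seek= counts output blocks of size bs, so with bs=1M a write starting at the 12 MiB offset is seek=12. The sketch below checks on an ordinary temporary file that such a write only touches the 1 MiB in the middle. It assumes GNU coreutils and GNU cmp (-n and -i); the "reference" copy is just a helper for the comparison.

```shell
#!/bin/sh
# Sketch: overwrite 1 MiB in the middle of a 36 MiB file, in place.
set -eu
workdir=$(mktemp -d)
f="$workdir/originalfile"
ref="$workdir/reference"
mib=1048576

dd if=/dev/urandom of="$f" bs=1M count=36 iflag=fullblock status=none
cp "$f" "$ref"

# seek=12 skips 12 output blocks of bs=1M: the write starts at 12 MiB.
dd if=/dev/urandom of="$f" bs=1M count=1 seek=12 conv=notrunc \
    iflag=fullblock status=none

size_after=$(stat -c %s "$f")
# First 12 MiB untouched?
if cmp -s -n $((12 * mib)) "$f" "$ref"; then head_same=yes; else head_same=no; fi
# Everything from 13 MiB onward untouched?
if cmp -s -i $((13 * mib)) "$f" "$ref"; then tail_same=yes; else tail_same=no; fi
# The file as a whole must differ, since the middle was overwritten.
if cmp -s "$f" "$ref"; then whole_same=yes; else whole_same=no; fi

echo "size=$size_after head_same=$head_same tail_same=$tail_same whole_same=$whole_same"
rm -rf "$workdir"
```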
This seems to confirm my hypothesis that the tool you used did rewrite
the whole file.
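One quick way to see whether a tool rewrites a file is to compare the inode number before and after the change. The sketch below contrasts a one-byte in-place write with a copy-then-rename, which is how sed -i and many editors save files. It assumes GNU coreutils; the offset 1000 and the ".tmp" name are arbitrary choices for the illustration.

```shell
#!/bin/sh
# Sketch: in-place one-byte change vs. rewrite through a temporary file.
set -eu
workdir=$(mktemp -d)
f="$workdir/originalfile"

dd if=/dev/urandom of="$f" bs=1M count=1 iflag=fullblock status=none
inode_start=$(stat -c %i "$f")

# In place: overwrite exactly one byte at offset 1000.
printf 'X' | dd of="$f" bs=1 seek=1000 conv=notrunc status=none
inode_inplace=$(stat -c %i "$f")
size_inplace=$(stat -c %s "$f")

# Rewrite: write a new file, then rename it over the original.
cp "$f" "$f.tmp"
mv "$f.tmp" "$f"
inode_rewrite=$(stat -c %i "$f")

echo "start=$inode_start inplace=$inode_inplace rewrite=$inode_rewrite"
rm -rf "$workdir"
```

If the inode changes after your one-byte edit, the file was replaced by a new one, and I would expect its data to end up in new extents.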
Another possibility would be that COW is disabled, either by a mount
option or a file attribute (see lsattr's output for your file).
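Both checks can be scripted roughly as follows. This is only a sketch: the two helper functions are names I made up, the 'C' flag in lsattr's output is the No_COW attribute, and nodatacow is the btrfs mount option. The path defaults to the current directory when no argument is given.

```shell
#!/bin/sh
# Sketch: look for signs that COW is disabled for a given path.
set -eu

# True if an lsattr flags field contains 'C' (No_COW).
# Lowercase 'c' is a different attribute and does not match.
flags_have_nocow() {
    case "$1" in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# True if a comma-separated mount option list contains nodatacow.
opts_have_nodatacow() {
    case ",$1," in
        *,nodatacow,*) return 0 ;;
        *)             return 1 ;;
    esac
}

target=${1:-.}
# Both commands may be missing or unsupported here; empty output is fine.
flags=$(lsattr -d "$target" 2>/dev/null | awk '{print $1}')
opts=$(findmnt -n -o OPTIONS -T "$target" 2>/dev/null || true)

if flags_have_nocow "$flags"; then
    echo "$target: No_COW attribute is set"
fi
if opts_have_nodatacow "$opts"; then
    echo "$target: filesystem is mounted with nodatacow"
fi
```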
Best regards,
Lionel