From: Mordechay Kaganer <mkaganer@gmail.com>
To: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: running duperemove but no free space gain
Date: Tue, 7 Jul 2015 16:14:24 +0300
Message-ID: <CA+xOVSN1v5v9z3=zP8E6phTV0c54pOCf7Rh10EBR4zkkFNVo4A@mail.gmail.com>
In-Reply-To: <559B7158.6050404@spotprint.com.au>
B.H.
On Tue, Jul 7, 2015 at 9:27 AM, Ryan Bourne <hub@spotprint.com.au> wrote:
> To clarify, if I did the following:
>
> # btrfs subvolume create a
> # dd bs=1M count=10 if=/dev/urandom of=a/1
> # dd if=a/1 of=a/2
> # btrfs subvolume snapshot a b
>
> then I have four files containing the same data. a/1, b/1 share extents and a/2, b/2 share extents.
>
> If I then deduplicate a/1 and a/2 will all four files be sharing extents, or only three? (Assuming I have the patches for 4.2)
>
OK, I ran a test almost exactly as you suggested. Dedupe does not
touch the "b" snapshot: after deduping a/1 against a/2, only three of
the four files (a/1, a/2 and b/2) share extents, while b/1 still
references the original copy. That explains why no free space is
gained: the duplicate data is still in use.
Here's the log. Compare fe_physical and fe_length across the files to
see what was actually deduped:
; Setup:
# btrfs sub create a
# dd bs=128K count=8 if=/dev/urandom of=a/1
# dd if=a/1 of=a/2
# btrfs subvolume snapshot a b
; Before dedupe:
# show-shared-extents a/1 a/2 b/1 b/2
(fiemap) [0] fe_logical: 0, fe_length: 524288, fe_physical: 3632062464, fe_flags: 0x2000 (shared )
(fiemap) [1] fe_logical: 524288, fe_length: 524288, fe_physical: 3632586752, fe_flags: 0x2001 (last shared )
a/1: 1048576 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 524288, fe_physical: 3633111040, fe_flags: 0x2000 (shared )
(fiemap) [1] fe_logical: 524288, fe_length: 524288, fe_physical: 3633635328, fe_flags: 0x2001 (last shared )
a/2: 1048576 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 1048576, fe_physical: 3632062464, fe_flags: 0x2001 (last shared )
b/1: 1048576 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 1048576, fe_physical: 3633111040, fe_flags: 0x2001 (last shared )
b/2: 1048576 shared bytes
; Dedupe:
# duperemove -d a/1 a/2
Using 128K blocks
Using hash: murmur3
Using 4 threads for file hashing phase
csum: a/1 [1/2] (50.00%)
csum: a/2 [2/2] (100.00%)
Hashing completed. Calculating duplicate extents - this may take some time.
[########################################]
Search completed with no errors.
Simple read and compare of file data found 1 instances of extents that
might benefit from deduplication.
Showing 2 identical extents with id 7ec588f6
Start Length Filename
0 1048576 "a/2"
0 1048576 "a/1"
Using 4 threads for dedupe phase
[0x1e42540] Try to dedupe extents with id 7ec588f6
[0x1e42540] Dedupe 1 extents (id: 7ec588f6) with target: (0, 1048576), "a/2"
Kernel processed data (excludes target files): 1048576
Comparison of extent info shows a net change in shared extents of: 0
; After dedupe:
# show-shared-extents a/1 a/2 b/1 b/2
(fiemap) [0] fe_logical: 0, fe_length: 524288, fe_physical: 3633111040, fe_flags: 0x2000 (shared )
(fiemap) [1] fe_logical: 524288, fe_length: 524288, fe_physical: 3633635328, fe_flags: 0x2001 (last shared )
a/1: 1048576 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 524288, fe_physical: 3633111040, fe_flags: 0x2000 (shared )
(fiemap) [1] fe_logical: 524288, fe_length: 524288, fe_physical: 3633635328, fe_flags: 0x2001 (last shared )
a/2: 1048576 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 1048576, fe_physical: 3632062464, fe_flags: 0x1 (last )
b/1: 0 shared bytes
(fiemap) [0] fe_logical: 0, fe_length: 1048576, fe_physical: 3633111040, fe_flags: 0x2001 (last shared )
b/2: 1048576 shared bytes
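b/1 was not affected by duperemove: it still points at fe_physical
3632062464, and it is no longer flagged as shared because a/1 now
points at a/2's extents. As far as I understand, once the snapshot
exists, the dedupe operation modifies the metadata of a/1 and/or a/2,
which causes that metadata to be COWed, so b's references to the data
are left untouched.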
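One way to check this mechanically is to compare the
fe_physical/fe_length ranges of each pair of files. The Python sketch
below does that for the "After dedupe" values. The extent lists are
copied by hand from the fiemap output above, and the helper names
(overlap_bytes, sharers) are made up for illustration; they are not
part of duperemove or show-shared-extents.

```python
# Extents as (fe_physical, fe_length) pairs, copied by hand from the
# "After dedupe" fiemap output in this message.
after = {
    "a/1": [(3633111040, 524288), (3633635328, 524288)],
    "a/2": [(3633111040, 524288), (3633635328, 524288)],
    "b/1": [(3632062464, 1048576)],
    "b/2": [(3633111040, 1048576)],
}


def overlap_bytes(extents_x, extents_y):
    """Bytes of the fiemap-reported address space covered by both lists.

    Assumes the extents inside one list do not overlap each other,
    which holds for the data above.
    """
    total = 0
    for phys_x, len_x in extents_x:
        for phys_y, len_y in extents_y:
            start = max(phys_x, phys_y)
            end = min(phys_x + len_x, phys_y + len_y)
            if end > start:
                total += end - start
    return total


def sharers(files, name):
    """Names of the other files that share at least one byte with `name`."""
    return sorted(
        other
        for other in files
        if other != name and overlap_bytes(files[name], files[other]) > 0
    )


for name in sorted(after):
    print(name, "shares data with", sharers(after, name))
```

With these values b/1 shares nothing with the other three files, even
though its range ends exactly where theirs begins.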
The conclusion: to actually reclaim the duplicated space, you have to
include every snapshot that may reference the file in the dedupe run.
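To see why no space was freed, one can total the distinct byte ranges
referenced by all four files before and after the run. This is a rough
Python sketch using extent values copied by hand from the logs above.
referenced_bytes is a hypothetical helper, and the last case (b/1 also
pointing at the deduped extents) is an assumption about what including
the snapshot would achieve, not something measured in this test.

```python
# (fe_physical, fe_length) pairs copied by hand from the fiemap logs.
before = {
    "a/1": [(3632062464, 524288), (3632586752, 524288)],
    "a/2": [(3633111040, 524288), (3633635328, 524288)],
    "b/1": [(3632062464, 1048576)],
    "b/2": [(3633111040, 1048576)],
}
after = {
    "a/1": [(3633111040, 524288), (3633635328, 524288)],
    "a/2": [(3633111040, 524288), (3633635328, 524288)],
    "b/1": [(3632062464, 1048576)],
    "b/2": [(3633111040, 1048576)],
}
# ASSUMPTION: what the layout might look like if b/1 had been deduped too.
hypothetical = {**after, "b/1": [(3633111040, 1048576)]}


def referenced_bytes(files):
    """Size of the union of all byte ranges referenced by any file."""
    intervals = sorted(
        (phys, phys + length)
        for extents in files.values()
        for phys, length in extents
    )
    total = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # Gap before this range: close the previous merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent range: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


print("before dedupe:      ", referenced_bytes(before))
print("after dedupe:       ", referenced_bytes(after))
print("b/1 deduped as well:", referenced_bytes(hypothetical))
```

Before and after the run the four files reference the same 2097152
bytes; only in the assumed layout does the total drop to 1048576.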
--
Moshiach NOW!
Moshiach is coming very soon, prepare yourself!
Long live our Master, our Teacher and our Rabbi, King Moshiach, forever and ever!