linux-btrfs.vger.kernel.org archive mirror
* Why do full balance and deduplication reduce available free space?
@ 2017-10-02 10:02 Niccolò Belli
  2017-10-02 10:16 ` Hans van Kranenburg
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Niccolò Belli @ 2017-10-02 10:02 UTC (permalink / raw)
  To: linux-btrfs

Hi,
I have several subvolumes mounted with compress-force=lzo and autodefrag. 
Since I use lots of snapshots (snapper keeps around 24 hourly snapshots, 7 
daily snapshots and 4 weekly snapshots) I had to create a systemd timer to 
perform a full balance and deduplication each night. The data needs to 
already be deduplicated when the snapshots are created, because once a 
snapshot exists I have no other way to deduplicate it.

This is how I perform balance: btrfs balance start --full-balance rootfs
This is how I perform deduplication (duperemove is from git master):
duperemove -drh --dedupe-options=noblock --hashfile=../rootfs.hash 
<all_subvols_except_snapshots_ones>
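
For reference, a sketch of what such a systemd timer pair could look like. The unit names, the 03:00 schedule, the /mnt/rootfs mountpoint, and the hashfile path are my assumptions, not from the original setup; the ExecStart lines mirror the commands quoted above. The files are written to the current directory here; in practice they would go in /etc/systemd/system:

```shell
# Hypothetical nightly maintenance units (paths/names assumed).
cat > btrfs-nightly.service <<'EOF'
[Unit]
Description=Nightly btrfs full balance and deduplication

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs balance start --full-balance /mnt/rootfs
ExecStart=/usr/bin/duperemove -drh --dedupe-options=noblock --hashfile=/root/rootfs.hash /mnt/rootfs
EOF

cat > btrfs-nightly.timer <<'EOF'
[Unit]
Description=Run nightly btrfs maintenance

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

# Then: systemctl daemon-reload && systemctl enable --now btrfs-nightly.timer
```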

Looking at the logs I noticed something weird: available free space 
actually decreases after balance or deduplication.

This is just before the timer starts:

Overall:
    Device size:                 128.00GiB
    Device allocated:             49.03GiB
    Device unallocated:           78.97GiB
    Device missing:                  0.00B
    Used:                         43.78GiB
    Free (estimated):             82.97GiB      (min: 82.97GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:44.00GiB, Used:40.00GiB
   /dev/sda5      44.00GiB

Metadata,single: Size:5.00GiB, Used:3.78GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      78.97GiB
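
For what it's worth, the "Free (estimated)" figure above can be reproduced from the other numbers: unallocated space plus the unused tail of the already-allocated data chunks. A small sketch (my reading of the btrfs-progs heuristic, not an exact reimplementation):

```shell
# Free (estimated) ~ unallocated + (data chunk size - data chunk used).
# Figures taken from the report above, in GiB.
awk 'BEGIN {
  unallocated = 78.97
  data_size   = 44.00
  data_used   = 40.00
  printf "free estimate: %.2f GiB\n", unallocated + (data_size - data_used)
}'
# prints: free estimate: 82.97 GiB
```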



I also manually performed a full balance just before the timer starts:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         43.78GiB
    Free (estimated):             82.96GiB      (min: 82.96GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.01GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.77GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB



As you can see even doing a full balance was enough to reduce the available 
free space!
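
The numbers do add up, oddly enough: balance returned 3 GiB of allocated-but-empty chunk space to the unallocated pool, yet the estimate still dropped, because data *used* grew by 0.01 GiB during the rewrite. In figures (taken from the two reports above):

```shell
awk 'BEGIN {
  before = 78.97 + (44.00 - 40.00)   # estimate before balance, GiB
  after  = 81.97 + (41.00 - 40.01)   # estimate after balance, GiB
  printf "before=%.2f after=%.2f lost=%.2f GiB\n", before, after, before - after
}'
# prints: before=82.97 after=82.96 lost=0.01 GiB
```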

Then the timer started and it performed the deduplication:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         43.87GiB
    Free (estimated):             82.94GiB      (min: 82.94GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 176.00KiB)

Data,single: Size:41.00GiB, Used:40.03GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.84GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB



Once again it reduced the available free space!

Then, after the deduplication, the timer also performed a full balance:

Overall:
    Device size:                 128.00GiB
    Device allocated:             46.03GiB
    Device unallocated:           81.97GiB
    Device missing:                  0.00B
    Used:                         44.00GiB
    Free (estimated):             82.93GiB      (min: 82.93GiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:41.00GiB, Used:40.04GiB
   /dev/sda5      41.00GiB

Metadata,single: Size:5.00GiB, Used:3.97GiB
   /dev/sda5       5.00GiB

System,single: Size:32.00MiB, Used:16.00KiB
   /dev/sda5      32.00MiB

Unallocated:
   /dev/sda5      81.97GiB




It further reduced the available free space! Balance and deduplication 
actually reduced my available free space by 400MB!
400MB each night!
How is this possible? Should I avoid doing balances and deduplications 
at all?

Thanks,
Niccolò

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Why do full balance and deduplication reduce available free space?
  2017-10-02 10:02 Why do full balance and deduplication reduce available free space? Niccolò Belli
@ 2017-10-02 10:16 ` Hans van Kranenburg
  2017-10-02 10:29   ` Niccolò Belli
  2017-10-02 14:15 ` Why do full balance and deduplication reduce available free space? Niccolò Belli
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Hans van Kranenburg @ 2017-10-02 10:16 UTC (permalink / raw)
  To: Niccolò Belli, linux-btrfs

On 10/02/2017 12:02 PM, Niccolò Belli wrote:
> [...]
> 
> Since I use lots of snapshots [...] I had to
> create a systemd timer to perform a full balance and deduplication each
> night.

Can you explain what's your reasoning behind this 'because X it needs
Y'? I don't follow.

-- 
Hans van Kranenburg


* Re: Why do full balance and deduplication reduce available free  space?
  2017-10-02 10:16 ` Hans van Kranenburg
@ 2017-10-02 10:29   ` Niccolò Belli
  2017-10-02 11:14     ` Paul Jones
  0 siblings, 1 reply; 10+ messages in thread
From: Niccolò Belli @ 2017-10-02 10:29 UTC (permalink / raw)
  To: Hans van Kranenburg; +Cc: linux-btrfs

On 2017-10-02 12:16, Hans van Kranenburg wrote:
> On 10/02/2017 12:02 PM, Niccolò Belli wrote:
>> [...]
>> 
>> Since I use lots of snapshots [...] I had to
>> create a systemd timer to perform a full balance and deduplication 
>> each
>> night.
> 
> Can you explain what's your reasoning behind this 'because X it needs
> Y'? I don't follow.

Available free space is important to me, so I want snapshots to be 
deduplicated as well. Since I cannot deduplicate snapshots because they 
are read-only, then the data must be already deduplicated before the 
snapshots are taken. I do not consider the hourly snapshots because in a 
day they will be gone anyway, but daily snapshots will stay there for 
much longer so I want them to be deduplicated.

Niccolò


* RE: Why do full balance and deduplication reduce available free space?
  2017-10-02 10:29   ` Niccolò Belli
@ 2017-10-02 11:14     ` Paul Jones
  2017-10-02 11:26       ` Is it really possible to dedupe read-only snapshots!? Niccolò Belli
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Jones @ 2017-10-02 11:14 UTC (permalink / raw)
  To: Niccolò Belli; +Cc: linux-btrfs@vger.kernel.org


> -----Original Message-----
> From: linux-btrfs-owner@vger.kernel.org [mailto:linux-btrfs-
> owner@vger.kernel.org] On Behalf Of Niccolò Belli
> Sent: Monday, 2 October 2017 9:29 PM
> To: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: Why do full balance and deduplication reduce available free
> space?
> 
> On 2017-10-02 12:16, Hans van Kranenburg wrote:
> > On 10/02/2017 12:02 PM, Niccolò Belli wrote:
> >> [...]
> >>
> >> Since I use lots of snapshots [...] I had to create a systemd timer
> >> to perform a full balance and deduplication each night.
> >
> > Can you explain what's your reasoning behind this 'because X it needs
> > Y'? I don't follow.
> 
> Available free space is important to me, so I want snapshots to be
> deduplicated as well. Since I cannot deduplicate snapshots because they are
> read-only, then the data must be already deduplicated before the snapshots
> are taken. I do not consider the hourly snapshots because in a day they will
> be gone anyway, but daily snapshots will stay there for much longer so I want
> them to be deduplicated.

I use bees for deduplication and it will quite happily dedupe read-only snapshots. You could always change them to RW while dedupe is running then change back to RO.
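
For anyone wanting to try that, a sketch of the RW-toggle approach. The snapshot paths are hypothetical; `btrfs property set -ts <path> ro false|true` is the actual toggle command. `RUN=echo` (the default here) makes it a dry run that only prints the commands:

```shell
# Dry-run sketch: temporarily flip snapshots read-write, dedupe, flip back.
RUN=${RUN:-echo}
snapshots="/mnt/.snapshots/1/snapshot /mnt/.snapshots/2/snapshot"

for snap in $snapshots; do
  $RUN btrfs property set -ts "$snap" ro false   # make writable
done

# ... run duperemove/bees over the now-writable snapshots here ...

for snap in $snapshots; do
  $RUN btrfs property set -ts "$snap" ro true    # back to read-only
done
```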


Paul.






* RE: Is it really possible to dedupe read-only snapshots!?
  2017-10-02 11:14     ` Paul Jones
@ 2017-10-02 11:26       ` Niccolò Belli
  0 siblings, 0 replies; 10+ messages in thread
From: Niccolò Belli @ 2017-10-02 11:26 UTC (permalink / raw)
  To: Paul Jones; +Cc: linux-btrfs

On 2017-10-02 13:14, Paul Jones wrote:
> I use bees for deduplication and it will quite happily dedupe
> read-only snapshots.

AFAIK no, it isn't possible. Source: 
https://www.spinics.net/lists/linux-btrfs/msg60385.html
"It should be possible to deduplicate a read-only file to a read-write 
one, but that's probably not worth the effort in many real-world use 
cases."

> You could always change them to RW while dedupe
> is running then change back to RO.

AFAIK it will break send/receive; can someone confirm?

Niccolò


* Re: Why do full balance and deduplication reduce available free  space?
  2017-10-02 10:02 Why do full balance and deduplication reduce available free space? Niccolò Belli
  2017-10-02 10:16 ` Hans van Kranenburg
@ 2017-10-02 14:15 ` Niccolò Belli
  2017-10-02 19:35 ` Kai Krakow
  2017-10-02 20:27 ` Goffredo Baroncelli
  3 siblings, 0 replies; 10+ messages in thread
From: Niccolò Belli @ 2017-10-02 14:15 UTC (permalink / raw)
  To: linux-btrfs

Maybe this is because of the autodefrag mount option? I thought it 
wasn't supposed to unshare lots of extents...

Niccolò


* Re: Why do full balance and deduplication reduce available free space?
  2017-10-02 10:02 Why do full balance and deduplication reduce available free space? Niccolò Belli
  2017-10-02 10:16 ` Hans van Kranenburg
  2017-10-02 14:15 ` Why do full balance and deduplication reduce available free space? Niccolò Belli
@ 2017-10-02 19:35 ` Kai Krakow
  2017-10-02 20:19   ` Niccolò Belli
  2017-10-02 20:27 ` Goffredo Baroncelli
  3 siblings, 1 reply; 10+ messages in thread
From: Kai Krakow @ 2017-10-02 19:35 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 02 Oct 2017 12:02:16 +0200,
Niccolò Belli <darkbasic@linuxsystems.it> wrote:

> This is how I perform balance: btrfs balance start --full-balance
> rootfs
> This is how I perform deduplication (duperemove is from git master):
> duperemove -drh --dedupe-options=noblock
> --hashfile=../rootfs.hash <all_subvols_except_snapshots_ones>

Besides defragging removing the reflinks, duperemove will unshare your
snapshots when used in this way: If it sees duplicate blocks within the
subvolumes you give it, it will potentially unshare blocks from the
snapshots while rewriting extents.

BTW, you should be able to use duperemove with read-only snapshots if
used in read-only-open mode. But I'd rather suggest to use bees
instead: It works at whole-volume level, walking extents instead of
files. That way it is much faster, doesn't reprocess already
deduplicated extents, and it works with read-only snapshots.

Until my patch it didn't like mixed nodatasum/datasum workloads.
Currently this is fixed by just leaving nocow data alone as users
probably set nocow for exactly the reason to not fragment extents and
relocate blocks.


-- 
Regards,
Kai

Replies to list-only preferred.




* Re: Why do full balance and deduplication reduce available free  space?
  2017-10-02 19:35 ` Kai Krakow
@ 2017-10-02 20:19   ` Niccolò Belli
  2017-10-09 17:38     ` Kai Krakow
  0 siblings, 1 reply; 10+ messages in thread
From: Niccolò Belli @ 2017-10-02 20:19 UTC (permalink / raw)
  To: linux-btrfs

On 2017-10-02 21:35, Kai Krakow wrote:
> Besides defragging removing the reflinks, duperemove will unshare your
> snapshots when used in this way: If it sees duplicate blocks within the
> subvolumes you give it, it will potentially unshare blocks from the
> snapshots while rewriting extents.
> 
> BTW, you should be able to use duperemove with read-only snapshots if
> used in read-only-open mode. But I'd rather suggest to use bees
> instead: It works at whole-volume level, walking extents instead of
> files. That way it is much faster, doesn't reprocess already
> deduplicated extents, and it works with read-only snapshots.
> 
> Until my patch it didn't like mixed nodatasum/datasum workloads.
> Currently this is fixed by just leaving nocow data alone as users
> probably set nocow for exactly the reason to not fragment extents and
> relocate blocks.

The bees documentation itself says, under "Bad Btrfs Feature 
Interactions": "btrfs read-only snapshots (never tested, probably 
wouldn't work well)"

Unfortunately it seems that bees doesn't support read-only snapshots, so 
it's a no-go.

P.S.
I tried duperemove with -A, but besides taking much longer it didn't 
improve the situation.
Are you sure that the culprit is duperemove? AFAIK it shouldn't unshare 
extents...

Niccolò


* Re: Why do full balance and deduplication reduce available free space?
  2017-10-02 10:02 Why do full balance and deduplication reduce available free space? Niccolò Belli
                   ` (2 preceding siblings ...)
  2017-10-02 19:35 ` Kai Krakow
@ 2017-10-02 20:27 ` Goffredo Baroncelli
  3 siblings, 0 replies; 10+ messages in thread
From: Goffredo Baroncelli @ 2017-10-02 20:27 UTC (permalink / raw)
  To: Niccolò Belli, linux-btrfs

On 10/02/2017 12:02 PM, Niccolò Belli wrote:
> Hi,
> I have several subvolumes mounted with compress-force=lzo and autodefrag. 
> Since I use lots of snapshots (snapper keeps around 24 hourly snapshots, 7 daily 
> snapshots and 4 weekly snapshots) I had to create a systemd timer to perform a 
> full balance and deduplication each night. In fact data needs to be 
> already deduplicated when snapshots are created, otherwise I have no other way to deduplicate snapshots.


[...] 
> Data,single: Size:44.00GiB, Used:40.00GiB
>   /dev/sda5      44.00GiB
> 
> Metadata,single: Size:5.00GiB, Used:3.78GiB
>   /dev/sda5       5.00GiB
[...]

> Data,single: Size:41.00GiB, Used:40.01GiB
>   /dev/sda5      41.00GiB
> 
> Metadata,single: Size:5.00GiB, Used:3.77GiB
>   /dev/sda5       5.00GiB
[...]

> Data,single: Size:41.00GiB, Used:40.03GiB
>   /dev/sda5      41.00GiB
> 
> Metadata,single: Size:5.00GiB, Used:3.84GiB
>   /dev/sda5       5.00GiB
[...]

> Data,single: Size:41.00GiB, Used:40.04GiB
>   /dev/sda5      41.00GiB
> 
> Metadata,single: Size:5.00GiB, Used:3.97GiB
>   /dev/sda5       5.00GiB
[....]

 
> 
> It further reduced the available free space! Balance and deduplication actually reduced my available free space of 400MB!
> 400MB each night!

Your data increased by about 40MB (over 40GB, so roughly 0.1%), while your metadata increased by about 200MB (over ~4GB, roughly 5%); so

1) it seems to me that your data is already quite well deduplicated
2) (NB: this is my guess) I think that deduping (and/or re-balancing) rearranges the metadata, leading to increased disk usage. The only explanation I can find is that deduping breaks the sharing of metadata with the snapshots:
- a snapshot shares the metadata, which in turn refers to the data. Because the metadata is shared, there is only one copy. The metadata remains shared until it is changed/updated.
- when dedupe shares a file block, it updates the metadata, breaking the sharing with its snapshots and thus creating extra copies.
NB: updating snapshot metadata is the same as updating subvolume metadata
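
In figures (my arithmetic, deltas between the first and last reports in the original mail):

```shell
awk 'BEGIN {
  data_delta = 40.04 - 40.00   # GiB, data used before -> after
  meta_delta = 3.97 - 3.78     # GiB, metadata used before -> after
  printf "data +%.0f MiB (%.1f%%), metadata +%.0f MiB (%.1f%%)\n",
    data_delta * 1024, 100 * data_delta / 40.00,
    meta_delta * 1024, 100 * meta_delta / 3.78
}'
# prints: data +41 MiB (0.1%), metadata +195 MiB (5.0%)
```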


> How is it possible? Should I avoid doing balances and deduplications at all?

Try a few days without deduplication and check whether anything changes. It may be sufficient to run the dedupe less often: not every night, but every week or month.
Another option is running dedupe on all the files (including the snapshotted ones). This would still break the metadata sharing, but the extents should remain shared (IMHO :-) ). Of course the cost of deduping will increase a lot (about 24+7+4 = 35 times)
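
A sketch of that "dedupe everything" variant. The paths are hypothetical and `RUN=echo` keeps this a dry run; note that elsewhere in the thread it is argued duperemove cannot open read-only snapshots for write, so they might need to be flipped read-write first:

```shell
# Dry-run sketch: feed the snapshot trees to duperemove as well,
# so rewritten extents stay shared across snapshots.
RUN=${RUN:-echo}
$RUN duperemove -drh --dedupe-options=noblock --hashfile=/root/rootfs.hash \
  /mnt/rootfs /mnt/.snapshots/*/snapshot
```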


> 
> Thanks,
> Niccolò

BR
G.Baroncelli



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: Why do full balance and deduplication reduce available free space?
  2017-10-02 20:19   ` Niccolò Belli
@ 2017-10-09 17:38     ` Kai Krakow
  0 siblings, 0 replies; 10+ messages in thread
From: Kai Krakow @ 2017-10-09 17:38 UTC (permalink / raw)
  To: linux-btrfs

On Mon, 02 Oct 2017 22:19:32 +0200,
Niccolò Belli <darkbasic@linuxsystems.it> wrote:

> On 2017-10-02 21:35, Kai Krakow wrote:
> > Besides defragging removing the reflinks, duperemove will unshare
> > your snapshots when used in this way: If it sees duplicate blocks
> > within the subvolumes you give it, it will potentially unshare
> > blocks from the snapshots while rewriting extents.
> > 
> > BTW, you should be able to use duperemove with read-only snapshots
> > if used in read-only-open mode. But I'd rather suggest to use bees
> > instead: It works at whole-volume level, walking extents instead of
> > files. That way it is much faster, doesn't reprocess already
> > deduplicated extents, and it works with read-only snapshots.
> > 
> > Until my patch it didn't like mixed nodatasum/datasum workloads.
> > Currently this is fixed by just leaving nocow data alone as users
> > probably set nocow for exactly the reason to not fragment extents
> > and relocate blocks.  
> 
> Bad Btrfs Feature Interactions: btrfs read-only snapshots (never
> tested, probably wouldn't work well)
> 
> Unfortunately it seems that bees doesn't support read-only snapshots,
> so it's a no way.
> 
> P.S.
> I tried duperemove with -A, but besides taking much longer it didn't 
> improve the situation.
> Are you sure that the culprit is duperemove? AFAIK it shouldn't
> unshare extents...

Whether extents get unshared depends... If an extent is shared between
a r/o and a r/w snapshot, rewriting the extent for deduplication ends up
in a shared extent again, but it is no longer reflinked with the
original r/o snapshot. At least if btrfs doesn't allow changing extents
that are part of a r/o snapshot... which you all tell me is the case...

And then there's the unsharing of metadata by the deduplication process
itself.

Both effects should be minimal, though. But since chunks are allocated
in 1GiB sizes, allocation may jump by a whole 1GiB chunk just for a few
extra MiB needed. A metadata rebalance may fix this.
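
That last point doesn't need a full balance: the `-musage=N` filter rewrites only metadata chunks that are less than N% full, compacting them far more cheaply than `--full-balance`. A dry-run sketch (mountpoint hypothetical, `RUN=echo` just prints the command):

```shell
# Compact sparsely-used metadata chunks only.
RUN=${RUN:-echo}
$RUN btrfs balance start -musage=50 /mnt/rootfs
```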


-- 
Regards,
Kai

Replies to list-only preferred.




end of thread, other threads:[~2017-10-09 17:38 UTC | newest]

Thread overview: 10+ messages
-- links below jump to the message on this page --
2017-10-02 10:02 Why do full balance and deduplication reduce available free space? Niccolò Belli
2017-10-02 10:16 ` Hans van Kranenburg
2017-10-02 10:29   ` Niccolò Belli
2017-10-02 11:14     ` Paul Jones
2017-10-02 11:26       ` Is it really possible to dedupe read-only snapshots!? Niccolò Belli
2017-10-02 14:15 ` Why do full balance and deduplication reduce available free space? Niccolò Belli
2017-10-02 19:35 ` Kai Krakow
2017-10-02 20:19   ` Niccolò Belli
2017-10-09 17:38     ` Kai Krakow
2017-10-02 20:27 ` Goffredo Baroncelli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).