From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: understanding disk space usage
Date: Thu, 9 Feb 2017 07:47:56 -0500
Message-ID: <125c2b27-928e-5261-a0ce-1622b7a70a76@gmail.com>
In-Reply-To: <22683.37260.208424.336485@tree.ty.sabi.co.uk>
On 2017-02-08 16:45, Peter Grandi wrote:
> [ ... ]
>> The issue isn't total size, it's the difference between total
>> size and the amount of data you want to store on it. and how
>> well you manage chunk usage. If you're balancing regularly to
>> compact chunks that are less than 50% full, [ ... ] BTRFS on
>> 16GB disk images before with absolutely zero issues, and have
>> a handful of fairly active 8GB BTRFS volumes [ ... ]
>
> Unfortunately balance operations are quite expensive, especially
> from inside VMs. On the other hand if the system is not much
> disk constrained relatively frequent balances is a good idea
> indeed. It is a bit like the advice in the other thread on OLTP
> to run frequent data defrags, which are also quite expensive.
That depends on how and when you do them. A full balance isn't part of
regular maintenance and never should be. Regular partial balances that
clean up mostly empty chunks absolutely should be, and they are pretty
inexpensive in terms of both time and resource usage. A balance with
-dusage=20 -musage=20 should run in at most a few seconds on most
reasonably sized filesystems, even on low-end systems like a Raspberry
Pi, and running it at least weekly will significantly improve the
chances that you don't run into a situation like this.
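
For illustration, the partial balance described above could be wrapped
roughly like this. It is only a sketch: the function name
partial_balance, the DRY_RUN switch, and the mount point in the usage
note are my own placeholders, and only the -dusage/-musage filters come
from the text above.

```shell
#!/bin/sh
# Sketch of a light periodic balance. The function name and DRY_RUN
# switch are hypothetical; 20 is the usage threshold mentioned above.
# By default this only prints the command. Set DRY_RUN=0 to really run
# it, which needs root and a mounted BTRFS filesystem.
partial_balance() {
    mnt=$1            # mount point of the filesystem to balance
    usage=${2:-20}    # only touch chunks at most this percent full
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "btrfs balance start -dusage=$usage -musage=$usage $mnt"
    else
        btrfs balance start "-dusage=$usage" "-musage=$usage" "$mnt"
    fi
}
```

Calling `partial_balance /mnt/data` (a made-up path) prints the command
it would run; something along these lines could then be called from a
weekly cron job or timer.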
>
> Both combined are like running the compactor/cleaner on log
> structured (another variants of "COW") filesystems like NILFS2:
> running that frequently means tighter space use and better
> locality, but is quite expensive too.
If you run with autodefrag, then you should rarely if ever need to
actually run a full defrag operation unless you're storing lots of
database files, VM disk images, or similar stuff. This goes double on
an SSD.
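
As a rough sketch of what that looks like in practice: autodefrag is a
mount option, and a manual defrag can be limited to the directories
that actually hold the database files or VM images. The helper names
and the DRY_RUN switch below are placeholders of mine, not anything
from btrfs-progs.

```shell
#!/bin/sh
# Sketch only: helper names and the DRY_RUN switch are hypothetical.
# By default the commands are printed, not executed. Set DRY_RUN=0 to
# run them for real (needs root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Turn on autodefrag for an already-mounted BTRFS filesystem.
enable_autodefrag() {
    run mount -o remount,autodefrag "$1"
}

# One-off recursive defrag of a single directory tree, for example one
# holding VM disk images, instead of the whole filesystem.
defrag_tree() {
    run btrfs filesystem defragment -r "$1"
}
```

To keep the option across reboots, autodefrag would also have to go
into the options field of the filesystem's /etc/fstab entry.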
>
>>> [ ... ] My impression is that the Btrfs design trades space
>>> for performance and reliability.
>
>> In general, yes, but a more accurate statement would be that
>> it offers a trade-off between space and convenience. [ ... ]
>
> It is not quite "convenience", it is overhead: whole-volume
> operations like compacting, defragmenting (or fscking) tend to
> cost significantly in IOPS and also in transfer rate, and on
> flash SSDs they also consume lifetime.
Overhead is the inverse of convenience. By over-provisioning to a
greater degree, you reduce the need to worry about those 'expensive'
operations, which cuts both resource overhead and management overhead.
>
> Therefore personally I prefer to have quite a bit of unused
> space in Btrfs or NILFS2, at a minimum around double at 10-20%
> than the 5-10% that I think is the minimum advisable with
> conventional designs.
I can agree on this point: over-provisioning is mandatory to a much
greater degree on COW filesystems.
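
To make the headroom idea concrete, here is a minimal sketch of the
arithmetic. The function names are hypothetical, the 20% default is
just the upper end of the range quoted above, and the total and used
figures are assumed to be read off by hand from something like
`btrfs filesystem usage`.

```shell
#!/bin/sh
# Sketch only: function names and the default threshold are assumptions.

# free_pct TOTAL USED
# Prints the integer percentage of space still free (rounded down).
free_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# needs_headroom TOTAL USED [MIN_FREE_PCT]
# Succeeds (exit status 0) when free space is below the wanted minimum.
needs_headroom() {
    [ "$(free_pct "$1" "$2")" -lt "${3:-20}" ]
}
```

For example, `needs_headroom 16000 14000` succeeds because only 12% is
free, while `needs_headroom 16000 8000` fails because half the space is
still free.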