linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Linux Btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: understanding disk space usage
Date: Thu, 9 Feb 2017 07:47:56 -0500
Message-ID: <125c2b27-928e-5261-a0ce-1622b7a70a76@gmail.com>
In-Reply-To: <22683.37260.208424.336485@tree.ty.sabi.co.uk>

On 2017-02-08 16:45, Peter Grandi wrote:
> [ ... ]
>> The issue isn't total size, it's the difference between total
>> size and the amount of data you want to store on it, and how
>> well you manage chunk usage. If you're balancing regularly to
>> compact chunks that are less than 50% full, [ ... ] BTRFS on
>> 16GB disk images before with absolutely zero issues, and have
>> a handful of fairly active 8GB BTRFS volumes [ ... ]
>
> Unfortunately balance operations are quite expensive, especially
> from inside VMs. On the other hand if the system is not much
> disk constrained, relatively frequent balances are a good idea
> indeed. It is a bit like the advice in the other thread on OLTP
> to run frequent data defrags, which are also quite expensive.
That depends on how and when you do them.  A full balance isn't part of
regular maintenance, and never should be.  Regular partial balances
that clean up mostly empty chunks absolutely should be, and they are
pretty inexpensive in terms of both time and resource usage.  A balance
with -dusage=20 -musage=20 should run in at most a few seconds on most
reasonably sized filesystems, even on low-end systems like a Raspberry
Pi, and running one at least weekly will significantly improve the
chances that you don't encounter a situation like this.
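
As a rough illustration of the partial balance described above, here is a
minimal Python sketch that builds the command line for `btrfs balance start`
with the usage filters. The helper names (`balance_command`,
`run_partial_balance`) and the idea of calling it from a weekly job are
assumptions for the example, not something from this thread; actually
running it needs root and a mounted BTRFS filesystem.

```python
import subprocess


def balance_command(mountpoint, data_usage=20, metadata_usage=20):
    """Build the argv for a partial balance.

    The usage filters restrict the balance to data (-dusage) and
    metadata (-musage) chunks whose usage is below the given percentage.
    """
    for pct in (data_usage, metadata_usage):
        if not 0 <= pct <= 100:
            raise ValueError("usage filter must be a percentage between 0 and 100")
    return [
        "btrfs", "balance", "start",
        f"-dusage={data_usage}",
        f"-musage={metadata_usage}",
        mountpoint,
    ]


def run_partial_balance(mountpoint, **kwargs):
    # Hypothetical wrapper: requires root and a mounted BTRFS filesystem.
    # Raises subprocess.CalledProcessError if the balance fails.
    subprocess.run(balance_command(mountpoint, **kwargs), check=True)
```

Calling `run_partial_balance("/mnt/data")` from a weekly timer or cron job
would be one way to schedule it; the mount point here is a placeholder.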
>
> Both combined are like running the compactor/cleaner on log
> structured (another variant of "COW") filesystems like NILFS2:
> running that frequently means tighter space use and better
> locality, but is quite expensive too.
If you run with autodefrag, then you should rarely if ever need to 
actually run a full defrag operation unless you're storing lots of 
database files, VM disk images, or similar stuff.  This goes double on 
an SSD.
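
If it helps, whether a filesystem is actually mounted with autodefrag can
be checked from its mount options. The sketch below parses text in the
format of /proc/mounts; it assumes the option shows up there as
`autodefrag`, and the function name `has_autodefrag` is made up for this
example.

```python
def has_autodefrag(mounts_text, mountpoint):
    """Return True if `mountpoint` is a btrfs mount with autodefrag set.

    `mounts_text` is expected to look like /proc/mounts:
    device, mount point, filesystem type, comma-separated options, ...
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, target, fstype, options = fields[:4]
        if target == mountpoint and fstype == "btrfs":
            return "autodefrag" in options.split(",")
    return False
```

On a live system this could be called as
`has_autodefrag(open("/proc/mounts").read(), "/")`.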
>
>>> [ ... ] My impression is that the Btrfs design trades space
>>> for performance and reliability.
>
>> In general, yes, but a more accurate statement would be that
>> it offers a trade-off between space and convenience. [ ... ]
>
> It is not quite "convenience", it is overhead: whole-volume
> operations like compacting, defragmenting (or fscking) tend to
> cost significantly in IOPS and also in transfer rate, and on
> flash SSDs they also consume lifetime.
Overhead is the inverse of convenience.  By over-provisioning to a
greater degree, you reduce the need to worry about those 'expensive'
operations, which cuts both resource overhead and management overhead.
>
> Therefore personally I prefer to have quite a bit of unused
> space in Btrfs or NILFS2, at a minimum around 10-20%, double
> the 5-10% that I think is the minimum advisable with
> conventional designs.
I agree on this point: over-provisioning is mandatory to a much
greater degree on COW filesystems.
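
For a rough idea of how such a headroom check might look, here is a small
Python sketch. The 10% default is taken from the low end of the 10-20%
figure quoted above, and the function names are invented for the example.
Note that the numbers from `shutil.disk_usage` come from statvfs, which on
BTRFS is only an estimate; `btrfs filesystem usage` gives a more detailed
picture of allocated versus used space.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Fraction of the filesystem that is still free, between 0 and 1."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_headroom(path, min_free=0.10):
    """Return True if at least `min_free` of the filesystem at `path` is free.

    min_free=0.10 is an assumed threshold for illustration only.
    """
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= min_free
```

For example, a 16GB volume with 2GB free has a free fraction of 0.125,
which passes a 10% threshold but not a 20% one.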
