From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-it0-f54.google.com ([209.85.214.54]:35101 "EHLO mail-it0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751525AbdBIMsj (ORCPT ); Thu, 9 Feb 2017 07:48:39 -0500
Received: by mail-it0-f54.google.com with SMTP id 203so124349522ith.0 for ; Thu, 09 Feb 2017 04:48:02 -0800 (PST)
Received: from [191.9.206.254] (rrcs-70-62-41-24.central.biz.rr.com. [70.62.41.24]) by smtp.gmail.com with ESMTPSA id g76sm13081817ioj.36.2017.02.09.04.48.00 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 09 Feb 2017 04:48:00 -0800 (PST)
Subject: Re: understanding disk space usage
To: Linux Btrfs
References: <7912da41-d58a-d57f-47cd-508bc709a761@cn.fujitsu.com> <22683.12104.679173.639568@tree.ty.sabi.co.uk> <171155ef-93c2-f438-3bbd-ca550381c80d@gmail.com> <22683.37260.208424.336485@tree.ty.sabi.co.uk>
From: "Austin S. Hemmelgarn"
Message-ID: <125c2b27-928e-5261-a0ce-1622b7a70a76@gmail.com>
Date: Thu, 9 Feb 2017 07:47:56 -0500
MIME-Version: 1.0
In-Reply-To: <22683.37260.208424.336485@tree.ty.sabi.co.uk>
Content-Type: text/plain; charset=windows-1252; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 2017-02-08 16:45, Peter Grandi wrote:
> [ ... ]
>> The issue isn't total size, it's the difference between total
>> size and the amount of data you want to store on it. and how
>> well you manage chunk usage. If you're balancing regularly to
>> compact chunks that are less than 50% full, [ ... ] BTRFS on
>> 16GB disk images before with absolutely zero issues, and have
>> a handful of fairly active 8GB BTRFS volumes [ ... ]
>
> Unfortunately balance operations are quite expensive, especially
> from inside VMs. On the other hand if the system is not much
> disk constrained relatively frequent balances is a good idea
> indeed. It is a bit like the advice in the other thread on OLTP
> to run frequent data defrags, which are also quite expensive.
That depends on how and when you do them. A full balance isn't part of
regular maintenance, and never should be. Regular partial balances done
to clean up mostly empty chunks absolutely should be part of regular
maintenance, and are pretty inexpensive in terms of both time and
resource usage. Balance with -dusage=20 -musage=20 should run in at
most a few seconds on most reasonably sized filesystems, even on
low-end systems like a Raspberry Pi, and running that at least weekly
will significantly improve the chances that you don't encounter a
situation like this.
>
> Both combined are like running the compactor/cleaner on log
> structured (another variants of "COW") filesystems like NILFS2:
> running that frequently means tighter space use and better
> locality, but is quite expensive too.
If you run with autodefrag, then you should rarely if ever need to
actually run a full defrag operation unless you're storing lots of
database files, VM disk images, or similar stuff. This goes double on
an SSD.
>
>>> [ ... ] My impression is that the Btrfs design trades space
>>> for performance and reliability.
>
>> In general, yes, but a more accurate statement would be that
>> it offers a trade-off between space and convenience. [ ... ]
>
> It is not quite "convenience", it is overhead: whole-volume
> operations like compacting, defragmenting (or fscking) tend to
> cost significantly in IOPS and also in transfer rate, and on
> flash SSDs they also consume lifetime.
Overhead is the inverse of convenience. By over-provisioning to a
greater degree, you're reducing the need to worry about those
'expensive' operations, reducing both resource overhead and management
overhead.
>
> Therefore personally I prefer to have quite a bit of unused
> space in Btrfs or NILFS2, at a minimum around double at 10-20%
> than the 5-10% that I think is the minimum advisable with
> conventional designs.
I can agree on this point: over-provisioning is mandatory to a much
greater degree on COW filesystems.
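
For reference, the partial balance mentioned above could be wrapped up roughly like this for a weekly cron job. This is only a sketch: the function name balance_cmd and the mount point /mnt/data are made up for illustration, and the 20% threshold is just the value suggested earlier in this message. It prints the command instead of running it, so it is safe to try without root.

```shell
#!/bin/sh
# Sketch only: balance_cmd and /mnt/data are hypothetical names,
# not anything from the original message.

# Build a partial-balance command that only rewrites data (-dusage) and
# metadata (-musage) chunks at or below the given usage percentage.
balance_cmd() {
    mnt="$1"          # mount point of the BTRFS filesystem
    pct="${2:-20}"    # usage threshold in percent, defaults to 20
    printf 'btrfs balance start -dusage=%s -musage=%s %s\n' "$pct" "$pct" "$mnt"
}

# Dry run: print the command rather than executing it.
balance_cmd /mnt/data
```

To actually run it, execute the printed command as root against a mounted BTRFS filesystem, for example from a weekly cron entry. Printing first keeps the sketch harmless on machines with no BTRFS volume.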