linux-btrfs.vger.kernel.org archive mirror
From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Thomas Leister <thomas.leister@mailbox.org>,
	dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org, lxc-devel@lists.linuxcontainers.org
Subject: Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota
Date: Tue, 31 Jul 2018 12:03:37 -0400
Message-ID: <80e77411-2905-46a9-a5b1-fbbbb824dbf5@gmail.com>
In-Reply-To: <b10a3950-0b86-e4fb-3621-f4c5ad3d79e7@gmx.com>

On 2018-07-31 10:32, Qu Wenruo wrote:
> 
> 
> On 2018-07-31 21:49, Thomas Leister wrote:
>> Dear David,
>> hello everyone,
>>
>> during a recent project of mine involving LXD and BTRFS I found out that
>> quotas on BTRFS subvolumes are enforced, but file system usage and
>> limits set via quotas are not reported correctly in LXC containers.
>>
>> I've found this discussion regarding my problem:
>> https://github.com/lxc/lxd/issues/2180
> 
> That's not the expected usage of btrfs qgroup/quota.
> 
> Quota only accounts for how many bytes are used exclusively by, or shared
> between, subvolumes at the extent level.
> 
>>
>> There was already a proposal to introduce subvolume quota support some
>> time ago:
>> https://marc.info/?l=linux-btrfs&m=147576434114415&w=2
> 
> It's in fact impossible, unless I'm missing something.
> 
> There are several technical problems in the proposal:
> 
> 1) Multi-level qgroups
>     The effective limit is constrained by all related qgroups, including
>     higher-level ones.
>     Such a design makes it pretty hard to calculate the real limit.
> 
> 2) Different limitations on exclusive/shared bytes
>     Btrfs can set different limits on exclusive/shared bytes, further
>     complicating the problem.
> 
> 3) Btrfs quota only accounts for data/metadata used by the subvolume
>     It does not cover the shared trees (mentioned below), and in fact such
>     shared trees can be pretty large (especially the extent tree and csum
>     tree).
>     Accounting for only the quota limit would easily hit a real ENOSPC IMHO.
> 
>>
>> @David as I've seen your response on that topic on the mailing list,
>> maybe you can tell me if there are any plans to support correct
>> subvolume quota reporting e.g. for "df -h" calls from within a
>> container? Maybe there's already something on your / SUSE's roadmap? :-)
>>
>> As more and more container environments spin up these days, there might
>> be a growing demand on that :-) Personally I'd really appreciate if I
>> could read the current file system usage and limit from within a
>> container using BTRFS as storage backend.
> 
> With the current btrfs design, I'm skeptical that such a feature can be
> implemented.
> The main problem here is that btrfs doesn't do the full work of an LVM
> (unlike ZFS, IIRC).
> It doesn't really manage multiple volumes; that's why it's called a
> subvolume in btrfs.
ZFS quotas work the way they do not because it's trivial to implement 
them that way due to the underlying implementation, but because they 
provide the functionality that people actually want.  Being able to put 
proper hard limits on space usage for a given volume/subvolume/dataset 
is _critical_ for a large number of enterprise deployment scenarios. 
The same goes for being able to set a fixed space reservation for a given
volume/subvolume/dataset.  If we want to even remotely compete (and it
sure seems like we do), we need equivalent features that work
intuitively for _regular_ people (not just those who have an intimate
understanding of the internal workings of BTRFS).

> A subvolume is not a fully usable fs, it's just a subset of a full fs.
> It relies on all the other trees (root tree, extent tree, chunk tree,
> csum tree, and quota tree in this case) to do all the work.
A ZFS dataset isn't a fully usable FS either.  It's still dependent on 
all the underlying infrastructure from the zpool itself (and so are 
zvols), which, in fact, does the vast majority of the work.  The
difference here is that a ZFS dataset is far more self-contained than a 
BTRFS subvolume.  If we ever want sane per-subvolume storage profiles or 
mount options, we're going to need to get a lot closer to that anyway.

> Thus it's pretty hard to implement such a special-purpose df call.
To implement it perfectly, maybe.  But most applications don't need it
to be perfect; they just want to know how much space they can actually use.
Even a trivial, blatantly imperfect implementation that just shows you
the total space that can be used and how much is used based on quotas
will give better behavior than the current case of just hiding the
quotas behind a root-only call.  Pretty much anything which does its
own disk usage management is currently broken on BTRFS when quotas are
being used.  Just reporting the quota as the total space, and the space
accounted to the subvolume by the quota as the used space, would fix
almost all such applications.
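
To make the "trivial, imperfect" idea concrete, here is a minimal sketch
in Python.  It is not kernel code and not an existing btrfs interface: the
QGroup class, its fields, and quota_statfs() are hypothetical names made up
for illustration.  It also assumes only a limit on referenced bytes, ignores
exclusive limits and the shared trees, and handles higher-level qgroups by
simply taking the tightest remaining headroom, which is one possible policy
among several.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class QGroup:
    """Hypothetical in-memory view of one qgroup (not a real btrfs API)."""
    referenced: int                        # bytes accounted to this qgroup
    max_referenced: Optional[int] = None   # limit in bytes, None = unlimited
    parent: Optional["QGroup"] = None      # higher-level qgroup, if any


def quota_statfs(qg: QGroup, fs_total: int, fs_free: int) -> Tuple[int, int, int]:
    """Return (total, used, avail) in bytes to report for a subvolume.

    Assumption: the reported free space is the smallest headroom left by
    any limit on this qgroup or its ancestors, capped by the real free
    space of the filesystem.
    """
    avail = fs_free
    limited = False
    node = qg
    while node is not None:
        if node.max_referenced is not None:
            limited = True
            headroom = max(node.max_referenced - node.referenced, 0)
            avail = min(avail, headroom)
        node = node.parent

    if not limited:
        # No quota applies: fall back to plain filesystem-wide numbers.
        return fs_total, fs_total - fs_free, fs_free

    used = qg.referenced
    return used + avail, used, avail
```

With a single qgroup limited to 10 GiB that has 4 GiB accounted to it, this
reports 10 GiB total, 4 GiB used, and 6 GiB available, which is all most
applications doing their own disk usage management need to see.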
> 
> On the other hand, isn't it easier to implement a special interface for
> containers to get the real disk usage/limit, rather than using the old
> vanilla df interface?
This isn't just an issue for containers.  Anybody who is using quotas
the way they are typically used in ZFS deployments has the same issue, and
there _ARE_ people doing that.  See, for example, OpenSUSE, where quotas
(if they are enabled because of snapshot support) are used to limit
space consumption of paths like /tmp.
