Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

From: Tomasz Pala <gotar@polanet.pl>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota
Date: Fri, 10 Aug 2018 09:17:49 +0200
Message-ID: <20180810071748.GA5473@polanet.pl>
In-Reply-To: <ecd793f3-55b3-84d4-80cf-3382a580037f@gmx.com>

On Fri, Aug 10, 2018 at 07:35:32 +0800, Qu Wenruo wrote:

>> when limiting somebody's data space we usually don't care about the
>> underlying "savings" coming from any deduplicating technique - these are
>> purely bonuses for system owner, so he could do larger resource overbooking.
> 
> In reality that's definitely not the case.

Definitely? How do you "sell" disk space when there is no upper bound?
Every, and I mean _every_, resource quota out in the wild gives you a user perspective.
You can assign CPU cores/time, RAM or network bandwidth with a HARD limit.

Only after that _can_ you sometimes assign some best-effort outer limits
that are not guaranteed, like extra network bandwidth or grace periods
for filesystem usage (disregarding technical details: in the case of
quota you move the hard limit further out and apply a lower soft limit).
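
To make the hard/soft distinction concrete, here is a minimal sketch of
the classic soft-limit-with-grace-period check. It is only an
illustration of the semantics described above, not the code of any real
quota implementation; the function name, the parameters and the returned
strings are all made up for this example.

```python
def check_write(used, request, soft, hard, over_soft_since, now, grace):
    """Classify a write under soft/hard limits (illustrative names only).

    used, request, soft and hard are byte counts. over_soft_since is the
    time the soft limit was first exceeded (None if currently under it);
    now and grace use the same time unit.
    """
    new_used = used + request
    if new_used > hard:
        return "deny"                # the hard limit is never exceeded
    if new_used > soft:
        if over_soft_since is not None and now - over_soft_since > grace:
            return "deny"            # the grace period has expired
        return "allow-with-warning"  # best-effort space, not guaranteed
    return "allow"
```

The space below the soft limit is what the user "bought"; the space
between the soft and the hard limit is the best-effort extra.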

This is the primary quota usage. Quotas don't save system resources,
quotas are valuables to "sell" (by the quotes around "sell" I mean every
possible kind of allocation, including inter-organisation accounting).

Quotas are overbookable by design and, like I said before, the underlying
savings mechanisms allow the sysadmin to increase the actual overbooking ratio.

If I run out of CPU, RAM, storage or network I simply need to expand
that resource. I won't shrink quotas in such a case.
Or I apply some other resource-saving technique, like LVM with VDO,
swapping, RAM deduplication etc.

If that is not the use case of btrfs quotas, then the feature should be
renamed so as not to confuse users. Using incorrect terms for widely
known things leads to user frustration at least.

> From what I see, most users would care more about exclusively used space
> (excl), other than the total space one subvolume is referring to (rfer).

Consider this:
1. there is some "template" system-wide snapshot,
2. users X and Y have CoW copies of it - both see "0 bytes exclusive"?
3. sysadm removes "template" - what happens to X and Y quotas?
4. user X removes his copy - what happens to Y quota?

The first thing about virtually every mechanism should be
discoverability and reliability. I expect my quota not to change without
my interaction. Never. How did you cope with this?
If you did not - how are you going to explain such weird behaviour to users?

Once again: the quota numbers *I* got must not be influenced by external
operations or foreign users.
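
The four-step scenario above can be played through in a toy model. This
is only a sketch under my own assumptions (a subvolume is a set of extent
ids, "rfer" sums every extent it references, "excl" sums the extents
nobody else references); it is not how btrfs implements qgroup
accounting, and the names and extent sizes are invented.

```python
from collections import Counter

def qgroup_numbers(subvols, extent_size):
    """Return {name: (rfer, excl)} for a toy model of shared extents.

    subvols maps a subvolume name to the set of extent ids it references.
    """
    # How many subvolumes reference each extent.
    refs = Counter(e for extents in subvols.values() for e in extents)
    out = {}
    for name, extents in subvols.items():
        rfer = sum(extent_size[e] for e in extents)
        excl = sum(extent_size[e] for e in extents if refs[e] == 1)
        out[name] = (rfer, excl)
    return out

extent_size = {"e1": 4096, "e2": 8192}
subvols = {
    "template": {"e1", "e2"},   # step 1: system-wide template
    "X": {"e1", "e2"},          # step 2: CoW copies for users X and Y
    "Y": {"e1", "e2"},
}
step2 = qgroup_numbers(subvols, extent_size)

del subvols["template"]         # step 3: sysadm removes the template
step3 = qgroup_numbers(subvols, extent_size)

del subvols["X"]                # step 4: user X removes his copy
step4 = qgroup_numbers(subvols, extent_size)
```

In this model Y's exclusive number stays at 0 through steps 2 and 3 and
jumps to the full 12288 bytes at step 4, although Y has not written a
single byte.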

> The most common case is, you do a snapshot, user would only care how
> much new space can be written into the subvolume, other than the total
> subvolume size.

If only that were the case... then exactly - I do care how much new
data is _guaranteed_ to fit on my storage.

So please tell me, as I might have got it wrong - what happens if the
source subvolume gets removed and the CoWed data is not shared anymore?
Is the quota recalculated? - this would be wrong, as no new data was written.
Is the quota left intact? - this is wrong too, as it gives a false view of the exclusive space taken.

This is just another reincarnation of the famous "btrfs df" problem you
couldn't comprehend for so long - when reporting "disk FREE" status I want
to know the amount of data that is guaranteed to be writable in the current
RAID profile, i.e. ignoring any possible savings from compression etc.
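
As a rough illustration of the number I mean: take the raw unallocated
space and divide it by the number of copies the profile stores. This is
a deliberate simplification (it assumes equally sized devices and
ignores metadata and chunk granularity), and the table and function name
are mine, not any btrfs interface.

```python
# Copies of each data block stored by a profile (simplified assumption).
COPIES = {"single": 1, "raid0": 1, "dup": 2, "raid1": 2}

def guaranteed_free(unallocated_raw_bytes, profile):
    """Bytes of user data guaranteed to fit, ignoring compression savings."""
    return unallocated_raw_bytes // COPIES[profile]
```

For example, 100 GiB of raw unallocated space under raid1 would be
reported as 50 GiB free in this model.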

Please note: my assumptions are based on
https://btrfs.wiki.kernel.org/index.php/Quota_support

"File copy and file deletion may both affect limits since the unshared
limit of another qgroup can change if the original volume's files are
deleted and only one copy is remaining"

so if I write something invalid this might be the source of my mistake.

>> And the numbers accounted should reflect the uncompressed sizes.
> 
> No way for current extent based solution.

OK, since the data is provided by the user, its compressibility
might be considered his saving (we only provide transparency).

>> Moreover - if there would be per-subvolume RAID levels someday, the data
>> should be accouted in relation to "default" (filesystem) RAID level,
>> i.e. having a RAID0 subvolume on RAID1 fs should account half of the
>> data, and twice the data in an opposite scenario (like "dup" profile on
>> single-drive filesystem).
> 
> No possible again for current extent based solution.

Doesn't an extent have information about the devices it's cloned on? But OK,
this is not important until per-subvolume profiles are available.
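
For the record, the accounting rule I proposed in the quoted paragraph
boils down to scaling by the ratio of copies. A hypothetical sketch
follows (per-subvolume profiles do not exist, so nothing here is a real
btrfs interface; the table and function name are made up):

```python
# Copies of each data block stored by a profile (simplified assumption).
COPIES = {"single": 1, "raid0": 1, "dup": 2, "raid1": 2}

def charged_bytes(logical_bytes, subvol_profile, fs_profile):
    """Bytes charged to the quota, relative to the filesystem's default profile."""
    return logical_bytes * COPIES[subvol_profile] // COPIES[fs_profile]
```

With this rule, 1000 bytes in a RAID0 subvolume on a RAID1 filesystem are
charged as 500, and 1000 bytes in a "dup" subvolume on a single-profile
filesystem are charged as 2000.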

>> In short: values representing quotas are user-oriented ("the numbers one
>> bought"), not storage-oriented ("the numbers they actually occupy").
> 
> Well, if something is not possible or brings so big performance impact,
> there will be no argument on how it should work in the first place.

Actually I think you did something overcomplicated (shared/exclusive),
which will only lead to user confusion (especially when his data
becomes "exclusive" one day without any known reason), is misnamed... and
does not reflect anything valuable, unless the problems with extent
fragmentation are already resolved somehow?

So IMHO current quotas are:
- not discoverable for the user (shared->exclusive transition of my data by someone else's action),
- not reliable for the sysadm (an offensive write pattern by any user can allocate virtually any space despite quotas).

-- 
Tomasz Pala <gotar@pld-linux.org>
