Re: Qgroups are not applied when snapshotting a subvol?

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>,
	Moritz Sichert <moritz+linux@sichert.me>,
	Andrei Borzenkov <arvidjaar@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: Qgroups are not applied when snapshotting a subvol?
Date: Tue, 28 Mar 2017 07:44:56 -0400
Message-ID: <24d10583-543f-cf1e-9f00-dd419a593f3d@gmail.com>
In-Reply-To: <3b03ab4a-a0c2-27df-c6e4-c5f60fd4b5db@cn.fujitsu.com>

On 2017-03-27 21:49, Qu Wenruo wrote:
>
>
> At 03/27/2017 08:01 PM, Austin S. Hemmelgarn wrote:
>> On 2017-03-27 07:02, Moritz Sichert wrote:
>>> On 27.03.2017 at 05:46, Qu Wenruo wrote:
>>>>
>>>>
>>>> At 03/27/2017 11:26 AM, Andrei Borzenkov wrote:
>>>>> 27.03.2017 03:39, Qu Wenruo wrote:
>>>>>>
>>>>>>
>>>>>> At 03/26/2017 06:03 AM, Moritz Sichert wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I tried to configure qgroups on a btrfs filesystem but was really
>>>>>>> surprised that when you snapshot a subvolume, the snapshot will not
>>>>>>> be assigned to the qgroup the subvolume was in.
>>>>>>>
>>>>>>> As an example consider the small terminal session in the
>>>>>>> attachment: I create a subvol A, assign it to qgroup 1/1 and set a
>>>>>>> limit of 5M on that qgroup. Then I write a file into A and
>>>>>>> eventually get "disk quota exceeded". Then I create a snapshot of A
>>>>>>> and call it B. B will not be assigned to 1/1 and writing a file
>>>>>>> into B confirms that no limits at all are imposed for B.
>>>>>>>
>>>>>>> I feel like I must be missing something here. Considering that
>>>>>>> creating a snapshot does not require root privileges this would mean
>>>>>>> that any user can just circumvent any quota and therefore make them
>>>>>>> useless.
>>>>>>>
>>>>>>> Is there a way to enforce quotas even when a user creates snapshots?
>>>>>>>
>>>>>>
>>>>>> Yes, there is always a way to attach the subvolume/snapshot to a
>>>>>> specified higher-level qgroup.
>>>>>>
>>>>>> Just use "btrfs subvolume snapshot -i 1/1".
>>>>>>
>>>>>
>>>>> This requires cooperation from whoever creates the subvolume, while
>>>>> the question was: is it possible to enforce it, without the need for
>>>>> an explicit option/action when the snapshot is created?
>>>>>
>>>>> To reiterate: if the user omits "-i 1/1", (s)he "escapes" from quota
>>>>> enforcement.
>>>>
>>>> What if the user really wants to create a subvolume assigned to
>>>> another group?
>>>>
>>>> You're implying a *policy* that if the source subvolume belongs to a
>>>> higher-level qgroup, then a snapshot created from it should also
>>>> belong to that higher-level qgroup.
>>>>
>>>> However, the kernel should only provide *mechanism*, not *policy*.
>>>> And btrfs does that: it provides a way to do it; whether to use it or
>>>> not is the user's responsibility.
>>>>
>>>> If you want to implement that policy, please do it at a higher level,
>>>> something like SUSE snapper, not in the kernel.
>>>
>>> The problem is, I can't enforce the policy because *every user* can
>>> create snapshots. Even if I restricted the btrfs executable so that
>>> only root could execute it, this wouldn't help. As using the btrfs
>>> ioctl is allowed for any user, they could just get the executable
>>> from somewhere else.
>> To reiterate and reinforce this:
>> If it is not possible to enforce new subvolumes counting for their
>> parent quota, and there is no option to prevent non-root (or
>> non-CAP_SYS_ADMIN) users from creating new subvolumes, then BTRFS
>> qgroups are useless on any system with shell access because a user can
>> trivially escape their quota restrictions (or hide from accounting) by
>> creating a new subvolume which is outside of their qgroup and storing
>> data there.
>>
>> Ideally, there should be an option to disable user subvolume creation
>> (it arguably should be the default, because of resource exhaustion
>> issues, but that's a separate argument), and there should be an option
>> in the kernel to force specific behavior.  Both cases are policy, but
>> they are policy that can only be concretely enforced _by the kernel_.
>>
>>
> The problem is how we should treat subvolumes.
>
> A btrfs subvolume sits somewhere between a directory and a (logical)
> volume as used in a traditional stacked solution.
>
> While we allow normal users to create/delete/modify directories as long
> as they follow access control, we require privilege to
> create/delete/modify volumes.
No, we only require privilege for certain modifications and for deleting
subvolumes.  Regular users can create subvolumes with no privileges
whatsoever, and most basic directory operations (rename, chown, chmod, 
etc) work just fine within normal UNIX DAC permissions.  Unless you're 
running some specially patched kernel or some LSM (SELinux possibly) 
that somehow restricts access to the ioctl, you can always create 
subvolumes.

This is part of the reason that I'm personally hesitant to use BTRFS on
systems where end users have shell access: it's a DoS waiting to happen.
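
For reference, the sequence from the original report can be sketched
roughly as follows. This only builds the command lines and does not run
anything. The mount point /mnt/pool and the level-0 qgroup id 0/257 are
placeholders, not values from the report; the real id has to be looked
up after the subvolume exists.

```python
from typing import List, Optional


def setup_argv(mnt: str, subvol: str, subvol_qgroup: str,
               parent: str, limit: str) -> List[List[str]]:
    """Commands that put one subvolume under a limited higher-level qgroup."""
    return [
        ["btrfs", "quota", "enable", mnt],
        ["btrfs", "subvolume", "create", f"{mnt}/{subvol}"],
        ["btrfs", "qgroup", "create", parent, mnt],
        # subvol_qgroup is the subvolume's own 0/<id> qgroup; the id is only
        # known after creation, so the value passed in here is a placeholder.
        ["btrfs", "qgroup", "assign", subvol_qgroup, parent, mnt],
        ["btrfs", "qgroup", "limit", limit, parent, mnt],
    ]


def snapshot_argv(src: str, dst: str, qgroup: Optional[str] = None) -> List[str]:
    """Snapshot command; `-i` is only present if the caller asks for it."""
    argv = ["btrfs", "subvolume", "snapshot"]
    if qgroup is not None:
        argv += ["-i", qgroup]
    return argv + [src, dst]


if __name__ == "__main__":
    mnt = "/mnt/pool"  # hypothetical mount point
    for argv in setup_argv(mnt, "A", "0/257", "1/1", "5M"):
        print(" ".join(argv))
    # What any unprivileged user can run: no -i, so B ends up outside 1/1.
    print(" ".join(snapshot_argv(f"{mnt}/A", f"{mnt}/B")))
    # What the user would have to run voluntarily to stay inside 1/1.
    print(" ".join(snapshot_argv(f"{mnt}/A", f"{mnt}/B", qgroup="1/1")))
```

The only difference between the last two commands is the "-i 1/1" that
the user is free to leave out, which is the whole problem.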
>
> The developers chose to treat a btrfs subvolume like a directory, which
> makes it quite easy to operate in the normal use case, at the expense of
> qgroup limits, which were not a major feature (or did not even exist) at
> that time.
>
> IIRC, in the early days of btrfs we didn't have a full idea of what the
> use cases could be.
> This is common: a lot of problems (even bad design) can only be found
> after enough feedback from end users.
>
> Personally speaking, I would prefer to restrict subvolume
> creation/deletion to privileged users only, and use a daemon as a proxy
> for such privileged operations.
> That way we can do better accounting/access control without bothering
> the kernel.
I will agree that a daemon for this can be useful, but even if we add a 
mount option to restrict operations on subvolumes by normal users, we 
can still provide the option of a daemon.  On something like a single 
user system, there is not much advantage to having some complex access 
control in place.  I'm not saying we should put any kind of complex ACL
into the kernel, but that we should at least have some all-or-nothing
switch in the mount options that controls whether unprivileged users can
perform subvolume operations.  Ideally it would be just one option, but
unfortunately we provided somewhat nonsensical initial semantics that we 
now have to continue to support.

Looking at the qgroup stuff specifically though, snapshots not 
inheriting their parent's qgroup seems really odd to me.  There are two 
things that come to mind, and I'd love to see both personally:
1. Provide an option to have snapshots inherit their parent's qgroup. 
This would eliminate the 'surprise' that initially sparked this 
discussion, and deal with tools that aren't qgroup aware.
2. Provide an option to specify the default qgroup for new subvolumes, 
instead of them being created with no qgroup.  This would cover the rest 
of things, and provide some further usefulness (for example, you could 
leave this default qgroup empty most of the time and have monitoring 
software alert you if it suddenly had data in it to detect stuff 
creating subvolumes behind your back).
Both sound more to me like they should probably be specified somewhere 
in the filesystem itself, not the mount options.
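
To make the two options concrete, here is a toy model, not btrfs code.
The names ToyFs, inherit_on_snapshot and default_qgroup are invented for
illustration and do not correspond to any existing btrfs setting, and
the model ignores shared extents entirely: it only charges new writes to
whatever qgroups a subvolume belongs to.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

MiB = 1024 * 1024


class QuotaExceeded(Exception):
    pass


@dataclass
class Qgroup:
    limit: Optional[int] = None  # None means unlimited
    used: int = 0


@dataclass
class ToyFs:
    inherit_on_snapshot: bool = False     # option 1 (hypothetical switch)
    default_qgroup: Optional[str] = None  # option 2 (hypothetical setting)
    qgroups: Dict[str, Qgroup] = field(default_factory=dict)
    members: Dict[str, Set[str]] = field(default_factory=dict)

    def _attach(self, subvol: str, qgroup: Optional[str]) -> None:
        if qgroup is not None:
            self.qgroups.setdefault(qgroup, Qgroup())
            self.members[subvol].add(qgroup)

    def create_subvol(self, name: str, qgroup: Optional[str] = None) -> None:
        # An explicit qgroup wins; otherwise fall back to the default, if any.
        self.members[name] = set()
        self._attach(name, qgroup if qgroup is not None else self.default_qgroup)

    def snapshot(self, src: str, dst: str) -> None:
        if self.inherit_on_snapshot and self.members[src]:
            self.members[dst] = set(self.members[src])
        else:
            self.create_subvol(dst)

    def write(self, subvol: str, nbytes: int) -> None:
        groups = [self.qgroups[q] for q in self.members[subvol]]
        # Check every limit first so a failed write charges nothing.
        for g in groups:
            if g.limit is not None and g.used + nbytes > g.limit:
                raise QuotaExceeded(subvol)
        for g in groups:
            g.used += nbytes


if __name__ == "__main__":
    for inherit in (False, True):
        fs = ToyFs(inherit_on_snapshot=inherit)
        fs.qgroups["1/1"] = Qgroup(limit=5 * MiB)
        fs.create_subvol("A", qgroup="1/1")
        fs.snapshot("A", "B")
        try:
            fs.write("B", 10 * MiB)
            print(f"inherit={inherit}: write to B succeeded")
        except QuotaExceeded:
            print(f"inherit={inherit}: write to B hit the quota")
```

With option 2, a ToyFs(default_qgroup="1/100") would charge any
subvolume created without an explicit qgroup to 1/100, so nonzero usage
there is the signal a monitoring tool could alert on.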
>
> But that would be a big change in behavior, so I'm afraid this won't
> happen.
I'm definitely with you on having the ability to restrict subvolume 
operations to privileged users, I just don't feel that the suggested 
methodology is enough by itself.
