From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qg0-f52.google.com ([209.85.192.52]:35120 "EHLO mail-qg0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752734AbcCJRE0 (ORCPT ); Thu, 10 Mar 2016 12:04:26 -0500
Received: by mail-qg0-f52.google.com with SMTP id y89so75610888qge.2 for ; Thu, 10 Mar 2016 09:04:25 -0800 (PST)
Subject: Re: btrfs and containers
To: Duncan <1i5t5.duncan@cox.net>, linux-btrfs@vger.kernel.org
References: <20160308195857.GB26981@localhost.localdomain> <56E013E8.9080401@gmail.com>
From: "Austin S. Hemmelgarn"
Message-ID: <56E1A901.6050207@gmail.com>
Date: Thu, 10 Mar 2016 12:04:01 -0500
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 2016-03-09 21:55, Duncan wrote:
> Austin S. Hemmelgarn posted on Wed, 09 Mar 2016 07:15:36 -0500 as
> excerpted:
>
>> On 2016-03-08 16:28, Chris Murphy wrote:
>
>>> Yes, it's a bit peculiar I can create subvolumes and snapshot them, but
>>> can't 'btrfs sub list/show'
>>>
>>> It's an open question why the user needs a subvolume, but I'm not
>>> thinking of a human user necessarily but rather some service, maybe
>>> it's httpd. Or maybe with the xdg-app stuff the Gnome folks are working
>>> on it makes sense to encapsulate applications and their updates in
>>> their own subvolume. *shrug* I'm open to the idea that the use case
>>> needs to be more compelling and detailed in order to get the
>>> implementation right.
>>>
>> It's probably worth tossing out there that I use them on a regular basis
>> as a normal user (not root or some service) for:
>> 1. Local copies of VCS repositories.
>> 2. Build directories.
>> 3. Staging areas for a variety of things.
>> 4. Specifically isolating certain parts of my home directory from
>> backups.
>>
>> 1-3 are mostly because of the fact that deleting a subvolume is insanely
>> fast compared to recursive deletion of a directory, although 4 is
>> somewhat significant for those as well.
>
> For #2 and possibly #3, depending on what's being staged and why, tmpfs
> works well, and deleting should be even faster (AFAIK, subvolume deletion
> returns immediately but the work continues in the background, so if
> you're running other IO-bound jobs they'll still be affected even tho the
> subvolume deletion command has returned... if it's all in memory as is
> tmpfs, that problem's eliminated too), tho of course you need enough
> memory so that tmpfs doesn't trigger swap-thrashing.

Yeah, most of the time when I use subvolumes for items 2 or 3, it's
either for stuff that I specifically want persistent across reboots (for
example, the build directory I keep in /usr/src for the kernel, or
staging directories for audio recordings), or for things that are big
enough that I really want to avoid the memory consumption of working on
tmpfs (as of right now, the only package I have installed on any of my
systems that fits this is LLVM/clang, though I used to do this for some
other software like LibreOffice, webkit-gtk, and icedtea as well).
>
> But #1 and #4 of course don't work as well on tmpfs as you'll likely want
> them around longer, and all four cases definitely make use of the
> fact that nested subvolumes wall off snapshotting and thus btrfs send,
> for backup purposes. And of course if you're on a limited-memory machine
> and thus can't easily use tmpfs for building and other staging, and don't
> need to care about the ongoing background IO, using subvolumes for #2 and
> 3 remains useful, as well.
>
>> In general I can see them being useful for any number of things from a
>> service perspective, although I feel that snapshots are likely more
>> useful there (the ability to atomically save the state of a set of files
>> is extremely useful for a lot of things).
>
> I consider the current situation somewhat of a security (DoS) issue,
> since users (or runaway scripts or malware) can create unlimited
> subvolumes as an ordinary user, with that user then not being able to
> delete them, requiring admin intervention to do so. Of course as long as
> it's a single-human-user with an admin-rights alter-ego login, it's not
> /that/ much of a security issue, but I could see it being one for human
> users who do not have that admin-rights alter-ego login. So were I to be
> running in such a situation, I'd probably use the mount option to let the
> users delete their own subvolumes, unless of course that opens up other
> security issues I'm not aware of.
>
> IMO before btrfs can really be considered stable, this possible DoS needs
> resolved by making the list/delete set the exact same as the create set,
> either by giving users some way to deal with (only) their own subvolumes
> just as they can their own directories, or by reserving subvolume
> creation to superuser, because that's what's needed for listing and
> deletion. Because if not, I fear someone's going to take advantage of it
> in some way, perhaps, as with many DoS vulns, using it to deny critical
> resources as a way to simplify some other more critical attack, and it'll
> be in the headlines as an attack that worked and a zero-day that still
> works.

The part that makes this tricky is that the list ioctl can be considered
a potential information leak (as evidenced by the issue that started this
thread), so IMHO what really needs to happen is for the mount option to
be 'user_subvolume_ops' and to control all three operations: create,
list, and delete (or better yet, do something with ACLs in the btrfs
xattr namespace to control it on a per-subvolume basis).
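
To make the proposed semantics concrete, here is a toy Python model of
such a permission check. It is only a sketch, not kernel code and not a
description of what btrfs does today. The option name
'user_subvolume_ops' is the hypothetical one suggested above, and the
Op enum, the Subvolume class, the allowed() function, and the rule that
a subvolume "belongs" to the UID that created it are all assumptions
made purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Op(Enum):
    CREATE = "create"
    LIST = "list"
    DELETE = "delete"


@dataclass(frozen=True)
class Subvolume:
    # Assumption: each subvolume records the UID that created it.
    owner_uid: int


def allowed(op: Op, caller_uid: int, user_subvolume_ops: bool,
            subvol: Optional[Subvolume] = None) -> bool:
    """Toy policy check for the hypothetical 'user_subvolume_ops' option."""
    # The superuser may always create, list, and delete.
    if caller_uid == 0:
        return True
    # Option off: all three operations are reserved to the superuser,
    # so the create set and the list/delete set are identical.
    if not user_subvolume_ops:
        return False
    # Option on: any user may create a new subvolume...
    if op is Op.CREATE:
        return True
    # ...but may list or delete only subvolumes they own.
    return subvol is not None and subvol.owner_uid == caller_uid


if __name__ == "__main__":
    mine = Subvolume(owner_uid=1000)
    for op in Op:
        print(op.value, allowed(op, 1000, True, mine))
```

The point of the single switch is that an unprivileged user is never
able to create something they cannot later see or remove, and never
able to list subvolumes that are not theirs. A per-subvolume ACL scheme
would replace the owner_uid comparison with a lookup on the subvolume
itself.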