Re: Security implications of btrfs receive?

linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: dsterba@suse.cz, Christoph Anton Mitterer <calestyo@scientia.net>,
	linux-btrfs@vger.kernel.org
Subject: Re: Security implications of btrfs receive?
Date: Fri, 9 Sep 2016 13:21:27 -0400	[thread overview]
Message-ID: <21041efe-6e01-c277-5f74-a3a7568961b1@gmail.com> (raw)
In-Reply-To: <20160909163323.GZ16983@twin.jikos.cz>

On 2016-09-09 12:33, David Sterba wrote:
> On Wed, Sep 07, 2016 at 03:08:18PM -0400, Austin S. Hemmelgarn wrote:
>> On 2016-09-07 14:07, Christoph Anton Mitterer wrote:
>>> On Wed, 2016-09-07 at 11:06 -0400, Austin S. Hemmelgarn wrote:
>>>> This is an issue with any filesystem,
>>> Not really... any other filesystem I'd know (not sure about ZFS) keeps
>>> working when there are UUID collisions... or at least it won't cause
>>> arbitrary corruptions, which then in the end may even be used for such
>>> attacks as described in that thread.
>>>
>>> Even other multi-device containers (LVM, MD) don't at least corrupt
>>> your data like it allegedly can happen with btrfs.
>>>
>>>
>>>
>>>>  it is just a bigger issue with
>>>> BTRFS.
>>> No corruption vs. possible arbitrary data corruption and leakage seems
>>> to be more than "just bigger".
>>> I'd call it unacceptable for a production system.
>> So is refusing to boot.  In most cases, downtime is just as bad as data
>> corruption.
>>>
>>>
>>>>   Take a system using ext4, or XFS, or almost any other Linux
>>>> filesystem, running almost any major distro, create a minimum sized
>>>> partition on the disk for that filesystem type, and create a
>>>> filesystem
>>>> there with the same UUID as the root filesystem.  Next time that
>>>> system
>>>> reboots, things will usually blow up (XFS will refuse to mount, ext4
>>>> and
>>>> most other filesystems will work sometimes and not others).
>>> Well but that's something completely different.
>>> It would be perfectly fine if, in case of an UUID collision, the system
>>> simply denies mounting/assembly (actually that's one of the solutions
>>> others and I've proposed in the aforementioned thread).
>>>
>>> But it's not acceptable if the system does *something* in such
>>> situation,... or if such fs/container is already mounted/active and
>>> another device with colliding UUID appears *then*, it's neither
>>> acceptable that the already active fs/container wouldn't continue to
>>> work properly.
>>>
>>> And that seems to my experience just how e.g. LVM handles this.
>>>
>>> "Not booting" is not really an issue in terms of data corruption.
>>>
>>>
>>> At least I'm pretty sure to remember that one of the main developers
>>> (was it Qu?) acknowledged these issues (both in terms of accidental
>>> corruption and security wise) and that he was glad that these issues
>>> were brought up and that they should be solved.
>>>
>>>
>>>> It hasn't, because there's not any way it can be completely
>>>> fixed.
>>> Why not? As it was laid out by myself and others, the basic solution
>>> would be:
>>> - Refuse any mounting in case UUID collisions are detected.
>>> - Generally don't do any auto-rebuilds or e.g. RAID assemblies unless
>>>   specifically allowed/configured by the user (as this might also be
>>>   used to extract data from a system).
>>> - If there are any collisions (either by mounting or by processes like
>>>   rebuilds/added devices) require the user to specify uniquely which
>>>   device he actually wants (e.g. by path).
>>> - And in case a filesystem is already mounted and UUID collisions
>>>   happens then (e.g. a dd clone get's plugged in), continue to use the
>>>   already active device... just as e.g. LVM does.
>>>
>>>>   This
>>>> particular case is an excellent example of why it's so hard to
>>>> fix.  To
>>>> close this particular hole, BTRFS itself would have to become aware
>>>> of
>>>> whether whoever is running an ioctl is running in a chroot or not,
>>>> which
>>>> is non-trivial to determine to begin with, and even harder when you
>>>> factor in the fact that chroot() is a VFS level thing, not a
>>>> underlying
>>>> filesystem thing, while ioctls are much lower level.
>>> Isn't it simply enough to:
>>> - know which blockdevices with a btrfs and with which UUIDs there are
>>> - let userland tools deny any mount/assembly/etc. actions in case of
>>>   collisions
>>> - do the actual addressing of devices via the device path (so that
>>>   proper devices will continued to be if the fs was already mounted
>>>   when a collision occurs)
>>> ?
>> That's not the issue being discussed in this case.  The ultimate issue
>> is of course the same (the flawed assumption that some arbitrary bytes
>> will be globally unique), but the particular resultant issues are
>> different.  The problem being discussed is that receive doesn't verify
>> that subvolume UUID's it has been told to clone from are within the are
>> it's been told to work.  This can cause an information leak, but not
>> data corruption, and is actually an issue with the clone ioctl in
>> general.  Graham actually proposed a good solution to this particular
>> problem (require an open fd to a source file containing the blocks to be
>> passed into the ioctl in addition to everything else), but it's still
>> orthogonal to the symptoms you're talking about.
>>>
>>> And further, as I've said, security wise auto-assembly of multi-device
>>> seems always prone to attacks at least in certain use cases, so for the
>>> security conscious people:
>>> - Don't do auto-assembly/rebuild/etc. based on scanning for UUID
>>> - Let the user choose to do this manually via specifying the devices
>>>   (via e.g. path).
>>>   So a user could say something like
>>>   mount -t btrfs -o device=/dev/disk/by-path/pci-0000\:00\:1f.2-ata-1,device=/dev/disk/by-path/pci-0000\:00\:2f.2-ata-2 /foo
>>>   in order to be sure that just these devices would be *tried* to be
>>>   used for the RAID1 btrfs, and not the one an attacker might have
>>>   plugged into the USB.
>>>
>>>
>>>> That said, nobody's really done any work on mitigating the issues
>>>> either, although David Sterba has commented about having interest in
>>>> fixing issues caused by crafted filesystem images, so hopefully
>>>> things will start moving more in the direction of proper security.
>>> Well that's good do hear... it's IMO one of the bigger potential issues
>>> in btrfs, next to the ongoing stability problems[0] and still not
>>> really working RAID.
>>>
>>> Anyone working on this should probably have a look at the thread I've
>>> mentioned, cause there are really several tricky ways one could exploit
>>> this... to me especially any auto-(i.e. based on scanning for UUIDs)-
>>> assembly/rebuilding and that like seemed to pose quite a big surface.
>>>
>> I think I covered it already in the last thread on this, but the best
>> way I see to fix the whole auto-assembly issue is:
>> 1. Stop the damn auto-scanning of new devices on hot-plug.  The scanning
>> should be done on mount or invoking something like btrfs dev scan, not
>> on hot-plug.  This is the biggest current issue, and is in theory the
>> easiest thing to fix.  The problem here is that it's udev sources we
>> need to change, not our own.
>
> I see this as difficult to fix. Even if udev would be configured not to
> do the scanning, we'd probably end up with systems that have kernel
> expecting userspace scanning and udev not doing that. Possibly fixable.
I see the auto-discovery thing as less of an issue if we move to having 
the mount helper do discovery.  ISTR that if you pass in device= 
options, you don't need the scan from userspace provided that the 
options point to a complete valid set of devices for the filesystem.  If 
that's the case, then we just need to have the mount helper do discovery 
and pass the options regardless (unless manually specified), which would 
work even on older kernels which expect userspace to do discovery.  The 
problem that I see with this though is that the device= option adds 
devices to a list to try to use for the FS, while for this to work 
securely, it would have to be changed to be an exact set of devices to 
use for the FS.
>
>> 2. Get rid of the tracking in the kernel.  If a filesystem isn't mounted
>> or requested to be mounted, then the kernel has no business worrying
>> about what what devices it's on.  If the filesystem is mounted, then the
>> only way to associate new devices should be from userspace.
>
>> 3. When mounting, the mount helper should be doing the checking to
>> verify that the UUID's and everything else are correct.  Ideally, the
>> mount(2) call should require a list of devices to use, and mount should
>> be doing the discovery.
>
> So 2 and 3 would be a job for the mount helper. We'd pondered that idea
> in the past but I'm not sure why we did not want it in the end. (The
> scanning part). What we could possibly do, let the mount helper verify
> that there are no duplicate UUIDs at least, and refuse mount.
The primary argument that comes to mind is that having a mount helper 
usually requires an initramfs to use that filesystem as a root 
filesystem.  This is generally necessary on most end-user systems anyway 
right now for other reasons though, so I see it as much less of a valid 
argument (and if the helper just composes some extra mount options if 
they aren't present, then we don't technically need it for root, as 
those options could just be specified on the kernel command line).

I'm not sold on doing verification in the mount helper without also 
doing discovery there.  Having the helper do verification when doing 
discovery is a good thing because it lets us give a much more useful 
error message without having to point at kernel logs, but I don't 
personally feel that that's sufficient reason to add a mount helper, as 
we should ideally be having the kernel verify things, as not everything 
goes through the mount helper.
>
>> This is at odds with how systemd handles BTRFS
>> mounts, but they're being stupid with that too (the only way to tell for
>> certain if a FS will mount is to try to mount it, if the mount(2) call
>> succeeds, then the filesystem was ready, regardless of whether or not
>> userspace thinks the device is).
>> 4. The kernel should be doing a better job of validating filesystems.
>> It should be checking that all the devices agree on how many devices
>> there should be, as well as checking that they all have correct UUID's.
>> This is technically not necessary if item 3 is implemented, but is still
>> good practice from a hardening perspective.
>
> Agreed.
>
In this case, I think what really needs to happen is that we just 
validate all the information we can parse out of the SB matches between 
all devices, or at a minimum:
1. Filesystem UUID (we already do this I think).
2. Number of devices in the FS.
3. UUID's of all the devices in the FS.
As well as checking that there are no duplicate UUID's.

Now that I've had a bit more time to think on this from a sysadmin's 
perspective, I've got a list of what behaviors I'd personally expect 
assuming minimal knowledge of BTRFS, ZFS, or any similar filesystem:
1. `mount <device> <mountpoint>` works for any device in the FS without 
needing any user intervention.
2. Mounting root on boot may require special handling or an initramfs to 
work.
3. The kernel will refuse to mount filesystems that are obviously broken 
or are in states that may cause data loss without explicitly being asked 
to do so by the user.
4. Adding and removing devices from the system will not change a 
filesystem's configuration without user intervention.
6. `mount LABEL=<label> <mountpoint>` and `mount UUID=<uuid> 
<mountpoint>` work for any functional filesystem without needing special 
handling by the user.

next prev parent reply	other threads:[~2016-09-09 17:21 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-05  9:59 Security implications of btrfs receive? Graham Cobb
2016-09-05 14:33 ` Duncan
2016-09-06 12:15 ` Austin S. Hemmelgarn
2016-09-06 17:20   ` Graham Cobb
2016-09-07 11:58     ` Austin S. Hemmelgarn
2016-09-07 14:44       ` Christoph Anton Mitterer
2016-09-07 14:55         ` Austin S. Hemmelgarn
2016-09-07 15:20       ` Austin S. Hemmelgarn
2016-09-07 16:10         ` Graham Cobb
2016-09-07 17:33           ` Austin S. Hemmelgarn
2016-09-09 16:18       ` David Sterba
2016-09-09 16:58         ` Austin S. Hemmelgarn
2016-09-07 14:41     ` Christoph Anton Mitterer
2016-09-07 15:06       ` Austin S. Hemmelgarn
2016-09-07 16:27         ` Graham Cobb
2016-09-07 18:07         ` Christoph Anton Mitterer
2016-09-07 19:08           ` Austin S. Hemmelgarn
2016-09-07 19:34             ` Chris Murphy
2016-09-08 11:48               ` Austin S. Hemmelgarn
2016-09-09 18:58                 ` Chris Murphy
2016-09-10 19:27                   ` Chris Murphy
2016-09-12 11:24                   ` Austin S. Hemmelgarn
2016-09-12 20:25                     ` Chris Murphy
2016-09-13 11:46                       ` Austin S. Hemmelgarn
2016-09-09 16:33             ` David Sterba
2016-09-09 17:21               ` Austin S. Hemmelgarn [this message]
2016-09-07 20:29           ` Zygo Blaxell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=21041efe-6e01-c277-5f74-a3a7568961b1@gmail.com \
    --to=ahferroin7@gmail.com \
    --cc=calestyo@scientia.net \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).