linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	linux-kernel@vger.kernel.org, Hugh Dickins <hughd@google.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	kvm@vger.kernel.org, Jerome Glisse <jglisse@redhat.com>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Martin Cracauer <cracauer@cons.org>,
	Denis Plotnikov <dplotnikov@virtuozzo.com>,
	linux-mm@kvack.org, Marty McFadden <mcfadden8@llnl.gov>,
	Maya Gokhale <gokhale2@llnl.gov>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Kees Cook <keescook@chromium.org>, Mel Gorman <mgorman@suse.de>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	linux-fsdevel@vger.kernel.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 0/3] userfaultfd: allow to forbid unprivileged users
Date: Wed, 13 Mar 2019 19:44:58 -0400	[thread overview]
Message-ID: <20190313234458.GJ25147@redhat.com> (raw)
In-Reply-To: <1934896481.7779933.1552504348591.JavaMail.zimbra@redhat.com>

Hi Paolo,

On Wed, Mar 13, 2019 at 03:12:28PM -0400, Paolo Bonzini wrote:
> 
> > On Wed, Mar 13, 2019 at 09:22:31AM +0100, Paolo Bonzini wrote:
> > Unless somebody suggests a consistent way to make hugetlbfs "just
> > work" (like we could achieve clean with CRIU and KVM), I think Oracle
> > will need a one liner change in the Oracle setup to echo into that
> > file in addition of running the hugetlbfs mount.
> 
> Hi Andrea, can you explain more in detail the risks of enabling
> userfaultfd for unprivileged users?

There's no more risk than in allowing mremap (direct memory overwrite
after underflow) or O_DIRECT (dirtycow) for unprivileged users.

If there was risk in allowing userfaultfd for unprivileged users,
CAP_SYS_PTRACE would be mandatory and there wouldn't be an option to
allow userfaultfd to all unprivileged users.

This is just an hardening policy in kernel (for those that don't run
everything under seccomp) that may even be removed later.

Unlike mremap and O_DIRECT, because we've only an handful of
(important) applications using userfaultfd so far, we can do like the
bpf syscall:

SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
{
	union bpf_attr attr = {};
	int err;

	if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
		return -EPERM;

Except we picked CAP_SYS_PTRACE because CRIU already has to run with
such capability for other reasons.

So this is intended as the "sysctl_unprivileged_bpf_disabled" trick
applied to the bpf syscall, also applied to the userfaultfd syscall,
nothing more nothing less.

Then I thought we can add a tristate so an open of /dev/kvm would also
allow the syscall to make things more user friendly because
unprivileged containers ideally should have writable mounts done with
nodev and no matter the privilege they shouldn't ever get an hold on
the KVM driver (and those who do, like kubevirt, will then just work).

There has been one complaint because userfaultfd can also be used to
facilitate exploiting use after free bugs in other kernel code:

https://cyseclabs.com/blog/linux-kernel-heap-spray

This isn't a bug in userfaultfd, the only bug there is some kernel
code doing an use after free in a reproducible place, userfaultfd only
allows to stop copy-user where the exploits like copy-user to be
stopped.

This isn't particularly concerning, because you can achieve the same
objective with FUSE. In fact even if you set CONFIG_USERFAULTFD=n in
the kernel config and CONFIG_FUSE_FS=n, a security bug like that can
still be exploited eventually. It's just less reproducible if you
can't stop copy-user.

Restricting userfaultfd to privileged processes, won't make such
kernel bug become harmless, it'll just require more attempts to
exploit as far as I can tell. To put things in prospective, the
exploits for the most serious security bugs like mremap missing
underflow check, dirtycow or the missing stack_guard_gap would not
get any facilitation by userfaultfd.

I also thought we were randomizing all slab heaps by now to avoid
issues like above, I don't know if the randomized slab freelist oreder
CONFIG_SLAB_FREELIST_RANDOM and also the pointer with
CONFIG_SLAB_FREELIST_HARDENED precisely to avoid the exploits like
above. It's not like you can disable those two options even if you set
CONFIG_USERFAULTFD=n. I wonder if in that blog post the exploit was
set on a kernel with those two options enabled.

In any case not allowing non privileged processes to run the
userfaultfd syscall will provide some hardening feature also against
such kind of concern.

Thanks,
Andrea

  reply	other threads:[~2019-03-13 23:45 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-11  9:36 [PATCH 0/3] userfaultfd: allow to forbid unprivileged users Peter Xu
2019-03-11  9:36 ` [PATCH 1/3] userfaultfd/sysctl: introduce unprivileged_userfaultfd Peter Xu
2019-03-12  6:58   ` Mike Rapoport
2019-03-12 12:26     ` Peter Xu
2019-03-12 13:53       ` Mike Rapoport
2019-03-11  9:37 ` [PATCH 2/3] kvm/mm: introduce MMF_USERFAULTFD_ALLOW flag Peter Xu
2019-03-11  9:37 ` [PATCH 3/3] userfaultfd: apply unprivileged_userfaultfd check Peter Xu
2019-03-11  9:58   ` Peter Xu
2019-03-12  7:01 ` [PATCH 0/3] userfaultfd: allow to forbid unprivileged users Mike Rapoport
2019-03-12 12:29   ` Peter Xu
2019-03-12  7:49 ` Kirill A. Shutemov
2019-03-12 12:43   ` Peter Xu
2019-03-12 19:59 ` Mike Kravetz
2019-03-13  6:00   ` Peter Xu
2019-03-13  8:22     ` Paolo Bonzini
2019-03-13 18:52       ` Andrea Arcangeli
2019-03-13 19:12         ` Paolo Bonzini
2019-03-13 23:44           ` Andrea Arcangeli [this message]
2019-03-14 10:58             ` Paolo Bonzini
2019-03-14 15:23               ` Alexei Starovoitov
2019-03-14 16:00                 ` Paolo Bonzini
2019-03-14 16:16               ` Andrea Arcangeli
2019-03-15 16:09                 ` Kees Cook
2019-03-13 20:01         ` Mike Kravetz
2019-03-13 23:55           ` Andrea Arcangeli
2019-03-14  3:32             ` Mike Kravetz
2019-03-13 17:50     ` Mike Kravetz
2019-03-15  8:26       ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190313234458.GJ25147@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cracauer@cons.org \
    --cc=dgilbert@redhat.com \
    --cc=dplotnikov@virtuozzo.com \
    --cc=gokhale2@llnl.gov \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jglisse@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=maxime.coquelin@redhat.com \
    --cc=mcfadden8@llnl.gov \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mike.kravetz@oracle.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=xemul@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).