From: Ingo Molnar <mingo@elte.hu>
To: James Morris <jmorris@namei.org>
Cc: linux-mips@linux-mips.org, linux-sh@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Heiko Carstens <heiko.carstens@de.ibm.com>,
Oleg Nesterov <oleg@redhat.com>,
David Howells <dhowells@redhat.com>,
Paul Mackerras <paulus@samba.org>, Eric Paris <eparis@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
sparclinux@vger.kernel.org, Jiri Slaby <jslaby@suse.cz>,
linux-s390@vger.kernel.org, Russell King <linux@arm.linux.org.uk>,
x86@kernel.org, Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@redhat.com>,
linux-arm-kernel@lists.infradead.org, kees.cook@canonical.com,
"Serge E. Hallyn" <serge@hallyn.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
microblaze-uclinux@itee.uq.edu.au,
Steven Rostedt <rostedt@goodmis.org>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Roland McGrath <roland@redhat.com>, Michal Marek <mmarek@suse.cz>,
Michal Simek <monstr@monstr.eu>, Will Drewry <wad@chromium.org>,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
Ralf Baechle <ralf@linux-mips.org>,
Paul Mundt <lethal@linux-sh.org>, Tejun Heo <tj@kernel.org>,
linux390@de.ibm.com, Andrew Morton <akpm@linux-foundation.org>,
agl@chromium.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
Date: Fri, 13 May 2011 14:10:34 +0200
Message-ID: <20110513121034.GG21022@elte.hu>
In-Reply-To: <alpine.LRH.2.00.1105131018040.3047@tundra.namei.org>
* James Morris <jmorris@namei.org> wrote:
> On Thu, 12 May 2011, Ingo Molnar wrote:
> > Funnily enough, back then you wrote this:
> >
> > " I'm concerned that we're seeing yet another security scheme being designed on
> > the fly, without a well-formed threat model, and without taking into account
> > lessons learned from the seemingly endless parade of similar, failed schemes. "
> >
> > so when and how did your opinion of this scheme turn from it being an
> > "endless parade of failed schemes" to it being a "well-defined and readily
> > understandable feature"? :-)
>
> When it was defined in a way which limited its purpose to reducing the attack
> surface of the syscall interface.
Let me outline a simple example of a new filter expression based security
feature that could be implemented outside the narrow system call boundary you
find acceptable, and please tell me what is bad about it.
Say i'm a user-space sandbox developer who wants to enforce that sandboxed code
should only be allowed to open files in /home/sandbox/, /lib/ and /usr/lib/.
It is a simple and sensible security feature, agreed? It allows most code to
run well and link to countless libraries - but no access to other files is
allowed.
I would also like my sandbox app to be able to install this policy without
having to be root. I do not want the sandbox app to have permission to create
labels on /lib and /usr/lib and what not.
Firstly, using the filter code i deny the various link creation syscalls so
that sandboxed code cannot escape, for example by creating a symlink that
points outside the permitted VFS namespace. (Note: we opt in to syscalls, so
that new syscalls added by new kernels are denied by default. The current
symlink creation syscalls are not opted in to.)
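The opt-in approach can be illustrated with a small userspace model. This is
only a sketch under stated assumptions: the syscall numbers, the table size,
the `-EPERM` result and all names below are hypothetical and are not taken
from the seccomp filter patches. It shows that anything not explicitly opted
in, including numbers the table has never heard of, ends up denied.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical syscall numbers, for illustration only. */
enum {
	DEMO_NR_READ    = 0,
	DEMO_NR_WRITE   = 1,
	DEMO_NR_OPEN    = 2,
	DEMO_NR_SYMLINK = 88,
	DEMO_NR_MAX     = 512,
};

/* One flag per syscall number; a zeroed table allows nothing. */
struct syscall_allowlist {
	bool allowed[DEMO_NR_MAX];
};

/* Opt a single syscall in. Out-of-range numbers are ignored. */
void allowlist_opt_in(struct syscall_allowlist *al, long nr)
{
	if (nr >= 0 && nr < DEMO_NR_MAX)
		al->allowed[nr] = true;
}

/* Returns 0 if the syscall was opted in, -EPERM otherwise. */
int allowlist_check(const struct syscall_allowlist *al, long nr)
{
	/* Numbers outside the table (e.g. added by a newer kernel) are denied. */
	if (nr < 0 || nr >= DEMO_NR_MAX)
		return -EPERM;

	return al->allowed[nr] ? 0 : -EPERM;
}
```

Starting from a zero-initialized table and opting in read, write and open
leaves the symlink entry, and every unknown number, denied without having to
list them.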
But the next step, actually checking filenames, poses a big hurdle: i cannot
implement the filename checking at the sys_open() syscall level in a secure
way: because the pathname is passed to sys_open() by pointer, and if i check it
at the generic sys_open() syscall level, another thread in the sandbox might
modify the underlying filename *after* i've checked it.
But if i had a VFS event at the fs/namei.c::getname() level, i would have
access to a central point where the VFS string becomes stable to the kernel and
can be checked (and denied if necessary).
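The race described above, and why checking a stable copy avoids it, can be
modelled in plain userspace C. This is a sketch only: the function names, the
`racer` callback (which stands in for a second thread rewriting the buffer)
and the single-prefix policy are all hypothetical, and the private copy only
plays the role that getname() plays in the kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define STABLE_NAME_LEN 256

/* Hypothetical stand-in for a pathname policy: only /lib/ is allowed. */
static int name_allowed(const char *name)
{
	return strncmp(name, "/lib/", 5) == 0;
}

/* Models another thread rewriting the caller's buffer (needs >= 12 bytes). */
void swap_to_etc_passwd(char *buf)
{
	strcpy(buf, "/etc/passwd");
}

/*
 * Racy pattern: the check and the later use both read the caller-owned
 * buffer, so a change in between goes unnoticed. `opened` receives the name
 * that would actually be opened.
 */
int open_check_in_place(char *user_name, void (*racer)(char *),
			char *opened, size_t opened_len)
{
	if (!name_allowed(user_name))
		return -EACCES;
	if (racer)
		racer(user_name);	/* buffer changes after the check */
	snprintf(opened, opened_len, "%s", user_name);
	return 0;
}

/*
 * Stable pattern: copy the name once into a private buffer, then check and
 * use only that copy. Later changes to the caller's buffer have no effect.
 */
int open_check_stable_copy(char *user_name, void (*racer)(char *),
			   char *opened, size_t opened_len)
{
	char stable[STABLE_NAME_LEN];

	if (strlen(user_name) >= sizeof(stable))
		return -ENAMETOOLONG;
	strcpy(stable, user_name);

	if (!name_allowed(stable))
		return -EACCES;
	if (racer)
		racer(user_name);	/* too late: the copy is already fixed */
	snprintf(opened, opened_len, "%s", stable);
	return 0;
}
```

With a buffer that starts out as "/lib/libc.so", the in-place variant passes
the check and then "opens" /etc/passwd, while the stable-copy variant opens
exactly the name it checked.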
As a sidenote: not surprisingly, the audit subsystem already has an event
callback there:
audit_getname(result);
Unfortunately this audit callback cannot be used for my purposes, because the
event is single-purpose for auditd and because it allows no feedback (no
deny/accept discretion for the security policy).
But if i had this simple event there:
err = event_vfs_getname(result);
I could implement this new filename based sandboxing policy, using a filter
like this installed on the vfs::getname event and inherited by all sandboxed
tasks (which cannot uninstall the filter, obviously):
"
	if (strstr(name, ".."))
		return -EACCES;

	/* strncmp() is non-zero on mismatch: deny if no allowed prefix matches */
	if (strncmp(name, "/home/sandbox/", 14) &&
	    strncmp(name, "/lib/", 5) &&
	    strncmp(name, "/usr/lib/", 9))
		return -EACCES;
"
#
# Note1: Obviously the filter engine would be extended to allow such simple
#        string match functions.
#
# Note2: ".." is disallowed so that sandboxed code cannot escape the
#        restrictions using "/..".
#
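For reference, the intended policy (reject any "..", then allow only names
under the three permitted prefixes) can be written as an ordinary userspace C
function. This is a minimal sketch, not the filter engine syntax: the names
`has_prefix` and `sandbox_check_name` are made up for illustration, and the
convention of returning 0 for "allow" and `-EACCES` for "deny" is an
assumption.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: non-zero if `name` begins with `prefix`. */
static int has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Returns 0 if the name is permitted, -EACCES otherwise. */
int sandbox_check_name(const char *name)
{
	/* Reject any ".." so a path cannot climb out of an allowed prefix. */
	if (strstr(name, ".."))
		return -EACCES;

	/* Allow only names under one of the permitted directories. */
	if (has_prefix(name, "/home/sandbox/") ||
	    has_prefix(name, "/lib/") ||
	    has_prefix(name, "/usr/lib/"))
		return 0;

	return -EACCES;
}
```

Note that the substring test is deliberately blunt: it also rejects harmless
names that merely contain "..", such as "/lib/a..b", and relative names are
denied because they match none of the prefixes.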
This kind of flexible and dynamic sandboxing would allow a wide range of file
ops within the sandbox, while still isolating it from files not included in the
specified VFS namespace.
( Note that there are tons of other examples as well, for useful security features
that are best done using events outside the syscall boundary. )
The security event filters code tied to seccomp and syscalls at the moment is
useful, but limited in its future potential.
So i argue that it should go slightly further and should become:
- unprivileged: application-definable, allowing the embedding of security
policy in *apps* as well, not just the system
- flexible: can be added/removed at runtime without privileges, and cheaply so
- transparent: does not impact executing code that meets the policy
- nestable: it is inherited by child tasks and is fundamentally stackable,
multiple policies will have the combined effect and they
are transparent to each other. So if a child task within a
sandbox adds *more* checks then those add to the already
existing set of checks. We only narrow permissions, never
extend them.
- generic: allowing observation and (safe) control of security relevant
parameters not just at the system call boundary but at other
relevant places of kernel execution as well: which
points/callbacks could also be used for other types of event
extraction such as perf. It could even be shared with audit ...
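The "nestable" property in the list above can be sketched as a stack of
checks that a child inherits and may only extend. Everything here is a
hypothetical userspace model (the struct, the function names, the fixed
`MAX_FILTERS` limit and the two example filters are assumptions, not part of
any posted patch): a name is allowed only if every filter on the stack allows
it, and there is intentionally no operation to remove a filter.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* A filter returns 0 to allow a name, or a negative errno to deny it. */
typedef int (*name_filter_fn)(const char *name);

struct filter_stack {
	name_filter_fn filters[MAX_FILTERS];
	size_t count;
};

/* Add one more check. There is no pop, so a stack can only get narrower. */
int filter_stack_push(struct filter_stack *fs, name_filter_fn fn)
{
	if (fs->count >= MAX_FILTERS)
		return -ENOSPC;
	fs->filters[fs->count++] = fn;
	return 0;
}

/* A child starts with a copy of every check its parent has. */
struct filter_stack filter_stack_inherit(const struct filter_stack *parent)
{
	return *parent;
}

/* Combined effect: the first filter that denies wins. */
int filter_stack_check(const struct filter_stack *fs, const char *name)
{
	for (size_t i = 0; i < fs->count; i++) {
		int err = fs->filters[i](name);

		if (err)
			return err;
	}
	return 0;
}

/* Example filters, for illustration only. */
int under_home_sandbox(const char *name)
{
	return strncmp(name, "/home/sandbox/", 14) ? -EACCES : 0;
}

int under_home_sandbox_tmp(const char *name)
{
	return strncmp(name, "/home/sandbox/tmp/", 18) ? -EACCES : 0;
}
```

If a parent installs `under_home_sandbox` and a child then pushes
`under_home_sandbox_tmp` onto its inherited copy, the child can reach only
/home/sandbox/tmp/, while the parent's own permissions are unchanged.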
I argue that this is the LSM and audit subsystems designed right: in the long
run it could allow everything that LSM does at the moment - and so much more
...
And you argue that allowing this would be bad: if it were extended like that,
you'd consider it a failed scheme? Why?
Thanks,
Ingo