From: ebiederm@xmission.com (Eric W. Biederman)
To: Ted Ts'o <tytso@mit.edu>
Cc: Matt Helsley <matthltc@us.ibm.com>,
	Lennart Poettering <mzxreary@0pointer.de>,
	Kay Sievers <kay.sievers@vrfy.org>,
	linux-kernel@vger.kernel.org, harald@redhat.com, david@fubar.dk,
	greg@kroah.com, Linux Containers <containers@lists.osdl.org>,
	Linux Containers <lxc-devel@lists.sourceforge.net>,
	"Serge E. Hallyn" <serge@hallyn.com>,
	Daniel Lezcano <daniel.lezcano@free.fr>,
	Paul Menage <paul@paulmenage.org>
Subject: Re: Detecting if you are running in a container
Date: Mon, 10 Oct 2011 23:42:36 -0700
Message-ID: <m1ty7ghvvn.fsf@fess.ebiederm.org>
In-Reply-To: <20111011032523.GB7948@thunk.org> (Ted Ts'o's message of "Mon, 10 Oct 2011 23:25:23 -0400")

Ted Ts'o <tytso@mit.edu> writes:

> On Mon, Oct 10, 2011 at 07:05:30PM -0700, Matt Helsley wrote:
>> Yes, it does detract from the unique advantages of using a container.
>> However, I think the value here is not the efficiency of the initial
>> system configuration but the fact that it gives users a better place to
>> start.
>> 
>> Right now we're effectively asking users to start with non-working
>> and/or unfamiliar systems and repair them until they work.
>
> If things are not working with containers, I would submit to you that
> we're doing something wrong(tm). 

That is what this discussion is about.  What we are doing wrong(tm).
Mostly it is about the bits that have not yet been namespacified but
need to be.

I am totally in favor of not starting the entire world.  But just as
I find it convenient to loopback mount an ISO image to see what is on
a disk image, it would be handy to be able to just download a distro
image and play with it, without doing anything special.

We can pare things down further for the people who are running 1000
copies of apache, but not requiring detailed distro surgery before
starting up the binaries on a livecd sounds handy.

> Things should just work, except that
> processes in one container can't use more than their fair share (as
> dictated by policy) of memory, CPU, networking, and I/O bandwidth.

You have to be careful with the limiters.  The fundamental reason
why containers are more efficient than hardware virtualization is
that with containers we can overcommit resources, especially
memory.  I keep seeing implementations of resource limiters that want
to do things in a heavy-handed way that breaks resource overcommit.

> Something which is baked in my world view of containers (which I
> suspect is not shared by other people who are interested in using
> containers) is that given that kernel is shared, trying to use
> containers to provide better security isolation between mutually
> suspicious users is hopeless.  That is, it's pretty much impossible to
> prevent a user from finding one or more zero day local privilege
> escalation bugs that will allow a user to break root.  And at that
> point, they will be able to penetrate the kernel, and from there,
> break security of other processes.

You don't even have to get to security problems to have that concern.
There are enough crazy timing and side channel attacks.

I don't know what concern you have security wise, but the problem that
wants to be solved with user namespaces is something you hit much
earlier than when you worry about sharing a kernel between mutually
distrusting users.  Right now root inside a container is root
outside of the container, just like in a chroot jail.  Where this
becomes a problem is that people change things like
/proc/sys/kernel/print-fatal-signals expecting it to be a setting local
to their sandbox, when in fact it is the global setting, and things
start behaving weirdly for other users.  Running sysctl -a during bootup
has that problem in spades.
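
A minimal sketch of the shared-sysctl point, assuming a Linux system
with /proc mounted.  The helper name read_sysctl is hypothetical; only
the path comes from the text above.  It only reads the value, since
writing it would change the setting for the whole machine:

```python
from pathlib import Path

# Path taken from the message above; present only on Linux with /proc mounted.
SYSCTL = Path("/proc/sys/kernel/print-fatal-signals")


def read_sysctl(path):
    """Return the sysctl value as a stripped string, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Per the message above: without user namespaces, root in a container
    # sees and can change this same single value.
    print(SYSCTL, "=", read_sysctl(SYSCTL))
```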

With user namespaces what we get is that the global root user is not the
container root user, and we have been working our way through the
permission checks in the kernel to ensure we get them right in the
context of the user namespace.  This trivially means that the things
that we allow the global root user to do in /proc, /sys, and
the like simply won't be allowed as a container root user, which
makes doing something stupid and affecting other people much more
difficult.
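
To make the global-root versus container-root distinction concrete,
here is a hedged sketch of how a uid inside a user namespace translates
to a uid outside it.  It assumes the three-column format (first uid
inside, first uid outside, range length) that later kernels expose in
/proc/PID/uid_map; the function names and the 100000 base are
illustrative, not from the message:

```python
def parse_uid_map(text):
    """Parse uid_map text into (inside_start, outside_start, length) tuples."""
    extents = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            extents.append(tuple(int(f) for f in fields))
    return extents


def to_outside_uid(extents, inside_uid):
    """Translate a uid seen inside the namespace; None if it is unmapped."""
    for inside_start, outside_start, length in extents:
        if inside_start <= inside_uid < inside_start + length:
            return outside_start + (inside_uid - inside_start)
    return None


if __name__ == "__main__":
    # Hypothetical mapping: container uids 0..65535 backed by
    # host uids 100000..165535.
    extents = parse_uid_map("0 100000 65536\n")
    print(to_outside_uid(extents, 0))      # container root is not host uid 0
    print(to_outside_uid(extents, 70000))  # outside the mapped range
```

With a mapping like this, uid 0 in the container is an ordinary
unprivileged uid as far as the rest of the machine is concerned.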

What the user namespace also allows is an escape hatch from the
bonds of suid.  Right now anything that could confuse an existing
app that is suid root we have to allow only to root, or risk
adding a security hole.  With user namespaces we can relax
that check and allow it for container root users as well
as global root users.  When we are brave enough and certain
enough of our code we can allow non-root users to create their
own user namespaces.

There is a third use for containers, where for some reason
we have uid assignment overlap.  Perhaps one distro assigns
uid 22 to sshd and another assigns it to the nobody user.  Or perhaps
there are two departments that have done the silly thing
of assigning overlapping uids to their users, and we want to
access filesystems created by both departments at the same
time without a chance of confusion and conflict.
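
The overlap case can be sketched with plain arithmetic, assuming each
image is given its own disjoint range of host uids.  The container
names, base values, and host_uid helper are hypothetical:

```python
# Hypothetical host-side base uid per container; the ranges must not overlap.
BASES = {"distro_a": 100000, "distro_b": 200000}
RANGE = 65536  # number of uids mapped for each container


def host_uid(container, uid):
    """Return the host uid that backs `uid` inside `container`."""
    if not 0 <= uid < RANGE:
        raise ValueError("uid outside the mapped range")
    return BASES[container] + uid


if __name__ == "__main__":
    # uid 22 is sshd in one image and nobody in the other, yet the two
    # never collide on the host.
    print(host_uid("distro_a", 22))
    print(host_uid("distro_b", 22))
```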

With my sysadmin hat on I would not want to touch two untrusting groups
of users on the same machine, because of the probability that there is
at least one security hole that can be found and exploited to allow
privilege escalation.

With my kernel developer hat on I can't just surrender to the
idea that there will in fact be a privilege escalation bug that
is easy to exploit.  The code has to be built and designed so that
privilege escalation is difficult.  Otherwise we might as well
assume that if you visit a website a stealthy worm has taken over your
computer.

It is my hope at the end of the day that the user namespaces will be one
more line of defense in messing up and slowing down the evil omniscient
worms that seem to unerringly go for every privilege exploit there is.

Eric

