From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755978AbaEOUdV (ORCPT ); Thu, 15 May 2014 16:33:21 -0400
Received: from b.ns.miles-group.at ([95.130.255.144]:1660 "EHLO radon.swed.at"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1755935AbaEOUdU (ORCPT ); Thu, 15 May 2014 16:33:20 -0400
Message-ID: <53752487.3060303@nod.at>
Date: Thu, 15 May 2014 22:33:11 +0200
From: Richard Weinberger
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0
MIME-Version: 1.0
To: "Serge E. Hallyn"
CC: Serge Hallyn, LXC development mailing-list, "Michael H. Warfield",
	Jens Axboe, Serge Hallyn, Arnd Bergmann, LKML, Andy Lutomirski,
	James.Bottomley@HansenPartnership.com
Subject: Re: [lxc-devel] [RFC PATCH 00/11] Add support for devtmpfs in user namespaces
References: <1400103299-144589-1-git-send-email-seth.forshee@canonical.com>
	<20140515013245.GA1764@kroah.com>
	<1400120251.7699.11.camel@canyon.ip6.wittsend.com>
	<20140515031527.GA146352@ubuntu-hedt>
	<20140515040032.GA6702@kroah.com>
	<1400161337.7699.33.camel@canyon.ip6.wittsend.com>
	<20140515140856.GA17453@kroah.com>
	<20140515195010.GA22317@ubuntumail>
	<53751FFA.5040103@nod.at>
	<20140515202628.GB25896@mail.hallyn.com>
In-Reply-To: <20140515202628.GB25896@mail.hallyn.com>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 15.05.2014 22:26, Serge E. Hallyn wrote:
> Quoting Richard Weinberger (richard@nod.at):
>> On 15.05.2014 21:50, Serge Hallyn wrote:
>>> Quoting Richard Weinberger (richard.weinberger@gmail.com):
>>>> On Thu, May 15, 2014 at 4:08 PM, Greg Kroah-Hartman wrote:
>>>>> Then don't use a container to build such a thing, or fix the build
>>>>> scripts to not do that :)
>>>>
>>>> I second this.
>>>> To me it looks like some folks try to (ab)use Linux containers
>>>> for purposes where KVM would fit much better.
>>>> Please don't put more complexity into containers. They are already
>>>> horribly complex and error prone.
>>>
>>> I, naturally, disagree :) The only use case which is inherently not
>>> valid for containers is running a kernel. Practically speaking there
>>> are other things which likely will never be possible, but if someone
>>> offers a way to do something in containers, "you can't do that in
>>> containers" is not an apropos response.
>>>
>>> "That abstraction is wrong" is certainly valid, as when vpids were
>>> originally proposed and rejected, resulting in the development of
>>> pid namespaces. "We have to work out (x) first" can be valid (and
>>> I can think of examples here), assuming it's not just trying to hide
>>> behind a catch-22/chicken-egg problem.
>>>
>>> Finally, saying "containers are complex and error prone" is conflating
>>> several large suites of userspace code and many kernel features which
>>> support them. Being more precise would, if the argument is valid,
>>> lend it a lot more weight.
>>
>> We (my company) have used Linux containers in production since 2011.
>> First LXC, now libvirt-lxc.
>> To understand the internals better I also wrote my own userspace to
>> create/start containers. There are so many things which can hurt you badly.
>> With user namespaces we expose a really big attack surface to regular users.
>> For example, suddenly a user is allowed to mount filesystems.
>
> That is currently not the case. They can mount some virtual filesystems
> and do bind mounts, but cannot mount most real filesystems. This keeps
> us protected (for now) from potentially unsafe superblock readers in the
> kernel.

Yeah, I didn't mean only "real" filesystems. I had VFS issues in mind, where
an attacker could do bad things using bind mounts, for example.

>> Ask Andy, he has already found lots of nasty things...
>
> Yes, of course, and there may be more to come...
>
>> I agree that user namespaces are the way to go; all the papering over
>> security issues with LSM is much worse.
>> But we have to make sure that we don't add too many features too fast.
>
> Agreed. Like I said, "we have to work (x) out first" could be valid,
> including "we should wait (a year?) for user ns issues to fall out
> before relaxing any of the current user ns constraints."
>
> On the other hand, not exercising the new code may only mean that
> existing flaws stick around longer, undetected (by most).

Fair point.

>> That said, I like containers a lot because they are cheap, but because
>> they are lightweight their isolation level is lightweight too.
>> IMHO containers are not a cheap replacement for KVM.
>
> The building blocks for containers can also be used for entirely
> new, simpler use cases - i.e. perhaps a new fakeroot alternative based
> on user namespace mappings. Which is why "this is not a use case for
> containers" is not the right way to push back, whether or not the
> feature ends up being appropriate.

Agreed.
Maybe I'm too pessimistic. We'll see. :-)

Thanks,
//richard