Message-ID: <51802986.3070701@linux.vnet.ibm.com>
Date: Tue, 30 Apr 2013 16:28:54 -0400
From: Corey Bryant
MIME-Version: 1.0
References: <517AC9E5.3050204@linux.vnet.ibm.com> <7515044.dYPbKXmJQB@sifl> <517EBE7D.4020100@linux.vnet.ibm.com> <517EEE0C.603@linux.vnet.ibm.com> <518011C8.7050200@linux.vnet.ibm.com>
In-Reply-To: <518011C8.7050200@linux.vnet.ibm.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] Continuous work on sandboxing
To: Eduardo Otubo
Cc: Paul Moore,
 qemu-devel@nongnu.org, Eric Paris

On 04/30/2013 02:47 PM, Eduardo Otubo wrote:
>
>
> On 04/29/2013 07:02 PM, Corey Bryant wrote:
>>
>>
>> On 04/29/2013 02:39 PM, Eduardo Otubo wrote:
>>>
>>>
>>> On 04/26/2013 06:07 PM, Paul Moore wrote:
>>>> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>>>>> Hello folks,
>>>>>
>>>>> Resuming the sandboxing work, I'd like to ask for comments on the
>>>>> ideas I have:
>>>>>
>>>>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>>>>> with different configurations to reduce to the smallest syscall set
>>>>> possible; test and send a patch weekly (this is already being
>>>>> performed and a patch is on the way)
>>>>
>>>> Is this hooked into a testing framework? While it is always nice to
>>>> have someone verify the correctness, having a simple tool/testsuite
>>>> that can run through things on a regular basis is even better.
>>>
>>> Unfortunately it is currently not. I'm running the tests manually, but I
>>> have in mind some ideas to implement a tool for this purpose.
>>>
>>
>> How about testing in KVM autotest? I assume it would be as simple as
>> modifying some existing tests to use -sandbox on. We definitely should
>> get some automated regression tests running with seccomp on.
>>
>>>>
>>>> Also, looking a bit further ahead, it might be interesting to look at
>>>> removing some of the arch dependent stuff in qemu-seccomp.c. The latest
>>>> version of libseccomp should remove the need for many, if not all, of
>>>> the arch specific #ifdefs and the next version of libseccomp will add
>>>> support for x32 and ARM.
>>>
>>> Tell me more about this. You're saying I can remove the #ifdefs and keep
>>> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
>>> syscalls in another way?
>>>
>>>>
>>>>> 2. Introduce a second whitelist - the whitelist should be defined in
>>>>> libvirt and passed on to qemu or just predefined in Qemu? Also remove
>>>>> execve() and avoid open() and socket() and its parameters ...
>>>>
>>>> If I'm understanding you correctly, I think what you'll want is a
>>>> second *blacklist*. We talked about this previously; we currently have
>>>> a single whitelist, and considering how seccomp works, you can really
>>>> only further restrict things after you install a whitelist into the
>>>> kernel (hence the blacklist).
>>>
>>> Yes, that's exactly what I'm planning to do.
>>>
>>
>> Hmm, I thought you were going to introduce a completely new whitelist so
>> that a guest could optionally be run under:
>> 1) the existing sandbox environment where everything in QEMU works,
>> *or*
>> 2) a new tighter and more restricted sandbox environment where things
>> like execve() are denied, open() is denied (once the pre-req's are in
>> place for fd passing), and potentially other "dangerous" syscalls are
>> denied.
>
> I think we're talking about the same thing here. I believe the execution

I think so, but I'm not entirely sure.

> flow will happen like this: 1) first whitelist installed, only few
> syscalls allowed. 2) qemu starts 3) given the current scenario (the
> current list of syscalls allowed) the second *blacklist* is installed,
> denying execve and open. 4) start guests.

Yes, you could implement the new whitelist this way.

>
> At the end of step 3, we'll have the same environment we have at step 1,
> without execve and open. Is that correct?
>
>>
>> If the whitelist for #2 was passed from libvirt to qemu then libvirt
>> could define the syscalls and syscall parameters that are denied.
>>
>

Just to be clear, I'm thinking you could launch guests in one of two
different seccomp sandboxed environments:

1) Using the existing and more permissive whitelist where every QEMU
feature works:

qemu-kvm -sandbox on,default

2) A more restricted whitelist environment that doesn't allow all QEMU
features to work. It would be limited to the whitelist in 1 and it
would also deny things like execve(), open(), socket(), certain ioctl()
parameters, and may only allow reads/writes to specific fds, and/or
block anything else that could be dangerous:

qemu-kvm -sandbox on,restricted

I'm just throwing these command line options and syscalls out there.
And maybe it makes more sense for libvirt to pass the syscalls and
parameters to QEMU so that libvirt can determine the parameters to
restrict, like fd's the guest is allowed to read/write.

Here's another thread where this was discussed:
http://www.redhat.com/archives/libvir-list/2013-April/msg01501.html

--
Regards,
Corey Bryant
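
For anyone following the thread, here is a rough sketch of the
whitelist-then-blacklist flow discussed above. It is a plain Python model,
not QEMU or libseccomp code. It assumes the kernel behaviour that every
installed seccomp filter is evaluated and the most restrictive action wins.
All names (whitelist, blacklist, evaluate, default_sandbox,
restricted_sandbox) and the syscall sets are hypothetical and chosen only
for illustration.

```python
# Simplified model of stacked seccomp filters (assumption: not real
# QEMU/libseccomp code). Lower value means more restrictive action.
KILL, ERRNO, ALLOW = 0, 1, 2


def whitelist(allowed):
    """Filter that allows only the listed syscalls and kills on the rest."""
    allowed = frozenset(allowed)
    return lambda syscall: ALLOW if syscall in allowed else KILL


def blacklist(denied):
    """Filter that fails the listed syscalls and allows everything else."""
    denied = frozenset(denied)
    return lambda syscall: ERRNO if syscall in denied else ALLOW


def evaluate(filters, syscall):
    """Run every installed filter; the most restrictive result wins."""
    return min((f(syscall) for f in filters), default=ALLOW)


# Hypothetical syscall sets, for illustration only.
default_sandbox = [whitelist({"read", "write", "open", "execve", "ioctl"})]
# The second filter can only tighten what the first one already permits.
restricted_sandbox = default_sandbox + [blacklist({"execve", "open"})]
```

In this model, evaluate(restricted_sandbox, "execve") is denied even though
the first whitelist allows it, while a syscall outside the first whitelist
stays denied no matter what the second filter says. That matches the point
made in the thread that a filter installed later can only restrict further.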