Message-ID: <518011C8.7050200@linux.vnet.ibm.com>
Date: Tue, 30 Apr 2013 15:47:36 -0300
From: Eduardo Otubo
References: <517AC9E5.3050204@linux.vnet.ibm.com> <7515044.dYPbKXmJQB@sifl>
 <517EBE7D.4020100@linux.vnet.ibm.com> <517EEE0C.603@linux.vnet.ibm.com>
In-Reply-To: <517EEE0C.603@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC] Continuous work on sandboxing
To: Corey Bryant
Cc: Paul Moore, qemu-devel@nongnu.org, Eric Paris

On 04/29/2013 07:02 PM, Corey Bryant wrote:
>
>
> On 04/29/2013 02:39 PM, Eduardo Otubo wrote:
>>
>>
>> On 04/26/2013 06:07 PM, Paul Moore wrote:
>>> On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
>>>> Hello folks,
>>>>
>>>> Resuming the sandboxing work, I'd like to ask for comments on the
>>>> ideas I have:
>>>>
>>>> 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
>>>> with different configurations to reduce to the smallest syscall set
>>>> possible; test and send a patch weekly (this is already being performed
>>>> and a patch is on the way)
>>>
>>> Is this hooked into a testing framework? While it is always nice to
>>> have someone verify the correctness, having a simple tool/testsuite
>>> that can run through things on a regular basis is even better.
>>
>> Unfortunately it is currently not. I'm running the tests manually, but I
>> have in mind some ideas to implement a tool for this purpose.
>>
>
> How about testing in KVM autotest? I assume it would be as simple as
> modifying some existing tests to use -sandbox on. We definitely should
> get some automated regression tests running with seccomp on.
>
>>>
>>> Also, looking a bit further ahead, it might be interesting to look at
>>> removing some of the arch dependent stuff in qemu-seccomp.c. The latest
>>> version of libseccomp should remove the need for many, if not all, of
>>> the arch specific #ifdefs and the next version of libseccomp will add
>>> support for x32 and ARM.
>>
>> Tell me more about this. You're saying I can remove the #ifdefs and keep
>> the lines like "{ SCMP_SYS(getresuid32), 241 }, " or address these
>> syscalls in another way?
>>
>>>
>>>> 2. Introduce a second whitelist - the whitelist should be defined in
>>>> libvirt and passed on to qemu or just pre defined in Qemu? Also remove
>>>> execve() and avoid open() and socket() and its parameters ...
>>>
>>> If I'm understanding you correctly, I think what you'll want is a
>>> second *blacklist*. We talked about this previously; we currently have
>>> a single whitelist, and considering how seccomp works, you can really
>>> only further restrict things after you install a whitelist into the
>>> kernel (hence the blacklist).
>>
>> Yes, that's exactly what I'm planning to do.
>>
>
> Hmm, I thought you were going to introduce a completely new whitelist so
> that a guest could optionally be run under:
> 1) the existing sandbox environment where everything in QEMU works,
> *or*
> 2) a new tighter and more restricted sandbox environment where things
> like execve() are denied, open() is denied (once the pre-req's are in
> place for fd passing), and potentially other "dangerous" syscalls are
> denied.

I think we're talking about the same thing here. I believe the execution
flow will happen like this:

1) the first whitelist is installed; only a few syscalls are allowed.
2) qemu starts.
3) given the current scenario (the current list of syscalls allowed),
   the second *blacklist* is installed, denying execve and open.
4) guests are started.

At the end of step 3, we'll have the same environment we have at step 1,
without execve and open. Is that correct?

>
> If the whitelist for #2 was passed from libvirt to qemu then libvirt
> could define the syscalls and syscall parameters that are denied.
>

-- 
Eduardo Otubo
IBM Linux Technology Center
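
To make the four-step flow above concrete, here is a toy model in plain C
of how a whitelist followed by a blacklist would combine. It is only a
sketch: it does not call libseccomp or the kernel, the syscall lists are
made up for illustration, and every name in it (struct filter,
stack_eval, sandbox, ...) is hypothetical. The one property it relies on
is that once several seccomp filters are installed, all of them are
evaluated and the most restrictive result wins, which is why the second
stage can only tighten what the first one allows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: nothing here talks to libseccomp or the kernel. */

/* Lower value means more restrictive. */
enum action { ACT_KILL = 0, ACT_ALLOW = 1 };

struct filter {
    enum action default_action;  /* used when the syscall is not listed */
    enum action listed_action;   /* used when the syscall is listed */
    const char *const *names;    /* listed syscall names */
    size_t count;
};

/* Hypothetical lists, chosen for illustration only. */
static const char *const stage1_allowed[] = {
    "read", "write", "ioctl", "open", "execve"
};
static const char *const stage2_denied[] = { "execve", "open" };

/* sandbox[0] is the step 1 whitelist, sandbox[1] the step 3 blacklist. */
static const struct filter sandbox[] = {
    { ACT_KILL, ACT_ALLOW, stage1_allowed,
      sizeof(stage1_allowed) / sizeof(stage1_allowed[0]) },
    { ACT_ALLOW, ACT_KILL, stage2_denied,
      sizeof(stage2_denied) / sizeof(stage2_denied[0]) },
};

/* Result of a single filter for one syscall name. */
static enum action filter_eval(const struct filter *f, const char *syscall)
{
    for (size_t i = 0; i < f->count; i++) {
        if (strcmp(f->names[i], syscall) == 0)
            return f->listed_action;
    }
    return f->default_action;
}

/* Every installed filter runs; the most restrictive result is kept. */
static enum action stack_eval(const struct filter *stack, size_t n,
                              const char *syscall)
{
    assert(stack != NULL || n == 0);

    enum action result = ACT_ALLOW;
    for (size_t i = 0; i < n; i++) {
        enum action a = filter_eval(&stack[i], syscall);
        if (a < result)
            result = a;
    }
    return result;
}
```

Evaluating with only the first filter (n = 1) models the state after
step 1; evaluating with both (n = 2) models the state after step 3. In
this model open and execve are allowed in the first state and denied in
the second, while a syscall that was never whitelisted stays denied in
both.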