From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4FA93981.7080502@linux.vnet.ibm.com>
Date: Tue, 08 May 2012 11:19:29 -0400
From: Corey Bryant
MIME-Version: 1.0
References: <20120508091535.GB18762@redhat.com> <4FA92951.8090601@linux.vnet.ibm.com> <20120508142723.GJ18762@redhat.com>
In-Reply-To: <20120508142723.GJ18762@redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] [PATCH 0/2] Sandboxing Qemu guests with Libseccomp
To: "Daniel P. Berrange"
Cc: Stefano Stabellini, "qemu-devel@nongnu.org", Eduardo Otubo

On 05/08/2012 10:27 AM, Daniel P.
Berrange wrote:
> On Tue, May 08, 2012 at 10:10:25AM -0400, Corey Bryant wrote:
>>
>>
>> On 05/08/2012 07:32 AM, Stefano Stabellini wrote:
>>> On Tue, 8 May 2012, Daniel P. Berrange wrote:
>>>> On Fri, May 04, 2012 at 04:08:36PM -0300, Eduardo Otubo wrote:
>>>>> Hello all,
>>>>>
>>>>> This is the first effort to sandboxing Qemu guests using Libseccomp[0]. The
>>>>> patches that follows are pretty simple and straightforward. I added the correct
>>>>> options and checks to the configure script and the basic calls to libseccomp in
>>>>> the main loop at vl.c. Details of each one are in the emails of the patch set.
>>>>>
>>>>> This support limits the system call footprint of the entire QEMU process to a
>>>>> limited set of syscalls, those that we know QEMU uses. The idea is to limit
>>>>> the allowable syscalls, therefore limiting the impact that an attacked guest
>>>>> could have on the host system.
>>>>
>>>> What functionality has been lost by applying this seccomp filter ? I've not
>>>> looked closely at the code, but it appears as if this blocks pretty much
>>>> any kind of runtime device changes. ie no hotplug of any kind will work ?
>>>
>>> Right, I was wondering the same thing: open is not on the list so adding
>>> a new disk shouldn't be possible.
>>>
>>> Regarding Xen, most of the hypercalls go through xc_* calls that are
>>> ioctls on the privcmd device. Is it possible to add ioctl to the list?
>>>
>>
>> If the whitelist is complete there should be no functionality lost
>> when using seccomp with QEMU. The idea (at least at this point) is
>> to disallow the system calls that QEMU doesn't use. open and ioctl
>> should be added to the whitelist.
>
> Ok. So my next question is what is the benchmark for evaluating
> whether this seccomp code provides any kind of meaningful security
> improvement ? AFAICT, if you were allow open(), or indeed every
> syscall any QEMU feature could possibly use, then there would be
> little-to-no security benefit.

Well, let's say we have a seccomp whitelist of 50 syscalls. That reduces the syscall footprint from ~350 syscalls (on x86) to 50, limiting what the attacker could execute from an exploited guest.

Eventually it would be nice to fine-tune the syscall parameters that are whitelisted. For example, we could allow only a designated subset of ioctls. Or we could allow I/O operations only on a designated set of file descriptors that the guest needs to access.

-- 
Regards,
Corey
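
To make the parameter-level whitelisting idea concrete, here is a rough userspace model of the decision such a filter would make. It is only a sketch and not the patch set's code: a real filter is evaluated in the kernel and would be built with libseccomp, whereas this just walks a table. The names `allow_rule`, `syscall_allowed`, `GUEST_DISK_FD` and `IOCTL_REQ_ALLOWED` are hypothetical, and the fd and ioctl request values are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>

/* Made-up values for illustration only. */
#define GUEST_DISK_FD     5UL       /* the one fd the guest may do I/O on */
#define IOCTL_REQ_ALLOWED 0x1234UL  /* the one ioctl request we permit    */

/* Hypothetical rule: allow syscall `nr`; if `check_arg` is set, additionally
 * require that argument number `arg_index` equals `arg_value`. */
struct allow_rule {
    long          nr;
    bool          check_arg;
    unsigned      arg_index;
    unsigned long arg_value;
};

/* A tiny whitelist: anything not listed here is denied. */
static const struct allow_rule whitelist[] = {
    { SYS_exit,  false, 0, 0 },                  /* any arguments        */
    { SYS_read,  true,  0, GUEST_DISK_FD },      /* arg 0 is the fd      */
    { SYS_write, true,  0, GUEST_DISK_FD },      /* arg 0 is the fd      */
    { SYS_ioctl, true,  1, IOCTL_REQ_ALLOWED },  /* arg 1 is the request */
};

/* Return true if the syscall number and its arguments match some rule. */
bool syscall_allowed(long nr, const unsigned long args[6])
{
    for (size_t i = 0; i < sizeof(whitelist) / sizeof(whitelist[0]); i++) {
        const struct allow_rule *r = &whitelist[i];

        if (r->nr != nr)
            continue;
        assert(r->arg_index < 6);
        if (!r->check_arg || args[r->arg_index] == r->arg_value)
            return true;
    }
    return false;  /* default action: deny */
}
```

With this table, a `read` on fd 5 is allowed while a `read` on any other fd, an `ioctl` with a different request number, or an unlisted syscall such as `close` is denied. That is the kind of narrowing a bare syscall-number whitelist cannot express.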