From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:50410)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1RNT8B-0000JI-LM for qemu-devel@nongnu.org;
    Mon, 07 Nov 2011 12:37:12 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1RNT89-0004VZ-I2 for qemu-devel@nongnu.org;
    Mon, 07 Nov 2011 12:37:11 -0500
Received: from mail-gy0-f173.google.com ([209.85.160.173]:49877)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1RNT89-0004V7-Dw for qemu-devel@nongnu.org;
    Mon, 07 Nov 2011 12:37:09 -0500
Received: by gyb11 with SMTP id 11so4659437gyb.4 for ;
    Mon, 07 Nov 2011 09:37:08 -0800 (PST)
Message-ID: <4EB8173F.9090008@codemonkey.ws>
Date: Mon, 07 Nov 2011 11:37:03 -0600
From: Anthony Liguori
MIME-Version: 1.0
References: <1320612020.3299.22.camel@lappy>
In-Reply-To: <1320612020.3299.22.camel@lappy>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Secure KVM
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Sasha Levin
Cc: Andrea Arcangeli, Cyrill Gorcunov, Rusty Russell, kvm,
    "Michael S. Tsirkin", Corentin Chary, Asias He, Marcelo Tosatti,
    qemu-devel, Pekka Enberg, Avi Kivity, Ingo Molnar

On 11/06/2011 02:40 PM, Sasha Levin wrote:
> Hi all,
>
> I'm planning on doing a small fork of the KVM tool to turn it into a
> 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh?
>
> The idea was discussed briefly couple of months ago, but never got off
> the ground - which is a shame IMO.
>
> It's easy to explain the problem: If an attacker finds a security hole
> in any of the devices which are exposed to the guest, the attacker would
> be able to either crash the guest, or possibly run code on the host
> itself.
>
> The solution is also simple to explain: Split the devices into different
> processes and use seccomp to sandbox each device into the exact set of
> resources it needs to operate, nothing more and nothing less.
>
> Since I'll be basing it on the KVM tool, which doesn't really emulate
> that many legacy devices, I'll focus first on the virtio family for the
> sake of simplicity (and covering 90% of the options).
>
> This is my basic overview of how I'm planning on implementing the
> initial POC:
>
> 1. First I'll focus on the simple virtio-rng device, it's simple enough
> to allow us to focus on the aspects which are important for the POC
> while still covering most bases (i.e. sandbox to single file
> - /dev/urandom and such).
>
> 2. Do it on a one process per device concept, where for each device
> (notice - not device *type*) requested, a new process which handles it
> will be spawned.
>
> 3. That process will be limited exactly to the resources it needs to
> operate, for example - if we run a virtio-blk device, it would be able
> to access only the image file which it should be using.
>
> 4. Connection between hypervisor and devices will be based on unix
> sockets, this should allow for better separation compared to other
> approaches such as shared memory.
>
> 5. While performance is an aspect, complete isolation is more important.
> Security is primary, performance is secondary.
>
> 6. Share as much code as possible with current implementation of virtio
> devices, make it possible to run virtio devices either like it's being
> done now, or by spawning them as separate processes - the amount of
> specific code for the separate process case should be minimal.
>
>
> Thats all I have for now, comments are *very* welcome.

I thought about this a bit and have some ideas that may or may not help.

1) If you add device save/load support, then it's something you can
potentially use to give yourself quite a bit of flexibility in changing the
sandbox.
At any point at run time, you can save the device model's state in the
sandbox, destroy the sandbox, and then build a new sandbox and restore the
device to its former state. This might turn out to be very useful in
supporting things like device hotplug and/or memory hotplug.

2) I think it's largely possible to implement all device emulation without
doing any dynamic memory allocation. Since memory allocation DoS is
something you have to deal with anyway, I suspect most device emulation
already uses a fixed amount of memory per device. This can potentially
dramatically simplify things.

3) I think virtio can/should be used as a generic "backend to frontend"
transport between the device model and the tool.

4) Lack of select() is really challenging. I understand why it's not there,
since it can technically be emulated, but it seems like a no-risk syscall to
whitelist and it would make programming in a sandbox so much easier. Maybe
Andrea has some comments here? I might be missing something here.

Regards,

Anthony Liguori

>
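
For illustration only, here is a rough sketch of the save/destroy/rebuild/restore cycle from 1), written in Python for brevity. Nothing in it comes from the KVM tool or QEMU: the one-byte "R"/"S" commands, the state layout, and all function names are hypothetical. The "sandbox" is just a forked child talking over a Unix socket pair, and the point where a real device process would install its seccomp filter is only marked with a comment. The device state is a fixed-size struct, in the spirit of 2).

```python
import os
import socket
import struct

# Hypothetical fixed-size device state: (bytes_served, pending_requests).
STATE_FMT = "<QI"
STATE_SIZE = struct.calcsize(STATE_FMT)


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closed early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("sandbox closed the socket")
        buf += chunk
    return buf


def device_main(sock, saved):
    """Device model loop; runs inside the sandboxed child process."""
    served, pending = struct.unpack(STATE_FMT, saved)  # "load"
    while True:
        cmd = recv_exact(sock, 1)
        if cmd == b"R":    # request: serve 16 random bytes
            sock.sendall(os.urandom(16))
            served += 16
        elif cmd == b"S":  # "save": send the state back, then exit
            sock.sendall(struct.pack(STATE_FMT, served, pending))
            return


def spawn_sandbox(saved):
    """Fork a child to host the device; return (pid, parent_socket)."""
    parent, child = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        parent.close()
        # A real sandbox would install its seccomp filter here.
        try:
            device_main(child, saved)
        finally:
            os._exit(0)
    child.close()
    return pid, parent


def save_and_destroy(pid, sock):
    """Ask the device for its state, then tear the sandbox down."""
    sock.sendall(b"S")
    saved = recv_exact(sock, STATE_SIZE)
    sock.close()
    os.waitpid(pid, 0)
    return saved


def demo():
    """Serve one request, move the device to a new sandbox, serve another."""
    pid, sock = spawn_sandbox(struct.pack(STATE_FMT, 0, 0))
    sock.sendall(b"R")
    recv_exact(sock, 16)
    saved = save_and_destroy(pid, sock)  # old sandbox is gone
    pid, sock = spawn_sandbox(saved)     # new sandbox, old state
    sock.sendall(b"R")
    recv_exact(sock, 16)
    return struct.unpack(STATE_FMT, save_and_destroy(pid, sock))
```

Calling demo() serves one request in the first sandbox, carries the saved state into a second one, and serves another request there, so the byte counter survives the rebuild. Because the state has a fixed size, the parent knows exactly how many bytes to read back and neither side needs to allocate based on untrusted input.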