From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:50410)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RNT8B-0000JI-LM
	for qemu-devel@nongnu.org; Mon, 07 Nov 2011 12:37:12 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RNT89-0004VZ-I2
	for qemu-devel@nongnu.org; Mon, 07 Nov 2011 12:37:11 -0500
Received: from mail-gy0-f173.google.com ([209.85.160.173]:49877)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <anthony@codemonkey.ws>) id 1RNT89-0004V7-Dw
	for qemu-devel@nongnu.org; Mon, 07 Nov 2011 12:37:09 -0500
Received: by gyb11 with SMTP id 11so4659437gyb.4
	for <qemu-devel@nongnu.org>; Mon, 07 Nov 2011 09:37:08 -0800 (PST)
Message-ID: <4EB8173F.9090008@codemonkey.ws>
Date: Mon, 07 Nov 2011 11:37:03 -0600
From: Anthony Liguori <anthony@codemonkey.ws>
MIME-Version: 1.0
References: <1320612020.3299.22.camel@lappy>
In-Reply-To: <1320612020.3299.22.camel@lappy>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Secure KVM
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Sasha Levin <levinsasha928@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>, Cyrill Gorcunov <gorcunov@gmail.com>, Rusty Russell <rusty@rustcorp.com.au>, kvm <kvm@vger.kernel.org>, "Michael S. Tsirkin" <mst@redhat.com>, Corentin Chary <corentincj@iksaif.net>, Asias He <asias.hejun@gmail.com>, Marcelo Tosatti <mtosatti@redhat.com>, qemu-devel <qemu-devel@nongnu.org>, Pekka Enberg <penberg@kernel.org>, Avi Kivity <avi@redhat.com>, Ingo Molnar <mingo@elte.hu>

On 11/06/2011 02:40 PM, Sasha Levin wrote:
> Hi all,
>
> I'm planning on doing a small fork of the KVM tool to turn it into a
> 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh?
>
> The idea was discussed briefly a couple of months ago, but never got off
> the ground - which is a shame IMO.
>
> It's easy to explain the problem: If an attacker finds a security hole
> in any of the devices which are exposed to the guest, the attacker would
> be able to either crash the guest, or possibly run code on the host
> itself.
>
> The solution is also simple to explain: Split the devices into different
> processes and use seccomp to sandbox each device into the exact set of
> resources it needs to operate, nothing more and nothing less.
>
> Since I'll be basing it on the KVM tool, which doesn't really emulate
> that many legacy devices, I'll focus first on the virtio family for the
> sake of simplicity (and covering 90% of the options).
>
> This is my basic overview of how I'm planning on implementing the
> initial POC:
>
> 1. First I'll focus on the simple virtio-rng device, it's simple enough
> to allow us to focus on the aspects which are important for the POC
> while still covering most bases (i.e. sandbox to single file
> - /dev/urandom and such).
>
> 2. Do it on a one process per device concept, where for each device
> (notice - not device *type*) requested, a new process which handles it
> will be spawned.
>
> 3. That process will be limited exactly to the resources it needs to
> operate, for example - if we run a virtio-blk device, it would be able
> to access only the image file which it should be using.
>
> 4. Connection between hypervisor and devices will be based on unix
> sockets, this should allow for better separation compared to other
> approaches such as shared memory.
>
> 5. While performance is an aspect, complete isolation is more important.
> Security is primary, performance is secondary.
>
> 6. Share as much code as possible with current implementation of virtio
> devices, make it possible to run virtio devices either like it's being
> done now, or by spawning them as separate processes - the amount of
> specific code for the separate process case should be minimal.
>
>
> That's all I have for now, comments are *very* welcome.

I thought about this a bit and have some ideas that may or may not help.

1) If you add device save/load support, then it's something you can potentially 
use to give yourself quite a bit of flexibility in changing the sandbox.  At any 
point at runtime, you can save the device model's state in the sandbox, destroy 
the sandbox, and then build a new sandbox and restore the device to its former 
state.

This might turn out to be very useful in supporting things like device hotplug 
and/or memory hotplug.
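
To make the save/load idea a bit more concrete, here is a minimal sketch of a 
versioned, fixed-size state blob that could be written out before a sandbox is 
torn down and read back in a fresh one.  Everything here is made up for 
illustration (struct rng_state, its fields, the 16-byte layout and the function 
names); it is not taken from the KVM tool or from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RNG_STATE_VERSION   1u
#define RNG_STATE_WIRE_SIZE 16u

/* Hypothetical state of a virtio-rng-like device model. */
struct rng_state {
    uint32_t queue_index;   /* next ring slot to service */
    uint64_t bytes_served;  /* bytes handed to the guest so far */
};

/* Store the low n bytes of v in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Read an n-byte little-endian value. */
static uint64_t get_le(const uint8_t *p, size_t n)
{
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Serialize the state. Returns bytes written, or 0 if buf is too small. */
size_t rng_state_save(const struct rng_state *s, uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    if (len < RNG_STATE_WIRE_SIZE)
        return 0;
    put_le(buf, RNG_STATE_VERSION, 4);
    put_le(buf + 4, s->queue_index, 4);
    put_le(buf + 8, s->bytes_served, 8);
    return RNG_STATE_WIRE_SIZE;
}

/* Restore the state. Returns 0 on success, -1 on short or unknown input. */
int rng_state_load(struct rng_state *s, const uint8_t *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    if (len < RNG_STATE_WIRE_SIZE || get_le(buf, 4) != RNG_STATE_VERSION)
        return -1;
    s->queue_index = (uint32_t)get_le(buf + 4, 4);
    s->bytes_served = get_le(buf + 8, 8);
    return 0;
}
```

The old sandbox would call rng_state_save() and write the buffer to its 
socket; the new sandbox would read it and call rng_state_load() before 
servicing the guest again.  An explicit byte layout with a version field lets 
the loader reject blobs it does not understand rather than trusting them.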

2) I think it's largely possible to implement all device emulation without doing 
any dynamic memory allocation.  Since memory allocation DoS is something you 
have to deal with anyway, I suspect most device emulation already uses a fixed 
amount of memory per device.  This could simplify things dramatically.
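
As a rough illustration of emulation without dynamic allocation, a device 
could carve its request buffers out of a fixed pool sized at build time.  The 
names and sizes below (struct req_pool, REQ_SLOTS, REQ_BYTES) are assumptions 
for the sketch, not anything from an existing device model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REQ_SLOTS 8     /* hypothetical: max requests in flight */
#define REQ_BYTES 512   /* hypothetical: bytes per request buffer */

/* All memory the device will ever use for requests, fixed up front. */
struct req_pool {
    uint8_t data[REQ_SLOTS][REQ_BYTES];
    uint8_t in_use[REQ_SLOTS];
};

/* Hand out a free slot, or NULL when the pool is exhausted. */
void *req_alloc(struct req_pool *p)
{
    for (size_t i = 0; i < REQ_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return p->data[i];
        }
    }
    return NULL;  /* caller must make the guest wait; nothing grows */
}

/* Return a slot previously obtained from req_alloc(). */
void req_free(struct req_pool *p, void *buf)
{
    size_t off = (size_t)((uintptr_t)buf - (uintptr_t)p->data);

    assert(off < sizeof p->data && off % REQ_BYTES == 0);
    p->in_use[off / REQ_BYTES] = 0;
}
```

With a pool like this, a guest that floods the device with requests can only 
exhaust the device's own slots; it cannot push the host process into 
allocating more memory, and the sandbox never needs brk() or mmap().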

3) I think virtio can/should be used as a generic "backend to frontend" 
transport between the device model and the tool.

4) The lack of select() in the seccomp sandbox is really challenging.  I 
understand why it's not there, since it can technically be emulated, but it 
seems like a no-risk syscall to whitelist and it would make programming in a 
sandbox so much easier.  Maybe Andrea has some comments?  I might be missing 
something here.
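
For what it's worth, here is one way the emulation could look, as a sketch 
only.  It assumes a helper outside the sandbox that is allowed to call poll(), 
and that forwards whatever becomes readable onto a single fd as small tagged 
frames, so the sandboxed side only ever needs a blocking read().  The frame 
format and the names mux_forward_once() and mux_read() are invented for this 
example.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

#define MUX_MAX_FDS  8
#define MUX_MAX_DATA 255

/* Helper side (outside the sandbox): wait until one source is readable and
 * forward its data as a frame: [source index][length][payload].
 * Returns the index of the source forwarded, or -1 on error or EOF. */
int mux_forward_once(int out_fd, const int *in_fds, int nfds)
{
    struct pollfd pfds[MUX_MAX_FDS];
    uint8_t frame[2 + MUX_MAX_DATA];

    assert(nfds > 0 && nfds <= MUX_MAX_FDS);
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = in_fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)nfds, -1) <= 0)
        return -1;
    for (int i = 0; i < nfds; i++) {
        if (!(pfds[i].revents & POLLIN))
            continue;
        ssize_t n = read(in_fds[i], frame + 2, MUX_MAX_DATA);
        if (n <= 0)
            return -1;
        frame[0] = (uint8_t)i;
        frame[1] = (uint8_t)n;
        if (write(out_fd, frame, (size_t)n + 2) != n + 2)
            return -1;
        return i;
    }
    return -1;
}

/* Read exactly len bytes; returns 0 on success, -1 on error or EOF. */
static int read_full(int fd, uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = read(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Sandbox side: needs nothing but blocking read(). buf must hold at least
 * MUX_MAX_DATA bytes. Returns the payload length, or -1 on error. */
int mux_read(int fd, uint8_t *source, uint8_t *buf)
{
    uint8_t hdr[2];

    if (read_full(fd, hdr, 2) < 0 || read_full(fd, buf, hdr[1]) < 0)
        return -1;
    *source = hdr[0];
    return hdr[1];
}
```

This works, but it costs an extra process and an extra copy per event, and 
the sketch ignores EINTR and partial writes.  That overhead is the reason 
whitelisting select() itself looks attractive.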

Regards,

Anthony Liguori
