From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: Secure KVM
Date: Mon, 07 Nov 2011 11:39:30 -0600
Message-ID: <4EB817D2.5010200@codemonkey.ws>
References: <1320612020.3299.22.camel@lappy> <4EB7A45D.1030600@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Sasha Levin, Andrea Arcangeli, Marcelo Tosatti, Ingo Molnar,
	Pekka Enberg, Cyrill Gorcunov, Asias He, Rusty Russell,
	"Michael S. Tsirkin", kvm
To: Avi Kivity
Return-path:
Received: from mail-iy0-f174.google.com ([209.85.210.174]:49434 "EHLO
	mail-iy0-f174.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S933071Ab1KGRje (ORCPT );
	Mon, 7 Nov 2011 12:39:34 -0500
Received: by iage36 with SMTP id e36so6080468iag.19
	for ; Mon, 07 Nov 2011 09:39:33 -0800 (PST)
In-Reply-To: <4EB7A45D.1030600@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 11/07/2011 03:26 AM, Avi Kivity wrote:
> On 11/06/2011 10:40 PM, Sasha Levin wrote:
>> Hi all,
>>
>> I'm planning on doing a small fork of the KVM tool to turn it into a
>> 'Secure KVM' enabled hypervisor. Now you probably ask yourself, Huh?
>
> Actually, no.
>
>> The idea was discussed briefly couple of months ago, but never got off
>> the ground - which is a shame IMO.
>>
>> It's easy to explain the problem: If an attacker finds a security hole
>> in any of the devices which are exposed to the guest, the attacker would
>> be able to either crash the guest, or possibly run code on the host
>> itself.
>
> Crashing the guest is fine (not 100% - you can have unprivileged code
> managing a device, in which case we allow unprivileged code to crash the
> entire guest - but that's rare).  Running code on the host is also fine;
> we have a permissions system in place to prevent damage; see libvirt's
> sVirt code, which uses selinux to disallow an exploited guest from
> touching other guests or host data.  It should be able to protect
> host-only networks as well (not sure if it does that).
>
> The real risk is that the exploited hypervisor turns around and exploits
> yet another hole in the system, like a privileged daemon that the
> hypervisor is allowed to be in contact with, or the kernel itself, via a
> vulnerability in the kernel interfaces.
>
>> The solution is also simple to explain: Split the devices into different
>> processes and use seccomp to sandbox each device into the exact set of
>> resources it needs to operate, nothing more and nothing less.
>
> One thing to beware of is memory hotplug.  If the memory map is static,
> then a fork() once everything is set up (with MAP_SHARED) allows all
> processes to access guest memory.  However, if memory hotplug is
> supported (or planned to be supported), then you can't do that, as
> seccomp doesn't allow you to run mmap() in confined processes.
>
> This means they have to use RPC to the main process in order to access
> memory, which is going to slow them down significantly.

If you treat the sandbox as ephemeral by leveraging save/restore, you can
throw away and rebuild the device model on every memory change.  While not
a super cheap operation, it's at least amortized over time.

Regards,

Anthony Liguori

>> Since I'll be basing it on the KVM tool, which doesn't really emulate
>> that many legacy devices, I'll focus first on the virtio family for the
>> sake of simplicity (and covering 90% of the options).
>
> Since virtio is so performance sensitive, my feeling is that it is
> better to audit it, and rely on sandboxing for the non performance
> sensitive parts of the device model.  Of course for a POC it's fine to
> start with it.
>
>> This is my basic overview of how I'm planning on implementing the
>> initial POC:
>
>
>
>> That's all I have for now, comments are *very* welcome.
>
> This plan is quite similar to the equivalent plans for qemu.  However,
> as kvm-tool is much smaller than qemu, you're likely to have much easier
> time and make much faster progress.  This is really a great use of
> kvm-tool, to explore new ideas rather than catching up; and I'm sure
> your experience will prove useful for qemu as well.
>
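
For reference, the static-memory-map case discussed above (map guest memory with MAP_SHARED, then fork() the device processes) can be sketched roughly as follows. This is only an illustration, not code from kvm-tool or qemu: Python is used for brevity, the names GUEST_RAM_SIZE, run_device_process and demo_shared_guest_memory are made up for the example, and the seccomp step is only marked by a comment rather than actually performed.

```python
import mmap
import os

# Hypothetical size for illustration; a real guest would map far more memory.
GUEST_RAM_SIZE = 4096


def run_device_process(guest_ram: mmap.mmap) -> None:
    """Stand-in for a device model that touches guest memory."""
    # Assumption: a real device process would enter seccomp at this point.
    # From then on it could not call mmap() to pick up hotplugged memory,
    # which is the limitation raised in the thread.
    guest_ram[0:4] = b"virt"


def demo_shared_guest_memory() -> bytes:
    # Anonymous MAP_SHARED mapping created before fork(), so that the child
    # inherits a view of the same pages instead of a private copy.
    guest_ram = mmap.mmap(-1, GUEST_RAM_SIZE, flags=mmap.MAP_SHARED)

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the forked device process.
        try:
            run_device_process(guest_ram)
        finally:
            os._exit(0)  # leave without running the parent's cleanup code

    # Parent: wait for the child, then read back what it wrote.
    os.waitpid(pid, 0)
    data = guest_ram[0:4]
    guest_ram.close()
    return data


if __name__ == "__main__":
    print(demo_shared_guest_memory())
```

The write made by the child is visible to the parent only because the mapping exists before fork() and is shared. Memory added after the fork would need a new mmap() in the child, which is what a confined process cannot do, hence the RPC or rebuild-the-sandbox alternatives mentioned in the thread.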