From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: Secure KVM
Date: Mon, 07 Nov 2011 12:27:53 +0200
Message-ID: <4EB7B2A9.5020608@redhat.com>
References: <1320612020.3299.22.camel@lappy> <4EB7A45D.1030600@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Andrea Arcangeli, Marcelo Tosatti, Ingo Molnar, Pekka Enberg,
 Cyrill Gorcunov, Asias He, Anthony Liguori, Rusty Russell,
 "Michael S. Tsirkin", kvm
To: Sasha Levin
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:61446 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753248Ab1KGK2L (ORCPT ); Mon, 7 Nov 2011 05:28:11 -0500
In-Reply-To:
Sender: kvm-owner@vger.kernel.org
List-ID:

On 11/07/2011 12:17 PM, Sasha Levin wrote:
> Hi Avi,
>
> Thank you for your comments!
>
> Just one question below:
>
> On Mon, Nov 7, 2011 at 11:26 AM, Avi Kivity wrote:
> > Crashing the guest is fine (not 100% - you can have unprivileged code
> > managing a device, in which case we allow unprivileged code to crash the
> > entire guest - but that's rare). Running code on the host is also fine;
>
> On Mon, Nov 7, 2011 at 11:26 AM, Avi Kivity wrote:
> > One thing to beware of is memory hotplug. If the memory map is static,
> > then a fork() once everything is set up (with MAP_SHARED) allows all
> > processes to access guest memory. However, if memory hotplug is
> > supported (or planned to be supported), then you can't do that, as
> > seccomp doesn't allow you to run mmap() in confined processes.
> >
> > This means they have to use RPC to the main process in order to access
> > memory, which is going to slow them down significantly.
>
> Is the risk of a non-privileged guest code being able to exploit
> hypervisor to access guest memory which it's not allowed to access is
> really that small?
> I actually thought it would be one of the main
> concerns we'd need to handle, but from what I understand from you it's
> an irrelevant scenario.

I wouldn't say it's completely irrelevant. But mainstream deployments
(Linux and Windows) don't really suffer from it, since all device
drivers are privileged (an exception may be graphics drivers on newer
Windows). Scenarios which may be vulnerable are nested virtualization
with the guest using device assignment.

> If it's really the case, then mapping guest memory is preferable.
> While mmap() is an issue, I think it's a great example of why seccomp
> filters are needed in the kernel, and might be a good chance to push
> that feature forward. In that sense, 'Secure KVM' could be used as a
> guinea pig both for seccomp filters and future QEMU work.

Sure.

Another direction we're looking at is making it harder to exploit a
vulnerability. PIC/PIE (position independent code/executable) make it
harder to exploit a bug; and SELinux controls on exec(), mprotect(), and
mmap(PROT_EXEC) make it impossible to inject code (you can still use
code in the hypervisor or its libraries). So we still have
vulnerabilities, but they're all denial of service rather than privilege
escalation.

-- 
error compiling committee.c: too many arguments to function
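
Illustration (a sketch, not from the original thread): the static-memory-map
case quoted above, where guest memory is mapped with MAP_SHARED before fork()
so that each child inherits access without ever calling mmap() itself, might
look roughly like this. The function name shared_guest_memory_demo() and the
1 MiB region size are hypothetical, and no seccomp confinement is actually
applied to the child here; it only shows why the mapping has to exist before
the fork.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define GUEST_MEM_SIZE (1u << 20)  /* hypothetical 1 MiB stand-in for guest RAM */

/*
 * Map the region once, fork a worker, let the worker store a marker byte,
 * and return the byte the parent reads back (or -1 on any error).
 */
int shared_guest_memory_demo(void)
{
    /* MAP_SHARED keeps stores visible across fork(); with MAP_PRIVATE the
     * child would write to its own copy-on-write pages instead. */
    uint8_t *guest_mem = mmap(NULL, GUEST_MEM_SIZE, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (guest_mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(guest_mem, GUEST_MEM_SIZE);
        return -1;
    }
    if (pid == 0) {
        /* Stand-in for a confined device process: it only touches the
         * mapping it inherited and never needs mmap() of its own. */
        guest_mem[0] = 0x42;
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(guest_mem, GUEST_MEM_SIZE);
        return -1;
    }

    int seen = guest_mem[0];       /* the child's store, seen by the parent */
    munmap(guest_mem, GUEST_MEM_SIZE);
    return seen;
}
```

With memory hotplug, a region added after the fork would not appear in the
already-running children, which is the case where they would have to fall
back to RPC to the main process.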