From mboxrd@z Thu Jan  1 00:00:00 1970
From: Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: A few KVM security questions
Date: Mon, 07 Dec 2009 11:33:08 -0600
Message-ID: <4B1D3C54.6030305@codemonkey.ws>
References: <4B1CFD93.7090307@invisiblethingslab.com> <4B1D0057.8030707@redhat.com> <4B1D0383.1080306@invisiblethingslab.com> <4B1D0544.9000603@redhat.com> <4B1D30F6.7050609@codemonkey.ws> <4B1D36E3.9090206@invisiblethingslab.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity <avi@redhat.com>, kvm@vger.kernel.org
To: Joanna Rutkowska <joanna@invisiblethingslab.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-bw0-f227.google.com ([209.85.218.227]:48016 "EHLO
	mail-bw0-f227.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S935528AbZLGRdI (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 7 Dec 2009 12:33:08 -0500
Received: by bwz27 with SMTP id 27so3707615bwz.21
        for <kvm@vger.kernel.org>; Mon, 07 Dec 2009 09:33:13 -0800 (PST)
In-Reply-To: <4B1D36E3.9090206@invisiblethingslab.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Joanna Rutkowska wrote:
> Anthony Liguori wrote:
>   
>> Avi Kivity wrote:
>>     
>>> No.  Paravirtualization just augments the standard hardware interface,
>>> it doesn't replace it as in Xen.
>>>       
>> NB, unlike Xen, we can (and do) run qemu as non-root.  Things like
>> RHEV-H and oVirt constrain the qemu process with SELinux.
>>
>>     
>
> On Xen you can get rid of the qemu entirely, if you run only PV domains.
>
>   
>> Also, you can use qemu to provide the backends to a Xen PV guest (see -M
>> xenpv).  The effect is that you are moving that privileged code from the
>> kernel (netback/blkback) to userspace (qemu -M xenpv).
>>
>> In general, KVM tends to keep code in userspace unless absolutely
>> necessary.  That's a fundamental difference from Xen which tends to do
>> the opposite.
>>
>>     
>
> But the difference is that in case of Xen one can *easily* move the
> backends to small unprivileged VMs. In that case it doesn't matter the
> code is in kernel mode, it's still only in an unprivileged domain.
>   

Right, in KVM, Linux == hypervisor.  A process is our "unprivileged 
domain".  Putting an unprivileged domain within an unprivileged domain 
is probably not helpful from a security perspective since the exposure 
surface is identical.

> Sandboxing a process in a monolithic OS, like Linux, is generally
> considered unfeasible, for anything more complex than a hello world
> program. The process <-> kernel interface seem to be just too fat. See
> e.g. the recent Linux kernel overflows by Spender.
>   

That's the point of mandatory access control.  Of course, you need the 
right policy.  Spender highlighted an issue with the standard RHEL 
SELinux policy, but that should now be addressed upstream.

> Also, SELinux seems to me like a step into the wrong direction. It not
> only adds complexity to the already-too-complex kernel, but requires
> complex configuration. See e.g. this paper[1] for a nice example of how
> to escape SE-sandboxed qemu on FC8 due to SELinux policy misconfiguration.
>
> When some people tried to add SELinux-like-thing to Xen hypervisor, it
> only resulted in an exploitable heap overflow in Xen [2].
>   

It's certainly fair to argue the merits of SELinux as a mandatory access 
control mechanism.

Again though, that's the point of layered security (defense in depth).  
Our first line of defense is qemu.  Our second line of defense is 
traditional POSIX discretionary access control.  Our third line of 
defense is namespace isolation (a la lxc).  Our fourth line of defense 
is mandatory access control (a la SELinux and AppArmor).

If you take a somewhat standard deployment like RHEV-H, an awful lot of 
things have to go wrong before you can successfully exploit the system.  
And 5.4 doesn't even implement all of what's possible.  If you're really 
looking to harden, you can be much more aggressive about privileges and 
namespace isolation.
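
To make the layering argument concrete, here is a toy model in Python.  
It is only a sketch: the Layer class, the "bypassed" flag and the 
example values are made up for illustration and do not correspond to 
any real qemu, SELinux or lxc interface.  The one thing it shows is 
that, under this model, the host falls only when every layer has 
failed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One line of defense around the guest (hypothetical model)."""
    name: str
    bypassed: bool  # assumption: True if the attacker has defeated this layer


def host_compromised(layers):
    """Return True only if every layer in the list has been bypassed."""
    return all(layer.bypassed for layer in layers)


def remaining_defenses(layers):
    """Return the names of the layers that still stand."""
    return [layer.name for layer in layers if not layer.bypassed]


# Illustrative scenario: the attacker broke out of the guest through a
# qemu bug and got past file permissions, but is still confined by
# namespaces and by the MAC policy.
layers = [
    Layer("qemu device emulation", bypassed=True),
    Layer("POSIX DAC (non-root qemu user)", bypassed=True),
    Layer("namespace isolation", bypassed=False),
    Layer("MAC policy (SELinux/AppArmor)", bypassed=False),
]

print(host_compromised(layers))
print(remaining_defenses(layers))
```

A real deployment is obviously not a list of booleans; the layers are 
not independent and a single kernel bug can take out several at once, 
which is the concern raised above.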

Regards,

Anthony Liguori