From mboxrd@z Thu Jan  1 00:00:00 1970
From: Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: KVM usability
Date: Mon, 01 Mar 2010 20:34:26 -0600
Message-ID: <4B8C7932.4010008@codemonkey.ws>
References: <20100226111734.GE7463@elte.hu> <4B8813F2.8090208@redhat.com> <20100227105643.GA17425@elte.hu> <4B893B2B.40301@redhat.com> <20100227172546.GA31472@elte.hu> <4B8BEFC7.2040000@redhat.com> <20100301174106.GB2362@ghostprotocols.net> <4B8C0778.8050908@redhat.com> <20100301205620.GA26151@elte.hu> <4B8C3562.30900@codemonkey.ws> <20100302003036.GA1654@elte.hu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Zachary Amsden <zamsden@redhat.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Avi Kivity <avi@redhat.com>,
	"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>, ming.m.lin@intel.com,
	sheng.yang@intel.com, Jes Sorensen <Jes.Sorensen@redhat.com>,
	KVM General <kvm@vger.kernel.org>,
	Gleb Natapov <gleb@redhat.com>,
	Frédéric Weisbecker <fweisbec@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Arjan van de Ven <arjan@infradead.org>
To: Ingo Molnar <mingo@elte.hu>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-yw0-f197.google.com ([209.85.211.197]:39256 "EHLO
	mail-yw0-f197.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753176Ab0CBCeb (ORCPT <rfc822;kvm@vger.kernel.org>);
	Mon, 1 Mar 2010 21:34:31 -0500
Received: by ywh35 with SMTP id 35so1519823ywh.4
        for <kvm@vger.kernel.org>; Mon, 01 Mar 2010 18:34:30 -0800 (PST)
In-Reply-To: <20100302003036.GA1654@elte.hu>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 03/01/2010 06:30 PM, Ingo Molnar wrote:
> IMO that's a bug, not a feature. There should be a lot more interaction
> between kvm-qemu and KVM: for example Qemu should have a feature to install
> paravirt drivers in the guest, this would be helped by living in the kernel
> repo.
>    

Not in the slightest bit.

To support automatically installing paravirt drivers in a guest, we need 
to distribute an ISO containing *binary* versions of drivers.  For 
Windows, there's a licensing issue that I described earlier with respect 
to signing.  Figuring out distribution is non-trivial and is being 
worked on.  So far, Red Hat are the only ones actually capable of 
producing signed binaries (no mere mortal can do it).  For Linux 
drivers, we need to be able to ship different versions of the kernel 
drivers for different distribution kernels if we don't want to rely on 
what they ship.

The way we've tackled this in the past is by having an awk script that 
automatically converts the virtio drivers into something buildable 
across kernel versions.  It's incredibly difficult to maintain and we 
stopped maintaining it about a year ago when virtio drivers became 
common in all distro kernels.  See 
http://git.kernel.org/?p=virt/kvm/kvm-guest-drivers-linux.git if you're 
interested.

What would make this much easier for us is if we could add all of the 
#ifdef's for various kernel versions in the mainline source tree.  I'm 
not holding my breath for that though :-)

But once we had an ISO with binary drivers (and such a thing is 
available for Windows today), it's just a matter of adding an option to 
change the CDROM to the shipped ISO.  This is purely within qemu and 
doesn't touch kvm.ko at all.
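
As a rough illustration of how small the qemu-side change is, the sketch below builds the human-monitor `change` command that swaps the CD-ROM medium. It assumes the guest's drive has the default PC machine name `ide1-cd0`; the ISO path and the helper name are hypothetical, not something qemu ships.

```python
# Hypothetical location of the driver ISO; not a path qemu actually installs.
DRIVER_ISO = "/usr/share/qemu/guest-drivers.iso"


def change_cdrom_command(iso_path, device="ide1-cd0"):
    """Return the monitor line that points a CD-ROM device at a new image.

    The default device name is an assumption about the guest configuration.
    """
    # A monitor command is a single line, so reject empty or multi-line paths.
    if not iso_path or "\n" in iso_path or "\r" in iso_path:
        raise ValueError("ISO path must be a non-empty single line")
    return "change {} {}".format(device, iso_path)


print(change_cdrom_command(DRIVER_ISO))
```

A management tool would write the resulting line to the qemu monitor; nothing in it involves kvm.ko.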

Once binary hosting for the winpv drivers is sorted out, virt-manager will 
have this feature.  There are zero changes required to the kvm kernel 
code to support this.

>>      
>>>   - It's released together with the kernel, which gives a periodic 3 months
>>>     release frequency. Not too slow, not too fast.
>>>        
>> qemu releases range in length from 3-6 months depending on
>> distribution schedules.  They are very regular.
>>      
> The Linux kernel is released every 3 months, +- one week. Our experience is
> that even 6 months would be (way) too painful for distros.
>    

I expect that we'll eventually even out to a consistent release 
schedule. For now, we're still trying to see what fits us best.  The 
last 3 month release was very compressed so we're trying something a 
little longer this time.

>>>   - Code quality requirements are that of the kernel's. No muck allowed and
>>>     it's not hard to explain what kind of code is preferred.
>>>        
>> Code quality is subjective.  We have a different coding style.
>>      
> That's somewhat of a problem when for example a KVM kernel-space developer
> crosses into Qemu code and back. Two separate styles, etc. I certainly
> remember a 'culture clash' when going from the kernel into Qemu and back.
> Different principles, different culture. It's better to standardize.
>    

Some would argue that having diversity of culture is a good thing that 
breeds creative thinking :-)

It's annoying to switch coding styles but I don't think it's a major 
problem for anyone.

>>>   - Tool breakage bisection is a breeze: there's never any mismatch between
>>>     tools/perf and the kernel counterpart. With a separate package we'd
>>>     have more complex test and bisection scenarios.
>>>        
>> KVM has a backwards compatible ABI so there's no such thing as mismatch
>> between user and kernel space.
>>      
> perf too is ABI compatible (between releases) - still bisection is a lot
> easier because the evolution of a particular feature can be bisected back to.
>
> Btw., KVM certainly had ABI breakages around 2.6.16(?) when it was added, even
> of released versions.

That was a one-time thing in the very early days of KVM.

>   Also, within a development version you sure sometimes
> iterate a new ABI component, right?

That hasn't really happened.  We introduce new ABIs very rarely.  KVM has a 
well-defined purpose: it provides CPU virtualization.  We only extend 
the ABI to support new CPU features that we didn't previously support 
and since these things are defined by the Intel architecture, it's 
fairly easy to define the ABI properly up front.
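
A minimal sketch of how userspace can rely on that kind of additive ABI: check the stable API version once, then probe for individual extensions before using them. The ioctl and capability numbers are taken from <linux/kvm.h> and assume the x86 ioctl encoding; `probe_kvm` and the choice of capabilities are illustrative only.

```python
import fcntl

# ioctl numbers from <linux/kvm.h>. _IO(type, nr) reduces to (type << 8) | nr
# on x86; other architectures may encode the direction bits differently.
KVMIO = 0xAE
KVM_GET_API_VERSION = (KVMIO << 8) | 0x00
KVM_CHECK_EXTENSION = (KVMIO << 8) | 0x03
KVM_API_VERSION = 12  # the stable API version

# Two capability numbers from <linux/kvm.h>, chosen only as examples.
CAPS = {"KVM_CAP_USER_MEMORY": 3, "KVM_CAP_SET_TSS_ADDR": 4}


def probe_kvm(fd, ioctl=fcntl.ioctl):
    """Return {capability name: supported?} for an open /dev/kvm descriptor.

    The ioctl callable is a parameter so the logic can be exercised without
    a real /dev/kvm.
    """
    version = ioctl(fd, KVM_GET_API_VERSION, 0)
    if version != KVM_API_VERSION:
        raise RuntimeError("unexpected KVM API version: {}".format(version))
    # KVM_CHECK_EXTENSION returns 0 for unsupported, positive for supported.
    return {name: ioctl(fd, KVM_CHECK_EXTENSION, cap) > 0
            for name, cap in CAPS.items()}
```

On a KVM host this could be called as `probe_kvm(os.open("/dev/kvm", os.O_RDWR))`. An older userspace simply never asks about capabilities it does not know, which is why additions need not break it.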

>   With a time-coherent repository both
> intentional and unintentional breakages and variations can be bisected back to
> as they happened.
>
> This is an unconditional advantage and i made use of it numerous times.
>    

We used to keep the kernel code in the same repository as the userspace 
code.  We stopped doing that about a year ago and it's rare that we have 
a circumstance where joint bisecting is required.

>> You should try it.  I think you'll find that it's not as obvious a thing to
>> do as you think it is.
>>      
> A few years ago I looked into cleaning up Qemu, when i hacked KVM and Qemu. I
> also wanted to have a 'qemu light', which is both smaller and cleaner, and
> still fits to KVM. It didn't look particularly hard back then - but it's
> certainly not zero amount of work.
>    

First impressions are deceptive.  My long term goal for qemu is to get 
to a point where the device models live independently of the rest of 
qemu.  I think it's reasonable to split these devices into a modular 
library that can then be used by other applications.

That would make it possible to create a kvm-specific virtualization tool 
that only supported tap and linux-aio and the bare minimum number of 
devices.  It would be easy to look at and for kernel hackers to play with.

But to be honest, it would never replace qemu.  Once you add a VNC/Spice 
server (you need remote connectivity), support for sparse file formats 
(because we can't wait forever for btrfs to solve all of our problems), 
live migration and snapshotting (required tick marks for 
virtualization), a management layer, and all of the other bells and 
whistles, you'll find that you did an awful lot of work to recreate what 
qemu does.

Most people that have gone down this road believe that it's more 
efficient to just improve qemu's quality than it is to try and replicate 
it.  So far, we've been pretty successful IMHO.

Regards,

Anthony Liguori

> Cleanups pay - they make a piece of code both more hackable, more debuggable
> and more appealing to new developers. (i suspect you have no argument with
> that notion) Also note that it wasn't me who suggested that Qemu wouldn't fit
> the kernel standards as-is - it was raised by others in this discussion.
>
> 	Ingo
>