From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4E285492.1070006@codemonkey.ws>
Date: Thu, 21 Jul 2011 11:32:18 -0500
From: Anthony Liguori
MIME-Version: 1.0
References: <4E259F6E.8000204@us.ibm.com> <4E2824D2.2050401@redhat.com> <4E2827A2.6010603@us.ibm.com> <4E282BE3.1050404@redhat.com> <4E283C90.8010806@us.ibm.com> <4E283FFE.6090201@redhat.com> <4E28497C.5010801@us.ibm.com> <4E284C74.2010708@redhat.com>
In-Reply-To: <4E284C74.2010708@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC] QEMU Object Model
To: Avi Kivity
Cc: Anthony Liguori, qemu-devel, Markus Armbruster

On 07/21/2011 10:57 AM, Avi Kivity wrote:
> On 07/21/2011 06:45 PM, Anthony Liguori wrote:
>>>> See git://git.codemonkey.ws/kvm++.git
>>>
>>> Connection refused..
>>
>> Sorry, stupid EC2.  Try http://git.codemonkey.ws/git/kvm++.git
>
> You don't have permission to access /git/kvm++.git/ on this server.

git clone http://git.codemonkey.ws/git/kvm++.git

>>> Yes, that's a big worry.  Is all of that exposed to the implementer?
>>> Or just in the framework?
>>
>> I was able to get it mostly hidden but not entirely.
>> The biggest problem is that even with a semi-normal looking function,
>> a simple mistake like passing in the wrong number of arguments can
>> result in an undecipherable error message.  It's not bad to debug if
>> you're used to it, but very difficult to just slip into a C project.
>
> I have to agree with this.  You can write neat domain specific
> languages in C++ but the nice abstractions break the minute you
> misplace a comma.

Yeah, I love C++, but am empathetic toward people who don't share my
affection.

>>>> Yup.  That's the price you pay for using C over C++.  But the
>>>> ratio is only true for simple types.  For complex types (look at
>>>> chrdrv.h), the ratio is much lower.
>>>
>>> chrdrv.c is almost 100% boilerplate.
>>
>> Sure, but so is qemu-char.c and block.c.  It's not really different
>> from what we have today.
>
> I'm trying to get us away from boilerplate, to actually writing the
> code that matters.

Yeah, I'm with you, but if you figure out how to get all of the
performance of C/C++, with nice understandable error messages, while
avoiding writing any unnecessary code, then I'm sure you'll be
receiving the Turing award 20 years from now :-)

> But it's a big problem.  Those conversions suck the life force out of
> whoever's doing them, and then he comes back as a zombie to haunt us
> at the next KVM Forum.

It's not just type safety at stake here.  There are many problems to
solve and I think we have to attack them systematically.

The problem I'm trying to solve here is that we're duplicating the same
infrastructure over and over again.  Namely, we're inventing object
models at every opportunity.  qdev, BDS, VLANClientState,
CharDriverState, FsDev, DisplayState, etc.  The madness has to stop.

Just as we're now realizing that we need to do dramatic things in the
block layer to make -blockdev work, I'm sure we're going to realize
that we want to do PCI hotplug of virtio-serial and therefore we need
to do dynamic creation/destruction of character devices.
We need to come up with a single way to make this work for everything.

Reducing boilerplate code is yet another problem, and I'll argue a much
less critical one.  I'm all for solving it, but Rome wasn't built in a
day.

>>> The big problem with this plan is that steps 2 and 4 are very
>>> difficult and yet we see no gains while it's being done.
>>
>> So at least with the chardev conversion, we'll get the ability to
>> dynamically add character devices and change the settings of a
>> chardev after start up.  Both of these are important features.
>>
>> I think done properly, it all can have incremental value.  I worry
>> about converting the device model.
>>
>>> A much smaller job (the memory API conversion) is turning out to be
>>> no fun at all.
>>
>> Yeah, I don't know how to fix that.  This is why I'm starting with
>> the backends.  They're much smaller than the device model.
>
> Ok.  Let's hope you've hit on the least-bad tradeoff.  I'm sceptical,
> but I don't have anything concrete to offer.

I've got lots of ideas, but want to focus on backends first because
that's the biggest bang for the buck right now.

>>>> I don't ever see the device model being written in something other
>>>> than C/C++ too.  Having a GC in the VCPU thread would be a major
>>>> issue IMHO.
>>>
>>> We get the equivalent of GC every time a vcpu thread is scheduled
>>> out so an iothread or an unrelated process can run.
>>
>> But you can control this with pinning, priorities, etc.  You cannot
>> with GC.
>>
>> And in quite a lot of systems, GC pauses are very, very long.
>
> I expect the number of objects we'll have will be very small (O(info
> qdm | wc -l)).  They'll also be very long lived.  Won't that make the
> GC rather fast?

It really depends on the language and the JIT implementation.  Small
objects that are short lived can lead to the need to do compaction.
Some JITs do compaction in such a way that there is a periodic very
long pause (even up to 10ms).
>>> It will hurt hard realtime guests, but these are only interesting
>>> to a small subset of users.
>>>
>>> I think that if we can get the data path to run in pure C, and have
>>> the GC'd HLL involved only when the device model is reconfigured,
>>> then we have an acceptable tradeoff.  I don't claim that I know how
>>> to do this, though.  This is a really hard problem, mostly due to
>>> the huge size of qemu.
>>
>> This is one very good thing about having a common object model
>> that's pluggable (which is what QOM is all about).  It becomes
>> relatively easy to build with CONFIG_PCI=n, then build a shared
>> library that implements a new PCI layer in your favorite HLL so that
>> you can experiment.
>
> If I wanted a new PCI layer I'd write one outside of qemu.  Getting
> PC emulation is probably easier than converting all of qemu - see
> tools/kvm (though they're not doing full emulation, just the subset
> needed to get Linux going).  I want the old layer, it has a lot of
> knowledge sweated into it and it's very compatible with itself and
> with the guests it has booted.

What I meant is that if you wanted to port the PCI layer to another
language, you could do it because there are well defined boundaries
that can be bound to dynamic languages.

It boils down to the fact that there are standard ways to do things
like invoke methods on objects, introspect properties, etc.  That
provides the ability to do incremental integration of HLLs.

Regards,

Anthony Liguori