From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity <avi@qumranet.com>
Subject: Re: [patch 00/13] RFC: split the global mutex
Date: Sun, 20 Apr 2008 14:16:52 +0300
Message-ID: <480B2624.9040805@qumranet.com>
References: <20080417201021.515148882@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm-devel <kvm-devel@lists.sourceforge.net>
To: Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <kvm-devel-bounces@lists.sourceforge.net>
In-Reply-To: <20080417201021.515148882@localhost.localdomain>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/kvm-devel>,
	<mailto:kvm-devel-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=kvm-devel>
List-Post: <mailto:kvm-devel@lists.sourceforge.net>
List-Help: <mailto:kvm-devel-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/kvm-devel>,
	<mailto:kvm-devel-request@lists.sourceforge.net?subject=subscribe>
Sender: kvm-devel-bounces@lists.sourceforge.net
Errors-To: kvm-devel-bounces@lists.sourceforge.net
List-Id: kvm.vger.kernel.org

Marcelo Tosatti wrote:
> Introduce QEMUDevice, making the ioport/iomem->device relationship visible. 
>
> At the moment it only contains a lock, but could be extended.
>
> With it the following is possible:
>     - vcpu's to read/write via ioports/iomem while the iothread is working on 
>       some unrelated device, or just copying data from the kernel.
>     - vcpu's to read/write via ioports/iomem to different devices simultaneously.
>
> This patchset is only a proof of concept kind of thing, so only serial+raw image
> are supported. 
>
> Tried two benchmarks, iperf and tiobench. With tiobench the reported latency is 
> significantly lower (20%+), but throughput with IDE is only slightly higher. 
>
> Expect to see larger improvements with a higher performing IO scheme (SCSI still buggy,
> looking at it).
>
> The iperf numbers are pretty good. Performance of UP guests increases slightly, but the
> SMP improvement is quite significant.
>
>   


I expect you're seeing contention induced by memcpy()s and inefficient 
emulation.  With the DMA API, I expect the benefit will drop.


> Note that workloads with multiple busy devices (such as databases, web servers) should
> be the real winners.
>
> What is the feeling on this? It's not _that_ intrusive and can be easily NOP'ed out for
> QEMU.
>
>   

I think many parts are missing (or maybe I missed them).  You need to 
lock the QEMU internals (there are many read-mostly QEMU caches 
scattered around the code), lock against hotplug, etc.  For pure CPU 
emulation, there is a ton of work to be done: protecting the translator 
as well as making the translated code SMP-safe.

I think that QEMUDevice makes sense, and that we want this long term, 
but we first need to improve efficiency (which reduces CPU utilization 
_and_ improves scalability) rather than look at scalability alone (which 
is much harder, and has the drawback of not reducing CPU utilization).
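
To make the per-device locking idea concrete, here is a minimal sketch 
of what I understand the proposal to be: each ioport handler remembers 
its owning device, and the dispatch path takes only that device's lock 
instead of the global mutex.  Apart from QEMUDevice itself, every name 
below (IOPortHandler, qemu_device_init, ioport_read, toy_read) is 
hypothetical and not taken from the patchset, and pthread mutexes are 
an assumption about the locking primitive.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Stand-in for the patchset's QEMUDevice: for now it only holds a lock. */
typedef struct QEMUDevice {
    pthread_mutex_t lock;
} QEMUDevice;

/* Hypothetical read callback type for an ioport handler. */
typedef uint32_t (*IOPortReadFunc)(void *opaque, uint32_t addr);

/* Hypothetical handler entry: ties a port callback to its owning device. */
typedef struct IOPortHandler {
    QEMUDevice *dev;
    IOPortReadFunc read;
    void *opaque;
} IOPortHandler;

static void qemu_device_init(QEMUDevice *dev)
{
    assert(dev != NULL);
    pthread_mutex_init(&dev->lock, NULL);
}

/* Dispatch a port read while holding only the owning device's lock,
 * so accesses to unrelated devices do not serialize on each other. */
static uint32_t ioport_read(IOPortHandler *h, uint32_t addr)
{
    uint32_t val;

    pthread_mutex_lock(&h->dev->lock);
    val = h->read(h->opaque, addr);
    pthread_mutex_unlock(&h->dev->lock);
    return val;
}

/* Toy device model: a single register returned on every read. */
static uint32_t toy_read(void *opaque, uint32_t addr)
{
    (void)addr;
    return *(uint32_t *)opaque;
}
```

Note that this covers only the device's own state.  Anything the 
handler touches outside the device (shared caches, hotplug state, the 
translator) is not protected by this lock and would still need its own 
locking.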


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

