From: Andrew Theurer
Subject: Re: KVM performance vs. Xen
Date: Thu, 30 Apr 2009 07:49:46 -0500
Message-ID: <49F99E6A.3060404@linux.vnet.ibm.com>
References: <49F8672E.5080507@linux.vnet.ibm.com>
 <49F967AE.4040905@redhat.com>
In-Reply-To: <49F967AE.4040905@redhat.com>
To: Avi Kivity
Cc: kvm-devel

Avi Kivity wrote:
> Andrew Theurer wrote:
>> I wanted to share some performance data for KVM, especially compared
>> to Xen, using a more complex scenario: heterogeneous server
>> consolidation.
>>
>> The Workload:
>> The workload simulates a consolidation of servers onto a single
>> host. There are 3 server types: web, imap, and app (j2ee). In
>> addition, there are other "helper" servers which are also
>> consolidated: a db server, which helps out with the app server, and
>> an nfs server, which helps out with the web server (a portion of the
>> docroot is nfs mounted). There is also one other server that is
>> simply idle. All 6 servers make up one set. The first 3 server types
>> are sent requests, which in turn may send requests to the db and nfs
>> helper servers. The request rate is throttled to produce a fixed
>> amount of work. To increase utilization on the host, more sets of
>> these servers are used. The clients which send the requests also
>> have a response time requirement which is monitored. The following
>> results have passed the response time requirements.
>
> What's the typical I/O load (disk and network bandwidth) while the
> tests are running?

This is the average throughput:

network: Tx: 79 MB/sec, Rx: 5 MB/sec
disk: read: 17 MB/sec, write: 40 MB/sec

>> The host hardware:
>> A 2-socket, 8-core Nehalem with SMT and EPT enabled, lots of disks,
>> 4 x 1 Gb Ethernet
>
> CPU time measurements with SMT can vary wildly if the system is not
> fully loaded. If the scheduler happens to schedule two threads on a
> single core, both of these threads will generate less work compared
> to if they were scheduled on different cores.

Understood. Even if, at low loads, the scheduler does the right thing
and spreads out across all the cores first, once the system goes
beyond 50% utilization, CPU util can climb at a much higher rate
(compared to a linear increase in work) because the scheduler then
starts putting 2 threads on each core, and each thread can do less
work. I have always wanted something which could more accurately show
the utilization of a processor core, but I guess we have to use what
we have today. I will run again with SMT off to see what we get.
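In case it's useful, below is roughly what I have in mind for turning
SMT off for the re-run: offline the second sibling thread of each core
via sysfs. This is just an untested sketch (not something from this
thread), and it assumes the usual cpuN/topology/thread_siblings_list
and cpuN/online files are present:

/*
 * smt_off.c - hypothetical sketch: offline SMT sibling threads so
 * each core runs a single thread. Assumes sysfs exposes
 * /sys/devices/system/cpu/cpuN/topology/thread_siblings_list and
 * hotplug control at /sys/devices/system/cpu/cpuN/online.
 * Run as root; write "1" back to the online files to undo.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char path[128], buf[64];
    int cpu;

    for (cpu = 0; cpu < 4096; cpu++) {
        FILE *f;

        snprintf(path, sizeof(path), "/sys/devices/system/cpu/"
                 "cpu%d/topology/thread_siblings_list", cpu);
        f = fopen(path, "r");
        if (!f)
            break;              /* assume no more cpus */
        if (!fgets(buf, sizeof(buf), f)) {
            fclose(f);
            continue;
        }
        fclose(f);
        buf[strcspn(buf, "\n")] = '\0';
        /* The list looks like "0,8"; the first entry is the primary
         * thread of the core. Keep it, offline the others. */
        if (atoi(buf) == cpu)
            continue;
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/online", cpu);
        f = fopen(path, "w");
        if (!f) {
            perror(path);
            continue;
        }
        fputs("0\n", f);        /* take this sibling offline */
        fclose(f);
        printf("offlined cpu%d (siblings: %s)\n", cpu, buf);
    }
    return 0;
}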
>> Test Results:
>> The throughput is equal in these tests, as the clients throttle the
>> work (assuming you don't run out of a resource on the host). What's
>> telling is the CPU used to do the same amount of work:
>>
>> Xen: 52.85%
>> KVM: 66.93%
>>
>> So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount
>> of work. Here's the breakdown:
>>
>> total  user  nice  system  irq  softirq  guest
>> 66.90  7.20  0.00   12.94  0.35    3.39  43.02
>>
>> Comparing guest time to all other busy time, that's 23.88/43.02 =
>> 55% overhead for virtualization. I certainly don't expect it to be
>> 0, but 55% seems a bit high. So, what's the reason for this
>> overhead? At the bottom is oprofile output of the top functions for
>> KVM. Some observations:
>>
>> 1) I'm seeing about 2.3% in scheduler functions [that I recognize].
>> Does that seem a bit excessive?
>
> Yes, it is. If there is a lot of I/O, this might be due to the
> thread pool used for I/O.

I have an older patch which makes a small change to posix_aio_thread.c,
trying to keep the thread pool size a bit lower than it is today. I
will dust that off and see if it helps.

>> 2) cpu_physical_memory_rw due to not using preadv/pwritev?
>
> I think both virtio-net and virtio-blk use memcpy().
>
>> 3) vmx_[save|load]_host_state: I take it this is from guest
>> switches?
>
> These are called when you context-switch away from a guest and, much
> more frequently, when you enter qemu.
>
>> We have 180,000 context switches a second. Is this more than
>> expected?
>
> Way more. Across 16 logical cpus, this is >10,000 cs/sec/cpu.
>
>> I wonder if schedstats can show why we context switch (need to let
>> someone else run, yielded, waiting on io, etc).
>
> Yes, there is a scheduler tracer, though I have no idea how to
> operate it.
>
> Do you have kvm_stat logs?

Sorry, I don't, but I'll run that next time. BTW, I did not notice a
batch/log mode the last time I ran kvm_stat, or maybe it was just not
obvious to me. Is there an ideal way to run kvm_stat without the
curses-like output?

-Andrew
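P.S. In case there really is no batch mode, something like the sketch
below is roughly what I would try: KVM exposes its counters as flat
files under debugfs, so a logger just has to sample them and print
per-interval deltas. This is a hypothetical, untested sketch (not
taken from kvm_stat itself), and it assumes debugfs is mounted at
/sys/kernel/debug:

/*
 * kvmstat_log.c - hypothetical batch-mode kvm_stat: sample the
 * counter files KVM exposes under /sys/kernel/debug/kvm/ and print
 * one line of per-second deltas per interval.
 */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define KVM_DEBUGFS "/sys/kernel/debug/kvm"
#define MAX_STATS   64

static long long read_stat(const char *name)
{
    char path[256], buf[32];
    long long val = 0;
    FILE *f;

    snprintf(path, sizeof(path), KVM_DEBUGFS "/%s", name);
    f = fopen(path, "r");
    if (!f)
        return 0;
    if (fgets(buf, sizeof(buf), f))
        val = atoll(buf);
    fclose(f);
    return val;
}

int main(void)
{
    char names[MAX_STATS][64];
    long long prev[MAX_STATS];
    int i, n = 0;
    struct dirent *de;
    DIR *d = opendir(KVM_DEBUGFS);

    if (!d) {
        perror(KVM_DEBUGFS);    /* debugfs not mounted? */
        return 1;
    }
    while ((de = readdir(d)) && n < MAX_STATS) {
        if (de->d_name[0] == '.')
            continue;           /* skip . and .. */
        snprintf(names[n], sizeof(names[n]), "%s", de->d_name);
        n++;
    }
    closedir(d);

    for (i = 0; i < n; i++)     /* one header line with stat names */
        printf("%s ", names[i]);
    printf("\n");
    for (i = 0; i < n; i++)
        prev[i] = read_stat(names[i]);

    for (;;) {                  /* then one line of deltas per second */
        sleep(1);
        for (i = 0; i < n; i++) {
            long long cur = read_stat(names[i]);
            printf("%lld ", cur - prev[i]);
            prev[i] = cur;
        }
        printf("\n");
        fflush(stdout);
    }
    return 0;
}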