From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Theurer
Subject: Re: KVM performance vs. Xen
Date: Thu, 30 Apr 2009 08:44:15 -0500
Message-ID: <49F9AB2F.4020505@linux.vnet.ibm.com>
References: <49F8672E.5080507@linux.vnet.ibm.com>
 <49F967AE.4040905@redhat.com>
 <49F99E6A.3060404@linux.vnet.ibm.com>
 <49F9A160.3030609@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: kvm-devel
To: Avi Kivity
Return-path:
Received: from e39.co.us.ibm.com ([32.97.110.160]:42256 "EHLO e39.co.us.ibm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753656AbZD3NoS
 (ORCPT ); Thu, 30 Apr 2009 09:44:18 -0400
Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227])
 by e39.co.us.ibm.com (8.13.1/8.13.1) with ESMTP id n3UDeqFt024901
 for ; Thu, 30 Apr 2009 07:40:52 -0600
Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170])
 by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v9.2) with ESMTP id n3UDiI6h220406
 for ; Thu, 30 Apr 2009 07:44:18 -0600
Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1])
 by d03av04.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n3UDiIqk024721
 for ; Thu, 30 Apr 2009 07:44:18 -0600
In-Reply-To: <49F9A160.3030609@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

Avi Kivity wrote:
> Andrew Theurer wrote:
>> Avi Kivity wrote:
>>>>
>>>
>>> What's the typical I/O load (disk and network bandwidth) while the
>>> tests are running?
>> This is average throughput:
>> network: Tx: 79 MB/sec  Rx: 5 MB/sec
>
> MB as in Byte or Mb as in bit?

Byte. There are 4 x 1 Gb adapters, each handling about 20 MB/sec or
160 Mbit/sec.

>
>> disk: read: 17 MB/sec  write: 40 MB/sec
>
> This could definitely cause the extra load, especially if it's many
> small requests (compared to a few large ones).
I don't have the request sizes at my fingertips, but we have to use a lot
of disks to support this I/O, so I think it's safe to assume there are a
lot more requests than a single large sequential read/write would produce.

>
>>>> The host hardware:
>>>> A 2 socket, 8 core Nehalem with SMT, and EPT enabled, lots of
>>>> disks, 4 x 1 Gb Ethernet
>>>
>>> CPU time measurements with SMT can vary wildly if the system is not
>>> fully loaded. If the scheduler happens to schedule two threads on a
>>> single core, both of these threads will generate less work compared
>>> to if they were scheduled on different cores.
>> Understood. Even if, at low loads, the scheduler does the right thing
>> and spreads out to all the cores first, once it goes beyond 50% util,
>> the CPU util can climb at a much higher rate (compared to a linear
>> increase in work) because it then starts scheduling 2 threads per
>> core, and each thread can do less work. I have always wanted
>> something which could more accurately show the utilization of a
>> processor core, but I guess we have to use what we have today. I
>> will run again with SMT off to see what we get.
>
> On the other hand, without SMT you will get to overcommit much faster,
> so you'll have scheduling artifacts. Unfortunately there's no good
> answer here (except to improve the SMT scheduler).
>
>>> Yes, it is. If there is a lot of I/O, this might be due to the
>>> thread pool used for I/O.
>> I have an older patch which makes a small change to posix_aio_thread.c
>> by trying to keep the thread pool size a bit lower than it is today.
>> I will dust that off and see if it helps.
>
> Really, I think linux-aio support can help here.

Yes, I think that would work for real block devices, but would that help
for files? I am using real block devices right now, but it would be nice
to also see a benefit for files in a file system. Or maybe I am
misunderstanding this, and linux-aio can be used on files?
-Andrew

>
>>>
>>> Yes, there is a scheduler tracer, though I have no idea how to
>>> operate it.
>>>
>>> Do you have kvm_stat logs?
>> Sorry, I don't, but I'll run that next time. BTW, I did not notice a
>> batch/log mode the last time I ran kvm_stat. Or maybe it was not
>> obvious to me. Is there an ideal way to run kvm_stat without a
>> curses-like output?
>
> You're probably using an ancient version:
>
> $ kvm_stat --help
> Usage: kvm_stat [options]
>
> Options:
>   -h, --help            show this help message and exit
>   -1, --once, --batch   run in batch mode for one second
>   -l, --log             run in logging mode (like vmstat)
>   -f FIELDS, --fields=FIELDS
>                         fields to display (regex)
>
>