From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Lieven
Subject: Re: performance trouble
Date: Tue, 27 Mar 2012 16:06:13 +0200
Message-ID: <4F71C955.8030002@dlh.net>
References: <20120222163356.GE26955@nfs-rbx.ovh.net>
 <201203262211.44284.vrozenfe@redhat.com>
 <20120327085604.GQ22368@redhat.com>
 <201203271123.33524.vrozenfe@redhat.com>
 <4F7187C5.4080607@dlh.net>
 <20120327100034.GT22368@redhat.com>
 <4F71B087.8060008@dlh.net>
 <20120327122645.GW22368@redhat.com>
 <4F71B254.807@dlh.net>
 <20120327122902.GX22368@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Vadim Rozenfeld, David Cure, Avi Kivity, kvm@vger.kernel.org
To: Gleb Natapov
Return-path:
Received: from ssl.dlh.net ([91.198.192.8]:56789 "EHLO ssl.dlh.net"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1753046Ab2C0OGQ (ORCPT ); Tue, 27 Mar 2012 10:06:16 -0400
In-Reply-To: <20120327122902.GX22368@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 27.03.2012 14:29, Gleb Natapov wrote:
> On Tue, Mar 27, 2012 at 02:28:04PM +0200, Peter Lieven wrote:
>> On 27.03.2012 14:26, Gleb Natapov wrote:
>>> On Tue, Mar 27, 2012 at 02:20:23PM +0200, Peter Lieven wrote:
>>>> On 27.03.2012 12:00, Gleb Natapov wrote:
>>>>> On Tue, Mar 27, 2012 at 11:26:29AM +0200, Peter Lieven wrote:
>>>>>> On 27.03.2012 11:23, Vadim Rozenfeld wrote:
>>>>>>> On Tuesday, March 27, 2012 10:56:05 AM Gleb Natapov wrote:
>>>>>>>> On Mon, Mar 26, 2012 at 10:11:43PM +0200, Vadim Rozenfeld wrote:
>>>>>>>>> On Monday, March 26, 2012 08:54:50 PM Peter Lieven wrote:
>>>>>>>>>> On 26.03.2012 20:36, Vadim Rozenfeld wrote:
>>>>>>>>>>> On Monday, March 26, 2012 07:52:49 PM Gleb Natapov wrote:
>>>>>>>>>>>> On Mon, Mar 26, 2012 at 07:46:03PM +0200, Vadim Rozenfeld wrote:
>>>>>>>>>>>>> On Monday, March 26, 2012 07:00:32 PM Peter Lieven wrote:
>>>>>>>>>>>>>> On 22.03.2012 10:38, Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>> On Thursday, March 22, 2012 10:52:42 AM Peter Lieven wrote:
>>>>>>>>>>>>>>>> On 22.03.2012 09:48, Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>>>> On Thursday, March 22, 2012 09:53:45 AM Gleb Natapov wrote:
>>>>>>>>>>>>>>>>>> On Wed, Mar 21, 2012 at 06:31:02PM +0100, Peter Lieven wrote:
>>>>>>>>>>>>>>>>>>> On 21.03.2012 12:10, David Cure wrote:
>>>>>>>>>>>>>>>>>>>> hello,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Le Tue, Mar 20, 2012 at 02:38:22PM +0200, Gleb Natapov ecrivait :
>>>>>>>>>>>>>>>>>>>>> Try to add
>>>>>>>>>>>>>>>>>>>>> to cpu definition in XML and check command line.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ok I try this but I can't use to map the host cpu
>>>>>>>>>>>>>>>>>>>> (my libvirt is 0.9.8) so I use:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Opteron_G3
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> (the physical server uses an Opteron CPU).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The log is here:
>>>>>>>>>>>>>>>>>>>> http://www.roullier.net/Report/report-3.2-vhost-net-1vcpu-cpu.txt.gz
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> And now, with only 1 vcpu, the response time is 8.5s, a great
>>>>>>>>>>>>>>>>>>>> improvement. We keep this configuration for production: we'll
>>>>>>>>>>>>>>>>>>>> check the response time when some other users are connected.
>>>>>>>>>>>>>>>>>>> please keep in mind that setting -hypervisor, disabling hpet,
>>>>>>>>>>>>>>>>>>> and using only one vcpu makes windows use the tsc as its
>>>>>>>>>>>>>>>>>>> clocksource. you have to make sure that your vm is not
>>>>>>>>>>>>>>>>>>> switching between physical sockets on your system and that
>>>>>>>>>>>>>>>>>>> you have the constant_tsc feature, so the tsc is stable
>>>>>>>>>>>>>>>>>>> between the cores in the same socket. it is also likely that
>>>>>>>>>>>>>>>>>>> the vm will crash when live migrated.
>>>>>>>>>>>>>>>>>> All true. I asked to try -hypervisor only to verify where we
>>>>>>>>>>>>>>>>>> lose performance. Since you get good results with it, frequent
>>>>>>>>>>>>>>>>>> access to the PM timer is probably the reason. I do not
>>>>>>>>>>>>>>>>>> recommend using -hypervisor for production!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> @gleb: do you know what's the state of in-kernel hyper-v
>>>>>>>>>>>>>>>>>>> timers?
>>>>>>>>>>>>>>>>>> Vadim is working on it. I'll let him answer.
>>>>>>>>>>>>>>>>> It would be nice to have synthetic timers supported. But, at
>>>>>>>>>>>>>>>>> the moment, I'm only researching this feature.
>>>>>>>>>>>>>>>> So it will take months at least?
>>>>>>>>>>>>>>> I would say weeks.
>>>>>>>>>>>>>> Is there a way we could contribute and help you with this?
>>>>>>>>>>>>> Hi Peter,
>>>>>>>>>>>>> You are welcome to add an appropriate handler.
>>>>>>>>>>>> I think Vadim refers to this HV MSR:
>>>>>>>>>>>> http://msdn.microsoft.com/en-us/library/windows/hardware/ff542633%28v=vs.85%29.aspx
>>>>>>>>>>> This one is pretty simple to support. Please see the attachments
>>>>>>>>>>> for more details. I was thinking about synthetic timers:
>>>>>>>>>>> http://msdn.microsoft.com/en-us/library/windows/hardware/ff542758(v=vs.85).aspx
>>>>>>>>>> is this what microsoft qpc uses as clocksource under hyper-v?
>>>>>>>>> Yes, it should be enough for Win7 / W2K8R2.
>>>>>>>> To clarify: the thing that microsoft qpc uses is what is implemented
>>>>>>>> by the patch Vadim attached to his previous email. But I believe an
>>>>>>>> additional qemu patch is needed for Windows to actually use it.
>>>>>>> You are right.
>>>>>>> bits 1 and 9 must be set in leaf 0x40000003, and HPET
>>>>>>> should be completely removed from ACPI.
>>>>>> could you advise how to do this and/or make a patch?
>>>>>>
>>>>>> the stuff you sent yesterday is for qemu, right? would
>>>>>> it be possible to use it in qemu-kvm also?
>>>>>>
>>>>> No, those are for the kernel.
>>>> i meant the qemu.diff file.
>>>>
>>> Yes, I missed the second attachment.
>>>
>>>> if i understand correctly, i have to pass -cpu host,+hv_refcnt to qemu?
>>>>
>>> Looks like it.
>> ok, so it would be interesting to see if it helps to avoid the pmtimer
>> reads we observed earlier, right?
>>
> Yes.

first feedback: performance seems to be amazing. i cannot confirm that it
breaks hv_spinlocks, hv_vapic or hv_relaxed. why did you assume this?

no more pmtimer reads. i can now almost fully utilize a 1 GBit interface
with a file transfer, while not one cpu core is fully utilized, as was
observed with pmtimer.

some live migration tests revealed that it did not crash, even under load.

@vadim: i think we need a proper patch for the others to test this ;-)

one thing i observed: is it right that HV_X64_MSR_TIME_REF_COUNT is missing
from msrs_to_save[] in x86/x86.c of the kernel module? (more on this in the
PS below.)

thanks for your help,
peter

> --
> Gleb.
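
PS: to make this a bit more concrete for anyone who wants to experiment
before Vadim posts a proper patch, here is my understanding of the pieces.
on the kernel side, the reference counter is just a read-only partition-wide
MSR that reports the time in 100ns units since the vm (partition) was
created. a minimal read handler in get_msr_hyperv_pw() in
arch/x86/kvm/x86.c could look roughly like the following. this is an
untested sketch on my side, not necessarily what Vadim's attachment does:

	case HV_X64_MSR_TIME_REF_COUNT: {
		u64 now_ns;
		/* the reference counter counts 100ns units since
		 * partition (vm) creation */
		local_irq_disable();
		now_ns = get_kernel_ns();
		data = div_u64(now_ns + kvm->arch.kvmclock_offset, 100);
		local_irq_enable();
		break;
	}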
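
on the qemu side, +hv_refcnt then presumably has to advertise the counter
to the guest, which (going by Vadim's remark above) means setting bits 1
and 9 in EAX of hypervisor cpuid leaf 0x40000003. again just a sketch in
the style of the existing hyperv code in target-i386/kvm.c; the option
check is a made-up name:

	/* leaf 0x40000003 (hyper-v feature identification), EAX:
	 *   bit 1: partition reference counter MSR (HV_X64_MSR_TIME_REF_COUNT)
	 *   bit 9: partition reference TSC MSR
	 */
	if (hyperv_refcnt_enabled()) {	/* hypothetical option check */
		c->eax |= (1 << 1) | (1 << 9);
	}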
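
and regarding my msrs_to_save[] question: what i mean is simply listing the
new MSR there so that userspace saves and restores it across live migration,
e.g. something like the below. the surrounding entries differ between
kernel versions, and whether a read-only counter belongs in this list at
all is exactly my question:

	static u32 msrs_to_save[] = {
		MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
		MSR_STAR,
	#ifdef CONFIG_X86_64
		MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
	#endif
		MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
		HV_X64_MSR_TIME_REF_COUNT	/* proposed addition */
	};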