From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Lieven
Subject: Re: performance trouble
Date: Tue, 27 Mar 2012 18:16:11 +0200
Message-ID: <4F71E7CB.9000709@dlh.net>
References: <20120222163356.GE26955@nfs-rbx.ovh.net>
 <201203271744.09024.vrozenfe@redhat.com>
 <4F71E389.30701@dlh.net>
 <201203271812.12374.vrozenfe@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Gleb Natapov, David Cure, Avi Kivity, kvm@vger.kernel.org
To: Vadim Rozenfeld
Return-path:
Received: from ssl.dlh.net ([91.198.192.8]:58669 "EHLO ssl.dlh.net"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752115Ab2C0QQO (ORCPT ); Tue, 27 Mar 2012 12:16:14 -0400
In-Reply-To: <201203271812.12374.vrozenfe@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 27.03.2012 18:12, Vadim Rozenfeld wrote:
> On Tuesday, March 27, 2012 05:58:01 PM Peter Lieven wrote:
>> On 27.03.2012 17:44, Vadim Rozenfeld wrote:
>>> On Tuesday, March 27, 2012 04:06:13 PM Peter Lieven wrote:
>>>> On 27.03.2012 14:29, Gleb Natapov wrote:
>>>>> On Tue, Mar 27, 2012 at 02:28:04PM +0200, Peter Lieven wrote:
>>>>>> On 27.03.2012 14:26, Gleb Natapov wrote:
>>>>>>> On Tue, Mar 27, 2012 at 02:20:23PM +0200, Peter Lieven wrote:
>>>>>>>> On 27.03.2012 12:00, Gleb Natapov wrote:
>>>>>>>>> On Tue, Mar 27, 2012 at 11:26:29AM +0200, Peter Lieven wrote:
>>>>>>>>>> On 27.03.2012 11:23, Vadim Rozenfeld wrote:
>>>>>>>>>>> On Tuesday, March 27, 2012 10:56:05 AM Gleb Natapov wrote:
>>>>>>>>>>>> On Mon, Mar 26, 2012 at 10:11:43PM +0200, Vadim Rozenfeld wrote:
>>>>>>>>>>>>> On Monday, March 26, 2012 08:54:50 PM Peter Lieven wrote:
>>>>>>>>>>>>>> On 26.03.2012 20:36, Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>> On Monday, March 26, 2012 07:52:49 PM Gleb Natapov wrote:
>>>>>>>>>>>>>>>> On Mon, Mar 26, 2012 at 07:46:03PM +0200, Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>>>> On Monday, March 26, 2012 07:00:32 PM Peter Lieven wrote:
>>>>>>>>>>>>>>>>>> On 22.03.2012 10:38,
>>>>>>>>>>>>>>>>>> Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>>>>>> On Thursday, March 22, 2012 10:52:42 AM Peter Lieven wrote:
>>>>>>>>>>>>>>>>>>>> On 22.03.2012 09:48, Vadim Rozenfeld wrote:
>>>>>>>>>>>>>>>>>>>>> On Thursday, March 22, 2012 09:53:45 AM Gleb Natapov wrote:
>>>>>>>>>>>>>>>>>>>>>> On Wed, Mar 21, 2012 at 06:31:02PM +0100, Peter Lieven wrote:
>>>>>>>>>>>>>>>>>>>>>>> On 21.03.2012 12:10, David Cure wrote:
>>>>>>>>>>>>>>>>>>>>>>>> hello,
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Mar 20, 2012 at 02:38:22PM +0200, Gleb Natapov wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Try to add
>>>>>>>>>>>>>>>>>>>>>>>>> name='hypervisor'/> to cpu definition in XML and
>>>>>>>>>>>>>>>>>>>>>>>>> check command line.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> ok, I tried this, but I can't use
>>>>>>>>>>>>>>>>>>>>>>>> to map the host cpu
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> (my libvirt is 0.9.8) so I use:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Opteron_G3
>>>>>>>>>>>>>>>>>>>>>>>> name='hypervisor'/>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> (the physical server uses an Opteron CPU).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> The log is here:
>>>>>>>>>>>>>>>>>>>>>>>> http://www.roullier.net/Report/report-3.2-vhost-net-1vcpu-cpu.txt.gz
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> And now with only 1 vcpu, the response time is
>>>>>>>>>>>>>>>>>>>>>>>> 8.5s, a great improvement. We are keeping this
>>>>>>>>>>>>>>>>>>>>>>>> configuration for production: we will check the
>>>>>>>>>>>>>>>>>>>>>>>> response time when some other users are connected.
>>>>>>>>>>>>>>>>>>>>>>> please keep in mind, that setting -hypervisor,
>>>>>>>>>>>>>>>>>>>>>>> disabling hpet and only one vcpu
>>>>>>>>>>>>>>>>>>>>>>> makes windows use tsc as clocksource. you have to
>>>>>>>>>>>>>>>>>>>>>>> make sure, that your vm is not switching between
>>>>>>>>>>>>>>>>>>>>>>> physical sockets on your system and that you have
>>>>>>>>>>>>>>>>>>>>>>> constant_tsc feature to have a stable tsc between
>>>>>>>>>>>>>>>>>>>>>>> the cores in the same socket. it's also likely that
>>>>>>>>>>>>>>>>>>>>>>> the vm will crash when live migrated.
>>>>>>>>>>>>>>>>>>>>>> All true. I asked to try -hypervisor only to verify
>>>>>>>>>>>>>>>>>>>>>> where we lose performance. Since you get good results
>>>>>>>>>>>>>>>>>>>>>> with it, frequent access to the PM timer is probably
>>>>>>>>>>>>>>>>>>>>>> the reason. I do not recommend using -hypervisor for
>>>>>>>>>>>>>>>>>>>>>> production!
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> @gleb: do you know what's the state of in-kernel
>>>>>>>>>>>>>>>>>>>>>>> hyper-v timers?
>>>>>>>>>>>>>>>>>>>>>> Vadim is working on it. I'll let him answer.
>>>>>>>>>>>>>>>>>>>>> It would be nice to have synthetic timers supported.
>>>>>>>>>>>>>>>>>>>>> But, at the moment, I'm only researching this feature.
>>>>>>>>>>>>>>>>>>>> So it will take months at least?
>>>>>>>>>>>>>>>>>>> I would say weeks.
>>>>>>>>>>>>>>>>>> Is there a way we could contribute and help you with
>>>>>>>>>>>>>>>>>> this?
>>>>>>>>>>>>>>>>> Hi Peter,
>>>>>>>>>>>>>>>>> You are welcome to add an appropriate handler.
>>>>>>>>>>>>>>>> I think Vadim refers to this HV MSR
>>>>>>>>>>>>>>>> http://msdn.microsoft.com/en-us/library/windows/hardware/ff542633%28v=vs.85%29.aspx
>>>>>>>>>>>>>>> This one is pretty simple to support. Please see attachments
>>>>>>>>>>>>>>> for more details.
>>>>>>>>>>>>>>> I was thinking about synthetic timers
>>>>>>>>>>>>>>> http://msdn.microsoft.com/en-us/library/windows/hardware/ff542758(v=vs.85).aspx
>>>>>>>>>>>>>> is this what microsoft qpc uses as clocksource in hyper-v?
>>>>>>>>>>>>> Yes, it should be enough for Win7 / W2K8R2.
>>>>>>>>>>>> To clarify, the thing that microsoft qpc uses is what is
>>>>>>>>>>>> implemented by the patch Vadim attached to his previous email.
>>>>>>>>>>>> But I believe that an additional qemu patch is needed for
>>>>>>>>>>>> Windows to actually use it.
>>>>>>>>>>> You are right.
>>>>>>>>>>> bits 1 and 9 must be set to on in leaf 0x40000003 and HPET
>>>>>>>>>>> should be completely removed from ACPI.
>>>>>>>>>> could you advise how to do this and/or make a patch?
>>>>>>>>>>
>>>>>>>>>> the stuff you sent yesterday is for qemu, right? would
>>>>>>>>>> it be possible to use it in qemu-kvm also?
>>>>>>>>> No, they are for kernel.
>>>>>>>> i meant the qemu.diff file.
>>>>>>> Yes, I missed the second attachment.
>>>>>>>
>>>>>>>> if i understand correctly i have to pass -cpu host,+hv_refcnt to
>>>>>>>> qemu?
>>>>>>> Looks like it.
>>>>>> ok, so it would be interesting if it helps to avoid the pmtimer reads
>>>>>> we observed earlier. right?
>>>>> Yes.
>>>> first feedback: performance seems to be amazing. i cannot confirm that
>>>> it breaks hv_spinlocks, hv_vapic and hv_relaxed.
>>>> why did you assume this?
>>> I didn't mean that hv_refcnt will break any other hyper-v features.
>>> I just want to say that turning hv_refcnt on (as any other hv_ option)
>>> will crash Win8 on boot-up.
>> yes, i got it meanwhile ;-)
>>
>> let me know what you think should be done to further test
>> the refcnt implementation.
>>
>> i would suggest to return at least 0xFFFFFFFF if msr 0x40000021
>> is read.
> IIRC Win7(W2k8R2) only reads this MSR. Win8 reads and writes.

you mean win7 only writes, don't you? at least you put a break in
set_msr_hyperv for this msr.
i just thought that it would be ok to return the value that is defined
for the case where iTSC is not supported?

peter

>> peter
>>
>>> Cheers,
>>> Vadim.
>>>
>>>> no more pmtimer reads. i can now almost fully utilize a 1GBit
>>>> interface with a file transfer while there was not one
>>>> cpu core fully utilized as observed with pmtimer. some live migration
>>>> tests revealed that it did not crash even under load.
>>>>
>>>> @vadim: i think we need a proper patch for the others to test this ;-)
>>>>
>>>> what i observed: is it right, that HV_X64_MSR_TIME_REF_COUNT is missing
>>>> in msrs_to_save[] in x86/x86.c of the kernel module?
>>>>
>>>> thanks for your help,
>>>> peter
>>>>
>>>>> --
>>>>>
>>>>> Gleb.
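
For readers following the thread, the pieces discussed in it (bits 1 and 9
of CPUID leaf 0x40000003, the reference counter MSR, and the proposal to
return 0xFFFFFFFF when MSR 0x40000021 is read) can be sketched in C roughly
as below. This is only an illustration and not the code from Vadim's
attachments. The helper names and the HV_FEATURE_* macro names are made up
for this sketch. The MSR number 0x40000020 for HV_X64_MSR_TIME_REF_COUNT and
the 100 ns counter unit come from the Hyper-V specification rather than from
this thread, and the 0xFFFFFFFF return value is Peter's suggestion, not
confirmed behaviour.

```c
#include <stdint.h>

/* MSR numbers: 0x40000021 is quoted in the thread, 0x40000020 is from the spec. */
#define HV_X64_MSR_TIME_REF_COUNT 0x40000020u
#define HV_X64_MSR_REFERENCE_TSC  0x40000021u

/* Bits 1 and 9 of CPUID leaf 0x40000003 (EAX); macro names are hypothetical. */
#define HV_FEATURE_TIME_REF_COUNT (1u << 1)
#define HV_FEATURE_REFERENCE_TSC  (1u << 9)

/* Return the leaf 0x40000003 EAX value with both feature bits switched on. */
static uint32_t hv_leaf3_eax(uint32_t base)
{
    return base | HV_FEATURE_TIME_REF_COUNT | HV_FEATURE_REFERENCE_TSC;
}

/* The reference counter ticks in 100 ns units since the guest was started. */
static uint64_t hv_ref_count(uint64_t now_ns, uint64_t vm_start_ns)
{
    return (now_ns - vm_start_ns) / 100;
}

/*
 * Hypothetical MSR read handler: returns 0 and fills *val for the two
 * MSRs discussed in the thread, -1 for anything else.
 */
static int hv_read_msr(uint32_t msr, uint64_t now_ns, uint64_t vm_start_ns,
                       uint64_t *val)
{
    switch (msr) {
    case HV_X64_MSR_TIME_REF_COUNT:
        *val = hv_ref_count(now_ns, vm_start_ns);
        return 0;
    case HV_X64_MSR_REFERENCE_TSC:
        /* Peter's proposal: report "not supported" instead of failing. */
        *val = 0xFFFFFFFFu;
        return 0;
    default:
        return -1;
    }
}
```

With a guest started at 500 ns and the clock at 1500 ns, reading
HV_X64_MSR_TIME_REF_COUNT through this sketch yields 10 ticks, since
1000 ns elapsed and each tick is 100 ns.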