From: Hailiang Zhang
Subject: Re: 答复: [PATCH 0/5] mc146818rtc: fix Windows VM clock faster
Date: Wed, 19 Apr 2017 18:41:43 +0800
Message-ID: <58F73EE7.1030607@huawei.com>
In-Reply-To: <402d04f4-2baf-6bd0-57d6-fc5890ea603a@gmail.com>
References: <20170412095111.11728-1-xiaoguangrong@tencent.com>
 <418f9ca3-09a9-e40c-e15b-9498036f0b12@gmail.com>
 <58EF4512.1060004@huawei.com>
 <58EF4732.3050006@huawei.com>
 <402d04f4-2baf-6bd0-57d6-fc5890ea603a@gmail.com>
To: Xiao Guangrong, Xiao Guangrong, Paolo Bonzini, "mst@redhat.com",
 "mtosatti@redhat.com"
Cc: "qemu-devel@nongnu.org", "kvm@vger.kernel.org",
 "yunfangtai@tencent.com", Xiao Guangrong

On 2017/4/19 10:02, Xiao Guangrong wrote:
>
> On 04/13/2017 05:38 PM, Hailiang Zhang wrote:
>> On 2017/4/13 17:35, Xiao Guangrong wrote:
>>> On 04/13/2017 05:29 PM, Hailiang Zhang wrote:
>>>> On 2017/4/13 17:18, Xiao Guangrong wrote:
>>>>> On 04/13/2017 05:05 PM, Zhanghailiang wrote:
>>>>>> Hi,
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org]
>>>>>> On Behalf Of Xiao Guangrong
>>>>>> Sent: April 13, 2017 16:53
>>>>>> To: Paolo Bonzini; mst@redhat.com; mtosatti@redhat.com
>>>>>> Cc: qemu-devel@nongnu.org; kvm@vger.kernel.org;
>>>>>> yunfangtai@tencent.com; Xiao Guangrong
>>>>>> Subject: Re: [PATCH 0/5] mc146818rtc: fix Windows VM clock faster
>>>>>>
>>>>>> On 04/13/2017 04:39 PM, Xiao Guangrong wrote:
>>>>>>> On 04/13/2017 02:37 PM, Paolo Bonzini wrote:
>>>>>>>> On 12/04/2017 17:51, guangrong.xiao@gmail.com wrote:
>>>>>>>>> The root cause is that clocks are lost whenever the periodic
>>>>>>>>> period is changed, because the current code computes the
>>>>>>>>> next periodic time like this:
>>>>>>>>>       next_irq_clock = (cur_clock & ~(period - 1)) + period;
>>>>>>>>>
>>>>>>>>> Consider the case where cur_clock = 0x11FF and period = 0x100:
>>>>>>>>> next_irq_clock is then 0x1200, so there is only 1 clock left
>>>>>>>>> before the next irq triggers. Unfortunately, Windows guests
>>>>>>>>> (at least Windows 7) change the period very frequently when
>>>>>>>>> running the attached code, so the lost clocks accumulate and
>>>>>>>>> the wall time becomes faster and faster.
>>>>>>>> Very interesting.
>>>>>>>>
>>>>>>> Yes, indeed.
>>>>>>>
>>>>>>>> However, I think that the above should be exactly how the RTC
>>>>>>>> should work. The original RTC circuit had 22 divider stages
>>>>>>>> (see page 13 of the datasheet[1], at the bottom right), and
>>>>>>>> the periodic interrupt taps the rising edge of one of the
>>>>>>>> dividers (page 16, second paragraph). The datasheet also never
>>>>>>>> mentions a comparator being used to trigger the periodic
>>>>>>>> interrupts.
>>>>>>>>
>>>>>>> That was my thought before; however, after more testing, I am
>>>>>>> not sure whether re-configuring RegA changes these divider
>>>>>>> stages internally...
>>>>>>>
>>>>>>>> Have you checked that this Windows bug doesn't happen on real
>>>>>>>> hardware too? Or is it the combination of driftfix=slew and
>>>>>>>> changing periods that is the problem?
>>>>>>>>
>>>>>>> I have two physical Windows 7 machines, both of them have
>>>>>>> 'useplatformclock = off' and NTP disabled, and the wall time is
>>>>>>> really accurate. The difference is that the physical machines
>>>>>>> are using the Intel Q87 LPC chipset, which is mc146818rtc
>>>>>>> compatible. However, on a VM, the issue is easily reproduced in
>>>>>>> just ~10 mins.
>>>>>>>
>>>>>>> Our tests mostly focus on 'driftfix=slew', and after this
>>>>>>> patchset the time is accurate and stable.
>>>>>>>
>>>>>>> I will do the test for dropping 'slew' and see what will happen...
>>>>>>>
>>>>>>> Well, the time is easily observed to be faster if 'driftfix=slew'
>>>>>>> is not used. :(
>>>>>> You mean it only fixes the one case, where 'driftfix=slew' is
>>>>>> used?
>>>>> No, it fixes both.
>>>>>
>>>>>> We encountered this problem too; I tried to fix it a long time ago:
>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg06937.html
>>>>>> (It seems that your solution is more useful.)
>>>>>> But it seems that it is impossible to fix: we need to emulate the
>>>>>> behavior of real hardware, but we didn't find any clear
>>>>>> description of it. And it seems that other virtualization
>>>>>> platforms
>>>>> That is the issue: the hardware spec does not detail how the clock
>>>>> is counted when the timer interval is changed. What we can do at
>>>>> this time is infer it from the observed behavior. The current RTC
>>>>> is completely unusable anyway.
>>>>>
>>>>>> have this problem too:
>>>>>> VMware:
>>>>>> https://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf
>>>>>> Hyper-V:
>>>>>> https://blogs.msdn.microsoft.com/virtual_pc_guy/2010/11/19/time-synchronization-in-hyper-v/
>>>>>>
>>>>> Hmm, a slower clock is understandable, but does Windows 7 on
>>>>> Hyper-V really have a faster clock? Did you see it?
>>>> I don't know, we didn't test it. Besides, I'd like to know how long
>>>> your testcase ran before you judged it stable with the
>>>> 'driftfix=slew' option? (My previous patch can't fix it completely;
>>>> it only narrows the gap between the guest timer and the real timer.)
>>> More than 12 hours.
>> Great, I'll test and look into it ... thanks.
>>
> Hi Hailiang,
>
> Does this patchset work for you? :)

Yes, I think it works for us, nice work :)
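
For readers following the arithmetic in the quoted cover letter, here is a minimal sketch in C of the calculation under discussion. Only the expression inside next_irq_clock() comes from the thread; the function names, the helper functions, and the power-of-two assertion are assumptions added for illustration and are not taken from QEMU's RTC code or from the patchset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Next periodic-interrupt time, using the expression quoted in the thread:
 * round cur_clock down to a multiple of period, then add one period.
 * The mask trick is only valid when period is a power of two (assumption
 * made explicit here with an assert).
 */
static uint64_t next_irq_clock(uint64_t cur_clock, uint64_t period)
{
    assert(period != 0 && (period & (period - 1)) == 0);
    return (cur_clock & ~(period - 1)) + period;
}

/* Hypothetical helper: clocks that actually elapse before the next irq. */
static uint64_t clocks_until_irq(uint64_t cur_clock, uint64_t period)
{
    return next_irq_clock(cur_clock, period) - cur_clock;
}

/*
 * Hypothetical helper: how far the first interval after a period change
 * falls short of one full period. This is the "lost clock" the cover
 * letter describes, assuming the guest expects a full period to pass.
 */
static uint64_t clocks_short_of_full_period(uint64_t cur_clock, uint64_t period)
{
    return period - clocks_until_irq(cur_clock, period);
}
```

With the values from the cover letter, next_irq_clock(0x11FF, 0x100) is 0x1200 and clocks_until_irq(0x11FF, 0x100) is 1, so the first interval is 0xFF clocks short of a full period. When cur_clock is already aligned (for example 0x1200), nothing is lost.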