From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753926AbbIASP6 (ORCPT ); Tue, 1 Sep 2015 14:15:58 -0400
Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:39366 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751936AbbIASP5 (ORCPT ); Tue, 1 Sep 2015 14:15:57 -0400
Date: Tue, 1 Sep 2015 11:14:17 -0700
From: Shaohua Li
To: Thomas Gleixner
CC: John Stultz, lkml, Prarit Bhargava, Richard Cochran, Daniel Lezcano, "Ingo Molnar", Clark Williams, "Steven Rostedt", Peter Zijlstra
Subject: Re: [PATCH 8/9] clocksource: Improve unstable clocksource detection
Message-ID: <20150901181404.GA3954400@devbig257.prn2.facebook.com>
References: <20150826171533.GA2189998@devbig257.prn2.facebook.com> <20150831211233.GA1413758@devbig257.prn2.facebook.com> <20150831223931.GA1586538@devbig257.prn2.facebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-12-10)
X-Originating-IP: [192.168.52.123]
X-Proofpoint-Spam-Reason: safe
X-FB-Internal: Safe
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151,1.0.33,0.0.0000 definitions=2015-09-01_07:2015-08-31,2015-09-01,1970-01-01 signatures=0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 01, 2015 at 07:13:40PM +0200, Thomas Gleixner wrote:
> On Mon, 31 Aug 2015, Shaohua Li wrote:
> > On Mon, Aug 31, 2015 at 11:47:52PM +0200, Thomas Gleixner wrote:
> > > On Mon, 31 Aug 2015, Shaohua Li wrote:
> > > > > The HPET wraps interval is 0xffffffff / 100000000 = 42.9s
> > > > >
> > > > > tsc interval is (0x481250b45b - 0x219e6efb50) / 2200000000 = 75s
> > > > >
> > > > > 32.1 + 42.9 = 75
> > > > >
> > > > > The example shows hpet wraps, while tsc is marked unstable
> > > >
> > > > Thomas & John,
> > > > Is this data enough
to prove TSC unstable issue can be triggered by HPET
> > > > wrap? I can resend the patch with the data included.
> > >
> > > Well, it's enough data to prove:
> > >
> > > - that keeping a VM off the CPU for 75 seconds is insane.
> >
> > It wraps in 42.9s. 42.9s isn't a long time hard to block. I don’t think
>
> You think that blocking softirq execution for 42.9 seconds is normal?
> Seems we are living in a different universe.

I'm not saying it's normal. I'm saying it's not hard to trigger.

> > it's just VM off. A softirq can hog the cpu.
>
> I still want to see proof of that. There is just handwaving about
> that, but nobody has provided proper data to back that up.

I showed you that the TSC runs for 75s while the HPET wraps. What info
do you think can prove this?

> > > - that emulating the HPET with 100MHz shortens the HPET wraparound by
> > >   a factor of 7 compared to real hardware. With a realistic HPET
> > >   frequency you have about 300 seconds.
> > >
> > > Who thought that using 100MHz HPET frequency is a brilliant idea?
> >
> > I'm not a VM expert. My guess is the 100Mhz can reduce interrupt. It’s
> > insane hypervisor updates HPET count in 14.3Mhz. Switching to HPET can
> > introduce even higher overhead in virtual, because of the vmexit of
> > iomemory access
>
> Sorry, that does not make any sense at all.
>
> - How does 100Mhz HPET frequency reduce interrupts?
>
> - What's insane about a lower emulated HPET frequency?
>
> - We all know that switching to HPET is more expensive than just
>   using TSC. That's not the question at all and completely
>   unrelated to the 100MHz HPET emulation frequency.

It's meaningless to argue about the HPET frequency. The code should not
work only for a 14.3MHz HPET.

> > > So we should add crappy heuristics to the watchdog just to workaround
> > > virt insanities? I'm not convinced.
> >
> > This is a real issue which could impact performance seriously. Though
> > the data is collected in vm, we do see the issue happens in physical
> > machines too.
>
> And what's the exact reason for this on physical machines? Some magic
> softirq hog again for which you cannot provide proof?
>
> > The watchdog clock source shows restriction here apparently, it
> > deserves an improvement if we can do.
>
> The restriction in a sane environment is 300 seconds. And the only
> fallout on physical hardware which we have seen so far is on
> preempt-RT where the softirq can actually be blocked by RT hogs, but
> that's a completely different issue and has nothing to do with the
> situation in mainline.
>
> > I'm happy to hear from you if there is better solution, but we
> > shouldn't pretend there is no issue here.
>
> I'm not pretending anything. I'm merely refusing to accept that change
> w/o a proper explanation WHY the watchdog fails on physical hardware,
> i.e. WHY it does not run for more than 300 seconds.

It's meaningless to argue about virtual vs. physical machines too. Linux
works on both. What about the acpi_pm clocksource then? It wraps in
about 5s. It's a sane setup to have HPET disabled and acpi_pm used as
the watchdog. Do you still think 5s is long?

Thanks,
Shaohua
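
For reference, the wrap intervals quoted in this thread can be
reproduced with a short calculation. The sketch below is only an
illustration: the counter widths (32-bit HPET, 24-bit acpi_pm) and the
nominal frequencies (14.31818 MHz HPET, 3.579545 MHz ACPI PM timer) are
assumptions about typical hardware, and the function names are
hypothetical. Only the 100 MHz HPET rate, the 2.2 GHz TSC rate and the
two TSC readings come from the thread itself.

```python
# Wrap-interval arithmetic for the watchdog clocksources discussed above.
# Counter widths and nominal frequencies are assumptions about typical
# hardware; they are not read from any particular machine.

def wrap_seconds(bits: int, freq_hz: float) -> float:
    """Seconds until a free-running counter of the given width wraps."""
    return (1 << bits) / freq_hz


def elapsed_seconds(start: int, end: int, freq_hz: float) -> float:
    """Seconds between two readings of a counter that did not wrap."""
    return (end - start) / freq_hz


# Emulated HPET from the thread: 32-bit counter at 100 MHz.
hpet_emulated = wrap_seconds(32, 100_000_000)

# Assumed typical hardware HPET: 32-bit counter at 14.31818 MHz.
hpet_typical = wrap_seconds(32, 14_318_180)

# Assumed ACPI PM timer: 24-bit counter at 3.579545 MHz.
acpi_pm = wrap_seconds(24, 3_579_545)

# TSC readings quoted in the thread, at 2.2 GHz.
tsc_elapsed = elapsed_seconds(0x219E6EFB50, 0x481250B45B, 2_200_000_000)

if __name__ == "__main__":
    print(f"emulated HPET wrap: {hpet_emulated:.1f}s")
    print(f"typical HPET wrap:  {hpet_typical:.1f}s")
    print(f"acpi_pm wrap:       {acpi_pm:.1f}s")
    print(f"TSC elapsed:        {tsc_elapsed:.1f}s")
    print(f"TSC minus one wrap: {tsc_elapsed - hpet_emulated:.1f}s")
```

Under those assumptions the emulated HPET wraps after roughly 43s, a
14.3 MHz HPET after roughly 300s, and acpi_pm after a little under 5s.
The TSC interval minus one emulated-HPET wrap matches the
"32.1 + 42.9 = 75" line quoted at the top.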