From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [patch 0/7] force the TSC unreliable by reporting C2 state
Date: Wed, 18 Jun 2008 16:02:39 -0500
Message-ID: <485977EF.3090002@codemonkey.ws>
References: <20080618164205.108219607@localhost.localdomain> <48596B85.7090008@codemonkey.ws> <20080618204042.GA15981@dmt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Avi Kivity, kvm@vger.kernel.org
To: Marcelo Tosatti
In-Reply-To: <20080618204042.GA15981@dmt.cnet>
Sender: kvm-owner@vger.kernel.org

Marcelo Tosatti wrote:
> On Wed, Jun 18, 2008 at 03:09:41PM -0500, Anthony Liguori wrote:
>
>> Marcelo Tosatti wrote:
>>
>>> Avi, I don't think this causes such a huge performance regression. NOHZ
>>> makes the frequency of timer reads go down significantly.
>>>
>> Have we yet determined why the TSC is so unstable in the first place?
>> In theory, it should be relatively stable on single-node Intel and
>> Barcelona chips.
>>
>
> If the host enters C2/C3, or changes CPU frequency, it becomes
> unreliable as a clocksource and there's no guarantee the guest will
> detect that.
>

On Intel, the TSC should be fixed-frequency for basically all shipping
processors supporting VT. Starting with 10h (Barcelona), I believe AMD
also has a fixed frequency TSC.

> Also, as mentioned earlier, large systems with clustered APIC have
> unstable TSC.
>

Right, that's why I qualified with single-node.

> We _could_ hook this fake-C2-state thing to the host TSC reliability:
>
> 1) Hook into Linux's mark_tsc_unstable().
> 2) On migration check if the destination host is using the TSC, if not,
> force a faked-C2-state.
>
> Problem with 2) is that not all guests honour the ACPI _CST package
> notification (which would change C2's latency time from an unusable
> value to something usable). And now I don't think assuming the _CST
> notification to work is a good thing (after we found out that for ex.
> Ubuntu 7.10 kernel ignores it).
>

I think that for hosts with a known unstable TSC, we should do something
like this. But I also think we have a bug with TSC synchronization for
AMD although I don't at all know what the source of it is.

Regards,

Anthony Liguori
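
For the destination-host check in 2), here is a minimal userspace sketch, assuming a Linux host that exposes its active clocksource through sysfs. The function names and the policy of treating an undeterminable clocksource as unreliable are illustrative assumptions, not something taken from the patch set.

```python
# Sketch only: decide whether a guest should see a faked C2 state,
# based on whether the host kernel currently trusts the TSC.

# sysfs file where Linux reports the clocksource currently in use.
CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def host_uses_tsc(path=CLOCKSOURCE_PATH):
    """Return True if the host's current clocksource is "tsc",
    False if it is something else, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip() == "tsc"
    except OSError:
        return None


def should_fake_c2(uses_tsc):
    """Hypothetical policy: fake C2 unless the host is known to use
    the TSC. An unknown state (None) is treated as unreliable."""
    return uses_tsc is not True


if __name__ == "__main__":
    uses_tsc = host_uses_tsc()
    print("host uses TSC:", uses_tsc)
    print("force faked C2 state:", should_fake_c2(uses_tsc))
```

If the host kernel has already called mark_tsc_unstable(), it will have switched away from the TSC, so the file reads something other than "tsc" (hpet or acpi_pm, for example) and the sketch opts for the faked C2 state. This does nothing for guests that ignore the _CST notification.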