From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757426Ab0IGWOq (ORCPT ); Tue, 7 Sep 2010 18:14:46 -0400
Received: from mx1.redhat.com ([209.132.183.28]:64424 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756049Ab0IGWOm (ORCPT ); Tue, 7 Sep 2010 18:14:42 -0400
Message-ID: <4C86B948.9050000@redhat.com>
Date: Tue, 07 Sep 2010 12:14:32 -1000
From: Zachary Amsden
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100621 Fedora/3.0.5-1.fc13 Thunderbird/3.0.5
MIME-Version: 1.0
To: "Dong, Eddie"
CC: "kvm@vger.kernel.org" , Avi Kivity , Marcelo Tosatti , Glauber Costa , Thomas Gleixner , John Stultz , "linux-kernel@vger.kernel.org"
Subject: Re: [KVM timekeeping 26/35] Catchup slower TSC to guest rate
References: <1282291669-25709-1-git-send-email-zamsden@redhat.com> <1282291669-25709-27-git-send-email-zamsden@redhat.com> <1A42CE6F5F474C41B63392A5F80372B22A8253C5@shsmsx501.ccr.corp.intel.com>
In-Reply-To: <1A42CE6F5F474C41B63392A5F80372B22A8253C5@shsmsx501.ccr.corp.intel.com>
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/06/2010 05:44 PM, Dong, Eddie wrote:
> Zachary:
> Will you extend the logic to cover the situation when the guest runs at higher than the guest rate but the PCPU is over committed? In that case, likely we can use the time spent when the VCPU is scheduled out to catch up as well. Of course if the VCPU scheduled out time is not enough to compensate the cycles caused by fast host TSC (exceeding a threshold), we will eventually have to fall back to trap and emulation mode.
>

It is possible to do this, but it is rather dangerous. We can't let the guest clock accelerate without bounds.
We could put a limit on the maximum overrun the TSC is allowed to reach, and then switch into trapping mode, but this presupposes we will actually get an interrupt in time. A CPU-heavy guest with little host activity could easily overrun much further than we would like unless we have a way to reliably trigger interrupts near the time of maximum allowed overrun.

So, first, we must have a way to get such interrupts. This is needed anyway: the catchup case has a similar problem with underrun, which must be addressed. It's quite possible to add the mode you describe once that feature is in, but it also adds even more complexity to an already intricate clock system (which is one of the problems with the latter part of this patch series).

Second, this mode of operation is incompatible with SMP guests under all circumstances. SMP guests with mismatched clock speeds must always run in trapping mode, as it is not possible to synchronize the catchup / trap switching without extremely heavyweight measures (such as IPI wakeups). Those mechanisms will not only cost more than the trapping overhead (especially on future, faster systems and on larger, more parallel systems), but they will also damage host performance (unneeded wakeups when other VCPUs are not scheduled).

Unless, of course, we gang-schedule... but that is a difficult change and a very different mode of operation. Getting rid of TSC trap overhead on systems with non-constant TSC isn't a sufficient motivation for that kind of design change.

Zach
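
To make the overrun-limit concern above concrete, here is a minimal toy model in Python. It is not code from the patch series. The class, the method names, the rates, and the limit are all hypothetical and chosen only for illustration. It assumes a host TSC faster than the guest rate, a vCPU that can only be switched to trapping mode at a host exit, and scheduled-out time that pays the overrun back.

```python
from dataclasses import dataclass


@dataclass
class VcpuClock:
    """Toy model of one vCPU whose host TSC runs faster than the guest rate."""
    host_khz: int            # actual host TSC rate (cycles per millisecond)
    guest_khz: int           # TSC rate the guest expects
    max_overrun: int         # hypothetical limit, in TSC cycles
    overrun: int = 0         # cycles the guest TSC is ahead of its ideal value
    trapping: bool = False   # True once TSC reads are trapped and emulated

    def run(self, ms: int) -> None:
        """Guest runs for `ms` milliseconds with no exit to the host."""
        if not self.trapping:
            # Passthrough: the guest sees the TSC advance at the host rate.
            self.overrun += (self.host_khz - self.guest_khz) * ms

    def scheduled_out(self, ms: int) -> None:
        """vCPU is descheduled: real time advances, the guest TSC is held."""
        self.overrun = max(0, self.overrun - self.guest_khz * ms)

    def host_exit(self) -> None:
        """Only at an exit can the host notice the overrun and switch mode."""
        if self.overrun > self.max_overrun:
            self.trapping = True


def simulate(exit_interval_ms: int, total_ms: int = 100) -> int:
    """Return the worst overrun seen when exits occur every `exit_interval_ms`."""
    vcpu = VcpuClock(host_khz=3_000_000, guest_khz=2_000_000,
                     max_overrun=5_000_000)
    worst = 0
    for _ in range(0, total_ms, exit_interval_ms):
        vcpu.run(exit_interval_ms)
        worst = max(worst, vcpu.overrun)
        vcpu.host_exit()
    return worst
```

In this model, `simulate(1)` stops within one exit interval of the limit, while `simulate(50)` lets a CPU-bound guest run ten times past it before the host gets a chance to react. That gap is why a reliable interrupt near the maximum allowed overrun is a prerequisite for the mode.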