From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: Clock jumped 50 minutes in dom0 caused incorrect 2008 R2 domU time
Date: Tue, 26 Oct 2010 10:03:07 -0700
Message-ID: <4CC709CB.7090203@goop.org>
References: <20101006111618.GA31233@campbell-lange.net> <4CAC98BF.9010902@goop.org> <5e238400-51d4-4ed7-8f8b-1f3f44486d45@default> <20101026092254.GA2066@campbell-lange.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20101026092254.GA2066@campbell-lange.net>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Mark Adams
Cc: Dan Magenheimer , xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 10/26/2010 02:22 AM, Mark Adams wrote:
> On Thu, Oct 07, 2010 at 07:04:18AM -0700, Dan Magenheimer wrote:
>> Hi Jeremy and Mark --
>>
>> Oddly, I saw that "clocksource tsc unstable" message myself
>> on a busy 2.6.36-rc5 PV domain yesterday. While it is possible
>> that this reflects a hardware problem, the fact that you
>> saw it on a Nehalem+ Intel processor makes it very unlikely.
>> The "s" and "t" debug keys (the output of which can be seen via
>> "xm debug-key s; xm dmesg | tail" in dom0) can help diagnose
>> the problem if it is indeed a hardware problem or BIOS
>> problem or the result of a CPU hot-add... all unlikely.
>>
>> It IS possible that the code that emulates tsc is broken
>> somewhere, but I don't think tsc should be emulated by
>> default for dom0 on a Nehalem+ box... and even if it is,
>> it is directly based on Xen system time which, if it went
>> awry, would probably cause major problems.
>>
>> Looking through the Linux code that prints that message (in
>> kernel/time/clocksource.c) it appears that the message
>> appears if the tsc deviates from the "watchdog clocksource",
>> which in PV domains is "xen" (or more precisely pvclock
>> I think). So most likely, this is a symptom of a problem
>> with pvclock or the watchdog code in the pvops kernel, not
>> an indicator that the tsc is actually unstable.
>>
>> Dan
> Is there any more information I can provide to help with debugging this?
> We haven't had the problem since. It could just be a coincidence but it
> happened around the time that daylight savings occurred in the US (we
> are in the UK).

In Linux/Xen it shouldn't have any effect since the clocks are always
maintained in UTC, then timezone details are applied much later in
usermode. But Windows has a bad habit of setting the hardware RTC to
local time, and mucking about with it for DST changes - but that would
only be relevant if you booted Windows on your host machine (I don't
think there's any way for a Windows guest's time to leak into the
host/dom0's timebase).

Unfortunately these kinds of time problems can be notoriously hard to
pin down and diagnose.

J
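
As a rough illustration of the UTC point above (a sketch only, not taken
from the Linux or Xen sources): the instant is held as a single UTC-based
value, and the zone offset only enters when that value is formatted for
display. The `render` helper and the fixed `GMT`/`BST` offsets are
hypothetical names chosen for this example; a real system would consult
the tz database rather than fixed offsets.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets standing in for a zone's winter/summer rules.
GMT = timezone(timedelta(hours=0), "GMT")
BST = timezone(timedelta(hours=1), "BST")


def render(epoch_seconds, tz):
    """Format a UTC-based epoch timestamp as wall-clock time in tz."""
    return datetime.fromtimestamp(epoch_seconds, tz).strftime(
        "%Y-%m-%d %H:%M:%S %Z"
    )


# The Date: header of this message, 10:03:07 -0700, expressed in UTC.
epoch = datetime(2010, 10, 26, 17, 3, 7, tzinfo=timezone.utc).timestamp()

# The stored value is identical in both cases; only the rendering differs.
print(render(epoch, GMT))  # 2010-10-26 17:03:07 GMT
print(render(epoch, BST))  # 2010-10-26 18:03:07 BST
```

A DST change on this model only swaps which offset is used for display;
the underlying counter is not stepped, which is why it should not by
itself produce a jump in the dom0 clock.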