From: Dan Magenheimer
Subject: RE: [Xen-devel] Xen 4 TSC problems
Date: Wed, 23 Feb 2011 08:16:20 -0800 (PST)
Message-ID: <4e4ddd9f-ed80-48b7-b001-c6b02c0d1935@default>
To: Olivier Hanesse, xen-devel@lists.xensource.com, Xen Users
Cc: Jeremy Fitzhardinge, keir@xen.org, Mark Adams
List-Id: xen-devel@lists.xenproject.org

It's very unlikely this is a problem with TSC. It is most likely a Xen (or possibly a PV Linux) problem where a guest (or dom0) either "goes out to lunch" for a long period, or some other timer gets stuck. The "clocksource tsc unstable" message is a side effect of this... it's very likely that the TSC IS stable and correct and the other clocksource (pvclock) has lost/gained 50 minutes!

Mark Adams is cc'ed and his original xen-devel posting is referenced below. The fact that two different users (possibly on the same processor/system type?) have reported the message with such similar deltas leads me to believe there is some timer that is "wrapping". And since pvclock is usually the clocksource for dom0, and pvclock is driven by Xen's "system time", a reasonable guess is that the timer that is wrapping is in Xen itself.

Mark's delta = -2999660303788 ns
Your delta = -2999660334211 ns

Googling, I see the HPET wraparound is ~306 seconds and this delta is about 3000 seconds, so that may be a bad guess.

Keir, any thoughts on this? Do you recall any post-4.0 patches that may have fixed this?
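
The arithmetic behind the "about 3000 seconds" and "50 minutes" figures can be checked with a minimal sketch like the one below. The two deltas are copied from the reports; the 306-second wraparound is simply the figure quoted above and is taken as given, and the helper names (`ns_to_seconds`, `summarize`) are hypothetical.

```python
# Deltas quoted in the two "Clocksource tsc unstable" reports (nanoseconds).
MARK_DELTA_NS = -2999660303788
OLIVIER_DELTA_NS = -2999660334211

NS_PER_SECOND = 1_000_000_000

# Wraparound period quoted in the message ("~306 seconds"); taken as given,
# not measured on any real hardware.
QUOTED_HPET_WRAP_SECONDS = 306


def ns_to_seconds(delta_ns: int) -> float:
    """Convert a nanosecond delta to seconds, ignoring its sign."""
    return abs(delta_ns) / NS_PER_SECOND


def summarize(delta_ns: int) -> dict:
    """Express a delta in seconds, minutes, and multiples of the quoted wrap."""
    seconds = ns_to_seconds(delta_ns)
    return {
        "seconds": seconds,
        "minutes": seconds / 60,
        "wraps": seconds / QUOTED_HPET_WRAP_SECONDS,
    }


# Integer arithmetic, so the difference between the two reports is exact.
difference_ns = abs(MARK_DELTA_NS - OLIVIER_DELTA_NS)

if __name__ == "__main__":
    for name, delta in (("Mark", MARK_DELTA_NS), ("Olivier", OLIVIER_DELTA_NS)):
        s = summarize(delta)
        print(f"{name}: {s['seconds']:.3f} s = {s['minutes']:.2f} min, "
              f"{s['wraps']:.2f} x quoted wrap")
    print(f"difference between reports: {difference_ns} ns")
```

The two reports differ by only 30423 ns, each delta is just under 50 minutes, and dividing by the quoted 306 seconds gives roughly 9.8 rather than a whole number of wraps.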

Thanks,
Dan

References:
http://lists.xensource.com/archives/html/xen-devel/2010-10/msg00210.html
https://lkml.org/lkml/2010/10/26/126

From: Olivier Hanesse [mailto:olivier.hanesse@gmail.com]
Sent: Wednesday, February 23, 2011 3:50 AM
To: xen-devel@lists.xensource.com; Xen Users
Subject: [Xen-devel] Xen 4 TSC problems

Hello

I've got a timekeeping issue with Xen 4.0 (Debian squeeze release).

My problem is described here (hopefully I'm not the only one, so there might be a bug somewhere): http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=599161#50
After some time, I got this error: Clocksource tsc unstable (delta = -2999660334211 ns). It has happened on several servers.

Looking at the output of "xm debug-key s;"

(XEN) TSC has constant rate, deep Cstates possible, so not reliable, warp=2850 (count=3)

I am using an "Intel(R) Xeon(R) CPU L5420 @ 2.50GHz", which has the "constant_tsc" flag but not the "nonstop_tsc" one.
On other systems with a newer CPU that has "nonstop_tsc", I don't have this issue (the systems are running the same distro with the same config).

I tried to boot with "max_cstate=0", but nothing changed: my TSC still isn't reliable, and after some time I will get the "50min" issue again.

I don't understand how a system can jump "50min" into the future. Why 50min? It is not 40min, not 1 hour, it is always 50min.
I don't know how to make my TSC "reliable" (I have already disabled everything related to power states in the BIOS settings).

Any ideas?

Regards

Olivier
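
For comparing machines on the "constant_tsc" / "nonstop_tsc" point, a minimal sketch along these lines may help. It assumes an x86 Linux system where /proc/cpuinfo contains a "flags" line; the function name `tsc_flags` and the sample text are hypothetical and only there for illustration.

```python
import os

# CPU flags discussed in the thread.
TSC_FLAGS = ("constant_tsc", "nonstop_tsc")


def tsc_flags(cpuinfo_text: str) -> dict:
    """Report which TSC-related flags appear on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Compare whole tokens so plain "tsc" does not match "constant_tsc".
            present = set(value.split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line found (for example, a non-x86 system).
    return {flag: False for flag in TSC_FLAGS}


# Hypothetical, heavily shortened cpuinfo excerpt, for illustration only.
SAMPLE_CPUINFO = "model name\t: Example CPU\nflags\t\t: fpu tsc constant_tsc pni\n"

if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as handle:
            text = handle.read()
    else:
        text = SAMPLE_CPUINFO
    print(tsc_flags(text))
```

Run on each host, this gives a quick side-by-side of which systems report "nonstop_tsc" and which only report "constant_tsc".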

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users