From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pz0-f173.google.com (mail-pz0-f173.google.com [209.85.222.173])
    by ozlabs.org (Postfix) with ESMTP id 03FD3B7CB9
    for ; Fri, 26 Mar 2010 12:11:14 +1100 (EST)
Received: by pzk3 with SMTP id 3so899098pzk.9
    for ; Thu, 25 Mar 2010 18:11:13 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <1269549524.8599.243.camel@pasglop>
References: <43c137a81003241941p84cba56y3e02e40cb22623e2@mail.gmail.com>
    <1269505301.8599.238.camel@pasglop>
    <201003251105.10033.arnd@arndb.de>
    <43c137a81003250800n660195c5k42c8516068aeda8d@mail.gmail.com>
    <1269549524.8599.243.camel@pasglop>
Date: Fri, 26 Mar 2010 09:11:13 +0800
Message-ID: <43c137a81003251811s52ac72eaud921d187e9747098@mail.gmail.com>
Subject: Re: Continual reading from the PowerPc time base register is not stable
From: Csdncannon
To: Benjamin Herrenschmidt
Content-Type: multipart/alternative; boundary=0016e64c3ba2e6baef0482a9d611
Cc: linuxppc-dev@ozlabs.org, Arnd Bergmann
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,

--0016e64c3ba2e6baef0482a9d611
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

After trying the new code with "isync" and the unsigned long long conversion,
the problem no longer happens (I tested for several minutes). But the previous
block of code (lacking the isync) was borrowed from the kernel. So is this a
kernel bug?

Thanks
Gino

2010/3/26 Benjamin Herrenschmidt

> On Thu, 2010-03-25 at 23:00 +0800, Csdncannon wrote:
> > I am really sorry that the previously attached code is wrong, this one
> > "timebase.c" is the right one, and the "log_timebase" file is the
> > right log.
> >
> > We are using FreeScale PowerPc 8378, kernel 2.6.28 and compiled as
> > 32-bit.
>
> And despite all those sync/isync you can still observe the timebase
> going backward ? That sounds scary. However, at this stage all I can
> suggest is getting freescale folks to have a look, as this should really
> not happen. Maybe there's some setting with that specific SoC that is
> missing or similar...
>
> Cheers,
> Ben.
>
> >
> > Thanks
> > Gino
> >
> > 2010/3/25 Arnd Bergmann
> > On Thursday 25 March 2010, Benjamin Herrenschmidt wrote:
> > > On Thu, 2010-03-25 at 10:41 +0800, Csdncannon wrote:
> > > > In my program, the value of the 64-bit time base register is
> > > > read out, and you will find the later value is even smaller than the
> > > > earlier value from the log “log_timebase”. While the kernel depends on
> > > > the accuracy of the timebase for the compensation of the lost PIT
> > > > interrupt, the negative value between two continual timebase reading
> > > > will bring to the jump of the jiffies. And this timebase problem will
> > > > bring to the instability of the gettimeofday system call.
> > > >
> > > > Do you have any idea about this problem, thanks for your any
> > > > advice. Attached is the code and log.
> > >
> > > This is a concern, it should definitely not happen. What machine is
> > > that ? is the code compiled 32-bit or 64-bit ? What kernel version ?
> > >
> > > Arnd, any chance that could relate to the bug you've been chasing on
> > > Cell ?
> >
> >
> > We're still busy with the problem analysis on Cell, waiting for a time
> > slot to run the next test kernel. So far it seems like the timebase
> > is actually synchronized at a significant accuracy on QS22 to never
> > cause this problem with correct code, however it is possible to
> > observe incorrect timebase values on Cell whenever the mftb instruction
> > is not serialized with memory accesses, e.g. by using an isync in front
> > of the mftb. On Power6 and other CPUs, that problem will not happen.
> >
> > Arnd
> >
>

--0016e64c3ba2e6baef0482a9d611
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

After trying the new code with "isync" and the unsigned long long conversion,
the problem no longer happens (I tested for several minutes). But the previous
block of code (lacking the isync) was borrowed from the kernel. So is this a
kernel bug?

Thanks
Gino
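
For reference, a minimal sketch of the kind of read discussed in this thread, assuming a 32-bit PowerPC build with GCC-style inline assembly. It is not the attached timebase.c and not the kernel's code: the names tb_combine and read_timebase are made up for illustration, and the non-PowerPC branch is only a stand-in counter so that the sketch builds on other machines. It places an isync in front of the reads and widens the upper half to 64 bits before shifting.

```c
#include <assert.h>   /* only needed for the checks in the note below */
#include <stdint.h>

/* Combine the two 32-bit halves of the time base. The cast to a 64-bit
 * type has to happen before the shift; shifting a 32-bit value by 32
 * would not give the intended result. */
uint64_t tb_combine(uint32_t upper, uint32_t lower)
{
    return ((uint64_t)upper << 32) | lower;
}

/* Read the 64-bit time base (hypothetical helper, see assumptions above). */
uint64_t read_timebase(void)
{
#if defined(__powerpc__) && !defined(__powerpc64__)
    uint32_t hi, lo, hi2;

    do {
        /* isync first, then upper / lower / upper again. */
        __asm__ __volatile__("isync\n\t"
                             "mftbu %0\n\t"
                             "mftb  %1\n\t"
                             "mftbu %2"
                             : "=r"(hi), "=r"(lo), "=r"(hi2)
                             :
                             : "memory");
        /* If the upper half changed, the lower half wrapped in between
         * the reads, so the pair is inconsistent: try again. */
    } while (hi != hi2);

    return tb_combine(hi, lo);
#else
    /* Stand-in for builds that are not 32-bit PowerPC: a plain software
     * counter, NOT a real time base. */
    static uint64_t fake_tb;
    return ++fake_tb;
#endif
}
```

A quick check that works on any machine is assert(tb_combine(1, 0) == 0x100000000ULL); two back-to-back calls to read_timebase() should also never return a smaller value the second time.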

2010/3/26 Benjamin Herrenschmidt <benh@kernel.crashing.org>
On Thu, 2010-03-25 at 23:00 +0800, Csdncannon wrote:
> I am really sorry that the previously attached code is wrong, this one
> "timebase.c" is the right one, and the "log_timebase" file is the
> right log.
>
> We are using FreeScale PowerPc 8378, kernel 2.6.28 and compiled as
> 32-bit.

And despite all those sync/isync you can still observe the timebase going backward ? That sounds scary. However, at this stage all I can
suggest is getting freescale folks to have a look, as this should really not happen. Maybe there's some setting with that specific SoC that is missing or similar...

Cheers,
Ben.

>
> Thanks
> Gino
>
> 2010/3/25 Arnd Bergmann <arnd@arndb.de>
>         On Thursday 25 March 2010, Benjamin Herrenschmidt wrote:
>         > On Thu, 2010-03-25 at 10:41 +0800, Csdncannon wrote:
>         > > In my program, the value of the 64-bit time base register is
>         > > read out, and you will find the later value is even smaller than the
>         > > earlier value from the log “log_timebase”. While the kernel depends on
>         > > the accuracy of the timebase for the compensation of the lost PIT
>         > > interrupt, the negative value between two continual timebase reading
>         > > will bring to the jump of the jiffies. And this timebase problem will
>         > > bring to the instability of the gettimeofday system call.
>         > >
>         > > Do you have any idea about this problem, thanks for your any
>         > > advice. Attached is the code and log.
>         >
>         > This is a concern, it should definitely not happen. What machine is
>         > that ? is the code compiled 32-bit or 64-bit ? What kernel version ?
>         >
>         > Arnd, any chance that could relate to the bug you've been chasing on
>         > Cell ?
>
>
>         We're still busy with the problem analysis on Cell, waiting for a time
>         slot to run the next test kernel. So far it seems like the timebase
>         is actually synchronized at a significant accuracy on QS22 to never
>         cause this problem with correct code, however it is possible to
>         observe incorrect timebase values on Cell whenever the mftb instruction
>         is not serialized with memory accesses, e.g. by using an isync in front
>         of the mftb. On Power6 and other CPUs, that problem will not happen.
>
>                Arnd
>
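
The symptom described in the quoted report (a later reading that is smaller than an earlier one) can be checked over a buffer of samples with something like the sketch below. It is only an illustration, not the attached timebase.c: the function name count_backward_steps is hypothetical, and it assumes the samples were already collected as full 64-bit values.

```c
#include <assert.h>   /* only needed for the check in the note below */
#include <stddef.h>
#include <stdint.h>

/* Count how many samples are smaller than the sample just before them.
 * Equal neighbours are not counted: two reads may land on the same tick. */
size_t count_backward_steps(const uint64_t *samples, size_t n)
{
    size_t backward = 0;

    for (size_t i = 1; i < n; i++) {
        /* Compare the full 64-bit values so that a carry from the lower
         * into the upper 32 bits is not mistaken for a step backward. */
        if (samples[i] < samples[i - 1])
            backward++;
    }
    return backward;
}
```

For a well-behaved time base the result should be 0; for example, the four samples 1, 2, 2, 0x100000000 contain no backward step.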



--0016e64c3ba2e6baef0482a9d611--