* gettimeofday problem
@ 2002-06-24 8:55 Salvatore D'Angelo
[not found] ` <200206240934.g5O9YL524660@budgie.cs.uwa.edu.au>
0 siblings, 1 reply; 23+ messages in thread
From: Salvatore D'Angelo @ 2002-06-24 8:55 UTC (permalink / raw)
To: linux-kernel
Hi,
I am writing a small piece of code that use the gettimeofday routine and
I have noticed a very strange behaviour. If I call the routine two times
in sequence I expect that the second value is greater than or equal to
the first one, but it is not true.
please check the following code.
Sometimes happen that the string "Strange Behaviour" is printed with
kernel 2.4.18.
I tried to find in the Linux Archive patches to solve this problem, but
I didn't find anything (there are emails that talk about gettimeofday,
but probably they do not answer to my questions).
The same thing happen in Java using the System.currentTimeMillis() routine.
#include <fstream.h>
#include <sys/time.h>
// this routine calculate the current time returning its value in long
long format.
long long currentTimeMillis() {
long long t;
struct timeval tv;
gettimeofday(&tv, 0);
t = tv.tv_sec;
t = (t *1000) + (tv.tv_usec/1000);
return t;
}
void main() {
for (;;) {
long long a = currentTimeMillis();
long long b = currentTimeMillis();
if (a>b) {
cout << "Strange Behaviour" << endl;
}
}
}
^ permalink raw reply [flat|nested] 23+ messages in thread[parent not found: <200206240934.g5O9YL524660@budgie.cs.uwa.edu.au>]
* Re: gettimeofday problem [not found] ` <200206240934.g5O9YL524660@budgie.cs.uwa.edu.au> @ 2002-06-24 10:20 ` Salvatore D'Angelo 2002-06-24 12:46 ` Matti Aarnio 2002-06-24 19:44 ` Richard B. Johnson 0 siblings, 2 replies; 23+ messages in thread From: Salvatore D'Angelo @ 2002-06-24 10:20 UTC (permalink / raw) To: Chris McDonald; +Cc: linux-kernel In this piece of code I convert seconds and microseconds in milliseconds. I think the problem is not in my code, in fact I wrote the following piece of code in Java, and it does not work too. In the for loop the 90% of times b > a while for 10% of times not. class Prova { public static void main(....) { for (;;) { long a = System.currentTimeMillis(); long b = System.currentTimeMillis(); if (a > b) { System.out.println("Wrong!!!!!!!!!!!!!"); } } } } Chris McDonald wrote: >In linux.kernel you write: > > > >>Hi, >> >> > > > >>I am writing a small piece of code that use the gettimeofday routine and >>I have noticed a very strange behaviour. If I call the routine two times >>in sequence I expect that the second value is greater than or equal to >>the first one, but it is not true. >> >> > >Perhaps because tv_usecs are MICROsecs, not MILLIsecs ? > >_______________________________________________________________________________ > Dr Chris McDonald EMAIL: chris@csse.uwa.edu.au > Department of Computer Science & Software Engineering >The University of Western Australia WWW: http://www.csse.uwa.edu.au/~chris > Crawley, Western Australia, 6009 PH: +61 8 9380 2533, FAX: +61 8 9380 1089 > > > > >>please check the following code. >> >> > > > >>Sometimes happen that the string "Strange Behaviour" is printed with >>kernel 2.4.18. >> >> > > > >>I tried to find in the Linux Archive patches to solve this problem, but >>I didn't find anything (there are emails that talk about gettimeofday, >>but probably they do not answer to my questions). >> >> > > > >>The same thing happen in Java using the System.currentTimeMillis() routine. >> >> > > > >>#include <fstream.h> >>#include <sys/time.h> >> >> > > > >>// this routine calculate the current time returning its value in long >>long format. >>long long currentTimeMillis() { >> long long t; >> struct timeval tv; >> >> > > > >> gettimeofday(&tv, 0); >> >> > > > >> t = tv.tv_sec; >> t = (t *1000) + (tv.tv_usec/1000); >> >> > > > >> return t; >>} >> >> > > > >>void main() { >> for (;;) { >> long long a = currentTimeMillis(); >> long long b = currentTimeMillis(); >> >> > > > >> if (a>b) { >> cout << "Strange Behaviour" << endl; >> } >> } >>} >> >> > > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 10:20 ` Salvatore D'Angelo @ 2002-06-24 12:46 ` Matti Aarnio 2002-06-24 13:57 ` Salvatore D'Angelo 2002-07-01 2:29 ` Pavel Machek 2002-06-24 19:44 ` Richard B. Johnson 1 sibling, 2 replies; 23+ messages in thread From: Matti Aarnio @ 2002-06-24 12:46 UTC (permalink / raw) To: Salvatore D'Angelo; +Cc: Chris McDonald, linux-kernel On Mon, Jun 24, 2002 at 12:20:02PM +0200, Salvatore D'Angelo wrote: > In this piece of code I convert seconds and microseconds in > milliseconds. I think the problem is not in my code, in fact I wrote the > following piece of code in Java, and it does not work too. In the for > loop the 90% of times b > a while for 10% of times not. > ... > long a = System.currentTimeMillis(); > long b = System.currentTimeMillis(); > if (a > b) { > System.out.println("Wrong!!!!!!!!!!!!!"); > } So in 10% of the cases, two successive calls yield time rolling BACK ? I used gettimeofday() call, and compared the original data from the code. At a modern uniprocessor machine I never get anything except monotonously increasing time (TSC is used in betwen timer ticks to supply time increase.) At a dual processor machine, on occasion I do get SAME value twice. I have never seen time rolling backwards. Uh.. correction: 216199245 0:-1 -- it did step backwards, but only once within about 216 million gettimeofday() calls. (I am running 2.4.19-pre8smp at the test box.) /Matti Aarnio ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 12:46 ` Matti Aarnio @ 2002-06-24 13:57 ` Salvatore D'Angelo 2002-06-24 18:56 ` Karim Yaghmour 2002-07-01 2:30 ` Pavel Machek 2002-07-01 2:29 ` Pavel Machek 1 sibling, 2 replies; 23+ messages in thread From: Salvatore D'Angelo @ 2002-06-24 13:57 UTC (permalink / raw) To: Matti Aarnio; +Cc: Chris McDonald, linux-kernel On 2000000 call -> 189 times I found the problem (0.00945%) On 20000000 call ->1956 found I found the problem (0.00978%) Probably you're right my previous percentage is too high (the one above should be the correct one). But do you think that this behaviour is normal? Matti Aarnio wrote: >On Mon, Jun 24, 2002 at 12:20:02PM +0200, Salvatore D'Angelo wrote: > > >>In this piece of code I convert seconds and microseconds in >>milliseconds. I think the problem is not in my code, in fact I wrote the >>following piece of code in Java, and it does not work too. In the for >>loop the 90% of times b > a while for 10% of times not. >> >> >> >... > > >> long a = System.currentTimeMillis(); >> long b = System.currentTimeMillis(); >> if (a > b) { >> System.out.println("Wrong!!!!!!!!!!!!!"); >> } >> >> > > > So in 10% of the cases, two successive calls yield time > rolling BACK ? > > I used gettimeofday() call, and compared the original data > from the code. > > At a modern uniprocessor machine I never get anything except > monotonously increasing time (TSC is used in betwen timer ticks > to supply time increase.) At a dual processor machine, on > occasion I do get SAME value twice. I have never seen time > rolling backwards. > > Uh.. correction: 216199245 0:-1 -- it did step backwards, > but only once within about 216 million gettimeofday() calls. > (I am running 2.4.19-pre8smp at the test box.) > >/Matti Aarnio >- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ > > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 13:57 ` Salvatore D'Angelo @ 2002-06-24 18:56 ` Karim Yaghmour 2002-06-26 11:01 ` Gabriel Paubert 2002-07-01 2:30 ` Pavel Machek 1 sibling, 1 reply; 23+ messages in thread From: Karim Yaghmour @ 2002-06-24 18:56 UTC (permalink / raw) To: Salvatore D'Angelo; +Cc: Matti Aarnio, Chris McDonald, linux-kernel Salvatore D'Angelo wrote: > On 2000000 call -> 189 times I found the problem (0.00945%) > On 20000000 call ->1956 found I found the problem (0.00978%) ... > But do you think that this behaviour is normal? This has already been discussed on the LKML. Here's the thread: http://marc.theaimsgroup.com/?t=102348161100006&r=1&w=2 I posted the following message on this issue: http://marc.theaimsgroup.com/?l=linux-kernel&m=102348249521519&w=2 As I had said earlier, I've seen this happen before on both i386 and PPC machines. Cheers, Karim =================================================== Karim Yaghmour karim@opersys.com Embedded and Real-Time Linux Expert =================================================== ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 18:56 ` Karim Yaghmour @ 2002-06-26 11:01 ` Gabriel Paubert 0 siblings, 0 replies; 23+ messages in thread From: Gabriel Paubert @ 2002-06-26 11:01 UTC (permalink / raw) To: Karim Yaghmour, linux-kernel Karim Yaghmour wrote: > As I had said earlier, I've seen this happen before on both i386 and > PPC machines. Have you seen this on PPC with recent 2.4 and 2.5 kernels or only with 2.2 ? Gabriel. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 13:57 ` Salvatore D'Angelo 2002-06-24 18:56 ` Karim Yaghmour @ 2002-07-01 2:30 ` Pavel Machek 1 sibling, 0 replies; 23+ messages in thread From: Pavel Machek @ 2002-07-01 2:30 UTC (permalink / raw) To: Salvatore D'Angelo; +Cc: Matti Aarnio, Chris McDonald, linux-kernel Hi! > On 2000000 call -> 189 times I found the problem (0.00945%) > On 20000000 call ->1956 found I found the problem (0.00978%) > > Probably you're right my previous percentage is too high (the one above > should be the correct one). > > But do you think that this behaviour is normal? Buggy hardware and kernel that is not able to work around that. Try googling or asking vojtech pavlik, there should be patches to fix this. What chipset? Pavel -- (about SSSCA) "I don't say this lightly. However, I really think that the U.S. no longer is classifiable as a democracy, but rather as a plutocracy." --hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 12:46 ` Matti Aarnio 2002-06-24 13:57 ` Salvatore D'Angelo @ 2002-07-01 2:29 ` Pavel Machek 1 sibling, 0 replies; 23+ messages in thread From: Pavel Machek @ 2002-07-01 2:29 UTC (permalink / raw) To: Matti Aarnio; +Cc: Salvatore D'Angelo, Chris McDonald, linux-kernel Hi! > > In this piece of code I convert seconds and microseconds in > > milliseconds. I think the problem is not in my code, in fact I wrote the > > following piece of code in Java, and it does not work too. In the for > > loop the 90% of times b > a while for 10% of times not. > > > ... > > long a = System.currentTimeMillis(); > > long b = System.currentTimeMillis(); > > if (a > b) { > > System.out.println("Wrong!!!!!!!!!!!!!"); > > } > > > So in 10% of the cases, two successive calls yield time > rolling BACK ? > > I used gettimeofday() call, and compared the original data > from the code. > > At a modern uniprocessor machine I never get anything except > monotonously increasing time (TSC is used in betwen timer ticks > to supply time increase.) At a dual processor machine, on > occasion I do get SAME value twice. I have never seen time > rolling backwards. > > Uh.. correction: 216199245 0:-1 -- it did step backwards, > but only once within about 216 million gettimeofday() calls. > (I am running 2.4.19-pre8smp at the test box.) Hmm, so it is buggy even for you. He probably has way crappier hardware. Neptun chipsets and via chipsets have bugs in time implementation. Pavel -- (about SSSCA) "I don't say this lightly. However, I really think that the U.S. no longer is classifiable as a democracy, but rather as a plutocracy." --hpa ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 10:20 ` Salvatore D'Angelo 2002-06-24 12:46 ` Matti Aarnio @ 2002-06-24 19:44 ` Richard B. Johnson 2002-06-24 23:34 ` Frank van de Pol 2002-06-25 13:36 ` Chris Friesen 1 sibling, 2 replies; 23+ messages in thread From: Richard B. Johnson @ 2002-06-24 19:44 UTC (permalink / raw) To: Salvatore D'Angelo; +Cc: Chris McDonald, linux-kernel On Mon, 24 Jun 2002, Salvatore D'Angelo wrote: > In this piece of code I convert seconds and microseconds in > milliseconds. I think the problem is not in my code, in fact I wrote the > following piece of code in Java, and it does not work too. In the for > loop the 90% of times b > a while for 10% of times not. > > class Prova { > public static void main(....) { > for (;;) { > long a = System.currentTimeMillis(); > long b = System.currentTimeMillis(); > > if (a > b) { > System.out.println("Wrong!!!!!!!!!!!!!"); > } > } > } > } > > This has been running since I first read your mail about 10:00 this morning. The kernel is 2.4.18 #include <stdio.h> #include <sys/time.h> #define MICRO 1000000 #define ULL unsigned long long int main(void); static ULL tim(void); static ULL tim() { struct timeval t; (void)gettimeofday(&t, NULL); return (ULL)t.tv_sec * MICRO + (ULL)t.tv_usec; } int main() { ULL a, b, cnt; for(cnt=0;;cnt++) { a = tim(); b = tim(); if(b < a) break; } printf("Failed after %llu\n", cnt); printf("a = %llu, b = %llu\n", a, b); return 0; } It seems to work fine. That said, I didn't use your code or check for the possibility of a sign-change miscompare. I just made sure I don't have any by using unsigned stuff. You may want to try this code to see if you have a compiler (or coding) problem. Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). Windows-2000/Professional isn't. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 19:44 ` Richard B. Johnson @ 2002-06-24 23:34 ` Frank van de Pol 2002-06-25 11:42 ` Richard B. Johnson 2002-06-25 13:36 ` Chris Friesen 1 sibling, 1 reply; 23+ messages in thread From: Frank van de Pol @ 2002-06-24 23:34 UTC (permalink / raw) To: Richard B. Johnson; +Cc: Salvatore D'Angelo, Chris McDonald, linux-kernel Richard, I gave your test program a try, and the problem is reproducably triggered: frank@idefix:~/projects$ ./tod Failed after 2008 a = 1024961239107823, b = 1024961238254652 frank@idefix:~/projects$ ./tod Failed after 1394 a = 1024961252215231, b = 1024961239964379 frank@idefix:~/projects$ ./tod Failed after 110179 a = 1024961241574880, b = 1024961241573638 frank@idefix:~/projects$ ./tod Failed after 4891 a = 1024961242145265, b = 1024961242144027 frank@idefix:~/projects$ ./tod Failed after 45008 a = 1024961243210834, b = 1024961242943904 frank@idefix:~/projects$ ./tod Failed after 6695 a = 1024961243405068, b = 1024961243403828 frank@idefix:~/projects$ ./tod Failed after 1487 a = 1024961244216903, b = 1024961244093738 frank@idefix:~/projects$ ./tod Failed after 3275 a = 1024961245674712, b = 1024961245673487 frank@idefix:~/projects$ ./tod Failed after 42122 a = 1024961259065626, b = 1024961246333377 frank@idefix:~/projects$ ./tod Failed after 425 a = 1024961253600435, b = 1024961252652402 frank@idefix:~/projects$ Possibly it might have to do with the fact that my machine is an SMP box (dual pentium II) $ uname -a Linux idefix 2.4.18 #3 SMP Sat Apr 20 14:35:58 CEST 2002 i686 unknown In fact it is almost symmetrical (one CPU seems to be faster than the other, eventhough they are running at the same clock speed). $ cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 3 model name : Pentium II (Klamath) stepping : 4 cpu MHz : 333.225 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov mmx bogomips : 665.19 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 5 model name : Pentium II (Deschutes) stepping : 2 cpu MHz : 333.225 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx bogomips : 748.74 On Mon, Jun 24, 2002 at 03:44:02PM -0400, Richard B. Johnson wrote: > On Mon, 24 Jun 2002, Salvatore D'Angelo wrote: > > > In this piece of code I convert seconds and microseconds in > > milliseconds. I think the problem is not in my code, in fact I wrote the > > following piece of code in Java, and it does not work too. In the for > > loop the 90% of times b > a while for 10% of times not. > > > > class Prova { > > public static void main(....) { > > for (;;) { > > long a = System.currentTimeMillis(); > > long b = System.currentTimeMillis(); > > > > if (a > b) { > > System.out.println("Wrong!!!!!!!!!!!!!"); > > } > > } > > } > > } > > > > > > This has been running since I first read your mail about 10:00 this > morning. The kernel is 2.4.18 > > #include <stdio.h> > #include <sys/time.h> > #define MICRO 1000000 > #define ULL unsigned long long > int main(void); > static ULL tim(void); > > static ULL tim() > { > struct timeval t; > (void)gettimeofday(&t, NULL); > return (ULL)t.tv_sec * MICRO + (ULL)t.tv_usec; > } > int main() > { > ULL a, b, cnt; > for(cnt=0;;cnt++) > { > a = tim(); > b = tim(); > if(b < a) > break; > } > printf("Failed after %llu\n", cnt); > printf("a = %llu, b = %llu\n", a, b); > return 0; > } > > It seems to work fine. That said, I didn't use your code or check > for the possibility of a sign-change miscompare. I just made sure > I don't have any by using unsigned stuff. You may want to try > this code to see if you have a compiler (or coding) problem. > > > Cheers, > Dick Johnson > > Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). > > Windows-2000/Professional isn't. > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- +---- --- -- - - - - | Frank van de Pol -o) A-L-S-A | FvdPol@home.nl /\\ Sounds good! | http://www.alsa-project.org _\_v | Linux - Why use Windows if we have doors available? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 23:34 ` Frank van de Pol @ 2002-06-25 11:42 ` Richard B. Johnson 2002-06-28 18:04 ` george anzinger 0 siblings, 1 reply; 23+ messages in thread From: Richard B. Johnson @ 2002-06-25 11:42 UTC (permalink / raw) To: Frank van de Pol; +Cc: Salvatore D'Angelo, Chris McDonald, linux-kernel On Tue, 25 Jun 2002, Frank van de Pol wrote: > > > Richard, > > I gave your test program a try, and the problem is reproducably triggered: > > > frank@idefix:~/projects$ ./tod > Failed after 2008 > a = 1024961239107823, b = 1024961238254652 > frank@idefix:~/projects$ ./tod > Failed after 1394 > a = 1024961252215231, b = 1024961239964379 > frank@idefix:~/projects$ ./tod > Failed after 110179 > a = 1024961241574880, b = 1024961241573638 > frank@idefix:~/projects$ ./tod > Failed after 4891 > a = 1024961242145265, b = 1024961242144027 > frank@idefix:~/projects$ ./tod > Failed after 45008 > a = 1024961243210834, b = 1024961242943904 > frank@idefix:~/projects$ ./tod > Failed after 6695 > a = 1024961243405068, b = 1024961243403828 > frank@idefix:~/projects$ ./tod > Failed after 1487 > a = 1024961244216903, b = 1024961244093738 > frank@idefix:~/projects$ ./tod > Failed after 3275 > a = 1024961245674712, b = 1024961245673487 > frank@idefix:~/projects$ ./tod > Failed after 42122 > a = 1024961259065626, b = 1024961246333377 > frank@idefix:~/projects$ ./tod > Failed after 425 > a = 1024961253600435, b = 1024961252652402 > frank@idefix:~/projects$ > > > Possibly it might have to do with the fact that my machine is an SMP box > (dual pentium II) > > $ uname -a > > Linux idefix 2.4.18 #3 SMP Sat Apr 20 14:35:58 CEST 2002 i686 unknown > > In fact it is almost symmetrical (one CPU seems to be faster than the other, > eventhough they are running at the same clock speed). > > $ cat /proc/cpuinfo > processor : 0 > vendor_id : GenuineIntel > cpu family : 6 > model : 3 > model name : Pentium II (Klamath) > stepping : 4 > cpu MHz : 333.225 > cache size : 512 KB > fdiv_bug : no > hlt_bug : no > f00f_bug : no > coma_bug : no > fpu : yes > fpu_exception : yes > cpuid level : 2 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov mmx > bogomips : 665.19 > > processor : 1 > vendor_id : GenuineIntel > cpu family : 6 > model : 5 > model name : Pentium II (Deschutes) > stepping : 2 > cpu MHz : 333.225 > cache size : 512 KB > fdiv_bug : no > hlt_bug : no > f00f_bug : no > coma_bug : no > fpu : yes > fpu_exception : yes > cpuid level : 2 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx > bogomips : 748.74 > > Well, I have a dual Pentium also, but I didn't mix two different CPUs as you have done. I'm supprised that your combination even works! It could be a good 'checker' for hard-to-make race conditions. Linux chaos 2.4.18 #14 SMP Mon Jun 17 15:31:13 EDT 2002 i686 processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 5 model name : Pentium II (Deschutes) stepping : 1 cpu MHz : 399.570 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr bogomips : 797.90 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 5 model name : Pentium II (Deschutes) stepping : 1 cpu MHz : 399.570 cache size : 512 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr bogomips : 797.90 I ran my program all night on another machine and it's still running. Neither of these machines are trying to sync with NIST. Machines that are running timing daemons that attempt to sync their clocks could, of course, have problems with time-jumps. Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). Windows-2000/Professional isn't. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 11:42 ` Richard B. Johnson @ 2002-06-28 18:04 ` george anzinger 0 siblings, 0 replies; 23+ messages in thread From: george anzinger @ 2002-06-28 18:04 UTC (permalink / raw) To: root; +Cc: Frank van de Pol, Salvatore D'Angelo, Chris McDonald, linux-kernel "Richard B. Johnson" wrote: > <snip> > I ran my program all night on another machine and it's still running. > Neither of these machines are trying to sync with NIST. Machines that > are running timing daemons that attempt to sync their clocks could, of > course, have problems with time-jumps. > Me thinks it is time to fix this NIST/NTP issue. The problem is that we are adjusting the wall clock every 1/HZ tick instead of adjusting the 1/HZ tick AND the interpolation constant. What happens is (in the X86) is that we assume that the conversion of TSC to usec is fixed and exact as computed at boot time. The time sync protocols have a more "correct" story to tell. We need to incorporate this information into the TSC to usec conversion so that the wall clock correction for times between 1/HZ ticks agrees with what is done at the tick time. > -- George Anzinger george@mvista.com High-res-timers: http://sourceforge.net/projects/high-res-timers/ Real time sched: http://sourceforge.net/projects/rtsched/ Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-24 19:44 ` Richard B. Johnson 2002-06-24 23:34 ` Frank van de Pol @ 2002-06-25 13:36 ` Chris Friesen 2002-06-26 10:58 ` Gabriel Paubert 1 sibling, 1 reply; 23+ messages in thread From: Chris Friesen @ 2002-06-25 13:36 UTC (permalink / raw) To: root; +Cc: Salvatore D'Angelo, Chris McDonald, linux-kernel "Richard B. Johnson" wrote: > This has been running since I first read your mail about 10:00 this > morning. The kernel is 2.4.18 <code snipped, find it on google> Using that code, I can reliably trigger the fault on 2.2.17 on a G4 desktop, while it doesn't trigger on 2.4.18 on a G4 cPCI blade. I saw this a long time back (a year or two) on 2.2 but never really tracked it down properly as it wasn't a showstopper and I had other things to do at the time. Chris -- Chris Friesen | MailStop: 043/33/F10 Nortel Networks | work: (613) 765-0557 3500 Carling Avenue | fax: (613) 765-2986 Nepean, ON K2H 8E9 Canada | email: cfriesen@nortelnetworks.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 13:36 ` Chris Friesen @ 2002-06-26 10:58 ` Gabriel Paubert 0 siblings, 0 replies; 23+ messages in thread From: Gabriel Paubert @ 2002-06-26 10:58 UTC (permalink / raw) To: Chris Friesen, linux-kernel Chris Friesen wrote: > "Richard B. Johnson" wrote: > > >>This has been running since I first read your mail about 10:00 this >>morning. The kernel is 2.4.18 >> > > <code snipped, find it on google> > > Using that code, I can reliably trigger the fault on 2.2.17 on a G4 desktop, > while it doesn't trigger on 2.4.18 on a G4 cPCI blade. Given the fact that the relevant code has been almost completely rewritten for 2.4 (I'm the culprit), this is not surprising. The rewrite was mostly to make the code more robust in case of missed ticks: all PPC are now guaranteed to withstand at least (HZ-1) lost clock interrupts before things go awfully wrong (this was very useful when the frame buffer drivers blocked interrupts for too long during screen switching). There are also several other less visible bug fixes: when connected to an NTP server, in some cases the timebase frequency appeared to vary depending on system load (fixed by a different algorithm to compute values to load into the decrementer) and other bogosities. Now if you are able to trigger a problem with 2.4. I'm extremely interested. > > I saw this a long time back (a year or two) on 2.2 but never really tracked it > down properly as it wasn't a showstopper and I had other things to do at the > time. I have a patch for 2.2, but it is mixed up in a set of huge patches for Motorola VME boards (MVME2600 and 2400 series) with a bootloader (including an x86 emulator to initialize graphics boards by executing the BIOS ROM), VME drivers and other things that you can find at: ftp://vlab1.iram.es/pub/linux-2.2 I could extract the timing patches if somebody is interested, but not in the next few days (I have patches which apply to more recent versions than the ones I put on the FTP area). Gabriel. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem
@ 2002-06-25 0:37 Christian Robert
2002-06-25 0:43 ` Brad Hards
0 siblings, 1 reply; 23+ messages in thread
From: Christian Robert @ 2002-06-25 0:37 UTC (permalink / raw)
To: linux-kernel
For your eyes,
while reading this thread I wrote a sample program to test if the clock
sometimes goes backward/forward.
I started my program while continuing reading threads on the linux-kernel
archive. After about 90 minutes of continuous running I went back to the window
running the program and surprise I saw this:
$ ./tloop
Bump negative -4294967295
^C
Summary:
-------
Min = 0
Max = 257845
Avg = 1 (6009092476/5521919279)
So it looks like the time changed somewhere of a value +/- 4,295 seconds.
kernel 2.4.18
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 1
cpu MHz : 602.566
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1202.58
--------------------- program tloop.c -------------------------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <signal.h>
typedef long long LL;
LL GetTime (void)
{
struct timeval tv;
LL retval;
gettimeofday (&tv, NULL);
retval = (tv.tv_sec * 1000000) + (tv.tv_usec);
return retval;
}
volatile int Break = 0;
void Trap (int sig)
{
Break = 1;
}
int main (void)
{
LL Now, Old;
LL Dt, Min=9999999, Max=0, Num=0, Tot=0;
Old = Now = GetTime ();
signal (SIGINT, Trap);
signal (SIGQUIT, Trap);
for ( ; Break==0 ; )
{
Now = GetTime();
if (Now < Old)
{
printf ("Bump negative %lld\n", (Now-Old));
}
else
{
Dt = Now-Old;
Min = (Dt < Min) ? Dt : Min;
Max = (Dt > Max) ? Dt : Max;
Tot += Dt;
Num += 1;
}
Old = Now;
}
printf ("Summary:\n-------\n Min = %lld\n Max = %lld\n "
"Avg = %lld (%lld/%lld)\n", Min, Max, Tot/Num, Tot, Num);
return 0;
}
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: gettimeofday problem 2002-06-25 0:37 Christian Robert @ 2002-06-25 0:43 ` Brad Hards 2002-06-25 2:03 ` Christian Robert 2002-06-25 11:45 ` Richard B. Johnson 0 siblings, 2 replies; 23+ messages in thread From: Brad Hards @ 2002-06-25 0:43 UTC (permalink / raw) To: Christian Robert, linux-kernel On Tue, 25 Jun 2002 10:37, Christian Robert wrote: > gettimeofday (&tv, NULL); How about checking the return value of the function call? Brad -- http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 0:43 ` Brad Hards @ 2002-06-25 2:03 ` Christian Robert 2002-06-25 2:47 ` John Alvord 2002-06-25 11:45 ` Richard B. Johnson 1 sibling, 1 reply; 23+ messages in thread From: Christian Robert @ 2002-06-25 2:03 UTC (permalink / raw) To: Brad Hards; +Cc: linux-kernel Brad Hards wrote: > > On Tue, 25 Jun 2002 10:37, Christian Robert wrote: > > gettimeofday (&tv, NULL); > How about checking the return value of the function call? > > Brad > -- > http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. $ time ./tloop Bump negative -4294967295 Summary: ------- Min = 0 Max = 140068 Avg = 1 (4064861295/3825797418) real 67m44.891s user 29m29.690s sys 27m53.130s Same thing. \x14Took about an hour before getting the negative bump. Xtian. ---- modified GetTime() checking return value of gettimeofday() ----- LL GetTime (void) { struct timeval tv; LL retval; int rc; while (0 != (rc = gettimeofday (&tv, NULL))) printf ("Wow! gettimeofday () returned %d\n", rc); retval = (tv.tv_sec * 1000000) + (tv.tv_usec); return retval; } ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 2:03 ` Christian Robert @ 2002-06-25 2:47 ` John Alvord 2002-06-25 9:17 ` Christian Robert 0 siblings, 1 reply; 23+ messages in thread From: John Alvord @ 2002-06-25 2:47 UTC (permalink / raw) To: Christian Robert; +Cc: Brad Hards, linux-kernel Maybe this is the result of floating point rounding errors. Floating point is notorious for occaisional strange results. I suggest redoing the test program to keep all results in integer and seeing what happens... john On Mon, 24 Jun 2002 22:03:12 -0400, Christian Robert <xtian-test@sympatico.ca> wrote: >Brad Hards wrote: >> >> On Tue, 25 Jun 2002 10:37, Christian Robert wrote: >> > gettimeofday (&tv, NULL); >> How about checking the return value of the function call? >> >> Brad >> -- >> http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. > >$ time ./tloop >Bump negative -4294967295 >Summary: >------- > Min = 0 > Max = 140068 > Avg = 1 (4064861295/3825797418) > >real 67m44.891s >user 29m29.690s >sys 27m53.130s > > >Same thing. \x14Took about an hour before getting the negative bump. > >Xtian. > >---- modified GetTime() checking return value of gettimeofday() ----- > >LL GetTime (void) >{ > struct timeval tv; > LL retval; > int rc; > > while (0 != (rc = gettimeofday (&tv, NULL))) > printf ("Wow! gettimeofday () returned %d\n", rc); > > retval = (tv.tv_sec * 1000000) + (tv.tv_usec); > return retval; >} >- >To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >the body of a message to majordomo@vger.kernel.org >More majordomo info at http://vger.kernel.org/majordomo-info.html >Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 2:47 ` John Alvord @ 2002-06-25 9:17 ` Christian Robert 2002-06-25 10:00 ` Jan Hudec 0 siblings, 1 reply; 23+ messages in thread From: Christian Robert @ 2002-06-25 9:17 UTC (permalink / raw) To: John Alvord; +Cc: linux-kernel John Alvord wrote: > > Maybe this is the result of floating point rounding errors. Floating > point is notorious for occaisional strange results. I suggest redoing > the test program to keep all results in integer and seeing what > happens... > > john You were close. Programming error on my part. This modified subroutine does correct my test. Notice the (LL) cast on 1000000 Computation was done on 32 bits integer then assign on the 64 bits integer. sorry. ps: It may help explain the other > 1 hour time jump I've seen in an other thread. LL GetTime (void) { struct timeval tv; LL retval; int rc; while (0 != (rc = gettimeofday (&tv, NULL))) printf ("Wow! gettimeofday () returned %d\n", rc); retval = (tv.tv_sec * (LL)1000000) + (tv.tv_usec); return retval; } ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 9:17 ` Christian Robert @ 2002-06-25 10:00 ` Jan Hudec 2002-07-19 12:17 ` Amos Waterland 0 siblings, 1 reply; 23+ messages in thread From: Jan Hudec @ 2002-06-25 10:00 UTC (permalink / raw) To: linux-kernel On Tue, Jun 25, 2002 at 05:17:52AM -0400, Christian Robert wrote: > John Alvord wrote: > > Maybe this is the result of floating point rounding errors. Floating > > point is notorious for occaisional strange results. I suggest redoing > > the test program to keep all results in integer and seeing what > > happens... > You were close. > Programming error on my part. What about comparing the struct timeval things directly? There is even a timercmp macro for that (well I noticed that in the manpage when I have olrady had the test written; the macro can only do sharp comparsions). Something like this: (I am now running it on three machines - Athlon 850, Pentium 1500 and dual Pentium III 500 - all seem to be OK so far) #include<stdio.h> #include<errno.h> #include<sys/time.h> #include<signal.h> volatile int loop = 1; void sigint(int foo) { loop = 0; } int main(void) { unsigned long long cnt = 0, bcnt = 0, ecnt = 0; struct timeval old, new = {0, 0}; signal(SIGINT, sigint); while(loop && cnt < (1LLU<<54)) { cnt++; old = new; if(gettimeofday(&new, NULL)) { ecnt++; printf("Error #%llu: count=%llu" " error/count=0.%04llu errno=%i (%s)\n", ecnt, cnt, (10000*ecnt)/cnt, errno, sys_errlist[errno]); continue; } if((new.tv_sec < old.tv_sec) || ((new.tv_sec == old.tv_sec) && (new.tv_usec < old.tv_usec))) { bcnt++; printf("Skew #%llu: count=%llu errors=%llu" " skew/good count=0.%04llu, new=(%li," " %li) old=(%li, %li)\n", bcnt, cnt, ecnt, (10000*bcnt)/(cnt-ecnt), new.tv_sec, new.tv_usec, old.tv_sec, old.tv_usec); } } printf("Counted %llu, errors %llu (0.%04llu), skews %llu" " (0.%04llu)\n", cnt, ecnt, (10000*ecnt)/cnt, bcnt, (10000*bcnt)/(cnt-ecnt)); return 0; } ------------------------------------------------------------------------------- Jan 'Bulb' Hudec <bulb@ucw.cz> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 10:00 ` Jan Hudec @ 2002-07-19 12:17 ` Amos Waterland 0 siblings, 0 replies; 23+ messages in thread From: Amos Waterland @ 2002-07-19 12:17 UTC (permalink / raw) To: Jan Hudec, linux-kernel; +Cc: Tom Gall There may be several distinct errors that people are observing. I wrote a program that makes 32 gettimeofday() calls in a tight loop so as to try to catch rare anomalies, and after running for about 24 hours, it just triggered a strange problem. On 2.4.18-3 on a Pentium III, the following output shows a large jump in the tv_usec value, followed by an immediate return to the monotonically increasing sequence, but with a large sequence discontinuity. So in this case the time is not so much "going backwards", but rather making a huge jump forwards and then returning to the original sequence. Hope this helps. % ./test0035 Run: 1844080460; Timer went backwards!; dump: 1027095141 707896 1027095141 707896 1027095141 707897 1027095141 707897 1027095141 707898 1027095141 707898 1027095141 707899 1027095141 707899 1027095141 707900 1027095141 707900 1027095141 707901 1027095141 707901 1027095141 707902 1027095141 707902 1027095141 707903 1027095141 707903 1027095141 852234 <-- this is the anomaly 1027095141 712510 <-- sequence returns with 4607us discontinuity 1027095141 712510 1027095141 712511 1027095141 712511 1027095141 712512 1027095141 712512 1027095141 712513 1027095141 712513 1027095141 712514 1027095141 712514 1027095141 712515 1027095141 712515 1027095141 712516 1027095141 712516 1027095141 712517 % cat test0035.c /* Test for timer weirdness. * Amos Waterland <apw@us.ibm.com> * 15 July 2002 */ #include <errno.h> #include <error.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <sys/time.h> int main (int argc, char *argv[]) { int result = 0; { const int ELEMS = 32; struct timeval tm[ELEMS]; long j, i; while (1) { ++j; for (i = 0; i < ELEMS; i++) { gettimeofday (&(tm[i]), NULL); } for (i = 1; i < ELEMS; i++) { if ((long) tm[i].tv_sec > (long) tm[i - 1].tv_sec) continue; if ((long) tm[i].tv_sec < (long) tm[i - 1].tv_sec || (long) tm[i].tv_usec < (long) tm[i - 1].tv_usec) { printf ("\nRun: %li; Timer went backwards!; dump:\n", j); for (i = 0; i < ELEMS; i++) { printf ("%li %li\n", (long) tm[i].tv_sec, (long) tm[i].tv_usec); } } } } } exit (result); } On Tue, Jun 25, 2002 at 12:00:57PM +0200, Jan Hudec wrote: > On Tue, Jun 25, 2002 at 05:17:52AM -0400, Christian Robert wrote: > > John Alvord wrote: > > > Maybe this is the result of floating point rounding errors. Floating > > > point is notorious for occaisional strange results. I suggest redoing > > > the test program to keep all results in integer and seeing what > > > happens... > > You were close. > > Programming error on my part. > > What about comparing the struct timeval things directly? There is even a > timercmp macro for that (well I noticed that in the manpage when I > have olrady had the test written; the macro can only do sharp comparsions). > > Something like this: > (I am now running it on three machines - Athlon 850, Pentium 1500 and dual > Pentium III 500 - all seem to be OK so far) > > #include<stdio.h> > #include<errno.h> > #include<sys/time.h> > #include<signal.h> > > volatile int loop = 1; > > void sigint(int foo) { > loop = 0; > } > > int main(void) { > unsigned long long cnt = 0, bcnt = 0, ecnt = 0; > struct timeval old, new = {0, 0}; > > signal(SIGINT, sigint); > while(loop && cnt < (1LLU<<54)) { > cnt++; > old = new; > if(gettimeofday(&new, NULL)) { > ecnt++; > printf("Error #%llu: count=%llu" > " error/count=0.%04llu errno=%i (%s)\n", > ecnt, cnt, (10000*ecnt)/cnt, errno, > sys_errlist[errno]); > continue; > } > if((new.tv_sec < old.tv_sec) || ((new.tv_sec == > old.tv_sec) && (new.tv_usec < old.tv_usec))) { bcnt++; > printf("Skew #%llu: count=%llu errors=%llu" > " skew/good count=0.%04llu, new=(%li," > " %li) old=(%li, %li)\n", bcnt, cnt, > ecnt, (10000*bcnt)/(cnt-ecnt), > new.tv_sec, new.tv_usec, old.tv_sec, > old.tv_usec); > } > } > > printf("Counted %llu, errors %llu (0.%04llu), skews %llu" > " (0.%04llu)\n", cnt, ecnt, (10000*ecnt)/cnt, bcnt, > (10000*bcnt)/(cnt-ecnt)); > return 0; > } ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 0:43 ` Brad Hards 2002-06-25 2:03 ` Christian Robert @ 2002-06-25 11:45 ` Richard B. Johnson 2002-06-25 11:50 ` Brad Hards 1 sibling, 1 reply; 23+ messages in thread From: Richard B. Johnson @ 2002-06-25 11:45 UTC (permalink / raw) To: Brad Hards; +Cc: Christian Robert, linux-kernel On Tue, 25 Jun 2002, Brad Hards wrote: > On Tue, 25 Jun 2002 10:37, Christian Robert wrote: > > gettimeofday (&tv, NULL); > How about checking the return value of the function call? > > Brad > -- > http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. I think the only possible error returned would relate to the time-zone being invalid. The time-zone pointer being a NULL is valid so it isn't going to return EINVAL. Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). Windows-2000/Professional isn't. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: gettimeofday problem 2002-06-25 11:45 ` Richard B. Johnson @ 2002-06-25 11:50 ` Brad Hards 0 siblings, 0 replies; 23+ messages in thread From: Brad Hards @ 2002-06-25 11:50 UTC (permalink / raw) To: root; +Cc: Christian Robert, linux-kernel On Tue, 25 Jun 2002 21:45, Richard B. Johnson wrote: > On Tue, 25 Jun 2002, Brad Hards wrote: > > On Tue, 25 Jun 2002 10:37, Christian Robert wrote: > > > gettimeofday (&tv, NULL); > > > > How about checking the return value of the function call? > > > > Brad > > -- > > http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. > > I think the only possible error returned would relate to the time-zone > being invalid. The time-zone pointer being a NULL is valid so it isn't > going to return EINVAL. It was just a thought - it just seemed a reasonable thing to test for, since the call can fail. I didn't check the lib code, so I imagined that their might be some glibc munging of the syscall output. man gettimeofday sez, inter alia: EINVAL Timezone (or something else) is invalid. EFAULT One of tv or tz pointed outside your accessible address space. As you pointed out, EINVAL doesn't seem too likely, and I can't see a pointer problem. So it looks like something else. Brad -- http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black. ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2002-07-19 17:14 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-06-24 8:55 gettimeofday problem Salvatore D'Angelo
[not found] ` <200206240934.g5O9YL524660@budgie.cs.uwa.edu.au>
2002-06-24 10:20 ` Salvatore D'Angelo
2002-06-24 12:46 ` Matti Aarnio
2002-06-24 13:57 ` Salvatore D'Angelo
2002-06-24 18:56 ` Karim Yaghmour
2002-06-26 11:01 ` Gabriel Paubert
2002-07-01 2:30 ` Pavel Machek
2002-07-01 2:29 ` Pavel Machek
2002-06-24 19:44 ` Richard B. Johnson
2002-06-24 23:34 ` Frank van de Pol
2002-06-25 11:42 ` Richard B. Johnson
2002-06-28 18:04 ` george anzinger
2002-06-25 13:36 ` Chris Friesen
2002-06-26 10:58 ` Gabriel Paubert
-- strict thread matches above, loose matches on Subject: below --
2002-06-25 0:37 Christian Robert
2002-06-25 0:43 ` Brad Hards
2002-06-25 2:03 ` Christian Robert
2002-06-25 2:47 ` John Alvord
2002-06-25 9:17 ` Christian Robert
2002-06-25 10:00 ` Jan Hudec
2002-07-19 12:17 ` Amos Waterland
2002-06-25 11:45 ` Richard B. Johnson
2002-06-25 11:50 ` Brad Hards
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox