* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 1:58 /proc or ps tools bug? 2.6.3, time is off David Ford
@ 2004-02-25 1:54 ` Albert Cahalan
2004-02-25 5:10 ` David Ford
2004-02-25 9:14 ` Petri Kaukasoina
1 sibling, 1 reply; 36+ messages in thread
From: Albert Cahalan @ 2004-02-25 1:54 UTC (permalink / raw)
To: David Ford; +Cc: linux-kernel mailing list, albert
On Tue, 2004-02-24 at 20:58, David Ford wrote:
> Kernel 2.6.3, procps 3.2.0
>
> # while [ 1 ]; do (ps aux|grep "grep ps aux" && date) ; sleep 1; done
> root 20043 0.0 0.0 1504 456 pts/0 R 20:45 0:00 grep grep ps aux
> Tue Feb 24 20:45:25 EST 2004
> root 20062 0.0 0.0 1504 460 pts/0 S 20:45 0:00 grep grep ps aux
> Tue Feb 24 20:45:26 EST 2004
> root 20081 0.0 0.0 1504 460 pts/0 S 20:46 0:00 grep grep ps aux
> Tue Feb 24 20:45:27 EST 2004
>
> Note the change in the timestamp as reported by 'ps' v.s. the time
> reported by 'date'.
>
> Repeatable every time at 26 seconds after the minute +/- a portion of a
> second.
I'm not seeing it, with:
procps both 3.1.8 and procps 3.2.0+
kernel 2.6.0-test11
library glibc 2.3
hardware uniprocessor G4 Mac
ntp none (and you can tell by my email!)
Run "ps --info" to gather much of this data.
Note that time is a very awkward thing. You boot up,
with some incorrect clock. Then you adjust the time.
Later, you may discover that your clock has been
running too slow. So you adjust the frequency, but
what about the time that has already passed? Should
you change the boot time to represent what is now
known about your clock? What if, by doing so, you
cause some processes to have started before boot?
Then again, perhaps due to temperature change, you
discover that your clock frequency is wrong... This
is without even getting into the concept of leap
seconds, which are determined a few months in advance.
Two guesses:
1. leap seconds
2. SMP, with cycle counters out of sync
* /proc or ps tools bug? 2.6.3, time is off
@ 2004-02-25 1:58 David Ford
2004-02-25 1:54 ` Albert Cahalan
2004-02-25 9:14 ` Petri Kaukasoina
0 siblings, 2 replies; 36+ messages in thread
From: David Ford @ 2004-02-25 1:58 UTC (permalink / raw)
To: linux-kernel mailing list, albert
Kernel 2.6.3, procps 3.2.0
# while [ 1 ]; do (ps aux|grep "grep ps aux" && date) ; sleep 1; done
root 20043 0.0 0.0 1504 456 pts/0 R 20:45 0:00 grep grep ps aux
Tue Feb 24 20:45:25 EST 2004
root 20062 0.0 0.0 1504 460 pts/0 S 20:45 0:00 grep grep ps aux
Tue Feb 24 20:45:26 EST 2004
root 20081 0.0 0.0 1504 460 pts/0 S 20:46 0:00 grep grep ps aux
Tue Feb 24 20:45:27 EST 2004
Note the change in the timestamp as reported by 'ps' v.s. the time
reported by 'date'.
Repeatable every time at 26 seconds after the minute +/- a portion of a
second.
David
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 5:10 ` David Ford
@ 2004-02-25 3:27 ` Albert Cahalan
2004-02-25 16:28 ` George Anzinger
0 siblings, 1 reply; 36+ messages in thread
From: Albert Cahalan @ 2004-02-25 3:27 UTC (permalink / raw)
To: David Ford; +Cc: Albert Cahalan, linux-kernel mailing list
On Wed, 2004-02-25 at 00:10, David Ford wrote:
> I can see if a process long in the past would have a different time set
> on it, but shouldn't the entry in /proc coincide with the system clock
> that date is accessing? Or how many different "clocks" does the kernel
> have going?
There are way too many clocks, none of which are good.
> Actually, it seems that there is a -significant- time difference in this
> phantom clock now, I suspended my notebook to bring it home from the
> station, and now this time difference is greater than 9 minutes. I
> suspect it's roughly 46 seconds plus the amount of time that my notebook
> was suspended. Yes, I'm running ntpd.
>
> root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
> Wed Feb 25 00:09:09 EST 2004
OK, this is pointing right at the problem.
Linux does not record process start times at all.
Instead, it records the number of clock ticks
from boot until the process starts.
Either the boot time or current time is real.
The other may be computed from the uptime, which
may be measured in clock ticks.
The clock doesn't tick when your laptop sleeps.
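A minimal sketch of the arithmetic described above, as a user-space tool might do it. The function and parameter names are hypothetical; the inputs stand in for a boot time in seconds since the epoch, a process's start tick count, and the tick rate. Nothing here is taken from procps itself.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: estimate a process's wall-clock start time from
 * the boot time (seconds since the epoch) and the number of clock ticks
 * counted between boot and process start.
 */
time_t start_time_from_ticks(time_t boot_time, uint64_t start_ticks,
                             unsigned int ticks_per_sec)
{
    return boot_time + (time_t)(start_ticks / ticks_per_sec);
}

/*
 * The same estimate for a process started after a suspend: ticks did
 * not advance while the machine slept, so the plain estimate is early
 * by the suspended time unless the caller knows the gap and adds it.
 */
time_t start_time_with_gap(time_t boot_time, uint64_t start_ticks,
                           unsigned int ticks_per_sec, time_t suspended_sec)
{
    return start_time_from_ticks(boot_time, start_ticks, ticks_per_sec)
           + suspended_sec;
}
```

With 100 ticks per second, 3000 ticks after a boot at t=1000000 gives 1000030. If the machine slept for 540 seconds before that process started, the uncorrected answer is 540 seconds early, which is the shape of the nine-minute offset reported above.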
I seem to recall recent changes to the way the
boot time in /proc/stat gets reported. In any
case, a sleeping laptop suggests some interesting
questions about process run times.
Here's another one to make you scream: Linux does
not supply real %CPU data.
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 1:54 ` Albert Cahalan
@ 2004-02-25 5:10 ` David Ford
2004-02-25 3:27 ` Albert Cahalan
0 siblings, 1 reply; 36+ messages in thread
From: David Ford @ 2004-02-25 5:10 UTC (permalink / raw)
To: Albert Cahalan; +Cc: linux-kernel mailing list
Albert Cahalan wrote:
>On Tue, 2004-02-24 at 20:58, David Ford wrote:
>
>
>>Kernel 2.6.3, procps 3.2.0
>>
>># while [ 1 ]; do (ps aux|grep "grep ps aux" && date) ; sleep 1; done
>>root 20043 0.0 0.0 1504 456 pts/0 R 20:45 0:00 grep grep ps aux
>>Tue Feb 24 20:45:25 EST 2004
>>root 20062 0.0 0.0 1504 460 pts/0 S 20:45 0:00 grep grep ps aux
>>Tue Feb 24 20:45:26 EST 2004
>>root 20081 0.0 0.0 1504 460 pts/0 S 20:46 0:00 grep grep ps aux
>>Tue Feb 24 20:45:27 EST 2004
>>
>>Note the change in the timestamp as reported by 'ps' v.s. the time
>>reported by 'date'.
>>
>>Repeatable every time at 26 seconds after the minute +/- a portion of a
>>second.
>>
>>
>
>I'm not seeing it, with:
>
>procps both 3.1.8 and procps 3.2.0+
>kernel 2.6.0-test11
>library glibc 2.3
>hardware uniprocessor G4 Mac
>ntp none (and you can tell by my email!)
>
>Run "ps --info" to gather much of this data.
>
>Note that time is a very awkward thing. You boot up,
>with some incorrect clock. Then you adjust the time.
>Later, you may discover that your clock has been
>running too slow. So you adjust the frequency, but
>what about the time that has already passed? Should
>you change the boot time to represent what is now
>known about your clock? What if, by doing so, you
>cause some processes to have started before boot?
>Then again, perhaps due to temperature change, you
>discover that your clock frequency is wrong... This
>is without even getting into the concept of leap
>seconds, which are determined a few months in advance.
>
>Two guesses:
>
>1. leap seconds
>2. SMP, with cycle counters out of sync
>
>
I'm seeing it on two machines now, I'm going to test on more machines as
I get access. The second machine is my notebook with procps 3.1.15 on
it, and it does it at the 46 second mark, also 2.6.3.
I can see if a process long in the past would have a different time set
on it, but shouldn't the entry in /proc coincide with the system clock
that date is accessing? Or how many different "clocks" does the kernel
have going?
powerix conf.d # ps --info
BSD j OL_j
BSD l OL_l
BSD s OL_s
BSD u OL_u
BSD v OL_v
SysV -f (none)
SysV -fl (none)
SysV -j (none)
SysV -l (none)
procps version 3.1.15
Linux version 2.6.3
Compiled with: glibc 2.3, gcc 3.3
header_gap=-1 lines_to_next_header=1
screen_cols=91 screen_rows=29
personality=0x00000000 (from "unknown")
EUID=0 TTY=136,3 Hertz=100 PAGE_SIZE=4096 page_size=4096
sizeof(proc_t)=492 sizeof(long)=4 sizeof(KLONG)=4
archdefs: i386
namelist_file="<no System.map file>"
Actually, it seems that there is a -significant- time difference in this
phantom clock now, I suspended my notebook to bring it home from the
station, and now this time difference is greater than 9 minutes. I
suspect it's roughly 46 seconds plus the amount of time that my notebook
was suspended. Yes, I'm running ntpd.
root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
Wed Feb 25 00:09:09 EST 2004
David
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 1:58 /proc or ps tools bug? 2.6.3, time is off David Ford
2004-02-25 1:54 ` Albert Cahalan
@ 2004-02-25 9:14 ` Petri Kaukasoina
2004-02-25 9:18 ` Petri Kaukasoina
2004-02-25 21:39 ` David Ford
1 sibling, 2 replies; 36+ messages in thread
From: Petri Kaukasoina @ 2004-02-25 9:14 UTC (permalink / raw)
To: linux-kernel mailing list
On Tue, Feb 24, 2004 at 08:58:39PM -0500, David Ford wrote:
> Kernel 2.6.3, procps 3.2.0
>
> Note the change in the timestamp as reported by 'ps' v.s. the time
> reported by 'date'.
>
Hi,
I reported the same problem some time ago. Could you type
grep cpu /proc/stat; cat /proc/uptime
for example, I get
cpu 140708 1489 43735 21209021 292168 4879 4192
cpu0 140708 1489 43735 21209021 292168 4879 4192
216925.15 215037.34
Then add jiffies and divide by uptime:
(140708+1489+43735+21209021+292168+4879+4192)/216925.15 = 100.01695
which is not 100 here as it should be. (On kernel 2.2.* I have it exactly
100). ps uses Hertz=100 but it should be 170 ppm larger which makes an error
of about 15 seconds a day. (Running without ntpd doesn't fix it.)
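The calculation above can be sketched in C as follows. The function names are hypothetical, and the caller is assumed to have already parsed the "cpu" line of /proc/stat and the first field of /proc/uptime.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-state tick counters from the "cpu" line of /proc/stat. */
uint64_t sum_ticks(const uint64_t *fields, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += fields[i];
    return total;
}

/* Observed tick rate: total ticks divided by the uptime in seconds. */
double effective_hz(uint64_t total_ticks, double uptime_sec)
{
    return (double)total_ticks / uptime_sec;
}

/* Relative error against the nominal rate, in parts per million. */
double drift_ppm(double observed_hz, double nominal_hz)
{
    return (observed_hz - nominal_hz) / nominal_hz * 1e6;
}

/* Seconds of error accumulated over one day at the given ppm offset. */
double drift_sec_per_day(double ppm)
{
    return ppm * 1e-6 * 86400.0;
}
```

Fed the seven counters and the uptime quoted above, this gives a total of 21696192 ticks, an observed rate of about 100.01695, an offset of roughly 169.5 ppm, and about 14.6 seconds of drift per day.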
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 9:14 ` Petri Kaukasoina
@ 2004-02-25 9:18 ` Petri Kaukasoina
2004-02-25 21:39 ` David Ford
1 sibling, 0 replies; 36+ messages in thread
From: Petri Kaukasoina @ 2004-02-25 9:18 UTC (permalink / raw)
To: linux-kernel mailing list
On Wed, Feb 25, 2004 at 11:14:25AM +0200, I wrote:
> (On kernel 2.2.* I have it exactly 100).
I meant on kernel 2.4.* it's ok. I don't know about 2.2. Kernel 2.6.3 does
show the problem. Sorry about the confusion with version numbers.
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 16:28 ` George Anzinger
@ 2004-02-25 16:04 ` Albert Cahalan
2004-02-25 20:45 ` George Anzinger
2004-02-25 21:10 ` George Anzinger
0 siblings, 2 replies; 36+ messages in thread
From: Albert Cahalan @ 2004-02-25 16:04 UTC (permalink / raw)
To: George Anzinger; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
On Wed, 2004-02-25 at 11:28, George Anzinger wrote:
> Albert Cahalan wrote:
>> On Wed, 2004-02-25 at 00:10, David Ford wrote:
>>> Actually, it seems that there is a -significant- time difference in this
>>> phantom clock now, I suspended my notebook to bring it home from the
>>> station, and now this time difference is greater than 9 minutes. I
>>> suspect it's roughly 46 seconds plus the amount of time that my notebook
>>> was suspended. Yes, I'm running ntpd.
>>>
>>> root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
>>> Wed Feb 25 00:09:09 EST 2004
>>
>> OK, this is pointing right at the problem.
>>
>> Linux does not record process start times at all.
>> Instead, it records the number of clock ticks
>> from boot until the process starts.
>>
>> Either the boot time or current time is real.
>> The other may be computed from the uptime, which
>> may be measured in clock ticks.
>
> In 2.6.* boot time is captured at boot. This is then adjusted whenever the
> clock is set. Up time is the difference between the saved boot time and the
> current wall clock time.
>
>> The clock doesn't tick when your laptop sleeps.
>
> I would guess that the clock adjustment made when the sleep ends is not
> adjusting the boot time as it should. That code should set the clock by calling
> do_settimeofday() which will do the right thing.
I don't think so. The problem might be fixable by advancing
jiffies, crediting the extra ticks to idle time.
Consider the current situation as I know it, in jiffies:
00000 boot
10000 process 42 starts
20000 go to sleep
20000 wake (same jiffies, different time)
30000 process 51 starts
40000 ps examines the state of the system
Process 42 was started 10 seconds after boot. (10000 jiffies)
Process 51 appears to be started 30 seconds after boot. (30000 jiffies + ???)
Now we want to compute:
1. real-world date and time for process start
2. length of process lifetime (real-world or not?)
What works for process 42 won't work for process 51,
because they are on different sides of a hidden gap.
Another way to fix the problem is to move the boot time.
It's kind of sick, but so are the alternatives.
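The timeline above can be sketched as a small model. It assumes HZ=1000 (implied by "10000 jiffies" being 10 seconds); the function names are hypothetical. The two helpers are the two obvious ways to turn a start-jiffies value into a real-world time: anchor at boot, or anchor at the current time.

```c
#include <stdint.h>

#define HZ_ASSUMED 1000  /* implied by "10000 jiffies" == 10 seconds */

/* Start time anchored at boot: boot + start_jiffies / HZ. */
int64_t start_from_boot(int64_t boot_sec, int64_t start_jiffies)
{
    return boot_sec + start_jiffies / HZ_ASSUMED;
}

/* Start time anchored at "now": now - (now_jiffies - start_jiffies) / HZ. */
int64_t start_from_now(int64_t now_sec, int64_t now_jiffies,
                       int64_t start_jiffies)
{
    return now_sec - (now_jiffies - start_jiffies) / HZ_ASSUMED;
}
```

Take boot at t=0 and a hypothetical 600-second sleep at jiffies 20000, so ps runs at t=640 with jiffies 40000. Anchoring at boot gives 10 for process 42 (correct) and 30 for process 51 (really 630). Anchoring at now gives 630 for process 51 (correct) and 610 for process 42 (really 10). Neither anchor is right on both sides of the gap.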
> As to small drifts of ~170 PPM, they are caused by code (ps I would guess) that
> assumes that jiffies is exactly 1/HZ whereas it is NOT in the 2.6.* kernel. The
> size of the jiffie that the kernel uses is returned by:
>
> struct timespec tv;
> :
> :
> clock_getres(CLOCK_REALTIME, &tv);
>
> This will be in nanoseconds (and must be as that is what the wall clock is in).
This is NOT sane. Remember that procps doesn't get to see HZ.
Only USER_HZ is available, as the AT_CLKTCK ELF note.
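For reference, a sketch of how a user-space program can read that value on a glibc-based Linux system. What the message calls an ELF note reaches the process through the auxiliary vector; getauxval() needs glibc 2.16 or later, and the helper name user_hz is hypothetical.

```c
#include <elf.h>       /* AT_CLKTCK */
#include <sys/auxv.h>  /* getauxval(), glibc 2.16 or later */
#include <unistd.h>    /* sysconf() */

/*
 * Return the tick rate the kernel advertises to user space.
 * Reads the AT_CLKTCK auxiliary-vector entry first and falls back to
 * sysconf(_SC_CLK_TCK) if the entry is absent (getauxval returns 0).
 */
long user_hz(void)
{
    unsigned long from_auxv = getauxval(AT_CLKTCK);

    if (from_auxv != 0)
        return (long)from_auxv;
    return sysconf(_SC_CLK_TCK);
}
```

Note that this reports only USER_HZ; the kernel's internal HZ, and the true length of a jiffy, remain invisible from here.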
I think the way to fix this is to skip or add a tick
every now and then, so that the long-term HZ is exact.
Another way is to simply choose between pure old-style
tick-based timekeeping and pure new-style cycle-based
(TSC or ACPI) timekeeping. Systems with uncooperative
hardware have to use the old-style time keeping. This
should simplify the code greatly.
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 3:27 ` Albert Cahalan
@ 2004-02-25 16:28 ` George Anzinger
2004-02-25 16:04 ` Albert Cahalan
0 siblings, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-02-25 16:28 UTC (permalink / raw)
To: Albert Cahalan; +Cc: David Ford, linux-kernel mailing list
Albert Cahalan wrote:
> On Wed, 2004-02-25 at 00:10, David Ford wrote:
>
>
>>I can see if a process long in the past would have a different time set
>>on it, but shouldn't the entry in /proc coincide with the system clock
>>that date is accessing? Or how many different "clocks" does the kernel
>>have going?
>
>
> There are way too many clocks, none of which are good.
>
>
>>Actually, it seems that there is a -significant- time difference in this
>>phantom clock now, I suspended my notebook to bring it home from the
>>station, and now this time difference is greater than 9 minutes. I
>>suspect it's roughly 46 seconds plus the amount of time that my notebook
>>was suspended. Yes, I'm running ntpd.
>>
>>root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
>>Wed Feb 25 00:09:09 EST 2004
>
>
> OK, this is pointing right at the problem.
>
> Linux does not record process start times at all.
> Instead, it records the number of clock ticks
> from boot until the process starts.
>
> Either the boot time or current time is real.
> The other may be computed from the uptime, which
> may be measured in clock ticks.
In 2.6.* boot time is captured at boot. This is then adjusted whenever the
clock is set. Up time is the difference between the saved boot time and the
current wall clock time.
>
> The clock doesn't tick when your laptop sleeps.
I would guess that the clock adjustment made when the sleep ends is not
adjusting the boot time as it should. That code should set the clock by calling
do_settimeofday() which will do the right thing.
As to small drifts of ~170 PPM, they are caused by code (ps I would guess) that
assumes that jiffies is exactly 1/HZ whereas it is NOT in the 2.6.* kernel. The
size of the jiffie that the kernel uses is returned by:
struct timespec tv;
:
:
clock_getres(CLOCK_REALTIME, &tv);
This will be in nanoseconds (and must be as that is what the wall clock is in).
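A user-space sketch of that call, using the POSIX name clock_getres(). The helper names are hypothetical. On later kernels with high-resolution timers enabled, the reported resolution is typically 1 ns and no longer reflects the tick length, so treat this as an illustration of the suggestion rather than a reliable way to measure a jiffy.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_getres() in strict -std= modes */
#endif
#include <time.h>

/* Return the resolution of CLOCK_REALTIME in nanoseconds, or -1 on error. */
long long realtime_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Tick rate implied by a resolution, e.g. 10000000 ns -> 100 Hz. */
double implied_hz(long long resolution_ns)
{
    return 1e9 / (double)resolution_ns;
}
```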
George
> I seem to recall recent changes to the way the
> boot time in /proc/stat gets reported. In any
> case, a sleeping laptop suggests some interesting
> questions about process run times.
>
> Here's another one to make you scream: Linux does
> not supply real %CPU data.
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 20:45 ` George Anzinger
@ 2004-02-25 19:16 ` Albert Cahalan
0 siblings, 0 replies; 36+ messages in thread
From: Albert Cahalan @ 2004-02-25 19:16 UTC (permalink / raw)
To: George Anzinger; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
George Anzinger writes:
> Albert Cahalan wrote:
>> On Wed, 2004-02-25 at 11:28, George Anzinger wrote:
>>> As to small drifts of ~170 PPM, they are caused by code
>>> (ps I would guess) that assumes that jiffies is exactly
>>> 1/HZ whereas it is NOT in the 2.6.* kernel. The size of
>>> the jiffie that the kernel uses is returned by:
>>>
>>> struct timespec tv;
>>> :
>>> :
>>> clock_getres(CLOCK_REALTIME, &tv);
>>>
>>> This will be in nanoseconds (and must be as that is what the wall clock is in).
>>
>> This is NOT sane. Remember that procps doesn't get to see HZ.
>> Only USER_HZ is available, as the AT_CLKTCK ELF note.
>
> Maybe. I did not do this, but only cleaned up the internal
> notion of jiffy so timers would work more correctly. If you
> go back to HZ=100, everything works better in this regard.
>
> On the other hand, what practical difference does it make?
> Almost no user code even looks at USER_HZ. It's just things
> like ps and friends as far as I can tell... Possibly we
> should just fix the utilities to use the above call to get
> the jiffie size... I don't know the full history, but was
> USER_HZ invented by the 2.5 changes?
USER_HZ was invented by the 2.5 changes.
Linus has decreed that USER_HZ is part of the ABI.
For some reason (ARM port or stubborn glibc hacker?)
he has allowed USER_HZ to be exposed via an ELF note.
Prior to that, he'd refused all attempts to get HZ
exported through /proc/sys and similar.
I'm OK with any integer value as long as I can get it.
On older kernels procps will guess HZ from the uptime
and clock ticks, since there is a long history of
people running with non-standard HZ values.
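A rough sketch of that kind of guess: measure ticks per second of uptime, then snap to the nearest plausible value. The candidate table and the function name are illustrative assumptions, not the list or the logic procps actually uses.

```c
#include <stddef.h>

/*
 * Snap a measured tick rate (total ticks / uptime seconds) to the
 * nearest entry in a small table of plausible HZ values.
 * The table is illustrative only.
 */
unsigned int guess_hertz(double measured)
{
    static const unsigned int candidates[] = { 100, 1000, 1024 };
    unsigned int best = candidates[0];
    double best_err = measured > best ? measured - best : best - measured;

    for (size_t i = 1; i < sizeof candidates / sizeof candidates[0]; i++) {
        double c = candidates[i];
        double err = measured > c ? measured - c : c - measured;

        if (err < best_err) {
            best_err = err;
            best = candidates[i];
        }
    }
    return best;
}
```

Both ratios reported in this thread, 100.01695 and 90.745571, snap to 100 under this scheme, which is why a guess of this kind tolerates the drift but cannot reveal it.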
Since the ABI is that USER_HZ==100, the kernel is
currently in violation. Perhaps the HZ-to-USER_HZ
conversion needs to be redone.
USER_HZ appears in SCSI ioctls, network stats,
an old clock ("clocks"? "times"?) syscall...
>> I think the way to fix this is to skip or add a tick
>> every now and then, so that the long-term HZ is exact.
>
> This is a REAL problem for any code that wants to use
> more exact time/ timers than the 1/HZ. See, for example,
> the high res patch (signature). You can not just throw
> in an extra tick every so often.
You're just considering short-term time scales.
The extra ticks, over a period of many days, lead
you to the exact time.
I suppose it would be possible to have things both ways.
Raw jiffies is as it is today. Then we have a correction
factor that gets adjusted as needed to ensure that we can
get long-term-exact 1/HZ ticks as: jiffies + correction
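The "jiffies + correction" idea can be sketched numerically. The names are hypothetical, and the tick length used in the example (999830 ns, about 170 ppm short of 1 ms) is an assumption chosen to match the drift reported earlier in the thread, not a value read from a kernel.

```c
#include <stdint.h>

#define NSEC_PER_SEC_U 1000000000ULL

/* Raw jiffies: how many real ticks of tick_ns fit into elapsed_ns. */
uint64_t raw_jiffies(uint64_t elapsed_ns, uint64_t tick_ns)
{
    return elapsed_ns / tick_ns;
}

/* Correction such that raw + correction counts exact 1/HZ ticks. */
int64_t jiffies_correction(uint64_t elapsed_ns, uint64_t tick_ns, uint64_t hz)
{
    uint64_t exact = elapsed_ns / (NSEC_PER_SEC_U / hz);

    return (int64_t)exact - (int64_t)raw_jiffies(elapsed_ns, tick_ns);
}
```

Over one real day at HZ=1000 with that tick length, raw jiffies reach 86414690 while the exact count is 86400000, so the correction is -14690 jiffies, about 14.7 seconds.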
>> Another way is to simply choose between pure old-style
>> tick-based timekeeping and pure new-style cycle-based
>> (TSC or ACPI) timekeeping. Systems with uncooperative
>> hardware have to use the old-style time keeping. This
>> should simplify the code greatly.
>
> Hm, the reason 1/HZ is not used is that the x86 hardware
> (PIT, to be exact) can not give a good 1/1000 value...
If using the PIT:
a. no broken attempt at high-res timekeeping
b. add or skip whole ticks as needed for timekeeping
Common timekeeping tasks don't fit neatly on jiffie
ticks anyway. You need 16.683 ticks per NTSC field, etc.
When you fully push timekeeping onto the TSC, ACPI
timer, or PowerPC bus counter, then you can have a
relatively crud-free high-res implementation.
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 16:04 ` Albert Cahalan
@ 2004-02-25 20:45 ` George Anzinger
2004-02-25 19:16 ` Albert Cahalan
2004-02-25 21:10 ` George Anzinger
1 sibling, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-02-25 20:45 UTC (permalink / raw)
To: Albert Cahalan; +Cc: David Ford, linux-kernel mailing list
Albert Cahalan wrote:
> On Wed, 2004-02-25 at 11:28, George Anzinger wrote:
>
>>Albert Cahalan wrote:
>>
>>>On Wed, 2004-02-25 at 00:10, David Ford wrote:
>
>
>>>>Actually, it seems that there is a -significant- time difference in this
>>>>phantom clock now, I suspended my notebook to bring it home from the
>>>>station, and now this time difference is greater than 9 minutes. I
>>>>suspect it's roughly 46 seconds plus the amount of time that my notebook
>>>>was suspended. Yes, I'm running ntpd.
>>>>
>>>>root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
>>>>Wed Feb 25 00:09:09 EST 2004
>>>
>>>OK, this is pointing right at the problem.
>>>
>>>Linux does not record process start times at all.
>>>Instead, it records the number of clock ticks
>>>from boot until the process starts.
>>>
>>>Either the boot time or current time is real.
>>>The other may be computed from the uptime, which
>>>may be measured in clock ticks.
>>
>>In 2.6.* boot time is captured at boot. This is then adjusted whenever the
>>clock is set. Up time is the difference between the saved boot time and the
>>current wall clock time.
>>
>>
>>>The clock doesn't tick when your laptop sleeps.
>>
>>I would guess that the clock adjustment made when the sleep ends is not
>>adjusting the boot time as it should. That code should set the clock by calling
>>do_settimeofday() which will do the right thing.
>
>
> I don't think so. The problem might be fixable by advancing
> jiffies, crediting the extra ticks to idle time.
> Consider the current situation as I know it, in jiffies:
>
> 00000 boot
> 10000 process 42 starts
> 20000 go to sleep
> 20000 wake (same jiffies, different time)
> 30000 process 51 starts
> 40000 ps examines the state of the system
>
> Process 42 was started 10 seconds after boot. (10000 jiffies)
> Process 51 appears to be started 30 seconds after boot. (30000 jiffies + ???)
>
> Now we want to compute:
>
> 1. real-world date and time for process start
> 2. length of process lifetime (real-world or not?)
>
> What works for process 42 won't work for process 51,
> because they are on different sides of a hidden gap.
>
> Another way to fix the problem is to move the boot time.
> It's kind of sick, but so are the alternatives.
>
>
>>As to small drifts of ~170 PPM, they are caused by code (ps I would guess) that
>>assumes that jiffies is exactly 1/HZ whereas it is NOT in the 2.6.* kernel. The
>>size of the jiffie that the kernel uses is returned by:
>>
>>struct timespec tv;
>>:
>>:
>>clock_getres(CLOCK_REALTIME, &tv);
>>
>>This will be in nanoseconds (and must be as that is what the wall clock is in).
>
>
> This is NOT sane. Remember that procps doesn't get to see HZ.
> Only USER_HZ is available, as the AT_CLKTCK ELF note.
Maybe. I did not do this, but only cleaned up the internal notion of jiffy so
timers would work more correctly. If you go back to HZ=100, everything works
better in this regard.
On the other hand, what practical difference does it make? Almost no user code
even looks at USER_HZ. It's just things like ps and friends as far as I can
tell... Possibly we should just fix the utilities to use the above call to get
the jiffie size... I don't know the full history, but was USER_HZ invented by
the 2.5 changes?
>
> I think the way to fix this is to skip or add a tick
> every now and then, so that the long-term HZ is exact.
This is a REAL problem for any code that wants to use more exact time/ timers than
the 1/HZ. See, for example, the high res patch (signature). You can not just
throw in an extra tick every so often.
>
> Another way is to simply choose between pure old-style
> tick-based timekeeping and pure new-style cycle-based
> (TSC or ACPI) timekeeping. Systems with uncooperative
> hardware have to use the old-style time keeping. This
> should simplify the code greatly.
Hm, the reason 1/HZ is not used is that the x86 hardware (PIT, to be exact) can
not give a good 1/1000 value...
>
>
>
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 16:04 ` Albert Cahalan
2004-02-25 20:45 ` George Anzinger
@ 2004-02-25 21:10 ` George Anzinger
2004-02-26 1:52 ` john stultz
1 sibling, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-02-25 21:10 UTC (permalink / raw)
To: Albert Cahalan; +Cc: David Ford, linux-kernel mailing list
Albert Cahalan wrote:
> On Wed, 2004-02-25 at 11:28, George Anzinger wrote:
>
>>Albert Cahalan wrote:
>>
>>>On Wed, 2004-02-25 at 00:10, David Ford wrote:
>
>
>>>>Actually, it seems that there is a -significant- time difference in this
>>>>phantom clock now, I suspended my notebook to bring it home from the
>>>>station, and now this time difference is greater than 9 minutes. I
>>>>suspect it's roughly 46 seconds plus the amount of time that my notebook
>>>>was suspended. Yes, I'm running ntpd.
>>>>
>>>>root 16894 0.0 0.0 1544 484 pts/3 S Feb24 0:00 grep grep ps
>>>>Wed Feb 25 00:09:09 EST 2004
>>>
>>>OK, this is pointing right at the problem.
>>>
>>>Linux does not record process start times at all.
>>>Instead, it records the number of clock ticks
>>>from boot until the process starts.
>>>
>>>Either the boot time or current time is real.
>>>The other may be computed from the uptime, which
>>>may be measured in clock ticks.
>>
>>In 2.6.* boot time is captured at boot. This is then adjusted whenever the
>>clock is set. Up time is the difference between the saved boot time and the
>>current wall clock time.
>>
>>
>>>The clock doesn't tick when your laptop sleeps.
>>
>>I would guess that the clock adjustment made when the sleep ends is not
>>adjusting the boot time as it should. That code should set the clock by calling
>>do_settimeofday() which will do the right thing.
>
>
> I don't think so. The problem might be fixable by advancing
> jiffies, crediting the extra ticks to idle time.
> Consider the current situation as I know it, in jiffies:
>
> 00000 boot
> 10000 process 42 starts
> 20000 go to sleep
> 20000 wake (same jiffies, different time)
> 30000 process 51 starts
> 40000 ps examines the state of the system
>
> Process 42 was started 10 seconds after boot. (10000 jiffies)
> Process 51 appears to be started 30 seconds after boot. (30000 jiffies + ???)
>
> Now we want to compute:
>
> 1. real-world date and time for process start
> 2. length of process lifetime (real-world or not?)
>
> What works for process 42 won't work for process 51,
> because they are on different sides of a hidden gap.
>
> Another way to fix the problem is to move the boot time.
> It's kind of sick, but so are the alternatives.
>
>
>>As to small drifts of ~170 PPM, they are caused by code (ps I would guess) that
>>assumes that jiffies is exactly 1/HZ whereas it is NOT in the 2.6.* kernel. The
>>size of the jiffie that the kernel uses is returned by:
>>
>>struct timespec tv;
>>:
>>:
>>clock_getres(CLOCK_REALTIME, &tv);
>>
>>This will be in nanoseconds (and must be as that is what the wall clock is in).
>
>
> This is NOT sane. Remember that procps doesn't get to see HZ.
> Only USER_HZ is available, as the AT_CLKTCK ELF note.
>
> I think the way to fix this is to skip or add a tick
> every now and then, so that the long-term HZ is exact.
>
> Another way is to simply choose between pure old-style
> tick-based timekeeping and pure new-style cycle-based
> (TSC or ACPI) timekeeping. Systems with uncooperative
> hardware have to use the old-style time keeping. This
> should simplify the code greatly.
On checking the code and thinking about this, I would suggest that we change
start_time in the task struct to be the wall time (or monotonic time if that
seems better). I only find two places this is used, in proc and in the
accounting code. Both of these could easily be changed. Of course, even
leaving it as it is, they could be changed to report more correct values by
using the correct conversions to translate the system HZ to USER_HZ.
Hm, OK, I will work up a patch to do the latter, i.e. to correctly convert the
elapsed jiffies to USER_HZ and (for the accounting code) seconds.
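To illustrate what "correctly convert" buys, here is a user-space sketch comparing the two conversions. All constants are assumptions for the example (HZ=1000, USER_HZ=100, and a hypothetical real tick length of 999830 ns); the function names are made up and this is not the kernel's code.

```c
#include <stdint.h>

#define HZ_ASSUMED        1000ULL        /* internal tick rate (assumed) */
#define USER_HZ_ASSUMED   100ULL         /* rate promised to user space */
#define NSEC_PER_SEC_U    1000000000ULL
#define TICK_NSEC_ASSUMED 999830ULL      /* hypothetical real tick length */

/* Pretends every jiffy is exactly 1/HZ seconds long. */
uint64_t naive_to_clock_t(uint64_t jiffies)
{
    return jiffies / (HZ_ASSUMED / USER_HZ_ASSUMED);
}

/* Converts through nanoseconds using the real tick length. */
uint64_t scaled_to_clock_t(uint64_t jiffies)
{
    return jiffies * TICK_NSEC_ASSUMED / (NSEC_PER_SEC_U / USER_HZ_ASSUMED);
}
```

For the 86414690 jiffies that elapse in one real day at this tick length, the naive conversion reports 8641469 user ticks, about 14.7 seconds too many at 100 per second. The scaled one reports 8639999, one tick short of the exact 8640000 because the division truncates.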
>
>
>
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 9:14 ` Petri Kaukasoina
2004-02-25 9:18 ` Petri Kaukasoina
@ 2004-02-25 21:39 ` David Ford
1 sibling, 0 replies; 36+ messages in thread
From: David Ford @ 2004-02-25 21:39 UTC (permalink / raw)
To: linux-kernel mailing list
powerix root # grep cpu /proc/stat; cat /proc/uptime
cpu 403880 466666 580559 15017904 475868 10864 3030
cpu0 403880 466666 580559 15017904 475868 10864 3030
186882.63 154999.74
(gdb) p 403880 +466666 +580559 +15017904 +475868 +10864 +3030
$1 = 16958771
(gdb) p 16958771/186882.63
$2 = 90.745571164104447
Hmm, not quite 100.0
david
Petri Kaukasoina wrote:
>Hi,
>
>I reported the same problem some time ago. Could you type
>
>grep cpu /proc/stat; cat /proc/uptime
>
>for example, I get
>
>cpu 140708 1489 43735 21209021 292168 4879 4192
>cpu0 140708 1489 43735 21209021 292168 4879 4192
>216925.15 215037.34
>
>Then add jiffies and divide by uptime:
>
>(140708+1489+43735+21209021+292168+4879+4192)/216925.15 = 100.01695
>
>which is not 100 here as it should be. (On kernel 2.2.* I have it exactly
>100). ps uses Hertz=100 but it should be 170 ppm larger which makes an error
>of about 15 seconds a day. (Running without ntpd doesn't fix it.)
>
>
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-25 21:10 ` George Anzinger
@ 2004-02-26 1:52 ` john stultz
2004-02-26 23:06 ` George Anzinger
2004-02-26 23:14 ` George Anzinger
0 siblings, 2 replies; 36+ messages in thread
From: john stultz @ 2004-02-26 1:52 UTC (permalink / raw)
To: george anzinger; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
> Albert Cahalan wrote:
> > This is NOT sane. Remember that procps doesn't get to see HZ.
> > Only USER_HZ is available, as the AT_CLKTCK ELF note.
> >
> > I think the way to fix this is to skip or add a tick
> > every now and then, so that the long-term HZ is exact.
> >
> > Another way is to simply choose between pure old-style
> > tick-based timekeeping and pure new-style cycle-based
> > (TSC or ACPI) timekeeping. Systems with uncooperative
> > hardware have to use the old-style time keeping. This
> > should simplify the code greatly.
>
> On checking the code and thinking about this, I would suggest that we change
> start_time in the task struct to be the wall time (or monotonic time if that
> seems better). I only find two places this is used, in proc and in the
> accounting code. Both of these could easily be changed. Of course, even
> leaving it as it is, they could be changed to report more correct values by
> using the correct conversions to translate the system HZ to USER_HZ.
Is this close to what you're thinking of?
I can't reproduce the issue on my systems, so I'll need someone else to
test this.
thanks
-john
--- 1.5/include/linux/times.h Sun Nov 9 19:26:08 2003
+++ edited/include/linux/times.h Wed Feb 25 17:39:11 2004
@@ -7,7 +7,7 @@
#include <asm/param.h>
#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
+# define jiffies_to_clock_t(x) (((x*TICK_NSEC*HZ)/NSEC_PER_SEC) / (HZ / USER_HZ))
#else
# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
#endif
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-26 1:52 ` john stultz
@ 2004-02-26 23:06 ` George Anzinger
2004-02-26 23:10 ` john stultz
2004-02-26 23:14 ` George Anzinger
1 sibling, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-02-26 23:06 UTC (permalink / raw)
To: john stultz; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
john stultz wrote:
> On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
>
>>Albert Cahalan wrote:
>>
>>>This is NOT sane. Remeber that procps doesn't get to see HZ.
>>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
>>>
>>>I think the way to fix this is to skip or add a tick
>>>every now and then, so that the long-term HZ is exact.
>>>
>>>Another way is to simply choose between pure old-style
>>>tick-based timekeeping and pure new-style cycle-based
>>>(TSC or ACPI) timekeeping. Systems with uncooperative
>>>hardware have to use the old-style time keeping. This
>>>should simply the code greatly.
>>
>>On checking the code and thinking about this, I would suggest that we change
>>start_time in the task struct to be the wall time (or monotonic time if that
>>seems better). I only find two places this is used, in proc and in the
>>accounting code. Both of these could easily be changed. Of course, even
>>leaving it as it is, they could be changed to report more correct values by
>>using the correct conversions to translate the system HZ to USER_HZ.
>
>
> Is this close to what your thinking of?
> I can't reproduce the issue on my systems, so I'll need someone else to
> test this.
More or less. I wonder if:
static inline long jiffies_to_clock_t(long x)
{
u64 tmp = (u64)x * TICK_NSEC;
/* do_div() divides tmp in place and returns the remainder */
do_div(tmp, (NSEC_PER_SEC / USER_HZ));
return (long)tmp;
}
might be better as it addresses the overflow issue. Should be able to toss the
#if (HZ % USER_HZ)==0 test too. We could get carried away and do scaled math to
eliminate the div64 but I don't think this path is used enough to justify the
clarity ;) that would make.
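To make the overflow concern concrete, here is a minimal user-space sketch. It assumes a 32-bit unsigned long (modelled with uint32_t) and TICK_NSEC = 999848, the value usually quoted for i386 at HZ=1000; neither is read from a real build. The helper names macro_32bit() and widened() are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

/* Assumed values for i386 with HZ=1000; not taken from a real build. */
#define HZ            1000u
#define USER_HZ       100u
#define TICK_NSEC     999848u
#define NSEC_PER_SEC  1000000000u

/* The macro form, evaluated as it would be with a 32-bit unsigned long:
 * x * TICK_NSEC * HZ wraps modulo 2^32 once x exceeds 4 jiffies. */
static uint32_t macro_32bit(uint32_t x)
{
	uint32_t scaled = x * TICK_NSEC * HZ;
	return (scaled / NSEC_PER_SEC) / (HZ / USER_HZ);
}

/* The widened form: multiply in 64 bits, then divide once. */
static uint32_t widened(uint32_t x)
{
	uint64_t tmp = (uint64_t)x * TICK_NSEC;
	return (uint32_t)(tmp / (NSEC_PER_SEC / USER_HZ));
}
```

Under these assumptions, x = 1000 jiffies (about one second) gives 99 clock ticks with the widened form, while the 32-bit product has already wrapped and the macro form gives 0.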
-g
>
> thanks
> -john
>
> --- 1.5/include/linux/times.h Sun Nov 9 19:26:08 2003
> +++ edited/include/linux/times.h Wed Feb 25 17:39:11 2004
> @@ -7,7 +7,7 @@
> #include <asm/param.h>
>
> #if (HZ % USER_HZ)==0
> -# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
> +# define jiffies_to_clock_t(x) (((x*TICK_NSEC*HZ)/NSEC_PER_SEC) / (HZ / USER_HZ))
> #else
> # define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
> #endif
>
>
>
>
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-26 23:06 ` George Anzinger
@ 2004-02-26 23:10 ` john stultz
2004-02-27 0:20 ` George Anzinger
0 siblings, 1 reply; 36+ messages in thread
From: john stultz @ 2004-02-26 23:10 UTC (permalink / raw)
To: george anzinger; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
On Thu, 2004-02-26 at 15:06, George Anzinger wrote:
> john stultz wrote:
> > On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
> >
> >>Albert Cahalan wrote:
> >>
> >>>This is NOT sane. Remeber that procps doesn't get to see HZ.
> >>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
> >>>
> >>>I think the way to fix this is to skip or add a tick
> >>>every now and then, so that the long-term HZ is exact.
> >>>
> >>>Another way is to simply choose between pure old-style
> >>>tick-based timekeeping and pure new-style cycle-based
> >>>(TSC or ACPI) timekeeping. Systems with uncooperative
> >>>hardware have to use the old-style time keeping. This
> >>>should simply the code greatly.
> >>
> >>On checking the code and thinking about this, I would suggest that we change
> >>start_time in the task struct to be the wall time (or monotonic time if that
> >>seems better). I only find two places this is used, in proc and in the
> >>accounting code. Both of these could easily be changed. Of course, even
> >>leaving it as it is, they could be changed to report more correct values by
> >>using the correct conversions to translate the system HZ to USER_HZ.
> >
> >
> > Is this close to what your thinking of?
> > I can't reproduce the issue on my systems, so I'll need someone else to
> > test this.
>
> More or less. I wonder if:
> static inline long jiffies_to_clock_t(long x)
> {
> u64 tmp = (u64)x * TICK_NSEC;
> div64(tmp, (NSEC_PER_SEC / USER_HZ));
> return (long)x;
> }
> might be better as it addresses the overflow issue. Should be able to toss the
> #if (HZ % USER_HZ)==0 test too. We could get carried away and do scaled math to
> eliminate the div64 but I don't think this path is used enough to justify the
> clarity ;) that would make.
Sounds good to me. Would you mind sending the diff so Petri and David
could test it?
thanks
-john
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-26 1:52 ` john stultz
2004-02-26 23:06 ` George Anzinger
@ 2004-02-26 23:14 ` George Anzinger
1 sibling, 0 replies; 36+ messages in thread
From: George Anzinger @ 2004-02-26 23:14 UTC (permalink / raw)
To: john stultz; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
John,
This is the other place I found "start_time" being used. It is at about line
340 in kernel/acct.c. The same sort of math should be done here:
elapsed = get_jiffies_64() - current->start_time;
ac.ac_etime = encode_comp_t(elapsed < (unsigned long) -1l ?
(unsigned long) elapsed : (unsigned long) -1l);
do_div(elapsed, HZ);
-g
john stultz wrote:
> On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
>
>>Albert Cahalan wrote:
>>
>>>This is NOT sane. Remeber that procps doesn't get to see HZ.
>>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
>>>
>>>I think the way to fix this is to skip or add a tick
>>>every now and then, so that the long-term HZ is exact.
>>>
>>>Another way is to simply choose between pure old-style
>>>tick-based timekeeping and pure new-style cycle-based
>>>(TSC or ACPI) timekeeping. Systems with uncooperative
>>>hardware have to use the old-style time keeping. This
>>>should simply the code greatly.
>>
>>On checking the code and thinking about this, I would suggest that we change
>>start_time in the task struct to be the wall time (or monotonic time if that
>>seems better). I only find two places this is used, in proc and in the
>>accounting code. Both of these could easily be changed. Of course, even
>>leaving it as it is, they could be changed to report more correct values by
>>using the correct conversions to translate the system HZ to USER_HZ.
>
>
> Is this close to what your thinking of?
> I can't reproduce the issue on my systems, so I'll need someone else to
> test this.
>
> thanks
> -john
>
> --- 1.5/include/linux/times.h Sun Nov 9 19:26:08 2003
> +++ edited/include/linux/times.h Wed Feb 25 17:39:11 2004
> @@ -7,7 +7,7 @@
> #include <asm/param.h>
>
> #if (HZ % USER_HZ)==0
> -# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
> +# define jiffies_to_clock_t(x) (((x*TICK_NSEC*HZ)/NSEC_PER_SEC) / (HZ / USER_HZ))
> #else
> # define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
> #endif
>
>
>
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-26 23:10 ` john stultz
@ 2004-02-27 0:20 ` George Anzinger
2004-04-13 22:38 ` john stultz
0 siblings, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-02-27 0:20 UTC (permalink / raw)
To: john stultz; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
john stultz wrote:
> On Thu, 2004-02-26 at 15:06, George Anzinger wrote:
>
>>john stultz wrote:
>>
>>>On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
>>>
>>>
>>>>Albert Cahalan wrote:
>>>>
>>>>
>>>>>This is NOT sane. Remeber that procps doesn't get to see HZ.
>>>>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
>>>>>
>>>>>I think the way to fix this is to skip or add a tick
>>>>>every now and then, so that the long-term HZ is exact.
>>>>>
>>>>>Another way is to simply choose between pure old-style
>>>>>tick-based timekeeping and pure new-style cycle-based
>>>>>(TSC or ACPI) timekeeping. Systems with uncooperative
>>>>>hardware have to use the old-style time keeping. This
>>>>>should simply the code greatly.
>>>>
>>>>On checking the code and thinking about this, I would suggest that we change
>>>>start_time in the task struct to be the wall time (or monotonic time if that
>>>>seems better). I only find two places this is used, in proc and in the
>>>>accounting code. Both of these could easily be changed. Of course, even
>>>>leaving it as it is, they could be changed to report more correct values by
>>>>using the correct conversions to translate the system HZ to USER_HZ.
>>>
>>>
>>>Is this close to what your thinking of?
>>>I can't reproduce the issue on my systems, so I'll need someone else to
>>>test this.
>>
>>More or less. I wonder if:
>
>
>>static inline long jiffies_to_clock_t(long x)
>>{
>> u64 tmp = (u64)x * TICK_NSEC;
>> div64(tmp, (NSEC_PER_SEC / USER_HZ));
>> return (long)x;
>>}
>>might be better as it addresses the overflow issue. Should be able to toss the
>>#if (HZ % USER_HZ)==0 test too. We could get carried away and do scaled math to
>>eliminate the div64 but I don't think this path is used enough to justify the
>>clarity ;) that would make.
>
>
> Sounds good to me. Would you mind sending the diff so Petri and David
> could test it?
Oops, I have been caught :) The above was composed in the email window. I
don't have a 2.6.x kernel up at the moment and I don't have any free cycles...
Late next week??
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-02-27 0:20 ` George Anzinger
@ 2004-04-13 22:38 ` john stultz
2004-04-13 22:59 ` George Anzinger
2004-04-14 12:10 ` Tim Schmielau
0 siblings, 2 replies; 36+ messages in thread
From: john stultz @ 2004-04-13 22:38 UTC (permalink / raw)
To: george anzinger; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
On Thu, 2004-02-26 at 16:20, George Anzinger wrote:
> john stultz wrote:
> > On Thu, 2004-02-26 at 15:06, George Anzinger wrote:
> >>john stultz wrote:
> >>>On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
> >>>>Albert Cahalan wrote:
> >>>>
> >>>>>This is NOT sane. Remeber that procps doesn't get to see HZ.
> >>>>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
> >>>>>
> >>>>>I think the way to fix this is to skip or add a tick
> >>>>>every now and then, so that the long-term HZ is exact.
> >>>>>
> >>>>>Another way is to simply choose between pure old-style
> >>>>>tick-based timekeeping and pure new-style cycle-based
> >>>>>(TSC or ACPI) timekeeping. Systems with uncooperative
> >>>>>hardware have to use the old-style time keeping. This
> >>>>>should simply the code greatly.
> >>>>
> >>>>On checking the code and thinking about this, I would suggest that we change
> >>>>start_time in the task struct to be the wall time (or monotonic time if that
> >>>>seems better). I only find two places this is used, in proc and in the
> >>>>accounting code. Both of these could easily be changed. Of course, even
> >>>>leaving it as it is, they could be changed to report more correct values by
> >>>>using the correct conversions to translate the system HZ to USER_HZ.
> >>>
> >>>
> >>>Is this close to what your thinking of?
> >>>I can't reproduce the issue on my systems, so I'll need someone else to
> >>>test this.
> >>
> >>More or less. I wonder if:
> >
> >>static inline long jiffies_to_clock_t(long x)
> >>{
> >> u64 tmp = (u64)x * TICK_NSEC;
> >> div64(tmp, (NSEC_PER_SEC / USER_HZ));
> >> return (long)x;
> >>}
> >>might be better as it addresses the overflow issue. Should be able to toss the
> >>#if (HZ % USER_HZ)==0 test too. We could get carried away and do scaled math to
> >>eliminate the div64 but I don't think this path is used enough to justify the
> >>clarity ;) that would make.
> >
> > Sounds good to me. Would you mind sending the diff so Petri and David
> > could test it?
>
> Oops, I have been caught :) The above was composed in the email window. I
> don't have a 2.6.x kernel up at the moment and I don't have any free cycles...
> Late next week??
Finally got a chance to go through my work queue and yikes! This is
seriously stale! As neither George nor I have come to bat with a patch,
I'll attempt a swing.
Albert/David: Would you mind testing the following to see if it resolves
the issue for you?
George: Mind skimming this to make sure it's close enough to what you
intended?
thanks
-john
diff -Nru a/include/linux/times.h b/include/linux/times.h
--- a/include/linux/times.h Tue Apr 13 15:00:25 2004
+++ b/include/linux/times.h Tue Apr 13 15:00:25 2004
@@ -7,7 +7,12 @@
#include <asm/param.h>
#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
+static inline long jiffies_to_clock_t(long x)
+{
+ u64 tmp = (u64)x * TICK_NSEC;
+ x = do_div(tmp, (NSEC_PER_SEC / USER_HZ));
+ return (long)tmp;
+}
#else
# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
#endif
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-13 22:38 ` john stultz
@ 2004-04-13 22:59 ` George Anzinger
2004-04-14 12:10 ` Tim Schmielau
1 sibling, 0 replies; 36+ messages in thread
From: George Anzinger @ 2004-04-13 22:59 UTC (permalink / raw)
To: john stultz; +Cc: Albert Cahalan, David Ford, linux-kernel mailing list
john stultz wrote:
> On Thu, 2004-02-26 at 16:20, George Anzinger wrote:
>
>>john stultz wrote:
>>
>>>On Thu, 2004-02-26 at 15:06, George Anzinger wrote:
>>>
>>>>john stultz wrote:
>>>>
>>>>>On Wed, 2004-02-25 at 13:10, George Anzinger wrote:
>>>>>
>>>>>>Albert Cahalan wrote:
>>>>>>
>>>>>>
>>>>>>>This is NOT sane. Remeber that procps doesn't get to see HZ.
>>>>>>>Only USER_HZ is available, as the AT_CLKTCK ELF note.
>>>>>>>
>>>>>>>I think the way to fix this is to skip or add a tick
>>>>>>>every now and then, so that the long-term HZ is exact.
>>>>>>>
>>>>>>>Another way is to simply choose between pure old-style
>>>>>>>tick-based timekeeping and pure new-style cycle-based
>>>>>>>(TSC or ACPI) timekeeping. Systems with uncooperative
>>>>>>>hardware have to use the old-style time keeping. This
>>>>>>>should simply the code greatly.
>>>>>>
>>>>>>On checking the code and thinking about this, I would suggest that we change
>>>>>>start_time in the task struct to be the wall time (or monotonic time if that
>>>>>>seems better). I only find two places this is used, in proc and in the
>>>>>>accounting code. Both of these could easily be changed. Of course, even
>>>>>>leaving it as it is, they could be changed to report more correct values by
>>>>>>using the correct conversions to translate the system HZ to USER_HZ.
>>>>>
>>>>>
>>>>>Is this close to what your thinking of?
>>>>>I can't reproduce the issue on my systems, so I'll need someone else to
>>>>>test this.
>>>>
>>>>More or less. I wonder if:
>>>
>>>>static inline long jiffies_to_clock_t(long x)
>>>>{
>>>> u64 tmp = (u64)x * TICK_NSEC;
>>>> div64(tmp, (NSEC_PER_SEC / USER_HZ));
>>>> return (long)x;
>>>>}
>>>>might be better as it addresses the overflow issue. Should be able to toss the
>>>>#if (HZ % USER_HZ)==0 test too. We could get carried away and do scaled math to
>>>>eliminate the div64 but I don't think this path is used enough to justify the
>>>>clarity ;) that would make.
>>>
>>>Sounds good to me. Would you mind sending the diff so Petri and David
>>>could test it?
>>
>>Oops, I have been caught :) The above was composed in the email window. I
>>don't have a 2.6.x kernel up at the moment and I don't have any free cycles...
>>Late next week??
>
>
> Finally got a chance to go through my work queue and yikes! This is
> seriously stale! As neither George or I have come to bat with a patch,
> I'll attempt a swing.
>
> Albert/David: Would you mind testing the following to see if it resolves
> the issue for you?
>
> George: Mind skimming this to make sure its close enough to what you
> intended?
Looks rather like exactly what I intended.
-g
>
> thanks
> -john
>
>
> diff -Nru a/include/linux/times.h b/include/linux/times.h
> --- a/include/linux/times.h Tue Apr 13 15:00:25 2004
> +++ b/include/linux/times.h Tue Apr 13 15:00:25 2004
> @@ -7,7 +7,12 @@
> #include <asm/param.h>
>
> #if (HZ % USER_HZ)==0
> -# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
> +static inline long jiffies_to_clock_t(long x)
> +{
> + u64 tmp = (u64)x * TICK_NSEC;
> + x = do_div(tmp, (NSEC_PER_SEC / USER_HZ));
> + return (long)tmp;
> +}
> #else
> # define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
> #endif
>
>
>
>
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-13 22:38 ` john stultz
2004-04-13 22:59 ` George Anzinger
@ 2004-04-14 12:10 ` Tim Schmielau
2004-04-14 17:03 ` George Anzinger
2004-04-14 18:28 ` john stultz
1 sibling, 2 replies; 36+ messages in thread
From: Tim Schmielau @ 2004-04-14 12:10 UTC (permalink / raw)
To: john stultz
Cc: george anzinger, Albert Cahalan, David Ford,
linux-kernel mailing list
> diff -Nru a/include/linux/times.h b/include/linux/times.h
> --- a/include/linux/times.h Tue Apr 13 15:00:25 2004
> +++ b/include/linux/times.h Tue Apr 13 15:00:25 2004
> @@ -7,7 +7,12 @@
> #include <asm/param.h>
>
> #if (HZ % USER_HZ)==0
> -# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
> +static inline long jiffies_to_clock_t(long x)
> +{
> + u64 tmp = (u64)x * TICK_NSEC;
> + x = do_div(tmp, (NSEC_PER_SEC / USER_HZ));
> + return (long)tmp;
> +}
> #else
> # define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
> #endif
Excuse me for barging in late and innocently, but I find this patch
hard to comprehend:
- shouldn't a foo_to_clock_t() function return a clock?
- the x = seems superfluous
- the #if is not a shortcut anymore, so why keep it?
Shouldn't this patch be more like the following
(completely untested)?
Tim
diff -urp --exclude-from dontdiff linux-2.6.5/include/linux/times.h linux-2.6.5-jfix1/include/linux/times.h
--- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
+++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-14 13:48:57.000000000 +0200
@@ -6,11 +6,16 @@
#include <asm/types.h>
#include <asm/param.h>
-#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
-#else
-# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
-#endif
+static inline clock_t jiffies_to_clock_t(long x)
+{
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
+ return x / (HZ / USER_HZ);
+#else
+ u64 tmp = (u64)x * TICK_NSEC;
+ do_div(tmp, (NSEC_PER_SEC / USER_HZ));
+ return (long)tmp;
+#endif
+}
static inline unsigned long clock_t_to_jiffies(unsigned long x)
{
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-14 12:10 ` Tim Schmielau
@ 2004-04-14 17:03 ` George Anzinger
2004-04-14 18:28 ` john stultz
1 sibling, 0 replies; 36+ messages in thread
From: George Anzinger @ 2004-04-14 17:03 UTC (permalink / raw)
To: Tim Schmielau
Cc: john stultz, Albert Cahalan, David Ford,
linux-kernel mailing list
Tim Schmielau wrote:
>>diff -Nru a/include/linux/times.h b/include/linux/times.h
>>--- a/include/linux/times.h Tue Apr 13 15:00:25 2004
>>+++ b/include/linux/times.h Tue Apr 13 15:00:25 2004
>>@@ -7,7 +7,12 @@
>> #include <asm/param.h>
>>
>> #if (HZ % USER_HZ)==0
>>-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
>>+static inline long jiffies_to_clock_t(long x)
>>+{
>>+ u64 tmp = (u64)x * TICK_NSEC;
>>+ x = do_div(tmp, (NSEC_PER_SEC / USER_HZ));
>>+ return (long)tmp;
>>+}
>> #else
>> # define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
>> #endif
>
>
> Excuse me for barging in lately and innocently, but I find this patch
> hard to comprehend:
> - shouldn't a foo_to_clock_t() function return a clock?
> - the x = seems superfluous
> - the #if is not a shortcut anymore, so why keep it?
> Shouldn't this patch be more like the following
> (completely untested)?
>
> Tim
>
>
> diff -urp --exclude-from dontdiff linux-2.6.5/include/linux/times.h linux-2.6.5-jfix1/include/linux/times.h
> --- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
> +++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-14 13:48:57.000000000 +0200
> @@ -6,11 +6,16 @@
> #include <asm/types.h>
> #include <asm/param.h>
>
> -#if (HZ % USER_HZ)==0
> -# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
> -#else
> -# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
> -#endif
> +static inline clock_t jiffies_to_clock_t(long x)
> +{
> +#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> + return x / (HZ / USER_HZ);
> +#else
> + u64 tmp = (u64)x * TICK_NSEC;
> + do_div(tmp, (NSEC_PER_SEC / USER_HZ));
> + return (long)tmp;
> +#endif
> +}
>
> static inline unsigned long clock_t_to_jiffies(unsigned long x)
> {
>
It does look a bit better. Takes into account the issue of TICK_NSEC being what
it is.
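As a rough illustration of when the compile-time shortcut applies, the preprocessor check can be mirrored at run time. This is a sketch only: takes_shortcut() and slow_path() are hypothetical helpers, and the tick lengths used with them are assumed examples (an exact 100 Hz tick, and the 999848 ns figure usually quoted for i386 at HZ=1000).

```c
#include <stdint.h>
#include <stdbool.h>

#define USER_HZ       100u
#define NSEC_PER_SEC  1000000000u

/* Run-time mirror of the preprocessor test
 *   (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
 * i.e. "is one kernel tick a whole number of USER_HZ ticks?" */
static bool takes_shortcut(uint32_t tick_nsec)
{
	return tick_nsec % (NSEC_PER_SEC / USER_HZ) == 0;
}

/* Conversion on the slow path: scale by the real tick length. */
static uint64_t slow_path(uint64_t jiffies, uint32_t tick_nsec)
{
	return jiffies * tick_nsec / (NSEC_PER_SEC / USER_HZ);
}
```

With these definitions a 10000000 ns tick passes the test, while 999848 ns does not and goes through the 64-bit multiply and divide. An exact 1000000 ns tick fails the test as well, so under this check HZ=1000 would seem to take the do_div() path in any case.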
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-14 12:10 ` Tim Schmielau
2004-04-14 17:03 ` George Anzinger
@ 2004-04-14 18:28 ` john stultz
2004-04-15 10:37 ` Petri Kaukasoina
1 sibling, 1 reply; 36+ messages in thread
From: john stultz @ 2004-04-14 18:28 UTC (permalink / raw)
To: Tim Schmielau
Cc: george anzinger, Albert Cahalan, David Ford,
linux-kernel mailing list
On Wed, 2004-04-14 at 05:10, Tim Schmielau wrote:
> Excuse me for barging in lately and innocently, but I find this patch
> hard to comprehend:
> - shouldn't a foo_to_clock_t() function return a clock?
> - the x = seems superfluous
> - the #if is not a shortcut anymore, so why keep it?
> Shouldn't this patch be more like the following
> (completely untested)?
Yes, your cleanups look much better! Although we still have yet to
hear if it resolves the problem.
thanks
-john
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-14 18:28 ` john stultz
@ 2004-04-15 10:37 ` Petri Kaukasoina
2004-04-15 11:05 ` Tim Schmielau
0 siblings, 1 reply; 36+ messages in thread
From: Petri Kaukasoina @ 2004-04-15 10:37 UTC (permalink / raw)
To: john stultz
Cc: Tim Schmielau, george anzinger, Albert Cahalan, David Ford,
linux-kernel mailing list
On Wed, Apr 14, 2004 at 11:28:15AM -0700, john stultz wrote:
> On Wed, 2004-04-14 at 05:10, Tim Schmielau wrote:
> > Excuse me for barging in lately and innocently, but I find this patch
> > hard to comprehend:
> > - shouldn't a foo_to_clock_t() function return a clock?
> > - the x = seems superfluous
> > - the #if is not a shortcut anymore, so why keep it?
> > Shouldn't this patch be more like the following
> > (completely untested)?
>
> Yes, you're cleanups look much better! Although we still have yet to
> hear if it resolves the problem.
Hi,
If we are still talking about the problem with ps showing process start
times in the future, I'm sorry neither of the patches helped. The error grows
here at a rate of 15 seconds in 24 hours as before.
-Petri
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-15 10:37 ` Petri Kaukasoina
@ 2004-04-15 11:05 ` Tim Schmielau
2004-04-15 16:14 ` Petri Kaukasoina
0 siblings, 1 reply; 36+ messages in thread
From: Tim Schmielau @ 2004-04-15 11:05 UTC (permalink / raw)
To: Petri Kaukasoina
Cc: john stultz, george anzinger, Albert Cahalan, David Ford,
linux-kernel mailing list
On Thu, 15 Apr 2004, Petri Kaukasoina wrote:
> If we are still talking about the problem with ps showing process start
> times in future, I'm sorry neither of the patches helped. The error grows
> here at a rate of 15 seconds in 24 hours as before.
Oops...
sure, it cannot. Maybe this one is better...
--- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
+++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-15 12:59:05.000000000 +0200
@@ -6,11 +6,16 @@
#include <asm/types.h>
#include <asm/param.h>
-#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
-#else
-# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
-#endif
+static inline clock_t jiffies_to_clock_t(long x)
+{
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
+ return x / (HZ / USER_HZ);
+#else
+ u64 tmp = (u64)x * TICK_NSEC;
+ do_div(tmp, (NSEC_PER_SEC / USER_HZ));
+ return (long)tmp;
+#endif
+}
static inline unsigned long clock_t_to_jiffies(unsigned long x)
{
@@ -34,7 +39,7 @@ static inline unsigned long clock_t_to_j
static inline u64 jiffies_64_to_clock_t(u64 x)
{
-#if (HZ % USER_HZ)==0
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
do_div(x, HZ / USER_HZ);
#else
/*
@@ -42,8 +47,8 @@ static inline u64 jiffies_64_to_clock_t(
* but even this doesn't overflow in hundreds of years
* in 64 bits, so..
*/
- x *= USER_HZ;
- do_div(x, HZ);
+ x *= TICK_NSEC;
+ do_div(x, (NSEC_PER_SEC / USER_HZ));
#endif
return x;
}
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-15 11:05 ` Tim Schmielau
@ 2004-04-15 16:14 ` Petri Kaukasoina
2004-05-01 13:51 ` Tim Schmielau
0 siblings, 1 reply; 36+ messages in thread
From: Petri Kaukasoina @ 2004-04-15 16:14 UTC (permalink / raw)
To: Tim Schmielau
Cc: john stultz, george anzinger, Albert Cahalan, David Ford,
linux-kernel mailing list
On Thu, Apr 15, 2004 at 01:05:17PM +0200, Tim Schmielau wrote:
> On Thu, 15 Apr 2004, Petri Kaukasoina wrote:
>
> > If we are still talking about the problem with ps showing process start
> > times in future, I'm sorry neither of the patches helped. The error grows
> > here at a rate of 15 seconds in 24 hours as before.
>
> Oops...
> sure, it cannot. Maybe this one is better...
>
>
> --- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
> +++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-15 12:59:05.000000000 +0200
Yes, it seems to have fixed it. There is a small error: ps shows a start
time of a new minute about four seconds too early, but the error stays
constant and does not change as a function of uptime any longer. (Actually
it still does but only at the same rate as ntpd corrects time.)
-Petri
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-04-15 16:14 ` Petri Kaukasoina
@ 2004-05-01 13:51 ` Tim Schmielau
2004-05-02 1:41 ` Andrew Morton
0 siblings, 1 reply; 36+ messages in thread
From: Tim Schmielau @ 2004-05-01 13:51 UTC (permalink / raw)
To: john stultz, george anzinger; +Cc: Petri Kaukasoina, linux-kernel mailing list
On Thu, 15 Apr 2004, Petri Kaukasoina wrote:
> On Thu, Apr 15, 2004 at 01:05:17PM +0200, Tim Schmielau wrote:
> > On Thu, 15 Apr 2004, Petri Kaukasoina wrote:
> >
> > > If we are still talking about the problem with ps showing process start
> > > times in future, I'm sorry neither of the patches helped. The error grows
> > > here at a rate of 15 seconds in 24 hours as before.
> >
> > Oops...
> > sure, it cannot. Maybe this one is better...
> >
> >
> > --- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
> > +++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-15 12:59:05.000000000 +0200
>
> Yes, it seems to have fixed it. There is a small error: ps shows a start
> time of a new minute about four seconds too early, but the error stays
> constant and does not change as a function of uptime any longer. (Actually
> it still does but only at the same rate as ntpd corrects time.)
>
> -Petri
>
John, George,
can you take care of the patch so it doesn't get lost?
I don't know how to handle the ntpd issue, which I guess is also the
reason for the four-second difference.
Thanks,
Tim
--- linux-2.6.5/include/linux/times.h 2004-02-04 04:43:09.000000000 +0100
+++ linux-2.6.5-jfix1/include/linux/times.h 2004-04-15 12:59:05.000000000 +0200
@@ -6,11 +6,16 @@
#include <asm/types.h>
#include <asm/param.h>
-#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
-#else
-# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
-#endif
+static inline clock_t jiffies_to_clock_t(long x)
+{
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
+ return x / (HZ / USER_HZ);
+#else
+ u64 tmp = (u64)x * TICK_NSEC;
+ do_div(tmp, (NSEC_PER_SEC / USER_HZ));
+ return (long)tmp;
+#endif
+}
static inline unsigned long clock_t_to_jiffies(unsigned long x)
{
@@ -34,7 +39,7 @@ static inline unsigned long clock_t_to_j
static inline u64 jiffies_64_to_clock_t(u64 x)
{
-#if (HZ % USER_HZ)==0
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
do_div(x, HZ / USER_HZ);
#else
/*
@@ -42,8 +47,8 @@ static inline u64 jiffies_64_to_clock_t(
* but even this doesn't overflow in hundreds of years
* in 64 bits, so..
*/
- x *= USER_HZ;
- do_div(x, HZ);
+ x *= TICK_NSEC;
+ do_div(x, (NSEC_PER_SEC / USER_HZ));
#endif
return x;
}
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-01 13:51 ` Tim Schmielau
@ 2004-05-02 1:41 ` Andrew Morton
2004-05-02 1:59 ` Tim Schmielau
0 siblings, 1 reply; 36+ messages in thread
From: Andrew Morton @ 2004-05-02 1:41 UTC (permalink / raw)
To: Tim Schmielau; +Cc: johnstul, george, kaukasoi, linux-kernel
Tim Schmielau <tim@physik3.uni-rostock.de> wrote:
>
> +#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
I think this has an inclusion ordering problem.
In file included from net/ipv6/route.c:30:
include/linux/times.h:11:42: division by zero in #if
include/linux/times.h:42:42: division by zero in #if
either NSEC_PER_SEC or USER_HZ hasn't been defined yet.
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-02 1:41 ` Andrew Morton
@ 2004-05-02 1:59 ` Tim Schmielau
2004-05-04 2:40 ` john stultz
0 siblings, 1 reply; 36+ messages in thread
From: Tim Schmielau @ 2004-05-02 1:59 UTC (permalink / raw)
To: Andrew Morton; +Cc: johnstul, george, kaukasoi, linux-kernel
On Sat, 1 May 2004, Andrew Morton wrote:
> Tim Schmielau <tim@physik3.uni-rostock.de> wrote:
> >
> > +#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
>
> I think this has an inclusion ordering problem.
>
> In file included from net/ipv6/route.c:30:
> include/linux/times.h:11:42: division by zero in #if
> include/linux/times.h:42:42: division by zero in #if
>
> either NSEC_PER_SEC or USER_HZ hasn't been defined yet.
>
Yep, we'd need to include timex.h for it. This gets messy.
OK, I found why John's original patch didn't fix the issue, but I'd like
to hand the patch off to someone with a vision of how time shall be
handled in the kernel.
Sorry,
Tim
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-02 1:59 ` Tim Schmielau
@ 2004-05-04 2:40 ` john stultz
2004-05-04 6:12 ` Tim Schmielau
0 siblings, 1 reply; 36+ messages in thread
From: john stultz @ 2004-05-04 2:40 UTC (permalink / raw)
To: Andrew Morton; +Cc: Tim Schmielau, george, kaukasoi, linux-kernel, davem
On Sat, 2004-05-01 at 18:59, Tim Schmielau wrote:
> On Sat, 1 May 2004, Andrew Morton wrote:
>
> > Tim Schmielau <tim@physik3.uni-rostock.de> wrote:
> > >
> > > +#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
> >
> > I think this has an inclusion ordering problem.
>
> Yep, we'd need to include timex.h for it. This gets messy.
Well, not too messy. Including timex.h looks to resolve the issue
without trouble. Let me know if I somehow stepped over an issue.
DaveM: Do please note the changes to the TCP code. Having
jiffies_to_clock_t be more than just a #define introduced a few cast
warnings, and I believe the fixes are right, but you can't be too sure.
jiffies-to-clockt-fix_A1:
-------------------------
All,
This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix
for jiffies_to_clock_t() and jiffies_64_to_clock_t(). The issue
observed was w/ /proc output not matching up to wall time due to
accumulated error caused by HZ not being exactly 1000 on i386 systems.
The solution is to correct that error by using the more accurate
TICK_NSEC in our calculation.
Additionally, this patch corrects 3 warnings in the TCP layer uncovered
by this change.
***DaveM please review!***
thanks
-john
diff -Nru a/include/linux/times.h b/include/linux/times.h
--- a/include/linux/times.h Mon May 3 19:24:14 2004
+++ b/include/linux/times.h Mon May 3 19:24:14 2004
@@ -2,15 +2,21 @@
#define _LINUX_TIMES_H
#ifdef __KERNEL__
+#include <linux/timex.h>
#include <asm/div64.h>
#include <asm/types.h>
#include <asm/param.h>
-#if (HZ % USER_HZ)==0
-# define jiffies_to_clock_t(x) ((x) / (HZ / USER_HZ))
+static inline clock_t jiffies_to_clock_t(long x)
+{
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
+ return x / (HZ / USER_HZ);
#else
-# define jiffies_to_clock_t(x) ((clock_t) jiffies_64_to_clock_t((u64) x))
+ u64 tmp = (u64)x * TICK_NSEC;
+ do_div(tmp, (NSEC_PER_SEC / USER_HZ));
+ return (long)tmp;
#endif
+}
static inline unsigned long clock_t_to_jiffies(unsigned long x)
{
@@ -34,7 +40,7 @@
static inline u64 jiffies_64_to_clock_t(u64 x)
{
-#if (HZ % USER_HZ)==0
+#if (TICK_NSEC % (NSEC_PER_SEC / USER_HZ)) == 0
do_div(x, HZ / USER_HZ);
#else
/*
@@ -42,8 +48,8 @@
* but even this doesn't overflow in hundreds of years
* in 64 bits, so..
*/
- x *= USER_HZ;
- do_div(x, HZ);
+ x *= TICK_NSEC;
+ do_div(x, (NSEC_PER_SEC / USER_HZ));
#endif
return x;
}
diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
--- a/net/ipv4/tcp_ipv4.c Mon May 3 19:24:14 2004
+++ b/net/ipv4/tcp_ipv4.c Mon May 3 19:24:14 2004
@@ -2452,7 +2452,7 @@
int ttd = req->expires - jiffies;
sprintf(tmpbuf, "%4d: %08X:%04X %08X:%04X"
- " %02X %08X:%08X %02X:%08X %08X %5d %8d %u %d %p",
+ " %02X %08X:%08X %02X:%08lX %08X %5d %8d %u %d %p",
i,
req->af.v4_req.loc_addr,
ntohs(inet_sk(sk)->sport),
@@ -2526,7 +2526,7 @@
srcp = ntohs(tw->tw_sport);
sprintf(tmpbuf, "%4d: %08X:%04X %08X:%04X"
- " %02X %08X:%08X %02X:%08X %08X %5d %8d %d %d %p",
+ " %02X %08X:%08X %02X:%08lX %08X %5d %8d %d %d %p",
i, src, srcp, dest, destp, tw->tw_substate, 0, 0,
3, jiffies_to_clock_t(ttd), 0, 0, 0, 0,
atomic_read(&tw->tw_refcnt), tw);
diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
--- a/net/ipv6/tcp_ipv6.c Mon May 3 19:24:14 2004
+++ b/net/ipv6/tcp_ipv6.c Mon May 3 19:24:14 2004
@@ -1933,7 +1933,7 @@
dest = &req->af.v6_req.rmt_addr;
seq_printf(seq,
"%4d: %08X%08X%08X%08X:%04X %08X%08X%08X%08X:%04X "
- "%02X %08X:%08X %02X:%08X %08X %5d %8d %d %d %p\n",
+ "%02X %08X:%08X %02X:%08lX %08X %5d %8d %d %d %p\n",
i,
src->s6_addr32[0], src->s6_addr32[1],
src->s6_addr32[2], src->s6_addr32[3],
@@ -2019,7 +2019,7 @@
seq_printf(seq,
"%4d: %08X%08X%08X%08X:%04X %08X%08X%08X%08X:%04X "
- "%02X %08X:%08X %02X:%08X %08X %5d %8d %d %d %p\n",
+ "%02X %08X:%08X %02X:%08lX %08X %5d %8d %d %d %p\n",
i,
src->s6_addr32[0], src->s6_addr32[1],
src->s6_addr32[2], src->s6_addr32[3], srcp,
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-04 2:40 ` john stultz
@ 2004-05-04 6:12 ` Tim Schmielau
2004-05-04 14:59 ` john stultz
0 siblings, 1 reply; 36+ messages in thread
From: Tim Schmielau @ 2004-05-04 6:12 UTC (permalink / raw)
To: john stultz; +Cc: Andrew Morton, george, kaukasoi, linux-kernel, davem
On Mon, 3 May 2004, john stultz wrote:
> On Sat, 2004-05-01 at 18:59, Tim Schmielau wrote:
> >
> > Yep, we'd need to include timex.h for it. This gets messy.
>
> Well, not too messy. Including timex.h looks to resolve the issue
> without trouble. Let me know if I somehow stepped over an issue.
It looks ok, but somehow defeats the whole purpose of having separate
include files. Someday we may consolidate all the time related things
into just one or two header files then.
> jiffies-to-clockt-fix_A1:
> -------------------------
Thanks, John!
> All,
> This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix
> for jiffies_to_clock_t() and jiffies_64_to_clock_t(). The issue
> observed was w/ /proc output not matching up to wall time due to
> accumulated error caused by HZ not being exactly 1000 on i386 systems.
> The solution is to correct that error by using the more accurate
> TICK_NSEC in our calculation.
I wonder whether it's conceptually correct to use jiffies for accurate
long-time measurements at all. ntpd is there for a reason. Using both
corrected, accurate and freely running clocks IMHO is calling for trouble.
This might be something to think about for 2.7.
Thanks,
Tim
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-04 6:12 ` Tim Schmielau
@ 2004-05-04 14:59 ` john stultz
2004-05-04 16:50 ` Tim Schmielau
2004-05-07 0:33 ` George Anzinger
0 siblings, 2 replies; 36+ messages in thread
From: john stultz @ 2004-05-04 14:59 UTC (permalink / raw)
To: Tim Schmielau; +Cc: Andrew Morton, george, kaukasoi, linux-kernel, davem
On Mon, 2004-05-03 at 23:12, Tim Schmielau wrote:
> On Mon, 3 May 2004, john stultz wrote:
> > This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix
> > for jiffies_to_clock_t() and jiffies_64_to_clock_t(). The issue
> > observed was w/ /proc output not matching up to wall time due to
> > accumulated error caused by HZ not being exactly 1000 on i386 systems.
> > The solution is to correct that error by using the more accurate
> > TICK_NSEC in our calculation.
>
> I wonder whether it's conceptually correct to use jiffies for accurate
> long-time measurements at all. ntpd is there for a reason. Using both
> corrected, accurate and freely running clocks IMHO is calling for trouble.
> This might be something to think about for 2.7.
Indeed. Moving away from jiffies as a time counter and more of an
interrupt counter is important. That allows for implementations of
variable HZ and other things the high-res timer folks want without
affecting the time keeping code.
Roughly, I'd like to see the time code for all arches in 2.7 to look
like:
u64 system_time /* NTP adjusted nanosecs since boot */
u64 wall_time_offset /* offset to system_time for time of day */
u64 offset_base /* last read raw hw value */
ts_read():
returns the raw cycle value from the hardware timesource
(TSC/ACPI PM/HPET)
ts_delta(now, then):
returns the difference between two raw cycle values
ts_cyc2ns(cycles):
converts a cycle value to ns
monotonic_clock():
returns NTP adjusted nanoseconds since boot
ie: system_time +
NTP_GUNK(ts_cyc2ns(ts_delta(ts_read(),offset_base)))
gettimeofday():
returns monotonic_clock() + wall_time_offset
settimeofday():
adjusts only wall_time_offset
time_interrupt_hook():
updates system_time. called by timer interrupt at least once
every hardware cycle (ie: before the hardware counter
overflows), but otherwise unaffected by lost interrupts, etc.
ie:
then = offset_base
now = ts_read()
system_time += NTP_GUNK(ts_cyc2ns(ts_delta(now, then)));
DO_MORE_NTP_GUNK()
And ignoring the magic NTP_GUNK macros, that's all there is to it
(Although don't kid yourself, the NTP_GUNK is nasty).
Of course, with this approach, we actually have to be able to trust the
hardware 100%. With the current state of i386 hw having serious problems
w/ reliable timesources, this may be difficult.
I've got a bigger proposal (with proper credits to Keith Mannthey and
George Anzinger for reviews and corrections) that I wrote up awhile
back, and I'll likely send it out if this sketch gathers any interest.
thanks
-john
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-04 14:59 ` john stultz
@ 2004-05-04 16:50 ` Tim Schmielau
2004-05-07 0:33 ` George Anzinger
1 sibling, 0 replies; 36+ messages in thread
From: Tim Schmielau @ 2004-05-04 16:50 UTC (permalink / raw)
To: john stultz; +Cc: Andrew Morton, george, kaukasoi, linux-kernel, davem
On Tue, 4 May 2004, john stultz wrote:
> On Mon, 2004-05-03 at 23:12, Tim Schmielau wrote:
> >
> > I wonder whether it's conceptually correct to use jiffies for accurate
> > long-time measurements at all. ntpd is there for a reason. Using both
> > corrected, accurate and freely running clocks IMHO is calling for trouble.
> > This might be something to think about for 2.7.
>
> Indeed. Moving away from jiffies as a time counter and more of an
> interrupt counter is important. That allows for implementations of
> variable HZ and other things the high-res timer folks want without
> affecting the time keeping code.
>
> Roughly, I'd like to see the time code for all arches in 2.7 to look
> like:
[simple, well thought-out proposal snipped]
> time_interrupt_hook():
> updates system_time.
> Of course, with this approach, we actually have to be able to trust the
> hardware 100%. With the current state of i386 hw having serious problems
> w/ reliable timesources, this may be difficult.
Well, with some configurable plausibility checks in time_interrupt_hook()
it shouldn't be worse than what we have now...
> I've got a bigger proposal (with proper credits to Keith Mannthey and
> George Anzinger for reviews and corrections) that I wrote up awhile
> back, and I'll likely send it out if this sketch gathers any interest.
Yes, that sounds interesting. It's just that I won't have any spare time
to spend in the next two weeks.
Tim
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-04 14:59 ` john stultz
2004-05-04 16:50 ` Tim Schmielau
@ 2004-05-07 0:33 ` George Anzinger
2004-05-07 1:21 ` john stultz
1 sibling, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-05-07 0:33 UTC (permalink / raw)
To: john stultz; +Cc: Tim Schmielau, Andrew Morton, kaukasoi, linux-kernel, davem
john stultz wrote:
> On Mon, 2004-05-03 at 23:12, Tim Schmielau wrote:
>
>>On Mon, 3 May 2004, john stultz wrote:
>>
>>> This patch polishes up Tim Schmielau's (tim@physik3.uni-rostock.de) fix
>>>for jiffies_to_clock_t() and jiffies_64_to_clock_t(). The issue
>>>observed was w/ /proc output not matching up to wall time due to
>>>accumulated error caused by HZ not being exactly 1000 on i386 systems.
>>>The solution is to correct that error by using the more accurate
>>>TICK_NSEC in our calculation.
>>
>>I wonder whether it's conceptually correct to use jiffies for accurate
>>long-time measurements at all. ntpd is there for a reason. Using both
>>corrected, accurate and freely running clocks IMHO is calling for trouble.
>>This might be something to think about for 2.7.
>
>
> Indeed. Moving away from jiffies as a time counter and more of an
> interrupt counter is important. That allows for implementations of
> variable HZ and other things the high-res timer folks want without
> affecting the time keeping code.
>
> Roughly, I'd like to see the time code for all arches in 2.7 to look
> like:
>
> u64 system_time /* NTP adjusted nanosecs since boot */
> u64 wall_time_offset /* offset to system_time for time of day */
> u64 offset_base /* last read raw hw value */
Hm. In 2.6 we use an NTP adjusted wall time and a wall_to_monotonic offset. I
don't really see the advantage here. Does this change buy us something?
For what it's worth, I introduced the wall_to_monotonic offset just because it
was easier to do (and understand, I think) in the current kernel.
>
> ts_read():
> returns the raw cycle value from the hardware timesource
> (TSC/ACPI PM/HPET)
> ts_delta(now, then):
> returns the difference between two raw cycle values
> ts_cyc2ns(cycles):
> converts a cycle value to ns
>
> monotonic_clock():
> returns NTP adjusted nanoseconds since boot
> ie: system_time +
> NTP_GUNK(ts_cyc2ns(ts_delta(ts_read(),offset_base)))
> gettimeofday():
> returns monotonic_clock() + sys_time_offset
> settimeofday():
> adjusts only sys_time_offset
> time_interrupt_hook():
> updates system_time. called by timer interrupt at least once
> every hardware cycle (ie: before the hardware counter
> overflows), but otherwise unaffected by lost interrupts, etc.
> ie:
> then = offset_base
> now = ts_read()
> system_time += NTP_GUNK(ts_cyc2ns(ts_delta(now, then)));
> DO_MORE_NTP_GUNK()
>
> And ignoring the magic NTP_GUNK macros, that's all there is to it
> (Although don't kid yourself, the NTP_GUNK is nasty).
Right, and it needs to be recast to use secs and nanosecs... But you forget the
accounting code which needs the periodic interrupt to charge time to whomever.
>
> Of course, with this approach, we actually have to be able to trust the
> hardware 100%. With the current state of i386 hw having serious problems
> w/ reliable timesources, this may be difficult.
Yes, and there is also a problem getting a stable, reliable, and correct
calibration of TSC to PIT even with a constant TSC rate. In the HRT patch I
finally resorted to correcting the TSC last read value on a regular basis. Without
this it drifts (or maybe, more correctly, the calibration was wrong) enough
to mess up the high res timers.
I suspect that while we use two different timers (PIT & TSC or PIT & pm-timer or
...) that don't use the same source clock we will continue to have such
problems. Other archs have a much easier time of it.
>
> I've got a bigger proposal (with proper credits to Keith Mannthey and
> George Anzinger for reviews and corrections) that I wrote up awhile
> back, and I'll likely send it out if this sketch gathers any interest.
>
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-07 0:33 ` George Anzinger
@ 2004-05-07 1:21 ` john stultz
2004-05-07 20:41 ` George Anzinger
0 siblings, 1 reply; 36+ messages in thread
From: john stultz @ 2004-05-07 1:21 UTC (permalink / raw)
To: ganzinger; +Cc: Tim Schmielau, Andrew Morton, kaukasoi, linux-kernel, davem
On Thu, 2004-05-06 at 17:33, George Anzinger wrote:
> john stultz wrote:
> > Roughly, I'd like to see the time code for all arches in 2.7 to look
> > like:
> >
> > u64 system_time /* NTP adjusted nanosecs since boot */
> > u64 wall_time_offset /* offset to system_time for time of day */
> > u64 offset_base /* last read raw hw value */
>
> Hm. In 2.6 we use an NTP adjusted wall time and a wall_to_monotonic offset. I
> don't really see the advantage here. Does this change buy us something?
> For what it's worth, I introduced the wall_to_monotonic offset just because it
> was easier to do (and understand, I think) in the current kernel.
Well, in my opinion it seems much cleaner. Right now any time we adjust
xtime, we have to remember to adjust wall_to_monotonic. I believe we've
had issues where a change was made to just one and not the other.
This is easier and has simpler rules. system_time always increments and
is only modified by the periodic time_interrupt_hook(). Then
wall_time_offset is only changed by do_settimeofday(). In fact, I hope
to make these values static to the time code, so that all in-kernel
users must go through the monotonic_clock() and do_gettimeofday()
interfaces.
To be brutal, I'd like to see xtime killed completely. Jiffies and HZ
too, although I'd be happy with those two being made static to the
interval timer code. There are too many places where folks have tried to
extrapolate a time value from some global accounting variable, and w/ HZ
not quite being exactly 1000 now on i386, all that code is just slightly
wrong. It's spaghetti code now, and we just need to put that mess behind
a few clean understandable interfaces.
But hey, that's me dreaming. I'm sure there will be some odd reason
someone will need to get into the guts of the time code and we'll have
to break the opacity. :)
And I do realize we'll also need a get_timestamp() or something that
will quickly return a low-res timestamp like the current value of
system_time. I don't intend for all those users of xtime or jiffies to
really go out and hit hardware to calculate a nanosecond accurate time
value.
> > And ignoring the magic NTP_GUNK macros, that's all there is to it
> > (Although don't kid yourself, the NTP_GUNK is nasty).
>
> Right, and it needs to be recast to use secs and nanosecs...
Yea, yea, you're right. u64 nanosecond values are just so much simpler
to work with, until you hit the NTP code.
> But you forget the
> accounting code which needs the periodic interrupt to charge time to whomever.
True, although I'd like to avoid doing that in the time subsystem.
Instead the interval timer subsystem would run the accounting code,
which would then call get_timestamp() to calculate the amount of time to
charge a process.
thanks
-john
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-07 1:21 ` john stultz
@ 2004-05-07 20:41 ` George Anzinger
2004-05-07 21:38 ` john stultz
0 siblings, 1 reply; 36+ messages in thread
From: George Anzinger @ 2004-05-07 20:41 UTC (permalink / raw)
To: john stultz
Cc: ganzinger, Tim Schmielau, Andrew Morton, kaukasoi, linux-kernel,
davem
john stultz wrote:
> On Thu, 2004-05-06 at 17:33, George Anzinger wrote:
>
>>john stultz wrote:
>>
>>>Roughly, I'd like to see the time code for all arches in 2.7 to look
>>>like:
>>>
>>>u64 system_time /* NTP adjusted nanosecs since boot */
>>>u64 wall_time_offset /* offset to system_time for time of day */
>>>u64 offset_base /* last read raw hw value */
>>
>>Hm. In 2.6 we use an NTP adjusted wall time and a wall_to_monotonic offset. I
>>don't really see the advantage here. Does this change buy us something?
>>For what it's worth, I introduced the wall_to_monotonic offset just because it
>>was easier to do (and understand, I think) in the current kernel.
>
>
> Well, in my opinion it seems much cleaner. Right now any time we adjust
> xtime, we have to remember to adjust wall_to_monotonic. I believe we've
> had issues where a change was made to just one and not the other.
>
> This is easier and has simpler rules. system_time always increments and
> is only modified by the periodic time_interrupt_hook(). Then
> wall_time_offset is only changed by do_settimeofday(). In fact, I hope
> to make these values static to the time code, so that all in-kernel
> users must go through the monotonic_clock() and do_gettimeofday()
> interfaces.
All that is fine for the kernel coder and such, but the fact remains that
gettimeofday() is the BIG user and I keep seeing folks trying to make it faster.
Also xtime.tv_sec is used a LOT in the kernel under the name: get_seconds().
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
* Re: /proc or ps tools bug? 2.6.3, time is off
2004-05-07 20:41 ` George Anzinger
@ 2004-05-07 21:38 ` john stultz
0 siblings, 0 replies; 36+ messages in thread
From: john stultz @ 2004-05-07 21:38 UTC (permalink / raw)
To: ganzinger; +Cc: Tim Schmielau, Andrew Morton, kaukasoi, lkml, davem
On Fri, 2004-05-07 at 13:41, George Anzinger wrote:
> john stultz wrote:
> > On Thu, 2004-05-06 at 17:33, George Anzinger wrote:
> >
> >>john stultz wrote:
> >>
> >>>Roughly, I'd like to see the time code for all arches in 2.7 to look
> >>>like:
> >>>
> >>>u64 system_time /* NTP adjusted nanosecs since boot */
> >>>u64 wall_time_offset /* offset to system_time for time of day */
> >>>u64 offset_base /* last read raw hw value */
> >>
> >>Hm. In 2.6 we use an NTP adjusted wall time and a wall_to_monotonic offset. I
> >>don't really see the advantage here. Does this change buy us something?
> >>For what it's worth, I introduced the wall_to_monotonic offset just because it
> >>was easier to do (and understand, I think) in the current kernel.
> >
> >
> > Well, in my opinion it seems much cleaner. Right now any time we adjust
> > xtime, we have to remember to adjust wall_to_monotonic. I believe we've
> > had issues where a change was made to just one and not the other.
> >
> > This is easier and has simpler rules. system_time always increments and
> > is only modified by the periodic time_interrupt_hook(). Then
> > wall_time_offset is only changed by do_settimeofday(). In fact, I hope
> > to make these values static to the time code, so that all in-kernel
> > users must go through the monotonic_clock() and do_gettimeofday()
> > interfaces.
>
> All that is fine for the kernel coder and such, but the fact remains that
> gettimeofday() is the BIG user and I keep seeing folks trying to make it faster.
> Also xtime.tv_sec is used a LOT in the kernel under the name: get_seconds().
<sigh> You may be right. Having to convert from a u64 nanosec value
to a timeval in sys_gettimeofday() as well as get_seconds() may be a
performance problem. But I'm not completely convinced, as we already
have to play games shifting from timevals to timespecs and back. I'm not
sure the nsec/1000000 will kill us.
Pragmatically I'm willing to bend on that one by using timespecs instead
of u64s. But while I'm in the design phase, thinking of the problem as
juggling u64 nanoseconds simplifies it. Be it a u64 or a timespec, it
really doesn't change the design all that much. With one you get to use "+"
and with the other you use "time_add()".
thanks
-john
Thread overview: 36+ messages
2004-02-25 1:58 /proc or ps tools bug? 2.6.3, time is off David Ford
2004-02-25 1:54 ` Albert Cahalan
2004-02-25 5:10 ` David Ford
2004-02-25 3:27 ` Albert Cahalan
2004-02-25 16:28 ` George Anzinger
2004-02-25 16:04 ` Albert Cahalan
2004-02-25 20:45 ` George Anzinger
2004-02-25 19:16 ` Albert Cahalan
2004-02-25 21:10 ` George Anzinger
2004-02-26 1:52 ` john stultz
2004-02-26 23:06 ` George Anzinger
2004-02-26 23:10 ` john stultz
2004-02-27 0:20 ` George Anzinger
2004-04-13 22:38 ` john stultz
2004-04-13 22:59 ` George Anzinger
2004-04-14 12:10 ` Tim Schmielau
2004-04-14 17:03 ` George Anzinger
2004-04-14 18:28 ` john stultz
2004-04-15 10:37 ` Petri Kaukasoina
2004-04-15 11:05 ` Tim Schmielau
2004-04-15 16:14 ` Petri Kaukasoina
2004-05-01 13:51 ` Tim Schmielau
2004-05-02 1:41 ` Andrew Morton
2004-05-02 1:59 ` Tim Schmielau
2004-05-04 2:40 ` john stultz
2004-05-04 6:12 ` Tim Schmielau
2004-05-04 14:59 ` john stultz
2004-05-04 16:50 ` Tim Schmielau
2004-05-07 0:33 ` George Anzinger
2004-05-07 1:21 ` john stultz
2004-05-07 20:41 ` George Anzinger
2004-05-07 21:38 ` john stultz
2004-02-26 23:14 ` George Anzinger
2004-02-25 9:14 ` Petri Kaukasoina
2004-02-25 9:18 ` Petri Kaukasoina
2004-02-25 21:39 ` David Ford