All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Provide an interface for getting the current tick length
@ 2006-02-16  1:17 Paul Mackerras
  2006-02-16  1:35 ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Mackerras @ 2006-02-16  1:17 UTC (permalink / raw)
  To: torvalds, akpm; +Cc: linux-kernel, johnstul

This provides an interface for arch code to find out how many
nanoseconds are going to be added on to xtime by the next call to
do_timer.  The value returned is a fixed-point number in 52.12 format
in nanoseconds.  The reason for this format is that it gives the
full precision that the timekeeping code is using internally.

The motivation for this is to fix a problem that has arisen on 32-bit
powerpc in that the value returned by do_gettimeofday drifts apart
from xtime if NTP is being used.  PowerPC is now using a lockless
do_gettimeofday based on reading the timebase register and performing
some simple arithmetic.  (This method of getting the time is also
exported to userspace via the VDSO.)  However, the factor and offset
it uses were calculated based on the nominal tick length and weren't
being adjusted when NTP varied the tick length.

Note that 64-bit powerpc has had the lockless do_gettimeofday for a
long time now.  It also had an extremely hairy routine that got called
from the 32-bit compat routine for adjtimex, which adjusted the
factor and offset according to what it thought the timekeeping code
was going to do.  Not only was this only called if a 32-bit task did
adjtimex (i.e. not if a 64-bit task did adjtimex), it was also
duplicating computations from kernel/timer.c and it wasn't clear that
it was (still) correct.

The simple solution is to ask the timekeeping code how long the
current jiffy will be on each timer interrupt, after calling
do_timer.  If this jiffy will be a different length from the last one,
we then need to compute new values for the factor and offset used in
the lockless do_gettimeofday.  In this way we can keep xtime and
do_gettimeofday in sync, even when NTP is varying the tick length.

Note that when adjtimex varies the tick length, it almost always
introduces the variation from the next tick on.  The only case I could
see where adjtimex would vary the length of the current tick is when
an old-style adjtime adjustment is being cancelled.  (It's not clear
to me why the adjustment has to be cancelled immediately rather than
from the next tick on.)  Thus I don't see any real need for a hook in
adjtimex; the rare case of an old-style adjustment being cancelled can
be fixed up at the next tick.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
Linus, Andrew:

This is the generic part of a patch that I posted recently in response
to reports of a regression on 32-bit powerpc, where xtime and
gettimeofday were getting out of sync.  I want this to go into 2.6.16
so that I can then put in the powerpc-specific part of the patch and
fix the regression.  This will also let me remove the extremely hairy
ppc_adjtimex routine from arch/powerpc/kernel/time.c, which nobody on
the planet really understands (except possibly one person in IBM
Rochester :).

This patch doesn't affect other architectures at all, so I think it
should be safe to put in.

diff -urN linux-2.6/include/linux/timex.h ntpfix/include/linux/timex.h
--- linux-2.6/include/linux/timex.h	2005-10-31 13:10:39.000000000 +1100
+++ ntpfix/include/linux/timex.h	2006-02-13 15:07:52.000000000 +1100
@@ -345,6 +345,9 @@
 
 #endif /* !CONFIG_TIME_INTERPOLATION */
 
+/* Returns how long ticks are at present, in ns / 2^(SHIFT_SCALE-10). */
+extern u64 current_tick_length(void);
+
 #endif /* KERNEL */
 
 #endif /* LINUX_TIMEX_H */
diff -urN linux-2.6/kernel/timer.c ntpfix/kernel/timer.c
--- linux-2.6/kernel/timer.c	2006-02-09 11:39:05.000000000 +1100
+++ ntpfix/kernel/timer.c	2006-02-13 15:07:54.000000000 +1100
@@ -759,6 +759,30 @@
 }
 
 /*
+ * Return how long ticks are at the moment, that is, how much time
+ * update_wall_time_one_tick will add to xtime next time we call it
+ * (assuming no calls to do_adjtimex in the meantime).
+ * The return value is in fixed-point nanoseconds with SHIFT_SCALE-10
+ * bits to the right of the binary point.
+ * This function has no side-effects.
+ */
+u64 current_tick_length(void)
+{
+	long time_adjust_step, delta_nsec;
+
+	if ((time_adjust_step = time_adjust) != 0 ) {
+		/*
+		 * Limit the amount of the step to be in the range
+		 * -tickadj .. +tickadj
+		 */
+		time_adjust_step = min(time_adjust_step, (long)tickadj);
+		time_adjust_step = max(time_adjust_step, (long)-tickadj);
+	}
+	delta_nsec = tick_nsec + time_adjust_step * 1000;
+	return ((u64) delta_nsec << (SHIFT_SCALE - 10)) + time_adj;
+}
+
+/*
  * Using a loop looks inefficient, but "ticks" is
  * usually just one (we shouldn't be losing ticks,
  * we're doing this this way mainly for interrupt

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  1:17 [PATCH] Provide an interface for getting the current tick length Paul Mackerras
@ 2006-02-16  1:35 ` Andrew Morton
  2006-02-16  1:55   ` Paul Mackerras
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-02-16  1:35 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: torvalds, linux-kernel, johnstul

Paul Mackerras <paulus@samba.org> wrote:
>
> +u64 current_tick_length(void)
>  +{
>  +	long time_adjust_step, delta_nsec;
>  +
>  +	if ((time_adjust_step = time_adjust) != 0 ) {

<mutters something about coding style>

>  +		/*
>  +		 * Limit the amount of the step to be in the range
>  +		 * -tickadj .. +tickadj
>  +		 */
>  +		time_adjust_step = min(time_adjust_step, (long)tickadj);
>  +		time_adjust_step = max(time_adjust_step, (long)-tickadj);
>  +	}
>  +	delta_nsec = tick_nsec + time_adjust_step * 1000;

Is that going to overflow if sizeof(long) == 4?

>  +	return ((u64) delta_nsec << (SHIFT_SCALE - 10)) + time_adj;
>  +}

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  1:35 ` Andrew Morton
@ 2006-02-16  1:55   ` Paul Mackerras
  2006-02-16  2:08     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Mackerras @ 2006-02-16  1:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel, johnstul

Andrew Morton writes:

> >  +	if ((time_adjust_step = time_adjust) != 0 ) {
> 
> <mutters something about coding style>

Seems perfectly plain to me, but if you don't like it I can change it.

> 
> >  +		/*
> >  +		 * Limit the amount of the step to be in the range
> >  +		 * -tickadj .. +tickadj
> >  +		 */
> >  +		time_adjust_step = min(time_adjust_step, (long)tickadj);
> >  +		time_adjust_step = max(time_adjust_step, (long)-tickadj);
> >  +	}
> >  +	delta_nsec = tick_nsec + time_adjust_step * 1000;
> 
> Is that going to overflow if sizeof(long) == 4?

No.  time_adjust_step is in microseconds and is restricted to the
range -tickadj .. tickadj, and tickadj is between 1 and 10 (assuming
HZ >= 50).  tick_nsec is around 1e9 / HZ.  There's no way delta_nsec
could end up less than 0 or larger than around 20 million for any
reasonable HZ value (i.e. >= 50).

Paul.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  1:55   ` Paul Mackerras
@ 2006-02-16  2:08     ` Andrew Morton
  2006-02-16  2:54       ` Paul Mackerras
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-02-16  2:08 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: torvalds, linux-kernel, johnstul

Paul Mackerras <paulus@samba.org> wrote:
>
> Andrew Morton writes:
> 
> > >  +	if ((time_adjust_step = time_adjust) != 0 ) {
> > 
> > <mutters something about coding style>
> 
> Seems perfectly plain to me, but if you don't like it I can change it.
> 

[random space before right-paren?]

	time_adjust_step = time_adjust;
	if (time_adjust_step) {

would be more conventional.  Some like the `!= 0' as well - personally I
find that a net distraction.

Ah, you copied-and-pasted from update_wall_time_one_tick().

Can we share that code?

> > 
> > >  +		/*
> > >  +		 * Limit the amount of the step to be in the range
> > >  +		 * -tickadj .. +tickadj
> > >  +		 */
> > >  +		time_adjust_step = min(time_adjust_step, (long)tickadj);
> > >  +		time_adjust_step = max(time_adjust_step, (long)-tickadj);
> > >  +	}
> > >  +	delta_nsec = tick_nsec + time_adjust_step * 1000;
> > 
> > Is that going to overflow if sizeof(long) == 4?
> 
> No.  time_adjust_step is in microseconds and is restricted to the
> range -tickadj .. tickadj, and tickadj is between 1 and 10 (assuming
> HZ >= 50).  tick_nsec is around 1e9 / HZ.  There's no way delta_nsec
> could end up less than 0 or larger than around 20 million for any
> reasonable HZ value (i.e. >= 50).
> 

ok...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  2:08     ` Andrew Morton
@ 2006-02-16  2:54       ` Paul Mackerras
  2006-02-16  3:11         ` Andrew Morton
  2006-02-16  3:30         ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: Paul Mackerras @ 2006-02-16  2:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel, johnstul

Andrew Morton writes:

> Ah, you copied-and-pasted from update_wall_time_one_tick().

Yep, and I didn't even notice the extra space.

> Can we share that code?

We could share the code that computes time_adjust_step, i.e. this
much:

	if ((time_adjust_step = time_adjust) != 0) {
		/*
		 * We are doing an adjtime thing.  Prepare time_adjust_step to
		 * be within bounds.  Note that a positive time_adjust means we
		 * want the clock to run faster.
		 *
		 * Limit the amount of the step to be in the range
		 * -tickadj .. +tickadj
		 */
		time_adjust_step = min(time_adjust_step, (long)tickadj);
		time_adjust_step = max(time_adjust_step, (long)-tickadj);
	}

Is that enough to be worth factoring out?  Note that
update_wall_time_one_tick() needs both time_adjust_step and
delta_nsec, so to share more, we would have to have a function
returning two values and it would start to get ugly.

Paul.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  2:54       ` Paul Mackerras
@ 2006-02-16  3:11         ` Andrew Morton
  2006-02-16  3:18           ` Paul Mackerras
  2006-02-16  3:30         ` Linus Torvalds
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-02-16  3:11 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: torvalds, linux-kernel, johnstul

Paul Mackerras <paulus@samba.org> wrote:
>
> > Can we share that code?
> 
> We could share the code that computes time_adjust_step, i.e. this
> much:
> 
> 	if ((time_adjust_step = time_adjust) != 0) {
> 		/*
> 		 * We are doing an adjtime thing.  Prepare time_adjust_step to
> 		 * be within bounds.  Note that a positive time_adjust means we
> 		 * want the clock to run faster.
> 		 *
> 		 * Limit the amount of the step to be in the range
> 		 * -tickadj .. +tickadj
> 		 */
> 		time_adjust_step = min(time_adjust_step, (long)tickadj);
> 		time_adjust_step = max(time_adjust_step, (long)-tickadj);
> 	}
>

And the next line!

> Is that enough to be worth factoring out?  Note that
> update_wall_time_one_tick() needs both time_adjust_step and
> delta_nsec, so to share more, we would have to have a function
> returning two values and it would start to get ugly.
> 

update_wall_time_one_tick() gets:

	long delta_nsec = new_function();

and your new function becomes

	return (u64)new_function() << (SHIFT_SCALE - 10)) + time_adj;



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  3:11         ` Andrew Morton
@ 2006-02-16  3:18           ` Paul Mackerras
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Mackerras @ 2006-02-16  3:18 UTC (permalink / raw)
  To: Andrew Morton; +Cc: torvalds, linux-kernel, johnstul

Andrew Morton writes:

> Paul Mackerras <paulus@samba.org> wrote:
> 
> > Is that enough to be worth factoring out?  Note that
> > update_wall_time_one_tick() needs both time_adjust_step and
> > delta_nsec, so to share more, we would have to have a function
> > returning two values and it would start to get ugly.
> > 
> 
> update_wall_time_one_tick() gets:
> 
> 	long delta_nsec = new_function();
> 
> and your new function becomes
> 
> 	return (u64)new_function() << (SHIFT_SCALE - 10)) + time_adj;

and update_wall_time_one_tick misses out on this bit:

		/* Reduce by this step the amount of time left  */
		time_adjust -= time_adjust_step;

That's why I said that update_wall_time_one_tick needs both
time_adjust_step and delta_nsec.  If you have a nice way to return
both values, please let me know...

Paul.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  2:54       ` Paul Mackerras
  2006-02-16  3:11         ` Andrew Morton
@ 2006-02-16  3:30         ` Linus Torvalds
  2006-02-16  3:43           ` Paul Mackerras
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2006-02-16  3:30 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Andrew Morton, linux-kernel, johnstul



On Thu, 16 Feb 2006, Paul Mackerras wrote:
> 
> We could share the code that computes time_adjust_step, i.e. this
> much:
> 
> 	if ((time_adjust_step = time_adjust) != 0) {

And while at it, please make it much more readable by writing it as

	time_adjust_step = time_adjust;
	if (time_adjust_step) {
		..

which is even less to type (no "!= 0", no extra parenthesis, just a 
";<nl><tab>", and you've saved a whopping three bytes of source code while 
making the end result more readable, and the compiler will generate the 
same thing.

Assignments inside tests should probably be relegated entirely to loop 
constructs, where doing them outside the test changes semantics.

			Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  3:30         ` Linus Torvalds
@ 2006-02-16  3:43           ` Paul Mackerras
  2006-02-16  3:51             ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Mackerras @ 2006-02-16  3:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, linux-kernel, johnstul

Linus Torvalds writes:

> And while at it, please make it much more readable by writing it as
> 
> 	time_adjust_step = time_adjust;
> 	if (time_adjust_step) {
> 		..
> 
> which is even less to type (no "!= 0", no extra parenthesis, just a 
> ";<nl><tab>", and you've saved a whopping three bytes of source code while 

... you've forgotten that now I've had to type "time_adjust_step"
twice. :P  Anyway, I don't mind changing it, if you think it's worth
pulling that little bit of code out into its own function.

Paul.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] Provide an interface for getting the current tick length
  2006-02-16  3:43           ` Paul Mackerras
@ 2006-02-16  3:51             ` Linus Torvalds
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Torvalds @ 2006-02-16  3:51 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Andrew Morton, linux-kernel, johnstul



On Thu, 16 Feb 2006, Paul Mackerras wrote:
> 
> ... you've forgotten that now I've had to type "time_adjust_step"
> twice. :P

Damn, you're right.

> Anyway, I don't mind changing it, if you think it's worth
> pulling that little bit of code out into its own function.

Sounds sane. Less duplication, and clearer code. Go wild.

		Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-02-16  3:51 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-02-16  1:17 [PATCH] Provide an interface for getting the current tick length Paul Mackerras
2006-02-16  1:35 ` Andrew Morton
2006-02-16  1:55   ` Paul Mackerras
2006-02-16  2:08     ` Andrew Morton
2006-02-16  2:54       ` Paul Mackerras
2006-02-16  3:11         ` Andrew Morton
2006-02-16  3:18           ` Paul Mackerras
2006-02-16  3:30         ` Linus Torvalds
2006-02-16  3:43           ` Paul Mackerras
2006-02-16  3:51             ` Linus Torvalds

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.