[PATCH 0/3] timekeeping: Improved NOHZ frequency steering

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
@ 2014-04-24 23:04 John Stultz
  2014-04-24 23:04 ` [PATCH 1/3] [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz John Stultz
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: John Stultz @ 2014-04-24 23:04 UTC (permalink / raw)
  To: LKML; +Cc: John Stultz, Miroslav Lichvar, Richard Cochran, Prarit Bhargava

Continuing the sporadic work on improving the timekeeping
frequency steering logic when NOHZ is enabled, I've made a number
of changes to my re-implementation of Miroslav's patch (most
recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
and I'm getting much closer results in the simulator.

Compared with Miroslav's patch, this avoids doing any extra
divisions, and instead approximates the correction
logarithmically. In addition to trying to lower overhead, this
allows the other accounting bits to be managed in much the
same way we have been doing for awhile.

This patch set contains one portion from Miroslav's patch
(ntp_tick caching) split out into a separate logical patch.

Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc,
you'll need to revert b399fe355b30d01 to get the simulator building)
comparing this patchset to Miroslav's patch to show we're getting in
the right order of magnitude.


Miroslav's patch:
-----------------
$ ./tk_test -t 500 -n 4000
samples: 1-500 reg: 1-500 slope: 1.00 dev: 115.5 max: 300.1 freq: -100.00004
samples: 501-1000 reg: 501-1000 slope: 1.00 dev: 73.7 max: 289.6 freq: -100.00003
samples: 1001-1500 reg: 1001-1500 slope: 1.00 dev: 59.7 max: 283.9 freq: -100.00001
samples: 1501-2000 reg: 1501-2000 slope: 1.00 dev: 50.6 max: 291.6 freq: -100.00000
samples: 2001-2500 reg: 2001-2500 slope: 1.00 dev: 48.6 max: 285.7 freq: -100.00002
samples: 2501-3000 reg: 2501-3000 slope: 1.00 dev: 44.8 max: 293.5 freq: -99.99993
samples: 3001-3500 reg: 3001-3500 slope: 1.00 dev: 40.5 max: 303.8 freq: -99.99996
samples: 3501-4000 reg: 3501-4000 slope: 1.00 dev: 37.6 max: 288.5 freq: -99.99997

$ ./test1.sh 
		freq10          freq100         dev             max
nohz on		0.00601         0.00028         74.0            279.4
nohz off	0.05867         0.00204         0.2             0.6


This patchset:
--------------
$ ./tk_test -t 500 -n 4000
samples: 1-500 reg: 1-500 slope: 1.00 dev: 104.8 max: 278.4 freq: -100.00000
samples: 501-1000 reg: 501-1000 slope: 1.00 dev: 75.9 max: 295.5 freq: -100.00000
samples: 1001-1500 reg: 1001-1500 slope: 1.00 dev: 59.5 max: 284.5 freq: -100.00001
samples: 1501-2000 reg: 1501-2000 slope: 1.00 dev: 52.5 max: 287.3 freq: -99.99998
samples: 2001-2500 reg: 2001-2500 slope: 1.00 dev: 46.2 max: 292.2 freq: -100.00003
samples: 2501-3000 reg: 2501-3000 slope: 1.00 dev: 43.4 max: 294.8 freq: -100.00005
samples: 3001-3500 reg: 3001-3500 slope: 1.00 dev: 39.2 max: 279.5 freq: -99.99999
samples: 3501-4000 reg: 3501-4000 slope: 1.00 dev: 37.6 max: 285.2 freq: -99.99999


$ ./test1.sh 
		freq10          freq100         dev             max
nohz on		0.00748         0.00076         110.8           476.4
nohz off	0.07173         0.03590         0.6             2.1


Again, many many thanks to Miroslav for pointing out this issue, providing
the simulator and initial patches, as well as helping to point out problems
with my rework of his change.

Comments, feedback and testing would be greatly appreciated!

thanks
-john

Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>

John Stultz (3):
  [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz
  [RFC] timekeeping: Use cached ntp_tick_length when accumulating error
  [RFC] timekeeping: Loop in the freqadjust logic to speed up
    convergence

 include/linux/timekeeper_internal.h |   7 ++
 kernel/time/timekeeping.c           | 201 ++++++++++++++++--------------------
 2 files changed, 98 insertions(+), 110 deletions(-)

-- 
1.8.3.2


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
@ 2014-04-24 23:04 ` John Stultz
  2014-04-24 23:04 ` [PATCH 2/3] [RFC] timekeeping: Use cached ntp_tick_length when accumulating error John Stultz
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2014-04-24 23:04 UTC (permalink / raw)
  To: LKML; +Cc: John Stultz, Miroslav Lichvar, Richard Cochran, Prarit Bhargava

The existing timekeeping_adjust logic has always been complicated
to understand. Further, since it was developed prior to NOHZ becoming
common, its not surprising it performs poorly when NOHZ is enabled.

Since Miroslav pointed out the problematic nature of the existing code
in the NOHZ case, I've tried to refactor the code to perform better.

The problem with the previous approach was that it tried to adjust
for the total cumulative error using a scaled dampening factor. This
resulted in large errors to be corrected slowly, while small errors
were corrected quickly. With NOHZ the timekeeping code doesn't know
how far out the next tick will be, so this results in bad
over-correction to small errors, and insufficient correction to large
errors.

Inspired by Miroslav's patch, I've refactored the code to try to
address the correction in two steps.

1) Check the future freq error for the next tick, and if the frequency
error is large, try to make sure we correct it so it doesn't cause
much accumulated error.

2) Then make a small single unit adjustment to correct any cumulative
error that has collected over time.

This method performs fairly well in the simulator Miroslav created.

Major credit to Miroslav for pointing out the issue, providing the
original patch to resolve this, a simulator for testing, as well as
helping debug and resolve issues in my implementation so that it
performed closer to his original implementation.

I'd be very interested in feedback, thoughts, and testing.

Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c | 194 ++++++++++++++++++++--------------------------
 1 file changed, 85 insertions(+), 109 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index f7df8ea..9aa8ccf 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1052,122 +1052,31 @@ static int __init timekeeping_init_ops(void)
 
 device_initcall(timekeeping_init_ops);
 
-/*
- * If the error is already larger, we look ahead even further
- * to compensate for late or lost adjustments.
- */
-static __always_inline int timekeeping_bigadjust(struct timekeeper *tk,
-						 s64 error, s64 *interval,
-						 s64 *offset)
-{
-	s64 tick_error, i;
-	u32 look_ahead, adj;
-	s32 error2, mult;
-
-	/*
-	 * Use the current error value to determine how much to look ahead.
-	 * The larger the error the slower we adjust for it to avoid problems
-	 * with losing too many ticks, otherwise we would overadjust and
-	 * produce an even larger error.  The smaller the adjustment the
-	 * faster we try to adjust for it, as lost ticks can do less harm
-	 * here.  This is tuned so that an error of about 1 msec is adjusted
-	 * within about 1 sec (or 2^20 nsec in 2^SHIFT_HZ ticks).
-	 */
-	error2 = tk->ntp_error >> (NTP_SCALE_SHIFT + 22 - 2 * SHIFT_HZ);
-	error2 = abs(error2);
-	for (look_ahead = 0; error2 > 0; look_ahead++)
-		error2 >>= 2;
-
-	/*
-	 * Now calculate the error in (1 << look_ahead) ticks, but first
-	 * remove the single look ahead already included in the error.
-	 */
-	tick_error = ntp_tick_length() >> (tk->ntp_error_shift + 1);
-	tick_error -= tk->xtime_interval >> 1;
-	error = ((error - tick_error) >> look_ahead) + tick_error;
-
-	/* Finally calculate the adjustment shift value.  */
-	i = *interval;
-	mult = 1;
-	if (error < 0) {
-		error = -error;
-		*interval = -*interval;
-		*offset = -*offset;
-		mult = -1;
-	}
-	for (adj = 0; error > i; adj++)
-		error >>= 1;
 
-	*interval <<= adj;
-	*offset <<= adj;
-	return mult << adj;
-}
 
-/*
- * Adjust the multiplier to reduce the error value,
- * this is optimized for the most common adjustments of -1,0,1,
- * for other values we can do a bit more work.
- */
-static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
+static __always_inline void timekeeping_apply_adjustment(struct timekeeper *tk,
+							 s64 offset,
+							 bool negative,
+							 int adj_scale)
 {
-	s64 error, interval = tk->cycle_interval;
-	int adj;
+	s64 interval = tk->cycle_interval;
+	s32 mult_adj = 1;
 
-	/*
-	 * The point of this is to check if the error is greater than half
-	 * an interval.
-	 *
-	 * First we shift it down from NTP_SHIFT to clocksource->shifted nsecs.
-	 *
-	 * Note we subtract one in the shift, so that error is really error*2.
-	 * This "saves" dividing(shifting) interval twice, but keeps the
-	 * (error > interval) comparison as still measuring if error is
-	 * larger than half an interval.
-	 *
-	 * Note: It does not "save" on aggravation when reading the code.
-	 */
-	error = tk->ntp_error >> (tk->ntp_error_shift - 1);
-	if (error > interval) {
-		/*
-		 * We now divide error by 4(via shift), which checks if
-		 * the error is greater than twice the interval.
-		 * If it is greater, we need a bigadjust, if its smaller,
-		 * we can adjust by 1.
-		 */
-		error >>= 2;
-		if (likely(error <= interval))
-			adj = 1;
-		else
-			adj = timekeeping_bigadjust(tk, error, &interval, &offset);
-	} else {
-		if (error < -interval) {
-			/* See comment above, this is just switched for the negative */
-			error >>= 2;
-			if (likely(error >= -interval)) {
-				adj = -1;
-				interval = -interval;
-				offset = -offset;
-			} else {
-				adj = timekeeping_bigadjust(tk, error, &interval, &offset);
-			}
-		} else {
-			goto out_adjust;
-		}
+	if (negative) {
+		mult_adj = -mult_adj;
+		interval = -interval;
+		offset  = -offset;
 	}
+	mult_adj <<= adj_scale;
+	interval <<= adj_scale;
+	offset <<= adj_scale;
 
-	if (unlikely(tk->clock->maxadj &&
-		(tk->mult + adj > tk->clock->mult + tk->clock->maxadj))) {
-		printk_once(KERN_WARNING
-			"Adjusting %s more than 11%% (%ld vs %ld)\n",
-			tk->clock->name, (long)tk->mult + adj,
-			(long)tk->clock->mult + tk->clock->maxadj);
-	}
 	/*
 	 * So the following can be confusing.
 	 *
-	 * To keep things simple, lets assume adj == 1 for now.
+	 * To keep things simple, lets assume mult_adj == 1 for now.
 	 *
-	 * When adj != 1, remember that the interval and offset values
+	 * When mult_adj != 1, remember that the interval and offset values
 	 * have been appropriately scaled so the math is the same.
 	 *
 	 * The basic idea here is that we're increasing the multiplier
@@ -1211,12 +1120,80 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
 	 *
 	 * XXX - TODO: Doc ntp_error calculation.
 	 */
-	tk->mult += adj;
+	tk->mult += mult_adj;
 	tk->xtime_interval += interval;
 	tk->xtime_nsec -= offset;
 	tk->ntp_error -= (interval - offset) << tk->ntp_error_shift;
+}
+
+/*
+ * Calculate the future error caused by incorrect freq value
+ * and adjust the timekeeper to correct that.
+ */
+static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
+						 s64 interval,
+						 s64 offset)
+{
+	s64 tick_error, i;
+	bool negative;
+	u32 adj;
+
+	/* Calculate current error per tick */
+	tick_error = ntp_tick_length() >> tk->ntp_error_shift;
+	tick_error -= (tk->xtime_interval + tk->xtime_remainder);
+
+	/* Don't worry about correcting it if its small */
+	if (likely(abs(tick_error) < interval/2) && (tick_error > 0))
+		return;
+
+	/* preserve the direction of correction */
+	negative = (tick_error < 0);
+
+	/* Sort out the magnitude of the correction */
+	tick_error = abs(tick_error);
+	i = abs(interval);
+	for (adj = 0; tick_error > i; adj++)
+		tick_error >>= 1;
+
+	/* scale the corrections */
+	timekeeping_apply_adjustment(tk, offset, negative, adj);
+}
+
+
+/*
+ * Adjust the multiplier to reduce the error value,
+ * this is optimized for the most common adjustments of -1,0,1,
+ * for other values we can do a bit more work.
+ */
+static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
+{
+
+	s64 interval = tk->cycle_interval;
+	static int tk_erradj;
+
+	/* Undo any existing error adjustment */
+	if (tk_erradj) {
+		timekeeping_apply_adjustment(tk, offset, 1, 0);
+		tk_erradj = 0;
+	}
+
+	/* Correct for the current frequency error */
+	timekeeping_freqadjust(tk, interval, offset);
+
+	/* Next make a small adjustment to fix any cumulative error */
+	if (tk->ntp_error > 0) {
+		tk_erradj = 1;
+		timekeeping_apply_adjustment(tk, offset, 0, 0);
+	}
+
+	if (unlikely(tk->clock->maxadj &&
+		(tk->mult > tk->clock->mult + tk->clock->maxadj))) {
+		printk_once(KERN_WARNING
+			"Adjusting %s more than 11%% (%ld vs %ld)\n",
+			tk->clock->name, (long)tk->mult,
+			(long)tk->clock->mult + tk->clock->maxadj);
+	}
 
-out_adjust:
 	/*
 	 * It may be possible that when we entered this function, xtime_nsec
 	 * was very small.  Further, if we're slightly speeding the clocksource
@@ -1236,7 +1213,6 @@ out_adjust:
 		tk->xtime_nsec = 0;
 		tk->ntp_error += neg << tk->ntp_error_shift;
 	}
-
 }
 
 /**
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] [RFC] timekeeping: Use cached ntp_tick_length when accumulating error
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
  2014-04-24 23:04 ` [PATCH 1/3] [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz John Stultz
@ 2014-04-24 23:04 ` John Stultz
  2014-04-24 23:04 ` [PATCH 3/3] [RFC] timekeeping: Loop in the freqadjust logic to speed up convergence John Stultz
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2014-04-24 23:04 UTC (permalink / raw)
  To: LKML; +Cc: John Stultz, Miroslav Lichvar, Richard Cochran, Prarit Bhargava

By caching the ntp_tick_length() when we correct the frequency error,
and then using that cached value to accumulate error, we avoid large
initial errors when the tick length is changed.

This makes convergence happen much faster in the simulator, since the
initial error doesn't have to be slowly whittled away.

This initially seems like an accounting error, but Miroslav pointed out
that ntp_tick_length() can change mid-tick, so when we apply it in the
error accumulation, we are applying any recent change to the entire tick.

This approach chooses to apply changes in the ntp_tick_length() only to
the next tick, which allows us to calculate the freq correction before
using the new tick length, which avoids accummulating error.

Credit to Miroslav for pointing this out and providing the original patch
this functionality has been pulled out from, along with the rational.

Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 include/linux/timekeeper_internal.h | 7 +++++++
 kernel/time/timekeeping.c           | 4 +++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index c1825eb..83fd40f 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -34,6 +34,13 @@ struct timekeeper {
 	/* Clock shifted nano seconds */
 	u64			xtime_nsec;
 
+	/* The ntp_tick_length() value currently being used.
+	 * This cached copy ensures we consistently apply the tick
+	 * length for an entire tick, as ntp_tick_length may change
+	 * mid-tick, and we don't want to apply that new value to
+	 * the tick in progress.
+	 */
+	u64			ntp_tick;
 	/* Difference between accumulated time and NTP time in ntp
 	 * shifted nano seconds. */
 	s64			ntp_error;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 9aa8ccf..a130d00 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -148,6 +148,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 	 * to counteract clock drifting.
 	 */
 	tk->mult = clock->mult;
+	tk->ntp_tick = ntpinterval << tk->ntp_error_shift;
 }
 
 /* Timekeeper helper functions. */
@@ -1138,6 +1139,7 @@ static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
 	bool negative;
 	u32 adj;
 
+	tk->ntp_tick = ntp_tick_length();
 	/* Calculate current error per tick */
 	tick_error = ntp_tick_length() >> tk->ntp_error_shift;
 	tick_error -= (tk->xtime_interval + tk->xtime_remainder);
@@ -1292,7 +1294,7 @@ static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 	tk->raw_time.tv_nsec = raw_nsecs;
 
 	/* Accumulate error between NTP and clock interval */
-	tk->ntp_error += ntp_tick_length() << shift;
+	tk->ntp_error += tk->ntp_tick << shift;
 	tk->ntp_error -= (tk->xtime_interval + tk->xtime_remainder) <<
 						(tk->ntp_error_shift + shift);
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] [RFC] timekeeping: Loop in the freqadjust logic to speed up convergence
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
  2014-04-24 23:04 ` [PATCH 1/3] [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz John Stultz
  2014-04-24 23:04 ` [PATCH 2/3] [RFC] timekeeping: Use cached ntp_tick_length when accumulating error John Stultz
@ 2014-04-24 23:04 ` John Stultz
  2014-04-25 14:04 ` [PATCH 0/3] timekeeping: Improved NOHZ frequency steering Miroslav Lichvar
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2014-04-24 23:04 UTC (permalink / raw)
  To: LKML; +Cc: John Stultz, Miroslav Lichvar, Richard Cochran, Prarit Bhargava

I'm not convinced this is necessary, but this patch changes
the freqadjust logic so instead of approximating the correct
frequency adjustment making one logarithmic adjustment per tick,
we loop until the correct adjustment is found on the first call.

Thus, rather then splitting the correction up over a number of ticks
we do it all in the first tick. This is an increase of work, but
should avoid error accumulation which would slow convergence.

Again, I'm not sure the error accumulation is significant enough
to warrant the extra work here, but it improves simulator results
so I'm including this change for review and discussion.

Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 kernel/time/timekeeping.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index a130d00..6e715c3 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1139,6 +1139,7 @@ static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
 	bool negative;
 	u32 adj;
 
+again:
 	tk->ntp_tick = ntp_tick_length();
 	/* Calculate current error per tick */
 	tick_error = ntp_tick_length() >> tk->ntp_error_shift;
@@ -1159,6 +1160,8 @@ static __always_inline void timekeeping_freqadjust(struct timekeeper *tk,
 
 	/* scale the corrections */
 	timekeeping_apply_adjustment(tk, offset, negative, adj);
+	if (adj)
+		goto again;
 }
 
 
-- 
1.8.3.2


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
                   ` (2 preceding siblings ...)
  2014-04-24 23:04 ` [PATCH 3/3] [RFC] timekeeping: Loop in the freqadjust logic to speed up convergence John Stultz
@ 2014-04-25 14:04 ` Miroslav Lichvar
  2014-04-25 21:05   ` John Stultz
  2014-05-17  0:04   ` John Stultz
  2014-04-25 19:00 ` Prarit Bhargava
  2014-04-29 11:19 ` Prarit Bhargava
  5 siblings, 2 replies; 11+ messages in thread
From: Miroslav Lichvar @ 2014-04-25 14:04 UTC (permalink / raw)
  To: John Stultz; +Cc: LKML, Richard Cochran, Prarit Bhargava

On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote:
> Continuing the sporadic work on improving the timekeeping
> frequency steering logic when NOHZ is enabled, I've made a number
> of changes to my re-implementation of Miroslav's patch (most
> recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
> and I'm getting much closer results in the simulator.

Thanks, in my initial testing it seems to be working well. The results
from the simulator are much better than with the previous patch.

> Compared with Miroslav's patch, this avoids doing any extra
> divisions, and instead approximates the correction
> logarithmically.

Hm, doesn't that basically make the code a software implementation of
division? It seems it needs about 4-8 iterations to get to the final
result.

I didn't measure it, but I think with this change it now may be close
or possibly even slower than my patch. The extra division and
multiplication in my patch are used only when the tick length changes
(normally once per second), otherwise the update is very cheap.

> Miroslav's patch:
> -----------------
> $ ./test1.sh 
> 		freq10          freq100         dev             max
> nohz on	0.00601         0.00028         74.0            279.4
> nohz off	0.05867         0.00204         0.2             0.6

> This patchset:
> --------------
> $ ./test1.sh 
> 		freq10          freq100         dev             max
> nohz on	0.00748         0.00076         110.8           476.4
> nohz off	0.07173         0.03590         0.6             2.1

This looks pretty good to me. It's interesting that the performance
with nohz off got worse, but when I modify the test to use more points
I can see it does converge to the correct frequency and probably it's
not a big problem.

It seems it still doesn't always switch mult only between the two
closest values, which explains the slightly worse dev and max values.

Thanks,

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
                   ` (3 preceding siblings ...)
  2014-04-25 14:04 ` [PATCH 0/3] timekeeping: Improved NOHZ frequency steering Miroslav Lichvar
@ 2014-04-25 19:00 ` Prarit Bhargava
  2014-04-29 11:19 ` Prarit Bhargava
  5 siblings, 0 replies; 11+ messages in thread
From: Prarit Bhargava @ 2014-04-25 19:00 UTC (permalink / raw)
  To: John Stultz; +Cc: LKML, Miroslav Lichvar, Richard Cochran



On 04/24/2014 07:04 PM, John Stultz wrote:
> Continuing the sporadic work on improving the timekeeping
> frequency steering logic when NOHZ is enabled, I've made a number
> of changes to my re-implementation of Miroslav's patch (most
> recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
> and I'm getting much closer results in the simulator.
> 
> Compared with Miroslav's patch, this avoids doing any extra
> divisions, and instead approximates the correction
> logarithmically. In addition to trying to lower overhead, this
> allows the other accounting bits to be managed in much the
> same way we have been doing for awhile.
> 
> This patch set contains one portion from Miroslav's patch
> (ntp_tick caching) split out into a separate logical patch.
> 
> Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc,
> you'll need to revert b399fe355b30d01 to get the simulator building)
> comparing this patchset to Miroslav's patch to show we're getting in
> the right order of magnitude.
> 

John, I'm going to put this into a weekend long test right away (along with
another unrelated cpupower patch).  I'll ping back Monday to let you know how it
went.

P.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-25 14:04 ` [PATCH 0/3] timekeeping: Improved NOHZ frequency steering Miroslav Lichvar
@ 2014-04-25 21:05   ` John Stultz
  2014-04-30 14:01     ` Miroslav Lichvar
  2014-05-17  0:04   ` John Stultz
  1 sibling, 1 reply; 11+ messages in thread
From: John Stultz @ 2014-04-25 21:05 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: LKML, Richard Cochran, Prarit Bhargava

On 04/25/2014 07:04 AM, Miroslav Lichvar wrote:
> On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote:
>> Continuing the sporadic work on improving the timekeeping
>> frequency steering logic when NOHZ is enabled, I've made a number
>> of changes to my re-implementation of Miroslav's patch (most
>> recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
>> and I'm getting much closer results in the simulator.
> Thanks, in my initial testing it seems to be working well. The results
> from the simulator are much better than with the previous patch.
>
>> Compared with Miroslav's patch, this avoids doing any extra
>> divisions, and instead approximates the correction
>> logarithmically.
> Hm, doesn't that basically make the code a software implementation of
> division? It seems it needs about 4-8 iterations to get to the final
> result.
>
> I didn't measure it, but I think with this change it now may be close
> or possibly even slower than my patch. The extra division and
> multiplication in my patch are used only when the tick length changes
> (normally once per second), otherwise the update is very cheap.

This point is quite true, though I'm really not convinced the last patch
in the series is realistically needed. It does help the code get close
to your implementation in comparisons w/ the simulator but I haven't had
the chance to sort out what the real world impact is likely to be.

That said, you're points about just setting the freq rather then trying
to converge towards it to eliminate the second layer of correction
oscillation is starting to resonate more with me. Thus I may very well
switch to the division, but if so I also want to get the manipulations
to the supporting accounting variables similarly documented as the
scaled manipulation currently provides.


>
>> Miroslav's patch:
>> -----------------
>> $ ./test1.sh 
>> 		freq10          freq100         dev             max
>> nohz on	0.00601         0.00028         74.0            279.4
>> nohz off	0.05867         0.00204         0.2             0.6
>> This patchset:
>> --------------
>> $ ./test1.sh 
>> 		freq10          freq100         dev             max
>> nohz on	0.00748         0.00076         110.8           476.4
>> nohz off	0.07173         0.03590         0.6             2.1
> This looks pretty good to me. It's interesting that the performance
> with nohz off got worse, but when I modify the test to use more points
> I can see it does converge to the correct frequency and probably it's
> not a big problem.
>
> It seems it still doesn't always switch mult only between the two
> closest values, which explains the slightly worse dev and max values.
Huh. I don't think I saw that in my testing. I'll look into it again.

I suspect the extra error comes from the occasional underflow handling
(which you avoid w/ the second_overflow_skip stuff which would help but
feels a little clunky to me - but I'm still thinking it over).

thanks
-john

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
                   ` (4 preceding siblings ...)
  2014-04-25 19:00 ` Prarit Bhargava
@ 2014-04-29 11:19 ` Prarit Bhargava
  5 siblings, 0 replies; 11+ messages in thread
From: Prarit Bhargava @ 2014-04-29 11:19 UTC (permalink / raw)
  To: John Stultz; +Cc: LKML, Miroslav Lichvar, Richard Cochran



On 04/24/2014 07:04 PM, John Stultz wrote:
> Continuing the sporadic work on improving the timekeeping
> frequency steering logic when NOHZ is enabled, I've made a number
> of changes to my re-implementation of Miroslav's patch (most
> recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
> and I'm getting much closer results in the simulator.
> 
> Compared with Miroslav's patch, this avoids doing any extra
> divisions, and instead approximates the correction
> logarithmically. In addition to trying to lower overhead, this
> allows the other accounting bits to be managed in much the
> same way we have been doing for awhile.
> 
> This patch set contains one portion from Miroslav's patch
> (ntp_tick caching) split out into a separate logical patch.
> 
> Below are some of the simulator results (BTW Miroslav, w/ 3.15-rc,
> you'll need to revert b399fe355b30d01 to get the simulator building)
> comparing this patchset to Miroslav's patch to show we're getting in
> the right order of magnitude.
> 
> 

Tested-by: Prarit Bhargava <prarit@redhat.com>

P.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-25 21:05   ` John Stultz
@ 2014-04-30 14:01     ` Miroslav Lichvar
  2014-05-16 23:37       ` John Stultz
  0 siblings, 1 reply; 11+ messages in thread
From: Miroslav Lichvar @ 2014-04-30 14:01 UTC (permalink / raw)
  To: John Stultz; +Cc: LKML, Richard Cochran, Prarit Bhargava

On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote:
> On 04/25/2014 07:04 AM, Miroslav Lichvar wrote:
> > It seems it still doesn't always switch mult only between the two
> > closest values, which explains the slightly worse dev and max values.
> Huh. I don't think I saw that in my testing. I'll look into it again.

I can see it with tk_test -o 100000, for instance. It's switching
between 8389446, 8389447 and 8389448.

> I suspect the extra error comes from the occasional underflow handling
> (which you avoid w/ the second_overflow_skip stuff which would help but
> feels a little clunky to me - but I'm still thinking it over).

It seems to be something else as I can see it even when I remove
"advance_ticks(3, 4, 1);" from tk_test.c so clock updates are aligned
exactly with ticks and no underflow can happen (i.e. offset in
timekeeping_apply_adjustment() is zero).

I agree the skip_second_overflow flag in my patch is ugly, but it's
necesssary as the code would otherwise take too long to correct the
underflowed part in ntp error.

Anyway, I did more testing and I think I found a more serious problem.
It seems the loop doesn't handle well tick lengths which happen to be
close to the middle between multipliers. For example:

$ ./tk_test -n 10000 -o 100077
samples: 1-10000 reg: 1-10000 slope: 1.00 dev: 1241.7 max: 3532.3 freq: 100.07717

When I add the following line to the kernel code to see the value of
mult and ntp_error after clock update:

+++ b/kernel/time/timekeeping.c
@@ -1386,6 +1386,7 @@ void update_wall_time(void)
        /* correct the clock when NTP error is too big */
        timekeeping_adjust(tk, offset);

+       printk("%d %lld\n", tk->mult, tk->ntp_error >> (tk->ntp_error_shift + tk->shift));

I get this:

8389447 -101
8389449 6
8389447 -321
8389448 -198
8389447 -249
...
8389447 -6344
8389448 -6158
8389447 -6223
8389448 -6211
8389447 -6265
8389448 -6029

It looks like the correction is not able to handle the random
cumulation of differences in the lengths between odd and even update
intervals. The overall frequency is accurate, but ntp error is in
microseconds here.

Can you please look into this?

Thanks,

-- 
Miroslav Lichvar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-30 14:01     ` Miroslav Lichvar
@ 2014-05-16 23:37       ` John Stultz
  0 siblings, 0 replies; 11+ messages in thread
From: John Stultz @ 2014-05-16 23:37 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: LKML, Richard Cochran, Prarit Bhargava

On 04/30/2014 07:01 AM, Miroslav Lichvar wrote:
> On Fri, Apr 25, 2014 at 02:05:49PM -0700, John Stultz wrote:
>> On 04/25/2014 07:04 AM, Miroslav Lichvar wrote:
>>> It seems it still doesn't always switch mult only between the two
>>> closest values, which explains the slightly worse dev and max values.
>> Huh. I don't think I saw that in my testing. I'll look into it again.
> I can see it with tk_test -o 100000, for instance. It's switching
> between 8389446, 8389447 and 8389448.

Ok, I think I sorted this part out. Thanks for the heads up here!


>
>> I suspect the extra error comes from the occasional underflow handling
>> (which you avoid w/ the second_overflow_skip stuff which would help but
>> feels a little clunky to me - but I'm still thinking it over).
> It seems to be something else as I can see it even when I remove
> "advance_ticks(3, 4, 1);" from tk_test.c so clock updates are aligned
> exactly with ticks and no underflow can happen (i.e. offset in
> timekeeping_apply_adjustment() is zero).
>
> I agree the skip_second_overflow flag in my patch is ugly, but it's
> necesssary as the code would otherwise take too long to correct the
> underflowed part in ntp error.
>
> Anyway, I did more testing and I think I found a more serious problem.
> It seems the loop doesn't handle well tick lengths which happen to be
> close to the middle between multipliers. For example:
>
> $ ./tk_test -n 10000 -o 100077
> samples: 1-10000 reg: 1-10000 slope: 1.00 dev: 1241.7 max: 3532.3 freq: 100.07717
>
> When I add the following line to the kernel code to see the value of
> mult and ntp_error after clock update:
>
> +++ b/kernel/time/timekeeping.c
> @@ -1386,6 +1386,7 @@ void update_wall_time(void)
>         /* correct the clock when NTP error is too big */
>         timekeeping_adjust(tk, offset);
>
> +       printk("%d %lld\n", tk->mult, tk->ntp_error >> (tk->ntp_error_shift + tk->shift));
>
> I get this:
>
> 8389447 -101
> 8389449 6
> 8389447 -321
> 8389448 -198
> 8389447 -249
> ...
> 8389447 -6344
> 8389448 -6158
> 8389447 -6223
> 8389448 -6211
> 8389447 -6265
> 8389448 -6029
>
> It looks like the correction is not able to handle the random
> cumulation of differences in the lengths between odd and even update
> intervals. The overall frequency is accurate, but ntp error is in
> microseconds here.

Yea, in the freqadjust logic, we chose to do nothing if it was inbetween
0 to (interval/2). The problem being that interval/2 is too small
target, and if we get too close a single unit adjustment may bounce us
on either side of that range. Changed the comparision to being the
0-interval (inclusive) which should assure the approximation will land
in that range.

New patchset to follow shortly!

thanks
-john

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/3] timekeeping: Improved NOHZ frequency steering
  2014-04-25 14:04 ` [PATCH 0/3] timekeeping: Improved NOHZ frequency steering Miroslav Lichvar
  2014-04-25 21:05   ` John Stultz
@ 2014-05-17  0:04   ` John Stultz
  1 sibling, 0 replies; 11+ messages in thread
From: John Stultz @ 2014-05-17  0:04 UTC (permalink / raw)
  To: Miroslav Lichvar; +Cc: LKML, Richard Cochran, Prarit Bhargava

On 04/25/2014 07:04 AM, Miroslav Lichvar wrote:
> On Thu, Apr 24, 2014 at 04:04:34PM -0700, John Stultz wrote:
>> Continuing the sporadic work on improving the timekeeping
>> frequency steering logic when NOHZ is enabled, I've made a number
>> of changes to my re-implementation of Miroslav's patch (most
>> recently posted here: https://lkml.org/lkml/2014/2/12/401 ),
>> and I'm getting much closer results in the simulator.
> Thanks, in my initial testing it seems to be working well. The results
> from the simulator are much better than with the previous patch.
>
>> Compared with Miroslav's patch, this avoids doing any extra
>> divisions, and instead approximates the correction
>> logarithmically.
> Hm, doesn't that basically make the code a software implementation of
> division? It seems it needs about 4-8 iterations to get to the final
> result.
>
> I didn't measure it, but I think with this change it now may be close
> or possibly even slower than my patch. The extra division and
> multiplication in my patch are used only when the tick length changes
> (normally once per second), otherwise the update is very cheap.

Fair enough. I've moved to your division method for the last patch in my
series (instead of looping).

I'm still not 100% convinced we need it, since I'm still worried the
error seen in the simulator with the default options is somewhat
contrived. For example, since it regularly skips 4k ticks, that
exaggerates the accumulated error in the approximation case. But we can
discuss this further w/ the new patch series.

thanks!
-john




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-05-17  0:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-24 23:04 [PATCH 0/3] timekeeping: Improved NOHZ frequency steering John Stultz
2014-04-24 23:04 ` [PATCH 1/3] [RFC] timekeeping: Rework frequency adjustments to work better w/ nohz John Stultz
2014-04-24 23:04 ` [PATCH 2/3] [RFC] timekeeping: Use cached ntp_tick_length when accumulating error John Stultz
2014-04-24 23:04 ` [PATCH 3/3] [RFC] timekeeping: Loop in the freqadjust logic to speed up convergence John Stultz
2014-04-25 14:04 ` [PATCH 0/3] timekeeping: Improved NOHZ frequency steering Miroslav Lichvar
2014-04-25 21:05   ` John Stultz
2014-04-30 14:01     ` Miroslav Lichvar
2014-05-16 23:37       ` John Stultz
2014-05-17  0:04   ` John Stultz
2014-04-25 19:00 ` Prarit Bhargava
2014-04-29 11:19 ` Prarit Bhargava

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).