All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
@ 2016-02-16 19:19 Rasmus Villemoes
  2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
  2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
  0 siblings, 2 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
  To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
	Rasmus Villemoes
  Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel

Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).

On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/cpuidle/governors/menu.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 0742b3296673..beef7ae123ba 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -200,7 +200,7 @@ static void get_typical_interval(struct menu_device *data)
 {
 	int i, divisor;
 	unsigned int max, thresh;
-	uint64_t avg, stddev;
+	uint64_t avg, variance;
 
 	thresh = UINT_MAX; /* Discard outliers above this value */
 
@@ -224,36 +224,35 @@ again:
 	else
 		do_div(avg, divisor);
 
-	/* Then try to determine standard deviation */
-	stddev = 0;
+	/* Then try to determine variance */
+	variance = 0;
 	for (i = 0; i < INTERVALS; i++) {
 		unsigned int value = data->intervals[i];
 		if (value <= thresh) {
 			int64_t diff = value - avg;
-			stddev += diff * diff;
+			variance += diff * diff;
 		}
 	}
 	if (divisor == INTERVALS)
-		stddev >>= INTERVAL_SHIFT;
+		variance >>= INTERVAL_SHIFT;
 	else
-		do_div(stddev, divisor);
+		do_div(variance, divisor);
 
 	/*
-	 * The typical interval is obtained when standard deviation is small
-	 * or standard deviation is small compared to the average interval.
-	 *
-	 * int_sqrt() formal parameter type is unsigned long. When the
-	 * greatest difference to an outlier exceeds ~65 ms * sqrt(divisor)
-	 * the resulting squared standard deviation exceeds the input domain
-	 * of int_sqrt on platforms where unsigned long is 32 bits in size.
-	 * In such case reject the candidate average.
+	 * The typical interval is obtained when standard deviation is
+	 * small (stddev <= 20 us, variance <= 400 us^2) or standard
+	 * deviation is small compared to the average interval (avg >
+	 * 6*stddev, avg^2 > 36*variance). The average is smaller than
+	 * UINT_MAX aka U32_MAX, so computing its square does not
+	 * overflow a u64. We simply reject this candidate average if
+	 * the standard deviation is greater than 715 s (which is
+	 * rather unlikely).
 	 *
 	 * Use this result only if there is no timer to wake us up sooner.
 	 */
-	if (likely(stddev <= ULONG_MAX)) {
-		stddev = int_sqrt(stddev);
-		if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
-							|| stddev <= 20) {
+	if (likely(variance <= U64_MAX/36)) {
+		if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+							|| variance <= 400) {
 			if (data->next_timer_us > avg)
 				data->predicted_us = avg;
 			return;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-02-16 23:20 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.