public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
@ 2016-02-16 19:19 Rasmus Villemoes
  2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
  2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
  0 siblings, 2 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
  To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
	Rasmus Villemoes
  Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel

Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).

On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/cpuidle/governors/menu.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 0742b3296673..beef7ae123ba 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -200,7 +200,7 @@ static void get_typical_interval(struct menu_device *data)
 {
 	int i, divisor;
 	unsigned int max, thresh;
-	uint64_t avg, stddev;
+	uint64_t avg, variance;
 
 	thresh = UINT_MAX; /* Discard outliers above this value */
 
@@ -224,36 +224,35 @@ again:
 	else
 		do_div(avg, divisor);
 
-	/* Then try to determine standard deviation */
-	stddev = 0;
+	/* Then try to determine variance */
+	variance = 0;
 	for (i = 0; i < INTERVALS; i++) {
 		unsigned int value = data->intervals[i];
 		if (value <= thresh) {
 			int64_t diff = value - avg;
-			stddev += diff * diff;
+			variance += diff * diff;
 		}
 	}
 	if (divisor == INTERVALS)
-		stddev >>= INTERVAL_SHIFT;
+		variance >>= INTERVAL_SHIFT;
 	else
-		do_div(stddev, divisor);
+		do_div(variance, divisor);
 
 	/*
-	 * The typical interval is obtained when standard deviation is small
-	 * or standard deviation is small compared to the average interval.
-	 *
-	 * int_sqrt() formal parameter type is unsigned long. When the
-	 * greatest difference to an outlier exceeds ~65 ms * sqrt(divisor)
-	 * the resulting squared standard deviation exceeds the input domain
-	 * of int_sqrt on platforms where unsigned long is 32 bits in size.
-	 * In such case reject the candidate average.
+	 * The typical interval is obtained when standard deviation is
+	 * small (stddev <= 20 us, variance <= 400 us^2) or standard
+	 * deviation is small compared to the average interval (avg >
+	 * 6*stddev, avg^2 > 36*variance). The average is smaller than
+	 * UINT_MAX aka U32_MAX, so computing its square does not
+	 * overflow a u64. We simply reject this candidate average if
+	 * the standard deviation is greater than 715 s (which is
+	 * rather unlikely).
 	 *
 	 * Use this result only if there is no timer to wake us up sooner.
 	 */
-	if (likely(stddev <= ULONG_MAX)) {
-		stddev = int_sqrt(stddev);
-		if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
-							|| stddev <= 20) {
+	if (likely(variance <= U64_MAX/36)) {
+		if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+							|| variance <= 400) {
 			if (data->next_timer_us > avg)
 				data->predicted_us = avg;
 			return;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code
  2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
@ 2016-02-16 19:19 ` Rasmus Villemoes
  2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
  1 sibling, 0 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
  To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
	Rasmus Villemoes
  Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel

We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 drivers/cpuidle/governors/menu.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index beef7ae123ba..27fc733cb5b9 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -199,8 +199,8 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
 static void get_typical_interval(struct menu_device *data)
 {
 	int i, divisor;
-	unsigned int max, thresh;
-	uint64_t avg, variance;
+	unsigned int max, thresh, avg;
+	uint64_t sum, variance;
 
 	thresh = UINT_MAX; /* Discard outliers above this value */
 
@@ -208,28 +208,28 @@ again:
 
 	/* First calculate the average of past intervals */
 	max = 0;
-	avg = 0;
+	sum = 0;
 	divisor = 0;
 	for (i = 0; i < INTERVALS; i++) {
 		unsigned int value = data->intervals[i];
 		if (value <= thresh) {
-			avg += value;
+			sum += value;
 			divisor++;
 			if (value > max)
 				max = value;
 		}
 	}
 	if (divisor == INTERVALS)
-		avg >>= INTERVAL_SHIFT;
+		avg = sum >> INTERVAL_SHIFT;
 	else
-		do_div(avg, divisor);
+		avg = div_u64(sum, divisor);
 
 	/* Then try to determine variance */
 	variance = 0;
 	for (i = 0; i < INTERVALS; i++) {
 		unsigned int value = data->intervals[i];
 		if (value <= thresh) {
-			int64_t diff = value - avg;
+			int64_t diff = (int64_t)value - avg;
 			variance += diff * diff;
 		}
 	}
@@ -251,7 +251,7 @@ again:
 	 * Use this result only if there is no timer to wake us up sooner.
 	 */
 	if (likely(variance <= U64_MAX/36)) {
-		if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+		if ((((u64)avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
 							|| variance <= 400) {
 			if (data->next_timer_us > avg)
 				data->predicted_us = avg;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
  2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
  2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
@ 2016-02-16 23:20 ` Rafael J. Wysocki
  1 sibling, 0 replies; 3+ messages in thread
From: Rafael J. Wysocki @ 2016-02-16 23:20 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Andi Kleen, Andi Kleen, Ingo Molnar, akpm, lenb, Eric Dumazet,
	Fengguang Wu, linux-kernel

On 2/16/2016 8:19 PM, Rasmus Villemoes wrote:
> Computing the integer square root is a rather expensive operation, at
> least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
> bit platforms, doing an extra comparison to a constant (variance <=
> U64_MAX/36).
>
> On 64 bit platforms, this does mean that we add a restriction on the
> range of the variance where we end up using the estimate (since
> previously the stddev <= ULONG_MAX was a tautology), but on the other
> hand, we extend the range quite substantially on 32 bit platforms - in
> both cases, we now allow standard deviations up to 715 seconds, which
> is for example guaranteed if all observations are less than 1430
> seconds.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

Both patches look good to me,  so I'm going to queue them up for 4.6.

If anyone has any issues with that, please let me know.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-02-16 23:20 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox