* [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
@ 2016-02-16 19:19 Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
0 siblings, 2 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
Rasmus Villemoes
Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel
Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).
On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
| 35 +++++++++++++++++------------------
1 file changed, 17 insertions(+), 18 deletions(-)
--git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 0742b3296673..beef7ae123ba 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -200,7 +200,7 @@ static void get_typical_interval(struct menu_device *data)
{
int i, divisor;
unsigned int max, thresh;
- uint64_t avg, stddev;
+ uint64_t avg, variance;
thresh = UINT_MAX; /* Discard outliers above this value */
@@ -224,36 +224,35 @@ again:
else
do_div(avg, divisor);
- /* Then try to determine standard deviation */
- stddev = 0;
+ /* Then try to determine variance */
+ variance = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
int64_t diff = value - avg;
- stddev += diff * diff;
+ variance += diff * diff;
}
}
if (divisor == INTERVALS)
- stddev >>= INTERVAL_SHIFT;
+ variance >>= INTERVAL_SHIFT;
else
- do_div(stddev, divisor);
+ do_div(variance, divisor);
/*
- * The typical interval is obtained when standard deviation is small
- * or standard deviation is small compared to the average interval.
- *
- * int_sqrt() formal parameter type is unsigned long. When the
- * greatest difference to an outlier exceeds ~65 ms * sqrt(divisor)
- * the resulting squared standard deviation exceeds the input domain
- * of int_sqrt on platforms where unsigned long is 32 bits in size.
- * In such case reject the candidate average.
+ * The typical interval is obtained when standard deviation is
+ * small (stddev <= 20 us, variance <= 400 us^2) or standard
+ * deviation is small compared to the average interval (avg >
+ * 6*stddev, avg^2 > 36*variance). The average is smaller than
+ * UINT_MAX aka U32_MAX, so computing its square does not
+ * overflow a u64. We simply reject this candidate average if
+ * the standard deviation is greater than 715 s (which is
+ * rather unlikely).
*
* Use this result only if there is no timer to wake us up sooner.
*/
- if (likely(stddev <= ULONG_MAX)) {
- stddev = int_sqrt(stddev);
- if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
- || stddev <= 20) {
+ if (likely(variance <= U64_MAX/36)) {
+ if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+ || variance <= 400) {
if (data->next_timer_us > avg)
data->predicted_us = avg;
return;
--
2.1.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
@ 2016-02-16 19:19 ` Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
1 sibling, 0 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
Rasmus Villemoes
Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel
We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
| 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index beef7ae123ba..27fc733cb5b9 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -199,8 +199,8 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
static void get_typical_interval(struct menu_device *data)
{
int i, divisor;
- unsigned int max, thresh;
- uint64_t avg, variance;
+ unsigned int max, thresh, avg;
+ uint64_t sum, variance;
thresh = UINT_MAX; /* Discard outliers above this value */
@@ -208,28 +208,28 @@ again:
/* First calculate the average of past intervals */
max = 0;
- avg = 0;
+ sum = 0;
divisor = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
- avg += value;
+ sum += value;
divisor++;
if (value > max)
max = value;
}
}
if (divisor == INTERVALS)
- avg >>= INTERVAL_SHIFT;
+ avg = sum >> INTERVAL_SHIFT;
else
- do_div(avg, divisor);
+ avg = div_u64(sum, divisor);
/* Then try to determine variance */
variance = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
- int64_t diff = value - avg;
+ int64_t diff = (int64_t)value - avg;
variance += diff * diff;
}
}
@@ -251,7 +251,7 @@ again:
* Use this result only if there is no timer to wake us up sooner.
*/
if (likely(variance <= U64_MAX/36)) {
- if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+ if ((((u64)avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
|| variance <= 400) {
if (data->next_timer_us > avg)
data->predicted_us = avg;
--
2.1.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
@ 2016-02-16 23:20 ` Rafael J. Wysocki
1 sibling, 0 replies; 3+ messages in thread
From: Rafael J. Wysocki @ 2016-02-16 23:20 UTC (permalink / raw)
To: Rasmus Villemoes
Cc: Andi Kleen, Andi Kleen, Ingo Molnar, akpm, lenb, Eric Dumazet,
Fengguang Wu, linux-kernel
On 2/16/2016 8:19 PM, Rasmus Villemoes wrote:
> Computing the integer square root is a rather expensive operation, at
> least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
> bit platforms, doing an extra comparison to a constant (variance <=
> U64_MAX/36).
>
> On 64 bit platforms, this does mean that we add a restriction on the
> range of the variance where we end up using the estimate (since
> previously the stddev <= ULONG_MAX was a tautology), but on the other
> hand, we extend the range quite substantially on 32 bit platforms - in
> both cases, we now allow standard deviations up to 715 seconds, which
> is for example guaranteed if all observations are less than 1430
> seconds.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Both patches look good to me, so I'm going to queue them up for 4.6.
If anyone has any issues with that, please let me know.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-02-16 23:20 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox