* [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
@ 2016-02-16 19:19 Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
0 siblings, 2 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
Rasmus Villemoes
Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel
Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).
On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
| 35 +++++++++++++++++------------------
1 file changed, 17 insertions(+), 18 deletions(-)
--git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 0742b3296673..beef7ae123ba 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -200,7 +200,7 @@ static void get_typical_interval(struct menu_device *data)
{
int i, divisor;
unsigned int max, thresh;
- uint64_t avg, stddev;
+ uint64_t avg, variance;
thresh = UINT_MAX; /* Discard outliers above this value */
@@ -224,36 +224,35 @@ again:
else
do_div(avg, divisor);
- /* Then try to determine standard deviation */
- stddev = 0;
+ /* Then try to determine variance */
+ variance = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
int64_t diff = value - avg;
- stddev += diff * diff;
+ variance += diff * diff;
}
}
if (divisor == INTERVALS)
- stddev >>= INTERVAL_SHIFT;
+ variance >>= INTERVAL_SHIFT;
else
- do_div(stddev, divisor);
+ do_div(variance, divisor);
/*
- * The typical interval is obtained when standard deviation is small
- * or standard deviation is small compared to the average interval.
- *
- * int_sqrt() formal parameter type is unsigned long. When the
- * greatest difference to an outlier exceeds ~65 ms * sqrt(divisor)
- * the resulting squared standard deviation exceeds the input domain
- * of int_sqrt on platforms where unsigned long is 32 bits in size.
- * In such case reject the candidate average.
+ * The typical interval is obtained when standard deviation is
+ * small (stddev <= 20 us, variance <= 400 us^2) or standard
+ * deviation is small compared to the average interval (avg >
+ * 6*stddev, avg^2 > 36*variance). The average is smaller than
+ * UINT_MAX aka U32_MAX, so computing its square does not
+ * overflow a u64. We simply reject this candidate average if
+ * the standard deviation is greater than 715 s (which is
+ * rather unlikely).
*
* Use this result only if there is no timer to wake us up sooner.
*/
- if (likely(stddev <= ULONG_MAX)) {
- stddev = int_sqrt(stddev);
- if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
- || stddev <= 20) {
+ if (likely(variance <= U64_MAX/36)) {
+ if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+ || variance <= 400) {
if (data->next_timer_us > avg)
data->predicted_us = avg;
return;
--
2.1.4
^ permalink raw reply related [flat|nested] 3+ messages in thread* [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
@ 2016-02-16 19:19 ` Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
1 sibling, 0 replies; 3+ messages in thread
From: Rasmus Villemoes @ 2016-02-16 19:19 UTC (permalink / raw)
To: Andi Kleen, Andi Kleen, Rafael J. Wysocki, Ingo Molnar,
Rasmus Villemoes
Cc: akpm, lenb, Eric Dumazet, Fengguang Wu, linux-kernel
We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
| 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index beef7ae123ba..27fc733cb5b9 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -199,8 +199,8 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
static void get_typical_interval(struct menu_device *data)
{
int i, divisor;
- unsigned int max, thresh;
- uint64_t avg, variance;
+ unsigned int max, thresh, avg;
+ uint64_t sum, variance;
thresh = UINT_MAX; /* Discard outliers above this value */
@@ -208,28 +208,28 @@ again:
/* First calculate the average of past intervals */
max = 0;
- avg = 0;
+ sum = 0;
divisor = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
- avg += value;
+ sum += value;
divisor++;
if (value > max)
max = value;
}
}
if (divisor == INTERVALS)
- avg >>= INTERVAL_SHIFT;
+ avg = sum >> INTERVAL_SHIFT;
else
- do_div(avg, divisor);
+ avg = div_u64(sum, divisor);
/* Then try to determine variance */
variance = 0;
for (i = 0; i < INTERVALS; i++) {
unsigned int value = data->intervals[i];
if (value <= thresh) {
- int64_t diff = value - avg;
+ int64_t diff = (int64_t)value - avg;
variance += diff * diff;
}
}
@@ -251,7 +251,7 @@ again:
* Use this result only if there is no timer to wake us up sooner.
*/
if (likely(variance <= U64_MAX/36)) {
- if (((avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
+ if ((((u64)avg*avg > variance*36) && (divisor * 4 >= INTERVALS * 3))
|| variance <= 400) {
if (data->next_timer_us > avg)
data->predicted_us = avg;
--
2.1.4
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH 1/2] cpuidle: menu: avoid expensive square root computation
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
@ 2016-02-16 23:20 ` Rafael J. Wysocki
1 sibling, 0 replies; 3+ messages in thread
From: Rafael J. Wysocki @ 2016-02-16 23:20 UTC (permalink / raw)
To: Rasmus Villemoes
Cc: Andi Kleen, Andi Kleen, Ingo Molnar, akpm, lenb, Eric Dumazet,
Fengguang Wu, linux-kernel
On 2/16/2016 8:19 PM, Rasmus Villemoes wrote:
> Computing the integer square root is a rather expensive operation, at
> least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
> bit platforms, doing an extra comparison to a constant (variance <=
> U64_MAX/36).
>
> On 64 bit platforms, this does mean that we add a restriction on the
> range of the variance where we end up using the estimate (since
> previously the stddev <= ULONG_MAX was a tautology), but on the other
> hand, we extend the range quite substantially on 32 bit platforms - in
> both cases, we now allow standard deviations up to 715 seconds, which
> is for example guaranteed if all observations are less than 1430
> seconds.
>
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Both patches look good to me, so I'm going to queue them up for 4.6.
If anyone has any issues with that, please let me know.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-02-16 23:20 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-16 19:19 [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rasmus Villemoes
2016-02-16 19:19 ` [PATCH 2/2] cpuidle: menu: help gcc generate slightly better code Rasmus Villemoes
2016-02-16 23:20 ` [PATCH 1/2] cpuidle: menu: avoid expensive square root computation Rafael J. Wysocki
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.