* [PATCH v2] sched/fair: Make PELT signal more accurate
@ 2017-08-18 23:50 Joel Fernandes
2017-08-19 4:57 ` Mike Galbraith
0 siblings, 1 reply; 5+ messages in thread
From: Joel Fernandes @ 2017-08-18 23:50 UTC (permalink / raw)
To: linux-kernel
Cc: Joel Fernandes, Vincent Guittot, Peter Zijlstra, Juri Lelli,
Brendan Jackman, Dietmar Eggeman, kernel-team, Ingo Molnar
The PELT signal (sa->load_avg and sa->util_avg) are not updated if the amount
accumulated during a single update doesn't cross a period boundary. This is
fine in cases where the amount accrued is much smaller than the size of a
single PELT window (1ms) however if the amount accrued is high then the
relative error (calculated against what the actual signal would be had we
updated the averages) can be high - as much 2% in my testing. On plotting
signals, I found that there are errors especially high when we update just
before the period boundary is hit. These errors can be significantly reduced if
we update the averages more often.
Inorder to fix this, this patch does the average update by also checking how
much time has elapsed since the last update and update the averages if it has
been long enough (as a threshold I chose 512us).
In order to compare the signals with/without the patch I created a synthetic
test (20ms runtime, 100ms period) and analyzed the signals and created a report
on the analysis data/plots and histograms both with and without the fix:
http://www.linuxinternals.org/misc/pelt-error-v4.pdf
With the patch, the error in the signal is significantly reduced, during
testing I found that 512us was a good choice. With this we ensure there are
atmost 2 updates per-window to the averages. Previous revisions chose 128 and
caused around 4% regression with unixbench. This version doesn't have such
regression. The tradeoff is slight overhead for the reduction in error.
Here's the unixbench shell8 test result for 2 runs:
Without: 9157.9, 9134.6
With: 9103.2, 9082
I also ran the hackbench-perf LKP test, total time:
With: 183.03 , Without: 183.12
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Brendan Jackman <brendan.jackman@arm.com>
Cc: Dietmar Eggeman <dietmar.eggemann@arm.com>
Cc: kernel-team@android.com
Signed-off-by: Joel Fernandes <joelaf@google.com>
---
kernel/sched/fair.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 40a89f415db0..4f61e8f1d63a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3163,6 +3163,7 @@ ___update_load_sum(u64 now, int cpu, struct sched_avg *sa,
unsigned long load, unsigned long runnable, int running)
{
u64 delta;
+ int periods;
delta = now - sa->last_update_time;
/*
@@ -3201,9 +3202,12 @@ ___update_load_sum(u64 now, int cpu, struct sched_avg *sa,
* accrues by two steps:
*
* Step 1: accumulate *_sum since last_update_time. If we haven't
- * crossed period boundaries, finish.
+ * crossed period boundaries and the time since last update is small
+ * enough, we're done.
*/
- if (!accumulate_sum(delta, cpu, sa, load, runnable, running))
+ periods = accumulate_sum(delta, cpu, sa, load, runnable, running);
+
+ if (!periods && delta < 512)
return 0;
return 1;
--
2.14.1.480.gb18f417b89-goog
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] sched/fair: Make PELT signal more accurate
2017-08-18 23:50 [PATCH v2] sched/fair: Make PELT signal more accurate Joel Fernandes
@ 2017-08-19 4:57 ` Mike Galbraith
2017-08-19 17:58 ` Joel Fernandes
0 siblings, 1 reply; 5+ messages in thread
From: Mike Galbraith @ 2017-08-19 4:57 UTC (permalink / raw)
To: Joel Fernandes, linux-kernel
Cc: Vincent Guittot, Peter Zijlstra, Juri Lelli, Brendan Jackman,
Dietmar Eggeman, kernel-team, Ingo Molnar
On Fri, 2017-08-18 at 16:50 -0700, Joel Fernandes wrote:
> The PELT signal (sa->load_avg and sa->util_avg) are not updated if the amount
> accumulated during a single update doesn't cross a period boundary. This is
> fine in cases where the amount accrued is much smaller than the size of a
> single PELT window (1ms) however if the amount accrued is high then the
> relative error (calculated against what the actual signal would be had we
> updated the averages) can be high - as much 2% in my testing. On plotting
> signals, I found that there are errors especially high when we update just
> before the period boundary is hit. These errors can be significantly reduced if
> we update the averages more often.
>
> Inorder to fix this, this patch does the average update by also checking how
> much time has elapsed since the last update and update the averages if it has
> been long enough (as a threshold I chose 512us).
Ok, I gotta ask: In order to fix what? What exactly does the small
but existent overhead increase buy us other than an ever so slightly
different chart? What is your motivation to care about a microscopic
change in signal shape?
-Mike
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] sched/fair: Make PELT signal more accurate
2017-08-19 4:57 ` Mike Galbraith
@ 2017-08-19 17:58 ` Joel Fernandes
2017-08-20 6:00 ` Mike Galbraith
0 siblings, 1 reply; 5+ messages in thread
From: Joel Fernandes @ 2017-08-19 17:58 UTC (permalink / raw)
To: Mike Galbraith
Cc: LKML, Vincent Guittot, Peter Zijlstra, Juri Lelli,
Brendan Jackman, Dietmar Eggeman, kernel-team, Ingo Molnar
Hi Mike,
On Fri, Aug 18, 2017 at 9:57 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Fri, 2017-08-18 at 16:50 -0700, Joel Fernandes wrote:
>> The PELT signal (sa->load_avg and sa->util_avg) are not updated if the amount
>> accumulated during a single update doesn't cross a period boundary. This is
>> fine in cases where the amount accrued is much smaller than the size of a
>> single PELT window (1ms) however if the amount accrued is high then the
>> relative error (calculated against what the actual signal would be had we
>> updated the averages) can be high - as much 2% in my testing. On plotting
>> signals, I found that there are errors especially high when we update just
>> before the period boundary is hit. These errors can be significantly reduced if
>> we update the averages more often.
>>
>> Inorder to fix this, this patch does the average update by also checking how
>> much time has elapsed since the last update and update the averages if it has
>> been long enough (as a threshold I chose 512us).
>
> Ok, I gotta ask: In order to fix what? What exactly does the small
> but existent overhead increase buy us other than an ever so slightly
> different chart? What is your motivation to care about a microscopic
> change in signal shape?
I wouldn't call the change "microscopic", its about 2% absolute which
comes down to 1% with this change (if you count in relative terms, its
higher and you can see the bump as the signal rises).
If you look at the first chart at [1] at 3.74, that's not microscopic
at all to me.
Also about motivation, as I described in previous threads - I didn't
nail this down to a particular change in behavior but other patches
have been posted before that do things to improve signal accuracy,
this is just one step in that direction.
thanks,
-Joel
[1] https://github.com/joelagnel/joelagnel.github.io/blob/master/misc/pelt-error-v4.pdf
>
> -Mike
>
> --
> You received this message because you are subscribed to the Google Groups "kernel-team" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] sched/fair: Make PELT signal more accurate
2017-08-19 17:58 ` Joel Fernandes
@ 2017-08-20 6:00 ` Mike Galbraith
2017-08-20 6:46 ` Joel Fernandes
0 siblings, 1 reply; 5+ messages in thread
From: Mike Galbraith @ 2017-08-20 6:00 UTC (permalink / raw)
To: Joel Fernandes
Cc: LKML, Vincent Guittot, Peter Zijlstra, Juri Lelli,
Brendan Jackman, Dietmar Eggeman, kernel-team, Ingo Molnar
On Sat, 2017-08-19 at 10:58 -0700, Joel Fernandes wrote:
>
> > Ok, I gotta ask: In order to fix what? What exactly does the small
> > but existent overhead increase buy us other than an ever so slightly
> > different chart? What is your motivation to care about a microscopic
> > change in signal shape?
>
> I wouldn't call the change "microscopic", its about 2% absolute which
> comes down to 1% with this change (if you count in relative terms, its
> higher and you can see the bump as the signal rises).
> If you look at the first chart at [1] at 3.74, that's not microscopic
> at all to me.
Whether that 0.02->0.01 is viewed as significant by you or I matters
not at all, for it to be significant, it must have measurable effect.
Where is the consumer which benefits from precision improvement.. of
the instantaneously wildly inaccurate average. Where is the non-zero
return on investment? If the chart is the entire product, no sale.
-Mike
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] sched/fair: Make PELT signal more accurate
2017-08-20 6:00 ` Mike Galbraith
@ 2017-08-20 6:46 ` Joel Fernandes
0 siblings, 0 replies; 5+ messages in thread
From: Joel Fernandes @ 2017-08-20 6:46 UTC (permalink / raw)
To: Mike Galbraith
Cc: LKML, Vincent Guittot, Peter Zijlstra, Juri Lelli,
Brendan Jackman, Dietmar Eggeman, kernel-team, Ingo Molnar
Hi Mike,
On Sat, Aug 19, 2017 at 11:00 PM, Mike Galbraith <efault@gmx.de> wrote:
> On Sat, 2017-08-19 at 10:58 -0700, Joel Fernandes wrote:
>>
>> > Ok, I gotta ask: In order to fix what? What exactly does the small
>> > but existent overhead increase buy us other than an ever so slightly
>> > different chart? What is your motivation to care about a microscopic
>> > change in signal shape?
>>
>> I wouldn't call the change "microscopic", its about 2% absolute which
>> comes down to 1% with this change (if you count in relative terms, its
>> higher and you can see the bump as the signal rises).
>> If you look at the first chart at [1] at 3.74, that's not microscopic
>> at all to me.
>
> Whether that 0.02->0.01 is viewed as significant by you or I matters
> not at all, for it to be significant, it must have measurable effect.
>
> Where is the consumer which benefits from precision improvement.. of
> the instantaneously wildly inaccurate average. Where is the non-zero
> return on investment? If the chart is the entire product, no sale.
One thing I want to mention is the overhead shows up only in the one
unixbench test that this is sensitive to (and even in that the
overhead is around 0.005, its within the noise for large number of
runs), for the other tests I ran like hackbench, there wasn't any
overhead at all. I am not saying that patch solves a major issue that
exists that I know off, but its just an observation while studying the
code but I'd argue its still an improvement that's harmless and
worthwhile to have.
Anyway the study is here for any one to see in the the future and it
was indeed useful to me, to learn more about the code. I am Ok with
dropping this patch if the general feeling is the inaccurate average
is Ok.
thanks for your review,
-Joel
>
> -Mike
>
> --
> You received this message because you are subscribed to the Google Groups "kernel-team" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-08-20 6:47 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-18 23:50 [PATCH v2] sched/fair: Make PELT signal more accurate Joel Fernandes
2017-08-19 4:57 ` Mike Galbraith
2017-08-19 17:58 ` Joel Fernandes
2017-08-20 6:00 ` Mike Galbraith
2017-08-20 6:46 ` Joel Fernandes
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox