From: Vincent Guittot <vincent.guittot@linaro.org>
To: Joel Fernandes <joelaf@google.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
kernel-team@android.com, Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@arm.com>,
Brendan Jackman <brendan.jackman@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH] sched/fair: Make PELT signal more accurate
Date: Thu, 10 Aug 2017 09:17:34 +0200 [thread overview]
Message-ID: <CAKfTPtDjGmFM0H_twXmnZ6ejLekSW5y_auXfcXDFxL+tnLK1Hg@mail.gmail.com> (raw)
In-Reply-To: <CAJWu+orz-OAH1WKgWM9SP93pOu1sDNGV8WyWnysGFnc8Q58y0w@mail.gmail.com>
On 9 August 2017 at 19:51, Joel Fernandes <joelaf@google.com> wrote:
> Hi Vincent,
>
> On Wed, Aug 9, 2017 at 3:23 AM, Vincent Guittot
> <vincent.guittot@linaro.org> wrote:
> <snip>
>>>
>>> Yes this is true, however since I'm using 'delta' instead of
>>> period_contrib, it only does the update every 128us, and if several
>>> updates fall within a 128us boundary then those will be rate limited.
>>> So for a flood of updates, the updates would have to be spaced exactly
>>> 128us apart to reach the maximum number of divisions; I don't know
>>> whether that situation is likely or would happen very often. As Peter
>>> mentioned regarding the performance aspect, I am planning to run some
>>> benchmarks and check that there is no regression.
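>>>
>>> For context, here is a minimal sketch of where this check sits in
>>> ___update_load_avg() (paraphrased from the 4.13-era code, so details
>>> may differ); note that delta is shifted from nanoseconds to
>>> ~microseconds before the test, which is why 'delta < 128' means
>>> "less than ~128us since the last update":
>>>
>>>     delta = now - sa->last_update_time;
>>>     delta >>= 10;                   /* ns -> ~us (1024ns units) */
>>>     if (!delta)
>>>             return 0;
>>>     sa->last_update_time += delta << 10;
>>>
>>>     periods = accumulate_sum(delta, cpu, sa, weight, running, cfs_rq);
>>>     if (!periods && delta < 128)    /* rate-limit the *_avg divisions */
>>>             return 0;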
>>>
>>>>> In order to compare the signals with/without the patch, I created a
>>>>> synthetic test (20ms runtime, 100ms period), analyzed the signals, and
>>>>> wrote up a report on the analysis data/plots both with and without the fix:
>>>>> http://www.linuxinternals.org/misc/pelt-error.pdf
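>>>>>
>>>>> As a rough sanity check for the shape of these plots, here is a toy
>>>>> (non-kernel) simulation of the PELT geometric series for this
>>>>> workload. It uses 1ms steps and floating point instead of PELT's
>>>>> 1024us integer periods, so the numbers are only approximate, but it
>>>>> shows the signal settling into peaks around ~400/1024 and troughs
>>>>> around ~70/1024:
>>>>>
>>>>> #include <stdio.h>
>>>>> #include <math.h>
>>>>>
>>>>> int main(void)
>>>>> {
>>>>>         const double y = pow(0.5, 1.0 / 32.0); /* decay/ms: y^32 = 0.5 */
>>>>>         double util = 0.0;                     /* util_avg, 0..1024 */
>>>>>         int ms;
>>>>>
>>>>>         for (ms = 0; ms < 1000; ms++) {
>>>>>                 /* 20ms running at the start of each 100ms period */
>>>>>                 int running = (ms % 100) < 20;
>>>>>
>>>>>                 /* old signal decays; the newest 1ms window is added */
>>>>>                 util = util * y + (running ? 1024.0 * (1.0 - y) : 0.0);
>>>>>
>>>>>                 if (ms % 100 == 19 || ms % 100 == 99)
>>>>>                         printf("t=%3dms util_avg=%5.1f (%s)\n", ms, util,
>>>>>                                running ? "peak" : "trough");
>>>>>         }
>>>>>         return 0;
>>>>> }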
>>>>
>>>> The glitch described on page 2 shows a decrease of util_avg which
>>>> is not linked to the accuracy of the calculation but is due to the use
>>>> of the wrong range when computing util_avg.
>>>
>>> Yes, and I corrected the graphs this time to show what it looks like
>>> after your patch, and they confirm that there is STILL a glitch. You
>>> are right that there isn't a reduction after your patch; however, in
>>> my updated graphs the glitch is not a downward peak but a stall in
>>> the update. The error is still quite high and can be as high as the
>>> absolute 2% maximum error; in my updated graphs I show an example
>>> where it is ~1.8% (18 / 1024).
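>>>
>>> (Back-of-envelope for where that ~2% bound comes from, assuming the
>>> v4.12-era PELT constants: a single not-yet-folded 1024us period of
>>> running time can add at most 1024 * 1024 / LOAD_AVG_MAX
>>> = 1048576 / 47742 ~= 22 to util_avg, i.e. about 2.1% of the 1024
>>> scale.)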
>>>
>>> Could you please take a look at my updated document? I have included
>>> new graphs and traces there and color-coded them so it's easy to
>>> correlate the trace lines with the error in the graph. Here's the
>>> updated link:
>>> https://github.com/joelagnel/joelagnel.github.io/blob/master/misc/pelt-error-rev2.pdf
>>
>> I see strange behavior in your rev2 document:
>> At timestamp 9.235635, we have util_avg=199 acc_util_avg=199
>> util_err=0. Everything looks fine, but I don't see this point on the
>> graph.
>> Then, at 9.235636 (which is the red-colored faulty trace), we have
>> util_avg=182 acc_util_avg=200 util_err=18.
>> Firstly, this means that util_avg has been updated (199 -> 182), so the
>> error is not a problem of util_avg not being updated often enough :-)
>> Secondly, util_avg decreases (199 -> 182) whereas it should increase
>> because the task is still running. This should not happen, and it is
>> exactly what commit 625ed2bf049d should fix. So either the patch is not
>> applied or it doesn't completely fix the problem.
>
> I think you are looking at the wrong trace lines. The graph is
> generated for rq util only (cfs_rq == 1), so the trace lines you
> should look at are the ones with cfs_rq == 1; only those lines were
> used to generate the graphs.
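> (To reproduce the filtering, something like "grep 'cfs_rq=1'" over the
> trace output should work, assuming the field is printed in that form.)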
Ah! This is quite confusing: it is not obvious that the trace covers
not one signal but two interleaved signals, that only one of them is
displayed, and that we have to filter them.
OK, so I can see that the trace with cfs_rq=1 is not updated.
Conversely, we can see that the other trace (for the se, I assume) is
updated normally, even though the two are normally synced on the same
clock.
>
> In this you will see rq util_avg change as follows: 165 -> 182 -> 182
> (missed an update, causing the error). This is also reflected in the
> graph, where you see the flat green line.
>
>>
>> It would be interesting to also display the last_update_time of sched_avg.
>>
>>>
>>>> commit 625ed2bf049d "sched/cfs: Make util/load_avg more stable" fixes
>>>> this glitch.
>>>> And the lower peak value on page 3 is probably linked to the inaccuracy
>>>
>>> This is not true. The reduction in peak in my tests, which happens
>>> even after your patch, is because of the dequeue that occurs just
>>> before the period boundary is hit. Could you please take a look at the
>>> updated document in the link above? In the second example there, I
>>> show a trace that corresponds to the reduction in peak during the
>>> dequeue; it is caused by the delay in the update (the running time
>>> accumulated in util_sum since the last period boundary is not yet
>>> folded into util_avg at dequeue time). These errors go away with my
>>> patch.
>>
>> There is the same strange behavior there:
>> When the reduction in peak happens, util_avg is updated, whereas
>> your concern is that util_avg is not updated often enough.
>> At timestamp 10.656683, we have util_avg=389 acc_util_avg=389 util_err=0.
>> At timestamp 10.657420, we have util_avg=396 acc_util_avg=396
>> util_err=0. I don't see this point on the graph.
>> At timestamp 10.657422, we have util_avg=389 acc_util_avg=399
>> util_err=10. This is the colored faulty trace, but util_avg has been
>> updated from 396 to 389.
>
> Yeah, same thing here: you should look at the lines with cfs_rq == 1.
> The util changes as: 363 -> 376 -> 389 -> 389 (missed update).
>
>
> thanks,
>
> -Joel
>
>
>>
>> Regards,
>> Vincent
>>
>>>
>>>> I agree that there is an inaccuracy (a max absolute value of 22), but
>>>> that is the trade-off for less overhead. Have you seen wrong behavior
>>>> because of this inaccuracy?
>>>
>>> I haven't tried to tie this to a specific wrong behavior; however,
>>> since other patches have been posted to fix inaccuracy, and I do see
>>> us reach the theoretical maximum error on quite a few occasions, I
>>> think it's justifiable. Also, the overhead is minimal as long as
>>> updates aren't happening several times within a window at 128us
>>> spacing; on the few occasions when an update does happen, the division
>>> is performed only at those times. So in cases where it does fix the
>>> error, it does so with minimal overhead. I do agree with the point
>>> about overhead, and I'm planning to do more tests with hackbench to
>>> confirm the overhead is minimal. I'll post some updates about it soon.
>>>
>>> Thanks!
>>>
>>> -Joel
>>>
>>>
>>>>
>>>>>
>>>>> With the patch, the error in the signal is significantly reduced,
>>>>> dropping to a small, negligible amount.
>>>>>
>>>>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Cc: Juri Lelli <juri.lelli@arm.com>
>>>>> Cc: Brendan Jackman <brendan.jackman@arm.com>
>>>>> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
>>>>> Signed-off-by: Joel Fernandes <joelaf@google.com>
>>>>> ---
>>>>> kernel/sched/fair.c | 8 ++++++--
>>>>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 4f1825d60937..1347643737f3 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -2882,6 +2882,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>>>>>                    unsigned long weight, int running, struct cfs_rq *cfs_rq)
>>>>>  {
>>>>>         u64 delta;
>>>>> +       int periods;
>>>>>
>>>>>         delta = now - sa->last_update_time;
>>>>>         /*
>>>>> @@ -2908,9 +2909,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
>>>>>          * accrues by two steps:
>>>>>          *
>>>>>          * Step 1: accumulate *_sum since last_update_time. If we haven't
>>>>> -        * crossed period boundaries, finish.
>>>>> +        * crossed period boundaries and the time since last update is small
>>>>> +        * enough, we're done.
>>>>>          */
>>>>> -       if (!accumulate_sum(delta, cpu, sa, weight, running, cfs_rq))
>>>>> +       periods = accumulate_sum(delta, cpu, sa, weight, running, cfs_rq);
>>>>> +
>>>>> +       if (!periods && delta < 128)
>>>>>                 return 0;
>>>>>
>>>>>         /*
>>>>> --
>>>>> 2.14.0.rc1.383.gd1ce394fe2-goog
>>>>>
Thread overview: 13+ messages
2017-08-04 15:40 [PATCH] sched/fair: Make PELT signal more accurate Joel Fernandes
2017-08-07 13:24 ` Vincent Guittot
2017-08-08 23:11 ` Joel Fernandes
2017-08-09 10:23 ` Vincent Guittot
2017-08-09 17:51 ` Joel Fernandes
2017-08-10 7:17 ` Vincent Guittot [this message]
2017-08-10 10:37 ` Peter Zijlstra
2017-08-10 15:22 ` Joel Fernandes
2017-08-07 13:40 ` Peter Zijlstra
2017-08-08 23:37 ` Joel Fernandes
2017-08-10 23:11 ` Joel Fernandes
2017-08-11 2:47 ` [lkp-robot] [sched/fair] dca93994f6: unixbench.score -8.1% regression kernel test robot
2017-08-11 15:57 ` Joel Fernandes