From: Josef Bacik <josef@toxicpanda.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
josef@toxicpanda.com, mingo@redhat.com,
linux-kernel@vger.kernel.org, kernel-team@fb.com,
Josef Bacik <jbacik@fb.com>
Subject: Re: [RFC][PATCH] sched: attach extra runtime to the right avg
Date: Tue, 4 Jul 2017 09:51:25 -0400 [thread overview]
Message-ID: <20170704135123.GA7328@destiny> (raw)
In-Reply-To: <20170704124003.i7lg4c7gdnqqbjyo@hirez.programming.kicks-ass.net>
On Tue, Jul 04, 2017 at 02:40:03PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 04, 2017 at 02:21:50PM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 04, 2017 at 12:13:09PM +0200, Ingo Molnar wrote:
> > >
> > > This code on the other hand:
> > >
> > > sa->last_update_time += delta << 10;
> > >
> > > ... in essence creates a whole new absolute clock value that slowly but surely is
> > > drifting away from the real rq->clock, because 'delta' is always rounded down to
> > > the nearest 1024 ns boundary, so we accumulate the 'remainder' losses.
> > >
> > > That is because:
> > >
> > > delta >>= 10;
> > > ...
> > > sa->last_update_time += delta << 10;
> > >
> > > Given enough time, ->last_update_time can drift a long way, and this delta:
> > >
> > > delta = now - sa->last_update_time;
> > >
> > > ... becomes meaningless AFAICS, because it's essentially two different clocks that
> > > get compared.
> >
> > Thing is, once you drift over 1023 (ns) your delta increases and you
> > catch up again.
> >
> >
> >
> > A B C D E F
> > | | | | | |
> > +----+----+----+----+----+----+----+----+----+----+----+
> >
> >
> > A: now = 0
> > sa->last_update_time = 0
> > delta := (now - sa->last_update_time) >> 10 = 0
> >
> > B: now = 614 (+614)
> > delta = (614 - 0) >> 10 = 0
> > sa->last_update_time += 0 (0)
> > sa->last_update_time = now & ~1023 (0)
> >
> > C: now = 1843 (+1229)
> > delta = (1843 - 0) >> 10 = 1
> > sa->last_update_time += 1024 (1024)
> > sa->last_update_time = now & ~1023 (1024)
> >
> >
> > D: now = 3481 (+1638)
> > delta = (3481 - 1024) >> 10 = 2
> > sa->last_update_time += 2048 (3072)
> > sa->last_update_time = now & ~1023 (3072)
> >
> > E: now = 5734 (+2253)
> > delta = (5734 - 3072) = 2
> > sa->last_update_time += 2048 (5120)
> > sa->last_update_time = now & ~1023 (5120)
> >
> > F: now = 6348 (+614)
> > delta = (6348 - 5120) >> 10 = 1
> > sa->last_update_time += 1024 (6144)
> > sa->last_update_time = now & ~1023 (6144)
> >
> >
> >
> > And you'll see that both are identical, and that both D and F have
> > gotten a spill from sub-chunk accounting.
>
>
> Where the two approaches differ is when we have different modifications
> to sa->last_update_time (and we do).
>
> The differential (+=) one does not mandate initial value of
> ->last_update_time has the bottom 9 bits cleared. It will simply
> continue from wherever.
>
> The absolute (&) one however mandates that ->last_update_time always has
> the bottom few bits 0, otherwise we can 'gain' time. The first iteration
> will clear those bits and we'll then double account them.
>
> It so happens that we have an explicit assign in migrate
> (attach_entity_load_avg / set_task_rq_fair). And on negative delta. In
> all those cases we use the immediate 'now' value, no clearing of bottom
> bits.
>
> The differential should work fine with that, the absolute one has double
> accounting issues in that case.
>
> So it would be very good to find what exactly causes Josef's workload to
> get 'fixed'.
Sorry everybody, I thought I had tested all of the patches minus this one, but
apparently I did not. Re-testing with the original code and my other patches
verified that the problem is still fixed, so this isn't needed. Sorry for the
noise,
Josef
next prev parent reply other threads:[~2017-07-04 13:51 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-30 1:56 [RFC][PATCH] sched: attach extra runtime to the right avg josef
2017-07-02 9:37 ` Ingo Molnar
2017-07-03 13:30 ` Josef Bacik
2017-07-04 9:41 ` Peter Zijlstra
2017-07-04 10:13 ` Ingo Molnar
2017-07-04 12:21 ` Peter Zijlstra
2017-07-04 12:40 ` Peter Zijlstra
2017-07-04 12:47 ` Josef Bacik
2017-07-04 13:51 ` Josef Bacik [this message]
2017-07-04 14:28 ` Peter Zijlstra
2017-07-03 7:26 ` Vincent Guittot
2017-07-03 13:41 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170704135123.GA7328@destiny \
--to=josef@toxicpanda.com \
--cc=jbacik@fb.com \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox