public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	josef@toxicpanda.com, mingo@redhat.com,
	linux-kernel@vger.kernel.org, kernel-team@fb.com,
	Josef Bacik <jbacik@fb.com>
Subject: Re: [RFC][PATCH] sched: attach extra runtime to the right avg
Date: Tue, 4 Jul 2017 08:47:55 -0400	[thread overview]
Message-ID: <20170704124754.GA6807@destiny> (raw)
In-Reply-To: <20170704124003.i7lg4c7gdnqqbjyo@hirez.programming.kicks-ass.net>

On Tue, Jul 04, 2017 at 02:40:03PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 04, 2017 at 02:21:50PM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 04, 2017 at 12:13:09PM +0200, Ingo Molnar wrote:
> > > 
> > > This code on the other hand:
> > > 
> > > 	sa->last_update_time += delta << 10;
> > > 
> > > ... in essence creates a whole new absolute clock value that slowly but surely is 
> > > drifting away from the real rq->clock, because 'delta' is always rounded down to 
> > > the nearest 1024 ns boundary, so we accumulate the 'remainder' losses.
> > > 
> > > That is because:
> > > 
> > >         delta >>= 10;
> > > 	...
> > >         sa->last_update_time += delta << 10;
> > > 
> > > Given enough time, ->last_update_time can drift a long way, and this delta:
> > > 
> > > 	delta = now - sa->last_update_time;
> > > 
> > > ... becomes meaningless AFAICS, because it's essentially two different clocks that 
> > > get compared.
> > 
> > Thing is, once you drift over 1023 (ns) your delta increases and you
> > catch up again.
> > 
> > 
> > 
> >  A  B     C       D          E  F
> >  |  |     |       |          |  |
> >  +----+----+----+----+----+----+----+----+----+----+----+
> > 
> > 
> > A: now = 0
> >    sa->last_update_time = 0
> >    delta := (now - sa->last_update_time) >> 10 = 0
> > 
> > B: now = 614				(+614)
> >    delta = (614 - 0) >> 10 = 0
> >    sa->last_update_time += 0		(0)
> >    sa->last_update_time = now & ~1023	(0)
> > 
> > C: now = 1843				(+1229)
> >    delta = (1843 - 0) >> 10 = 1
> >    sa->last_update_time += 1024		(1024)
> >    sa->last_update_time = now & ~1023	(1024)
> > 
> > 
> > D: now = 3481				(+1638)
> >    delta = (3481 - 1024) >> 10 = 2
> >    sa->last_update_time += 2048		(3072)
> >    sa->last_update_time = now & ~1023	(3072)
> > 
> > E: now = 5734				(+2253)
> >    delta = (5734 - 3072) = 2
> >    sa->last_update_time += 2048		(5120)
> >    sa->last_update_time = now & ~1023	(5120)
> > 
> > F: now = 6348				(+614)
> >    delta = (6348 - 5120) >> 10 = 1
> >    sa->last_update_time += 1024		(6144)
> >    sa->last_update_time = now & ~1023	(6144)
> > 
> > 
> > 
> > And you'll see that both are identical, and that both D and F have
> > gotten a spill from sub-chunk accounting.
> 
> 
> Where the two approaches differ is when we have different modifications
> to sa->last_update_time (and we do).
> 
> The differential (+=) one does not mandate initial value of
> ->last_update_time has the bottom 9 bits cleared. It will simply
> continue from wherever.
> 
> The absolute (&) one however mandates that ->last_update_time always has
> the bottom few bits 0, otherwise we can 'gain' time. The first iteration
> will clear those bits and we'll then double account them.
> 
> It so happens that we have an explicit assign in migrate
> (attach_entity_load_avg / set_task_rq_fair). And on negative delta. In
> all those cases we use the immediate 'now' value, no clearing of bottom
> bits.
> 
> The differential should work fine with that, the absolute one has double
> accounting issues in that case.
> 
> So it would be very good to find what exactly causes Josef's workload to
> get 'fixed'.

Sorry let me experiment some more, like I said this is one of 4 patches I need
to actually fix my workload, and I've tested so many iterations of this problem
that I may be _thinking_ it affects things but it really doesn't.  I'll re-test
with the normal code and the other 3 patches in place and see if things are ok.
Thanks,

Josef

  reply	other threads:[~2017-07-04 12:47 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-30  1:56 [RFC][PATCH] sched: attach extra runtime to the right avg josef
2017-07-02  9:37 ` Ingo Molnar
2017-07-03 13:30   ` Josef Bacik
2017-07-04  9:41   ` Peter Zijlstra
2017-07-04 10:13     ` Ingo Molnar
2017-07-04 12:21       ` Peter Zijlstra
2017-07-04 12:40         ` Peter Zijlstra
2017-07-04 12:47           ` Josef Bacik [this message]
2017-07-04 13:51           ` Josef Bacik
2017-07-04 14:28             ` Peter Zijlstra
2017-07-03  7:26 ` Vincent Guittot
2017-07-03 13:41   ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170704124754.GA6807@destiny \
    --to=josef@toxicpanda.com \
    --cc=jbacik@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox