From: Ingo Molnar <mingo@kernel.org>
To: josef@toxicpanda.com
Cc: mingo@redhat.com, peterz@infradead.org,
linux-kernel@vger.kernel.org, kernel-team@fb.com,
Josef Bacik <jbacik@fb.com>
Subject: Re: [RFC][PATCH] sched: attach extra runtime to the right avg
Date: Sun, 2 Jul 2017 11:37:18 +0200 [thread overview]
Message-ID: <20170702093718.aq5p5xxfvrjdeful@gmail.com> (raw)
In-Reply-To: <1498787766-9593-1-git-send-email-jbacik@fb.com>
* josef@toxicpanda.com <josef@toxicpanda.com> wrote:
> From: Josef Bacik <jbacik@fb.com>
>
> We only track the load avg of a se in 1024 ns chunks, so in order to
> make up for the loss of the < 1024 ns part of a run/sleep delta we only
> add the time we processed to the se->avg.last_update_time. The problem
> is there is no way to know if this extra time was while we were asleep
> or while we were running. Instead keep track of the remainder and apply
> it in the appropriate place. If the remainder was while we were
> running, add it to the delta the next time we update the load avg while
> running, and the same for sleeping. This (coupled with other fixes)
> mostly fixes the regression to my workload introduced by Peter's
> experimental runnable load propagation patches.
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> @@ -2897,12 +2904,16 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> * Use 1024ns as the unit of measurement since it's a reasonable
> * approximation of 1us and fast to compute.
> */
> + remainder = delta & (1023UL);
> + sa->last_update_time = now;
> + if (running)
> + sa->run_remainder = remainder;
> + else
> + sa->sleep_remainder = remainder;
> delta >>= 10;
> if (!delta)
> return 0;
>
> - sa->last_update_time += delta << 10;
> -
So I'm wondering, this chunk changes how sa->last_update_time is maintained in
___update_load_avg(): the new code takes a precise timestamp, but the old code was
not taking an imprecise timestamp, but was updating it via deltas - where each
delta was rounded down to the nearest 1024 nsecs boundary.
That, if this is the main code path that updates ->last_update_time, creates a
constant drift of rounding error that skews ->last_update_time into larger and
larger distances from the real 'now' - ever increasing the value of 'delta'.
An intermediate approach to improve that skew would be something like below. It
doesn't track the remainder like your patch does, but doesn't lose precision
either, just rounds down 'now' to the nearest 1024 boundary.
Does this fix the regression you observed as well? Totally untested.
Thanks,
Ingo
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 008c514dc241..b03703cd7989 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2965,7 +2965,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
if (!delta)
return 0;
- sa->last_update_time += delta << 10;
+ sa->last_update_time = now & ~1023ULL;
/*
* Now we know we crossed measurement unit boundaries. The *_avg
next prev parent reply other threads:[~2017-07-02 9:37 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-30 1:56 [RFC][PATCH] sched: attach extra runtime to the right avg josef
2017-07-02 9:37 ` Ingo Molnar [this message]
2017-07-03 13:30 ` Josef Bacik
2017-07-04 9:41 ` Peter Zijlstra
2017-07-04 10:13 ` Ingo Molnar
2017-07-04 12:21 ` Peter Zijlstra
2017-07-04 12:40 ` Peter Zijlstra
2017-07-04 12:47 ` Josef Bacik
2017-07-04 13:51 ` Josef Bacik
2017-07-04 14:28 ` Peter Zijlstra
2017-07-03 7:26 ` Vincent Guittot
2017-07-03 13:41 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170702093718.aq5p5xxfvrjdeful@gmail.com \
--to=mingo@kernel.org \
--cc=jbacik@fb.com \
--cc=josef@toxicpanda.com \
--cc=kernel-team@fb.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox