From: Peter Zijlstra <peterz@infradead.org>
To: Yuyang Du <yuyang.du@intel.com>
Cc: mingo@redhat.com, linux-kernel@vger.kernel.org, pjt@google.com,
bsegall@google.com, arjan.van.de.ven@intel.com,
len.brown@intel.com, rafael.j.wysocki@intel.com,
alan.cox@intel.com, mark.gross@intel.com, fengguang.wu@intel.com
Subject: Re: [RESEND PATCH 2/3 v5] sched: Rewrite per entity runnable load average tracking
Date: Wed, 22 Oct 2014 12:04:11 +0200
Message-ID: <20141022100411.GC23531@worktop.programming.kicks-ass.net>
In-Reply-To: <1412907717-2871-3-git-send-email-yuyang.du@intel.com>
On Fri, Oct 10, 2014 at 10:21:56AM +0800, Yuyang Du wrote:
> +/* Group cfs_rq's load_avg is used for task_h_load and update_cfs_share */
> +static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
> {
> + int decayed;
>
> + if (atomic_long_read(&cfs_rq->removed_load_avg)) {
> + long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> + cfs_rq->avg.load_avg = max_t(long, cfs_rq->avg.load_avg - r, 0);
> + cfs_rq->avg.load_sum =
> + max_t(s64, cfs_rq->avg.load_sum - r * LOAD_AVG_MAX, 0);
> }
>
> + decayed = __update_load_avg(now, &cfs_rq->avg, cfs_rq->load.weight);
>
> +#ifndef CONFIG_64BIT
> + smp_wmb();
> + cfs_rq->load_last_update_time_copy = cfs_rq->avg.last_update_time;
> +#endif
>
> -static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq);
> + return decayed;
> +}
> +void remove_entity_load_avg(struct sched_entity *se)
> {
> + struct cfs_rq *cfs_rq = cfs_rq_of(se);
> + u64 last_update_time;
> +
> +#ifndef CONFIG_64BIT
> + u64 last_update_time_copy;
> +
> + do {
> + last_update_time_copy = cfs_rq->load_last_update_time_copy;
> + smp_rmb();
> + last_update_time = cfs_rq->avg.last_update_time;
> + } while (last_update_time != last_update_time_copy);
> +#else
> + last_update_time = cfs_rq->avg.last_update_time;
> +#endif
>
> + __update_load_avg(last_update_time, &se->avg, 0);
> + atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
> }
> +static void migrate_task_rq_fair(struct task_struct *p, int next_cpu)
> {
> /*
> + * We are supposed to update the task to "current" time, then its up to date
> + * and ready to go to new CPU/cfs_rq. But we have difficulty in getting
> + * what current time is, so simply throw away the out-of-date time. This
> + * will result in the wakee task is less decayed, but giving the wakee more
> + * load sounds not bad.
> */
> + remove_entity_load_avg(&p->se);
> +
> + /* Tell new CPU we are migrated */
> + p->se.avg.last_update_time = 0;
>
> /* We have migrated, no longer consider this task hot */
> + p->se.exec_start = 0;
> }
Because of:

  entity_tick()
    update_load_avg()
      update_cfs_rq_load_avg()

we're likely to only lag TICK_NSEC behind, right? And thus the
truncation we do in migrate_task_rq_fair() is of equal size.

Hmm, one problem: a cgroup cfs_rq can be idle for a long while and not
get any ticks at all, so those can lag unbounded. Then again, this
appears to be a problem in the current code too, hmm..

Anybody?
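
To put rough numbers on the lag discussed above, here is a minimal
userspace sketch (not part of the patch) of how much decay is skipped
when a cfs_rq's last_update_time is stale by a given amount. It assumes
the usual load-tracking parameters: a period of 1024us (1 << 20 ns) and
a decay factor y with y^32 = 1/2. The names pelt_decay_factor(),
PELT_PERIOD_NS and PELT_Y are hypothetical and only serve this
illustration; the kernel itself uses fixed-point tables rather than
floating point.

```c
#include <stdint.h>

/* Assumption: one load-tracking period is 1024us, i.e. 1 << 20 ns. */
#define PELT_PERIOD_NS (1ULL << 20)
/* Assumption: y = 2^(-1/32), so that 32 periods halve a contribution. */
#define PELT_Y 0.97857206208770013

/*
 * Hypothetical helper: the factor by which a load contribution would
 * have decayed over lag_ns. A stale timestamp means this decay is not
 * applied, so the migrating task keeps 1/factor times too much load.
 */
static double pelt_decay_factor(uint64_t lag_ns)
{
	uint64_t periods = lag_ns / PELT_PERIOD_NS;
	double factor = 1.0;

	/* After this many half-lives the contribution is treated as zero. */
	if (periods >= 32 * 64)
		return 0.0;

	/* Every 32 whole periods halve the contribution. */
	while (periods >= 32) {
		factor *= 0.5;
		periods -= 32;
	}

	/* Remaining periods decay by y each. */
	while (periods--)
		factor *= PELT_Y;

	return factor;
}
```

With a 1ms tick the lag is shorter than one period, so nothing is lost;
with a 4ms tick about three periods of decay are skipped. A group cfs_rq
that has been idle for seconds is a different story: the skipped decay
is then many half-lives, which is the unbounded case raised above.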