From: Peter Zijlstra <peterz@infradead.org>
To: Yuyang Du <yuyang.du@intel.com>
Cc: mingo@redhat.com, linux-kernel@vger.kernel.org, pjt@google.com,
	bsegall@google.com, arjan.van.de.ven@intel.com,
	len.brown@intel.com, rafael.j.wysocki@intel.com,
	alan.cox@intel.com, mark.gross@intel.com, fengguang.wu@intel.com
Subject: Re: [PATCH 2/2 v4] sched: Rewrite per entity runnable load average tracking
Date: Mon, 28 Jul 2014 15:51:22 +0200
Message-ID: <20140728135122.GT6758@twins.programming.kicks-ass.net>
In-Reply-To: <1405639567-21445-3-git-send-email-yuyang.du@intel.com>


> +static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>  {
> +	int decayed;
>  
> +	if (atomic_long_read(&cfs_rq->removed_load_avg)) {
> +		long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
> +		cfs_rq->avg.load_avg = subtract_until_zero(cfs_rq->avg.load_avg, r);
> +		r *= LOAD_AVG_MAX;
> +		cfs_rq->avg.load_sum = subtract_until_zero(cfs_rq->avg.load_sum, r);
>  	}
>  
> +	decayed = __update_load_avg(now, &cfs_rq->avg, cfs_rq->load.weight);
>  
> +#ifndef CONFIG_64BIT
> +	if (cfs_rq->avg.last_update_time != cfs_rq->load_last_update_time_copy) {
> +		smp_wmb();
> +		cfs_rq->load_last_update_time_copy = cfs_rq->avg.last_update_time;
> +	}
> +#endif
>  
> +	return decayed;
> +}

So on every cfs_rq update we first process the 'pending' removals, then
decay, and then store the current timestamp (mirrored into
load_last_update_time_copy on !CONFIG_64BIT).
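
To make the removal step concrete, here is a minimal user-space sketch of
how a batch of pending removals might be folded into the queue's average.
It is not the patch's code: the model_* names are hypothetical, a plain
long stands in for the atomic_long_t, and subtract_until_zero() is assumed
to clamp at zero, since its body is not quoted in this mail.

```c
#include <stdint.h>

/* Assumed stand-in for the kernel's LOAD_AVG_MAX constant. */
#define MODEL_LOAD_AVG_MAX 47742L

/* Hypothetical, simplified view of the load fields of struct sched_avg. */
struct model_avg {
	uint64_t last_update_time;
	long load_avg;
	long load_sum;
};

/* Assumed semantics of subtract_until_zero(): subtract, but never go below 0. */
static long model_subtract_until_zero(long minuend, long subtrahend)
{
	return minuend > subtrahend ? minuend - subtrahend : 0;
}

/*
 * Fold the pending removals into the queue average. Reading and zeroing
 * *removed models atomic_long_xchg(&cfs_rq->removed_load_avg, 0).
 */
static void model_apply_removed(struct model_avg *avg, long *removed)
{
	long r = *removed;

	*removed = 0;
	if (!r)
		return;

	avg->load_avg = model_subtract_until_zero(avg->load_avg, r);
	r *= MODEL_LOAD_AVG_MAX;	/* scale the average up to a sum */
	avg->load_sum = model_subtract_until_zero(avg->load_sum, r);
}
```

The clamp matters because the removed amount was computed against a
possibly older timestamp than the queue's current one, so it can exceed
what is left in the queue's average.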

> +static inline void enqueue_entity_load_avg(struct sched_entity *se)
>  {
> +	struct sched_avg *sa = &se->avg;
> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +	u64 now = cfs_rq_clock_task(cfs_rq);
> +	int migrated = 0, decayed;
>  
> +	if (sa->last_update_time == 0) {
> +		sa->last_update_time = now;
>  
> +		if (entity_is_task(se))
> +			migrated = 1;
>  	}
> +	else
> +		__update_load_avg(now, sa, se->on_rq * se->load.weight);
>  
> +	decayed = update_cfs_rq_load_avg(now, cfs_rq);
>  
> +	if (migrated) {
> +		cfs_rq->avg.load_avg += sa->load_avg;
> +		cfs_rq->avg.load_sum += sa->load_sum;
>  	}
>  
> +	if (decayed || migrated)
> +		update_tg_load_avg(cfs_rq);
>  }

On enqueue we add ourselves to the cfs_rq, and assume the entity is
'current' wrt. updates, since we did that when we pulled it from the
old rq.
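
A stripped-down sketch of that enqueue decision, under stated assumptions:
the model_* names are hypothetical, the decay performed by
__update_load_avg() and update_cfs_rq_load_avg() is left out entirely, and
the entity_is_task() check is dropped, so only the last_update_time == 0
sentinel handling remains.

```c
#include <stdint.h>

/* Hypothetical, simplified view of the load fields of struct sched_avg. */
struct model_avg {
	uint64_t last_update_time;
	unsigned long load_avg;
	uint64_t load_sum;
};

/*
 * Model of the enqueue path. Returns 1 if the entity was treated as
 * freshly migrated and its contribution was added to the queue.
 */
static int model_enqueue(struct model_avg *se, struct model_avg *rq,
			 uint64_t now)
{
	int migrated = 0;

	if (se->last_update_time == 0) {
		/* Sentinel left by the migration path: load is taken as-is. */
		se->last_update_time = now;
		migrated = 1;
	}
	/* else: the real code decays the entity up to 'now' (omitted here). */

	if (migrated) {
		rq->load_avg += se->load_avg;
		rq->load_sum += se->load_sum;
	}

	return migrated;
}
```

Note that the entity's load is added without any further decay, which is
exactly why it has to be 'current' by the time it arrives.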

> @@ -4551,18 +4382,34 @@ migrate_task_rq_fair(struct task_struct *p, int next_cpu)
>  {
>  	struct sched_entity *se = &p->se;
>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +	u64 last_update_time;
>  
>  	/*
> +	 * The task on the old CPU catches up with its old cfs_rq and subtracts
> +	 * itself from that cfs_rq (the task must be off the queue by now).
>  	 */
> +#ifndef CONFIG_64BIT
> +	u64 last_update_time_copy;
> +
> +	do {
> +		last_update_time_copy = cfs_rq->load_last_update_time_copy;
> +		smp_rmb();
> +		last_update_time = cfs_rq->avg.last_update_time;
> +	} while (last_update_time != last_update_time_copy);
> +#else
> +	last_update_time = cfs_rq->avg.last_update_time;
> +#endif
> +	__update_load_avg(last_update_time, &se->avg, 0);
> +	atomic_long_add(se->avg.load_avg, &cfs_rq->removed_load_avg);
> +
> +	/*
> +	 * We are supposed to update the task to the "current" time, so that it is
> +	 * up to date and ready to go to the new CPU/cfs_rq. But it is difficult to
> +	 * tell what the current time is here, so simply throw away the out-of-date
> +	 * time. As a result the wakee task is decayed less than it should be, but
> +	 * giving the wakee more load does not sound bad.
> +	 */
> +	se->avg.last_update_time = 0;
>  
>  	/* We have migrated, no longer consider this task hot */
>  	se->exec_start = 0;


And here we try and make good on that assumption. The thing I worry
about is what happens if the machine is entirely idle...

What guarantees a semi-up-to-date cfs_rq->avg.last_update_time?
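
A rough illustration of the worry, as a sketch only. It assumes the usual
geometric decay with a half-life of 32 periods of 1024*1024 ns each, and
it deliberately applies whole half-lives only, so it is much coarser than
the real __update_load_avg(). All model_* names are hypothetical.

```c
#include <stdint.h>

/* Assumed period length (1024 * 1024 ns) and half-life (32 periods). */
#define MODEL_PERIOD_NS		((uint64_t)1024 * 1024)
#define MODEL_HALFLIFE_PERIODS	32

/* Coarse decay: halve once per full half-life, ignore the remainder. */
static uint64_t model_decay(uint64_t load, uint64_t periods)
{
	uint64_t halvings = periods / MODEL_HALFLIFE_PERIODS;

	if (halvings >= 64)	/* shifting by >= the type width is undefined */
		return 0;
	return load >> halvings;
}

/*
 * Load the entity carries to the new CPU when it is decayed only up to
 * sync_time, i.e. the old queue's last_update_time, not the true 'now'.
 */
static uint64_t model_migrate_load(uint64_t load, uint64_t se_last,
				   uint64_t sync_time)
{
	uint64_t periods = 0;

	if (sync_time > se_last)
		periods = (sync_time - se_last) / MODEL_PERIOD_NS;

	return model_decay(load, periods);
}
```

In this model an entity with load 1024 that slept for 64 periods should
arrive with 256, but if the old queue's timestamp has not moved since the
entity went to sleep, it arrives with the full 1024.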
