From: Peter Zijlstra <peterz@infradead.org>
To: bsegall@google.com
Cc: Yuyang Du <yuyang.du@intel.com>,
mingo@redhat.com, linux-kernel@vger.kernel.org,
rafael.j.wysocki@intel.com, arjan.van.de.ven@intel.com,
len.brown@intel.com, alan.cox@intel.com, mark.gross@intel.com,
pjt@google.com, fengguang.wu@intel.com
Subject: Re: [PATCH 2/2] sched: Rewrite per entity runnable load average tracking
Date: Tue, 8 Jul 2014 14:50:50 +0200
Message-ID: <20140708125050.GA2923@twins.programming.kicks-ass.net>
In-Reply-To: <xm26r41wyfgc.fsf@sword-of-the-dawn.mtv.corp.google.com>
On Mon, Jul 07, 2014 at 03:25:07PM -0700, bsegall@google.com wrote:
> >> +static inline void enqueue_entity_load_avg(struct sched_entity *se)
> >>  {
> >> +	struct sched_avg *sa = &se->avg;
> >> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >> +	u64 now = cfs_rq_clock_task(cfs_rq);
> >> +	u32 old_load_avg = cfs_rq->avg.load_avg;
> >> +	int migrated = 0;
> >>
> >> +	if (entity_is_task(se)) {
> >> +		if (sa->last_update_time == 0) {
> >> +			sa->last_update_time = now;
> >> +			migrated = 1;
> >>  		}
> >> +		else
> >> +			__update_load_avg(now, sa, se->on_rq * se->load.weight);
> >>  	}
> >>
> >> +	__update_load_avg(now, &cfs_rq->avg, cfs_rq->load.weight);
> >>
> >> +	if (migrated)
> >> +		cfs_rq->avg.load_avg += sa->load_avg;
> >>
> >> +	synchronize_tg_load_avg(cfs_rq, old_load_avg);
> >>  }
> >
> > So here you add the task to the cfs_rq avg when it gets migrated in,
> > however:
> >
> >> @@ -4552,17 +4326,9 @@ migrate_task_rq_fair(struct task_struct *p, int next_cpu)
> >>  	struct sched_entity *se = &p->se;
> >>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >>
> >> +	/* Update task on old CPU, then ready to go (entity must be off the queue) */
> >> +	__update_load_avg(cfs_rq_clock_task(cfs_rq), &se->avg, 0);
> >> +	se->avg.last_update_time = 0;
> >>
> >>  	/* We have migrated, no longer consider this task hot */
> >>  	se->exec_start = 0;
> >
> > there you don't remove it first..
>
> Yeah, the issue is that you can't remove it, because you don't hold the
> lock. Thus the whole runnable/blocked split iirc. Also the
> cfs_rq_clock_task read is incorrect for the same reason (and while
> rq_clock_task could certainly be fixed min_vruntime-style,
> cfs_rq_clock_task would be harder).
>
> The problem with just working around the clock issue somehow and then using an
> atomic to do this subtraction is that you have no idea when the /cfs_rq/
> last updated - there's no guarantee it is up to date, and if it's not
> then the subtraction is wrong. You can't update it to make it up to date
> like the se->avg, because you don't hold any locks. You would need
> decay_counter stuff like the current code, and I'm not certain how well
> that would work out without the runnable/blocked split.
Right; so the current code jumps through a few nasty hoops because of
this. But I think the proposed code got this wrong (understandably).

But yes, we spent a lot of time and effort to remove the rq->lock from
the remote wakeup path, which makes all this very tedious indeed.

Like you said, we can indeed make the time thing work, but the remote
subtraction is going to be messy. Can't seem to come up with anything
sane there either.
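
To make the stale-subtraction problem concrete, here is a toy model. It is
not kernel code and not the patch's method: time is counted in whole
half-lives so that decay is a plain right shift, and the names toy_avg,
toy_update, remove_after_update and remove_from_stale are hypothetical.
decay_load here is a simplified stand-in, not the kernel function of the
same name. It only shows why the order of "decay the cfs_rq sum" and
"subtract the departing entity" matters when the remote side cannot bring
the cfs_rq up to date first.

```c
#include <assert.h>
#include <stdint.h>

/* Toy decay: one unit of time is one half-life, so decay is a shift. */
static uint64_t decay_load(uint64_t load, unsigned int half_lives)
{
	assert(half_lives < 64);
	return load >> half_lives;
}

/* Hypothetical stand-in for a load average plus its timestamp. */
struct toy_avg {
	uint64_t load_avg;
	unsigned int last_update;	/* in half-lives */
};

/* Bring an average up to 'now'; models the update done under the lock. */
static void toy_update(struct toy_avg *a, unsigned int now)
{
	a->load_avg = decay_load(a->load_avg, now - a->last_update);
	a->last_update = now;
}

/* Locked order: update the queue sum to 'now', then subtract. */
static uint64_t remove_after_update(struct toy_avg rq, uint64_t se_load_now,
				    unsigned int now)
{
	toy_update(&rq, now);
	return rq.load_avg - se_load_now;
}

/*
 * Remote order: subtract the up-to-date entity load from a sum that is
 * still stale, and only later let the queue catch up to 'now'.
 */
static uint64_t remove_from_stale(struct toy_avg rq, uint64_t se_load_now,
				  unsigned int now)
{
	rq.load_avg -= se_load_now;
	toy_update(&rq, now);
	return rq.load_avg;
}
```

With a queue sum of 1536 at time 0 (two blocked entities, 1024 and 512) and
the 1024 entity leaving one half-life later (its own load decayed to 512),
remove_after_update() leaves 256, which is the remaining entity's decayed
load, while remove_from_stale() leaves 512, because too little was
subtracted relative to the undecayed sum.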