Re: [RFC][PATCH 1/3] sched: Rewrite tg_shares_up

From: Peter Zijlstra <peterz@infradead.org>
To: Paul Turner <pjt@google.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Chris Friesen <cfriesen@nortel.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Pierre Bourdon <pbourdon@excellency.fr>
Subject: Re: [RFC][PATCH 1/3] sched: Rewrite tg_shares_up
Date: Fri, 03 Sep 2010 09:59:35 +0200
Message-ID: <1283500775.1783.135.camel@laptop>
In-Reply-To: <AANLkTi=ugQbFNV2h-wQkoTBkQgFvDXEzwctgch8XpWVB@mail.gmail.com>

On Fri, 2010-09-03 at 04:09 +0100, Paul Turner wrote:

> > @@ -7652,8 +7574,7 @@ static void init_tg_cfs_entry(struct tas
> >                se->cfs_rq = parent->my_q;
> >
> >        se->my_q = cfs_rq;
> > -       se->load.weight = tg->shares;
> > -       se->load.inv_weight = 0;
> > +       update_load_set(&se->load, tg->shares);
> 
> Given the now-instantaneous update of shares->load on enqueue/dequeue,
> initialization to 0 would result in sane(r) sums across tg->se->load.
> Only relevant for debug though.

Ah, indeed.

> > @@ -8375,7 +8291,6 @@ int sched_group_set_shares(struct task_g
> >                /*
> >                 * force a rebalance
> >                 */
> > -               cfs_rq_set_shares(tg->cfs_rq[i], 0);
> >                set_se_shares(tg->se[i], shares);
> 
> I think an update_cfs_shares is wanted here instead; this will
> potentially over-commit everything until we hit tg_shares_up (e.g.
> long running task case).
> 
> Ironically, the heavy weight full enqueue/dequeue in the
> __set_se_shares path will actually fix up the weights ignoring the
> passed weight for the se->on_rq case.
> 
> I think both functions can be knocked out and just replaced with a
> <lock> <update load> <update shares> <unlock>
> 
> Although.. for total correctness this update should probably be hierarchical.

Right, I just didn't want to bother too much with this code yet, getting
it to more or less not explode when changing weights was good 'nuff.

> > +#ifdef CONFIG_FAIR_GROUP_SCHED
> > +static void update_cfs_load(struct cfs_rq *cfs_rq)
> > +{
> > +       u64 period = sched_avg_period();
> 
> This is a pretty large history window; while it should overlap the
> update period for obvious reasons, intuition suggests a smaller window
> (e.g. 2 x sched_latency) would probably be preferable here in terms of
> reducing over-commit and reducing convergence time.
> 
> I'll run some benchmarks and see how it impacts fairness.

Agreed, maybe even as small as 2*TICK_NSEC. It's certainly something we
want to play with, which is basically why I picked the variable that
already had a sysctl knob ;-)
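
To get a feel for why the window size matters, here is a minimal
userspace sketch of a halving load window. It is not the patch code:
struct load_window, window_update() and window_load() are made-up names,
and the halving step is an assumption about how such a window could be
aged, chosen only to show that a shorter window forgets old load faster.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model, not the kernel's struct cfs_rq. */
struct load_window {
	uint64_t stamp;		/* time of the last update, in ns */
	uint64_t period;	/* time currently covered by the window, in ns */
	uint64_t avg;		/* accumulated weight * time */
	uint64_t weight;	/* instantaneous weight since the last update */
};

/*
 * Account the time since the last update at the current weight, then
 * halve both accumulators until the window fits within max_period.
 */
static void window_update(struct load_window *w, uint64_t now,
			  uint64_t max_period)
{
	uint64_t delta = now - w->stamp;

	assert(max_period > 0);

	w->stamp = now;
	w->period += delta;
	w->avg += delta * w->weight;

	while (w->period > max_period) {
		w->period /= 2;
		w->avg /= 2;
	}
}

/* Time-averaged weight over the window. */
static uint64_t window_load(const struct load_window *w)
{
	return w->period ? w->avg / w->period : w->weight;
}
```

With a weight of 1024 for the first 1000ns and 0 afterwards, updating
every 1000ns up to t=3000 gives a load of 341 for a 4000ns window and
256 for a 1000ns window, so the shorter window converges faster in this
model.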

> > +       u64 now = rq_of(cfs_rq)->clock;
> > +       u64 delta = now - cfs_rq->load_stamp;
> > +
> 
> Is it meaningful/useful to maintain cfs_rq->load for the rq->cfs_rq case?

Probably not,.. I had ideas of maybe using this load_avg for other
things, but then, maybe not..


> > @@ -771,7 +844,9 @@ enqueue_entity(struct cfs_rq *cfs_rq, st
> >         * Update run-time statistics of the 'current'.
> >         */
> >        update_curr(cfs_rq);
> > +       update_cfs_load(cfs_rq);
> >        account_entity_enqueue(cfs_rq, se);
> > +       update_cfs_shares(group_cfs_rq(se));
> 
> Don't we want to be updating the queuing cfs_rq's shares here?
> 
> The owned cfs_rq's share proportion isn't going to change as a result
> of being enqueued -- and is guaranteed to be hit by a previous queuing
> cfs_rq update in the initial enqueue case.

Right, I had that, that didn't work because,.. uhm,. /me scratches
head.. Ah!, yes, you need the queueing cfs_rq's group to be already
enqueued. So instead of updating ahead, we update backwards.

> > @@ -1055,6 +1134,9 @@ enqueue_task_fair(struct rq *rq, struct
> >                flags = ENQUEUE_WAKEUP;
> >        }
> >
> > +       for_each_sched_entity(se)
> > +               update_cfs_shares(group_cfs_rq(se));
> 
> If the queuing cfs_rq is used above then group_cfs_rq is redundant
> here, cfs_rq_of can be used.
> 
> Also, the respective load should be updated here.

Ah, indeed, that wants an update_cfs_load() as well. /me does
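
As a rough illustration of the bottom-up walk discussed above, here is a
small userspace model. Everything in it is hypothetical: struct
model_se, struct model_rq, model_update_shares() and model_walk_up() are
invented for the sketch, and the share formula (group shares scaled by
the local fraction of the group's load) is an assumption, not a quote
from the patch. It also assumes every group entity on the path is
already queued, which is the ordering constraint mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_rq;

/* A schedulable entity: a task, or a group's representative on this CPU. */
struct model_se {
	struct model_se *parent;	/* group entity one level up, NULL at the top */
	struct model_rq *rq;		/* runqueue this entity is queued on */
	uint64_t weight;
};

/* One group's runqueue on this CPU. */
struct model_rq {
	struct model_se *owner;		/* entity representing this rq in its parent, NULL for root */
	uint64_t tg_shares;		/* shares configured for the whole group */
	uint64_t local_load;		/* sum of the weights queued here */
	uint64_t remote_load;		/* the group's load on other CPUs, taken as given */
};

/* Recompute the weight of the entity that represents rq one level up. */
static void model_update_shares(struct model_rq *rq)
{
	struct model_se *se = rq->owner;
	uint64_t total = rq->local_load + rq->remote_load;
	uint64_t shares;

	if (!se || !total)
		return;

	shares = rq->tg_shares * rq->local_load / total;

	/* The owner is assumed queued, so its rq's sum must follow the reweight. */
	assert(se->rq->local_load >= se->weight);
	se->rq->local_load = se->rq->local_load - se->weight + shares;
	se->weight = shares;
}

/* Walk from the queued entity to the top, fixing up each level in turn. */
static void model_walk_up(struct model_se *se)
{
	for (; se; se = se->parent)
		model_update_shares(se->rq);
}
```

Walking upwards means each level sees the already-corrected load of the
level below it: reweighting a group entity changes the load of the
runqueue it sits on before that runqueue's own shares are recomputed.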

> > @@ -3510,6 +3545,8 @@ static void rebalance_domains(int cpu, e
> >        int update_next_balance = 0;
> >        int need_serialize;
> >
> > +       update_shares(cpu);
> > +
> 
> This may not be frequent enough, especially in the dilated cpus-busy case

Not exactly sure what you mean, but if there's wakeup/sleep activity,
that activity will already rebalance for us; if it is purely long-running
jobs, once a tick should suffice, no?
