public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: balbir@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH 02/30] sched: revert the revert of: weight calculations
Date: Tue, 15 Jul 2008 22:16:05 +0200
Message-ID: <1216152965.12595.251.camel@twins>
In-Reply-To: <20080630180702.GC23606@balbir.in.ibm.com>

On Mon, 2008-06-30 at 23:37 +0530, Balbir Singh wrote:
> * Peter Zijlstra <a.p.zijlstra@chello.nl> [2008-06-27 13:41:11]:

> >  /*
> > + * delta *= w / rw
> > + */
> > +static inline unsigned long
> > +calc_delta_weight(unsigned long delta, struct sched_entity *se)
> > +{
> > +	for_each_sched_entity(se) {
> > +		delta = calc_delta_mine(delta,
> > +				se->load.weight, &cfs_rq_of(se)->load);
> > +	}
> > +
> > +	return delta;
> > +}
> > +
> > +/*
> > + * delta *= rw / w
> > + */
> > +static inline unsigned long
> > +calc_delta_fair(unsigned long delta, struct sched_entity *se)
> > +{
> > +	for_each_sched_entity(se) {
> > +		delta = calc_delta_mine(delta,
> > +				cfs_rq_of(se)->load.weight, &se->load);
> > +	}
> > +
> > +	return delta;
> > +}
> > +
> 
> These functions can do with better comments

you mean like: 

/*
 * delta *= \Prod_{i} rw_{i} / w_{i} ?
 */

?

> delta is scaled up as we move up the hierarchy
> 
> Why is calc_delta_weight() different from calc_delta_fair()?

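Because they do opposite operations: at each level of the hierarchy
calc_delta_weight() multiplies delta by w/rw, while calc_delta_fair()
multiplies it by rw/w.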

I agree though that perhaps the names could have been chosen better.
I've wondered about that on several occasions but so far failed to come
up with anything sane.
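
To make the "opposite operation" concrete, here is a minimal userspace
sketch of the two products. It is not the kernel code: struct level and
the scale_by_*() names are hypothetical, and plain 64-bit division
stands in for calc_delta_mine(), which uses a precomputed inverse
weight instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for one level of the group hierarchy:
 * w  plays the role of se->load.weight,
 * rw plays the role of cfs_rq_of(se)->load.weight.
 */
struct level {
	uint64_t w;
	uint64_t rw;
};

/* delta *= \Prod_{i} w_{i} / rw_{i}   (the calc_delta_weight() direction) */
static uint64_t scale_by_weight(uint64_t delta, const struct level *lv, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(lv[i].rw != 0);
		delta = delta * lv[i].w / lv[i].rw;
	}
	return delta;
}

/* delta *= \Prod_{i} rw_{i} / w_{i}   (the calc_delta_fair() direction) */
static uint64_t scale_by_fair(uint64_t delta, const struct level *lv, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(lv[i].w != 0);
		delta = delta * lv[i].rw / lv[i].w;
	}
	return delta;
}
```

With two levels of (w, rw) = (1024, 2048) and (1024, 4096), the first
function turns 8000 into 1000 and the second turns 1000 back into 8000,
as long as the divisions happen to be exact.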

> > +/*
> >   * The idea is to set a period in which each task runs once.
> >   *
> >   * When there are too many tasks (sysctl_sched_nr_latency) we have to stretch
> > @@ -362,47 +390,54 @@ static u64 __sched_period(unsigned long 
> >   */
> >  static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -	u64 slice = __sched_period(cfs_rq->nr_running);
> > -
> > -	for_each_sched_entity(se) {
> > -		cfs_rq = cfs_rq_of(se);
> > -
> > -		slice *= se->load.weight;
> > -		do_div(slice, cfs_rq->load.weight);
> > -	}
> > -
> > -
> > -	return slice;
> > +	return calc_delta_weight(__sched_period(cfs_rq->nr_running), se);
> >  }
> > 
> >  /*
> >   * We calculate the vruntime slice of a to be inserted task
> >   *
> > - * vs = s/w = p/rw
> > + * vs = s*rw/w = p
> >   */
> >  static u64 sched_vslice_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> >  	unsigned long nr_running = cfs_rq->nr_running;
> > -	unsigned long weight;
> > -	u64 vslice;
> > 
> >  	if (!se->on_rq)
> >  		nr_running++;
> > 
> > -	vslice = __sched_period(nr_running);
> > +	return __sched_period(nr_running);
> 
> Do we always return a constant value based on nr_running? Am I
> misreading the diff by any chance?

static u64 __sched_period(unsigned long nr_running)
{
        u64 period = sysctl_sched_latency;
        unsigned long nr_latency = sched_nr_latency;

        if (unlikely(nr_running > nr_latency)) {
                period = sysctl_sched_min_granularity;
                period *= nr_running;
        }

        return period;
}

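it's not exactly constant: it stays at sysctl_sched_latency until
nr_running exceeds sched_nr_latency, and from there on it grows linearly
with nr_running.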
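
A small standalone sketch of that behaviour, for illustration only. The
three tunable values below are assumptions picked for the example, not
values stated in this thread, and period_for() is a hypothetical name
for a userspace copy of the logic above.

```c
#include <stdint.h>

/* Assumed example values; the real ones are runtime tunables. */
static const uint64_t latency_ns = 20000000ULL;        /* stands in for sysctl_sched_latency */
static const uint64_t min_granularity_ns = 4000000ULL; /* stands in for sysctl_sched_min_granularity */
static const unsigned long nr_latency = 5;             /* stands in for sched_nr_latency */

/* Same shape as __sched_period(): flat up to nr_latency, then linear. */
static uint64_t period_for(unsigned long nr_running)
{
	uint64_t period = latency_ns;

	if (nr_running > nr_latency) {
		period = min_granularity_ns;
		period *= nr_running;
	}

	return period;
}
```

Under these assumed values, 1 to 5 runnable tasks all get a 20ms period,
while 8 runnable tasks stretch it to 32ms.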

> > +}
> > +
> > +/*
> > + * The goal of calc_delta_asym() is to be asymmetrically around NICE_0_LOAD, in
> > + * that it favours >=0 over <0.
> > + *
> > + *   -20         |
> > + *               |
> > + *     0 --------+-------
> > + *             .'
> > + *    19     .'
> > + *
> > + */
> > +static unsigned long
> > +calc_delta_asym(unsigned long delta, struct sched_entity *se)
> > +{
> > +	struct load_weight lw = {
> > +		.weight = NICE_0_LOAD,
> > +		.inv_weight = 1UL << (WMULT_SHIFT-NICE_0_SHIFT)
> > +	};
> 
> Could you please explain this
> 
> weight is 1 << 10
> and inv_weight is 1 << 22

we have the relation that:

 x/weight ~= (x*inv_weight) >> 32

or

 inv_weight = (1<<32) / weight

See kernel/sched.c:calc_delta_mine()

when weight is 1<<10, that reduces to 1<<(32-10) = 1<<22
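
The relation can be checked in userspace with a few lines. This is only
a sketch of the arithmetic: the helper names are hypothetical, and it
leaves out the rounding and overflow handling that calc_delta_mine()
has to care about.

```c
#include <assert.h>
#include <stdint.h>

/* inv_weight = (1<<32) / weight */
static uint64_t inv_weight_of(uint64_t weight)
{
	assert(weight != 0);
	return (1ULL << 32) / weight;
}

/*
 * x/weight ~= (x*inv_weight) >> 32
 *
 * The caller must keep x small enough that x * inv_weight fits in
 * 64 bits; this sketch does not guard against overflow.
 */
static uint64_t div_by_weight(uint64_t x, uint64_t inv_weight)
{
	return (x * inv_weight) >> 32;
}
```

For a power-of-two weight such as 1<<10 the result is exact; for other
weights inv_weight is truncated, so the result can come out slightly
below x/weight, hence the ~= above.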

> > 
> >  	for_each_sched_entity(se) {
> > -		cfs_rq = cfs_rq_of(se);
> > +		struct load_weight *se_lw = &se->load;
> > 
> > -		weight = cfs_rq->load.weight;
> > -		if (!se->on_rq)
> > -			weight += se->load.weight;
> > +		if (se->load.weight < NICE_0_LOAD)
> > +			se_lw = &lw;
> 
> Why do we do this?

You're basically asking what the _asym part is about, right?

So, what this patch does is change the virtual time calculation from:

 1 / w, to rw / w

[ actually to: \Prod_{i} rw_{i}/w_{i} ]

Now wakeup_gran() has this asymmetry:

> > 	/*
> > -	 * More easily preempt - nice tasks, while not making
> > -	 * it harder for + nice tasks.
> >  	 */
> > -	if (unlikely(se->load.weight > NICE_0_LOAD))
> > -		gran = calc_delta_fair(gran, &se->load);

calc_delta_asym() tries to generalize that to the new scheme. As you can
see from the next two patches, the code in this patch isn't perfect. This
patch just restores the status quo from before the revert; the next
patches continue from there.
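
A rough single-level sketch of the _asym idea, under a stated
assumption: the hunk quoted above only shows se_lw being swapped for a
NICE_0_LOAD weight when the entity is lighter than that, so the sketch
assumes the substituted weight is then used as the divisor in the same
way calc_delta_fair() uses se->load. asym_scale() is a hypothetical
name and plain division replaces calc_delta_mine().

```c
#include <assert.h>
#include <stdint.h>

#define NICE_0_LOAD 1024ULL	/* 1 << 10, as discussed above */

/*
 * delta *= rw / max(w, NICE_0_LOAD)
 *
 * Entities lighter than NICE_0_LOAD are treated as if they had
 * exactly NICE_0_LOAD, so they are not scaled up any further than a
 * nice-0 entity would be. Heavier entities are scaled by rw/w as usual.
 */
static uint64_t asym_scale(uint64_t delta, uint64_t w, uint64_t rw)
{
	/* Assumption: the clamped weight acts as the divisor. */
	uint64_t divisor = (w < NICE_0_LOAD) ? NICE_0_LOAD : w;

	assert(divisor != 0);
	return delta * rw / divisor;
}
```

On a runqueue of weight 2048, a delta of 1000 becomes 2000 both for an
entity of weight 1024 and for one of weight 512, whereas the symmetric
rw/w scaling would give the lighter entity 4000.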



Thread overview: 39+ messages
2008-06-27 11:41 [PATCH 00/30] SMP-group balancer - take 3 Peter Zijlstra
2008-06-27 11:41 ` [PATCH 01/30] sched: clean up some unused variables Peter Zijlstra
2008-06-27 11:41 ` [PATCH 02/30] sched: revert the revert of: weight calculations Peter Zijlstra
2008-06-30 18:07   ` Balbir Singh
2008-07-15 20:16     ` Peter Zijlstra [this message]
2008-06-27 11:41 ` [PATCH 03/30] sched: fix calc_delta_asym() Peter Zijlstra
2008-06-27 11:41 ` [PATCH 04/30] sched: fix calc_delta_asym Peter Zijlstra
2008-06-27 11:41 ` [PATCH 05/30] sched: revert revert of: fair-group: SMP-nice for group scheduling Peter Zijlstra
2008-06-27 11:41 ` [PATCH 06/30] sched: sched_clock_cpu() based cpu_clock() Peter Zijlstra
2008-06-27 11:41 ` [PATCH 07/30] sched: fix wakeup granularity and buddy granularity Peter Zijlstra
2008-06-27 11:41 ` [PATCH 08/30] sched: add full schedstats to /proc/sched_debug Peter Zijlstra
2008-06-27 11:41 ` [PATCH 09/30] sched: fix sched_domain aggregation Peter Zijlstra
2008-06-27 11:41 ` [PATCH 10/30] sched: update aggregate when holding the RQs Peter Zijlstra
2008-06-27 11:41 ` [PATCH 11/30] sched: kill task_group balancing Peter Zijlstra
2008-06-27 11:41 ` [PATCH 12/30] sched: dont micro manage share losses Peter Zijlstra
2008-06-27 11:41 ` [PATCH 13/30] sched: no need to aggregate task_weight Peter Zijlstra
2008-06-27 11:41 ` [PATCH 14/30] sched: simplify the group load balancer Peter Zijlstra
2008-06-27 11:41 ` [PATCH 15/30] sched: fix newidle smp group balancing Peter Zijlstra
2008-06-27 11:41 ` [PATCH 16/30] sched: fix sched_balance_self() " Peter Zijlstra
2008-06-27 11:41 ` [PATCH 17/30] sched: persistent average load per task Peter Zijlstra
2008-06-27 11:41 ` [PATCH 18/30] sched: hierarchical load vs affine wakeups Peter Zijlstra
2008-06-27 11:41 ` [PATCH 19/30] sched: hierarchical load vs find_busiest_group Peter Zijlstra
2008-06-27 11:41 ` [PATCH 20/30] sched: fix load scaling in group balancing Peter Zijlstra
2008-06-27 11:41 ` [PATCH 21/30] sched: fix task_h_load() Peter Zijlstra
2008-06-27 11:41 ` [PATCH 22/30] sched: remove prio preference from balance decisions Peter Zijlstra
2008-06-27 11:41 ` [PATCH 23/30] sched: optimize effective_load() Peter Zijlstra
2008-06-27 11:41 ` [PATCH 24/30] sched: disable source/target_load bias Peter Zijlstra
2008-06-27 11:41 ` [PATCH 25/30] sched: fix shares boost logic Peter Zijlstra
2008-06-27 11:41 ` [PATCH 26/30] sched: update shares on wakeup Peter Zijlstra
2008-06-27 11:41 ` [PATCH 27/30] sched: fix mult overflow Peter Zijlstra
2008-06-27 11:41 ` [PATCH 28/30] sched: correct wakeup weight calculations Peter Zijlstra
2008-06-27 11:41 ` [PATCH 29/30] sched: incremental effective_load() Peter Zijlstra
2008-06-27 11:41 ` [PATCH 30/30] sched: bias effective_load() error towards failing wake_affine() Peter Zijlstra
2008-06-27 12:46 ` [PATCH 00/30] SMP-group balancer - take 3 Ingo Molnar
2008-06-27 17:33 ` Dhaval Giani
2008-06-28 17:08   ` Dhaval Giani
2008-06-30 12:59     ` Ingo Molnar
2008-06-30 14:53       ` Dhaval Giani
2008-07-01 10:57         ` Dhaval Giani
