Re: [RFC, PATCH 1/2] sched: change the fairness model of the CFS group scheduler

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Dhaval Giani <dhaval@linux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Sudhir Kumar <skumar@linux.vnet.ibm.com>,
	Balbir Singh <balbir@in.ibm.com>,
	Aneesh Kumar KV <aneesh.kumar@linux.vnet.ibm.com>,
	lkml <linux-kernel@vger.kernel.org>,
	vgoyal@redhat.com, serue@us.ibm.com, menage@google.com
Subject: Re: [RFC, PATCH 1/2] sched: change the fairness model of the CFS group scheduler
Date: Fri, 29 Feb 2008 14:34:00 +0530	[thread overview]
Message-ID: <20080229090400.GD15328@linux.vnet.ibm.com> (raw)
In-Reply-To: <1204227768.6243.42.camel@lappy>

On Thu, Feb 28, 2008 at 08:42:48PM +0100, Peter Zijlstra wrote:
> 
> On Mon, 2008-02-25 at 19:46 +0530, Dhaval Giani wrote:
> > This patch allows tasks and groups to exist in the same cfs_rq. With this
> > change the CFS group scheduling follows a 1/(M+N) model from a 1/(1+N)
> > fairness model where M tasks and N groups exist at the cfs_rq level.
> 
> Good. You know I agree with this direction.
> 
> > Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
> > Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> > ---
> >  kernel/sched.c      |   46 +++++++++++++++++++++
> >  kernel/sched_fair.c |  113 +++++++++++++++++++++++++++++++++++++++++-----------
> >  2 files changed, 137 insertions(+), 22 deletions(-)
> > 
> > Index: linux-2.6.25-rc2/kernel/sched.c
> > ===================================================================
> > --- linux-2.6.25-rc2.orig/kernel/sched.c
> > +++ linux-2.6.25-rc2/kernel/sched.c
> > @@ -224,10 +224,13 @@ struct task_group {
> >  };
> >  
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> > +
> > +#ifdef CONFIG_USER_SCHED
> >  /* Default task group's sched entity on each cpu */
> >  static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
> >  /* Default task group's cfs_rq on each cpu */
> >  static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
> > +#endif
> >  
> >  static struct sched_entity *init_sched_entity_p[NR_CPUS];
> >  static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
> > @@ -7163,6 +7166,10 @@ static void init_tg_cfs_entry(struct rq 
> >  		list_add(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
> >  
> >  	tg->se[cpu] = se;
> > +	/* se could be NULL for init_task_group */
> > +	if (!se)
> > +		return;
> > +
> >  	se->cfs_rq = &rq->cfs;
> >  	se->my_q = cfs_rq;
> >  	se->load.weight = tg->shares;
> > @@ -7217,11 +7224,46 @@ void __init sched_init(void)
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> >  		init_task_group.shares = init_task_group_load;
> >  		INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
> > +#ifdef CONFIG_CGROUP_SCHED
> > +		/*
> > +		 * How much cpu bandwidth does init_task_group get?
> > +		 *
> > +		 * In case of task-groups formed thr' the cgroup filesystem, it
> > +		 * gets 100% of the cpu resources in the system. This overall
> > +		 * system cpu resource is divided among the tasks of
> > +		 * init_task_group and its child task-groups in a fair manner,
> > +		 * based on each entity's (task or task-group's) weight
> > +		 * (se->load.weight).
> > +		 *
> > +		 * In other words, if init_task_group has 10 tasks of weight
> > +		 * 1024) and two child groups A0 and A1 (of weight 1024 each),
> > +		 * then A0's share of the cpu resource is:
> > +		 *
> > +		 * 	A0's bandwidth = 1024 / (10*1024 + 1024 + 1024) = 8.33%
> > +		 *
> > +		 * We achieve this by letting init_task_group's tasks sit
> > +		 * directly in rq->cfs (i.e init_task_group->se[] = NULL).
> > +		 */
> > +		init_tg_cfs_entry(rq, &init_task_group, &rq->cfs, NULL, i, 1);
> > +		init_tg_rt_entry(rq, &init_task_group, &rq->rt, NULL, i, 1);
> 
> That seems to agree with that.
> 
> > +#elif defined CONFIG_USER_SCHED
> > +		/*
> > +		 * In case of task-groups formed thr' the user id of tasks,
> > +		 * init_task_group represents tasks belonging to root user.
> > +		 * Hence it forms a sibling of all subsequent groups formed.
> > +		 * In this case, init_task_group gets only a fraction of overall
> > +		 * system cpu resource, based on the weight assigned to root
> > +		 * user's cpu share (INIT_TASK_GROUP_LOAD). This is accomplished
> > +		 * by letting tasks of init_task_group sit in a separate cfs_rq
> > +		 * (init_cfs_rq) and having one entity represent this group of
> > +		 * tasks in rq->cfs (i.e init_task_group->se[] != NULL).
> > +		 */
> >  		init_tg_cfs_entry(rq, &init_task_group,
> >  				&per_cpu(init_cfs_rq, i),
> >  				&per_cpu(init_sched_entity, i), i, 1);
> 
> But I fail to parse this lengthy comment. What does it do:
> 
>     init_group
>    /     |    \
> uid-0 uid-1000 uid-n
> 
> or does it blend uid-0 into the init_group?
> 

It blends uid-0 (root) into init_group.

<snip>

> > @@ -1100,6 +1127,27 @@ static void check_preempt_wakeup(struct 
> >  	if (!sched_feat(WAKEUP_PREEMPT))
> >  		return;
> >  
> > +	/*
> > +	 * preemption test can be made between sibling entities who are in the
> > +	 * same cfs_rq i.e who have a common parent. Walk up the hierarchy of
> > +	 * both tasks untill we find their ancestors who are siblings of common
> > +	 * parent.
> > +	 */
> > +
> > +	/* First walk up until both entities are at same depth */
> > +	se_depth = depth_se(se);
> > +	pse_depth = depth_se(pse);
> > +
> > +	while (se_depth > pse_depth) {
> > +		se_depth--;
> > +		se = parent_entity(se);
> > +	}
> > +
> > +	while (pse_depth > se_depth) {
> > +		pse_depth--;
> > +		pse = parent_entity(pse);
> > +	}
> 
> Sad, but needed.. for now..
> 

better ideas if any are welcome! Cannot think of anything right now :(

-- 
regards,
Dhaval

next prev parent reply	other threads:[~2008-02-29  9:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-25 14:15 [RFC, PATCH 0/2] sched: add multiple hierarchy support to the CFS group scheduler Dhaval Giani
2008-02-25 14:16 ` [RFC, PATCH 1/2] sched: change the fairness model of " Dhaval Giani
2008-02-27 20:23   ` Serge E. Hallyn
2008-02-28 19:42   ` Peter Zijlstra
2008-02-29  9:04     ` Dhaval Giani [this message]
2008-02-29 10:37       ` Peter Zijlstra
2008-03-04  9:49         ` Dhaval Giani
2008-02-25 14:17 ` [RFC, PATCH 1/2] sched: allow the CFS group scheduler to have multiple levels Dhaval Giani
2008-02-25 14:17   ` Dhaval Giani
2008-02-28 19:42   ` Peter Zijlstra
2008-02-29  5:57     ` Dhaval Giani
  -- strict thread matches above, loose matches on Subject: below --
2008-02-29  0:43 [RFC, PATCH 1/2] sched: change the fairness model of the CFS group scheduler J.C. Pizarro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080229090400.GD15328@linux.vnet.ibm.com \
    --to=dhaval@linux.vnet.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=balbir@in.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=menage@google.com \
    --cc=mingo@elte.hu \
    --cc=serue@us.ibm.com \
    --cc=skumar@linux.vnet.ibm.com \
    --cc=vatsa@linux.vnet.ibm.com \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.