From: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Mike Galbraith <efault@gmx.de>,
Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy
Date: Thu, 1 Nov 2007 17:01:38 +0530
Message-ID: <20071101113138.GA20788@linux.vnet.ibm.com>
In-Reply-To: <20071031211248.796653000@chello.nl>
On Wed, Oct 31, 2007 at 10:10:32PM +0100, Peter Zijlstra wrote:
> Currently the ideal slice length does not take group scheduling into account.
> Change it so that it properly takes all the runnable tasks on this cpu into
> account and calculate the weight according to the grouping hierarchy.
>
> Also fixes a bug in vslice which missed a factor NICE_0_LOAD.
>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> ---
> kernel/sched_fair.c | 42 +++++++++++++++++++++++++++++++-----------
> 1 file changed, 31 insertions(+), 11 deletions(-)
>
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -331,10 +331,15 @@ static u64 __sched_period(unsigned long
>   */
>  static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> -	u64 slice = __sched_period(cfs_rq->nr_running);
> +	unsigned long nr_running = rq_of(cfs_rq)->nr_running;
> +	u64 slice = __sched_period(nr_running);
>
> -	slice *= se->load.weight;
> -	do_div(slice, cfs_rq->load.weight);
> +	for_each_sched_entity(se) {
> +		cfs_rq = cfs_rq_of(se);
> +
> +		slice *= se->load.weight;
> +		do_div(slice, cfs_rq->load.weight);
> +	}
>
>  	return slice;

Let's say we have two groups, A and B, on CPU0, of equal weight (1024).
Further:

	A has 1 task (A0)
	B has 1000 tasks (B0 .. B999)

Agreed, it's an extreme case, but it illustrates the problem I have in mind
with this patch.

All tasks have the same weight (1024).

Before this patch
=================
sched_slice(grp A) = 20ms * 1/2 = 10ms
sched_slice(A0) = 20ms
sched_slice(grp B) = 20ms * 1/2 = 10ms
sched_slice(B0) = (20ms * 1000/20) * 1 / 1000 = 1ms
sched_slice(B1) = ... = sched_slice(B999) = 1ms
Fairness between groups and tasks would be obtained as below:

    A0     B0-B9      A0     B10-B19    A0     B20-B29
|--------|--------|--------|--------|--------|--------|-----//--|
0       10ms     20ms     30ms     40ms     50ms     60ms

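The "before" numbers can be reproduced with a small user-space model. This is only a sketch, not kernel code: LATENCY_NS and NR_LATENCY are assumed values read off the arithmetic above (20ms, and the divisor 20), and sched_period() and slice_before() are hypothetical names standing in for __sched_period() and the pre-patch sched_slice().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tunables, taken from the arithmetic in this mail. */
#define LATENCY_NS 20000000ULL	/* 20ms scheduling latency */
#define NR_LATENCY 20UL		/* entities that fit in one latency period */

/* The period stretches linearly once more than NR_LATENCY entities run. */
static uint64_t sched_period(unsigned long nr_running)
{
	uint64_t period = LATENCY_NS;

	if (nr_running > NR_LATENCY) {
		period *= nr_running;
		period /= NR_LATENCY;
	}
	return period;
}

/*
 * Pre-patch model: only the entity's own runqueue matters, i.e. the
 * number of entities queued there and the entity's share of its weight.
 */
static uint64_t slice_before(unsigned long cfs_nr_running,
			     unsigned long se_weight,
			     unsigned long cfs_rq_weight)
{
	assert(cfs_rq_weight != 0);
	return sched_period(cfs_nr_running) * se_weight / cfs_rq_weight;
}
```

With these assumptions, slice_before(2, 1024, 2048) gives 10000000 ns for either group, slice_before(1, 1024, 1024) gives 20000000 ns for A0, and slice_before(1000, 1024, 1024000) gives 1000000 ns for each task in B.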
After this patch
================
sched_slice(grp A) = (20ms * 1001/20) * 1/2 ~= 500ms
sched_slice(A0) = 500ms
sched_slice(grp B) = 500ms
sched_slice(B0) = 0.5ms
Fairness between groups and tasks would be obtained as below:

           A0                   B0 - B999                  A0
|-----------------------|-----------------------|-----------------------|
0                     500ms                  1000ms                  1500ms

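The "after" numbers follow from walking up the hierarchy and scaling at each level. Again a user-space sketch under assumptions, not the kernel implementation: LATENCY_NS and NR_LATENCY are inferred from the arithmetic above, and struct level, sched_period() and slice_after() are hypothetical names that model the for_each_sched_entity() loop in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed tunables, taken from the arithmetic in this mail. */
#define LATENCY_NS 20000000ULL	/* 20ms scheduling latency */
#define NR_LATENCY 20UL		/* entities that fit in one latency period */

/* One step of the hierarchy walk: an entity and the runqueue it sits on. */
struct level {
	unsigned long se_weight;	/* weight of the entity at this level */
	unsigned long rq_weight;	/* total weight of its runqueue */
};

static uint64_t sched_period(unsigned long nr_running)
{
	uint64_t period = LATENCY_NS;

	if (nr_running > NR_LATENCY) {
		period *= nr_running;
		period /= NR_LATENCY;
	}
	return period;
}

/*
 * Post-patch model: the period is based on every runnable task on the
 * CPU, then scaled by the entity's share at each level, leaf to root.
 */
static uint64_t slice_after(unsigned long rq_nr_running,
			    const struct level *levels, size_t depth)
{
	uint64_t slice = sched_period(rq_nr_running);

	for (size_t i = 0; i < depth; i++) {
		assert(levels[i].rq_weight != 0);
		slice *= levels[i].se_weight;
		slice /= levels[i].rq_weight;
	}
	return slice;
}
```

For A0 the levels are {1024, 1024} and then {1024, 2048}, which with 1001 runnable tasks gives 500500000 ns. For B0 the levels are {1024, 1024000} and then {1024, 2048}, which gives 500500 ns.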
Did I get it right? If so, I don't like the fact that group A is allowed to run
for a long time (500ms) before group B gets a chance.

What real problem is being addressed by this change to sched_slice()?

--
Regards,
vatsa