From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: vatsa@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
    Mike Galbraith <efault@gmx.de>,
    Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy
Date: Thu, 01 Nov 2007 12:51:52 +0100
Message-ID: <1193917912.27652.258.camel@twins>
In-Reply-To: <20071101113138.GA20788@linux.vnet.ibm.com>
On Thu, 2007-11-01 at 17:01 +0530, Srivatsa Vaddagiri wrote:
> On Wed, Oct 31, 2007 at 10:10:32PM +0100, Peter Zijlstra wrote:
> > Currently the ideal slice length does not take group scheduling into account.
> > Change it so that it properly takes all the runnable tasks on this cpu into
> > account and calculate the weight according to the grouping hierarchy.
> >
> > Also fixes a bug in vslice which missed a factor NICE_0_LOAD.
> >
> > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > CC: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> > ---
> > kernel/sched_fair.c | 42 +++++++++++++++++++++++++++++++-----------
> > 1 file changed, 31 insertions(+), 11 deletions(-)
> >
> > Index: linux-2.6/kernel/sched_fair.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/sched_fair.c
> > +++ linux-2.6/kernel/sched_fair.c
> > @@ -331,10 +331,15 @@ static u64 __sched_period(unsigned long
> >   */
> >  static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -	u64 slice = __sched_period(cfs_rq->nr_running);
> > +	unsigned long nr_running = rq_of(cfs_rq)->nr_running;
> > +	u64 slice = __sched_period(nr_running);
> >
> > -	slice *= se->load.weight;
> > -	do_div(slice, cfs_rq->load.weight);
> > +	for_each_sched_entity(se) {
> > +		cfs_rq = cfs_rq_of(se);
> > +
> > +		slice *= se->load.weight;
> > +		do_div(slice, cfs_rq->load.weight);
> > +	}
> >
> >  	return slice;
>
>
> Let's say we have two groups A and B on CPU0, of equal weight (1024).
>
> Further,
>
> A has 1 task (A0)
> B has 1000 tasks (B0 .. B999)
>
> Agreed, it's an extreme case, but it illustrates the problem I have in mind
> for this patch.
>
> All tasks of same weight=1024.
>
> Before this patch
> =================
>
> sched_slice(grp A) = 20ms * 1/2 = 10ms
> sched_slice(A0) = 20ms
>
> sched_slice(grp B) = 20ms * 1/2 = 10ms
> sched_slice(B0) = (20ms * 1000/20) * 1 / 1000 = 1ms
> sched_slice(B1) = ... = sched_slice(B999) = 1ms
>
> Fairness between groups and tasks would be obtained as below:
>
>     A0     B0-B9      A0    B10-B19     A0    B20-B29
> |--------|--------|--------|--------|--------|--------|-----//--|
> 0       10ms     20ms     30ms     40ms     50ms     60ms
>
> After this patch
> ================
>
> sched_slice(grp A) = (20ms * 1001/20) * 1/2 ~= 500ms
> sched_slice(A0) = 500ms

Hmm, right, that is indeed not intended.

> sched_slice(grp B) = 500ms
> sched_slice(B0) = 0.5ms

This 0.5ms is indeed correct, whereas the previous 1ms was not.

> Fairness between groups and tasks would be obtained as below:
>
>            A0                   B0 - B999                  A0
> |-----------------------|-----------------------|-----------------------|
> 0                     500ms                  1000ms                  1500ms
>
> Did I get it right? If so, I don't like the fact that group A is allowed to run
> for a long time (500ms) before giving group B a chance.

Hmm, quite bad indeed.

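The before/after figures in the example can be checked with a small userspace sketch. It is not kernel code: `struct entity`, `scale_one_level()` and `slice_walk()` are hypothetical names, plain 64-bit division stands in for do_div(), and the periods passed in (20ms, 1000ms, 1001ms) are assumed from the example's 20ms latency and its divisor of 20.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a sched_entity: its own weight, the total
 * weight of the runqueue it is queued on, and its parent group entity
 * (NULL at the top level). */
struct entity {
	uint64_t weight;
	uint64_t rq_weight;
	const struct entity *parent;
};

/* Pre-patch behaviour: scale by the share on the entity's own runqueue only. */
static uint64_t scale_one_level(uint64_t slice_ns, const struct entity *se)
{
	return slice_ns * se->weight / se->rq_weight;
}

/* Post-patch behaviour: apply the same scaling at every level up to the
 * root, mirroring the for_each_sched_entity() loop in the patch. */
static uint64_t slice_walk(uint64_t period_ns, const struct entity *se)
{
	uint64_t slice_ns = period_ns;

	for (; se != NULL; se = se->parent)
		slice_ns = scale_one_level(slice_ns, se);
	return slice_ns;
}
```

With two group entities of weight 1024 on a root runqueue of weight 2048, A0 alone in group A (runqueue weight 1024) and B0 in group B (runqueue weight 1024000), slice_walk(1001000000, A0) gives 500500000 ns and slice_walk(1001000000, B0) gives 500500 ns, matching the ~500ms and 0.5ms above. scale_one_level(1000000000, B0) gives the pre-patch 1ms.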
> Can I know what real problem is being addressed by this change to
> sched_slice()?

sched_slice() is about latency; its intended purpose is to ensure each
task is run exactly once during sched_period(), which is
sysctl_sched_latency when nr_running <= sysctl_sched_nr_latency and
otherwise scales the latency linearly with nr_running.
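
The period rule described above can be sketched in userspace C as follows. This is an illustration, not the kernel's __sched_period(): `period_ns()` is a hypothetical name, and the constants assume a 20ms latency target and a threshold of 20 tasks, which are the values implied by the arithmetic quoted earlier in this mail.

```c
#include <stdint.h>

/* Assumed values, inferred from the "20ms * 1000/20" arithmetic in the
 * quoted example; the real tunables may be set differently. */
#define LATENCY_NS  20000000ULL  /* stand-in for sysctl_sched_latency */
#define NR_LATENCY  20UL         /* stand-in for sysctl_sched_nr_latency */

/* Period stays at the latency target up to NR_LATENCY runnable tasks,
 * then grows linearly with the number of runnable tasks. */
static uint64_t period_ns(unsigned long nr_running)
{
	uint64_t period = LATENCY_NS;

	if (nr_running > NR_LATENCY)
		period = period * nr_running / NR_LATENCY;
	return period;
}
```

Under these assumptions period_ns(2) is 20ms, period_ns(1000) is 1000ms and period_ns(1001) is 1001ms, which are the periods used in the example above.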
Thread overview: 22+ messages
2007-10-31 21:10 [PATCH 0/6] various scheduler patches Peter Zijlstra
2007-10-31 21:10 ` [PATCH 1/6] sched: move the group scheduling primitives around Peter Zijlstra
2007-10-31 21:10 ` [PATCH 2/6] sched: make sched_slice() group scheduling savvy Peter Zijlstra
2007-11-01 11:31 ` Srivatsa Vaddagiri
2007-11-01 11:51 ` Peter Zijlstra [this message]
2007-11-01 11:58 ` Peter Zijlstra
2007-11-01 12:03 ` Peter Zijlstra
2007-11-01 12:20 ` Peter Zijlstra
2007-11-01 16:31 ` Srivatsa Vaddagiri
2007-11-01 16:55 ` Peter Zijlstra
2007-10-31 21:10 ` [PATCH 3/6] sched: high-res preemption tick Peter Zijlstra
2007-10-31 21:53 ` Andi Kleen
2007-10-31 22:04 ` Peter Zijlstra
2007-11-01 10:12 ` Peter Zijlstra
2007-10-31 21:10 ` [PATCH 4/6] sched: sched_rt_entity Peter Zijlstra
2007-10-31 21:10 ` [PATCH 5/6] sched: SCHED_FIFO/SCHED_RR watchdog timer Peter Zijlstra
2007-10-31 21:49 ` Andi Kleen
2007-10-31 22:03 ` Peter Zijlstra
2007-11-03 18:16 ` Andi Kleen
2007-10-31 21:10 ` [PATCH 6/6] sched: place_entity() comments Peter Zijlstra
2007-11-01 8:29 ` [PATCH 0/6] various scheduler patches Ingo Molnar
2007-11-01 10:08 ` Peter Zijlstra