From: Peter Zijlstra <peterz@infradead.org>
To: vatsa@linux.vnet.ibm.com
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Mike Galbraith <efault@gmx.de>,
Dmitry Adamushko <dmitry.adamushko@gmail.com>
Subject: Re: [PATCH 2/6] sched: make sched_slice() group scheduling savvy
Date: Thu, 01 Nov 2007 17:55:15 +0100
Message-ID: <1193936115.27652.319.camel@twins>
In-Reply-To: <20071101163120.GB20788@linux.vnet.ibm.com>
On Thu, 2007-11-01 at 22:01 +0530, Srivatsa Vaddagiri wrote:
> On Thu, Nov 01, 2007 at 01:20:08PM +0100, Peter Zijlstra wrote:
> > On Thu, 2007-11-01 at 13:03 +0100, Peter Zijlstra wrote:
> > > On Thu, 2007-11-01 at 12:58 +0100, Peter Zijlstra wrote:
> > >
> > > > > sched_slice() is about latency; its intended purpose is to ensure each
> > > > > task runs exactly once during sched_period() - which is
> > > > > sysctl_sched_latency when nr_running <= sysctl_sched_nr_latency, and
> > > > > which otherwise scales linearly with nr_running.
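
(For reference, the period computation being described looks roughly
like this - a sketch after kernel/sched_fair.c of this era, not a
verbatim copy:)

static u64 __sched_period(unsigned long nr_running)
{
	/*
	 * The period equals the latency target until there are more
	 * runnable tasks than it can cover, then stretches linearly
	 * with nr_running so each task still gets one slice.
	 */
	u64 period = sysctl_sched_latency;
	unsigned long nr_latency = sysctl_sched_nr_latency;

	if (unlikely(nr_running > nr_latency)) {
		period *= nr_running;
		do_div(period, nr_latency);
	}

	return period;
}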
> > >
> > > The thing that got my brain in a twist is what to do about the non-leaf
> > > nodes; for those it seems I'm not doing the right thing - I think.
> >
> > Ok, suppose a tree like so:
> >
> >
> > level 2          cfs_rq
> >                 A      B
> >
> > level 1    cfs_rqA       cfs_rqB
> >              A0          B0 - B99
> >
> > So for the sake of determinism, we want to impose a period in which all
> > level 1 tasks will have run (at least) once.
>
> Peter,
> I fail to see why this requirement to "determine a period in
> which all level 1 tasks will have run (at least) once" is essential.
Because it gives a steady feel to things. For humans it's essential
that things run in a predictable fashion. So not only does it matter how
much time a task gets, it also very much matters when it gets it.
That predictability contributes to the feeling of a gradual slowdown
as load increases.
> I am visualizing each group as similar to a Xen-like partition
> which is given fair timeslices by the hypervisor (the Linux kernel in this
> case). How each partition (group in this case) manages its allocated
> timeslice(s) to provide fairness to tasks within that partition/group should not
> (IMHO) depend on other groups, and esp. not on how many tasks other groups have.
Agreed. I've realised this since my last mail: one group should not
influence another in such a fashion, so in this respect you don't want to
flatten the hierarchy like I did.
> For ex: before this patch, fair time would be allocated to group and
> their tasks as below:
>
>     A0     B0-B9     A0     B10-B19     A0    B20-B29
> |--------|--------|--------|--------|--------|--------|-----//--|
> 0       10ms     20ms     30ms     40ms     50ms     60ms
>
> i.e. during the first 10ms allocated to group B, B0-B9 run;
> during the next 10ms allocated to group B, B10-B19 run, etc.
>
> What's wrong with this scheme?
What made me start tinkering here is that the nested level is again
distributing wall-time without being aware of the fraction it gets from
the upper levels.
So if we have two groups, A and B, and B is selected for half of the
period p, then the Bn tasks will again divide the full period among
themselves, even though in reality they only have p/2.
So I guess I need a top-down traversal, not a bottom-up traversal, to
get this fixed up... I'll ponder this.
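
(Something along these lines, perhaps - a hypothetical sketch, not the
posted patch: scale the slice by the entity's share of its cfs_rq's
load at every level. for_each_sched_entity() happens to walk bottom-up,
but since the per-level fractions simply multiply, the result is the
same as a top-down scaling:)

static u64 hier_sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
	u64 slice = __sched_period(cfs_rq->nr_running);

	/*
	 * Multiply by se's share of the total load at each level, so a
	 * nested group only redistributes the time it actually receives
	 * from the levels above it.
	 */
	for_each_sched_entity(se) {
		slice *= se->load.weight;
		do_div(slice, cfs_rq_of(se)->load.weight);
	}

	return slice;
}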
> By letting __sched_period() be determined for each group independently,
> we are building stronger isolation between them, which is good IMO
> (imagine a rogue container that does a fork bomb).
Agreed.
> > Index: linux-2.6/kernel/sched_fair.c
> > ===================================================================
> > --- linux-2.6.orig/kernel/sched_fair.c
> > +++ linux-2.6/kernel/sched_fair.c
> > @@ -341,7 +341,7 @@ static u64 sched_slice(struct cfs_rq *cf
> > do_div(slice, cfs_rq->load.weight);
> > }
> >
> > - return slice;
> > + return min_t(u64, sysctl_sched_latency, slice);
> which seems to give more or less what we already have w/o the
> patch?
Well, it's basically giving up on overload, admittedly not very nice.
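
(A toy userspace illustration with made-up numbers, assuming a 20ms
latency target and nr_latency of 5: 100 runnable tasks stretch the
period to 400ms, so an entity owning half the load computes a 200ms raw
slice, and the min_t() cap truncates it back to 20ms - that truncation
is the "giving up" on overload:)

#include <stdio.h>
#include <stdint.h>

/* Illustrative assumptions, not the kernel's actual defaults. */
static const uint64_t sched_latency    = 20000000ULL;	/* 20ms, in ns */
static const unsigned sched_nr_latency = 5;

static uint64_t period(unsigned nr_running)
{
	uint64_t p = sched_latency;

	if (nr_running > sched_nr_latency)
		p = p * nr_running / sched_nr_latency;
	return p;
}

int main(void)
{
	uint64_t p = period(100);	/* 100 runnable tasks -> 400ms */
	uint64_t raw = p / 2;		/* entity owning half the load */
	uint64_t capped = raw > sched_latency ? sched_latency : raw;

	/* prints: period=400ms raw_slice=200ms capped_slice=20ms */
	printf("period=%llums raw_slice=%llums capped_slice=%llums\n",
	       (unsigned long long)(p / 1000000),
	       (unsigned long long)(raw / 1000000),
	       (unsigned long long)(capped / 1000000));
	return 0;
}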