public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Douglas Niehaus <niehaus@ittc.ku.edu>
Cc: Henrik Austad <henrik@austad.us>,
	LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Bill Huey <billh@gnuppy.monkey.org>,
	Linux RT <linux-rt-users@vger.kernel.org>,
	Fabio Checconi <fabio@gandalf.sssup.it>,
	"James H. Anderson" <anderson@cs.unc.edu>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ted Baker <baker@cs.fsu.edu>,
	Dhaval Giani <dhaval.giani@gmail.com>,
	Noah Watkins <jayhawk@soe.ucsc.edu>,
	KUSP Google Group <kusp@googlegroups.com>
Subject: Re: RFC for a new Scheduling policy/class in the Linux-kernel
Date: Sun, 12 Jul 2009 17:31:48 +0200	[thread overview]
Message-ID: <1247412708.6704.105.camel@laptop> (raw)
In-Reply-To: <4A594D2D.3080101@ittc.ku.edu>

On Sat, 2009-07-11 at 21:40 -0500, Douglas Niehaus wrote:
> Peter:
>     Perhaps you could expand on what you meant when you said:
> 
> 	Thing is, both BWI and PEP seems to work brilliantly on Uni-Processor
> 	but SMP leaves things to be desired. Dhaval is currently working on a
> 	PEP implementation that will migrate all the blocked tasks to the
> 	owner's cpu, basically reducing it to the UP problem.
> 
> What is left to be desired with PEP on SMP? I am not saying it is 
> perfect, as I can think of a few things I would like to improve or 
> understand better, but I am curious what you have in mind.

Right, please don't take this as a critism against PEP, any scheme I
know of has enormous complications on SMP ;-)

But the thing that concerns me most, there seem to be a few O(n)
consequences. Suppose that for each resource (or lock) R_i, there is a
block graph G_i, which consists of n nodes and would be m deep.

Functionally (generalized) PIP and PEP are identical, their big
difference is that PIP uses waitqueues to encode the block graph G,
whereas PEP leaves everybody on the runqueue and uses the proxy field to
encode the block graph G.

The downside of PIP is that the waitqueue needs to re-implement the full
schedule function in order to evaluate the highest prio task on the
waitqueue. Ttraditionally this was rather easy, since you'd only
consider the limited SCHED_FIFO static prio range, leaving you with a
O(1) evaluation, when you add more complex scheduling functions things
get considerably more involved. Let's call this cost S.

So for PIP you get O(m*S) evaluations whenever you get a change to the
block graph.

Now for PEP, you get an increased O(m) cost on schedule, which can be
compared to the PIP cost.

However PEP on SMP needs to ensure all n tasks in G_i are on the same
cpu, because otherwise we can end up wanting to execute the resource
owner on multiple cpus at the same time, which is bad.

This can of course be amortized, but you end up having to migrate the
task (or an avatar thereof) to the owner's cpu (if you'd want to migrate
the owner to the blocker's cpu, then you're quickly into trouble when
there's multiple blockers), but any way around this ends up being O(n).

Also, when the owner gets blocked on something that doesn't have an
owner (io completion, or a traditional semaphore), you have to take all
n tasks from the runqueue (and back again when they do become runnable).

PIP doesn't suffer this, but does suffer the pain from having to
reimplement the full schedule function on the waitqueues, which when you
have hierarchical scheduling means you have to replicate the full
hierarchy per waitqueue.

Furthermore we cannot assume locked sections are short, and we must
indeed assume that its any resource in the kernel associated with any
service which can be used by any thread, worse, it can be any odd
userspace resource/thread too, since we expose the block graph to
userspace processes through PI-futexes.


  reply	other threads:[~2009-07-12 15:32 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-10 21:50 RFC for a new Scheduling policy/class in the Linux-kernel Henrik Austad
2009-07-11 18:28 ` Peter Zijlstra
2009-07-12  2:40   ` Douglas Niehaus
2009-07-12 15:31     ` Peter Zijlstra [this message]
2009-07-13 15:44       ` Raistlin
2009-07-13 16:33         ` Chris Friesen
2009-07-14 10:47           ` Raistlin
2009-07-14 11:03             ` Peter Zijlstra
2009-07-14 18:19               ` Raistlin
2009-07-14 14:48             ` Chris Friesen
2009-07-14 15:19               ` James H. Anderson
2009-07-14 16:31                 ` Peter Zijlstra
2009-07-14 16:54                   ` James H. Anderson
2009-07-14 19:28                     ` Henrik Austad
2009-07-14 19:33                       ` James H. Anderson
2009-07-15 21:53                       ` Ted Baker
2009-07-17  7:40                         ` Henrik Austad
2009-07-17 13:37                           ` Ted Baker
2009-07-15  4:25                     ` Bjoern B. Brandenburg
2009-07-15 20:55                     ` Ted Baker
2009-07-15 21:53                       ` Chris Friesen
2009-07-15 22:34                         ` Ted Baker
2009-07-15 22:39                           ` Dhaval Giani
2009-07-15 23:16                             ` Ted Baker
2009-07-16  8:58                               ` Peter Zijlstra
2009-07-16  9:11                                 ` Peter Zijlstra
2009-07-17  0:32                                 ` Raistlin
2009-07-17  0:43                                 ` Raistlin
2009-07-16 12:17                               ` Raistlin
2009-07-16 23:29                       ` Raistlin
2009-07-18 20:12                         ` Michal Sojka
2009-07-14 17:16                   ` James H. Anderson
2009-07-15 21:19                     ` Ted Baker
2009-07-14 19:54                   ` Raistlin
2009-07-14 16:48               ` Raistlin
2009-07-14 18:24                 ` Chris Friesen
2009-07-14 19:14                   ` Raistlin
2009-07-15 22:14                   ` Ted Baker
2009-07-16  7:17                     ` Henrik Austad
2009-07-16 23:13                       ` Ted Baker
2009-07-17  0:19                         ` Raistlin
2009-07-17  7:31                         ` Henrik Austad
2009-07-16 14:46                     ` Chris Friesen
2009-07-16 22:34                       ` Ted Baker
2009-07-16 23:07                         ` Raistlin
2009-07-15 21:45               ` Ted Baker
2009-07-15 22:12                 ` Chris Friesen
2009-07-15 22:52                   ` Ted Baker
2009-07-17 13:35             ` Giuseppe Lipari
2009-07-13 17:25         ` Peter Zijlstra
2009-07-13 18:14           ` Noah Watkins
2009-07-13 20:13             ` Ted Baker
2009-07-13 21:45               ` Chris Friesen
2009-07-14 11:16                 ` Raistlin
2009-07-15 23:11                 ` Ted Baker
2009-07-16  7:58                   ` Peter Zijlstra
2009-07-16  8:52                     ` Thomas Gleixner
2009-07-16 12:17                     ` Raistlin
2009-07-16 12:59                       ` James H. Anderson
2009-07-16 13:37                         ` Peter Zijlstra
2009-07-16 22:15                     ` Ted Baker
2009-07-16 22:34                       ` Karthik Singaram Lakshmanan
2009-07-16 23:38                         ` Ted Baker
2009-07-17  1:44                           ` Karthik Singaram Lakshmanan
2009-07-16 15:17                   ` Chris Friesen
2009-07-16 21:26                     ` Ted Baker
2009-07-16 22:08                       ` Chris Friesen
2009-07-16 23:54                         ` Ted Baker
2009-07-14  9:15             ` Peter Zijlstra
2009-07-14 19:07               ` Raistlin
2009-07-13 17:28         ` Peter Zijlstra
2009-07-14 19:47           ` Raistlin
     [not found]     ` <002301ca0403$47f9d9d0$d7ed8d70$@tlh@comcast.net>
2009-07-13 23:47       ` Douglas Niehaus
2009-07-14  7:27         ` Chris Friesen
2009-07-14  7:44           ` Douglas Niehaus
2009-07-12  6:17   ` Henrik Austad
2009-07-13  9:55   ` Raistlin
2009-07-13 10:14     ` Peter Zijlstra
2009-07-13 16:06       ` Raistlin
2009-07-14  8:42         ` Peter Zijlstra
2009-07-14  9:36           ` Raistlin
  -- strict thread matches above, loose matches on Subject: below --
2009-07-16 17:54 Raj Rajkumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1247412708.6704.105.camel@laptop \
    --to=a.p.zijlstra@chello.nl \
    --cc=anderson@cs.unc.edu \
    --cc=baker@cs.fsu.edu \
    --cc=billh@gnuppy.monkey.org \
    --cc=dhaval.giani@gmail.com \
    --cc=fabio@gandalf.sssup.it \
    --cc=henrik@austad.us \
    --cc=jayhawk@soe.ucsc.edu \
    --cc=kusp@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=niehaus@ittc.ku.edu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox