Re: [PATCH][RFC] Proposal For A More Scalable Scheduler ...

public inbox for linux-kernel@vger.kernel.org

From: Hubertus Franke <frankeh@watson.ibm.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: lkml <linux-kernel@vger.kernel.org>, lse-tech@lists.sourceforge.net
Subject: Re: [PATCH][RFC] Proposal For A More Scalable Scheduler ...
Date: Tue, 30 Oct 2001 11:29:37 -0500
Message-ID: <20011030112937.A16154@watson.ibm.com>
In-Reply-To: <20011030092834.A16050@watson.ibm.com> <Pine.LNX.4.40.0110300914450.1495-100000@blue1.dev.mcafeelabs.com>
In-Reply-To: <Pine.LNX.4.40.0110300914450.1495-100000@blue1.dev.mcafeelabs.com>; from Davide Libenzi on Tue, Oct 30, 2001 at 09:19:05AM -0800

* Davide Libenzi <davidel@xmailserver.org> [20011030 12:19]:
> On Tue, 30 Oct 2001, Hubertus Franke wrote:
> 
> > Davide, nice analysis.
> > I want to point out that some (not all) of the stuff is already done
> > in our scalable MQ scheduler (http://lse.sourceforge.net/scheduling).
> >
> > What we have:
> > -------------
> > Multiple queues, each protected by its own lock to avoid
> > contention.
> > Automatic load balancing across all queues (yes, that creates overhead).
> > CPU pooling as a configurable means to get from isolated queues to a fully
> > balanced (global scheduling decision) scheduler.
> > Also some initial placement on the least loaded runqueue in the least
> > loaded pool.
> >
> > We look at this as a configurable infrastructure....
> >
> > What we don't have:
> > -------------------
> >
> > The removal of PROC_CHANGE_PENALTY with a time decay cache affinity definition.
> >
> >
> > At ALS I will be reporting on our experience with what we have
> > for an 8-way system and a 4x4-way NUMA system (OSDL)
> > wrt early placement, choice of best pool size ?
> >
> > You can get an early start at:
> > 	http://lse.sourceforge.net/scheduling/als2001/pmqs.ps
> 
> I see the proposed implementation as a decisive break from trying to have
> processes instantly moved across CPUs and stuff like na_goodness, etc.
> Inside each CPU the scheduler is _exactly_ the same as the UP one.
> 

Well, to that extent that is what MQ does too. We make a local decision
first and then compare across multiple queues. In the pooling approach
we limit that global check to CPUs in close proximity.
I think your CPU History Weight could fit into this model as well.
We don't care how the local decision was reached.
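
To make the "local first, then compare within the pool" idea concrete,
here is a rough sketch. It is not the MQ code: NCPUS, POOL_SIZE,
struct local_best and pick_queue() are made-up names, and the real
scheduler also has to deal with locking and affinity, which this ignores.

```c
#include <assert.h>
#include <stddef.h>

#define NCPUS     8   /* hypothetical machine size */
#define POOL_SIZE 4   /* hypothetical; POOL_SIZE == NCPUS means fully global */

/* Best candidate of one runqueue, however the local scheduler chose it. */
struct local_best {
    int task_id;   /* -1 if the queue is empty */
    int goodness;  /* higher is better */
};

/*
 * Start from this CPU's own local best, then compare it against the
 * local best of every other CPU in the same pool. Returns the CPU whose
 * candidate should run next on this_cpu; ties go to the local queue.
 */
static int pick_queue(const struct local_best best[NCPUS], int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < NCPUS);

    int pool_start = (this_cpu / POOL_SIZE) * POOL_SIZE;
    int winner = this_cpu;

    for (int cpu = pool_start;
         cpu < pool_start + POOL_SIZE && cpu < NCPUS; cpu++) {
        if (best[cpu].task_id < 0)
            continue;               /* nothing runnable on that queue */
        if (best[winner].task_id < 0 ||
            best[cpu].goodness > best[winner].goodness)
            winner = cpu;
    }
    return winner;
}
```

The per-queue candidate is opaque here on purpose: whatever produces
the goodness value (including a history-based weight) plugs in
without changing the cross-queue comparison.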

There is, however, another problem that you haven't addressed yet, which
is realtime. As far as I can tell, the realtime semantics require a
strict ordering of RT processes with respect to each other and their priorities.
The general approach can be either to limit all RT processes to a single CPU
or, as we have done, to declare a global RT runqueue.
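
As an illustration of the global RT runqueue option, a minimal sketch
follows. The names (struct task, pick_rt, pick_next) are invented for
this example and are not the MQ identifiers; it also ignores locking and
RT tasks that are already running on another CPU.

```c
#include <assert.h>
#include <stddef.h>

struct task {
    int id;
    int rt_priority;   /* only meaningful for realtime tasks */
};

/*
 * Return the highest-priority task in the shared realtime queue, or
 * NULL if the queue is empty.
 */
static const struct task *pick_rt(const struct task *rt_queue, size_t n)
{
    assert(rt_queue != NULL || n == 0);

    const struct task *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || rt_queue[i].rt_priority > best->rt_priority)
            best = &rt_queue[i];
    return best;
}

/*
 * Every CPU consults the one shared RT queue before its own local
 * choice, so a runnable realtime task always wins over a local
 * non-realtime candidate, whichever CPU is asking.
 */
static const struct task *pick_next(const struct task *rt_queue, size_t n_rt,
                                    const struct task *local_choice)
{
    const struct task *rt = pick_rt(rt_queue, n_rt);
    return rt != NULL ? rt : local_choice;
}
```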

> 
> 
> > Are you going to be at ALS? Maybe we can chat about what the pros and cons
> > of each approach are and whether we could/should merge things together.
> > I am very intrigued by the "CPU History Weight" that I see as a major
> > add-on to our stuff. What I am not so keen about is the fact
> > you seem to only do load-balancing at fork and idle time.
> > In a loaded system that can lead to load imbalances.
> >
> > We do a periodic (configurable) call, which has also some drawbacks.
> > Another thing that needs to be thought about is the metric used
> > to determine <load> on a queue. For simplicity, runqueue length is
> > one indication, for fairness, maybe the sum of nice-value would be ok.
> > We experimented with both and didn't see too much of a difference, however
> > measuring fairness is difficult to do.
> 
> Hey, ... that's part of Episode 2 "Balancing the world", where the evil
> Mr. MoveSoon fights with Hysteresis for the universe domination :)
> 
> 

Well, one has to be careful: if the system is loaded and processes are
long-lived rather than coming and going, Initial Placement and Idle-Loop
Load Balancing don't get you very far with respect to decent load balancing.
In these kinds of scenarios, one needs a feedback system. The trick is to come
up with an algorithm that is not too intrusive and does not overcorrect.
Take a look at the paper link, where we experimented with some of these
issues. We tolerated some difference in runqueue length before rebalancing.
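
A toy version of such a feedback step, assuming runqueue length as the
load metric, might look like the following. The names NCPUS and
balance_once() and the one-task-per-pass policy are assumptions made for
illustration, not what the paper implements.

```c
#include <assert.h>

#define NCPUS 4   /* hypothetical machine size */

/*
 * One periodic balancing pass. Moves at most one task, from the longest
 * to the shortest runqueue, and only if their lengths differ by more
 * than `tolerance`. Returns 1 if a task was moved, 0 otherwise.
 *
 * tolerance must be at least 1: with 0, two queues that differ by a
 * single task would hand that task back and forth forever.
 */
static int balance_once(int rq_len[NCPUS], int tolerance)
{
    assert(tolerance >= 1);

    int busiest = 0, idlest = 0;
    for (int cpu = 1; cpu < NCPUS; cpu++) {
        if (rq_len[cpu] > rq_len[busiest])
            busiest = cpu;
        if (rq_len[cpu] < rq_len[idlest])
            idlest = cpu;
    }

    if (rq_len[busiest] - rq_len[idlest] <= tolerance)
        return 0;           /* within tolerance: leave things alone */

    rq_len[busiest]--;      /* migrate a single task per pass */
    rq_len[idlest]++;
    return 1;
}
```

Moving one task per pass keeps the correction small, and the tolerance
acts as the hysteresis: the larger it is, the less migration and the
more imbalance is accepted.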

> 
> 
> - Davide
> 

:-#  Hubertus
