public inbox for linux-kernel@vger.kernel.org
From: Andrea Arcangeli <andrea@suse.de>
To: Ingo Molnar <mingo@elte.hu>
Cc: Hubertus Franke <frankeh@us.ibm.com>,
	Mike Kravetz <mkravetz@sequent.com>,
	Fabio Riccardi <fabio@chromium.com>,
	Linux Kernel List <linux-kernel@vger.kernel.org>,
	lse-tech@lists.sourceforge.net
Subject: Re: a quest for a better scheduler
Date: Wed, 4 Apr 2001 17:08:47 +0200
Message-ID: <20010404170846.V20911@athlon.random>
In-Reply-To: <OF401BD38B.CF3B1E9F-ON85256A24.0048543A@pok.ibm.com> <Pine.LNX.4.30.0104041527190.5382-100000@elte.hu>
In-Reply-To: <Pine.LNX.4.30.0104041527190.5382-100000@elte.hu>; from mingo@elte.hu on Wed, Apr 04, 2001 at 03:34:22PM +0200

On Wed, Apr 04, 2001 at 03:34:22PM +0200, Ingo Molnar wrote:
> 
> On Wed, 4 Apr 2001, Hubertus Franke wrote:
> 
> > Another point to raise is that the current scheduler does a exhaustive
> > search for the "best" task to run. It touches every process in the
> > runqueue. this is ok if the runqueue length is limited to a very small
> > multiple of the #cpus. [...]
> 
> indeed. The current scheduler handles UP and SMP systems, up to 32
> (perhaps 64) CPUs efficiently. Agressively NUMA systems need a different
> approach anyway in many other subsystems too, Kanoj is doing some
> scheduler work in that area.

I haven't seen anything from Kanoj, but I did something myself for the Wildfire:

	ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.3aa1/10_numa-sched-1

This is mostly a userspace issue and not really intended as a kernel
optimization (although it is partly that as well). Basically it splits the load
of the NUMA machine into per-node load: the load can be unbalanced across
nodes, but fairness is guaranteed inside each node. It's not extremely well
tested, but the benchmarks were OK and it is at least stable.
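
To make the per-node idea concrete, here is a minimal standalone sketch. It is
not code from the patch: struct node_rq, pick_next_on_node(), the goodness
field and the MAX_* limits are all hypothetical names chosen for illustration.
It only shows that a CPU looking for work scans its own node's queue, so the
cost of the search depends on the local load and not on the whole machine.

```c
#include <stddef.h>

#define MAX_NODES 4   /* hypothetical node count */
#define MAX_TASKS 8   /* hypothetical per-node runqueue capacity */

struct task {
    int id;
    int goodness;     /* higher means more deserving of the CPU */
};

/* One runqueue per NUMA node (illustrative layout, not the patch's). */
struct node_rq {
    struct task tasks[MAX_TASKS];
    size_t nr_running;
};

/*
 * Scan only the runqueue of the given node and return its best task,
 * or NULL if that node has nothing runnable. Tasks queued on other
 * nodes are never touched, so load may be unbalanced across nodes
 * while the choice inside one node stays fair.
 */
static const struct task *pick_next_on_node(const struct node_rq *rqs,
                                            size_t node)
{
    const struct node_rq *rq = &rqs[node];
    const struct task *best = NULL;

    for (size_t i = 0; i < rq->nr_running; i++) {
        if (best == NULL || rq->tasks[i].goodness > best->goodness)
            best = &rq->tasks[i];
    }
    return best;
}
```

Note that a task with a higher goodness sitting on another node is ignored by
design; that is the cross-node imbalance mentioned above.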

However, Ingo, consider that on a 32-way, if you don't _always_ have at least 32
tasks running, you're really stupid to pay such big money for nothing ;). So
the fact that the scheduler is optimized for 1 or 2 tasks running all the time
is not nearly enough for those machines (and of course the scheduling rate also
increases linearly with the number of CPUs). Now, it's perfectly fine that we
don't ask the embedded and desktop guys to pay for that, but a kernel
configuration option to select an algorithm that scales would be a good idea
IMHO. The above patch just adds a CONFIG_NUMA_SCHED. The scalable algorithm can
fit into it and nobody will be hurt by that (CONFIG_NUMA_SCHED cannot even be
selected by x86 compiles).
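
As a rough illustration of how a build-time option keeps the cost away from
everybody else, here is a small sketch. Only the CONFIG_NUMA_SCHED symbol comes
from the patch; scan_best(), pick_next() and the flat task array are
assumptions made up for this example and do not reflect the real scheduler
code.

```c
#include <stddef.h>

struct task {
    int id;
    int goodness;   /* higher means more deserving of the CPU */
};

/* Linear scan shared by both variants: best task among t[0..n). */
static const struct task *scan_best(const struct task *t, size_t n)
{
    const struct task *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || t[i].goodness > best->goodness)
            best = &t[i];
    }
    return best;
}

#ifdef CONFIG_NUMA_SCHED
/* Scalable variant: look only at the caller's node slice.
 * Assumes node_start + node_len <= n. */
static const struct task *pick_next(const struct task *all, size_t n,
                                    size_t node_start, size_t node_len)
{
    (void)n;
    return scan_best(all + node_start, node_len);
}
#else
/* Default variant: exhaustive scan of the single global runqueue. */
static const struct task *pick_next(const struct task *all, size_t n,
                                    size_t node_start, size_t node_len)
{
    (void)node_start;
    (void)node_len;
    return scan_best(all, n);
}
#endif
```

Built without -DCONFIG_NUMA_SCHED, the node arguments are ignored and the
behaviour is the plain global scan, so small machines see no change.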

Andrea
