public inbox for linux-kernel@vger.kernel.org
From: Andi Kleen <ak@muc.de>
To: Zoltan.Menyhart@bull.net
Cc: linux-kernel@vger.kernel.org
Subject: Re: NUMA API - wish list
Date: Mon, 03 May 2004 15:17:52 +0200
Message-ID: <m3k6zttzsv.fsf@averell.firstfloor.org>
In-Reply-To: <1RLdk-29R-11@gated-at.bofh.it> (Zoltan Menyhart's message of "Mon, 03 May 2004 15:00:14 +0200")

Zoltan Menyhart <Zoltan.Menyhart_AT_bull.net@nospam.org> writes:

> The work load manager / load balancer can negotiate other resource
> assignment at any time with the application.
> The work load manager / load balancer is free to move a collection of
> resources from some NUMA domains to others, provided the application's
> requirements are still met. (No hard binding.)

IMHO these are hard research topics that will need considerably
more work to be automated, if they can ever be automated at all.
The main problem is that you have several conflicting goals: you
want to use all available CPU power, all available memory,
all available memory bandwidth, and get the best average memory
latency. They all conflict.

First: basically any more advanced automatic scheme will
require going all the way to a full workload manager
that can move memory around later, because it is nearly impossible
to get even two of these goals right in advance.

I first tried to develop a NUMA scheduler (the "homenode scheduler") that
attempted to do a lot of this automatically.  I then realized that it
is just too hard to do and it never worked very well. That is why I
changed gears and just started with a simple API to let the user tell
the kernel what he wants.

The advantage of this is that a lot of complexity is avoided; 
e.g. the NUMA API avoids any need to move memory around.
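
As an illustration only, here is a minimal sketch of what "telling the
kernel what you want" can look like. It assumes the set_mempolicy(2)
system call as found in mainline Linux, which may differ in detail from
the interface being discussed in this thread. The helper names
(nodemask_set, set_policy), the POLICY_* constants and the one-word node
mask are hypothetical and exist only for this example; the numeric mode
values mirror the MPOL_* values in <linux/mempolicy.h>.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Mode values mirroring MPOL_* in <linux/mempolicy.h>. */
enum {
    POLICY_DEFAULT    = 0,  /* fall back to the system default policy */
    POLICY_PREFERRED  = 1,  /* prefer a node, fall back to others */
    POLICY_BIND       = 2,  /* allocate only from the given nodes */
    POLICY_INTERLEAVE = 3   /* spread pages across the given nodes */
};

/* Hypothetical helper: set bit `node` in a one-word node mask.
 * Assumption for this sketch: all node IDs fit in one unsigned long. */
static int nodemask_set(unsigned long *mask, int node)
{
    assert(mask != NULL);
    if (node < 0 || node >= (int)(sizeof(*mask) * CHAR_BIT))
        return -1;
    *mask |= 1UL << node;
    return 0;
}

/* Hypothetical helper: set the memory policy of the calling thread.
 * Only future allocations are affected; pages that are already
 * allocated stay where they are.
 * Returns 0 on success, -1 with errno set on failure. */
static int set_policy(int mode, unsigned long mask)
{
#ifdef SYS_set_mempolicy
    /* maxnode is passed as one more than the number of bits in the
     * mask, following common practice for this system call. */
    unsigned long maxnode = sizeof(mask) * CHAR_BIT + 1;
    return (int)syscall(SYS_set_mempolicy, mode, &mask, maxnode);
#else
    (void)mode;
    (void)mask;
    errno = ENOSYS;
    return -1;
#endif
}
```

A caller would build a mask with nodemask_set(&mask, 0) and then call
set_policy(POLICY_PREFERRED, mask) before allocating. The call can fail
(for example on a kernel without NUMA support or inside a restricted
sandbox), so the return value should be checked. Because the policy is
applied when pages are first allocated, nothing has to be migrated
afterwards.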

Now if somebody comes up with a good design for a workload manager and
does all the experiments needed to validate it, then it could be added
later. But deferring NUMA optimization efforts until this considerable
task is solved (if it even can be solved) would be a big mistake IMHO.

> Billing is done accordingly :-)
>
> As you do not need to know anything about SCSI LUNs, sector IDs,
> physical memory maps or the other applications when you compile your
> kernel, why should an application care for HW NUMA details ?

There is a big difference between these and NUMA. 

LUNs, sectors, and physical memory are all hidden for correctness. For
that, virtualization is fine, because performance is secondary
to correctness.

But NUMA knowledge is purely for optimization. And for optimization
purposes you want to avoid virtualization layers, because they get
in the way of your optimization efforts.

When a human does NUMA optimization they usually want to work near the
bare hardware.  And if your dream of an automatic workload manager ever
worked, it would also work on the bare hardware.

-Andi