public inbox for linux-kernel@vger.kernel.org
* Node Affine NUMA scheduler, updated
@ 2002-05-01 10:22 Erich Focht
From: Erich Focht @ 2002-05-01 10:22 UTC (permalink / raw)
  To: LSE; +Cc: linux-kernel

Hi,

an updated patch for the node affine NUMA scheduler extension based on the 
O(1) scheduler can be found at 
http://home.arcor.de/efocht/sched/Nod15_O1-2.4.18.patch

Detailed information on the implementation is at 
http://home.arcor.de/efocht/sched .

What's new:
The topology information has been updated and supports the following ccNUMA 
platforms:
 - IBM NUMA-Q - i386 (thanks to Matt Dobson),
 - SGI SN1/2 - ia64 (thanks to Jesse Barnes),
 - NEC AzusA - ia64
No other i386 platforms have been tested yet.

The topology info now distinguishes between logical and physical nodes, and 
its variables are protected by a rw_lock. This was required for the 
integration with the cpu-hotplug patch.

Two configuration variables control how the scheduler works:
 - CONFIG_NUMA_SCHED : switches on the pooling scheduler (otherwise it 
   behaves like the O(1) scheduler, though the code looks different).
 - CONFIG_NODE_AFFINE_SCHED: tasks remember their homenode and are attracted 
   back to it.
For platforms with a big node-level cache it might be better to configure 
only CONFIG_NUMA_SCHED=y and leave CONFIG_NODE_AFFINE_SCHED undefined. This 
is the better choice if the penalty for thrashing the node-level cache is 
bigger than the benefit of running on the right node (where the memory is 
allocated).

I added a variable node_policy to the task structure. It is inheritable and 
decides on the initial load balancing. There is a prctl interface to change 
it from userland; a utility called nodpol is available on the web page. The 
possible values for node_policy are:
  0 (default) : select homenode in do_exec(),
  1           : select homenode in do_fork() only if CLONE_VM is unset,
  2           : select homenode in do_fork() (always).
It's mainly meant for experiments and benchmarks. Some benchmarks (e.g. AIM7, 
which simulates large loads) only fork but don't exec, so the default 
homenode selection mechanism doesn't apply and the load balance is bad right 
from the start. In real life one should just check whether a multithreaded 
job needs to be distributed across multiple nodes or is better off taking its 
memory from a single node, and set node_policy accordingly before starting 
it. The default behavior should be fine otherwise.
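
The three node_policy values can be read as a small decision table. The 
following is a minimal sketch of that table, not code from the patch. The 
enum names, the helper selects_homenode() and the event type are made up for 
illustration. It also assumes that each policy selects a homenode only at the 
point listed above, which the description does not state explicitly. Only the 
CLONE_VM value (0x100) is taken from the Linux headers.

```c
#include <stdbool.h>

/* Value of CLONE_VM in the Linux headers (child shares the address space). */
#define SKETCH_CLONE_VM 0x00000100UL

/* Hypothetical names for the three node_policy values. */
enum node_policy {
    NODPOL_EXEC        = 0, /* default: select homenode in do_exec()      */
    NODPOL_FORK_NOVM   = 1, /* select in do_fork() only without CLONE_VM  */
    NODPOL_FORK_ALWAYS = 2  /* select in do_fork() unconditionally        */
};

/* Hypothetical: the place where the decision is being made. */
enum sched_event { EV_EXEC, EV_FORK };

/*
 * Returns true if a new homenode would be selected at this point.
 * Assumption: policies 1 and 2 do not additionally select at exec time.
 */
static bool selects_homenode(enum node_policy policy, enum sched_event ev,
                             unsigned long clone_flags)
{
    switch (policy) {
    case NODPOL_EXEC:
        return ev == EV_EXEC;
    case NODPOL_FORK_NOVM:
        return ev == EV_FORK && !(clone_flags & SKETCH_CLONE_VM);
    case NODPOL_FORK_ALWAYS:
        return ev == EV_FORK;
    }
    return false;
}
```

Under this reading, a fork-only benchmark such as AIM7 never reaches the 
selection point with policy 0, while threads created with CLONE_VM are only 
spread across nodes with policy 2.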

On the web page I included some results showing the performance increase with 
the node affine scheduler and demonstrating its functionality. Basically it 
works fine for medium and high loads but has some trouble with low loads. The 
reason is that a task running alone on a CPU of a remote node cannot be 
stolen by CPUs on its homenode. load_balance() is called in places where the 
only mechanism for moving a currently running task (migration_thread) cannot 
be used. Any ideas (besides a signal) are welcome. The initial load balancing 
can be improved, too; a better measure of load will help.
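
The low-load problem can be illustrated with a toy model. This is a sketch 
under simplifying assumptions, not the patch's load_balance(): the structs 
toy_task and toy_rq and the helper find_stealable() are hypothetical. The 
only property taken from the description above is that a balancer cannot 
pull a task that is currently running.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified task and runqueue. */
struct toy_task {
    int  homenode;  /* node the task would like to run on */
    bool running;   /* true if it currently occupies the CPU */
};

struct toy_rq {
    int              node;  /* node this CPU belongs to */
    struct toy_task *tasks;
    size_t           nr;
};

/*
 * Return the index of a task that a CPU on thief_node could pull from rq,
 * or -1 if there is none. Running tasks are skipped because moving them
 * would need the migration thread, which is not usable from here.
 */
static int find_stealable(const struct toy_rq *rq, int thief_node)
{
    for (size_t i = 0; i < rq->nr; i++) {
        const struct toy_task *t = &rq->tasks[i];

        if (t->running)
            continue;
        if (t->homenode == thief_node)
            return (int)i;
    }
    return -1;
}
```

With a single running task on a remote runqueue the function finds nothing, 
no matter how idle the homenode is. As soon as a second task is queued 
behind it, the queued one becomes a candidate, which matches the observation 
that medium and high loads balance well.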

Thanks in advance for your feedback. I'm especially curious about results for 
the affinity_test on platforms other than NEC AzusA.

Best regards,
Erich

* Re: Node Affine NUMA scheduler, updated
@ 2002-05-01 19:09 Dieter Nützel
From: Dieter Nützel @ 2002-05-01 19:09 UTC (permalink / raw)
  To: Erich Focht; +Cc: Robert Love, Linux Kernel List

Erich Focht wrote:

> Hi,
>
> an updated patch for the node affine NUMA scheduler extension based on the 
> O(1) scheduler can be found at 
> http://home.arcor.de/efocht/sched/Nod15_O1-2.4.18.patch

Hello Erich,

first, I hope you had a nice May 1st.

Please have a look at Robert's latest 2.4 O(1) backport to avoid duplicate work.
http://www.kernel.org/pub/linux/kernel/people/rml/sched/ingo-O1/

Regards,
	Dieter

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de
