Re: [PATCH] -mm tree: broken "dynamic sched domains" and "migration cost"

From: Paul Jackson <pj@sgi.com>
To: hawkes@sgi.com
Cc: nickpiggin@yahoo.com.au, dino@in.ibm.com, akpm@osdl.org,
	linux-kernel@vger.kernel.org, mingo@elte.hu, steiner@sgi.com,
	hawkes@sgi.com
Subject: Re: [PATCH] -mm tree: broken "dynamic sched domains" and "migration cost"
Date: Fri, 9 Dec 2005 16:06:47 -0800
Message-ID: <20051209160647.275febe4.pj@sgi.com>
In-Reply-To: <20051209205454.18325.46768.sendpatchset@tomahawk.engr.sgi.com>

> (5) Besides, the migration cost between any two CPUs is something
>     that can be calculated once at boot time and remembered
>     thereafter.  I suspect the reason why the algorithm doesn't do
>     this is that an exhaustive NxN calculation will be very slow for
>     large NR_CPUS counts, which explains why the calculations are
>     now done in the context of sched domains.

Agreed - I too suspect that this is a form of compression, both of
computation cost and data size.  We save space and time by not
calculating the full N * N matrix, where N is num_online_cpus(), but
just the sched domain sub-matrices.

In theory, I would think that we should -not- compress based on sched
domains, because:
 1) these are (now becoming) dynamic, and
 2) they don't reflect the "natural" basis for such compression,
    which would be hardware topology based, not sched domain based.

Rather we should compress based on the topological symmetries of the
hardware system.  Of course, this is an ARCH specific characteristic,
or even platform specific.

Perhaps we could provide an ARCH specific routine that would map any
ordered pair <cpu0, cpu1> of cpu numbers to a canonical pair, such that
the canonical pair is "about as far apart, for that system topology"
as the original pair.  The canonical pairs would potentially be much
fewer in number than the entire N * N space, and the largest cpu
number returned would be smaller.  The default routine would be the
identity function, which would work fine for ordinary sized systems.

A second ARCH specific routine would return M, the largest canonical
cpu number that the above routine can return.  The distance array
could be dynamically allocated to (M+1)**2 size, since cpu numbers
start at 0.  The default routine would just return the highest online
CPU number.
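
A minimal user-space sketch of what the two default routines and the
table allocation might look like.  Every name here (struct cpu_pair,
default_canonical_cpu_pair, default_max_canonical_cpu,
alloc_cost_table, nr_online_cpus) is hypothetical and chosen only for
illustration; none of them is an existing kernel interface.  The table
is sized (M + 1) squared so that every cpu number from 0 through M is
a valid index.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type: an ordered <src, dst> pair of cpu numbers. */
struct cpu_pair {
	int src;
	int dst;
};

/* Stand-in for the online CPU count in this user-space sketch. */
static int nr_online_cpus = 8;

/*
 * Default mapping: the identity function.  An arch with a symmetric
 * topology would supply its own routine that folds equivalent pairs
 * onto one representative.
 */
static struct cpu_pair default_canonical_cpu_pair(int cpu0, int cpu1)
{
	struct cpu_pair p = { cpu0, cpu1 };

	assert(cpu0 >= 0 && cpu1 >= 0);
	return p;
}

/* Default bound M: the highest online CPU number. */
static int default_max_canonical_cpu(void)
{
	return nr_online_cpus - 1;
}

/*
 * Allocate a zeroed cost table with one slot per canonical pair.
 * Slot <src, dst> lives at index src * (max_cpu + 1) + dst.
 */
static long long *alloc_cost_table(int max_cpu)
{
	size_t n = (size_t)max_cpu + 1;

	return calloc(n * n, sizeof(long long));
}
```

With these defaults nothing is compressed: the table has one slot per
ordered pair of online cpus, which is the "ordinary sized system" case.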

These 'canonical cpu pairs' would replace the sched domains as the
basis for compression.

Then one time at boot, for each possible pair of online cpus, map that
pair to its canonical pair, and if not already done, compute its
migration cost.  For example, if on the current system's topology, cpu
pairs <3,5> and <67,69> are pretty much the same distance apart, the
"canonical" pair for both of these might be <3,5>, and only that pair
would have to be actually computed and stored.  Every time the software
using this wanted results for <67,69>, it would get mapped to <3,5> for
resolution.
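
The boot-time pass could be sketched roughly as below, again in plain
user-space C.  The toy topology (cpus 2k and 2k+1 share a node), the
cost numbers, and all names (init_migration_costs, migration_cost,
toy_canonical_pair, toy_measure, NR_CPUS_SKETCH) are assumptions made
up for the example, and toy_measure() merely stands in for the real,
expensive measurement.

```c
#include <assert.h>

#define NR_CPUS_SKETCH	8	/* hypothetical small system */
#define COST_UNSET	(-1LL)	/* marks a pair not yet measured */

struct cpu_pair {
	int src;
	int dst;
};

typedef struct cpu_pair (*canon_fn)(int cpu0, int cpu1);
typedef long long (*measure_fn)(int cpu0, int cpu1);

static long long cost[NR_CPUS_SKETCH][NR_CPUS_SKETCH];
static int nr_measured;		/* counts actual measurements */

/* Toy mapping: only "same cpu", "same node", "other node" matter. */
static struct cpu_pair toy_canonical_pair(int cpu0, int cpu1)
{
	struct cpu_pair p = { 0, 0 };		/* same cpu */

	if (cpu0 != cpu1)
		p.dst = (cpu0 / 2 == cpu1 / 2) ? 1 : 2;
	return p;
}

/* Stand-in for the expensive measurement; the values are invented. */
static long long toy_measure(int cpu0, int cpu1)
{
	nr_measured++;
	if (cpu0 == cpu1)
		return 0;
	return (cpu0 / 2 == cpu1 / 2) ? 100 : 400;
}

/* One pass over all pairs; measure each canonical pair only once. */
static void init_migration_costs(int ncpus, canon_fn canon,
				 measure_fn measure)
{
	int i, j;

	assert(ncpus <= NR_CPUS_SKETCH);
	for (i = 0; i < NR_CPUS_SKETCH; i++)
		for (j = 0; j < NR_CPUS_SKETCH; j++)
			cost[i][j] = COST_UNSET;

	for (i = 0; i < ncpus; i++) {
		for (j = 0; j < ncpus; j++) {
			struct cpu_pair p = canon(i, j);

			if (cost[p.src][p.dst] == COST_UNSET)
				cost[p.src][p.dst] = measure(p.src, p.dst);
		}
	}
}

/* Later lookups resolve through the same canonical mapping. */
static long long migration_cost(int cpu0, int cpu1, canon_fn canon)
{
	struct cpu_pair p = canon(cpu0, cpu1);

	return cost[p.src][p.dst];
}
```

On this toy 8-cpu layout the pass visits all 64 ordered pairs but
measures only three canonical ones.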

In the extreme case of a big NUMA system with an essentially homogeneous
topology (all cpu-cpu distances the same), all <cpu0, cpu1> pairs where
cpu0 != cpu1 could be mapped to the same canonical pair <0, 1>.
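
For that homogeneous case, an arch routine might be as small as the
sketch below.  The names are hypothetical, and mapping a cpu paired
with itself to <0, 0> is an assumption added here, since only the
cpu0 != cpu1 case is described above.

```c
#include <assert.h>

struct cpu_pair {
	int src;
	int dst;
};

/* Every pair of distinct cpus folds onto <0, 1>. */
static struct cpu_pair homogeneous_canonical_pair(int cpu0, int cpu1)
{
	struct cpu_pair p = { 0, 0 };	/* assumed result for cpu0 == cpu1 */

	assert(cpu0 >= 0 && cpu1 >= 0);
	if (cpu0 != cpu1)
		p.dst = 1;
	return p;
}

/* Largest canonical cpu number the routine above can return. */
static int homogeneous_max_canonical_cpu(void)
{
	return 1;
}
```

The cost table then needs only a handful of slots regardless of how
many cpus are online.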

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

Thread overview: 6+ messages
2005-12-09 20:54 [PATCH] -mm tree: broken "dynamic sched domains" and "migration cost" hawkes
2005-12-10  0:06 ` Paul Jackson [this message]
2005-12-10 12:02 ` [patch -mm] scheduler cache hot autodetect, print less Ingo Molnar
2005-12-10 12:25 ` [patch -mm] scheduler cache hot autodetect, isolcpus fix Ingo Molnar
2005-12-12 20:20   ` John Hawkes
2005-12-13  7:32     ` Ingo Molnar
