From: Jack Steiner <steiner@sgi.com>
To: Erich Focht <efocht@hpce.nec.com>
Cc: Takayoshi Kochi <t-kochi@bq.jp.nec.com>,
ak@suse.de, linux-ia64@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Externalize SLIT table
Date: Fri, 05 Nov 2004 19:13:16 +0000 [thread overview]
Message-ID: <20041105191316.GA30434@sgi.com> (raw)
In-Reply-To: <200411051813.24231.efocht@hpce.nec.com>
On Fri, Nov 05, 2004 at 06:13:24PM +0100, Erich Focht wrote:
> Hi Jack,
>
> the patch looks fine, of course.
> > # cat ./node/node0/distance
> > 10 20 64 42 42 22
> Great!
>
> But:
> > # cat ./cpu/cpu8/distance
> > 42 42 64 64 22 22 42 42 10 10 20 20
> ...
>
> what exactly do you mean by cpu_to_cpu distance? In analogy with the
> node distance I'd say it is the time (latency) for moving data from
> the register of one CPU into the register of another CPU:
> cpu*/distance : cpu -> memory -> cpu
> node1 node? node2
>
I'm trying to create an easy-to-use metric for finding sets of cpus that
are close to each other. By "close", I mean that the average offnode
reference from a cpu to remote memory in the set is minimized.
The numbers in cpuN/distance represent the distance from cpu N to
the memory that is local to each of the other cpus.
I agree that this can be derived from converting cpuN->node, finding
internode distances, then finding the cpus on each remote node.
The cpu metric is much easier to use.
> On most architectures this means flushing a cacheline to memory on one
> side and reading it on another side. What you actually implement is
> the latency from memory (one node) to a particular cpu (on some
> node).
> memory -> cpu
> node1 node2
I see how the term can be misleading. The metric is intended to
represent ONLY the cost of remote access to another processor's local memory.
Is there a better way to describe the cpu-to-remote-cpu's-memory metric OR
should we let users contruct their own matrix from the node data?
>
> That's only half of the story and actually misleading. I don't
> think the complexity hiding is good in this place. Questions coming to
> my mind are: Where is the memory? Is the SLIT matrix really symmetric
> (cpu_to_cpu distance only makes sense for symmetric matrices)? I
> remember talking to IBM people about hardware where the node distance
> matrix was asymmetric.
>
> Why do you want this distance anyway? libnuma offers you _node_ masks
> for allocating memory from a particular node. And when you want to
> arrange a complex MPI process structure you'll have to think about
> latency for moving data from one processes buffer to the other
> processes buffer. The buffers live on nodes, not on cpus.
One important use is in the creation of cpusets. The batch scheduler needs
to pick a subset of cpus that are as close together as possible.
--
Thanks
Jack Steiner (steiner@sgi.com) 651-683-5302
Principal Engineer SGI - Silicon Graphics, Inc.
WARNING: multiple messages have this Message-ID (diff)
From: Jack Steiner <steiner@sgi.com>
To: Erich Focht <efocht@hpce.nec.com>
Cc: Takayoshi Kochi <t-kochi@bq.jp.nec.com>,
ak@suse.de, linux-ia64@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: Externalize SLIT table
Date: Fri, 5 Nov 2004 13:13:16 -0600 [thread overview]
Message-ID: <20041105191316.GA30434@sgi.com> (raw)
In-Reply-To: <200411051813.24231.efocht@hpce.nec.com>
On Fri, Nov 05, 2004 at 06:13:24PM +0100, Erich Focht wrote:
> Hi Jack,
>
> the patch looks fine, of course.
> > # cat ./node/node0/distance
> > 10 20 64 42 42 22
> Great!
>
> But:
> > # cat ./cpu/cpu8/distance
> > 42 42 64 64 22 22 42 42 10 10 20 20
> ...
>
> what exactly do you mean by cpu_to_cpu distance? In analogy with the
> node distance I'd say it is the time (latency) for moving data from
> the register of one CPU into the register of another CPU:
> cpu*/distance : cpu -> memory -> cpu
> node1 node? node2
>
I'm trying to create an easy-to-use metric for finding sets of cpus that
are close to each other. By "close", I mean that the average offnode
reference from a cpu to remote memory in the set is minimized.
The numbers in cpuN/distance represent the distance from cpu N to
the memory that is local to each of the other cpus.
I agree that this can be derived from converting cpuN->node, finding
internode distances, then finding the cpus on each remote node.
The cpu metric is much easier to use.
> On most architectures this means flushing a cacheline to memory on one
> side and reading it on another side. What you actually implement is
> the latency from memory (one node) to a particular cpu (on some
> node).
> memory -> cpu
> node1 node2
I see how the term can be misleading. The metric is intended to
represent ONLY the cost of remote access to another processor's local memory.
Is there a better way to describe the cpu-to-remote-cpu's-memory metric OR
should we let users contruct their own matrix from the node data?
>
> That's only half of the story and actually misleading. I don't
> think the complexity hiding is good in this place. Questions coming to
> my mind are: Where is the memory? Is the SLIT matrix really symmetric
> (cpu_to_cpu distance only makes sense for symmetric matrices)? I
> remember talking to IBM people about hardware where the node distance
> matrix was asymmetric.
>
> Why do you want this distance anyway? libnuma offers you _node_ masks
> for allocating memory from a particular node. And when you want to
> arrange a complex MPI process structure you'll have to think about
> latency for moving data from one processes buffer to the other
> processes buffer. The buffers live on nodes, not on cpus.
One important use is in the creation of cpusets. The batch scheduler needs
to pick a subset of cpus that are as close together as possible.
--
Thanks
Jack Steiner (steiner@sgi.com) 651-683-5302
Principal Engineer SGI - Silicon Graphics, Inc.
next prev parent reply other threads:[~2004-11-05 19:13 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-11-03 20:56 Externalize SLIT table Jack Steiner
2004-11-04 1:59 ` Takayoshi Kochi
2004-11-04 1:59 ` Takayoshi Kochi
2004-11-04 4:07 ` Andi Kleen
2004-11-04 4:07 ` Andi Kleen
2004-11-04 4:57 ` Takayoshi Kochi
2004-11-04 4:57 ` Takayoshi Kochi
2004-11-04 6:37 ` Andi Kleen
2004-11-04 6:37 ` Andi Kleen
2004-11-05 16:08 ` Jack Steiner
2004-11-05 16:08 ` Jack Steiner
2004-11-05 16:26 ` Andreas Schwab
2004-11-05 16:26 ` Andreas Schwab
2004-11-05 16:44 ` Jack Steiner
2004-11-05 16:44 ` Jack Steiner
2004-11-06 11:50 ` Christoph Hellwig
2004-11-06 11:50 ` Christoph Hellwig
2004-11-06 12:48 ` Andi Kleen
2004-11-06 12:48 ` Andi Kleen
2004-11-06 13:07 ` Christoph Hellwig
2004-11-06 13:07 ` Christoph Hellwig
2004-11-05 17:13 ` Erich Focht
2004-11-05 17:13 ` Erich Focht
2004-11-05 19:13 ` Jack Steiner [this message]
2004-11-05 19:13 ` Jack Steiner
2004-11-09 19:23 ` Matthew Dobson
2004-11-09 19:23 ` Matthew Dobson
2004-11-04 14:13 ` Jack Steiner
2004-11-04 14:13 ` Jack Steiner
2004-11-04 14:29 ` Andi Kleen
2004-11-04 14:29 ` Andi Kleen
2004-11-04 15:31 ` Erich Focht
2004-11-04 15:31 ` Erich Focht
2004-11-04 17:04 ` Andi Kleen
2004-11-04 17:04 ` Andi Kleen
2004-11-04 19:36 ` Jack Steiner
2004-11-04 19:36 ` Jack Steiner
2004-11-09 19:45 ` Matthew Dobson
2004-11-09 19:45 ` Matthew Dobson
2004-11-09 19:43 ` Matthew Dobson
2004-11-09 19:43 ` Matthew Dobson
2004-11-09 20:34 ` Mark Goodwin
2004-11-09 20:34 ` Mark Goodwin
2004-11-09 22:00 ` Jesse Barnes
2004-11-09 22:00 ` Jesse Barnes
2004-11-09 23:58 ` Matthew Dobson
2004-11-09 23:58 ` Matthew Dobson
2004-11-10 5:05 ` Mark Goodwin
2004-11-10 5:05 ` Mark Goodwin
2004-11-10 18:45 ` Erich Focht
2004-11-10 18:45 ` Erich Focht
2004-11-10 22:09 ` Matthew Dobson
2004-11-10 22:09 ` Matthew Dobson
2004-11-18 16:39 ` Jack Steiner
2004-11-18 16:39 ` Jack Steiner
[not found] <20041103205655.GA5084@sgi.com.suse.lists.linux.kernel>
[not found] ` <20041104.105908.18574694.t-kochi@bq.jp.nec.com.suse.lists.linux.kernel>
[not found] ` <20041104040713.GC21211@wotan.suse.de.suse.lists.linux.kernel>
[not found] ` <20041104.135721.08317994.t-kochi@bq.jp.nec.com.suse.lists.linux.kernel>
[not found] ` <20041105160808.GA26719@sgi.com.suse.lists.linux.kernel>
2004-11-06 6:30 ` Andi Kleen
2004-11-23 17:32 ` Jack Steiner
2004-11-23 19:06 ` Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20041105191316.GA30434@sgi.com \
--to=steiner@sgi.com \
--cc=ak@suse.de \
--cc=efocht@hpce.nec.com \
--cc=linux-ia64@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=t-kochi@bq.jp.nec.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.