From: Hanjun Guo <hanjun.guo@linaro.org>
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH v2 2/4] Documentation: arm64/arm: dt bindings for numa.
Date: Wed, 26 Nov 2014 17:12:49 +0800
Message-ID: <54759991.50204@linaro.org>
In-Reply-To: <2065254.jRdQcinOdg@wuerfel>

On 2014-11-26 3:00, Arnd Bergmann wrote:
> On Tuesday 25 November 2014 08:15:47 Ganapatrao Kulkarni wrote:
>>> No, don't hardcode ARM specifics into a common binding either. I've looked
>>> at the ibm,associativity properties again, and I think we should just use
>>> those; they can cover all cases and are completely independent of the
>>> architecture. We should probably discuss the property name though,
>>> as using the "ibm," prefix might not be the best idea.
>>
>> We started with a new proposal because we could not get enough details
>> on how ibm/ppc manages NUMA using DT. There is no documentation, there
>> is no power/PAPR spec for NUMA in the public domain, and there is not a
>> single dt file in arch/powerpc which describes NUMA. If we get any one
>> of these details, we can align with the powerpc implementation.
> 
> Basically the idea is to have an "ibm,associativity" property in each
> bus or device that is node specific, and this includes all CPUs and
> memory nodes. The property contains an array of 32-bit integers that
> identify where the resource sits at each level of the topology. Take as
> an example a NUMA cluster of two boards with four sockets of four cores
> each (32 cores total), a memory channel on each socket, and one PCI
> host per board that is connected at equal speed to each socket on the
> board.
> 
> The ibm,associativity property in each PCI host, CPU or memory device
> node consequently has an array of three (board, socket, core) integers:
> 
> 	memory@0,0 {
> 		device_type = "memory";
> 		reg = <0x0 0x0  0x4 0x0>;
> 		/* board 0, socket 0, no specific core */
> 		ibm,associativity = <0 0 0xffff>;
> 	};
>
> 	memory@4,0 {
> 		device_type = "memory";
> 		reg = <0x4 0x0  0x4 0x0>;
> 		/* board 0, socket 1, no specific core */
> 		ibm,associativity = <0 1 0xffff>;
> 	};
>
> 	...
>
> 	memory@1c,0 {
> 		device_type = "memory";
> 		reg = <0x1c 0x0  0x4 0x0>;
> 		/* board 1, socket 7, no specific core */
> 		ibm,associativity = <1 7 0xffff>;
> 	};
> 
> 	cpus {
> 		#address-cells = <2>;
> 		#size-cells = <0>;
> 		cpu@0 {
> 			device_type = "cpu";
> 			reg = <0 0>;
> 			/* board 0, socket 0, core 0 */
> 			ibm,associativity = <0 0 0>;
> 		};
>
> 		cpu@1 {
> 			device_type = "cpu";
> 			reg = <0 1>;
> 			/* board 0, socket 0, core 1 */
> 			ibm,associativity = <0 0 1>;
> 		};
>
> 		...
>
> 		cpu@31 {
> 			device_type = "cpu";
> 			reg = <0 31>;
> 			/* board 1, socket 7, core 31 */
> 			ibm,associativity = <1 7 31>;
> 		};
> 	};
> 
> 	pci@100,0 {
> 		device_type = "pci";
> 		/* board 0 */
> 		ibm,associativity = <0 0xffff 0xffff>;
> 		...
> 	};
>
> 	pci@200,0 {
> 		device_type = "pci";
> 		/* board 1 */
> 		ibm,associativity = <1 0xffff 0xffff>;
> 		...
> 	};
> 
> 	ibm,associativity-reference-points = <0 1>;
> 
> The "ibm,associativity-reference-points" property here indicates that
> index 0 of each array (the board) marks the most important NUMA boundary
> for this particular system, because the performance impact of allocating
> memory on the remote board is more significant than the impact of using
> memory on a remote socket of the same board. Linux will consequently use
> the first field in the array as the NUMA node ID. If, however, the link
> between the boards is relatively fast, so that you care mostly about
> allocating memory on the same socket, but going to another board isn't
> much worse than going to another socket on the same board, this would be
> 
> 	ibm,associativity-reference-points = <1 0>;
> 
> so Linux would ignore the board ID and use the socket ID as the NUMA node
> number. The same would apply if you have only one (otherwise identical)
> board, in which case you would get
> 
> 	ibm,associativity-reference-points = <1>;
> 
> which means that index 0 is completely irrelevant for NUMA considerations
> and you just care about the socket ID. In this case, devices on the PCI
> bus would also not care about NUMA policy and just allocate buffers from
> anywhere, while in the original example Linux would allocate DMA buffers
> only from the local board.

Thanks for the detailed information. I have a concern about the distance
between NUMA nodes: can the "ibm,associativity-reference-points" property
represent the distance between NUMA nodes?
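
If I understand correctly, the reference points only select which entry
of the associativity array becomes the node ID. A minimal C sketch of
that selection rule, assuming the 0-based indices from your example
(names and helpers are made up for illustration, this is not the actual
powerpc code):

#include <stdio.h>

#define NO_SPECIFIC	0xffff	/* "no specific core" marker */

/*
 * Hypothetical helper: derive a node ID from one device's
 * ibm,associativity array. Only the first (most important)
 * reference point is used here; further entries would mark
 * secondary boundaries.
 */
static int assoc_to_nid(const unsigned int *assoc,
			const unsigned int *refpoints, int nr_refs)
{
	if (nr_refs < 1)
		return -1;	/* no NUMA information at all */
	return (int)assoc[refpoints[0]];
}

int main(void)
{
	unsigned int mem0[]  = { 0, 0, NO_SPECIFIC };	/* memory@0,0 */
	unsigned int mem1c[] = { 1, 7, NO_SPECIFIC };	/* memory@1c,0 */
	unsigned int by_board[]  = { 0, 1 };	/* <0 1> */
	unsigned int by_socket[] = { 1, 0 };	/* <1 0> */

	printf("<0 1>: memory@0,0 -> node %d, memory@1c,0 -> node %d\n",
	       assoc_to_nid(mem0, by_board, 2),
	       assoc_to_nid(mem1c, by_board, 2));
	printf("<1 0>: memory@0,0 -> node %d, memory@1c,0 -> node %d\n",
	       assoc_to_nid(mem0, by_socket, 2),
	       assoc_to_nid(mem1c, by_socket, 2));
	return 0;
}

That tells us which node a resource belongs to, but not how far apart
two nodes are, which is what the example below is about.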

For example, consider a system with 4 sockets connected as below:

Socket 0  <---->  Socket 1  <---->  Socket 2  <---->  Socket 3

So from socket 0 to socket 1 (maybe on the same board), it takes only 1
hop to access the memory, but from socket 0 to socket 2/3 it takes 2/3
hops, so the *distance* is relatively longer. Can the
"ibm,associativity-reference-points" property cover this?

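In ACPI terms this is what a SLIT provides. An equivalent distance
matrix for the chain above might look like the sketch below (values are
made up, following the SLIT-style convention where 10 means local and
larger numbers mean farther away; this is only an illustration, not an
existing DT property):

/*
 * Illustrative distance matrix for socket 0 <-> 1 <-> 2 <-> 3.
 * A flat board/socket/core associativity hierarchy has no place
 * to store these per-pair values.
 */
static const unsigned char node_distance[4][4] = {
	/*          to: S0  S1  S2  S3 */
	/* from S0 */ { 10, 20, 30, 40 },
	/* from S1 */ { 20, 10, 20, 30 },
	/* from S2 */ { 30, 20, 10, 20 },
	/* from S3 */ { 40, 30, 20, 10 },
};
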
Thanks
Hanjun
