From: hanjun.guo@linaro.org (Hanjun Guo)
To: linux-arm-kernel@lists.infradead.org
Subject: [RFC PATCH v2 2/4] Documentation: arm64/arm: dt bindings for numa.
Date: Wed, 26 Nov 2014 17:12:49 +0800
Message-ID: <54759991.50204@linaro.org>
In-Reply-To: <2065254.jRdQcinOdg@wuerfel>

On 2014-11-26 3:00, Arnd Bergmann wrote:
> On Tuesday 25 November 2014 08:15:47 Ganapatrao Kulkarni wrote:
>>> No, don't hardcode ARM specifics into a common binding either. I've looked
>>> at the ibm,associativity properties again, and I think we should just use
>>> those, they can cover all cases and are completely independent of the
>>> architecture. We should probably discuss the property name though,
>>> as using the "ibm," prefix might not be the best idea.
>>
>> We started with a new proposal because we could not get enough details
>> on how ibm/ppc manages NUMA using DT.
>> There is no documentation, there is no Power/PAPR spec for NUMA in the
>> public domain, and there is not a single dt file in arch/powerpc that
>> describes NUMA. If we get any one of these details, we can align
>> with the powerpc implementation.
> 
> Basically the idea is to have an "ibm,associativity" property in each
> bus or device that is node specific, and this includes all CPUs and
> memory nodes. The property contains an array of 32-bit integers that
> count the resources. Take an example of a NUMA cluster of two machines
> with four sockets and four cores each (32 cores total), a memory
> channel on each socket and one PCI host per board that is connected
> at equal speed to each socket on the board.
> 
> The ibm,associativity property in each PCI host, CPU or memory device
> node consequently has an array of three (board, socket, core) integers:
> 
> 	memory@0,0 {
> 		device_type = "memory";
> 		reg = <0x0 0x0  0x4 0x0>;
> 		/* board 0, socket 0, no specific core */
> 		ibm,associativity = <0 0 0xffff>;
> 	};
> 
> 	memory@4,0 {
> 		device_type = "memory";
> 		reg = <0x4 0x0  0x4 0x0>;
> 		/* board 0, socket 1, no specific core */
> 		ibm,associativity = <0 1 0xffff>;
> 	};
> 
> 	...
> 
> 	memory@1c,0 {
> 		device_type = "memory";
> 		reg = <0x1c 0x0  0x4 0x0>;
> 		/* board 1, socket 7, no specific core */
> 		ibm,associativity = <1 7 0xffff>;
> 	};
> 
> 	cpus {
> 		#address-cells = <2>;
> 		#size-cells = <0>;
> 		cpu@0 {
> 			device_type = "cpu";
> 			reg = <0 0>;
> 			/* board 0, socket 0, core 0 */
> 			ibm,associativity = <0 0 0>;
> 		};
> 
> 		cpu@1 {
> 			device_type = "cpu";
> 			reg = <0 1>;
> 			/* board 0, socket 0, core 1 */
> 			ibm,associativity = <0 0 1>;
> 		};
> 
> 		...
> 
> 		cpu@31 {
> 			device_type = "cpu";
> 			reg = <0 31>;
> 			/* board 1, socket 7, core 31 */
> 			ibm,associativity = <1 7 31>;
> 		};
> 	};
> 
> 	pci@100,0 {
> 		device_type = "pci";
> 		/* board 0 */
> 		ibm,associativity = <0 0xffff 0xffff>;
> 		...
> 	};
> 
> 	pci@200,0 {
> 		device_type = "pci";
> 		/* board 1 */
> 		ibm,associativity = <1 0xffff 0xffff>;
> 		...
> 	};
> 
> 	ibm,associativity-reference-points = <0 1>;
> 
> The "ibm,associativity-reference-points" property here indicates that index 0
> of each array is the most important NUMA boundary for the particular system,
> because the performance impact of allocating memory on the remote board
> is more significant than the impact of using memory on a remote socket of the
> same board. Linux will consequently use the first field in the array as
> the NUMA node ID. If, however, the link between the boards is relatively fast,
> so you care mostly about allocating memory on the same socket, but going to
> another board isn't much worse than going to another socket on the same
> board, this would be
> 
> 	ibm,associativity-reference-points = <1 0>;
> 
> so Linux would ignore the board ID and use the socket ID as the NUMA node
> number. The same would apply if you have only one (otherwise identical)
> board; then you would get
> 
> 	ibm,associativity-reference-points = <1>;
> 
> which means that index 0 is completely irrelevant for NUMA considerations
> and you just care about the socket ID. In this case, devices on the PCI
> bus would also not care about NUMA policy and just allocate buffers from
> anywhere, while in the original example Linux would allocate DMA buffers only
> from the local board.
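
The selection rule described in the quoted text can be sketched in C as follows. This is only an illustration under stated assumptions, not kernel code: the helper name numa_node_from_assoc() and the ASSOC_UNSPECIFIED constant are hypothetical, indices are 0-based as in the example above, and only the first reference point is consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker used in the example above for "no specific socket/core". */
#define ASSOC_UNSPECIFIED 0xffffu

/*
 * Hypothetical helper, not a kernel API: pick the NUMA node ID for one
 * device node. The first entry of the reference-points list is taken as
 * the index of the most significant field in the associativity array.
 * Returns -1 when the device has no usable NUMA affinity.
 */
static int numa_node_from_assoc(const uint32_t *assoc, size_t assoc_len,
				const uint32_t *ref_points, size_t ref_len)
{
	assert(assoc != NULL && ref_points != NULL);

	if (ref_len == 0 || ref_points[0] >= assoc_len)
		return -1;	/* no reference point, or index out of range */

	uint32_t id = assoc[ref_points[0]];
	if (id == ASSOC_UNSPECIFIED)
		return -1;	/* device is not tied to that level */

	return (int)id;
}
```

With the memory node <1 7 0xffff>, reference points <0 1> give node 1 (the board), while <1 0> give node 7 (the socket). A PCI host described as <0 0xffff 0xffff> gets node 0 in the first case and no node at all in the second, matching the behaviour described above.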

Thanks for the detailed information. I have a concern about the distance
between NUMA nodes: can the "ibm,associativity-reference-points" property
represent the distance between NUMA nodes?

For example, a system with 4 sockets connected like below:

Socket 0  <---->  Socket 1  <---->  Socket 2  <---->  Socket 3

So from socket 0 to socket 1 (maybe on the same board), it needs just 1
jump to access the memory, but from socket 0 to socket 2 or 3, it needs
2 or 3 jumps and the *distance* is relatively longer. Can the
"ibm,associativity-reference-points" property cover this?

Thanks
Hanjun
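
To make the chain example above concrete, here is a minimal C sketch that tabulates the number of jumps between each pair of sockets. The names chain_hops(), fill_hop_table() and NR_SOCKETS are made up for illustration, and it assumes each socket links only to its direct neighbours. It only shows the per-pair information being asked about; it says nothing about what the binding can or cannot express.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_SOCKETS 4

/*
 * Hypothetical helper: number of jumps between two sockets on the
 * linear chain 0 <-> 1 <-> 2 <-> 3, where each socket is linked only
 * to its direct neighbours.
 */
static int chain_hops(int from, int to)
{
	assert(from >= 0 && from < NR_SOCKETS);
	assert(to >= 0 && to < NR_SOCKETS);
	return abs(from - to);
}

/* Fill an NR_SOCKETS x NR_SOCKETS table with the pairwise jump counts. */
static void fill_hop_table(int table[NR_SOCKETS][NR_SOCKETS])
{
	for (int i = 0; i < NR_SOCKETS; i++)
		for (int j = 0; j < NR_SOCKETS; j++)
			table[i][j] = chain_hops(i, j);
}
```

The row for socket 0 comes out as 0, 1, 2, 3, while the row for socket 1 is 1, 0, 1, 2, so the table depends on the pair of sockets and not on one ID per socket.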

