linux-arm-kernel.lists.infradead.org archive mirror
* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2014-12-31  7:36 [RFC PATCH v3 0/4] arm64:numa: Add numa support for arm64 platforms Ganapatrao Kulkarni
@ 2014-12-31  7:36 ` Ganapatrao Kulkarni
From: Ganapatrao Kulkarni @ 2014-12-31  7:36 UTC
  To: linux-arm-kernel

Add a DT file for Cavium's Thunder SoC in a 2-node topology,
using the arm,associativity device node property.

Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
---
 arch/arm64/boot/dts/thunder-88xx-2n.dts  |  78 +++
 arch/arm64/boot/dts/thunder-88xx-2n.dtsi | 789 +++++++++++++++++++++++++++++++
 2 files changed, 867 insertions(+)
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dts
 create mode 100644 arch/arm64/boot/dts/thunder-88xx-2n.dtsi

diff --git a/arch/arm64/boot/dts/thunder-88xx-2n.dts b/arch/arm64/boot/dts/thunder-88xx-2n.dts
new file mode 100644
index 0000000..5dc89d5e
--- /dev/null
+++ b/arch/arm64/boot/dts/thunder-88xx-2n.dts
@@ -0,0 +1,78 @@
+/*
+ * Cavium Thunder DTS file - Thunder board description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/dts-v1/;
+
+/include/ "thunder-88xx-2n.dtsi"
+
+/ {
+	model = "Cavium ThunderX CN88XX board";
+	compatible = "cavium,thunder-88xx";
+	arm,associativity-reference-points = <0 1>;
+
+	aliases {
+		serial0 = &uaa0;
+		serial1 = &uaa1;
+	};
+
+	memory@00000000 {
+		device_type = "memory";
+		reg = <0x0 0x00000000 0x0 0x80000000>;
+		/* board 0, socket 0, no specific core */
+		arm,associativity = <0 0 0xffff>;
+	};
+
+	memory@10000000000 {
+		device_type = "memory";
+		reg = <0x100 0x00000000 0x0 0x80000000>;
+		/* board 1, socket 0, no specific core */
+		arm,associativity = <1 0 0xffff>;
+	};
+
+};
diff --git a/arch/arm64/boot/dts/thunder-88xx-2n.dtsi b/arch/arm64/boot/dts/thunder-88xx-2n.dtsi
new file mode 100644
index 0000000..f7f561a
--- /dev/null
+++ b/arch/arm64/boot/dts/thunder-88xx-2n.dtsi
@@ -0,0 +1,789 @@
+/*
+ * Cavium Thunder DTS file - Thunder SoC description
+ *
+ * Copyright (C) 2014, Cavium Inc.
+ *
+ * This file is dual-licensed: you can use it either under the terms
+ * of the GPL or the X11 license, at your option. Note that this dual
+ * licensing only applies to this file, and not this project as a
+ * whole.
+ *
+ *  a) This library is free software; you can redistribute it and/or
+ *     modify it under the terms of the GNU General Public License as
+ *     published by the Free Software Foundation; either version 2 of the
+ *     License, or (at your option) any later version.
+ *
+ *     This library is distributed in the hope that it will be useful,
+ *     but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *     GNU General Public License for more details.
+ *
+ *     You should have received a copy of the GNU General Public
+ *     License along with this library; if not, write to the Free
+ *     Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston,
+ *     MA 02110-1301 USA
+ *
+ * Or, alternatively,
+ *
+ *  b) Permission is hereby granted, free of charge, to any person
+ *     obtaining a copy of this software and associated documentation
+ *     files (the "Software"), to deal in the Software without
+ *     restriction, including without limitation the rights to use,
+ *     copy, modify, merge, publish, distribute, sublicense, and/or
+ *     sell copies of the Software, and to permit persons to whom the
+ *     Software is furnished to do so, subject to the following
+ *     conditions:
+ *
+ *     The above copyright notice and this permission notice shall be
+ *     included in all copies or substantial portions of the Software.
+ *
+ *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+ *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+ *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ *     OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+/ {
+	compatible = "cavium,thunder-88xx";
+	interrupt-parent = <&gic0>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	psci {
+		compatible = "arm,psci-0.2";
+		method = "smc";
+	};
+
+	cpus {
+		#address-cells = <2>;
+		#size-cells = <0>;
+
+		cpu@000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x000>;
+			enable-method = "psci";
+			/* board 0, socket 0, core 0*/
+			arm,associativity = <0 0 0x000>;
+		};
+		cpu@001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x001>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x001>;
+		};
+		cpu@002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x002>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x002>;
+		};
+		cpu@003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x003>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x003>;
+		};
+		cpu@004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x004>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x004>;
+		};
+		cpu@005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x005>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x005>;
+		};
+		cpu@006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x006>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x006>;
+		};
+		cpu@007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x007>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x007>;
+		};
+		cpu@008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x008>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x008>;
+		};
+		cpu@009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x009>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x009>;
+		};
+		cpu@00a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00a>;
+		};
+		cpu@00b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00b>;
+		};
+		cpu@00c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00c>;
+		};
+		cpu@00d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00d>;
+		};
+		cpu@00e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00e>;
+		};
+		cpu@00f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x00f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x00f>;
+		};
+		cpu@100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x100>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x100>;
+		};
+		cpu@101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x101>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x101>;
+		};
+		cpu@102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x102>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x102>;
+		};
+		cpu@103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x103>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x103>;
+		};
+		cpu@104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x104>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x104>;
+		};
+		cpu@105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x105>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x105>;
+		};
+		cpu@106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x106>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x106>;
+		};
+		cpu@107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x107>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x107>;
+		};
+		cpu@108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x108>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x108>;
+		};
+		cpu@109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x109>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x109>;
+		};
+		cpu@10a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10a>;
+		};
+		cpu@10b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10b>;
+		};
+		cpu@10c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10c>;
+		};
+		cpu@10d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10d>;
+		};
+		cpu@10e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10e>;
+		};
+		cpu@10f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x10f>;
+		};
+		cpu@200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x200>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x200>;
+		};
+		cpu@201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x201>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x201>;
+		};
+		cpu@202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x202>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x202>;
+		};
+		cpu@203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x203>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x203>;
+		};
+		cpu@204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x204>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x204>;
+		};
+		cpu@205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x205>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x205>;
+		};
+		cpu@206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x206>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x206>;
+		};
+		cpu@207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x207>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x207>;
+		};
+		cpu@208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x208>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x208>;
+		};
+		cpu@209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x209>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x209>;
+		};
+		cpu@20a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20a>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20a>;
+		};
+		cpu@20b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20b>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20b>;
+		};
+		cpu@20c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20c>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20c>;
+		};
+		cpu@20d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20d>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20d>;
+		};
+		cpu@20e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20e>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20e>;
+		};
+		cpu@20f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x20f>;
+			enable-method = "psci";
+			arm,associativity = <0 0 0x20f>;
+		};
+		cpu@10000 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10000>;
+			enable-method = "psci";
+			/* board 1, socket 0, core 0*/
+			arm,associativity = <1 0 0x10000>;
+		};
+		cpu@10001 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10001>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10001>;
+		};
+		cpu@10002 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10002>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10002>;
+		};
+		cpu@10003 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10003>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10003>;
+		};
+		cpu@10004 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10004>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10004>;
+		};
+		cpu@10005 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10005>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10005>;
+		};
+		cpu@10006 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10006>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10006>;
+		};
+		cpu@10007 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10007>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10007>;
+		};
+		cpu@10008 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10008>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10008>;
+		};
+		cpu@10009 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10009>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10009>;
+		};
+		cpu@1000a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000a>;
+		};
+		cpu@1000b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000b>;
+		};
+		cpu@1000c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000c>;
+		};
+		cpu@1000d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000d>;
+		};
+		cpu@1000e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000e>;
+		};
+		cpu@1000f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1000f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1000f>;
+		};
+		cpu@10100 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10100>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10100>;
+		};
+		cpu@10101 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10101>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10101>;
+		};
+		cpu@10102 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10102>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10102>;
+		};
+		cpu@10103 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10103>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10103>;
+		};
+		cpu@10104 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10104>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10104>;
+		};
+		cpu@10105 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10105>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10105>;
+		};
+		cpu@10106 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10106>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10106>;
+		};
+		cpu@10107 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10107>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10107>;
+		};
+		cpu@10108 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10108>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10108>;
+		};
+		cpu@10109 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10109>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10109>;
+		};
+		cpu@1010a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010a>;
+		};
+		cpu@1010b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010b>;
+		};
+		cpu@1010c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010c>;
+		};
+		cpu@1010d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010d>;
+		};
+		cpu@1010e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010e>;
+		};
+		cpu@1010f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1010f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1010f>;
+		};
+		cpu@10200 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10200>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10200>;
+		};
+		cpu@10201 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10201>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10201>;
+		};
+		cpu@10202 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10202>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10202>;
+		};
+		cpu@10203 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10203>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10203>;
+		};
+		cpu@10204 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10204>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10204>;
+		};
+		cpu@10205 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10205>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10205>;
+		};
+		cpu@10206 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10206>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10206>;
+		};
+		cpu@10207 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10207>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10207>;
+		};
+		cpu@10208 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10208>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10208>;
+		};
+		cpu@10209 {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x10209>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x10209>;
+		};
+		cpu@1020a {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020a>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020a>;
+		};
+		cpu@1020b {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020b>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020b>;
+		};
+		cpu@1020c {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020c>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020c>;
+		};
+		cpu@1020d {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020d>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020d>;
+		};
+		cpu@1020e {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020e>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020e>;
+		};
+		cpu@1020f {
+			device_type = "cpu";
+			compatible = "cavium,thunder", "arm,armv8";
+			reg = <0x0 0x1020f>;
+			enable-method = "psci";
+			arm,associativity = <1 0 0x1020f>;
+		};
+	};
+
+	timer {
+		compatible = "arm,armv8-timer";
+		interrupts = <1 13 0xff01>,
+		             <1 14 0xff01>,
+		             <1 11 0xff01>,
+		             <1 10 0xff01>;
+	};
+
+	soc {
+		compatible = "simple-bus";
+		#address-cells = <2>;
+		#size-cells = <2>;
+		ranges;
+
+		refclk50mhz: refclk50mhz {
+			compatible = "fixed-clock";
+			#clock-cells = <0>;
+			clock-frequency = <50000000>;
+			clock-output-names = "refclk50mhz";
+		};
+
+		gic0: interrupt-controller@8010,00000000 {
+			compatible = "arm,gic-v3";
+			#interrupt-cells = <3>;
+			#redistributor-regions = <2>;
+			interrupt-controller;
+			reg = <0x8010 0x00000000 0x0 0x010000>, /* GICD */
+			      <0x8010 0x80000000 0x0 0x600000>, /* GICR Node 0 */
+			      <0x9010 0x80000000 0x0 0x600000>; /* GICR Node 1 */
+			interrupts = <1 9 0xf04>;
+		};
+
+		uaa0: serial@87e0,24000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x24000000 0x0 0x1000>;
+			interrupts = <1 21 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+
+		uaa1: serial@87e0,25000000 {
+			compatible = "arm,pl011", "arm,primecell";
+			reg = <0x87e0 0x25000000 0x0 0x1000>;
+			interrupts = <1 22 4>;
+			clocks = <&refclk50mhz>;
+			clock-names = "apb_pclk";
+		};
+	};
+};
-- 
1.8.1.4


* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
       [not found] ` <1420011208-7051-5-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
@ 2015-01-02 21:10   ` Arnd Bergmann
  2015-01-06  9:25     ` Ganapatrao Kulkarni
From: Arnd Bergmann @ 2015-01-02 21:10 UTC
  To: linux-arm-kernel

[re-sent with correct mailing list address]

On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
> Adding numa support for arm64 based platforms.
> Adding dt node parsing for numa topology using property arm,associativity.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>

Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
We can always look for both arm,associativity and ibm,associativity; I don't
think we need to worry about conflicts that way.
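
A rough userspace sketch of that lookup order, not kernel code: `struct prop`, `struct node` and `find_associativity()` are hypothetical stand-ins for the device-tree API, and the only thing taken from the thread is the pair of property names being tried in turn.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device-tree property. */
struct prop {
	const char *name;
	const unsigned int *cells;
	size_t ncells;
};

/* Hypothetical stand-in for a device-tree node. */
struct node {
	const struct prop *props;
	size_t nprops;
};

/* Property names tried in order; both are named in the discussion. */
static const char *const assoc_names[] = {
	"arm,associativity",
	"ibm,associativity",
};

/* Return the first associativity property found on the node, or NULL. */
static const struct prop *find_associativity(const struct node *n)
{
	size_t i, j;

	for (i = 0; i < sizeof(assoc_names) / sizeof(assoc_names[0]); i++)
		for (j = 0; j < n->nprops; j++)
			if (strcmp(n->props[j].name, assoc_names[i]) == 0)
				return &n->props[j];
	return NULL;
}
```

With shared code shaped like this, an arm64 tree and a powerpc tree would go through the same parser, and a node carrying neither property simply yields NULL.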

> +#define MAX_DISTANCE_REF_POINTS 4

I think we should use 8 here, like powerpc; four levels might be exceeded
on complex SoCs.

> +int dt_get_cpu_node_id(int cpu)
> +{
> +	struct device_node *dn = NULL;
> +
> +	while ((dn = of_find_node_by_type(dn, "cpu"))) {
> +		const u32 *cell;
> +		u64 hwid;
> +
> +		/*
> +		 * A cpu node with missing "reg" property is
> +		 * considered invalid to build a cpu_logical_map
> +		 * entry.
> +		 */
> +		cell = of_get_property(dn, "reg", NULL);
> +		if (!cell) {
> +			pr_err("%s: missing reg property\n", dn->full_name);
> +			return default_nid;
> +		}
> +		hwid = of_read_number(cell, of_n_addr_cells(dn));
> +
> +		if (cpu_logical_map(cpu) == hwid)
> +			return of_node_to_nid_single(dn);
> +	}
> +	return NUMA_NO_NODE;
> +}
> +EXPORT_SYMBOL(dt_get_cpu_node_id);

Maybe just expose a function that returns the device node for a CPU ID here,
and expect callers to use of_node_to_nid?
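
A minimal userspace sketch of that split, with every name hypothetical: `struct cpu_node`, `cpu_hwid_to_node()` and `node_to_nid()` only mimic the shape of a "find the node" helper plus a separate node-to-nid lookup, and the table values are copied from the dts in patch 3/4 for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for a CPU device node: hardware id plus node id. */
struct cpu_node {
	uint64_t hwid;
	int nid;
};

/* Mock lookup table; the values mirror a few entries of the posted dts. */
static const struct cpu_node cpu_nodes[] = {
	{ 0x000,   0 },
	{ 0x001,   0 },
	{ 0x10000, 1 },
};

/* Find the node describing a hardware CPU id, or NULL if none matches. */
static const struct cpu_node *cpu_hwid_to_node(uint64_t hwid)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_nodes) / sizeof(cpu_nodes[0]); i++)
		if (cpu_nodes[i].hwid == hwid)
			return &cpu_nodes[i];
	return NULL;
}

/* Stand-in for the node-to-nid step that the caller performs itself. */
static int node_to_nid(const struct cpu_node *dn)
{
	return dn ? dn->nid : NUMA_NO_NODE;
}
```

A caller would then write something like `node_to_nid(cpu_hwid_to_node(hwid))`, which keeps the NUMA lookup out of the CPU-node helper.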

> +
> +/**
> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
> + */
> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
> +				     int depth, void *data)
> +{
> +	const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
> +
> +	/* We are scanning "numa-map" nodes only */

Is this a stale comment?

> +/* DT node mapping is done already early_init_dt_scan_memory */
> +int __init arm64_dt_numa_init(void)
> +{
> +	int i;
> +	u32 nodea, nodeb, distance, node_count = 0;
> +
> +	of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
> +
> +	for_each_node_mask(i, numa_nodes_parsed)
> +		node_count = i;
> +	node_count++;
> +
> +	for (nodea =  0; nodea < node_count; nodea++) {
> +		for (nodeb = 0; nodeb < node_count; nodeb++) {
> +			distance = dt_get_node_distance(nodea, nodeb);
> +			numa_set_distance(nodea, nodeb, distance);
> +		}
> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(arm64_dt_numa_init);

No need to export functions that are called only by architecture code.
Since this works on the flattened device tree format, you can never
have loadable modules calling it.

> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>  		 * "processor".  Give glibc what it expects.
>  		 */
>  #ifdef CONFIG_SMP
> +	if (IS_ENABLED(CONFIG_NUMA)) {
> +		seq_printf(m, "processor\t: %d", i);
> +		seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
> +	} else {
>  		seq_printf(m, "processor\t: %d\n", i);
> +	}
>  #endif
>  	}

Do we need to make this conditional? I think we can just always
print the node number, even if it's going to be zero for systems
without the associativity properties.
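
As a small illustration of the unconditional variant (a userspace sketch; `format_processor_line()` is a hypothetical helper that uses snprintf in place of seq_printf), the line is formatted the same way whether or not NUMA is configured:

```c
#include <stdio.h>

/* Format one "processor" line, always including the node id. */
static int format_processor_line(char *buf, size_t len, int cpu, int nid)
{
	return snprintf(buf, len, "processor\t: %d [nid: %d]\n", cpu, nid);
}
```

On a system without associativity properties the caller would simply pass nid 0.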

> +
> +int cpu_to_node_map[NR_CPUS];
> +EXPORT_SYMBOL(cpu_to_node_map);

This seems to be x86-specific; do we need it?

> +/*
> + *  Set the cpu to node and mem mapping
> + */
> +void numa_store_cpu_info(int cpu)
> +{
> +#ifdef CONFIG_ARM64_DT_NUMA
> +	node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
> +#endif

I would try to avoid the #ifdef here by providing a stub for
dt_get_cpu_node_id (or whichever function we end up calling here) when
NUMA is disabled.
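
The usual shape of such a stub, sketched in userspace under assumptions: the stub's return value (NUMA_NO_NODE) and the mock `struct cpu_info` array are illustrative choices, not something the patch defines.

```c
#define NUMA_NO_NODE (-1)

#ifdef CONFIG_ARM64_DT_NUMA
/* The real implementation is provided elsewhere when the option is on. */
int dt_get_cpu_node_id(int cpu);
#else
/* Stub: without DT NUMA support there is no node information (assumed). */
static inline int dt_get_cpu_node_id(int cpu)
{
	(void)cpu;
	return NUMA_NO_NODE;
}
#endif

/* Mock of the per-CPU bookkeeping array used in the patch. */
struct cpu_info {
	int node_id;
};
static struct cpu_info node_cpu_hwid[4];

/* The caller no longer needs an #ifdef around the call. */
static void numa_store_cpu_info(int cpu)
{
	node_cpu_hwid[cpu].node_id = dt_get_cpu_node_id(cpu);
}
```

The conditional then lives once in the header instead of at every call site.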

> +
> +/**
> + * arm64_numa_init - Initialize NUMA
> + *
> + * Try each configured NUMA initialization method until one succeeds.  The
> + * last fallback is dummy single node config encompassing whole memory and
> + * never fails.
> + */
> +void __init arm64_numa_init(void)
> +{
> +	if (!numa_off) {
> +#ifdef CONFIG_ARM64_DT_NUMA
> +		if (!numa_init(arm64_dt_numa_init))
> +			return;
> +#endif
> +	}
> +
> +	numa_init(dummy_numa_init);
> +}

I don't think we need the CONFIG_ARM64_DT_NUMA=n case here. The call should
just not be conditional, and arm64_dt_numa_init should fall back to doing
something reasonable when NUMA is turned off or there are no associativity
properties.
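
One way this could look, as a userspace mock rather than the real code: the flags `have_associativity` and `nodes_configured`, the -EINVAL return and the bodies of all four functions are assumptions made for illustration; only the function names come from the patch.

```c
#include <errno.h>
#include <stdbool.h>

static bool numa_off;           /* mock of the "NUMA turned off" switch */
static bool have_associativity; /* mock: did the DT carry the properties? */
static int nodes_configured;    /* mock result: number of nodes set up */

/* Mock DT init: fails cleanly instead of being compiled out. */
static int arm64_dt_numa_init(void)
{
	if (numa_off || !have_associativity)
		return -EINVAL;
	nodes_configured = 2;
	return 0;
}

/* Mock single-node fallback; never fails. */
static int dummy_numa_init(void)
{
	nodes_configured = 1;
	return 0;
}

/* Mock of numa_init(): run one init method and report its result. */
static int numa_init(int (*init_func)(void))
{
	return init_func();
}

/* No #ifdef: try the DT method first, then fall back to the dummy config. */
static void arm64_numa_init(void)
{
	if (!numa_init(arm64_dt_numa_init))
		return;
	numa_init(dummy_numa_init);
}
```

The decision about whether DT data is usable moves into arm64_dt_numa_init itself, so the caller reads the same in every configuration.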

	Arnd


* [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
       [not found] ` <1420011208-7051-3-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
@ 2015-01-02 21:17   ` Arnd Bergmann
  2015-01-06  5:28     ` Ganapatrao Kulkarni
From: Arnd Bergmann @ 2015-01-02 21:17 UTC
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
> DT bindings for numa map for memory, cores and IOs using arm,associativity
> device node property.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
> ---
>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>  1 file changed, 198 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
> 
> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
> new file mode 100644
> index 0000000..4f51e25
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/numa.txt
> @@ -0,0 +1,198 @@
> +==============================================================================
> +NUMA binding description.
> +==============================================================================
> +
> +==============================================================================
> +1 - Introduction
> +==============================================================================
> +
> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
> +collections of hardware resources including processors, memory, and I/O buses,
> +that comprise what is commonly known as a "NUMA node".
> +Processor accesses to memory within the local NUMA node is generally faster
> +than processor accesses to memory outside of the local NUMA node.
> +DT defines interfaces that allow the platform to convey NUMA node
> +topology information to OS.
> +
> +==============================================================================
> +2 - arm,associativity
> +==============================================================================
> +
> +The mapping is done using arm,associativity device property.
> +this property needs to be present in every device node which needs to to be
> +mapped to numa nodes.
> +
> +arm,associativity property is set of 32-bit integers. representing the
> +board id, socket id and core id.
> +
> +ex:
> +	/* board 0, socket 0, core 0 */
> +	arm,associativity = <0 0 0x000>;
> +
> +	/* board 1, socket 0, core 8 */
> +	arm,associativity = <1 0 0x08>;

This is way too specific to Cavium machines. Most other vendors will not (at first)
have multiple boards or multiple sockets, but need to represent multiple clusters
and/or SMT threads instead. Also the wording suggests that this is only relevant
for NUMA, which I don't think is helpful because we will also want to describe
the topology within one NUMA node for locality.

I think we should stick to the powerpc definition here and not define what the
levels mean at the binding level. Something like:

"Each level of topology defines a boundary in the system at which a significant
difference in performance can be measured between cross-device accesses within
a single location and those spanning multiple locations. The first cell always
contains the broadest subdivision within the system, while the last cell enumerates
the individual devices, such as an SMT thread of a CPU, or a bus bridge within
an SoC".

> +==============================================================================
> +3 - arm,associativity-reference-points
> +==============================================================================
> +This property is a set of 32-bit integers, each representing an index into
> +the arm,associativity nodes. The first integer is the most significant
> +NUMA boundary and the following are progressively less significant boundaries.
> +There can be more than one level of NUMA.
> +
> +Ex:
> +	arm,associativity-reference-points = <0 1>;
> +	The board Id(index 0) used first to calculate the associativity (node
> +	distance), then follows the  socket id(index 1).
> +
> +	arm,associativity-reference-points = <1 0>;
> +	The socket Id(index 1) used first to calculate the associativity,
> +	then follows the board id(index 0).
> +
> +	arm,associativity-reference-points = <0>;
> +	Only the board Id(index 0) used to calculate the associativity.
> +
> +	arm,associativity-reference-points = <1>;
> +	Only socket Id(index 1) used to calculate the associativity.
> +
> +==============================================================================
> +4 - Example dts
> +==============================================================================
> +
> +Example: 2 Node system consists of 2 boards and each board having one socket
> +and 8 core in each socket.

I think the example should also include a PCI controller.

> +
> +	arm,associativity-reference-points = <0 1>;

This doesn't really match the associativity properties, because the
second level in the cpus nodes is completely meaningless and should
not be listed as a secondary reference point.
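
To show why a meaningless level should not be a reference point, here is a
small sketch. The patch text quoted here does not show how
dt_get_node_distance() works, so the rule below is purely an assumption for
illustration: reference points are ordered broadest first, and the distance
doubles once for every reference level at or below the first one where the
two devices differ. The function name and the base distance of 10 are
hypothetical as well.

```c
#define LOCAL_DISTANCE_SKETCH 10

/*
 * assoc_a, assoc_b: associativity cells of two devices.
 * ref_points: indices into those arrays, broadest boundary first.
 * Assumed rule (not taken from the patch): the first reference level at
 * which the devices differ decides the distance.
 */
static int assoc_distance_sketch(const unsigned int *assoc_a,
				 const unsigned int *assoc_b,
				 const unsigned int *ref_points,
				 unsigned int nr_ref_points)
{
	unsigned int i;

	for (i = 0; i < nr_ref_points; i++) {
		unsigned int idx = ref_points[i];

		if (assoc_a[idx] != assoc_b[idx])
			return LOCAL_DISTANCE_SKETCH << (nr_ref_points - i);
	}

	/* Identical at every reference level: same node. */
	return LOCAL_DISTANCE_SKETCH;
}
```

Under this assumed rule, <0 0 0x000> and <1 0 0x08> are 40 apart with
reference points <0 1> but 20 apart with <0>. The socket cell is 0
everywhere, so listing it adds a level that can never distinguish anything.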

	Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096.
       [not found] ` <1420011208-7051-2-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
@ 2015-01-02 21:17   ` Arnd Bergmann
  0 siblings, 0 replies; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:25 Ganapatrao Kulkarni wrote:
> Raising the maximum limit to 4096.
> This is to accommodate upcoming higher multi-core platforms.
> 
> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
       [not found] ` <1420011208-7051-4-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
@ 2015-01-02 21:17   ` Arnd Bergmann
  2015-01-06  9:34     ` Ganapatrao Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-02 21:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> +
> +	memory at 00000000 {
> +		device_type = "memory";
> +		reg = <0x0 0x00000000 0x0 0x80000000>;
> +		/* board 0, socket 0, no specific core */
> +		arm,associativity = <0 0 0xffff>;
> +	};
> +
> +	memory at 10000000000 {
> +		device_type = "memory";
> +		reg = <0x100 0x00000000 0x0 0x80000000>;
> +		/* board 1, socket 0, no specific core */
> +		arm,associativity = <1 0 0xffff>;
> +	};
> +};

So no memory in any other socket?

> +		cpu at 00f {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x00f>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x00f>;
> +		};
> +		cpu at 100 {
> +			device_type = "cpu";
> +			compatible = "cavium,thunder", "arm,armv8";
> +			reg = <0x0 0x100>;
> +			enable-method = "psci";
> +			arm,associativity = <0 0 0x100>;
> +		};

What is the 0x100 offset in the last-level topology field? Does this have
no significance to topology at all? I would expect that to be something
like cluster number that is relevant to caching and should be represented
as a separate level.

In contrast, the level-two topology information seems to always be
zero for all CPUs, so you could probably leave that one out.

> +	soc {
> +		compatible = "simple-bus";
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		ranges;

The soc node is missing topology information, please add it.

	Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa.
  2015-01-02 21:17   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Arnd Bergmann
@ 2015-01-06  5:28     ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  5:28 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,


On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 31 December 2014 13:03:26 Ganapatrao Kulkarni wrote:
>> DT bindings for numa map for memory, cores and IOs using arm,associativity
>> device node property.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>> ---
>>  Documentation/devicetree/bindings/arm/numa.txt | 198 +++++++++++++++++++++++++
>>  1 file changed, 198 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/arm/numa.txt
>>
>> diff --git a/Documentation/devicetree/bindings/arm/numa.txt b/Documentation/devicetree/bindings/arm/numa.txt
>> new file mode 100644
>> index 0000000..4f51e25
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/numa.txt
>> @@ -0,0 +1,198 @@
>> +==============================================================================
>> +NUMA binding description.
>> +==============================================================================
>> +
>> +==============================================================================
>> +1 - Introduction
>> +==============================================================================
>> +
>> +Systems employing a Non Uniform Memory Access (NUMA) architecture contain
>> +collections of hardware resources including processors, memory, and I/O buses,
>> +that comprise what is commonly known as a "NUMA node".
>> +Processor accesses to memory within the local NUMA node is generally faster
>> +than processor accesses to memory outside of the local NUMA node.
>> +DT defines interfaces that allow the platform to convey NUMA node
>> +topology information to OS.
>> +
>> +==============================================================================
>> +2 - arm,associativity
>> +==============================================================================
>> +
>> +The mapping is done using arm,associativity device property.
>> +this property needs to be present in every device node which needs to to be
>> +mapped to numa nodes.
>> +
>> +arm,associativity property is set of 32-bit integers. representing the
>> +board id, socket id and core id.
>> +
>> +ex:
>> +     /* board 0, socket 0, core 0 */
>> +     arm,associativity = <0 0 0x000>;
>> +
>> +     /* board 1, socket 0, core 8 */
>> +     arm,associativity = <1 0 0x08>;
>
> This is way too specific to Cavium machines. Most other vendors will not (at first)
> have multiple boards or multiple sockets, but need to represent multiple clusters
> and/or SMT threads instead. Also the wording suggests that this is only relevant
> for NUMA, which I don't think is helpful because we will also want to describe
> the topology within one NUMA node for locality.
>
> I think we should stick to the powerpc definition here and not define what the
> levels mean at the binding level. Something like:
>
> "Each level of topology defines a boundary in the system at which a significant
> difference in performance can be measured between cross-device accesses within
> a single location and those spanning multiple locations. The first cell always
> contains the broadest subdivision within the system, while the last cell enumerates
> the individual devices, such as an SMT thread of a CPU, or a bus bridge within
> an SoC".
OK, I will change it as suggested.
>
>> +==============================================================================
>> +3 - arm,associativity-reference-points
>> +==============================================================================
>> +This property is a set of 32-bit integers, each representing an index into
>> +the arm,associativity nodes. The first integer is the most significant
>> +NUMA boundary and the following are progressively less significant boundaries.
>> +There can be more than one level of NUMA.
>> +
>> +Ex:
>> +     arm,associativity-reference-points = <0 1>;
>> +     The board Id(index 0) used first to calculate the associativity (node
>> +     distance), then follows the  socket id(index 1).
>> +
>> +     arm,associativity-reference-points = <1 0>;
>> +     The socket Id(index 1) used first to calculate the associativity,
>> +     then follows the board id(index 0).
>> +
>> +     arm,associativity-reference-points = <0>;
>> +     Only the board Id(index 0) used to calculate the associativity.
>> +
>> +     arm,associativity-reference-points = <1>;
>> +     Only socket Id(index 1) used to calculate the associativity.
>> +
>> +==============================================================================
>> +4 - Example dts
>> +==============================================================================
>> +
>> +Example: 2 Node system consists of 2 boards and each board having one socket
>> +and 8 core in each socket.
>
> I think the example should also include a PCI controller.
Yes, I will add PCI.
>
>> +
>> +     arm,associativity-reference-points = <0 1>;
>
> This doesn't really match the associativity properties, because the
> second level in the cpus nodes is completely meaningless and should
> not be listed as a secondary reference point.
Agreed, I will remove the second entry.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-02 21:10   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Arnd Bergmann
@ 2015-01-06  9:25     ` Ganapatrao Kulkarni
  2015-01-06 19:59       ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,


On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> [re-sent with correct mailing list address]
>
> On Wednesday 31 December 2014 13:03:28 Ganapatrao Kulkarni wrote:
>> Adding numa support for arm64 based platforms.
>> Adding dt node parsing for numa topology using property arm,associativity.
>>
>> Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
>
> Maybe the parts that are common with powerpc can be moved to drivers/of/numa.c?
> We can always look for both arm,associativity and ibm,associativity, I don't
> think we should be worried about any conflicts that way.
OK, I will move the common functions from powerpc and arm64 to drivers/of/numa.c.
>
>> +#define MAX_DISTANCE_REF_POINTS 4
>
> I think we should use 8 here like powerpc, four levels might get exceeded
> on complex SoCs.
sure.
>
>> +int dt_get_cpu_node_id(int cpu)
>> +{
>> +     struct device_node *dn = NULL;
>> +
>> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> +             const u32 *cell;
>> +             u64 hwid;
>> +
>> +             /*
>> +              * A cpu node with missing "reg" property is
>> +              * considered invalid to build a cpu_logical_map
>> +              * entry.
>> +              */
>> +             cell = of_get_property(dn, "reg", NULL);
>> +             if (!cell) {
>> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> +                     return default_nid;
>> +             }
>> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> +
>> +             if (cpu_logical_map(cpu) == hwid)
>> +             return of_node_to_nid_single(dn);
>> +     }
>> +     return NUMA_NO_NODE;
>> +}
>> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>
> Maybe just expose a function to the device node for a CPU ID here, and
> expect callers to use of_node_to_nid?
Shall I make this a wrapper function in dt_numa.c that uses the
functions _of_node_to_nid and _of_cpu_to_node(cpu)?
This function could also be a weak function in numa.c that returns 0.
>
>> +
>> +/**
>> + * early_init_dt_scan_numa_map - parse memory node and map nid to memory range.
>> + */
>> +int __init early_init_dt_scan_numa_map(unsigned long node, const char *uname,
>> +                                  int depth, void *data)
>> +{
>> +     const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
>> +
>> +     /* We are scanning "numa-map" nodes only */
>
> a stale comment?
oops, will remove.
>
>> +/* DT node mapping is done already early_init_dt_scan_memory */
>> +int __init arm64_dt_numa_init(void)
>> +{
>> +     int i;
>> +     u32 nodea, nodeb, distance, node_count = 0;
>> +
>> +     of_scan_flat_dt(early_init_dt_scan_numa_map, NULL);
>> +
>> +     for_each_node_mask(i, numa_nodes_parsed)
>> +             node_count = i;
>> +     node_count++;
>> +
>> +     for (nodea =  0; nodea < node_count; nodea++) {
>> +             for (nodeb = 0; nodeb < node_count; nodeb++) {
>> +                     distance = dt_get_node_distance(nodea, nodeb);
>> +                     numa_set_distance(nodea, nodeb, distance);
>> +             }
>> +     }
>> +     return 0;
>> +}
>> +EXPORT_SYMBOL(arm64_dt_numa_init);
>
> No need to export functions that are called only by architecture code.
> Since this works on the flattened device tree format, you can never
> have loadable modules calling it.
yes, will do.
>
>> @@ -461,7 +464,12 @@ static int c_show(struct seq_file *m, void *v)
>>                * "processor".  Give glibc what it expects.
>>                */
>>  #ifdef CONFIG_SMP
>> +     if (IS_ENABLED(CONFIG_NUMA)) {
>> +             seq_printf(m, "processor\t: %d", i);
>> +             seq_printf(m, " [nid: %d]\n", cpu_to_node(i));
>> +     } else {
>>               seq_printf(m, "processor\t: %d\n", i);
>> +     }
>>  #endif
>>       }
>
> Do we need to make this conditional? I think we can just always
> print the node number, even if it's going to be zero for systems
> without the associativity properties.
yes, we can.
>
>> +
>> +int cpu_to_node_map[NR_CPUS];
>> +EXPORT_SYMBOL(cpu_to_node_map);
>
> This seems to be x86 specific, do we need it?
>
>> +/*
>> + *  Set the cpu to node and mem mapping
>> + */
>> +void numa_store_cpu_info(int cpu)
>> +{
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +     node_cpu_hwid[cpu].node_id  =  dt_get_cpu_node_id(cpu);
>> +#endif
>
> I would try to avoid the #ifdef here, by providing a stub function of
> dt_get_cpu_node_id or whichever function we end up calling here when
> NUMA is disabled.
as commented above.
>
>> +
>> +/**
>> + * arm64_numa_init - Initialize NUMA
>> + *
>> + * Try each configured NUMA initialization method until one succeeds.  The
>> + * last fallback is dummy single node config encompassing whole memory and
>> + * never fails.
>> + */
>> +void __init arm64_numa_init(void)
>> +{
>> +     if (!numa_off) {
>> +#ifdef CONFIG_ARM64_DT_NUMA
>> +             if (!numa_init(arm64_dt_numa_init))
>> +                     return;
>> +#endif
>> +     }
>> +
>> +     numa_init(dummy_numa_init);
>> +}
>
> I don't think we need the CONFIG_ARM64_DT_NUMA=n case here, it should just
> not be conditional, and the arm64_dt_numa_init should fall back to doing
> something reasonable when numa is turned off or there are no associativity
> properties.
I think we can remove the #ifdef; will do.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-02 21:17   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Arnd Bergmann
@ 2015-01-06  9:34     ` Ganapatrao Kulkarni
  2015-01-06 20:02       ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-06  9:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> +
>> +     memory at 00000000 {
>> +             device_type = "memory";
>> +             reg = <0x0 0x00000000 0x0 0x80000000>;
>> +             /* board 0, socket 0, no specific core */
>> +             arm,associativity = <0 0 0xffff>;
>> +     };
>> +
>> +     memory at 10000000000 {
>> +             device_type = "memory";
>> +             reg = <0x100 0x00000000 0x0 0x80000000>;
>> +             /* board 1, socket 0, no specific core */
>> +             arm,associativity = <1 0 0xffff>;
>> +     };
>> +};
>
> So no memory in any other socket?
>
>> +             cpu at 00f {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x00f>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x00f>;
>> +             };
>> +             cpu at 100 {
>> +                     device_type = "cpu";
>> +                     compatible = "cavium,thunder", "arm,armv8";
>> +                     reg = <0x0 0x100>;
>> +                     enable-method = "psci";
>> +                     arm,associativity = <0 0 0x100>;
>> +             };
>
> What is the 0x100 offset in the last-level topology field? Does this have
> no significance to topology at all? I would expect that to be something
> like cluster number that is relevant to caching and should be represented
> as a separate level.
I did not understand; can you please explain a little more about
"should be represented as a separate level"?
At present, I have put the hwid of the cpu.
>
> In contrast, the level-two topology information seems to always be
> zero for all CPUs, so you could probably leave that one out.
>
>> +     soc {
>> +             compatible = "simple-bus";
>> +             #address-cells = <2>;
>> +             #size-cells = <2>;
>> +             ranges;
>
> The soc node is missing a topology information, please add one.
ok, will be added.
>
>         Arnd

thanks
ganapat

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-06  9:25     ` Ganapatrao Kulkarni
@ 2015-01-06 19:59       ` Arnd Bergmann
  2015-01-07  7:09         ` Ganapatrao Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-06 19:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> +int dt_get_cpu_node_id(int cpu)
> >> +{
> >> +     struct device_node *dn = NULL;
> >> +
> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
> >> +             const u32 *cell;
> >> +             u64 hwid;
> >> +
> >> +             /*
> >> +              * A cpu node with missing "reg" property is
> >> +              * considered invalid to build a cpu_logical_map
> >> +              * entry.
> >> +              */
> >> +             cell = of_get_property(dn, "reg", NULL);
> >> +             if (!cell) {
> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
> >> +                     return default_nid;
> >> +             }
> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
> >> +
> >> +             if (cpu_logical_map(cpu) == hwid)
> >> +             return of_node_to_nid_single(dn);
> >> +     }
> >> +     return NUMA_NO_NODE;
> >> +}
> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
> >
> > Maybe just expose a function to the device node for a CPU ID here, and
> > expect callers to use of_node_to_nid?
> shall i make this wrapper function in dt_numa.c, which will use
> functions _of_node_to_nid and  _of_cpu_to_node(cpu)

Yes, I guess that would work.

> And,  this function can be a weak function in numa.c which returns 0.

No, please don't use weak functions. You can either use IS_ENABLED()
tricks to remove function calls at compile-time, or in the header
file provide an inline function as an alternative to the extern
declaration, based on a configuration symbol.
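
A minimal sketch of the header variant, assuming a hypothetical
configuration symbol CONFIG_DT_NUMA_SKETCH and a hypothetical lookup named
dt_cpu_to_nid_sketch(); neither is a real kernel name. When the symbol is
not defined, the header supplies an inline stub, so callers contain no
#ifdef and no weak symbol is needed:

```c
/*
 * Would normally live in a header. CONFIG_DT_NUMA_SKETCH is a made-up
 * symbol standing in for a Kconfig option.
 */
#ifdef CONFIG_DT_NUMA_SKETCH
int dt_cpu_to_nid_sketch(int cpu);	/* real lookup, defined elsewhere */
#else
static inline int dt_cpu_to_nid_sketch(int cpu)
{
	(void)cpu;	/* unused in the stub */
	return 0;	/* everything sits on node 0 without DT NUMA */
}
#endif

#define NR_CPUS_SKETCH 4

static int node_of_cpu_sketch[NR_CPUS_SKETCH];

/* Caller is identical for both configurations. */
static void store_cpu_info_sketch(int cpu)
{
	node_of_cpu_sketch[cpu] = dt_cpu_to_nid_sketch(cpu);
}
```

The compiler can drop the stub call entirely, and a mismatch between the
two prototypes shows up at build time instead of at link time.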

	Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-06  9:34     ` Ganapatrao Kulkarni
@ 2015-01-06 20:02       ` Arnd Bergmann
  2015-01-07  7:07         ` Ganapatrao Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-06 20:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> +             cpu at 00f {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x00f>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x00f>;
> >> +             };
> >> +             cpu at 100 {
> >> +                     device_type = "cpu";
> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> +                     reg = <0x0 0x100>;
> >> +                     enable-method = "psci";
> >> +                     arm,associativity = <0 0 0x100>;
> >> +             };
> >
> > What is the 0x100 offset in the last-level topology field? Does this have
> > no significance to topology at all? I would expect that to be something
> > like cluster number that is relevant to caching and should be represented
> > as a separate level.
>
> i did not understand, can you please explain little more about "
> should be represented as a separate level."
> at present, i have put the hwid of a cpu.

From what I understand, the hwid of the CPU contains the "cluster" number in
this bit position, so you typically have a shared L2 or L3 cache between
all cores within a cluster, but separate caches in other clusters.

If this is the case, there will be a measurable difference in performance
between two processes sharing memory when running on the same cluster,
or when running on different clusters on the same socket. If the
performance difference is relevant, it should be described as a separate
level in the associativity property.

	Arnd

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-06 20:02       ` Arnd Bergmann
@ 2015-01-07  7:07         ` Ganapatrao Kulkarni
  2015-01-07  8:18           ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:07 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Arnd,

On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> >> +             cpu at 00f {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x00f>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x00f>;
>> >> +             };
>> >> +             cpu at 100 {
>> >> +                     device_type = "cpu";
>> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> >> +                     reg = <0x0 0x100>;
>> >> +                     enable-method = "psci";
>> >> +                     arm,associativity = <0 0 0x100>;
>> >> +             };
>> >
>> > What is the 0x100 offset in the last-level topology field? Does this have
>> > no significance to topology at all? I would expect that to be something
>> > like cluster number that is relevant to caching and should be represented
>> > as a separate level.
>>
>> i did not understand, can you please explain little more about "
>> should be represented as a separate level."
>> at present, i have put the hwid of a cpu.
>
> From what I undertand, the hwid of the CPU contains the "cluster" number in
> this bit position, so you typically have a shared L2 or L3 cache between
> all cores within a cluster, but separate caches in other clusters.
>
> If this is the case, there will be a measurable difference in performance
> between two processes sharing memory when running on the same cluster,
> or when running on different clusters on the same socket. If the
> performance difference is relevant, it should be described as a separate
> level in the associativity property.
Do you mean the associativity as an array of <board> <socket> <cluster>?
>
>         Arnd
thanks
ganapat

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms.
  2015-01-06 19:59       ` Arnd Bergmann
@ 2015-01-07  7:09         ` Ganapatrao Kulkarni
  0 siblings, 0 replies; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-07  7:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 7, 2015 at 1:29 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tuesday 06 January 2015 14:55:53 Ganapatrao Kulkarni wrote:
>> On Sat, Jan 3, 2015 at 2:40 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >> +int dt_get_cpu_node_id(int cpu)
>> >> +{
>> >> +     struct device_node *dn = NULL;
>> >> +
>> >> +     while ((dn = of_find_node_by_type(dn, "cpu"))) {
>> >> +             const u32 *cell;
>> >> +             u64 hwid;
>> >> +
>> >> +             /*
>> >> +              * A cpu node with missing "reg" property is
>> >> +              * considered invalid to build a cpu_logical_map
>> >> +              * entry.
>> >> +              */
>> >> +             cell = of_get_property(dn, "reg", NULL);
>> >> +             if (!cell) {
>> >> +                     pr_err("%s: missing reg property\n", dn->full_name);
>> >> +                     return default_nid;
>> >> +             }
>> >> +             hwid = of_read_number(cell, of_n_addr_cells(dn));
>> >> +
>> >> +             if (cpu_logical_map(cpu) == hwid)
>> >> +             return of_node_to_nid_single(dn);
>> >> +     }
>> >> +     return NUMA_NO_NODE;
>> >> +}
>> >> +EXPORT_SYMBOL(dt_get_cpu_node_id);
>> >
>> > Maybe just expose a function to the device node for a CPU ID here, and
>> > expect callers to use of_node_to_nid?
>> shall i make this wrapper function in dt_numa.c, which will use
>> functions _of_node_to_nid and  _of_cpu_to_node(cpu)
>
> Yes, I guess that would work.
>
>> And,  this function can be a weak function in numa.c which returns 0.
>
> No, please don't use weak functions. You can either use IS_ENABLED()
> tricks to remove function calls at compile-time, or in the header
> file provide an inline function as an alternative to the extern
> declaration, based on a configuration symbol.
ok
>
>         Arnd
thanks
ganapat

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-07  7:07         ` Ganapatrao Kulkarni
@ 2015-01-07  8:18           ` Arnd Bergmann
  2015-01-14 17:36             ` Lorenzo Pieralisi
  0 siblings, 1 reply; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-07  8:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> Hi Arnd,
> 
> On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> >> +             cpu at 00f {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x00f>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x00f>;
> >> >> +             };
> >> >> +             cpu at 100 {
> >> >> +                     device_type = "cpu";
> >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> >> +                     reg = <0x0 0x100>;
> >> >> +                     enable-method = "psci";
> >> >> +                     arm,associativity = <0 0 0x100>;
> >> >> +             };
> >> >
> >> > What is the 0x100 offset in the last-level topology field? Does this have
> >> > no significance to topology at all? I would expect that to be something
> >> > like cluster number that is relevant to caching and should be represented
> >> > as a separate level.
> >>
> >> i did not understand, can you please explain little more about "
> >> should be represented as a separate level."
> >> at present, i have put the hwid of a cpu.
> >
> > From what I undertand, the hwid of the CPU contains the "cluster" number in
> > this bit position, so you typically have a shared L2 or L3 cache between
> > all cores within a cluster, but separate caches in other clusters.
> >
> > If this is the case, there will be a measurable difference in performance
> > between two processes sharing memory when running on the same cluster,
> > or when running on different clusters on the same socket. If the
> > performance difference is relevant, it should be described as a separate
> > level in the associativity property.
> You mean the associativity as an array of <board> <socket> <cluster>?

No, that would leave out the core number, which is required to identify
the individual thread. I meant adding an extra level such as

<board> <socket> <cluster> <core>

A lot of machines will leave out the <board> number because they are
built with SoCs that don't have a long-distance coherency protocol.
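
For illustration, the two cpu nodes quoted above might look like this
with the extra level. This is only a sketch: it assumes that bits [15:8]
of the hwid hold the cluster number and bits [7:0] the core number, which
is not confirmed for Thunder in this thread.

```dts
/* Sketch only. Assumes hwid = (cluster << 8) | core; not confirmed for Thunder. */
cpu@00f {
	device_type = "cpu";
	compatible = "cavium,thunder", "arm,armv8";
	reg = <0x0 0x00f>;
	enable-method = "psci";
	/* <board> <socket> <cluster> <core> */
	arm,associativity = <0 0 0 0xf>;
};
cpu@100 {
	device_type = "cpu";
	compatible = "cavium,thunder", "arm,armv8";
	reg = <0x0 0x100>;
	enable-method = "psci";
	/* same board and socket as cpu@00f, but cluster 1 */
	arm,associativity = <0 0 1 0x0>;
};
```

Under that assumption the two cores match at the board and socket levels
and first differ at the cluster level.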

	Arnd


* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-07  8:18           ` Arnd Bergmann
@ 2015-01-14 17:36             ` Lorenzo Pieralisi
  2015-01-14 18:48               ` Ganapatrao Kulkarni
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-14 17:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> > Hi Arnd,
> > 
> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> > >> >> +             cpu@00f {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x00f>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x00f>;
> > >> >> +             };
> > >> >> +             cpu@100 {
> > >> >> +                     device_type = "cpu";
> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> > >> >> +                     reg = <0x0 0x100>;
> > >> >> +                     enable-method = "psci";
> > >> >> +                     arm,associativity = <0 0 0x100>;
> > >> >> +             };
> > >> >
> > >> > What is the 0x100 offset in the last-level topology field? Does this have
> > >> > no significance to topology at all? I would expect that to be something
> > >> > like cluster number that is relevant to caching and should be represented
> > >> > as a separate level.
> > >>
> > >> I did not understand; can you please explain a little more about
> > >> "should be represented as a separate level"?
> > >> At present, I have put the hwid of a cpu.
> > >
> > > From what I understand, the hwid of the CPU contains the "cluster" number in
> > > this bit position, so you typically have a shared L2 or L3 cache between
> > > all cores within a cluster, but separate caches in other clusters.
> > >
> > > If this is the case, there will be a measurable difference in performance
> > > between two processes sharing memory when running on the same cluster,
> > > or when running on different clusters on the same socket. If the
> > > performance difference is relevant, it should be described as a separate
> > > level in the associativity property.
> > You mean the associativity as an array of <board> <socket> <cluster>?
> 
> No, that would leave out the core number, which is required to identify
> the individual thread. I meant adding an extra level such as
> 
> <board> <socket> <cluster> <core>
> 
> A lot of machines will leave out the <board> number because they are
> built with SoCs that don't have a long-distance coherency protocol.

Can't we use phandles to cpu-map nodes instead of a list of numbers (and
yet another topology binding description)?

Is arm,associativity used solely to map "devices" (inclusive of caches)
to a set of CPUs?

cpu-map lacks a notion of distance between hierarchy layers, but we can
add that.

Lorenzo


* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 17:36             ` Lorenzo Pieralisi
@ 2015-01-14 18:48               ` Ganapatrao Kulkarni
  2015-01-14 23:49                 ` Lorenzo Pieralisi
  0 siblings, 1 reply; 17+ messages in thread
From: Ganapatrao Kulkarni @ 2015-01-14 18:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Lorenzo,

On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
<lorenzo.pieralisi@arm.com> wrote:
> On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
>> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
>> > Hi Arnd,
>> >
>> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
>> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
>> > >> >> +             cpu@00f {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x00f>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x00f>;
>> > >> >> +             };
>> > >> >> +             cpu@100 {
>> > >> >> +                     device_type = "cpu";
>> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
>> > >> >> +                     reg = <0x0 0x100>;
>> > >> >> +                     enable-method = "psci";
>> > >> >> +                     arm,associativity = <0 0 0x100>;
>> > >> >> +             };
>> > >> >
>> > >> > What is the 0x100 offset in the last-level topology field? Does this have
>> > >> > no significance to topology at all? I would expect that to be something
>> > >> > like cluster number that is relevant to caching and should be represented
>> > >> > as a separate level.
>> > >>
>> > >> I did not understand; can you please explain a little more about
>> > >> "should be represented as a separate level"?
>> > >> At present, I have put the hwid of a cpu.
>> > >
>> > > From what I understand, the hwid of the CPU contains the "cluster" number in
>> > > this bit position, so you typically have a shared L2 or L3 cache between
>> > > all cores within a cluster, but separate caches in other clusters.
>> > >
>> > > If this is the case, there will be a measurable difference in performance
>> > > between two processes sharing memory when running on the same cluster,
>> > > or when running on different clusters on the same socket. If the
>> > > performance difference is relevant, it should be described as a separate
>> > > level in the associativity property.
>> > You mean the associativity as an array of <board> <socket> <cluster>?
>>
>> No, that would leave out the core number, which is required to identify
>> the individual thread. I meant adding an extra level such as
>>
>> <board> <socket> <cluster> <core>
>>
>> A lot of machines will leave out the <board> number because they are
>> built with SoCs that don't have a long-distance coherency protocol.
>
> Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> yet another topology binding description) ?
cpu-map describes only the CPU topology.
In fact, I initially tried (in the v1 patch set) to use the topology
binding for the NUMA mapping.
However, for NUMA we need to define the association of CPUs, memory and
I/O devices.
arm,associativity is a generic node property and can be used in any DT node.
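
As a sketch of what that could look like outside the cpu nodes, the
fragment below puts the same property on a memory node and on a PCIe host
bridge. The node names, addresses, sizes and the "vendor,example-pcie"
compatible string are placeholders and are not taken from the patch; the
last cell is just a hypothetical per-node identifier.

```dts
/*
 * Placeholders throughout. The parent node is assumed to use
 * #address-cells = <2> and #size-cells = <2>.
 */
memory@100000000 {
	device_type = "memory";
	reg = <0x1 0x00000000 0x0 0x80000000>;
	/* <board> <socket> <id>: memory attached to socket 1 */
	arm,associativity = <0 1 0x0>;
};

pcie@8000000000 {
	compatible = "vendor,example-pcie";	/* hypothetical */
	device_type = "pci";
	reg = <0x80 0x00000000 0x0 0x10000000>;
	/* other properties a real host bridge needs are omitted here */
	/* <board> <socket> <id>: bridge attached to socket 1 */
	arm,associativity = <0 1 0x1>;
};
```

With the shared <board> <socket> prefix, the memory and the bridge can be
matched to the CPUs of socket 1 without either node referring to a cpu node.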
>
> Is arm,associativity used solely to map "devices" (inclusive of caches)
> to a set of cpus ?
>
> cpu-map misses a notion of distance between hierarchy layers, but we can
> add to that.
>
> Lorenzo
thanks
ganapat


* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 18:48               ` Ganapatrao Kulkarni
@ 2015-01-14 23:49                 ` Lorenzo Pieralisi
  2015-01-15 17:32                   ` Arnd Bergmann
  0 siblings, 1 reply; 17+ messages in thread
From: Lorenzo Pieralisi @ 2015-01-14 23:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jan 14, 2015 at 06:48:32PM +0000, Ganapatrao Kulkarni wrote:
> Hi Lorenzo,
> 
> On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
> <lorenzo.pieralisi@arm.com> wrote:
> > On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> >> On Wednesday 07 January 2015 12:37:51 Ganapatrao Kulkarni wrote:
> >> > Hi Arnd,
> >> >
> >> > On Wed, Jan 7, 2015 at 1:32 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> > > On Tuesday 06 January 2015 15:04:26 Ganapatrao Kulkarni wrote:
> >> > >> On Sat, Jan 3, 2015 at 2:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> > >> > On Wednesday 31 December 2014 13:03:27 Ganapatrao Kulkarni wrote:
> >> > >> >> +             cpu@00f {
> >> > >> >> +                     device_type = "cpu";
> >> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> > >> >> +                     reg = <0x0 0x00f>;
> >> > >> >> +                     enable-method = "psci";
> >> > >> >> +                     arm,associativity = <0 0 0x00f>;
> >> > >> >> +             };
> >> > >> >> +             cpu@100 {
> >> > >> >> +                     device_type = "cpu";
> >> > >> >> +                     compatible = "cavium,thunder", "arm,armv8";
> >> > >> >> +                     reg = <0x0 0x100>;
> >> > >> >> +                     enable-method = "psci";
> >> > >> >> +                     arm,associativity = <0 0 0x100>;
> >> > >> >> +             };
> >> > >> >
> >> > >> > What is the 0x100 offset in the last-level topology field? Does this have
> >> > >> > no significance to topology at all? I would expect that to be something
> >> > >> > like cluster number that is relevant to caching and should be represented
> >> > >> > as a separate level.
> >> > >>
> >> > >> I did not understand; can you please explain a little more about
> >> > >> "should be represented as a separate level"?
> >> > >> At present, I have put the hwid of a cpu.
> >> > >
> >> > > From what I understand, the hwid of the CPU contains the "cluster" number in
> >> > > this bit position, so you typically have a shared L2 or L3 cache between
> >> > > all cores within a cluster, but separate caches in other clusters.
> >> > >
> >> > > If this is the case, there will be a measurable difference in performance
> >> > > between two processes sharing memory when running on the same cluster,
> >> > > or when running on different clusters on the same socket. If the
> >> > > performance difference is relevant, it should be described as a separate
> >> > > level in the associativity property.
> >> > You mean the associativity as an array of <board> <socket> <cluster>?
> >>
> >> No, that would leave out the core number, which is required to identify
> >> the individual thread. I meant adding an extra level such as
> >>
> >> <board> <socket> <cluster> <core>
> >>
> >> A lot of machines will leave out the <board> number because they are
> >> built with SoCs that don't have a long-distance coherency protocol.
> >
> > Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> > yet another topology binding description) ?
> cpu-map describes only the CPU topology.
> In fact, I initially tried (in the v1 patch set) to use the topology
> binding for the NUMA mapping.
> However, for NUMA we need to define the association of CPUs, memory and
> I/O devices.
> arm,associativity is a generic node property and can be used in any DT node.

I understand that; I was suggesting that "arm,associativity" be defined
as a phandle in cpu nodes AND all devices.

Why can't you make it point at a node in the cpu-map instead of adding
a tuple that does the same thing? Am I missing something here?
cpu-map lets you describe the system hierarchy and can expand beyond
clusters (several layers of clustering; above the core level it is just
a way to define the system hierarchy, and leaf nodes will always be cores
or threads).
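
A rough sketch of that idea follows. The cpu-map layout follows the
existing topology binding, but giving arm,associativity a phandle value is
hypothetical, as are the labels and the PCIe node.

```dts
cpus {
	#address-cells = <2>;
	#size-cells = <0>;

	cpu-map {
		cluster0 {				/* outer level, e.g. one socket */
			SOCKET0_CLUSTER0: cluster0 {	/* inner cluster in that socket */
				core0 {
					cpu = <&CPU_000>;
				};
			};
		};
	};

	CPU_000: cpu@000 {
		device_type = "cpu";
		compatible = "cavium,thunder", "arm,armv8";
		reg = <0x0 0x000>;
		enable-method = "psci";
	};
};

/* Hypothetical: the device names a cpu-map node instead of carrying a tuple. */
pcie@8000000000 {
	arm,associativity = <&SOCKET0_CLUSTER0>;
};
```

The device's position in the hierarchy would then be read from where the
referenced node sits in cpu-map, rather than from a list of numbers.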

On a side note, one of the reasons cpu-map was devised was exactly
that: to allow mapping resources (i.e. IRQs, but it is valid for caches
and other devices too) to groups of CPUs.

Is there anything that you can't do by using cpu-map phandles to
describe device associativity?

We would have to add bindings that allow the distance to be computed, as
you do with the reference points (I am reading the code to figure
out how they are used), but that's feasible as a binding update.

Lorenzo


* [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology.
  2015-01-14 23:49                 ` Lorenzo Pieralisi
@ 2015-01-15 17:32                   ` Arnd Bergmann
  0 siblings, 0 replies; 17+ messages in thread
From: Arnd Bergmann @ 2015-01-15 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday 14 January 2015 23:49:05 Lorenzo Pieralisi wrote:
> On Wed, Jan 14, 2015 at 06:48:32PM +0000, Ganapatrao Kulkarni wrote:
> > On Wed, Jan 14, 2015 at 11:06 PM, Lorenzo Pieralisi
> > <lorenzo.pieralisi@arm.com> wrote:
> > > On Wed, Jan 07, 2015 at 08:18:50AM +0000, Arnd Bergmann wrote:
> > >> No, that would leave out the core number, which is required to identify
> > >> the individual thread. I meant adding an extra level such as
> > >>
> > >> <board> <socket> <cluster> <core>
> > >>
> > >> A lot of machines will leave out the <board> number because they are
> > >> built with SoCs that don't have a long-distance coherency protocol.
> > >
> > > Can't we use phandles to cpu-map nodes instead of a list of numbers (and
> > > yet another topology binding description) ?
> > cpu-map describes only the CPU topology.
> > In fact, I initially tried (in the v1 patch set) to use the topology
> > binding for the NUMA mapping.
> > However, for NUMA we need to define the association of CPUs, memory and
> > I/O devices.
> > arm,associativity is a generic node property and can be used in any DT node.
> 
> I understand that; I was suggesting that "arm,associativity" be defined
> as a phandle in cpu nodes AND all devices.
> 
> Why can't you make it point at a node in the cpu-map instead of adding
> a tuple that does the same thing? Am I missing something here?

Most importantly, it's following an existing spec for ibm,associativity,
which defines topology in terms of associativity, not a hierarchical tree.

> cpu-map lets you describe the system hierarchy and can expand beyond
> clusters (several layers of clustering; above the core level it is just
> a way to define the system hierarchy, and leaf nodes will always be cores
> or threads).

> On a side note, one of the reasons cpu-map was devised was exactly
> that: to allow mapping resources (i.e. IRQs, but it is valid for caches
> and other devices too) to groups of CPUs.
> 
> Is there anything that you can't do by using cpu-map phandles to
> describe device associativity?

- It doesn't work for cpu-less nodes.
- It fails if you have multiple paths between two devices, rather than
  a strict tree.
- It doesn't (yet) have a way to define which levels are relevant to NUMA
  topology.
- The phandle references are done in the wrong way if you want to
  represent a lot of devices.

> We would have to add bindings that allow the distance to be computed, as
> you do with the reference points (I am reading the code to figure
> out how they are used), but that's feasible as a binding update.

It's very unfortunate that we have two established bindings that
conflict. I still think that the associativity binding is more
flexible. We could try to extend the arm topology binding if
necessary, but I'm not sure the end result of that would be better.

	Arnd


end of thread, other threads:[~2015-01-15 17:32 UTC | newest]

Thread overview: 17+ messages
     [not found] <1420011208-7051-1-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
     [not found] ` <1420011208-7051-5-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
2015-01-02 21:10   ` [RFC PATCH v3 4/4] arm64:numa: adding numa support for arm64 platforms Arnd Bergmann
2015-01-06  9:25     ` Ganapatrao Kulkarni
2015-01-06 19:59       ` Arnd Bergmann
2015-01-07  7:09         ` Ganapatrao Kulkarni
     [not found] ` <1420011208-7051-3-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
2015-01-02 21:17   ` [RFC PATCH v3 2/4] Documentation: arm64/arm: dt bindings for numa Arnd Bergmann
2015-01-06  5:28     ` Ganapatrao Kulkarni
     [not found] ` <1420011208-7051-2-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
2015-01-02 21:17   ` [RFC PATCH v3 1/4] arm64: defconfig: increase NR_CPUS range to 2-4096 Arnd Bergmann
     [not found] ` <1420011208-7051-4-git-send-email-ganapatrao.kulkarni@caviumnetworks.com>
2015-01-02 21:17   ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Arnd Bergmann
2015-01-06  9:34     ` Ganapatrao Kulkarni
2015-01-06 20:02       ` Arnd Bergmann
2015-01-07  7:07         ` Ganapatrao Kulkarni
2015-01-07  8:18           ` Arnd Bergmann
2015-01-14 17:36             ` Lorenzo Pieralisi
2015-01-14 18:48               ` Ganapatrao Kulkarni
2015-01-14 23:49                 ` Lorenzo Pieralisi
2015-01-15 17:32                   ` Arnd Bergmann
2014-12-31  7:36 [RFC PATCH v3 0/4] arm64:numa: Add numa support for arm64 platforms Ganapatrao Kulkarni
2014-12-31  7:36 ` [RFC PATCH v3 3/4] arm64:thunder: Add initial dts for Cavium's Thunder SoC in 2 Node topology Ganapatrao Kulkarni
