linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/7] DT: Enable sharing resources for SMT threads
@ 2025-05-12  8:07 Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 1/7] of: add infra for finding CPU id from phandle Alireza Sanaee
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

This patchset allows for sharing resources between SMT threads in the
device tree (DT).

WHY? With the current use of the DT, it is not possible to share L1
caches, or other resources such as clocks, among SMT threads.
However, Section 3.8.1 of the DT spec [1] describes how SMT threads
can be described in the reg array; this is how PowerPC describes SMT
threads in DT.

CHALLENGE: Discussions with the community [2] made it apparent
that this is not straightforward to implement, since cpu-map entries must
point to a particular CPU node in DT ([3], Section 2.1). Moreover, it is
not only the cpu-map: other nodes also point to cpu nodes and need care
and changes.

SOLUTION: This led to more discussions on what the solution should look
like and based on recent conversations we ended up with the following
approach [4].

core0 {
  thread0 {
    cpu = <&cpu0 0>;
  };
  thread1 {
    cpu = <&cpu0 1>;
  };
};

In this layout, the first parameter is the phandle to the cpu-node and the
second is the local thread index into the array given by the cpu-node's
reg property.
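
As a rough illustration only (not code from this series), the lookup
implied by this layout can be modelled in plain userspace C. The names
struct logical_cpu and cpu_ref_to_id() are hypothetical, and a bare
pointer stands in for the cpu node that a phandle resolves to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one entry per logical CPU known to the system. */
struct logical_cpu {
	const void *node;      /* stands in for the cpu node behind a phandle */
	uint32_t local_thread; /* index of this thread in the node's reg array */
};

/*
 * Return the logical CPU whose (node, local thread) pair matches the
 * reference <&node thread_index>, or -1 if no logical CPU matches.
 */
int cpu_ref_to_id(const struct logical_cpu *cpus, size_t ncpus,
		  const void *node, uint32_t thread_index)
{
	assert(cpus != NULL || ncpus == 0);

	for (size_t i = 0; i < ncpus; i++) {
		if (cpus[i].node == node &&
		    cpus[i].local_thread == thread_index)
			return (int)i;
	}

	return -1;
}
```

With two threads sharing one cpu node, <&cpu0 0> and <&cpu0 1> resolve
to two different logical CPUs even though the phandle is the same.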

SIDE-NOTE: This patchset does not change any bindings, so no binding
updates are included in this patchset.

[1] https://github.com/devicetree-org/devicetree-specification/releases/download/v0.4/devicetree-specification-v0.4.pdf
[2] https://lore.kernel.org/linux-arm-kernel/Z4FJZPRg75YIUR2l@J2N7QTR9R3/
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/cpu/cpu-topology.txt
[4] https://lore.kernel.org/devicetree-spec/CAL_JsqK1yqRLD9B+G7UUp=D8K++mXHq0Rmv=1i6DL_jXyZwXAw@mail.gmail.com/

PRIOR VERSIONS:
   [V1] https://lore.kernel.org/all/20250422084340.457-1-alireza.sanaee@huawei.com/
   [V2] https://lore.kernel.org/all/20250502161300.1411-1-alireza.sanaee@huawei.com/

CHANGE LOG:
    V2 -> V3:
        * I got the V2 completely wrong, so I updated it.
        * Re-introduce #cpu-cells property.
    V1 -> V2:
        * Address Rob's comments.
            ** Re-order patches.
            ** Fix bugs.
        * Remove #cpu-cells property

Alireza Sanaee (7):
  of: add infra for finding CPU id from phandle
  arch_topology: update CPU map to use the new API
  coresight: cti: Use of_cpu_phandle_to_id for grabbing CPU id
  coresight: Use of_cpu_phandle_to_id for grabbing CPU id
  perf/arm-dsu: refactor cpu id retrieval via new API
    of_cpu_phandle_to_id
  arm64: of: handle multiple threads in ARM cpu node
  of: of_cpu_phandle_to_id to support SMT threads

 arch/arm64/kernel/smp.c                       | 74 ++++++++++---------
 drivers/base/arch_topology.c                  | 12 +--
 .../coresight/coresight-cti-platform.c        | 15 +---
 .../hwtracing/coresight/coresight-platform.c  | 14 +---
 drivers/of/cpu.c                              | 56 +++++++++++++-
 drivers/perf/arm_dsu_pmu.c                    |  6 +-
 include/linux/of.h                            |  9 +++
 7 files changed, 118 insertions(+), 68 deletions(-)

-- 
2.34.1



* [PATCH v3 1/7] of: add infra for finding CPU id from phandle
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-28 15:54   ` Jonathan Cameron
  2025-05-12  8:07 ` [PATCH v3 2/7] arch_topology: update CPU map to use the new API Alireza Sanaee
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Get CPU id from phandle. Many drivers do this by first getting hold of
the CPU node through a phandle and then finding the CPU ID using the
relevant function. This commit encapsulates the CPU node lookup and
improves readability.

The API takes three parameters: 1) the node, 2) a pointer to the CPU
node, and 3) the index of the CPU in the list of CPUs. The API sets the
pointer to the CPU node, which lets the driver use the CPU node itself,
for logging purposes for instance.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 drivers/of/cpu.c   | 29 +++++++++++++++++++++++++++++
 include/linux/of.h |  9 +++++++++
 2 files changed, 38 insertions(+)

diff --git a/drivers/of/cpu.c b/drivers/of/cpu.c
index 5214dc3d05ae..fba17994fc20 100644
--- a/drivers/of/cpu.c
+++ b/drivers/of/cpu.c
@@ -173,6 +173,35 @@ int of_cpu_node_to_id(struct device_node *cpu_node)
 }
 EXPORT_SYMBOL(of_cpu_node_to_id);
 
+/**
+ * of_cpu_phandle_to_id: Get the logical CPU number for a given device_node
+ *
+ * @node: Pointer to the device_node containing CPU phandle.
+ * @cpu_np: Pointer to the device_node for CPU.
+ * @cpu_idx: The index of the CPU in the list of CPUs.
+ *
+ * Return: The logical CPU number of the given CPU device_node or -ENODEV if
+ * the CPU is not found, or if the node is NULL, it returns -1. On success,
+ * cpu_np will always point to the retrieved CPU device_node with refcount
+ * incremented, use of_node_put() on it when done.
+ */
+int of_cpu_phandle_to_id(const struct device_node *node,
+			 struct device_node **cpu_np,
+			 uint8_t cpu_idx)
+{
+	if (!node)
+		return -1;
+
+	*cpu_np = of_parse_phandle(node, "cpu", 0);
+	if (!*cpu_np)
+		*cpu_np = of_parse_phandle(node, "cpus", cpu_idx);
+			if (!*cpu_np)
+				return -ENODEV;
+
+	return of_cpu_node_to_id(*cpu_np);
+}
+EXPORT_SYMBOL(of_cpu_phandle_to_id);
+
 /**
  * of_get_cpu_state_node - Get CPU's idle state node at the given index
  *
diff --git a/include/linux/of.h b/include/linux/of.h
index eaf0e2a2b75c..194f1cb0f6c6 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -360,6 +360,8 @@ extern const void *of_get_property(const struct device_node *node,
 extern struct device_node *of_get_cpu_node(int cpu, unsigned int *thread);
 extern struct device_node *of_cpu_device_node_get(int cpu);
 extern int of_cpu_node_to_id(struct device_node *np);
+extern int of_cpu_phandle_to_id(const struct device_node *np,
+				struct device_node **cpu_np, uint8_t cpu_idx);
 extern struct device_node *of_get_next_cpu_node(struct device_node *prev);
 extern struct device_node *of_get_cpu_state_node(const struct device_node *cpu_node,
 						 int index);
@@ -662,6 +664,13 @@ static inline int of_cpu_node_to_id(struct device_node *np)
 	return -ENODEV;
 }
 
+static inline int of_cpu_phandle_to_id(const struct device_node *np,
+				       struct device_node **cpu_np,
+				       uint8_t cpu_idx)
+{
+	return -ENODEV;
+}
+
 static inline struct device_node *of_get_next_cpu_node(struct device_node *prev)
 {
 	return NULL;
-- 
2.34.1



* [PATCH v3 2/7] arch_topology: update CPU map to use the new API
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 1/7] of: add infra for finding CPU id from phandle Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 3/7] coresight: cti: Use of_cpu_phandle_to_id for grabbing CPU id Alireza Sanaee
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Clean up the cpu-map generation using the newly created API.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 drivers/base/arch_topology.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 3ebe77566788..88970f13f684 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -518,23 +518,23 @@ core_initcall(free_raw_capacity);
  */
 static int __init get_cpu_for_node(struct device_node *node)
 {
+	struct device_node *cpu_node __free(device_node) = NULL;
 	int cpu;
-	struct device_node *cpu_node __free(device_node) =
-		of_parse_phandle(node, "cpu", 0);
 
-	if (!cpu_node)
-		return -1;
+	cpu = of_cpu_phandle_to_id(node, &cpu_node, 0);
 
-	cpu = of_cpu_node_to_id(cpu_node);
 	if (cpu >= 0)
 		topology_parse_cpu_capacity(cpu_node, cpu);
-	else
+	else if (cpu == -ENODEV)
 		pr_info("CPU node for %pOF exist but the possible cpu range is :%*pbl\n",
 			cpu_node, cpumask_pr_args(cpu_possible_mask));
+	else
+		return -1;
 
 	return cpu;
 }
 
+
 static int __init parse_core(struct device_node *core, int package_id,
 			     int cluster_id, int core_id)
 {
-- 
2.34.1



* [PATCH v3 3/7] coresight: cti: Use of_cpu_phandle_to_id for grabbing CPU id
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 1/7] of: add infra for finding CPU id from phandle Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 2/7] arch_topology: update CPU map to use the new API Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 4/7] coresight: " Alireza Sanaee
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Use the newly created API to grab CPU id.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 .../hwtracing/coresight/coresight-cti-platform.c  | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-cti-platform.c b/drivers/hwtracing/coresight/coresight-cti-platform.c
index d0ae10bf6128..cd821e926792 100644
--- a/drivers/hwtracing/coresight/coresight-cti-platform.c
+++ b/drivers/hwtracing/coresight/coresight-cti-platform.c
@@ -41,21 +41,12 @@
  */
 static int of_cti_get_cpu_at_node(const struct device_node *node)
 {
+	struct device_node *dn = NULL;
 	int cpu;
-	struct device_node *dn;
 
-	if (node == NULL)
-		return -1;
-
-	dn = of_parse_phandle(node, "cpu", 0);
-	/* CTI affinity defaults to no cpu */
-	if (!dn)
-		return -1;
-	cpu = of_cpu_node_to_id(dn);
+	cpu = of_cpu_phandle_to_id(node, &dn, 0);
 	of_node_put(dn);
-
-	/* No Affinity  if no cpu nodes are found */
-	return (cpu < 0) ? -1 : cpu;
+	return cpu;
 }
 
 #else
-- 
2.34.1



* [PATCH v3 4/7] coresight: Use of_cpu_phandle_to_id for grabbing CPU id
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
                   ` (2 preceding siblings ...)
  2025-05-12  8:07 ` [PATCH v3 3/7] coresight: cti: Use of_cpu_phandle_to_id for grabbing CPU id Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 5/7] perf/arm-dsu: refactor cpu id retrieval via new API of_cpu_phandle_to_id Alireza Sanaee
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Use the newly created API to grab CPU id.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 drivers/hwtracing/coresight/coresight-platform.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-platform.c b/drivers/hwtracing/coresight/coresight-platform.c
index 8192ba3279f0..f032fdbe959b 100644
--- a/drivers/hwtracing/coresight/coresight-platform.c
+++ b/drivers/hwtracing/coresight/coresight-platform.c
@@ -167,19 +167,9 @@ of_coresight_get_output_ports_node(const struct device_node *node)
 
 static int of_coresight_get_cpu(struct device *dev)
 {
-	int cpu;
-	struct device_node *dn;
-
-	if (!dev->of_node)
-		return -ENODEV;
-
-	dn = of_parse_phandle(dev->of_node, "cpu", 0);
-	if (!dn)
-		return -ENODEV;
-
-	cpu = of_cpu_node_to_id(dn);
+	struct device_node *dn = NULL;
+	int cpu = of_cpu_phandle_to_id(dev->of_node, &dn, 0);
 	of_node_put(dn);
-
 	return cpu;
 }
 
-- 
2.34.1



* [PATCH v3 5/7] perf/arm-dsu: refactor cpu id retrieval via new API of_cpu_phandle_to_id
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
                   ` (3 preceding siblings ...)
  2025-05-12  8:07 ` [PATCH v3 4/7] coresight: " Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 6/7] arm64: of: handle multiple threads in ARM cpu node Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 7/7] of: of_cpu_phandle_to_id to support SMT threads Alireza Sanaee
  6 siblings, 0 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Update arm-dsu to use the new API, where both "cpus" and "cpu"
properties are supported.

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 drivers/perf/arm_dsu_pmu.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index cb4fb59fe04b..7ef204d39173 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -596,11 +596,9 @@ static int dsu_pmu_dt_get_cpus(struct device *dev, cpumask_t *mask)
 	n = of_count_phandle_with_args(dev->of_node, "cpus", NULL);
 	if (n <= 0)
 		return -ENODEV;
+
 	for (; i < n; i++) {
-		cpu_node = of_parse_phandle(dev->of_node, "cpus", i);
-		if (!cpu_node)
-			break;
-		cpu = of_cpu_node_to_id(cpu_node);
+		cpu = of_cpu_phandle_to_id(dev->of_node, &cpu_node, i);
 		of_node_put(cpu_node);
 		/*
 		 * We have to ignore the failures here and continue scanning
-- 
2.34.1



* [PATCH v3 6/7] arm64: of: handle multiple threads in ARM cpu node
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
                   ` (4 preceding siblings ...)
  2025-05-12  8:07 ` [PATCH v3 5/7] perf/arm-dsu: refactor cpu id retrieval via new API of_cpu_phandle_to_id Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-05-12  8:07 ` [PATCH v3 7/7] of: of_cpu_phandle_to_id to support SMT threads Alireza Sanaee
  6 siblings, 0 replies; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Update `of_parse_and_init_cpus` to parse the reg property of a CPU node
as an array, as per the spec for SMT threads.

Spec v0.4 Section 3.8.1:
The value of reg is a <prop-encoded-**array**> that defines a unique
CPU/thread id for the CPU/threads represented by the CPU node.  **If a CPU
supports more than one thread (i.e.  multiple streams of execution) the
reg property is an array with 1 element per thread**.  The address-cells
on the /cpus node specifies how many cells each element of the array
takes. Software can determine the number of threads by dividing the size
of reg by the parent node's address-cells.

An accurate example of 1 core with 2 SMT threads:

	cpus {
		#size-cells = <0x00>;
		#address-cells = <0x01>;

		cpu@0 {
			phandle = <0x8000>;
			**reg = <0x00 0x01>;**
			enable-method = "psci";
			compatible = "arm,cortex-a57";
			device_type = "cpu";
		};
	};

Instead of:

	cpus {
		#size-cells = <0x00>;
		#address-cells = <0x01>;

		cpu@0 {
			phandle = <0x8000>;
			reg = <0x00>;
			enable-method = "psci";
			compatible = "arm,cortex-a57";
			device_type = "cpu";
		};

		cpu@1 {
			phandle = <0x8001>;
			reg = <0x01>;
			enable-method = "psci";
			compatible = "arm,cortex-a57";
			device_type = "cpu";
		};
	};

which is **NOT** accurate.
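
For illustration only, the arithmetic from the spec text quoted above
can be sketched in userspace C. The function names are hypothetical, the
cells are assumed to be already converted to host byte order, and the
~0ULL "no such thread" value mirrors the check used in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of threads described by a cpu node's reg property. */
size_t reg_thread_count(size_t reg_cells, uint32_t address_cells)
{
	/* /cpus uses one or two address cells in this simplified model. */
	assert(address_cells == 1 || address_cells == 2);

	return reg_cells / address_cells;
}

/* Hardware id of the given thread, or ~0ULL if the index is out of range. */
uint64_t reg_thread_hwid(const uint32_t *reg, size_t reg_cells,
			 uint32_t address_cells, size_t thread)
{
	uint64_t hwid = 0;

	if (thread >= reg_thread_count(reg_cells, address_cells))
		return ~0ULL;

	/* Concatenate the cells of this element, most significant first. */
	for (uint32_t i = 0; i < address_cells; i++)
		hwid = (hwid << 32) | reg[thread * address_cells + i];

	return hwid;
}
```

For reg = <0x00 0x01> with #address-cells = <1>, this gives two threads
with hardware ids 0x0 and 0x1; asking for a third thread yields ~0ULL,
which is what terminates a per-node thread loop.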

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
---
 arch/arm64/kernel/smp.c | 74 +++++++++++++++++++++++------------------
 1 file changed, 41 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..8dd3b3c82967 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -689,53 +689,61 @@ static void __init acpi_parse_and_init_cpus(void)
 static void __init of_parse_and_init_cpus(void)
 {
 	struct device_node *dn;
+	u64 hwid;
+	u32 tid;
 
 	for_each_of_cpu_node(dn) {
-		u64 hwid = of_get_cpu_hwid(dn, 0);
+		tid = 0;
 
-		if (hwid & ~MPIDR_HWID_BITMASK)
-			goto next;
+		while (1) {
+			hwid = of_get_cpu_hwid(dn, tid++);
+			if (hwid == ~0ULL)
+				break;
 
-		if (is_mpidr_duplicate(cpu_count, hwid)) {
-			pr_err("%pOF: duplicate cpu reg properties in the DT\n",
-				dn);
-			goto next;
-		}
+			if (hwid & ~MPIDR_HWID_BITMASK)
+				goto next;
 
-		/*
-		 * The numbering scheme requires that the boot CPU
-		 * must be assigned logical id 0. Record it so that
-		 * the logical map built from DT is validated and can
-		 * be used.
-		 */
-		if (hwid == cpu_logical_map(0)) {
-			if (bootcpu_valid) {
-				pr_err("%pOF: duplicate boot cpu reg property in DT\n",
-					dn);
+			if (is_mpidr_duplicate(cpu_count, hwid)) {
+				pr_err("%pOF: duplicate cpu reg properties in the DT\n",
+				       dn);
 				goto next;
 			}
 
-			bootcpu_valid = true;
-			early_map_cpu_to_node(0, of_node_to_nid(dn));
-
 			/*
-			 * cpu_logical_map has already been
-			 * initialized and the boot cpu doesn't need
-			 * the enable-method so continue without
-			 * incrementing cpu.
+			 * The numbering scheme requires that the boot CPU
+			 * must be assigned logical id 0. Record it so that
+			 * the logical map built from DT is validated and can
+			 * be used.
 			 */
-			continue;
-		}
+			if (hwid == cpu_logical_map(0)) {
+				if (bootcpu_valid) {
+					pr_err("%pOF: duplicate boot cpu reg property in DT\n",
+					       dn);
+					goto next;
+				}
+
+				bootcpu_valid = true;
+				early_map_cpu_to_node(0, of_node_to_nid(dn));
+
+				/*
+				 * cpu_logical_map has already been
+				 * initialized and the boot cpu doesn't need
+				 * the enable-method so continue without
+				 * incrementing cpu.
+				 */
+				continue;
+			}
 
-		if (cpu_count >= NR_CPUS)
-			goto next;
+			if (cpu_count >= NR_CPUS)
+				goto next;
 
-		pr_debug("cpu logical map 0x%llx\n", hwid);
-		set_cpu_logical_map(cpu_count, hwid);
+			pr_debug("cpu logical map 0x%llx\n", hwid);
+			set_cpu_logical_map(cpu_count, hwid);
 
-		early_map_cpu_to_node(cpu_count, of_node_to_nid(dn));
+			early_map_cpu_to_node(cpu_count, of_node_to_nid(dn));
 next:
-		cpu_count++;
+			cpu_count++;
+		}
 	}
 }
 
-- 
2.34.1



* [PATCH v3 7/7] of: of_cpu_phandle_to_id to support SMT threads
  2025-05-12  8:07 [PATCH v3 0/7] DT: Enable sharing resources for SMT threads Alireza Sanaee
                   ` (5 preceding siblings ...)
  2025-05-12  8:07 ` [PATCH v3 6/7] arm64: of: handle multiple threads in ARM cpu node Alireza Sanaee
@ 2025-05-12  8:07 ` Alireza Sanaee
  2025-06-06 14:18   ` Rob Herring
  6 siblings, 1 reply; 10+ messages in thread
From: Alireza Sanaee @ 2025-05-12  8:07 UTC (permalink / raw)
  To: devicetree
  Cc: robh, alireza.sanaee, jonathan.cameron, linux-arm-kernel,
	linux-kernel, linuxarm, mark.rutland, shameerali.kolothum.thodi,
	krzk, dianders, catalin.marinas, suzuki.poulose, mike.leach,
	james.clark, linux-perf-users, coresight, gshan, ruanjinjie,
	saravanak

Enhance the API to support SMT threads; this allows sharing resources
among multiple SMT threads.

This enables the sharing of resources, such as L1 cache and clocks,
between SMT threads. It uses thread IDs to match each CPU thread in the
reg array within the cpu-node. This ensures that the cpu-map, or any
driver relying on this API, keeps working even when SMT threads share
resources.

Additionally, I have tested this for CPU. Based on the discussions in
[1], I adopted the new cpu-map layout, where the first parameter is a
phandle and the second is the local thread index, as shown below:

    core0 {
      thread0 {
        cpu = <&cpu0 0>;
      };
      thread1 {
        cpu = <&cpu0 1>;
      };
    };

Also, some devices, such as the one below, are a bit different.

    arm_dsu@0 {
      compatible = "arm,dsu";
      cpus = <&cpu0 &cpu1 &cpu2 &cpu3>;
    };

In these cases, we can also point to a CPU thread, as follows:

    arm_dsu@0 {
      compatible = "arm,dsu";
      cpus = <&cpu0 5 &cpu1 9 &cpu2 1 &cpu3 0>;
    };

It must be possible to know how many arguments a phandle requires; this
information is encoded in a property called #cpu-cells in the cpu node.
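
A simplified userspace sketch of how #cpu-cells drives the decoding of
such a list follows. It is an illustration under stated assumptions, not
the kernel implementation: struct cpu_node, struct cpu_ref, find_node()
and parse_cpu_ref() are hypothetical names, cells are in host byte
order, and at most one argument cell per phandle is accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpu node. */
struct cpu_node {
	uint32_t phandle;
	uint32_t cpu_cells; /* 0 when the node has no #cpu-cells property */
};

/* One decoded entry of a "cpu" or "cpus" property. */
struct cpu_ref {
	const struct cpu_node *node;
	uint32_t thread; /* local thread index, 0 if no argument cell */
};

const struct cpu_node *find_node(const struct cpu_node *nodes, size_t n,
				 uint32_t phandle)
{
	for (size_t i = 0; i < n; i++)
		if (nodes[i].phandle == phandle)
			return &nodes[i];
	return NULL;
}

/*
 * Decode entry 'index' of a flattened cell list. Each entry is a phandle
 * followed by as many argument cells as the target's #cpu-cells says.
 * Returns 0 on success, -1 if the list is malformed or too short.
 */
int parse_cpu_ref(const uint32_t *cells, size_t ncells,
		  const struct cpu_node *nodes, size_t nnodes,
		  size_t index, struct cpu_ref *out)
{
	size_t pos = 0;

	assert(out != NULL);

	for (size_t entry = 0; pos < ncells; entry++) {
		const struct cpu_node *node =
			find_node(nodes, nnodes, cells[pos]);

		if (!node || node->cpu_cells > 1 ||
		    pos + 1 + node->cpu_cells > ncells)
			return -1;

		if (entry == index) {
			out->node = node;
			out->thread = node->cpu_cells ? cells[pos + 1] : 0;
			return 0;
		}

		/* Skip the phandle and its argument cells. */
		pos += 1 + node->cpu_cells;
	}

	return -1;
}
```

The entry width is taken from the target node, so a list can mix cpu
nodes with and without #cpu-cells and still be walked unambiguously.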

Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>

[1] https://lore.kernel.org/devicetree-spec/CAL_JsqK1yqRLD9B+G7UUp=D8K++mXHq0Rmv=1i6DL_jXyZwXAw@mail.gmail.com/
---
 drivers/of/cpu.c | 41 +++++++++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/of/cpu.c b/drivers/of/cpu.c
index fba17994fc20..cf54ef47f029 100644
--- a/drivers/of/cpu.c
+++ b/drivers/of/cpu.c
@@ -189,16 +189,41 @@ int of_cpu_phandle_to_id(const struct device_node *node,
 			 struct device_node **cpu_np,
 			 uint8_t cpu_idx)
 {
+	bool found = false;
+	int cpu, ret = -1, i, j;
+	uint32_t local_thread, thread_index;
+	struct device_node *np;
+	struct of_phandle_args args;
+	static const char * const phandle_names[] = { "cpus", "cpu" };
+	static const char * const cpu_cells[] = { "#cpu-cells", NULL };
+
 	if (!node)
-		return -1;
+		return ret;
 
-	*cpu_np = of_parse_phandle(node, "cpu", 0);
-	if (!*cpu_np)
-		*cpu_np = of_parse_phandle(node, "cpus", cpu_idx);
-			if (!*cpu_np)
-				return -ENODEV;
+	for (i = 0; i < ARRAY_SIZE(phandle_names); i++) {
+		for (j = 0; j < ARRAY_SIZE(cpu_cells); j++) {
+			ret = of_parse_phandle_with_args(node, phandle_names[i],
+							 cpu_cells[j], cpu_idx,
+							 &args);
+				if (ret >= 0)
+					goto success;
+		}
+	}
 
-	return of_cpu_node_to_id(*cpu_np);
+	if (ret < 0)
+		return ret;
+success:
+	*cpu_np = args.np;
+	thread_index = args.args_count == 1 ? args.args[0] : 0;
+	for_each_possible_cpu(cpu) {
+		np = of_get_cpu_node(cpu, &local_thread);
+		found = (*cpu_np == np) && (local_thread == thread_index);
+		of_node_put(np);
+		if (found)
+			return cpu;
+	}
+
+	return -ENODEV;
 }
 EXPORT_SYMBOL(of_cpu_phandle_to_id);
 
@@ -206,7 +231,7 @@ EXPORT_SYMBOL(of_cpu_phandle_to_id);
  * of_get_cpu_state_node - Get CPU's idle state node at the given index
  *
  * @cpu_node: The device node for the CPU
- * @index: The index in the list of the idle states
+g* @index: The index in the list of the idle states
  *
  * Two generic methods can be used to describe a CPU's idle states, either via
  * a flattened description through the "cpu-idle-states" binding or via the
-- 
2.34.1



* Re: [PATCH v3 1/7] of: add infra for finding CPU id from phandle
  2025-05-12  8:07 ` [PATCH v3 1/7] of: add infra for finding CPU id from phandle Alireza Sanaee
@ 2025-05-28 15:54   ` Jonathan Cameron
  0 siblings, 0 replies; 10+ messages in thread
From: Jonathan Cameron @ 2025-05-28 15:54 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: devicetree, robh, linux-arm-kernel, linux-kernel, linuxarm,
	mark.rutland, shameerali.kolothum.thodi, krzk, dianders,
	catalin.marinas, suzuki.poulose, mike.leach, james.clark,
	linux-perf-users, coresight, gshan, ruanjinjie, saravanak

On Mon, 12 May 2025 09:07:09 +0100
Alireza Sanaee <alireza.sanaee@huawei.com> wrote:

> Get CPU id from phandle. Many drivers get do this by getting hold of CPU
> node first through a phandle and then find the CPU ID using the relevant
> function. This commit encapsulates cpu node finding and improves
> readability.
> 
> The API interface requires two parameters, 1) node, 2) pointer to CPU
> node. API sets the pointer to the CPU node and allows the driver to play
> with the CPU itself, for logging purposes for instance.
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> ---
>  drivers/of/cpu.c   | 29 +++++++++++++++++++++++++++++
>  include/linux/of.h |  9 +++++++++
>  2 files changed, 38 insertions(+)
> 
> diff --git a/drivers/of/cpu.c b/drivers/of/cpu.c
> index 5214dc3d05ae..fba17994fc20 100644
> --- a/drivers/of/cpu.c
> +++ b/drivers/of/cpu.c
> @@ -173,6 +173,35 @@ int of_cpu_node_to_id(struct device_node *cpu_node)
>  }
>  EXPORT_SYMBOL(of_cpu_node_to_id);
>  
> +/**
> + * of_cpu_phandle_to_id: Get the logical CPU number for a given device_node
> + *
> + * @node: Pointer to the device_node containing CPU phandle.
> + * @cpu_np: Pointer to the device_node for CPU.
> + * @cpu_idx: The index of the CPU in the list of CPUs.
> + *
> + * Return: The logical CPU number of the given CPU device_node or -ENODEV if
> + * the CPU is not found, or if the node is NULL, it returns -1. On success,
> + * cpu_np will always point to the retrieved CPU device_node with refcount
> + * incremented, use of_node_put() on it when done.
> + */
> +int of_cpu_phandle_to_id(const struct device_node *node,
> +			 struct device_node **cpu_np,
> +			 uint8_t cpu_idx)
> +{
> +	if (!node)
> +		return -1;
> +
> +	*cpu_np = of_parse_phandle(node, "cpu", 0);
> +	if (!*cpu_np)
> +		*cpu_np = of_parse_phandle(node, "cpus", cpu_idx);
> +			if (!*cpu_np)
> +				return -ENODEV;

Indent has gone a bit crazy here.

> +
> +	return of_cpu_node_to_id(*cpu_np);
> +}
> +EXPORT_SYMBOL(of_cpu_phandle_to_id);
> +
>  /**
>   * of_get_cpu_state_node - Get CPU's idle state node at the given index
>   *
> diff --git a/include/linux/of.h b/include/linux/of.h
> index eaf0e2a2b75c..194f1cb0f6c6 100644
> --- a/include/linux/of.h
> +++ b/include/linux/of.h
> @@ -360,6 +360,8 @@ extern const void *of_get_property(const struct device_node *node,
>  extern struct device_node *of_get_cpu_node(int cpu, unsigned int *thread);
>  extern struct device_node *of_cpu_device_node_get(int cpu);
>  extern int of_cpu_node_to_id(struct device_node *np);
> +extern int of_cpu_phandle_to_id(const struct device_node *np,
> +				struct device_node **cpu_np, uint8_t cpu_idx);
>  extern struct device_node *of_get_next_cpu_node(struct device_node *prev);
>  extern struct device_node *of_get_cpu_state_node(const struct device_node *cpu_node,
>  						 int index);
> @@ -662,6 +664,13 @@ static inline int of_cpu_node_to_id(struct device_node *np)
>  	return -ENODEV;
>  }
>  
> +static inline int of_cpu_phandle_to_id(const struct device_node *np,
> +				       struct device_node **cpu_np,
> +				       uint8_t cpu_idx)
> +{
> +	return -ENODEV;
> +}
> +
>  static inline struct device_node *of_get_next_cpu_node(struct device_node *prev)
>  {
>  	return NULL;



* Re: [PATCH v3 7/7] of: of_cpu_phandle_to_id to support SMT threads
  2025-05-12  8:07 ` [PATCH v3 7/7] of: of_cpu_phandle_to_id to support SMT threads Alireza Sanaee
@ 2025-06-06 14:18   ` Rob Herring
  0 siblings, 0 replies; 10+ messages in thread
From: Rob Herring @ 2025-06-06 14:18 UTC (permalink / raw)
  To: Alireza Sanaee
  Cc: devicetree, jonathan.cameron, linux-arm-kernel, linux-kernel,
	linuxarm, mark.rutland, shameerali.kolothum.thodi, krzk, dianders,
	catalin.marinas, suzuki.poulose, mike.leach, james.clark,
	linux-perf-users, coresight, gshan, ruanjinjie, saravanak

On Mon, May 12, 2025 at 09:07:15AM +0100, Alireza Sanaee wrote:
> Enhance the API to support SMT threads, this will allow sharing
> resources among multiple SMT threads.
> 
> Enabled the sharing of resources, such as L1 Cache and clocks, between
> SMT threads. It introduces a fix that uses thread IDs to match each CPU
> thread in the register array within the cpu-node. This ensures that the
> cpu-map or any driver relying on this API is fine even when SMT threads
> share resources.
> 
> Additionally, I have tested this for CPU based on the discussions in
> [1], I adopted the new cpu-map layout, where the first parameter is a
> phandle and the second is the local thread index, as shown below:
> 
>     core0 {
>       thread0 {
>         cpu = <&cpu0 0>;
>       };
>       thread1 {
>         cpu = <&cpu0 1>;
>       };

I think the thread nodes should be omitted in this case.

>     };
> 
> Also, there are devices such as below that are a bit different.
> 
>     arm_dsu@0 {
>       compatible = "arm,dsu";
>       cpus = <&cpu0 &cpu1 &cpu2 &cpu3>;
>     }
> 
> In these cases, we can also point to a CPU thread as well like the
> following:
> 
>     arm_dsu@0 {
>       compatible = "arm,dsu";
>         cpus = <&cpu0 5 &cpu1 9 &cpu2 1 &cpu3 0>;

The purpose of 'cpus' properties is to define CPU affinity. I don't 
think the affinity could ever be different for threads in a core.

And cpu1 having 10 threads is nonsense.

Most cases of 'cpus' (and 'affinity') lookups and then callers of 
of_cpu_node_to_id() ultimately just want to set a cpumask. So we should 
provide that rather than opencoding the same loop everywhere.
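
A minimal userspace sketch of what such a helper might look like,
purely as an illustration: a 64-bit word stands in for struct cpumask,
and the name cpu_ids_to_mask() and the MODEL_NR_CPUS limit are
hypothetical. Unresolved entries are skipped rather than treated as
fatal, which is an assumption borrowed from the DSU loop's "ignore the
failures" comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64 /* hypothetical limit for this 64-bit model */

/*
 * Build a mask from a list of logical CPU ids. Negative ids (lookup
 * failures) and ids beyond the model's range are skipped.
 */
uint64_t cpu_ids_to_mask(const int *ids, size_t n)
{
	uint64_t mask = 0;

	assert(ids != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (ids[i] < 0 || ids[i] >= MODEL_NR_CPUS)
			continue;
		mask |= UINT64_C(1) << ids[i];
	}

	return mask;
}
```

A caller would then resolve each 'cpus' entry once and hand the ids to
a single helper, instead of repeating the parse-and-set loop per driver.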

>     }
> 
> It should be possible to know how many arguments a phandle might
> require, and this information is encoded in another variable in the dt
> called #cpu-cells in cpu node.
> 
> Signed-off-by: Alireza Sanaee <alireza.sanaee@huawei.com>
> 
> [1] https://lore.kernel.org/devicetree-spec/CAL_JsqK1yqRLD9B+G7UUp=D8K++mXHq0Rmv=1i6DL_jXyZwXAw@mail.gmail.com/
> ---
>  drivers/of/cpu.c | 41 +++++++++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/of/cpu.c b/drivers/of/cpu.c
> index fba17994fc20..cf54ef47f029 100644
> --- a/drivers/of/cpu.c
> +++ b/drivers/of/cpu.c
> @@ -189,16 +189,41 @@ int of_cpu_phandle_to_id(const struct device_node *node,
>  			 struct device_node **cpu_np,
>  			 uint8_t cpu_idx)
>  {
> +	bool found = false;
> +	int cpu, ret = -1, i, j;
> +	uint32_t local_thread, thread_index;
> +	struct device_node *np;
> +	struct of_phandle_args args;
> +	static const char * const phandle_names[] = { "cpus", "cpu" };
> +	static const char * const cpu_cells[] = { "#cpu-cells", NULL };
> +
>  	if (!node)
> -		return -1;
> +		return ret;
>  
> -	*cpu_np = of_parse_phandle(node, "cpu", 0);
> -	if (!*cpu_np)
> -		*cpu_np = of_parse_phandle(node, "cpus", cpu_idx);
> -			if (!*cpu_np)
> -				return -ENODEV;
> +	for (i = 0; i < ARRAY_SIZE(phandle_names); i++) {
> +		for (j = 0; j < ARRAY_SIZE(cpu_cells); j++) {
> +			ret = of_parse_phandle_with_args(node, phandle_names[i],
> +							 cpu_cells[j], cpu_idx,
> +							 &args);
> +				if (ret >= 0)
> +					goto success;
> +		}
> +	}
>  
> -	return of_cpu_node_to_id(*cpu_np);
> +	if (ret < 0)
> +		return ret;
> +success:
> +	*cpu_np = args.np;
> +	thread_index = args.args_count == 1 ? args.args[0] : 0;
> +	for_each_possible_cpu(cpu) {
> +		np = of_get_cpu_node(cpu, &local_thread);
> +		found = (*cpu_np == np) && (local_thread == thread_index);
> +		of_node_put(np);
> +		if (found)
> +			return cpu;
> +	}
> +
> +	return -ENODEV;
>  }
>  EXPORT_SYMBOL(of_cpu_phandle_to_id);
>  
> @@ -206,7 +231,7 @@ EXPORT_SYMBOL(of_cpu_phandle_to_id);
>   * of_get_cpu_state_node - Get CPU's idle state node at the given index
>   *
>   * @cpu_node: The device node for the CPU
> - * @index: The index in the list of the idle states
> +g* @index: The index in the list of the idle states

Oops!

>   *
>   * Two generic methods can be used to describe a CPU's idle states, either via
>   * a flattened description through the "cpu-idle-states" binding or via the
> -- 
> 2.34.1
> 

