linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v11 0/4] Support SMT control on arm64
@ 2025-02-18 14:10 Yicong Yang
  2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Yicong Yang @ 2025-02-18 14:10 UTC (permalink / raw)
  To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

From: Yicong Yang <yangyicong@hisilicon.com>

The core CPU control framework supports runtime SMT control which
is not yet supported on arm64. Besides the general vulnerabilities
concerns we want this runtime control on our arm64 server for:

- better single CPU performance in some cases
- saving overall power consumption

This patchset implements it in the following aspects:

- Provides a default topology_is_primary_thread()
- support retrieve SMT thread number on OF based system
- support retrieve SMT thread number on ACPI based system
- select HOTPLUG_SMT for arm64

Tests has been done on our ACPI based arm64 server and on ACPI/OF
based QEMU VMs.

Change since v10:
- handle topology parsing failure case on DT based system
- address some style comments per Jonathan and add tags, Thanks
Link: https://lore.kernel.org/linux-arm-kernel/20241220075313.51502-1-yangyicong@huawei.com/

Change since v9:
- Refine the comment of topology_is_primary_thread(). Tested with LoongArch
  to prove it also works on architecture's not using CONFIG_GENERIC_ARCH_TOPOLOGY
- always call cpu_smt_set_num_threads() to make the smt/control shows correct
  status on non-SMT system
Link: https://lore.kernel.org/linux-arm-kernel/20241114141127.23232-1-yangyicong@huawei.com/

Change since v8:
- Fix WARN on ACPI based non-SMT platform noticed in v7, per Pierre.
Link: https://lore.kernel.org/all/20241105093237.63565-1-yangyicong@huawei.com/

Change since v7:
Address the comments from Thomas:
- Add a newline between the glue define and function of topology_is_primary_thread
- Explicitly mention the sibling mask won't be empty in the comment
Link: https://lore.kernel.org/lkml/20241030125415.18994-1-yangyicong@huawei.com/

Change since v6:
- Fix unused variable if !CONFIG_ARM64 || !CONFIG_RISV found by lkp-test
- Fix max_smt_thread_num updating in OF path pointed by Pierre
- Drop unused variable and refine the comments/commit per Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20241015021841.35713-1-yangyicong@huawei.com/

Change since v5:
- Drop the dependency on CONFIG_SMP since it's always on on arm64, per Pierre
- Avoid potential multiple calls of cpu_smt_set_num_threads() on asymmetric system, per Dietmar
- Detect heterogenous SMT topology and issue a warning for partly support, per Pierre
- Thanks Dietmar for testing, didn't pickup the tag due to code changes. Thanks testing by Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20240806085320.63514-1-yangyicong@huawei.com/

Change since v4:
- Provide a default topology_is_primary_thread() in the framework, Per Will
Link: https://lore.kernel.org/linux-arm-kernel/20231121092602.47792-1-yangyicong@huawei.com/

Change since v3:
- Fix some build and kconfig error reported by kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-arm-kernel/20231114040110.54590-1-yangyicong@huawei.com/

Change since v2:
- Detect SMT thread number at topology build from ACPI/DT, avoid looping CPUs
- Split patches into ACPI/OF/arch_topology path and enable the kconfig for arm64
Link: https://lore.kernel.org/linux-arm-kernel/20231010115335.13862-1-yangyicong@huawei.com/

Yicong Yang (4):
  cpu/SMT: Provide a default topology_is_primary_thread()
  arch_topology: Support SMT control for OF based system
  arm64: topology: Support SMT control on ACPI based system
  arm64: Kconfig: Enable HOTPLUG_SMT

 arch/arm64/Kconfig                  |  1 +
 arch/arm64/kernel/topology.c        | 66 +++++++++++++++++++++++++++++
 arch/powerpc/include/asm/topology.h |  1 +
 arch/x86/include/asm/topology.h     |  2 +-
 drivers/base/arch_topology.c        | 27 ++++++++++++
 include/linux/topology.h            | 22 ++++++++++
 6 files changed, 118 insertions(+), 1 deletion(-)

-- 
2.24.0



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
  2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
@ 2025-02-18 14:10 ` Yicong Yang
  2025-02-28 11:10   ` Dietmar Eggemann
  2025-02-28 13:54   ` Sudeep Holla
  2025-02-18 14:10 ` [PATCH v11 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Yicong Yang @ 2025-02-18 14:10 UTC (permalink / raw)
  To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

From: Yicong Yang <yangyicong@hisilicon.com>

Currently if architectures want to support HOTPLUG_SMT they need to
provide a topology_is_primary_thread() telling the framework which
thread in the SMT cannot offline. However arm64 doesn't have a
restriction on which thread in the SMT cannot offline, a simplest
choice is that just make 1st thread as the "primary" thread. So
just make this as the default implementation in the framework and
let architectures like x86 that have special primary thread to
override this function (which they've already done).

There's no need to provide a stub function if !CONFIG_SMP or
!CONFIG_HOTPLUG_SMT. In such case the testing CPU is already
the 1st CPU in the SMT so it's always the primary thread.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
Pre questioned in v9 [1] whether this works on architectures not using
CONFIG_GENERIC_ARCH_TOPOLOGY, See [2] for demonstration hacking on LoongArch
VM and this also works. Architectures should use this on their own situation.
[1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/a5690fee-3019-f26c-8bad-1d95e388e877@huawei.com/

 arch/powerpc/include/asm/topology.h |  1 +
 arch/x86/include/asm/topology.h     |  2 +-
 include/linux/topology.h            | 22 ++++++++++++++++++++++
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 16bacfe8c7a2..da15b5efe807 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
 {
 	return cpu == cpu_first_thread_sibling(cpu);
 }
+#define topology_is_primary_thread topology_is_primary_thread
 
 static inline bool topology_smt_thread_allowed(unsigned int cpu)
 {
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index ec134b719144..6c79ee7c0957 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -229,11 +229,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
 {
 	return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
 }
+#define topology_is_primary_thread topology_is_primary_thread
 
 #else /* CONFIG_SMP */
 static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
 static inline int topology_max_smt_threads(void) { return 1; }
-static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
 static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
 #endif /* !CONFIG_SMP */
 
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 52f5850730b3..b3aba443c4eb 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
 }
 #endif
 
+#ifndef topology_is_primary_thread
+
+#define topology_is_primary_thread topology_is_primary_thread
+
+static inline bool topology_is_primary_thread(unsigned int cpu)
+{
+	/*
+	 * On SMT hotplug the primary thread of the SMT won't be disabled.
+	 * Architectures do have a special primary thread (e.g. x86) need
+	 * to override this function. Otherwise just make the first thread
+	 * in the SMT as the primary thread.
+	 *
+	 * The sibling cpumask of an offline CPU contains always the CPU
+	 * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
+	 * Other architectures should use this depend on their own
+	 * situation.
+	 */
+	return cpu == cpumask_first(topology_sibling_cpumask(cpu));
+}
+
+#endif
+
 static inline const struct cpumask *cpu_cpu_mask(int cpu)
 {
 	return cpumask_of_node(cpu_to_node(cpu));
-- 
2.24.0



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
  2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2025-02-18 14:10 ` Yicong Yang
  2025-02-28 11:11   ` Dietmar Eggemann
  2025-02-28 13:54   ` Sudeep Holla
  2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 29+ messages in thread
From: Yicong Yang @ 2025-02-18 14:10 UTC (permalink / raw)
  To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

From: Yicong Yang <yangyicong@hisilicon.com>

On building the topology from the devicetree, we've already
gotten the SMT thread number of each core. Update the largest
SMT thread number and enable the SMT control by the end of
topology parsing.

The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number

If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/base/arch_topology.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 3ebe77566788..23f425a9d77a 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -11,6 +11,7 @@
 #include <linux/cleanup.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
 #include <linux/device.h>
 #include <linux/of.h>
 #include <linux/slab.h>
@@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
 #endif
 
 #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+
+/* Maximum SMT thread number detected used to enable the SMT control */
+static unsigned int max_smt_thread_num;
+
 /*
  * This function returns the logic cpu number of the node.
  * There are basically three kinds of return values:
@@ -565,6 +570,16 @@ static int __init parse_core(struct device_node *core, int package_id,
 		i++;
 	} while (1);
 
+	/*
+	 * If max_smt_thread_num has been initialized and doesn't match
+	 * the thread number of this entry, then the system has
+	 * heterogeneous SMT topology.
+	 */
+	if (max_smt_thread_num && max_smt_thread_num != i)
+		pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+	max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i);
+
 	cpu = get_cpu_for_node(core);
 	if (cpu >= 0) {
 		if (!leaf) {
@@ -677,6 +692,18 @@ static int __init parse_socket(struct device_node *socket)
 	if (!has_socket)
 		ret = parse_cluster(socket, 0, -1, 0);
 
+	/*
+	 * Notify the CPU framework of the SMT support. Initialize the
+	 * max_smt_thread_num to 1 if no SMT support detected or failed
+	 * to parse the topology. A thread number of 1 can be handled by
+	 * the framework so we don't need to check max_smt_thread_num to
+	 * see we support SMT or not.
+	 */
+	if (!max_smt_thread_num || ret)
+		max_smt_thread_num = 1;
+
+	cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+
 	return ret;
 }
 
-- 
2.24.0



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
  2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
  2025-02-18 14:10 ` [PATCH v11 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
@ 2025-02-18 14:10 ` Yicong Yang
  2025-02-25  6:08   ` Hanjun Guo
                     ` (2 more replies)
  2025-02-18 14:10 ` [PATCH v11 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
  2025-02-28 11:12 ` [PATCH v11 0/4] Support SMT control on arm64 Dietmar Eggemann
  4 siblings, 3 replies; 29+ messages in thread
From: Yicong Yang @ 2025-02-18 14:10 UTC (permalink / raw)
  To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

From: Yicong Yang <yangyicong@hisilicon.com>

For ACPI we'll build the topology from PPTT and we cannot directly
get the SMT number of each core. Instead using a temporary xarray
to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
and SMT information of the first core in its heterogeneous CPU cluster
when building the topology. Then we can know the largest SMT number
in the system. If a homogeneous system's using ACPI 6.2 or later,
all the CPUs should be under the root node of PPTT. There'll be
only one entry in the xarray and all the CPUs in the system will
be assumed identical.

The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number

If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 1a2c72f3e7f8..6eba1ac091ee 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -15,8 +15,10 @@
 #include <linux/arch_topology.h>
 #include <linux/cacheinfo.h>
 #include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
 #include <linux/init.h>
 #include <linux/percpu.h>
+#include <linux/xarray.h>
 
 #include <asm/cpu.h>
 #include <asm/cputype.h>
@@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
 	return !!is_threaded;
 }
 
+struct cpu_smt_info {
+	unsigned int thread_num;
+	int core_id;
+};
+
 /*
  * Propagate the topology information of the processor_topology_node tree to the
  * cpu_topology array.
  */
 int __init parse_acpi_topology(void)
 {
+	unsigned int max_smt_thread_num = 0;
+	struct cpu_smt_info *entry;
+	struct xarray hetero_cpu;
+	unsigned long hetero_id;
 	int cpu, topology_id;
 
 	if (acpi_disabled)
 		return 0;
 
+	xa_init(&hetero_cpu);
+
 	for_each_possible_cpu(cpu) {
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
@@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
 			cpu_topology[cpu].thread_id = topology_id;
 			topology_id = find_acpi_cpu_topology(cpu, 1);
 			cpu_topology[cpu].core_id   = topology_id;
+
+			/*
+			 * In the PPTT, CPUs below a node with the 'identical
+			 * implementation' flag have the same number of threads.
+			 * Count the number of threads for only one CPU (i.e.
+			 * one core_id) among those with the same hetero_id.
+			 * See the comment of find_acpi_cpu_topology_hetero_id()
+			 * for more details.
+			 *
+			 * One entry is created for each node having:
+			 * - the 'identical implementation' flag
+			 * - its parent not having the flag
+			 */
+			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
+			entry = xa_load(&hetero_cpu, hetero_id);
+			if (!entry) {
+				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+				WARN_ON_ONCE(!entry);
+
+				if (entry) {
+					entry->core_id = topology_id;
+					entry->thread_num = 1;
+					xa_store(&hetero_cpu, hetero_id,
+						 entry, GFP_KERNEL);
+				}
+			} else if (entry->core_id == topology_id) {
+				entry->thread_num++;
+			}
 		} else {
 			cpu_topology[cpu].thread_id  = -1;
 			cpu_topology[cpu].core_id    = topology_id;
@@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
 		cpu_topology[cpu].package_id = topology_id;
 	}
 
+	/*
+	 * This should be a short loop depending on the number of heterogeneous
+	 * CPU clusters. Typically on a homogeneous system there's only one
+	 * entry in the XArray.
+	 */
+	xa_for_each(&hetero_cpu, hetero_id, entry) {
+		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
+			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
+		xa_erase(&hetero_cpu, hetero_id);
+		kfree(entry);
+	}
+
+	/*
+	 * Notify the CPU framework of the SMT support. Initialize the
+	 * max_smt_thread_num to 1 if no SMT support detected. A thread
+	 * number of 1 can be handled by the framework so we don't need
+	 * to check max_smt_thread_num to see we support SMT or not.
+	 */
+	if (!max_smt_thread_num)
+		max_smt_thread_num = 1;
+
+	cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+	xa_destroy(&hetero_cpu);
 	return 0;
 }
 #endif
-- 
2.24.0



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v11 4/4] arm64: Kconfig: Enable HOTPLUG_SMT
  2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
                   ` (2 preceding siblings ...)
  2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2025-02-18 14:10 ` Yicong Yang
  2025-02-28 11:12 ` [PATCH v11 0/4] Support SMT control on arm64 Dietmar Eggemann
  4 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-02-18 14:10 UTC (permalink / raw)
  To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

From: Yicong Yang <yangyicong@hisilicon.com>

Enable HOTPLUG_SMT for SMT control.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fcdd0ed3eca8..947616d37e0c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -251,6 +251,7 @@ config ARM64
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
 	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+	select HOTPLUG_SMT if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN
-- 
2.24.0



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2025-02-25  6:08   ` Hanjun Guo
  2025-03-03 14:42     ` Yicong Yang
  2025-02-28 11:11   ` Dietmar Eggemann
  2025-02-28 13:56   ` Sudeep Holla
  2 siblings, 1 reply; 29+ messages in thread
From: Hanjun Guo @ 2025-02-25  6:08 UTC (permalink / raw)
  To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, sshegde

On 2025/2/18 22:10, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> For ACPI we'll build the topology from PPTT and we cannot directly
> get the SMT number of each core. Instead using a temporary xarray
> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> and SMT information of the first core in its heterogeneous CPU cluster
> when building the topology. Then we can know the largest SMT number
> in the system. If a homogeneous system's using ACPI 6.2 or later,
> all the CPUs should be under the root node of PPTT. There'll be
> only one entry in the xarray and all the CPUs in the system will
> be assumed identical.
> 
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
> 
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>   arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 66 insertions(+)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 1a2c72f3e7f8..6eba1ac091ee 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -15,8 +15,10 @@
>   #include <linux/arch_topology.h>
>   #include <linux/cacheinfo.h>
>   #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
>   #include <linux/init.h>
>   #include <linux/percpu.h>
> +#include <linux/xarray.h>
>   
>   #include <asm/cpu.h>
>   #include <asm/cputype.h>
> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>   	return !!is_threaded;
>   }
>   
> +struct cpu_smt_info {
> +	unsigned int thread_num;
> +	int core_id;
> +};
> +
>   /*
>    * Propagate the topology information of the processor_topology_node tree to the
>    * cpu_topology array.
>    */
>   int __init parse_acpi_topology(void)
>   {
> +	unsigned int max_smt_thread_num = 0;
> +	struct cpu_smt_info *entry;
> +	struct xarray hetero_cpu;
> +	unsigned long hetero_id;
>   	int cpu, topology_id;
>   
>   	if (acpi_disabled)
>   		return 0;
>   
> +	xa_init(&hetero_cpu);
> +
>   	for_each_possible_cpu(cpu) {
>   		topology_id = find_acpi_cpu_topology(cpu, 0);
>   		if (topology_id < 0)
> @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
>   			cpu_topology[cpu].thread_id = topology_id;
>   			topology_id = find_acpi_cpu_topology(cpu, 1);
>   			cpu_topology[cpu].core_id   = topology_id;
> +
> +			/*
> +			 * In the PPTT, CPUs below a node with the 'identical
> +			 * implementation' flag have the same number of threads.
> +			 * Count the number of threads for only one CPU (i.e.
> +			 * one core_id) among those with the same hetero_id.
> +			 * See the comment of find_acpi_cpu_topology_hetero_id()
> +			 * for more details.
> +			 *
> +			 * One entry is created for each node having:
> +			 * - the 'identical implementation' flag
> +			 * - its parent not having the flag
> +			 */
> +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> +			entry = xa_load(&hetero_cpu, hetero_id);
> +			if (!entry) {
> +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +				WARN_ON_ONCE(!entry);
> +
> +				if (entry) {
> +					entry->core_id = topology_id;
> +					entry->thread_num = 1;
> +					xa_store(&hetero_cpu, hetero_id,
> +						 entry, GFP_KERNEL);
> +				}
> +			} else if (entry->core_id == topology_id) {
> +				entry->thread_num++;
> +			}
>   		} else {
>   			cpu_topology[cpu].thread_id  = -1;
>   			cpu_topology[cpu].core_id    = topology_id;
> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>   		cpu_topology[cpu].package_id = topology_id;
>   	}
>   
> +	/*
> +	 * This should be a short loop depending on the number of heterogeneous
> +	 * CPU clusters. Typically on a homogeneous system there's only one
> +	 * entry in the XArray.
> +	 */
> +	xa_for_each(&hetero_cpu, hetero_id, entry) {
> +		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> +			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +
> +		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
> +		xa_erase(&hetero_cpu, hetero_id);
> +		kfree(entry);
> +	}
> +
> +	/*
> +	 * Notify the CPU framework of the SMT support. Initialize the
> +	 * max_smt_thread_num to 1 if no SMT support detected. A thread
> +	 * number of 1 can be handled by the framework so we don't need
> +	 * to check max_smt_thread_num to see we support SMT or not.
> +	 */
> +	if (!max_smt_thread_num)
> +		max_smt_thread_num = 1;
> +
> +	cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> +	xa_destroy(&hetero_cpu);
>   	return 0;
>   }
>   #endif

Looks good to me,

Reviewed-by: Hanjun Guo <guohanjun@huawei.com>

Thanks
Hanjun


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
  2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2025-02-28 11:10   ` Dietmar Eggemann
  2025-03-03 13:35     ` Yicong Yang
  2025-02-28 13:54   ` Sudeep Holla
  1 sibling, 1 reply; 29+ messages in thread
From: Dietmar Eggemann @ 2025-02-28 11:10 UTC (permalink / raw)
  To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On 18/02/2025 15:10, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>

[...]

> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..b3aba443c4eb 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>  }
>  #endif
>  
> +#ifndef topology_is_primary_thread
> +
> +#define topology_is_primary_thread topology_is_primary_thread
> +
> +static inline bool topology_is_primary_thread(unsigned int cpu)
> +{
> +	/*
> +	 * On SMT hotplug the primary thread of the SMT won't be disabled.
> +	 * Architectures do have a special primary thread (e.g. x86) need
> +	 * to override this function. Otherwise just make the first thread
> +	 * in the SMT as the primary thread.
> +	 *
> +	 * The sibling cpumask of an offline CPU contains always the CPU
> +	 * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
> +	 * Other architectures should use this depend on their own
> +	 * situation.

This sentence is hard to get. Do you want to say that other
architectures (CONFIG_GENERIC_ARCH_TOPOLOGY or
!CONFIG_GENERIC_ARCH_TOPOLOGY) have to check whether they can use this
default implementation or have to override it?

[...]


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-02-18 14:10 ` [PATCH v11 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
@ 2025-02-28 11:11   ` Dietmar Eggemann
  2025-03-03 14:03     ` Yicong Yang
  2025-02-28 13:54   ` Sudeep Holla
  1 sibling, 1 reply; 29+ messages in thread
From: Dietmar Eggemann @ 2025-02-28 11:11 UTC (permalink / raw)
  To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On 18/02/2025 15:10, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> On building the topology from the devicetree, we've already
> gotten the SMT thread number of each core. Update the largest
> SMT thread number and enable the SMT control by the end of
> topology parsing.
> 
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number

1/max_thread_number stands for '1 or max_thread_number', right ?

Aren't the two interfaces:

(a) /sys/devices/system/cpu/smt/active
(b) /sys/devices/system/cpu/smt/control

and you write 1) or 2) (or 'forceoff') into (b)?

> If a system have more than one SMT thread number the 2) may

s/have/has

> not handle it well, since there're multiple thread numbers in the

multiple thread numbers other than 1, right?

> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.

This paragraph seems to be about heterogeneous systems. Maybe mention this?

Heterogeneous system with SMT only on a subset of cores (like Intel
Hybrid): This one works (N threads per core with N=1 and N=2) just fine.

But on Arm64 (default) we would still see:

[0.075782] Heterogeneous SMT topology is partly supported by SMT control

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/base/arch_topology.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 3ebe77566788..23f425a9d77a 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -11,6 +11,7 @@
>  #include <linux/cleanup.h>
>  #include <linux/cpu.h>
>  #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
>  #include <linux/device.h>
>  #include <linux/of.h>
>  #include <linux/slab.h>
> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
>  #endif
>  
>  #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +
> +/* Maximum SMT thread number detected used to enable the SMT control */

maybe shorter ?

/* used to enable SMT control */

> +static unsigned int max_smt_thread_num;
> +
>  /*
>   * This function returns the logic cpu number of the node.
>   * There are basically three kinds of return values:
> @@ -565,6 +570,16 @@ static int __init parse_core(struct device_node *core, int package_id,
>  		i++;
>  	} while (1);
>  
> +	/*
> +	 * If max_smt_thread_num has been initialized and doesn't match
> +	 * the thread number of this entry, then the system has
> +	 * heterogeneous SMT topology.
> +	 */
> +	if (max_smt_thread_num && max_smt_thread_num != i)
> +		pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +
> +	max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i);
> +
>  	cpu = get_cpu_for_node(core);
>  	if (cpu >= 0) {
>  		if (!leaf) {
> @@ -677,6 +692,18 @@ static int __init parse_socket(struct device_node *socket)
>  	if (!has_socket)
>  		ret = parse_cluster(socket, 0, -1, 0);
>  
> +	/*
> +	 * Notify the CPU framework of the SMT support. Initialize the
> +	 * max_smt_thread_num to 1 if no SMT support detected or failed
> +	 * to parse the topology. A thread number of 1 can be handled by
> +	 * the framework so we don't need to check max_smt_thread_num to
> +	 * see we support SMT or not.

Not sure whether the last sentence is needed here?

[...]



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
  2025-02-25  6:08   ` Hanjun Guo
@ 2025-02-28 11:11   ` Dietmar Eggemann
  2025-02-28 13:56   ` Sudeep Holla
  2 siblings, 0 replies; 29+ messages in thread
From: Dietmar Eggemann @ 2025-02-28 11:11 UTC (permalink / raw)
  To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On 18/02/2025 15:10, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>

[...]

> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>  		cpu_topology[cpu].package_id = topology_id;
>  	}
>  
> +	/*
> +	 * This should be a short loop depending on the number of heterogeneous
                ^^^^^^

This _is_ a short loop since the number of xArray elements is the number
of heterogeneous CPU clusters.

> +	 * CPU clusters. Typically on a homogeneous system there's only one
                         ^^^^^^^^^

I would remove 'Typically' here.

[...]



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 0/4] Support SMT control on arm64
  2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
                   ` (3 preceding siblings ...)
  2025-02-18 14:10 ` [PATCH v11 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
@ 2025-02-28 11:12 ` Dietmar Eggemann
  2025-03-03 14:41   ` Yicong Yang
  4 siblings, 1 reply; 29+ messages in thread
From: Dietmar Eggemann @ 2025-02-28 11:12 UTC (permalink / raw)
  To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois
  Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On 18/02/2025 15:10, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> The core CPU control framework supports runtime SMT control which
> is not yet supported on arm64. Besides the general vulnerabilities
> concerns we want this runtime control on our arm64 server for:
> 
> - better single CPU performance in some cases
> - saving overall power consumption
> 
> This patchset implements it in the following aspects:
> 
> - Provides a default topology_is_primary_thread()
> - support retrieve SMT thread number on OF based system
> - support retrieve SMT thread number on ACPI based system
> - select HOTPLUG_SMT for arm64
> 
> Tests has been done on our ACPI based arm64 server and on ACPI/OF
> based QEMU VMs.

[...]

> Yicong Yang (4):
>   cpu/SMT: Provide a default topology_is_primary_thread()
>   arch_topology: Support SMT control for OF based system
>   arm64: topology: Support SMT control on ACPI based system
>   arm64: Kconfig: Enable HOTPLUG_SMT
> 
>  arch/arm64/Kconfig                  |  1 +
>  arch/arm64/kernel/topology.c        | 66 +++++++++++++++++++++++++++++
>  arch/powerpc/include/asm/topology.h |  1 +
>  arch/x86/include/asm/topology.h     |  2 +-
>  drivers/base/arch_topology.c        | 27 ++++++++++++
>  include/linux/topology.h            | 22 ++++++++++
>  6 files changed, 118 insertions(+), 1 deletion(-)

With the review comments on the individual patches [0-3]/4:

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
  2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
  2025-02-28 11:10   ` Dietmar Eggemann
@ 2025-02-28 13:54   ` Sudeep Holla
  2025-03-03 13:38     ` Yicong Yang
  1 sibling, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-02-28 13:54 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, tglx, Sudeep Holla, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, yangyicong, xuwei5, guohanjun, sshegde

On Tue, Feb 18, 2025 at 10:10:15PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> Currently if architectures want to support HOTPLUG_SMT they need to
> provide a topology_is_primary_thread() telling the framework which
> thread in the SMT cannot offline. However arm64 doesn't have a
> restriction on which thread in the SMT cannot offline, a simplest
> choice is that just make 1st thread as the "primary" thread. So
> just make this as the default implementation in the framework and
> let architectures like x86 that have special primary thread to
> override this function (which they've already done).
> 
> There's no need to provide a stub function if !CONFIG_SMP or
> !CONFIG_HOTPLUG_SMT. In such case the testing CPU is already
> the 1st CPU in the SMT so it's always the primary thread.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> Pre questioned in v9 [1] whether this works on architectures not using
> CONFIG_GENERIC_ARCH_TOPOLOGY, See [2] for demonstration hacking on LoongArch
> VM and this also works. Architectures should use this on their own situation.
> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
> [2] https://lore.kernel.org/linux-arm-kernel/a5690fee-3019-f26c-8bad-1d95e388e877@huawei.com/
> 
>  arch/powerpc/include/asm/topology.h |  1 +
>  arch/x86/include/asm/topology.h     |  2 +-
>  include/linux/topology.h            | 22 ++++++++++++++++++++++
>  3 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 16bacfe8c7a2..da15b5efe807 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>  {
>  	return cpu == cpu_first_thread_sibling(cpu);
>  }
> +#define topology_is_primary_thread topology_is_primary_thread
>  
>  static inline bool topology_smt_thread_allowed(unsigned int cpu)
>  {
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index ec134b719144..6c79ee7c0957 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -229,11 +229,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>  {
>  	return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
>  }
> +#define topology_is_primary_thread topology_is_primary_thread
>  
>  #else /* CONFIG_SMP */
>  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
>  static inline int topology_max_smt_threads(void) { return 1; }
> -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
>  static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
>  #endif /* !CONFIG_SMP */
>  
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..b3aba443c4eb 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>  }
>  #endif
>  
> +#ifndef topology_is_primary_thread
> +
> +#define topology_is_primary_thread topology_is_primary_thread
> +
> +static inline bool topology_is_primary_thread(unsigned int cpu)
> +{
> +	/*
> +	 * On SMT hotplug the primary thread of the SMT won't be disabled.

I may be misunderstanding the term "SMT hotplug" above. For me it is
comparable with logical CPU hotplug, so the above statement may be
misleading. IIUC, what you mean above is if SMT is disabled, the
primary thread will always remain enabled/active. Does that make sense
or am I missing something ?

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-02-18 14:10 ` [PATCH v11 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
  2025-02-28 11:11   ` Dietmar Eggemann
@ 2025-02-28 13:54   ` Sudeep Holla
  2025-03-03 14:11     ` Yicong Yang
  1 sibling, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-02-28 13:54 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, tglx, Sudeep Holla, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, yangyicong, xuwei5, guohanjun, sshegde

On Tue, Feb 18, 2025 at 10:10:16PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> On building the topology from the devicetree, we've already
> gotten the SMT thread number of each core. Update the largest
> SMT thread number and enable the SMT control by the end of
> topology parsing.
> 
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
> 
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/base/arch_topology.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 3ebe77566788..23f425a9d77a 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -11,6 +11,7 @@
>  #include <linux/cleanup.h>
>  #include <linux/cpu.h>
>  #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
>  #include <linux/device.h>
>  #include <linux/of.h>
>  #include <linux/slab.h>
> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
>  #endif
>  
>  #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +
> +/* Maximum SMT thread number detected used to enable the SMT control */
> +static unsigned int max_smt_thread_num;
> +
>  /*
>   * This function returns the logic cpu number of the node.
>   * There are basically three kinds of return values:
> @@ -565,6 +570,16 @@ static int __init parse_core(struct device_node *core, int package_id,
>  		i++;
>  	} while (1);
>  
> +	/*
> +	 * If max_smt_thread_num has been initialized and doesn't match
> +	 * the thread number of this entry, then the system has
> +	 * heterogeneous SMT topology.
> +	 */
> +	if (max_smt_thread_num && max_smt_thread_num != i)
> +		pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +

May be we need to make it more conditional as we may have to support
systems with few cores that are single threaded ? I think Dietmar's
comment is about that.

> +	max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i);
> +
>  	cpu = get_cpu_for_node(core);
>  	if (cpu >= 0) {
>  		if (!leaf) {
> @@ -677,6 +692,18 @@ static int __init parse_socket(struct device_node *socket)
>  	if (!has_socket)
>  		ret = parse_cluster(socket, 0, -1, 0);
>  
> +	/*
> +	 * Notify the CPU framework of the SMT support. Initialize the
> +	 * max_smt_thread_num to 1 if no SMT support detected or failed
> +	 * to parse the topology. A thread number of 1 can be handled by
> +	 * the framework so we don't need to check max_smt_thread_num to
> +	 * see we support SMT or not.
> +	 */
> +	if (!max_smt_thread_num || ret)
> +		max_smt_thread_num = 1;
> +

For the failed parsing of topology, reset_cpu_topology() gets called.
I suggest resetting max_smt_thread_num to 1 belongs there.

And if you start with max_smt_thread_num, we don't need to update it to
1 explicitly here. So I would like to get rid of above check completely.

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
  2025-02-25  6:08   ` Hanjun Guo
  2025-02-28 11:11   ` Dietmar Eggemann
@ 2025-02-28 13:56   ` Sudeep Holla
  2025-02-28 17:51     ` Pierre Gondois
  2 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-02-28 13:56 UTC (permalink / raw)
  To: Yicong Yang
  Cc: catalin.marinas, will, tglx, peterz, mpe, Sudeep Holla,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, yangyicong, xuwei5, guohanjun, sshegde

On Tue, Feb 18, 2025 at 10:10:17PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> For ACPI we'll build the topology from PPTT and we cannot directly
> get the SMT number of each core. Instead using a temporary xarray
> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> and SMT information of the first core in its heterogeneous CPU cluster
> when building the topology. Then we can know the largest SMT number
> in the system. If a homogeneous system's using ACPI 6.2 or later,
> all the CPUs should be under the root node of PPTT. There'll be
> only one entry in the xarray and all the CPUs in the system will
> be assumed identical.
> 
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
> 
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 1a2c72f3e7f8..6eba1ac091ee 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -15,8 +15,10 @@
>  #include <linux/arch_topology.h>
>  #include <linux/cacheinfo.h>
>  #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
>  #include <linux/init.h>
>  #include <linux/percpu.h>
> +#include <linux/xarray.h>
>  
>  #include <asm/cpu.h>
>  #include <asm/cputype.h>
> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>  	return !!is_threaded;
>  }
>  
> +struct cpu_smt_info {
> +	unsigned int thread_num;
> +	int core_id;
> +};
> +
>  /*
>   * Propagate the topology information of the processor_topology_node tree to the
>   * cpu_topology array.
>   */
>  int __init parse_acpi_topology(void)
>  {
> +	unsigned int max_smt_thread_num = 0;
> +	struct cpu_smt_info *entry;
> +	struct xarray hetero_cpu;
> +	unsigned long hetero_id;
>  	int cpu, topology_id;
>  
>  	if (acpi_disabled)
>  		return 0;
>  
> +	xa_init(&hetero_cpu);
> +
>  	for_each_possible_cpu(cpu) {
>  		topology_id = find_acpi_cpu_topology(cpu, 0);
>  		if (topology_id < 0)
> @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
>  			cpu_topology[cpu].thread_id = topology_id;
>  			topology_id = find_acpi_cpu_topology(cpu, 1);
>  			cpu_topology[cpu].core_id   = topology_id;
> +
> +			/*
> +			 * In the PPTT, CPUs below a node with the 'identical
> +			 * implementation' flag have the same number of threads.
> +			 * Count the number of threads for only one CPU (i.e.
> +			 * one core_id) among those with the same hetero_id.
> +			 * See the comment of find_acpi_cpu_topology_hetero_id()
> +			 * for more details.
> +			 *
> +			 * One entry is created for each node having:
> +			 * - the 'identical implementation' flag
> +			 * - its parent not having the flag
> +			 */
> +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> +			entry = xa_load(&hetero_cpu, hetero_id);
> +			if (!entry) {
> +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +				WARN_ON_ONCE(!entry);
> +
> +				if (entry) {
> +					entry->core_id = topology_id;
> +					entry->thread_num = 1;
> +					xa_store(&hetero_cpu, hetero_id,
> +						 entry, GFP_KERNEL);
> +				}
> +			} else if (entry->core_id == topology_id) {
> +				entry->thread_num++;
> +			}
>  		} else {
>  			cpu_topology[cpu].thread_id  = -1;
>  			cpu_topology[cpu].core_id    = topology_id;
> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>  		cpu_topology[cpu].package_id = topology_id;
>  	}
>  
> +	/*
> +	 * This should be a short loop depending on the number of heterogeneous
> +	 * CPU clusters. Typically on a homogeneous system there's only one
> +	 * entry in the XArray.
> +	 */
> +	xa_for_each(&hetero_cpu, hetero_id, entry) {
> +		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> +			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");

Ditto as previous patch about handling no threaded cores with threaded cores
in the system. I am not sure if that is required but just raising it here.

> +
> +		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
> +		xa_erase(&hetero_cpu, hetero_id);
> +		kfree(entry);
> +	}
> +
> +	/*
> +	 * Notify the CPU framework of the SMT support. Initialize the
> +	 * max_smt_thread_num to 1 if no SMT support detected. A thread
> +	 * number of 1 can be handled by the framework so we don't need
> +	 * to check max_smt_thread_num to see we support SMT or not.
> +	 */
> +	if (!max_smt_thread_num)
> +		max_smt_thread_num = 1;
> +

Ditto as previous patch, can get rid if it is default 1.

-- 
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-28 13:56   ` Sudeep Holla
@ 2025-02-28 17:51     ` Pierre Gondois
  2025-02-28 19:06       ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Gondois @ 2025-02-28 17:51 UTC (permalink / raw)
  To: Sudeep Holla, Yicong Yang
  Cc: catalin.marinas, will, tglx, peterz, mpe, linux-arm-kernel, mingo,
	bp, dave.hansen, dietmar.eggemann, linuxppc-dev, x86,
	linux-kernel, morten.rasmussen, msuchanek, gregkh, rafael,
	jonathan.cameron, prime.zeng, linuxarm, yangyicong, xuwei5,
	guohanjun, sshegde



On 2/28/25 14:56, Sudeep Holla wrote:
> On Tue, Feb 18, 2025 at 10:10:17PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
>> and SMT information of the first core in its heterogeneous CPU cluster
>> when building the topology. Then we can know the largest SMT number
>> in the system. If a homogeneous system's using ACPI 6.2 or later,
>> all the CPUs should be under the root node of PPTT. There'll be
>> only one entry in the xarray and all the CPUs in the system will
>> be assumed identical.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>   arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>>   1 file changed, 66 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 1a2c72f3e7f8..6eba1ac091ee 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -15,8 +15,10 @@
>>   #include <linux/arch_topology.h>
>>   #include <linux/cacheinfo.h>
>>   #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>>   #include <linux/init.h>
>>   #include <linux/percpu.h>
>> +#include <linux/xarray.h>
>>   
>>   #include <asm/cpu.h>
>>   #include <asm/cputype.h>
>> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>>   	return !!is_threaded;
>>   }
>>   
>> +struct cpu_smt_info {
>> +	unsigned int thread_num;
>> +	int core_id;
>> +};
>> +
>>   /*
>>    * Propagate the topology information of the processor_topology_node tree to the
>>    * cpu_topology array.
>>    */
>>   int __init parse_acpi_topology(void)
>>   {
>> +	unsigned int max_smt_thread_num = 0;
>> +	struct cpu_smt_info *entry;
>> +	struct xarray hetero_cpu;
>> +	unsigned long hetero_id;
>>   	int cpu, topology_id;
>>   
>>   	if (acpi_disabled)
>>   		return 0;
>>   
>> +	xa_init(&hetero_cpu);
>> +
>>   	for_each_possible_cpu(cpu) {
>>   		topology_id = find_acpi_cpu_topology(cpu, 0);
>>   		if (topology_id < 0)
>> @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
>>   			cpu_topology[cpu].thread_id = topology_id;
>>   			topology_id = find_acpi_cpu_topology(cpu, 1);
>>   			cpu_topology[cpu].core_id   = topology_id;
>> +
>> +			/*
>> +			 * In the PPTT, CPUs below a node with the 'identical
>> +			 * implementation' flag have the same number of threads.
>> +			 * Count the number of threads for only one CPU (i.e.
>> +			 * one core_id) among those with the same hetero_id.
>> +			 * See the comment of find_acpi_cpu_topology_hetero_id()
>> +			 * for more details.
>> +			 *
>> +			 * One entry is created for each node having:
>> +			 * - the 'identical implementation' flag
>> +			 * - its parent not having the flag
>> +			 */
>> +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
>> +			entry = xa_load(&hetero_cpu, hetero_id);
>> +			if (!entry) {
>> +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> +				WARN_ON_ONCE(!entry);
>> +
>> +				if (entry) {
>> +					entry->core_id = topology_id;
>> +					entry->thread_num = 1;
>> +					xa_store(&hetero_cpu, hetero_id,
>> +						 entry, GFP_KERNEL);
>> +				}
>> +			} else if (entry->core_id == topology_id) {
>> +				entry->thread_num++;
>> +			}
>>   		} else {
>>   			cpu_topology[cpu].thread_id  = -1;
>>   			cpu_topology[cpu].core_id    = topology_id;
>> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>>   		cpu_topology[cpu].package_id = topology_id;
>>   	}
>>   
>> +	/*
>> +	 * This should be a short loop depending on the number of heterogeneous
>> +	 * CPU clusters. Typically on a homogeneous system there's only one
>> +	 * entry in the XArray.
>> +	 */
>> +	xa_for_each(&hetero_cpu, hetero_id, entry) {
>> +		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
>> +			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> 
> Ditto as previous patch about handling no threaded cores with threaded cores
> in the system. I am not sure if that is required but just raising it here.
> 
>> +
>> +		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
>> +		xa_erase(&hetero_cpu, hetero_id);
>> +		kfree(entry);
>> +	}
>> +
>> +	/*
>> +	 * Notify the CPU framework of the SMT support. Initialize the
>> +	 * max_smt_thread_num to 1 if no SMT support detected. A thread
>> +	 * number of 1 can be handled by the framework so we don't need
>> +	 * to check max_smt_thread_num to see we support SMT or not.
>> +	 */
>> +	if (!max_smt_thread_num)
>> +		max_smt_thread_num = 1;
>> +
> 
> Ditto as previous patch, can get rid if it is default 1.
> 

On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
cpu_smt_num_threads uninitialized to UINT_MAX:

smt/active:0
smt/control:-1

If cpu_smt_set_num_threads() is called:
active:0
control:notsupported

So it might be slightly better to still initialize max_smt_thread_num.

Otherwise I tested the patches on arm64 ACPI smt platforms and it worked
well, so for all the patches (if there are no other major modifications):
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>

Regards,
Pierre


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-28 17:51     ` Pierre Gondois
@ 2025-02-28 19:06       ` Sudeep Holla
  2025-03-03  9:56         ` Pierre Gondois
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-02-28 19:06 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Yicong Yang, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On Fri, Feb 28, 2025 at 06:51:16PM +0100, Pierre Gondois wrote:
> 
> 
> On 2/28/25 14:56, Sudeep Holla wrote:
> > On Tue, Feb 18, 2025 at 10:10:17PM +0800, Yicong Yang wrote:
> > > From: Yicong Yang <yangyicong@hisilicon.com>
> > > 
> > > For ACPI we'll build the topology from PPTT and we cannot directly
> > > get the SMT number of each core. Instead using a temporary xarray
> > > to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> > > and SMT information of the first core in its heterogeneous CPU cluster
> > > when building the topology. Then we can know the largest SMT number
> > > in the system. If a homogeneous system's using ACPI 6.2 or later,
> > > all the CPUs should be under the root node of PPTT. There'll be
> > > only one entry in the xarray and all the CPUs in the system will
> > > be assumed identical.
> > > 
> > > The core's SMT control provides two interface to the users [1]:
> > > 1) enable/disable SMT by writing on/off
> > > 2) enable/disable SMT by writing thread number 1/max_thread_number
> > > 
> > > If a system have more than one SMT thread number the 2) may
> > > not handle it well, since there're multiple thread numbers in the
> > > system and 2) only accept 1/max_thread_number. So issue a warning
> > > to notify the users if such system detected.
> > > 
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> > > 
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> > > ---
> > >   arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 66 insertions(+)
> > > 
> > > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> > > index 1a2c72f3e7f8..6eba1ac091ee 100644
> > > --- a/arch/arm64/kernel/topology.c
> > > +++ b/arch/arm64/kernel/topology.c
> > > @@ -15,8 +15,10 @@
> > >   #include <linux/arch_topology.h>
> > >   #include <linux/cacheinfo.h>
> > >   #include <linux/cpufreq.h>
> > > +#include <linux/cpu_smt.h>
> > >   #include <linux/init.h>
> > >   #include <linux/percpu.h>
> > > +#include <linux/xarray.h>
> > >   #include <asm/cpu.h>
> > >   #include <asm/cputype.h>
> > > @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
> > >   	return !!is_threaded;
> > >   }
> > > +struct cpu_smt_info {
> > > +	unsigned int thread_num;
> > > +	int core_id;
> > > +};
> > > +
> > >   /*
> > >    * Propagate the topology information of the processor_topology_node tree to the
> > >    * cpu_topology array.
> > >    */
> > >   int __init parse_acpi_topology(void)
> > >   {
> > > +	unsigned int max_smt_thread_num = 0;
> > > +	struct cpu_smt_info *entry;
> > > +	struct xarray hetero_cpu;
> > > +	unsigned long hetero_id;
> > >   	int cpu, topology_id;
> > >   	if (acpi_disabled)
> > >   		return 0;
> > > +	xa_init(&hetero_cpu);
> > > +
> > >   	for_each_possible_cpu(cpu) {
> > >   		topology_id = find_acpi_cpu_topology(cpu, 0);
> > >   		if (topology_id < 0)
> > > @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
> > >   			cpu_topology[cpu].thread_id = topology_id;
> > >   			topology_id = find_acpi_cpu_topology(cpu, 1);
> > >   			cpu_topology[cpu].core_id   = topology_id;
> > > +
> > > +			/*
> > > +			 * In the PPTT, CPUs below a node with the 'identical
> > > +			 * implementation' flag have the same number of threads.
> > > +			 * Count the number of threads for only one CPU (i.e.
> > > +			 * one core_id) among those with the same hetero_id.
> > > +			 * See the comment of find_acpi_cpu_topology_hetero_id()
> > > +			 * for more details.
> > > +			 *
> > > +			 * One entry is created for each node having:
> > > +			 * - the 'identical implementation' flag
> > > +			 * - its parent not having the flag
> > > +			 */
> > > +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> > > +			entry = xa_load(&hetero_cpu, hetero_id);
> > > +			if (!entry) {
> > > +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> > > +				WARN_ON_ONCE(!entry);
> > > +
> > > +				if (entry) {
> > > +					entry->core_id = topology_id;
> > > +					entry->thread_num = 1;
> > > +					xa_store(&hetero_cpu, hetero_id,
> > > +						 entry, GFP_KERNEL);
> > > +				}
> > > +			} else if (entry->core_id == topology_id) {
> > > +				entry->thread_num++;
> > > +			}
> > >   		} else {
> > >   			cpu_topology[cpu].thread_id  = -1;
> > >   			cpu_topology[cpu].core_id    = topology_id;
> > > @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
> > >   		cpu_topology[cpu].package_id = topology_id;
> > >   	}
> > > +	/*
> > > +	 * This should be a short loop depending on the number of heterogeneous
> > > +	 * CPU clusters. Typically on a homogeneous system there's only one
> > > +	 * entry in the XArray.
> > > +	 */
> > > +	xa_for_each(&hetero_cpu, hetero_id, entry) {
> > > +		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> > > +			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> > 
> > Ditto as previous patch about handling no threaded cores with threaded cores
> > in the system. I am not sure if that is required but just raising it here.
> > 
> > > +
> > > +		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
> > > +		xa_erase(&hetero_cpu, hetero_id);
> > > +		kfree(entry);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Notify the CPU framework of the SMT support. Initialize the
> > > +	 * max_smt_thread_num to 1 if no SMT support detected. A thread
> > > +	 * number of 1 can be handled by the framework so we don't need
> > > +	 * to check max_smt_thread_num to see we support SMT or not.
> > > +	 */
> > > +	if (!max_smt_thread_num)
> > > +		max_smt_thread_num = 1;
> > > +
> > 
> > Ditto as previous patch, can get rid if it is default 1.
> > 
> 
> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
> cpu_smt_num_threads uninitialized to UINT_MAX:
> 
> smt/active:0
> smt/control:-1
> 
> If cpu_smt_set_num_threads() is called:
> active:0
> control:notsupported
> 
> So it might be slightly better to still initialize max_smt_thread_num.
>

Sure, what I meant is to have max_smt_thread_num set to 1 by default is
that is what needed anyways and the above code does that now.

Why not start with initialised to 1 instead ?
Of course some current logic needs to change around testing it for zero.

-- 
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-28 19:06       ` Sudeep Holla
@ 2025-03-03  9:56         ` Pierre Gondois
  2025-03-03 11:16           ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Gondois @ 2025-03-03  9:56 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Yicong Yang, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde



On 2/28/25 20:06, Sudeep Holla wrote:
> On Fri, Feb 28, 2025 at 06:51:16PM +0100, Pierre Gondois wrote:
>>
>>
>> On 2/28/25 14:56, Sudeep Holla wrote:
>>> On Tue, Feb 18, 2025 at 10:10:17PM +0800, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> For ACPI we'll build the topology from PPTT and we cannot directly
>>>> get the SMT number of each core. Instead using a temporary xarray
>>>> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
>>>> and SMT information of the first core in its heterogeneous CPU cluster
>>>> when building the topology. Then we can know the largest SMT number
>>>> in the system. If a homogeneous system's using ACPI 6.2 or later,
>>>> all the CPUs should be under the root node of PPTT. There'll be
>>>> only one entry in the xarray and all the CPUs in the system will
>>>> be assumed identical.
>>>>
>>>> The core's SMT control provides two interface to the users [1]:
>>>> 1) enable/disable SMT by writing on/off
>>>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>>>
>>>> If a system have more than one SMT thread number the 2) may
>>>> not handle it well, since there're multiple thread numbers in the
>>>> system and 2) only accept 1/max_thread_number. So issue a warning
>>>> to notify the users if such system detected.
>>>>
>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>>>>
>>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>>> ---
>>>>    arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>>>>    1 file changed, 66 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>>>> index 1a2c72f3e7f8..6eba1ac091ee 100644
>>>> --- a/arch/arm64/kernel/topology.c
>>>> +++ b/arch/arm64/kernel/topology.c
>>>> @@ -15,8 +15,10 @@
>>>>    #include <linux/arch_topology.h>
>>>>    #include <linux/cacheinfo.h>
>>>>    #include <linux/cpufreq.h>
>>>> +#include <linux/cpu_smt.h>
>>>>    #include <linux/init.h>
>>>>    #include <linux/percpu.h>
>>>> +#include <linux/xarray.h>
>>>>    #include <asm/cpu.h>
>>>>    #include <asm/cputype.h>
>>>> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>>>>    	return !!is_threaded;
>>>>    }
>>>> +struct cpu_smt_info {
>>>> +	unsigned int thread_num;
>>>> +	int core_id;
>>>> +};
>>>> +
>>>>    /*
>>>>     * Propagate the topology information of the processor_topology_node tree to the
>>>>     * cpu_topology array.
>>>>     */
>>>>    int __init parse_acpi_topology(void)
>>>>    {
>>>> +	unsigned int max_smt_thread_num = 0;
>>>> +	struct cpu_smt_info *entry;
>>>> +	struct xarray hetero_cpu;
>>>> +	unsigned long hetero_id;
>>>>    	int cpu, topology_id;
>>>>    	if (acpi_disabled)
>>>>    		return 0;
>>>> +	xa_init(&hetero_cpu);
>>>> +
>>>>    	for_each_possible_cpu(cpu) {
>>>>    		topology_id = find_acpi_cpu_topology(cpu, 0);
>>>>    		if (topology_id < 0)
>>>> @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
>>>>    			cpu_topology[cpu].thread_id = topology_id;
>>>>    			topology_id = find_acpi_cpu_topology(cpu, 1);
>>>>    			cpu_topology[cpu].core_id   = topology_id;
>>>> +
>>>> +			/*
>>>> +			 * In the PPTT, CPUs below a node with the 'identical
>>>> +			 * implementation' flag have the same number of threads.
>>>> +			 * Count the number of threads for only one CPU (i.e.
>>>> +			 * one core_id) among those with the same hetero_id.
>>>> +			 * See the comment of find_acpi_cpu_topology_hetero_id()
>>>> +			 * for more details.
>>>> +			 *
>>>> +			 * One entry is created for each node having:
>>>> +			 * - the 'identical implementation' flag
>>>> +			 * - its parent not having the flag
>>>> +			 */
>>>> +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
>>>> +			entry = xa_load(&hetero_cpu, hetero_id);
>>>> +			if (!entry) {
>>>> +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>>>> +				WARN_ON_ONCE(!entry);
>>>> +
>>>> +				if (entry) {
>>>> +					entry->core_id = topology_id;
>>>> +					entry->thread_num = 1;
>>>> +					xa_store(&hetero_cpu, hetero_id,
>>>> +						 entry, GFP_KERNEL);
>>>> +				}
>>>> +			} else if (entry->core_id == topology_id) {
>>>> +				entry->thread_num++;
>>>> +			}
>>>>    		} else {
>>>>    			cpu_topology[cpu].thread_id  = -1;
>>>>    			cpu_topology[cpu].core_id    = topology_id;
>>>> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>>>>    		cpu_topology[cpu].package_id = topology_id;
>>>>    	}
>>>> +	/*
>>>> +	 * This should be a short loop depending on the number of heterogeneous
>>>> +	 * CPU clusters. Typically on a homogeneous system there's only one
>>>> +	 * entry in the XArray.
>>>> +	 */
>>>> +	xa_for_each(&hetero_cpu, hetero_id, entry) {
>>>> +		if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
>>>> +			pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>>>
>>> Ditto as previous patch about handling no threaded cores with threaded cores
>>> in the system. I am not sure if that is required but just raising it here.
>>>
>>>> +
>>>> +		max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
>>>> +		xa_erase(&hetero_cpu, hetero_id);
>>>> +		kfree(entry);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Notify the CPU framework of the SMT support. Initialize the
>>>> +	 * max_smt_thread_num to 1 if no SMT support detected. A thread
>>>> +	 * number of 1 can be handled by the framework so we don't need
>>>> +	 * to check max_smt_thread_num to see we support SMT or not.
>>>> +	 */
>>>> +	if (!max_smt_thread_num)
>>>> +		max_smt_thread_num = 1;
>>>> +
>>>
>>> Ditto as previous patch, can get rid if it is default 1.
>>>
>>
>> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
>> cpu_smt_num_threads uninitialized to UINT_MAX:
>>
>> smt/active:0
>> smt/control:-1
>>
>> If cpu_smt_set_num_threads() is called:
>> active:0
>> control:notsupported
>>
>> So it might be slightly better to still initialize max_smt_thread_num.
>>
> 
> Sure, what I meant is to have max_smt_thread_num set to 1 by default is
> that is what needed anyways and the above code does that now.
> 
> Why not start with initialised to 1 instead ?
> Of course some current logic needs to change around testing it for zero.
> 

I think there would still be a way to check against the default value.
If we have:
unsigned int max_smt_thread_num = 1;

then on a platform with 2 threads, the detection condition would trigger:
xa_for_each(&hetero_cpu, hetero_id, entry) {
     if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
         pr_warn_once("Heterogeneous SMT topology is partly
                       supported by SMT control\n");

so we would need an additional variable:
bool is_initialized = false;


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-03  9:56         ` Pierre Gondois
@ 2025-03-03 11:16           ` Sudeep Holla
  2025-03-03 14:40             ` Yicong Yang
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-03-03 11:16 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Yicong Yang, catalin.marinas, Sudeep Holla, will, tglx, peterz,
	mpe, linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	yangyicong, xuwei5, guohanjun, sshegde

On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
> On 2/28/25 20:06, Sudeep Holla wrote:
> > > >
> > > > Ditto as previous patch, can get rid if it is default 1.
> > > >
> > >
> > > On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
> > > cpu_smt_num_threads uninitialized to UINT_MAX:
> > >
> > > smt/active:0
> > > smt/control:-1
> > >
> > > If cpu_smt_set_num_threads() is called:
> > > active:0
> > > control:notsupported
> > >
> > > So it might be slightly better to still initialize max_smt_thread_num.
> > >
> >
> > Sure, what I meant is to have max_smt_thread_num set to 1 by default is
> > that is what needed anyways and the above code does that now.
> >
> > Why not start with initialised to 1 instead ?
> > Of course some current logic needs to change around testing it for zero.
> >
>
> I think there would still be a way to check against the default value.
> If we have:
> unsigned int max_smt_thread_num = 1;
>
> then on a platform with 2 threads, the detection condition would trigger:
> xa_for_each(&hetero_cpu, hetero_id, entry) {
>     if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
>         pr_warn_once("Heterogeneous SMT topology is partly
>                       supported by SMT control\n");
>
> so we would need an additional variable:
> bool is_initialized = false;

Sure, we could do that or skip the check if max_smt_thread_num == 1 ?

I mean
	if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)

I assume entry->thread_num must be set to 1 on single threaded cores
Won't that work ? Am I missing something still ?

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
  2025-02-28 11:10   ` Dietmar Eggemann
@ 2025-03-03 13:35     ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 13:35 UTC (permalink / raw)
  To: Dietmar Eggemann, catalin.marinas, will, sudeep.holla, tglx,
	peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
	pierre.gondois
  Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
	msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	xuwei5, guohanjun, sshegde

On 2025/2/28 19:10, Dietmar Eggemann wrote:
> On 18/02/2025 15:10, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> [...]
> 
>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>> index 52f5850730b3..b3aba443c4eb 100644
>> --- a/include/linux/topology.h
>> +++ b/include/linux/topology.h
>> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>>  }
>>  #endif
>>  
>> +#ifndef topology_is_primary_thread
>> +
>> +#define topology_is_primary_thread topology_is_primary_thread
>> +
>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>> +{
>> +	/*
>> +	 * On SMT hotplug the primary thread of the SMT won't be disabled.
>> +	 * Architectures do have a special primary thread (e.g. x86) need
>> +	 * to override this function. Otherwise just make the first thread
>> +	 * in the SMT as the primary thread.
>> +	 *
>> +	 * The sibling cpumask of an offline CPU contains always the CPU
>> +	 * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
>> +	 * Other architectures should use this depend on their own
>> +	 * situation.
> 
> This sentence is hard to get. Do you want to say that other
> architectures (CONFIG_GENERIC_ARCH_TOPOLOGY or
> !CONFIG_GENERIC_ARCH_TOPOLOGY) have to check whether they can use this
> default implementation or have to override it?
> 

yes exactly, will improve the comments. thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
  2025-02-28 13:54   ` Sudeep Holla
@ 2025-03-03 13:38     ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 13:38 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, guohanjun, sshegde

On 2025/2/28 21:54, Sudeep Holla wrote:
> On Tue, Feb 18, 2025 at 10:10:15PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Currently if architectures want to support HOTPLUG_SMT they need to
>> provide a topology_is_primary_thread() telling the framework which
>> thread in the SMT cannot offline. However arm64 doesn't have a
>> restriction on which thread in the SMT cannot offline, a simplest
>> choice is that just make 1st thread as the "primary" thread. So
>> just make this as the default implementation in the framework and
>> let architectures like x86 that have special primary thread to
>> override this function (which they've already done).
>>
>> There's no need to provide a stub function if !CONFIG_SMP or
>> !CONFIG_HOTPLUG_SMT. In such case the testing CPU is already
>> the 1st CPU in the SMT so it's always the primary thread.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>> Pre questioned in v9 [1] whether this works on architectures not using
>> CONFIG_GENERIC_ARCH_TOPOLOGY, See [2] for demonstration hacking on LoongArch
>> VM and this also works. Architectures should use this on their own situation.
>> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
>> [2] https://lore.kernel.org/linux-arm-kernel/a5690fee-3019-f26c-8bad-1d95e388e877@huawei.com/
>>
>>  arch/powerpc/include/asm/topology.h |  1 +
>>  arch/x86/include/asm/topology.h     |  2 +-
>>  include/linux/topology.h            | 22 ++++++++++++++++++++++
>>  3 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
>> index 16bacfe8c7a2..da15b5efe807 100644
>> --- a/arch/powerpc/include/asm/topology.h
>> +++ b/arch/powerpc/include/asm/topology.h
>> @@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>>  {
>>  	return cpu == cpu_first_thread_sibling(cpu);
>>  }
>> +#define topology_is_primary_thread topology_is_primary_thread
>>  
>>  static inline bool topology_smt_thread_allowed(unsigned int cpu)
>>  {
>> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
>> index ec134b719144..6c79ee7c0957 100644
>> --- a/arch/x86/include/asm/topology.h
>> +++ b/arch/x86/include/asm/topology.h
>> @@ -229,11 +229,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>>  {
>>  	return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
>>  }
>> +#define topology_is_primary_thread topology_is_primary_thread
>>  
>>  #else /* CONFIG_SMP */
>>  static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
>>  static inline int topology_max_smt_threads(void) { return 1; }
>> -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
>>  static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
>>  #endif /* !CONFIG_SMP */
>>  
>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>> index 52f5850730b3..b3aba443c4eb 100644
>> --- a/include/linux/topology.h
>> +++ b/include/linux/topology.h
>> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>>  }
>>  #endif
>>  
>> +#ifndef topology_is_primary_thread
>> +
>> +#define topology_is_primary_thread topology_is_primary_thread
>> +
>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>> +{
>> +	/*
>> +	 * On SMT hotplug the primary thread of the SMT won't be disabled.
> 
> I may be misunderstanding the term "SMT hotplug" above. For me it is
> comparable with logical CPU hotplug, so the above statement may be
> misleading. IIUC, what you mean above is if SMT is disabled, the
> primary thread will always remain enabled/active. Does that make sense
> or am I missing something ?
> 

I just the borrow the term from kconfig HOTPLUG_SMT here, but here the statement
only involves the disable part, so maybe it'll be more accurate to use "SMT
disable" rather than "SMT hotplug" here?

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-02-28 11:11   ` Dietmar Eggemann
@ 2025-03-03 14:03     ` Yicong Yang
  2025-03-04  9:32       ` Dietmar Eggemann
  0 siblings, 1 reply; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 14:03 UTC (permalink / raw)
  To: Dietmar Eggemann, sudeep.holla, pierre.gondois
  Cc: dave.hansen, bp, mingo, linux-arm-kernel, mpe, peterz, tglx, will,
	catalin.marinas, yangyicong, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, guohanjun, sshegde

On 2025/2/28 19:11, Dietmar Eggemann wrote:
> On 18/02/2025 15:10, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On building the topology from the devicetree, we've already
>> gotten the SMT thread number of each core. Update the largest
>> SMT thread number and enable the SMT control by the end of
>> topology parsing.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
> 
> 1/max_thread_number stands for '1 or max_thread_number', right ?
> 
> Aren't the two interfaces:
> 
> (a) /sys/devices/system/cpu/smt/active
> (b) /sys/devices/system/cpu/smt/control
> 
> and you write 1) or 2) (or 'forceoff') into (b)?

yes you're correct. "active" is a RO file for status only so not for this interface.
Let me explicitly mention the /sys/devices/system/cpu/smt/control here in the commit.

> 
>> If a system have more than one SMT thread number the 2) may
> 
> s/have/has
> 
>> not handle it well, since there're multiple thread numbers in the
> 
> multiple thread numbers other than 1, right?

according to the pr_warn_once() we implemented below it also includes the case
where the system have one type of SMT cores and non-SMT cores (the thread number is 1):
- 1 thread
- X (!= 1) threads

Discussion made in [1] and I thought we have agreement (hope I understood correctly)
that all the asymmetric cases need to notify. Do you and Sudeep think we should not
warn in such case?

[1] https://lore.kernel.org/linux-arm-kernel/10082e64-b00a-a30b-b9c5-1401a54f6717@huawei.com/

> 
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
> 
> This paragraph seems to be about heterogeneous systems. Maybe mention this?
> 
> Heterogeneous system with SMT only on a subset of cores (like Intel
> Hybrid): This one works (N threads per core with N=1 and N=2) just fine.
> 
> But on Arm64 (default) we would still see:
> 
> [0.075782] Heterogeneous SMT topology is partly supported by SMT control
> 

more clearer, will add it. Thanks.

>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/base/arch_topology.c | 27 +++++++++++++++++++++++++++
>>  1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index 3ebe77566788..23f425a9d77a 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -11,6 +11,7 @@
>>  #include <linux/cleanup.h>
>>  #include <linux/cpu.h>
>>  #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>>  #include <linux/device.h>
>>  #include <linux/of.h>
>>  #include <linux/slab.h>
>> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
>>  #endif
>>  
>>  #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>> +
>> +/* Maximum SMT thread number detected used to enable the SMT control */
> 
> maybe shorter ?
> 
> /* used to enable SMT control */
> 

sure.

>> +static unsigned int max_smt_thread_num;
>> +
>>  /*
>>   * This function returns the logic cpu number of the node.
>>   * There are basically three kinds of return values:
>> @@ -565,6 +570,16 @@ static int __init parse_core(struct device_node *core, int package_id,
>>  		i++;
>>  	} while (1);
>>  
>> +	/*
>> +	 * If max_smt_thread_num has been initialized and doesn't match
>> +	 * the thread number of this entry, then the system has
>> +	 * heterogeneous SMT topology.
>> +	 */
>> +	if (max_smt_thread_num && max_smt_thread_num != i)
>> +		pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
>> +	max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i);
>> +
>>  	cpu = get_cpu_for_node(core);
>>  	if (cpu >= 0) {
>>  		if (!leaf) {
>> @@ -677,6 +692,18 @@ static int __init parse_socket(struct device_node *socket)
>>  	if (!has_socket)
>>  		ret = parse_cluster(socket, 0, -1, 0);
>>  
>> +	/*
>> +	 * Notify the CPU framework of the SMT support. Initialize the
>> +	 * max_smt_thread_num to 1 if no SMT support detected or failed
>> +	 * to parse the topology. A thread number of 1 can be handled by
>> +	 * the framework so we don't need to check max_smt_thread_num to
>> +	 * see we support SMT or not.
> 
> Not sure whether the last sentence is needed here?
> 

We always need to call cpu_smt_set_num_threads() to notify the framework
of the thread number even if SMT is not supported. In which case the
thread number is 1 but the framework can handle this well. I worry readers
may get confused for notifying a thread number of 1 so add this comment this.

Will get rid of this if thought redundant.

Thanks.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-02-28 13:54   ` Sudeep Holla
@ 2025-03-03 14:11     ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 14:11 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, guohanjun, sshegde

On 2025/2/28 21:54, Sudeep Holla wrote:
> On Tue, Feb 18, 2025 at 10:10:16PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On building the topology from the devicetree, we've already
>> gotten the SMT thread number of each core. Update the largest
>> SMT thread number and enable the SMT control by the end of
>> topology parsing.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/base/arch_topology.c | 27 +++++++++++++++++++++++++++
>>  1 file changed, 27 insertions(+)
>>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index 3ebe77566788..23f425a9d77a 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -11,6 +11,7 @@
>>  #include <linux/cleanup.h>
>>  #include <linux/cpu.h>
>>  #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>>  #include <linux/device.h>
>>  #include <linux/of.h>
>>  #include <linux/slab.h>
>> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
>>  #endif
>>  
>>  #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>> +
>> +/* Maximum SMT thread number detected used to enable the SMT control */
>> +static unsigned int max_smt_thread_num;
>> +
>>  /*
>>   * This function returns the logic cpu number of the node.
>>   * There are basically three kinds of return values:
>> @@ -565,6 +570,16 @@ static int __init parse_core(struct device_node *core, int package_id,
>>  		i++;
>>  	} while (1);
>>  
>> +	/*
>> +	 * If max_smt_thread_num has been initialized and doesn't match
>> +	 * the thread number of this entry, then the system has
>> +	 * heterogeneous SMT topology.
>> +	 */
>> +	if (max_smt_thread_num && max_smt_thread_num != i)
>> +		pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
> 
> May be we need to make it more conditional as we may have to support
> systems with few cores that are single threaded ? I think Dietmar's
> comment is about that.
> 

it thought of ignoring the cores with single thread in one previous discussion
as replied in Dietmar's thread.

>> +	max_smt_thread_num = max_t(unsigned int, max_smt_thread_num, i);
>> +
>>  	cpu = get_cpu_for_node(core);
>>  	if (cpu >= 0) {
>>  		if (!leaf) {
>> @@ -677,6 +692,18 @@ static int __init parse_socket(struct device_node *socket)
>>  	if (!has_socket)
>>  		ret = parse_cluster(socket, 0, -1, 0);
>>  
>> +	/*
>> +	 * Notify the CPU framework of the SMT support. Initialize the
>> +	 * max_smt_thread_num to 1 if no SMT support detected or failed
>> +	 * to parse the topology. A thread number of 1 can be handled by
>> +	 * the framework so we don't need to check max_smt_thread_num to
>> +	 * see we support SMT or not.
>> +	 */
>> +	if (!max_smt_thread_num || ret)
>> +		max_smt_thread_num = 1;
>> +
> 
> For the failed parsing of topology, reset_cpu_topology() gets called.
> I suggest resetting max_smt_thread_num to 1 belongs there.

this is only used by ARM64 || RISCV for using arch_topology to parse
the CPU topology, but the reset_cpu_topology() is also shared by arm/parisc.
Should we move it there and add some ARM64 || RISCV protection macro?

> 
> And if you start with max_smt_thread_num, we don't need to update it to
> 1 explicitly here. So I would like to get rid of above check completely.
> 
> --
> Regards,
> Sudeep
> 
> .
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-03 11:16           ` Sudeep Holla
@ 2025-03-03 14:40             ` Yicong Yang
  2025-03-04  8:25               ` Pierre Gondois
  0 siblings, 1 reply; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 14:40 UTC (permalink / raw)
  To: Sudeep Holla, Pierre Gondois
  Cc: yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm, xuwei5,
	guohanjun, sshegde

On 2025/3/3 19:16, Sudeep Holla wrote:
> On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
>> On 2/28/25 20:06, Sudeep Holla wrote:
>>>>>
>>>>> Ditto as previous patch, can get rid if it is default 1.
>>>>>
>>>>
>>>> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
>>>> cpu_smt_num_threads uninitialized to UINT_MAX:
>>>>
>>>> smt/active:0
>>>> smt/control:-1
>>>>
>>>> If cpu_smt_set_num_threads() is called:
>>>> active:0
>>>> control:notsupported
>>>>
>>>> So it might be slightly better to still initialize max_smt_thread_num.
>>>>
>>>
>>> Sure, what I meant is to have max_smt_thread_num set to 1 by default is
>>> that is what needed anyways and the above code does that now.
>>>
>>> Why not start with initialised to 1 instead ?
>>> Of course some current logic needs to change around testing it for zero.
>>>
>>
>> I think there would still be a way to check against the default value.
>> If we have:
>> unsigned int max_smt_thread_num = 1;
>>
>> then on a platform with 2 threads, the detection condition would trigger:
>> xa_for_each(&hetero_cpu, hetero_id, entry) {
>>     if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
>>         pr_warn_once("Heterogeneous SMT topology is partly
>>                       supported by SMT control\n");
>>
>> so we would need an additional variable:
>> bool is_initialized = false;
> 
> Sure, we could do that or skip the check if max_smt_thread_num == 1 ?
> 
> I mean
> 	if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)
> 

this will work for me. will launch some tests.

Thanks.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 0/4] Support SMT control on arm64
  2025-02-28 11:12 ` [PATCH v11 0/4] Support SMT control on arm64 Dietmar Eggemann
@ 2025-03-03 14:41   ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 14:41 UTC (permalink / raw)
  To: Dietmar Eggemann, catalin.marinas, will, sudeep.holla, tglx,
	peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
	pierre.gondois
  Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
	msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
	xuwei5, guohanjun, sshegde

On 2025/2/28 19:12, Dietmar Eggemann wrote:
> On 18/02/2025 15:10, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> The core CPU control framework supports runtime SMT control which
>> is not yet supported on arm64. Besides the general vulnerabilities
>> concerns we want this runtime control on our arm64 server for:
>>
>> - better single CPU performance in some cases
>> - saving overall power consumption
>>
>> This patchset implements it in the following aspects:
>>
>> - Provides a default topology_is_primary_thread()
>> - support retrieve SMT thread number on OF based system
>> - support retrieve SMT thread number on ACPI based system
>> - select HOTPLUG_SMT for arm64
>>
>> Tests has been done on our ACPI based arm64 server and on ACPI/OF
>> based QEMU VMs.
> 
> [...]
> 
>> Yicong Yang (4):
>>   cpu/SMT: Provide a default topology_is_primary_thread()
>>   arch_topology: Support SMT control for OF based system
>>   arm64: topology: Support SMT control on ACPI based system
>>   arm64: Kconfig: Enable HOTPLUG_SMT
>>
>>  arch/arm64/Kconfig                  |  1 +
>>  arch/arm64/kernel/topology.c        | 66 +++++++++++++++++++++++++++++
>>  arch/powerpc/include/asm/topology.h |  1 +
>>  arch/x86/include/asm/topology.h     |  2 +-
>>  drivers/base/arch_topology.c        | 27 ++++++++++++
>>  include/linux/topology.h            | 22 ++++++++++
>>  6 files changed, 118 insertions(+), 1 deletion(-)
> 
> With the review comments on the individual patches [0-3]/4:

will fix.

> 
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> 

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-02-25  6:08   ` Hanjun Guo
@ 2025-03-03 14:42     ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-03 14:42 UTC (permalink / raw)
  To: Hanjun Guo
  Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
	dietmar.eggemann, yangyicong, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, sshegde

On 2025/2/25 14:08, Hanjun Guo wrote:
> On 2025/2/18 22:10, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
>> and SMT information of the first core in its heterogeneous CPU cluster
>> when building the topology. Then we can know the largest SMT number
>> in the system. If a homogeneous system's using ACPI 6.2 or later,
>> all the CPUs should be under the root node of PPTT. There'll be
>> only one entry in the xarray and all the CPUs in the system will
>> be assumed identical.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>   arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>>   1 file changed, 66 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 1a2c72f3e7f8..6eba1ac091ee 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -15,8 +15,10 @@
>>   #include <linux/arch_topology.h>
>>   #include <linux/cacheinfo.h>
>>   #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>>   #include <linux/init.h>
>>   #include <linux/percpu.h>
>> +#include <linux/xarray.h>
>>     #include <asm/cpu.h>
>>   #include <asm/cputype.h>
>> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>>       return !!is_threaded;
>>   }
>>   +struct cpu_smt_info {
>> +    unsigned int thread_num;
>> +    int core_id;
>> +};
>> +
>>   /*
>>    * Propagate the topology information of the processor_topology_node tree to the
>>    * cpu_topology array.
>>    */
>>   int __init parse_acpi_topology(void)
>>   {
>> +    unsigned int max_smt_thread_num = 0;
>> +    struct cpu_smt_info *entry;
>> +    struct xarray hetero_cpu;
>> +    unsigned long hetero_id;
>>       int cpu, topology_id;
>>         if (acpi_disabled)
>>           return 0;
>>   +    xa_init(&hetero_cpu);
>> +
>>       for_each_possible_cpu(cpu) {
>>           topology_id = find_acpi_cpu_topology(cpu, 0);
>>           if (topology_id < 0)
>> @@ -57,6 +70,34 @@ int __init parse_acpi_topology(void)
>>               cpu_topology[cpu].thread_id = topology_id;
>>               topology_id = find_acpi_cpu_topology(cpu, 1);
>>               cpu_topology[cpu].core_id   = topology_id;
>> +
>> +            /*
>> +             * In the PPTT, CPUs below a node with the 'identical
>> +             * implementation' flag have the same number of threads.
>> +             * Count the number of threads for only one CPU (i.e.
>> +             * one core_id) among those with the same hetero_id.
>> +             * See the comment of find_acpi_cpu_topology_hetero_id()
>> +             * for more details.
>> +             *
>> +             * One entry is created for each node having:
>> +             * - the 'identical implementation' flag
>> +             * - its parent not having the flag
>> +             */
>> +            hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
>> +            entry = xa_load(&hetero_cpu, hetero_id);
>> +            if (!entry) {
>> +                entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> +                WARN_ON_ONCE(!entry);
>> +
>> +                if (entry) {
>> +                    entry->core_id = topology_id;
>> +                    entry->thread_num = 1;
>> +                    xa_store(&hetero_cpu, hetero_id,
>> +                         entry, GFP_KERNEL);
>> +                }
>> +            } else if (entry->core_id == topology_id) {
>> +                entry->thread_num++;
>> +            }
>>           } else {
>>               cpu_topology[cpu].thread_id  = -1;
>>               cpu_topology[cpu].core_id    = topology_id;
>> @@ -67,6 +108,31 @@ int __init parse_acpi_topology(void)
>>           cpu_topology[cpu].package_id = topology_id;
>>       }
>>   +    /*
>> +     * This should be a short loop depending on the number of heterogeneous
>> +     * CPU clusters. Typically on a homogeneous system there's only one
>> +     * entry in the XArray.
>> +     */
>> +    xa_for_each(&hetero_cpu, hetero_id, entry) {
>> +        if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
>> +            pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
>> +        max_smt_thread_num = max(max_smt_thread_num, entry->thread_num);
>> +        xa_erase(&hetero_cpu, hetero_id);
>> +        kfree(entry);
>> +    }
>> +
>> +    /*
>> +     * Notify the CPU framework of the SMT support. Initialize the
>> +     * max_smt_thread_num to 1 if no SMT support detected. A thread
>> +     * number of 1 can be handled by the framework so we don't need
>> +     * to check max_smt_thread_num to see we support SMT or not.
>> +     */
>> +    if (!max_smt_thread_num)
>> +        max_smt_thread_num = 1;
>> +
>> +    cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>> +    xa_destroy(&hetero_cpu);
>>       return 0;
>>   }
>>   #endif
> 
> Looks good to me,
> 
> Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
> 

Thanks a lot for taking a look :)




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-03 14:40             ` Yicong Yang
@ 2025-03-04  8:25               ` Pierre Gondois
  2025-03-04 10:02                 ` Sudeep Holla
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Gondois @ 2025-03-04  8:25 UTC (permalink / raw)
  To: Yicong Yang, Sudeep Holla
  Cc: yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm, xuwei5,
	guohanjun, sshegde



On 3/3/25 15:40, Yicong Yang wrote:
> On 2025/3/3 19:16, Sudeep Holla wrote:
>> On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
>>> On 2/28/25 20:06, Sudeep Holla wrote:
>>>>>>
>>>>>> Ditto as previous patch, can get rid if it is default 1.
>>>>>>
>>>>>
>>>>> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
>>>>> cpu_smt_num_threads uninitialized to UINT_MAX:
>>>>>
>>>>> smt/active:0
>>>>> smt/control:-1
>>>>>
>>>>> If cpu_smt_set_num_threads() is called:
>>>>> active:0
>>>>> control:notsupported
>>>>>
>>>>> So it might be slightly better to still initialize max_smt_thread_num.
>>>>>
>>>>
>>>> Sure, what I meant is to have max_smt_thread_num set to 1 by default is
>>>> that is what needed anyways and the above code does that now.
>>>>
>>>> Why not start with initialised to 1 instead ?
>>>> Of course some current logic needs to change around testing it for zero.
>>>>
>>>
>>> I think there would still be a way to check against the default value.
>>> If we have:
>>> unsigned int max_smt_thread_num = 1;
>>>
>>> then on a platform with 2 threads, the detection condition would trigger:
>>> xa_for_each(&hetero_cpu, hetero_id, entry) {
>>>      if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
>>>          pr_warn_once("Heterogeneous SMT topology is partly
>>>                        supported by SMT control\n");
>>>
>>> so we would need an additional variable:
>>> bool is_initialized = false;
>>
>> Sure, we could do that or skip the check if max_smt_thread_num == 1 ?
>>
>> I mean
>> 	if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)
>>

I think it will be problematic if we parse:
- first a CPU with 1 thread
- then a CPU with 2 threads

in that case we should detect the 'Heterogeneous SMT topology',
but we cannot because we don't know whether max_smt_thread_num=1
because 1 is the default value or we found a CPU with one thread.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 2/4] arch_topology: Support SMT control for OF based system
  2025-03-03 14:03     ` Yicong Yang
@ 2025-03-04  9:32       ` Dietmar Eggemann
  0 siblings, 0 replies; 29+ messages in thread
From: Dietmar Eggemann @ 2025-03-04  9:32 UTC (permalink / raw)
  To: Yicong Yang, sudeep.holla, pierre.gondois
  Cc: dave.hansen, bp, mingo, linux-arm-kernel, mpe, peterz, tglx, will,
	catalin.marinas, yangyicong, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, guohanjun, sshegde

On 03/03/2025 15:03, Yicong Yang wrote:
> On 2025/2/28 19:11, Dietmar Eggemann wrote:
>> On 18/02/2025 15:10, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>

[...]

>>> If a system have more than one SMT thread number the 2) may
>>
>> s/have/has
>>
>>> not handle it well, since there're multiple thread numbers in the
>>
>> multiple thread numbers other than 1, right?
> 
> according to the pr_warn_once() we implemented below it also includes the case
> where the system have one type of SMT cores and non-SMT cores (the thread number is 1):
> - 1 thread
> - X (!= 1) threads
> 
> Discussion made in [1] and I thought we have agreement (hope I understood correctly)
> that all the asymmetric cases need to notify. Do you and Sudeep think we should not
> warn in such case?

Systems with non-SMT and SMT-2 cores are IMHO a special case since for
them the '/sys/devices/system/cpu/smt' interface still works correctly.
And on X86 those systems do exist today.

IMHO, it would be awkward to see the message 'Heterogeneous SMT topology
is partly supported by SMT control' on arm64 but not on x86 on such a
system.

I do understand that this message is more tailored to theoretically
possible 'multiple SMT-X (X>1) core' systems (e.g. 1,2,4).

And here we cannot issue a '2 > ./control' since
cpu_smt_num_threads_valid() only returns true for 1 or 4.

IMHO, I would remove the warning and state clearly in the patch that for
systems with multiple SMT-X (X>1) cores, this interface only support SMT
completely on or off.

Example Arm64 DT:

cpu-map {
        cluster0 {
                core0 {
                        thread0 {
                                cpu = <&A53_0>;
                        };
                };
                core1 {
                        thread0 {
                                cpu = <&A53_1>;
                        };
                };
                core2 {
                        thread0 {
                                cpu = <&A53_2>;
                        };
                        thread1 {
                                cpu = <&A53_3>;
                        };
                };
                core3 {
                        thread0 {
                                cpu = <&A53_4>;
                        };
                        thread1 {
                                cpu = <&A53_5>;
                        };
                        thread2 {
                                cpu = <&A53_6>;
                        };
                        thread3 {
                                cpu = <&A53_7>;
                        };
                };
        };
};

# cat /proc/cpuinfo | grep ^processor
processor	: 0
processor	: 1
processor	: 2
processor	: 3
processor	: 4
processor	: 5
processor	: 6
processor	: 7

/sys/devices/system/cpu/smt# echo 1 >control

# cat /proc/cpuinfo | grep ^processor
processor	: 0
processor	: 1
processor	: 2
processor	: 4

/sys/devices/system/cpu/smt# echo 2 >control
-bash: echo: write error: Invalid argument

[...]


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-04  8:25               ` Pierre Gondois
@ 2025-03-04 10:02                 ` Sudeep Holla
  2025-03-04 15:07                   ` Pierre Gondois
  0 siblings, 1 reply; 29+ messages in thread
From: Sudeep Holla @ 2025-03-04 10:02 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Yicong Yang, yangyicong, Sudeep Holla, catalin.marinas, will,
	tglx, peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
	dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
	morten.rasmussen, msuchanek, gregkh, rafael, jonathan.cameron,
	prime.zeng, linuxarm, xuwei5, guohanjun, sshegde

On Tue, Mar 04, 2025 at 09:25:02AM +0100, Pierre Gondois wrote:
>
>
> On 3/3/25 15:40, Yicong Yang wrote:
> > On 2025/3/3 19:16, Sudeep Holla wrote:
> > > On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
> > > > On 2/28/25 20:06, Sudeep Holla wrote:
> > > > > > >
> > > > > > > Ditto as previous patch, can get rid if it is default 1.
> > > > > > >
> > > > > >
> > > > > > On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
> > > > > > cpu_smt_num_threads uninitialized to UINT_MAX:
> > > > > >
> > > > > > smt/active:0
> > > > > > smt/control:-1
> > > > > >
> > > > > > If cpu_smt_set_num_threads() is called:
> > > > > > active:0
> > > > > > control:notsupported
> > > > > >
> > > > > > So it might be slightly better to still initialize max_smt_thread_num.
> > > > > >
> > > > >
> > > > > Sure, what I meant is to have max_smt_thread_num set to 1 by default is
> > > > > that is what needed anyways and the above code does that now.
> > > > >
> > > > > Why not start with initialised to 1 instead ?
> > > > > Of course some current logic needs to change around testing it for zero.
> > > > >
> > > >
> > > > I think there would still be a way to check against the default value.
> > > > If we have:
> > > > unsigned int max_smt_thread_num = 1;
> > > >
> > > > then on a platform with 2 threads, the detection condition would trigger:
> > > > xa_for_each(&hetero_cpu, hetero_id, entry) {
> > > >      if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
> > > >          pr_warn_once("Heterogeneous SMT topology is partly
> > > >                        supported by SMT control\n");
> > > >
> > > > so we would need an additional variable:
> > > > bool is_initialized = false;
> > >
> > > Sure, we could do that or skip the check if max_smt_thread_num == 1 ?
> > >
> > > I mean
> > > 	if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)
> > >
>
> I think it will be problematic if we parse:
> - first a CPU with 1 thread
> - then a CPU with 2 threads
>
> in that case we should detect the 'Heterogeneous SMT topology',
> but we cannot because we don't know whether max_smt_thread_num=1
> because 1 is the default value or we found a CPU with one thread.

Right, but as per Dietmar's and my previous response, it may be a valid
case. See latest response from Dietmar which is explicitly requesting
support for this. It may need some special handling if we decide to support
that.

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-04 10:02                 ` Sudeep Holla
@ 2025-03-04 15:07                   ` Pierre Gondois
  2025-03-05  9:01                     ` Yicong Yang
  0 siblings, 1 reply; 29+ messages in thread
From: Pierre Gondois @ 2025-03-04 15:07 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Yicong Yang, yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
	linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
	gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm, xuwei5,
	guohanjun, sshegde



On 3/4/25 11:02, Sudeep Holla wrote:
> On Tue, Mar 04, 2025 at 09:25:02AM +0100, Pierre Gondois wrote:
>>
>>
>> On 3/3/25 15:40, Yicong Yang wrote:
>>> On 2025/3/3 19:16, Sudeep Holla wrote:
>>>> On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
>>>>> On 2/28/25 20:06, Sudeep Holla wrote:
>>>>>>>>
>>>>>>>> Ditto as previous patch, can get rid if it is default 1.
>>>>>>>>
>>>>>>>
>>>>>>> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
>>>>>>> cpu_smt_num_threads uninitialized to UINT_MAX:
>>>>>>>
>>>>>>> smt/active:0
>>>>>>> smt/control:-1
>>>>>>>
>>>>>>> If cpu_smt_set_num_threads() is called:
>>>>>>> active:0
>>>>>>> control:notsupported
>>>>>>>
>>>>>>> So it might be slightly better to still initialize max_smt_thread_num.
>>>>>>>
>>>>>>
>>>>>> Sure, what I meant is to have max_smt_thread_num set to 1 by default is
>>>>>> that is what needed anyways and the above code does that now.
>>>>>>
>>>>>> Why not start with initialised to 1 instead ?
>>>>>> Of course some current logic needs to change around testing it for zero.
>>>>>>
>>>>>
>>>>> I think there would still be a way to check against the default value.
>>>>> If we have:
>>>>> unsigned int max_smt_thread_num = 1;
>>>>>
>>>>> then on a platform with 2 threads, the detection condition would trigger:
>>>>> xa_for_each(&hetero_cpu, hetero_id, entry) {
>>>>>       if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
>>>>>           pr_warn_once("Heterogeneous SMT topology is partly
>>>>>                         supported by SMT control\n");
>>>>>
>>>>> so we would need an additional variable:
>>>>> bool is_initialized = false;
>>>>
>>>> Sure, we could do that or skip the check if max_smt_thread_num == 1 ?
>>>>
>>>> I mean
>>>> 	if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)
>>>>
>>
>> I think it will be problematic if we parse:
>> - first a CPU with 1 thread
>> - then a CPU with 2 threads
>>
>> in that case we should detect the 'Heterogeneous SMT topology',
>> but we cannot because we don't know whether max_smt_thread_num=1
>> because 1 is the default value or we found a CPU with one thread.
> 
> Right, but as per Dietmar's and my previous response, it may be a valid
> case. See latest response from Dietmar which is explicitly requesting
> support for this. It may need some special handling if we decide to support
> that.

Ah ok, right indeed.
For heterogeneous SMT platforms, the 'smt/control' file is able to accept
on/off/forceoff strings. But providing the max #count of threads as an integer would
be wrong if the CPU doesn't have this #count of threads.

Initially the idea was to just warn that support might be needed for heterogeneous
SMT platforms, and let whoever would have such platform solve this case, but just
disabling the integer interface in this case would solve the issue generically.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI based system
  2025-03-04 15:07                   ` Pierre Gondois
@ 2025-03-05  9:01                     ` Yicong Yang
  0 siblings, 0 replies; 29+ messages in thread
From: Yicong Yang @ 2025-03-05  9:01 UTC (permalink / raw)
  To: Pierre Gondois, Sudeep Holla, dietmar.eggemann
  Cc: yangyicong, catalin.marinas, will, tglx, peterz, mpe,
	linux-arm-kernel, mingo, bp, dave.hansen, linuxppc-dev, x86,
	linux-kernel, morten.rasmussen, msuchanek, gregkh, rafael,
	jonathan.cameron, prime.zeng, linuxarm, xuwei5, guohanjun,
	sshegde

On 2025/3/4 23:07, Pierre Gondois wrote:
> 
> 
> On 3/4/25 11:02, Sudeep Holla wrote:
>> On Tue, Mar 04, 2025 at 09:25:02AM +0100, Pierre Gondois wrote:
>>>
>>>
>>> On 3/3/25 15:40, Yicong Yang wrote:
>>>> On 2025/3/3 19:16, Sudeep Holla wrote:
>>>>> On Mon, Mar 03, 2025 at 10:56:12AM +0100, Pierre Gondois wrote:
>>>>>> On 2/28/25 20:06, Sudeep Holla wrote:
>>>>>>>>>
>>>>>>>>> Ditto as previous patch, can get rid if it is default 1.
>>>>>>>>>
>>>>>>>>
>>>>>>>> On non-SMT platforms, not calling cpu_smt_set_num_threads() leaves
>>>>>>>> cpu_smt_num_threads uninitialized to UINT_MAX:
>>>>>>>>
>>>>>>>> smt/active:0
>>>>>>>> smt/control:-1
>>>>>>>>
>>>>>>>> If cpu_smt_set_num_threads() is called:
>>>>>>>> active:0
>>>>>>>> control:notsupported
>>>>>>>>
>>>>>>>> So it might be slightly better to still initialize max_smt_thread_num.
>>>>>>>>
>>>>>>>
>>>>>>> Sure, what I meant is to have max_smt_thread_num set to 1 by default is
>>>>>>> that is what needed anyways and the above code does that now.
>>>>>>>
>>>>>>> Why not start with initialised to 1 instead ?
>>>>>>> Of course some current logic needs to change around testing it for zero.
>>>>>>>
>>>>>>
>>>>>> I think there would still be a way to check against the default value.
>>>>>> If we have:
>>>>>> unsigned int max_smt_thread_num = 1;
>>>>>>
>>>>>> then on a platform with 2 threads, the detection condition would trigger:
>>>>>> xa_for_each(&hetero_cpu, hetero_id, entry) {
>>>>>>       if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)     <---- (entry->thread_num=2) and (max_smt_thread_num=1)
>>>>>>           pr_warn_once("Heterogeneous SMT topology is partly
>>>>>>                         supported by SMT control\n");
>>>>>>
>>>>>> so we would need an additional variable:
>>>>>> bool is_initialized = false;
>>>>>
>>>>> Sure, we could do that or skip the check if max_smt_thread_num == 1 ?
>>>>>
>>>>> I mean
>>>>>     if (entry->thread_num != max_smt_thread_num && max_smt_thread_num != 1)
>>>>>
>>>
>>> I think it will be problematic if we parse:
>>> - first a CPU with 1 thread
>>> - then a CPU with 2 threads
>>>
>>> in that case we should detect the 'Heterogeneous SMT topology',
>>> but we cannot because we don't know whether max_smt_thread_num=1
>>> because 1 is the default value or we found a CPU with one thread.
>>
>> Right, but as per Dietmar's and my previous response, it may be a valid
>> case. See latest response from Dietmar which is explicitly requesting
>> support for this. It may need some special handling if we decide to support
>> that.
> 
> Ah ok, right indeed.
> For heterogeneous SMT platforms, the 'smt/control' file is able to accept
> on/off/forceoff strings. But providing the max #count of threads as an integer would
> be wrong if the CPU doesn't have this #count of threads.
> 
> Initially the idea was to just warn that support might be needed for heterogeneous
> SMT platforms, and let whoever would have such platform solve this case, but just
> disabling the integer interface in this case would solve the issue generically.
> 

ok so let's regard the asymmetric platform as a valid case as suggested (also mentioned
by Dietmar on another thread) and remove the check here. Will update and test.

Thanks.



^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2025-03-05 11:18 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-18 14:10 [PATCH v11 0/4] Support SMT control on arm64 Yicong Yang
2025-02-18 14:10 ` [PATCH v11 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2025-02-28 11:10   ` Dietmar Eggemann
2025-03-03 13:35     ` Yicong Yang
2025-02-28 13:54   ` Sudeep Holla
2025-03-03 13:38     ` Yicong Yang
2025-02-18 14:10 ` [PATCH v11 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
2025-02-28 11:11   ` Dietmar Eggemann
2025-03-03 14:03     ` Yicong Yang
2025-03-04  9:32       ` Dietmar Eggemann
2025-02-28 13:54   ` Sudeep Holla
2025-03-03 14:11     ` Yicong Yang
2025-02-18 14:10 ` [PATCH v11 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
2025-02-25  6:08   ` Hanjun Guo
2025-03-03 14:42     ` Yicong Yang
2025-02-28 11:11   ` Dietmar Eggemann
2025-02-28 13:56   ` Sudeep Holla
2025-02-28 17:51     ` Pierre Gondois
2025-02-28 19:06       ` Sudeep Holla
2025-03-03  9:56         ` Pierre Gondois
2025-03-03 11:16           ` Sudeep Holla
2025-03-03 14:40             ` Yicong Yang
2025-03-04  8:25               ` Pierre Gondois
2025-03-04 10:02                 ` Sudeep Holla
2025-03-04 15:07                   ` Pierre Gondois
2025-03-05  9:01                     ` Yicong Yang
2025-02-18 14:10 ` [PATCH v11 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
2025-02-28 11:12 ` [PATCH v11 0/4] Support SMT control on arm64 Dietmar Eggemann
2025-03-03 14:41   ` Yicong Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).