* [PATCH v10 0/4] Support SMT control on arm64
@ 2024-12-20 7:53 Yicong Yang
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
` (4 more replies)
0 siblings, 5 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:53 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
The core CPU control framework supports runtime SMT control which
is not yet supported on arm64. Besides the general vulnerabilities
concerns we want this runtime control on our arm64 server for:
- better single CPU performance in some cases
- saving overall power consumption
This patchset implements it in the following aspects:
- Provides a default topology_is_primary_thread()
- support retrieve SMT thread number on OF based system
- support retrieve SMT thread number on ACPI based system
- select HOTPLUG_SMT for arm64
Tests has been done on our ACPI based arm64 server and on ACPI/OF\
based QEMU VMs.
Change since v9:
- Refine the comment of topology_is_primary_thread(). Tested with LoongArch
to prove it also works on architecture's not using CONFIG_GENERIC_ARCH_TOPOLOGY
- always call cpu_smt_set_num_threads() to make the smt/control shows correct
status on non-SMT system
Link: https://lore.kernel.org/linux-arm-kernel/20241114141127.23232-1-yangyicong@huawei.com/
Change since v8:
- Fix WARN on ACPI based non-SMT platform noticed in v7, per Pierre.
Link: https://lore.kernel.org/all/20241105093237.63565-1-yangyicong@huawei.com/
Change since v7:
Address the comments from Thomas:
- Add a newline between the glue define and function of topology_is_primary_thread
- Explicitly mention the sibling mask won't be empty in the comment
Link: https://lore.kernel.org/lkml/20241030125415.18994-1-yangyicong@huawei.com/
Change since v6:
- Fix unused variable if !CONFIG_ARM64 || !CONFIG_RISV found by lkp-test
- Fix max_smt_thread_num updating in OF path pointed by Pierre
- Drop unused variable and refine the comments/commit per Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20241015021841.35713-1-yangyicong@huawei.com/
Change since v5:
- Drop the dependency on CONFIG_SMP since it's always on on arm64, per Pierre
- Avoid potential multiple calls of cpu_smt_set_num_threads() on asymmetric system, per Dietmar
- Detect heterogenous SMT topology and issue a warning for partly support, per Pierre
- Thanks Dietmar for testing, didn't pickup the tag due to code changes. Thanks testing by Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20240806085320.63514-1-yangyicong@huawei.com/
Change since v4:
- Provide a default topology_is_primary_thread() in the framework, Per Will
Link: https://lore.kernel.org/linux-arm-kernel/20231121092602.47792-1-yangyicong@huawei.com/
Change since v3:
- Fix some build and kconfig error reported by kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-arm-kernel/20231114040110.54590-1-yangyicong@huawei.com/
Change since v2:
- Detect SMT thread number at topology build from ACPI/DT, avoid looping CPUs
- Split patches into ACPI/OF/arch_topology path and enable the kconfig for arm64
Link: https://lore.kernel.org/linux-arm-kernel/20231010115335.13862-1-yangyicong@huawei.com/
Yicong Yang (4):
cpu/SMT: Provide a default topology_is_primary_thread()
arch_topology: Support SMT control for OF based system
arm64: topology: Support SMT control on ACPI based system
arm64: Kconfig: Enable HOTPLUG_SMT
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/topology.c | 66 +++++++++++++++++++++++++++++
arch/powerpc/include/asm/topology.h | 1 +
arch/x86/include/asm/topology.h | 2 +-
drivers/base/arch_topology.c | 26 ++++++++++++
include/linux/topology.h | 22 ++++++++++
6 files changed, 117 insertions(+), 1 deletion(-)
--
2.24.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
@ 2024-12-20 7:53 ` Yicong Yang
2024-12-20 7:59 ` Yicong Yang
2024-12-26 7:18 ` Shrikanth Hegde
2024-12-20 7:53 ` [PATCH v10 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
` (3 subsequent siblings)
4 siblings, 2 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:53 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
Currently if architectures want to support HOTPLUG_SMT they need to
provide a topology_is_primary_thread() telling the framework which
thread in the SMT cannot offline. However arm64 doesn't have a
restriction on which thread in the SMT cannot offline, a simplest
choice is that just make 1st thread as the "primary" thread. So
just make this as the default implementation in the framework and
let architectures like x86 that have special primary thread to
override this function (which they've already done).
There's no need to provide a stub function if !CONFIG_SMP or
!CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
the 1st CPU in the SMT so it's always the primary thread.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
As questioned in v9 [1] whether this works on architectures not using
CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
Architectures should use this on their own situation.
[1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
[root@localhost smt]# uname -m
loongarch64
[root@localhost smt]# pwd
/sys/devices/system/cpu/smt
[root@localhost smt]# cat ../possible
0-3
[root@localhost smt]# cat ../online
0-3
[root@localhost smt]# cat control
on
[root@localhost smt]# echo off > control
[root@localhost smt]# cat control
off
[root@localhost smt]# cat ../online
0,2
[root@localhost smt]# echo on > control
[root@localhost smt]# cat control
on
[root@localhost smt]# cat ../online
0-3
arch/powerpc/include/asm/topology.h | 1 +
arch/x86/include/asm/topology.h | 2 +-
include/linux/topology.h | 22 ++++++++++++++++++++++
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 16bacfe8c7a2..da15b5efe807 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
{
return cpu == cpu_first_thread_sibling(cpu);
}
+#define topology_is_primary_thread topology_is_primary_thread
static inline bool topology_smt_thread_allowed(unsigned int cpu)
{
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index fd41103ad342..faa0d6334ea4 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -228,11 +228,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
}
+#define topology_is_primary_thread topology_is_primary_thread
#else /* CONFIG_SMP */
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
-static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
#endif /* !CONFIG_SMP */
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 52f5850730b3..b3aba443c4eb 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
}
#endif
+#ifndef topology_is_primary_thread
+
+#define topology_is_primary_thread topology_is_primary_thread
+
+static inline bool topology_is_primary_thread(unsigned int cpu)
+{
+ /*
+ * On SMT hotplug the primary thread of the SMT won't be disabled.
+ * Architectures do have a special primary thread (e.g. x86) need
+ * to override this function. Otherwise just make the first thread
+ * in the SMT as the primary thread.
+ *
+ * The sibling cpumask of an offline CPU contains always the CPU
+ * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
+ * Other architectures should use this depend on their own
+ * situation.
+ */
+ return cpu == cpumask_first(topology_sibling_cpumask(cpu));
+}
+
+#endif
+
static inline const struct cpumask *cpu_cpu_mask(int cpu)
{
return cpumask_of_node(cpu_to_node(cpu));
--
2.24.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v10 2/4] arch_topology: Support SMT control for OF based system
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2024-12-20 7:53 ` Yicong Yang
2024-12-23 16:33 ` Jonathan Cameron
2024-12-20 7:53 ` [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
` (2 subsequent siblings)
4 siblings, 1 reply; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:53 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
On building the topology from the devicetree, we've already
gotten the SMT thread number of each core. Update the largest
SMT thread number and enable the SMT control by the end of
topology parsing.
The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number
If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
drivers/base/arch_topology.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 3ebe77566788..9e81060144c7 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -11,6 +11,7 @@
#include <linux/cleanup.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
#include <linux/device.h>
#include <linux/of.h>
#include <linux/slab.h>
@@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
#endif
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+
+/* Maximum SMT thread number detected used to enable the SMT control */
+static unsigned int max_smt_thread_num;
+
/*
* This function returns the logic cpu number of the node.
* There are basically three kinds of return values:
@@ -565,6 +570,17 @@ static int __init parse_core(struct device_node *core, int package_id,
i++;
} while (1);
+ /*
+ * If max_smt_thread_num has been initialized and doesn't match
+ * the thread number of this entry, then the system has
+ * heterogeneous SMT topology.
+ */
+ if (max_smt_thread_num && max_smt_thread_num != i)
+ pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+ if (max_smt_thread_num < i)
+ max_smt_thread_num = i;
+
cpu = get_cpu_for_node(core);
if (cpu >= 0) {
if (!leaf) {
@@ -677,6 +693,16 @@ static int __init parse_socket(struct device_node *socket)
if (!has_socket)
ret = parse_cluster(socket, 0, -1, 0);
+ /*
+ * Notify the CPU framework of the SMT support. Initialize the
+ * max_smt_thread_num to 1 if no SMT support detected. A thread
+ * number of 1 can be handled by the framework so we don't need
+ * to check max_smt_thread_num to see we support SMT or not.
+ */
+ if (!max_smt_thread_num)
+ max_smt_thread_num = 1;
+
+ cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
return ret;
}
--
2.24.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI based system
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-12-20 7:53 ` [PATCH v10 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
@ 2024-12-20 7:53 ` Yicong Yang
2024-12-23 16:40 ` Jonathan Cameron
2024-12-20 7:53 ` [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
2024-12-26 9:23 ` [PATCH v10 0/4] Support SMT control on arm64 Shrikanth Hegde
4 siblings, 1 reply; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:53 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
For ACPI we'll build the topology from PPTT and we cannot directly
get the SMT number of each core. Instead using a temporary xarray
to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
and SMT information of the first core in its heterogeneous CPU cluster
when building the topology. Then we can know the largest SMT number
in the system. If a homogeneous system's using ACPI 6.2 or later,
all the CPUs should be under the root node of PPTT. There'll be
only one entry in the xarray and all the CPUs in the system will
be assumed identical.
The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number
If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
1 file changed, 66 insertions(+)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 1a2c72f3e7f8..85cb18d72a29 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -15,8 +15,10 @@
#include <linux/arch_topology.h>
#include <linux/cacheinfo.h>
#include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
#include <linux/init.h>
#include <linux/percpu.h>
+#include <linux/xarray.h>
#include <asm/cpu.h>
#include <asm/cputype.h>
@@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
return !!is_threaded;
}
+struct cpu_smt_info {
+ int thread_num;
+ int core_id;
+};
+
/*
* Propagate the topology information of the processor_topology_node tree to the
* cpu_topology array.
*/
int __init parse_acpi_topology(void)
{
+ int max_smt_thread_num = 0;
+ struct cpu_smt_info *entry;
+ struct xarray hetero_cpu;
+ unsigned long hetero_id;
int cpu, topology_id;
if (acpi_disabled)
return 0;
+ xa_init(&hetero_cpu);
+
for_each_possible_cpu(cpu) {
topology_id = find_acpi_cpu_topology(cpu, 0);
if (topology_id < 0)
@@ -57,6 +70,32 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].thread_id = topology_id;
topology_id = find_acpi_cpu_topology(cpu, 1);
cpu_topology[cpu].core_id = topology_id;
+
+ /*
+ * In the PPTT, CPUs below a node with the 'identical
+ * implementation' flag have the same number of threads.
+ * Count the number of threads for only one CPU (i.e.
+ * one core_id) among those with the same hetero_id.
+ * See the comment of find_acpi_cpu_topology_hetero_id()
+ * for more details.
+ *
+ * One entry is created for each node having:
+ * - the 'identical implementation' flag
+ * - its parent not having the flag
+ */
+ hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
+ entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
+ if (!entry) {
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ WARN_ON(!entry);
+
+ entry->core_id = topology_id;
+ entry->thread_num = 1;
+ xa_store(&hetero_cpu, hetero_id,
+ entry, GFP_KERNEL);
+ } else if (entry->core_id == topology_id) {
+ entry->thread_num++;
+ }
} else {
cpu_topology[cpu].thread_id = -1;
cpu_topology[cpu].core_id = topology_id;
@@ -67,6 +106,33 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].package_id = topology_id;
}
+ /*
+ * This should be a short loop depending on the number of heterogeneous
+ * CPU clusters. Typically on a homogeneous system there's only one
+ * entry in the XArray.
+ */
+ xa_for_each(&hetero_cpu, hetero_id, entry) {
+ if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
+ pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+ if (entry->thread_num > max_smt_thread_num)
+ max_smt_thread_num = entry->thread_num;
+
+ xa_erase(&hetero_cpu, hetero_id);
+ kfree(entry);
+ }
+
+ /*
+ * Notify the CPU framework of the SMT support. Initialize the
+ * max_smt_thread_num to 1 if no SMT support detected. A thread
+ * number of 1 can be handled by the framework so we don't need
+ * to check max_smt_thread_num to see we support SMT or not.
+ */
+ if (!max_smt_thread_num)
+ max_smt_thread_num = 1;
+
+ cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+ xa_destroy(&hetero_cpu);
return 0;
}
#endif
--
2.24.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
` (2 preceding siblings ...)
2024-12-20 7:53 ` [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2024-12-20 7:53 ` Yicong Yang
2024-12-23 16:40 ` Jonathan Cameron
2024-12-26 9:23 ` [PATCH v10 0/4] Support SMT control on arm64 Shrikanth Hegde
4 siblings, 1 reply; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:53 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
Enable HOTPLUG_SMT for SMT control.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 100570a048c5..1ee926e2252e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -246,6 +246,7 @@ config ARM64
select HAVE_KRETPROBES
select HAVE_GENERIC_VDSO
select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+ select HOTPLUG_SMT if HOTPLUG_CPU
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN
--
2.24.0
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2024-12-20 7:59 ` Yicong Yang
2024-12-23 16:34 ` Jonathan Cameron
2024-12-26 7:18 ` Shrikanth Hegde
1 sibling, 1 reply; 19+ messages in thread
From: Yicong Yang @ 2024-12-20 7:59 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/12/20 15:53, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Currently if architectures want to support HOTPLUG_SMT they need to
> provide a topology_is_primary_thread() telling the framework which
> thread in the SMT cannot offline. However arm64 doesn't have a
> restriction on which thread in the SMT cannot offline, a simplest
> choice is that just make 1st thread as the "primary" thread. So
> just make this as the default implementation in the framework and
> let architectures like x86 that have special primary thread to
> override this function (which they've already done).
>
> There's no need to provide a stub function if !CONFIG_SMP or
> !CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
> the 1st CPU in the SMT so it's always the primary thread.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> As questioned in v9 [1] whether this works on architectures not using
> CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
> Architectures should use this on their own situation.
> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
>
> [root@localhost smt]# uname -m
> loongarch64
> [root@localhost smt]# pwd
> /sys/devices/system/cpu/smt
> [root@localhost smt]# cat ../possible
> 0-3
> [root@localhost smt]# cat ../online
> 0-3
> [root@localhost smt]# cat control
> on
> [root@localhost smt]# echo off > control
> [root@localhost smt]# cat control
> off
> [root@localhost smt]# cat ../online
> 0,2
> [root@localhost smt]# echo on > control
> [root@localhost smt]# cat control
> on
> [root@localhost smt]# cat ../online
> 0-3
Tested with below code using the topology_is_primary_thread() introduced
in this patch. Tested on an ACPI-based QEMU VM emulating SMT2.
Subject: [PATCH] LoongArch: Support HOTPLUG_SMT on ACPI-based system
Support HOTPLUG_SMT on ACPI-based system using generic
topology_is_primary_thread().
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/kernel/acpi.c | 26 ++++++++++++++++++++++++--
2 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index dae3a9104ca6..bed1b0640b97 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -172,6 +172,7 @@ config LOONGARCH
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_TIF_NOHZ
select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
+ select HOTPLUG_SMT if HOTPLUG_CPU
select IRQ_FORCED_THREADING
select IRQ_LOONGARCH_CPU
select LOCK_MM_AND_FIND_VMA
diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
index 382a09a7152c..e642b0de57e7 100644
--- a/arch/loongarch/kernel/acpi.c
+++ b/arch/loongarch/kernel/acpi.c
@@ -15,9 +15,11 @@
#include <linux/memblock.h>
#include <linux/of_fdt.h>
#include <linux/serial_core.h>
+#include <linux/xarray.h>
#include <asm/io.h>
#include <asm/numa.h>
#include <asm/loongson.h>
+#include <linux/cpu_smt.h>
int acpi_disabled;
EXPORT_SYMBOL(acpi_disabled);
@@ -175,8 +177,12 @@ int pptt_enabled;
int __init parse_acpi_topology(void)
{
+ int thread_num, max_smt_thread_num = 1;
+ struct xarray core_threads;
int cpu, topology_id;
+ void *entry;
+ xa_init(&core_threads);
for_each_possible_cpu(cpu) {
topology_id = find_acpi_cpu_topology(cpu, 0);
if (topology_id < 0) {
@@ -184,19 +190,35 @@ int __init parse_acpi_topology(void)
return -ENOENT;
}
- if (acpi_pptt_cpu_is_thread(cpu) <= 0)
+ if (acpi_pptt_cpu_is_thread(cpu) <= 0) {
cpu_data[cpu].core = topology_id;
- else {
+ } else {
topology_id = find_acpi_cpu_topology(cpu, 1);
if (topology_id < 0)
return -ENOENT;
cpu_data[cpu].core = topology_id;
+
+ entry = xa_load(&core_threads, topology_id);
+ if (!entry) {
+ xa_store(&core_threads, topology_id,
+ xa_mk_value(1), GFP_KERNEL);
+ } else {
+ thread_num = xa_to_value(entry);
+ thread_num++;
+ xa_store(&core_threads, topology_id,
+ xa_mk_value(thread_num), GFP_KERNEL);
+
+ if (thread_num > max_smt_thread_num)
+ max_smt_thread_num = thread_num;
+ }
}
}
pptt_enabled = 1;
+ cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+ xa_destroy(&core_threads);
return 0;
}
--
2.24.0
>
> arch/powerpc/include/asm/topology.h | 1 +
> arch/x86/include/asm/topology.h | 2 +-
> include/linux/topology.h | 22 ++++++++++++++++++++++
> 3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 16bacfe8c7a2..da15b5efe807 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
> {
> return cpu == cpu_first_thread_sibling(cpu);
> }
> +#define topology_is_primary_thread topology_is_primary_thread
>
> static inline bool topology_smt_thread_allowed(unsigned int cpu)
> {
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index fd41103ad342..faa0d6334ea4 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -228,11 +228,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
> {
> return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
> }
> +#define topology_is_primary_thread topology_is_primary_thread
>
> #else /* CONFIG_SMP */
> static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
> static inline int topology_max_smt_threads(void) { return 1; }
> -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
> static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
> #endif /* !CONFIG_SMP */
>
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..b3aba443c4eb 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
> }
> #endif
>
> +#ifndef topology_is_primary_thread
> +
> +#define topology_is_primary_thread topology_is_primary_thread
> +
> +static inline bool topology_is_primary_thread(unsigned int cpu)
> +{
> + /*
> + * On SMT hotplug the primary thread of the SMT won't be disabled.
> + * Architectures do have a special primary thread (e.g. x86) need
> + * to override this function. Otherwise just make the first thread
> + * in the SMT as the primary thread.
> + *
> + * The sibling cpumask of an offline CPU contains always the CPU
> + * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
> + * Other architectures should use this depend on their own
> + * situation.
> + */
> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
> +}
> +
> +#endif
> +
> static inline const struct cpumask *cpu_cpu_mask(int cpu)
> {
> return cpumask_of_node(cpu_to_node(cpu));
>
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v10 2/4] arch_topology: Support SMT control for OF based system
2024-12-20 7:53 ` [PATCH v10 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
@ 2024-12-23 16:33 ` Jonathan Cameron
2024-12-24 12:23 ` Yicong Yang
0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2024-12-23 16:33 UTC (permalink / raw)
To: Yicong Yang
Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
On Fri, 20 Dec 2024 15:53:11 +0800
Yicong Yang <yangyicong@huawei.com> wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> On building the topology from the devicetree, we've already
> gotten the SMT thread number of each core. Update the largest
> SMT thread number and enable the SMT control by the end of
> topology parsing.
>
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
>
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Hi Yicong,
Apologies that I'm late to the game on this one.
A few comments inline. Only important one is whether to bail out early
on error from parse_cluster()
Thanks,
Jonathan
> ---
> drivers/base/arch_topology.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 3ebe77566788..9e81060144c7 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -11,6 +11,7 @@
> #include <linux/cleanup.h>
> #include <linux/cpu.h>
> #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
> #include <linux/device.h>
> #include <linux/of.h>
> #include <linux/slab.h>
> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
> #endif
>
> #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
> +
> +/* Maximum SMT thread number detected used to enable the SMT control */
> +static unsigned int max_smt_thread_num;
> +
> /*
> * This function returns the logic cpu number of the node.
> * There are basically three kinds of return values:
> @@ -565,6 +570,17 @@ static int __init parse_core(struct device_node *core, int package_id,
> i++;
> } while (1);
>
> + /*
> + * If max_smt_thread_num has been initialized and doesn't match
> + * the thread number of this entry, then the system has
> + * heterogeneous SMT topology.
> + */
> + if (max_smt_thread_num && max_smt_thread_num != i)
> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +
> + if (max_smt_thread_num < i)
> + max_smt_thread_num = i;
Maybe more self documenting if you use min()? I'm not sure...
max_smt_thread_num = min(max_smt_thread_num, i);
> +
> cpu = get_cpu_for_node(core);
> if (cpu >= 0) {
> if (!leaf) {
> @@ -677,6 +693,16 @@ static int __init parse_socket(struct device_node *socket)
> if (!has_socket)
> ret = parse_cluster(socket, 0, -1, 0);
Is it appropriate to check ret before setting num threads?
if (!has_socket) {
ret = parse_cluster(socket, 0, -1, 0);
if (ret)
return ret;
}
...
return 0;
>
> + /*
> + * Notify the CPU framework of the SMT support. Initialize the
> + * max_smt_thread_num to 1 if no SMT support detected. A thread
> + * number of 1 can be handled by the framework so we don't need
> + * to check max_smt_thread_num to see we support SMT or not.
> + */
> + if (!max_smt_thread_num)
> + max_smt_thread_num = 1;
> +
> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
Trivial but I'd put a blank line here.
> return ret;
> }
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-20 7:59 ` Yicong Yang
@ 2024-12-23 16:34 ` Jonathan Cameron
2024-12-24 12:15 ` Yicong Yang
0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2024-12-23 16:34 UTC (permalink / raw)
To: Yicong Yang
Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, yangyicong, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
xuwei5, guohanjun
On Fri, 20 Dec 2024 15:59:27 +0800
Yicong Yang <yangyicong@huawei.com> wrote:
> On 2024/12/20 15:53, Yicong Yang wrote:
> > From: Yicong Yang <yangyicong@hisilicon.com>
> >
> > Currently if architectures want to support HOTPLUG_SMT they need to
> > provide a topology_is_primary_thread() telling the framework which
> > thread in the SMT cannot offline. However arm64 doesn't have a
> > restriction on which thread in the SMT cannot offline, a simplest
> > choice is that just make 1st thread as the "primary" thread. So
> > just make this as the default implementation in the framework and
> > let architectures like x86 that have special primary thread to
> > override this function (which they've already done).
> >
> > There's no need to provide a stub function if !CONFIG_SMP or
> > !CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
> > the 1st CPU in the SMT so it's always the primary thread.
> >
> > Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> > ---
> > As questioned in v9 [1] whether this works on architectures not using
> > CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
> > Architectures should use this on their own situation.
> > [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
> >
> > [root@localhost smt]# uname -m
> > loongarch64
> > [root@localhost smt]# pwd
> > /sys/devices/system/cpu/smt
> > [root@localhost smt]# cat ../possible
> > 0-3
> > [root@localhost smt]# cat ../online
> > 0-3
> > [root@localhost smt]# cat control
> > on
> > [root@localhost smt]# echo off > control
> > [root@localhost smt]# cat control
> > off
> > [root@localhost smt]# cat ../online
> > 0,2
> > [root@localhost smt]# echo on > control
> > [root@localhost smt]# cat control
> > on
> > [root@localhost smt]# cat ../online
> > 0-3
>
> Tested with below code using the topology_is_primary_thread() introduced
> in this patch. Tested on an ACPI-based QEMU VM emulating SMT2.
Nice bit of testing.
Given it all seems fine. FWIW
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
(for original patch, not the longarch one!)
>
> Subject: [PATCH] LoongArch: Support HOTPLUG_SMT on ACPI-based system
>
> Support HOTPLUG_SMT on ACPI-based system using generic
> topology_is_primary_thread().
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> arch/loongarch/Kconfig | 1 +
> arch/loongarch/kernel/acpi.c | 26 ++++++++++++++++++++++++--
> 2 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index dae3a9104ca6..bed1b0640b97 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -172,6 +172,7 @@ config LOONGARCH
> select HAVE_SYSCALL_TRACEPOINTS
> select HAVE_TIF_NOHZ
> select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
> + select HOTPLUG_SMT if HOTPLUG_CPU
> select IRQ_FORCED_THREADING
> select IRQ_LOONGARCH_CPU
> select LOCK_MM_AND_FIND_VMA
> diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
> index 382a09a7152c..e642b0de57e7 100644
> --- a/arch/loongarch/kernel/acpi.c
> +++ b/arch/loongarch/kernel/acpi.c
> @@ -15,9 +15,11 @@
> #include <linux/memblock.h>
> #include <linux/of_fdt.h>
> #include <linux/serial_core.h>
> +#include <linux/xarray.h>
> #include <asm/io.h>
> #include <asm/numa.h>
> #include <asm/loongson.h>
> +#include <linux/cpu_smt.h>
>
> int acpi_disabled;
> EXPORT_SYMBOL(acpi_disabled);
> @@ -175,8 +177,12 @@ int pptt_enabled;
>
> int __init parse_acpi_topology(void)
> {
> + int thread_num, max_smt_thread_num = 1;
> + struct xarray core_threads;
> int cpu, topology_id;
> + void *entry;
>
> + xa_init(&core_threads);
> for_each_possible_cpu(cpu) {
> topology_id = find_acpi_cpu_topology(cpu, 0);
> if (topology_id < 0) {
> @@ -184,19 +190,35 @@ int __init parse_acpi_topology(void)
> return -ENOENT;
> }
>
> - if (acpi_pptt_cpu_is_thread(cpu) <= 0)
> + if (acpi_pptt_cpu_is_thread(cpu) <= 0) {
> cpu_data[cpu].core = topology_id;
> - else {
> + } else {
> topology_id = find_acpi_cpu_topology(cpu, 1);
> if (topology_id < 0)
> return -ENOENT;
>
> cpu_data[cpu].core = topology_id;
> +
> + entry = xa_load(&core_threads, topology_id);
> + if (!entry) {
> + xa_store(&core_threads, topology_id,
> + xa_mk_value(1), GFP_KERNEL);
> + } else {
> + thread_num = xa_to_value(entry);
> + thread_num++;
> + xa_store(&core_threads, topology_id,
> + xa_mk_value(thread_num), GFP_KERNEL);
> +
> + if (thread_num > max_smt_thread_num)
> + max_smt_thread_num = thread_num;
> + }
> }
> }
>
> pptt_enabled = 1;
>
> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> + xa_destroy(&core_threads);
> return 0;
> }
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI based system
2024-12-20 7:53 ` [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2024-12-23 16:40 ` Jonathan Cameron
2024-12-24 12:30 ` Yicong Yang
0 siblings, 1 reply; 19+ messages in thread
From: Jonathan Cameron @ 2024-12-23 16:40 UTC (permalink / raw)
To: Yicong Yang
Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
On Fri, 20 Dec 2024 15:53:12 +0800
Yicong Yang <yangyicong@huawei.com> wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> For ACPI we'll build the topology from PPTT and we cannot directly
> get the SMT number of each core. Instead using a temporary xarray
> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> and SMT information of the first core in its heterogeneous CPU cluster
> when building the topology. Then we can know the largest SMT number
> in the system. If a homogeneous system's using ACPI 6.2 or later,
> all the CPUs should be under the root node of PPTT. There'll be
> only one entry in the xarray and all the CPUs in the system will
> be assumed identical.
>
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
>
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
A few trivial things inline. Either way it's fine as really just my style
preferences
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 66 insertions(+)
>
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 1a2c72f3e7f8..85cb18d72a29 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -15,8 +15,10 @@
> #include <linux/arch_topology.h>
> #include <linux/cacheinfo.h>
> #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
> #include <linux/init.h>
> #include <linux/percpu.h>
> +#include <linux/xarray.h>
>
> #include <asm/cpu.h>
> #include <asm/cputype.h>
> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
> return !!is_threaded;
> }
>
> +struct cpu_smt_info {
> + int thread_num;
> + int core_id;
> +};
> +
> /*
> * Propagate the topology information of the processor_topology_node tree to the
> * cpu_topology array.
> */
> int __init parse_acpi_topology(void)
> {
> + int max_smt_thread_num = 0;
> + struct cpu_smt_info *entry;
> + struct xarray hetero_cpu;
> + unsigned long hetero_id;
> int cpu, topology_id;
>
> if (acpi_disabled)
> return 0;
>
> + xa_init(&hetero_cpu);
> +
> for_each_possible_cpu(cpu) {
> topology_id = find_acpi_cpu_topology(cpu, 0);
> if (topology_id < 0)
> @@ -57,6 +70,32 @@ int __init parse_acpi_topology(void)
> cpu_topology[cpu].thread_id = topology_id;
> topology_id = find_acpi_cpu_topology(cpu, 1);
> cpu_topology[cpu].core_id = topology_id;
> +
> + /*
> + * In the PPTT, CPUs below a node with the 'identical
> + * implementation' flag have the same number of threads.
> + * Count the number of threads for only one CPU (i.e.
> + * one core_id) among those with the same hetero_id.
> + * See the comment of find_acpi_cpu_topology_hetero_id()
> + * for more details.
> + *
> + * One entry is created for each node having:
> + * - the 'identical implementation' flag
> + * - its parent not having the flag
> + */
> + hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> + entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
Given xa_load returns a void *,
entry = xa_load(&hetero_cpu, hetero_id);
should be fine (I haven't checked local style, so feel free to ignore if
local style is to cast anyway). Maybe drag the definition of entry into
a more local scope as well.
> + if (!entry) {
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + WARN_ON(!entry);
> +
> + entry->core_id = topology_id;
> + entry->thread_num = 1;
> + xa_store(&hetero_cpu, hetero_id,
> + entry, GFP_KERNEL);
> + } else if (entry->core_id == topology_id) {
> + entry->thread_num++;
> + }
> } else {
> cpu_topology[cpu].thread_id = -1;
> cpu_topology[cpu].core_id = topology_id;
> @@ -67,6 +106,33 @@ int __init parse_acpi_topology(void)
> cpu_topology[cpu].package_id = topology_id;
> }
>
> + /*
> + * This should be a short loop depending on the number of heterogeneous
> + * CPU clusters. Typically on a homogeneous system there's only one
> + * entry in the XArray.
> + */
> + xa_for_each(&hetero_cpu, hetero_id, entry) {
> + if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +
> + if (entry->thread_num > max_smt_thread_num)
> + max_smt_thread_num = entry->thread_num;
As with DT, maybe min is more informative?
max_smt_thread_num = min(max_smt_thread_num, entry->thread_num);
I don't care strongly about it though.
> +
> + xa_erase(&hetero_cpu, hetero_id);
> + kfree(entry);
> + }
> +
> + /*
> + * Notify the CPU framework of the SMT support. Initialize the
> + * max_smt_thread_num to 1 if no SMT support detected. A thread
> + * number of 1 can be handled by the framework so we don't need
> + * to check max_smt_thread_num to see we support SMT or not.
> + */
> + if (!max_smt_thread_num)
> + max_smt_thread_num = 1;
> +
> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> + xa_destroy(&hetero_cpu);
> return 0;
> }
> #endif
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT
2024-12-20 7:53 ` [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
@ 2024-12-23 16:40 ` Jonathan Cameron
0 siblings, 0 replies; 19+ messages in thread
From: Jonathan Cameron @ 2024-12-23 16:40 UTC (permalink / raw)
To: Yicong Yang
Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
On Fri, 20 Dec 2024 15:53:13 +0800
Yicong Yang <yangyicong@huawei.com> wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Enable HOTPLUG_SMT for SMT control.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
FWIW LGTM
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> arch/arm64/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 100570a048c5..1ee926e2252e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -246,6 +246,7 @@ config ARM64
> select HAVE_KRETPROBES
> select HAVE_GENERIC_VDSO
> select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
> + select HOTPLUG_SMT if HOTPLUG_CPU
> select IRQ_DOMAIN
> select IRQ_FORCED_THREADING
> select KASAN_VMALLOC if KASAN
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-23 16:34 ` Jonathan Cameron
@ 2024-12-24 12:15 ` Yicong Yang
0 siblings, 0 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-24 12:15 UTC (permalink / raw)
To: Jonathan Cameron
Cc: yangyicong, catalin.marinas, will, sudeep.holla, tglx, peterz,
mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/12/24 0:34, Jonathan Cameron wrote:
> On Fri, 20 Dec 2024 15:59:27 +0800
> Yicong Yang <yangyicong@huawei.com> wrote:
>
>> On 2024/12/20 15:53, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>
>>> Currently if architectures want to support HOTPLUG_SMT they need to
>>> provide a topology_is_primary_thread() telling the framework which
>>> thread in the SMT cannot offline. However arm64 doesn't have a
>>> restriction on which thread in the SMT cannot offline, a simplest
>>> choice is that just make 1st thread as the "primary" thread. So
>>> just make this as the default implementation in the framework and
>>> let architectures like x86 that have special primary thread to
>>> override this function (which they've already done).
>>>
>>> There's no need to provide a stub function if !CONFIG_SMP or
>>> !CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
>>> the 1st CPU in the SMT so it's always the primary thread.
>>>
>>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>>> ---
>>> As questioned in v9 [1] whether this works on architectures not using
>>> CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
>>> Architectures should use this on their own situation.
>>> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
>>>
>>> [root@localhost smt]# uname -m
>>> loongarch64
>>> [root@localhost smt]# pwd
>>> /sys/devices/system/cpu/smt
>>> [root@localhost smt]# cat ../possible
>>> 0-3
>>> [root@localhost smt]# cat ../online
>>> 0-3
>>> [root@localhost smt]# cat control
>>> on
>>> [root@localhost smt]# echo off > control
>>> [root@localhost smt]# cat control
>>> off
>>> [root@localhost smt]# cat ../online
>>> 0,2
>>> [root@localhost smt]# echo on > control
>>> [root@localhost smt]# cat control
>>> on
>>> [root@localhost smt]# cat ../online
>>> 0-3
>>
>> Tested with below code using the topology_is_primary_thread() introduced
>> in this patch. Tested on an ACPI-based QEMU VM emulating SMT2.
> Nice bit of testing.
>
> Given it all seems fine. FWIW
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> (for original patch, not the longarch one!)
thanks. certainly :)
>
>>
>> Subject: [PATCH] LoongArch: Support HOTPLUG_SMT on ACPI-based system
>>
>> Support HOTPLUG_SMT on ACPI-based system using generic
>> topology_is_primary_thread().
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>> arch/loongarch/Kconfig | 1 +
>> arch/loongarch/kernel/acpi.c | 26 ++++++++++++++++++++++++--
>> 2 files changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
>> index dae3a9104ca6..bed1b0640b97 100644
>> --- a/arch/loongarch/Kconfig
>> +++ b/arch/loongarch/Kconfig
>> @@ -172,6 +172,7 @@ config LOONGARCH
>> select HAVE_SYSCALL_TRACEPOINTS
>> select HAVE_TIF_NOHZ
>> select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
>> + select HOTPLUG_SMT if HOTPLUG_CPU
>> select IRQ_FORCED_THREADING
>> select IRQ_LOONGARCH_CPU
>> select LOCK_MM_AND_FIND_VMA
>> diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
>> index 382a09a7152c..e642b0de57e7 100644
>> --- a/arch/loongarch/kernel/acpi.c
>> +++ b/arch/loongarch/kernel/acpi.c
>> @@ -15,9 +15,11 @@
>> #include <linux/memblock.h>
>> #include <linux/of_fdt.h>
>> #include <linux/serial_core.h>
>> +#include <linux/xarray.h>
>> #include <asm/io.h>
>> #include <asm/numa.h>
>> #include <asm/loongson.h>
>> +#include <linux/cpu_smt.h>
>>
>> int acpi_disabled;
>> EXPORT_SYMBOL(acpi_disabled);
>> @@ -175,8 +177,12 @@ int pptt_enabled;
>>
>> int __init parse_acpi_topology(void)
>> {
>> + int thread_num, max_smt_thread_num = 1;
>> + struct xarray core_threads;
>> int cpu, topology_id;
>> + void *entry;
>>
>> + xa_init(&core_threads);
>> for_each_possible_cpu(cpu) {
>> topology_id = find_acpi_cpu_topology(cpu, 0);
>> if (topology_id < 0) {
>> @@ -184,19 +190,35 @@ int __init parse_acpi_topology(void)
>> return -ENOENT;
>> }
>>
>> - if (acpi_pptt_cpu_is_thread(cpu) <= 0)
>> + if (acpi_pptt_cpu_is_thread(cpu) <= 0) {
>> cpu_data[cpu].core = topology_id;
>> - else {
>> + } else {
>> topology_id = find_acpi_cpu_topology(cpu, 1);
>> if (topology_id < 0)
>> return -ENOENT;
>>
>> cpu_data[cpu].core = topology_id;
>> +
>> + entry = xa_load(&core_threads, topology_id);
>> + if (!entry) {
>> + xa_store(&core_threads, topology_id,
>> + xa_mk_value(1), GFP_KERNEL);
>> + } else {
>> + thread_num = xa_to_value(entry);
>> + thread_num++;
>> + xa_store(&core_threads, topology_id,
>> + xa_mk_value(thread_num), GFP_KERNEL);
>> +
>> + if (thread_num > max_smt_thread_num)
>> + max_smt_thread_num = thread_num;
>> + }
>> }
>> }
>>
>> pptt_enabled = 1;
>>
>> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>> + xa_destroy(&core_threads);
>> return 0;
>> }
>>
>
> .
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 2/4] arch_topology: Support SMT control for OF based system
2024-12-23 16:33 ` Jonathan Cameron
@ 2024-12-24 12:23 ` Yicong Yang
0 siblings, 0 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-24 12:23 UTC (permalink / raw)
To: Jonathan Cameron
Cc: yangyicong, catalin.marinas, will, sudeep.holla, tglx, peterz,
mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/12/24 0:33, Jonathan Cameron wrote:
> On Fri, 20 Dec 2024 15:53:11 +0800
> Yicong Yang <yangyicong@huawei.com> wrote:
>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> On building the topology from the devicetree, we've already
>> gotten the SMT thread number of each core. Update the largest
>> SMT thread number and enable the SMT control by the end of
>> topology parsing.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> Hi Yicong,
>
> Apologies that I'm late to the game on this one.
>
> A few comments inline. Only important one is whether to bail out early
> on error from parse_cluster()
>
thanks for the comments.
> Thanks,
>
> Jonathan
>
>
>> ---
>> drivers/base/arch_topology.c | 26 ++++++++++++++++++++++++++
>> 1 file changed, 26 insertions(+)
>>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index 3ebe77566788..9e81060144c7 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -11,6 +11,7 @@
>> #include <linux/cleanup.h>
>> #include <linux/cpu.h>
>> #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>> #include <linux/device.h>
>> #include <linux/of.h>
>> #include <linux/slab.h>
>> @@ -506,6 +507,10 @@ core_initcall(free_raw_capacity);
>> #endif
>>
>> #if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
>> +
>> +/* Maximum SMT thread number detected used to enable the SMT control */
>> +static unsigned int max_smt_thread_num;
>> +
>> /*
>> * This function returns the logic cpu number of the node.
>> * There are basically three kinds of return values:
>> @@ -565,6 +570,17 @@ static int __init parse_core(struct device_node *core, int package_id,
>> i++;
>> } while (1);
>>
>> + /*
>> + * If max_smt_thread_num has been initialized and doesn't match
>> + * the thread number of this entry, then the system has
>> + * heterogeneous SMT topology.
>> + */
>> + if (max_smt_thread_num && max_smt_thread_num != i)
>> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
>> + if (max_smt_thread_num < i)
>> + max_smt_thread_num = i;
>
> Maybe more self documenting if you use min()? I'm not sure...
> max_smt_thread_num = min(max_smt_thread_num, i);
>
sure, will use max_t here (since we'd like to know the maximum thread number).
>
>> +
>> cpu = get_cpu_for_node(core);
>> if (cpu >= 0) {
>> if (!leaf) {
>> @@ -677,6 +693,16 @@ static int __init parse_socket(struct device_node *socket)
>> if (!has_socket)
>> ret = parse_cluster(socket, 0, -1, 0);
>
> Is it appropriate to check ret before setting num threads?
> if (!has_socket) {
> ret = parse_cluster(socket, 0, -1, 0);
> if (ret)
> return ret;
> }
> ...
>
> return 0;
>
thanks for catching this. since we always need to notify the framework of
the SMT thread number if HOTPLUG_SMT selected, we should make it to 1 if
topology parsing failed. Since on failure the topology information will
be reset, a thread_number of 1 will be handled as SMT not supported.
>>
>> + /*
>> + * Notify the CPU framework of the SMT support. Initialize the
>> + * max_smt_thread_num to 1 if no SMT support detected. A thread
>> + * number of 1 can be handled by the framework so we don't need
>> + * to check max_smt_thread_num to see we support SMT or not.
>> + */
>> + if (!max_smt_thread_num)
>> + max_smt_thread_num = 1;
>> +
>> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>
> Trivial but I'd put a blank line here.
>
ok.
Thanks.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI based system
2024-12-23 16:40 ` Jonathan Cameron
@ 2024-12-24 12:30 ` Yicong Yang
0 siblings, 0 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-24 12:30 UTC (permalink / raw)
To: Jonathan Cameron
Cc: yangyicong, catalin.marinas, will, sudeep.holla, tglx, peterz,
mpe, linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann, linuxppc-dev, x86, linux-kernel,
morten.rasmussen, msuchanek, gregkh, rafael, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/12/24 0:40, Jonathan Cameron wrote:
> On Fri, 20 Dec 2024 15:53:12 +0800
> Yicong Yang <yangyicong@huawei.com> wrote:
>
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
>> and SMT information of the first core in its heterogeneous CPU cluster
>> when building the topology. Then we can know the largest SMT number
>> in the system. If a homogeneous system's using ACPI 6.2 or later,
>> all the CPUs should be under the root node of PPTT. There'll be
>> only one entry in the xarray and all the CPUs in the system will
>> be assumed identical.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>
> A few trivial things inline. Either way it's fine as really just my style
> preferences
>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>
>> ---
>> arch/arm64/kernel/topology.c | 66 ++++++++++++++++++++++++++++++++++++
>> 1 file changed, 66 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 1a2c72f3e7f8..85cb18d72a29 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -15,8 +15,10 @@
>> #include <linux/arch_topology.h>
>> #include <linux/cacheinfo.h>
>> #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>> #include <linux/init.h>
>> #include <linux/percpu.h>
>> +#include <linux/xarray.h>
>>
>> #include <asm/cpu.h>
>> #include <asm/cputype.h>
>> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>> return !!is_threaded;
>> }
>>
>> +struct cpu_smt_info {
>> + int thread_num;
>> + int core_id;
>> +};
>> +
>> /*
>> * Propagate the topology information of the processor_topology_node tree to the
>> * cpu_topology array.
>> */
>> int __init parse_acpi_topology(void)
>> {
>> + int max_smt_thread_num = 0;
>> + struct cpu_smt_info *entry;
>> + struct xarray hetero_cpu;
>> + unsigned long hetero_id;
>> int cpu, topology_id;
>>
>> if (acpi_disabled)
>> return 0;
>>
>> + xa_init(&hetero_cpu);
>> +
>> for_each_possible_cpu(cpu) {
>> topology_id = find_acpi_cpu_topology(cpu, 0);
>> if (topology_id < 0)
>> @@ -57,6 +70,32 @@ int __init parse_acpi_topology(void)
>> cpu_topology[cpu].thread_id = topology_id;
>> topology_id = find_acpi_cpu_topology(cpu, 1);
>> cpu_topology[cpu].core_id = topology_id;
>> +
>> + /*
>> + * In the PPTT, CPUs below a node with the 'identical
>> + * implementation' flag have the same number of threads.
>> + * Count the number of threads for only one CPU (i.e.
>> + * one core_id) among those with the same hetero_id.
>> + * See the comment of find_acpi_cpu_topology_hetero_id()
>> + * for more details.
>> + *
>> + * One entry is created for each node having:
>> + * - the 'identical implementation' flag
>> + * - its parent not having the flag
>> + */
>> + hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
>> + entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
>
> Given xa_load returns a void *,
>
> entry = xa_load(&hetero_cpu, hetero_id);
>
> should be fine (I haven't checked local style, so feel free to ignore if
> local style is to cast anyway). Maybe drag the definition of entry into
> a more local scope as well.
>
sure. will get rid of the cast and checked it won't violate the local style.
>
>> + if (!entry) {
>> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> + WARN_ON(!entry);
>> +
>> + entry->core_id = topology_id;
>> + entry->thread_num = 1;
>> + xa_store(&hetero_cpu, hetero_id,
>> + entry, GFP_KERNEL);
>> + } else if (entry->core_id == topology_id) {
>> + entry->thread_num++;
>> + }
>> } else {
>> cpu_topology[cpu].thread_id = -1;
>> cpu_topology[cpu].core_id = topology_id;
>> @@ -67,6 +106,33 @@ int __init parse_acpi_topology(void)
>> cpu_topology[cpu].package_id = topology_id;
>> }
>>
>> + /*
>> + * This should be a short loop depending on the number of heterogeneous
>> + * CPU clusters. Typically on a homogeneous system there's only one
>> + * entry in the XArray.
>> + */
>> + xa_for_each(&hetero_cpu, hetero_id, entry) {
>> + if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
>> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
>> + if (entry->thread_num > max_smt_thread_num)
>> + max_smt_thread_num = entry->thread_num;
>
> As with DT, maybe min is more informative?
>
> max_smt_thread_num = min(max_smt_thread_num, entry->thread_num);
>
> I don't care strongly about it though.
either's ok with me, will keep it consistent with DT.
Thanks.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-12-20 7:59 ` Yicong Yang
@ 2024-12-26 7:18 ` Shrikanth Hegde
2024-12-26 11:36 ` Yicong Yang
1 sibling, 1 reply; 19+ messages in thread
From: Shrikanth Hegde @ 2024-12-26 7:18 UTC (permalink / raw)
To: Yicong Yang
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun, catalin.marinas, will,
sudeep.holla, tglx, peterz, mpe, linux-arm-kernel, mingo, bp,
dave.hansen, pierre.gondois, dietmar.eggemann
On 12/20/24 13:23, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> Currently if architectures want to support HOTPLUG_SMT they need to
> provide a topology_is_primary_thread() telling the framework which
> thread in the SMT cannot offline. However arm64 doesn't have a
> restriction on which thread in the SMT cannot offline, a simplest
> choice is that just make 1st thread as the "primary" thread. So
> just make this as the default implementation in the framework and
> let architectures like x86 that have special primary thread to
> override this function (which they've already done).
>
> There's no need to provide a stub function if !CONFIG_SMP or
> !CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
s/CONFIG_HOTPLUG_SMP/CONFIG_HOTPLUG_SMT
> the 1st CPU in the SMT so it's always the primary thread.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> As questioned in v9 [1] whether this works on architectures not using
> CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
> Architectures should use this on their own situation.
> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
>
sorry to ask this question this late in the series.
I am curious to know why not just add the arch specific
topology_is_primary_thread? current infra would handle that right?
is there any other arch that is going to enable this soon? or the
rationale is to add the generic function so that archs that use generic
topology it is just a kconfig change in case they want to add the support?
> [root@localhost smt]# uname -m
> loongarch64
> [root@localhost smt]# pwd
> /sys/devices/system/cpu/smt
> [root@localhost smt]# cat ../possible
> 0-3
> [root@localhost smt]# cat ../online
> 0-3
> [root@localhost smt]# cat control
> on
> [root@localhost smt]# echo off > control
> [root@localhost smt]# cat control
> off
> [root@localhost smt]# cat ../online
> 0,2
> [root@localhost smt]# echo on > control
> [root@localhost smt]# cat control
> on
> [root@localhost smt]# cat ../online
> 0-3
>
> arch/powerpc/include/asm/topology.h | 1 +
> arch/x86/include/asm/topology.h | 2 +-
> include/linux/topology.h | 22 ++++++++++++++++++++++
> 3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 16bacfe8c7a2..da15b5efe807 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
> {
> return cpu == cpu_first_thread_sibling(cpu);
> }
> +#define topology_is_primary_thread topology_is_primary_thread
>
> static inline bool topology_smt_thread_allowed(unsigned int cpu)
> {
> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
> index fd41103ad342..faa0d6334ea4 100644
> --- a/arch/x86/include/asm/topology.h
> +++ b/arch/x86/include/asm/topology.h
> @@ -228,11 +228,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
> {
> return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
> }
> +#define topology_is_primary_thread topology_is_primary_thread
>
> #else /* CONFIG_SMP */
> static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
> static inline int topology_max_smt_threads(void) { return 1; }
> -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
> static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
> #endif /* !CONFIG_SMP */
>
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..b3aba443c4eb 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
> }
> #endif
>
> +#ifndef topology_is_primary_thread
> +
> +#define topology_is_primary_thread topology_is_primary_thread
> +
> +static inline bool topology_is_primary_thread(unsigned int cpu)
> +{
> + /*
> + * On SMT hotplug the primary thread of the SMT won't be disabled.
> + * Architectures do have a special primary thread (e.g. x86) need
> + * to override this function. Otherwise just make the first thread
> + * in the SMT as the primary thread.
> + *
> + * The sibling cpumask of an offline CPU contains always the CPU
> + * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
> + * Other architectures should use this depend on their own
> + * situation.
> + */
> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
> +}
> +
> +#endif
> +
> static inline const struct cpumask *cpu_cpu_mask(int cpu)
> {
> return cpumask_of_node(cpu_to_node(cpu));
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 0/4] Support SMT control on arm64
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
` (3 preceding siblings ...)
2024-12-20 7:53 ` [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
@ 2024-12-26 9:23 ` Shrikanth Hegde
2024-12-26 11:50 ` Yicong Yang
4 siblings, 1 reply; 19+ messages in thread
From: Shrikanth Hegde @ 2024-12-26 9:23 UTC (permalink / raw)
To: Yicong Yang
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun, catalin.marinas, will,
sudeep.holla, tglx, peterz, mpe, linux-arm-kernel, mingo, bp,
dave.hansen, pierre.gondois, dietmar.eggemann
On 12/20/24 13:23, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> The core CPU control framework supports runtime SMT control which
> is not yet supported on arm64. Besides the general vulnerabilities
> concerns we want this runtime control on our arm64 server for:
>
> - better single CPU performance in some cases
> - saving overall power consumption
>
EAS is disabled when SMT is present.
I am curious to know how power saving happens here.
> This patchset implements it in the following aspects:
>
> - Provides a default topology_is_primary_thread()
> - support retrieve SMT thread number on OF based system
> - support retrieve SMT thread number on ACPI based system
> - select HOTPLUG_SMT for arm64
>
> Tests has been done on our ACPI based arm64 server and on ACPI/OF\
> based QEMU VMs.
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-12-26 7:18 ` Shrikanth Hegde
@ 2024-12-26 11:36 ` Yicong Yang
0 siblings, 0 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-26 11:36 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun, catalin.marinas, will, sudeep.holla, tglx,
peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
pierre.gondois, dietmar.eggemann
On 2024/12/26 15:18, Shrikanth Hegde wrote:
>
>
> On 12/20/24 13:23, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> Currently if architectures want to support HOTPLUG_SMT they need to
>> provide a topology_is_primary_thread() telling the framework which
>> thread in the SMT cannot offline. However arm64 doesn't have a
>> restriction on which thread in the SMT cannot offline, a simplest
>> choice is that just make 1st thread as the "primary" thread. So
>> just make this as the default implementation in the framework and
>> let architectures like x86 that have special primary thread to
>> override this function (which they've already done).
>>
>> There's no need to provide a stub function if !CONFIG_SMP or
>> !CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
>
> s/CONFIG_HOTPLUG_SMP/CONFIG_HOTPLUG_SMT
>
>> the 1st CPU in the SMT so it's always the primary thread.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>> As questioned in v9 [1] whether this works on architectures not using
>> CONFIG_GENERIC_ARCH_TOPOLOGY, hacked on LoongArch VM and this also works.
>> Architectures should use this on their own situation.
>> [1] https://lore.kernel.org/linux-arm-kernel/427bd639-33c3-47e4-9e83-68c428eb1a7d@arm.com/
>>
>
> sorry to ask this question this late in the series.
>
> I am curious to know why not just add the arch specific topology_is_primary_thread? current infra would handle that right?
Yes, this also works. It was implemented like you said before v4. It's thought trival and was suggested to provide
a default implementation for all and make archs that have special requirement to override it.[1]
[1] https://lore.kernel.org/linux-arm-kernel/fe03425c-6b9a-f0eb-0e8d-e0f47404a7cb@huawei.com/
>
> is there any other arch that is going to enable this soon? or the rationale is to add the generic function so that archs that use generic topology it is just a kconfig change in case they want to add the support?
>
It depends on the arch and currently it's only going to be supported on arm64. If they don't have special requirement
of the primary thread, they can use this function. It's still needed to call cpu_smt_set_num_threads() besides select
the kconfig to notify the framework of the thread number since it maybe detected by arthitectural way or ACPI which is
also from the arch codes (like arm64). If it's detected from device tree and using CONFIG_GENERIC_ARCH_TOPOLOGY,
should work to only select kconfig.
Thanks.
>> [root@localhost smt]# uname -m
>> loongarch64
>> [root@localhost smt]# pwd
>> /sys/devices/system/cpu/smt
>> [root@localhost smt]# cat ../possible
>> 0-3
>> [root@localhost smt]# cat ../online
>> 0-3
>> [root@localhost smt]# cat control
>> on
>> [root@localhost smt]# echo off > control
>> [root@localhost smt]# cat control
>> off
>> [root@localhost smt]# cat ../online
>> 0,2
>> [root@localhost smt]# echo on > control
>> [root@localhost smt]# cat control
>> on
>> [root@localhost smt]# cat ../online
>> 0-3
>>
>> arch/powerpc/include/asm/topology.h | 1 +
>> arch/x86/include/asm/topology.h | 2 +-
>> include/linux/topology.h | 22 ++++++++++++++++++++++
>> 3 files changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
>> index 16bacfe8c7a2..da15b5efe807 100644
>> --- a/arch/powerpc/include/asm/topology.h
>> +++ b/arch/powerpc/include/asm/topology.h
>> @@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>> {
>> return cpu == cpu_first_thread_sibling(cpu);
>> }
>> +#define topology_is_primary_thread topology_is_primary_thread
>> static inline bool topology_smt_thread_allowed(unsigned int cpu)
>> {
>> diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
>> index fd41103ad342..faa0d6334ea4 100644
>> --- a/arch/x86/include/asm/topology.h
>> +++ b/arch/x86/include/asm/topology.h
>> @@ -228,11 +228,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
>> {
>> return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
>> }
>> +#define topology_is_primary_thread topology_is_primary_thread
>> #else /* CONFIG_SMP */
>> static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
>> static inline int topology_max_smt_threads(void) { return 1; }
>> -static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
>> static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
>> #endif /* !CONFIG_SMP */
>> diff --git a/include/linux/topology.h b/include/linux/topology.h
>> index 52f5850730b3..b3aba443c4eb 100644
>> --- a/include/linux/topology.h
>> +++ b/include/linux/topology.h
>> @@ -240,6 +240,28 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
>> }
>> #endif
>> +#ifndef topology_is_primary_thread
>> +
>> +#define topology_is_primary_thread topology_is_primary_thread
>> +
>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>> +{
>> + /*
>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>> + * Architectures do have a special primary thread (e.g. x86) need
>> + * to override this function. Otherwise just make the first thread
>> + * in the SMT as the primary thread.
>> + *
>> + * The sibling cpumask of an offline CPU contains always the CPU
>> + * itself for architectures using CONFIG_GENERIC_ARCH_TOPOLOGY.
>> + * Other architectures should use this depend on their own
>> + * situation.
>> + */
>> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
>> +}
>> +
>> +#endif
>> +
>> static inline const struct cpumask *cpu_cpu_mask(int cpu)
>> {
>> return cpumask_of_node(cpu_to_node(cpu));
>
> .
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 0/4] Support SMT control on arm64
2024-12-26 9:23 ` [PATCH v10 0/4] Support SMT control on arm64 Shrikanth Hegde
@ 2024-12-26 11:50 ` Yicong Yang
2024-12-26 12:28 ` Shrikanth Hegde
0 siblings, 1 reply; 19+ messages in thread
From: Yicong Yang @ 2024-12-26 11:50 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun, catalin.marinas, will, sudeep.holla, tglx,
peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
pierre.gondois, dietmar.eggemann
On 2024/12/26 17:23, Shrikanth Hegde wrote:
>
>
> On 12/20/24 13:23, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> The core CPU control framework supports runtime SMT control which
>> is not yet supported on arm64. Besides the general vulnerabilities
>> concerns we want this runtime control on our arm64 server for:
>>
>> - better single CPU performance in some cases
>> - saving overall power consumption
>>
>
> EAS is disabled when SMT is present.
> I am curious to know how power saving happens here.
EAS shouldn't work on non-asymmetic systems, so it's not the case here.
System wide power consumption comes down from the CPU offlining here.
Thanks.
>
>> This patchset implements it in the following aspects:
>>
>> - Provides a default topology_is_primary_thread()
>> - support retrieve SMT thread number on OF based system
>> - support retrieve SMT thread number on ACPI based system
>> - select HOTPLUG_SMT for arm64
>>
>> Tests has been done on our ACPI based arm64 server and on ACPI/OF\
>> based QEMU VMs.
>>
> .
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 0/4] Support SMT control on arm64
2024-12-26 11:50 ` Yicong Yang
@ 2024-12-26 12:28 ` Shrikanth Hegde
2024-12-27 7:16 ` Yicong Yang
0 siblings, 1 reply; 19+ messages in thread
From: Shrikanth Hegde @ 2024-12-26 12:28 UTC (permalink / raw)
To: Yicong Yang
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun, catalin.marinas, will, sudeep.holla, tglx,
peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
pierre.gondois, dietmar.eggemann, Nysal Jan K.A.
On 12/26/24 17:20, Yicong Yang wrote:
> On 2024/12/26 17:23, Shrikanth Hegde wrote:
>>
>>
>> On 12/20/24 13:23, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>
>>> The core CPU control framework supports runtime SMT control which
>>> is not yet supported on arm64. Besides the general vulnerabilities
>>> concerns we want this runtime control on our arm64 server for:
>>>
>>> - better single CPU performance in some cases
>>> - saving overall power consumption
>>>
>>
>> EAS is disabled when SMT is present.
>> I am curious to know how power saving happens here.
>
> EAS shouldn't work on non-asymmetic systems, so it's not the case here.
Ok. so this is a symmetric system then?
> System wide power consumption comes down from the CPU offlining here.
>
Ok. So SMT is enabled by default and then at runtime disable it to save
power by off-lining the sibling threads?
Note: When enabling SMT, current behavior differs when a core is fully
offline on different archs. You may want to see which is behavior you
need in that case. i.e either online or skip.
PowerPC change where we are skipping a fully offline core.
https://lore.kernel.org/all/20240731030126.956210-1-nysal@linux.ibm.com/
> Thanks.
>
>>
>>> This patchset implements it in the following aspects:
>>>
>>> - Provides a default topology_is_primary_thread()
>>> - support retrieve SMT thread number on OF based system
>>> - support retrieve SMT thread number on ACPI based system
>>> - select HOTPLUG_SMT for arm64
>>>
>>> Tests has been done on our ACPI based arm64 server and on ACPI/OF\
>>> based QEMU VMs.
>>>
>> .
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v10 0/4] Support SMT control on arm64
2024-12-26 12:28 ` Shrikanth Hegde
@ 2024-12-27 7:16 ` Yicong Yang
0 siblings, 0 replies; 19+ messages in thread
From: Yicong Yang @ 2024-12-27 7:16 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun, catalin.marinas, will, sudeep.holla, tglx,
peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
pierre.gondois, dietmar.eggemann, Nysal Jan K.A.
On 2024/12/26 20:28, Shrikanth Hegde wrote:
>
>
> On 12/26/24 17:20, Yicong Yang wrote:
>> On 2024/12/26 17:23, Shrikanth Hegde wrote:
>>>
>>>
>>> On 12/20/24 13:23, Yicong Yang wrote:
>>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>>
>>>> The core CPU control framework supports runtime SMT control which
>>>> is not yet supported on arm64. Besides the general vulnerabilities
>>>> concerns we want this runtime control on our arm64 server for:
>>>>
>>>> - better single CPU performance in some cases
>>>> - saving overall power consumption
>>>>
>>>
>>> EAS is disabled when SMT is present.
>>> I am curious to know how power saving happens here.
>>
>> EAS shouldn't work on non-asymmetic systems, so it's not the case here.
>
> Ok. so this is a symmetric system then?
yes, symmetric.
>
>> System wide power consumption comes down from the CPU offlining here.
>>
>
> Ok. So SMT is enabled by default and then at runtime disable it to save power by off-lining the sibling threads?
>
yes.
>
> Note: When enabling SMT, current behavior differs when a core is fully offline on different archs. You may want to see which is behavior you need in that case. i.e either online or skip.
>
> PowerPC change where we are skipping a fully offline core.
> https://lore.kernel.org/all/20240731030126.956210-1-nysal@linux.ibm.com/
>
Thanks for the information! Currently it's implemented as online and no special need for skip. We may need further
support if skip is required in the future, currently for GENERIC_ARCH_TOPOLOGY an offline CPU's thread sibling
only contains itself so there's no information for checking whether the whole core is offline or not.
Thanks.
>> Thanks.
>>
>>>
>>>> This patchset implements it in the following aspects:
>>>>
>>>> - Provides a default topology_is_primary_thread()
>>>> - support retrieve SMT thread number on OF based system
>>>> - support retrieve SMT thread number on ACPI based system
>>>> - select HOTPLUG_SMT for arm64
>>>>
>>>> Tests has been done on our ACPI based arm64 server and on ACPI/OF\
>>>> based QEMU VMs.
>>>>
>>> .
>
> .
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-12-27 7:16 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-20 7:53 [PATCH v10 0/4] Support SMT control on arm64 Yicong Yang
2024-12-20 7:53 ` [PATCH v10 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-12-20 7:59 ` Yicong Yang
2024-12-23 16:34 ` Jonathan Cameron
2024-12-24 12:15 ` Yicong Yang
2024-12-26 7:18 ` Shrikanth Hegde
2024-12-26 11:36 ` Yicong Yang
2024-12-20 7:53 ` [PATCH v10 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
2024-12-23 16:33 ` Jonathan Cameron
2024-12-24 12:23 ` Yicong Yang
2024-12-20 7:53 ` [PATCH v10 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
2024-12-23 16:40 ` Jonathan Cameron
2024-12-24 12:30 ` Yicong Yang
2024-12-20 7:53 ` [PATCH v10 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
2024-12-23 16:40 ` Jonathan Cameron
2024-12-26 9:23 ` [PATCH v10 0/4] Support SMT control on arm64 Shrikanth Hegde
2024-12-26 11:50 ` Yicong Yang
2024-12-26 12:28 ` Shrikanth Hegde
2024-12-27 7:16 ` Yicong Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).