* [PATCH v7 0/4] Support SMT control on arm64
@ 2024-10-30 12:54 Yicong Yang
2024-10-30 12:54 ` [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Yicong Yang @ 2024-10-30 12:54 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
The core CPU control framework supports runtime SMT control which
is not yet supported on arm64. Besides the general vulnerabilities
concerns we want this runtime control on our arm64 server for:
- better single CPU performance in some cases
- saving overall power consumption
This patchset implements it in the following aspects:
- Provides a default topology_is_primary_thread()
- support retrieve SMT thread number on OF based system
- support retrieve SMT thread number on ACPI based system
- select HOTPLUG_SMT for arm64
Tests has been done on our real ACPI based arm64 server and on
ACPI/OF based QEMU VMs.
Change since v6:
- Fix unused variable if !CONFIG_ARM64 || !CONFIG_RISV found by lkp-test
- Fix max_smt_thread_num updating in OF path pointed by Pierre
- Drop unused variable and refine the comments/commit per Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20241015021841.35713-1-yangyicong@huawei.com/
Change since v5:
- Drop the dependency on CONFIG_SMP since it's always on on arm64, per Pierre
- Avoid potential multiple calls of cpu_smt_set_num_threads() on asymmetric system, per Dietmar
- Detect heterogenous SMT topology and issue a warning for partly support, per Pierre
- Thanks Dietmar for testing, didn't pickup the tag due to code changes. Thanks testing by Pierre
Link: https://lore.kernel.org/linux-arm-kernel/20240806085320.63514-1-yangyicong@huawei.com/
Change since v4:
- Provide a default topology_is_primary_thread() in the framework, Per Will
Link: https://lore.kernel.org/linux-arm-kernel/20231121092602.47792-1-yangyicong@huawei.com/
Change since v3:
- Fix some build and kconfig error reported by kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-arm-kernel/20231114040110.54590-1-yangyicong@huawei.com/
Change since v2:
- Detect SMT thread number at topology build from ACPI/DT, avoid looping CPUs
- Split patches into ACPI/OF/arch_topology path and enable the kconfig for arm64
Link: https://lore.kernel.org/linux-arm-kernel/20231010115335.13862-1-yangyicong@huawei.com/
Yicong Yang (4):
cpu/SMT: Provide a default topology_is_primary_thread()
arch_topology: Support SMT control for OF based system
arm64: topology: Support SMT control on ACPI based system
arm64: Kconfig: Enable HOTPLUG_SMT
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/topology.c | 57 +++++++++++++++++++++++++++++
arch/powerpc/include/asm/topology.h | 1 +
arch/x86/include/asm/topology.h | 2 +-
drivers/base/arch_topology.c | 24 ++++++++++++
include/linux/topology.h | 14 +++++++
6 files changed, 98 insertions(+), 1 deletion(-)
--
2.24.0
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-30 12:54 [PATCH v7 0/4] Support SMT control on arm64 Yicong Yang
@ 2024-10-30 12:54 ` Yicong Yang
2024-10-30 14:55 ` Thomas Gleixner
2024-10-30 12:54 ` [PATCH v7 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Yicong Yang @ 2024-10-30 12:54 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
Currently if architectures want to support HOTPLUG_SMT they need to
provide a topology_is_primary_thread() telling the framework which
thread in the SMT cannot offline. However arm64 doesn't have a
restriction on which thread in the SMT cannot offline, a simplest
choice is that just make 1st thread as the "primary" thread. So
just make this as the default implementation in the framework and
let architectures like x86 that have special primary thread to
override this function (which they've already done).
There's no need to provide a stub function if !CONFIG_SMP or
!CONFIG_HOTPLUG_SMP. In such case the testing CPU is already
the 1st CPU in the SMT so it's always the primary thread.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/powerpc/include/asm/topology.h | 1 +
arch/x86/include/asm/topology.h | 2 +-
include/linux/topology.h | 14 ++++++++++++++
3 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 16bacfe8c7a2..da15b5efe807 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -152,6 +152,7 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
{
return cpu == cpu_first_thread_sibling(cpu);
}
+#define topology_is_primary_thread topology_is_primary_thread
static inline bool topology_smt_thread_allowed(unsigned int cpu)
{
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index aef70336d624..3a7c18851016 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -219,11 +219,11 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
}
+#define topology_is_primary_thread topology_is_primary_thread
#else /* CONFIG_SMP */
static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
static inline int topology_max_smt_threads(void) { return 1; }
-static inline bool topology_is_primary_thread(unsigned int cpu) { return true; }
static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
#endif /* !CONFIG_SMP */
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 52f5850730b3..9c9f6b9f3462 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -240,6 +240,20 @@ static inline const struct cpumask *cpu_smt_mask(int cpu)
}
#endif
+#ifndef topology_is_primary_thread
+#define topology_is_primary_thread topology_is_primary_thread
+static inline bool topology_is_primary_thread(unsigned int cpu)
+{
+ /*
+ * On SMT hotplug the primary thread of the SMT won't be disabled.
+ * Architectures do have a special primary thread (e.g. x86) need
+ * to override this function. Otherwise just make the first thread
+ * in the SMT as the primary thread.
+ */
+ return cpu == cpumask_first(topology_sibling_cpumask(cpu));
+}
+#endif
+
static inline const struct cpumask *cpu_cpu_mask(int cpu)
{
return cpumask_of_node(cpu_to_node(cpu));
--
2.24.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v7 2/4] arch_topology: Support SMT control for OF based system
2024-10-30 12:54 [PATCH v7 0/4] Support SMT control on arm64 Yicong Yang
2024-10-30 12:54 ` [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2024-10-30 12:54 ` Yicong Yang
2024-10-30 12:54 ` [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
2024-10-30 12:54 ` [PATCH v7 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
3 siblings, 0 replies; 13+ messages in thread
From: Yicong Yang @ 2024-10-30 12:54 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
On building the topology from the devicetree, we've already
gotten the SMT thread number of each core. Update the largest
SMT thread number and enable the SMT control by the end of
topology parsing.
The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number
If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
drivers/base/arch_topology.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 75fcb75d5515..e2c8e6d92414 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -11,6 +11,7 @@
#include <linux/cleanup.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
#include <linux/device.h>
#include <linux/of.h>
#include <linux/slab.h>
@@ -502,6 +503,10 @@ core_initcall(free_raw_capacity);
#endif
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
+
+/* Maximum SMT thread number detected used to enable the SMT control */
+static unsigned int max_smt_thread_num;
+
/*
* This function returns the logic cpu number of the node.
* There are basically three kinds of return values:
@@ -561,6 +566,17 @@ static int __init parse_core(struct device_node *core, int package_id,
i++;
} while (1);
+ /*
+ * If max_smt_thread_num has been initialized and doesn't match
+ * the thread number of this entry, then the system has
+ * heterogeneous SMT topology.
+ */
+ if (max_smt_thread_num && max_smt_thread_num != i)
+ pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+ if (max_smt_thread_num < i)
+ max_smt_thread_num = i;
+
cpu = get_cpu_for_node(core);
if (cpu >= 0) {
if (!leaf) {
@@ -673,6 +689,14 @@ static int __init parse_socket(struct device_node *socket)
if (!has_socket)
ret = parse_cluster(socket, 0, -1, 0);
+ /*
+ * Notify the CPU framework of the SMT support. A thread number of 1
+ * can be handled by the framework so we don't need to check
+ * max_smt_thread_num to see we support SMT or not.
+ */
+ if (max_smt_thread_num)
+ cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+
return ret;
}
--
2.24.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI based system
2024-10-30 12:54 [PATCH v7 0/4] Support SMT control on arm64 Yicong Yang
2024-10-30 12:54 ` [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-10-30 12:54 ` [PATCH v7 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
@ 2024-10-30 12:54 ` Yicong Yang
2024-11-07 17:20 ` Pierre Gondois
2024-10-30 12:54 ` [PATCH v7 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
3 siblings, 1 reply; 13+ messages in thread
From: Yicong Yang @ 2024-10-30 12:54 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
For ACPI we'll build the topology from PPTT and we cannot directly
get the SMT number of each core. Instead using a temporary xarray
to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
and SMT information of the first core in its heterogeneous CPU cluster
when building the topology. Then we can know the largest SMT number
in the system. If a homogeneous system's using ACPI 6.2 or later,
all the CPUs should be under the root node of PPTT. There'll be
only one entry in the xarray and all the CPUs in the system will
be assumed identical.
The core's SMT control provides two interface to the users [1]:
1) enable/disable SMT by writing on/off
2) enable/disable SMT by writing thread number 1/max_thread_number
If a system have more than one SMT thread number the 2) may
not handle it well, since there're multiple thread numbers in the
system and 2) only accept 1/max_thread_number. So issue a warning
to notify the users if such system detected.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/arm64/kernel/topology.c | 57 ++++++++++++++++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 1a2c72f3e7f8..47f838d6e823 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -15,8 +15,10 @@
#include <linux/arch_topology.h>
#include <linux/cacheinfo.h>
#include <linux/cpufreq.h>
+#include <linux/cpu_smt.h>
#include <linux/init.h>
#include <linux/percpu.h>
+#include <linux/xarray.h>
#include <asm/cpu.h>
#include <asm/cputype.h>
@@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
return !!is_threaded;
}
+struct cpu_smt_info {
+ int thread_num;
+ int core_id;
+};
+
/*
* Propagate the topology information of the processor_topology_node tree to the
* cpu_topology array.
*/
int __init parse_acpi_topology(void)
{
+ int max_smt_thread_num = 0;
+ struct cpu_smt_info *entry;
+ struct xarray hetero_cpu;
+ unsigned long hetero_id;
int cpu, topology_id;
if (acpi_disabled)
return 0;
+ xa_init(&hetero_cpu);
+
for_each_possible_cpu(cpu) {
topology_id = find_acpi_cpu_topology(cpu, 0);
if (topology_id < 0)
@@ -57,6 +70,32 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].thread_id = topology_id;
topology_id = find_acpi_cpu_topology(cpu, 1);
cpu_topology[cpu].core_id = topology_id;
+
+ /*
+ * In the PPTT, CPUs below a node with the 'identical
+ * implementation' flag have the same number of threads.
+ * Count the number of threads for only one CPU (i.e.
+ * one core_id) among those with the same hetero_id.
+ * See the comment of find_acpi_cpu_topology_hetero_id()
+ * for more details.
+ *
+ * One entry is created for each node having:
+ * - the 'identical implementation' flag
+ * - its parent not having the flag
+ */
+ hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
+ entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
+ if (!entry) {
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ WARN_ON(!entry);
+
+ entry->core_id = topology_id;
+ entry->thread_num = 1;
+ xa_store(&hetero_cpu, hetero_id,
+ entry, GFP_KERNEL);
+ } else if (entry->core_id == topology_id) {
+ entry->thread_num++;
+ }
} else {
cpu_topology[cpu].thread_id = -1;
cpu_topology[cpu].core_id = topology_id;
@@ -67,6 +106,24 @@ int __init parse_acpi_topology(void)
cpu_topology[cpu].package_id = topology_id;
}
+ /*
+ * This should be a short loop depending on the number of heterogeneous
+ * CPU clusters. Typically on a homogeneous system there's only one
+ * entry in the XArray.
+ */
+ xa_for_each(&hetero_cpu, hetero_id, entry) {
+ if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
+ pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
+
+ if (entry->thread_num > max_smt_thread_num)
+ max_smt_thread_num = entry->thread_num;
+
+ xa_erase(&hetero_cpu, hetero_id);
+ kfree(entry);
+ }
+
+ cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
+ xa_destroy(&hetero_cpu);
return 0;
}
#endif
--
2.24.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v7 4/4] arm64: Kconfig: Enable HOTPLUG_SMT
2024-10-30 12:54 [PATCH v7 0/4] Support SMT control on arm64 Yicong Yang
` (2 preceding siblings ...)
2024-10-30 12:54 ` [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2024-10-30 12:54 ` Yicong Yang
3 siblings, 0 replies; 13+ messages in thread
From: Yicong Yang @ 2024-10-30 12:54 UTC (permalink / raw)
To: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
From: Yicong Yang <yangyicong@hisilicon.com>
Enable HOTPLUG_SMT for SMT control.
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
arch/arm64/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fd9df6dcc593..cccd0a07c050 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -243,6 +243,7 @@ config ARM64
select HAVE_KRETPROBES
select HAVE_GENERIC_VDSO
select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+ select HOTPLUG_SMT if HOTPLUG_CPU
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN
--
2.24.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-30 12:54 ` [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
@ 2024-10-30 14:55 ` Thomas Gleixner
2024-10-31 12:17 ` Yicong Yang
0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2024-10-30 14:55 UTC (permalink / raw)
To: Yicong Yang, catalin.marinas, will, sudeep.holla, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
On Wed, Oct 30 2024 at 20:54, Yicong Yang wrote:
>
> +#ifndef topology_is_primary_thread
> +#define topology_is_primary_thread topology_is_primary_thread
Please do not glue defines and functions together w/o a newline in between.
> +static inline bool topology_is_primary_thread(unsigned int cpu)
> +{
> + /*
> + * On SMT hotplug the primary thread of the SMT won't be disabled.
> + * Architectures do have a special primary thread (e.g. x86) need
> + * to override this function. Otherwise just make the first thread
> + * in the SMT as the primary thread.
> + */
> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
How is that supposed to work? Assume both siblings are offline, then the
sibling mask is empty and you can't boot the CPU anymore.
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-30 14:55 ` Thomas Gleixner
@ 2024-10-31 12:17 ` Yicong Yang
2024-10-31 13:33 ` Thomas Gleixner
2024-11-07 17:20 ` Pierre Gondois
0 siblings, 2 replies; 13+ messages in thread
From: Yicong Yang @ 2024-10-31 12:17 UTC (permalink / raw)
To: Thomas Gleixner, catalin.marinas, will, sudeep.holla, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/10/30 22:55, Thomas Gleixner wrote:
> On Wed, Oct 30 2024 at 20:54, Yicong Yang wrote:
>>
>> +#ifndef topology_is_primary_thread
>> +#define topology_is_primary_thread topology_is_primary_thread
>
> Please do not glue defines and functions together w/o a newline in between.
>
sure, will add a newline here.
>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>> +{
>> + /*
>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>> + * Architectures do have a special primary thread (e.g. x86) need
>> + * to override this function. Otherwise just make the first thread
>> + * in the SMT as the primary thread.
>> + */
>> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
>
> How is that supposed to work? Assume both siblings are offline, then the
> sibling mask is empty and you can't boot the CPU anymore.
>
For architectures' using arch_topology, topology_sibling_cpumask() will at least
contain the tested CPU itself. This is initialized in
drivers/base/arch_topology.c:reset_cpu_topology(). So it won't be empty here.
Besides we don't need to check topology_is_primary_thread() at boot time:
-> cpu_up(cpu)
cpu_bootable()
if (cpu_smt_control == CPU_SMT_ENABLED &&
cpu_smt_thread_allowed(cpu)) // will always return true if !CONFIG_SMT_NUM_THREADS_DYNAMIC
return true; // we'll always return here and @cpu is always bootable
Also tested fine in practice.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-31 12:17 ` Yicong Yang
@ 2024-10-31 13:33 ` Thomas Gleixner
2024-11-01 3:18 ` Yicong Yang
2024-11-07 17:20 ` Pierre Gondois
1 sibling, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2024-10-31 13:33 UTC (permalink / raw)
To: Yicong Yang, catalin.marinas, will, sudeep.holla, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On Thu, Oct 31 2024 at 20:17, Yicong Yang wrote:
> On 2024/10/30 22:55, Thomas Gleixner wrote:
>>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>>> +{
>>> + /*
>>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>>> + * Architectures do have a special primary thread (e.g. x86) need
>>> + * to override this function. Otherwise just make the first thread
>>> + * in the SMT as the primary thread.
>>> + */
>>> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
>>
>> How is that supposed to work? Assume both siblings are offline, then the
>> sibling mask is empty and you can't boot the CPU anymore.
>>
>
> For architectures' using arch_topology, topology_sibling_cpumask() will at least
> contain the tested CPU itself. This is initialized in
> drivers/base/arch_topology.c:reset_cpu_topology(). So it won't be
> empty here.
Fair enough. Can you please expand the comment and say:
The sibling cpumask of a offline CPU contains always the CPU
itself.
> Besides we don't need to check topology_is_primary_thread() at boot time:
> -> cpu_up(cpu)
> cpu_bootable()
> if (cpu_smt_control == CPU_SMT_ENABLED &&
> cpu_smt_thread_allowed(cpu)) // will always return true if !CONFIG_SMT_NUM_THREADS_DYNAMIC
> return true; // we'll always return here and @cpu is always bootable
cpu_smt_control is not guaranteed to have CPU_SMT_ENABLED state, so this
argument is bogus.
> Also tested fine in practice.
I've heard that song before.
What matters is not what you tested. What matters is whether the code is
correct _and_ understandable.
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-31 13:33 ` Thomas Gleixner
@ 2024-11-01 3:18 ` Yicong Yang
2024-11-01 9:31 ` Thomas Gleixner
0 siblings, 1 reply; 13+ messages in thread
From: Yicong Yang @ 2024-11-01 3:18 UTC (permalink / raw)
To: Thomas Gleixner, catalin.marinas, will, sudeep.holla, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/10/31 21:33, Thomas Gleixner wrote:
> On Thu, Oct 31 2024 at 20:17, Yicong Yang wrote:
>> On 2024/10/30 22:55, Thomas Gleixner wrote:
>>>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>>>> +{
>>>> + /*
>>>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>>>> + * Architectures do have a special primary thread (e.g. x86) need
>>>> + * to override this function. Otherwise just make the first thread
>>>> + * in the SMT as the primary thread.
>>>> + */
>>>> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
>>>
>>> How is that supposed to work? Assume both siblings are offline, then the
>>> sibling mask is empty and you can't boot the CPU anymore.
>>>
>>
>> For architectures' using arch_topology, topology_sibling_cpumask() will at least
>> contain the tested CPU itself. This is initialized in
>> drivers/base/arch_topology.c:reset_cpu_topology(). So it won't be
>> empty here.
>
> Fair enough. Can you please expand the comment and say:
>
> The sibling cpumask of a offline CPU contains always the CPU
> itself.
>
Sure, will make it clear.
>> Besides we don't need to check topology_is_primary_thread() at boot time:
>> -> cpu_up(cpu)
>> cpu_bootable()
>> if (cpu_smt_control == CPU_SMT_ENABLED &&
>> cpu_smt_thread_allowed(cpu)) // will always return true if !CONFIG_SMT_NUM_THREADS_DYNAMIC
>> return true; // we'll always return here and @cpu is always bootable
>
> cpu_smt_control is not guaranteed to have CPU_SMT_ENABLED state, so this
> argument is bogus.
>
sorry for didn't explain all the cases here.
For cpu_sm_control == {CPU_SMT_ENABLED, CPU_SMT_NOT_SUPPORTED, CPU_SMT_NOT_IMPLEMENTED},
all the cpu's bootable and we won't check topology_is_primary_thread().
static inline bool cpu_bootable(unsigned int cpu)
{
if (cpu_smt_control == CPU_SMT_ENABLED && cpu_smt_thread_allowed(cpu))
return true;
/* All CPUs are bootable if controls are not configured */
if (cpu_smt_control == CPU_SMT_NOT_IMPLEMENTED)
return true;
/* All CPUs are bootable if CPU is not SMT capable */
if (cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
return true;
if (topology_is_primary_thread(cpu)) // Will be true for all the CPUs when thread sibling's not built
// Only true for primary thread if thread sibling's updated
// thread sibling will be updated once the CPU's bootup, for arm64
// in secondary_start_kernel()
return true;
return !cpumask_test_cpu(cpu, &cpus_booted_once_mask); // Also be updated once the CPU's bootup, in
// secondary_start_kernel() for arm64
// Will return false in the second check of
// cpu_bootable() in the call chain below
}
For cpu_smt_control == {CPUS_SMT_DISABLED, CPU_SMT_FORCE_DISABLED} if user specified the
boot option "nosmt" or "nosmt=force", it'll be a bit more complex. For a non-primary
thread CPU, cpu_bootable() will return true and it'll be boot. Then after thread sibling's
built cpu_bootable() will be checked secondly it the cpuhp callbacks, since it'll return
false then and we'll roll back and offline it.
// for a non-primary thread CPU, system boot with "nosmt" or "nosmt=force"
-> cpu_up()
cpu_bootable() -> true, since the thread sibling mask only coutains CPU itself
[...]
cpuhp_bringup_ap()
bringup_wait_for_ap_online()
if (!cpu_bootable(cpu)) // target CPU has been bringup, thread sibling mask's updated
// then this non-primay thread won't be bootable in this case
return -ECANCELED // roll back and offline this CPU
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-11-01 3:18 ` Yicong Yang
@ 2024-11-01 9:31 ` Thomas Gleixner
0 siblings, 0 replies; 13+ messages in thread
From: Thomas Gleixner @ 2024-11-01 9:31 UTC (permalink / raw)
To: Yicong Yang, catalin.marinas, will, sudeep.holla, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, pierre.gondois,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On Fri, Nov 01 2024 at 11:18, Yicong Yang wrote:
> On 2024/10/31 21:33, Thomas Gleixner wrote:
>> cpu_smt_control is not guaranteed to have CPU_SMT_ENABLED state, so this
>> argument is bogus.
>>
> sorry for didn't explain all the cases here.
>
> For cpu_sm_control == {CPU_SMT_ENABLED, CPU_SMT_NOT_SUPPORTED, CPU_SMT_NOT_IMPLEMENTED},
> all the cpu's bootable and we won't check topology_is_primary_thread().
You don't have to copy the code to me. I'm familiar with it.
All I need is a proper explanation why your topology_is_primary_thread()
implementation is correct under all circumstances.
Thanks,
tglx
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI based system
2024-10-30 12:54 ` [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
@ 2024-11-07 17:20 ` Pierre Gondois
2024-11-08 8:06 ` Yicong Yang
0 siblings, 1 reply; 13+ messages in thread
From: Pierre Gondois @ 2024-11-07 17:20 UTC (permalink / raw)
To: Yicong Yang, catalin.marinas, will, sudeep.holla, tglx, peterz,
mpe, linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann
Cc: linuxppc-dev, x86, linux-kernel, morten.rasmussen, msuchanek,
gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
yangyicong, xuwei5, guohanjun
On 10/30/24 13:54, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> For ACPI we'll build the topology from PPTT and we cannot directly
> get the SMT number of each core. Instead using a temporary xarray
> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> and SMT information of the first core in its heterogeneous CPU cluster
> when building the topology. Then we can know the largest SMT number
> in the system. If a homogeneous system's using ACPI 6.2 or later,
> all the CPUs should be under the root node of PPTT. There'll be
> only one entry in the xarray and all the CPUs in the system will
> be assumed identical.
>
> The core's SMT control provides two interface to the users [1]:
> 1) enable/disable SMT by writing on/off
> 2) enable/disable SMT by writing thread number 1/max_thread_number
>
> If a system have more than one SMT thread number the 2) may
> not handle it well, since there're multiple thread numbers in the
> system and 2) only accept 1/max_thread_number. So issue a warning
> to notify the users if such system detected.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
> arch/arm64/kernel/topology.c | 57 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 57 insertions(+)
>
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 1a2c72f3e7f8..47f838d6e823 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -15,8 +15,10 @@
> #include <linux/arch_topology.h>
> #include <linux/cacheinfo.h>
> #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
> #include <linux/init.h>
> #include <linux/percpu.h>
> +#include <linux/xarray.h>
>
> #include <asm/cpu.h>
> #include <asm/cputype.h>
> @@ -37,17 +39,28 @@ static bool __init acpi_cpu_is_threaded(int cpu)
> return !!is_threaded;
> }
>
> +struct cpu_smt_info {
> + int thread_num;
> + int core_id;
> +};
> +
> /*
> * Propagate the topology information of the processor_topology_node tree to the
> * cpu_topology array.
> */
> int __init parse_acpi_topology(void)
> {
> + int max_smt_thread_num = 0;
> + struct cpu_smt_info *entry;
> + struct xarray hetero_cpu;
> + unsigned long hetero_id;
> int cpu, topology_id;
>
> if (acpi_disabled)
> return 0;
>
> + xa_init(&hetero_cpu);
> +
> for_each_possible_cpu(cpu) {
> topology_id = find_acpi_cpu_topology(cpu, 0);
> if (topology_id < 0)
> @@ -57,6 +70,32 @@ int __init parse_acpi_topology(void)
> cpu_topology[cpu].thread_id = topology_id;
> topology_id = find_acpi_cpu_topology(cpu, 1);
> cpu_topology[cpu].core_id = topology_id;
> +
> + /*
> + * In the PPTT, CPUs below a node with the 'identical
> + * implementation' flag have the same number of threads.
> + * Count the number of threads for only one CPU (i.e.
> + * one core_id) among those with the same hetero_id.
> + * See the comment of find_acpi_cpu_topology_hetero_id()
> + * for more details.
> + *
> + * One entry is created for each node having:
> + * - the 'identical implementation' flag
> + * - its parent not having the flag
> + */
> + hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> + entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
> + if (!entry) {
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + WARN_ON(!entry);
> +
> + entry->core_id = topology_id;
> + entry->thread_num = 1;
> + xa_store(&hetero_cpu, hetero_id,
> + entry, GFP_KERNEL);
> + } else if (entry->core_id == topology_id) {
> + entry->thread_num++;
> + }
> } else {
> cpu_topology[cpu].thread_id = -1;
> cpu_topology[cpu].core_id = topology_id;
> @@ -67,6 +106,24 @@ int __init parse_acpi_topology(void)
> cpu_topology[cpu].package_id = topology_id;
> }
>
> + /*
> + * This should be a short loop depending on the number of heterogeneous
> + * CPU clusters. Typically on a homogeneous system there's only one
> + * entry in the XArray.
> + */
> + xa_for_each(&hetero_cpu, hetero_id, entry) {
> + if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
> +
> + if (entry->thread_num > max_smt_thread_num)
> + max_smt_thread_num = entry->thread_num;
> +
> + xa_erase(&hetero_cpu, hetero_id);
> + kfree(entry);
> + }
I think you need to check that max_smt_thread_num !=0,
like in the DT path. Otherwise on a platform with no SMT,
max_smt_thread_num = 0 and I hit:
kernel/cpu::cpu_smt_set_num_threads()
WARN_ON(!num_threads || (num_threads > max_threads));
->
if (max_smt_thread_num)
cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> +
> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> + xa_destroy(&hetero_cpu);
> return 0;
> }
> #endif
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread()
2024-10-31 12:17 ` Yicong Yang
2024-10-31 13:33 ` Thomas Gleixner
@ 2024-11-07 17:20 ` Pierre Gondois
1 sibling, 0 replies; 13+ messages in thread
From: Pierre Gondois @ 2024-11-07 17:20 UTC (permalink / raw)
To: Yicong Yang, Thomas Gleixner, catalin.marinas, will, sudeep.holla,
peterz, mpe, linux-arm-kernel, mingo, bp, dave.hansen,
dietmar.eggemann
Cc: yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On 10/31/24 13:17, Yicong Yang wrote:
> On 2024/10/30 22:55, Thomas Gleixner wrote:
>> On Wed, Oct 30 2024 at 20:54, Yicong Yang wrote:
>>>
>>> +#ifndef topology_is_primary_thread
>>> +#define topology_is_primary_thread topology_is_primary_thread
>>
>> Please do not glue defines and functions together w/o a newline in between.
>>
>
> sure, will add a newline here.
>
>>> +static inline bool topology_is_primary_thread(unsigned int cpu)
>>> +{
>>> + /*
>>> + * On SMT hotplug the primary thread of the SMT won't be disabled.
>>> + * Architectures do have a special primary thread (e.g. x86) need
>>> + * to override this function. Otherwise just make the first thread
>>> + * in the SMT as the primary thread.
>>> + */
>>> + return cpu == cpumask_first(topology_sibling_cpumask(cpu));
>>
>> How is that supposed to work? Assume both siblings are offline, then the
>> sibling mask is empty and you can't boot the CPU anymore.
>>
>
> For architectures' using arch_topology, topology_sibling_cpumask() will at least
> contain the tested CPU itself. This is initialized in
> drivers/base/arch_topology.c:reset_cpu_topology(). So it won't be empty here.
>
> Besides we don't need to check topology_is_primary_thread() at boot time:
> -> cpu_up(cpu)
> cpu_bootable()
> if (cpu_smt_control == CPU_SMT_ENABLED &&
> cpu_smt_thread_allowed(cpu)) // will always return true if !CONFIG_SMT_NUM_THREADS_DYNAMIC
> return true; // we'll always return here and @cpu is always bootable
>
> Also tested fine in practice.
>
> Thanks.
>
>
FWIW, I also tested the case where:
- setting maxcpus=1 in the kernel cmdline to have CPUs that never booted
- setting smt to off:
'echo off > /sys/devices/system/cpu/smt/control'
and effectively the primary CPUs can boot and secondary CPUs can't,
so it works as expected.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI based system
2024-11-07 17:20 ` Pierre Gondois
@ 2024-11-08 8:06 ` Yicong Yang
0 siblings, 0 replies; 13+ messages in thread
From: Yicong Yang @ 2024-11-08 8:06 UTC (permalink / raw)
To: Pierre Gondois
Cc: catalin.marinas, will, sudeep.holla, tglx, peterz, mpe,
linux-arm-kernel, mingo, bp, dave.hansen, dietmar.eggemann,
yangyicong, linuxppc-dev, x86, linux-kernel, morten.rasmussen,
msuchanek, gregkh, rafael, jonathan.cameron, prime.zeng, linuxarm,
xuwei5, guohanjun
On 2024/11/8 1:20, Pierre Gondois wrote:
>
>
> On 10/30/24 13:54, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
>> and SMT information of the first core in its heterogeneous CPU cluster
>> when building the topology. Then we can know the largest SMT number
>> in the system. If a homogeneous system's using ACPI 6.2 or later,
>> all the CPUs should be under the root node of PPTT. There'll be
>> only one entry in the xarray and all the CPUs in the system will
>> be assumed identical.
>>
>> The core's SMT control provides two interface to the users [1]:
>> 1) enable/disable SMT by writing on/off
>> 2) enable/disable SMT by writing thread number 1/max_thread_number
>>
>> If a system have more than one SMT thread number the 2) may
>> not handle it well, since there're multiple thread numbers in the
>> system and 2) only accept 1/max_thread_number. So issue a warning
>> to notify the users if such system detected.
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/sysfs-devices-system-cpu#n542
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>> arch/arm64/kernel/topology.c | 57 ++++++++++++++++++++++++++++++++++++
>> 1 file changed, 57 insertions(+)
>>
[...]
>> @@ -67,6 +106,24 @@ int __init parse_acpi_topology(void)
>> cpu_topology[cpu].package_id = topology_id;
>> }
>> + /*
>> + * This should be a short loop depending on the number of heterogeneous
>> + * CPU clusters. Typically on a homogeneous system there's only one
>> + * entry in the XArray.
>> + */
>> + xa_for_each(&hetero_cpu, hetero_id, entry) {
>> + if (entry->thread_num != max_smt_thread_num && max_smt_thread_num)
>> + pr_warn_once("Heterogeneous SMT topology is partly supported by SMT control\n");
>> +
>> + if (entry->thread_num > max_smt_thread_num)
>> + max_smt_thread_num = entry->thread_num;
>> +
>> + xa_erase(&hetero_cpu, hetero_id);
>> + kfree(entry);
>> + }
>
> I think you need to check that max_smt_thread_num !=0,
> like in the DT path. Otherwise on a platform with no SMT,
> max_smt_thread_num = 0 and I hit:
>
> kernel/cpu::cpu_smt_set_num_threads()
> WARN_ON(!num_threads || (num_threads > max_threads));
>
>
> ->
> if (max_smt_thread_num)
> cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>
sorry for this silly mistake. yes we should do the same like in the DT path.
I'll get this fixed. thanks for testing.
>> +
>> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>> + xa_destroy(&hetero_cpu);
>> return 0;
>> }
>> #endif
>
> .
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-11-08 8:09 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-30 12:54 [PATCH v7 0/4] Support SMT control on arm64 Yicong Yang
2024-10-30 12:54 ` [PATCH v7 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-10-30 14:55 ` Thomas Gleixner
2024-10-31 12:17 ` Yicong Yang
2024-10-31 13:33 ` Thomas Gleixner
2024-11-01 3:18 ` Yicong Yang
2024-11-01 9:31 ` Thomas Gleixner
2024-11-07 17:20 ` Pierre Gondois
2024-10-30 12:54 ` [PATCH v7 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
2024-10-30 12:54 ` [PATCH v7 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
2024-11-07 17:20 ` Pierre Gondois
2024-11-08 8:06 ` Yicong Yang
2024-10-30 12:54 ` [PATCH v7 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).