linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Pierre Gondois <pierre.gondois@arm.com>
To: Yicong Yang <yangyicong@huawei.com>
Cc: yangyicong@hisilicon.com, linuxppc-dev@lists.ozlabs.org,
	bp@alien8.de, dave.hansen@linux.intel.com, mingo@redhat.com,
	linux-arm-kernel@lists.infradead.org, mpe@ellerman.id.au,
	peterz@infradead.org, tglx@linutronix.de, sudeep.holla@arm.com,
	will@kernel.org, catalin.marinas@arm.com, x86@kernel.org,
	linux-kernel@vger.kernel.org, dietmar.eggemann@arm.com,
	gregkh@linuxfoundation.org, rafael@kernel.org,
	jonathan.cameron@huawei.com, prime.zeng@hisilicon.com,
	linuxarm@huawei.com, xuwei5@huawei.com, guohanjun@huawei.com
Subject: Re: [PATCH v5 3/4] arm64: topology: Support SMT control on ACPI based system
Date: Thu, 5 Sep 2024 10:34:57 +0200	[thread overview]
Message-ID: <277bd093-422b-4301-92a3-d0a58eb41af5@arm.com> (raw)
In-Reply-To: <7f722af2-2969-aae5-1fb5-68d353eb95b9@huawei.com>

Hello Yicong,

>>> Wondering if we can avoid this 2nd loop. Greg express the worries of looping twice on large scale
>>> system in v1. Maybe we could use the hetero_id and get the necessary information in one loop, I need
>>> further think.
>>
>> I found this comments (not sure this is what you are refering to):
>> - https://lore.kernel.org/linux-arm-kernel/20231011103303.00002d8f@Huawei.com/
>> - https://lore.kernel.org/all/20230921150333.c2zqigs3xxwcg4ln@bogus/T/#m406c4c16871ca7ae431beb20feccfb5e14498452
>>
>> I don't see another way to do it right now. Also, I thing the complexity is in
>> O(2n), which should be better than the original O(n**2),
>>
> 
> yes it's less complex. I'm wondering build up the xarray in another way then we can avoid the
> long loops. What about below:

I tried the patch on a ThunderX2 with 4 threads per CPU. PPTT topology describes
2 clusters of 128 cores each. Each cluster is described as independent (i.e. there
is no root node in the PPTT). The PPTT is of revision 1, so the IDENTICAL
flag is not available on the platform.

I also tried it on a faked SMT asymmetric platform and there might be a small
correction required.

> 
>  From 5ff5d0100435982764cd85566a6fe006e60ee98e Mon Sep 17 00:00:00 2001
> From: Yicong Yang <yangyicong@hisilicon.com>
> Date: Fri, 20 Oct 2023 15:38:38 +0800
> Subject: [PATCH] arm64: topology: Support SMT control on ACPI based system
> 
> For ACPI we'll build the topology from PPTT and we cannot directly
> get the SMT number of each core. Instead using a temporary xarray
> to record the heterogeneous information (from ACPI_PPTT_ACPI_IDENTICAL)
> and SMT information of the first core in its heterogeneous CPU cluster
> when building the topology. Then we can know the largest SMT number
> in the system. Warn if heterogeneous SMT topology exists (multiple
> heterogeneous CPU clusters with different SMT thread number) since the
> SMT control cannot handle this well. Then enable the support of SMT
> control.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>   arch/arm64/kernel/topology.c | 60 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 60 insertions(+)
> 
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index 1a2c72f3e7f8..f6ec30fae70e 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -15,8 +15,10 @@
>   #include <linux/arch_topology.h>
>   #include <linux/cacheinfo.h>
>   #include <linux/cpufreq.h>
> +#include <linux/cpu_smt.h>
>   #include <linux/init.h>
>   #include <linux/percpu.h>
> +#include <linux/xarray.h>
> 
>   #include <asm/cpu.h>
>   #include <asm/cputype.h>
> @@ -37,17 +39,29 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>   	return !!is_threaded;
>   }
> 
> +struct cpu_smt_info {
> +	int thread_num;
> +	int core_id;
> +	int cpu;
> +};
> +
>   /*
>    * Propagate the topology information of the processor_topology_node tree to the
>    * cpu_topology array.
>    */
>   int __init parse_acpi_topology(void)
>   {
> +	int max_smt_thread_num = 1;
> +	struct cpu_smt_info *entry;
> +	struct xarray hetero_cpu;
> +	unsigned long hetero_id;
>   	int cpu, topology_id;
> 
>   	if (acpi_disabled)
>   		return 0;
> 
> +	xa_init(&hetero_cpu);
> +
>   	for_each_possible_cpu(cpu) {
>   		topology_id = find_acpi_cpu_topology(cpu, 0);
>   		if (topology_id < 0)
> @@ -57,6 +71,30 @@ int __init parse_acpi_topology(void)
>   			cpu_topology[cpu].thread_id = topology_id;
>   			topology_id = find_acpi_cpu_topology(cpu, 1);
>   			cpu_topology[cpu].core_id   = topology_id;
> +
> +			/*
> +			 * Build up the XArray using the heterogeneous ID of
> +			 * the CPU cluster. Store the CPU and SMT information
> +			 * of the first appeared CPU in the CPU cluster of this
> +			 * heterogeneous ID since the SMT information should be
> +			 * the same in this CPU cluster. Then we can know the
> +			 * SMT information of each heterogeneous CPUs in the
> +			 * system.
> +			 */
> +			hetero_id = find_acpi_cpu_topology_hetero_id(cpu);
> +			entry = (struct cpu_smt_info *)xa_load(&hetero_cpu, hetero_id);
> +			if (!entry) {
> +				entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +				WARN_ON(!entry);
> +
> +				entry->cpu = cpu;
> +				entry->core_id = topology_id;
> +				entry->thread_num = 1;
> +				xa_store(&hetero_cpu, hetero_id,
> +					 entry, GFP_KERNEL);
> +			} else if (entry->core_id == topology_id) {
> +				entry->thread_num++;
> +			}
>   		} else {
>   			cpu_topology[cpu].thread_id  = -1;
>   			cpu_topology[cpu].core_id    = topology_id;
> @@ -67,6 +105,28 @@ int __init parse_acpi_topology(void)
>   		cpu_topology[cpu].package_id = topology_id;
>   	}
> 
> +	/*
> +	 * This should be a short loop depending on the number of heterogeneous
> +	 * CPU clusters. Typically on a homogeneous system there's only one
> +	 * entry in the XArray.
> +	 */
> +	xa_for_each(&hetero_cpu, hetero_id, entry) {
> +		if (entry->thread_num == 1)
> +			continue;

If a platform has CPUs with:
- 1 thread
- X (!= 1) threads
Then I think that the asymmetry is not detected

> +
> +		if (entry->thread_num != max_smt_thread_num &&
> +		    max_smt_thread_num != 1)
> +			pr_warn("Heterogeneous SMT topology not handled");
> +
> +		if (entry->thread_num > max_smt_thread_num)
> +			max_smt_thread_num = entry->thread_num;
> +
> +		xa_erase(&hetero_cpu, hetero_id);
> +		kfree(entry);
> +	}
> +
> +	cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
> +	xa_destroy(&hetero_cpu);
>   	return 0;
>   }
>   #endif


  reply	other threads:[~2024-09-05  8:36 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-06  8:53 [PATCH v5 0/4] Support SMT control on arm64 Yicong Yang
2024-08-06  8:53 ` [PATCH v5 1/4] cpu/SMT: Provide a default topology_is_primary_thread() Yicong Yang
2024-08-16 15:54   ` Dietmar Eggemann
2024-08-19  7:25     ` Yicong Yang
2024-08-06  8:53 ` [PATCH v5 2/4] arch_topology: Support SMT control for OF based system Yicong Yang
2024-08-16 15:55   ` Dietmar Eggemann
2024-08-19  7:18     ` Yicong Yang
2024-08-06  8:53 ` [PATCH v5 3/4] arm64: topology: Support SMT control on ACPI " Yicong Yang
2024-08-16 15:55   ` Dietmar Eggemann
2024-08-19  7:03     ` Yicong Yang
2024-08-22  7:19       ` Dietmar Eggemann
2024-08-27 15:40   ` Pierre Gondois
2024-08-29  7:40     ` Yicong Yang
2024-08-29 12:46       ` Pierre Gondois
2024-08-30  9:35         ` Yicong Yang
2024-09-02  7:43           ` Pierre Gondois
2024-09-03 12:44             ` Yicong Yang
2024-09-05  8:34               ` Pierre Gondois [this message]
2024-09-05 12:02                 ` Yicong Yang
2024-09-06  7:06                   ` Morten Rasmussen
2024-09-06  8:36                     ` Yicong Yang
2024-09-12 11:59                       ` Morten Rasmussen
2024-09-12 12:53                         ` Michal Suchánek
2024-08-06  8:53 ` [PATCH v5 4/4] arm64: Kconfig: Enable HOTPLUG_SMT Yicong Yang
2024-08-27 15:40   ` Pierre Gondois
2024-08-29  6:50     ` Yicong Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=277bd093-422b-4301-92a3-d0a58eb41af5@arm.com \
    --to=pierre.gondois@arm.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=guohanjun@huawei.com \
    --cc=jonathan.cameron@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=peterz@infradead.org \
    --cc=prime.zeng@hisilicon.com \
    --cc=rafael@kernel.org \
    --cc=sudeep.holla@arm.com \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=xuwei5@huawei.com \
    --cc=yangyicong@hisilicon.com \
    --cc=yangyicong@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).