linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] MIPS: CPS: Optimise delay CPU calibration and cluster helper function
@ 2025-07-04 15:13 Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster() Gregory CLEMENT
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Gregory CLEMENT @ 2025-07-04 15:13 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jiaxun Yang, Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips, linux-kernel, Gregory CLEMENT

This series enables faster booting by reusing the delay calibration
across CPUs within the same cluster. To do so, it reuses
mips_cps_first_online_in_cluster(), which the series also improves.

With this series applied, a configuration running 32 cores spread
across two clusters boots approximately 600 milliseconds faster.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
---
Changes in v2:
- Add a patch improving mips_cps_first_online_in_cluster()
- Use mips_cps_first_online_in_cluster() in calibrate_delay_is_known()
  as suggested by Jiaxun
- Link to v1: https://lore.kernel.org/r/20250520-smp_calib-v1-1-cd04f0a78648@bootlin.com

---
Gregory CLEMENT (3):
      MIPS: CPS: Improve mips_cps_first_online_in_cluster()
      MIPS: CPS: Change default cluster value for EyeQ SoCs
      MIPS: CPS: Optimise delay CPU calibration for SMP

 arch/mips/Kconfig                |  7 +++++++
 arch/mips/include/asm/mips-cps.h | 18 ++++++++++++++++--
 arch/mips/kernel/mips-cm.c       | 40 +++++++---------------------------------
 arch/mips/kernel/smp-cps.c       | 13 +++++++++++++
 4 files changed, 43 insertions(+), 35 deletions(-)
---
base-commit: 86731a2a651e58953fc949573895f2fa6d456841
change-id: 20250520-smp_calib-6d3009e1f5b9

Best regards,
-- 
Grégory CLEMENT, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com



* [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster()
  2025-07-04 15:13 [PATCH v2 0/3] MIPS: CPS: Optimise delay CPU calibration and cluster helper function Gregory CLEMENT
@ 2025-07-04 15:13 ` Gregory CLEMENT
  2025-07-04 18:03   ` Jiaxun Yang
  2025-07-04 15:13 ` [PATCH v2 2/3] MIPS: CPS: Change default cluster value for EyeQ SoCs Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 3/3] MIPS: CPS: Optimise delay CPU calibration for SMP Gregory CLEMENT
  2 siblings, 1 reply; 7+ messages in thread
From: Gregory CLEMENT @ 2025-07-04 15:13 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jiaxun Yang, Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips, linux-kernel, Gregory CLEMENT

The initial implementation of this function goes through all the CPUs
in a cluster to determine if the current CPU is the only one
running. This process occurs every time the function is called.

However, during boot, we already perform this task, so let's take
advantage of this opportunity to create and fill a CPU bitmask that
can be easily and efficiently used later.

This requires creating a single CPU bitmask per cluster, which is why
it's essential to know how many clusters can be utilized. The default
setting is 4 clusters, but this value can be changed when configuring
the kernel or even customized for a given SoC family.

This patch also makes the function report the first other online CPU
in the cluster when one exists, which the delay calibration
optimization needs.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
---
 arch/mips/Kconfig                |  6 ++++++
 arch/mips/include/asm/mips-cps.h | 18 ++++++++++++++++--
 arch/mips/kernel/mips-cm.c       | 40 +++++++---------------------------------
 arch/mips/kernel/smp-cps.c       |  2 ++
 4 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 1e48184ecf1ec8e29c0a25de6452ece5da835e30..47aa3f8849f05632773c9064282147608483c715 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2329,6 +2329,12 @@ config MIPS_CPS
 	  no external assistance. It is safe to enable this when hardware
 	  support is unavailable.
 
+config MIPS_CPS_CLUSTER_MAX
+       int "Maximum cluster supported"
+       default 4
+       help
+	Maximum number of cluster available on the SoC.
+
 config MIPS_CPS_PM
 	depends on MIPS_CPS
 	bool
diff --git a/arch/mips/include/asm/mips-cps.h b/arch/mips/include/asm/mips-cps.h
index 917009b80e6951dc7e2b308ad7fb42cd9fbbf7d7..674f0628a2ded42f4f42c7362a667a457944caa0 100644
--- a/arch/mips/include/asm/mips-cps.h
+++ b/arch/mips/include/asm/mips-cps.h
@@ -12,6 +12,8 @@
 #include <linux/io.h>
 #include <linux/types.h>
 
+extern struct cpumask __cpu_cluster_mask[CONFIG_MIPS_CPS_CLUSTER_MAX] __read_mostly;
+
 extern unsigned long __cps_access_bad_size(void)
 	__compiletime_error("Bad size for CPS accessor");
 
@@ -114,10 +116,20 @@ static inline void clear_##unit##_##name(uint##sz##_t val)		\
  */
 static inline unsigned int mips_cps_numclusters(void)
 {
+	unsigned int nclusters;
+
 	if (mips_cm_revision() < CM_REV_CM3_5)
 		return 1;
 
-	return FIELD_GET(CM_GCR_CONFIG_NUM_CLUSTERS, read_gcr_config());
+	nclusters = FIELD_GET(CM_GCR_CONFIG_NUM_CLUSTERS, read_gcr_config());
+	if (nclusters > CONFIG_MIPS_CPS_CLUSTER_MAX) {
+		pr_warn("%u clusters detected but limited to %d because of CONFIG_MIPS_CPS_CLUSTER_MAX value\n"
+			"consider modifying it to match the hardware capability.\n",
+			nclusters, CONFIG_MIPS_CPS_CLUSTER_MAX);
+		nclusters = CONFIG_MIPS_CPS_CLUSTER_MAX;
+	}
+
+	return nclusters;
 }
 
 /**
@@ -258,6 +270,8 @@ static inline bool mips_cps_multicluster_cpus(void)
 
 /**
  * mips_cps_first_online_in_cluster() - Detect if CPU is first online in cluster
+ * @first_cpu: The first other online CPU in cluster, or nr_cpu_ids if
+ * the function returns true.
  *
  * Determine whether the local CPU is the first to be brought online in its
  * cluster - that is, whether there are any other online CPUs in the local
@@ -265,6 +279,6 @@ static inline bool mips_cps_multicluster_cpus(void)
  *
  * Returns true if this CPU is first online, else false.
  */
-extern unsigned int mips_cps_first_online_in_cluster(void);
+extern unsigned int mips_cps_first_online_in_cluster(int *first_cpu);
 
 #endif /* __MIPS_ASM_MIPS_CPS_H__ */
diff --git a/arch/mips/kernel/mips-cm.c b/arch/mips/kernel/mips-cm.c
index 43cb1e20baed3648ff83bb5d3bbe6a726072e063..d1d98e03559df5f891c3afce0955d63db7eb1c45 100644
--- a/arch/mips/kernel/mips-cm.c
+++ b/arch/mips/kernel/mips-cm.c
@@ -529,39 +529,13 @@ void mips_cm_error_report(void)
 	write_gcr_error_cause(cm_error);
 }
 
-unsigned int mips_cps_first_online_in_cluster(void)
+unsigned int mips_cps_first_online_in_cluster(int *first_cpu)
 {
-	unsigned int local_cl;
-	int i;
+	unsigned int local_cl = cpu_cluster(&current_cpu_data);
+	struct cpumask *local_cluster_mask = &__cpu_cluster_mask[local_cl];
 
-	local_cl = cpu_cluster(&current_cpu_data);
-
-	/*
-	 * We rely upon knowledge that CPUs are numbered sequentially by
-	 * cluster - ie. CPUs 0..X will be in cluster 0, CPUs X+1..Y in cluster
-	 * 1, CPUs Y+1..Z in cluster 2 etc. This means that CPUs in the same
-	 * cluster will immediately precede or follow one another.
-	 *
-	 * First we scan backwards, until we find an online CPU in the cluster
-	 * or we move on to another cluster.
-	 */
-	for (i = smp_processor_id() - 1; i >= 0; i--) {
-		if (cpu_cluster(&cpu_data[i]) != local_cl)
-			break;
-		if (!cpu_online(i))
-			continue;
-		return false;
-	}
-
-	/* Then do the same for higher numbered CPUs */
-	for (i = smp_processor_id() + 1; i < nr_cpu_ids; i++) {
-		if (cpu_cluster(&cpu_data[i]) != local_cl)
-			break;
-		if (!cpu_online(i))
-			continue;
-		return false;
-	}
-
-	/* We found no online CPUs in the local cluster */
-	return true;
+	*first_cpu = cpumask_any_and_but(local_cluster_mask,
+					 cpu_online_mask,
+					 smp_processor_id());
+	return (*first_cpu >= nr_cpu_ids);
 }
diff --git a/arch/mips/kernel/smp-cps.c b/arch/mips/kernel/smp-cps.c
index 7b0e69af4097025196b93115139a5e89c1d71fcc..a5c538742769dcbf22e27d2d4485c071e2e64ec2 100644
--- a/arch/mips/kernel/smp-cps.c
+++ b/arch/mips/kernel/smp-cps.c
@@ -40,6 +40,7 @@ static u64 core_entry_reg;
 static phys_addr_t cps_vec_pa;
 
 struct cluster_boot_config *mips_cps_cluster_bootcfg;
+struct cpumask __cpu_cluster_mask[CONFIG_MIPS_CPS_CLUSTER_MAX] __read_mostly;
 
 static void power_up_other_cluster(unsigned int cluster)
 {
@@ -242,6 +243,7 @@ static void __init cps_smp_setup(void)
 				cpu_set_cluster(&cpu_data[nvpes + v], cl);
 				cpu_set_core(&cpu_data[nvpes + v], c);
 				cpu_set_vpe_id(&cpu_data[nvpes + v], v);
+				cpumask_set_cpu(nvpes + v, &__cpu_cluster_mask[cl]);
 			}
 
 			nvpes += core_vpes;

-- 
2.47.2
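
For readers following the patch above, here is a minimal user-space
sketch of the lookup it introduces: one mask per cluster filled once at
topology setup, then a single AND against the online mask. It is a
model, not the kernel code. The model_* names, the 32-CPU limit and the
plain uint32_t masks are assumptions made for illustration; the kernel
uses struct cpumask and cpumask_any_and_but(), which may return any
matching CPU, whereas this model always returns the lowest-numbered one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS     32u /* assumption: one 32-bit word covers every CPU */
#define CLUSTER_MAX 4u  /* mirrors the default of MIPS_CPS_CLUSTER_MAX */

static uint32_t cluster_mask[CLUSTER_MAX]; /* filled once, at topology setup */
static uint32_t online_mask;               /* updated as CPUs come online */

/* Topology setup: record that 'cpu' belongs to cluster 'cl'. */
static void model_set_cluster(unsigned int cpu, unsigned int cl)
{
	assert(cpu < NR_CPUS && cl < CLUSTER_MAX);
	cluster_mask[cl] |= UINT32_C(1) << cpu;
}

/* Mark 'cpu' as online. */
static void model_set_online(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	online_mask |= UINT32_C(1) << cpu;
}

/*
 * Return true when no CPU other than 'self' is online in cluster 'cl'.
 * Otherwise store the lowest-numbered other online CPU in *first_cpu.
 * On a true return, *first_cpu is set to NR_CPUS.
 */
static bool model_first_online_in_cluster(unsigned int self, unsigned int cl,
					  unsigned int *first_cpu)
{
	uint32_t others;
	unsigned int cpu;

	assert(self < NR_CPUS && cl < CLUSTER_MAX);
	others = cluster_mask[cl] & online_mask & ~(UINT32_C(1) << self);

	*first_cpu = NR_CPUS;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (others & (UINT32_C(1) << cpu)) {
			*first_cpu = cpu;
			break;
		}
	}
	return *first_cpu >= NR_CPUS;
}
```

Compared with the removed backward and forward scans, the lookup no
longer depends on CPUs being numbered sequentially by cluster, since
cluster membership is read from the mask.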



* [PATCH v2 2/3] MIPS: CPS: Change default cluster value for EyeQ SoCs
  2025-07-04 15:13 [PATCH v2 0/3] MIPS: CPS: Optimise delay CPU calibration and cluster helper function Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster() Gregory CLEMENT
@ 2025-07-04 15:13 ` Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 3/3] MIPS: CPS: Optimise delay CPU calibration for SMP Gregory CLEMENT
  2 siblings, 0 replies; 7+ messages in thread
From: Gregory CLEMENT @ 2025-07-04 15:13 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jiaxun Yang, Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips, linux-kernel, Gregory CLEMENT

On these SoCs, only 2 clusters are used. Modify the
MIPS_CPS_CLUSTER_MAX default value accordingly for EyeQ5 and EyeQ6.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
---
 arch/mips/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 47aa3f8849f05632773c9064282147608483c715..63d085db42f5ea2ddaf517d4cbbe2a637771ac89 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2331,6 +2331,7 @@ config MIPS_CPS
 
 config MIPS_CPS_CLUSTER_MAX
        int "Maximum cluster supported"
+       default 2 if EYEQ
        default 4
        help
 	Maximum number of cluster available on the SoC.

-- 
2.47.2



* [PATCH v2 3/3] MIPS: CPS: Optimise delay CPU calibration for SMP
  2025-07-04 15:13 [PATCH v2 0/3] MIPS: CPS: Optimise delay CPU calibration and cluster helper function Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster() Gregory CLEMENT
  2025-07-04 15:13 ` [PATCH v2 2/3] MIPS: CPS: Change default cluster value for EyeQ SoCs Gregory CLEMENT
@ 2025-07-04 15:13 ` Gregory CLEMENT
  2025-07-04 17:46   ` Jiaxun Yang
  2 siblings, 1 reply; 7+ messages in thread
From: Gregory CLEMENT @ 2025-07-04 15:13 UTC (permalink / raw)
  To: Thomas Bogendoerfer
  Cc: Jiaxun Yang, Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips, linux-kernel, Gregory CLEMENT

On MIPS architecture with CPS-based SMP support, all CPU cores in the
same cluster run at the same frequency since they share the same L2
cache, requiring a fixed CPU/L2 cache ratio.

This allows implementing calibrate_delay_is_known(), which returns
0 (triggering calibration) only for the primary CPU of each
cluster. For other CPUs, we can simply reuse the value from their
cluster's primary CPU core.

With this patch applied, a configuration running 32 cores spread
across two clusters boots approximately 600 milliseconds faster.

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
---
 arch/mips/kernel/smp-cps.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/mips/kernel/smp-cps.c b/arch/mips/kernel/smp-cps.c
index a5c538742769dcbf22e27d2d4485c071e2e64ec2..9c4882c3c69d20f15d4826f125e975e64e712e9b 100644
--- a/arch/mips/kernel/smp-cps.c
+++ b/arch/mips/kernel/smp-cps.c
@@ -283,6 +283,17 @@ static void __init cps_smp_setup(void)
 #endif /* CONFIG_MIPS_MT_FPAFF */
 }
 
+unsigned long calibrate_delay_is_known(void)
+{
+	int first_cpu_cluster = 0;
+
+	/* The calibration has to be done on the primary CPU of the cluster */
+	if (mips_cps_first_online_in_cluster(&first_cpu_cluster))
+		return 0;
+
+	return cpu_data[first_cpu_cluster].udelay_val;
+}
+
 static void __init cps_prepare_cpus(unsigned int max_cpus)
 {
 	unsigned int nclusters, ncores, core_vpes, c, cl, cca;

-- 
2.47.2
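
The effect of the patch above can be illustrated with a small
user-space model: a CPU calibrates only when no other CPU of its
cluster is online yet, otherwise it copies the value already measured.
This is a sketch under assumptions, not the kernel code: the names, the
two clusters of 16 CPUs, the bring-up order and the fake calibration
values are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS          32u
#define CPUS_PER_CLUSTER 16u /* assumption: two clusters of 16 CPUs */

static uint32_t online_mask;              /* CPUs already brought up */
static unsigned long udelay_val[NR_CPUS]; /* per-CPU calibration result */
static unsigned int calibrations;         /* how many slow calibrations ran */

/* Mask of the CPUs that share a cluster with 'cpu'. */
static uint32_t cluster_mask_of(unsigned int cpu)
{
	unsigned int cl = cpu / CPUS_PER_CLUSTER;

	return UINT32_C(0xffff) << (cl * CPUS_PER_CLUSTER);
}

/* Stand-in for the expensive calibration loop; the value is arbitrary. */
static unsigned long slow_calibrate(unsigned int cpu)
{
	calibrations++;
	return 1000000ul + cpu / CPUS_PER_CLUSTER;
}

/* Return a known value from an online sibling, or 0 to request calibration. */
static unsigned long delay_is_known(unsigned int cpu)
{
	uint32_t others = cluster_mask_of(cpu) & online_mask &
			  ~(UINT32_C(1) << cpu);
	unsigned int i;

	for (i = 0; i < NR_CPUS; i++)
		if (others & (UINT32_C(1) << i))
			return udelay_val[i];
	return 0;
}

/* Bring one CPU up: reuse a sibling's value if possible, then go online. */
static void bring_up(unsigned int cpu)
{
	unsigned long lpj;

	assert(cpu < NR_CPUS);
	lpj = delay_is_known(cpu);
	if (!lpj)
		lpj = slow_calibrate(cpu);
	udelay_val[cpu] = lpj;
	online_mask |= UINT32_C(1) << cpu;
}

/* Bring every CPU up in order and report how many calibrations ran. */
static unsigned int boot_all(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		bring_up(cpu);
	return calibrations;
}
```

In this model, calling boot_all() runs the slow calibration once per
cluster instead of once per CPU, which is where the boot-time saving
would come from.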



* Re: [PATCH v2 3/3] MIPS: CPS: Optimise delay CPU calibration for SMP
  2025-07-04 15:13 ` [PATCH v2 3/3] MIPS: CPS: Optimise delay CPU calibration for SMP Gregory CLEMENT
@ 2025-07-04 17:46   ` Jiaxun Yang
  0 siblings, 0 replies; 7+ messages in thread
From: Jiaxun Yang @ 2025-07-04 17:46 UTC (permalink / raw)
  To: Gregory CLEMENT, Thomas Bogendoerfer
  Cc: Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips@vger.kernel.org, linux-kernel



On Fri, Jul 4, 2025, at 4:13 PM, Gregory CLEMENT wrote:
> On MIPS architecture with CPS-based SMP support, all CPU cores in the
> same cluster run at the same frequency since they share the same L2
> cache, requiring a fixed CPU/L2 cache ratio.
>
> This allows to implement calibrate_delay_is_known(), which will return
> 0 (triggering calibration) only for the primary CPU of each
> cluster. For other CPUs, we can simply reuse the value from their
> cluster's primary CPU core.
>
> With the introduction of this patch, a configuration running 32 cores
> spread across two clusters sees a significant reduction in boot time
> by approximately 600 milliseconds.
>
> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>

Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>

Thanks!

-- 
- Jiaxun


* Re: [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster()
  2025-07-04 15:13 ` [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster() Gregory CLEMENT
@ 2025-07-04 18:03   ` Jiaxun Yang
  2025-07-07 13:40     ` Gregory CLEMENT
  0 siblings, 1 reply; 7+ messages in thread
From: Jiaxun Yang @ 2025-07-04 18:03 UTC (permalink / raw)
  To: Gregory CLEMENT, Thomas Bogendoerfer
  Cc: Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips@vger.kernel.org, linux-kernel



On Fri, Jul 4, 2025, at 4:13 PM, Gregory CLEMENT wrote:
> The initial implementation of this function goes through all the CPUs
> in a cluster to determine if the current CPU is the only one
> running. This process occurs every time the function is called.
>
> However, during boot, we already perform this task, so let's take
> advantage of this opportunity to create and fill a CPU bitmask that
> can be easily and efficiently used later.
>
> This requires creating a single CPU bitmask per cluster, which is why
> it's essential to know how many clusters can be utilized. The default
> setting is 4 clusters, but this value can be changed when configuring
> the kernel or even customized for a given SoC family.

Hmm, I think we should avoid this sort of random limitation.

You can actually store per cluster cpumask_var_t into `mips_cps_cluster_bootcfg`,
which is allocated in cps_prepare_cpus(), and generate cpumask just there.

It should be pretty straightforward to handle.

Thanks!

>
> This patch modifies the function to allow providing the first
> available online CPU when one already exists, which is necessary for
> delay CPU calibration optimization.
>
> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
> ---
[...]
-- 
- Jiaxun
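
One way to picture the suggestion above is a per-cluster record that is
allocated once the cluster count is known, so no compile-time maximum
is needed. The following is only a user-space sketch: struct
model_cluster_cfg, its cpumask field, the 64-CPU limit and the helper
names are hypothetical and do not reflect the real layout of
mips_cps_cluster_bootcfg or the kernel's cpumask_var_t API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 64u /* assumption: one 64-bit word covers every CPU */

/* Hypothetical per-cluster record; not the kernel's struct layout. */
struct model_cluster_cfg {
	uint64_t cpumask; /* CPUs that belong to this cluster */
};

/* Allocate one zeroed record per detected cluster; NULL on failure. */
static struct model_cluster_cfg *model_alloc_clusters(unsigned int nclusters)
{
	if (nclusters == 0)
		return NULL;
	return calloc(nclusters, sizeof(struct model_cluster_cfg));
}

/* Fill the masks assuming CPUs are numbered sequentially by cluster. */
static void model_fill_masks(struct model_cluster_cfg *cfg,
			     unsigned int nclusters,
			     unsigned int cpus_per_cluster)
{
	unsigned int cl, i, cpu = 0;

	assert((uint64_t)nclusters * cpus_per_cluster <= MODEL_NR_CPUS);
	for (cl = 0; cl < nclusters; cl++)
		for (i = 0; i < cpus_per_cluster; i++, cpu++)
			cfg[cl].cpumask |= UINT64_C(1) << cpu;
}
```

With this shape, a system reporting six clusters would simply get six
records, whereas the fixed array in the patch would clamp it to the
configured maximum.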


* Re: [PATCH v2 1/3] MIPS: CPS: Improve mips_cps_first_online_in_cluster()
  2025-07-04 18:03   ` Jiaxun Yang
@ 2025-07-07 13:40     ` Gregory CLEMENT
  0 siblings, 0 replies; 7+ messages in thread
From: Gregory CLEMENT @ 2025-07-07 13:40 UTC (permalink / raw)
  To: Jiaxun Yang, Thomas Bogendoerfer
  Cc: Vladimir Kondratiev, Théo Lebrun, Tawfik Bayouk,
	Thomas Petazzoni, linux-mips@vger.kernel.org, linux-kernel

Hello Jiaxun,

> On Fri, Jul 4, 2025, at 4:13 PM, Gregory CLEMENT wrote:
>> The initial implementation of this function goes through all the CPUs
>> in a cluster to determine if the current CPU is the only one
>> running. This process occurs every time the function is called.
>>
>> However, during boot, we already perform this task, so let's take
>> advantage of this opportunity to create and fill a CPU bitmask that
>> can be easily and efficiently used later.
>>
>> This requires creating a single CPU bitmask per cluster, which is why
>> it's essential to know how many clusters can be utilized. The default
>> setting is 4 clusters, but this value can be changed when configuring
>> the kernel or even customized for a given SoC family.
>
> Hmm, I think we should avoid this sort of random limitation.

It's not great, but it seemed like the best approach to keep optimized
boot times and memory consumption.

>
> You can actually store per cluster cpumask_var_t into `mips_cps_cluster_bootcfg`,
> which is allocated in cps_prepare_cpus(), and generate cpumask just
> there.
>
> It should be pretty straightforward to handle.
>

The drawback of this option is that cps_prepare_cpus() is called after
the topology is built. However, I am looking at how to fit this into
the loop already used in cps_prepare_cpus().

Gregory

> Thanks!
>
>>
>> This patch modifies the function to allow providing the first
>> available online CPU when one already exists, which is necessary for
>> delay CPU calibration optimization.
>>
>> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
>> ---
> [...]
> -- 
> - Jiaxun

-- 
Grégory CLEMENT, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

