public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/3] LoongArch: CPU parallel bring up
@ 2024-07-15 13:35 Jiaxun Yang
  2024-07-15 13:35 ` [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT Jiaxun Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Jiaxun Yang @ 2024-07-15 13:35 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

Hi all,

This series implements parallel CPU bring up for LoongArch.

As LoongArch is the first non-x86 arch to enable it, we need to fix
some generic infra in patches 1 and 2, then implement everything in
patch 3.

Please review.
Thanks

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
Changes in v2:
- Use stub functions (tglx)
- Link to v1: https://lore.kernel.org/r/20240705-loongarch-hotplug-v1-0-67d9c4709aa9@flygoat.com

---
Jiaxun Yang (3):
      cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT
      cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup
      LoongArch: SMP: Implement parallel CPU bring up

 arch/loongarch/Kconfig              |  1 +
 arch/loongarch/include/asm/smp.h    |  6 ------
 arch/loongarch/kernel/asm-offsets.c | 10 ----------
 arch/loongarch/kernel/head.S        |  7 ++++---
 arch/loongarch/kernel/smp.c         | 35 +++++++----------------------------
 kernel/cpu.c                        | 16 ++++++++++++++++
 6 files changed, 28 insertions(+), 47 deletions(-)
---
base-commit: 82e4255305c554b0bb18b7ccf2db86041b4c8b6e
change-id: 20240704-loongarch-hotplug-3f8826b88a43

Best regards,
-- 
Jiaxun Yang <jiaxun.yang@flygoat.com>



* [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT
  2024-07-15 13:35 [PATCH v2 0/3] LoongArch: CPU parallel bring up Jiaxun Yang
@ 2024-07-15 13:35 ` Jiaxun Yang
  2024-07-16  4:45   ` Thomas Gleixner
  2024-07-15 13:35 ` [PATCH v2 2/3] cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup Jiaxun Yang
  2024-07-15 13:35 ` [PATCH v2 3/3] LoongArch: SMP: Implement parallel CPU bring up Jiaxun Yang
  2 siblings, 1 reply; 7+ messages in thread
From: Jiaxun Yang @ 2024-07-15 13:35 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

Provide stub function for smt related parallel bring up functions
so that HOTPLUG_PARALLEL can work without HOTPLUG_PARALLEL.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
v2: Use stub function (tglx)
---
 kernel/cpu.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1209ddaec026..c89e0e91379a 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1808,6 +1808,7 @@ static int __init parallel_bringup_parse_param(char *arg)
 }
 early_param("cpuhp.parallel", parallel_bringup_parse_param);
 
+#ifdef CONFIG_HOTPLUG_SMT
 static inline bool cpuhp_smt_aware(void)
 {
 	return cpu_smt_max_threads > 1;
@@ -1817,6 +1818,16 @@ static inline const struct cpumask *cpuhp_get_primary_thread_mask(void)
 {
 	return cpu_primary_thread_mask;
 }
+#else
+static inline bool cpuhp_smt_aware(void)
+{
+	return false;
+}
+static inline const struct cpumask *cpuhp_get_primary_thread_mask(void)
+{
+	return cpu_none_mask;
+}
+#endif
 
 /*
  * On architectures which have enabled parallel bringup this invokes all BP

-- 
2.45.2
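
The stub approach in the patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: HAVE_SMT, smt_aware() and bringup_passes() are hypothetical names standing in for CONFIG_HOTPLUG_SMT, cpuhp_smt_aware() and the caller that decides whether primary threads are brought up in a separate pass. The point is that the caller stays free of #ifdef once both variants of the helper exist.

```c
#include <stdbool.h>

/*
 * HAVE_SMT is a hypothetical build-time switch standing in for
 * CONFIG_HOTPLUG_SMT. Leave it undefined to get the stub.
 */
#ifdef HAVE_SMT
static unsigned int smt_max_threads = 2;

static inline bool smt_aware(void)
{
	return smt_max_threads > 1;
}
#else
/* Stub: without SMT support there is never a sibling thread to care about */
static inline bool smt_aware(void)
{
	return false;
}
#endif

/*
 * Caller without any #ifdef: two passes (primary threads first, then
 * the siblings) when SMT aware, a single pass otherwise.
 */
unsigned int bringup_passes(void)
{
	return smt_aware() ? 2 : 1;
}
```

Built without HAVE_SMT, bringup_passes() takes the single-pass branch, which is the behaviour the stubs are meant to give an architecture that selects HOTPLUG_PARALLEL but not HOTPLUG_SMT.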



* [PATCH v2 2/3] cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup
  2024-07-15 13:35 [PATCH v2 0/3] LoongArch: CPU parallel bring up Jiaxun Yang
  2024-07-15 13:35 ` [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT Jiaxun Yang
@ 2024-07-15 13:35 ` Jiaxun Yang
  2024-07-16  4:53   ` Thomas Gleixner
  2024-07-15 13:35 ` [PATCH v2 3/3] LoongArch: SMP: Implement parallel CPU bring up Jiaxun Yang
  2 siblings, 1 reply; 7+ messages in thread
From: Jiaxun Yang @ 2024-07-15 13:35 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

It is a general assumption that architectures entitled to parallel
bringup with CONFIG_HOTPLUG_PARALLEL do expect parallel bringup to
be available.

Provide a weak fallback arch_cpuhp_init_parallel_bringup function
to match this assumption.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
 kernel/cpu.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index c89e0e91379a..16323610cd20 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1829,6 +1829,11 @@ static inline const struct cpumask *cpuhp_get_primary_thread_mask(void)
 }
 #endif
 
+bool __weak arch_cpuhp_init_parallel_bringup(void)
+{
+	return true;
+}
+
 /*
  * On architectures which have enabled parallel bringup this invokes all BP
  * prepare states for each of the to be onlined APs first. The last state

-- 
2.45.2



* [PATCH v2 3/3] LoongArch: SMP: Implement parallel CPU bring up
  2024-07-15 13:35 [PATCH v2 0/3] LoongArch: CPU parallel bring up Jiaxun Yang
  2024-07-15 13:35 ` [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT Jiaxun Yang
  2024-07-15 13:35 ` [PATCH v2 2/3] cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup Jiaxun Yang
@ 2024-07-15 13:35 ` Jiaxun Yang
  2024-07-16  4:56   ` Thomas Gleixner
  2 siblings, 1 reply; 7+ messages in thread
From: Jiaxun Yang @ 2024-07-15 13:35 UTC (permalink / raw)
  To: Thomas Gleixner, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

Implement parallel CPU bring up for LoongArch to reduce the boot
time spent bringing up CPUs.

On my Loongson-3A5000, a boot time improvement of ~120ms is observed.

The tp and sp register values are now passed via MBUF to avoid
racing on the global cpuboot_data struct.

cpu_running completion is handled by HOTPLUG_CORE_SYNC_FULL.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
---
 arch/loongarch/Kconfig              |  1 +
 arch/loongarch/include/asm/smp.h    |  6 ------
 arch/loongarch/kernel/asm-offsets.c | 10 ----------
 arch/loongarch/kernel/head.S        |  7 ++++---
 arch/loongarch/kernel/smp.c         | 35 +++++++----------------------------
 5 files changed, 12 insertions(+), 47 deletions(-)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index ddc042895d01..656435c1dbd5 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -162,6 +162,7 @@ config LOONGARCH
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_TIF_NOHZ
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN if !SMP
+	select HOTPLUG_PARALLEL if SMP
 	select IRQ_FORCED_THREADING
 	select IRQ_LOONGARCH_CPU
 	select LOCK_MM_AND_FIND_VMA
diff --git a/arch/loongarch/include/asm/smp.h b/arch/loongarch/include/asm/smp.h
index 50db503f44e3..f6953cb16492 100644
--- a/arch/loongarch/include/asm/smp.h
+++ b/arch/loongarch/include/asm/smp.h
@@ -75,12 +75,6 @@ extern int __cpu_logical_map[NR_CPUS];
 #define SMP_CALL_FUNCTION	BIT(ACTION_CALL_FUNCTION)
 #define SMP_IRQ_WORK		BIT(ACTION_IRQ_WORK)
 
-struct secondary_data {
-	unsigned long stack;
-	unsigned long thread_info;
-};
-extern struct secondary_data cpuboot_data;
-
 extern asmlinkage void smpboot_entry(void);
 extern asmlinkage void start_secondary(void);
 
diff --git a/arch/loongarch/kernel/asm-offsets.c b/arch/loongarch/kernel/asm-offsets.c
index bee9f7a3108f..598498f47a4c 100644
--- a/arch/loongarch/kernel/asm-offsets.c
+++ b/arch/loongarch/kernel/asm-offsets.c
@@ -257,16 +257,6 @@ static void __used output_signal_defines(void)
 	BLANK();
 }
 
-#ifdef CONFIG_SMP
-static void __used output_smpboot_defines(void)
-{
-	COMMENT("Linux smp cpu boot offsets.");
-	OFFSET(CPU_BOOT_STACK, secondary_data, stack);
-	OFFSET(CPU_BOOT_TINFO, secondary_data, thread_info);
-	BLANK();
-}
-#endif
-
 #ifdef CONFIG_HIBERNATION
 static void __used output_pbe_defines(void)
 {
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index fdb831dc64df..8dd8fb450f46 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -136,9 +136,10 @@ SYM_CODE_START(smpboot_entry)
 	li.w		t0, 0x00		# FPE=0, SXE=0, ASXE=0, BTE=0
 	csrwr		t0, LOONGARCH_CSR_EUEN
 
-	la.pcrel	t0, cpuboot_data
-	ld.d		sp, t0, CPU_BOOT_STACK
-	ld.d		tp, t0, CPU_BOOT_TINFO
+	li.w		t0, LOONGARCH_IOCSR_MBUF1
+	iocsrrd.d	sp, t0
+	li.w		t0, LOONGARCH_IOCSR_MBUF2
+	iocsrrd.d	tp, t0
 
 	bl		start_secondary
 	ASM_BUG()
diff --git a/arch/loongarch/kernel/smp.c b/arch/loongarch/kernel/smp.c
index ca405ab86aae..ca6a95a0280d 100644
--- a/arch/loongarch/kernel/smp.c
+++ b/arch/loongarch/kernel/smp.c
@@ -48,10 +48,6 @@ EXPORT_SYMBOL(cpu_sibling_map);
 /* Representing the core map of multi-core chips of each logical CPU */
 cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(cpu_core_map);
-
-static DECLARE_COMPLETION(cpu_starting);
-static DECLARE_COMPLETION(cpu_running);
-
 /*
  * A logcal cpu mask containing only one VPE per core to
  * reduce the number of IPIs on large MT systems.
@@ -65,7 +61,6 @@ static cpumask_t cpu_sibling_setup_map;
 /* representing cpus for which core maps can be computed */
 static cpumask_t cpu_core_setup_map;
 
-struct secondary_data cpuboot_data;
 static DEFINE_PER_CPU(int, cpu_state);
 
 static const char *ipi_types[NR_IPI] __tracepoint_string = {
@@ -340,14 +335,16 @@ void __init loongson_prepare_cpus(unsigned int max_cpus)
  */
 void loongson_boot_secondary(int cpu, struct task_struct *idle)
 {
-	unsigned long entry;
+	unsigned long entry, stack, thread_info;
 
 	pr_info("Booting CPU#%d...\n", cpu);
 
 	entry = __pa_symbol((unsigned long)&smpboot_entry);
-	cpuboot_data.stack = (unsigned long)__KSTK_TOS(idle);
-	cpuboot_data.thread_info = (unsigned long)task_thread_info(idle);
+	stack = (unsigned long)__KSTK_TOS(idle);
+	thread_info = (unsigned long)task_thread_info(idle);
 
+	csr_mail_send(thread_info, cpu_logical_map(cpu), 2);
+	csr_mail_send(stack, cpu_logical_map(cpu), 1);
 	csr_mail_send(entry, cpu_logical_map(cpu), 0);
 
 	loongson_send_ipi_single(cpu, ACTION_BOOT_CPU);
@@ -525,20 +522,10 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 #endif
 }
 
-int __cpu_up(unsigned int cpu, struct task_struct *tidle)
+int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle)
 {
 	loongson_boot_secondary(cpu, tidle);
 
-	/* Wait for CPU to start and be ready to sync counters */
-	if (!wait_for_completion_timeout(&cpu_starting,
-					 msecs_to_jiffies(5000))) {
-		pr_crit("CPU%u: failed to start\n", cpu);
-		return -EIO;
-	}
-
-	/* Wait for CPU to finish startup & mark itself online before return */
-	wait_for_completion(&cpu_running);
-
 	return 0;
 }
 
@@ -561,22 +548,14 @@ asmlinkage void start_secondary(void)
 	set_cpu_sibling_map(cpu);
 	set_cpu_core_map(cpu);
 
+	cpuhp_ap_sync_alive();
 	notify_cpu_starting(cpu);
 
-	/* Notify boot CPU that we're starting */
-	complete(&cpu_starting);
-
 	/* The CPU is running, now mark it online */
 	set_cpu_online(cpu, true);
 
 	calculate_cpu_foreign_map();
 
-	/*
-	 * Notify boot CPU that we're up & online and it can safely return
-	 * from __cpu_up()
-	 */
-	complete(&cpu_running);
-
 	/*
 	 * irq will be enabled in loongson_smp_finish(), enabling it too
 	 * early is dangerous.

-- 
2.45.2
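
The reason per-CPU mailboxes avoid the race can be shown with a small userspace model. This is only a sketch under stated assumptions: the mbuf array, mail_send(), kick_cpu() and read_boot_args() are hypothetical stand-ins for the IOCSR mailbox registers, csr_mail_send(), loongson_boot_secondary() and the reads in smpboot_entry. The mailbox numbering (0 = entry, 1 = sp, 2 = tp) follows the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS	4
#define NR_MBUF	4

/* Hypothetical model: every CPU owns a private set of mailbox slots */
static uint64_t mbuf[NR_CPUS][NR_MBUF];

struct boot_args {
	uint64_t entry;
	uint64_t sp;
	uint64_t tp;
};

static void mail_send(uint64_t data, int cpu, int mailbox)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(mailbox >= 0 && mailbox < NR_MBUF);
	mbuf[cpu][mailbox] = data;
}

/* Boot CPU side: tp and sp go out first, the entry point last */
void kick_cpu(int cpu, uint64_t entry, uint64_t sp, uint64_t tp)
{
	mail_send(tp, cpu, 2);
	mail_send(sp, cpu, 1);
	mail_send(entry, cpu, 0);
}

/* Secondary CPU side: each CPU reads only its own mailboxes */
struct boot_args read_boot_args(int cpu)
{
	struct boot_args args = {
		.entry = mbuf[cpu][0],
		.sp    = mbuf[cpu][1],
		.tp    = mbuf[cpu][2],
	};

	return args;
}
```

In this model, kicking CPU 2 before CPU 1 has consumed its arguments leaves CPU 1's values intact. With a single shared struct the second kick would overwrite the first, which is the race the commit message refers to.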



* Re: [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT
  2024-07-15 13:35 ` [PATCH v2 1/3] cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT Jiaxun Yang
@ 2024-07-16  4:45   ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2024-07-16  4:45 UTC (permalink / raw)
  To: Jiaxun Yang, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

On Mon, Jul 15 2024 at 21:35, Jiaxun Yang wrote:
> Provide stub function for smt related parallel bring up functions
> so that HOTPLUG_PARALLEL can work without HOTPLUG_PARALLEL.

That sentence does not make any sense. Also please use SMT instead of smt

Thanks,

        tglx


* Re: [PATCH v2 2/3] cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup
  2024-07-15 13:35 ` [PATCH v2 2/3] cpu/hotplug: Weak fallback for arch_cpuhp_init_parallel_bringup Jiaxun Yang
@ 2024-07-16  4:53   ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2024-07-16  4:53 UTC (permalink / raw)
  To: Jiaxun Yang, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

On Mon, Jul 15 2024 at 21:35, Jiaxun Yang wrote:
> It is a general assumption that architectures entitled to parallel
> bringup with CONFIG_HOTPLUG_PARALLEL do expect parallel bringup to
> be available.

I can't parse that sentence.

> Provide a weak fallback arch_cpuhp_init_parallel_bringup function
> to match this assumption.

I assume you want to say something like this:

 CONFIG_HOTPLUG_PARALLEL expects the architecture to implement
 arch_cpuhp_init_parallel_bringup() to decide whether parallel hotplug
 is possible and to do the necessary architecture specific
 initialization.

 There are architectures which can enable it unconditionally and do not
 require architecture specific initialization.

 Provide a weak fallback for arch_cpuhp_init_parallel_bringup() so that
 such architectures are not forced to implement empty stub functions.

Thanks,

        tglx
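
The weak fallback described in the reply above can be sketched in userspace C. This is an illustration only, assuming a GCC or Clang toolchain that supports __attribute__((weak)): arch_init_parallel_bringup() and parallel_bringup_allowed() are hypothetical names, not the kernel functions.

```c
#include <stdbool.h>

/*
 * Weak default: used only when no other object file provides a strong
 * definition of the same symbol. An "architecture" that needs a real
 * check or extra setup simply defines its own version.
 */
__attribute__((weak)) bool arch_init_parallel_bringup(void)
{
	return true;
}

/*
 * Generic side: parallel bring up happens only if it was requested
 * and the architecture hook agrees.
 */
bool parallel_bringup_allowed(bool requested)
{
	return requested && arch_init_parallel_bringup();
}
```

With no strong override linked in, the hook returns true and the decision depends only on the request, so an architecture without special requirements does not need to carry an empty stub.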


* Re: [PATCH v2 3/3] LoongArch: SMP: Implement parallel CPU bring up
  2024-07-15 13:35 ` [PATCH v2 3/3] LoongArch: SMP: Implement parallel CPU bring up Jiaxun Yang
@ 2024-07-16  4:56   ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2024-07-16  4:56 UTC (permalink / raw)
  To: Jiaxun Yang, Peter Zijlstra, Huacai Chen, WANG Xuerui
  Cc: linux-kernel, loongarch, Jiaxun Yang

On Mon, Jul 15 2024 at 21:35, Jiaxun Yang wrote:
>   */
>  void loongson_boot_secondary(int cpu, struct task_struct *idle)
>  {
> -	unsigned long entry;
> +	unsigned long entry, stack, thread_info;
>  
>  	pr_info("Booting CPU#%d...\n", cpu);
>  
>  	entry = __pa_symbol((unsigned long)&smpboot_entry);
> -	cpuboot_data.stack = (unsigned long)__KSTK_TOS(idle);
> -	cpuboot_data.thread_info = (unsigned long)task_thread_info(idle);
> +	stack = (unsigned long)__KSTK_TOS(idle);
> +	thread_info = (unsigned long)task_thread_info(idle);
>  
> +	csr_mail_send(thread_info, cpu_logical_map(cpu), 2);
> +	csr_mail_send(stack, cpu_logical_map(cpu), 1);
>  	csr_mail_send(entry, cpu_logical_map(cpu), 0);
>  
>  	loongson_send_ipi_single(cpu, ACTION_BOOT_CPU);
> @@ -525,20 +522,10 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>  #endif
>  }
>  
> -int __cpu_up(unsigned int cpu, struct task_struct *tidle)
> +int arch_cpuhp_kick_ap_alive(unsigned int cpu, struct task_struct *tidle)
>  {
>  	loongson_boot_secondary(cpu, tidle);

What's the point of this indirection and why is
loongson_boot_secondary() global? The only caller is this function, no?
  
Thanks,

        tglx

