* [RFC 00/10] arm64/riscv: Introduce fast kexec reboot
@ 2022-08-22 2:15 Pingfan Liu
2022-08-22 2:15 ` [RFC 02/10] cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on CONFIG_SHUTDOWN_NONBOOT_CPUS Pingfan Liu
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Pingfan Liu @ 2022-08-22 2:15 UTC (permalink / raw)
To: linux-arm-kernel, linux-ia64, linux-riscv, linux-kernel
Cc: Pingfan Liu, Thomas Gleixner, Steven Price,
Kuppuswamy Sathyanarayanan, Jason A. Donenfeld,
Frederic Weisbecker, Russell King, Catalin Marinas, Will Deacon,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Peter Zijlstra,
Eric W. Biederman
On an SMP arm64 machine, it may take a long time to kexec-reboot a new
kernel, and the time is linear in the number of CPUs. On an 80-CPU
machine, it takes about 15 seconds, while with this series, the time
drops dramatically to about one second.
*** Current situation 'slow kexec reboot' ***
At present, some architectures rely on smp_shutdown_nonboot_cpus() to
implement "kexec -e". Since smp_shutdown_nonboot_cpus() tears down the
cpus serially, it is very slow.
Taking a closer look, the cpu_down() process on a single CPU can be
divided approximately into two stages:
-1. from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU
-2. from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD
which is done by stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu))
and runs on the CPU being torn down.
If these stages can run in parallel, then the reboot can be sped up.
That is the aim of this series.
*** Contrast with other implementations ***
X86 and PowerPC have their own machine_shutdown(), which does not rely
on the cpu hot-removal mechanism. They simply single out a few critical
components and tear them down in a per-cpu NMI handler during the kexec
reboot. But for some architectures, say arm64, it is not easy to define
these critical components due to the various chipmakers' implementations.
As a result, sticking to the cpu hot-removal mechanism is the simplest
way to implement the teardown in parallel.
*** Things worthy of consideration ***
1. The definition of a clean boundary between the first kernel and the new kernel
-1.1 firmware
The firmware's internal state should be brought into a proper state, so
that it can serve the new kernel. This is achieved by the firmware's
cpuhp_step teardown interface, if any.
-1.2 CPU internal state
Whether the cache or PMU needs a clean shutdown before rebooting.
2. The dependencies between cpuhp_steps
A clean cut involves only a few cpuhp_steps, but they may propagate
to other cpuhp_steps through dependencies. This series does not try to
judge the dependencies; instead, it just iterates downward through each
cpuhp_step. This strategy demands that each involved cpuhp_step's
teardown procedure support parallelism.
*** Solution ***
Ideally, if the interface _cpu_down() can be enhanced to enable
parallelism, then the fast reboot can be achieved.
But revisiting the two stages of the current cpu_down() process, the
second stage, stop_machine_cpuslocked(), is a blockade. Packed inside
_cpu_down(), stop_machine_cpuslocked() only allows one cpu to execute the
teardown.
So this patch breaks down the process of _cpu_down(), and divides the
teardown into three steps.
1. Send each AP from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU
in parallel.
2. Sync on BP to wait all APs to enter CPUHP_TEARDOWN_CPU state
3. Send each AP from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD by the
interface of stop_machine_cpuslocked() in parallel.
Finally, the exposed stop_machine_cpuslocked() can be used to support
parallelism.
Apparently, step 2 is introduced to satisfy the prerequisite under
which stop_machine_cpuslocked() can start on each cpu.
The remaining issue is how to support parallelism in steps 1 and 3.
Fortunately, each subsystem has its own carefully designed locking
scheme. In each cpuhp_step teardown interface, adapting to the
subsystem's locking rules will make things work.
*** No rollback if failure ***
During a kexec reboot, the devices have already been shut down, so there
is no way for the system to roll back to a workable state. Hence this
series does not consider rollback if a failure happens in cpu_down();
it just ventures on.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
To: linux-arm-kernel@lists.infradead.org
To: linux-ia64@vger.kernel.org
To: linux-riscv@lists.infradead.org
To: linux-kernel@vger.kernel.org
Pingfan Liu (10):
cpu/hotplug: Make __cpuhp_kick_ap() ready for async
cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on
CONFIG_SHUTDOWN_NONBOOT_CPUS
cpu/hotplug: Introduce fast kexec reboot
cpu/hotplug: Check the capability of kexec quick reboot
perf/arm-dsu: Make dsu_pmu_cpu_teardown() parallel
rcu/hotplug: Make rcutree_dead_cpu() parallel
lib/cpumask: Introduce cpumask_not_dying_but()
cpuhp: Replace cpumask_any_but(cpu_online_mask, cpu)
genirq/cpuhotplug: Ask migrate_one_irq() to migrate to a real online
cpu
arm64: smp: Make __cpu_disable() parallel
arch/Kconfig | 4 +
arch/arm/Kconfig | 1 +
arch/arm/mach-imx/mmdc.c | 2 +-
arch/arm/mm/cache-l2x0-pmu.c | 2 +-
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/smp.c | 31 +++-
arch/ia64/Kconfig | 1 +
arch/riscv/Kconfig | 1 +
drivers/dma/idxd/perfmon.c | 2 +-
drivers/fpga/dfl-fme-perf.c | 2 +-
drivers/gpu/drm/i915/i915_pmu.c | 2 +-
drivers/perf/arm-cci.c | 2 +-
drivers/perf/arm-ccn.c | 2 +-
drivers/perf/arm-cmn.c | 4 +-
drivers/perf/arm_dmc620_pmu.c | 2 +-
drivers/perf/arm_dsu_pmu.c | 16 +-
drivers/perf/arm_smmuv3_pmu.c | 2 +-
drivers/perf/fsl_imx8_ddr_perf.c | 2 +-
drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +-
drivers/perf/marvell_cn10k_tad_pmu.c | 2 +-
drivers/perf/qcom_l2_pmu.c | 2 +-
drivers/perf/qcom_l3_pmu.c | 2 +-
drivers/perf/xgene_pmu.c | 2 +-
drivers/soc/fsl/qbman/bman_portal.c | 2 +-
drivers/soc/fsl/qbman/qman_portal.c | 2 +-
include/linux/cpuhotplug.h | 2 +
include/linux/cpumask.h | 3 +
kernel/cpu.c | 213 ++++++++++++++++++++---
kernel/irq/cpuhotplug.c | 3 +-
kernel/rcu/tree.c | 3 +-
lib/cpumask.c | 18 ++
31 files changed, 281 insertions(+), 54 deletions(-)
--
2.31.1
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
* [RFC 02/10] cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on CONFIG_SHUTDOWN_NONBOOT_CPUS
2022-08-22 2:15 [RFC 00/10] arm64/riscv: Introduce fast kexec reboot Pingfan Liu
@ 2022-08-22 2:15 ` Pingfan Liu
2022-08-22 2:15 ` [RFC 03/10] cpu/hotplug: Introduce fast kexec reboot Pingfan Liu
2022-08-22 2:15 ` [RFC 04/10] cpu/hotplug: Check the capability of kexec quick reboot Pingfan Liu
2 siblings, 0 replies; 4+ messages in thread
From: Pingfan Liu @ 2022-08-22 2:15 UTC (permalink / raw)
To: linux-arm-kernel, linux-ia64, linux-riscv, linux-kernel
Cc: Pingfan Liu, Russell King, Catalin Marinas, Will Deacon,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Peter Zijlstra,
Eric W. Biederman, Mark Rutland, Marco Elver, Masami Hiramatsu,
Dan Li, Song Liu, Sami Tolvanen, Arnd Bergmann, Linus Walleij,
Ard Biesheuvel, Tony Lindgren, Nick Hawkins, John Crispin,
Geert Uytterhoeven, Andrew Morton, Bjorn Andersson,
Anshuman Khandual, Thomas Gleixner, Steven Price
Only arm/arm64/ia64/riscv share smp_shutdown_nonboot_cpus(). So compile
this code conditionally on the macro CONFIG_SHUTDOWN_NONBOOT_CPUS. Later
this macro will also guard the quick kexec reboot code.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dan Li <ashimida@linux.alibaba.com>
Cc: Song Liu <song@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Nick Hawkins <nick.hawkins@hpe.com>
Cc: John Crispin <john@phrozen.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Price <steven.price@arm.com>
To: linux-arm-kernel@lists.infradead.org
To: linux-ia64@vger.kernel.org
To: linux-riscv@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
arch/Kconfig | 4 ++++
arch/arm/Kconfig | 1 +
arch/arm64/Kconfig | 1 +
arch/ia64/Kconfig | 1 +
arch/riscv/Kconfig | 1 +
kernel/cpu.c | 3 +++
6 files changed, 11 insertions(+)
diff --git a/arch/Kconfig b/arch/Kconfig
index f330410da63a..be447537d0f6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -14,6 +14,10 @@ menu "General architecture-dependent options"
config CRASH_CORE
bool
+config SHUTDOWN_NONBOOT_CPUS
+ select KEXEC_CORE
+ bool
+
config KEXEC_CORE
select CRASH_CORE
bool
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 87badeae3181..711cfdb4f9f4 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -129,6 +129,7 @@ config ARM
select PCI_SYSCALL if PCI
select PERF_USE_VMALLOC
select RTC_LIB
+ select SHUTDOWN_NONBOOT_CPUS
select SYS_SUPPORTS_APM_EMULATION
select THREAD_INFO_IN_TASK
select HAVE_ARCH_VMAP_STACK if MMU && ARM_HAS_GROUP_RELOCS
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 571cc234d0b3..8c481a0b1829 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -223,6 +223,7 @@ config ARM64
select PCI_SYSCALL if PCI
select POWER_RESET
select POWER_SUPPLY
+ select SHUTDOWN_NONBOOT_CPUS
select SPARSE_IRQ
select SWIOTLB
select SYSCTL_EXCEPTION_TRACE
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 26ac8ea15a9e..8a3ddea97d1b 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -52,6 +52,7 @@ config IA64
select ARCH_CLOCKSOURCE_DATA
select GENERIC_TIME_VSYSCALL
select LEGACY_TIMER_TICK
+ select SHUTDOWN_NONBOOT_CPUS
select SWIOTLB
select SYSCTL_ARCH_UNALIGN_NO_WARN
select HAVE_MOD_ARCH_SPECIFIC
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ed66c31e4655..02606a48c5ea 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -120,6 +120,7 @@ config RISCV
select PCI_MSI if PCI
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
+ select SHUTDOWN_NONBOOT_CPUS
select SPARSE_IRQ
select SYSCTL_EXCEPTION_TRACE
select THREAD_INFO_IN_TASK
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 338e1d426c7e..2be6ba811a01 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1258,6 +1258,8 @@ int remove_cpu(unsigned int cpu)
}
EXPORT_SYMBOL_GPL(remove_cpu);
+#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS
+
void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
{
unsigned int cpu;
@@ -1299,6 +1301,7 @@ void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
cpu_maps_update_done();
}
+#endif
#else
#define takedown_cpu NULL
--
2.31.1
* [RFC 03/10] cpu/hotplug: Introduce fast kexec reboot
2022-08-22 2:15 [RFC 00/10] arm64/riscv: Introduce fast kexec reboot Pingfan Liu
2022-08-22 2:15 ` [RFC 02/10] cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on CONFIG_SHUTDOWN_NONBOOT_CPUS Pingfan Liu
@ 2022-08-22 2:15 ` Pingfan Liu
2022-08-22 2:15 ` [RFC 04/10] cpu/hotplug: Check the capability of kexec quick reboot Pingfan Liu
2 siblings, 0 replies; 4+ messages in thread
From: Pingfan Liu @ 2022-08-22 2:15 UTC (permalink / raw)
To: linux-arm-kernel, linux-ia64, linux-riscv, linux-kernel
Cc: Pingfan Liu, Thomas Gleixner, Steven Price,
Kuppuswamy Sathyanarayanan, Jason A. Donenfeld,
Frederic Weisbecker, Russell King, Catalin Marinas, Will Deacon,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Peter Zijlstra,
Eric W. Biederman
*** Current situation 'slow kexec reboot' ***
At present, some architectures rely on smp_shutdown_nonboot_cpus() to
implement "kexec -e". Since smp_shutdown_nonboot_cpus() tears down the
cpus serially, it is very slow.
Taking a closer look, the cpu_down() process on a single CPU can be
divided approximately into two stages:
-1. from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU
-2. from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD
which is done by stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu))
and runs on the CPU being torn down.
If these stages can run in parallel, then the reboot can be sped up.
That is the aim of this patch.
*** Contrast with other implementations ***
X86 and PowerPC have their own machine_shutdown(), which does not rely
on the cpu hot-removal mechanism. They simply single out a few critical
components and tear them down in a per-cpu NMI handler during the kexec
reboot. But for some architectures, say arm64, it is not easy to define
these critical components due to the various chipmakers' implementations.
As a result, sticking to the cpu hot-removal mechanism is the simplest
way to implement the teardown in parallel. It also opens an opportunity
to implement cpu_down() in parallel in the future (not done by this series).
*** Things worthy of consideration ***
1. The definition of a clean boundary between the first kernel and the new kernel
-1.1 firmware
The firmware's internal state should be brought into a proper state.
This is achieved by the firmware's cpuhp_step teardown interface,
if any.
-1.2 CPU internal state
Whether the cache or PMU needs a clean shutdown before rebooting.
2. The dependencies between cpuhp_steps
A clean cut involves only a few cpuhp_steps, but they may propagate
to other cpuhp_steps through dependencies. This series does not try to
judge the dependencies; instead, it just iterates downward through each
cpuhp_step. This strategy demands that each cpuhp_step's teardown
interface support parallelism.
*** Solution ***
Ideally, if the interface _cpu_down() can be enhanced to enable
parallelism, then the fast reboot can be achieved.
But revisiting the two stages of the current cpu_down() process, the
second stage, stop_machine_cpuslocked(), is a blockade. Packed inside
_cpu_down(), stop_machine_cpuslocked() only allows one cpu to execute the
teardown.
So this patch breaks down the process of _cpu_down() and divides the
teardown into three steps, so that the exposed stop_machine_cpuslocked()
can be used to support parallelism.
1. Bring each AP from CPUHP_ONLINE to CPUHP_TEARDOWN_CPU
in parallel.
2. Sync on BP to wait all APs to enter CPUHP_TEARDOWN_CPU state
3. Bring each AP from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD by the
interface of stop_machine_cpuslocked() in parallel.
Apparently, step 2 is introduced to satisfy the prerequisite under
which stop_machine_cpuslocked() can start on each cpu.
The remaining issue is how to support parallelism in steps 1 and 3.
Fortunately, each subsystem has its own carefully designed locking
scheme. In each cpuhp_step teardown interface, adapting to the
subsystem's locking rules will make things work.
*** No rollback if failure ***
During a kexec reboot, the devices have already been shut down, so there
is no way for the system to roll back to a workable state. Hence this
series also does not consider the rollback issue.
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
To: linux-arm-kernel@lists.infradead.org
To: linux-ia64@vger.kernel.org
To: linux-riscv@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
kernel/cpu.c | 139 +++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 129 insertions(+), 10 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2be6ba811a01..94ab2727d6bb 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1260,10 +1260,125 @@ EXPORT_SYMBOL_GPL(remove_cpu);
#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS
-void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
+/*
+ * Push all of the cpus to the state CPUHP_AP_ONLINE_IDLE.
+ * Since kexec-reboot has already shut down all devices, there is no way to
+ * roll back, so the cpus' teardown requires no rollback either; instead,
+ * just throw a warning.
+ */
+static void cpus_down_no_rollback(struct cpumask *cpus)
{
+ struct cpuhp_cpu_state *st;
unsigned int cpu;
+
+ /* launch ap work one by one, but not wait for completion */
+ for_each_cpu(cpu, cpus) {
+ st = per_cpu_ptr(&cpuhp_state, cpu);
+ /*
+ * If the current CPU state is in the range of the AP hotplug thread,
+ * then we need to kick the thread.
+ */
+ if (st->state > CPUHP_TEARDOWN_CPU) {
+ cpuhp_set_state(cpu, st, CPUHP_TEARDOWN_CPU);
+ /* Kick the AP work asynchronously so it runs in parallel; there is no way to roll back */
+ cpuhp_kick_ap_work_async(cpu);
+ }
+ }
+
+ /* wait for all ap work completion */
+ for_each_cpu(cpu, cpus) {
+ st = per_cpu_ptr(&cpuhp_state, cpu);
+ wait_for_ap_thread(st, st->bringup);
+ if (st->result)
+ pr_warn("cpu %u refuses to offline due to %d\n", cpu, st->result);
+ else if (st->state > CPUHP_TEARDOWN_CPU)
+ pr_warn("cpu %u refuses to offline, state: %d\n", cpu, st->state);
+ }
+}
+
+static int __takedown_cpu_cleanup(unsigned int cpu)
+{
+ struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
+
+ /*
+ * The teardown callback for CPUHP_AP_SCHED_STARTING will have removed
+ * all runnable tasks from the CPU, there's only the idle task left now
+ * that the migration thread is done doing the stop_machine thing.
+ *
+ * Wait for the stop thread to go away.
+ */
+ wait_for_ap_thread(st, false);
+ BUG_ON(st->state != CPUHP_AP_IDLE_DEAD);
+
+ hotplug_cpu__broadcast_tick_pull(cpu);
+ /* This actually kills the CPU. */
+ __cpu_die(cpu);
+
+ tick_cleanup_dead_cpu(cpu);
+ rcutree_migrate_callbacks(cpu);
+ return 0;
+}
+
+/*
+ * Callers must ensure that all AP threads have finished before calling
+ * this function.
+ */
+static void takedown_cpus_no_rollback(struct cpumask *cpus)
+{
+ struct cpuhp_cpu_state *st;
+ unsigned int cpu;
+
+ for_each_cpu(cpu, cpus) {
+ st = per_cpu_ptr(&cpuhp_state, cpu);
+ WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
+ /* takedown_cpu() is not invoked, so set the state manually */
+ st->state = CPUHP_AP_ONLINE;
+ cpuhp_set_state(cpu, st, CPUHP_AP_OFFLINE);
+ }
+
+ irq_lock_sparse();
+ /* ask stopper kthreads to execute take_cpu_down() in parallel */
+ stop_machine_cpuslocked(take_cpu_down, NULL, cpus);
+
+ /* Finally wait for completion and clean up */
+ for_each_cpu(cpu, cpus)
+ __takedown_cpu_cleanup(cpu);
+ irq_unlock_sparse();
+}
+
+static bool check_quick_reboot(void)
+{
+ return false;
+}
+
+static struct cpumask kexec_ap_map;
+
+void smp_shutdown_nonboot_cpus_quick_path(unsigned int primary_cpu)
+{
+ struct cpumask *cpus = &kexec_ap_map;
+ /*
+ * Prevent other subsystems from accessing __cpu_online_mask; internally,
+ * __cpu_disable() accesses the bitmap in parallel and needs its own local lock.
+ */
+ cpus_write_lock();
+
+ cpumask_copy(cpus, cpu_online_mask);
+ cpumask_clear_cpu(primary_cpu, cpus);
+ cpus_down_no_rollback(cpus);
+ takedown_cpus_no_rollback(cpus);
+ /*
+ * For some subsystems, there are still leftover callbacks for offline
+ * cpus from CPUHP_BRINGUP_CPU to CPUHP_OFFLINE. But since none of them
+ * interact with hardware or firmware, they have no effect on the new
+ * kernel, so the cpuhp callbacks in that range are skipped.
+ */
+
+ cpus_write_unlock();
+}
+
+void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
+{
int error;
+ unsigned int cpu;
cpu_maps_update_begin();
@@ -1275,15 +1390,19 @@ void smp_shutdown_nonboot_cpus(unsigned int primary_cpu)
if (!cpu_online(primary_cpu))
primary_cpu = cpumask_first(cpu_online_mask);
- for_each_online_cpu(cpu) {
- if (cpu == primary_cpu)
- continue;
-
- error = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
- if (error) {
- pr_err("Failed to offline CPU%d - error=%d",
- cpu, error);
- break;
+ if (check_quick_reboot()) {
+ smp_shutdown_nonboot_cpus_quick_path(primary_cpu);
+ } else {
+ for_each_online_cpu(cpu) {
+ if (cpu == primary_cpu)
+ continue;
+
+ error = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
+ if (error) {
+ pr_err("Failed to offline CPU%d - error=%d",
+ cpu, error);
+ break;
+ }
}
}
--
2.31.1
* [RFC 04/10] cpu/hotplug: Check the capability of kexec quick reboot
2022-08-22 2:15 [RFC 00/10] arm64/riscv: Introduce fast kexec reboot Pingfan Liu
2022-08-22 2:15 ` [RFC 02/10] cpu/hotplug: Compile smp_shutdown_nonboot_cpus() conditioned on CONFIG_SHUTDOWN_NONBOOT_CPUS Pingfan Liu
2022-08-22 2:15 ` [RFC 03/10] cpu/hotplug: Introduce fast kexec reboot Pingfan Liu
@ 2022-08-22 2:15 ` Pingfan Liu
2 siblings, 0 replies; 4+ messages in thread
From: Pingfan Liu @ 2022-08-22 2:15 UTC (permalink / raw)
To: linux-arm-kernel, linux-ia64, linux-riscv, linux-kernel
Cc: Pingfan Liu, Thomas Gleixner, Steven Price,
Kuppuswamy Sathyanarayanan, Jason A. Donenfeld,
Frederic Weisbecker, Russell King, Catalin Marinas, Will Deacon,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Peter Zijlstra,
Eric W. Biederman
The kexec quick reboot needs each involved cpuhp_step to run its
teardown in parallel.
There are lots of teardown cpuhp_steps, but not all of them belong to
the arm/arm64/riscv kexec reboot path. So introduce a member
'support_kexec_parallel' in struct cpuhp_step to signal whether the
teardown supports parallelism. If a cpuhp_step is used on the kexec
reboot path, it needs to support parallelism to enable the quick reboot.
The function check_quick_reboot() checks all teardown cpuhp_steps and
reports any unsupported ones.
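The gating logic can be sketched in userspace as follows. This is a
simplified stand-in under stated assumptions: `struct step` and
`dummy_teardown()` are invented here to mirror struct cpuhp_step, not
the kernel's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for struct cpuhp_step */
struct step {
	const char *name;
	int (*teardown)(unsigned int cpu);
	bool support_kexec_parallel;
};

/* Placeholder teardown callback for the example table below */
int dummy_teardown(unsigned int cpu) { (void)cpu; return 0; }

/*
 * Walk the steps; the quick reboot is allowed only if every step that
 * has a teardown callback is flagged as parallel-safe. Offenders are
 * reported, as the kernel patch does with pr_info().
 */
bool check_quick_reboot(const struct step *steps, size_t n)
{
	bool ret = true;
	size_t i;

	for (i = 0; i < n; i++) {
		if (steps[i].teardown == NULL)
			continue;	/* no teardown: cannot block the quick path */
		if (!steps[i].support_kexec_parallel) {
			printf("step %s does not support parallel teardown\n",
			       steps[i].name);
			ret = false;
		}
	}
	return ret;
}
```

Note the fail-safe design choice: a single step lacking the flag makes
the whole check fail, so the kernel falls back to the slow serial path
rather than risking an unsafe parallel teardown.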
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Price <steven.price@arm.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
To: linux-arm-kernel@lists.infradead.org
To: linux-ia64@vger.kernel.org
To: linux-riscv@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
include/linux/cpuhotplug.h | 2 ++
kernel/cpu.c | 28 +++++++++++++++++++++++++++-
2 files changed, 29 insertions(+), 1 deletion(-)
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index f61447913db9..73093fc15aec 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -374,6 +374,8 @@ static inline int cpuhp_setup_state_multi(enum cpuhp_state state,
(void *) teardown, true);
}
+void cpuhp_set_step_parallel(enum cpuhp_state state);
+
int __cpuhp_state_add_instance(enum cpuhp_state state, struct hlist_node *node,
bool invoke);
int __cpuhp_state_add_instance_cpuslocked(enum cpuhp_state state,
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 94ab2727d6bb..1261c3f3be51 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -137,6 +137,9 @@ struct cpuhp_step {
/* public: */
bool cant_stop;
bool multi_instance;
+#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS
+ bool support_kexec_parallel;
+#endif
};
static DEFINE_MUTEX(cpuhp_state_mutex);
@@ -147,6 +150,14 @@ static struct cpuhp_step *cpuhp_get_step(enum cpuhp_state state)
return cpuhp_hp_states + state;
}
+#ifdef CONFIG_SHUTDOWN_NONBOOT_CPUS
+void cpuhp_set_step_parallel(enum cpuhp_state state)
+{
+ cpuhp_hp_states[state].support_kexec_parallel = true;
+}
+EXPORT_SYMBOL(cpuhp_set_step_parallel);
+#endif
+
static bool cpuhp_step_empty(bool bringup, struct cpuhp_step *step)
{
return bringup ? !step->startup.single : !step->teardown.single;
@@ -1347,7 +1358,22 @@ static void takedown_cpus_no_rollback(struct cpumask *cpus)
static bool check_quick_reboot(void)
{
- return false;
+ struct cpuhp_step *step;
+ enum cpuhp_state state;
+ bool ret = true;
+
+ for (state = CPUHP_ONLINE; state >= CPUHP_AP_OFFLINE; state--) {
+ step = cpuhp_get_step(state);
+ if (step->teardown.single == NULL)
+ continue;
+ if (step->support_kexec_parallel == false) {
+ pr_info("cpuhp state:%d, %s, does not support cpudown in parallel\n",
+ state, step->name);
+ ret = false;
+ }
+ }
+
+ return ret;
}
static struct cpumask kexec_ap_map;
--
2.31.1