From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 17 Apr 2023 16:50:53 +0100
From: Mark Rutland
To: Thomas Gleixner
Cc: LKML, x86@kernel.org, David Woodhouse, Andrew Cooper, Brian Gerst,
    Arjan van de Veen, Paolo Bonzini, Paul McKenney, Tom Lendacky,
    Sean Christopherson, Oleksandr Natalenko, Paul Menzel,
    "Guilherme G. Piccoli", Piotr Gorski, Catalin Marinas, Will Deacon,
    linux-arm-kernel@lists.infradead.org, David Woodhouse, Usama Arif,
    Juergen Gross, Boris Ostrovsky, xen-devel@lists.xenproject.org,
    Russell King, Arnd Bergmann, Guo Ren, linux-csky@vger.kernel.org,
    Thomas Bogendoerfer, linux-mips@vger.kernel.org,
    "James E.J. Bottomley", Helge Deller, linux-parisc@vger.kernel.org,
    Paul Walmsley, Palmer Dabbelt, linux-riscv@lists.infradead.org,
    Sabin Rapan
Subject: Re: [patch 22/37] arm64: smp: Switch to hotplug core state synchronization
References: <20230414225551.858160935@linutronix.de> <20230414232310.569498144@linutronix.de>
In-Reply-To: <20230414232310.569498144@linutronix.de>

On Sat, Apr 15, 2023 at 01:44:49AM +0200, Thomas Gleixner wrote:
> Switch to the CPU hotplug core state tracking and synchronization
> mechanism. No functional change intended.
>
> Signed-off-by: Thomas Gleixner
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: linux-arm-kernel@lists.infradead.org

I gave this a spin on arm64 (in a 64-vCPU VM on an M1 host), and it seems to
work fine with a bunch of vCPUs being hotplugged off and on again randomly.
FWIW:

Tested-by: Mark Rutland

I also hacked the code to have the dying CPU spin forever before the call to
cpuhp_ap_report_dead(). In that case I see a warning, we don't call
arch_cpuhp_cleanup_dead_cpu(), and the CPU is marked as offline (per
/sys/devices/system/cpu/$N/online).

As a tangent/aside, we might need to improve that for confidential compute
architectures, and we might want to generically track CPUs which might still
be using kernel text/data. On arm64 we ensure that via our cpu_kill()
callback (which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX
and/or SEV-SNP have a similar mechanism. Otherwise, a malicious hypervisor
can pause a vCPU just before it leaves the kernel (e.g. immediately after
the arch_cpuhp_cleanup_dead_cpu() call), wait for a kexec (or reuse of
stack memory), and unpause the vCPU to cause things to blow up.

Thanks,
Mark.
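For reference, the random offline/online exercise described above can be
driven with a small script along these lines. This is only a sketch: it
assumes contiguous CPU numbering with more than one CPU, skips CPU0 (which
is frequently not hot-removable), and the sysfs directory can be pointed at
a mock tree for a dry run:

```shell
#!/bin/bash
# hotplug_stress: randomly offline/online secondary CPUs via sysfs.
#   $1 - iterations (default 100)
#   $2 - sysfs cpu directory (default /sys/devices/system/cpu;
#        point this at a mock tree to dry-run without root)
hotplug_stress() {
    local iterations="${1:-100}"
    local sysfs="${2:-/sys/devices/system/cpu}"
    local cpus=( "$sysfs"/cpu[0-9]* )
    local ncpus=${#cpus[@]}
    local i n state

    for ((i = 0; i < iterations; i++)); do
        # Pick a random CPU other than CPU0 and flip its state.
        n=$(( (RANDOM % (ncpus - 1)) + 1 ))
        state=$(( RANDOM % 2 ))
        echo "$state" > "$sysfs/cpu$n/online"
    done

    # Leave everything online when we are done.
    for ((n = 1; n < ncpus; n++)); do
        echo 1 > "$sysfs/cpu$n/online"
    done
}

# Usage (as root): hotplug_stress 1000
```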
> ---
>  arch/arm64/Kconfig           |    1 +
>  arch/arm64/include/asm/smp.h |    2 +-
>  arch/arm64/kernel/smp.c      |   14 +++++---------
>  3 files changed, 7 insertions(+), 10 deletions(-)
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -216,6 +216,7 @@ config ARM64
>  	select HAVE_KPROBES
>  	select HAVE_KRETPROBES
>  	select HAVE_GENERIC_VDSO
> +	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
>  	select IRQ_DOMAIN
>  	select IRQ_FORCED_THREADING
>  	select KASAN_VMALLOC if KASAN
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -99,7 +99,7 @@ static inline void arch_send_wakeup_ipi_
>
>  extern int __cpu_disable(void);
>
> -extern void __cpu_die(unsigned int cpu);
> +static inline void __cpu_die(unsigned int cpu) { }
>  extern void cpu_die(void);
>  extern void cpu_die_early(void);
>
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -333,17 +333,13 @@ static int op_cpu_kill(unsigned int cpu)
>  }
>
>  /*
> - * called on the thread which is asking for a CPU to be shutdown -
> - * waits until shutdown has completed, or it is timed out.
> + * Called on the thread which is asking for a CPU to be shutdown after the
> + * shutdown completed.
>   */
> -void __cpu_die(unsigned int cpu)
> +void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
>  {
>  	int err;
>
> -	if (!cpu_wait_death(cpu, 5)) {
> -		pr_crit("CPU%u: cpu didn't die\n", cpu);
> -		return;
> -	}
>  	pr_debug("CPU%u: shutdown\n", cpu);
>
>  	/*
> @@ -370,8 +366,8 @@ void cpu_die(void)
>
>  	local_daif_mask();
>
> -	/* Tell __cpu_die() that this CPU is now safe to dispose of */
> -	(void)cpu_report_death();
> +	/* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
> +	cpuhp_ap_report_dead();
>
>  	/*
>  	 * Actually shutdown the CPU. This must never fail. The specific hotplug
>