From: Thomas Gleixner
To: Hsin-Yi Wang
Cc: Mark Rutland, linux-ia64@vger.kernel.org, Linux-sh list,
 Peter Zijlstra, Heiko Carstens, lkml, sparclinux@vger.kernel.org,
 Guenter Roeck, Will Deacon, Ingo Molnar, linux-s390@vger.kernel.org,
 linux-csky@vger.kernel.org, Aaro Koskinen, Fenghua Yu, Linux PM,
 linux-xtensa@linux-xtensa.org, Stephen Boyd, Josh Poimboeuf,
 Pavankumar Kondeti,
 "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE",
 linux-parisc@vger.kernel.org, Greg Kroah-Hartman,
 linux-mips@vger.kernel.org, James Morse, Jiri Kosina,
 Vitaly Kuznetsov, linuxppc-dev
Subject: Re: [PATCH v5] reboot: support offline CPUs before reboot
References: <20200115063410.131692-1-hsinyi@chromium.org> <8736cgxmxi.fsf@nanos.tec.linutronix.de>
Date: Thu, 16 Jan 2020 12:10:56 +0100
Message-ID: <87h80vwta7.fsf@nanos.tec.linutronix.de>

Hsin-Yi Wang writes:
> On Thu, Jan 16, 2020 at 8:30 AM Thomas Gleixner wrote:
> We saw this issue on regular reboot (not panic) on arm64: If tick
> broadcast and smp_send_stop() happen together and the first broadcast
> arrives to some idled CPU that hasn't already executed reboot ipi to
> run in spinloop, it would try to broadcast to another CPU, but that
> target CPU is already marked as offline by set_cpu_online() in reboot
> ipi, and a warning comes out since tick_handle_oneshot_broadcast()
> would check if it tries to broadcast to offline cpus.
> Most of the time the CPU getting the broadcast interrupt is already
> in the spinloop and thus isn't going to receive interrupts from the
> broadcast timer.

The timer broadcasting is obviously broken by the existing reboot
unplug mechanism as the outgoing CPU should remove itself from the
broadcast. Just addressing the broadcast issue is not sufficient as
there are tons of other places which rely on consistency of the
various cpu masks.

> If system supports hotplug, _cpu_down() would properly handle tasks
> termination such as remove CPU from timer broadcasting by
> tick_offline_cpu()...etc, as well as some interrupt handling.

Well, emphasis on 'if system supports hotplug'. If not, then you are
back to square one. On ARM64 hotplug is selectable by a config option.

So either we mandate HOTPLUG_CPU for SMP and get rid of all the
ifdeffery or we need to have a mechanism which works on !HOTPLUG_CPU
as well.

That whole reboot/shutdown stuff is an impenetrable mess of notifiers
and architecture hackery, so something generic and understandable is
really required.

Thanks,

        tglx
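
To make the mask consistency point above concrete, here is a minimal
userspace sketch. It is not kernel code and not the code from the
patch: the mask_t type, struct masks and every model_*() function are
hypothetical names invented for illustration, and each mask is reduced
to one bit per CPU. It only models the situation described in the
thread, where the reboot IPI path clears the online bit while the CPU
is still a member of the broadcast set, so a later broadcast finds an
offline target.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: one bit per CPU, not the kernel's struct cpumask. */
typedef uint32_t mask_t;

struct masks {
	mask_t online;     /* stands in for the online CPU mask */
	mask_t broadcast;  /* stands in for the tick broadcast membership */
};

/* Stop path as described in the thread: only the online bit is cleared. */
static void model_reboot_ipi_stop(struct masks *m, unsigned int cpu)
{
	m->online &= ~((mask_t)1 << cpu);
}

/* Orderly offline: the outgoing CPU also removes itself from the broadcast. */
static void model_orderly_offline(struct masks *m, unsigned int cpu)
{
	m->broadcast &= ~((mask_t)1 << cpu);
	m->online &= ~((mask_t)1 << cpu);
}

/* CPUs that are still broadcast targets although they are no longer online. */
static mask_t model_stale_broadcast_targets(const struct masks *m)
{
	return m->broadcast & ~m->online;
}

/* Runs both paths on a four CPU model and checks the resulting masks. */
static void model_demo(void)
{
	struct masks stopped = { .online = 0xF, .broadcast = 0xF };
	struct masks offlined = { .online = 0xF, .broadcast = 0xF };

	model_reboot_ipi_stop(&stopped, 2);
	model_orderly_offline(&offlined, 2);

	/* CPU 2 is offline but still a broadcast target: the masks disagree. */
	assert(model_stale_broadcast_targets(&stopped) == ((mask_t)1 << 2));
	/* After the orderly path the two masks stay consistent. */
	assert(model_stale_broadcast_targets(&offlined) == 0);
}
```

The broadcast mask is just one example here. The same kind of
disagreement can arise for any other mask that is expected to stay in
step with the online mask, which is why fixing the broadcast case
alone is not enough.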