Subject: Re: [PATCH] arm64/smp: Move rcu_cpu_starting() earlier
From: Qian Cai
To: paulmck@kernel.org
Date: Thu, 05 Nov 2020 21:15:24 -0500
In-Reply-To: <20201105232813.GR3249@paulmck-ThinkPad-P72>
References: <20201028182614.13655-1-cai@redhat.com> <160404559895.1777248.8248643695413627642.b4-ty@kernel.org> <20201105222242.GA8842@willie-the-truck> <3b4c324abdabd12d7bd5346c18411e667afe6a55.camel@redhat.com> <20201105232813.GR3249@paulmck-ThinkPad-P72>
Cc: Will Deacon, Peter Zijlstra, catalin.marinas@arm.com, linux-kernel@vger.kernel.org, kernel-team@android.com, linux-arm-kernel@lists.infradead.org

On Thu, 2020-11-05 at 15:28 -0800, Paul E. McKenney wrote:
> On Thu, Nov 05, 2020 at 06:02:49PM -0500, Qian Cai wrote:
> > On Thu, 2020-11-05 at 22:22 +0000, Will Deacon wrote:
> > > On Fri, Oct 30, 2020 at 04:33:25PM +0000, Will Deacon wrote:
> > > > On Wed, 28 Oct 2020 14:26:14 -0400, Qian Cai wrote:
> > > > > The call to rcu_cpu_starting() in secondary_start_kernel() is not
> > > > > early enough in the CPU-hotplug onlining process, which results in
> > > > > lockdep splats as follows:
> > > > >
> > > > >  WARNING: suspicious RCU usage
> > > > >  -----------------------------
> > > > >  kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
> > > > >
> > > > > [...]
> > > >
> > > > Applied to arm64 (for-next/fixes), thanks!
> > > >
> > > > [1/1] arm64/smp: Move rcu_cpu_starting() earlier
> > > >       https://git.kernel.org/arm64/c/ce3d31ad3cac
> > >
> > > Hmm, this patch has caused a regression in the case that we fail to
> > > online a CPU because it has incompatible CPU features and so we park it
> > > in cpu_die_early(). We now get an endless spew of RCU stalls because the
> > > core will never come online, but is being tracked by RCU. So I'm tempted
> > > to revert this and live with the lockdep warning while we figure out a
> > > proper fix.
> > >
> > > What's the correct way to undo rcu_cpu_starting(), given that we cannot
> > > invoke the full hotplug machinery here? Is it correct to call
> > > rcutree_dying_cpu() on the bad CPU and then rcutree_dead_cpu() from the
> > > CPU doing cpu_up(), or should we do something else?
> >
> > It looks to me that rcu_report_dead() does the opposite of
> > rcu_cpu_starting(), so lift rcu_report_dead() out of CONFIG_HOTPLUG_CPU
> > and use it there to rewind, Paul?
>
> Yes, rcu_report_dead() should do the trick. Presumably the earlier
> online-time CPU-hotplug notifiers are also unwound?

I don't think that is an issue here. cpu_die_early() sets CPU_STUCK_IN_KERNEL,
and then __cpu_up() will see a timeout waiting for the AP to come online and
deal with CPU_STUCK_IN_KERNEL accordingly. Thus, something like this? I don't
see anything in rcu_report_dead() that depends on CONFIG_HOTPLUG_CPU=y.

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 09c96f57818c..10729d2d6084 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -421,6 +421,8 @@ void cpu_die_early(void)
 	update_cpu_boot_status(CPU_STUCK_IN_KERNEL);
 
+	rcu_report_dead(cpu);
+
 	cpu_park_loop();
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2a52f42f64b6..bd04b09b84b3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4077,7 +4077,6 @@ void rcu_cpu_starting(unsigned int cpu)
 	smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
 /*
  * The outgoing function has no further need of RCU, so remove it from
  * the rcu_node tree's ->qsmaskinitnext bit masks.
@@ -4117,6 +4116,7 @@ void rcu_report_dead(unsigned int cpu)
 	rdp->cpu_started = false;
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
 /*
  * The outgoing CPU has just passed through the dying-idle state, and we
  * are being invoked from the CPU that was IPIed to continue the offline
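
For illustration only (this is not part of the patch and not kernel code), here is a toy user-space model of why a parked CPU that RCU still tracks would stall grace periods, and why an rcu_report_dead()-style rewind should unblock them. Every toy_* name is hypothetical. The two masks only loosely stand in for ->qsmaskinitnext and the per-grace-period ->qsmask, and the real functions also handle locking, memory ordering and the rcu_node tree, none of which is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every toy_* name is hypothetical, not a kernel API. */
#define TOY_MAX_CPUS 32u

struct toy_rcu {
	unsigned long online_mask;  /* loosely models ->qsmaskinitnext */
	unsigned long pending_mask; /* loosely models ->qsmask for one grace period */
};

/* Bit for a given CPU; guards against out-of-range CPU numbers. */
static unsigned long toy_bit(unsigned int cpu)
{
	assert(cpu < TOY_MAX_CPUS);
	return 1UL << cpu;
}

/* Rough analogue of rcu_cpu_starting(): RCU begins tracking this CPU. */
static void toy_cpu_starting(struct toy_rcu *r, unsigned int cpu)
{
	r->online_mask |= toy_bit(cpu);
}

/* Rough analogue of rcu_report_dead(): stop tracking, stop waiting on it. */
static void toy_report_dead(struct toy_rcu *r, unsigned int cpu)
{
	r->online_mask &= ~toy_bit(cpu);
	r->pending_mask &= ~toy_bit(cpu);
}

/* A new grace period must wait for every CPU tracked at this moment. */
static void toy_gp_start(struct toy_rcu *r)
{
	r->pending_mask = r->online_mask;
}

/* A running CPU passes through a quiescent state. */
static void toy_report_qs(struct toy_rcu *r, unsigned int cpu)
{
	r->pending_mask &= ~toy_bit(cpu);
}

/* The grace period ends only once no tracked CPU is still pending. */
static bool toy_gp_done(const struct toy_rcu *r)
{
	return r->pending_mask == 0;
}
```

In this model, if CPU 1 calls toy_cpu_starting() and then parks without ever calling toy_report_qs(), toy_gp_done() stays false forever, which corresponds to the endless stall spew Will reported. Calling toy_report_dead() for CPU 1 clears both bits, so the current grace period and all later ones can complete.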