Date: Fri, 26 Jul 2024 18:14:24 +0100
From: Jonathan Cameron
To: Thomas Gleixner
CC: Mikhail Gavrilov, Linux List Kernel Mailing,
 Linux regressions mailing list, Ingo Molnar, "Borislav Petkov",
 Dave Hansen, "H. Peter Anvin", "Bowman, Terry",
 Shameerali Kolothum Thodi
Subject: Re: 6.11/regression/bisected - The commit c1385c1f0ba3 caused a
 new possible recursive locking detected warning at computer boot.
Message-ID: <20240726181424.000039a4@Huawei.com>
In-Reply-To: <87le1ounl2.ffs@tglx>
References: <20240723112456.000053b3@Huawei.com>
 <20240723181728.000026b3@huawei.com>
 <20240725181354.000040bf@huawei.com>
 <87le1ounl2.ffs@tglx>
Organization: Huawei Technologies Research and Development (UK) Ltd.
X-Mailing-List: regressions@lists.linux.dev

On Fri, 26 Jul 2024 18:26:01 +0200
Thomas Gleixner wrote:

> On Thu, Jul 25 2024 at 18:13, Jonathan Cameron wrote:
> > On Tue, 23 Jul 2024 18:20:06 +0100
> > Jonathan Cameron wrote:
> >
> >> > This is an interesting corner and perhaps reflects a flawed
> >> > assumption we were making that for this path anything that can happen for an
> >> > initially present CPU can also happen for a hotplugged one. On the hotplugged
> >> > path the lock was always held and hence the static_key_enable() would
> >> > have failed.
>
> No. The original code invoked this without cpus read locked via:
>
>     acpi_processor_driver.probe()
>        __acpi_processor_start()
>        ....
>
> and the cpu hotplug callback finds it already set up, so it won't reach
> the static_key_enable() anymore.
>
> > One bit I need to check out tomorrow is to make sure this doesn't race with the
> > workfn that is used to tear down the same static key on error.
>
> There is a simpler solution for that. See the uncompiled below.

Thanks. FWIW I got pretty much the same suggestion from Shameer this
morning when he saw the workfn solution on list. Classic case of me
missing the simple solution because I was down in the weeds.

I'm absolutely fine with this fix.

Mikhail, please could you test Thomas' proposal so we are absolutely sure
nothing else is hiding.

Tglx's solution is much less likely to cause problems than what I proposed
because it avoids changing the ordering.

Jonathan

>
> Thanks,
>
>         tglx
> ---
> diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
> index b3fa61d45352..0b69bfbf345d 100644
> --- a/arch/x86/kernel/cpu/aperfmperf.c
> +++ b/arch/x86/kernel/cpu/aperfmperf.c
> @@ -306,7 +306,7 @@ static void freq_invariance_enable(void)
>  		WARN_ON_ONCE(1);
>  		return;
>  	}
> -	static_branch_enable(&arch_scale_freq_key);
> +	static_branch_enable_cpuslocked(&arch_scale_freq_key);
>  	register_freq_invariance_syscore_ops();
>  	pr_info("Estimated ratio of average max frequency by base frequency (times 1024): %llu\n", arch_max_freq_ratio);
>  }
> @@ -323,8 +323,10 @@ static void __init bp_init_freq_invariance(void)
>  	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
>  		return;
>  
> -	if (intel_set_max_freq_ratio())
> +	if (intel_set_max_freq_ratio()) {
> +		guard(cpus_read_lock)();
>  		freq_invariance_enable();
> +	}
>  }
>  
>  static void disable_freq_invariance_workfn(struct work_struct *work)
>
>