From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37629C4167D for ; Mon, 6 Nov 2023 23:15:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233597AbjKFXPd (ORCPT ); Mon, 6 Nov 2023 18:15:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34738 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233601AbjKFXPR (ORCPT ); Mon, 6 Nov 2023 18:15:17 -0500 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6663D170D; Mon, 6 Nov 2023 15:14:59 -0800 (PST) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C6306C433C9; Mon, 6 Nov 2023 23:14:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1699312499; bh=vjd0JqHshV4EQWvdq3l9gNQxupKjz1i7BeZDi/ojUUg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=cZWXHZdWzDnKFdzhArlYVYGC5Ne1UaVrgkH89W+xRCHqd+yAMqjQskCnX5MHPINSx oBNkut9Jmd3OnL5gzUP5s8vzp7GMsBGsd69ar1S+4rX2t02ShoOmaXK2XtN4o+p9A/ Ufwe0oI5H1vAd/xU3NvhXyDOsfPLIGl98yF/bZsWFQSzMVY2YkrJWcAuPEL3AFHEUj kuSxYaoRRVh7lcTxH0TnGpnS6/NxLwy4B8djhcNblGtsCp6UJmPU+2SeqdlGluWZEA OHkgxMFUw46ZSDba3rLNh+MqDC7Te/cxR40aAMSRPAnXSF1twbxp7gcIjKnsB2vJSu TtMzfO+uxCP6g== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Ran Xiaokai , Thomas Gleixner , Sasha Levin , peterz@infradead.org Subject: [PATCH AUTOSEL 6.6 11/13] cpu/hotplug: Don't offline the last non-isolated CPU Date: Mon, 6 Nov 2023 18:14:24 -0500 Message-ID: <20231106231435.3734790-11-sashal@kernel.org> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231106231435.3734790-1-sashal@kernel.org> References: <20231106231435.3734790-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.6 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ran Xiaokai [ Upstream commit 38685e2a0476127db766f81b1c06019ddc4c9ffa ] If a system has isolated CPUs via the "isolcpus=" command line parameter, then an attempt to offline the last housekeeping CPU will result in a WARN_ON() when rebuilding the scheduler domains and a subsequent panic due to and unhandled empty CPU mas in partition_sched_domains_locked(). cpuset_hotplug_workfn() rebuild_sched_domains_locked() ndoms = generate_sched_domains(&doms, &attr); cpumask_and(doms[0], top_cpuset.effective_cpus, housekeeping_cpumask(HK_FLAG_DOMAIN)); Thus results in an empty CPU mask which triggers the warning and then the subsequent crash: WARNING: CPU: 4 PID: 80 at kernel/sched/topology.c:2366 build_sched_domains+0x120c/0x1408 Call trace: build_sched_domains+0x120c/0x1408 partition_sched_domains_locked+0x234/0x880 rebuild_sched_domains_locked+0x37c/0x798 rebuild_sched_domains+0x30/0x58 cpuset_hotplug_workfn+0x2a8/0x930 Unable to handle kernel paging request at virtual address fffe80027ab37080 partition_sched_domains_locked+0x318/0x880 rebuild_sched_domains_locked+0x37c/0x798 Aside of the resulting crash, it does not make any sense to offline the last last housekeeping CPU. Prevent this by masking out the non-housekeeping CPUs when selecting a target CPU for initiating the CPU unplug operation via the work queue. Suggested-by: Thomas Gleixner Signed-off-by: Ran Xiaokai Signed-off-by: Thomas Gleixner Link: https://lore.kernel.org/r/202310171709530660462@zte.com.cn Signed-off-by: Sasha Levin --- kernel/cpu.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 6de7c6bb74eee..94430ea8b4a54 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1515,11 +1515,14 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target) /* * Ensure that the control task does not run on the to be offlined * CPU to prevent a deadlock against cfs_b->period_timer. + * Also keep at least one housekeeping cpu onlined to avoid generating + * an empty sched_domain span. */ - cpu = cpumask_any_but(cpu_online_mask, cpu); - if (cpu >= nr_cpu_ids) - return -EBUSY; - return work_on_cpu(cpu, __cpu_down_maps_locked, &work); + for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN)) { + if (cpu != work.cpu) + return work_on_cpu(cpu, __cpu_down_maps_locked, &work); + } + return -EBUSY; } static int cpu_down(unsigned int cpu, enum cpuhp_state target) -- 2.42.0