From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Righi
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, K Prateek Nayak, Christian Loehle, Phil Auld, Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes, Shrikanth Hegde, linux-kernel@vger.kernel.org
Subject: [PATCH 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
Date: Sat, 9 May 2026 20:01:22 +0200
Message-ID: <20260509180302.1839122-3-arighi@nvidia.com>
In-Reply-To: <20260509180302.1839122-1-arighi@nvidia.com>
References: <20260509180302.1839122-1-arighi@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain
MIME-Version: 1.0

From: K Prateek Nayak

On asymmetric CPU
capacity systems, the wakeup path uses select_idle_capacity(), which
scans the span of sd_asym_cpucapacity rather than sd_llc. The
has_idle_cores hint, however, lives on sd_llc->shared, so the
wakeup-time read of has_idle_cores operates on an LLC-scoped blob while
the actual scan/decision spans the asym domain; nr_busy_cpus also lives
in the same shared sched_domain data, but it's never used in the asym
CPU capacity scenario.

Therefore, move the sched_domain_shared object to sd_asym_cpucapacity
whenever the CPU has a SD_ASYM_CPUCAPACITY_FULL ancestor and that
ancestor is non-overlapping (i.e., not built from SD_NUMA). In that
case the scope of has_idle_cores matches the scope of the wakeup scan.

Fall back to attaching the shared object to sd_llc in three cases:

1) plain symmetric systems (no SD_ASYM_CPUCAPACITY_FULL anywhere);

2) CPUs in an exclusive cpuset that carves out a symmetric capacity
   island: has_asym is system-wide, but those CPUs have no
   SD_ASYM_CPUCAPACITY_FULL ancestor in their hierarchy and follow the
   symmetric LLC path in select_idle_sibling();

3) exotic topologies where SD_ASYM_CPUCAPACITY_FULL lands on an
   SD_NUMA-built domain: init_sched_domain_shared() keys the shared
   blob off cpumask_first(span), which on overlapping NUMA domains
   would alias unrelated spans onto the same blob. Keep the shared
   object on the LLC there; select_idle_capacity() gracefully skips
   the has_idle_cores preference when sd->shared is NULL.

While at it, also rename the per-CPU sd_llc_shared to
sd_balance_shared, as it is no longer strictly tied to the LLC.
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Co-developed-by: Andrea Righi
Signed-off-by: Andrea Righi
Signed-off-by: K Prateek Nayak
---
 kernel/sched/fair.c     | 19 ++++++---
 kernel/sched/sched.h    |  2 +-
 kernel/sched/topology.c | 95 +++++++++++++++++++++++++++++++++++------
 3 files changed, 95 insertions(+), 21 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6b059ee80b631..960a1a9696b98 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7819,7 +7819,7 @@ static inline void set_idle_cores(int cpu, int val)
 {
 	struct sched_domain_shared *sds;
 
-	sds = rcu_dereference_all(per_cpu(sd_llc_shared, cpu));
+	sds = rcu_dereference_all(per_cpu(sd_balance_shared, cpu));
 	if (sds)
 		WRITE_ONCE(sds->has_idle_cores, val);
 }
@@ -7828,7 +7828,7 @@ static inline bool test_idle_cores(int cpu)
 {
 	struct sched_domain_shared *sds;
 
-	sds = rcu_dereference_all(per_cpu(sd_llc_shared, cpu));
+	sds = rcu_dereference_all(per_cpu(sd_balance_shared, cpu));
 	if (sds)
 		return READ_ONCE(sds->has_idle_cores);
 
@@ -7837,7 +7837,7 @@ static inline bool test_idle_cores(int cpu)
 
 /*
  * Scans the local SMT mask to see if the entire core is idle, and records this
- * information in sd_llc_shared->has_idle_cores.
+ * information in sd_balance_shared->has_idle_cores.
  *
  * Since SMT siblings share all cache levels, inspecting this limited remote
  * state should be fairly cheap.
@@ -7954,7 +7954,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
 	int i, cpu, idle_cpu = -1, nr = INT_MAX;
 
-	if (sched_feat(SIS_UTIL)) {
+	if (sched_feat(SIS_UTIL) && sd->shared) {
 		/*
 		 * Increment because !--nr is the condition to stop scan.
 		 *
@@ -12834,7 +12834,7 @@ static void nohz_balancer_kick(struct rq *rq)
 		goto out;
 	}
 
-	sds = rcu_dereference_all(per_cpu(sd_llc_shared, cpu));
+	sds = rcu_dereference_all(per_cpu(sd_balance_shared, cpu));
 	if (sds) {
 		/*
 		 * If there is an imbalance between LLC domains (IOW we could
@@ -12862,7 +12862,11 @@ static void set_cpu_sd_state_busy(int cpu)
 	struct sched_domain *sd;
 
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
-	if (!sd || !sd->nohz_idle)
+	/*
+	 * sd->nohz_idle only pairs with nr_busy_cpus on sd->shared; if this
+	 * domain has no shared object there is nothing to clear or account.
+	 */
+	if (!sd || !sd->shared || !sd->nohz_idle)
 		return;
 
 	sd->nohz_idle = 0;
@@ -12887,7 +12891,8 @@ static void set_cpu_sd_state_idle(int cpu)
 	struct sched_domain *sd;
 
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
-	if (!sd || sd->nohz_idle)
+	/* See set_cpu_sd_state_busy(): nohz_idle is only used with sd->shared. */
+	if (!sd || !sd->shared || sd->nohz_idle)
 		return;
 
 	sd->nohz_idle = 1;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9f63b15d309d1..330f5893c4561 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2170,7 +2170,7 @@ DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
 DECLARE_PER_CPU(int, sd_share_id);
-DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_balance_shared);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_numa);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 5847b83d9d552..9bc4d11dd6a98 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -665,7 +665,7 @@ DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_size);
 DEFINE_PER_CPU(int, sd_llc_id);
 DEFINE_PER_CPU(int, sd_share_id);
-DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_balance_shared);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
@@ -680,20 +680,38 @@ static void update_top_cache_domain(int cpu)
 	int id = cpu;
 	int size = 1;
 
+	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
+	/*
+	 * The shared object is attached to sd_asym_cpucapacity only when the
+	 * asym domain is non-overlapping (i.e., not built from SD_NUMA).
+	 * On overlapping (NUMA) asym domains we fall back to letting the
+	 * SD_SHARE_LLC path own the shared object, so sd->shared may be NULL
+	 * here.
+	 */
+	if (sd && sd->shared)
+		sds = sd->shared;
+
+	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
+
 	sd = highest_flag_domain(cpu, SD_SHARE_LLC);
 	if (sd) {
 		id = cpumask_first(sched_domain_span(sd));
 		size = cpumask_weight(sched_domain_span(sd));
-		/* If sd_llc exists, sd_llc_shared should exist too. */
-		WARN_ON_ONCE(!sd->shared);
-		sds = sd->shared;
+		/*
+		 * If sd_asym_cpucapacity didn't claim the shared object,
+		 * sd_llc must have one linked.
+		 */
+		if (!sds) {
+			WARN_ON_ONCE(!sd->shared);
+			sds = sd->shared;
+		}
 	}
 
 	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
 	per_cpu(sd_llc_size, cpu) = size;
 	per_cpu(sd_llc_id, cpu) = id;
-	rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
+	rcu_assign_pointer(per_cpu(sd_balance_shared, cpu), sds);
 
 	sd = lowest_flag_domain(cpu, SD_CLUSTER);
 	if (sd)
@@ -711,9 +729,6 @@ static void update_top_cache_domain(int cpu)
 
 	sd = highest_flag_domain(cpu, SD_ASYM_PACKING);
 	rcu_assign_pointer(per_cpu(sd_asym_packing, cpu), sd);
-
-	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
-	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
 }
 
 /*
@@ -2650,6 +2665,54 @@ static void adjust_numa_imbalance(struct sched_domain *sd_llc)
 	}
 }
 
+static void init_sched_domain_shared(struct s_data *d, struct sched_domain *sd)
+{
+	int sd_id = cpumask_first(sched_domain_span(sd));
+
+	sd->shared = *per_cpu_ptr(d->sds, sd_id);
+	/*
+	 * nr_busy_cpus is consumed only by the NOHZ kick path via
+	 * sd_balance_shared; on the asym-capacity path it is initialized but
+	 * never read.
+	 */
+	atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
+	atomic_inc(&sd->shared->ref);
+}
+
+/*
+ * For asymmetric CPU capacity, attach sched_domain_shared on the innermost
+ * SD_ASYM_CPUCAPACITY_FULL ancestor of @cpu's base domain when that ancestor
+ * is not an overlapping NUMA-built domain (then LLC should claim shared).
+ *
+ * A CPU may lack any FULL ancestor (e.g., exclusive cpuset symmetric
+ * island); then LLC must claim shared instead.
+ *
+ * Note: SD_ASYM_CPUCAPACITY_FULL is only set when all CPU capacity values
+ * are present in the domain span, so the asym domain we attach to cannot
+ * degenerate into a single-capacity group. The relevant edge cases are
+ * instead covered by the caveats above.
+ *
+ * Return true if this CPU's asym path claimed sd->shared, false otherwise.
+ */
+static bool claim_asym_sched_domain_shared(struct s_data *d, int cpu)
+{
+	struct sched_domain *sd = *per_cpu_ptr(d->sd, cpu);
+	struct sched_domain *sd_asym;
+
+	if (!sd)
+		return false;
+
+	sd_asym = sd;
+	while (sd_asym && !(sd_asym->flags & SD_ASYM_CPUCAPACITY_FULL))
+		sd_asym = sd_asym->parent;
+
+	if (!sd_asym || (sd_asym->flags & SD_NUMA))
+		return false;
+
+	init_sched_domain_shared(d, sd_asym);
+	return true;
+}
+
 /*
  * Build sched domains for a given set of CPUs and attach the sched domains
  * to the individual CPUs
@@ -2708,20 +2771,26 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	}
 
 	for_each_cpu(i, cpu_map) {
+		bool asym_claimed = false;
+
 		sd = *per_cpu_ptr(d.sd, i);
 		if (!sd)
 			continue;
 
+		if (has_asym)
+			asym_claimed = claim_asym_sched_domain_shared(&d, i);
+
 		/* First, find the topmost SD_SHARE_LLC domain */
 		while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
 			sd = sd->parent;
 
 		if (sd->flags & SD_SHARE_LLC) {
-			int sd_id = cpumask_first(sched_domain_span(sd));
-
-			sd->shared = *per_cpu_ptr(d.sds, sd_id);
-			atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
-			atomic_inc(&sd->shared->ref);
+			/*
+			 * Initialize the sd->shared for SD_SHARE_LLC unless
+			 * the asym path above already claimed it.
+			 */
+			if (!asym_claimed)
+				init_sched_domain_shared(&d, sd);
 
 			/*
 			 * In presence of higher domains, adjust the
-- 
2.54.0