From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrea Righi
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, K Prateek Nayak, Christian Loehle, Koba Ko,
	Felix Abecassis, Balbir Singh, Joel Fernandes, Shrikanth Hegde,
	linux-kernel@vger.kernel.org
Subject: [PATCH 2/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
Date: Tue, 28 Apr 2026 16:41:08 +0200
Message-ID: <20260428144352.3575863-3-arighi@nvidia.com>
In-Reply-To: <20260428144352.3575863-1-arighi@nvidia.com>
References: <20260428144352.3575863-1-arighi@nvidia.com>
X-Mailer: git-send-email 2.54.0
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit
From: K Prateek Nayak

On asymmetric CPU capacity systems, the wakeup path uses
select_idle_capacity(), which scans the span of sd_asym_cpucapacity
rather than sd_llc. The has_idle_cores hint, however, lives on
sd_llc->shared, so the wakeup-time read of has_idle_cores operates on an
LLC-scoped blob while the actual scan/decision spans the asym domain;
nr_busy_cpus also lives in the same shared sched_domain data, but it is
never used in the asym CPU capacity scenario.

Therefore, move the sched_domain_shared object to sd_asym_cpucapacity
whenever the CPU has a SD_ASYM_CPUCAPACITY_FULL ancestor and that
ancestor is non-overlapping (i.e., not built from SD_NUMA). In that case
the scope of has_idle_cores matches the scope of the wakeup scan.

Fall back to attaching the shared object to sd_llc in three cases:

1) plain symmetric systems (no SD_ASYM_CPUCAPACITY_FULL anywhere);

2) CPUs in an exclusive cpuset that carves out a symmetric capacity
   island: has_asym is system-wide, but those CPUs have no
   SD_ASYM_CPUCAPACITY_FULL ancestor in their hierarchy and follow the
   symmetric LLC path in select_idle_sibling();

3) exotic topologies where SD_ASYM_CPUCAPACITY_FULL lands on an
   SD_NUMA-built domain. init_sched_domain_shared() keys the shared blob
   off cpumask_first(span), which on overlapping NUMA domains would
   alias unrelated spans onto the same blob. Keep the shared object on
   the LLC there; select_idle_capacity() gracefully skips the
   has_idle_cores preference when sd->shared is NULL.

While at it, also rename the per-CPU sd_llc_shared to sd_balance_shared,
as it is no longer strictly tied to the LLC.
Co-developed-by: Andrea Righi
Signed-off-by: Andrea Righi
Signed-off-by: K Prateek Nayak
---
 kernel/sched/fair.c     | 17 +++++---
 kernel/sched/sched.h    |  2 +-
 kernel/sched/topology.c | 90 +++++++++++++++++++++++++++++++++++------
 3 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e0f75dedc8456..bbdf537f61154 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7790,7 +7790,7 @@ static inline void set_idle_cores(int cpu, int val)
 {
 	struct sched_domain_shared *sds;
 
-	sds = rcu_dereference_all(per_cpu(sd_llc_shared, cpu));
+	sds = rcu_dereference_all(per_cpu(sd_balance_shared, cpu));
 	if (sds)
 		WRITE_ONCE(sds->has_idle_cores, val);
 }
@@ -7799,7 +7799,7 @@ static inline bool test_idle_cores(int cpu)
 {
 	struct sched_domain_shared *sds;
 
-	sds = rcu_dereference_all(per_cpu(sd_llc_shared, cpu));
+	sds = rcu_dereference_all(per_cpu(sd_balance_shared, cpu));
 	if (sds)
 		return READ_ONCE(sds->has_idle_cores);
 
@@ -7808,7 +7808,7 @@ static inline bool test_idle_cores(int cpu)
 
 /*
  * Scans the local SMT mask to see if the entire core is idle, and records this
- * information in sd_llc_shared->has_idle_cores.
+ * information in sd_balance_shared->has_idle_cores.
  *
  * Since SMT siblings share all cache levels, inspecting this limited remote
  * state should be fairly cheap.
@@ -7925,7 +7925,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
 	int i, cpu, idle_cpu = -1, nr = INT_MAX;
 
-	if (sched_feat(SIS_UTIL)) {
+	if (sched_feat(SIS_UTIL) && sd->shared) {
 		/*
 		 * Increment because !--nr is the condition to stop scan.
 		 *
@@ -12826,7 +12826,11 @@ static void set_cpu_sd_state_busy(int cpu)
 	struct sched_domain *sd;
 
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
-	if (!sd || !sd->nohz_idle)
+	/*
+	 * sd->nohz_idle only pairs with nr_busy_cpus on sd->shared; if this
+	 * domain has no shared object there is nothing to clear or account.
+	 */
+	if (!sd || !sd->shared || !sd->nohz_idle)
 		return;
 
 	sd->nohz_idle = 0;
@@ -12851,7 +12855,8 @@ static void set_cpu_sd_state_idle(int cpu)
 	struct sched_domain *sd;
 
 	sd = rcu_dereference_all(per_cpu(sd_llc, cpu));
-	if (!sd || sd->nohz_idle)
+	/* See set_cpu_sd_state_busy(): nohz_idle is only used with sd->shared. */
+	if (!sd || !sd->shared || sd->nohz_idle)
 		return;
 
 	sd->nohz_idle = 1;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9f63b15d309d1..330f5893c4561 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2170,7 +2170,7 @@ DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DECLARE_PER_CPU(int, sd_llc_size);
 DECLARE_PER_CPU(int, sd_llc_id);
 DECLARE_PER_CPU(int, sd_share_id);
-DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_balance_shared);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_numa);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
 DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 5847b83d9d552..69d465cc93ab4 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -665,7 +665,7 @@ DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_size);
 DEFINE_PER_CPU(int, sd_llc_id);
 DEFINE_PER_CPU(int, sd_share_id);
-DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
+DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_balance_shared);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
@@ -680,20 +680,38 @@ static void update_top_cache_domain(int cpu)
 	int id = cpu;
 	int size = 1;
 
+	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
+	/*
+	 * The shared object is attached to sd_asym_cpucapacity only when the
+	 * asym domain is non-overlapping (i.e., not built from SD_NUMA).
+	 * On overlapping (NUMA) asym domains we fall back to letting the
+	 * SD_SHARE_LLC path own the shared object, so sd->shared may be NULL
+	 * here.
+	 */
+	if (sd && sd->shared)
+		sds = sd->shared;
+
+	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
+
 	sd = highest_flag_domain(cpu, SD_SHARE_LLC);
 	if (sd) {
 		id = cpumask_first(sched_domain_span(sd));
 		size = cpumask_weight(sched_domain_span(sd));
-		/* If sd_llc exists, sd_llc_shared should exist too. */
-		WARN_ON_ONCE(!sd->shared);
-		sds = sd->shared;
+		/*
+		 * If sd_asym_cpucapacity didn't claim the shared object,
+		 * sd_llc must have one linked.
+		 */
+		if (!sds) {
+			WARN_ON_ONCE(!sd->shared);
+			sds = sd->shared;
+		}
 	}
 
 	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
 	per_cpu(sd_llc_size, cpu) = size;
 	per_cpu(sd_llc_id, cpu) = id;
-	rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
+	rcu_assign_pointer(per_cpu(sd_balance_shared, cpu), sds);
 
 	sd = lowest_flag_domain(cpu, SD_CLUSTER);
 	if (sd)
@@ -711,9 +729,6 @@ static void update_top_cache_domain(int cpu)
 
 	sd = highest_flag_domain(cpu, SD_ASYM_PACKING);
 	rcu_assign_pointer(per_cpu(sd_asym_packing, cpu), sd);
-
-	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
-	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
 }
 
 /*
@@ -2650,6 +2665,49 @@ static void adjust_numa_imbalance(struct sched_domain *sd_llc)
 	}
 }
 
+static void init_sched_domain_shared(struct s_data *d, struct sched_domain *sd)
+{
+	int sd_id = cpumask_first(sched_domain_span(sd));
+
+	sd->shared = *per_cpu_ptr(d->sds, sd_id);
+	atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
+	atomic_inc(&sd->shared->ref);
+}
+
+/*
+ * For asymmetric CPU capacity, attach sched_domain_shared on the innermost
+ * SD_ASYM_CPUCAPACITY_FULL ancestor of @cpu's base domain when that ancestor
+ * is not an overlapping NUMA-built domain (then LLC should claim shared).
+ *
+ * A CPU may lack any FULL ancestor (e.g., exclusive cpuset symmetric island);
+ * then LLC must claim shared instead.
+ *
+ * Note: SD_ASYM_CPUCAPACITY_FULL is only set when multiple distinct capacities
+ * exist in the domain span, so the asym domain we attach to cannot degenerate
+ * into a single-capacity group. The relevant edge cases are instead covered by
+ * the caveats above.
+ *
+ * Return true if this CPU's asym path claimed sd->shared, false otherwise.
+ */
+static bool claim_asym_sched_domain_shared(struct s_data *d, int cpu)
+{
+	struct sched_domain *sd = *per_cpu_ptr(d->sd, cpu);
+	struct sched_domain *sd_asym;
+
+	if (!sd)
+		return false;
+
+	sd_asym = sd;
+	while (sd_asym && !(sd_asym->flags & SD_ASYM_CPUCAPACITY_FULL))
+		sd_asym = sd_asym->parent;
+
+	if (!sd_asym || (sd_asym->flags & SD_NUMA))
+		return false;
+
+	init_sched_domain_shared(d, sd_asym);
+	return true;
+}
+
 /*
  * Build sched domains for a given set of CPUs and attach the sched domains
  * to the individual CPUs
@@ -2708,20 +2766,26 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	}
 
 	for_each_cpu(i, cpu_map) {
+		bool asym_claimed = false;
+
 		sd = *per_cpu_ptr(d.sd, i);
 		if (!sd)
 			continue;
 
+		if (has_asym)
+			asym_claimed = claim_asym_sched_domain_shared(&d, i);
+
 		/* First, find the topmost SD_SHARE_LLC domain */
 		while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
 			sd = sd->parent;
 
 		if (sd->flags & SD_SHARE_LLC) {
-			int sd_id = cpumask_first(sched_domain_span(sd));
-
-			sd->shared = *per_cpu_ptr(d.sds, sd_id);
-			atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
-			atomic_inc(&sd->shared->ref);
+			/*
+			 * Initialize the sd->shared for SD_SHARE_LLC unless
+			 * the asym path above already claimed it.
+			 */
+			if (!asym_claimed)
+				init_sched_domain_shared(&d, sd);
 
 			/*
 			 * In presence of higher domains, adjust the
-- 
2.54.0