From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 24 Apr 2026 10:46:14 +0200
From: Andrea Righi
To: K Prateek Nayak
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider, Christian Loehle, Koba Ko, Felix Abecassis,
	Balbir Singh, Joel Fernandes, Shrikanth Hegde,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
References: <20260423074135.380390-1-arighi@nvidia.com>
	<20260423074135.380390-2-arighi@nvidia.com>
	<75cf4fd1-2e80-4167-9113-954015ba63e1@amd.com>
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <75cf4fd1-2e80-4167-9113-954015ba63e1@amd.com>
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Prateek,

On Fri, Apr 24, 2026 at 10:44:09AM +0530, K Prateek Nayak wrote:
> Hello Andrea,
> 
> On 4/23/2026 1:06 PM, Andrea Righi wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 69361c63353ad..934eb663f445e 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7925,7 +7925,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >  	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
> >  	int i, cpu, idle_cpu = -1, nr = INT_MAX;
> >  
> > -	if (sched_feat(SIS_UTIL)) {
> > +	if (sched_feat(SIS_UTIL) && sd->shared) {
> >  		/*
> >  		 * Increment because !--nr is the condition to stop scan.
> >  		 *
> > @@ -12840,7 +12840,8 @@ static void set_cpu_sd_state_busy(int cpu)
> >  		goto unlock;
> >  	sd->nohz_idle = 0;
> 
> I just realised this flag only matters for accounting to "nr_busy_cpus"
> and we can bail out earlier if we don't have an sd->shared altogether.
> 
> You can probably adapt this to use guard(rcu)() while you are at it
> and send these bits as a separate cleanup first, saying that the
> assumption of sd_llc->shared always existing will change with the
> coming patches and you are introducing guard rails for the same.

Ack.
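For reference, the suggested shape could look roughly like the sketch below. This is a userspace, compile-and-run model of the control flow only: the structs are simplified stand-ins for the kernel types, the RCU calls are counting stubs, and `guard_rcu()` is a hypothetical cleanup-attribute macro mimicking the kernel's guard(rcu)(), not the real helper.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types involved. */
struct sched_domain_shared { atomic_int nr_busy_cpus; };
struct sched_domain {
	int nohz_idle;
	struct sched_domain_shared *shared;
};

/* Stub RCU: count lock/unlock so we can check they stay balanced. */
int rcu_depth;
void rcu_read_lock(void)   { rcu_depth++; }
void rcu_read_unlock(void) { rcu_depth--; }

/* Mimic guard(rcu)(): unlock automatically when the scope is left. */
void rcu_unlock_cleanup(int *unused) { (void)unused; rcu_read_unlock(); }
#define guard_rcu() \
	int _rcu_guard __attribute__((cleanup(rcu_unlock_cleanup))) = \
		(rcu_read_lock(), 0)

/*
 * Sketch of set_cpu_sd_state_busy() after the suggested cleanup:
 * bail out early when there is no sd->shared (nohz_idle only matters
 * for nr_busy_cpus accounting), and rely on scope-based unlock
 * instead of the goto-unlock pattern.
 */
void set_cpu_sd_state_busy(struct sched_domain *sd)
{
	guard_rcu();

	if (!sd || !sd->shared || !sd->nohz_idle)
		return;

	sd->nohz_idle = 0;
	atomic_fetch_add(&sd->shared->nr_busy_cpus, 1);
}
```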
> 
> > 
> > -	atomic_inc(&sd->shared->nr_busy_cpus);
> > +	if (sd->shared)
> > +		atomic_inc(&sd->shared->nr_busy_cpus);
> >  unlock:
> >  	rcu_read_unlock();
> >  }
> > @@ -12869,7 +12870,8 @@ static void set_cpu_sd_state_idle(int cpu)
> >  		goto unlock;
> >  	sd->nohz_idle = 1;
> >  
> > -	atomic_dec(&sd->shared->nr_busy_cpus);
> > +	if (sd->shared)
> > +		atomic_dec(&sd->shared->nr_busy_cpus);
> >  unlock:
> >  	rcu_read_unlock();
> >  }
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index 5847b83d9d552..dc50193b198c6 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -680,19 +680,39 @@ static void update_top_cache_domain(int cpu)
> >  	int id = cpu;
> >  	int size = 1;
> >  
> > +	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
> > +	/*
> > +	 * The shared object is attached to sd_asym_cpucapacity only when the
> > +	 * asym domain is non-overlapping (i.e., not built from SD_NUMA).
> > +	 * On overlapping (NUMA) asym domains we fall back to letting the
> > +	 * SD_SHARE_LLC path own the shared object, so sd->shared may be NULL
> > +	 * here.
> > +	 */
> > +	if (sd && sd->shared)
> > +		sds = sd->shared;
> > +
> > +	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
> > +
> >  	sd = highest_flag_domain(cpu, SD_SHARE_LLC);
> >  	if (sd) {
> >  		id = cpumask_first(sched_domain_span(sd));
> >  		size = cpumask_weight(sched_domain_span(sd));
> >  
> > -		/* If sd_llc exists, sd_llc_shared should exist too. */
> > -		WARN_ON_ONCE(!sd->shared);
> > -		sds = sd->shared;
> > +		/*
> > +		 * If sd_asym_cpucapacity didn't claim the shared object,
> > +		 * sd_llc must have one linked.
> > +		 */
> > +		if (!sds) {
> > +			WARN_ON_ONCE(!sd->shared);
> > +			sds = sd->shared;
> > +		}
> >  	}
> >  
> >  	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
> >  	per_cpu(sd_llc_size, cpu) = size;
> >  	per_cpu(sd_llc_id, cpu) = id;
> > +
> > +	/* TODO: Rename sd_llc_shared to fit the new role. */
> >  	rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
> 
> Would love for folks to chime in, but IMO "sd_wakeup_shared" sounds
> pretty reasonable since it is mainly the wakeup path that depends on
> this, except for one !ASYM load balancing trigger.

"sd_wakeup_shared" captures the bigger consumer (wakeup), but not the
nohz balancer kick logic. Maybe "sd_balance_shared" (balance in a broad
sense, since wakeup placement ultimately affects balancing too) or
"sd_effective_shared" (if we want to stress that the backing topology
level may vary: LLC vs asym)?

> 
> > 
> >  	sd = lowest_flag_domain(cpu, SD_CLUSTER);
> > @@ -711,9 +731,6 @@ static void update_top_cache_domain(int cpu)
> >  
> >  	sd = highest_flag_domain(cpu, SD_ASYM_PACKING);
> >  	rcu_assign_pointer(per_cpu(sd_asym_packing, cpu), sd);
> > -
> > -	sd = lowest_flag_domain(cpu, SD_ASYM_CPUCAPACITY_FULL);
> > -	rcu_assign_pointer(per_cpu(sd_asym_cpucapacity, cpu), sd);
> >  }
> >  
> >  /*
> > @@ -2650,6 +2667,15 @@ static void adjust_numa_imbalance(struct sched_domain *sd_llc)
> >  	}
> >  }
> >  
> > +static void init_sched_domain_shared(struct s_data *d, struct sched_domain *sd)
> > +{
> > +	int sd_id = cpumask_first(sched_domain_span(sd));
> > +
> > +	sd->shared = *per_cpu_ptr(d->sds, sd_id);
> > +	atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
> > +	atomic_inc(&sd->shared->ref);
> > +}
> > +
> >  /*
> >   * Build sched domains for a given set of CPUs and attach the sched domains
> >   * to the individual CPUs
> > @@ -2708,20 +2734,53 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
> >  	}
> >  
> >  	for_each_cpu(i, cpu_map) {
> > +		struct sched_domain *sd_asym = NULL;
> > +		bool asym_claimed = false;
> > +
> >  		sd = *per_cpu_ptr(d.sd, i);
> >  		if (!sd)
> >  			continue;
> >  
> > +		/*
> > +		 * In case of ASYM_CPUCAPACITY, attach sd->shared to
> > +		 * sd_asym_cpucapacity for wakeup stat tracking.
> > +		 *
> > +		 * Caveats:
> > +		 *
> > +		 * 1) has_asym is system-wide, but a given CPU may still
> > +		 *    lack an SD_ASYM_CPUCAPACITY_FULL ancestor (e.g., an
> > +		 *    exclusive cpuset carving out a symmetric capacity
> > +		 *    island). Such CPUs must fall through to the LLC
> > +		 *    seeding path below.
> > +		 *
> > +		 * 2) Skip the asym attach if the asym ancestor is an
> > +		 *    overlapping domain (SD_NUMA). On those topologies
> > +		 *    let the LLC path own the shared object instead.
> > +		 *
> > +		 * XXX: This assumes SD_ASYM_CPUCAPACITY_FULL domain
> > +		 * always has more than one group else it is prone to
> > +		 * degeneration.
> 
> I looked into this and we only set SD_ASYM_CPUCAPACITY if we find more
> than one capacity, and SD_ASYM_CPUCAPACITY_FULL implies there are at
> least two CPUs covering different capacities in the span.
> 
> The very first SD_ASYM_CPUCAPACITY_FULL domain should be safe from
> degeneration when it is non-overlapping.

Makes sense, maybe we can replace the XXX part with a note like this:

 * Note: SD_ASYM_CPUCAPACITY_FULL is only set when multiple distinct
 * capacities exist in the domain span, so the asym domain we attach
 * to cannot degenerate into a single-capacity group. The relevant
 * edge cases are instead covered by the caveats above.

> 
> > +		 */
> > +		sd_asym = sd;
> > +		while (sd_asym && !(sd_asym->flags & SD_ASYM_CPUCAPACITY_FULL))
> > +			sd_asym = sd_asym->parent;
> > +
> > +		if (sd_asym && !(sd_asym->flags & SD_NUMA)) {
> > +			init_sched_domain_shared(&d, sd_asym);
> > +			asym_claimed = true;
> > +		}
> 
> We should probably guard this behind a "has_asym" check. Maybe even
> extract it into a separate helper if the nesting gets too deep.
> Thoughts?

Ack, we can add an `if (has_asym)` check as a quick skip and fold the
walk + NUMA check into a small helper.

> 
> > +
> > +		/* First, find the topmost SD_SHARE_LLC domain */
> > +		sd = *per_cpu_ptr(d.sd, i);
> 
> nit.
> 
> I think this reassignment is no longer required since you use a
> separate "sd_asym" variable now.

Ack.
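That helper could be sketched along these lines. This is a standalone, compilable model (the helper name `asym_shared_domain`, the flag values, and the two-field struct are illustrative stand-ins, not the real kernel definitions in include/linux/sched/sd_flags.h):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the real ones live in sd_flags.h. */
#define SD_ASYM_CPUCAPACITY_FULL 0x1
#define SD_NUMA                  0x2

/* Minimal stand-in: only the fields the walk needs. */
struct sched_domain {
	unsigned int flags;
	struct sched_domain *parent;
};

/*
 * Walk up from sd to the first SD_ASYM_CPUCAPACITY_FULL ancestor and
 * return it only if it is non-overlapping. Returns NULL when the
 * system has no asymmetric capacities (has_asym gate), when this CPU
 * sits in a symmetric capacity island (no asym ancestor), or when the
 * asym ancestor is an SD_NUMA domain (the LLC path owns the shared
 * object in that case).
 */
struct sched_domain *
asym_shared_domain(struct sched_domain *sd, bool has_asym)
{
	if (!has_asym)
		return NULL;	/* quick skip on fully symmetric systems */

	while (sd && !(sd->flags & SD_ASYM_CPUCAPACITY_FULL))
		sd = sd->parent;

	if (sd && (sd->flags & SD_NUMA))
		return NULL;	/* overlapping asym domain: let LLC own it */

	return sd;
}
```

build_sched_domains() would then reduce to something like `sd_asym = asym_shared_domain(sd, has_asym); if (sd_asym) { init_sched_domain_shared(&d, sd_asym); asym_claimed = true; }`, keeping the loop body flat.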
> 
> >  		while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
> >  			sd = sd->parent;
> >  
> >  		if (sd->flags & SD_SHARE_LLC) {
> > -			int sd_id = cpumask_first(sched_domain_span(sd));
> > -
> > -			sd->shared = *per_cpu_ptr(d.sds, sd_id);
> > -			atomic_set(&sd->shared->nr_busy_cpus, sd->span_weight);
> > -			atomic_inc(&sd->shared->ref);
> > +			/*
> > +			 * Initialize the sd->shared for SD_SHARE_LLC unless
> > +			 * the asym path above already claimed it.
> > +			 */
> > +			if (!asym_claimed)
> > +				init_sched_domain_shared(&d, sd);
> 
> Tbh, if "has_asym" is true, we probably don't even need this, since
> the nr_busy_cpus accounting gets us nothing.
> 
> Might save a little overhead and space on those systems, but I would
> love to hear if there are any concerns if we just drop sd_llc->shared
> when we detect asym capacities.

Hm... but "has_asym" is global; we may still need the LLC-owned shared
object for the symmetric capacity islands and the NUMA-overlap cases,
no?

Thanks,
-Andrea