Date: Wed, 25 Mar 2026 13:25:37 +0100
From: Andrea Righi
To: Dietmar Eggemann
Cc: Christian Loehle, Vincent Guittot, Ingo Molnar, Peter Zijlstra,
 Juri Lelli, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 linux-kernel@vger.kernel.org, Felix Abecassis
Subject: Re: [PATCH] sched/topology: Avoid spurious asymmetry from CPU capacity noise
References: <20260324005509.1134981-1-arighi@nvidia.com>
 <0fb05951-1f2f-474f-9f7c-9f0f15a5f675@arm.com>
 <86cb3979-02cd-4171-80fd-df20cb3430cb@arm.com>
 <1d9b4abf-4b70-4775-92b8-924ced316578@arm.com>
In-Reply-To: <1d9b4abf-4b70-4775-92b8-924ced316578@arm.com>
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 25, 2026 at 12:16:59PM +0100, Dietmar Eggemann wrote:
> On 25.03.26 10:32, Andrea Righi wrote:
> > On Wed, Mar 25, 2026 at 10:23:09AM +0100, Dietmar Eggemann wrote:
> >> On 24.03.26 12:01, Andrea Righi wrote:
> >>> Hi Dietmar,
> >>>
> >>> On Tue, Mar 24, 2026 at 11:29:24AM +0100, Dietmar Eggemann wrote:
> >>>> On 24.03.26 10:46, Andrea Righi wrote:
> >>>>> Hi Christian,
> >>>>>
> >>>>> On Tue, Mar 24, 2026 at 08:08:22AM +0000, Christian Loehle wrote:
> >>>>>> On 3/24/26 07:55, Christian Loehle wrote:
> >>>>>>> On 3/24/26 07:39, Vincent Guittot wrote:
> >>>>>>>> On Tue, 24 Mar 2026 at 01:55, Andrea Righi wrote:
> >>
> >> [...]
> >>
> >>>> The first time we observed this on NVIDIA Grace, we wondered whether
> >>>> there might be functionality outside the task scheduler that makes use
> >>>> of these slightly heterogeneous CPU capacity values from CPPC—and
> >>>> whether the dependency on task scheduling was simply an overlooked
> >>>> phenomenon.
> >>>>
> >>>> And then there was DCPerf Mediawiki on 72 CPUs system always scoring
> >>>> better with sched_asym_cpucap_active() = TRUE (mentioned already by
> >>>> Chris L. in:
> >>>> https://lore.kernel.org/r/15ffdeb3-a0f3-4b88-92c0-17ffb03b0574@arm.com
> >>>
> >>> Yeah, I think Chris' asym-packing approach might be the safest thing to do.
> >>>
> >>> At the same time it would be nice to improve asym-capacity to introduce
> >>> some concept of SMT awareness, that was my original attempt with
> >>> https://lore.kernel.org/all/20260318092214.130908-1-arighi@nvidia.com,
> >>> since we may see similar asym-capacity benefits on Vera (that has SMT,
> >>> unlike Grace). What do you think?
> >>
> >> We never found a good way to specify a CPU capacity in the SMT case (EAS
> >> and energy model included). So comparing CPU capacity w/ utilization, CPU
> >> overutilization detection etc. definitions get more blurry.
> >
> > Hm... so should we just avoid calling select_idle_capacity() when SMT is
> > enabled to prevent waking up tasks on both SMT siblings when there are
> > fully-idle SMT cores?
>
> Yeah, pretty much. So prefer (2) over (1).
>
> IMHO, we do have a similar issue here. Can we say that a logical CPU is idle
> if its SMT sibling isn't? But at least we don't have to use any CPU cap/util
> comparison there.
>
> select_idle_sibling()
>
> 8132	if (sched_smt_active()) {
> 8133		has_idle_core = test_idle_cores(target);
> 8134
> 8135		if (!has_idle_core && cpus_share_cache(prev, target)) {   <-- (1)
> 8136			i = select_idle_smt(p, sd, prev);
> 8137			if ((unsigned int)i < nr_cpumask_bits)
> 8138				return i;
> 8139		}
> 8140	}
> 8141
> 8142	i = select_idle_cpu(p, sd, has_idle_core, target);   <-- (2a)
> 8143	if ((unsigned)i < nr_cpumask_bits)
> 8144		return i
>
> select_idle_cpu()
>
> 7926	for_each_cpu_wrap(cpu, cpus, target + 1) {
> 7927		if (has_idle_core) {
> 7928			i = select_idle_core(p, cpu, cpus, &idle_cpu);   <-- (2b)
> 7929			if ((unsigned int)i < nr_cpumask_bits)
> 7930				return i;
> 7931
> 7932		} else {
> 7933			if (--nr <= 0)
> 7934				return -1;
> 7935			idle_cpu = __select_idle_cpu(cpu, p);
> 7936			if ((unsigned int)idle_cpu < nr_cpumask_bits)
> 7937				break;
> 7938		}
> 7939	}

Exactly, we already prefer fully-idle cores over partially-idle cores with
asym-capacity disabled, but in that case the idle selection logic stays in a
world of idle bits, without cap/util math, so it's a bit easier. And it's
probably fine also when we have both asym-capacity + SMT (at least it seems
better than what we have now, ignoring the SMT part).

Essentially having something like the following (which already gives better
performance on Vera):

 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d57c02e82f3a1..534634f813fca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8086,7 +8086,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	 * For asymmetric CPU capacity systems, our domain of interest is
 	 * sd_asym_cpucapacity rather than sd_llc.
 	 */
-	if (sched_asym_cpucap_active()) {
+	if (sched_asym_cpucap_active() && !sched_smt_active()) {
 		sd = rcu_dereference_all(per_cpu(sd_asym_cpucapacity, target));
 		/*
 		 * On an asymmetric CPU capacity system where an exclusive

Thanks,
-Andrea
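
For readers who want to see the "idle bits only" policy in isolation, here is a
rough user-space sketch of preferring a fully-idle core over a partially-idle
one, with no capacity/utilization comparison involved. It is not kernel code:
NR_CPUS, SMT_WIDTH, core_is_idle() and pick_idle_cpu() are hypothetical names
made up for this illustration, siblings are assumed to be numbered
contiguously, and the scan starts at target itself, which simplifies what
select_idle_cpu() really does.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   8	/* hypothetical toy topology, not a kernel constant */
#define SMT_WIDTH 2	/* assumed: two hardware threads per core */

/* True if every SMT sibling of cpu's core (cpu included) is idle. */
static bool core_is_idle(const bool idle[NR_CPUS], int cpu)
{
	int first = cpu - (cpu % SMT_WIDTH);

	for (int i = first; i < first + SMT_WIDTH; i++)
		if (!idle[i])
			return false;
	return true;
}

/*
 * Scan all CPUs starting at target, wrapping around. Return the first idle
 * CPU that sits on a fully-idle core; if there is none, fall back to the
 * first idle CPU seen; return -1 if nothing is idle.
 */
static int pick_idle_cpu(const bool idle[NR_CPUS], int target)
{
	int fallback = -1;

	assert(target >= 0 && target < NR_CPUS);

	for (int n = 0; n < NR_CPUS; n++) {
		int cpu = (target + n) % NR_CPUS;

		if (!idle[cpu])
			continue;
		if (core_is_idle(idle, cpu))
			return cpu;	/* fully-idle core wins */
		if (fallback < 0)
			fallback = cpu;	/* remember a partially-idle option */
	}
	return fallback;
}
```

With CPU 0 busy and CPUs 1-3 idle, a scan from CPU 0 skips CPU 1 (its sibling
is busy) and lands on CPU 2, which is the behaviour the one-line change above
is meant to keep when both asym-capacity and SMT are active.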