Date: Fri, 3 Apr 2026 22:44:53 +0200
From: Andrea Righi
To: Dietmar Eggemann
Cc: Vincent Guittot, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 Christian Loehle, Koba Ko, Felix Abecassis, Balbir Singh,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity
References: <20260326151211.1862600-1-arighi@nvidia.com>
 <193f1cd1-ced3-4b37-83af-ea43a7e5e3d0@arm.com>
 <9886a7d3-fb54-4637-8b4c-1f35272f4882@arm.com>

On Fri, Apr 03, 2026 at 04:46:03PM +0200, Andrea Righi wrote:
> On Fri, Apr 03, 2026 at 01:47:17PM +0200, Dietmar Eggemann wrote:
...
> > > Looking at the data:
> > >  - SIS_UTIL doesn't seem relevant in this case (differences are within
> > >    error range),
> > >  - ASYM_CPU_CAPACITY seems to provide a small throughput gain, but it seems
> > >    more beneficial for tail latency reduction,
> > >  - the ILB SMT patch seems to slightly improve throughput, but the biggest
> > >    benefit is still coming from ASYM_CPU_CAPACITY.
> > >
> > > Overall, also in this case it seems beneficial to use ASYM_CPU_CAPACITY
> > > rather than equalizing the capacities.
> > >
> > > That said, I'm still not sure why ASYM is helping. The frequency asymmetry
> >
> > OK, I still would be more comfortable with this when I would now why
> > this is :-)
>
> Working on this. :)

Alright, I think I found something.

I tried to make sis() behave more like sic() by adding the same SMT "full
idle core" check in the fast path and removing the extra
select_idle_smt(prev) hop from the LLC idle path. Essentially this:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7bebceb5ed9df..19fffa2df2d36 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7651,29 +7651,6 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
 	return -1;
 }
 
-/*
- * Scan the local SMT mask for idle CPUs.
- */
-static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	int cpu;
-
-	for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
-		if (cpu == target)
-			continue;
-		/*
-		 * Check if the CPU is in the LLC scheduling domain of @target.
-		 * Due to isolcpus, there is no guarantee that all the siblings are in the domain.
-		 */
-		if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
-			continue;
-		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
-			return cpu;
-	}
-
-	return -1;
-}
-
 #else /* !CONFIG_SCHED_SMT: */
 
 static inline void set_idle_cores(int cpu, int val)
@@ -7690,11 +7667,6 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
 	return __select_idle_cpu(core, p);
 }
 
-static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
-{
-	return -1;
-}
-
 #endif /* !CONFIG_SCHED_SMT */
 
 /*
@@ -7859,7 +7831,7 @@ static inline bool asym_fits_cpu(unsigned long util,
 		       (util_fits_cpu(util, util_min, util_max, cpu) > 0);
 	}
 
-	return true;
+	return !sched_smt_active() || is_core_idle(cpu);
 }
 
 /*
@@ -7964,16 +7936,9 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if (!sd)
 		return target;
 
-	if (sched_smt_active()) {
+	if (sched_smt_active())
 		has_idle_core = test_idle_cores(target);
 
-		if (!has_idle_core && cpus_share_cache(prev, target)) {
-			i = select_idle_smt(p, sd, prev);
-			if ((unsigned int)i < nr_cpumask_bits)
-				return i;
-		}
-	}
-
 	i = select_idle_cpu(p, sd, has_idle_core, target);
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
---

With this applied, I see identical performance between NO_ASYM and
ASYM+SMT.

I'm not suggesting that we apply this, but it seems to be the reason why
ASYM+SMT performs better in my case.

-Andrea
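
P.S. To show the effect of the asym_fits_cpu() hunk in isolation, here is a
minimal userspace sketch. It is not kernel code: the 4-CPU / 2-thread
topology, the cpu_idle[] array, and the names core_idle(), fits_cpu() and
fast_path_ok() are all made up for illustration, and core_idle() only
approximates what is_core_idle() does as I understand it (every sibling
other than @cpu must be idle).

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   4
#define SMT_WIDTH 2	/* assumption: 2 hardware threads per core */

/* Hypothetical state: CPUs {0,1} and {2,3} are SMT siblings. */
static bool cpu_idle[NR_CPUS];
static bool smt_active = true;	/* stands in for sched_smt_active() */

/* Rough model of is_core_idle(): no sibling of @cpu may be busy. */
static bool core_idle(int cpu)
{
	int first, s;

	assert(cpu >= 0 && cpu < NR_CPUS);
	first = cpu - (cpu % SMT_WIDTH);

	for (s = first; s < first + SMT_WIDTH; s++) {
		if (s == cpu)
			continue;	/* @cpu itself is not checked here */
		if (!cpu_idle[s])
			return false;
	}
	return true;
}

/* Model of the non-asymmetric return path after the change above. */
static bool fits_cpu(int cpu)
{
	return !smt_active || core_idle(cpu);
}

/* Fast-path candidate test: the CPU is idle and passes the fit check. */
static bool fast_path_ok(int cpu)
{
	return cpu_idle[cpu] && fits_cpu(cpu);
}
```

With CPU 0 busy and CPUs 1-3 idle, fast_path_ok(1) is false in this model
because its sibling is busy, while fast_path_ok(2) and fast_path_ok(3) are
true. Before the change the fit check was unconditionally true on this path,
so an idle CPU 1 would have been accepted.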