From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 7 Apr 2026 21:16:02 +0200
From: Andrea Righi
To: Dietmar Eggemann
Cc: Vincent Guittot, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Christian Loehle, Koba Ko, Felix Abecassis, Balbir Singh,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] sched/fair: SMT-aware asymmetric CPU capacity
References: <193f1cd1-ced3-4b37-83af-ea43a7e5e3d0@arm.com>
	<9886a7d3-fb54-4637-8b4c-1f35272f4882@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi Dietmar,

On Tue, Apr 07, 2026 at 01:50:51PM +0200, Dietmar Eggemann wrote:
> On 03.04.26 22:44, Andrea Righi wrote:
> > On Fri, Apr 03, 2026 at 04:46:03PM +0200, Andrea Righi wrote:
> >> On Fri, Apr 03, 2026 at 01:47:17PM +0200, Dietmar Eggemann wrote:
> > ...
> >>>> Looking at the data:
> >>>>  - SIS_UTIL doesn't seem relevant in this case (differences are within
> >>>>    error range),
> >>>>  - ASYM_CPU_CAPACITY seems to provide a small throughput gain, but it seems
> >>>>    more beneficial for tail latency reduction,
> >>>>  - the ILB SMT patch seems to slightly improve throughput, but the biggest
> >>>>    benefit is still coming from ASYM_CPU_CAPACITY.
> >>>
> >>>> Overall, also in this case it seems beneficial to use ASYM_CPU_CAPACITY
> >>>> rather than equalizing the capacities.
> >>>>
> >>>> That said, I'm still not sure why ASYM is helping. The frequency asymmetry
> >>>
> >>> OK, I still would be more comfortable with this when I would know why
> >>> this is :-)
> >>
> >> Working on this. :)
> >
> > Alright, I think I found something. I tried to make sis() behave more like sic()
> > by adding the same SMT "full idle core" check in the fast path and removing the
> > extra select_idle_smt(prev) hop from the LLC idle path.
> >
> > Essentially this:
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 7bebceb5ed9df..19fffa2df2d36 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7651,29 +7651,6 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> >  	return -1;
> >  }
> >
> > -/*
> > - * Scan the local SMT mask for idle CPUs.
> > - */
> > -static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	int cpu;
> > -
> > -	for_each_cpu_and(cpu, cpu_smt_mask(target), p->cpus_ptr) {
> > -		if (cpu == target)
> > -			continue;
> > -		/*
> > -		 * Check if the CPU is in the LLC scheduling domain of @target.
> > -		 * Due to isolcpus, there is no guarantee that all the siblings are in the domain.
> > -		 */
> > -		if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
> > -			continue;
> > -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> > -			return cpu;
>
> So it is this returning of CPU from the smt mask rather than the
>
>   for_each_cpu_wrap(cpu, cpus, target + 1)
>
>     __select_idle_cpu()
>
>       if (choose_idle_cpu(cpu, p) && ...)
>         return cpu
>
> where cpus is cpumask_and(cpus, sched_domain_span(MC), p->cpus_ptr)

Right, and this is a different behavior that I was trying to eliminate from
sis() to make it similar to sic().

>
> I wonder whether this has anything to do with your NVIDIA Spatial
> Multithreading (SMT) versus Traditional (time-shared resources) SMT?

I don't have data to prove or disprove that... it'd be interesting to try the
same approach on a system with traditional SMT.

>
> >
> > -	}
> > -
> > -	return -1;
> > -}
> > -
> >  #else /* !CONFIG_SCHED_SMT: */
> >
> >  static inline void set_idle_cores(int cpu, int val)
> > @@ -7690,11 +7667,6 @@ static inline int select_idle_core(struct task_struct *p, int core, struct cpuma
> >  	return __select_idle_cpu(core, p);
> >  }
> >
> > -static inline int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> > -{
> > -	return -1;
> > -}
> > -
> >  #endif /* !CONFIG_SCHED_SMT */
> >
> >  /*
> > @@ -7859,7 +7831,7 @@ static inline bool asym_fits_cpu(unsigned long util,
> >  		(util_fits_cpu(util, util_min, util_max, cpu) > 0);
> >  	}
> >
> > -	return true;
> > +	return !sched_smt_active() || is_core_idle(cpu);
> >  }
>
> This change seems to be orthogonal to the removal of select_idle_smt()
> for sis()?

Right, essentially this modifies sis() to return only if cpu is a fully-idle
core.

>
> BTW, the is_core_idle() in asym_fits_cpu() (used for those early return
> CPU conditions in sis()) is something we don't have on the NO_ASYM side
> where we only use choose_idle_cpu().

You mean without this change? In that case, yes, because asym_fits_cpu() was
just a no-op. This is one of the behavior changes in sis() to make it similar to
sic() with SMT awareness.

>
> >  /*
> > @@ -7964,16 +7936,9 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> >  	if (!sd)
> >  		return target;
> >
> > -	if (sched_smt_active()) {
> > +	if (sched_smt_active())
> >  		has_idle_core = test_idle_cores(target);
> >
> > -		if (!has_idle_core && cpus_share_cache(prev, target)) {
> > -			i = select_idle_smt(p, sd, prev);
> > -			if ((unsigned int)i < nr_cpumask_bits)
> > -				return i;
> > -		}
> > -	}
> > -
> >  	i = select_idle_cpu(p, sd, has_idle_core, target);
> >  	if ((unsigned)i < nr_cpumask_bits)
> >  		return i;
> >
> > ---
> >
> > With this applied, I see identical performance between NO_ASYM and ASYM+SMT.
>
> Interesting!
>
> > I'm not suggesting to apply this, but that seems to be the reason why ASYM+SMT
> > performs better in my case.
> >
> > -Andrea
>

Thanks,
-Andrea
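
For readers following the thread, the two behavior changes discussed above
(requiring a fully idle core in the sis() fast path, and dropping the
select_idle_smt(prev) hop in favor of the wrap-around scan) can be sketched
as a small user-space toy model. This is only an illustration, not kernel
code: every toy_* name is hypothetical, and the 8-CPU layout with two
siblings per core numbered (0,1), (2,3), ... is an assumption made for the
example.

```c
#include <stdbool.h>

#define TOY_NR_CPUS   8
#define TOY_SMT_WIDTH 2	/* assumption: siblings are numbered consecutively */

/* Toy state: true means the CPU is idle. All CPUs start busy. */
static bool toy_cpu_idle[TOY_NR_CPUS];

void toy_set_idle(int cpu, bool idle)
{
	toy_cpu_idle[cpu] = idle;
}

/* True if every SMT sibling of @cpu's core is idle (in the spirit of is_core_idle()). */
bool toy_core_idle(int cpu)
{
	int first = cpu - (cpu % TOY_SMT_WIDTH);

	for (int i = first; i < first + TOY_SMT_WIDTH; i++)
		if (!toy_cpu_idle[i])
			return false;
	return true;
}

/*
 * Fast-path acceptance: an idle CPU is taken right away; with
 * @require_idle_core set, only when its whole core is idle as well.
 */
bool toy_fast_path_ok(int cpu, bool require_idle_core)
{
	return toy_cpu_idle[cpu] && (!require_idle_core || toy_core_idle(cpu));
}

/* Models the removed hop: any idle sibling of @prev other than @prev itself, or -1. */
int toy_pick_idle_sibling(int prev)
{
	int first = prev - (prev % TOY_SMT_WIDTH);

	for (int cpu = first; cpu < first + TOY_SMT_WIDTH; cpu++) {
		if (cpu != prev && toy_cpu_idle[cpu])
			return cpu;
	}
	return -1;
}

/* Models the wrap-around scan: first idle CPU starting at @target + 1, or -1. */
int toy_scan_idle_cpu(int target)
{
	for (int n = 1; n <= TOY_NR_CPUS; n++) {
		int cpu = (target + n) % TOY_NR_CPUS;

		if (toy_cpu_idle[cpu])
			return cpu;
	}
	return -1;
}
```

With CPU 1 idle next to a busy CPU 0, and core (4,5) fully idle, the
sibling hop for prev = 0 picks CPU 1 on the half-busy core, while the scan
for target = 3 lands on CPU 4. The stricter fast path rejects CPU 1 and
accepts CPU 4.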